We tested the government’s official new AI nutrition tool: Grok



How trustworthy is the new U.S. food pyramid? It’s a mixed bag, according to the government website devoted to that pyramid.

Kyle Diamantas, head of the Human Foods Program at the Food and Drug Administration, alerted the public this week to a generative artificial intelligence tool added to the government’s “transformational” realfood.gov site. The tool, with a headline “Use AI to get real answers about real food,” features “AI integration to provide parents and consumers with clear and concise answers at the click of a button,” Diamantas wrote on X.

It turns out that the click of a button leads users to Grok — the generative chatbot that’s part of the X social media platform owned by former Trump administration adviser Elon Musk. Asked if the new food pyramid is backed by high-quality research, Grok responds: “Many nutrition scientists and organizations have raised concerns about the evidence quality and process for the final version.” While the guidelines’ recommendations on limiting added sugars and ultra-processed foods are backed by research, Grok says, “the emphasis on saturated fats and animal proteins contradicts longstanding evidence.”

That’s a pretty accurate summary of the nutrition community’s response to the new guidelines and pyramid, which features a prominent rib-eye steak and stick of butter along with less controversial items like broccoli, salmon, and olive oil. But in general, researchers are wary of the risks of turning to AI for nutrition advice.

“I think the use of AI holds promise for providing tailored nutrition advice in a way that is convenient and low-cost,” said Alyssa Moran, a nutrition policy researcher and epidemiologist at the University of Pennsylvania. But she notes that generative AI, just like human health care providers, tends to perpetuate stereotypes about eating and weight, such as stigmatizing obesity. Overall, she said, the models need a lot more testing on how they answer nutrition questions before they’re ready for widespread public use — “[a]nd certainly before these tools are promoted by the government, which is supposed to be looking out for the public’s health.”


The National Design Studio, which worked on the dietary guidelines website, has not yet responded to a request for comment.

The Grok tool on the government’s dietary guidelines website also includes some sample questions, most of which use language involving “REAL FOOD.” One concerns feeding a vegetarian, Indian-food-loving family “REAL FOOD” on a budget of $200 a week. Grok recommends lentils, rice, and buying in bulk.

Another sample question asks about “REAL FOOD” during pregnancy. Among Grok’s recommendations are “Folic acid/folate (400–800 mcg/day, often via prenatal vitamin),” listing food sources like fortified cereals and leafy greens as well. Medical organizations specifically recommend that pregnant women take folic acid supplements to prevent birth defects rather than relying solely on food sources. Grok does not make that distinction, which could inadvertently play into unfounded fears about folic acid supplements perpetuated by some wellness influencers.

Regardless of concerns about the reliability of large language models, Americans are increasingly turning to them with health questions. A study published this week in Nature Medicine highlights that differences in how people phrase their questions, and in what information they emphasize or exclude, affect the accuracy of chatbots’ answers. “Evaluation mistakes are often made when systems are assessed in artificial environments,” co-author Adam Mahdi, an associate professor at the Oxford Internet Institute, told STAT via email, explaining why chatbots may outperform doctors under ideal conditions and then falter in the real world.


All this means users may find that their own experiences with Grok differ when it comes to the tool’s utility. But the STAT newsroom had fun testing it out. One reporter asked, given butter’s prominence on the food pyramid, how many sticks of butter they should be eating each day.

Zero, Grok said: The pyramid’s visual hierarchy shouldn’t be interpreted literally.

“Ah, thank you,” the reporter replied. “That was very confusing. Can you suggest a better way to lay out food guidelines visually to avoid this kind of mistake?”

Sure, Grok said. A better, more intuitive approach would be getting rid of the government’s new inverted food pyramid, and going back to MyPlate.

STAT’s coverage of chronic health issues is supported by a grant from Bloomberg Philanthropies. Our financial supporters are not involved in any decisions about our journalism.
