Scientific Junk Food: A brief dissection of nutritional epidemiology
Ignore any research based on food frequency questionnaires. It is cognitive junk food. Reading it will make you mentally obese.
A new study recently came out in the journal JAMA Internal Medicine. For reasons we’ll come to, this work is an absurd display of statistical masturbation—a classic example of “garbage in, garbage out.” Its flaws are so obvious and severe that anyone with even a little basic training in research should be able to spot them, at least if they care at all about rigor and take a moment to read the study design.
If what I’m saying is true, the study should never have been published. But it was. People with PhDs worked on it for years. It went through peer review at a major journal, where it was deemed fit for publication. Journalists wrote articles about it. Doctors shared it on social media.
How can this be?
My intention here is to explain how junk science like this is laundered through the University Industrial Complex, passed off as high-quality research, and disseminated widely via social media and mainstream journalism. This can create an illusion of “scientific consensus”—a bunch of studies pointing in the same direction, not because a diversity of researchers are rigorously measuring the same external reality, but because of a symbiotic relationship between researchers and non-researchers (e.g., journalists, academic journal publishers) whose career interests are all aligned with the perpetuation of a socially ascendant narrative.
This particular study is just one small example of a massive problem in academic science: junk is routinely published, especially in certain fields. One of those fields is “nutritional epidemiology.”
Let’s first describe the study and the fundamental reasons why its results cannot possibly be trustworthy. After that, we will examine the ways in which it has been propagated by media, why the whole field of “nutritional epidemiology” should be approached with extreme skepticism, and the social forces behind why junk studies continue to build up in the digital arteries that carry information throughout modern society.
Scientific Publications: How they work… in theory
Scientific research is hard. After establishing a research question worth pursuing, you have to thoughtfully and carefully execute three broad areas of work before sending a paper out for peer review:
Data collection. Great pains must be taken to collect raw data. Measurements and starting datapoints must be accurate, reliable, and acquired with as much precision as possible. Everything depends on this step: garbage in, garbage out.
Data analysis. Data must be processed, analyzed, and visualized with rigor. Statistical tests appropriate to the dataset must be used to test hypotheses. Many studies perform statistical tests that are inappropriate for the datasets they analyze, or perform elaborate analyses on bad data.
Data communication. You solicit critical feedback from colleagues and collect or analyze additional data as needed. Eventually, everything is written up as a manuscript, the purpose of which is to clearly communicate what you did and why, so that qualified reviewers can decide whether the work merits publication.
In theory, a research project that gets written up will only make it through the peer review process and get published if it’s good enough. In theory, this means that the results are believable because the underlying dataset was sound and analyzed appropriately—that the results are significant, compelling, and adequately described so they can be understood and reproduced by researchers in the field.
In theory.
In practice, there’s more at play here. One of the major, non-scientific factors that determines whether a study gets published is the potential for the results to get attention. And not just the attention of scientists—attention from wider audiences, such as journalists and the public, who tend to assume that if a study is published you can take the headline results to the bank.
Scientists need to publish work that’s sexy enough to get printed in high-profile journals, which helps them secure future grants. Journals need to sell subscriptions. Universities take a (big) cut of their scientists’ research grants in order to cover costs for things like utilities and building maintenance, the salaries and benefits of administrative staff, and the obscene subscription fees they pay to the journals that determine what is publication-worthy. The fees paid to journals by universities result in some of the biggest profit margins seen in private industry.
Let’s briefly conduct our own review of this paper, which JAMA Internal Medicine will allow you to read for $40. Alternatively, you can purchase an individual subscription to the journal for $346. If you’re affiliated with a major university, you should be able to read it for free because your institution already paid thousands of dollars to subscribe to this one journal. (People use various strategies to avoid paying for access to research papers).
Related content:
The Cholesterol Cult & Heart Mafia: How the process of science evolves into The Science™ of public policy
M&M #17: Scientific Publishing & the Business of Science | Michael Eisen, PhD
M&M #135: History of Diet Trends & Medical Advice in the US, Fat & Cholesterol, Seed Oils, Processed Food, Ketogenic Diet, Can We Trust Public Health Institutions? | Orrin Devinsky, MD
M&M #176: Bad Science, Nutrition Epidemiology, History of Obesity Research, Diet & Metabolic Health | Gary Taubes
New JAMA study: garbage in, garbage out
The purpose of this study was to look for patterns of association between the consumption of dietary fats and mortality. In particular, they were interested in plant vs. animal fat intake. Is consuming fats from plant foods associated with a lower risk of dying compared to fats from animal foods? According to the fancy statistical analyses performed on this dataset, the answer is: Yes.
“The findings from this prospective cohort study demonstrated consistent but small inverse associations between a higher intake of plant fat, particularly fat from grains and vegetable oils, and a lower risk for both overall and CVD mortality.”
Recall the three general steps to producing a scientific paper above: data collection, data analysis, data communication. Each step depends on the previous one. When reviewing a paper, always start at the beginning: the data. All the analysis and brain power in the world won’t matter if a study is built on shoddy data. Garbage in, garbage out.
And that’s where this paper fails. The dataset it relies on is garbage—totally, laughably bad. If you want to understand the dynamics of temperature change in your home by using a thermometer to record temperature over time, you should first validate the thermometer’s accuracy. If the temperature measurements are inaccurate, there’s no point in doing analysis… at least if you care about uncovering what’s true about the world.
At a glance this paper might seem impressive, especially to the untrained eye. Data on over 400,000 adults, a "large cohort study" with 24 years of follow-up. The authors spent three years analyzing the data. Lots of time, energy, and money went into this.
Here’s the key part of the methods description that tells you the results are untrustworthy: “Specific food sources of dietary fats and other dietary information were collected at baseline, using a validated food frequency questionnaire.”
In a nutshell, the way they “know” what people ate is to ask them and take their word for it. People have to remember what they ate, how much of it they ate, and record that in a questionnaire. If I asked you what you ate last week, could you give me a comprehensive list? Could you tell me how much of each food you consumed, in grams? Would someone else be able to accurately calculate from this the specific macronutrient composition of your diet? Of course not.
It gets more ridiculous: The average age of the people they gave questionnaires to: 61.2 years. Imagine asking your grandparents to write down, from memory, everything they’ve been eating lately. That’s more or less the quality of the dataset this paper is based on.
And even more ridiculous: they assumed everyone’s diet remained unchanged for 24 years! When they write that “dietary information were collected at baseline,” they mean that they asked people once, at the very beginning, to remember what they ate. They assumed this provides an accurate picture of the nutrient composition of each person’s diet, and further assumed that people maintained the same diet for decades.
One thoughtful person who submitted a public comment on the JAMA website described the absurdity:
Think about it: an entire team of highly educated researchers spent three years analyzing this data. Food questionnaires. Filled out from memory. Assumed to accurately reflect people’s diets for over two decades.
Even if they gave people frequent questionnaires, this would all still be based on the assumption that people have reliable memories. I couldn’t tell you everything I ate last weekend. Never mind the fact that they also lacked any information about how the food was cooked and prepared, where it was purchased, how fresh it was, etc.
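To make the garbage-in, garbage-out point concrete, here is a toy simulation. It is entirely my own, not drawn from the study, and every number in it is invented, including the made-up “health score.” It illustrates a well-known statistical effect: even when a real relationship exists, adding large random error to the exposure measurement—which is the charitable, best-case reading of a one-time, memory-based questionnaire—drags the estimated association toward zero and makes whatever survives hard to interpret.

```python
# Toy illustration (my own, not from the study): how noisy self-reported
# exposure data distorts an estimated association. All numbers are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical true daily fat intake (grams), and a real, modest effect of
# that intake on some health outcome score.
true_intake = rng.normal(70, 15, n)
health_score = 100 - 0.1 * true_intake + rng.normal(0, 10, n)

def slope(measured_intake):
    """Least-squares slope of health_score on the measured exposure."""
    return np.polyfit(measured_intake, health_score, 1)[0]

print("slope using true intake:    ", round(slope(true_intake), 3))

# Now "measure" intake the way a food frequency questionnaire does:
# recalled once, from memory, with large random error.
recalled_intake = true_intake + rng.normal(0, 40, n)
print("slope using recalled intake:", round(slope(recalled_intake), 3))
```

And that is the optimistic scenario, where the reporting error is purely random. When the error is systematic—for instance, if health-conscious people under-report foods they consider “bad”—the distortion can run in any direction.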
Despite this basic, obvious, critical flaw in the data, JAMA Internal Medicine decided the study was fit for publication. Why? One potential answer is some combination of laziness and incompetence. Another possible answer is that they knew this type of study, with this conclusion, would garner a lot of attention.
And indeed, it already has.
Science journalism & science journals: unholy alliance
Science journalists love studies like this. People click on the headlines, which often parrot the basic findings without any critical evaluation. Here are some example headlines that were written about this study:
“Greater plant fat intake associated with lower overall and cardiovascular disease mortality”
“Consuming more plant-based fat reduces the risk for overall, CVD mortality”
“Heart disease: Eating more plant fats may lower risk up to 14%”
These attention-grabbing headlines not only generate clicks for the publisher, but presumably also make the writers feel good. They probably believe they are helping people live healthier lives. The more often studies like this are published, and the more often journalists parrot the headline results, the more it seems like there’s a legitimate scientific consensus.
This pattern of media reporting, with little to no consideration of the critical flaws in the underlying data, facilitates the spread of ideologies that align with specific business interests. In this case, that would be the “plant-based movement,” which believes that substituting plant-based foods for animal-based foods is critical for human health. People like Bill Gates seem to believe such things—after all, look at all the studies and headlines that point to an apparent scientific consensus. Mr. Gates may not look healthy, but likely believes that he leads a healthy lifestyle because he has studies like this to point to.
To someone not equipped to critically evaluate the quality of the evidence, it probably seems compelling. This is presumably why people like Mr. Gates have invested big bucks into plant-based meat startups. Once you invest that kind of cash in something, it becomes really hard to shake your beliefs. As Upton Sinclair put it: “It is difficult to get a man to understand something when his salary depends on his not understanding it.” This is why public health institutions become committed to perpetuating bad ideas—once you’ve invested enough time and money into something, it becomes a part of your social identity. Your reputation depends on it.
The entire field of “nutritional epidemiology” more or less works in the manner illustrated by the individual study above: researchers generate statistical associations from bad data, publish the results in scientific journals, and get lots of feel-good attention when journalists regurgitate the headline results to the public. Everyone involved reaps career and social benefits: scientists get to publish in brand-name journals, academic journals generate fees from the publication process, and journalists get to write feel-good articles that generate clicks. Win-win-win. The only loser is the public, many of whom will internalize and act on what’s reported. And why wouldn’t they? Journalists are simply reporting “the facts” published in prestigious academic journals.
In theory, this is actually how things should work. Anything published in scientific journals should have passed through a rigorous filter (peer review), which journalists should be able to report on as is.
In theory.
But again, the process by which journals publish “the facts” is broken. This is why there are more scientific articles published than ever, in more scientific journals than ever, all while rates of misconduct are going up and many disciplines exist in a state of replication crisis.
To repeat: studies like the one we looked at above get published all the time, in “top-tier” journals, based entirely on data where they ask senior citizens to write down what they ate from memory. I cannot emphasize enough how absurd and how common this is.
Nutritional Epidemiology: A waste of time & resources
A huge number of studies are published every year based on “food frequency questionnaires.” The problems with doing this have been discussed in more detail elsewhere, and should be obvious with just a little reflection and common sense. Human beings cannot accurately report everything they eat, down to the nearest gram. I mean come on. Eyewitnesses to major crimes, sincerely reporting on events they claim to confidently remember, are often wrong. And yet nutritional epidemiologists would have us believe that you can draw firm conclusions about the nutrient composition of someone’s diet by asking them to write down what they think they ate for lunch last month.
In my discussion with Gary Taubes, he explained that it’s not simply a matter of poor quality data—nutritional epidemiology would be problematic even with higher quality data.
“[Bad data isn’t even] the major issue when it comes to nutritional epidemiology. Even if they could measure the diets perfectly—physical activity and diet perfectly—it’s cliche: association is not causality. You ultimately end up forming an association that tells you that people who engage in a particular dietary behavior have more or less risk than people who don’t… is that risk determined by the behavior or is it determined by the type of people who engage in that behavior?
There’s two [classic examples of this in nutritional epidemiology] now: ultra-processed food consumption and the benefits of mostly plant diets, and the relative evils of meat consumption.
This was captured for me 20 years ago by a friend: “It’s like they’re studying vegetarians in Berkeley who eat at the famous restaurant Chez Panisse after their yoga practice, and comparing them to truck drivers in West Virginia who eat at Denny’s, and then saying that the difference in their health status is the meat consumed.”
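Here is a minimal sketch of the confounding problem Taubes is describing, again with invented numbers rather than anything from a real cohort. Mortality in this toy world depends only on a hidden “health-consciousness” trait, never on diet, yet the diet variable comes out looking strongly protective:

```python
# Toy illustration of confounding (invented numbers, not from any cohort):
# mortality here depends ONLY on a hidden "health-consciousness" trait,
# never on diet, yet the diet variable looks strongly protective.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# The hidden confounder: roughly half the population is health-conscious.
health_conscious = rng.random(n) < 0.5

# Health-conscious people are far more likely to report a plant-heavy diet...
plant_heavy_diet = rng.random(n) < np.where(health_conscious, 0.8, 0.2)

# ...and they also exercise, avoid smoking, see doctors, etc., so their
# mortality risk is lower for reasons that have nothing to do with diet.
died = rng.random(n) < np.where(health_conscious, 0.05, 0.15)

risk_plant = died[plant_heavy_diet].mean()
risk_other = died[~plant_heavy_diet].mean()
print(f"mortality among plant-heavy reporters: {risk_plant:.3f}")
print(f"mortality among everyone else:         {risk_other:.3f}")
print(f"apparent relative risk:                {risk_plant / risk_other:.2f}")
```

Epidemiologists try to “adjust” for confounders like this, but you can only adjust for variables you have measured, and measured well, which circles right back to the data-quality problem above.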
If you would like to learn about the problems and pitfalls of nutritional epidemiology in more detail, I recommend the work of Dr. John Ioannidis, who is one of the most respected critics of this field and its methods. These two articles are a good place to start:
The opening statement from the first of those papers states the problem plainly:
Some nutrition scientists and much of the public often consider epidemiologic associations of nutritional factors to represent causal effects that can inform public health policy and guidelines. However, the emerging picture of nutritional epidemiology is difficult to reconcile with good scientific principles. The field needs radical reform.
If we were to take the evidence from epidemiologic cohort studies at face value and assume the evidence they present represents causal associations across the lifespan, we would draw absurd conclusions. Ioannidis gives some funny examples:
For a baseline life expectancy of 80 years, eating 12 hazelnuts daily (1 oz) would prolong life by 12 years (ie, 1 year per hazelnut), drinking 3 cups of coffee daily would achieve a similar gain of 12 extra years, and eating a single mandarin orange daily (80 g) would add 5 years of life. Conversely, consuming 1 egg daily would reduce life expectancy by 6 years, and eating 2 slices of bacon (30 g) daily would shorten life by a decade, an effect worse than smoking. Could these results possibly be true? Authors often use causal language when reporting the findings from these studies (eg, “optimal consumption of risk decreasing foods results in a 56% reduction of all-cause mortality”).
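As a back-of-the-envelope check (my own arithmetic, not a calculation from those papers), take the quoted numbers at face value and stack them for one person who adds the hazelnuts, the coffee, and the mandarin orange to their daily routine:

$$12 + 12 + 5 = 29 \text{ extra years},$$

turning an 80-year baseline life expectancy into roughly 109 years, before even counting the supposed gains from giving up eggs and bacon. No individual food plausibly carries effects of that size.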
As Dr. Ioannidis noted, the scientists who produce these associational studies often use causal language when reporting their results—this isn’t just a problem of science journalism, where non-scientists are blindly taking the results of scientific studies at face value. Epidemiologists themselves often present their work in a cause-and-effect manner even though they’re looking at mere associations built from woefully incomplete datasets.
Ioannidis goes on to describe the complexity of human metabolism and nutrition, and the ways in which nutritional epidemiology research fails to disentangle the many different variables at play, all while making dubious assumptions:
Individuals consume thousands of chemicals in millions of possible daily combinations. For instance, there are more than 250,000 different foods and even more potentially edible items, with 300,000 edible plants alone. Seemingly similar foods vary in exact chemical signatures (eg, more than 500 different polyphenols). Much of the literature silently assumes disease risk is modulated by the most abundant substances; for example, carbohydrates or fats. However, relatively uncommon chemicals within food, circumstantial contaminants, serendipitous toxicants, or components that appear only under specific conditions or food preparation methods (eg, red meat cooking) may be influential. Risk-conferring nutritional combinations may vary by an individual’s genetic background, metabolic profile, age, or environmental exposures. Disentangling the potential influence on health outcomes of a single dietary component from these other variables is challenging, if not impossible.
He goes on to describe how public attention can facilitate the citation of studies with important shortcomings:
Nutritional epidemiology articles also attract attention because the public is very interested in (and perpetually misinformed about) nutrition. For example, one of the 20 highest Altmetric scores in 2017 was for a study reporting major survival benefits from coffee. Despite important limitations and shortcomings, such studies also accrue substantial numbers of citations… studies of nutritional epidemiology continue to be published regularly, spuriously affect guidelines, and confuse the public through heated advocacy by experts and nonexperts.
I recommend reading Ioannidis’ papers in full, but you get the idea. There are serious methodological issues with much of the nutritional epidemiology literature. As we saw in the recent example study discussed above, one major issue is the reliance on self-reported dietary data from food frequency questionnaires. Despite claims to the contrary, these are not reliable sources of data.
My personal opinion is to ignore any studies based on food frequency questionnaires (or articles about such studies). They are less than useless. Your cognitive resources are precious and limited. We each have a finite amount of attention, memory, and curiosity that we can deploy each day. If you’re wasting time and attention reading studies (or articles) that are based on unreliable data, you are wasting precious mental resources that would have been better spent elsewhere.
Even worse: you may internalize the findings of such studies as proven facts, contaminating your subsequent thinking. If you’re in the habit of internalizing headline findings, you may even build up elaborate belief systems about human health and nutrition, all based on flimsy data. At best, that will be a waste of time. At worst, you could construct your whole lifestyle around falsehoods, constantly wondering why you fail to look and feel healthy despite your best efforts at “following the science.”
The piece of “science journalism” below was recently published, garnering lots of attention. How many people read it? How many of them will continue consuming junk, because some journalist wrote about what they think “science says”?
Personally, I didn’t waste my attention.