STATS ARTICLES 2012
The Sugar Wars: Science's Fierce, Geeky Debate Over Soda
Trevor Butterworth, October 8, 2012
This article originally appeared on The Awl.
The debating season may be presidential, but if the spectacle of supersized pandering served with an unlimited salad bar of platitudes, slogans, and empty promises strikes you as strangely unfulfilling, there is always academia, where, sometimes, the politics are equally vicious because the stakes are equally high. Such was the case in San Antonio recently, at the Obesity Society's 30th annual meeting, the premier scientific conference in the US on what is, arguably, the nation's most pressing health problem. As the prologue to a four-day Finnegans Wake of technical discussion (did you know that NMDA receptor NR2B subunits in the parabrachial nucleus mediate compensatory feeding?), the society's presidential keynote debate took a simpler route: let's argue about soda. The result was mesmerizing, a data maven's idea of a good night out. Two academic foes, garlanded with paragraph-length job titles and trailing hundreds of published research papers, hurled PowerPoint slides filled with p-values and confidence intervals at each other with relentless, titanic geekiness. The audience—a large sample of the best and brightest in obesity research, and those who aspire to join them—squinted to keep up.
There were jokes too, although not many given the number of charts, numbers, and citations that had to be crammed into a fleeting hour or so of combative colloquy. Dr. Frank Hu, Professor of Nutrition and Epidemiology at Harvard's School of Public Health, started his assault on soda with a slide of Clint Eastwood debating an empty chair, the dominant image from the summer's Republican National Convention. Given that his opponent, Dr. David Allison, Distinguished Professor and Director of the National Institutes of Health-funded Nutrition Obesity Research Center at the University of Alabama at Birmingham, was such a "formidable debater," Hu said, he would rather be debating Eastwood's chair, perhaps with some soda bottles sitting on it. Through the magic of PowerPoint, three soda bottles appeared on the chair. The audience laughed. "If I debate that chair," said Hu, "I think I'm going to win." Allison, in turn, made a Doctor Who joke, giving away his age with a picture of Tom Baker, last seen saving the universe against cheaply made balsa wood and plastic BBC aliens in 1981. But that, in a peculiar way, was symbolic. The kids of the 70s had been nourished on cheaply made, weirdly pleasurable, possibly unnatural calories; but their bodies did not turn out to be like Doctor Who's Tardis, a telephone box that expanded on the inside while remaining svelte and fixed in its external dimensions.
Was it the soda, the sugar, the deluge of so-called empty calories that had made us so fat? Or was this no more than the academic equivalent of junk food, emotionally and politically satisfying yet intellectually empty?
Certainly, there's something intuitive to the idea that "Big Gulps" had an asteroid-like effect on the evolution of our girth. In that sense, Hu had the seemingly easy task of indicting soda—or sugar-sweetened beverages (SSBs), to be technically precise and inclusive—as the "plague rat" in the obesity epidemic, arguing for the motion that there is sufficient evidence that decreasing SSBs would reduce the prevalence of obesity and obesity-related diseases. This line of argument has been an increasingly common theme in the media, with activist groups such as the Center for Science in the Public Interest (the group that once denounced fettuccine alfredo as "a heart attack on a plate"), politicians such as New York's Mayor Bloomberg, and academics, such as Yale's Kelly Brownell, repeatedly calling for the government to regulate soda to reduce obesity—and, in the case of Bloomberg, actually doing so through regulating soda serving sizes. It's a tempting scenario: much as one might gut the worldwide prevalence of lung cancer by targeting tobacco, one might arrest and reverse ballooning obesity trends by indicting sugared drinks.
But things are not quite as simple as a media meme or a political trend, otherwise why would the Obesity Society stage such a debate or, indeed, an entire conference wherein soda was, at best, a minor player in a complex drama of cause and effect? To go from room to room at the Henry B. Gonzalez Convention Center was to go from probable cause to possible cause and back again. Sleep loss? Yes, the "epidemic" of sleep loss in America shadows the epidemic in obesity. Circadian rhythm? The timing of your meal may disrupt not only your body's internal clock, which ticks in every tissue, but also the circadian rhythms of the bacteria in your gut. Genetics? Obese mothers have underweight children, which is the key factor in predicting their children's later obesity. Food stress, as mothers wean, can permanently alter the growing brain though it may last only a few days—at least in macaque monkeys. Using heavier crockery to plate your food can, if you feel the weight, change your perception of how much food you think you'll need to eat. And these are but morsels from the conference menu.
As Mike Gibney, Professor of Food and Health at University College Dublin, recently noted, if we think of heart disease as having a complexity of one, and cancer a complexity of ten, then obesity has a complexity of 100. Which means, he continued, that anyone who argues that one food alone is behind the obesity epidemic is just not being intellectually serious.
Moreover, as the economic historian John Komlos has pointed out, it's a mistake to think that obesity suddenly arrived in America as a problem in the 1980s. If you looked at historical data for Body Mass Index (BMI—a proxy measure for fat), Americans started putting on the pounds in the 1920s and piling them on in the 50s and 60s. It just looked like the obesity epidemic struck in the 80s because that's when the BMI threshold for defining obesity was crossed by a mass of people who had been gradually putting on the pounds year after year. This, in turn, coincided with the Centers for Disease Control beginning to systematically measure it. All of which meant that while the Doctor Who-loving Allison had the seemingly hopeless task of arguing that the evidence wasn't sufficient to indict soda, it was far from clear that Harvard, a powerhouse public health school, was about to score an easy victory over Alabama, a powerhouse obesity research center—at least in front of this audience of experts.
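Komlos's point about the threshold can be made concrete with a little arithmetic. The sketch below is purely illustrative: it assumes BMI is normally distributed with a fixed spread (real BMI distributions are skewed), and the population means and the standard deviation are hypothetical numbers, not historical data. The only fixed input is the conventional obesity cut-off of a BMI of 30.

```python
from statistics import NormalDist

OBESITY_THRESHOLD = 30.0  # conventional BMI cut-off for obesity


def share_above_threshold(mean_bmi: float, sd: float = 4.0) -> float:
    """Fraction of a normally distributed population whose BMI exceeds the cut-off.

    The normal shape and the default spread are simplifying assumptions.
    """
    return 1.0 - NormalDist(mu=mean_bmi, sigma=sd).cdf(OBESITY_THRESHOLD)


# Hypothetical population means, rising by one BMI point per step.
for mean in (22, 23, 24, 25, 26, 27):
    print(mean, f"{share_above_threshold(mean):.1%}")
```

Under these assumptions, each equal step in the average adds more people above the cut-off than the previous step did, so a slow, steady drift in weight can show up as a sudden jump in measured obesity.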
In many ways, there is more going on in this debate than the biochemistry of sugar, although it was sugar that delivered a media high earlier this year, when pediatrician provocateur Robert Lustig published a comment in Nature claiming that sugar was toxic and needed to be regulated like alcohol (the world of food chemistry rolled its eyes, with female scientists reportedly deriding Lustig's assertion, at a particularly testy conference, that breast milk isn't sweet). Soda, in the past couple of years, has come to symbolize a fault line running through the public health academic complex, one with serious implications for politics, economics and the status and prestige of science: when is the scientific evidence good enough to justify policy interventions, such as Mayor Bloomberg's restrictions on soda serving size or hefty taxes per ounce?
The "Harvard" position, as laid out by Hu, could be characterized along the following lines: sugar-sweetened beverages have no nutritional benefit, "so people drinking fewer of them does no harm"—to borrow a phrase from another Harvard speaker at the conference. From this position, it's easy to rationalize government intervention on the basis of less-than-perfect scientific evidence; "good enough" is good enough if you're facing a massive problem that places huge costs on government.
The risk, of course, is that "good enough" is not really much good at all, and policy interventions based on conjecture and weak data won't work—or may turn out to have unanticipated and unfortunate consequences, thereby eroding the public's willingness to accept future, urgent, "science-based" interventions. When New York City's public schools trumpeted their removal of whole milk from lunch menus as a method of tackling childhood obesity, they neglected to mention that kids switched to drinking fat-free chocolate milk, which had just as many calories. At the same time, patently weak or overhyped science has a way of directing the public (or, more to the point, the food industry and politicians) to the disconnect between the weakness of the science, the promise of miraculous benefits, and the concrete loss of liberty and economic cost. This is only exacerbated when proponents of policy experimentalism resort to the kind of inflated marketing claims they would otherwise decry in the food industry, notably the repeated conflation of soda and tobacco, and soda manufacturers with "Big Tobacco" by academics such as Yale's Kelly Brownell. If you repeat the analogy often enough, people will conflate the two, even though obesity is not the same as lung cancer, sugar is not the same as nicotine, and soda is not the same as a cigarette. It doesn't even matter that Brownell's analogy refers to the beverage industry's attempts to defeat soda taxes; for when people think of Big Tobacco, they think not of taxes, but of a product that kills people—and an industry's attempt to hide that fact.
The more one advocates for experimental policy interventions, such as obesity taxes, the further one is forced to travel from the disinterested moral high ground of scientific certainty into the moralizing realm of assertion and spin. Science becomes Colbertian, "science-y"; policy becomes technocratic rather than democratic; and politicians dream the political dream, first articulated by Plato, of social engineering. When Bloomberg put his soda serving proposals to the city's health advisory board, he simply asked the experts that he appointed whether they agreed with him; and with near unanimity, they did. But how much of this consensus was just the theater of clubbability?
The casual way autonomy can be overturned by rhetorical claims to public health has the power to drive people potty. As the philosopher-turned-financier Michael Shaoul noted in a 1992 PhD thesis, the emergence of credit in the 1980s had a profound effect on identity, greatly accelerating the transformation whereby people stopped thinking of themselves as being solely defined by their labor in a particular workplace and began to think of themselves as consumers. What he termed the "consumer metaphor" at the time was to become the dominant interpretation of all professional relationships over the next two decades as we became consumers of healthcare and education rather than patients and students and so on. Public health experts have a tendency to think of the consumer as someone who is consuming the wrong things because they lack the right kind of knowledge or incentive to consume the right things or because an industry has successfully persuaded them to consume the wrong things. But when everyone sees their social role as, fundamentally, being a consumer, there is little transactional difference between a beverage company or a fast-food restaurant and a public health department.
This is something that struck me, as a non-scientist, throughout the conference: there was very little room in the "Harvard" conception of public health for the idea of pleasure as a "good"—as something the consumer values, wants, identifies with, and should be at liberty to pursue even at a personal risk to health.
The "Alabama" position, for want of a better way of characterizing it, represents a more cautious view of public health, and one that requires more stringent scientific evidence before deciding on a position, irrespective of the scale or urgency of the problem. Viewed from this direction, "Harvard" can be seen as contaminating science with too much subjectivity. As Alabama's Allison put it, sure, there's sufficient evidence to ask the question whether soda has an effect on obesity, but that's very different from asking whether there's sufficient evidence for action on soda. "The question 'Is there sufficient evidence for action,' is inherently subjective," he said, "and depends on which action, in which regulatory context, and according to whose tastes and moral values?" There is little nutritionally redeeming in a donut, so why were we not debating donuts?
Allison repeatedly emphasized the importance of disinterestedness as the foundational and animating value of rational inquiry throughout his speech and rebuttals, while pointing out where bias had skewed the obesity research on soda in the direction of finding associations with weight gain. Allison had, it should be noted, been the target of an ABC News "investigation" last year for taking industry money to do research. The piece insinuated that he had been funded to poke holes in the scientific evidence; however, there was only one source for this exposé, the University of North Carolina's Barry Popkin, who was forced to admit that he also took industry money (ABC interviewed at least one other academic who defended the integrity of Allison's research but, strangely, that interview never made it to air). Meanwhile, ABC neglected to mention that Allison's statistical analysis on bias in obesity research had been hailed as a wake-up call to the field by the International Journal of Obesity, and that it also, in effect, cast Popkin's research in a negative light. But then we all know that academic politics are vicious—and that academics will persecute each other in the media if they think they can get away with it.
By demanding a higher threshold for evidence, the Alabama conception of public health also offers a little more breathing space for liberty and pleasure, but as tempting as it is to push the Puritan Harvard vs. good-time Alabama contrast, the difference disappears when scientific disagreement disappears; then, we all must, presumably, obey what the science says.
You can see now where public health experts would divide on the issue of soda and obesity: For "Harvard," the perfect is the enemy of the "good enough," because the good enough is the gateway to action. In contrast, the "good enough" is the enemy of action for "Alabama" because, well, what if it's wrong? This is the divide within public health, echoed in many discussions over the conference, and this was the essence of the opening debate: Harvard's Hu almost immediately conceded that the evidence wasn't perfect, but then argued that we can't wait for better evidence given the scale of the obesity problem; what evidence we had was a clear enough rationale for action; and, as there were "no redeeming qualities" to sugar-sweetened drinks in a person's diet, we should act to curb consumption. In doing so, he explicitly drew a connection between soda and tobacco, showing a slide of Joe Camel slugging a bottle of Coca Cola.
Allison responded by saying that Hu was being too subjective about the evidence, when statistical analysis gave us strong, objective scientific reasons to dismiss it. As a consequence, a significant part of his presentation involved taking Hu's evidence and putting it through a statistical grinder. From the perspective of Oxford-style debating, it was hard not to judge this as something of a knockout, simply by virtue of the way Allison seemed to have anticipated exactly what Hu would say: Thus, when Hu put up a slide showing data, compiled by Allison and Richard Mattes, a professor of food and nutrition at Purdue University, as "proof that an increase in consumption of sugared beverages actually causes weight gain," Allison showed the same slide, updated with additional data, and proceeded to explain how Hu had glossed over its fundamental problems and how the real conclusion of the data was the opposite of what Hu claimed. For the primary continuous endpoints the studies set out to evaluate—body mass index (BMI), weight change, and total body fat—not one of the studies found a statistically significant outcome. "The effect," Allison said, "is essentially zero, neither statistically nor clinically meaningful."
These studies looked like they showed evidence of an effect because they had found some effects by accident and not design. The problem with these "secondary endpoints" is that the study is not statistically designed to find them—they can occur within a subgroup of the study sample, which, being smaller in number, can lead to exaggerated statistical significance. If this is confusing to follow, think of an analogy with former President George W. Bush's declaration of "Mission Accomplished" in Iraq, when the mission was anything but accomplished; it had simply been redefined to be a much smaller mission. As a consequence, secondary endpoints are treated with caution in scientific trials—or at least they are supposed to be treated with caution.
If you looked at the secondary endpoints in these studies, said Allison, then yes, each of the studies found one statistically significant result: but they were all for different things and so provided no evidence of consistency that might adduce causality. What was troubling—what the field of obesity research had to avoid—was spinning these secondary endpoints into major findings. You just can't do this when they weren't the endpoints you built your study to examine. Yet this is precisely what he, Allison, had found in a study (with Mark Cope) of bias in the obesity literature; the failure to find effects at the primary endpoints was often overlooked in favor of trumpeting the single statistically significant result at the secondary endpoint as an indication of general causality. They had described this—and other problems—in a paper that had sent shockwaves through the world of obesity research and which coined the phrase "White Hat Bias," which might be described as when scientists muck up research because they want to do good.
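One standard reason for the caution about secondary endpoints is the multiple-comparisons problem: the more outcomes a study checks, the likelier it is that at least one clears the significance bar by chance alone. The sketch below is an illustration under stated assumptions, not a reconstruction of Allison's analysis. It assumes the endpoints are independent and tested at the conventional 0.05 level, and the endpoint counts are hypothetical, since the article does not say how many secondary endpoints each study examined.

```python
def chance_of_false_positive(n_endpoints: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests comes out
    'significant' at level alpha when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** n_endpoints


# Hypothetical numbers of endpoints examined in a single study.
for n in (1, 5, 10, 20):
    print(n, round(chance_of_false_positive(n), 2))
```

With a single pre-specified endpoint the chance of a spurious hit stays at 5 percent, but it climbs quickly as endpoints are added, which is why one isolated significant result among many is weak evidence of cause.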
But it wasn't just Allison who found all of this statistical origami problematic. The European Food Safety Authority, the American Heart Association, the American Diabetes Association, and the German Nutrition Society had previously concluded that the evidence was too limited or muddled to draw the kind of clear conclusions Hu was drawing.
At best, evidence from a study by Cara Ebbeling of Harvard Medical School found that there was a statistically significant reduction in BMI among children who either eliminated or dramatically reduced SSB consumption over six months but who were overweight or obese and soda consumers to begin with. Other studies of interventions that included reducing soda consumption had failed to show any benefit. (A new study by Ebbeling, published the day following the debate in the New England Journal of Medicine, found that overweight and obese children who were part of an intervention group that reduced SSB consumption showed a statistically significant reduction in BMI after one year, compared to a control group that maintained consumption. However, the difference disappeared by the end of the second year, which was the primary endpoint.)
And here we come to one of the deeply confusing aspects of the soda debate: what is it precisely that we are actually debating about drinking soda? That it doesn't cause weight gain? That would be nonsensical: sugar contains calories and any increase in your caloric intake without a corresponding increase in energy expenditure is going to result in weight gain. The question is whether soda fails to satiate—if it's composed of so-called empty calories that won't tell your body that it's full (unlike, say, chicken or carrots).
If this is true, then sugar-sweetened beverages lead the body into an exaggerated state of energy imbalance: you're consuming calories but not feeling full, so you eat more energy than you expend without really noticing. On this account, the amount of soda we drink has disproportionately impacted the prevalence of obesity. Hu argued that, as a consequence, restricting such drinks would have a "significant impact" on obesity—although it would be "unlikely" to solve the problem. But he didn't show any data to illustrate how this impact would be significant (especially in the context of falling sugared soda consumption in the US), while, again, the entire conference testified to a bewildering number of complex and powerful factors behind the obesity problem.
Allison, however, did show data that disproved the empty calorie theory, and that showed that the body is able to compensate to some degree for sugared beverage intake, albeit not perfectly. As he put it, "The body does recognize liquid calories." It was also only common sense, he said, to tell someone who was obese and drinking six liters of soda a day that they should cut back. But this wasn't the same as saying that reducing SSB consumption would reduce obesity.
Meanwhile, Hu gave a shout-out to diet drinks, dismissing research that diet sodas and drinks also contributed to weight gain, a claim that's been receiving a lot of media attention lately. These studies relied on deeply problematic methodologies, he said, such as drawing on people who were already overweight and obese (and who therefore had underlying medical issues) but who had shifted to diet drinks. Consuming diet drinks, he said, led to positive weight outcomes.
Finally, both Hu and Allison agreed on one point: their esteem for Sir Austin Bradford Hill, the founding father of epidemiology. It was Hill—along with Sir Richard Doll—who was the first to demonstrate the overwhelming statistical association between smoking and lung cancer, and he went on to establish what might be thought of as epidemiology's equivalent of "The Rules" for distinguishing association from causation. Both debaters wanted to claim him for their own position.
Hu argued that when you looked at the randomized controlled trials for soda, there was, in sum, a 26 percent increased risk for weight gain (a relative risk of 1.26) and that this was strong. "I call that weak," countered Allison, noting that not only was there evidence that this 26 percent might be overstated, for various statistical reasons, but the relative risk for smoking and lung cancer was vastly greater—about 1,900 percent (a relative risk of 20). Hu responded by saying the relative risk for smoking and other diseases was in the 1.26 vicinity, which raised the intriguing question: if that was the only statistical evidence there was against smoking, would there be a public health campaign against tobacco?
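The percentages traded in this exchange follow from simple arithmetic on relative risk: a relative risk of RR corresponds to a percentage increase in risk of (RR - 1) x 100. A minimal sketch of that conversion, using only the two figures quoted in the debate (the function and variable names are illustrative, not anything the speakers used):

```python
def percent_increase(relative_risk: float) -> float:
    """Convert a relative risk into a percentage increase in risk."""
    return (relative_risk - 1.0) * 100.0


# Figures quoted in the debate.
soda_rr = 1.26     # Hu's figure for sugared drinks and weight gain
smoking_rr = 20.0  # Allison's figure for smoking and lung cancer

print(round(percent_increase(soda_rr)))     # 26
print(round(percent_increase(smoking_rr)))  # 1900
```

Seen this way, the gap between the two positions is a matter of scale: the smoking figure is roughly 73 times the soda figure when both are expressed as percentage increases.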
A longstanding rule of thumb in epidemiology is that a relative risk of 2 (a 100 percent risk increase) should be the threshold for judging the presence of a meaningful causal effect. As Sir Richard Doll noted, "when relative risk lies between 1 and 2 … problems of interpretation may become acute, and it may be extremely difficult to disentangle the various contributions of biased information, confounding of two or more factors, and cause and effect." Even that may be too low. Paolo Boffetta, in his recent study on causation where there are weak epidemiologic associations, noted that a relative risk below 3 is considered, at best, "moderate." Maddeningly, Hill himself straddled both sides of the statistical fence: the strength of the association is hugely important, he wrote, but at the same time slight effects shouldn't be dismissed, especially if there is consistency among different experiments and if other experimental and interpretive criteria are met. He did, however, say this of sugar: "we should need very strong evidence before we made people… stop… eating the fats and sugar that they do like." Touché? Maybe. "In asking for very strong evidence," he continues, "I would, however, repeat emphatically that this does not imply crossing every 't,' and swords with every critic, before we act." The one thing we can be certain of is that Hill is the most flexible of scientific oracles. Perhaps the real question is why, given the vast amounts of government money being spent on obesity research, have there been so few randomized controlled trials—the supposed gold standard for establishing evidence of causality—and so many weak observational studies on sugared drinks?
So what did the audience make of the debate? In a remarkably timid response to the vigor and volume of data on display, the chair, Patrick O'Neill, did not put the motion to a vote among the hundreds of experts and students in the hall. Instead, he asked how many people had changed their minds based on what they heard. A few—but not many—raised their hands. Perhaps the Obesity Society, which had put out a statement in support of Mayor Bloomberg's soda policies, didn't want to risk a vote where the audience, its membership, might be interpreted as disagreeing with that position; or perhaps the academic stakes for publicly confirming where one stood on soda were far too risky for most people who weren't at the top of their careers like Allison and Hu.
This latter hypothesis was partially confirmed when I asked, over the course of the conference, various poster presenters—newly minted and almost PhDs and MDs—what they thought of the debate: all were reluctant to comment and possibly offend one side or the other, without the assurance they wouldn't be named. So, with the promise of anonymity, I can report that some people were already in camp "Harvard" and admitted so on political grounds, or because the search for perfect evidence was a rationale to do nothing; for them, Hu was the clear winner. Others, however, expressed surprise at the dissection of the evidence. "It made me think about the data in a way I hadn't, because I am not that strong on data," said one academic. Another said she went into the debate with an open mind, but with the conviction that telling people what they couldn't eat was not a good idea in the real world; she said what troubled her was that academics have a tendency to go from the hypothesis to the conclusion without analyzing the validity of the data in between, and that the debate, as a consequence, had been eye-opening. Another said her coursework in statistical analysis had already been a wake-up call to her bias on this issue.
And that perhaps was the broader message to take away from this debate: urgent scientific problems are especially prone to bias because they are urgent. This was driven home by James Hill, Anschutz Endowed Chair of Health and Wellness at the University of Colorado, who spoke before the debate, after receiving one of the Obesity Society's top awards. Introduced as a "pioneer" in obesity medicine and as having made "seminal" contributions to understanding and tackling obesity, Hill made an impassioned plea for everyone in the obesity debate to examine their biases, because all of them—no matter whether they were from the Robert Wood Johnson Foundation or the Centers for Disease Control—were biased. "We assume each effect has a single cause, and cease our search for explanations when sufficient evidence is found," he warned. The result was that relevant science was not being considered: energy balance, behavioral economics, physical activity and sedentariness all urgently needed more attention in the study of obesity. And when it came to controversy over what the data said, the society could do much good by stepping into the fray as a disinterested broker. There is, he said, a "trade off between the need to act with the need to be right," and that was "the need to not make things worse." Build a policy on weak data and you risk not only that it will fail to work; it will fail in unexpected yet consequential ways.
As he explained, when Brian Wansink and his colleagues at Cornell University took a small town and randomized half the inhabitants to pay a ten percent soda tax, the initial results looked promising as soda consumption went down after a month. But it didn't last. After three and six months the reduction had disappeared. But that wasn't the really surprising thing they found; it was something else, something you would have struggled to predict: after the soda went up in cost, people bought more beer.
Trevor Butterworth is the Editor-at-Large of STATS at George Mason University.