
Presidential Predictors: How are the Washington Redskins such amazing election oracles?
Rebecca Goldin, Ph.D., November 2, 2012
Football, kids, the unemployment rate – which do you think has the best track record of predicting presidential election winners?

Most news about the election seems to be not about the differences between the candidates, but about what might predict the winner. A better predictor may be a tried and true method – one that has been on target for many, if not all, of the presidential elections: Washington Redskins games.

Since 1936, the Washington Redskins’ last home game before a presidential election has foretold the outcome for the incumbent party. If the Redskins won, so did the incumbent party. If the Redskins lost, so did the incumbent party – until, that is, 2004, when the Green Bay Packers won against the Redskins, predicting that John Kerry would win against George W. Bush.

But never mind that fateful game: in 2008, the Redskins returned to being the prophets of presidential politics by losing to the Pittsburgh Steelers, correctly predicting, in the process, that the presidency would change parties. And that is how we welcomed Barack Obama as the 44th President of the United States.

If football predicts the election, we’ll know on Sunday (November 4) what to expect from the Romney versus Obama contest. A win for the Redskins is a win for Obama, and a loss is a win for Romney.

But, c’mon, who would take football – even football in Washington D.C. – seriously, as a reliable predictor of a democratic process taking place across an entire country? Football just got lucky.

The thing is, it got very lucky. The Redskins got it right 17 times in a row before 2004, and 18 out of 19 times since 1936. The chance of guessing the presidency correctly in at least 18 out of 19 elections is less than 4 in 100,000.

What gives?

Consider another predictor: the “kids’ vote.” Nickelodeon’s Kids Pick the President is a voting program for kids promoted by the children’s television channel; its target audience gets to vote online, with a maximum of one vote per electronic device. The kids have also had a great run of it – getting 5 of the previous 6 elections right. The chance of “guessing” at least five out of six election outcomes correctly is less than 11 percent. But possibly kids’ opinions reflect their parents’ opinions, giving them more credibility than the spurious football success. This year, 65 percent of kids supported Obama, so Romney may be in trouble…

Except that even if kids’ opinions reflect their parents’ presidential preferences, they are not a good representative sample of the United States at large; at best, they are a good sample of parents in the US who have school-age children. But even for that demographic we have problems. Poorer kids may not have as many computer devices (favoring higher over lower socio-economic class votes) and this could bias the kid vote toward Romney.

The kid vote may also over-represent single-parent households, since in dual-parent households both parents can vote – in the actual election they will have twice the voting power of single-parent families, but not in Nickelodeon’s Kids Pick the President. The kids get one vote per device, independent of how many parents there are in the household. This bias would likely increase Obama’s support in the Nickelodeon vote, compared to the real vote, since Obama has more support than does Romney among single-parent households. Think of it this way: these single-parent households count as much as dual-parent households in the Nick vote; but in the real vote, those dual-parent households have two votes instead of one, and their vote (possibly in favor of Romney) will be comparatively larger.

Another bias in favor of Obama is simple name recognition – Obama does not just represent the incumbent party, he is also a sitting president. Kids are more likely to be influenced by simple name recognition than voting adults. Finally, there’s a minor problem putting Romney at a disadvantage for the kids’ vote: he didn’t participate in the program. Obama participated in an online discussion with kids, but Romney declined.

With all the bias in the sample, even the votes of parents with school-aged kids are unlikely to be well predicted, yet there’s that high success rate, so it hardly seems random.

Maybe we should consider the unemployment rate. According to The Big Picture, no incumbent has won the election when the unemployment rate was above 7.4 percent. As they put it, “if Obama wins, he will accomplish what no other incumbent has in the past 70 years.” With employment picking up in a number of swing states, one can only wonder if it’ll change the mood, and influence the vote.

Yet these unemployment predictions hide a statistical sleight of hand. Why pick 7.4 percent? Reagan won in 1984 with exactly 7.4 percent. If you only look at when the rate was over 7.4 percent, there isn’t much data. Since 1932, there have been four election years in which the rate was strictly over 7.4 percent. Of these four, two were won by the incumbent (FDR in 1936 and in 1940), and two were lost (Ford to Carter in 1976, and Carter to Reagan in 1980). If one insists on just looking after World War II (so one can say an incumbent hasn’t won with such high unemployment rates since World War II), there are only two elections to consider – 1976 and 1980. Sure enough the incumbent lost in these elections – but if these two elections were independent of the unemployment rate and completely randomly determined, there would be a 25 percent chance (one half times one half) that both would be lost. Not a very impressive record for a possible predictor for 2012.

Even more suspicious: if unemployment were the determining driver, why is there such a weak correlation between low unemployment and election results? Since 1944, there were 15 elections in which the rate was 7.4 percent or less. Of these, five were lost by the incumbent despite the low unemployment rate. This result (losing five or fewer – winning 10 or more out of 15) would happen by chance with about a 15 percent probability. This suggests that whether the unemployment rate is below or above 7.4 percent is not very predictive.

The picture changes radically if we move away from this biased cut-off of 7.4 percent. Why is it biased? The rate was picked after seeing the data; in other words, it is a particularly convenient choice that only suggests itself once the outcomes are known. A sound test of a presidential predictor would set the cut-off before knowing that 7.4 percent was the unemployment level at which Reagan won despite being an incumbent; choosing it afterward heavily biases the apparent predictive power of the unemployment rate.

If one makes the cut at, say, 7 percent or below, we have a less convincing story. In the 17 elections since 1944, there were four election cycles with unemployment above 7 percent (1976, 1980, 1984, and 1992). Of these, three out of four were won by the challenger and one was won by the incumbent. The probability of challengers winning at least three of the four by chance is 31.25 percent, not a very impressive guess at all. And as for the elections with unemployment at 7 percent or under, of the 13 occurring since 1944, the incumbent party lost four times despite the low unemployment rate. The probability of losing at most four times just by chance is just over 13 percent, again not a ringing endorsement for the predictive powers of unemployment.
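The three chance figures quoted for the two cut-offs can be checked with a short calculation. This is only a sketch under the article's assumption that each election is a fair coin flip; the function and variable names are illustrative, not from any source.

```python
from math import comb

def prob_at_least(k: int, n: int) -> float:
    """Chance of k or more 'hits' in n fair coin flips."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

def prob_at_most(k: int, n: int) -> float:
    """Chance of k or fewer 'hits' in n fair coin flips."""
    return sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n

# Cut-off at 7.4 percent: incumbents won 10 or more of 15 low-unemployment elections.
low_74 = prob_at_least(10, 15)
# Cut-off at 7 percent: challengers won at least 3 of 4 high-unemployment elections.
high_7 = prob_at_least(3, 4)
# Cut-off at 7 percent: incumbents lost at most 4 of 13 low-unemployment elections.
low_7 = prob_at_most(4, 13)

print(f"{low_74:.4f} {high_7:.4f} {low_7:.4f}")  # 0.1509 0.3125 0.1334
```

None of the three values comes close to the small probabilities one would want before calling unemployment a reliable predictor.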

All of these predictors have the following problem: if we look at a lot of different parameters for what might correlate well with the outcome of an election, we are bound to find some that look powerful – indeed, the Redskins’ performance at the last home game before an election seems the best of the ones considered thus far. No doubt if that had been an economic indicator, rather than a mere game, it would be widely touted as predictive – even causal, as unemployment has been.

The role of randomness, and multiple testing

Here’s the truth about randomness: it can look like it’s not random, especially if you cherry-pick your statistic after looking at the outcomes.

Imagine you flip a coin to predict the presidency: heads for Democrats, and tails for Republicans. Suppose you do it 19 times, and 18 of these times you get the “right” answer, just as the Redskins’ home games predicted elections since 1936. Given that this has such a small chance of occurring – less than .004 percent – we might think that this coin is truly magic. Or, if it were an economic indicator, that it must have a reason to track so closely with the presidential election.

But on the other hand, if I were to ask 1000 people to flip a coin 19 times, each time tracking the ability of that coin to predict the presidency, and then I simply picked the coin that best fit the actual election outcomes, I would be much more likely to get a coin with such talents. As a matter of fact, there is about a 4 percent probability that at least one of these coins will predict 18 or 19 of the elections. And if I allow 10,000 people to flip, there is a 32 percent chance that at least one of the coins will be this predictive.
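The thought experiment can also be simulated directly. The following is a rough Monte Carlo sketch, not the calculation used in the article: it assumes every flip is correct with probability one half, and the trial count, seed, and function names are arbitrary illustrative choices.

```python
import random

def some_coin_is_oracle(rng, n_coins=1000, n_elections=19, threshold=18):
    """True if at least one of n_coins fair coins calls `threshold`
    or more of `n_elections` elections correctly."""
    for _ in range(n_coins):
        # Draw 19 random bits, one per election; a 1 bit is a correct call.
        correct = bin(rng.getrandbits(n_elections)).count("1")
        if correct >= threshold:
            return True
    return False

def estimate(n_trials=2000, seed=2012, **kwargs):
    """Fraction of repeated experiments that produce an 'oracle' coin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(some_coin_is_oracle(rng, **kwargs) for _ in range(n_trials))
    return hits / n_trials

estimate_1000 = estimate()
print(f"Share of experiments with an oracle among 1000 coins: {estimate_1000:.3f}")
```

With only 2000 repetitions the estimate is noisy, but it should land in the neighborhood of the roughly 4 percent figure; passing `n_coins=10000` explores the larger experiment at the cost of a longer run.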

The analogy here is clear: if you look at 10,000 random parameters, you’re likely to find something that correlates extremely well with presidential outcomes, even if in fact it is completely independent and completely random.

We should be careful not to allow the fact that something could be causally related (for example, because it is not a random outcome) to trump the fact that the relationship could be spurious for reasons of randomness. The election is no doubt going to be exactly as predicted by about half of the indicators out there. That’s no endorsement of the indicator; rather it’s an endorsement of the electoral process.

Appendix: How did we get these numbers? (Get ready for some math!)

You may wonder how we calculated all the probabilities mentioned, that something could happen by chance. We assume that there is a probability of .5 that an election would go to one candidate or another if it were determined by chance alone. Then, among n elections, the chance that exactly k of these elections go to one party and the other n-k elections go to the other is given by

$$ \frac{n!}{k!\,(n-k)!}\left(\frac{1}{2}\right)^{n} $$

where n! = n(n-1)(n-2)...1 is the product of n times n-1 times n-2 times n-3, etc., all the way down to 1. For example, when n=19 and k=18, the chance of correctly predicting the presidency exactly 18 times out of 19 is given by

$$ \frac{19!}{18!\,1!}\left(\frac{1}{2}\right)^{19} = \frac{19}{524288} \approx 0.000036 $$

The percentage chance of winning is this number, multiplied by 100.

We are interested in calculating sums of these probabilities – i.e. the probability that the football game gets 18 or better correct. So we add in the probability that they get all 19 presidential elections correct,

$$ \frac{19!}{19!\,0!}\left(\frac{1}{2}\right)^{19} = \frac{1}{524288} \approx 0.0000019 $$

If we add these together, we get approximately .000038, or just under four in 100,000. More generally, the probability of a random (but fair) coin getting k or more correct is given by

$$ \sum_{i=k}^{n} \frac{n!}{i!\,(n-i)!}\left(\frac{1}{2}\right)^{n} $$

where the summation symbol Σ means to sum over the values i=k, i=k+1, i=k+2, etc., all the way to i=n. This is the formula we used, for n=15 and k=10, to find the probability that the unemployment rate got 10 or more correct predictions for the incumbent’s presidential bid, if the unemployment rate were just a random guess at the presidential chances.
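The sum described above can be evaluated exactly with rational arithmetic. This sketch assumes, as the appendix does, a probability of .5 per election; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def prob_exactly(k: int, n: int) -> Fraction:
    """n! / (k! (n-k)!) * (1/2)^n, the chance of exactly k correct out of n."""
    return Fraction(factorial(n), factorial(k) * factorial(n - k)) * Fraction(1, 2) ** n

def prob_k_or_more(k: int, n: int) -> Fraction:
    """Sum of prob_exactly(i, n) for i = k, k+1, ..., n."""
    return sum(prob_exactly(i, n) for i in range(k, n + 1))

redskins = prob_k_or_more(18, 19)      # 19/524288 + 1/524288
unemployment = prob_k_or_more(10, 15)  # the n=15, k=10 case

print(redskins, float(redskins))          # 5/131072, about 0.000038
print(unemployment, float(unemployment))  # 309/2048, about 0.15
```

Working with `Fraction` keeps the two terms of the football calculation visible as 19/524288 and 1/524288 before they are reduced and rounded.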

Finally, for the probability that, among 1000 coins, there would be at least one that gets 18 or 19 out of 19 correct, we found the probability that among 1000 coins, not one would get 18 or 19 right, and subtracted from 1. Since the probability that an individual coin would get 18 or 19 right is .000038, the probability that it would not is .999962. Take this to the power of 1000 to see the probability that 1000 coins would not get 18 or 19 right.

$$ (0.999962)^{1000} \approx 0.96 $$

The probability that none of them gets 18 or 19 right is about 96 percent, meaning the probability that at least one coin gets 18 or 19 right is about 4 percent. Similarly, if 10,000 coins are flipped, the chance that none get it right is

$$ (0.999962)^{10000} \approx 0.68 $$

Thus the probability that at least one coin gets 18 or 19 right is 1 minus .68, or a 32 percent chance.
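The "at least one coin" step is a one-line complement calculation. A minimal sketch, using the rounded single-coin figure of .000038 from the text; the function name is an illustrative choice.

```python
def prob_at_least_one(p_single: float, n_coins: int) -> float:
    """Chance that at least one of n_coins independent coins succeeds,
    computed as 1 minus the chance that every coin fails."""
    return 1 - (1 - p_single) ** n_coins

p_single = 0.000038  # rounded chance that one coin gets 18 or 19 of 19 right

p_1000 = prob_at_least_one(p_single, 1000)
p_10000 = prob_at_least_one(p_single, 10000)

print(f"1000 coins:  {p_1000:.1%}")
print(f"10000 coins: {p_10000:.1%}")
```

The results round to the roughly 4 percent and 32 percent quoted above; the same function shows how quickly the chance of finding a lucky "oracle" grows as more candidate indicators are examined.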

Rebecca Goldin, Ph.D. is the Director of Research at STATS.org. Dr. Goldin was supported in part by National Science Foundation Grant #202726

