**Odds Ratios**

Rebecca Goldin, Ph.D.

Possibly the most difficult concept to grasp when reporting research findings

Though they may sound synonymous, the odds ratio and the relative risk of an event are two distinct statistical concepts. The relative risk is the ratio of the probabilities of two events: if *p* is the probability of the first event, and *q* is the probability of the second, then the relative risk is *p/q*. This is what is being referred to when researchers say, for example, that “Smokers' risk of developing coronary heart disease is two to four times that of nonsmokers.” The risk of developing heart disease for smokers is stated *relative* to the same risk for nonsmokers. Mathematically, this means that the probability for a smoker will be two to four times the probability for a nonsmoker. If the risk of dying from coronary heart disease is 20 percent for a nonsmoker, the risk of dying from this disease if you’re a smoker is somewhere between 40 and 80 percent.

The odds ratio, however, is a ratio of the odds, not of the probabilities. Here the word “odds” is not used in its colloquial sense, where it often means “chance” or “likelihood.” In statistics, the *odds* of an event occurring is the probability of the event divided by the probability of the event *not* occurring.

For example, if a loaded coin lands on heads 75 percent of the time (and on tails 25 percent of the time), the odds of heads are 75/25 = 3. A good way of thinking about this is to ask: *how likely is the event to occur compared to how likely it is not to occur?* For an ordinary coin flip, where there’s a 50 percent chance of heads and a 50 percent chance of tails, the odds are 1 (50/50). This means that you are just as likely to get heads as you are tails. In the loaded-coin scenario, the odds are 3, which means that you are three times as likely to get heads as to get tails.
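The probability-to-odds conversion can be sketched in a couple of lines of Python (a minimal illustration; the function name `odds` is ours, not standard):

```python
def odds(p):
    """Odds of an event with probability p: P(event) / P(not event)."""
    return p / (1 - p)

print(odds(0.5))   # ordinary coin: odds of heads are 1.0
print(odds(0.75))  # loaded coin: odds of heads are 3.0
```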

Now we move to the *odds ratio*. This is used when you want to compare the odds of something occurring in two different groups. It is the *ratio* of the odds for the first group to the odds for the second group. The formula is

[*p*/(1 - *p*)] / [*q*/(1 - *q*)]

where *p* is the probability for the first group, and *q* is the probability for the second.

For example, for women under 35 undergoing IVF fertility treatment, the likelihood of getting pregnant hovers around 50 percent. The *odds* for this group are approximately 1: the women in this group are just as likely to get pregnant as not. On the other hand, if you are over 40, the chance of a successful IVF pregnancy is approximately 20 percent. The odds for a woman over 40 (who is undergoing treatment) are thus 20/80 (80 being the chance of not getting pregnant), or .25. This means for every one woman who gets pregnant, there are four women who do not; the chance of getting pregnant is one in five, but the odds are one to four.

The *odds ratio* is the ratio of these odds: 1/.25 = 4. Mathematically, since *p* = .5 and *q* = .2, the formula gives

[.5/(1 - .5)] / [.2/(1 - .2)] = 1/.25 = 4

In this case, an odds ratio of 4 does *not mean* that a woman under 35 is 4 times as likely as a woman over 40 to get pregnant; an under-35 woman is 50/20 = 2.5 times as likely to get pregnant as a woman over 40. An odds ratio of 4 means that women under 35 have 4 times the odds (not four times the probability!) of a successful IVF pregnancy that women over 40 have. In other words, women under 35 have four times the odds of getting pregnant using IVF, but they are only 2.5 times as likely.
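The IVF numbers can be checked directly. A minimal sketch using the probabilities quoted above (*p* = .5 for women under 35, *q* = .2 for women over 40):

```python
def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)

p = 0.5  # chance of IVF pregnancy for women under 35
q = 0.2  # chance of IVF pregnancy for women over 40

odds_ratio = odds(p) / odds(q)  # 1 / 0.25 = 4: four times the odds
relative_risk = p / q           # 0.5 / 0.2 = 2.5: only 2.5 times as likely

print(odds_ratio, relative_risk)
```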

Another way of thinking about the odds versus the chance is horse racing, where odds are still used in the “statistical” sense of the word. Betting houses often work so that your payoff is based on the likelihood you’ll win compared to the likelihood you won’t – in other words, on the odds, not the chance.

If a horse has 1:2 odds to win a race, and the “house” wants to come out even, then the payout will be $2 per dollar bet on the horse winning (if the horse wins) and 50 cents per dollar bet on the horse losing (if the horse loses) – plus, in each case, the original amount you bet.

Now suppose that one horse has odds of 1:2 (i.e., one chance in three he’ll win, and two chances in three he’ll lose), and another horse has odds of 1:3 (i.e., one chance in four he’ll win, and three chances in four he’ll lose). The odds ratio of the first horse to the second is (1/2)/(1/3), which is 1.5. This means that if your horse wins, you will win 1.5 times as much money by betting on the second horse, whose odds of winning were lower, as by betting on the first.

On the other hand, you are more likely to win by betting on the first horse. In fact, you will win one in three times (33 percent) with the first horse, and only one in four times (25 percent) with the second horse. The relative risk of winning on the first horse compared to the second is .33/.25 = 1.33. In other words, you have a 33 percent greater chance of winning by betting on the first horse (taken from the relative risk). But if you’re a lucky bettor (i.e., GIVEN that you win), you’ll get a better payoff by going with the second horse; the comparison of winnings between the two horses is described by the odds ratio.
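A short sketch of the two horses, under the break-even payout assumption described above (variable names are ours):

```python
def odds(p):
    """Odds: P(win) / P(lose)."""
    return p / (1 - p)

p1 = 1 / 3  # first horse: 1:2 odds, one chance in three of winning
p2 = 1 / 4  # second horse: 1:3 odds, one chance in four of winning

# A break-even house pays 1/odds dollars of profit per dollar bet to win
payout1 = 1 / odds(p1)  # $2 per dollar on the first horse
payout2 = 1 / odds(p2)  # $3 per dollar on the second horse

relative_risk = p1 / p2           # about 1.33: first horse wins 33% more often
odds_ratio = odds(p1) / odds(p2)  # 1.5: matches payout2 / payout1
```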

It’s easy to mistake the odds ratio for the relative risk; check out our piece on how the media misquoted odds ratios with regard to IVF treatment and acupuncture. Both the odds ratio and the relative risk have the benefit that they compare two groups and tell you something about the likelihood of something happening to one group compared to another. Both of them also have the property that if the ratio is 1, then the event is equally likely for both groups; if the ratio is higher than 1, then the event with probability *p* is more likely to occur; and if the ratio is lower than 1, then the event with probability *q* is more likely to occur.

But the odds ratio and the relative risk can have very different numbers in certain circumstances. If an event is highly likely to happen, or the initial risk of something is high, the odds ratio can still be large while the relative risk is not very high. For example, suppose women in a biology class get an A or B about 80 percent of the time (with odds 80/20 = 4) and men get an A or B about 70 percent of the time (with odds 70/30 = 2.3), then the *odds ratio* of women to men in the course is 4/2.3 = 1.7. But this does not mean that women get As and Bs 70% more frequently than men! Indeed, women are getting As and Bs approximately 14 percent more frequently than the men (80/70=1.14).
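The grade example works out the same way; a quick check using the 80 and 70 percent figures above:

```python
def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)

women, men = 0.80, 0.70  # chance of getting an A or B

odds_ratio = odds(women) / odds(men)  # 4 / 2.33... = about 1.7
relative_risk = women / men           # about 1.14: only 14 percent more often

print(round(odds_ratio, 2), round(relative_risk, 2))
```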

But because the chances of both groups getting good grades are high to begin with, the relative risk and the odds ratio diverge significantly. The problem is that when this happens – say, in medical research – journalists can dramatically overestimate the risk of something happening by turning the odds ratio into a percentage change in risk.

**Conditional Probability**

Rebecca Goldin, Ph.D.

The idea of *conditional probability* is very simple: what is the probability that an event happens, *given* that something else happened? For example, suppose you flip two coins. You can calculate the probability that you got two heads *given* that the first coin landed heads: this probability is ½, since all that is left to calculate is the probability that the second coin is a head. (Be careful: conditioning on *at least one* head instead gives ⅓, since three equally likely outcomes – HH, HT, TH – remain, and only one of them is two heads.)
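The difference between these two conditions is easy to verify by simulation; a minimal sketch using Python's random module:

```python
import random

random.seed(0)  # make the sketch reproducible
flips = [(random.random() < 0.5, random.random() < 0.5)
         for _ in range(100_000)]

# Condition on the FIRST coin being heads: about 1/2
first_heads = [pair for pair in flips if pair[0]]
p_given_first = sum(a and b for a, b in first_heads) / len(first_heads)

# Condition on AT LEAST ONE head: about 1/3
any_heads = [pair for pair in flips if pair[0] or pair[1]]
p_given_any = sum(a and b for a, b in any_heads) / len(any_heads)

print(round(p_given_first, 2), round(p_given_any, 2))
```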

The story quickly gets more complicated for slightly more advanced questions. In fact, we often want to know the probability of A given B, but we only have the probability of B given A, and many people confuse these two probabilities. For example, one can find the probability of getting lung cancer given that you smoke, which is about 1 in 10 as a lifetime risk, depending on how much you smoke. This is very different from the probability that you smoke, given that you have lung cancer (about 85 percent of lung cancer cases involve smokers).

An excellent example under current scrutiny relates to mammograms for women in their 40s. Without other information, the risk of a woman in her 40s having breast cancer is approximately 1 in 5,000. Mammograms can detect these cancers, but they also produce what are called *false positives*: test results that suggest there is a cancer (requiring a biopsy to confirm) when there actually is not. The false positive rate for mammograms is 1.7 percent, or 17 in 1,000.

What does this mean in terms of conditional probability? A woman who tests positive for breast cancer through a mammogram will want to know her chance of having cancer, *given that she tested positive*. This is a conditional probability, and in this case it’s not too hard to calculate. If 10,000 women are tested, approximately 2 will actually have cancer (1 in 5,000 is the same as 2 in 10,000). We will assume that the mammogram correctly flags both of them. There are additionally about 170 *false positives* (17 in 1,000 is 170 in 10,000). Thus, in total, 2 women actually have cancer while 172 test positive (assuming all cancers are caught). This implies a woman who tests positive has a 2/172 chance of actually having cancer – about 1 percent.
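The arithmetic in this paragraph can be laid out explicitly (using the 1-in-5,000 prevalence and 1.7 percent false-positive rate quoted above, and the same assumption that every real cancer is caught):

```python
population = 10_000

true_cancer = population / 5_000      # 2 women actually have cancer
false_positives = 0.017 * population  # about 170 false positives

# Assume the mammogram flags every real cancer, as in the text
total_positives = true_cancer + false_positives  # about 172 positive tests

p_cancer_given_positive = true_cancer / total_positives
print(round(p_cancer_given_positive, 3))  # about 0.012, i.e. roughly 1 percent
```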

Notice that the probability of her having cancer given that she tested positive is still very low, even though it’s about 50 times the chance compared with having had no mammogram and no information about whether she would test positive or negative.

The key here is that conditional probability *conditions* on something else being the case, and it is often vastly different from the probability without the condition. We also have to be very careful about exactly *what* we are conditioning on when we calculate a probability: the probability that a randomly chosen heart disease patient is bald (assuming someone has heart disease, what is the probability he is bald?) is very different from the chance that a random bald person has heart disease (assuming someone is bald, what is the chance he has heart disease?). It’s a tricky business, requiring us to keep explicit track of what we are assuming about a population before we evaluate the probability of anything else being the case.