Updated 3 Feb 2013

# Medical False Positives and False Negatives (Conditional Probability)

(adapted from pages 136–137 of John Allen Paulos, A Mathematician Reads the Newspaper)

Summary: If a test for a disease is 99% accurate, and you test positive, the probability you actually have the disease is not 99%. In fact, the more rare the disease, the lower the probability that a positive result means you actually have it, despite that 99% accuracy. The difference lies in the rules of conditional or contingent probability.

You’ve taken a test for a deadly disease D, and the doctor tells you that you’ve tested positive. How bad is the news? You need to know how accurate the test is, and specifically you need to know the probability that your positive test result actually means you have the disease.

Suppose you’re told the test for D is “99% accurate” in the following sense: If you have D, the test will be positive 99% of the time, and if you don’t have it, the test will be negative 99% of the time. The other 1% in each scenario would be a false negative or false positive. (For simplicity, I’m using the same percentage for both positive and negative results. Many tests have a different accuracy for positive and negative.) Suppose further that 0.1% — one out of every thousand people — have this rare disease.

You might think that a positive result means you’re 99% likely to have the disease. But 99% is the probability that if you have the disease then you test positive, not the probability that if you test positive then you have the disease.

This kind of thing is easier to understand if you work with numbers of people rather than percentages, and lay the numbers out in a chart like the one below.

Suppose 100,000 people are tested for disease D. Consider the people who actually have the disease (column 1). Since the disease prevalence is 1 in 1000, about 100,000×0.1% = 100 actually have the disease. And since 99% of people with the disease test positive, those 100 people will get about 99 positive tests and 1 negative test.

Now look at the healthy people (column 2). Out of the 100,000 who took the test, 100,000−100 = 99,900 don’t have the disease. Of the healthy people, 99,900×99% = 98,901 will test negative, and the other 999 will test positive.

Finally, add across to find the row totals (column 3). There are 99+999 = 1,098 positive test results and 1+98,901 = 98,902 negative test results.

| | Sick | Healthy | (totals) |
|---|---:|---:|---:|
| Test positive | 99 | 999 | 1,098 |
| Test negative | 1 | 98,901 | 98,902 |
| (totals) | 100 | 99,900 | 100,000 |
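If you’d like to check the chart, or redo it with other numbers, here is a minimal Python sketch of the same arithmetic. The function name `expected_counts` and the dictionary keys are my own labels for illustration, and rounding to whole people is a simplifying assumption.

```python
def expected_counts(population, prevalence, sensitivity, specificity):
    """Expected whole-person counts for each cell of the 2x2 chart.

    sensitivity = P(test positive | sick)
    specificity = P(test negative | healthy)
    """
    sick = round(population * prevalence)
    healthy = population - sick
    true_pos = round(sick * sensitivity)      # sick people who test positive
    false_neg = sick - true_pos               # sick people the test misses
    true_neg = round(healthy * specificity)   # healthy people who test negative
    false_pos = healthy - true_neg            # healthy people flagged by mistake
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
    }


# The example from the text: 100,000 people, 0.1% prevalence, "99% accurate".
counts = expected_counts(100_000, 0.001, 0.99, 0.99)
positives = counts["true_pos"] + counts["false_pos"]
negatives = counts["false_neg"] + counts["true_neg"]

print(counts)
print(f"P(have D | test positive)  = {counts['true_pos'] / positives:.1%}")
print(f"P(healthy | test negative) = {counts['true_neg'] / negatives:.3%}")
```

Changing the second argument (the prevalence) is the quickest way to see how strongly the share of false positives depends on how rare the disease is.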

From this chart, you can easily answer the questions, “What’s the probability that testing positive means I have the disease? What’s the probability that testing negative means I don’t?”

Out of the 1,098 tests that report positive results, 99 (9%) are correct and 999 (91%) are false positives. Therefore the probability that you actually have disease D, given a positive test result, is just 9%, and that’s for a test that is 99% accurate! Symbolically you can write this as P(have D | test positive) = 9%. (Remember that P(A|B) is the probability of “if B then A” or “A given that B is true”.)

To reiterate, the conditional probability that you test positive given that you have the disease is

P(test positive | have D) = 99÷100 = 99%

and this is usually what the “accuracy” of the test means. But the conditional probability that you have the disease if you test positive is

P(have D | test positive) = 99÷1098 = about 9%
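The same 9% can be reached without counting people, by using the standard relationship between the two conditional probabilities (Bayes’ theorem, which this page does not otherwise name). The sketch below plugs in the figures from the example: 99% for both kinds of accuracy and 0.1% prevalence.

```latex
\[
\begin{aligned}
P(\text{have D} \mid \text{test positive})
  &= \frac{P(\text{test positive} \mid \text{have D})\,P(\text{have D})}
          {P(\text{test positive} \mid \text{have D})\,P(\text{have D})
           + P(\text{test positive} \mid \text{healthy})\,P(\text{healthy})} \\[4pt]
  &= \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.01 \times 0.999}
   = \frac{0.00099}{0.01098} \approx 0.09
\end{aligned}
\]
```

Multiplying the numerator and denominator by 100,000 gives back 99÷1098, the same fraction as in the chart.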

Can you believe a negative result? Well, if you test negative, the probability that you are actually negative is P(healthy | test negative) = 98,901/98,902, virtually 100%, so a negative result is almost certainly correct.

The exact probabilities will vary depending on the accuracy of the test and the actual incidence of the disease, but always you have to look at the conditional probability. This is one reason why, for a disease like AIDS, patients are never told they test positive until the blood has been retested with a different test, to minimize the chance of a false positive.
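To see how much the answer moves with the incidence, and why a retest helps, here is a small Python sketch. The function name `prob_disease_given_positive` is hypothetical, and the retest calculation rests on a loud assumption: that the second test’s errors are independent of the first test’s errors, which a real pair of tests may not satisfy.

```python
def prob_disease_given_positive(prior, sensitivity, specificity):
    """P(disease | positive result), starting from a prior probability of disease."""
    true_pos = prior * sensitivity                # sick and flagged
    false_pos = (1 - prior) * (1 - specificity)   # healthy but flagged
    return true_pos / (true_pos + false_pos)


# Same "99% accurate" test, three different prevalences.
for prevalence in (0.001, 0.01, 0.10):
    p = prob_disease_given_positive(prevalence, 0.99, 0.99)
    print(f"prevalence {prevalence:6.1%} -> P(disease | positive) = {p:.1%}")

# Retest after a first positive result.
# ASSUMPTION: the two tests err independently of each other.
after_first = prob_disease_given_positive(0.001, 0.99, 0.99)
after_second = prob_disease_given_positive(after_first, 0.99, 0.99)
print(f"after one positive:  {after_first:.1%}")
print(f"after two positives: {after_second:.1%}")
```

The trick in the last step is that the probability after the first positive result becomes the starting probability for the second test.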

Doctors should be familiar with the probabilities when they give test results to patients, but if you get a positive result from a test for an uncommon disease, make sure your doctor understands.

## “Houston, we have a problem!”

Gordon MacGregor points out (email dated 27 Jan 2013) one giant unstated assumption here: that people who have the disease and people who don’t have the disease are equally likely to be tested for it. That’s probably true or nearly true for diseases like HIV or Huntington’s, where people with no symptoms are encouraged to get tested and do.

But it’s emphatically not true for diseases where people are typically not tested unless they have symptoms. So really what we need to know is not the prevalence of the disease among the general population — the 0.1% in the example above — but the proportion of people who take the test that actually have the disease.

There are three questions to answer before interpreting a positive or negative test result:

1. What proportion of people who undergo the test actually have the disease tested for?
2. What is the proportion of false negatives?
3. What is the proportion of false positives?

Let’s look at biopsy results in testing for breast cancer. (Please understand that what follows is not medical advice, and your own personal family history and risk factors mean that these figures may not apply to you.)

There are different types of biopsies, ordered by doctors for different reasons including the particular patient’s characteristics, but Figure E in Comparative Effectiveness of Core-Needle and Open Surgical Biopsy for the Diagnosis of Breast Lesions: Executive Summary from the US Agency for Healthcare Research and Quality indicates that 26%–35% of women biopsied actually have breast cancer. Fine Needle Aspiration Cytology (breast) from the General Practice Notebook in the UK indicates a false-positive rate of 1% to 3% and a false-negative rate of 10% to 18%. Notice that the false-positive rate is different from the false-negative rate; this is the usual situation.

You could do the analysis using the above ranges. But to keep things simple I’m just going to use the approximate midpoint of each range: say that 30% of women biopsied actually have breast cancer, and FNA biopsies yield 2% false positives and 14% false negatives. Using those figures, here’s the table:

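| | Have Breast Cancer | Don’t Have Breast Cancer | (totals) |
|---|---:|---:|---:|
| Test positive | 25,800 | 1,400 | 27,200 |
| Test negative | 4,200 | 68,600 | 72,800 |
| (totals) | 30,000 | 70,000 | 100,000 |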
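Here is a short Python sketch of how those counts follow from the three midpoint figures (30% of women biopsied have cancer, 2% false positives, 14% false negatives). The function name `biopsy_table` is made up for this illustration, and the 100,000 women tested is just a convenient round number, as in the earlier example.

```python
def biopsy_table(tested, share_with_cancer, false_pos_rate, false_neg_rate):
    """Expected counts: (true_pos, false_pos, false_neg, true_neg)."""
    cancer = round(tested * share_with_cancer)
    no_cancer = tested - cancer
    false_neg = round(cancer * false_neg_rate)     # cancer present, biopsy negative
    true_pos = cancer - false_neg                  # cancer present, biopsy positive
    false_pos = round(no_cancer * false_pos_rate)  # no cancer, biopsy positive
    true_neg = no_cancer - false_pos               # no cancer, biopsy negative
    return true_pos, false_pos, false_neg, true_neg


true_pos, false_pos, false_neg, true_neg = biopsy_table(100_000, 0.30, 0.02, 0.14)

print(f"positive: {true_pos:>7,} {false_pos:>7,} {true_pos + false_pos:>8,}")
print(f"negative: {false_neg:>7,} {true_neg:>7,} {false_neg + true_neg:>8,}")
print(f"P(cancer | positive) = {true_pos / (true_pos + false_pos):.0%}")
print(f"P(cancer | negative) = {false_neg / (false_neg + true_neg):.0%}")
```

Unlike the first example, the false-positive and false-negative rates are separate arguments here, because the two rates differ.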

Now you can compute probabilities. First, the false-positive rate, the likelihood of a positive result where there’s actually no cancer, was given as

P(positive | no cancer) = 1% to 3% (I used 2%)

But you’re interested in the probability that a positive result has actually detected cancer, and this is not 100% minus 2%.

P(cancer | positive) = 25,800/27,200 = 95%

The false-negative rate, the probability that a woman who has breast cancer gets a negative biopsy result, was given as

P(negative | cancer) = 10% to 18% (I used 14%)

But what’s the probability that a woman with a negative result actually has breast cancer?

P(cancer | negative) = 4,200/72,800 = 6%

These discrepancies come from the difference between P(A|B) and P(B|A), such as the difference between “getting a positive result if cancer is present” and “having cancer if the test result was positive”. The discrepancies are smaller than they were in the original example, because a much larger share of the people tested actually have the disease (30% versus 0.1%).
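I used midpoints to keep the table simple, but the analysis can also be run over the quoted ranges. The Python sketch below tries every combination of the range endpoints (26% to 35% with cancer, 1% to 3% false positives, 10% to 18% false negatives) and reports the lowest and highest results. It assumes the three ranges can be combined freely, which the sources don’t promise, and the function name `posterior_cancer` is hypothetical.

```python
from itertools import product


def posterior_cancer(share, false_pos_rate, false_neg_rate):
    """Return (P(cancer | positive), P(cancer | negative))."""
    pos_and_cancer = share * (1 - false_neg_rate)
    pos_no_cancer = (1 - share) * false_pos_rate
    neg_and_cancer = share * false_neg_rate
    neg_no_cancer = (1 - share) * (1 - false_pos_rate)
    given_pos = pos_and_cancer / (pos_and_cancer + pos_no_cancer)
    given_neg = neg_and_cancer / (neg_and_cancer + neg_no_cancer)
    return given_pos, given_neg


# Endpoints of the ranges quoted in the text.
# ASSUMPTION: any combination of endpoints is possible.
shares = (0.26, 0.35)
false_pos_rates = (0.01, 0.03)
false_neg_rates = (0.10, 0.18)

results = [posterior_cancer(s, fp, fn)
           for s, fp, fn in product(shares, false_pos_rates, false_neg_rates)]
given_pos = [r[0] for r in results]
given_neg = [r[1] for r in results]

print(f"P(cancer | positive): {min(given_pos):.1%} to {max(given_pos):.1%}")
print(f"P(cancer | negative): {min(given_neg):.1%} to {max(given_neg):.1%}")
```

Checking only the endpoints is enough here because each probability moves in one direction as any single input increases. Under these assumptions a positive result means cancer somewhere around 91% to 98% of the time, and a negative result still leaves roughly a 3% to 9% chance of cancer.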

Caution: Again, don’t use this page to make medical decisions. You should work with your doctor, in light of your unique medical situation.

## What’s New

• 3 Feb 2013: Add Houston, we have a problem! and the breast-cancer example.
• (intervening changes suppressed)
• 2 June 2002: New article.