Medical False Positives and False Negatives
Copyright © 2001–2020 by Stan Brown
(frequency table idea adapted from pages 136–137 of John Allen Paulos, A Mathematician Reads the Newspaper)
Summary: If a test for a disease is 98% accurate, and you test positive, the probability you actually have the disease is not 98%. In fact, the more rare the disease, the lower the probability that a positive result means you actually have it, despite that 98% accuracy. The difference lies in the rules of conditional or contingent probability.
You’ve taken a test for a deadly disease D, and the doctor tells you that you’ve tested positive. How bad is the news? You need to know P(D | pos), the probability that your positive test result actually means you have D.
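To answer this and interpret your test correctly, you need to know three numbers: the prevalence of D, meaning the proportion of people who have it; the false positive rate (FPR), P(pos | no D), the probability of a positive result when you don’t have D; and the false negative rate (FNR), P(neg | D), the probability of a negative result when you do have D.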
Sometimes instead of a false positive rate you’ll know the specificity, which is P(neg | no D), the probability of getting a correct negative result when you don’t have D. If you don’t have D, you must get either a positive or negative result, so P(pos | no D) + P(neg | no D) = 100%. Therefore, specificity = 100% − FPR, or FPR = 100% − specificity.
Sometimes instead of a false negative rate you’ll know the sensitivity, which is P(pos | D), the probability of getting a correct positive result when you have D. If you have D, you must get either a positive or negative result, so P(pos | D) + P(neg | D) = 100%. Therefore, sensitivity = 100% − FNR, or FNR = 100% − sensitivity.
Whether you consider sensitivity and specificity, or the false negative and false positive rates, the two numbers can be the same but typically they are different. But notice that none of them tells you directly what you want to know: does my positive result mean I have D?
Suppose you’re told the test for D is “98% accurate” in the following sense: If you have D, the test will be positive 99% of the time, and if you don’t have it, the test will be negative 97% of the time. In other words, the sensitivity is 99% and so the false negative rate is 1%; the specificity is 97% and therefore the false positive rate is 3%. Suppose further that 0.1% of people (one out of every thousand) have D.
You might think that a positive result means you’re 99% likely to have the disease. But 99% is the probability that if you have the disease then you test positive, not the probability that if you test positive then you have the disease. In symbols, P(pos | D) = 99%, but you want to know P(D | pos).
This kind of thing is easier to understand if you work with numbers of people rather than percentages, and lay the numbers out in a chart like the one below.
Suppose 100,000 people are tested for disease D. Consider the people who actually have the disease (column 1). Since the disease prevalence is 0.1% or 1 in 1000, about 100,000 × 0.1% = 100 actually have the disease. And since 99% of people with the disease test positive, those 100 people will get about 99 positive tests and 1 negative test.
Now look at the healthy people (column 2). Out of the 100,000 who took the test, 100,000 − 100 = 99,900 don’t have the disease. Of those healthy people, 99,900 × 3% = 2,997 will test positive, and the other 99,900 × 97% = 96,903 will test negative.
Finally, add across to find the row totals (column 3). There are 99 + 2,997 = 3,096 positive test results and 1 + 96,903 = 96,904 negative test results.
|Out of 100,000 tested||Have D||Don’t have D||Row total|
|Test result positive||99||2,997||3,096|
|Test result negative||1||96,903||96,904|
|Column total||100||99,900||100,000|
(For a disease that 1 in 1000 actually has, and a test with false positive rate 3% and false negative rate 1%.)
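If you’d rather let a computer fill in the chart, here is a minimal Python sketch of the same arithmetic. The function name frequency_table and its parameter names are my own illustrative choices, and the counts are expected values, so a real group of 100,000 people would not split this neatly.

```python
def frequency_table(population, prevalence, fpr, fnr):
    """Return expected counts of test result versus true disease status.

    Each row is (have D, don't have D, row total).
    """
    sick = population * prevalence
    healthy = population - sick
    true_pos = sick * (1 - fnr)      # sick people correctly flagged
    false_neg = sick * fnr           # sick people the test misses
    false_pos = healthy * fpr        # healthy people wrongly flagged
    true_neg = healthy * (1 - fpr)   # healthy people correctly cleared
    return {
        "positive": (true_pos, false_pos, true_pos + false_pos),
        "negative": (false_neg, true_neg, false_neg + true_neg),
        "total": (sick, healthy, population),
    }


# The example from the text: 100,000 tested, prevalence 0.1%, FPR 3%, FNR 1%.
table = frequency_table(100_000, prevalence=0.001, fpr=0.03, fnr=0.01)
for label, (have_d, no_d, total) in table.items():
    print(f"{label:<9}{have_d:>10,.0f}{no_d:>10,.0f}{total:>10,.0f}")
```

Changing the four arguments gives the chart for any other disease and test.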
From this chart, you can easily answer the questions, “What’s the probability that testing positive means I have the disease? What’s the probability that testing negative means I don’t?”
Out of the 3,096 tests that report positive results, 2,997 (97%) are false positives, and only 99 (3%) are correct. The probability that you actually have D, when you’re given a positive test result, is just 3%, so we can say that the positive predictive value (PPV) of this test is 3% — for a test that is 98% accurate! Symbolically you can write this as P(D | pos) = 3%. (Remember that P(A|B) is the probability of “if B then A” or “A given that B is true”.)
Let’s recap. The conditional probability that you test positive, given that you have the disease, is
P(pos | D) = 99 ÷ 100 = 99%
and this is what people sometimes call the “accuracy” of the test. (It’s actually the definition of the sensitivity of the test.) But the conditional probability that you have the disease if you test positive, the positive predictive value, is
P(D | pos) = 99 ÷ 3,096 = about 3%
The number of sick people with positive results is on top of both fractions, but the first fraction has the total sick people on the bottom and the second fraction has the total positive results.
Can you believe a negative result? Well, if you test negative, then the probability that you are actually negative is P(no D | neg) = 96,903 ÷ 96,904, close to 100%, so a negative result is almost certainly correct. You can say that the negative predictive value (NPV) is close to 100%.
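The chart is really Bayes’ theorem written with head counts. If you prefer to work with probabilities directly, the following sketch computes both predictive values from the prevalence, sensitivity, and specificity. The function name predictive_values is an illustrative assumption of mine, not anything standard.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV), that is (P(D | pos), P(no D | neg)), via Bayes' theorem."""
    # Total probability of a positive result: true positives plus false positives.
    p_pos = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
    p_neg = 1 - p_pos
    ppv = prevalence * sensitivity / p_pos
    npv = (1 - prevalence) * specificity / p_neg
    return ppv, npv


# Same example as the chart: prevalence 0.1%, sensitivity 99%, specificity 97%.
ppv, npv = predictive_values(prevalence=0.001, sensitivity=0.99, specificity=0.97)
print(f"PPV = {ppv:.1%}, NPV = {npv:.3%}")
```

The two results are the same fractions as in the chart, 99 ÷ 3,096 and 96,903 ÷ 96,904, just computed without choosing a population size.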
The exact probabilities will vary depending on the accuracy of the test and the actual incidence of the disease, but you always have to look at the conditional probability. This is one reason why, for a disease like AIDS, patients are never told they test positive until the blood has been retested with a different test, to minimize the chance of a false positive. See Repeating a test, later.
Doctors should be familiar with the probabilities when they give test results to patients, but if you get a positive result from a test for an uncommon disease, make sure your doctor understands.
Gordon MacGregor points out (email dated 27 Jan 2013) one giant unstated assumption here: that people who have the disease and people who don’t have the disease are equally likely to be tested for it. That’s probably true or nearly true for diseases like HIV or Huntington’s, where people with no symptoms are encouraged to get tested and do.
But it’s emphatically not true for diseases where people are typically not tested unless they have symptoms. So really what we need to know is not the prevalence of the disease among the general population — the 0.1% in the example above — but the proportion of people who take the test that actually have the disease.
Let’s think about biopsy results in testing for breast cancer. (Please understand that what follows is not medical advice, and your own personal family history and risk factors mean that these figures may not apply to you.)
There are different types of biopsies, ordered by doctors for different reasons including the particular patient’s characteristics, but Figure E in Comparative Effectiveness of Core-Needle and Open Surgical Biopsy for the Diagnosis of Breast Lesions: Executive Summary from the US Agency for Healthcare Research and Quality indicates that 26%–35% of women biopsied actually have breast cancer. Fine Needle Aspiration Cytology (breast) from the General Practice Notebook in the UK indicates a false-positive rate of 1% to 3% and a false-negative rate of 10% to 18%. (The sensitivity is therefore 82%–90%, and the specificity is 97%–99%.)
You could do the analysis using the above ranges. But to keep things simple I’m just going to use the approximate midpoint of each range: say that 30% of women biopsied actually have breast cancer, and FNA biopsies yield 2% false positives and 14% false negatives. Using those figures, here’s the table:
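|Out of 100,000 biopsied||Have cancer||Don’t have cancer||Row total|
|Test result positive||25,800||1,400||27,200|
|Test result negative||4,200||68,600||72,800|
|Column total||30,000||70,000||100,000|
(Assuming that 30% of women biopsied actually have breast cancer, and the biopsy has a false positive rate of 2% and false negative rate of 14%.)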
Now you can compute probabilities. First, the false-positive rate, the likelihood of a positive result where there’s actually no cancer, was given as
P(pos | no cancer) = 1% to 3% (I used 2%)
But you’re interested in the probability that a positive result has actually detected cancer, and this is not 100% minus 2%.
P(cancer | pos) = 25,800 ÷ 27,200 = 95%
The false-negative rate, the probability that a woman who has breast cancer gets a negative biopsy result, was given as
P(neg | cancer) = 10% to 18% (I used 14%)
But what’s the probability that a woman with a negative result actually has breast cancer?
P(cancer | neg) = 4,200 ÷ 72,800 = 6%
These discrepancies come from the difference between P(A|B) and P(B|A), such as the difference between “getting a positive result if cancer is present” and “having cancer if the test result was positive”. The differences are less than they were in the original example, because the incidence is greater (30% versus 0.1%).
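To see how strongly the incidence drives the answer, here is a small sketch that holds the first example’s test fixed (false positive rate 3%, false negative rate 1%) and varies only the proportion of tested people who have the disease. The 1% and 10% steps are values I picked for illustration; only 0.1% and 30% come from the examples above.

```python
def ppv(prevalence, fpr, fnr):
    """P(D | pos): the share of positive results that are true positives."""
    true_pos = prevalence * (1 - fnr)
    false_pos = (1 - prevalence) * fpr
    return true_pos / (true_pos + false_pos)


# Same test throughout; only the prevalence among tested people changes.
for prevalence in (0.001, 0.01, 0.10, 0.30):
    print(f"prevalence {prevalence:>6.1%}: P(D | pos) = {ppv(prevalence, fpr=0.03, fnr=0.01):.1%}")
```

The probability climbs steadily as the disease becomes more common among the people tested, which is why the two examples give such different answers.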
Caution: Again, don’t use this page to make medical decisions. You should work with your doctor, in light of your unique medical situation.
Reader Jarno Makkonen writes in to ask, “if you repeat the test and get a confirming result, then what does that do to the probability that a positive result is accurate?”
We can say that two positive results give us greater confidence than one, but how much greater? This depends on the exact mechanism that causes a false positive or false negative result, and this will be different for different tests.
One important question is whether a false result is essentially a random occurrence, or is tied in some way to characteristics of an individual. For example, suppose that using alcohol or other recreational drugs makes you more likely to get a false positive result, or suppose having diabetes makes it more likely, or a recent broken bone. The body is so fantastically complicated that I would imagine each of those could affect some test.
But I don’t have medical training, so let’s stick with pure probability. In other words, let’s make an assumption that any person is as likely to get a false positive (or false negative) as any other person, so that nothing in an individual’s biology has a significant effect on the chance of a false positive (or negative). If we make that simplifying assumption, then my reader’s question can be answered.
We want to know, “If I tested positive twice, how likely is it that I have (or don’t have) the disease?” Well, let’s expand the breast-cancer table to show the results of a second test for people who tested positive the first time, or people who tested negative the first time.
To help you read the table a little more easily, each pair of second-test rows sits directly beneath the first-test row it follows. For example, in column 1, we see that of the 25,800 women who actually had breast cancer and got a correct positive result the first time, 22,188 got a positive second result and 3,612 got a negative second result: that’s our false negative rate of 14%, and 14% of 25,800 is 3,612.
|Out of 100,000 biopsied||Have cancer||Don’t have cancer||Row total|
|1st result positive||25,800||1,400||27,200|
|2nd result positive||22,188||28||22,216|
|2nd result negative||3,612||1,372||4,984|
|1st result negative||4,200||68,600||72,800|
|2nd result positive||3,612||1,372||4,984|
|2nd result negative||588||67,228||67,816|
|Column total||30,000||70,000||100,000|
(Assuming that 30% of women biopsied actually have breast cancer, and that the biopsy has a false positive rate of 2% and false negative rate of 14%, and that a false positive is equally likely for everyone, and the same for a false negative.)
What about the 70,000 women in column 2 who don’t have breast cancer? 1,400 will nonetheless get a positive result on the first test: that’s our 2% false positive rate. (Remember, we’re assuming that false positives and false negatives don’t depend significantly on any characteristics of the individual, but only on the test itself.) Of course we don’t know which 1,400 women got the wrong result. But we do know that 2% of any group of women without breast cancer get a false positive, and 2% of 1,400 is 28. The other 1,372 get a correct negative result on the second test.
Once the table numbers are filled in, we can answer my reader’s questions. If you have one test, and the result is positive, there’s a 95% chance you have breast cancer (row 2, 25,800/27,200 = 95%). If you have two tests, both positive, the probability rises to nearly 100% (row 3, 22,188/22,216 = 99.87%). On the other hand, if you have one negative result, you have a 94% chance of not having breast cancer (row 5, 68,600/72,800 = 94%); a second negative test pushes that to 99% (row 7, 67,228/67,816 = 99%).
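Under the same simplifying assumption (every test result is an independent draw with the same error rates for everyone), repeating a test amounts to applying Bayes’ theorem once per result, with each answer becoming the starting probability for the next test. Here is a minimal sketch; the function names update and after_results are my own, chosen for illustration.

```python
def update(prior, fpr, fnr, result_positive):
    """One Bayes update of P(disease) after a single test result."""
    if result_positive:
        like_sick, like_healthy = 1 - fnr, fpr   # P(pos | D), P(pos | no D)
    else:
        like_sick, like_healthy = fnr, 1 - fpr   # P(neg | D), P(neg | no D)
    numerator = prior * like_sick
    return numerator / (numerator + (1 - prior) * like_healthy)


def after_results(prior, fpr, fnr, results):
    """Apply update() for each result in order (True means a positive test)."""
    p = prior
    for result_positive in results:
        p = update(p, fpr, fnr, result_positive)
    return p


# Breast-cancer midpoints: 30% of women biopsied have cancer, FPR 2%, FNR 14%.
p_two_pos = after_results(0.30, 0.02, 0.14, [True, True])
p_two_neg = after_results(0.30, 0.02, 0.14, [False, False])
print(f"P(cancer | two positives)    = {p_two_pos:.2%}")
print(f"P(no cancer | two negatives) = {1 - p_two_neg:.2%}")
```

These agree with rows 3 and 7 of the table, and passing a longer list of results handles a third or fourth test. If false results are tied to something about the individual, the updates are no longer independent and this shortcut overstates your confidence.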
Filling in all those numbers is a fair amount of work, and mistakes are easy to make. You might want to create formulas in terms of these four variables:
It’s a nice intellectual exercise to develop formulas, but if you just want to know the probabilities, take a look at the accompanying Excel workbook.
It’s already set up with the breast-cancer example using the midpoint figures for the variables. You might want to see how the probabilities change when you vary the proportion of women with breast cancer between 26% and 35%, the false positive rate between 1% and 3%, and the false negative rate between 10% and 18%. But of course you can enter numbers for any problem of your own.
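If you don’t have Excel handy, a few lines of Python can do the same exploration. This is only a sketch, not a reproduction of the workbook: the function name and the choice to try just the low, middle, and high value of each range are my own assumptions. Because each probability moves in one direction as each input increases, the extremes occur at the ends of the ranges.

```python
from itertools import product


def ppv_and_miss_rate(prevalence, fpr, fnr):
    """Return (P(cancer | pos), P(cancer | neg)) for one set of inputs."""
    true_pos = prevalence * (1 - fnr)
    false_pos = (1 - prevalence) * fpr
    false_neg = prevalence * fnr
    true_neg = (1 - prevalence) * (1 - fpr)
    return true_pos / (true_pos + false_pos), false_neg / (false_neg + true_neg)


# Low, midpoint, and high values of each range quoted in the text.
prevalences = (0.26, 0.30, 0.35)
fprs = (0.01, 0.02, 0.03)
fnrs = (0.10, 0.14, 0.18)

results = [ppv_and_miss_rate(p, fp, fn) for p, fp, fn in product(prevalences, fprs, fnrs)]
ppvs = [r[0] for r in results]
misses = [r[1] for r in results]
print(f"P(cancer | pos) ranges from {min(ppvs):.1%} to {max(ppvs):.1%}")
print(f"P(cancer | neg) ranges from {min(misses):.1%} to {max(misses):.1%}")
```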
As I write this, on 9 May 2020, it’s unfortunately true that a positive or negative result on a COVID-19 test can’t be interpreted using probability. To answer the questions “I tested positive; what’s the chance I actually have the virus?” or “I tested negative; what’s the chance I have the virus anyway?” you need to know three things:
We didn’t know any of those when I first wrote this section, but three and a half weeks later (28 May 2020) we’re getting some ideas. The FDA has published estimated sensitivity and specificity in EUA Authorized Serology Test Performance. (Remember that the false positive rate is 100% minus the specificity, and the false negative rate is 100% minus the sensitivity.) And the CDC is telling us, in the Test Performance section of Interim Guidelines for COVID-19 Antibody Testing:
In most of the country, including areas that have been heavily impacted, the prevalence of SARS-CoV-2 antibody is expected to be low, ranging from <5% to 25%, so that testing at this point might result in relatively more false positive results and fewer false-negative results.
In some settings, such as COVID-19 outbreaks in food processing plants and congregate living facilities, the prevalence of infection in the population may be significantly higher. In such settings, serologic testing at appropriate intervals following outbreaks might result in relatively fewer false positive results and more false-negative results.
(But remember that what matters is not the prevalence of a disease in the population, but the proportion among tested people who actually have the disease.)
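To see the pattern the CDC describes, here is a sketch that counts the expected false positives and false negatives among 10,000 people tested, at 5% and at 25% prevalence. The 95% sensitivity and 95% specificity are hypothetical round numbers I chose purely for illustration. They are not the FDA’s figures for any actual test, so substitute the published values for the test you care about.

```python
def false_result_counts(tested, prevalence, sensitivity, specificity):
    """Return expected (false positives, false negatives) among those tested."""
    infected = tested * prevalence
    uninfected = tested - infected
    false_neg = infected * (1 - sensitivity)
    false_pos = uninfected * (1 - specificity)
    return false_pos, false_neg


SENSITIVITY = 0.95  # hypothetical value, for illustration only
SPECIFICITY = 0.95  # hypothetical value, for illustration only

for prevalence in (0.05, 0.25):
    fp, fn = false_result_counts(10_000, prevalence, SENSITIVITY, SPECIFICITY)
    print(f"prevalence {prevalence:.0%}: about {fp:,.0f} false positives, {fn:,.0f} false negatives")
```

With the test held fixed, the lower prevalence produces more false positives and fewer false negatives, and the higher prevalence does the reverse.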
Updates and new info: https://BrownMath.com/stat/