## How to interpret (COVID-19 and other) test results


In these times of pandemic, there is a lot of talk about the importance of testing. And there is no question indeed that containing the propagation of the virus requires identifying sick people, and therefore testing them. However, it is tempting to take the result of a test for granted: if I test positive, I am sick; if I test negative, I am not.

Unfortunately, tests are not 100% reliable: you may test positive and not be sick – this is called a false positive; you may test negative and nevertheless be sick – this is called a false negative.

In order to understand what it means to receive a positive or negative result from a test, we need two pieces of information:

• The prevalence of the disease, i.e. the proportion of the population that is sick;
• The accuracy of the test, i.e. the probability that the test gives a correct result.

Typically, the prevalence is low, and the accuracy is high.

It is tempting to think that if the test is accurate, say, 90% of the time, then a positive result means you have a 90% chance of being sick (and conversely, a negative result means you have a 90% chance of being healthy). THIS IS NOT THE CASE!

In fact, with a prevalence of 10% and an accuracy of 90%, a positive result means you have only a 50% chance of being sick, while a negative result means you have about a 98.8% chance of being healthy!

Note 1: Rather than using the word "sick", we should say "infected" instead because of the incubation period and the fact that a number of infected people do not get sick at all. For simplicity, we will nevertheless stick with the word "sick".

Note 2: The accuracy of a test is normally measured with two numbers: sensitivity and specificity. We get to these at the end of the page.

### Why is that?

Let us visualize these two parameters:

• In the first diagram below, the prevalence divides the population in two: on the left, those who are sick; on the right, those who are healthy;
• In the second diagram, the accuracy of the test divides the results of the tests into two categories: on the top, those that give a correct result (negative if you are healthy, positive if you are sick); on the bottom, those that give an inaccurate result (negative if you are sick – the false negatives, positive if you are healthy – the false positives).

[Diagram 1: the population split by prevalence, with Sick on the left and Healthy on the right.]
[Diagram 2: the test results split by accuracy, with Accurate on the top and Inaccurate on the bottom.]

Now let's cross these two diagrams. We get the diagram below with four quadrants:

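| | Sick | Healthy | Total |
|---|---|---|---|
| Accurate test | accuracy * prevalence (true positives) | accuracy * (1-prevalence) (true negatives) | accuracy |
| Inaccurate test | (1-accuracy) * prevalence (false negatives) | (1-accuracy) * (1-prevalence) (false positives) | 1-accuracy |
| Total | prevalence | 1-prevalence | 100% |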

The top part represents the proportion of the population that received a correct result, either positive because they are sick (left) or negative because they are healthy (right).

The bottom part represents the proportion of the population that received incorrect results: false negatives – the test is negative but they are, in fact, sick (left); false positives – the test is positive but they are, in fact, healthy (right).
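The four quadrants can be sketched in a few lines of Python. This is only an illustration: the function name `quadrants` and the dictionary keys are chosen here and do not come from the original page, and the 10% / 90% values are just an example.

```python
def quadrants(prevalence, accuracy):
    """Split the population into the four quadrants of the diagram.

    Both arguments are proportions between 0 and 1.
    """
    sick, healthy = prevalence, 1 - prevalence
    return {
        "true_positive": accuracy * sick,            # top left: sick, accurate test
        "true_negative": accuracy * healthy,         # top right: healthy, accurate test
        "false_negative": (1 - accuracy) * sick,     # bottom left: sick, inaccurate test
        "false_positive": (1 - accuracy) * healthy,  # bottom right: healthy, inaccurate test
    }


# Example values (assumed for illustration): 10% prevalence, 90% accuracy.
q = quadrants(prevalence=0.10, accuracy=0.90)
for name, share in q.items():
    print(f"{name:>14}: {share:.1%}")
```

Whatever the parameters, the four shares always add up to 100% of the population.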


### Another way to look at it

Another way to look at the diagram is to note that the diagonal running from the bottom-left to the top-right represents those who received a negative result: most were healthy (true negatives, top-right), but some were sick (false negatives, bottom-left). The other diagonal, from the top-left to the bottom-right, represents those who received a positive result: some were sick (true positives, top-left), but some were healthy (false positives, bottom-right).

Notice that when the prevalence (proportion of sick people) is equal to the inaccuracy of the test (proportion of tests that give an incorrect result, i.e. 100% - accuracy), the proportions of true and false positives become the same (the two rectangles in the top-left and bottom-right have the same area). This is because a small proportion (the false positives) of a large population (the healthy people) can be the same as a large proportion (the true positives) of a small population (the sick people). The net result is that in this case, if you get a positive result, you have only a 50% chance of being sick! For example, with 10% prevalence and 90% accuracy, true positives and false positives each make up 9% of the population.

In the general case, the confidence you can have in the results is calculated as follows:

• Probability of being sick if you receive a positive test: true positives / (true positives + false positives)
• Probability of being healthy if you receive a negative test: true negatives / (true negatives + false negatives)
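As a rough sketch, these two ratios can be computed directly from the prevalence and the accuracy. The function name `confidence` is an assumption made for this example, and the inputs are the 10% prevalence / 90% accuracy case mentioned above.

```python
def confidence(prevalence, accuracy):
    """Return (P(sick | positive), P(healthy | negative)).

    Both arguments are proportions between 0 and 1.
    """
    true_pos = accuracy * prevalence
    false_pos = (1 - accuracy) * (1 - prevalence)
    true_neg = accuracy * (1 - prevalence)
    false_neg = (1 - accuracy) * prevalence
    sick_if_positive = true_pos / (true_pos + false_pos)
    healthy_if_negative = true_neg / (true_neg + false_neg)
    return sick_if_positive, healthy_if_negative


sick_if_positive, healthy_if_negative = confidence(prevalence=0.10, accuracy=0.90)
print(f"P(sick | positive)    = {sick_if_positive:.1%}")
print(f"P(healthy | negative) = {healthy_if_negative:.1%}")
```

The sketch does not guard against degenerate inputs (for instance a prevalence of 0 with a perfect test), where one of the denominators is zero.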

These values can be quite different from the accuracy of the test.

### An alternative representation

Since we are interested in interpreting the result of a test (positive or negative), let us use a different representation. The diagram below shows the proportion of people who received a negative result on the first line, split between those who were sick and those who were healthy. The second line does the same for those who received a positive result. In other words, we turn the area representations of the previous diagram into bars and organize them differently.

From this diagram we can easily see the proportion of correct results: for the first line, it is the size of the bar on the right (negative test and healthy) relative to the whole first bar (all negative tests); for the second line, it is the size of the bar on the left (positive test and sick) relative to the whole second bar (all positive tests).

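[Diagram: two stacked bars, each split into Sick (left) and Healthy (right). First bar, negative tests: the healthy share is the chance that you are healthy if you receive a negative test. Second bar, positive tests: the sick share is the chance that you are sick if you receive a positive test.]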

### What about testing for COVID-19?

The prevalence of COVID-19 is fairly low. Current estimates are between 5% and 15% of the population.
The accuracy of available tests is also quite low, about 75% (see some references at the end of the page).
With a prevalence of 10% and an accuracy of 75%, for example, a positive result means only a 25% chance of being sick: the confidence in positive results is extremely low. This explains why it is not helpful to test the population at large.

By testing only people with symptoms and people who have been in close contact with people who are known to be sick, we test a population where the prevalence of the disease is much higher, say 70%. This increases the confidence in positive results: with 75% accuracy, a positive result now means an 87.5% chance of being sick. The confidence in negative results, however, is low: only about a 56% chance of being healthy. The diagram with the four quadrants shows why: we now have a symmetric situation where a large proportion of a smaller population (those who are healthy and tested negative – 22.5%) is similar to a smaller proportion of a larger population (those who are sick but tested negative – 17.5%).
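To see how strongly the prevalence drives these numbers, here is a small illustrative sweep in Python at a fixed 75% accuracy. The list of prevalences (the 5% to 15% range and the 70% figure from the text) and the names `ACCURACY` and `confidence` are choices made for this sketch.

```python
ACCURACY = 0.75  # accuracy quoted in the text for available tests


def confidence(prevalence, accuracy=ACCURACY):
    """Return (P(sick | positive), P(healthy | negative))."""
    p_positive = accuracy * prevalence + (1 - accuracy) * (1 - prevalence)
    p_negative = 1 - p_positive
    return (accuracy * prevalence / p_positive,
            accuracy * (1 - prevalence) / p_negative)


results = {}
print("prevalence  P(sick | positive)  P(healthy | negative)")
for prevalence in (0.05, 0.10, 0.15, 0.70):
    results[prevalence] = confidence(prevalence)
    sick_if_pos, healthy_if_neg = results[prevalence]
    print(f"{prevalence:>10.0%}  {sick_if_pos:>18.1%}  {healthy_if_neg:>21.1%}")
```

As the prevalence rises, confidence in positive results goes up while confidence in negative results goes down.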

### And now for the maths

If you like maths, here are the formulas that lead to these counter-intuitive measures. We write P(x) for the probability of event x and P(x | a) for the probability of event x given that a is true.


Based on these two parameters, we can define four basic probabilities:

• P(sick) = prevalence
• P(healthy) = 1 - P(sick)
• P(accurate) = accuracy
• P(inaccurate) = 1 - P(accurate)

The test accuracy is both the probability that a person tests positive if they are sick and the probability that a person tests negative if they are healthy:
• P(positive | sick) = accuracy
• P(negative | healthy) = accuracy
Note that in general, these are two different values, called sensitivity (probability of getting a positive test when sick) and specificity (probability of getting a negative test when healthy). For simplicity, we use a single value. Below you will find a section where you can set these two parameters separately.

Now we need to calculate the probability that a test turns out positive (resp. negative). This happens when the test is accurate and the person is sick, or when the test is inaccurate and the person is healthy (similarly for negative tests):

• P(positive) = P(accurate) * P(sick) + P(inaccurate) * P(healthy) = accuracy * prevalence + (1-accuracy) * (1-prevalence)
• P(negative) = P(accurate) * P(healthy) + P(inaccurate) * P(sick) = accuracy * (1-prevalence) + (1-accuracy) * prevalence

These two sums correspond to the areas of the two diagonals of the earlier diagram.

What we are interested in are the conditional probabilities:

• P(sick | positive), the probability of being sick if you test positive; and
• P(healthy | negative), the probability of being healthy if you test negative.

To calculate these we use Bayes' rule: P(A | B) = P(B | A) * P(A) / P(B):

• P(sick | positive) = P(positive | sick) * P(sick) / P(positive) = accuracy * prevalence / P(positive)
• P(healthy | negative) = P(negative | healthy) * P(healthy) / P(negative) = accuracy * (1-prevalence) / P(negative)
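One way to convince yourself that Bayes' rule gives the right answer is to simulate a large population and count. The sketch below is an illustration only: the function names, the population size, and the fixed seed are all assumptions made for the example, and the simulated value is an estimate that only approximates the formula.

```python
import random


def bayes_sick_if_positive(prevalence, accuracy):
    """P(sick | positive) from Bayes' rule, as in the formulas above."""
    p_positive = accuracy * prevalence + (1 - accuracy) * (1 - prevalence)
    return accuracy * prevalence / p_positive


def simulated_sick_if_positive(prevalence, accuracy, population=200_000, seed=0):
    """Estimate P(sick | positive) by testing a random population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    positives = 0
    sick_and_positive = 0
    for _ in range(population):
        sick = rng.random() < prevalence
        test_is_accurate = rng.random() < accuracy
        # An accurate test reports the true status; an inaccurate one flips it.
        positive = sick if test_is_accurate else not sick
        if positive:
            positives += 1
            sick_and_positive += sick
    return sick_and_positive / positives


exact = bayes_sick_if_positive(0.10, 0.90)
estimate = simulated_sick_if_positive(0.10, 0.90)
print(f"Bayes' rule: {exact:.3f}   simulation: {estimate:.3f}")
```

The two numbers should agree to within sampling noise, which shrinks as the simulated population grows.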

### Signal detection theory

Traditionally, the tables below are used to present the four cases of interest in what is called Signal Detection Theory:

• Hits: proportion of positive tests and sick (also called true positives);
• Correct rejections: proportion of negative tests and healthy (also called true negatives);
• Misses: proportion of negative tests and sick (also called false negatives); and
• False alarms: proportion of positive tests and healthy (also called false positives).

| | Positive test | Negative test | Total |
|---|---|---|---|
| Sick | Hit | Miss | P(sick) |
| Healthy | False alarm | Correct rejection | P(healthy) |
| Total | P(positive) | P(negative) | 100% |

If we swap the rows and the columns of the table and use colored bars to represent the percentages, we get the diagram that we saw earlier:

| | Sick | Healthy | Total |
|---|---|---|---|
| Negative test | Miss | Correct rejection | P(negative) |
| Positive test | Hit | False alarm | P(positive) |
| Total | P(sick) | P(healthy) | 100% |

### Sensitivity and Specificity

In the above we have used the word "accuracy" to characterize the proportion of correct results of a test. In practice, there are separate accuracies for positive and negative tests:

• Sensitivity (or true positive rate) measures the proportion of actual positives that are correctly identified as such, i.e. P(positive | sick);
• Specificity (or true negative rate) measures the proportion of actual negatives that are correctly identified as such, i.e. P(negative | healthy).
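With two separate rates, the earlier calculation generalizes as sketched below. The function name `predictive_values` is chosen for this example; the 90% sensitivity and 75% specificity are the RT-PCR figures quoted later on this page, used here with a 10% prevalence.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (P(sick | positive), P(healthy | negative)).

    sensitivity = P(positive | sick), specificity = P(negative | healthy).
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    return (true_pos / (true_pos + false_pos),
            true_neg / (true_neg + false_neg))


# Setting both rates to the same value recovers the single-accuracy case.
same_rates = predictive_values(0.10, sensitivity=0.90, specificity=0.90)

# Different rates: the figures quoted on this page for the RT-PCR test.
sick_if_positive, healthy_if_negative = predictive_values(
    0.10, sensitivity=0.90, specificity=0.75)
print(f"P(sick | positive)    = {sick_if_positive:.1%}")
print(f"P(healthy | negative) = {healthy_if_negative:.1%}")
```

Note that the specificity governs the false positives and therefore the confidence in a positive result, while the sensitivity governs the false negatives and the confidence in a negative result.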

The diagram below, similar to the one we saw earlier, uses the two rates separately: sensitivity for the sick (left), specificity for the healthy (right).

[Diagram: the four quadrants, with the Sick column split into Accurate and Inaccurate by the sensitivity, and the Healthy column split into Accurate and Inaccurate by the specificity.]

Here is the corresponding alternative representation that we saw earlier:

[Diagram: two stacked bars, each split into Sick and Healthy. First bar, negative tests: the healthy share is the chance that you are healthy if you receive a negative test. Second bar, positive tests: the sick share is the chance that you are sick if you receive a positive test.]

### Back to COVID-19

In the case of COVID-19, the specificity of the RT-PCR viral test, which is considered the most accurate diagnostic test, is estimated at 75% and its sensitivity at 90%. With these values, a prevalence of 10% (testing the general population randomly) gives about a 29% chance of being sick after a positive test and about a 99% chance of being healthy after a negative one. A prevalence of 70% (testing suspected cases only) gives about 89% and 76% respectively.

When the probability of a false negative is high, you can increase confidence in the result by taking a new test. In this case, the prevalence used for the second test is replaced by the probability of being sick given the result of the first test.
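The effect of a second test can be sketched as follows. This is a hedged illustration, not the page's own computation: it assumes that the errors of the two tests are independent given the person's true status, and the function name `p_sick_after` and the example values (70% prevalence, 90% sensitivity, 75% specificity, two negative results) are chosen for the example.

```python
def p_sick_after(result, prior, sensitivity, specificity):
    """Probability of being sick after one test result.

    `prior` is the probability of being sick before the test:
    the prevalence for a first test, the previous output for a repeat test.
    """
    if result == "positive":
        sick = sensitivity * prior
        healthy = (1 - specificity) * (1 - prior)
    elif result == "negative":
        sick = (1 - sensitivity) * prior
        healthy = specificity * (1 - prior)
    else:
        raise ValueError("result must be 'positive' or 'negative'")
    return sick / (sick + healthy)


prevalence = 0.70
after_first = p_sick_after("negative", prevalence, 0.90, 0.75)
# Assumption: the second test errs independently of the first one.
after_second = p_sick_after("negative", after_first, 0.90, 0.75)
print(f"P(sick) after one negative test:  {after_first:.1%}")
print(f"P(sick) after two negative tests: {after_second:.1%}")
```

If the two tests tend to fail for the same reason (for example, a sample taken too early), the independence assumption does not hold and the second result adds less confidence than this sketch suggests.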

In contrast, the COVID-19 antibody tests, which detect whether you have antibodies in your blood, have much higher accuracy. The Roche Antibody Test, for example, claims a specificity greater than 99.8% and a sensitivity of 100%. For a prevalence of 10%, these figures give a confidence of at least about 98% in a positive result and of 100% in a negative one.

#### Why limit the size of gatherings?

The probability of being sick despite a negative test is not zero: it is 1 - P(healthy | negative).

If a group of n people, who all tested negative, gets together, the probability that at least one of them is actually sick and risks propagating the virus is 1 - P(healthy | negative)^n, assuming their results are independent. This percentage increases very quickly if the test is less sensitive. This is why the size of gatherings (family or other) is limited. Moreover, in reality, it is rare that all the people who meet have been tested recently, so this percentage is an underestimate.
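The growth of this risk with group size can be sketched as below. It assumes that each person's status is independent of the others, which is a simplification, and it uses example parameters (10% prevalence, 90% sensitivity, 75% specificity); the function names are chosen for the illustration.

```python
def p_healthy_if_negative(prevalence, sensitivity, specificity):
    """P(healthy | negative) for a single person."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


def p_at_least_one_sick(group_size, healthy_if_negative):
    """Chance that at least one person in an all-negative group is sick.

    Assumes the people in the group are independent of one another.
    """
    return 1 - healthy_if_negative ** group_size


npv = p_healthy_if_negative(prevalence=0.10, sensitivity=0.90, specificity=0.75)
risks = {n: p_at_least_one_sick(n, npv) for n in (1, 5, 10, 20, 50)}
for n, risk in risks.items():
    print(f"{n:>3} people: {risk:.1%}")
```

People who gather often share the same exposure, so their statuses are not truly independent; the sketch is meant to show the trend, not a precise risk.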

### References

Contact: Michel Beaudouin-Lafon (mbl@lri.fr)
Thanks to Aditya Bindal for suggesting an estimate of the increased risks of gatherings.