Monday, March 30, 2020

The Imperial College Study: Part 3A

[Links to the full series]

Part 1
Part 2
Part 3A
Part 3B

------------

Now I'm going to look at what fatality rate the Imperial College study used, and how it was derived. I'll be splitting this part up into two sections. First (3A), I'm going to look at what a CFR is in general and how it's often calculated. Second (3B), I'll look specifically at how it was determined in the Imperial College study.

1. What is a CFR?


CFR stands for "Case Fatality Rate". It is the proportion of those people who are diagnosed with a certain disease who die over some period of time. This distinguishes it from "Mortality", which is the proportion of a total population who die of a particular disease--so MERS, for example (another zoonotic coronavirus), has a very high CFR because roughly a third of the people diagnosed with it die, but a tiny Mortality because very few people catch it.

CFR is sometimes a confusing number, however, because of an ambiguity in what "Case" stands for. Is a "case" only someone who has been officially diagnosed with the disease by a doctor? Or is it anybody who can be proved to have had the disease? CFR is actually used in both ways at different times. So, for example, you may have heard it said that the CFR for the seasonal flu is about 0.1%. This is using CFR in the second sense. What epidemiologists do is try to get a good estimate of the total number of people who actually had the flu in a given season, whether or not they were diagnosed with it. Then they try to get a good estimate of the number of people who died because of the flu, again, whether or not they were diagnosed with it. Then they divide the second number by the first number.

This use of "CFR" can only be fully accurate after the fact, when an epidemic or an outbreak has passed. So people *also* use "CFR" to refer to the proportion of people *diagnosed* with a disease who then die. This is useful at the time of an outbreak because it gives a sense of the seriousness of the disease right away. But this usage is recognized to overstate the eventual fatality of the disease, because it does not account for those people who come down with the disease but never seek medical treatment, and these people will be heavily skewed towards the milder forms of the illness.

In order to be clearer about the difference between these two numbers (the percentage of diagnosed cases who die and the percentage of all infected people who die), I would like to use a term that I've seen in some papers, which is IFR, or "Infection Fatality Rate". This would be the percentage of all infected people who die, whereas CFR would be the percentage of those people who are diagnosed with the disease who die.
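
To make the three definitions concrete, here is a minimal Python sketch. The function name and all of the numbers are made up for illustration, and it makes the simplifying assumption that every death occurred in someone who had been diagnosed.

```python
def fatality_rates(deaths, diagnosed, infected, population):
    """Return (CFR, IFR, Mortality) as fractions.

    Simplifying assumption: every death is counted among the diagnosed.
    """
    if not (0 < diagnosed <= infected <= population):
        raise ValueError("expected 0 < diagnosed <= infected <= population")
    cfr = deaths / diagnosed         # deaths among diagnosed cases
    ifr = deaths / infected          # deaths among everyone infected
    mortality = deaths / population  # deaths among the whole population
    return cfr, ifr, mortality


# Hypothetical outbreak: these figures are invented, not real data.
cfr, ifr, mortality = fatality_rates(
    deaths=50, diagnosed=1_000, infected=5_000, population=1_000_000
)
print(f"CFR {cfr:.1%}, IFR {ifr:.1%}, Mortality {mortality:.3%}")
```

With these invented figures the same 50 deaths give a CFR of 5%, an IFR of 1%, and a Mortality of 0.005%, which is why it matters which denominator is being used.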

2. How is the CFR calculated in general?


*After* an outbreak has entirely resolved, CFR is very easy to calculate: just divide the number of diagnosed individuals who died by the total number of diagnoses. During an outbreak, however, these numbers are a moving target and it can be deceptive to try to calculate a CFR from those numbers. Here are some reasons why:

  • Early on in an outbreak, a lot more people are going to be in an early phase of the disease than in a late phase. A lot of *these* people might *eventually* die, but they have not reached that stage yet. So if you do a lot of testing and record a lot of sick people at an early stage, you might *understate* your CFR.
  • On the other hand, if you *don't* do a lot of testing, the first wave of people in an outbreak you'll find out about are those who show up at a hospital very sick. These are going to be disproportionately the people who got the worst form of the disease or who got hit particularly hard by it. You can expect these people to die at a higher rate than the rest of the infected population. So if you don't have good testing of your whole population, your initial CFR will be *overstated*.
  • Also, the people who die earlier might be the people who are more likely to die anyway. For example, most of the people who die from covid-19 are probably older people *and* they are likely to die sooner after infection than the younger fatalities. If you try to calculate a CFR from initial death counts without correcting for age, your initial CFR will be *overstated*.

In order to best calculate the true CFR from data as it comes in during an outbreak, you need to take both timeline and vulnerability data into account. For timeline correction, you need to find the average time from onset of symptoms until death, and then line up your cumulative deaths against your cumulative cases chronologically, with the case counts taken from that many days earlier. Given a sufficient length of time, the ratio between the two will converge on your true CFR. For vulnerability correction, you should identify the groups that have different risk levels and do a separate CFR calculation for each of those groups.
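
Both corrections can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the data are invented, and the toy series assumes every death happens exactly `lag_days` after the case was recorded (real data would only have an average lag).

```python
def lagged_cfr(cumulative_cases, cumulative_deaths, lag_days):
    """For each day t >= lag_days, divide deaths so far by the cases
    that had been recorded lag_days earlier."""
    if len(cumulative_cases) != len(cumulative_deaths):
        raise ValueError("series must have the same length")
    ratios = []
    for t in range(lag_days, len(cumulative_deaths)):
        cases_then = cumulative_cases[t - lag_days]
        if cases_then > 0:
            ratios.append(cumulative_deaths[t] / cases_then)
    return ratios


def stratified_cfr(groups):
    """groups maps a group name to (deaths, cases); returns a CFR per group."""
    return {name: d / c for name, (d, c) in groups.items() if c > 0}


# Invented series: cases double daily, 2% of cases die exactly 3 days later.
cases = [100, 200, 400, 800, 1600, 3200]
deaths = [0, 0, 0, 2, 4, 8]

naive = deaths[-1] / cases[-1]                 # ignores the time lag
corrected = lagged_cfr(cases, deaths, lag_days=3)

# Invented age split, purely to show the per-group calculation.
by_age = stratified_cfr({"under 60": (5, 1000), "60 and over": (45, 500)})
print(naive, corrected, by_age)
```

In this toy series the naive ratio on the last day is 0.25%, while the lag-corrected ratio is 2% on every day, which is the rate the data were built with. The naive number understates the CFR because most of the recorded cases are too recent to have resolved.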

A key point I'd like to re-emphasize: the calculation of a CFR for a particular outbreak gets more accurate as time goes on. This means that when you are tracking a pandemic that is moving across the globe, you should pay the most attention to the CFR generated by those locations where outbreaks occurred *first*.

3. How is the IFR calculated in general?


In order to calculate the IFR, you need to estimate how many people in your population get the disease without ever coming in to the doctor. That gives you the fraction of all infections that are actually diagnosed, and you then multiply your CFR by this fraction to get the lower IFR.

In practice, this is done in a couple of ways. The "gold standard" here is to do a serological study: you select random people from your population and look for the tell-tale antibodies in their blood that show that the person had a specific disease and developed an immunity to it. You can then get a good number for the percentage of your total population that had that disease recently, whether or not they were caught in the official tallies.
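
As a rough sketch of the arithmetic behind a serological estimate, assuming a truly random sample, a perfectly accurate antibody test, and a complete death count (none of which holds exactly in practice), it might look like this in Python. The function name and numbers are invented for illustration.

```python
def ifr_from_serosurvey(deaths, positives, sample_size, population):
    """Estimate IFR from a random antibody survey.

    Assumes the sample is representative and the test has no false
    positives or false negatives.
    """
    if sample_size <= 0 or positives <= 0:
        raise ValueError("need a non-empty sample with at least one positive")
    seroprevalence = positives / sample_size        # share of sample infected
    estimated_infections = seroprevalence * population
    return deaths / estimated_infections


# Hypothetical survey: 50 of 1,000 random people test positive in a
# population of 1,000,000 that has recorded 100 deaths.
ifr_estimate = ifr_from_serosurvey(
    deaths=100, positives=50, sample_size=1_000, population=1_000_000
)
print(f"Estimated IFR: {ifr_estimate:.2%}")
```

Here 5% of the sample tested positive, which scales up to about 50,000 infections and an estimated IFR of 0.2%.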

Another way to establish this is by random surveys--you just ask people whether they had a particular disease or not and whether they went to the doctor for it. This is less accurate than the first method because the people you are trying to count are necessarily self-diagnosers, and the accuracy of their self-diagnosis is likely to be flawed.

The CDC has used both approaches in the past and come up with a diagnosed proportion of roughly 50% for the seasonal flu. That is, in order to calculate how many people in total have the flu each year, they multiply the cases determined from hospital records by about 2. Note, however, that this number is specific to the seasonal flu. In general, the more severe a disease is, the more of the people who have it will go to the doctor, and the smaller that "undiagnosed" population will be.
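
The relationship between the diagnosed fraction, the multiplier, and the two fatality rates can be written out as a small Python sketch. The 50% fraction is the flu figure quoted above. The case count and the CFR are placeholders rather than real data, and the conversion assumes that every death falls among the diagnosed cases.

```python
def infections_from_diagnoses(diagnosed_cases, diagnosed_fraction):
    """Scale diagnosed cases up to an estimate of all infections."""
    if not 0 < diagnosed_fraction <= 1:
        raise ValueError("diagnosed_fraction must be in (0, 1]")
    return diagnosed_cases / diagnosed_fraction


def ifr_from_cfr(cfr, diagnosed_fraction):
    """IFR = CFR * (diagnosed / infected), assuming all deaths were diagnosed."""
    if not 0 < diagnosed_fraction <= 1:
        raise ValueError("diagnosed_fraction must be in (0, 1]")
    return cfr * diagnosed_fraction


flu_fraction = 0.5  # roughly half of flu infections get diagnosed
total_infections = infections_from_diagnoses(10_000, flu_fraction)  # placeholder count
flu_ifr = ifr_from_cfr(0.002, flu_fraction)                         # placeholder CFR
print(total_infections, flu_ifr)
```

A diagnosed fraction of 0.5 is the same thing as a multiplier of 2 on the case count. A more severe disease would have a fraction closer to 1, so its IFR would sit closer to its CFR.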
