Monday, April 20, 2020

Problems with the New Antibody Study from Santa Clara

I've been watching for results from various SARS-CoV-2 antibody testing projects with great interest, so I was glad to see that a Santa Clara County antibody survey just published interim results, here: https://www.medrxiv.org/…/10…/2020.04.14.20062463v1.full.pdf
I was extremely annoyed, however, on reading the results to find out that they recruited for their study using a Facebook ad! This makes their study population essentially self-selected and hence, in my view, practically worthless. I have no idea, either, how you would go about trying to correct for this self-selection bias.
The paper they cited as justification for this practice largely touted Facebook as a "cost effective" way of getting a mostly representative study population--which might be true if the specific population whose size you are trying to gauge didn't have a vested interest in participating in your study in order to get a very hard-to-obtain, much-sought-after test. So yeah . . . "cost effective," except that they just wasted over 3,000 perfectly good antibody tests on a study with a massive bias problem that we can't realistically quantify or correct for.

More Detailed Criticism

Imagine you are someone who had flu-like symptoms a month or a few weeks ago. Now, like everyone in the world, you're wondering, "gee, was that really Covid-19? I bet I had Covid-19 and didn't even know it!" So if you see an ad on Facebook saying "hey, participate in this antibody testing study!", you are highly motivated to say "me! me! me! yes, test me!". On the other hand, if you have not had any flu-like symptoms in the past two months, you are only motivated to join the study (which involves getting in your car and driving somewhere to get your blood drawn) if you understand the public health importance of figuring out how many asymptomatic cases there are. Which some people do, but a lot of people don't.

So by the design of the study, the population they are actually studying is probably much more representative of people in Santa Clara County who have had flu or cold symptoms recently than it is of people in Santa Clara County in general. And then of course you are going to way over-sample people who actually did have Covid-19 compared to the rest of the population. The claimed "50 to 85 times as many people" number is meaningless because of this oversampling.

Now, I admit, with something as high profile as Covid-19 antibody testing, it'll be hard to completely eliminate the self-selection bias--but it's not impossible. First of all, I would not use a Facebook ad campaign to recruit volunteers. I would start with a home survey sent to a randomized set of addresses (as Dr. Streeck did in his antibody testing in Gangelt). And I would not say in the survey that it was specifically for antibody testing, just that it was a study on Covid-19.

Then I would follow up with an explanation: "OK, so now we would like to do antibody testing on you", and I would explain the importance to public health of getting as high a participation rate as possible--really sell it hard and try to get everyone you selected to participate. This takes advantage of the existing attention capture, because it's a lot easier to get people to go along with something they've already partially bought into. And I really don't think it would be hard to get very high participation rates from a truly random sample if you approached it this way.

Can Self-Selection Bias be Corrected For?

If you start a study with a survey of a pre-selected set of randomized addresses, then you get to report on what percentage of people didn't participate in your survey. Which means you get to quantify the potential self-selection bias: something like, "since 30% of people chose not to take our test, we might have such-and-such percent selection bias."

The only way to do this with a Facebook ad campaign is to report on the total number of people who saw your ad but didn't click on it . . . which is kind of a dubious number, given that it's really hard to tell with internet ads how many people actually looked at them. But the study didn't even report how many views the ad campaign got, or even how many people clicked on the ad, just the total number of people who filled out the initial online survey. So they're not even trying to quantify the massive self-selection bias. This is super shoddy, in my opinion.

How Bad Could the Effect of the Bias Be?

I did some simple calculations in a Google spreadsheet to try to quantify how bad a self-selection bias could be: https://docs.google.com/spreadsheets/d/1JfYxfak6uY4Bd1vBA-HGOYn_OECfGvrIp-H2ygPqn5M/edit?usp=sharing
The goal was to figure out how many different infection rates could actually match the results the Santa Clara study obtained. My question was this:

This study was reporting an interim result of an infection fatality rate for Santa Clara county of 0.12-0.2%. How big an effect would self-selection have to be in order for the true IFR to be actually 1%?

For this, you first have to decide on a percentage of people in Santa Clara county who might think, "hey, I might have had Covid-19 in the past two months". I first set this percentage at 20%, which I think is very generous considering the estimated total percentage of the population who gets the flu over a whole flu season is only 10%--I'm sort of adding in some people who had cold symptoms as well. It's a guesstimate; let's go with it.

When you have this percentage, then you can start playing with a multiplier that represents how much more likely it is that people in that specific group of the population (people who have reason to suspect they might have had Covid) would respond to the Facebook ad campaign compared with people who have no reason to think they might have had Covid. The spreadsheet will then tell you how many people you would expect to test positive for Covid in the study under those assumptions. You then need to adjust your numbers till it matches the number of people from the study who actually tested positive for Covid (50), and that will tell you what the self-selection bias would need to have been.
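To make the mechanics concrete, here is a minimal sketch in Python of roughly the kind of calculation the spreadsheet does. It bakes in one simplifying assumption--that everyone who was truly infected falls into the "I think I might have had it" group--and the county population, death count, and sample size below are rough placeholder figures for illustration, not numbers taken from the study.

```python
def expected_positives(population, true_infections, suspect_frac, bias_multiplier, sample_size):
    # Simplifying assumption: everyone who was truly infected falls into the
    # "I think I might have had Covid" group, and members of that group are
    # `bias_multiplier` times more likely than everyone else to answer the ad.
    suspects = population * suspect_frac
    others = population - suspects
    # Fraction of the self-selected sample that comes from the suspect group.
    suspect_share = (suspects * bias_multiplier) / (suspects * bias_multiplier + others)
    # Antibody-positive rate inside the suspect group.
    positive_rate = true_infections / suspects
    return sample_size * suspect_share * positive_rate

# Placeholder inputs: a county of ~1.9 million, ~100 deaths, and a target IFR
# of 1% imply ~10,000 true infections; the study tested roughly 3,300 people.
print(expected_positives(1_900_000, 10_000, 0.20, 4.6, 3_300))  # ~46 positives
print(expected_positives(1_900_000, 10_000, 0.05, 2.9, 3_300))  # ~46 positives
```

With those placeholder inputs, bias multipliers in the range discussed below land in the neighborhood of the 50 positives the study actually found.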

In order to achieve a target IFR of 1% (around 10 times the study number), and assuming that a full 20% of all people in Santa Clara had reason to believe they had Covid for some reason (again, I feel that is very generous), I need these people to be 4.6 times more likely to respond to the Facebook ad than people who have no reason to suspect they had Covid. I think it's entirely reasonable to think that people who believe they might have had Covid would be up to 5 times more likely to respond to such a survey.

If I decrease my 20% estimate and say instead that just 5% of all people in Santa Clara had some reason to think they had Covid earlier, then I only need these people to be 2.9 times more likely to respond to the Facebook ad in order to get the 50 positive tests the study obtained.

Interestingly, after going through this exercise, I think this does open up a way that the self-selection bias could be at least partially detected. The study took a survey of the respondents, and I assume they asked some basic questions, including whether they had any cold or flu symptoms in the past few months--although the preliminary report does not say that they collected this sort of survey information, so maybe I'm assuming too much. But assuming they did, they could compare the percentage of respondents reporting previous symptoms with the percentage of the general population who actually had undiagnosed flu-like illnesses. This should track with a self-selection bias, I think.

Postscript

After I went through my own analysis, I discovered that a peer reviewer of this study has come to some similar conclusions: https://medium.com/@balajis/peer-review-of-covid-19-antibody-seroprevalence-in-santa-clara-county-california-1f6382258c25.  He doesn't do a sensitivity analysis of the sort that I do, but he also raises a separate concern based on the false positive rate of the test used.  The review is worth reading.

Thursday, April 16, 2020

Explaining Shifting Covid-19 Fatality Rates

People have been confused about the widely varying estimates of the fatality rate of Covid-19.  At one point, the WHO issued an estimate to the effect that 3.8% of people infected by Covid-19 would die.  At another point, it looked like Italy was seeing more than a 10% fatality rate.  Later, we've heard a lot of people say something to the effect that Covid-19 is "10 times deadlier than the flu", which works out to be something like a 1% fatality rate.  Quite recently, a preliminary report on a serological antibody survey in Germany stated that the true fatality rate in the region studied works out to be only 0.37%.

So why do we see these shifting, very different death estimates?  Why is this hard to pin down?  The number we are trying to establish here is the Case Fatality Rate, or CFR, and it's defined very simply as the number of deaths from a disease divided by the total number of people who have that disease: if 100 people get a disease and 10 of them die, that's a CFR of 10%.  The math is a simple division, so what makes this hard to determine?

It turns out that there are two primary sources of uncertainty in calculating a CFR:

  • Uncertainty in knowing how many people actually have the disease (the denominator of the percent).
  • A timeline-specific uncertainty in knowing how many deaths will occur (the numerator of the percent) that arises if you are trying to calculate a CFR in the middle of an epidemic.

I wanted to try to illustrate both of these problems with estimating the fatality rate for a new disease, so I came up with some scenarios and graphs that demonstrate them.  You can look at all the numbers I came up with for this scenario on this google spreadsheet: https://docs.google.com/spreadsheets/d/1ePV5OnN5xeHYmXyzxbpccYUrxPnffTeU6_YtFOYmiLU/edit?usp=sharing

The Disease Timeline


The first thing I did was to generate some numbers in a spreadsheet for an infection in a location that behaves roughly as we have seen Covid-19 behave.  My model for the infection curve that I generated was roughly South Korea, as it has the most complete data for a rise-and-fall of the disease so far.  I generated about 3 1/2 months of infections-per-day numbers that rise exponentially for the first 30 days, then abruptly level off due to interventions, and then decay rather rapidly after a time.  Then I assumed that some percentage of people with the disease would require hospitalization (10% is the amount I chose), but that on average, people wouldn't need hospitalization until they'd been infected for 10 days.  Then I assumed that some percentage of people who were hospitalized (again, 10% is what I chose) would die, on average 7 days after hospitalization.  This gives a total real-life CFR of 1%.  It also gives us three graphs which show the same curve shape, but scaled down and shifted in time for the hospitalizations and again for the deaths.
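For readers who find code easier to follow than spreadsheet columns, here is a rough Python equivalent of the construction just described. The shape of the infection curve is an arbitrary stand-in, not fitted to any real data:

```python
import numpy as np

days = 105  # roughly 3 1/2 months
infections = np.zeros(days)
for t in range(days):
    if t < 30:                       # exponential growth for the first 30 days
        infections[t] = 10 * 1.2 ** t
    elif t < 45:                     # abrupt leveling off due to interventions
        infections[t] = infections[29]
    else:                            # fairly rapid decay afterwards
        infections[t] = infections[44] * 0.9 ** (t - 44)

hosp_rate, death_rate = 0.10, 0.10   # 10% hospitalized; 10% of those die => 1% overall CFR
hosp_lag, death_lag = 10, 7          # days to hospitalization, then days to death

hospitalizations = np.zeros(days)
hospitalizations[hosp_lag:] = hosp_rate * infections[:days - hosp_lag]
deaths = np.zeros(days)
deaths[death_lag:] = death_rate * hospitalizations[:days - death_lag]

# Close to 1% by construction (very slightly less, because deaths from the
# last few days of infections fall outside the 105-day window).
print(deaths.sum() / infections.sum())
```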



(The curve shapes are pretty terrible because I don't know how to do logistic curves in Google Spreadsheets, but this is fine to get the point across.)

What is the Apparent CFR?



With this timeline established, I then asked the question: given this disease progression timeline, what would the CFR appear to be at any given moment?  At any moment, if the people in this scenario stopped to tally up all the deaths that had occurred so far and divide that by all the infections they knew about, what would they think the CFR was?

And the answer to this is, it depends on what infections they know about.

So let's look at three different scenarios:

  1. Poor knowledge of infections
  2. Consistently good knowledge of infections
  3. Perfect knowledge of infections.

In all three scenarios, we are going to assume that we know about all infections that become hospitalized, so these people always get counted in with the known infected.  How many non-hospitalized infected are known is what varies for each scenario.

In the first scenario, we are going to assume that our sample nation was unprepared and did almost no testing in the general population until some amount of people started dying: call this the "Italian Paradigm".  Even after testing starts, it ramps up slowly, only reaching full capacity by the end of the outbreak. 

Furthermore, we are going to assume that there is a large body of infected people who have no symptoms and who never get tested--say, half of all the infected people.  So in the "poor knowledge" scenario, the testing for non-hospitalized cases starts near zero and only goes up to a bit above 40% of total coverage at the best.

In the second scenario, we are going to assume that the sample nation was prepared and jumped on testing right away: call this the "South Korean" paradigm.  Here we are going to assume a constant high rate of testing that catches most symptomatic infected people.  However, we are still going to assume a large body of infected people who are asymptomatic who never get tested.  So for this scenario, we are saying that 45% of all infected people outside of the hospital system are known about, as well as all the people within it.

In the third scenario, we are going to assume that we somehow magically know all of the infected people right away.

I generated the numbers of known infected people per day given these knowledge restrictions, for each scenario.  Then I calculated what the apparent CFR would look like if it were computed each day by taking the sum of deaths so far and dividing it by the sum of these known infected.  The charts below show what I got.
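Here is that daily calculation in code form. It is meant to be fed the infections, hospitalizations, and deaths arrays from the earlier sketch, and the detection curves below are placeholders that only roughly mimic the three scenarios, so the end points won't exactly match my spreadsheet's charts:

```python
import numpy as np

def apparent_cfr(infections, hospitalizations, deaths, detection, hosp_rate=0.10):
    # Known infections each day: every hospitalized case, plus whatever fraction
    # of that day's non-hospitalized cases the testing regime happens to catch.
    known = hospitalizations + detection * (1 - hosp_rate) * infections
    return np.cumsum(deaths) / np.maximum(np.cumsum(known), 1.0)

days = len(infections)
poor    = np.clip(np.linspace(-0.3, 0.45, days), 0.0, None)  # testing starts late, ramps to ~45%
good    = np.full(days, 0.45)                                # steady ~45% detection from day one
perfect = np.full(days, 1.0)                                 # the magical third scenario

for name, detection in [("poor", poor), ("good", good), ("perfect", perfect)]:
    print(name, apparent_cfr(infections, hospitalizations, deaths, detection)[-1])
```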

Poor Knowledge of non-Hospital Infections


What we see here is that due to the lack of good knowledge of how many non-hospitalized infections there are, the apparent CFR almost immediately jumps up to an artificially high number.  Given no extra-hospital testing, this would eventually rise to 10%, which is the fatality rate I chose for infections that get to the hospitalization stage.  Once some testing starts to kick in, though, the number starts to go down, as knowledge of total infected starts to get better.  However, while this does happen, more people continue to die, and this effect starts taking over and the apparent CFR starts rising again.  It finally rests at a number 6 times what it should be, which indicates that while all deaths are counted by the end, only 1/6th of the total infected were ever counted.

I am also plotting on this chart (for comparison purposes) the third scenario, where we magically know all infections at all times: this is the red line on the graph.  Note that even with perfect knowledge, due to the time lag of when people die, this also gives an incorrect apparent CFR up until the very end.

Consistently Good Knowledge of non-Hospital Infections


Here we see that due to prompt testing, we don't see an initial spike of the apparent CFR to unrealistic levels dominated by the death rate in hospitals.  Instead, though, we see an initial underestimation of CFR, and this is due to the time lag in deaths.  This, in my opinion, matches very well with the evolution of CFR that we saw in places like South Korea and Germany, where there was an initial very low CFR estimate that has been creeping up over time.  I think both places had pretty good testing in place before the epidemic began to take off (South Korea more so than Germany, but I think both did pretty well).

This is important context for understanding the 0.37% CFR that Dr. Streeck recently reported.  It needs to be understood that this number would correspond to a point on the red line on this graph: a point at which all deaths so far are known, and also all infections (statistically, in this case, thanks to a serological study).  If you look at where Germany as a whole was on the curve at the time Dr. Streeck reported his conclusion, I think you will see that it corresponds in this scenario to a point in time a little past the one-third mark--shortly after interventions start to flatten the curve.

This means we should not be surprised to see the CFR in Germany increase over time above Dr. Streeck's preliminary report.  Doubling or even tripling would not surprise me.

Conclusion

Attempting to evaluate the CFR of a disease while it is in mid-progression is fraught with problems.  I have demonstrated only two of the problems with this very over-simplified model.  Therefore, the best projections of disease fatality do not use this kind of simplistic logic.  If you want to see the more sophisticated way in which these things are done, I encourage you to look at the disease severity study which the Imperial College study used, which I discussed in this blog post earlier: https://darkenedintellect.blogspot.com/2020/03/the-imperial-college-study-part-3a.html .

Monday, April 13, 2020

Are the lockdowns responsible for declines in infection growth?

Recently, infection and death rates in Europe and the United States appear to be leveling off and even dropping.  Since this was the point of the massive social distancing measures the whole industrialized world has been taking, the natural interpretation of this would be: the measures we took are working and we are beginning to see the results.

However, some people have claimed that the leveling off of deaths is not due to strict isolation measures, but something the disease was going to do anyway.  What we are seeing, this theory claims, is a natural peak in the disease.  Lockdown measures may have slightly reduced the total number of deaths, but the behavior of the curve was going to follow the current path we are seeing anyway, more or less.  The clear implication of this theory, if it is correct, is that we should end the strict social distancing measures and the disease will dwindle away on its own.

Can we determine which interpretation of the facts fits best?  I think we can, fairly simply.

What the Prevailing Theory Expects

Given the theory that the disease will act in the standard way in which one expects an epidemic to act, what we should see is that the infection grows exponentially at first, but then rapidly shifts its growth rates after social distancing measures are put into place, in every place in which these measures are enacted.  We can visualize this infection curve using an online pandemic calculator, available here: http://gabgoh.github.io/COVID/index.html

To model our scenario, I have put in a disease with an R(0) of 3.  This is midway between earlier estimates of Covid-19's R(0), which were around 2.4, and later estimates, which have put it as high as 3.87.  Then I set an intervention date about a month into the course of the disease which has the effect of reducing the R(0) to around 1: the threshold below which a disease will begin to die out.  Here's what that looked like:


Note that the resulting curve is composed of two curves, which I've marked in red and in blue.  On the left of the intervention, there is a standard exponential curve, concave up.  Right at the point of the intervention, it rapidly switches to concave down.  It still rises for a bit, but it has a shallow hump which then trails off gradually afterwards.

The exact shape of the right-hand side of the curve depends a lot on what you set the R(0) to be after the intervention.  Depending on how effective your intervention strategy is, the daily infections can die off either rather steeply, or rather slowly.  I've heard, for example, an estimation of current, post-lockdown R(0) being put at 0.62.  This is what the pandemic calculator looks like with that number instead:



I encourage my readers to go to this site and play around with the numbers yourselves--if nothing else, this should give you better sympathy for the shifting numbers coming from the IHME projections, because you will quickly see that small changes to the R(0) (which is the degree to which people are spreading the virus around) can produce quite large changes in the final infected number.
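For anyone who prefers tinkering in code to clicking around the calculator, the same two-regime behavior falls out of even a bare-bones SIR model. This is a minimal sketch of my own, not the calculator's actual model, and the population size and infectious period are arbitrary:

```python
def sir_daily_infections(days, population, r0_before, r0_after, intervention_day,
                         infectious_period=7.0, initially_infected=10.0):
    # Deterministic SIR with a single step change in R(0) on the intervention day.
    gamma = 1.0 / infectious_period
    s, i = population - initially_infected, initially_infected
    daily = []
    for day in range(days):
        r0 = r0_before if day < intervention_day else r0_after
        new_infections = r0 * gamma * s * i / population
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        daily.append(new_infections)
    return daily

# R(0) of 3 dropping to ~1 a month in, as in the first calculator run above;
# swap in 0.62 for the post-lockdown estimate to see the much steeper die-off.
curve = sir_daily_infections(days=200, population=7_000_000,
                             r0_before=3.0, r0_after=1.0, intervention_day=30)
print(max(curve), curve.index(max(curve)))
```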

What's Actually Happening

So now let's look at reality instead of this model.  Do we see this same sort of results in those countries that have had significant outbreaks, and then initiated strict societal interventions in order to flatten the curve?

In order to look at this, we'll pick some countries from the Worldometer site.  In order not to have our results confused by poor testing (which has been a problem in many countries), we are going to look at daily death rates rather than daily infection rates.  This curve should be the same shape as the daily infected curve, just smaller and with some time lag, since only a fraction of infected will die and since it takes time for people to progress from having the infection to dying.

Here is Spain's daily death chart:

You can see clearly the exponential growth on the left side and a clear, abrupt transition to a smoothly curved peak and gradual decay on the right.  The date of the transition appears to be somewhere around March 24th.


Here's Italy's chart:


Again, we can clearly see exponential growth on the left abruptly transitioning to a shallow hump and gradual decline to the right.

What about Germany?  In this case, the shape is less clear:
Here the exponential growth on the left is obvious, but it's not so obvious what's happening on the right.  We should realize that Germany's curve starts later than Spain's and Italy's.  The pandemic apparently reached Germany later than it did Italy and Spain, so we're not seeing the peak and trail-off yet in the death rates.  However, let's cheat with Germany and look at the daily infection rates chart--we should be able to see the effects of lockdown earlier with these numbers because of the time lag between infection and death.  Germany has been doing much better in testing than most other countries, so maybe we can trust that their infection rate numbers are fairly reliable:

Nice!  It actually looks just like a continued form of the deaths chart from above.  More evidence, I think, that Germany's testing has been far more representative of actual infection rates than other countries' has been.

What about South Korea?  Here we have a problem that South Korea has been so on top of the pandemic, from the very beginning, that their daily death rate chart doesn't have enough data to form a recognizable curve: they just haven't had enough people die.  This is excellent, but it does mean we can't use their chart for this analysis.  However, since their epidemic control has been driven by extensive testing, we can probably do what we did for Germany and use their daily infection rates, again probably with a good degree of confidence:


OK, the smaller dataset does make the curve more patchy, but it still fits pretty well: concave up on the left and a trail-off on the right.  It does appear to me that the drop-off on the right for South Korea is more dramatic than the trail-offs we've been seeing in Europe.  This would fit with their lower overall death rate, though: the fact is, South Korea has simply had a better handle on the epidemic from the beginning.

So lastly, how is the United States doing?  First, it should be pointed out that, as opposed to all of the countries we've listed so far, the United States hasn't had one set of lockdown measures.  Different states implemented different lockdown measures at different times.  We should expect to see a bit of overlapping curve flattening from the time periods when different states were probably experiencing different disease growth rates.  Second, the United States is clearly behind Italy and Spain in the pandemic timeline, so we're likely to have only a small amount of data for the right side of the curve.  Those caveats being given, here's what we see:

Fits pretty well, I'd say, given the caveats above.  Given the massive testing problems the U.S. had early on, I'm reluctant to use the daily infection rate curve, but given that our testing has been better recently, maybe we can get a better sense at least of what the right side of the curve looks like?

Still unclear, I'd say; we're still too early on.  However, it certainly doesn't invalidate the theory; I'd say it weakly confirms it.

Timeline of the Inflection Points

Now that we've seen that the shape of the curves we see in real life are matching quite well with the predicted curves for the standard theory, can we also ask the question, does the timing of the curve flattening correspond with the lockdowns?  Different countries imposed societal lockdowns at different times; if they are what is responsible for the curve flattening, we should expect to see some correlation in the timelines.

For the nations that we have looked at so far, here is a table showing the dates on which those countries imposed a nation-wide lockdown (or in South Korea's case, a nation-wide ban on large public gatherings), side-by-side with the dates at which I am seeing an inflection point in their charts:

                 Lockdown Imposed          Date of Inflection
South Korea      Feb. 21st                 Feb. 27th (for infections)
Italy            March 9th                 March 19th
Spain            March 14th                March 24th
Germany          March 22nd                April 2nd
United States    March 22nd (New York)     ~April 4th

To me, this timeline is compelling; I don't see how anyone could look at this data and not conclude that we are seeing the results of societal changes in the infection and death rates at this point.

The Alternative Theory: The Disease is Peaking by Itself

However, let's suppose the above is not found to be convincing.  What about the alternative theory?  What would we expect to see if the disease is playing itself out, without social distancing being a major factor in the decline of the disease? 

You can see what a standard epidemic disease curve looks like by using the epidemic calculator (making the intervention meaningless by setting the post-intervention R(0) to the same as the pre-intervention R(0)):


Notice that the left and right hand sides of the curve are symmetrical: it declines as rapidly as it attacks, once the population has been saturated.  This shape does show up in real life as well; we see it in uncontrolled epidemics all the time.  Here's a graph from some 1970s measles outbreaks, for example:

This outbreak came in a rapid sequence of waves (measles is *extremely* infectious), which had a roughly symmetrical look to them.  Here's another example which is a collection of epidemic curves from the SARS outbreak:

https://www.who.int/csr/sars/epicurve/epiindex/en/

Here I notice that the right-hand side of the curve in these charts typically declines as steeply as, or more steeply than, the left-hand attack portion rises.

I find the lack of "spikiness" in the real data we are seeing difficult to square with what epidemics of very infectious diseases look like.  I don't know how proponents of this theory explain an exponential attack followed by a much less exponential decay.

Total Infection Counts

The real problem with this theory, though, is the total infection counts, as a percentage of the population.  If the real factor in limiting the continued growth of the disease were that it was reaching inherent limits of the population and herd immunity were kicking in, then each nation should see roughly the same total percentage of their population infected by the end.  Herd immunity works by a certain percentage of the population becoming immune, thus crippling the disease's ability to spread rapidly.
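Herd immunity has a well-known threshold: for a disease with basic reproduction number R(0), roughly 1 - 1/R(0) of the population has to become immune before the epidemic turns over on its own.  A quick calculation for the R(0) estimates discussed in the earlier posts:

```python
# Fraction of the population that must be immune before the epidemic turns
# over on its own, for the R(0) estimates mentioned in the earlier posts.
for r0 in (2.4, 3.0, 3.87):
    print(f"R(0) = {r0}: herd immunity at roughly {1 - 1 / r0:.0%} of the population immune")
```

That works out to somewhere between about 58% and 74% of the population infected before the disease peaks on its own.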

So what sort of total infection percentages are we looking at here?

South Korea has a population of 52 million people.  They have had 217 deaths so far, and their curve is completely flattened.  That's a total death rate of 0.0004% of the population.

Italy has a population of about 60 million people.  They're not done with their curve yet, but they've had 20,000 deaths so far . . . maybe we'll guess 25,000 deaths before the curve fully flattens.  That's a total death rate of 0.042% of the population.  That's over 100 times as many deaths, as a percentage of population, as South Korea.

Spain has a population of about 47 million people.  They're also not done with their curve yet, but they look on track to total maybe about 20,000 deaths.  That's a total death rate of 0.043% . . . very similar to Italy's.

Germany has a population of about 83 million people.  They're even further behind in the timeline than Spain and Italy, but with only 3000 deaths so far, maybe we can project a full doubling and say 6000 total deaths by the time of curve flattening.  That's a total death rate of 0.0072% of the population.  This is less than 1/5th of the total numbers Italy and Spain are going towards, but more than 15 times the number South Korea is going to end up with.

The United States has a population of about 327 million people.  Again, we're back in the timeline a bit too far to project accurately, but applying the same logic as I did with Germany, we'd end up with a total of around 45,000 deaths  (I know 60,000 or so is the current best estimate, but I'm just trying to be consistent with what I did for Germany).  This would work out to a total death rate of 0.014%, which would put us somewhere in between Germany on the one hand and Spain and Italy on the other for total deaths as a percentage of population.
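Here is the arithmetic behind those percentages gathered in one place (populations rounded; the death totals are the rough projections I guessed at above):

```python
# (population in millions, projected total deaths as guessed above)
countries = {
    "South Korea":   (52,  217),
    "Italy":         (60,  25_000),
    "Spain":         (47,  20_000),
    "Germany":       (83,  6_000),
    "United States": (327, 45_000),
}
baseline = 217 / (52 * 1_000_000)          # South Korea's per-capita death rate
for name, (pop_millions, deaths) in countries.items():
    rate = deaths / (pop_millions * 1_000_000)
    print(f"{name:14s} {rate:.4%} of population  ({rate / baseline:.0f}x South Korea)")
```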

These numbers are all impossible to explain by the theory that the disease is simply spreading naturally and hitting its natural peak due to herd immunity building in all the countries of the world in which it is spreading.  Why should South Korea hit that natural peak at one hundredth of the per-capita death toll that Italy did?  Why should Germany hit it at more than 15 times South Korea's level, but at less than a fifth of Italy's and only about half of the United States'?

Conclusion

The conclusion here is quite clear: Covid-19 is currently being limited by drastic social distancing measures (in the case of most of the world) or a combination of early testing and case management plus less drastic social distancing measures (in the case of South Korea and some others).  There is no way for naturally acquired herd immunity to explain the current decrease in rates of infection and death that we are seeing, but it is easy to explain this using the standard, accepted theory of the disease's spread.  The very different percentages of the total population that will die are therefore due strictly to the difference in promptness with which these different nations implemented effective disease control measures.

Tuesday, March 31, 2020

The Imperial College Study: Part 3B

[Links to the full series]

Part 1
Part 2
Part 3A
Part 3B

------------

 

1. How were the CFR and IFR calculated for the Imperial College study?


Since the Imperial College study is a microsimulation, it was simulating a whole population and hence needed to use an IFR, not a CFR. It got an IFR from this study: https://www.medrxiv.org/conte…/10.1101/2020.03.09.20033357v1

Let's look at how this study came up with their numbers:

The report estimated the CFR and IFR based on three different datasets and using multiple techniques in order to validate the results. They looked at data from mainland China (70,000+ cases as of February 11th, then cross checked with latest results as of March 3rd), data from the Diamond Princess cases, and data from cases being tracked outside China (about 2000 cases as of February 25th).

My description of their techniques is going to be very much an oversimplification because I find it hard to describe the statistical techniques employed in a short space. Here's my understanding of some of the key features of the technique used for the mainland China data:

  1. They broke the population into 10 year age bands and assumed that covid-19 would attack each age band equally.
  2. They took the actual age demographics of the infected areas and projected how many people should get sick in each age band.
  3. They looked at how many people were diagnosed as sick in each age band. For the younger age bands, this was fewer than projected given an equal attack rate. This gave them an age-band-specific underreporting amount.
  4. Most of the fatalities were in Wuhan, but Wuhan had a much higher fatality per reported case than mainland China. They assumed this was due to hospital overcrowding causing milder cases to get turned away, so they added in a further factor representing hospital overloading to scale down the Wuhan numbers to be in line with the rest of the Chinese numbers.
  5. For each age band, they then identified the CFR that--given the onset-to-death times which were observed, and the underreporting factors they had identified--would have produced the observed total cumulative deaths as of the most recent data.
  6. They then aggregated these age-band specific CFRs into a population-wide CFR, which turned out to be 1.38%. Note, though, that this number is specific to the Chinese age demographics.

To estimate an IFR from this CFR, they used data from people repatriated out of China back to their homelands. All of these people were tested, and it was discovered that there were about as many asymptomatic people who tested positive as there were symptomatic people who tested positive. This led to the final IFR for the Chinese outbreak being estimated at 0.66%. I should note, though, that the data sample here was particularly small: a total of 6 asymptomatic people who tested positive from those flights.
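To make the shape of that calculation easier to follow, here is a deliberately simplified sketch of the aggregation and of the CFR-to-IFR step. The age bands, case counts, and reporting fractions are illustrative placeholders of my own, not the paper's data, and I have left out the onset-to-death delay correction and the heavier statistical machinery the paper actually uses.

```python
# Illustrative placeholder numbers -- not the paper's actual data.
population_share = {"0-29": 0.35, "30-59": 0.45, "60+": 0.20}
reported_cases   = {"0-29": 1_000, "30-59": 5_000, "60+": 9_000}
deaths           = {"0-29": 1,     "30-59": 40,    "60+": 350}

# Steps 1-3: assume every age band is attacked equally, so true case counts
# should be proportional to population share.  Anchor the scale to the oldest
# band, where case ascertainment is assumed to be essentially complete.
total_true_cases = reported_cases["60+"] / population_share["60+"]
true_cases = {band: total_true_cases * share for band, share in population_share.items()}
underreporting = {band: reported_cases[band] / true_cases[band] for band in true_cases}

# Steps 5-6: band-specific CFRs against the corrected case counts, then a
# population-wide CFR (the onset-to-death delay correction is omitted here).
band_cfr = {band: deaths[band] / true_cases[band] for band in true_cases}
overall_cfr = sum(deaths.values()) / sum(true_cases.values())

# CFR -> IFR: the repatriation-flight data suggested roughly half of all
# infections never become symptomatic "cases" at all.
overall_ifr = overall_cfr * 0.5
print(underreporting)
print(band_cfr)
print(f"CFR ~ {overall_cfr:.2%}, IFR ~ {overall_ifr:.2%}")
```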

To validate their IFR using the Diamond Princess data, they took a timeline of onset-of-symptoms for the 705 diagnosed passengers on the cruise. Then applying their age-specific onset-to-death results on the actual ages of the diagnosed passengers, they projected that by March 5th, between 3 and 14 people should have died if their IFR was correct. Since 7 passengers had died by that date, the Diamond Princess case was judged to be consistent with their results.

To separately estimate a CFR from all of the cases outside China, they used two different methods, neither of which I have looked into enough to understand. At the time this study was done, this was a pretty small sample size (1334 cases out of 2000+ met their inclusion standards). Also, they didn't have individual-level onset-of-symptom or recovery data for a lot of those cases. For these two reasons, the CFR estimates cover a wide range, from 0.4% to 7.2%, with 1.2% being the best fit to the data. This basically validated the reasonableness of the 1.38% result from mainland China.

2. That's a lot of information. What's the bottom line for the Imperial College study again?


What the Imperial College study took from all of the above is that the Covid-19 IFR is about 0.66% for Chinese demographics. They also took the age-specific IFR from the study and applied it to the older Great Britain demographics to get an IFR that they used for their simulations of 0.9%. They also used the onset-to-death time periods and the percentage of hospitalizations from that study (which I didn't get into here but was another thing calculated from the same data).

Monday, March 30, 2020

The Imperial College Study: Part 3A

[Links to the full series]

Part 1
Part 2
Part 3A
Part 3B

------------

Now I'm going to look at what fatality rate the Imperial College used, and how it was derived. I'll be splitting this part up into two sections. First (3A), I'm going to look at what a CFR is in general and how it's often calculated. Second (3B), I'll look at specifically how it was determined in the Imperial College study.

1. What is a CFR?


CFR stands for "Case Fatality Rate". It is the proportion of those people who are diagnosed with a certain disease who die over some period of time. This distinguishes it from "Mortality", which is the proportion of a total population who die of a particular disease--so MERS, for example (another zoonotic coronavirus), has a massive CFR because a large fraction of the people who catch it die, but a tiny Mortality because very few people catch it at all.

CFR is sometimes a confusing number, however, because of an ambiguity in what "Case" stands for. Is a "case" only someone who has been officially diagnosed with the disease by a doctor? Or is it anybody who can be proved to have had the disease? CFR is actually used in both ways at different times. So, for example, you may have heard it said that the CFR for the seasonal flu is about 0.1%. This is using CFR in the second sense. What epidemiologists do is try to get a good estimate of the total number of people who have actually had the flu in a given season, whether or not they were diagnosed with the flu. Then they try to get a good estimate of the number of people who died because of the flu, again, whether or not they were diagnosed with it. Then they divide the second number by the first number.

This use of "CFR" can only be fully accurate after the fact, when an epidemic or an outbreak has passed. So people *also* use "CFR" to refer to the number of people who are *diagnosed* with a disease who then die. This is useful at the time of an outbreak because it gives a sense of the seriousness of the disease right away. But this usage is recognized to be always an overstatement of the eventual fatality of the disease, because it does not account for those people who come down with the disease but never seek medical treatment, and these people will always be heavily skewed towards the milder forms of the illness.

In order to be more clear about the difference in these two numbers (the percentage of diagnosed cases who die and the percentage of all infected people who die), I would like to use a term that I've seen in some papers, which is IFR, or "Infection Fatality Rate". This would be the percentage of all infected people who die, whereas CFR would be the percentage of those people who are diagnosed with the disease who die.

2. How is the CFR calculated in general?


*After* an outbreak has entirely resolved, CFR is very easy to calculate: just divide the number of diagnosed individuals who died by the total number of diagnoses. During an outbreak, however, these numbers are a moving target and it can be deceptive to try to calculate a CFR from those numbers. Here are some reasons why:

  • Early on in an outbreak, a lot more people are going to be in an early phase of the disease, not a late phase. So a lot of *these* people might *eventually* die, but not be at that stage yet. So if you have a lot of testing and record a lot of sick people at an early stage, you might *understate* your CFR.
  • On the other hand, if you *don't* do a lot of testing, the first wave of people in an outbreak you'll find out about are those people who show up at a hospital very sick. These are going to be disproportionately the percentage of the population who got the worst form of the disease or who got hit particularly hard by it. You can expect these people to die at a higher rate than the general population. So if you don't have good testing of your whole population, your initial CFR rates will be *overstated*.
  • Also, the people who die earlier might be the people who are more likely to die anyway. For example, most of the people who die from covid-19 are probably older people *and* they are more likely to die earlier than the younger fatalities. If you try to calculate a CFR from initial death rates without correcting for an age factor, your initial CFR rates will be *overstated*.

In order to best calculate the true CFR from data as it comes in during an outbreak, you need to take both timeline and vulnerability data into account. For timeline correction, you need to find out the average time from onset of symptoms till death, and then plot both your death and infection counts chronologically, offset by that time lag. Given a sufficient length of time, the ratio between the two will converge on your true CFR. For vulnerability correction, you should identify the different groups that have different risk levels and split up your CFR calculation to do a separate one for each of those groups.
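As a concrete illustration of the timeline correction (a minimal sketch, not any particular study's method): shift the cumulative case curve back by the average onset-to-death delay before dividing, so that today's deaths are compared against the cohort of cases that is actually old enough to have died by today.

```python
import numpy as np

def lag_adjusted_cfr(daily_cases, daily_deaths, onset_to_death_days):
    cum_cases = np.cumsum(daily_cases)
    cum_deaths = np.cumsum(daily_deaths)
    lag = onset_to_death_days
    # Deaths reported through day t divided by cases diagnosed through day t - lag.
    return cum_deaths[lag:] / np.maximum(cum_cases[:-lag], 1.0)

# Toy example: a true CFR of 2% with an 18-day onset-to-death delay.
cases = np.array([int(10 * 1.2 ** t) for t in range(60)])
deaths = np.zeros(60)
deaths[18:] = 0.02 * cases[:-18]
print(lag_adjusted_cfr(cases, deaths, 18)[-5:])   # recovers the 2%
print((deaths.cumsum() / cases.cumsum())[-5:])    # the naive ratio badly understates it
```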

A key point I'd like to re-emphasize: the calculation of a CFR for a particular outbreak always gets more accurate as time goes along. This means you should pay attention to CFR as generated by those locations where outbreaks occurred *first*, when you are tracking a pandemic that is moving across the globe.

3. How is the IFR calculated in general?


In order to calculate the IFR, you need to be able to estimate how many people in your population get the disease without ever coming in to the doctor. You then scale your CFR down accordingly--multiply it by the fraction of all infections that actually get diagnosed--to get the lower IFR.

In practice, this is done in a couple of ways. The "gold standard" here is to do a serological study: you select random people from your population and look for the tell-tale antibodies in their blood that show that the person had a specific disease and developed an immunity to it. You can then get a good number for the percentage of your total population that had that disease recently, whether or not they were caught in the official tallies.

Another way to establish this is by random surveys--you just ask people whether they had a particular disease or not and whether they went to the doctor for it. This is less accurate than the first method because the people you are trying to count are necessarily self-diagnosers, and the accuracy of their self-diagnosis is likely to be flawed.

The CDC has used both approaches in the past and come up with a roughly 50% proportion for the seasonal flu. That is, in order to calculate how many people in total have the flu each year, they multiply the cases determined from hospital records by about 2. Note, however, that this number is specific to the seasonal flu. In general, the more severe a disease is, the more people who have that disease will go to the doctor and the lower the proportion of that "undiagnosed" population will be.

The Imperial College Study: Part 2

[Links to the full series]

Part 1
Part 2
Part 3A
Part 3B

------------

Now I will start looking at the parameters that the Imperial College study used to characterize the coronavirus in its simulations. I'll start by looking at the basic reproduction number: the now-famous r(0) ("r naught").

1. What is r(0)?


R(0) is defined as the number of secondary infections a single infected individual will cause in a completely susceptible population. It arises from a combination of multiple elements:

  •   how naturally infective the particular virus is
  •   how long an infected person is contagious *while* being still present in a population
  •   how many people an infected person can be expected to contact during the infective period


Importantly, while the first two elements are virus dependent, the third is societally dependent. The more mobile and mixing a population is, the higher the r(0) will be. "Social distancing" is therefore a way of changing the r(0) of a particular disease by altering that third element.

From the perspective of a microsimulation, however, epidemiologists can just take the r(0) as a constant which determines the chance (on average) that any given non-infected individual coming into contact with an infected individual will become infected. This is because the microsimulation intrinsically accounts for degrees of social mixing by its rules that govern individual behaviors.

2. How is r(0) calculated in general?


This is an oversimplification, but I am aware of two methods of calculating an r(0).

First, an r(0) is said to be related to the early doubling rate of infections during an epidemic. During this early time period, there are so many more non-infected people compared to infected people that each instance of an infection will have essentially a "clean slate" of people to infect. Later on, an increasing percentage of infected individuals will come into contact with already-infected individuals, thus slowing the growth rate. Furthermore, in the earliest phases of an epidemic, the disease will be spreading without a lot of symptoms and without causing major alarm, so that social mixing will be normal as a whole. Later on, fear of the disease spreading through society will itself change mixing behavior and therefore the r(0).
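To get a rough feel for how a doubling time translates into an r(0), here are two textbook approximations--one assuming a fixed generation interval, one the SIR-style exponential version. These are back-of-the-envelope formulas of my own, not the method either of the studies below actually used, and the 7.5-day generation interval is simply an assumption I am plugging in for the sketch.

```python
import math

def r0_estimates(doubling_time_days, generation_interval_days):
    growth_rate = math.log(2) / doubling_time_days
    fixed_interval = math.exp(growth_rate * generation_interval_days)  # fixed generation interval
    sir_style = 1 + growth_rate * generation_interval_days             # exponentially distributed interval
    return fixed_interval, sir_style

# A ~7.5-day doubling time (like the early Wuhan data discussed below) vs. the
# 3-4 day doubling seen later in Europe, both with an assumed 7.5-day generation interval.
print(r0_estimates(7.5, 7.5))   # roughly (2.0, 1.7)
print(r0_estimates(3.5, 7.5))   # roughly (4.4, 2.5)
```

Whichever approximation you prefer, a faster doubling time pushes the implied r(0) up, which is relevant to the European numbers discussed below.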

Second, you can examine a timeline of infection and death rates after the fact from a given epidemic event, and then run microsimulations for that population using different r(0) values until you reproduce the observed curve in the simulation.

3. How was the r(0) used in the Imperial College study calculated in particular?


The Imperial College simulation used the results from two different studies, each taking one of these approaches.

The first study, done by the China CDC (available here: https://www.nejm.org/doi/full/10.1056/NEJMoa2001316), looked at the first 425 cases reported in Wuhan and reconstructed an infection timeline through interviews that established when symptoms first occurred in each case. From this they saw an initial doubling time of the disease of about 7.5 days, which calculated out to an r(0) of 2.2, with a 95% confidence range of 1.4 to 3.9.

The second study was done by Julien Riou and Christian Althaus and funded by the Swiss National Science Foundation (available here: https://www.eurosurveillance.org/…/1560-7917.ES.2020.25.4.2…). They ran 2 million separate epidemic simulations with varying parameters and looked for those parameters that could reproduce the timeline of the infections as of January 29th, which at the time included 5,997 confirmed cases in China and 68 confirmed cases exported to other countries. Importantly, they allowed the number of cases *actually* in China in their simulations to vary substantially, in order to account for potential massive misreporting of the Chinese data. The result of this study was an r(0) of 2.2, with a 95% confidence range of 1.4 to 3.8.

4. How confident can we be in the results of these two studies?


Both studies that establish an r(0) of 2.2 for Covid-19 acknowledge the fact that they are operating on limited data, collected in a crisis situation. They both, therefore, should be treated as preliminary best estimates given the data that we had in mid February. Nevertheless, the fact that they agree very closely after taking very different approaches to the estimation does count for something.

Furthermore, I think there is good reason to believe that the true r(0) isn't lower than 2.2 given the newer European data that we have. If you look at infection growth rates for all of the European countries and for the United States since community spread started occurring in late February / early March, you can see a pretty consistent slope of the lines, yielding a doubling time of 3-4 days. This is significantly *faster* than the doubling time that was estimated for the Chinese outbreak. I believe that this makes an r(0) of significantly less than 2.2 very difficult to justify. I think the Imperial College agreed, because they used a base doubling time of 6.5 days (somewhere between current European numbers and the Chinese numbers) and a consequent r(0) value of 2.4. They did also explore values with a range from 2.0 to 2.6 to account for uncertainty.

So I think the calculations that the Imperial College did are a very reasonable interpretation of the best data that we have. I think newer studies taking into account the rapidly evolving reports of infections, given much greater worldwide testing, would be good to do, but I very much doubt these studies would come to different conclusions right now. If anything, they would probably raise the estimated r(0), not lower it.

The Imperial College Study: Part 1

[Links to the full series]

Part 1
Part 2
Part 3A
Part 3B

------------

A bit ago I posted a link to this study by the Imperial College of London, which is a projection of how the U.S. and Great Britain can expect the Covid-19 disease to progress in the next two years under various scenarios of different levels of social isolation: https://www.imperial.ac.uk/…/Imperial-College-COVID19-NPI-m…
I want now to do a series of posts explaining this study and the current research behind it, because I think this might be the most influential study currently informing government decisions. I encourage you to read the study yourself, but I won't assume this.

1. What is the nature of this study?

This study is the result of a series of computer simulations run by epidemiologists. Specifically, it is an application of pre-existing "microsimulation" (https://en.wikipedia.org/wiki/Microsimulation) software with parameters set to simulate Covid-19 on a virtual population. A "microsimulation" is a computer model in which the behavior of a population is modeled by creating millions of individual virtual people and having them move around in a virtual world according to a set of behavioral rules. If you've ever played any of the Sim City games or any of the Tycoon games, you've played with a simple, small-scale microsimulation. A lot of people at this point have seen an *extremely* crude microsimulation recently published in an article by the Washington Post: https://www.washingtonpost.com/…/20…/world/corona-simulator/.

In epidemiology, these things are incredibly useful because you can simulate the spread and effects of a disease much more accurately than by just talking in broad percentages. You set up your population according to the actual characteristics of the real world: children will congregate in schools during school days but stay home on the weekends, old people will live in nursing home clusters, working-age people will congregate in businesses and will travel a range of distances to work using public transportation according to actual percentages from real life that you enter, and so forth. Your virtual population will be created with a range of ages and existing health conditions that you also enter, based on real-world numbers that are correct for the population and time period you are modeling.

Then you characterize your disease: how long is the incubation period? What is the range of severity of symptoms and does this vary on an age basis or by existence of co-morbidities? What is the range of infectiousness of the disease, and how does it alter over the timeline of the progression of symptoms? How close does one person need to be to another person to spread the disease? How much does having the disease itself limit the motion of an individual and therefore the likelihood that one person will continue contacting other people?

Then you introduce the disease into the virtual population and play the simulation forward a certain number of times with different random seeding, and it gives you a composite average result of what happens.
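To give a feel for what "millions of virtual people moving around according to rules" actually means, here is a toy microsimulation of my own. It is nothing like the real model, which layers on households, schools, workplaces, commuting patterns, age structure, and so on; it only shows the basic mechanic of agents, movement, proximity, and infection.

```python
import random

random.seed(1)

# World and disease parameters for the toy model (picked by hand, purely illustrative).
N, SIZE, DAYS = 1_000, 80.0, 100
INFECT_RADIUS, INFECT_PROB, INFECTIOUS_DAYS = 1.5, 0.25, 8

people = [{"x": random.uniform(0, SIZE), "y": random.uniform(0, SIZE),
           "state": "S", "days_sick": 0} for _ in range(N)]
for p in random.sample(people, 5):          # seed a handful of initial infections
    p["state"] = "I"

for day in range(DAYS):
    # Everyone wanders a little each day; sick people move less.
    for p in people:
        step = 1.0 if p["state"] == "I" else 3.0
        p["x"] = min(SIZE, max(0.0, p["x"] + random.uniform(-step, step)))
        p["y"] = min(SIZE, max(0.0, p["y"] + random.uniform(-step, step)))

    # The virus can jump to susceptible people who pass close to an infected person.
    infected = [p for p in people if p["state"] == "I"]
    for p in people:
        if p["state"] != "S":
            continue
        for q in infected:
            if ((p["x"] - q["x"]) ** 2 + (p["y"] - q["y"]) ** 2 < INFECT_RADIUS ** 2
                    and random.random() < INFECT_PROB):
                p["state"] = "I"
                break

    # Disease progression: infected people recover (and become immune) after a while.
    for q in infected:
        q["days_sick"] += 1
        if q["days_sick"] >= INFECTIOUS_DAYS:
            q["state"] = "R"

    if day % 10 == 0:
        counts = {s: sum(p["state"] == s for p in people) for s in "SIR"}
        print(day, counts)
```

Run it several times with different random seeds and average the outputs, and you have a crude version of the "composite average result" described above.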

2. What are some advantages to this approach?


There are a number of great advantages to this approach in epidemiology. A lot of statements about how diseases spread are actually abstractions of just this sort of real-life behavior. We say that diseases spread exponentially in the early phases of an epidemic, but this is just a rough mathematical approximation of the behavior of a self-replicating virus as it spreads through networks of people. A good microsimulation will take a more exact accounting of real-world social distances and movement. There will be a real-world mix of dense urban centers and more spread-out suburban neighborhoods, there will be an age and health spread in the population you are simulating that matches the real world, sick people will slow down their movements, etc.

There have been some people who have doubted that Covid-19 will behave as badly in the U.S. as it has in China and in Italy. They've raised potential differences between us and them that could make a difference: different baseline health of the population, different population density, different levels of cultural contact, different standards of hygiene, different access to health care. If the microsimulation is a good match to a given population, none of these problems would apply to its projections.

Furthermore, the type of human and virus behavior that needs to be modeled in order to get a realistic simulation is not actually that complex. Basically, the only human behavior that needs to be about right is movement and proximity. Viruses are also pretty simple organisms, and it doesn't take too much data to get a pretty accurate picture of the average behavior of an infection in humans in aggregate. So I think the results of these simulations tend to be fairly robust.

Another advantage is that you can use the same model to test multiple scenarios. Not sure the exact R(0) of your virus? You can run your simulation with a range and see how it affects the outcome.

3. What are some disadvantages to this approach?


Not a lot, but I can think of a few. First, as with any computer simulation, the results are only as good as the underlying assumptions and the underlying programming. This can cause it to miss some things. For example, we are now in an absolutely unprecedented level of public awareness and discussion about the Covid-19 pandemic. This itself is likely to change societal behavior--has the model taken this sort of public awareness and consequent fear of spreading infection into account? I don't know. Also, if the underlying characteristics of the virus are not correctly understood, then you will have an example of "garbage in, garbage out". Yes, you can run the simulation on a range of inputs, but if your understanding of the parameters is very far out of line with reality, the results won't be very helpful.

Second, the results these types of simulation provide are very specific and realistic. But "realistic" is not the same thing as "real" and I think these simulations tend to produce a bit of over-confidence in their results because it looks like you're observing reality when you are just roughly simulating it.

4. How good a fit to reality is the model that was used in this study?


That, I am not sure of. However, I think there is good reason to trust that it's a fairly good fit. With epidemiology, we have an opportunity to run such models and test their output every year, because we use them to predict the movements of the seasonal flu. We've used them to model known outbreaks from the past to see if the results they produce match historical known results.

I'll stop my first post here. In a subsequent post, I'll get into the assumptions which this particular study used: how it characterized the virus and how it justified those characterizations.