Wednesday, September 22, 2021

Dying "with" vs. Dying "from" Covid, pt. 2

For the first method of data analysis, I note that the official Covid death tally is surmised to be composed of two series of numbers: the people each day who die of some random cause but only happen to be infected with Covid, and the people each day who actually die of Covid.  And *both* of these series of numbers will be related to another series of numbers: the number of people each day who are diagnosed with Covid.  However, the two types of people who die each day will each have a *different* relationship to this number.

For the people who die of some other completely unrelated cause, the number of those people--who just happen to also have Covid--will be directly related to how many people currently have Covid in the population.  If a lot of people happen to have Covid at some time, a lot of people who die *at that time* will also happen to have Covid by coincidence.  If few people happen to have Covid at that time, few people will die coincidentally also having Covid.  So if you plotted the number of people who have Covid at any particular time on the same graph as the number of people who die "with" Covid at any particular time, the second graph will be a mirror of the first graph (but smaller).

The same thing is true of people who die "from" Covid--*except* for the important fact that this graph would be not only mirrored, but also time shifted.  It takes some time after you are diagnosed with Covid to actually die of Covid.  So if a lot of people at a particular time are diagnosed with Covid, then *later on* a lot of people will die from Covid--but not right away.

This time dependency represents a difference between the two types of people that we are surmising compose the total official death tally of Covid.  We should then be able to separate out roughly how many people fall into each category by doing a time-dependent analysis.
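To make the timing difference concrete, here is a minimal Python sketch of a toy model.  It is only an illustration, not the program used for this post: the case numbers, the two rates, and the single fixed 3-day delay are all invented (a real time-to-death is spread over a range of days), and `simulate` and `peak_day` are hypothetical helper names.

```python
def simulate(cases, with_rate, from_rate, delay):
    """Build toy "with" and "from" death series from a daily case series.

    Assumption: coincidental deaths track cases on the same day, while
    causal deaths track cases from exactly `delay` days earlier.
    """
    died_with = [with_rate * c for c in cases]
    died_from = [
        from_rate * cases[t - delay] if t >= delay else 0.0
        for t in range(len(cases))
    ]
    return died_with, died_from


def peak_day(series):
    """Index of the largest value in the series."""
    return max(range(len(series)), key=lambda t: series[t])


# Invented numbers, for illustration only.
cases = [0, 10, 40, 90, 160, 90, 40, 10, 0, 0, 0, 0]
died_with, died_from = simulate(cases, with_rate=0.002, from_rate=0.018, delay=3)

print(peak_day(cases), peak_day(died_with), peak_day(died_from))  # prints: 4 4 7
```

In this toy model the coincidental deaths peak on the same day as the cases, while the deaths "from" Covid peak `delay` days later.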

My Analysis

Here was my approach, using publicly available datasets and a custom Python program:

I assumed that the number of "deaths with" (the coincidental deaths) included in the official death tally was some fairly constant percentage of the total deaths (seeing as I couldn't think of any good reason for this to change over time).  I also assumed that the number of these deaths over time would be directly proportional to the number of Covid cases at the time.  I could therefore generate a time series that represented those deaths by taking the time series of confirmed cases per day and scaling it down until the number of deaths it represented equaled a given percentage of the total official death tally.

I made this target percentage (the percentage of deaths in the official tally which are "spurious") a variable so that I could generate multiple time series of spurious (or coincidental) deaths per day corresponding to any target magnitude of this effect I wanted.
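The scaling step described above could be sketched as follows.  This is an assumed reconstruction rather than the actual program: `spurious_deaths` is a hypothetical name, the inputs are plain lists of daily counts, and the numbers are invented.

```python
def spurious_deaths(cases, deaths, fraction):
    """Scale the daily case series so it sums to `fraction` of all deaths.

    `fraction` is the hypothesized share of the official tally that is
    coincidental ("deaths with"), e.g. 0.10 for 10%.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    total_cases = sum(cases)
    if total_cases == 0:
        return [0.0 for _ in cases]
    scale = fraction * sum(deaths) / total_cases
    return [scale * c for c in cases]


# Invented numbers, for illustration only.
cases = [100, 200, 400, 300]
deaths = [1, 2, 5, 12]
with_covid = spurious_deaths(cases, deaths, 0.10)
```

Here the official tally is 20 deaths, so the 10% run produces a series with the shape of the case curve that sums to 2 deaths.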

For each iteration of my run, I would generate the "spurious" deaths that would correspond to a given magnitude.  I then subtracted these deaths from the official tally.  The hypothesis of this particular run would be that the remaining deaths were the deaths caused "by" Covid, and should therefore match the Covid infection curve, but with a time delay.  I then scaled these deaths up to match the infection curve and found the best time delay which caused the death and infection curves to match.

By doing this for a target "spurious" death percentage of 0%, 10%, 25% and 50%, I figured I could see which rate of "deaths with" resulted in the best final match between time-shifted deaths and the original infections.  That is, the closer my arbitrary percent of "deaths with" ended up being to reality, the better the remaining deaths would correspond to the infections that actually caused them.
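The subtract, scale, and shift procedure, swept over the four target percentages, might look roughly like the sketch below.  It is not the program used for this post, and several details are assumptions: the "match" is measured by mean squared error, the deaths are scaled so their total equals the case total over the overlapping days, and only whole-day shifts are tried.  The data is synthetic, built with a known 10% coincidental share and a 3-day delay so the method can be checked against a known answer; `best_lag` is a hypothetical name.

```python
def best_lag(cases, deaths, fraction, max_lag):
    """Return (lag, error) for the shift that best aligns deaths with cases.

    Assumes equal-length daily series and a nonzero case total.
    """
    if len(cases) != len(deaths):
        raise ValueError("series must have the same length")
    # Coincidental deaths: the case curve scaled to `fraction` of all deaths.
    scale = fraction * sum(deaths) / sum(cases)
    residual = [d - scale * c for c, d in zip(cases, deaths)]

    best = None
    for lag in range(max_lag + 1):
        # Pair each day's cases with the residual deaths `lag` days later.
        c = cases[: len(cases) - lag]
        r = residual[lag:]
        total_r = sum(r)
        if total_r <= 0:
            continue  # nothing left to scale up at this shift
        k = sum(c) / total_r  # scale deaths up to the size of the case curve
        err = sum((ci - k * ri) ** 2 for ci, ri in zip(c, r)) / len(c)
        if best is None or err < best[1]:
            best = (lag, err)
    return best


# Synthetic data: 10% of deaths coincidental, the rest delayed by 3 days.
cases = [0, 10, 40, 90, 160, 90, 40, 10, 0, 0, 0, 0]
delay = 3
deaths = [
    0.002 * cases[t] + (0.018 * cases[t - delay] if t >= delay else 0.0)
    for t in range(len(cases))
]

results = {f: best_lag(cases, deaths, f, max_lag=6) for f in (0.0, 0.10, 0.25, 0.50)}
for f, (lag, err) in results.items():
    print(f"{f:.0%} spurious: best shift {lag} days, error {err:.3g}")
```

On this synthetic input the 10% run recovers the 3-day delay with essentially zero error, and the other percentages fit worse.  Real data is noisier, so there the comparison between percentages is a matter of degree rather than an exact recovery.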

The results were as follows (orange is scaled-up deaths, blue is infections):


As you can clearly see, assuming that "deaths with" Covid account for either 0% or 10% of the total deaths results in a perfectly reasonable final death curve that matches the causal infection curve pretty nicely.  However, the further you increase this number above 10%, the worse the match becomes.

Periods of Rapid Infection Growth

The most telling parts of these curves are the sections in which infections are increasing rapidly--primarily at the start of the Fall/Winter surge of 2020 and the current Summer surge of 2021.  The reason these diverge so strongly is that when infections are rising very rapidly, you can start getting large differences between the infections and the time-delayed deaths.  That is, you see large numbers of infections two weeks into one of these very rapid surges, whereas the deaths have not moved at all.  These time periods are extremely hard to explain using the "deaths with" hypothesis--if the infections are rising rapidly, why are coincidental deaths not also rising rapidly?  And you can see this divergence visually in my analysis by the big dips in the resulting death graph compared to infections during those periods.

You can see this problem already starting to emerge even in the 10% graph, as is clear in this blowup focusing in on the start of the Summer '21 surge:



That specific downward divergence problem only gets worse and worse as the hypothesized percentage of "spurious" deaths increases (as do other problems).  For this reason, I think that the 10% hypothesis has already slightly overshot the reality of how many coincidental deaths there actually are.  I would therefore put 10% as the upper cap on how much of the official death tally could be caused by purely coincidental deaths.

Another Important Factor: Amount of Time Shift

Another important thing to consider is how much the death graph had to be shifted back in time to match up with the infection graph.  Because removing spurious deaths takes deaths away from the left side of the death curve, in order to make the resulting curve match up with the infection curve, I had to increase the amount of time shift each time I increased the percentage of total deaths that I deemed spurious.

For the hypothesis that 0% of the total death tally is spurious, I had to shift the deaths back 20 days to get them to match up with the infections properly.  I had to increase this a few days for each subsequent graph, all the way up to 30 days of time shift for the graph where I assume 50% of the total death tally is spurious.

Here it is important to note that the average time-to-death from infection has been established independently based on case studies, and it's normally given as something in the range of 18 days.  This also argues against positing that the total percentage of spurious deaths goes very far above 0%--it's another way that the hypothesis results in unrealistic data the larger this percentage gets.

Some Closing Comments on this Analysis

1. Just to comment in case someone was confused: yes, there is a clear divergence between deaths and confirmed infections at the beginning of the graph.  This is a known issue caused completely by the fact that at the beginning of the pandemic we had very poor testing, meaning that the actual amount of Covid infection was far higher than what appeared by the number of confirmed Covid cases.  This hasn't been a problem since mid-last year.

2. One objection might be made: suppose there were other causes of overreporting aside from purely coincidental deaths?  This analysis doesn't rule those out per se; however, given how well the time-shifted deaths match the infections (when scaled), those causes of overreporting would have to be somehow time-matched to actual Covid deaths.  That is, the overreporting would get worse when *actual deaths from Covid* go up (not just Covid infections) and get better when these deaths go down.  I have not yet been able to think of a cause of overreporting that would be proportional to correct reporting in such a way.

3. Finally, I should note that this analysis will only catch overreporting of deaths due to coincidental Covid infections.  It would not catch any *underreporting* of Covid deaths.  Most causes of underreporting that you might think of would actually be time-matched with the actual deaths: for example, suppose elderly people with severe comorbidities who died of heart attacks due to stress on their system caused by Covid were sometimes thought to have died just from the heart attack, because it was known that their hearts were weak already.  In this case, a certain percentage of deaths actually caused by Covid could be put down as "just heart attacks" by whoever recorded their death.

This could happen on a regular basis a certain percentage of time and it would not show up as an anomaly on this kind of a comparison graph, since the deaths are just scaled up to match with the infections anyway.  That would be one time-matched factor causing deaths to be *underreported*, and others could also be easily thought of.

This means that this particular analysis does not offer any sort of cap on how much the official death tally might be under-representing the actual death toll of Covid.  More on this point in Part 3.
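Why a constant rate of underreporting is invisible to this method can be shown in a few lines.  This is a toy sketch with invented numbers; `scale_to_match` is a hypothetical helper, and it assumes the deaths are scaled so that their total equals the case total, which may differ from how the graphs above were scaled.

```python
def scale_to_match(cases, deaths):
    """Scale the death series so its total equals the total of the case series."""
    k = sum(cases) / sum(deaths)
    return [k * d for d in deaths]


# Invented numbers, for illustration only.
cases = [10, 40, 90, 160, 90, 40, 10]
true_deaths = [1, 4, 9, 16, 9, 4, 1]
reported = [0.5 * d for d in true_deaths]  # half of all deaths go unrecorded

scaled_true = scale_to_match(cases, true_deaths)
scaled_reported = scale_to_match(cases, reported)
same_curve = all(abs(a - b) < 1e-9 for a, b in zip(scaled_true, scaled_reported))
print(same_curve)  # prints: True
```

Because the scaling factor absorbs any constant multiplier, the scaled curve is the same whether all deaths or only half of them were recorded, so the comparison with infections cannot tell the two cases apart.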

Conclusion

The hypothesis that a significant portion of the official tally of Covid deaths are actually coincidental and result from some other cause is consistent with the timing of those deaths, but only if the total proportion of coincidental deaths is held at about 10% or below.  Meanwhile, significant *undercounting* of Covid deaths for other reasons remains, at this stage, a possibility.
