NewsWire: 5/17/21

  • A new model of excess deaths during the pandemic estimates that the worldwide excess death toll has been between 7 and 12 million. This would mean that the official numbers represent only one-half to one-quarter of the true death toll. (The Economist)
    • NH: In future decades, when we look back at the Covid-19 pandemic, how will we think about it? As a serious demographic event? Or, from the vantage point of history, as a relatively minor episode?
    • If you focus on the total official count of Covid-19 deaths worldwide, now at just under 3.4 million, you would not say it's that serious.
    • After all, the global death toll from the Spanish Influenza in 1918-19 was (according to the most recent and accurate estimate) around 15 to 25 million. Going back a bit further, we could compare it to the Third Plague Pandemic, a new wave of bubonic plague which spread worldwide out of Yunnan in the 1850s, peaked in the 1890s in India, and ravaged Manchuria in the 1900s. It killed an estimated 10 to 12 million people--though it barely touched western countries, which is why it's little known today. Or you could go back to the two earlier global plague waves: the Black Death in 1347-52, killing 50 million or more, and the Plague of Justinian in 541-42, killing 30 million or more.
    • But hold on a moment. The way experts estimate these historical death tolls is not by looking at official government death tallies--which is what we have for Covid-19. Back then, few such official tallies were made--and those we have are hopelessly inadequate. Instead, the experts look at demographic residuals. They estimate changes in population, by country or region, adjust for births and migration, and then estimate how much total deaths changed relative to the pre-pandemic trend.
    • This is, by the way, similar to how the CDC estimates influenza deaths each year in the United States. "Influenza" is rarely confirmed as a cause of death on a coroner's report (respiratory illness or failure is more likely). To ascertain deaths due to influenza, the CDC needs to statistically infer the total by tracking total deaths by locality and correlating those deaths with confirmed influenza diagnoses by doctors in that locality. President Trump famously brought these estimates to the attention of the American public at a press conference in late February 2020. (See my more detailed explanation in "Why Covid-19 is Unstoppable (Yes, Even in China)")
    • The same challenge confronts demographers estimating Covid-19's death toll. Many people die before they are tested for Covid-19, and even if they are tested, doctors are often unsure whether a negative test was wrong or whether the presence of Covid-19 had anything to do with their death. Even more than influenza, Covid-19 can manifest itself in any number of symptoms--which the coroner may or may not be able to attribute to a specific pathogen in any individual case.
    • Demographers are getting around this challenge by looking at residuals and harnessing the law of large numbers, just as they do when estimating historical pandemics. They calculate, for each country, an "excess death" count. They look at total deaths in a pandemic year and compare them with the average number of deaths that occurred on the same dates over several previous years. Comparing by date adjusts for any seasonal effects. They may further adjust the results to account for other effects, such as the aging of the population or changes in deaths due to easily identifiable causes like homicides or traffic accidents.
    • This "excess deaths" approach, possible in higher-income countries that maintain accurate death counts by week or month, generates a more credible number. It answers the counterfactual question: If deaths behaved during the pandemic the way they behaved before the pandemic, how many fewer deaths would we be experiencing? In most countries where they can be calculated, excess death numbers are running about 15% to 25% higher than the official Covid-19 death totals.
    • Fair enough. But what about countries for which excess deaths cannot be calculated, either because such timely and accurate death records are not collected or in any event are never made public? This is a big problem because these countries tend to have the least reliable official counts of Covid-19 deaths. It's an even bigger problem because these countries predominate in two continents (Asia and Africa) which together hold 77% of the world's population. In just two countries, China and India, the lack of excess-death counts already excludes 35% of the world's population.
    • Enter The Economist and its Economist Group research team. They believe this puzzle is solvable by means of big data and machine learning.
    • The team's basic approach is, first, to amass all the data (by week or month for every country) that could be correlated with Covid-19 deaths. Second, by examining countries that do have accurate excess-death numbers, the researchers estimate reliable correlations between these "possible" data series (like mobility or positive test ratios or share of population over age 65) and actual excess deaths. Finally, they estimate the excess-death numbers in all the other countries by looking at whatever sorts of data these other countries do have.
    • Specifically, they model with gradient-boosted decision trees (a tree-ensemble technique closely related to random forests). And they gauge the uncertainty of their estimates by repeatedly refitting the model on randomly resampled versions of the data (this is called bootstrapping).
    • Enough about the method. Let's now look at the results. 
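
A brief aside before the chart, for readers who want the arithmetic behind an "excess death" count: the basic calculation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the Economist Group's or any agency's actual procedure. The function name `excess_deaths`, the four-week toy series, and the "official" tally of 700 are all made up for the example, and the sketch skips the further adjustments (population aging, trends, accidents) that real estimates may apply.

```python
from statistics import mean


def excess_deaths(baseline_years, pandemic_year):
    """Return (per-period excess, total excess) for one country.

    baseline_years: one list of death counts per pre-pandemic year,
                    all covering the same periods (e.g. the same weeks).
    pandemic_year:  death counts for those same periods in the pandemic year.
    """
    n = len(pandemic_year)
    if any(len(year) != n for year in baseline_years):
        raise ValueError("every year must cover the same periods")
    # Expected deaths in each period = average of that same period in prior
    # years; matching period to period is what absorbs seasonal effects.
    expected = [mean(year[i] for year in baseline_years) for i in range(n)]
    per_period = [obs - exp for obs, exp in zip(pandemic_year, expected)]
    return per_period, sum(per_period)


# Hypothetical data: four "weeks" of deaths in three pre-pandemic years.
baseline = [
    [1000, 980, 950, 900],
    [1020, 990, 940, 910],
    [980, 970, 960, 890],
]
pandemic = [1010, 1150, 1300, 1200]  # hypothetical pandemic-year deaths
official_covid_deaths = 700          # hypothetical official tally

per_week, total = excess_deaths(baseline, pandemic)
undercount_ratio = total / official_covid_deaths
print(per_week, total, round(undercount_ratio, 2))
```

In this toy case the excess-death total comes out higher than the official tally, which is the pattern the reporting countries show. The ratio between the two is the kind of multiple quoted in the takeaways below.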

Trendspotting: What is the Global Covid-19 Death Toll? - May17 1

    • Some advice on how to read this chart. The various gray bands are all of the "official" Covid-19 death counts as published by governments. The red line is the Economist Group's best estimate of total excess deaths. The darker pink is the 50% probability band. The lighter pink is the 95% probability band.
    • A few takeaways.
      • Most obviously, the red line soars high above all the gray bands. Cumulatively, the red line totals to 10.2 million deaths, 3.0X larger than the official number. At its lowest estimate, it is 7.1 million (2.1X larger). At its highest estimate, it is 12.7 million (3.7X larger).
      • Most of the official deaths are located in North America, Europe, and South America. Clearly, most of the unofficial and uncounted excess deaths are located in other continents.
      • This month, daily global official deaths are down slightly from their January peak. But daily global excess deaths are hitting new heights, now reaching 40K per day for the first time ever.
    • The following continental breakdown makes these points clearer.

Trendspotting: What is the Global Covid-19 Death Toll? - May17 2

    • In Europe, the United States, Canada, and Oceania, the official death totals come closest to the estimated excess deaths. Here, in other words, the undercount was smallest--as we might expect. In Latin America, the distance is larger. Still, in countries like Brazil and Mexico--where governments have struggled to cope--the official counts remain fairly accurate.
    • Africa and Asia are another story. In Africa, the official death count thus far of 127K--with hardly any deaths in sub-Saharan Africa (outside of South Africa)--can't possibly be right. In Asia, the Economist Group reckons, the official count isn't much better. India, now under pressure from a massive infection wave, seems to be the only large Asian nation that has recently improved its official death count.
    • For anyone interested in the true global magnitude of Covid-19, there's more in this research than I can cover in this note. For a real-time update on any one country, go to the Economist Group's main modeling page. For background on its modeling methods, go here. For more detail on countries where excess deaths are known, go here.
    • What do I think of the accuracy of the Economist Group's modeling? It's a bold and creative effort--and IMO the best on offer. It's superior to a similar effort by the University of Washington's IHME mainly due to its sophisticated ML approach. The IHME's effort falls short because it derives simple "regional ratios" between excess deaths and reported deaths and then applies those ratios to countries having no excess-deaths data. That's crazy. You cannot estimate China's excess deaths by hypothesizing some sort of "Asian" ratio and cranking out a number based on China's reported Covid-19 deaths.
    • Having said this, I'm still uncertain how reliable the Economist Group's numbers are. If I had to, I would wager they're on the low side--for two reasons.
    • First, its excess-death estimates for countries where excess deaths can be directly observed are mostly lower than what I have seen in other reports. This is probably because the Economist Group takes its estimates directly from Ariel Karlinsky's World Mortality Dataset Project (run by an Israeli think tank, the Kohelet Economic Forum). In order to generate a number for every possible country, the project makes simplifying assumptions and does not adjust for the undercounting of deaths in the most recent months.
    • Also, in calculating a pure "excess deaths" number, the project makes no attempt to eliminate changes that are clearly unrelated to Covid-19. In other words--and this is what happened in several western European countries--if deaths due to accidents fell more than unreported deaths due to Covid-19, then the project reported a negative excess deaths number. In this respect, the IHME generates higher and more realistic Covid-19 death counts for the developed countries. (For the United States, in fact, the IHME offers the highest estimate I've seen yet: 917K deaths to date.)
    • Second, while the Economist Group's approach to estimating excess deaths in Asia and Africa is much smarter than the IHME's, it still leaves much to be desired. My favorite line in its report is this: "The ranges for Africa and Asia are spectacularly wide. So they should be. The data from which to make strong predictions are not available, and in some places do not exist." I appreciate the team's humility here. Unlike the IHME, it doesn't just use a "fudge factor" to fabricate results for three-quarters of the world's population with hardly a sentence explaining what it's doing.
    • Nevertheless, the excess deaths for the non-reporting countries all remain suspiciously low--much lower, overall, than for the reporting countries (that is, for countries where we can calculate excess deaths directly). The Economist Group acknowledges this but justifies it by observing that the non-reporting countries are generally poorer and younger than the reporting countries. While poverty tends to generate a higher death rate, youth tends to generate a lower death rate. The latter effect, these researchers argue, is more powerful than the former.
    • They even show this in a chart.

Trendspotting: What is the Global Covid-19 Death Toll? - May17 3

    • In general, I buy this argument. But the relationship is hardly lockstep. Look at Japan, which has a relatively low death rate though its very high elderly share suggests it is "expected" to have a very high death rate. And what about Peru and South Africa, which are expected to have low death rates but which in fact have high death rates? Furthermore, I wonder: Why is it that so many Latin American countries, which are generally pretty good at population reporting, show much higher death rates than so many Asian countries, which don't report and which often are no younger?
    • Here's what I'm getting at: No matter how brilliant the ML technique, there may be a systematic bias in trying to infer deaths in non-reporting countries by looking at data from reporting countries. The inference assumes that, other than their non-reporting, such countries are essentially the same. But maybe they're not. Maybe non-reporting countries are different in this respect: They don't want anybody to know what their death counts are. And this may affect all the data available in these countries.
    • China is the most conspicuous example. I have no doubt that China has a lower per-capita death rate from Covid-19 than the United States, thanks to its thorough (if sometimes brutal) method of testing, tracing, and mass isolation. Yet how much lower is plausible?
    • According to the Economist Group, China's true per-capita death rate is only one-tenth of America's (roughly 24 per 100K versus 240 per 100K). Is that plausible? China is rapidly aging: Its population share over age 65 is now 11%, versus 15% in the U.S. So not much can be explained here. China's acute-care and public health services remain woefully subpar, especially in rural areas. China's vaccines are low quality (by its own admission), and its rollout has been slow.
    • So I'm wondering: Which data exactly is the Economist Group tracking in China to serve as surrogates for its excess deaths estimate? China's positive-test, hospital-admission, and official Covid-19 death counts seem pretty much worthless. To my knowledge, no researcher has conducted (or at least published) a recent seropositivity survey of a Chinese population. The most unbiased series I can think of would be data on mobility and economic activity. But that's a pretty thin reed to lean on when assessing the cause of death.
    • OK, enough about the limitations of these new estimates. Let's think about their implications. Globally, according to the Economist Group, the pandemic is still raging--in fact, the daily Covid-19 death toll has never been higher. Right now, the team's best guess for the cumulative toll is just over 10 million. Where will it be when the pandemic is over... 15 million? If better estimation procedures serve to increase the estimate, as I think likely, could we eventually be looking at 20 million, with maybe a 15 to 25 million band of probability? Perhaps.
    • If so, we're looking at a historically significant demographic event, after all, something on par with the Spanish Influenza. Combine the absolute death toll with the much larger number of survivors likely to experience higher death rates and debilitating symptoms from "Long Covid," and we're looking at something that will reshape the rest of our lives.
    • Yes, I know, the world's population back in 1918-19 was about one-quarter of what it is today. So on a per-capita basis, the death toll from the Covid-19 pandemic would still only be a fraction of the death toll from the Spanish Influenza. And of course a still-smaller fraction of the death rates from the great bubonic plagues--or from the massive multiple-disease wave that killed 75% to 90% of the Amerindian population from the 16th through the 18th centuries. We're not talking about anything close to these catastrophes.
    • Even so, it's a reminder that microscopic pathogens continue to shape humanity, even with all our affluence and technology.
    • The best historical parallel I can think of would be the so-called Russian Flu of 1889-90, which had echoes lasting throughout the world, including Europe and America, until 1895. It killed an estimated 1 million people--though on a per-capita basis this would be the equivalent of 5 million people today.
    • Most epidemiologists have long believed that the Russian Flu was caused by an Influenza A H2N2 virus. But a growing number are changing their minds. They say there is mounting evidence that it actually represented the virulent first appearance of the OC43 human coronavirus, which today manifests as one of the (generally harmless) common cold viruses. In my best-case forecast for the world of the 2030s, that will be the future of Covid-19: just another pesky cold virus, still ubiquitous but killing no one.
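
For readers who want to check the per-capita comparisons above, here is a minimal sketch of the scaling arithmetic. The round world-population figures (1.5 billion around 1890, 1.8 billion around 1918, 7.8 billion in 2021) are assumptions chosen for illustration, and the helper name `scale_to_today` is hypothetical. The death tolls are the estimates quoted in this note.

```python
def scale_to_today(deaths_then, population_then, population_today):
    """Scale a historical death toll to today's population,
    holding the per-capita death rate constant."""
    return deaths_then * population_today / population_then


# Assumed round world-population figures, in billions (illustrative only).
WORLD_POP_1890 = 1.5
WORLD_POP_1918 = 1.8
WORLD_POP_2021 = 7.8

# Russian Flu: an estimated 1 million deaths in 1889-90.
russian_flu_today = scale_to_today(1_000_000, WORLD_POP_1890, WORLD_POP_2021)

# Spanish Influenza: an estimated 15 to 25 million deaths in 1918-19.
spanish_flu_low = scale_to_today(15_000_000, WORLD_POP_1918, WORLD_POP_2021)
spanish_flu_high = scale_to_today(25_000_000, WORLD_POP_1918, WORLD_POP_2021)

print(f"Russian Flu, scaled to 2021: {russian_flu_today / 1e6:.1f} million")
print(f"Spanish Influenza, scaled to 2021: "
      f"{spanish_flu_low / 1e6:.0f} to {spanish_flu_high / 1e6:.0f} million")
```

Under these assumed populations, the Russian Flu scales to roughly 5 million deaths today, while the Spanish Influenza scales to several times even a 20-million Covid-19 toll. That is why the comparison holds in absolute terms but not on a per-capita basis.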