"It is about understanding best and worst-case scenarios - and the fact that worst case is one of the most important things to keep track of when navigating through pandemics - regardless whether it be in Denmark, the EU, the USA or the WHO. If you are only presented with an average estimate for the development of an epidemic - not knowing how bad it possible can get, then it is difficult to act politically", says Professor Sune Lehmann, one of four authors of the article Fixed-time descriptive statistics underestimate extremes of epidemic curve ensembles just published in Nature Physics.
Researchers Jonas L. Juul, Kaare Græsbøll, Lasse Engbo Christiansen and Sune Lehmann, all from DTU Compute, act as advisors to the National Board of Health in Denmark during the corona crisis. Partly based on their own experience as advisors, they have become aware that the existing methods of projecting the development of epidemics such as COVID-19 have a problem in describing the extreme possibilities of the expected development.
"Disease outbreaks are fundamentally stochastic processes. The same disease introduced in the same population can infect a large number of people or disappear quickly without having a particular prevalence. It depends in part on coincidences," explains postdoc Jonas L. Juul.
It is precisely the unpredictability of epidemics that makes it so difficult to make the right decisions everywhere in society when one hits. How many beds and respirators will be needed? And how much can we reduce this demand by enforcing restrictions?
However, the general unpredictability is just one of many problems in estimating the development of an epidemic.
"It is not just the unpredictable nature of epidemics that makes it difficult to predict their course - it is also our lack of knowledge about the disease's characteristics and prevalence in society at any given time. Just to give a few concrete examples of this: there is typically no one who has any idea exactly when an outbreak has started, how many infected we have in an area on any given day, or in which regions the epidemic is getting a foothold right now. The only thing we know for sure is that when the health authorities discover an outbreak, it has been going on for a while, "says Lehmann.
The common way to deal with the lack of information, almost everywhere in the world, is to model many scenarios based on, for example, different numbers of unknown infections and different starting times, and then to summarize them by looking at each day separately and treating the 'middle' predictions as the most likely outcomes for that day. If most input parameters give fewer than 4,000 infections on Christmas Day, then more than 4,000 new infections on that day is judged to be unlikely.
The 'day-based' way of making these predictions is used all over the world, and although the link between the development of an epidemic and specific dates is useful in some contexts, it systematically excludes data on how bad or mild the epidemic will be.
If, for example, all projections predict that the epidemic will peak at 4,000 infected in one day, but none of the curves shows the peak on the same day, then on any given day a value of 4,000 is an extreme and is therefore not included in any estimate.
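The effect can be reproduced with a toy ensemble. The sketch below is only an illustration under stated assumptions: the bell-shaped curves, the function name `toy_curve`, and all parameter values are hypothetical and are not taken from the researchers' models. Every simulated curve peaks at 4,000, yet the day-by-day median never comes close to that value.

```python
import math
from statistics import median

N_DAYS = 120

def toy_curve(peak_day, peak_height=4000.0, width=5.0):
    """Bell-shaped curve of daily infections (illustrative shape, not an epidemic model)."""
    return [peak_height * math.exp(-((day - peak_day) ** 2) / (2 * width ** 2))
            for day in range(N_DAYS)]

# Hypothetical ensemble: every scenario peaks at 4,000, but on a different day.
ensemble = [toy_curve(peak_day) for peak_day in range(30, 91, 5)]

# Day-based summary: take the median across scenarios separately for each day.
daywise_median = [median(curve[day] for curve in ensemble) for day in range(N_DAYS)]

peak_of_each_curve = [max(curve) for curve in ensemble]
peak_of_daywise_median = max(daywise_median)

print("peak of every simulated curve:", set(peak_of_each_curve))
print("peak of the day-by-day median:", round(peak_of_daywise_median))
```

Because the peaks fall on different days, each day's 'middle' value is dominated by scenarios that are not peaking on that day, so the summary curve stays far below the peak that every single scenario reaches.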
"We, therefore, suggest making the summary 'curve-based': Instead of assessing which infection rates are probable or unlikely on individual days, we should look at one entire simulation at a time. Is the entire simulated infection curve probable or not? And based on that you can make a summary of the most likely curves for the development of the epidemic, "says Juul.
"By looking at entire prediction curves instead of individual days, you will get a more realistic estimate of how bad the epidemic can become. It is especially useful if you are trying to avoid the hospital system being overloaded," concludes Lehmann.