Estimates of the probability of occurrence of intense epidemics based on the long-observed history of infectious diseases remain lagging or lacking altogether. Here, we assemble and analyze a global dataset of large epidemics spanning four centuries. The rate of occurrence of epidemics varies widely in time, but the probability distribution of epidemic intensity assumes a constant form with a slowly decaying algebraic tail, implying that the probability of extreme epidemics decreases slowly with epidemic intensity. Together with recent estimates of increasing rates of disease emergence from animal reservoirs associated with environmental change, this finding suggests a high probability of observing pandemics similar to COVID-19: the probability of experiencing one in one’s lifetime is currently about 38%, and this probability may double in coming decades.


Observational knowledge of epidemic intensity, defined as the number of deaths divided by global population and epidemic duration, and of the rate of emergence of infectious disease outbreaks is necessary to test theory and models and to inform public health risk assessment by quantifying the probability of extreme pandemics such as COVID-19. Despite the significance of this knowledge, assembling and analyzing a comprehensive global historical record spanning a variety of diseases remains an unexplored task. Here, a global dataset of historical epidemics from 1600 to present is compiled and examined using novel statistical methods to estimate the yearly probability of occurrence of extreme epidemics. Historical observations covering four orders of magnitude of epidemic intensity follow a common probability distribution with a slowly decaying power-law tail (generalized Pareto distribution, asymptotic exponent = −0.71). The yearly number of epidemics varies ninefold and shows systematic trends. Yearly occurrence probabilities of extreme epidemics, Py, vary widely: Py of an event with the intensity of the “Spanish influenza” (1918 to 1920) varies between 0.27 and 1.9% from 1600 to present, while its mean recurrence time today is 400 y (95% CI: 332 to 489 y). The slow decay of probability with epidemic intensity implies that extreme epidemics are relatively likely, a property previously undetected due to short observational records and stationary analysis methods. Using recent estimates of the rate of increase in disease emergence from zoonotic reservoirs associated with environmental change, we estimate that the yearly probability of occurrence of extreme epidemics can increase up to threefold in the coming decades.
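
To make the relation between a yearly occurrence probability, a mean recurrence time, and a lifetime probability concrete, the following is a minimal sketch under a deliberately simplified assumption: a constant yearly probability with independent years. The estimates reported above come from a nonstationary analysis, so this simplification is not expected to reproduce them exactly. The function names and the 80-year lifetime are illustrative assumptions, not part of the study.

```python
def recurrence_time(p_yearly):
    """Mean recurrence time in years for a constant yearly probability.

    Assumes independent years, so waiting times are geometric with mean 1/p.
    """
    if not 0.0 < p_yearly <= 1.0:
        raise ValueError("p_yearly must be in (0, 1]")
    return 1.0 / p_yearly


def prob_at_least_once(p_yearly, n_years):
    """Probability of at least one event in n_years independent years."""
    if not 0.0 <= p_yearly <= 1.0:
        raise ValueError("p_yearly must be in [0, 1]")
    return 1.0 - (1.0 - p_yearly) ** n_years


if __name__ == "__main__":
    # Yearly probability implied by a 400-year mean recurrence time,
    # under the constant-probability assumption stated above.
    p = 1.0 / 400.0
    lifetime_years = 80  # hypothetical lifetime, chosen for illustration only

    print("recurrence time (y):", recurrence_time(p))
    print("probability within a lifetime:", prob_at_least_once(p, lifetime_years))

    # Range of yearly probabilities quoted in the text (0.27% to 1.9%).
    for p_y in (0.0027, 0.019):
        print(p_y, "->", round(recurrence_time(p_y), 1), "y")
```

The second function is the complement of "no event in any of the n years", which is why even a small yearly probability accumulates to a sizeable lifetime probability.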

Data Availability

The historical epidemics dataset generated in the current study and a MATLAB code that analyzes it are available in the Zenodo repository; DOI:
