Corresponding author: Dmitriy V. Pomazkin (

Academic editor:

The article discusses forecasting the development of the coronavirus disease (COVID-19) using an SIR-type (Susceptible, Infected, Recovered) model and official morbidity statistics.

Coronavirus disease (COVID-19) is caused by the SARS-CoV-2 virus and is a potentially fatal disease of serious concern worldwide. As the COVID-19 pandemic develops, it is necessary to assess the possible range of key indicators: the number of newly detected infections, the total number of infected persons, the time needed to reach the plateau, and the time until new cases cease or decline significantly. This problem cannot be solved by the extrapolation methods widely used in data analysis: given an upward trend in the number of new cases, regression equations can yield only unlimited growth, which is contrary to common sense. Sooner or later the trend will change and growth will give way to decline, but to estimate when this happens one must understand the nature of the phenomenon and draw upon modelling methods in forecasting. Given that we are in the early stages of the pandemic and the mechanisms of its development are not yet known, the logic of the model is guided by common sense, the available data and historical examples.

It should be noted that, given the relatively small set of actual data on the development of the pandemic and the understanding that the measures taken to combat COVID-19 may change in either direction, the results described in the paper are rather a test of the logic inherent in the model and cannot be used as a forecast. Among the main factors influencing the estimation error are changes in the diagnostic criterion and the asymptomatic course of the disease. Therefore, if the methods of diagnosis change over time and asymptomatic cases of the disease are recorded, the baseline data used to construct this estimate will require revision.

Currently, there are numerous mathematical models for constructing an epidemiological forecast. SIR models are most commonly used in practice (SIR 2019), and models that take into account the age-gender distribution of a population can also be used (

Daily morbidity statistics served as the data source (World Health Organization 2020).

We used a discrete three-state model to describe the development of the pandemic: the latent period (state 1), the post-latent period (state 2) and the period of the disease when specific symptoms are observed (state 3). This model is a discrete analogue of SIR-type models that considers only one-way transitions between states, governed by time and transition probability. This approach has a higher error than continuous models, but in the absence of information on the nature of the pandemic and of sufficient evidence, it can be used to estimate the number of infected persons. The main advantages of the proposed method are its simplicity and the fact that the model describes the actual data closely without calculating regression coefficients.

For convenience in the subsequent calculations, we divided the incubation period of COVID-19 into two periods: the latent period, when a person has already been exposed to the virus but does not yet spread it, and the post-latent period, when a person is contagious but shows no clinical manifestations. The duration of the post-latent period is assumed to be five days, based on the study (

Functional equations served as the mathematical apparatus, with the following working hypothesis: the spread of the infection occurs during the prodromal period and lasts until the moment of disease detection. In this case, the total number of infected (N) in states 1 and 2 can be described by the following equation:

N(i) = N(i – 1) + a(i – t1) ∙ N(i – t1) – a(i – t2) ∙ N(i – t2),     (1)

where

t1 is the latent period;

t2 is the period before detection of the disease from the moment of contact;

a(i) is the intensity of contacts leading to infection.

In the numerical calculations the time step is one day, and the number of newly identified cases at each step is a(i – t2) ∙ N(i – t2). As in the SIR model approach, mortality was not taken into account when calculating the numbers in states 1 and 2, since its influence is negligible. In state 3, the impact of mortality on the number of infected persons is noticeable and requires refined data on the increase in mortality.
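Equation (1) can be iterated directly with a one-day step. The Python sketch below is only an illustration under stated assumptions, not the authors' implementation: the function name `simulate`, the start-up convention (N(0) = N0 and no infected persons before day 0) and the parameter values in the usage note are hypothetical and do not come from the article.

```python
def simulate(n0, a, t1, t2, days):
    """Iterate equation (1) with a one-day time step.

    n0   -- assumed number of infected persons on day 0
    a    -- function mapping a day index to the contact intensity a(i)
    t1   -- latent period, in days
    t2   -- days from contact to detection (must exceed t1)
    days -- number of steps to simulate

    Returns (N, new_cases): the number of persons in states 1 and 2 and
    the number of newly identified cases, one entry per day 0..days.
    """
    if not 0 < t1 < t2:
        raise ValueError("expected 0 < t1 < t2")

    N = [float(n0)]
    new_cases = [0.0]

    def lagged_term(i, lag):
        # a(i - lag) * N(i - lag); assumed to be zero before day 0.
        j = i - lag
        return a(j) * N[j] if j >= 0 else 0.0

    for i in range(1, days + 1):
        entering = lagged_term(i, t1)   # a(i - t1) * N(i - t1)
        detected = lagged_term(i, t2)   # a(i - t2) * N(i - t2)
        N.append(N[i - 1] + entering - detected)
        new_cases.append(detected)

    return N, new_cases
```

For example, with a constant intensity, `simulate(10, lambda day: 0.5, t1=1, t2=2, days=3)` returns `[10.0, 15.0, 17.5, 18.75]` for N and `[0.0, 0.0, 5.0, 7.5]` for the newly identified cases.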

Intuitively, it is the behaviour of the contact intensity function that limits the increase in the number of infected persons. Since the actual data set is not sufficient to construct this function, we adopted the hypothesis of an exponential decrease in the intensity of contacts, a(i) = k ∙ a(i – 1), with a daily decrease of a few per cent (that is, k slightly below 1). The initial value and the rate of decline were chosen so that the resulting solution agreed well with the actual data over the initial period.
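Under this hypothesis the contact intensity has the closed form a(i) = a(0) ∙ k^i. The short Python sketch below, offered as an illustration only, evaluates it and computes how many days the intensity takes to halve. The function names are assumptions made here, and k is read as the daily multiplier (k = 0.95 for a 5% daily decline).

```python
import math


def contact_intensity(a0, k, day):
    """Closed form of a(i) = k * a(i - 1), i.e. a(i) = a0 * k**i."""
    return a0 * k ** day


def halving_time(k):
    """Days needed for the contact intensity to fall to half its value."""
    if not 0 < k < 1:
        raise ValueError("k must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(k)
```

With a 5% daily decline the intensity halves in roughly 13.5 days, and with a 4.5% decline in roughly 15 days, so a seemingly small change in k shifts the time scale of the decline by more than a day.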

Figure

Daily identified cases of infection in the Russian Federation (persons): a) daily rate of the decline in morbidity of 5%; b) daily rate of the decline in morbidity of 4.5%.

The dotted line on Figure

Figure

Total number of infected people in the Russian Federation (persons).

As an illustration, let’s consider estimates for other countries. Figure

It should be separately noted that this estimate is based on the official data of identified daily cases of infection and does not take into account the number of asymptomatic patients, as well as those simply not identified by health care services whose numbers according to various estimates (

Given the experience of other countries, especially of China, one can hope for a short period of the development of the pandemic’s acute phase (from a few months to six months). However, if we consider the consequences of the Spanish flu pandemic that occurred in 1918, we find that in a number of countries this process lasted for 2 years, although the acute phase was in 1918 and mainly affected the population of working age. Figures

Daily identified cases of infection in the USA, Italy, Spain, Sweden (persons): a) USA, N0=50; b) Italy, N0=10; c) Spain, N0=12; d) Sweden, N0=1.

The dynamics of male mortality rate in Italy.

The dynamics of male mortality rate in Spain.

In conclusion, we obtained the following results:

using a relatively simple mathematical apparatus of functional equations, it is possible to obtain a good approximation of the temporal function of the number of newly identified cases and the total number of infected persons;

the high sensitivity of the modelling results to the rate of decline in infection confirms the appropriateness of the self-isolation measures introduced to prevent rapid spread of the pandemic.
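The sensitivity noted in the last item can be explored numerically. The Python sketch below is a hypothetical experiment rather than a reproduction of the article's figures: it iterates equation (1) with a(i) = a0 ∙ k^i for two daily decline rates and compares the cumulative number of detected cases. All parameter values (N0 = 10, a0 = 0.4, t1 = 2, t2 = 7, a 120-day horizon) and the start-up convention (no infected persons before day 0) are assumptions chosen for illustration.

```python
def cumulative_detected(n0, a0, k, t1, t2, days):
    """Total number of detected cases after `days` steps of equation (1).

    Assumes a(i) = a0 * k**i and N(i) = 0 for i < 0 (illustrative choices).
    """
    N = [float(n0)]
    total = 0.0
    for i in range(1, days + 1):
        # a(i - t1) * N(i - t1): persons entering states 1 and 2
        entering = a0 * k ** (i - t1) * N[i - t1] if i >= t1 else 0.0
        # a(i - t2) * N(i - t2): persons detected on day i
        detected = a0 * k ** (i - t2) * N[i - t2] if i >= t2 else 0.0
        N.append(N[-1] + entering - detected)
        total += detected
    return total


if __name__ == "__main__":
    # Compare a 5% and a 4.5% daily decline in contact intensity.
    for decline in (0.05, 0.045):
        total = cumulative_detected(n0=10, a0=0.4, k=1 - decline,
                                    t1=2, t2=7, days=120)
        print(f"daily decline {decline:.1%}: about {total:.0f} detected cases")
```

In this sketch the slower decline (larger k) yields the larger cumulative count; how large the gap is depends on the assumed parameters.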

Valeria V. Burmakina, student of the medical and biological faculty of Pirogov Russian National Research Medical University of the Ministry of Health of Russia. E-mail:

Dmitriy V. Pomazkin, PhD in Economics, Actuary in the Non-governmental Pension Fund “Gazprombank Fund”. E-mail: dmitri.pomazkin@mail.ru

Ivan D. Prokhorov, student of pediatric faculty of Pirogov Russian National Research Medical University of the Ministry of Health of Russia. E-mail: iva.prohorov@yandex.ru