Methods for constructing an assessment of the development of the coronavirus pandemic

The article discusses the forecast of the development of the coronavirus disease (COVID-19) on the basis of the SIR (Susceptible, Infected, Recovered) type model and official morbidity statistics.


Introduction and brief literature review
Coronavirus disease is caused by the SARS-CoV-2 virus and is a potentially fatal disease of serious concern worldwide. In the context of the COVID-19 pandemic, it is necessary to assess the possible range of key indicators, which primarily include the number of newly detected infections, the total number of infected persons, the time needed to reach the plateau, and the time until new cases cease or decline significantly. This problem cannot be solved by the extrapolation methods widely used in data analysis: with an upward trend in the number of new cases, regression equations can yield only unlimited growth, which is contrary to common sense. Sooner or later the trend will change and growth will give way to decline, but to estimate this point in time one must understand the nature of the phenomenon and draw upon modelling methods in forecasting. Given that we are in the early stages of the pandemic and the mechanisms of its development are simply unknown, the logic of the model is guided by common sense, the available data and historical examples.
It should be noted that, given the relatively small set of actual data on the development of the pandemic and the understanding that the measures taken to combat COVID-19 may change in either direction, the results described in the paper are rather a test of the logic inherent in the model and cannot be used as a forecast. One of the main factors influencing the error of the estimate is change in the diagnostic criterion, together with the asymptomatic course of the disease. Therefore, if the methods of diagnosis change over time and asymptomatic cases of the disease are recorded, the baseline data used to construct this estimate will require clarification.
Currently, there are numerous mathematical models for constructing an epidemiological forecast. SIR models are most commonly used in practice (SIR 2019), and models that take into account the age and gender distribution of a population can also be used. Other studies in this area are under way (Liang 2020; Nag 2020; Giordano et al. 2020).

Data and methods
Daily morbidity statistics served as the data source (World Health Organization 2020).
We used a discrete three-state model to describe the development of the pandemic: the latent period (state 1), the post-latent period (state 2) and the period of the disease when specific symptoms are observed (state 3). This model is a discrete analog of SIR-type models that considers only a one-way transition between states, regulated by time and by the probability of transition. This approach has a higher error than continuous models, but in the absence of information on the nature of the pandemic and of sufficient evidence, it can be used to estimate the number of infected persons. The main advantage of the proposed method is its simplicity, as well as the fact that the model accurately describes actual data without calculating regression coefficients.
For the convenience of further calculations, we divided the incubation period of COVID-19 into two periods: the latent period, when a person has already been exposed to the virus but is not yet contagious, and the post-latent period, when a person is contagious but clinical manifestations are not yet observed. The duration of the post-latent period is assumed to be five days, based on the study by Lauer et al. (2020). In the absence of data, the latent period is also assumed to be five days; together with the post-latent period, this gives ten days, which is four days shorter than the accepted quarantine period of 14 days.
Functional equations were used as the mathematical apparatus, with the following working hypothesis: the spread of the infection occurs during the prodromal period and lasts until the moment of disease detection. In this case, the total number of infected persons (N) in states 1 and 2 can be described by equation (1), where t1 is the latent period; t2 is the period from the moment of contact to detection of the disease; and a(i) is the intensity of contacts leading to infection.
In the numerical calculations, the time step is taken as one day, and the number of newly identified cases at step i is defined as a(i − t2) · N(i − t2). As in the SIR model approach, mortality was not taken into account when calculating the numbers in states 1 and 2, since its influence is negligible. In state 3, the impact of mortality on the number of infected persons is noticeable and requires more precise data on the increase in mortality.
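The per-step rule above can be illustrated with a minimal Python sketch. The function name `newly_identified` and the sample series are hypothetical and chosen only for illustration; the series N(i) itself would come from the model equation and is simply passed in here. Days earlier than t2 are assumed to report no cases.

```python
def newly_identified(a, n, t2):
    """Newly identified cases per daily step: a(i - t2) * N(i - t2).

    a  -- contact intensity a(i) for each day i
    n  -- number of infected N(i) in states 1 and 2 for each day i
    t2 -- days from contact to detection
    Assumption: no cases are reported for days i < t2.
    """
    if len(a) != len(n):
        raise ValueError("a and n must cover the same days")
    return [a[i - t2] * n[i - t2] if i >= t2 else 0.0 for i in range(len(n))]


# Illustrative (made-up) series, not data from the article.
a = [2.0, 1.5, 1.0, 0.5]
n = [10, 20, 40, 60]
print(newly_identified(a, n, t2=2))
```

The delay t2 means that the cases identified on a given day reflect the contact intensity and the number of infected persons t2 days earlier, so any change in contact behaviour shows up in the statistics only after that lag.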
Intuitively, it is the behaviour of the contact intensity function that limits the increase in the number of infected persons. Since the actual data set is not sufficient to construct this function, we adopted the hypothesis of an exponential decrease in the intensity of contacts, a(i) = k · a(i − 1), where the factor k corresponds to a daily decrease of a few per cent. The initial value and the rate of decline were chosen so that the resulting solution for the initial period was well aligned with actual data.
Figure 1 shows the number of new cases calculated for the Russian Federation using equation (1) with the following initial data: N(0) = 10, t1 = 5, t2 = 10, a(0) = 2, for two different daily rates of decline: 5% and 4.5%. The dotted line in Figure 1a starts from the date on which the forecast was made, in this case 14.04.2020. Actual data at the time of writing were available until 21.04.2020, so Figure 1b shows the results of another calculation, which differs only in the rate of decline. We should note the high instability of the model with respect to this parameter, associated with the presence of exponential functions in the solution of equation (1). However, this example is important, since it highlights the correctness of the measures taken so far to limit the number of contacts, which lead to a decrease in the level of infection and prevent the development of the pandemic.
Given the short period covered by the assessment, as well as the uncertainty of the whole situation, things may change considerably by the time this article is published. However, the proposed algorithm, after refinement of the initial data, can be used in the construction of estimates, as it describes with sufficient accuracy the situation in other countries where the pandemic began earlier than in Russia. Figure 2 shows the total number of infected persons in states 1 and 2.
In the first case, the maximum number of infected persons reaches 150 thousand, in the second, 350 thousand. It is appropriate here to make a comparison with other countries, assuming that they have already reached the peak of the pandemic. In the USA, as of 21.04.2020, the number of identified infected persons exceeded 750 thousand. Given that the population of the Russian Federation is less than half that of the USA, the maximum of 350 thousand for the Russian Federation can serve as a comparative reference point. Of course, it is also important to consider the average duration of the peak state of the disease (the time spent in state 3), but the information available at the time of writing is not sufficient to estimate this parameter.
As an illustration, let us consider estimates for other countries. Figure 3 compares actual data with modelling results, as in Figure 1, for four countries: the USA, Italy, Spain and Sweden. The daily rate of decrease in the intensity of contacts leading to infection was taken as 5% for all countries; the only difference is the initial number of infected persons. It is necessary to explain why the daily rate of decrease was taken as 5%. We evaluated COVID-19 mortality reduction charts in other countries and concluded that this type of curve describes the current situation most correctly and logically. This holds both for countries with strict quarantine measures and for countries where quarantine has not been officially declared, such as Sweden. In the latter case, it can be assumed that a higher level of social consciousness contributes to a decrease in the number of infected people.
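To see how strongly the daily rate of decline matters, the decay hypothesis a(i) = k · a(i − 1) can be unrolled to a(i) = a(0) · k^i and evaluated for the two rates considered for the Russian Federation. The sketch below is illustrative only: the function name `contact_intensity` and the 30-day horizon are assumptions, while a(0) = 2 and the rates of 5% and 4.5% are the values quoted above.

```python
def contact_intensity(a0, daily_decline, days):
    """Return a(0), ..., a(days) under a(i) = k * a(i - 1), k = 1 - daily_decline."""
    k = 1.0 - daily_decline
    return [a0 * k**i for i in range(days + 1)]


horizon = 30  # assumed horizon in days, chosen only for illustration
fast = contact_intensity(2.0, 0.05, horizon)    # 5% daily decline
slow = contact_intensity(2.0, 0.045, horizon)   # 4.5% daily decline

for day in (0, 10, 20, 30):
    ratio = slow[day] / fast[day]
    print(f"day {day:2d}: a = {fast[day]:.3f} (5%), "
          f"{slow[day]:.3f} (4.5%), ratio = {ratio:.3f}")
```

Under these assumptions, the two intensities differ by roughly 17% after 30 days. Because this factor multiplies the number of infected persons at every step, even a half-point change in the rate of decline can produce a markedly different trajectory.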

Results
It should be noted separately that this estimate is based on official data on daily identified cases of infection and does not take into account asymptomatic patients or those simply not identified by health care services, whose numbers, according to various estimates (Pueyo 2020), may differ from official data.
Given the experience of other countries, especially China, one can hope for a short acute phase of the pandemic (from a few months to six months). However, if we consider the Spanish flu pandemic of 1918, we find that in a number of countries the process lasted for two years, although the acute phase occurred in 1918 and mainly affected the working-age population. Figures 4 and 5 show the percentage change in mortality rates for Italy and Spain during that pandemic a century ago.
In conclusion, we have obtained the following results:
• using a relatively simple mathematical apparatus of functional equations, it is possible to obtain a good approximation of the time course of the number of newly identified cases and of the total number of infected persons;
• the high sensitivity of the modelling results to the rate of decline in infection confirms the correctness of the self-isolation measures introduced to avoid the rapid spread of the pandemic.