Risks of morbidity and mortality during the COVID-19 pandemic in Russian regions

The COVID-19 pandemic has spread to all Russian regions. As of May 8, 2020, about 190 thousand cases had been identified, and more than 1600 people with the corresponding diagnosis had died. These figures are expected to rise. However, the statistics of confirmed cases and deaths may underestimate their actual extent due to testing peculiarities, lagging reporting and other factors. The article identifies and describes the characteristics of the regions in which the incidence and mortality of COVID-19 are higher. Migration of potential carriers of the virus (summer residents and migrant workers leaving Moscow and large agglomerations, as well as labour migrants returning to the North) increases the risk of the disease spreading. The risk of mortality is higher in regions with high proportions of poor and elderly residents, for whom it is difficult to adapt to the pandemic, and lower in regions with better-developed health infrastructure. Based on the revealed patterns, a typology of regions by possible risks is proposed. The risks are highest in and near the largest agglomerations (the cities of Moscow and Saint Petersburg, Moscow and Leningrad Oblasts), in the northern regions where the share of labour migrants is high (Khanty-Mansi and Yamalo-Nenets Autonomous Okrugs), and in southern underdeveloped regions (Ingushetia, Karachay-Cherkess and Kabardino-Balkarian Republics, Dagestan, North Ossetia). For the latter, the consequences may be most significant due to the limited capacity to adapt to the pandemic and the self-isolation regime, and additional support measures may be required in these regions.

ged (≈ 12% of confirmed cases), more than 1.6 thousand people have died (≈ 0.88%). The relevance of the study is determined by the broad scope of the pandemic, the high morbidity and mortality, the spread of the disease to all Russian regions (Appendix 1) and the scale of the socio-economic consequences (Zemtsov and Tsareva 2020). The values of these indicators are expected to increase, as the number of new confirmed cases has not been steadily decreasing; however, the proportion of discharged patients is increasing and, in general, the growth rate of new cases is decreasing.
However, the statistics of confirmed cases and deaths may underestimate their real extent due to a number of distortions discussed in the methodological part of the work. Therefore, it is relevant to assess the risks and, accordingly, the future consequences of the pandemic for the population in certain regions. The authors propose an appropriate methodology based on approaches to assessing the social risks of natural disasters (Welle and Birkmann 2015; Zemtsov et al. 2016).
The purpose of the article is to identify characteristics of Russian regions affecting the incidence of COVID-19 and mortality, and on their basis to assess the risks of the pandemic for the population of the regions at the exponential stage of the coronavirus disease spread.

Methodology, data and their limitations
For the analysis, we use the official data of Rospotrebnadzor (2020) on confirmed cases of the new coronavirus infection COVID-2019 in Russia and, for mortality, data from the portal "Coronavirus today" (2020), which aggregates the data of Rospotrebnadzor.
The number of officially confirmed cases may be a distorted and lagged reflection of the real spread of the coronavirus disease. Not all patients contact a doctor (according to Rospotrebnadzor, the disease was asymptomatic in half of the identified carriers), and there is a lag between infection, the onset of the disease and the identification of the virus. Official data may also reach Rospotrebnadzor with a delay. The share of identified cases depends to a large extent on the quality of the tests, the system and method of testing, and the coverage of the population with testing, which in turn depends on the level of development of the health care system, the availability and proximity of laboratories, the density of private laboratories, etc. Although, according to Rospotrebnadzor, over 4 million tests for coronavirus have been carried out, the availability of tests varied significantly across regions, especially in the first weeks. According to our estimates, the correlation coefficient between the number of tests and the number of confirmed cases as of April 24, 2020 is about 0.3. As the number of tests grows, registered and actual infections should converge. Therefore, in our opinion, the provision of the population with tests is a significant but not determining factor. Antibody tests, which show the number of people who have had the disease, have been carried out in other countries and show that the rates of real morbidity are understated (Nazarov and Sisigina 2020). However, in our view, officially recorded morbidity is proportional to real cases, which makes it possible to estimate multifactor regressions in which the limitations mentioned above are partially offset. Factors and their impact can change as the disease spreads, so we use the latest available data.
Mortality of patients with coronavirus disease can also be significantly underestimated. By no means all those who are ill turn to medical institutions. Many die from exacerbation of concomitant chronic diseases without having an officially confirmed diagnosis of COVID-19. Some of the deaths during the pandemic will also be attributable to delayed care due to overcrowding in medical facilities and the heavy load on emergency medical services. In some cases, deaths from certain socially sensitive diseases, such as HIV (Skochilov et al. 2018), may be underreported because they are attributed to other causes of death or reported later. The disease cannot always be correctly established and identified posthumously. There is also a lag between real deaths and reporting. In some cases, the time lag between events and statistical registration may reach several months, and final data across the country will be available only at the end of the year. Therefore, for such large-scale events, researchers often estimate the excess of total mortality during the period of the event over total mortality in previous periods. For example, the additional mortality from the hot summer of 2010 in Russia was estimated at 55.8 thousand people due to cardiovascular pathologies, respiratory problems and other factors (Revich 2011). According to preliminary data, total mortality in the European Union in April significantly exceeded average values, in some countries by over 50% (EuroMOMO 2020). In Russia, additional mortality may also be identifiable from the results for April and May. According to preliminary data for April, the mortality rate in Moscow increased by 20% compared to previous years, taking into account the decrease in the number of deaths in certain categories, for example from external causes (RBK 2020).
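The correlation coefficient mentioned above can be illustrated with a minimal sketch. It assumes a plain Pearson coefficient over regional values; the source does not specify the exact estimator or whether per-capita values were used, and the five data points below are hypothetical, not the article's data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values for five regions (illustration only, NOT the article's data).
tests_per_100k = [1200.0, 800.0, 1500.0, 600.0, 1000.0]
cases_per_100k = [40.0, 55.0, 35.0, 20.0, 60.0]

r = pearson(tests_per_100k, cases_per_100k)
```

A coefficient near 0.3, as reported in the text, would indicate a weak positive association, which is consistent with the authors' reading of test availability as significant but not determining.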
In this case, we also believe that the deaths from COVID-19 reflected in the statistics will be proportional to the total additional mortality of the population, giving the basis for the econometric calculations.
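The excess-mortality approach described above can be sketched as follows. This is a simplified illustration that assumes the baseline is the plain average of the same month in previous years; published estimates typically adjust for trends and seasonality. All numbers are hypothetical.

```python
def excess_mortality(observed, baseline_years):
    """Return (absolute excess, relative excess) of observed deaths
    over the mean of the same period in previous years."""
    if not baseline_years:
        raise ValueError("at least one baseline year is required")
    baseline = sum(baseline_years) / len(baseline_years)
    excess = observed - baseline
    return excess, excess / baseline

# Hypothetical monthly death counts (illustration only, not real statistics).
april_previous_years = [10000, 10200, 9800]
april_2020 = 12000

excess, share = excess_mortality(april_2020, april_previous_years)
```

With these made-up inputs the baseline is 10,000 deaths, so the excess is 2,000 deaths, or 20% of the baseline.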
Risk assessment models of natural hazards are applied to identify the characteristics of regions affecting population morbidity and mortality (Zemtsov et al. 2016). Traditionally, two components are taken into account: exposure of the population to the danger and its vulnerability. Exposure reflects the potential number of people who will become ill, which depends on the intensity of the regional community's interaction with other communities and within the community. Vulnerability of the population includes characteristics of the most sensitive part of residents (susceptibility), the ability of the health system to respond quickly to threats (coping capacity), as well as the ability of the population to adapt (adaptive capacity).
The main tested characteristics of the regions and their indicators are presented in Table 1, and the data in Appendix 2. Official Rosstat data are used unless stated otherwise. The values of the indicators are given for the last available year, mainly the end of 2018, apart from the self-isolation index. We assumed that regional differences in annual indicators are relatively stable, so they can be used to identify common characteristics of regions affecting morbidity and mortality from COVID-19 this year.
In our view, regions with a high share of urban residents are most susceptible to the spread of the pandemic: in cities there is a high intensity of interaction between people in multi-storey buildings and in crowded public transport, and the proportion of residents who have visited foreign countries that are foci of the disease (China, Italy) is also higher. With few exceptions, the largest airports are located near major cities. Roughly half of the flights go via Moscow, Saint Petersburg, Krasnodar, Simferopol and Sochi (Habr 2020), which also increases the likelihood of the disease spreading. Other things being equal, the growing share of urban dwellers is a global contributor to pandemics, especially in developing countries. In the major cities of developing countries, not only is the intensity of communications higher, but the natural and environmental conditions are also worse, which has a negative impact on population health.

Table 1. Potential characteristics of regions affecting morbidity and mortality during the COVID-19 pandemic.

Region characteristics | Designation | Indicator description
Exposure of the population to the pandemic caused by high intensity of interactions within the regional community | Urb | Share of urban residents in total population, %
Exposure of the population to the pandemic caused by high intensity of interactions within the regional community | Isol | Yandex self-isolation index, a reverse indicator to highway congestion in major cities; data as of 27.04.2020
Exposure of the population to the pandemic caused by proximity to major cities as potential sources of infection and the intensity of external relations of the regional community | Demo | Demo-geographical potential of the region (calculation: population of other regions divided by distance to them squared), persons per km²

Another important spatial factor of the disease spread is the proximity of other major cities, which can have a negative impact through temporary or other types of migration, transit streams, etc. (Ponomarev and Radchenko 2020). We calculated the demo-geographical potential of regions using the gravitational model, i.e. estimated how many people live in other Russian regions considering the distance to them. The regions near the largest agglomerations have the greatest potential: Moscow Oblast, Tver, Kaluga, Ryazan, Oryol, Vladimir, Tula Oblasts near Moscow, as well as Leningrad and Novgorod Oblasts near Saint Petersburg.
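The gravity-model calculation of the demo-geographical potential can be sketched as below: for each region, the populations of all other regions are summed, each divided by the squared distance to it. The source does not state how distances were measured, so great-circle distance between regional centres is an assumption made here, and the three regions with their coordinates and populations are hypothetical.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def demo_potential(regions):
    """regions maps name -> (population, (lat, lon) of the regional centre).
    Returns name -> sum over other regions of population_j / distance_ij**2."""
    result = {}
    for name_i, (_, centre_i) in regions.items():
        total = 0.0
        for name_j, (pop_j, centre_j) in regions.items():
            if name_j == name_i:
                continue  # a region does not contribute to its own potential
            distance = great_circle_km(centre_i, centre_j)
            total += pop_j / distance ** 2
        result[name_i] = total
    return result

# Hypothetical regions (illustration only): one large centre and two small regions.
regions = {
    "A": (12_000_000, (55.0, 37.0)),  # large agglomeration
    "B": (1_000_000, (55.0, 39.0)),   # close to A
    "C": (1_000_000, (55.0, 60.0)),   # far from A
}
potentials = demo_potential(regions)
```

In this toy setup region B, which lies next to the large centre, receives a much higher potential than the remote region C, mirroring the pattern the text describes for the oblasts around Moscow and Saint Petersburg.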
The Yandex self-isolation index, calculated as the inverse of traffic density in the regional center, directly estimates population mobility and, accordingly, the potential for the infection to spread. But here, in our opinion, the dependence runs in the opposite direction: the number of cars decreases as the number of infected people grows, together with the spread of information about it in the media and the actions of the authorities (the presence of a positive relationship was confirmed by the results of econometric calculations).
To assess the intensity of external relations, we used various indicators of foreign and intra-Russian tourism and temporary labour migration. Intra-Russian migrants (Florinskaya et al. 2015) often maintain links with their place of origin and have higher mobility; consequently, their larger numbers both in the region of arrival and in the donor region can influence the rate and the scale of morbidity. Temporary migrant workers, for example, may have contributed to the spread of the coronavirus disease by returning from Moscow to the regions of Central Russia or to the North. The increase in global population mobility (tourism, labour mobility) has become a factor in the rapid spread of infection to almost all countries.
To describe the capability of the health care system to withstand threats, we used assessments of population health (regional health capital) as well as of health infrastructure development. The former comprise general morbidity rates, the proportion of older people, average age and life expectancy calculated on the basis of age-specific mortality rates; the latter comprise health care costs and the provision of doctors and hospital beds. Population ageing is another global factor that increases the likelihood of pandemics: chronic diseases accumulate with age and can be exacerbated during pandemics, so the risk of death rises. This particularly affects death rates from COVID-19. Healthcare costs are rising worldwide but are mainly directed at serving senior citizens and buying high-priced medications from major pharmaceutical companies. At the same time, the accessibility of medical care in Russia, in terms of the availability of doctors and hospital beds, has decreased due to the optimization of health care. Costs are higher in those regions where the incidence is higher, but these regions are also better at recording cases; for example, private companies place their laboratories closer to potential customers.
The ability of the community to adapt to the pandemic also depends, in our view, on living standards. In particular, wealthier communities on average have greater resources to purchase the necessary equipment and medicines and to self-isolate: a dacha, use of delivery services, remote work, etc.
To identify the most significant characteristics of regions, we consistently tested all variables based on their correlation. Figure 1 shows the pairwise correlation coefficients for the identified significant variables (the designations are given in Table 1).
The combination of identified characteristics of regions was expected to help assess the risks of morbidity and mortality indirectly. The approach is limited by the fact that significant variables associated with the spread of the pandemic are identified as of the current date, whereas it is the final situation that needs to be predicted. Therefore, we did not use the coefficients obtained in the regressions to construct the final risk indices but rather defined weights for the selected region characteristics using the principal component method. We assumed that the combination of significant factors in the principal component would be an estimate of the initial risks of morbidity and mortality. We interpreted the product of the two indices as an assessment of the integral risk index from the COVID-19 pandemic. The use of econometric methods for non-stationary processes whose nature is not fully studied is always fraught with significant errors, especially when the time horizon changes. Factors identified for different observation periods can vary dramatically, and at the early stages the role of random events is high. Therefore, we are rather talking about risk assessments for the stage of exponential growth of the disease. But it is at this stage that occupancy of medical facilities is maximized due to the high rate of spread of infection in the community, and consequently additional mortality may increase. Moreover, when using the least-squares method, we cannot talk about identifying factors or causalities, but only about different characteristics of regions in which the incidence and mortality of COVID-19 are higher or lower.

Results and discussion
The first cases of the disease in Russia were reported in early March among Chinese workers in Zabaykalsky Krai and, among Russian citizens, in Moscow among tourists who had arrived from Italy. By the end of March, the number of confirmed cases was already growing exponentially. From that point on, the disease began to spread throughout the country (Fig. 3): Moscow's share of new cases steadily decreased from 81% on April 5 to 38% on April 29. Only two weeks after the introduction of the self-isolation regime (March 29, 2020) was there a deviation from this trend (Fig. 2) towards lower growth rates.
[Figure legend, right axis: proportion of discharged in the cumulative number of cases, %; proportion of deaths in the cumulative number of cases, %; Yandex self-isolation index in Moscow × 3]
Daily growth in new cases generally fell until the last week of April (Fig. 4). But in the first days of May, Moscow's share of cases again rose to 60% of the total for Russia, which may be due to a second wave of the pandemic, the launch of testing in many private laboratories, or the uncoordinated actions of the authorities during the introduction of digital passes in mid-April, which caused queues in the metro. With the introduction of digital passes, the travel intensity of the population increased slightly, and the Yandex self-isolation index on working days fell accordingly from 3.4 to 3.2 (Fig. 3; index values are multiplied by 3 to match the scale of the right axis of the graph).
As of May 6, 2020, according to econometric calculations (Table 2), the confirmed incidence of COVID-19 is higher in regions near large centers as potential sources of infection (Demo). Figure 6 shows a belt of increased morbidity around the capital and along the axis of the most intensive communications: Saint Petersburg–Moscow–Nizhny Novgorod. Many temporary migrant workers and dacha owners returned from Moscow to neighbouring regions: Ryazan, Kaluga, Bryansk, Kursk and Oryol Oblasts. After the introduction of the self-isolation regime, labour migrants began to actively leave the capital, carrying the disease across the European part of the country.
At the stage of exponential growth, the coronavirus infection spread from the largest agglomerations to the regions of the North Caucasus and to Yamalo-Nenets and Khanty-Mansi Autonomous Okrugs with high life expectancy. Note that the incidence of COVID-19 is higher in a number of regions with high life expectancy (Life) and a high proportion of older people, for example in the cities of Moscow and Saint Petersburg, Moscow, Voronezh, Rostov and Tambov Oblasts, Mordovia and Mari El. As we see it, higher life expectancy, and correspondingly low mortality of older residents and people with chronic diseases in the previous period, may have led to increased incidence of COVID-19 this year, given that confirmed diagnoses are shifted towards the most severe cases. In Tyva or Chukotka, where life expectancy was low and deaths from all causes were higher than the Russian average in previous years, the incidence of COVID-19 is lower, as the proportion of vulnerable members of the community is lower.
The incidence is also higher in regions with a high proportion of migrant workers from other regions (TrudMigIn), which serves as an estimate of disease transmission between regions, especially by shift workers travelling from the city of Moscow to the northern regions (Mikhailova 2020). This share is highest in Yamalo-Nenets and Khanty-Mansi Autonomous Okrugs, the cities of Moscow and Saint Petersburg, Magadan Oblast and Kamchatka Krai.
The COVID-19 death rate in Russian regions strongly correlates with morbidity: the correlation coefficient is 0.78 (Fig. 1). Therefore, one of the most significant characteristics of the regions with higher COVID-19 mortality (Table 3) was a high proportion of urban residents (Urb) as an indicator of intensity of internal links, and consequently, indirectly, the share of the media. This proportion is higher in the cities of Moscow and Saint Petersburg, Magadan and Murmansk Oblasts and Yamalo-Nenets Autonomous Okrug, which recorded above-average mortality (Fig. 7). In cities, infection rates are higher due to the intensity of contacts, which leads to higher incidence and consequently to overcrowding of medical facilities. There is also reason to believe that higher incidence and mortality in cities partly reflect the high density of laboratories, stricter reporting, the qualifications of medical staff, etc.
Among older populations, deaths from COVID-19 are on average higher (Chen et al. 2020). In some regions with high life expectancy (Life) and an average age above the Russian average, such as the cities of Moscow and Saint Petersburg, Mordovia, Penza and Moscow Oblasts and Chuvashia, the death rate from COVID-19 is indeed higher. In our view, high life expectancy, and the consequently reduced mortality of elderly residents and people with chronic diseases in previous years, could have led to their increased deaths from COVID-19 this year.
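The kind of least-squares regression behind these associations can be sketched in a few lines. This is only an illustration of ordinary least squares solved through the normal equations; the article's actual specification (variable transformations, sample, significance tests) is not given in this section, and the data below are synthetic, constructed as y = 1 + 2·x1 + 3·x2 so that the expected coefficients are known. The names `solve` and `ols` are hypothetical helpers.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] for y regressed on rows."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve(xtx, xty)

# Synthetic data (NOT the article's): two regressors standing in for
# characteristics such as Demo and TrudMigIn, with y = 1 + 2*x1 + 3*x2 exactly.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0, 3.0, 4.0, 6.0, 8.0]
coefficients = ols(rows, y)  # approximately [1, 2, 3]
```

As the methodology stresses, such coefficients describe characteristics of regions where incidence is higher or lower; they do not by themselves establish causality.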
Mortality is also higher in regions where the proportion of the population with income below the subsistence level (Poverty) is higher, such as the republics of Ingushetia, Kabardino-Balkaria, Kalmykia, Karachay-Cherkessia, Mari El. The ability of poor, socially vulnerable populations to adapt to the pandemic is limited, as they often work in the informal sector based on personal contacts and cannot afford remote work or work breaks.
The provision of hospital beds (Beds) is an indirect indicator of the development of the health care system, of its ability to meet the challenges and to connect the largest proportion of seriously ill patients to ventilators, so the higher the availability in the region, the lower the mortality rate. For example, the lowest indicator values are in Chechnya, Ingushetia, and Moscow, Leningrad and Kaluga Oblasts, where deaths from COVID-19 are higher than the national average.
In the next step, using the principal component method, we obtained estimates of the weights of each significant variable for constructing the relevant integral indices (Table 4).
Detailed data on the index values for each region are presented in Appendix 3. The obtained estimates of indices by region are close to the initial parameters of morbidity and mortality from COVID-19 as of May 6, 2020. But since we did not use the weights of the regressions obtained at a particular moment in time, we can say that the indices estimate, to some approximation, the overall risks of regions for the period of the pandemic, at least for its exponential stage. To obtain the integral COVID-19 population risk index, we multiplied the two indices, as we needed to take into account their joint impact.
Figure 8 shows the types of Russian regions by the level of risk and, respectively, the consequences of the exponential spread of the pandemic. According to our calculations, the greatest risks are borne by the population in the largest agglomerations and regions near them (the cities of Moscow and Saint Petersburg, Moscow, Leningrad and Kaluga Oblasts, Tatarstan, etc.), as the population density is higher there, as is the intensity of interaction, including that due to migrant workers. Risks are high in the underdeveloped regions of the North Caucasus (Ingushetia, Dagestan, Karachay-Cherkessia, North Ossetia, etc.) due to high population density, poor development of the health care system, a substantial number of elderly and poor citizens, as well as traditions of large gatherings (weddings, commemorations, celebrations). Risks are higher in the northern regions, where the proportion of migrant workers is higher and the density of interaction in cities, and especially in urban areas with a single ventilation system, is higher.
The risks are lowest in sparsely populated and remote regions where social distancing occurs naturally: Tyva, Chukotka Autonomous Okrug, Jewish Autonomous Oblast, Irkutsk and Sakhalin Oblasts. Despite the relative proximity of the Far Eastern regions to China as one of the hot spots of the disease, their risks are assessed as lower due to low population density and a relatively young age structure. Of course, risks vary significantly within regions at the level of individual municipalities.
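A rough sketch of how principal-component weights and a multiplicative integral index might be computed is given below. Several choices here are assumptions not confirmed by the source: the first principal component of the correlation matrix is found by power iteration, weights are taken as absolute loadings normalised to sum to one, indicators are min-max scaled and assumed to be oriented so that higher values mean higher risk, and all input numbers are hypothetical.

```python
from math import sqrt

def standardise(column):
    """Z-scores of one indicator column (sample standard deviation)."""
    mean = sum(column) / len(column)
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (len(column) - 1))
    return [(v - mean) / sd for v in column]

def first_component(columns, iterations=500):
    """Unit-length leading eigenvector of the correlation matrix (power iteration)."""
    z = [standardise(c) for c in columns]
    n, k = len(z[0]), len(z)
    corr = [[sum(a * b for a, b in zip(z[p], z[q])) / (n - 1) for q in range(k)]
            for p in range(k)]
    v = [1.0 / (p + 1) for p in range(k)]  # asymmetric start vector
    for _ in range(iterations):
        w = [sum(corr[p][q] * v[q] for q in range(k)) for p in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def pca_weights(columns):
    """Assumed weighting rule: absolute loadings rescaled to sum to 1."""
    loadings = first_component(columns)
    total = sum(abs(l) for l in loadings)
    return [abs(l) / total for l in loadings]

def composite_index(columns, weights):
    """Weighted sum of min-max scaled indicators, one value per region."""
    scaled = []
    for col in columns:
        low, high = min(col), max(col)
        scaled.append([(v - low) / (high - low) for v in col])
    return [sum(w * s[i] for w, s in zip(weights, scaled))
            for i in range(len(columns[0]))]

# Hypothetical indicator columns for five regions (illustration only).
morbidity_columns = [[1.0, 2.0, 3.0, 4.0, 5.0],
                     [2.0, 1.0, 4.0, 3.0, 6.0],
                     [1.5, 2.5, 2.0, 4.5, 4.0]]
mortality_columns = [[3.0, 1.0, 2.0, 5.0, 4.0],
                     [2.0, 1.5, 2.5, 4.0, 3.0]]

morbidity_index = composite_index(morbidity_columns, pca_weights(morbidity_columns))
mortality_index = composite_index(mortality_columns, pca_weights(mortality_columns))
# Integral risk as the product of the two indices, as described in the methodology.
integral_risk = [a * b for a, b in zip(morbidity_index, mortality_index)]
```

Multiplying the indices, rather than adding them, means a region scores high on the integral index only when both its morbidity and its mortality risks are elevated.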

Conclusion
The recorded proportion of patients with COVID-19 as of May 6, 2020, is higher in regions with large agglomerations (foci of the disease) and in their vicinity, and in regions with an ageing population and a high share of labour migrants. Confirmed mortality from COVID-19 during the same period was higher in regions with high life expectancy, high poverty and insufficient development of health care infrastructure. Therefore, the generalized population risks are higher in the largest agglomerations and regions near them, in the underdeveloped regions of the North Caucasus and in the northern mining centers.
Risk assessment by indices is necessary given the deficiencies in available statistics, which arrive late and may underestimate the scale and impact of the pandemic. The real number of illnesses and additional deaths is expected to exceed the confirmed figures. In the Russian regions with high risks, the removal of restrictions may be delayed compared to other regions.
Risk assessments strongly depend on the observation period, and the combination of factors will change as the disease spreads, so periodic monitoring of the calculated coefficients and analysis of their behaviour over time is appropriate. The error of the approach used and the sensitivity of the obtained results to changes in the observation period, and accordingly in the composition of the indicators, are high. It is also important to consider that several regions have insufficient source data. Calculations performed for the earlier period confirm the described limitations of the approach; therefore, the obtained calculations are primarily applicable for estimating the risks at the stage of exponential morbidity growth. However, this stage is of greatest interest to politicians and scientists due to the high rate of the disease spread, rapid occupancy of medical facilities and potentially most negative consequences for mortality due to the inability to provide assistance in time, social exclusion of the most vulnerable groups, etc.
Additional socio-economic support measures may be required in high-risk regions. The self-isolation regime and other imposed restrictions can have a devastating impact on small and medium-sized businesses in Russia and on the economies of the regions with the highest risks. Under the pessimistic scenario, up to 80% of enterprises in particularly affected industries (hotels and restaurants, domestic services, entertainment) may close (Zemtsov and Tsareva 2020). Multiplier effects may extend to trade, construction, real estate and transportation, in which case up to 3 million entrepreneurs may cease their activities. According to calculations (Zemtsov and Smelov 2018), if the number of small firms in a region is 1% lower, then its gross regional product (GRP) is lower by 0.06–0.17%. The closure and bankruptcy of 50–60% of firms would then entail a fall in the region's GRP of 3–10%. The most affected industries are concentrated in many high-risk regions: the cities of Moscow and Saint Petersburg, Yaroslavl, Kaliningrad, Kaluga and Moscow Oblasts, Stavropol Krai, Kabardino-Balkaria, Chuvashia, etc. At the same time, large agglomerations have better opportunities for digital adaptation (remote work, orders via the Internet, online business, etc.), and population income is higher in large cities, as is, accordingly, demand for the products and services of small businesses. Therefore, the most serious social consequences can be expected in the North Caucasus and Crimea, where more than half of the employed are workers in the business sector: tourism, trade, repair, agriculture, etc.
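The GRP estimate above follows from a simple linear extrapolation of the elasticity quoted from Zemtsov and Smelov (2018). The sketch below reproduces that arithmetic; treating the relationship as linear over such large changes is an assumption carried over from the text, and the function name is hypothetical.

```python
def grp_drop_range(closure_share_pct, elasticity_low=0.06, elasticity_high=0.17):
    """Range of GRP decline (in %) if closure_share_pct percent of small firms close,
    assuming each 1% fewer firms lowers GRP by 0.06-0.17% (linear extrapolation)."""
    return closure_share_pct * elasticity_low, closure_share_pct * elasticity_high

lower_bound, _ = grp_drop_range(50)   # 50% closures at the low elasticity: about 3%
_, upper_bound = grp_drop_range(60)   # 60% closures at the high elasticity: about 10.2%
```

The lower bound combines the smaller closure share with the smaller elasticity and the upper bound combines the larger values, which yields the 3–10% range given in the text.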

Information about the authors
Stepan