Research Article
Corresponding author: Yulia Yu. Shitova ( yu_shitova@mail.ru ) © 2022 Yulia Yu. Shitova.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Shitova YuYu (2022) Methodology for Monitoring the Mobility of Circular Labour Migrants in Moscow Region. Population and Economics 6(1): 1-13. https://doi.org/10.3897/popecon.6.e77308
The paper presents the author's original methodology for monitoring commuting labour migration in a region, using Moscow Region as an example. The methodology is based on the regular collection of real-time information on the state of the regional transport network from the Yandex.Probki platform: travel times are queried and saved for a fixed set of car routes (the basic sample) covering the region under study. Analysis of the data collected over the past two years made it possible to study the structure and dynamics of the travel time lost by commuting labour migrants. The time dynamics of losses are sensitive to events such as lockdowns and holidays. The estimates show a stable cyclicity of travel time losses within a day and within a week, which supports the validity of the indicator proposed by the author. The study demonstrates that the loss of time on the commute during peak hours is 2.5 times higher than the loss when driving without traffic jams. In conclusion, the paper discusses the prospects for scaling the methodology to any region in which the Yandex.Probki platform is present.
Commuting labour migration, Moscow, Moscow Region, geoinformation systems, GIS analysis
Monitoring the structure and dynamics of circular labour migration (CLM; hereinafter, this abbreviation denotes both the migrants and migration as a process, and the specific meaning is clear from the context) provides important information for decisions on transport infrastructure development, the management of growth points, etc. At the same time, research in this area is constrained by the lack of statistical data.
Technologies related to geographic information systems (GIS) have become an in-demand method used to analyze population mobility and related problems. A literature review shows that GIS data is already being used to solve a number of problems in the research area under discussion.
One of the existing directions is the use of GIS to analyze the dynamics, structure and forecast of CLM. The spatial dynamics of CLM in South East Queensland (Australia) was studied based on data from 1996 and 2006 censuses (
GIS and Geographic Zonation is another topical area of research that studies the relationship between geographic (geolocations of districts in a city, cities in a country, etc.) and other indicators. The correlation between business geography and the levels of development of territories and infrastructure based on the GIS approach was studied in India (
In (
Another popular trend in modern analytics is GIS transport analysis. CLM traffic patterns were reconstructed in New York and Amsterdam (
Other uses of GIS include analysis of the relationship between population mobility and housing prices. Modern GIS can also build multimodal trips combining different transport modes, which makes it possible to compare the transport modes used by CLM. The analysis of mobility by environmentally friendly (green) transport modes, such as bicycles and scooters, is gaining popularity: bicycle networks have appeared in almost all major cities of the world, and the information they generate is used in various ways in the analysis of population mobility. A large body of studies is devoted to methodological issues of the approach, namely checking the accuracy of GIS calculations and the resulting systematic errors.
In summary, the GIS approach to the analysis of CLM is a popular trend in modern research. At the same time, most scientific works solve separate applied problems, whereas modern GIS platforms operate continuously and produce big data whose analysis enables continuous monitoring of different indicators.
In this paper, we propose an original technique for spatial microanalysis of CLM based on data obtained continuously from the permanently operating online platform Yandex.Probki, which is a GIS. This makes it possible to monitor population mobility, implement automatically updated indices, and obtain unique coverage and spatial detail of CLM processes. The present work is the next stage, continuing, developing and expanding the series of our studies of commuting labour migration in Moscow Region (
The main purpose of the study is to analyze the structure and dynamics of time losses in CLM car trips according to the monitoring data of highways in two Russian regions — Moscow and Moscow Region.
The methodology is based on the collection of primary information on the road network state from the Yandex.Maps GIS platform (Yandex.Maps; Yandex.Probki), which collects, processes and provides users with real-time information about traffic jams. Its two main functions are: a) visualization of traffic jams; b) building a route through points specified by the user, with and without traffic jams (see Fig.
Left: traffic situation in Moscow at 16:30 on January 21, 2022 according to Yandex.Probki; congestion is rated at 7 points (red traffic light icon at the top left). Right: routing of a car trip from Skhodnya to Moscow by the Yandex.Maps router server. The inset contains information about the trip given by Yandex: distance L = 37 km (green box), travel time TP = 79 min including traffic jams (red box) and T0 = 45 min on a free road (black box). Source: compiled by the author.
These functions are highly in demand, which determines the great popularity of Yandex products: Yandex mobile applications (Maps, Navigator, etc.) are installed and actively used by almost every motorist. As a result, Yandex receives information about the position of the user’s car (from the GPS of the mobile device) every five seconds. The analysis of this information allows Yandex to quickly (once every three minutes) calculate and update data on the state of the road network (the map in Fig.
The scoring of the situation used by Yandex (How they work ...) is good for visual presentation of information and everyday use. However, for a number of reasons, it cannot be used for accurate calculations in research. The main problem is the relativity of this indicator, since points for each route are calculated relative to some reference travel time (without traffic jams) along this route, which: a) is set arbitrarily and b) can be changed at any time. Therefore, the scores:
Finally, the score is an integral indicator (for the entire region), which does not enable studying the spatial effects. In this regard, instead of Yandex.Probki points, the author of this work uses absolute figures: time and distance of the route. Calculations based on such indicators do not have the shortcomings described above for the scoring method.
For a comprehensive analysis of the situation, we must have information about the state of the transport network in the key geo-points of the region. Of course, we do not have the capabilities of the Yandex.Maps platform, which receives information every five seconds from almost every car on the road and controls the traffic situation detailed up to individual road segments in Moscow and Moscow Region. Instead, we form a basic sample of virtual CLM in such a way that their «home-work» routes evenly cover the geographic area in which we want to track the traffic situation. For example, to assess the transport situation within Moscow Ring Road (with a pronounced radial structure), we place the houses of the representatives of the basic sample along the radius of Moscow Ring Road and choose places of work in the center of Moscow (see Fig.
Left: an example of a basic sample for monitoring the transport network within the Moscow Ring Road. CLM (blue arrows) live just outside Moscow Ring Road and work in the center of Moscow. Right: places of residence (blue circles) and work (red spot in the center) of CLM from the basic sample of research linked to the map. Source: compiled by the author.
By requesting the «home-work» and «work-home» routes with and without traffic jams for each member of the sample within a short time interval, we get a snapshot of the transport situation in the area under consideration at a certain point in time.
A fundamentally important point is that the basic sample of CLM is fixed, by analogy with the construction of a respondent panel in panel studies. This is necessary to obtain correct time series from data collected at different points in time.
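The construction of a fixed basic sample can be sketched as follows. This is only a minimal illustration, not the author's software: the function names, the center coordinates (an approximate location of central Moscow), the number of homes per ring and the flat-Earth approximation are all assumptions introduced here.

```python
import math

EARTH_RADIUS_KM = 6371.0

def ring_homes(center_lat, center_lon, radius_km, count):
    """Return `count` (lat, lon) points evenly spaced on a circle around the center.

    Uses a small-distance approximation, which is adequate for rings of tens of km.
    """
    lat0 = math.radians(center_lat)
    points = []
    for i in range(count):
        bearing = 2 * math.pi * i / count  # 0 = due north, clockwise
        dlat = radius_km * math.cos(bearing) / EARTH_RADIUS_KM
        dlon = radius_km * math.sin(bearing) / (EARTH_RADIUS_KM * math.cos(lat0))
        points.append((round(center_lat + math.degrees(dlat), 5),
                       round(center_lon + math.degrees(dlon), 5)))
    return points

def build_basic_sample(center, radii_km, per_ring):
    """Build a deterministic list of virtual commuters: homes on rings, work at the center."""
    lat, lon = center
    sample = []
    for radius in radii_km:
        for k, home in enumerate(ring_homes(lat, lon, radius, per_ring)):
            sample.append({"id": f"r{radius}-{k}", "ring_km": radius,
                           "home": home, "work": center})
    return sample

# Hypothetical parameters: approximate center of Moscow, 8 homes per ring.
MOSCOW_CENTER = (55.7558, 37.6173)
BASIC_SAMPLE = build_basic_sample(MOSCOW_CENTER, (20, 50, 80), per_ring=8)
```

Because the construction is deterministic, rebuilding the sample always yields the same routes, which is the property the panel analogy requires.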
Data is collected through the Yandex.Maps API in automatic mode using specially developed software. The program runs at a specified interval, makes queries to the Yandex.Maps router for the entire basic sample, receives the results and saves the data in the database.
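One collection cycle of the kind described above might look like the following sketch. It does not reproduce the author's program or the Yandex.Maps API: the router call is passed in as a hypothetical `fetch_route(src, dst)` callable returning travel time in minutes and distance in km, and the table layout is an assumption.

```python
import sqlite3
import time

def init_db(conn):
    """Create the observations table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "route_id TEXT, direction TEXT, ts REAL, travel_min REAL, distance_km REAL)"
    )

def collect_snapshot(conn, sample, fetch_route, now=time.time):
    """Query every sample member in both directions and store the results.

    `fetch_route` is a hypothetical wrapper around the routing service;
    it must return (travel_min, distance_km) for a source and destination.
    """
    rows = []
    for member in sample:
        for direction, (src, dst) in (
            ("home-work", (member["home"], member["work"])),
            ("work-home", (member["work"], member["home"])),
        ):
            travel_min, distance_km = fetch_route(src, dst)
            rows.append((member["id"], direction, now(), travel_min, distance_km))
    conn.executemany("INSERT INTO observations VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Running `collect_snapshot` from a scheduler at a fixed interval would accumulate the time series; injecting `fetch_route` keeps the storage logic testable without network access.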
The sample size directly determines the statistical accuracy of the results: the larger the sample, the more accurate the results. However, too large a sample creates problems because the Yandex.Maps router accepts no more than 25,000 requests per day. Exceeding the limit results in the blocking of the IP address of the computer from which data is being collected. A possible solution is a multi-client mode of data collection, which can be implemented in the future.
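The trade-off between sample size and polling frequency under the daily limit can be made concrete with a small calculation. The sketch below assumes two requests per sample member per snapshot (one per direction); the function names and the example sample size are hypothetical.

```python
DAILY_LIMIT = 25_000  # requests per day, as stated in the text

def max_snapshots_per_day(sample_size, requests_per_member=2, daily_limit=DAILY_LIMIT):
    """Number of full snapshots of the sample that fit into the daily request limit."""
    per_snapshot = sample_size * requests_per_member
    if per_snapshot <= 0:
        raise ValueError("sample_size and requests_per_member must be positive")
    return daily_limit // per_snapshot

def min_interval_minutes(sample_size, requests_per_member=2, daily_limit=DAILY_LIMIT):
    """Shortest polling interval in minutes, or None if one snapshot exceeds the limit."""
    snapshots = max_snapshots_per_day(sample_size, requests_per_member, daily_limit)
    if snapshots == 0:
        return None
    return 24 * 60 / snapshots

print(max_snapshots_per_day(500))   # 25
print(min_interval_minutes(500))    # 57.6
```

Under these assumptions a sample of 500 members could be polled roughly once an hour, which shows how quickly a larger sample exhausts a single client's budget.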
The main indicator in data analysis is the time lost by CLM on the commute due to traffic jams. Traffic congestion is formed not only by the movements of commuting migrants but also by residents travelling on personal business within the city and beyond (to dachas, vacation spots, etc.). Nevertheless, there is no need to clear the data of these effects, since the time lost by CLM is determined by the actual congestion. Initially, it was assumed that the specific losses due to traffic jams for each route from the basic sample would be calculated as follows:
E = (TP - T0) / L,    (1)
where TP and T0 are the travel times along a route of length L with and without traffic jams, respectively. However, during the study it was found that Yandex.Probki does not allow TP and T0 to be obtained in a single query. When TP and T0 are requested in two separate queries, the GIS can return different routes even if the queries are sent almost simultaneously and the endpoints are identical, owing to the specifics of its operation (asynchrony and the non-strict solution of the routing problem). As a result, the calculation according to formula (1) turned out to be unimplementable with the current functionality of the Yandex.Maps API.
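For illustration only, formula (1) can be evaluated for the Skhodnya-to-Moscow trip from the routing figure (L = 37 km, TP = 79 min, T0 = 45 min), where both times refer to the same route. The function name is an assumption.

```python
def specific_loss(tp_min, t0_min, distance_km):
    """Formula (1): minutes lost to traffic jams per kilometre of route."""
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return (tp_min - t0_min) / distance_km

loss = specific_loss(tp_min=79, t0_min=45, distance_km=37)
print(round(loss, 3))  # 0.919 minutes lost per km
```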
Therefore, a different approach was taken to determine travel time without traffic jams. For each observation from the sample, according to all the accumulated data for the analyzed period, the minimum specific time of its movement was calculated, taking into account traffic jams:
This calculated value was taken as T0, the travel time without traffic jams. It is important to note that the estimate according to formula (2) turned out to be more reliable than the one according to formula (1), since the T0 value returned by the router is actually the reference travel time underlying the Yandex.Maps scoring system. The subjectivity of this reference value and the possibility of changing it in the router at any time are the reasons for the weakness of the scoring-based estimate described above. The indicator obtained according to formula (2) is much more difficult to manipulate, and there are ways to detect such interference. However, a detailed discussion of this aspect is beyond the scope of this study.
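Formula (2) is not reproduced in this text, so the following is only one plausible reading of the verbal description: for each route, the free-flow baseline is the minimum observed specific travel time (minutes per km, with traffic jams) over the whole accumulated period. The function name and the record layout are assumptions.

```python
def free_flow_baseline(observations):
    """Return {route_id: minimum observed minutes per km} over all observations.

    `observations` is an iterable of (route_id, travel_min, distance_km) tuples,
    where travel_min is the travel time including traffic jams.
    """
    baseline = {}
    for route_id, travel_min, distance_km in observations:
        if distance_km <= 0:
            continue  # skip malformed records
        pace = travel_min / distance_km
        if route_id not in baseline or pace < baseline[route_id]:
            baseline[route_id] = pace
    return baseline

history = [("a", 79, 37), ("a", 45, 37), ("a", 50, 36), ("b", 30, 20)]
print(free_flow_baseline(history))
```

Normalizing by distance before taking the minimum matters because, as noted above, the router may return slightly different routes for the same endpoints.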
The most convenient way to represent losses due to traffic jams (ER) is in relative units:
Individual losses estimated according to formula (3) can then be averaged into any aggregated indicators both over sets (groups) within the sample and over any time periods depending on the specific task.
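Formula (3) is likewise not reproduced here. The sketch below assumes the relative loss is the ratio of the observed travel time to the free-flow time for the same distance, an assumption suggested by the fact that the results are reported as multiples of free-flow travel time. All names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def relative_loss(travel_min, distance_km, baseline_pace):
    """Assumed relative loss: observed time divided by free-flow time (pace * distance)."""
    return travel_min / (baseline_pace * distance_km)

def mean_loss_by(records, key):
    """Average the 'er' field of the records over groups defined by `key(record)`."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec["er"])
    return {k: mean(v) for k, v in groups.items()}

records = [
    {"ring_km": 20, "er": 2.0},
    {"ring_km": 20, "er": 2.4},
    {"ring_km": 50, "er": 1.5},
]
print(mean_loss_by(records, key=lambda r: r["ring_km"]))
```

Passing a different `key` (for example a date or a route direction) gives the other aggregations mentioned above without changing the averaging code.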
Since the objective of this study is to monitor the travel conditions of CLM in the Moscow agglomeration, we took 20,000 virtual residents as the basic sample, living at distances of 20, 50, and 80 km from Moscow and working in the center of Moscow (see Fig.
The hypothesis of the study is that the structure and time dynamics of travel time losses correlate with important events in regional life, such as natural disasters, social restrictions (for example, the introduction of temporary restrictions on movement or work during the COVID-19 pandemic), adverse weather conditions, repair and construction work at transport infrastructure facilities, etc. Constant operational monitoring of the state of the transport network using online GIS platforms makes it possible to identify emerging problems quickly, evaluate the dynamics of their development, and predict the consequences. The results presented in the next section and their discussion will make it possible to test the validity of the hypothesis.
Analysis of the time series of time losses for CLM trips from the formed basic sample is of particular interest, since this indicator is most sensitive to the «rhythm of regional life» and captures all anomalies that violate it.
Dynamics of average indicators for the sample. Fig.
Based on the data obtained, it can be concluded that CLM travelling during peak hours spend on average 1.8–2.4 times more time on the commute than it takes to travel without traffic jams.
Thus, the hypothesis about the sensitivity of the studied indicator to abnormal processes in the region is confirmed by the results of the study.
Average daily time losses for CLM trips during peak hours from home to work (black dots) and back (red dots). Blue curve: Citymapper mobility index (CMMI; Citymapper mobility index… 2021). Source: compiled by the author. Note: CMMI scaling was done for March–April 2021.
Comparison of dynamics between groups. Fig.
Intra-week losses on the «home-to-work» route during peak hours among three sample groups living at distances of 20 km (black dots), 50 km (red dots) and 80 km (green dots) from Moscow and working in the center. Source: compiled by the author.
This result shows that the core of the jams has a radius of less than 20 km, and it is this core that accounts for the main losses of CLM. The ring located at a distance of 20–80 km from the city center is less problematic, as evidenced by the proximity of the indicators of the middle and remote groups. However, these are weekly averages; significant deviations from average behaviour are observed quite often, as evidenced by the scatter of points in Fig.
Rush hour effect. Fig.
Intraday cycle. To study this process, the distribution of the transport network load by hours during the day, averaged over various periods (a day, a week, a month), was analyzed. Fig.
Intraweek cycle. Similarly to the intraday cycle, intraweek cycles can be calculated by averaging over any range of interest, for example, over weeks, months, years. Fig.
Note that both the intraday and intraweek cycles have very stable dynamics and repeatability over a long horizon (several years of observations). In this regard, it can be assumed that when building predictive models, the parameters of the hour and day of the week will be significant factors.
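The intraday and intraweek averaging described above can be sketched as a simple grouping of timestamped losses. This is an illustration under stated assumptions: timestamps are naive local (Moscow) datetimes, the loss values are invented for the example, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def cycle_profile(observations, bucket):
    """Average losses over buckets; `observations` holds (datetime, loss) pairs."""
    grouped = defaultdict(list)
    for ts, loss in observations:
        grouped[bucket(ts)].append(loss)
    return {k: mean(v) for k, v in sorted(grouped.items())}

def intraday(observations):
    """Mean loss by hour of day (0-23)."""
    return cycle_profile(observations, lambda ts: ts.hour)

def intraweek(observations):
    """Mean loss by day of week (0 = Monday, 6 = Sunday)."""
    return cycle_profile(observations, lambda ts: ts.weekday())

# Invented values for illustration only.
obs = [
    (datetime(2022, 1, 17, 8, 0), 2.0),   # Monday morning
    (datetime(2022, 1, 18, 8, 30), 3.0),  # Tuesday morning
    (datetime(2022, 1, 17, 14, 0), 1.0),  # Monday afternoon
]
print(intraday(obs))   # {8: 2.5, 14: 1.0}
print(intraweek(obs))  # {0: 1.5, 1: 3.0}
```

The same hour and weekday keys could later serve as features in a predictive model, in line with the remark above that they are likely to be significant factors.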
Left: intraday travel losses by time of day (group living 20 km from Moscow, averaged from January 2020). Source: compiled by the author. Right: a similar chart in Yandex research. Source: (Traffic jams in Moscow…).
To present the results of the project, a dedicated website (Analysis of mobility…) was launched, where users can explore the results interactively.
This paper presents the results of an analysis of circular labour migrants' trips in terms of time lost on the commute, taking into account the distance of the place of residence from the city center. The article proposes an original method for assessing time losses and demonstrates the dynamics of the obtained indicators over the year, as well as the cyclical nature of losses within a day and a week. The results confirm the hypothesis about the sensitivity of the proposed indicators to regional events.
The described technique has great prospects for further application and scaling. The value of the accumulated data is growing. In the future, it is planned to make predictive models based on the data obtained using machine learning methods. In addition to collecting data on traffic jams, since 2019, the author has been working on collecting data on weather conditions in Moscow Region. The joint analysis of data on weather and traffic congestion is an interesting extension of the project.
Another possible direction for the development of the presented study is the conversion of estimated time losses into monetary losses based on the forgone income model. The implementation of this approach will create an informative economic indicator that will be useful for the purposes of regional management.
Finally, an applied result of the work could be a mobile application that predicts the state of the regional transport network at any time and in any location.
This paper was supported by RFBR grant no. 19-010-00794a.
Julia Yurievna Shitova, Doctor of Economics, PhD in Sociology, Professor, Department of Integrated Communications and Advertising, Russian State University for the Humanities, Moscow, 125047, Russia. Email: yu_shitova@mail.ru