Research Article
An Econometric Assessment of the “Punishment” for Singlehood in Russia: Risks or New Opportunities in Life?
Rafael A. Akhtemzyanov
‡ Lomonosov Moscow State University, Moscow, Russia
Open Access


The paper focuses on the effect of having a marriage partner on health and well-being of Russians as compared with their single compatriots. The health status variation between those who are married and those who are single can be explained both by the protective effect of marriage and marriage selection. Using the Cox proportional hazards model on the self-perceived health data from the RLMS 2004-2019 individual questionnaire, while controlling for socioeconomic factors, lifestyle, and living arrangements, we have found that the protective effect of marriage is non‑existent in men, except for a short-term impact of marital transitions. Women are “punished” for their singlehood due to a lack of a partner in their young age, or being in an unregistered union, or the loss of a breadwinner spouse at the age of 50 to 64. In contrast, women over 65 benefit from singlehood.


Singlehood, marriage, marital status, health, survival analysis

JEL codes: J12, I12, I31


The paper focuses on the effect of having a marriage partner on the health and well-being of Russians as compared with their single compatriots. The relationship between marital status and health has been well studied in different countries and is supported by data. For example, health indicators are on average better among those who are married than among those who are not, while mortality rates are lower for married men and women in all age groups (Goldman et al. 1995; Wyke and Ford 1992; Verbrugge 1979). These papers identify two main reasons for the differences found: the causal protective effect of marriage on health, which operates through psychological, social, economic, behavioural, and other factors, and the self-selection effect, whereby people choose the healthiest (both physically and mentally), wealthiest, and generally most successful partners to marry.

Singlehood itself, understood as the absence of a marriage partner, can be broken down into two types: primary, resulting from never having married, and secondary, arising upon withdrawal from marriage. In the Russian literature, the effect of having a partner on health and well-being is mainly considered in terms of a simple comparison of average indicators (such as self-perceived health or life satisfaction) for the population grouped by current marital status, gender, and age (Sinelnikov 2011; Gurko 2018). However, these studies have not produced conclusive results, since they do not take into account individual factors such as income, education, parental status, and others on which selection into marriage can take place.

Our contribution can help shed some light on the positive and negative effects of living with a partner, as well as assess the size of the singlehood “penalty” for Russians.

Causes of Singlehood

According to the literature, the spread of gender egalitarianism and tertiary education, along with the revolution in gender roles, are the most important reasons for singlehood and childlessness. For example, Bellani et al. (2017) used data from Europe to examine the reasons why people choose to live without a marriage partner. The authors conclude that when traditional family norms do not keep up with changes in people’s lives, the proportion of single individuals grows, whereas when the norms themselves change, the proportion returns to its previous level. Analysing these trends by level of education, the authors found higher risks of singlehood, divorce, or refusal of childbearing among college-educated women, whose interests lay rather in building a career. For non-college-educated men, the chances of finding a partner decreased, as their breadwinner capacity now had a lower value in the marriage market.

Jalovaara et al. (2019) note a growing proportion of college-educated population in Northern Europe, with the fastest growth among women, suggesting increasing gender equality in access to education. However, their conclusions about the link between the level of education and the level of singlehood or childlessness differ from those of the previous paper. Less educated men, who initially had the highest level of childlessness, retained their position even after controlling for the higher overall level of childlessness in the population. For college-educated women, who initially also had the highest levels of childlessness, the differences by level of education become insignificant in the younger cohorts born in the 1960s and 1970s.

The findings of the effect of the growing gender equality in access to education and career opportunities and of the lagging gender norms and traditions on singlehood and childlessness are also supported by the panel data from Germany (Raab and Struffolino 2020) and Finland (Jalovaara and Fasang 2017; Saarela and Skirbekk 2020). That said, Bergström and Vivier (2020) note the importance of not confusing singlehood as an absence of a partner while living together with other family members and singlehood when living alone, as the specifics of singlehood differ in each case.

Zinchenko and Lukyanova (2021) note a special situation in Russia with its imbalance of access to tertiary education in favour of women while the gender stereotypes remain conservative. The Russian Longitudinal Monitoring Survey (RLMS) 1995-2015 data revealed a growth trend in the proportion of unmarried individuals, which was more pronounced for women than for men. While in most marriages the spouses have the same level of education (so-called homogamous unions), the number of marriages in which the wife’s level of education is higher than that of her husband (hypogamy) is also growing. While the overall proportion of college-educated Russians is growing, the growth occurs faster among women, which further exacerbates the current gender inequality in the level of education.

Therefore, the level of education is an important socioeconomic parameter, as it is associated with the probability of falling into the group of singles. Next, we review the literature which examines more broadly the effects of the so-called marital transitions, i.e., entering into or withdrawal from marriage, with a view to identifying their positive or negative effects.

Singlehood as a risk factor for health and well-being over a lifetime

Empirical evidence suggests that the initial period of marriage brings “bonuses” to health, both physical and mental, while dissolution of marriage is associated with “penalties.” The mechanisms of the protective effect of marriage on health include improved psychological status and a reduced probability of depression (Koball et al. 2010). Married men and women are also more inclined to take preventive health measures (Miller and Pylypchuk 2014; Hilz and Wagner 2018). All of the above, according to the authors, also influences life expectancy, which is higher in married men and women.

Lorenz et al. (2006) and Strohschein et al. (2005) found a link between divorce and mental health problems in the short term (2-3 years), which translated into physical disease in the long term (10 years). The negative psychological effects were strongest during the first year after a divorce and gradually subsided, on average, over the next three years. Barbuscia et al. (2022) found that a recent divorce (less than five years ago) was associated with more frequent sleep disorders and depression, while the effect on self-perceived health became apparent in the longer term (5-10 years or more).

Thus, in the short term, entering into marriage is linked with health benefits, while, on the contrary, divorce seems to be a negative stressful life event, leading to deterioration of health, both physical and mental.

The consequences of marriage dissolution go beyond the above-mentioned short-term negative effects of the change in the marital status. A number of longitudinal studies applying the survival analysis have reported higher health risks for both those who for whatever reason withdrew from marriage and those who have never been married.

Molloy et al. (2009) and Floud et al. (2014) found that the risk of death for single, divorced, or widowed individuals aged 30-35 and older was higher than for married individuals. Franke and Kulu (2018) showed that only the magnitude of this additional risk varied with age, while the risk itself remained statistically significant across all age groups. Furthermore, divorcees and married persons differed significantly in alcohol use, tobacco use, stress levels, and socioeconomic status, among both men and women.

A number of studies have reported increased risks of cancer in single women (Trudel-Fitzgerald et al. 2019) and of hypertension in single men (Ramezankhani et al. 2019). At the same time, marriage dissatisfaction seriously increases the risk of heart disease (Isiozor et al. 2019).

Summarizing the empirical results obtained by the authors, we can assume that the single lifestyle is associated with an increased risk of developing chronic diseases and of death as compared to having a marriage partner. The use of survival analysis, namely the Cox proportional hazards model, seems justified here, because it has three advantages over a simple linear regression or matching. Firstly, survival analysis makes it possible to account for censored observations, that is, individuals who dropped out of the survey and whose subsequent outcomes cannot be traced, which is often the case in longitudinal studies. Secondly, it is easier to consider the long-term effects of marital status on health by comparing the survival of groups of respondents according to their marital status over the period of the survey. Thirdly, the characteristics of the participants may change over time (e.g., through getting married, changing jobs, or having children), and survival analysis allows models to be constructed that account for such changes, which improves the accuracy of the results.
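To illustrate the first point, the sketch below shows how censored respondents still contribute to a survival estimate. It uses the Kaplan-Meier estimator, which is a simpler technique than the Cox model applied in this paper, and the five respondents are hypothetical toy data rather than RLMS observations.

```python
from collections import Counter

def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each time with at least one event.

    durations: follow-up time for each respondent (e.g. years in the survey)
    observed:  True if the failure event was seen, False if the respondent
               was censored (left the survey before any event was recorded)
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    survival = 1.0
    curve = []
    for t in sorted(events):
        # Everyone still under observation at time t is at risk,
        # including respondents who are censored at t or later.
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1 - events[t] / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical toy data: five respondents, two of them censored.
durations = [2, 3, 3, 5, 6]
observed = [True, False, True, False, True]
curve = kaplan_meier(durations, observed)
```

The censored respondents never count as events, but they remain in the risk set until they leave, which is exactly the information a regression on completed cases would discard.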

Data and Empirical Strategy

We use the Cox proportional hazards model to analyse the relationship between the marital status and health risks. The main source of data is the RLMS, which offers a broad set of questions on various aspects of the life of Russians. All the necessary control variables are only available from 2004 through 2019, so we rely on the annual observations for the waves 13-27, respectively.

The regressors that we further use in our models are the transformed variables taken from the RLMS questionnaire. Thus, the self-perceived health status is the main health variable. In addition, we consider a wide range of control characteristics: education, income, employment, lifestyle and bad health habits, type of population centre, etc.

As mentioned above, we apply the Cox proportional hazards model with time-varying covariates, primarily so that changes in the variable of interest, i.e., the respondent’s marital status, are accounted for as soon as they occur. If the marital status changes in a given wave of the survey (e.g., the respondent gets married), the model registers the change in that wave (i.e., it captures the marital transition), which is impossible in the standard Cox model with covariates fixed at baseline. In our case, the model equation looks as follows:

$h_i(t) = h_0(t) \exp[marst_i \beta(t) + X_i \gamma(t)]$

where:

$t$ – age of the person (model time)

$h_i(t)$ – hazard for the i-th person of moving into the low self-perceived health category at age $t$

$h_0(t)$ – baseline hazard of moving into the low self-perceived health category, the same for all persons

$X_i$ – matrix of control variables for the i-th person

$marst_i$ – marital status of the i-th person

$\gamma(t)$ – coefficient vector for the control variables, according to the age group

$\beta(t)$ – coefficient vector for the variable of interest (marital status), according to the age group
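A Cox model with time-varying covariates is usually estimated on data in (start, stop] interval form. The sketch below shows one way such rows could be built with age as the time scale. The function name, the status labels, the rule that the status at the start of an interval applies to that interval, and the rule that a respondent is at risk only while in good health are all assumptions made for illustration, not a description of the author's data preparation.

```python
def to_intervals(records):
    """Convert one respondent's annual records into (start, stop, status, event) rows.

    records: list of (age, marital_status, poor_health) tuples sorted by age,
             one per survey wave; poor_health is True or False.
    Each row covers the age interval (start, stop], carries the marital
    status observed at the start of the interval, and has event=True
    when health is good at `start` and poor at `stop`.
    """
    rows = []
    for (age0, status0, poor0), (age1, _status1, poor1) in zip(records, records[1:]):
        if poor0:
            # Assumption: no risk of deterioration while already in poor health.
            continue
        rows.append((age0, age1, status0, poor1))
    return rows

# Hypothetical respondent: marries at 32, reports poor health at 34 and 36.
records = [
    (31, "never married", False),
    (32, "registered marriage", False),
    (33, "registered marriage", False),
    (34, "registered marriage", True),
    (35, "registered marriage", False),
    (36, "registered marriage", True),
]
rows = to_intervals(records)
```

In this toy case the respondent contributes four intervals and two events, since health recovers at 35 and deteriorates again at 36.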

We use the deterioration in the self-perceived health status as a failure event in our models, i.e., moving from the “rather satisfied with one’s health” group to the “dissatisfied with one’s health” group. Since the self-perceived health status is a categorical variable, we use it to construct a binary variable. Thus, respondents with the self-perceived health at “very good,” “good,” or “fair” fall into the good health category while those with the same at “poor” or “very poor”, into the poor health category.

There are several reasons for the above breakdown. Firstly, it allows for a large sample size across all age-sex groups. Annex 2 shows the distribution of all respondents by their self-perceived health status. Secondly, although there are more respondents with self-perceived health at 4 or 5 in the older age groups, the health-neutral category 3 is the largest category of all. This is where the transition occurs between those who perceive their health as rather good and those who, having developed various diseases with age, perceive it as rather poor. Furthermore, it is the transition to the “dissatisfied with one’s health” category that is associated with the development of chronic diseases (see Annex 7).
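The dichotomization and the failure event described above can be sketched as follows. The numeric coding (1 = very good through 5 = very poor) is an assumption inferred from the reference to categories 3, 4, and 5 in the text, and the function names are hypothetical.

```python
# Assumed coding of the five-point self-perceived health item:
# 1 = very good, 2 = good, 3 = fair, 4 = poor, 5 = very poor.
POOR_CODES = {4, 5}

def is_poor_health(code):
    """Map the five-point item to the binary indicator used as the outcome."""
    if code not in {1, 2, 3, 4, 5}:
        raise ValueError(f"unexpected health code: {code!r}")
    return code in POOR_CODES

def failure_events(codes):
    """Flag waves where health moves from the good group to the poor group.

    codes: one respondent's health codes in wave order.
    Returns a list of booleans, one per pair of consecutive waves.
    """
    binary = [is_poor_health(c) for c in codes]
    return [(not prev) and curr for prev, curr in zip(binary, binary[1:])]

# Hypothetical respondent whose health worsens, recovers, and worsens again.
events = failure_events([2, 3, 4, 4, 3, 5])
```

Only moves from the good group (codes 1 to 3) into the poor group (codes 4 and 5) count as events; staying in poor health or recovering does not.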

For convenience, the control variables used in the model can be divided into the following three blocks:

  • Socioeconomic variables: education, income, employment;
  • Living arrangements-related variables: type of population centre, living alone, presence of adult or minor children;
  • Lifestyle-related variables: tobacco use and alcohol use, physical activity.

Furthermore, controls for the year of birth and fixed effects for each wave of the survey were introduced in all the specifications. Such a set of control variables generally covers different aspects of life. As we will see later, even in the absence of detailed medical information on the respondents’ health status, consecutively introducing control variables into the model does explain the protective effect of marriage in many cases, making it statistically insignificant. Therefore, although the problem of endogeneity may still be present due to self-selection into marriage, the available controls allow us to handle it successfully.

The marital status of the respondent is the variable of interest here. We use the “in a registered marriage” group as the control group, with which we compare all the others. Therefore, all the estimates we obtain in the models below are the risks of health deterioration for the groups with different marital statuses as compared with those who are in a registered marriage. In addition, we extended the model by introducing variables that describe the marital transitions over the past three years, namely entering into a registered marriage and withdrawal from marriage through widowhood or divorce. This timeframe was chosen based on the empirical literature on marital transitions and the duration of their effect on health (Lorenz et al. 2006; Strohschein et al. 2005; Barbuscia et al. 2022), as outlined in the previous section, as well as on technical considerations: the data on the dynamics of the respondents’ marital status are only available for a limited period of time. Introducing these regressors into the model is necessary to account for the duration of the current marital status. In substance, our model remembers each respondent’s marital transitions over a three-year period, which means it can estimate the effect of marital status on health risks more accurately, relying not only on the current marital status but also on its recent history. Annex 1 gives a table describing the variables of the model.
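The three transition indicators could be derived from a respondent's status history roughly as in the sketch below. The status labels, the function name, and the exact window rule (a transition is flagged in the wave where it is first observed and in the two following years) are assumptions for illustration; the paper states only that transitions over the past three years are recorded.

```python
MARRIED = "registered marriage"

def recent_transitions(history, window=3):
    """Build the marriage, divorce, and widowhood indicators for each wave.

    history: list of (year, status) tuples for one respondent, sorted by year.
    A transition is dated to the first wave in which the new status is seen.
    """
    transitions = []  # (year, kind) for every observed change of status
    for (_, prev), (year, curr) in zip(history, history[1:]):
        if prev != MARRIED and curr == MARRIED:
            transitions.append((year, "marriage"))
        elif prev == MARRIED and curr == "divorced":
            transitions.append((year, "divorce"))
        elif prev == MARRIED and curr == "widowed":
            transitions.append((year, "widowhood"))
    rows = []
    for year, _ in history:
        flags = {"year": year, "marriage": 0, "divorce": 0, "widowhood": 0}
        for t_year, kind in transitions:
            # Flag the transition year itself and the next window - 1 years.
            if 0 <= year - t_year < window:
                flags[kind] = 1
        rows.append(flags)
    return rows

# Hypothetical respondent: marries in 2011, divorces in 2015.
history = [(2010, "never married"), (2011, MARRIED), (2012, MARRIED),
           (2013, MARRIED), (2014, MARRIED), (2015, "divorced")]
rows = recent_transitions(history)
```

With these flags in the model, the current-status coefficients describe respondents whose status has been stable for at least three years, while the flags absorb the short-term effect of the transition itself.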

Consistent with the literature reviewed in the previous section, we expect the protective effect of marriage to differ across age-sex groups. For this reason, and also in order to meet the proportionality condition, which is one of the assumptions of the Cox model, we added binary variables to the model for the following three age groups: 30-49 years, 50-64 years, and 65+ years, with a view to estimating age-heterogeneous effects both for the marital status and for the control variables. This is the method which Therneau et al. (2017) recommend for modelling time-dependent hazard ratios. We chose the age-group thresholds based on the methodology of Franke and Kulu (2018). A respondent’s self-perceived health status can both worsen and improve over the course of the survey, so multiple events can occur for one person. Because the health dynamics of some respondents were nonmonotonic, we clustered the standard errors at the respondent ID level (Therneau et al. 2017).
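Age-specific coefficients of this kind are commonly obtained by interacting each covariate with age-group indicators. A minimal sketch under that assumption follows; the column naming scheme and the status labels are hypothetical, and ages are taken to be whole years.

```python
# Age groups from the text: 30-49, 50-64, and 65+ (upper bound None = open-ended).
AGE_GROUPS = [("30-49", 30, 49), ("50-64", 50, 64), ("65+", 65, None)]

def age_group(age):
    """Return the label of the age group, or None for ages under 30."""
    for label, low, high in AGE_GROUPS:
        if age >= low and (high is None or age <= high):
            return label
    return None

def interact(status, age, statuses, reference="registered marriage"):
    """One 0/1 column per (non-reference status, age group) pair."""
    group = age_group(age)
    return {
        f"{s} x {label}": int(s == status and label == group)
        for s in statuses if s != reference
        for label, _, _ in AGE_GROUPS
    }

statuses = ["registered marriage", "never married", "widowed"]
row = interact("widowed", 67, statuses)          # only "widowed x 65+" is 1
married = interact("registered marriage", 40, statuses)  # all zeros: reference group
```

Each interaction column then receives its own coefficient, so the hazard ratio for a given status is allowed to differ between the three age groups.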

Thus, based on the RLMS questionnaire, we found all of the most important variables that were controlled for in the previous studies.

Results

Annex 4 gives complete tables of the model estimates in all the specifications and with a breakdown by gender.

We obtained the following results (Table 1) by exponentiating the coefficients of the variables of interest from the regression estimates.

Table 1.

Hazard ratios (HR) for age-sex groups as a function of marital status

Marital status                                  M 30-49  M 50-64  M 65+  W 30-49  W 50-64  W 65+
Never married                                   1.38     0.57     0.57   1.45**   1.22     1.02
Living together, not in a registered marriage   1.28     0.88     0.88   1.01     1.46***  1.06
Divorced and unmarried                          1.27     0.79     0.79   0.98     1.17     0.92
Widowed                                         0.79     1.07     1.07   0.98     1.32**   0.80**
In a registered marriage, not living together   1.51     1.13     1.13   1.32     1.93*    0.31
Events                                          1 828 (men)              2 744 (women)
Observations                                    33 811 (men)             41 797 (women)

Note: M = men, W = women; reference group: in a registered marriage. Significance levels: * 0.1, ** 0.05, *** 0.01.
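The hazard ratios in Table 1 can be read as percentage differences in risk relative to the reference group (those in a registered marriage). The short sketch below shows the arithmetic for two of the significant estimates; the function names are illustrative.

```python
import math

def hazard_ratio(coefficient):
    """Exponentiate a Cox regression coefficient to obtain a hazard ratio."""
    return math.exp(coefficient)

def percent_change(hr):
    """Relative difference in hazard versus the reference group, in percent."""
    return (hr - 1.0) * 100.0

# Values taken from Table 1.
never_married_w_30_49 = percent_change(1.45)  # about +45%
widowed_w_65_plus = percent_change(0.80)      # about -20%
```

A ratio above 1 therefore means a higher risk of moving into the poor health category than for married respondents of the same sex and age group, and a ratio below 1 means a lower risk.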

In order to draw substantive conclusions, we tested the assumptions of the estimated models (see Annex 8).

For convenience, let us consider each of the marriage statuses one by one as classified in the RLMS questionnaire.

  • Never married:

The effect for men in the complete specification model was not significantly different from zero in any age group, and most of the protective effect of marriage for them was explained by introducing variables relating to living arrangements, namely parental status and living alone, which correlate with marital status. Notably, the introduction of lifestyle and habits-related variables into the model did very little to change the size of the effect for men in this age group. It is likely that these factors have a significant effect at older ages. For example, in the 65+ age group, it was the introduction of lifestyle variables that explained the entire effect for primarily single men. For women, the effect remained stable even upon introducing additional groups of controls. With these controls, never-married women aged 30-49 had about a 45% higher risk of lower self-perceived health than married women. Still being single while most of one’s peers are already in a relationship can cause additional anxiety and stress, which lead to health problems. That is true for Russian women in their younger working ages. However, once women reach the age of 50, never having married no longer affects the probability of health deterioration; we did not find a significant effect even in the baseline specification. This can be explained by two reasons: first, at older ages, the decision not to get married is already an informed choice and therefore does not lead to any significant stress, and second, with age, a person simply adapts to living alone, so that the costs of this lifestyle decrease.

  • Living together, but not in a registered marriage:

We deliberately did not merge this category with those who have officially registered their relationships. Of course, those who live together but are not married are not single. Nevertheless, it was interesting to see whether there are any effects for this group, or whether it need not be singled out as a separate category in the future. In men, as before, no protective effect of marriage was found. For women in the younger age group, the effect initially identified in the baseline model completely disappeared when the new blocks of control variables were introduced. But at older ages (50-64) it remained robust, with a 46% higher risk of lower self-perceived health. Sinelnikov (2018) argues that cohabitation cannot be treated as registered marriage: “The main purposes for people to get married (achieving personal happiness, getting rid of loneliness, and having children) are achieved much less often in cohabitation than in legal marital unions. Therefore, one should not treat cohabitation as marriage. It is an intermediate form of marital status in between legal marriage and singlehood. The replacement of marriage with cohabitation means a gradual transition from the family lifestyle to the single one” (Sinelnikov 2018: 108). Based on the RLMS data, the above author found significant differences in the average levels of happiness and childlessness between the two groups. Since his work did not control for other factors, his findings do not yet establish causality (a more comprehensive analysis is required), but they are generally consistent with our conclusions.

  • Divorced and unmarried:

Initially, we expected higher risks for divorced individuals in all age-sex groups since the reviewed empirical literature (Lorenz et al. 2006; Strohschein et al. 2005; Barbuscia et al. 2022) suggests that a marriage breakdown entails negative short-term psychological consequences which further translate into health problems. Nevertheless, based on the Russian data, we did not find a significant long-term effect of a marriage breakdown on the health dynamics in any age-sex group. Remarkably, we did obtain higher health risks in men aged 30-49 and 65+ years in the baseline specification of the model, which gradually disappeared as the control variables were introduced. Older ages probably have a higher proportion of second (and subsequent) divorces whose health consequences the literature considers to be not as strong as those of the first marriage dissolution.

Therefore, parting with an unloved partner, for all its costs, does not create additional health hazards. It seems that the main risk factor for the divorced persons is deterioration of their living arrangements (in particular, they live alone much more often) and development of bad habits, and it is through that channel that deterioration of health further takes place (of course, this hypothesis of ours also needs to be additionally tested). Furthermore, divorce, on average, occurs at an older age than marriage so that deterioration of health can partly be explained with the age factor as well.

  • Widowed:

There were few widowed respondents in the younger age groups, so the confidence intervals for these groups were wide, making it difficult to find causal relationships for them. In women of pre-retirement age (50-64), the loss of a partner, the breadwinner of the family, indeed increases the risk of health problems by approximately 32% (HR = 1.32 in Table 1). However, at age 65+, by contrast, the effect is positive, with approximately a 20% reduction in the risk of health deterioration. As with the divorced, widowed women are on average more likely to live alone, and on average they are older. By that age, they will often have adult children who can take care of them, so they do not bear the burden of loneliness even if they live without a partner. Moreover, they do not need to take care of ill spouses, which also creates a protective effect of living alone for women of retirement age.

  • In a registered marriage, but not living together:

This group was the smallest of all (see Annex 3), which widens the confidence intervals and potentially makes it difficult to capture any relevant effects. That said, we identified increased health risks, as compared to married women, for women in both age groups under 65 in the baseline model. However, in both cases it was the introduction of lifestyle-related variables that completely explained the effects. Thus, this group is the most difficult to draw substantive conclusions about, due to its small sample size. Probably, as is the case with the widowed and the divorced, marital status does not have a direct impact on health but rather manifests itself through changes in living arrangements and especially through the development of bad habits.

The following conclusion can be drawn from all the above: in Russia, the protective effect of marriage is much weaker than the literature led us to expect. In most cases, the protective effect of marriage disappears when controls are introduced for socioeconomic status, living arrangements, and habits.

In men, the protective effect of marriage was not found in any age group. Women have a greater propensity for prioritizing family over career and, therefore, have a hard time living alone due to not having a partner at their young age or if they live in an unregistered relationship at the pre-retirement age. They face a health risk due to their loss of a breadwinner spouse at ages 50-64. But then after the age of 65, widows face lower health risks than married women, which we initially did not expect to find.

Next, we will discuss the limitations of this analysis and possible directions of future research that would clarify the nature of “punishment” for singlehood in Russia.


The main limitation we encountered in conducting this study relates to the non-binary nature of the dependent variable, i.e., self-perceived health status. We developed a strategy for dichotomizing this variable based on the literature exploring the specifics of using the health variable in the RLMS and bringing it to a binary form, as well as for technical reasons (please, refer to Data and Empirical strategy). However, it is not technically possible to conduct a robustness analysis for selecting a threshold between what would be considered good health and poor health and, in particular, which category the “fair” health status should fall into, because in this case the sample size for all age-sex groups is simply not sufficient for the robustness analysis itself to be considered substantive.

A number of studies have looked into the extent to which self-perceived health status truly reflects a person’s health rather than a subjective perception thereof. For example, Zheng and Thomas (2013) note that a higher self-perceived health status among married persons can be explained not only by the protective function of marriage but also by the fact that they tend to overestimate their health status, so that their self-perceived health only deteriorates when they have already developed serious diseases.

On the other hand, there is evidence to suggest that self-perceived health status used in RLMS is a good predictor of mortality (Perlman and Bobak 2008). The authors used the wave data from 1994 to 2002 to find that if the self-perceived health is considered a categorical variable, it produces anomalies in women: those who rated their health as “good” had higher mortality rates than those who rated it as “fair.” And for the “very good” health rating there was simply not enough data to draw substantive conclusions. In men, a normal mortality gradient was observed from the “good” health to the “poor” health. That said, as the authors suggest, if the respondents are divided into two health groups (including the “fair” health specifically into the “good” health group) rather than five, then the mortality risk correlates very well with the health group (the authors used the Cox proportional hazards model).

The straightforward selection of the development of chronic diseases as a dependent variable presents a number of technical and substantive problems. Firstly, the presence (or absence) of a particular chronic disease conveys little about the person’s overall health status. In such cases, one can control for numerous medical characteristics of health: blood pressure, haemoglobin, body mass index and others, many of which are not to be found in the RLMS questionnaire. Secondly, the RLMS data on the presence of chronic diseases are of low quality. Thus, there are many respondents in the sample who forgot to mention the presence of certain diseases in each wave of the survey. Therefore, an additional cleansing and correction of the chronic disease data is required before those can be used to conduct a longitudinal study, which can also create additional limitations for the extrapolation of the results.


The connection between singlehood and well-being is still insufficiently covered in the scientific literature. Most papers, including this one, focus specifically on the direct effect of marital status on health while ignoring other aspects of life, such as the development of bad habits, changes in living arrangements, losing or finding a job, income dynamics, etc. The survival analysis methodology makes it possible to assess the connection between the single lifestyle and these as well as many other aspects of human well-being, in order to get a broader picture of the “punishment” for being single in Russia: its magnitude, its time-dependent behaviour, and its interplay with general changes in lifestyles and in behaviour in the marriage market.

Compared to the previously obtained empirical evidence suggesting a very large protective capacity of marriage in terms of health, we have found from the fresh data on the Russian population that it is not as prominent as originally expected. The three blocks of control variables that we used explain most of the differences between the marriage status groups. This means that in many cases the differences in the health dynamics between the single and the married Russians are explained by marital selection and changes in lifestyles and habits, rather than by the protective function of marriage. Having said this, at present, the differences between the single and the married Russians are still detectable in the data. And therefore, the institution of marriage is still an important factor in the health dynamics of Russia’s population.


  • Barbuscia A., Cambois E., Pailhé A., Comolli C.L., Bernardi L. (2022) Health after union dissolution(s): Cumulative and temporal dynamics // SSM – Population Health: 17: 101042.
  • Bellani D., Esping-Andersen G., Nedoluzhko L. (2017) Never partnered: A multilevel analysis of lifelong singlehood // Demographic Research: 37(4): 53-100.
  • Floud S., Balkwill A. et al. (2014) Marital status and ischemic heart disease incidence and mortality in women: a large prospective study // BMC medicine: 12: 42.
  • Franke S., Kulu H. (2018) Mortality differences by partnership status in England and Wales: the effect of living arrangements or health selection? // European Journal of Population: 34: 87-118.
  • Hilz R., Wagner M. (2018) Marital status, partnership and health behaviour: Findings from the German Ageing Survey (DEAS) // Comparative Population Studies: 43: 65-97.
  • Isiozor N.M., Kunutsor S.K., Laukkanen T., Kauhanen J., Laukkanen J.A. (2019) Marriage dissatisfaction and the risk of sudden cardiac death among men // The American Journal of Cardiology: 123(1): 7-11.
  • Koball H.L., Moiduddin E., Henderson J., Goesling B., Besculides M. (2010) What do we know about the link between marriage and health? // Journal of Family Issues: 31(8): 1019-40.
  • Lorenz F.O., Wickrama K.A.S., Conger R.D., Elder G.H. (2006) The short-term and decade-long effects of divorce on women’s midlife health // Journal of health and social behavior: 47(2): 111-25.
  • Molloy G.J., Stamatakis E., Randall G., Hamer M. (2009) Marital status, gender and cardiovascular mortality: Behavioural, psychological distress and metabolic explanations // Social science & medicine: 69(2): 223-8.
  • Perlman F., Bobak M. (2008) Determinants of self rated health and mortality in Russia – are they the same? // International Journal for Equity in Health: 7: 19.
  • Ramezankhani A., Azizi F., Hadaegh F. (2019) Associations of marital status with diabetes, hypertension, cardiovascular disease and all-cause mortality: A long term follow-up study // PloS ONE: 14(4): e0215593.
  • Strohschein L., McDonough P., Monette G., Shao Q. (2005) Marital transitions and mental health: Are there gender differences in the short-term effects of marital status change? // Social science & medicine: 61(11): 2293-303.
  • Trudel-Fitzgerald C., Poole E., Sood A.K., Okereke O.I., Kawachi I., Kubzansky L.D., Tworoger S.S. (2019) Social integration, marital status, and ovarian cancer risk: A 20-year prospective cohort study // Psychosomatic medicine: 81(9): 833-40.
  • Zheng H., Thomas P.A. (2013) Marital status, self-rated health, and mortality: Overestimation of health or diminishing protection of marriage? // Journal of Health and Social Behavior: 54(1): 128-43.
  • Zinchenko D.I., Lukyanova A.L. (2021) Trends in educational assortative mating in Russia: do changes in educational structure matter? // Universe of Russia. Sociology. Ethnology: 30(1): 111-133. (in Rus.).

Other sources of information

Therneau T., Crowson C., Atkinson E. (2023) Using time dependent covariates and time dependent coefficients in the Cox model. CRAN Packages, Survival.



Annex 1. Variables in the model of health effects of singlehood

| Name of Variable | Description | Units, Scale |
|---|---|---|
| Variables of Interest | | |
| Marst | Current marital status | 1) Never married; 2) In a registered marriage; 3) Living together, not married; 4) Divorced and unmarried; 5) Widowed; 6) In a registered marriage, but not living together |
| marriage | Binary variable for those who have got married over the past 3 years | 0) No; 1) Yes |
| Divorce | Binary variable for those who have divorced over the past 3 years | 0) No; 1) Yes |
| widowhood | Binary variable for those who have become widowed over the past 3 years | 0) No; 1) Yes |
| Block 0: fixed effects and technical control variables | | |
| Wave | Survey wave index | 13-27 waves |
| Born | Year of birth cohorts | Respondent’s year of birth |
| age_group | Respondent categories by age (for heterogeneous effects and model calibration). Methodology used: (Franke S., Kulu H., 2018) | 1) 30-49 years; 2) 50-64 years; 3) 65+ years |
| Block 1: Socioeconomic control variables | | |
| Educ | Level of education. Represented in the model as a set of dummy variables | 0) Basic general; 1) Secondary or Vocational secondary; 2) Tertiary |
| Income | Monthly income in 2020 rubles. Obtained by nominal income adjustment for CPI | RUB ‘000 |
| Job | Employment | 0) no; 1) yes |
| Block 2: control variables for living arrangements | | |
| Status | Type of population centre | 1) regional centre; 2) city; 3) urban-type settlement; 4) village |
| Alone | Respondent living alone? Obtained from household data | 0) no; 1) yes |
| children_old | Number of children from 18 years and upwards | Number |
| children_young | Number of children under 18 years | Number |
| Block 3: control variables for lifestyle | | |
| Alcohol | Alcohol use | 0) no; 1) yes |
| cigarettes | Tobacco use | Cigarettes per day |
| Phys | Physical activity level | 0) No physical exercise; 1) Occasional physical exercise, or better |

Dependent variable (failure event)

| Name of variable | Description | Units, Scale |
|---|---|---|
| event_health | Falls into “rather poor” self-perceived health category (health > 3) | 0) no; 1) yes |

Annex 2. Distribution of respondents by self-perceived health status

| health | M 30-49 | M 50-64 | M 65+ | W 30-49 | W 50-64 | W 65+ |
|---|---|---|---|---|---|---|
| 1 | 2.0% | 0.6% | 0.4% | 1.2% | 0.3% | 0.1% |
| 2 | 45.2% | 20.0% | 7.9% | 35.8% | 12.1% | 3.7% |
| 3 | 48.2% | 64.8% | 55.0% | 57.5% | 69.8% | 49.2% |
| 4 | 4.2% | 13.1% | 30.3% | 5.2% | 16.3% | 39.3% |
| 5 | 0.4% | 1.5% | 6.4% | 0.3% | 1.5% | 7.7% |
| Total | 100% | 100% | 100% | 100% | 100% | 100% |
| Total, persons | 32 569 | 18 503 | 10 354 | 39 498 | 27 885 | 25 134 |

Annex 3: Balance of covariates by gender and marital status of respondents

Columns 1 to 6 are the marital status (marst) codes from Annex 1. Empty cells are values missing from the source table.

Women 30+

| Variable | Level | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| observations | | 4972 | 44041 | 8357 | 11962 | 22613 | 572 |
| age | | 48.03 (15.56) | 49.15 (13.03) | 45.63 (11.83) | 51.77 (13.05) | 69.00 (12.03) | 49.52 (12.78) |
| income | | 23.10 (18.03) | 19.54 (17.03) | 20.44 (18.11) | 24.60 (18.67) | 18.25 (11.78) | 23.67 (17.86) |
| educ (%) | 0 | 792 (16.0) | 5005 (11.4) | 1284 (15.4) | 1226 (10.3) | 8409 (37.3) | 82 (14.4) |
| | 1 | 2456 (49.5) | 25762 (58.6) | 5179 (62.1) | 7200 (60.4) | 10629 (47.2) | 329 (57.6) |
| | 2 | 1713 (34.5) | 13224 (30.1) | 1876 (22.5) | 3503 (29.4) | 3503 (15.5) | 160 (28.0) |
| job (%) | 0 | 2003 (40.3) | 17738 (40.3) | 2933 (35.1) | 4389 (36.7) | 17887 (79.2) | 205 (35.8) |
| | 1 | 2966 (59.7) | 26288 (59.7) | 5422 (64.9) | 7568 (63.3) | 4700 (20.8) | 367 (64.2) |
| status (%) | 1 | 2410 (48.5) | 17308 (39.3) | 3374 (40.4) | 5806 (48.5) | 9620 (42.5) | 284 (49.7) |
| | 2 | 1014 (20.4) | 11908 (27.0) | 2322 (27.8) | 3536 (29.6) | 5728 (25.3) | 143 (25.0) |
| | 3 | 405 | 3050 (6.9) | 455 | | 1340 (5.9) | 42 |
| | 4 | 1143 (23.0) | 11775 (26.7) | 2206 (26.4) | 1798 (15.0) | 5925 (26.2) | 103 (18.0) |
| alone (mean (SD)) | | 0.25 (0.43) | 0.00 (0.06) | 0.01 (0.12) | 0.27 (0.44) | 0.42 (0.49) | 0.18 (0.39) |
| children_old (mean (SD)) | | 0.25 (0.53) | 1.18 (1.08) | 0.90 (1.02) | 1.07 (0.88) | 1.70 (1.09) | 1.21 (1.14) |
| children_young (mean (SD)) | | 0.30 (0.58) | 0.61 (0.89) | 0.59 (0.87) | 0.35 (0.62) | 0.06 (0.29) | 0.48 (0.77) |
| cigarettes | | 2.00 (5.05) | 1.31 (4.29) | 3.75 (6.76) | 2.54 (5.83) | 0.79 (3.51) | 2.96 (6.38) |
| alcohol | | 0.58 (0.49) | 0.62 (0.49) | 0.69 (0.46) | 0.63 (0.48) | 0.48 (0.50) | 0.73 |
| phys | | 0.23 (0.42) | 0.21 (0.41) | 0.18 (0.39) | 0.25 (0.43) | 0.18 (0.38) | 0.24 (0.43) |
| health (mean (SD)) | | 2.86 (0.73) | 2.89 (0.66) | 2.85 (0.64) | 2.97 (0.67) | 3.40 (0.72) | 2.98 (0.69) |

Men 30+

| Variable | Level | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| observations | | 3384 | 43864 | 8075 | 3503 | 2312 | 288 |
| age | | 39.03 (9.36) | 50.34 (13.56) | 46.58 (12.43) | 48.52 (11.92) | 70.49 (12.30) | 48.02 (12.77) |
| income | | 20.04 (19.77) | 28.65 (21.50) | 24.94 (20.19) | 20.88 (19.55) | 21.61 (15.91) | 24.48 (21.05) |
| educ (%) | 0 | 773 (23.0) | 7737 (17.7) | 1887 (23.4) | 649 (18.6) | 971 (42.1) | 53 (18.5) |
| | 1 | 1936 (57.5) | 25546 (58.3) | 5094 (63.3) | 2256 (64.6) | 935 (40.6) | 167 (58.2) |
| | 2 | 657 (19.5) | 10504 (24.0) | 1068 (13.3) | 589 (16.9) | 399 (17.3) | 67 (23.3) |
| job (%) | 0 | 1548 (45.7) | 14665 (33.4) | 2476 (30.7) | 1647 (47.1) | 1862 (80.7) | 110 (38.2) |
| | 1 | 1836 (54.3) | 29177 (66.6) | 5598 (69.3) | 1853 (52.9) | 445 (19.3) | 178 (61.8) |
| status (%) | 1 | 1446 (42.7) | 16595 (37.8) | 3242 (40.1) | 1500 (42.8) | 989 (42.8) | 173 (60.1) |
| | 2 | 737 (21.8) | 11992 (27.3) | 2227 (27.6) | 950 (27.1) | 506 (21.9) | 57 (19.8) |
| | 3 | 214 | 3110 (7.1) | 420 | | | |
| | 4 | 987 (29.2) | 12167 (27.7) | 2186 (27.1) | 818 (23.4) | 654 (28.3) | 46 (16.0) |
| alone (mean (SD)) | | 0.17 (0.37) | 0.00 (0.06) | 0.01 (0.09) | 0.38 (0.48) | 0.49 (0.50) | 0.36 (0.48) |
| children_old (mean (SD)) | | 0.01 (0.15) | 1.11 (1.08) | 0.78 (1.02) | 0.88 (0.94) | 1.66 (1.04) | 0.81 (1.01) |
| children_young (mean (SD)) | | 0.06 (0.27) | 0.63 (0.89) | 0.54 (0.82) | 0.40 (0.68) | 0.05 (0.30) | 0.57 (0.91) |
| cigarettes | | 10.31 (10.13) | 9.06 (10.54) | 13.26 (10.82) | 11.95 (10.47) | 6.50 (9.91) | 12.13 (11.21) |
| alcohol | | 0.77 (0.42) | 0.76 (0.43) | 0.80 (0.40) | 0.77 (0.42) | 0.69 (0.46) | 0.80 |
| phys | | 0.24 (0.43) | 0.21 (0.40) | 0.17 (0.38) | 0.22 (0.42) | 0.20 (0.40) | 0.23 (0.42) |
| health (mean (SD)) | | 2.66 (0.73) | 2.80 (0.70) | 2.74 (0.67) | 2.86 (0.73) | 3.30 (0.78) | 2.91 (0.77) |

Annex 4: Complete Results of Assessment of the Marital Status Effect on Health Risks

| | Men 30+ (1) | Men 30+ (2) | Men 30+ (3) | Women 30+ (1) | Women 30+ (2) | Women 30+ (3) |
|---|---|---|---|---|---|---|
| marst = 1, age_group = 1 | 0.48*** | 0.34* | 0.32 | 0.42*** | 0.29* | 0.37** |
| marst = 3, age_group = 1 | 0.32** | 0.21 | 0.25 | 0.24** | 0.18 | 0.01 |
| marst = 4, age_group = 1 | 0.51*** | 0.41** | 0.24 | 0.20 | 0.12 | -0.02 |
| marst = 5, age_group = 1 | -0.40 | -0.44 | -0.23 | 0.23 | 0.21 | -0.02 |
| marst = 6, age_group = 1 | 0.79* | 0.55 | 0.41 | 0.94*** | 0.79*** | 0.28 |
| marst = 1, age_group = 2 | -0.47 | -0.57* | -0.57 | 0.20 | 0.15 | 0.20 |
| marst = 3, age_group = 2 | -0.05 | -0.07 | -0.13 | 0.32*** | 0.30*** | 0.38*** |
| marst = 4, age_group = 2 | -0.11 | -0.18 | -0.23 | 0.18* | 0.16 | 0.16 |
| marst = 5, age_group = 2 | -0.02 | -0.08 | 0.07 | 0.28*** | 0.28*** | 0.28** |
| marst = 6, age_group = 2 | 0.13 | 0.10 | 0.12 | 0.70*** | 0.66*** | 0.66* |
| marst = 1, age_group = 3 | -1.83** | -1.86** | -0.57 | -0.12 | -0.12 | 0.02 |
| marst = 3, age_group = 3 | -0.10 | -0.10 | -0.13 | -0.03 | -0.02 | 0.06 |
| marst = 4, age_group = 3 | -0.39** | -0.36 | -0.23 | -0.15 | -0.14 | -0.08 |
| marst = 5, age_group = 3 | -0.28** | -0.23 | 0.07 | -0.16** | -0.16** | -0.22** |
| marst = 6, age_group = 3 | 0.40 | 0.44 | 0.12 | -0.82 | -0.90 | -1.18 |
| marriage = 1, age_group = 1 | 0.02 | 0.01 | -0.07 | -0.10 | -0.10 | -0.16 |
| marriage = 1, age_group = 2 | -0.10 | -0.11 | -0.12 | 0.05 | 0.04 | 0.10 |
| marriage = 1, age_group = 3 | -0.17** | -0.18** | -0.21** | -0.08 | -0.09* | -0.13 |
| divorce = 1, age_group = 1 | -0.12 | -0.15 | -0.18 | 0.06 | 0.07 | 0.12 |
| divorce = 1, age_group = 2 | 0.01 | -0.01 | 0.15 | 0.11 | 0.10 | 0.07 |
| divorce = 1, age_group = 3 | 0.08 | 0.06 | -0.01 | 0.09 | 0.07 | -0.10 |
| widowhood = 1, age_group = 1 | 0.24* | 0.24* | 0.30** | -0.02 | -0.05 | -0.03 |
| widowhood = 1, age_group = 2 | 0.02 | 0.03 | -0.17 | -0.02 | -0.06 | -0.23** |
| widowhood = 1, age_group = 3 | 0.06 | 0.05 | -0.02 | 0.04 | 0.02 | 0.08 |
| Robust SE | Yes | Yes | Yes | Yes | Yes | Yes |
| FE waves | Yes | Yes | Yes | Yes | Yes | Yes |
| Control 1 | Yes | Yes | Yes | Yes | Yes | Yes |
| Control 2 | No | Yes | Yes | No | Yes | Yes |
| Control 3 | No | No | Yes | No | No | Yes |
| Events | 2639 | 2639 | 1828 | 5177 | 5177 | 2744 |
| Concordance | 0.632 | 0.641 | 0.663 | 0.585 | 0.59 | 0.607 |
| Observations | 45,042 | 45,039 | 33,811 | 63,467 | 63,464 | 41,797 |
| Wald Test | 412.06*** (df = 51) | 510.52*** (df = 69) | 491.13*** (df = 77) | 256.08*** (df = 37) | 277.81*** (df = 46) | 274.96*** (df = 55) |
| Max. Possible R2 | 0.52 | 0.52 | 0.48 | 0.48 | 0.36 | 0.29 |

* p < 0.1, ** p < 0.05, *** p < 0.01
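
The estimates in Annex 4 take both signs, which suggests they are Cox coefficients on the log-hazard scale rather than hazard ratios; the annex does not state this explicitly, so treat it as an assumption. Under that reading, exponentiating a coefficient gives the hazard ratio relative to the omitted category (assumed here to be registered marriage, marst = 2, since it is the only code without its own row). A minimal sketch in Python, where `hazard_ratio` is an illustrative helper name and not part of the author's code:

```python
import math


def hazard_ratio(coef: float) -> float:
    """Convert a Cox log-hazard coefficient into a hazard ratio."""
    return math.exp(coef)


# Two coefficients copied from Annex 4, specification 1.
# Assumption: the reference category is marst = 2 (registered marriage).
examples = {
    "men, never married, 30-49": 0.48,
    "women, widowed, 65+": -0.16,
}

for label, coef in examples.items():
    hr = hazard_ratio(coef)
    change = (hr - 1.0) * 100.0  # percentage change in the hazard
    print(f"{label}: HR = {hr:.2f} ({change:+.0f}% vs. reference)")
```

A hazard ratio above 1 corresponds to a higher risk of moving into the poor-health category than in the reference group, and a ratio below 1 to a lower risk.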

Annex 5. Proportionality test results for full-specification models, by gender

Men 30+

| Term | Chisq | df | p |
|---|---|---|---|
| wave | 12.618 | 13 | 0.478 |
| born | 0.659 | 1 | 0.417 |
| marst*age_group | 12.890 | 15 | 0.611 |
| marriage*age_group | 2.326 | 3 | 0.508 |
| divorce*age_group | 2.071 | 3 | 0.558 |
| widowhood*age_group | 4.837 | 3 | 0.184 |
| educ*age_group | 12.921 | 6 | 0.044 |
| income*age_group | 3.543 | 3 | 0.315 |
| job*age_group | 3.595 | 3 | 0.309 |
| status*age_group | 11.509 | 9 | 0.242 |
| alone*age_group | 6.935 | 3 | 0.074 |
| children_old*age_group | 2.741 | 3 | 0.433 |
| children_young*age_group | 3.222 | 3 | 0.359 |
| cigarettes*age_group | 0.101 | 3 | 0.992 |
| alcohol*age_group | 2.825 | 3 | 0.419 |
| phys*age_group | 1.463 | 3 | 0.691 |
| GLOBAL | 90.656 | 77 | 0.137 |

Women 30+

| Term | Chisq | df | p |
|---|---|---|---|
| born | 0.00856 | 1 | 0.926 |
| marst*age_group | 24.21359 | 15 | 0.062 |
| marriage*age_group | 2.37526 | 3 | 0.498 |
| divorce*age_group | 1.07182 | 3 | 0.784 |
| widowhood*age_group | 0.53934 | 3 | 0.910 |
| educ*age_group | 7.34928 | 6 | 0.290 |
| income*age_group | 4.82144 | 3 | 0.185 |
| job*age_group | 2.24149 | 3 | 0.524 |
| alone*age_group | 0.16425 | 3 | 0.983 |
| children_old*age_group | 1.74844 | 3 | 0.626 |
| children_young*age_group | 1.84474 | 3 | 0.605 |
| cigarettes*age_group | 5.48372 | 3 | 0.140 |
| alcohol*age_group | 1.32902 | 3 | 0.722 |
| phys*age_group | 6.37845 | 3 | 0.095 |
| GLOBAL | 65.32298 | 55 | 0.161 |

Annex 6. Martingale residuals as a function of model predictions

Source: author’s calculations based on RLMS data

Annex 7. Incidence of chronic diseases as a function of self-perceived health status, %

Source: author’s calculations based on RLMS data


Annex 8: Testing the assumptions of the health risk models

The basic assumptions underlying the Cox proportional hazards model (Cox 1972) are as follows:

  1. All variables used are independent.
  2. Event risks for any two respondents at any time interval are proportional.
  3. The effect of each explanatory variable on the risk of occurrence of an event (health deterioration) is linear.

We tested the first assumption by constructing correlation matrices for the control variables (Figure 1 and Figure 2). Continuous variables enter the matrices unaltered. Each categorical variable is replaced by a set of binary indicators, one for each value it takes (one-hot encoding).

In men, self-perceived health declines with age, as does the share of the employed. Single or widowed men, as expected, were more likely to live alone in the household. All the variables showed weak to moderate correlation, indicating no strong collinearity (0.7 or more) between the control variables.

Women, too, experience a decline in self-perceived health and a loss of employment (probably due to retirement) as they age. The widowed category was on average older and more likely to be associated with living alone. Here, too, no factors were strongly correlated. Therefore, the first assumption of regressor independence holds for the models for both men and women.

It should be noted that pairwise Pearson correlation coefficients capture only the linear relationship between variables and can therefore miss other forms of dependence. Moreover, no substantive conclusions can be drawn from them; we use them only to better trace the characteristics of the data, test the factors for collinearity, and identify potential data errors.
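
The collinearity check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the author's code: the data are invented toy values, the helper names `one_hot` and `pearson` are hypothetical, and the 0.7 threshold is the one quoted in the text.

```python
import math


def one_hot(values):
    """Return {category: list of 0/1 indicators} for a categorical column."""
    categories = sorted(set(values))
    return {c: [1.0 if v == c else 0.0 for v in values] for c in categories}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy data, invented for illustration only (not RLMS values).
age = [34.0, 41.0, 52.0, 58.0, 67.0, 73.0]
marst = [1, 2, 2, 4, 5, 5]  # marital status codes as in Annex 1

# Continuous columns stay as they are; categorical ones become indicators.
columns = {"age": age}
for code, indicator in one_hot(marst).items():
    columns[f"marst_{code}"] = indicator

# Flag every pair whose absolute correlation reaches the 0.7 threshold.
names = list(columns)
strong = []
for i, a in enumerate(names):
    for b in names[i + 1:]:
        r = pearson(columns[a], columns[b])
        if abs(r) >= 0.7:
            strong.append((a, b, round(r, 2)))

print(strong)
```

Indicators derived from the same categorical variable are negatively correlated with each other by construction, so only pairs built from different source variables are informative for the collinearity check.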

Figure 1.

Correlation matrix for the control variables. Men aged 30+. Source: author’s calculations based on RLMS data

We tested the second assumption with the proportionality test for Cox models proposed by Grambsch and Therneau (1994), as implemented in the survival package in R. The test results for the complete model specifications by age-sex groups are given in Annex 5. At the 5% level, the proportionality hypothesis is not rejected for any factor in any model, with the single exception of education in men (p = 0.044, close to the 5% threshold, which is not critical for our analysis), nor for either model as a whole. Therefore, this assumption also holds.
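
The p-values in Annex 5 can be reproduced from the reported chi-square statistics and degrees of freedom. The sketch below does this with the Python standard library only; it recomputes tail probabilities and does not rerun the Grambsch-Therneau test itself. The helper name `chi2_sf` is hypothetical, and the rows are a selection copied from Annex 5.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variable with df
    degrees of freedom, via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > total * 1e-15:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


# (term, chi-square, df, p-value reported in Annex 5)
rows = [
    ("men: educ*age_group", 12.921, 6, 0.044),
    ("men: alone*age_group", 6.935, 3, 0.074),
    ("men: GLOBAL", 90.656, 77, 0.137),
    ("women: marst*age_group", 24.21359, 15, 0.062),
    ("women: GLOBAL", 65.32298, 55, 0.161),
]

for name, chisq, df, reported in rows:
    p = chi2_sf(chisq, df)
    verdict = "rejected at 5%" if p < 0.05 else "not rejected at 5%"
    print(f"{name}: p = {p:.3f} (reported {reported}) -> {verdict}")
```

Under the 5% rule used in the text, only a term whose p-value falls below 0.05 counts as a violation of proportionality.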

We tested the third assumption, linearity, graphically by plotting the martingale residuals against the model predictions. Our variable of interest, marital status, is categorical rather than continuous, whereas linearity is tested only for continuous explanatory variables. We therefore tested the entire model for linearity rather than each continuous variable one by one. See Annex 6 for the diagrams of the complete model specifications for all age-sex groups. For men, there was a low-amplitude non-linearity at the extreme right tail. We consider this non-critical for the purposes of this analysis, as we do not aim at consistent estimation of the coefficients of any continuous variable. Because the non-linearity appears only in extreme cases and has a low amplitude, it will not lead to a significant bias in the estimates for the variables of interest. That said, for men we draw conclusions and generalise the results with caution, checking them against substantive considerations and previously obtained empirical results.
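
A martingale residual is the observed event indicator minus the number of events the model expects for that respondent up to the end of follow-up. The sketch below shows the calculation for a model with no covariates, using the Nelson-Aalen estimate of the cumulative hazard. This is a deliberate simplification: the residuals in Annex 6 come from the full Cox specifications, the data here are invented, and the function names are hypothetical.

```python
def nelson_aalen(times, events):
    """Return {time: cumulative hazard} using the Nelson-Aalen estimator."""
    cumhaz, total = {}, 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        total += failed / at_risk
        cumhaz[t] = total
    return cumhaz


def martingale_residuals(times, events):
    """Residual = event indicator minus cumulative hazard at exit time."""
    cumhaz = nelson_aalen(times, events)
    return [e - cumhaz[t] for t, e in zip(times, events)]


# Toy follow-up data, invented for illustration:
# waves until exit, and 1 = health event, 0 = censored.
times = [2, 3, 3, 5, 6, 8]
events = [1, 0, 1, 1, 0, 1]

residuals = martingale_residuals(times, events)
for t, e, r in zip(times, events, residuals):
    print(f"t={t} event={e} residual={r:+.3f}")
```

The residuals sum to zero by construction and never exceed 1. In a linearity check they are plotted against the model predictions, and a smoothed curve that stays close to zero indicates no systematic misfit.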

Figure 2.

Correlation matrix for the control variables. Women aged 30+. Source: author’s calculations based on RLMS data

Thus, all the basic assumptions of the Cox proportional hazards model are met for the above models.

Information about the author

Akhtemzyanov Rafael Anvarovich – bachelor, Lomonosov Moscow State University, Faculty of economics, Moscow, 119991, Russia. Email:
