Home | GLMs | Multilevel | Survival | Demography | Stata | R
Home Lecture Notes Stata Logs Datasets Problem Sets

6.5 Ordered Logit Models

We now turn our attention to models for ordered categorical outcomes. Obviously the multinomial and sequential logit models can be applied as well, but they make no explicit use of the fact that the categories are ordered. The models considered here are specifically designed for ordered data.

Housing Conditions in Copenhagen

We will use data from 1681 residents of twelve areas in Copenhagen, classified in terms of the type of housing they have (tower blocks, apartments, atrium houses and terraced houses), their feeling of influence on apartment management (low, medium, high), their degree of contact with the neighbors (low, high), and their satisfaction with housing conditions (low, medium, high).

The data are available in the datasets page and can be read directly from there:

. use http://data.princeton.edu/wws509/datasets/copen, clear
(Housing Conditions in Copenhagen)

We will treat satisfaction as the outcome and type of housing, feeling of influence and contact with the neighbors as categorical predictors.

It will be useful for comparison purposes to fit the saturated multinomial logit model, where each of the 24 combinations of housing type, influence and contact, has its own distribution. The group code can easily be generated from the observation number, and the easiest way to fit the model is to treat the code as a factor variable. If you are running an earlier version of Stata try the xi: prefix.

. gen group = int((_n-1)/3)+1
 
. quietly mlogit satisfaction i.group [fw=n]
 
. estimates store sat
 
. di e(ll)
-1715.7108

The log likelihood is -1715.7.

The Proportional Odds Model

The next task is to fit the additive ordered logit model from Table 6.5 in the notes. We could use factor variables for simplicity, but will construct dummy variables instead

. gen apart   = housing == 2
 
. gen atrium  = housing == 3
 
. gen terrace = housing == 4
 
. local housing apart atrium terrace
 
. gen influenceMed  = influence == 2
 
. gen influenceHi   = influence == 3
 
. local influence influenceMed influenceHi
 
. gen contactHi    = contact == 2

With that done, here's the additive model

. local housing apart atrium terrace
 
. local influence influenceMed influenceHi
 
. ologit satis `housing' `influence' contactHi [fw=n]
 
Iteration 0:   log likelihood = -1824.4388  
Iteration 1:   log likelihood = -1739.8163  
Iteration 2:   log likelihood = -1739.5747  
Iteration 3:   log likelihood = -1739.5746  
 
Ordered logistic regression                       Number of obs   =       1681
                                                  LR chi2(6)      =     169.73
                                                  Prob > chi2     =     0.0000
Log likelihood = -1739.5746                       Pseudo R2       =     0.0465
 
------------------------------------------------------------------------------
satisfaction |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       apart |  -.5723499    .119238    -4.80   0.000    -.8060521   -.3386477
      atrium |  -.3661863   .1551733    -2.36   0.018    -.6703205   -.0620522
     terrace |  -1.091015    .151486    -7.20   0.000    -1.387922   -.7941074
influenceMed |   .5663937   .1046528     5.41   0.000      .361278    .7715093
 influenceHi |   1.288819   .1271561    10.14   0.000     1.039597     1.53804
   contactHi |    .360284   .0955358     3.77   0.000     .1730372    .5475307
-------------+----------------------------------------------------------------
       /cut1 |   -.496135   .1248472                     -.7408311   -.2514389
       /cut2 |   .6907081   .1254719                      .4447876    .9366286
------------------------------------------------------------------------------
 
. estimates store additive
 
. lrtest . sat, force
 
Likelihood-ratio test                                  LR chi2(40) =     47.73
(Assumption: additive nested in sat)                   Prob > chi2 =    0.1874

The log-likelihood is -1739.6, so the deviance for this model compared to the saturated multinomial model is 47.7 on 40 d.f. This is a perfectly valid test because the models are nested, but Stata is cautious and if you type lrtest . sat it will complain that the test involves different estimators: mlogit vs. ologit. Fortunately we can insist with the force option, which is what I have done. Use cautiously!

The bottom line is that the deviance is not much more than one would expect when saving 40 parameters, so we have no evidence against the additive model. To be thorough, however, we will explore individual interactions just in case the deviance is concentrated on a few d.f.

Models with Interactions

The next step is to explore two-factor interactions. We can use factor variables to simplify our search:

. quietly ologit satis i.housing#i.influence i.contact [fw=n]
 
. lrtest . sat, force
 
Likelihood-ratio test                                  LR chi2(34) =     25.22
(Assumption: . nested in sat)                          Prob > chi2 =    0.8623
 
. quietly ologit satis i.housing#i.contact i.influence [fw=n]
 
. lrtest . sat, force
 
Likelihood-ratio test                                  LR chi2(37) =     39.06
(Assumption: . nested in sat)                          Prob > chi2 =    0.3773
 
. quietly ologit satis i.housing i.influence#i.contact [fw=n]
 
. lrtest . sat, force
 
Likelihood-ratio test                                  LR chi2(38) =     47.52
(Assumption: . nested in sat)                          Prob > chi2 =    0.1385

The interaction between housing and influence reduces the deviance by about half at the expense of only six d.f., so it is worth a second look. The interaction between housing and contact makes a much smaller dent, and the interaction between influence and contact adds practically nothing. (we could have compared each of these models to the additive model, thus testing the interaction directly. We would get chi2 of 22.51 on 6 d.f., 8.67 on 3 d.f. and 0.21 on 2 d.f.)

Clearly the interaction to add is the first one, allowing the association between satisfaction with housing and a feeling of influence on management net of contact with neighbors to depend on the type of housing. To examine parameter estimates we build the dummy variables and refit the model:

. gen apartXinfMed = apart * influenceMed
 
. gen apartXinfHi  = apart * influenceHi
 
. gen atriuXinfMed = atrium * influenceMed
 
. gen atriuXinfHi  = atrium * influenceHi
 
. gen terrXinfMed  = terrace * influenceMed
 
. gen terrXinfHi   = terrace * influenceHi
 
. local housingXinf apartXinfMed-terrXinfHi
 
. ologit satis `housing' `influence' `housingXinf' contactHi [fw=n] 
 
Iteration 0:   log likelihood = -1824.4388  
Iteration 1:   log likelihood = -1728.6182  
Iteration 2:   log likelihood = -1728.3201  
Iteration 3:   log likelihood =   -1728.32  
 
Ordered logistic regression                       Number of obs   =       1681
                                                  LR chi2(12)     =     192.24
                                                  Prob > chi2     =     0.0000
Log likelihood =   -1728.32                       Pseudo R2       =     0.0527
 
------------------------------------------------------------------------------
satisfaction |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       apart |  -1.188494   .1972418    -6.03   0.000    -1.575081   -.8019072
      atrium |  -.6067061   .2445664    -2.48   0.013    -1.086047   -.1273647
     terrace |  -1.606231   .2409971    -6.66   0.000    -2.078576   -1.133885
influenceMed |  -.1390175   .2125483    -0.65   0.513    -.5556044    .2775694
 influenceHi |   .8688638   .2743369     3.17   0.002     .3311733    1.406554
apartXinfMed |   1.080868   .2658489     4.07   0.000     .5598135    1.601922
 apartXinfHi |   .7197816   .3287309     2.19   0.029     .0754809    1.364082
atriuXinfMed |     .65111   .3450048     1.89   0.059    -.0250869    1.327307
 atriuXinfHi |  -.1555515   .4104826    -0.38   0.705    -.9600826    .6489795
 terrXinfMed |   .8210056   .3306666     2.48   0.013      .172911      1.4691
  terrXinfHi |   .8446195   .4302698     1.96   0.050     .0013062    1.687933
   contactHi |    .372082   .0959868     3.88   0.000     .1839514    .5602126
-------------+----------------------------------------------------------------
       /cut1 |  -.8881686   .1671554                     -1.215787     -.56055
       /cut2 |   .3126319   .1656627                      -.012061    .6373249
------------------------------------------------------------------------------
 
. lrtest . sat, force
 
Likelihood-ratio test                                  LR chi2(34) =     25.22
(Assumption: . nested in sat)                          Prob > chi2 =    0.8623
 
. lrtest additive .
 
Likelihood-ratio test                                  LR chi2(6)  =     22.51
(Assumption: additive nested in .)                     Prob > chi2 =    0.0010

The model deviance of 25.2 on 34 d.f. is not significant. To test for the interaction effect we compare this model with the additive, obtaining a chi-squared statistic of 22.5 on six d.f., which is significant at the 0.001 level.

At this point one might consider adding a second interaction. The obvious choice is to allow the association between satisfaction and contact with neighbors to depend on the type of housing. This would reduce the deviance by 7.95 at the expense of three d.f., a gain that just makes the conventional 5% cutoff with a p-value of 0.047. In the interest of simplicity we will not pursue this addition.

Interpretation of Parameter Estimates

The estimates indicate that respondents who have high contact with their neighbors are more satisfied than respondents with low contact who live in the same type of housing and have the same feeling of influence on management. The difference is estimated as 0.372 units in the underlying logistic scale. Dividing by the standard deviation of the (standard) logistic distribution we obtain

. display _b[contactHi]/(_pi/sqrt(3))
.20513955

So the difference in satisfaction between high and low contact with neighbors among respondents with the same housing and influence is 0.205 standard deviations.

Alternatively, we can exponentiate the coefficient:

. di exp(_b[contactHi])
1.4507519

The odds of reporting high satisfaction (relative to medium or low), are 45% higher among respondents who have high contact with the neighbors than among tenants with low contact in the same type of housing and influence. The odds of reporting medium or high satisfaction (as opposed to low) are also 45% higher in this group.

Interpretation of the effects of housing type and influence requires taking into account the interaction effect. In the notes we describe differences by housing type among those who feel they have little influence in management, and the effects of influence in each type of housing. In short, among tenants who feel they have little influence the most satisfied are those in tower blocks, followed by residents of atrium houses, apartments and terraced houses, in that order. Feeling that one has some influence in management is associated with higher satisfaction, except in tower blocks where it has no effect. Feeling that one has high influence is associated with higher satisfaction everywhere, except perhaps in atrium houses.

Estimation of Probabilities

Let us consider predicted probabilities. Just as in multinomial logit models, the predict command computes predicted probabilities (the default) or logits. With probabilities you need to specify an output variable for each response category. With logits you specify just one variable which stores the linear predictor x'β, without the cutpoints. Let us predict the probabilities for everyone

. predict pSatLow pSatMed pSatHigh
(option pr assumed; predicted probabilities)

We'll look at these results for tower block dwellers with little influence and with high and low contact with neighbors. The first of these groups is, of course, the reference cell. In the listing I add the condition sat==1 to list the probabilities just once for each group:

. list contact pSatLow pSatMed pSatHigh ///
>   if housing==1 & influence==1 & sat==1, clean
 
       contact    pSatLow    pSatMed   pSatHigh  
  1.       low   .2914879   .2860397   .4224724  
  4.      high   .2209308   .2642111   .5148581  

We see that among tower tenants with low influence, those with high contact with their neighbors have a higher probability of high satisfaction and a lower probability of medium or low satisfaction, than those with low contact with the neighbors.

It is instructive to reproduce these calculations 'by hand'. For the reference cell all we need are the cutpoints. Remember that the model predicts cumulative probabilities, which is why we difference the results.

. scalar c1 = _b[/cut1]
 
. scalar c2 = _b[/cut2]
 
. di invlogit(c1), invlogit(c2)-invlogit(c1),1-invlogit(c2)
.2914879 .28603966 .42247244

For the group with high contact we need to subtract the corresponding coefficient from the cutpoints. The change of sign is needed to convert coefficients from the latent variable to the cumulative probability formulations (or from upper to lower tails).

. scalar h1 = c1 - _b[contactHi]
 
. scalar h2 = c2 - _b[contactHi]
 
. di invlogit(h1), invlogit(h2)-invlogit(h1),1-invlogit(h2)
.22093075 .26421111 .51485814

Results agree exactly with the outpout of the predict command.

The Ordered Probit Model

We now consider ordered probit models, starting with the additive model in Table 6.6:

. oprobit satis `housing' `influence' contactHi [fw=n]
 
Iteration 0:   log likelihood = -1824.4388  
Iteration 1:   log likelihood = -1739.9254  
Iteration 2:   log likelihood = -1739.8444  
Iteration 3:   log likelihood = -1739.8444  
 
Ordered probit regression                         Number of obs   =       1681
                                                  LR chi2(6)      =     169.19
                                                  Prob > chi2     =     0.0000
Log likelihood = -1739.8444                       Pseudo R2       =     0.0464
 
------------------------------------------------------------------------------
satisfaction |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       apart |  -.3475367   .0722909    -4.81   0.000    -.4892244   -.2058491
      atrium |  -.2178875   .0947661    -2.30   0.021    -.4036256   -.0321495
     terrace |  -.6641735      .0918    -7.24   0.000    -.8440983   -.4842487
influenceMed |   .3464228   .0641371     5.40   0.000     .2207164    .4721291
 influenceHi |   .7829146   .0764262    10.24   0.000      .633122    .9327072
   contactHi |   .2223858   .0581227     3.83   0.000     .1084675    .3363042
-------------+----------------------------------------------------------------
       /cut1 |  -.2998279   .0761537                     -.4490865   -.1505693
       /cut2 |   .4267208   .0764043                      .2769711    .5764706
------------------------------------------------------------------------------
 
. lrtest . sat, force
 
Likelihood-ratio test                                  LR chi2(40) =     48.27
(Assumption: . nested in sat)                          Prob > chi2 =    0.1734

The model has a log-likelihood of -1739.8, a little bit below that of the additive ordered logit. This is also reflected in the slightly higher deviance.

Next we add the housing by influence interaction

. oprobit satis `housing' `influence' `housingXinf' contactHi [fw=n]
 
Iteration 0:   log likelihood = -1824.4388  
Iteration 1:   log likelihood = -1728.7767  
Iteration 2:   log likelihood = -1728.6654  
Iteration 3:   log likelihood = -1728.6654  
 
Ordered probit regression                         Number of obs   =       1681
                                                  LR chi2(12)     =     191.55
                                                  Prob > chi2     =     0.0000
Log likelihood = -1728.6654                       Pseudo R2       =     0.0525
 
------------------------------------------------------------------------------
satisfaction |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       apart |  -.7280621   .1205029    -6.04   0.000    -.9642434   -.4918808
      atrium |  -.3720768   .1510259    -2.46   0.014    -.6680821   -.0760716
     terrace |  -.9789998   .1455862    -6.72   0.000    -1.264343   -.6936561
influenceMed |  -.0863672    .130327    -0.66   0.508    -.3418033     .169069
 influenceHi |   .5164514   .1639345     3.15   0.002     .1951457    .8377571
apartXinfMed |   .6600102   .1625787     4.06   0.000     .3413618    .9786586
 apartXinfHi |   .4479134   .1970667     2.27   0.023     .0616698     .834157
atriuXinfMed |   .4108389   .2133778     1.93   0.054     -.007374    .8290517
 atriuXinfHi |  -.0779656   .2496472    -0.31   0.755    -.5672652    .4113339
 terrXinfMed |    .496378   .2016362     2.46   0.014     .1011783    .8915777
  terrXinfHi |   .5216698   .2587276     2.02   0.044     .0145731    1.028767
   contactHi |   .2284567   .0583151     3.92   0.000     .1141612    .3427522
-------------+----------------------------------------------------------------
       /cut1 |  -.5439821   .1023487                     -.7445818   -.3433824
       /cut2 |    .189167   .1018442                     -.0104438    .3887779
------------------------------------------------------------------------------
 
. lrtest . sat, force
 
Likelihood-ratio test                                  LR chi2(34) =     25.91
(Assumption: . nested in sat)                          Prob > chi2 =    0.8387

We now have a log-likelihood of -1728.7 and a deviance of 25.9. which is almost indistinguishable from the corresponding ordered logit model.

The estimates indicate that tenants with high contact with the neighbors are 0.228 standard deviations higher in the latent satisfaction scale than tenants with low contact, who live in the same type of housing and have the same feeling of influence in management. Recall that the comparable logit estimate was 0.205.

The probabilities for the two groups compared earlier can be computed using the predict command or more instructively 'by hand', using exactly the same code as before but with the normal() c.d.f. instead of the logistic c.d.f. invlogit()

. scalar z1 = _b[/cut1]
 
. scalar z2 = _b[/cut2]
 
. di normal(z1), normal(z2)-normal(z1),1-normal(z2)
.29322689 .28179216 .42498095
 
. scalar h1 = z1 - _b[contactHi]
 
. scalar h2 = z2 - _b[contactHi]
 
. di normal(h1), normal(h2)-normal(h1),1-normal(h2)
.21992729 .26440244 .51567027

The main thing to note here is that the results are very close to the corresponding predictions based on the ordered logit model.

The Proportional Hazards Model

The third model mentioned in the lecture notes uses a complementary log-log link and has a proportional hazards interpretation. Unfortunately, this model can not be fit to ordered multinomial data using Stata. It is, of course, possible to fit c-log-log models to binary data, and proportional hazards models to survival data, as we will see in the next chapter.


Continue with 7. Survival Models