Germán Rodríguez
Multilevel Models Princeton University

## A Model With Serial Correlation

Repeated measurements often exhibit serial correlation, with outcomes closer it time more highly correlated that outcomes further apart. Multilevel models have an exchangeable error structure, where all outcomes in a group are equally correlated, but may be adapted to accomodate serial correlation.

We will analyze the example in Goldstein (1995, Sections 6.4 and 6.5 starting on page 91) using the MLwiN macros described in Yang et al (2001) MLwiN Macros for advanced multilevel modelling, which you may download from MLwiN macros.

#### A Seasonal Growth Model

The data are available in a worksheet supplied with MLwiN, `oxboys.ws2`, containing height measurements on 26 boys measured on nine occassions between the ages of 11 and 14. The data have already been stacked in long format. The variables of interest are height and age, which is modelled using a fourth-degree polynomial. Variable 'occ' indicates the occassions and 'seas' is the month when measures were taken.

If you retrieve the worksheet you will note that it already has defined the two-level structure and has fitted a simple 2-level model with a fourth-degree polynomial on age. You may want to type `fixed` and `random` and compare the results with Yang et al (2oo1). I will start from scratch to document all steps:

```
retrieve c:\program files\mlwin1.10\oxboys.ws2
response "height"
ident 1 "occ" 2 "id"
expla 1 "cons" "age" "age2" "age3" "age4"
setv 1 "cons"
setv 2 "cons" "age" "age2"
```

Note that only the constant, linear and square terms on age are random at the student level. You may want to fit this model and compare your results with Yang et al (2001). I will go ahead and add a seasonality term. As noted in Goldstein (1995, p.92), if the seasonal component has amplitude a and phase g we can write

a cos(t+g) = a1cos(t) - a2sin(t), For this data the second coefficient is very close to zero and will be omitted.

```
calc c10 = 3.1416 * 'seas'/6
cos c10 c11
name c11 'cos'
expl 1 'cos'
batch 1
start
fixed
random
```

These are my results, reproducing Table 6.4 in Goldstein (1995, p.93)

```
fixed
PARAMETER            ESTIMATE     S. ERROR(U)   PREV. ESTIMATE
cons                      148.9         1.54              148.9
age                       6.189       0.3485              6.187
age2                      2.167       0.4504              2.167
age3                     0.3918       0.1565             0.3918
age4                     -1.553       0.4423             -1.553
cos                     -0.2367      0.06769            -0.2367

random
LEV.  PARAMETER       (NCONV)    ESTIMATE    S. ERROR(U)  PREV. ESTIM     CORR.
-------------------------------------------------------------------------------
2    cons     /cons     ( 2)       61.51           17.1        61.61         1
2    age      /cons     ( 2)       7.979           3.02        7.999     0.613
2    age      /age      ( 2)       2.754         0.7794        2.758         1
2    age2     /cons     ( 2)       1.349          1.414        1.356     0.213
2    age2     /age      ( 1)      0.8776         0.3432       0.8788     0.654
2    age2     /age2     ( 1)      0.6547         0.2272       0.6549         1
-------------------------------------------------------------------------------
1    cons     /cons     ( 1)      0.1991        0.02254       0.1991
```

We will now consider a model where level-1 residuals for the same individual are correlated and the magnitude of the correlation depends on the time difference between the measurements. Specifically we will assume that

cov(et,j,et-s,j) = s2e exp{- g s}

So the correlation between residuals s time-units apart is exp{- g s} and decays exponentially with s. (The correlation between outcomes will also involve level-2 residuals.)

To fit this model make sure you have downloaded the TS macros, and have adjusted the MLwiN settings to use the right files. As usual I will provide a script instead of using the GUI. First I define the folder where the macros are located and the files to be used for pre and post estimation. The time-series macros require the time variable to be called T. The other commands are explained in comments:

```
note the time variable must be called T
name c2 "T"
note define the location of the pre and post macros
fpath c:\program files\mlwiN1.10\ts
prefile pre.ts
postfile post.ts
note turn on time series switch
set b10 1
note put power in c185 (we use 1)
join c185 1 c185
note put the starting value in c201 (we use 20)
join c201 20 c201
note turn off switches for other models
set b11 0
set b12 0
set b13 0
note set maximum number of iterations to 50
maxi 50
```

At this stage I fitted the model interactively by pressing the Start button. The procedure did four iterations. The results are not all shown in the Equation window, so you have to use `random` to look at the random parameters. To ensure that 's1' has converged press More and use `random` a couple of times.

Here are my results, in general agreement with Table 6.5 in Goldstein (1995, p. 93).

```
fixed
PARAMETER            ESTIMATE     S. ERROR(U)   PREV. ESTIMATE
cons                      148.9        1.539              148.9
t                         6.191       0.3509              6.191
age2                      2.163       0.4494              2.163
age3                     0.3863        0.169             0.3863
age4                     -1.548       0.4294             -1.548
cos                      -0.236      0.06733             -0.236

random
LEV.  PARAMETER       (NCONV)    ESTIMATE    S. ERROR(U)  PREV. ESTIM     CORR.
-------------------------------------------------------------------------------
2    cons     /cons     ( 8)       61.48          17.07        61.48         1
2    t        /cons     ( 8)        7.93          2.992         7.93     0.618
2    t        /t        ( 6)        2.68         0.7655         2.68         1
2    age2     /cons     ( 5)       1.479          1.402        1.479     0.249
2    age2     /t        ( 5)      0.8525         0.3357       0.8525     0.687
2    age2     /age2     ( 5)      0.5745         0.2284       0.5746         1
2    t(0)     *         ( 5)      0.2346         0.0433       0.2346
2    s1       *         ( 4)       6.908          2.082         6.91
-------------------------------------------------------------------------------
1    cons     /cons     ( 5)      0.2346         0.0433       0.2346

```
The interesting parameter is 6.908. To see what this implies in terms of serial correlation we compute exp{- g s} for S=0.25, 0.50, 0.75 and 1, which corresponds to observations 3, 6, 9 and 12 months apart:
```
join c50 .25 .5 .75 1 c50
calc c51 = expo(-6.908 * c50)
print c51
```

The result is

```
print c51
c51
N =      4
1  0.17782
2  0.031619
3  0.0056224
4  0.00099976
```

We see that the correlation is 0.18 for residuals three months apart, and declines to 0.03 for residuals six months apart.