. use http://www.stata.com/data/jwooldridge/eacsap/lowbirth, clear
OLS. Here's a regression of low birth weight on AFDC with a dummy for 1990 (time trends) and controls for log physicians per capita, log beds per capita, log per capita income, and log population.
. reg lowbrth d90 afdcprc lphypc lbedspc lpcinc lpopul
Source | SS df MS Number of obs = 100
-------------+------------------------------ F( 6, 93) = 5.19
Model | 33.7710894 6 5.6285149 Prob > F = 0.0001
Residual | 100.834005 93 1.08423661 R-squared = 0.2509
-------------+------------------------------ Adj R-squared = 0.2026
Total | 134.605095 99 1.35964742 Root MSE = 1.0413
------------------------------------------------------------------------------
lowbrth | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
d90 | .5797136 .2761244 2.10 0.038 .0313853 1.128042
afdcprc | .0955932 .0921802 1.04 0.302 -.0874584 .2786448
lphypc | .3080648 .71546 0.43 0.668 -1.112697 1.728827
lbedspc | .2790041 .5130275 0.54 0.588 -.7397668 1.297775
lpcinc | -2.494685 .9783021 -2.55 0.012 -4.4374 -.5519711
lpopul | .739284 .7023191 1.05 0.295 -.6553826 2.133951
_cons | 26.57786 7.158022 3.71 0.000 12.36344 40.79227
------------------------------------------------------------------------------
It seems as if AFDC has a pernicious effect on low birth weight: each percentage point of AFDC participation is associated with an extra tenth of a percentage point of low birth weight, although the coefficient is not significant. A scatterplot shows a positive correlation:
. twoway (scatter lowbrth afdcprc if year==1987, mcolor(blue) ) ///
> (scatter lowbrth afdcprc if year==1990, mcolor(red) ) ///
> , legend( lab(1 "1987") lab(2 "1990") ring(0) pos(5) )
. graph export panelFig2.png, replace
(file panelFig2.png written in PNG format)

Random Effects. Fitting a random-effects model does not solve the problem. I first encode the state abbreviation because xtreg requires numerical id variables.
. encode stateabb, gen(stateid)
. xtreg lowbrth d90 afdcprc lphypc lbedspc lpcinc lpopul, i(stateid) mle
Fitting constant-only model:
Iteration 0: log likelihood = -108.24542
Iteration 1: log likelihood = -107.11904
Iteration 2: log likelihood = -107.04455
Iteration 3: log likelihood = -107.04404
Fitting full model:
Iteration 0: log likelihood = -99.608575
Iteration 1: log likelihood = -99.37118
Iteration 2: log likelihood = -99.370515
Iteration 3: log likelihood = -99.370515
Random-effects ML regression Number of obs = 100
Group variable (i): stateid Number of groups = 50
Random effects u_i ~ Gaussian Obs per group: min = 2
avg = 2.0
max = 2
LR chi2(6) = 15.35
Log likelihood = -99.370515 Prob > chi2 = 0.0177
------------------------------------------------------------------------------
lowbrth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
d90 | .5683237 .2432568 2.34 0.019 .0915491 1.045098
afdcprc | .0308556 .1102612 0.28 0.780 -.1852524 .2469636
lphypc | .283817 .9252746 0.31 0.759 -1.529688 2.097322
lbedspc | .3520403 .641116 0.55 0.583 -.904524 1.608605
lpcinc | -2.269859 1.218653 -1.86 0.063 -4.658374 .1186558
lpopul | .7373689 .9039096 0.82 0.415 -1.034261 2.508999
_cons | 24.99926 9.132684 2.74 0.006 7.099529 42.89899
-------------+----------------------------------------------------------------
/sigma_u | .9453257 .105888 .7589908 1.177407
/sigma_e | .4471777 0 .4471777 .4471777
rho | .8171486 . . .
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 85.88 Prob>=chibar2 = 0.000
The effect of AFDC is much closer to zero and fortunately not significant, but it is still positive, which is the wrong sign. The intra-state correlation over the two years is a remarkable 0.817; persistent state characteristics account for 82% of the variation in the percent with low birth weight that remains after controlling for AFDC participation and all other variables.
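As a check on the arithmetic, the value reported as rho can be recovered from the two standard deviations in the output above. The symbols are the usual variance-components notation and are my own labels, not part of the log.

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{0.9453^2}{0.9453^2 + 0.4472^2}
     \approx \frac{0.8936}{0.8936 + 0.2000}
     \approx 0.817
```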
Fixed-Effects. Fitting a fixed-effects model gives much more reasonable results:
. xtreg lowbrth d90 afdcprc lphypc lbedspc lpcinc lpopul, i(stateid) fe
Fixed-effects (within) regression Number of obs = 100
Group variable (i): stateid Number of groups = 50
R-sq: within = 0.3839 Obs per group: min = 2
between = 0.1741 avg = 2.0
overall = 0.1679 max = 2
F(6,44) = 4.57
corr(u_i, Xb) = -0.9394 Prob > F = 0.0011
------------------------------------------------------------------------------
lowbrth | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
d90 | .1060158 .3090664 0.34 0.733 -.5168667 .7288983
afdcprc | -.1760763 .0903733 -1.95 0.058 -.3582116 .006059
lphypc | 5.894509 2.816689 2.09 0.042 .2178452 11.57117
lbedspc | -1.576195 .8852111 -1.78 0.082 -3.360221 .2078308
lpcinc | -.8455268 1.356773 -0.62 0.536 -3.579924 1.88887
lpopul | 3.441116 2.872175 1.20 0.237 -2.347372 9.229604
_cons | -4.0138 22.97888 -0.17 0.862 -50.32468 42.29708
-------------+----------------------------------------------------------------
sigma_u | 3.0975315
sigma_e | .18464547
rho | .99645917 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(49, 44) = 59.46 Prob > F = 0.0000
Now each percentage point increase in AFDC is associated with a decline of almost two-tenths of a percentage point in low birth weight. The coefficient of log physicians per capita is highly suspect; this is due to high correlation with the other predictors, most notably the log of population. In fact, once we have state fixed effects we don't really need the other controls:
. xtreg lowbrth d90 afdcprc, i(stateid) fe
Fixed-effects (within) regression Number of obs = 100
Group variable (i): stateid Number of groups = 50
R-sq: within = 0.2602 Obs per group: min = 2
between = 0.0948 avg = 2.0
overall = 0.0694 max = 2
F(2,48) = 8.44
corr(u_i, Xb) = -0.4366 Prob > F = 0.0007
------------------------------------------------------------------------------
lowbrth | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
d90 | .2124736 .0542377 3.92 0.000 .1034214 .3215259
afdcprc | -.168598 .0907986 -1.86 0.069 -.3511609 .0139649
_cons | 7.267396 .3411409 21.30 0.000 6.581486 7.953306
-------------+----------------------------------------------------------------
sigma_u | 1.2476272
sigma_e | .19372976
rho | .97645624 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(49, 48) = 65.53 Prob > F = 0.0000
Within State. One way to see what's going on is to plot the state-specific changes over time. The easiest way to do this is to reshape the data so we have the two observations for each state on the same row.
. preserve
. keep stateid d90 afdcprc lowbrth
. reshape wide afdcprc lowbrth, i(stateid) j(d90)
(note: j = 0 1)
Data long -> wide
-----------------------------------------------------------------------------
Number of obs. 100 -> 50
Number of variables 4 -> 5
j variable (2 values) d90 -> (dropped)
xij variables:
afdcprc -> afdcprc0 afdcprc1
lowbrth -> lowbrth0 lowbrth1
-----------------------------------------------------------------------------
. twoway (pcspike lowbrth0 afdcprc0 lowbrth1 afdcprc1) ///
> (scatter lowbrth0 afdcprc0, color(blue) ) ///
> (scatter lowbrth1 afdcprc1, mcolor(red) ) ///
> , legend( order(2 "1987" 3 "1990") ring(0) pos(10)) ///
> xtitle("% AFDC") ytitle("% Low Birth Weight") ///
> title(Low Birth Weight and AFDC)
. graph export panelFig3.png, replace
(file panelFig3.png written in PNG format)

Clearly, states with (historically) high percentages of low birth weight tend to have high AFDC participation, but within each state an increase in AFDC is associated with a decline in low birth weight.
FE and Differencing. We can see the last point more clearly by computing differences:
. gen dlowbrth = lowbrth1 - lowbrth0
. gen dafdc = afdcprc1 - afdcprc0
. scatter dlowbrth dafdc, title(Regression on Differences) ///
> xtitle(change in AFDC) ytitle(change in low birth wgt)
. graph export panelFig4.png, replace
(file panelFig4.png written in PNG format)
. reg dlowbrth dafdc
Source | SS df MS Number of obs = 50
-------------+------------------------------ F( 1, 48) = 3.45
Model | .258802651 1 .258802651 Prob > F = 0.0695
Residual | 3.60299693 48 .075062436 R-squared = 0.0670
-------------+------------------------------ Adj R-squared = 0.0476
Total | 3.86179958 49 .078812236 Root MSE = .27398
------------------------------------------------------------------------------
dlowbrth | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dafdc | -.168598 .0907986 -1.86 0.069 -.3511609 .0139649
_cons | .2124736 .0542377 3.92 0.000 .1034214 .3215259
------------------------------------------------------------------------------
. restore

As you can see, running OLS on differences gives the same estimates as fixed effects: the constant is the coefficient of d90 and the slope is the coefficient of afdcprc.
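A short derivation shows why the two must agree when there are only two periods. The notation is assumed for illustration: y is lowbrth, x is afdcprc, alpha_i is the state effect, delta is the coefficient of d90, and beta is the coefficient of afdcprc.

```latex
\begin{aligned}
y_{i,87} &= \alpha_i + \beta\, x_{i,87} + e_{i,87}, \\
y_{i,90} &= \alpha_i + \delta + \beta\, x_{i,90} + e_{i,90}, \\
y_{i,90} - y_{i,87} &= \delta + \beta\,(x_{i,90} - x_{i,87}) + (e_{i,90} - e_{i,87}).
\end{aligned}
```

Subtracting removes the state effect, leaving the 1990 effect as the constant and the AFDC coefficient as the slope. With two observations per state the deviation of each observation from its state mean is plus or minus half the difference, so the within regression uses exactly the same information.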
FE and Dummy Variables. You may also verify that you get the same estimates using state dummies.
. quietly tab stateid, gen(statedummy)
. reg lowbrth afdcprc d90 statedummy2-statedummy50
Source | SS df MS Number of obs = 100
-------------+------------------------------ F( 51, 48) = 69.38
Model | 132.803596 51 2.60399208 Prob > F = 0.0000
Residual | 1.80149846 48 .037531218 R-squared = 0.9866
-------------+------------------------------ Adj R-squared = 0.9724
Total | 134.605095 99 1.35964742 Root MSE = .19373
------------------------------------------------------------------------------
lowbrth | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
afdcprc | -.168598 .0907986 -1.86 0.069 -.3511609 .0139649
d90 | .2124736 .0542377 3.92 0.000 .1034214 .3215259
statedummy2 | 3.274314 .2052142 15.96 0.000 2.861703 3.686925
statedummy3 | 2.974756 .2154956 13.80 0.000 2.541473 3.408039
statedummy4 | 1.483325 .2036652 7.28 0.000 1.073829 1.892821
statedummy5 | 1.520203 .297898 5.10 0.000 .9212386 2.119167
statedummy6 | 2.99607 .2107214 14.22 0.000 2.572386 3.419753
statedummy7 | 1.80353 .1953395 9.23 0.000 1.410774 2.196287
statedummy8 | 2.215132 .2068979 10.71 0.000 1.799136 2.631129
statedummy9 | 2.565411 .2177467 11.78 0.000 2.127602 3.00322
statedummy10 | 3.723224 .1977027 18.83 0.000 3.325716 4.120732
statedummy11 | 2.230679 .194009 11.50 0.000 1.840598 2.62076
statedummy12 | .3784638 .1975233 1.92 0.061 -.0186834 .775611
statedummy13 | .4705355 .2815929 1.67 0.101 -.0956451 1.036716
statedummy14 | 3.005836 .2542833 11.82 0.000 2.494565 3.517106
statedummy15 | 1.554514 .2204879 7.05 0.000 1.111193 1.997834
statedummy16 | 1.316635 .2174467 6.05 0.000 .8794287 1.753841
statedummy17 | 2.294124 .2087002 10.99 0.000 1.874504 2.713744
statedummy18 | 4.54742 .2886875 15.75 0.000 3.966974 5.127865
statedummy19 | 1.058487 .1962737 5.39 0.000 .6638525 1.453122
statedummy20 | 3.002687 .1937352 15.50 0.000 2.613156 3.392218
statedummy21 | .5717069 .2045176 2.80 0.007 .1604968 .982917
statedummy22 | 3.134765 .3470942 9.03 0.000 2.436885 3.832645
statedummy23 | .2373309 .1938499 1.22 0.227 -.1524304 .6270921
statedummy24 | 2.265756 .1939155 11.68 0.000 1.875863 2.655649
statedummy25 | 4.968533 .3181213 15.62 0.000 4.328907 5.608159
statedummy26 | .9513923 .2008765 4.74 0.000 .5475029 1.355282
statedummy27 | 3.025681 .2049726 14.76 0.000 2.613556 3.437806
statedummy28 | .1260105 .2435249 0.52 0.607 -.3636291 .6156501
statedummy29 | .388946 .224612 1.73 0.090 -.0626668 .8405588
statedummy30 | -.3260747 .3003073 -1.09 0.283 -.929883 .2777336
statedummy31 | 2.230251 .1944136 11.47 0.000 1.839357 2.621146
statedummy32 | 2.48739 .1947735 12.77 0.000 2.095772 2.879009
statedummy33 | 1.89427 .2724586 6.95 0.000 1.346455 2.442084
statedummy34 | 3.091241 .249264 12.40 0.000 2.590062 3.592419
statedummy35 | 2.392128 .2673585 8.95 0.000 1.854568 2.929688
statedummy36 | 1.7834 .197022 9.05 0.000 1.387261 2.17954
statedummy37 | .2708986 .2058283 1.32 0.194 -.142947 .6847443
statedummy38 | 2.296465 .2005746 11.45 0.000 1.893183 2.699748
statedummy39 | 1.422612 .2046743 6.95 0.000 1.011087 1.834137
statedummy40 | 3.761942 .1994498 18.86 0.000 3.360921 4.162963
statedummy41 | .120629 .2297615 0.53 0.602 -.3413375 .5825954
statedummy42 | 3.388317 .1948257 17.39 0.000 2.996594 3.78004
statedummy43 | 2.01164 .1994886 10.08 0.000 1.610541 2.412738
statedummy44 | .6772191 .227873 2.97 0.005 .2190497 1.135388
statedummy45 | 1.998448 .2363985 8.45 0.000 1.523137 2.473759
statedummy46 | .509038 .1937909 2.63 0.012 .1193953 .8986807
statedummy47 | .6230275 .2047466 3.04 0.004 .211357 1.034698
statedummy48 | 1.058406 .2238938 4.73 0.000 .6082373 1.508575
statedummy49 | 2.631958 .2636139 9.98 0.000 2.101927 3.161989
statedummy50 | 2.597225 .2104775 12.34 0.000 2.174031 3.020418
_cons | 5.367278 .3705424 14.48 0.000 4.622252 6.112303
------------------------------------------------------------------------------
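If you would rather not create 49 dummy variables, here is a minimal sketch of two alternatives. It assumes a version of Stata with factor-variable notation (Stata 11 or later), and it reloads the data so that it can run on its own. Both commands should reproduce the coefficients of afdcprc and d90 shown above.

```stata
* Reload the data and build the numeric state id, as earlier in these notes.
use http://www.stata.com/data/jwooldridge/eacsap/lowbirth, clear
encode stateabb, gen(stateid)

* Factor-variable notation builds the state indicators on the fly,
* so no statedummy* variables are left in the dataset.
regress lowbrth afdcprc d90 i.stateid

* areg absorbs the state indicators instead of reporting them.
areg lowbrth afdcprc d90, absorb(stateid)
```

The second form is handy when the state coefficients themselves are of no interest, because the output lists only the two predictors.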
Group Means. Another interesting way to look at the data is to regress the percent with low birth weight on two AFDC variables: the state average over the two years and the state's deviation from its average in a given year.
. egen mafdcprc = mean(afdcprc), by(stateid)
. gen dafdcprc = afdcprc - mafdcprc
. reg lowbrth d90 mafdcprc dafdcprc
Source | SS df MS Number of obs = 100
-------------+------------------------------ F( 3, 96) = 3.47
Model | 13.1585351 3 4.38617836 Prob > F = 0.0192
Residual | 121.44656 96 1.26506833 R-squared = 0.0978
-------------+------------------------------ Adj R-squared = 0.0696
Total | 134.605095 99 1.35964742 Root MSE = 1.1248
------------------------------------------------------------------------------
lowbrth | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
d90 | .2124736 .3148923 0.67 0.501 -.4125827 .8375299
mafdcprc | .2716251 .0863252 3.15 0.002 .100271 .4429792
dafdcprc | -.168598 .5271571 -0.32 0.750 -1.214996 .8778005
_cons | 5.526764 .3923577 14.09 0.000 4.74794 6.305588
------------------------------------------------------------------------------
As you can see, the coefficient of the state's AFDC deviation from the state mean is exactly the same as the fixed-effects or within-group estimator, although its standard error is different because this regression ignores the state effects. It turns out that the coefficient of the mean AFDC level for each state is the between-group estimator. These results show clearly that the relationship across states is positive but within each state is negative.
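One way to check the claim about the between-group estimator is to ask xtreg for it directly. This is a sketch that reloads the data so it stands alone; it uses xtset rather than the i() option used above, which is an assumption about the Stata version at hand. The coefficient of afdcprc should match the coefficient of mafdcprc in the regression above.

```stata
* Reload the data and declare the panel variable.
use http://www.stata.com/data/jwooldridge/eacsap/lowbirth, clear
encode stateabb, gen(stateid)
xtset stateid

* Between estimator: regression of state means on state means.
* d90 is left out because its state mean is 0.5 in every state.
xtreg lowbrth afdcprc, be
```

The agreement relies on the panel being balanced, with exactly two observations per state, so that both d90 and the within-state deviations are uncorrelated with the state means.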
As noted earlier, the random-effects estimator is a weighted average of the within- and between-group estimators. It really should not be used when the two coefficients are very different, for example when they have different signs.
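To make the averaging explicit, here is the standard result for the simplified case of a single predictor x with T observations per group. The models above also include d90, so the matrix version of this formula applies there. The symbols W, B and psi are my own labels.

```latex
\hat\beta_{RE} =
  \frac{W_{xx}\,\hat\beta_{W} + \psi\, B_{xx}\,\hat\beta_{B}}
       {W_{xx} + \psi\, B_{xx}},
\qquad
\psi = \frac{\sigma_e^2}{\sigma_e^2 + T\,\sigma_u^2},
```

```latex
W_{xx} = \sum_i \sum_t (x_{it} - \bar{x}_i)^2,
\qquad
B_{xx} = T \sum_i (\bar{x}_i - \bar{x})^2 .
```

When the state variance is large, psi approaches zero and the random-effects estimate approaches the within estimate; when the state variance is zero, psi is one and the estimate reduces to pooled OLS.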