Simulated Datasets

Our 1995 paper relied on simulations of three-level variance-components models with moderate and large random effects at levels 2 and 3, using the same nested structure as the actual survey data from Guatemala.

The datasets available here represent 100 simulations of the structure with large (variance σ² = 1) random effects at both level 2 (family) and level 3 (community).

The simulated datasets have been zipped into four archives with 25 datasets each:

File          Simulations   Size
rgs3bb1.zip   1-25          916 KB
rgs3bb2.zip   26-50         917 KB
rgs3bb3.zip   51-75         917 KB
rgs3bb4.zip   76-100        917 KB

The datasets are named s3bb1.dat to s3bb100.dat. Each dataset is in ASCII format and has 8 variables on 2449 cases. The variables are:

Column  Variable                       Notes
1       child id                       2449 kids, ids in 1..14684
2       family id                      1558 families, ids in 3..2782
3       community id                   161 communities, ids in 1..242
4       binary dependent variable      0,1
5       corresponding latent variable  logistic
6       child-level covariate          mean .0955621, range -.452557 to .541957
7       family-level covariate         mean -.083816, range -1.485043 to 2.284106
8       community-level covariate      mean -.6857591, range -1.818267 to -.0097808
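
As a rough illustration of the layout above, one of these files could be read in Python along the following lines. This is only a sketch: it assumes the files are whitespace-delimited with no header row, and the column names and the function name load_simulation are labels made up here, not names stored in the data. The two sample rows are invented values used only to exercise the code.

```python
import io
import numpy as np

# Hypothetical names for the 8 columns, in the documented order.
COLUMNS = ["child", "family", "community", "y", "latent",
           "x_child", "x_family", "x_community"]

def load_simulation(source):
    """Read one dataset (path or file-like object) into a dict of arrays.

    Assumes whitespace-delimited numeric text without a header line.
    """
    data = np.loadtxt(source, ndmin=2)
    if data.shape[1] != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {data.shape[1]}")
    return {name: data[:, i] for i, name in enumerate(COLUMNS)}

# Two invented rows standing in for a real file such as s3bb1.dat.
sample = io.StringIO(
    "1 3 1 1  0.52  0.10 -0.20 -0.50\n"
    "2 3 1 0 -1.30 -0.30 -0.20 -0.50\n"
)
demo = load_simulation(sample)
```

With a real file, the call would be load_simulation("s3bb1.dat"), and the result should have 2449 entries per column if the assumptions hold.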

All datasets were generated using true parameter values equal to 0.665267 for the constant and equal to 1 for each of the three covariates, and using normal variates with variance 1 for the family and community effects.
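
The generating process described above can be sketched in Python as follows. This is not the code used to produce the archived datasets. It assumes that the latent variable is the linear predictor plus a standard logistic error, that the binary outcome is 1 when the latent variable is positive, and that family ids are unique across communities. The function name simulate_outcomes and its arguments are hypothetical.

```python
import numpy as np

def simulate_outcomes(x_child, x_family, x_community, family, community, rng,
                      constant=0.665267, sd_family=1.0, sd_community=1.0):
    """Draw one (binary outcome, latent variable) pair per child.

    Covariate coefficients are fixed at 1, as stated in the text.
    Assumption: y = 1 if latent > 0, with a standard logistic error.
    """
    family = np.asarray(family).ravel()
    community = np.asarray(community).ravel()
    # Map each child to the position of its family / community effect.
    fam_ids, fam_idx = np.unique(family, return_inverse=True)
    com_ids, com_idx = np.unique(community, return_inverse=True)
    u = rng.normal(0.0, sd_family, size=fam_ids.size)     # family effects
    v = rng.normal(0.0, sd_community, size=com_ids.size)  # community effects
    eta = (constant + np.asarray(x_child) + np.asarray(x_family)
           + np.asarray(x_community) + u[fam_idx] + v[com_idx])
    latent = eta + rng.logistic(0.0, 1.0, size=eta.size)
    y = (latent > 0).astype(int)
    return y, latent

# Tiny invented example: 4 children in 2 families within 1 community.
rng = np.random.default_rng(12345)
y, latent = simulate_outcomes(
    x_child=[0.1, -0.2, 0.3, 0.0],
    x_family=[0.5, 0.5, -0.4, -0.4],
    x_community=[-0.7, -0.7, -0.7, -0.7],
    family=[3, 3, 8, 8],
    community=[1, 1, 1, 1],
    rng=rng,
)
```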

Note that only variables 4 and 5 vary across datasets. In view of this, we are also making the data available in a much more compact format, a single zipped file rg3bbpack.zip that includes five files:

  • File s3bbx.dat has variables 1, 2, 3, 6, 7 and 8; in other words, the variables that are constant across simulations.

  • Files s3bby1.dat to s3bby4.dat have 25 columns each, representing the outcomes of simulations 1-25, 26-50, 51-75 and 76-100, respectively, for all 2449 cases.
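
To recover a single simulation from the compact format, one needs to know which outcome file and which column hold it. The helper below is a minimal sketch of that lookup. It assumes the 25 columns in each file follow simulation order, so that column 1 of s3bby2.dat is simulation 26; the function names are hypothetical.

```python
def locate_simulation(k, per_file=25, n_simulations=100):
    """Return (file number, zero-based column index) for simulation k (1-based).

    Assumes columns within each outcome file are in simulation order.
    """
    if not 1 <= k <= n_simulations:
        raise ValueError(f"simulation number must be in 1..{n_simulations}")
    return (k - 1) // per_file + 1, (k - 1) % per_file

def outcome_file_name(k):
    """Name of the packed outcome file that should contain simulation k."""
    file_no, _ = locate_simulation(k)
    return f"s3bby{file_no}.dat"

# Example lookup: simulation 26 should be the first column of s3bby2.dat.
file_no, col = locate_simulation(26)
name = outcome_file_name(26)
```

Given arrays loaded from s3bbx.dat and from the file named by outcome_file_name(k), the outcome for simulation k would then be the column at index col, which can be placed alongside the six constant variables.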

 

If you have any questions or comments, please send e-mail to
grodri@princeton.edu