Germán Rodríguez
Statistics and Population Princeton University

Simulated Datasets

Our 1995 paper relied on simulations of three-level variance-components models with moderate and large random effects at levels 2 and 3 using the same nested structure as the actual survey data from Guatemala.

The datasets available here are 100 simulations of this structure with large (σ² = 1) random effects at both level 2 (family) and level 3 (community).


Each dataset has 8 variables on 2449 cases. The variables are

Column  Variable                       Notes
1       child id                       2449 kids, ids in 1..14684
2       family id                      1558 families, ids in 3..2782
3       community id                   161 communities, ids in 1..242
4       binary dependent variable      0,1
5       corresponding latent variable  logistic
6       child-level covariate          mean .0955621, range -.452557 to .541957
7       family-level covariate         mean -.083816, range -1.485043 to 2.284106
8       community-level covariate      mean -.6857591, range -1.818267 to -.0097808

All datasets were generated using true parameter values of 0.665267 for the constant and 1 for each of the three covariates, with normal variates of variance 1 for the family and community effects.
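The data-generating process described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used for the paper: it assumes the latent variable is the linear predictor plus the family and community effects plus a standard logistic error, that the binary response equals 1 when the latent variable is positive (the usual latent-variable form of a logit model), and that family ids are unique across communities. The function name `simulate_responses` and the input layout are hypothetical.

```python
import math
import random


def logistic_draw(rng):
    """Draw one standard logistic variate by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return math.log(u / (1.0 - u))


def simulate_responses(rows, seed=0, constant=0.665267, slope=1.0, variance=1.0):
    """Simulate (binary response, latent variable) pairs.

    rows: iterable of (family_id, community_id, x_child, x_family, x_community).
    The layout of `rows` is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    sd = math.sqrt(variance)
    family_effect = {}     # one normal draw per family
    community_effect = {}  # one normal draw per community
    results = []
    for fam, com, x_child, x_family, x_community in rows:
        if fam not in family_effect:
            family_effect[fam] = rng.gauss(0.0, sd)
        if com not in community_effect:
            community_effect[com] = rng.gauss(0.0, sd)
        latent = (constant
                  + slope * (x_child + x_family + x_community)
                  + family_effect[fam]
                  + community_effect[com]
                  + logistic_draw(rng))
        # Assumed link between the two variables: y = 1 when latent > 0.
        results.append((int(latent > 0), latent))
    return results


# Made-up rows for illustration only; they are not taken from the real files.
demo_rows = [(3, 1, 0.1, -0.2, -0.7), (3, 1, -0.1, -0.2, -0.7), (5, 2, 0.0, 0.3, -0.5)]
demo = simulate_responses(demo_rows, seed=1)
```

Children in the same family share one family draw, and families in the same community share one community draw, which is what produces the three-level correlation structure.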


The 100 simulated datasets are named s3bb1.dat to s3bb100.dat and are available in plain-text ASCII format, zipped into four archives of 25 datasets each:

Simulations  Size
1-25         916 KB
26-50        917 KB
51-75        917 KB
76-100       917 KB
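Once an archive has been unzipped, a file such as s3bb1.dat could be read along the following lines. This is a minimal sketch that assumes the eight columns are separated by whitespace and that there is no header row; neither detail is stated above, so check a file before relying on it. The names `Record` and `read_simulation` are hypothetical.

```python
import io
from typing import List, NamedTuple, TextIO


class Record(NamedTuple):
    child: int
    family: int
    community: int
    y: int                # binary dependent variable
    latent: float         # corresponding latent variable
    x_child: float        # child-level covariate
    x_family: float       # family-level covariate
    x_community: float    # community-level covariate


def read_simulation(handle: TextIO) -> List[Record]:
    """Parse one dataset, assuming whitespace-separated fields and no header."""
    records = []
    for line_no, line in enumerate(handle, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 8:
            raise ValueError(f"line {line_no}: expected 8 fields, got {len(fields)}")
        # int(float(...)) tolerates ids written either as 12 or as 12.0
        ids_and_y = [int(float(f)) for f in fields[:4]]
        values = [float(f) for f in fields[4:]]
        records.append(Record(*ids_and_y, *values))
    return records


# Two made-up lines for illustration only; they are not real data.
sample = io.StringIO("1 3 1 1 0.52 -0.1 0.2 -0.7\n2 3 1 0 -1.3 0.05 0.2 -0.7\n")
records = read_simulation(sample)
```

With a real file, the same function would be called as `read_simulation(open("s3bb1.dat"))`, and the result should contain 2449 records.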

Note that only variables 4 and 5 vary across datasets. In view of this, we are also making the data available in a much more compact format, a single zipped file that includes five files:

You may find this format more suitable for some applications.
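The observation that only variables 4 and 5 vary across datasets can be checked, and used to store the data more compactly, along these lines. The sketch assumes each dataset has already been parsed into a list of 8-field rows in the column order of the table above. The function `split_fixed_and_varying` and the layout it returns are hypothetical and are not necessarily the layout of the compact zipped file.

```python
def split_fixed_and_varying(datasets):
    """Separate the columns shared by all datasets from those that vary.

    datasets: list of datasets, each a list of 8-field rows
              (child, family, community, y, latent, x1, x2, x3).
    Returns (fixed, responses): `fixed` holds columns 1-3 and 6-8 once,
    `responses` holds the (y, latent) pairs of every dataset.
    """
    if not datasets:
        raise ValueError("need at least one dataset")
    fixed_idx = (0, 1, 2, 5, 6, 7)  # zero-based positions of columns 1-3 and 6-8
    fixed = [tuple(row[i] for i in fixed_idx) for row in datasets[0]]
    responses = []
    for k, data in enumerate(datasets, start=1):
        this_fixed = [tuple(row[i] for i in fixed_idx) for row in data]
        if this_fixed != fixed:
            raise ValueError(f"dataset {k} differs from dataset 1 outside columns 4 and 5")
        responses.append([(row[3], row[4]) for row in data])
    return fixed, responses


# Made-up rows for illustration only; they are not real data.
d1 = [(1, 3, 1, 1, 0.52, -0.1, 0.2, -0.7), (2, 3, 1, 0, -1.3, 0.05, 0.2, -0.7)]
d2 = [(1, 3, 1, 0, -0.4, -0.1, 0.2, -0.7), (2, 3, 1, 1, 0.9, 0.05, 0.2, -0.7)]
fixed, responses = split_fixed_and_varying([d1, d2])
```

Stored this way, the six fixed columns are kept once and each simulation contributes only its two response columns.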