Population Projections

This unit illustrates the use of the cohort component method of population projection, with all calculations done in Stata. (An alternate version using Mata is available here. The Mata version includes an eigenanalysis of the Leslie matrix, which we can't do here.)

Reading the Data

We read the data from the datasets section. The file has four columns: the starting age of each group, the population counts for 1993, the person-years 'big-L' column from the life table (which we divide by 100,000), and the fertility rates.
. set type double

. infile age p93 L f using ///
>   http://data.princeton.edu/eco572/datasets/sweden93.dat, clear
(18 observations read)

. replace L=L/100000
(18 real changes made)
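
To keep track of the four columns described above, one option is to attach variable labels. This is only a sketch: the label wording is ours and does not come from the data file.

```stata
* Optional: document the four columns with variable labels.
* The label text is illustrative wording, not part of the dataset.
label variable age "Starting age of age group"
label variable p93 "Population count, 1993"
label variable L   "Life-table person-years, divided by 100,000"
label variable f   "Fertility rate"

* Show the labels alongside the storage types.
describe age p93 L f
```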

Projections for 5 and 10 Years

The first task is to compute the survival ratios used to project the people alive at the start of the projection (the sub-diagonal of the Leslie matrix). For each age group we store the ratio of L to its value in the previous age group, which is the probability of surviving into that age group, and we set the last entry to the ratio of time lived at 85+ to time lived at 80+. To project we multiply these ratios by the population in the previous age group. The last age group is open-ended and needs special treatment: it also receives the survivors of those already aged 85+. We try this part of the projection right away to make sure it's OK.

. gen surv = L/L[_n-1]
(1 missing value generated)

. replace surv = L[_N]/(L[_N] + L[_N-1]) in -1
(1 real change made)

. gen p98 = surv * p93[_n-1]
(1 missing value generated)

. replace p98 = p98 + surv*p93 in -1
(1 real change made)

. list age p93 p98

     +--------------------------+
     | age      p93         p98 |
     |--------------------------|
  1. |   0   293395           . |
  2. |   5   248369   293189.18 |
  3. |  10   240012    248250.6 |
  4. |  15   261346   239833.28 |
  5. |  20   285209   261014.93 |
     |--------------------------|
  6. |  25   314388   284786.85 |
  7. |  30   281290   313781.66 |
  8. |  35   286923      280463 |
  9. |  40   304108   285576.19 |
 10. |  45   324946   301730.68 |
     |--------------------------|
 11. |  50   247613    320974.1 |
 12. |  55   211351   243039.01 |
 13. |  60   215140   205108.84 |
 14. |  65   221764   204943.86 |
 15. |  70   223506    204792.8 |
     |--------------------------|
 16. |  75   183654   194419.01 |
 17. |  80   141990   142323.75 |
 18. |  85   112424   131768.18 |
     +--------------------------+

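The projected counts for ages 5 and over agree with the textbook. Next we compute the births during the projection period, to fill in the missing entry for the first age group. We divide the fertility rates by 2.05 to keep only female births, which amounts to assuming 1.05 male births per female birth. We then average the rate for each age group with the rate for the next age group weighted by its survival ratio, and multiply by L[1], which combines the five-year length of the period with the probability that a birth survives to the end of it. Finally we multiply by the number of women and sum.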

. gen m = f/2.05

. replace m = L[1]*(m + m[_n+1]*surv[_n+1])/2
(9 real changes made, 1 to missing)

. gen b = p93 * m
(1 missing value generated)

. quietly summarize b

. list b

     +-----------+
     |         b |
     |-----------|
  1. |         0 |
  2. |         0 |
  3. | 3492.1153 |
  4. | 32562.717 |
  5. |  83221.72 |
     |-----------|
  6. | 100015.75 |
  7. | 53405.282 |
  8. | 17917.502 |
  9. | 2840.4304 |
 10. | 118.28518 |
     |-----------|
 11. |         0 |
 12. |         0 |
 13. |         0 |
 14. |         0 |
 15. |         0 |
     |-----------|
 16. |         0 |
 17. |         0 |
 18. |         . |
     +-----------+

. replace p98 = r(sum) in 1
(1 real change made)

The procedure is easily applied for the next five years starting with the last projection:

. gen p03 = surv * p98[_n-1]
(1 missing value generated)

. replace p03 = p03 + surv*p98 in -1
(1 real change made)

. qui replace b = p98 * m

. qui sum b

. replace p03 = r(sum) in 1
(1 real change made)

. format p93 p98 p03 %7.0fc

. list age p93 p98 p03

     +-----------------------------------+
     | age       p93       p98       p03 |
     |-----------------------------------|
  1. |   0   293,395   293,574   280,121 |
  2. |   5   248,369   293,189   293,368 |
  3. |  10   240,012   248,251   293,049 |
  4. |  15   261,346   239,833   248,066 |
  5. |  20   285,209   261,015   239,529 |
     |-----------------------------------|
  6. |  25   314,388   284,787   260,629 |
  7. |  30   281,290   313,782   284,238 |
  8. |  35   286,923   280,463   312,859 |
  9. |  40   304,108   285,576   279,147 |
 10. |  45   324,946   301,731   283,344 |
     |-----------------------------------|
 11. |  50   247,613   320,974   298,043 |
 12. |  55   211,351   243,039   315,045 |
 13. |  60   215,140   205,109   235,861 |
 14. |  65   221,764   204,944   195,388 |
 15. |  70   223,506   204,793   189,260 |
     |-----------------------------------|
 16. |  75   183,654   194,419   178,141 |
 17. |  80   141,990   142,324   150,666 |
 18. |  85   112,424   131,768   141,960 |
     +-----------------------------------+

. tabstat p93 p98 p03, stat(sum)

   stats |       p93       p98       p03
---------+------------------------------
     sum |   4397428   4449570   4478712
----------------------------------------

The results agree with the textbook.
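
The totals above can also be turned into average annual growth rates for each five-year period. The sketch below assumes exponential growth within each period and that p93, p98 and p03 are still in memory; the local macro names t93, t98 and t03 are ours.

```stata
* Average annual growth rate implied by the projected totals,
* assuming exponential growth within each five-year period.
quietly summarize p93
local t93 = r(sum)
quietly summarize p98
local t98 = r(sum)
quietly summarize p03
local t03 = r(sum)

* ln(ratio of totals)/5, expressed as percent per year
display "1993-1998: " %6.4f (100*ln(`t98'/`t93')/5) " percent per year"
display "1998-2003: " %6.4f (100*ln(`t03'/`t98')/5) " percent per year"
```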

The Stable Equivalent

We can't calculate the stable equivalent population from an eigenanalysis without Mata, but we can project the population for a long time and see what we get. If the time is long enough, the results should come very close to those of the eigenanalysis here.
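
The long-run projection suggested above can be sketched as a loop that repeats the same projection step many times. This is a minimal sketch, assuming age, surv and m from the previous section are still in memory. The variable names pop, nxt, births and stable, and the choice of 60 five-year steps (300 years), are ours for illustration; more or fewer steps may be needed.

```stata
* Sketch: repeat the five-year projection step to approximate the
* stable equivalent. Assumes age, surv and m are already in memory.
* pop, nxt, births, stable and the number of steps are illustrative.
gen double pop    = p93
gen double nxt    = .
gen double births = .
local steps 60                      // 60 five-year steps = 300 years (arbitrary)

forvalues i = 1/`steps' {
    quietly {
        summarize pop
        local before = r(sum)                    // total before this step
        replace nxt = surv * pop[_n-1]           // survivors move up one age group
        replace nxt = nxt + surv * pop in -1     // open-ended group keeps its own survivors
        replace births = pop * m                 // surviving female births by age of mother
        summarize births
        replace nxt = r(sum) in 1                // births fill the first age group
        replace pop = nxt
        summarize pop
        local growth = r(sum) / `before'         // five-year growth ratio of the total
    }
}

display "Growth ratio in the last five-year step: " `growth'

* Proportionate age distribution after the last step
quietly summarize pop
gen double stable = pop / r(sum)
list age stable
```

If the projection has run long enough, growth should change very little from one step to the next and stable should hold the proportionate age distribution of the stable population. The growth ratio then plays the role of the dominant eigenvalue of the Leslie matrix, so it can be compared with the eigenanalysis in the Mata version.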

In the next handout we apply methods based on the continuous time model.