
2  Linear Models in Stata

We start with the linear models in Chapter 2 of the lecture notes, showing how to use the regress command in Stata to fit regression, analysis of variance, and analysis of covariance models.

2.1  The Program Effort Data

The first task is to read the data, which I typed into a simple ASCII file called effort.raw containing the country names, social setting, program effort and fertility change. For a brief description of the data see the lecture notes or point your browser to the datasets page.

To read the file we use the infile command. Because the data are separated by blanks, we can read them in free format. The first variable, however, is non-numeric, so we precede its name with str16 to indicate that it is a string of maximum length 16. The "using" clause can specify a local file or a fully-qualified URL. To read the file directly from the web, use the following command:

. infile str16 country setting effort change using ///
>         http://data.princeton.edu/wws509/datasets/effort.raw
(20 observations read)
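
The using clause also accepts a local file. Here is a minimal sketch, assuming you have already downloaded effort.raw into Stata's current working directory (that location is an assumption; adjust the path to wherever you saved the file). The clear option, not used in the original command, drops any data already in memory before reading.

```stata
* Assumes effort.raw has been saved in the current working directory
* clear discards any dataset currently in memory
infile str16 country setting effort change using effort.raw, clear
```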

Let us list the data to check we got them in OK:

. list country setting effort change, clean

              country   setting   effort   change
  1.          Bolivia        46        0        1
  2.           Brazil        74        0       10
  3.            Chile        89       16       29
  4.         Colombia        77       16       25
  5.        CostaRica        84       21       29
  6.             Cuba        89       15       40
  7.     DominicanRep        68       14       21
  8.          Ecuador        70        6        0
  9.       ElSalvador        60       13       13
 10.        Guatemala        55        9        4
 11.            Haiti        35        3        0
 12.         Honduras        51        7        7
 13.          Jamaica        87       23       21
 14.           Mexico        83        4        9
 15.        Nicaragua        68        0        7
 16.           Panama        84       19       22
 17.         Paraguay        74        3        6
 18.             Peru        73        0        2
 19.   TrinidadTobago        84       15       29
 20.        Venezuela        91        7       11
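
A quick numeric check can complement the listing. The sketch below is not part of the original log; it confirms that all 20 observations were read and that no numeric value is missing, then prints summary statistics for the three numeric variables.

```stata
* Stop with an error unless exactly 20 observations are in memory
assert _N == 20
* Stop with an error if any of the numeric variables has a missing value
assert !missing(setting, effort, change)
* Means, standard deviations and ranges for a quick look
summarize setting effort change
```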

We might as well label the data set, label the variables and save everything for future use in a Stata system file called fpe.dta.

. label data "Family Planning Effort Data"

. label var setting "Social Setting"

. label var effort "Program Effort"

. label var change "Fertility Change"

. save fpe, replace
file fpe.dta saved

Note: You may notice that every time I save a file I specify replace. This is because I need to run the script used to generate this handout several times (until I get it right :-) and I don't want it to fail because a file already exists. You should feel free to omit this option ... at least the first time around.
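
In a later session the saved file can be loaded back with the use command. This is a minimal sketch, assuming fpe.dta is in the current working directory; describe is added here only to confirm that the data and variable labels were stored with the file.

```stata
* Load the saved system file, discarding any data currently in memory
use fpe, clear
* Show variable names, storage types and the labels attached above
describe
```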

The next thing we want to do is plot the data for a closer look. The following command creates scatterplots for all pairs of variables, reproducing Figure 2.1 in the notes.

. graph matrix change setting effort,     ///
>         title("Figure 2.1: Scatterplot Matrix")

. graph export fig21.wmf, replace
(file d:\wws509\stata\fig21.wmf written in Windows Metafile format)

After generating the graph you can print it using the command graph print, save it in Stata's own format using graph save, or export it to other graphics formats using graph export. I have chosen to export the graph to Windows Metafile format, which I then converted to GIF for the web version of these logs. (Stata can export to PNG, but I get finer control over graph size going the metafile route.) Windows interactive users can also print the graph by choosing File|Print Graph on Stata's menu, or save it in a variety of formats by choosing File|Save Graph. Alternatively, you can choose Edit|Copy Graph to copy the graph to the clipboard and then Edit|Paste to insert it into your favorite word processor.
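
The other two routes mentioned above can be sketched as follows. This assumes the scatterplot matrix is still the current graph; the file name fig21 simply mirrors the one used earlier. As far as I know the metafile format is only available in Stata for Windows, so PNG may be the more portable choice on other platforms.

```stata
* Assumes the scatterplot matrix drawn above is still the current graph
* Save in Stata's own format (writes fig21.gph)
graph save fig21, replace
* Export to PNG; the format is inferred from the file extension
graph export fig21.png, replace
```

A graph saved in Stata's own format can be redisplayed later with graph use fig21.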


Copyright © Germán Rodríguez, 1993-2003. Please send feedback to grodri@princeton.edu