Germán Rodríguez
Generalized Linear Models Princeton University

Weaving Stata Output and Annotations

I am sharing some of the code I use to prepare the Stata logs, which may come in handy when you work on the problem sets. I call the approach LDA, for Literate Data Analysis.

The basic idea is quite simple: you (1) prepare a do file that includes Stata code and annotations, as explained below, (2) run it, logging the output to a text file, and then (3) use the weave command to convert the log to a web page, including tables, figures and your comments.

A Sample Do File

Here's a complete example used to read the program effort data and draw a scatterplot matrix.

capture log close
log using sample.usl, text replace
/**
<h2>Sample Log</h2>
We read the program effort data from the course website and list the first three observations
*/
use http://data.princeton.edu/wws509/datasets/effort
list in 1/3
/**
Next we draw a scatterplot matrix
*/
graph matrix change setting effort
graph export sample.png, width(500) replace
/**
<img class="center" src="sample.png"/>
That's all folks!
*/
log close

This file consists of Stata commands and annotations. Annotations are written in special comment blocks that start with /** on a line by itself and end with */, also on a line by itself. This is a standard do file and will run on any version of Stata, without the need for special commands.

When you log the results it is important to use a plain text file rather than Stata's SMCL format, hence the option text. I often use the extension ".usl" for unformatted Stata log, but you can use the default ".log" instead. If you forget to use a text format you can always translate SMCL to text; type help translate to learn more.
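As a minimal sketch of that rescue step, assuming you already have a SMCL log named sample.smcl (a hypothetical file name), the translate command picks the SMCL-to-text translator from the two file extensions:

```stata
* Assumes an existing SMCL log called sample.smcl (hypothetical name)
* The .smcl -> .log extensions tell translate to produce plain text
translate sample.smcl sample.log, replace
```

Because the result has the extension ".log" rather than ".usl", you would then presumably need to give weave the full file name.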

Annotations

Annotations are written in HTML, but you don't need to learn more than a handful of tags. In fact you can produce basic output without knowing any tags at all: just type plain text, using blank lines to indicate paragraph breaks. The weave command adds the necessary paragraph tags.

You may also use <h2> and </h2> tags for headings at level 2, as I did in the example, or <h4> and </h4> for headings at level 4. Headings must appear on a line by themselves. The only other tag you need to know about is the image tag discussed in the next section.

Technically you should encode the symbol < using &lt; so it is not mistaken for the start of a tag, but most modern browsers are smart enough not to be confused. The same applies to the ampersand &, which can be encoded as &amp;. The weave command encodes these symbols in Stata output, but leaves the annotations alone, otherwise you wouldn't be able to use any tags!
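To illustrate, here is a small sketch of an annotation that mentions a comparison, followed by the command itself. It assumes the program effort data is already in memory, and the cutoffs 10 and 0 are arbitrary values chosen only for the example. The symbols are encoded by hand inside the annotation, while the Stata command is typed as usual because weave takes care of the log output:

```stata
/**
We list the countries where effort &lt; 10 &amp; change &gt; 0
*/
* Stata code is written normally; weave encodes < and & in the log
list if effort < 10 & change > 0
```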

Figures

After you produce a Stata graph, use the graph export command to export it to a file in PNG (Portable Network Graphics) format. I usually specify the width as 500 pixels using width(500), letting Stata figure out the height needed to maintain the aspect ratio.

In the next annotation block, you add HTML code to include the figure. This is done with the image tag, which has a "src" attribute to specify the name of the file. I usually add the class="center" attribute to center the image on the page, as shown in the example above. For clarity it is best to put the image tag on a line by itself.

Weaving

The weave command has the following syntax:

weave using filename [, header(filename) footer(filename)]

The using argument is the name of the log file. You may omit the extension if it is "usl". The optional header and footer are files containing boilerplate HTML to be included at the beginning and end of the web page. These files have extension ".html" by default.

The weave command comes with default versions of these files, called weave_header.html and weave_footer.html. The main job of the default header is to define a few styles to render the page. The footer closes the body and html tags. To convert the above log to a web page using the defaults all you need is

weave using sample
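If you want your own styles instead of the defaults, a call following the syntax above might look like the sketch below. The names myheader and myfooter are hypothetical; they stand for files myheader.html and myfooter.html that you would write yourself, with the footer closing the body and html tags just as the default one does.

```stata
* myheader.html and myfooter.html are hypothetical files you supply;
* the ".html" extension is added by default
weave using sample, header(myheader) footer(myfooter)
```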

Self-Contained Output

You can't send the TA the web page because the HTML and the figures are separate files. There are three solutions to this problem:

  1. Read the HTML file into Word, which does a reasonable job of parsing the code and including the figures. It has a nasty habit of adding a blank line at the start of code blocks, but the latest version of weave gets around that. (Stata 13 can generate Word documents, but it is a lot easier to read the HTML.)

  2. Use my bundle command, which goes through the HTML file and, each time it finds an image, grabs the file and encodes it as text using the same format as email attachments (base64).

  3. Use the Chrome browser, which can save a web page in PDF format (right click, select Print, and under destination select Print to PDF). Or use Safari on a Mac. I understand that IE can also do this, but only if you have the full version of Adobe Acrobat installed on your PC.

Installation

To install weave, type net from http://data.princeton.edu/wws509/stata in net-aware Stata and follow the instructions. This will install the command, a minimal help file, and the default header and footer in your ado path.
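A quick way to check the result, sketched here with two standard Stata commands, is to list the ado path and ask Stata where it finds the command. This assumes the command was installed as a file named weave.ado, which is the usual convention but is not stated above.

```stata
* List the directories Stata searches for ado files
adopath
* Report where weave.ado was found (gives an error if it is not installed)
which weave
```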

Alternatively, download everything in a zip file available here and unzip all the files into your wws509 folder. If you are using a public computer you may want to keep everything on a USB drive. The zip file also includes the graph scheme I use in my own logs, called grlog, in case you want to use it.
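Should you want to try the scheme, a minimal sketch using Stata's standard scheme options follows. It assumes the grlog scheme file ended up somewhere Stata can find it, such as the current folder or the ado path, and that the program effort data is in memory.

```stata
* Use grlog for all graphs drawn in this session
set scheme grlog
* Or request it for a single graph only
graph matrix change setting effort, scheme(grlog)
```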