Stata Resources for Data Management

Attachment  Size
Commenting and Debugging Do Files  25.79 KB
Programs versus do files What's the difference_v2.pdf  610.67 KB
Puetz gettoken guide.pdf  266.52 KB
Puetz closing comments guide.pdf  55.6 KB
traveltime3 geocode3_b.pdf  180.25 KB
spmap.pdf  35.03 KB
STATA matrix construction guide.pdf  231.11 KB
edgelists.pdf  187.44 KB
Community Resource 1.doc  35 KB
research-log-blank.doc  42 KB
wf2-data-registry.xls  46.5 KB
wf2-directory-design.xls  20 KB

This page is the collective product of my Stata Programming class from Spring 2014 (each segment set off by lines was one course day). We encourage other people to use and build upon these materials. I would appreciate links to material that builds on these topics so that I can add them to this page.

It is organized according to the original syllabus: the table below contains the topics, readings, and course presentation files created for each course day. All readings should have hotlinks that worked at the time of the course (sorry if they break over time as people move their stuff around). Email me if something is broken and I will see if I have an archive copy.

As for the books, I used Scott Long's Workflow of Data Analysis Using Stata and Newton and Cox's Seventy-Six Stata Tips. Those are the only readings below without hotlinks.



Replicability, Efficiency, and Elegance as Programming Goals
Naming Conventions for Files
Writing and Reading Documentation

Help menu and items:
PDF Documentation:
Long, Ch 1: "Introduction"
Long, Ch 2: "Planning, organizing and documenting"

Learning resource produced in class: &

Also: A cheat sheet and template files to help with planning, organizing, and documenting:

An introduction to do files
Debugging errors Stata finds
Finding errors Stata doesn’t notice

Long, Ch 3: “Writing and Debugging Do Files”
Long, Ch 5.2-5.4 from “Names, Notes and Labels”

If you are still confused:

Learning resource produced in class:

Ado Files and the SSC Archives + Getting Help

Stata tip 30: “May the Source be With You (viewsource)”
Long, Appendix: “How Stata Works”
General overview for Stata beginners and relative novices:…

Learning resource produced in class:
PDF Guide from


Data Storage types
Variable Naming
Labels and Notes

Variable Types and Precision and String heading:
Long, Ch 5.5-5.10: "Names, notes, and labels"
Drop, Keep, Rename, Labels:
Describe and Labels:
Importing Data:… (from a common format like Excel)
Stata Tip 35: Detecting Whether Data Have Changed (Datasignature)
Other ways to get data into Stata: Pages 3-11 from "…

Learning resource produced in class:

Importing and summarizing data

Long, Ch 6.1-6.2 & 6.4-6.5: “Cleaning your data”
Stata tip 66: “ds - a hidden gem”

Learning resource produced in class:

Basic Data Manipulation

Long, Ch 6.3: “Cleaning your data”
Stata Tip 52: Generating composite categorical variables
Stata Tip 2: Building with floors and ceilings
help egen in Stata—mostly focus on the functions available

If Long was confusing to you, also read:
replace and recode:
Replace and indicator variables: (through Indicator variables)
Using tabulate to recode into dummies: “Answer 2 of 3: Use tabulate”… &

If egen was confusing to you, also read:…

Learning resource produced in class:


By, System Variables, Return Codes, Egenmore

By from Stata PDF Documentation (see course webpage)
System variables:
Using system variables:…
Return codes and values:
Long, Ch 4.2: “Information returned by Stata commands”
Help egenmore (from Stata)
Stata tip 14: Using value labels in expressions

Learning resource produced in class:

Regular Expressions

Regular expressions available in Stata:
Regular expression commands in Stata:
Example of using a regular expression:
Stata Tip 60: Making fast and easy changes to files with filefilter
help filefilter (when in Stata)

Learning resource produced in class:
Regular Expressions, year 2.pdf

Manipulating Entire Datasets: Append, Merge, Collapse

Stata Tip 5: Ensuring programs preserve data sort order
help append (in Stata)
Append:… (including the Q&A below it)
Stata Tip 73: Append with care!
Long, Ch 6.6: Cleaning your data
help collapse (in Stata)
Applying merge to data cleaning: Stata Tip 64: Cleaning up user-entered string variables

If Long was unclear to you:
Merge overview:…
1-to-1 merging:… &…
One to many merging:…
Merge and append:

Learning resource produced in class:
Merge 1:1, m:1:
Merging multiple datasets in a row:

Reshaping Datasets and Transferring and Preserving Data

Reshape in general:
Reshape to long:
Reshape to wide:
Stata Tip 45: Getting those data into shape (reshape)
StatTransfer manual, Pp 50-78
Long, Ch 8: "Protecting your files"

Learning resource produced in class:
On Reshape:
On StatTransfer: Note: you could run the commands he shows you in the video by saving them in a text file with a .stcmd extension, placing that file in the directory that contains the things you want to transfer, and then double-clicking the file in Windows. It works the same as running the commands through the command window, but is easier for you.
On Backups and Mirrors/Preserving Data: but I think the whole site is worth looking at because it has one of the funniest tag lines ever (see
Reshaping Datasets, year 2.pdf


Intro to loops

Long, Ch 4 (pp. 83-105, except 4.3.3): "Automating your work"
Macros: "Macros" on -- up to, but not including, “Nested Loops”
"B] macros" on (pp 10-14)

Learning resource produced in class:

forval versus foreach, forval in detail, i & local j, nested loops

Long, Ch 4 (4.3.3 & 105-end of chapter): "Automating your work"
Forval: "For Loops & Nested Loops" on

Learning resource produced in class: & the accompanying dataset & do file
forval loops:
foreach loops:
nested loops:

loops continued, preventing errors (assert), and debugging loops (trace, pause, and capture)
Stata Tip 32: Do Not Stop (do nostop)
Stata Tip 41: Monitoring loop iterations
help assert in Stata
help pause in Stata
help capture in Stata

Learning resource produced in class:
Loops Continued, preventing and debugging errors, year 2.pdf


[P] postfile…
Stata tip 54: Post your results (postfile and postclose)

Learning resource produced in class:…
Post, Year 2.pdf

Programs versus do files: what’s the difference? And, passing arguments

Readings: “Programs” section… (read entire page, including questions and answers at the bottom of the page) , sections 18.4, 18.4.1

Learning resource produced in class:…
Writing Programs in Stata

Gettoken and more on arguments , section 18.4.6

Learning resource produced in class:…

Closing Thoughts on Stata

Learning resource produced in class:…


Windows batch files (which have very similar, but often more powerful, analogs in Unix, and hence on Macs) (skim, but read: dir, cd, copy, del)

Learning resource produced in class:…
Learning Resource for bash commands on Unix machines:
Learning Resource for Windows batch commands:



Working with Geo-Data:……

Matrices in Stata (Intro):…

Transforming Edge Lists from Pajek:…