Stata for Researchers: Learning More

This is part eleven of the Stata for Researchers series. For a list of topics covered by this series, see the Introduction. If you're new to Stata, we highly recommend reading the articles in order.

Congratulations, you now know enough Stata to get started and do some very useful things. However, you'll almost certainly need to learn more at some point in your Stata career. Thus we'll conclude by discussing resources for doing so.

Help

Your first resource is the Stata help files. To see the help for a particular command, type help followed by the command name, e.g.

help egen

You'll get a syntax diagram, a brief explanation of the various options, and even examples.

In the syntax diagram, optional elements are placed in square brackets. Thus for egen a [type] is optional (if you don't specify a variable type you'll get the default float) while a name for the new variable is mandatory. If part of a word is underlined, that is the minimum abbreviation for that word. Thus in the anycount() function, the values() option could be abbreviated as just v() or as val(), value() etc.
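As a minimal sketch of how to read the diagram, the lines below call anycount() twice, once with the option spelled out and once with its minimum abbreviation and an explicit type. The auto dataset that ships with Stata and the variable names n_five and n_five_b are assumptions chosen for illustration.

```stata
* Load an example dataset that ships with Stata
sysuse auto, clear

* No [type] given, so egen chooses the storage type itself; values() is spelled out
egen n_five = anycount(rep78 foreign), values(5)

* Same count, requesting byte storage and using the minimum abbreviation v()
egen byte n_five_b = anycount(rep78 foreign), v(5)

* Both forms should produce identical results
assert n_five == n_five_b
describe n_five n_five_b
```

The describe output lets you check which storage type each new variable received.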

Help on Functions

To use a function you need two pieces of information: what input it expects and what output it returns. The inputs, or arguments, are the things that go in parentheses.

For egen functions the inputs will almost always be a single entity, but that entity could be a list of variables (varlist), a single variable (varname) or a mathematical expression (exp), among others. Keep in mind that a single variable counts as an expression.
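The three kinds of input can be sketched as follows, again assuming the auto dataset that ships with Stata. The new variable names are hypothetical, and the choice of functions is only illustrative.

```stata
sysuse auto, clear

* varlist: rowmean() takes a list of variables and averages them within each row
egen row_avg = rowmean(price weight length)

* varname: cut() takes exactly one variable; group(4) splits it into four groups
egen mpg_q = cut(mpg), group(4)

* exp: mean() takes an expression, here price per pound
egen mean_ppw = mean(price / weight)

* exp again: a single variable also counts as an expression
egen mean_mpg = mean(mpg)

list price weight length row_avg mpg mpg_q in 1/5
```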

To get help on general-use functions, type help functions and then click on the type of function you need (for example, string functions). To use these functions you need to find out how many arguments are needed and what they mean. For example, the abbrev() function is listed as abbrev(s,n), which tells you it takes two arguments. Domain s: strings and Domain n: 5 to 32 tells you the first argument must be a string and the second must be a number between 5 and 32, but they don't have to be called s or n. The inputs can be variables of the proper types, or quantities you type in. Range: strings tells you the output is a string, and Description: returns s, abbreviated to n characters, along with the longer note below that, tells you what that string will be.
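A short sketch of abbrev() in use, first with both arguments typed in and then with a string variable as the first argument. The auto dataset and the variable name short_make are assumptions for illustration.

```stata
sysuse auto, clear

* Both arguments typed in directly: a string and a number between 5 and 32
display abbrev("displacement", 8)

* The first argument can instead be a string variable (make is a string variable in auto)
generate short_make = abbrev(make, 10)
list make short_make in 1/5
```

Because the output is a string, generate creates short_make as a string variable, with no value longer than 10 characters.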

Findit

You'll often know what you want to do but not the name of the command that will do it. Then findit is your best bet—think of it as Google for Stata. For example, suppose you want to do something with Heckman selection models. If you type

findit heckman

you'll get a tremendous amount of information. First Stata will search the help files and point out that there is a heckman command, along with related commands like suest and treatreg. Then it will search the Frequently Asked Questions files on Stata's web site and the large Stata web site at UCLA. Finally it will search through the user-written programs that have appeared in the Stata Journal, the old Stata Technical Bulletin, or in the Boston College Statistical Software Components archive. You can find out what these programs do by reading their help files (.hlp), and if you decide they'll be useful, you can download and install them by clicking the click here to install link. See Finding and Installing User-Written Stata Programs for more information.
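The same search and install steps can also be typed as commands. This is only a sketch: it needs an internet connection, and estout is an example package name from the Boston College archive, not one this series mentions.

```stata
* findit keyword is equivalent to search keyword, all
search heckman, all

* Check whether a command is already available; which reports the file that defines it
which heckman

* Describe, then install, a user-written package from the SSC archive
* (estout is just an example package name)
ssc describe estout
ssc install estout
```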

Another useful tool for finding commands is the Also see section at the bottom of each help file. If you can think of a command that's close to what you want to do, call up its help file and then see what's related to it.

Documentation

More extensive documentation is available as PDF files. For example, click on the heckman command in the findit results to see its help file, then click [R] heckman at the top. This opens the entry for heckman in the Reference manual (hence the [R]). This will give you a longer description of what the command does, along with worked-out examples and technical information about how the command is implemented. The references can be a good place to start if you need to learn more about the theory behind the method.

The PDF documentation is also good for learning about Stata in general, especially the User's Guide (section headings like this, which were once separate books, are found on the left). You can open the PDF documentation directly by clicking Help, PDF Documentation.

SSCC Resources

The SSCC's Knowledge Base has a large section on Stata, including general guides like this one and discussions of specific topics like Bootstrapping in Stata or Using Stata Graphs in Documents. Once you feel confident using Stata's basic syntax, we strongly suggest reading Stata Programming Essentials. It will teach you things like how to do the same thing to ten different variables without having to write it out ten times. If you're interested in graphics, be sure to read An Introduction to Stata Graphics.
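To give a flavor of what doing the same thing to several variables looks like, here is a minimal foreach sketch. It is not taken from Stata Programming Essentials; the auto dataset and the mean_ and dev_ variable names are assumptions for illustration.

```stata
sysuse auto, clear

* Repeat the same two commands for each variable in the list
foreach var of varlist price mpg weight {
    * Mean of the current variable across all observations
    egen mean_`var' = mean(`var')
    * Deviation of each observation from that mean
    generate dev_`var' = `var' - mean_`var'
}

summarize dev_price dev_mpg dev_weight
```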

The SSCC offers classes on Stata each semester, generally including a class based on this Stata for Researchers series, a class on Stata programming, and at least one class on some other topic. See the training web page for details and to register.

Finally, the SSCC's statistical consultants are available to assist SSCC members. We cannot write your Stata programs for you. But we will be more than happy to help with planning your project, figuring out the commands that will make your program work, and of course finding and fixing bugs along with consulting on statistical methodology.

Practice

The most important resource for learning Stata is practice. If you don't use the skills and knowledge you've gained from reading this series within the next few weeks (at most) you'll lose them rapidly. If you don't have a current research project that will require you to use Stata, make one up.

One particular pitfall to watch out for is "I'll just do it in Excel." It may be true that you can carry out a particular task in Excel faster than you can first learn how to do it in Stata and then actually carry it out. But if you do it in Stata anyway, the next time it comes up you'll be able to do it much more quickly in Stata than in Excel (and more reproducibly, and with less likelihood of error). You'll also build up your general Stata expertise, so that soon you'll be able to do things faster in Stata even if you've never done them before. Now that you've spent the time to learn Stata, plan on never using Excel for research again.

This concludes the Stata for Researchers series. We hope it has been useful to you, and that your relationship with Stata will be a long and productive one.

Previous: Do Files and Project Management

Last Revised: 1/4/2016