Statistical Computing at the SSCC

Statistical computing was the original reason social scientists got interested in computers; email, web pages, and all the rest came later. The SSCC sees statistical computing as central to our mission of supporting social science research, and we're eager to help you start getting useful results as quickly and easily as possible.

Which Statistical Program Should You Use?

SSCC staff fully support three general-purpose statistical packages: Stata, SAS, and SPSS. Each has its strengths and weaknesses, and each has its advocates. If you know one of them well, the benefit of using that expertise probably outweighs anything you would gain by switching. If you will be collaborating with someone who is dedicated to one program, you probably want to use the same one. Otherwise, the packages differ enough that, if you have a choice, you should pick the one that best meets your needs.

Stata is the most popular statistical program at the SSCC. It has a point-and-click graphical user interface that makes it possible to get basic results right away, but also makes it easy to create and run complex programs. Stata's syntax is fairly intuitive and easy to learn, but very powerful. Stata has a very large variety of statistical techniques built in, and its user community has made code available for many more.

SAS is as powerful and flexible as Stata, but it has a steeper learning curve. SAS requires writing out programs, but this is a good practice in any statistical package. While most new SSCC members are choosing to learn Stata over SAS, very few SAS veterans see any need to switch. Also, SAS is used heavily in business and government, so familiarity with SAS is a very marketable skill.

SPSS is the least popular program for research use at the SSCC, but it is often used for teaching statistics (though Stata works well for teaching too). SPSS for Windows has a graphical user interface similar to Stata's, but it does not make the transition to writing programs as easy. SPSS also has fewer statistical techniques built in than Stata or SAS.

A variety of special-purpose statistical packages are also available to you. Please see our Software web page to look up what software we have, where it runs, and how much help SSCC staff can offer you with it.

Which Operating System Should You Use?

The SSCC provides labs stocked with PCs running Windows, Winstat (our Windows Terminal Server farm) for remote use, and Linux servers. Stata, SAS and SPSS are all available on both operating systems.

If, like most new SSCC members, you are comfortable with Windows but have never used Linux, you can probably stick with Windows. Windows has made great strides as a platform for serious computing, and Winstat provides a great deal of computing power. However, you might want to consider running your jobs using Linux if:

  • Your jobs require more memory than Windows can provide
  • You'd like to run multiple jobs at the same time
  • Your jobs are taking a long time to run

Linux servers are happy to run jobs for weeks at a time if necessary. On the other hand, Winstat disconnects sessions after 24 hours and terminates disconnected sessions after three hours, and the lab PCs are intended for use while you are present at the PC (though we do have some PCs that can be reserved for long-running jobs).
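
For the memory criterion in the list above, a rough size estimate can help you decide before you load the data. The following is a minimal sketch in Python, not an SSCC tool: the function names and the example data set are hypothetical. The byte widths are those of Stata's numeric storage types, and the estimate ignores string variables and any working memory the program needs on top of the data.

```python
# Bytes per value for Stata's numeric storage types.
STATA_BYTES = {"byte": 1, "int": 2, "long": 4, "float": 4, "double": 8}


def estimate_dataset_bytes(n_obs, var_types):
    """Rough in-memory size: observations times the width of one row."""
    row_width = sum(STATA_BYTES[t] for t in var_types)
    return n_obs * row_width


def to_gib(n_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


# Hypothetical data set: 50 million observations, 40 doubles and 20 ints.
example_types = ["double"] * 40 + ["int"] * 20
size = estimate_dataset_bytes(50_000_000, example_types)
print(f"Estimated size: {to_gib(size):.1f} GiB")
```

Compare the result with the memory available on the machine you plan to use; if the data alone come close to that limit, the job is a good candidate for the Linux servers.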

Linux's command-line interface can be intimidating because it just waits for you to type something without giving you any icons or menus to suggest what you can do. However, once you know what to type it's highly efficient. Also, having to type out everything you do supports reproducibility, record-keeping, and other good practices in research computing.

The SSCC has many resources for learning Linux, including classes and Knowledge Base articles. Alternatively, the Linux file system is available from Windows, so it's possible to write your programs in Windows and switch to Linux just to run them. See Running Linux Programs Using Windows (Mostly). Anyone submitting Linux jobs should also take a look at Managing Jobs on Linstat. Not only will you learn tricks that will help you work more efficiently, you'll also learn how to avoid slowing down the servers for others. If you really want to become a Linux expert, see Getting Started in Linux.

For more details about the SSCC's servers and their capabilities, see Computing Resources at the SSCC.

Learning to Use Statistical Software

Learning to use any statistical package is an investment, and you should plan to put in some time before getting any useful results. Shortcuts such as using other people's code rarely save you much time in the end unless you already know what you're doing. Start with a general introduction. Our Knowledge Base includes Stata for Researchers, Stata for Students and An Introduction to SAS Data Steps. If you prefer books, Stata's User's Guide really will teach you to use the program, or try A Gentle Introduction to Stata by Acock. For SAS we suggest The Little SAS Book by Delwiche and Slaughter. Some books are available in the CDE Library or the 4218 computer lab. The SSCC regularly offers short classes on all three statistical programs; check the schedule on the web for details. Also, the Sociology Department has some official timetable courses that cover statistical software.

Don't forget about the SSCC Knowledge Base after you've started your project. Our section on statistical software can teach you more advanced topics or answer specific questions. Stata and SPSS both have online help built into the program (Stata's help and findit commands are particularly handy). SPSS and SAS have documentation online, and full Stata documentation is available in the program through its Help menu.

As always, the SSCC Help Desk can help you. Staff with expertise in statistical software consult from 1:00 to 4:00 every afternoon, but feel free to bring up statistical questions at any time; whoever is on duty will record them and refer them to the person who can best answer them. We can answer general statistical questions, but we do not have a statistician on staff, and we cannot write your programs for you. We will, however, be more than happy to help with planning your project, finding the tricks that will make your code work, and of course finding and fixing bugs.

Last Revised: 10/19/2009