As time passes new SSCC users tend to have more and more experience with computers, but mostly using PCs running Windows. However, statistical work for research often requires the power of Linux. Fortunately a program called Samba makes the Linux file system available to Windows, so you can actually do a lot of work with Linux programs from Windows. This article will teach you how to run Stata, SAS or SPSS programs on Linstat while using Windows as much as possible. Similar techniques also work for other programs, like R, Matlab, C/C++, FORTRAN, etc.
The first step is to write your program (or at least a first draft of it) using a text editor. For Stata programs, we suggest using the Do File Editor included in Windows Stata. For other programs we suggest TextPad because it follows all the Windows conventions you're used to so you'll be able to use it right away, but it also includes many features that are useful to programmers such as syntax highlighting. However, you're welcome to use any text editor you prefer. Note that it must be a text editor and not a word processor like Word: word processors save formatting information along with the text, and Linux programs will be confused by that formatting.
If you can put your data, your program, and any output in the same directory your program won't need to specify where these other files are found. You can simply give the file name and Linux will understand that the file is in the same directory as the program itself. If you need to reference other directories, see the discussion of how Linux works with directories under Change to the Proper Directory below.
The programs you'll write and data you'll use should probably be stored in either your Linux home directory or a Linux project directory. If you are logged into Winstat through PRIMO or using a PC in the Sewell Social Sciences Building, your Linux home directory is mapped as the Z: drive and Linux project directories are mapped as the V: drive. If you are logged into Winstat through SoE or if you are connecting from elsewhere using VPN, you'll need to map the directories you need yourself, which is not difficult.
It's simplest to put both your program and your data directly in your home directory (Z: drive) because when you log into Linux you'll start in your home directory automatically. But you may need to use a project directory if you are collaborating with other people or if your data are too big for your home directory. If your work is particularly complex you may also need to use subdirectories within your home directory to keep it organized. Using a location other than your home directory will add a step later.
Note that Linux is case-sensitive and doesn't like spaces in file or directory names. Windows will let you put spaces in, but to use those names in Linux you'd have to put them in quotes.
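As a rough illustration of both points, the following sketch works in a scratch directory created with mktemp so it touches none of your real files. The file names (my data.csv, Results.log) are made up for the example.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A name containing a space must be quoted every time you use it.
touch "my data.csv"
ls "my data.csv"

# Renaming it with an underscore removes the need for quotes.
mv "my data.csv" my_data.csv
ls my_data.csv

# Case matters: on Linux these are two different files.
touch Results.log results.log
ls
```

Without the quotes, a command like ls my data.csv would look for two separate files called my and data.csv.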
Next log into Linstat. We suggest using X-Win32 as your client program: it's already installed on Winstat and can be freely downloaded and installed elsewhere by UW faculty, staff and students. The SSCC's X-Win32 instructions cover its use in full, but the short version is to start the program, click once on the icon it creates in the lower right corner of your screen, and click on any of the available sessions.
If you saved your program directly in your home directory (Z: drive), you can skip this step entirely. But if you saved it elsewhere you'll need to use the cd (change directory) command to go to the directory where you saved your program. The general syntax is just
cd directory
where directory is the directory you want to change to. This requires knowing a bit about how Linux works with directories, which will also be useful if your programs need to refer to files in other directories. A few important differences from Windows:
When you first log into Linstat you start in your home directory, so if you had a folder on your Z: drive called research you'd change to it by typing
cd research
The Linux name of the V: drive is /project. Since this starts with a forward slash it means "go up to the root of the directory tree, then down to project." If you were working on the hivaids project and thus saved your program and data in V:\hivaids, you'd need to type
cd /project/hivaids
to get there.
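If you'd like to try relative and absolute paths without touching your real directories, here is a small sketch. It builds a scratch tree with mktemp whose home and project folders are only stand-ins for your real home directory and /project. The pwd command prints the directory you are currently in.

```shell
# Build a scratch tree standing in for a home and a project directory.
root=$(mktemp -d)
mkdir -p "$root/home/research" "$root/project/hivaids"

# Stand-in for where you start right after logging in.
cd "$root/home" || exit 1

# Relative path: no leading slash, so it is looked up inside the current directory.
cd research
pwd

# Absolute path: starts with a slash, so it works no matter where you are.
cd "$root/project/hivaids"
pwd

# Two dots mean "the directory above this one".
cd ..
pwd
```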
You're now ready to actually run your program. The details will depend on which statistical package you're using:
If you want to work in interactive Stata just like in Windows, type
xstata
However, it's usually somewhat more efficient to work in batch mode. If your program were called dofile.do, you'd run it by typing
stata -b do dofile
You could also submit your job to Condor, which is an excellent choice for jobs that will take more than a few minutes. To do so simply type:
condor_stata -b do dofile
You can also submit Stata jobs to Condor via the web: see Submit a Stata Job to Condor.
If your SAS program were called sasprog.sas, you'd run it by typing
sas sasprog
You can invoke an interactive SAS session by typing just sas, but it's rather clumsy and few people find it useful.
To submit a SAS job to Condor, type:
condor_sas sasprog
If your SPSS program were called syntax.sps and you wanted to save the output in output.log, you'd type
spssb -f syntax.sps -out output.log
The SSCC does not have interactive SPSS for Linux.
You're now ready to take a look at your output and see how well your program worked.
For Stata, your program should contain a log command that specifies a log file. You can open this file using Windows Stata's log viewer or in TextPad.
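If you'd rather check a Stata log from the Linux command line first, one possible sketch is below. It assumes that errors show up in the log as return-code lines such as r(601); at the start of a line, which is how Stata normally reports them. The function name stata_log_errors and the demonstration log are made up for this example; the demonstration log is a hand-written stand-in, not output from a real run.

```shell
# Print any Stata return-code lines (for example "r(601);") found in a log,
# with their line numbers. Exit status is 0 if any were found, 1 if none.
stata_log_errors() {
    grep -n -E '^r\([0-9]+\);' "$1"
}

# Hand-written stand-in for a log file, used only to demonstrate the function.
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
(stand-in for the earlier lines of a log)
r(601);
EOF

if stata_log_errors "$demo_log"; then
    echo "Return codes found: fix the do file and run it again."
else
    echo "No return-code lines found."
fi
```

With a real log you would type something like stata_log_errors mylog.log, where mylog.log is whatever name your log command specified. Finding no return codes doesn't prove the results are right, so you should still read the log.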
If your program is called sasprog.sas, SAS will create a log file called sasprog.log when it runs. If the program creates any output it will also create sasprog.lst. Open both files in your text editor. Be sure to look at the log file before trusting the output file. One common scenario is that an error in the program causes it to crash before creating any output, but the output file from a previous run still exists and it won't be obvious that it's not from the current run.
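The stale output problem can be partly caught with a quick check like the sketch below. The function name sas_check is made up, and the check rests on two assumptions: that error lines in a SAS log begin with the word ERROR, and that an .lst file older than the .sas file was produced before your latest saved edit. It can't catch every case (for example, rerunning an unchanged program that crashes), so treat it as a first look rather than a substitute for reading the log.

```shell
# Check the files left by running "sas NAME" in the current directory.
# Prints what it finds; returns 1 if there is a problem, 0 otherwise.
sas_check() {
    sas_name=$1
    sas_rc=0
    # SAS marks error lines in the log with a leading "ERROR".
    if grep -n '^ERROR' "$sas_name.log"; then
        echo "$sas_name.log contains errors"
        sas_rc=1
    fi
    # -ot means "older than": an old listing is probably from a previous run.
    if [ -f "$sas_name.lst" ] && [ "$sas_name.lst" -ot "$sas_name.sas" ]; then
        echo "$sas_name.lst is older than $sas_name.sas and may be from a previous run"
        sas_rc=1
    fi
    return $sas_rc
}
```

After running sas sasprog you would type sas_check sasprog in the same directory.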
For SPSS, use your text editor to open the file you specified with the -out option when you ran the program.
Most likely your program won't work the first time you run it, so you'll need to make changes and run it again. Remember to save the program after making the changes: Linux reads the program from the disk, not from your text editor! In Linux the up arrow will retrieve your previous command, which is a handy shortcut for running a program repeatedly. If you're using TextPad, it will notice when output files are changed and prompt you to reload them. Then you can see if your changes actually fixed the problems, and make further changes as needed.
Last Revised: 10/26/2010
