Linstat is the SSCC's primary Linux computing cluster. Linstat combines familiar statistical software like Stata, SAS, SPSS, R and Matlab with the power of Linux, making it ideal for jobs that require more memory or computing time than Winstat can provide. Linstat also gives you access to the SSCC's Condor flock, where you can run very long jobs (days or weeks) or run multiple jobs at the same time.
Learning to run jobs on a Linux server is probably easier than you think. If you're new to Linux, be sure to read the section Getting Started on Linstat. Veteran Linux users can probably skip that section, but should still read the sections before it, which describe some of the unique features of Linstat.
How you'll connect to Linstat depends on what kind of computer you're connecting from:
If your computer runs Windows, we suggest you connect using a program called X-Win32 (though there are many fine alternatives). X-Win32 is already installed and configured on Winstat, so one option is to log in to Winstat and run X-Win32 there. Alternatively, you can download and install a pre-configured version of X-Win32 from the SSCC web site. Simply download the installation file and then double-click on it.
Download X-Win32 from the SSCC
When you run X-Win32 it will place an icon in the lower right corner of your screen.
Click on the icon once and choose Linstat.
For more details, including how to set up a connection to a particular Linstat server, see Connecting to SSCC Linux Computers using X-Win32.
Macs and Linux computers have client programs for connecting to Linux servers installed by default. Simply start a Terminal program (on a Mac it will be found under Applications, Utilities) and then type:
ssh -Y username@linstat.ssc.wisc.edu
username should be replaced by your SSCC username. If your username on your computer is the same as your SSCC username, you can leave it out (ssh -Y linstat.ssc.wisc.edu). If you are plugged into the wired network in the Sewell Social Sciences Building you can leave out the domain (ssh -Y linstat).
For more details, including how to connect to a particular Linstat server, see Connecting to Linstat from a Mac.
When you connect to Linstat, you'll be directed to one of the three Linstat servers (linstat1, linstat2 and linstat3) automatically. This will spread users among the three servers and help avoid situations where one server is much busier than another.
If you are running a long job and need to connect to the same server again to monitor it, log in to Linstat and then type ssh server, where server should be replaced by the name of the server where you started the job. Be sure to note which server you're on when you start a long job. Most people have the server name in their prompt, but if you don't you can find out which server you're using by typing printenv HOST. It's also possible to connect to a specific server directly—the links in the previous section have instructions.
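One way to remember which server a long job is on is to write the name down when you start it. The following is only a sketch: job_servers.txt is a hypothetical file name, and uname -n is used as a fallback on the assumption that your shell may not set HOST.

```shell
# Work out which server this session is on.
# HOST is what the article checks with printenv HOST; uname -n is a fallback.
server="${HOST:-$(uname -n)}"

# Append a note to a hypothetical log file in the current directory.
echo "$(date '+%Y-%m-%d %H:%M') started job on $server" >> job_servers.txt

# Show the most recent note.
tail -n 1 job_servers.txt
```

When you come back later, the last line of the file tells you which server to name in ssh server.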
Almost all the software installed on Linstat is installed on all three Linstat servers. The two exceptions (due to licensing restrictions) are SPSS and Stat/Transfer, which are installed only on Linstat1. If you run SPSS or Stat/Transfer on another Linstat server they will automatically connect to Linstat1 and run your job there, but if you need to manage that job later you'll need to log in to Linstat1 to do so.
The Linstat servers have 48GB of RAM, but we've limited the amount any one person can use to 24GB. This will help prevent the servers from running out of RAM, which causes performance and stability problems. Exceptions can be made, so if you need more than 24GB of RAM contact the Help Desk.
/ramdisk is a special "directory" that is actually stored in RAM, making it extremely fast. The maximum size of /ramdisk is 24GB, and any files that are not in use will be deleted after one hour. /ramdisk can be very helpful for programs that spend a lot of time reading and writing temporary files.
On Linstat, the default directory where SAS stores temporary data sets (the WORK library) is /ramdisk. This increases the speed of data-intensive programs significantly. It also prevents them from slowing down the entire server due to disk I/O bottlenecks.
If you need more than 24GB of temporary space, change the WORK directory to /tmp. You can do so by adding the -work option to your SAS command:
sas -work /tmp myprogram
You'll then be able to use up to 200GB of space (or as much of it as is available at the time). For more details see Running Large SAS Jobs on Linstat.
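Before choosing between /ramdisk and /tmp, it can help to see how much space is actually free. The sketch below uses the standard df command; show_space is a hypothetical helper name, and /ramdisk will simply be reported as missing on machines other than Linstat.

```shell
# Print available space for each directory given, skipping any that do not exist.
show_space() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # -h prints sizes in human-readable units (GB, MB, ...)
            df -h "$dir"
        else
            echo "$dir not found on this machine"
        fi
    done
}

show_space /ramdisk /tmp
```

The "Avail" column in the df output is the space you could use at that moment; other users' jobs may change it while yours runs.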
The SSCC's Condor flock contains 48 CPUs and is ideal for running very long statistical programs or for running multiple jobs at the same time. Condor can run Stata, SAS, Matlab and R jobs as well as user-written programs. We've written scripts that make submitting jobs to Condor very easy; see An Introduction to Condor for instructions. (You can also submit Stata jobs to the Condor flock via the web.)
Linux can be intimidating because it just waits for you to type commands without giving you any menus or icons to suggest what you can do. But if all you want to do is run jobs, you can get by with just a couple of Linux commands. Here's how:
If you're on Winstat or a Windows PC that logs into the SSCC's PRIMO domain, your Linux home directory is available as the Z: drive, and Linux project directories are the V: drive. They're also available from Macs—see Using SSCC Network Disk Space from Macs. This means you can write your program, manage your files, etc. using the tools you're familiar with and still put the programs and related files on the Linux file system so Linstat can run them.
Put all the files relating to a given project in a single folder (or directory in Linux terminology), then write your programs on the assumption that that folder will be your working directory (i.e. a Stata program should say use datafile, not use z:\research\datafile). If you're only working on a single project, you can simply treat Z: itself as that project's "directory."
When you log into Linux, your "current working directory" (where you "are" in the file system) starts out as your home directory—what Windows calls Z:. If that's where your project's files are, you can skip directly to running your job. Otherwise you'll need to go to your project's directory using the cd ("change directory") command. If your project's directory is on your Z: drive, type:
cd projectDirectory
Replace projectDirectory with the name you gave your project's directory.
If your project's directory is inside an official Linux project directory on the V: drive, type:
cd /project/projectName/myProjectDirectory
Note how Linux separates directories with forward slashes (/) not backslashes (\), and that there are no drive letters in Linux. Spaces in file or directory names must be quoted or escaped on the Linux command line, so it's easiest to avoid them.
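If you'd like to practice moving around before touching your real files, something like the following should work. It is a sketch only: the directory names research and wages_2012 are made up for illustration, and everything happens inside a throwaway directory created by mktemp.

```shell
# Create a throwaway practice area so nothing in your home directory is touched.
practice=$(mktemp -d)

# Make a project directory (no spaces in the name) and move into it.
mkdir -p "$practice/research/wages_2012"
cd "$practice/research/wages_2012"

# pwd prints the current working directory, separated by forward slashes.
pwd

# cd .. moves up one level; cd with no argument would return to your home directory.
cd ..
pwd
```

The pwd command is a quick way to confirm you are where you think you are before running a job.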
The command to run your program will depend on the program you want to use. Here are some of the most popular:
You can start Stata's graphical user interface by typing xstata. (The do file editor in Linux Stata does not have all the features of the do file editor in Windows Stata, so you probably want to do your program writing in Windows.) You can also run a do file called mydofile.do in batch mode by typing:
stata -b do mydofile
Alternatively you can submit it to Condor with:
condor_stata mydofile
If you run mydofile.do in batch mode or on Condor, Stata will automatically log its output in mydofile.log.
You can start SAS's graphical user interface by typing sas, though it's somewhat clunkier than the Windows version. You can also run a program called myprogram.sas in batch mode by typing:
sas myprogram
Alternatively you can submit it to Condor with:
condor_sas myprogram
If you run myprogram.sas in batch mode or on Condor, SAS will create a log file called myprogram.log and put any output in myprogram.lst.
SPSS on Linstat does not have a graphical user interface, so you'll need to write your syntax files ahead of time (perhaps using Windows SPSS) and then run them. To run myprogram.sps and save its output in myprogram.log, type:
spssb -f myprogram.sps -out myprogram.log
To run R, simply type R. It does not have a graphical user interface but the commands are the same as in Windows R. To submit myprogram.R to Condor and save the output to myprogram.log, type:
condor_R myprogram.R myprogram.log
You can start Matlab's graphical user interface by typing matlab. To submit myprogram.m to Condor and save its output in myprogram.log, type:
condor_matlab myprogram.m myprogram.log
Linstat has many other programs available (see our software database). See the documentation of the program you're interested in for details on how to run it.
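If you find it hard to remember which batch command goes with which kind of file, a small helper can print it for you. This is just a sketch: batch_command is a hypothetical function, it only prints the commands described above rather than running them, and it covers only Stata, SAS and SPSS.

```shell
# Print the batch command from this article that matches a program file's extension.
batch_command() {
    case "$1" in
        *.do)  echo "stata -b do ${1%.do}" ;;               # Stata do file
        *.sas) echo "sas ${1%.sas}" ;;                      # SAS program
        *.sps) echo "spssb -f $1 -out ${1%.sps}.log" ;;     # SPSS syntax file
        *)     echo "no batch command listed for $1" >&2
               return 1 ;;
    esac
}

batch_command mydofile.do      # prints: stata -b do mydofile
batch_command myprogram.sas    # prints: sas myprogram
batch_command myprogram.sps    # prints: spssb -f myprogram.sps -out myprogram.log
```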
If your job will run for a long time, put it "in the background" by adding an ampersand (&) to the end of the command. For example:
stata -b do mydofile &
This will allow you to do other things on Linstat while the job is running, or log off without interrupting the job. Jobs submitted to Condor are essentially "in the background" already.
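To see what the ampersand does without tying up a statistical package, you can try it with a harmless command. In this sketch, sleep stands in for a long job such as stata -b do mydofile, and job_pid is just an illustrative variable name.

```shell
# sleep stands in for a long-running job; the & puts it in the background.
sleep 1 &

# $! holds the process ID of the most recent background job.
job_pid=$!
echo "Job running in the background with process ID $job_pid"

# The prompt comes back right away, so other commands can run in the meantime.
echo "Doing other work while the job runs"

# wait pauses until the background job finishes.
wait "$job_pid"
echo "Background job has finished"
```

In everyday use you would not normally call wait; it is here only so the example shows the job completing.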
For more information see Managing Jobs on Linstat.
While this will get you started, there are several other SSCC Knowledge Base articles you can read to become a more flexible and efficient Linstat user. Running Linux Programs Using Windows (Mostly) will give you more details about running programs on Linstat while doing most of the work in Windows. Managing Jobs on Linstat will teach you how to monitor and manage jobs while they run. An Introduction to Condor will teach you more about the SSCC's Condor flock and how to use it. Finally, if you really want to make yourself at home in Linux, read the SSCC's Getting Started in Linux. For a full list of articles, visit the Linux section of our Knowledge Base. SSCC staff will also be happy to answer any questions you have about using Linstat and help you solve any problems you run into—just contact the Help Desk.
Last Revised: 7/24/2012
