One of the main reasons for using Linstat is that it can run very long jobs. This article will teach you how to manage jobs on Linstat.
Normally when you type a command, it is processed and you see the results (if any) before the cursor returns and you can type a new command. These jobs are said to be running in the foreground. If you put a job in the background, the cursor returns immediately and you can keep giving commands and doing other work while your job is running. When the job finishes, a message will appear on your screen.
To run a job in the background, add an ampersand (&) at the end of the command. For example, if you type:
stata -b do myprogram
Stata will start and run myprogram.do in the foreground. Thus your session will be unavailable until the job is done. On the other hand:
stata -b do myprogram &
runs Stata in the background. The cursor returns immediately and you can do other things while Stata is running your program. When it is done you'll see a message like:
[1] Done stata -b do myprogram
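The same pattern can be sketched in a script. This is a minimal illustration written for a Bourne-style shell (sh or bash) rather than an interactive tcsh session, and it uses sleep as a hypothetical stand-in for a long command such as stata -b do myprogram. $! holds the PID of the most recent background job, and wait pauses until that job finishes.

```shell
# "sleep 2" is a stand-in for a long job such as: stata -b do myprogram
sleep 2 &
job_pid=$!                      # PID of the job just put in the background
echo "Job $job_pid is running in the background"

# The shell is free for other work while the job runs
echo "Doing other work in the meantime"

wait "$job_pid"                 # block until the background job finishes
job_status=$?                   # exit status of the job (0 means success)
echo "Job $job_pid finished with status $job_status"
```

At an interactive prompt you would normally skip the wait and simply keep working until the Done message appears.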
A job which creates a separate window (xstata, for example) will be completely functional in the background (in fact xstata puts itself in the background by default).
If a job is running in the background it will keep running even if you log out, so you can start a long job before you leave in the evening, log out, and get the results the next morning (or next week, or next month--though such jobs are good candidates for Condor).
What you should not do when you have a job running in the background is start another CPU-intensive job--see the SSCC's Server Usage Policy.
If you have a job running in the foreground and want to put it in the background, press CTRL-z (if the job has opened a separate window, you must return to your main Linstat window before pressing CTRL-z). The current job will be suspended and you will get your cursor back. Then type bg to put it in the background--it will not run while suspended. You can also type fg to move it back to the foreground, either from being suspended or from the background.
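CTRL-z, bg and fg only work at an interactive prompt, but the underlying suspend and resume can be sketched in a script with signals. This example is an illustration under assumptions, not part of the Linstat instructions: it uses sleep as a stand-in job, kill -STOP to suspend it (CTRL-z sends the closely related TSTP signal), and kill -CONT to resume it, which is the signal bg sends to a suspended job. The stat column of ps is expected to start with T while the job is stopped.

```shell
sleep 5 &                               # stand-in for a long job
pid=$!

kill -STOP "$pid"                       # suspend: the job stops running
sleep 1                                 # give the signal time to take effect
stopped_state=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "State while suspended: $stopped_state"

kill -CONT "$pid"                       # resume the job in the background
sleep 1
resumed_state=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "State after resuming: $resumed_state"

kill "$pid"                             # clean up the stand-in job
```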
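The ps command (think processes) lists the processes you are running from your current terminal session on the server; ps -u followed by your username lists all of your processes on that server, including jobs started in an earlier session. The output will be similar to the following: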
PID TTY TIME CMD
29413 pts/30 00:00:00 tcsh
1601 pts/30 00:00:00 emacs
1602 pts/30 00:00:00 emacs
1605 pts/30 00:00:00 ps
PID is short for Process IDentifier, and is used when you need to specify a particular job. Keep in mind that Linstat is a cluster of three servers, and ps will only show you the jobs you are running on the server you're logged into. See Switching Between Linstat Servers to learn how to get back to the Linstat server where you started your job.
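ps also accepts options that choose which processes and columns to show. The sketch below assumes the procps version of ps found on typical Linux systems: -u selects every process you own on this server, and -o picks the columns. The variable names are illustrative only.

```shell
me=$(id -un)                                   # your login name

# All of your processes on this server: PID, terminal, CPU time and command
listing=$(ps -u "$me" -o pid,tty,time,comm)
echo "$listing"

# Count your processes (every line except the header)
count=$(echo "$listing" | tail -n +2 | wc -l | tr -d ' ')
echo "$me has $count processes on $(uname -n)"
```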
Another useful command for monitoring jobs is top. This will tell you the "top" jobs (in terms of resources used) currently running on the server. With it you can check that your job is actually using the CPU by confirming that its %CPU is greater than zero. This is not proof of progress, though: jobs can easily get stuck in a state where they use 100% of a CPU without doing anything productive.
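For a one-off check without opening top, ps can report a single job's CPU use. This is a sketch under assumptions: the busy loop is a hypothetical stand-in for a CPU-bound job, and the pcpu column in the common Linux ps is averaged over the life of the process, so it will not match top's moment-to-moment figure exactly.

```shell
# Hypothetical CPU-bound job: a busy loop running in the background
( while :; do :; done ) &
pid=$!
sleep 2                                    # let it accumulate some CPU time

cpu=$(ps -o pcpu= -p "$pid" | tr -d ' ')   # %CPU for that one PID
echo "PID $pid is using ${cpu}% CPU"

kill "$pid"                                # stop the stand-in job
```

For a real job, replace the busy loop with the PID reported by ps.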
top also gives you a sense of how busy the server is. The Linstat servers have eight CPUs and thus %CPU can add up to 800%. If the Linstat you're on has less CPU time available than your program is capable of using, consider switching to a different Linstat. (Most programs can only use one CPU, or 100%, but some can use more. Stata/MP on Linstat can use up to 400%, for example, though it usually uses less.)
Unfortunately top does not monitor all the resources a server needs to run jobs. For example, SAS jobs occasionally generate enough disk traffic to slow down a server without anything unusual appearing in top.
If you need to stop a running job, use the kill command. Simply type kill and then the PID of the job you want to kill. For example:
kill 1602
This doesn't stop the job outright; it merely requests that the job shut down, giving the program an opportunity to clean up temporary files and such. Unfortunately neither SAS nor SPSS cleans up after itself in this case, so if you kill one of these jobs, please go to the /tmp directory and manually delete any files and directories belonging to you. On the other hand, adding the -9 switch to the kill command will kill a program immediately with or without its consent. Thus:
kill -9 1602
will kill process 1602 immediately, without giving it a chance to clean up.
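The two forms of kill are often combined: ask politely first, and use -9 only if the job is still there after a grace period. A minimal sketch, assuming a sleep stand-in for the job and a one-second grace period chosen arbitrarily; kill -0 sends no signal and only tests whether the PID still exists.

```shell
sleep 30 &                          # stand-in for the job to be stopped
pid=$!

kill "$pid"                         # polite request (the default TERM signal)
sleep 1                             # grace period for cleanup; length is arbitrary

if kill -0 "$pid" 2>/dev/null; then # PID still exists after the grace period
    kill -9 "$pid"                  # force it
    outcome="forced"
else
    outcome="exited on request"
fi
wait "$pid" 2>/dev/null || true     # collect the finished job; its status is nonzero
echo "Process $pid: $outcome"
```

A real job may need a longer grace period than the stand-in to finish removing its temporary files.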
Linstat is actually a cluster of three servers. When you log in you're assigned to a server randomly to try to balance the load between them. However, you can choose to connect to a specific server to monitor a job you started previously or if the server you're assigned to turns out to be particularly busy.
To switch to a different server, type:
ssh server
where server can be linstat1, linstat2 or linstat3. Alternatively you can set up your client program to log in to one of those three servers directly.
Be sure to note which server you're on when you start a long job. If the server name is not in your prompt, you can identify it by typing:
printenv HOST
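One way to keep track is to record the server name next to the job's PID when you start it. The following sketch assumes a Bourne-style shell and a hypothetical log file named myjobs.log in the current directory; it uses uname -n to get the server name, since the HOST variable is not set by every shell.

```shell
logfile=./myjobs.log                # hypothetical file name
server=$(uname -n)                  # name of the server you are logged into

sleep 2 &                           # stand-in for a long job
pid=$!

# Append one line per job: server, PID and start time
echo "$server $pid $(date)" >> "$logfile"
tail -n 1 "$logfile"
```

The last line of the log then tells you which server to return to and which PID to look for.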
Last Revised: 10/28/2010
