Explanation of Mail Delays
Over the past two weeks we've experienced intermittent delays in
delivering mail. The mail server was simply too busy to deliver
messages as quickly as they arrived. The immediate cause was the
additional steps we took to filter spam: now that SpamAssassin looks
for more clues that a given message is spam, it takes more processor
time to examine each message.
However, this was a case of "the straw that broke the camel's back,"
as mail volume was already approaching the limit of what our mail
cluster could process. Since December of last year, the total number
of email messages we process on a typical day has doubled, and the
total size of the email traffic has increased sixfold (so the average
message is now about three times larger). Much of the increase in
size is due to spammers disguising their advertisements in images
so filters like SpamAssassin can't read them.
To fix these delays, we've moved the SpamAssassin program from
the main mail server cluster to its own cluster of four servers.
By spreading the load among four servers which do nothing but run
SpamAssassin, we can process today's mail load in a timely manner
so mail delivery will go back to being nearly instant. As mail volume
continues to increase, we can add more servers to the SpamAssassin
cluster or the main mail cluster as needed.
We are still fine-tuning the SpamAssassin cluster, so you may notice
more spam in your inbox than usual, and the Not Legit folder may not
be cleaned out every night as it should be. We expect to resolve
these issues very soon.
Reminder: Put Spam that Reaches Your Inbox in Not Legit
SpamAssassin "learns" by keeping a database of words and how often
they are used in both legitimate messages and spam messages. When
you put a message SpamAssassin failed to identify as spam in the
Not Legit folder, SpamAssassin corrects its database. One message
won't change it by much. In particular, it doesn't guarantee that
SpamAssassin will catch the next message that looks similar to you,
because the spammers are careful to mix up the words they use. But
over time, an accurate database will help. It's also the only
way we can tell how well SpamAssassin is working.
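As a rough illustration of this kind of word-frequency learning, here is a toy sketch in Python. It is not SpamAssassin's actual code or database format; the WordDatabase class, its method names, and the sample messages are all hypothetical, and the scoring is a simplified ratio with add-one smoothing.

```python
from collections import Counter


class WordDatabase:
    """Toy word-frequency database (hypothetical, not SpamAssassin's format)."""

    def __init__(self):
        self.spam_words = Counter()   # word -> number of spam messages containing it
        self.legit_words = Counter()  # word -> number of legitimate messages containing it

    def learn(self, text, is_spam):
        # Count each distinct word once per message.
        words = set(text.lower().split())
        if is_spam:
            self.spam_words.update(words)
        else:
            self.legit_words.update(words)

    def spam_ratio(self, word):
        # Share of this word's sightings that came from spam, with
        # add-one smoothing so a never-seen word scores a neutral 0.5.
        spam = self.spam_words[word.lower()] + 1
        legit = self.legit_words[word.lower()] + 1
        return spam / (spam + legit)


db = WordDatabase()
db.learn("meeting agenda for the seminar", is_spam=False)
db.learn("cheap pills for sale", is_spam=True)
before = db.spam_ratio("pills")

# A user moves a missed spam message into Not Legit, so it is learned as spam.
db.learn("discount pills shipped overnight", is_spam=True)
after = db.spam_ratio("pills")

print(f"'pills' spam ratio before: {before:.2f}, after: {after:.2f}")
```

In this tiny example a single message visibly shifts the score for one word. In a database built from many thousands of messages the shift from any one message would be far smaller, which is why the benefit only shows up over time.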
Condor Upgrade and Stata/MP
The SSCC's Condor flock has been upgraded with all-new servers
and will run jobs as quickly as any of the SSCC servers. In addition,
four of the Condor servers have Stata/MP installed (the multi-processor
edition). Stata/MP will run most jobs substantially faster than
Stata/SE, and Stata jobs submitted to Condor will automatically
be run using Stata/MP if available. This makes Condor the fastest
way to run a Stata job at the SSCC. (We also have a single license
for 64-bit Stata/MP on FALCON, if you need both speed and large
amounts of memory.)
To submit a Stata job to Condor, log in to KITE and replace the
standard "stata -b do {file}" with "condor_stata -b do {file}".
To submit other jobs, type "condor_do {job}". For more details see
An Introduction to Condor.
Reminder: Set Your Security Questions
If you haven't already, please remember to set your security
questions. If you forget your password and have your security
questions set, you can easily reset your password yourself. If you
haven't set them, SSCC staff will have to reset your password
for you. Unfortunately, we're not here 24 hours a day and
we can't take requests to reset passwords by phone or email because
we cannot verify your identity, so you'll have to stop by 4226 Sewell
Social Sciences Building (and please bring some form of photo ID).
Thus the further you are from the Sewell Building, the more important
it is that you set your security questions!
SSCC's Fall Training Schedule
SSCC's Fall training schedule is underway. Check out our remaining
offerings on SSCC's training web pages. In October we are offering
"Introduction
to Parallel Computing," "An Introduction to SAS Data Steps,"
"A Hands-On Introduction to NVivo," "Social Science
Article Databases Overview," and "RSS and Alert Services."
Remember that all SSCC training sessions require preregistration.