SSCC Data Integrity

SSCC's Windows and Linux networks serve almost 25 terabytes of disk space for home directories, project directories, temporary space, and data distribution. This document describes how the integrity of these data is safeguarded.

Goals of Backups

Files on the Windows and Linux systems, including members' files in permanent network storage space, are regularly backed up by SSCC staff so that they can be restored after a catastrophic disk failure.

On Windows, home directories (mapped as the U: drive) and any project directories (mapped as the X: drive) are considered permanent storage space and are backed up by SSCC staff. SSCC staff are not responsible for backing up any files stored locally on members' PC hard drives, files stored locally on the Winstats, or files stored in the temp30days Y: drive on the terminal servers.

On Linux, /home and /project directories are considered permanent storage space and are backed up by SSCC staff. /temp30days and /tmp are considered temporary storage and are not backed up.
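As a rough illustration of the Linux rule above, the following Python sketch checks whether a path falls under one of the backed-up locations. The function name and the example paths are hypothetical, and the sketch does not model restricted-use data that a member has asked to have excluded.

```python
from pathlib import PurePosixPath

# Permanent storage locations named in this document; everything else
# (for example /temp30days and /tmp) is treated as not backed up.
BACKED_UP_ROOTS = (PurePosixPath("/home"), PurePosixPath("/project"))


def is_backed_up(path: str) -> bool:
    """Return True if the path is /home, /project, or anything beneath them."""
    p = PurePosixPath(path)
    return any(p == root or root in p.parents for root in BACKED_UP_ROOTS)


if __name__ == "__main__":
    # Hypothetical example paths.
    for example in ("/home/alice/thesis.do", "/temp30days/scratch.dta", "/tmp/job.log"):
        print(example, is_backed_up(example))
```

Comparing whole path components rather than string prefixes keeps a directory such as /homework from being mistaken for part of /home.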

Email and SSCC-hosted web sites are also backed up, using the Linux backup system.

Excluding Restricted-Use Data from Backup

SSCC members sometimes acquire data that contain confidential information and are required by the data distributor to agree to specific practices in handling these data. Sometimes this includes a restriction that the data may not be backed up. Members who hold data of this type and do not want them included in SSCC's backups need to notify SSCC's helpdesk.

Note that all SSCC staff are required to sign and abide by a privacy standards agreement written by the SSCC Steering Committee.

SSCC's Backup Strategy on Windows

In 2013, SSCC moved to a newer backup technology called snapshotting for its Windows system. A snapshot records the state of a system at a particular point in time. Snapshots of the permanent storage on Windows (see above) are taken five times a day. Each day's five snapshots are kept on disk for one week, after which they are rolled up into a single daily snapshot that is kept on disk for two months. Daily snapshots are then rolled up into weekly snapshots, which are kept for the remaining ten months. As a result, SSCC retains one year of Windows backups.
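The retention tiers can be sketched in Python as a function that maps a snapshot's age to the granularity at which it is still kept. This is only an illustration, not SSCC's actual configuration: the function name is hypothetical, and the boundaries assume that each period is measured from the time the snapshot was taken, with two months approximated as 61 days and one year as 365 days.

```python
from datetime import timedelta

# Assumed tier boundaries, measured from when a snapshot was taken.
# The month and year lengths are approximations for illustration only.
ONE_WEEK = timedelta(days=7)
TWO_MONTHS = timedelta(days=61)
ONE_YEAR = timedelta(days=365)


def retention_tier(age: timedelta) -> str:
    """Return the granularity at which a snapshot of this age is kept."""
    if age < timedelta(0):
        raise ValueError("age cannot be negative")
    if age < ONE_WEEK:
        return "five per day"
    if age < TWO_MONTHS:
        return "daily"
    if age < ONE_YEAR:
        return "weekly"
    return "expired"


if __name__ == "__main__":
    for days in (3, 30, 200, 400):
        print(days, "days old:", retention_tier(timedelta(days=days)))
```

Under these assumptions, a file changed three days ago can be recovered from any of that day's five snapshots, while a version from six months ago is available only at weekly granularity.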

All of the Windows backups are stored in two locations: SSCC's Computer Room in 4411 Social Science and a replication site in the computer room of the Education Building on Bascom Hill on the UW-Madison campus.

In a worst-case scenario in which all Windows data had to be restored from the off-site replication site, recovered data would be at most six hours old.

SSCC's Backup Strategy on Linux

The newer technology of snapshotting is not yet robust enough to use on SSCC's Linux system. We still rely on tapes for backup, but continue to evaluate snapshotting software.

A full backup is done every two weeks, with nightly incremental backups between full backups. During a full backup, every file in permanent storage space is copied to disk and then off-loaded to magnetic tape. During an incremental backup, any file in permanent storage space that has changed since the most recent backup (full or incremental) is copied to disk and then off-loaded to magnetic tape.
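The difference between the two backup types can be sketched in Python as follows. This is a simplified model, not the backup software SSCC runs: the function names are hypothetical, files are represented only by their modification times, and the sketch assumes an incremental run compares each file against the time of the previous backup run.

```python
from typing import List, Mapping, Optional

FULL_BACKUP_INTERVAL_DAYS = 14  # "every two weeks"


def backup_type(days_since_first_full: int) -> str:
    """Return which kind of backup runs on a given night of the cycle."""
    if days_since_first_full % FULL_BACKUP_INTERVAL_DAYS == 0:
        return "full"
    return "incremental"


def files_to_copy(mtimes: Mapping[str, float], kind: str,
                  last_backup_time: Optional[float]) -> List[str]:
    """Select files for a backup run.

    mtimes maps each path to its modification time in seconds.
    A full backup, or any run with no previous backup, copies everything;
    an incremental copies only files modified after the previous run.
    """
    if kind == "full" or last_backup_time is None:
        return sorted(mtimes)
    return sorted(p for p, m in mtimes.items() if m > last_backup_time)


if __name__ == "__main__":
    # Hypothetical files and timestamps.
    files = {"/home/alice/a.do": 100.0, "/project/p1/b.dta": 250.0}
    print(backup_type(0), files_to_copy(files, "full", None))
    print(backup_type(1), files_to_copy(files, "incremental", 200.0))
```

Restoring a file under this scheme means starting from the most recent full backup and then applying the later incrementals in order, which is why both kinds of backup are retained.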

With the exception of email, all biweekly full backups and nightly incrementals are stored for one year in one of two locations: SSCC's Computer Room in 4411 Social Science and DataKeep, Inc. at 2538 Daniels Street, Madison, WI. Backup copies of email are stored for one month, on disk only, in SSCC's Computer Room.

In a worst-case scenario in which all Linux data had to be restored from the off-site storage facility, recovered data would be at most four weeks old.