CIT 470: Advanced Network and System Administration
Backups

Topics

1. Decisions
2. Types of Backups
3. Backup Hardware
4. Backup Software
5. Snapshots and CDP
6. Cron
7. Backup Security

CIT 470: Advanced Network and System Administration Slide #1 CIT 470: Advanced Network and System Administration Slide #2

Backup Decisions

Why?
– Why are you backing up data? What would happen if you lost data and didn’t back up? What types of data do you have?
What?
– What to back up: entire system, or specific filesystems? Which OS to back up? What other things to back up (MBR, LVM)?
When?
– When is the best time to back up? How often?
Where?
– Where will backup occur? Where to store backup volumes?
Who?
– Who is going to provide the backup system? Who will do backups?
How?
– How are you going to do backups? Tape, mirrors, off-site, etc.

Why Backups?

1. Accidental deletions.
2. Hardware failures.
3. Data corruption.
4. Security incidents.
5. Plan for the worst.
   1. System catches fire.
   2. Fire spreads to replicated systems.
   3. Sprinklers destroy backup system in data center.

What to Backup?

Backups for your backups
– Be able to restore your backup server (software + backup volume db) in case it’s down.
Which peripherals?
– How many drives per server?
– What is the capacity of each drive? How were they partitioned?
– Drive partitions must be same as before disaster for restore from backup to work.
– fdisk -l
– vgcfgbackup

Filesystem / Data Types

Standard OS image on server.
Software
– Software + config files specific to server.
Data
– Data files specific to server.

Backing Up Selected FS

Saves media space and network traffic
– But OS is small compared to data today.
Harder to administer
– Must remember or document which fs to back up for each server.
Easier to split up between volumes
– Can easily distribute backup of a server across different backup volumes on a per fs basis.
Worst case
– Forget to back up an important filesystem.

Backup Entire System

Complete automation
– Create script to parse /etc/fstab and LVM, then back up every disk filesystem.
– Do this once, then it works on all servers.
Worst case
– Increase network traffic by a few percent.
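The script described under Backup Entire System could start from something like the Python sketch below. It covers only the /etc/fstab half of the job (LVM is not queried). The function name, the set of filesystem types counted as disk filesystems, and the sample fstab text are all assumptions made for illustration.

```python
# Filesystem types treated as on-disk filesystems. This set is an assumption;
# adjust it to match the filesystems actually in use on your servers.
DISK_FS_TYPES = {"ext2", "ext3", "ext4", "xfs", "btrfs", "jfs", "reiserfs"}


def disk_filesystems(fstab_text):
    """Return (device, mount_point) pairs for disk filesystems in fstab text."""
    result = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3:
            continue  # malformed entry, ignore it
        device, mount_point, fs_type = fields[:3]
        if fs_type in DISK_FS_TYPES:
            result.append((device, mount_point))
    return result


# Hypothetical fstab contents used only to exercise the function.
SAMPLE_FSTAB = """\
# device            mount   type   options   dump pass
/dev/vg0/root       /       ext4   defaults  1 1
/dev/sda1           /boot   ext4   defaults  1 2
/dev/vg0/home       /home   xfs    defaults  1 2
/dev/vg0/swap       swap    swap   defaults  0 0
tmpfs               /tmp    tmpfs  defaults  0 0
"""

to_back_up = disk_filesystems(SAMPLE_FSTAB)
```

On a real server the text would come from reading /etc/fstab, and each returned mount point would be handed to the backup tool. Swap and memory-backed filesystems are skipped because they hold nothing worth restoring.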

Backup Types

Image level
– Back up raw disk partition or entire disk.
– Back up every byte on drive, used or not.
– Use compression to eliminate GBs of zeros.
– Cannot restore individual files.
Filesystem level
– Back up files within filesystem.
– Backup tool must understand filesystem.
– Can restore files, no backup of unused blocks.

Backup Types

Full backup
– Complete copy of all files from a particular time.
– Backup: slow, requires high capacity.
– Restore: fast, simple.
Incremental backup
– Storage of changed files since last backup.
– Backup: fast, may store multiple per tape.
– Restore: slow, complex (requires multiple tapes).
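To see why an incremental restore needs multiple tapes, the following Python sketch works out which backups must be read to restore to a given day. It assumes each incremental holds the changes since the previous backup of any kind, as described above. The function name and the sample schedule are hypothetical.

```python
def restore_chain(backups, target_day):
    """Return the backups needed to restore to target_day, in restore order.

    backups is a list of (day, kind) tuples sorted by day, where kind is
    "full" or "incr". An incremental is assumed to contain the files changed
    since the previous backup of either kind.
    """
    chain = []
    for day, kind in backups:
        if day > target_day:
            break
        if kind == "full":
            chain = [(day, kind)]      # a newer full supersedes everything before it
        elif chain:
            chain.append((day, kind))  # incrementals are only usable after a full
    return chain


# Hypothetical schedule: weekly fulls with daily incrementals in between.
schedule = [(1, "full"), (2, "incr"), (3, "incr"), (4, "incr"),
            (8, "full"), (9, "incr")]
```

Restoring to day 4 needs the day 1 full plus three incrementals (four volumes if each is on its own tape), while restoring to day 8 needs only the day 8 full.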

Capacity Planning: Space

Partition: 40GB
Full backup every week.
Daily incremental backups.
50% full now, grows 2GB per day
Tape capacity needed
– Day 1: 40GB
– Day 2: 2GB
– …
– Day 7: 12GB
– Day 8: 40GB

Capacity Planning: Time

Fileserver: 4TB
Full backup must finish overnight (8 hours)
Tape drive: 40MB/s = 144 GB/hr = 1.15TB/night
Need 4 tape drives running simultaneously.
Additional concerns:
– Network performance between file & backup servers.
– Does any capacity need to be reserved for restores?
– Actual performance vs. manufacturer specs.
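The arithmetic behind the Capacity Planning: Time figures can be reproduced with a few lines of Python. This is a sketch that uses decimal units (1 GB = 1000 MB, 1 TB = 1000 GB) and assumes the drive sustains its rated speed for the whole window. The function name is hypothetical.

```python
import math


def drives_needed(data_tb, drive_mb_per_s, window_hours):
    """Return (GB per hour, TB per night, drives needed) for one backup window."""
    gb_per_hour = drive_mb_per_s * 3600 / 1000       # MB/s -> GB/hr
    tb_per_night = gb_per_hour * window_hours / 1000  # GB/hr -> TB per window
    drives = math.ceil(data_tb / tb_per_night)        # round up to whole drives
    return gb_per_hour, tb_per_night, drives


# Figures from the slide: 4TB fileserver, 40MB/s drive, 8 hour window.
gb_per_hour, tb_per_night, drives = drives_needed(4, 40, 8)
```

With the slide's inputs this yields 144 GB/hr, about 1.15 TB per night, and 4 drives. Because real throughput is usually below the manufacturer's rating, it is worth rerunning the calculation with a measured rate.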

Capacity Planning: Media

How much media do you need?
Determined by policy and schedule.
– How long are full backups kept?
– How often are incrementals recycled?
– How often are tapes moved off-site?

Choosing a Backup Drive

1. Reliability
2. Flexibility
3. Transfer speed
4. Time-to-data
5. Capacity
6. Compatibility
7. Cost

Reliability

MTBF
– Remember that drives fail faster than MTBF claims early and late in lifespan.
– Talk to people who have used system.
Duty cycle
– Expected usage per day.
– 40% duty cycle = about 10 hours per day
– MTBF based on listed duty cycle.
Hard drives are more reliable than tape.
– Closed system protects from contaminants.

Flexibility

Flexibility means
– Able to respond to different data rates.
– Can be used in different ways.
Tapes aren’t very flexible
– Typically require a standard data rate.
– Slower/faster rates result in I/O errors.
– Can only read/write sequentially.
Disks are very flexible
– Random access medium.
– Can change data rates rapidly.
– Combine with RAID or LVM for capacity or performance.
– Virtual tape software allows disks to appear as tape.

Transfer Speed

Compare native sustained transfer rates
– Transfer rates often assume compression
– Burst or synchronous rates are temporary, best-case scenarios.
Disks are much faster than tape.

Time to Data

Time to Data
– How long to load a volume +
– Seek to appropriate place on volume +
– Start reading data.
Load time can include
– Time to manually find and load tape.
– Time for tape robot to locate and load tape.
Most restores are for a few files.
– User deletions.
– Time to Data matters more than Transfer Rate
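A rough calculation shows why Time to Data dominates small restores. The Python sketch below models restore time as time to data plus transfer time. All numbers in the example are hypothetical and chosen only to illustrate the effect; the transfer rate is held the same for both media so that only time to data differs.

```python
def restore_seconds(time_to_data_s, size_mb, rate_mb_per_s):
    """Total restore time: load + seek (time to data), then transfer."""
    return time_to_data_s + size_mb / rate_mb_per_s


# Hypothetical figures: a 10MB user restore at 40MB/s from either medium.
tape = restore_seconds(time_to_data_s=120, size_mb=10, rate_mb_per_s=40)
disk = restore_seconds(time_to_data_s=1, size_mb=10, rate_mb_per_s=40)
```

Under these assumptions the transfer itself takes a quarter of a second on both media, so nearly all of the difference between the two totals is time to data.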

Capacity

Want one backup to fit on single volume.
– Easier to manage than backups across multiple volumes.
– Tape capacity grows slower than disk capacity.
Cost
– Lots of small backups to a single volume.
– Reduces number of volumes to purchase + store
High capacity is faster
– Fewer volume switches when backing up.

Compatibility

Want to be able to restore everything.
– If new system incompatible with old, must transfer old backups to new format.
Use a single format for ease of management.

Backup Media

Flash Memory
– Very expensive, small media, personal use only.
Super floppies
– ZIP 750MB, small capacity, high $/GB media.
Optical
– CD-R cheap drives, slow + small (650MB).
– DVD-R cheap drives, slow + small (4.7GB).
– Ultra Density Optical (UDO) expensive but larger (60GB).
Hard disk
– Large capacity (1TB), bulky, low $/GB media.
Tapes
– Large capacity (800GB), low $/GB media; expensive drives.

D2D2T

Back up data first to disk
– Take advantage of fast disk speeds.
– Complete backups within nightly window.
Copy backup disks to tape
– Copy backup data from disks to tape.
– Disks aren’t in production, so this can be slow.
Reuse backup disks each night.

Tape Formats

Linear Tape Open (LTO), a/k/a Ultrium
• 1.5TB capacity (LTO 5)
• 140 MB/s transfer rate
Super Digital Linear Tape (SDLT)
• 800GB capacity (DLT S4)
• 60 MB/s transfer rate
Super Advanced Intelligent Tape (SAIT)
• 800GB capacity (SAIT-2)
• 45 MB/s transfer rate

Tape Appearance

Images: LTO and SDLT cartridges; full and half height 5.25” SCSI LTO drives.

Common Tape Features

Form Factor
– 5.25” FH SCSI drives
– Media are ~1/2” wide tape stored in cartridges.
Hardware compression
– Usually cited as 2:1, some cite higher.
– Depends heavily on nature of data stored.
Future Roadmaps
– Plans to double capacity in next few years.

Hardware vs Software Compression

Software Compression
– Compress data via software before writing.
– Can use high compression tools like 7zip, bzip2.
– Lowers amount of data transferred across network.
– Higher CPU usage.
Hardware Compression
– Compress using specialized hardware in the tape drive.
– Does not require additional CPU usage.
– Increases throughput of drive.
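How much the compression ratio depends on the nature of the data is easy to demonstrate. The Python sketch below uses zlib from the standard library as a stand-in for whatever algorithm a drive or backup tool actually uses, so the exact ratios are illustrative only. The three sample inputs are made up for the demonstration.

```python
import os
import zlib


def compression_ratio(data):
    """Original size divided by zlib-compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


# Three made-up inputs of roughly 1MB each.
samples = {
    "zeros": bytes(1_000_000),                            # like unused disk blocks
    "text": b"backup volume index entry\n" * 40_000,      # repetitive text
    "random": os.urandom(1_000_000),                      # like encrypted or already compressed data
}

ratios = {name: compression_ratio(data) for name, data in samples.items()}
```

The zero-filled and repetitive inputs shrink dramatically, while the random input does not shrink at all. A quoted 2:1 figure should therefore be treated as an average that a particular data set may not reach.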

Tape Autochangers

Stackers
– Works sequentially through a stack of tapes.
Autoloader / Jukebox
– Provides random access to set of tapes.
Library / Silo
– Multiple drives w/ random access to set of tapes.
– May incorporate bar code reader, ethernet, etc.

Backup Software

OS Provided (backup individual systems)
– cpio, dump, tar, ntbackup
Open source (backup servers)
– AMANDA
– Bacula
Commercial (backup servers)
– Tivoli Storage Manager (IBM)
– Veritas Storage Manager

Provided Backup Software

UNIX
– tar
– cpio
– dump
– dd
Windows
– ntbackup
– Windows Restore
MacOS
– ditto

Windows Restore Points

Auto backup of registry + critical files
– Must have System Restore enabled.
– Done automatically + can create manual ones too.
– Useful when a software install has corrupted the system.
Recovering Windows with a Restore Point
– Boot into safe mode.
– Select “Restore my computer to an earlier time”
– Choose date from list of restore point dates.
– If that doesn’t work, reboot, try an older one.

Backup Software

Feature | tar | cpio | dump
List files on vol | Slow: search entire backup (tar -t) | Slow: search entire backup (cpio -it) | Fast: index at front (restore -t)
Incremental backup | Easy (--newer) | Must use find | Easy (set level)
List files as backed up | tar cvf | cpio -v | Only after backup
Compatibility | Multi-platform | Multi-platform with ASCII header | Readable between some platforms
Backup specific files | Yes | Yes | Whole fs only
Stop reading tape after restore file is found | No | No | Yes
Likelihood file exists in TOC but not in backup | Low | Low | Medium (TOC is made before backup)
Recovery from I/O errors | No | No | Yes

Open Source Backup Tools

Amanda
– Single master backup server.
– Simple, fast, uses native backup tools.
Bacula
– Client/server backup system.
– More advanced features but slower than Amanda.
BackupPC
– Web-based so works with any client OS.
– Backs up PCs and laptops to disk on server.

Snapshots

Virtual read-only copy of filesystem
– Snapshot has same contents that filesystem had when snapshot was made.
– Snapshot uses pointers to data and copy-on-write to avoid making a copy of entire fs.
– Snapshots require ~1% of fs size, depending on updates.

Snapshot Applications

Quick restore times
– Snapshots take seconds to create and to restore from.
– Won’t help you if disk or other hardware fails.
– Snapshot lifetime could be as short as 1 minute or as long as a few days for this purpose.
Staging for backups
– Snapshot filesystem before starting backup.
– Files on snapshot do not change during backup.
– Snapshot lifetime is how long it takes to back up.
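The pointers and copy-on-write idea behind snapshots can be shown with a toy model. The Python sketch below treats a volume as a list of blocks. It is not how any particular filesystem or volume manager is implemented, and the class and method names are invented for this example.

```python
class Snapshot:
    """Read-only view of a volume as it was when the snapshot was taken."""

    def __init__(self, origin):
        self.origin = origin
        self.saved = {}  # block index -> contents at snapshot time

    def preserve(self, index, old_data):
        # Copy-on-write: keep only the first (oldest) version of each block.
        if index not in self.saved:
            self.saved[index] = old_data

    def read(self, index):
        # Unchanged blocks are read through a pointer to the live volume.
        if index in self.saved:
            return self.saved[index]
        return self.origin.blocks[index]


class Volume:
    """Toy block device that supports copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)  # instant: no data is copied yet
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])  # save old data first
        self.blocks[index] = data


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")
vol.write(1, "BB")
```

After the two writes the live volume holds the new contents of block 1, the snapshot still returns the original, and the snapshot has stored exactly one block. Snapshot space therefore scales with the amount of changed data, not with the size of the filesystem.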

LVM Snapshots

LVM can create snapshots of logical volumes.
    lvcreate -L500M -s -n backup /dev/db/db1
– Creates snapshot (-s) volume named backup
– Volume is snapshot of /dev/db/db1 LV
– Can make 500M of changes.
What happens if >500M of changes?
– Snapshot LV can no longer receive copies of old data when changes are made to original logical volume.
– Ensure 500M is more than enough space for changes made during lifetime of snapshot.

Continuous Data Protection

Copy every file change to backup server.
– Stores changes in a log like RCS or a database.
– Can restore to any point in time.
Near-CDP
– Snapshots + replication.
– Does not provide a log, so can only restore to saved snapshots, not to any change like CDP.
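The difference between CDP and near-CDP comes down to keeping a log of every change. The Python sketch below is a toy model of such a log, with invented names and sample data. It assumes changes are recorded in time order and that a deletion is logged as contents of None.

```python
class ChangeLog:
    """Toy CDP log: every file change is recorded with its timestamp."""

    def __init__(self):
        self.entries = []  # (time, path, contents), appended in time order

    def record(self, time, path, contents):
        self.entries.append((time, path, contents))

    def restore(self, at_time):
        """Replay the log to rebuild the file set as of at_time."""
        state = {}
        for time, path, contents in self.entries:
            if time > at_time:
                break  # later changes are not part of this restore point
            if contents is None:
                state.pop(path, None)  # deletion
            else:
                state[path] = contents
        return state


# Hypothetical sequence of changes to two files.
log = ChangeLog()
log.record(10, "/etc/motd", "hello")
log.record(20, "/etc/hosts", "127.0.0.1 localhost")
log.record(30, "/etc/motd", "hello, world")
log.record(40, "/etc/hosts", None)
```

Any timestamp can be passed to restore(), including ones that fall between changes, which is what restoring to any point in time means. A near-CDP system that took snapshots only at times 20 and 40 could not recover the state as of time 30.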

Backing up Virtual Machines

Back up VMs as physical machines
– Connect VM to your standard backup system.
– Have to configure backups for each VM.
Back up VM files
– Can back up all VMs on host at once.
– VM files are constantly changing, so either
  • Suspend VM
  • Snapshot filesystem with VM files

Automation

The key to efficiency and reliability.
Use cron instead of manually backing up.
– Single tapes require manual media change.
– Tape libraries automate this process.
Other automated tasks
– Monitoring (up/down, disk space, security)
– Logs (rotation, monitoring)
– File distribution

Cron

Performs tasks at scheduled times.
Crontab files specify schedule of tasks
– root: /etc/crontab
– users: /var/spool/cron/crontabs/*
Cron may log activities and errors.
Timing limitations:
– Checks for tasks to run (if any) once a minute.
– Does not perform scheduled tasks if system down.
– May or may not perform tasks on DST transition.

Crontab

Format
    minute hour day month weekday user command
Examples
    30 4 * * 0 root yum -y update
    3 * * * * root (cd /var/www; make)
    20 1 * * * root /usr/local/rot-logs
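To make the seven-field /etc/crontab format concrete, here is a small Python sketch that splits an entry into its fields and checks whether it is due at a given time. It is a simplification, not a reimplementation of cron: only * and single integers are understood (no ranges, lists, steps, or names), and the function names are made up for this example.

```python
from datetime import datetime


def parse_system_crontab_line(line):
    """Split 'minute hour day month weekday user command' into its parts."""
    fields = line.split(None, 6)  # the command may itself contain spaces
    if len(fields) != 7:
        raise ValueError("expected 7 fields: " + line)
    minute, hour, dom, month, dow, user, command = fields
    return {"schedule": (minute, hour, dom, month, dow),
            "user": user, "command": command}


def _matches(field, value):
    return field == "*" or int(field) == value


def is_due(entry, when):
    """True if the entry should run in the minute containing 'when'."""
    minute, hour, dom, month, dow = entry["schedule"]
    if not (_matches(minute, when.minute) and _matches(hour, when.hour)
            and _matches(month, when.month)):
        return False
    cron_weekday = (when.weekday() + 1) % 7  # cron counts Sunday as 0
    dom_ok = _matches(dom, when.day)
    dow_ok = dow == "*" or int(dow) % 7 == cron_weekday  # 7 also means Sunday
    if dom != "*" and dow != "*":
        return dom_ok or dow_ok  # cron runs if either restricted field matches
    return dom_ok and dow_ok


update = parse_system_crontab_line("30 4 * * 0 root yum -y update")
rebuild = parse_system_crontab_line("3 * * * * root (cd /var/www; make)")
```

The first entry is due only at 04:30 on Sundays, and the second at three minutes past every hour. Splitting on at most six separators keeps the whole command, spaces included, in the last field.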

Managing Automated Tasks

Divide by time: Hourly, daily, weekly, monthly tasks
Crontab uses run-parts meta-script:
    17 * * * * root run-parts --report /etc/cron.hourly
    25 6 * * * root run-parts --report /etc/cron.daily
    47 6 * * 7 root run-parts --report /etc/cron.weekly
    52 6 1 * * root run-parts --report /etc/cron.monthly
Add crons by placing script in time directory.
Add random delay if all hosts share same crontab.

Backup Security

Tape security
– Tapes contain all of your important data.
– Data isn’t secure unless tapes are secure.
– Solutions: tape vault, encrypted tapes.
Backup server security
– Has read access to all important data.
– If backup server isn’t secure, data isn’t secure.
– Solutions: integrity checking, least privilege
Restore process
– Who can request files to be restored?
– Where will restored file be placed?
– What will its ACL be?
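For the delay recommended under Managing Automated Tasks when all hosts share one crontab, one option is to derive the delay from the hostname. Note that this swaps a truly random delay for a hash-based one, which spreads hosts out in the same way but gives each host the same offset every night. The Python sketch below is one possible approach; the function name, the 30 minute limit, and the hostnames are hypothetical.

```python
import hashlib


def splay_seconds(hostname, max_delay_s):
    """Map a hostname to a stable delay in the range [0, max_delay_s)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % max_delay_s


# Hypothetical fleet sharing one crontab, spread over a 30 minute window.
hosts = ["web%02d.example.com" % n for n in range(1, 21)]
delays = {host: splay_seconds(host, 1800) for host in hosts}
```

A cron job would sleep for its host's delay before starting the real work, so twenty hosts with the same schedule do not all hit the backup server in the same second.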

