OpenAFS on Solaris 11 x86
Robert Milkowski
Unix Engineering

prototype template (5428278)\print library_new_final.ppt 10/15/2012

Why Solaris?
- ZFS
  - Transparent, in-line data compression and deduplication: big $$ savings
  - Transactional file system (no fsck)
  - End-to-end data and metadata checksumming
  - Encryption
- DTrace
  - Online profiling and debugging of AFS
  - Many improvements to AFS performance and scalability
  - Safe to use in production

ZFS – Estimated Disk Space Savings
[Chart: disk space used by a 1 TB sample of production AFS data (2010), axis 0-1200 GB; Linux ext3 and uncompressed ZFS (64 KB) as baseline, ZFS 32 KB LZJB ~2x savings, ZFS 32 KB GZIP and 128 KB LZJB ~2.9x, ZFS 128 KB GZIP ~3.8x]
- Currently, the overall average compression ratio for AFS on ZFS/gzip is over 3.2x

Compression – Performance Impact
Read Test
[Chart: read throughput in MB/s (0-800) for Linux ext3 and ZFS at 32/64/128 KB recordsize with no compression, LZJB, GZIP, and DEDUP+LZJB/GZIP variants]
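The recordsize and compression combinations benchmarked in these charts map directly to per-dataset ZFS properties. A minimal sketch, assuming a hypothetical pool named `afspool` and a vicep dataset (names are illustrative, not from the deck):

```shell
# Create a dataset for an AFS vicep partition with 128K records and gzip
# compression (pool name "afspool" and the mountpoint are hypothetical).
zfs create -o recordsize=128k -o compression=gzip \
    -o mountpoint=/vicepa afspool/vicepa

# LZJB is the lighter-weight alternative compared in the charts:
# zfs set compression=lzjb afspool/vicepa

# After loading data, check the achieved compression ratio:
zfs get compressratio afspool/vicepa
```

Compression applies only to blocks written after the property is set, which is why the benchmarks above compare freshly loaded datasets.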
Compression – Performance Impact
Write Test
[Chart: write throughput in MB/s (0-600) for Linux ext3 and ZFS at 32/64/128 KB recordsize with no compression, LZJB, and GZIP]

Solaris – Cost Perspective
- Linux server
  - x86 hardware
  - Linux support (optional for some organizations)
  - Directly attached storage (10 TB+ logical)
- Solaris server
  - The same x86 hardware as for Linux
  - $1,000 per CPU socket per year for Solaris support (list price) on non-Oracle x86 servers
  - Over 3x compression ratio with ZFS/GZIP
    - 3x fewer servers and disk arrays
    - 3x less rack space, power, cooling, maintenance...
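The "3x fewer servers" claim is simple capacity arithmetic. A back-of-envelope sketch; only the ~3.2x average ratio comes from the deck, the data-set size and per-server capacity below are illustrative assumptions:

```shell
# Illustrative sizing: servers needed for 90 TB of logical data,
# with and without ~3.2x ZFS/gzip compression.
logical_tb=90            # assumed logical data set size
raw_per_server_tb=10     # assumed raw capacity per server
ratio_x10=32             # ~3.2x compression, scaled by 10 for integer math

eff_per_server=$(( raw_per_server_tb * ratio_x10 / 10 ))   # 32 TB effective
with_comp=$(( (logical_tb + eff_per_server - 1) / eff_per_server ))
without_comp=$(( (logical_tb + raw_per_server_tb - 1) / raw_per_server_tb ))

echo "with compression: $with_comp servers"        # 3
echo "without compression: $without_comp servers"  # 9
```

Three servers instead of nine is the same ballpark as the q.ny cell result later in the deck (13 servers down to 3).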
AFS Unique Disk Space Usage – Last 5 Years
[Chart: unique disk space usage in GB, 2007-09 through 2012-08, growing to roughly 25,000 GB]

MS AFS High-Level Overview
- AFS RW cells
  - Canonical data, not available in prod
- AFS RO cells
  - Globally distributed
  - Data replicated from RW cells
  - In most cases each volume has 3 copies in each cell
  - ~80 RO cells world-wide, almost 600 file servers
  - This means that a single AFS volume in a RW cell, when promoted to prod, is replicated ~240 times (80 x 3)
- Currently, there is over 3 PB of storage presented to AFS

Typical AFS RO Cell
- Before: 5-15 x86 Linux servers, each with a directly attached disk array, ~6-9 RU per server
- Now: 4-8 x86 Solaris 11 servers, each with a directly attached disk array, ~6-9 RU per server
  - Significantly lower TCO
- Soon: 4-8 x86 Solaris 11 servers, internal disks only, 2 RU
  - Lower TCA
  - Significantly lower TCO

Migration to ZFS
- Completely transparent to clients
- Migrate all data away from a couple of servers in a cell
- Rebuild them on Solaris 11 x86 with ZFS
- Re-enable them and repeat with the others
- Over 300 servers (plus disk arrays) to decommission
- Less rack space, power, cooling, maintenance... and yet more available disk space
- Fewer servers to buy due to increased capacity
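The server-by-server drain described in the migration slide relies on standard OpenAFS volume operations; a sketch, with hypothetical cell, server, and volume names:

```shell
# List the volumes on the server being drained (all names hypothetical).
vos listvol oldserver1 -cell q.ny

# Move one volume off the old Linux server onto a rebuilt Solaris server.
# The move is transparent to clients, which keep working throughout.
vos move sw.gcc -fromserver oldserver1 -frompartition vicepa \
    -toserver newserver1 -topartition vicepa -cell q.ny

# Once the server is empty, rebuild it with Solaris 11 + ZFS and repeat.
```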
q.ny Cell Migration to Solaris/ZFS
- Cell size reduced from 13 servers down to 3
- Disk space capacity expanded from ~44 TB to ~90 TB (logical)
- Rack space utilization went down from ~90 U to 6 U

Solaris Tuning
- ZFS
  - Largest possible record size: 128 KB on pre-GA Solaris 11, 1 MB on 11 GA and onward
  - Disable SCSI cache flushes: zfs:zfs_nocacheflush = 1
  - Increase the DNLC size: ncsize = 4000000
  - Disable access-time updates on all vicep partitions
  - Multiple vicep partitions within a ZFS pool (AFS scalability)

Summary
- More than 3x disk space savings thanks to ZFS: big $$ savings
- No performance regression compared to ext3
- No modifications required to AFS to take advantage of ZFS
- Several optimizations and bugs already fixed in AFS thanks to DTrace
- Better and easier monitoring and debugging of AFS
- Moving away from disk arrays in AFS RO cells
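The Solaris Tuning settings can be applied as below; the pool and dataset names are hypothetical, and the /etc/system lines take effect only after a reboot:

```shell
# /etc/system fragment: disable SCSI cache flushes and enlarge the DNLC.
cat >> /etc/system <<'EOF'
set zfs:zfs_nocacheflush = 1
set ncsize = 4000000
EOF

# Per-dataset settings on each vicep partition (hypothetical names).
zfs set recordsize=1M afspool/vicepa   # 1M requires Solaris 11 GA or later
zfs set atime=off afspool/vicepa       # no access-time updates
```

zfs_nocacheflush is safe only when the disks sit behind non-volatile cache or the tradeoff is understood; on these RO servers the data can be re-replicated from the RW cell.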
Why Internal Disks?
- The most expensive parts of AFS are storage and rack space
- AFS on internal disks: 9 U -> 2 U
- More local/branch AFS cells
- How?
  - ZFS GZIP compression (3x)
  - 256 GB RAM for cache (no SSD)
  - 24+ internal disk drives in a 2U x86 server

HW Requirements
- RAID controller
  - Ideally pass-through mode (JBOD)
  - RAID done in ZFS (initially RAID-10)
  - No batteries (fewer FRUs)
  - Well-tested driver
- 2U chassis with 24+ hot-pluggable disks
  - Front disks for data, rear disks for the OS
  - SAS disks, not SATA
- 2x CPU, 144 GB+ of memory, 2x GbE (or 2x 10 GbE)
- Redundant PSUs, fans, etc.

SW Requirements
- Disk replacement without having to log into the OS
  - Physically remove a failed disk, put a new disk in, and resynchronization should kick in automatically
- Easy way to identify physical disks
  - Logical <-> physical disk mapping
  - Locate and Fault LEDs
- RAID monitoring
  - Monitoring of disk service times, soft and hard errors, etc.
  - Proactive and automatic hot-spare activation
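With the controller in JBOD mode, the RAID-10 layout and the automatic resynchronization and hot-spare behaviour asked for above live entirely in ZFS. A sketch with hypothetical device names:

```shell
# RAID-10 in ZFS: a pool striped across two-way mirrors of JBOD disks,
# plus a hot spare (c0t*d0 device names are hypothetical).
zpool create afspool \
    mirror c0t0d0 c0t1d0 \
    mirror c0t2d0 c0t3d0 \
    spare  c0t4d0

# Resilver automatically when a new disk appears in a failed disk's slot.
zpool set autoreplace=on afspool

# Health, resilver progress, and per-vdev error counters.
zpool status -x afspool

# Per-disk service times and soft/hard/transport error counts.
iostat -xn 5 2
iostat -En
```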
Oracle/Sun X3-2L (x4270 M3)
- 2U
- 2x Intel Xeon E5-2600
- Up to 512 GB RAM (16x DIMM)
- 12x 3.5" disks + 2x 2.5" (rear), or 24x 2.5" disks + 2x 2.5" (rear)
- 4x on-board 10 GbE
- 6x PCIe 3.0
- SAS/SATA JBOD mode

SSDs?
- ZIL (SLOG)
  - Not really necessary on RO servers
  - MS AFS releases >= 1.4.11-3 do most writes asynchronously
- L2ARC
  - Given 256 GB of RAM, it currently doesn't seem necessary
  - Might be an option in the future
- Main storage on SSDs
  - Too expensive for AFS RO
  - AFS RW?
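If SSDs ever become worthwhile, they attach to an existing pool as cache (L2ARC) or log (ZIL/SLOG) vdevs without rebuilding it. A sketch with hypothetical device names:

```shell
# Add an SSD as L2ARC read cache (device name hypothetical).
zpool add afspool cache c1t0d0

# Add a mirrored SLOG for synchronous writes; of limited use on RO
# servers, since MS AFS >= 1.4.11-3 issues most writes asynchronously.
zpool add afspool log mirror c1t1d0 c1t2d0

# Cache and log devices can be removed later if they don't help.
zpool remove afspool c1t0d0
```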