High Performance Storage

Linux Clusters Institute: High Performance Storage
University of Oklahoma, 05/19/2015
Mehmet Belgin, Georgia Tech (in collaboration with Wesley Emeneker)
18-22 May 2015

The Fundamental Question
• How do we meet *all* user needs for storage?
• Is it even possible?
• Confounding factors
  • User expectations (in their own words)
  • Budget constraints
  • Application needs and use cases
  • Expertise in team
  • Existing infrastructure

Examples of Common Storage Systems
• Network File System (NFS) – a distributed file system protocol for accessing files over a network.
• Lustre – a parallel, distributed file system
  • OSS – object storage server. This server stores and manages pieces of files (aka objects)
  • OST – object storage target. This disk is managed by the OSS and stores data
  • MDS – metadata server. This server stores file metadata.
  • MDT – metadata target. This disk is managed by the MDS and stores file metadata
• General Parallel File System (GPFS) – a parallel, distributed file system.
  • Metadata is not owned by any particular server or set of servers.
  • All clients participate in filesystem management
  • NSD – network storage device
• Panasas/PanFS – a parallel, distributed file system
  • Metadata is owned by director blades
  • File data is owned by storage blades

Nomenclature
• Object store – a place where chunks of data (aka objects) are stored. Objects are not files, though they can store individual files or different pieces of files.
• Raw space – what the disk label shows. Typically given in base 10, e.g. 10TB (terabyte) == 10*10^12 bytes
• Usable space – what “df” shows once the storage is mounted. Typically given in base 2, e.g. 10TiB (tebibyte) == 10*2^40 bytes
• Usable space is often about 30% smaller (sometimes more, sometimes less) than raw space.

Which one is right for me?
Lustre

The End.
Thanks for participating!

Before we start…
What is a File System?

What is a filesystem?
• A system for files (Duh!)
• A source of constant frustration
• A filesystem is used to control how data is stored and retrieved – Wikipedia
• It’s a container (that contains files)
• It’s the set of disks, servers (computational components), networking, and software
• All of the above

Disclaimer
• There are no right answers
• There are wrong answers
• No, seriously.
• It comes down to balancing tradeoffs of preferences, expertise, costs, and case-by-case analysis

Know Your Stakeholders
… and keep all of them happy! (at the same time)
1. Users
2. Managers and University Leadership
3. University support staff
4. System administrators
5. Vendor

What do you need to support?
Common Storage Requirements (which most users can’t articulate)
• Temporary storage for intermediate results from jobs (a.k.a. scratch)
• Long-term storage for runtime use
• Backups
• Archive
• Exporting said filesystem to other machines (like a user's Windows XP laptop)
• Virtual Machine hosting
• Database hosting
• Map/Reduce (a.k.a. Hadoop)
• Data ingest and outgest (DMZ?)
• System Administrator storage

Tradeoffs
First, try to define ‘use purpose’ and ‘operational lifetime’…
• Speed (… is a relative term!)
• Space
• Cost
• Scalability
• Administrative burden
• Monitoring
• Reliability/Redundancy
• Features
• Support from vendor

Parallel/Distributed vs. Serial Filesystems*
Serial
• It doesn’t scale beyond a single server
• It often isn't easy to make it reliable or redundant beyond a single server
• A single server controls everything
Parallel
• Speed increases as more components are added to it
• Built for distributed redundancy and reliability
• Multiple servers contribute to the management of the filesystem
*None of these things are 100% true

The Most Common Solutions for HPC
Want to access your data from everywhere?
You need “Network Attached Storage (NAS)”!
• NFS (serial-ish)
• GPFS (Parallel)
• Lustre (Parallel)
• Panasas (Parallel)
• What about others like OrangeFS, Gluster, Ceph, XtreemFS, CIFS, HDFS, Swift, etc.?

Prepare for a Challenge
Administrative burden & needed expertise, from low to high:
• NFS
• Panasas
• GPFS (anecdotal)
• Lustre
Your mileage may vary!

Network File System (NFS)
• Can be built from commodity parts or purchased as an appliance
• A single server typically controls everything
• No software cost
• Compatible (not 100% POSIX)
• Underlying filesystem does not matter much (ZFS, ext3, …)
• True redundancy is harder (single point of failure)
• Mostly for low-volume, low-throughput workloads
• Strong client-side caching, works well for small files
• Requires minimal expertise and (relatively) easy to manage
• Where does it fall in our tradeoffs?
  (Speed, Space, Cost, Scalability, Administrative Burden, Monitoring, Reliability/Redundancy, Features, Vendor Support)

General Parallel File System (GPFS)
• Can be built from commodity parts or purchased as an appliance
• All nodes in the GPFS cluster participate in filesystem management
• Metadata is managed by every node in the cluster
• Where does it fall in our tradeoffs?
  (Speed, Space, Cost, Scalability, Administrative Burden, Monitoring, Reliability/Redundancy, Features, Vendor Support)
[Diagram: two NSD servers and two clients connected by a network]

Lustre
• Can be built from commodity parts, or purchased as an appliance
• Separate servers for data and metadata
• Where does it fall in our tradeoffs?
  (Speed, Space, Cost, Scalability, Administrative Burden, Monitoring, Reliability/Redundancy, Features, Vendor Support)
Image credit: nor-tech.com

Panasas
• Is an appliance
• Separate servers for metadata and data
• Where does it fall in our tradeoffs?
  (Speed, Space, Cost, Scalability, Administrative Burden, Monitoring, Reliability/Redundancy, Features, Vendor Support)
Image credit: panasas.com

Appliances
[Screenshot of Panasas management tool]
• Appliances generally come with vendor tools for monitoring and management
• Do these tools increase or decrease management complexity?
• How important is vendor support for your team?

Good idea? Bad idea? Let’s discuss!
• NFS for everything
• Panasas for everything
• Lustre for everything
• GPFS for everything

How about…
• Lustre for work (files stored here are temporary)
• NFS for home
• Tape for backup and archival
• Lustre available everywhere
• Tape available on data movers
• NFS only available on login machines

Designing your storage solution
• Who are the stakeholders?
• How quickly should we be able to read any one file?
• How will people want to use it?
• How much training will you need?
• How much training will your users need to effectively use your storage?
  • Do you have the knowledge necessary to do the training?
  • How often do they need the training?
• Do you need different tiers or types of storage?
  • Long-term
  • Temporary
  • Archive
• From what science/usage domains are the users?
  • aka what applications will they be using?
• What features are necessary?

Application Driven Tradeoffs
• Domain Science
  • Chemistry
  • Aerospace
  • Bio* (biology, bioinformatics, biomedical)
  • Physics
  • Business
  • Economics
  • etc.
• Data and Application Restrictions
  • HIPAA and PHI
  • ITAR
  • PCI DSS
  • And many more (SOX, GLBA, CJIS, FERPA, SOC, …)

What you need to know
• What is the distribution of files?
  • sizes, count
• What is the expected workload?
  • How many bytes are written for every byte read?
  • How many bytes are read for each file opened?
  • How many bytes are written for each file opened?
• Are there any system-based restrictions?
  • POSIX conformance. Do you need a POSIX filesystem?
  • Limitations on number of files or files per directory
  • Network compatibility (IB vs. Ethernet)

Use Case: Data Movement
• Scenario: User needs to import a lot of data
• Where is the data coming from?
  • Campus LAN?
  • Campus WAN?
  • WAN?
• How often will the data be ingested?
• Does it need to be outgested?
• What kind of data is it?
• Is it a one-time ingest or regular?

Designing your storage solution
• What technologies do you need to satisfy the requirements that you now have?
• Can you put a number on the following?
  • Minimum disk throughput from a single compute node
  • Minimum aggregate throughput for the entire filesystem for a benchmark (like iozone or IOR)
  • I/O load for representative workloads from your site
  • How much data and metadata is read/written per job?
  • Temporary space requirements
  • Archive and backup space requirements
  • How much churn is there in data that needs to be backed up?

Storage Devices
Device classes, ordered from highest speed and cost (lowest capacity) to lowest speed and cost (highest capacity):
• Solid State
  • RAM
  • PCIe SSD
  • SATA/SAS SSD
• Spinning Disk
  • SAS
  • NL-SAS
  • SATA
• Tape
o Serial ATA (SATA): $/byte, large capacity, less reliable, slower (7.2k RPM)
o Serial Attached SCSI (SAS): $$/byte, small capacity, reliable, fast (15k RPM)
o Nearline-SAS: SATA drives with SAS interface: more reliable than SATA, cheaper than SAS, ~SATA speeds but with lower overhead
o Solid State Disk (SSD): No spinning disks, $$$/byte, blazing fast, reliable

What is an IOP?
• IOP == Input/Output Operation
• IOPS == Input/Output Operations per Second
• We care about two IOPS reports
  • The number we tell people when we say “Our Veridian Dynamics Frobulator 2021 gets 300PiB/s bandwidth!”
  • The number that affects users: “Our Veridian Dynamics Frobulator 2021 only gets 5KiB/s for <insert your application’s name>”
• Why the difference?

More tradeoffs …
Space vs. Speed
• Do you need 10GiB/s and 10TiB of space?
• Do you need 1PiB of usable storage and 1GiB/s?
• How do you meet your requirements?
Large vs. Small Files
• What is a small file?
• No hard rule. It depends on how you define it.
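
Since there is no hard rule for what counts as a small file, one way to ground the discussion is to measure the file size distribution on an existing filesystem. The sketch below is only an illustration and is not from the slides: the bucket boundaries (4 KiB, 1 MiB, 1 GiB) and the names BUCKETS, bucket_for, and size_histogram are assumptions chosen for the example, so pick thresholds that match your own definition of "small".

```python
import os
from collections import Counter

# Hypothetical bucket edges in bytes. The slides give no definition of
# "small", so these thresholds are assumptions for illustration only.
BUCKETS = [
    (4 * 1024, "<= 4 KiB"),
    (1024 ** 2, "<= 1 MiB"),
    (1024 ** 3, "<= 1 GiB"),
]


def bucket_for(size):
    """Return the label of the first bucket whose upper limit holds `size`."""
    for limit, label in BUCKETS:
        if size <= limit:
            return label
    return "> 1 GiB"


def size_histogram(root):
    """Walk `root` and count files per size bucket."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so a link is counted
                # by its own size rather than its target's.
                size = os.lstat(path).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            counts[bucket_for(size)] += 1
    return counts
```

Calling size_histogram on a directory tree (for example a scratch or home area) returns a Counter keyed by bucket label. Walking a large parallel filesystem this way is itself a metadata-heavy workload, so it is probably best run on a sample of directories or during quiet hours.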
