File Systems and Storage
Rich Sudlow and Paul Brenner
University of Notre Dame
Center for Research Computing
Last Modified: 06/22/11

Overview
• File System Concepts
  – Aspects and Types
  – Why do we have so many?
• Redundancy and Performance
  – RAID
  – Examples on X4500 “thumper”
• CRC Supported File Systems
  – Centralized (ext3, FAT32/NTFS)
  – Distributed (AFS, NFS)
  – Comparison – capabilities
• AFS
  – crc.nd.edu, nd.edu cells
• Backup Storage
  – Software, Cache, Tape Silo
• Using CRC Storage
  – Scratch space, User workspace, Backup

Disclaimer
• This is:
  – A broad overview
  – An operational viewpoint
  – A starting point
• This is not:
  – Comprehensive
  – An authoritative analysis of the industry/technology
• For more info consider:
  – ND courses relevant to this topic
  – Contacting CRC for specific individual requirements

File Systems Concepts
• Aspects
  – Filenames
  – Metadata (size, # blocks, time, security)
  – Hierarchical vs Flat
  – Secure access
  – Capabilities/Facilities (move, delete, append)
  – Why so many? Why not use just one?
• Types
  – Disk/Flash
  – Database & Transactional
  – Network/Distributed
  – Special Purpose

Wikipedia – List of File Systems
http://en.wikipedia.org/wiki/List_of_file_systems

Redundancy and Performance
• File system design is strongly influenced by the target feature set
  – Bandwidth, security, latency, distributed access, fault tolerance, etc.
• File systems can be tuned and tiered to exploit the feature sets of operating systems and file systems, e.g.
  Solaris vs Linux – Underlying ZFS or ext3 file system for NFS/AFS
• Scalability
  – Ability to run across multiple nodes
  – By multiple users
  – Local vs distributed

Wikipedia – Comparison of File Systems
http://en.wikipedia.org/wiki/Comparison_of_file_systems

RAID
• Redundant Array(s) of Inexpensive/Independent Disks
• Utilize multiple/many disks to provide
  – Capacity
  – Reliability/Redundancy
  – Performance – configurable based on user requirements: IOPS, bandwidth, various block sizes
• Hardware and Software Implementations
  – Performance, Flexibility, Boot (‘chicken and egg’ problem)
• RAID ‘Levels’
  – Disk utilization configurations to optimize cost vs capability
• Tiered/Nested RAID Levels
  – One RAID level on top of another: RAID 0+1 vs RAID 10

RAID Levels
• RAID 0 – Striped: Performance and Capacity
• RAID 1 – Mirrored: Read Performance and Fault Tolerance (FT)
• RAID 3 & 4 – Striped with Dedicated Parity
• RAID 5 – Striped with Distributed Parity: Performance, Capacity, N+1 FT
• RAID 6 – Striped with Dual Distributed Parity: Performance, Capacity, N+2 FT
• RAID 0+1 – Striped sets in a mirrored set
• RAID 1+0 (generally just called RAID 10) – Mirrored sets in a striped set
• RAID 50 – Striped (0) across distributed-parity RAID (5) sets

RAID Reference
http://en.wikipedia.org/wiki/RAID

RAID example – Sun Thumper X4500
[Figure: X4500 disk layout; controller labels C5, C4, C7, C6, C1, C0]
• vicepb – single disk
• vicepc – 2 disk stripe
• vicepd – 3 disk stripe
• vicepe – 4 disk stripe
• vicepf – 6 disk stripe
• vicepg – single disk mirror
• viceph – 2 disk stripe mirror
• vicepi – 3 disk stripe mirror
• vicep{j, k, l} – 3 disk stripe – used only for a read problem encountered in the multiclient test
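
As a rough illustration of the capacity and fault-tolerance trade-offs listed under RAID Levels, here is a minimal Python sketch. The helper name raid_summary, the 6-disk layout, and the 1 TB disk size are assumptions chosen for illustration; they are not X4500 measurements. RAID 10 is assumed to use two-way mirrors, and the fault-tolerance figure is the guaranteed (worst-case) number of disk failures survived.

```python
from typing import Tuple


def raid_summary(level: str, disks: int, disk_tb: float) -> Tuple[float, int]:
    """Return (usable capacity in TB, disk failures always survived).

    Simplified textbook model: equal-size disks, no hot spares, no
    file system overhead. Hypothetical helper, not a real tool's API.
    """
    if level == "0":      # striping only: all capacity, no redundancy
        minimum, usable, survives = 2, disks, 0
    elif level == "1":    # every disk holds a full copy of the data
        minimum, usable, survives = 2, 1, disks - 1
    elif level == "5":    # one disk's worth of distributed parity
        minimum, usable, survives = 3, disks - 1, 1
    elif level == "6":    # two disks' worth of distributed parity
        minimum, usable, survives = 4, disks - 2, 2
    elif level == "10":   # stripe across two-way mirrors (assumed)
        if disks % 2:
            raise ValueError("RAID 10 with two-way mirrors needs an even disk count")
        minimum, usable, survives = 4, disks // 2, 1
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    return usable * disk_tb, survives


if __name__ == "__main__":
    # Illustrative layout only: six hypothetical 1 TB disks.
    for level in ("0", "5", "6", "10"):
        capacity, survives = raid_summary(level, disks=6, disk_tb=1.0)
        print(f"RAID {level:>2}: {capacity:.1f} TB usable, survives {survives} failure(s)")
```

The model ignores performance entirely; the UFS and ext3 test results linked from this deck are the place to look for measured throughput.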

Links to RAID examples
Solaris/Red Hat on Sun X4500 (thumper)
• UFS tests on Solaris 10 using Sun Volume Manager
  http://www.nd.edu/~rich/afsbpw2007/thumper01-solaris-ufs-tests
• ext3 tests on Linux (RH4U4)
  http://www.nd.edu/~rich/afsbpw2007/thumper02-linux-ext3-tests

CRC Supported File Systems
Why do we have so many?
• Centralized
  – ext3 (Linux) – Red Hat 4 & 5
  – FAT32/NTFS (Windows)
• Distributed
  – NFS
  – AFS
• Others
  – ZFS

CRC Supported File Systems
http://www.nd.edu/~rich/CRC_filesystems.html
• Scratch Space
• User Workspace
• Storage Backup
  – Available for backup of CRC and research machines on campus

Scalability
• Scalability of file system
• Bottlenecks
• Scalability of network
• Scalability of codes
• Simple testing tools
  – http://ndt.hpcc.nd.edu:7123 (simple but not always accurate)
  – nuttcp – /opt/und/local/bin/nuttcp – firewalls
      nuttcp -t (-r) opteron.hpcc.nd.edu (Don’t abuse)
  – diskrate
      diskrate -n 10m -f trash

File Permissions (Linux)
• What the heck does this mean?
  – drwxr-xr--
• File permissions for user, group, and other
  – 10 characters; the first indicates whether the entry is a directory
• Triples of rwx indicate read, write, and execute for user, group, and other (everyone else)
• Change file permissions with ‘chmod’
  – Examples:
    • chmod a+r filename
    • chmod go+w filename
    • chmod 1777 directory
    • chmod 700 directory

File Permissions (AFS)
• fs setacl -dir $HOME -acl pat all terry none
  – fs is the command suite.
  – setacl is the operation code, which directs the File Server process to set an access control list.
  – -dir $HOME and -acl pat all terry none are arguments. (The terry none entry implies that terry previously had access.)
  – -dir and -acl are switches; -dir indicates the name of the directory on which to set the ACL, and -acl defines the entries to set on it.
  – $HOME and pat all terry none are instances of the arguments. $HOME defines a specific directory for the directory argument.
    The -acl argument has two instances specifying two ACL entries: pat all and terry none.
• Command abbreviations
  – fs listacl (full command), fs lista (abbreviation), fs la (alias)

File Permissions (AFS)
• AFS gives each user permission to create their own groups
• Common to use the syntax owner:group
    pts creategroup cvrl:cvrl_group
    pts adduser rich cvrl:cvrl_group
    pts membership cvrl:cvrl_group
• To recursively set permissions:
    find ./ -type d -print -exec fs setacl {} cvrl:cvrl_group read \;
• Special groups: system:anyuser, system:authuser, nd_campus
• IP based Access Control Lists

AFS References
• AFS Reference Links
  http://crcmedia.hpcc.nd.edu/wiki/index.php/AFS_References_and_Resources
• Some AFS / NFS Storage comparisons
  http://crcmedia.hpcc.nd.edu/wiki/index.php/CRC_Storage_Comparisons
• Sometimes the system is more than just storage: features are important, but they need to be the ones users actually use.

AFS – crc.nd.edu – nd.edu cell
• nd.edu is the campus legacy OpenAFS cell
  – Started May 1990
  – Uses the ND.EDU Kerberos realm
  – Run by OIT staff
  – Currently the default cell for most CRC logins and the batch system (opteron, opterona, stats)
• crc.nd.edu is the “new” cell run by CRC staff
  – Started October 2007
  – Uses the CRC.ND.EDU realm
  – The future cell for CRC logins and the batch system; target for rollout June 2008
  – Hardware and administrative differences
• Kerberos 4 EOL scheduled for 12/2008 for the nd.edu cell

AFS – crc.nd.edu – nd.edu cell
• CRC Wiki Links
  http://crcmedia.hpcc.nd.edu/wiki/index.php/CRC_AFS_Cell
• Accessing multiple cells
  http://crcmedia.hpcc.nd.edu/wiki/index.php/Automatic_CRC/ND_AFS_cell_setup
• Recommendations on cells for primary access – interactive use
• Methods for migrating data – tar, up, cp, vos dump/restore, start fresh
• Issues with interactive use – references to nd.edu that you don’t know about, e.g.
  mozilla, etc.

CRC Storage Backup – B023 Malloy Hall
• Software – Teradactyl Inc.
  – True Incremental Backup System (TiBS) http://www.teradactyl.com
• Available for backup of CRC and any research machines in colleges
  – On-site training June 16-20, 2008
• Supported architectures include OpenAFS, Solaris, Linux, Windows, Mac OS X
• Hardware
  – Backup server – Dell PowerEdge 6950 server utilizing 10 Gb Ethernet & Fibre Channel interfaces
  – Cache – 16 TB Infortrend Fibre Channel Array

Storage Backup
• Sony Consolidated Storage Management System (CSM 200)
• Capacity of 604 tapes
  – 3 LTO4 drives with 1 TB tapes
  – 2 TB per tape with 2:1 compression
  – Library will hold > 1 PB without reloading tapes
  – Expands to 2,988 tapes with 96 drives

References
• Wikipedia: File Systems
• Advanced File Systems Issues – Andy Wang, FSU
  http://www.cs.fsu.edu/~awang/courses/cop5611_s2004/
• ND CRC wiki
  http://crc.nd.edu/wiki
• OpenAFS User Guide
  http://www.openafs.org/doc/index.htm
• OpenAFS Best Practices 2007 – Sudlow
  http://crc.nd.edu/facilities/documents/afsbpw2007.pdf

Questions?
• How can we improve this class?
  – Additional topics?
  – Cover one topic more thoroughly?
  – Remove topics?
  – Thanks for the feedback!