OCFS2: the Oracle Clustered File System, Version 2

Total Page:16

File Type:pdf, Size:1020Kb

OCFS2: the Oracle Clustered File System, Version 2 OCFS2: The Oracle Clustered File System, Version 2 Mark Fasheh Oracle [email protected] Abstract Disk space is broken up into clusters which can range in size from 4 kilobytes to 1 megabyte. This talk will review the various components File data is allocated in extents of clusters. This of the OCFS2 stack, with a focus on the file allows OCFS2 a large amount of flexibility in system and its clustering aspects. OCFS2 ex- file allocation. tends many local file system features to the File metadata is allocated in blocks via a sub cluster, some of the more interesting of which allocation mechanism. All block allocators in are posix unlink semantics, data consistency, OCFS2 grow dynamically. Most notably, this shared readable mmap, etc. allows OCFS2 to grow inode allocation on de- In order to support these features, OCFS2 logi- mand. cally separates cluster access into multiple lay- ers. An overview of the low level DLM layer will be given. The higher level file system locking will be described in detail, including 1 Design Principles a walkthrough of inode locking and messaging for various operations. A small set of design principles has guided Caching and consistency strategies will be dis- most of OCFS2 development. None of them cussed. Metadata journaling is done on a per are unique to OCFS2 development, and in fact, node basis with JBD. Our reasoning behind that almost all are principles we learned from the choice will be described. Linux kernel community. They will, however, come up often in discussion of OCFS2 file sys- OCFS2 provides robust and performant recov- tem design, so it is worth covering them now. ery on node death. We will walk through the typical recovery process including journal re- play, recovery of orphaned inodes, and recov- 1.1 Avoid Useless Abstraction Layers ery of cached metadata allocations. Allocation areas in OCFS2 are broken up into Some file systems have implemented large ab- groups which are arranged in self-optimizing straction layers, mostly to make themselves “chains.” The chain allocators allow OCFS2 to portable across kernels. The OCFS2 develop- do fast searches for free space, and dealloca- ers have held from the beginning that OCFS2 tion in a constant time algorithm. Detail on the code would be Linux only. This has helped us layout and use of chain allocators will be given. in several ways. An obvious one is that it made 290 • OCFS2: The Oracle Clustered File System, Version 2 the code much easier to read and navigate. De- 2 Disk Layout velopment has been faster because we can di- rectly use the kernel features without worrying ocfs2_fs.h if another OS implements the same features, or Near the top of the header, one worse, writing a generic version of them. will find this comment: /* Unfortunately, this is all easier said than done. * An OCFS2 volume starts this way: * Sector 0: Valid ocfs1_vol_disk_hdr that cleanly Clustering presents a problem set which most * fails to mount OCFS. Linux file systems don’t have to deal with. * Sector 1: Valid ocfs1_vol_label that cleanly * fails to mount OCFS. When an abstraction layer is required, three * Block 2: OCFS2 superblock. principles are adhered to: * * All other structures are found * from the superblock information. */ • Mimic the kernel API. The OCFS disk headers are the only amount of • Keep the abstraction layer as thin as pos- backwards compatibility one will find within an sible. OCFS2 volume. It is an otherwise brand new • If object life timing is required, try to use cluster file system. While the file system basics the VFS object life times. are complete, there are many features yet to be implemented. The goal of this paper then, is to 1.2 Keep Operations Local provide a good explanation of where things are in OCFS2 today. Bouncing file system data around a cluster can be very expensive. Changed metadata blocks, 2.1 Inode Allocation Structure for example, must be synced out to disk before another node can read them. OCFS2 design at- tempts to break file system updates into node The OCFS2 file system has two main alloca- blocks clusters local operations as much as possible. tion units, and . Blocks can be anywhere from 512 bytes to 4 kilobytes, whereas clusters range from 4 kilobytes up to 1.3 Copy Good Ideas one megabyte. To make the file system math- ematics work properly, cluster size is always There is a wealth of open source file system im- greater than or equal to block size. At format plementations available today. Very often dur- time, the disk is divided into as many cluster- ing OCFS2 development, the question “How do sized units as will fit. Data is always allocated other file systems handle it?” comes up with re- in clusters, whereas metadata is allocated in spect to design problems. There is no reason to blocks reinvent a feature if another piece of software already does it well. The OCFS2 developers Inode data is represented in extents which are thus far have had no problem getting inspira- organized into a b-tree. In OCFS2, extents are tion from other Linux file systems.1 In some represented by a triple called an extent record. cases, whole sections of code have been lifted, Extent records are stored in a large in-inode ar- with proper citation, from other open source ray which extends to the end of the inode block. projects! When the extent array is full, the file system 1Most notably Ext3. will allocate an extent block to hold the current 2006 Linux Symposium, Volume One • 291 DISK INODE of course the set of characters which make up the file name. 2.3 The Super Block EXTENT BLOCK The OCFS2 super block information is con- Figure 1: An Inode B-tree tained within an inode block. It contains a Record Field Field Size Description standard set of super block information—block e_cpos 32 bits Offset into the size, compat/incompat/ro features, root inode file, in clusters pointer, etc. There are four values which are e_clusters 32 bits Clusters in this somewhat unique to OCFS2. extent e_blkno 64 bits Physical disk • s_clustersize_bits – Cluster size offset for the file system. Table 1: OCFS2 extent record • s_system_dir_blkno – Pointer to the system directory. array. The first extent record in the inode will be re-written to point to the newly allocated ex- • s_max_slots – Maximum number of tent block. The e_clusters and e_cpos simultaneous mounts. values will refer to the part of the tree under- • s_first_cluster_group – Block neath that extent. Bottom level extent blocks offset of first cluster group descriptor. form a linked list so that queries accross a range can be done efficiently. s_clustersize_bits is self-explanatory. The reason for the other three fields will be ex- 2.2 Directories plained in the next few sections. Directory layout in OCFS2 is very similar to 2.4 The System Directory Ext3, though unfortunately, htree has yet to be ported. The only difference in directory en- In OCFS2 file system metadata is contained try structure is that OCFS2 inode numbers are within a set of system files. There are two types 64 bits wide. The rest of this section can be of system files, global and node local. All sys- skipped by those already familiar with the di- tem files are linked into the file system via the rent structure. hidden system directory2 whose inode number Directory inodes hold their data in the same is pointed to by the superblock. To find a sys- manner which file inodes do. Directory data tem file, a node need only search the system is arranged into an array of directory en- directory for the name in question. The most tries. Each directory entry holds a 64-bit inode common ones are read at mount time as a per- pointer, a 16-bit record length, an 8-bit name formance optimization. Linking to system files length, an 8-bit file type enum (this allows us 2debugfs.ocfs2 can list the system dir with the to avoid reading the inode block for type), and ls // command. 292 • OCFS2: The Oracle Clustered File System, Version 2 from the system directory allows system file lo- ALLOCATION UNITS catations to be completely dynamic. Adding new system files is as simple as linking them into the directory. Global system files are generally accessible by any cluster node at any time, given that it GROUP has taken the proper cluster-wide locks. The DESCRIPTOR global_bitmap is one such system file. There are many others. Figure 2: Allocation Group Node local system files are said to be owned by groups are then chained together into a set of a mounted node which occupies a unique slot. singly linked lists, which start at the allocator The maximum number of slots in a file sys- inode. tem is determined by the s_max_slots su- perblock field. The slot_map global system The first block of the first allocation unit file contains a flat array of node numbers which within a group contains an ocfs2_group_ details which mounted node occupies which set descriptor. The descriptor contains a small of node local system files. set of fields followed by a bitmap which ex- tends to the end of the block. Each bit in the Ownership of a slot may mean a different bitmap corresponds to an allocation unit within thing to each node local system file. For the group. The most important descriptor fields some, it means that access to the system file follow. is exclusive—no other node can ever access it. For others it simply means that the owning node gets preferential access—for an allocator file, • bg_free_bits_count – number of this might mean the owning node is the only unallocated units in this group.
Recommended publications
  • Oracle® Linux 7 Managing File Systems
    Oracle® Linux 7 Managing File Systems F32760-07 August 2021 Oracle Legal Notices Copyright © 2020, 2021, Oracle and/or its affiliates. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs) and Oracle computer documentation or other Oracle data delivered to or accessed by U.S. Government end users are "commercial computer software" or "commercial computer software documentation" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, reproduction, duplication, release, display, disclosure, modification, preparation of derivative works, and/or adaptation of i) Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs), ii) Oracle computer documentation and/or iii) other Oracle data, is subject to the rights and limitations specified in the license contained in the applicable contract.
    [Show full text]
  • Parallel File Systems for HPC Introduction to Lustre
    Parallel File Systems for HPC Introduction to Lustre Piero Calucci Scuola Internazionale Superiore di Studi Avanzati Trieste November 2008 Advanced School in High Performance and Grid Computing Outline 1 The Need for Shared Storage 2 The Lustre File System 3 Other Parallel File Systems Parallel File Systems for HPC Cluster & Storage Piero Calucci Shared Storage Lustre Other Parallel File Systems A typical cluster setup with a Master node, several computing nodes and shared storage. Nodes have little or no local storage. Parallel File Systems for HPC Cluster & Storage Piero Calucci The Old-Style Solution Shared Storage Lustre Other Parallel File Systems • a single storage server quickly becomes a bottleneck • if the cluster grows in time (quite typical for initially small installations) storage requirements also grow, sometimes at a higher rate • adding space (disk) is usually easy • adding speed (both bandwidth and IOpS) is hard and usually involves expensive upgrade of existing hardware • e.g. you start with an NFS box with a certain amount of disk space, memory and processor power, then add disks to the same box Parallel File Systems for HPC Cluster & Storage Piero Calucci The Old-Style Solution /2 Shared Storage Lustre • e.g. you start with an NFS box with a certain amount of Other Parallel disk space, memory and processor power File Systems • adding space is just a matter of plugging in some more disks, ot ar worst adding a new controller with an external port to connect external disks • but unless you planned for
    [Show full text]
  • Ubuntu Server Guide Basic Installation Preparing to Install
    Ubuntu Server Guide Welcome to the Ubuntu Server Guide! This site includes information on using Ubuntu Server for the latest LTS release, Ubuntu 20.04 LTS (Focal Fossa). For an offline version as well as versions for previous releases see below. Improving the Documentation If you find any errors or have suggestions for improvements to pages, please use the link at thebottomof each topic titled: “Help improve this document in the forum.” This link will take you to the Server Discourse forum for the specific page you are viewing. There you can share your comments or let us know aboutbugs with any page. PDFs and Previous Releases Below are links to the previous Ubuntu Server release server guides as well as an offline copy of the current version of this site: Ubuntu 20.04 LTS (Focal Fossa): PDF Ubuntu 18.04 LTS (Bionic Beaver): Web and PDF Ubuntu 16.04 LTS (Xenial Xerus): Web and PDF Support There are a couple of different ways that the Ubuntu Server edition is supported: commercial support and community support. The main commercial support (and development funding) is available from Canonical, Ltd. They supply reasonably- priced support contracts on a per desktop or per-server basis. For more information see the Ubuntu Advantage page. Community support is also provided by dedicated individuals and companies that wish to make Ubuntu the best distribution possible. Support is provided through multiple mailing lists, IRC channels, forums, blogs, wikis, etc. The large amount of information available can be overwhelming, but a good search engine query can usually provide an answer to your questions.
    [Show full text]
  • Comparative Analysis of Distributed and Parallel File Systems' Internal Techniques
    Comparative Analysis of Distributed and Parallel File Systems’ Internal Techniques Viacheslav Dubeyko Content 1 TERMINOLOGY AND ABBREVIATIONS ................................................................................ 4 2 INTRODUCTION......................................................................................................................... 5 3 COMPARATIVE ANALYSIS METHODOLOGY ....................................................................... 5 4 FILE SYSTEM FEATURES CLASSIFICATION ........................................................................ 5 4.1 Distributed File Systems ............................................................................................................................ 6 4.1.1 HDFS ..................................................................................................................................................... 6 4.1.2 GFS (Google File System) ....................................................................................................................... 7 4.1.3 InterMezzo ............................................................................................................................................ 9 4.1.4 CodA .................................................................................................................................................... 10 4.1.5 Ceph.................................................................................................................................................... 12 4.1.6 DDFS ..................................................................................................................................................
    [Show full text]
  • LSF ’07: 2007 Linux Storage & Filesystem Workshop Namically Adjusted According to the Current Popularity of Its Hot Zone
    June07login1summaries_press.qxd:login summaries 5/27/07 10:27 AM Page 84 PRO: A Popularity-Based Multi-Threaded Reconstruction overhead of PRO is O(n), although if a priority queue is Optimization for RAID-Structured Storage Systems used in the PRO algorithm the computation overhead can Lei Tian and Dan Feng, Huazhong University of Science and be reduced to O(log n). The entire PRO implementation in Technology; Hong Jiang, University of Nebraska—Lincoln; Ke the RAIDFrame software only added 686 lines of code. Zhou, Lingfang Zeng, Jianxi Chen, and Zhikun Wang, Hua- Work on PRO is ongoing. Future work includes optimiz - zhong University of Science and Technology and Wuhan ing the time slice, scheduling strategies, and hot zone National Laboratory for Optoelectronics; Zhenlei Song, length. Currently, PRO is being ported into the Linux soft - Huazhong University of Science and Technology ware RAID. Finally, the authors plan on further investigat - ing use of access patterns to help predict user accesses and Hong Jiang began his talk by discussing the importance of of filesystem semantic knowledge to explore accurate re - data recovery. Disk failures have become more common in construction. RAID-structured storage systems. The improvement in disk capacity has far outpaced improvements in disk band - The first questioner asked about the average rate of recov - width, lengthening the overall RAID recovery time. Also, ery for PRO. Hong answered that the average reconstruc - disk drive reliability has improved slowly, resulting in a tion time is several hundred seconds in the experimental very high overall failure rate in a large-scale RAID storage setup.
    [Show full text]
  • Oracle® Private Cloud Appliance Administrator's Guide for Release 2.4.3
    Oracle® Private Cloud Appliance Administrator's Guide for Release 2.4.3 F23081-02 August 2020 Oracle Legal Notices Copyright © 2013, 2020, Oracle and/or its affiliates. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs) and Oracle computer documentation or other Oracle data delivered to or accessed by U.S. Government end users are "commercial computer software" or "commercial computer software documentation" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, reproduction, duplication, release, display, disclosure, modification, preparation of derivative works, and/or adaptation of i) Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs), ii) Oracle computer documentation and/or iii) other Oracle data, is subject to the rights and limitations specified in the license contained in the applicable contract.
    [Show full text]
  • Ceph – Software Defined Storage Für Die Cloud
    Ceph – Software Defined Storage für die Cloud CeBIT 2016 15. März 2015 Michel Rode Linux/Unix Consultant & Trainer B1 Systems GmbH [email protected] B1 Systems GmbH - Linux/Open Source Consulting, Training, Support & Development Vorstellung B1 Systems gegründet 2004 primär Linux/Open Source-Themen national & international tätig über 70 Mitarbeiter unabhängig von Soft- und Hardware-Herstellern Leistungsangebot: Beratung & Consulting Support Entwicklung Training Betrieb Lösungen dezentrale Strukturen B1 Systems GmbH Ceph – Software Defined Storage für die Cloud 2 / 36 Schwerpunkte Virtualisierung (XEN, KVM & RHEV) Systemmanagement (Spacewalk, Red Hat Satellite, SUSE Manager) Konfigurationsmanagement (Puppet & Chef) Monitoring (Nagios & Icinga) IaaS Cloud (OpenStack & SUSE Cloud & RDO) Hochverfügbarkeit (Pacemaker) Shared Storage (GPFS, OCFS2, DRBD & CEPH) Dateiaustausch (ownCloud) Paketierung (Open Build Service) Administratoren oder Entwickler zur Unterstützung des Teams vor Ort B1 Systems GmbH Ceph – Software Defined Storage für die Cloud 3 / 36 Storage Cluster B1 Systems GmbH Ceph – Software Defined Storage für die Cloud 4 / 36 Was sind Storage Cluster? hochverfügbare Systeme verteilte Standorte skalierbar (mehr oder weniger) Problem: Häufig Vendor-Lock-In 80%+ basieren auf FC B1 Systems GmbH Ceph – Software Defined Storage für die Cloud 5 / 36 Beispiele 1/2 Dell PowerVault IBM SVC NetApp Metro Cluster NetApp Clustered Ontap ... B1 Systems GmbH Ceph – Software Defined Storage für die Cloud 6 / 36 Beispiele 2/2 AWS S3 Rackspace Files Google Cloud
    [Show full text]
  • Comparing a Highly-Available Symmetrical Parallel Cluster File System with an Asymmetrical Parallel File System
    pCFS vs. PVFS: Comparing a Highly-Available Symmetrical Parallel Cluster File System with an Asymmetrical Parallel File System Paulo A. Lopes and Pedro D. Medeiros CITI and Department of Informatics, Faculty of Science and Technology, Universidade Nova de Lisboa 2829-516 Monte de Caparica, Portugal {pal,pm}@di.fct.unl.pt Abstract. pCFS is a highly available parallel, symmetrical (where nodes per- form both compute and I/O work) cluster file system that we have designed to run in medium-sized clusters. In this paper, using exactly the same hardware and Linux version across all nodes we compare pCFS with two distinct configu- rations of PVFS: one using internal disks, and therefore not able to provide any tolerance against disk and/or I/O node failures, and another where PVFS I/O servers access LUNs in a disk array and thus provide high availability (in the following named HA-PVFS). We start by measuring I/O bandwidth and CPU consumption of PVFS and HA-PVFS setups; then, the same set of tests is performed with pCFS. We conclude that, when using the same hardware, pCFS compares very favourably with HA-PVFS, offering the same or higher I/O bandwidths at a much lower CPU consumption. Keywords: Shared-disk cluster file systems, performance, parallel file systems. 1 Introduction In asymmetrical file systems, some nodes must be set aside as I/O or metadata servers while others are used as compute nodes, while in symmetrical file systems all nodes run the same set of services and may simultaneously perform both I/O and computa- tional tasks.
    [Show full text]
  • Introduction to Distributed File Systems. Orangefs Experience 0.05 [Width=0.4]Lvee-Logo-Winter
    Introduction to distributed file systems. OrangeFS experience Andrew Savchenko NRNU MEPhI, Moscow, Russia 16 February 2013 . .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Outline 1. Introduction 2. Species of distributed file systems 3. Behind the curtain 4. OrangeFS 5. Summary . .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Introduction Why one needs a non-local file system? • a large data storage • a high performance data storage • redundant and highly available solutions There are dozens of them: 72 only on wiki[1], more IRL. Focus on free software solutions. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Introduction Why one needs a non-local file system? • a large data storage • a high performance data storage • redundant and highly available solutions There are dozens of them: 72 only on wiki[1], more IRL. Focus on free software solutions. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Species of distributed file systems Network Clustered Distributed Terminology is ambiguous!. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Network file systems client1 client2 clientN server storage A single server (or at least an appearance) and multiple network clients. Examples: NFS, CIFS. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. Clustered file systems server1 server2 serverN storage Servers sharing the same local storage (usually SAN[2] at block level). shared storage architecture. Examples: GFS2[3], OCFS2[4]. .. .. .. .. .
    [Show full text]
  • View the Slides
    SMB3.1.1 POSIX Protocol Extensions: Summary and Current Implementation Status Steve French Azure Storage – Microsoft Samba Team And SMB Jeremy Allison Google/Samba Team 3.1.1 Legal Statement This work represents the views of the author(s) and does not necessarily reflect the views of Microsoft or Google Linux is a registered trademark of Linus Torvalds. Other company, product, and service names may be trademarks or service marks of others. Outline Linux is a lot more than POSIX ... Why do these extensions matter? Implementation Status What works today? Some details How to handle Linux continuing to extend APIs? Wireshark and Tracing Linux > POSIX Currently huge number of syscalls! (try “git grep SYSCALL_DEFINE” well over 850 and 500+ are even documented “man syscalls” FS layer has 223). Verified today vs Only about 100 POSIX API calls 513 syscalls with man pages! +12 just since last year’s SDC! Some examples of new fs ones from past 9 months ... Syscall name Kernel Version introduced io_uring_enter 5.1 io_uring_register 5.1 io_uring_setup 5.1 move_mount 5.2 open_tree 5.2 fsconfig 5.2 fsmount 5.2 fsopen 5.2 fspick 5.2 Repeating an old slide ... Remember LINUX > POSIX And not just new syscalls … new flags ... 2 examples of richer Linux vs. simpler POSIX fallocate has 7 flags – Insert range – Unshare range – Zero range – Keep size – But POSIX fallocate has no flags Rename (renameat2) has 3 flags – noreplace, whiteout and exchange – POSIX rename has none Network File systems matter ● these extensions to most popular network fs protocol (SMB3) are important ● block devices struggle to do file system tasks: locking, security, leases, consistent metadata Linux Apps need to work over network mounts and continue to work as Linux evolves Improve common situations where customers have Linux and Windows and Mac clients Make sure extensions work with most secure, most optimal SMB3.1.1 dialect (don’t encourage less secure network file systems, or even SMB1/CIFS) Quick Overview of Status ● Linux kernel client: – 5.1 kernel or later can be used.
    [Show full text]
  • White Paper | September 2014
    Oracle OpenStack for Oracle Linux High Availability Guide Release 1.0 ORACLE WHITE PAPER | SEPTEMBER 2014 ORACLE OPENSTACK FOR ORACLE LINUX HIGH AVAILABILITY GUIDE Contents Introduction 1 Active-Passive HA Deployment Architecture 1 Active-Passive HA Networking 2 Install Oracle Grid Infrastructure 3 Deploy OpenStack on Nodes 4 Configure iptables for Oracle Clusterware Traffic 4 Install OpenStack HA Package 5 Configure HA for MySQL 5 Configure HA for RabbitMQ 7 Configure HA for OpenStack 8 Configure HA for Keystone 8 Configure HA for Cinder 9 Configure HA for Glance 11 Configure HA for Swift 12 Configure HA for Nova 13 Configure HA for Neutron 15 Configure HA for Network Controller Nodes 15 ORACLE OPENSTACK FOR ORACLE LINUX HIGH AVAILABILITY GUIDE Introduction A High Availability (HA) cluster is a set of servers (nodes) managed by clusterware to work together for minimizing application downtime as much as possible. Clusterware enables servers to communicate with each other, so that they appear to function as a collective unit. In an active-passive HA cluster, there is one active server running the application, and one or more nodes standing by. If any problem were to occur on the active node that prevents it from running the application, the clusterware is able to detect this problem and start the protected applications on one of the standby nodes. Oracle Clusterware provides the infrastructure necessary to run Oracle Real Application Clusters (Oracle RAC). Oracle Clusterware can also manage user applications and other resources, such as virtual IP addresses. This document shows you how to use Oracle Clusterware to manage Oracle OpenStack nodes to make its services HA.
    [Show full text]
  • Around the Linux File System World in 45 Minutes
    Around the Linux File System World in 45 minutes Steve French IBM Samba Team [email protected] Abstract lines of code), most active (averaging 10 changesets a day!), and most important kernel subsystems. What makes the Linux file system interface unique? What has improved over the past year in this important part of the kernel? Why are there more than 50 Linux 2 The Linux Virtual File System Layer File Systems? Why might you choose ext4 or XFS, NFS or CIFS, or OCFS2 or GFS2? The differences are not al- The heart of the Linux file system, and what makes ways obvious. This paper will describe the new features Linux file systems unique, is the virtual file system in the Linux VFS, how various Linux file systems dif- layer, or VFS, which they connect to. fer in their use, and compare some of the key Linux file systems. 2.1 Virtual File System Structures and Relation- ships File systems are one of the largest and most active parts of the Linux kernel, but some key sections of the file sys- tem interface are little understood, and with more than The relationships between Linux file system compo- than 50 Linux file systems the differences between them nents is described in various papers [3] and is impor- can be confusing even to developers. tant to understand when carefully comparing file system implementations. 1 Introduction: What is a File System? 2.2 Comparison with other Operating Systems Linux has a rich file system interface, and a surprising number of file system implementations.
    [Show full text]