An Application Sandbox for NetBSD

secmodel_sandbox: An application sandbox for NetBSD

Stephen Herwig
University of Maryland, College Park

Abstract

We introduce a new security model for NetBSD – secmodel_sandbox – that allows per-process policies for restricting privileges. Privileges correspond to kauth authorization requests, such as a request to create a socket or read a file, and policies specify the sandbox’s decision: deny, defer, or allow. Processes may apply multiple sandbox policies to themselves, in which case the policies stack, and child processes inherit their parent’s sandbox. Sandbox policies are expressed in Lua, and the evaluation of policies uses NetBSD 7’s experimental in-kernel Lua interpreter. As such, policies may express static authorization decisions, or may register Lua functions that secmodel_sandbox invokes for a decision.

1 Introduction

A process sandbox is a mechanism for limiting the privileges of a process, as in restricting the operations the process may perform, the resources it may use, or its view of the system. Sandboxes address the dual problems of limiting the potential damage caused by running an untrusted binary, and mitigating the effects of exploitation of a trusted binary. In either case, the goal is to restrict a process to only the necessary privileges for the purported task, and, in the latter case, to also drop privileges when they are no longer needed.

Although NetBSD currently lacks a sandbox mechanism, sandbox implementations exist for various operating systems. systrace [5], a multi-platform mechanism used in earlier versions of NetBSD, and seccomp [2], a Linux-specific implementation, exemplify the approach of specifying a per-process system call policy, and use system call interposition to enforce the policy filter. For systrace, the policy format is systrace-specific, whereas seccomp specifies the policy as a BPF program.

OpenBSD’s pledge system call [4] offers a simplified interface for dropping privileges: OpenBSD groups the POSIX interface into categories, and allows processes to whitelist or pledge their use of certain categories; an attempt to perform an operation from a non-pledged category kills the process.

We implement an application sandbox for NetBSD, secmodel_sandbox, that allows per-process restriction of privileges. secmodel_sandbox plugs into the kauth framework, and uses NetBSD’s support for in-kernel Lua [7] to both specify and evaluate sandbox policies. We are developing several facilities with secmodel_sandbox, such as a secure chroot and a partial emulation of OpenBSD’s pledge system call.

2 NetBSD Overview

2.1 kauth

NetBSD 4.0 introduced the kauth kernel subsystem [3] – a clean room implementation of Apple’s kauth framework [6] for OS X – to handle authorization requests for privileged operations. Privileged operations are represented as triples of the form (scope, action, optional subaction). The predefined scopes are system, process, network, machdep, device, and vnode, each forming a namespace that is further refined by the action and subaction components. For instance, the operation to create a socket is identified by the triple (network, socket, open), and the operation to read a file by (vnode, read_data).

Some authorizations, such as (process, nice), are triggered by a single system call (setpriority); some, such as (system, mount, update), are triggered when a system call (mount) is called with specific arguments (the MNT_UPDATE flag); and others, such as (system, filehandle), may be triggered by more than one system call (fhopen and fhstat). Many system calls do not trigger a kauth request.

kauth uses an observer pattern whereby listeners register for operation requests for a given scope; when a request occurs, each listener is called. Each listener receives as arguments the operation triple, the credentials of the object (typically, the process) that triggered the authorization request, as well as additional context specific to the request.

Each listener returns a decision: either allow, deny, or defer. If any listener returns deny, the request is denied. If at least one listener returns allow and none returns deny, the request is allowed. If all listeners return defer, the decision is scope-dependent. For all scopes other than the vnode scope, the result is to deny the authorization. For the vnode scope, the authorization request contains a “fall-back” decision, which nearly always specifies a decision conforming to traditional 4.4BSD file access permissions.

2.2 secmodel

While the NetBSD kernel source contains many listeners (typically in accordance with kernel configuration options), the secmodel framework offers a lightweight convention for developing and managing a set of listeners that represents a larger security model. By default, NetBSD uses secmodel_bsd44, which implements the traditional security model based on 4.4BSD, and which itself is composed of three separate models: secmodel_suser, secmodel_securelevel, and secmodel_extensions.

An important, subtle point with the default security model is that many authorization requests are deferred, relying on kauth’s default behavior when all listeners return defer to fully implement the policy.

3 Design

We developed secmodel_sandbox as a loadable kernel module with a companion user-space library, libsandbox. By convention, we install the device file for secmodel_sandbox at /dev/sandbox.

A process interacts with secmodel_sandbox via the sandbox(const char *script, int flags) function of libsandbox. The argument script is a Lua script that specifies the sandbox policy. The flags argument specifies the action to take when a process attempts a denied operation: a value of 0 means that the operation returns an appropriate errno as dictated by kauth (typically EACCES for kauth’s vnode scope and EPERM for all other scopes); a value of SANDBOX_ON_DENY_KILL specifies the pledge behavior of killing the process. The sandbox function packages these arguments into a struct and, via an ioctl call, passes the struct to /dev/sandbox.

secmodel_sandbox evaluates the Lua script in a Lua environment that is pre-populated with a sandbox Lua module. The sandbox Lua module allows a script to set policy rules via the following interface:

    sandbox.default(result)
    sandbox.allow(req)
    sandbox.deny(req)
    sandbox.on(req, func)

The sandbox.default function specifies a result of either 'allow', 'deny', or 'defer'. The result is the sandbox’s decision for any kauth request for which the script does not specify a more specific rule.

The sandbox.allow and sandbox.deny functions specify allow and deny rules, respectively, for the kauth request given as req.

The sandbox Lua module uses strings of the form 'scope.action.subaction' to represent the requests; hence, a request to open a socket corresponds to the string 'network.socket.open', and a request to read a file to 'vnode.read_data'. A script may specify a complete request name, or a prefix. When the process triggers an authorization request, secmodel_sandbox will select the policy rule that has the longest prefix match with the given request. As an example, a sandbox policy script of:

    sandbox.default('deny')
    sandbox.allow('network')

would allow any request in the network scope, but would deny requests from all other scopes.

The sandbox.on Lua function registers a Lua function func to be called for the given kauth request. The signature for func is:

    func(req, cred, arg0, arg1, arg2, arg3)

where req is the kauth request that generated the callback, cred is a Lua table that represents the credentials of the requesting object or process, and the remaining arguments are request-specific. All parameters for func exist only in the Lua environment; manipulating the values does not affect the underlying C objects that they represent.

For many requests, the values for arg0 through arg3 are nil, as the kauth request carries no additional context. For the requests that do contain context, we translate the context into appropriate Lua values. For example, for the request 'network.socket.open', the arguments are Lua integers representing the arguments to the socket system call that triggered the request. For clarity in script writing, we pre-populate the sandbox Lua module with symbols for common constants, such as sandbox.AF_INET and sandbox.SOCK_STREAM. For requests in the process scope, arg0 is a Lua table that represents a subset of the fields of the struct proc that is the target of the request, such as the pid, ppid, comm (program name), and nice value. Callback functions for the vnode scope receive as arg0 a Lua table that contains the pathname and file status information of the target vnode. Completely representing the context with Lua values is an ongoing effort.

4 Sandbox Implementation

Our design and implementation of secmodel_sandbox considered several important requirements and features.

First, while expressing rules in Lua is elegant, having to call into Lua to find a matching rule for each request is not. Thus, we implemented secmodel_sandbox so that evaluating the policy script “compiles” the rules into a prefix tree, mimicking the natural hierarchy provided by the (scope, action, subaction) format of requests.

[...] struct sandbox. A sandbox contains two main items: a Lua state and a ruleset. The Lua state is the Lua environment in which secmodel_sandbox evaluates all Lua code for that particular sandbox. The ruleset is a prefix tree that secmodel_sandbox searches during a kauth request to find the sandbox’s matching rule.

Before secmodel_sandbox evaluates the policy script in the newly created Lua state, secmodel_sandbox adds the sandbox Lua functions (e.g., sandbox.allow) and constants (e.g., sandbox.AF_INET) to the state. Each sandbox Lua function is a closure that contains a pointer to the struct sandbox. In Lua terminology, the struct sandbox is a light userdata upvalue.

When the script calls a sandbox Lua function, the function – which is implemented in C code – performs argument checking, retrieves the ruleset from the struct [...]
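The rule by which kauth combines listener decisions (Section 2.1) can be sketched in a few lines of C. This is an illustrative model only, not the kernel's code: the names decision, combine_decisions, vnode_scope, and fallback are assumptions made for this sketch, and kauth's real identifiers and calling conventions are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative names only; these are not kauth's own constants. */
enum decision { DECISION_ALLOW, DECISION_DENY, DECISION_DEFER };

/*
 * Combine the decisions returned by all listeners for one request.
 *
 * results/n:   one decision per listener that was called
 * vnode_scope: true if the request belongs to the vnode scope
 * fallback:    fall-back decision carried by a vnode request
 *              (ignored for every other scope)
 */
enum decision
combine_decisions(const enum decision *results, size_t n,
    bool vnode_scope, enum decision fallback)
{
    bool any_allow = false;

    assert(results != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (results[i] == DECISION_DENY)
            return DECISION_DENY;       /* a single deny is final */
        if (results[i] == DECISION_ALLOW)
            any_allow = true;
    }
    if (any_allow)
        return DECISION_ALLOW;          /* some allow, no deny */

    /* Every listener deferred: the outcome depends on the scope. */
    return vnode_scope ? fallback : DECISION_DENY;
}
```

Under this model, two deferring listeners produce a deny for a network request, whereas a vnode request with the same listener results takes whatever fall-back decision it carries. This is the behavior that the default security model relies on when it defers.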
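The longest-prefix rule selection described in Section 3 can likewise be illustrated with a small C sketch. It deliberately swaps the prefix tree of Section 4 for a linear scan over a rule array, which selects the same rule but is not the data structure the paper describes. The names rule, rule_result, ruleset_lookup, and example_policy are hypothetical, and matching prefixes only at '.' component boundaries is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum rule_result { RULE_ALLOW, RULE_DENY, RULE_DEFER };

struct rule {
    const char *prefix;         /* e.g. "network" or "network.socket.open" */
    enum rule_result result;
};

/* The example policy from the text: allow 'network' (default is deny). */
const struct rule example_policy[] = {
    { "network", RULE_ALLOW },
};

/*
 * Return the length of prefix if it matches req at a component
 * boundary (end of string or a '.'), and 0 otherwise.
 */
static size_t
match_len(const char *prefix, const char *req)
{
    size_t n = strlen(prefix);

    if (n == 0 || strncmp(prefix, req, n) != 0)
        return 0;
    if (req[n] != '\0' && req[n] != '.')
        return 0;               /* "net" must not match "network" */
    return n;
}

/*
 * Select the rule with the longest matching prefix; fall back to
 * default_result when no rule matches the request.
 */
enum rule_result
ruleset_lookup(const struct rule *rules, size_t nrules,
    enum rule_result default_result, const char *req)
{
    enum rule_result best = default_result;
    size_t best_len = 0;

    assert(req != NULL);

    for (size_t i = 0; i < nrules; i++) {
        size_t len = match_len(rules[i].prefix, req);
        if (len > best_len) {
            best_len = len;
            best = rules[i].result;
        }
    }
    return best;
}
```

With example_policy and a default of RULE_DENY, a lookup of 'network.socket.open' selects the allow rule, while 'vnode.read_data' matches no rule and falls through to the default. A prefix tree gives the same answer while avoiding a scan of every rule on each request.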