Spacejmp: Programming with Multiple Virtual Address Spaces

Total Page:16

File Type:pdf, Size:1020Kb

Spacejmp: Programming with Multiple Virtual Address Spaces SpaceJMP: Programming with Multiple Virtual Address Spaces Izzat El Hajj3,4,*, Alexander Merritt2,4,*, Gerd Zellweger1,4,*, Dejan Milojicic4, Reto Achermann1, Paolo Faraboschi4, Wen-mei Hwu3, Timothy Roscoe1, and Karsten Schwan2 1Department of Computer Science, ETH Zürich 2Georgia Institute of Technology 3University of Illinois at Urbana-Champaign 4Hewlett Packard Labs *Co-primary authors Abstract based data structures across process lifetimes, and sharing Memory-centric computing demands careful organization of very large memory objects between processes. the virtual address space, but traditional methods for doing so We expect that main memory capacity will soon exceed are inflexible and inefficient. If an application wishes to ad- the virtual address space size supported by CPUs today dress larger physical memory than virtual address bits allow, (typically 256 TiB with 48 virtual address bits). This is if it wishes to maintain pointer-based data structures beyond made particularly likely with non-volatile memory (NVM) process lifetimes, or if it wishes to share large amounts of devices – having larger capacity than DRAM – expected to memory across simultaneously executing processes, legacy appear in memory systems by 2020. While some vendors interfaces for managing the address space are cumbersome have already increased the number of virtual address (VA) and often incur excessive overheads. bits in their processors, adding VA bits has implications on We propose a new operating system design that promotes performance, power, production, and estate cost that make it virtual address spaces to first-class citizens, enabling process undesirable for low-end and low-power processors. Adding threads to attach to, detach from, and switch between multiple VA bits also implies longer TLB miss latency, which is virtual address spaces. Our work enables data-centric appli- already a bottleneck in current systems (up to 50% overhead cations to utilize vast physical memory beyond the virtual in scientific apps [55]). Processors supporting virtual address range, represent persistent pointer-rich data structures with- spaces smaller than available physical memory require large out special pointer representations, and share large amounts data structures to be partitioned across multiple processes or of memory between processes efficiently. these structures to be mapped in and out of virtual memory. We describe our prototype implementations in the Dragon- Representing pointer-based data structures beyond pro- Fly BSD and Barrelfish operating systems. We also present cess lifetimes requires data serialization, which incurs a large programming semantics and a compiler transformation to performance overhead, or the use of special pointer represen- detect unsafe pointer usage. We demonstrate the benefits of tations, which results in awkward programming techniques. our work on data-intensive applications such as the GUPS Sharing and storing pointers in their original form across benchmark, the SAMTools genomics workflow, and the Redis process lifetimes requires guaranteed acquisition of specific key-value store. VA locations. Providing this guarantee is not always feasible, and when feasible, may necessitate mapping datasets residing 1. Introduction at conflicting VA locations in and out of the virtual space. Sharing data across simultaneously executing processes The volume of data processed by applications is increasing requires special communication with data-serving processes dramatically, and the amount of physical memory in machines (e.g., using sockets), impacting programmability and incur- is growing to meet this demand. However, effectively using ring communication channel overheads. Sharing data via this memory poses challenges for programmers. The main traditional shared memory requires tedious communication challenges tackled in this paper are addressing more physical and synchronization between all client processes for growing memory than the size of a virtual space, maintaining pointer- the shared region or guaranteeing consistency on writes. To address all these challenges, we present SpaceJMP, a set of APIs, OS mechanisms, and compiler techniques that Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without constitute a new way to manage memory. SpaceJMP appli- fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must cations create and manage Virtual Address Spaces (VASes) be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. as first-class objects, independent of processes. Decoupling ASPLOS ’16 April 02 - 06, 2016, Atlanta, GA, USA Copyright is held by the owner/author(s). Publication rights licensed to ACM. VASes from processes enables a single process to activate ACM 978-1-4503-4091-5/16/04. $15.00 DOI: http://dx.doi.org/10.1145/2872362.2872366 multiple VASes, such that threads of that process can switch between these VASes in a lightweight manner. It also enables 104 a single VAS to be activated by multiple processes or to exist 103 map map (cached) in the system alone in a self-contained manner. 102 unmap unmap (cached) 101 SpaceJMP solves the problem of insufficient VA bits by 100 1 allowing a process to place data in multiple address spaces. 10− 2 If a process runs out of virtual memory, it does not need to 10− Latency [msec] 3 10− modify data mappings or create new processes. It simply 4 10− creates more address spaces and switches between them. 15 20 25 30 35 SpaceJMP solves the problem of managing pointer-based Region Size [power of 2, bytes] data structures because SpaceJMP processes can always guarantee the availability of a VA location by creating a new Figure 1: Page table construction (mmap) and removal address space. SpaceJMP solves the problem of data sharing (munmap) costs in Linux, using 4KiB pages. Does not include by allowing processes to switch into a shared address space. page zeroing costs. Clients thus need not communicate with a server process or synchronize with each other on shared region management, and synchronization on writes can be lightweight. bits to the virtual memory translation unit because of power SpaceJMP builds on a wealth of historical techniques in and performance implications. Most CPUs today are limited memory management and virtualization but occupies a novel to 48 virtual address bits (i.e., 256 TiB) and 44-46 physical “sweet spot” for modern data-centric applications: it maps address bits (16-64 TiB), and we expect a slow growth in well to existing hardware while delivering more flexibility these numbers. and performance for large data-centric applications compared The challenge presented is how to support applications with current OS facilities. that want to address large physical memories without paying We make the following contributions: the cost of increasing processor address bits across the board. One solution is partitioning physical memory across multiple 1. We present SpaceJMP (Section 3): the OS facilities pro- OS processes which would incur unnecessary inter-process vided for processes to each create, structure, compose, and communication overhead and is tedious to program. Another access multiple virtual address spaces, together with the solution is mapping memory partitions in and out of the VAS, compiler support for detecting unsafe program behavior. which has overheads discussed in Section 2.4. 2. We describe and compare (Section 4) implementations of 2.2 Maintaining Pointer-based Data Structures SpaceJMP for the 64-bit x86 architecture in two different OSes: DragonFly BSD and Barrelfish. Maintaining pointer-based data structures beyond process lifetimes without serialization overhead has motivated emerg- 3. We empirically show (Section 5) how SpaceJMP improves ing persistent memory programming models to adopt region- performance with microbenchmarks and three data-centric based programming paradigms [15, 21, 78]. While these ap- applications: the GUPS benchmark, the SAMTools ge- proaches provide a more natural means for applications to nomics workflow, and the Redis key-value store. interact with data, they present the challenge of how to repre- sent pointers meaningfully across processes. 2. Motivation One solution is the use of special pointers, but this hin- Our primary motivations are the challenges we have already ders programmability. Another solution is requiring memory mentioned: insufficient VA bits, managing pointer-based data regions to be mapped to fixed virtual addresses by all users. structures, and sharing large amounts of memory – discussed This solution creates degenerate scenarios where memory in Sections 2.1, 2.2, and 2.3 respectively. is mapped in and out when memory regions overlap, the drawbacks of which are discussed in Section 2.4. 2.1 Insufficient Virtual Address Bits Non-Volatile Memory (NVM) technologies, besides persis- 2.3 Sharing Large Memory tence, promise better scaling and lower power than DRAM, Collaborative environments where multiple processes share which will enable large-scale, densely packed memory sys- large amounts of data require clean mechanisms for processes tems with much larger capacity than today’s DRAM-based to access and manage
Recommended publications
  • Redis and Memcached
    Redis and Memcached Speaker: Vladimir Zivkovic, Manager, IT June, 2019 Problem Scenario • Web Site users wanting to access data extremely quickly (< 200ms) • Data being shared between different layers of the stack • Cache a web page sessions • Research and test feasibility of using Redis as a solution for storing and retrieving data quickly • Load data into Redis to test ETL feasibility and Performance • Goal - get sub-second response for API calls for retrieving data !2 Why Redis • In-memory key-value store, with persistence • Open source • Written in C • It can handle up to 2^32 keys, and was tested in practice to handle at least 250 million of keys per instance.” - http://redis.io/topics/faq • Most popular key-value store - http://db-engines.com/en/ranking !3 History • REmote DIctionary Server • Released in 2009 • Built in order to scale a website: http://lloogg.com/ • The web application of lloogg was an ajax app to show the site traffic in real time. Needed a DB handling fast writes, and fast ”get latest N items” operation. !4 Redis Data types • Strings • Bitmaps • Lists • Hyperlogs • Sets • Geospatial Indexes • Sorted Sets • Hashes !5 Redis protocol • redis[“key”] = “value” • Values can be strings, lists or sets • Push and pop elements (atomic) • Fetch arbitrary set and array elements • Sorting • Data is written to disk asynchronously !6 Memory Footprint • An empty instance uses ~ 3MB of memory. • For 1 Million small Keys => String Value pairs use ~ 85MB of memory. • 1 Million Keys => Hash value, representing an object with 5 fields,
    [Show full text]
  • Modeling and Analyzing Latency in the Memcached System
    Modeling and Analyzing Latency in the Memcached system Wenxue Cheng1, Fengyuan Ren1, Wanchun Jiang2, Tong Zhang1 1Tsinghua National Laboratory for Information Science and Technology, Beijing, China 1Department of Computer Science and Technology, Tsinghua University, Beijing, China 2School of Information Science and Engineering, Central South University, Changsha, China March 27, 2017 abstract Memcached is a widely used in-memory caching solution in large-scale searching scenarios. The most pivotal performance metric in Memcached is latency, which is affected by various factors including the workload pattern, the service rate, the unbalanced load distribution and the cache miss ratio. To quantitate the impact of each factor on latency, we establish a theoretical model for the Memcached system. Specially, we formulate the unbalanced load distribution among Memcached servers by a set of probabilities, capture the burst and concurrent key arrivals at Memcached servers in form of batching blocks, and add a cache miss processing stage. Based on this model, algebraic derivations are conducted to estimate latency in Memcached. The latency estimation is validated by intensive experiments. Moreover, we obtain a quantitative understanding of how much improvement of latency performance can be achieved by optimizing each factor and provide several useful recommendations to optimal latency in Memcached. Keywords Memcached, Latency, Modeling, Quantitative Analysis 1 Introduction Memcached [1] has been adopted in many large-scale websites, including Facebook, LiveJournal, Wikipedia, Flickr, Twitter and Youtube. In Memcached, a web request will generate hundreds of Memcached keys that will be further processed in the memory of parallel Memcached servers. With this parallel in-memory processing method, Memcached can extensively speed up and scale up searching applications [2].
    [Show full text]
  • 微软郑宇:大数据解决城市大问题 18 更智能的家庭 19 微博拾粹 Weibo Highlights
    2015年1月 第33期 P7 WWT: 数字宇宙,虚拟星空 P16 P12 带着微软坐过山车 P4 P10 1 微软亚洲研究院 2015年1月 第33期 以“开放”和“极客创新”精神铸造一个“新”微软 3 第十六届“二十一世纪的计算”学术研讨会成功举办 4 微软与清华大学联合举办2014年微软亚太教育峰会 5 你还记得一夜暴红的机器人小冰吗? 7 Peter Lee:带着微软坐过山车 10 计算机科学家周以真专访 12 WWT:数字宇宙,虚拟星空 16 微软郑宇:大数据解决城市大问题 18 更智能的家庭 19 微博拾粹 Weibo Highlights @微软亚洲研究院 官方微博精彩选摘 20 让虚拟世界更真实 22 4K时代,你不能不知道的HEVC 25 小冰的三项绝技是如何炼成的? 26 科研伴我成长——上海交通大学ACM班学生在微软亚洲研究院的幸福实习生活 28 年轻的心与渐行渐近的梦——记微软-斯坦福产品设计创新课程ME310 30 如何写好简历,找到心仪的暑期实习 32 微软研究员Eric Horvitz解读“人工智能百年研究” 33 2 以“开放”和“极客创新”精神 铸造一个“新”微软 爆竹声声辞旧岁,喜气洋“羊”迎新年。转眼间,这一 年的时光即将伴着哒哒的马蹄声奔驰而去。在这辞旧迎新的 时刻,我们怀揣着对技术改变未来的美好憧憬,与所有为推 动互联网不断发展、为人类科技进步不断奋斗的科技爱好者 一起,敲响新年的钟声。挥别2014,一起期盼那充满希望 的新一年! 即将辞行的是很不平凡的一年。这一年的互联网风起云 涌,你方唱罢我登场,比起往日的峥嵘有过之而无不及:可 穿戴设备和智能家居成为众人争抢的高地;人工智能在逐渐 触手可及的同时又引起争议不断;一轮又一轮创业大潮此起 彼伏,不断有优秀的新公司和新青年展露头角。这一年的微 软也在众人的关注下发生着翻天覆地的变化:第三任CEO Satya Nadella屡新,确定以移动业务和云业务为中心 的公司战略,以开放的姿态拥抱移动互联的新时代,发布新一代Windows 10操作系统……于我个人而言,这一 年,就任微软亚太研发集团主席开启了我20年微软旅程上的新篇章。虽然肩上的担子更重了,但心中的激情却 丝毫未减。 “开放”和“极客创新”精神是我们在Satya时代看到微软文化的一个演进。“开放”是指不固守陈规,不囿 于往日的羁绊,用户在哪里需要我们,微软的服务就去哪里。推出Office for iOS/Android,开源.Net都是“开放” 的“新”微软对用户做出的承诺。关于“极客创新”,大家可能都已经听说过“车库(Garage)”这个词了,当年比 尔∙盖茨最早就是在车库里开始创业的。2014年,微软在全球范围内举办了“极客马拉松大赛(Hackthon)”,并鼓 励员工在业余时间加入“微软车库”项目,将自己的想法变为现实。这一系列的举措都是希望激发微软员工的“极 客创新”精神,我们也欣喜地看到变化正在微软内部潜移默化地进行着。“微软车库”目前已经公布了一部分有趣 的项目,我相信,未来还会有更多让人眼前一亮的创新。 2015年,我希望微软亚洲研究院和微软亚太研发集团的同事能够保持和发扬“开放”和“极客创新”的精 神。我们一起铸造一个“新”微软,为用户创造更多的惊喜! 微软亚洲研究院院长 3 第十六届“二十一世纪的计算”学术研讨会成功举办 2014年10月29日由微软亚洲研究院与北京大学联合主办的“二 形图像等领域的突破:微软语音助手小娜的背后,蕴藏着一系列人 十一世纪的计算”大型学术研讨会,在北京大学举行。包括有“计 工智能技术;大数据和机器学习技术帮助我们更好地认识城市和生 算机科学领域的诺贝尔奖”之称的图灵奖获得者、来自微软及学术
    [Show full text]
  • Release 2.5.5 Ask Solem Contributors
    Celery Documentation Release 2.5.5 Ask Solem Contributors February 04, 2014 Contents i ii Celery Documentation, Release 2.5.5 Contents: Contents 1 Celery Documentation, Release 2.5.5 2 Contents CHAPTER 1 Getting Started Release 2.5 Date February 04, 2014 1.1 Introduction Version 2.5.5 Web http://celeryproject.org/ Download http://pypi.python.org/pypi/celery/ Source http://github.com/celery/celery/ Keywords task queue, job queue, asynchronous, rabbitmq, amqp, redis, python, webhooks, queue, dis- tributed – • Synopsis • Overview • Example • Features • Documentation • Installation – Bundles – Downloading and installing from source – Using the development version 1.1.1 Synopsis Celery is an open source asynchronous task queue/job queue based on distributed message passing. Focused on real- time operation, but supports scheduling as well. The execution units, called tasks, are executed concurrently on one or more worker nodes using multiprocessing, Eventlet or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready). Celery is used in production systems to process millions of tasks every hour. 3 Celery Documentation, Release 2.5.5 Celery is written in Python, but the protocol can be implemented in any language. It can also operate with other languages using webhooks. There’s also RCelery for the Ruby programming language, and a PHP client. The recommended message broker is RabbitMQ, but support for Redis, MongoDB, Beanstalk, Amazon SQS, CouchDB and databases (using SQLAlchemy or the Django ORM) is also available. Celery is easy to integrate with web frameworks, some of which even have integration packages: Django django-celery Pyramid pyramid_celery Pylons celery-pylons Flask flask-celery web2py web2py-celery 1.1.2 Overview This is a high level overview of the architecture.
    [Show full text]
  • Hardware Configuration with Dynamically-Queried Formal Models
    Master’s Thesis Nr. 180 Systems Group, Department of Computer Science, ETH Zurich Hardware Configuration With Dynamically-Queried Formal Models by Daniel Schwyn Supervised by Reto Acherman Dr. David Cock Prof. Dr. Timothy Roscoe April 2017 – October 2017 Abstract Hardware is getting increasingly complex and heterogeneous. With different compo- nents having different views of the system, the traditional assumption of unique phys- ical addresses has become an illusion. To adapt to such hardware, an operating system (OS) needs to understand the complex address translation chains and be able to handle the presence of multiple address spaces. This thesis takes a recently proposed model that formally captures these aspects and applies it to hardware configuration in the Barrelfish OS. To this end, I present Sockeye, a domain specific language that uses the model to de- scribe hardware. I then show, that code relying on knowledge about the address spaces in a system can be statically generated from these specifications. Furthermore, the model is successfully applied to device management, showing that it can also be used to configure hardware at runtime. The implementation presented here does not rely on any platform specific code and it reduced the amount of such code in Barrelfish’s device manager by over 30%. Applying the model to further hardware configuration tasks is expected to have similar effects. i Acknowledgements First of all, I want to thank my advisers Timothy Roscoe, David Cock and Reto Acher- mann for their guidance and support. Their feedback and critical questions were a valuable contribution to this thesis. I’d also like to thank the rest of the Barrelfish team and other members of the Systems Group at ETH for the interesting discussions during meetings and coffee breaks.
    [Show full text]
  • An End-To-End Application Performance Monitoring Tool
    Applications Manager - an end-to-end application performance monitoring tool For enterprises to stay ahead of the technology curve in today's competitive IT environment, it is important for their business critical applications to perform smoothly. Applications Manager - ManageEngine's on-premise application performance monitoring solution offers real-time application monitoring capabilities that offer deep visibility into not only the performance of various server and infrastructure components but also other technologies used in the IT ecosystem such as databases, cloud services, virtual systems, application servers, web server/services, ERP systems, big data, middleware and messaging components, etc. Highlights of Applications Manager Wide range of supported apps Monitor 150+ application types and get deep visibility into all the components of your application environment. Ensure optimal performance of all your applications by visualizing real-time data via comprehensive dashboards and reports. Easy installation and setup Get Applications Manager running in under two minutes. Applications Manager is an agentless monitoring tool and it requires no complex installation procedures. Automatic discovery and dependency mapping Automatically discover all applications and servers in your network and easily categorize them based on their type (apps, servers, databases, VMs, etc.). Get comprehensive insight into your business infrastructure, drill down to IT application relationships, map them effortlessly and understand how applications interact with each other with the help of these dependency maps. Deep dive performance metrics Track key performance metrics of your applications like response times, requests per minute, lock details, session details, CPU and disk utilization, error states, etc. Measure the efficiency of your applications by visualizing the data at regular periodic intervals.
    [Show full text]
  • Redis - a Flexible Key/Value Datastore an Introduction
    Redis - a Flexible Key/Value Datastore An Introduction Alexandre Dulaunoy AIMS 2011 MapReduce and Network Forensic • MapReduce is an old concept in computer science ◦ The map stage to perform isolated computation on independent problems ◦ The reduce stage to combine the computation results • Network forensic computations can easily be expressed in map and reduce steps: ◦ parsing, filtering, counting, sorting, aggregating, anonymizing, shuffling... 2 of 16 Concurrent Network Forensic Processing • To allow concurrent processing, a non-blocking data store is required • To allow flexibility, a schema-free data store is required • To allow fast processing, you need to scale horizontally and to know the cost of querying the data store • To allow streaming processing, write/cost versus read/cost should be equivalent 3 of 16 Redis: a key-value/tuple store • Redis is key store written in C with an extended set of data types like lists, sets, ranked sets, hashes, queues • Redis is usually in memory with persistence achieved by regularly saving on disk • Redis API is simple (telnet-like) and supported by a multitude of programming languages • http://www.redis.io/ 4 of 16 Redis: installation • Download Redis 2.2.9 (stable version) • tar xvfz redis-2.2.9.tar.gz • cd redis-2.2.9 • make 5 of 16 Keys • Keys are free text values (up to 231 bytes) - newline not allowed • Short keys are usually better (to save memory) • Naming convention are used like keys separated by colon 6 of 16 Value and data types • binary-safe strings • lists of binary-safe strings • sets of binary-safe strings • hashes (dictionary-like) • pubsub channels 7 of 16 Running redis and talking to redis..
    [Show full text]
  • Application Performance and Flexibility on Exokernel Systems
    Application Performance and Flexibility On Exokernel Systems –M. F. Kaashoek, D.R. Engler, G.R. Ganger et. al. Presented by Henry Monti Exokernel ● What is an Exokernel? ● What are its advantages? ● What are its disadvantages? ● Performance ● Conclusions 2 The Problem ● Traditional operating systems provide both protection and resource management ● Will protect processes address space, and manages the systems virtual memory ● Provides abstractions such as processes, files, pipes, sockets,etc ● Problems with this? 3 Exokernel's Solution ● The problem is that OS designers must try and accommodate every possible use of abstractions, and choose general solutions for resource management ● This causes the performance of applications to suffer, since they are restricted to general abstractions, interfaces, and management ● Research has shown that more application control of resources yields better performance 4 (Separation vs. Protection) ● An exokernel solves these problems by separating protection from management ● Creates idea of LibOS, which is like a virtual machine with out isolation ● Moves abstractions (processes, files, virtual memory, filesystems), to the user space inside of LibOSes ● Provides a low level interface as close to the hardware as possible 5 General Design Principles ● Exposes as much as possible of the hardware, most of the time using hardware names, allowing libOSes access to things like individual data blocks ● Provides methods to securely bind to resources, revoke resources, and an abort protocol that the kernel itself
    [Show full text]
  • Message Passing for Programming Languages and Operating Systems
    Master’s Thesis Nr. 141 Systems Group, Department of Computer Science, ETH Zurich Message Passing for Programming Languages and Operating Systems by Martynas Pumputis Supervised by Prof. Dr. Timothy Roscoe, Dr. Antonios Kornilios Kourtis, Dr. David Cock October 7, 2015 Abstract Message passing as a mean of communication has been gaining popu- larity within domains of concurrent programming languages and oper- ating systems. In this thesis, we discuss how message passing languages can be ap- plied in the context of operating systems which are heavily based on this form of communication. In particular, we port the Go program- ming language to the Barrelfish OS and integrate the Go communi- cation channels with the messaging infrastructure of Barrelfish. We show that the outcome of the porting and the integration allows us to implement OS services that can take advantage of the easy-to-use concurrency model of Go. Our evaluation based on LevelDB benchmarks shows comparable per- formance to the Linux port. Meanwhile, the overhead of the messag- ing integration causes the poor performance when compared to the native messaging of Barrelfish, but exposes an easier to use interface, as shown by the example code. i Acknowledgments First of all, I would like to thank Timothy Roscoe, Antonios Kornilios Kourtis and David Cock for giving the opportunity to work on the Barrelfish OS project, their supervision, inspirational thoughts and critique. Next, I would like to thank the Barrelfish team for the discussions and the help. In addition, I would like to thank Sebastian Wicki for the conversations we had during the entire period of my Master’s studies.
    [Show full text]
  • Performance at Scale with Amazon Elasticache
    Performance at Scale with Amazon ElastiCache July 2019 Notices Customers are responsible for making their own independent assessment of the information in this document. This document: (a) is for informational purposes only, (b) represents current AWS product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. The responsibilities and liabilities of AWS to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers. © 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved. Contents Introduction .......................................................................................................................... 1 ElastiCache Overview ......................................................................................................... 2 Alternatives to ElastiCache ................................................................................................. 2 Memcached vs. Redis ......................................................................................................... 3 ElastiCache for Memcached ............................................................................................... 5 Architecture with ElastiCache for Memcached ...............................................................
    [Show full text]
  • Operating System Support for Redundant Multithreading
    Operating System Support for Redundant Multithreading Dissertation zur Erlangung des akademischen Grades Doktoringenieur (Dr.-Ing.) Vorgelegt an der Technischen Universität Dresden Fakultät Informatik Eingereicht von Dipl.-Inf. Björn Döbel geboren am 17. Dezember 1980 in Lauchhammer Betreuender Hochschullehrer: Prof. Dr. Hermann Härtig Technische Universität Dresden Gutachter: Prof. Frank Mueller, Ph.D. North Carolina State University Fachreferent: Prof. Dr. Christof Fetzer Technische Universität Dresden Statusvortrag: 29.02.2012 Eingereicht am: 21.08.2014 Verteidigt am: 25.11.2014 FÜR JAKOB *† 15. Februar 2013 Contents 1 Introduction 7 1.1 Hardware meets Soft Errors 8 1.2 An Operating System for Tolerating Soft Errors 9 1.3 Whom can you Rely on? 12 2 Why Do Transistors Fail And What Can Be Done About It? 15 2.1 Hardware Faults at the Transistor Level 15 2.2 Faults, Errors, and Failures – A Taxonomy 18 2.3 Manifestation of Hardware Faults 20 2.4 Existing Approaches to Tolerating Faults 25 2.5 Thesis Goals and Design Decisions 36 3 Redundant Multithreading as an Operating System Service 39 3.1 Architectural Overview 39 3.2 Process Replication 41 3.3 Tracking Externalization Events 42 3.4 Handling Replica System Calls 45 3.5 Managing Replica Memory 49 3.6 Managing Memory Shared with External Applications 57 3.7 Hardware-Induced Non-Determinism 63 3.8 Error Detection and Recovery 65 4 Can We Put the Concurrency Back Into Redundant Multithreading? 71 4.1 What is the Problem with Multithreaded Replication? 71 4.2 Can we make Multithreading
    [Show full text]
  • High Velocity Kernel File Systems with Bento
    High Velocity Kernel File Systems with Bento Samantha Miller, Kaiyuan Zhang, Mengqi Chen, and Ryan Jennings, University of Washington; Ang Chen, Rice University; Danyang Zhuo, Duke University; Thomas Anderson, University of Washington https://www.usenix.org/conference/fast21/presentation/miller This paper is included in the Proceedings of the 19th USENIX Conference on File and Storage Technologies. February 23–25, 2021 978-1-939133-20-5 Open access to the Proceedings of the 19th USENIX Conference on File and Storage Technologies is sponsored by USENIX. High Velocity Kernel File Systems with Bento Samantha Miller Kaiyuan Zhang Mengqi Chen Ryan Jennings Ang Chen‡ Danyang Zhuo† Thomas Anderson University of Washington †Duke University ‡Rice University Abstract kernel-level debuggers and kernel testing frameworks makes this worse. The restricted and different kernel programming High development velocity is critical for modern systems. environment also limits the number of trained developers. This is especially true for Linux file systems which are seeing Finally, upgrading a kernel module requires either rebooting increased pressure from new storage devices and new demands the machine or restarting the relevant module, either way on storage systems. However, high velocity Linux kernel rendering the machine unavailable during the upgrade. In the development is challenging due to the ease of introducing cloud setting, this forces kernel upgrades to be batched to meet bugs, the difficulty of testing and debugging, and the lack of cloud-level availability goals. support for redeployment without service disruption. Existing Slow development cycles are a particular problem for file approaches to high-velocity development of file systems for systems.
    [Show full text]