Enhancing the Programmability of Cloud Object Storage

Total Page:16

File Type:pdf, Size:1020Kb

Enhancing the Programmability of Cloud Object Storage ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech ADVERTIMENT. L'accés als continguts d'aquesta tesi doctoral i la seva utilització ha de respectar els drets de la persona autora. Pot ser utilitzada per a consulta o estudi personal, així com en activitats o materials d'investigació i docència en els termes establerts a l'art. 32 del Text Refós de la Llei de Propietat Intel·lectual (RDL 1/1996). Per altres utilitzacions es requereix l'autorització prèvia i expressa de la persona autora. En qualsevol cas, en la utilització dels seus continguts caldrà indicar de forma clara el nom i cognoms de la persona autora i el títol de la tesi doctoral. No s'autoritza la seva reproducció o altres formes d'explotació efectuades amb finalitats de lucre ni la seva comunicació pública des d'un lloc aliè al servei TDX. Tampoc s'autoritza la presentació del seu contingut en una finestra o marc aliè a TDX (framing). Aquesta reserva de drets afecta tant als continguts de la tesi com als seus resums i índexs. ADVERTENCIA. El acceso a los contenidos de esta tesis doctoral y su utilización debe respetar los derechos de la persona autora. Puede ser utilizada para consulta o estudio personal, así como en actividades o materiales de investigación y docencia en los términos establecidos en el art. 32 del Texto Refundido de la Ley de Propiedad Intelectual (RDL 1/1996). Para otros usos se requiere la autorización previa y expresa de la persona autora. En cualquier caso, en la utilización de sus contenidos se deberá indicar de forma clara el nombre y apellidos de la persona autora y el título de la tesis doctoral. No se autoriza su reproducción u otras formas de explotación efectuadas con fines lucrativos ni su comunicación pública desde un sitio ajeno al servicio TDR. Tampoco se autoriza la presentación de su contenido en una ventana o marco ajeno a TDR (framing). Esta reserva de derechos afecta tanto al contenido de la tesis como a sus resúmenes e índices. WARNING. Access to the contents of this doctoral thesis and its use must respect the rights of the author. It can be used for reference or private study, as well as research and learning activities or materials in the terms established by the 32nd article of the Spanish Consolidated Copyright Act (RDL 1/1996). Express and previous authorization of the author is required for any other uses. In any case, when using its content, full name of the author and title of the thesis must be clearly indicated. Reproduction or other forms of for profit use or public communication from outside TDX service is not allowed. Presentation of its content in a window or frame external to TDX (framing) is not authorized either. These rights affect both the content of the thesis and its abstracts and indexes. UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech Enhancing the Programmability of Cloud Object Storage JOSEP SAMPÉ DOMENECH DOCTORAL THESIS 2018 UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech Josep Sampé Domenech Enhancing the Programmability of Cloud Object Storage Doctoral Thesis Advised by Dr. Pedro García López Dr. Marc Sánchez Artigas Department of Computer Engineering and Mathematics Tarragona 2018 UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech FAIG CONSTAR que aquest treball, titulat “Enhancing the Programmability of Cloud Object Storage”, que presenta Josep Sampé Domenech per a l’obtenció del títol de Doctor, ha estat realitzat sota la meva direcció al Departament d’Enginyeria Informàtica i Matemàtiques d’aquesta universitat i que compleix els requeriments per poder optar a la Menció Internacional. HAGO CONSTAR que el presente trabajo, titulado “Enhancing the Programma- bility of Cloud Object Storage”, que presenta Josep Sampé Domenech para la obtención del título de Doctor, ha sido realizado bajo mi dirección en el Departa- mento de Ingeniería Informática y Matemáticas de esta universidad y que cumple los requisitos para poder optar a la Mención Internacional. I STATE that the present study, entitled “Enhancing the Programmability of Cloud Object Storage”, presented by Josep Sampé Domenech for the award of the degree of Doctor, has been carried out under my supervision at the Department of Computer Engineering and Mathematics of this university, and that it fulfills all the requirements to be eligible for the International Doctorate Award. Tarragona, 20 de Setembre/20 de Septiembre/September 20, 2018 Els directors de la tesi doctoral Los directores de la tesis doctoral Doctoral thesis supervisors Dr. Pedro García López Dr. Marc Sánchez Artigas UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech Acknowledgements This dissertation has been written during my time at the “Arquitectures i Serveis Telemàtics (AST)” research group at Universitat Rovira i Virgili. As such, in first place I would like to thank my advisors Dr. Pedro García López and Dr. Marc Sánchez Artigas their endless patience, help and advise through my formation as a researcher. Without their teamwork as advisors and their open-minded perspective of what is doing research, I would not have been able of completing this thesis. Also, I thank to all the people in the AST research group who shared these years with me. Specially, Raúl Gracia Tinedo for his help and guidance, and Gerard París, for the moments that we enjoyed together in the lab. Secondly, I would like to thank Ofer Biran, Gil Vernik and Dalit Naor for accepting me as an intern in their research group at IBM Haifa Research Labs. The months I spent in Israel were not only fruitful from a professional perspective, but also very enriching from a personal viewpoint. Last but not least, my deepest appreciation goes to my family and friends. I would like to specially thank my parents Amado and Teresa, my brother Xavi, his wife Montse, and the new member of my family, my nephew Aran, for their love and support. Josep Sampé Domenech Vilalba dels Arcs, September 2018 UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech This work has been partially funded by the European Union Horizon 2020 Framework Programme, in the context of the project IOStack: Software-defined Storage for Big Data (H2020-644182), and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R). UNIVERSITAT ROVIRA I VIRGILI ENHANCING THE PROGRAMMABILITY OF CLOUD OBJECT STORAGE Josep Sampé Domenech Abstract In a world that is increasingly dependent on technology, digital data is generated in an unprecedented way. This makes companies that require large storage space, such as Netflix or Dropbox, use cloud storage solutions where data is remotely maintained, managed, and backed up, in an easy and cheap way. Particularly, cloud object stores are widely adopted and increasingly used for storing these huge amounts of data. This is mainly thanks to their built-in characteristics, such as simplicity, scalability and high-availability. Moreover, the evolution of cloud computing, in what refers, for example, to data analysis, make cloud object stores an important actor in today’s cloud ecosystem. However, cloud object stores face three main challenges: 1) Flexible management of multi-tenant workloads. Commonly, cloud object stores are multi-tenant sys- tems, meaning that all tenants share the same system resources, which could lead to interference problems. Furthermore, it is now complex to manage heteroge- neous storage policies in a massive scale. 2) Data self-management. Cloud object stores themselves do not offer much flexibility regarding data self-management by tenants. Typically, they are rigid, non-programmable systems, which prevent tenants to handle the specific requirements of their objects. 3) Elastic compu- tation close to the data. Placing computations close to the data in the storage system can be useful to reduce data transfers. But, the challenge here is how to achieve elasticity in those computations without provoking resource contention and interferences in the storage layer. In this thesis, we present three novel research contributions that solve the afore- mentioned challenges. Firstly, we introduce the first Software-defined Storage (SDS) architecture for cloud object stores that separates the control plane from the data plane, allowing to manage multi-tenant workloads in a flexible and dynamic way. For example, by applying different service levels of bandwidth to different tenants. Secondly, we designed a novel policy abstraction called microcontroller that transforms common objects into smart objects, enabling tenants to programmatically manage their behavior. For example, a content-level access control microcontroller attached to an specific object to filter its content depending on who is accessing it. Finally, we present the first elastic data-driven serverless computing platform that
Recommended publications
  • Respecting the Block Interface – Computational Storage Using Virtual Objects
    Respecting the block interface – computational storage using virtual objects Ian F. Adams, John Keys, Michael P. Mesnier Intel Labs Abstract immediately obvious, and trade-offs need to be made. Unfor- tunately, this often means that the parallel bandwidth of all Computational storage has remained an elusive goal. Though drives within a storage server will not be available. The same minimizing data movement by placing computation close to problem exists within a single SSD, as the internal NAND storage has quantifiable benefits, many of the previous at- flash bandwidth often exceeds that of the storage controller. tempts failed to take root in industry. They either require a Enter computational storage, an oft-attempted approach to departure from the widespread block protocol to one that address the data movement problem. By keeping computa- is more computationally-friendly (e.g., file, object, or key- tion physically close to its data, we can avoid costly I/O. Var- value), or they introduce significant complexity (state) on top ious designs have been pursued, but the benefits have never of the block protocol. been enough to justify a new storage protocol. In looking We participated in many of these attempts and have since back at the decades of computational storage research, there concluded that neither a departure from nor a significant ad- is a common requirement that the block protocol be replaced dition to the block protocol is needed. Here we introduce a (or extended) with files, objects, or key-value pairs – all of block-compatible design based on virtual objects. Like a real which provide a convenient handle for performing computa- object (e.g., a file), a virtual object contains the metadata that tion.
    [Show full text]
  • Distributing an SQL Query Over a Cluster of Containers
    2019 IEEE 12th International Conference on Cloud Computing (CLOUD) Distributing an SQL Query Over a Cluster of Containers David Holland∗ and Weining Zhang† Department of Computer Science, University of Texas at San Antonio Email: ∗[email protected], †[email protected] Abstract—Emergent software container technology is now across a cluster of containers in a cloud. A feasibility study available on any cloud and opens up new opportunities to execute of this with performance analysis is reported in this paper. and scale data intensive applications wherever data is located. A containerized query (henceforth CQ) uses a deployment However, many traditional relational databases hosted on clouds have not scaled well. In this paper, a framework and deployment methodology that is unique to each query with respect to the methodology to containerize relational SQL queries is presented, number of containers and their networked topology effecting so that, a single SQL query can be scaled and executed by data flows. Furthermore a CQ can be scaled at run-time a network of cooperating containers, achieving intra-operator by adding more intra-operators. In contrast, the traditional parallelism and other significant performance gains. Results of distributed database query deployment configurations do not container prototype experiments are reported and compared to a real-world RDBMS baseline. Preliminary result on a research change at run-time, i.e., they are static and applied to all cloud shows up to 3-orders of magnitude performance gain for queries. Additionally, traditional distributed databases often some queries when compared to running the same query on a need to rewrite an SQL query to optimize performance.
    [Show full text]
  • Hypervisors Vs. Lightweight Virtualization: a Performance Comparison
    2015 IEEE International Conference on Cloud Engineering Hypervisors vs. Lightweight Virtualization: a Performance Comparison Roberto Morabito, Jimmy Kjällman, and Miika Komu Ericsson Research, NomadicLab Jorvas, Finland [email protected], [email protected], [email protected] Abstract — Virtualization of operating systems provides a container and alternative solutions. The idea is to quantify the common way to run different services in the cloud. Recently, the level of overhead introduced by these platforms and the lightweight virtualization technologies claim to offer superior existing gap compared to a non-virtualized environment. performance. In this paper, we present a detailed performance The remainder of this paper is structured as follows: in comparison of traditional hypervisor based virtualization and Section II, literature review and a brief description of all the new lightweight solutions. In our measurements, we use several technologies and platforms evaluated is provided. The benchmarks tools in order to understand the strengths, methodology used to realize our performance comparison is weaknesses, and anomalies introduced by these different platforms in terms of processing, storage, memory and network. introduced in Section III. The benchmark results are presented Our results show that containers achieve generally better in Section IV. Finally, some concluding remarks and future performance when compared with traditional virtual machines work are provided in Section V. and other recent solutions. Albeit containers offer clearly more dense deployment of virtual machines, the performance II. BACKGROUND AND RELATED WORK difference with other technologies is in many cases relatively small. In this section, we provide an overview of the different technologies included in the performance comparison.
    [Show full text]
  • Erlang on Physical Machine
    on $ whoami Name: Zvi Avraham E-mail: [email protected] /ˈkɒm. pɑː(ɹ)t. mɛntl̩. aɪˌzeɪ. ʃən/ Physicalization • The opposite of Virtualization • dedicated machines • no virtualization overhead • no noisy neighbors – nobody stealing your CPU cycles, IOPS or bandwidth – your EC2 instance may have a Netflix “roommate” ;) • Mostly used by ARM-based public clouds • also called Bare Metal or HPC clouds Sandbox – a virtual container in which untrusted code can be safely run Sandbox examples: ZeroVM & AWS Lambda based on Google Native Client: A Sandbox for Portable, Untrusted x86 Native Code Compartmentalization in terms of Virtualization Physicalization No Virtualization Virtualization HW-level Virtualization Containerization OS-level Virtualization Sandboxing Userspace-level Virtualization* Cloud runs on virtual HW HARDWARE Does the OS on your Cloud instance still supports floppy drive? $ ls /dev on Ubuntu 14.04 AWS EC2 instance • 64 teletype devices? • Sound? • 32 serial ports? • VGA? “It’s DUPLICATED on so many LAYERS” Application + Configuration process* OS Middleware (Spring/OTP) Container Managed Runtime (JVM/BEAM) VM Guest Container OS Container Guest OS Hypervisor Hardware We run Single App per VM APPS We run in Single User mode USERS Minimalistic Linux OSes • Embedded Linux versions • DamnSmall Linux • Linux with BusyBox Min. Linux OSes for Containers JeOS – “Just Enough OS” • CoreOS • RancherOS • RedHat Project Atomic • VMware Photon • Intel Clear Linux • Hyper # of Processes and Threads per OS OSv + CLI RancherOS processes CoreOS threads
    [Show full text]
  • IDC Marketscape IDC Marketscape: Worldwide Object-Based Storage 2019 Vendor Assessment
    IDC MarketScape IDC MarketScape: Worldwide Object-Based Storage 2019 Vendor Assessment Amita Potnis THIS IDC MARKETSCAPE EXCERPT FEATURES SCALITY IDC MARKETSCAPE FIGURE FIGURE 1 IDC MarketScape Worldwide Object-Based Storage Vendor Assessment Source: IDC, 2019 December 2019, IDC #US45354219e Please see the Appendix for detailed methodology, market definition, and scoring criteria. IN THIS EXCERPT The content for this excerpt was taken directly from IDC MarketScape: Worldwide Object-Based Storage 2019 Vendor Assessment (Doc # US45354219). All or parts of the following sections are included in this excerpt: IDC Opinion, IDC MarketScape Vendor Inclusion Criteria, Essential Guidance, Vendor Summary Profile, Appendix and Learn More. Also included is Figure 1. IDC OPINION The storage market has come a long way in terms of understanding object-based storage (OBS) technology and actively adopting it. It is a common practice for OBS to be adopted for secondary and cold storage needs at scale. Over the recent years, OBS has proven its ability to scale to tens and hundreds of petabytes and is now maturing to support newer workloads such as unstructured data analytics, IoT, AI/ML/DL, and so forth. As the price of flash declines and the data sets continue to grow, the need for analyzing the data is on the rise. Moving data sets from an object store to a high- performance tier for analysis is a thing of the past. Many vendors are enhancing their object offerings to include a flash tier or are bringing all-flash array object storage offerings to the market today. In this IDC MarketScape, IDC assesses the present commercial OBS supplier (suppliers that deliver software-defined OBS solutions as software or appliances much like other storage platforms) landscape.
    [Show full text]
  • An Object Storage Solution for Data Archive Using Supermicro SSG-5018D8-AR12L and Openio SDS
    CONTENTS WHITE PAPER 2 NEW CHALLENGES AN OBJECT STORAGE 2 DATA ARCHIVE WITHOUT COMPROMISE SOLUTION FOR DATA 3 OBJECT STORAGE FOR DATA ARCHIVE 4 SSG-5018D8-AR12L AND OPENIO ARCHIVE USING Power Efficiency at its Core Extreme Storage Density SUPERMICRO SSG-5018D8- Robust Networking 5 OPENIO, AN OPEN SOURCE OBJECT AR12L AND OPENIO SDS STORAGE SOLUTION 6 SUPERMICRO SSG-5018D8-AR12L AND OPENIO COMBINED SOLUTION 7 TEST RESULTS 11 CONCLUSIONS White Paper October 2016 Rodolfo Campos Sr. Product Marketing Specialist Super Micro Computer, Inc. 980 Rock Avenue San Jose, CA 95131 USA www.supermicro.com Intel Inside®. Powerful Data Center Outside. WHITE PAPER An Object Storage Solution For Data Archive using Supermicro SSG-5018D8-AR12L and OpenIO SDS NEW CHALLENGES Every minute - 2.5 million messages on Facebook are sent, nearly 430K tweets are posted, 67K photos on Instagram and over 5 million YouTube videos are uploaded. ~50% These are a few examples of how the Big Data market is growing 9 times faster than Digital Storage the traditional IT market. According to IDC reports, in 2012 the World created 4.4 Annual Growth zettabytes of digital data and is estimated to create 44 zettabytes of data by 2020. Cold Storage These large volumes of digital data are being created, shared, and stored on the cloud. As a result, data storage demands are reaching new limits and are in need Capacity of new requirements. Thus, data storage, data movement, and data analytics applications need a new storage platform to keep up with the greater capacity and scaling demands it brings.
    [Show full text]
  • Presto: the Definitive Guide
    Presto The Definitive Guide SQL at Any Scale, on Any Storage, in Any Environment Compliments of Matt Fuller, Manfred Moser & Martin Traverso Virtual Book Tour Starburst presents Presto: The Definitive Guide Register Now! Starburst is hosting a virtual book tour series where attendees will: Meet the authors: • Meet the authors from the comfort of your own home Matt Fuller • Meet the Presto creators and participate in an Ask Me Anything (AMA) session with the book Manfred Moser authors + Presto creators • Meet special guest speakers from Martin your favorite podcasts who will Traverso moderate the AMA Register here to save your spot. Praise for Presto: The Definitive Guide This book provides a great introduction to Presto and teaches you everything you need to know to start your successful usage of Presto. —Dain Sundstrom and David Phillips, Creators of the Presto Projects and Founders of the Presto Software Foundation Presto plays a key role in enabling analysis at Pinterest. This book covers the Presto essentials, from use cases through how to run Presto at massive scale. —Ashish Kumar Singh, Tech Lead, Bigdata Query Processing Platform, Pinterest Presto has set the bar in both community-building and technical excellence for lightning- fast analytical processing on stored data in modern cloud architectures. This book is a must-read for companies looking to modernize their analytics stack. —Jay Kreps, Cocreator of Apache Kafka, Cofounder and CEO of Confluent Presto has saved us all—both in academia and industry—countless hours of work, allowing us all to avoid having to write code to manage distributed query processing.
    [Show full text]
  • Unlock Modern Backup and Recovery with Object Storage
    Unlock Modern Backup and Recovery With Object Storage E-BOOK It’s a New Era for Data Protection In this age of digital transformation, data is growing at an astounding rate, posing challenges for data protection, long-term data retention and adherence Annual Size of the Global Datasphere 175 ZB to business and regulatory compliance. In just the 180 last two years, 90% of the world’s data has been 160 created by computers, mobile phones, email, social 140 media, IoT smart devices, connected cars and other 120 100 devices constantly generating streams of data1. 80 Zettabytes 60 IDC predicts that the data we generate, collect, and consume will 40 continue to increase significantly—from about 50 zettabytes in 2020 to 175 zettabytes in 2025. With massive collections of data 20 0 amounting, trying to keep pace with the performance required to 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 protect and recover it quickly, while scaling cost effectively, has Years become a major pain point for the enterprise. 1 Jeremy Waite, IBM ‘10 Key Marketing Trends for 2017’ Global Data Growth Prediction 50 zettabytes in 2020 175 zettabytes in 2025 1 zettabyte (ZB) = 175 ZB of data stored on DVDs would circle the 1 trillion gigabytes earth 222 times. 1 Redefine Data Protection Requirements How will your current backup, recovery and long-term retention strategy scale as data reaches uncharted volumes? Consider that for every terabyte of primary data stored, it is not uncommon to store three or more terabytes of secondary data copies for protection, high availability and compliance purposes.
    [Show full text]
  • Retrospect Backup for Windows Data Protection for Businesses
    DATA SHEET | RETROSPECT FOR WINDOWS Retrospect Backup for Windows Data Protection for Businesses NEW Retrospect Backup 17: Automatic Onboarding, Nexsan E-Series/Unity Certification, 10x Faster ProactiveAI, Restore Preflight Retrospect Backup protects over 100 Petabytes in over 500,000 homes and businesses in over 100 countries. With broad platform and application support, Retrospect protects every part of your computer environment, on-site and in the cloud. Start your first backup with one click. Complete Data Protection With cross platform support for Windows, Mac, and Linux, Retrospect offers business backup with system recovery, local backup, long-term retention, along with centralized management, end-to-end security, email protection, and extensive customization–all at an affordable price for a small business. Cloud Backup Theft and disaster have always been important reasons for offsite backups, but now, ransomware is the most powerful. Ransomware will encrypt the file share just like any other file. Retrospect integrates with over a dozen cloud storage providers for offsite backups, connects securely to prevent access from malware and ransomware, and lets you transfer local backups to it in a couple clicks. Full System Recovery Systems need the same level of protection as individual files. It takes days to recreate an operating system from scratch, with specific operating system versions, system state, application installations and settings, and user preferences. Retrospect does full system backup and 100 recovery for your entire environment. Petabytes File and System Migration Retrospect offers built-in migration for files, folders, or entire bootable systems, including extended attributes and ACLs, with extensive options for which files to replace if source and 500,000 destination overlap.
    [Show full text]
  • Openio SDS on ARM a Practical and Cost-Effective Object Storage Infrastructure Based on Soyoustart Dedicated ARM Servers
    OpenIO SDS on ARM A practical and cost-effective object storage infrastructure based on SoYouStart dedicated ARM servers. Copyright © 2017 OpenIO SAS All Rights Reserved. Restriction on Disclosure and Use of Data 30 September 2017 "1 OpenIO OpenIO SDS on ARM 30/09/2017 Table of Contents Introduction 3 Benchmark Description 4 1. Architecture 4 2. Methodology 5 3. Benchmark Tool 5 Results 7 1. 128KB objects 7 Disk and CPU metrics (on 48 nodes) 8 2. 1MB objects 9 Disk and CPU metrics (on 48 nodes) 10 5. 10MB objects 11 Disk and CPU metrics (on 48 nodes) 12 Cluster Scalability 13 Total disks IOps 13 Conclusion 15 2 OpenIO OpenIO SDS on ARM 30/09/2017 Introduction In this white paper, OpenIO will demonstrate how to use its SDS Object Storage platform with dedicated SoYouStart ARM servers to build a flexible private cloud. This S3-compatible storage infrastructure is ideal for a wide range of uses, offering full control over data, but without the complexity found in other solutions. OpenIO SDS is a next-generation object storage solution with a modern, lightweight design that associates flexibility, efficiency, and ease of use. It is open source software, and it can be installed on ARM and x86 servers, making it possible to build a hyper scalable storage and compute platform without the risk of lock-in. It offers excellent TCO for the highest and fastest ROI. Object storage is generally associated with large capacities, and its benefts are usually only visible in large installations. But thanks to characteristics of OpenIO SDS, this next-generation object storage solution can be cost effective even with the smallest of installations.
    [Show full text]
  • Next-Gen Object Storage and Serverless Computing Agenda
    Next-Gen Object Storage and Serverless Computing Agenda 1 About OpenIO 2 SDS: Next-gen Object storage 3 Grid for Apps: Serverless Computing OpenIO About OpenIO OpenIO An experienced team, a robust and mature technology 2017 2016 2006 2007 2009 2012 2014 2015 Global launch OpenIO SDS Idea & 1st Design Production ready! Open 15 PBs OpenIO First release concept dev starts sourced customer company 1st massive prod success over 1PB Lille (FR) | San Francisco | Tokyo OpenIO Quickly Growing Across Geographies And Vertical Markets 35 3 25+ 2 Employees Continents Customers Solutions Mostly engineers, Hem (Lille, France), Paris, Installations ranging OpenIO SDS, next-gen support and pre-sales Tokyo, San Francisco from 3 nodes up to 60 object storage Growing fast Petabytes and billions of objects Teams across EMEA, Grid for Apps, serverless Japan and, soon, US computing framework OpenIO Customers Large Telco Provider Email storage Small objects, high scalability 65 Mln 15 PB mail boxes of storage 650 10 Bln nodes objects 20K services online OpenIO DailyMotion Media & Entertainment High throughput and capacity, 60 PB 80 Mln fat x86 nodes of OpenioIO SDS videos 30% 3 Bln growing/year views per month OpenIO Japanese ISP High Speed Object Storage High number of transactions 6000 10+10 on SSDs Emails per second All-flash nodes 2-sites Indexing Async replication With Grid for Apps OpenIO Teezily E-commerce Website On-premises S3, migrated from Amazon AWS Private Cloud Storage on Public Infrastructure, 350TB 10Bln Cost effective Very small files Objects
    [Show full text]
  • Containerized SQL Query Evaluation in a Cloud
    Containerized SQL Query Evaluation in a Cloud Dr. Weining Zhang and David Holland Department of Computer Science The University of Texas at San Antonio {Weining.Zhang, david.holland}@utsa.edu Abstract—Recent advance in cloud computing and light- management system (DBMS) inside a virtual ma- weight software container technology opens up opportu- chine (VM). The user must administer most sys- nities to execute data intensive applications where data tem management activities, including software li- is located. Noticeably, current database services offered censing, installation, configuration, management, on cloud platforms have not fully utilized container tech- nologies. In this paper we present an architecture of a backup, and recovery. This option is difficult to cloud-based, relational database as a service (DBaaS) that scale. Two) Use a NoSQL database, e.g. Apache can package and deploy a query evaluation plan using HBase[1], Google BigQuery[2], and Cassandra[3]. light-weight container technology and the underlying cloud These services are highly efficient and scalable for storage system. We then focus on an example of how a some types of large data, but the user is forced to select-join-project-order query can be containerized and deal with the lack of structured data and strong deployed in ZeroCloud. Our preliminary experimental results confirm that a containerized query can achieve data integrity that is present in relational models. a high degree of elasticity and scalability, but effective Three) Use a managed SQL database service, e.g. mechanisms are needed to deal with data skew. Amazon RDS[4], Google Cloud SQL[5], and MS Index Terms—Database, DBaaS, query evaluation, soft- Azure SQL[6].
    [Show full text]