SONIC: Application-Aware Data Passing for Chained Serverless Applications

Total Page:16

File Type:pdf, Size:1020Kb

SONIC: Application-Aware Data Passing for Chained Serverless Applications SONIC: Application-aware Data Passing for Chained Serverless Applications Ashraf Mahgoub(a), Karthick Shankar(b), Subrata Mitra(c), Ana Klimovic(d), Somali Chaterji(a), Saurabh Bagchi(a) (a) Purdue University, (b) Carnegie Mellon University, (c) Adobe Research, (d) ETH Zurich Abstract and video analytics [6,27]. Accordingly, major cloud comput- ing providers recently introduced their serverless workflow Data analytics applications are increasingly leveraging services such as AWS Step Functions [5], Azure Durable serverless execution environments for their ease-of-use and Functions [48], and Google Cloud Composer [29], which pay-as-you-go billing. Increasingly, such applications are provide easier design and orchestration for serverless work- composed of multiple functions arranged in some workflow. flow applications. Serverless workflows are composed of a However, the current approach of exchanging intermediate sequence of execution stages, which can be represented as (ephemeral) data between functions through remote storage a directed acyclic graph (DAG) [21, 56]. DAG nodes corre- (such as S3) introduces significant performance overhead. spond to serverless functions (or ls1) and edges represent the We show that there are three alternative data-passing meth- flow of data between dependent ls (e.g., our video analytics ods, which we call VM-Storage, Direct-Passing, and state-of- application DAG is shown in Fig. 1). practice Remote-Storage. Crucially, we show that no single Exchanging intermediate data between serverless functions data-passing method prevails under all scenarios and the op- has been identified as a major challenge in serverless work- timal choice depends on dynamic factors such as the size flows [39, 41, 50]. The reason is that, by design, IP addresses of input data, the size of intermediate data, the application’s and port numbers of individual ls are not exposed to users, degree of parallelism, and network bandwidth. We propose making direct point-to-point communication difficult. More- SONIC, a data-passing manager that optimizes application per- over, serverless platforms provide no guarantees for the over- formance and cost, by transparently selecting the optimal data- lap in time between the executions of the parent (sending) and passing method for each edge of a serverless workflow DAG child (receiving) functions. Hence, the state-of-practice tech- and implementing communication-aware function placement. nique for data passing between serverless functions is saving SONIC monitors application parameters and uses simple re- and loading data through remote storage (e.g., S3). Although gression models to adapt its hybrid data passing accordingly. passing intermediate data through remote storage has the ben- We integrate SONIC with OpenLambda and evaluate the sys- efit that it makes for a clear separation between compute and tem on Amazon EC2 with three analytics applications, popu- storage resources, it adds significant performance overheads, lar in the serverless environment. SONIC provides lower la- especially for data-intensive applications [50]. For example, tency (raw performance) and higher performance/$ across di- Pu et al. [56] show that running the CloudSort benchmark verse conditions, compared to four different baselines: SAND with 100TB of data on AWS Lambda with S3 remote stor- [UsenixATC-18], Vanilla OpenLambda [HotCloud-16], Open- age, can be up to 500× slower than running on a cluster of Lambda integrated with Pocket [OSDI-18], and AWS Lambda VMs. Our own experiment with a machine learning pipeline (state of practice). that has a simple linear workflow shows that passing data 1 Introduction through remote storage takes over 75% of the execution time Serverless computing platforms provide on-demand scala- (Fig. 2, fanout = 1). Previous approaches target reducing this bility and fine-grained allocation of resources. In this comput- overhead, either by replacing disk-based storage (e.g., S3) by ing model, the cloud provider runs the servers and manages memory-based storage (e.g., ElastiCache Redis), or by com- all the administrative tasks (e.g., scaling, capacity planning, bining different storage media (e.g., DRAM, SSD, NVMe) etc.), while users focus on their application logic. Due to to match application needs [15, 42, 56]. However, these ap- its significant advantages, serverless computing is becoming proaches still require passing the data over the network mul- increasingly popular for complex workflows such as data pro- cessing pipelines [40,56], machine learning pipelines [14,59], 1For simplicity, we denote a serverless function as lambda or l. 1 Start DAG parameters Long-Term Storage (S3) Video_input_size = 15 MB Split_Video (λퟏퟏ) Memory = 40 MB, Exec-time = 2 Sec Video_chunk_size = 1 MB Extract Frame (λퟐퟏ) ….... Extract Frame (λퟐ푵) Memory = 55 MB, Exec-time = 0.3 Sec Frame_size = 0.1 MB Classify Frame (λퟑퟏ) ….... Classify Frame (λퟑ푵) Memory = 500 MB, Exec-time = 0.7 Sec Figure 2: Execution time comparison with Remote-Storage, VM-Storage, Long-Term Storage (S3) Fan-out degree (N) = 19 and Direct-Passing for the LightGBM application with Fanout = 1, 3, 12. The best data-passing method differs in every case. End Figure 1: DAG overview (DAG definition provided by the user and our bandwidth. SONIC adapts its decision dynamically to changes profiled DAG parameters) for Video Analytics application. in these parameters in an application-specific manner. The tiple times, adding significant latency to the job’s execution workflow of SONIC is shown in Fig.3. time. Moreover, in-memory storage services are much more We note that a locally optimized data-passing decision for expensive than disk-based storage (e.g., ElastiCache Redis a given stage can become sub-optimal for the entire DAG. As costs 700x compared to S3 per unit data volume [56]). an example, placing two functions together in the same VM In our work, we show that the cloud provider can imple- and using VM-Storage will reduce the data passing time be- ment communication-aware placement of lambdas by opti- tween these two stages but if the second function generates a mizing data exchange between chained functions in DAGs. large volume of intermediate data then the cost of transporting For instance, the cloud provider can leverage data locality by that data to a different VM may be high. Therefore, SONIC de- scheduling the sending and receiving functions on the same velops a Viterbi-based algorithm to avoid greedy data-passing VM, while preserving local disk state between the two invo- decisions and provide optimized latency and cost across the cations. This data-passing mechanism, which we refer to as entire DAG. SONIC abstracts its hybrid data-passing selec- VM-Storage, minimizes data exchange latency but imposes tion and provides the user with a simple file-based API. With constraints on where the lambdas can be scheduled. Alter- SONIC ’s API, users always read and write data as files to nately, the cloud provider can enable data exchange between storage that appear local to the user. SONIC is designed to be functions on different VMs by directly copying intermediate integrated with a cluster resource manager (e.g., Protean [34]) data between the VMs that host the sending and receiving that assigns VM requests to the physical hardware and op- lambdas. This data-passing mechanism, which we refer to timizes provider-centric metrics such as resource utilization as Direct-Passing, requires copying intermediate data ele- and load balancing (Fig4). ments once, serving as a middleground between VM-Storage As our approach requires integration with a serverless plat- (which requires no data copying) and Remote Storage (which form (and commercial platforms do not allow for such im- requires two copies of the data). Each data-passing mecha- plementation), we integrate SONIC with OpenLambda [36]. nism provides a trade-off between latency, cost, scalability, We compare SONIC ’s performance using three popular ana- and scheduling flexibility. Crucially, no single mechanism lytics applications to several baselines: AWS-Lambda, with prevails across all serverless applications with different data S3 and ElastiCache-Redis (which can be taken to represent dependencies. For example, while Direct-Passing does not state-of-the-practice); SAND [2]; and OpenLambda with S3 impose strict scheduling constraints, scalability can become and Pocket [42]. Our evaluation shows that SONIC outper- an issue when a large number of receiving functions copy data forms all baselines. SONIC achieves between 34% to 158% simultaneously saturating the VM’s outgoing network band- higher performance/$ (here performance is the inverse of width. Hence, we need a hybrid, fine-grained data-passing latency) over OpenLambda+S3, between 59% to 2.3X over approach that optimizes the data-passing for every edge of OpenLambda+Pocket, and between 1.9X to 5.6X over SAND, the application DAG. a serverless platform that leverages data locality to minimize Our solution: We propose SONIC, a unified API and the execution time. system for inter-lambda data exchange, which adopts a hybrid In summary, our contributions are as follows: approach of the three data-passing methods (VM-Storage, (1) We analyze the trade-offs of three different intermediate Direct-Passing and Remote-Storage) to optimize application data-passing methods (Fig.5) in serverless workflows and performance and cost. SONIC decides the best data-passing show that no single method prevails under all conditions in method for every edge in the application DAG to minimize both latency and cost. This motivates the need for our hybrid data-passing latency and cost. We show that this selection and dynamic approach. can be impacted by several parameters such as the size of (2) We propose SONIC, which provides selection of optimized input, the application’s degree of parallelism, and VM network data-passing methods between any two serverless functions 2 Figure 3: Workflow of SONIC: Users provide a DAG definition and input data files for profiling. SONIC’s online profiler executes the DAG and generates regression models that map the input size to the DAG’s parameters.
Recommended publications
  • Recursive Definitions Across the Footprint That a Straight- Line Program Manipulated, E.G
    natural proofs for verification of dynamic heaps P. MADHUSUDAN UNIV. OF ILLINOIS AT URBANA-CHAMPAIGN (VISITING MSR, BANGALORE) WORKSHOP ON MAKING FORMAL VERIFICATION SCALABLE AND USEABLE CMI, CHENNAI, INDIA JOINT WORK WITH… Xiaokang Qiu .. graduating! Gennaro Andrei Pranav Garg Parlato Stefanescu GOAL: BUILD RELIABLE SOFTWARE Build systems with proven reliability and security guarantees. (Not the same as finding bugs!) Deductive verification with automatic theorem proving • Floyd/Hoare style verification • User supplied modular annotations (pre/post cond, class invariants) and loop invariants • Resulting verification conditions are derived automatically for each linear program segment (using weakest-pre/strongest-post) Verification conditions are then proved valid using mostly automatic theorem proving (like SMT solvers) TARGET: RELIABLE SYS TEMS SOFTWARE Operating Systems Angry birds Browsers Excel VMs Mail clients … App platforms Systems Software Applications Several success stories: • Microsoft Hypervisor verification using VCC [MSR, EMIC] • A secure foundation for mobile apps [@Illinois, ASPLOS’13] A SECURE FOUNDATION FOR MOBILE APPS In collaboration with King’s systems group at Illinois First OS architecture that provides verifiable, high-level abstractions for building mobile applications. Application state is private by default. • Meta-data on files controls access by apps; only app with appropriate permissions can access files • Every page is encrypted before sending to the storage service. • Memory isolation between apps • A page fault handler serves a file- backed page for a process, the fille has to be opened by the same process. • Only the current app can write to the screen buffer. • IPC channel correctness • … VERIFICATION TOOLS ARE MATURE And works! ENOUGH TO BUILD RELIABLE S/W.
    [Show full text]
  • Type 2 Diabetes Prediction Using Machine Learning Algorithms
    Type 2 Diabetes Prediction Using Machine Learning Algorithms Parisa Karimi Darabi*, Mohammad Jafar Tarokh Department of Industrial Engineering, K. N. Toosi University of Technology, Tehran, Iran Abstract Article Type: Background and Objective: Currently, diabetes is one of the leading causes of Original article death in the world. According to several factors diagnosis of this disease is Article History: complex and prone to human error. This study aimed to analyze the risk of Received: 15 Jul 2020 having diabetes based on laboratory information, life style and, family history Revised: 1 Aug 2020 with the help of machine learning algorithms. When the model is trained Accepted: 5 Aug 2020 properly, people can examine their risk of having diabetes. *Correspondence: Material and Methods: To classify patients, by using Python, eight different machine learning algorithms (Logistic Regression, Nearest Neighbor, Decision Parisa Karimi Darabi, Tree, Random Forest, Support Vector Machine, Naive Bayesian, Neural Department of Industrial Network and Gradient Boosting) were analyzed. Were evaluated by accuracy, Engineering, K. N. Toosi University sensitivity, specificity and ROC curve parameters. of Technology, Tehran, Iran Results: The model based on the gradient boosting algorithm showed the best [email protected] performance with a prediction accuracy of %95.50. Conclusion: In the future, this model can be used for diagnosis diabete. The basis of this study is to do more research and develop models such as other learning machine algorithms. Keywords: Prediction, diabetes, machine learning, gradient boosting, ROC curve DOI : 10.29252/jorjanibiomedj.8.3.4 people die every year from diabetes disease. Introduction By 2045, it will reach 629 million (1).
    [Show full text]
  • A Tensor Compiler for Unified Machine Learning Prediction Serving
    A Tensor Compiler for Unified Machine Learning Prediction Serving Supun Nakandala, UC San Diego; Karla Saur, Microsoft; Gyeong-In Yu, Seoul National University; Konstantinos Karanasos, Carlo Curino, Markus Weimer, and Matteo Interlandi, Microsoft https://www.usenix.org/conference/osdi20/presentation/nakandala This paper is included in the Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation November 4–6, 2020 978-1-939133-19-9 Open access to the Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation is sponsored by USENIX A Tensor Compiler for Unified Machine Learning Prediction Serving Supun Nakandalac,∗, Karla Saurm, Gyeong-In Yus,∗, Konstantinos Karanasosm, Carlo Curinom, Markus Weimerm, Matteo Interlandim mMicrosoft, cUC San Diego, sSeoul National University {<name>.<surname>}@microsoft.com,[email protected], [email protected] Abstract TensorFlow [13] combined, and growing faster than both. Machine Learning (ML) adoption in the enterprise requires Acknowledging this trend, traditional ML capabilities have simpler and more efficient software infrastructure—the be- been recently added to DNN frameworks, such as the ONNX- spoke solutions typical in large web companies are simply ML [4] flavor in ONNX [25] and TensorFlow’s TFX [39]. untenable. Model scoring, the process of obtaining predic- When it comes to owning and operating ML solutions, en- tions from a trained model over new data, is a primary con- terprises differ from early adopters in their focus on long-term tributor to infrastructure complexity and cost as models are costs of ownership and amortized return on investments [68]. trained once but used many times. In this paper we propose As such, enterprises are highly sensitive to: (1) complexity, HUMMINGBIRD, a novel approach to model scoring, which (2) performance, and (3) overall operational efficiency of their compiles featurization operators and traditional ML models software infrastructure [14].
    [Show full text]
  • Sonic on Mellanox Spectrum™ Switches
    SOFTWARE PRODUCT BRIEF SONiC on Mellanox Spectrum™ Switches Delivering the promises of data center disaggregation – SONiC – Software for Open Networking in the Cloud, is a Microsoft-driven open-source network operating system (NOS), HIGHLIGHTS delivering a collection of networking software components aimed at – scenarios where simplicity and scale are the highest priority. • 100% free open source software • Easy portability Mellanox Spectrum family of switches combined with SONiC deliver a performant Open Ethernet data center cloud solution with • Uniform Linux interfaces monitoring and diagnostic capabilities. • Fully supported by the Github community • Industry’s highest performance switch THE VALUE OF SONiC systems portfolio, including the high- Built on top of the popular Switch Abstraction Interface (SAI) specification for common APIs, SONiC density half-width Mellanox SN2010 is a fully open-sourced, hardware-agnostic network operating system (NOS), perfect for preventing • Support for home-grown applications vendor lock-in and serving as the ideal NOS for next-generation data centers. SONiC is built over such as telemetry streaming, load industry-leading open source projects and technologies such as Docker for containers, Redis for balancing, unique tunneling, etc. key-value database, or Quagga BGP and LLDP routing protocols. It’s modular containerized design makes it very flexible, enabling customers to tailor SONiC to their needs. For more information on • Support for small to large-scale deployments open-source SONiC development, visit https://github.com/Azure/SONiC. MELLANOX OPEN ETHERNET SOLUTIONS AND SONiC Mellanox pioneered the Open Ethernet approach to network disaggregation. Open Ethernet offers the freedom to use any software on top of any switches hardware, thereby optimizing utilization, efficiency and overall return on investment (ROI).
    [Show full text]
  • Scalable Fault-Tolerant Elastic Data Ingestion in Asterixdb
    UC Irvine UC Irvine Electronic Theses and Dissertations Title Scalable Fault-Tolerant Elastic Data Ingestion in AsterixDB Permalink https://escholarship.org/uc/item/9xv3435x Author Grover, Raman Publication Date 2015 License https://creativecommons.org/licenses/by-nd/4.0/ 4.0 Peer reviewed|Thesis/dissertation eScholarship.org Powered by the California Digital Library University of California UNIVERSITY OF CALIFORNIA, IRVINE Scalable Fault-Tolerant Elastic Data Ingestion in AsterixDB DISSERTATION submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in Information and Computer Science by Raman Grover Dissertation Committee: Professor Michael J. Carey, Chair Professor Chen Li Professor Sharad Mehrotra 2015 © 2015 Raman Grover DEDICATION To my wonderful parents... ii TABLE OF CONTENTS Page LIST OF FIGURES vi LIST OF TABLES viii ACKNOWLEDGMENTS ix CURRICULUM VITAE x ABSTRACT OF THE DISSERTATION xi 1 Introduction 1 1.1 Challenges in Data Feed Management . .3 1.1.1 Genericity and Extensibility . .3 1.1.2 Fetch-Once Compute-Many Model . .5 1.1.3 Scalability and Elasticity . .5 1.1.4 Fault Tolerance . .7 1.2 Contributions . .8 1.3 Organization . .9 2 Related Work 10 2.1 Stream Processing Engines . 10 2.2 Data Routing Engines . 11 2.3 Extract Transform Load (ETL) Systems . 12 2.4 Other Systems . 13 2.4.1 Flume . 14 2.4.2 Chukwa . 15 2.4.3 Kafka . 16 2.4.4 Sqoop . 17 2.5 Summary . 17 3 Background and Preliminaries 19 3.1 AsterixDB . 19 3.1.1 AsterixDB Architecture . 20 3.1.2 AsterixDB Data Model . 21 3.1.3 Querying Data .
    [Show full text]
  • Anomaly Detection and Identification Scheme for VM Live Migration in Cloud Infrastructure✩
    Future Generation Computer Systems 56 (2016) 736–745 Contents lists available at ScienceDirect Future Generation Computer Systems journal homepage: www.elsevier.com/locate/fgcs Anomaly detection and identification scheme for VM live migration in cloud infrastructureI Tian Huang a, Yongxin Zhu a,∗, Yafei Wu a, Stéphane Bressan b, Gillian Dobbie c a School of Microelectronics, Shanghai Jiao Tong University, China b School of Computing, National University of Singapore, Singapore c Department of Computer Science, University of Auckland, New Zealand h i g h l i g h t s • We extend Local Outlier Factors to detect anomalies in time series and adapt to smooth environmental changes. • We propose Dimension Reasoning LOF (DR-LOF) that can point out the ``most anomalous'' dimension of the performance profile. DR-LOF provides clues for administrators to pinpoint and clear the anomalies. • We incorporate DR-LOF and Symbolic Aggregate ApproXimation (SAX) measurement as a scheme to detect and identify the anomalies in the progress of VM live migration. • In experiments, we deploy a typical computing application on small scale clusters and evaluate our anomaly detection and identification scheme by imposing two kinds of common anomalies onto VMs. article info a b s t r a c t Article history: Virtual machines (VM) offer simple and practical mechanisms to address many of the manageability Received 3 March 2015 problems of leveraging heterogeneous computing resources. VM live migration is an important feature Received in revised form of virtualization in cloud computing: it allows administrators to transparently tune the performance 11 May 2015 of the computing infrastructure. However, VM live migration may open the door to security threats.
    [Show full text]
  • List of Section 13F Securities
    List of Section 13F Securities 1st Quarter FY 2004 Copyright (c) 2004 American Bankers Association. CUSIP Numbers and descriptions are used with permission by Standard & Poors CUSIP Service Bureau, a division of The McGraw-Hill Companies, Inc. All rights reserved. No redistribution without permission from Standard & Poors CUSIP Service Bureau. Standard & Poors CUSIP Service Bureau does not guarantee the accuracy or completeness of the CUSIP Numbers and standard descriptions included herein and neither the American Bankers Association nor Standard & Poor's CUSIP Service Bureau shall be responsible for any errors, omissions or damages arising out of the use of such information. U.S. Securities and Exchange Commission OFFICIAL LIST OF SECTION 13(f) SECURITIES USER INFORMATION SHEET General This list of “Section 13(f) securities” as defined by Rule 13f-1(c) [17 CFR 240.13f-1(c)] is made available to the public pursuant to Section13 (f) (3) of the Securities Exchange Act of 1934 [15 USC 78m(f) (3)]. It is made available for use in the preparation of reports filed with the Securities and Exhange Commission pursuant to Rule 13f-1 [17 CFR 240.13f-1] under Section 13(f) of the Securities Exchange Act of 1934. An updated list is published on a quarterly basis. This list is current as of March 15, 2004, and may be relied on by institutional investment managers filing Form 13F reports for the calendar quarter ending March 31, 2004. Institutional investment managers should report holdings--number of shares and fair market value--as of the last day of the calendar quarter as required by Section 13(f)(1) and Rule 13f-1 thereunder.
    [Show full text]
  • Build Reliable Cloud Networks with Sonic and ONE
    Build Reliable Cloud Networks with SONiC and ONE Wei Bai 白巍 Microsoft Research Asia OCP China Technology Day, Shenzhen, China 1 54 100K+ 130+ $15B+ REGIONS WORLDWIDE MILES OF FIBER AND SUBSEA CABLE EDGE SITES Investments Two Open Source Cornerstones for High Reliability Networking OS: SONiC Network Verification: ONE 3 Networking OS: SONiC 4 A Solution to Unblock Hardware Innovation Monitoring, Management, Deployment Tools, Cutting Edge SDN SONiC SONiC SONiC SONiC Switch Abstraction Interface (SAI) Merchant Silicon Switch Abstraction Interface (SAI) NetworkNetwork ApplicationsApplicationsNetwork Applications Simple, consistent, and stable Hello network application stack Switch Abstraction Interface Help consume the underlying complex, heterogeneous частный 你好 नमते Bonjour hardware easily and faster https://github.com/opencomputeproject/SAI 6 SONiC High-Level Architecture Switch State Service (SWSS) • APP DB: persist App objects • SAI DB: persist SAI objects • Orchestration Agent: translation between apps and SAI objects, resolution of dependency and conflict • SyncD: sync SAI objects between software and hardware Key Goal: Evolve components independently 8 SONiC Containerization 9 SONiC Containerization • Components developed in different environments • Source code may not be available • Enables choices on a per- component basis 10 SONiC – Powering Microsoft At Cloud Scale Tier 3 - Regional Spine T3-1 T3-2 T3-3 T3-4 … … … Tier 2 - Spine T2-1-1 T2-1-2 T2-1-8 T2-4-1 T2-4-2 T2-4-4 Features and Roadmap Current: BGP, ECMP, ECN, WRED, LAG, SNMP,
    [Show full text]
  • The Race of Sound: Listening, Timbre, and Vocality in African American Music
    UCLA Recent Work Title The Race of Sound: Listening, Timbre, and Vocality in African American Music Permalink https://escholarship.org/uc/item/9sn4k8dr ISBN 9780822372646 Author Eidsheim, Nina Sun Publication Date 2018-01-11 License https://creativecommons.org/licenses/by-nc-nd/4.0/ 4.0 Peer reviewed eScholarship.org Powered by the California Digital Library University of California The Race of Sound Refiguring American Music A series edited by Ronald Radano, Josh Kun, and Nina Sun Eidsheim Charles McGovern, contributing editor The Race of Sound Listening, Timbre, and Vocality in African American Music Nina Sun Eidsheim Duke University Press Durham and London 2019 © 2019 Nina Sun Eidsheim All rights reserved Printed in the United States of America on acid-free paper ∞ Designed by Courtney Leigh Baker and typeset in Garamond Premier Pro by Copperline Book Services Library of Congress Cataloging-in-Publication Data Title: The race of sound : listening, timbre, and vocality in African American music / Nina Sun Eidsheim. Description: Durham : Duke University Press, 2018. | Series: Refiguring American music | Includes bibliographical references and index. Identifiers:lccn 2018022952 (print) | lccn 2018035119 (ebook) | isbn 9780822372646 (ebook) | isbn 9780822368564 (hardcover : alk. paper) | isbn 9780822368687 (pbk. : alk. paper) Subjects: lcsh: African Americans—Music—Social aspects. | Music and race—United States. | Voice culture—Social aspects— United States. | Tone color (Music)—Social aspects—United States. | Music—Social aspects—United States. | Singing—Social aspects— United States. | Anderson, Marian, 1897–1993. | Holiday, Billie, 1915–1959. | Scott, Jimmy, 1925–2014. | Vocaloid (Computer file) Classification:lcc ml3917.u6 (ebook) | lcc ml3917.u6 e35 2018 (print) | ddc 781.2/308996073—dc23 lc record available at https://lccn.loc.gov/2018022952 Cover art: Nick Cave, Soundsuit, 2017.
    [Show full text]
  • A Decidable Logic for Tree Data-Structures with Measurements
    A Decidable Logic for Tree Data-Structures with Measurements Xiaokang Qiu and Yanjun Wang Purdue University fxkqiu,[email protected] Abstract. We present Dryaddec, a decidable logic that allows reason- ing about tree data-structures with measurements. This logic supports user-defined recursive measure functions based on Max or Sum, and re- cursive predicates based on these measure functions, such as AVL trees or red-black trees. We prove that the logic's satisfiability is decidable. The crux of the decidability proof is a small model property which al- lows us to reduce the satisfiability of Dryaddec to quantifier-free lin- ear arithmetic theory which can be solved efficiently using SMT solvers. We also show that Dryaddec can encode a variety of verification and synthesis problems, including natural proof verification conditions for functional correctness of recursive tree-manipulating programs, legal- ity conditions for fusing tree traversals, synthesis conditions for con- ditional linear-integer arithmetic functions. We developed the decision procedure and successfully solved 220+ Dryaddec formulae raised from these application scenarios, including verifying functional correctness of programs manipulating AVL trees, red-black trees and treaps, check- ing the fusibility of height-based mutually recursive tree traversals, and counterexample-guided synthesis from linear integer arithmetic specifi- cations. To our knowledge, Dryaddec is the first decidable logic that can solve such a wide variety of problems requiring flexible combination of measure-related, data-related and shape-related properties for trees. 1 Introduction Logical reasoning about tree data-structures has been needed in various applica- tion scenarios such as program verification [32,24,26,42,49,4,14], compiler opti- mization [17,18,9,44] and webpage layout engines [30,31].
    [Show full text]
  • Data Intensive Computing with Linq to Hpc Technical
    DATA INTENSIVE COMPUTING WITH LINQ TO HPC TECHNICAL OVERVIEW Ade Miller, Principal Program Manager, LINQ to HPC Agenda Introduction to LINQ to HPC Using LINQ to HPC Systems Management Integrating with Other Data Technologies The Data Spectrum One extreme is analyzing raw, unstructured data. The analyst does not know exactly what the data contains, nor LINQ to HPC what cube would be justified. The analyst needs to do ad-hoc analyses that may never be run again. Another extreme is analytics targeting a traditional data warehouse. The analyst Parallel Data knows the cube he or she wants to build, Warehouse and the analyst knows the data sources. What kind of Data? Large Data Volume . 100s of TBs to 10s of PBs New Questions & Non-Traditional data Types New Technologies New Insights . Unstructured & Semi structured . How popular is my product? . Weak relational schema . Distributed Parallel Processing . What is the best ad to serve? . Text, Images, Videos, Logs Frameworks . Is this a fraudulent transaction? . Easy to Scale on commodity hardware . MapReduce-style programming models New Data Sources . Sensors & Devices . Traditional applications . Web Servers . Public data Overview MOVING THE DATA TO THE COMPUTE So how does it work? FIRST, STORE THE DATA Data Intensive Computing with HPC Server Windows HPC Server 2008 R2 Service Pack 2 So how does it work? SECOND, TAKE THE PROCESSING TO THE DATA Data Intensive Computing with HPC Server var logentries = from line in logs where !line.StartsWith("#") select new LogEntry(line); var user = from access
    [Show full text]
  • Using Pubsub for Scheduling in Azure SDN Qi Zhang (Microsoft - Azure Networking)
    Using PubSub For Scheduling in Azure SDN Qi Zhang (Microsoft - Azure Networking) Azure Networking Azure Region ‘A’ Regional Cable Network Consumers Regional CDN Network Carrier Microsoft Edge Enterprise, SMB, WAN mobile Azure Region ‘B’ ExpressRoute Regional Internet Network Exchanges Enterprise Regional DC/Corpnet Network DC Hardware Services Intra-Region WAN Backbone Edge and ExpressRoute CDN Last Mile • SmartNIC/FPGA • Virtual Networks • DC Networks • Software WAN • Internet Peering • Acceleration for • E2E monitoring • SONiC • Load Balancing • Regional Networks • Subsea Cables • ExpressRoute applications and (Network Watcher, • VPN Services • Optical Modules • Terrestrial Fiber content Network Performance • Firewall • National Clouds Monitoring) • DDoS Protection • DNS & Traffic Management Microsoft Global Network Svalbard Greenland United States Sweden Norway Russia Canada United Kingdom Poland Ukraine Kazakistan France Russia United States Turkey Iran China One of the largest private Algeria Pacific Ocean Atlanta Saudi Ocean Libya networks in the world Mexico Egypt Arabia India Myanmar Niger (Burma) Mali Chad Sudan Pacific Ocean Nigeria • 8,000+ ISP sessions Ethiopi Venezuela a Colombia Dr Congo • 130+ edge sites Indonesia Peru Angola Brazil Zambia Indian Ocean • Bolivia 44 ExpressRoute locations Nambia Australia • 33,000 miles of lit fiber South Africa Owned Capacity Data Argentina • SDN Managed (SWAN, OLS) center Leased Capacity Moving to Owned Edge Site DCs and Network sites not exhaustive Software Defined Management Central Commodity
    [Show full text]