Bistro: Scheduling Data-Parallel Jobs Against Live Production Systems Andrey Goder, Alexey Spiridonov, and Yin Wang, Facebook, Inc
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The Sad State of Web Development Random Thoughts on Web Development
The Sad State of Web Development Random thoughts on web development Going to shit 2015 is when web development went to shit. Web development used to be nice. You could fire up a text editor and start creating JS and CSS files. You can absolutely still do this. That has not changed. So yes, everything I’m about to say can be invalidated by saying that. The web (specifically the Javascript/Node community) has created some of the most complicated, convoluted, over engineered tools ever conceived. Node.js/NPM At times, I think where web development is at this point is some cruel joke played on us by Ryan Dahl. You see, to get into why web development is so terrible, you have to start at Node. By definition I was a magpie developer, so undoubtedly I used Node, just as everyone should. At universities they should make every developer write an app with Node.js, deploy it to production, then try to update the dependencies 3 months later. The only downside is we would have zero new developers coming out of computer science programs. You see the Node.js philosophy is to take the worst fucking language ever designed and put it on the server. Combine that with all the magpies that were using Ruby at the time, and you have the perfect fucking storm. Lets take everything that was great in Ruby and re write it in Javascript, I think was the official motto. Most of the smart magpies have moved on to Go at this point, but the people who have stayed in the Node community have undoubtedly created the most over engineered eco system that has ever appeared. -
16 Inspiring Women Engineers to Watch
Hackbright Academy Hackbright Academy is the leading software engineering school for women founded in San Francisco in 2012. The academy graduates more female engineers than UC Berkeley and Stanford each year. https://hackbrightacademy.com 16 Inspiring Women Engineers To Watch Women's engineering school Hackbright Academy is excited to share some updates from graduates of the software engineering fellowship. Check out what these 16 women are doing now at their companies - and what languages, frameworks, databases and other technologies these engineers use on the job! Software Engineer, Aclima Tiffany Williams is a software engineer at Aclima, where she builds software tools to ingest, process and manage city-scale environmental data sets enabled by Aclima’s sensor networks. Follow her on Twitter at @twilliamsphd. Technologies: Python, SQL, Cassandra, MariaDB, Docker, Kubernetes, Google Cloud Software Engineer, Eventbrite 1 / 16 Hackbright Academy Hackbright Academy is the leading software engineering school for women founded in San Francisco in 2012. The academy graduates more female engineers than UC Berkeley and Stanford each year. https://hackbrightacademy.com Maggie Shine works on backend and frontend application development to make buying a ticket on Eventbrite a great experience. In 2014, she helped build a WiFi-enabled basal body temperature fertility tracking device at a hardware hackathon. Follow her on Twitter at @magksh. Technologies: Python, Django, Celery, MySQL, Redis, Backbone, Marionette, React, Sass User Experience Engineer, GoDaddy 2 / 16 Hackbright Academy Hackbright Academy is the leading software engineering school for women founded in San Francisco in 2012. The academy graduates more female engineers than UC Berkeley and Stanford each year. -
NUMA-Aware Thread Migration for High Performance NVMM File Systems
NUMA-Aware Thread Migration for High Performance NVMM File Systems Ying Wang, Dejun Jiang and Jin Xiong SKL Computer Architecture, ICT, CAS; University of Chinese Academy of Sciences fwangying01, jiangdejun, [email protected] Abstract—Emerging Non-Volatile Main Memories (NVMMs) out considering the NVMM usage on NUMA nodes. Besides, provide persistent storage and can be directly attached to the application threads accessing file system rely on the default memory bus, which allows building file systems on non-volatile operating system thread scheduler, which migrates thread only main memory (NVMM file systems). Since file systems are built on memory, NUMA architecture has a large impact on their considering CPU utilization. These bring remote memory performance due to the presence of remote memory access and access and resource contentions to application threads when imbalanced resource usage. Existing works migrate thread and reading and writing files, and thus reduce the performance thread data on DRAM to solve these problems. Unlike DRAM, of NVMM file systems. We observe that when performing NVMM introduces extra latency and lifetime limitations. This file reads/writes from 4 KB to 256 KB on a NVMM file results in expensive data migration for NVMM file systems on NUMA architecture. In this paper, we argue that NUMA- system (NOVA [47] on NVMM), the average latency of aware thread migration without migrating data is desirable accessing remote node increases by 65.5 % compared to for NVMM file systems. We propose NThread, a NUMA-aware accessing local node. The average bandwidth is reduced by thread migration module for NVMM file system. -
CROP: Linking Code Reviews to Source Code Changes
CROP: Linking Code Reviews to Source Code Changes Matheus Paixao Jens Krinke University College London University College London London, United Kingdom London, United Kingdom [email protected] [email protected] Donggyun Han Mark Harman University College London Facebook and University College London London, United Kingdom London, United Kingdom [email protected] [email protected] ABSTRACT both industrial and open source software development communities. Code review has been widely adopted by both industrial and open For example, large organisations such as Google and Facebook use source software development communities. Research in code re- code review systems on a daily basis [5, 9]. view is highly dependant on real-world data, and although existing In addition to its increasing popularity among practitioners, researchers have attempted to provide code review datasets, there code review has also drawn the attention of software engineering is still no dataset that links code reviews with complete versions of researchers. There have been empirical studies on the effect of code the system’s code base mainly because reviewed versions are not review on many aspects of software engineering, including software kept in the system’s version control repository. Thus, we present quality [11, 12], review automation [2], and automated reviewer CROP, the Code Review Open Platform, the first curated code recommendation [20]. Recently, other research areas in software review repository that links review data with isolated complete engineering have leveraged the data generated during code review versions (snapshots) of the source code at the time of review. CROP to expand previously limited datasets and to perform empirical currently provides data for 8 software systems, 48,975 reviews and studies. -
Unravel Data Systems Version 4.5
UNRAVEL DATA SYSTEMS VERSION 4.5 Component name Component version name License names jQuery 1.8.2 MIT License Apache Tomcat 5.5.23 Apache License 2.0 Tachyon Project POM 0.8.2 Apache License 2.0 Apache Directory LDAP API Model 1.0.0-M20 Apache License 2.0 apache/incubator-heron 0.16.5.1 Apache License 2.0 Maven Plugin API 3.0.4 Apache License 2.0 ApacheDS Authentication Interceptor 2.0.0-M15 Apache License 2.0 Apache Directory LDAP API Extras ACI 1.0.0-M20 Apache License 2.0 Apache HttpComponents Core 4.3.3 Apache License 2.0 Spark Project Tags 2.0.0-preview Apache License 2.0 Curator Testing 3.3.0 Apache License 2.0 Apache HttpComponents Core 4.4.5 Apache License 2.0 Apache Commons Daemon 1.0.15 Apache License 2.0 classworlds 2.4 Apache License 2.0 abego TreeLayout Core 1.0.1 BSD 3-clause "New" or "Revised" License jackson-core 2.8.6 Apache License 2.0 Lucene Join 6.6.1 Apache License 2.0 Apache Commons CLI 1.3-cloudera-pre-r1439998 Apache License 2.0 hive-apache 0.5 Apache License 2.0 scala-parser-combinators 1.0.4 BSD 3-clause "New" or "Revised" License com.springsource.javax.xml.bind 2.1.7 Common Development and Distribution License 1.0 SnakeYAML 1.15 Apache License 2.0 JUnit 4.12 Common Public License 1.0 ApacheDS Protocol Kerberos 2.0.0-M12 Apache License 2.0 Apache Groovy 2.4.6 Apache License 2.0 JGraphT - Core 1.2.0 (GNU Lesser General Public License v2.1 or later AND Eclipse Public License 1.0) chill-java 0.5.0 Apache License 2.0 Apache Commons Logging 1.2 Apache License 2.0 OpenCensus 0.12.3 Apache License 2.0 ApacheDS Protocol -
Crawling Code Review Data from Phabricator
Friedrich-Alexander-Universit¨atErlangen-N¨urnberg Technische Fakult¨at,Department Informatik DUMITRU COTET MASTER THESIS CRAWLING CODE REVIEW DATA FROM PHABRICATOR Submitted on 4 June 2019 Supervisors: Michael Dorner, M. Sc. Prof. Dr. Dirk Riehle, M.B.A. Professur f¨urOpen-Source-Software Department Informatik, Technische Fakult¨at Friedrich-Alexander-Universit¨atErlangen-N¨urnberg Versicherung Ich versichere, dass ich die Arbeit ohne fremde Hilfe und ohne Benutzung anderer als der angegebenen Quellen angefertigt habe und dass die Arbeit in gleicher oder ¨ahnlicherForm noch keiner anderen Pr¨ufungsbeh¨ordevorgelegen hat und von dieser als Teil einer Pr¨ufungsleistung angenommen wurde. Alle Ausf¨uhrungen,die w¨ortlich oder sinngem¨aߨubernommenwurden, sind als solche gekennzeichnet. Nuremberg, 4 June 2019 License This work is licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0), see https://creativecommons.org/licenses/by/4.0/ Nuremberg, 4 June 2019 i Abstract Modern code review is typically supported by software tools. Researchers use data tracked by these tools to study code review practices. A popular tool in open-source and closed-source projects is Phabricator. However, there is no tool to crawl all the available code review data from Phabricator hosts. In this thesis, we develop a Python crawler named Phabry, for crawling code review data from Phabricator instances using its REST API. The tool produces minimal server and client load, reproducible crawling runs, and stores complete and genuine review data. The new tool is used to crawl the Phabricator instances of the open source projects FreeBSD, KDE and LLVM. The resulting data sets can be used by researchers. -
Letter, If Not the Spirit, of One Or the Other Definition
Producing Open Source Software How to Run a Successful Free Software Project Karl Fogel Producing Open Source Software: How to Run a Successful Free Software Project by Karl Fogel Copyright © 2005-2021 Karl Fogel, under the CreativeCommons Attribution-ShareAlike (4.0) license. Version: 2.3214 Home site: https://producingoss.com/ Dedication This book is dedicated to two dear friends without whom it would not have been possible: Karen Under- hill and Jim Blandy. i Table of Contents Preface ............................................................................................................................. vi Why Write This Book? ............................................................................................... vi Who Should Read This Book? ..................................................................................... vi Sources ................................................................................................................... vii Acknowledgements ................................................................................................... viii For the first edition (2005) ................................................................................ viii For the second edition (2021) .............................................................................. ix Disclaimer .............................................................................................................. xiii 1. Introduction ................................................................................................................... -
Artificial Intelligence for Understanding Large and Complex
Artificial Intelligence for Understanding Large and Complex Datacenters by Pengfei Zheng Department of Computer Science Duke University Date: Approved: Benjamin C. Lee, Advisor Bruce M. Maggs Jeffrey S. Chase Jun Yang Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science in the Graduate School of Duke University 2020 Abstract Artificial Intelligence for Understanding Large and Complex Datacenters by Pengfei Zheng Department of Computer Science Duke University Date: Approved: Benjamin C. Lee, Advisor Bruce M. Maggs Jeffrey S. Chase Jun Yang An abstract of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science in the Graduate School of Duke University 2020 Copyright © 2020 by Pengfei Zheng All rights reserved except the rights granted by the Creative Commons Attribution-Noncommercial Licence Abstract As the democratization of global-scale web applications and cloud computing, under- standing the performance of a live production datacenter becomes a prerequisite for making strategic decisions related to datacenter design and optimization. Advances in monitoring, tracing, and profiling large, complex systems provide rich datasets and establish a rigorous foundation for performance understanding and reasoning. But the sheer volume and complexity of collected data challenges existing techniques, which rely heavily on human intervention, expert knowledge, and simple statistics. In this dissertation, we address this challenge using artificial intelligence and make the case for two important problems, datacenter performance diagnosis and datacenter workload characterization. The first thrust of this dissertation is the use of statistical causal inference and Bayesian probabilistic model for datacenter straggler diagnosis. -
Jetbrains Upsource Comparison Upsource Is a Powerful Tool for Teams Wish- Key Benefits Ing to Improve Their Code, Projects and Pro- Cesses
JetBrains Upsource Comparison Upsource is a powerful tool for teams wish- Key benefits ing to improve their code, projects and pro- cesses. It serves as a polyglot code review How Upsource Compares to Other Code Review Tools tool, a source of data-driven project ana- lytics, an intelligent repository browser and Accuracy of Comparison a team collaboration center. Upsource boasts in-depth knowledge of Java, PHP, JavaScript, Integration with JetBrains Tools Python, and Kotlin to increase the efcien- cy of code reviews. It continuously analyzes Sales Contacts the repository activity providing a valuable insight into potential design problems and project risks. On top of that Upsource makes team collaboration easy and enjoyable. Key benefits IDE-level code insight to help developers Automated workflow, to minimize manual tasks. Powerful search engine. understand and review code changes more efectively. Smart suggestion of suitable reviewers, revi- IDE plugins that allow developers to partici- sions, etc. based on historical data and intel- pate in code reviews right from their IDEs. Data-driven project analytics highlighting ligent progress tracking. potential design flaws such as hotspots, abandoned files and more. Unified access to all your Git, Mercurial, Secure, and scalable. Perforce or Subversion projects. To learn more about Upsource, please visit our website at jetbrains.com/upsource. How Upsource Compares to Other Code Review Tools JetBrains has extensively researched various As all the products mentioned in the docu- tools to come up with a useful comparison ment are being actively developed and their table. We tried to make it as comprehensive functionality changes on a regular basis, this and neutral as we possibly could. -
Characterizing, Modeling, and Benchmarking Rocksdb Key-Value
Characterizing, Modeling, and Benchmarking RocksDB Key-Value Workloads at Facebook Zhichao Cao, University of Minnesota, Twin Cities, and Facebook; Siying Dong and Sagar Vemuri, Facebook; David H.C. Du, University of Minnesota, Twin Cities https://www.usenix.org/conference/fast20/presentation/cao-zhichao This paper is included in the Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST ’20) February 25–27, 2020 • Santa Clara, CA, USA 978-1-939133-12-0 Open access to the Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST ’20) is sponsored by Characterizing, Modeling, and Benchmarking RocksDB Key-Value Workloads at Facebook Zhichao Cao†‡ Siying Dong‡ Sagar Vemuri‡ David H.C. Du† †University of Minnesota, Twin Cities ‡Facebook Abstract stores is still challenging. First, there are very limited studies of real-world workload characterization and analysis for KV- Persistent key-value stores are widely used as building stores, and the performance of KV-stores is highly related blocks in today’s IT infrastructure for managing and storing to the workloads generated by applications. Second, the an- large amounts of data. However, studies of characterizing alytic methods for characterizing KV-store workloads are real-world workloads for key-value stores are limited due to different from the existing workload characterization stud- the lack of tracing/analyzing tools and the difficulty of collect- ies for block storage or file systems. KV-stores have simple ing traces in operational environments. In this paper, we first but very different interfaces and behaviors. A set of good present a detailed characterization of workloads from three workload collection, analysis, and characterization tools can typical RocksDB production use cases at Facebook: UDB (a benefit both developers and users of KV-stores by optimizing MySQL storage layer for social graph data), ZippyDB (a dis- performance and developing new functions. -
Myrocks in Mariadb
MyRocks in MariaDB Sergei Petrunia <[email protected]> MariaDB Shenzhen Meetup November 2017 2 What is MyRocks ● #include <Yoshinori’s talk> ● This talk is about MyRocks in MariaDB 3 MyRocks lives in Facebook’s MySQL branch ● github.com/facebook/mysql-5.6 – Will call this “FB/MySQL” ● MyRocks lives there in storage/rocksdb ● FB/MySQL is easy to use if you are Facebook ● Not so easy if you are not :-) 4 FB/mysql-5.6 – user perspective ● No binaries, no packages – Compile yourself from source ● Dependencies, etc. ● No releases – (Is the latest git revision ok?) ● Has extra features – e.g. extra counters “confuse” monitoring tools. 5 FB/mysql-5.6 – dev perspective ● Targets a CentOS-type OS – Compiler, cmake version, etc. – Others may or may not [periodically] work ● MariaDB/Percona file pull requests to fix ● Special command to compile – https://github.com/facebook/mysql-5.6/wiki/Build-Steps ● Special command to run tests – Test suite assumes a big machine ● Some tests even a release build 6 Putting MyRocks in MariaDB ● Goals – Wider adoption – Ease of use – Ease of development – Have MyRocks in MariaDB ● Use it with MariaDB features ● Means – Port MyRocks into MariaDB – Provide binaries and packages 7 Status of MyRocks in MariaDB 8 Status of MyRocks in MariaDB ● MariaDB 10.2 is GA (as of May, 2017) ● It includes an ALPHA version of MyRocks plugin – Working to improve maturity ● It’s a loadable plugin (ha_rocksdb.so) ● Packages – Bintar, deb, rpm, win64 zip + MSI – deb/rpm have MyRocks .so and tools in a separate package. 9 Packaging for MyRocks in MariaDB 10 MyRocks and RocksDB library ● MyRocks is tied RocksDB@revno MariaDB – RocksDB is a github submodule – No compatibility with other versions MyRocks ● RocksDB is always compiled with RocksDB MyRocks S Z n ● l i And linked-in statically a b p ● p Distros have a RocksDB package y – Not using it. -
Phabricator 538D8f2... Overview
Institute of Computational Science Phabricator 538d8f2... overview Dmitry Mikushin (for the Bugs Course) . October 17, 2013 Dmitry Mikushin Test-drive at http://devel.kernelgen.org October 17, 2013 1 / 14 . What is Phabricator? The LAMP-based web-server + command-line client for: peer code review task management project communication And it’s open-source Dmitry Mikushin Test-drive at http://devel.kernelgen.org October 17, 2013 2 / 14 . Test drive! Browse to http://devel.kernelgen.org, login and look around Dmitry Mikushin Test-drive at http://devel.kernelgen.org October 17, 2013 3 / 14 . Key features Peer code review Integrated environment for request reviewing, tasks/bugs and versioning system - in web environment - in command line Ergonomic task interface Dmitry Mikushin Test-drive at http://devel.kernelgen.org October 17, 2013 4 / 14 . Key features Peer code review Integrated environment for request reviewing, tasks/bugs and versioning system - in web environment - in command line Ergonomic task interface Dmitry Mikushin Test-drive at http://devel.kernelgen.org October 17, 2013 5 / 14 . Peer code review: traditional 1 RFC: Developer publishes a patch with the source and tests 2 Other developers comment on the patch issues/improvements 3 After all issues are addressed, reviewers ACK patch for commit 4 Developer himself or someone with rw rights commits the patch Everything is over email Relationship with Bugs: fixes for PRs are also reviewed this way Dmitry Mikushin Test-drive at http://devel.kernelgen.org October 17, 2013 6 / 14 . Peer code review: Phabricator approach 1 Developer check-ins the patch to the Phabricator over the command line (passing lint/unit tests, if any) $ arc diff .