Managing and Understanding Distributed Stream Processing
DISS. ETH No. 25928

Managing and understanding distributed stream processing

A thesis submitted to attain the degree of
Doctor of Sciences of ETH Zurich
(Dr. sc. ETH Zurich)

presented by
Moritz Hoffmann
MSc ETH in Computer Science, ETH Zurich
born on January 3rd, 1986
citizen of the Federal Republic of Germany

accepted on the recommendation of
Prof. Dr. Timothy Roscoe (ETH Zurich), examiner
Prof. Dr. Gustavo Alonso (ETH Zurich), co-examiner
Prof. Dr. Peter Pietzuch (Imperial College London), co-examiner
Dr. Frank McSherry (ETH Zurich), co-examiner

2019

Abstract

Present-day computing systems have to deal with continuous growth in data rate and volume. Processing these workloads should introduce as little latency as possible. Today's stream processors promise to handle large volumes of data while providing low-latency query results. In practice, however, their computational model, variations in the workload, and the lack of tools for programmers can lead to situations where latency increases significantly.

The reason for this lies in the design of today's stream processing systems. Specifically, stream processors do not supply meaningful information for debugging the root causes of latency problems. Additionally, they have inadequate controllers to automate resource management based on workload properties and the requirements of the computation. Lastly, their reconfiguration mechanisms are not compatible with low-latency query processing and cause a significant latency increase while reconfiguring. Solving these problems is crucial to enable users, programmers, and operators to maximize the effectiveness of stream processing.

To solve these problems, we present three complementary solutions. We first propose a method to identify the factors influencing query latency, using detailed internal measurements combined with knowledge of the computational model of the stream processor. Then, we use system-intrinsic measurements to design and implement an automatic scaling controller for scale-out distributed stream processors; it makes fast and accurate scaling decisions with minimal delay. Lastly, we propose a scaling mechanism that reduces downtime during reconfigurations by orders of magnitude. The mechanism achieves this by interleaving fine-grained configuration updates with data processing.

The solutions presented in this thesis help designers of stream processors better optimize for low-latency processing, and they help users increase query performance by providing better metrics and by automating operational aspects. We think this is an important step towards efficient stream processing.

Zusammenfassung

The volume and rate of data that present-day computing systems must handle grow continuously, while at the same time queries over these data streams should be answered with minimal latency. Modern stream processors promise to process large data volumes with low query latency. In reality, however, the internal structure of the data processing, fluctuations in the workload, and a lack of tools for programmers lead to situations in which query latency rises significantly.

The reason lies in the design of modern stream processors. They do not provide meaningful information about the status of queries or the origin of latency problems. They lack control mechanisms that could enable automated resource allocation based on the workload and the properties of the queries.
Nor do they offer any way to apply configuration changes without a significant increase in latency.

To address these problems, we present three complementary solutions. First, we present a method that uses the internal structure of the stream processor together with detailed measurements to identify precisely where latency arises. Second, based on system-specific measurements, we develop an automatic controller for scaling distributed stream processors that scales quickly and precisely while increasing query latency as little as possible. Finally, we present a mechanism that splits configuration changes into small parts and applies them while queries continue to be processed, instead of the usual approach of halting and restarting the computation. This approach greatly reduces downtime during configuration changes.

The solutions presented in this thesis help developers of stream processors optimize them for low latency, and they allow users to improve the latency of their queries through better metrics and automation. We are convinced that these are important contributions towards more efficient stream processing.

Acknowledgments

The work presented in this dissertation would not have been possible without the collaboration and interactions with wonderful friends and colleagues.

I would like to thank my advisor Timothy Roscoe for always supporting me, believing in me, and guiding me during the course of my doctoral studies. I would also like to thank Gustavo Alonso for co-advising my dissertation. Thanks to you, Peter Pietzuch, for taking part in my committee and for the valuable feedback that you have given me. Thank you, Frank McSherry, for hours of interesting discussions, brainstorming sessions, and your positive attitude towards turning interesting ideas into successful research projects.

During my internship I had the opportunity to collaborate with many incredibly smart colleagues, which helped me to shape the structure of my dissertation: Adrian L. Shaw, Alexander Richardson, Chris Dalton, Dejan Milojicic, Geoffrey Ndu, Paolo Faraboschi, and Robert N. M. Watson.

A special thanks goes to all friends and collaborators from the Systems Group and the Strymon project: Andrea, Claude, Desi, Gerd, Ghislain, Ingo, John, Lefteris, Lukas, Matthew, Michal, Pratanu, Pravin, Renato, Reto, Roni, Sebastian, Simon, Stefan, Vasia, and Zaheer. You all made it a pleasure to be part of the group during the last five years. Also thanks to all my friends for bearing with me during the time of writing, and especially to Florian for designing the great cover page.

For many great things in my life, I would like to thank my parents Brigitte and Hartmut. Your support, motivation and courage helped me through all my studies and allowed me to pursue my path. Finally, special thanks to you, Marina, for standing by my side and for your patience during difficult times.

Contents

Abstract
Zusammenfassung
Acknowledgments
1 Introduction
  1.1 Structure of the dissertation
  1.2 Notes on collaborative work
2 Background and motivation
  2.1 Stream processing concepts
  2.2 Expressing queries as dataflows
  2.3 A critical history of stream processing
    2.3.1 First generation stream processors
    2.3.2 Second generation stream processors
    2.3.3 MapReduce: large-scale computations
    2.3.4 Third-generation stream processors
  2.4 Timely dataflow concepts
  2.5 Related work: Distributed systems performance analysis
  2.6 Related work: Scaling controllers for distributed stream processors
  2.7 Related work: Applying configuration updates
    2.7.1 State migration in streaming systems
    2.7.2 Live migration in database systems
    2.7.3 Live migration for streaming dataflows
3 Snailtrail
  3.1 Critical path analysis background
  3.2 Online critical path analysis
    3.2.1 Transient critical paths
    3.2.2 Critical participation (CP metric)
    3.2.3 Comparison with existing methods
  3.3 Applicability to dataflow systems
    3.3.1 Activity types
    3.3.2 Instrumenting specific systems
    3.3.3 Model assumptions
    3.3.4 Instrumentation requirements
  3.4 Program activity graph construction
  3.5 Snailtrail system implementation
  3.6 CP-based performance summaries
  3.7 Evaluation
    3.7.1 Experimental setting
    3.7.2 Instrumentation overhead
    3.7.3 Snailtrail performance
    3.7.4 Comparison with existing methods
    3.7.5 Snailtrail in practice
  3.8 Critical participation: conclusion
4 DS2: Controlling distributed streaming dataflows
  4.1 Background and motivation
  4.2 The DS2 model
    4.2.1 Problem definition
    4.2.2 Performance model
    4.2.3 Assumptions
    4.2.4 Properties
  4.3 Implementation and deployment
    4.3.1 Instrumentation requirements
    4.3.2 Integration with stream processors
    4.3.3 DS2 and execution models
  4.4 Experimental evaluation
    4.4.1 Setup
    4.4.2 DS2 compared to Dhalion on Heron
    4.4.3 DS2 on Flink
    4.4.4 Convergence
    4.4.5 Accuracy
    4.4.6 Instrumentation overhead
  4.5 DS2: conclusion
5 Megaphone
  5.1 State migration design
    5.1.1 Migration formalism and guarantees
    5.1.2 Configuration updates
    5.1.3 Megaphone’s mechanism
    5.1.4 Example
  5.2 Implementation
    5.2.1 Megaphone’s operator interface
    5.2.2 State organization
    5.2.3 Timely Dataflow instantiation
      5.2.3.1 Monitoring output frontiers
      5.2.3.2 Capturing Timely Dataflow idioms
    5.2.4 Discussion
  5.3 Evaluation
    5.3.1 NEXMark benchmark
    5.3.2 Overhead of the interface
    5.3.3 Migration micro-benchmarks
      5.3.3.1 Number of bins vary
      5.3.3.2 Number of keys vary
      5.3.3.3 Number of keys and bins vary proportionally
      5.3.3.4 Throughput versus processing latency
      5.3.3.5 Memory