Online Modeling and Tuning of Parallel Stream Processing Systems
Jonathan Curtis Beard, Washington University in St. Louis
Washington University in St. Louis
Washington University Open Scholarship

Engineering and Applied Science Theses & Dissertations
McKelvey School of Engineering

Summer 8-15-2015

Online Modeling and Tuning of Parallel Stream Processing Systems
Jonathan Curtis Beard
Washington University in St. Louis

Follow this and additional works at: https://openscholarship.wustl.edu/eng_etds
Part of the Engineering Commons

Recommended Citation
Beard, Jonathan Curtis, "Online Modeling and Tuning of Parallel Stream Processing Systems" (2015). Engineering and Applied Science Theses & Dissertations. 125.
https://openscholarship.wustl.edu/eng_etds/125

This Dissertation is brought to you for free and open access by the McKelvey School of Engineering at Washington University Open Scholarship. It has been accepted for inclusion in Engineering and Applied Science Theses & Dissertations by an authorized administrator of Washington University Open Scholarship. For more information, please contact [email protected].

WASHINGTON UNIVERSITY IN ST. LOUIS
School of Engineering and Applied Science
Department of Computer Science and Engineering

Dissertation Examination Committee:
Roger Dean Chamberlain, Chair
Jeremy Buhler
Ron Cytron
Roch Guerin
Jenine Harris
Juan Pantano
Robert B. Pless

Online Modeling and Tuning of Parallel Stream Processing Systems
by
Jonathan Curtis Beard

A dissertation presented to the Graduate School of Arts and Sciences of Washington University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

August 2015
Saint Louis, Missouri

© 2015, Jonathan Curtis Beard

Table of Contents

List of Figures .... v
List of Tables .... viii
Acknowledgements .... ix
Abstract .... xi
Chapter 1: Introduction .... 1
  1.1 Industry Trends .... 2
  1.2 Paradigm Flux .... 3
  1.3 Turning Streams into Torrents .... 5
  1.4 Contribution and Structure .... 8
Chapter 2: Background and Related Works .... 10
  2.1 Stream Processing .... 10
  2.2 Modeling .... 12
  2.3 Instrumentation .... 14
  2.4 Online Modeling & Performance Tuning .... 15
  2.5 Dynamic Adaptation .... 16
Chapter 3: RaftLib Streaming Library .... 18
  3.1 Design considerations .... 20
  3.2 RaftLib description .... 21
    3.2.1 RaftLib as a research platform .... 24
    3.2.2 Authoring streaming applications .... 29
  3.3 Benchmarking .... 33
  3.4 Concluding Remarks and What Follows .... 37
Chapter 4: Modeling Streaming Applications .... 38
  4.1 Introduction .... 38
    4.1.1 Description .... 39
    4.1.2 Sharing Models .... 45
    4.1.3 Modeling Assumptions .... 45
    4.1.4 Example .... 46
  4.2 Model Evaluation Approach .... 48
    4.2.1 Tools .... 48
    4.2.2 Hardware .... 48
    4.2.3 Empirical Testing .... 49
    4.2.4 Selecting Compute Resources and Mapping Application Kernels .... 49
    4.2.5 Synthetic Benchmarks .... 50
    4.2.6 Real Applications .... 51
  4.3 Results .... 51
    4.3.1 Processor Sharing Model .... 51
    4.3.2 The Flow Model .... 52
    4.3.3 The Queueing Model .... 54
  4.4 Conclusions .... 56
Chapter 5: Best Case Execution Time Variation .... 57
  5.1 Methodology .... 60
    5.1.1 Synthetic Workload .... 60
    5.1.2 Hardware, Software, and Data Collection .... 61
    5.1.3 Distribution .... 64
    5.1.4 Parameterization and definition of mLevy .... 67
  5.2 Results .... 69
  5.3 Conclusions .... 72
Chapter 6: Dynamic Instrumentation .... 75
  6.1 Introduction .... 75
  6.2 Instrumentation Considerations .... 79
    6.2.1 Throughput .... 82
    6.2.2 Queue Occupancy .... 82
  6.3 Service Rate .... 83
    6.3.1 Online Service Rate Heuristic .... 85
    6.3.2 Sampling Period Determination .... 85
    6.3.3 Service Rate Heuristic .... 86
  6.4 Evaluation .... 92
    6.4.1 Infrastructure .... 92
  6.5 Applications .... 92
    6.5.1 Matrix Multiply .... 93
    6.5.2 Rabin-Karp String Search .... 94
  6.6 Validation .... 94
  6.7 Conclusions .... 99
Chapter 7: Model Selection .... 101
  7.1 Stochastic Models and Streaming Applications .... 103
  7.2 Stochastic Queueing Model Selection .... 104
    7.2.1 Methodology .... 104
    7.2.2 Support Vector Machine .... 105
    7.2.3 Data Collection & Hardware .... 107
    7.2.4 SVM and Training .... 109
    7.2.5 Artificial Neural Network (ANN) .... 113
  7.3 Conclusions & Future Work .... 114
Chapter 8: Online Tuning: Putting It All Together .... 117
  8.1 Online Modeling of Streaming Systems .... 117
    8.1.1 Why is online tuning important? .... 119
    8.1.2 Adaption Types .... 119
    8.1.3 Evaluation Methodology .... 121
    8.1.4 Adaption Results .... 121
  8.2 Conclusions .... 124
Chapter 9: Conclusions and Future Work .... 125
  9.1 Conclusions .... 125
  9.2 Future Work .... 127
References .... 130
Vita .... 146

List of Figures

Figure 3.1: Simple streaming example .... 19
Figure 3.2: RaftLib sum kernel example .... 23
Figure 3.3: RaftLib sum application mapping example .... 24
Figure 3.4: RaftLib scheduler options .... 26
Figure 3.5: Importance of properly sized buffers .... 27
Figure 3.6: RaftLib syntax for C++ container interaction .... 31
Figure 3.7: RaftLib foreach construct .... 32
Figure 3.8: RaftLib lambda kernel syntax .... 33
Figure 3.9: RaftLib string matching topology .... 34
Figure 3.10: Aho-Corasick string matching algorithm .... 34
Figure 3.11: RaftLib performance by thread count .... 36
Figure 4.1: JPEG encode topology .... 40
Figure 4.2: Stages of flow model transformation .... 40
Figure 4.3: Example application topology .... 46
Figure 4.4: Application mapping .... 46
Figure 4.5: Traditional service rate characterization .... 47
Figure 4.6: Flow modeling step by step .... 47
Figure 4.7: Completed solutions to flow model .... 47
Figure 4.8: Offline service rate characterization .... 49
Figure 4.9: DES application topology .... 51
Figure 4.10: Percent error for sharing model .... 52
Figure 4.11: Synthetic app % error for gain/loss flow model .... 53
Figure 4.12: JPEG encode % error for gain/loss flow model .... 53
Figure 4.13: DES encrypt % error for gain/loss flow model .... 53
Figure 4.14: Synthetic app buffer capacity % error .... 54
Figure 4.15: JPEG encode buffer capacity % error .... 55
Figure 4.16: DES encrypt buffer capacity % error .... 55
Figure 5.1: Histogram of execution variation .... 58
Figure 5.2: Simplifying distributional assumptions .... 59
Figure 5.3: Memory access variances .... 63
Figure 5.4: Timer variance .... 64
Figure 5.5: Timer latency .... 65
Figure 5.6: mLevy QQ-plots grouped by process .... 71
Figure 5.7: KL divergence for model selection .... 72
Figure 6.1: Non-blocking service rate intuition .... 77
Figure 6.2: Non-blocked periods in highly utilized queue .... 78
Figure 6.3: Two server streaming system .... 79
Figure 6.4: Instrumentation monitor arrangement .... 80
Figure 6.5: Timing across NUMA nodes .... 81
Figure 6.6: Stability of time frame .... 82
Figure 6.7: Probability of observing non-blocking read .... 84
Figure 6.8: Direct observations of service rate .... 87
Figure 6.9: Initial filtered observations of non-blocked service rate .... 89
Figure 6.10: Convergence of stable non-blocking service rates .... 90
Figure 6.11: Estimating the point of convergence .... 91
Figure 6.12: Instrumentation of shifting distributions .... 91
Figure 6.13: Matrix multiply example .... 93
Figure 6.14: Rabin-Karp example .... 94
Figure 6.15: Histogram of micro-benchmark instrumentation .... 95
Figure 6.16: Micro-benchmark shifting distribution results .... 96
Figure 6.17: Detection of both phases .... 97
Figure 6.18: Matrix multiply instrumentation .... 98
Figure 6.19: Rabin-Karp instrumentation results .... 99
Figure 7.1: Fingerprinting of computer systems .... 102
Figure 7.2: Simple micro-benchmark .... 103
Figure 7.3: Machine learning feature cloud .... 106
Figure 7.4: Classification by error category .... 110
Figure 7.5: Classification rate by queue utilization (ρ) .... 111
Figure 7.6: Neural network results .... 115
Figure 8.1: Complete enumeration of RaftLib application .... 122
Figure 8.2: Peak performance zoom-in .... 123
Figure 8.3: Parallelization event monitor control actions .... 123
Figure 8.4: Queue sizing event monitor control actions .... 123

List of Tables

Table 3.1: Benchmarking hardware summary for RaftLib ....