A Modular Environment for Geophysical Inversion and Run-time Autotuning using Heterogeneous Computing Systems

A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY

Joseph M. Myre

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy

David J. Lilja and Martin O. Saar

July, 2013

© Joseph M. Myre 2013
ALL RIGHTS RESERVED

Acknowledgements

The support and guidance of my advisors, David Lilja and Martin Saar, have been central to the completion of this dissertation. Our frequent meetings and discussions have provided me with indispensable assistance and feedback. Thanks are always due to the staff members who make our departments run smoothly, especially Georganne Tolaas and Sharon Kressler; each is an invaluable member of her respective department and has provided me with immeasurable assistance. I also thank the members of my written, oral, and final examination committees for their work. Members of these committees include Shashi Shekhar, Antonia Zhai, Justin Revenaugh, David Lilja, and Martin Saar.

I thank the research groups and departments that welcomed me to graduate school and helped me along the way. The assistance from Shruti Patil, Peter Bailey, Erich Frahm, and the Laboratory for Advanced Research in Computing Technology and Compilers (ARCTiC Labs) in the Department of Electrical and Computer Engineering was invaluable. The Geofluids research group provided me with an excellent place to learn about the physical nature of fluids in the Earth Sciences. I am very grateful for the assistance and mentoring that I have received from the post-docs I have worked with, including Stuart Walsh, Matthew Covington, Po-Hao Kao, Kong XiangZhao, and Weijun Xiao, as well as helpful discussions with Ben Tutolo, my fellow Geofluids graduate student. I am grateful to Brian Bagley, Lars Hansen, David Fox, Lindsay Iredale, Andrew Haveles, Kit Resner, Ioan Lascu, Ben Tutolo, Andrew Luhmann, and Annia Fayon for enlightening discussions about research, improving my understanding of geology, and their friendship. I would also like to thank Calvin Alexander and Josh Feinberg for the interesting discussions that continued developing my interest in the Earth Sciences. The hospitality I received from the remainder of the Department of Earth Sciences was beyond what I could have expected and was extremely appreciated. The friendship of Aaron and Allison Burnett, Julia Tutolo, Tiffany Bagley, Elissa

Hansen, Ioana Lascu, Ryan Littlewood, and Laura Vietti was also enjoyed and appreciated. The Earth Science departmental Saturday morning ice hockey group and the AHA Yetis ice hockey team provided wonderful and much needed recreational breaks throughout the years. The pursuit of my Ph.D. would have been less pleasurable without them. I would finally like to thank my family for their support. My parents, specifically, have been continuously supportive and inquisitive throughout my academic career, and for that I thank them. I also met my soon-to-be wife, Anna Lindquist, here at the University of Minnesota, and she has provided more support, encouragement, patience, and love than could be expected of any one person; I wholly appreciate it.

Chapter 2 acknowledgements: This work was supported in part by National Science Foundation (NSF) Grant No. EAR-0941666. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. I also thank Erich Frahm, Bill Tuohy, Shimin Lian, and the Minnesota Supercomputing Institute for their advice and support.

Chapter 3 acknowledgements: This work was supported in part by National Science Foundation (NSF) Grants No. DMS-0724560, EAR-0510723, and EAR-0941666. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. We are also grateful to Peter Bailey, Erich Frahm, Victor Barocas, the Laboratory for Computational Science & Engineering, Paul Woodward, and NVIDIA for their advice and support, and thank our reviewers for their excellent comments.

Chapter 4 acknowledgements: This work was supported in part by National Science Foundation (NSF) Grant No. EAR-0941666. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. We also thank Stuart Walsh, Erich Frahm, and Bill Tuohy for their advice and support, as well as the Minnesota Supercomputing Institute for their high-performance computing resources.

Chapter 5 acknowledgements: This work was supported in part by National Science Foundation (NSF) Grant No. EAR-0941666. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. We also thank Bill Tuohy, Carl

Sturtivant, Joshua Feinberg, and Ioan Lascu for helpful discussions, and the Institute for Rock Magnetism and the Minnesota Supercomputing Institute for computational resources.

Chapter 6 acknowledgements: This work was supported in part by National Science Foundation (NSF) Grant No. EAR-0941666. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. We also thank Erich Frahm, Bill Tuohy, the Minnesota Supercomputing Institute, and NVIDIA for their advice and support.

Chapter 7 acknowledgements: This work was supported in part by National Science Foundation (NSF) Grants No. DMS-0724560, EAR-0510723, and EAR-0941666. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. I also thank Erich Frahm, Bill Tuohy, and the Minnesota Supercomputing Institute for their advice and support.

Contents

Acknowledgements i

List of Tables viii

List of Figures x

1 Introduction 1

2 CUSH – An efficient programming framework for single host GPU clusters 6
2.1 Introduction ...... 8
2.2 Background and Related Work ...... 9
2.3 GPU Computation ...... 11
2.4 Figures ...... 11
2.5 The Design of CUSH ...... 11
2.6 Test Methodology and Case Studies ...... 16
2.6.1 Program Complexity Measurement ...... 17
2.6.2 Statistical Comparison of Performance ...... 17
2.6.3 Matrix Multiplication ...... 18
2.6.4 Lattice-Boltzmann Fluid Flow ...... 20
2.6.5 Fast Fourier Transform ...... 21
2.7 Discussion of Results ...... 23
2.8 Conclusions and Future Work ...... 28

3 A Lattice-Boltzmann Method for Fluid Flow 30
3.1 Introduction ...... 32
3.2 Lattice-Boltzmann Computation ...... 35
3.2.1 Memory Access Patterns ...... 37
3.2.2 Memory Constraints ...... 39
3.2.3 Single-GPU Implementation ...... 41
3.2.4 Multi-GPU Implementation ...... 46
3.3 Experimental Methodology ...... 48
3.3.1 Memory Access Pattern Methodology ...... 48
3.3.2 Statistical Methodology ...... 49
3.4 Performance Analysis ...... 53
3.4.1 Memory Access Patterns ...... 53
3.4.2 Statistical Analysis ...... 56
3.5 Related Work ...... 66
3.6 Conclusions and Future Work ...... 68

4 Karst Conduit Flow 70
4.1 Introduction ...... 72
4.2 Heat Transport in Karst Conduits ...... 74
4.2.1 Physical governing equations ...... 74
4.2.2 Computational Model ...... 78
4.3 Solute Transport in Karst Conduits ...... 82
4.3.1 Physical governing equations ...... 82
4.3.2 Computational model ...... 84
4.4 GPU Implementation ...... 86
4.5 Test Methodology ...... 89
4.6 Results ...... 91
4.6.1 Gaussian Thermal Pulses ...... 91
4.6.2 Reproduction of Observed Thermal Signals ...... 92
4.7 Conclusions and Future Work ...... 94

5 The Singular Value Composition and TNT Non-Negative Least Squares Solver 95
5.1 Introduction ...... 97
5.2 Background and related work ...... 98
5.2.1 Non-Negative Least Squares ...... 98
5.2.2 Test Matrix Generation ...... 100
5.3 Test Matrix Generation ...... 101
5.3.1 Design of the SVC algorithm ...... 101
5.3.2 Work Factor Analysis ...... 106
5.4 Non-Negative Least Squares Solver Strategies ...... 108
5.4.1 Existing active set and least squares solver strategies ...... 108
5.4.2 The design of the TNT algorithm ...... 109
5.5 Test methodology ...... 110
5.5.1 SVC ...... 110
5.5.2 TNT ...... 112
5.6 Discussion of results ...... 114
5.6.1 SVC ...... 114
5.6.2 TNT ...... 115
5.7 Conclusions and future work ...... 124

6 Run-time Autotuning for Performance Using Applied Geophysical Inverse Theory 126
6.1 Introduction ...... 128
6.2 Heterogeneous Computing ...... 129
6.3 Common Inversion Methods ...... 130
6.3.1 Non-Linear Conjugate Gradient ...... 131
6.3.2 The Simplex Method ...... 132
6.3.3 Simulated Annealing ...... 132
6.4 Applying Inversion in Machine Space ...... 133
6.5 Case Studies and Methodology ...... 134
6.6 Discussion of Results ...... 136
6.6.1 Performance Analysis ...... 136
6.6.2 Optimal Run-Time Configurations ...... 138
6.7 Conclusions and Future Work ...... 140

7 A Modular Environment for Geophysical Inversion and Run-time Autotuning Using Heterogeneous Computing Systems 142
7.1 Introduction ...... 144
7.2 Components of the Environment ...... 145
7.2.1 Heterogeneous System Abstraction ...... 145
7.2.2 Geophysical Inversion Methods ...... 145
7.2.3 Run-time Autotuning for Performance ...... 147
7.3 Case Studies and Methodology ...... 148
7.3.1 Lattice-Boltzmann Fluid Flow ...... 149
7.3.2 Karst Conduit Fluid Flow ...... 151
7.3.3 Non-Negative Least Squares ...... 159
7.4 Discussion of Case Study Results ...... 161
7.4.1 Lattice-Boltzmann Method ...... 161
7.4.2 Karst Conduit Flow ...... 163
7.4.3 Non-Negative Least Squares – TNT ...... 164
7.5 Conclusions and Future Work ...... 167

8 Conclusion and Discussion 169

9 Complete Bibliography 172

Appendix A. CUSH Example Code Listings 192
A.1 Matrix Multiplication ...... 193
A.2 The Lattice-Boltzmann Method ...... 196
A.3 The Fast Fourier Transform ...... 200

Appendix B. Lattice-Boltzmann Implementation Details 205
B.1 Memory Access Pattern Descriptions ...... 206
B.2 Multiple-GPU Pseudo-Code ...... 208

Appendix C. Karst Conduit Flow Implementation Details 210
C.1 GPGPU Heat Transport Model Pseudocode ...... 211
C.2 GPGPU Conductivity Transport Model Pseudocode ...... 213

Appendix D. SVC Implementation Details 214
D.1 SVC Pseudocode ...... 215
D.2 Work Factor Calculation ...... 220

Appendix E. TNT Implementation Details 221
E.1 TNT pseudocode ...... 222
E.2 TNT least squares solver pseudocode ...... 224

List of Tables

2.1 CUSH Host-GPU memory functions ...... 15
2.2 CUSH GPU-GPU memory functions ...... 15
3.1 GPU Specifications ...... 48
3.2 CPU Specifications ...... 48
3.3 Fundamental parameters of the GPU architecture influencing code performance ...... 50
3.4 GPU Specifications ...... 53
3.5 Single-GPU - Percent impact of primary effects on total variation in performance ...... 61
3.6 Multi-GPU - Percent impact of primary effects on total variation in performance ...... 63
3.7 Single-GPU - Percent impacts of primary effects impacting total variation in performance ...... 64
3.8 Multi-GPU - Percent impacts of primary effects impacting total variation in performance ...... 65
4.1 Physical constants used for both test simulations ...... 90
4.2 Physical constants used for both test simulations ...... 90
4.3 Dimensionless constants from (Covington et al., 2010, 2011) used to reproduce an observed signal in Tyson Spring Cave ...... 91
6.1 Configurations used for the tested case studies ...... 134
6.2 Configurations used for the tested case studies ...... 135
7.1 The aspects of this environment that are applied in each of the case studies are marked by an X ...... 149
7.2 Initial values and lower and upper bounds for the number of devices in each axis ...... 151
7.3 Initial values and lower and upper bounds for the dimensionless parameters used in the karst conduit heat transport model ...... 157
7.4 Initial values and lower and upper bounds for the reduction (rc) and expansion (ec) constants ...... 161

List of Figures

2.1 High level overview of GPU memory ...... 12
2.2 High level overview of a uniform memory distribution ...... 14
2.3 MCC examples ...... 18
2.4 Tiled matrix multiplication ...... 19
2.5 The D3Q19 lattice structure ...... 21
2.6 MCC and NCSS measurements for all case studies ...... 24
2.7 Matrix multiplication strong scaling behavior ...... 25
2.8 Lattice-Boltzmann model strong scaling behavior ...... 25
2.9 FFT strong scaling behavior ...... 26
2.10 Matrix multiplication weak scaling behavior ...... 27
2.11 Lattice-Boltzmann model weak scaling behavior ...... 27
2.12 FFT weak scaling behavior ...... 28
3.1 Flow in porous media during one timestep of a lattice-Boltzmann simulation ...... 33
3.2 D3Q19 lattice interconnection pattern ...... 36
3.3 Execution of the A-B memory pattern (ping-pong) ...... 38
3.4 Simplified D2Q9 version of A-B pattern fluid packet reads (a) and streaming writes after collision (b). The center fluid packet is stationary. ...... 39
3.5 Execution of the A-A memory pattern ...... 40
3.6 D2Q9 version of A-A:1 pattern fluid packet reads (a) and streaming writes after collision (b). ...... 40
3.7 D2Q9 version of A-A:2 pattern fluid packet reads (a) and streaming writes after collision (b). ...... 41
3.8 GPU kernel execution is based on groups of threads known as thread blocks. Inter-thread communication is made possible through shared memory, accessible to all threads within the same block. In addition, threads have access to the GPU's global memory, a larger data storage space which is available on the device, distinct from the CPU host's memory. ...... 42
3.9 One thread is assigned per node along the x axis, and one block is assigned per (y,z) coordinate. ...... 43
3.10 Data propagation along the x axis: a) for compute capability 1.1 and lower the data must be propagated in shared memory to preserve coalescence, b) for compute capability 1.2 and above the copy back to global memory is made without the transfer in shared memory. ...... 44
3.11 Three memory access models for data propagation along y and z axes: a) Ping-Pong, b) Flip-Flop, c) Lagrangian. Fluid packet position in the diagram designates the same global memory location (although not stored in this geometry), arrows indicate fluid packet directions, dark arrows show fluid packets associated with the center node. In a) and b) the steps shown in this Figure are repeated, while in c) the ring of fluid packets arriving at the center node expands outwards on subsequent time steps (see Appendix A for more detail). ...... 46
3.12 Flowchart of the multi-GPU implementation. The steps shown are repeated for the interaction potentials in the multicomponent and multiphase versions and given in greater detail in Appendix B. ...... 47
3.13 Mean GPU pattern A-B performance vs. GPU occupancy with 95% confidence interval ...... 54
3.14 Average speeds in millions of lattice-node updates per second (MLUPS) of single-component single-phase (SCSP) single- and multi-GPU implementations as a function of the number of threads per thread block. SCSP Pattern refers to the single-GPU version of the code, where Pattern refers to the memory access pattern used for the test. OpenMP multi-GPU tests are designated as SCSP OpenMP NP X, where X refers to the number of GPUs used in the test. In SCSP OpenMP NP 1, boundary data is collected and copied back to the CPU each timestep before being returned back to the GPU (mimicking GPU to GPU memory transfer in larger multi-GPU simulations), while in SCSP Pattern boundary conditions are enforced entirely on the GPU. ...... 57
3.15 Average speeds in millions of lattice-node updates per second (MLUPS) of single-component multiphase (A) and multicomponent multiphase (B) single- and multi-GPU implementations as a function of the number of threads per thread block. ...... 58
3.16 Curves illustrating the scaling behavior of the single-component single-phase OpenMP implementation by the number of threads per thread block. The coefficients of determination (R^2) for linear fits to each curve have been included. ...... 59
3.17 Curves illustrating the scaling behavior of the (A) single-component multiphase and (B) multicomponent multiphase OpenMP implementation by the number of threads per thread block. The coefficients of determination (R^2) for linear fits to each curve have been included. ...... 60
4.1 A high level overview of the complexities present in karst landscapes. Reproduced and modified from (Lascu and Feinberg, 2011). ...... 72
4.2 The model for heat exchange between a full-pipe karst conduit and the surrounding rock. Heat passes from the bulk, mixed water at T_w through a convective boundary layer into the conduit wall at temperature T_s^*. Heat then conducts through the rock body (initially at T_r^* = 1), influencing an increasing volume of rock as the duration of a thermal pulse increases. (Adapted from (Covington et al., 2011)). ...... 75
4.3 A) A simple schematic showing the discretized 1-D conduit and 2-D rock system geometries. White nodes are fluid nodes, light grey nodes are surface nodes, and dark grey nodes are internal rock nodes. The grey and hash marked backgrounds correspond to rock and fluid, respectively. B) The matrix containing the corresponding terms for the schematic in Figure 4.3A where the bottom two rows (T_w,j and T_0,j terms) contain conduit and surface terms, respectively. The remaining rows contain internal rock node terms. ...... 79
4.4 A) A simple schematic showing the discretized 1-D conduit system geometry for flow in the x direction. B) The vector containing the corresponding terms for the schematic in Figure 4.4A. ...... 83
4.5 NVIDIA GPU hierarchical architecture reproduced from (NVIDIA, 2009a, 2012). GPU kernel execution is based on groups of threads known as thread blocks. Inter-thread communication is made possible through shared memory, accessible to all threads within the same block. In addition, threads have access to the GPU's global (device) memory, a larger data storage space which is available on the device, distinct from the CPU host's memory. (A) shows a high level overview of CPU code starting GPU kernel execution across a thread grid and thread blocks. The organization of thread grids and thread blocks into regular grids is shown in (B). ...... 86
4.6 Execution times of available GPGPU tridiagonal solvers normalized to the CUDPP solver. (A) shows times for solving 192×192 systems with 1024 right hand sides. (B) shows times for solving a single 1024×1024 system. ...... 88
4.7 A) A simple schematic showing the discretized conduit/rock system geometry. White nodes are fluid nodes, light grey nodes are surface nodes, and dark grey nodes are internal rock nodes. The grey and hash marked backgrounds correspond to rock and fluid, respectively. Reproduced from Figure 4.3. B) The mapping of vectors to the discretized system in (A). C) The vectors from (B) are organized into the two systems solved in each time step on the GPU; the rock system with N right hand sides and the fluid system with a single right hand side. The hierarchical organization of threads on GPUs can naturally index and execute this data organization. See text for further details. ...... 89
4.8 Execution times of the tested model simulations. (A) shows absolute execution times and (B) shows execution times normalized to the GPGPU execution time. ...... 92
4.9 Reproduction of temperature (A) and conductivity (B) signals from Tyson Spring Cave. Simulated temperature produces an RMSE of 0.053°C and simulated conductivity produces an RMSE of 0.024 mS/cm. ...... 93
5.1 The initial m×n matrix, M. The singular values lie along the diagonal of the n×n region. The p dimension is m−n. ...... 102
5.2 Illustrations of the SVC algorithm Step 4 for different values of n where white elements are empty, light grey elements are unrotated singular values, and dark grey elements are linear combinations of the singular values. Examples show that: n = 1) is trivial, n = 2) requires one rotation between the rows and one between columns, n = 3) first handles the n = 2 case and then recurses on the remaining diagonal region which is the trivial n = 1 case. Second, the results of the recursion are combined with the rows and columns from the first step through rotations. n = 4) first handles two n = 2 cases along the diagonal and then combines those submatrices through row and column rotations to form the full square matrix. There are no redundant rotations for n = 2^k cases. ...... 104
5.3 An example mixing of the singular values by recursive rotations on 2^k submatrices, in order from A to G. The darkly shaded areas indicate completely mixed sub-regions. For the largest 2^k submatrix, steps (A) through (D) show the progression of mixing by row and column rotations between pairs of progressively larger power of 2 submatrices along the diagonal. Steps (E) and (F) show the same mixing happening as the algorithm performs the remaining 2^k recursive calls along the diagonal. Steps (G) and (H) show the submatrices returned from the recursive calls being mixed by row and column rotations with the previous 2^k submatrix. See text for details. ...... 105
5.4 A comparison of the actual work factor and the theoretical upper bound of the work factor for square matrices where n ranges from 2 to 1024. Notice the stepping that occurs when n = 2^k + 1. ...... 107
5.5 The test space in (A) shows the split between overdetermined and underdetermined regions above and below the diagonal, respectively. Well-determined systems lie along the diagonal. The maximum dimensions of systems for this space are 5000×5000. The region outlined by the dotted line roughly indicates problem sizes of prior studies. The regions indicating prior studies and the expanded 5000×5000 region are displayed in the expanded space in (B) by light and dark grey markers, respectively. Circular markers along the diagonal of this expanded space indicate additional tests showing the additional problem sizes that the TNT is capable of handling beyond prior studies. ...... 112
5.6 Speedup results for the Matlab implementation of SVC vs. the Matlab gallery function and a native LAPACK dlatms function, in the left and right columns, respectively. In order, the rows correspond to the underdetermined, overdetermined, and square tests. Note that for this small range of underdetermined and overdetermined comparisons against LAPACK show slowdown for SVC. ...... 115
5.7 Execution times of LAPACK and SVC when generating square, power-of-two matrices. The measured growth rates are 3.0002 and 2.0001, respectively. ...... 116
5.8 Speedup results are shown for a complete combination of negativity constraints (vertical axis) and condition numbers (horizontal axis). The top member of each entry shows a top down view of the performance surface and the bottom member of each entry shows five profile slices at 1000 row increments. ...... 117
5.9 Histograms of log10 of the absolute value of the solution error for overdetermined and well-determined tests (A) and underdetermined tests (B). FNNLS error is black and TNT error is gray. TNT error is never more than the FNNLS error for overdetermined and well-determined tests and it is lower on average for underdetermined tests. ...... 119
5.10 Total inner loop iterations to convergence due to variations in the reduction constant and the expansion constant. In (A) the expansion constant is fixed at 1.5 and the reduction constant is varied. In (B), the reduction constant is fixed at 0.5 and the expansion constant is varied. ...... 121
5.11 Performance trends of speedup beyond the original 5000×5000 tests (dotted lines) are shown for square systems from 10000×10000 to 25000×25000 in increments of 5000 (solid line with circle markers) where κ = 10^9 using 5% (A) and 50% (B) active constraints. ...... 122
5.12 Execution times (first row) and speedups (second row) are shown for large (45000×45000) systems where κ = 10^2 (first column) and κ = 10^9 (second column). Active constraints (non-negativity) of 5% and 50% are applied for both condition numbers. For these large tests the TNT algorithm always outperforms the FNNLS algorithm, by more than 50× in three cases with a maximum of 157×. ...... 123
6.1 Performance results for Black-Scholes, Radix Sort, and Vector Add. Mean execution times are denoted by an X and minimum execution times as a circle. Error bars extend to 1 standard deviation above and below the mean. If the lower error bar passes below zero it is truncated at zero. ...... 137
6.2 Performance results for Matrix Multiplication and the Tridiagonal Solver. Mean execution times are denoted by an X and the minimum execution times obtained through run-time autotuning as a circle. Error bars extend to 1 standard deviation above and below the mean. If the lower error bar passes below zero it is truncated at zero to retain physical meaning. ...... 137
6.3 Speedups provided by the optimal run-time configuration compared to average and maximum execution time of the tested machine configurations for each case study. ...... 138
6.4 The optimal run-time configurations obtained for each case study. ...... 139
7.1 An illustration of the D3Q19 lattice structure surrounding a single center (black) node. ...... 150
7.2 The model for heat exchange between a full-pipe karst conduit and the surrounding rock. Heat passes from the bulk, mixed water at T_w through a convective boundary layer into the conduit wall at temperature T_s^*. Heat then conducts through the rock body (initially at T_r^* = 1), influencing an increasing volume of rock as the duration of a thermal pulse increases. (Adapted from (Covington et al., 2011)). ...... 152
7.3 A) A simple schematic showing the discretized 1-D conduit and 2-D rock system geometries. White nodes are fluid nodes, light grey nodes are surface nodes, and dark grey nodes are internal rock nodes. The grey and hash marked backgrounds correspond to rock and fluid, respectively. B) The matrix containing the corresponding terms for the schematic in Figure 7.3A where the bottom two rows (T_w,j and T_0,j terms) contain conduit and surface terms, respectively. The remaining rows contain internal rock node terms. ...... 155
7.4 Execution times of available GPGPU tridiagonal solvers normalized to the CUDPP solver. (A) shows times for solving 192×192 systems with 1024 right hand sides. (B) shows times for solving a single 1024×1024 system. ...... 156
7.5 A) A simple schematic showing the discretized conduit/rock system geometry. White nodes are fluid nodes, light grey nodes are surface nodes, and dark grey nodes are internal rock nodes. The grey and hash marked backgrounds correspond to rock and fluid, respectively. Reproduced from Figure 7.3. B) The mapping of vectors to the discretized system in (A). C) The vectors from (B) are organized into the two systems solved in each time step on the GPU; the rock system with N right hand sides and the fluid system with a single right hand side. The hierarchical organization of threads on GPUs can naturally index and execute this data organization. See text for further details. ...... 158
7.6 The vertical component of the field produced by a synthetic dipole. ...... 161
7.7 The percent improvement in MCC and NCSS by using CUSH to implement the lattice-Boltzmann case study. ...... 162
7.8 The best performing run-time configuration determined by the run-time autotuner for the lattice-Boltzmann case study. ...... 162
7.9 The best fit model solution of Tyson Spring Cave (TSC) with the measured input and output curves determined by the non-linear conjugate gradient inversion. The RMS error between the measured cave discharge and the modeled discharge from the best fitting inverse solution is 0.0075°C, below the 0.01°C resolution of the data logger. ...... 163
7.10 Results for small system tests where the condition number varies from κ = 10^2, κ = 10^5, and κ = 10^9. (A) The optimal reduction and expansion constants determined for each test case. (B) Execution times for TNT small system tests. The minimum time is indicated by a circle, the arithmetic mean time is indicated by an X, and the error bars show one standard deviation. If one standard deviation less than the mean is a negative value the error bar is truncated at zero to preserve physical meaning. Both vertical axes are execution time in seconds. (C) Speedups for TNT small system tests compared to the arithmetic mean and maximum measured values. ...... 165
7.11 The B_z component of the magnetic field produced by the forward model using the inverse dipole solution. ...... 166

Chapter 1

Introduction

Physical processes spanning an enormous range of spatial and temporal scales are often of high importance to scientists and engineers. In the geosciences these can vary from nanometer to whole-Earth, and from seconds to millions of years. Commonly encountered systems spanning these ranges include magma flow in the Earth's mantle and crust leading to plate tectonics (Gerya and Yuen, 2003), the development of climate systems (Fournier et al., 2004), the transmission of seismic waves (Tsuboi et al., 2005), as well as glacial melting (Jellinek et al., 2004).

However, to be practically useful, scientific and engineering studies often require the modeling and analysis of many different potential scenarios to understand the dependence of emerging patterns and scientific laws on the underlying parameter space. This approach can require exploring large parameter spaces, analyzing parameter uncertainties, finding optimal models, and utilizing other inverse modeling techniques (e.g., Parker, 1994). Running a single simulation is entirely insufficient for these methods. Because these methods are necessary for understanding the emergent behavior of such complex multi-scale systems, they create a demand for more powerful computational algorithms and hardware tools. Many times a single simulation is executed multiple (possibly thousands of) times to evaluate a given set of parameters or to perform uncertainty analysis. Although executed many times, the models underlying many of these studies can be quite simple.

To accelerate these studies, it has been shown that a significant number of common scientific and engineering computational tasks can take advantage of special purpose accelerators. A special purpose accelerator is a computational processor architecture, independent from the CPU, that has been tuned to provide higher performance at lower costs for specific applications, compared to CPUs. Computer systems that incorporate special purpose accelerators are said to be heterogeneous computing systems, as the types of computation resources they provide are no longer homogeneous. Examples of special purpose accelerators include Graphics Processing Units (GPUs), the IBM CellBE, ASIC-based accelerators, and Field-Programmable Gate Arrays (FPGAs). Most accelerators attempt to increase performance with minimal silicon area by exploiting fine-grained data parallelism using Single-Instruction, Multiple-Data (SIMD) operations. For example, individual GPUs can contain hundreds of simple processors that execute SIMD operations at a much larger scale than conventional CPUs (tens of thousands of active threads vs. tens of active threads, for instance). At their inception, calculations using GPUs were confined to graphics manipulation. Recently, a large amount of effort has resulted in programming tools that allow custom-made programs to run directly on GPUs.

A cluster of several such GPUs results in a massively parallel system with thousands of simple, yet collectively powerful, processors. The High-Performance Computing (HPC) community has acted as interested adopters of special purpose accelerators, particularly GPUs, with good returns. The Los Alamos National Laboratory's Roadrunner IBM Cell cluster (Barker et al., 2008) and the National Center for Supercomputing Applications' Lincoln Tesla GPU cluster (NCSA, 2009) have demonstrated how such heterogeneous hardware can form a supercomputing system. Two of the top ten supercomputers on the Nov. 2012 Top500 list incorporate GPUs, one of which holds the first position (top500.org, 2012).

By using heterogeneous computing systems that consist of both general purpose processors and special-purpose accelerators, the speed and problem size of many simulations can be dramatically increased. Ultimately this results in enhanced simulation capabilities that allow, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are so critical for scientific discovery and engineering analysis. Given the extraordinary range of temporal and spatial scales typically required in science and engineering processes, as well as the immense range of possible parameter values, simulating these processes on a single type of hardware system is simply not feasible. For instance, recent developments in GPU architectures have demonstrated that these simple, highly parallelized processor systems are very good at executing certain types of computations (Walsh et al., 2009a).

This thesis presents a modular programming, inverse modeling, and run-time autotuning environment for heterogeneous computing systems. This environment is composed of three major components: 1) CUSH – a framework for reducing the complexity of programming heterogeneous computer systems, 2) geophysical inversion routines which can be used to characterize physical systems, and 3) run-time autotuning routines designed to determine well performing run-time configurations of heterogeneous computing systems for scientific and engineering codes. The utility of this environment is presented through three case studies. These case studies comprise a lattice-Boltzmann method for fluid flow, a high-performance finite difference code for flow and transport in karst conduits, and the TNT algorithm, a fast method for solving non-negative least squares problems. Specific contributions made by this thesis include:

• CUSH – a framework for reducing the complexity of programming heterogeneous computer systems (Myre et al., In review.).

• An in-depth performance analysis of a multi-GPU lattice-Boltzmann method (Myre et al., 2011; Walsh et al., 2009b). This analysis includes the effects of multiple memory access patterns, one of which reduces memory requirements by 50% (Bailey et al., 2009). The initial implementations of the A-A and A-B memory patterns were by Peter Bailey and the Eulerian pattern by Stuart Walsh.

• A high-performance finite difference method for transport in karst conduit systems using GPUs (Covington et al., 2010; Myre et al., In Prep.a). The performance capabilities of this karst conduit flow model enable the first high-speed characterization of natural karst conduits (Myre et al., In Prep.b).

• The Singular Value Composition (SVC) algorithm for generating matrices with a specific condition number, including performance and work factor analysis (Myre et al., In Review).

• The first analysis and refinement of the TNT algorithm, a fast active set method for solving non-negative least squares problems. This analysis covers performance, error, and convergence (Frahm et al., In Review). It should be noted that the initial conception and initial implementation of the TNT algorithm are attributable to Erich Frahm.

• A run-time autotuning tool using a novel application of geophysical inverse theory (Myre et al., In Prep.d). With this tool the optimal run-time configuration of a heterogeneous computing system can be determined for a specific code.

• A modular environment for scientists and engineers to easily run scientific and engineering codes, geophysical inversions, and to perform run-time autotuning using heterogeneous computing systems (Myre et al., In Prep.c).

The first aspect of this environment is a tool to reduce the programmatic complexity of heterogeneous systems. Chapter 2 presents CUSH – a framework for reducing the complexity of programming heterogeneous computer systems. CUSH masks much of the difficulty that comes from working across host-GPU and GPU-GPU memory contexts, thereby reducing the barriers to adoption of GPUs as a computational resource for scientists and engineers. CUSH consistently reduces multiple measures of programmatic complexity without causing a statistically meaningful change in performance (Myre et al., In review.).

The second aspect of this environment is a set of geophysical inversion methods. Geophysical inversion methods are frequently used to explore large parameter spaces and find optimal models. Integral to these methods are algorithms for searching the parameter space or minimizing model error. Obtaining results via inversion methods would not be a timely procedure without efficient forward models for evaluating the points within the parameter space. Details of the forward models developed and used throughout this thesis are presented in Chapters 3 and 4. The TNT algorithm, an active set method for solving non-negative least squares problems that accelerates geophysical inversions involving such problems, is presented in Chapter 5.

Chapter 3 contributes multiple memory access patterns for lattice-Boltzmann methods, one of which reduces memory requirements by 50% (Bailey et al., 2009; Myre et al., 2011; Walsh et al., 2009b). The factors governing the performance of this type of lattice-Boltzmann method are statistically analyzed and their influence on the total variation in performance is determined.

Chapter 4 contributes a high-performance finite difference method for karst conduit flow that uses a novel dimensionless form of the mathematical model of the physical system (Covington et al., 2010; Myre et al., In Prep.a). The performance capabilities of this karst conduit flow model enable the first high-speed characterization of natural karst conduits (Myre et al., In Prep.b).

Chapter 5 contributes an efficient implementation and performance analysis of TNT, a new active set method for solving large non-negative least squares problems (Frahm et al., In Review). The TNT method is capable of quickly solving much larger non-negative least squares problems than current methods and of outperforming previous active set methods by more than 180× within the range of conventional problem sizes.

The final aspect of this environment is a tool for performing run-time autotuning of scientific and engineering codes on heterogeneous systems. A novel application of geophysical inverse theory to heterogeneous computer systems for the purpose of run-time autotuning is presented in Chapter 6. In this context, the inversion method is treated as an optimization routine that determines the best set of machine parameters for a given algorithm, forming a well performing run-time configuration. This tool allows scientists and engineers to focus their efforts on the pertinent details of their models instead of on computer science details that may not be consistent between computer systems.

Chapter 7 presents this environment as it is applied to three case studies. These case studies are composed of the inversion methods and models described in Chapters 3, 4, and 5. Using the run-time autotuning tool, an optimal run-time configuration for a multi-GPU lattice-Boltzmann method implemented using CUSH is found. A karst conduit is characterized with the karst conduit flow code implemented using CUSH and the non-linear conjugate gradient geophysical inversion method. The run-time autotuning tool is also applied to the TNT non-negative least squares method to determine the optimal parameters for performing non-negative least squares geophysical inversions.

A high level overview of the environment and case studies presented throughout this thesis, along with concluding remarks, is provided in Chapter 8.

Chapter 2

CUSH – An efficient programming framework for single host GPU clusters

This chapter is composed of one manuscript that has been submitted and is currently under review: J.M. Myre, D. Lilja, M.O. Saar, CUSH - An efficient programming tool for single host GPU clusters, Parallel Computing, In Review, 2013.

The recent surge in the popularity of Graphics Processing Units (GPUs) as a computational resource has come with additional challenges for computational scientists and engineers. The relatively complex computational model and memory hierarchy of GPUs may inhibit adoption among research scientists who would rather focus their efforts directly on research computation than learn new architectural details. We present CUSH, an abstraction layer for working with multiple GPUs in a single-host system. CUSH masks much of the difficulty that comes from working across host-GPU and GPU-GPU memory contexts, thereby reducing the barriers to adoption of GPUs as a computational resource for scientists and engineers. We analyze program complexity via two measures, non-commented source code statements and McCabe's Cyclomatic Complexity (McCabe, 1976), as well as performance of CUSH via three case studies representing different computational and communication requirements: matrix multiplication, a lattice-Boltzmann fluid flow simulator, and the fast Fourier transform. In all case studies it is shown that CUSH decreases both measures of program complexity, relative to a pure CUDA multi-GPU implementation, without causing a statistically meaningful change in performance. As a result, CUSH provides a basis for additional framework development.

2.1 Introduction

Simplifying the utilization of complex systems through abstraction is fundamental to computer science and engineering (Abelson and Sussman, 1983). Such abstraction allows researchers to focus on pertinent concepts related to their research instead of the repetitive, albeit necessary, work required to use the computational resources at hand. This style of simplification has produced compilers, programming languages, programming methodologies, protocol layers, and many more concepts that have become critical to efficient program development on computational systems.

Modern special purpose accelerators, such as Graphics Processing Units (GPUs) and the Cell Broadband Engine, have provided a significant leap forward in the computational capability of inexpensive commercial systems. GPUs have proven to be particularly popular. Available development suites for GPUs (OpenCL (Khronos OpenCL Working Group, 2009) and CUDA (NVIDIA, 2008a)) have assisted in advancing the popularity of GPUs as computation accelerators by masking many of the architectural details.

Many scientists who may not have had significant formal training in computer science are drawn by the attractive computational capabilities of GPUs. While these scientists typically have some familiarity with traditional High Performance Computing (HPC) architectures and communication techniques, such as MPI or OpenMP, the complexities of the GPU architecture and memory layers can appear daunting. Furthermore, the inclusion of multiple GPUs within a single-host system can increase programmatic complexity to a level that many research scientists may consider unreasonable.

In this paper we propose CUSH, an abstraction layer for programming multiple GPUs contained in a single host. Through the use of CUSH, the programming overhead that comes from managing memory across GPU-host and GPU-GPU contexts is consistently reduced. Furthermore, the degree to which a researcher is required to manage the intricacies of the GPU architecture is also reduced.

In this paper we present three case studies: matrix multiplication, a lattice-Boltzmann fluid flow simulation, and the fast Fourier transform (FFT). These case studies represent different computing classes (Asanovic et al., 2006) and communication profiles. The matrix multiplication and FFT studies act as extreme examples. At one extreme, the matrix multiplication is entirely computation-bound as it does not require any communication between GPUs. Conversely, the FFT is computed quickly, but it requires all of the data to be communicated among all GPUs in an all-to-all operation. This means that the execution time of the FFT is dominated by communication. The lattice-Boltzmann fluid flow simulation is memory bandwidth bound, with computation that scales with the volume of the three-dimensional lattice and communication that scales with the surface area of the lattice. The lattice-Boltzmann method only requires communication between processors that are "adjacent" to one another with respect to the global lattice. As such, the lattice-Boltzmann method may be viewed as a computer science application that lies somewhere between the end-member cases described above.

By analyzing these case studies we show that CUSH reduces program complexity via non-commented source code statements (NCSS) and McCabe's Cyclomatic complexity (MCC) (McCabe, 1976), while maintaining performance that is statistically indistinguishable from that of the original CUDA implementation.

This paper provides a discussion of related abstraction efforts in Section 2.2 as well as a short background on GPU computing in Section 2.3. Details of the design and implementation of CUSH are given in Section 2.5. The test methodology, matrix multiplication, lattice-Boltzmann simulation, and FFT case studies are described in Section 2.6, and their results are discussed in Section 2.7. Finally, conclusions and future work are discussed in Section 2.8.

2.2 Background and Related Work

Abstraction in the HPC domain has produced numerous systems and tools meant to reduce the complexity that comes with increased computational resources. Tools such as MPI and OpenMP are ubiquitous. Since the upswing in the popularity of GPUs as augmentations to traditional HPC systems, the research community has undertaken numerous efforts to improve their programmability as well. A number of research efforts have focused on language in order to move away from the C++ style of CUDA by using language bindings (Klöckner et al., 2011; Yan et al., 2009; Accelerys, 2012), or high-level OpenMP style directives (Han and Abdelrahman, 2011; OpenACC, 2011). These approaches reduce resistance to adoption by reshaping CUDA into the syntax and semantics of a more familiar language, while potentially hiding some of the GPU's architectural complexity.

Other efforts have moved towards the goal of abstracting the management of accelerators in the context of conventional distributed memory systems through management frameworks (Singh et al., 2009) or communication frameworks (Lawlor, 2009; Yamagiwa and Sousa, 2009). This approach focuses on reducing the complexity that comes with managing the additional architectural details resulting from the introduction of GPUs to a system.

The HPC community has acted as interested adopters of GPUs with good returns. Two of the top ten supercomputers on the Nov. 2012 Top500 list incorporate GPUs, one of which holds the first position (top500.org, 2011). This interest, shown by the HPC community, validates the efforts put forth to reduce the barriers to the adoption of GPUs. Examining these systems reveals that the typical architectural scheme, when incorporating GPUs into HPC systems, is to supply each host with a single GPU. However, with GPUs showing evolving computational capability and with increasing energy concerns, a shift towards higher-density GPU solutions is not only probable, but highly likely. In fact, a number of studies have shown that computational speedup can be readily obtained by adopting single-host, multi-GPU implementations (Schaa and Kaeli, 2009; Myre et al., 2011; Agullo et al., 2011; Walsh and Saar, 2011).

NVIDIA has taken the multi-GPU approach into consideration by including peer-to-peer memory copy functions for multi-GPU host systems in CUDA versions ≥4.0. Subsequent versions of CUDA can present the programmer with a single address space for host and GPU memories through Unified Virtual Addressing (UVA). With UVA, the programming of memory accesses and copies is greatly simplified. However, a performance penalty frequently results from using UVA.

With these trends and complications in mind, the primary goal of CUSH is to act as a data management framework for GPUs connected to a single host. What distinguishes CUSH from other multi-GPU management tools is that CUSH focuses on abstracting the management of multiple GPUs on a single shared memory host system. Separating CUSH from directive-based efforts (Han and Abdelrahman, 2011; OpenACC, 2011) is a CUSH feature that allows for data distribution and data management across multiple GPUs within a host, operating entirely independently from computation. By operating solely on data distribution and data management, CUSH can remove numerous loop and conditional structures as well as pointer arithmetic. This approach is beneficial for research scientists, as it removes the necessity of having to understand and manage the mapping between data buffers in host memory and data buffers on the connected GPUs. By using CUSH to manage data between the host and the GPUs, the opportunity to introduce errors is significantly reduced, as is the programmatic complexity of the resultant code.
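For reference, the fragment below sketches the raw CUDA peer-to-peer copy path mentioned above (CUDA versions ≥4.0). It is illustrative only; the wrapper function, device ids, and buffer names are placeholders, and error checking is omitted.

#include <cuda_runtime.h>

// Copy nbytes from a buffer resident on GPU 0 to a buffer resident on GPU 1.
void copy_dev0_to_dev1(const float *src_on_dev0, float *dst_on_dev1, size_t nbytes)
{
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);   // allow device 0 to address device 1
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);   // and vice versa

    cudaMemcpyPeer(dst_on_dev1, 1,      // destination pointer and device
                   src_on_dev0, 0,      // source pointer and device
                   nbytes);             // direct GPU-to-GPU transfer
}

Every such transfer in a plain CUDA multi-GPU code requires this kind of per-device bookkeeping, which is exactly the overhead CUSH is designed to hide.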

2.3 GPU Computation

CUSH focuses on using NVIDIA GPUs and the CUDA programming language as a means for GPU management. However, there is nothing prohibiting the use of OpenCL programming constructs in CUSH to enable the utilization of additional accelerators from different manufacturers. In fact, we are presently working on OpenCL extensions to CUSH. NVIDIA's CUDA provides an Application Programming Interface (API) and C++ language extensions that enable programmers to use GPUs to take advantage of simple, data-parallel computational kernels. When executed on a GPU, the computational kernels are performed by groups of threads called "cooperative thread arrays" (CTAs) or warps. These groups execute concurrently within a single "Streaming Multiprocessor" (SM). Many SMs can exist in a single GPU, making each GPU capable of executing thousands of threads concurrently.

NVIDIA's GPUs also have a relatively complex memory hierarchy. Simplified, the GPU memory hierarchy can be arranged into global and shared memory. This arrangement is shown in Figure 2.1, with the global memory being the highest level. Global memory is accessible by the host and all of the CTAs. Shared memory is local to each SM and is accessible only to the CTAs that are also executing locally on that SM. The difference in access times between global and shared memory can be as high as 100x.
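As a concrete illustration of this hierarchy (this kernel is illustrative only and is not part of CUSH; the scale_tile name and the 256-thread block size are assumptions), the following CUDA kernel has each thread block, i.e., each CTA, stage its portion of a global-memory buffer in shared memory before operating on it.

__global__ void scale_tile(const float *in, float *out, float alpha, int n)
{
    // Shared memory is visible only to the threads of this CTA and resides on
    // the SM, so repeated accesses are far cheaper than global-memory accesses.
    __shared__ float tile[256];                       // assumes blockDim.x == 256
    int gid = blockIdx.x * blockDim.x + threadIdx.x;  // global element index

    if (gid < n)
        tile[threadIdx.x] = in[gid];                  // one global-memory read
    __syncthreads();                                  // all threads in the CTA see the tile

    if (gid < n)
        out[gid] = alpha * tile[threadIdx.x];         // subsequent work hits shared memory
}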

2.4 Figures

2.5 The Design of CUSH

CUSH is designed to function as an abstraction layer that provides high-level control over host-GPU memory operations as well as GPU-GPU memory operations. As such, CUSH is independent of any kernels that may run on the GPUs.

Figure 2.1: A high-level overview of GPU memory. Cooperative Thread Array (CTA) access to local memory is much faster than access to global memory. The difference in access times can be as high as 100x.

In order to create the abstraction between host-GPU memories, CUSH utilizes the common programming construct of a "plan." These plans dictate the mapping between host memory and global GPU memory and are automatically created by CUSH based on minimal input from the programmer. CUSH plans are similar to MPI communicators because they represent the group of GPUs involved in operating on a given piece of data. The GPUs defined within a plan act as endpoints in CUSH memory operations.

To simply illustrate some of the capabilities of CUSH, a typical GPU C++ code fragment is presented in both CUDA (Listing 2.1) and in CUSH (Listing 2.2). Both listings uniformly distribute a host buffer across four GPUs as shown in Figure 2.2, then call a wrapper function for an arbitrary CUDA kernel, and gather the results into a separate host buffer. Compared to the CUDA implementation, the CUSH implementation requires less than half the lines of code to accomplish the same functionality. Furthermore, the CUSH implementation masks the pointer arithmetic and loops that are present in the CUDA example, reducing complexity and thus making it more attractive to computational scientists and engineers.

float **dev_ptrs = (float **) malloc(GPU_N * sizeof(float *));
cudaStream_t streams[GPU_N];
int dev_size = host_size / GPU_N;   // elements per GPU

// Distribute the host buffer uniformly across the GPUs.
for (int i = 0; i < GPU_N; i++) {
    cudaSetDevice(i);
    cudaStreamCreate(&streams[i]);
    cudaMalloc(&dev_ptrs[i], dev_size * sizeof(float));
    cudaMemcpyAsync(dev_ptrs[i], &host_buf[i * dev_size],
                    dev_size * sizeof(float),
                    cudaMemcpyHostToDevice, streams[i]);
}

exec_kernel(dev_ptrs, dev_size, streams);

// Gather the results from every GPU into a separate host buffer.
for (int i = 0; i < GPU_N; i++) {
    cudaSetDevice(i);
    cudaMemcpyAsync(&host_recv_buf[i * dev_size], dev_ptrs[i],
                    dev_size * sizeof(float),
                    cudaMemcpyDeviceToHost, streams[i]);
}

Listing 2.1: CUDA C++ code to uniformly distribute a host buffer across four GPUs, call a kernel, and gather all of the GPU data back to a separate host buffer.

cush_plan unif_plan;
int dev_ids[GPU_N] = {0, 1, 2, 3};

cush_plan_create(&unif_plan, FLOAT, GPU_N, &dev_ids, host_size, UNIF);

cush_memcpyHtoD(host_buff, &unif_plan);
exec_kernel(&unif_plan);
cush_memcpyDtoH(host_recv_buff, &unif_plan);

Listing 2.2: CUSH C++ code to uniformly distribute a host buffer across four GPUs, call a kernel, and gather all of the GPU data back to a separate host buffer.

Figure 2.2: A high level overview of how a buffer in host memory might be uniformly distributed across four GPUs. In this case, if the host buffer contains N elements and N%4 == 0, the device buffers on each device will contain a subset of buff containing N/4 elements. All of the information pertinent to the memory and the distribution mapping is contained within a CUSH plan.

The core of the mapping performed by CUSH is the function specifying how the host buffer will be distributed among the associated devices. CUSH supplies a uniform distribution, a memory capacity distribution, and a memory bandwidth distribution. The uniform distribution distributes the data as uniformly as possible among devices. The memory capacity and bandwidth distributions distribute data buffers among devices in the same proportions as the devices' capacity or bandwidth, respectively. Although the uniform distribution is frequently used, users of CUSH can define any data distribution function that they require CUSH to perform. Such distributions could include mapping tridiagonal matrices, sparse matrices, or a load-balancing distribution.
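To illustrate the kind of decision such a distribution function encodes, the following sketch (plain C++, not part of the CUSH API; the bandwidth_partition name and its interface are hypothetical) apportions an n-element buffer in proportion to per-device memory bandwidth, in the spirit of the bandwidth distribution described above.

#include <cstddef>
#include <numeric>
#include <vector>

// Returns counts[i], the number of elements assigned to GPU i, proportional to
// that device's measured bandwidth; any rounding remainder goes to the last
// device so the counts always sum to n.  Assumes a non-empty bandwidth vector.
std::vector<std::size_t> bandwidth_partition(std::size_t n,
                                             const std::vector<double> &bandwidth)
{
    double total = std::accumulate(bandwidth.begin(), bandwidth.end(), 0.0);
    std::vector<std::size_t> counts(bandwidth.size(), 0);
    std::size_t assigned = 0;
    for (std::size_t i = 0; i < bandwidth.size(); ++i) {
        counts[i] = static_cast<std::size_t>(n * (bandwidth[i] / total));
        assigned += counts[i];
    }
    counts.back() += n - assigned;
    return counts;
}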

The memory plans constructed through CUSH enable memory functions between host-GPU and GPU-GPU memories that are more advanced than basic point-to-point memory copies. These advanced memory functions between host-GPU memories are described in Table 2.1, while Table 2.2 describes advanced functions that operate among GPU-GPU memories. The CUSH functions differ from their CUDA counterparts in that all of the CUSH functions in Table 2.1 are operational across host memory and across any number of GPU memories defined in a CUSH plan. This functionality is a byproduct of the fact that information describing the distribution mapping is maintained within a CUSH plan that can be formed using any data distribution with any subset of GPUs present in the system.

CUSH Function | Description | CUDA analogue
Allocate | Allocates device memory according to the distribution function. | cudaAllocHost
Free | Frees device memory according to the distribution function. | cudaFreeHost
HtoD memcpy | Asynchronously distributes a host buffer among all devices according to the distribution function. This can be thought of as a scatter operation. | cudaMemcpy (async)
DtoH memcpy | Asynchronously collects all data buffers on all devices into one host buffer according to the distribution function. This can be thought of as a gather operation. | cudaMemcpy (async)

Table 2.1: Descriptions of CUSH memory functions that operate between host memory and global GPU memory. The CUDA analogue of each function is reported as well.

DtoD memcpy: Asynchronously copies a data buffer on a device to a buffer on another device. (CUDA analogue: cudaMemcpyPeer (async))
Broadcast: Duplicates a data buffer among all devices. (CUDA analogue: none)
Scatter: Divides a data buffer among all devices. (CUDA analogue: none)
Gather: Collects "scattered" data buffers among all devices onto one device. (CUDA analogue: none)
All-to-all: Each device divides its own data buffer among all devices. (CUDA analogue: none)

Table 2.2: Descriptions of CUSH memory functions that operate among global GPU memories. None of these functions have CUDA analogues except for the CUSH device-to-device (DtoD) memcpy, which functions identically to its CUDA analogue but uses CUSH plans to perform the operation.


2.6 Test Methodology and Case Studies

To test CUSH we implement three common computational kernels: matrix multiplication, a single-component single-phase lattice-Boltzmann fluid flow simulator, and the fast Fourier transform (FFT). Pseudocode listings for CUDA and CUSH versions of each of these case studies are given in Appendix A. For each case study, we compare the performance of the multi-GPU CUDA implementation to the multi-GPU CUSH implementation. A single-GPU version using CUDA is the reference implementation. The multi-GPU implementations are tested using 1, 2, and 4 GPUs.
We use these three case studies because each has a different ratio of computation to communication. The matrix multiplication and FFT studies act as extreme examples. At one extreme, the matrix multiplication is entirely computation-bound, as it does not require any communication between GPUs. Conversely, the FFT is computed quickly, but it requires all of the data to be communicated among all GPUs in an all-to-all operation, so the execution time of the FFT is dominated by communication. The lattice-Boltzmann fluid flow simulation, although memory bandwidth bound, requires computation that scales with the volume of the three-dimensional lattice and communication that scales with the surface area of the lattice. Furthermore, the lattice-Boltzmann simulation only requires communication between processors that are "adjacent" to one another with respect to the global lattice. As such, the lattice-Boltzmann simulation may be viewed as a typical computational science application that falls somewhere in between the end-member cases described above.
All case study implementations are tested using NVIDIA Fermi M2070 GPUs of compute capability 2.0. These GPUs are contained in a PCI-E expansion chassis connected to a host via an x16 PCI-E Gen. 2 bus. The host is composed of dual Intel Xeon L5675 six-core CPUs, each at 3.07 GHz, and 96 GiB of shared memory. The host code is compiled using the GNU g++ compiler version 4.4.5 with the compiler flag "-O3" for optimizations, and the CUDA kernel code is compiled using NVIDIA's NVCC compiler release 4.1 V0.2.1221 with the compiler flag "--use_fast_math".

2.6.1 Program Complexity Measurement

The primary objective of this study is to measure and compare program complexity. This is not meant to be a measure of algorithmic complexity in the same manner as big-O analyses. Instead, it is an attempt to convey the difficulty one may encounter when trying to understand a given program. To do this, we use two metrics, non-commented source code statements (NCSS) and McCabe's cyclomatic complexity (MCC) (McCabe, 1976). These metrics originate in software engineering and have proven to be effective indicators of program complexity (Vander Wiel et al., 1998). NCSS is simply a sum of the non-commented source code statements. MCC is a measure of complexity within the program's control flow graph. MCC is calculated by

V(G) = e - n + 2,   (2.1)

where e is the number of edges and n is the number of nodes in the program control flow graph, G. MCC can be thought of as the number of independent paths through a program. Figure 2.3 provides an example of three different control flow graphs to conceptualize MCC more intuitively. With this method it can be shown that Listings 2.1 and 2.2 have MCC measures of 3 and 1, respectively, illustrating again the simpler code that results from using CUSH. NCSS and MCC for all case studies are measured using pmccabe 2.7 (Bame, 2011).
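For a single-entry, single-exit program, Equation 2.1 is equivalent to counting decision points (branch and loop tests) and adding one, which gives a quick check of the values quoted above. As a small worked example (the graph sizes here are chosen for illustration and are not taken from Figure 2.3): straight-line code forms a chain of n nodes with e = n - 1 edges, while adding one if/else yields the five-node, five-edge graph {entry, condition, then, else, join}, so

\[
  V(G)_{\text{straight line}} = (n - 1) - n + 2 = 1, \qquad
  V(G)_{\text{one branch}} = 5 - 5 + 2 = 2.
\]

Counting decisions in the listings matches the values reported above: Listing 2.1 contains two for loops, so V(G) = 2 + 1 = 3, while Listing 2.2 has no control flow, so V(G) = 1.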

2.6.2 Statistical Comparison of Performance

We also measure any detrimental effects that CUSH may have on performance. To do so we run both strong and weak scaling tests (Hennessy and Patterson, 2007). When comparing performance, we use strong scaling tests, where the problem size remains fixed and the number of GPUs involved in the computation varies. We use weak scaling tests to show the advantage of the additional memory provided by multiple GPUs. For averaging and statistical analysis, each test is repeated six times; the final reported values are the arithmetic mean of the repetitions, discounting the first run due to potential transient behavior.
The performance curves for the CUDA and CUSH multi-GPU implementations are compared with an f-test that is equivalent to the final step of a 1-factor ANOVA (Lilja, 2000). This statistical test is run for each multi-GPU test pair (1, 2, or 4 GPUs) within each case study.


Figure 2.3: An illustration of three simple control flow graphs with different MCC measurements. A) shows a program consisting of a series of statements with an MCC of 1. B) shows a program consisting of a series of statements interspersed with if statements with an MCC of 3. C) shows a program consisting of a series of statements interspersed with loop constructs with an MCC of 3. Intuitively, it would be less difficult to understand the flow of (A) than the flow of (B), (C), or combinations thereof.

The f-test compares the curve pairs to determine whether the CUSH implementation is statistically different from the CUDA implementation. This is accomplished by comparing the calculated f value (f_calc) to the 95% confidence F distribution value (F). If f_calc is greater than F, it can be concluded with 95% confidence that the curves are statistically different. To compute f_calc, we use

f_{calc} = \frac{(SSE_{co} - SSE_{CUSH})/SSE_{CUSH}}{(DF_{co} - DF_{CUSH})/DF_{CUSH}},   (2.2)

where SSE_co is the sum of squared errors for the combined population, SSE_CUSH is the sum of squared errors for the CUSH population, DF_co is the degrees of freedom for the combined population, and DF_CUSH is the degrees of freedom for the CUSH population.
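As a minimal illustration of Equation 2.2, the sketch below evaluates f_calc for hypothetical sums of squared errors and degrees of freedom (the numbers are placeholders, not measurements from this study); the result would then be compared against the 95% F critical value from a standard table or statistics package.

#include <stdio.h>

/* Evaluate Equation 2.2 for a combined fit and a CUSH-only fit. */
static double f_calc(double sse_co, double sse_cush, double df_co, double df_cush)
{
    return ((sse_co - sse_cush) / sse_cush) / ((df_co - df_cush) / df_cush);
}

int main(void)
{
    /* Hypothetical values for illustration only. */
    double f = f_calc(12.4, 11.9, 38.0, 36.0);
    printf("f_calc = %f\n", f);   /* compare against the 95% F critical value */
    return 0;
}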

2.6.3 Matrix Multiplication

This case study compares a reference CUDA matrix multiplication with a multi-GPU implementation written entirely in CUDA and a multi-GPU implementation using CUSH.

All implementations use NVIDIA's CUBLAS to perform the necessary linear algebra routines. The matrix multiplication performed for this case study is a standard A × B = C operation, where A is of size M × N and B is of size N × P, giving a resulting matrix, C, of size M × P. To exploit multiple GPUs, a simple tiled matrix multiplication algorithm is used (NVIDIA, 2011b). The core of this algorithm is to create a tiling of matrix C and then assign each tile to a GPU. From the tiling assignments, the required rows and columns of matrices A and B can be transferred to the appropriate GPUs to calculate the assigned tiles of C. Figure 2.4 illustrates this process.

Figure 2.4: An illustration of tiled matrix multiplication for A×B = C. The relevant rows from matrix A and columns from matrix B are used to calculate the appropriate tile in C.
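The sketch below illustrates one plausible bookkeeping step for this tiling; the 2 x 2 tile grid, the structure names, and the assumption that the grid divides C evenly are illustrative choices, not details taken from the implementation evaluated here.

#include <stdio.h>

/* Row/column ranges of A and B needed to compute one tile of C = A x B. */
typedef struct {
    int row_start, row_end;   /* rows of A (and of the C tile), [start, end) */
    int col_start, col_end;   /* columns of B (and of the C tile), [start, end) */
} tile_t;

int main(void)
{
    int M = 400, P = 400;               /* C is M x P; sizes chosen for illustration */
    int grid_rows = 2, grid_cols = 2;   /* four GPUs arranged as a 2 x 2 tile grid */

    for (int gpu = 0; gpu < grid_rows * grid_cols; gpu++) {
        tile_t t;
        int tr = gpu / grid_cols;       /* tile row index */
        int tc = gpu % grid_cols;       /* tile column index */
        t.row_start = tr * (M / grid_rows);
        t.row_end   = (tr + 1) * (M / grid_rows);
        t.col_start = tc * (P / grid_cols);
        t.col_end   = (tc + 1) * (P / grid_cols);
        printf("GPU %d: rows [%d,%d) of A, columns [%d,%d) of B\n",
               gpu, t.row_start, t.row_end, t.col_start, t.col_end);
    }
    return 0;
}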

Strong scaling test cases are composed of matrices A, B, and C of sizes 400n × 800n, 800n × 400n, and 400n × 400n, respectively, where n ranges from 1 to 20. The weak scaling test cases scale n by the number of GPUs and are limited by the size of the GPU memory. Matrix multiplication performance is calculated in terms of billions of floating point operations per second (GFlops). This metric is calculated by

GFlops = \frac{M \times N \times P}{t} \times 10^{-9},   (2.3)

where M and N are the number of rows and columns in matrix A, respectively, P is

the number of columns in matrix B, and t is the execution time in seconds.

2.6.4 Lattice-Boltzmann Fluid Flow

The lattice-Boltzmann method is a popular technique for modeling the mechanics of complex fluids (Aidun and Clausen, 2010; Walsh et al., 2009a; Bailey et al., 2009; Tölke, 2008; Myre et al., 2011). The method has a strong spatial locality of data access and is of the structured grid computing class (Asanovic et al., 2006). Consequently, it is an excellent candidate for parallelization in single-instruction multiple-data (SIMD) environments, as provided by GPUs (Myre et al., 2011; Walsh et al., 2009a; Bailey et al., 2009; Tölke, 2008). The lattice-Boltzmann implementation analyzed in this study is based on the widely employed D3Q19 lattice model (Figure 2.5), comprising a three-dimensional lattice (D3) with nineteen distinct fluid packet directions per node (Q19) (Qian et al., 1992). The fluid packets move about the lattice in a two-stage process: a streaming step in which the fluid packets are copied from one node to the next, and a collision step in which the fluid packets arriving at each node are redistributed according to a local collision rule. The lattice-Boltzmann implementation analyzed here simulates the Poiseuille flow of a single-component, single-phase (SCSP) fluid through a rectangular conduit for 1000 time steps.
Strong scaling test cases are composed of cuboid lattices, M × M × 256, where M ranges from 32 to 320 in increments of 32. Multi-GPU tests decompose the lattice along the Z axis such that each GPU computes a sub-lattice of M × M × 256/P, where P is the number of GPUs involved in the computation. Weak scaling tests consist of each GPU receiving a cubic lattice, M × M × M, where M ranges from 32 to 320 in increments of 32. The total lattice size for weak scaling tests is M × M × (M × nGPU). The boundary conditions are no-slip along the X and Y axes and periodic along the Z axis. The multi-GPU tests require communication between GPUs computing adjacent sub-lattices.
As is common practice, performance of the lattice-Boltzmann method is reported in millions of lattice updates per second (MLUPS). This metric indicates the number of lattice site collision and streaming steps performed in one second and is calculated by

MLUPS = \frac{X \times Y \times Z}{t} \times 10^{-6},   (2.4)

where X, Y, and Z are the number of lattice nodes in each respective dimension, and t is the execution time in seconds.

Figure 2.5: An illustration of the D3Q19 lattice structure surrounding a single center (black) node.
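A minimal sketch of the bookkeeping implied by this Z-axis decomposition and by Equation 2.4 is shown below; the variable names, the placeholder execution time, and the assumption that P divides the Z extent evenly are illustrative, not taken from the tested code.

#include <stdio.h>

int main(void)
{
    int M = 128, Z = 256;   /* strong scaling lattice: M x M x 256 (example M) */
    int P = 4;              /* number of GPUs */

    /* Each GPU owns a contiguous M x M x (Z/P) slab along the Z axis. */
    for (int gpu = 0; gpu < P; gpu++) {
        int z_start = gpu * (Z / P);
        int z_end   = (gpu + 1) * (Z / P);
        printf("GPU %d: z in [%d,%d)\n", gpu, z_start, z_end);
    }

    /* Equation 2.4 with a placeholder execution time t (seconds). */
    double t = 0.004;
    double mlups = (double)M * M * Z / t * 1e-6;
    printf("MLUPS = %.1f\n", mlups);
    return 0;
}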

2.6.5 Fast Fourier Transform

This case study calculates a large 1D single precision complex Fast Fourier Transform (FFT). The batched, mixed-radix Cooley-Tukey style algorithm (Cooley and Tukey, 1965) is used to achieve parallelism across multiple GPUs and is described by:

1. Decompose a size N FFT into P FFTs of size Q, where P × Q = N. This can be viewed as a matrix of P rows and Q columns.

2. Perform P FFTs of size Q along the rows.

3. Multiply the results by the appropriate twiddle factors.

4. Transpose the matrix.

5. Perform Q FFTs of size P along the rows.

6. Finally, remap the matrix into the original buffer.

A similar approach is used to calculate large FFTs on GPUs in (Govindaraju et al., 2008; Gu et al., 2011; Chen et al., 2010).
To compute these FFTs across multiple GPUs, the first dimension of the FFT matrices in steps 1 and 5 is further divided by the number of GPUs involved. Supposing that K GPUs are involved in the computation, the necessary FFT matrix dimensions for each GPU would be P/K × Q and Q/K × P for steps 2 and 5, respectively. Furthermore, the matrix transpose that takes place in step 4 equates to an all-to-all memory operation among the GPUs involved in the computation.
For strong scaling tests, the FFT is of size 2^N, where N ranges from 10 to 25 in increments of 1. This range is limited to 2^25 due to the limitations of the CUDA FFT implementation (CUFFT) used in the GPU FFT computational kernel. Swapping CUFFT for any other available FFT kernel (Govindaraju et al., 2008; Gu et al., 2011; Chen et al., 2010) would change this maximum range as well as computational performance. The weak scaling tests provide each GPU with a signal of length 2^N, where N ranges from 10 to 25 in increments of 1. This gives total FFT size ranges of 2^11 to 2^26 for two GPUs and 2^12 to 2^27 for four GPUs. Again, the upper bound on per-GPU FFT size is only limited by the usage of CUFFT.
Performance of the FFT computations is reported in GFlops. By convention (Govindaraju et al., 2008; Gu et al., 2011; Chen et al., 2010), the FFT GFlops metric is calculated as

GFlops = \frac{5M \sum_{i=1}^{D} \log_2(N_i)}{t} \times 10^{-9},   (2.5)

where M is the total size of the FFT, N_i is the length of the FFTs during round i, D is the total number of rounds, and t is the execution time in seconds. In terms of the algorithm described above, this calculation becomes

GFlops = \frac{5N(\log_2(Q) + \log_2(P))}{t} \times 10^{-9},   (2.6)

where N is the total size of the FFT, Q is the size of each FFT in round 1, P is the size of each FFT in round 2, and t is the execution time in seconds.
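As a concrete illustration (the sizes here are hypothetical and chosen only to give round numbers), consider a transform of total size N = 2^20 decomposed with P = Q = 2^10 on K = 4 GPUs. Step 2 then assigns each GPU a (P/K) × Q = 256 × 1024 sub-matrix, step 5 a (Q/K) × P = 256 × 1024 sub-matrix, and Equation 2.6 evaluates to

\[
  \mathrm{GFlops} = \frac{5 \cdot 2^{20} \left(\log_2 2^{10} + \log_2 2^{10}\right)}{t} \times 10^{-9}
                  = \frac{5 \cdot 2^{20} \cdot 20}{t} \times 10^{-9}
                  \approx \frac{0.105}{t},
\]

so a run completing in one tenth of a second, for example, would be credited with roughly 1.05 GFlops.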

2.7 Discussion of Results

The complexity analyses across all case studies yield similar and consistent results. In all analyses, multi-GPU CUDA and CUSH implementations increase both NCSS and MCC measures of program complexity relative to the single GPU implementation. However, multi-GPU implementations using CUSH consistently produce lower program complexities than multi-GPU CUDA for both metrics, as shown in Figure 2.6.
Examining the normalized MCC complexity values in Figure 2.6A shows near uniform spacing between the test cases for each case study. This suggests that CUSH provides a consistent reduction in normalized MCC complexity. That the reduction is consistent for a normalized metric implies that the reduction provided by CUSH scales with some aspect of the code. Examining each case study reveals similar behavior for the number of host-GPU memory operations. Intuitively, this means that for these case studies, the reduction of MCC complexity that CUSH can provide scales closely with the number of host-GPU memory operations necessary for the computation. The behavior of CUSH due to device-device memory operations remains untested.
In contrast, the NCSS results do not share the same consistency as the MCC results. This behavior is related to the decomposition of the problem. The reductions in NCSS for the lattice-Boltzmann and FFT CUSH implementations are similar, while the matrix multiplication case exhibits more than twice the reduction in NCSS complexity. This is because the matrix multiplication tile decomposition introduces additional buffers that need to be managed; the lattice-Boltzmann method and the FFT do not have a comparable decomposition that requires additional buffers.
The performance results for all three case studies are shown in Figures 2.7, 2.8, and 2.9. An important shared feature across all of the performance plots is the statistically negligible difference between the CUDA and CUSH multi-GPU implementation performances. This is demonstrated via the f-test described in Section 2.6. For every CUDA-CUSH multi-GPU performance curve pair, the f-test results in the acceptance of the null hypothesis that there is no statistical difference between the performance curves at 95% confidence. This means that at 95% confidence, the usage of CUSH has no statistically meaningful impact on performance compared to using pure CUDA to implement these multi-GPU codes.



Figure 2.6: Plots showing multi-GPU implementation MCC (A) and NCSS (B) measurements for each case study normalized to the single GPU implementation. Measures of both metrics increase for each case study. However, CUSH consistently shows an improvement over pure CUDA.

Effectively, the variation in performance due to noise in the system is greater than the variation due to CUSH overhead. It should be noted that statistical measures, such as confidence intervals and variance, are not plotted since they would overlap and would be indiscernible for the majority of the data points at the scale of the plots.
Another important feature to examine across all case studies is strong scaling behavior. The matrix multiplication study exhibits the expected behavior of a purely computational code: performance increases with each additional GPU. Performance increases similarly in the lattice-Boltzmann study, although the communication requirements impose a "lag" until a critical computational mass is reached where the multi-GPU implementations outperform the single GPU implementation. This is entirely due to the ratio of the computation and communication components of the lattice-Boltzmann simulation.

Figure 2.7: Strong scaling performance results for the matrix multiplication case study.

Figure 2.8: Strong scaling performance results for the lattice-Boltzmann fluid flow case study.

Figure 2.9: Strong scaling performance results for the FFT case study.

The strong scaling behavior for the FFT is poor since the multi-GPU FFT implementation requires a transpose of all data across all involved GPUs (an all-to-all operation). To date, this operation has not been well studied or understood within the context of multiple GPUs on a PCI-E bus. Our implementation reflects this, as copying all of the data back to the host, performing the transpose, and copying all of the data back to the GPUs consistently outperformed an implementation that performed the transpose on, and among, the GPUs with no data transfer between the host and the GPUs.
It is important to note that limiting these tests to strong scaling tests would ignore a critical reason to implement multi-GPU codes: increasing the available GPUs in a system means that the system is capable of solving correspondingly larger problems. This type of focus, on increasing both problem size and computational resources (weak scaling), is common in computational science, where the problems to be investigated often have to be limited to those that fit in the available computational memory space (Walsh et al., 2009a). Figures 2.10, 2.11, and 2.12 show the weak scaling behavior of the matrix multiplication, lattice-Boltzmann, and FFT case studies, respectively.
All multi-GPU implementations are capable of computing larger problem sizes than the single-GPU implementation. However, an increase in problem size does not necessarily correspond to an increase in performance. The matrix multiplication and FFT case studies do not demonstrate any improvement in performance: the matrix multiplication study quickly reaches computational saturation on the GPU, while the FFT case study instead reaches saturation of the communication capabilities of the PCI-E bus. The lattice-Boltzmann case study, however, has a favorable ratio of computation and communication and exhibits good weak scaling behavior.

Figure 2.10: Weak scaling behavior of the matrix multiplication case study. Aside from the ability to handle larger matrices the performance here is very similar to the strong scaling performance.

Figure 2.11: Weak scaling behavior of the lattice-Boltzmann fluid flow case study. Notice the ability of multi-GPU implementations to handle increased problem sizes while continuing to increase performance.

Figure 2.12: Weak scaling behavior of the FFT case study. Due to the limitations of CUFFT, the size of the FFT per GPU is limited to 2^25. This limits the two and four GPU tests to maximum sizes of 2^26 and 2^27, respectively.

2.8 Conclusions and Future Work

GPUs have been accepted and adopted by the HPC community as computational accelerators with good returns, and the trend towards adopting multiple GPUs per host promises even greater returns. However, the difficulty of implementing multi-GPU codes remains a barrier to adoption. CUSH reduces this difficulty by acting as an abstraction layer that allows research scientists to easily manage memory operations between the host-GPU and GPU-GPU memory spaces.
Although these results are limited to three case studies (matrix multiplication, the lattice-Boltzmann fluid flow method, and the FFT), they begin to show that CUSH consistently reduces NCSS and MCC program complexity for multi-GPU codes. This means that CUSH provides researchers with a programming interface for host-GPU and GPU-GPU memories that is consistently simpler than that of standard CUDA. We further demonstrate, with 95% confidence, that the use of CUSH has no statistically meaningful impact on performance compared to CUDA multi-GPU implementations for these studies.

Additional testing, using different GPUs, host systems, and case studies, would provide insight into the generality of the impact of CUSH.
Our future work will involve examining the inter-device operation space. This will include extending the distribution functions and advanced memory functions available in CUSH and examining the performance of these operations. Additional memory functions could include prefix sums, reductions, and all-to-all broadcasts. By examining the performance of these memory functions, we expect that the underlying physical layout of GPUs on the PCI-E bus could be determined and an optimal communication ordering could be obtained. We will also continue to extend the adoption of CUSH by providing support for CUSH-enabled community multi-GPU codes. We will also continue to develop an OpenCL-based version of CUSH for use as a basis for developing new, low-impact optimization tools for heterogeneous computing systems. Confidence in the statistically low-impact nature of these tools can be established more quickly given the results of this study.

Chapter 3

A Lattice-Boltzmann Method for Fluid Flow

This chapter is composed of two publications:
J. Myre, S.D.C. Walsh, D. Lilja, M.O. Saar, Performance analysis of single-phase, multiphase, and multicomponent lattice-Boltzmann fluid flow simulations on GPU clusters, Concurrency and Computation: Practice and Experience 23 (2011) 332-350.
P. Bailey, J. Myre, S.D.C. Walsh, M.O. Saar, D.J. Lilja, Accelerating Lattice Boltzmann Fluid Flow Simulations Using Graphics Processors, International Conference on Parallel Processing (ICPP), Vienna, Austria, 2009.

These publications are included with permission from John Wiley and Sons and IEEE, respectively.

The lattice-Boltzmann method is well suited to implementation in the single-instruction multiple-data (SIMD) environments provided by general purpose graphics processing units (GPGPUs). Using the D3Q19 model, this Chapter demonstrates improvements upon prior GPU LBM results (Habich, 2008) by increasing GPU multiprocessor occupancy, resulting in an increase in maximum performance of 20%, and by introducing a space-efficient storage method which reduces GPU RAM requirements by 50% at a slight detriment to performance (Bailey et al., 2009). Both GPU versions are over 28 times faster than a quad-core CPU version utilizing OpenMP. This Chapter also discusses the integration of these programs with OpenMP to create lattice-Boltzmann applications for multi-GPU clusters from (Myre et al., 2011). In addition to the standard single-phase single-component lattice-Boltzmann method, the performance of more complex multiphase and multicomponent models is also examined. The contributions of various GPU lattice-Boltzmann parameters to performance are examined and quantified with a statistical model of performance using Analysis of Variance (ANOVA). By examining single- and multi-GPU lattice-Boltzmann simulations with ANOVA, we show that all of the lattice-Boltzmann simulations primarily depend on effects corresponding to simulation geometry and decomposition, not on architectural aspects of the GPU. Additionally, using ANOVA we confirm that the metrics of Efficiency and Utilization are not suitable for memory-bandwidth-dependent codes.

3.1 Introduction

The lattice-Boltzmann method is a popular technique for modeling the fluid mechanics of complex fluids (Aidun and Clausen, 2010): notably, in the simulation of turbulent flows (Succi et al., 1991), fluid systems comprising multiple phases (e.g., liquid and gas) and different fluid components (e.g., oil and water) (Schaap et al., 2007; Briant et al., 2004; Dawson et al., 1993; Shan and Chen, 1993), as well as the simulation of fluids and fluid mixtures through porous media with complex domain geometries (Succi, 2001; Shan and Chen, 1993; Aharonov and Rothman, 1993; Walsh and Saar, 2010; Ferréol and Rothman, 1995; Pan et al., 2004). Examples of the diverse applications of LBM include modeling flow in complex pore space geometries (Raabe, 2004) and in a static mixer (Kandhai et al., 1999). LBM codes have also been used in problems such as geofluidic flows, e.g., gas, water, oil, or magma flow through porous media (Walsh et al., 2009a) (Figure 3.1), macro-scale solute transport (Walsh and Saar, 2010), the dispersion of airborne contaminants in an urban environment (Qui et al., 2006), impact effects of tsunamis on near-shore infrastructure (Krafczyk and Tölke, 2007), melting of solids and resultant fluid flow in ambient air (Zhao et al., 2006), and the determination of the permeability of materials (Bosl et al., 1998; Walsh et al., 2009a).
LBM simulations are modeled within a one, two, or three dimensional lattice, each node in the lattice representing a quantized region of space containing either fluid or solid. The fluid is simulated by fluid packets that propagate through the lattice in discrete time steps and collide with each other at lattice points. As collisions are restricted to the local lattice nodes, the collision computation only depends on data from neighboring nodes. This strong spatial locality of data access makes LBM an excellent candidate for parallelization, although LBM simulations are not "embarrassingly" parallel in nature since communication between neighbors is required at each timestep. The lattice-Boltzmann method belongs to the structured grid computing class (Asanovic et al., 2006) and is hence particularly suited to implementation in single-instruction multiple-data (SIMD) environments, as provided by general purpose Graphics Processing Units (GPUs) (Bailey et al., 2009; Tölke, 2008).
Early lattice-Boltzmann fluid flow applications using graphics cards employed a graphics-pipeline or "shader" approach.

Figure 3.1: Flow in porous media during one timestep of a lattice-Boltzmann simulation.

Possibly the first such model was created by Li et al. (2003), who achieved an impressive 9.87 million lattice-node updates per second (MLUPS), approximately a 50x speedup over single core implementations at that time. Nevertheless, significant drawbacks in the shader programming approach, such as reduced precision (e.g., Li et al. (2003) employed fixed 8-bit precision), remained. Furthermore, the requirement that the algorithm be cast in terms of graphics operations hindered the programmer's ability to develop more complex lattice-Boltzmann models, such as multiphase and multicomponent fluid flow simulations, and presented a significant barrier to the widespread adoption of GPU-based programs (Owens et al., 2007).
However, these problems have largely dissipated as general purpose graphics programming has matured. The barriers to adoption were reduced with the development of shader programming APIs such as Sh (McCool et al., 2002, 2004), HLSL (Peeper and Mitchell, 2003), and Cg (NVIDIA, 2002). These tools allowed more sophisticated manipulation of the graphics pipeline through access to programmable vertex and fragment processors, as well as reducing the translation requirements by providing libraries for certain operations (Buck et al., 2004). GPGPU programming was further improved with the next generation of general purpose graphics programming environments such as BrookGPU (Buck et al., 2004), the ATI CTM platform (AMD, 2006), the Compute Unified Device Architecture (CUDA) programming model released by NVIDIA (2008b), and the recently released OpenCL standard (Khronos OpenCL Working Group, 2009).

These environments have all but erased the underlying graphics origins of the programming model. For example, CUDA, the general purpose GPU programming environment discussed throughout this paper, provides extensions to the C programming language that allow the programmer to write kernels: functions executed in parallel on the GPU. CUDA also provides new functionality that distinguishes it from shaders (e.g., random access byte-addressable memory and support for coordination and communication among processes through thread synchronization and shared memory), thereby allowing more efficient processing of complex data dependencies. Finally, CUDA also supports single and double precision, and IEEE-compliant arithmetic (NVIDIA, 2008b). These advances have extended the applicability of GPU computation to a much broader range of computational problems in science and engineering (Walsh et al., 2009a).
The release of this next generation of General Purpose Graphics Processing Unit (GPGPU) programming environments has enabled significant advances in GPU lattice-Boltzmann implementations. Tölke (2008) reported speeds of up to 568 MLUPS for the two-dimensional nine-vector (D2Q9) lattice-Boltzmann model and 592 MLUPS for the three-dimensional thirteen-vector (D3Q13) lattice-Boltzmann model using the first generation 8800 Ultra (Tölke and Krafczyk, 2008). Bailey et al. (2009) achieved speeds of up to 300 MLUPS for the three-dimensional nineteen-vector (D3Q19) lattice-Boltzmann model using the 8800 GTX. More recently, the release of NVIDIA's Tesla class of GPUs, specifically designed for scientific and engineering computation, has enabled even greater performance gains, a development that should continue as GPUs themselves continue to evolve.
However, some limitations remain. For example, the maximum simulated lattice size in these single-GPU implementations is limited by the available GPU memory. In addition, it is unclear to what extent the performance gains seen on single GPUs can be sustained on larger lattices distributed across multiple CPUs hosting several GPUs. Furthermore, although the latest GPGPU programming environments improve programmability by masking much of the complexity of modern GPUs, achieving high-performance code is still difficult. Many of the factors affecting performance result in non-linear responses, making program optimization difficult. Because of this non-linearity, it is a difficult task to relate analytically the number of MLUPS to the important factors of the hardware and problem.

Moreover, it seems likely that even if such a model were found, it would become obsolete with the introduction of new generations of GPU hardware and compilers.
This chapter improves upon existing GPU LBM implementations by two non-orthogonal methods: 1) providing an increase in performance primarily by increasing GPU multiprocessor occupancy, and 2) introducing a memory access technique to reduce memory space requirements by 50%, allowing efficient simulation of much larger lattices on the GPU. This chapter also provides a simple model of performance through the Analysis of Variance (ANOVA) (Lilja, 2000) statistical design of experiments to uncover which factors and interdependencies are key to overall system performance. While not as concrete as an analytical performance model, this approach provides the developer with a statistical measure to better direct their optimization efforts. Additionally, we compare the results of our model with the results obtained using the program optimization carving method from Ryoo et al. (2008b).

3.2 Lattice-Boltzmann Computation

Lattice-Boltzmann methods represent fluid flow via discrete particle distribution functions that are analogous to the single-body particle distribution functions described by the classic Boltzmann equation (Boltzmann, 1872).

The particle distribution functions are represented by fluid packets, f_i, that move about a regular node lattice in discrete time steps. Interactions between fluid packets are restricted to individual nodes and their lattice neighbors. The simulations described in this paper are based on the widely employed D3Q19 lattice-Boltzmann model, comprising a three-dimensional lattice (D3) with nineteen distinct fluid packet directions per node (Q19) (Qian et al., 1992) (Figure 3.2). The fluid packets move about the lattice in a two-stage process: a streaming step in which the f_i are copied from one node to the next, and a collision step in which the f_i arriving at each node are redistributed according to a local collision rule. For the model employed in this paper, these steps are summarized by:

f_i(x + c_i \Delta t, t + \Delta t) = (1 - \lambda) f_i(x, t) + \lambda f_i^{eq}(x, t),   (3.1)

where c_i is the lattice velocity associated with f_i, λ is the collision frequency, and f_i^{eq} is the local pseudo-equilibrium particle density distribution function (Equation 3.2).

Figure 3.2: D3Q19 lattice interconnection pattern

This collision step is based on the widely used single-relaxation BGK approximation (Qian et al., 1992; Bhatnagar et al., 1954), though it should be noted that multiple-relaxation methods have also been implemented on graphics cards with considerable success (Tölke, 2008; Tölke and Krafczyk, 2008). For the single-component single-phase (SCSP) model, using the D3Q19 lattice, f_i^{eq} is given by

f_i^{eq} = \rho w_i \left[ 1 + 3 u \cdot c_i \left( 1 + 3 u \cdot c_i / 2 \right) - 3 u \cdot u / 2 \right],   (3.2)

where w_i are weighting constants specific to the lattice velocities, \rho = \sum_i f_i is the macroscopic fluid density, and u = \frac{1}{\rho} \sum_i f_i c_i is the fluid velocity.
The multiphase and multicomponent lattice-Boltzmann simulations presented in this paper are based on models by Shan and Chen (1993) and He and Doolen (2002). These models introduce an interaction potential to each node that is a function of either the fluid density (in single-component multiphase simulations) or the component concentration (in multicomponent simulations). The single-component multiphase models in this paper use the interaction potential given in (Shan and Chen, 1993; He and Doolen, 2002),


ψ = ψ0 exp(−ρ/ρ0) , (3.3)

where ψ0 and ρ0 are predefined constants and ρ is the fluid density. This interaction potential gives rise to a cohesive force, F,

F \Delta t = -G \psi(x, t) \sum_i w_i \psi(x + c_i \Delta t) c_i,   (3.4)

where G is an interaction force constant. In multicomponent simulations, we adopt the model of He and Doolen (2002), where

ψ = ρ (3.5)

and

F_{ab} \Delta t = -G \psi_a(x, t) \sum_i w_i \psi_b(x + c_i \Delta t) c_i,   (3.6)

gives the cohesive force, F_{ab}, acting on component 'a' from component 'b'. In both cases, the forcing term is implemented by adjusting the velocity used to calculate the pseudo-equilibrium density, e.g., in the single-component multiphase model:

u = \frac{1}{\rho} \sum_i f_i c_i + \frac{F \Delta t}{\lambda \rho}.   (3.7)

Although the interaction potentials given in Equations (3.3) and (3.5) are used throughout the remainder of this paper, other interaction potentials may be employed with little or no change in performance.
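As a minimal sketch of the collision stage described by Equations 3.1 and 3.2 (streaming, forcing, and boundary handling are omitted, and the array layout and function name are illustrative rather than the implementation evaluated in this chapter):

/* Single-node D3Q19 BGK collision, following Equations 3.1 and 3.2. */
static const int c[19][3] = {
    { 0, 0, 0},
    { 1, 0, 0}, {-1, 0, 0}, { 0, 1, 0}, { 0,-1, 0}, { 0, 0, 1}, { 0, 0,-1},
    { 1, 1, 0}, {-1, 1, 0}, { 1,-1, 0}, {-1,-1, 0},
    { 1, 0, 1}, {-1, 0, 1}, { 1, 0,-1}, {-1, 0,-1},
    { 0, 1, 1}, { 0,-1, 1}, { 0, 1,-1}, { 0,-1,-1}
};
static const float w[19] = {
    1.0f/3.0f,
    1.0f/18.0f, 1.0f/18.0f, 1.0f/18.0f, 1.0f/18.0f, 1.0f/18.0f, 1.0f/18.0f,
    1.0f/36.0f, 1.0f/36.0f, 1.0f/36.0f, 1.0f/36.0f,
    1.0f/36.0f, 1.0f/36.0f, 1.0f/36.0f, 1.0f/36.0f,
    1.0f/36.0f, 1.0f/36.0f, 1.0f/36.0f, 1.0f/36.0f
};

void collide_node(float f[19], float lambda)
{
    /* Macroscopic moments: rho = sum_i f_i, u = (1/rho) sum_i f_i c_i. */
    float rho = 0.0f, ux = 0.0f, uy = 0.0f, uz = 0.0f;
    for (int i = 0; i < 19; i++) {
        rho += f[i];
        ux  += f[i] * c[i][0];
        uy  += f[i] * c[i][1];
        uz  += f[i] * c[i][2];
    }
    ux /= rho; uy /= rho; uz /= rho;

    float uu = ux * ux + uy * uy + uz * uz;
    for (int i = 0; i < 19; i++) {
        float cu  = ux * c[i][0] + uy * c[i][1] + uz * c[i][2];
        /* Equation 3.2: feq = rho w_i [1 + 3 u.c_i (1 + 3 u.c_i / 2) - 3 u.u / 2] */
        float feq = rho * w[i] * (1.0f + 3.0f * cu * (1.0f + 1.5f * cu) - 1.5f * uu);
        /* Equation 3.1 (collision only): f_i <- (1 - lambda) f_i + lambda feq */
        f[i] = (1.0f - lambda) * f[i] + lambda * feq;
    }
}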

3.2.1 Memory Access Patterns

Two distinct fluid packet access patterns are tested in this implementation. The first, subsequently referred to as the “A-B” pattern (Section 3.2.1), requires two sets of fluid packets to reside in memory at all times, while the second, subsequently referred to as the “A-A” pattern (Section 3.2.1), requires only one set of fluid packets in RAM.

37 “A-B” Pattern

The A-B memory access pattern, commonly referred to as "ping-pong buffering," requires two sets of fluid packets to reside in memory throughout the course of the simulation. The source and destination sets are alternated at each timestep (Figure 3.3). In Figure 3.4, a D2Q9 version of the A-B pattern is shown. Black arrows represent fluid packets involved in the collision and streaming steps of the center node. Note how, from Figure 3.4a to Figure 3.4b, the arrows travel in the direction of their orientation. This method is used by Tölke and Krafczyk (2008) and Habich (2008) on the GPU, and Pohl et al. (2003), Walsh and Saar (2010), and others on the CPU. We use this method for our CPU version of the LBM simulation.

Figure 3.3: Execution of the A-B memory pattern (ping-pong)
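A minimal sketch of the A-B (ping-pong) bookkeeping, with placeholder names and a stubbed update step standing in for the actual collision and streaming kernels:

#include <stdlib.h>

#define N_NODES 4096   /* illustrative lattice size */
#define Q 19           /* fluid packet directions per node */

/* Placeholder for one collide-and-stream pass reading from src and writing to dst. */
static void collide_stream(const float *src, float *dst) { (void)src; (void)dst; }

int main(void)
{
    /* Two full sets of fluid packets are resident at all times (A-B pattern). */
    float *f_a = (float *) malloc(sizeof(float) * N_NODES * Q);
    float *f_b = (float *) malloc(sizeof(float) * N_NODES * Q);
    float *src = f_a, *dst = f_b;

    for (int step = 0; step < 1000; step++) {
        collide_stream(src, dst);
        float *tmp = src; src = dst; dst = tmp;   /* swap roles each timestep */
    }

    free(f_a);
    free(f_b);
    return 0;
}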

“A-A” Pattern

The A-A pattern needs only one set of fluid packets in memory (Figure 3.5), reducing the memory requirement for a given lattice size by a factor of two. This pattern alternates execution of two routines. The routine for even iterations (A-A:1) executes a single collision (Figure 3.6), and the routine for odd iterations (A-A:2) executes a combined streaming-collision-streaming step (Figure 3.7). Figure 3.6a shows fluid packets following the second streaming step in Figure 3.7b. Routines for both even and odd iterations write outgoing fluid packet data to the same locations from which incoming data was read, a characteristic which allows maintenance of only one set of fluid packets in memory, greatly increasing the lattice sizes that can be stored and computed entirely on the GPU. This also allows the lattice nodes to be computed in any order, which is necessary for the GPU architecture for reasons explained in Section 3.2.3. The ability to compute nodes in any order is a key improvement over the "compressed grid" layout of Pohl et al. (2003), which uses roughly the same amount of storage as the A-A pattern but requires a specific ordering of node computations, thus ruling out efficient implementation on the GPU.


Figure 3.4: Simplified D2Q9 version of A-B pattern fluid packet reads (a) and streaming writes after collision (b). The center fluid packet is stationary.

3.2.2 Memory Constraints

The design of the memory architecture on the GPU results in maximum memory bandwidth when threads align their memory accesses correctly. If the pattern of memory access prevents this coalescence, then bandwidth is reduced by up to an order of magnitude. For proper alignment, consecutive threads in a warp must access consecutive locations in memory, starting from a globally aligned base memory location. For specific requirements, see Section 5.1.2.1 in (NVIDIA, 2008b).
Two previous GPU LBM implementations (Habich, 2008; Tölke and Krafczyk, 2008) have shown that a structure-of-arrays, or collision-optimized, layout (Wellein et al., 2006) of fluid packets in memory fits the GPU hardware well. This layout indexes fluid packets in global memory by the following variables, in the following order: X, Y, Z, fluid packet direction, and timestep.

Figure 3.5: Execution of the A-A memory pattern


Figure 3.6: D2Q9 version of A-A:1 pattern fluid packet reads (a) and streaming writes after collision (b).

The layout is optimized for simulation performance on processors with cache memory, but it is ideal on the GPU because it allows for easy coalescence of global memory reads and writes. Given such a layout, consecutive threads within a thread block compute outgoing data for consecutive nodes in the X-dimension. This allows one block of GPU threads to compute an nX-by-1-by-1 section of the lattice while maintaining coalesced global memory access, where nX is equal to the X-dimension of the lattice. This LBM implementation utilizes shared memory to maintain coalesced global memory access. When two separate arrays are kept in memory for incoming and outgoing fluid packets (A-B pattern, Section 3.2.1, Figure 3.3), all global memory reads are automatically aligned.


Figure 3.7: D2Q9 version of A-A:2 pattern fluid packet reads (a) and streaming writes after collision (b).

This is due to the collide-stream pattern of execution (Figure 3.4). Writes from each node to neighboring nodes along only the Y- and Z-dimensions are automatically aligned. However, writes along directions with an X component are complicated in that they require propagation in shared memory, followed by an aligned write to global memory, as described by Tölke and Krafczyk (2008). When a single fluid packet array is present (A-A pattern, Section 3.2.1, Figure 3.5), global memory reads and writes along only the Y- and Z-dimensions are aligned, but reads and writes along directions with an X component require an exchange in shared memory before and after the collision step. Neither access pattern results in shared memory bank conflicts.
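One plausible realization of this structure-of-arrays indexing, with X varying fastest so that consecutive threads touch consecutive addresses; the function name and argument order are illustrative, not taken from the implementation:

/* Index of the fluid packet for direction `dir` at lattice node (x, y, z),
 * in a structure-of-arrays layout ordered X, Y, Z, direction.
 * With the A-B pattern, `buf` (0 or 1) selects the source or destination set. */
static inline long f_index(int x, int y, int z, int dir, int buf,
                           int nX, int nY, int nZ)
{
    return x + (long)nX * (y + (long)nY * (z + (long)nZ * (dir + 19L * buf)));
}

Threads handling consecutive x values therefore read consecutive floats, which is what the coalescing rules described above require.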

3.2.3 Single-GPU Implementation

Under NVIDIA's CUDA programming environment, individual GPU kernels execute threads in groups known as "thread blocks," themselves arranged into an array or "block grid." NVIDIA's CUDA Programming Guide (NVIDIA, 2008a) states that thread blocks are additionally segmented on a multiprocessor at execution time. When a GPU multiprocessor is given a thread block to execute, the multiprocessor splits the thread block into contiguous 32-thread groups known as "warps." Each kernel call specifies the number and arrangement of thread blocks (i.e., regular arrays of one or two dimensions) to be executed, as well as the number and arrangement of threads within each block (in one, two, or three dimensional configurations). While there are relatively few restrictions on the dimensions of the block grid, the dimensions of the individual thread blocks are strongly influenced by the constraints of the underlying hardware.

Additionally, we hold these parameters fixed at compile time to reduce register usage and the number of explicit kernels. Instructions exist to synchronize execution of threads within each block, and inter-thread communication is supported within blocks through shared memory (Figure 3.8). Communication between threads in separate blocks is more difficult to reliably predict as the order in which individual blocks are executed is not predetermined.

Figure 3.8: GPU kernel execution is based on groups of threads known as thread blocks. Inter-thread communication is made possible through shared memory, accessible to all threads within the same block. In addition, threads have access to the GPU's global memory, a larger data storage space which is available on the device, distinct from the CPU host's memory.

Multiprocessor occupancy is defined as the ratio of the number of active warps per multiprocessor to the maximum number of active warps (NVIDIA, 2008b). Occupancy is specific to a single GPU kernel because a multiprocessor can only run one kernel at a time. Running multiple warps allows a multiprocessor to hide global memory access latency (around 400 cycles) by switching out stalled warps for warps ready to execute instructions. Similarly, running multiple blocks per multiprocessor can hide block synchronization latency. In order to maximize the memory latency that may be hidden, the number of warps able to be run simultaneously must be maximized.

The number of concurrently runnable warps is limited by the number of registers per multiprocessor, the amount of shared memory per multiprocessor, the number of threads per block, and a hardware-based limit. Each multiprocessor has a limited number of registers, which are used by all concurrently running threads. Therefore, using fewer registers per thread allows more thread blocks to be run concurrently. Similarly, shared memory is used by the same threads and must be divided accordingly.
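The sketch below estimates occupancy from these limits in the simplified way just described. The hardware numbers in main() are placeholders meant to resemble a compute capability 1.0 multiprocessor (8192 registers, 16 KiB shared memory, 768 threads and 8 blocks per multiprocessor), and real devices apply additional allocation-granularity rules that are ignored here.

#include <stdio.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Simplified occupancy estimate: blocks per multiprocessor are limited by
 * registers, shared memory, the thread count, and a hardware block limit. */
static double occupancy(int threads_per_block, int regs_per_thread, int smem_per_block,
                        int regs_per_sm, int smem_per_sm, int max_threads_per_sm,
                        int max_blocks_per_sm)
{
    int by_regs    = regs_per_sm / (regs_per_thread * threads_per_block);
    int by_smem    = smem_per_block ? smem_per_sm / smem_per_block : max_blocks_per_sm;
    int by_threads = max_threads_per_sm / threads_per_block;
    int blocks     = min_int(min_int(by_regs, by_smem),
                             min_int(by_threads, max_blocks_per_sm));
    int active_warps = blocks * threads_per_block / 32;
    int max_warps    = max_threads_per_sm / 32;
    return (double)active_warps / (double)max_warps;
}

int main(void)
{
    /* Placeholder kernel: 128 threads per block, 32 registers per thread, 4 KiB shared. */
    double occ = occupancy(128, 32, 4096, 8192, 16384, 768, 8);
    printf("estimated occupancy: %.2f\n", occ);
    return 0;
}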

Figure 3.9: One thread is assigned per node along the x axis, and one block is assigned per (y,z) coordinate.

In this paper, one-dimensional thread blocks are employed in all of the lattice-Boltzmann simulations, with one thread assigned per node along the x axis and one block per y, z coordinate (Figure 3.9). This thread-block layout is adopted to satisfy CUDA's requirements for coalesced memory access. The entirety of the lattice is stored in a configuration that can be visualized as each y, z coordinate block being placed in order and end-to-end to create a one dimensional row of sorted thread blocks. As this configuration is stored in the global memory of the GPU, it is worth noting that operations using the GPU global device memory are substantially slower than those involving shared memory (NVIDIA, 2008b). However, the high cost of memory access is mitigated by using coalesced memory operations, in which threads in a given block read and write to contiguous memory addresses. Due to the configuration of the lattice in the global memory of the GPU, other problem decompositions would lose the ability to use coalesced memory operations, resulting in a drop in performance. Additionally, this implementation does not use any blocking along the x dimension as it would incur communication costs beyond those that are already necessary.
As a result of the requirements for memory coalescence and the thread block geometry, propagation of fluid packets is handled differently depending on whether the fluid packets travel parallel or perpendicular to the x axis.
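A minimal CUDA sketch of this launch configuration, one thread per node along x and one block per (y, z) coordinate; the kernel body and names are placeholders rather than the actual simulation kernel:

__global__ void lbm_step(float *f, int nX, int nY, int nZ)
{
    int x = threadIdx.x;   /* one thread per node along the x axis */
    int y = blockIdx.x;    /* one block per (y, z) coordinate */
    int z = blockIdx.y;
    if (x < nX && y < nY && z < nZ) {
        long node = x + (long)nX * (y + (long)nY * z);
        /* collide/stream work for this node would go here */
        (void)f; (void)node;
    }
}

void launch_lbm_step(float *d_f, int nX, int nY, int nZ)
{
    dim3 grid(nY, nZ);   /* block grid spans the y and z dimensions */
    dim3 block(nX);      /* nX threads per one-dimensional block */
    lbm_step<<<grid, block>>>(d_f, nX, nY, nZ);
}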

Figure 3.10: Data propagation along the x axis: a) for compute capability 1.1 and lower the data must be propagated in shared memory to preserve coalescence, b) for compute capability 1.2 and above the copy back to global memory is made without the transfer in shared memory.


Propagation of f_i parallel to the x-axis is achieved by manipulating the data in shared or local memory. How this is performed depends in turn on the "compute capability" of the GPU, a term introduced by NVIDIA to describe the major and minor revision number of the GPU architecture. NVIDIA GPUs of compute capability 1.1 and lower have stricter requirements for coalesced global memory access than later GPU versions (NVIDIA, 2008b). These requirements restrict individual threads to reading and writing to specific global memory locations, the implication of which is that the propagation of fluid packets along the length of the block is best achieved by translating the fluid packets in the block's shared memory before copying the packets back to global memory (Figure 3.10a). Later-generation devices allow an offset in the location in which data can be written to global memory while preserving coalescence, so that fluid packets can be written directly from local memory without the need for explicit transfer of data in shared memory (Figure 3.10b). While this does not substantially improve performance for a given block size, the approach simplifies the code and reduces shared memory use substantially, thereby enabling implementations with larger block sizes.
Propagation of fluid packets perpendicular to the thread block axis is accomplished by manipulating the location of block reads and writes to global memory. Here again multiple choices for memory access patterns exist. Perhaps the simplest approach is a "Ping-Pong" buffer pattern in which fluid packets are read from one matrix, the collision step is performed, and the results are copied to a second matrix.

The pattern is then reversed in the following time step. This approach has the advantage that no additional synchronization steps are required; however, it has the disadvantage that memory must be made available for the two fluid packet matrices. We also discuss two alternate memory access patterns that require approximately half the memory of the "Ping-Pong" buffer. We refer to them as the "Lagrangian" pattern and the "Flip-Flop" pattern. Under the "Lagrangian" pattern, rather than associate each global memory location with a fixed point in space, we instead assign fixed global memory locations to individual fluid packets. The threads, in contrast, remain associated with a fixed set of points in space. Thus, the fluid packets are stored in global memory in the same location throughout the simulation and are selectively loaded into each thread's local memory according to which fluid packet would arrive at the thread's location in the current time step. The collision step is then performed and the fluid packets are copied back to their original locations. Under the "Flip-Flop" memory access pattern, two time steps are required to complete a memory access cycle. In the first time step, fluid packets are read from, and written back to, the same buffer in global memory, but with the directions reversed. During the subsequent time step, the fluid packets are read from the reversed configuration and then written back to the same buffer with the original direction orientation restored. In order to propagate the fluid packets correctly in this second time step, the reads and writes occur on adjacent nodes. Illustrations of all three memory access patterns are presented in Figure 3.11, and more detailed descriptions are given in Appendix B.1.
Investigation of all three memory access patterns reveals only slight differences in performance. The performance of the Ping-Pong memory access pattern is not considered sufficient to warrant the approximately twofold increase in global memory use. The Lagrangian and Flip-Flop memory access patterns show similar performance characteristics, with the Lagrangian pattern outperforming the Ping-Pong and Flip-Flop patterns on larger lattices. A brief comparison between the three methods is given in Section 3.4; however, the Lagrangian access pattern is adopted in the remaining simulations considered in this paper.
The multiphase and multicomponent lattice-Boltzmann models both introduce non-local interaction forces and, as a result, the collision step is no longer solely a function of information at each node. For these models, two changes are made to the basic lattice-Boltzmann simulation: first, interaction potentials are calculated at each node and stored in global memory prior to the collision and streaming steps.


Figure 3.11: Three memory access models for data propagation along y and z axes: a) Ping-Pong, b) Flip-Flop, c) Lagrangian. Fluid packet position in the diagram designates the same global memory location (although not stored in this geometry), arrows indicate fluid packet directions, dark arrows show fluid packets associated with the center node. In a) and b) the steps shown in this Figure are repeated, while in c) the ring of fluid packets arriving at the center node expands outwards on subsequent time steps (see Appendix A for more detail).

Second, at the start of the collision step, the interaction potentials of the thread block and the eight neighboring rows are read into shared memory and used to calculate the cohesive force. Out-of-plane interaction potentials are shifted along the rows in the thread block's shared memory, thereby reducing the number of global memory operations used in this step from 19 to 9 per node.

3.2.4 Multi-GPU Implementation

The CUDA programming environment allows for asynchronous operations in which calculations take place simultaneously on both the GPU device and the CPU host. Use of asynchronous operations is key to improved performance on multiple devices, masking the cost of data transfer between host and device.

To maximize the relative amount of time spent on GPU computation and to reduce data transfer, we adopt a decelerator model (Barker et al., 2008), in which calculation is confined to the GPU devices and the CPU host is reserved for subsidiary tasks such as data initialization, data transfer between devices, and data output.

Figure 3.12: Flowchart of the multi-GPU implementation. The steps shown are repeated for the interaction potentials in the multicomponent and multiphase versions and given in greater detail in Appendix B.

Our multi-GPU implementation divides the simulation into cuboid regions of equal size, which are assigned to the available devices. During each time step, calculations are first carried out on the outer y and z faces of each region. While the contributions from the interior nodes are calculated, the data from the outer faces is simultaneously transferred between GPUs: a three-step process of copying data from the device to the CPU host, then from host to host (for multiple CPUs), then back from the host to the new device (Figure 3.12). OpenMP threads control communication between devices on the same CPU host, while dedicated data pipelines, via persistent MPI communication operations, are employed for data transfer between hosts. However, this paper is focused on shared memory GPU systems and thus the host-to-host MPI communication is not discussed further. Under the standard lattice-Boltzmann model, only one round of inter-GPU communication is required for the single-phase, single-component lattice-Boltzmann model to transmit the values of the fluid packets on boundary nodes. For the multiphase and multicomponent models, an additional round of data transfer is needed to transmit the interaction potentials. Thus in each region, 1) the outer shell of interaction potentials is calculated, and 2) the outer shell is transferred between GPUs while, simultaneously, the remaining inner core of interaction potentials is calculated.

After the interaction potentials are transmitted, the same pattern is repeated for the fluid packet collision and streaming steps (Appendix B).

3.3 Experimental Methodology

3.3.1 Memory Access Pattern Methodology

We perform test runs on three different NVIDIA GPUs: an 8800 GTX, an 8800 Ultra, and a 9800 GX2. It should be noted that the 9800 GX2 is composed of two GPUs, each individually addressable by CUDA programs. The two GPUs communicate through the PCI Express 1.0 x16 bus, just as any other two GPUs would do in one system. In this paper, tests on the 9800 GX2 use only one of the two GPUs. The specifications for each of these GPUs are found in Table 3.1.

Device       Clock      RAM         Mem. Clock   Bus Width   Processing Elements
8800 GTX     1.35 GHz   768 MiB     900 MHz      384 bits    128
8800 Ultra   1.51 GHz   768 MiB     1080 MHz     384 bits    128
9800 GX2     1.50 GHz   2x512 MiB   1 GHz        256 bits    2x128

Table 3.1: GPU Specifications

For additional comparison we perform test runs on the Intel quad-core CPU in the system housing the GPU, the specifications of which can be found in Table 3.2.

Device        Clock     RAM     Mem. Clock   Bus Width   Cores
Intel Q6600   2.4 GHz   4 GiB   400 MHz      64-bit      4

Table 3.2: CPU Specifications

We use the G++ compiler version 4.1.2 for CPU code and NVCC release 2.0, V0.2.1221 for GPU code. We use the nvcc compiler flags "-O3" and "-maxrregcount" followed by "32" or "64", depending on the GPU kernel being compiled. For g++, we use the compiler flags "-funroll-loops -fno-trapping-math -O3".

Tested lattice sizes include cubic lattices with side lengths ranging from 32 nodes to 160 nodes in 32-node increments. Non-cubic lattices of the following dimensions are also tested: 256x128x128, 128x128x64, 64x250x250, 192x156x156, and 96x192x192. The simulation performed on each lattice is Poiseuille flow through a box.
Performance of lattice-Boltzmann simulations is measured in Lattice node Updates Per Second (LUPS), which indicates the number of lattice site collision and streaming steps performed in one second. For recent implementations, a more convenient unit of measurement is Million Lattice node Updates Per Second (MLUPS). It should be noted that although the same metric is used for different lattice layouts (e.g., D3Q13, D2Q9) and precisions (single vs. double), one lattice node update in one layout may require more memory transfers and computation than a lattice node update in another layout. For example, a single-precision D3Q19 lattice node update reads and writes at least 19 * 4 = 76 bytes, while a single-precision D3Q13 lattice node update reads and writes 13 * 4 = 52 bytes.

3.3.2 Statistical Methodology

Parameters affecting GPU performance include the number of threads per thread block, overall lattice size, and memory usage at multiple tiers within the GPU memory hierarchy (e.g., registers, shared, and global). As varying these factors induces a non-linear performance response (Ryoo et al., 2008a), analysis is necessary to adequately determine the parameter values that will reliably predict the performance of lattice-Boltzmann codes. Furthermore, it is useful to quantify the contributions of each parameter to improve future optimization efficiencies.
Two classes of performance parameters are examined in this paper. The first set (Table 3.3) consists of fundamental factors associated with the GPU architecture and lattice-Boltzmann model. These parameters include the number of threads per thread block, the number of nodes in the lattice, the total number of GPUs used in computation (hereafter denoted as NP), and the number of Parallel Thread eXecution (PTX) instructions in the GPU kernel. PTX is an intermediate code representation that NVIDIA uses as a virtual GPU instruction set architecture (NVIDIA, 2009b). The assembly code executed by GPUs is created by translating and optimizing the PTX representation to a supported GPU architecture. The number of PTX instructions gives an estimate of the number of actual instructions executed on the GPU.

gives an estimate of the number of actual instructions executed on the GPU. Other factors associated with the GPU architecture have been withheld from this class of performance parameters for the analysis given here. Shared memory and register usage, although fundamental, are withheld as they remain constant across lattice configurations. Occupancy, while important, is not considered fundamental here as it is a derived metric. However, due to the importance of occupancy, it is included in the discussion of results in Section 3.4.

Fundamental Performance Parameters
Factor                       Notation    Range
PTX Instructions             ptx         Kernel specific
nThreads per Thread Block    nThreads    32-192, incremented by 32
Lattice size                 kSize       xDim × yDim × zDim (32768-7077888 nodes)
Number of GPUs               NP          1-4

Table 3.3: Fundamental parameters of the GPU architecture influencing code performance.

The second set of performance parameters comprises two analytically derived metrics given in (Ryoo et al., 2008b): Efficiency and Utilization (defined below in Equations 3.8 and 3.9). Efficiency is a measure of the instruction efficiency of the kernel to be run on the GPU. Efficiency is obtained by counting the number of instructions executed to complete a task of a given size. All else being equal, a more efficient program should execute faster than a less efficient one. However, Efficiency is based on the gross number of instructions executed across all threads. As such, it does not account for secondary effects on performance such as wait time as a result of blocking instructions and the availability of work for independent thread groups. These quantities are instead accounted for via the Utilization metric, which gives an approximate measure of how much a kernel makes use of the GPU's available compute resources.

Efficiency is a function of the number of instructions that will be executed per thread ($I_T$) as well as the number of threads that will be executed for a given kernel ($T_K$). Utilization depends on the number of instructions executed per thread ($I_T$), the number of regions of the kernel delimited by synchronization or high latency instructions ($R_K$), the number of warps in a thread block ($W_{TB}$), and the maximum number of thread blocks assigned to each streaming multiprocessor ($B_{SM}$):

\[
\text{Efficiency} = \frac{1}{I_T \, T_K}, \tag{3.8}
\]

\[
\text{Utilization} = \frac{I_T}{R_K}\left[\frac{W_{TB} - 1}{2} + (B_{SM} - 1)\,W_{TB}\right]. \tag{3.9}
\]

We normalize Efficiency by the problem size (overall lattice size) for our analysis by multiplying the Efficiency of Ryoo et al. (2008b) by the number of nodes in a given lattice. As the number of nodes in a lattice is equivalent to $T_K$, our normalized Node Efficiency is found to be $I_T^{-1}$. This Node Efficiency will be referred to as Efficiency for the remainder of this paper. Both metrics are application specific; therefore, there is no single Efficiency and Utilization value that indicates good performance. Rather, implementations with both higher Efficiency and Utilization perform better than equivalent programs with lower Efficiency and Utilization. Hence, the two metrics can be used to define Pareto optimal curves, which were employed by Ryoo et al. (2008b) in an approach known as “tradeoff carving” to rapidly find optimal application implementations. However, these metrics fail to account for the effect of memory bandwidth bottlenecks on performance. Although Ryoo et al. (2008b) explicitly state that Efficiency and Utilization are not applicable to bandwidth-bound programs (e.g., lattice-Boltzmann fluid flow simulations), we nonetheless include them in our analysis with the purpose of quantifying their applicability in comparison to the more universally used first set of performance parameters.
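To make the two metrics concrete, the following sketch evaluates Equations 3.8 and 3.9, and the node-normalized Efficiency used here, for a hypothetical kernel configuration. All names and numbers are illustrative assumptions; in practice the inputs would be obtained by inspecting a compiled kernel.

```cpp
#include <cstdio>

// Hedged sketch of Equations 3.8 and 3.9 (Ryoo et al., 2008b).
// I_T:  instructions per thread
// T_K:  threads launched for the kernel
// R_K:  kernel regions delimited by synchronization or high-latency instructions
// W_TB: warps per thread block
// B_SM: thread blocks assigned to each streaming multiprocessor
double efficiency(double I_T, double T_K)      { return 1.0 / (I_T * T_K); }  // Eq. 3.8
double node_efficiency(double I_T)             { return 1.0 / I_T; }          // Eq. 3.8 normalized by T_K
double utilization(double I_T, double R_K, double W_TB, double B_SM)
{
    return (I_T / R_K) * ((W_TB - 1.0) / 2.0 + (B_SM - 1.0) * W_TB);          // Eq. 3.9
}

int main()
{
    // Purely illustrative numbers, not measurements from this study.
    std::printf("Node Efficiency: %g\n", node_efficiency(500.0));
    std::printf("Utilization:     %g\n", utilization(500.0, 4.0, 4.0, 3.0));
}
```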

Procedure

The set of codes tested is composed of the complete combination of the number of nodes in each dimension of the lattice, the number of GPUs involved, and the type of simulation being performed. It should be noted that the number of nodes in the x-dimension of the lattice is equivalent to the number of threads per block and that the product of the number of nodes in each dimension is the lattice size. Furthermore, the simulation type defines the type of fluid flow simulation, as well as some measure of the relative difficulty of computation. Ordered from the simplest to the most complex, simulation types are single-component single-phase (SCSP), single-component

multiphase (SCMP), and multicomponent multiphase (MCMP). Of these, the SCSP simulation type is the only major type with implementations of all the memory access patterns: Lagrangian, Ping-Pong, and Flip-Flop (described in Section 3.2.3). We implement the SCMP and MCMP codes using only the Lagrangian memory access pattern for two reasons. First, the Lagrangian method outperforms the Ping-Pong method by 11% and the Flip-Flop method by 18% on average. Second, the Lagrangian method requires only half the global memory of the Ping-Pong access pattern. This is particularly important for the SCMP and MCMP simulations, which have increased memory requirements as a result of the interaction potentials and additional fluid components.

Each of these simulation types is complemented by a corresponding multi-GPU implementation using the Lagrangian memory access pattern and OpenMP for communication. Single-GPU OpenMP implementations are also considered in order to determine the overhead of the communication method.

The obtained performances are analyzed via an Analysis of Variance (ANOVA) approach (Lilja, 2000) to quantify the impact of each set of performance factors described in Section 3.3. The ANOVA analysis determines whether there is a statistically significant difference among the means of each test. It accomplishes this by separating the total observed measurement variation into the variation within a system (assumed due purely to measurement error) and variation between systems (due both to actual differences between systems and to measurement error). Statistically significant differences between system means are determined using an F test that compares variances across the systems. As this experiment is analyzed with two different sets of multiple factors (Section 3.3), an m-factor ANOVA analysis is required. When two or more factors are present in an ANOVA analysis, the interactions between factors must also be taken into account. Each of these unique factors, and interactions thereof, are called effects. Having obtained the results of the ANOVA analysis, it is then trivial to compute the impact percentage of each effect by taking the ratio of the total variation in a measurement due to each effect to the sum of the total variation of all effects and measurement error. The m-factor ANOVA analyses given here are calculated with the statistics package R.

The programs are tested on NVIDIA Tesla C1060 GPUs of compute capability 1.3 (complete GPU specifications are found in Table 3.4). The host code is compiled using the

GNU g++ compiler 4.3.3 with the compiler flag “-O3” for compiler optimizations, and the CUDA kernel code is compiled using the NVIDIA compiler NVCC release 3.0, V0.2.1221 with the compiler flag “--use_fast_math”. Lattices are tested with side lengths ranging from 32 nodes to 192 nodes in 32-node increments. In an effort to reduce simulation time, a single test is run when overlap occurs between the y and z dimensions (e.g., to the GPU, there is no difference between lattice configurations of 128x32x64 and 128x64x32). The simulation performed for the single-component single-phase tests is Poiseuille flow through a rectangular conduit. The multiphase and multicomponent tests simulate initially homogeneous systems with a random perturbation (< 1%) to trigger phase separation. Performance of lattice-Boltzmann simulations is measured in Millions of Lattice-node Updates per Second (MLUPS), indicating the number of lattice site collision and streaming steps performed in one second.
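The lattice test matrix implied by this description can be enumerated as in the sketch below, which skips the redundant y/z permutations. It is an illustration of the configuration set only, not the actual test harness.

```cpp
#include <cstdio>
#include <tuple>
#include <vector>

// Illustrative sketch of the lattice test matrix: side lengths 32..192 in steps
// of 32, with y/z permutations treated as equivalent (only y <= z is kept).
int main()
{
    std::vector<std::tuple<int, int, int>> configs;
    for (int x = 32; x <= 192; x += 32)
        for (int y = 32; y <= 192; y += 32)
            for (int z = y; z <= 192; z += 32)   // z >= y removes y/z duplicates
                configs.emplace_back(x, y, z);

    for (auto [x, y, z] : configs)
        std::printf("%dx%dx%d\n", x, y, z);
    std::printf("%zu unique configurations\n", configs.size());
}
```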

Device                Tesla C1060
Clock                 1.296 GHz
Global Memory         4096 MiB
Mem. Clock            1600 MHz
Bus Width             512 bits
Processing Elements   240

Table 3.4: GPU Specifications

3.4 Performance Analysis

3.4.1 Memory Access Patterns

Two key improvements are presented in this paper. First is an increase in maximum simulation speed by 20% over published results, and second is a reduction in memory requirements by 50% over published GPU-based lattice Boltzmann implementations. It should be noted that these improvements are not orthogonal; when storage reduction is utilized, the performance gain is less, and when maximum simulation speed is desired, storage reduction is not available. Both the 20% performance increase and the disparity between the two implementations are correlated with changes in occupancy. Habich (2008) uses 40 registers per thread, resulting in a maximum occupancy of 25% and maximum performance of 250 MLUPS. When

two arrays are present in our implementation (A-B pattern), the GPU kernel uses 32 registers per thread, allowing a maximum of 33% occupancy. The major difference between the A-B pattern and the A-A pattern is the lower average occupancy of the GPU kernels used. In the A-A pattern, two kernels are employed. One (A-A:1) uses 32 registers per thread, while the other (A-A:2) uses 64. On the 8800 GTX, kernels using 64 registers per thread can achieve a maximum occupancy of 17%. As seen in Fig. 3.13, GPU LBM performance for the A-B pattern positively correlates with GPU occupancy. Additionally, the maximum performance for occupancies less than 33% is 236 MLUPS, and no simulation with an occupancy of 33% runs at less than 250 MLUPS.

Figure 3.13: Mean GPU pattern A-B performance vs. GPU occupancy with 95% confidence interval

Due to differences in occupancy and kernel complexity, the two memory access patterns perform at different levels. For the test cases outlined in Section 3.3, the average performance of the A-A pattern implementation is 209 MLUPS, with a maximum measured performance of 260 MLUPS. The A-B pattern implementation achieves an

average performance of 246 MLUPS on the same test cases, with a maximum measured performance of 300 MLUPS. The 8800 GTX test platform has a theoretical maximum computation-to-memory bandwidth ratio of 345.6 GFLOPS / 86.4 GBps = 4 FLOPS / byte. Thus, the observed FLOPS / byte ratio is 50 GFLOPS / 46.8 GBps = 1.08 FLOPS / byte, or about 27% of the theoretical maximum.

Comparison of CPU and GPU versions

The CPU version of the lattice Boltzmann simulation is a single-precision implementation by Walsh et al. (2009a). The collision and streaming steps are separate. Two sets of fluid packets are maintained (A-B pattern), and each set uses the following layout: fluid packet direction, X, Y, Z. Parallelization of the CPU implementation is achieved by using OpenMP to subdivide loops into sections that can run on multiple processor cores. The parallelized loops calculate the collision and streaming steps separately for each node.

In the test cases described in Section 3.3, the CPU implementation, using a single core, achieved an average performance of 6.18 MLUPS with a maximum measured performance of 7.53 MLUPS. When tested across 4 cores, the CPU implementation achieved an average performance of 8.99 MLUPS with a maximum measured performance of 11.1 MLUPS. When compared to the high-performance GPU A-B memory access pattern implementation results in Section 3.4, it is seen that the A-B pattern achieves average speedups of 40.3 and 27.7 over the single- and quad-core results, respectively, of the CPU implementation.

Important Optimizations

To achieve high levels of performance on the GPU, we utilize a number of strategies. Among these are coalesced global memory access, use of shared memory, restricting the number of registers per thread, careful use of conditional statements, and a memory access pattern that allows for the simulation of larger lattices. For the A-B pattern, these strategies result in a 20% increase in maximum MLUPS over the results in (Habich, 2008). For the A-A pattern, they allow for a 50% reduction in memory requirements for a given lattice size. To achieve maximum bandwidth to global memory, the GPU's memory controller

combines global memory reads and writes from a warp of threads into a single operation if certain conditions are met concerning address alignment and ordering between threads. In the LBM simulation, changing only the layout of fluid packets in global memory is observed to cause performance penalties of up to 70% due to lack of coalescence. 100% coalescence is achieved by propagating, in shared memory, directions that would have caused non-coalescence if written directly to global memory.

Each multiprocessor has a limited number of registers, which are divided between concurrently executing thread blocks. The lower the number of registers required per thread, the higher the number of thread blocks that can be run concurrently. The compiler can be coerced into limiting the number of registers per thread (with the --maxrregcount switch), but this practice can lead to substitution of (relatively slow) global GPU memory for registers. Still, trading off higher occupancy for a few extra global memory accesses occasionally improves performance. The kernels used in this LBM implementation require between 32 and 64 registers each, giving a maximum warp occupancy of 33%.

To reduce memory requirements for lattice storage, the A-A pattern of execution maintains a single array of fluid packets, rather than two arrays as in previous GPU implementations (Tölke and Krafczyk, 2008; Habich, 2008). This reduction is achieved by alternating two kernels that read fluid packets from, and write fluid packets to, the same memory locations, thus avoiding the necessity that lattice node computations be executed in a particular order. The use of two kernels for the A-A pattern, one requiring 64 registers per thread, adds a significant amount of execution overhead.

3.4.2 Statistical Analysis

Simulation Overview

By simply using updated hardware, we find an improvement in performance for single-GPU SCSP tests over those found by Bailey et al. (2009). With the Lagrangian memory access pattern we obtain a maximum performance of 444 MLUPS, representing a speedup of 145% over Bailey et al. (2009). The Flip-Flop and Ping-Pong access patterns achieved peak performances of 396 MLUPS and 418 MLUPS, respectively. Interestingly, all of the memory access patterns in the single-GPU tests exhibit a drop in performance when the number of threads per thread block is 128.

Figure 3.14: Average speeds in millions of lattice-node updates per second (MLUPS) of single-component single-phase (SCSP) single- and multi-GPU implementations as a function of the number of threads per thread block. SCSP Pattern refers to the single-GPU version of the code, where Pattern refers to the memory access pattern used for the test. OpenMP multi-GPU tests are designated as SCSP OpenMP NP X, where X refers to the number of GPUs used in the test. In SCSP OpenMP NP 1, boundary data is collected and copied back to the CPU each timestep before being returned back to the GPU (mimicking GPU-to-GPU memory transfer in larger multi-GPU simulations), while in SCSP Pattern boundary conditions are enforced entirely on the GPU.

Other GPU codes, such as the lattice-Boltzmann code from Bailey et al. (2009) and Xin Li's GPU-based hash accelerator (Li and Lilja, 2009), display sawtooth patterns in their performance curves. The NVIDIA CUDA Programming Guide (NVIDIA, 2008a) hints at an explanation by stating that best results should be achieved when the number of threads per thread block is a multiple of 64. With this in mind, if the number of threads per thread block is incremented by 32 during testing, a sawtooth performance pattern is not unexpected. However, the single-GPU lattice-Boltzmann codes in this paper have an irregular drop in performance when there are 128 threads per thread block, a multiple of 64.

Figure 3.15: Average speeds in millions of lattice-node updates per second (MLUPS) of single-component multiphase (A) and multicomponent multiphase (B) single- and multi-GPU implementations as a function of the number of threads per thread block.

This leads to the conclusion that shared memory bank conflicts are a reasonable explanation for the drop in performance. More specifically, 128 threads per thread block may be an especially bad configuration for shared memory bank conflicts, resulting in a larger performance loss than other configurations.

Figure 3.16: Curves illustrating the scaling behavior of the single-component single-phase OpenMP implementation by the number of threads per thread block. The coefficients of determination (R²) for linear fits to each curve have been included.

All multi-GPU simulations in Figures 3.14 and 3.15 show increased performance compared to their single-GPU counterparts. Furthermore, the scaling behavior exhibited by all multi-GPU simulations can be seen by plotting how performance changes as the number of GPUs involved in computation increases (Figures 3.16 and 3.17). These figures show that all of the simulations are operating in a near-linear scaling region. By performing a linear fit to the scaling curves of each simulation type, it is shown that the arithmetic mean of all coefficients of determination (R²) is 0.9884, with the worst fit resulting in an R² value of 0.9483. As the number of GPUs available in shared-memory systems increases, future systems will allow the extent of the near-linear scaling region to be determined.
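The scaling fits reported here can be reproduced with an ordinary least-squares line. The helper below is an illustrative sketch, not the analysis script used in this study; it returns the slope, intercept, and coefficient of determination for a performance-versus-GPU-count series.

```cpp
#include <cstddef>
#include <vector>

struct LinearFit { double slope, intercept, r2; };

// Ordinary least-squares fit of y (e.g., MLUPS) against x (e.g., number of GPUs),
// returning the coefficient of determination R^2. Sketch only.
LinearFit fit_line(const std::vector<double>& x, const std::vector<double>& y)
{
    const double n = static_cast<double>(x.size());
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i];
    }
    const double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    const double intercept = (sy - slope * sx) / n;

    double ss_res = 0, ss_tot = 0;
    const double mean_y = sy / n;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double pred = slope * x[i] + intercept;
        ss_res += (y[i] - pred) * (y[i] - pred);
        ss_tot += (y[i] - mean_y) * (y[i] - mean_y);
    }
    return { slope, intercept, 1.0 - ss_res / ss_tot };
}
```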

Figure 3.17: Curves illustrating the scaling behavior of the (A) single-component multiphase and (B) multicomponent multiphase OpenMP implementations by the number of threads per thread block. The coefficients of determination (R²) for linear fits to each curve have been included.

Lattice-Boltzmann GPU Bottlenecks

As it is likely the case that any given researcher would be using either a single- or a multi-GPU code exclusively, it is prudent to determine the impact of each input factor on single- and multi-GPU implementations separately. The impact percentage that each input factor has on overall code performance is determined using the results of the Analysis of Variance (ANOVA) technique described in Section 3.3.2. Only statistically significant effects with a large contribution to the total variation in performance are discussed in detail. Single-factor effects are reported as the name of the factor, with interactions of factors reported as the names of the interacting factors separated by a colon (:). Withheld effects are statistically insignificant, low-end contributors, or both. These types of effects are typically higher-order interactions and measurement error (system noise). System noise never accounted for more than 0.35% of the total variation in measurements. Consequently, error bars and other indicators of measurement variation are not shown.
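The impact percentages reported in the tables that follow are simple ratios of per-effect variation to total variation. A minimal sketch of that final step, assuming the per-effect sums of squares have already been produced by the m-factor ANOVA (here computed in R), is shown below; the effect names and values are illustrative only.

```cpp
#include <cstdio>
#include <map>
#include <string>

// Sketch: convert per-effect sums of squares (from an m-factor ANOVA) into
// percent impact on total variation. The error term is included in the total.
std::map<std::string, double>
percent_impact(const std::map<std::string, double>& sum_of_squares)
{
    double total = 0.0;
    for (const auto& [effect, ss] : sum_of_squares) total += ss;

    std::map<std::string, double> impact;
    for (const auto& [effect, ss] : sum_of_squares)
        impact[effect] = 100.0 * ss / total;
    return impact;
}

int main()
{
    // Illustrative values only, not data from this study.
    auto impact = percent_impact({ {"kSize", 48.0}, {"nThreads", 8.5},
                                   {"MAP", 38.1}, {"error", 0.3} });
    for (const auto& [effect, pct] : impact)
        std::printf("%-10s %5.1f%%\n", effect.c_str(), pct);
}
```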

SCSP                    SCSP (Lagrangian)          SCMP                      MCMP
kSize             48.0  kSize               70.2   kSize              68.7   nThreads           76.5
MAP               38.1  nThreads            25.3   nThreads           26.3   kSize              19.7
nThreads           8.5  nThreads:kSize       4.4   nThreads:kSize      4.7   nThreads:kSize      2.4
MAP:kSize          3.7
Total             98.3  Total               99.9   Total              99.7   Total              98.6

Table 3.5: Single-GPU - Percent impact of primary effects on total variation in performance

Table 3.5 lists the contribution percentages of the factors from Table 3.3 that most impact the total variation in performance for each of the single-GPU implementations. In the case of the SCSP implementation, differing memory access patterns account for 38% of the variation in performance. However, as only the highest-performing memory access pattern (MAP), the Lagrangian pattern, is implemented universally, as discussed in Section 3.3.2, the results of an additional ANOVA analysis focusing only on the Lagrangian memory access pattern are also shown. By removing the memory access patterns as a factor, we find that the remaining single-factor bottlenecks of lattice size and the number of threads per block account for over 98% of the total variation in performance. Based on this information, it is reasonable for users of GPU

lattice-Boltzmann codes looking to improve performance to maximize lattice size as well as orient the lattice such that the x dimension is the largest dimension.

The validity of these results is confirmed by using the NVIDIA occupancy calculator. Bailey et al. found occupancy to be the major indicator of performance for lattice-Boltzmann GPU codes (Bailey et al., 2009). When examining these codes with the NVIDIA occupancy calculator, the only factor that changes in the occupancy calculation is the number of threads per block. However, the occupancy calculator indicates that reducing the number of registers per block would result in a greater improvement in occupancy. Because the number of registers per block remains constant across tested codes, registers per block would not be a contributing factor to performance within the set of tested codes. The number of registers per block does remain as an avenue to increased performance if a method to decrease register usage can be found.

The analyses of the SCMP and MCMP implementations reveal that the same bottlenecks to performance in the SCSP case still apply. In contrast, the relative importance of lattice size and the number of threads per thread block is flipped in the SCMP and MCMP cases. This is not to say that the importance of either factor is significantly diminished in either case. Instead, all of these results seem to imply that these single-GPU lattice-Boltzmann fluid flow implementations fall into an area in the GPU parameter space where problem size and the number of threads per thread block are the primary factors to be examined when optimizing for performance. Within the parameter space, the SCMP and MCMP implementations seem to lie across some equilibrium point from the SCSP implementation.

When considering the results of the ANOVA analysis of the multi-GPU implementations, shown in Table 3.6, in each case four effects are the source of over 98% of total variation in performance. These effects are the same as those that primarily impact the single-GPU case, with the inclusion of the number of GPUs involved in computation. Additionally, the impacts of each effect are practically uniform across every case. The increase in performance corresponding to these effects is seen in Figures 3.14 and 3.15.

Comparing results from single- and multi-GPU implementations yields an interesting realization regarding the architectural considerations that enter GPU computing. In the single-GPU case, the primary effects that bottleneck performance are the number of threads per block (nThreads) and the size of the lattice being solved (kSize). Interestingly, these effects are associated with the geometry of the simulation.

SCSP                   SCMP                   MCMP
NP              47.7   NP              49.1   NP              54.1
nThreads        36.9   nThreads        38.4   nThreads        33.8
kSize            9.1   kSize            8.0   kSize            7.7
nThreads:NP      5.1   nThreads:NP      3.3   nThreads:NP      3.6
Total           98.8   Total           98.8   Total           99.2

Table 3.6: Multi-GPU - Percent impact of primary effects on total variation in performance

The inclusion of the number of GPUs involved in computation in the multi-GPU implementations is no exception, as each additional GPU contributes another lattice of “kSize” in some manner defined by the decomposition. The shared primary effects between the single- and multi-GPU implementations that bottleneck performance have little room for improvement due to architectural limitations of the GPU; the available shared memory on the GPU inhibits the number of threads per block, and the available global memory limits the size of the lattice the GPU is capable of storing. The remaining effects provide limited room for improvement as, combined, they account for slightly more than one percent of the total variation in performance.

The use of multiple GPUs somewhat circumvents the architectural limitations in single-GPU simulations. The number of GPUs involved in computation is found to dominate the total variation in performance, alongside the number of threads per thread block. This means that computational scientists unfamiliar with GPU computing would be able to apply their existing parallel programming knowledge, with only superficial architectural details regarding GPUs, to implement an effective multi-GPU code, at least for memory-bandwidth-bound codes such as lattice-Boltzmann methods. Specific insights into GPU architecture and its effects on computation simply do not contribute enough to the total variation in performance to make them an impediment. Therefore, resources and efforts can be focused on simply increasing the number of GPUs rather than optimizing the performance of a smaller number of GPUs.

Memory-Bound Efficiency and Utilization

Ryoo et al. (2008a) use the derived metrics of Efficiency and Utilization (Equations 3.8 and 3.9) to show that it is possible to find a subset of high-performing GPU kernels from the set of available kernels. Ryoo et al. (2008a) find the high-performing subset of kernels using optimization carving, consisting of threshold carving and tradeoff carving. Threshold carving removes kernels that violate some aspect of performance, for example by requiring too much memory bandwidth. Given the fact that lattice-Boltzmann codes are strongly dependent on memory bandwidth, threshold carving is disregarded entirely here. This leaves the option of using tradeoff carving with a Pareto-optimal search. An ANOVA test is performed with the purpose of determining whether this procedure is truly incompatible with our purposes.

In the kernels Ryoo et al. (2008a) considered, Efficiency and Utilization were both important. By normalizing and plotting the Utilization and Efficiency values of varying kernel configurations against each other, one should find high-performing kernels among the right- or top-most regions of the plot. Restated, the highest performing kernels should lie on the upper-right hull of the plot (i.e., the Pareto-optimal curve). However, this is not always the case. Although infrequent, it is possible for the highest performing kernel to lie some distance away from the Pareto-optimal curve. Unfortunately, all of the high-performing lattice-Boltzmann kernels from every implementation lie some distance from the Pareto-optimal curve, thereby reducing the applicability of the tradeoff carving method.

SCSP                    SCSP (Lagrangian)           SCMP                     MCMP
MAP               39.2  kSize                50.4   Utilization        44.5  Utilization        36.1
kSize             38.0  Utilization          36.2   Efficiency         27.2  Efficiency         32.1
Utilization       11.6  Efficiency           10.1   kSize              27.2  kSize              30.4
MAP:kSize          6.5  Utilization:kSize     2.4
Total             95.3  Total                99.1   Total              98.9  Total              98.6

Table 3.7: Single-GPU - Percent impact of primary effects on total variation in performance

The low applicability of the tradeoff carving method is verified by the results of an ANOVA analysis.

SCSP                    SCMP                    MCMP
NP               47.4   Utilization      32.7   NP               41.9
Utilization      26.8   NP               32.5   Utilization      27.1
kSize            11.4   kSize            23.5   kSize            25.3
Efficiency       10.8   Efficiency        7.5   Efficiency        4.6
Total            96.4   Total            96.2   Total            98.9

Table 3.8: Multi-GPU - Percent impact of primary effects on total variation in performance

For the ANOVA results to agree with the Pareto-optimal search method, both Efficiency and Utilization should be statistically significant contributing factors to performance. However, we find that only Utilization is a notable effect in every single- and multi-GPU simulation in Tables 3.7 and 3.8. Across all simulations, Utilization is typically found to be responsible for between 26.8% and 36.2% of the total variation. However, in the exceptional case of the single-GPU SCMP implementation, Utilization is the primary bottleneck, constituting 44.5% of the total variation in performance. Unlike Utilization, Efficiency is not represented well across all simulation types. In four of the six implementations, Efficiency is shown to represent less than half the total variation of Utilization. It is only in the single-GPU SCMP and MCMP simulations that Efficiency has a more noteworthy effect, at 27.2% and 32.1% of the total variation, respectively.

Importantly, the results in Tables 3.7 and 3.8, indicating that Utilization plays a more important role than Efficiency, are in agreement with what one should expect from memory-bandwidth-bound kernels. With these kernels it is less important to minimize the number of instructions executed across all threads than to make use of available GPU compute resources to mask memory transfers. In conjunction with the finding that there is minimal interaction between the effects of Utilization and Efficiency, it can be concluded that the statement by Ryoo et al., that program optimization carving is not valid for memory-bandwidth-bound kernels, is accurate. Finally, this suggests that developers considering these metrics should focus on increasing Utilization over Efficiency to achieve better performance when dealing with memory-bandwidth-bound kernels. This can be achieved by reducing the number of synchronization regions, by increasing the block size, or by reducing register use to increase the number of blocks executed per multiprocessor.

3.5 Related Work

Many parallel implementations of lattice Boltzmann simulations have been produced for CPU clusters, as well as single and clustered GPUs. Unless otherwise noted, GPU architectures use single-precision floating point, and other architectures use double-precision. Li et al. (2005) used a single GPU to run a D3Q19 lattice Boltzmann simulation, yielding a speedup of 15.3 over a CPU implementation. The maximum processing rate of the single GPU was 3.8 MLUPS. Fan et al. (2004) used GPUs to accelerate lattice Boltzmann simulations, resulting in a speedup of 21.4 over a cluster of an equal number of CPUs. The maximum processing rate of the GPU cluster was 49.2 MLUPS. Pohl et al. (2004) ran D3Q19 LBM simulations on three clusters, each with 64 processors. On an Itanium2 cluster, 64 nodes achieved 180 MLUPS. Implementations utilizing CUDA tools have fared better; Tölke and Krafczyk (2008) ported a D3Q13 LBM simulation to an NVIDIA GeForce 8800 Ultra GPU using CUDA tools, reaching 592 MLUPS. Habich (2008), also using CUDA, produced a D3Q19 LBM simulation that achieved 250 MLUPS on a single GeForce 8800 GTX GPU.

The D3Q13 model (Tölke and Krafczyk, 2008) is simpler than the D3Q19 model used in this study in that it tracks only 13 velocity vectors at each node, instead of 19. Also, due to the interconnection pattern of nodes in a D3Q13 lattice, a grid of nodes can be decomposed into two independent sub-grids, thus reducing computational requirements by 50% if only one sub-grid is simulated. However, even if both sub-grids are simulated, the effective resolution of fluid flow is less than that of the D3Q19 layout in which all nodes are interconnected. Consequently, D3Q19 LBM implementations are frequently preferred since they produce higher quality scientific results, particularly when highly-resolved flow vector fields in complex pore spaces are desired (Walsh et al., 2009a).

Compared to the D3Q13 implementation described in (Tölke and Krafczyk, 2008), our implementation uses a similar percentage of theoretical GPU memory bandwidth (54.3% vs. 61%, respectively). When "register spillage" is taken into account, the percentage utilization of theoretical bandwidth is the same as in (Tölke and Krafczyk, 2008). This difference in memory bandwidth utilization is attributable to three main factors: 1) the addition of six additional fluid packets per lattice node for the D3Q19

model and their associated computation and bandwidth requirements, 2) the difference in memory clock speeds (1080 MHz vs. 900 MHz) between the NVIDIA 8800 Ultra used by Tölke and Krafczyk (2008) and the NVIDIA 8800 GTX used in this implementation, and 3) "register spillage," a phenomenon that occurs when the CUDA compiler successfully implements a program in a user-specified number of registers, but uses some global memory as a substitute for additional registers. Register spillage has the potential to increase multiprocessor occupancy, thus increasing parallelism and latency-hiding. Unfortunately, there is a trade-off involved; global memory access is much slower than register access (~400 clock cycles vs. ~1 clock cycle). In the "A-B" kernel of this implementation, 32 registers are used for each kernel, but global memory replaces the equivalent of four registers per thread, reducing the effective bandwidth for fluid packet transfers by 4/(32 + 4) = 11%. The "register spillage" witnessed in this application is due to the increased complexity of the D3Q19 model. If 36, rather than 32, registers were used per thread, multiprocessor occupancy would suffer at some block sizes (block size = lattice X dimension). Assuming that global memory bandwidth, rather than floating point calculation, is the limiting factor, the estimated performance when converting from the D3Q13 implementation in (Tölke and Krafczyk, 2008) to a D3Q19 implementation is 592 MLUPS × 13/19 × (900 MHz)/(1080 MHz) × 32/(32 + 4) = 300 MLUPS. Thus, our implementation, resulting in 300 MLUPS, achieves 100% of the estimated peak performance when limited by memory bandwidth.

Multiple studies cite cache effects and relatively low memory bandwidth as reasons for sub-optimal LBM performance on general-purpose processors (Wellein et al., 2006; Pohl et al., 2004). Vector machines avoid such problems by design. Wellein et al. (2006) and Pohl et al. (2004) used an LBM code to compare the performance of various CPU-based systems with that of vector machines. In these studies it was discovered that vector processors are particularly suited for LBM computations, and that general-purpose processors are adversely affected by cache behavior. With no cache, a single NEC SX6 vector processor outperformed a single Itanium2 processor by a factor of 8.

The graphics processing unit (GPU) architecture (Single Instruction, Multiple Thread) is similar to that of a vector machine (Single Instruction, Multiple Data). GPUs are designed for high pixel throughput, which requires high memory bandwidth for large data sets. By maintaining high bandwidth to main memory, GPUs avoid

some limitations imposed by cache size and cache effects typical of general-purpose processors. For example, an application may perform poorly on a CPU due to cache capacity misses or cache thrashing. The same application may perform better on a GPU, because the GPU has no cache that fills the same role.

3.6 Conclusions and Future Work

General purpose graphics processing units provide a single-instruction multiple-data (SIMD) environment well suited to the lattice-Boltzmann method. Here we have demonstrated that the performance gains seen in single general purpose graphics processing unit (GPGPU) implementations scale favorably when applied to multiple GPUs employing OpenMP on a shared memory system. Three lattice-Boltzmann models were considered: the standard single-component single-phase model, the multiphase model of Shan and Chen (1993), and the multicomponent model of He and Doolen (2002).

We show that a GPU-based D3Q19 lattice Boltzmann simulation of fluid flow can achieve 300 MLUPS using one pattern of execution, and efficient utilization of GPU RAM using another pattern of execution. Both patterns result in speedups of more than 28x over a quad-core CPU implementation using OpenMP. The increase in performance over a previous D3Q19 GPU implementation (Habich, 2008) is due to increased GPU multiprocessor occupancy. Based on memory bandwidth utilization, our D3Q19 simulation is as efficient as a D3Q13 GPU implementation (Tölke and Krafczyk, 2008), and models fluid flow at a higher level of detail.

The following properties of the GPU architecture were investigated with Analysis of Variance (ANOVA): the number of PTX instructions, the number of threads per thread block, the overall lattice size, and the number of processors. For memory-bound GPU kernels, such as lattice-Boltzmann simulations, the ANOVA analysis showed that effects corresponding with easily understandable aspects of GPU programming (number of threads per thread block and overall problem size) account for the largest percentage of variation in performance (>97%). This result holds across all lattice-Boltzmann simulation types tested (single-component single-phase, single-component multiphase, and multicomponent multiphase). We therefore conclude that the necessity of understanding GPU architectures is reduced through multi-GPU implementations. In such multi-GPU cases, more fundamental parallel programming

concepts (e.g., problem size and the degree of multiprogramming at the thread and processor level) dominate.

This study also considered two derived performance metrics described by Ryoo et al. (2008a,b): Efficiency and Utilization. We show that neither threshold nor tradeoff carving is applicable to lattice-Boltzmann codes, as none of our high-performing kernels lie near the Pareto-optimal curve. Our analysis also shows that only the Utilization metric results in a meaningful effect across all lattice-Boltzmann implementations. Finally, the finding that there is no statistically significant interaction between Efficiency and Utilization to provide any meaningful variation in performance agrees with the statement of Ryoo et al. (2008b) that their program optimization carving method is not applicable to bandwidth-bound GPU kernels.

Ultimately, the ANOVA results presented here can be approached by adopting an Amdahl's Law (Amdahl, 1967) perspective. The effects that provide the largest percentage of total variation in performance are those that will provide the greatest return when improved. To that end, focusing on lattice size and orientation (in relation to the overall lattice size and the number of threads per thread block) will yield the greatest improvements in performance for single-GPU lattice-Boltzmann implementations. We show that an even greater performance improvement is available with the introduction of multiple GPUs in a shared memory context. Additional investigations would indicate to what degree multiple GPUs would be useful on large-scale shared memory systems. Further work should also explore the performance of lattice-Boltzmann codes on large-scale distributed GPU clusters.

Programming in NVIDIA's CUDA environment is a significant departure from traditional parallel environments, such as MPI or OpenMP. Failure to adhere to a few key strategies, such as global memory coalescence, avoiding divergent warps, and maximizing occupancy, can result in lackluster performance. However, memory coalescence requirements have been reduced in NVIDIA's more recent GPUs. Despite the additional difficulty, GPUs offer substantial performance benefits to a wide variety of applications, e.g., (Krafczyk and Tölke, 2007; Li et al., 2005; Nyland and Prins, 2007; Owens et al., 2007; Walsh et al., 2009a), and CUDA allows more precise utilization of graphics hardware than other alternatives, including OpenGL, Rapidmind (Rapidmind, 2007), and BrookGPU (Buck et al., 2003).

Chapter 4

Karst Conduit Flow

This chapter is composed of one publication currently in preparation: J.M. Myre, M.D. Covington, A. Luhmann, M.O. Saar, A fast GPGPU driven model of thermal transport in karst conduits, to be submitted to Environmental Modelling and Software.

When studying karst springs, researchers commonly strive to derive flow path information from the hydraulic, thermal, and chemical responses. Knowing the geometric makeup of a karst aquifer system makes it possible to predict flow and transport through the conduits and the larger system. However, the relationships between karst spring responses and flow path geometry are not well understood. In an effort to improve that understanding, several karst conduit models have been developed. Simulations using standard methods and technologies are time-consuming, making thorough characterization of karst systems problematic. This study has produced a high-performance General Purpose Graphics Processing Unit (GPGPU) implementation of heat and solute transport models. Existing implementations of the underlying linear algebra routines are tested to determine the best performing numerical solver for the GPGPU, and it is shown that use of GPGPUs significantly reduces the execution time required to complete a single simulation, frequently by more than a factor of 100. This high-performance GPGPU model is verified by simulating synthetic Gaussian storm pulses and comparing the results to those from a previous study. The validity of using this model to characterize simple karst conduits in natural systems is demonstrated by reproducing observed thermal and conductivity signals from Tyson Spring Cave. The performance demonstrated by this GPGPU implementation in these tests shows it to be an excellent candidate for high-speed characterization of karst conduit systems.

4.1 Introduction

The conduits that form in karst aquifers act as high-speed preferential pathways for groundwater, often carrying flows of many cubic meters per second. These conduits can act as a significant supply of water to the regional population; such populations include 20-25% of the Earth's human population (Ford and Williams, 2007). Due to the nature of karst landscapes, these aquifers can easily be polluted by external contaminants via injection through sinkholes, open fractures, shafts, and sinking surface streams. The region affected by pollution due to contaminant injection can quickly grow beyond the injection site to the regional scale. Once contaminants are injected into karst conduits they can be transported long distances without significant dilution (Vesper et al., 2001). The internal geometry of karst conduits imposes significant controls on the transport response of the conduits to input signals (Covington et al., 2012).

Figure 4.1: A high level overview of the complexities present in karst landscapes. Reproduced and modified from (Lascu and Feinberg, 2011).

Knowledge of the internal geometry of a karst conduit system is essential to adequately model flow and transport through the karst groundwater system (Jaquet et al., 2004). For some time the karst research community has pursued the goal of simulating the behavior of karst groundwater systems. Currently, this remains a long-term

and not-yet-achieved goal given the complexity of processes and geometries present in karst systems. To cope with the complexity of karst aquifers, many studies instead simplify karst aquifers to a pure conduit representation. By studying the smaller conduits that comprise the larger system, many problems become much simpler, enabling the application of analytical (Covington et al., 2012) and numerical solvers (Birk, 2002; Dreybrodt, 1990; Gabrovšek and Dreybrodt, 2001; Liedl et al., 2003; Long and Gilcrease, 2009; Luhmann, 2011; Rehrl et al., 2008). Recent work has begun constraining the geometry of karst conduits based on known input and output signals of benign tracers (Covington et al., 2011).

Although focusing on karst conduits allows many problems to become tractable, the numerical simulations typically remain very costly to solve computationally (Covington et al., 2011; Pardo-Igúzquiza et al., 2012; Spellman et al., 2011). These numerical characterizations can exhibit prohibitive simulation times ranging from the tens of minutes scale for small scale simulations to multi-hour or even day scales for more complex simulations. When coupled with the fact that it is often necessary to complete many simulations to perform a complete characterization of a physical system, the prohibitive nature of these time scales quickly becomes apparent.

The recent rise in the suitability of General Purpose Graphics Processing Units (GPGPUs) for general purpose computation has proven to be greatly beneficial in this respect. For many classes of problems, GPGPUs are capable of providing computational performance many times greater than that of a standard CPU (Bailey et al., 2009; Myre et al., 2011; Walsh et al., 2009a). NVIDIA's CUDA (NVIDIA, 2012) and the Khronos group's Open Computing Language (OpenCL) (KhronosGroup, 2011) both provide means by which GPGPUs can be programmed more easily. CUDA, the language used throughout this paper, has been used to show that GPGPUs can provide significant speedup to many common components of scientific codes, including sparse matrix operations (Bell and Garland, 2010) and standard BLAS routines (NVIDIA, 2011a). With these basic scientific routines and NVIDIA's CUDA, researchers have been able to use GPGPUs to create high-performance implementations of spectral finite element methods (Walsh et al., 2009a), least-squares minimization of magnetic force microscopy data (Walsh et al., 2009a), lattice-Boltzmann methods (Bailey et al., 2009; Myre et al., 2011), molecular dynamics simulations (Anderson et al., 2008), and N-body simulations (Nyland and Prins, 2007).

To address the difficulty in characterizing karst conduits and to create a predictive

tool for conduit response, this study uses NVIDIA's CUDA programming language and high-speed GPGPUs to create a high-performance implementation of two models for processes occurring along karst conduits: 1) a dimensionless finite difference heat transport model for karst conduits initially developed by Covington et al. (2011) and 2) a dimensionless solute transport model developed by Birk et al. (2006) and Covington et al. (2010). By testing existing GPGPU linear algebra packages (Chang et al., 2012; NVIDIA, 2013; Zhang et al., 2011, 2010), the highest performing solver is determined to increase the performance of the GPGPU implementation.

We compare the performance and accuracy of this GPGPU model to an equivalent high-performance CPU implementation in C with GotoBLAS (Goto and Geijn, 2008; Goto and Van De Geijn, 2008) using synthetic Gaussian thermal pulses from Luhmann (2011). The ability of the GPGPU implementation to reproduce the Gaussian thermal pulse results from (Luhmann, 2011) also acts as a validation mechanism. The ability of the GPGPU implementation to reproduce observed thermal signals is also tested using measured thermal signals from Tyson Spring Cave in Minnesota. The results obtained from these tests show that this GPGPU implementation of the method from Covington et al. (2011) is an excellent candidate for further high-performance characterization of karst conduit system geometry, ultimately enabling high-speed prediction of karst conduit response to contaminant injection.

4.2 Heat Transport in Karst Conduits

4.2.1 Physical governing equations

For fluid in karst conduits, the primary mechanism of heat exchange with the surrounding environment is exchange with the surrounding rock (Covington et al., 2011), which is a function of convective transfer rates between the water and the conduit walls and conductive transfer rates through the rock surrounding the conduit. The development of the mathematical model representing these processes by Covington et al. (2011) is summarized here due to the foundational framework it provides for understanding the computational model that is developed in Section 4.2.2 for use in the GPGPU implementation.

Figure 4.2: The model for heat exchange between a full-pipe karst conduit and the surrounding rock. Heat passes from the bulk, mixed water at $T_w$ through a convective boundary layer into the conduit wall at temperature $T_s^*$. Heat then conducts through the rock body (initially at $T_r^* = 1$), influencing an increasing volume of rock as the duration of a thermal pulse increases. (Adapted from Covington et al. (2011).)

The convective and conductive transfer rates can be represented by the coupling of a transient heat advection-dispersion equation for transport along a karst conduit,

\[
\frac{\partial T_w}{\partial t} = D_L \frac{\partial^2 T_w}{\partial x^2} - V(t)\,\frac{\partial T_w}{\partial x} + \frac{4 h_{conv}}{\rho_w c_{p,w} D_H}\,(T_s - T_w), \tag{4.1}
\]

and a two-dimensional heat conduction equation, in cylindrical coordinates

\[
\frac{1}{r}\frac{\partial}{\partial r}\!\left(r\,\frac{\partial T_r}{\partial r}\right) + \frac{\partial^2 T_r}{\partial x^2} = \frac{1}{\alpha_r}\frac{\partial T_r}{\partial t}, \tag{4.2}
\]

where $T_w$, $T_s$, and $T_r$ are the water, conduit wall, and rock temperatures, respectively, $t$ is time, $x$ is the distance along the conduit, $r$ is the radial distance from the center of the conduit, $D_L$ is the longitudinal dispersivity, $V(t)$ is the flow velocity, $h_{conv}$ is the convective heat transfer coefficient, $\rho_w$ is the density of water, $c_{p,w}$ is the specific heat capacity of water, $D_H$ is the conduit hydraulic diameter, and $\alpha_r = k_r/(\rho_r c_{p,r})$ is the rock thermal diffusivity, where $k_r$ is the rock thermal conductivity, $\rho_r$ is the rock density, and $c_{p,r}$ is the specific heat capacity of the rock. The dimensionless forms of Equations 4.1 and 4.2 are obtained as

\[
\frac{\partial T_w^*}{\partial t^*} = \frac{1}{Pe}\frac{\partial^2 T_w^*}{\partial x^{*2}} - \frac{\partial}{\partial x^*}\!\left[\frac{V(t)}{\bar V}\,T_w^*\right] + St\,(T_s^* - T_w^*), \tag{4.3}
\]

and
\[
\nabla_r^2 T_r^* = \frac{1}{r^*}\frac{\partial}{\partial r^*}\!\left(r^*\,\frac{\partial T_r^*}{\partial r^*}\right) = \Theta\,\frac{\partial T_r^*}{\partial t^*}, \tag{4.4}
\]

respectively, where $T_w^* = T_w/T_{r,0}$ is the dimensionless water temperature, $T_s^* = T_s/T_{r,0}$ is the dimensionless conduit wall temperature, $T_r^* = T_r/T_{r,0}$ is the dimensionless rock temperature at the dimensionless radial coordinate $r^*$, $T_{r,0}$ is the initial rock temperature (rock temperature at infinity or steady state background temperature), $t^* = t\bar V/L$ is dimensionless conduit flow time, $x^* = x/L$ is dimensionless conduit distance, $r^* = r/R$ is the dimensionless radial coordinate, $R$ is the conduit radius, and $V(t)$ and $\bar V$ are the conduit flow velocity and time-averaged flow velocity. Equation 4.3 also contains two dimensionless numbers, Pe and St. Pe is the Peclet Number for longitudinal dispersion,

\[
Pe = \frac{L \bar V}{D_L}, \tag{4.5}
\]

and St is the Stanton Number, which is the ratio of heat flux into the conduit wall

76 to the heat flux along the conduit wall and is given by

\[
St = \frac{4 h_{conv} L}{\rho_w c_{p,w} D_H \bar V}. \tag{4.6}
\]

The convective heat transfer coefficient, $h_{conv}$, is

\[
h_{conv} = \frac{k_w\,Nu}{D_H}, \tag{4.7}
\]

where $k_w$ is the thermal conductivity of water and Nu is the Nusselt number, which is the ratio of convective to pure conductive heat transfer across the conduit wall. For turbulent flow within a conduit, Nu is obtained via the empirically derived Gnielinski correlation (Equation 8.62 from Incropera et al. (2007)),

\[
Nu = \frac{(f/8)(Re - 1000)\,Pr}{1 + 12.7\,(f/8)^{1/2}\,(Pr^{2/3} - 1)}, \tag{4.8}
\]
where $f$ is the Darcy-Weisbach friction factor, $Re = \rho_w V D_H/\mu_w$ is the Reynolds number, $Pr = c_{p,w}\mu_w/k_w$ is the Prandtl number, and $\mu_w$ is the dynamic viscosity of water.

Equation 4.4 also contains the dimensionless quantity $\Theta$, which is a dimensionless constant that scales between the conduction and advection time scales and is given by
\[
\Theta = \frac{R^2 \bar V}{L \alpha_r}. \tag{4.9}
\]

The dimensionless Equations 4.3 and 4.4 are subject to the boundary conditions (also seen in Figure 4.2)
\[
T_w^*(0, t) = f(t), \tag{4.10}
\]
\[
\left.\frac{\partial T_w^*}{\partial x^*}\right|_{x^*=1} = 0, \tag{4.11}
\]
\[
\frac{\partial T_r^*}{\partial r^*} \rightarrow 0 \;\; \text{as} \;\; r^* \rightarrow \infty, \tag{4.12}
\]
and
\[
\left.\frac{\partial T_r^*}{\partial r^*}\right|_{r^*=1} = \frac{1}{2}\,\Theta\,\Psi\,St\,(T_s^* - T_w^*), \tag{4.13}
\]

where $\Psi = (\rho_w c_{p,w})/(\rho_r c_{p,r})$ is a ratio of the volumetric heat capacities of water and rock. Initial conditions are considered to be the steady state background temperature of the water and rock, given by $T_w^*(x, 0) = T_r^*(x, 0) = 1$.
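As a compact illustration of how these dimensionless groups are assembled from physical inputs, the sketch below evaluates Equations 4.5-4.9 and the Gnielinski correlation. It is a hedged example only: the structure and parameter names are assumptions for illustration and are not taken from the model source code.

```cpp
#include <cmath>

// Hedged sketch of Equations 4.5-4.9 with illustrative parameter names.
// All inputs are dimensional SI quantities supplied by the caller.
struct ConduitProps {
    double L, R, D_H;               // conduit length, radius, hydraulic diameter [m]
    double f;                       // Darcy-Weisbach friction factor [-]
    double V_bar;                   // time-averaged flow velocity [m/s]
    double D_L;                     // longitudinal dispersivity [m^2/s]
    double rho_w, cp_w, k_w, mu_w;  // water density, heat capacity, conductivity, viscosity
    double rho_r, cp_r, k_r;        // rock density, heat capacity, conductivity
};

struct DimensionlessGroups { double Re, Pr, Nu, h_conv, Pe, St, Theta; };

DimensionlessGroups dimensionless_groups(const ConduitProps& p)
{
    DimensionlessGroups g;
    g.Re = p.rho_w * p.V_bar * p.D_H / p.mu_w;                          // Reynolds number
    g.Pr = p.cp_w * p.mu_w / p.k_w;                                     // Prandtl number
    // Gnielinski correlation for turbulent flow (Eq. 4.8).
    g.Nu = (p.f / 8.0) * (g.Re - 1000.0) * g.Pr /
           (1.0 + 12.7 * std::sqrt(p.f / 8.0) * (std::pow(g.Pr, 2.0 / 3.0) - 1.0));
    g.h_conv = p.k_w * g.Nu / p.D_H;                                    // Eq. 4.7
    g.Pe = p.L * p.V_bar / p.D_L;                                       // Eq. 4.5
    g.St = 4.0 * g.h_conv * p.L / (p.rho_w * p.cp_w * p.D_H * p.V_bar); // Eq. 4.6
    const double alpha_r = p.k_r / (p.rho_r * p.cp_r);                  // rock thermal diffusivity
    g.Theta = p.R * p.R * p.V_bar / (p.L * alpha_r);                    // Eq. 4.9
    return g;
}
```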

4.2.2 Computational Model

To solve Equations 4.3 and 4.4, a coupled implicit finite difference algorithm is used. For each timestep of a discretized time series of an event provided as input, the algorithm will first solve for rock temperatures and then for water temperatures. Rock temperatures are calculated using the rock conduction equation (Equation 4.4) using the initial water temperatures during the first time step and the most recently calculated water temperatures during every following time step. After solving for the rock temperatures, water temperatures are calculated using the advection-dispersion-heat equation (Equation 4.3). The output of the conduit is recorded for each time step to provide a time series of the simulated event output. The discretized system geometry used in this algorithm is shown in Figure 4.3. The conduit is a 1D system while the rock is a 2D system. However, since conduction in the rock along the x* dimension is neglected, the 2D rock system can instead be considered as a series of 1D systems in the r* dimension, where each 1D rock system is coupled to a corresponding fluid node in the conduit via the interfacing surface node. To first solve for rock temperatures, the dimensionless governing equations from Section 4.2.1 are discretized using the implicit finite difference scheme described by Incropera et al. (2007). For the first step of the algorithm, solving for rock temperatures, the relations from Incropera et al. (2007) describing internal rock nodes and conduit surface nodes are combined. For internal rock nodes the relation is:

\[
T_{i,j}^{k} = (1 + 2Fo)\,T_{i,j}^{k+1} - Fo\,(T_{i-1,j}^{k+1} + T_{i+1,j}^{k+1}), \tag{4.14}
\]

where i is the index along the r direction (perpendicular to the conduit), j is the index for the x direction (along the conduit), and k is the time index. The rock surface node temperature relation provided by Incropera et al. is:

\[
T_{0,j}^{k} + 2\,Fo\,Bi\,T_{w,j}^{k} = (1 + 2Fo + 2\,Fo\,Bi)\,T_{0,j}^{k+1} - 2\,Fo\,T_{1,j}^{k+1}. \tag{4.15}
\]

Note that the water temperature $T_{w,j}^{k}$ is of the current time step ($k$) instead of the next time step ($k+1$). For all of these relations, Fo is the Fourier number and Bi is the Biot number.

Figure 4.3: A) A simple schematic showing the discretized 1-D conduit and 2-D rock system geometries. White nodes are fluid nodes, light grey nodes are surface nodes, and dark grey nodes are internal rock nodes. The grey and hash-marked backgrounds correspond to rock and fluid, respectively. B) The matrix containing the corresponding terms for the schematic in Figure 4.3A, where the bottom two rows ($T_{w,j}$ and $T_{0,j}$ terms) contain conduit and surface terms, respectively. The remaining rows contain internal rock node terms:
\[
\begin{pmatrix}
T_{M-1,0} & T_{M-1,1} & \cdots & T_{M-1,N-2} & T_{M-1,N-1}\\
T_{M-2,0} & T_{M-2,1} & \cdots & T_{M-2,N-2} & T_{M-2,N-1}\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
T_{2,0} & T_{2,1} & \cdots & T_{2,N-2} & T_{2,N-1}\\
T_{1,0} & T_{1,1} & \cdots & T_{1,N-2} & T_{1,N-1}\\
T_{0,0} & T_{0,1} & \cdots & T_{0,N-2} & T_{0,N-1}\\
T_{w,0} & T_{w,1} & \cdots & T_{w,N-2} & T_{w,N-1}
\end{pmatrix}
\]

The Fourier number,
\[
Fo = \frac{\Delta t^*}{\Theta\,(\Delta r^*)^2}, \tag{4.16}
\]
represents the ratio of the heat conduction rate to the heat storage rate in the rock, and the Biot number,
\[
Bi = \frac{h R\,\Delta r^*}{k_r}, \tag{4.17}
\]
is the ratio of the heat conduction resistance of the rock to the heat convection resistance of the water, where $\Delta t^*$ and $\Delta r^*$ are the discretized dimensionless unit

time and radial distance, respectively, $\Theta$ is a dimensionless ratio of conduction and advection timescales, $h$ is the heat transfer coefficient, $R$ is the radius of the conduit,

and $k_r$ is the rock thermal conductivity. Forming the product of the Fourier and Biot numbers gives

\[
Fo\,Bi = \frac{\Psi\,St\,\Delta t^*}{2\,\Delta r^*}, \tag{4.18}
\]

where $\Psi$ is the ratio of the volumetric heat capacities of water and rock and St is the Stanton number, which is the ratio of heat flux into the conduit wall to the heat flux along the conduit wall. A tridiagonal system of equations ($AX = B$) can be formed using the discretization from Figure 4.3 and Equations 4.14 and 4.15. The tridiagonal structure of this system with applied boundary conditions is

A = (1 + 2F o + 2F oBi) −2F o 0 0     −F o (1 + 2F o) −F o 0 ···     0 −F o (1 + 2F o) −F o 0     . . .   . .. . .      0 −F o (1 + 2F o) −F o 0     ··· 0 −F o (1 + 2F o) −F o   0 0 0 1

The top and bottom rows are adjusted to implement the water-rock boundary condition at the conduit wall (Equation 4.13) and the constant temperature boundary condition at large $r$ (Equation 4.12), respectively. $A$ is an $M \times M$ matrix and $X$ is an $M \times N$ matrix, where $N$ and $M$ are the numbers of fluid and rock nodes, respectively. The remaining components of the system representing the rock matrix are the solution matrix

\[
X = \begin{pmatrix}
T_{0,0}^{k+1} & T_{0,1}^{k+1} & \cdots & T_{0,N-2}^{k+1} & T_{0,N-1}^{k+1}\\
T_{1,0}^{k+1} & T_{1,1}^{k+1} & \cdots & T_{1,N-2}^{k+1} & T_{1,N-1}^{k+1}\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
T_{M-2,0}^{k+1} & T_{M-2,1}^{k+1} & \cdots & T_{M-2,N-2}^{k+1} & T_{M-2,N-1}^{k+1}\\
T_{M-1,0}^{k+1} & T_{M-1,1}^{k+1} & \cdots & T_{M-1,N-2}^{k+1} & T_{M-1,N-1}^{k+1}
\end{pmatrix}
\]
and the current system state,

\[
B = \begin{pmatrix}
T_{0,0}^{k} + 2FoBi\,T_{w,0}^{k} & T_{0,1}^{k} + 2FoBi\,T_{w,1}^{k} & \cdots & T_{0,N-1}^{k} + 2FoBi\,T_{w,N-1}^{k}\\
T_{1,0}^{k} & T_{1,1}^{k} & \cdots & T_{1,N-1}^{k}\\
\vdots & \vdots & \ddots & \vdots\\
T_{M-2,0}^{k} & T_{M-2,1}^{k} & \cdots & T_{M-2,N-1}^{k}\\
T_{M-1,0}^{k} & T_{M-1,1}^{k} & \cdots & T_{M-1,N-1}^{k}
\end{pmatrix}.
\]
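A minimal CPU reference sketch of this first (rock-temperature) step is shown below: it assembles the constant tridiagonal coefficients from Fo and Bi (Equations 4.14 and 4.15) and solves one radial column with the Thomas algorithm. The function and variable names are illustrative assumptions, not the model's actual source; in the GPGPU implementation the N independent column solves are instead batched onto the device using the existing GPU linear algebra routines discussed in Section 4.1.

```cpp
#include <cstddef>
#include <vector>

// Hedged CPU reference sketch of the rock-temperature step (Eqs. 4.14-4.15):
// one tridiagonal solve per fluid node j, using the Thomas algorithm.
// T_r holds the dimensionless radial rock column (index 0 = conduit-wall surface
// node, last index = far-field node); T_w is the water temperature at node j
// from the current time step k.
void solve_rock_column(std::vector<double>& T_r, double T_w,
                       double Fo, double Bi)
{
    const std::size_t M = T_r.size();
    std::vector<double> a(M), b(M), c(M), d(M);   // sub-, main, super-diagonal, RHS

    // Surface node (row 0), from Eq. 4.15.
    a[0] = 0.0; b[0] = 1.0 + 2.0 * Fo + 2.0 * Fo * Bi; c[0] = -2.0 * Fo;
    d[0] = T_r[0] + 2.0 * Fo * Bi * T_w;

    // Internal rock nodes, from Eq. 4.14.
    for (std::size_t i = 1; i + 1 < M; ++i) {
        a[i] = -Fo; b[i] = 1.0 + 2.0 * Fo; c[i] = -Fo;
        d[i] = T_r[i];
    }

    // Far-field node: constant background temperature (Eq. 4.12).
    a[M - 1] = 0.0; b[M - 1] = 1.0; c[M - 1] = 0.0; d[M - 1] = T_r[M - 1];

    // Thomas algorithm: forward elimination, then back substitution.
    for (std::size_t i = 1; i < M; ++i) {
        const double w = a[i] / b[i - 1];
        b[i] -= w * c[i - 1];
        d[i] -= w * d[i - 1];
    }
    T_r[M - 1] = d[M - 1] / b[M - 1];
    for (std::size_t i = M - 1; i-- > 0; )
        T_r[i] = (d[i] - c[i] * T_r[i + 1]) / b[i];
}
```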

The second step in the algorithm requires solving the advection-dispersion-heat exchange equation. Again, the Incropera et al. relations are used to obtain an implicit finite difference discretization scheme. Using the relations from Incropera et al. (2007), Equation 4.3 becomes
\[
\frac{T_{w,j}^{k+1} - T_{w,j}^{k}}{\Delta t^*} = \frac{1}{Pe\,(\Delta x^*)^2}\left(T_{w,j+1}^{k+1} - 2T_{w,j}^{k+1} + T_{w,j-1}^{k+1}\right) - \frac{V}{2\bar V\,\Delta x^*}\left(T_{w,j+1}^{k+1} - T_{w,j-1}^{k+1}\right) + St\left(T_{0,j}^{k+1} - T_{w,j}^{k+1}\right), \tag{4.19}
\]
which can be rearranged as

\[
T^{k}_{w,j} + \Delta t^* St\, T^{k+1}_{0,j} = \left(\frac{V \Delta t^*}{2\bar{V}\Delta x^*} - \frac{\Delta t^*}{Pe(\Delta x^*)^2}\right) T^{k+1}_{w,j+1} + \left(1 + \frac{2\Delta t^*}{Pe(\Delta x^*)^2} + \Delta t^* St\right) T^{k+1}_{w,j} + \left(\frac{-\Delta t^*}{Pe(\Delta x^*)^2} - \frac{V \Delta t^*}{2\bar{V}\Delta x^*}\right) T^{k+1}_{w,j-1} \tag{4.20}
\]

to reveal a tridiagonal system of equations. The first, second, and third terms in parentheses in Equation 4.20 are the terms for the upper (U), middle (M), and lower (L) diagonals of the tridiagonal system, respectively. From this, a tridiagonal system of equations, Ax = b, can be formed, where
\[
A = \begin{pmatrix}
M & U & 0 & \cdots & 0 \\
L & M & U & & \vdots \\
& \ddots & \ddots & \ddots & \\
\vdots & & L & M & U \\
0 & \cdots & 0 & -\frac{V\Delta t^*}{\bar{V}\Delta x^*} & M
\end{pmatrix}
\]

The lower diagonal term in the last row is changed to implement the downstream boundary condition where diffusion is set to zero. A is an N × N matrix, where N is the number of fluid nodes. The remaining components of the fluid system are the solution vector and the current system state, which are

\[
x = \begin{pmatrix} T^{k+1}_{w,0} \\ T^{k+1}_{w,1} \\ \vdots \\ T^{k+1}_{w,N-2} \\ T^{k+1}_{w,N-1} \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} T^{k}_{w,0} \\ T^{k}_{w,1} \\ \vdots \\ T^{k}_{w,N-2} \\ T^{k}_{w,N-1} \end{pmatrix}
+ \Delta t^* St \begin{pmatrix} T^{k}_{0,0} \\ T^{k}_{0,1} \\ \vdots \\ T^{k}_{0,N-2} \\ T^{k}_{0,N-1} \end{pmatrix}
- \begin{pmatrix} L \times T^{*}_{w,input} \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix},
\]
respectively, where N is the number of nodes in the x* dimension (along the conduit), L is the lower diagonal term, and T*_{w,input} is the input water temperature (Equation 4.10). This implicit finite difference scheme is shown to be stable if the Courant number, C, is less than one (Birk, 2002). For this dimensionless model the Courant number can be calculated as
\[
C = v^* \frac{dt^*}{dx^*} + dt^*\, St. \tag{4.21}
\]
A pseudocode listing for implementing this computational model on GPGPUs is given in Appendix A.
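As a concrete illustration of the fluid-system solve just described, the sketch below assembles the U, M, and L terms of Equation 4.20 and advances one implicit time step with a banded solver. It is a minimal serial reference in Python/SciPy, not the CUDA/CUDPP implementation described in Section 4.4; the function name and argument list are invented for this example, and the boundary handling follows the description above (input temperature at the upstream node, zero diffusion at the downstream node).

    import numpy as np
    from scipy.linalg import solve_banded

    def fluid_step(Tw, T0, Tw_input, dt, dx, Pe, St, V, Vbar):
        """One implicit step of the conduit (fluid) system, following Equation 4.20.
        Tw: water temperatures at step k; T0: rock surface temperatures at step k."""
        N = Tw.size
        U = V * dt / (2.0 * Vbar * dx) - dt / (Pe * dx**2)   # upper diagonal term
        M = 1.0 + 2.0 * dt / (Pe * dx**2) + dt * St          # middle diagonal term
        L = -dt / (Pe * dx**2) - V * dt / (2.0 * Vbar * dx)  # lower diagonal term

        # Banded storage for solve_banded: row 0 upper, row 1 main, row 2 lower diagonal.
        ab = np.zeros((3, N))
        ab[0, 1:] = U
        ab[1, :] = M
        ab[2, :N - 1] = L
        ab[2, N - 2] = -V * dt / (Vbar * dx)   # downstream boundary: modified lower term

        # Right-hand side: current state, heat exchange, and the upstream boundary term.
        b = Tw + dt * St * T0
        b[0] -= L * Tw_input
        return solve_banded((1, 1), ab, b)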

4.3 Solute Transport in Karst Conduits

4.3.1 Physical governing equations

The equations governing the transport of solutes in karst conduits are almost identical to those governing heat transport in karst conduits (Covington et al., 2010). There are two primary differences: the first is the lack of interaction between the solute and the surrounding rock, which restricts the governing equations to the interior of the karst conduit. The second difference is a change in physical constants from those that determine the thermal properties of the fluid system to those that determine the reactive properties. As in Section 4.2.1, the convective and conductive transport rates can be represented as a dimensionless advection-dispersion equation for transport along a karst

(A) [schematic of the discretized 1-D conduit system geometry for flow in the x direction]

(B)
\[
\begin{pmatrix}
C_{w,0} & C_{w,1} & C_{w,2} & \cdots & C_{w,N-3} & C_{w,N-2} & C_{w,N-1}
\end{pmatrix}
\]

Figure 4.4: A) A simple schematic showing the discretized 1-D conduit system geometry for flow in the x direction. B) The vector containing the corresponding terms for the schematic in Figure 4.4A.

conduit,
\[
\frac{\partial C^*_w}{\partial t^*} = \frac{1}{Pe}\frac{\partial^2 C^*_w}{\partial x^{*2}} - \frac{V(t)}{\bar{V}}\frac{\partial C^*_w}{\partial x^*} + Da\, C^*_w, \tag{4.22}
\]
where C*_w = C_w/C_{w,0} is the dimensionless solute concentration, C_{w,0} is the initial solute concentration, t* = t V̄/L is dimensionless conduit flow time, x* = x/L is dimensionless conduit distance, and V(t) and V̄ are the conduit flow velocity and time-averaged flow velocity. Equation 4.22 also contains two dimensionless numbers, Pe and Da. Pe is the Peclet number for longitudinal dispersion (defined by Equation 4.5), and Da is the Damköhler number, which is the ratio of physical length to the turbulent dissolution length scale,
\[
Da = \frac{L}{\lambda_{d,turb}}, \tag{4.23}
\]

where L is the length of the conduit and λ_{d,turb} is the turbulent dissolution length scale (Equation 17 of Covington et al. (2012)),
\[
\lambda_{d,turb} = \frac{1}{4\alpha}\sqrt{\frac{2g}{f}}\, D_H^{3/2}\, \nabla h^{1/2}, \tag{4.24}
\]

where α is the dissolution rate constant, g is the Earth's gravitational acceleration, f is the friction factor determined from the Colebrook-White formula, D_H is the hydraulic diameter, and ∇h is the hydraulic gradient. The dimensionless concentration equation (4.22) is subject to the boundary conditions
\[
C^*_w(0, t) = f(t), \tag{4.25}
\]
\[
\left.\frac{\partial C^*_w}{\partial x^*}\right|_{x^*=1} = 0. \tag{4.26}
\]
Initial conditions are considered to be the steady-state background concentration of the water, given by C*_{w,0} = 1.
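For orientation, the Damköhler number can be computed directly from Equations 4.23 and 4.24. The snippet below is only an illustration of that arithmetic; the parameter values are hypothetical placeholders rather than fitted values from this chapter, and the formula follows the reconstruction of Equation 4.24 given above.

    import math

    def damkohler(L, alpha, g, f, D_H, grad_h):
        """Da = L / lambda_d,turb, with lambda_d,turb from Equation 4.24."""
        lam_turb = (1.0 / (4.0 * alpha)) * math.sqrt(2.0 * g / f) * D_H**1.5 * math.sqrt(grad_h)
        return L / lam_turb

    # Hypothetical example values, for illustration only.
    print(damkohler(L=100.0, alpha=1e-6, g=9.81, f=0.05, D_H=1.0, grad_h=1e-3))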

4.3.2 Computational model

The lack of interaction with the surrounding rock restricts the solute to the conduit interior, which is represented as a single system. The single system of equations used for the solute model is more straightforward to solve computationally than the coupled fluid and rock systems required by the thermal transport model. For each time step of a discretized time series of an input event, the solute model algorithm solves a tridiagonal system of equations (Ax = b) for solute concentration along the conduit. The discretized system geometry used to model solute concentration (Figure 4.4) is a simplified version of the geometry used to model temperatures in Section 4.2.2. Since interaction with the surrounding rock does not occur, the system geometry is restricted to the 1-D conduit interior. The method used to obtain Equation 4.20 (the tridiagonal system for modeling thermal signals) from Equation 4.3 is similar to the method used to construct the tridiagonal system for modeling the solute concentration from the dimensionless concentration governing Equation (4.22). The tridiagonal system of equations used to model solute concentration is

\[
C^{k}_{w,j} = \left(\frac{V \Delta t^*}{2\bar{V}\Delta x^*} - \frac{\Delta t^*}{Pe(\Delta x^*)^2}\right) C^{k+1}_{w,j+1} + \left(1 + \frac{2\Delta t^*}{Pe(\Delta x^*)^2} - \Delta t^* Da\right) C^{k+1}_{w,j} + \left(\frac{-\Delta t^*}{Pe(\Delta x^*)^2} - \frac{V \Delta t^*}{2\bar{V}\Delta x^*}\right) C^{k+1}_{w,j-1}, \tag{4.27}
\]

which has two differences from Equation 4.20: the change of temperature (T) to concentration (C) and the change of the Stanton number St to the negative Damköhler number (−Da). To form the tridiagonal matrix, A, the first, second, and third terms in parentheses in Equation 4.27 are the terms for the upper (U), middle (M), and lower (L) diagonals, respectively. With boundary conditions applied,

\[
A = \begin{pmatrix}
M & U & 0 & \cdots & 0 \\
L & M & U & & \vdots \\
& \ddots & \ddots & \ddots & \\
\vdots & & L & M & U \\
0 & \cdots & 0 & -\frac{V\Delta t^*}{\bar{V}\Delta x^*} & M
\end{pmatrix}
\]

The remaining components of this system are the solution vector and the current system state,

\[
x = \begin{pmatrix} C^{k+1}_{w,0} \\ C^{k+1}_{w,1} \\ \vdots \\ C^{k+1}_{w,N-2} \\ C^{k+1}_{w,N-1} \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} C^{k}_{w,0} \\ C^{k}_{w,1} \\ \vdots \\ C^{k}_{w,N-2} \\ C^{k}_{w,N-1} \end{pmatrix},
\]
respectively, where N is the number of nodes in the x* dimension (along the conduit). Like the finite difference scheme from Section 4.2.1, the implicit finite difference scheme used here is shown to be stable if the Courant number, C_c, is less than one.

To determine the Courant condition, Cc can be calculated as

\[
C_c = V^* \frac{dt^*}{dx^*}. \tag{4.28}
\]

A pseudocode listing for implementing this computational model on GPGPUs is given in Appendix B.
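The two stability criteria (Equations 4.21 and 4.28) reduce to one-line checks. The following sketch simply evaluates them; the grid spacing, time step, and velocity used here are placeholders, not values from the simulations reported later.

    def courant_thermal(v_star, dt, dx, St):
        """Equation 4.21: C = v* dt*/dx* + dt* St; stability requires C < 1."""
        return v_star * dt / dx + dt * St

    def courant_solute(v_star, dt, dx):
        """Equation 4.28: C_c = v* dt*/dx*; stability requires C_c < 1."""
        return v_star * dt / dx

    # Placeholder values for illustration.
    dt, dx = 1.0e-4, 1.0e-2
    print(courant_thermal(v_star=1.0, dt=dt, dx=dx, St=15.72) < 1.0)
    print(courant_solute(v_star=1.0, dt=dt, dx=dx) < 1.0)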

4.4 GPU Implementation

The NVIDIA CUDA programming language (NVIDIA, 2012) is used to harness the computational capabilities of NVIDIA Tesla M2070 GPGPUs. These NVIDIA GPUs use a hierarchical shared memory vector architecture, as shown in Figure 4.5. NVIDIA Tesla M2070 GPGPUs are composed of 448 thread processors at 3.13 GHz and 5 GiB of global memory. These GPGPUs are capable of 1.2 single-precision TFLOPS and 0.515 double-precision TFLOPS. GPGPUs can provide highly parallel environments with excellent floating-point computation and memory bandwidth performance, capable of enhancing many scientific applications (Walsh et al., 2009a).

Figure 4.5: NVIDIA GPU hierarchical architecture, reproduced from (NVIDIA, 2009a, 2012). GPU kernel execution is based on groups of threads known as thread blocks. Inter-thread communication is made possible through shared memory, accessible to all threads within the same block. In addition, threads have access to the GPU's global (device) memory, a larger data storage space which is available on the device, distinct from the CPU host's memory. (A) shows a high-level overview of CPU code starting GPU kernel execution across a thread grid of thread blocks. The organization of thread grids and thread blocks into regular grids is shown in (B).

Under NVIDIA's CUDA programming environment, individual GPU kernels execute threads in groups known as "thread blocks", themselves arranged into an array or "block grid" (Figure 4.5). NVIDIA's CUDA Programming Guide (NVIDIA, 2012) states that thread blocks are additionally segmented on a multiprocessor at execution time. When a GPU multiprocessor is given a thread block (also termed a "cooperative thread array", or CTA) to execute, the multiprocessor splits the thread block into contiguous 32-thread groups known as "warps". Each kernel call specifies the number and arrangement of thread blocks (i.e., regular arrays of one or two dimensions) to be executed, as well as the number and arrangement of threads within each block (in one, two, or three dimensional configurations). While there are relatively few restrictions on the dimensions of the block grid, the dimensions of the individual thread blocks are strongly influenced by the constraints of the underlying hardware. To maximize the relative amount of time spent on GPU computation and reduce data transfer, we adopt a decelerator model (Barker et al., 2008), in which calculation is confined to the GPU devices and the CPU host is reserved for subsidiary tasks such as data initialization, data transfer between devices, and data output. The fact that the systems of equations that are required for this model are tridiagonal makes it possible to take advantage of existing high-performance parallel GPGPU solvers. Numerous tridiagonal GPGPU solvers exist, including an implementation of the Spike algorithm (Polizzi and Sameh, 2006) by Chang et al. (2012), gtsv from CUSPARSE (NVIDIA, 2013), and the solver supplied with the CUDA Data Parallel Primitives (CUDPP) library (Zhang et al., 2011, 2010). These solvers provide tested methods for equation solving as well as increased computational speed. To determine the most efficient solver for use in this GPGPU implementation we compared the performance of the Spike, CUSPARSE, and CUDPP solvers using two tests: 1) a small system (192 × 192) with many (1024) right hand sides and 2) a large system (1024 × 1024) with a single right hand side. Figures 4.6A and 4.6B show execution times for each of these solvers normalized to the CUDPP solver for 192 × 192 systems with 1024 right hand sides and for a single 1024 × 1024 system, respectively. The execution time of each test is recorded as the arithmetic mean of five repeated trials to reduce transient effects at the beginning of function execution (Lilja, 2000). The results of this test indicate that, for typical system dimensions used for this model, CUDPP provides a factor of 10 improvement over the other available GPGPU tridiagonal solvers, at minimum.
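The solver comparison above boils down to timing each candidate over repeated trials and normalizing to the CUDPP result. A harness for that bookkeeping might look like the sketch below; it is illustrative Python, not the benchmarking code used in this chapter, and the solver names in the dictionary passed to it are placeholders.

    import time
    import statistics

    def mean_runtime(solver, args, trials=5):
        """Arithmetic mean of wall-clock time over repeated trials (cf. Lilja, 2000)."""
        times = []
        for _ in range(trials):
            start = time.perf_counter()
            solver(*args)
            times.append(time.perf_counter() - start)
        return statistics.mean(times)

    def normalized_runtimes(solvers, args, baseline="cudpp"):
        """Each solver's mean runtime divided by the baseline solver's mean runtime."""
        means = {name: mean_runtime(fn, args) for name, fn in solvers.items()}
        return {name: t / means[baseline] for name, t in means.items()}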

Figure 4.6: Execution times of available GPGPU tridiagonal solvers normalized to the CUDPP solver. (A) shows times for solving 192 × 192 systems with 1024 right hand sides. (B) shows times for solving a single 1024 × 1024 system.

It is important to note that all three of the GPGPU tridiagonal solvers restrict the maximum number of variables per system of equations to 1024. This restriction is likely a consequence of the number of threads available within a single thread block on a GPGPU. This restriction allows an entire system to reside in the fast register memory of a single multiprocessor. Increasing the size limit would require slower shared memory to be used for communication or data swapping, likely incurring a performance penalty due to the slower access speed of shared memory. The limitation on the number of allowed threads (and subsequently the number of variables) means that the maximum number of nodes in any dimension of the model is likewise limited

to 1024, which imposes a hard limit on a reasonable dx. Related to this, an additional limit is imposed on dt, as dt is calculated from the conduit flow-through time and dx. Due to the homogeneity of the model and the ability of these tridiagonal solvers to handle multiple right hand sides, the solution vectors in Equations 4.14 and 4.15 can be solved for all rock nodes concurrently. This allows the GPGPU implementation of this model to execute the tridiagonal solver just twice per time step: once for the rock system, with multiple solution vectors and right hand sides, and once for the fluid system, with a single solution vector and right hand side. This segmentation is shown in Figure 4.7.

Figure 4.7: A) A simple schematic showing the discretized conduit/rock system geometry. White nodes are fluid nodes, light grey nodes are surface nodes, and dark grey nodes are internal rock nodes. The grey and hash-marked backgrounds correspond to rock and fluid, respectively. Reproduced from Figure 4.3. B) The mapping of vectors to the discretized system in (A). C) The vectors from (B) are organized into the two systems solved in each time step on the GPU: the rock system with N right hand sides and the fluid system with a single right hand side. The hierarchical organization of threads on GPUs can naturally index and execute this data organization. See text for further details.
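The two-solve structure described above (and illustrated in Figure 4.7) can be outlined as follows. This is a serial Python/SciPy sketch of the control flow only, not the CUDPP-based GPU code; the two rhs_* callables and the banded matrices are assumed to be built elsewhere, as in the earlier fluid-step sketch.

    from scipy.linalg import solve_banded

    def advance_one_time_step(T_rock, T_w, ab_rock, ab_fluid, rhs_rock_fn, rhs_fluid_fn):
        """One model time step: a batched rock solve, then a single fluid solve.
        ab_rock and ab_fluid are tridiagonal matrices in banded storage; the rhs_*
        callables assemble B (M x N) and b (length N) from the current state."""
        B = rhs_rock_fn(T_rock, T_w)                    # one right hand side per fluid node
        T_rock_new = solve_banded((1, 1), ab_rock, B)   # all N rock columns solved at once
        b = rhs_fluid_fn(T_rock_new, T_w)               # single right hand side for the conduit
        T_w_new = solve_banded((1, 1), ab_fluid, b)
        return T_rock_new, T_w_new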

4.5 Test Methodology

The machine used throughout these tests is composed of dual Intel Xeon E5620 quad-core processors at 2.40 GHz with 24 GiB of memory and four NVIDIA GTX 480 GPUs. The NVIDIA GTX 480 GPGPUs are composed of 448 thread processors at 700 MHz and 1.5 GiB of global memory. These GPGPUs are capable of 1.345 single-precision TFLOPS.

The performance of this model is tested for speed and accuracy against an equivalent high-performance CPU implementation in C using GotoBLAS (Goto and Geijn, 2008; Goto and Van De Geijn, 2008). We reproduce three simulations of Gaussian thermal pulses from Luhmann (2011), designated as C11, C68, and C72 in Appendix C of (Luhmann, 2011). These simulations share parameters that fulfill the stability criterion for this model, C < 1, where C = v* dt*/dx* + dt* St (Equation 4.21). Physical parameters that are shared among these simulations are shown in Table 4.1. Physical parameters that are unique to each simulation are shown in Table 4.2.

Parameter   Value         Units
L           100           m
c_p,w       4200          J kg^-1 K^-1
k_w         0.58          W m^-1 K^-1
µ_w         1.3 × 10^-3   kg s^-1 m^-1
ρ_w         1000          kg m^-3
c_p,r       810           J kg^-1 K^-1
k_r         2.15          W m^-1 K^-1
ρ_r         2320          kg m^-3
f           0.05          unitless

Table 4.1: Physical constants shared by the test simulations.

Simulation   Parameter   Value    Units
C11          D_h         1        m
             v           0.6264   m/s
C68          D_h         10       m
             v           2.87     m/s
C72          D_h         1        m
             v           1.98     m/s

Table 4.2: Physical constants unique to each test simulation.

For these simulations the performance of the CUDA GPGPU model using the CUDPP tridiagonal solver is reported as speedup, which is calculated as the ratio of the execution time of the optimized baseline CPU implementation (GotoBLAS) to the execution time of the GPGPU implementation. The accuracy of the CUDA GPGPU model is reported as the root mean square error (RMSE) of the GPGPU model output compared to the baseline GotoBLAS implementation output for thermal peak lag and output peak temperature.
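Both reported metrics are simple to state in code. The helper functions below are a hypothetical illustration of the definitions used here (speedup as a ratio of execution times, accuracy as an RMSE), not part of the model implementation itself.

    import numpy as np

    def speedup(t_cpu, t_gpu):
        """Speedup of the GPGPU implementation over the CPU (GotoBLAS) baseline."""
        return t_cpu / t_gpu

    def rmse(predicted, reference):
        """Root mean square error between two equal-length signals."""
        predicted = np.asarray(predicted, dtype=float)
        reference = np.asarray(reference, dtype=float)
        return np.sqrt(np.mean((predicted - reference) ** 2))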

We also test the ability of the model to reproduce observed thermal and solute concentration signals from karst conduit systems. These models and the dimensionless parameters obtained by Covington et al. (2010, 2011) are used to reproduce the observed thermal and specific conductivity responses of Tyson Spring Cave in southeast Minnesota to a storm event on 8 June, 2009. The record of this storm event was obtained using Schlumberger CTD-divers measuring temperature and specific conductivity. Specific conductivity is frequently used as a proxy for solute concentration (Fairchild et al., 1994; Gurnell et al., 1994). The dimensionless parameters used to reproduce the thermal and specific conductivity signals are reproduced in Table 4.3. A complete description of this field site and the best-fitting physical parameters for it are found in (Covington et al., 2011). The performance of the GPGPU implementation is reported as speedup, as in the Gaussian thermal pulse tests. For this test, the accuracy of the GPGPU and GotoBLAS implementations is compared via their respective RMSE measurements against the observed thermal signal.

Parameter   Value     Units
St          15.7239   unitless
Pe          1000      unitless
Θ           23.0570   unitless
Ψ           2.2350    unitless
Da          0         unitless

Table 4.3: Dimensionless constants from (Covington et al., 2010, 2011) used to reproduce an observed signal in Tyson Spring Cave.

4.6 Results

4.6.1 Gaussian Thermal Pulses

The GPGPU implementation and the baseline CPU implementation are both able to simulate and reproduce Gaussian thermal pulses as in Luhmann (2011). The GPGPU implementation consistently outperforms the optimized CPU implementation when performing these simulations (Figure 4.8). The two implementations are separated by an average (arithmetic mean) factor of 8.7×. This performance separation factor is not unexpected for a fair comparison of optimized GPU and optimized CPU codes, as GPUs have been shown to outperform CPU codes by less than 10× for many representative problems (Lee et al., 2010).


Figure 4.8: Execution times of the tested model simulations. (A) shows absolute execution times and (B) shows execution times normalized to the GPGPU execution time.

4.6.2 Reproduction of Observed Thermal Signals

The GPGPU implementation of the thermal and solute concentration models is able to quickly reproduce observed signals from Tyson Spring Cave in southeast Minnesota. Using the dimensionless parameters determined by Covington et al. (2011), both thermal and conductivity signals closely match their respective observed signals (Figure 4.9). The error of both simulations is measured using the root mean square error (RMSE). The temperature simulation reproduces the measured thermal signal with an RMSE of 0.053 °C. The magnitude of the RMSE is partially due to recording oscillations in the data logger during the initial steady-state period and partially due to insufficient signal damping that allows the simulation to over-shoot the maximum of the output signal at approximately 6.5 days and under-shoot the minimum of the output signal at approximately 9.5 days. The initial over-shoot at 6.5 days imposes an additional thermal lag compared to the measured output. Insufficient damping in the conduit could be due to an overestimation of the hydraulic diameter or inaccurate thermal properties of the rock matrix.


Figure 4.9: Reproduction of temperature (A) and conductivity (B) signals from Tyson Spring Cave. Simulated temperature produces an RMSE of 0.053 °C and simulated conductivity produces an RMSE of 0.024 mS/cm.

The conductivity simulation provides an acceptable fit to the conductivity signal, with an RMSE of 0.024 mS/cm. The RMSE of this simulation is partially due to recording oscillations in the data logger during the initial steady-state period and partially due to the simulated output passing through the stepped artifacts in the measured data caused by the resolution of the data logger.

4.7 Conclusions and Future Work

The ability to quickly and accurately model karst conduit responses to recorded events is a critical step towards high-speed characterization of karst conduit systems. This model enables timely characterization of the internal geometry of the karst conduit. Higher resolution modeling of conduction into the rock wall may prove useful. To keep the problem computationally tractable, adaptive refinement of the radial rock systems would be beneficial. The accuracy of the numerical models used to simulate these systems could potentially be improved by higher order methods or with a change in solvers. Additional development may be needed to create the necessary mathematics libraries for GPGPUs. The high-performance nature of these GPGPU models makes them excellent candidates for characterizing karst conduit systems. These models could be chained together to significantly reduce the simulation time of complex conduit network systems. Inverse methods could also utilize these high-performance models to perform high-speed characterizations of internal karst conduit geometries and their corresponding parameter spaces.

Chapter 5

The Singular Value Composition and TNT Non-Negative Least Squares Solver

This chapter is composed of two publications that have been submitted and are currently under review:

J.M. Myre, E. Frahm, D. Lilja, M.O. Saar, "SVC – Singular Value Composition: A fast method for generating dense matrices with specific condition numbers," ACM Transactions on Mathematical Software (TOMS), in review.

E. Frahm, J.M. Myre, D. Lilja, M.O. Saar, "TNT: An active set method for solving large non-negative least squares problems," SIAM Journal on Computational Science, Special Section on Planet Earth and Big Data, in review.

Lawson and Hanson published a seminal method in 1974 that uses an active set strategy to solve least-squares problems with non-negativity constraints. The Lawson and Hanson Non-Negative Least Squares (NNLS) method and its descendants continue to be popular today. In this chapter the TNT algorithm is discussed (Frahm et al., In Review). The TNT algorithm is a new active set method for solving non-negative least squares problems. TNT uses a different strategy not only for the construction of the active set but also for the solution of the unconstrained least squares sub-problems. This results in dramatically improved performance over the traditional active set non-negative least squares solvers, including the Lawson and Hanson NNLS algorithm (Lawson and Hanson, 1995) and the Fast NNLS (FNNLS) algorithm (Bro and De Jong, 1997), allowing for computational investigations of new types of scientific and engineering problems. For the small systems tested (5000 × 5000 or smaller), it is shown that TNT is up to 180 times faster than FNNLS. It is also shown that TNT is capable of solving large (45000 × 45000) problems more than 150 times faster than FNNLS. These large test problems were previously considered to be unsolvable, due to the computational time required by the traditional methods. Generating matrices with specific condition numbers is a critical step when developing and testing many numerical methods. Methods for generating matrices with a desired condition number are readily available. However, the work these methods perform scales at O(n^3). Thus, when testing large systems, the creation of test matrices can act as a significant performance bottleneck, in many cases constituting the majority of the run time of the tests. To that end, this chapter also discusses the Singular Value Composition (SVC) algorithm for generating matrices with a specific condition number (Myre et al., In Review). The work that SVC performs scales at O(n^2). On our small test problems, we show that SVC outperforms the equivalent Matlab matrix generation function with a speedup of 200x and outperforms the equivalent LAPACK matrix generation function with a speedup of 90x. We expect this trend to continue, resulting in even higher performance improvements for SVC as the matrix size increases further.

5.1 Introduction

The least squares method produces the solution to a system of linear equations that minimizes error. Although it can be applied to under-determined systems of equations, it is natural to apply the least squares method to over-determined systems that cannot be solved in a way that will satisfy all of the equations. Since the inception of the least squares method by Gauss (Björck, 1996), it has been applied in countless scientific, engineering, and numerical contexts. When investigating physical systems, it is not uncommon to have non-negativity constraints that need to be applied in order for the system to be physically meaningful. Such constraints have been necessary in astrophysical studies of black holes (Cretton and Emsellem, 2004), investigating the use of nuclear magnetic resonance to determine pore structures (Gallegos and Smith, 1988), formulating petrologic mixing models (Albarede and Provost, 1977; Reid et al., 1973), and to address seismologic inversion problems (Yagi, 2004). The method that is used to address these problems must find the solution with non-negative components that minimizes the 2-norm of the residual (see Equation 5.1). Recent problems in rock magnetism have required solving systems of equations with tens of thousands of variables (Weiss et al., 2007). When not restricted by the limits of conventional computer systems and algorithms, these problems could expand to hundreds of thousands of variables. Large problems of this nature require significant computational performance without sacrificing numerical accuracy. In this paper we discuss a new active set strategy for solving large non-negative least squares (NNLS) problems. This method improves upon prior efforts by incorporating a more aggressive strategy for identifying the active set of constraints and by improving upon the solution of the unconstrained least squares sub-problems by using a method that can be efficiently implemented with great accuracy. We show that this method dramatically outperforms two popular active set non-negative least squares algorithms on a wide variety of test problems. With this improvement in performance, the maximum problem size that can be handled is extended to the point where previously unsolvable problems are now feasible. When testing large systems, the time required to generate test matrices can far outweigh the time spent using them for testing, in some cases ten-fold or more, particularly when the matrices are very large (Frahm et al., In Review). By improving the speed at which test matrices are generated, any numerical method requiring

specific matrices will benefit. To that end, we introduce the Singular Value Composition (SVC) algorithm. SVC is a high-performance method to generate dense matrices with a specific condition number. Even with our small test cases, we show that this Matlab implementation of SVC significantly outperforms the equivalent Matlab function (MathWorks, 2011) and, in certain cases, outperforms the equivalent native function from LAPACK (Anderson et al., 1999). We expect this trend to continue, resulting in even higher performance improvements for SVC as the matrix size increases further. This paper provides a discussion of prior active set non-negative least squares algorithms and test matrix generation in Section 5.2. The design of the SVC algorithm is given in Section 5.3.1 and the design of the TNT algorithm is presented and contrasted with existing NNLS solver strategies in Section 5.4. The methodology used to test the SVC and TNT algorithms is discussed in Section 5.5 and results are presented in Section 5.6. Finally, conclusions and future work are discussed in Section 5.7.

5.2 Background and related work

5.2.1 Non-Negative Least Squares

The unconstrained least squares problem is frequently encountered, and its methods are applied in many fields. It is used by statisticians as a regression method (Lilja, 2000) and by scientists and engineers who develop models which describe measured data with minimal error. The NNLS problem is a constrained least squares problem that is encountered in many science and engineering applications, as discussed in Section 5.1. The NNLS problem can be stated as

\[
\min_x \|Ax - b\|_2, \quad x \ge 0, \tag{5.1}
\]
where A is the matrix representing the system of equations, b is the solution vector of measured data, and x is the vector of parameters obtained to best fit the solution vector and minimize the 2-norm of the solution residual. Active set strategies for the NNLS problem can be broken down into two basic

parts: 1) the strategy for identification of the non-negativity constraints which are "active" and 2) the strategy for solving the unconstrained least squares sub-problem for each choice of the active set of constraints. These two algorithmic pieces are independent of one another and can therefore be discussed independently. Current approaches to these two pieces of NNLS active set methods are discussed in Section 5.4. In many applications of NNLS the maximum problem size that can be investigated is limited by the algorithmic performance and the available computational resources. One of the most widely used NNLS algorithms is the 1974 algorithm by Lawson and Hanson (1995). Like most non-negative active set strategies, the Lawson and Hanson NNLS algorithm attempts to find the non-negative solution by setting some variables to zero. These are the variables in the active set, because their non-negativity constraints are "active". At each iteration, variables with active constraints are removed from the problem and an unconstrained least squares sub-problem is solved. The Lawson and Hanson algorithm adjusts the active set by changing one variable per iteration, which can result in extremely slow convergence on problems with many variables. The 1974 Lawson and Hanson NNLS algorithm is found in popular software packages such as the widely used computer algebra system Matlab, GNU R (Mullen and van Stokkum, 2012), and scientific tools for Python (SciPy) (Jones et al., 2001–). It is also encountered in reference literature as the method to solve non-negative least squares problems (Aster et al., 2012; Parker, 1994; Xiong and Kirsch, 1992). The prevalence of the Lawson and Hanson algorithm throughout the literature continues to inspire new works focused on optimizing it algorithmically (Haskell and Hanson, 1981) and for special cases (Van Benthem and Keenan, 2004). Some of the improvements have focused on reducing redundant calculations, while other improvement efforts have focused on handling systems with multiple right hand sides. The fast NNLS (FNNLS) algorithm developed by Bro and De Jong (1997) improves upon the Lawson and Hanson algorithm not only by avoiding redundant computations but also by loading an initial active set. Bro and De Jong report FNNLS speedups over NNLS ranging from 2x-5x using small real and synthetic test suites, and the performance of their FNNLS algorithm is the baseline by which the performance of our new TNT algorithm is evaluated.
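As a point of reference for the problem defined in Equation 5.1, the Lawson and Hanson solver mentioned above is exposed in SciPy as scipy.optimize.nnls. The short example below simply demonstrates a call to that routine on a small random problem; the matrix and right hand side are arbitrary placeholders.

    import numpy as np
    from scipy.optimize import nnls   # Lawson and Hanson style active set solver

    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 20))   # arbitrary over-determined system
    b = rng.standard_normal(50)

    x, residual_norm = nnls(A, b)       # min ||Ax - b||_2 subject to x >= 0
    assert np.all(x >= 0)               # every component satisfies the constraint
    print(residual_norm, np.linalg.norm(A @ x - b))   # the two residual norms agree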

Graphics Processing Units (GPUs) are a high-performance computing architecture that has recently been exploited to improve the run-time performance of Lawson and Hanson type NNLS algorithms over comparable CPU implementations (Luo and Duraiswami, 2011). The largest improvements in performance for the GPU algorithm come with an increased number of right hand side (RHS) solution vectors (b, in Equation 5.1), but when limited to solving a single RHS system the GPU algorithm outperforms the CPU Lawson and Hanson algorithm by just 10% in one case, while in other test cases the GPU implementation is significantly slower (Luo and Duraiswami, 2011). Like CPU implementations, the problem size that the GPU algorithm is capable of solving is limited by the available memory. However, the amount of memory available on a GPU is usually much smaller than the memory available to a CPU. Recent GPU technology from NVIDIA enables the GPU to operate on data stored in host CPU memory, but there is a performance penalty due to additional access time. Despite this, GPUs offer an avenue for improving the performance of NNLS algorithms for certain problems that should be investigated. However, here we focus on fundamentally improving the performance (speed and accuracy) of the NNLS method without invoking accelerators such as GPUs. It is worthwhile to note that the algorithm introduced here appears to naturally lend itself to GPU implementations, which would result in speedups that are in addition to the ones reported here.

5.2.2 Test Matrix Generation

Matrix generation utilities are commonly found in linear algebra suites. Both Matlab and LAPACK include such routines, with Matlab providing the gallery (MathWorks, 2011) function with the randsvd method, and LAPACK providing dlatms (Anderson et al., 1999). Both functions can create matrices with specified condition numbers. However, the work that they perform scales at O(n^3); consequently, these methods are not desirable for creating large test matrices. It is important to note that the purpose of the dlatms function was to act as a test matrix generator to test and verify LAPACK libraries. It is unlikely that the original authors of dlatms had a requirement to provide LAPACK users with a high-speed matrix generator. There is enough utility in existing test matrix generation routines that software suites are built upon them to enable more convenient testing. For instance, Higham's Matrix Computation Toolbox (Higham, 2008) is built on Matlab's gallery function. Such routines are frequently used to generate test matrices for common numerical

methods such as eigensolvers (Domas and Tisseur, 1997), Gram-Schmidt algorithms (Rünger and Schwind, 2005), QR factorizations (Miyata et al., 2009), system solvers (Erway et al., 2010; Frahm et al., In Review; Nguyen et al., 2011), and calculating properties of matrices (Cheng et al., 2001; Galántai, 2001). Furthermore, these routines can be used as a stage in other matrix generation algorithms (Davies and Higham, 2000).

5.3 Test Matrix Generation

5.3.1 Design of the SVC algorithm

Singular Value Decomposition (SVD) is a process where a matrix, M, is decomposed into USV^T, a product of left (U) and right (V^T) unitary matrices and a diagonal matrix (S) of the singular values. Multiplication by unitary matrices preserves the singular values in the product M. We propose Singular Value Composition (SVC) as a way to take left and right unitary matrices and a diagonal matrix of singular values to generate a test matrix M. Our method does not explicitly compose left and right unitary matrices. These matrices are implicitly generated by random row and column rotations. The SVC algorithm constructs a dense matrix, M, with a specific condition number, where every entry is a non-trivial linear combination of the singular values. The condition number, κ, is derived from the singular values. Specifically, κ is the ratio of the largest to the smallest singular value. Inputs to the SVC algorithm are κ, the desired dimensions (rows and columns) of the resulting matrix, the mode that defines the distribution of singular values, and a flag that specifies whether a symmetric matrix is desired. There are six primary steps to the SVC algorithm: 1) generating the singular values, 2) placing the singular values from (1) on the diagonal of M, 3) ensuring that the number of rows is greater than or equal to the number of columns of M while it is being generated, 4) calculating each element of the n × n square region (Figure 5.1) by random rotations of rows and columns such that each element is a linear combination of every singular value, 5) performing 5 repetitions of rotations between the rows of the lower p × n region (Figure 5.1) and random rows of the n × n square region, and 6) if necessary, transposing M to meet the dimension parameters supplied to SVC. These steps are explained in detail in the remainder of this Section and summarized in Appendix A.

Figure 5.1: The initial m × n matrix, M. The singular values lie along the diagonal of the n × n region. The p dimension is m − n.

To construct M, given κ, SVC first generates a sequence of singular values according to a defined distribution, or mode, with 1 as the largest singular value and 1/κ as the smallest. As with the Matlab and LAPACK matrix generators, the common modes that SVC is capable of producing are 1) a single large singular value and n − 1 small singular values, 2) a single small singular value and n − 1 large singular values, 3) a geometric distribution of singular values, 4) an arithmetic distribution of singular values, 5) a random sampling of singular values from the uniform distribution [1/κ, 1], and 6) accepting a user-defined vector of singular values. To create the initial structure of M, the singular values are randomly permuted and placed along the diagonal. M is temporarily transposed if it has more columns than rows and then transposed back to the original aspect ratio after the SVC algorithm has completed. This guarantees the m × n matrix, M, presented to the algorithm is square, or has a greater number of rows than columns (Figure 5.1). This structure simplifies the SVC

procedure by reducing the number of conditional statements. The core operation of the SVC algorithm is Step 4, which performs orthogonal rotations between the rows and columns of the n × n submatrix. These orthogonal rotations are performed using random rotation matrices. In order to replicate characteristics of dlatms and gallery, the random rotation matrices are constructed using a random angle selected from the wrapped normal distribution on the unit circle (Mardia and Jupp, 2009), N(π/4, (π/24)^2), bounded by (0, π/2). This angle defines θ in the rotation matrix, R, where

\[
R = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}. \tag{5.2}
\]
The rotation matrices, R, are left-multiplied for row rotations and right-multiplied for column rotations. These operations are shown in Equations 5.3 and 5.4,
\[
M_{[i,j],[p:q]} = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \times
\begin{pmatrix} M_{i,p} & M_{i,p+1} & \cdots & M_{i,q-1} & M_{i,q} \\ M_{j,p} & M_{j,p+1} & \cdots & M_{j,q-1} & M_{j,q} \end{pmatrix}, \tag{5.3}
\]
\[
M_{[i:j],[p,q]} = \begin{pmatrix} M_{i,p} & M_{i,q} \\ M_{i+1,p} & M_{i+1,q} \\ \vdots & \vdots \\ M_{j-1,p} & M_{j-1,q} \\ M_{j,p} & M_{j,q} \end{pmatrix} \times
\begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}, \tag{5.4}
\]

using arbitrary indices i, j, p, and q. Row rotations are shown using rows i and j and columns p through q, and column rotations are shown using rows i through j and columns p and q, respectively. When Step 4 of the algorithm is complete, every element of the n × n submatrix of M will be a non-trivial linear combination of every singular value. This is accomplished by rotations of columns and rows within recursive 2^k sub-matrices along the diagonal of the n × n submatrix containing the singular values. The recursive 2^k stage of Step 4 is accomplished with four nested loops. Top down, these loops iterate over 1) block size, 2) the number of blocks, 3) row and column pairs, and 4) row and column elements. Loop 1 starts by combining pairs of singular values within 2^0 submatrices of the larger 2^k submatrix. This combination, and all thereafter, is performed using a rotation matrix of a random angle to rotate paired

rows. The paired columns of the result are then rotated using a rotation matrix of a different random angle. Explicitly stated, the first row rotation of the first pair of singular values along the diagonal is
\[
M_{[0,1],[0:1]} = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \times
\begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix}. \tag{5.5}
\]

Figure 5.2: Illustrations of the SVC algorithm Step 4 for different values of n, where white elements are empty, light grey elements are unrotated singular values, and dark grey elements are linear combinations of the singular values. Examples show that: n = 1) is trivial; n = 2) requires one rotation between the rows and one between the columns; n = 3) first handles the n = 2 case and then recurses on the remaining diagonal region, which is the trivial n = 1 case, and then the results of the recursion are combined with the rows and columns from the first step through rotations; n = 4) first handles two n = 2 cases along the diagonal and then combines those submatrices through row and column rotations to form the full square matrix. There are no redundant rotations for n = 2^k cases.

The remaining consecutive pairs of 2^0 submatrices are combined through Loop 2 using rotations across the rows and columns with rotation matrices of random angles. This is repeated in Loop 1 until all consecutive pairs of 2^0 through 2^(k-1) submatrices are rotated together. Upon completion of Loop 1, every entry of the 2^k submatrix is a non-trivial linear combination of the original singular values. A pseudocode listing of this process is given in Listing 3 of Appendix A.
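The row and column rotations of Equations 5.3-5.5 are easy to check numerically: because each rotation is orthogonal, the singular values, and hence the condition number, are preserved no matter how many rotations are applied. The sketch below is illustrative Python/NumPy rather than the Matlab implementation of SVC, and it applies the rotations to arbitrary random index pairs instead of following the recursive 2^k schedule.

    import numpy as np

    rng = np.random.default_rng(1)
    M = np.diag([1.0, 0.5, 0.1, 1e-5])      # singular values on the diagonal; kappa = 1e5

    def rotate_rows(M, i, j, theta):
        """Left-multiply rows i and j of M by the 2x2 rotation of Equation 5.2."""
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        M[[i, j], :] = R @ M[[i, j], :]

    def rotate_cols(M, p, q, theta):
        """Right-multiply columns p and q of M by the 2x2 rotation of Equation 5.2."""
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        M[:, [p, q]] = M[:, [p, q]] @ R

    for _ in range(50):                      # random mixing, loosely in the spirit of Step 4
        i, j = rng.choice(4, size=2, replace=False)
        rotate_rows(M, i, j, rng.uniform(0.0, np.pi / 2.0))
        p, q = rng.choice(4, size=2, replace=False)
        rotate_cols(M, p, q, rng.uniform(0.0, np.pi / 2.0))

    print(np.linalg.cond(M))                 # still ~1e5: rotations preserve singular values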

Following the filling of the original 2^k submatrix, the 2^k step recurses on itself to operate on the next power-of-two matrix along the diagonal in the remaining (n − 2^k) × (n − 2^k) square region. This 2^k-recurse-2^k-recurse sequence is repeated until all possible 2^k submatrices along the n × n diagonal are filled. The final step after the recursive call returns is to rotate each row and column of the 2^k submatrix in the local call with the (n − 2^k) × (n − 2^k) submatrix returned from the recursive call. To do this, each row and column of the local 2^k submatrix is rotated at a random angle with a random row or column of the submatrix returned from the recursive call. A pseudocode listing of this recursive process is given in Listing 2 of Appendix A. Examples for n × n matrices, where n = 1 through n = 4, are shown in Figure 5.2. A larger example, where n = 11, is shown in Figure 5.3.

Figure 5.3: An example mixing of the singular values by recursive rotations on 2^k submatrices, in order from (A) to (H). The darkly shaded areas indicate completely mixed sub-regions. For the largest 2^k submatrix, steps (A) through (D) show the progression of mixing by row and column rotations between pairs of progressively larger power-of-two submatrices along the diagonal. Steps (E) and (F) show the same mixing happening as the algorithm performs the remaining 2^k recursive calls along the diagonal. Steps (G) and (H) show the submatrices returned from the recursive calls being mixed by row and column rotations with the previous 2^k submatrix. See text for details.

After the n × n subregion of M is completely mixed, the remaining rows of M need to be composed of non-trivial linear combinations of rows and columns from the n × n subregion. Each of the remaining p rows is rotated with a random row from the n × n subregion. Then, each column of M is rotated with a different random column. To ensure the linear combinations in the remaining rows are non-trivial, the

rotations on the remaining rows and the entirety of the columns are repeated five times each, with different random selections in each iteration. Finally, if M was initially transposed to ensure that the number of rows was greater than the number of columns, it is transposed back to the original dimensions. A pseudocode outline of the SVC algorithm is given in Appendix A.

5.3.2 Work Factor Analysis

The total work factor of the SVC algorithm can be obtained by summing the work factors of its major steps: square rotations, combining the recursive rotations, and rotations to fill the area outside of the n × n square. To begin analyzing the work factor of the SVC algorithm we look at the inner-most operation, which performs the rotations to populate square 2^k submatrices. The four loops (see Listing 3 of Appendix A) that comprise this function have an operation count that totals
\[
\sum_{i=1}^{\log_2(n)} 2n\,2^i, \tag{5.6}
\]
which reduces to
\[
4n(n - 1). \tag{5.7}
\]

If M is not a 2^k square matrix, then Equation 5.7 must be applied for each 2^k submatrix and summed. To fully populate the n × n region of M, these submatrices must be combined using rotations (Step 2.2, Listing 2 of Appendix A), which requires

\[
4nk \tag{5.8}
\]
operations. If n is expressed as a binary number, the values represented by ones indicate the 2^k square submatrices that must be considered. Assuming the binary representation of n is all ones allows an upper bound of the work factor to be obtained. This treatment yields a work factor, W_s, for the square n × n region of

\[
W_s = \sum_{j=0}^{\lfloor \log_2(n) \rfloor} 4(2^j)(2^j - 1) + \sum_{j=1}^{\lfloor \log_2(n) \rfloor} 4(2^j)(2^j - 1), \tag{5.9}
\]
where the first term represents the work required to perform all of the necessary

rotations and the second term represents the work required to perform all of the necessary combinations of submatrices. By recognizing that submatrices with a size of 1 (j = 0) require no rotation work, Equation 5.9 can be rewritten as

\[
W_s = 8 \sum_{j=1}^{\lfloor \log_2(n) \rfloor} (2^j)(2^j - 1). \tag{5.10}
\]

By treating this as a geometric series, Equation 5.10 can be simplified to

\[
W_s = 8n^2 - 13n + 12. \tag{5.11}
\]

A comparison of this theoretical upper bound with the actual work factor of the SVC algorithm is seen in Figure 5.4. The method used to determine the actual work factor of the SVC algorithm is described in Appendix D.2.

Figure 5.4: A comparison of the actual work factor and the theoretical upper bound of the work factor for square matrices where n ranges from 2 to 1024. Notice the stepping that occurs when n = 2^k + 1.

If M is not square, then the rows of the region outside of the n × n square region

must be rotated with random rows within the n × n region in order to populate each entry with a non-trivial linear combination of the singular values. The SVC algorithm repeats these rotations five times, yielding

\[
5 \times 2n(m - n) \tag{5.12}
\]
operations.

The total upper bound of the work factor for the SVC algorithm, W_u, is the sum of the component work factors from Equations 5.11 and 5.12,

\[
W_u = \left(8n^2 - 13n + 12\right) + 10n(m - n), \tag{5.13}
\]
which is dominated by n^2 work for near-square matrices and mn work for rectangular matrices with large aspect ratios.
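Equation 5.13 is simple enough to evaluate directly, which is a convenient way to see the quadratic growth claimed here for square matrices. The function below is a hypothetical helper for that purpose only.

    def svc_work_upper_bound(m, n):
        """Upper bound on SVC operations, Equation 5.13."""
        return (8 * n**2 - 13 * n + 12) + 10 * n * (m - n)

    # Roughly quadrupling as n doubles, i.e., O(n^2) growth for square matrices.
    for n in (256, 512, 1024):
        print(n, svc_work_upper_bound(n, n))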

5.4 Non-Negative Least Squares Solver Strategies

5.4.1 Existing active set and least squares solver strategies

The 1974 Lawson and Hanson NNLS algorithm (Lawson and Hanson, 1995) takes a conservative approach to finding the active set of constraints by testing one variable per iteration. This can be time-consuming for problems consisting of many variables. The FNNLS algorithm of Bro and De Jong (1997) improves upon NNLS by introducing functionality to allow an initial "loading" of the active set. This allows the FNNLS algorithm to set the initial feasible solution close to the actual solution. Although the FNNLS algorithm will continue to modify the active set by one variable per iteration, the total number of iterations that are required for convergence is reduced by supplying an initial solution that is close to the actual solution. Common strategies for solving the unconstrained least squares problem, Ax = b, include the normal equations, the pseudoinverse, and methods using QR decomposition (Golub and Van Loan, 1996; Press et al., 1992). The "normal equations" are simply derived by looking for the values of x in

\[
f(x) = \|Ax - b\|_2, \tag{5.14}
\]
where the gradient of f(x) is zero. Some linear algebra will show that the gradient

of f(x) will vanish at a solution to the symmetric and positive semi-definite system of equations
\[
A^T A x = A^T b. \tag{5.15}
\]

While it is common to form the normal equations (Björck, 1996; Golub and Van Loan, 1996; Heath, 1984), using the normal equations directly requires the product A^T A to be computed. This product has a condition number, κ, that is the square of the condition number of A, which will lead to excessive roundoff error in the numerical values of A^T A and x when the condition number of A is large. Bro and De Jong (1997) improved upon NNLS with FNNLS by realizing that incremental changes in the active set did not require a complete recalculation of A^T A. Large portions of A^T A could be computed once and used repeatedly. See (Bro and De Jong, 1997) and (Chen and Plemmons, 2009) for details. A second approach to solving the unconstrained least squares problem involves the pseudoinverse (Björck, 1996) of A. The pseudoinverse, A†, can be created by first computing the singular value decomposition (SVD) of A, A = UΣV*. Then A† can be computed as A† = VΣ^(-1)U*. The solution to the unconstrained least squares problem is simply x = A†b. While this is a simple and accurate approach, computing the SVD can be computationally expensive. Finally, the QR method requires the QR decomposition of A, where A = QR. R is triangular, Q is unitary, and Q^(-1) = Q^T. This method uses these properties to restate the least squares problem as QRx = b, which is solved by Rx = Q^T b. See (Golub and Van Loan, 1996) for additional details. For problems where Q is difficult to construct, least squares problems can still be solved by substituting A = QR into the normal equations, producing (QR)^T QRx = A^T b, which reduces to R^T Rx = A^T b. This approach to the normal equations avoids the numerical difficulties involved in explicitly computing A^T A. QR methods are often a good compromise that provides good accuracy and good computational efficiency.
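The accuracy trade-off between these strategies can be seen in a few lines of NumPy. The sketch below builds an ill-conditioned matrix from a prescribed set of singular values and compares the relative solution error of the explicitly formed normal equations against QR and the pseudoinverse; the sizes, condition number, and variable names are arbitrary choices for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    m, n = 200, 50
    U, _ = np.linalg.qr(rng.standard_normal((m, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    s = np.logspace(0, -6, n)                       # condition number kappa = 1e6
    A = U @ np.diag(s) @ V.T
    x_true = rng.standard_normal(n)
    b = A @ x_true

    x_normal = np.linalg.solve(A.T @ A, A.T @ b)    # normal equations: kappa(A^T A) = kappa^2
    Q, R = np.linalg.qr(A)
    x_qr = np.linalg.solve(R, Q.T @ b)              # QR: R x = Q^T b
    x_pinv = np.linalg.pinv(A) @ b                  # SVD-based pseudoinverse

    for name, x in [("normal", x_normal), ("qr", x_qr), ("pinv", x_pinv)]:
        print(name, np.linalg.norm(x - x_true) / np.linalg.norm(x_true))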

5.4.2 The design of the TNT algorithm

The TNT algorithm (Frahm et al., In Review) is a new active set method that is capable of solving large non-negative least squares problems much faster than prior methods. TNT improves upon existing methods through intelligent construction

and modification of the active set and through its solving strategies for the central least squares problem. Due to the convexity (Boyd and Vandenberghe, 2004) of the least squares objective function, the TNT algorithm can take an "algorithmic license" to guess which variables compose the active set. The suitability of the variables is determined by ranking them by their gradients. Those variables with the largest positive gradients are tested by moving them into the unconstrained set first. This allows the active set to be modified by a large number of variables in a single iteration. In contrast, other common active set methods typically modify the active set by a single variable per iteration (Lawson and Hanson, 1995; Bro and De Jong, 1997). The ability to modify the active set by many variables in a single iteration allows the TNT algorithm to reduce the total number of iterations necessary for convergence. To solve the core least squares problem the TNT algorithm uses the Cholesky decomposition of the normal equations as a preconditioner to a left-preconditioned Conjugate Gradient Normal Residual (CGNR) solver (Saad, 2003). The normal equations are used only as a preconditioner for CGNR, so the typical issues with numerical error associated with the condition number, κ, are avoided. The number of iterations in an unpreconditioned CGNR would be proportional to √κ (Jennings and Malik, 1978; Paige and Saunders, 1975). However, the use of the Cholesky decomposition of the normal equations as a preconditioner would, in the absence of roundoff error, allow CGNR to terminate successfully in one iteration. Full details of the TNT algorithm are given in (Frahm et al., In Review).
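The preconditioning idea can be sketched compactly: apply the conjugate gradient iteration to A^T(Av) without ever forming A^T A for the Krylov steps, while a Cholesky factorization of the normal equations serves only as the preconditioner. The code below is a simplified illustration of that structure in Python/SciPy, not the TNT implementation itself; the function name and the small random test system are invented for the example.

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve
    from scipy.sparse.linalg import LinearOperator, cg

    def cgnr_with_cholesky_preconditioner(A, b, maxiter=25):
        """Left-preconditioned CGNR sketch: the Krylov operator applies A^T(A v);
        the Cholesky factorization of A^T A is used only as a preconditioner."""
        m, n = A.shape
        normal_op = LinearOperator((n, n), matvec=lambda v: A.T @ (A @ v))
        Atb = A.T @ b
        factor = cho_factor(A.T @ A)                       # preconditioner only
        precond = LinearOperator((n, n), matvec=lambda r: cho_solve(factor, r))
        x, info = cg(normal_op, Atb, M=precond, maxiter=maxiter)
        return x

    # Small random unconstrained sub-problem, for illustration only.
    rng = np.random.default_rng(3)
    A = rng.standard_normal((100, 40))
    b = rng.standard_normal(100)
    x = cgnr_with_cholesky_preconditioner(A, b)
    print(np.linalg.norm(A.T @ (A @ x) - A.T @ b))         # near zero: normal equations satisfied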

5.5 Test methodology

5.5.1 SVC

The SVC algorithm presented here is implemented in Matlab. To test performance, the execution times of the SVC algorithm are compared to equivalent functions from Matlab (gallery) and LAPACK (dlatms). Performance is reported as the speedup SVC provides over the alternative algorithm, which is the ratio of the execution time of the alternative algorithm to the execution time of SVC. The tested distribution mode for the singular values is Mode 3 (geometric distribution). These tests are performed using Matlab r2011b and LAPACK 3.4.2 on a host that is composed of an Intel Core 2 Quad Q9450 at 2.66 GHz and 8 GiB of shared memory.

The main test suites consist of three systems: underdetermined, square, and overdetermined. The range of underdetermined systems is 100 × n, where n varies from 200 to 1000 in increments of 50. The range of square systems is n × n, where n varies from 100 to 1000 in increments of 50. The range of overdetermined systems is n × 100, where n varies from 150 to 1000 in increments of 50. The condition number does not play a role in execution time, so an arbitrary condition number of 10^5 is used for all tests. The recorded execution time from each test is the arithmetic mean of ten individual executions. The variance in execution time across ten test cases was negligible. We also compare the growth rates of execution time between LAPACK and SVC for square, power-of-two matrices with dimensions ranging from 2^7 to 2^14. Matlab's gallery function is omitted from this comparison because it is prohibitively slow in this range of matrix sizes. It is important to note that our SVC algorithm was designed with the purpose of creating matrices to test system solvers. To ensure the suitability of SVC for this purpose we verify that matrices created with SVC can be solved appropriately with the Matlab conjugate gradient routine (pcg) and with the Matlab pseudo-inverse (pinv). Conjugate gradient solutions are sensitive to the condition number of the matrix, which makes them interesting as a test case for the SVC algorithm. The conjugate gradient algorithm can actually be useful for solving dense systems of equations with very low condition numbers (Frahm et al., In Review). To conduct this test, square matrices from 100 to 1000 rows and columns are created in increments of 100 with corresponding random solution vectors. These matrices are then solved using the pseudo-inverse and the conjugate gradient methods. The execution times and residuals of the solutions are then calculated and compared. From this analysis, matrices generated with SVC were indistinguishable from matrices generated with LAPACK dlatms; thus, for the purpose of generating matrices to test system solvers, SVC is sufficient. However, we cannot exclude the possibility that the matrices produced by SVC may not be appropriate in other application spaces. Although the individual rotations are randomly generated, the structure imposed by our recursive combinations prevents the generation of truly arbitrary random matrices.

111 5.5.2 TNT

The TNT algorithm is tested for performance and numerical accuracy using a Matlab implementation. All tests include the Fast NNLS (FNNLS) and TNT algorithms. The FNNLS Matlab function used throughout the tests is the reference function from (Bro and De Jong, 1997), located at (Bro and Shure, 1996). These tests are performed using Matlab r2011b on a host that is composed of dual Intel six-core Xeon E5-2620 CPUs, each at 2.0 GHz, and 384 GiB of shared memory. The tests can be described using matrix terminology by viewing them in a space where the number of rows (y-axis) and number of columns (x-axis) correspond to the dimensions of the tests. The lower triangular portion of this space represents overdetermined systems and the upper triangular portion represents underdetermined systems, with the diagonal representing well-determined systems. This view of the test space is seen in Figure 5.5.

Figure 5.5: The test space in (A) shows the split between overdetermined and underdetermined regions above and below the diagonal, respectively. Well-determined systems lie along the diagonal. The maximum dimensions of systems for this space are 5000 × 5000. The region outlined by the dotted line roughly indicates problem sizes of prior studies. The regions indicating prior studies and the expanded 5000 × 5000 region are displayed in the expanded space in (B) by light and dark grey markers, respectively. Circular markers along the diagonal of this expanded space indicate additional tests showing the additional problem sizes that TNT is capable of handling beyond prior studies.

The first series of tests spans systems with dimensions of 200 × 200 to 5000 × 5000, in increments of 200 for both rows and columns. A complete combination of condition numbers (10^2, 10^5, and 10^9) and percentages of active constraints (5%,

20%, and 50%) are tested. The systems for these tests are created with a specific condition number using the Matlab gallery function, and solution vectors are created with the appropriate percentage of active constraints. This range of matrices is used to test the effect of condition number and the percentage of active constraints on performance (execution time) and accuracy. The reduction and expansion constants used in the TNT algorithm for these tests are 0.5 and 2, respectively. These constants are discussed further later in this Section and in Section 5.6.2. Performance is reported as speedup of TNT over FNNLS, which is the ratio of

the FNNLS execution time over the TNT execution time (t_FNNLS / t_TNT). Numerical accuracy for all tests is reported as the solution error, which is calculated using the

2-norm of the solution residual (||b − Ax||_2). The NNLS solver included with MATLAB (lsqnonneg, which is the 1974 algorithm by Lawson and Hanson) is not included in the comprehensive tests because it was consistently outperformed by both FNNLS and TNT, while being prohibitively slow for larger systems. For example, to solve 2000 × 2000 and 3600 × 3600 systems lsqnonneg required 2.5 and 25 hours, respectively, while TNT only required 2.5 and 14 minutes, respectively. The original implementation of the Lawson and Hanson NNLS algorithm has been excluded from other studies (Kim et al., 2010) due to its prohibitively slow performance characteristics. The convergence of the TNT algorithm is analyzed by varying the constants that govern the number of variables that are added to and removed from the active set. This analysis uses two tests: one that varies the reduction constant and one that varies the expansion constant. When holding the expansion constant fixed at 1.5, the reduction constant is varied from 0.1 to 0.9 in increments of 0.1. When holding the reduction constant fixed at 0.5, the expansion constant is varied from 1.1 to 1.9 in increments of 0.1. These tests solve a 1500 × 1500 system, where κ = 10^9 and 50% of the solution variables require an active constraint. Using the results from the analysis of the reduction and expansion constants we test n × n square matrices where n is 10000 through 25000 in increments of 5000, using a high condition number (κ = 10^9) as well as 5% and 50% active constraints. Each of these tests is repeated five times and the arithmetic mean of the five execution times is used to report speedup over FNNLS. This test shows a general performance trend for square matrices beyond the 5000 × 5000 square test region. The final test that is performed compares FNNLS and TNT through solving an

extremely large (by current standards) 45000 × 45000 system. This test performs a complete combination of condition number and negativity constraints, where κ is 10² and 10⁹ and active non-negativity constraints of 5% and 50% are applied.
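To make the measurement procedure concrete, the following Python/NumPy sketch illustrates how such a test could be assembled: a matrix with a prescribed condition number is built from a chosen singular value spectrum, a non-negative solution with a given fraction of active (zero) constraints defines the right-hand side, and a solver is timed and scored by the 2-norm of its residual. The actual tests were performed in Matlab with the gallery function; the helper names here (make_test_system, time_solver) and the use of SciPy's Lawson-Hanson nnls as a stand-in solver are illustrative assumptions, not the dissertation's code.

import time
import numpy as np
from scipy.optimize import nnls  # Lawson-Hanson NNLS, used here only as a stand-in solver

def make_test_system(m, n, cond, active_frac, rng):
    # Build A with a prescribed condition number from a geometric singular value
    # spectrum, and a right-hand side whose true solution has a fraction
    # `active_frac` of its entries pinned at zero (active constraints).
    k = min(m, n)
    s = np.geomspace(1.0, 1.0 / cond, k)
    U, _ = np.linalg.qr(rng.standard_normal((m, k)))
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))
    A = U @ np.diag(s) @ V.T
    x_true = rng.random(n)
    x_true[rng.choice(n, int(active_frac * n), replace=False)] = 0.0
    return A, A @ x_true

def time_solver(solver, A, b):
    # Return the wall-clock time and the 2-norm of the solution residual.
    t0 = time.perf_counter()
    x = solver(A, b)
    return time.perf_counter() - t0, np.linalg.norm(b - A @ x)

rng = np.random.default_rng(0)
A, b = make_test_system(400, 400, cond=1e5, active_frac=0.2, rng=rng)
t_ref, err_ref = time_solver(lambda A, b: nnls(A, b)[0], A, b)
print(f"reference NNLS: {t_ref:.3f} s, ||b - Ax||_2 = {err_ref:.2e}")

Speedup for a pair of solvers is then simply the ratio of their measured wall-clock times, as in t_FNNLS / t_TNT above.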

5.6 Discussion of results

5.6.1 SVC

In our small test suite SVC outperforms Matlab in all cases, especially as the matrices grow larger. These results can be seen in Figure 5.6. In the tested range SVC achieves a speedup of approximately 11x at the largest sizes for both underdetermined and overdetermined test cases. However, SVC performs particularly well for tests of square matrices. This is because square matrices represent the best input conditions for SVC, since Steps 3, 5, and 6 (see Listing 1 of Appendix A) are not performed. This effectively reduces the algorithm to recursive rotations of 2k sub-matrices. For square matrices, SVC achieves a speedup of 200x over Matlab.

Beyond the speedups in Figure 5.6, the most important result is the growth of the work factors displayed in Figure 5.7. Notable peaks appear in the SVC-LAPACK square test results. These peaks are very near powers of two. These performance peaks occur for SVC at square, power-of-two matrices for two reasons: 1) Steps 3, 5, and 6 of the SVC algorithm (see Listing 1 of Appendix A) are not performed due to the square dimensions of the matrix, and 2) having a power-of-two dimension means that the 2k recursive rotations step will not have to recurse. Similar features exist in the curve illustrating SVC speedup over Matlab, but the peaks near 256 and 512 are not as apparent due to the scale of the plot.

The suitability of SVC for the generation of large, square matrices is demonstrated by comparing the growth rates of the execution times for LAPACK and SVC. When comparing these growth rates, it is seen that SVC exhibits a lower growth rate, as shown in Figure 5.7. The measured slope of the dlatms function from LAPACK is 3.0002 while the measured slope of SVC is only 2.0001. These slopes are the empirical basis for our claims that existing methods have work factors that grow as O(n³) while the SVC algorithm has a work factor that grows as O(n²). For the largest square matrices tested to produce Figure 5.7, SVC exhibits 90x speedup over LAPACK.
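The growth-rate claims above rest on fitting a line to execution time as a function of matrix size on log-log axes; the slope of that line is the empirical work-factor exponent. The short Python sketch below shows the fit itself; the timing values are placeholders chosen only to illustrate the procedure and are not measurements from this study.

import numpy as np

# Hypothetical wall-clock times (seconds) for generating n x n test matrices
# with two different generators; placeholder values, not measured data.
sizes = np.array([256, 512, 1024, 2048, 4096])
t_cubic_like = np.array([0.02, 0.16, 1.3, 10.4, 83.0])
t_quadratic_like = np.array([0.01, 0.04, 0.16, 0.65, 2.6])

def growth_exponent(n, t):
    # Slope of a least-squares line through (log n, log t); a slope near 3
    # indicates O(n^3) growth and a slope near 2 indicates O(n^2) growth.
    slope, _intercept = np.polyfit(np.log(n), np.log(t), 1)
    return slope

print("cubic-like exponent:    ", growth_exponent(sizes, t_cubic_like))
print("quadratic-like exponent:", growth_exponent(sizes, t_quadratic_like))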


Figure 5.6: Speedup results for the Matlab implementation of SVC vs. the Matlab gallery function and a native LAPACK dlatms function, in the left and right columns, respectively. In order, the rows correspond to the underdetermined, overdetermined, and square tests. Note that, for this small range, the underdetermined and overdetermined comparisons against LAPACK show a slowdown for SVC.

5.6.2 TNT

Results from the tests described in Section 5.5.2 are discussed in the remainder of this Section in terms of execution time, error, convergence, and scaling.

Figure 5.7: Execution times of LAPACK and SVC when generating square, power-of-two matrices. The measured growth rates are 3.0002 and 2.0001, respectively.

Execution Time

Speedup results for tests of the three condition numbers are grouped by low (5%), medium (20%), and high (50%) active constraints. These results are shown in Figure 5.8, where each group comprises a speedup surface for the test region (Figure 5.5), composed of 625 individual tests, and a cross-section view of the surface with five transects at 1000-row increments.

These results show that TNT outperforms FNNLS for almost all matrix dimensions tested. TNT does not outperform FNNLS (speedup ≤ 1) when one or both of two primary features describe the system of equations: 1) the system of equations is small, meaning that at least one dimension is on the order of 10², and 2) the system of equations is very ill-conditioned and underdetermined.

When the system of equations is small the performance of the TNT and FNNLS algorithms cannot strongly differ. If the system is sufficiently small the time spent by TNT to intelligently construct the active set is detrimental to the point of mitigating


Figure 5.8: Speedup results are shown for a complete combination of negativity constraints (5%, 20%, and 50%; vertical axis) and condition numbers (κ = 10², 10⁵, and 10⁹; horizontal axis). The top member of each entry shows a top-down view of the performance surface and the bottom member of each entry shows five profile slices at 1000-row increments.

the benefits provided by the PCGNR least squares solver. For the systems tested, this effect limits the performance of the TNT algorithm to roughly the same level of performance as the FNNLS algorithm.

A system of equations that is ill-conditioned and underdetermined requires solutions to many difficult systems. Underdetermined systems can have an infinite number of valid solutions. The TNT algorithm spends more time evaluating the solution space to find a valid solution because the gradient does not always provide an accurate prediction when the system is ill-conditioned. This additional time spent evaluating the underdetermined solution space reduces the performance of TNT in some of these underdetermined regions when compared to FNNLS.

An additional feature present in these results is a distinct difference in performance that is a consequence of the aspect ratio of the system of equations being solved. This difference frequently appears as a “step” at the diagonal of the tests in Figure 5.8. Overdetermined and well-determined systems have unique solutions, which allow the TNT algorithm to quickly evaluate the solution space. This yields a particular increase in performance for the TNT algorithm in this region relative to the performance of FNNLS.

The tests using a high condition number (κ = 10⁹) do not exhibit performance surfaces that are similar to the characteristic performance surfaces seen in the tests using smaller values of κ. High condition numbers make it difficult for TNT to accurately predict which variables should be constrained in the active set. This property damps performance, particularly in the overdetermined and well-determined regions, where TNT would otherwise receive the most benefit from its guess. This damping diminishes the characteristic step in the performance surfaces seen in the tests using smaller values of κ, resulting in the “flattened” performance surfaces in the high condition number tests.

It is important to note that the performance of TNT relative to FNNLS increases with the size of the system it is solving. While the difference in performance for current problems might be considered small for some of these tests (speedups of less than five in some instances), the large problems that will need to be addressed in the near future will benefit from the enhanced performance of TNT.


Figure 5.9: Histograms of log₁₀ of the absolute value of the solution error for overdetermined and well-determined tests (A) and underdetermined tests (B). FNNLS error is black and TNT error is gray. TNT error is never more than the FNNLS error for overdetermined and well-determined tests and it is lower on average for underdetermined tests.

Error

Histograms of log₁₀ of the absolute value of the solution error for all tests are shown in Figure 5.9, where the 2-norm of the residual, ||b − Ax||₂, for each test

is used to calculate the solution error. Separate histograms are shown for overdetermined and well-determined tests (Figure 5.9A) and for underdetermined tests (Figure 5.9B). Two features are notable when comparing the FNNLS and TNT results: 1) there are minimal differences in the error distributions between FNNLS and TNT for overdetermined and well-determined tests, and 2) the error in the underdetermined region is typically lower than the error in the over- and well-determined regions for FNNLS tests.

Feature 1 is not unexpected, as overdetermined and well-determined systems have a unique solution that both algorithms should produce. Figure 5.9A shows that the TNT and FNNLS solution errors are comparable and any differences are attributable to machine roundoff error.

Feature 2 is likewise not unexpected for underdetermined systems, since there can be an infinite number of feasible solutions. For underdetermined systems, the TNT solution error is typically less than the FNNLS solution error. The TNT solution error is greater than the FNNLS solution error in just one percent of the underdetermined tests. TNT produces solutions with small norms by default. When an infinite number of solutions exist, as in underdetermined systems, it is customary to choose a solution that has a small norm, since solutions with small norms can also be described as solutions with low energy states. This is a justifiable design decision due to the physics of real-world problems, where solutions corresponding to low energy states are often desired. Each time that TNT starts the CGNR algorithm it is initialized at the origin, which naturally produces solutions with small components. Other active set methods have no such constraints inherent to their design. This design decision allows the error distribution of the TNT algorithm to differ from the FNNLS error distribution in a manner that is beneficial to the solution.
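The preference for small-norm solutions in the underdetermined case can be illustrated with a few lines of NumPy. The sketch below (which ignores the non-negativity constraints for simplicity) constructs an underdetermined system, computes the minimum-norm solution, and compares it with another exact solution obtained by adding a null-space component; both fit the data, but the minimum-norm solution, the kind favored by an iteration started at the origin, has the smaller 2-norm.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 200))   # underdetermined: infinitely many exact solutions
b = A @ rng.random(200)

# Minimum-norm solution of A x = b.
x_min = np.linalg.pinv(A) @ b

# A different exact solution: add a vector from the null space of A.
v = rng.standard_normal(200)
v -= np.linalg.pinv(A) @ (A @ v)     # remove the row-space component of v
x_other = x_min + 5.0 * v

print(np.linalg.norm(b - A @ x_min), np.linalg.norm(b - A @ x_other))  # both residuals ~ 0
print(np.linalg.norm(x_min), np.linalg.norm(x_other))                  # x_min has the smaller norm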

Convergence

The convergence of TNT is dependent on the reduction and expansion constants governing the control of the active set. To examine this convergence behavior, a 1500 × 1500 system of equations, where κ = 10⁹ with 50% active constraints, is solved using different combinations of reduction and expansion constants. Varying the reduction constant between 0.1 and 0.9 in increments of 0.1 and holding the expansion constant fixed at 1.5 yields a large amount of variation in total


Figure 5.10: Total inner loop iterations to convergence due to variations in the reduction constant and the expansion constant. In (A) the expansion constant is fixed at 1.5 and the reduction constant is varied. In (B), the reduction constant is fixed at 0.5 and the expansion constant is varied.

inner loop iterations to convergence (Figure 5.10A). A large amount of this variation occurs when the reduction constant is greater than 0.5. The step from 0.5 to 0.6 incurs a penalty of 60 additional inner loop iterations to convergence. Tests using reduction constants greater than 0.5 require at least three times more total inner loop iterations to reach convergence than tests using a reduction constant less than 0.5. The opposite test, holding the reduction constant fixed at 0.5 and varying the expansion constant between 1.1 and 1.9 in increments of 0.1, produces a much smaller

variation in total inner loop iterations to convergence (Figure 5.10B). The reduction constant has the greatest effect on total inner loop iterations to convergence, spanning more than 80 iterations (Figure 5.10A). The expansion constant does not have as great an effect, exhibiting a range of just over 5 total inner loop iterations to convergence (Figure 5.10B). Although these are simple tests and analyses of the effect of the reduction and expansion constants, they indicate that an acceptable value for the reduction constant lies in the range of [0.1, 0.5] and that the expansion constant does not have much effect in the range [1.1, 1.9]. The results of these tests indicate that, to reduce the total number of inner loop iterations to convergence, reasonable values for the reduction and expansion constants would be 0.2 and 1.2, respectively.
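A sweep of this kind is straightforward to script. The Python sketch below assumes a hypothetical solver interface solve(A, b, reduction, expansion) that returns the solution and the total number of inner loop iterations; the real TNT call signature may differ, so the code only illustrates the structure of the two tests described above.

import numpy as np

def sweep_constants(solve, A, b, reductions, expansions,
                    fixed_reduction=0.5, fixed_expansion=1.5):
    # Vary one constant while holding the other fixed and record the total
    # inner loop iterations reported by the (hypothetical) solver.
    red_iters = [solve(A, b, r, fixed_expansion)[1] for r in reductions]
    exp_iters = [solve(A, b, fixed_reduction, e)[1] for e in expansions]
    return np.array(red_iters), np.array(exp_iters)

reductions = np.round(np.arange(0.1, 1.0, 0.1), 1)
expansions = np.round(np.arange(1.1, 2.0, 0.1), 1)
# red_iters, exp_iters = sweep_constants(tnt_solve, A, b, reductions, expansions)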

Performance Trends


Figure 5.11: Performance trends of speedup beyond the original 5000 × 5000 tests (dotted lines) are shown for square systems from 10000 × 10000 to 25000 × 25000 in increments of 5000 (solid line with circle markers), where κ = 10⁹, using 5% (A) and 50% (B) active constraints.

The performance behavior of TNT is tested for square matrices beyond the original 5000 × 5000 test region, in increments of 5000, up to 25000 × 25000 matrices. The speedup provided by using TNT over FNNLS for these square matrices is shown in Figures 5.11A and 5.11B, where κ = 10⁹ and active constraints (non-negativity) of 5% and 50% are applied, respectively. These results show consistent growth in performance as the size of the square matrices increases. It is important to note that this trend comes from sparsely sampled data, as the performance space in this figure is

comprised of just four data points beyond the 5000 × 5000 region. A densely sampled test may produce results that reveal more details of this behavior.


Figure 5.12: Execution times (first row) and speedups (second row) are shown for large (45000 × 45000) systems where κ = 10² (first column) and κ = 10⁹ (second column). Active constraints (non-negativity) of 5% and 50% are applied for both condition numbers. For these large tests the TNT algorithm always outperforms the FNNLS algorithm, by more than 50× in three cases with a maximum of 157×.

When solving the extremely large (45000 × 45000) systems, TNT outperforms FNNLS by at least an order of magnitude (Figure 5.12). When FNNLS was used to solve these large systems, the arithmetic mean of the execution time required was 6.63 days of continuous compute time. The arithmetic mean of the TNT execution time for the same tests was 2.45 hours. This is an average speedup of 65×. This represents a drastic shift in wait times, from FNNLS, which requires more than a work-week to solve such a system, to TNT, where multiple “same workday” results are possible. Stated slightly differently, for systems of this scale TNT could potentially accomplish in a single day what might require more than a month to

accomplish with FNNLS. The implications of these trends hold promise for the types of large science and engineering problems involving NNLS algorithms that may be addressed.

Looking beyond the average performance of these algorithms, there is a slight reduction in the relative performance gain of the TNT algorithm when the condition number is increased. Again, it is important to remember that high condition numbers make it difficult for the TNT algorithm to accurately predict which variables should be members of the active set. Additional tuning of the reduction and expansion constants could prove beneficial in this regard. The appropriate settings for the reduction and expansion constants are problem specific. For certain problem types the optimal reduction and expansion constants might be obtained using common optimization routines.

5.7 Conclusions and future work

In this paper we present the SVC algorithm for fast generation of matrices with specific condition numbers and the TNT algorithm for solving non-negative least squares problems.

In our small test suite, the SVC algorithm outperforms Matlab in every case. It is also shown that the SVC algorithm is especially well suited for producing large, close-to-square matrices, where the algorithm is effectively reduced to Step 4 (see Listing 1 of Appendix A), the 2k recursive rotations. SVC also outperforms a native LAPACK implementation for square matrix test cases for this reason. A native implementation of SVC would likely yield increased speedup over LAPACK for many test cases. The growth rates of the execution times for LAPACK and SVC illustrate the suitability of SVC for large, square problems. These results show that, for square problems, the run-time of SVC grows near the square of the problem size whereas the run-time of LAPACK grows near the cube of the problem size.

Many numerical methods requiring dense test matrices with specific condition numbers stand to benefit by using SVC. Existing test suites can make immediate use of SVC by simply replacing the existing matrix generation call. Matlab code for the SVC algorithm is available from the University of Minnesota. SVC could also act as a replacement for future test matrix suites. The work factor of SVC is effectively the same as reading a matrix from disk. By simply storing seeds

for random number generators, the storage requirements of test matrix suites can be drastically reduced while maintaining dependable test matrix production. We plan to continue investigating the performance potential of the SVC algorithm by implementing it in parallel and by extending it to sparse matrices.

The TNT algorithm is shown to typically outperform FNNLS (Bro and De Jong, 1997) both in terms of execution time and solution accuracy. The trends in the TNT performance curves show that performance increases as matrix size increases, providing up to 180× speedup in these tests. While some of the speedup provided by TNT is relatively low (≤ 5×), TNT should be able to provide additional performance as problem sizes become much larger in the near future. TNT provided more than 150× speedup for some of the large test cases in this study. The errors of the solutions found by TNT are typically equal to, or multiple orders of magnitude smaller than, those found by FNNLS. As previously stated, the original Lawson and Hanson NNLS algorithm (Lawson and Hanson, 1995) is not used for comparison due to its prohibitively poor performance.

We plan to continue developing TNT to incorporate techniques for solving sparse systems, as well as regularization methods. Both techniques will be investigated in terms of performance and accuracy. Additional work will be done to examine methods for enhancing the current active set strategy by providing it with higher quality “guesses” of the active set, particularly at high condition numbers. A more in-depth analysis of the complex effect of the reduction and expansion constants governing the active set is also warranted. At present, these constants should be considered problem-specific tuning parameters.

Additional avenues for enhancing performance will also be investigated. Native implementations of many of the linear algebra routines used in the TNT algorithm will be analyzed for potential performance enhancement. We will also investigate the performance benefit of implementing the TNT algorithm using parallel and accelerator hardware, such as GPUs.

Finally, we plan to apply the TNT algorithm to real-world problems using natural systems to demonstrate the performance benefits it provides. Natural systems that have, until now, provided prohibitively large data sets, such as the rock magnetism inversion problem presented in (Weiss et al., 2007), are of particular interest.

Chapter 6

Run-time Autotuning for Performance Using Applied Geophysical Inverse Theory

This chapter is composed of one publication currently in preparation: J.M. Myre, D. Lilja, M.O. Saar, Run-time autotuning for performance using applied geophysical inverse theory. In Prep.

It is common for scientists and engineers to use inversion methods to find well-fitting models of physical systems. It is often necessary to perform many model simulations to effectively characterize these systems. The performance of the model implementation is of critical importance in this situation, as execution time can be a prohibitive factor in effective system characterization. Enhancing simulation performance on heterogeneous systems presents a particularly difficult problem, as scientists and engineers may not be able to divert resources from their research toward understanding and optimizing codes for heterogeneous computing systems.

To enable scientists and engineers to keep focus on their primary research, this study applies the same inverse theory used to find well-fitting physical models as an optimization method for tuning applications on heterogeneous computing systems. We show that geophysical inverse theory can be used to perform run-time autotuning of common computational kernels to determine the optimal usage of the heterogeneous computing system. The optimal run-time configuration can provide up to 16× speedup compared to non-optimal configurations.

6.1 Introduction

The primary interest of many computational scientists and engineers is the outcome of their computational models. The task of understanding and accounting for high performance computer architectures is necessary to enhance the performance of computational models in order to produce results in a timely manner. However, for many researchers this task is a distraction from the scientific and engineering models that were the original focus of the research.

To be practically useful, scientific and engineering studies often require modeling and analysis of many different scenarios, which requires exploring large parameter spaces, analyzing parameter uncertainties, finding optimal models, and utilizing other inverse modeling techniques (Parker, 1994). Running just one simulation is insufficient to understand the dependence of emerging patterns and scientific laws on the underlying parameter space.

The onset of heterogeneous computing compounds this task. The difficult task of code optimization was previously restricted to handling the non-trivial space of different CPUs from multiple manufacturers and various interconnection networks. Heterogeneous computing has expanded this space to include Graphics Processing Units (GPUs), the IBM Cell Broadband Engine, and Field Programmable Gate Arrays (FPGAs). With the expansion of the machine space available for computational science, the time required to cope with the complexity of optimizing codes has grown accordingly.

This study presents the application of inverse modeling tools to computational kernels to obtain well-performing running configurations. These are the same inverse modeling tools used by scientists and engineers to characterize physical systems. These running configurations describe well-performing combinations of typical machine parameters, such as CPUs, GPUs, and number of threads. Using these inverse methods to approach optimization reduces the burden of handling the complexity of code optimization within a heterogeneous computing system.

This paper provides an overview of heterogeneous computing in Section 6.2. Common inversion methods used to characterize physical systems are outlined in Section 6.3. The suitability of these inversion methods for application in machine space to optimize machine performance metrics is covered in Section 6.4. Three case studies where inversion is applied to optimize machine metrics are described in Section 6.5 and results are discussed in Section 6.6. Conclusions and future work are presented

in Section 6.7.

6.2 Heterogeneous Computing

Driven primarily by Moore’s Law, computer systems have seen dramatic improvements in performance over the past several decades. Recently, however, manufacturers have been unable to continue increasing performance by simply increasing the processor’s clock rate (Hennessy and Patterson, 2007). Instead, they have been moving toward multiple computational cores on a single chip (Asanovic et al., 2006) to increase computational performance. Additionally, numerous computational accelerators have been introduced into the market to serve as specialized co-processors to a general-purpose processor to provide substantial performance improvements for specific types of applications. Large computational clusters of these special-purpose accelerators have offered great potential as computing systems. Such systems include the National Center for Supercomputing Applications’ Lincoln Tesla GPU cluster (NCSA, 2009) and Los Alamos National Laboratory’s Roadrunner IBM Cell cluster (Barker et al., 2008), the world’s first petaflop computer. The High-Performance Computing (HPC) community has acted as interested adopters of special purpose accelerators, particularly GPUs, with good returns. Two of the top ten supercomputers on the Nov. 2012 Top500 list incorporate GPUs, one of which holds the 1st position (top500.org, 2011).

One of the key difficulties in exploiting heterogeneous systems is that the high performance of accelerators comes at the expense of programmability. For example, porting code to run on accelerators, such as a GPU or CellBE, requires the programmer to simultaneously trade off, in non-obvious ways, several parameters, such as the granularity of parallelism, register utilization, the distribution of data across memory, and the synchronization of processing cores within the accelerator and with the CPU. Adding to this complexity is the lack of agreement on a uniform programming model for accelerators. A variety of Application Programming Interfaces (APIs) have been proposed or implemented for different domains, such as NVIDIA’s CUDA programming tools, the API from RapidMind, Microsoft’s Bulk-Synchronous GPU Programming tool, IBM’s Software Development Kit (SDK) for the CellBE (IBM, 2007), and the OpenCL (Khronos OpenCL Working Group, 2009) standard currently being developed. While these efforts are a definite improvement over prior custom

solutions, they tend to be low-level, overly focusing on the underlying architecture of the corresponding accelerators, which limits their broader use. Because of this complexity, application programmers may need to develop a unique version of their code for each architecture available in the heterogeneous system.

Developing and analyzing computational codes by focusing on individual low-level architectures provides a high potential for performance gains. However, such an approach comes with a high learning curve. In a heterogeneous system where multiple accelerators are present, each with a different learning curve, an optimal application code providing maximum performance becomes secondary to a functional application code that provides acceptable performance. As such, the necessity of understanding and addressing multiple low-level APIs in a heterogeneous system is a severe impediment to application implementation for scientists.

This study addresses the difficulty that scientists and engineers encounter when analyzing heterogeneous systems to determine the optimal computational units for executing a specific kernel. We use techniques developed in the field of geophysical inversion (Aster et al., 2012; Parker, 1994) to perform run-time autotuning of computational kernels. Upon completion of run-time autotuning, an optimal machine configuration to maximize performance is determined. This machine configuration can describe which computational units of the heterogeneous system should be used during execution as well as other discrete aspects of the computational kernel, such as vector width or internal solver methods. Once this machine configuration is obtained it can be used thereafter to execute the kernel.

6.3 Common Inversion Methods

Inverse methods are commonly applied to physical models and measurements to determine the model parameters that produce the solution that most closely matches the measurements. This arrangement can be described as:

f(x) = d,     (6.1)

where f is the physical model, d is the vector of measurements, and x is the vector of model parameters that must be obtained to produce a solution that minimizes the error with d. Physical inverse theory has produced methods to address linear and non-linear

systems. As computer systems are highly non-linear, the methods covered in the remainder of this Section will all be common non-linear methods. In order, these methods include the non-linear conjugate gradient method, the simplex method, and simulated annealing.

6.3.1 Non-Linear Conjugate Gradient

The non-linear conjugate gradient (NLCG) method is a classic descent method, meaning that it searches for optimal parameter sets by descending towards a solution along the gradient of the result space in successive conjugate directions (Shewchuk, 1994). As with all descent methods, the primary goal of performing an inversion with the NLCG method is to perform an intelligent descent towards a minimum misfit.

The NLCG descent begins at an arbitrary point selected at random or according to a priori knowledge. The gradient of the parameter space is taken at that point. The point of minimum misfit along the gradient is found by performing a line search along the direction of descent. The operational point is updated to the point of minimum misfit along the line of steepest descent and the process begins to iterate, with one exception. After the first iteration the steepest descent is not necessarily the search direction any more. The scalar value β is used to calculate the current search direction by balancing the previous search direction and the current steepest descent. Two options exist for the calculation of β, neither clearly outperforming the other for all criteria. These options are the Fletcher-Reeves and the Polak-Ribière methods,

β^FR_(i+1) = ( r^T_(i+1) r_(i+1) ) / ( r^T_(i) r_(i) )     (6.2)

and

β^PR_(i+1) = max( r^T_(i+1) ( r_(i+1) − r_(i) ) / ( r^T_(i) r_(i) ), 0 ),     (6.3)

respectively. The Fletcher-Reeves method will converge given correct initial conditions, while the Polak-Ribière method can, under certain conditions, cycle infinitely without converging. In most cases the Polak-Ribière method converges more quickly. Conveniently, convergence of the Polak-Ribière method can be guaranteed by setting β^PR = max(β^PR, 0). This modification essentially restarts the NLCG method if β^PR < 0.

Once a supplied tolerance or maximum iteration count is reached, the NLCG method ends. It should be noted that NLCG will not necessarily descend to the global minimum of the space. NLCG will descend towards a local minimum but can “break out” if the convergence criteria are not reached.
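The following Python sketch shows the structure of the method just described: a residual (negative gradient) drives the search, a line search is performed along the current conjugate direction, and the Polak-Ribière β with the max(β, 0) restart combines the new steepest-descent direction with the previous search direction. The backtracking line search is a simplification of the exact line search described above, and the quadratic example at the end is purely illustrative.

import numpy as np

def nlcg(f, grad, x0, tol=1e-8, max_iter=200):
    # Non-linear conjugate gradient with the Polak-Ribiere update (Eq. 6.3)
    # and the beta = max(beta, 0) restart.
    x = np.asarray(x0, dtype=float)
    r = -grad(x)            # residual, i.e., the steepest-descent direction
    d = r.copy()
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        # Simple backtracking line search along d (a stand-in for an exact search).
        alpha = 1.0
        while f(x + alpha * d) > f(x) and alpha > 1e-12:
            alpha *= 0.5
        x = x + alpha * d
        r_new = -grad(x)
        beta = max(r_new @ (r_new - r) / (r @ r), 0.0)
        d = r_new + beta * d
        r = r_new
    return x

# Illustrative quadratic misfit: f(x) = 0.5 x^T Q x - c^T x, gradient Q x - c.
Q = np.array([[3.0, 0.5], [0.5, 1.0]])
c = np.array([1.0, -2.0])
print(nlcg(lambda x: 0.5 * x @ Q @ x - c @ x, lambda x: Q @ x - c, np.zeros(2)))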

6.3.2 The Simplex Method

The simplex method evaluates a parameter space using an N-dimensional polytope. The Nelder-Mead method is a downhill optimization method and a popular implementation of the simplex method for non-linear problems (Nelder and Mead, 1965; Kolda et al., 2003). Oftentimes the polytope is referred to as an amoeba due to the shape it takes in the parameter space. Simple rules are used to “move” the polytope throughout the parameter space. These rules include expansion, contraction, reflection, and reduction. The simplex method is straightforward to implement and can easily handle constraints. However, the simplex method may not be as fast as alternative methods and convergence is not guaranteed (McKinnon, 1998).
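SciPy's implementation of the Nelder-Mead simplex can stand in for a minimal example of this kind of downhill search; the two-parameter exponential forward model and the synthetic data below are illustrative assumptions and are not part of this study.

import numpy as np
from scipy.optimize import minimize

# Synthetic data from a simple two-parameter forward model: amplitude * exp(-decay * t).
t = np.linspace(0.0, 1.0, 50)
data = 2.0 * np.exp(-3.0 * t)

def misfit(params):
    amplitude, decay = params
    return np.linalg.norm(amplitude * np.exp(-decay * t) - data)

result = minimize(misfit, x0=[1.0, 1.0], method="Nelder-Mead")
print(result.x)   # approaches the true parameters [2.0, 3.0]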

6.3.3 Simulated Annealing

The simulated annealing method is based on the physical process of annealing encountered in metallurgy and materials science. During a physical annealing process, a material is heated to a point where its atoms are allowed to redistribute to a lower energy state, thus removing defects and dislocations. The degree of redistribution is controlled by the cooling rate that the material is subjected to. The simulated annealing method emulates this process using a mathematical objective function as a measure of energy (Kirkpatrick et al., 1983; Černý, 1985). Given sufficient time to cool, the simulated annealing method is guaranteed to find the global optimum (Granville et al., 1994), although this is equivalent to a complete search of the space. Convergence to the global optimum for short cooling schedules is not guaranteed.
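A minimal Metropolis-style simulated annealing loop with a geometric cooling schedule is sketched below in Python; the objective, the neighbor move, and the cooling parameters are illustrative placeholders. A slower (longer) cooling schedule trades run time for a better chance of reaching the global optimum.

import math
import random

def simulated_annealing(objective, neighbor, x0, t0=1.0, cooling=0.95, steps=2000):
    # Minimize `objective` by proposing neighbor states and accepting uphill
    # moves with a Boltzmann probability that shrinks as the temperature cools.
    x, fx = x0, objective(x0)
    best, fbest = x, fx
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(x)
        fc = objective(candidate)
        if fc < fx or random.random() < math.exp(-(fc - fx) / max(temperature, 1e-12)):
            x, fx = candidate, fc
            if fx < fbest:
                best, fbest = x, fx
        temperature *= cooling
    return best, fbest

# Toy usage: a 1-D objective with a random-walk neighbor move.
best_x, best_f = simulated_annealing(lambda x: (x - 3.0) ** 2 + math.sin(5.0 * x),
                                     lambda x: x + random.uniform(-0.5, 0.5), x0=0.0)
print(best_x, best_f)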

6.4 Applying Inversion in Machine Space

Inversion methods can be applied in machine space to optimize the running configuration of a program. This running configuration defines the components of the computer (CPUs, GPUs, numbers of threads) to use, how data should be distributed (uniformly, by device capacity, by device bandwidth, etc.), and tunable algorithm parameters (e.g., the reduction and expansion constants used by the TNT algorithm (Frahm et al., In Review)). This problem can be stated as:

f(x) = d, (6.4)

where f is the computational kernel or function to optimize, x is the vector of machine space parameters, and d is an objective measure of performance (execution time, power usage, or computational throughput, for example).

Classic inversion methods were developed to address physical problems where the model parameters are typically real (floating point) numbers. Computer system configurations cannot be appropriately described using real numbers. For instance, a computational kernel cannot be executed using 0.43 CPUs, 1.72 NVIDIA GPUs, and 2.45 AMD GPUs. The mathematics governing most of the classic inversion methods are therefore not applicable to computer systems. For this reason the NLCG and simplex methods are easily eliminated as optimization routine candidates. To properly perform machine space inversions, integer programming, in which parameters must be integers, has to be considered.

Two of the most frequently used and well-developed integer programming techniques are branch-and-bound (Land and Doig, 1960) and branch-and-cut (Stubbs and Mehrotra, 1999). However, these methods require the use of floating point parameters to obtain initial and intermediate solutions. The inclusion of floating point numbers renders these methods unsuitable for run-time autotuning for the same reason that the NLCG and simplex methods are found unsuitable.

Simulated annealing, although a heuristic method, is a straightforward optimization routine candidate for run-time autotuning to determine near-optimal machine configurations. The term near-optimal is used because the constraints applied to the simulated annealing method, along with its stochastic nature, make it impossible to ensure that the absolute global performance optimum is produced. However, for the remainder

of this paper the term “optimal” will be used to describe the highest performing run-time configuration determined by the simulated annealing method, understanding that the configuration is acceptably optimal for many scientists and engineers. By using simulated annealing it is trivial to implement constraints as well as to mix integer and floating point parameter types. Mixing parameter types becomes necessary when the tuned algorithm parameters are floating point values that form part of the machine configuration. For the remainder of this paper the simulated annealing method will be the optimization routine used to determine well-performing machine configurations for the case studies in Section 6.5.
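In machine space the state is simply a tuple of discrete choices and the objective is a measured execution time. The Python sketch below encodes a configuration as (CPU count, GPU count, thread count), proposes neighbors by perturbing one dimension at a time, and times a kernel launch as the objective; run_kernel is a hypothetical hook into the code being tuned, and the search dimensions mirror those used in Section 6.5. The annealer from the previous listing can then search this space directly.

import random
import time

# Discrete search dimensions for a run-time configuration.
CPU_COUNTS = [0, 1]
GPU_COUNTS = [0, 1, 2]
THREAD_COUNTS = [128 * p for p in range(1, 11)]

def neighbor(config):
    # Propose a new configuration by re-drawing one dimension at random.
    cpus, gpus, threads = config
    dim = random.randrange(3)
    if dim == 0:
        cpus = random.choice(CPU_COUNTS)
    elif dim == 1:
        gpus = random.choice(GPU_COUNTS)
    else:
        threads = random.choice(THREAD_COUNTS)
    if cpus == 0 and gpus == 0:      # constraint: at least one device must be used
        gpus = 1
    return (cpus, gpus, threads)

def make_objective(run_kernel):
    # Objective measure of performance: wall-clock time of one kernel execution.
    # `run_kernel(cpus, gpus, threads)` is a hypothetical hook into the tested code.
    def objective(config):
        t0 = time.perf_counter()
        run_kernel(*config)
        return time.perf_counter() - t0
    return objective

# best_config, best_time = simulated_annealing(make_objective(run_kernel),
#                                              neighbor, x0=(1, 1, 128))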

6.5 Case Studies and Methodology

To demonstrate the applicability of geophysical inversion methods to optimizing the performance of accelerator codes on heterogeneous systems we use five representative test codes. These test codes represent common computational routines that could benefit from special purpose accelerator devices. For each test case, we use the simulated annealing method to obtain machine configurations where execution time is used as the objective measure of performance. These case studies include vector addition, matrix multiplication, tridiagonal system solvers, radix sort, and Black-Scholes options pricing. The options used for each of these studies are shown in Table 6.1.

Vector Addition       2²⁰ elements
Matrix Multiply       A (800 × 1120), B (1200 × 800), C (800 × 800)
Tridiagonal Solver    2¹⁴ 128 × 128 systems
Radix Sort            2²⁵ elements
Black-Scholes         4 × 10⁶ options, R = 0.02, V = 0.30

Table 6.1: Configurations used for the tested case studies.

Vector addition and matrix multiplication are common basic linear algebra operations. Vector addition performs an element-wise addition of two vectors of equivalent length. The matrix multiplication case study performs the operation AB = C, where the dimensions of A, B, and C are m × n, n × p, and m × p, respectively. Strassen’s matrix multiplication algorithm (Strassen, 1969) is used to perform this operation.

Solving tridiagonal systems of equations requires more advanced linear algebra. This case study implements parallel cyclic reduction, cyclic reduction, and sweep methods (Zhang et al., 2011, 2010). The popular SPIKE algorithm (Chang et al., 2012; Polizzi and Sameh, 2006) is not implemented as it has been shown to be slower than the aforementioned methods (Myre et al., In Prep.a).

Radix sort is a fast parallel sorting algorithm that relies on binning of elements by bit significance instead of direct comparison (Kumar et al., 1994, Section 9.6.2). Over the course of the algorithm, elements are combined into bins according to the presence of an active bit (a 1) in the current significance position, reducing the number of bins.

The Black-Scholes model for options pricing provides a partial differential equation for the time evolution of the value of European options (Black and Scholes, 1973). The NVIDIA OpenCL implementation of the Black-Scholes PDE depends on the number of options, the compound interest rate (R), and the volatility of the underlying stock (V).
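For reference, the closed-form Black-Scholes prices evaluated by such a kernel for a European call and put are sketched below in Python; the spot and strike values are arbitrary, while R = 0.02 and V = 0.30 match the benchmark settings in Table 6.1. The GPU benchmark evaluates the same formulas independently for millions of options, which is why it parallelizes so well.

import numpy as np
from scipy.stats import norm

def black_scholes(spot, strike, t, rate, vol):
    # Closed-form Black-Scholes prices of a European call and put.
    d1 = (np.log(spot / strike) + (rate + 0.5 * vol ** 2) * t) / (vol * np.sqrt(t))
    d2 = d1 - vol * np.sqrt(t)
    call = spot * norm.cdf(d1) - strike * np.exp(-rate * t) * norm.cdf(d2)
    put = strike * np.exp(-rate * t) * norm.cdf(-d2) - spot * norm.cdf(-d1)
    return call, put

print(black_scholes(spot=100.0, strike=95.0, t=1.0, rate=0.02, vol=0.30))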

                      Intel CPUs   NVIDIA GPUs   Miscellaneous
Vector Addition           X             X        Number of Threads
Matrix Multiply           X             X        Number of Threads
Tridiagonal Solver        X             X        Solver Algorithm
Radix Sort                X             X        Number of Threads
Black-Scholes             X             X        Number of Threads

Table 6.2: Optimization dimensions used for the tested case studies.

All tests are set up to make all devices present in the system available for forming machine configurations. These devices are two NVIDIA GPUs and one Intel CPU. Multi-device tests (vector addition and matrix multiply) determine the best performing machine configuration that can be composed of any combination of those devices. Single device tests (tridiagonal solver, radix sort, and Black-Scholes) determine the

best performing device for the test from the devices that are present. An additional parameter is used for each test to present a three dimensional optimization space, where the first and second dimensions are represented by NVIDIA and Intel devices, respectively. Excluding the tridiagonal solver, all tests use the number of operating threads as the third dimension. The number of operating threads is always considered to be 128p, where p is between 1 and 10. The tridiagonal test is unique among these tests as multiple types of solver algorithms are available. By treating the algorithm type as an additional parameter in the machine configuration, the best performing combination of devices and solver algorithm can be determined. The dimensions used for each test case can be found in Table 6.2.

All case study implementations are tested using a host computer composed of one Intel Core2 quad-core CPU at 2.66 GHz and 8 GiB of shared memory. Attached to this host are two NVIDIA 8800 GTX GPUs of compute capability 1.0. The host code is compiled using the GNU g++ compiler version 4.6.4 with the compiler flag “-O3” for optimizations, and the OpenCL kernel code is compiled using NVIDIA’s OpenCL implementation version 5.0 and the Intel OpenCL implementation version 1.5, both with the compiler flag “-cl-fast-relaxed-math.”

6.6 Discussion of Results

The run-time autotuning process determined a well-performing configuration for every test. A statistical analysis of the performance data from the run-time autotuning of each test is presented in Section 6.6.1. The optimal machine configurations determined by the run-time autotuner are discussed in Section 6.6.2.

6.6.1 Performance Analysis

The performance results obtained for these case studies are shown in Figures 6.1 and 6.2. The results have been separated into two figures so that different orders of magnitude can be observed. It is most common for the timing results of a given test case to span a single order of magnitude, but in some instances (Tridiagonal and Vector Addition) multiple orders of magnitude are spanned. The use of simulated annealing to find an optimal run-time configuration can yield significant speedup over non-optimal configurations (Figure 6.3). To compare against a typical scenario where an average performing configuration is selected we use the

Figure 6.1: Performance results for Black-Scholes, Radix Sort, and Vector Add. Mean execution times are denoted by an X and minimum execution times as a circle. Error bars extend to 1 standard deviation above and below the mean. If the lower error bar passes below zero it is truncated at zero.

Figure 6.2: Performance results for Matrix Multiplication and the Tridiagonal Solver. Mean execution times are denoted by an X and the minimum execution times obtained through run-time autotuning as a circle. Error bars extend to 1 standard deviation above and below the mean. If the lower error bar passes below zero it is truncated at zero to retain physical meaning.

mean execution time of all configurations tested within a case study. By comparing the mean execution time across all tested run-time configurations with the execution time of the optimal configuration of each test, it is seen that the configurations obtained by the run-time autotuner provide an average of 4× speedup.

In some unfortunate cases, the scientist or engineer may have unknowingly been running their code using the run-time configuration with the worst performance. To compare against this worst case scenario we use the maximum execution time of all configurations tested within a case study. Comparing the maximum execution time across all tested run-time configurations with the optimal configuration for each test shows that the optimal configuration provides an average of 16× speedup over the worst performing configuration.

Figure 6.3: Speedups provided by the optimal run-time configuration compared to the average and maximum execution times of the tested machine configurations for each case study.

6.6.2 Optimal Run-Time Configurations

The simulated annealing method determined the optimal run-time configurations for all of the tested case studies (shown in Figure 6.4). There is a high degree of similarity among the run-time configurations for all of the case studies. For all of the tested case studies the run-time autotuner determined that one GPU and zero Intel CPUs comprised the optimal architectural aspects of the machine configuration. This is due to the parallel nature of the selected case studies. For the types of computational kernels tested here it is expected that NVIDIA GPUs would outperform Intel CPUs (Lee et al., 2010), even if by only a small amount.

Figure 6.4: The optimal run-time configurations obtained for each case study.

Computational kernels that require more branching or are more serial in nature would likely show a stronger preference for traditional CPU architectures.

A primary source of variation across the case studies is the number of operating threads. The Black-Scholes case study performed best with 128 active threads while the vector addition and matrix multiplication case studies used 256. The selection of optimal active threads is similar for these case studies. A higher resolution investigation of the active thread dimension might yield additional insight into the actual difference between case studies.

The radix sort case study shows a unique preference among these case studies for a large number of active threads, where the best performing run-time configuration utilizes 896 active threads. The large number of threads employed in radix sort allows the GPU to continue to do meaningful work as the number of active “bins” is reduced over the course of the sort.

The run-time configuration determined for the tridiagonal solver incorporated the sweep solver algorithm. Although this algorithm is not highly parallel, its selection over the PCR algorithm is explained by its run-time complexity, which is O(N). The complexity of the PCR algorithm is O(N log(N)). The original CR algorithm also has a complexity of O(N), but its branching statements reduce its performance on GPUs.
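The O(N) cost of the sweep approach is easiest to see in a serial sketch of the classic forward-elimination/back-substitution pass (the Thomas algorithm); the GPU sweep solver in the case study follows the same arithmetic, so this NumPy version is only meant to illustrate why its operation count is linear in N.

import numpy as np

def tridiagonal_sweep(lower, diag, upper, rhs):
    # Solve a tridiagonal system in O(N): one forward elimination sweep
    # followed by one back-substitution sweep.
    # `lower` and `upper` have length N-1; `diag` and `rhs` have length N.
    n = len(diag)
    b = np.array(diag, dtype=float)
    c = np.array(upper, dtype=float)
    d = np.array(rhs, dtype=float)
    for i in range(1, n):                   # forward elimination
        w = lower[i - 1] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    x = np.empty(n)
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):          # back substitution
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i]
    return x

# A 128-unknown example, matching the per-system size in the tridiagonal case study.
n = 128
x = tridiagonal_sweep(np.full(n - 1, -1.0), np.full(n, 4.0), np.full(n - 1, -1.0), np.ones(n))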

6.7 Conclusions and Future Work

Heterogeneous computing systems have been accepted and adopted by the HPC community and yield good returns on performance. However, the difficulty in implementing and analyzing heterogeneous codes remains a barrier to adoption. We have shown in this paper that the use of geophysical inversion methods as a run-time autotuning technique reduces the difficulty of analyzing the performance and finding optimal run-time configurations of computational codes on heterogeneous systems.

Although these results are limited to five case studies, vector addition, matrix multiplication, tridiagonal system solving, radix sort, and Black-Scholes, we can begin to show generally that the simulated annealing inversion method can determine optimal run-time configurations for heterogeneous systems. We further demonstrate that determining the optimal run-time configuration can provide, on average, up to 16× speedup over non-optimal configurations.

We plan to continue examining additional types of computational kernels, beyond the highly parallel kernels presented in this paper, to form a more complete, representative sample of computational codes. Studying the response of additional types of codes will provide insight into what combinations of heterogeneous system components and computational kernels are well matched to maximize performance. A second space that will be examined is the inclusion of additional heterogeneous system components, including AMD GPUs, Intel MICs, and distributed heterogeneous systems using MPI. By examining this space we will be able to construct tools that will assist system architects in steering system design towards maximizing performance for specific computational efforts. The potential of this tool as a method for driving architectural design will also be investigated. It is likely that the ability of this tool to handle complex, non-linear parameter spaces will be beneficial to problems like resource layout and balance in chip design.

Additional metrics can be used as the objective measure of performance to drive the run-time autotuner. Metrics such as computational throughput and power consumption could be used to enhance performance or reduce the environmental impact of high-performance computing systems. A Pareto type of analysis could also be used to optimize combinations of metrics.

Chapter 7

A Modular Environment for Geophysical Inversion and Run-time Autotuning Using Heterogeneous Computing Systems

This chapter is composed of one publication currently in preparation: J.M. Myre, D. Lilja, M.O. Saar, A modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems. In Prep.

Heterogeneous computing systems, composed of standard CPUs and special purpose accelerators, particularly Graphics Processing Units, have come to the forefront of the High-Performance Computing (HPC) community’s interest. Large scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. Simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge, particularly for scientists and engineers for whom HPC architectures and performance analysis may not necessarily be primary research objectives.

To enable scientists and engineers to remain focused on their primary research objectives, we present a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems. Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, we show that this environment provides scientists and engineers with means to reduce the programmatic complexity of applications, perform geophysical inversions of physical processes, and determine high-performing run-time configurations using a run-time autotuner.

7.1 Introduction

The High-Performance Computing (HPC) community has acted as interested adopters of heterogeneous computing systems, with good returns. Heterogeneous computing systems are composed of standard CPUs and special purpose accelerators, particularly Graphics Processing Units (GPUs). Two of the top ten supercomputers on the Nov. 2012 Top500 list incorporate GPUs, one of which holds the 1st position (top500.org, 2012).

To be practically useful, scientific and engineering studies often require modeling and analysis of many different scenarios, which requires exploring large parameter spaces, analyzing parameter uncertainties, finding optimal models, and utilizing other inverse modeling techniques (e.g., Parker, 1994). Running just one simulation is insufficient to understand the dependence of emerging patterns and scientific laws on the underlying parameter space.

By enabling scientists and engineers to exploit the resources available in a heterogeneous computing system, the speed and problem size of many simulations could be dramatically increased. This allows, in some cases for the first time, the execution of many simulations and thus the parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques, which are so critical for scientific discovery or engineering analysis.

However, the understanding of HPC architectures and performance analysis required for coping with heterogeneous computing systems is rarely a primary concern of scientists and engineers. These necessities are more likely to impede the progress of understanding the outcome of their computational models, acting instead as a distraction from the scientific and engineering models that were the original focus of the research.

This study presents a modular environment for scientists and engineers to develop heterogeneous computing codes, perform geophysical inversions using their codes, and perform run-time autotuning of their codes. We show that this environment provides scientists and engineers with means to reduce the programmatic complexity of applications using CUSH (Myre et al., In review.), perform geophysical inversions of physical processes, and determine high-performing run-time configurations using a run-time autotuner (Myre et al., In Prep.d). The utility of this environment is demonstrated using three case studies: 1) a

lattice-Boltzmann method for fluid flow (Bailey et al., 2009; Myre et al., 2011; Walsh et al., 2009b), 2) a finite-difference method for flow and transport in karst conduits (Covington et al., 2011; Myre et al., In Prep.a), and 3) a non-negative least squares inversion using the TNT algorithm (Frahm et al., In Review).

This paper provides an overview of this environment and its components in Section 7.2. Three case studies used to demonstrate the utility of this environment are described in Section 7.3. Results are presented and discussed in Section 7.4. Conclusions and future work are presented in Section 7.5.

7.2 Components of the Environment

This environment is composed of three major components: 1) CUSH, a framework for reducing the programmatic complexity of heterogeneous computer systems; 2) geophysical inversion methods, which are frequently used to explore large parameter spaces and find optimal models; and 3) run-time autotuning routines designed to determine the best set of machine parameters for a given algorithm to form a well-performing run-time configuration on heterogeneous computer systems.

7.2.1 Heterogeneous System Abstraction

CUSH is a programming framework for reducing the complexity of programming heterogeneous computer systems. CUSH masks much of the difficulty that comes from working across host-device and device-device memory contexts, thereby reducing the barriers to adoption of special purpose accelerators as a computational resource for scientists and engineers.

Myre et al. (In review.) show that CUSH consistently reduces multiple measures of programmatic complexity without causing a statistically meaningful change in performance. The ability to reduce programmatic complexity while not statistically impacting performance makes CUSH a desirable interface for programming heterogeneous computing systems.

7.2.2 Geophysical Inversion Methods

Unlike forward methods, inverse methods are commonly applied to physical models and measurements to determine the model parameters that produce the solution

that most closely matches the measurements (Parker, 1994). The typical arrangement of forward problems can be described as:

f(x) = d,     (7.1)

where f is the physical model, x is the vector of input parameters, and d is the vector of data produced by the model. The inverse problem is to use the physical model (f) and observed measurements (d) to determine the model parameters x that minimize the error with d. Hallmarks of geophysical inversion theory include handling incomplete, inconsistent, or uncertain data, as well as incomplete theory describing the physical model (Aster et al., 2012). Geophysical inverse theory has produced methods to address linear and non-linear systems. Common methods include the least squares method, the conjugate gradient method, the simplex method, and simulated annealing.

Since the inception of the least squares method by Gauss (Björck, 1996) it has been applied in countless scientific, engineering, and numerical contexts. It is used by statisticians as a regression method (Lilja, 2000) and by scientists and engineers who develop models which describe measured data with minimal error.

The non-linear conjugate gradient (NLCG) method is a classic descent method, meaning that it searches for optimal parameter sets by descending towards a solution along the gradient of the result space in successive conjugate directions (Shewchuk, 1994).

The simplex method evaluates a parameter space using an n-dimensional polytope. The Nelder-Mead method is a downhill optimization method and a popular implementation of the simplex method for non-linear problems (Nelder and Mead, 1965; Kolda et al., 2003). The simplex method is straightforward to implement and can easily handle constraints. However, the simplex method may not be as fast as alternative methods and convergence is not guaranteed (McKinnon, 1998).

The simulated annealing method is based on the physical process of annealing encountered in metallurgy and materials science. During a physical annealing process, a material is heated to a point where its atoms are allowed to redistribute to a lower energy state, thus removing defects and dislocations. The degree of redistribution is controlled by the cooling rate that the material is subjected to. The simulated annealing method emulates this process using a mathematical objective function as a measure of

energy (Kirkpatrick et al., 1983; Černý, 1985). Given sufficient time to cool, the simulated annealing method is guaranteed to find the global optimum (Granville et al., 1994), although this is equivalent to a complete search of the space. Convergence to the global optimum for short cooling schedules is not guaranteed.

7.2.3 Run-time Autotuning for Performance

Inversion methods can be applied in machine space to optimize the running configuration of a program. This running configuration defines the components of the computer (CPUs, GPUs, numbers of threads) to use, how data should be distributed (uniformly, by device capacity, by device bandwidth, etc.), and tunable algorithm parameters (e.g., the reduction and expansion constants used by the TNT algorithm (Frahm et al., In Review)). This problem can be stated as:

f(x) = d,     (7.2)

where f is the computational kernel or function to optimize, x is the vector of machine space parameters, and d is an objective measure of performance (execution time, power usage, or computational throughput, for example).

Classic inversion methods were developed to address physical problems where the model parameters are typically real numbers, represented on computer systems using floating point numbers. Because computer system configurations are discrete, they are typically described using integers. As such, it is not appropriate to use real numbers to describe computer system configurations. For instance, a computational kernel cannot be executed using 0.43 CPUs, 1.72 NVIDIA GPUs, and 2.45 AMD GPUs. Because of this, the mathematics governing most of the classic inversion methods renders them inapplicable to computer systems. For this reason the NLCG and simplex methods are easily eliminated as optimization routine candidates. To properly perform run-time autotuning in machine space, integer programming, where parameters are constrained to integers, has to be considered.

Two of the most frequently used and well-developed integer programming techniques are branch-and-bound (Land and Doig, 1960) and branch-and-cut (Stubbs and Mehrotra, 1999). However, these methods require the use of floating point parameters to obtain initial and intermediate solutions. This inclusion of floating point numbers renders these methods unsuitable as run-time autotuning methods for the

147 same reason that the NLCG and simplex methods are found to be unsuitable. Simulated annealing, although a heuristic method, is a straight-forward optimiza- tion routine candidate for performance run-time optimizations of run-time configura- tions. By using the simulated annealing method it is trivial to implement constraints, as well as mixing integer and floating point parameter types. Mixing parameter types becomes necessary when tuning floating point algorithm parameters in addition to machine parameters as part of the run-time configuration. For the remainder of this paper the simulated annealing method will be the optimization routine used to determine well performing run-time configurations for the case studies in Section 7.3.
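As a rough illustration of how simulated annealing can operate directly on discrete machine-space parameters, the sketch below perturbs an integer run-time configuration and accepts or rejects each candidate based on a measured cost. The configuration fields, the geometric cooling schedule, and the benchmark callback are illustrative assumptions and not the autotuner's actual implementation.

    #include <algorithm>
    #include <cmath>
    #include <functional>
    #include <random>

    // A hypothetical run-time configuration: integer machine-space parameters only.
    struct RunConfig {
        int n_cpus;     // number of CPU cores to use
        int n_gpus;     // number of GPUs to use
        int n_threads;  // threads per device
    };

    // Propose a neighboring configuration by nudging one integer field by +/-1,
    // then clamping to the supplied bounds so constraints are trivially enforced.
    RunConfig neighbor(RunConfig c, const RunConfig& lo, const RunConfig& hi,
                       std::mt19937& rng) {
        std::uniform_int_distribution<int> which(0, 2), coin(0, 1);
        const int delta = coin(rng) ? 1 : -1;
        switch (which(rng)) {
            case 0:  c.n_cpus    = std::clamp(c.n_cpus    + delta, lo.n_cpus,    hi.n_cpus);    break;
            case 1:  c.n_gpus    = std::clamp(c.n_gpus    + delta, lo.n_gpus,    hi.n_gpus);    break;
            default: c.n_threads = std::clamp(c.n_threads + delta, lo.n_threads, hi.n_threads); break;
        }
        return c;
    }

    // Minimal simulated annealing loop: 'cost' runs the kernel under a candidate
    // configuration and returns the objective (e.g., execution time in seconds).
    RunConfig anneal(RunConfig current, const RunConfig& lo, const RunConfig& hi,
                     const std::function<double(const RunConfig&)>& cost,
                     double temperature = 1.0, double cooling = 0.95, int steps = 200) {
        std::mt19937 rng(42);
        std::uniform_real_distribution<double> unit(0.0, 1.0);
        double current_cost = cost(current);
        for (int i = 0; i < steps; ++i, temperature *= cooling) {
            const RunConfig candidate = neighbor(current, lo, hi, rng);
            const double candidate_cost = cost(candidate);
            // Accept improvements always; accept regressions with a probability that
            // shrinks as the temperature cools (the Metropolis criterion).
            if (candidate_cost < current_cost ||
                unit(rng) < std::exp((current_cost - candidate_cost) / temperature)) {
                current = candidate;
                current_cost = candidate_cost;
            }
        }
        return current;
    }

A floating point algorithm parameter (such as a reduction or expansion constant) could be added to the configuration and perturbed with a small continuous step, which is what makes the mixed integer and floating point case straightforward for this family of methods.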

7.3 Case Studies and Methodology

We use three case studies to demonstrate the usage of this environment: a lattice-Boltzmann method for simulating fluid flow (Bailey et al., 2009; Myre et al., 2011), a finite difference method for simulating fluid flow in karst conduits (Myre et al., In Prep.a), and the TNT algorithm (Frahm et al., In Review), an active set method for solving large non-negative least squares problems. These case studies are described in detail in the remainder of this section.

All case study implementations are tested using NVIDIA Fermi M2070 GPUs of compute capability 2.0. These GPUs are contained in a PCI-E expansion chassis connected to a host via a x16 PCI-E Gen. 2 bus. The host is composed of dual Intel Xeon L5675 six-core CPUs, each at 3.07 GHz, and 96 GiB of shared memory. The host code is compiled using the GNU g++ compiler version 4.4.5 with the compiler flag "-O3" for optimizations, and the CUDA kernel code is compiled using NVIDIA's NVCC compiler release 4.1 V0.2.1221 with the compiler flag "--use_fast_math".

This modular environment is applied to each of these case studies in a different fashion. The run-time autotuning tool is used to determine a well performing configuration for the lattice-Boltzmann method (LBM) implemented using CUSH. The run-time autotuning tool is also used to determine a well performing configuration for the TNT non-negative least squares method to solve a geophysical inversion problem. The geophysical inversion capability of this environment is applied to the karst conduit fluid flow case study: the karst conduit flow model is implemented using CUSH, and the non-linear conjugate gradient inversion method is applied to characterize a natural karst conduit. The components of this environment as applied to each case study are shown in Table 7.1.

                         CUSH    Run-time Autotuning    Geophysical Inversion

    LBM                   X               X

    Karst Conduit Flow    X                                        X

    TNT                                   X                        X

Table 7.1: The aspects of this environment that are applied in each of the case studies are marked by an X.

7.3.1 Lattice-Boltzmann Fluid Flow

The lattice-Boltzmann method is a popular technique for modeling the mechanics of complex fluids (Aidun and Clausen, 2010; Walsh et al., 2009a). The method has a strong spatial locality of data access and is of the structured grid computing class (Asanovic et al., 2006). Consequently, it is an excellent candidate for parallelization in single-instruction multiple-data (SIMD) environments, as provided by GPUs (Myre et al., 2011; Walsh et al., 2009a; Bailey et al., 2009; Tölke, 2008). The lattice-Boltzmann implementation analyzed in this study is based on the widely employed D3Q19 lattice model (Figure 7.1), comprising a three dimensional lattice (D3) with nineteen distinct fluid packet directions per node (Q19) (Qian et al., 1992). The fluid packets move about the lattice in a two-stage process: a streaming step, in which the fluid packets are copied from one node to the next, and a collision step, in which the fluid packets arriving at each node are redistributed according to a local collision rule. The lattice-Boltzmann implementation analyzed here simulates the Poiseuille flow of a single component single phase (SCSP) fluid through a rectangular conduit for 1000 time steps. The boundary conditions are no-slip along the X and Y axes, and periodic along the Z axis. The multi-GPU tests require communication between GPUs computing adjacent sub-lattices.
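The two-stage update described above can be summarized by the serial sketch below, which applies a BGK-style collision followed by a streaming step on a fully periodic lattice. The data layout, the relaxation form, the helper names, and the fully periodic boundary treatment are illustrative assumptions made for this sketch; the actual CUSH/CUDA implementation organizes this work into GPU kernels and handles the no-slip boundaries.

    #include <array>
    #include <cstddef>
    #include <vector>

    // Index of distribution q at lattice node (x, y, z) for an NX x NY x NZ lattice
    // with Q packet directions per node (array-of-structures layout for clarity).
    inline std::size_t idx(int x, int y, int z, int q,
                           int NX, int NY, int NZ, int Q) {
        return ((static_cast<std::size_t>(z) * NY + y) * NX + x) * Q + q;
    }

    // One combined collide-and-stream step of a BGK lattice-Boltzmann update.
    // c[q] are the discrete velocities and w[q] their weights (e.g., the D3Q19 set);
    // tau is the BGK relaxation time. Boundaries here are treated as fully periodic.
    void collide_and_stream(const std::vector<double>& f, std::vector<double>& f_next,
                            const std::vector<std::array<int, 3>>& c,
                            const std::vector<double>& w, double tau,
                            int NX, int NY, int NZ) {
        const int Q = static_cast<int>(c.size());
        for (int z = 0; z < NZ; ++z)
          for (int y = 0; y < NY; ++y)
            for (int x = 0; x < NX; ++x) {
                // Macroscopic density and velocity at this node.
                double rho = 0.0, ux = 0.0, uy = 0.0, uz = 0.0;
                for (int q = 0; q < Q; ++q) {
                    const double fq = f[idx(x, y, z, q, NX, NY, NZ, Q)];
                    rho += fq;
                    ux += fq * c[q][0]; uy += fq * c[q][1]; uz += fq * c[q][2];
                }
                ux /= rho; uy /= rho; uz /= rho;
                const double usq = ux * ux + uy * uy + uz * uz;
                for (int q = 0; q < Q; ++q) {
                    // BGK collision: relax toward the local equilibrium distribution.
                    const double cu = c[q][0] * ux + c[q][1] * uy + c[q][2] * uz;
                    const double feq = w[q] * rho *
                        (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq);
                    const double fq = f[idx(x, y, z, q, NX, NY, NZ, Q)];
                    const double fpost = fq - (fq - feq) / tau;
                    // Streaming: copy the post-collision packet to the neighbor node.
                    const int xn = (x + c[q][0] + NX) % NX;
                    const int yn = (y + c[q][1] + NY) % NY;
                    const int zn = (z + c[q][2] + NZ) % NZ;
                    f_next[idx(xn, yn, zn, q, NX, NY, NZ, Q)] = fpost;
                }
            }
    }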

Figure 7.1: An illustration of the D3Q19 lattice structure surrounding a single center (black) node.

Strong scaling tests are used to saturate the computer system with work. These tests are a natural fit to typical scientific and engineering studies, where desired resolutions result in system saturation. Weak scaling tests are not performed, although the results will be similar for large lattices. The strong scaling test cases use a cuboid lattice, X × Y × Z. Each GPU is assigned a 192 × 192 × 192 sub-lattice. Thus, the number of lattice nodes in each axis is the product of the number of GPUs in that axis and 192. The cross-over points for performance are all below 192^3 lattices per GPU (Myre et al., In review). Performance of the lattice-Boltzmann method is reported in millions of lattice updates per second (MLUPS). This metric indicates the number of lattice site collision and streaming steps performed in one second. It is calculated by

MLUPS = \frac{X \times Y \times Z}{t} \times 10^{-6},   (7.3)

where X, Y, and Z are the number of lattice nodes in each respective dimension, and t is the execution time in seconds.

The lattice-Boltzmann method used for this case study uses CUSH to reduce programmatic complexity. We use two metrics to measure programmatic complexity: non-commented source code statements (NCSS) and McCabe's cyclomatic complexity (MCC) (McCabe, 1976). These metrics originate in software engineering and have proven to be effective indicators of program complexity (Vander Wiel et al., 1998).

The run-time autotuner is used in this case study to determine the highest performing configuration of GPUs. The initial values and bounds for the device configuration are given in Table 7.2.

              Initial   Lower   Upper
    nGPU_x       2        1       8
    nGPU_y       1        1       8
    nGPU_z       1        1       8

Table 7.2: Initial values and lower and upper bounds for the number of devices in each axis.
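Tying Equation 7.3 to the device bounds in Table 7.2, the short sketch below computes the lattice dimensions implied by a candidate GPU decomposition (192 nodes per GPU per axis) and converts a measured execution time into MLUPS. The struct and function names are illustrative assumptions used only to show the arithmetic.

    // Number of GPUs used along each lattice axis; each value is constrained to the
    // integer bounds in Table 7.2 (1 to 8 GPUs per axis in this study).
    struct GpuDecomposition {
        int nx, ny, nz;
    };

    // Millions of lattice updates per second (Equation 7.3) for a strong scaling run
    // in which every GPU holds a 192 x 192 x 192 sub-lattice.
    double mlups(const GpuDecomposition& g, double execution_time_seconds) {
        const double X = 192.0 * g.nx;
        const double Y = 192.0 * g.ny;
        const double Z = 192.0 * g.nz;
        return (X * Y * Z) / execution_time_seconds * 1.0e-6;
    }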

7.3.2 Karst Conduit Fluid Flow

Physical governing equations

For fluid in karst conduits, the primary mechanism of heat exchange with the surrounding environment is conduction in the surrounding rock (Covington et al., 2011). The convective and conductive transfer rates can be represented by coupling a transient heat advection-dispersion equation for transport along a karst conduit with a radial heat conduction equation in the surrounding rock. The dimensionless forms of these equations are found by Covington et al. (2011) as

\frac{\partial T_w^*}{\partial t^*} = \frac{1}{Pe} \frac{\partial^2 T_w^*}{\partial x^{*2}} - \frac{\partial}{\partial x^*}\left( \frac{V(t)}{\bar{V}} T_w^* \right) + St \,(T_s^* - T_w^*),   (7.4)

and

\nabla^2 T_r^* = \frac{1}{r^*} \frac{\partial}{\partial r^*}\left( r^* \frac{\partial T_r^*}{\partial r^*} \right) = \Theta \frac{\partial T_r^*}{\partial t^*},   (7.5)

respectively, where T_w^* = T_w/T_{r,0} is the dimensionless water temperature, T_s^* = T_s/T_{r,0} is the dimensionless conduit wall temperature, T_r^* = T_r/T_{r,0} is the dimensionless rock temperature at the dimensionless radial coordinate r^*, T_{r,0} is the initial rock temperature (the rock temperature at infinity, or steady state background temperature), t^* = t\bar{V}/L is the dimensionless conduit flow time, x^* = x/L is the dimensionless conduit distance, r^* = r/R is the dimensionless radial coordinate, R is the conduit radius, and V(t) and \bar{V} are the conduit flow velocity and time-averaged flow velocity.

Figure 7.2: The model for heat exchange between a full-pipe karst conduit and the surrounding rock. Heat passes from the bulk, mixed water at T_w^* through a convective boundary layer into the conduit wall at temperature T_s^*. Heat then conducts through the rock body (initially at T_r^* = 1), influencing an increasing volume of rock as the duration of a thermal pulse increases. (Adapted from Covington et al. (2011).)

Equation 7.4 also contains two dimensionless numbers, Pe and St. Pe is the Péclet number for longitudinal dispersion,

Pe = \frac{L \bar{V}}{D_L},   (7.6)

and St is the Stanton number, which is the ratio of the heat flux into the conduit wall to the heat flux along the conduit and is given by

St = \frac{4 h_{conv} L}{\rho_w c_{p,w} D_H \bar{V}}.   (7.7)

The convective heat transfer coefficient, h_{conv}, is

h_{conv} = \frac{k_w Nu}{D_H},   (7.8)

where k_w is the thermal conductivity of water and Nu is the Nusselt number, which is the ratio of convective to pure conductive heat transfer across the conduit wall. For turbulent flow within a conduit, Nu is obtained via the empirically derived Gnielinski correlation (Equation 8.62 of Incropera et al. (2007)),

Nu = \frac{(f/8)(Re - 1000) Pr}{1 + 12.7 (f/8)^{1/2} (Pr^{2/3} - 1)},   (7.9)

where f is the Darcy-Weisbach friction factor, Re = \rho_w V D_H / \mu_w is the Reynolds number, Pr = c_{p,w} \mu_w / k_w is the Prandtl number, and \mu_w is the dynamic viscosity of water. Equation 7.5 also contains the dimensionless quantity \Theta, which is a dimensionless constant that scales between the conduction and advection time scales and is given by

\Theta = \frac{R^2 \bar{V}}{L \alpha_r}.   (7.10)

The dimensionless Equations 7.4 and 7.5 are subject to the boundary conditions (also shown in Figure 7.2)

T_w^*(0, t) = f(t),   (7.11)

\left. \frac{\partial T_w^*}{\partial x^*} \right|_{x^* = 1} = 0,   (7.12)

\frac{\partial T_r^*}{\partial r^*} \to 0 \quad \text{as} \quad r^* \to \infty,   (7.13)

and

\left. \frac{\partial T_r^*}{\partial r^*} \right|_{r^* = 1} = \frac{1}{2} \Theta \Psi St \,(T_s^* - T_w^*),   (7.14)

where \Psi = (\rho_w c_{p,w}) / (\rho_r c_{p,r}) is a ratio of the volumetric heat capacities of water and rock. Initial conditions are considered to be the steady state background temperature of the water and rock, given by T_w^*(x^*, 0) = T_r^*(x^*, 0) = 1.
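As a concrete illustration of Equations 7.6 through 7.10, the sketch below evaluates the dimensionless groups from a set of physical properties. The structure holding those properties and its field names are assumptions made for this example, not part of the model code.

    #include <cmath>

    // Physical properties and geometry for a full-pipe conduit segment (illustrative
    // field names; SI units throughout).
    struct ConduitProperties {
        double L;        // conduit length (m)
        double R;        // conduit radius (m)
        double D_H;      // hydraulic diameter (m)
        double V_bar;    // time-averaged flow velocity (m/s)
        double V;        // instantaneous flow velocity (m/s)
        double D_L;      // longitudinal dispersion coefficient (m^2/s)
        double rho_w;    // density of water (kg/m^3)
        double cp_w;     // specific heat of water (J/(kg K))
        double mu_w;     // dynamic viscosity of water (Pa s)
        double k_w;      // thermal conductivity of water (W/(m K))
        double alpha_r;  // thermal diffusivity of rock (m^2/s)
        double f;        // Darcy-Weisbach friction factor (dimensionless)
    };

    double peclet(const ConduitProperties& p) {     // Equation 7.6
        return p.L * p.V_bar / p.D_L;
    }

    double nusselt(const ConduitProperties& p) {    // Equation 7.9 (Gnielinski)
        const double Re = p.rho_w * p.V * p.D_H / p.mu_w;
        const double Pr = p.cp_w * p.mu_w / p.k_w;
        return (p.f / 8.0) * (Re - 1000.0) * Pr /
               (1.0 + 12.7 * std::sqrt(p.f / 8.0) * (std::pow(Pr, 2.0 / 3.0) - 1.0));
    }

    double stanton(const ConduitProperties& p) {    // Equations 7.7 and 7.8
        const double h_conv = p.k_w * nusselt(p) / p.D_H;
        return 4.0 * h_conv * p.L / (p.rho_w * p.cp_w * p.D_H * p.V_bar);
    }

    double theta(const ConduitProperties& p) {      // Equation 7.10
        return p.R * p.R * p.V_bar / (p.L * p.alpha_r);
    }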

Computational Model

To solve Equations 7.4 and 7.5, a coupled implicit finite difference algorithm is used. For each timestep of a discretized time series of an event provided as input, the algorithm first solves for rock temperatures and then for water temperatures. Rock temperatures are calculated from the rock conduction equation (Equation 7.5), using the initial water temperatures during the first time step and the most recently calculated water temperatures during every following time step. After solving for the rock temperatures, water temperatures are calculated using the advection-dispersion heat equation (Equation 7.4). The output of the conduit is recorded for each time step to provide a time series of the simulated event output.

The discretized system geometry used in this algorithm is shown in Figure 7.3. The conduit is a 1D system while the rock is a 2D system. However, since conduction in the rock along the x^* dimension is neglected, the 2D rock system can instead be considered as a series of 1D systems in the r^* dimension, where each 1D rock system is coupled to a corresponding fluid node in the conduit via the interfacing surface node.

To first solve for rock temperatures, the dimensionless governing equations from Section 7.3.2 are discretized using the implicit finite difference scheme described by Incropera et al. (2007). For the first step of the algorithm, solving for rock temperatures, the relations from Incropera et al. describing internal rock nodes and conduit surface nodes are combined. For internal rock nodes the relation is:

T_{i,j}^{k} = (1 + 2Fo) \, T_{i,j}^{k+1} - Fo \,( T_{i-1,j}^{k+1} + T_{i+1,j}^{k+1} ),   (7.15)

where i is the index along the r direction (perpendicular to the conduit), j is the index for the x direction (along the conduit), and k is the time index.

(A) [schematic of the discretized conduit and rock geometry; see caption]

(B)

\begin{pmatrix}
T_{M-1,0} & T_{M-1,1} & \cdots & T_{M-1,N-2} & T_{M-1,N-1} \\
T_{M-2,0} & T_{M-2,1} & \cdots & T_{M-2,N-2} & T_{M-2,N-1} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
T_{2,0} & T_{2,1} & \cdots & T_{2,N-2} & T_{2,N-1} \\
T_{1,0} & T_{1,1} & \cdots & T_{1,N-2} & T_{1,N-1} \\
T_{0,0} & T_{0,1} & \cdots & T_{0,N-2} & T_{0,N-1} \\
T_{w,0} & T_{w,1} & \cdots & T_{w,N-2} & T_{w,N-1}
\end{pmatrix}

Figure 7.3: A) A simple schematic showing the discretized 1-D conduit and 2-D rock system geometries. White nodes are fluid nodes, light grey nodes are surface nodes, and dark grey nodes are internal rock nodes. The grey and hash marked backgrounds correspond to rock and fluid, respectively. B) The matrix containing the corresponding terms for the schematic in Figure 7.3A, where the bottom two rows (the T_{w,j} and T_{0,j} terms) contain conduit and surface terms, respectively. The remaining rows contain internal rock node terms.

The rock surface node temperature relation provided by Incropera et al. is:

T_{0,j}^{k} + 2 Fo \, Bi \, T_{w,j}^{k} = (1 + 2Fo + 2 Fo \, Bi) \, T_{0,j}^{k+1} - 2 Fo \, T_{1,j}^{k+1}.   (7.16)

Note that the water temperature T_{w,j}^{k} is of the current time step (k) instead of the next time step (k + 1). The Fourier number, Fo,

Fo = \frac{\Delta t^*}{\Theta (\Delta r^*)^2},   (7.17)

represents the ratio of the heat conduction rate to the heat storage rate in the rock.

The Biot number, Bi,

Bi = \frac{h R \Delta r^*}{k_r},   (7.18)

is the ratio of the heat conduction resistance of the rock to the heat convection resistance of the water, where \Delta t^* and \Delta r^* are the discretized dimensionless unit time and radial distance, respectively, \Theta is a dimensionless ratio of conduction and advection timescales, h is the heat transfer coefficient, R is the radius of the conduit, and k_r is the rock thermal diffusivity.
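To illustrate how Equations 7.15 and 7.16 translate into a tridiagonal system, the sketch below assembles the coefficients for a single radial rock column and solves it with the Thomas algorithm on the CPU. This is only a serial reference sketch under the stated discretization, with an assumed treatment of the far-field boundary; the case study itself delegates these solves to GPGPU tridiagonal solvers, as discussed below.

    #include <vector>

    // Solve a tridiagonal system with the Thomas algorithm.
    // a: sub-diagonal (a[0] unused), b: diagonal, c: super-diagonal (c[n-1] unused),
    // d: right hand side. Returns the solution vector.
    std::vector<double> thomas_solve(std::vector<double> a, std::vector<double> b,
                                     std::vector<double> c, std::vector<double> d) {
        const std::size_t n = b.size();
        for (std::size_t i = 1; i < n; ++i) {          // forward elimination
            const double m = a[i] / b[i - 1];
            b[i] -= m * c[i - 1];
            d[i] -= m * d[i - 1];
        }
        std::vector<double> x(n);
        x[n - 1] = d[n - 1] / b[n - 1];
        for (std::size_t i = n - 1; i-- > 0;)          // back substitution
            x[i] = (d[i] - c[i] * x[i + 1]) / b[i];
        return x;
    }

    // Advance one radial rock column from time level k to k+1 (Equations 7.15 and 7.16).
    // T_rock[0] is the conduit surface node, T_rock[M-1] the outermost rock node, and
    // T_water is the current water temperature coupled to this column. The outer
    // boundary is closed here by simply holding the far-field node fixed, which is an
    // assumption of this sketch rather than the thesis' discretization.
    std::vector<double> step_rock_column(const std::vector<double>& T_rock,
                                         double T_water, double Fo, double Bi) {
        const std::size_t M = T_rock.size();
        std::vector<double> a(M, -Fo), b(M, 1.0 + 2.0 * Fo), c(M, -Fo), d(T_rock);

        // Surface node (Equation 7.16): the water temperature enters the right hand side.
        b[0] = 1.0 + 2.0 * Fo + 2.0 * Fo * Bi;
        c[0] = -2.0 * Fo;
        d[0] = T_rock[0] + 2.0 * Fo * Bi * T_water;

        // Far-field node: hold at its current (background) temperature.
        a[M - 1] = 0.0;
        b[M - 1] = 1.0;
        d[M - 1] = T_rock[M - 1];

        return thomas_solve(a, b, c, d);
    }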

Figure 7.4: Execution times of available GPGPU tridiagonal solvers normalized to the CUDPP solver. (A) shows times for solving 192 × 192 systems with 1024 right hand sides. (B) shows times for solving a single 1024 × 1024 system.

The systems of equations required for this model are tridiagonal, which makes it possible to take advantage of existing high-performance parallel GPGPU solvers. Numerous tridiagonal GPGPU solvers exist, including an implementation of the Spike algorithm (Polizzi and Sameh, 2006) by Chang et al. (2012), gtsv from CUSPARSE (NVIDIA, 2013), and the solver supplied with the CUDA Data Parallel Primitives (CUDPP) library (Zhang et al., 2011, 2010). These solvers provide tested methods for equation solving as well as increased computational speed. To determine the most efficient solver for use in this GPGPU implementation we compared the performance of the Spike, CUSPARSE, and CUDPP solvers using two tests: 1) a small system (192 × 192) with many (1024) right hand sides and 2) a large system (1024 × 1024) with a single right hand side. Figures 7.4A and 7.4B show execution times for each of these solvers normalized to the CUDPP solver for 192 × 192 systems with 1024 right hand sides and for a single 1024 × 1024 system, respectively. The execution time of each test is recorded as the arithmetic mean of five repeated trials to reduce transient effects at the beginning of function execution (Lilja, 2000).
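A minimal version of the timing procedure described above might look like the sketch below, which times a solver callback over five trials, reports the arithmetic mean, and normalizes it to a baseline. The callback type and default trial count are illustrative assumptions.

    #include <chrono>
    #include <functional>

    // Arithmetic mean execution time (in seconds) of 'solver' over 'trials' runs,
    // following the methodology of averaging repeated trials to damp transients.
    double mean_execution_time(const std::function<void()>& solver, int trials = 5) {
        double total = 0.0;
        for (int t = 0; t < trials; ++t) {
            const auto start = std::chrono::steady_clock::now();
            solver();
            const auto stop = std::chrono::steady_clock::now();
            total += std::chrono::duration<double>(stop - start).count();
        }
        return total / trials;
    }

    // Execution time normalized to a baseline solver (values > 1 are slower than the
    // baseline), as used to compare the candidate tridiagonal solvers in Figure 7.4.
    double normalized_time(const std::function<void()>& candidate,
                           const std::function<void()>& baseline) {
        return mean_execution_time(candidate) / mean_execution_time(baseline);
    }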

The results of this test indicate that, for the typical system dimensions used in this model, CUDPP provides at minimum a factor of 10 improvement over the other available GPGPU tridiagonal solvers. Due to the homogeneity of the model and the ability of these tridiagonal solvers to handle multiple right hand sides, the solution vectors in Equations 7.15 and 7.16 can be solved for all rock nodes concurrently. This allows the GPGPU implementation of this model to execute the tridiagonal solver just two times per time step: once for the rock system, with multiple solution vectors and right hand sides, and once for the fluid system, with a single solution vector and right hand side. This segmentation is shown in Figure 7.5.

The finite difference GPU implementation used for this case study uses CUSH to reduce programmatic complexity. We use two metrics to measure programmatic complexity: non-commented source code statements (NCSS) and McCabe's cyclomatic complexity (MCC) (McCabe, 1976). These metrics originate in software engineering and have proven to be effective indicators of program complexity (Vander Wiel et al., 1998).

Using the fast finite difference GPU code for thermal transport in karst conduits we reproduce the response of Tyson Spring Cave (TSC), a physical karst conduit system in southeast Minnesota, to a storm event on 8 June, 2009. The measured thermal input and output records of the storm event were recorded using four in-cave Schlumberger CTD-divers. See Covington et al. (2011) for further details of the TSC field site. The Non-Linear Conjugate Gradient (NLCG) geophysical inversion method (Section 7.2.2 and Shewchuk (1994)) is used to intelligently guide the optimization of model parameters. The initial values and bounds for the model parameters are shown in Table 7.3.

          Initial   Lower   Upper
    Pe      2000     1900    2050
    St      10       0.75    20
    Θ       4.5      0.1     6

Table 7.3: Initial values and lower and upper bounds for the dimensionless parameters used in the karst conduit heat transport model.


Figure 7.5: A) A simple schematic showing the discretized conduit/rock system geometry. White nodes are fluid nodes, light grey nodes are surface nodes, and dark grey nodes are internal rock nodes. The grey and hash marked backgrounds correspond to rock and fluid, respectively. Reproduced from Figure 7.3. B) The mapping of vectors to the discretized system in (A). C) The vectors from (B) are organized into the two systems solved in each time step on the GPU; the rock system with N right hand sides and the fluid system with a single right hand side. The hierarchical organization of threads on GPUs can naturally index and execute this data organization. See text for further details.
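The per-timestep structure implied by Figure 7.5 is sketched below: the rock system (all radial columns batched as multiple right hand sides) is solved first, then the single fluid system. The solver type here is a serial CPU stand-in with an assumed interface; it only mimics the shape of the batched GPGPU tridiagonal solve used in the actual implementation.

    #include <cstddef>
    #include <vector>

    // Serial stand-in for the batched GPGPU tridiagonal solver used by the case study:
    // each right hand side is solved independently with the Thomas algorithm and
    // overwritten with its solution. The GPU implementation solves them concurrently.
    struct BatchedTridiagonalSolver {
        void solve(const std::vector<double>& sub, const std::vector<double>& diag,
                   const std::vector<double>& super,
                   std::vector<std::vector<double>>& rhs) const {
            for (auto& d : rhs) {
                std::vector<double> b = diag;
                const std::size_t n = b.size();
                for (std::size_t i = 1; i < n; ++i) {        // forward elimination
                    const double m = sub[i] / b[i - 1];
                    b[i] -= m * super[i - 1];
                    d[i] -= m * d[i - 1];
                }
                d[n - 1] /= b[n - 1];                         // back substitution
                for (std::size_t i = n - 1; i-- > 0;)
                    d[i] = (d[i] - super[i] * d[i + 1]) / b[i];
            }
        }
    };

    // One model time step: solve the rock conduction systems (one right hand side per
    // conduit node, Figure 7.5C left), then the single water advection-dispersion
    // system (Figure 7.5C right). Coefficient vectors are assumed to be pre-assembled.
    void advance_one_step(const BatchedTridiagonalSolver& solver,
                          const std::vector<double>& rock_sub,
                          const std::vector<double>& rock_diag,
                          const std::vector<double>& rock_super,
                          std::vector<std::vector<double>>& rock_rhs,
                          const std::vector<double>& water_sub,
                          const std::vector<double>& water_diag,
                          const std::vector<double>& water_super,
                          std::vector<double>& water_rhs) {
        solver.solve(rock_sub, rock_diag, rock_super, rock_rhs);    // rock first
        std::vector<std::vector<double>> one = {water_rhs};
        solver.solve(water_sub, water_diag, water_super, one);      // then water
        water_rhs = one.front();
    }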

7.3.3 Non-Negative Least Squares

The unconstrained least squares problem is frequently encountered, and its methods are applied in many fields. It is used by statisticians as a regression method (Lilja, 2000) and by scientists and engineers who develop models which describe measured data with minimal error. The NNLS problem is a constrained least squares problem that is encountered in many science and engineering applications, such as astrophysical studies of black holes (Cretton and Emsellem, 2004), investigating the use of nuclear magnetic resonance to determine pore structures (Gallegos and Smith, 1988), formulating petrologic mixing models (Albarede and Provost, 1977; Reid et al., 1973), and addressing seismologic inversion problems (Yagi, 2004). The NNLS problem can be stated as

\min_{x} \| Ax - b \|_2, \quad x \ge 0,   (7.19)

where A is the matrix representing the system of equations, b is the solution vector of measured data, and x is the vector of parameters obtained to best fit the solution vector and minimize the 2-norm of the solution residual.

The TNT algorithm (Frahm et al., In Review) is a new step forward in solving non-negative least squares problems. The TNT algorithm is an active set method that is capable of solving large non-negative least squares problems much faster than prior methods. TNT improves upon existing methods through intelligent construction and modification of the active set and through solving strategies for the central least squares problem. Due to the convexity (Boyd and Vandenberghe, 2004) of the least squares objective function, the TNT algorithm can take an "algorithmic license" to guess which variables compose the active set. The suitability of the variables is determined by ranking them by their gradients. Those variables with the largest positive gradients are tested by moving them into the unconstrained set first. This allows the active set to be modified by a large number of variables in a single iteration. In contrast, other common active set methods typically modify the active set by a single variable per iteration (Lawson and Hanson, 1995; Bro and De Jong, 1997). The ability to modify the active set by many variables in a single iteration allows the TNT algorithm to reduce the total number of iterations necessary for convergence.
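The sketch below illustrates the gradient-ranking idea in isolation: it computes w = A^T (b - A x), the quantity conventionally used to rank constrained variables in active set NNLS methods, and selects the constrained variables with the largest positive entries as candidates to release. This is only an illustration of the ranking step under assumed data structures; it is not the TNT algorithm itself.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Dense matrix in row-major order (illustrative representation).
    struct DenseMatrix {
        std::size_t rows, cols;
        std::vector<double> a;  // rows * cols entries
        double operator()(std::size_t i, std::size_t j) const { return a[i * cols + j]; }
    };

    // w = A^T (b - A x): a positive w[j] for a variable held at its zero bound
    // indicates that releasing x[j] would reduce the least squares residual.
    std::vector<double> ranking_gradient(const DenseMatrix& A,
                                         const std::vector<double>& b,
                                         const std::vector<double>& x) {
        std::vector<double> r(A.rows, 0.0), w(A.cols, 0.0);
        for (std::size_t i = 0; i < A.rows; ++i) {
            double Ax_i = 0.0;
            for (std::size_t j = 0; j < A.cols; ++j) Ax_i += A(i, j) * x[j];
            r[i] = b[i] - Ax_i;
        }
        for (std::size_t j = 0; j < A.cols; ++j)
            for (std::size_t i = 0; i < A.rows; ++i)
                w[j] += A(i, j) * r[i];
        return w;
    }

    // Rank the currently constrained (active) variables and return up to
    // 'max_release' indices with the largest positive entries of w; releasing a whole
    // group at once is what lets an active set change by many variables per iteration.
    std::vector<std::size_t> candidates_to_release(const std::vector<double>& w,
                                                   const std::vector<bool>& is_active,
                                                   std::size_t max_release) {
        std::vector<std::size_t> idx;
        for (std::size_t j = 0; j < w.size(); ++j)
            if (is_active[j] && w[j] > 0.0) idx.push_back(j);
        std::sort(idx.begin(), idx.end(),
                  [&w](std::size_t a, std::size_t b) { return w[a] > w[b]; });
        if (idx.size() > max_release) idx.resize(max_release);
        return idx;
    }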

To solve the core least squares problem, the TNT algorithm uses the Cholesky decomposition of the normal equations as a preconditioner to a left-preconditioned Conjugate Gradient Normal Residual (CGNR) solver (Saad, 2003). The normal equations are used only as a preconditioner for CGNR, so the typical issues with numerical error associated with the condition number, κ, are avoided. The number of iterations in an unpreconditioned CGNR would be proportional to √κ (Jennings and Malik, 1978; Paige and Saunders, 1975). However, the use of the Cholesky decomposition of the normal equations as a preconditioner would, in the absence of roundoff error, allow CGNR to terminate successfully in one iteration.

In the TNT algorithm, two unitless scaling constants govern the number of variables that are added to and removed from the active set. The reduction constant controls how quickly modifications to the active set become restricted to a single variable per iteration. The expansion constant opposes the reduction constant by controlling how many variables are added to the active set per iteration.

Frahm et al. determined that large reduction constants are preferable for difficult problems with a high condition number, where the gradient can be an unreliable prediction of quality (Frahm et al., In Review). Frahm et al. also concluded that a high expansion constant, allowing the active set to be modified by a large number of variables per iteration, is preferable for less complex problems with a low condition number, where the gradient provides a good prediction of variables requiring constraints.

The TNT implementation used here is written in Matlab and exported using the Matlab deployment tool to a C++ shared library that can be linked with and executed within standard C++ code.

By coupling this library with the simulated annealing method to perform run-time autotuning, we determine optimal scaling constants for a 1000 × 1000 system of equations with 10% active constraints and three different condition numbers, 10^2, 10^5, and 10^9. These test matrices are generated using the Singular Value Composition (SVC) algorithm (Myre et al., In Review). We also report a statistical sample of execution times and speedup for these small system tests. Speedup is the ratio of the baseline time over the tested time (t_base/t_test).

We also use the run-time autotuner to determine the optimal scaling constants to use for solving the inversion of a synthetic magnetic dipole (Weiss et al., 2007). The vertical component of the magnetic field (B_z) produced by this synthetic dipole is

shown in Figure 7.6. The system of equations describing the dipole is 2500 × 2500 with a condition number of 128. The initial values and bounds for the reduction and expansion constants are given in Table 7.4.

Figure 7.6: The vertical component of the field produced by a synthetic dipole.

          Initial   Lower   Upper
    rc      0.2      0.05    0.95
    ec      1.2      1.05    2.0

Table 7.4: Initial values and lower and upper bounds for the reduction (rc) and expansion (ec) constants.
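For completeness, the snippet below shows how the continuous reduction and expansion constants from Table 7.4 might be represented and perturbed within their bounds inside an annealing-style autotuner; the names and step size are illustrative assumptions, not the actual tuner code.

    #include <algorithm>
    #include <random>

    // Floating point algorithm parameters tuned for TNT: the reduction constant (rc)
    // and expansion constant (ec), with the bounds listed in Table 7.4.
    struct TntConstants {
        double rc = 0.2;   // initial value; bounded to [0.05, 0.95]
        double ec = 1.2;   // initial value; bounded to [1.05, 2.0]
    };

    // Propose a nearby candidate by adding a small continuous step to one constant and
    // clamping to its bounds, mirroring the integer moves used for machine parameters.
    TntConstants perturb(TntConstants c, std::mt19937& rng, double step = 0.05) {
        std::uniform_real_distribution<double> jitter(-step, step);
        if (std::bernoulli_distribution(0.5)(rng))
            c.rc = std::clamp(c.rc + jitter(rng), 0.05, 0.95);
        else
            c.ec = std::clamp(c.ec + jitter(rng), 1.05, 2.0);
        return c;
    }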

7.4 Discussion of Case Study Results

7.4.1 Lattice-Boltzmann Method

The use of CUSH improves both measures of programmatic complexity, MCC and NCSS (Figure 7.7). For these results, an improvement is a reduction from the baseline; that is, the MCC here is 15% less than the MCC of the baseline implementation.

Figure 7.7: The percent improvement in MCC and NCSS obtained by using CUSH to implement the lattice-Boltzmann case study.

Using the simulated annealing geophysical inversion method as a run-time autotuning method allowed the best performing run-time configuration to be found. The optimal run-time configuration for the strong scaling tests using the lattice-Boltzmann method is shown in Figure 7.8.

Figure 7.8: The best performing run-time configuration determined by the run-time autotuner for the lattice-Boltzmann case study.

The optimal run-time configuration uses a linear configuration of 4 GPUs in the Z-dimension. This is the configuration that minimizes communication among GPUs and allows the GPUs used for computation to be contained within a single PCI-E expansion chassis. Restricting the GPUs involved in computation to those within a single PCI-E expansion chassis has the effect of also restricting communication to the expansion chassis. Configurations requiring more than four GPUs necessitate data transfer back through the host PCI-E bus to reach the other expansion chassis. Likewise, the selection of four GPUs in an inline configuration along the Z-axis restricts sub-lattice interfaces to periodic boundary conditions. This limits the required data transfers for each sub-lattice to two inter-GPU copies, which are necessary to communicate sub-lattice interfaces every timestep. Inline configurations along other axes would require two inter-GPU copies and two intra-GPU copies.

7.4.2 Karst Conduit Flow

For this case study, the use of CUSH yields negligible improvement over a standard CUDA implementation: no improvement in MCC and a 4% reduction in NCSS. This result is reasonable, as CUSH provides the most benefit for codes with a significant amount of data transfer overhead, and the finite difference code developed for this case study has minimal data transfer overhead. It is only required to perform two data transfers between the host and device: the input data buffer and the computed output data buffer.

By coupling the flow and transport code with the Non-Linear Conjugate Gradient (NLCG) geophysical inversion method, the physical parameters of a portion of Tyson Spring Cave (TSC) could be characterized. Previous characterization efforts settled on model parameters that produced an output with a Root Mean Square (RMS) error of 0.053°C. By using the NLCG method to characterize the TSC conduit system, the RMS error of the model was reduced to 0.0075°C (Figure 7.9). The RMS error of the optimal parameter set obtained by the NLCG inversion method is smaller than the resolution of the Schlumberger CTD-diver data loggers (0.01°C) used to record the event.

Figure 7.9: The best fit model solution of Tyson Spring Cave (TSC), determined by the non-linear conjugate gradient inversion, together with the measured input and output curves. The RMS error between the measured cave output temperature and the modeled output temperature from the best fitting inverse solution is 0.0075°C, below the 0.01°C resolution of the data logger.

Prior efforts to characterize karst conduits could require more than 30 minutes of execution time per forward model simulation. In this paper we are able to characterize the karst conduit within Tyson Spring Cave in less than 20 minutes, which required more than 500 forward model simulations. This result represents the first high-performance physical inversion of a karst conduit system. With the tools presented in this paper, additional karst conduits and conduit network systems can be modeled and characterized, ultimately enhancing the predictive capabilities of scientists.

7.4.3 Non-Negative Least Squares – TNT

The TNT algorithm has already set a new standard for non-negative least squares solver performance. The TNT algorithm has been shown to outperform other fast active set strategies (Bro and De Jong, 1997) by more than 150×. The ability to tune the reduction and expansion constants at run-time provides scientists and engineers with a tool to tailor the TNT algorithm to their specific problem and unlock additional performance from the TNT solver.

Small Systems

The run-time autotuner obtained optimal reduction and expansion constants for all of the TNT algorithm tests (Figure 7.10). These tests also revealed that the relationship between the condition number (κ) and those constants is not fully understood, as these results only partially agree with the conclusions made by Frahm et al. (In Review). Frahm et al. state that large expansion constants are desirable for less complex problems, where lower complexity is indicated by a lower condition number. The results from the tests of these small systems agree with this assertion. Expanding on this, these results indicate that large expansion constants are also preferable for more complex problems with a large condition number. The second conclusion made by Frahm et al. is that large reduction constants are preferable for problems with a high condition number, where the gradient can be an unreliable predictor of performance. These results instead show an inverse relationship, where the optimal reduction constant becomes smaller as the condition number grows larger. The results shown in Figure 7.10A indicate that the reduction constant has a negative correlation with the condition number of the problem and the expansion constant has a positive correlation with the condition number. This relationship must be explored for a wider range of problem sizes, active constraints, and condition numbers to be fully understood.

Figure 7.10: Results for small system tests where the condition number varies across κ = 10^2, κ = 10^5, and κ = 10^9. (A) The optimal reduction and expansion constants determined for each test case. (B) Execution times for TNT small system tests. The minimum time is indicated by a circle, the arithmetic mean time is indicated by an X, and the error bars show one standard deviation. If one standard deviation less than the mean is a negative value, the error bar is truncated at zero to preserve physical meaning. Both vertical axes are execution time in seconds. (C) Speedups for TNT small system tests compared to the arithmetic mean and maximum measured values.

The best configuration obtained by the run-time autotuner provides a significant boost to performance (Figures 7.10B and 7.10C). The performance obtained using the best run-time configuration can be expected to provide a speedup of 4.3× over the arithmetic mean of each test. Compared to the worst case (maximum) execution time of the lowest performing configuration, the best run-time configuration provides a 30× speedup. This large speedup is due to the larger variance exhibited in tests of more complex (high κ) systems.

Synthetic Dipole

The field produced using the inverse solution of the synthetic dipole is in good agreement with that of the known synthetic solution (Figure 7.11).

Figure 7.11: The B_z component of the magnetic field produced by the forward model using the inverse dipole solution.

The optimal reduction and expansion constants found by the run-time autotuner for solving the 2500 × 2500 inversion problem for the synthetic dipole are 0.82 and 1.77, respectively. These constants are in agreement with the results shown in Figure 7.10 for systems with a condition number of 10^2; the condition number of the dipole system (κ = 128) is on the same order. The reduction and expansion constants found by the run-time autotuner are reasonable considering the high number of active constraints and the structure of the solution. The high expansion constant is necessary as the condition number is low, so the gradient provides a good prediction of variables requiring constraints. The high reduction constant is necessitated by the sparsity of the single dipole solution. The optimal reduction and expansion constants found by the run-time autotuner improve the execution time performance of the inversion of the dipole by more than 4× compared to the worst case and by 1.76× compared to the mean execution time of the sampled combinations of the reduction and expansion constants.

7.5 Conclusions and Future Work

In this paper we present an environment to enable scientists and engineers to fully utilize heterogeneous systems. This modular environment is composed of three aspects: 1) CUSH, a framework for reducing the complexity of programming heterogeneous systems; 2) geophysical inversion methods, routines that determine well fitting models of physical systems by optimizing the model parameters; and 3) run-time autotuning, applying geophysical inversion routines to the computer system to optimize the parameters that compose the run-time configuration of a code. The modularity of this environment allows scientists and engineers to use only those aspects that are necessary for their research.

We demonstrate the utility of this environment through three case studies: 1) a multi-GPU lattice-Boltzmann fluid flow simulation, 2) using the TNT algorithm, a fast non-negative least squares solver, to solve small systems with constraints and to perform a geophysical inversion of a magnetic dipole, and 3) using a fast karst conduit flow model to perform a geophysical inversion to characterize a physical karst conduit.

For each of these case studies we show that this environment can provide means for enhancement. The programmatic complexity of the lattice-Boltzmann case study is reduced through CUSH. The performance of the lattice-Boltzmann case study is enhanced by the run-time autotuner: through run-time autotuning, the configuration that minimizes inter-device communication is found, thus maximizing performance, as the lattice-Boltzmann method is bandwidth limited. The run-time autotuner is also used to determine well-performing constants for controlling the TNT non-negative least squares algorithm. It is found that varying these constants can influence the performance of the TNT algorithm by up to a factor of 80. Likewise, the performance of a geophysical inversion routine using the TNT non-negative least squares algorithm to determine the magnetic sources of a dipole is enhanced through run-time autotuning. Finally, this environment is used to characterize a karst conduit. A fast karst conduit flow model implemented using CUSH is used to simulate each forward model, and the non-linear conjugate gradient geophysical inversion method is used to minimize the misfit of the model.

Future work will involve examining additional extensions to the aspects of this environment. Extending CUSH to include advanced memory operations could reduce the programmatic complexity of distributed systems. Including more geophysical inversion routines will allow scientists and engineers to investigate and characterize a wider range of physical systems. Further investigation of run-time autotuning methods would provide additional means of performance optimization, as well as additional integer and mixed programming routines for optimization purposes.

Chapter 8

Conclusion and Discussion

This thesis presents a modular environment for programming and geophysical inverse modeling of heterogeneous computing systems. The three aspects comprising this environment are: 1) CUSH, a framework for reducing the programmatic complexity of heterogeneous computing systems; 2) multiple implementations of geophysical inversion methods to help scientists and engineers explore large parameter spaces and obtain optimal models for physical data; and 3) a run-time autotuner using applied geophysical inverse theory to deduce optimal run-time configurations for maximizing the performance of scientific and engineering codes.

It is shown that for the three case studies tested, CUSH consistently reduces the NCSS and MCC metrics for measuring program complexity for multi-GPU codes. Thus, CUSH consistently provides scientists and engineers with a simpler programming framework for heterogeneous computing systems. It is also demonstrated with 95% confidence that CUSH has no statistically meaningful impact on the performance of the tested case studies.

For all of the tests used, the run-time autotuner performs well, capable of producing the optimal run-time configuration for each test. These run-time configurations predominantly describe the optimal devices, within supplied constraints, of the heterogeneous computing system to use for a specific computational kernel. It is shown that the run-time autotuner is capable of determining the optimal settings, within constraints, of additional parameters such as number of threads and type of algorithm.

Three case studies are used to demonstrate the utility of this environment for scientific and engineering problems: 1) a lattice-Boltzmann method for fluid flow, 2) a high-performance finite difference code for flow and transport in karst conduits, and

3) the TNT algorithm, a fast method for solving non-negative least squares problems.

The lattice-Boltzmann method for fluid flow used as a case study (Chapter 3) can utilize multiple memory access patterns, and the factors governing its performance have been statistically analyzed and characterized (Bailey et al., 2009; Myre et al., 2011; Walsh et al., 2009b).

The second case study, a high-performance finite difference method for karst conduit flow (Chapter 4), uses GPUs to accelerate a novel dimensionless form of the mathematical model of the physical system (Covington et al., 2010; Myre et al., In Prep.a).

The TNT algorithm, an active set method for solving large non-negative least squares (NNLS) problems (Frahm et al., In Review), is the final case study. In Chapter 5, a Matlab implementation of the TNT algorithm is shown to be capable of outperforming Matlab implementations of prior fast active set methods (Bro and De Jong, 1997) by more than 180×.

Using the environment developed in this thesis to enhance the tested case studies on heterogeneous computing systems is shown to benefit all three of the case studies. By using the CUSH aspect of the environment, the programmatic complexity of the lattice-Boltzmann case study is reduced compared to a standard CUDA implementation. The run-time autotuning tool is shown to determine an optimal configuration of GPUs to perform a simulation of Poiseuille flow. The configuration found is the one that minimizes inter-GPU communication.

The finite difference code for karst conduit flow and transport also benefits from the usage of CUSH, exhibiting reduced programmatic complexity compared to a standard CUDA implementation. The non-linear conjugate gradient (NLCG) method, from the second aspect of this environment, geophysical inversion methods, is also used with the finite difference flow code. This allows the finite difference code to be used to execute a high performance characterization of a physical karst conduit (Tyson Spring Cave). This characterization yields the physical parameters that minimize the model error with respect to the measured signals from Tyson Spring Cave.

Although the TNT NNLS algorithm is implemented in Matlab, it is still shown to benefit from this environment. The run-time autotuner is used to enhance the performance of the non-negative least squares case study using the TNT algorithm by tuning the reduction and expansion constants that control the convergence of the TNT algorithm for multiple NNLS problems. The run-time autotuner is also combined with

a non-negative least squares geophysical inversion method to perform a geophysical inversion to determine the magnetic sources of a dipole from the measured field. The TNT enhanced geophysical inversion method is capable of accurately determining the magnetic sources in the dipole problem. The run-time autotuner determined run-time parameters that improved the performance of the TNT enhanced geophysical inversion by more than 4× compared to the worst case scenario.

These case studies demonstrate the utility of the environment for scientists and engineers. The modular design of the aspects that comprise this environment means that the aspects are provided for scientists and engineers, not imposed on them. Using a modular design for this environment allows its users to quickly apply and benefit from the aspects of this environment that are appropriate for a given problem, while not needing to be concerned with the aspects that are inappropriate.

Chapter 9

Complete Bibliography

Abelson, H., Sussman, G. J., 1983. Structure and interpretation of computer programs.

Accelerys, 2012. Jacket Home Page. URL http://www.accelereyes.com/

Agullo, E., Augonnet, C., Dongarra, J., Faverge, M., Ltaief, H., Thibault, S., Tomov, S., may 2011. Qr factorization on a multicore node enhanced with multiple gpu accelerators. In: Parallel Distributed Processing Symposium (IPDPS), 2011 IEEE International. pp. 932 –943.

Aharonov, E., Rothman, D. H., 1993. Non-newtonian flow (through porous media): a lattice Boltzmann method. Geophys. Res. Lett. 20 (8), 679–682.

Aidun, C. K., Clausen, J. R., 2010. Lattice-Boltzmann method for complex flows. Annual Review of Fluid Mechanics 42 (1), 439–472.

Albarede, F., Provost, A., 1977. Petrological and geochemical mass-balance equations: an algorithm for least-square fitting and general error analysis. Computers & Geosciences 3 (2), 309–326. URL http://www.sciencedirect.com/science/article/pii/0098300477900073

AMD, 2006. ATI CTM Guide: Technical Reference Manual. AMD, 1st Edition, accessed: August 20, 2009. URL http://ati.amd.com/companyinfo/researcher/documents/ATI_CTM_Guide.pdf

Amdahl, G. M., 1967. Validity of the single processor approach to achieving large scale computing capabilities. In: AFIPS ’67 (Spring): Proceedings of the April 18-20, 1967, spring joint computer conference. ACM, New York, NY, USA, pp. 483–485.

Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Sorensen, D., 1999. LAPACK Users' Guide, 3rd Edition. Society for Industrial and Applied Mathematics, Philadelphia, PA.

Anderson, J. A., Lorenz, C. D., Travesset, A., 2008. General purpose molecular dynamics simulations fully implemented on graphics processing units. Journal of Computational Physics 227 (10), 5342–5359. URL http://www.sciencedirect.com/science/article/pii/S0021999108000818

Asanovic, K., Bodik, R., Catanzaro, B. C., Gebis, J. J., Husbands, P., Keutzer, K., Patterson, D. A., Plishker, W. L., Shalf, J., Williams, S. W., Yelick, K. A., December 2006. The landscape of parallel computing research: a view from Berkeley. Tech. Rep. UCB/EECS-2006-183, Electrical Engineering and Computer Sciences, University of California at Berkeley. URL http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf

Aster, R. C., Borchers, B., Thurber, C. H., 2012. Parameter estimation and inverse problems. Academic Press.

Bailey, P., Myre, J., Walsh, S. D. C., Lilja, D. J., Saar, M. O., 2009. Accelerating lattice Boltzmann fluid flow simulations using graphics processors. In: ICPP ’09: Proceedings of the 2009 International Conference on Parallel Processing, Vienna, Austria. IEEE Computer Society, Washington, DC, USA, pp. 550–557.

Bame, P., 2011. pmccabe. URL http://parisc-linux.org/~bame/pmccabe/

Barker, K. J., Davis, K., Hoisie, A., Kerbyson, D. J., Lang, M., Pakin, S., Sancho, J. C., 2008. Entering the petaflop era: the architecture and performance of Roadrunner. In: Proceedings of the 2008 ACM/IEEE conference on Supercomputing. IEEE Press, Piscataway, NJ, USA, pp. 1–11.

Bell, N., Garland, M., 2010. Cusp: Generic parallel algorithms for sparse matrix and graph computations. Version 0.2.0. URL http://cusp-library.googlecode.com

174 Bhatnagar, P. L., Gross, E. P., Krook, M., May 1954. A model for collision processes in gases. I. small amplitude processes in charged and neutral one-component systems. Phys. Rev. 94 (3), 511–525.

Birk, S., 2002. Characterisation of karst systems by simulating aquifer genesis and spring responses: Model development and application to gypsum karst. Ph.D. thesis, Eberhard-Karls-Universität Tübingen.

Birk, S., Liedl, R., Sauter, M., 2006. Karst spring responses examined by process- based modeling. Ground Water 44 (6), 832–836.

Björck, Å., 1996. Numerical Methods for Least Squares Problems. SIAM, Philadelphia.

Black, F., Scholes, M., 1973. The pricing of options and corporate liabilities. The journal of political economy, 637–654.

Boltzmann, L., 1872. Weitere Studien über das Wärmegleichgewicht unter Gasmolekülen [Further studies on the heat equilibrium of gas molecules]. Wiener Berichte 66, 275–370.

Bosl, W. J., Dvorkin, J., Nur, A., 1998. A Study of Porosity and Permeability Using a Lattice Boltzmann Simulation. Geophysical Research Letters 25, 1475–1478.

Boyd, S., Vandenberghe, L., 2004. Convex optimization. Cambridge university press.

Briant, A. J., Wagner, A. J., Yeomans, J. M., Mar 2004. Lattice Boltzmann simulations of contact line motion. I. Liquid-gas systems. Phys. Rev. E 69 (3), 031602.

Bro, R., De Jong, S., 1997. A fast non-negativity-constrained least squares algorithm. Journal of Chemometrics 11 (5), 393–401.

Bro, R., Shure, L., 1996. Reference fnnls Matlab implementation. URL http://www.ub.edu/mcr/als/fnnls.m

Buck, I., Foley, T., Horn, D., Sugerman, J., Fatahalian, K., Houston, M., Hanrahan, P., 2004. Brook for GPUs: stream computing on graphics hardware. In: SIGGRAPH '04: ACM SIGGRAPH 2004 Papers. ACM, New York, NY, USA, pp. 777–786.

Buck, I., Foley, T., Horn, D., Sugerman, J., Hanrahan, P., 2003. Brook for GPUs. [Online]. URL http://graphics.stanford.edu/projects/brookgpu/AF-Brook.ppt

Chang, L.-W., Stratton, J. A., Kim, H.-S., Hwu, W.-M. W., 2012. A scalable, numerically stable, high-performance tridiagonal solver using GPUs. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. IEEE Computer Society Press, p. 27.

Chen, D., Plemmons, R. J., 2009. Nonnegativity constraints in numerical analysis. In: Symposium on the Birth of Numerical Analysis. pp. 109–140.

Chen, Y., Cui, X., Mei, H., 2010. Large-scale fft on gpu clusters. In: Proceedings of the 24th ACM International Conference on Supercomputing. ICS ’10. ACM, New York, NY, USA, pp. 315–324. URL http://doi.acm.org/10.1145/1810085.1810128

Cheng, S., Higham, N., Kenney, C., Laub, A., 2001. Approximating the logarithm of a matrix to specified accuracy. SIAM Journal on Matrix Analysis and Applications 22 (4), 1112–1125.

Cooley, J., Tukey, J., 1965. An algorithm for the machine calculation of complex fourier series. Mathematics of Computation 19 (90), 297–301.

Covington, M. D., Luhmann, A., Wicks, C., Saar, M., 2012. Process length scales and longitudinal damping in karst conduits. Journal of Geophysical Research: Earth Surface (2003–2012) 117 (F1).

Covington, M. D., Luhmann, A. J., Gabrovšek, F., Saar, M. O., Wicks, C. M., Oct 2011. Mechanisms of heat exchange between water and rock in karst conduits. Water Resour. Res. 47 (10), W10514. URL http://dx.doi.org/10.1029/2011WR010683

Covington, M. D., Myre, J., Luhmann, A. J., Wicks, C., Saar, M., 2010. Comparison of observed and modeled storm responses in a Minnesota cave stream: Connections between geometry and response. In: Geological Society of America Abstracts with Programs. Vol. 42. p. 77.

Cretton, N., Emsellem, E., 2004. On the reliability of the black hole mass and mass-to-light ratio determinations with Schwarzschild models. Monthly Notices of the Royal Astronomical Society 347 (2), L31–L35. URL http://dx.doi.org/10.1111/j.1365-2966.2004.07374.x

Davies, P. I., Higham, N. J., 2000. Numerically stable generation of correlation matrices and their factors. BIT Numerical Mathematics 40, 640–651, 10.1023/A:1022384216930. URL http://dx.doi.org/10.1023/A:1022384216930

Dawson, S. P., Chen, S., Doolen, G. D., 1993. Lattice Boltzmann computations for reaction-diffusion equations. The Journal of Chemical Physics 98 (2), 1514–1523.

Domas, S., Tisseur, F., 1997. Parallel implementation of a symmetric eigensolver based on the yau and lu method. In: Palma, J., Dongarra, J. (Eds.), Vector and Parallel Processing VECPAR’96. Vol. 1215 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, pp. 140–153. URL http://dx.doi.org/10.1007/3-540-62828-2_117

Dreybrodt, W., Sep 1990. The Role of Dissolution Kinetics in the Development of Karst Aquifers in Limestone: A Model Simulation of Karst Evolution. Journal of Geology 98, 639–655.

Erway, J., Marcia, R., Tyson, J., 2010. Generalized diagonal pivoting methods for tridiagonal systems without interchanges. IAENG International Journal of Applied Mathematics 40.

Fairchild, I. J., Bradby, L., Sharp, M., Tison, J.-L., 1994. Hydrochemistry of carbonate terrains in alpine glacial settings. Earth Surface Processes and Landforms 19 (1), 33–54.

Fan, Z., Qiu, F., Kaufman, A., Yoakum-Stover, S., 2004. GPU cluster for high performance computing. In: SC '04: Proceedings of the 2004 ACM/IEEE conference on Supercomputing. IEEE Computer Society, Washington, DC, USA, p. 47.

Ferréol, B., Rothman, D., 1995. Lattice-Boltzmann simulations of flow through Fontainebleau sandstone. Transport Porous Med. 20 (1-2), 3–20.

Ford, D., Williams, P., 2007. Karst Hydrogeology And Geomorphology. John Wiley & Sons.

Fournier, A., Taylor, M. A., Tribbia, J. J., 2004. The Spectral Element Atmosphere Model (SEAM): High-Resolution Parallel Computation and Localized Resolution of Regional Dynamics. Mon Weather Rev 132 (3), 726–748.

Frahm, E., Myre, J., Lilja, D., Saar, M., In Review. TNT: A new active set method for solving large non-negative least squares problems. SIAM Journal on Scientific Computing, Special Section on Planet Earth and Big Data.

Gabrovšek, F., Dreybrodt, W., 2001. A model of the early evolution of karst aquifers in limestone in the dimensions of length and depth. Journal of Hydrology 240 (3-4), 206–224. URL http://www.sciencedirect.com/science/article/pii/S0022169400003231

Galántai, A., 2001. A study of Auchmuty's error estimate. Computers & Mathematics with Applications 42 (8-9), 1093–1102. URL http://www.sciencedirect.com/science/article/pii/S0898122101002243

Gallegos, D. P., Smith, D. M., 1988. A NMR technique for the analysis of pore structure: Determination of continuous pore size distributions. Journal of Colloid and Interface Science 122 (1), 143–153. URL http://www.sciencedirect.com/science/article/pii/0021979788902974

Gerya, T. V., Yuen, D. A., 2003. Rayleigh-Taylor instabilities from hydration and melting propel 'cold plumes' at subduction zones. Earth and Planetary Science Letters 212 (1-2), 47–62. URL http://www.sciencedirect.com/science/article/B6V61-48XD703-8/2/3bdebf548557c48c59a081886bf245f1

Golub, G. H., Van Loan, C. F., oct 1996. Matrix Computations (Johns Hopkins Studies in Mathematical Sciences)(3rd Edition), 3rd Edition. The Johns Hopkins University Press.

Goto, K., Geijn, R. A., 2008. Anatomy of high-performance matrix multiplication. ACM Transactions on Mathematical Software (TOMS) 34 (3), 12.

Goto, K., Van De Geijn, R., 2008. High-performance implementation of the level-3 blas. ACM Transactions on Mathematical Software (TOMS) 35 (1), 4.

Govindaraju, N. K., Lloyd, B., Dotsenko, Y., Smith, B., Manferdelli, J., 2008. High performance discrete Fourier transforms on graphics processors. In: Proceedings of the 2008 ACM/IEEE conference on Supercomputing. SC '08. IEEE Press, Piscataway, NJ, USA, pp. 2:1–2:12. URL http://dl.acm.org/citation.cfm?id=1413370.1413373

Granville, V., Krivanek, M., Rasson, J.-P., 1994. Simulated annealing: a proof of convergence. Pattern Analysis and Machine Intelligence, IEEE Transactions on 16 (6), 652–656.

Gu, L., Siegel, J., Li, X., 2011. Using gpus to compute large out-of-card ffts. In: Proceedings of the international conference on Supercomputing. ICS ’11. ACM, New York, NY, USA, pp. 255–264. URL http://doi.acm.org/10.1145/1995896.1995937

Gurnell, A., Brown, G., Tranter, M., 1994. Sampling strategy to describe the temporal hydrochemical characteristics of an alpine proglacial stream. Hydrological processes 8 (1), 1–25.

Habich, J., June 2008. Performance evaluation of numeric compute kernels on NVIDIA GPUs. Master's thesis, University of Erlangen-Nürnberg.

Han, T., Abdelrahman, T., jan. 2011. hicuda: High-level gpgpu programming. Parallel and Distributed Systems, IEEE Transactions on 22 (1), 78 –90.

Haskell, K. H., Hanson, R. J., 1981. An algorithm for linear least squares problems with equality and nonnegativity constraints. Mathematical Programming 21, 98– 118, 10.1007/BF01584232. URL http://dx.doi.org/10.1007/BF01584232

He, X., Doolen, G. D., 2002. Thermodynamic foundations of kinetic theory and lattice Boltzmann models for multiphase flows. Journal of Statistical Physics 107 (1), 309– 328.

Heath, M., 1984. Numerical methods for large sparse linear least squares problems. SIAM J. Sci. Stat. Comput. 5, 497–513.

Hennessy, J., Patterson, D., 2007. Computer Architecture: A Quantitative Approach, 4th edition. Morgan Kauffman, San Francisco.

Higham, N. J., 2008. The Matrix Computation Toolbox. http://www.ma.man.ac. uk/~higham/mctoolbox.

IBM, 2007. Software Development Kit for Multicore Acceleration Version 3.0: Programming Tutorial. International Business Machines Corporation, Sony Computer Entertainment Incorporated, Toshiba Corporation.

Incropera, F. P., Dewitt, D. P., Bergman, T. L., Lavine, A. S., 2007. Fundamentals of Heat and Mass Transfer. John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, Ch. 5.

Jaquet, O., Siegel, P., Klubertanz, G., Benabderrhamane, H., 2004. Stochastic discrete model of karstic networks. Advances in Water Resources 27 (7), 751–760.

Jellinek, A. M., Manga, M., Saar, M. O., 2004. Did melting glaciers cause volcanic eruptions in eastern California? Probing the mechanics of dike formation. J. Geophys. Res.-Sol. Ea. 109 (B9), B09206.

Jennings, A., Malik, G., 1978. The solution of sparse linear equations by the conjugate gradient method. International Journal for Numerical Methods in Engineering 12 (1), 141–158.

Jones, E., Oliphant, T., Peterson, P., et al., 2001–. SciPy: Open source scientific tools for Python. URL http://www.scipy.org/

Kandhai, D., Vidal, D. J.-E., Hoekstra, A. G., Hoefsloot, H., Iedema, P., Sloot, P. M. A., Nov 1999. Lattice-Boltzmann and Finite Element Simulations of Fluid Flow in a SMRX Static Mixer Reactor. International Journal for Numerical Methods in Fluids 31, 1019–1033.

Khronos OpenCL Working Group, 2009. The OpenCL specification version 1.0. Tech. rep., http://www.khronos.org/registry/cl/specs/opencl-1.0.33.pdf.

KhronosGroup, 2011. OpenCL Specification. Khronos Group, 1st Edition. URL http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf

Kim, D., Sra, S., Dhillon, I. S., 2010. Tackling box-constrained optimization via a new projected quasi-newton approach. SIAM Journal on Scientific Computing 32 (6), 3548–3563.

Kirkpatrick, S., Gelatt, C. D., Vecchi, M. P., 1983. Optimization by simulated annealing. Science 220 (4598), 671–680.

Klöckner, A., Pinto, N., Lee, Y., Catanzaro, B., Ivanov, P., Fasih, A., Mar 2011. PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation. ArXiv e-prints. URL http://arxiv.org/abs/0911.3456

Kolda, T., Lewis, R., Torczon, V., 2003. Optimization by direct search: New perspectives on some classical and modern methods. SIAM Review 45 (3), 385–482. URL http://epubs.siam.org/doi/abs/10.1137/S003614450242889

Krafczyk, M., Tölke, J., 2007. Towards real-time prediction of tsunami impact effects on nearshore infrastructure.

Kumar, V., Grama, A., Gupta, A., Karypis, G., 1994. Introduction to parallel computing. Vol. 110. Benjamin/Cummings, Redwood City.

Land, A. H., Doig, A. G., 1960. An automatic method of solving discrete programming problems. Econometrica: Journal of the Econometric Society, 497–520.

Lascu, I., Feinberg, J. M., 2011. Speleothem magnetism. Quaternary Science Reviews 30 (23), 3306–3320.

Lawlor, O., 2009. Message passing for GPGPU clusters: cudaMPI. In: Cluster Computing and Workshops, 2009. CLUSTER '09. IEEE International Conference on. pp. 1–8.

Lawson, C. L., Hanson, R. J., 1995. Solving least squares problems, 2nd Edition. SIAM (Philadelphia).

Lee, V. W., Kim, C., Chhugani, J., Deisher, M., Kim, D., Nguyen, A. D., Satish, N., Smelyanskiy, M., Chennupaty, S., Hammarlund, P., et al., 2010. Debunking the 100x GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU. In: ACM SIGARCH Computer Architecture News. Vol. 38. ACM, pp. 451–460.

Li, W., Fan, Z., Wei, X., Kaufman, A., 2005. GPU Gems 2. Addison-Wesley, Ch. Flow Simulation with Complex Boundaries, pp. 747–764.

Li, W., Wei, X., Kaufman, A., 2003. Implementing lattice Boltzmann computation on graphics hardware. The Visual Computer 19, 444–456.

Li, X., Lilja, D. J., Nov 2009. A highly parallel GPU-based hash accelerator for a data deduplication system. IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS).

Liedl, R., Sauter, M., Hückinghaus, D., Clemens, T., Teutsch, G., Mar 2003. Simulation of the development of karst aquifers using a coupled continuum pipe flow model. Water Resour. Res. 39 (3), 1057. URL http://dx.doi.org/10.1029/2001WR001206

Lilja, D., 2000. Measuring Computer Performance. Cambridge University Press, New York.

Long, A. J., Gilcrease, P. C., 2009. A one-dimensional heat-transport model for conduit flow in karst aquifers. Journal of Hydrology 378 (3-4), 230–239. URL http://www.sciencedirect.com/science/article/pii/S0022169409005873

Luhmann, A., July 2011. Water temperature as a tracer in karst aquifers. Ph.D. thesis, University of Minnesota.

Luo, Y., Duraiswami, R., oct 2011. Efficient parallel nonnegative least squares on multicore architectures. SIAM J. Sci. Comput. 33 (5), 2848–2863. URL http://dx.doi.org/10.1137/100799083

Mardia, K., Jupp, P., 2009. Directional statistics. Vol. 494. Wiley.

MathWorks, 2011. Matlab 7.12.0.635 r2011a.

McCabe, T. J., 1976. A complexity measure. In: Proceedings of the 2nd international conference on Software engineering. ICSE ’76. IEEE Computer Society Press, Los Alamitos, CA, USA, pp. 407–. URL http://dl.acm.org/citation.cfm?id=800253.807712

McCool, M., Du Toit, S., Popa, T., Chan, B., Moule, K., 2004. Shader algebra. ACM Trans. Graph. 23 (3), 787–795.

McCool, M. D., Qin, Z., Popa, T. S., 2002. Shader metaprogramming. In: HWWS ’02: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware. Eurographics Association, Aire-la-Ville, Switzerland, Switzerland, pp. 57–68.

McKinnon, K., 1998. Convergence of the Nelder–Mead simplex method to a nonstationary point. SIAM Journal on Optimization 9 (1), 148–158. URL http://epubs.siam.org/doi/abs/10.1137/S1052623496303482

Miyata, T., Yamamoto, Y., Zhang, S.-L., 2009. A fully pipelined multishift QR algorithm for parallel solution of symmetric tridiagonal eigenproblems. IPSJ Online Transactions 2, 1–14.

Mullen, K. M., van Stokkum, I. H., 2012. The Lawson-Hanson algorithm for non-negative least squares (NNLS). Tech. rep., CRAN. URL http://cran.r-project.org/web/packages/nnls/nnls.pdf

Myre, J., Covington, M., Luhmann, A., Saar, M., In Prep.a. A GPGPU accelerated implementation of thermal and solute transport models for enhanced characterization of karst conduits. Environmental Modelling and Systems.

Myre, J., Covington, M., Luhmann, A., Saar, M., In Prep.b. Inverse modeling of a karst conduit using temperature and electrical conductivity signals. Hydrology and Earth System Sciences (HESS).

Myre, J., Frahm, E., Lilja, D., Saar, M., In Review. SVC – Singular Value Composition: A fast method for generating dense matrices with specific condition numbers. ACM Transactions on Mathematical Software (TOMS).

Myre, J., Lilja, D., Saar, M., In Prep.c. A modular environment for geophysical and machine inversion on heterogeneous computing systems.

Myre, J., Lilja, D., Saar, M., In Prep.d. Run-time autotuning for performance using applied geophysical inverse theory.

Myre, J., Lilja, D., Saar, M., In review. CUSH - a framework for reducing programmatic complexity for GPUs on single-host GPU clusters. Parallel Computing.

Myre, J., Walsh, S. D. C., Lilja, D., Saar, M. O., 2011. Performance analysis of single- phase, multiphase, and multicomponent lattice-Boltzmann fluid flow simulations on gpu clusters. Concurrency and Computation: Practice and Experience 23 (4), 332– 350. URL http://dx.doi.org/10.1002/cpe.1645

NCSA, 2009. NCSA Intel 64 Tesla Cluster user documentation. URL http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/Intel64TeslaCluster/Doc/

Nelder, J. A., Mead, R., 1965. A simplex method for function minimization. The Computer Journal 7 (4), 308–313. URL http://comjnl.oxfordjournals.org/content/7/4/308.abstract

Nguyen, H., Revol, N., et al., 2011. Solving and certifying the solution of a linear system. Reliable Computing 15 (2), 120–131.

NVIDIA, 2002. Cg Toolkit User’s Manual. NVIDIA, 1st Edition, accessed: August 20, 2009. URL http://developer.nvidia.com/object/cg_toolkit_1_1.html

NVIDIA, 2008a. CUDA Programming Guide. nVIDIA, 1st Edition. URL http://developer.download.nvidia.com/compute/cuda/1_1/NVIDIA_ CUDA_Programming_Guide_1.1.pdf

NVIDIA, 2008b. CUDA Programming Guide. NVIDIA, 1st Edition, [Online]. URL http://developer.download.nvidia.com/compute/cuda/1_1/NVIDIA_ CUDA_Programming_Guide_1.1.pdf

NVIDIA, 2009a. CUDA Programming Guide. NVIDIA, 2nd Edition.

NVIDIA, 2009b. NVIDIA Compute - PTX: Parallel Thread Execution. NVIDIA, 1st Edition. URL www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf

NVIDIA, 2011a. CUBLAS Library Users Guide. NVIDIA, 4th Edition.

NVIDIA, 2011b. CUDA C Best Practices Guide. NVIDIA, 4th Edition. URL http://developer.download.nvidia.com/compute/cuda/4_0_rc2/toolkit/docs/CUDA_C_Best_Practices_Guide.pdf

NVIDIA, 2012. CUDA Programming Guide. nVIDIA, 4th Edition. URL http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_C_Programming_Guide.pdf

NVIDIA, 2013. CUSPARSE Library Users Guide. NVIDIA, 4th Edition.

Nyland, L., Prins, J., 2007. Fast n-body simulation with cuda. Simulation 3, 677–696. URL http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.156.7082&rep=rep1&type=pdf

OpenACC, 2011. The OpenACC Specification Version 1.0. URL http://www.openacc-standard.org/Downloads/OpenACC.1.0.pdf? attredirects=0&d=1

Owens, J. D., Luebke, D., Govindaraju, N., Harris, M., Krüger, J., Lefohn, A. E., Purcell, T. J., 2007. A survey of general-purpose computation on graphics hardware. Computer Graphics Forum 26 (1), 80–113.

Paige, C., Saunders, M., 1975. Solution of sparse indefinite systems of linear equations. SIAM Journal on Numerical Analysis 12 (4), 617–629.

Pan, C., Hilpert, M., Miller, C. T., 2004. Lattice-Boltzmann simulation of two-phase flow in porous media. Water Resources Research 40, W01501.

Pardo-Igúzquiza, E., Dowd, P. A., Xu, C., Durán-Valsero, J. J., 2012. Stochastic simulation of karst conduit networks. Advances in Water Resources 35 (0), 141–150. URL http://www.sciencedirect.com/science/article/pii/S0309170811001813

Parker, R. L., 1994. Geophysical Inverse Theory. Princeton University Press.

Peeper, C., Mitchell, J. L., 2003. Introduction to the DirectX® 9 high level shading language. ShaderX2: Introduction and Tutorials with DirectX 9.

Pohl, T., Deserno, F., Thurey, N., Rude, U., Lammers, P., Wellein, G., Zeiser, T., 2004. Performance evaluation of parallel large-scale lattice Boltzmann applications on three supercomputing architectures. In: SC ’04: Proceedings of the 2004 ACM/IEEE conference on Supercomputing. IEEE Computer Society, Washington, DC, USA, p. 21.

Pohl, T., Kowarschik, M., Wilke, J., Igleberger, K., Rude, U., 2003. Optimization and profiling of the cache performance of parallel lattice Boltzmann codes. Parallel Processing Letters 13, 549–560.

Polizzi, E., Sameh, A. H., 2006. A parallel hybrid banded system solver: the spike algorithm. Parallel computing 32 (2), 177–194.

Press, W. H., Teukolsky, S. A., Vetterling, W. T., Flannery, B. P., 1992. Numerical recipes in C (2nd ed.): the art of scientific computing. Cambridge University Press, New York, NY, USA.

Qian, Y. H., D’Humières, D., Lallemand, P., 1992. Lattice BGK Models for Navier-Stokes Equation. EPL (Europhysics Letters) 17 (6), 479–484.

Qiu, F., Fan, Z., Zhao, Y., Lorenz, H., Zhou, J., Kaufman, A., 2006. GPU-based visual simulation of dispersion in urban environments.

Raabe, D., November 2004. Overview of the lattice Boltzmann method for nano- and microscale fluid dynamics in materials science and engineering. Modelling and Simulation in Materials Science and Engineering 12, R13–R46(1). URL http://www.ingentaconnect.com/content/iop/msmse/2004/00000012/00000006/art00r01

Rapidmind, 2007. Rapidmind datasheet. [Online]. URL http://www.rapidmind.net/pdfs/RapidmindDatasheet.pdf

Rehrl, C., Birk, S., Klimchouk, A. B., Nov 2008. Conduit evolution in deep-seated settings: Conceptual and numerical models based on field observations. Water Resour. Res. 44 (11), W11425.

Reid, M., Gancarz, A., Albee, A., 1973. Constrained least-squares analysis of petrologic problems with an application to lunar sample 12040. Earth and Planetary Science Letters 17 (2), 433–445. URL http://www.sciencedirect.com/science/article/pii/0012821X73902124

Rünger, G., Schwind, M., 2005. Comparison of different parallel modified Gram-Schmidt algorithms. In: Cunha, J., Medeiros, P. (Eds.), Euro-Par 2005 Parallel Processing. Vol. 3648 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, pp. 622–622. URL http://dx.doi.org/10.1007/11549468_90

Ryoo, S., Rodrigues, C. I., Stone, S. S., Baghsorkhi, S. S., Ueng, S.-Z., Stratton, J. A., Hwu, W.-m. W., 2008a. Program optimization space pruning for a multithreaded GPU. In: CGO ’08: Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization. ACM, New York, NY, USA, pp. 195–204.

Ryoo, S., Rodrigues, C. I., Stone, S. S., Stratton, J. A., Ueng, S.-Z., Baghsorkhi, S. S., Hwu, W.-m. W., 2008b. Program optimization carving for GPU computing. J. Parallel Distrib. Comput. 68 (10), 1389–1401.

Saad, Y., 2003. Iterative methods for sparse linear systems, 2nd Edition. Society for Industrial Mathematics. URL http://www.cs.umn.edu/~saad/IterMethBook_2ndEd.pdf

Schaa, D., Kaeli, D., 2009. Exploring the multiple-gpu design space. In: Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing. IPDPS ’09. IEEE Computer Society, Washington, DC, USA, pp. 1–12. URL http://dx.doi.org/10.1109/IPDPS.2009.5161068

Schaap, M. G., Porter, M. L., Christensen, B. S. B., Wildenschild, D., 2007. Comparison of pressure-saturation characteristics derived from computed tomography and lattice Boltzmann simulations. Water Resources Research 43, W12S06.

Shan, X., Chen, H., Mar 1993. Lattice Boltzmann model for simulating flows with multiple phases and components. Phys. Rev. E 47 (3), 1815–1819.

Shewchuk, J. R., 1994. An introduction to the conjugate gradient method without the agonizing pain. Tech. rep., Carnegie Mellon University, Pittsburgh, PA, USA.

Singh, A., Balaji, P., Feng, W.-c., 2009. Gepsea: A general-purpose software acceleration framework for lightweight task offloading. In: Proceedings of the 2009 International Conference on Parallel Processing. ICPP ’09. IEEE Computer Society, Washington, DC, USA, pp. 261–268. URL http://dx.doi.org/10.1109/ICPP.2009.39

Spellman, P., Screaton, E., Martin, J. B., Gulley, J., Brown, A., dec 2011. Using MODFLOW with CFP to understand conduit-matrix exchange in a karst aquifer during flooding. In: AGU Fall Meeting Abstracts. p. G1404.

Strassen, V., 1969. Gaussian elimination is not optimal. Numerische Mathematik 13 (4), 354–356.

Stubbs, R. A., Mehrotra, S., 1999. A branch-and-cut method for 0-1 mixed convex programming. Mathematical Programming 86 (3), 515–532.

Succi, S., 2001. The Lattice Boltzmann Equation for Fluid Dynamics and Beyond. Oxford University Press.

Succi, S., Benzi, R., Higuera, F., 1991. The lattice Boltzmann equation: A new tool for computational fluid-dynamics. Physica D: Nonlinear Phenomena 47 (1-2), 219– 230.

Tölke, J., 2008. Implementation of a Lattice Boltzmann kernel using the Compute Unified Device Architecture developed by nVIDIA. Computing and Visualisation in Science, 11 pages.

Tölke, J., Krafczyk, M., Aug. 2008. Teraflop computing on a desktop PC with GPUs for 3D CFD. International Journal of Computational Fluid Dynamics 22, 443–456.

top500.org, 2011. Top500 List - November 2011. URL http://top500.org/list/2011/11/100

top500.org, 2012. Top500 Supercomputer List - November 2012. URL http://top500.org/list/2012/11

Tsuboi, S., Komatitsch, D., Tromp, J., 2005. Broadband modelling of global seismic wave propagation on the earth simulator using the spectral-element method. J. Seismol. Soc. Jpn. 57, 321–329.

Černý, V., 1985. Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm. Journal of Optimization Theory and Applications 45 (1), 41–51. URL http://dx.doi.org/10.1007/BF00940812

Van Benthem, M. H., Keenan, M. R., 2004. Fast algorithm for the solution of large- scale non-negativity-constrained least squares problems. Journal of Chemometrics 18 (10), 441–450. URL http://dx.doi.org/10.1002/cem.889

VanderWiel, S. P., Nathanson, D., Lilja, D. J., 1998. A comparative analysis of parallel programming language complexity and performance. Concurrency: Practice and Experience 10 (10), 807–820. URL http://dx.doi.org/10.1002/(SICI)1096-9128(19980825)10:10<807::AID-CPE376>3.0.CO;2-2

Vesper, D. J., Loop, C. M., White, W. B., 2001. Contaminant transport in karst aquifers. Theoretical and Applied Karstology 13 (14), 101–111.

Walsh, S., Saar, M., 2010. Macroscale lattice-Boltzmann methods for low Peclet number solute and heat transport in heterogeneous porous media. Water Resources Research 46 (7).

Walsh, S. D., Saar, M. O., 2011. Developing extensible lattice-Boltzmann simulators for general-purpose graphics-processing units. Tech. rep., Lawrence Livermore National Laboratory (LLNL), Livermore, CA.

Walsh, S. D., Saar, M. O., Bailey, P., Lilja, D. J., 2009a. Accelerating geoscience and engineering system simulations on graphics hardware. Computers & Geosciences 35 (12), 2353–2364. URL http://www.sciencedirect.com/science/article/pii/S0098300409001903

Walsh, S. D. C., Myre, J., Bailey, P., Saar, M. O., Lilja, D., 2009b. GPU implementation of multiphase lattice-Boltzmann simulations. In: National Center for Supercomputing Applications, Urbana, IL. Path to Petascale: Adapting GEO/CHEM/ASTRO Applications for Accelerators and Accelerator Clusters.

Weiss, B. P., Lima, E. A., Fong, L. E., Baudenbacher, F. J., 2007. Paleomagnetic analysis using SQUID microscopy. J. Geophys. Res. 112, B09105.

Wellein, G., Zeiser, T., Hager, G., Donath, S., 2006. On the single processor performance of simple lattice Boltzmann kernels. Computers and Fluids 35 (8–9), 910–919, proceedings of the First International Conference for Mesoscopic Methods in Engineering and Science. URL http://www.sciencedirect.com/science/article/B6V26-4HVDYJJ-4/2/21fc67683bc889045d128a80434bc31d

Xiong, Z., Kirsch, A., 1992. Three-dimensional earth conductivity inversion. Journal of Computational and Applied Mathematics 42 (1), 109–121. URL http://www.sciencedirect.com/science/article/pii/037704279290166U

Yagi, Y., Mar 2004. Source rupture process of the 2003 Tokachi-oki earthquake determined by joint inversion of teleseismic body wave and strong ground motion data. Earth, Planets, and Space 56, 311–316.

Yamagiwa, S., Sousa, L., June 30–July 4, 2009. CaravelaMPI: Message passing interface for parallel GPU-based applications. In: Parallel and Distributed Computing, 2009. ISPDC ’09. Eighth International Symposium on. pp. 161–168.

Yan, Y., Grossman, M., Sarkar, V., 2009. Jcuda: A programmer-friendly interface for accelerating java programs with cuda. In: Proceedings of the 15th International Euro-Par Conference on Parallel Processing. Euro-Par ’09. Springer-Verlag, Berlin, Heidelberg, pp. 887–899.

Zhang, Y., Cohen, J., Davidson, A. A., Owens, J. D., July 2011. A hybrid method for solving tridiagonal systems on the gpu. In: Hwu, W.-m. W. (Ed.), GPU Computing Gems. Morgan Kaufmann, p. 117.

Zhang, Y., Cohen, J., Owens, J. D., 2010. Fast tridiagonal solvers on the gpu. Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’10 45 (1), 127. URL http://portal.acm.org/citation.cfm?doid=1693453.1693472

Zhao, Y., Wang, L., Qiu, F., Kaufman, A., Mueller, K., 2006. Melting and flowing in multiphase environment. Computers and Graphics 30 (4), 519–528. URL http://www.sciencedirect.com/science/article/B6TYG-4JXY3T2-2/2/188091218f2d278899e2b42d6ab3f11c

Appendix A

CUSH Example Code Listings

Here, simplified C-like pseudocode listings for the three case studies are presented for both CUDA and CUSH. This appendix is reproduced from the following manuscript that has been submitted and is currently under review: J.M. Myre, D. Lilja, M.O. Saar, CUSH - An efficient programming tool for single host GPU clusters, Parallel Computing, In Review, 2013.

A.1 Matrix Multiplication

// allocate device memory
for (int i = 0; i < nGPU; i++) {
    cudaSetDevice(i);
    cudaStreamCreate(&stream[i]);

    // Allocate memory
    cudaMalloc((void**)&d_A[i], mem_size_A);
    cudaMalloc((void**)&d_B[i], mem_size_B);
    cudaMalloc((void**)&d_C[i], mem_size_C);

    cudaMallocHost((void**)&h_A[i], mem_size_A);
    cudaMallocHost((void**)&h_B[i], mem_size_B);
    cudaMallocHost((void**)&h_C[i], mem_size_C);
}

// copy host memory to device
for (int i = 0; i < nGPU; i++) {
    cudaSetDevice(i);

    // Copy input data from host to GPU
    cudaMemcpyAsync(d_A[i], h_A[i], mem_size_A, cudaMemcpyHostToDevice);
    cudaMemcpyAsync(d_B[i], h_B[i], mem_size_B, cudaMemcpyHostToDevice);
}

// Launch the kernels
for (int i = 0; i < nGPU; i++) {
    cudaSetDevice(i);

    cublasSgemm(handle[i], CUBLAS_OP_N, CUBLAS_OP_N,
                uiWB/2, uiHA/2, uiWA,
                &alpha, d_B[i], uiWB/2, d_A[i], uiWA,
                &beta,  d_C[i], uiWA);
}

for (int i = 0; i < nGPU; i++) {
    cudaSetDevice(i);

    // Read back GPU results
    cudaMemcpyAsync(h_C[i], d_C[i], mem_size_C, cudaMemcpyDeviceToHost);
}

Listing A.1: CUDA C++ code to set up and call a multi-GPU matrix multiplication kernel.

// Allocate cush memory plans
cush_plan *cush_A = malloc(sizeof(cush_plan));
cush_plan *cush_B = malloc(sizeof(cush_plan));
cush_plan *cush_C = malloc(sizeof(cush_plan));

int *dev_ids = malloc(sizeof(int)*GPU_N);
for (int i = 0; i < nGPU; i++) {
    dev_ids[i] = i;
}

cush_plan_create(cush_A, FLOAT, nGPU, dev_ids, mem_size_A, UNIF);
cush_plan_create(cush_B, FLOAT, nGPU, dev_ids, mem_size_B, UNIF);
cush_plan_create(cush_C, FLOAT, nGPU, dev_ids, mem_size_C, UNIF);

// broadcast h_Ac and h_Bc to devices
cush_memcpyHtoD(cush_A, h_Ac);
cush_memcpyHtoD(cush_B, h_Bc);

// launch the kernels
exec_mm_kernels(cush_A, cush_B, cush_C);

// Read back the GPU results
cush_memcpyDtoH(h_Cc, cush_C);

Listing A.2: CUSH C++ code to set up and call a multi-GPU matrix multiplication kernel.

A.2 The Lattice-Boltzmann Method

// Allocate host memory
float **f = malloc(numGPUs*sizeof(float*));
for (int i = 0; i < numGPUs; ++i) {
    f[i] = (float*) malloc(19*nXYZ*sizeof(float));
}
unsigned int **vox = malloc(numGPUs*sizeof(unsigned int*));
for (int i = 0; i < numGPUs; ++i) {
    vox[i] = malloc(nXYZ*sizeof(unsigned int));
}
float *f_faces_out = malloc(numGPUs*fFaceOffset*sizeof(float));
float *f_faces_in  = malloc(numGPUs*fFaceOffset*sizeof(float));
float *f_edges_in  = malloc(numGPUs*fEdgeOffset*sizeof(float));
float *f_edges_out = malloc(numGPUs*fEdgeOffset*sizeof(float));

#pragma omp parallel num_threads(numGPUs)
{
    int th_id = omp_get_thread_num();
    cudaSetDevice(th_id);

    // allocate memory on device
    cudaMalloc((void**)&f_D, 19*sizeFP);
    cudaMalloc((void**)&f_pack, maxFace*sizeof(float));

    // allocate memory for geometry on device
    unsigned int *vox_D;
    cudaMalloc((void**)&vox_D, sizeVox);

    // copy f values onto device
    cudaMemcpy(f_D, f[th_id], 19*sizeFP, cudaMemcpyHostToDevice);

    // copy voxel data onto device
    cudaMemcpy(vox_D, vox[th_id], sizeVox, cudaMemcpyHostToDevice);

    // Loop over timesteps
    while (timestep < numTimesteps) {
        for (unsigned int i = 0; i < outputFreq; ++i) {
            if (th_id == 0) ++timestep;

            doTimestep(f_D, f_faces_out, f_edges_out,
                       f_faces_in, f_edges_in, f_pack, vox_D,
                       th_id, thrdNbrs, i, collisionFreq,
                       U_DRHO, L_DRHO, dimBlock, dimGridA,
                       dimGridYLayer, dimGridZLayer, dimGridInterior);
        } // output frequency

        // Copy results of lb from device to host
        cudaMemcpy(f[th_id], f_D, 19*sizeFP, cudaMemcpyDeviceToHost);
    } // loop over timesteps
} // finish openMP

Listing A.3: CUDA C++ code to set up and call a multi-GPU lattice-Boltzmann method kernel.

// Allocate cush memory plans
cush_f           = malloc(sizeof(cush_plan));
cush_vox         = malloc(sizeof(cush_plan));
cush_f_faces_out = malloc(sizeof(cush_plan));
cush_f_faces_in  = malloc(sizeof(cush_plan));
cush_f_edges_out = malloc(sizeof(cush_plan));
cush_f_edges_in  = malloc(sizeof(cush_plan));

int *dev_ids = (int*) malloc(sizeof(int)*GPU_N);
for (int i = 0; i < nGPU; i++) {
    dev_ids[i] = i;
}

cush_plan_create(cush_f,           FLOAT, nGPU, dev_ids, mem_size_f,     UNIF);
cush_plan_create(cush_vox,         FLOAT, nGPU, dev_ids, mem_size_vox,   UNIF);
cush_plan_create(cush_f_faces_out, FLOAT, nGPU, dev_ids, mem_size_faces, UNIF);
cush_plan_create(cush_f_faces_in,  FLOAT, nGPU, dev_ids, mem_size_faces, UNIF);
cush_plan_create(cush_f_edges_out, FLOAT, nGPU, dev_ids, mem_size_edges, UNIF);
cush_plan_create(cush_f_edges_in,  FLOAT, nGPU, dev_ids, mem_size_edges, UNIF);

// broadcast f and vox to devices
cush_memcpyHtoD(cush_f, f);
cush_memcpyHtoD(cush_vox, vox);

// Loop over timesteps
while (timestep < numTimesteps) {
    for (unsigned int i = 0; i < outputFreq; ++i) {
        ++timestep;

        exec_Timestep(cush_f, cush_f_faces_out, cush_f_edges_out,
                      cush_f_faces_in, cush_f_edges_in, cush_vox,
                      th_id, thrdNbrs, i, collisionFreq,
                      U_DRHO, L_DRHO, dimBlock, dimGridA,
                      dimGridYLayer, dimGridZLayer, dimGridInterior);
    } // output frequency

    // Copy results of lb from device to host
    cush_memcpyDtoH(h_f, cush_f);
} // loop over timesteps

Listing A.4: CUSH C++ code to set up and call a multi-GPU lattice-Boltzmann method kernel.

A.3 The Fast Fourier Transform

for (int i = 0; i < nGPU; i++) {
    cufftResult resa, resb;

    plans[i].dev_id = i;

    cudaSetDevice(i);

    cudaMalloc((void**)(&(plans[i].a_in)),  a_size);
    cudaMalloc((void**)(&(plans[i].a_out)), a_size);

    plans[i].size = a_elem;

    cudaStreamCreate(&(plans[i].stream_a));
    cudaStreamCreate(&(plans[i].stream_b));

    resa = cufftPlan1d(&(plans[i].plana), len_batch_a, CUFFT_C2C, num_batch_a);
    cufftSetStream(plans[i].plana, plans[i].stream_a);

    resb = cufftPlan1d(&(plans[i].planb), len_batch_b, CUFFT_C2C, num_batch_b);
    cufftSetStream(plans[i].planb, plans[i].stream_a);
}

for (int i = 0; i < nGPU; i++) {
    cudaSetDevice(plans[i].dev_id);

    cudaMemcpyAsync(plans[i].a_in, &(da_in[i*a_elem]), a_size,
                    cudaMemcpyHostToDevice, plans[i].stream_a);
}

// Round one
for (int i = 0; i < nGPU; i++) {
    cudaSetDevice(plans[i].dev_id);
    /* Transform the first signal out of place. */
    cufftExecC2C(plans[i].plana, plans[i].a_in, plans[i].a_out, CUFFT_FORWARD);
} // End round one

// Twiddle factor before transpose
int Ns = Ns_a;
int R  = R_a;
for (int i = 0; i < nGPU; i++) {
    cudaSetDevice(plans[i].dev_id);
    cu_mult_twiddle(plans[i].a_out, per_gpu_size, Ns, R,
                    len_batch_a, num_batch_a, len_batch_a,
                    nGPU*num_batch_a, i, plans[i].stream_a);
}

if (1 == nGPU) {
    cu_transpose(plans[0].a_in, plans[0].a_out,
                 len_batch_a, num_batch_a, plans[0].stream_a);
} else {
    // Copy off
    for (int i = 0; i < nGPU; i++) {
        cudaSetDevice(plans[i].dev_id);
        cudaMemcpyAsync(&(da_in[i*a_elem]), plans[i].a_out, a_size,
                        cudaMemcpyDeviceToHost, plans[i].stream_a);
    }

    /* Transpose output from round 1 to make the input for round 2 */
    k = 0;
    int num_bound = num_batch_b / nGPU;
    int len_bound = len_batch_b * nGPU;
    for (int i = 0; i < num_bound; i++) {
        for (int j = 0; j < len_bound; j++) {
            db_in[k]  = da_in[i + j*num_bound];
            idxs_a[k] = idxs_b[i + j*num_bound];
            k++;
        }
    }

    // Copy back
    for (int i = 0; i < nGPU; i++) {
        cudaSetDevice(plans[i].dev_id);
        cudaMemcpyAsync(plans[i].a_in, &(db_in[i*a_elem]), a_size,
                        cudaMemcpyHostToDevice, plans[i].stream_a);
    }
}

// ROUND TWO
for (int i = 0; i < nGPU; i++) {
    cudaSetDevice(plans[i].dev_id);
    cufftExecC2C(plans[i].planb, plans[i].a_in, plans[i].a_out, CUFFT_FORWARD);
}

// Copy off final solution
for (int i = 0; i < nGPU; i++) {
    cudaSetDevice(i);
    cudaMemcpyAsync(&(db_in[i*a_elem]), plans[i].a_out, a_size,
                    cudaMemcpyDeviceToHost, plans[i].stream_b);
}

Listing A.5: CUDA C++ code to set up and call a multi-GPU FFT.

cufftHandle plana[GPU_N];
cufftHandle planb[GPU_N];

cush_plan *sig_a_in  = malloc(sizeof(cush_plan));
cush_plan *sig_a_out = malloc(sizeof(cush_plan));

int *dev_ids = (int*) malloc(sizeof(int)*GPU_N);
for (int i = 0; i < GPU_N; i++) {
    dev_ids[i] = i;
}

cush_dbuff_create(sig_a_in,  CUFFT_COMPLEX, GPU_N, csigsize, UNIF, dev_ids);
cush_dbuff_create(sig_a_out, CUFFT_COMPLEX, GPU_N, csigsize, UNIF, dev_ids);

cudaStream_t streamsa[GPU_N];
cudaStream_t streamsb[GPU_N];
for (int i = 0; i < GPU_N; i++) {
    cudaSetDevice(i);
    cudaStreamCreate(&(streamsa[i]));
    cudaStreamCreate(&(streamsb[i]));

    cufftResult resa, resb;
    resa = cufftPlan1d(&(plana[i]), len_batch_a, CUFFT_C2C, num_batch_a);
    cufftSetStream(plana[i], streamsa[i]);

    resb = cufftPlan1d(&(planb[i]), len_batch_b, CUFFT_C2C, num_batch_b);
    cufftSetStream(planb[i], streamsb[i]);

    sig_a_in->stream[i]  = streamsa[i];
    sig_a_out->stream[i] = streamsb[i];
}

// ROUND ONE
cush_memcpyHtoD(da_in, sig_a_in);
/* Transform the first signal out of place. */
for (int i = 0; i < GPU_N; i++) {
    cudaSetDevice(sig_a_in->id[i]);
    cufftExecC2C(plana[i], sig_a_in->ptr[i], sig_a_out->ptr[i], CUFFT_FORWARD);
}

// Twiddle factor before transpose
for (int i = 0; i < GPU_N; i++) {
    cudaSetDevice(sig_a_out->id[i]);
    cu_mult_twiddle(sig_a_out->ptr[i], per_gpu_size, Ns, R,
                    len_batch_a, num_batch_a, len_batch_a,
                    GPU_N*num_batch_a, i, sig_a_out->stream[i]);
}

/* Transpose */
cush_tpose(sig_a_in->ptr[0], sig_a_out->ptr[0],
           len_batch_a, num_batch_a, sig_a_out->stream[0]);

// ROUND TWO
/* Transform the signal out of place. */
for (int i = 0; i < GPU_N; i++) {
    // Set device
    cudaSetDevice(sig_a_in->id[i]);
    cufftExecC2C(planb[i], sig_a_in->ptr[i], sig_a_out->ptr[i], CUFFT_FORWARD);
}

// Copy off final solution
cush_memcpyDtoH(db_in, sig_a_in);

Listing A.6: CUSH C++ code to set up and call a multi-GPU FFT.

Appendix B

Lattice-Boltzmann Implementation Details

This appendix is reproduced from two publications: J. Myre, S.D.C. Walsh, D. Lilja, M.O. Saar, Performance analysis of single-phase, multiphase, and multicomponent lattice-Boltzmann fluid flow simulations on GPU clusters, Concurrency and Computation: Practice and Experience 23 (2011) 332–350. P. Bailey, J. Myre, S.D.C. Walsh, M.O. Saar, D.J. Lilja, Accelerating Lattice Boltzmann Fluid Flow Simulations Using Graphics Processors, International Conference on Parallel Processing: Vienna, Austria (ICPP), 2009.

These reproductions are included with permission from John Wiley and Sons and IEEE, respectively.

B.1 Memory Access Pattern Descriptions

We have employed three different memory access patterns to control the propagation of fluid packets perpendicular to the thread-block axis, referred to in the text as the “Ping-Pong”, “Flip-Flop”, and “Lagrangian” patterns. In this appendix, we provide a detailed description of the differences between these patterns. To distinguish them, the following notation is introduced to describe the global memory location occupied by the fluid packets:

$$f_i(\mathbf{y}, t, j), \qquad (B.1)$$

where $f_i$ denotes a fluid packet moving in the direction $\mathbf{c}_i$ and the terms inside the brackets following $f_i$ denote an array location holding the fluid packet variable at time $t$, i.e., $f_i(\mathbf{y}, t, j)$ represents the global memory location at time $t$ associated with global memory indices $\mathbf{y}$ and $j$. The first index, $\mathbf{y}$, is a three-part vector representing a set of coordinates (similar to lattice coordinates) and the second, $j$, is a scalar index between 1 and 19 (similar to the lattice direction).

In the Ping-Pong memory access pattern, the global memory coordinates $\mathbf{y}$ equal the lattice coordinates $\mathbf{x}$ at which the fluid packet is located and the index $j$ is equal to the fluid packet direction $i$. Thus the collision rule given in Equation (3.1) is

$$f_i^b(\mathbf{x} + \mathbf{c}_i \Delta t,\; t + \Delta t,\; i) = (1 - \lambda)\, f_i^a(\mathbf{x}, t, i) + \lambda\, f_i^{eq}(\mathbf{x}), \qquad (B.2)$$

where $f_i^{eq}(\mathbf{x})$ represents the equilibrium fluid packet density at position $\mathbf{x}$, and $f_i^a$ and $f_i^b$ represent two separate global memory arrays. The collision rule for the alternate timestep is

$$f_i^a(\mathbf{x} + 2\mathbf{c}_i \Delta t,\; t + 2\Delta t,\; i) = (1 - \lambda)\, f_i^b(\mathbf{x} + \mathbf{c}_i \Delta t,\; t + \Delta t,\; i) + \lambda\, f_i^{eq}(\mathbf{x} + \mathbf{c}_i \Delta t), \qquad (B.3)$$

and so on. A single global memory array cannot be used for this rule, as fluid packets at time $t + \Delta t$ would overwrite those at time $t$. In another context, this might be avoided by temporarily storing the fluid packets in a local buffer; however, this solution does not work in the GPU programming environment, where the order of thread-block execution is not guaranteed. Instead, we adopt memory access patterns that overwrite the same location in global memory: namely, the Flip-Flop pattern and the Lagrangian pattern.
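To make the Ping-Pong indexing concrete, a minimal CUDA sketch of one even-step update is given below. The helper idx(), the lattice dimensions NX/NY/NZ, the direction arrays cx/cy/cz, and the placeholder equilibrium value feq are illustrative assumptions, not the names used in the dissertation's actual kernels, and boundary handling is reduced to a periodic wrap for brevity.

/* Minimal Ping-Pong sketch: read f_a at (x, t), write f_b at (x + c_i*dt, t + dt).
   idx(), NX/NY/NZ, cx/cy/cz, and feq are illustrative placeholders. */
__device__ int idx(int x, int y, int z, int i, int NX, int NY, int NZ)
{
    return ((i * NZ + z) * NY + y) * NX + x;   /* one possible packing */
}

__global__ void pingpong_step(const float *f_a, float *f_b, float lambda,
                              const int *cx, const int *cy, const int *cz,
                              int NX, int NY, int NZ)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z;
    if (x >= NX || y >= NY || z >= NZ) return;

    for (int i = 0; i < 19; ++i) {
        /* destination node x + c_i (periodic wrap for brevity) */
        int xd = (x + cx[i] + NX) % NX;
        int yd = (y + cy[i] + NY) % NY;
        int zd = (z + cz[i] + NZ) % NZ;

        float feq = 0.0f;   /* placeholder for the equilibrium value f_i^eq(x) */
        f_b[idx(xd, yd, zd, i, NX, NY, NZ)] =
            (1.0f - lambda) * f_a[idx(x, y, z, i, NX, NY, NZ)] + lambda * feq;
    }
}

Reading from f_a and writing to f_b, then exchanging the roles of the two arrays on the next step, is precisely what prevents packets at time t + Δt from overwriting packets still needed at time t.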

The Lagrangian memory access pattern is perhaps the simpler of the two, given by

$$f_i(\mathbf{x}_o, t + \Delta t, i) = (1 - \lambda)\, f_i(\mathbf{x}_o, t, i) + \lambda\, f_i^{eq}(\mathbf{x}_o + \mathbf{c}_i t), \qquad (B.4)$$

where $\mathbf{x}_o$ is the fluid packet location at time $t = 0$. In this pattern, fluid packets occupy the same positions in global memory while all collision and streaming steps are calculated. Before the result of the simulation is returned to the CPU, the fluid packets are rearranged back to a Eulerian configuration (i.e., where the position in global memory, $\mathbf{y}$, equals the simulated position of the fluid packet, $\mathbf{x}$). The cost of the rearrangement is less than that of a single streaming and collision step.

The Flip-Flop memory access pattern exploits the fact that the global memory index $j$ need not match the fluid packet direction $i$. As the name suggests, the pattern has two steps. The first step swaps the fluid packets $f_i$ and $f_{\tilde\imath}$ (where $\tilde\imath$ indicates the reverse direction, i.e., $\mathbf{c}_i = -\mathbf{c}_{\tilde\imath}$), but otherwise keeps the packets at the same global memory location,

$$f_i(\mathbf{x}, t + \Delta t, \tilde\imath) = (1 - \lambda)\, f_i(\mathbf{x}, t, i) + \lambda\, f_i^{eq}(\mathbf{x}). \qquad (B.5)$$

The second step restores the fluid packet directions and positions to the Eulerian configuration,

$$f_i(\mathbf{x} + 2\Delta t\, \mathbf{c}_i, t + 2\Delta t, i) = (1 - \lambda)\, f_i(\mathbf{x}, t + \Delta t, \tilde\imath) + \lambda\, f_i^{eq}(\mathbf{x} + \mathbf{c}_i \Delta t). \qquad (B.6)$$

If an odd number of timesteps is performed, an additional step rearranges the fluid packets into the correct configuration before they are returned to the CPU.
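For comparison, the Flip-Flop pattern only changes which memory slot is read and written; a sketch of that index mapping is shown below, reusing the placeholder idx() helper and direction arrays from the Ping-Pong sketch above. The reverse-direction table rev[] is also an illustrative assumption, and the pairing and ordering details of the real in-place kernels are omitted.

/* Flip-Flop index mapping sketch (placeholders as in the Ping-Pong sketch above).
   rev[i] gives the reverse direction ~i, with c[rev[i]] = -c[i]. */
__device__ void flipflop_indices(int step, int x, int y, int z, int i,
                                 const int *cx, const int *cy, const int *cz,
                                 const int *rev, int NX, int NY, int NZ,
                                 int *src, int *dst)
{
    if (step % 2 == 0) {
        /* Eq. (B.5): stay at node x, store the post-collision packet in slot ~i */
        *src = idx(x, y, z, i, NX, NY, NZ);
        *dst = idx(x, y, z, rev[i], NX, NY, NZ);
    } else {
        /* Eq. (B.6): read slot ~i at x, write slot i at x + 2*c_i*dt,
           restoring the Eulerian layout (periodic wrap for brevity) */
        *src = idx(x, y, z, rev[i], NX, NY, NZ);
        *dst = idx((x + 2*cx[i] + 2*NX) % NX, (y + 2*cy[i] + 2*NY) % NY,
                   (z + 2*cz[i] + 2*NZ) % NZ, i, NX, NY, NZ);
    }
}

On even steps the packet stays at its node but lands in the reverse-direction slot (Eq. B.5); on odd steps it is read from that slot and written two lattice steps away in its own slot, restoring the Eulerian layout (Eq. B.6).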

B.2 Multiple-GPU Pseudo-Code

This appendix outlines key steps in the multi-GPU implementations.

0. Include the openMP header in the .cu file: #include <omp.h>
1. Allocate memory on CPU for GPU-GPU transfer, initialize CPU-CPU pipelines (MPI only).
2. Create one thread per GPU: #pragma omp parallel num_threads(numGPUs)
3. Within the #pragma omp parallel num_threads(numGPUs) directive:
   3.1. Get the thread id: int th_id = omp_get_thread_num();
   3.2. Assign thread to GPU: cudaSetDevice(th_id);
   3.3. Copy initial conditions and problem geometry from CPU to GPU, e.g.,
        float* f_D;
        int sizeF = 19*numGPUNodes*sizeof(float);
        cudaMalloc((void**)&f_D, sizeF);
        cudaMemcpy(f_D, f[th_id], sizeF, cudaMemcpyHostToDevice);
   3.4. for(timestep = 0; timestep < maxTimesteps; timestep += outputFrequency):
        3.4.1. for(t = 0; t < outputFrequency; t++):
               Multiple phase/multiple component models only:
               i. Calculate φ for outer shell of nodes
               ii. Copy outer shell φ from GPU to CPU
                   Copy outer shell φ from CPU to CPU (MPI only)
                   Copy neighboring φ from CPU to GPU
               iii. Calculate inner φ values (simultaneous with ii) on the GPU.
               All models:

iv. Perform fi streaming/collision step for outer node shell

v. Copy outgoing outer shell fi from GPU to CPU

Copy outgoing outer shell fi from CPU to CPU (MPI only)

Copy incoming outer shell fi from CPU to GPU

vi. Perform inner fi streaming/collision (simultaneous with v) on the GPU.

3.4.3. For the Lagrangian memory access pattern:1

1 or Flip-Flop pattern if output frequency is odd

Rearrange fi to Eulerian configuration (location in memory = spatial position).

3.4.4. Copy fi data from GPU to CPU and save result to file, e.g.,
       cudaMemcpy(f[th_id], f_D, sizeF, cudaMemcpyDeviceToHost);
3.5. Free GPU memory, e.g., cudaFree(f_D);
4. Free CPU memory.
End.
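A minimal compilable skeleton of the thread-per-GPU structure in steps 2–3.5 is sketched below. The buffer names mirror the examples above, while the timestep loop and kernel launches of step 3.4 are elided; this is an illustration of the outline, not the dissertation's production code.

#include <omp.h>
#include <cuda_runtime.h>

/* Skeleton only: one host thread drives each GPU (illustrative sizes/names). */
void run_on_all_gpus(float **f, int numGPUs, int numGPUNodes)
{
    #pragma omp parallel num_threads(numGPUs)
    {
        int th_id = omp_get_thread_num();          /* step 3.1 */
        cudaSetDevice(th_id);                      /* step 3.2 */

        float *f_D;
        int sizeF = 19 * numGPUNodes * sizeof(float);
        cudaMalloc((void **)&f_D, sizeF);          /* step 3.3 */
        cudaMemcpy(f_D, f[th_id], sizeF, cudaMemcpyHostToDevice);

        /* ... timestep loop, boundary exchange, and kernel launches (step 3.4) ... */

        cudaMemcpy(f[th_id], f_D, sizeF, cudaMemcpyDeviceToHost);  /* step 3.4.4 */
        cudaFree(f_D);                             /* step 3.5 */
    }
}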

Appendix C

Karst Conduit Flow Implementation Details

This appendix is reproduced from the following publication currently in preparation: J.M. Myre, M.D. Covington, A. Luhmann, M.O. Saar, A fast GPGPU driven model of thermal transport in karst conduits, to be submitted to Environmental Modelling and Software.

C.1 GPGPU Heat Transport Model Pseudocode

This appendix outlines key steps for the implementation of the computational model in Section 4.2.2.

0. Input: the number of fluid nodes (n_nodes), the number of rock nodes perpendicular to the channel (n_rnodes), the number of timesteps, pointers to dts and velocities for each time step, pointers to temperature input and output buffers, and the dimensionless quantities Pe, St, and Θ.

1. Allocate memory on GPU for rock and fluid system tridiagonal vectors, an “old” and “new” rock solution matrix, an “old” and “new” fluid solution vector, and a conduit output solution vector.

2. Set up CUDPP GPU solver plans.

3. Assign initial conditions to fluid and rock data buffers.

4. for each t in timesteps:
   4.1. Set up the coefficient matrix (A) for the tridiagonal formulation of Equation 4.20 on the GPU.

4.2. Set up the coefficient matrix (A) for the tridiagonal formulation of Equations 4.14 & 4.15 on the GPU.

4.3. Set up the rock RHS rock_solution_new (B of the tridiagonal formulation of Equations 4.14 & 4.15) on the GPU.

4.4. Solve the rock tridiagonal system (Equations 4.14 & 4.15) on the GPU.

4.5. Set up the fluid RHS, using fluid_solution_old (B of the tridiagonal formulation of Equation 4.20), on the GPU using the results from step 4.4.

211 4.6. Solve the fluid tridiagonal system (Equation 4.20) on the GPU.

4.7. Copy the “new” rock and fluid solution buffers to the “old” solution buffers.

4.8. Copy the conduit output temperature (at index end-1) to the solution vector on the GPU (at index t).
   end for

5. Copy solution vector from GPU back to the host solution buffer (provided as input).

6. Destroy GPU tridiagonal solvers and shutdown CUDPP library.

7. Free GPU memory.

8. Return.
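Steps 4.4 and 4.6 each amount to solving a tridiagonal system Ax = b; in the GPU code this is done with the CUDPP solver plans from step 2. For reference, a serial Thomas-algorithm version of the same solve is sketched below. The array names are illustrative, and this CPU routine is only a stand-in for the GPU solver, not the implementation used in the model.

#include <stdlib.h>

/* Serial Thomas algorithm for a tridiagonal system (reference sketch only).
   a = sub-diagonal, b = diagonal, c = super-diagonal, d = right-hand side,
   x = solution; all arrays have length n, with a[0] and c[n-1] unused. */
void thomas_solve(const float *a, const float *b, const float *c,
                  const float *d, float *x, int n)
{
    float *cp = malloc(n * sizeof(float));
    float *dp = malloc(n * sizeof(float));

    /* forward elimination */
    cp[0] = c[0] / b[0];
    dp[0] = d[0] / b[0];
    for (int i = 1; i < n; i++) {
        float m = b[i] - a[i] * cp[i - 1];
        cp[i] = c[i] / m;
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m;
    }

    /* back substitution */
    x[n - 1] = dp[n - 1];
    for (int i = n - 2; i >= 0; i--)
        x[i] = dp[i] - cp[i] * x[i + 1];

    free(cp);
    free(dp);
}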

C.2 GPGPU Conductivity Transport Model Pseudocode

This appendix outlines key steps for the implementation of the computational model in Section 4.3.2.

0. Input: the number of fluid nodes (n_nodes), the number of timesteps, pointers to dts and velocities for each time step, pointers to temperature input and output buffers, and the dimensionless quantities Pe and Da.

1. Allocate memory on GPU for fluid system tridiagonal vectors, an “old” and “new” fluid solution vector, and a conduit output solution vector.

2. Set up the CUDPP GPU solver plan.

3. Assign initial conditions to fluid data buffers.

4. for each t in timesteps:
   4.1. Set up the coefficient matrix (A) for the tridiagonal formulation of Equation 4.27 on the GPU.
   4.2. Solve the fluid tridiagonal system (Equation 4.27) on the GPU where x is fluid_solution_new and b is fluid_solution_old.
   4.3. Copy fluid_solution_new to fluid_solution_old.
   4.4. Copy the conduit output temperature (at index end-1) to the solution vector on the GPU (at index t).
   end for

5. Copy solution vector from GPU back to the host solution buffer (provided as input).

6. Destroy the GPU tridiagonal solver and shutdown the CUDPP library.

7. Free GPU memory.

8. Return.
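The per-timestep bookkeeping of steps 4.2–4.4 is sketched below. The buffer names are illustrative and the CUDPP solver call itself is elided behind a comment, since only the copy-and-record pattern is being shown.

#include <cuda_runtime.h>

/* Sketch of the bookkeeping in steps 4.2-4.4 (illustrative names; the
   tridiagonal solve itself is handled by the CUDPP plan and is elided). */
void record_conduit_output(float *d_fluid_old, float *d_fluid_new,
                           float *d_output, int n_nodes, int n_timesteps)
{
    for (int t = 0; t < n_timesteps; t++) {
        /* steps 4.1-4.2: rebuild the coefficients and solve the tridiagonal
           system, writing the result into d_fluid_new (solver call elided) */

        /* step 4.3: copy fluid_solution_new into fluid_solution_old */
        cudaMemcpy(d_fluid_old, d_fluid_new, n_nodes * sizeof(float),
                   cudaMemcpyDeviceToDevice);

        /* step 4.4: record the conduit output value (index end-1) at position t */
        cudaMemcpy(d_output + t, d_fluid_new + (n_nodes - 1), sizeof(float),
                   cudaMemcpyDeviceToDevice);
    }
}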

Appendix D

SVC Implementation Details

This appendix provides an overview of the SVC algorithm and pseudocode for calculating its work factor in Sections D.1 and D.2, respectively. This appendix is from the following publication that has been submitted and is currently under review: J.M. Myre, E. Frahm, D. Lilja, M.O. Saar, SVC – Singular Value Composition: A fast method for generating dense matrices with specific condition numbers, ACM Transactions on Mathematical Software (TOMS), 2013.

D.1 SVC Pseudocode

This Section outlines key steps in the Singular Value Composition algorithm.

Listing 1:

The SVC function.

0. Input to this function are κ, m, n, mode, and sv.

1. Calculate the min(m, n) singular values according to the specified mode and randomly permute them. Modes:
   1: One large singular value.
   2: One small singular value.
   3: The singular values are distributed geometrically.
   4: The singular values are distributed arithmetically.
   5: The singular values are randomly selected from the uniform distribution [1, n].
   6: The vector of singular values is supplied by the user in sv.

2. Create the initial structure of the m × n matrix, M, filling in the singular values along the diagonal.

3. If m < n, transpose M and set a transposed flag. The number of rows and columns are still referred to as m and n, respectively.

4. Perform recursive random rotations on 2^k sub-matrices of M to completely mix the singular values within the square region. (See Listing 2.)

5. If m ≠ n
   5.1 Rotate each of the remaining rows with a random row from the square region and loop to repeat five times to create non-trivial linear combinations.
   5.2 Rotate each column with a random column and loop to repeat five times to create non-trivial linear combinations.
   End if

6. If the transposed flag is set, transpose M to recover the original dimensions.

7. Return M.

Listing 2:

This function performs the recursion that fills the n × n square matrix in the SVC algorithm. This step operates on the n × n square region of the m × n matrix, M, and sets each entry of the n × n square region to a non-trivial linear combination of the singular values.

0. Input to this function are M, n.

1. Perform all rotations in the largest 2^k region. See Listing 3 for details.

2. If n is not a power of two
   2.1 Recurse on M(2^k + 1 : n, 2^k + 1 : n), which is the remaining square region on the diagonal.

   2.2 Perform random rotations between all of the rows and columns in the submatrix returned from Step 2.1 and all of the rows and columns in the current largest 2^k region used in Step 1.
   else if n = 1
   2.3 Perform random rotations between the remaining 1 × 1 submatrix and all of the rows and columns in the current largest 2^k region used in Step 1 of the parent function.
   End if

3. Return the modified M.

Listing 3:

This function performs the rotations of the 2^k square regions from Step 1 in Listing 2. This step operates on 2^k × 2^k square regions along the diagonal of the m × n matrix, M.

0. Input to this function are M, n.

1. Set maxexp to floor(log2(n)).

// Loop over diagonal 2^i block size
2. for i in 0:(maxexp-1)

2.1. Set stride to 2^i and skip to 2^(i+1)

// Loop over the number of blocks
3. for j in 1:skip:2^maxexp − 1

// Loop over row/column pairs within blocks
4. for k in j:j+(stride-1)
   // Loop over row/column elements to perform rotations
   5.1. for q in 0:k-1
        Set rr1 to a random rotation matrix.
        Set rr2 to a random rotation matrix.
        Set row0 to k+q
        Set row1 to k+q+stride
        Set col0 to j
        Set col1 to j+skip-1

5.2. Rotate the rows of the current submatrix.
     M([row0 row1], [col0:col1]) = rr1 * M([row0 row1], [col0:col1])

5.3. Rotate the columns of the current submatrix by transposing the row and column indices.
     M([col0:col1], [row0 row1]) = M([col0:col1], [row0 row1]) * rr2
        end q-loop // row/col elements
     end k-loop // row/col pairs
   end j-loop // number of blocks
end i-loop // block size

6. Return the modified M.
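The elementary operation in Steps 5.2 and 5.3 is a random 2 × 2 rotation applied to a pair of rows (or columns) of M. A minimal C sketch of the row case is given below; the storage layout, function name, and angle source are illustrative assumptions rather than SVC's actual code.

#include <math.h>
#include <stdlib.h>

/* Apply a random 2x2 rotation to rows r0 and r1 of M over columns [c0, c1].
   M is stored row-major with n columns (sketch only). */
void rotate_rows(double *M, int n, int r0, int r1, int c0, int c1)
{
    double theta = 6.283185307179586 * ((double)rand() / RAND_MAX);
    double c = cos(theta), s = sin(theta);

    for (int j = c0; j <= c1; j++) {
        double a = M[r0 * n + j];
        double b = M[r1 * n + j];
        M[r0 * n + j] = c * a - s * b;   /* first row of the 2x2 rotation */
        M[r1 * n + j] = s * a + c * b;   /* second row of the 2x2 rotation */
    }
}

Because the 2 × 2 factor is orthogonal, each such update mixes matrix entries into non-trivial linear combinations without altering the singular values of M.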

D.2 Work Factor Calculation

This function calculates the work factor, W , for SVC to generate an m×n matrix. The functionality of this function was verified for square matrices of size 2×2 through 1024 × 1024. No differences were found between the output of this function and an SVC algorithm that was instrumented with operation counters.

0. Input to this function are m, n.

1. Set maxexp to floor(log2(n)).
   Set prev_size to 0.
   Set W to 0.

// Calculate work for rotations.
3. for i in 0:maxexp
   if 0 ≠ bitand(2^i, n)
      3.1 W = W + 4(2^i)(2^i − 1)
   end if
   end for

// Calculate work for combinations.
4. for i in 0:maxexp
   if 0 ≠ bitand(2^i, n)
      if 0 ≠ prev_size
         4.1 W = W + 4(2^i)(2^i + prev_size)
      end if
      4.2 Set prev_size to 2^i.
   end if
   end for

// Calculate work to combine the m − n rows outside
// of the n × n square region.
if m > n
   5. W = W + 10(m − n)n
end if

6. Return W.
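The work-factor calculation translates directly into code; a C version is sketched below. The function name is an illustrative choice, and 1L << i plays the role of 2^i in the bitand tests.

/* Work factor W for SVC on an m-by-n matrix; a direct transcription of the
   pseudocode above. */
long svc_work_factor(long m, long n)
{
    long maxexp = 0;                      /* floor(log2(n)) */
    while ((1L << (maxexp + 1)) <= n)
        maxexp++;

    long W = 0;
    long prev_size = 0;

    /* work for rotations */
    for (long i = 0; i <= maxexp; i++) {
        long p = 1L << i;
        if (n & p)
            W += 4 * p * (p - 1);
    }

    /* work for combinations */
    for (long i = 0; i <= maxexp; i++) {
        long p = 1L << i;
        if (n & p) {
            if (prev_size != 0)
                W += 4 * p * (p + prev_size);
            prev_size = p;
        }
    }

    /* work to combine the m - n rows outside the n-by-n square region */
    if (m > n)
        W += 10 * (m - n) * n;

    return W;
}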

Appendix E

TNT Implementation Details

This appendix is from the following publication that has been submitted and is currently under review: E. Frahm, J.M. Myre, D. Lilja, M.O. Saar, TNT: An active set method for solving large non-negative least squares problems, SIAM Journal on Computational Science Special Section on Planet Earth and Big Data, 2013.

E.1 TNT pseudocode

This appendix outlines key steps in this non-negative least squares algorithm using a preconditioned conjugate gradient least squares solver that is outlined in Appendix E.2.

0. Input to this function are A, b, lambda, rel_tol, AA, use_AA.

1. Compute the normal equations, AA = A^T A.

2.1. Compute ε = 10 * eps * norm(AA, 1). Add ε to the diagonal of AA.
2.2. Set the reduction constant (rc = 0.2) and the expansion constant (ec = 1.2).

3. Allocate the solution vector (x), the binding set, the free set, the insertion set, the residual vector, the gradient vector, and a solution score.

4. Set the score, solution vector, residual vector, free set, and binding set by computing a feasible solution using the least squares method as outlined in Appendix E.2.

5. Start the outer loop, looping until there are no feasible changes:
   5.1. Save this solution (including the score, x, the free set, the binding set, and the number of insertions). Set the max number of insertions to the product of ec and the current number of insertions.

5.2. Compute the gradient of the Normal Equations.

5.3. Construct the insertion set from the elements of the binding set where the gradient > 0.

5.4. If the number of insertions is zero then the best solution has been found. Return.

5.5. Sort the possible insertions by the gradients to order them by their attractiveness for insertion.

5.6. Start the inner loop, looping until the correct variables have been inserted.
   5.6.1. Adjust the number of insertions:
          n_insertions = min(max(1, floor(rc*n_insertions)), max_insertions);
          Truncate the insertion set to only the n_insertions most attractive variables.

5.6.2. Move the variables corresponding to those in the insertion set from the binding set to the free set.

5.6.3. Set the score, solution vector, residual vector, free set, and binding set by computing a feasible solution using the least squares method as outlined in Appendix E.2.

5.6.4. Check if the current score is better than the best score. If so, break from this loop.

5.6.5. Restore the best solution, solution vector, free set, binding set, and max insertions value (the product of ec and the current number of insertions, from 5.1).

5.6.6. If n_insertions == 1, the best feasible change did not improve the score. Return.
   End inner loop.
End outer loop.
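Steps 5.2–5.3 are the core of the outer loop: the gradient of the normal-equations objective is evaluated and the binding variables with positive gradient become insertion candidates. A dense C sketch is given below; it assumes the Lawson–Hanson sign convention, grad = A^T b − (A^T A)x, and the array names and storage layout are illustrative rather than TNT's actual data structures.

/* Sketch of steps 5.2-5.3: grad = A^T b - (A^T A) x, then collect binding
   variables whose gradient is positive. AA is the n-by-n (symmetric) matrix
   A^T A, Atb holds A^T b, and x is the current solution (all illustrative). */
int build_insertion_set(const double *AA, const double *Atb, const double *x,
                        const int *binding_set, int n_binding, int n,
                        int *insertion_set)
{
    int n_insert = 0;
    for (int k = 0; k < n_binding; k++) {
        int j = binding_set[k];

        /* j-th component of the gradient of the normal equations */
        double g = Atb[j];
        for (int i = 0; i < n; i++)
            g -= AA[j * n + i] * x[i];    /* row j of AA times x (AA symmetric) */

        if (g > 0.0)
            insertion_set[n_insert++] = j;   /* attractive to free (step 5.3) */
    }
    return n_insert;
}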

E.2 TNT least squares solver pseudocode

This appendix outlines key steps for the preconditioned conjugate gradient solver (PCGNR) used to solve the core least squares problem in Appendix E.1. Further details for PCGNR can be found in the discussion surrounding Algorithm 9.7 of Saad (2003).

0. Input to this function are: A, b, lambda, ε, AA, free_set, binding_set, deletions_per_loop.

1. Sort the free and binding sets in descending order.

2. Reduce A into B. B is a matrix that has all of the rows of A, but its columns are the subset of those in A that correspond to the indices of the variables that are members of the free set.

3. Reduce AA into BB. BB is a symmetric matrix composed of the subset of the rows and columns of AA that correspond to the indices of the variables that are members of the free set.

4. Set R as the upper triangular Cholesky decomposition of BB. Set p to some positive integer if BB is not positive definite.

5. If p > 0, set ε = ε * 10 and add the new value of ε to the diagonal of AA. Reconstruct BB with the new AA and recompute the Cholesky decomposition for a new R and p.

6. Loop until a feasible solution is found:
   6.1. Find a reduced solution using B, b, and R with a preconditioned conjugate gradient method.

6.2. Construct the deletion set from the elements of the reduced solution that are ≤ zero.

6.3. If there are no elements in the deletion set, the solution is feasible. Return.

6.4. Sort the possible deletions by the reduced solution values to find the worst violators.

6.5. Limit the number of deletions per loop.

6.6. Move elements that correspond to those in the deletion set from the free set to the binding set.

6.7. Reduce A into B as in 2.

6.8. Reduce AA into BB as in 3.

6.9. Recompute R, the Cholesky factor, using R, BB, and the deletion set.

End loop.

7. Unscramble the column indices of the free set to get the full (unreduced) solution vector (x).

8. Compute the full residual: residual = b − (Ax).

9. Compute the norm of the residual: score = sqrt(dot(residual, residual)). Return.
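Step 2 gathers the free-set columns of A into the reduced matrix B; a column-major C sketch of that reduction is shown below (the function name and storage layout are illustrative assumptions).

/* Gather the free-set columns of A (m rows, column-major) into B
   (m-by-n_free, column-major). Illustrative sketch of step 2. */
void reduce_columns(const double *A, int m,
                    const int *free_set, int n_free, double *B)
{
    for (int k = 0; k < n_free; k++) {
        int col = free_set[k];            /* original column index in A */
        for (int i = 0; i < m; i++)
            B[k * m + i] = A[col * m + i];
    }
}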
