CHANG-DISSERTATION-2017.Pdf (4.942Mb)
Total Page:16
File Type:pdf, Size:1020Kb
STRUCTURE-PRESERVING HIGH PERFORMANCE COMPUTATIONAL METHODS FOR TRANSPORT IN POROUS MEDIA A Dissertation Presented to the Faculty of the Department of Civil and Environmental Engineering University of Houston In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in Civil Engineering by Justin Chang May 2017 STRUCTURE-PRESERVING HIGH PERFORMANCE COMPUTATIONAL METHODS FOR TRANSPORT IN POROUS MEDIA Justin Chang Approved: Chair of the Committee Kalyana Nakshatrala, Assistant Professor Civil & Environmental Engineering Committee Members: Kaspar Willam, Distinguished Professor Civil & Environmental Engineering Cumaraswamy Vipulanandan, Professor Civil & Environmental Engineering Gino Lim, Professor Industrial Engineering Lennart Johnsson, Distinguished Professor Computer Science Matthew Knepley, Assistant Professor Computational and Applied Mathematics Rice University Suresh K. Khator, Associate Dean Roberto Ballarini, Professor and Chair Cullen College of Engineering Civil and Environmental Engineering Acknowledgements First and foremost, I would like to thank my advisor, Professor Kalyana Babu Nakshatrala, for his excellent guidance and mentorship throughout the last six years of my graduate studies. Your research interests and teaching methodologies have inspired me to make a truly life changing decision by going into the field of computa- tional mechanics. It truly has been a memorable experience conducting research at the Computational and Applied Mechanics Laboratory (CAML). I would also like to thank Professors Lennart Johnsson and Matthew Knepley for the many hours of fruitful and thought provoking discussions on topics related to high performance computing. I would also like to thank both my SCGSR mentor, Dr. Satisk Karra, and my former CAML office mate, Dr. Maruti Mudunuru, over at the Los Alamos National Laboratory (LANL) for exposing me to the DOE National Laboratory system as well as their helpful guidance towards my research. Much of the work presented in this dissertation would not have been possible without the many world-class computational scientists and researchers perusing through the PETSc, Firedrake Project, and PAPI mailing lists. Pariticipation in these e-mail threads has helped me not only run my software and computational frameworks more efficiently but also become a skilled scientific programmer. Thank you, University of Houston, for the amazing facilities, resources, and fac- ulty. The all-you-can-eat-buffet at Fresh Food Company is arguably the best buffet I have ever eaten at, and it has kept me well fed these last six years. I will miss this place dearly. I also want to acknowledge the research faculty over at the Center for Advanced Computing and Data Systems (CACDS) for providing the computa- tional resources needed to test my computational frameworks. Thank you Dr. Jerry Ebalunode, Dr. Martin Huarte-Espinosa, Dr. Amit Amritkar, and other CACDS iv members for the very helpful emails and individual meetings. I also want to acknowledge my mom, dad, and sister. I love y’all so much. My educational journey would not have been possible with their unrelenting love, support, and sacrifices these last 29 years. Life in Houston would not have been enjoyable without my friends and community over at the Houston Chinese Church. Thank you guys for the weekly words of encouragement and prayers. Last but not least, I would like to thank my Lord and Savior Jesus Christ because without Him, I am nothing. v STRUCTURE-PRESERVING HIGH PERFORMANCE COMPUTATIONAL METHODS FOR TRANSPORT IN POROUS MEDIA An Abstract of a Dissertation Presented to the Faculty of the Department of Civil and Environmental Engineering University of Houston In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in Civil Engineering by Justin Chang May 2017 vi Abstract This dissertation aims at developing a novel and robust high performing paral- lel computational framework that can enhance the current predictive capabilities of numerical methods for large-scale subsurface transport applications. With increasing capacity and complexity of processors and memory systems, the need for improving the performance of subsurface flow and transport simulations has become an area of active research. Two of the most well known numerical deficiencies that the pop- ular formulations and existing simulation packages suffer from are the inability to meet the non-negative constraint under anisotropic diffusion (i.e., they violate dis- crete maximum principles) and the inability of the standard finite element formulation to ensure local mass balance. Moreover, there is no platform-agnostic performance model that can simultaneously document both the hardware/architectural and algo- rithmic efficiencies of any numerical method or software package that is sensitive to memory-bandwidth limitations. Several existing parallel scientific libraries, such as the Portable and Extensible Toolkit for Scientific Computations (PETSc) and the Massively Parallel Reactive Flow and Transport (PFLOTRAN) libraries, have been developed to help predict subsurface phenomena and are used by many of today’s leading hydrologists and geophysicists alike, but till date, the aforementioned con- cerns have not yet been resolved. Solutions to these numerical deficiencies have been proposed in literature but they do not address them concurrently in a high perfor- mance computing setting. Without a performance model, it is intractable to deter- mine whether proposed modifications to these subsurface software packages would be fast, scalable, or efficient. The objective of this dissertation is to present a computational framework that preserves important properties like local mass balance and positivity that can per- form at a high level. Performance tools and methodologies are presented to guide vii users on how to understand the performance of such frameworks. This dissertation comprises of three sections. First, we present a conceptual performance spectrum that covers time-to-solution, arithmetic intensity, and equations solved per second for any parallel computational framework that solves partial differential equations. As proof of concept, this spectrum is utilized on a wide variety of state-of-the-art sci- entific libraries and multi-grid solvers like the FEniCS/Firedrake Projects, HYPRE, and Trilinos. It is shown that this spectrum can augment one’s ability to understand both the hardware and algorithmic efficiencies of popular numerical techniques like the finite element method. Second, we propose an optimization-based computational framework that can ensure variationally consistent non-negative concentrations for large-scale anisotropic diffusion problems like Chromium remediation in the Sandia Canyon. The predicted computational performance of the proposed framework is based on a “perfect-cache” roofline model, loosely based on concepts originating from the performance spectrum, and it is shown that this roofline model can be used to predict how well the optimization-based framework can strong-scale. Third, we ex- tend the proposed computational framework to solve advection-diffusion equations by employing the variational inequality solver. We also enforce local/element-wise mass conservation by discretizing the advection-diffusion equation using the Discontinuous Galerkin finite element method. Our numerical experiments demonstrate that the proposed variational inequality approach conserves local mass balance, ensures non- negative concentration fields, and can accurately model large-scale and non-linear coupled flow and transport phenomena such as the miscible displacement of oil in a heterogeneous reservoir. viii Table of Contents Acknowledgments ................................ iv Abstract ...................................... vii Table of Contents ................................ ix List of Figures .................................. xiii List of Tables ................................... xxii Notation ...................................... ix Chapter 1. Introduction ............................ 1 Chapter 2. A performance spectrum for parallel computational frame- works that solve PDEs ........................... 5 2.1 INTRODUCTION AND MOTIVATION................ 5 2.1.1 Review of previous works..................... 6 2.1.2 Main contributions........................ 8 2.2 COMMON PERFORMANCE ISSUES................. 9 2.3 PROPOSED PERFORMANCE SPECTRUM............. 16 2.3.1 Arithmetic Intensity ....................... 16 2.3.2 Rate................................ 19 2.3.3 Using the performance spectrum................. 22 2.4 DEMONSTRATION OF THE PERFORMANCE SPECTRUM . 24 2.4.1 Demo #1: AMD Opteron 2354 vs Intel Xeon E5-2680v2 . 28 2.4.2 Demo #2: FEniCS vs Firedrake................. 30 2.4.3 Demo #3: Continuous Galerkin vs Discontinuous Galerkin . 35 2.4.4 Demo #4: Material properties.................. 38 2.5 CASE STUDY PART 1: SINGLE NODE................ 40 2.5.1 Hydrostatic ice sheet flow equations............... 41 2.5.2 Problem setup........................... 42 ix 2.5.3 Results............................... 43 2.6 CASE STUDY PART 2: MULTIPLE NODES............. 46 2.6.1 Example #1: 1024 MPI processes................ 47 2.6.2 Example #2: 256 compute nodes................ 48 2.7 CONCLUDING REMARKS....................... 50 Chapter 3. Large-scale optimization-based non-negative computa- tional framework for diffusion equations: Parallel implementation and performance studies .......................... 52 3.1 INTRODUCTION ............................ 52 3.1.1 Prior