PARALLELIZATION OF PERFORMANCE LIMITING ROUTINES IN THE COMPUTATIONAL FLUID DYNAMICS GENERAL NOTATION SYSTEM LIBRARY

by

Kyle Horne

A thesis submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in Mechanical Engineering

Approved:

Dr. Thomas Hauser, Major Professor
Dr. Robert Spall, Committee Member
Dr. Heng Ban, Committee Member
Dr. Byron Burnham, Dean of Graduate Studies

UTAH STATE UNIVERSITY
Logan, Utah
2009

Copyright © Kyle Horne 2009
All Rights Reserved

Abstract

Parallelization of Performance Limiting Routines in the Computational Fluid Dynamics General Notation System Library

by Kyle Horne, Master of Science
Utah State University, 2009

Major Professor: Dr. Thomas Hauser
Department: Mechanical and Aerospace Engineering

The Computational Fluid Dynamics General Notation System provides a unified way in which computational fluid dynamics data can be stored, but it does not support the parallel I/O capabilities now available from version five of the Hierarchical Data Format library, which serves as a back end for the standard. To resolve this deficiency, a new parallel extension library that can write files compliant with the standard using parallel file access modes has been written and benchmarked for this work. When using this new library, write performance increases four-fold in some cases compared to the same hardware operating in serial. Additionally, the use of parallel I/O allows much larger cases to be written, since the problem is scattered across many nodes of a cluster, whose aggregate memory is much greater than that found on a single machine. These developments will allow computational fluid dynamics simulations to execute faster, since less time will be spent waiting for each time step to finish writing, as well as prevent the need for lengthy reconstruction of data after the completion of a simulation. (293 pages)
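The parallel file access modes referred to above are supplied by HDF5's MPI-IO virtual file driver rather than by the CGNS standard itself. As a minimal sketch of that underlying mechanism only, and not of the pCGNS interface (whose routines are listed in Table 4.1 and Appendix A), the example below creates an HDF5 file collectively and lets each MPI rank write its own slab of one shared dataset; the file name, dataset name, and array sizes are illustrative assumptions.

    /* Minimal sketch of HDF5 parallel I/O (illustrative only; not the pCGNS API).
     * Compile against a parallel HDF5 build, e.g.: h5pcc sketch.c */
    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char* argv[]) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Route all file access through the MPI-IO virtual file driver. */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

        /* One shared 1-D dataset; each rank owns a contiguous slab of it. */
        hsize_t local_n = 1024;                      /* illustrative slab size */
        hsize_t global_n = local_n * (hsize_t)size;
        hsize_t start = local_n * (hsize_t)rank;
        hid_t filespace = H5Screate_simple(1, &global_n, NULL);
        hid_t memspace = H5Screate_simple(1, &local_n, NULL);
        hid_t dset = H5Dcreate(file, "field", H5T_NATIVE_DOUBLE, filespace,
                               H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

        /* Fill this rank's slab with its portion of a global index field. */
        double buf[1024];
        for (hsize_t i = 0; i < local_n; i++) buf[i] = (double)(start + i);

        /* Select the rank's slab in the file and write it collectively. */
        H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &start, NULL, &local_n, NULL);
        hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

        H5Pclose(dxpl);
        H5Dclose(dset);
        H5Sclose(memspace);
        H5Sclose(filespace);
        H5Pclose(fapl);
        H5Fclose(file);
        MPI_Finalize();
        return 0;
    }

The pCGNS routines benchmarked in this work build the CGNS node structure on top of collective HDF5 access of this kind.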
Acknowledgments

Firstly, I would like to thank my wife, Lydia, for supporting me in the completion of this project. Her efforts to make our humble apartment our home have made all the difference. Secondly, I want to thank my parents for encouraging my curiosity and scientific interest from an early age, and for giving me the freedom to become who I am. I would also like to thank Dr. Thomas Hauser for employing me since my senior year of undergraduate studies. He has funded my research and provided me with opportunities to grow that I had not even considered before.

Additionally, I would like to thank the faculty for their efforts in making the undergraduate and graduate programs at Utah State University environments in which students can succeed. Much credit must also go to Bonnie Ogden for making sure that graduate students get all their paperwork done. I would also like to thank Mr. Mike Kennedy, Mrs. Vanessa Liveris, Ms. Nicole Burt, and Mr. Gussman from Neuqua Valley High School for the sound foundation in science and mathematics which I received there.

Kyle Horne

Contents

Abstract
Acknowledgments
List of Tables
List of Figures
1 Introduction
  1.1 Scientific Computing
    1.1.1 Data Storage History
    1.1.2 System Level I/O
    1.1.3 Hierarchical Data Format
    1.1.4 Supercomputing
    1.1.5 Vector Computers
    1.1.6 Cluster Computing
  1.2 Computational Fluid Dynamics
    1.2.1 Problem Formulation
    1.2.2 Solution Algorithms
    1.2.3 Parallel Implementations
    1.2.4 Storage Concerns
2 Problem Description
  2.1 Problem Introduction
  2.2 Problem Background
  2.3 Computational Fluid Dynamics General Notation System
    2.3.1 Standard Interface Data Structures
    2.3.2 Advanced Data Format
    2.3.3 Hierarchical Data Format version 5
    2.3.4 Mid-level Library
3 Literature Review
  3.1 Previous Work
  3.2 Parallel Input/Output
    3.2.1 Message Passing Interface Standard
    3.2.2 Network Attached Storage
    3.2.3 Message Passing Interface-I/O
    3.2.4 Parallel File Systems
    3.2.5 Hierarchical Data Format version 5
4 Parallel Implementation for the CFD General Notation System
  4.1 Design
  4.2 Features
5 Results
  5.1 Hardware Setup
  5.2 Benchmark Selection
    5.2.1 IOR
    5.2.2 pCGNS
  5.3 Benchmark Results
    5.3.1 IOR
    5.3.2 pCGNS
6 Summary and Conclusion
Bibliography
Appendix
A pCGNS Source Code & Documentation
  A.1 Data Structure Index
    A.1.1 Data Structures
  A.2 File Index
    A.2.1 File List
  A.3 Data Structure Documentation
    A.3.1 base_s Struct Reference
    A.3.2 coords_s Struct Reference
    A.3.3 file_s Struct Reference
    A.3.4 iter_s Struct Reference
    A.3.5 section_s Struct Reference
    A.3.6 slice_s Struct Reference
    A.3.7 sol_s Struct Reference
    A.3.8 zone_s Struct Reference
  A.4 File Documentation
    A.4.1 benchmark.c File Reference
    A.4.2 benchmark.c
    A.4.3 open_close.c File Reference
    A.4.4 open_close.c
    A.4.5 pcgns_util.c File Reference
    A.4.6 pcgns_util.c
    A.4.7 pcgns_util.h File Reference
    A.4.8 pcgns_util.h
    A.4.9 pcgnslib.c File Reference
    A.4.10 pcgnslib.c
    A.4.11 pcgnslib.h File Reference
    A.4.12 pcgnslib.h
    A.4.13 test_base.c File Reference
    A.4.14 test_base.c
    A.4.15 test_queue.c File Reference
    A.4.16 test_queue.c
    A.4.17 test_unstructured.c File Reference
    A.4.18 test_unstructured.c
    A.4.19 test_zone.c File Reference
    A.4.20 test_zone.c
    A.4.21 thesis_benchmark.c File Reference
    A.4.22 thesis_benchmark.c

List of Tables

4.1 API of the pCGNS library as implemented in pcgnslib.c and declared in pcgnslib.h. Software using the library to access CGNS files in parallel must do so using the routines listed here (a comparative sketch follows the List of Figures below).

List of Figures

1.1 A generalized diagram of a cluster computer (Upper) and a shared-memory computer (Lower). The abbreviations used in the diagram are as follows: CPU = Central Processing Unit; NIC = Network Interface Card; HD = Hard Disk. The dashed line connecting all the components of each computer represents the primary data bus of the machine. The simplicity of the shared-memory computer architecture compared to that of the cluster is apparent, making shared-memory computers desirable when possible. Unfortunately, the performance required from the main data bus in a shared-memory computer limits the scalability of the platform.

1.2 Mesh partitioned for parallel computation using Gmsh [1]. The mesh has been split into four partitions algorithmically, with the intent of minimizing the number of boundary edges between each partition. This minimization lowers the communications overhead incurred when executing a CFD simulation on the mesh.

1.3 Mesh from a CFD solution of the common driven cavity problem on an unstructured grid. The mesh was generated using Gmsh [1], and is composed of triangles and quadrilaterals on the interior, with one-dimensional cells all around the boundary.

1.4 Velocity field from a CFD solution of the common driven cavity problem on an unstructured grid. The solution was obtained using a code written by the author and is used only as an example of CFD results. The color corresponds to the velocity magnitude at each cell center and the lines portray both the direction and magnitude of the velocity.

1.5 Pressure field from a CFD solution of the common driven cavity problem on an unstructured grid. The solution was obtained using a code written by the author and is used only as an example of CFD results. The color corresponds to the relative pressure at each cell center.

1.6 General transported quantity field from a CFD solution of the common driven cavity problem on an unstructured grid. The solution was obtained using a code written by the author and is used only as an example of CFD results. The color corresponds to the magnitude of the quantity phi at each cell center.
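Table 4.1 itself is not reproduced in this excerpt, so the sketch below is offered for comparison only: it writes a tiny structured zone with the serial CGNS mid-level library. Treating the pCGNS routines of Table 4.1 as parallel counterparts of calls like these is an assumption here; the authoritative interface is the one declared in pcgnslib.h (Appendix A).

    /* Serial CGNS mid-level library sketch (for comparison only; not pCGNS).
     * Writes one structured zone of a 4x3x2 vertex grid to grid.cgns.
     * cgsize_t is the size type in current CGNS releases (plain int in older ones). */
    #include "cgnslib.h"

    int main(void) {
        int fn, B, Z, C;
        double x[2][3][4], y[2][3][4], z[2][3][4];
        for (int k = 0; k < 2; k++)
            for (int j = 0; j < 3; j++)
                for (int i = 0; i < 4; i++) {
                    x[k][j][i] = i;
                    y[k][j][i] = j;
                    z[k][j][i] = k;
                }

        /* Vertex counts, cell counts, and boundary vertex counts per direction. */
        cgsize_t size[9] = {4, 3, 2, 3, 2, 1, 0, 0, 0};

        if (cg_open("grid.cgns", CG_MODE_WRITE, &fn)) cg_error_exit();
        cg_base_write(fn, "Base", 3, 3, &B);
        cg_zone_write(fn, B, "Zone 1", size, Structured, &Z);
        cg_coord_write(fn, B, Z, RealDouble, "CoordinateX", x, &C);
        cg_coord_write(fn, B, Z, RealDouble, "CoordinateY", y, &C);
        cg_coord_write(fn, B, Z, RealDouble, "CoordinateZ", z, &C);
        cg_close(fn);
        return 0;
    }

In the parallel case, each rank would contribute only its partition of the coordinate arrays, and the writes would go through HDF5's collective access, as sketched after the Abstract above.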