Parallelization of a Dissipative Particle Dynamics Application in a Partitioned Global Address Space Environment

THESIS

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University

By

Karthik Raj Saanthalingam

Graduate Program in Computer Science and Engineering

The Ohio State University

2012

Dissertation Committee:

Dr. Ponnuswamy Sadayappan, Advisor

Dr. Atanas Nasko Rountev

Copyright by

Karthik Raj Saanthalingam

2012

Abstract

Molecular dynamics simulation provides the methodology for detailed microscopic modeling on the molecular scale. The nature of matter is to be found in the structure and motion of its constituent building blocks, and the dynamics is contained in the solution to the N-body problem. Given that the classical N-body problem lacks a general analytical solution, the only path open is the numerical one. Scientists engaged in studying matter at this level require computational tools to allow them to follow the movement of individual molecules and it is this need that the molecular dynamics approach aims to fulfill.

The Molecular Dynamics method follows a constructive approach, trying to reproduce the microscopic behavior of matter using model systems rather than deducing the behavior directly from experiment. Dissipative particle dynamics (DPD) is a stochastic simulation method for soft materials and has been applied to a variety of simulations. Doubts about its adequacy, due to upper coarse-graining limitations that could prevent the method from being applicable to the whole mesoscopic range, have led to the proposal of a modified coarse-grained level tunable DPD method that demonstrates its performance for linear polymeric systems. The proposed method models the system through the interaction between particles as a result of interparticle forces.

The size of the calculation region is non-trivial; it can vary depending on the simulation strategy adopted and greatly affects the performance of the application.

Simulations of very large systems, approaching a cubic micron for milliseconds, are possible using a parallel implementation of DPD running on multiple processors. Because of the short-range nature of the forces in DPD, an efficient way to parallelize the application is to adopt a spatial decomposition technique. In this scheme, the total simulation space is divided into a number of cuboidal regions which are distributed across the processes in the cluster. Each processor is responsible for integrating the equations of motion of all particles that lie within its region of space. Only particles lying near the boundaries of each processor's space require communication between processors. In order to ensure that the simulation is efficient, the crucial requirement is that the number of particle-particle interactions that require inter-processor communication be much smaller than the number of particle-particle interactions within the bulk of each processor's region of space.

In this thesis we outline various alternatives to increase per-node utilization and reduce the communication bottlenecks in each time step of the computation, so as to achieve a near uniform distribution of work among the processes. A partitioned global address space (PGAS) model is deployed in the parallelization strategy so that the system can be logically partitioned among the processors. The novelty of PGAS is that each portion of the space may have an affinity for a particular process, thereby exploiting locality of reference and reducing the communication overhead.


Dedication

This document is dedicated to my family.


Acknowledgments

Firstly, I would like to express my sincere gratitude to my advisor Dr. P. Sadayappan and Dr. David Hudak for their continuous support of my Masters study and research. They have been a great motivation for me throughout and their guidance has helped with my Masters. Their constant encouragement and complete knowledge of the subject has made me learn and explore more about my research.

My sincere thanks also go to my thesis committee member Dr. Atanas Nasko Rountev, for his encouragement and support.

I would like to thank John Eisenlohr and Mikio Yamanoi for all their motivation, guidance and help throughout my thesis. Also, my thanks to all my other lab members for their support.

Finally, I would like to extend my gratitude to my family and friends for all their help and motivation, without whom this Masters would not have been possible.


Vita

2004 .......................... B.Tech., Information Technology, Anna University, Chennai, India.

2010 to 2012 .................. Masters Student, Department of Computer Science and Engineering, The Ohio State University.

Fields of Study

Major Field: Computer Science and Engineering


Table of Contents

Abstract

Dedication

Acknowledgments

Vita

Fields of Study

Table of Contents

List of Tables

List of Figures

Chapter 1: Introduction
    1.1 Theoretical Foundation
    1.2 Molecular Dynamics
    1.3 Dissipative Particle Dynamics
    1.4 Specific Goals
    1.5 Contributions
    1.6 Organization

Chapter 2: System Design
    2.1 Background
    2.2 Global Arrays
    2.3 Overview
    2.4 Programming Model
    2.5 Overview of the System
    2.6 DPD Parallelization in Global Arrays
        2.6.1 Neighbor List
        2.6.2 Force Calculation
    2.7 DPD Parallelization in OpenMP
        2.7.1 Alternative approach

Chapter 3: Experimental Evaluation
    3.1 Performance Tests
    3.2 Summary

Chapter 4: Conclusions
    4.1 Conclusion

Bibliography


List of Tables

Table 3.1: Sample system attributes

List of Figures

Figure 1.1: Molecular Dynamics Simulation

Figure 1.2: Illustration of Dissipative Particle Dynamics approach

Figure 2.1: GA Programming Model

Figure 2.2: The different approaches to computing interactions [1]

Figure 2.3: Parallel global sum

Figure 3.1: Performance improvement through Serial Optimizations and Vectorization

Figure 3.2: Performance improvement through Serial Optimizations and Vectorization

Figure 3.3: Performance improvement through loop transformation

Figure 3.4: Glenn speedup, neighbor list vectorized

Figure 3.5: Oakley speedup, neighbor list vectorized

Figure 3.6: Runtime comparison between Glenn and Oakley

Figure 3.7: Glenn scaling factor – neighbor list redistributed

Figure 3.8: Oakley scaling factor – neighbor list redistributed

Figure 3.9: Local vs. Redistributed neighbor list runtime comparison

Figure 3.10: OpenMP vs. Global Arrays performance comparison

Chapter 1: Introduction

1.1 Theoretical Foundation

Computer simulations have become an indispensable research tool in modern physics, given the scope of modern applications. The reason is that the theoretical description of nature is expressed in mathematical equations, which in most cases cannot be solved exactly. Hence the majority of non-trivial cases have to be treated through approximations, either analytical or numerical, in order to obtain predictions from the theoretical models. While theories are the basis for comprehending nature, experiments provide the observations that are to be comprehended. It is therefore essential to compare the theoretical predictions against the experimental results in order to verify the validity of the theoretical models. The concrete effects of an approximation are often uncontrollable, so it is difficult to assess whether deviations between theory and experiment are caused by the approximations or not; as a result, the verification of approximate theories against experimental observations is rather unreliable.

Computer simulations serve as a bridge between theory and experiment and, thanks to the ever increasing computational capabilities of modern computers, can help solve complex problems without having to rely on approximations. The exact results obtained through such simulations can be compared against the predictions of approximate theories and thus serve to test the reliability of those theories. On the other hand, simulation results can be compared with experimental measurements and thus serve as a test of the models.

A computer simulation provides the connection between the microscopic details of a model and the macroscopic properties of interest. In places where it is difficult or even impossible to obtain experimental measurements, the computer can function as a virtual 'laboratory', where perfect control of all parameters is possible and accurate 'measurements' can be acquired. The Molecular Dynamics (MD) method solves Newton's equations of motion and obtains the trajectory of the system in discrete time steps.

1.2 Molecular Dynamics

In Molecular Dynamics, the time evolution of a system is simulated by numerically integrating Newton's equations of motion. The integration schemes are derived from a series expansion: the more terms are incorporated, the more exact the algorithm becomes, but at the cost of efficiency.

The most time-consuming part of a computer simulation is the calculation of all the forces in the system [1]. It is desirable to use the algorithm with a large time step in order to keep the number of force calculations for a given simulation time to a minimum. On the other hand, a large time step increases the errors introduced by the discretization of the equations of motion. Moreover, there are algorithms that produce the correct equilibrium distribution only in the limit of an infinitely small time step. In practice, the time step therefore represents a tradeoff between speed and stability of the simulation. The simulation proceeds iteratively by alternately calculating forces and solving the equations of motion based on the accelerations obtained from the new forces [6].
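To make the integration step concrete, the following is a minimal sketch of a single leapfrog-style update of the kind used in particle simulations; the array names (r, v, a), the flat 3*N layout and the function name are illustrative assumptions rather than the simulator's actual data structures.

/* One leapfrog-style integration step for nParticles particles in 3D.
 * Positions r, velocities v and accelerations a are assumed to be flat
 * arrays of length 3*nParticles; deltaT is the time step. Sketch only. */
void LeapfrogStep (double *r, double *v, const double *a,
                   int nParticles, double deltaT)
{
    for (int i = 0; i < 3 * nParticles; i++) {
        v[i] += deltaT * a[i];    /* kick: advance velocity by one step */
        r[i] += deltaT * v[i];    /* drift: advance position with the new velocity */
    }
}

A larger deltaT means fewer calls to the force routine per unit of simulated time but, as noted above, at the price of a larger discretization error.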

Figure 1.1: Molecular Dynamics Simulation

CPU time and memory requirements still restrict the size of the system that can be simulated, e.g. the number of particles that can be handled with the given resources.

Design of a molecular dynamics simulation should account for the available computational power. Simulation size (N, the number of particles), time step and total time duration must be selected so that the calculation can finish within a reasonable time period. However, the simulations should be long enough to be relevant to the time scales of the natural processes being studied. To draw statistically valid conclusions from the simulations, the time span simulated should match the kinetics of the natural process.

During a classical MD simulation, the most computationally intensive task is the evaluation of the forces as a function of the particles' internal coordinates. Within that energy evaluation, the most expensive part is the non-bonded or non-covalent one. Common molecular dynamics simulations are of the order of O(N²) if all pair-wise electrostatic and van der Waals interactions must be accounted for explicitly [1].

The size of the integration step is another predominant factor that impacts total CPU time required by a simulation. This is the time frame between evaluations of the forces. The time-step must be chosen small enough to avoid discretization errors. Also, the simulation box size must be large enough to avoid discrepancies at the boundaries.

1.3 Dissipative Particle Dynamics

Among the complex systems studied with computer simulations are soft matter systems, which often consist of a liquid environment in which the objects of interest are dissolved. These objects may be colloids or polymers in solution, or even biological building blocks. The dynamics of such a system is fundamentally affected by its microscopic structure. As a consequence, a complex fluid is no longer completely described by the Navier-Stokes equation. This is due to the disparate length scales which influence the dynamics, that is, the observed phenomena on the macroscopic scale are caused by the constituents that exist on the microscopic scale. The presence of these disparate length scales poses a problem for computer simulations. Numerical solvers for the Navier-Stokes equation are inadequate because the microscopic structure of the complex fluid cannot be incorporated properly. Molecular Dynamics (MD) simulations on the microscopic level are also inappropriate, as overly long simulation runs are necessary to study macroscopic phenomena. Therefore, new techniques are needed to bridge the gap between the microscopic and the macroscopic level.

Mesoscopic simulation methods try to overcome the problems by using the information about the characteristic length scales of the system to devise new models for complex fluids and efficient computational techniques. They aim at modeling the fluid on some intermediate or mesoscopic length scale that captures the relevant dynamics. Dissipative particle dynamics (DPD) is a particle based mesoscopic simulation method for isothermal complex fluids. DPD is an off-lattice mesoscopic simulation technique which involves a set of particles moving in continuous space and discrete time [2]. Rather than representing single atoms, particles represent whole molecules or fluid regions, and atomistic details are not considered relevant to the processes addressed. The particles' internal degrees of freedom are integrated out and replaced by simplified pairwise dissipative and random forces, so as to conserve momentum locally and ensure correct hydrodynamic behavior. The main advantage of this method is that it gives access to longer time and length scales than are possible using conventional MD simulations.

The updating of the positions and momenta of the particles can be done in a standard MD fashion. In order to achieve a coarse-grained level of description, the interparticle forces contain dissipative and stochastic contributions. The idea of the dissipative particle dynamics method is illustrated in figure 1.2. In dissipative particle dynamics, the particles have continuous positions and interact through pairwise forces that contain a conservative, a dissipative and a random part.

The force acting on a particle i has three parts, each of which is a sum of pair forces:

    F_i = Σ_j (F^C_ij + F^D_ij + F^R_ij)

The three contributions are a conservative force F^C_ij, a dissipative force F^D_ij and a random force F^R_ij [1].
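For completeness, the conventional Groot–Warren forms of these three pair forces are sketched below; the thesis does not reproduce them explicitly, so the weight functions and parameters shown here (maximum repulsion a_ij, friction coefficient γ, noise amplitude σ) are the standard textbook choices and may differ in detail from the ones used in the code.

\begin{aligned}
F^{C}_{ij} &= a_{ij}\left(1 - \frac{r_{ij}}{r_c}\right)\hat{r}_{ij}, \qquad r_{ij} < r_c,\\
F^{D}_{ij} &= -\gamma\, w^{D}(r_{ij})\,(\hat{r}_{ij}\cdot \mathbf{v}_{ij})\,\hat{r}_{ij},\\
F^{R}_{ij} &= \sigma\, w^{R}(r_{ij})\,\theta_{ij}\,\Delta t^{-1/2}\,\hat{r}_{ij},
\end{aligned}

with w^D(r) = [w^R(r)]² and σ² = 2γk_BT (the fluctuation–dissipation condition), where θ_ij is a symmetric random variable with zero mean and unit variance.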

Figure 1.2: Illustration of Dissipative Particle Dynamics approach

1.4 Specific Goals

DPD simulations of bio-molecular systems often run for days to months, yet many events of great scientific interest and pharmaceutical relevance occur on long time scales that remain beyond the reach of today's computational capability. In this thesis we outline several new implementation techniques that significantly accelerate parallel DPD simulations [5].

These include a novel parallel decomposition method and message-passing techniques that reduce communication requirements, as well as novel communication primitives that further reduce communication time. We have also developed numerical techniques and restructured the data structures involved in traditional DPD simulations, maintaining high accuracy while exploiting processor-level vector instructions. These methods are embodied in a newly developed DPD simulator code.

Parallelization of the DPD simulation in a distributed environment involves decomposing the computational space into subsets that are distributed across multiple processors. The total volume of the box is divided into P overlapping subsystems of almost equal volume, and each subsystem is assigned to a single processor in an array of P processors. The processors follow a single program multiple data (SPMD) model and each processor calculates the forces on the particles within its assigned domain.

1.5 Contributions

The thesis focuses on providing a parallel implementation of the DPD simulation in a partitioned global address space (PGAS) model. The Global Arrays (GA) library is adopted for the implementation, since most of the data structures involved are arrays and GA provides a wide range of operations for shared-memory-style programming with multidimensional arrays on distributed-memory computers.

The main goals are as follows:

• Identify computationally intensive modules and perform hotspot analysis

• Restructure and reorder loops and data structures to take advantage of the vectorization capabilities of the processor

• Provide a parallel implementation in GA with the particles and calculation domain distributed across the processes

• Reduce communication overhead through careful partitioning of the calculation space

• Compare the shared memory parallelization in OpenMP against the parallelization in Global Arrays (GA) and suggest a hybrid extension to improve processor utilization and achieve fine-grained parallelism

1.6 Organization

The rest of this document is organized as follows. Chapter 2 presents a detailed analysis of the application and the parallelization strategies. It begins with background on the PGAS model and Global Arrays and the motivation behind our approach. This is followed by the specific goals of the application, the various parallelization techniques adopted for the GA based implementation, and the effect of vectorization and multithreading on the overall performance. It includes an in-depth analysis of the organization of the various modules and discusses the abstract libraries specific to the application, the ways in which inter-process communication is minimized, and ways to achieve a balanced load across the processes. Chapter 3 provides a detailed evaluation of the application through a set of experiments. Finally, Chapter 4 presents the conclusions.


Chapter 2: System Design

2.1 Background

This chapter explains the complete design of the system. It begins with a brief background on the PGAS model and Global Arrays (GA). This is followed by an outline of the application and a bottleneck and hotspot analysis, and then by a fine-grained analysis of the effect of vectorization and multi-threading on the overall performance of the application. Finally, the parallelization of the relevant modules is described in greater detail.

This section gives background on the PGAS model and Global Arrays. It enumerates the advantages of Global Arrays and outlines the usage of various primitive libraries in the implementation.

2.2 Global Arrays

The Global Arrays (GA) toolkit [7] provides a shared memory style programming environment in the context of distributed array data structures. From the user perspective, it provides a global view of arrays that are distributed across the processes, providing a way to access any part of an array as if it were in local memory. The details pertaining to the distribution of data, addressing, and data access are encapsulated in the global array objects. The information about the actual data distribution and locality can be used to exploit data locality.

The primary target architectures for which GA was developed are massively-parallel distributed-memory and scalable shared-memory systems. GA provides a logical partition of the data structures into "local" and "remote" portions [7]. It recognizes the variable data transfer costs required to access data depending on its proximity: access to the local portion of the shared memory is assumed to be faster than access to a remote portion. Irrespective of where the referenced data is located, the library provides uniform access mechanisms for all shared data. In addition, the local portion of the shared data can be accessed directly, in the same way as any other data in the process's local memory. Access to other portions of the shared data must be done through the GA library calls.

GA was designed to augment the message-passing model, and it allows for the use of both shared-memory and message-passing styles of programming in the same program [9]. GA inherits an execution environment from the message-passing library that started the parallel program. GA is implemented as a library with C and Fortran-77 bindings, and Python and C++ interfaces have also been developed. Therefore, explicit library calls are required to use the GA model in a parallel C/Fortran program.

2.3 Overview

The basic shared memory operations supported include get, put, scatter and gather. They are complemented by atomic read-and-increment, accumulate (a reduction operation that combines data in local memory with data in a shared memory location), and lock operations. However, these operations can only be used to access data in global arrays rather than arbitrary memory locations. At least one global array must be created before any data transfer operations are used. The data transfer operations supported by GA are truly one-sided/unilateral [8] and will complete regardless of actions taken by the remote process that owns the referenced data. In particular, GA does not require inserting additional GA library calls to assure communication progress on the remote side, thus providing a uniform access mechanism for local and remote regions through the GA library calls.
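The sketch below exercises these primitives on a one-dimensional global array of doubles; it is a minimal, self-contained illustration of the calls named above (create, put, accumulate, get, sync) using the GA C bindings, and the array name, sizes and values are illustrative assumptions rather than code from the DPD simulator.

#include <mpi.h>
#include "ga.h"
#include "macdecls.h"

int main (int argc, char **argv)
{
    MPI_Init (&argc, &argv);
    GA_Initialize ();
    MA_init (C_DBL, 1000000, 1000000);      /* memory allocator used by GA */

    int dims[1] = {1000}, chunk[1] = {-1};  /* let GA choose the distribution */
    int g_a = NGA_Create (C_DBL, 1, dims, "forces", chunk);
    GA_Zero (g_a);

    int lo[1] = {0}, hi[1] = {9}, ld[1] = {0};
    double buf[10], one = 1.0;
    for (int i = 0; i < 10; i++) buf[i] = (double) i;

    if (GA_Nodeid () == 0)
        NGA_Put (g_a, lo, hi, buf, ld);     /* one-sided store into a patch */
    GA_Sync ();

    NGA_Acc (g_a, lo, hi, buf, ld, &one);   /* atomic accumulate by every process */
    GA_Sync ();

    NGA_Get (g_a, lo, hi, buf, ld);         /* one-sided read of the same patch */

    GA_Destroy (g_a);
    GA_Terminate ();
    MPI_Finalize ();
    return 0;
}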

The distribution of the global arrays can be arbitrarily controlled by the programmer, and GA provides support for both regular and irregular distributions. The GA data transfer operations use a global array index-based interface to refer to the shared data rather than referring to it through addresses. Unlike other global address space models that support remote memory (put/get) operations [8], GA does not require the user to specify the target process where the referenced shared data resides, since it provides a global view of the data structures. The higher level array-oriented API (application programming interface) makes GA easier to use without compromising data locality control. The library internally performs global array index-to-address translation and then transfers data between the appropriate processes. There are provisions to inquire the location of an element or array section and to find the process(es) which own a specified section of the global array. The GA toolkit supports the following data types in the C interface: int, long, float, double and double complex (as a struct). Underneath, the library represents the data using C data types.


2.4 Programming Model

The Global Arrays library supports two programming styles: task-parallel and data-parallel. The GA task-parallel model of computation is based on explicit remote memory copy: the remote portion of shared data has to be copied into the local memory area of a process before it can be used in computations by that process.

Figure 2.1: GA Programming Model

There are provisions to obtain data locality information for the shared data [7]. The library offers a set of operations for management of its data structures, one-sided data transfer operations, and supportive operations for data locality control and queries. The GA shared memory model is a compromise between ease of use and portable performance. Load and store operations are guaranteed to be ordered with respect to each other only if they target overlapping memory locations. The store operations (put, scatter) and accumulate complete locally before returning, i.e., the data in the user's local buffer has been copied out but the operation has not necessarily completed at the remote side.

The application can manage the consistency of its data structures in other cases by using the lock, barrier, and fence operations available in the library. The data-parallel model is supported by a set of collective functions that operate on global arrays or their portions. Inter-processor communication is carried out through the library's remote memory copy or collective message-passing operations.

Global arrays are best suited for array-based asynchronous algorithms [8] where communication patterns are irregular. Due to the one-sided nature of the programming model, data consistency must be explicitly managed.

2.5 Overview of the System

In a DPD simulation, the positions and velocities of particles corresponding to atoms evolve according to the laws of classical physics. This application is a modified coarse-grained level tunable DPD simulation method aimed at demonstrating the validity and performance of DPD for linear polymeric systems [4]. The method can reproduce both static and dynamic properties of entangled linear polymer systems well. Linear and non-linear viscoelastic properties were predicted and, despite being a mesoscale technique, the code is able to capture the transition from the plateau regime to the terminal zone with decreasing angular frequency, the transition from the Rouse to the entangled regime with increasing molecular weight, and the overshoots in both shear stress and normal-stress differences upon start-up of steady shear.

The following outline gives the overall organization of the DPD simulation program, with each module explained in greater detail below.

int main (int argc, char **argv)
{
    GetNameList (argc, argv);   /* read input parameters */
    PrintNameList (stdout);
    SetParams ();
    SetupJob ();
    moreCycles = 1;
    while (moreCycles) {
        SingleStep ();          /* advance the system by one timestep */
        if (stepCount >= stepLimit)
            moreCycles = 0;
    }
}

After the initialization phase (GetNameList, SetParams, SetupJob), in the course of which parameters and other data are input to the program or initialized and storage arrays are allocated, the program enters a loop. Each loop cycle advances the system by a single timestep. The loop terminates when moreCycles is set to zero; here this occurs after a preset number of timesteps, but in a more general context moreCycles can be zeroed once the total processing time exceeds a preset limit. The function that handles the processing for a single timestep, including calls to functions that deal with the force evaluation, integration of the equations of motion, adjustments required by periodic boundaries, and property measurements, is

void SingleStep ()
{
    ++ stepCount;
    timeNow = stepCount * deltaT;
    ApplyBoundaryCond ();       /* periodic boundary adjustments */
    ComputeForces ();           /* force evaluation */
    EvalProps ();               /* property measurements */
    AccumProps (1);             /* accumulate statistics */
    if (stepCount % stepAvg == 0) {
        AccumProps (2);
        PrintSummary (stdout);
        AccumProps (0);
    }
}

All the work needed for initializing the computation is concentrated in the following function.

void SetupJob ()
{
    AllocArrays ();     /* runtime array allocation */
    stepCount = 0;
    InitCoords ();      /* initial coordinates */
    InitVels ();        /* initial velocities */
    InitAccels ();      /* initial accelerations */
    AccumProps (0);     /* initialize property accumulators */
}

The following are some of the programming elements that capture the characteristics of the system:

• parameter input with completeness and consistency checks
• runtime array allocation, with array sizes determined by the actual system size
• initialization of variables
• the main loop, which cycles through the force computations and trajectory integration, and performs data collection at specified intervals
• the processing and statistical analysis of various kinds of measurements

Each time step of the DPD simulation involves computing forces on each particle (force computation) and using these forces to compute updated positions and velocities for each particle by numerically integrating Newton's laws of motion (integration). Most of the computational load lies in the force computation. The force computation uses a model called a molecular mechanics force field (or simply, force field), which specifies the potential energy of the system as a function of the atomic coordinates. The force on a particle is the derivative of this potential energy with respect to the position of that particle [9].


The force exerted on a particle is the sum of the interaction forces between that particle and all particles within a cut-off distance. The all-pairs method, the simplest to implement, is extremely inefficient when the interaction range rc is small compared with the linear size of the simulation region. All pairs of particles must be examined because, owing to the continual rearrangement of particles that characterizes the fluid state, it is not known in advance which particles actually interact. Although testing whether particles are separated by less than rc is only a part of the overall interaction computation, the fact that the amount of computation needed grows as O(N²) rules out the method for all but the smallest values of N.

The cell subdivision method [1] provides a means of organizing the information about particle positions into a form that avoids most of the unnecessary work and reduces the computational effort to the order of O(N). It models the simulation region as a lattice of small cells whose edges all exceed rc in length. Particles are assigned to cells on the basis of their current positions, so that interactions are only possible between particles that are either in the same cell or in immediately adjacent cells; if neither of these conditions is met, then the particles must be at least rc apart and the interaction can be eliminated. Because of symmetry only half the neighboring cells need be considered; thus a total of 14 cells (including the cell itself) must be examined in three dimensions [1], and five in two dimensions. The region size must be at least 4rc for the method to be useful.


Figure 2.2: The different approaches to computing interactions – all pairs, cell subdivision, and neighbor lists [1]

The program for the cell-based force calculation organizes particle information in a linked list. Rather than accessing data sequentially, the linked list associates a pointer pn with each data item xn, the purpose of which is to provide a non-sequential path through the data. In the cell algorithm, linked lists are used to associate particles with the cells in which they reside at any given instant; a separate list is required for each cell.

The reason for using linked lists is that it is not known in advance how many particles occupy each cell, since the number can be anywhere between zero and a value determined by the highest possible packing density; the use of sequential tables that list the particles in each cell, while guaranteeing sufficient storage so that any cell can be maximally occupied, is memory inefficient. The linked list approach does not have this problem because of the way the cell occupancy data are organized; the total storage required for all the linked lists is fixed and known in advance.
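A compact realization of this scheme is the head/next representation: one head index per cell and one next index per particle, so the total storage is nCells + nParticles integers regardless of how the occupancy is distributed. The sketch below assumes flat coordinate arrays and coordinates already folded into the box; the names are illustrative, not the simulator's own.

/* Build cell occupancy lists: cellHead[c] is the first particle in cell c,
 * nextInCell[i] is the next particle in the same cell (-1 terminates). */
void BuildCellList (const double *x, const double *y, const double *z,
                    int nParticles, double cellSize, int nx, int ny, int nz,
                    int *cellHead, int *nextInCell)
{
    for (int c = 0; c < nx * ny * nz; c++)
        cellHead[c] = -1;
    for (int i = 0; i < nParticles; i++) {
        int cx = (int) (x[i] / cellSize);
        int cy = (int) (y[i] / cellSize);
        int cz = (int) (z[i] / cellSize);
        int c  = (cz * ny + cy) * nx + cx;   /* linear cell index */
        nextInCell[i] = cellHead[c];         /* push particle i onto cell c */
        cellHead[c]   = i;
    }
}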

The neighbor list takes advantage of the fact that only a small fraction of the particles examined by the cell method – an average of 4π/81 ≈ 0.16 in three dimensions, π/9 ≈ 0.35 in two – lie within interaction range [1]. A list of such pairs can be constructed from those found by the cell method; in order to allow this list to remain useful over several successive time steps, the cut-off radius is relaxed to rn = rc + Δr. It is then possible to benefit from this reduced neighborhood size, because the change in particle position per time step is too small to cause major deviations in the force calculation.

This implies that the list of neighbors remains valid over a number of time steps, even for relatively small Δr. The fact that the list contains particle pairs that lie outside the interaction range ensures that over this sequence of time steps no new interacting pairs can appear that are not already listed. The value of Δr is inversely related to the rate at which the list must be rebuilt [10], and it also determines the number of extra non-interacting pairs that are included in the list. The neighbor list is represented as a table of particle pairs, and the cell method is used to build it, with the cell size now being determined by the distance rn rather than rc [4]. The decision as to when to refresh the neighbor list is based on monitoring the maximum velocity at each time step, which bounds the maximum possible movement of the particles. Refreshing the neighbor list implies complete reconstruction; potentially interacting pairs are recorded in the neighbor list for subsequent processing.
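One common way to implement the refresh decision, sketched below with illustrative names, is to accumulate an upper bound on particle displacement from the maximum velocity at each step and to rebuild once two particles could have closed the shell Δr between them; this follows the displacement-tracking idea in [1] and is not necessarily the exact criterion used in the code.

/* Accumulate the worst-case displacement since the last rebuild and signal
 * a neighbor list refresh once the shell deltaR (= rn - rc) may be consumed.
 * vMax is the largest particle speed observed in the current time step. */
static double dispHi = 0.0;

int NeedNeighborRefresh (double vMax, double deltaT, double deltaR)
{
    dispHi += vMax * deltaT;          /* no particle moved farther than this */
    if (2.0 * dispHi > deltaR) {      /* two particles approaching head-on  */
        dispHi = 0.0;                 /* reset after scheduling a rebuild   */
        return 1;
    }
    return 0;
}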


2.6 DPD Parallelization in Global Arrays

A distributed memory parallel program consists of several processes on a number of processors that communicate via a network. In the DPD simulation each process takes responsibility for updating the positions of particles that fall in a certain region of space, so as to make it easier to scale the system. The region to be simulated (the global cell) is assumed to be a parallelepiped, divided into a regular grid of small parallelepipeds called 'cells'. The cells are distributed across the processes through a block distribution so as to reduce the number of border particles [1][5] that need to be communicated to the neighboring processes.

Each process updates the coordinates of the particles in the one or more cells that it owns. Neighbor list calculation, explicit pairwise computation of non-bonded forces, computation of bonded forces, force interpolation, computation of constraints, and particle migration all require inter-process communication; the communication overhead is kept to a minimum by communicating only the border particles to the processes owning the neighboring cells.

Among the many parallel implementations of DPD codes, two approaches to particle redistribution between processors are employed [11].

• The calculation space is sliced along one coordinate and divided into identical sub-cells.

• The system is partitioned into a mesh of sub-cells in the x, y and z directions.

The particles from cells that are situated on the boundaries of processor domains have to be communicated to the neighboring processors. These boundary cells define the communication overhead. For the first method of box division this overhead scales as N^(2/3) [3], while for the second method it varies as (N/P)^(2/3). Particles that are physical neighbors should also be close to one another in computer memory.

The second method better optimizes the memory (cache) organization for regular boxes and very large numbers of particles. For a particle system confined in a box elongated in one direction, e.g. along the z-axis, the first method should be better than the second. Slicing the box along the z-axis considerably simplifies the routing of messages and enables sending them in a non-blocking way. Each processor sends a message in only one direction, to its closest neighbor. The number of particles from the neighboring domain "cached" on each processor is smaller for the first method. Load balancing is also easier, consisting of shifting the boundaries of the processor domains along one direction, while for the second method the load balancing schemes are very complex and require an irregular mesh.

Moreover, when a long box is divided into sub-cubes in the x-y plane, the number of sub-cubes may turn out to be small, which produces additional overhead.

The system is represented as an array of structs of 'particles', which has to be linearized into individual global arrays such that each particle attribute is represented as a separate global array. This takes advantage of the fact that not all particle attributes are updated in each time step and most of the attributes are read-only [1][5]. Hence, having individual global arrays for the attributes helps reduce the amount of data being communicated with neighboring processes in each time step. The particle attributes that are updated at each time step are distributed across the processes. The read-only data structures are initialized once and are duplicated across the processes. The initial values of the particle attributes are translated to the global space using 'export', a block send operation; each process updates the portion of the global arrays that it owns with the corresponding particle attributes. The bulk of the computational time is spent in the force calculation between the interacting pairs, and hence efficiently parallelizing the calculation over the interacting pairs reduces the computational time to the order of O(N/P), where P is the number of processes.
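A minimal sketch of this linearization and of the 'export' operation, assuming the GA C bindings and a hypothetical attribute buffer: each per-particle attribute becomes its own one-dimensional global array, and the export is realized as an NGA_Put of the index range that the calling process owns, obtained from NGA_Distribution.

#include "ga.h"

/* Create a 1-D global array for one particle attribute and export the
 * locally owned slice from a replicated local buffer. Illustrative only. */
int ExportAttribute (const double *localValues, int nParticles)
{
    int dims[1] = {nParticles}, chunk[1] = {-1};
    int g_attr = NGA_Create (C_DBL, 1, dims, "attribute", chunk);

    int lo[1], hi[1], ld[1] = {0};
    NGA_Distribution (g_attr, GA_Nodeid (), lo, hi);  /* my owned index range */
    if (lo[0] >= 0)                                   /* skip if nothing owned */
        NGA_Put (g_attr, lo, hi, (void *) &localValues[lo[0]], ld);
    GA_Sync ();
    return g_attr;
}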

2.6.1 Neighbor List

Construction of a neighbor list first requires construction of a cell list, which organizes the particles into linked lists corresponding to the cells to which they belong. The use of the cell list prunes the search space in the construction of the neighbor list [1]. The neighbor list can be represented in various ways, one of which is a simple table of atom pairs – the method used here. An alternative method employs a separate list of neighbors for each atom; all lists are stored in a single array with a separate set of indices specifying the range of list entries for each atom. In either case, the cell method is used to build the neighbor list, with the cell size now being determined by the distance rn rather than rc (if the system is too small – relative to rn – for the cell method to work, then the more costly all-pairs approach must be used to build the list).

The neighbor list is constructed as follows: the master process imports the entire particle position array through a block 'import' operation, which moves data from the global array to the corresponding particle attribute of the local particle structure, and computes the cell list, which stores information regarding the particles contained in each cell. The cell list is then broadcast to all the other processes, i.e., all the processes hold identical cell lists from which to compute their local neighbor lists.

2.6.1.1 Vectorization

Many microprocessors provide short-vector SIMD (single instruction, multiple data) extensions to accelerate multimedia tasks [14]. For example, some Intel and AMD processors provide SSE (streaming SIMD extensions). In such cases, these extensions include the ability to simultaneously perform four 32-bit single precision floating point operations [13]. Unfortunately, these operations require that memory accesses are 16-byte aligned, making it difficult or impossible for a compiler to generate efficient SIMD code automatically [15].

The individual processes calculate local neighbor lists for disjoint cells; a pair of particles, pi and pj, are neighbors if F(pi, pj) is less than some given threshold. VTune analysis identified this calculation as a hotspot: it was not being vectorized because the inner loop iterated over a linked list, which inhibits vectorization.

The sequential loop for Neighbor List calculation is as follows:

For each cell c1 in the system
    Choose an adjacent cell c2
    Loop over all particles p1 of c1
        Loop over all possible neighbors p2 in c2
            Compute F(p1,p2)
            if F(p1,p2) < threshold
                add the pair [p1,p2] to the neighbor list

VTune hotspot analysis also revealed that the customized 'rounding' functions used in the calculation of the threshold were a bottleneck and contributed significantly to the computation time within the loop. The rounding functions were replaced with standard macro functions, since the rounding error is of little significance at the scale of the system and lower precision arithmetic suffices to keep the error margin within agreeable limits; this led to a significant improvement in the serial per-loop computation time.

The inner loop in the neighbor list calculation was transformed so that the particles of a cell are gathered into a fixed-size buffer and the loop iterates over the elements of the buffer instead of over the linked list. The instructions are reordered and the periodic boundary conditions are moved outside the loop without affecting the stability of the system.

Loop transformations for vectorization:

Fix a block_size
For each cell c1 in the system
    Choose an adjacent cell c2
    Loop over all particles p1
        Loop over blocks of particles b2
            Loop 1,block_size (over the particles p2 in block b2)
                transfer the x, y and z coordinates of p2 into 3 arrays
            /** The following loop can now be vectorized **/
            Loop 1,block_size
                compute F(p1,p2) using the values in the arrays
                store the squared distances in an array
            Loop over the squared distance array
                if F[j] < threshold
                    add the pair [p1,p2_j] to the neighbor list
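A C rendering of this transformation is sketched below, reusing the head/next cell-list arrays from the earlier sketch and flat coordinate arrays; the buffer gather turns the distance computation into a unit-stride loop that the compiler can auto-vectorize. The names, pair-buffer layout and same-cell handling are illustrative assumptions.

#define BLOCK_SIZE 256     /* tuned to the L2 cache in the experiments */

/* Gather up to BLOCK_SIZE candidate neighbors of particle p1 from cell c2
 * into contiguous buffers, compute squared distances in a vectorizable
 * loop, then filter against the squared list radius rn2. Returns the
 * number of pairs appended to pairBuf (two ints per pair). */
int CollectNeighbors (int p1, int c2,
                      const double *x, const double *y, const double *z,
                      const int *cellHead, const int *nextInCell,
                      double rn2, int *pairBuf)
{
    double bx[BLOCK_SIZE], by[BLOCK_SIZE], bz[BLOCK_SIZE], d2[BLOCK_SIZE];
    int id[BLOCK_SIZE], nPairs = 0;
    int p2 = cellHead[c2];

    while (p2 >= 0) {
        int n = 0;
        for (; p2 >= 0 && n < BLOCK_SIZE; p2 = nextInCell[p2], n++) {
            bx[n] = x[p2];  by[n] = y[p2];  bz[n] = z[p2];  id[n] = p2;
        }
        for (int j = 0; j < n; j++) {             /* vectorizable loop */
            double dx = x[p1] - bx[j], dy = y[p1] - by[j], dz = z[p1] - bz[j];
            d2[j] = dx * dx + dy * dy + dz * dz;
        }
        for (int j = 0; j < n; j++)               /* scalar filter */
            if (d2[j] < rn2 && id[j] != p1) {     /* self-pair skipped; half-list
                                                     handling for c1 == c2 omitted */
                pairBuf[2 * nPairs]     = p1;
                pairBuf[2 * nPairs + 1] = id[j];
                nPairs++;
            }
    }
    return nPairs;
}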

The loop transformation and restructuring provided a 10% improvement in the computation time of the neighbor list construction and better cache utilization. Once the local neighbor list is generated, each process computes the intra-particle and inter-particle forces for the particles in its region of the neighbor list. At each timestep each process checks whether the displacement of the particles in its local space has deviated beyond the threshold compared with the displacement at the previous time step. If the displacement has exceeded the threshold in any of the processes, that process broadcasts a message to the other processes to update the neighbor list [1], so that all the processes update the neighbor list in the same time step and hence view the same state of the system.

2.6.2 Force Calculation

The intra-particle force calculation involves the forces exerted by segments within a particle on neighboring segments of the same particle. Each process updates the forces for the particles in its local space and exports the values to the global space at the end of the computation.

The inter-particle forces involve the interaction between the neighboring pairs obtained from the neighbor list, and each process computes forces for the particles that belong to the disjoint cells organized in its local neighbor list. Inter-particle forces acting on a particle can be updated by any of the processes, as a particle can be present in the neighbor list of any number of processes [11]. In an ideal scenario each process has to read the latest value of the particle acceleration for the force computation through a one-sided 'lookup' operation and update the new particle acceleration through an 'export' operation, which involves mapping the particle acceleration to its corresponding position in the global array and updating it with the new value. The lookup and export operations would ensure that the processes see the latest values of the particle attributes, in line with the serial implementation, but would be a performance bottleneck.

Inter-particle forces include conservative, dissipative and random forces, which are of the form f_{j+1}(p_i) = f_j(p_i) + a*b_{j+1}, where f_j(p_i) is the force acting on particle p_i at time t_j, and a and b are scalars. The positions and velocities of the atoms at any time step are calculated from the inter-atomic forces. Usually, such forces are additive, i.e., the forces acting on the atoms are the sum of the many-body forces due to the interactions of the atoms with their neighbors. Therefore, the total force on an atom is the reduction of the partial forces arising from the interactions with its neighbors.

The behavior of the system is governed by the choice of random number seed used to set the velocities and accelerations. The random numbers generated are uniformly distributed in (0, 1). A leapfrog parallel pseudo random number generator [16] is used to generate streams of random numbers that are uncorrelated and uniformly distributed across the processes. The processes update the forces and accelerations for the particles in their local neighbor lists in a local buffer, followed by a global reduction at the end to sum the contributions to each particle's force computed across the processes. The processes synchronize and export their region of the particle array to the global space from the reduced buffer, and import the entire acceleration array from the global space, since particles might have crossed cell boundaries during the time step and moved to another cell.
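The global force reduction can be expressed directly with GA's collective element-wise sum; the sketch below assumes a local buffer holding this process's partial accelerations for all particles (three components each), which is an assumed layout rather than the simulator's actual one.

#include "ga.h"

/* Element-wise sum of the partial accelerations across all processes.
 * After the call every process holds the total acceleration of every
 * particle. Illustrative sketch using GA's collective reduction. */
void ReduceAccelerations (double *partialAcc, int nParticles)
{
    GA_Dgop (partialAcc, 3 * nParticles, "+");
}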

At the end of the force computation the processes update the other particle attributes, and attributes which are not read-only are written back to the global space in each time step. The processes then synchronize, proceed to the next time step, import values from the global space to update the local particle attributes, and continue with the neighbor list and force calculation. Attributes which are read-only are written to the global space during the last time step, enabling the master process to gather the attributes to be written to an output file that records the state transitions of the system for the given simulation time frame.

2.7 DPD Parallelization in OpenMP

In OpenMP, a set of particles can be assigned to each thread; the master thread computes and broadcasts the cell list to the other threads. The cell space is partitioned along two of the dimensions in the case of a three-dimensional system, with each thread responsible for calculating the neighbor list for a few cells. Each thread updates a local neighbor list which is private to it, computes its range in the global neighbor list through a prefix sum operation on the local neighbor list lengths across the threads, and copies its local neighbor list into the global neighbor list.

In the case of particle decomposition, several threads may try to write the force acting on the same particle, making it necessary to "protect" such writes. This goal can be achieved using either the REDUCTION clause or the ATOMIC directive, and the "protection" of the write instruction applied to the force array strongly limits the scaling of molecular dynamics codes.

The calculation of the forces consists of a loop running over the particle-neighbor pairs in the neighbor list:

Loop over the local neighbor list
for i = 0, n_neighbors
    idx_particle1 = neighbor_list(i)
    idx_particle2 = neighbor_list(i+1)
    /*** calculation of forces ***/
    call compute_conservative_force(idx_particle1, idx_particle2)
    call compute_dissipative_force(idx_particle1, idx_particle2)
    call compute_random_force(idx_particle1, idx_particle2)
    acceleration(t, idx_particle1) = k*forces(l, idx_particle1) + acceleration(t-1, idx_particle1)
    acceleration(t, idx_particle2) = k*forces(l, idx_particle2) + acceleration(t-1, idx_particle2)

where the neighbor_list array contains the indexes of all the particle-neighbor pairs.

A straightforward parallel version of the force module can be obtained using a PARALLEL FOR directive and a REDUCTION(+: acceleration) clause around the external loop, but OpenMP cannot perform reductions on array or structure type variables. The other alternative is to use the ATOMIC directive, in which a thread acquires exclusive access to the single memory address corresponding to the acceleration of a particle for the duration of the update.

#PRAGMA OMP PARALLEL FOR PRIVATE(idx_particle1, idx_particle2)
for i = 0, n_neighbors
    idx_particle1 = neighbor_list(i)
    idx_particle2 = neighbor_list(i+1)
    /*** calculation of forces ***/
    call compute_conservative_force(idx_particle1, idx_particle2)
    call compute_dissipative_force(idx_particle1, idx_particle2)
    call compute_random_force(idx_particle1, idx_particle2)
    #PRAGMA OMP ATOMIC
    acceleration(t, idx_particle1) = k*forces(l, idx_particle1) + acceleration(t-1, idx_particle1)
    #PRAGMA OMP ATOMIC
    acceleration(t, idx_particle2) = k*forces(l, idx_particle2) + acceleration(t-1, idx_particle2)


The performance of the ATOMIC-based version is poor, as it often happens that a thread updates the acceleration of a particle that also appears in the neighbor list of another thread. Sooner or later, the other thread will try to write the force of either the same particle or of a particle that belongs to the same cache line; because of the memory hierarchy, one of the two writes is then stalled by the other (false sharing).

2.7.1 Alternative approach

An alternative approach is to store partial accelerations of the particles in an array which has an extra dimension corresponding to the threads. During the loop over the particles each thread updates the entries in the column assigned to it of the acceleration array.

The code which implements the first step of this scheme is as follows:

#PRAGMA OMP PARALLEL PRIVATE(idx_particle1, idx_particle2, thread_id)
{
    thread_id = OMP_GET_THREAD_NUM()
    #PRAGMA OMP FOR
    for i = 0, n_neighbors
        idx_particle1 = neighbor_list(i)
        idx_particle2 = neighbor_list(i+1)
        /*** calculation of forces ***/
        call compute_conservative_force(idx_particle1, idx_particle2)
        call compute_dissipative_force(idx_particle1, idx_particle2)
        call compute_random_force(idx_particle1, idx_particle2)
        local_acceleration(thread_id, t, idx_particle1) = k*forces(l, idx_particle1) + local_acceleration(thread_id, t-1, idx_particle1)
        local_acceleration(thread_id, t, idx_particle2) = k*forces(l, idx_particle2) + local_acceleration(thread_id, t-1, idx_particle2)
}

At the end of the updating step, the partial results calculated by the threads are summed up, providing the total force on each particle. In the global sum step, the threads access the local_acceleration array by rows and sum the partial values calculated in the previous step. The implementation is as follows:

for i = 0, num_threads
    for j = 0, num_particles
        acceleration(t, j) = acceleration(t, j) + local_acceleration(i, t, j)

The global sum reduction itself can be done in parallel where a set of particles are assigned to a thread and each thread sums all the entries in the rows corresponding to the coordinates of the particles in its set [17].

for i = 0, num_threads
    #PRAGMA OMP PARALLEL FOR
    for j = 0, num_particles
        acceleration(t, j) = acceleration(t, j) + local_acceleration(i, t, j)
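The same reduction can also be written with a single parallel region, in which each thread sums over the thread dimension for its own contiguous chunk of particles and therefore needs no atomics; the C sketch below assumes the per-thread accumulation array is flattened row by row, which is an assumption about the layout rather than the code's actual structure.

#include <omp.h>

/* Sum per-thread partial accelerations into the global acceleration array.
 * local_acc holds num_threads rows of nCoords doubles (one row per thread);
 * each thread handles a disjoint range of j, so no protection is needed. */
void SumPartialAccelerations (double *acc, const double *local_acc,
                              int num_threads, int nCoords)
{
    #pragma omp parallel for schedule(static)
    for (int j = 0; j < nCoords; j++) {
        double sum = 0.0;
        for (int t = 0; t < num_threads; t++)
            sum += local_acc[(long) t * nCoords + j];
        acc[j] += sum;
    }
}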


Figure 2.3: Parallel global sum


Chapter 3: Experimental Evaluation

3.1 Performance Tests

In this section, we focus on the experimental evaluation of the system. We have performed a variety of experiments for the different approaches adopted in the parallelization strategy in order to evaluate the performance of the system under each approach. The experiments were run on an IBM Opteron cluster (Glenn), where every compute node has dual-socket, quad-core 2.4/2.5 GHz Opterons with a minimum of 24 GB RAM, connected together by 10 Gbps or 20 Gbps InfiniBand [18]. An HP Intel Xeon cluster (Oakley) with 12 cores per node and 48 GB of memory per node has also been used [19].

The goals of the experiments are as follows:

• Profile the serial code to identify hotspots for parallelization and to eliminate bottlenecks

• Perform serial optimizations

• Analyze the effect of vectorization and restructure loops that inhibit vectorization

• Evaluate the scalability of the distributed memory parallelization in Global Arrays

• Evaluate the shared memory implementation in OpenMP


Experiments were run on systems with varying numbers of particles and particle densities, and were designed to compare the improvement each method had on the performance of the application.

Attribute                    Value
Num system                   1
Counter limit for system     100
Num particle                 500
Cells.z                      8
Cells.y                      8
Cells.x                      8

Table 3.1: Sample system attributes

Figures 3.1 and 3.2 show the effect of serial optimizations and vectorization as compared against the serial application. Intel's VTune [20] hotspot analysis revealed the instructions on which a significant amount of computational time is spent in each module. Rounding and square root were identified as bottlenecks in the neighbor list computation. Since the simulation is not prone to precision errors, the rounding function was downgraded, which improved performance by 5%.

Many loops inhibited vectorization because they were in a form which the compiler could not vectorize or because they iterated over non-sequential elements in memory. Vectorization provided a 4x speedup on Oakley, which has a vector length of 4. The loops in the neighbor list calculation were restructured to iterate over blocks of particles collected in a fixed-size buffer, the instructions within the loop were rearranged, and conditional statements were moved outside the loop; this provided a 20% improvement in the performance of the neighbor list calculation, which constituted roughly 40% of the total computational time. The block size of the fixed-size buffer can be tuned to the size of the level 2 cache to improve performance. In the experiments a block size of 256 seemed optimal for mid-range systems with lower particle density.

[Bar chart: runtime in seconds with the old vs. new rounding function for the serial, vectorized, and inner-loop-transformed-and-vectorized versions.]

Figure 3.1: Performance improvement through Serial Optimizations and Vectorization

Figures 3.1 and 3.2 correspond to systems of 500 and 5000 particles, with the counter limit for the system, i.e., the number of time steps, set at 1000 and 100 respectively. The results correspond to runs with the processor-specific optimization flag (-xhost) set, which optimizes the code for the processor it is compiled on [21]. Figure 3.3 outlines the performance improvement through loop transformations that make the loops auto-vectorizable. The performance gain is more pronounced for larger systems.


[Bar chart: runtime in seconds with the old vs. new rounding function for the serial, vectorized, and inner-loop-transformed-and-vectorized versions of the larger system.]

Figure 3.2: Performance improvement through Serial Optimizations and Vectorization

[Bar chart: neighbor list construction (BNL) time in seconds for the serial loop vs. the transformed loop.]

Figure 3.3: Performance improvement through loop transformation


The following two figures illustrate the speedup of the application's neighbor list calculation (BNL) and force calculation (IFC). The results correspond to the implementation in which each process computes a local neighbor list for a set of cells, updates the forces and accelerations of the particles in its local neighbor list, and exports the particle attributes at the end of each time step.

As the system progresses to steady state, particles form clusters and are concentrated in certain regions of the space. This means that load imbalance can be introduced by the data distribution strategy adopted. Oakley outperforms Glenn due to longer vector lengths, larger per-node memory and cache sizes, the use of the processor-specific optimization flag, etc.

[Line chart: BNL and IFC relative speedup on Glenn for 1 to 128 processes, vectorized, local neighbor list.]

Figure 3.4: Glenn speedup, neighbor list vectorized


[Line chart: BNL and IFC relative speedup on Oakley for 1 to 96 processes, vectorized, local neighbor list.]

Figure 3.5: Oakley speedup, neighbor list vectorized

[Line chart: runtime in seconds on Glenn and Oakley for 1 to 8 processes.]

Figure 3.6: Runtime comparison between Glenn and Oakley


The disadvantage of a cell based distribution of particles is that particles may not be uniformly distributed across the cells, and hence load imbalance can be induced by the differences in the number of particles in each process's space and the corresponding neighbor list. An alternative approach is to redistribute the neighbor list equally among the processes. In this case the processes compute a local neighbor list for a set of cells and export the local neighbor list to the global space. The global neighbor list is then redistributed across the processes, thus eliminating load imbalance across the processes at later time steps. However, this necessitates that the processes synchronize before they can import a portion of the global neighbor list to proceed with the force calculation. The results indicate that the communication overhead hinders performance compared with the version with no redistribution of neighbor lists.

[Line chart: BNL and IFC relative speedup on Glenn for 1 to 128 processes with the redistributed neighbor list.]

Figure 3.7: Glenn scaling factor – neighbor list redistributed

[Line chart: BNL and IFC relative speedup on Oakley for 1 to 8 processes with the redistributed neighbor list.]

Figure 3.8: Oakley scaling factor – neighbor list redistributed

[Line chart: runtime in seconds with the local vs. redistributed neighbor list for 1 to 96 processes.]

Figure 3.9: Local vs. Redistributed neighbor list runtime comparison


[Line chart: speedup of the OpenMP implementation and the GA implementation on Glenn and Oakley for 1 to 12 threads/processes.]

Figure 3.10: OpenMP vs. Global Arrays performance comparison

Figure 3.10 outlines the performance comparison of the shared memory and distributed memory implementations in OpenMP and Global Arrays respectively. In the OpenMP implementation of the neighbor list calculation, threads collaborate to generate a single neighbor list, whereas in the distributed memory implementation the processes generate local neighbor lists of varying lengths. The threads iterate over equivalent chunks of particle pairs in the neighbor list for the force computation and perform a global reduction of the computed accelerations at the end of the force computation. In the GA implementation the processes either iterate over their local neighbor lists of uneven lengths or redistribute the neighbor lists across the processes, necessitating additional communication calls. The processes then perform a global reduction on the computed acceleration values, which are then updated in the global space. Thus the communication overhead is kept minimal in a shared memory implementation, although such a system does not scale well.

A viable alternative is to exploit the multiple threads within a processor for parallelization at the intra-node level. The hybrid approach is to run a single process per node with k threads that share the work assigned to that process, which may be implemented using OpenMP or POSIX threads (pthreads). Limitations in the implementations of message-passing protocols, which cap the maximum number of processes that can be supported, can be overcome with the hybrid approach in order to utilize very large numbers of processors. This implementation is highly portable across platforms.
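A minimal sketch of the suggested hybrid structure is shown below: one GA process per node runs the pair loop with OpenMP threads, each thread accumulating into its own slab of a per-node buffer, and GA's collective reduction then combines the per-process results. The pair kernel is a placeholder, and all names, parameters and array layouts are assumptions rather than the simulator's actual code.

#include <omp.h>
#include "ga.h"

/* Hybrid GA + OpenMP force step (sketch). threadAcc must hold
 * max_threads * nParticles zeroed doubles (one scalar accumulator per
 * particle for this placeholder kernel); GA communication stays outside
 * the threaded region, only the computation is multithreaded. */
void HybridForceStep (const int *pairList, int nPairs, const double *x,
                      double *partialAcc, double *threadAcc, int nParticles)
{
    #pragma omp parallel
    {
        double *mine = &threadAcc[(long) omp_get_thread_num () * nParticles];

        #pragma omp for schedule(static)
        for (int k = 0; k < nPairs; k++) {
            int p1 = pairList[2 * k], p2 = pairList[2 * k + 1];
            double f = x[p1] - x[p2];    /* placeholder pair kernel */
            mine[p1] += f;               /* equal and opposite contributions */
            mine[p2] -= f;
        }
        /* implicit barrier, then per-process reduction over threads */
        #pragma omp for schedule(static)
        for (int j = 0; j < nParticles; j++)
            for (int t = 0; t < omp_get_num_threads (); t++)
                partialAcc[j] += threadAcc[(long) t * nParticles + j];
    }
    GA_Dgop (partialAcc, nParticles, "+");    /* inter-process reduction */
}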

3.2 Summary

This chapter discussed the performance evaluation of our system. Various experiments were carried out, focused on comparing the parallel versions against the underlying sequential application and on comparing the various implementation strategies. Scalability and other performance metrics were verified by evaluating the parallel implementation across a wide range of systems of varying sizes and characteristics. The scalability analysis indicates that the neighbor list calculation is the communication and computation hotspot because of the underlying data structures involved.


Chapter 4: Conclusions

4.1 Conclusion

In this work, we have demonstrated a parallel implementation of a DPD simulation application in a PGAS environment and have analyzed the performance of various alternative approaches involved in the parallelization process to better understand the scalability of the underlying parallel algorithms.

The distributed memory implementation is compared against a shared memory implementation in OpenMP to understand the impact of thread-level parallelism at the node level in keeping the communication overhead to a minimum. The algorithms associated with both implementations are analyzed to identify a hybrid approach that would combine the best of both approaches and achieve a given scalability on a smaller number of processors by scheduling the work of a given process across multiple threads within the processor. We have also done a detailed analysis of serial optimizations and ways to improve data parallelism.


Bibliography

[1] D. C. Rapaport. The Art of Molecular Dynamics Simulation. Cambridge University Press, 2004.

[2] P. J. Hoogerbrugge and J. M. V. A. Koelman. Simulating Microscopic Hydrodynamic Phenomena with Dissipative Particle Dynamics. Europhys. Lett., 19 (3), pp. 155-160 (1992).

[3] MD Parallel Algorithms. http://www.sandia.gov/~sjplimp/md.html

[4] Patrick B. Warren. Current Opinion in Colloid & Interface Science, Volume 3, Issue 6, December 1998, pp. 620-624.

[5] Krzysztof Boryczko, Witold Dzwinel, David A. Yuen. Parallel Implementation of the Fluid Particle Model for Simulating Complex Fluids in the Mesoscale.

[6] Molecular Dynamics. http://www.rug.nl/fmns-research/moleculardynamics/index

[7] GA toolkit. www.emsl.pnl.gov/docs/global/

[8] P. Sadayappan, Bruce Palmer, Manojkumar Krishnan, Sriram Krishnamoorthy, Abhinav Vishnu, Daniel Chavarría, Patrick Nichols, Jeff Daily. Overview of the Global Arrays Parallel Software Development Toolkit: Introduction to Global Address Space Programming Models.

[9] PGAS Programming Model. http://www.pgas.org/index.php?option=com_content&view=category&layout=blog&id=37&Itemid=56

[10] Mikio Yamanoi, Oliver Pozo and Joao M. Maia. Linear and non-linear dynamics of entangled linear polymer melts by modified tunable coarse-grained level Dissipative Particle Dynamics. The Journal of Chemical Physics 135, 044904 (2011).

[11] Kevin J. Bowers, Edmond Chow, Huafeng Xu, Ron O. Dror, Michael P. Eastwood, Brent A. Gregersen, John L. Klepeis, Istvan Kolossvary, Mark A. Moraes, Federico D. Sacerdoti, John K. Salmon, Yibing Shan, David E. Shaw. Scalable Algorithms for Molecular Dynamics Simulations on Commodity Clusters.

[12] Jari Mononen, Ville Nenonen. Benefits of in Scientific Calculations.

[13] Carlo Kopp. Vector Processing Futures. June 2000.

[14] Vector processor. http://cva.stanford.edu/classes/ee482s/scribed/lect11.pdf

[15] Auto Vectorization. http://gcc.gnu.org/projects/tree-ssa/vectorization.html

[16] Parallel Random Number Generator. http://www.nersc.gov/users/software/programming-libraries/math-libraries/acml/

[17] Simone Meloni, Alessandro Federico and Mario Rosati. Reduction on arrays: comparison of performances among different algorithms. CASPUR, Inter University Consortium for Supercomputing, Via dei Tizii 6/b, I-00185, Rome, Italy. August 31, 2003.

[18] Glenn Specifications. http://www.osc.edu/supercomputing/computing/opt/index.shtml

[19] Oakley Specifications. http://www.osc.edu/supercomputing/computing/oakley.shtml

[20] Intel VTune Amplifier XE. http://software.intel.com/en-us/articles/intel-vtune-amplifier-xe/

[21] Intel C++ Compiler (icc) Options. http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/mac/man/icc.txt