Space-efficient and exact system representations for the nonlocal protein electrostatics problem

Dissertation submitted in fulfillment of the requirements for the degree of "Doktor der Naturwissenschaften" at the Faculty of Physics, Mathematics, and Computer Science of the Johannes Gutenberg University Mainz

Thomas Kemmer

Born in Wittlich
Mainz, October 14, 2020
First reviewer: Prof. Dr. Andreas Hildebrandt
Second reviewer: Prof. Dr. Bertil Schmidt
Date of the oral examination: March 9, 2021

D77

Abstract

The study of proteins and protein interactions, which represent the main constituents of biological functions, plays an important role in modern biology. Application settings such as the development of pharmaceutical drugs and therapies rely on an accurate description of the electrostatics in biomolecular systems. This particularly includes nonlocal electrostatic contributions of the water-based solvents in the cell, which have a significant impact on the long-range visibility of immersed proteins. Existing mathematical models for the nonlocal protein electrostatics problem can be approached with standard numerical techniques, including boundary element methods (BEM). However, the typically large, dense, and asymmetric matrices involved in discretized BEM formulations previously prevented the application of the method to realistically-sized biomolecular systems. Here, we overcome this obstacle by developing implicit yet exact representations for such BEM matrices, capturing trillions of floating-point values in only a few megabytes of memory and without loss of information. We present generalized reference implementations for our implicit types alongside specialized matrix operations for the Julia and Impala programming languages and demonstrate how they can be utilized with existing linear system solvers. On top of these implementations, we develop full-fledged CPU- and CUDA-based BEM solvers for the nonlocal protein electrostatics problem and make them available to the scientific community. In this context, we show that our solvers can perform dozens of matrix-vector products for the previously inaccessible BEM systems within a few seconds or minutes and thus make it possible, for the first time, to solve the employed BEM formulations with exact system matrices in the same time frame.


German summary

Proteins and their interactions with other biomolecules in the cell form an important basis of biological function. Depending on the problem at hand, studies of these interactions employ different mathematical models to describe the energies involved. In drug development, for example, a modeling of the system's electrostatics that is as precise as possible is crucial, since the electrostatics plays a decisive role in the process of finding potential binding partners. The structural complexity of the water-based solvents in the cell poses a particular challenge here, as dipole moments and the tendency to form hydrogen bonds severely restrict the mobility of the water molecules in the immediate vicinity of dissolved proteins. These effects can be described, for instance, by the theoretical framework of nonlocal electrostatics, and the resulting systems of equations can be solved with standard numerical techniques. Here, we consider an existing formulation of the protein electrostatics problem based on a boundary element method (BEM), whose contribution to a more accurate computation of electrostatic potentials has previously been demonstrated for small biomolecular systems, but whose practical applicability to larger systems has so far been limited by its immense memory requirements.

In this dissertation, we develop implicit representations for the usually densely populated and asymmetric system matrices of the linear systems of equations involved in the numerical solving process, which are the reason for the limitations mentioned above. Our implicit representations reduce the originally quadratic memory requirements to linear ones without losing any information, such that matrices with several billion elements can easily be represented with a few megabytes of memory. We further present reference implementations for our implicit matrix types, alongside operations on them, in the Julia and Impala programming languages, and describe how they can be used with existing solvers for linear systems of equations or how the representations can be transferred to other problems.

On this basis, we finally develop full-fledged boundary element solvers for the nonlocal protein electrostatics problem that are, for the first time, able to process realistic input sizes and to provide solutions within a few seconds or minutes. For this purpose, we implement different variants of the general matrix-vector product for CPUs and GPUs (using NVIDIA's well-known CUDA platform) and compare their performance on different machines. We place particular emphasis on our Julia implementations, which cleverly exploit language features to manipulate originally serial and purely CPU-based linear system solvers, effectively leading to a (CPU- or GPU-based, as desired) parallelization of these solvers. A further focus lies on the domain-specific implementation of our Impala solutions, which generates programs specialized and optimized for the chosen target platform from a largely platform-independent code base, drastically reducing the maintenance effort of the implementation on the one hand and allowing platforms to be easily added or exchanged on the other. All software packages developed as part of this work are open source and freely available, such that the results presented here can be easily reproduced and our boundary element solvers can be used, in particular, for scientific purposes.

Acknowledgments

First and foremost, I want to express my gratitude to my advisor, Professor Dr. Andreas Hildebrandt, who introduced me to the world of nonlocal protein electrostatics, guided me through the early stages of this project, and has provided invaluable feedback ever since. I am also indebted to my second reviewer, Professor Dr. Bertil Schmidt, for several constructive discussions on the high-performance capabilities of the system solving process.

Furthermore, I want to thank my dear friends who assisted me with proofreading, for every bit of nitpicking, all fruitful discussions, and for accompanying me on my never-ending quest for working through unspeakable amounts of coffee and sushi. I am also deeply grateful to my significant other for her endless support and encouragement over the course of the writing process, for proofreading, and for company during roughly one or two all-nighters.

I would like to thank my friends and colleagues at the Institute of Computer Science, the Scientific Computing and Bioinformatics group, and the Integrated Research Training Group for a friendly working atmosphere and numerous discussions on diverse topics.

Last but not least, I am indebted to my former supervisor, who introduced me to a more structured approach to scientific writing during my Master studies. Although she was not involved in this manuscript or the projects leading to it, her invaluable feedback on my Master's thesis a long time ago permanently sharpened my awareness of stylistic consistency and thus helped me, once again, through the writing process.


Abbreviations

AoS Array of structs
BEM Boundary element method
DSL Domain-specific language
FDM Finite difference method
FEM Finite element method
GMRES Generalized minimal residual method
GPGPU General-purpose computing on GPUs
IR Intermediate representation
JIT Just in time
PE Partial evaluation
REPL Read-eval-print loop
SIMD Single instruction, multiple data
SIMT Single instruction, multiple threads
SM Streaming multiprocessor
SoA Struct of arrays


Contents

1 Introduction

2 Nonlocal protein electrostatics
  2.1 Protein interactions
    2.1.1 Protein structure hierarchy
    2.1.2 The human proteome
    2.1.3 Protein docking
    2.1.4 Force field models
  2.2 Protein electrostatics
    2.2.1 Local cavity model
    2.2.2 Nonlocal cavity model
  2.3 Numerical solvers

3 BEM system analysis
  3.1 Background: Operators
    3.1.1 Trace operators
    3.1.2 Boundary integral operators
  3.2 Local cavity model
    3.2.1 Continuous formulation for the local cavity model
    3.2.2 Discrete formulation for the local cavity model
  3.3 Nonlocal cavity model
    3.3.1 Continuous formulation for the nonlocal cavity model
    3.3.2 Discrete formulation for the nonlocal cavity model
  3.4 Derived quantities
    3.4.1 Interior potentials
    3.4.2 Exterior potentials
    3.4.3 Reaction field energies

4 Platform selection
  4.1 The Julia language
    4.1.1 Scripting language characteristics
    4.1.2 High-performance capabilities
  4.2 AnyDSL
    4.2.1 Compilation flow
    4.2.2 Impala
  4.3 CUDA
    4.3.1 Programming model
    4.3.2 CUDA support in Julia
    4.3.3 CUDA support in Impala

5 Solving implicit linear systems
  5.1 Background: Iterative solvers
    5.1.1 Deterministic iterative solvers
    5.1.2 Probabilistic and randomized iterative solvers
    5.1.3 Preconditioning
    5.1.4 Parallelization schemes
  5.2 Interaction matrices
    5.2.1 Matrix-free system for the local cavity model
    5.2.2 Matrix-free system for the nonlocal cavity model
    5.2.3 Post-processing with implicit representations
  5.3 Implicit arrays for Julia
    5.3.1 Fixed-value arrays
    5.3.2 Interaction matrices
    5.3.3 Block matrices
    5.3.4 Matrix projections
    5.3.5 Matrix-free systems for the cavity models
  5.4 Solving implicit systems in Julia
    5.4.1 IterativeSolvers.jl
    5.4.2 Preconditioners.jl
  5.5 Implicit arrays for Impala
    5.5.1 Buffers and vectors
    5.5.2 Context managers for data containers
    5.5.3 Matrices and matrix operations
    5.5.4 Matrix-free systems for the cavity models
  5.6 Solving implicit systems in Impala

6 NESSie.jl
  6.1 Package organization
  6.2 System models
    6.2.1 Unified representation
    6.2.2 Input formats
    6.2.3 Output formats
    6.2.4 Format conversion
  6.3 Solvers and post-processors
    6.3.1 Solvers for the explicit representations
    6.3.2 Solvers for the implicit representations
    6.3.3 Post-processors
  6.4 Test models
    6.4.1 Born model
    6.4.2 Nonlocal Poisson test model

7 CuNESSie.jl
  7.1 Interoperability with NESSie.jl
  7.2 GPU abstraction
    7.2.1 Host vs. device code
    7.2.2 CuArrays
  7.3 Potential matrices for CUDA
    7.3.1 Semantically-structured device arrays
    7.3.2 Parallel matrix-vector products
  7.4 Solvers and post-processors
    7.4.1 CUDA-accelerated solvers
    7.4.2 CUDA-accelerated post-processors

8 AnyBEM
  8.1 Package organization
  8.2 Cross-language features
    8.2.1 Cross-language interface
    8.2.2 Cross-language system models
  8.3 Domain specialization
    8.3.1 Built-in platform abstractions
    8.3.2 Floating-point precision
    8.3.3 Cross-platform data transfers
    8.3.4 Parallel for loops
  8.4 Matrix-vector products

9 Performance analysis
  9.1 Experimental setup
  9.2 Experimental data
    9.2.1 Born ions
    9.2.2 Small objects and molecules
    9.2.3 Macromolecules
  9.3 Test models
    9.3.1 Born models
    9.3.2 Poisson test models
  9.4 Matrix-vector products
    9.4.1 Explicit vs. implicit representations
    9.4.2 CPU-based implementations
    9.4.3 GPU-based implementations
  9.5 BEM solvers
    9.5.1 Explicit system representations
    9.5.2 Implicit system representations

10 Conclusion

A Test model functions
  A.1 Electrostatic potentials for the Born test model
  A.2 Electrostatic potentials for the Poisson test model

B Additional tables and figures

Bibliography

1 Introduction

Proteins primarily exert biological functions through a highly complex network of interactions with other biomolecules at the cellular level. While the vast variety of possible interaction partners enables a range of chemical reactions fit to power and maintain an organism as complex as a human being, the same mechanism provides a broad attack vector for diseases, implemented as alterations of the interaction network. A deeper understanding of the network's nodes and connections allows for a targeted manipulation of pathogenic protein interactions through drugs and therapies while minimizing the risk of unwanted side effects. In the modeling of protein-protein and protein-ligand interactions, electrostatic contributions of the binding partners and the cell environment are of particular importance for the long-range visibility of the molecules. More specifically, the electrostatic profile of the protein actively influences the binding process by guiding potential interaction partners through the cell and thus substantially increases the odds of two binding partners meeting as compared to pure random thermal motion.

Many commonly used interaction models oversimplify electrostatic contributions of the proteins, their binding partners, or the cell environment. As a consequence, they are only of limited use in the accurate prediction of protein interactions in a long-range setting. Two common simplifications are the neglect of molecular boundaries in the system and the underestimation of polarization effects from the water-based solvents in the cells. These issues can be addressed, for example, with the framework of nonlocal electrostatics, which respects solvent molecule interdependencies and captures their amplifying or dampening effects on the electrostatic profiles of immersed proteins. In direct comparison to simpler models, however, those derived from nonlocal electrostatics are expensive to compute. In this manuscript, we focus on a particular numerical solution scheme to the problem, namely, a boundary element method (BEM) for the nonlocal electrostatics of biomolecules in structured solvents.

BEM formulations reduce the problem entirely to the molecular boundary and, as a result, only require a two-dimensional mesh of the protein surface rather than a three-dimensional representation of the whole system domain. Also, the BEM approach does not require initial values for the outer boundaries of the system,

in contrast to other solution methods, such as finite element methods or finite difference methods. A major challenge of implementing the method arises from the discrete linear systems involved in the solving process: The systems usually contain dense and asymmetric system matrices. What is more, their dimensions scale quadratically with respect to the area and resolution of the approximated protein surfaces. Represented explicitly in this form, most real-world biomolecular systems would exceed several terabytes worth of data, with almost all of them being contained in the system matrices alone.

Here, we develop space-efficient and exact implicit representations for such BEM system matrices that can be used with standard iterative solving techniques for linear systems. We start with preexisting BEM formulations for the protein electrostatics problem and identify recurring patterns that serve as a basis for the implicit representations. Most of the generalizations outlined here originate in the discretization process of the continuous BEM formulation and are not specific to the protein electrostatics problem, thus conceptually enabling their application to other BEM problems as well. We provide multiple different implicit representations for the presented systems alongside reference implementations in two programming languages: Firstly, we use the versatile Julia language [1] as a representative for a high-performance yet broadly accessible programming language that is well-suited for the needs of scientific computing. Secondly, we utilize the Impala language, which is part of the AnyDSL framework [2] for the shallow embedding of domain-specific languages by means of partial code evaluation.

Based on our implicit system matrix representations, we further develop specialized versions of the operations used in the iterative solving process. In particular, we focus on generic matrix-vector products as the typically dominant contributor to the solving time and provide parallelized implementations of these product functions as CPU-based or GPU-accelerated variants. For the latter variants, we utilize NVIDIA's famous CUDA platform [3], which is supported through both Julia and Impala. Our efforts are realized in a total of four open-source software packages that we make available publicly and free of charge¹:

• ImplicitArrays.jl: General-purpose implementations for various implicit vector and matrix types for Julia,

• NESSie.jl: Fully functional CPU-based BEM solvers and post-processors for the nonlocal protein electrostatics problem in the Julia language,

¹ https://github.com/tkemmer

• CuNESSie.jl: CUDA-accelerated variants of the solvers and post-processors provided by NESSie.jl,

• AnyBEM: Prototypical implementations of the implicit system matrix representations as well as CPU- and GPU-based parallel matrix operations in Impala.

In the Julia context, we show how to exploit Julia language features to “inject” our specialized operations into the solving workflow of existing iterative solvers, allowing us to effectively turn originally serial solvers into parallelized ones. In the Impala context, we show how the protein electrostatics problem in conjunction with our implicit system matrix representations can be used to generate dedicated solver flavors (e.g., CPU- or GPU-based solvers for single- or double-precision floating-point values) from a common code base.
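The following minimal Julia sketch illustrates the mechanism behind this "injection" (the type, its name, and the toy matrix structure are illustrative assumptions and not part of the packages listed above): a custom type only needs to behave like an AbstractMatrix and ship its own specialized matrix-vector product, and Julia's multiple dispatch then routes generic solver code to that product automatically.

```julia
using LinearAlgebra

# Hypothetical implicit matrix A = I + c·J (identity plus the constant c in every entry),
# stored in O(1) memory instead of O(n²). Type and field names are illustrative only.
struct ConstantOffsetMatrix{T} <: AbstractMatrix{T}
    n::Int
    c::T
end

Base.size(A::ConstantOffsetMatrix) = (A.n, A.n)
Base.getindex(A::ConstantOffsetMatrix{T}, i::Int, j::Int) where {T} =
    (i == j ? one(T) : zero(T)) + A.c

# Specialized matrix-vector product in O(n) time and O(1) extra memory.
function LinearAlgebra.mul!(y::AbstractVector, A::ConstantOffsetMatrix, x::AbstractVector)
    s = A.c * sum(x)
    @inbounds for i in 1:A.n
        y[i] = x[i] + s
    end
    y
end

A = ConstantOffsetMatrix(1_000, 0.5)
x = rand(1_000)
y = A * x    # dispatches to the specialized mul! defined above
```

Krylov-type solvers such as the GMRES implementation in IterativeSolvers.jl access the system matrix almost exclusively through mul! and can therefore operate on such a type without ever materializing the full matrix; the same principle carries over to the far more involved BEM matrices discussed in Chapter 5.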

Using the above packages, we demonstrate how our implicit representations are able to reduce the initially quadratic memory footprint of the BEM systems for several small proteins from up to a few terabytes to only a few megabytes. To the best of our knowledge, this reduction to a linear memory footprint (while still representing the original matrices exactly) allows, for the first time, to solve such systems within realistic time and memory constraints. For example, the computation time of nonlocal system matrices can be roughly reduced by two orders of magnitude through GPU acceleration, as compared to a serial CPU-based implementation. This demotes the computation of individual matrix-vector products for system matrices with hundreds of millions of elements to a matter of seconds and results in solving times (involving a few hundred matrix-vector products in the process) on the order of minutes for the same systems.

This manuscript is structured as follows: Chapter 2 provides an introduction to the matter of protein electrostatics and its role in protein interactions. In this context, we outline different approaches to a more accurate treatment of the present solvents and outline related work with regard to numerical formulations and solvers for the nonlocal protein electrostatics problem. In Chapter 3, we analyze a specific set of continuous and discrete BEM formulations for the protein electrostatics problem in order to extract common patterns. Our findings then serve as a basis for both our implicit system matrix representations and specialized matrix operations for the BEM solvers later on in this manuscript. Chapter 4 presents the programming platforms selected for the reference implementations of the implicit system representations, specialized matrix operations, and BEM solvers. Here, we discuss how the protein electrostatics problem can benefit from specific features of the respective platforms. Chapter 5 then lays the foundation for the implicit system representations and reference implementations. After a short introduction of iterative solvers for

linear systems of equations, we develop the theoretical constructs for an exact yet implicit representation of the BEM systems at hand and, possibly, other BEM matrices employing the same discretization schemes. More specifically, we present interaction matrices as a common building block of the linear systems, provide reference implementations for both Julia and Impala, and discuss how these representations can be used with iterative solvers. In Chapters 6, 7, and 8, we provide in-depth descriptions of the aforementioned BEM solver packages NESSie.jl, CuNESSie.jl, and AnyBEM, with an emphasis on considerations specific to the underlying platforms. Chapter 9 then gives insights into the performance of the reference implementations. For this, we first introduce the experimental setup and datasets and evaluate our BEM solvers for analytically solved models, before we present the runtime performance results for the CPU- and GPU-based matrix-vector products and the BEM solvers themselves. In Chapter 10, we finally conclude with a summary of the topics presented in this manuscript and our contributions to the nonlocal protein electrostatics problem in particular. In addition, we give an outline of future directions for our system representations and solvers.

2 Nonlocal protein electrostatics

In this chapter, we give a short introduction to the biological and physical backgrounds of nonlocal protein electrostatics. We particularly focus on protein interactions as an implementation of biological functions, which is the fundamental basis for the development of drugs and therapies. For this, we introduce the protein structure hierarchy alongside common protein interaction models. In the second part of this chapter, we try to establish a basic understanding of the contributions of electrostatic terms to a more accurate representation of protein interactions and outline the differences between local and nonlocal protein electrostatics with regard to the cell environment. Finally, we give an overview of related work in the context of numerical solvers for the nonlocal protein electrostatics problem.

2.1 Protein interactions

At the cellular level, biological function is mainly implemented through the interaction of molecular entities such as proteins, rendering the study of proteins and their interactions with other biomolecules a cornerstone of modern biology. In the literature, proteins are described as "the workhorses of the cell" [4] or as "the structural and enzymatic elements of an organism – they are the working parts as well as the building material" [5]. Generally, proteins do not only interact with other proteins but also with DNA, RNA, and various other molecules. The exact characteristics of the interactions depend on the associated biomolecular task, ranging from a broad variety of possible interaction partners (e.g., for protein digestion) to highly specific interactions (e.g., to catalyze reactions under certain environmental conditions).

The set of biomolecular interactions corresponding to a given cell or organism is commonly referred to as the interactome, although the term's definition differs between publications¹. Alterations of the interactome can lead to a plethora of diseases in an organism (e.g., cancer [8]). At the same time, such alterations serve as a basis for the development of therapies and can be induced intentionally

¹ Some authors loosely define the interactome as "the complete repertoire of interactions potentially encoded in [an organism's] genome" [6], while others use the same term to specifically refer to protein-protein interactions (e.g., [7, 8]).

Figure 2.1. – Leucine molecule shown with backbone torsion angles ϕ and ψ and side chain torsion angles χ_i as an example of amino acid flexibility

(e.g., via drugs) to combat said diseases. A classic yet effective approach to this matter is the regulation of a single protein involved in a harmful interaction rather than a regulation of the whole interaction network. This is often implemented by introducing alternative binding partners into the organism that either compete with or bind to the original binding partner of the protein and thus interfere with the interaction. For a more detailed review of this topic, including alternative approaches, the interested reader is referred to the corresponding book series edited by Thomas Lengauer [9].

2.1.1 Protein structure hierarchy

Amino acids serve as the central building blocks of all proteins. While carrying different side chains, the protein-building (i.e., proteinogenic) amino acids share a common molecular structure that allows them to be linked together linearly into polypeptides, which constitute the respective proteins. Each proteinogenic amino acid is assigned a unique letter of the alphabet, such that each protein can be simply described as a sequence of characters. This protein sequence is then used to represent proteins in databases such as the Universal Protein Resource (UniProt) [10]. While the peptide bond (i.e., the connection point) between linked amino acids is mostly rigid, the protein backbone contains up to two degrees of rotational freedom per amino acid, in addition to the flexibility of their side chains (see Figure 2.1 for an example). The resulting flexibility allows proteins to fold into complex structures. The folding space is naturally restricted by factors such as steric collisions and is further influenced by other factors like the physicochemical properties of

the amino acids. Early experiments suggested that a protein's structure is often directly determined by its sequence [11, 12] and efforts have been made to predict protein function directly from the sequence [13]. However, many proteins can be observed in different conformations, heavily influenced by the composition of the cellular environment or by binding partners. For example, many proteins undergo conformational changes during the binding process (cf. Section 2.1.3) or are known to possess no canonical structure (or intrinsically unstructured regions) in the first place [14, 15]. Other reasons include alterations of the protein after its synthesis in the cell, commonly referred to as post-translational modifications (PTM), which can change the activity and properties of the protein.

The different characteristics of protein structure are usually captured through a hierarchical classification system: On the first level, each protein is described by its peptide sequence, referred to as its primary structure. The secondary structure comprises local structures commonly found across many proteins, including helices or β-sheets, i.e., hydrogen bond-induced zigzag patterns of mostly flat amino acid strands. The 3D structure of a protein, that is, the coordinates of all constituent atoms, is captured as its tertiary structure, while complexes composed of multiple proteins (or subunits) are referred to as quaternary structures.

Different approaches exist to classify proteins or associate them to specific biological functions based on common sequence or structural patterns. Notable classification systems include SCOP2 (Structural Classification of Proteins) [16, 17] and Pfam [18]. Today, thousands of protein structures are available through databases such as the Protein Data Bank (PDB) [19], most of which have been collected by means of X-ray crystallography (XRC), nuclear magnetic resonance (NMR) spectroscopy, or 3-dimensional electron microscopy (3DEM). Notwithstanding the comparative abundance of available protein structures these days, the prediction of structural conformations and protein function still remains a challenging task.

2.1.2 The human proteome

According to the central dogma of molecular biology, proteins are synthesized from information encoded in the DNA. More specifically, protein-coding regions of the DNA are transcribed into mRNA (messenger RNA) molecules, which are further translated into proteins, effectively converting triplets of nucleotides (so-called codons) into amino acids [20, 21]. Such DNA regions are commonly referred to as genes, although the same term also refers to DNA regions encoding for functional RNA molecules, which are not further translated into proteins.

Figure 2.2. – SES (solvent-excluded surface) representation of a protein complex between bovine pancreatic trypsin inhibitor (BPTI, blue/left) and trypsin (red/right), generated from PDB ID 4Y0Y using the BALLView application [30]

Counting the number of human genes and proteins or, in other words, determining the size of the human genome and proteome, has been an ongoing effort over the past decades. Early estimates fell typically in the range between 50,000 and 100,000 genes [22, 23]. With the first publication of the human genome in 2001 [24] in the context of the Human Genome Project (HGP), several studies suggested new estimates between 19,000 and 35,000 genes [24, 25, 26, 27, 28], with current numbers usually falling in the range between 20,000 and 21,000 distinct protein-coding genes [22, 23]. At the time of writing, UniProt lists 20,359 entries corresponding to the human proteome (UP000005640), accounting for one representative protein per protein-coding gene [29]. Due to a process called alternative splicing, the same gene can be synthesized into different proteins (isoforms). UniProt currently attributes roughly 22,000 additional protein sequences to this process in the human genome [29], totaling more than 40,000 different sequences in the human proteome that fold into three-dimensional structures and eventually take part in protein interactions.

2.1.3 Protein docking

As described earlier, protein function is primarily implemented through protein interactions. This specifically includes the formation of complexes of proteins with other proteins or smaller ligands and is of particular importance in the context of rational drug design. Being able to predict how two such biomolecules bind together is

known as the docking problem and dates back to the late 1970s [31]. The docking problem accounts not only for the relative orientation of the binding partners (also called the binding mode, see Fig. 2.2 for an example) but also for the stability of the complex, i.e., the resulting binding free energy (or binding affinity) [32].

In order to form a stable complex, the binding partners require a strong geometric compatibility in the vicinity of the binding site (or binding pocket) as well as sufficiently compatible physicochemical properties. Historically, the geometric compatibility of the binding partners was assumed to be a hard prerequisite of the binding process. However, this so-called lock-and-key principle [33] was further developed into an induced fit model [34] more than half a century later, which accounted for conformational changes of the molecules during the binding process. Over the course of the past decades, a plethora of different approaches to the docking problem have been developed, with focuses ranging from a fast database screening for potential binding partners (virtual screening) to predictions and scorings that are as accurate as possible. Depending on the rigidity of the protein's binding partner (e.g., a rigid protein vs. a highly flexible ligand), researchers can adopt suitable docking techniques, such as rigid body docking (e.g., via the Katchalski-Katzir algorithm [35]) or those accounting for ligand flexibility. The latter category of docking tools plays a key role in the discovery of small drug molecules, with prominent tools including DOCK [36], AutoDock Vina [37], Glide [38, 39], and GOLD [40].

2.1.4 Force field models

Protein interactions are based on electromagnetism, including self-interactions (to form stable conformations) as well as interactions with potential binding partners and the cell environment. Overall, the interactions within a biological system are implemented through many (comparatively small) energy contributions from various sources. Unfortunately, quantum-level descriptions of most systems are too expensive to be efficiently included in numerical applications like the prediction of protein structures or scoring of docking results. While such descriptions can still be used for refinement purposes downstream [41], less complex system descriptions are often employed in practice. In particular, simple models from classical mechanics (e.g., ball-and-spring models) are commonly used as a proxy for the original biological systems. These simpler models are generated on the basis of experimental observations or quantum-chemical expectations, with interactions being described through force field models. The latter commonly employ discrete energy functions over terms representing different bonded and non-bonded interactions between the atoms of the system. Well-known force fields for biomolecules include AMBER [42], GROMOS

[43], and CHARMM [44]. While the exact approaches to modeling interactions vary between force fields, most of the following components are usually included in the corresponding energy functions: angles between and lengths of covalent bonds (e.g., modeled as harmonic springs), bond rotations, induced dipoles, Van der Waals interactions, electrostatic interactions, and hydrophobic effects. Generally, these energy functions are constructed in a way that is comparatively easy to compute, especially in application settings where they need to be evaluated frequently, e.g., during protein docking or molecular dynamics simulations.

2.2 Protein electrostatics

Among the force field components, the electrostatic contributions take on a special role: Not only are they one of a few terms to model non-bonded interactions in the system but they also represent the only long-ranged interaction term. In other words, for most of the other components to take effect in the binding process, the binding partners already need to be in close proximity to each other. Hence, the long-range visibility of a protein or, more specifically, its electrostatic profile, is the key contributor to guided interactions with other biomolecules in the cell. While the random thermal motion of cell particles would bring binding partners close together every once in a while, the electrostatic profile of the protein helps increase these chances, e.g., by acting as a funnel to guide potential interaction partners [45, 46].

As we will see in a moment, the electrostatic profile of proteins is often heavily influenced by the solvents they are immersed in, especially if they happen to be as structured as water. As a result, the long-range visibility of these proteins can be partly amplified or dampened by the solvent. This is why an accurate description of the true electrostatic profile of the protein is crucial when the binding partners cannot be forced into contact through other means but have to meet naturally. This is particularly important, e.g., for drug molecules that need to find their way through the cell and into the vicinity of the protein they are supposed to bind to.

2.2.1 Local cavity model

In many force fields, the electrostatic contributions to the energy functions are based on simplistic models that can be computed efficiently. For this, the (continuous) charge distributions in the system are discretized by assigning (full or partial) charges to discrete locations of the Euclidean space, e.g., at the atom centers. This

so-called point charge model can be easily represented for N charge locations as a single function

\[
\rho(\mathbf{r}) = e_c \sum_{i=1}^{N} \zeta_i\, \delta(\mathbf{r} - \mathbf{r}_i), \tag{2.1}
\]

where e_c is the elementary charge and ζ_i and r_i are the dimensionless charge and position of the i-th point charge, respectively. δ denotes the Dirac delta function, which takes on a value of δ(x − a) = 0 for all x ≠ a and satisfies ∫ δ(x − a) dx = 1 if a is included in the domain of integration (or 0, otherwise).

For such a point charge representation, the electrostatic energy of a given system can be computed from Coulomb’s law as

\[
\frac{e_c^2}{4\pi\varepsilon_0\,\varepsilon} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j<i}}^{N} \frac{\zeta_i\, \zeta_j}{\lVert \mathbf{r}_i - \mathbf{r}_j \rVert_2}.
\]
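As a minimal illustration of this sum (a sketch only: the function name, the tuple-based positions, and the explicitly spelled-out SI constants are assumptions of this example rather than part of any of the packages discussed later):

```julia
# Coulomb energy of a point charge ensemble in a uniform medium with relative
# permittivity ε; positions are expected in meters, ζ holds the dimensionless charges.
const ec = 1.602176634e-19    # elementary charge in C
const ε₀ = 8.8541878128e-12   # vacuum permittivity in F/m

function coulomb_energy(positions, ζ, ε)
    W = 0.0
    for i in eachindex(positions), j in 1:i-1    # every unordered pair once (j < i)
        W += ζ[i] * ζ[j] / sqrt(sum(abs2, positions[i] .- positions[j]))
    end
    ec^2 / (4π * ε₀ * ε) * W
end

# Example: two elementary charges, 1 nm apart, in a medium with ε ≈ 78
coulomb_energy([(0.0, 0.0, 0.0), (1.0e-9, 0.0, 0.0)], [1.0, 1.0], 78.0)
```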

Although models of this form can be implemented efficiently, they are based on multiple simplifications that impact the predicted electrostatic profiles of proteins in the cell, especially from a long-range perspective. Biomolecules are typically immersed in water-based solvents and surrounded by mobile ions. However, the above model does not account for the electrostatic contributions arising from solvent molecule interdependencies, possibly leading to highly inaccurate energy estimates. While the missing contributions could be explicitly modeled in the Coulomb term, the solvent molecules alone vastly outnumber the biomolecule atoms in most cases, leading to huge and costly sums². What is more, the above model does not contain molecular boundaries, that is, the medium is assumed to permeate even the biomolecule.

A solution for the latter problem can be obtained by treating biomolecule and solvent as different domains of the Euclidean space and modeling electrostatic interactions for these domains separately. Figure 2.3 shows a schematic domain decomposition for this so-called cavity model, where the biomolecule is represented by a domain Ω ⊂ ℝ³ and the solvent is represented by a domain Σ := ℝ³ \ Ω̄, both of which are linked through an interface domain Γ := ∂Ω. Through the remainder of this manuscript, we will usually refer to Ω, Σ, and Γ as protein domain, solvent domain,

² Since electrostatic interactions are not limited in range, a highly accurate description would need to account for all solvent molecules in the whole Euclidean space. But even in a reasonably sized sub-domain, the difference in numbers exceeds several orders of magnitude.



Figure 2.3. – Schematic domain decomposition for the cavity model, with molecule domain Ω ⊂ ℝ³, solvent domain Σ := ℝ³ \ Ω̄, and molecular boundary Γ := ∂Ω

and surface domain, respectively. It should be noted here that such a sharp molecular boundary is itself, of course, only a simplified model. While different definitions for boundaries of this kind exist, e.g., based on the Van der Waals radii associated with the constituent atoms, we do not restrict our work to any specific definition as long as it can be sufficiently represented by Γ.

The cavity model can then be used in conjunction with the well-known Maxwell equations [47, 48], which, for vacuum electrostatics, assume the following form:

\[
\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}, \qquad \nabla \times \mathbf{E} = 0,
\]

or, equivalently,

\[
-\Delta \varphi = \frac{\rho}{\varepsilon_0},
\]

with E being the electric field, ρ being a charge distribution, ∆ := ∇ · ∇ being the Laplace operator, and ϕ being the electrostatic potential.

Boundary conditions for the electric field can be derived from these equations for the special case of a sufficiently smooth domain-separating boundary surface like Γ being present [49]. More specifically, assuming that ρ is well-behaved and continuous on the surface, the boundary conditions for any point r on Γ are given by

\[
\left(\mathbf{E}(\mathbf{r}^+) - \mathbf{E}(\mathbf{r}^-)\right) \cdot \hat{n}_{\mathbf{r}} = 0
\]

and

\[
\left(\mathbf{E}(\mathbf{r}^+) - \mathbf{E}(\mathbf{r}^-)\right) \times \hat{n}_{\mathbf{r}} = 0,
\]

where n̂_r represents the normal vector on Γ corresponding to r and where r⁺ ∈ Ω and r⁻ ∈ Σ are two points on the line through n̂_r, infinitesimally close to r, approaching r from both directions.

The Maxwell equations are then used in their macroscopic form along with suitable material equations to capture polarization effects of the medium. In the case of linear material equations as used for the formulations in Chapter 3, the macroscopic equations assume the following form:

−ε0ε(r)∆ϕ(r) = ρ(r). (2.2)

Here, ε(r) denotes the dielectric response function of the medium, where the polarization effects of the latter under the influence of the biomolecule are modeled through a spatial dependency. This model further assumes that the dielectric response is independent of direction (isotropy assumption) and that the solvent molecules move independently from each other (locality assumption).

Applying Eq. (2.2) to the separate domains of the cavity model, coupled through the above boundary conditions, finally yields a local cavity model for the protein electrostatics problem:

\[
\begin{cases}
-\varepsilon_0\,\varepsilon_\Omega\, \Delta \varphi_\Omega(\mathbf{r}) = \rho(\mathbf{r}) & \mathbf{r} \in \Omega \\[2pt]
\nabla \cdot \left( \varepsilon_\Sigma(\mathbf{r})\, \nabla \varphi_\Sigma(\mathbf{r}) \right) = 0 & \mathbf{r} \in \Sigma \\[2pt]
\hat{n}_{\mathbf{r}} \cdot \left[ \varepsilon_\Omega \nabla \varphi_\Omega(\mathbf{r}) - \varepsilon_\Sigma(\mathbf{r}) \nabla \varphi_\Sigma(\mathbf{r}) \right] = 0 & \mathbf{r} \in \Gamma \\[2pt]
\varphi_\Omega(\mathbf{r}) - \varphi_\Sigma(\mathbf{r}) = 0 & \mathbf{r} \in \Gamma
\end{cases} \tag{2.3}
\]

In this form, the potential and dielectric response functions are replaced by their domain-specific variants ϕ_Ω, ϕ_Σ, ε_Ω, and ε_Σ. Moreover, the dielectric response of the protein is assumed to be constant and all charges need to be contained in the protein domain. It is important to note here that the latter restriction is biologically unrealistic. As stated above, the water-based solvent in the cell contains mobile ions, not to mention other biomolecules. The solvent domain-bound equation in Eq. (2.3) is thus commonly replaced by a suitable Poisson-Boltzmann or Debye-Hückel equation.

2.2.2 Nonlocal cavity model

Although the local cavity model is a big improvement in terms of accurately modeling biomolecular systems, as compared to the simplistic Coulomb model, the locality assumption still poses a huge caveat for structured solvents such as water.



Figure 2.4. – Schematic representation of water molecules around the molecular surface (not to scale). Due to dipole moments and hydrogen bonds, the molecules cannot move independently.

In fact, the locality assumption does not hold here, as the dipole moments of and the hydrogen bond network formed by the water molecules lead to solvent interactions everywhere in space. When left undisturbed, the water molecules arrange themselves in a way that favors as many hydrogen bonds as possible. Each orientational change of any water molecule invariably threatens to break hydrogen bonds unless surrounding water molecules (and their neighbors) reorient themselves accordingly as well.

We can observe the same effect when introducing a charge distribution to the system, e.g., in the form of a protein (see Fig. 2.4). In this case, the water molecules closest to the molecular boundary (commonly referred to as first solvation shell) experience the strongest influence of the external electric field, imposing a pressure to align along the field (while maintaining the hydrogen bond network). The resulting changes in solvent molecule orientation amplify the dielectric shielding effect of the solvent in the presence of the protein and thus change its electrostatic profile and long-range visibility.

This effect is addressed in the framework of nonlocal electrostatics of solvation [50, 51], which can be implemented through an extension of the dielectric response function. In contrast to the local model, where ε_Σ only has one spatial dependency, namely, on the observation point, its nonlocal counterpart additionally accounts for solvent molecule interdependencies. For this, the previous ε_Σ(r) is basically replaced by

\[
\int_{\mathbb{R}^3} \varepsilon_\Sigma(\mathbf{r}, \boldsymbol{\xi})\, d\boldsymbol{\xi}
\]

or a similar integral. In many application scenarios, the dielectric response is, for example, modeled to depend on a length scale λ, representing the order on which the solvent interaction strengths decay.

All of these considerations lead to a modified version of the local cavity model³, i.e., the nonlocal cavity model for the protein electrostatics problem:

\[
\begin{cases}
-\varepsilon_0\,\varepsilon_\Omega\, \Delta \varphi_\Omega(\mathbf{r}) = \rho(\mathbf{r}) & \mathbf{r} \in \Omega \\[2pt]
\nabla_{\mathbf{r}} \cdot \boldsymbol{\psi}(\mathbf{r}) = 0 & \mathbf{r} \in \Sigma \\[2pt]
\hat{n}_{\mathbf{r}} \cdot \left[ \varepsilon_\Omega \nabla_{\mathbf{r}} \varphi_\Omega(\mathbf{r}) - \boldsymbol{\psi}(\mathbf{r}) \right] = 0 & \mathbf{r} \in \Gamma \\[2pt]
\varphi_\Omega(\mathbf{r}) - \varphi_\Sigma(\mathbf{r}) = 0 & \mathbf{r} \in \Gamma \\[2pt]
\displaystyle\int_\Sigma \varepsilon_\Sigma(\mathbf{r}, \boldsymbol{\xi}, \lambda)\, \nabla_{\boldsymbol{\xi}} \varphi_\Sigma(\boldsymbol{\xi})\, d\boldsymbol{\xi} =: \boldsymbol{\psi}(\mathbf{r}) &
\end{cases} \tag{2.4}
\]

A concrete solvent response function is then chosen with regard to the problem at hand. For instance, water-based systems are often modeled using the so-called Lorentzian model, including the nonlocal formulations presented in this manuscript. More specifically, said formulations employ the following dielectric response, as taken from [49, Theorem 3.4.1]:

\[
\varepsilon_\Sigma(\mathbf{r}, \boldsymbol{\xi}, \lambda) := \varepsilon_\infty\, \delta(\mathbf{r} - \boldsymbol{\xi}) + \frac{\varepsilon_\Sigma - \varepsilon_\infty}{4\pi\lambda^2} \frac{e^{-\frac{|\mathbf{r} - \boldsymbol{\xi}|}{\lambda}}}{|\mathbf{r} - \boldsymbol{\xi}|}. \tag{2.5}
\]

This model introduces two additional constants εΣ and ε∞, which represent the macroscopic dielectric constant of the solvent and the limit of the dielectric solvent response as λ → ∞, respectively.
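As a small illustration (with arbitrarily chosen numbers and a function name that is not part of the later solver packages), the smooth part of this response function, i.e., everything except the δ-contribution, which only takes effect under an integral, can be evaluated directly:

```julia
# Non-singular part of the Lorentzian nonlocal dielectric response from Eq. (2.5),
# evaluated for two distinct points r and ξ (3-tuples) and correlation length λ.
function lorentzian_response(r, ξ, λ, εΣ, εinf)
    d = sqrt(sum(abs2, r .- ξ))    # |r - ξ|, assumed to be strictly positive
    (εΣ - εinf) / (4π * λ^2) * exp(-d / λ) / d
end

# Illustrative call with made-up values (distances and λ in the same, arbitrary unit)
lorentzian_response((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 5.0, 78.0, 1.8)
```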

2.3 Numerical solvers

As we have established above, simplistic models such as those directly based on Coulomb's law often fail to describe the electrostatics in biomolecular systems accurately. Thus, more complex models, such as local formulations in the form of Poisson-Boltzmann or Debye-Hückel equations, have been in use for decades, solved with standard techniques from numerical analysis, including finite difference methods (FDM), finite element methods (FEM), and boundary element methods (BEM) [52, 53, 54]. However, an extension of these approaches to the nonlocal protein electrostatics problem was hampered due to its formulation as a system of partial integro-differential equations (cf. Eq. (2.4)), in contrast to the purely partial differential formulation of the local problem (e.g., Eq. (2.3)), which is generally easier to solve numerically.

³ With the same restrictions regarding charge terms in the solvent

This is why nonlocal electrostatics essentially remained an infeasible application for real-world biological systems until the mid-2000s, when it was shown that the nonlocal cavity model in combination with the Lorentzian dielectric response model (Eqs. 2.4 and 2.5) can be reduced to a system of partial differential equations as well [55]. Since then, the same standard techniques have also been applied to nonlocal formulations and confirmed the gains in accuracy as compared to the local approach [56, 57, 58, 59, 60, 61].

All of the presented methods naturally come with their own pros and cons. For example, the FDM replaces the differential equations by finite difference equations on a spatial grid. In order to represent the molecular boundary in sufficient detail, high-resolution grids are usually employed, which also need to account for a large patch of the solvent domain around the protein, leading to large grids. In addition, initial values for the boundary of the grid need to be provided, leaving another margin for errors. Both BEM and FEM are based on the method of weighted residuals and do not work on a spatial grid but rather utilize a less regular domain discretization by means of simple geometric objects.

As so-called Galerkin methods, both BEM and FEM rely on a weak formulation of the original equations and translate them into a discrete problem⁴. The FEM requires initial values for the outer boundary (like the FDM) and needs a mesh for both solvent and protein domains (e.g., using three-dimensional simplices such as tetrahedra). The BEM, on the other hand, exploits the fundamental solution of the involved differential operator to eliminate volume terms from the weak formulation. As a result, the original system can be represented through the remaining boundary integrals, which require only a discretization of the surface domain (e.g., as a union of two-dimensional simplices).

Of the presented methods, the BEM is particularly attractive for the protein electrostatics problem, because potentials (and derived quantities) can be easily computed for any point in space. While the method's hard requirement on a fundamental solution, which is not necessarily known, is a severe restriction for general problems, BEM formulations for the protein electrostatics problem already exist. Unfortunately, one obstacle for a feasible application of the BEM still remains and lays the foundation for this manuscript: Unlike FDM and FEM system matrices, which are generally sparse and can thus be represented in a space-efficient way, the ones involved in the BEM context are dense and asymmetric. What is more, the dimensions of the matrices directly depend on the number of simplices used for the corresponding surface mesh, resulting in matrices that often represent several terabytes of data.
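To put these figures into perspective with an illustrative mesh size (not one of the datasets evaluated later in this manuscript): a single dense, double-precision BEM matrix for a surface mesh with n = 500,000 triangles already occupies

\[
n^2 \cdot 8\,\mathrm{B} = 500{,}000^2 \cdot 8\,\mathrm{B} = 2\,\mathrm{TB}
\]

of memory, and the nonlocal formulations combine several matrix blocks of this kind.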

⁴ See, e.g., [62] for a more detailed introduction to both methods.

Different approaches have been suggested in the past to approximate the system matrix in some way or another, e.g., by means of wavelets or hierarchical cluster decompositions (cf. [49] for a brief discussion). Here, we instead aim at an exact representation of the system matrices that is still efficient enough to support a fast solving of the nonlocal BEM systems.


3 BEM system analysis

Before we can start implementing efficient BEM solvers for the nonlocal protein electrostatics problem, we need to analyze the corresponding linear systems of equations. Notwithstanding the inherent specificity of these analyses in this first step, we believe that many of the recurring patterns and components in the concrete BEM systems outlined in this chapter also appear in more general BEM formulations. If nothing else, BEM systems usually share the problem of large, asymmetric, and fully-populated matrices. Owing to their two-dimensional nature, these matrices require a potentially large amount of memory when represented explicitly, possibly exceeding available resources by multiple orders of magnitude. The memory footprint of BEM matrices can be substantially reduced by means of implicit matrix representations, where the information contained in the matrix is stored in a reduced form such that the explicit representation can be fully or partially recomputed on demand. Since not all matrices allow for suitable implicit representations, a thorough analysis of the BEM systems at hand is required in order to establish a general understanding of their structure. This will enable us to develop said representations in a later chapter and exploit the synergy effects from the integration of optimized solvers.

In this chapter, we first introduce the operators used in the continuous formulations of the presented BEM systems. We mostly work with their discretized counterparts for the derivation of implicit representations and the solver implementations but we still present both continuous and discrete versions in order to rearrange and transform the original formulations. This will be of particular importance for the representation formulas that allow us to compute potentials other than on the molecular surface, or any other derived quantities. However, we limit the level of detail to the required minimum and refer the interested reader to the corresponding original sources for the full derivations and definitions.

3.1 Background: Operators

The operators used in the continuous formulations for the local and nonlocal cavity models can be roughly divided into two categories: trace operators and boundary integral operators. In this section, we give a brief and mostly hand-waving

introduction to each of these operator types to facilitate a basic understanding of the presented systems. A more detailed introduction to the trace and boundary integral operators in the general context of boundary element methods or for a close link to the BEM systems presented here can be found in [63] or [49], respectively.

3.1.1 Trace operators

The formulations in this chapter involve different functions defined inside the protein domain Ω or solvent domain Σ. The trace operators restrict these functions to the boundary Γ, such that they can be incorporated into the boundary integral equations. Depending on the situation, the operators take the form of the Dirichlet trace operators γ₀^int and γ₀^ext or the Neumann trace operators γ₁^int and γ₁^ext, briefly introduced in the following paragraphs. The superscripts int and ext indicate here whether the internal or external variant of the respective operator is referred to.

The internal Dirichlet trace operator γ₀^int represents the limit of a function f(ξ̃) with ξ ∈ Γ, evaluated at a position ξ̃ in the protein domain close to the boundary, as ξ̃ → ξ. The internal Neumann trace operator γ₁^int takes the same limit of the directional derivative of f along n̂_ξ, the outward-facing unit normal vector with respect to ξ:

\[
\gamma_0^{\mathrm{int}} f(\boldsymbol{\xi}) := \lim_{\Omega \ni \tilde{\boldsymbol{\xi}} \to \boldsymbol{\xi} \in \Gamma} f(\tilde{\boldsymbol{\xi}}), \qquad \boldsymbol{\xi} \in \Gamma,
\]
\[
\gamma_1^{\mathrm{int}} f(\boldsymbol{\xi}) := \lim_{\Omega \ni \tilde{\boldsymbol{\xi}} \to \boldsymbol{\xi} \in \Gamma} \frac{\partial}{\partial \hat{n}_{\boldsymbol{\xi}}} f(\tilde{\boldsymbol{\xi}}), \qquad \boldsymbol{\xi} \in \Gamma.
\]

Analogously, the external Dirichlet and Neumann trace operators γ₀^ext and γ₁^ext represent the limit of a function f(ξ̃) or its directional derivative along n̂_ξ, respectively, evaluated at a position ξ̃ in the solvent domain close to the boundary as ξ̃ → ξ:

\[
\gamma_0^{\mathrm{ext}} f(\boldsymbol{\xi}) := \lim_{\Sigma \ni \tilde{\boldsymbol{\xi}} \to \boldsymbol{\xi} \in \Gamma} f(\tilde{\boldsymbol{\xi}}), \qquad \boldsymbol{\xi} \in \Gamma,
\]
\[
\gamma_1^{\mathrm{ext}} f(\boldsymbol{\xi}) := \lim_{\Sigma \ni \tilde{\boldsymbol{\xi}} \to \boldsymbol{\xi} \in \Gamma} \frac{\partial}{\partial \hat{n}_{\boldsymbol{\xi}}} f(\tilde{\boldsymbol{\xi}}), \qquad \boldsymbol{\xi} \in \Gamma.
\]

A visual illustration of the internal and external trace operators is given in Figure 3.1. While the trace operator notation is required for a correct transformation of the continuous BEM formulations, it will be handled implicitly in the discretized formulations and, for all intents and purposes, disappear.



Figure 3.1. – Schematic illustration of the internal (left) and external (right) trace operators for a point ξ at the molecular boundary Γ, its outward-facing unit normal vector n̂_ξ, and a point ξ̃ in the protein domain Ω or solvent domain Σ, respectively, approaching ξ.

3.1.2 Boundary integral operators

The second category of operators we introduce here is based on boundary integrals. As the name “boundary element method” suggests, these integrals are the main components of the BEM systems. Most operators here represent integrals of the general form

\[
\oint_\Gamma \mathcal{F}(\mathbf{r}, \boldsymbol{\xi})\, f(\mathbf{r})\, d\Gamma_{\mathbf{r}}
\]

for some functions F and f and where dΓ_r is an infinitesimally small surface element of Γ at position r.

The two most prominent boundary integral operators in the derivation of the BEM systems are the single layer potential operator Ṽ and the double layer potential operator W for the Laplace operator ∆:

\[
[\tilde{V} f](\boldsymbol{\xi}) := -\oint_\Gamma \left[\gamma_{0,\mathbf{r}}^{\mathrm{int}} G^L(\mathbf{r}, \boldsymbol{\xi})\right] f(\mathbf{r})\, d\Gamma_{\mathbf{r}}, \tag{3.1}
\]
\[
[W f](\boldsymbol{\xi}) := -\oint_\Gamma \left[\gamma_{1,\mathbf{r}}^{\mathrm{int}} G^L(\mathbf{r}, \boldsymbol{\xi})\right] f(\mathbf{r})\, d\Gamma_{\mathbf{r}}, \tag{3.2}
\]

with ξ ∈ Ω ∪ Σ ≡ ℝ³ \ Γ. The second subscript in the trace operators denotes that the trace is taken with respect to r. Here, F is replaced by the respective traces of the Laplace operator's fundamental solution

\[
G^L(\mathbf{r}, \boldsymbol{\xi}) := -\frac{1}{4\pi} \frac{1}{|\mathbf{r} - \boldsymbol{\xi}|}.
\]

The nonlocal formulations introduce two additional variants of Ṽ and W, namely, Ṽ^Y and W^Y, where G^L is replaced by the fundamental solution

\[
G^Y(\mathbf{r}, \boldsymbol{\xi}) := -\frac{1}{4\pi} \frac{e^{-\frac{|\mathbf{r} - \boldsymbol{\xi}|}{\lambda}}}{|\mathbf{r} - \boldsymbol{\xi}|}
\]

of the Yukawa operator

\[
L_\lambda := \Delta - \frac{1}{\lambda^2}.
\]
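Both fundamental solutions are simple enough to be written down as one-liners; the following sketch mirrors the sign conventions given above (the helper and function names are illustrative only):

```julia
# Fundamental solutions entering the single and double layer potential operators:
# G^L for the Laplace operator Δ and G^Y for the Yukawa operator Δ - 1/λ².
dist(r, ξ) = sqrt(sum(abs2, r .- ξ))

GL(r, ξ)    = -1 / (4π * dist(r, ξ))
GY(r, ξ, λ) = -exp(-dist(r, ξ) / λ) / (4π * dist(r, ξ))

# Example evaluation for two points one unit apart and λ = 2
GL((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)), GY((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 2.0)
```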

With the boundary integral operators presented above, the observation point ξ needs to reside inside the domains Ω or Σ. While this is sufficient for the representation formulas of the BEM systems (cf. Section 3.4), we also need to account for cases where ξ is located at the boundary Γ. For this purpose, the local and nonlocal BEM formulations further introduce boundary integral operators where F is fully restricted to the boundary:

\[
[V f](\boldsymbol{\xi}) = \left[\gamma_0^{\mathrm{int}} \bigl(\tilde{V} f\bigr)\right](\boldsymbol{\xi}),
\]
\[
[K f](\boldsymbol{\xi}) = \left[\gamma_0^{\mathrm{int}} \bigl(W f\bigr)\right](\boldsymbol{\xi}) + \bigl(\sigma(\boldsymbol{\xi}) - 1\bigr)\, f(\boldsymbol{\xi}),
\]

with ξ ∈ Γ and analogously defined variants V^Y for Ṽ^Y and K^Y for W^Y. The definition of the geometric quantity σ(ξ) can be found in [49, Eqs. (5.43) and (5.44)] and [63] and will henceforth be treated as a constant function:

\[
\sigma(\boldsymbol{\xi}) = \frac{1}{2}.
\]

Generally, this is valid for almost all ξ ∈ Γ as long as the protein surface is differentiable in the vicinity of ξ. For our implementations, we delegate this requirement to the generation of the surface models serving as input for our solvers and assume that the surfaces are always sufficiently smooth.

3.2 Local cavity model

We begin our analysis with the boundary integral equations of the local protein electrostatics problem. Primarily, this enables us to implement both local and nonlocal versions of the solvers later on, with the former serving as a reference. On top of that, we will see that the local case can be considered a blueprint for optimization ideas that can be transferred to the more complex nonlocal case.

3.2.1 Continuous formulation for the local cavity model

The boundary integral equations for the local cavity model introduced in Eq. (2.3) take the form originally presented in [49, Theorem 5.3.1], hereafter shown slightly rearranged and with explicit trace operators and function arguments for improved notational clarity:

\[
\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) \sigma(\xi) \left[\gamma_0^{\mathrm{int}} \varphi^*\right](\xi)
+ \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right) \left[K \gamma_0^{\mathrm{int}} \varphi^*\right](\xi)
= \left[K \gamma_0^{\mathrm{int}} \varphi_{\mathrm{mol}}\right](\xi)
- \sigma(\xi) \left[\gamma_0^{\mathrm{int}} \varphi_{\mathrm{mol}}\right](\xi)
- \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} \left[V \gamma_1^{\mathrm{int}} \varphi_{\mathrm{mol}}\right](\xi), \tag{3.3}
\]

\[
\left[V \gamma_1^{\mathrm{int}} \varphi^*\right](\xi) = \sigma(\xi) \left[\gamma_0^{\mathrm{int}} \varphi^*\right](\xi) + \left[K \gamma_0^{\mathrm{int}} \varphi^*\right](\xi), \tag{3.4}
\]

with ξ ∈ Γ. The local dielectric response of the solvent εΣ(ξ) has been modeled as a spatially-independent constant

εΣ(ξ) := εΣ

similar to the dielectric response of the protein εΩ for the sake of simplicity. A more complex treatment of the solvent response will be provided for the nonlocal formulation in the next section.

Furthermore, this and the upcoming formulations assume a decomposition of the electrostatic potential ϕ into a molecular potential ϕ_mol and a reaction field potential ϕ* (cf. [49, Eq. (2.78)]):

\[
\varphi(\xi) = \varphi_{\mathrm{mol}}(\xi) + \varphi^*(\xi).
\]

The traces of the molecular potentials γ₀^int ϕ_mol and γ₁^int ϕ_mol on the surface can be computed analytically for the point charge model introduced in Section 2.2 and will be given in their discretized form in the next section, along with the discretized boundary integral operators V and K.

The general approach to the local problem would be to first solve Eq. (3.3) for γ₀^int ϕ* and then use the result to determine γ₁^int ϕ* via Eq. (3.4). The actual potentials and further derived quantities can finally be computed using the representation formulas from Section 3.4.

3.2.2 Discrete formulation for the local cavity model

While the continuous formulation represents a theoretical framework for the local protein electrostatics problem that can be adapted for different approximations of the boundary operators, we will follow the original publication and base our further analysis on the same discretized systems as first presented in [49, Theorem 5.7.1]:

\[
\left( \left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) \sigma + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right) K \right) \hat{u} = (K - \sigma)\, \hat{u}_{\mathrm{mol}} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} V \hat{q}_{\mathrm{mol}} \tag{3.5}
\]
\[
V \hat{q} = (\sigma + K)\, \hat{u} \tag{3.6}
\]

This formulation assumes a triangulation of the molecular surface, that is, an approximation of the original surface domain Γ

\[
\bar{\Gamma} \approx \bigcup_{i=1}^{n} \bar{\tau}_i
\]

as an area-covering decomposition into n triangles τi.

Further, the formulation has been derived in terms of a standard point collocation method. Using piecewise constant basis functions

\[
\varphi_i^0(\xi) = \begin{cases} 1 & \xi \in \tau_i \\ 0 & \text{else} \end{cases} \tag{3.7}
\]

for all n surface triangles1, the Dirichlet and Neumann traces of the reaction field are approximated as

\[
\gamma_0^{\mathrm{int}} \varphi^*(\xi) \approx \sum_{i=1}^{n} \hat{u}_i\, \varphi_i^0(\xi), \tag{3.8}
\]
\[
\gamma_1^{\mathrm{int}} \varphi^*(\xi) \approx \sum_{i=1}^{n} \hat{q}_i\, \varphi_i^0(\xi), \tag{3.9}
\]

where û_i and q̂_i are the approximation coefficients û := [û₁, û₂, ..., û_n]^T and q̂ := [q̂₁, q̂₂, ..., q̂_n]^T, respectively. Then, the matrix

\[
\sigma := \frac{1}{2} I \tag{3.10}
\]

1The original formulations support any degree of polynomial approximation. While this changes the actual size of the involved matrices considerably, the asymptotic complexity still remains quadratic.

is a uniformly scaled identity matrix I of size n × n. The approximations of the boundary integral operators V and K take the form of n × n matrices V and K, respectively, with their elements being defined as

\[
(V)_{ij} := \frac{1}{4\pi} \int_{\tau_j} \frac{1}{|r - \xi_i|}\, d\Gamma_r, \tag{3.11}
\]
\[
(K)_{ij} := -\frac{1}{4\pi} \int_{\tau_j} \frac{(r - \xi_i) \cdot \hat{n}_j}{|r - \xi_i|^3}\, d\Gamma_r. \tag{3.12}
\]

Each matrix element represents a pair of triangles (τi, τj), with ξi being the centroid of τi and nˆj being the outward-facing unit normal vector of τj. These integrals can then be solved analytically by using the triangle decomposition technique described in [64]. In this manuscript, we will usually refer to V and K as potential matrices.
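To make the element-wise structure more tangible, the following Julia sketch computes single entries of V and K from a minimal triangle description. It replaces the analytic integration from [64] with a crude one-point (centroid) approximation and uses hypothetical type and function names, so it should be read as an illustration of Eqs. (3.11) and (3.12) rather than as the actual solver code; in particular, diagonal and adjacent-triangle entries require the analytic treatment.

using LinearAlgebra

# Minimal, hypothetical triangle type: three vertices in 3D space.
struct Triangle
    v1::Vector{Float64}
    v2::Vector{Float64}
    v3::Vector{Float64}
end
centroid(τ::Triangle)  = (τ.v1 .+ τ.v2 .+ τ.v3) ./ 3
normalvec(τ::Triangle) = normalize(cross(τ.v2 .- τ.v1, τ.v3 .- τ.v1))
area(τ::Triangle)      = 0.5 * norm(cross(τ.v2 .- τ.v1, τ.v3 .- τ.v1))

# One-point approximations of Eqs. (3.11) and (3.12): the integral over τ_j is
# replaced by area(τ_j) times the integrand at the centroid of τ_j. This breaks
# down for i == j (and nearly adjacent triangles), where the analytic triangle
# decomposition from [64] is used instead.
function V_entry(τi::Triangle, τj::Triangle)
    ξ, r = centroid(τi), centroid(τj)
    area(τj) / (4π * norm(r .- ξ))
end
function K_entry(τi::Triangle, τj::Triangle)
    ξ, r = centroid(τi), centroid(τj)
    -area(τj) * dot(r .- ξ, normalvec(τj)) / (4π * norm(r .- ξ)^3)
end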

Using the same point charge model as in Section 2.2, the discretized traces of the molecular potentials can be computed as vectors uˆmol and qˆmol with elements

\[
(\hat{u}_{\mathrm{mol}})_j := \frac{e_c}{4\pi \varepsilon_0 \varepsilon_\Omega} \sum_{i=1}^{N} \frac{\zeta_i}{|\xi_j - r_i|}, \tag{3.13}
\]
\[
(\hat{q}_{\mathrm{mol}})_j := -\frac{e_c}{4\pi \varepsilon_0 \varepsilon_\Omega} \sum_{i=1}^{N} \zeta_i\, \frac{(\xi_j - r_i) \cdot \hat{n}_j}{|\xi_j - r_i|^3}, \tag{3.14}
\]

where ξ_j and n̂_j are defined as above. It should be noted that the N point charges are not allowed to be located at the molecular surface, i.e., ξ_j ≠ r_i for all i, j. In practice, however, special care should be taken to ensure that this requirement also holds for the triangulated surface2.
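For illustration, Eqs. (3.13) and (3.14) translate almost literally into a short Julia routine. The sketch below assumes that the triangle centroids and unit normals as well as the charge positions and magnitudes are available as plain arrays, and it takes the constant prefactor e_c/(4πε₀ε_Ω) as a single argument; all names are hypothetical.

using LinearAlgebra

# ξ: triangle centroids ξ_j, normals: unit normals n̂_j (both of length n),
# r: point charge positions, ζ: partial charges (both of length N),
# pref: the constant prefactor e_c / (4π ε_0 ε_Ω) from Eqs. (3.13) and (3.14).
function molpotentials(ξ, normals, r, ζ, pref)
    n = length(ξ)
    umol, qmol = zeros(n), zeros(n)
    for j in 1:n, i in eachindex(r)
        d = ξ[j] .- r[i]                                         # ξ_j - r_i (assumed nonzero)
        umol[j] += pref * ζ[i] / norm(d)                         # Eq. (3.13)
        qmol[j] -= pref * ζ[i] * dot(d, normals[j]) / norm(d)^3  # Eq. (3.14)
    end
    umol, qmol
end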

The basic idea for solving the local protein electrostatics problem from the discretized systems remains the same as for its continuous counterpart: solve Eq. (3.5) for the approximation coefficients û, use the result to solve Eq. (3.6) for q̂, and then use û and q̂ with the representation formulas (Section 3.4) to compute the quantities of interest (e.g., potentials everywhere in space).

Looking back at Eqs. (3.5) and (3.6), we can immediately identify the potential matrices V and K as the central building blocks of both systems and as the main cause of the quadratic memory requirements. Both matrices are indeed fully populated and asymmetric, but at the same time, each element of a given matrix can, in principle, be computed from the same formula, given the corresponding pair of surface triangles. Moreover, all information required to fully construct V and K is contained in the list of surface elements plus Eqs. (3.11) and (3.12). Since there is no data

2More precisely, all charges must strictly reside inside the triangulated surface.

dependency between individual matrix elements whatsoever, all of them can be computed independently. These two crucial observations are potential foundations for both a memory-efficient, implicit matrix representation and the utilization of hardware-accelerated stream processing techniques.

For now, we assume that such an implicit representation for K and V exists and focus on another observation, namely, that these matrices are primarily involved in matrix-vector products. Of course, the occurrence of such products is by no means surprising for linear systems, but when we rearrange Eqs. (3.5) and (3.6) to

\[
\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) \sigma \hat{u} + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right) K \hat{u} = K \hat{u}_{\mathrm{mol}} - \sigma \hat{u}_{\mathrm{mol}} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} V \hat{q}_{\mathrm{mol}},
\]
\[
V \hat{q} = \sigma \hat{u} + K \hat{u}
\]

in order to single out V and K and then further use Eq. (3.10), the definition of σ, to yield

\[
\frac{1}{2}\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) \hat{u} + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right) K \hat{u} = K \hat{u}_{\mathrm{mol}} - \frac{1}{2} \hat{u}_{\mathrm{mol}} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} V \hat{q}_{\mathrm{mol}},
\]
\[
V \hat{q} = \frac{1}{2} \hat{u} + K \hat{u},
\]

we can bring all matrix-vector products in the local systems into the forms V x or Kx for some vectors x. As a consequence, an efficient matrix-vector product implementation alongside a suitable implicit representation for both V and K might be the key to solving these systems in significantly less than quadratic memory while retaining the full system matrix information.
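As a first sketch of such an implicit representation, the following Julia type stores only the list of surface elements together with an element function and recomputes every entry on demand during the matrix-vector product. The names are hypothetical and the sketch ignores precision and performance concerns; the actual implementations are developed in Chapters 5 to 8.

using LinearAlgebra

# Implicit n×n potential matrix: keeps the surface elements and an element
# function (τi, τj) -> A_ij instead of n² floating-point values.
struct ImplicitPotentialMatrix{E, F} <: AbstractMatrix{Float64}
    elements::Vector{E}   # surface triangles
    entry::F              # e.g., the V_entry or K_entry sketch from above
end
Base.size(A::ImplicitPotentialMatrix) = (length(A.elements), length(A.elements))
Base.getindex(A::ImplicitPotentialMatrix, i::Int, j::Int) =
    A.entry(A.elements[i], A.elements[j])

# Matrix-free product y = A * x: every element is recomputed on the fly, so the
# memory footprint stays linear in n while the result remains exact.
function LinearAlgebra.mul!(y::AbstractVector, A::ImplicitPotentialMatrix, x::AbstractVector)
    n = length(x)
    for i in 1:n
        acc = zero(eltype(y))
        for j in 1:n
            acc += A[i, j] * x[j]
        end
        y[i] = acc
    end
    y
end

A product A * x (or mul!(y, A, x)) then evaluates each of the n² elements exactly once per product without ever materializing the matrix.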

Any local BEM solver for which this proposition holds has to fulfill at least two requirements: first, it obviously is not allowed to store the system matrix explicitly or modify it at any point in time, and second, it must solve the system by means of matrix-vector products. Fortunately, these requirements loosely describe a whole class of well-known solvers, henceforth referred to as matrix-free solvers, a few of which we will present in Chapter 5.

Many established implementations of such solvers exist for a variety of programming languages. Hence, it would be highly convenient if we were able to use (and eventually compare) them with the implicit system representations tailored to our systems rather than implementing our own custom solvers. For this purpose, we henceforth assume that all potential linear system solvers comply with the following common interface:

Definition 3.2.1. Minimal interface for linear system solvers

A solver for a linear system of the general form

Ax = b

with coefficient matrix (or system matrix) A, input vector (or system inputs) x, and output vector (or system outputs) b expects both A and b as arguments and computes (or approximates) x.

In accordance with the above interface, we try to keep all linear systems in the same general form Ax = b, and, in particular, we try to represent the system matrix as a single matrix for as long as possible, in order to facilitate its later use as an argument for the solver. Hence, we demand an implicit single-matrix representation for each system in spite of the previous attempt to categorically single out V and K. We consider the system outputs to be a deliberate exception to this rule. For one, we usually refrain from creating implicit representations for vectors in the context of this work. Thus, we need to compute b before calling the solver either way. In the end, this still enables – and even motivates – the exploitation of the matrix-vector products for V and K that we have described earlier, while still reusing their respective implicit representations.
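To illustrate how such an operator meets the interface from Definition 3.2.1, the following self-contained Julia sketch defines a lazily evaluated stand-in matrix (with a toy kernel instead of Eqs. (3.11) and (3.12)) and hands it to a matrix-free solver from the third-party IterativeSolvers.jl package, which relies only on matrix-vector products, size, and eltype; the exact keyword set of gmres varies between package versions, and the operator shown here is purely illustrative.

using LinearAlgebra, IterativeSolvers

# Stand-in for an implicit BEM operator: entries are computed on demand from a
# toy kernel and never stored as a whole matrix.
struct LazyKernelMatrix <: AbstractMatrix{Float64}
    centers::Vector{Float64}
end
Base.size(A::LazyKernelMatrix) = (length(A.centers), length(A.centers))
Base.getindex(A::LazyKernelMatrix, i::Int, j::Int) =
    i == j ? 2.0 : inv(1.0 + abs(A.centers[i] - A.centers[j]))

# The solver only ever sees the operator through products A * x, matching the
# minimal interface of Definition 3.2.1.
A = LazyKernelMatrix(collect(range(0.0, 1.0; length = 100)))
b = ones(100)
x = gmres(A, b)                     # keyword names (restart, maxiter, ...) vary by version
@show norm(A * x - b)               # residual of the approximate solution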

Taking all of these considerations into account, the rearranged discrete formulation for the local cavity model presents itself in the following form:

\[
\left( \frac{1}{2}\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) I + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right) K \right) \hat{u} = K \hat{u}_{\mathrm{mol}} - \frac{1}{2} \hat{u}_{\mathrm{mol}} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} V \hat{q}_{\mathrm{mol}}, \tag{3.15}
\]
\[
V \hat{q} = \frac{1}{2} \hat{u} + K \hat{u}. \tag{3.16}
\]

3.3 Nonlocal cavity model

The nonlocal BEM system shares multiple similarities with its local counterpart, albeit being considerably more complex. In a first step, we highlight the similarities and introduce the differences for the continuous formulation, before analyzing structural patterns in a second step.

3.3.1 Continuous formulation for the nonlocal cavity model

The continuous BEM formulation for the nonlocal cavity model introduced in Eq. (2.4) and the Lorentzian water model from Eq. (2.5) takes the form of the following boundary integral equations, as originally presented in [49, Theorem 5.4.1]:

\[
\begin{pmatrix}
1 - \sigma - K^Y & \frac{\varepsilon_\Omega}{\varepsilon_\infty} V^Y - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\left(V^Y - V\right) & \frac{\varepsilon_\infty}{\varepsilon_\Sigma}\left(K^Y - K\right) \\
\sigma + K & -V & 0 \\
0 & \frac{\varepsilon_\Omega}{\varepsilon_\infty} V & 1 - \sigma - K
\end{pmatrix}
\begin{pmatrix}
\gamma_0^{\mathrm{int}} \varphi^* \\
\gamma_1^{\mathrm{int}} \varphi^* \\
\gamma_0^{\mathrm{ext}} \Psi
\end{pmatrix}
=
\begin{pmatrix}
\beta \\ 0 \\ 0
\end{pmatrix}
\tag{3.17}
\]

with β being defined as

\[
\beta := -\left(1 - \sigma - K^Y + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\left(K^Y - K\right)\right)\left[\gamma_0^{\mathrm{int}} \varphi_{\mathrm{mol}}\right]
- \left(\frac{\varepsilon_\Omega}{\varepsilon_\infty} V^Y - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\left(V^Y - V\right)\right)\left[\gamma_1^{\mathrm{int}} \varphi_{\mathrm{mol}}\right]
\]

and with the dielectric constants of the protein and solvent, ε_Ω and ε_Σ, having the same meanings as before. Analogously, the quantity σ, the traces γ₀^int ϕ*, γ₁^int ϕ*, γ₀^int ϕ_mol, and γ₁^int ϕ_mol as well as the boundary integral operators V and K remain the same as in the local formulations. Please note that the system is shown here without function arguments ξ ∈ Γ.

The Lorentzian model for the nonlocal dielectric response of water further introduces the constant ε_∞ alongside the boundary integral operators V^Y and K^Y. The dielectric constant ε_∞ represents the limit of the dielectric solvent response as the correlation length λ approaches ∞ (cf. Section 2.2.2). After solving Eq. (3.17) for the internal traces γ₀^int ϕ* and γ₁^int ϕ* as well as the newly introduced external trace γ₀^ext Ψ, we can use the representation formulas from Section 3.4 to compute derived quantities such as further potentials and energies.

3.3.2 Discrete formulation for the nonlocal cavity model

As originally presented in [49, Theorem 6.7.2], the continuous system for the nonlocal cavity model takes the following discretized form:

\[
\begin{pmatrix}
I - \sigma - K^Y & \frac{\varepsilon_\Omega}{\varepsilon_\infty} V^Y - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\left(V^Y - V\right) & \frac{\varepsilon_\infty}{\varepsilon_\Sigma}\left(K^Y - K\right) \\
\sigma + K & -V & 0 \\
0 & \frac{\varepsilon_\Omega}{\varepsilon_\infty} V & I - \sigma - K
\end{pmatrix}
\begin{pmatrix}
\hat{u} \\ \hat{q} \\ \hat{w}
\end{pmatrix}
=
\begin{pmatrix}
\beta \\ 0 \\ 0
\end{pmatrix}
\tag{3.18}
\]

where β is defined as

\[
\beta := -\left(I - \sigma - K^Y + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\left(K^Y - K\right)\right) \hat{u}_{\mathrm{mol}}
- \left(\frac{\varepsilon_\Omega}{\varepsilon_\infty} V^Y - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\left(V^Y - V\right)\right) \hat{q}_{\mathrm{mol}}.
\]

This formulation assumes the same surface domain decomposition into n triangles τi and uses the same point collocation method with constant basis functions as in the local case. On that premise, the n × n matrices V , K, and σ, with their definitions being given in Eqs. (3.11), (3.12), and (3.10), respectively, as well as the discretized traces of the molecular potentials uˆmol and qˆmol, with their respective definitions being given in Eqs. (3.13) and (3.14), remain the same.

The third input trace γ₀^ext Ψ is approximated analogously to γ₀^int ϕ* and γ₁^int ϕ* as

\[
\gamma_0^{\mathrm{ext}} \Psi(\xi) \approx \sum_{i=1}^{n} \hat{w}_i\, \varphi_i^0(\xi),
\]

with approximation coefficients ŵ = [ŵ₁, ŵ₂, ..., ŵ_n]^T. The approximations of the new boundary integral operators V^Y and K^Y take the forms of n × n matrices as well, henceforth referred to as potential matrices V^Y and K^Y, respectively, with their elements being defined as

\[
\left(V^Y\right)_{ij} := \frac{1}{4\pi} \int_{\tau_j} \frac{e^{-\frac{|r - \xi_i|}{\lambda}}}{|r - \xi_i|}\, d\Gamma_r \tag{3.19}
\]
and

\[
\left(K^Y\right)_{ij} := -\frac{1}{4\pi} \int_{\tau_j} \frac{\lambda + |r - \xi_i|}{\lambda\, |r - \xi_i|^2}\, e^{-\frac{|r - \xi_i|}{\lambda}}\, \frac{(r - \xi_i) \cdot \hat{n}_j}{|r - \xi_i|}\, d\Gamma_r. \tag{3.20}
\]

Overall, V^Y and K^Y share many fundamental properties with V and K: both matrices are also fully populated and asymmetric, and each element represents the triangle pair (τ_i, τ_j) without data dependencies between any two elements whatsoever. Also, the matrices can be fully constructed from the list of surface elements and the respective formula from Eqs. (3.19) and (3.20). For these reasons, we expect to find a similar implicit representation for all potential matrices and efficient implementation options for their matrix-vector products.

In contrast to the potential matrices V and K (cf. Section 3.2.2), we do not implement analytic solutions for the integrals of V^Y and K^Y. While they can still be approximated by numerical quadrature, special care should be taken with regard to the singularities at |r − ξ| = 0. In order to avoid numeric instabilities, we refrain

from computing the elements of V^Y and K^Y directly. Instead, we work with the matrices (V^Y − V) and (K^Y − K), respectively, as they possess finite limits as |r − ξ| → 0. In addition to that, we use alternating series representations of the integrands when approximating the integrals for small |r − ξ| to avoid cancellations. The matrix elements are then computed in all our solvers by a seven-point quadrature with quadrature points and weights taken from [65].
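As an illustration of this regularization, the sketch below evaluates the radial part of the regularized single layer integrand, (e^{−d/λ} − 1)/d, switching to a truncated alternating series in x = d/λ for small distances. The threshold and the number of series terms are illustrative choices rather than the values used in our solvers, and the surface integration and the 1/(4π) prefactor are omitted; the double layer counterpart for K^R is handled analogously.

# Radial part of the regularized Yukawa single layer integrand,
#     f(d) = (exp(-d/λ) - 1) / d,
# which remains finite as d → 0 (its limit is -1/λ). Direct evaluation suffers
# from cancellation for small d/λ, so a truncated alternating series in
# x = d/λ is used there:
#     f(d) = (1/λ) * Σ_{k ≥ 1} (-1)^k x^(k-1) / k!
function regularized_yukawa_sl(d::Real, λ::Real; xmin = 0.1, nterms = 8)
    x = d / λ
    if x >= xmin
        return (exp(-x) - 1) / d          # safe to evaluate directly
    end
    acc, term = 0.0, 1.0                  # term = x^(k-1) / k!, starting at k = 1
    for k in 1:nterms
        acc += isodd(k) ? -term : term    # alternating signs (-1)^k
        term *= x / (k + 1)               # advance to x^k / (k+1)!
    end
    acc / λ
end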

To reflect these changes in the nonlocal BEM system, we define two new potential matrices V^R and K^R as

\[
V^R := V^Y - V, \qquad K^R := K^Y - K
\]

with elements

\[
\left(V^R\right)_{ij} := \left(V^Y\right)_{ij} - (V)_{ij}, \tag{3.21}
\]
\[
\left(K^R\right)_{ij} := \left(K^Y\right)_{ij} - (K)_{ij} \tag{3.22}
\]

and reformulate the system in the same way as its local counterpart. More specifically, we single out the matrix-vector products of the potential matrices V^R, V, K^R, and K and use the definition of σ from Eq. (3.10) to rearrange Eq. (3.18) to

\[
\begin{pmatrix}
\frac{1}{2} I - K^R - K & \left(\frac{\varepsilon_\Omega}{\varepsilon_\infty} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) V^R + \frac{\varepsilon_\Omega}{\varepsilon_\infty} V & \frac{\varepsilon_\infty}{\varepsilon_\Sigma} K^R \\
\frac{1}{2} I + K & -V & 0 \\
0 & \frac{\varepsilon_\Omega}{\varepsilon_\infty} V & \frac{1}{2} I - K
\end{pmatrix}
\begin{pmatrix}
\hat{u} \\ \hat{q} \\ \hat{w}
\end{pmatrix}
=
\begin{pmatrix}
\beta \\ 0 \\ 0
\end{pmatrix}
\tag{3.23}
\]

\[
\begin{aligned}
\beta := {} & K \hat{u}_{\mathrm{mol}} + \left(1 - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) K^R \hat{u}_{\mathrm{mol}} - \frac{1}{2} \hat{u}_{\mathrm{mol}} \\
& - \frac{\varepsilon_\Omega}{\varepsilon_\infty} V \hat{q}_{\mathrm{mol}} + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - \frac{\varepsilon_\Omega}{\varepsilon_\infty}\right) V^R \hat{q}_{\mathrm{mol}}.
\end{aligned}
\tag{3.24}
\]
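The rearranged system only ever requires products of the four potential matrices with vectors. The following sketch expresses one application of the 3n × 3n system matrix from Eq. (3.23) to a partitioned vector (û, q̂, ŵ) purely in terms of such products; the operators V, K, V^R, and K^R may be implicit matrices as sketched in Section 3.2.2, and all names are hypothetical.

# One application of the nonlocal system matrix from Eq. (3.23) to a block
# vector (u, q, w), expressed solely through products with V, K, V^R, and K^R.
# εΩ, εΣ, ε∞ are the dielectric constants introduced in Section 3.3.
function nonlocal_matvec(V, K, VR, KR, u, q, w, εΩ, εΣ, ε∞)
    Ku, KRu = K * u, KR * u          # products with the first block column
    Vq, VRq = V * q, VR * q          # products with the second block column
    Kw, KRw = K * w, KR * w          # products with the third block column
    y1 = 0.5 .* u .- KRu .- Ku .+ (εΩ/ε∞ - εΩ/εΣ) .* VRq .+ (εΩ/ε∞) .* Vq .+ (ε∞/εΣ) .* KRw
    y2 = 0.5 .* u .+ Ku .- Vq
    y3 = (εΩ/ε∞) .* Vq .+ 0.5 .* w .- Kw
    vcat(y1, y2, y3)
end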

3.4 Derived quantities

The presented systems provide the approximation coefficients û and q̂ for the reaction field potentials at the molecular boundary, γ₀^int ϕ* and γ₁^int ϕ*, respectively. In the nonlocal setting, we additionally obtain coefficients ŵ for the function γ₀^ext Ψ.

These coefficients can be used with the representation formulas to compute the potentials everywhere in space and, consequently, other derived quantities. Here, we will use the computation of reaction field energies for both settings as an example for such derived quantities.

3.4.1 Interior potentials

The electrostatic potential ϕΩ(ξ) for any point ξ inside the protein domain can be computed in both local and nonlocal settings through the representation formula for the reaction field [49, Theorem 5.3.1]

\[
\varphi^*(\xi) = \left[\tilde{V} \gamma_1^{\mathrm{int}} \varphi^*\right](\xi) - \left[W \gamma_0^{\mathrm{int}} \varphi^*\right](\xi), \qquad \xi \in \Omega
\]

and the decomposition

\[
\varphi_\Omega(\xi) = \varphi^*(\xi) + \varphi_{\mathrm{mol}}(\xi), \qquad \xi \in \Omega.
\]

Remembering the approximations for γ₀^int ϕ* and γ₁^int ϕ* in Eqs. (3.8) and (3.9), we can insert them into the definitions of Ṽ and W in Eqs. (3.1) and (3.2). For Ṽ, for example, this would lead to

" n !# h ˜  int ∗i ˜ X 0 V γ1 ϕ (ξ) ≈ V qˆiϕi (ξ) i=1 I n !  int L  X 0 = − γ0,rG (r, ξ) qˆiϕi (r) dΓr. Γ i=1

Further decomposing the boundary integral into the sum over all individual surface triangles τ_j yields

\[
\left[\tilde{V} \gamma_1^{\mathrm{int}} \varphi^*\right](\xi) \approx -\sum_{j=1}^{n} \int_{\tau_j} \left( \gamma_{0,r}^{\mathrm{int}} G^L(r, \xi) \right) \underbrace{\left( \sum_{i=1}^{n} \hat{q}_i\, \varphi_i^0(r) \right)}_{=\, \hat{q}_j} d\Gamma_r.
\]

The inner sum provides a nonzero value if and only if i = j, owing to the definition of the basis function ϕ_i^0 in Eq. (3.7). More specifically, the sum reduces to q̂_j, thus resulting in

\[
\left[\tilde{V} \gamma_1^{\mathrm{int}} \varphi^*\right](\xi) \approx \sum_{j=1}^{n} \hat{q}_j \left[ -\int_{\tau_j} \gamma_{0,r}^{\mathrm{int}} G^L(r, \xi)\, d\Gamma_r \right],
\]

which can be rewritten in terms of vector products. For this, we define a row vector Ṽ_ξ whose elements represent the surface element integrals

\[
\left(\tilde{V}_\xi\right)_j := -\int_{\tau_j} \gamma_{0,r}^{\mathrm{int}} G^L(r, \xi)\, d\Gamma_r
\]

that can be computed the same way as Vij in Eq. (3.11), except that the load point ξ is now located in the protein domain Ω rather than on its surface.

Performing the same procedure for W and defining a corresponding row vector W_ξ with elements
\[
\left(W_\xi\right)_j := -\int_{\tau_j} \gamma_{1,r}^{\mathrm{int}} G^L(r, \xi)\, d\Gamma_r
\]

that can be computed analogously to Kij in Eq. (3.12), we can finally express the representation formulas for the interior potentials as

\[
\varphi^*(\xi) \approx \tilde{V}_\xi \hat{q} - W_\xi \hat{u}, \qquad \xi \in \Omega,
\]
\[
\varphi_\Omega(\xi) = \varphi^*(\xi) + \varphi_{\mathrm{mol}}(\xi), \qquad \xi \in \Omega.
\]
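In code, the interior potentials thus reduce to two dot products per observation point. The following minimal Julia sketch assumes that the row vectors Ṽ_ξ and W_ξ, the coefficients û and q̂, and the molecular potential ϕ_mol(ξ) have already been computed; the function names are hypothetical.

using LinearAlgebra

# Interior potentials at an observation point ξ ∈ Ω (cf. the formulas above):
# Vrow and Wrow are the row vectors Ṽ_ξ and W_ξ (length n), u and q correspond
# to the solved coefficients û and q̂, and φmol is the molecular potential at ξ.
reaction_potential(Vrow, Wrow, u, q) = dot(Vrow, q) - dot(Wrow, u)
interior_potential(Vrow, Wrow, u, q, φmol) = reaction_potential(Vrow, Wrow, u, q) + φmol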

3.4.2 Exterior potentials

The electrostatic potential ϕΣ(ξ) for a given point ξ in the solvent domain can be computed from separate representation formulas

\[
\varphi_\Sigma^{\mathrm{local}}(\xi) = \left[W \gamma_0^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}})\right](\xi) - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} \left[\tilde{V} \gamma_1^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}})\right](\xi), \qquad \xi \in \Sigma
\]

for the local version, as originally presented in [49, Theorem 5.3.1], and

\[
\begin{aligned}
\varphi_\Sigma^{\mathrm{nonlocal}}(\xi) = {} & \frac{\varepsilon_\infty}{\varepsilon_\Sigma} \left[\left(\tilde{V}^Y - \tilde{V}\right) \gamma_1^{\mathrm{ext}} \left(\Psi + \frac{\varepsilon_\Omega}{\varepsilon_\infty} \varphi_{\mathrm{mol}}\right)\right](\xi) \\
& - \frac{\varepsilon_\infty}{\varepsilon_\Sigma} \left[\left(W^Y - W\right) \gamma_0^{\mathrm{ext}} \left(\Psi + \frac{\varepsilon_\Omega}{\varepsilon_\infty} \varphi_{\mathrm{mol}}\right)\right](\xi) \\
& - \left[\tilde{V}^Y \gamma_1^{\mathrm{ext}} \varphi_\Sigma\right](\xi) + \left[W^Y \gamma_0^{\mathrm{ext}} \varphi_\Sigma\right](\xi), \qquad \xi \in \Sigma
\end{aligned}
\tag{3.25}
\]

for the nonlocal version, originally presented in [49, Eq. (5.160)].

Remembering the definitions of û_mol and q̂_mol from Eqs. (3.13) and (3.14) and using the same Ṽ_ξ and W_ξ as in the last section, we can approximate the local electrostatic potential for a given ξ from the approximation coefficients û and q̂ via

\[
\varphi_\Sigma^{\mathrm{local}}(\xi) \approx W_\xi (\hat{u} + \hat{u}_{\mathrm{mol}}) - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} \tilde{V}_\xi (\hat{q} + \hat{q}_{\mathrm{mol}}), \qquad \xi \in \Sigma.
\]

Unfortunately, in the nonlocal case we still need to express the representation formula in terms of the quantities γ₀^int ϕ*, γ₁^int ϕ*, and γ₀^ext Ψ we have obtained from solving Eq. (3.17). For this, we first remember the boundary conditions of the nonlocal formulation [49, Eqs. (5.174) and (5.175)]

\[
\gamma_0^{\mathrm{ext}} \varphi_\Sigma = \gamma_0^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}}) \tag{3.26}
\]
\[
\gamma_1^{\mathrm{ext}} \psi_\Sigma = \varepsilon_0 \varepsilon_\Omega\, \gamma_1^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}}) \tag{3.27}
\]
as well as the approximation for the boundary conditions in [49, Eq. (5.181)]

\[
\gamma_1^{\mathrm{ext}} \psi_\Sigma = \varepsilon_0 \left( \varepsilon_\infty \gamma_1^{\mathrm{ext}} \Psi + \varepsilon_\Omega \gamma_1^{\mathrm{ext}} \varphi_{\mathrm{mol}} \right)
\approx \varepsilon_0 \varepsilon_\infty\, \gamma_1^{\mathrm{ext}} \varphi_\Sigma. \tag{3.28}
\]

Inserting Eq. (3.28) into Eq. (3.27), we find our new (approximate) second boundary condition as
\[
\gamma_1^{\mathrm{ext}} \varphi_\Sigma \approx \frac{\varepsilon_\Omega}{\varepsilon_\infty} \gamma_1^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}}). \tag{3.29}
\]

Further assuming smooth boundary conditions for the molecular potential, that is, γ₀^ext ϕ_mol = γ₀^int ϕ_mol, we can now reformulate Eq. (3.25) – here again shown without function arguments ξ – as suggested above:

\[
\begin{aligned}
\varphi_\Sigma^{\mathrm{nonlocal}}(\xi)
& \overset{(3.26)}{=} \frac{1}{\varepsilon_\Sigma} \left(\tilde{V}^Y - \tilde{V}\right)\left[\varepsilon_\infty \gamma_1^{\mathrm{ext}} \Psi + \varepsilon_\Omega \gamma_1^{\mathrm{ext}} \varphi_{\mathrm{mol}}\right] - \tilde{V}^Y \left[\gamma_1^{\mathrm{ext}} \varphi_\Sigma\right] \\
& \qquad - \frac{1}{\varepsilon_\Sigma} \left(W^Y - W\right)\left[\varepsilon_\infty \gamma_0^{\mathrm{ext}} \Psi + \varepsilon_\Omega \gamma_0^{\mathrm{int}} \varphi_{\mathrm{mol}}\right] + W^Y \left[\gamma_0^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}})\right] \\
& \overset{(3.28)}{\approx} \frac{\varepsilon_\infty}{\varepsilon_\Sigma} \left(\tilde{V}^Y - \tilde{V}\right)\left[\gamma_1^{\mathrm{ext}} \varphi_\Sigma\right] - \tilde{V}^Y \left[\gamma_1^{\mathrm{ext}} \varphi_\Sigma\right] \\
& \qquad - \frac{1}{\varepsilon_\Sigma} \left(W^Y - W\right)\left[\varepsilon_\infty \gamma_0^{\mathrm{ext}} \Psi + \varepsilon_\Omega \gamma_0^{\mathrm{int}} \varphi_{\mathrm{mol}}\right] + W^Y \left[\gamma_0^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}})\right] \\
& \overset{(3.29)}{\approx} \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} \left(\tilde{V}^Y - \tilde{V}\right)\left[\gamma_1^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}})\right] - \frac{\varepsilon_\Omega}{\varepsilon_\infty} \tilde{V}^Y \left[\gamma_1^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}})\right] \\
& \qquad - \frac{1}{\varepsilon_\Sigma} \left(W^Y - W\right)\left[\varepsilon_\infty \gamma_0^{\mathrm{ext}} \Psi + \varepsilon_\Omega \gamma_0^{\mathrm{int}} \varphi_{\mathrm{mol}}\right] + W^Y \left[\gamma_0^{\mathrm{int}} (\varphi^* + \varphi_{\mathrm{mol}})\right]
\end{aligned}
\]

Using û, q̂, and ŵ, the discretized molecular potentials û_mol and q̂_mol, and further defining new row vectors Ṽ_ξ^Y and W_ξ^Y analogously to Ṽ_ξ and W_ξ, that is,

\[
\left(\tilde{V}_\xi^Y\right)_j := -\int_{\tau_j} \gamma_{0,r}^{\mathrm{int}} G^Y(r, \xi)\, d\Gamma_r, \qquad
\left(W_\xi^Y\right)_j := -\int_{\tau_j} \gamma_{1,r}^{\mathrm{int}} G^Y(r, \xi)\, d\Gamma_r,
\]

which can be computed the same way as (V^Y)_ij and (K^Y)_ij with the new load point ξ, we can express the nonlocal representation formula for the exterior electrostatic potential by the following approximation

\[
\begin{aligned}
\varphi_\Sigma^{\mathrm{nonlocal}}(\xi) \approx {} & \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} \left(\tilde{V}_\xi^Y - \tilde{V}_\xi\right)(\hat{q} + \hat{q}_{\mathrm{mol}}) - \frac{\varepsilon_\Omega}{\varepsilon_\infty} \tilde{V}_\xi^Y (\hat{q} + \hat{q}_{\mathrm{mol}}) \\
& - \frac{1}{\varepsilon_\Sigma} \left(W_\xi^Y - W_\xi\right)(\varepsilon_\infty \hat{w} + \varepsilon_\Omega \hat{u}_{\mathrm{mol}}) + W_\xi^Y (\hat{u} + \hat{u}_{\mathrm{mol}}), \qquad \xi \in \Sigma.
\end{aligned}
\]

In order to take the discussion on the improved computation of V^Y and K^Y in Section 3.3.2 into account, we define

\[
\tilde{V}_\xi^R := \tilde{V}_\xi^Y - \tilde{V}_\xi, \qquad W_\xi^R := W_\xi^Y - W_\xi
\]

and rearrange our approximation of the nonlocal representation formula in such a way that we only need to compute Ṽ_ξ^R and W_ξ^R instead of Ṽ_ξ^Y and W_ξ^Y. Then, the final approximations of the local and nonlocal representation formulas for the exterior potential take the following form:

\[
\varphi_\Sigma^{\mathrm{local}}(\xi) \approx W_\xi (\hat{u} + \hat{u}_{\mathrm{mol}}) - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} \tilde{V}_\xi (\hat{q} + \hat{q}_{\mathrm{mol}}), \qquad \xi \in \Sigma,
\]
\[
\begin{aligned}
\varphi_\Sigma^{\mathrm{nonlocal}}(\xi) \approx {} & \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - \frac{\varepsilon_\Omega}{\varepsilon_\infty}\right) \tilde{V}_\xi^R (\hat{q} + \hat{q}_{\mathrm{mol}}) - \frac{\varepsilon_\Omega}{\varepsilon_\infty} \tilde{V}_\xi (\hat{q} + \hat{q}_{\mathrm{mol}}) \\
& + W_\xi^R \left(\hat{u} - \frac{\varepsilon_\infty}{\varepsilon_\Sigma} \hat{w} + \left(1 - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) \hat{u}_{\mathrm{mol}}\right) + W_\xi (\hat{u} + \hat{u}_{\mathrm{mol}}), \qquad \xi \in \Sigma.
\end{aligned}
\]

3.4.3 Reaction field energies

The local or nonlocal reaction field energy for the partial point charge model (cf. Section 2.2) can be computed via [49, Theorem 5.3.1]:

\[
W_{\mathrm{ppc}}^{\mathrm{rf}} = \sum_{i=1}^{N} \zeta_i\, \varphi^*(r_i),
\]

where ϕ* is the reaction field potential

\[
\varphi^*(\xi) = \left[\tilde{V} \gamma_1^{\mathrm{int}} \varphi^*\right](\xi) - \left[W \gamma_0^{\mathrm{int}} \varphi^*\right](\xi).
\]

Using the previous row vector definitions of Ṽ_ξ and W_ξ and replacing ξ by the positions of the individual point charges r_i, the reaction field energy can easily be approximated as

\[
W_{\mathrm{ppc}}^{\mathrm{rf}} \approx \sum_{i=1}^{N} \zeta_i \left( \tilde{V}_{r_i} \hat{q} - W_{r_i} \hat{u} \right).
\]
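A corresponding Julia sketch of this approximation is given below. It assumes a (hypothetical) helper that assembles the two row vectors Ṽ_ξ and W_ξ for a given load point, as discussed in Section 3.4.1, and simply accumulates the weighted reaction field potentials over all charges.

using LinearAlgebra

# Reaction field energy W ≈ Σ_i ζ_i (Ṽ_{r_i} q̂ - W_{r_i} û).
# `rowvectors` is a hypothetical function returning the two row vectors for a
# given load point; r and ζ hold the charge positions and magnitudes, and
# u and q correspond to the solved coefficients û and q̂.
function reaction_field_energy(rowvectors, r, ζ, u, q)
    energy = 0.0
    for i in eachindex(r)
        Vrow, Wrow = rowvectors(r[i])
        energy += ζ[i] * (dot(Vrow, q) - dot(Wrow, u))
    end
    energy
end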


Platform selection 4

In the previous chapter, we analyzed the discrete formulations of the local and nonlocal BEM systems for the protein electrostatics problem and identified multiple potential targets for optimization. We have seen that it ultimately comes down to solving large linear systems, where the involved matrices are composed of a distinct set of other matrices that follow a common pattern: their elements can be computed from the same functions or, from a programmer’s point of view, from the same routines with varying parameters. Since the described systems typically contain billions of nonzero matrix elements, their solving process can be considered highly time-consuming even under the best circumstances. As a result, we require any programming language or platform for our BEM solvers to enable high-performance code. The embarrassingly parallel nature of the aforementioned matrix element computation and, to some extent, of the matrix-vector products for the same matrices presents a promising basis for hardware-accelerated solutions. GPUs are known to handle such embarrassingly parallel computation tasks very well, so a suitable candidate platform is also required to support GPU acceleration.

While an efficient computation of the matrix elements is absolutely essential for the viability of our BEM solvers, the latter should remain accessible to the bioinformatics community. Many people in the field do not have a programming background and should still be able to use them. Apart from this, the currently available bioinformatics software is at least equally heterogeneous and our solvers should be easy to incorporate into existing pipelines and workflows. In the following, we present the platforms we have chosen as a basis for the work presented in this manuscript and elaborate on to what extent they comply with the above requirements.

4.1 The Julia language

The first programming language we consider as a suitable platform for the implementation of protein electrostatics solvers in this manuscript is Julia [1]. In addition to being open source and cross-platform, the Julia language represents a perfect

symbiosis of a high-performance computing language and a highly accessible programming framework. In terms of computational efficiency, Julia has been shown to be comparable to high-performance languages such as C and Fortran as well as LuaJIT, Rust, and Go [1, 66]. At the same time, Julia provides the “look and feel” of a scripting language like Python, R, or MATLAB, which often trade computational efficiency for a positive user experience (e.g., in the form of high-level machine language abstractions). These features alongside a plethora of third-party libraries and a broad language interoperability render Julia a well-suited programming language for the heterogeneous tasks and audience in the field of scientific computing. As such, the language is used in a growing number of studies, with research areas including RNA [67, 68] and protein bioinformatics [69, 70], metabolic networks [71, 72], and fluid dynamics [73, 74]. In this manuscript, we use Julia to implement numerical solvers not only as a prototypical proof of concept but rather as an efficient and intuitive software package for the nonlocal protein electrostatics problem.

In the following, we briefly summarize core features of the language that contributed towards the decision of choosing Julia for our project. A more detailed elaboration on this matter, including a discussion on how to transfer the presented concepts to other scientific software, can be found in our previous publication [75].

4.1.1 Scripting language characteristics

Julia’s scripting language character is rooted in a combination of two outstanding language features, namely, a dynamic type system and just-in-time (JIT) code compilation. The type system is complemented with fully optional type annotations and a powerful type inference algorithm [76, 77, 78]. Together, they allow for static type checks and code optimizations where needed and a maximum of notational brevity otherwise. These checks and optimizations are performed by Julia’s JIT compiler, which is automatically invoked on code execution and, in a sense, makes Julia a compiled language in the guise of an interpreted language. In particular, the on-demand compilation approach enables users to provide source code either directly through the language’s interactive console (called read-eval-print loop or REPL) or in the form of script files, without prior compilation.

Julia ships with its own package management system, which grants access to thousands of third-party packages1 that extend the language’s standard library and can dynamically be included in user code. Functions in Julia can be easily overloaded for custom types [79, 80] due to the implementation of symmetric multiple dispatch [81],

1https://juliahub.com

Listing 4.1 – Code example for the computation of nonlocal reaction field energies using the NESSie.jl and CuNESSie.jl packages in Julia

# 0. Import required packages
using NESSie
using NESSie.BEM                       # or CuNESSie
using NESSie.Format: readoff, readpqr

# 1. Create a system model
model = readoff("data/born/na.off")
model.charges = readpqr("data/born/na.pqr")
model.params.εΩ = 1                    # dielectric constant for vacuum model
model.params.εΣ = 78                   # dielectric constant for water

# 2. Invoke a BEM solver for the model
bem = solve(NonlocalES, model)

# 3. Apply a post-processor
println("Reaction field energy: ", rfenergy(bem), " kJ/mol")

that is, method lookup is based on the runtime types of all function arguments2. In conjunction with Julia’s native Unicode support, multiple dispatch encourages clean and descriptive function interfaces, further promoting Julia’s role as a modern language for rapid prototyping and full-fledged software development alike.

All language features described above can be found in the few lines of code shown in Listing 4.1, representing a fully functional usage example for NESSie.jl and CuNESSie.jl, the Julia-based solvers developed throughout Chapters 5, 6, and 7. The code can be provided either as a script file or fed line-wise to the Julia REPL. Without going into too much detail just yet, the code computes and prints the reaction field energy (cf. Section 3.4.3) for the nonlocal cavity model (cf. Section 3.3) associated with the biomolecular system represented by the model variable. For this, the NESSie package is imported, providing the functionality required to represent the system model and perform the necessary computations. The solvers are based on other external libraries, e.g., to enable hardware acceleration (cf. Section 4.3.2). System constants (model.params) are closely named after their original counterparts using Greek letters, courtesy of Julia’s Unicode and LaTeX math symbol support.

While the shown code does not contain any type annotation whatsoever, all involved types can be unambiguously inferred by the JIT compiler. This is because the required type annotations are here fully contained in the function interface of our packages3, which allow their end-users to omit explicit annotations in most cases and to use the packages in a scripting-style manner without loss of performance. Finally, multiple dispatch is used to choose a suitable solver for the given locality assumption (i.e., NonlocalES or LocalES for the nonlocal or local cavity models, respectively) and

2“Symmetric” here means that all function arguments contribute equally to the lookup process.

Listing 4.2 – Differences in runtime and memory allocations for repeated same-argument function calls in Julia. The compilation overhead is visible only for the first call to the function, while subsequent calls reuse the compiled code from cache.

julia> @time bem = solve(NonlocalES, model);
  2.677756 seconds (11.63 M allocations: 889.775 MiB, 6.65% gc time)

julia> @time bem = solve(NonlocalES, model);
  1.096125 seconds (6.29 M allocations: 630.100 MiB, 1.86% gc time)

julia> @time bem = solve(NonlocalES, model);
  1.073325 seconds (6.29 M allocations: 630.100 MiB, 1.16% gc time)

system model, which might generally be represented using either single-precision or double-precision floating-point values. As a bonus, GPU acceleration for the chosen solver and post-processors can be optionally enabled by importing CuNESSie instead of NESSie.BEM in Listing 4.1, as both packages provide the same function interface for these purposes.

4.1.2 High-performance capabilities

The key to Julia’s high-performance capabilities is a combination of function specialization and inlining as well as type inference and object unboxing. Generally, most optimizations are primarily enabled through static code analysis, which is a challenging task for dynamic programming languages. Julia addresses this challenge by a multi-step JIT compilation approach [79], where code is translated into a Julia-specific intermediate representation (IR), before these optimizations are applied. Afterwards, the representation is translated into LLVM IR [82], where it benefits from further low-level optimizations. Julia’s JIT compiler is invoked at runtime every time a function is called for a new set of argument types. Subsequent calls then reuse the optimized code from cache, resulting in different runtimes (and memory allocation behavior) for the first call vs. subsequent calls to the same function with the same arguments. This behavior can, for example, be observed in Listing 4.2, where the BEM solver from Listing 4.1 is applied multiple times to the same model.

Since Julia’s JIT compiler is invoked at runtime, it can fully exploit runtime type information to aggressively specialize code. During function specialization, function calls are replaced by the most specific overloads for the given arguments. Specificity

3To be more precise, the type annotations in the function argument lists are only one part of the requirements. In addition, the functions in NESSie and CuNESSie are type stable with respect to their arguments, such that the compiler can infer the return types of the function calls based on the argument types. As a consequence, all types in the given code example ultimately depend on the arguments to the readoff function.

Listing 4.3 – Using Julia’s @Threads.threads macro to distribute loop iterations uniformly over a number of threads (here: five, specified when starting the Julia session, thus not shown here). Only the iterations assigned to the same thread are executed sequentially.

julia> @Threads.threads for i in 1:10
           println("Thread ", Threads.threadid(), ", iteration ", i)
       end
Thread 5, iteration 9
Thread 3, iteration 5
Thread 4, iteration 7
Thread 5, iteration 10
Thread 3, iteration 6
Thread 2, iteration 3
Thread 4, iteration 8
Thread 1, iteration 1
Thread 2, iteration 4
Thread 1, iteration 2

is here determined with respect to Julia’s unique subtyping relation [80], which can be used to define nominal type hierarchies. Besides explicit type annotations, the multiple dispatch implementation serves as one of the compiler’s main sources for runtime type information. Given the available information, the compiler assigns suitable types to all expressions and variables in the code. As long as code is type stable, the compiler makes extensive use of function inlining to benefit from its synergies with specialization and type inference. Inlining provides additional context for inference and can even be used to eliminate unreachable code branches in specialized functions [79]. Conversely, type instability is infectious, as it introduces type uncertainty to dependent code and thus impairs optimization.

Apart from the above language feature combination, Julia also supports more general components of high-performance computing, namely, concurrent and parallel computing. Using Julia’s standard library, users can choose from a variety of different options, including coroutines for the definition and scheduling of asynchronous tasks, a custom message passing implementation for distributed computing, and multithreading macros. The latter resemble optional code annotations with parallel pragmas from OpenMP [83] or OpenACC [84] and direct Julia to automatically split and synchronize the given for loop uniformly over multiple threads of execution in a minimally invasive way (see Listing 4.3 for an example). Furthermore, third-party packages can serve as an alternative for or as an extension of these standard library features. Prominent examples include the packages provided by the JuliaParallel project4, providing MPI [85] wrappers and distributed data containers, among other things. Further examples are packages for GPU-enabled computing, which we will revisit later in Section 4.3.2.

4https://github.com/JuliaParallel

4.2 AnyDSL

The second candidate platform for our protein electrostatics solvers is AnyDSL [2], a programming framework for high-performance libraries by means of partial evaluation (PE). AnyDSL is built around a minimalistic IR called Thorin [86] and provides a powerful online partial evaluator as well as the Impala language as its front end. Thorin is constructed in a way that allows shallow embedding of domain-specific languages (DSL) [87], which enables users to write efficient code in a unified programming framework. The code is tailored to a selected target platform during compilation, including different CPU and GPU back ends. While the AnyDSL community appears rather small compared to “big players” like C and C++, the framework’s high-performance capabilities were already demonstrated for multiple application domains, e.g., image processing, ray tracing, or genome sequence analysis [88, 89, 90, 91].

In this manuscript, we use the AnyDSL framework – and Impala in particular – to implement domain-specific operations and system representations for the local and nonlocal protein electrostatics problem (cf. Chapters 5 and 8). While the framework is significantly less accessible than Julia, it bears the potential for a low-maintenance code base in spite of supporting multiple platforms. We consider the native domain specificity aspect of Impala to be a huge advantage for the development of our solvers, as compared, for example, to Julia or a C/C++-based approach. It allows us to easily switch between different CPU or GPU back ends and evaluate native and specialized solvers for the supported platforms. Due to Impala’s close interconnection to C and C++, we can opt for a hybrid library implementation, consisting of a C++ layer for common and less performance-critical tasks (like the input data processing) and an Impala-based solver optimized through PE. In this way, we can first try to identify suitable platforms for the protein electrostatics problem. If need be (e.g., in the case of a single most beneficial platform), we can subsequently implement a native C++ counterpart of the best-performing solvers while fully reusing the existing C++ part of the code base.

4.2.1 Compilation flow

The AnyDSL framework heavily relies on partial evaluation, that is, a given program is partially evaluated with respect to all inputs known statically (i.e., at compile time) before being fully evaluated for all dynamic inputs during execution [92, 93]. This approach allows for highly specialized programs computed from a common

Listing 4.4 – Comparison between direct style (top) and continuation-passing style (bottom) for functions in Impala. Both formulations define the same function to compute the smallest integer larger than or equal to x/y.

// Direct style
fn ceil_div(x: i32, y: i32) -> i32 {
    if (x % y == 0) {
        return(x / y)
    }
    x / y + 1
}

// Continuation-passing style
fn ceil_div(x: i32, y: i32, return: fn(i32) -> !) -> ! {
    if (x % y == 0) {
        return(x / y)
    }
    return(x / y + 1)
}

code base for given sets of static inputs. AnyDSL ships with a partial evaluator implemented in C++ and built on top of Thorin, which specializes code on the fly, without prior analysis (this is called online PE). In order to facilitate PE and render PE results more predictable to the developer, Thorin is minimalistic in design and much less complex than other well-known intermediate representations [2] such as LLVM IR [82].

The general idea behind efficient program code in AnyDSL is to hide platform-specific parts in higher-order functions that are used by the platform-independent code base. When treated as static input, the target-specific mapping is then performed during compilation as part of the PE, resulting in programs specialized directly for hardware accelerators through CUDA (cf. Section 4.3) or OpenCL [94]. Alternatively, the programs can be specialized for different LLVM back ends, including x86, ARM, AMDGPU, and NVVM. For CPU vector code generation, LLVM’s Region Vectorizer (RV) [95] can be optionally included in the toolchain. For our BEM solvers, we aim at specializations for local vs. nonlocal cavity models and single- vs. double-precision system models. We will see in Chapter 8 how to obtain specialized programs for any combination of choices for these options from a common code base.

4.2.2 Impala

The Impala language serves as a front end to the AnyDSL framework. While programs could as well be written directly in Thorin, Impala adds syntactic sugar to the latter in order to facilitate its use. Most notably, Thorin enforces a continuation-passing style (CPS) for its code. In contrast to the classic direct style, where functions

return their computed values to the respective caller, functions in CPS pass the same value to a function (called continuation) taken as an additional argument. In Thorin, control flow is fully expressed through such continuations. Impala adapts the same CPS but additionally allows classic direct-style functions and loops5, which are then automatically translated into CPS. Listing 4.4 shows a direct comparison between the same function written using direct style and CPS. It should be noted that return is not a keyword of the language but rather the (default) name of the function’s continuation parameter, hidden in syntactic sugar when using direct style.

As described in the last section, implementations using the AnyDSL framework are usually split into a generic code base (describing the general algorithm) and implementation details provided through different layers of indirection, e.g., higher-order functions. Generally, this approach enables a programming language to provide certain algorithmic skeletons with parallelized implementations and without prior knowledge of the concrete application [96]. Impala offers such skeletons in the form of primitives like parallel6 and vectorize or hardware accelerator abstractions (cf. Sections 4.3.3 and 8.3.1).

4.3 CUDA

The last platform we present in this chapter is NVIDIA’s Compute Unified Device Architecture (CUDA) [3], which we will use for the GPU-accelerated versions of our protein electrostatics solvers. GPUs are an integral part of most modern workstations and have been subjected to massive improvements over the course of the last few decades. As the name suggests, the initial application domain of GPUs (or graphics processing units) was that of a coprocessor for manipulating graphics data, with data usually encoded as textures and manipulation tasks written in a shader language. This is in contrast to CPUs, which are specialized for the fast execution of heterogeneous sequences of operations. For this, CPUs typically employ sophisticated data caching and flow control facilities, spread over a moderate number of CPU cores with complex instruction sets. GPUs, on the other hand, offer only restricted instruction sets and flow control but are equipped with hundreds or thousands of threads that allow for a highly parallelized execution of homogeneous data processing tasks.

5With corresponding continuations for break and continue 6The parallel primitive triggers pragma-like parallel execution of the associated loop over multiple C++ threads or via Intel Threading Building Blocks (TBB) [97], while vectorize invokes the aforementioned LLVM RV. In contrast to pragmas, parallel and vectorize are first-class citizens of the language.

Nowadays, GPUs are no longer restricted to their original purpose but rather widely utilized for more general computations, especially in the scientific community. This application is commonly referred to as general-purpose computation on graphics processing units (GPGPU) and CUDA facilitates GPGPU by bridging the gap between classic CPU programming techniques and GPUs. Most notably, CUDA provides language extensions for C and C++ along with extensive software libraries and a compiler suite to translate C/C++-like code into GPU-compatible programs.

GPGPU is of particular interest for embarrassingly parallel tasks, which can often fully exploit the highly abundant compute units of GPUs, such as the computation of the potential matrices in the local and nonlocal BEM formulations (cf. Chapter 3). It should be noted that CUDA is only available for NVIDIA GPUs. While vendor-independent GPU programming platforms such as OpenCL [94] or OpenACC [84] exist, we decided to use CUDA in this manuscript due to the broad availability of compatible devices and the excellent CUDA support in both Julia and Impala, which will be described shortly. Before that, we give a brief summary of CUDA’s general programming model, which will serve as a basis for the GPU-accelerated implementations later in Chapters 7 and 8. For more detailed introductions into the CUDA platform and GPGPU in general, the reader is referred to [98, 99, 100].

4.3.1 Programming model

The GPU (or device) acts as a coprocessor to the CPU (or host) rather than a fully autonomous piece of hardware. Hence, a CUDA-enabled program does not run exclusively on the device but on a combination of both host and device. More precisely, the GPU-based parts of the program are invoked explicitly in the form of one or more kernel function calls by the driver program residing on the host. Since host and device are associated with physically separated memory, data transfers and memory management have to be performed through the driver program before (and possibly after) the kernel calls. What is more, host and device generally interact asynchronously and need to be synchronized manually.

Aside from suitable synchronization mechanisms, CUDA ships with two key abstractions that need to be taken into account when writing CUDA-enabled code, namely, the hierarchical organization of GPU threads and memory. Each GPU contains a number of streaming multiprocessors (SM), which handle the kernel execution. Each SM is a composition of different processing units and schedulers, including dedicated multiprocessor cores for arithmetic operations on integers (INT32) and

single- (FP32) or double-precision (FP64) floating-point values. The concrete numbers of individual processing units vary between devices and strongly influence the high-performance capabilities of the respective devices, especially with regard to a comparison between single- and double-precision versions of the same applications (cf. Sections 9.1, 9.4.3, and 9.5.2).

Each kernel function represents a task to be performed by a single thread of execution, where the total number of threads needs to be specified alongside the kernel call in the driver program. Additionally, threads need to be organized into equally-sized thread blocks, which are then distributed automatically across the available SMs. While the kernel is eventually executed for each thread at some point in time, only a portion of them is actually executed in parallel. For this, mutually exclusive groups of 32 threads within the same block follow a mode of operation that is similar to SIMD (single instruction, multiple data) [101]. Traditionally, these so-called warps share a common program counter and perform the same operation at the same time. Multiple warps are then scheduled to run concurrently, depending on available resources.

In contrast to the original SIMD pattern, CUDA allows warp threads to access memory in a non-contiguous way. Also, CUDA allows instruction masking to enable branching of control flow. This modified pattern is commonly referred to as SIMT (single instruction, multiple threads) in the CUDA context. With the release of the Turing device architecture, NVIDIA introduced Independent Thread Scheduling, which allows “full concurrency between threads, regardless of warps” [98]. It should be noted that the CUDA-enabled implementations presented in this manuscript support both technologies and can thus be used with pre-Turing devices.

The partitioning of threads into equally-sized blocks can be instructed by the programmer within certain restrictions imposed by the CUDA platform and the chosen device. However, one aspect to be considered for an efficient partitioning scheme is the utilization of memory. Each thread generally has access to its own local memory as well as to the global memory shared by all threads of the application (even across multiple kernel calls). Apart from that, all threads of the same block can exchange data through a shared memory7. All of these memory types come with different access and size restrictions, which partly depend on the chosen partitioning. The exact way of which, when, and how memory is accessed by individual threads during kernel execution is one of the main levers for performance optimization in CUDA and will be the topic of Section 7.3.2.

7Each CUDA device additionally provides two read-only memory types (texture and constant memory) with similar properties as the global memory, which are optimized for application scenarios outside the scope of this manuscript.

4.3.2 CUDA support in Julia

Although CUDA is not directly supported as part of the Julia language, support has been added in the form of Julia packages through efforts of the JuliaGPU project [102]. The project aims at maintaining packages for Julia-based GPU programming in general, but CUDA is the only production-ready platform at the time of writing. CUDA support is implemented through a combination of the packages CUDAnative.jl8 and CUDAdrv.jl9 [103, 104], providing a complete infrastructure for compiling and executing Julia kernels on CUDA-enabled devices in addition to providing access to GPU library functions and the low-level CUDA driver API10.

CUDA kernels are written directly in Julia code, with kernel calls being marked with a special @cuda macro. This macro invokes the CUDAnative.jl package’s GPU compiler, which reuses the original Julia compiler infrastructure (as described in Section 4.1.2) with GPU-specific optimization options. Native device code is then generated from the LLVM IR using the LLVM [82] NVPTX back end. Since the GPU compiler is invoked at runtime on each first kernel call, it acts as a JIT compiler for GPU code generation. This approach was shown to result in runtimes comparable to statically compiled CUDA C code [104], e.g., for the Rodinia benchmark suite [105]. Julia kernel functions are subject to certain restrictions, which will be discussed later for our GPU-accelerated BEM solvers in Section 7.2.
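As a minimal illustration of this workflow, the following sketch defines a trivial element-wise kernel in plain Julia and launches it via the @cuda macro. It is written against the merged CUDA.jl package mentioned in the footnote below; the CUDAnative.jl/CUDAdrv.jl combination used throughout this manuscript offers essentially the same macros and intrinsics, but the package name and array constructors shown here are assumptions of this sketch rather than our solver code.

using CUDA   # merged successor of CUDAnative.jl and CUDAdrv.jl (cf. footnote 10)

# Each thread scales exactly one vector element; the global thread index is
# derived from the block and thread coordinates provided by CUDA.
function scale_kernel!(y, x, α)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] = α * x[i]
    end
    return nothing
end

x = CUDA.rand(Float32, 1_000)
y = similar(x)
# The @cuda macro JIT-compiles the Julia kernel for the device on its first use.
@cuda threads=256 blocks=cld(length(x), 256) scale_kernel!(y, x, 2.0f0)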

4.3.3 CUDA support in Impala

As described earlier in Section 4.2, CUDA support in AnyDSL is provided as part of the standard compilation flow through Thorin. Transitively, the full supported feature set can also be used in Impala, such that CUDA-enabled program code can be generated from the same code base as corresponding CPU-based programs. In order to facilitate the implementation of such a generic code base, AnyDSL ships with different platform abstractions through the AnyDSL runtime library. These abstractions provide access to platform-specific functions (e.g., arithmetic operations) and hardware devices (e.g., different GPUs) by means of a unified interface. We will come back to these abstractions later in more detail (cf. Section 8.3.1).

8https://github.com/JuliaGPU/CUDAnative.jl 9https://github.com/JuliaGPU/CUDAdrv.jl 10At the time of writing, CUDAnative.jl and CUDAdrv.jl are in the process of being merged into a single package named CUDA.jl (https://github.com/JuliaGPU/CUDA.jl). In this manuscript, we still refer to the separate packages.


Solving implicit linear systems 5

In Chapter 3, we have outlined a minimal interface for linear system solvers (Definition 3.2.1) and additionally assumed that the system matrix is never updated in the solving process. We have further assumed that the matrix is only accessed by means of matrix-vector products, coining the term matrix-free solvers. In this chapter, we first give a short introduction into common iterative solvers – the basis for most matrix-free solver implementations and gold standard for solving large linear systems numerically. Many of the presented solvers were originally developed with sparse system matrices in mind but can also be applied to the dense and non-symmetric systems that are targeted in this manuscript. For this purpose, we further develop a unified implicit model representation for our local and nonlocal BEM systems in the second part of this chapter and show how to generalize these ideas for similar tasks. In the remaining sections, we provide implementation concepts of the implicit system representations for both Julia and Impala and show how they can be used with the aforementioned solvers.

5.1 Background: Iterative solvers

The history of iterative solvers for linear systems began with a number of different relaxation schemes for an approximation of the solution, which is updated in each iteration while reducing the residual error. These schemes include still well-known variants such as Jacobi and Gauss-Seidel iterations. These variants are nowadays rarely used individually and rather encountered as parts of more complex iterative methods or used as preconditioners for the systems instead. A broad introduction into iterative solvers for large linear systems, including most of the examples presented in this section, can be found in [106].

5.1.1 Deterministic iterative solvers

Let us consider a linear system Ax = b, where A ∈ ℝ^{m×n}, x ∈ ℝⁿ, and b ∈ ℝᵐ. Most modern iterative solving techniques are so-called projection methods and project

x into a subspace of the solution space K ⊂ ℝⁿ, referred to as the search subspace. In each iteration, an approximation x̃ of x is extracted from K and subjected to a number of orthogonality constraints. More precisely, the residual vector b − Ax̃ needs to be orthogonal to another subspace of the solution space L ⊂ ℝⁿ, called the subspace of constraints.

In each iteration, an initial guess x₀ to the solution is then used to find an approximation x̃ = x₀ + δ with δ ∈ K such that b − Ax̃ ⊥ L. This process is repeated as long as the convergence criteria are not met, usually with new subspaces K and L and with the current approximation x̃ serving as the new initial guess x₀.

While there exist multiple valid choices for K and L, one of the most important classes of projection methods utilizes a Krylov subspace for K, which is defined as

\[
\mathcal{K}_p(A, r_0) := \mathrm{span}\left\{ r_0,\, A r_0,\, A^2 r_0,\, \ldots,\, A^{p-1} r_0 \right\},
\]

where r₀ = b − Ax₀ is the residual vector of the initial guess x₀. We cover a few examples of these so-called Krylov subspace methods in the following paragraphs and refer the interested reader to [106, 107, 108] for a more detailed coverage.
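Crucially for our purposes, building such a subspace only requires repeated applications of A to a vector, which is exactly the operation provided by the implicit matrix representations developed later in this chapter. A naive basis construction, without the orthogonalization that practical Krylov methods (e.g., Arnoldi or Lanczos processes) would add, can be sketched as follows:

# Builds the (unorthogonalized) Krylov basis {r0, A r0, ..., A^(p-1) r0} using
# nothing but matrix-vector products; A can be any operator supporting `*`.
function krylov_basis(A, b, x0, p)
    r0 = b - A * x0                      # residual of the initial guess
    basis = [r0]
    for _ in 2:p
        push!(basis, A * basis[end])     # next power of A applied to r0
    end
    basis
end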

The popular conjugate gradient (CG) algorithm [109] uses the same Krylov subspace for both K and L and can be utilized to solve systems with a symmetric and positive-definite matrix A. While this constraint cannot be satisfied by the BEM systems in the protein electrostatics problem, there exist less restrictive variants of the original CG formulation capable of handling such systems, e.g., the bi-orthogonal conjugate gradient (BiCG) method [110, 111] or its transpose-free variant BiCGStab [112].

Another well-known Krylov subspace method is the generalized minimum residual method (GMRES) [113], in which the approximation x̃ of the solution minimizes the residual norm ‖b − Ax̃‖₂ in the current Krylov subspace. In order to reduce the memory footprint and computational cost of GMRES, the solver can be restarted after a fixed number of iterations. Hence, this so-called restarted GMRES variant is often preferred in practice, although it might stagnate when the system matrix is not positive definite [106]. GMRES and its restarted variant are usually used in combination with suitable preconditioners (cf. Section 5.1.3).

To circumvent the restriction to symmetric positive-definite system matrices of many solvers, a non-symmetric linear system Ax = b is sometimes expressed in terms of its normal equations, that is, the system is brought to the equivalent form

\[
A^T A x = A^T b, \tag{5.1}
\]

which is symmetric and positive definite by construction. An alternative formulation in terms of normal equations often used is

\[
A A^T u = b, \qquad x = A^T u. \tag{5.2}
\]

Using either of these two representations, a non-symmetric system can now be used with the CG method or any other solver with the same restrictions. In particular, applying CG to Eq. (5.1) is known as the CGNR method, while applying it to Eq. (5.2) instead is known as the CGNE method [106]. It should be noted that the system matrices in Eqs. (5.1) and (5.2) are generally much worse conditioned than A itself, potentially resulting in slower convergence rates and other numerical problems.

Many iterative solvers based on the normal equations make use of the fact that the products AT A and AAT do not need to be computed explicitly. These methods are commonly referred to as row-projection methods, as the involved relaxation steps require only single rows of the system matrix to be present at the same time to form the necessary product components. Row-projection methods are considered to be among the first iterative solvers for large non-symmetric systems and date back at least to the first half of the 20th century. Prominent representatives of this solver class include the famous Kaczmarz method [114] and Cimmino’s method [115].
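For illustration, a basic Kaczmarz iteration can be sketched in a few lines; note how each relaxation step only accesses a single matrix row, which could be recomputed on demand from an implicit representation. The fixed number of sweeps is an arbitrary choice for this sketch.

using LinearAlgebra

# Classic Kaczmarz iteration: cyclically project the current iterate onto the
# hyperplane defined by each row a_i of A,
#     x ← x + (b_i - a_i ⋅ x) / ‖a_i‖² * a_i.
# Only one row of A is needed at a time.
function kaczmarz(A, b; sweeps = 50)
    m, n = size(A)
    x = zeros(n)
    for _ in 1:sweeps, i in 1:m
        ai = A[i, :]                                    # i-th row of the system matrix
        x += ((b[i] - dot(ai, x)) / dot(ai, ai)) .* ai  # row projection step
    end
    x
end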

Apart from the examples presented above, many hybrid methods exist, aiming at synergies of the individual components. Examples include the BiCGStab(l) method [116], which combines CG with l GMRES iterations, and CGMN [117, 118], a hybrid of CG and the Kaczmarz method.

5.1.2 Probabilistic and randomized iterative solvers

Over the past decades, different extensions and paradigms were developed for iterative solvers. Recent studies suggest a probabilistic view on some of these solvers, embedding them in a Bayesian framework to output posterior distributions with their mean representing the classical solution. Thus, these so-called probabilistic linear solvers provide an alternative interpretation of established solvers such as projection methods and GMRES in particular [119].

Another branch of solver development focuses on breaking the deterministic character of the classic approaches. More specifically, these randomized iterative methods introduce randomization to some aspects of the base solver, e.g., by performing random rather than sequential row projections, proving beneficial in certain situations [120]. Examples include randomized Kaczmarz and randomized block Kaczmarz methods [121, 122, 123], randomized coordinate descent methods [124, 125], or randomized Newton methods [126]. Many of these randomized solvers can be generalized into the framework of stochastic dual ascent [120].

5.1.3 Preconditioning

A large number of iterative solvers support the use of preconditioners for the linear system. Generally, preconditioners transform the problem Ax = b into a more suitable form by utilizing a nonsingular matrix that approximates A in some sense. Linear systems can be preconditioned from the left:

$$\left(P_l^{-1} A\right) x = P_l^{-1} b$$

or from the right:
$$\left(A P_r^{-1}\right) y = b, \quad x = P_r^{-1} y,$$

where Pl and Pr are the left and right preconditioners, respectively. Using both at the same time results in the so-called split preconditioning:

$$\left(P_l^{-1} A P_r^{-1}\right) y = P_l^{-1} b, \quad x = P_r^{-1} y.$$

In preconditioned form, the system still has the same solution as before, but the solving process might be easier. A survey on preconditioning techniques for large linear systems can, for example, be found in [127, 106].

Which type of preconditioning and preconditioner should be used usually depends both on the solver and the system. Iterative methods based on the normal equations (cf. Section 5.1.1), for example, tend to be more challenging to precondition effectively [106]. There exist many different types of preconditioners, some of them developed for a particular solution scheme, e.g., boundary element methods [128]. In this manuscript, we primarily outline how custom preconditioners can be supported with the system representations developed in this chapter. However, we will still find that even the simple Jacobi preconditioner, that is, the diagonal of the system matrix, can achieve significant performance gains in certain cases (cf. Chapter 9).
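As a minimal sketch of how such a diagonal preconditioner could be applied to an arbitrary (possibly implicit) matrix type, consider the following hypothetical helpers; they only rely on element access and a matrix-vector product, and the names are our own:

# Hypothetical sketch: left Jacobi preconditioning with P_l = diag(A).
jacobi_diagonal(A) = [A[i, i] for i in 1:size(A, 1)]

function jacobi_preconditioned(A, b)
    d = jacobi_diagonal(A)        # diagonal entries, computed element-wise
    apply(x) = (A * x) ./ d       # x ↦ P_l⁻¹ (A x)
    return apply, b ./ d          # preconditioned operator and right-hand side
end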

5.1.4 Parallelization schemes

Solving large linear systems is a challenging task, both in terms of memory and computational constraints. While we primarily attend to the first part of the challenge in this chapter, namely, finding efficient representations for linear systems, we must not neglect the computational demands. One important criterion for the selection of a high-performant solving method is whether the method itself or parts of it are suitable for parallelization. In the context of numerical solvers, and especially in the case of iterative methods, typical gateways to parallel computation are matrix and vector operations as well as data-independent loop iterations. Often there is a considerable overlap between these two working points, as many of the latter iterations can be formulated as matrix or vector operations and vice versa. For many of the iterative solvers presented above, there exist equivalent formulations that operate on multiple vectors at the same time. These so-called block methods facilitate concurrent solver implementations and improve the utilization of available resources. Block generalizations for Krylov subspace methods, e.g., GMRES, can be found in [106], while recent studies also covered block versions of row-projection methods, such as the Kaczmarz method and its variants [129, 123].

A more widely applicable parallelization approach with respect to the actual solver implementations, especially for those already existing, is a closer look at the individual operations. In the particular case of large, dense system matrices, the main contributor to the memory and computational demands is usually this very matrix, rendering operations on it the primary targets of optimization. The most basic operations performed by iterative solvers on the system matrices are matrix-vector products. In addition to products of the form $Ax$, the forms $A^T x$, $A^T A x$, or $A A^T x$ are common alternatives and might either be handled implicitly in the formulation (e.g., in typical row-projection methods) or need to be computed explicitly (e.g., in the CGNR method). Although an exclusive optimization focus on particular operations might not generally lead to the most efficient solver implementations, this approach allows for the most flexibility when it comes to existing solvers. As we discuss later in this chapter, it even enables us to inject parallelized code into solvers that are otherwise completely sequential implementations.

5.2 Interaction matrices

In Chapter 3, we have pointed out the embarrassingly parallel nature of the potential matrices V, K, $V^R$, and $K^R$, that is, all matrix elements can be computed independently. At the same time, these matrices might occupy multiple terabytes of memory when represented explicitly (cf. Section 9.4.1). It should be noted that such (quadratic) memory demands are not specific to the protein electrostatics computations but rather a property of most system matrices in the BEM context, as they tend to be dense and nonsymmetric.

For the potential matrices at hand, every piece of information required to reconstruct them is contained in the list of surface elements. In particular, each row i of the matrices corresponds to the centroid of the i-th surface element, whereas each column j corresponds to the whole j-th surface element. The elements of the potential matrices V, K, $V^R$, and $K^R$ can then be computed for the corresponding pair of triangles from Eqs. (3.11), (3.12), (3.21), and (3.22), respectively.

Definition 5.2.1. Interaction matrix representation

Let $r = \langle r_1, r_2, \ldots, r_m \rangle$ be a sequence of objects $r_i \in R$ and $c = \langle c_1, c_2, \ldots, c_n \rangle$ be a sequence of objects $c_j \in C$. Given a function $f : R \times C \to \mathbb{R}$, we call the triplet $\mathcal{M} := (r, c, f)$ an interaction matrix representation of matrix $M \in \mathbb{R}^{m \times n}$ if and only if $(M)_{ij} = f(r_i, c_j)$ for all $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.

As a shorthand notation, we call $\mathcal{M}$ an interaction matrix for M, with elements $(\mathcal{M})_{ij} := (M)_{ij}$. We further call $r$ the row elements, $c$ the column elements, and $f$ the interaction function of $\mathcal{M}$, respectively. Similarly, we call $R$ the row element set and $C$ the column element set of $\mathcal{M}$.

Remark. Generally, the interaction matrix representation by itself does not necessarily reduce the memory footprint as compared to the original matrix. In fact, we can trivially construct an interaction matrix $\mathcal{M} = (r, c, f)$ for any given matrix $M \in \mathbb{R}^{m \times n}$ by choosing $r = \langle 1, 2, \ldots, m \rangle$, $c = \langle 1, 2, \ldots, n \rangle$, and
$$f(i, j) = M_{ij} = \begin{cases} M_{11} & i = 1, j = 1 \\ \;\vdots & \\ M_{mn} & i = m, j = n. \end{cases}$$

An interaction matrix constructed this way captures all of the represented information explicitly in its interaction function and thus has the same (asymptotic) memory footprint as the original matrix M (i.e., O(mn)). However, in the cases where the interaction function has a sublinear memory requirement, the overall requirement of the interaction matrix representation is reduced to O(m + n), that is, the combined size of the row and column elements.

Following the description in the beginning of this section, the potential matrices V, K, $V^R$, and $K^R$ can now be represented as interaction matrices $\mathcal{V}$, $\mathcal{K}$, $\mathcal{V}^R$, and $\mathcal{K}^R$ with Eqs. (3.11), (3.12), (3.21), and (3.22) as respective interaction functions. When implemented as suggested in Sections 3.2.2 and 3.3.2, these functions do not depend on the system size and require only a constant amount of memory. What is more, all four interaction matrices share the exact same row elements, i.e., the position vectors of n triangle centroids, and column elements, i.e., n surface triangles. Assuming that the latter information can be captured in four position vectors per triangle (that is, three triangle nodes and one normal vector), the row and column elements of all potential matrices can be represented using a grand total of five position vectors (or, equivalently, 15 floating-point numbers) per triangle.

While the potential memory savings effect of the interaction matrix representation vanishes for small m and n¹, this is of little concern in the context of this work, since we mainly work with large square matrices. Furthermore, the implicit representation inevitably incurs a computational overhead when accessing (i.e., recomputing) the matrix elements. Notwithstanding this concern, we will see in Chapter 9 that the reduced memory footprint alone already justifies this approach for the BEM systems and that the computational cost of individual elements can be reduced substantially on hardware accelerators.

5.2.1 Matrix-free system for the local cavity model

In the local case, the discretized linear systems take the form of Eqs. (3.15) and (3.16):
$$\left[\frac{1}{2}\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) I + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right) K\right] \hat{u} = K\hat{u}_{mol} - \frac{1}{2}\hat{u}_{mol} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} V\hat{q}_{mol} \qquad (5.3)$$
$$V\hat{q} = \frac{1}{2}\hat{u} + K\hat{u} \qquad (5.4)$$

Inspecting the right-hand sides of both systems, we can identify the matrix-vector products for V and K as base components. Furthermore, the system matrix in Eq. (5.4) is actually the potential matrix V . Therefore, we only need interaction matrices alongside matrix-vector product implementations for V and K to be able to solve the second system in linear space without further ado. For the first linear system we find that the system matrix is a linear combination of K and the identity matrix I.

¹So far, we have also gracefully ignored the impact of binary representation sizes for elements of the sets $R$, $C$, and $\mathbb{R}$.

We can easily construct a linear-space interaction matrix for the n × n identity matrix I by choosing $r = c = \langle 1, 2, \ldots, n \rangle$ as row and column elements and the Kronecker delta $\delta_{ij}$ as interaction function
$$f(i, j) = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & \text{else.} \end{cases}$$

However, in order to represent the whole first system matrix as a single interaction matrix, we need means to scale and combine interaction matrices first.

Theorem 5.2.1. Scaling of interaction matrices

Let $\mathcal{M} = (r, c, f)$ be an interaction matrix for $M \in \mathbb{R}^{m \times n}$ and some row and column element sets $R$ and $C$, respectively. Then, we can define a new interaction function $\tilde{f}_\alpha : R \times C \to \mathbb{R}$ as
$$\tilde{f}_\alpha(r, c) = \alpha f(r, c)$$
for any choice of $\alpha \in \mathbb{R}$ and represent
$$\alpha M =: \widetilde{M}_\alpha$$
as interaction matrix $\widetilde{\mathcal{M}}_\alpha = (r, c, \tilde{f}_\alpha)$.

Proof. Let $r = \langle r_1, r_2, \ldots, r_m \rangle$ and $c = \langle c_1, c_2, \ldots, c_n \rangle$. Each element of $\alpha M$ can then be reconstructed from $\widetilde{\mathcal{M}}_\alpha$ via
$$\left(\widetilde{\mathcal{M}}_\alpha\right)_{ij} = \tilde{f}_\alpha(r_i, c_j) = \alpha f(r_i, c_j) = \alpha \cdot (M)_{ij}$$
for all $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.

Theorem 5.2.2. Linear combinations of interaction matrices

Let
$$\widetilde{M} = \sum_{k=1}^{l} \alpha_k M_k$$
be a linear combination of matrices $M_k \in \mathbb{R}^{m \times n}$ with interaction matrix representations $\mathcal{M}_k = (r, c, f_k)$ for some row and column element sets $R$ and $C$, respectively, and coefficients $\alpha_k \in \mathbb{R}$. Then we can define a new function $\tilde{f} : R \times C \to \mathbb{R}$ as
$$\tilde{f}(r, c) = \sum_{k=1}^{l} \alpha_k f_k(r, c)$$
such that $\widetilde{M}$ can be represented as a single interaction matrix $\widetilde{\mathcal{M}} = (r, c, \tilde{f})$.

Proof. Let $r = \langle r_1, r_2, \ldots, r_m \rangle$ and $c = \langle c_1, c_2, \ldots, c_n \rangle$. Each element of the linear combination can then be reconstructed from $\widetilde{\mathcal{M}}$ as
$$(\widetilde{\mathcal{M}})_{ij} = \tilde{f}(r_i, c_j) = \sum_{k=1}^{l} \alpha_k f_k(r_i, c_j) = \sum_{k=1}^{l} \alpha_k (M_k)_{ij} = (\widetilde{M})_{ij}$$
for all $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.
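Before introducing dedicated Julia types in Section 5.3, the essence of Theorems 5.2.1 and 5.2.2 can be sketched with plain closures. This is a hypothetical illustration only; the names below are not part of the later implementation, and fK is a placeholder rather than a BEM kernel.

# Sketch of Theorem 5.2.2: a linear combination of interaction functions f_k
# (sharing row and column elements) is again a single interaction function.
combine(αs, fs) = (r, c) -> sum(α * f(r, c) for (α, f) in zip(αs, fs))

# Example: a scaled Kronecker delta plus a placeholder interaction function
δ(i, j)  = Float64(i == j)
fK(i, j) = 1.0 / (i + j)                  # placeholder only
ftilde = combine((0.5, 1.0), (δ, fK))     # represents 0.5 I + K implicitly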

Nevertheless, Theorem 5.2.2 only allows us to express the system matrix of Eq. (5.3) as a linear combination of two interaction matrices if both of them share the same row and column elements. However, we could extend these elements with additional information. For that, let us assume that K is implicitly represented by the interaction matrix
$$\mathcal{K} = (\langle r_1, r_2, \ldots, r_m \rangle, \langle c_1, c_2, \ldots, c_n \rangle, f_K)$$
for some row and column element sets $R$ and $C$, respectively. Then an alternative but nonetheless valid interaction matrix for I can be found by choosing

$$\tilde{r} = \langle (1, r_1), (2, r_2), \ldots, (m, r_m) \rangle,$$
$$\tilde{c} = \langle (1, c_1), (2, c_2), \ldots, (n, c_n) \rangle$$
as row and column elements over the sets $(\mathbb{Z} \times R)$ and $(\mathbb{Z} \times C)$, respectively, and
$$\tilde{f}_I((i, r), (j, c)) = \delta_{ij}$$
as interaction function, with $\tilde{f}_I : (\mathbb{Z} \times R) \times (\mathbb{Z} \times C) \to \mathbb{R}$.

At first glance, we have simply added information to the row and column elements that is completely ignored by the interaction function. But if we now repeat the same process for K and choose the same $\tilde{r}$ and $\tilde{c}$ as above in addition to a new interaction function
$$\tilde{f}_K((i, r), (j, c)) = f_K(r, c),$$
we end up with two interaction matrices $\mathcal{I} = (\tilde{r}, \tilde{c}, \tilde{f}_I)$ for I and $\mathcal{K} = (\tilde{r}, \tilde{c}, \tilde{f}_K)$ for K that can be linearly combined according to Theorem 5.2.2.

Theorem 5.2.3. Harmonization of interaction matrices

Let $\mathcal{A}$ and $\mathcal{B}$ be interaction matrix representations for matrices $A, B \in \mathbb{R}^{m \times n}$ and
$$\mathcal{A} = \left( \langle r_1^A, r_2^A, \ldots, r_m^A \rangle, \langle c_1^A, c_2^A, \ldots, c_n^A \rangle, f_A \right),$$
$$\mathcal{B} = \left( \langle r_1^B, r_2^B, \ldots, r_m^B \rangle, \langle c_1^B, c_2^B, \ldots, c_n^B \rangle, f_B \right)$$
for row elements $r_i^A \in R_A$ and $r_i^B \in R_B$ for all $i = 1, 2, \ldots, m$, column elements $c_j^A \in C_A$ and $c_j^B \in C_B$ for all $j = 1, 2, \ldots, n$, and interaction functions $f_A : R_A \times C_A \to \mathbb{R}$ and $f_B : R_B \times C_B \to \mathbb{R}$ for some row element sets $R_A$ and $R_B$ and column element sets $C_A$ and $C_B$.

Then we can find common row and column elements for $\mathcal{A}$ and $\mathcal{B}$ by choosing
$$\tilde{r} = \langle (r_1^A, r_1^B), (r_2^A, r_2^B), \ldots, (r_m^A, r_m^B) \rangle,$$
$$\tilde{c} = \langle (c_1^A, c_1^B), (c_2^A, c_2^B), \ldots, (c_n^A, c_n^B) \rangle,$$

and subsequently replace $\mathcal{A}$ and $\mathcal{B}$ by their harmonized versions
$$\tilde{\mathcal{A}} = (\tilde{r}, \tilde{c}, \tilde{f}_A), \qquad \tilde{\mathcal{B}} = (\tilde{r}, \tilde{c}, \tilde{f}_B),$$
with $\tilde{f}_A, \tilde{f}_B : (R_A \times R_B) \times (C_A \times C_B) \to \mathbb{R}$ and
$$\tilde{f}_A((r^A, r^B), (c^A, c^B)) = f_A(r^A, c^A),$$
$$\tilde{f}_B((r^A, r^B), (c^A, c^B)) = f_B(r^B, c^B).$$

Proof. Using $\tilde{\mathcal{A}}$ and $\tilde{\mathcal{B}}$ we can reconstruct the elements of A and B as
$$(\tilde{\mathcal{A}})_{ij} = \tilde{f}_A(\tilde{r}_i, \tilde{c}_j) = \tilde{f}_A\left((r_i^A, r_i^B), (c_j^A, c_j^B)\right) = f_A(r_i^A, c_j^A) = (A)_{ij},$$
$$(\tilde{\mathcal{B}})_{ij} = \tilde{f}_B(\tilde{r}_i, \tilde{c}_j) = \tilde{f}_B\left((r_i^A, r_i^B), (c_j^A, c_j^B)\right) = f_B(r_i^B, c_j^B) = (B)_{ij}$$
for all $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.

Now we can finally represent each system matrix of the local BEM formulation as a single interaction matrix, and we could start implementing operations on said matrices. Moreover, we can use the presented methods for any linear system whose system matrix can be represented as a linear combination of interaction matrices, as long as their dimensions match. The harmonization process could, in concept, be applied recursively to harmonize an arbitrary number of interaction matrices of the same size. However, every additional matrix adds another set of row and column elements to the representation. While this does not affect the asymptotic memory requirement, it further diminishes the memory savings effect of the implicit representation in practice.
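The harmonization process from Theorem 5.2.3 can be sketched in the same spirit as the previous closure-based example: the row and column elements of two representations are paired element-wise, and each interaction function is lifted to operate only on its own half of each pair. Again, this is a hypothetical illustration with names of our own choosing.

# Sketch of Theorem 5.2.3: common row/column elements via element-wise pairing.
harmonize(elemsA, elemsB) = collect(zip(elemsA, elemsB))

lift_first(f)  = (r, c) -> f(r[1], c[1])   # f̃_A((r^A,r^B),(c^A,c^B)) = f_A(r^A,c^A)
lift_second(f) = (r, c) -> f(r[2], c[2])   # f̃_B((r^A,r^B),(c^A,c^B)) = f_B(r^B,c^B)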

At this point it is worthwhile to discuss whether or not a system matrix representation in the form of a single interaction matrix is generally beneficial. After all, we could attempt to provide custom implicit matrices for each individual linear system. This would of course yield the most flexible optimization options as each representation can be tailored to the particular system and available resources. The interaction matrix approach, on the other hand, represents a unified framework for this kind of system matrices, requires less boilerplate code, and thus implies less error-prone implementations and facilitated maintainability.

Providing a fixed representation for all system matrices might also not be an optimal solution with regard to computational efficiency. This heavily depends on the structure of the systems and operations of interest. In the case of iterative solvers for the BEM systems of the protein electrostatics problem, we can indeed target matrix-vector products for performance optimizations. Let us assume the existence of an abstraction layer for the system matrix in Eq. (5.3), which is not necessarily an interaction matrix but still a single matrix. Then we can once again exploit the distributive property of the matrix-vector product and reformulate

$$\left[\frac{1}{2}\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) I + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right) K\right] x \qquad (5.5)$$
as
$$\frac{1}{2}\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) x + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right) K x \qquad (5.6)$$

for some vector x without losing the desired property of a single system matrix. More specifically, the abstraction layer would encapsulate Eq. (5.6) and present itself as a single matrix. This would eliminate the identity matrix completely in the process. Also, we no longer require additional space for the row and column elements of the identity matrix in this representation and can use the original interaction matrix $\mathcal{K}$ for K without harmonization. Now, we have simplified the previous matrix-vector product such that we can solve the local problem with implicit matrices by providing an implementation for $Vx$ and $Kx$ – or, more precisely, $\mathcal{V}x$ and $\mathcal{K}x$ – as well as a custom abstraction layer for the decomposition shown in Eq. (5.6). As we will see later in this chapter, this can be elegantly achieved in both Julia and Impala.

Theorem 5.2.4. Matrix-free representation of the discretized local BEM systems

Let $\mathcal{V}x$ and $\mathcal{K}x$ denote the matrix-free versions of the matrix-vector products $Vx$ and $Kx$ for a given vector x, the potential matrices V and K of the discretized local BEM systems in Eqs. (5.3) and (5.4), and their interaction matrix representations $\mathcal{V}$ and $\mathcal{K}$. Then we can express the systems as their matrix-free counterparts in terms of matrix-free products:

$$\frac{1}{2}\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) \hat{u} + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right) \mathcal{K}\hat{u} = \mathcal{K}\hat{u}_{mol} - \frac{1}{2}\hat{u}_{mol} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\mathcal{V}\hat{q}_{mol},$$
$$\mathcal{V}\hat{q} = \frac{1}{2}\hat{u} + \mathcal{K}\hat{u}.$$

Although we ultimately settle on a representation that is not purely based on interaction matrices as a unifying framework, we still consider it a good compromise of both extremes of representation approaches discussed above. Providing a minimal abstraction layer for the system matrix of Eq. (5.3) allows us both to maintain the general form Ax = b of the system and to reduce its matrix-free representation to two distinct and reusable matrix-vector products.

5.2.2 Matrix-free system for the nonlocal cavity model

Let us now have a look at the discretized nonlocal BEM system previously presented in Eqs. (3.23) and (3.24):

$$\begin{bmatrix} \frac{1}{2} I - K^R - K & \left(\frac{\varepsilon_\Omega}{\varepsilon_\infty} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) V^R + \frac{\varepsilon_\Omega}{\varepsilon_\infty} V & \frac{\varepsilon_\infty}{\varepsilon_\Sigma} K^R \\[4pt] \frac{1}{2} I + K & -V & 0 \\[4pt] 0 & \frac{\varepsilon_\Omega}{\varepsilon_\infty} V & \frac{1}{2} I - K \end{bmatrix} \begin{bmatrix} \hat{u} \\ \hat{q} \\ \hat{w} \end{bmatrix} = \begin{bmatrix} \beta \\ 0 \\ 0 \end{bmatrix} \qquad (5.7)$$

where β is defined as

$$\beta := K\hat{u}_{mol} + \left(1 - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) K^R \hat{u}_{mol} - \frac{1}{2}\hat{u}_{mol} - \frac{\varepsilon_\Omega}{\varepsilon_\infty} V\hat{q}_{mol} + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - \frac{\varepsilon_\Omega}{\varepsilon_\infty}\right) V^R \hat{q}_{mol} \qquad (5.8)$$

Although it is composed of similar components, there is one apparent difference to the local system: the system matrix is no longer a simple linear combination of interaction matrices but rather a 3 × 3 block matrix where each component more or less appears to be a linear combination of interaction matrices². By using the techniques from the last section, we can indeed confirm this suspicion for all but two components, namely, the null matrices 0 in the second and third row of the matrix. For these null matrices, we can choose a constant interaction function f(r, c) = 0 as well as row and column elements of the appropriate dimensions – but apart from that with arbitrary row and column element sets – to obtain corresponding interaction matrices. What is more, using the harmonization technique from Theorem 5.2.3, we can represent each block matrix component with the same row and column elements, motivating an attempt at creating an interaction matrix representation for block matrices.

Theorem 5.2.5. Interaction block matrices

Let
$$\widetilde{M} = \begin{bmatrix} M_{11} & \cdots & M_{1q} \\ \vdots & \ddots & \vdots \\ M_{p1} & \cdots & M_{pq} \end{bmatrix}$$
be a p × q block matrix with components $M_{kl} \in \mathbb{R}^{m \times n}$, corresponding interaction matrix representations
$$\mathcal{M}_{kl} = (\langle r_1, r_2, \ldots, r_m \rangle, \langle c_1, c_2, \ldots, c_n \rangle, f_{kl}),$$
and row and column element sets $R$ and $C$, respectively. Then we can define new row elements as a series of length mp by duplicating the original row elements p times and enumerating them as
$$\tilde{r} = \langle (1, r_1), (2, r_2), \ldots, (m, r_m), (m + 1, r_1), \ldots, (mp, r_m) \rangle.$$
Similarly, we can define new column elements as a series of length nq by duplicating the original column elements q times and enumerating them as
$$\tilde{c} = \langle (1, c_1), (2, c_2), \ldots, (n, c_n), (n + 1, c_1), \ldots, (nq, c_n) \rangle.$$

²The local systems could naturally also be represented by a single block matrix with similar properties.

Finally, we define a new interaction function $\tilde{f} : (\mathbb{Z} \times R) \times (\mathbb{Z} \times C) \to \mathbb{R}$ as
$$\tilde{f}((i, r), (j, c)) = \sum_{k=1}^{p} \sum_{l=1}^{q} \delta\left((k - 1)q + l, \, g_{mnq}(i, j)\right) f_{kl}(r, c)$$
with $\delta(i, j) = \delta_{ij}$ and
$$g_{mnq}(i, j) = \left(\left\lceil \frac{i}{m} \right\rceil - 1\right) q + \left\lceil \frac{j}{n} \right\rceil$$
to represent $\widetilde{M}$ as interaction matrix $\widetilde{\mathcal{M}} = (\tilde{r}, \tilde{c}, \tilde{f})$.

Proof. First, it should be noted that

$$\mathcal{B} = (\langle 1, 2, \ldots, mp \rangle, \langle 1, 2, \ldots, nq \rangle, g_{mnq})$$
represents an interaction matrix for a matrix B with the same dimensions as $\widetilde{M}$, and its elements correspond to the linear indices of the individual blocks of $\widetilde{M}$ in row-major order. For m = n = p = q = 2, for example, we obtain
$$\widetilde{M} = \begin{bmatrix} M_1 & M_2 \\ M_3 & M_4 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 1 & 2 & 2 \\ 1 & 1 & 2 & 2 \\ 3 & 3 & 4 & 4 \\ 3 & 3 & 4 & 4 \end{bmatrix},$$

where the block matrix components have also been indexed linearly to make the connection more apparent. Second, owing to the replication of the original row and column elements, a direct correspondence between the new and the original elements is given via

$$\tilde{r}_i = (i, r_{((i-1) \bmod m) + 1}),$$
$$\tilde{c}_j = (j, c_{((j-1) \bmod n) + 1})$$
for all $i = 1, 2, \ldots, mp$ and $j = 1, 2, \ldots, nq$. For the sake of notational simplicity, we also use i and j to refer to the original row and column elements in this proof:
$$r_i := r_{((i-1) \bmod m) + 1},$$
$$c_j := c_{((j-1) \bmod n) + 1}.$$

The interaction function $\tilde{f}$ is the sum of all individual interaction functions $f_{kl}$, each multiplied by a coefficient determined by the Kronecker delta function (i.e., 0 or 1). The first argument to the delta function computes the linear component index from k and l, while the second argument computes the same index from i and j. By design, there is exactly one original interaction function for which the delta function results in a value of one, namely, the interaction function corresponding to the matrix block (k, l) where the index pair (i, j) is located.

We can confirm this by having a closer look at the interaction function $\tilde{f}$. For any given block matrix $\widetilde{M}$ with the abovementioned dimensions and components $M_{kl}$, the corresponding component for any element $(\widetilde{M})_{ij}$ can be retrieved via
$$k = \left\lceil \frac{i}{m} \right\rceil \quad \wedge \quad l = \left\lceil \frac{j}{n} \right\rceil.$$

When inserting these equations into the definition of $g_{mnq}$, we find that the sum in $\tilde{f}$ (when ignoring $f_{kl}$ for a moment) reduces to a value of one. Indeed,
$$\sum_{\tilde{k}=1}^{p} \sum_{\tilde{l}=1}^{q} \delta\left((\tilde{k} - 1)q + \tilde{l}, \, g_{mnq}(i, j)\right) = \sum_{\tilde{k}=1}^{p} \sum_{\tilde{l}=1}^{q} \delta\left((\tilde{k} - 1)q + \tilde{l}, \, \left(\left\lceil \tfrac{i}{m} \right\rceil - 1\right) q + \left\lceil \tfrac{j}{n} \right\rceil\right) = \sum_{\tilde{k}=1}^{p} \sum_{\tilde{l}=1}^{q} \delta\left((\tilde{k} - 1)q + \tilde{l}, \, (k - 1) q + l\right) = 1 + \underbrace{\sum_{\substack{\tilde{k}, \tilde{l} \\ (\tilde{k}, \tilde{l}) \neq (k, l)}} \delta\left((\tilde{k} - 1)q + \tilde{l}, \, (k - 1) q + l\right)}_{=0}$$

for any valid index pair (i, j), confirming that only one delta function contributes to the sum. As expected, this is the delta function corresponding to the interaction function $f_{kl}$ of component $M_{kl}$, i.e., the block in which the index pair (i, j) is located.

As a consequence, we can reconstruct each element of $\widetilde{M}$ from its interaction matrix $\widetilde{\mathcal{M}}$ as
$$(\widetilde{M})_{ij} = \tilde{f}(\tilde{r}_i, \tilde{c}_j) = f_{kl}(r_i, c_j)$$
for the corresponding $f_{kl}$ and all $i = 1, 2, \ldots, mp$ and $j = 1, 2, \ldots, nq$, which is the element of $M_{kl}$ in row $((i - 1) \bmod m) + 1$ and column $((j - 1) \bmod n) + 1$.

The same considerations as in the local case regarding the restriction to single interaction matrices (rather than more general representations) also apply here. Assuming once more that we only require an implementation of a matrix-vector product to solve the nonlocal BEM system, we can write the product

$$\begin{bmatrix} \frac{1}{2} I - K^R - K & \left(\frac{\varepsilon_\Omega}{\varepsilon_\infty} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) V^R + \frac{\varepsilon_\Omega}{\varepsilon_\infty} V & \frac{\varepsilon_\infty}{\varepsilon_\Sigma} K^R \\[4pt] \frac{1}{2} I + K & -V & 0 \\[4pt] 0 & \frac{\varepsilon_\Omega}{\varepsilon_\infty} V & \frac{1}{2} I - K \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

for a given block vector $[x_1, x_2, x_3]^T$ as

$$\begin{bmatrix} K^R\left(\frac{\varepsilon_\infty}{\varepsilon_\Sigma} x_3 - x_1\right) - K x_1 + \left(\frac{\varepsilon_\Omega}{\varepsilon_\infty} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right) V^R x_2 + \frac{\varepsilon_\Omega}{\varepsilon_\infty} V x_2 + \frac{1}{2} x_1 \\[4pt] K x_1 - V x_2 + \frac{1}{2} x_1 \\[4pt] \frac{\varepsilon_\Omega}{\varepsilon_\infty} V x_2 - K x_3 + \frac{1}{2} x_3 \end{bmatrix}.$$

This representation has two valuable advantages for this particular system: For one, we can eliminate the null matrices and identity matrices from the product altogether, meaning that they neither have to be represented in memory nor taken into account when performing the actual computation. On platforms where the multiplication 0x cannot be optimized away, the latter is a huge benefit. Secondly, we can see that some matrix-vector products are used multiple times in different components of the result vector. Here we also benefit from treating the products individually and reusing them whenever possible, especially when the interaction functions are expensive to compute.

Theorem 5.2.6. Matrix-free representation of the discretized nonlocal BEM system

Let $\mathcal{V}x$, $\mathcal{K}x$, $\mathcal{V}^Rx$, and $\mathcal{K}^Rx$ denote the matrix-free versions of the matrix-vector products $Vx$, $Kx$, $V^Rx$, and $K^Rx$ for a given vector x, the potential matrices V, K, $V^R$, and $K^R$ of the discretized nonlocal BEM systems in Eqs. (5.7) and (5.8), and their interaction matrix representations $\mathcal{V}$, $\mathcal{K}$, $\mathcal{V}^R$, and $\mathcal{K}^R$. Then we can express the system in terms of matrix-free products:

$$\begin{bmatrix} \mathcal{K}^R\left(\frac{\varepsilon_\infty}{\varepsilon_\Sigma}\hat{w} - \hat{u}\right) - \mathcal{K}\hat{u} + \left(\frac{\varepsilon_\Omega}{\varepsilon_\infty} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right)\mathcal{V}^R\hat{q} + \frac{\varepsilon_\Omega}{\varepsilon_\infty}\mathcal{V}\hat{q} + \frac{1}{2}\hat{u} \\[4pt] \mathcal{K}\hat{u} - \mathcal{V}\hat{q} + \frac{1}{2}\hat{u} \\[4pt] \frac{\varepsilon_\Omega}{\varepsilon_\infty}\mathcal{V}\hat{q} - \mathcal{K}\hat{w} + \frac{1}{2}\hat{w} \end{bmatrix} = \begin{bmatrix} \beta \\ 0 \\ 0 \end{bmatrix},$$

where
$$\beta := \mathcal{K}\hat{u}_{mol} + \left(1 - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right)\mathcal{K}^R\hat{u}_{mol} - \frac{1}{2}\hat{u}_{mol} - \frac{\varepsilon_\Omega}{\varepsilon_\infty}\mathcal{V}\hat{q}_{mol} + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - \frac{\varepsilon_\Omega}{\varepsilon_\infty}\right)\mathcal{V}^R\hat{q}_{mol}.$$

With this optimized representation, we can solve the nonlocal problem by providing matrix-vector product implementations for $\mathcal{V}x$, $\mathcal{K}x$, $\mathcal{V}^Rx$, and $\mathcal{K}^Rx$, requiring a minimum of five different products computed in total (two for $\mathcal{K}\hat{u}$ and $\mathcal{K}\hat{w}$ and one for each of the other matrices). This requirement can be further reduced by concatenating $\hat{u}$ and $\hat{w}$ horizontally and providing a regular matrix product implementation for $\mathcal{K}$.
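The reuse of products can be made explicit in code. The following sketch assumes the four matrix-free products are available as callables – Vprod, Kprod, VRprod, and KRprod are placeholder names of our own choosing – and evaluates the nonlocal matrix-vector product from Theorem 5.2.6 with exactly five product evaluations:

# Sketch of the matrix-free nonlocal product for the block vector (û, q̂, ŵ).
function nonlocal_matvec(u, q, w, εΩ, εΣ, ε∞, Vprod, Kprod, VRprod, KRprod)
    Ku = Kprod(u)    # reused in the first and second block
    Vq = Vprod(q)    # reused in all three blocks
    y1 = KRprod(ε∞ / εΣ .* w .- u) .- Ku .+
         (εΩ / ε∞ - εΩ / εΣ) .* VRprod(q) .+ εΩ / ε∞ .* Vq .+ 0.5 .* u
    y2 = Ku .- Vq .+ 0.5 .* u
    y3 = εΩ / ε∞ .* Vq .- Kprod(w) .+ 0.5 .* w
    vcat(y1, y2, y3)
end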

5.2.3 Post-processing with implicit representations

In the discretized version of the representation formulas presented in Section 3.4, we have encountered the single and double layer potentials $\tilde{V}_\xi$, $W_\xi$, $\tilde{V}^R_\xi$, and $W^R_\xi$ for a given load point $\xi$:

$$\varphi_\Omega(\xi) \approx \tilde{V}_\xi \hat{q} - W_\xi \hat{u} + \varphi_{mol}(\xi), \qquad \xi \in \Omega$$
$$\varphi_\Sigma^{local}(\xi) \approx W_\xi(\hat{u} + \hat{u}_{mol}) - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma} \tilde{V}_\xi(\hat{q} + \hat{q}_{mol}), \qquad \xi \in \Sigma$$
$$\varphi_\Sigma^{nonlocal}(\xi) \approx \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - \frac{\varepsilon_\Omega}{\varepsilon_\infty}\right) \tilde{V}^R_\xi(\hat{q} + \hat{q}_{mol}) - \frac{\varepsilon_\Omega}{\varepsilon_\infty} \tilde{V}_\xi(\hat{q} + \hat{q}_{mol}) + W^R_\xi\left(\hat{u} - \frac{\varepsilon_\infty}{\varepsilon_\Sigma}\hat{w} + \left(1 - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right)\hat{u}_{mol}\right) + W_\xi(\hat{u} + \hat{u}_{mol}), \qquad \xi \in \Sigma$$

For these post-processors, $\xi$ represents the position in space for which the corresponding potential should be computed and is either located in the protein domain Ω or the surrounding solvent Σ. Thus, for any given $\xi$, the entities $\tilde{V}_\xi$, $W_\xi$, $\tilde{V}^R_\xi$, and $W^R_\xi$ take the form of row vectors or, equivalently, matrices of size 1 × n, which are not the main target of our work and can be represented explicitly in most cases. Alternatively, we could as well use our interaction matrices to achieve the same goal. There would be no considerable benefit in terms of memory savings but since we need each vector element at most once (and have to recompute the potentials for every change in $\xi$ in any case), there would be no inherent computational overhead in doing so either.

In fact, $\mathcal{V}$, $\mathcal{K}$, $\mathcal{V}^R$, and $\mathcal{K}^R$ from the last section already provide everything we need to construct interaction matrices for $\tilde{V}$, $W$, $\tilde{V}^R$, and $W^R$ except for the new load point $\xi$. Hence, we can reuse the same column elements and interaction functions of these matrices for the post-processors. What is more, the interaction matrix representation still holds if we decide to compute the same interior or exterior potentials for different load points at the same time. In this case, we can interpret the list of load points as the new row elements of the interaction matrices and use the same post-processors to compute all potentials of interest at the same time, supported by the matrix-vector product implementation we have provided for solving the corresponding BEM systems.

We can find this exact use case among the post-processors in the form of the reaction field energies, where the reaction field potentials ϕ∗ are computed at the positions of the partial point charges in the system:

$$W_{ppc}^{rf} \approx \sum_{i=1}^{N} \zeta_i \left\{ \tilde{V}_{r_i} \hat{q} - W_{r_i} \hat{u} \right\}.$$

Constructing matrices out of all individual row vectors $\tilde{V}_{r_i}$ and $W_{r_i}$ as outlined above, we can compute the reaction field energy of a given system as a single dot product:

$$W_{ppc}^{rf} \approx \underbrace{\begin{bmatrix} \zeta_1 \\ \vdots \\ \zeta_N \end{bmatrix}}_{=:\zeta} \cdot \left( \underbrace{\begin{bmatrix} \tilde{V}_{r_1} \\ \vdots \\ \tilde{V}_{r_N} \end{bmatrix}}_{=:\tilde{V}_\zeta} \hat{q} - \underbrace{\begin{bmatrix} W_{r_1} \\ \vdots \\ W_{r_N} \end{bmatrix}}_{=:W_\zeta} \hat{u} \right),$$

where $\tilde{V}_\zeta$ and $W_\zeta$ are matrices of size N × n and can be represented as interaction matrices $\tilde{\mathcal{V}}_\zeta$ and $\mathcal{W}_\zeta$ with the same column elements and interaction functions as $\mathcal{V}$ and $\mathcal{K}$, respectively. Using $\tilde{\mathcal{V}}_\zeta$ and $\mathcal{W}_\zeta$, we can bring the approximation of the reaction field energy into matrix-free form:

$$W_{ppc}^{rf} \approx \zeta \cdot \left( \tilde{\mathcal{V}}_\zeta \hat{q} - \mathcal{W}_\zeta \hat{u} \right).$$
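Assuming the products of $\tilde{\mathcal{V}}_\zeta$ and $\mathcal{W}_\zeta$ with a vector are available as callables (the names below are placeholders of our own, not part of the implementation), the reaction field energy reduces to a single dot product in code as well:

using LinearAlgebra

# Sketch: matrix-free reaction field energy from the N×n post-processor
# matrices, given as products Vζprod(q) and Wζprod(u) for the discretized
# potentials q and u.
reaction_field_energy(ζ, q, u, Vζprod, Wζprod) =
    dot(ζ, Vζprod(q) .- Wζprod(u))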

5.3 Implicit arrays for Julia

Working with arrays of any dimension in Julia boils down to the abstract type AbstractArray. While abstract types in Julia cannot be instantiated or have their own member variables and functions, they serve as nodes in the type hierarchy graph. As such, AbstractArray is meant to be used as direct or indirect base type of any array-like type in the language. In its full form, AbstractArray{T,N} carries two type variables T and N for the type of its elements and the number of array dimensions, respectively, so that functions and other types can be specialized for

arrays of specific dimensions, element types, or both. As a matter of fact, most array-based functions in the Julia standard library are defined generically, enabling them to be used easily with different specialized array types. This is a very strong feature of the language, because it allows functionality and data types to be developed independently, and we exploit this option for our implicit matrix representations regularly throughout this section.

The Julia language already ships with a number of different array types, and many implicit ones in particular, including sparse arrays, reshaped arrays, and adjoint matrices. Many more examples are available from third parties through Julia's package system, and there is virtually no limitation on the number of additional array types that could be added to the language. While there is no strict requirement for a new array type to be declared a subtype of AbstractArray, doing so opens the way to using generically defined standard and third-party array functions without considerable caveats. The Julia documentation³ lists a set of functions as a common array interface that should be provided for subtypes of AbstractArray. Again, there is no real way of enforcing this requirement, but many existing array operations rely on it and will eventually fail if these functions are missing.

In this section, we show the basic principles of creating new Julia array types. We start with arrays of fixed values as a simple introductory example and then focus on interaction matrices and other array types that we have encountered throughout this chapter. It should be noted that, although valid, all of the code presented here has been significantly simplified to emphasize the key properties of the respective implementation. The main features not shown include bounds checks, convenience constructor overloads, setter functions, and conventional error handling. We provide the full implementations for most of the array types presented in this section in the form of ImplicitArrays.jl, a free and open-source Julia package available online⁴.

5.3.1 Fixed-value arrays

One of the simplest array types to be represented implicitly is one that provides the exact same value for each of its elements. We have encountered a particular specimen of this category in the form of null matrices in the nonlocal BEM formulation already (cf. Section 5.2.2). The only information we need to store for such a matrix type is its dimensions. The represented value would then be provided by the getter function of our type.

³https://docs.julialang.org/en/v1/manual/interfaces
⁴https://github.com/tkemmer/implicitarrays.jl

Listing 5.1 – Minimal working example for an implicit null matrix in Julia

# Declare a type NullMatrix as a two-dimensional integer array
struct NullMatrix <: AbstractArray{Int, 2}
    dims::NTuple{2, Int}
end

# Implement the standard size function and getter
Base.size(A::NullMatrix) = A.dims
Base.getindex(::NullMatrix, ::Int, ::Int) = 0

Listing 5.2 – Using the null matrix type with standard functions in Julia

julia> A = NullMatrix((2, 20))
2×20 NullMatrix:
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

julia> A * ones(Int, 20, 5)
2×5 Array{Int64, 2}:
 0  0  0  0  0
 0  0  0  0  0

julia> length(NullMatrix((1_000_000_000, 1_000_000_000)))
1000000000000000000

julia> NullMatrix((1_000_000_000, 1)) * ones(Int, 1, 1_000_000_000)
ERROR: OutOfMemoryError()

A minimal working example of such a NullMatrix type is shown in Listing 5.1, where we first declare our type to be a subtype of a two-dimensional AbstractArray with Int elements. NullMatrix has a single member variable dims to store the matrix dimensions as a 2-tuple of integers. At this point we would already be able to create NullMatrix objects that would be recognized as some sort of array by other functions. However, our type still lacks implementations for crucial interface functions. Firstly, we need to provide an overload for the size function of Julia's Base module, which is the common way to determine the dimensions of any given array. Secondly, we need to give access to the matrix elements. So far, there is no notion of the value represented by the NullMatrix type in the code other than its name. Getters are generally defined through the getindex function of the same Base module. The third interface function that is considered mandatory is a corresponding setter (via Base.setindex!). Since the null matrix is by definition a read-only type, we do not provide such an implementation⁵.

Now we are finally able to construct NullMatrix objects of different dimensions and use them just like any other matrix in Julia. The language provides a default constructor for each type that expects one object per member variable. In the case of NullMatrix, this would consequently be a single 2-tuple for the dimensions.

⁵In practice, we would make this function throw an error with a suitable message.

Listing 5.3 – Custom product operator for null matrices in Julia

function Base.:*(A::NullMatrix, B::NullMatrix)
    @assert size(A, 2) == size(B, 1) "dimension mismatch!"
    NullMatrix((size(A, 1), size(B, 2)))
end

Base.:*(A::NullMatrix, B::AbstractArray{Int, 2}) = A * NullMatrix(size(B))
Base.:*(A::AbstractArray{Int, 2}, B::NullMatrix) = NullMatrix(size(A)) * B

Once instantiated, our null matrices can be used with most standard operations designed for array types. A few usage examples are shown in Listing 5.2: The second example represents a matrix product of a 2 × 20 null matrix and a 20 × 5 matrix of ones. The third usage example computes the length of the given null matrix by internally calling our custom size implementation for the NullMatrix type. As evident from the same example, we are now able to represent huge null matrices, with their dimensions solely being limited by the choice of data type (e.g., signed 64-bit integers in this case). An explicit representation of the same 10⁹ × 10⁹ null matrix using regular Julia arrays would require around 7 exabytes of memory while the implicit representation only requires a constant 16 bytes⁶.

One important thing to consider when designing implicit representations is that most matrix operators in Julia return regular arrays. This can be seen in the second usage example, where the result is small enough to fit easily into memory in its explicit representation. However, this might not always be the case and result in OutOfMemoryErrors like in the fourth usage example. What is more, since the matrix product of a null matrix can be trivially represented by yet another null matrix (with appropriate dimensions), we might want to provide a custom implementation for this operator. As shown in Listing 5.3, we first define a matrix product for two null matrices⁷. In a second step, we provide overloads for the matrix product where only one argument is a null matrix. By means of these custom implementations, the previous matrix products can now be performed efficiently and represented in constant memory (Listing 5.4).

Of course, the NullMatrix type is not optimal from a software design perspective, as we have fixed the represented value along with its type and the number of array dimensions. For any change in these properties, we would require a new type definition with only a few potential differences in their respective interface implementations. Hence, we could try to generalize our NullMatrix implementation in two ways.

⁶This requirement accounts for the tuple of integers representing the matrix dimensions and is thus completely independent of the actual matrix size.
⁷Although the binary infix multiplication operator is called * in Julia, it has to be referred to by :* in its fully-qualified name to prevent syntax ambiguities with its vectorized version, the .* operator.

Listing 5.4 – Using the null matrix product in Julia

julia> NullMatrix((2, 20)) * ones(Int, 20, 5)
2×5 NullMatrix:
 0  0  0  0  0
 0  0  0  0  0

julia> ones(Int, 2, 20) * NullMatrix((20, 5))
2×5 NullMatrix:
 0  0  0  0  0
 0  0  0  0  0

julia> NullMatrix((1_000_000_000, 1)) * ones(Int, 1, 1_000_000_000)
1000000000×1000000000 NullMatrix:
[...]

Listing 5.5 – Minimal working example for a fixed-value array type in Julia

# The array can now hold a value of any type and have any dimensions
struct FixedValueArray{T, N} <: AbstractArray{T, N}
    val::T
    dims::NTuple{N, Int}
end

Base.size(A::FixedValueArray{T, N}) where {T, N} = A.dims
Base.getindex(A::FixedValueArray{T, N}, ::Vararg{Int, N}) where {T, N} = A.val

First, we could allow the matrix to hold any value other than zero, without restricting the element type⁸. Second, we allow any number of array dimensions. For this new, more generic type, we can make use of the same abstract base type, but this time we do not fix its type variables but keep them to be set by the user. Another minimal working example for such a FixedValueArray type is shown in Listing 5.5. The main difference to our previous implementation is that we now have to store the represented value (with generic type T) alongside the array dimensions and use a generic N-tuple instead of a 2-tuple for the latter. The interface functions also look similar to before, although we now have to make them generic functions with declared type variables T and N. A few usage examples with different element types for FixedValueArray can be found in Listing 5.6.

5.3.2 Interaction matrices

The basic approach to providing an interaction matrix type in Julia is again very similar to the null matrices and fixed-value arrays from the last section, since we are implementing the same abstract array interface. At the same time, the

⁸This generalization would be at the cost of our custom null matrix products. If such a kind of operator optimization is considered beneficial in the particular use case, the following generalizations could always be provided in addition to a suitable "null array" type.

Listing 5.6 – Examples for fixed-value arrays with different element types in Julia

julia> FixedValueArray(0, (2, 5))
2×5 FixedValueArray{Int64, 2}:
 0  0  0  0  0
 0  0  0  0  0

julia> FixedValueArray("(^_^) Hi!", (1, 2))
1×2 FixedValueArray{String, 2}:
 "(^_^) Hi!"  "(^_^) Hi!"

julia> FixedValueArray(FixedValueArray(5.0, (1, 2)), (1, 3))
1×3 FixedValueArray{FixedValueArray{Float64, 2}, 2}:
 [5.0 5.0]  [5.0 5.0]  [5.0 5.0]

Listing 5.7 – Minimal working example for an interaction matrix type in Julia

struct InteractionMatrix{T, R, C, F <: Function} <: AbstractArray{T, 2}
    rowelems::Vector{R}
    colelems::Vector{C}
    interact::F

    # constructor with a single type variable
    InteractionMatrix{T}(
        rowelems::Vector{R},
        colelems::Vector{C},
        interact::F
    ) where {T, R, C, F} = new{T, R, C, F}(rowelems, colelems, interact)
end

function Base.size(A::InteractionMatrix{T}) where T
    (length(A.rowelems), length(A.colelems))
end

function Base.getindex(A::InteractionMatrix{T}, i::Int, j::Int) where T
    A.interact(A.rowelems[i], A.colelems[j])::T
end

implementation should bear a close resemblance to the implicit interaction matrix representation we have specified in Definition 5.2.1. This implies that our InteractionMatrix type should somehow capture the row and column elements as well as the interaction function. The former two can be considered one-dimensional arrays of (potentially) different element types and lengths and can programmatically be represented using Julia's standard Vector type. Fortunately, functions are first-class citizens in Julia, thus the interaction function can also be treated like an object and captured as a member.

These observations lead to a rather intuitive translation to Julia code, as shown in Listing 5.7 (when ignoring the constructor for a moment). InteractionMatrix has four type variables to represent its element type T, the row element type R, the column element type C and the interaction function type F (which is constrained to be a subtype of Function). It should be noted that we could have used Function as the type of the interact member directly and, as a consequence, made the type

Listing 5.8 – Examples for different interaction matrices in Julia

julia> InteractionMatrix{Int}([1, 2, 3], [1, 2, 3], +)
3×3 InteractionMatrix{Int64, Int64, Int64, typeof(+)}:
 2  3  4
 3  4  5
 4  5  6

julia> InteractionMatrix{Int}([1, 2, 3], [1, 2, 3], (r, c) -> Int(r == c))
3×3 InteractionMatrix{Int64, Int64, Int64, var"#3#4"}:
 1  0  0
 0  1  0
 0  0  1

variable F obsolete. However, this would result in interaction matrices with the same arguments for T, R, and C to be considered of the same concrete type, thus restricting the compiler’s ability to specialize the matrices and, more importantly, any operations on interaction matrices, for different interaction functions. Without this static type information, many decisions regarding the handling of the matrices would need to be made at runtime rather than at compile time, impairing the theoretically achievable performance. This is the main reason why abstract member variables should be generally avoided in Julia.

Although not strictly necessary, we have equipped InteractionMatrix with a custom constructor. In most cases, the type arguments are automatically inferred from the constructor arguments and can be omitted from the constructor call. Unfortunately, this is not possible here since the type variable T does not appear in any member variable whatsoever, so the compiler cannot know from the constructor arguments alone what concrete type T is supposed to be. Our custom constructor solves this problem by allowing us to specify T explicitly when called. Otherwise, we would have needed to provide all four type arguments explicitly, causing another problem as we do not necessarily know the concrete type of the provided interaction function. As can be seen in the usage examples in Listing 5.8, we can pass pre-defined operators or lambda expressions to the constructor, such as + or an anonymous function representing the Kronecker delta, with two parameters r and c. This usually results in cryptic types inferred by the compiler (e.g., typeof(+) and var"#3#4"). Specifying F as Function would work around this issue but result in the same situation as described above, restricting specialization.

Another peculiarity introduced in Listing 5.7 is an explicit type annotation ::T on the value returned by the getter, i.e., the result of the interaction function evaluated for a specific pair of row and column elements. This is part of a larger problem of our current type design. More specifically, the requirements on the interaction functions are solely imposed by their only consumer, that is, the getter implementation: the function needs to accept two arguments of types R and C.

Listing 5.9 – Type-safe function objects in Julia

abstract type InteractionFunction{R, C, T} <: Function end

struct ConstInterFun{R, C, T} <: InteractionFunction{R, C, T}
    val::T
end
(f::ConstInterFun{R, C, T})(::R, ::C) where {R, C, T} = f.val

Listing 5.10 – Interaction matrices with type-safe interaction functions in Julia

julia> f = ConstInterFun{Int, Int, Int}(0)
(::ConstInterFun{Int64, Int64, Int64}) (generic function with 1 method)

julia> f(1, 2)
0

julia> InteractionMatrix{Int}([1, 2, 3], [1, 2, 3], f)
3×3 InteractionMatrix{Int64, Int64, Int64, ConstInterFun{Int64, Int64, Int64}}:
 0  0  0
 0  0  0
 0  0  0

Without the explicit annotation, the compiler would not know the return type of the interaction function and could not use this knowledge to specialize dependent code. Even more problematic would be the consequential type inconsistency in the case where the interaction function actually returns an object of a different type. This could even happen intermittently, as functions are generally not required to exclusively return objects of the same type. Consumers of our InteractionMatrix type would rightfully expect that the elements obtained through the getter would be of type T (as reported by the type definition) while they might actually be of any other type. In the best case, this could lead to errors resulting in program termination or, in the worst case, lead to faulty results without reported errors, especially when implicit type conversions are involved.

In an ideal world, we would specify all requirements on an interaction function directly in the InteractionMatrix type definition (i.e., the function should represent a map R × C → T) and these requirements should be enforced by the compiler. Luckily, such a degree of type safety can be achieved in Julia by defining custom function-like types. A concrete example for our use case can be found in Listing 5.9. We start by defining a new abstract type InteractionFunction as a subtype of Function⁹. Each concrete interaction function can then be defined as a new subtype of InteractionFunction and optionally contain additional member variables. The latter property allows us to pass information to the functions that is otherwise not

⁹Specifying Function as a base type does not make a type behave like a function. It only allows us here to use InteractionFunction objects with our previous InteractionMatrix implementation.

directly included in the row and column elements, such as the system constants $\varepsilon_\Omega$ and $\varepsilon_\Sigma$ in the protein electrostatics setting. In the example shown above, we define a constant interaction function ConstInterFun that uses a member variable to represent the constant value, similarly to the FixedValueArray in Section 5.3.1. In order to make our function objects behave like functions, that is, make them callable, we also have to add a special method to the type. This method uses an object instead of an alphanumeric name as an identifier but otherwise works the same way as a regular function. For ConstInterFun we require two arguments of types R and C, respectively, and return the constant value of type T contained in the function object. Now, we can create objects of our new type and use them like any other function (cf. Listing 5.10).

For the real implementation of InteractionMatrix, we would now replace Function in Listing 5.7 by InteractionFunction{R,C,T} in order to restrict interaction functions to their type-safe variants. This way, we do not need the custom constructor any longer, since all type arguments can be inferred from the constructor parameters. Also, we could remove the explicit type annotation from the getter implementation, as the type-safe interaction functions would allow the compiler to infer the return type as well.
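A possible refinement along these lines is sketched below; it assumes the InteractionFunction type from Listing 5.9, replaces the definition from Listing 5.7, and is not the literal implementation shipped with ImplicitArrays.jl:

# Sketch: interaction matrices restricted to type-safe interaction functions.
# All type variables can now be inferred from the constructor arguments, and
# the getter's return type is known to be T without an explicit annotation.
struct InteractionMatrix{T, R, C, F <: InteractionFunction{R, C, T}} <: AbstractArray{T, 2}
    rowelems::Vector{R}
    colelems::Vector{C}
    interact::F
end

Base.size(A::InteractionMatrix) = (length(A.rowelems), length(A.colelems))
Base.getindex(A::InteractionMatrix, i::Int, j::Int) =
    A.interact(A.rowelems[i], A.colelems[j])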

5.3.3 Block matrices

In this section, we present a dedicated block matrix type in Julia rather than its interaction matrix equivalent briefly discussed in Section 5.2.2. While the latter could already be realized using our InteractionMatrix type, this might involve tedious interaction matrix harmonizations and duplications of basically the same row and column elements, depending on the given system. We have seen the implications for the particular case of the nonlocal system matrix in the section referenced above. A dedicated block matrix type shall serve as an alternative approach to the single- matrix problem. In the following, we discuss its properties and limitations with respect to computational overhead.

Listing 5.11 shows a simple working example for our block matrix type. Its basic idea is to represent the matrix as an aggregation of several equally-sized matrices. The block dimensions are stored separately for convenience. In that sense, BlockMatrix is just a wrapper for a matrix of matrices that hides the additional matrix layer, per- forms all index conversions, and maintains invariants of the blocks. The restriction of the block dimensions is mainly used to simplify the implementation, but it also facilitates custom implementations for certain matrix operations. Examples for such

Listing 5.11 – Minimal working example for a generic block matrix in Julia

struct BlockMatrix{T} <: AbstractArray{T, 2}
    bdims::NTuple{2, Int}
    blocks::Matrix{AbstractArray{T, 2}}    # <- caution!
end

Base.size(M::BlockMatrix{T}) where T = M.bdims .* size(M.blocks)

function Base.getindex(M::BlockMatrix{T}, i::Int, j::Int) where T
    Base.getindex(
        M.blocks[cld(i, M.bdims[1]), cld(j, M.bdims[2])],
        (i - 1) % M.bdims[1] + 1,
        (j - 1) % M.bdims[2] + 1
    )
end

operations include a matrix-vector product for a block vector with the same dimensions such that matrix and vector blocks can be multiplied and added separately (a technique which we have generously exploited for the nonlocal BEM system in Section 5.2.2), as sketched below. Other than that, the block matrix could also be implemented for differently sized blocks as long as they eventually align into a rectangular shape.
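For illustration, such a block-wise matrix-vector product for the BlockMatrix type from Listing 5.11 could be sketched as follows; this is our own addition and assumes that the vector length matches the block layout and that every block implements its own product:

# Sketch: block-wise matrix-vector product. Each block contributes its own
# (possibly specialized) product to the corresponding slice of the result.
function Base.:*(M::BlockMatrix{T}, x::AbstractVector{T}) where T
    m, n = M.bdims
    p, q = size(M.blocks)
    y = zeros(T, m * p)
    for k in 1:p, l in 1:q
        xl = view(x, (l - 1) * n + 1 : l * n)
        y[(k - 1) * m + 1 : k * m] .+= M.blocks[k, l] * xl
    end
    y
end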

A key decision is required when designing a block matrix type for Julia regarding whether and how to restrict the type of the matrix blocks. As it turns out, this is not a trivial question. The solution shown in Listing 5.11 does not have such a type restriction, owing to a matrix of abstract arrays as a member. We have seen in the last section that there generally is a considerable performance overhead implied when using member variables of abstract types, because it deprives the compiler of its ability to specialize functions for our type¹⁰. Restricting the matrix blocks to interaction matrices would simplify the individual interaction functions and remove the need to harmonize the matrix blocks. At the same time, aiming at maximal function specialization implies a restriction to the same exact type for all elements. In the case of our InteractionMatrix{T,R,C,F} definition, this requires fixing all of the type arguments as well, including the interaction function. At the end of the day, we would need to solve the same problems we have already encountered when developing an interaction matrix representation for block matrices.

On the other hand, not imposing any restrictions on the matrix blocks (in addition to a fixed size) would allow us to choose the best implicit (or even explicit) repre- sentation for each matrix block. A simple example for a block matrix of different component matrix types is shown in Listing 5.12. In this case, there is no need to represent a null matrix as an interaction matrix with artificial row and column

¹⁰Technically, there is also a general problem with containers of abstract elements as the elements can only be stored by reference in memory, requiring dereferencing on access and thus causing performance issues in certain situations.

Listing 5.12 – Example for a block matrix with elements of different types in Julia

julia> blocks = Matrix{AbstractArray{Int, 2}}(undef, 1, 3)
1×3 Array{AbstractArray{Int64, 2}, 2}:
 #undef  #undef  #undef

julia> blocks[1] = [1 2; 3 4]; blocks[2] = NullMatrix((2, 2));

julia> blocks[3] = InteractionMatrix{Int}([2, 3], [1, 2], *);

julia> BlockMatrix((2, 2), blocks)
2×6 BlockMatrix{Int64}:
 1  2  0  0  2  4
 3  4  0  0  3  6

Listing 5.13 – Minimal working example for a row projection matrix in Julia

struct RowProjectionMatrix{T} <: AbstractArray{T, 2}
    base::AbstractArray{T, 2}
    rows::Vector{Int}
end

function Base.size(M::RowProjectionMatrix{T}) where T
    (length(M.rows), size(M.base, 2))
end

function Base.getindex(M::RowProjectionMatrix{T}, i::Int, j::Int) where T
    Base.getindex(M.base, M.rows[i], j)
end

elements only constructed for the sake of a uniform representation. However, there is no practical way to provide a fully generic yet highly performant block matrix type that does not impose any type restrictions on its blocks in Julia. Overall, the best solution for block matrices does not seem to be a single generic data type but rather a dedicated matrix type tailored to each of the specific systems of interest, based on the techniques presented and discussions held so far. In particular, we will see an optimized representation and operations for the nonlocal system matrix later in Section 5.3.5.

5.3.4 Matrix projections

The last implicit matrix type we briefly present in this chapter is not directly derived from the local and nonlocal BEM systems but rather motivated by the row-projection methods introduced in Section 5.1.1. We do not necessarily want to store the row projections explicitly in each iteration. Especially in the cases where the number of projected rows differs in each iteration (e.g., some randomized block Kaczmarz variants), it would be challenging to reuse preallocated memory efficiently.

Listing 5.14 – Example for a row projection matrix in Julia

julia> RowProjectionMatrix([1 2 3; 4 5 6; 7 8 9], [3, 1])
2×3 RowProjectionMatrix{Int64}:
 7  8  9
 1  2  3

Listings 5.13 and 5.14 show examples for the implementation and usage of a RowProjectionMatrix type in Julia. Similar to the block matrices, this type is mostly a wrapper for a given matrix and keeps track of the projected rows through an additional vector of row indices (relative to the base matrix). In particular, RowProjectionMatrix facilitates the handling of row projections in external code by mimicking a matrix of appropriate dimensions, which would otherwise require code modifications or duplication of the affected matrix rows.
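As a brief, hypothetical usage example, a random block of rows can be wrapped without copying before performing a relaxation step; the system matrix A, the right-hand side b, and the block size k are assumed to be given:

using Random

# Sketch: select k random rows of the system and present them as a matrix of
# their own, e.g., for one step of a randomized block row-projection method.
rows   = randperm(size(A, 1))[1:k]
Ablock = RowProjectionMatrix(A, rows)    # implicit view on the selected rows
bblock = b[rows]                         # matching right-hand side entries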

5.3.5 Matrix-free systems for the cavity models

With a number of different implicit matrix implementations and concepts at hand, we can head back to the discretized local and nonlocal BEM systems and finalize our requirement analysis. To summarize, we have first defined a minimal interface for matrix-free solvers, requiring the given linear system to be represented in terms of a single system matrix and a single vector representing its outputs. The solver then computes the system input vector by means of matrix-vector products as sole operation required for the system matrix. In the following, we reify the presented ideas and implicit matrix types for the final protein electrostatics framework in Julia. However, we once again refrain from showing the exact implementation in the context of this manuscript and choose minimal code examples to focus on the underlying concepts and refer to chapters 6 and 7 for platform-specific considerations and the code repositories for the full implementations and documentations.

We have successfully identified the potential matrices V, K, V^R, and K^R as central building blocks of the systems. In total, they can be represented as interaction matrices utilizing a single series of row elements (i.e., the centroids of all surface triangles), a single series of column elements (i.e., the surface triangles), and four interaction functions representing Eqs. (3.11), (3.12), (3.21), and (3.22). Although we have presented interaction matrices as a theoretical, unified framework to represent the whole system matrices alongside their components, we have also discussed how to provide more efficient solutions for the BEM systems in the form of matrix abstraction layers.

Listing 5.15 – Minimal concept draft for interaction matrix K in Julia

struct Triangle{T <: AbstractFloat}
    v1::Vector{T}       # position of the first node
    v2::Vector{T}       # position of the second node
    v3::Vector{T}       # position of the third node
    center::Vector{T}   # centroid of the triangle
    normal::Vector{T}   # normal vector of the triangle
end

struct Kfun{T} <: InteractionFunction{Vector{T}, Triangle{T}, T} end
# (f::Kfun{T})(ξ::Vector{T}, τ::Triangle{T}) where T = ...

const Kmat{T} = InteractionMatrix{T, Vector{T}, Triangle{T}, Kfun{T}}

function Base.:*(K::Kmat{T}, x::AbstractArray{T, 1}) where T
    # ... (specialized matrix-vector product Kx)
end

The potential matrices can be easily represented using the InteractionMatrix type with dedicated interaction functions. Since the latter functions are an integral part of InteractionMatrix, we can provide type aliases for all potential matrices and specialize any operation on interaction matrices for these individual aliases. A code example for an interaction matrix representation of K is shown in Listing 5.15, where we first define a surface triangle type that we can then use as the column elements of the potential matrices. In order to support both single- and double-precision floating-point representations (Float32 and Float64, respectively), we keep this information generic via a type variable T. Thus, the functions are not only specialized for the different potential matrices but also for the precision level of the represented values. The interaction function Kfun evaluates Eq. (3.12), i.e.,

\[
(K)_{ij} := -\frac{1}{4\pi} \int_{\tau_j} \frac{(r - \xi_i) \cdot \hat{n}_j}{|r - \xi_i|^3} \, \mathrm{d}\Gamma_r
\]

for a given load point ξ and surface triangle τ (code omitted). The concrete type for interaction matrix K, here aliased by Kmat for simplicity, can then be used for specialized function overloads, such as the matrix-vector product Kx for some vector x. Specific implementations for the remaining potential matrices are provided analogously.
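The omitted functor body can be sketched as follows. This is only a crude one-point (centroid) quadrature of Eq. (3.12) meant to illustrate the callable-struct interface, not the integration scheme actually used by the solvers (which rely on dedicated quadrature and semi-analytic routines, cf. Chapter 6); singular entries, where ξ lies on τ, would additionally require special treatment.

    using LinearAlgebra: cross, dot, norm

    # Hedged sketch of the Kfun functor from Listing 5.15: approximate the surface
    # integral of Eq. (3.12) by evaluating the integrand at the triangle centroid.
    function (f::Kfun{T})(ξ::Vector{T}, τ::Triangle{T}) where T
        area = norm(cross(τ.v2 .- τ.v1, τ.v3 .- τ.v1)) / 2   # triangle area from its nodes
        d    = τ.center .- ξ                                 # r − ξ at the single quadrature point
        -area * dot(d, τ.normal) / (4π * norm(d)^3)
    end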

With such implementations for both potential matrices V and K, we can represent most parts of the matrix-free systems for the local cavity model from Theorem 5.2.4:

\[
\frac{1}{2}\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right)\hat{u} + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right)K\hat{u} = K\hat{u}_{\mathrm{mol}} - \frac{1}{2}\hat{u}_{\mathrm{mol}} - \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}V\hat{q}_{\mathrm{mol}},
\]
\[
V\hat{q} = \frac{1}{2}\hat{u} + K\hat{u}.
\]
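Before turning to the first system matrix, the following hedged sketch shows how the second equation already translates into a matrix-free solve. It assumes an interaction-matrix alias for V analogous to Kmat from Listing 5.15, with specialized products (and a mul! overload, cf. Section 5.4.1); the function is purely illustrative and not part of the package.

    using IterativeSolvers: gmres

    # Sketch only: once û is known, V q̂ = û/2 + K û can be solved matrix-free,
    # relying solely on the (specialized) products of the potential matrices.
    function solve_second_local_system(V, K, û::AbstractVector)
        rhs = û ./ 2 .+ K * û   # right-hand side of the second local system
        gmres(V, rhs)           # any matrix-free solver works here (cf. Section 5.4.1)
    end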

Listing 5.16 – Minimal concept draft for the first local BEM system matrix in Julia

struct LocalSystemMatrix{T} <: AbstractArray{T, 2}
    K::Kmat{T}
    εΩ::T
    εΣ::T
end

Base.size(A::LocalSystemMatrix{T}) where T = size(A.K)

function Base.:*(A::LocalSystemMatrix{T}, x::AbstractArray{T, 1}) where T
    ((1 + A.εΩ / A.εΣ) / 2) .* x .+ ((A.εΩ / A.εΣ - 1) .* (A.K * x))
end

Indeed, we can compute the outputs of both systems and represent the second system matrix as a single matrix object. What remains is an abstraction layer for the first system matrix. For this, we can store the components appearing in its matrix-vector product separately, i.e., the matrix K and the constants εΩ and εΣ, and provide a suitable overload for the matrix-vector product itself. A working draft for such an abstraction layer in Julia is shown in Listing 5.16. LocalSystemMatrix{T} presents itself as a matrix of element type T, with its size being the same as that of the potential matrix K and with its matrix-vector product representing

\[
\frac{1}{2}\left(1 + \frac{\varepsilon_\Omega}{\varepsilon_\Sigma}\right)x + \left(\frac{\varepsilon_\Omega}{\varepsilon_\Sigma} - 1\right)Kx
\]

for a given x, where A.K * x in the code refers to the specialized matrix-vector product of Kmat from Listing 5.15. It is worth noting that we currently do not provide getters or setters for LocalSystemMatrix, implying that we can only access its elements through the matrix-vector product.

We can implement the abstraction layer for the nonlocal BEM system similarly to the local case. This only requires storing the system constants in addition to the row and column elements, allowing us to construct the potential matrices on demand, e.g., in the matrix-vector product implementation. Since the nonlocal system is represented as a 3 × 3 block matrix, the size function needs to be implemented accordingly. A minimal concept draft for such an abstraction layer is shown in Listing 5.17, where the matrix-vector product implementation takes care of the matrix-free representation of the system as shown in Theorem 5.2.6. Furthermore, the single and double layer potential matrices appearing in the representation formulas (cf. Section 3.4) can be constructed just like the corresponding interaction matrices shown above by replacing the row elements accordingly. As a consequence, we do not need additional abstraction layers for the post-processors.

Listing 5.17 – Minimal concept draft for the nonlocal BEM system matrix in Julia

struct SystemParams{T <: AbstractFloat}
    εΩ::T
    εΣ::T
    ε∞::T
    λ::T
end

struct NonlocalSystemMatrix{T} <: AbstractArray{T, 2}
    rowelems::Vector{Vector{T}}
    colelems::Vector{Triangle{T}}
    params::SystemParams{T}
end

function Base.size(A::NonlocalSystemMatrix{T}) where T
    (3 * length(A.rowelems), 3 * length(A.colelems))
end

function Base.:*(A::NonlocalSystemMatrix{T}, x::AbstractArray{T, 1}) where T
    K = Kmat{T}(A.rowelems, A.colelems, Kfun{T}())
    # V = ...
    # ...
end

The presented implementations for the potential matrices and the abstraction layers for the local and nonlocal system matrices could easily be extended to support other operations, e.g., the matrix product interface from Julia’s LinearAlgebra module (cf. Section 5.4.1) or operations required by suitable preconditioners (cf. Section 5.4.2).

5.4 Solving implicit systems in Julia

We have seen in the last section that we can implement almost arbitrary matrix-like types in the Julia language, which allows us to bring linear systems into a form complying with our minimal solver interface from Definition 3.2.1. In this section, we direct our focus toward existing solver implementations, facilitating the comparison of different solution schemes and enabling us to choose the best-fitting solver for our linear systems.

Julia already ships with multiple linear system solvers in its LinearAlgebra module. More precisely, the standard solver is implemented as a polyalgorithm that calls the most suitable solver depending on the system matrix structure, possibly after invoking a suitable matrix factorization first. For dense matrices, the linear system is usually solved using BLAS [130] and LAPACK [131] routines, while for sparse matrices, additional routines from SuiteSparse [132] are used. Owing to the AbstractArray interface, most standard solvers can also be used with our implicit

matrices. In fact, we can even solve the local and nonlocal protein electrostatics problem in this way. However, the system matrix is always copied during the standard solving procedure, rendering this option even worse than the original approach with an explicit representation of the system.

Third-party packages in Julia often provide high-level function interfaces that only impose a few requirements on their arguments. In turn, they rely on the generic function interfaces initially provided by Julia's standard library and extended by other packages, especially in the case of math or array routines. This is in contrast to many other high-performance languages, such as C and C++, where operations frequently exploit low-level knowledge of the operands (e.g., their memory layout) for optimization. In Julia, the conventionally generic function overloads work particularly well in conjunction with the language's dynamic type system and its symmetric multiple dispatch mechanism, as functions can be reliably dispatched even when all operands have varying dynamic types. For example, wrapping given array-like types in a custom array type (just as we did multiple times throughout the last section) allows us to provide specialized overloads that are automatically dispatched from within a third-party function supporting our custom type.
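As a minimal, self-contained illustration of this wrapping pattern (a toy example of ours, unrelated to the BEM types), the following matrix type counts element accesses; any generic third-party function accepting an AbstractMatrix argument will transparently dispatch to its getindex method:

    # Toy wrapper counting element accesses through getindex.
    struct CountingMatrix{T} <: AbstractMatrix{T}
        base::AbstractMatrix{T}
        hits::Base.RefValue{Int}
    end
    CountingMatrix(A::AbstractMatrix{T}) where T = CountingMatrix{T}(A, Ref(0))

    Base.size(A::CountingMatrix) = size(A.base)
    function Base.getindex(A::CountingMatrix, i::Int, j::Int)
        A.hits[] += 1
        A.base[i, j]
    end

    # Generic code such as sum(CountingMatrix(rand(3, 3))) works unchanged and
    # records the nine element accesses performed by the generic reduction.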

Using the same technique, we can not only make our implicit array types compatible with existing iterative solvers, but we can naturally also control how and where these operations are executed (e.g., on a GPU). Obtaining such solvers is comparatively simple: many third-party packages can be obtained directly through Julia's package repositories. For the electrostatics problem, we focus on matrix-free solvers (as discussed before). Once a candidate package is chosen, the operations used on the system matrix have to be identified in order to provide specialized overloads. This can usually be achieved quickly by consulting the package documentation, inspecting the code, or trial and error. Unfortunately, there is currently no built-in mechanism to automate this procedure in Julia, and it has to be done manually.

5.4.1 IterativeSolvers.jl

The IterativeSolvers.jl package¹¹, maintained as part of the JuliaMath project, provides a collection of CPU-based iterative solvers for linear systems, eigenproblems, and singular value problems, most of which can be used matrix-free. For nonsymmetric system matrices, the package provides essentially three options:

11https://github.com/JuliaMath/IterativeSolvers.jl

• Restarted GMRES: A variant of the original generalized minimal residual (GMRES) algorithm [113] that is restarted after a given number of iterations to reduce its memory requirements,

• IDR(s): A variant of the induced dimension reduction (IDR) method [133] that generates residuals in nested subspaces of shrinking dimensions [134, 135],

• BiCGStab(l): A combination of the bi-conjugate gradient (BiCG) method [110, 111] with l GMRES iterations [116].

All of these solvers only require a matrix-vector product implementation by means of the mul! function provided through Julia's LinearAlgebra module. This function is mostly an extension of the standard multiplication operator that allows the computation to be performed in place by expecting the destination vector as an additional argument. Hence, by providing a suitable mul! implementation for our implicit matrix types (in addition to or instead of Base.:*), we can use all of the above-mentioned solvers for the local and nonlocal protein electrostatics problem. What is more, since the matrix-vector product is potentially the single most dominant contributor to the above solvers' overall runtimes, we can even expect substantial performance improvements by providing parallel implementations for this operation (in contrast to the otherwise serial solvers).
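As a hedged illustration of this contract (our own sketch, not the actual package code), the LocalSystemMatrix draft from Listing 5.16 can be hooked into these solvers by forwarding mul! to its specialized product; some solver routines may additionally expect the five-argument mul! variant.

    using LinearAlgebra
    using IterativeSolvers

    # In-place product for the LocalSystemMatrix draft; a fused implementation could
    # avoid the temporary vector created by A * x.
    function LinearAlgebra.mul!(y::AbstractVector{T}, A::LocalSystemMatrix{T}, x::AbstractVector{T}) where T
        y .= A * x
        return y
    end

    # With the overload in place, the solvers can be used directly, e.g.:
    # û = gmres(A, b; restart = 200)
    # û = bicgstabl(A, b, 2)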

5.4.2 Preconditioners.jl

Many iterative solvers – including GMRES and BiCGStab(l) from IterativeSolvers.jl – support the use of preconditioners, which can speed up the computation considerably. Preconditioner support in the package is implemented through split preconditioning

\[
P_l^{-1} A P_r^{-1} y = P_l^{-1} b, \qquad x = P_r^{-1} y,
\]

where P_l and P_r can be defined separately. By using an identity matrix I for any subset of {P_l, P_r}, the split preconditioning can consequently be reduced to left preconditioning (P_r = I), right preconditioning (P_l = I), or to no preconditioning at all (P_l = P_r = I).

The package can be used with any custom preconditioner P as long as there is a suitable implementation of the ldiv! function, i.e., the operation representing P^{-1} times a given matrix or vector. Alternatively, there exists a plethora of third-party packages for popular preconditioners, including incomplete LU decompositions¹²

12https://github.com/haampie/IncompleteLU.jl

and algebraic multigrid preconditioners¹³, although many of them strictly expect sparse systems and are thus not suitable for our systems.
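To make the ldiv! contract described above concrete, the following hand-rolled Jacobi preconditioner is a minimal sketch; the type and field names are ours and not part of any of the packages mentioned here.

    using LinearAlgebra

    # Minimal Jacobi preconditioner: P holds the system matrix diagonal, so applying
    # P⁻¹ amounts to an element-wise division.
    struct JacobiPreconditioner{T}
        d::Vector{T}   # diagonal entries of the system matrix
    end

    LinearAlgebra.ldiv!(y::AbstractVector, P::JacobiPreconditioner, b::AbstractVector) = (y .= b ./ P.d)
    LinearAlgebra.ldiv!(P::JacobiPreconditioner, b::AbstractVector) = (b .= b ./ P.d)
    Base.:\(P::JacobiPreconditioner, b::AbstractVector) = b ./ P.d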

In our solvers, we rely on the Preconditioners.jl package¹⁴, which provides a general selection of preconditioners. However, the only usable option for our dense and non-symmetric systems is the simple Jacobi preconditioner (provided as DiagonalPreconditioner). It is implemented in terms of the diag function of Julia's LinearAlgebra module, which returns the diagonal of the given matrix as a vector and can be specialized for our implicit matrix representations in exactly the same way as mul!.
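The following hedged sketch indicates how such a diag specialization could look for the LocalSystemMatrix draft from Listing 5.16. It assumes an analogous diag overload for Kmat (e.g., evaluating the interaction function for matching row and column indices); the preconditioned solver call is only indicated in the trailing comments.

    using LinearAlgebra
    using Preconditioners

    # Diagonal of (1/2)(1 + εΩ/εΣ) I + (εΩ/εΣ - 1) K, delegating to an assumed diag(A.K).
    function LinearAlgebra.diag(A::LocalSystemMatrix{T}) where T
        ((1 + A.εΩ / A.εΣ) / 2) .+ ((A.εΩ / A.εΣ - 1) .* diag(A.K))
    end

    # Pl = DiagonalPreconditioner(A)            # Jacobi preconditioner built via diag
    # û  = gmres(A, b; Pl = Pl, restart = 200)  # left-preconditioned restarted GMRES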

5.5 Implicit arrays for Impala

Developing implicit array representations for Impala differs considerably from the Julia case and is usually less convenient. For one, the language is in an early stage of its development and so far only provides a very sparse standard library. Unsurprisingly, there also exist only a few third-party packages that would be of use here. Another reason for the scarceness of features is the language's close connection to C. The built-in array type can best be compared to static C arrays, representing a block of contiguous memory of fixed size and fixed element type. In fact, the array types of Impala and C are byte-compatible and can be used for inter-language calls as long as the element types are byte-compatible as well (such as scalars, trivial aggregate types, or pointers to the same).

A second aspect to be considered for custom array types in Impala is that the memory location of objects always depends on the program context. It is in the nature of domain-specific languages that the same array of objects can reside in the memory of different devices and platforms (e.g., host or GPU memory). Thus, our custom array types should keep the exact memory location transparent at least at some layers of the implementation. Similar considerations can naturally also be made for our implicit Julia arrays – and we will see in Chapter 7 how this is done as an optional extension – but here the domain-specific location is an integral part of the language and should be modeled accordingly.

In this section, we develop concepts to represent explicit vectors and implicit matrices in Impala. Due to the current lack of linear algebra libraries and solvers for the language, we restrict ourselves to approaches that are only partly specific to the

13https://github.com/JuliaLinearAlgebra/AlgebraicMultigrid.jl 14https://github.com/mohamed82008/Preconditioners.jl

Listing 5.18 – Built-in Buffer type for generic memory management in Impala

struct Buffer {
    data: &[i8],
    size: i64,
    device: i32
}

Listing 5.19 – Minimal working example for a real-valued vector in Impala

type Real = f64;

struct RealVector {
    size: fn() -> i64,
    get: fn(i64) -> Real,
    set: fn(i64, Real) -> ()
}

fn as_real_vector(buf: Buffer) -> RealVector {
    RealVector {
        size: || buf.size / sizeof[Real]() as i64,
        get: |idx| bitcast[&[Real]](buf.data)(idx),
        set: |idx, val| bitcast[&mut[Real]](buf.data)(idx) = val
    }
}

protein electrostatics problem and can be extended to be used with future libraries. Once again, our presentation includes code examples with a limited amount of detail. In particular, we usually do not show explicit annotations for the partial evaluation mechanism of Impala and refer the interested reader to Chapter 8 along with the corresponding code repository. In order to present valid Impala code in spite of the detail limit, incomplete definitions are commented out and supplemented with additional comments.

5.5.1 Buffers and vectors

The Impala runtime library provides a Buffer type to manage memory of domain-specific locations in a generic way (Listing 5.18). Buffer objects, henceforth simply referred to as buffers, store a pointer to a byte array of indeterminate length (data), the number of bytes it occupies in memory (size), and a numeric identifier for its location (device). Additionally, the runtime library provides convenience functions to allocate, free, and copy memory on supported platforms through buffers.

At the time of writing, the byte representation of data is the only way to provide generic containers in Impala. However, due to the actual element type not being associated with a buffer, such byte containers would need to be specialized for each given element type, with conversions being handled by means of static type casts.

Listing 5.19 shows such a minimal working example for a vector of floating-point values. First, we define a type alias Real for double-precision floating-point values that serves both as the element type of the RealVector and as a proxy to later change the precision level at compile time.

It should be noted that the RealVector type defines a vector interface purely in terms of member functions. Hence, it does not expose any information on where or how its elements are stored. All implementation details are hidden in the interface functions, which need to be defined only when a RealVector object is constructed. This is a common pattern in Impala and other domain-specific languages, as it allows introducing layers of indirection into the code, i.e., abstraction layers that hide away domain-specific implementation details, such as the location or structure of the represented values. We henceforth refer to such function-based types as interface types. For instance, such types allow RealVector to wrap any sequential container of floating-point values in a fully transparent way, as long as there exists a meaningful implementation of the interface functions.

The most restrictive feature of RealVector in that sense is the set function because it requires the underlying data structure to be writable on a per-element basis. For buffer-based instantiations of RealVector, this restriction does not usually pose a considerable challenge, as Impala allows both read and write access to its buffers. An example constructor (as_real_vector) for such a buffer-based RealVector can be found in Listing 5.19. Many other base representations, on the other hand, do not possess meaningful write semantics (e.g., implicit representations). As a result, we use buffer-based vectors only for explicitly represented entities in the BEM formulations, i.e., system inputs and outputs as well as row and column elements of the interaction matrices.

In addition to real-valued vectors, we often encounter more semantically structured containers in the electrostatics problem, such as the row and column elements of the interaction matrices. In the former case, the containers represent a series of position vectors in R^3 (usually the centroids of the surface triangles). Instead of implementing the row elements through a RealVector of length 3n for n surface triangles, we could also provide a dedicated PositionVector type where each element represents a position in space. While this distinction does not necessarily change the memory requirements of the representation, it enables us to develop more expressive interfaces, which are hopefully harder to use incorrectly. A working example for such a PositionVector type is given in Listing 5.20¹⁵. Since its elements are aggregates of simple floating-point objects, the implementation

Listing 5.20 – Examples for positions and vectors of positions in Impala

struct Position {
    x: Real,
    y: Real,
    z: Real
}

struct PositionVector {
    size: fn() -> i64,
    get: fn(i64) -> Position,
    set: fn(i64, Position) -> ()
}

fn as_position_vector(buf: Buffer) -> PositionVector {
    PositionVector {
        size: || buf.size / sizeof[Position]() as i64,
        get: |idx| bitcast[&[Position]](buf.data)(idx),
        set: |idx, val| bitcast[&mut[Position]](buf.data)(idx) = val
    }
}

of the as_position_vector function to represent a buffer as a PositionVector is completely analogous to as_real_vector.

The Position type looks innocent enough but hides a crucial design decision: while the simple aggregate type serves as a good example of a semantically structured data container, we could instead have opted for a representation in terms of interface functions x(), y(), and z(), returning the respective Real value. In fact, this variant should generally be preferred if the represented data is already available in a different format or needs to be rearranged for optimized memory access patterns. However, this alternative representation has implications for the vector constructors, as we would no longer be able to perform static type casts to the function-based Position type in order to access the vector elements. We analyze the effects for a more complex example: a vector type for surface triangles, i.e., the column elements of our potential matrices.

For a vector of triangle objects, we repeat the same procedure as above and create a dedicated Triangle type to represent surface triangles as well as a corresponding TriangleVector type (Listing 5.21). This time, we model the element type in terms of interface functions. In addition to the concerns regarding memory access patterns raised above, the function-based interface also allows for a minimal representation of a triangle in terms of its nodes while the other quantities center and normal could be computed on demand. This would also leave us an option to add further quantities, such as the triangle area or its distance.

¹⁵PositionVector actually represents a vector of position vectors. In order to avoid these nomenclature issues, position vectors in the mathematical sense are here referred to as positions and the container types as vectors.

Listing 5.21 – Examples for triangles and triangle vectors in Impala

struct Triangle {
    v1: fn() -> Position,      // first node
    v2: fn() -> Position,      // second node
    v3: fn() -> Position,      // third node
    center: fn() -> Position,  // centroid
    normal: fn() -> Position,  // unit normal vector
}

struct TriangleVector {
    size: fn() -> i64,
    get: fn(i64) -> Triangle,
    set: fn(i64, Triangle) -> ()
}

5.5.2 Context managers for data containers

One consequence of Impala's close connection to C is the need for manual memory management. Because this has always been a rich source of subtle errors and vulnerabilities in the history of C software, we present a simple method to mitigate these problems on a small scale before continuing with container types for matrices.

The Impala language provides a special syntax for a construct that is somewhat similar to with-statements in Python, commonly referred to as context managers. The basic idea behind such context managers is to wrap a series of instructions on a certain resource (usually a file or database connection) in such a way that the resource is automatically released once the instructions are processed. In contrast to Python, the with-constructs in Impala are modeled as expressions rather than statements and turn out to be syntactic sugar for a simple function call. More specifically, a function

\[
f : A_1 \times \dots \times A_n \times (B_1 \times \dots \times B_m \to C) \to R
\]

can be called as

with b1, ..., bm in f(a1, ..., an) {
    // block expression returning a value of type C
}

where each a_i is an object of type A_i, each b_j is an object of type B_j, and the whole expression returns an object of type R, which could, e.g., be bound to a variable or returned as part of an outer expression.

Listing 5.22 – Host-specific context manager for real-valued vectors in Impala

fn new_real_vector(size: i32, body: fn(RealVector) -> ()) -> () {
    let buf = alloc_cpu(size * sizeof[Real]());
    body(as_real_vector(buf));
    release(buf);
}

fn do_something() -> () {
    with vec in new_real_vector(10) {
        vec.set(0_i64, 42 as Real);
        vec.set(1_i64, vec.get(0_i64) + 1295 as Real)
    } // <- vec is automatically destroyed
}

Context managers have proven to be a very powerful and versatile tool in Impala. Besides the classic resource management purpose, they can, for example, be used to capture error states. We have built a testing framework¹⁶ on this idea, where tests can be grouped hierarchically by using nested with-expressions.

Here, we use context managers in a more classic setting to implement automatic memory management for our vector types. For this, we provide a function f of the kind shown above that handles the allocation and release of the underlying buffer. Listing 5.22 shows an example implementation for the RealVector type. The context manager then tracks the object lifetime and invokes the release of claimed memory as soon as the with-expression is left.

The same mechanism can also be exploited to transfer memory between different platforms. A typical workflow when using coprocessors is to first copy local memory to the device, perform the computation there, and then copy the results back to local memory. The transfer part of this workflow can be automated through a context manager and buffers in a similar way as shown above. What is more, with a domain-specific implementation of the context manager, the distinction between a computation on the local CPU or the coprocessor can be made completely transparent. We will see more of this in Chapter 8.

5.5.3 Matrices and matrix operations

Generic matrix types can be defined in the same way as their vector counterparts (for a given element type). This time, we refrain from including a setter declaration as part of the interface to facilitate the implementation of implicit matrices where a meaningful per-element write semantics does not exist.

16https://github.com/tkemmer/badhron

Listing 5.23 – Minimal working example for a real-valued matrix in Impala

struct RealMatrix {
    size: fn() -> (i64, i64),
    get: fn(i64, i64) -> Real
}

fn new_fixed_value_matrix(rows: i64, cols: i64, val: Real) -> RealMatrix {
    RealMatrix {
        size: || (rows, cols),
        get: |i, j| val
    }
}

fn new_null_matrix(rows: i64, cols: i64) -> RealMatrix {
    new_fixed_value_matrix(rows, cols, 0 as Real)
}

fn new_identity_matrix(size: i64) -> RealMatrix {
    RealMatrix {
        size: || (size, size),
        get: |i, j| if i == j { 1 as Real } else { 0 as Real }
    }
}

/* generic matrix-vector product */
// fn mv_real(A: RealMatrix, x: RealVector) -> RealVector { ... }

Listing 5.24 – Concept draft for potential matrix K as RealMatrix in Impala

fn compute_Kij(xi: Position, elm: Triangle) -> Real {
    let mut Kij = 0 as Real;
    // ... compute Kij
    Kij
}

fn new_matrix_K(rows: PositionVector, cols: TriangleVector) -> RealMatrix {
    RealMatrix {
        size: || (rows.size(), cols.size()),
        get: |i, j| compute_Kij(rows.get(i), cols.get(j))
    }
}

Listing 5.23 shows the reduced floating-point matrix interface as well as several exemplary matrix constructors. The RealMatrix type can still be used to provide read-only access to buffer data. Alternatively, we can now also exploit the lack of a setter and implement fixed-value and null matrices based on the Julia models presented in Section 5.3.1, or arbitrarily sized identity matrices. We can even implement interaction matrices as RealMatrix objects by defining a suitable constructor, as shown for potential matrix K in Listing 5.24. Since all of these matrices would be implemented through different constructors using the same RealMatrix type, they could benefit from a common set of functions, such as a generic matrix-vector product (indicated as the mv_real function in Listing 5.23).

Listing 5.25 – Concept draft for a dedicated potential matrix type in Impala

struct PotentialMatrix {
    rowelems: fn() -> PositionVector,
    colelems: fn() -> TriangleVector,
    size: fn() -> (i64, i64),
    get: fn(i64, i64) -> Real
}

// Conversion PotentialMatrix -> RealMatrix
fn as_real_matrix(mat: PotentialMatrix) -> RealMatrix {
    RealMatrix {
        size: mat.size,
        get: mat.get
    }
}

/* specialized matrix-vector product */
// fn mv_pot(A: PotentialMatrix, x: RealVector) -> RealVector { ... }

When defining custom matrix operations in Impala, we have to keep two of the language's properties in mind: Firstly, there is no type inheritance system in place, which implies that we cannot provide functions with arguments defined in terms of base types, e.g., functions for abstract floating-point numbers (such as AbstractFloat for Float64 and Float32 in Julia). Secondly, functions cannot be overloaded or overridden in Impala. Hence, we cannot provide interface functions meant to be overloaded for custom data types (such as Julia's AbstractArray interface functions we have seen in Section 5.3). As a consequence, we need to define separate, uniquely-named operations for each element type of our vectors and matrices and for each additional argument. All these restrictions make it considerably harder to develop generic data types that can be used with third-party software, including linear algebra libraries and solvers alike.

With RealMatrix, we can at least provide a minimal interface for all real-valued matrices and use it for common operations. We can even utilize the type as a wrapper for other specialized matrices with the same element type and thus mimic a weak kind of type compatibility by means of explicit conversion functions. Moreover, the conversion can then be done overhead-free in many cases due to Impala's partial evaluation mechanism, rendering this kind of representation highly suitable for cross-library interfaces (see Section 5.6).

Let us for a moment reconsider the previous potential matrix design for Impala. Instead of the generic RealMatrix type, we might want to provide a dedicated PotentialMatrix type that additionally exposes the row and column elements, namely, the corresponding PositionVector and TriangleVector objects (cf. Listing 5.25). The implementation of a suitable constructor would be the same as before, except for two new member functions returning the row and column elements

obtained as constructor arguments. Apart from that, the new PotentialMatrix type behaves like the previous RealMatrix implementation. For the compiler, however, there is one crucial difference: PotentialMatrix and RealMatrix, in spite of their similar behavior, represent two distinct and incompatible types. This allows us to provide “specialized” operations for potential matrices, e.g., matrix-vector products (here: mv_pot) in addition to those for more general RealMatrix instances. This specialization would be common to all instances of PotentialMatrix, including the potential matrices encountered so far and any real-valued interaction matrix with the same row and column elements. This method can subsequently be repeated for any matrix type in need of specialization, e.g., when a separate specialization is required for each concrete potential matrix or, more precisely, for each concrete interaction function.

Supplying a suitable conversion function from PotentialMatrix to RealMatrix is trivial since the full RealMatrix interface is already included in PotentialMatrix and can be forwarded accordingly (see as_real_matrix in Listing 5.25). This is generally true for all alternative matrix types either satisfying the RealMatrix interface directly (by definition) or providing corresponding data members instead. Through the conversion function, PotentialMatrix objects can optionally be used with operations expecting either type. In particular, a matrix-vector product for potential matrices can be computed using either the generic or the specialized implementation.

Different implementations for operations, e.g., the matrix-vector product, could alternatively be realized without a dedicated matrix type by demanding a RealMatrix argument for the mv_pot function in Listing 5.25¹⁷, since each function requires a unique name in any case. However, with the dedicated type, we can prohibit RealMatrix objects that do not represent potential matrices from calling the specialized matrix-vector product variant, thus making the interface harder to use incorrectly.

5.5.4 Matrix-free systems for the cavity models

With the techniques presented above, we are finally able to represent implicit matrices – and potential matrices in particular – as well as to provide vector and matrix operations that can be specialized at least for some types. For the local and nonlocal BEM systems, we only need additional abstraction layers for the respective system matrices, just as we have seen multiple times throughout this chapter.

17Assuming that the latter does not use the additional interface functions rowelems and colelems.

Listing 5.26 – Minimal concept draft for the local or nonlocal BEM matrix in Impala

struct SystemParams {
    epsOmega: Real,
    epsSigma: Real,
    epsInf: Real,
    lambda: Real
}

struct LocalSystemMatrix {
    K: fn() -> PotentialMatrix,
    params: fn() -> SystemParams,
    size: fn() -> (i64, i64)
}

/* Matrix-vector product for the local matrix */
// fn mv_local(A: LocalSystemMatrix, x: RealVector) -> RealVector { ... }

struct NonlocalSystemMatrix {
    rowelems: fn() -> PositionVector,
    colelems: fn() -> TriangleVector,
    params: fn() -> SystemParams,
    size: fn() -> (i64, i64)
}

/* Matrix-vector product for the nonlocal matrix */
// fn mv_nonloc(A: NonlocalSystemMatrix, x: RealVector) -> RealVector { ... }

In principle, we could reuse the RealMatrix interface for the system matrices. However, taking into account the discussion from the last sections, we rather opt for custom types with dedicated matrix-vector products for each of them. This way, we can also spare ourselves a getter implementation and express all system matrices in terms of their matrix-vector products only.

Listing 5.26 shows two possible definitions for the system matrix abstraction layers that are very close to their Julia counterparts from Section 5.3.5. Many other variants are equally possible as long as they provide all information required by the matrix-vector product implementation or any other operation defined for the types (e.g., for preconditioning).

5.6 Solving implicit systems in Impala

To the best of our knowledge, there currently exist no native linear algebra libraries or iterative solvers for Impala that we could utilize for the protein electrostatics problem. One workaround would be to use suitable C or C++ libraries wherever a native Impala counterpart is missing. Since this language crossing comes at the cost of Impala's partial evaluation capabilities, it would challenge the decision to work with Impala in the first place. Thus, this method is

Listing 5.27 – Concept draft for a linear algebra library LAOps in Impala

struct LAVectorF64 {
    size: fn() -> i64,
    get: fn(i64) -> f64
}

struct LAMatrixF64 {
    size: fn() -> (i64, i64),
    get: fn(i64, i64) -> f64
}

/* Vector sum */
// fn vector_add_f64(u: LAVectorF64, v: LAVectorF64) -> LAVectorF64 { ... }

/* Dot product */
// fn vector_dot_f64(u: LAVectorF64, v: LAVectorF64) -> f64 { ... }

// ...

not considered a goal of this work. Instead, we outline in the following paragraphs how to approach cross-library data transfer and function specialization in spite of Impala’s type and function restrictions. This way, we hope to lay a foundation for the future development of libraries.

We have seen in Section 5.5.3 that function specialization in Impala is a general problem, especially when it comes to third-party library calls. Since all types are categorically incompatible with each other, we cannot provide functions for common base types. We have also seen how to implement type conversion functions to mimic a kind of type compatibility and how this approach can be combined with interface types to create functions that are generic in some sense. The same idea can now be used for Impala libraries: the library ships with interface types for its data representations (e.g., vectors and matrices), and any functionality it provides (e.g., vector operations and matrix-vector products) is then defined in terms of these interface types.

Let us assume the existence of two hypothetical Impala libraries LAOps and LASolver for vector/matrix operations and linear algebra solvers, respectively, both of which we want to utilize for the protein electrostatics problem. Let us further assume that we cannot modify these libraries for a particular use case (e.g., to directly include our specialized matrix-vector products for potential matrices) and have to work with the interfaces as is.

The LAOps library is meant as a general-purpose linear algebra library, shipping with vector and matrix types for all built-in Impala types. In accordance with the above discussion, they are exclusively defined in terms of interface types. Listing 5.27 shows two such definitions, LAVectorF64 and LAMatrixF64, for double-precision

Listing 5.28 – Concept draft for a matrix-free solver library LASolver in Impala

struct LASystemMatrixF64 {
    /* All operations required on the system matrix */
    times_vec: fn(LAVectorF64) -> LAVectorF64
}

/* Iterative solver for Ax = b */
// fn solve(A: LASystemMatrixF64, b: LAVectorF64) -> LAVectorF64 { ... }

floating-point vectors and matrices, respectively. In the given form, LAVectorF64 is a read-only variant of RealVector, while LAMatrixF64 is semantically equivalent to RealMatrix, albeit being of different and thus incompatible types. In addition to these interface types, the library provides a variety of functions, including vector sums, dot products, and generalized matrix-vector products.

The LASolver library, on the other hand, is built upon LAOps and extends it by functions to solve linear systems, including matrix-free solvers. In order to comply with our minimal solver interface from Definition 3.2.1, the actual solver function (solve) expects the system matrix (in some format) and the system outputs (as an LAVectorF64 object), computes the system inputs, and returns them as an LAVectorF64 object. The system matrix is modeled here by yet another dedicated matrix type, LASystemMatrixF64. This special treatment is necessary as the more generic LAMatrixF64 type would prevent the solver from calling a specialized matrix-vector product and instead enforce the corresponding implementation from LAOps (the example in Section 5.5.3 is analogous to this case). As a consequence, LASystemMatrixF64 is not a mere wrapper for LAMatrixF64 but rather a representation of the operations required on the system matrix, such that specialized implementations for these operations can be fed into the solver.

Listing 5.28 shows a concept draft for a matrix-free solver in LASolver, where the required operation is of course a matrix-vector product for the system matrix. In a sense, this model is again very similar to the implicit representation of the BEM matrices in Julia, where the matrices are also defined only in terms of their matrix-vector product (cf. Section 5.3.5).

Now, our own protein electrostatics solver would also be based on LAOps types, either by using LAVectorF64 and LAMatrixF64 directly or by providing similar types (e.g., RealVector and RealMatrix) alongside suitable bidirectional conversion functions¹⁸. In particular, the local and nonlocal BEM solvers can create their

18While such implementations would be trivial for RealMatrix, special care would need to be taken for RealVector. Since there is no matching setter in LAVectorF64, a conversion function from the latter to RealVector cannot be implemented in a fully operational way.

corresponding systems in the same way as before, convert them to LAOps types if necessary, and call a matrix-free solver from LASolver (Listing 5.29).

Listing 5.29 – Example usage of linear algebra libraries in Impala

/* new matrix-vector product with LAOps types */
// fn mv_nonloc(A: NonlocalSystemMatrix, x: LAVectorF64) -> LAVectorF64 { ... }

/* nonlocal BEM solver using LAOps and LASolver
fn solve_bem() -> LAVectorF64 {
    // create nonlocal system
    let mat = new_nonlocal_system_matrix(...);
    let A = LASystemMatrixF64 {
        times_vec: |vec| matvec_nonlocal_real(mat, from_lavectorf64(vec))
    };
    let b = ...

    // solve system
    solve(A, b)
}
*/

Until suitable linear algebra facilities are available in Impala, real-world examples of the cross-library communication techniques presented here can be found in the Badhron test framework¹⁹. In Badhron, interface types enable users, among other things, to extend the framework with custom assertion functions tailored to their particular needs. The test framework's application in other Impala software can then, for example, be reviewed in the repository of our Impala-based protein electrostatics solver (Chapter 8).

19https://github.com/tkemmer/badhron


6 NESSie.jl

In this chapter, we present NESSie.jl, or NESSie for short, our own Julia package providing efficient and intuitive BEM solvers for the local and nonlocal protein electrostatics problems¹. An earlier version of this package, providing solvers only for the explicit BEM system representations, has been published before [75] with a focus on an extensible package design, highlighting features of the Julia language that are useful in the general context of scientific computing. NESSie is loosely based on a prototypical reference program for the BEM formulations we use, which is written in C and was originally published as [61]. While the prototype delegates most linear algebra operations to the highly optimized ATLAS library [136], it is restricted to explicit representations of the linear systems, performs all computations sequentially, and lacks a comprehensible user interface, which renders it unsuitable for general use.

Throughout this chapter, we outline the organization of the NESSie package and the design decisions behind its usage workflow. We further present how the basic ideas of the reference program were extended to the implicit system representations and discuss potential shortcomings of the method. Finally, we direct our attention to the post-processors and analytically solved test models available through the package.

6.1 Package organization

The common usage workflow of NESSie can be described as a three-step process:

System model → Solvers → Post-processors

First, the input data is read into a unified system model representation, which is then transferred to a supported BEM solver in the second step. In the third and final

1https://github.com/tkemmer/nessie.jl

Module            Description
NESSie            Common system models, constants and functions
NESSie.BEM        Local and nonlocal BEM solvers and postprocessors
NESSie.Format     Supported input and output formats
NESSie.Radon      Internal quadrature routines
NESSie.Rjasanow   Internal solvers for Laplace potentials
NESSie.TestModel  Analytically solved test models

Table 6.1. – NESSie.jl modules

step, any number of post-processors can be applied to the results of the solver to yield the desired quantities, including electrostatic potentials and energies.

The types and functions of the package are organized into different submodules, shown in Table 6.1. The main module, NESSie, is the starting point for the protein electrostatics problem in Julia and comprises a system model representation (cf. Section 6.2) along with common functions and constants for all submodules. NESSie supports a variety of different data formats for its system models (cf. Section 6.2), with corresponding functions being provided through the NESSie.Format module. The local and nonlocal BEM solvers for both explicit and implicit system matrix representations as well as a number of post-processors are available through the NESSie.BEM module (cf. Section 6.3). Two additional modules NESSie.Radon and NESSie.Rjasanow contain functionalities internally used by the solvers, namely, quadrature routines (with weights and quadrature points taken from [65]) and an analytic solver for the Laplace potentials based on techniques presented in [64], respectively. Finally, the NESSie.TestModel module provides analytically solved test models for evaluation purposes (cf. Section 6.4).

In addition to its modules, the NESSie package ships with full documentation², code examples, and convenience tools to generate input meshes and perform conversions between supported data formats.

6.2 System models

In order to comply with the three-step workflow shown above, system models in NESSie are designed to comprise all information required for the solving process. At the same time, they are generic enough so that they can be extended and made compatible with future solver alternatives. System models can be created from and

2The documentation is also available online: https://tkemmer.github.io/NESSie.jl/latest

[Figure 6.1: UML-style diagram relating Model{T,E} to its components (Charge{T} objects and SystemParams{T}) and to the Element{T} interface hierarchy, where SurfaceElement{T} is implemented by Triangle{T} and VolumeElement{T} by Tetrahedron{T}.]

Figure 6.1. – The extensible NESSie system model, where T is a subtype of AbstractFloat and E is a subtype of Element{T}, enabling function specializations for each individual choice of T and E.

exported to a variety of different data formats to facilitate integration of NESSie into existing pipelines.

6.2.1 Unified representation

NESSie uses a common data structure for all system models, consisting of three different components: a suitable discretization of the relevant system domains (e.g., a triangulation of the protein surface as introduced in Section 3.2.2), a charge model, and a set of system constants describing the solute and solvent properties. NESSie currently assumes a point charge model as shown in Eq. (2.1), that is, a finite number of point charges, each represented by a position vector and a dimensionless (partial) charge value. The system constants include the dielectric constants εΩ, εΣ, and ε∞ as well as the correlation length λ for the nonlocal dielectric solvent response (cf. Section 3.3.1).

The discretization of the system domains can be represented in various different ways: NESSie distinguishes between surface and volume meshes and further by the simplex type the particular mesh is based on. For a surface mesh, these simplices are usually triangles or quadrilaterals, while tetrahedra or pyramids are common choices for volume meshes. Figure 6.1 shows an abstract view of the NESSie system model (Model) with all currently supported simplex types (Triangle and Tetrahedron). Additional simplices can easily be added by providing new subtypes of the abstract SurfaceElement and VolumeElement types, as sketched below.
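For instance, a (hypothetical) quadrilateral surface element could be sketched as follows; the field names are ours and purely illustrative, as the actual element types in NESSie may carry additional cached quantities.

    using NESSie

    # Hypothetical quadrilateral surface element; functions specialized for
    # SurfaceElement{T} models would automatically accept Model{T, Quadrilateral{T}}.
    struct Quadrilateral{T} <: SurfaceElement{T}
        v1::Vector{T}   # positions of the four corner nodes
        v2::Vector{T}
        v3::Vector{T}
        v4::Vector{T}
    end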

The main reason for the generic system model design in NESSie is to facilitate the exploitation of Julia’s function specialization mechanism. Each instance of

Listing 6.1 – Example code of a function specialized for different NESSie model types. Generally, the function most specific to all given function arguments is dispatched.

using NESSie

# general models
f(::Model{T}) where T = println("model")

# general surface or volume models
f(::Model{T, E}) where {T, E <: SurfaceElement{T}} = println("surface")
f(::Model{T, E}) where {T, E <: VolumeElement{T}} = println("volume")

# triangle-based surface models
f(::Model{T, Triangle{T}}) where T = println("triangle")

Listing 6.2 – Usage examples for functions specialized for different model types. Note the differing output for triangle- and tetrahedron-based models owing to the lack of a more specific function for the latter.

julia> f(Model{Float64, Triangle{Float64}}())
triangle

julia> f(Model{Float64, Tetrahedron{Float64}}())
volume

Model carries two type variables T and E, which represent the precision level of the floating-point numbers and the mesh type, respectively. Thus, functions with Model parameters can be specialized for each possible combination of T and E if required. An example for such a function specialization is shown in Listings 6.1 and 6.2, where the invocation of the function f with different models results in different outputs. For instance, this mechanism can be used to provide different electrostatics solvers under a common name such that a suitable implementation can be automatically chosen based on the given model. Conversely, specializations for base types can be used to provide common solvers for derived models (e.g., for all surface models). Among other things, this mechanism is used in the Julia language for the default linear system solver, where the concrete solver is chosen with respect to the system matrix (cf. Section 5.4).
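As a toy illustration of providing different solvers under a common name (the function names below are hypothetical and not part of NESSie):

    using NESSie

    # A common entry point dispatching on the mesh type of the given model.
    solve(model::Model{T, Triangle{T}}) where T    = println("surface model: use a BEM solver")
    solve(model::Model{T, Tetrahedron{T}}) where T = println("volume model: use a (hypothetical) FEM solver")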

It should be noted that the BEM solvers presented in this manuscript only support surface triangulations, i.e., surface meshes with triangular simplices. An earlier version of NESSie shipped with a prototypical FEM solver implementation for the nonlocal electrostatics problem using the volume mesh part of the model [75]. However, FEM support has been discontinued indefinitely in favor of the JuliaFEM project [137], which was developed at the same time as NESSie, and will be revisited at a later point in time. In the meantime, we will present in Chapter 7 how the system model can be used with Julia packages other than NESSie.

[Figure 6.2: workflow diagram in which a protein structure (PDB) is converted by PDB2PQR into a charge model (PQR) and by GAMer into a surface model (OFF).]

Figure 6.2. – Example workflow for generating a NESSie-compatible system model (surface mesh and charge model) from a single PDB file using PDB2PQR and GAMer.

6.2.2 Input formats

NESSie aims at maintaining an intuitive interface for a heterogeneous target audience. This inevitably implies supporting commonly used data formats for the generation of system models. Two different points of view especially need to be considered for this task: In the bioinformatics context, we find a source of protein models as well as applications for the protein electrostatics problem. From a numerical perspective, we also need to consider suitable discretization techniques for the system domains.

In the biology and bioinformatics fields, Protein Data Bank (PDB) [19] files are the predominant way of representing the 3D structure of proteins and other biomolecules. Thousands of such files are freely available through the PDB website³ and can be used to visualize the corresponding structures or to generate partial point charge models for them. The latter can be obtained using the PDB2PQR application [138], resulting in so-called PQR files, which can be fed into NESSie.

Many libraries and applications for the generation of surface and volume meshes for given structures exist, some of which are supported by NESSie (see Table B.1 in Appendix B for a full list of supported formats). In the context of this work, we highlight one particular representative, namely, GAMer [139]. GAMer is part of the Finite Element ToolKit (FETK)⁴ and thus well suited for the generation of both surface and volume meshes for the purpose of numerical treatment. What is more, meshes can be computed directly from PDB files, such that a complete set of input data required for our BEM solvers can be generated from a single PDB file, as shown in Figure 6.2. GAMer is mainly intended to be used as a library in other programs. While it does expose some of its functions directly through a few simple

3https://www.rcsb.org 4http://fetk.org

main programs that can be built alongside the library, changes of parameters require active code manipulation. To facilitate the generation of NESSie-compatible meshes without prior programming knowledge, we included two C++ applications in the NESSie repository, where parameters can be set through a command-line interface instead. The first application is called Mesher and can be used to generate surface meshes for a given PDB file. The second application, Sphere, generates surface meshes for spherically symmetric domains of a given radius and resolution (e.g., to be used with NESSie's test models). In any case, the outputs are meshes in the form of OFF files that can be fed into NESSie. Pointers to the format specifications for both PQR and OFF can be found in Appendix B (Table B.3).

6.2.3 Output formats

Although we are equipped with all tools necessary to generate system models, we still need to consider two important aspects: Firstly, NESSie does not provide any means to visualize or assess the quality of the given systems directly. Secondly, so far there is no cross-platform representation for system models. The second point is not an issue as long as we do not leave the Julia ecosystem, since we can use Model objects to transfer system models between packages. In this case, we can generate (and preprocess) input data with NESSie and perform subsequent tasks through other Julia packages that are either compatible with Model directly or provide a type interface Model objects can be converted to. For software outside the Julia ecosystem, NESSie can be used to export system models into multiple established data formats. A full list of output formats available in NESSie can be found in Appendix B (Table B.2), along with pointers to the concrete format specifications the implementations are based on (Table B.3).

Most of the available export formats primarily target visualization tools, implying a focus on assemblies of geometrical structures rather than physical representations of biomolecular systems. As a consequence, most data formats for this purpose can only represent the surface or volume meshes of our system models. For NESSie, we chose common file formats for structured data with broad software support as well as a variety of more specialized plain-text formats. In addition to the visualization-focused variants, we implemented another simple plain-text format in NESSie that captures not only the surface or volume mesh information but also the charges of a given system model. It is based on the HMOFile implementation of the Biochemical ALgorithms Library (BALL) [30] and is also used by the reference program [61] mentioned at the beginning of this chapter. To the best of our knowledge, HMOFile

Listing 6.3 – Structure of the HMO+ format for triangulated surfaces in NESSie.jl

BEGIN_NODL_DATA

1
2
...
END_NODL_DATA

BEGIN_ELEM_DATA
...
1 1 103
2 1 103
...
END_ELEM_DATA

BEGIN_CHARGE_DATA

1
2
...
END_CHARGE_DATA

represents a slight modification of the HMO format for polygonal meshes and surfaces. According to the documentation of the ROXIE software used at the European Organization for Nuclear Research (CERN) [140, 141], the HMO format was used by the HyperMesh [142] and Hermes2D [143] mesh generators in the past. Traces of the original HMO format can still be found in the same documentation⁵ but are scarce otherwise.

Listing 6.3 outlines the general structure of the extended HMO format implemented in NESSie, henceforth referred to as HMO+ to emphasize the modifications. The system model is logically separated into three blocks of enumerated items, representing the mesh nodes (NODL_DATA), mesh elements (ELEM_DATA), and point charges (CHARGE_DATA). The HMO format generally supports multiple different surface and volume elements. HMO+ support in NESSie is currently restricted to triangular meshes (indicated by the number 103 in the ELEM_DATA block), but can easily be extended to other element types when needed.

6.2.4 Format conversion

With several input and output formats at hand, we can now use NESSie as common system model provider and preprocessing facility for all of our BEM solvers; either directly via the Model type or indirectly via HMO+ as a transfer format. All supported data formats are available through the NESSie.Format module, where Model serves

5https://espace.cern.ch/roxie/Documentation/

[Figure 6.3, schematic: charge models (PQR) and surface models enter the central NESSie Model{T,E} representation, which NESSie.Format converts to HMO+, OFF, STL, VTK, and further formats and which feeds the NESSie.BEM, CuNESSie, and AnyBEM solvers.]

Figure 6.3. – Format conversion options for system models with surface meshes through the NESSie.Format module and system model compatibility with our solvers.

as both destination type for all input routines and source type for all output routines. As a result, NESSie.Format can be considered a full-fledged conversion unit that can be utilized independently of the solvers. For this reason, NESSie ships with an additional Julia script Converter.jl to provide immediate access to all available format conversion options. Exported surface models can then, for instance, be visualized and assessed with the 3D creation suite Blender [144] through the popular STL format. Alternatively, surface or volume models can be exported into the PolyData or UnstructuredGrid formats of the Visualization Toolkit (VTK) [145], respectively, and visualized with ParaView [146]. Other visualization software known to work with exported system models includes applications such as MeshLab [147] or GeomView [148] as well as libraries such as XML3D [149] or BALL [30].

As the only cross-platform format for whole system models, HMO+ plays a special role in the workflow of the protein electrostatics problem followed in this manuscript. While system models can be directly fed into both of our Julia solvers, NESSie.BEM (Section 6.3) and CuNESSie (Chapter 7), HMO+ is used to transfer the preprocessed models to our Impala solver AnyBEM (Chapter 8). An example for the role of NESSie.Format as conversion unit for system models based on surface meshes is shown in Figure 6.3.

6.3 Solvers and post-processors

In compliance with the three-step NESSie workflow presented in Section 6.1, the system models can be passed to any compatible solver. Currently, NESSie ships

with different variations of solvers for the local and nonlocal electrostatics problems, all of which expect triangulation-based system models. Two variants of each solver determine whether the system matrix is to be represented explicitly (as a simple floating-point matrix) or implicitly (as a composition of interaction matrices). Depending on the precision level of the system model (i.e., single or double floating-point precision), the specialization of the chosen solver variant for the same precision level is invoked automatically. In any case, the sole input to the solver is a complete system model, which is then used to compute and return the approximation coefficients of the reaction field potentials at the molecular surface. The return value of all solvers is encapsulated in a uniform result type that can then be fed into any available post-processor (and any number thereof).

6.3.1 Solvers for the explicit representations

The local and nonlocal BEM solvers for explicit system representations are closely modeled after the prototypical reference program briefly introduced at the beginning of this chapter. The primary goal of these solvers is computational efficiency in spite of the tremendous memory requirements imposed by the explicit representations. For this purpose, matrix and vector objects occupy pre-allocated memory that is reused whenever possible. In particular, each potential matrix is computed at most once, and most operations are performed in place by means of state-of-the-art BLAS [130] routines, which are available through Julia’s LinearAlgebra module.

Overall, the solving procedure with explicit representations is split into two computationally expensive stages: First, the system matrix and system outputs are fully assembled. Then, the system is solved using Julia’s standard solver, which internally delegates the task to a LAPACK solver [131]. The reference program generally performs all computations in a single thread of execution, although both stages contain several data-independent operations that can conceptually be computed in parallel. As we have seen before, this is especially true for the system matrix, or more precisely, the elements of the potential matrices. What is more, some BLAS implementations provide optional multithreading support for their operations (e.g., OpenBLAS6 and ATLAS [136]). In Julia, this behavior can be controlled through an additional function call, granting a partially concurrent version of the otherwise serial solvers for free. Other than that, multithreading support for embarrassingly parallel tasks (such as the system matrix assembly) can be enabled via Julia macros from Julia’s Threads module (cf. Section 4.1).
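For reference, the mentioned function call is a one-liner; the snippet below is merely a usage sketch and the chosen thread count is arbitrary:

    using LinearAlgebra

    # Allow the BLAS backend (e.g., OpenBLAS) to use four worker threads for
    # its operations, such as the in-place products used by these solvers.
    BLAS.set_num_threads(4)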

6https://www.openblas.net/

Listing 6.4 – Simple parallel matrix-vector product in Julia

    using Base: Threads

    function mv_parallel(A::Matrix{Float64}, x::Vector{Float64})
        m, n = size(A)
        dst = zeros(m)
        Threads.@threads for i in 1:m
            dst[i] = sum(A[i, j] * x[j] for j in 1:n)
        end
        dst
    end

That being said, we assume typical system matrices to exceed feasible memory limits in the context of this work, rendering the explicit-representation variants of the BEM solvers highly impractical. While coarse-grained triangulations of smaller molecules only occupy a few gigabytes of memory and could just as well be represented explicitly, neither the local nor the nonlocal systems scale well in this regard. This is why we utilize these solvers only as a tool for comparison against the reference program and do not explicitly provide multithreading support for them.

6.3.2 Solvers for the implicit representations

The local and nonlocal BEM solvers for the implicit system representations are a direct realization of the data types developed in Section 5.3.5 and the solving techniques discussed in Section 5.4. As such, the system matrices are represented by dedicated matrix-like types, accessible only in terms of matrix-vector products (for the iterative solvers) and their diagonals (for the preconditioning). The potential matrices serving as their components are modeled through interaction matrices available from our own Julia package ImplicitArrays.jl7, with specialized matrix-vector product implementations in NESSie. We selected the restarted GMRES solver from the IterativeSolvers.jl package as a sensible default for all BEM systems, augmented with a Jacobi preconditioner from the Preconditioners.jl package (see Chapter 9 for a short discussion).
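The general call pattern resembles the following sketch (not the actual NESSie code): the system matrix A is assumed to support matrix-vector products and diag, a plain Diagonal object stands in for the Jacobi preconditioner, and the restart length is arbitrary:

    using IterativeSolvers, LinearAlgebra

    # A: implicitly represented BEM system matrix, b: right-hand side vector
    function solve_sketch(A, b::Vector{Float64})
        Pl = Diagonal(diag(A))               # Jacobi (diagonal) preconditioner
        gmres(A, b; Pl = Pl, restart = 200)  # restarted GMRES from IterativeSolvers.jl
    end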

All computations in NESSie are exclusively performed on the CPU. Therefore, we implemented basic multithreading support for the solvers through Julia’s @threads macro. More specifically, the matrix-vector products for the potential matrices8 are performed row-wise, where the matrix rows are equally divided between the available worker threads and computed in parallel. A simplified implementation for

7https://github.com/tkemmer/implicitarrays.jl

[Figure 6.4, schematic: LocalBEMResult{T,E} and NonlocalBEMResult{T,E} as concrete subtypes of the abstract «interface» type BEMResult{T,E}.]

Figure 6.4. – Common BEM solver result type in NESSie, where T is a subtype of AbstractFloat and E is a subtype of Element{T}.

such a multithreaded matrix-vector product is shown for double-precision floating-point values in Listing 6.4. The number of worker threads can be controlled through the JULIA_NUM_THREADS environment variable. Unfortunately, the multithreading facilities in Julia are still considered an experimental language feature (as of version 1.4) and do not scale particularly well. This is one of the reasons why we focus on GPU-accelerated parallelization schemes in a separate Julia package that will be the main topic of the next chapter. We will get back to another CPU-parallel implementation of our solvers later in Chapter 8.
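As a usage sketch (assuming a POSIX-like shell; the requested thread count is arbitrary), the environment variable takes effect when the Julia process is started:

    shell> JULIA_NUM_THREADS=8 julia
    julia> Threads.nthreads()
    8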

6.3.3 Post-processors

Each BEM solver returns an object representing the given system model and the computed approximation coefficients for the electrostatic potentials on the protein surface. In other words, the resulting objects contain all information needed for the calculation of derived quantities (cf. Section 3.4). NESSie currently ships with post-processors for the interior and exterior electrostatic potentials as well as for the computation of reaction field energies. The objects returned from our solvers are modeled in such a way that they can be used with any one of these post-processors. More specifically, the NESSie.BEM module provides an abstract BEMResult type, which is used as a base type for all solver results (Figure 6.4). Each post-processor can be addressed by its unique function name. When passed to a post-processor function, a suitable specialization is then automatically determined by the given system model and solver type (e.g., local or nonlocal).

6.4 Test models

While the BEM formulations for the protein electrostatics problem can be solved analytically for simple geometries, surface approximations of actual proteins are

8For potential matrix K, we actually provide a full matrix product implementation to account for the fact that two products with the matrix have to be computed in each iteration (cf. Section 5.2.2).

[Figure 6.5, panels: (a) Born model; (b) Poisson test model. Both show the sphere Γ around the origin P0, the interior domain Ω, and the surrounding solvent domain Σ.]

Figure 6.5. – Schematic representation of the test models available in NESSie. In all cases, the “protein” boundary Γ is a sphere centered at the origin P0 = [0, 0, 0]^T. Left: a single point charge is located at P0. Right: Multiple point charges are provided by a rescaled protein (gray shape inside the sphere).

usually far too complex for this purpose. This is why NESSie provides two test models for simple spherically-symmetric systems that can be used for the evaluation of the solver results. We briefly describe these models in the following sections.

6.4.1 Born model

The first model represents a monoatomic ion, modeled as a single point charge at the center of a vacuum-filled sphere (Figure 6.5a). The implementation of this so-called Born model in NESSie is based on the derivation in [49, Section 5.3]. We provide implementations for both local and nonlocal problem formulations, with their corresponding potential functions being summarized in Appendix A.
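For orientation (this is the textbook local limit rather than one of the nonlocal expressions from Appendix A, and conventions may differ slightly from those used there), the reaction field energy of a Born ion with charge q and sphere radius a, transferred from vacuum into a solvent with dielectric constant εΣ, takes the well-known closed form

    \[
        \Delta W_{\mathrm{Born}} = -\frac{q^2}{8\pi\varepsilon_0 a}\left(1 - \frac{1}{\varepsilon_\Sigma}\right),
    \]

where ε0 denotes the vacuum permittivity (SI units).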

In addition to these functions, NESSie ships with Born models and spherical meshes for a selection of different monovalent and divalent ions, so that the models can be solved analytically or via our BEM solvers. The individual sphere radii for these Born ions are physically motivated and have been chosen according to [49, Section 5.3.2], based on simulations from [150]. A list of available Born ions in NESSie is given in Section 9.2.1.

6.4.2 Nonlocal Poisson test model

The second test model can handle multiple charges inside the sphere at once. The implementation of this model in NESSie is based on the first nonlocal Poisson dielectric test model presented in [151] and henceforth referred to as nonlocal Poisson test

model. In their publication, the authors present two different variants of the Poisson test model for the nonlocal case, in addition to one for the local case. Furthermore, they provide a reference implementation in the form of a free Fortran package. The latter can be used to compute electrostatic and reaction field potentials for a given point charge model (via a PQR file), a given sphere radius, and a pre-defined set of system constants. The charge model is usually computed from a suitable protein, which is then rescaled to fit into the sphere (Figure 6.5b).

The interior and exterior potential functions used to implement one of the original nonlocal Poisson dielectric test models in NESSie are summarized in Appendix A. It should be noted, however, that the Poisson test model assumes a slightly different behavior of the nonlocal dielectric function of the solvent as compared to the BEM formulations our solvers are based on. While the response function is also expressed in terms of two dielectric constants εΣ and ε∞, it is considered to act on the whole space R^3 instead of being restricted to the solvent domain Σ. This behavior effectively extends the solvent interdependencies into the protein domain Ω and thus leads to slightly different results for the reaction field potentials as compared to our BEM solvers. As a consequence, direct comparisons between the analytical test model and the numerical BEM solvers should always be taken with a grain of salt. We will later see in Section 9.3.1 that the differences primarily occur in close proximity to the sphere and that they become negligible in the full electrostatic potentials.


7 CuNESSie.jl

With the addition of implicit system model representations to NESSie in Chapter 6, the package directly contributes towards the feasibility of the BEM approach for the nonlocal protein electrostatics task. While interaction matrices for the potential matrices reduce the required memory drastically as compared to an explicit representation, this comes at the cost of computational efficiency. In particular, the expensive computations of the matrix elements have to be run repeatedly during the iterative solving process. Unfortunately, CPU parallelism offers only limited performance gains in this regard. Instead, the embarrassingly parallel nature of the matrix element computation can be better exploited using hardware accelerators. For this purpose, we present CuNESSie.jl, or CuNESSie for short, our own Julia package that extends NESSie with CUDA-accelerated versions of the local and nonlocal BEM solvers for the implicit system representations.

7.1 Interoperability with NESSie.jl

CuNESSie is built upon NESSie data types and can be neatly integrated into its usage workflow: CuNESSie reuses the NESSie system model representation (cf. Section 6.2) and can thus work with models generated or preprocessed through NESSie directly. Similarly, the CuNESSie solvers reuse NESSie’s dedicated BEM result types (cf. Section 6.3.3). This is why CuNESSie not only behaves in many ways like its role models, but its solvers and post-processors can even be used as drop-in replacements for all (or some) of their NESSie counterparts. Hence, the adapted usage workflow for the protein electrostatics task in Julia takes the following form:

NESSie system model → NESSie/CuNESSie solvers → NESSie/CuNESSie post-processors

7.2 GPU abstraction

CuNESSie does not provide an extensive submodule structure but rather exposes its interface functions directly to the users. Internally, however, most functionality is organized either into host code or device code, depending on whether it is meant to be executed on the CPU or GPU, respectively. The transition between host and device is usually handled automatically in our package and thus fully transparent to the user. Beyond this abstraction layer, the exact context of computation still has to be taken into account throughout the implementations, as the different platforms impose certain restrictions with regard to the availability of functions and data types as well as to differing memory locations for the handled objects.

7.2.1 Host vs. device code

As briefly introduced in Section 4.3, device code is initially called through a kernel function, which invokes the context switch from host to device. It is usually not obvious at first glance whether a given function is part of the host or device code since both are written in Julia. Conversely, not every piece of Julia code could or should be executed on the GPU. For example, device code cannot perform dynamic function calls, i.e., the function dispatch must always happen at compile time. This restriction already excludes a considerable number of Julia functions from device code, especially those relying on the dynamic inference of their argument types. For other functions, there might simply be more efficient implementations provided by the CUDA framework. Most notably, this includes mathematical functions from the CUDA Math API, which are available through the CUDAnative.jl package1 [103, 104] and which should be preferred over their native Julia counterparts.
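As a minimal sketch of this preference (launch configuration and data transfer omitted; the vectors are assumed to already reside on the device), a kernel would call the CUDA Math API variant of a function instead of its Base counterpart:

    using CUDAnative

    #= device code =#
    function sqrt_kernel(dst::CuDeviceVector{Float32}, src::CuDeviceVector{Float32})
        i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
        if i <= length(dst)
            dst[i] = CUDAnative.sqrt(src[i])  # CUDA Math API instead of Base.sqrt
        end
        return
    end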

Similarly, not every data type can be used in device code. Apart from those requiring dynamic type deductions (e.g., owing to abstract data members), array types as well as aggregate types containing arrays need special treatment. This is mainly due to the fact that each GPU ships with its own separate memory, which is commonly further split into different purpose-specific memory types and might need to be associated explicitly with objects in the code.

1https://github.com/JuliaGPU/CUDAnative.jl

Listing 7.1 – Concept draft for a CuArray-compatible kernel function in Julia

    using CuArrays, CUDAnative

    #= device code =#
    function cuvector_kernel(v::CuDeviceVector{T}) where T
        @cuprintln("Commencing kernel execution...")
        # ...
        return
    end

Listing 7.2 – Kernel call example with a CuArray parameter in Julia

    julia> v = CuVector([1, 2, 3])
    3-element CuArray{Int64, 1, Nothing}:
     1
     2
     3

    julia> @cuda cuvector_kernel(v)
    Commencing kernel execution...

7.2.2 CuArrays

A typical data processing workflow with CUDA-enabled GPUs heavily depends on array representations. The input data is usually first prepared on the host before being copied into a suitable part of the device memory. Subsequent results are then, in turn, copied back from the device onto the host.

To facilitate this frequently occurring procedure, one can utilize the dedicated array types of the CuArrays.jl package2 [103, 152]. The general CuArray type and its more specific variants CuVector and CuMatrix (for one- and two-dimensional arrays, respectively) represent data residing in the device memory, with an interface usable in host code. In other words, array data encapsulated in a CuArray object is silently transferred to the device, leaving the user with an object that mostly behaves like any other array, except that operations on it might be performed on the device.

The CuArrays.jl package provides many overloaded functions for its arrays, including vectorized array operators (e.g., element-wise sums or products) and matrix products. These functions internally use native vendor libraries whenever possible. The underlying array data is automatically transferred back into host memory as soon as the CuArray object is converted into a regular Julia array.
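A small usage sketch of this behavior (the exact display of the array types may differ between package versions):

    using CuArrays

    A = CuMatrix(rand(Float32, 4, 4))  # copied into device memory on construction
    x = CuVector(rand(Float32, 4))

    y = A * x          # matrix-vector product, dispatched to the device
    z = y .+ 1f0       # element-wise (broadcast) operation, also performed on the device

    host_z = Array(z)  # transfers the result back into host memory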

Alternatively, CuArray objects can be passed manually to a kernel function, where they are converted into their corresponding device-specific representation, namely, CuDeviceArray, CuDeviceVector, or CuDeviceMatrix (all provided by the

2https://github.com/JuliaGPU/CuArrays.jl

CUDAnative.jl package). Listings 7.1 and 7.2 show a simple example for such a kernel function that can be invoked with a CuArray argument. CuVector and CuDeviceVector are per se incompatible types and since we are passing an object of the former type to a function that requires the latter, we would expect the kernel call to fail. However, the type conversion is performed as part of the @cuda macro before the function is evaluated, causing the kernel to be called with the correct object in the first place. The actual conversion is implemented through an abstract interface provided by the Adapt.jl3 package [103], which we will later extend for custom device array types.

7.3 Potential matrices for CUDA

Vector operations form an essential basis for our BEM solvers, but we still need to translate the (host-based) implicit system representations into a device-compatible form. In this section, we particularly focus on their building blocks, i.e., suitable representations for the potential matrices and efficient implementations for their matrix-vector products.

7.3.1 Semantically-structured device arrays

Choosing interaction matrices as blueprints for the potential matrices implies the need for a device-compatible representation for their row and column elements as well as their interaction functions. The latter can be considered translations of the original functions in terms of device code. And as long as we assume that each matrix element is computed in a single thread of execution, we do not need to develop sophisticated parallelization schemes for the same purpose.

For the row and column elements, we can use CuArray or CuDeviceArray objects, depending on the context. So far, we have chosen floating-point vectors to represent positions in space. As a consequence, the row elements were implemented as a vector of such vectors (for the triangle centroids) and the column elements were implemented as a vector of aggregated vectors (for the triangle nodes, the centroids, and the normal vectors; see Section 5.3.5).

Unfortunately, working with several short vectors in CUDA can become quite cumbersome because one cannot freely and dynamically allocate arrays in device code, as

3https://github.com/JuliaGPU/Adapt.jl

Listing 7.3 – Concept draft for device-compatible position vectors in Julia

    using CUDAnative

    #= host/device code =#
    const CuPosition{T} = NamedTuple{(:x, :y, :z), NTuple{3, T}}

    #= device code =#
    struct CuPositionVector{T} <: AbstractArray{T, 1}
        dim::Int
        vec::CuDeviceVector{T, AS.Global}
    end

    Base.size(v::CuPositionVector{T}) where T = (v.dim,)

    function Base.getindex(
        v::CuPositionVector{T},
        i::Int
    ) where T
        (x = v.vec[i], y = v.vec[v.dim + i], z = v.vec[2 * v.dim + i])
    end

Listing 7.4 – Example usage of named tuples in Julia

    julia> v = (x = 1.0, y = 2.0, z = 3.0)
    (x = 1.0, y = 2.0, z = 3.0)

    julia> v isa CuPosition{Float64}
    true

    julia> v.x, v.y, v.z
    (1.0, 2.0, 3.0)

opposed to native Julia host code. Such nested (and possibly mixed) array structures are usually flattened into simpler floating-point vectors before transferring them onto the device, often resulting in a loss of structural information. For example, the row elements for our potential matrices could be brought into the form of a single floating-point vector with the first n values representing the x-coordinates of all n surface triangles, the next n values representing their y-coordinates, and the remaining n values representing their z-coordinates.

While this form of data is usually easier to express in device code, it no longer contains any information regarding the underlying structure. Consequently, such representations are highly error-prone and generally hard to maintain. In the above example, the same representation could have been achieved by concatenating the coordinates of all n triangles directly instead of separating the axes. In both cases the result is a floating-point vector of length 3n, but accessing the individual positions requires additional knowledge about the ordering to provide correct results.
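To make this ambiguity concrete, the following sketch (assuming pos is given as a Vector of 3-element coordinate vectors) produces two flattened vectors of identical length that cannot be distinguished by the data alone:

    # Axis-wise flattening: all x-coordinates first, then all y-, then all z-coordinates.
    flatten_by_axis(pos) = vcat([p[1] for p in pos], [p[2] for p in pos], [p[3] for p in pos])

    # Point-wise flattening: x, y, and z of the first position, then the second, and so on.
    flatten_by_point(pos) = vcat(pos...)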

One simple way to mitigate this problem is to provide an abstraction layer on top of the flattened single-vector representation that allows accessing the row elements

Listing 7.5 – Concept draft for device-compatible triangle vectors in Julia

    #= host/device code =#
    const CuTriangle{T} = NamedTuple{
        (:v1, :v2, :v3, :normal),
        NTuple{4, CuPosition{T}}
    }

    #= device code =#
    struct CuTriangleVector{T} <: AbstractArray{T, 1}
        dim::Int
        vec::CuDeviceVector{T, AS.Global}
    end

    Base.size(v::CuTriangleVector{T}) where T = (v.dim,)

    function Base.getindex(
        v::CuTriangleVector{T},
        i::Int
    ) where T
        # ...
    end

in a semi-structured manner. More specifically, we aim at a vector representation for positional data that behaves like CuArray and allows its elements to be read into individual registers when requested. Listing 7.3 shows an example implementation for such an abstraction layer, implemented by means of Julia’s own NamedTuple type. First, we create a type alias CuPosition{T} to represent a single position as a tuple containing three objects of type T associated with the names x, y, and z. The individual values can then be accessed through the tuple as if they were member variables of a corresponding aggregate type (see Listing 7.4 for an example).

The CuPositionVector type internally uses the same flattened row element representation as described above, separating all coordinates by axis. The base vector is here stored in global device memory (AS.Global) and its length – in terms of positions – is stored separately (dim). The getter is then responsible for creating named tuples for a given position index, such that each position can be treated and passed around like a single object in the device code, while for the device itself, the coordinates are just individual floating-point values occupying local registers once read from global memory.

A similar representation can be found for the column elements by using a named tuple for each surface triangle, given a suitable flattened base representation (Listing 7.5). The transparent nature of such an additional abstraction layer leaves us the option to change the internal structure of the represented elements, e.g., to implement more beneficial memory layouts (cf. Section 7.3.2), without affecting dependent code. For instance, the column elements could be internally stored solely

Listing 7.6 – Automatic host-to-device conversion for position vectors in Julia

    using Adapt, CuArrays

    #= host code =#
    function Adapt.adapt_storage(
        a::CUDAnative.Adaptor,
        pos::Vector{Vector{T}}
    ) where {T <: AbstractFloat}
        # rearrange data into desired layout
        vec = CuVector(T[#= ... =#])
        # convert to device type
        CuPositionVector(length(pos), Adapt.adapt_storage(a, vec))
    end

    #= device code =#
    function hello_kernel(vec::CuPositionVector{T}) where T
        @cuprintln("Greetings, human! Your position vector was well received!")
        # ...
        return
    end

Listing 7.7 – Kernel call example for the automatic host-to-device conversion in Julia

    julia> v = Vector{Float64}[zeros(3), ones(3)]
    2-element Array{Array{Float64, 1}, 1}:
     [0.0, 0.0, 0.0]
     [1.0, 1.0, 1.0]

    julia> @cuda hello_kernel(v)
    Greetings, human! Your position vector was well received!

by the individual triangle nodes while other properties, such as the normal vectors, are computed by the getter function.

In order to enable automatic memory transfer (and host-to-device type conversions) for our new vector types, we can extend the same Adapt.jl interface used by the CuArrays.jl data types. For this, we first need to decide on how to represent the vectors on the host and then provide an overloaded version of the corresponding conversion function. Listing 7.6 shows an example implementation of such a function (Adapt.adapt_storage) for our CuPositionVector type. The row elements are initially represented as a vector of floating-point vectors (pos), rearranged to match the desired memory layout, and subsequently transferred onto the device through a CuVector object (vec). The data is finally packaged into a CuPositionVector object, where the CuVector is converted into a corresponding CuDeviceVector through its regular Adapt.jl conversion function. The row elements can now finally be passed in their host representation (Vector{Vector{T}}) to a suitable kernel function with a CuPositionVector{T} parameter (hello_kernel) through the @cuda macro (Listing 7.7).

7.3.2 Parallel matrix-vector products

The techniques presented in the last section allow us to reuse our previous potential matrix representations in the host code and convert the corresponding row and column elements into suitable device representations. But we still need to translate the interaction functions into device code before we can fully use the interaction matrices in kernel functions. To fully utilize the multiprocessing capabilities of CUDA-enabled GPUs, we also need suitable parallelization schemes for these functions. More specifically, we need to decide which parts of the computations are best performed in parallel. We could try to parallelize the individual interaction functions to speed up the computation of individual elements. However, since the matrices are only accessed in terms of matrix operations, e.g., matrix-vector products, many if not all elements are required eventually. What is more, by developing parallelization schemes for these operations rather than the elements, we can exploit the fact that the elements of the same potential matrix always share the exact same interaction function, suggesting an underlying SIMD (single instruction, multiple data) pattern in the process.

While we focus on matrix-vector products as the main matrix operation required by the iterative solvers, the same ideas are transferable to other operations as well. Corresponding parallelization schemes for the less complex diag operation, used for Jacobi preconditioning in our solvers, can be found in the CuNESSie repository. Each operation is represented as a dedicated kernel function in the device code. For these functions to be as efficient as possible, we need to consider a few aspects. Typical performance bottlenecks of CUDA applications involve memory access and data transfers. Thus, we want to keep the number and size of memory transfers between host and device at a level that does not negatively impact the overall performance. The same applies to the number of memory accesses performed on the device side.
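For illustration, a kernel for the diag operation could be structured as follows; this is a sketch in the spirit of the matrix-vector kernel presented later in this section (Listing 7.8) rather than the actual CuNESSie implementation:

    #= device code =#
    function diag_kernel!(
        dst::CuDeviceVector{T},         # destination vector for the diagonal
        rowelems::CuPositionVector{T},  # row elements
        colelems::CuTriangleVector{T},  # column elements
        interact::F                     # interaction function
    ) where {T, F <: Function}
        i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
        i > length(rowelems) && return
        dst[i] = interact(rowelems[i], colelems[i])  # i-th diagonal element
        return
    end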

All potential matrices for the local and nonlocal BEM solvers share the exact same row and column elements. This observation implies that, in concept, it would suffice to transfer them to the device once, e.g., at the beginning of the execution, and reuse them for all operations, given that there is enough space on the device. Fortunately, this behavior can be easily achieved through the CuVector-based row and column element implementations presented in the last section, as long as we pass around references to the same vector objects in the code. Furthermore, this observation is completely independent of any particular matrix operation, as the row and column elements constitute the whole potential matrix (together with a suitable interaction function). Also, since the BEM system matrices are simple compositions of the same


Figure 7.1. – Partition scheme used for the CUDA-accelerated matrix-vector products in CuNESSie.jl. The matrix is split into numbered blocks of 128 rows (bi) and each row is assigned a dedicated thread (tj).

potential matrices and some known vectors, the effective memory requirements for the implicit BEM system representations are in O(n), given a protein surface approximation of n triangles. Owing to this negligible memory footprint, the data required for the full solving process fits easily into a few hundred megabytes of device memory, even for large surface triangulations. This demotes memory transfers to a minor contributor to the overall runtime of our CUDA-accelerated solvers.

A single matrix-vector product requires each potential matrix element to be computed at some point. Since there is no direct data dependency between the matrix elements, we could assign each element to a separate thread of execution, causing each thread to read a single position and a single triangle from the row and column elements residing in the global device memory. Conversely, for a matrix of size n², each position and triangle is read by n threads, respectively. Unfortunately, in the context of matrix-vector products, this approach leads to a new data dependency between threads, as each element of the result vector represents the product of the corresponding matrix row and the input vector. In other words, in order to compute a single element of the result vector, n matrix elements have to be multiplied with the corresponding input vector components and then summed up. The latter sum could be efficiently implemented through the shared memory of the device.

Alternatively, we could attempt to get rid of the new data dependency altogether and compute each result vector element within the registers of a single thread. In this scenario, each position is only read once for each matrix-vector product, along with all triangles and the full input vector. Figure 7.1 shows a partition scheme for

Listing 7.8 – CUDA-enabled matrix-vector product for interaction matrices in Julia

    #= device code =#
    function mv_kernel!(
        dst::CuDeviceVector{T},         # destination vector
        rowelems::CuPositionVector{T},  # row elements
        colelems::CuTriangleVector{T},  # column elements
        interact::F,                    # interaction function
        x::CuDeviceVector{T}            # vector x
    ) where {T, F <: Function}
        # compute matrix row
        i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
        i > length(rowelems) && return

        ξ = rowelems[i]
        val = T(0)
        for j in 1:length(colelems)
            val += interact(ξ, colelems[j]) * x[j]
        end
        dst[i] = val
        nothing
    end

the new thread distribution. In CUDA, threads are grouped into warps and blocks (cf. Section 4.3). While warps usually consist of a fixed number of 32 threads, the block size can be set manually. For reasons that will be discussed later (cf. Section 9.4.3), we assume block sizes of 128 threads, with the possible exception of the last block being smaller in cases where n is not a multiple of 128. Thus, each block houses a total of four warps.
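A matching launch configuration might look as follows (a sketch only; mv_kernel! and its arguments are assumed to be set up as in Listing 7.8, with the arrays already converted to their device representations):

    using CUDAnative

    threads = 128              # block size: four warps per block
    blocks  = cld(n, threads)  # enough blocks to cover all n matrix rows

    @cuda threads=threads blocks=blocks mv_kernel!(dst, rowelems, colelems, interact, x)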

Listing 7.8 shows a simplified implementation of a CUDA-accelerated matrix-vector product for the presented partition scheme. For the sake of simplicity, the potential matrix is here not represented using a dedicated device type but rather through its individual components (i.e., row and column elements as well as an interaction function). The kernel function represents a single thread of execution, where in the first step we determine the corresponding matrix row i from the current thread index, block index, and the block dimensions. Afterwards, the i-th position is read from the row elements residing in the global device memory. The remaining function body carries out the actual computation through an element-wise sweep over the matrix columns4. In each iteration, a single triangle and a single input vector element are read from the global memory. It should be noted that these memory accesses happen at the same time for the same objects for all threads of the current warp.

4In the actual kernel implementation, the separate multiply and add operations in each iteration are replaced by the fused multiply-add (FMA) instruction to reduce the number of involved rounding operations for the floating-point values. FMA is available through the CUDA Math API as CUDAnative.fma.

[Figure 7.2, panels: (a) Array-of-structs (AoS) layout – coordinates stored as x1 y1 z1 x2 y2 z2 ... x32 y32 z32; (b) Struct-of-arrays (SoA) layout – coordinates stored as x1 x2 ... x32 y1 y2 ... y32 z1 z2 ... z32. In both panels, the warp threads t1 ... t32 each access one position per time step.]

Figure 7.2. – Different memory layouts for the internal representation of position vectors

Apart from the number of memory accesses and transfers, the memory access pattern must also be accounted for in efficient CUDA code. Generally, contiguous (or coalesced) memory access should be preferred over random access in kernel code, as it allows cached data to be reused efficiently with a minimum of memory transfers from the respective memory (e.g., global or shared). For a position vector modeled using named tuples (as shown in the last section), the coordinate values are always read in the order of their appearance in the tuple definition. In other words, reading a CuPosition object results in first accessing its x-coordinate, then its y-coordinate, and finally its z-coordinate. This suggests that the coordinates should be stored in the same pattern in the underlying CuVector. However, this would only be beneficial when restricting the matrix-vector product computation to a single thread. As evident from Figure 7.2a, all threads of the same warp read their corresponding position at the same time from the memory, leading to 32 x-coordinates being accessed simultaneously. Thus, instead of coalesced memory access, we have a strided memory access pattern, which is not optimal.

The storage pattern described above is commonly referred to as array of structs (AoS), in which the position vector is considered an array of aggregate types (structs) representing its elements5. A better solution for the position vectors is to store the elements in the same way presented in the last section, that is, by splitting

5We are using the term AoS loosely here because the elements are modeled as named tuples rather than classic structs. In any case, the basic idea remains the same.

[Figure 7.3, panels: (a) Flat representation – each triangle stores its three vertices, centroid, and normal vector explicitly; (b) Flat node-based representation – each triangle stores only its three vertices, with shared nodes duplicated; (c) Node-based representation reusing coordinates – unique nodes are stored once and triangles refer to them by index.]

Figure 7.3. – Different memory layouts for the internal representation of triangle vectors

the coordinates (cf. Listing 7.3). This storage pattern is commonly referred to as struct of arrays (SoA), where the position vector is an aggregate type of arrays representing the separated element members. Here, the SoA layout allows us to access the position vectors in a coalesced way across the warp threads.

For the column elements, i.e., the triangle vectors, the AoS pattern turns out to be more suitable, since all threads access the same memory locations at the same time. Thus, the triangles are also accessed sequentially, down to the level of their members. This is mainly due to the fact that we already chose a flattened base storage pattern for the triangles in the first place. Different layout options are shown in Figure 7.3, with the first two options representing the flattened representations (where derived triangle members are either stored explicitly or computed on demand).

At the other end of the spectrum, we could have opted for a representation similar to the HMO+ format (cf. Section 6.2.3), where the triangle nodes are stored separately and referred to by their respective index (third option). While this representation is clearly more space-efficient than the flattened representation, it also leads to chaotic access patterns that would significantly impair the performance of the matrix-vector product implementations. Since the memory footprint of the implicit BEM system

representations is in the range of a few megabytes (cf. Chapter 9), the flattened representation scales well enough for the methods presented here. Apart from that, the actual storage pattern is completely hidden in the getter and conversion functions of the position and triangle vectors. As a consequence, changes in the layout do not affect dependent code, rendering the abstraction layer for semantically-structured device arrays a valuable means for developing CUDA code in Julia.

7.4 Solvers and post-processors

With device versions of vectors and interaction matrices at hand, we can finally implement CUDA-accelerated versions of our BEM solvers and post-processors. As described in Section 7.1, the main goal is to provide drop-in replacements for the NESSie.BEM module. For this reason, we only briefly summarize their specific characteristics in this section, while the interested reader is kindly referred to Section 6.3 for a more general interface description.

7.4.1 CUDA-accelerated solvers

On the surface, the BEM solvers in CuNESSie behave exactly like their NESSie counterparts: They assemble the linear systems from the same potential matrices and subsequently solve them through the restarted GMRES implementation of the IterativeSolvers.jl package, using a Jacobi preconditioner from the Preconditioners.jl package, and provide the exact same input and output interface via the NESSie system model and the BEMResult types, respectively (cf. Section 6.3.3). The only difference is the fact that the systems are now represented in such a way that all matrix (and some vector) operations are performed on the GPU rather than the CPU. This is due to additional host types for the local and nonlocal system matrices with overloaded functions, which invoke the corresponding CUDA kernels. While the focus on matrix operations implies that the vast majority of instructions from the GMRES implementation are still executed on the CPU (and in a single thread of execution at that), the involved operations are comparatively cheap to compute and, in terms of runtime, fully dominated by the optimized matrix operations. For the latter operations, we use the device types and kernel functions presented in the last sections. In addition, all vector operations within the context of these matrix operations are (implicitly) parallelized through the CuVector type we use for floating-point vectors.
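The following sketch illustrates this overloading pattern in general terms; the type name and the kernel launcher are hypothetical and do not reflect the actual CuNESSie types:

    using LinearAlgebra

    # Hypothetical host-side wrapper around a device-resident system matrix.
    struct DeviceSystemMatrix{T}
        n::Int   # number of rows/columns
        # ... references to device-side row/column elements, etc. ...
    end

    Base.eltype(::DeviceSystemMatrix{T}) where T = T
    Base.size(A::DeviceSystemMatrix) = (A.n, A.n)
    Base.size(A::DeviceSystemMatrix, i::Int) = size(A)[i]

    # The iterative solver only requires matrix-vector products; here they are
    # delegated to a (hypothetical) helper that launches the CUDA kernels.
    function LinearAlgebra.mul!(dst::AbstractVector, A::DeviceSystemMatrix, x::AbstractVector)
        launch_matvec_kernels!(dst, A, x)
        dst
    end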

7.4.2 CUDA-accelerated post-processors

While NESSie provides solvers for both explicit and implicit system representations, the post-processors exclusively use the explicit representations as a basis. This is usually not an issue since the potential matrices in the post-processors are only of size 1 × n for each individual observation point and a surface approximation of n triangles (cf. Section 3.4). Even if the derived quantities are computed for several observation points at once, the corresponding matrices can be split into manageable chunks, allowing fine-grained control over the memory footprint of the post-processors. On the other hand, reusing the CUDA-accelerated potential matrices for similarly-optimized post-processors is trivial. By replacing the row elements with the observation points passed to the respective post-processor, the very same potential matrix types can be utilized as for the solvers. For this reason, CuNESSie ships with CUDA versions of all previously presented post-processor functions, namely, those for the computation of electrostatic potentials everywhere in space as well as reaction field energies.

8 AnyBEM

In this chapter, we present the AnyBEM project, our alternative to the Julia-based implementations of the protein electrostatics problem with implicit BEM system representations. AnyBEM exploits the domain-specific properties of the Impala programming language. While the underlying ideas of the implemented data structures and functions remain the same as developed over the course of this manuscript and implemented in NESSie and CuNESSie, we focus on the domain-specific abstraction layers newly introduced to the problem. These layers allow the solvers to be implemented in a way that is fully transparent with respect to different platforms, different degrees of parallelization, and different levels of floating-point precision.

8.1 Package organization

The AnyBEM package can be divided into different components: Its core part is represented by the libanybem library1, which provides data types and operations for the nonlocal protein electrostatics problem in Impala. Built on top of that are tests, to ensure that the library components work as intended, and a command-line interface, implemented as a separate application anybem to provide easy access to libanybem functionality. The library can further be split into an Impala part and a C++ part. The former part contains the majority of the program logic, including buffer-based vector types and a minimalistic math library for position vectors (cf. Section 5.5.1) as well as the implicit system representations and domain-specific matrix operations. The C++ part serves as an interface between Impala and client applications and provides basic HMO+ file support (cf. Section 6.2.3). Together, the library comprises the full infrastructure to generate implicit representations of the local and nonlocal BEM systems. Unfortunately, at the time of writing, there are no iterative solver implementations natively available for Impala that could be used to solve the systems. We have, however, discussed in Section 5.6 how to incorporate such libraries in the future.

1Although AnyBEM technically refers to the package as a whole, we commonly use the name as an alias for libanybem in this manuscript as the other components do not add any new functionality.

Listing 8.1. – Example for cross-language function calls in C++ (top) and Impala (bottom)

    // C++ part
    extern "C" {
        int times_two(int i) {
            return 2 * i;
        }
        extern int main(int, char **);
    }

    // Impala part
    extern "C" {
        fn times_two(i32) -> i32;
    }

    fn main(argc: i32, &[&[u8]]) -> i32 {
        let number = times_two(argc) + 1;

        print_string("Heya! Have a number: ");
        print_i32(number);
        print_string("\n");
        0 // EXIT_SUCCESS
    }

8.2 Cross-language features

The remarkable interoperability between C/C++ and Impala is not a characteristic feature of AnyBEM but rather common to most Impala applications. Generally, this strong connection serves two purposes: Firstly, exposing Impala functions through a suitable C/C++ interface facilitates the incorporation of Impala programs into other projects. Once compiled into a C/C++ library, a corresponding Impala program can be treated just like any other C/C++ library and linked against by its consumers2. Secondly, the bidirectional nature of the language interconnection allows for the splitting of core functionality, so that each function of the program can be implemented in the language best suited for its purpose. Typical use cases outsourced to C++ include I/O, string manipulation, and exception handling.

8.2.1 Cross-language interface

In AnyBEM, almost all solver-related functionality is implemented in Impala rather than C++ to maximally benefit from the language’s partial evaluation mechanism, while utility routines (e.g., input data formats) reside in the C++ part of the package. Hence, the AnyBEM cross-language interface mostly comprises native Impala functions intended for use by consumer code. Currently, this includes functions to print the contents of system models read from HMO+ files (after transfer to the Impala part of the package) and the benchmarking facility used for performance assessments (see Section 9.4.3). In the future, we plan to expose the full BEM solver interface in the same way. In addition to these Impala-specific functions, libanybem also exposes the HMO+ input format routines, making the

2This restricts the library functionality to a specific domain and to the functions exposed through its public interface as the partial code evaluation is performed when building the library. For maximum flexibility, Impala code can alternatively be incorporated into consuming Impala code directly.

Listing 8.2 – Example output for an Impala application with cross-language function calls

    shell> ./hello_impala
    Heya! Have a number: 3
    shell> ./hello_impala nice to meet you
    Heya! Have a number: 11

library’s public interface transparent in the sense that it is not visible to the consumer whether a function actually belongs to the Impala or the C++ part of the package.

The communication between C/C++ and Impala usually happens on a per-function basis, with function parameters restricted to data types that possess equivalent representations in both languages. These mainly comprise fundamental types such as Boolean types, signed and unsigned fixed-width integer types (available in C and C++ through the stdint.h and cstdint headers, respectively), single- and double-precision floating-point numbers as well as pointers and definite or indefinite arrays of these types. Overall, the cross-language interface is effectively a classic C interface on both sides (with the necessary modifications on the Impala side). A minimalistic example application for fluent bidirectional cross-language function calls is shown in Listing 8.1. The main program is declared in the C++ part of the code and defined in the Impala part, printing a number corresponding to one plus twice the argument count argc (which, by C convention, includes the program name). The multiplication is in turn performed by a function that resides in the C++ part of the code. The resulting program in action can be reviewed in Listing 8.2, where the code from Listing 8.1 has been compiled into an executable named hello_impala.

8.2.2 Cross-language system models

In addition to fundamental types, aggregate types can also be used at the cross-language boundary (i.e., C structs or, equivalently, C++ standard-layout types). This allows us to transfer semantically-structured data between the languages, which represents an integral part of the AnyBEM system models: since the HMO+ format is implemented in the C++ part of our package, the input data is always read in on the C++ side and subsequently transferred to the Impala side. For this, we provide a dual implementation of the system models in AnyBEM, with equivalent types in both languages that can be easily exchanged through the cross-language interface while maintaining the same underlying data structures.

Each system model consists of a surface model, a charge model, and a set of system constants. The basic structure of these components is the same as in NESSie (cf.


Figure 8.1. – Primitive-based triangle vectors for C++ and Impala

Section 6.2), albeit more restrictive in the sense that only triangulated surfaces are currently supported in AnyBEM. We have seen Impala-compatible data types for the surface models and system constants already in the form of Triangle, TriangleVector, and SystemParams (cf. Sections 5.5.1 and 5.5.4). Respective types could be defined for point charges and charge vectors in the same way. However, while SystemParams is modeled as a simple aggregate type that could directly be used as part of the cross-language system models, Triangle and TriangleVector are defined as interface types. This purely function-based kind of representation bears many advantages (see Section 5.5.1 for a discussion) but it is less suitable for cross-language data types.

Instead, we create new types for the internal representation of triangles, charges, and vectors thereof in terms of simple aggregate types, henceforth referred to as primitives. These primitives can then be wrapped into the existing interface types on the Impala side. We have also seen how this is done to create vector objects from buffers. Hence, by creating simple representations for charge and triangle vectors that are both compatible with buffers and usable as cross-language types, we can read these model components in once, store them in said representation, and wrap them into buffer objects and charge/triangle vectors as needed without copying or moving data.

Similar primitive-based wrapper types can then be added to the C++ side as well to allow convenient access to the underlying data everywhere. Figure 8.1 shows a schematic representation of this idea for the particular case of triangle vectors. Using buffers as intermediate representations grants the additional bonus of compatibility with Impala’s built-in memory management functions (e.g., allowing the data to be copied directly into device memory).

Listing 8.3 shows a working example for some system model components, both in the C++ and Impala version. We continue to use a type alias Real for the precision

Listing 8.3. – Cross-language system model components for C++ (top) and Impala (bottom)

    // C++ part
    using Real = double;

    struct Position {
        Real x;
        Real y;
        Real z;
    };

    struct ChargePrimitive {
        Position pos;
        Real val;
    };

    struct TrianglePrimitive {
        Position v1;
        Position v2;
        Position v3;
    };

    struct TriangleVectorPrimitive {
        TrianglePrimitive* data;
        int64_t size;
    };

    // Impala part
    type Real = f64;

    struct Position {
        x: Real,
        y: Real,
        z: Real
    }

    struct ChargePrimitive {
        pos: Position,
        val: Real
    }

    struct TrianglePrimitive {
        v1: Position,
        v2: Position,
        v3: Position
    }

    struct TriangleVectorPrimitive {
        data: &[TrianglePrimitive],
        size: i64
    }

level of floating-point values, which can be exchanged later. Furthermore, we reuse our previous Position type for position vectors in R^3. The new primitive types are then defined in terms of Position and other standard-layout types, where charges carry a position and a charge value and triangles are stored as the positions of their respective vertices. The primitives for vectors take a special place here as they mimic the internal structure of buffers. More specifically, buffers store their data as pointers to indefinite byte arrays (&[i8]). By recreating a similar representation for our vectors, these types can be converted by means of static type casts (cf. Section 5.5.1). This procedure is repeated for all system model components and the system model type itself.

8.3 Domain specialization

A key property of domain-specific languages is, as the name suggests, the specialization of code to particular application domains. In Impala, this specificity primarily refers to a particular computational platform (e.g., processors and hardware accelerators) and a corresponding software back end (e.g., CUDA or OpenCL). The domain specificity is typically realized through multiple layers of indirection in the Impala code. More precisely, these abstraction layers make the same functionality available for each supported domain by means of a common function interface while rendering the currently targeted domain fully transparent to its consumer.

8.3 Domain specialization 129 Listing 8.4. – Impala’s Intrinsics type for platform-specific math routines (left) and pre- defined instances for different platforms (right) 1 struct Intrinsics { s t a t i c cpu_intrinsics = Intrinsics { 2 sqrt : fn( f64 ) −> f64 , sqrt : cpu_sqrt , 3 sqrtf : fn( f32 ) −> f32 , sqrtf : cpu_sqrtf , 4 min : fn( i32 , i32 ) −> i32 , //... 5 fmin : fn( f64 , f64 ) −> f64 , }; 6 fminf : fn( f32 , f32 ) −> f32 , 7 log : fn( f64 ) −> f64 , s t a t i c cuda_intrinsics = Intrinsics { 8//... //... 9 } };

8.3.1 Built-in platform abstractions

The AnyDSL project provides three platform abstractions for Impala through the AnyDSL runtime library: buffers, intrinsics, and accelerators. We have introduced buffers in Section 5.5.1 as Impala’s generic data containers and used them numerous times as such. However, when taking a closer look at the definition of the Buffer type, it becomes clear that the data is only referenced by a memory address and that the actual memory location is captured in the form of a numeric identifier, representing both the compute platform and the physical device corresponding to said location. Thus, buffers are not merely generic data containers but rather platform abstractions for containers of generic data, extended by functions to uniformly manage memory and to transfer data between supported platforms.

The second built-in abstraction layer is Impala’s Intrinsics interface type, which provides uniform access to domain-specific implementations of various math routines. An excerpt from its type definition is shown in Listing 8.4 alongside a few examples of concrete Intrinsics instances pre-defined in Impala. Most operations are available for different value types (e.g., min for integers or fmin and fminf for double- and single-precision floating-point numbers, respectively). The different Intrinsics instances can then be used to request a specific platform (e.g., cpu_intrinsics for the CPU or cuda_intrinsics for CUDA) or transparently through a global constant set according to the current application domain.

The third abstraction layer serves as a generic interface for different hardware accelerators. More specifically, the Accelerator interface type represents parallel execution units (so-called work items) with structured partition schemes. The collective work items need to be organized into groups of equal size, representing a single grid of groups, with each layer supporting up to three dimensions. The Accelerator type can then be used to start or synchronize the parallel execution of a given grid or to allocate accelerator-specific memory.

Listing 8.5. – Realization of domain-specific math routines through mutually-exclusive definitions in separate files

    // file: intrinsics_cpu.impala
    static math = cpu_intrinsics;

    // file: intrinsics_cuda.impala
    static math = cuda_intrinsics;

Listing 8.6 – Example for a function to compute the Euclidean distance between two position vectors using domain-specific math routines

    fn euclidean(u: Position, v: Position) -> Real {
        let dx = u.x - v.x;
        let dy = u.y - v.y;
        let dz = u.z - v.z;
        math.sqrt(dx * dx + dy * dy + dz * dz)  // <- caution!
    }

8.3.2 Floating-point precision

All of the presented platform abstractions are used to provide support for CPU- and CUDA-based computations in AnyBEM. In practice, most of these abstractions are realized through global-scope constants with domain-specific definitions of the same interface in separate source files. Then, only the source files corresponding to the chosen target domain are included during the building process of the Impala program. Listing 8.5 shows a simple example for such a constant, where math is defined either as the CPU- or CUDA-specific Intrinsics object. By subsequently including exactly one version of the definition in the building process, the math constant always refers to the functions corresponding to the target domain.

This technique enables the implementation of dependent functions in the same domain-specific fashion. Listing 8.6 shows an example for such a function to compute the Euclidean distance between two points in R³ given as Position objects. Due to the use of the same math constant, the function is also specific to the target domain. It should be noted, however, that the shown implementation strictly requires Real to represent double-precision floating-point numbers when combined with the definitions from Listing 8.5.

In order to circumvent this issue while still leaving the option to support both floating-point precision levels, we extended the Intrinsics interface in AnyBEM. More specifically, math functions are now provided through a new RealMath type, with each function being defined in terms of the Real alias instead of f32 or f64 directly. The alias can then be set, e.g., via

    type Real = f32;

Listing 8.7. – Domain-specific math routines for different platforms (CPU or CUDA) and floating-point precision levels (single- or double-precision)

    // file: intrinsics_cpu.impala
    static intrinsics = cpu_intrinsics;

    // file: intrinsics_cuda.impala
    static intrinsics = cuda_intrinsics;

    // file: math_f64.impala
    type Real = f64;

    struct RealMath {
        sqrt: fn(Real) -> Real,
        min:  fn(Real, Real) -> Real,
        log:  fn(Real) -> Real,
        // ...
    }

    static math = RealMath {
        sqrt: intrinsics.sqrt,
        min:  intrinsics.fmin,
        log:  intrinsics.log,
        // ...
    };

    // file: math_f32.impala
    type Real = f32;

    struct RealMath {
        sqrt: fn(Real) -> Real,
        min:  fn(Real, Real) -> Real,
        log:  fn(Real) -> Real,
        // ...
    }

    static math = RealMath {
        sqrt: intrinsics.sqrtf,
        min:  intrinsics.fminf,
        log:  intrinsics.logf,
        // ...
    };

Listing 8.8 – Modified real-valued vector type for Impala that exposes its underlying buffer

    struct RealVector {
        size: fn() -> i32,
        get:  fn(i32) -> Real,
        set:  fn(i32, Real) -> (),
        buf:  fn() -> Buffer
    }

    fn as_real_vector(buf: Buffer) -> RealVector {
        RealVector {
            size: || (buf.size as i32) / sizeof[Real](),
            get:  |idx| bitcast[&[Real]](buf.data)(idx),
            set:  |idx, val| bitcast[&mut[Real]](buf.data)(idx) = val,
            buf:  || buf
        }
    }

for single precision, similar to the math constant above. In addition, the domain-specific instantiations of RealMath are defined in terms of the original Intrinsics objects such that the supported platforms and precision levels can be chosen independently from one another. Listing 8.7 shows the corresponding Impala code for RealMath. By defining math as a RealMath object instead of Intrinsics, our previous function for Euclidean distances now fully supports CPU- and CUDA-based contexts as well as single- and double-precision floating-point values.

8.3.3 Cross-platform data transfers

The second AnyBEM-specific platform abstraction reuses the previously presented context managers (cf. Section 5.5.2) to make the data transfer between platforms

Listing 8.9. – Context managers for the RealVector type on the CPU and on CUDA-enabled GPUs

    // file: platform_cpu.impala
    fn real_vector_on_platform(
        vec:  RealVector,
        body: fn(RealVector) -> ()
    ) -> () {
        body(vec);
    }

    // file: platform_cuda.impala
    fn real_vector_on_platform(
        vec:  RealVector,
        body: fn(RealVector) -> ()
    ) -> () {
        let buf = alloc_cuda(0, vec.size());
        copy(vec.buf(), buf);
        body(as_real_vector(buf));
        copy(buf, vec.buf());
        release(buf);
    }

transparent. With CPUs and CUDA, AnyBEM supports two structurally different platforms. Typically, input data is initially read into host memory, no matter the actual target platform, where it can be directly accessed and processed in CPU-based contexts. However, when delegating parts of the computation to hardware accelerators, such as CUDA-enabled GPUs, the data needs to be transferred onto the device before it is processed any further. Afterwards, the resulting data is usually copied back from the device into the host memory. Context managers elegantly hide this additional data transfer, but first we need to modify the definitions of our previously presented vector types for semantically-structured data. Since Impala's memory management facilities operate on buffers, the vector types need to expose their underlying data before they can be copied between platforms. The necessary modifications for our RealVector type (formerly presented in Section 5.5.1) are shown in Listing 8.8.

The basic idea behind the data transfer abstraction is to pass a prepared host vector to a special context manager that creates a new vector object representing the same data on the target platform. Listing 8.9 shows example implementations of such context managers for the RealVector type on the CPU or CUDA-enabled GPUs. In the former case, the context manager is an identity function and simply executes the body of the with block, since no data needs to be copied in this particular case. For CUDA, a new device vector of the same size as the given host vector is created before the data is copied onto the device and the with block body is executed. Afterwards, the device vector is automatically copied back into the buffer underlying the original host vector. Consumer code can then use the context managers in the same pattern to ensure that the data at hand is present on the target platform:

    let host_vec = /* a RealVector in host memory */;
    with platform_vec in real_vector_on_platform(host_vec) {
        // some platform-specific code
    } // <- platform_vec is copied back into host_vec

Listing 8.10. – Domain-specific loop parallelization in Impala

    // file: eachindex_serial_cpu.impala
    fn each_index_on_platform(n: i32, body: fn(i32) -> ()) -> () {
        for i in range(0, n) {
            body(i)
        }
    }

    // file: eachindex_parallel_cpu.impala
    fn each_index_on_platform(n: i32, body: fn(i32) -> ()) -> () {
        for i in parallel(0, 0, n) {
            body(i)
        }
    }

    // file: eachindex_parallel_cuda.impala
    fn each_index_on_platform(n: i32, body: fn(i32) -> ()) -> () {
        let block = (128, 1, 1);
        let grid  = (cld(n, 128) * 128, 1, 1);
        let acc   = cuda_accelerator(0);

        for item in acc.exec(grid, block) {
            let i = item.bidx() * item.bdimx() + item.tidx();
            if i < n {
                body(i)
            }
        }
        acc.sync();
    }

8.3.4 Parallel for loops

The last abstraction layer allows the iterations of a for loop to be parallelized automatically, similar to the @threads macro in Julia (cf. Sections 4.1.2 and 6.3). In Impala, for loops are modeled as expressions and behave similarly to the with expressions we encountered in the last section. In contrast to the latter, for expressions usually encapsulate other loops and call the body function repeatedly in the process.

Listing 8.10 shows three example implementations for a for loop construct that generates indices between zero and a given number n. In the first case, the corresponding function is an alias for a call to Impala's range function with the same effect, fixing its first argument (i.e., the starting value) to zero. For example, when used as

    for i in each_index_on_platform(10) {
        print_i32(i);
    }

the code will intuitively print all integers between zero and nine. The other implementations mimic the same behavior, parallelized and for different platforms, so that the same code snippet can be used for any supported platform. More specifically, in

Listing 8.11 – Domain-specific matrix-vector product for potential matrices in Impala

    fn mv_pot(A: PotentialMatrix, x: RealVector, y: RealVector) -> () {
        let (rows, cols) = A.size();
        for i in each_index_on_platform(rows) {
            let mut sum = 0 as Real;
            for j in range(0, cols) {
                sum += A.get(i, j) * x.get(j);
            }
            y.set(i, sum);
        }
    }

the second implementation, the loop iterations are distributed evenly³ over all of the CPU's threads of execution by means of the Intel Threading Building Blocks (TBB) library⁴. The library is directly available through Impala's parallel function. The third implementation in Listing 8.10 corresponds to a CUDA kernel call, with a fixed partition scheme of 128 threads per block⁵. The body function serves as a kernel function and is only called if the computed index is valid.

8.4 Matrix-vector products

Using the same definitions for potential matrices and BEM systems developed earlier in this manuscript (cf. Sections 5.5.3 and 5.5.4), we obtain domain-specific implementations simply by adding further data transfer context managers for position and triangle vectors, that is, the basic components of these matrices and systems. What is more, with the parallel for loop constructs from the previous section, we are finally able to provide domain-specific matrix-vector products for the presented matrices. For the potential matrices in particular, we utilize the same approach and partition scheme as in CuNESSie (cf. Section 7.3.2) and compute the products row-wise. The corresponding Impala code for this function is given in Listing 8.11. On the CPU, the rows are once again distributed evenly over all available threads of execution while on CUDA-enabled devices, each thread is assigned exactly one matrix row. The same code is used for any combination of supported platforms and floating-point precision levels due to the abstraction layers implemented in AnyBEM. For the BEM matrices, which are composed of the same potential matrices, we then simply reuse these product implementations for their own domain-specific matrix-vector products.

³ This is due to the first argument of the parallel function being set to zero. The number of threads can alternatively be specified manually through a nonzero value.
⁴ https://software.intel.com/en-us/tbb
⁵ The operation cld(a,b) represents the result of ⌈a/b⌉ and is defined separately.


9 Performance analysis

In this chapter, we evaluate the performance of the data types, matrix-vector products, and BEM solvers implemented in NESSie, CuNESSie, and AnyBEM. For this, we give details about the hardware and software environments used for the generation of all results. In addition, we provide an overview of all datasets involved in the analyses and outline the data preparation process. The actual performance assessments are then split into three categories, where we first evaluate the BEM solvers by means of comparison to simple, analytically-solved test models. Afterwards, we evaluate the runtimes of the sequential and parallel matrix-vector product implementations for the potential matrices as well as for the BEM system matrices. Finally, we repeat the assessment for all BEM solvers presented earlier in this manuscript.

9.1 Experimental setup

All performance analyses presented in this chapter are based on data generated on two different workstations. The technical specifications of both machines are summarized in Table 9.1. The first machine represents a typical commodity workstation while the second machine is an above-average workhorse for computing tasks. What is more, we have chosen two state-of-the-art GPUs from the consumer branch and the general-purpose computing branch of the CUDA device market. We briefly summarize the compute features of these two devices in the next paragraphs. In order to reduce data noise and increase the quality of GPU-based performance results, these devices come in addition to a primary GPU in each machine and are used exclusively for the computations presented here.

                              Workstation 1                      Workstation 2
    CPU
    ◦ Type                    Intel Core i7-4790K                2x Intel Xeon E5-2683 v4
    ◦ Clock speed             4.0 GHz                            2.1 GHz
    ◦ Physical cores          4                                  32 (2x 16)
    ◦ Threads                 8                                  64 (2x 32)
    Memory
    ◦ Type                    2x DIMM DDR 3                      16x RIMM
    ◦ Capacity                16 GiB (2x 8 GiB)                  256 GiB (16x 16 GiB)
    ◦ Clock speed             1.6 GHz                            2.4 GHz
    GPU†
    ◦ Type                    NVIDIA GeForce RTX 2070 SUPER      NVIDIA TITAN V
    ◦ Architecture            Turing                             Volta
    ◦ Clock speed             1.605 GHz                          1.2 GHz
    ◦ Memory                  8 GB (GDDR 6)                      12 GB (HBM2)
    ◦ Memory bandwidth        448 GB/s                           652.8 GB/s
    ◦ Compute capability      7.5                                7.0
    ◦ CUDA cores (FP32)       2,560                              5,120
    ◦ FP32 vs. FP64 units     32:1                               2:1
    Software
    ◦ Operating system        Gentoo Linux                       Ubuntu 16.04 LTS
    ◦ GCC version             9.3.0                              7.3.0
    ◦ NVIDIA driver version   440.82                             440.64.00

Table 9.1. – Technical specifications of the machines used for all performance analyses. Features specific to single-precision (FP32) or double-precision (FP64) floating-point operations are marked accordingly. † The listed GPUs are used exclusively for the computational tasks presented here. Each machine is equipped with a separate GPU (not listed) for all other purposes.

    Our own packages               Third-party Julia packages         AnyDSL components
    Name               Version     Name                  Version      Name       Commit
    AnyBEM             0.2.0       CuArrays.jl           2.2.0        AnyDSL     d9fd666
    CuNESSie.jl        0.2.0       CUDAapi.jl            4.0.0        Impala     ecb1452
    ImplicitArrays.jl  0.2.0       CUDAdrv.jl            6.3.0        Runtime    4384228
    NESSie.jl          1.2.0       CUDAnative.jl         3.1.0        Thorin     4c7b44b
                                   IterativeSolvers.jl   0.6.4
                                   Preconditioners.jl    0.3.0

Table 9.2. – List of exact package versions used for the performance analyses. All third-party Julia packages have been installed from Julia's official default package registry (https://github.com/JuliaRegistries/General). AnyDSL component versions are specified as short SHA-1 git commit identifiers relating to the master branch of the respective repositories (https://github.com/AnyDSL).

The NVIDIA GeForce RTX 2070 SUPER GPU, henceforth referred to as RTX 2070 SUPER, was developed with computer graphics applications in mind and enhanced with dedicated ray-tracing hardware and tensor cores for NVIDIA's Deep Learning Super Sampling 2.0 (DLSS 2.0) platform. For this reason, the device provides only rudimentary support for double-precision floating-point (FP64) operations, in contrast to its excellent support for single-precision (FP32) operations. More specifically, the FP64 hardware units are outnumbered by their single-precision counterparts by a factor of 32. As a result, the double-precision variants of our linear system solving tasks fall outside the device's intended workload. While we can still fully utilize the same FP32 units for our single-precision system models as for any other graphics-based task, we do not expect competitive performance results for the double-precision models with this device.

The NVIDIA TITAN V GPU, in the following shortened to TITAN V, is based on the slightly older Volta microarchitecture. While the latter might technically be considered a predecessor to the Turing microarchitecture, which the RTX 2070 SUPER is based on, the Volta architecture has a stronger focus on artificial intelligence tasks and is not targeted at the consumer market. Instead, the TITAN V aims at general-purpose computing on GPUs (GPGPU), thus rendering it a compelling choice for our purposes. Compared to its competitor, the device ships not only with twice as many single-precision CUDA cores but also with remarkable support for double-precision floating-point operations (namely, one FP64 unit per two FP32 units).

Both workstations utilize Julia version 1.4.0 for NESSie and CuNESSie, while AnyBEM was built in Release mode using the respective GNU Compiler Collection (GCC) version specified in Table 9.1. The exact versions of the relevant Julia packages and AnyDSL components are further listed in Table 9.2.

9.2 Experimental data

In the following, we present the datasets used for the performance analyses throughout this chapter. Most datasets are directly available through the NESSie package or the NESSie code repository. Where possible, we give instructions on how to recreate the datasets, including sources as well as the used tools and parameters.

9.2.1 Born ions

For the basic evaluation of the local and nonlocal BEM solvers in terms of the Born test model (cf. Section 6.4.1), we use the full set of Born ions directly implemented in the NESSie.TestModel module of our Julia package. Each Born ion is modeled as a point charge (+1ec or +2ec, with ec being the elementary charge) at the center of a spherically-symmetric domain located at position P0 = (0, 0, 0). Table 9.3 lists all available monovalent and divalent Born ions along with their implemented sphere radii. The radii are taken from [49, Table 3.4] and were, in turn, computed based on simulations from [150].

    Ion    Charge [ec]   Radius [Å]        Ion     Charge [ec]   Radius [Å]
    Li+    +1            0.645             Mg2+    +2            0.615
    Na+    +1            1.005             Ca2+    +2            1.015
    K+     +1            1.365             Sr2+    +2            1.195
    Rb+    +1            1.505             Ba2+    +2            1.385
    Cs+    +1            1.715

Table 9.3. – List of Born ions built into the NESSie.TestModel module, shown here with point charge values (given as multiples of the elementary charge ec) and sphere radii (taken from [49, 150])

The approximations of the surface domain Γ for each ion were generated via GAMer’s generateBoundingSphere function [139], or, equivalently, via NESSie’s Sphere tool (sphere_quality=4, cf. Section 6.2.2). This resulted in surface meshes with 512 triangles for each ion. All meshes are available in the NESSie repository.

9.2.2 Small objects and molecules

The next dataset category intended for usage with the BEM solvers is a collection of small surface meshes, with a selection thereof being shown in Figure 9.1. The datasets comprise 53 surface meshes of simple geometries and small molecules, including amino acids and alcohols. The mesh sizes range from 132 to 3,540 triangles, with an average size of 946 triangles. Additionally, the data contains four protein-based surface meshes with resolutions between 10,044 and 12,668 triangles.

The meshes and corresponding charge models were originally generated in the same HMO+ format described in Section 6.2.3. In this form, they were used for testing purposes of the prototypical reference program [61] that a part of NESSie is based on (cf. Chapter 6). Here, we reuse the meshes in their original forms exclusively for the comparison between the reference program and the NESSie BEM solvers for the explicit system representations.

9.2.3 Macromolecules

For the performance analyses of our sequential and parallel matrix-vector product implementations and the actual BEM solvers, we use different surface meshes based on proteins and protein complexes from the Protein Data Bank (PDB) [19]. Since the focus of this manuscript is the scalability of the presented methods rather than a

Figure 9.1. – Wireframe representations of small surface meshes used for the evaluation of the BEM solvers for the explicit system representations: (a) NH4 (316 triangles), (b) Na2 (628 triangles), (c) Pyridinium (498 triangles), (d) 3-Methylpyridine (1728 triangles), (e) Glutamine (840 triangles), (f) Serine (2132 triangles), (g) Ethanol (1077 triangles), (h) Hexanol (1478 triangles).

    PDB ID   Chains     Proteins                                       Atoms
    2LZX     A          Asteropsin B                                   488
    2PTC     E          Beta-trypsin                                   1,629
    1C4K     A          Ornithine decarboxylase                        5,826
    6DS5     A-K        Seipin                                         13,662
    2H1L     A-F, M-R   Large T antigen, cellular tumor antigen p53    26,948

Table 9.4. – Prepared PDB structures [19] used as a basis for the surface mesh generators. Shown here are the chains and proteins left after data preparation along with their corresponding atom counts. All data refer to the first biological assembly of each PDB structure.

    Name        Triangles   max-iterations   iso-value   mesh-nodes v
    2lzx-5k     4,978       6                2.5         1k ≤ v ≤ 8k
    2lzx-20k    19,912      6                2.5         5k ≤ v ≤ 40k
    2lzx-80k    79,648      6                2.5         10k ≤ v ≤ 80k
    2ptc-9k     9,464       6                2.5         1k ≤ v ≤ 8k
    2ptc-19k    18,556      6                2.5         5k ≤ v ≤ 40k
    2ptc-74k    74,224      6                2.5         10k ≤ v ≤ 80k
    1c4k-15k    15,208      6                1.4         1k ≤ v ≤ 8k
    1c4k-60k    60,356      6                1.4         5k ≤ v ≤ 40k
    6ds5-170k   170,252     20               1.4         10k ≤ v ≤ 500k
    2h1l-200k   200,580     20               1.4         10k ≤ v ≤ 500k

Table 9.5. – List of protein surface meshes used for the performance analyses along with the parameters used to generate them via the GAMer-based [139] Mesher tool shipped with NESSie.jl. Each dataset name refers to the corresponding PDB structure and number of surface triangles.

biological interpretation of the results, we generally prefer structures that have been targeted by other protein electrostatics studies in the past. In order to sufficiently cover the scalability aspect of the evaluation, we further require the collected datasets to contain surface meshes with a varying number of triangles, as these numbers directly influence the size of the corresponding BEM system matrices.

Since commonly targeted proteins tend to be comparatively small in terms of their atom counts, the mesh sizes are mainly influenced by variations of the meshing resolution, that is, the level of detail required in the approximation of the protein surface. For an extensive evaluation, we utilize both meshes for structures of varying size and different resolutions for meshes of the same structures. In particular, this allows us to capture the effects of different proteins and different levels of detail on the solving process.

Table 9.4 shows the PDB structures serving as a basis for all protein surface meshes involved in the performance analyses. As indicated above, the table includes small proteins in the range of a few hundred to a few thousand atoms targeted by previous

studies, namely, 2LZX, 2PTC, and 1C4K (e.g., in [49, 56, 153]). In addition, the table contains two structures (6DS5 and 2H1L) with a larger number of atoms. Each structure was obtained through the BALLView application [30] and subsequently filtered for the proteins and chains (corresponding to the "first biological assembly" included in the PDB structures) shown in the same table. The processed structures were then exported to new PDB files and fed into the GAMer-based [139] mesh generator shipped with NESSie (cf. Section 6.2.2), using GAMer version 1.5 and the parameters shown in Table 9.5 to generate a total of ten datasets.

Afterwards, each surface mesh was converted to STL format using NESSie and imported into Blender [144] for both a visual inspection and systematic checks from the “3D Printing” menu. These checks were performed to rule out degenerated or intersecting triangles as well as sharp edges that could cause issues with the solvers. Blender was further used to flip the (GAMer-provided) inward-facing unit normal vectors of the meshes to match the prerequisites of NESSie system models. The final meshes were then exported again as STL files.

The corresponding charge models were generated from the exported PDB files using the PDB2PQR application [138] (version 1.9.0) with the AMBER [42] force field option and default parameters. For the use with AnyBEM, the final meshes and charge models were read into NESSie system models and exported as HMO+ files (cf. Sections 6.2.3 and 6.2.4). Unless noted otherwise, all datasets are used with

NESSie's default system constants, that is, εΩ = 2, εΣ = 78, ε∞ = 1.8, and λ = 20. A selection of the resulting surface meshes can be reviewed in Figures 9.2 and 9.3.

Generating suitable surface meshes for a given protein is by no means a trivial task, even when using a meshing tool dedicated to numerical applications. In particular, we will see a serious impact of varying mesh resolutions on the solving process later in this chapter (Section 9.5.2). Generally, higher resolutions yield more and smaller triangles for the same structure than low resolutions. Low-resolution meshes, with their small triangle counts, lead to smaller system matrices that are potentially easier to solve and contain moderately sized triangles that are usually easier to handle numerically.

At the same time, these coarse models harbor the risk of no longer representing the protein surface accurately enough to justify the nonlocal protein electrostatics treatment in the first place. An extreme case here is given by the nonlocal Poisson test models from Section 6.4, where the whole protein is embedded in a spherically-symmetric domain. High-resolution meshes lead to very small triangles and large system matrices, provoking numerical instabilities in multiple steps of the computation, especially when using single-precision floating-point numbers.

Figure 9.2. – Visualization of the three smallest prepared PDB structures and their corresponding surface meshes: (a) 2LZX / 2lzx-5k, (b) 2PTC / 2ptc-9k, (c) 1C4K / 1c4k-15k. Left: SES (solvent-excluded surface) representations of the structures generated with BALL [30]. Right: Wireframe representations of the surface meshes using the lowest resolution for each dataset.

Figure 9.3. – Visualization of the two largest protein-based surface meshes used in the analyses: (a) 6ds5-170k, (b) 2h1l-200k. Due to the high resolutions, the meshes are shown with a uniform surface material rather than a wireframe representation.

An alternative way to provide large surface meshes while maintaining moderate triangle sizes (to mitigate the abovementioned issues) is the use of larger structures in the first place. This is why we included 6DS5 and 2H1L in the datasets. The more complex geometries of these two large structures introduced additional challenges during the surface mesh generation. Unfortunately, GAMer does not seem to cope too well with large and complex geometries, often resulting in degenerate surface triangles or premature termination of the triangulation process.

In other cases, holes in the surface topologies often lead to intersecting triangles in GAMer's surface coarsening step. While more aggressive coarsening rates intensified this problem, the coarsening process is, among other things, an important means to prevent point charges from crossing the molecular boundary, which would violate the base assumptions of the solvers and lead to undefined results.

In order to utilize surface meshes for 6DS5 and 2H1L, we were forced to introduce an additional preprocessing procedure to fix such mesh problems. In particular, we used Blender's "Cleanup" functions from the "3D Printing" menu with as few modifications to the original meshes as possible. While this procedure might render the meshes less suitable for a biological interpretation of the results, it still allows us to study the scalability of our implementations. Hence, we will use these large meshes primarily in the assessment of the matrix-vector product implementations. Further work is planned to investigate alternative mesh generation techniques suitable for both purposes.

9.3 Test models

In order to ensure that all our solver implementations work correctly, we start with the computation of electrostatic potentials of simple spherically-symmetric models with known analytical results. The strategy used in this section is based on our previous analyses [75] for an older version of NESSie, which were exclusively performed with the explicit system representations. For this manuscript, we repeated the same analyses for the implicit system representations in both NESSie and CuNESSie.

9.3.1 Born models

In the first step, each solver had to recreate the potentials for all Born ions listed in Section 9.2.1. The analytical results were computed from the model equations implemented in the NESSie.TestModel module (cf. Section 6.4.1), based on the formulations summarized in Appendix A (Section A.1). The numerical results were computed by the local and nonlocal BEM solvers from the NESSie.BEM module (both for the explicit and implicit system representations) and the CUDA-accelerated solvers from CuNESSie. The parameters for all potentials account for a vacuum-filled solute¹ (εΩ = 1) in water (εΣ = 78), with a correlation length (λ = 23) chosen according to [49, Section 3.5.3].

Figure 9.4 shows two example configurations for representative monovalent and divalent Born ions with similar radii. In the first configuration, the exterior potentials are shown for a straight line pointing away from the point charge. In the second configuration, the same potentials were computed on another straight line outside (but close to) the spherically-symmetric domain. Overall, all of our local and nonlocal solvers were able to recreate the analytical results. As expected, the nonlocal electrostatic profiles of the ions extend further into the surrounding space, implying an improved long-range visibility of the solute.

¹ While the dielectric response of the solute is already implicitly included in the analytical formulation of the Born model, it must still be provided for the numerical BEM solvers. Choosing any different value for εΩ thus inadvertently leads to results that are no longer comparable to the analytical solutions.

Figure 9.4. – Exterior electrostatic potentials of different monovalent and divalent Born ions in water, analytically and numerically solved for λ = 23, εΩ = 1, and εΣ = 78. The point charges are located at the origin P0 = (0, 0, 0). Top: The potentials were computed along a straight line through the origin, with the solid vertical bar representing the boundary of the respective Born sphere. Bottom: The potentials were computed for observation points Pz = (0, 2, z) on a straight line at 2 Å distance from the origin.

9.3.2 Poisson test models

For the second performance analysis, we utilize the Poisson test models originally published in [151], extending the previous monoatomic systems by additional point charges while retaining the spherically-symmetric cavity in the solvent space. As described in Section 6.4.2, we implemented the first variant of the Poisson test models in the NESSie.TestModel module. However, for an extensive evaluation of our solvers, we also compare their results to the local and the second nonlocal test models. Thus, we utilize the free Fortran package briefly introduced in the same section as well as the test model implementation in NESSie for the evaluation.

All results shown in this section are based on the 2LZX structure (cf. Section 9.2.3), translated and rescaled to fit into a sphere of 1 Å radius centered at the origin. More precisely, the point charge model is translated in such a way that its centroid coincides with the origin. The new position vectors of the point charges are then uniformly scaled by a factor that positions the outermost charge at a distance of at most 0.8 Å from the origin, leaving a margin of about 20% of the sphere radius where no charge is allowed to be located. The scaling factor is computed

as 0.8 times the sphere radius divided by the maximum vector norm of all position vectors. In the Fortran package, the latter norm is further rounded up to the next integer. We added a compat option to the test model implementation in NESSie to mimic this exact same behavior.
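As a minimal Julia sketch of this embedding step (the helper name and signature are ours and do not reproduce the actual NESSie or Fortran interfaces), the procedure can be summarized as follows:

    using LinearAlgebra

    # Translate the point charges such that their centroid coincides with the
    # origin and rescale them to fit into a sphere of the given radius, leaving
    # a 20% margin. With compat = true, the maximum norm is rounded up to the
    # next integer, as in the Fortran package.
    function embed_into_sphere(positions::Vector{Vector{Float64}}, radius::Real;
                               compat::Bool = false)
        centroid = sum(positions) ./ length(positions)
        shifted  = [p .- centroid for p in positions]
        maxnorm  = maximum(norm.(shifted))
        compat && (maxnorm = ceil(maxnorm))
        scale = 0.8 * radius / maxnorm
        return [scale .* p for p in shifted]
    end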

The Fortran package creates a test model from a given PQR file, performing the described embedding process automatically after specifying a sphere radius. For all results shown here, we set the customizable parameters of the package to radius = 1, L = 2.3, m = 20, and N = 41, truncating the infinite series of the potential equations (cf. Appendix A, Section A.2) after m terms and computing the potentials for a uniform N × N × N mesh of [−L, L]³ ⊂ R³. We chose an odd value for N to make sure the xz-plane was represented in the mesh, facilitating the visualization

of the potentials. The Fortran package further assumes εΩ = 2, εΣ = 80, ε∞ = 2, and λ = 10. Since these values cannot be modified without rebuilding the whole package, we kept them unchanged unless explicitly stated otherwise.

The NESSie system model was created for the same PQR file, using the XieModel type of the NESSie.TestModel module with enabled compat option to recreate the same representation as the Fortran package. Further, the system constants as well as the parameters for the sphere radius and the number of series terms were chosen to match the values above. The observation points for the potential computations were then created through NESSie's obspoints_plane function, creating a uniform 40 × 40 mesh of [−2.3, 2.3]² ⊂ R² in the xz-plane.
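Conceptually, such an observation grid corresponds to the following generic Julia construction, which only mirrors the idea and does not reproduce the actual obspoints_plane signature:

    # Uniform 40 × 40 grid of observation points in the xz-plane over [-2.3, 2.3]².
    L  = 2.3
    xs = range(-L, L; length = 40)
    zs = range(-L, L; length = 40)
    obs = [[x, 0.0, z] for x in xs for z in zs]   # 1,600 points with y = 0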

All plots presented here show triangulations of the point clouds generated by computing the potentials for the respective mesh nodes. We mostly restrict the presentations to reaction field potentials rather than full electrostatic potentials. This is owing to the infinite self-energies of the point charges contributing to the molecular potentials, which make the full electrostatic potentials more challenging to inspect visually. The reaction field potentials do not suffer from this issue.

The tests performed with the Poisson test models can be roughly divided into three categories: First, we want to make sure that our implementation of the first nonlocal model variant in NESSie returns correct results. For this, we directly compare the potentials of the Fortran package to our implementation. As shown in Figure 9.5a, both tools return the same results for the given dataset. In the second test, we compare the potentials computed with our local BEM solvers (implemented in both NESSie and CuNESSie) to the local model of the Fortran package. Again, we find the same results across all implementations (Figure 9.5b).

Figure 9.5. – Reaction field potentials for Poisson test models based on the 2lzx dataset embedded into an origin-centered unit sphere, with εΩ = 2, εΣ = 80, ε∞ = 2, and λ = 10. The potentials were computed with the Fortran package reported in [151] (left) and NESSie/CuNESSie (right) for a section of the xz-plane. (a) Nonlocal Poisson test model: Fortran package vs. NESSie.TestModel. (b) Local Poisson test model: Fortran package vs. BEM solvers.

For the third and final category, we compare the potentials computed with our nonlocal BEM solvers (in NESSie and CuNESSie) to the analytic test models. As discussed in Section 6.4.2, we cannot expect the numerical solver results to match the test model implementations – neither in NESSie nor the Fortran package – as they are built on the basis of different nonlocal solvent response models. So far, we have not seen any effects of these differences, since the comparisons have only been performed either between two test model implementations or for the local Poisson test model. What is more, the two variants of the nonlocal model themselves already show slight differences in the computed reaction field potentials. A one-on-one

comparison of the two variants by means of the Fortran package can be reviewed in Appendix B (Figure B.1).

Figure 9.6 presents the reaction field potentials for the (first) nonlocal Poisson test model, computed through NESSie.TestModel and the NESSie/CuNESSie BEM solvers. As expected, the potentials for both approaches differ considerably. However, we can make a few important observations: Firstly, the transition between the interior and exterior potentials (i.e., in close proximity to the unit sphere) is completely smooth, indicating that the coupling of the potential terms in the BEM formulations works correctly. Secondly, the potentials of both approaches undergo a similar transformation when viewed as a function of the correlation length λ. The potential plots for the BEM solvers always seem to be more "dented" on the interior part than the NESSie.TestModel plots for the same λ. At the same time, the former plots seem to be a bit "further ahead" of their competitors when λ is varied, with a common trend being visible in both approaches.

A similar effect is visible in comparison to the second nonlocal Poisson test model, albeit less pronounced. Corresponding plots can be found in Appendix B (Figure B.2). Apart from that, the differences are most prominently visible inside of or close to the sphere. In the full electrostatic potentials, they are dominated by the molecular potentials, which are unaffected by the differing solvent response models, and become negligible (Figure 9.7). Overall, we still consider the results computed with our BEM solvers to be in very good agreement with the test models but would generally rather recommend this comparison as a means to investigate different solvent response models.

9.4 Matrix-vector products

In this section, we evaluate the performance of the matrix-vector product implementations for the iterative linear system solvers. Although the individual products are not invoked separately in most use cases, they are used to assemble the local and nonlocal system matrices and thus represent the core operation of the solving process. In the context of the implicit system representations, they additionally constitute the dominating contributors to the overall runtime by a substantial margin. In the following, we first discuss the implications of explicit and implicit system representations and subsequently evaluate the different CPU- and GPU-based product implementations in terms of their average time consumption.

Figure 9.6. – Reaction field potentials for the nonlocal Poisson test model based on the 2lzx dataset embedded into an origin-centered unit sphere, with εΩ = 2, εΣ = 80, ε∞ = 2, and varying λ ((a) λ = 10, (b) λ = 15, (c) λ = 20). The potentials were computed with NESSie.TestModel (left) and our BEM solvers (right) for a section of the xz-plane.

Figure 9.7. – Electrostatic potentials for the nonlocal Poisson test model based on the 2lzx dataset embedded into an origin-centered unit sphere, with εΩ = 2, εΣ = 80, ε∞ = 2, and λ = 10. The potentials were computed with NESSie.TestModel (left) and our BEM solvers (right) for a section of the xz-plane.

9.4.1 Explicit vs. implicit representations

As discussed in Section 5.2, we chose interaction matrices as implicit representations for the potential matrices, as they elegantly capture all defining properties of these matrices at a small cost in terms of memory. Furthermore, the local and nonlocal system matrices can be assembled from the same matrices with a negligible memory

overhead (i.e., for the four system constants εΩ, εΣ, ε∞, and λ). Generally, for a single matrix-vector product, both implicit and explicit matrix representations behave very similarly, since each matrix element is only required once during the computation. In the explicit case, there is an inherent runtime overhead due to the memory allocation, which might be substantial for large matrices. On the other hand, this is a one-time cost that allows the matrix elements to be reused multiple times, e.g., for several matrix-vector products. In fact, this is a huge advantage of the explicit representation, because the actual matrix-vector product for the pre-computed potential matrices has only an insignificant impact on the overall runtime, in contrast to the computation of the matrix elements. For example, in the case of the 2lzx-5k dataset, the matrix-vector product of the pre-computed potential matrix V accounts for roughly 0.1 percent of the runtime for computing V. This imbalance intensifies with larger matrices.
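The trade-off can be summarized by the following schematic Julia sketch, where interact is a hypothetical stand-in for the (expensive) interaction function and not part of the actual NESSie or AnyBEM interfaces:

    # Explicit strategy: pay the Θ(n²) memory and element evaluations once,
    # then reuse the dense matrix for arbitrarily many cheap products.
    function explicit_matvec(rowelems, colelems, x)
        A = [interact(r, c) for r in rowelems, c in colelems]   # one-time cost
        return A * x                                            # cheap, repeatable
    end

    # Implicit strategy: no n²-sized allocation, but every product re-evaluates
    # all matrix elements on the fly.
    function implicit_matvec(rowelems, colelems, x)
        y = zeros(eltype(x), length(rowelems))
        for (i, r) in enumerate(rowelems)
            acc = zero(eltype(x))
            for (j, c) in enumerate(colelems)
                acc += interact(r, c) * x[j]
            end
            y[i] = acc
        end
        return y
    end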

Unfortunately, for a given surface approximation of n triangles, the explicit representations of the potential and system matrices require memory on the order of Θ(n²), which might easily exceed available resources. More specifically, all potential matrices and the local system matrices contain exactly n² elements each, while the nonlocal system matrices contain exactly 9n² elements. When using single- or double-precision floating-point values to represent the matrices, each element consumes four or eight bytes of memory, respectively. Table 9.6 presents the resulting memory requirements of the smallest (2lzx-5k) and the largest (2h1l-200k) protein-based datasets utilized in this chapter. For the implicit representations, we assume a total of 15 floating-point values per triangle, covering 3 position vectors for the triangle nodes and 2 additional position vectors for the normal vector and triangle centroid².

                                       2lzx-5k               2h1l-200k
    Potential matrices†
    ◦ Number of elements               24,780,484            40,232,336,400
    ◦ Theoretical size (FP32/FP64)     94.5 / 189.1 MiB      149.9 / 299.8 GiB
    ◦ Actual size (FP32/FP64)          291.7 / 583.4 KiB     11.5 / 23.0 MiB
    Nonlocal system matrix
    ◦ Number of elements               223,024,356           362,091,027,600
    ◦ Theoretical size (FP32/FP64)     0.83 / 1.7 GiB        1.3 / 2.6 TiB
    ◦ Actual size (FP32/FP64)          291.7 / 583.4 KiB     11.5 / 23.0 MiB

Table 9.6. – Memory requirements of the interaction matrices for the smallest and largest test datasets. The theoretical sizes correspond to the explicit representations and all sizes are given for both single (FP32) and double (FP64) floating-point precision. † The values for the potential matrices also apply to the first and second local system matrices.
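The theoretical sizes in Table 9.6 follow directly from these element counts, as the following Julia check for the 2h1l-200k mesh illustrates:

    # Explicit memory requirements for the 2h1l-200k mesh (n = 200,580 triangles).
    n = 200_580
    potential_elements = n^2        # 40,232,336,400 elements per potential matrix
    nonlocal_elements  = 9 * n^2    # 362,091,027,600 elements in the nonlocal system

    gib(bytes) = bytes / 2^30
    gib(4 * potential_elements)     # ≈ 149.9 GiB (FP32 potential matrix)
    gib(8 * nonlocal_elements)      # ≈ 2,698 GiB ≈ 2.6 TiB (FP64 nonlocal system matrix)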

As is evident from the table, the matrices of the smallest mesh easily fit into the memory of both workstations in their explicit representation. However, for the largest mesh only a single instance of either single-precision potential matrix (or the local system matrix) fits into the memory of the more generously equipped (second) workstation. Also, it should be noted that most of the structures used as a basis for the meshes are comparatively small. At the time of writing, the Protein Data Bank contains several thousand structures with atom counts above 19,000, including large macromolecular complexes of dozens of proteins [154].

An example for such large complexes is the human 80S ribosome (PDB ID 4UG0) [155], which is listed as a composition of 76 proteins, totaling over 200,000 atoms in the first biological assembly. We expect any sufficiently detailed surface triangulation of such large structures to exceed realistic memory capacities, which is why we consider the explicit system representations an infeasible option for such use cases.

² This actually overestimates the memory requirements for the implementations used in AnyBEM by one position vector per triangle, since the normal vectors are computed on demand from the triangle nodes.

The implicit system representations, on the other hand, scale excellently with the number of surface triangles: Owing to the linear memory requirements of the row and column elements underlying the interaction matrices, all matrices presented in our analyses only require up to a few megabytes of memory. What is more, all matrices of the same dataset share a common memory footprint, most notably including the nonlocal system matrices, which contain nine times more elements than the other matrices. To be even more precise, all matrices of the same dataset combined require no more than the listed size of any individual matrix (plus system constants), as they all share the exact same row and column elements. As a result, surface approximations of as many as 100 million triangles result in implicit matrices that occupy less than 12 GiB of memory.
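This bound can be verified with the same per-triangle assumption as above (15 double-precision values per triangle):

    # Implicit representation for 100 million surface triangles.
    n = 100_000_000
    bytes = 15 * 8 * n        # 15 FP64 values per triangle -> 1.2 * 10^10 bytes
    bytes / 2^30              # ≈ 11.2 GiB, i.e., below the stated 12 GiB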

These outstanding properties render interaction matrices viable candidate representations for the nonlocal protein electrostatics problem of almost arbitrarily-sized systems. Undeniably, there is a computational overhead involved for the repeated evaluation of the expensive interaction functions. This is why we direct our focus to the actual implications of this overhead for the CPU- and GPU-based matrix-vector product implementations in the following sections.

9.4.2 CPU-based implementations

In the category of CPU-based implementations, we evaluate the runtimes of the implicit matrix representations in AnyBEM (compiled with the cpu back end) and NESSie. For each protein-based dataset, we compute the matrix-vector product for the interaction matrices V, K, VR, and KR as well as the local and nonlocal system matrices. In the following, local refers to the first local system matrix, while the second local system matrix is identical to V and not listed separately (cf. Section 5.2.1). All time measurements comprise the creation of the matrix object (with pre-allocated row and column elements) and the product computation using a pre-allocated and accordingly-sized vector of ones as a second operand.

Each product computation is repeated ten times and timed separately³. The runtimes presented here refer to the average elapsed real times of the individual products on the second workstation, with more detailed result tables (including the corresponding standard deviations) being available in Appendix B (Tables B.4 and B.5).

³ In Julia-based implementations, the timed computations are preceded by an additional computation to compensate for differing runtimes due to Julia's just-in-time (JIT) compilation feature during the first computation.
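In schematic Julia form, the measurement procedure corresponds to the following harness (matvec! is a placeholder for the respective product implementation; the helper itself is not part of the actual benchmark scripts):

    using Statistics

    # Warm-up run to exclude Julia's JIT compilation, followed by ten timed
    # repetitions of the product with a pre-allocated vector of ones.
    function time_product(A, n::Int; reps::Int = 10)
        x = ones(n)
        y = zeros(n)
        matvec!(y, A, x)                                     # warm-up
        times = [@elapsed matvec!(y, A, x) for _ in 1:reps]
        return mean(times), std(times)
    end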

Among the interaction matrices, V and K use similar interaction functions and solving techniques. The same is true for VR and KR. As a result, we will usually see similar performance results for these matrices. Furthermore, the quadrature-based interaction functions of VR and KR can often be computed more efficiently than the ones of V and K, which heavily rely on trigonometric and logarithmic functions.

Figure 9.8. – Average runtimes for a single matrix-vector product of the implicit potential matrices associated with the 2lzx-5k dataset in NESSie: (a) single precision, (b) double precision.

In both NESSie and AnyBEM, the matrix-vector products are computed row-wise, where the rows are equally distributed over all available threads of execution (cf. Sections 6.3.2 and 8.4). Owing to the large number of rows (usually exceeding the number of CPU threads by far) and the embarrassingly parallel nature of the matrix element computations, we expect the parallelized product implementations to scale (roughly) linearly with the number of threads. However, the Julia-based implementation does not even remotely meet this expectation, as can be reviewed in Figure 9.8. Multithreading support in NESSie is implemented through Julia’s @threads macro, which is the only difference to the sequential implementation.
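A schematic Julia sketch of this strategy is shown below; getelement stands in for the on-the-fly evaluation of a single interaction-matrix element and is not NESSie's actual API:

    using Base.Threads

    # Row-wise matrix-vector product for an implicit matrix A: the rows are
    # distributed over the available threads via @threads, and every element
    # is recomputed on the fly.
    function matvec!(y::Vector{T}, A, x::Vector{T}) where T
        rows, cols = size(A)
        @threads for i in 1:rows
            acc = zero(T)
            for j in 1:cols
                acc += getelement(A, i, j) * x[j]
            end
            y[i] = acc
        end
        return y
    end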

Interestingly, the simple multithreading approach causes a substantial computational overhead that yields only slight performance improvements for a small number of threads and results in a severely degraded performance for large numbers. Even in the best case (8 threads), NESSie only achieves an effective speedup of 1.5 to 1.8 for the single-precision matrices and a speedup of about 1.3 to 2.0 for the double-precision matrices. Apart from this, the performance results are generally not very promising. Albeit being on par with the explicit system representations in the sequential case, all matrix-vector products for the smallest dataset take more than ten seconds. For the iterative solving process, we can expect several dozens or hundreds of iterations to be required before convergence, rendering the whole process highly time-consuming. After all, the multithreading facilities in Julia are marked as an experimental feature at the time of writing and we expect an improved scaling behavior in the future. Hence, we cannot recommend NESSie for the protein electrostatics problem with implicit representations for the time being.

Figure 9.9. – Average runtimes for a single matrix-vector product of the implicit potential matrices associated with the 2lzx-5k dataset in AnyBEM (CPU back end): (a) single precision, (b) double precision.

Our second competitor, on the other hand, yields more promising results, as presented in Figure 9.9. While utilizing the same row-based parallelization scheme as NESSie, AnyBEM provides matrix-vector products that scale roughly linearly with the number of threads. With 64 threads of execution, the AnyBEM matrices reach a maximum speedup of 38.9 to 42.0. On top of that, any choice of thread count results in an overall runtime below any of the results generated with NESSie. This is particularly true for the sequential implementations in AnyBEM, which show only a slight advantage over NESSie in the double-precision setting and a substantial lead for the single-precision matrices. For the given dataset, AnyBEM provides matrix-vector products that appear highly feasible for the iterative solving process. However, said dataset still is the smallest representative so far. Also, there are multiple potential matrices involved in the solving process. More precisely, the two local systems each require the comparatively expensive matrices V and K, while the nonlocal systems are composed of all four interaction matrices at once.

Figure 9.10. – Average runtimes for a single matrix-vector product of the implicit potential matrices in AnyBEM (CPU back end with 64 threads of execution): (a) single precision, (b) double precision.

Figure 9.10 shows the average runtimes for a single matrix-vector product using 64 threads of execution and additional protein-based datasets. Generally, the single-precision matrices perform slightly better than their double-precision counterparts. Also, the quadrature-based matrices VR and KR can be computed fastest among the interaction matrices. The performance of the first local system matrix (Appendix B, Table B.5) is nearly the same as for potential matrix K, which is not surprising as its matrix-vector product represents the operation αx + β(Kx) for some α, β ∈ R and a vector x ∈ Rⁿ: As expected, the uniform scaling and summation of vectors do not contribute significantly to the runtime of the operation. In the nonlocal case, each potential matrix is computed once. Thus, the average elapsed real time for a single matrix-vector product of the nonlocal system matrix corresponds to the sum of the runtimes of the individual interaction matrices. Again, the vector operations further involved in the assembly do not contribute significantly to the overall runtimes.
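For reference, the product of the first local system matrix amounts to nothing more than the following schematic operation, where K may be any matrix-like object supporting a product with x, and α and β stand for the coefficients derived from the system constants:

    # One expensive product with the potential matrix K, followed by a cheap
    # axpy-style vector update.
    local_matvec(K, x, α, β) = α .* x .+ β .* (K * x)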

In total, the performance of the CPU-based AnyBEM matrix operations can be considered viable for the iterative solving process of the low-resolution protein meshes. The largest meshes, on the other hand, still show a problematic tendency towards infeasible runtimes. Especially among the nonlocal system matrices of the largest meshes 6ds5-170k and 2h1l-200k, only the single-precision version of the former managed to finish in under a thousand seconds (for a single product). Assuming several hundreds of solver iterations, we do not currently consider the CPU-based AnyBEM matrix approach a viable option for larger protein meshes.

9.4.3 GPU-based implementations

For the GPU-based matrix-vector products, we assess the performance of AnyBEM once again, this time compiled with the cuda back end. AnyBEM is here accompanied by CuNESSie instead of NESSie. The basic setup remains the same: We measure the elapsed real time of each individual product computation for all protein-based datasets, implicit potential matrices, and system matrices. In contrast to the last section, we utilize both workstations (i.e., two different hardware accelerators) for the evaluations. As before, each computation is repeated ten times and the results are summarized in this section, with standard deviations and more detailed timings being available in Appendix B (Tables B.6, B.7, B.8, and B.9).

While the core of the parallelization scheme is similar to the CPU-based implementations, that is, the row-wise computation of the products, there is an additional layer of complexity regarding the distribution of the workload. More specifically, in a CUDA context, the individual tasks (i.e., matrix rows) need to be grouped into equally-sized blocks. Several device limits need to be taken into account for maximum efficiency, such as the maximum number of warps, blocks, and threads per streaming multiprocessor (SM).

One common way to express this efficiency is the so-called occupancy, namely, the ratio of concurrently active warps to the theoretical limit of the same. The RTX 2070 SUPER has a limit of 1,024 active threads per SM, resulting in a theoretical limit of 32 warps⁴. Furthermore, the number of active blocks per SM is limited to 16. The TITAN V doubles these limitations and thus allows up to 2,048 threads, 64 warps, and 32 blocks per SM. The theoretical occupancy is usually further restricted by the shared memory used per block. Since our implementations do not utilize shared memory at all, this factor does not apply here.

⁴ Each warp comprises a fixed number of 32 threads.

                  Single precision           Double precision
    Matrix        CuNESSie    AnyBEM         CuNESSie    AnyBEM
    V             70          58             96          109
    K             67          58             94          99
    VR            69          41             98          69
    KR            72          47             103         75

Table 9.7. – Number of per-thread registers required for the CUDA-accelerated matrix-vector products, depending on the tool and floating-point precision used

In contrast, another limiting factor that does apply is the number of available 32-bit registers per block and SM. When dividing the maximum number of registers per SM (65,536) by the product of the maximum number of active warps and the warp size, we obtain the per-thread register budget that still permits maximum occupancy. Thus, to allow for the maximum occupancy on both machines, the number of registers is limited to 64 per thread on the RTX 2070 SUPER and to 32 per thread on the TITAN V. However, the interaction functions involved in the matrix-vector products are comparatively complex and demand a fair number of registers. Table 9.7 shows the actual numbers of registers required for each interaction matrix and thread, depending on the implementation and floating-point precision used. Only the single-precision AnyBEM option satisfies these limits with the RTX 2070 SUPER. More specifically, the cheapest option in terms of registers (41 per thread) allows up to 49 active warps. At the other end of the spectrum, we find a requirement of 109 registers per thread, resulting in a maximum of 18 active warps for this option.
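These register-imposed limits can be reproduced with a small back-of-the-envelope Julia computation (ignoring the register allocation granularity of real devices):

    # 65,536 32-bit registers per SM, 32 threads per warp.
    regs_per_sm = 65_536
    warp_size   = 32

    # Per-thread register budget for full occupancy:
    budget(max_warps) = regs_per_sm ÷ (max_warps * warp_size)
    budget(32), budget(64)                  # (64, 32) for RTX 2070 SUPER and TITAN V

    # Register-limited number of active warps for a given per-thread count:
    active_warps(regs) = regs_per_sm ÷ (regs * warp_size)
    active_warps(41), active_warps(109)     # (49, 18)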

For our performance assessments, we tested different partitioning schemes with regard to different numbers of warps per block. We found a block size of 128 threads, i.e., 4 warps per block, to be the best-performing option across all potential matrices, devices, and implementations (although only by a small margin for some combinations). This partitioning scheme limits the number of active blocks to 4 in the worst case (i.e., 16 active warps) and to 12 (i.e., 48 active warps) in the best case, resulting in a theoretical occupancy of 50% to 100% for the RTX 2070 SUPER and 25% to 76.6% for the TITAN V. While we strictly refrain from claiming to have exhausted all possible optimization options here, we consider 128 threads per block a suitable default partitioning scheme for the feasibility assessment of the implicit system representations.
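The chosen partitioning can be illustrated with a minimal CUDA.jl-style sketch. It is not the actual AnyBEM or CuNESSie kernel: the element function is merely a stand-in for the BEM interaction functions, and the thesis-era packages may expose slightly different APIs. It only demonstrates the row-per-thread scheme with 128-thread (4-warp) blocks:

    using CUDA

    # Placeholder for the on-the-fly interaction functions (V, K, VR, KR).
    @inline element(i, j) = 1f0 / (abs(i - j) + 1f0)

    # Row-wise matrix-vector product: one thread per matrix row.
    function matvec_kernel!(y, x, n)
        i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
        if i <= n
            acc = 0f0
            for j in 1:n
                acc += element(i, j) * x[j]
            end
            y[i] = acc
        end
        return nothing
    end

    n = 10_000
    x = CUDA.rand(Float32, n)
    y = CUDA.zeros(Float32, n)

    threads = 128                 # 4 warps per block, as chosen above
    blocks  = cld(n, threads)
    @cuda threads=threads blocks=blocks matvec_kernel!(y, x, n)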

Figure 9.11 summarizes the average runtimes for the single-precision matrix-vector products. As a first observation, the CUDA-enabled implementations outperform all CPU-based implementations by at least one order of magnitude.

Figure 9.11. – Average runtimes for a single CUDA-accelerated matrix-vector product of the implicit potential matrices (single-precision matrices). Panels: (a) CuNESSie: RTX 2070 SUPER, (b) CuNESSie: TITAN V, (c) AnyBEM: RTX 2070 SUPER, (d) AnyBEM: TITAN V; elapsed real time [s] for the protein-based datasets.

Interestingly, AnyBEM copes very well with the potential matrices V and K, which are usually less efficient to compute than their quadrature-based counterparts VR and KR (cf. Section 9.4.2). In direct comparison to the sequential CPU-based AnyBEM implementation using the 2lzx-5k dataset, the GPU-based AnyBEM version reaches a speedup of 270 to 353 on the first workstation (RTX 2070 SUPER), depending on the matrix, and a speedup of 348 to 517 on the second workstation (TITAN V). Similarly, CuNESSie reaches a speedup of 247 to 359 on the first and 282 to 519 on the second workstation (with respect to the sequential AnyBEM version).

Overall, AnyBEM shows similar runtimes on both devices. While both are on par for the small protein meshes, the RTX 2070 SUPER is only about 20% faster than its competitor on the largest meshes. CuNESSie, on the other hand, proves more susceptible to the choice of device: here, the RTX 2070 SUPER is slightly faster on the small meshes but substantially slower on the larger meshes (almost by a factor of two in the worst case). When comparing the different implementations on the same device, AnyBEM is consistently faster than CuNESSie on the RTX 2070 SUPER and scales better with the mesh sizes. On the TITAN V, both implementations are roughly on par across all mesh sizes. With the exception of CuNESSie on the RTX 2070 SUPER, all options make it possible to compute a single matrix-vector product of the largest dataset’s nonlocal system matrix in roughly one minute. While this still renders the iterative solving of such single-precision systems highly time-consuming, we finally reach runtimes that are manageable in real application scenarios.

The results corresponding to the double-precision matrices are shown in Figure 9.12. Here, we do not expect the RTX 2070 SUPER to perform particularly well due to the comparatively small number of double-precision hardware units (cf. Section 9.1). Indeed, the figure clearly shows the device’s impaired performance across all matrices and implementations, with maximum speedups of 45 to 66 as compared to the sequential AnyBEM implementation for the 2lzx-5k dataset.

That being said, the RTX 2070 SUPER continues to yield faster results for all matrices than the CPU-based approach and is able to compute the matrix-vector products for the largest datasets within a few hundred seconds. The TITAN V, on the other hand, copes well with the double-precision matrices, with maximum speedups of 326 to 487 for 2lzx-5k. Generally, the individual products are more time-consuming for double-precision values than for single precision. However, aside from the fact that the resulting average runtimes remain manageable, they also do not directly translate into the overall runtime of the iterative solving process. In fact, the different precision levels have a significant influence on the convergence rate of the systems, resulting in vastly diverging solving times, as we will see in the next section.

Figure 9.12. – Average runtimes for a single CUDA-accelerated matrix-vector product of the implicit potential matrices (double-precision matrices). Panels: (a) CuNESSie: RTX 2070 SUPER, (b) CuNESSie: TITAN V, (c) AnyBEM: RTX 2070 SUPER, (d) AnyBEM: TITAN V; elapsed real time [s] for the protein-based datasets.

In conclusion, CuNESSie presents itself as a reliable choice for the matrix-vector products computed on the TITAN V, so that we can finally try to solve the local and nonlocal BEM systems using their implicit representations and iterative solvers in the next step. On the RTX 2070 SUPER, the situation is less clear. While CuNESSie still yields good runtimes for the single-precision matrices, the device is less suited for double precision, hampering a direct comparison of the BEM solvers on the first workstation. The same argument applies to a possible solver implementation in AnyBEM. Nevertheless, the package’s performance in terms of matrix-vector products on the TITAN V justifies further investigation in this regard.

9.5 BEM solvers

In the last section of this chapter, we evaluate the runtime performance of our BEM solver implementations. In the first step, we compare the nonlocal BEM solvers for the explicit system representations in NESSie to the prototypical reference program published as [61] (cf. Chapter 6). Although we primarily focus on implicit system representations in this manuscript, this comparison serves as a general justification of the Julia-based approach and establishes a performance baseline. In the second and final step, we put the CUDA-accelerated matrix-vector product implementations to the test and evaluate the performance of our BEM solvers for different iterative linear system solvers.

9.5.1 Explicit system representations

The performance evaluation for the explicit system representations is performed for 60 datasets in total, that is, the 57 meshes presented in Section 9.2.2 and the three meshes 2lzx-5k, 2ptc-9k, and 1c4k-15k from Section 9.2.3. The time measurements cover the assembly of the linear systems in double precision – including the computation of the potential matrices V, K, VR, and KR – as well as the solving process. In the reference program, the latter step is delegated to the ATLAS library [136] (here used in version 3.10.2), while Julia uses LAPACK [131] directly for the same purpose. Since the reference program does not support any kind of multithreading, we prohibit the same in NESSie by setting the JULIA_NUM_THREADS environment variable to one and by explicitly calling LinearAlgebra.BLAS.set_num_threads(1).
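For reference, the single-threaded setup described above corresponds to the following sketch (the script name is hypothetical):

    # Launch Julia with a single compute thread (shell):
    #   JULIA_NUM_THREADS=1 julia run_benchmark.jl
    #
    # Inside the script, additionally restrict BLAS/LAPACK to one thread:
    using LinearAlgebra
    LinearAlgebra.BLAS.set_num_threads(1)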

Figure 9.13. – Single-thread performance of the nonlocal BEM solvers provided by NESSie and the reference program presented in [61] for the explicit system representations of differently sized surface meshes (number of triangles vs. elapsed real time [s]).

Figure 9.13 shows the elapsed runtimes for the linear system assembly and solving process of all 60 datasets on the second workstation. Generally, both implementations yield comparable runtimes for the same meshes. While the reference program consistently finishes execution faster than NESSie for meshes up to a few thousand triangles, the Julia-based solver performs better with larger meshes. In particular, NESSie requires 173 seconds for 2lzx-5k, 748 seconds for 2ptc-9k, and 2,929 seconds for 1c4k-15k, while the reference program requires 236, 1,501, and 6,041 seconds for the same datasets.

It should be noted here that ATLAS was installed as a pre-built library through the operating system’s official package repository. As a result, the library was built without performance optimizations specific to the workstation. Enabling the performance optimizations is a disruptive process and requires elevated user privileges on most machines with active CPU throttling, which is why we chose a less disruptive alternative for the second workstation. ATLAS is known to be particularly susceptible to these build-time optimizations [156], such that the performance results shown here do not reflect the best results possible. However, we have shown previously [75] for the same 57 meshes from Section 9.2.2 and a different machine that the reference program is able to reduce NESSie’s lead with the largest meshes to just a few seconds when performing the build-time optimizations as suggested.

9.5.2 Implicit system representations

The performance evaluation of the sequential BEM solvers for the explicit system representations with double-precision values resulted in solving times of about 3 minutes for 2lzx-5k, 13 minutes for 2ptc-9k, and 49 minutes for 1c4k-15k. These values include the one-time computation of all potential matrices V, K, VR, and KR (each of size n × n for n surface triangles) as well as the actual solving of the nonlocal system by means of LU factorization (with a system matrix of size 3n × 3n). Owing to the different dimensions of the involved matrices, the latter step quickly becomes the dominant contributor to the overall runtime. For instance, the LU factorization alone accounts for roughly half of the total time spent on 2lzx-5k, while this contribution increases to almost 80% for 1c4k-15k. In the following, we finally evaluate whether an iterative solving approach with accelerated matrix-vector products, which does not require an LU factorization, can balance out the costs of repeatedly recomputing all potential matrices and serve as a viable alternative to the classic approach.

As a consequence of the performance results presented in Section 9.4, we do not consider CPU-based implementations of the matrix-vector products for the itera- tive solving process. Among the GPU-based implementations, both CuNESSie and AnyBEM showed reasonable runtimes for an attempt at solving the systems on both workstations (with the exception of double-precision matrices on the RTX 2070 SUPER). However, AnyBEM does not currently provide a linear system solver suitable for this purpose, which is why we henceforward exclusively utilize CuNESSie for the performance evaluation.

Since CuNESSie is Julia-based software, it benefits from a range of compatible iterative solvers. Here, we consider the implementations provided by the IterativeSolvers.jl package (cf. Section 5.4.1), namely restarted GMRES, BiCGStab(l), and IDR(s). Where possible, we further compare the implementations with and without a simple Jacobi (diagonal) left preconditioner (cf. Section 5.4.2). Figure 9.14 shows the convergence behavior of the aforementioned solver options for the nonlocal BEM systems in double precision. The stated runtimes correspond to the second workstation, while the remaining values are independent of the device. The stopping criterion for each solver is specific to the individual implementation and floating-point precision. For double precision, IDR(s) assumes convergence if the residual norm for the solution vector falls below 1.49 · 10⁻⁸. GMRES and BiCGStab(l) additionally scale this value by the residual norm of the initial guess for the solution vector (here, a vector of zeros) and assume convergence once the current residual norm falls below this scaled value.
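The corresponding solver calls follow the standard IterativeSolvers.jl interface. The sketch below is illustrative only: the ImplicitMatrix type and its element rule are stand-ins (not the actual NESSie/CuNESSie types), but they show how a matrix-free type plugs into restarted GMRES with a Jacobi left preconditioner:

    using LinearAlgebra, IterativeSolvers

    # Stand-in implicit matrix type with an on-the-fly element rule.
    struct ImplicitMatrix{T} <: AbstractMatrix{T}
        n::Int
    end

    Base.size(A::ImplicitMatrix) = (A.n, A.n)
    Base.getindex(A::ImplicitMatrix{T}, i::Int, j::Int) where T =
        i == j ? T(2) : T(1) / (abs(i - j) + 1)

    # Iterative solvers only need matrix-vector products; these are computed
    # element by element without ever storing the matrix.
    function LinearAlgebra.mul!(y::AbstractVector{T}, A::ImplicitMatrix{T},
                                x::AbstractVector{T}) where T
        for i in 1:A.n
            y[i] = sum(A[i, j] * x[j] for j in 1:A.n)
        end
        return y
    end

    n = 300
    A = ImplicitMatrix{Float64}(n)
    b = rand(n)

    # Restarted GMRES with a left Jacobi (diagonal) preconditioner.
    Pl = Diagonal([A[i, i] for i in 1:n])
    x, history = gmres(A, b; Pl = Pl, restart = 200, log = true)
    history.isconverged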

Figure 9.14. – Convergence behavior of different iterative solvers for the nonlocal BEM systems using CuNESSie and double floating-point precision (residual norm vs. number of iterations). All runtimes correspond to the computation on the second workstation (TITAN V). Panel (a) 2lzx-5k: GMRES + Jacobi (15 sec.), GMRES (17 sec.), IDR(s) (27 sec.), BiCGStab(l) + Jacobi (32 sec.), BiCGStab(l) (34 sec.). Panel (b) 1c4k-15k: GMRES + Jacobi (152 sec.), GMRES (246 sec.), BiCGStab(l) + Jacobi (344 sec.), IDR(s) (354 sec.), BiCGStab(l) (649 sec.).

Notwithstanding these different criteria, the figure shows clear distinctions between the solvers across most residual norms. As a first observation, the systems of both datasets can be solved significantly faster than before, no matter the solver used. Secondly, the preconditioned solver variants converge faster than their counterparts. Apart from these general observations, the IDR(s) solver requires substantially more iterations than any other solver, even when considering a less strict stopping criterion. Also, the solver experiences large fluctuations in terms of the residual norm – sometimes spanning multiple orders of magnitude between iterations. At the other end of the spectrum, the BiCGStab(l) solvers are among the fastest in terms of the number of iterations. At the same time, they are also among the slowest solvers in terms of runtime. This is due to each BiCGStab(l) iteration requiring 2l matrix-vector products (with l being the number of internal GMRES steps per iteration), while IDR(s) and GMRES only compute one product per iteration. By default, BiCGStab(l) uses l = 2, rendering each iteration about four times as expensive as those of the other solvers.

                 Single precision              Double precision
    Dataset      Iterations   Residual norm    Iterations   Residual norm
    2lzx-5k      54           3.64 · 10⁻³      106 (47)     1.63 · 10⁻⁷
    2lzx-20k     89           7.88 · 10⁻³      131 (58)     2.78 · 10⁻⁷
    2lzx-80k     833          1.64 · 10⁻²      171 (76)     6.64 · 10⁻⁷
    2ptc-9k      122          6.63 · 10⁻³      193 (100)    2.68 · 10⁻⁷
    2ptc-19k     165          1.05 · 10⁻²      289 (122)    4.50 · 10⁻⁷
    2ptc-74k     –            (DNF)            554          6.87 · 10⁻⁷
    1c4k-15k     156          1.60 · 10⁻²      288 (117)    6.66 · 10⁻⁷
    1c4k-60k     1397         3.27 · 10⁻²      380 (147)    1.36 · 10⁻⁶

Table 9.8. – Nonlocal BEM solver performance using CuNESSie with restarted GMRES, Jacobi preconditioning, and default convergence criteria. Iteration numbers in parentheses correspond to the single-precision convergence criterion. Solving attempts exceeding 3,000 iterations are marked as did-not-finish (DNF).

The clear winner in terms of runtime is the preconditioned GMRES solver. Here, the residual norm is minimized monotonically, with possible plateau phases due to the GMRES process being restarted after a fixed number m of iterations. A small value for m, such as the default setting m = 20, results in severely impeded convergence with our systems, as convergence tends to start slowly at the beginning of each GMRES process. This effect can be inspected in Figure 9.14b (with m = 200). Conversely, a high value for m increases the memory footprint of the solver, in the extreme up to a point equivalent to the explicit system representations⁵. We chose m = 200 as a compromise to maintain a small total memory footprint and as a reasonable default for all datasets. In a real application scenario, the parameter should then be adapted to the particular systems at hand.
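For a rough sense of scale (an illustrative calculation, not stated in the text): given that the solver internally allocates a 3n × m matrix for the nonlocal systems (see the footnote below) with double-precision (8-byte) entries, the choice m = 200 amounts to roughly

\[
3n \cdot m \cdot 8\,\mathrm{B} \;\approx\; 3 \cdot 60{,}000 \cdot 200 \cdot 8\,\mathrm{B} \;=\; 288\,\mathrm{MB}
\]

for the largest meshes considered here (n ≈ 60,000 surface triangles) – still far below the terabyte scale of the explicit system matrices, but growing linearly with m.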

Table 9.8 lists the number of iterations required by the restarted and preconditioned GMRES solving attempts as well as the resulting residual norms for the nonlocal BEM systems of the protein-based datasets. As is evident from the table, all systems of the low-resolution meshes (2lzx-5k, 2ptc-9k, and 1c4k-15k) can be solved within a few dozen to a few hundred iterations, for both single and double precision. Furthermore, the single-precision variants usually require fewer iterations to reach convergence. However, this is mainly owing to the different convergence criteria for the two precision levels; in fact, the double-precision solvers consistently meet the single-precision convergence criterion in even fewer iterations (cf. the numbers in parentheses in Table 9.8).

⁵ The solver internally allocates a matrix of size 3n × m for the nonlocal systems corresponding to n surface triangles.

Before we inspect how these observations translate into runtimes, we first need to address the proverbial elephant in the room: obviously, the combination of high-resolution meshes and single-precision values severely impacts the solving process. We have briefly discussed this issue in Section 9.2.3. To further reconstruct the source of this issue, we need to remember that the system matrix originates from a point-collocation method (cf. Sections 3.2.2 and 3.3.2), such that each element effectively represents a pair of surface triangles. In a sense, the information represented by the matrix is distributed over all triangles, where the individual matrix elements are weighted by the area of the corresponding triangles. Thus, increasing the mesh resolution, that is, the number of triangles, has two direct implications for the matrix-vector products: firstly, the per-row dot products consist of more components; secondly, the matrix elements become smaller. In other words, with increasing mesh resolution, the dot products involved in the matrix-vector products need to be computed for more and smaller operands, increasing the chance of floating-point cancellations and other numerical instabilities. Such effects are visible for all high-resolution meshes 2lzx-80k, 2ptc-74k, and 1c4k-60k. To make matters worse, the 2ptc-74k dataset even contains three triangles with an area below 3.45 · 10⁻⁴, which are treated as zero in the single-precision solvers⁶. While these issues primarily occur in high-resolution meshes, low-resolution meshes should also be inspected carefully before usage. In particular, large variations in the triangle sizes might similarly result in cancellations. Ideally, meshes should contain roughly equally-sized triangles to avoid such problems. Overall, a suitable mesh resolution needs to be determined for each biomolecular system as a compromise between structural detail and triangle size.
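The effect can be illustrated with a small, self-contained Julia experiment (not taken from the thesis); it mimics a per-row dot product over many small, mixed-sign contributions:

    using Random

    Random.seed!(42)
    n   = 80_000                       # on the order of the largest mesh resolutions
    a64 = randn(n) .* 1e-4             # small per-triangle contributions
    b64 = randn(n)
    a32, b32 = Float32.(a64), Float32.(b64)

    # Plain left-to-right accumulation, as in a straightforward per-row dot product.
    naive_dot(a, b) = foldl(+, a .* b)

    ref     = naive_dot(a64, b64)
    rel_err = abs(naive_dot(a32, b32) - ref) / abs(ref)
    println("relative error of the single-precision accumulation: ", rel_err)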

Figure 9.15 and Table 9.9 finally present the runtimes corresponding to the local and nonlocal BEM solvers on both workstations. The results shown for the local case represent the combined solving times for both local systems (cf. Section 5.2.1). Nonetheless, the local solvers usually finish execution much faster than their nonlocal variants. For low- and mid-resolution meshes, the single-precision systems can be solved significantly faster than their double-precision counterparts. The latter observation was to be expected for two reasons: firstly, both devices are equipped with a disproportionately large number of single-precision hardware units (cf. Section 9.1). This is a particular challenge for the RTX 2070 SUPER, which is no match for the TITAN V in the double-precision category, while dominating the single-precision category for low-resolution meshes. Secondly, as discussed above, the permissive convergence criteria of the single-precision solvers allow for fewer iterations in general. However, for the high-resolution meshes, the faster convergence behavior of the double-precision solvers similarly results in faster solving times as compared to the single-precision solvers.

⁶ Length values are given in angstroms. 2ptc-74k is the only dataset with this particular issue.

Figure 9.15. – Elapsed solving time for the local and nonlocal BEM systems using CuNESSie with a restarted and preconditioned GMRES solver. Solving attempts exceeding 3,000 iterations or 8 hours are not shown. Panels: (a) RTX 2070 SUPER: single precision, (b) RTX 2070 SUPER: double precision, (c) TITAN V: single precision, (d) TITAN V: double precision; elapsed real time [s] for the protein-based datasets, separated into local and nonlocal systems.

                             RTX 2070 SUPER runtime [s]     TITAN V runtime [s]
    System     Dataset       Single prec.   Double prec.    Single prec.   Double prec.
    local      2lzx-5k       0.85           12.78           1.31           3.28
               2lzx-20k      8.79           220.72          7.50           16.54
               2lzx-80k      875.49         4010.75         591.08         303.89
               2ptc-9k       3.37           59.14           5.20           9.17
               2ptc-19k      11.16          268.34          11.36          23.30
               2ptc-74k      1346.79        5097.05         864.60         413.42
               1c4k-15k      11.01          154.71          14.89          24.46
               1c4k-60k      386.69         2710.17         274.14         279.15
    nonlocal   2lzx-5k       3.84           57.31           5.59           14.79
               2lzx-20k      73.94          1100.30         44.97          100.90
               2lzx-80k      14934.93       21736.05        8957.64        2322.68
               2ptc-9k       19.08          385.67          25.84          53.26
               2ptc-19k      110.84         2214.14         75.48          202.80
               2ptc-74k      (DNF)          (DNF)           (DNF)          6648.82
               1c4k-15k      56.75          1383.62         60.84          143.96
               1c4k-60k      13916.15       27770.35        8196.53        3169.99

Table 9.9. – Elapsed solving time for the local and nonlocal BEM systems using CuNESSie with a restarted and preconditioned GMRES solver. Solving attempts exceeding 3,000 iterations or 8 hours are marked as did-not-finish (DNF).

Even though we only optimized the matrix-vector products rather than the iterative solvers themselves, the BEM solvers in CuNESSie achieve a nearly uninterrupted 100% GPU utilization. This is in spite of multiple CUDA kernel invocations, host-side code execution, and memory transfers per matrix-vector product. The latter transfers only involve a few megabytes of data (e.g., input and result vectors) at once, rendering the CUDA part of our implementation fully compute-bound.

In conclusion, the TITAN V presents itself as the clear favorite device for the local and nonlocal BEM solvers in CuNESSie. While it is technically outperformed by the RTX 2070 SUPER on the low-resolution meshes, it is able to solve the protein electrostatics problem for both single- and double-precision values in a short amount of time, which is unrivaled by any of the presented CPU-based implementations. Although the solving process for meshes of several hundred thousand triangles still remains a matter of hours or even days, CuNESSie leaves many optimization options that will be targeted in future projects, such as more dataset-specific parallelization and partitioning schemes as well as preconditioners.

10 Conclusion

Accurate models for the electrostatics of biomolecular systems are essential for the study and prediction of long-range protein interactions. Among other things, such interactions build the foundation of the mechanisms behind many diseases and enable the development of counteracting drugs and therapies. One important yet commonly underrepresented factor influencing the long-range visibility of proteins in the cell is the polarization of the water-based environment. In this manuscript, we have thus focused on electrostatics models for biomolecular systems based on the framework of nonlocal electrostatics, which account for solvent molecule interdependencies and were repeatedly shown to provide more realistic energy estimates than commonly used alternatives. In particular, we have developed space-efficient and exact implicit representations for an existing implementation of nonlocal electrostatics, namely, a boundary element method (BEM) for the local and nonlocal protein electrostatics problem in water. The dense and asymmetric nature of the associated matrices previously rendered applications of the BEM approach infeasible for realistically-sized biomolecular systems when no suitable compression schemes for the matrices were present.

Based on our analyses of the BEM systems in Chapter 3, we have developed a number of different implicit matrix and vector types for general-purpose use, complemented by reference implementations for the Julia and Impala languages. We have further shown how to implement specialized operations for these types that can, for example, be used in conjunction with standard iterative solvers for linear systems (cf. Chapter 5). In the case of Julia, we have demonstrated how to inject specialized operations into certain functions by means of function overloads for custom data types. For instance, by providing a CUDA-enabled matrix-vector product implementation for our implicit system matrices, all utilized iterative solvers – originally serial and purely CPU-based – effectively became CUDA-accelerated versions of themselves, without any alteration of their code. Since the presented injection techniques are neither specific to the protein electrostatics problem nor to our implicit types but rather implications of Julia’s multiple dispatch mechanism and dynamic type system, they can be transferred to similar application scenarios.

In Chapters 6, 7, and 8, we have presented BEM solvers for the local and nonlocal protein electrostatics problem in Julia and Impala, based on our implicit system representations and specialized matrix operations. All software accompanying this manuscript is open source and freely available online¹.

The NESSie package provides a full-fledged and highly accessible software suite for CPU-based local and nonlocal BEM solvers, with package features including a generic system model with read and write support for several different data formats as well as solvers and post-processors for explicit and implicit BEM systems. NESSie is complemented by the CuNESSie package, which offers CUDA-accelerated variants of NESSie’s solvers and post-processors for the implicit BEM systems. Owing to Julia’s scripting language character, NESSie and CuNESSie can be used to their full extent by the scientific community (without requiring extensive programming skills) to perform exploratory analyses in the field of nonlocal protein electrostatics.

The same implicit representations, alongside dedicated CPU- and GPU-based implementations for generic matrix-vector products, are alternatively provided in the form of the AnyBEM package for Impala. AnyBEM mostly serves as a showcase example for the AnyDSL framework’s partial evaluation (PE) mechanism and as a contrast to Julia’s just-in-time (JIT) compilation approach. The domain-specific implementation of our BEM solvers in the package keeps most of the program logic in a generic code base that is common to all supported platforms and allows for easy switching between them. AnyBEM additionally ships with a C++ core that contains a high-level system model description and routines for input data preparation and that serves as a C/C++ interface to the Impala-based solvers.

In Chapter 9, we have shown the high-performance capabilities of all three BEM solver packages with real-world protein structures. In particular, we have demonstrated that our Julia-based solvers can compete with an optimized reference implementation in C. Apart from that, we have found that both Julia’s JIT-based and Impala’s PE-based approaches generally yield similar runtimes for the CUDA-accelerated matrix-vector products of the local and nonlocal system matrices, although details vary between implementations, devices, and problem sizes. Overall, all GPU-based matrix-vector products and solvers outperformed their CPU-based counterparts by at least one order of magnitude in terms of runtime, effectively rendering only the GPU-based implementations suitable for larger biomolecular systems, for the time being. Our implicit representations reduce the effective memory footprint of the BEM matrices from terabytes to a few megabytes for most biomolecular systems without any loss of information. Hence, the CUDA-based implementations are not only compute-bound but also, to the best of our knowledge, make it possible for the first time to solve the original BEM systems in a reasonable time frame, e.g., within a few seconds or minutes for the analyzed systems.

¹ https://github.com/tkemmer


At this point, we see a number of different options for further improvements of our solvers. So far, we have utilized pre-existing iterative solvers where matrix-vector products were the exclusive target of optimization. While we still achieve a GPU utilization of almost 100%, this approach is not necessarily comparable to a CUDA-enabled version of the whole iterative solver, as a large number of additional memory transfers and kernel invocations are required. Custom iterative solvers could enable previously inaccessible optimizations that are tailored specifically to the implicit system representations, e.g., by more aggressively reusing matrix elements to avoid expensive recomputations. Since the effect of such optimizations highly depends on the concrete solver, this should trigger a full reevaluation of suitable solvers (and preconditioners). The same considerations naturally apply to the CPU-based parallel matrix-vector products. Especially in the case of Julia, where multithreading support is still experimental as of the time of writing, we are eager to reevaluate the high-performance capabilities of CPU-based solvers in the future.

For our current CUDA-based matrix-vector products, we expect an improved runtime performance when introducing device-specific partitioning schemes and more sophisticated parallelization schemes, eventually also reducing the runtime of the overall solving process. The current implementations always use a fixed block size for partitioning and assign full matrix rows to individual threads, which might limit efficiency for certain combinations of devices and problem sizes. Moreover, additional code inspections will be performed to identify and eliminate avoidable bottlenecks in the kernel functions, e.g., to mitigate diverging branches of execution in the computation of the matrix elements.

We have advertised NESSie as a feature-rich and intuitive package for the nonlocal protein electrostatics problem in Julia. In this spirit, we aim at adding alternative solvers in the future to facilitate the study of different nonlocal models, e.g., those based on different assumptions regarding mobile ion terms or dielectric solvent responses. For the BEM solvers themselves, we will investigate the applicability of compression schemes for the system matrices and compare them to the exact implicit representations. We have already shown in Chapter 6 how our BEM solvers use dedicated result types and multiple dispatch to comply with a uniform solver and post-processor interface. The same techniques can also be used to neatly integrate new functionality into NESSie, without impairing user experience. Due to the inherent similarities of boundary and finite element methods (FEM) as well as the availability of FEM formulations for nonlocal protein electrostatics, we consider the FEM a suitable candidate for the addition of alternative solvers to NESSie.

As a result of our previous work [75], NESSie already supports the required volume mesh-based system models and offers tools to generate such meshes for given protein structures. However, further assessment of existing FEM-related Julia packages, including those of the JuliaFEM project [137], is required to identify beneficial synergetic effects.

Lastly, the current status of the AnyBEM project is best described as a construction site on hold. While the implicit system representations and accelerated matrix-vector products for the local and nonlocal settings are already fully functional, we do not provide any BEM solvers right now. This is mostly owing to the absence of usable third-party libraries, which slows down development cycles substantially in contrast to Julia. Apart from that, the limited accessibility of Impala-based software in direct comparison to Julia further lowered the priority of a focus on AnyBEM in the past. However, our performance analyses with real protein structures have shown that the partial evaluation approach harbors a lot of unused potential. More specifically, the CUDA-accelerated matrix-vector products in Impala slightly but regularly outperformed their Julia counterparts. What is more, adding support for additional platforms (e.g., for GPUs from other vendors) boils down to implementing only a few platform-specific functions, as long as suitable LLVM back ends for Impala exist. The C++ core of AnyBEM can alternatively be extended to include fully C++-based solver variants based on the Impala counterparts, if need be. With the additional expectation that more and more general-purpose libraries for the AnyDSL framework will become available as time progresses, we will be happy to revisit and finalize the AnyBEM project in the future.

A Test model functions

In this chapter, we summarize the equations for the local or nonlocal electrostatic potential functions as implemented in the TestModel module of our NESSie.jl package for the Julia programming language. Short descriptions of the models can be found in Section 6.4 earlier in this manuscript. Each of the following sections includes the original sources for the potential functions.

A.1 Electrostatic potentials for the Born test model

The interior and exterior potentials for the Born test model in the NESSie.TestModel module, based on the derivations presented in [49, Section 3.5.1], can be computed for a point charge ζ located at the origin and surrounded by a vacuum-filled sphere (εΩ = 1) with radius a through

\[
\phi_\Omega^{\mathrm{local}}(\xi) = \frac{\zeta}{4\pi\varepsilon_0}\left[\frac{1}{|\xi|} + \frac{1}{a}\left(\frac{1}{\varepsilon_\Sigma} - 1\right)\right], \qquad \xi \in \Omega,
\]
\[
\phi_\Sigma^{\mathrm{local}}(\xi) = \frac{\zeta}{4\pi\varepsilon_0\varepsilon_\Sigma}\,\frac{1}{|\xi|}, \qquad \xi \in \Sigma.
\]

The nonlocal versions of the potentials can be computed as

\[
\phi_\Omega^{\mathrm{nonlocal}}(\xi) = \frac{\zeta}{4\pi\varepsilon_0}\left[\frac{1}{|\xi|} + \frac{1}{a\,\varepsilon_\Sigma}\left(1 - \varepsilon_\Sigma + \frac{\varepsilon_\Sigma - \varepsilon_\infty}{\varepsilon_\infty}\,\frac{\sinh\nu}{\nu}\,e^{-\nu}\right)\right], \qquad \xi \in \Omega,
\]
\[
\phi_\Sigma^{\mathrm{nonlocal}}(\xi) = \frac{\zeta}{4\pi\varepsilon_0\varepsilon_\Sigma\,|\xi|}\left(1 + \frac{\varepsilon_\Sigma - \varepsilon_\infty}{\varepsilon_\infty}\,\frac{\sinh\nu}{\nu}\,e^{-\nu\,|\xi|/a}\right), \qquad \xi \in \Sigma,
\]

where the quantity ν is defined as

\[
\nu := \sqrt{\frac{\varepsilon_\Sigma}{\varepsilon_\infty}}\;\frac{a}{\lambda}.
\]
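For illustration, the exterior potentials above translate almost directly into Julia. The following sketch is a hypothetical transcription and not the actual NESSie.TestModel implementation; all quantities are assumed to be given in consistent SI units:

    # Hedged sketch: direct transcription of the Born potentials above.
    const ε₀ = 8.8541878128e-12     # vacuum permittivity [F/m]

    # Local exterior potential at distance r = |ξ| from the charge ζ.
    ϕΣ_local(r, ζ, εΣ) = ζ / (4π * ε₀ * εΣ * r)

    # Nonlocal exterior potential with sphere radius a and correlation length λ.
    function ϕΣ_nonlocal(r, ζ, a, εΣ, εinf, λ)
        ν = sqrt(εΣ / εinf) * a / λ
        ϕΣ_local(r, ζ, εΣ) * (1 + (εΣ - εinf) / εinf * sinh(ν) / ν * exp(-ν * r / a))
    end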

A.2 Electrostatic potentials for the Poisson test model

The interior and exterior potentials for the nonlocal Poisson test model in the NESSie.TestModel module, which is based on the first nonlocal Poisson test model from [151], can be computed for a sphere with radius a and N point charges modeled according to Eq. (2.1) through the following equations:

\[
\phi_\Omega(\xi) = \frac{e_c}{\varepsilon_0} \sum_{j=1}^{N} \zeta_j \left( \sum_{n=0}^{\infty} A_{3n}\,|\xi|^{n}\, P_n\!\left(\frac{r_j\cdot\xi}{|r_j|\,|\xi|}\right) + \frac{1}{4\pi\varepsilon_\Omega\,|\xi - r_j|} \right), \qquad \xi\in\Omega,
\]
and
\[
\phi_\Sigma(\xi) = \frac{e_c}{\varepsilon_0} \sum_{j=1}^{N} \zeta_j \sum_{n=0}^{\infty} \left( \frac{\varepsilon_\infty - \varepsilon_\Sigma}{\varepsilon_\infty}\, A_{2n}\, k_n(\kappa|\xi|) + \frac{A_{1n}}{|\xi|^{n+1}} \right) P_n\!\left(\frac{r_j\cdot\xi}{|r_j|\,|\xi|}\right), \qquad \xi\in\Sigma,
\]

where P_n is the Legendre polynomial of degree n [157, Chapter 18] and the quantities A_{in} for i = 1, 2, 3 are defined as

\[
A_{1n} = \frac{2n+1}{4\pi d_n a}\left(\frac{|r_j|}{a}\right)^{n}\left[ w_n + \frac{n\lambda(\varepsilon_\infty-\varepsilon_\Sigma)}{a\,\varepsilon_\infty}\, i_n\!\left(\frac{|r_j|}{\lambda}\right) k_n(\kappa a) \right],
\]
\[
A_{2n} = \frac{(2n+1)\lambda}{4\pi\varepsilon_\Sigma d_n a^{n+3}}\left[ (\varepsilon_\Sigma-\varepsilon_\Omega)(n+1)\,\frac{|r_j|^{n}}{a^{n}}\, i_n\!\left(\frac{a}{\lambda}\right) - \bigl[n(\varepsilon_\Omega-\varepsilon_\Sigma)+\varepsilon_\Sigma\bigr]\, i_n\!\left(\frac{|r_j|}{\lambda}\right) \right],
\]
\[
A_{3n} = \frac{A_{1n}}{a^{2n+1}} + A_{2n}\,\frac{\varepsilon_\infty-\varepsilon_\Sigma}{\varepsilon_\infty a^{n}}\, k_n(\kappa a) - \frac{|r_j|^{n}}{4\pi\varepsilon_\Omega a^{2n+1}},
\]

with i_n(z) [157, Eq. 10.47.7] and k_n(z) [157, Eq. 10.47.9] being the modified spherical Bessel functions and

\[
\kappa = \frac{1}{\lambda}\sqrt{\frac{\varepsilon_\Sigma}{\varepsilon_\Omega}},
\]
\[
d_n = \frac{n(2n+1)\lambda\varepsilon_\Omega(\varepsilon_\Sigma-\varepsilon_\infty)}{a^{n+2}\,\varepsilon_\infty}\, i_n\!\left(\frac{a}{\lambda}\right) k_n(\kappa a) + \frac{n\varepsilon_\Omega+(n+1)\varepsilon_\Sigma}{a^{n+1}}\left[\frac{\varepsilon_\Sigma}{\varepsilon_\infty}\, i_{n+1}\!\left(\frac{a}{\lambda}\right) k_n(\kappa a) + \kappa\lambda\, i_n\!\left(\frac{a}{\lambda}\right) k_{n+1}(\kappa a)\right],
\]
\[
w_n = \frac{n\lambda}{a}\,\frac{\varepsilon_\Sigma-\varepsilon_\infty}{\varepsilon_\infty}\, i_n\!\left(\frac{a}{\lambda}\right) k_n(\kappa a) + \frac{\varepsilon_\Sigma}{\varepsilon_\infty}\, i_{n+1}\!\left(\frac{a}{\lambda}\right) k_n(\kappa a) + \kappa\lambda\, i_n\!\left(\frac{a}{\lambda}\right) k_{n+1}(\kappa a).
\]

B Additional tables and figures

    Name    Surface model   Volume model   Charge model   Generator
    HMO+    X               –              X              NESSie.jl
    Mcsf    –               X              –              GAMer [139]
    MSMS    X               –              –              MSMS [158]
    OFF     X               –              –              GAMer
    PQR     –               –              X              PDB2PQR [138]
    STL     –               X              –              NESSie.jl

Table B.1. – Input formats supported by NESSie.jl. Mcsf support was reverse-engineered from the mesh files. The HMO+ format specification can be found in Section 6.2.3, all others in Table B.3.

    Name     Point cloud   Surface model   Volume model   Charge model
    HMO+     –             X               –              X
    SKEL     –             X               X              –
    STL      –             X               –              –
    VTK      –             X               X              –
    XML3D    X             X               –              –

Table B.2. – Output formats supported by NESSie.jl. Check marks refer to the information included in the resulting output file. The HMO+ format specification can be found in Section 6.2.3, all others in Table B.3.

    Format   Specification
    HMO      https://espace.cern.ch/roxie/Documentation/ironFileDoc.pdf
    OFF      http://www.geomview.org/docs/html/OFF.html
    PQR      https://apbs-pdb2pqr.readthedocs.io/en/latest/formats/pqr.html
    SKEL     http://www.geomview.org/docs/html/SKEL.html
    STL      http://www.fabbers.com/tech/STL_Format
    VTK      http://www.vtk.org/VTK/img/file-formats.pdf
    XML3D    https://github.com/xml3d/xml3d.js/wiki/External-resources

Table B.3. – Specifications used for the data format implementations in NESSie.jl. HMO refers to the original, unmodified format; for the specification of the HMO+ format, see Section 6.2.3. Web resources are shown as of the time of writing (June 2020), with an updated version available in the NESSie documentation.

Figure B.1. – Reaction field potentials for the nonlocal Poisson test model based on the 2lzx dataset embedded into an origin-centered unit sphere, with εΩ = 2, εΣ = 80, ε∞ = 2, and varying λ ((a) λ = 10, (b) λ = 15, (c) λ = 20). The potentials correspond to the first (left) and second (right) test model variants, computed with the Fortran package reported in [151] for a section of the xz-plane (x and z components [Å] vs. reaction field potential [V]).

Figure B.2. – Reaction field potentials for the nonlocal Poisson test model based on the 2lzx dataset embedded into an origin-centered unit sphere, with εΩ = 2, εΣ = 80, ε∞ = 2, and varying λ ((a) λ = 10, (b) λ = 15, (c) λ = 20). The potentials on the left correspond to the second test model variant computed with the Fortran package reported in [151]. The right-hand plots show the results computed with our BEM solvers (x and z components [Å] vs. reaction field potential [V]).

179 Single-precision runtime [s] Double-precision runtime [s] Dataset Matrix min max std avg min max std avg 2lzx-5k V 0.35 0.48 0.044 0.43 1.70 1.73 0.011 1.71 K 0.34 0.43 0.042 0.37 1.34 1.57 0.076 1.41 VR 0.32 0.33 0.003 0.32 0.59 0.78 0.069 0.67 KR 0.41 0.43 0.006 0.42 0.64 0.69 0.013 0.66 local 0.33 0.36 0.010 0.35 1.40 1.58 0.053 1.44 nonlocal 1.44 1.88 0.147 1.60 4.23 4.46 0.097 4.32 2lzx-20k V 5.86 6.49 0.227 6.07 26.57 27.51 0.359 26.91 K 5.45 5.97 0.198 5.63 21.13 21.57 0.131 21.28 VR 4.99 5.05 0.019 5.02 9.23 9.52 0.086 9.32 KR 6.49 6.54 0.015 6.50 10.03 10.27 0.069 10.08 local 5.46 5.57 0.032 5.49 21.10 21.29 0.057 21.17 nonlocal 22.87 23.05 0.059 22.92 67.13 68.17 0.314 67.39 2lzx-80k V 91.97 92.51 0.179 92.17 413.16 414.46 0.425 413.76 K 85.81 86.78 0.283 86.00 332.42 335.64 0.924 333.99 VR 79.91 80.49 0.164 80.17 147.32 147.76 0.174 147.48 KR 103.83 104.14 0.114 103.96 160.25 166.92 2.425 163.20 local 86.41 86.73 0.107 86.54 333.22 358.59 8.007 342.47 nonlocal 363.56 365.80 0.750 364.08 ––– (DNF) 2ptc-9k V 1.27 1.64 0.135 1.35 6.04 6.35 0.102 6.11 K 1.19 1.24 0.015 1.21 4.85 5.65 0.227 5.05 VR 1.13 1.35 0.067 1.18 2.09 2.10 0.005 2.09 KR 1.46 1.48 0.007 1.47 2.27 2.30 0.011 2.28 local 1.18 1.23 0.013 1.20 4.84 4.91 0.022 4.86 nonlocal 5.06 5.81 0.228 5.27 15.27 15.32 0.018 15.29 2ptc-19k V 4.81 5.02 0.078 4.90 23.23 24.52 0.398 23.46 K 4.52 5.39 0.246 4.75 18.52 18.61 0.033 18.55 VR 4.33 4.37 0.012 4.35 8.02 8.20 0.058 8.07 KR 5.62 5.69 0.020 5.65 8.73 8.78 0.018 8.75 local 4.54 4.59 0.017 4.56 18.50 18.59 0.038 18.54 nonlocal 19.38 19.86 0.165 19.51 58.78 60.74 0.679 59.53 2ptc-74k V 80.13 80.89 0.216 80.36 361.73 365.48 1.401 363.16 K 74.64 74.99 0.121 74.81 291.23 292.82 0.498 291.72 VR 69.46 69.81 0.112 69.62 127.90 129.98 0.602 128.35 KR 90.23 91.26 0.361 90.59 139.68 140.27 0.169 139.98 local 74.84 75.20 0.118 75.03 291.19 298.63 3.025 294.32 nonlocal 314.65 315.30 0.175 314.88 924.06 949.48 9.540 930.26 1c4k-15k V 3.21 3.45 0.086 3.29 15.67 16.52 0.268 15.99 K 2.99 3.02 0.008 3.01 12.42 12.62 0.076 12.49 VR 2.90 3.42 0.165 2.99 5.38 5.51 0.045 5.43 KR 3.77 4.05 0.083 3.85 5.86 6.07 0.066 5.91 local 3.00 3.04 0.011 3.02 12.51 12.59 0.026 12.54 nonlocal 12.88 12.97 0.027 12.93 39.63 39.82 0.060 39.75 1c4k-60k V 49.97 51.12 0.345 50.18 241.25 242.55 0.370 241.66 K 47.10 47.31 0.078 47.20 194.32 197.91 1.094 195.17 VR 45.79 45.87 0.029 45.82 84.50 84.79 0.092 84.64 KR 59.42 60.19 0.224 59.58 92.02 92.95 0.264 92.27 local 47.36 47.59 0.073 47.44 194.16 194.77 0.178 194.31 nonlocal 203.14 203.82 0.188 203.43 614.67 617.77 0.889 615.65

Table B.4. – Runtime evaluation for the CPU-based matrix-vector product implementations in AnyBEM on the first workstation (8 threads). Computations exceeding 1,000 seconds are marked as did-not-finish (DNF).

180 Appendix B Additional tables and figures Single-precision runtime [s] Double-precision runtime [s] Dataset Matrix min max std avg min max std avg 2lzx-5k V 0.25 0.28 0.008 0.26 0.40 0.41 0.005 0.40 K 0.21 0.22 0.003 0.21 0.31 0.32 0.003 0.32 VR 0.18 0.18 0.001 0.18 0.21 0.23 0.004 0.22 KR 0.19 0.20 0.002 0.19 0.23 0.24 0.003 0.23 local 0.21 0.22 0.004 0.21 0.31 0.32 0.003 0.32 nonlocal 0.84 0.86 0.007 0.85 1.15 1.31 0.047 1.18 2lzx-20k V 3.86 3.90 0.011 3.88 6.10 6.17 0.025 6.12 K 3.15 3.16 0.003 3.15 4.80 4.92 0.040 4.83 VR 2.69 2.74 0.014 2.70 3.28 3.30 0.007 3.29 KR 2.93 3.26 0.101 2.97 3.50 3.54 0.009 3.51 local 3.14 3.24 0.030 3.16 4.80 4.87 0.022 4.82 nonlocal 12.76 12.80 0.012 12.78 17.70 17.75 0.014 17.73 2lzx-80k V 60.66 60.72 0.018 60.69 96.72 97.03 0.096 96.88 K 49.22 49.36 0.049 49.28 75.80 76.41 0.161 76.00 VR 42.97 43.31 0.108 43.05 52.40 52.52 0.041 52.47 KR 46.82 46.94 0.042 46.89 55.92 56.15 0.066 55.97 local 49.19 49.41 0.072 49.25 75.77 76.02 0.093 75.87 nonlocal 201.48 201.92 0.133 201.67 280.87 281.97 0.328 281.22 2ptc-9k V 0.89 0.91 0.005 0.90 1.40 1.41 0.004 1.40 K 0.73 0.77 0.012 0.74 1.10 1.12 0.005 1.11 VR 0.61 0.63 0.005 0.62 0.75 0.76 0.004 0.76 KR 0.67 0.68 0.003 0.67 0.80 0.81 0.003 0.80 local 0.73 0.73 0.001 0.73 1.10 1.12 0.004 1.11 nonlocal 2.94 3.21 0.085 2.97 4.06 4.07 0.005 4.07 2ptc-19k V 3.38 3.51 0.038 3.40 5.32 5.36 0.011 5.34 K 2.76 2.77 0.005 2.76 4.21 4.30 0.031 4.23 VR 2.34 2.44 0.030 2.36 2.85 2.89 0.011 2.86 KR 2.55 2.56 0.003 2.56 3.05 3.07 0.007 3.05 local 2.76 2.76 0.003 2.76 4.20 4.27 0.019 4.22 nonlocal 11.14 11.32 0.053 11.17 15.49 15.57 0.025 15.54 2ptc-74k V 52.79 52.99 0.057 52.88 84.90 85.21 0.092 85.04 K 42.93 43.01 0.021 42.96 66.29 66.63 0.106 66.42 VR 37.31 37.48 0.055 37.36 45.51 45.80 0.093 45.61 KR 40.66 40.75 0.029 40.70 48.61 48.84 0.079 48.68 local 42.90 43.08 0.063 42.98 66.15 66.30 0.050 66.24 nonlocal 175.25 175.91 0.203 175.49 245.39 246.31 0.269 245.81 1c4k-15k V 2.27 2.29 0.006 2.28 3.58 3.72 0.042 3.61 K 1.85 1.86 0.004 1.86 2.83 2.86 0.007 2.84 VR 1.57 1.58 0.005 1.58 1.92 1.96 0.013 1.93 KR 1.71 1.81 0.030 1.73 2.05 2.07 0.006 2.06 local 1.85 1.86 0.002 1.86 2.82 2.84 0.004 2.83 nonlocal 7.50 7.52 0.008 7.51 10.40 10.46 0.025 10.42 1c4k-60k V 35.19 35.43 0.072 35.26 56.38 56.74 0.122 56.55 K 28.75 28.85 0.029 28.79 44.21 44.43 0.068 44.27 VR 24.67 24.79 0.041 24.73 30.09 30.26 0.048 30.15 KR 26.89 26.96 0.020 26.92 32.14 32.21 0.019 32.16 local 28.72 28.84 0.031 28.78 44.17 44.38 0.069 44.25 nonlocal 116.53 116.73 0.068 116.64 163.00 163.33 0.099 163.14 6ds5-170k V 276.32 276.88 0.178 276.49 446.32 447.22 0.281 446.59 K 225.38 225.85 0.152 225.68 349.62 350.11 0.152 349.81 VR 196.03 196.43 0.116 196.13 239.15 239.65 0.169 239.37 KR 213.68 214.20 0.157 213.82 255.47 256.00 0.181 255.72 local 225.12 225.50 0.144 225.27 349.64 350.66 0.346 350.06 nonlocal 919.11 927.78 2.622 920.37 ––– (DNF) 2h1l-200k V 382.68 383.32 0.174 382.91 617.29 618.48 0.418 617.79 K 312.70 312.94 0.090 312.78 484.95 488.13 0.972 485.48 VR 272.34 272.69 0.123 272.48 333.11 334.48 0.389 333.53 KR 296.81 297.10 0.084 296.93 352.49 353.04 0.156 352.75 local 312.41 312.80 0.117 312.65 485.58 486.67 0.304 485.90 nonlocal – – – (DNF) ––– (DNF)

Table B.5. – Runtime evaluation for the CPU-based matrix-vector product implementations in AnyBEM on the second workstation (64 threads). Computations exceeding 1,000 seconds are marked as did-not-finish (DNF).

181 Single-precision runtime [s] Double-precision runtime [s] Dataset Matrix min max std avg min max std avg 2lzx-5k V 0.02 0.03 0.003 0.02 0.21 0.21 0.000 0.21 K 0.02 0.02 0.000 0.02 0.17 0.17 0.000 0.17 VR 0.01 0.01 0.000 0.01 0.07 0.07 0.000 0.07 KR 0.01 0.01 0.000 0.01 0.07 0.07 0.000 0.07 local 0.02 0.02 0.000 0.02 0.17 0.17 0.000 0.17 nonlocal 0.06 0.08 0.005 0.07 0.53 0.53 0.000 0.53 2lzx-20k V 0.21 0.21 0.001 0.21 3.35 3.35 0.000 3.35 K 0.19 0.19 0.001 0.19 2.70 2.70 0.000 2.70 VR 0.18 0.18 0.001 0.18 1.11 1.11 0.000 1.11 KR 0.18 0.18 0.001 0.18 1.18 1.18 0.000 1.18 local 0.19 0.20 0.001 0.19 2.70 2.70 0.000 2.70 nonlocal 0.79 0.80 0.002 0.80 8.34 8.34 0.000 8.34 2lzx-80k V 4.50 4.53 0.011 4.51 49.73 50.09 0.121 49.92 K 4.37 4.40 0.010 4.38 39.45 41.26 0.613 39.91 VR 4.34 4.38 0.011 4.36 17.71 17.72 0.003 17.72 KR 4.37 4.39 0.009 4.38 18.83 18.83 0.002 18.83 local 4.37 4.40 0.008 4.39 39.48 40.75 0.353 39.81 nonlocal 18.02 18.09 0.020 18.06 126.03 126.54 0.145 126.28 2ptc-9k V 0.05 0.06 0.005 0.05 0.79 0.79 0.000 0.79 K 0.04 0.04 0.000 0.04 0.64 0.64 0.000 0.64 VR 0.02 0.02 0.000 0.02 0.26 0.26 0.000 0.26 KR 0.02 0.02 0.000 0.02 0.27 0.27 0.000 0.27 local 0.04 0.04 0.000 0.04 0.64 0.64 0.000 0.64 nonlocal 0.14 0.17 0.011 0.14 1.96 1.98 0.009 1.97 2ptc-19k V 0.17 0.17 0.001 0.17 3.00 3.00 0.000 3.00 K 0.15 0.15 0.001 0.15 2.41 2.43 0.009 2.42 VR 0.12 0.12 0.000 0.12 1.02 1.02 0.000 1.02 KR 0.12 0.12 0.001 0.12 1.09 1.09 0.000 1.09 local 0.15 0.15 0.001 0.15 2.43 2.43 0.000 2.43 nonlocal 0.64 0.65 0.001 0.65 7.57 7.63 0.020 7.62 2ptc-74k V 3.73 3.75 0.005 3.74 44.01 44.29 0.080 44.18 K 3.60 3.62 0.005 3.61 35.05 35.26 0.080 35.16 VR 3.63 3.67 0.011 3.65 15.46 15.48 0.007 15.47 KR 3.66 3.69 0.009 3.67 16.40 16.42 0.005 16.41 local 3.60 3.63 0.009 3.61 35.10 35.38 0.084 35.22 nonlocal 14.94 15.01 0.024 14.98 110.98 111.61 0.227 111.27 1c4k-15k V 0.09 0.10 0.004 0.09 1.90 1.90 0.000 1.90 K 0.08 0.08 0.000 0.08 1.53 1.53 0.000 1.53 VR 0.06 0.06 0.000 0.06 0.63 0.63 0.000 0.63 KR 0.06 0.06 0.000 0.06 0.67 0.67 0.000 0.67 local 0.08 0.08 0.000 0.08 1.53 1.53 0.000 1.53 nonlocal 0.32 0.32 0.001 0.32 4.73 4.73 0.000 4.73 1c4k-60k V 2.42 2.48 0.026 2.45 28.80 29.01 0.082 28.86 K 2.33 2.37 0.013 2.34 22.84 22.97 0.033 22.91 VR 2.31 2.37 0.016 2.33 10.07 10.07 0.001 10.07 KR 2.34 2.37 0.010 2.35 10.67 10.67 0.001 10.67 local 2.33 2.37 0.013 2.34 22.87 23.03 0.065 22.95 nonlocal 9.65 9.75 0.031 9.69 72.52 72.67 0.048 72.61 6ds5-170k V 20.75 20.89 0.044 20.80 231.60 232.37 0.267 231.99 K 20.14 20.25 0.045 20.20 184.85 185.68 0.258 185.32 VR 21.26 21.32 0.023 21.29 80.35 80.37 0.007 80.36 KR 21.42 21.49 0.026 21.46 85.15 85.15 0.001 85.15 local 20.15 20.30 0.044 20.24 184.79 185.74 0.307 185.16 nonlocal 88.94 89.21 0.110 89.04 581.91 584.16 0.645 583.20 2h1l-200k V 28.22 28.41 0.063 28.33 314.78 315.74 0.361 315.30 K 27.56 27.68 0.040 27.62 249.79 250.82 0.292 250.23 VR 28.18 28.23 0.017 28.20 111.74 111.80 0.026 111.77 KR 28.31 28.41 0.028 28.37 118.02 118.07 0.020 118.03 local 27.57 27.72 0.047 27.61 249.71 250.84 0.356 250.17 nonlocal 113.11 113.34 0.077 113.25 795.23 796.76 0.422 795.94

Table B.6. – Runtime evaluation for the GPU-based matrix-vector product implementations in CuNESSie on the first workstation (RTX 2070 SUPER)

182 Appendix B Additional tables and figures Single-precision runtime [s] Double-precision runtime [s] Dataset Matrix min max std avg min max std avg 2lzx-5k V 0.04 0.04 0.002 0.04 0.05 0.05 0.000 0.05 K 0.03 0.03 0.000 0.03 0.04 0.04 0.000 0.04 VR 0.02 0.02 0.000 0.02 0.02 0.02 0.000 0.02 KR 0.02 0.02 0.000 0.02 0.02 0.02 0.000 0.02 local 0.03 0.03 0.000 0.03 0.04 0.04 0.000 0.04 nonlocal 0.10 0.11 0.004 0.10 0.13 0.13 0.000 0.13 2lzx-20k V 0.16 0.16 0.001 0.16 0.23 0.23 0.000 0.23 K 0.13 0.13 0.000 0.13 0.18 0.18 0.001 0.18 VR 0.09 0.09 0.001 0.09 0.16 0.16 0.001 0.16 KR 0.08 0.08 0.001 0.08 0.16 0.16 0.000 0.16 local 0.13 0.13 0.000 0.13 0.18 0.18 0.001 0.18 nonlocal 0.48 0.49 0.001 0.49 0.74 0.75 0.002 0.75 2lzx-80k V 2.73 2.86 0.042 2.79 3.61 3.70 0.024 3.65 K 2.65 2.73 0.030 2.68 3.46 3.51 0.017 3.49 VR 2.51 2.61 0.025 2.56 3.06 3.10 0.014 3.08 KR 2.54 2.62 0.023 2.57 3.07 3.11 0.013 3.09 local 2.63 2.74 0.037 2.67 3.44 3.51 0.024 3.48 nonlocal 10.30 10.45 0.040 10.36 13.24 13.31 0.023 13.28 2ptc-9k V 0.07 0.08 0.002 0.07 0.09 0.09 0.000 0.09 K 0.05 0.05 0.000 0.05 0.07 0.08 0.000 0.07 VR 0.03 0.03 0.000 0.03 0.04 0.04 0.000 0.04 KR 0.03 0.03 0.000 0.03 0.04 0.04 0.000 0.04 local 0.05 0.05 0.000 0.05 0.07 0.08 0.000 0.08 nonlocal 0.19 0.21 0.006 0.19 0.25 0.26 0.002 0.25 2ptc-19k V 0.15 0.15 0.001 0.15 0.21 0.22 0.001 0.21 K 0.12 0.12 0.000 0.12 0.17 0.17 0.001 0.17 VR 0.08 0.08 0.000 0.08 0.13 0.14 0.001 0.14 KR 0.08 0.08 0.000 0.08 0.13 0.14 0.001 0.14 local 0.12 0.15 0.011 0.12 0.17 0.17 0.001 0.17 nonlocal 0.43 0.43 0.001 0.43 0.66 0.66 0.001 0.66 2ptc-74k V 2.35 2.42 0.024 2.38 3.29 3.32 0.010 3.30 K 2.26 2.33 0.021 2.29 3.00 3.08 0.021 3.05 VR 2.10 2.14 0.012 2.11 2.62 2.65 0.009 2.63 KR 2.14 2.21 0.028 2.16 2.63 2.66 0.008 2.65 local 2.27 2.33 0.018 2.29 3.01 3.14 0.034 3.06 nonlocal 8.96 9.14 0.061 9.01 11.62 11.74 0.042 11.69 1c4k-15k V 0.12 0.12 0.002 0.12 0.17 0.17 0.001 0.17 K 0.09 0.10 0.000 0.09 0.13 0.13 0.001 0.13 VR 0.06 0.06 0.000 0.06 0.07 0.07 0.000 0.07 KR 0.06 0.06 0.000 0.06 0.07 0.07 0.000 0.07 local 0.09 0.10 0.000 0.10 0.13 0.13 0.001 0.13 nonlocal 0.33 0.33 0.000 0.33 0.45 0.45 0.001 0.45 1c4k-60k V 1.41 1.41 0.002 1.41 2.53 2.55 0.006 2.54 K 1.36 1.38 0.004 1.37 2.27 2.31 0.013 2.29 VR 1.32 1.32 0.002 1.32 1.71 1.75 0.013 1.73 KR 1.33 1.34 0.003 1.34 1.71 1.75 0.011 1.73 local 1.37 1.38 0.004 1.37 2.18 2.32 0.041 2.28 nonlocal 5.57 5.59 0.007 5.58 7.99 8.13 0.047 8.04 6ds5-170k V 11.38 11.46 0.029 11.43 15.90 16.05 0.047 16.00 K 11.24 11.46 0.066 11.31 15.55 15.95 0.102 15.76 VR 10.69 10.77 0.024 10.73 14.53 14.65 0.042 14.59 KR 10.96 11.02 0.020 10.99 14.58 14.79 0.087 14.67 local 11.29 11.38 0.028 11.33 15.59 15.81 0.069 15.71 nonlocal 45.21 45.45 0.069 45.31 60.29 60.89 0.180 60.53 2h1l-200k V 15.75 15.88 0.040 15.80 21.85 21.98 0.044 21.91 K 15.63 15.75 0.035 15.68 21.42 21.66 0.078 21.54 VR 15.26 15.37 0.037 15.31 19.89 19.99 0.029 19.91 KR 15.52 15.59 0.023 15.55 19.97 20.06 0.030 20.02 local 15.67 15.75 0.025 15.69 21.38 21.60 0.080 21.49 nonlocal 63.72 63.95 0.069 63.84 83.34 83.85 0.162 83.62

Table B.7. – Runtime evaluation for the GPU-based matrix-vector product implementations in CuNESSie on the second workstation (TITAN V)

183 Single-precision runtime [s] Double-precision runtime [s] Dataset Matrix min max std avg min max std avg 2lzx-5k V 0.02 0.02 0.002 0.02 0.21 0.21 0.000 0.21 K 0.01 0.01 0.000 0.01 0.17 0.17 0.000 0.17 VR 0.01 0.01 0.000 0.01 0.07 0.07 0.000 0.07 KR 0.01 0.01 0.000 0.01 0.08 0.08 0.000 0.08 local 0.01 0.01 0.000 0.01 0.17 0.17 0.000 0.17 nonlocal 0.06 0.06 0.000 0.06 0.54 0.54 0.000 0.54 2lzx-20k V 0.10 0.12 0.004 0.11 3.27 3.29 0.012 3.28 K 0.08 0.08 0.000 0.08 2.65 2.65 0.000 2.65 VR 0.19 0.20 0.000 0.19 1.17 1.17 0.000 1.17 KR 0.19 0.19 0.000 0.19 1.32 1.33 0.005 1.32 local 0.08 0.08 0.000 0.08 2.69 2.69 0.001 2.69 nonlocal 0.57 0.58 0.002 0.57 8.54 8.55 0.001 8.54 2lzx-80k V 1.45 1.46 0.003 1.45 49.95 50.14 0.070 50.03 K 1.02 1.03 0.004 1.02 39.43 39.47 0.015 39.44 VR 3.13 3.16 0.012 3.15 18.99 19.14 0.074 19.08 KR 3.16 3.19 0.008 3.18 21.53 21.54 0.001 21.54 local 1.06 1.07 0.003 1.06 39.48 39.85 0.159 39.62 nonlocal 8.93 8.94 0.004 8.93 130.37 130.65 0.092 130.48 2ptc-9k V 0.04 0.05 0.002 0.04 0.79 0.79 0.000 0.79 K 0.03 0.03 0.000 0.03 0.64 0.64 0.000 0.64 VR 0.05 0.05 0.000 0.05 0.27 0.27 0.000 0.27 KR 0.05 0.05 0.000 0.05 0.31 0.31 0.000 0.31 local 0.03 0.03 0.000 0.03 0.64 0.64 0.000 0.64 nonlocal 0.17 0.17 0.000 0.17 2.00 2.02 0.004 2.02 2ptc-19k V 0.10 0.11 0.004 0.10 2.98 2.98 0.000 2.98 K 0.07 0.07 0.000 0.07 2.39 2.41 0.009 2.40 VR 0.19 0.19 0.000 0.19 1.08 1.09 0.004 1.09 KR 0.19 0.19 0.000 0.19 1.22 1.22 0.000 1.22 local 0.08 0.08 0.000 0.08 2.42 2.42 0.000 2.42 nonlocal 0.55 0.55 0.000 0.55 7.82 7.82 0.001 7.82 2ptc-74k V 1.31 1.31 0.001 1.31 44.01 44.26 0.088 44.06 K 0.92 0.92 0.001 0.92 35.08 35.17 0.031 35.10 VR 2.78 2.78 0.001 2.78 16.56 16.70 0.065 16.62 KR 2.78 2.81 0.011 2.79 18.78 18.79 0.001 18.79 local 0.94 0.94 0.001 0.94 35.16 35.55 0.156 35.29 nonlocal 7.86 7.87 0.002 7.86 114.97 115.08 0.035 115.03 1c4k-15k V 0.07 0.08 0.006 0.07 1.87 1.88 0.007 1.88 K 0.05 0.05 0.000 0.05 1.52 1.52 0.000 1.52 VR 0.12 0.12 0.000 0.12 0.67 0.67 0.000 0.67 KR 0.12 0.12 0.000 0.12 0.75 0.75 0.000 0.75 local 0.05 0.05 0.000 0.05 1.53 1.53 0.000 1.53 nonlocal 0.36 0.36 0.000 0.36 4.88 4.88 0.000 4.88 1c4k-60k V 0.87 0.87 0.000 0.87 28.90 28.90 0.000 28.90 K 0.61 0.61 0.000 0.61 22.87 22.87 0.000 22.87 VR 1.83 1.84 0.002 1.84 10.77 10.83 0.017 10.78 KR 1.84 1.84 0.003 1.84 12.21 12.21 0.001 12.21 local 0.62 0.62 0.001 0.62 22.87 23.14 0.087 23.02 nonlocal 5.17 5.18 0.003 5.17 74.99 75.02 0.011 75.02 6ds5-170k V 6.62 6.64 0.006 6.62 231.48 232.59 0.345 232.20 K 4.58 4.59 0.003 4.59 184.10 184.74 0.198 184.24 VR 14.54 14.54 0.002 14.54 86.63 86.71 0.023 86.70 KR 14.54 14.54 0.001 14.54 97.57 97.57 0.001 97.57 local 4.65 4.66 0.004 4.66 184.61 185.36 0.293 184.90 nonlocal 40.31 40.33 0.005 40.32 601.23 602.57 0.403 602.04 2h1l-200k V 9.12 9.13 0.004 9.12 316.08 316.77 0.236 316.49 K 6.30 6.31 0.002 6.31 249.33 250.40 0.312 249.82 VR 20.00 20.01 0.002 20.00 120.17 120.18 0.001 120.17 KR 20.00 20.02 0.008 20.01 135.23 135.23 0.001 135.23 local 6.40 6.41 0.003 6.41 251.24 252.08 0.287 251.69 nonlocal 55.45 55.48 0.008 55.46 824.95 826.70 0.456 825.79

Table B.8. – Runtime evaluation for the GPU-based matrix-vector product implementations in AnyBEM on the first workstation (RTX 2070 SUPER)

184 Appendix B Additional tables and figures Single-precision runtime [s] Double-precision runtime [s] Dataset Matrix min max std avg min max std avg 2lzx-5k V 0.03 0.03 0.002 0.03 0.04 0.05 0.002 0.04 K 0.02 0.02 0.000 0.02 0.03 0.04 0.000 0.04 VR 0.02 0.02 0.000 0.02 0.02 0.03 0.000 0.03 KR 0.02 0.02 0.000 0.02 0.03 0.03 0.000 0.03 local 0.02 0.02 0.000 0.02 0.04 0.04 0.000 0.04 nonlocal 0.08 0.08 0.000 0.08 0.13 0.13 0.000 0.13 2lzx-20k V 0.12 0.12 0.000 0.12 0.22 0.23 0.000 0.22 K 0.09 0.09 0.000 0.09 0.18 0.18 0.000 0.18 VR 0.19 0.19 0.000 0.19 0.50 0.50 0.000 0.50 KR 0.19 0.19 0.000 0.19 0.49 0.50 0.000 0.50 local 0.09 0.09 0.000 0.09 0.18 0.18 0.000 0.18 nonlocal 0.59 0.59 0.000 0.59 1.39 1.39 0.001 1.39 2lzx-80k V 0.96 0.96 0.000 0.96 2.86 2.88 0.008 2.87 K 0.71 0.71 0.000 0.71 2.16 2.18 0.006 2.16 VR 4.33 4.41 0.023 4.37 15.07 15.89 0.303 15.51 KR 4.24 4.30 0.020 4.27 13.89 13.94 0.017 13.92 local 0.71 0.71 0.000 0.71 2.15 2.16 0.003 2.16 nonlocal 10.19 10.26 0.020 10.22 34.41 35.02 0.237 34.84 2ptc-9k V 0.05 0.05 0.000 0.05 0.09 0.09 0.000 0.09 K 0.04 0.04 0.000 0.04 0.07 0.07 0.000 0.07 VR 0.04 0.04 0.000 0.04 0.09 0.09 0.000 0.09 KR 0.04 0.04 0.000 0.04 0.09 0.09 0.000 0.09 local 0.04 0.04 0.000 0.04 0.07 0.07 0.000 0.07 nonlocal 0.18 0.19 0.000 0.19 0.33 0.33 0.000 0.33 2ptc-19k V 0.11 0.11 0.000 0.11 0.21 0.21 0.000 0.21 K 0.09 0.09 0.000 0.09 0.16 0.16 0.000 0.16 VR 0.17 0.17 0.000 0.17 0.39 0.39 0.000 0.39 KR 0.17 0.17 0.000 0.17 0.39 0.39 0.000 0.39 local 0.09 0.09 0.000 0.09 0.16 0.16 0.000 0.16 nonlocal 0.54 0.54 0.000 0.54 1.15 1.15 0.000 1.15 2ptc-74k V 0.89 0.89 0.000 0.89 2.65 2.67 0.005 2.66 K 0.65 0.65 0.000 0.65 1.99 2.01 0.006 2.01 VR 3.67 3.78 0.036 3.73 12.14 12.93 0.219 12.61 KR 3.57 3.67 0.034 3.62 12.87 12.93 0.018 12.91 local 0.66 0.66 0.000 0.66 1.99 2.01 0.006 2.00 nonlocal 8.70 8.84 0.048 8.79 30.25 30.57 0.118 30.37 1c4k-15k V 0.09 0.09 0.000 0.09 0.17 0.17 0.000 0.17 K 0.07 0.07 0.000 0.07 0.13 0.13 0.000 0.13 VR 0.11 0.11 0.001 0.11 0.22 0.22 0.000 0.22 KR 0.11 0.11 0.001 0.11 0.22 0.22 0.000 0.22 local 0.07 0.07 0.000 0.07 0.13 0.13 0.000 0.13 nonlocal 0.39 0.39 0.000 0.39 0.74 0.74 0.000 0.74 1c4k-60k V 0.58 0.58 0.000 0.58 2.14 2.15 0.003 2.14 K 0.43 0.43 0.000 0.43 1.57 1.59 0.007 1.58 VR 2.22 2.30 0.029 2.26 7.68 7.72 0.015 7.69 KR 2.14 2.23 0.027 2.18 7.64 7.67 0.009 7.66 local 0.44 0.44 0.000 0.44 1.56 1.59 0.007 1.57 nonlocal 5.33 5.49 0.050 5.42 19.17 19.24 0.022 19.21 6ds5-170k V 5.34 5.68 0.108 5.45 14.82 15.05 0.069 14.90 K 3.94 4.06 0.050 3.99 11.20 11.39 0.059 11.27 VR 20.10 20.73 0.185 20.37 62.12 62.81 0.254 62.50 KR 19.61 20.02 0.132 19.77 59.79 60.33 0.185 60.07 local 3.96 4.08 0.053 4.02 11.15 11.25 0.033 11.22 nonlocal 47.60 48.34 0.230 47.96 149.05 149.86 0.337 149.53 2h1l-200k V 7.21 7.22 0.002 7.22 18.02 18.05 0.011 18.03 K 5.30 5.30 0.002 5.30 13.51 13.54 0.009 13.52 VR 28.28 28.77 0.150 28.46 85.70 86.91 0.358 86.46 KR 27.61 28.13 0.176 27.83 85.86 86.53 0.207 86.22 local 5.33 5.35 0.006 5.34 13.47 13.52 0.017 13.50 nonlocal 67.07 67.98 0.278 67.62 205.25 206.76 0.491 205.75

Table B.9. – Runtime evaluation for the GPU-based matrix-vector product implementations in AnyBEM on the second workstation (TITAN V)


List of Figures

2.1 Rotational flexibility of amino acids ...... 6
2.2 Protein complex between trypsin and BPTI ...... 8
2.3 Schematic domain decomposition for the cavity model ...... 12
2.4 Water network around the protein surface ...... 14

3.1 Schematic illustration of the internal and external trace operators ...... 21

6.1 System model provided by NESSie ...... 99
6.2 Example workflow for generating a NESSie-compatible system model ...... 101
6.3 Format conversion options for system models in NESSie ...... 104
6.4 Common BEM solver result type in NESSie ...... 107
6.5 Schematic representation of the test models available in NESSie ...... 108

7.1 Partition scheme for the CUDA-accelerated matrix-vector product ...... 119
7.2 Memory layouts for the internal representation of position vectors ...... 121
7.3 Memory layouts for the internal representation of triangle vectors ...... 122

8.1 Primitive-based triangle vectors for C++ and Impala ...... 128

9.1 Wireframe representations of small surface meshes ...... 141
9.2 Visualization of three small PDB structures and their surface meshes ...... 144
9.3 Visualization of two large protein-based surface meshes ...... 145
9.4 Exterior electrostatic potentials of different Born ions in water ...... 147
9.5 Reaction field potentials for Poisson test models ...... 149
9.6 Reaction field potentials for the nonlocal Poisson test model ...... 151
9.7 Electrostatic potentials for the nonlocal Poisson test model ...... 152
9.8 Runtime scaling of matrix-vector products in NESSie ...... 155
9.9 Runtime scaling of CPU-based matrix-vector products in AnyBEM ...... 156
9.10 Runtime performance of CPU-based matrix-vector products in AnyBEM ...... 157
9.11 Runtime performance of single-precision GPU matrix-vector products ...... 160
9.12 Runtime performance of double-precision GPU matrix-vector products ...... 162
9.13 Single-thread performance of the CPU-based nonlocal BEM solvers ...... 164
9.14 Convergence behavior of iterative solvers for nonlocal BEM systems ...... 166

9.15 Elapsed solving time for BEM systems with CuNESSie ...... 169

B.1 Comparison of the nonlocal Poisson test model variants ...... 178
B.2 Evaluation of the second nonlocal Poisson test model ...... 179

List of Tables

6.1 NESSie.jl modules ...... 98

9.1 Technical specifications of the test machines ...... 138
9.2 Package versions used for the performance analyses ...... 138
9.3 Born ions available through NESSie.jl ...... 140
9.4 Prepared PDB structures used for the surface mesh generators ...... 142
9.5 Protein surface meshes used for the performance analyses ...... 142
9.6 Memory requirements of the interaction matrices ...... 153
9.7 Per-thread CUDA registers required for the matrix-vector products ...... 159
9.8 Nonlocal BEM solver performance with CuNESSie ...... 167
9.9 Elapsed solving time for BEM systems with CuNESSie ...... 170

B.1 Input formats supported by NESSie.jl ...... 177
B.2 Output formats supported by NESSie.jl ...... 177
B.3 Specifications for the data formats in NESSie.jl ...... 177
B.4 CPU-based matrix-vector products in AnyBEM (first workstation) ...... 180
B.5 CPU-based matrix-vector products in AnyBEM (second workstation) ...... 181
B.6 GPU-based matrix-vector products in CuNESSie (RTX 2070 SUPER) ...... 182
B.7 GPU-based matrix-vector products in CuNESSie (TITAN V) ...... 183
B.8 GPU-based matrix-vector products in AnyBEM (RTX 2070 SUPER) ...... 184
B.9 GPU-based matrix-vector products in AnyBEM (TITAN V) ...... 185


Listings

4.1 Computation of the nonlocal reaction field energy using NESSie.jl ...... 39
4.2 Compilation overhead for repeated function calls in Julia ...... 40
4.3 Code example for pragma-style multithreading support in Julia ...... 41
4.4 Direct style vs. continuation-passing style in Impala ...... 43

5.1 Minimal working example for an implicit null matrix in Julia ...... 68
5.2 Using the null matrix type with standard functions in Julia ...... 68
5.3 Custom product operator for null matrices in Julia ...... 69
5.4 Using the null matrix product in Julia ...... 70
5.5 Minimal working example for a fixed-value array type in Julia ...... 70
5.6 Examples for fixed-value arrays with different element types in Julia ...... 71
5.7 Minimal working example for an interaction matrix type in Julia ...... 71
5.8 Examples for different interaction matrices in Julia ...... 72
5.9 Type-safe function objects in Julia ...... 73
5.10 Interaction matrices with type-safe interaction functions in Julia ...... 73
5.11 Minimal working example for a generic block matrix in Julia ...... 75
5.12 Example for a block matrix with elements of different types in Julia ...... 76
5.13 Minimal working example for a row projection matrix in Julia ...... 76
5.14 Example for a row projection matrix in Julia ...... 77
5.15 Minimal concept draft for interaction matrix K in Julia ...... 78
5.16 Minimal concept draft for the first local BEM system matrix in Julia ...... 79
5.17 Minimal concept draft for the nonlocal BEM system matrix in Julia ...... 80
5.18 Built-in Buffer type for generic memory management in Impala ...... 84
5.19 Minimal working example for a real-valued vector in Impala ...... 84
5.20 Examples for positions and vectors of positions in Impala ...... 86
5.21 Examples for triangles and triangle vectors in Impala ...... 87
5.22 Host-specific context manager for real-valued vectors in Impala ...... 88
5.23 Minimal working example for a real-valued matrix in Impala ...... 89
5.24 Concept draft for potential matrix K as RealMatrix in Impala ...... 89
5.25 Concept draft for a dedicated potential matrix type in Impala ...... 90
5.26 Minimal concept draft for the local or nonlocal BEM matrix in Impala ...... 92
5.27 Concept draft for a linear algebra library LAOps in Impala ...... 93
5.28 Concept draft for a matrix-free solver library LASolver in Impala ...... 94

5.29 Example usage of linear algebra libraries in Impala ...... 95

6.1 Function specializations for different NESSie.jl model types ...... 100
6.2 Using different NESSie.jl models with specialized functions ...... 100
6.3 Structure of the HMO+ format for triangulated surfaces in NESSie.jl ...... 103
6.4 Simple parallel matrix-vector product in Julia ...... 106

7.1 Concept draft for a CuArray-compatible kernel function in Julia ...... 113
7.2 Kernel call example with a CuArray parameter in Julia ...... 113
7.3 Concept draft for device-compatible position vectors in Julia ...... 115
7.4 Example usage of named tuples in Julia ...... 115
7.5 Concept draft for device-compatible triangle vectors in Julia ...... 116
7.6 Automatic host-to-device conversion for position vectors in Julia ...... 117
7.7 Kernel call example for the automatic host-to-device conversion in Julia ...... 117
7.8 CUDA-enabled matrix-vector product for interaction matrices in Julia ...... 120

8.1 Example for cross-language function calls in C++ and Impala ...... 126
8.2 Example Impala application with cross-language function calls ...... 127
8.3 Cross-language system model components for C++ and Impala ...... 129
8.4 Impala's Intrinsics type for platform-specific math routines ...... 130
8.5 Domain-specific math routines through mutually-exclusive definitions ...... 131
8.6 Example for a function using domain-specific math routines in Impala ...... 131
8.7 Multi-level domain-specific math routines in Impala ...... 132
8.8 Buffer-exposing real-valued vector type for Impala ...... 132
8.9 Example for cross-language function calls in C++ and Impala ...... 133
8.10 Domain-specific loop parallelization in Impala ...... 134
8.11 Domain-specific matrix-vector product for matrices in Impala ...... 135
