BigDFT Real-Space approach
Max Webinar Luigi Genovese
Wavelets and ISF
BigDFT GPU
Wavelet Convolutions Optimization Approach BOAST Challenges The operators of BigDFT code. From convolutions Poisson Solver Implicit Solvents to Poisson Solver in High Performance Computing Summary
Luigi Genovese
Laboratoire de Simulation Atomistique - L_Sim
November 12, 2020
Virtual Room BigDFT Real-Space Wavelet families used in BigDFT code approach
Luigi Genovese
Daubechies f (x) = ∑µ cµφµ(x) Interpolatingf (x) = ∑ fj ϕj (x) j Wavelets and ISF Orthogonal set Dual to dirac deltas (. . . is it?) GPU Wavelet Convolutions Optimization Approach R BOAST cµ = dx φµ(x)f (x) ≡ hφµ|f i fj = f (j) Challenges
Poisson Solver LEAST ASYMMETRIC DAUBECHIES-16 INTERPOLATING (DESLAURIERS-DUBUC)-8 1.5 1 scaling function scaling function Implicit Solvents wavelet wavelet 1 Summary 0.5 0.5
0 0
-0.5 -0.5
-1
-1 -1.5 -4 -2 0 2 4 -6 -4 -2 0 2 4 6 No need to calculate overlap The expansion coefficients matrix of basis functions are the point values on a grid Used for wavefunctions, Used for charge density, scalar products function products
Magic Filter method (A.Neelov, S. Goedecker) The passage between the two basis sets can be performed
without losing accuracy cµ = ∑k wµ−k fk , wk = hφ|Lk i. BigDFT Real-Space Operations performed approach
Luigi Genovese
Real Space Daub. Wavelets The SCF cycle Wavelets and ISF GPU
Wavelet Convolutions Orbital scheme: Optimization Approach BOAST • Hamiltonian Challenges • Preconditioner Poisson Solver Implicit Solvents Coefficient Scheme: Summary • Overlap matrices • Orthogonalisation
Comput. operations
• Convolutions • BLAS routines • FFT (Poisson Solver) Why not GPUs? BigDFT Real-Space HPC approach of BigDFT approach
Luigi Genovese
A DFT code conceived with a HPC mindset Wavelets and ISF • DFT calculations up to many thousands atoms GPU Wavelet Convolutions Optimization Approach An award-winning HPC code BOAST Challenges
• BigDFT has been conceived for massively parallel Poisson Solver Implicit Solvents etherogeneous architectures since more than 10 years Summary (MPI + OpenMP + GPU)
Code able to run routinely on different architectures • GPGPU since the advent of double-precision (2009) • Itanium, BG-P and BG-Q (Incite award 2015) • ARM architectures (Mont Blanc I) • KNL Accelerators (Marconi) • K computer (RIKEN) Fugaku – preparatory • European Supercomputers (Archer, IRENE-Rome, . . . ) BigDFT Real-Space Using GPUs in a given (DFT) code approach
Luigi Genovese
Developer and user dilemmas Wavelets and ISF
GPU • Does my code fit well? For which systems? Wavelet Convolutions Optimization Approach • How much does porting costs? BOAST Challenges
• Should I always use GPUs? Poisson Solver
Implicit Solvents • How can I interpret results? Summary
Evaluating GPU convenience Three levels of evaluation 1. Bare speedups: GPU kernels vs. CPU routines Are the operations suitable for GPU? 2. Full code speedup on one process Amdahl’s law: are there hot-spot operations? 3. Speedup in a (massively?) parallel environment The MPI layer adds an extra level of complexity BigDFT Real-Space Case study: 1D convolutions (BigDFT code) approach
Luigi Genovese
Initially, naive routines (FORTRAN?) Wavelets and ISF
do j=1,ndat GPU U do i=0,n1 Wavelet Convolutions tt=0.d0 Optimization Approach y(j,I) = ∑ h`x(I + `,j) BOAST `=L do l=lowfil,lupfil Challenges tt=tt+x(i+l,j)*h(l) Poisson Solver • Easy to write and Implicit Solvents enddo Summary debug y(j,i)=tt • Define reference enddo results enddo Optimisation can then start (2010: X5550,2.67 GHz)
Method GFlop/s % of peak SpeedUp Naive (FORTRAN) 0.54 5.1 1/(6.25) Current (FORTRAN) 3.3 31 1 Best (C, SSE) 7.8 73 2.3 OpenCL (Fermi) 97 20 29 (12.4) BigDFT Real-Space How to optimize? approach
Luigi Genovese
Wavelets and ISF
A trade-off between benefit and effort GPU Wavelet Convolutions Optimization Approach BOAST FORTRAN based Challenges
Poisson Solver
Relatively accessible (loop unrolling) Implicit Solvents Summary Moderate optimisation can be achieved relatively fast Compilers fail to use vector engine efficiently
Push optimisation at the best
• Only one out of 3 convolution type has been dissected • About 20 different patterns have been studied for one 1D convolution • Tedious work, huge code −→ Maintainability? Automatic code generation with BOAST engine BigDFT Real-Space BOAST Workflow approach
Luigi Genovese
Metaprogramming the kernels Wavelets and ISF GPU
Wavelet Convolutions Optimization hand-in-hand with design of the operations Optimization Approach BOAST Challenges Development Source Compilation Binary Poisson Solver Code Implicit Solvents Summary Transformation Performance BOAST Analysis Generative Optimization Performance Source Code data
• Generate combination of optimizations • C, OpenCL, FORTRAN and CUDA are supported • Compilation and analysis are automated • Selection of best version can also be automated BigDFT Real-Space Example with a kernel (Wavelet Synthesis) approach
Luigi Genovese
Wavelets and ISF
GPU
Wavelet Convolutions Optimization Approach BOAST Challenges
Poisson Solver
Implicit Solvents Summary
Take-home messages of the BOAST strategy
• Meta-programming of Hot-spot operations • Optimization can be adapted to the characteristics • Portability, best effort and maintainance may go together Towards a BLAS-like library for wavelets convolutions BigDFT Real-Space Autotuning of Libraries (core level) approach
Luigi Genovese
Not all codes can benefit from BLAS/LAPACK/FFT Wavelets and ISF
GPU • Domain specific libraries need to be optimized for the Wavelet Convolutions Optimization Approach architecture BOAST Challenges • Autotuning is possible through the usage of tools like Poisson Solver BOAST Implicit Solvents Summary • Generality can be achieved throught meta-programming
Example: libconv
• BOAST can generate all wavelet families ⇒ the library has more functionalities than the original code • BOAST can adapt the routines to the architecture using OpenMP and vector instructions • BOAST can target FORTRAN and C with OpenMP, CUDA, OpenCL BigDFT Real-Space O(N) BigDFT code is completely different approach Luigi Genovese
Wavelets and ISF • Less data → lower number of flops per atom GPU Wavelet Convolutions Optimization Approach • Communication scheduling completely different BOAST Challenges Poisson Solver Different problems new opportunities Implicit Solvents Summary • No hot spot operations anymore! Convolutions are a less important percentage of the run • Linear algebra becomes sparse Extra completity in handling parallelisation
The time-to-solution is considerably improved 18 k Atoms → less than 20 minutes on 13k cores • The computational physicist is happier • What about the computer scientist? BigDFT Real-Space Interpolating SF Poisson Solver approach
Luigi Genovese
(Screened) Poisson Equation for any BC in vacuum Wavelets and ISF
GPU
Non-orthorhombic cells (periodic, surface BC): Wavelet Convolutions 2 2 Optimization Approach (∇ − µ0)V (x,y,z) = −4πρ(x,y,z) BOAST Machine-precision accuracy J. Chem. Phys. 137, 13 (2012) Challenges Poisson Solver
Implicit Solvents Extended to implicit solvents (JCP 144, 014103 (2016)) Summary
Future developments Range-separated Coulomb operator h i 1 erf r + erfc r r r0 r0 BigDFT Real-Space Precision and performances (JCC, 2014) approach
Luigi Genovese
Wavelets and ISF
GPU
Wavelet Convolutions Optimization Approach BOAST Challenges
Poisson Solver
Implicit Solvents Summary
Volume 35 | Issues 5–6 | 2014 Included in this print edition: Issue 5 (February 15, 2014) Journal of Issue 6 (March 5, 2014) COMPUTATIONAL CHEMISTRY Organic • Inorganic • Physical Biological • M a t e r i a l s Research in Systems Neuroscience www.c-chem.org
Editors: Charles L. Brooks III • Masahiro Ehara • Gernot Frenking • Peter R. Schreiner BigDFT Real-Space Hybrid Functionals (JPCM 30 (9),095901 (2018)) approach
Luigi Genovese
UO systems: Wavelets and ISF 2 GPU Atoms Orbitals Wavelet Convolutions Optimization Approach 12 200 BOAST Challenges 96 1432 Poisson Solver 324 5400 Implicit Solvents 768 12800 Summary 1029 17150 BigDFT Real-Space Polarizable Continuum Models approach
Luigi Genovese Poisson solver for implicit solvents JCP 144, 014103 (2016) Wavelets and ISF Allows an efficient and GPU accurate treatment of implicit Wavelet Convolutions Optimization Approach solvents BOAST Challenges The dielectric function Poisson Solver determine the cavity where Implicit Solvents Summary the solute is defined. The cavity can be • rigid (PCM-like) • determined from the Electronic Density (SCCS approach) Can treat various BC
(here TiO2 surface)
Can be used in conjunction with O(N) BigDFT BigDFT Real-Space From vacuum to complex wet environments approach
Luigi Genovese
Neutral or ionic wet environments Wavelets and ISF
GPU
Poisson-Boltzmann eq. Wavelet Convolutions Generalized Poisson eq. Optimization Approach ∇ · ε(r)∇φ(r) = BOAST Challenges ∇ · ε(r)∇φ(r) = −4πρ(r) ions −4π ρ(r) + ρ [φ](r) Poisson Solver
Implicit Solvents Summary Various algorithms implemented
• Polarization Iteration (use gradient of polarization potential) • Preconditioned Conjugate Gradient (need only function multiplications)
Solution found by iterative application of PS in vacuum BC BigDFT Real-Space Modelling of Electrostatic Environment approach
Luigi Genovese
Advantages of explicit BC Wavelets and ISF GPU Difference between the electrostatic solvation energy computed Wavelet Convolutions Optimization Approach with periodic and free boundary conditions as function of the BOAST Challenges periodic cell length. Poisson Solver
Implicit Solvents Summary Molecule dipole norm (D)
vacuum H2O CH3CONH2 3.88 5.76 H2O 1.81 2.41 CH3OH 1.57 2.14 NH3 1.49 1.98 CH3NH2 1.27 1.78
Explicit BC avoid error due to supercell aliasing BigDFT Real-Space Performances and timings for full DFT runs approach
Luigi Genovese
Wavelets and ISF
GPU Blackbox-like usage Wavelet Convolutions Optimization Approach BOAST The Generalized PS only needs Challenges few iterations of the vacuum Poisson Solver Implicit Solvents poisson solver Summary
Time-to-solution Timings for the protein PDB ID: 1y49 (122 atoms) in water • Full SCF convergence 49 s • Solvent/vacuum runtime ratio α = 1.16 BigDFT Real-Space ISF: wavelet analysis meets real-space approach
Luigi Genovese
Features of Systematicity Wavelets and ISF
GPU • ISF provide a rigorous framework to interpret real-space Wavelet Convolutions Optimization Approach coefficients BOAST Challenges • High interpolating power (precise results) Poisson Solver • Implicit Solvents Scaling relation → computational efficiency, arbitrary Summary precision • (Quasi) variational • In conjunction with (Daubechies) wavelets offer a good formalism to get rid of uncertainties
Opportunities from new quadratures
• Generalize the concept of compensating charge (multipoles) • Combine together cartesian and polar meshes