HLST Core Team Report 2013

EUROFUSION WPISA-REP(16) 16107 R Hatzky et al. HLST Core Team Report 2013 REPORT This work has been carried out within the framework of the EUROfusion Con- sortium and has received funding from the Euratom research and training programme 2014-2018 under grant agreement No 633053. The views and opinions expressed herein do not necessarily reflect those of the European Commission. This document is intended for publication in the open literature. It is made available on the clear under- standing that it may not be further circulated and extracts or references may not be published prior to publication of the original when applicable, or without the consent of the Publications Officer, EUROfu- sion Programme Management Unit, Culham Science Centre, Abingdon, Oxon, OX14 3DB, UK or e-mail Publications.Offi[email protected] Enquiries about Copyright and reproduction should be addressed to the Publications Officer, EUROfu- sion Programme Management Unit, Culham Science Centre, Abingdon, Oxon, OX14 3DB, UK or e-mail Publications.Offi[email protected] The contents of this preprint and all other EUROfusion Preprints, Reports and Conference Papers are available to view online free at http://www.euro-fusionscipub.org. This site has full search facilities and e-mail alert options. In the JET specific papers the diagrams contained within the PDFs on this site are hyperlinked HLST Core Team Report 2013 Contents 1. Executive Summary ..............................................................................................4 1.1. Progress made by each core team member on allocated projects ................ 4 1.2. Further tasks and activities of the core team .................................................9 1.2.1. Dissemination .........................................................................................9 1.3. Training ..........................................................................................................9 1.3.1. Internal training .......................................................................................9 1.3.2. Workshops & conferences ....................................................................10 1.3.3. Meetings ...............................................................................................10 2. Final report on HLST project EMPHORB ............................................................11 2.1. Introduction ..................................................................................................11 2.2. Phase factor .................................................................................................11 2.3. Modified equations .......................................................................................12 2.3.1. Weight equation ....................................................................................12 2.3.2. Charge assignment ..............................................................................12 2.3.3. Field solver ...........................................................................................13 2.3.4. Field evaluation ....................................................................................14 2.3.5. Diagnostics ...........................................................................................14 2.4. Implementation ............................................................................................15 2.4.1. Weight equation ....................................................................................15 2.4.2. Charge assignment and field evaluation ..............................................16 2.4.3. Matrix building ......................................................................................16 2.4.4. Complex numbers ................................................................................17 2.4.5. Diagnostics ...........................................................................................18 2.4.6. Adjusting the filter .................................................................................18 2.5. Using NEMORB with phase factor ...............................................................19 2.6. Tests ............................................................................................................19 2.7. Resolution ....................................................................................................21 2.8. Summary .....................................................................................................23 2.9. References ..................................................................................................23 3. Final report on HLST project MICPORT ............................................................. 24 3.1. Introduction ..................................................................................................24 3.2. The hardware ...............................................................................................24 3.3. Porting JOREK ............................................................................................25 3.3.1. Installation of libraries ...........................................................................25 3.3.2. Thread pinning ......................................................................................26 3.3.3. JOREK shell scripts ..............................................................................26 3.3.4. Shared libraries ....................................................................................26 3.4. JOREK Tests ...............................................................................................27 3.4.1. Profiling .................................................................................................27 3.5. Matrix building ..............................................................................................28 3.5.1. Vectorization .........................................................................................30 3.6. LU factorization ............................................................................................32 3.6.1. PaStiX test ............................................................................................33 3.6.2. JOREK LU factorization ........................................................................ 34 3.7. The solver ....................................................................................................35 3.8. Conclusions .................................................................................................36 4. Final Report on HLST project BLIGHTHO ..........................................................37 4.1. Introduction ..................................................................................................37 4.2. MPI libraries evaluation status .....................................................................37 4.3. MPI non-blocking call evaluation .................................................................39 4.4. MPI library initialization time ........................................................................40 4.4.1. Running the initial transpose ................................................................40 4.4.2. Alternative transpose implementations ................................................. 43 4.5. Task/thread pinning .....................................................................................45 4.5.1. How threads are actually pinned .......................................................... 46 4.5.2. The proposed pinning strategy .............................................................46 4.5.3. Pinning’s impact on performance ......................................................... 47 4.5.4. Non temporal stores or streaming stores ............................................. 49 5. Final report on the KSOL2D-2 project .................................................................51 5.1. Introduction ..................................................................................................51 5.2. Model problem .............................................................................................51 5.3. Multigrid method with reduced number of cores ..........................................52 5.4. FETI-DP method ..........................................................................................53 5.5. FEM discretization ....................................................................................... 55 5.6. Current status and future work ....................................................................57 6. Final Report on HLST project MGTRIOPT .........................................................58 6.1. Introduction ..................................................................................................58 6.2. Multigrid with a hybrid programming model .................................................58 6.3. Conclusions .................................................................................................63 7. Final report on HLST project PARFS ..................................................................64 7.1. Introduction ..................................................................................................64 7.2. Initial activities, problem assessment ..........................................................64 7.3. Project Activities ..........................................................................................64 7.4. Distributed MARST ......................................................................................65 7.5. The

HLST Core Team Report 2013

Easybuild Documentation Release 20210907.0

Interfacing Epetra to the RSB Sparse Matrix Format for Shared-Memory Performance

Using Machine Learning to Improve Dense and Sparse Matrix Multiplication Kernels

Best Practice Guide

Efficient Multithreaded Untransposed, Transposed Or Symmetric Sparse

Auto-Tuning Shared Memory Parallel Sparse BLAS Operations In

GRILLIX: a 3D Turbulence Code for Magnetic Fusion Devices Based on a Field Line

PRIMME SVDS: a High-Performance Preconditioned SVD Solver For

WP 8 Project Presentations - 28 September 2020

Matrix:“MM” 3/9 HCSR 1.3E+04 4/9 HCOO 2.0E+03 – Sparse Triangular Solve by Matrix:“SM” Untuned

Fast Matrix-Vector Multiplications for Large-Scale Logistic Regression on Shared-Memory Systems

Librsb: a Shared Memory Parallel Sparse BLAS Implementation Using the Recursive Sparse Blocks Format