
Geant4 Computing Performance: Intensity Frontier Experiments

Soon Yung Jun, Krzysztof Genser, Robert Hatcher
Geant4 Collaboration Meeting, August 26-30, 2018, Lund, Sweden

Outline

• Muon experiments
  – Mu2e
  – Muon g-2
• LArTPC neutrino experiments
  – MicroBooNE
  – DUNE
• Results of profiling LArSoft applications
• Optical photon simulation
• Summary

S.Y. Jun | Computing Performance, Fermilab-IF 2018 Geant4 Collaboration Meeting, Lund

Muon Experiments

• Mu2e has been using Geant4 10.4 since April 2018 and can use Geant4MT as one of the options
• Preliminary data suggest that Mu2e spends only ~1% of detector simulation time in geometry functions for electron workflows (the fraction may differ for hadrons, as they travel through different parts of the detector)
  – Given the above, migrating to VecGeom shapes is not a priority at the moment

• Muon g-2 is using Geant4 10.3.p03 with the spin tracking problem patched locally (planning to move to 10.4+ later in the year)

Liquid Argon Time Projection Chamber (LArTPC) Experiments

• New generation of neutrino experiments based on LAr TPCs

Short baseline:

Long baseline:

Demonstrator ProtoDUNE at CERN

Neutrino Experiments

• Elements of physics program
  – Mass hierarchy
  – CP-violation
  – Sterile neutrinos
  – Search for neutrinos from supernovae
  – Search for evidence of proton decay

• Primary physics processes in LArTPC
  – Neutrino-nucleus interactions (νN)
  – Charged particle (ℓ±, h) transport through a large volume of 40Ar
    • e±, µ±, π±, K±, p
    • 0.5 GeV – 7 GeV (typical energy range for DUNE/protoDUNE)
  – Photon creation and propagation

Primary Geant4 Processes in LArTPC Simulation

• Charged particles
  – Ionization (dE/dx)
  – Multiple scattering
  – Drift of e- in a 500 V/cm field
• (Optical) Photons
  – Scintillation
    • 45,000 photons/MeV
    • 6 ns / 1.6 μs (fast/slow)
  – Cherenkov radiation
  – Rayleigh scattering
  – Absorption by impurities
  – Reflection

Figure labels: pixel image; DL (CNN) signal vs. bkg.
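To give a sense of scale for the scintillation numbers above, the sketch below estimates the mean photon count for an energy deposit and samples emission times from a two-component exponential. The yield (45,000 photons/MeV) and the time constants (6 ns, 1.6 μs) are taken from the slide; the fast-component fraction `FAST_FRACTION` and all function names are illustrative assumptions, not values from the talk or from LArSoft.

```python
import random

PHOTONS_PER_MEV = 45_000      # scintillation yield quoted on the slide
TAU_FAST_NS = 6.0             # fast time constant (slide)
TAU_SLOW_NS = 1600.0          # slow time constant, 1.6 us (slide)
FAST_FRACTION = 0.25          # ASSUMPTION: illustrative only, not from the talk


def mean_photon_count(edep_mev):
    """Mean number of scintillation photons for an energy deposit in MeV."""
    return PHOTONS_PER_MEV * edep_mev


def sample_emission_time_ns(rng):
    """Sample one emission time from a fast/slow two-exponential mixture."""
    tau = TAU_FAST_NS if rng.random() < FAST_FRACTION else TAU_SLOW_NS
    # expovariate takes the rate (1 / mean lifetime).
    return rng.expovariate(1.0 / tau)


if __name__ == "__main__":
    rng = random.Random(1234)
    print(f"mean photons for a 1 GeV deposit: {mean_photon_count(1000.0):.3g}")
    times = [sample_emission_time_ns(rng) for _ in range(5)]
    print("sample emission times [ns]:", [round(t, 1) for t in times])
```

At the quoted yield, a 1 GeV deposit corresponds to 4.5 × 10^7 photons on average, which illustrates why tracking every optical photon individually is costly.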

LArSoft

• A collaboration of experiments, labs and university groups
  – Collection of software packages (art, Pandora, Wire-cell, etc.)

• Goal is to provide integrated, detector-independent software tools for the simulation, reconstruction and analysis to be used by LAr TPC neutrino experiments

Profiling LArSoft Applications: Workflow

• LArSoft/protoDUNE and DUNE-FD: dunetpc v06_57_00
  – Geant4 version: 10.3.p01 (→ 10.3.p03 now)
  – 6 GeV proton and 1 GeV cosmic
  – Simulation chain: gen → g4 → detsim → reco → ana
• CPU and memory: (prodgenie_nue_dune10kt_1x2x6)

[Figure: two pie charts, CPU and Memory (RSS), each split by stage: generator, Geant4, DetectorSim, Reconstruction, Analysis. Percentage labels as extracted: 13%, 2%, 9%, 19%, 3%, 5%, 25%, 34%, 9%, 81%.]

Profiling Tools

• Memory: IgProf (http://igprof.org) v5.9.6
• CPU: HPCToolkit (http://hpctoolkit.org) version 2017.06

Results: Profiling Overhead

• Profiling overhead (OH): Profiling/Bench-test, with times in [sec]
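The overhead defined above is a ratio of two run times: the job with the profiler attached, divided by the plain bench-test run. A minimal sketch follows, using hypothetical stage names and timings; the measured values from the slide's table are not reproduced here.

```python
def profiling_overhead(profiled_sec, benchmark_sec):
    """Return OH = (time with profiler attached) / (time of plain bench test)."""
    if benchmark_sec <= 0:
        raise ValueError("benchmark time must be positive")
    return profiled_sec / benchmark_sec


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    timings = {"stage_a": (110.0, 100.0), "stage_b": (63.0, 60.0)}
    for stage, (profiled, bench) in timings.items():
        print(f"{stage}: OH = {profiling_overhead(profiled, bench):.2f}")
```

An OH close to 1 means the profiler perturbs the run very little, so the profile can be taken as representative of the unprofiled job.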

LArSoft/DUNE 6 GeV Proton: HPCToolkit

• CPU Top libraries

LArSoft/DUNE 6 GeV Proton: HPCToolkit

• CPU Top functions
  – OpFastScintillation::PostStepDoIt

• G4Exp is called from OpFastScintillation::PostStepDoIt

LArSoft/DUNE 6 GeV Proton: IgProf

Hotspot (processing hits)

Initialization and look-ups

LArSoft/protoDUNE Profiling Summary Based on HPCToolkit

• Top CPU functions
  – (Fast) Optical photon simulation
  – Usual charged particle energy loss processes
• Large memory footprint
  – Processing a large number of hits
  – Loading the lookup table for optical photons
• CPI (Cycles per Instruction)
  – Good balance with minimal stalls
  – Geant4/detsim/reco = 0.67/0.48/0.55
• FMO (Flops per Memory operation)
  – Computational intensity
  – Geant4/detsim/reco = 7.20e-04/0.35/0.24
• https://g4cpt.fnal.gov/larsoft/dunetpc_v06_57_00/hpctoolkit.html
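Both derived metrics above are ratios of counter totals: CPI divides cycles by retired instructions, and FMO divides floating-point operations by memory operations. The sketch below shows the arithmetic only. The `Counters` container, its field names, and the sample totals are assumptions made for illustration; the talk does not state which hardware counters were used.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Raw counter totals for one stage (hypothetical container)."""
    cycles: int
    instructions: int
    flops: int
    memory_ops: int


def cpi(c):
    """Cycles per instruction: lower values mean fewer stalls per instruction."""
    return c.cycles / c.instructions


def fmo(c):
    """Floating-point operations per memory operation (computational intensity)."""
    return c.flops / c.memory_ops


if __name__ == "__main__":
    # Made-up totals chosen only to exercise the formulas.
    sample = Counters(cycles=1200, instructions=2000, flops=50, memory_ops=1000)
    print(f"CPI = {cpi(sample):.2f}, FMO = {fmo(sample):.2e}")
```

Read this way, the Geant4 FMO of 7.20e-04 quoted above means fewer than one floating-point operation per thousand memory operations, while detsim and reco are two to three orders of magnitude higher.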

Optical Photon Simulation

• Optical photons in LAr-TPC experiments:
  – Copiously produced in LAr (scintillation and Cherenkov)
  – Full optical photon simulation is prohibitively expensive on CPUs for a large LAr-based detector
  – Lookup tables are used to roughly estimate arrival time
  – Optical photon simulation on GPUs is a suitable option (e.g. Blyth S 2017 Opticks: GPU Optical Photon Simulation for Particle Physics using NVIDIA® OptiX™, J. Phys.: Conf. Series 898 042001)
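The lookup-table idea can be sketched as follows: instead of tracking each photon, the detector volume is divided into voxels, and a precomputed table gives, per voxel and per optical detector, the fraction of photons that reach the detector and a typical arrival time. This is a toy illustration only. The table layout, the function names, and the numbers are assumptions and do not reproduce LArSoft's actual lookup-table format.

```python
def voxel_index(pos_cm, origin_cm, voxel_size_cm, n_voxels):
    """Map a 3D position to a flat voxel index, or None if outside the grid."""
    idx = []
    for x, x0, n in zip(pos_cm, origin_cm, n_voxels):
        i = int((x - x0) // voxel_size_cm)
        if not 0 <= i < n:
            return None
        idx.append(i)
    ix, iy, iz = idx
    _, ny, nz = n_voxels
    return (ix * ny + iy) * nz + iz


def expected_hits(table, voxel, n_photons):
    """Expected detected photons and tabulated arrival time per detector."""
    return {det: (n_photons * vis, t_ns)
            for det, (vis, t_ns) in table.get(voxel, {}).items()}


if __name__ == "__main__":
    # Hypothetical table: voxel -> {detector: (visibility, arrival time in ns)}.
    table = {5: {"det0": (0.5, 12.0), "det1": (0.25, 30.0)}}
    grid = ((0.0, 0.0, 0.0), 10.0, (2, 2, 2))
    voxel = voxel_index((15.0, 5.0, 15.0), *grid)
    print(voxel, expected_hits(table, voxel, 1000))
```

The cost per energy deposit becomes one table lookup per detector rather than one tracked trajectory per photon, at the price of holding the table in memory.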

Summary

• No critical Geant4 computing issues identified by the muon experiments (Mu2e, Muon g-2)
• Computing challenges for LAr-TPC experiments
  – Neither simulation nor reconstruction is “fast”, but there appear to be no critical bottlenecks in Geant4 itself
  – The Geant4 memory footprint is large (6-8 GB) because of the large volume of hit data; moving to multi-threading is a possible solution
  – Looking for opportunities for acceleration, especially based on SIMD vectorization
• Intensity frontier experiments are actively adopting recent versions (10.3.p01+) of Geant4 and moving to Geant4MT, a move that should accelerate with the migration to the multithreaded art framework
