<<

Conference Slides Pack

Index
• Chris Allton
• Huw Morgan, Aberystwyth University
• Anna Price
• Chennakesava Kadapa, Swansea University
• Andrew Lane, Cardiff University
• Alys Brett and Simon Hettrick, Society of Research Software Engineers
• Crispin Keable, Atos
• Timothy Lanfear, NVIDIA
• Steve Smith, Dell
• Simone Warren, Verne Global
• Gert Aarts, Swansea University

Using HPC for Particle Physics

Chris Allton [Swansea University]
FASTSUM Collaboration
Supercomputing, January 2020

FASTSUM Collaboration
• Swansea University: Gert Aarts, CRA, Tim Burns, Sergio Chaves, Davide de Boni, Jonas Glesaaen, Simon Hands, Alan Kirby, Aleksandr Nikolaev, Sam Offler, Thomas Spriggs, Dawid Stasiak
• Southern Denmark University: Benjamin Jäger
• Sejong University, Korea: Seyong Kim
• INFN, Pisa: Maria Paola Lombardo
• Trinity College, Dublin: Sinead Ryan
• National University of Ireland, Maynooth, Ireland: Jon-Ivar Skullerud
• Jiangsu University, China: Liang-Kai Wu

Overview

• Particle Physics at high temperature
• Lattice approach
• Hadrons (i.e. protons/neutrons) at T ≈ 0
• Spectral representation and Bayesian Methods
• Hadrons at T ≫ 0

Phase diagram of water
[Figure: phase diagram of water; axes: Temperature, Pressure]

The Periodic Table
[Figure: Molecule, Atom, Nucleus, Protons & Neutrons, Quarks]

The “Periodic Table” of Particle Physics
[Figure]

Phase diagram of Quarks
[Figure: phase diagram of quarks; axes: Baryon Density, Temperature]

T = 0 versus T ≠ 0

           | Quarks are | Accuracy of predictions | Similar to     | Fundamental properties   | Asymptotic states
T, μ ≠ 0   | Deconfined | O(20%)                  | Plasma/fluid   | Thermodynamic quantities | Quarks / gluons
T = 0      | Confined   | O(1%)                   | Atomic physics | Decay rates              | Hadrons (bound states)

[Roadmap: Experiment | Theory (Continuum, Lattice) against T ≠ 0 and T = 0; shown so far: Ordinary QCD]

Large Hadron Collider
[Figure labels: protons, electrons]

Quarks interact strongly
[Figure label: 15 tonnes (!)]

Quarks are Quantum Mechanical
[Figure labels: Start, Finish]

[Roadmap: Experiment | Theory (Continuum, Lattice) against T ≠ 0 and T = 0; shown so far: Ordinary QCD, Bound States]

Studying a Meson on a Lattice

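A_\mu(x) = \bar\psi(x) \gamma_\mu \gamma_5 \psi(x)

A_\mu^\dagger(x) creates a tower of states |P_0\rangle, |P_i\rangle.

On the lattice (axes x, y, z, t): \partial \to \Delta and \int \to \sum.

Example Correlation Function

G_2(t) = \sum_{\{U\}} \sum_{\vec x} \langle 0 | A_\mu(\vec x, t) A_\mu^\dagger(\vec 0, 0) | 0 \rangle
       = \sum_{\{U\}} \sum_{\vec x} \sum_i \frac{1}{2E_i} \langle 0 | A_\mu(\vec x, t) | P_i(\vec k) \rangle \langle P_i(\vec k) | A_\mu^\dagger(\vec 0, 0) | 0 \rangle
       = \sum_{\{U\}} \sum_i \frac{1}{2M_i} \langle 0 | A_\mu(0) | P_i(0) \rangle \langle P_i(0) | A_\mu^\dagger(0) | 0 \rangle e^{-M_i t}

t large: G_2(t) \to \frac{|\langle 0 | A_\mu(0) | P_0(0) \rangle|^2}{2M_0} e^{-M_0 t} \equiv Z e^{-M_0 t}

Lightest state!

[Figure: correlator G on a logarithmic scale versus t. Plot header: UK_wil60ll mesons_LL_ ViVi_000 K=.15500,.15500 Chan= 21 3 t= 2-22d Err=Jk Sym=Y #cfgs= 455 #cfg/clus=13]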
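The large-t behaviour G_2(t) → Z e^{-M_0 t} is what makes the lightest state accessible. A minimal sketch of this, assuming a synthetic two-state correlator with made-up values of Z_i and M_i (none of these numbers come from the slides), uses the common effective-mass diagnostic m_eff(t) = log[G(t)/G(t+1)], which is not shown on the slide itself. All function names are illustrative.

```python
import math

def two_state_correlator(t, z0=1.0, m0=0.5, z1=0.5, m1=1.2):
    """Synthetic G(t) = Z0 exp(-M0 t) + Z1 exp(-M1 t); all values are assumptions."""
    return z0 * math.exp(-m0 * t) + z1 * math.exp(-m1 * t)

def effective_mass(correlator):
    """m_eff(t) = log(G(t) / G(t+1)) for each pair of neighbouring time slices."""
    return [math.log(correlator[t] / correlator[t + 1])
            for t in range(len(correlator) - 1)]

# 22 time slices give 21 effective-mass values (t = 0 .. 20).
G = [two_state_correlator(t) for t in range(22)]
m_eff = effective_mass(G)

for t in (0, 5, 10, 20):
    print(f"t = {t:2d}  m_eff = {m_eff[t]:.6f}")
```

At small t the excited state contaminates the estimate; at large t the value settles on the assumed M_0 = 0.5, mirroring the "lightest state" statement above.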

[Roadmap: Experiment | Theory (Continuum, Lattice) against T ≠ 0 and T = 0; shown so far: Extreme QCD, Ordinary QCD, Bound States]

Heavy ion collisions
Experiments at T ≠ 0: RHIC @ BNL, ALICE @ LHC

Phase diagram of Quarks
[Figure: phase diagram of quarks; axes: Baryon Density, Temperature]

Particle Data Book
1,500 pages
∼ zero pages on Quark-Gluon Plasma...

[Roadmap: Experiment | Theory (Continuum, Lattice) against T ≠ 0 and T = 0; shown so far: Extreme QCD, Spectral F’ns, Ordinary QCD, Bound States]

FASTSUM setup

• anisotropic lattices: a_\tau < a_s
  – allowing better resolution, particularly at finite temperatures, since T = \frac{1}{N_\tau a_\tau}
• "2nd" generation lattice ensembles
  – moving towards continuum, infinite volume, realistic light quark masses

[Figure: lattice sketches with axes x and τ]

Lattice correlators: Nucleon

[Figure: G(τ)/G(0) on a logarithmic scale (10^0 down to 10^-5) versus τ/a_τ; panels: positive parity, negative parity; curves for T/Tc = 0.24, 0.76, 0.84, 0.95, 1.09, 1.27, 1.52, 1.90]

• ± parity channels nondegenerate
• more T dependence in negative-parity channel

Baryons in the hadronic phase

Masses m±(T), normalised with m+ at lowest temperature

[Figure: m±(T)/m+(T0) versus T/Tc (and T [MeV]); panels: octet (spin 1/2) and decuplet (spin 3/2) for strangeness S = 0, −1, −2, −3; channels N(±), Σ(±), Λ(±), Ξ(±), Δ(±), Σ*(±), Ξ*(±), Ω(±)]

In each channel:
• emerging degeneracy around Tc
• negative-parity masses reduced as T increases
• positive-parity masses nearly T independent

QGP: fate of light baryons
Temperature Dependent Masses
Consider now the quark-gluon plasma

• no clearly identifiable groundstates: baryons dissolved

Example: use conventional exponential fits

[Figure: m [GeV] versus T/Tc; panels: N+ and N−, Ω+ and Ω−]

No clearly defined groundstates above Tc

[Roadmap: Experiment | Theory (Continuum, Lattice) against T ≠ 0 and T = 0; shown so far: Extreme QCD, Spectral F’ns, Ordinary QCD, Bound States]

Spectral Functions

C(t) = \int \rho(\omega) K(\omega, t) d\omega

where K(\omega, t) = kernel \sim e^{-\omega t}
      \rho(\omega) = spectral function = \sum_i Z_i \delta(\omega - M_i) (confined phase)

[Figure: ρ(ω) versus ω for four cases: Stable; Bound (decaying) States; Melted/Plasma (no transport peak); Melted/Plasma with Transport (transport peak)]

The problem

\rho \equiv F and C \equiv Data

Need to maximise P(\rho | C) = \frac{P(C | \rho) P(\rho)}{P(C)} \sim e^{-\chi^2 + S}

The problem:
C(t) known at O(10) t-points
but \rho(\omega) should be known at O(10^3) points

Naively: I(\rho) >> I(C), i.e. I(output) >> I(input)!!!

Without S system is underconstrained \to many solutions with \chi^2 = 0

The Task

Given data D

Find fit F by maximising P(F | D)

Bayes Theorem

Need to maximise P(F | D)

Bayes Theorem:

P(F \cap D) = P(F | D) P(D) = P(D | F) P(F)
[Figure: Venn diagram of F, F \cap D, D]

i.e. P(F | D) = \frac{P(D | F) P(F)}{P(D)}

But P(D | F) \sim e^{-\chi^2} \to minimising \chi^2 \neq maximising P(F | D)
\to Maximum Likelihood Method wrong??
No! Since for simple F(t) = Z e^{-Mt}, P(F) = P(Z, M) \sim const

Priors

Actually P(F = elephant) \equiv 0
\to “priors” which encode any additional information
(a.k.a. predisposition, prejudices, impartialities, biases, predilection, subjectivity, ...)
E.g. in L.G.T. P(M < 0) \equiv 0
Maximum Likelihood Method applies this prior implicitly

Can encode prior information with “entropy” = S (dis-information)
Define I(F) = “Information content” of F
P(F) = e^{-S}
“Bland” F has I(F) \sim 0 and S >> 0
“Spiky” F has I(F) >> 0 and S \equiv 0

Entropy / Priors

           | No Data          | Data
No Prior   | I(F) \equiv 0 ?  | F from min \chi^2
Prior      | F \equiv prior   | F from max P(F | D)

P(F) = e^{-S}

A Solution: Maximum Entropy Method

Can define Entropy

S = \int \left[ \rho(\omega) - m(\omega) - \rho(\omega) \log \frac{\rho(\omega)}{m(\omega)} \right] d\omega

m(\omega) is the default model which encodes prior information
Typical m(\omega) is free theory result

Discretise K(\omega, t) \to K_{w,n}
Express \rho_w = m_w \exp\left( \sum_{n=1}^{N} V_{w,n} u_n \right)
where V_{w,n} is from a SVD of K_{w,n}.

Note N \le N_t by construction! \to I(output) \le I(input)
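The two ingredients above, the discretised spectral representation and the entropy, can be sketched in a few lines. This is only an illustration under stated assumptions: the frequency grid, the flat default model and the Gaussian-peaked test function are all invented here, and no MEM solver or SVD step is implemented.

```python
import math

def correlator(rho, omegas, times):
    """C(t_n) = sum_w K(omega_w, t_n) rho_w d_omega, with kernel K = exp(-omega t)."""
    d_omega = omegas[1] - omegas[0]
    return [sum(math.exp(-w * t) * r for w, r in zip(omegas, rho)) * d_omega
            for t in times]

def entropy(rho, model, d_omega):
    """S = sum_w [rho_w - m_w - rho_w log(rho_w / m_w)] d_omega (needs rho, m > 0)."""
    return sum((r - m - r * math.log(r / m)) * d_omega
               for r, m in zip(rho, model))

# Assumed grids: O(10^3) frequency points but only O(10) time points.
omegas = [0.01 * (w + 1) for w in range(1000)]
times = list(range(1, 11))
d_omega = omegas[1] - omegas[0]

model = [1.0] * len(omegas)   # flat default model (an assumption)
peaked = [1.0 + 50.0 * math.exp(-((w - 3.0) / 0.1) ** 2) for w in omegas]

C = correlator(peaked, omegas, times)
print(len(omegas), "unknowns constrained by", len(C), "data points")
print("S(model)  =", entropy(model, model, d_omega))
print("S(peaked) =", entropy(peaked, model, d_omega))
```

With this definition S vanishes when ρ equals the default model and is negative otherwise, so a spiky ρ is penalised unless the data demand it.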

T = 0 Spectral Functions (b\bar{b} meson)

G(\tau) = \int_{\omega_{min}}^{\omega_{max}} \frac{d\omega}{2\pi} K(\tau, \omega) \rho(\omega),   K(\tau, \omega) = e^{-\omega\tau}.

[Figure: \rho(\omega)/m_b^2 and \rho(\omega)/m_b^4 versus \omega (GeV), from 9 to 15 GeV; panels: \Upsilon and \chi_{b1}; MEM compared with exponential fits]

T ≠ 0 Spectral Functions
Nucleon spectral function via MEM

[Figure: \bar\rho(\omega) versus \omega [GeV], from -16 to 16 GeV; panels: T < Tc (0.24, 0.76, 0.84, 0.95 Tc) and T > Tc (1.09, 1.27, 1.52, 1.90 Tc)]

LHC: proton versus Pb (CMS: pp versus PbPb)

[Roadmap: Experiment | Theory (Continuum, Lattice) against T ≠ 0 and T = 0; shown so far: Extreme QCD, Spectral F’ns, Ordinary QCD, Bound States]

Summary

• Particle Physics at high temperature
• Lattice approach
• Hadrons (i.e. protons/neutrons) at T ≈ 0
• Spectral representation and Bayesian Methods
• Hadrons at T ≫ 0

Advancements in solar astronomy using SCW

Huw Morgan
Department of Physics, Aberystwyth University

The Sun: Overview

• Sun’s mass: 2 x 10^30 kg (~300,000 x Earth).

• 99% of Solar system’s mass in Sun.

• Core: T ~15,000,000 K; density ~12 x lead; H -> He fusion (10 billion years)

• 4.5 million tons of hydrogen fused into helium every second (~several Snowdons)

• Energy (light) created in the core gradually escapes (~100,000 yrs)

• Simple machine – self-regulated

[Figure labels: Photosphere: 6000 K; Corona: 1->3 MK ?]

The Sun’s atmosphere: Waves vs. nanoflares
Images: Joten Okamoto, ISAS

Wave heating: the magnetic field oscillates due to disturbances near the Sun. This energy travels to the atmosphere and is converted (somehow) into the plasma thermal energy.

Nanoflare heating: the magnetic field is constantly reconfiguring as it is dragged around by motions in the photosphere. These ‘reconnections’ heat the plasma locally, and the heated plasma is transported throughout the atmosphere.

Coronal mass ejections (solar storms)

Space Weather: Do you need extra life insurance cover?

How can we measure the corona’s properties?

Remote observations => physical properties

Development of new methods:
• Improved estimates of temperature, density
• 3D structure
• Feature identification & tracking
• Automation

Data streams
• Space-based:
  – HMI/SDO: photospheric magnetic field, 2010-present
  – AIA/SDO: low corona, 2010-present
  – IRIS: chromosphere, 2014-present
  – Coronagraphs (COR/STEREO, LASCO/SOHO): 1996-present
• Ground-based: high-resolution imagery (photospheric/chromospheric); very large datasets, occasional
  – SST
  – DKIST

Processing/methods
• Feature identification
• Feature tracking
• Tomography
• Stereoscopy
• Inversion for plasma properties
• Advanced time analysis
• Image processing

Outputs
• Feature characterization
• Dynamics maps
• Plasma density
• Plasma temperature
• 3D mappings (quiescent corona)
• 3D mappings (CMEs: solar storms)

Types of data: EUV

• Images of the low corona taken by spacecraft in Extreme Ultraviolet (=hot) wavelengths

• Different channels = different ionization states = different plasma temperatures

• Most common source of information on the low corona

Types of data: coronagraphs

• Images of the extended corona in visible light

• Scattering of Sun’s light by coronal electrons

• Most common source of information on the extended corona (solar wind & solar ejections)

So many pixels, so little time

Instrument           | Image size   | Channels | Cadence                | Data rate
EIT/SOHO (1995->)    | 1 megapixel  | 4        | T ~1 minute            | <10^5 pix/s
EUVI/STEREO (2007->) | 4 megapixel  | 4        | T ~30 s (2 spacecraft) | ~10^6 pix/s
AIA/SDO (2010->)     | 16 megapixel | 10       | T ~12 s                | >10^7 pix/s

Example of the growth of astronomical datasets in the past two decades.
• New mission => step change in required computing resource
• New mission => step change in methodology (i.e. automated detection/analysis)
• New mission => new challenges & direction to theory/models/simulations

Processing approach
[Diagram: DATA (File 1, File 2, ..., File n) is split into n_j equal sections; Job 1, Job 2, Job 3, Job 4, ...; run ~16 jobs in parallel; save output results]
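The processing approach in the diagram can be sketched as follows. This is a minimal illustration, not the group's actual scripts: the file names, the placeholder `process_section` function and the use of a thread pool are assumptions, and on a cluster each section would more likely be submitted as a separate job through the scheduler.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_sections(files, n_jobs):
    """Split a list of files into n_jobs contiguous, near-equal sections."""
    base, extra = divmod(len(files), n_jobs)
    sections, start = [], 0
    for j in range(n_jobs):
        size = base + (1 if j < extra else 0)
        sections.append(files[start:start + size])
        start += size
    return sections

def process_section(section):
    """Placeholder for the real per-file analysis; it only records name lengths."""
    return [(name, len(name)) for name in section]

def run_jobs(files, n_jobs=16):
    """Process every section independently and merge results in the original order."""
    sections = split_into_sections(files, n_jobs)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        results = list(pool.map(process_section, sections))
    return [item for section_result in results for item in section_result]

files = [f"image_{i:05d}.fits" for i in range(100)]   # hypothetical file names
sections = split_into_sections(files, 16)
output = run_jobs(files, n_jobs=16)
print([len(s) for s in sections])
```

Because the sections are independent, no communication between jobs is needed, which is why the ideal speed-up is simply the number of jobs.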

Several 10^5 files, multiple years/decades.
Decades of data processed in 1/16 x serial processing time

Qualitative Image processing 1: MGN
• Multiscale Gaussian Normalization
• Effective at revealing new detail in EUV imagery
• Quickly became widely-used. Discussion with medical imaging groups.
• SCW not crucial to process small datasets. But we have access to several 10^5 images. Scope to create output, and make available to the public.
• http://eagle.imaps.aber.ac.uk

Qualitative Image processing 2: Colour vs. temperature

• Mapping of 7 monochrome data channels into RGB
• Original data in Extreme UltraViolet (EUV)

Qualitative Image processing 3: Time-normalization
• Time series of images.
• Bandpass filter over time, plus normalization by signal variance over bandpass
• c = Noise dampening

[Figure panels: Ridges + very low amplitude circular waves + noise; Bandpass filter; Bandpass filter with normalization]

• Application to time series of EUV images (one per 10s)
• Reveals faint motions everywhere. Known previously in certain regions, but:
  – Difficult to visualize
  – Previously not known to exist everywhere

Quantitative Time-normalized optical flow
• Modified Lucas-Kanade algorithm

• Application to time-normalized data gives striking coronal flow-field

• Must be linked to underlying (unknown) magnetic field structure

• Seems to trace motion correctly even in very faint and noisy regions (see below)

• Computationally expensive, SCW crucial

Quantitative Coronal tomography

• Coronagraph data, extended distances from Sun

• Tomography gives distribution of plasma

• Computational resource demanding

• 14 years of data => 10,000 tomography maps as output

• SCW crucial for this large processing effort

Quantitative Grid-SITES (PhD: James Pickering)
• Inversions of input data to gain temperature maps
• Core inversion method developed at Aberystwyth
• Slow (~1000 pix/s). 16Mpix image would take ~5 hours!
• Developed new method to grid input data.
• Factor of ~100 improvement in efficiency (16Mpix in 3 minutes).
• Greater efficiency factors for multiple observations (16Mpix in ~10 seconds).
• SCW will be crucial to process large datasets
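The timings quoted for Grid-SITES follow from simple arithmetic. The sketch below uses only the figures on the slide (16 Mpix, ~1000 pix/s, a factor of ~100, ~10 seconds); the variable names are illustrative, and since the slide's numbers are approximate the results agree only to within rounding.

```python
def processing_time_seconds(n_pixels, pixels_per_second):
    """Time to invert an image at a fixed per-pixel rate."""
    return n_pixels / pixels_per_second

N_PIXELS = 16e6      # 16 Mpix image
BASE_RATE = 1000.0   # ~1000 pix/s for the core inversion method

base_seconds = processing_time_seconds(N_PIXELS, BASE_RATE)
gridded_seconds = base_seconds / 100.0            # factor of ~100 from gridding
implied_factor_for_10s = base_seconds / 10.0      # speed-up needed to reach ~10 s

print(f"pixel-by-pixel: {base_seconds / 3600:.1f} hours")
print(f"gridded:        {gridded_seconds / 60:.1f} minutes")
print(f"~10 s per image implies a factor of ~{implied_factor_for_10s:.0f}")
```

The pixel-by-pixel estimate comes out at about 4.4 hours and the gridded one at under 3 minutes, consistent with the "~5 hours" and "3 minutes" on the slide.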

[Figure panels: temperature maps at 1.6 MK, 2.1 MK, 3.0 MK]

Quantitative Feature detection
• Effort to identify and track different types of regions in the corona
• Based on relative intensities in different EUV channels
• E.g. Active region hot, active region warm, plage, quiet Sun, coronal holes
• Aim to process >10 years of data, SCW crucial

Quantitative Measuring Sunspot Rotation (PhD: Richard Grimes)

• Tracking the dark red pixels allows us to follow the motions of the magnetic field lines within the sunspot.

• This is normally done by creating a contour around the darkest points and fitting an ellipse to it.

• The angle of the major axis to the normal can then be tracked over time to estimate the angular velocity of the sunspot.
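The angle-tracking step can be sketched as below. Note the substitution: instead of fitting an ellipse to a contour as described above, this sketch estimates the major-axis orientation from the second moments of a set of points, which gives the same angle for an ideal ellipse. The synthetic ellipse, the 2 degrees per frame rotation and all function names are assumptions, and the angle is measured from the x-axis rather than from the normal.

```python
import math

def major_axis_angle(points):
    """Orientation (radians, from the x-axis) of the major axis via second moments."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mxx = sum((x - cx) ** 2 for x, _ in points) / n
    myy = sum((y - cy) ** 2 for _, y in points) / n
    mxy = sum((x - cx) * (y - cy) for x, y in points) / n
    return 0.5 * math.atan2(2.0 * mxy, mxx - myy)

def ellipse_points(angle, a=3.0, b=1.0, n=72):
    """Synthetic contour: an ellipse with semi-axes a, b rotated by `angle`."""
    pts = []
    for k in range(n):
        phi = 2.0 * math.pi * k / n
        x, y = a * math.cos(phi), b * math.sin(phi)
        pts.append((x * math.cos(angle) - y * math.sin(angle),
                    x * math.sin(angle) + y * math.cos(angle)))
    return pts

def angular_velocity(angles, dt):
    """Mean rotation rate; differences are wrapped into [-pi/2, pi/2) (axis period pi)."""
    rates = []
    for a0, a1 in zip(angles, angles[1:]):
        d = (a1 - a0 + math.pi / 2) % math.pi - math.pi / 2
        rates.append(d / dt)
    return sum(rates) / len(rates)

true_rate = math.radians(2.0)   # assumed rotation of 2 degrees per frame
frames = [ellipse_points(true_rate * k) for k in range(10)]
angles = [major_axis_angle(p) for p in frames]
rate = angular_velocity(angles, dt=1.0)
print(f"recovered rate: {math.degrees(rate):.3f} degrees per frame")
```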

Figs: Data from HMI instrument on SDO showing a sunspot in false colour white light. Image to the right shows the ellipse fitting method.

Bright-point Detection (PhD: Llyr Humphreys)

• Series of intensity threshold masks e.g. total range %, local median %, absolute values, etc.

• Region identification & labelling

• Apply, for example, same-event-threshold used by Sekse et al. (ApJ), 2012:
  – Find chain of brightenings from frame to subsequent frame(s) manually
  – Determine middle point of brightenings
  – Determine velocity between middle points
  – conclusion = events unrelated if vel ≥ 2; events are related if vel < 2

Novel prediction of flares and CMEs (PDRA: Marianna Korsos)

Magnetic helicity flux calculation across an entire solar active region

[Figure: Solar Active Region; panels: Quiescent active region, Eruptive Active Region; axes in %]

Straightforward application of ML?: look for a pattern in the entire solar AR.

SCW & solar research

• Large datasets spanning decadal timescales

• Advanced analysis methods: computationally demanding

• SCW makes such analysis manageable

• Enables unique science on existing (public) datasets
  – Almost like having exclusive access to new instrument!

• Simple approach => not strictly parallel computing
  – Send ‘chunks’ of data (e.g. 6 months) for processing
  – Processing time reduced to manageable scales e.g. days rather than weeks.

In summary: SCW is central to our research!

Implementing Scalable Bioinformatics Pipelines for the Pathogen Genomics of Tuberculosis

Dr Anna Price
Research Software Engineer, School of Biosciences, Cardiff University
Bioinformatics Software Developer, Public Health Wales

Supercomputing Wales: A New Decade of Supercomputing, 23rd January 2020

Motivation

• Tuberculosis (TB) usually affects the lungs and can potentially be very serious, with estimates of 2 million deaths annually, mostly in developing countries
• Over 2000 patients tested for TB by Public Health Wales per annum
• Better tools are needed for the detection of tuberculosis. Existing academic software has limitations
• Public Health Wales TB service directly feeds into healthcare provision, so software needs to be fast, reproducible, auditable, and automated

Motivation

• Aim is to create a scalable and reproducible pipeline for the Public Health Wales TB service which processes the genomic data of the samples from the sequencing machine, providing quality control and speciation of the samples
• Work is broadly split into two parts: building the main pipeline for the service and creating an additional pipeline which builds a high-quality database for speciation
• High Performance Computing (HPC) is crucial to achieving this
• Work has benefits for both research and healthcare

What is genomics?

• Study of the genomes of organisms
• Sequencing machines read the DNA
• Use bioinformatics to assemble and analyse the structure of the genome

AGTCTTCAGATGACAGTACA
        ||||||||||||
        GATGACAGTACAGGATTATACT
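The overlap drawn above can be found programmatically. The following is a toy sketch using exact string matching on the two reads from the slide; real assemblers handle sequencing errors, reverse complements and millions of reads, so this is only an illustration of the idea. The function names are hypothetical.

```python
def suffix_prefix_overlap(left, right, min_length=1):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for length in range(min(len(left), len(right)), min_length - 1, -1):
        if left[-length:] == right[:length]:
            return length
    return 0

def merge_reads(left, right, min_length=1):
    """Join two reads, writing the overlapping bases only once."""
    overlap = suffix_prefix_overlap(left, right, min_length)
    return left + right[overlap:]

read_a = "AGTCTTCAGATGACAGTACA"
read_b = "GATGACAGTACAGGATTATACT"

overlap = suffix_prefix_overlap(read_a, read_b)
contig = merge_reads(read_a, read_b)
print(overlap, contig)
```

The 12 matching bases correspond to the 12 vertical bars in the alignment.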

Building the main pipeline

Problem:
• Multi-stage workflow which relies on several tools and scripts
• Workflow needs to be reproducible and automated
• Need to run multiple files through the pipeline → should be able to run in parallel

Solution:
• Use Nextflow and containers!

[Figure: example outputs from Krona and FastQC]

Nextflow

• Can produce scalable pipelines from existing software and tools
• Parallelisation is implicitly defined by the process input
• Can be executed on multiple platforms
• Can be partnered with Docker or Singularity containers to create fully self-contained and reproducible pipelines
• DSL2: new syntax which introduces library modules and workflow components

Nextflow Process Example

main.nf:

    // Channels
    Channel
        .fromFilePairs( params.reads )
        .into { read_pairs_channel }

    process fastqc {
        // Input
        input:
        set sample_id, file(reads) from read_pairs_channel

        // Output
        output:
        file("fastqc_${sample_id}") into fastqc_channel

        // Script
        script:
        """
        mkdir fastqc_${sample_id}
        fastqc -o fastqc_${sample_id} -f fastq -q ${reads}
        """
    }

Nextflow Process Example – DSL2

Processes in module scripts (fastqc.nf):

    process fastqc_proc {
        input:
        set sample_id, file(reads)

        output:
        file("fastqc_${sample_id}")

        script:
        """
        mkdir fastqc_${sample_id}
        fastqc -o fastqc_${sample_id} -f fastq -q ${reads}
        """
    }

Workflow calls modules (main.nf):

    // import modules and define input channels
    include "./modules/fastqc.nf"
    read_pairs_channel = Channel.fromFilePairs( params.reads )

    // can split workflow into components
    workflow fastqc {
        get:
        reads
        main:
        fastqc_proc(reads)
    }

    workflow {
        main:
        fastqc(read_pairs_channel)
    }

Containers

• Docker and Singularity
• Container engines can be used to deploy software packages in a lightweight, standalone, and reproducible environment
• Software is packaged with its dependencies and is isolated from the host machine
• Software runs uniformly regardless of infrastructure
• Easily integrated with Nextflow – input files are automatically mounted to the container

Using Docker with Nextflow

Build the Docker images needed for the workflow (Dockerfile):

    FROM ubuntu:16.04

    RUN apt-get update && apt-get install -y gcc wget \
        && wget 'https://github.com/marbl/Mash/releases/mash-Linux64-v2.1.1.tar' \
        && tar -xf mash-Linux64-v2.1.1.tar \
        && mv mash-Linux64-v2.1.1/mash /usr/local/bin \
        && rm -r mash-Linux*

    CMD ["/bin/bash"]

Define Docker containers for each process in config file (nextflow.config):

    process {
        withName:MashProcess {
            container = 'mash:latest'
        }
    }
    docker {
        enabled = true
    }

Dockerfile → Docker Image → Docker Container

Creating a high-quality custom Mycobacterium database for speciation

• “What species do I have in a sample?” is a fundamental question for clinical diagnostics
• The speciation of Mycobacterium samples is important in the detection of diseases such as TB, which is primarily caused by the species Mycobacterium tuberculosis
• Taxonomic sequence classification is achieved using software packages such as Kraken, Centrifuge, and Mykrobe which use reference databases for speciation
• Built a custom high-quality Mycobacterium database from scratch and compared its performance with the databases provided by the software packages

Creating a high-quality custom Mycobacterium database for speciation

• 10,070 Mycobacterium genomes (~ 5 TB) were downloaded from the ENA
• The genomes were assembled using Shovill
• Using Mash the pairwise distance between samples was calculated and used to identify poor quality assemblies
• Left with 5,028 high-quality assemblies
• Database was built using Kraken2

[Flowchart: 10,070 Mycobacterium genomes were downloaded from the ENA → Reads were assembled using Shovill → Using Mash, the pairwise distance between the samples was calculated and a distance matrix was built → 5,028 high-quality assemblies were selected using distance matrix → Database built on a 700GB, 48 CPU machine using 24 threads]
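The slides do not state how the distance matrix was used to pick out poor-quality assemblies. One plausible rule, shown here purely as an assumption, is to flag any assembly whose median distance to all the others exceeds a threshold. The assembly names, the distance values and the threshold below are invented for illustration.

```python
from statistics import median

def flag_outliers(names, distances, max_median_distance):
    """Return assemblies whose median distance to all others exceeds the threshold."""
    flagged = []
    for i, name in enumerate(names):
        others = [distances[i][j] for j in range(len(names)) if j != i]
        if median(others) > max_median_distance:
            flagged.append(name)
    return flagged

# Hypothetical symmetric pairwise distance matrix for four assemblies.
names = ["asm_A", "asm_B", "asm_C", "asm_D"]
distances = [
    [0.00, 0.01, 0.02, 0.30],
    [0.01, 0.00, 0.01, 0.31],
    [0.02, 0.01, 0.00, 0.29],
    [0.30, 0.31, 0.29, 0.00],
]

poor = flag_outliers(names, distances, max_median_distance=0.1)
kept = [name for name in names if name not in poor]
print("flagged:", poor)
print("kept:   ", kept)
```

Using the median rather than the mean keeps a single distant neighbour from condemning an otherwise typical assembly.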

Heatmap for sensitivity of speciation of Mycobacterium tuberculosis

• Choice of database for speciation is crucial!

Composition of the custom Mycobacterium database

• Dominated by Mycobacterium tuberculosis
• Several species missing or underrepresented
• Want to automate the process of building a custom database to allow new samples to be easily added

How to automate the process of building a custom database?

• Use Nextflow and containers!
• Create custom tools which automate the process of downloading genomes, mapping samples to their taxon, building a distance matrix etc.
• Develop algorithm to select high quality samples
• Should be self-correcting, i.e. over time as more samples are added to the database old poor quality samples will be removed

Summary

• Can create scalable and reproducible bioinformatics pipelines using Nextflow and containers
• Choice of database for speciation is crucial
• Created a high-quality custom Mycobacterium database for the detection of tuberculosis which showed great improvement over other available databases
• Benefits for both research and healthcare

Acknowledgements

• Dr Matthew Bull
• Dr Thomas Connor
• Dr Sally Corden
• Dr Matthijs Backx

Simulating Complex Fluid-Structure Interaction on Supercomputers

Dr Chennakesava Kadapa
SA2C, Swansea University, UK.

A New Decade of Supercomputing
SCW Midpoint Conference, 23-24 January 2020

What is Fluid-Structure Interaction?

• Interactions of fluid(s) and solid(s)
• A multi-physics phenomenon
• Abundant in nature
  – Cardiovascular flows
• Occurs in many areas of engineering
  – Aerospace: Aircraft, parachutes, rockets
  – Civil: Bridges, dams, cable/roof structures
  – Mechanical: Automobiles, turbines, pumps
  – Naval: Ships, off-shore structures, submarines

Why should we model FSI?

Tacoma Narrows Bridge (1940)
Collapsed 4 months after its opening

What makes FSI modelling difficult?

[Diagram: Solid solver and Fluid solver, linked by Coupling]

Can we solve all the FSI problems if we use the best available fluid and solid solvers?

No. The devil is at the interface.

Troubles with the interface

• Highly nonlinear equations
• The standard approach is to solve everything together
• Disadvantages
  – Computationally expensive
  – Can’t use black box solvers
  – Not possible with many approaches for FSI
  – Load balancing issues for parallelisation

Interface issues – a solution

Meshing - issues
Meshes aligned with the solid boundary
Finite Element or Finite Volume schemes for the fluid problem

http://bloodhound1.efar.co.uk/project/car/aerodynamics

How to deal with moving solids?

Body-fitted meshes
When the solid moves, the surrounding fluid mesh also moves. This has serious implications.

[Figure panels: Large deformations; Solid-solid contact]
Significantly large deformations. Re-meshing. Hard to parallelise.

Unfitted/immersed methods

• Solids immersed/embedded on fixed grids
• Advantages
  ✔ No need for body-fitted meshes
  ✔ No need for re-meshing
  ✔ Complex FSI problems can be solved
  ✔ Relatively easy to parallelise
• Disadvantages
  ➢ Need to develop a fluid solver

The FSI framework

• Fluid solver
  – Grids with local refinement capabilities
  – Nitsche method
  – Ghost-penalty
• Solid solvers
• PETSc for parallel matrix algebra
• VTK for parallel post-processing
• Staggered coupling scheme
• Second-order accurate schemes

Benchmarking the fluid solver
Flow past a fixed circular cylinder

Benchmarking the fluid solver
Scalability studies. Level-2 mesh. Time in milliseconds.

Benchmarking the FSI solver
Aeroelastic flutter of a bridge deck

Fluid DOF ~= 172,000

All the high-frequency modes are captured well

Variety of complex FSI simulations
Valve with flexible components

• About 2 million DOFs
• About a day on 20 processors
• Successful completion of the simulation itself is a huge success

Summary and outlook

• Has been a very challenging project to bring so many ingredients together
• Successfully developed a parallel framework for FSI
• Successfully simulated complex FSI problems
• Would not have been possible without HPC
• Improve parallel performance to achieve scalability up to 1000 processors
• Applications in biomedical, off-shore structures and energy harvesting

Opportunities and challenges of equality and diversity in HPC

What is EDI?

Equality: associated with the elimination of unlawful and unfair discrimination against particular groups. Equality = a state of being equal.

Diversity: based upon the concept of recognizing, respecting and valuing difference.

Inclusion: about creating and promoting an environment welcoming to everyone.

Equality protects us all… Diversity reflects us all… Neither sufficient without Inclusion.

Who does the law protect?

There are 9 protected characteristics specified under the Equality Act 2010

Unconscious Bias

• Everyone has biases - some are of a subconscious nature
• Filters through to how we view, interpret and treat other people
• Try to increase awareness of yours – take practical action
• Unconscious Bias is not an excuse!

[Royal Society Video]

Micro-inequities

• Small, ephemeral behaviours that are hard-to-prove and frequently unrecognized by the perpetrator

  – Interrupting a person, particularly if you don’t do it with others
  – Repeating what someone else has just said in a meeting as if it was your own idea
  – Managers encouraging some colleagues in their work but not others whose achievements are equally relevant
  – Conducting meetings or work social events in environments that would automatically exclude some colleagues

• They have a cumulative effect that is often serious and harmful to the work environment

Cardiff University – Our work

• Disclosure Response Team – Provides support on harassment and hate crimes. Introducing a restorative approach to resolve issues.

• Appointment of a Dean of Equality, Diversity and Inclusion – provides the link between Academic staff, the Student and Professional Services

• Race Equality Panels – An expert panel for incident support and a group that meets to encourage discussion about Race

• Focussed work on narrowing the Attainment Gap

• Improving the inclusivity of our curriculum

• The University’s Strategic Equality Plan 2020-24 is due to be published.

Our Culture and Values

We are committed to equal pay, treatment and opportunity, to supporting diversity and creating an open and inclusive community.

Research funding bodies

– They are increasingly scrutinising our research culture and environment
– They are interested in the efforts a specific research group is making to create an inclusive working environment
– Some require us to report any proven cases of harassment/bullying within the research group

[Royal Society Video – Group Decision Making]

Research Environment – An example of our challenge

• Women were less likely to be selected - 52% of the eligible pool of women compared with 67% of the eligible pool of men. This gap widened with age
• BAME academics were less likely to be selected - 53% of the eligible BAME ethnicity group compared with 64% of the eligible “white”
• Women were also less likely to be selected for the Main Panels

How is this affecting our students?

• Not to be disadvantaged or experience negative behaviour for a reason relating to your protected characteristic
• To study/live in an environment that allows you to ‘be yourself’ and be open about your identity and needs
• To be better prepared for employment
• Recognise harassment / bullying and understand how to raise issues or support others

Freedom of Speech and Academic Freedom

• Universities have a legal duty to proactively promote and protect Freedom of speech
• This allows for lawful, legitimate criticism or debate of issues, ideas and materials for academic purposes.
• However those exercising freedom of speech must not breach other laws, for example relating to harassment and incitement to hatred, in the way ideas are delivered.

What you can do

• You are leaders in science and business
• Recognise when you find yourself in a position of power or influence
• Listen to others and do due diligence on your conscience
• Speaking about the behaviour of others does not need to be an aggressive or defensive action

Equality does matter…..

Consider what equality means to you

Treat everyone with fairness and aim to be non-judgemental and respectful

Work together, share ownership and responsibility for implementing equality

The role of the Research Software Engineer

Alys Brett (UKAEA) @alysbrett
Simon Hettrick (SSI) @sjh5000

Research is changing

[Figure: 21st century research skills: project management, administration, domain knowledge, outreach & community, research practice & ethics, data management, budget management, research policy, software engineering, people management, intellectual property. Further labels: reliability & reproducibility, skill level]

Modern research is team research

New research roles

Research Software Engineer

[Figure labels: Software Engineer; Comp. researcher; Researcher]

Problem solved!

Research systems (computing, experimental control…)

INSTITUTIONAL CHANGE IS HARD

LONG-ESTABLISHED RESEARCH CULTURE

RESEARCH IS NOT UNIFORM

What about software?

210,000 UK researchers
92% use software
69% fundamental to research

Time to hire some software developers

[Figure: the 21st century research skills diagram repeated (project management, administration, domain knowledge, outreach & community, research practice & ethics, data management, budget management, research policy, software engineering, people management, intellectual property), with a scale from “Beginner” to “Professional”]

Publications & funding

Time to rethink research roles

Research Software Engineer

Analyst Developer Analyst Programmer Analyst Programmer - SITS (x 3) Analyst/Programmer Applications Developer Applied Scientist Architectural Developer Assistant Data Programmer Assistant Project Manager Atmospheric Correction and Radiative Transfer Model Scientist Audio Software Developer - KTP Associate Bioinformatician Bioinformatician In Potato

Genomics and Genetics Bioinformatician/Computational Bioscientist in Microbiology Bioinformaticians Bioinformatics Analyst Bioinformatics Postdoctoral Researcher Bioinformatics scientist Biometric Software Systems Developer Biorespository Software Developer C++ / 3D Graphics Software Engineer Casebooks Project Editor (Research Assistant/Associate) Climate Researcher (Research Associate) Clinical Study Programmer CoMPLEX Research Associate Computational Biologist / Bioinformatician Computational

Scientist Computational Scientist in Computational Fluid Dynamics & Industrial Applications Computational Scientist in Structural Mechanics and Industrial

Applications Computer Scientist Computer Vision Researcher Content Developer/Programmer Control Engineer-IMG - 3 posts CREATe Data Specialist Data Analyst Data Integration Coordinator Data Manager x3 Database and Software Engineer Database Manager/Researcher Database Programmer Digital Media Technician E-Learning Portal Manager (KTP Associate) e-Learning Systems DevelopmentResearch Analyst e-Learning Systems Development Software Analyst (Moodle, SQL) E-Learning Web Developer E-Portfolio Learning Technologist Embedded Systems Engineer Engineering Technician Environmental Scientist EPSRC Studentship on Algorithmic Construction of Finsler-Lyapunov Functions Experimental Officer in Bioinformatics Gaia Alerts Software Developer Gaia Software Developer (Gaia UK Team) GIS Applications Specialist Graduate Programmer / Software Developer Graphics Programmer Health Data Manager / Scientist High Throughput Bioinformatician High Throughput Sequencing Bioinformatician (Two posts) HIVE Manager/ HIVE Co-ordinator HIVE Senior Researcher and Technical Lead Hydro-informatics Scientific Software DeveloperEngineer Image Analysis Manager for Cancer Imaging Information Systems Developer Instrumentation Engineer Investigator Statistician IT Developer IT Services (Unix / Windows Systems) Knowledge Transfer Partnership (KTP) Associate: Innovent Technologies LTD Knowledge Transfer Partnerships (KTP) Associate - Software Developer KTP Associate - Robot Vision Scientist (Research Fellow) KTP Associate (Fixed Term Contract for 24 months) KTP Associate (Precision Agriculture Data Analyst) KTP Associate – Graduate Web Leicester Respiratory BRU IT Developer Linguist / Psycholinguist Maker Space Technician Marie Curie Early Stage Researcher Marie Curie Early Stage Researcher in Radar Rainfall for Integrated Water Quality Modelling Marine Earth Observation Scientists Medical Statistician Medical Statistician/Senior Medical Statistician Metrology Engineer Mobile Application Developer NAtoral 
[Text overlaid on the list of job titles: “...better than the 194 other choices”]

Researcher in the application of Digital Technology Post-Doctoral Research Assistant in Simulation and Visualization Post-Doctoral Research Associate Post-Doctoral Research Associate (Pathogen Genomics) Post-Doctoral Research Fellow Postdoctoral Fellow - population genetics / evolutionary genetic Postdoctoral Fellow in Bioinformatics Postdoctoral Fellow in Cancer Therapeutics Postdoctoral Research Assistant Postdoctoral Research Associate Postdoctoral Research Fellow Postdoctoral Research Scientist Postdoctoral Researcher in Declarative (Logic and Functional) Programming Postdoctoral Researcher Postdoctoral Scientist Postdoctoral statistician Postdoctoral

Training Fellow - Statistical and Computational Genetics of Autism Principal / Senior Bioinformatician Principal Bioinformatician Product Development Engineer (Rail) Publishing Portal Web Developer Radio Frequency Engineer Reader in Computer Science Reporting Analyst Research (Software) Engineer Research Assistant Research Associate Research Fellow Research Image Data Manager, Biomedical Engineering Research Officer Research Officer – Social Protection Research postgraduate Research Programmer Research

Scientist Research Scientist / Senior Research Scientist Research Scientist in Machine Learning and Computer Vision Research Software Developer Research Software Developer for the Herchel Smith Data Manager Senior Database Administrator Senior IT Developer Analyst Senior Mathematical Modeller Senior Media Developer Senior

Representation for RSEs

Changing the world one academic role at a time

UK RSE: Grassroots movement to professional society

Milestones, in timeline order: term “RSE” coined; first RSE group; first RSE workshop; UK RSE founded; first RSE fellowships; first RSE conference; Society of RSE founded

Years marked on the timeline: 2012, 2013, 2014, 2015, 2016, 2019, 2020

Growth of the community

Over 1400 on mailing list

1800 members ~300 active / week

The Society of Research Software Engineering

Open for membership society-rse.org

250 full members and growing

RSECon 16: 204 attendees
RSECon 17: 224 attendees
RSECon 18: 334 attendees
RSECon 19: 360 attendees

INSTITUTIONAL CHANGE IS HARD

SUCCESS FOR RSE IS SUCCESS FOR RESEARCH

COMMON AIMS LOCAL TACTICS

GROUPS & COMMUNITIES

Benefits of RSE groups

For RSEs, for research projects and for researchers:

✔ Stable careers
✔ Peer group
✔ Recognition & development
✔ Focus for wider network
✔ Flexible access to expertise
✔ Sharing between projects
✔ Access to niche skills
✔ Help & advice
✔ Training
✔ Infrastructure

The growth of RSE groups

2013: 1 group at UCL

2020: 28 groups across the UK

Average annual growth rate ~30%

RSE leaders peer support network

Aspiring RSE Leaders Workshop (2019)

Providing new and aspiring RSE leaders in the UK with the knowledge and network needed to drive the expansion of RSE groups

Slides: https://drive.google.com/drive/folders/1U9KdDB3MLDGEkpWpKmHAELdUYDKrB_iy

International RSE leaders workshop (2018)

Bringing together RSE leaders from around the world to help each other improve access to software expertise in research by pooling knowledge, coordinating efforts and establishing collaboration.

Read more at researchsoftware.org

National RSE Associations

Nordic Countries: www.nordic-rse.org

Netherlands: nl-rse.org

Germany: www.de-rse.org

USA: www.us-rse.org

South Africa: https://rsse-africa.sanbi.ac.za/

Australia & New Zealand: https://rse-aunz.github.io

ESTABLISHING RECOGNITION

PERSISTENCE

Builds its own momentum

What next?

>15,000 submissions based on publications
38 submissions based on software

Focus on three areas

1. Policy to recognise and reward software

2. Funding to extract value from software

3. Environment for world-leading skills in software

Links

● National software survey: https://www.software.ac.uk/blog/2017-09-06-journey-reproducibility-excel-pandas
● The Society of Research Software Engineering: https://www.society-rse.org
● RSE Groups: https://society-rse.org/community/rse-groups/
● The UK’s research and innovation infrastructure: opportunities to grow our capability: https://www.ukri.org/files/infrastructure/the-uks-research-and-innovation-infrastructure-opportunities-to-grow-our-capacity-final-low-res/

Thank you!

@alysbrett, @sjh5000

Slides at: http://bit.ly/RSEsSuperWales20

Acknowledgements

© Alys Brett & Simon Hettrick. These slides are licensed under a Creative Commons Attribution 4.0 International licence: https://creativecommons.org/licenses/by/4.0/

Image credits

● RSE map: created with mapchart.net

● Code: Photo by Markus Spiske on Unsplash

● Laptop: Photo by Kari Shea on Unsplash

● Queen’s College, Oxford: By Kaofenlio - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=35804568

● Road: Photo by Ryan Stone on Unsplash

SUPERCOMPUTING WALES

Timothy Lanfear, Director, Solution Architecture EMEA

CHALLENGES HPC CUSTOMERS ARE FACING

• End of Moore’s Law – A return to Computer Science
• Traditional vs New Workloads - The rise of DL and hybrid workloads
• Top 500, Developers, Utilization - Improving utilization and maximizing investment
• Deploy and Scale – Complexity of integrating new workloads

There must be a return to applications. The free ride of the technology curve is over!

THE NEW HPC

Accelerating ML, DL, HPC and Visualization Workflows

MACHINE LEARNING: RAPIDS | H2O | more
DEEP LEARNING: TensorFlow | PyTorch | more
HPC: NAMD | GROMACS | +600 More
VISUALIZATION: ParaView | IndeX | more

INTERSECTION OF HPC & AI

HPC
> Algorithms based on first principles theory
> Proven models for accurate results

AI
> Neural networks that learn patterns from large data sets
> Improve predictive accuracy and faster response time

SPEEDING PATH TO FUSION ENERGY: 90% Prediction Accuracy; Publish in Nature April 2019
EXASCALE WEATHER MODELING: Tensor Cores Achieved 1.13 EF; 2018 Gordon Bell Winner
IDENTIFYING CHEMICAL COMPOUNDS: Orders Of Magnitude Speedup; 3M New Compounds In 1 Day
O&G FAULT INTERPRETATION: Time-to-solution Reduced From Weeks To 2 Hours

NVIDIA DATA CENTER PLATFORM

Single Platform Drives Utilization and Productivity

[Figure: the NVIDIA data center platform stack, top to bottom]

CUSTOMER USE CASES
– Consumer internet & industry applications: Speech, Translate, Recommender, Healthcare, Manufacturing, Finance
– Scientific applications: Molecular Simulations, Weather Forecasting, Seismic Mapping
– Virtual graphics: Creative & Technical, Knowledge Workers

APPS & FRAMEWORKS: Amber, NAMD, +600 Applications

CUDA-X & NVIDIA SDKs (machine learning, deep learning, HPC, virtual GPU): cuDF, cuML, cuGRAPH, cuDNN, CUTLASS, TensorRT, OpenACC, cuFFT, vDWS, vPC, vAPPS

CUDA & CORE LIBRARIES - cuBLAS | NCCL

TESLA GPUs & SYSTEMS: TESLA GPU, NVIDIA DGX FAMILY, NVIDIA HGX, EVERY OEM, EVERY MAJOR CLOUD

APPLICATION AND DEVELOPER RESOURCES

More GPU applications, higher system utilization, higher throughput

AI-POD, GPU Hackathons, DLI, NGC, Application Experts and Solution Architects

2019: A YEAR OF RAPID GROWTH

[Figure: four charts]

– 25% MORE TOP500 SUPERCOMPUTERS: NVIDIA share of new Top 500 systems, SC'16 to SC'19 (labelled values 6%, 24%, 34%, 41%)
– 50% GROWTH OF NVIDIA DEVELOPERS: 800K (2018) to 1.2M (2019), +50%; CUDA downloads 8M (2018) to 13M (2019), +60%
– 600+ CUDA APPS: CRYOSPARC (Cryo), FUN3D (CFD), GROMACS (Chemistry), MICROVOLUTION (Microscopy), PARABRICKS (Genomics), WRF (Weather)
– MORE PERF SAME GPU: AMBER, CHROMA, LAMMPS, MILC, NAMD, Quantum Espresso, SPECFEM 3D (labelled values 25X and 40X)

ORNL Summit: World’s Fastest, 27,648 GPUs | 149 PF
LLNL Sierra: World’s 2nd Fastest, 17,280 GPUs | 95 PF
Piz Daint: Europe’s Fastest, 5,704 GPUs | 21 PF
ABCI: Japan’s Fastest, 4,352 GPUs | 20 PF
ENI HPC4: Fastest Industrial, 3,348 GPUs | 21 PF

ANNOUNCING NVIDIA SERVER REFERENCE DESIGN FOR ARM

NVIDIA Collaborating with Arm, Ampere, Cray, Hewlett Packard Enterprise, and Marvell

Defining a Server Reference Platform

Enabling a Range of GPU-Accelerated Servers for: • Hyperscale-Cloud to Edge • Simulation to AI • High-Performance Storage to Exascale Supercomputing

CUDA on ARM Beta Now Available

ANNOUNCING NVIDIA MAGNUM IO

GPU-Accelerated I/O and Storage Software to Eliminate Data Transfer Bottlenecks for AI, Data Science and HPC Workloads

High-Bandwidth, Low-Latency Massive Storage Access with Lower CPU Utilization

Delivers up to 20x faster data throughput on multi-server, multi-GPU computing nodes


HPC – The Future

Steve Smith, HPC EMEA

Dell EMC PowerEdge R7525

Unprecedented performance

Available February 2020

HIGHLY ADAPTABLE 2-SOCKET ROME-ON-ROME RACK SERVER GIVES POWERFUL PERFORMANCE AND FLEXIBLE CONFIGURATION

DATA ANALYTICS: Maximized storage and memory configuration option enables HPC, ML/DL/AI and rendering
ALL FLASH SDS: 24 direct connect Gen4 NVMe supports all flash vSAN Ready Node
HPC/VDI: Balanced core count and GPU support for maximum numbers of end users

Graphcore and Dell – Launch Platform

What’s Going On?

Compute-Memory-IO balance is degrading

[Figure: Normalized Properties of Typical Server Processors, 2012 to 2019; series: Cores, Pins, DDR Channels, PCIe Lanes; annotations: 64+ cores, >4000 pins, 8 DDR channels, 64+ PCIe lanes]

[Figure: Memory & I/O Bandwidth Capacity per Core (GB/s), 2012 to 2019; series: DRAM Bandwidth/Core, PCIe Bandwidth/Core; annotation: Added DDR Channels gave a bump in 2017, but core count growth offsets]

Processor memory and I/O technologies are being stretched to their limits

Driving platform implications – Physics is NOT our friend

CPU Pin Count

CPU TDP

Liquid required region?

Compute IO & Memory needs driving Power & Size

Rack Power Trends 2020-2021

[Figure: Rack Power Limit Trends; percentage axis from 0% to 40%; rack power bands <3KW, 3-5KW, 6-10KW, 11-15KW, 16-20KW, >20KW; series: legacy, current, near future]

The Laws are NOT always on our side

• Frequency
  – Stalled
• CPU
  – Small IPC gains
  – Moore’s law vs Amdahl’s law
• Physical Realities
  – IO and memory limitations
• Software
  – Wirth’s law is NOT stalled
  – Parkinson’s law applies still
• Performance
  – Challenged by most laws

Addressing the Challenges (Some Areas of Focus)

• Domain Specific Architectures
• Edge Computing
• Single Socket Servers
• Server Disaggregation
• Memory Interconnects
• Memory Centric Architecture
• Virtualisation
• Power…Power…Power

What Can We Do About It?

Rise of Domain Specific Architectures

Where and how we spend transistors is changing

[Figure labels: Computational Storage; Silicon Process Parity; Storage Class Memory; FPGAs; GPUs/IPUs/ASICs; CPU µArch Competition Reemerges; New Smart NICs Offloads]

Memory and Storage are Converging

With memory/storage convergence, memory semantic operations (volatile & non-volatile) become predominant

[Figure: axis “Time”, with a marker at “Current Time”]

Memory Centric Architecture

Why MCA: Memory domain too claustrophobic

MCA Definition: Large tierable memory system with one physical address space accessed by the CPU using native load/store instructions – NO Protocol Stacks – Required for FULL Disaggregation
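As a loose software analogy for the load/store access in the definition above, and not an implementation of MCA (which is a hardware architecture), the sketch below maps a file into a process's address space with Python's standard `mmap` module. The data is then read and written with ordinary indexing instead of explicit read/write calls. The function name, the temporary backing file and the sizes are illustrative assumptions.

```python
import mmap
import os
import tempfile


def load_store_demo(size: int = 4096) -> bytes:
    """Write and read a file-backed region using memory-style access."""
    # Hypothetical backing file standing in for a slower memory/storage tier.
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)          # reserve `size` bytes in the file
        with mmap.mmap(fd, size) as region:
            region[0:5] = b"hello"      # "store": plain slice assignment
            first = region[0:5]         # "load": plain slice read
            region.flush()              # push dirty pages to the backing file
        # Check through the block-style path that the store reached the file.
        os.lseek(fd, 0, os.SEEK_SET)
        if os.read(fd, 5) != first:
            raise RuntimeError("mapped write did not reach the backing file")
        return first
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(load_store_demo())
```

The point of the analogy is only that the application code touches the data as memory; the operating system, rather than a protocol stack in the application, moves it to and from the backing tier.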

[Figure: today’s tiers. Block domain: SAN HDD, Local HDD, SAN SSD, Local SSD. Memory domain (L/S): DRAM]

Latency critical applications vying for limited low latency memory

Today’s Memory Tier

Present: DRAM

What we need…

Future: HBM, DRAM Fast, DRAM Slow, SCM Fast, SCM Slow

And then storage…

New Memory Interconnects

What is CXL?

• CXL is an alternate protocol that runs across the standard PCIe physical layer
• CXL uses a flexible processor port that can auto-negotiate to either the standard PCIe transaction protocol or the alternate CXL transaction protocols
• CXL usages expected to be key driver for an aggressive timeline to PCIe 5.0 etc.
• CXL has three sub-protocols:
  – CXL.io – discovery and configuration
  – CXL.cache – device access to processor memory
  – CXL.memory – processor access to device memory

[Figure labels: x16 PCIe Card; x16 CXL Card; x16 Connector; PCIe channel; SERDES; Processor]

Representative CXL Usages

What is Gen-Z

Memory Semantic Fabric

• Open
• High-performance
• Reliable
• Secure
• Flexible
• Compatible
• Economic

[Figure labels: SoC, FPGA and GPU devices, each with SCM or DRAM; Gen-Z (switch-based / any topology); DEDICATED OR SHARED MEMORY (DRAM, SCM); I/O (Network, Storage)]

Used and modified with permission from “Gen Z The Best Interface for Emerging Memory Technologies” by Valerie Padilla

CXL and Gen-Z

• Gen-Z and CXL are complementary

• CXL will exist as the IN THE SERVER memory centric link
  – CXL is NOT a fabric; it is Host to Device in the node
  – Use Cases: Node Storage Class Memory expansion; Node DRAM expansion; Node Cache Coherent accelerators

• Gen-Z will exist as the IN THE RACK(s) memory centric fabric
  – Gen-Z will exist for Rack level composability and disaggregation
  – Use Cases: Pools of SCM/DRAM; Pools of accelerators

Gen-Z & CXL: In The Future Data Center

[Figure labels: Compute Node; DDR5 Memory Channels; DRAM; Local PM; CPU/SOC; CXL; GPU/FPGA; HBM; CXL Bridge to Gen-Z; Gen-Z Switch; Pooled DRAM/SCM; Pooled Accelerators]

Virtualisation for HPC and DA

Throughput Benchmarks

[Figure: Native to Virtual Ratios (Higher is Better); performance ratio axis from 0 to 1. Bio Codes panel: CLUSTALW, GLIMMER, GRAPPA, HMMER, PHYLIP, PREDATOR, TCOFFEE, BLAST, FASTA. Monte Carlo panel: Run 1, Run 2, Run 3, Run 4]

• Dual-socket, 8-core IVB processor

• 128 GB memory

• Single 16-vCPU VM, 120 GB

• 16 single-threaded jobs run in parallel (one per core)

• Windows Server 2012 R2

MPI Applications

[Figure: two panels of Native to Virtual Ratios (Higher is Better); performance ratio axis from 0 to 1; applications shown include FLUENT, LAMMPS, GROMACS, OpenFOAM, LS-DYNA, STAR-CCM+, NAMD and WRF]

• One panel: each job run with 320 MPI processes (ranks) using 16 hosts (nodes), each running a 20-vCPU VM
• Other panel: each job run with 640 MPI processes (ranks) using 32 hosts (nodes), each running a 20-vCPU VM

Test Cluster Configuration: ESXi 6.5, 32-node cluster, Dell PowerEdge C6320, dual 10-core Haswell, 128GB RAM, 100Gb/s EDR InfiniBand

Virtualizing Hadoop: 30TB TeraSort on vSphere 6

[Figure: ratio of native to virtual elapsed time (axis from 0 to 1.2) for 1, 2, 4, 10 and 20 VMs/host]
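The benchmark slides report a native-to-virtual ratio where higher is better. A minimal sketch of that metric follows, assuming it is native elapsed time divided by virtual elapsed time, as the axis label suggests. The function name and the timings are hypothetical and are not taken from the slides.

```python
def native_to_virtual_ratio(native_seconds: float, virtual_seconds: float) -> float:
    """Return native elapsed time divided by virtual elapsed time.

    1.0 means the virtualised run matched bare metal;
    below 1.0 means the virtualised run was slower.
    """
    if native_seconds <= 0 or virtual_seconds <= 0:
        raise ValueError("elapsed times must be positive")
    return native_seconds / virtual_seconds


# Hypothetical timings in seconds, invented for illustration only.
example_runs = {"job_a": (100.0, 104.0), "job_b": (250.0, 250.0)}
ratios = {
    name: native_to_virtual_ratio(native, virtual)
    for name, (native, virtual) in example_runs.items()
}
```

Under this definition, a value above 1.0, as the 1.2 axis limit allows for, would mean the virtualised configuration finished sooner than the native one.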

Test configuration: 32X Dell PowerEdge R720xd, 2X 10-core Intel Xeon E5-2680v2, 256GB DDR3, 20X 600GB 10K SAS (HDFS), 10GbE Intel X540, ESX 6.0, SLES 11 SP3

Summary

• HPC is expanding to include AI and Analytics – increasing reach
• Physics is getting in the way of progress at the CPU level
• The old way of doing things is not going to be sufficient
• New models are being developed
  – Domain Specific Architectures
  – Edge Computing
  – Composable Infrastructure
  – Memory Centric Architecture
• New technologies are being developed
  – Storage Class Memory & Data Services
  – Memory Interconnects
• Don’t discount virtualization

Thank You

FUTURE TECHNOLOGIES AND ROADMAPS

VERNEGLOBAL.COM

HPC: A JOURNEY IN THE CLOUDS

True Cloud HPC

HPC IS EVOLVING

• TRADITIONAL HPC
  – Big iron compute
  – Supercomputer centers
  – Multi-year budgets & procurement
  – Research focus
  – Timeshare type operations
• NEW HPC
  – Scalable server hardware compute
  – GPU augmentation
  – Cloud native
  – Often distributed
  – Segmented for users

AI PRODUCTS NOW DRIVING HPC

• “Alexa, who has the fastest ImageNet training?”
• “Alexa, why is there such a drive for more compute power?”

2015

November 2017

THE HPC EVOLUTION

• Large fundamental science no longer the catalyst

• Traditional HPC applications modest compared to AI training

• Production AI products becoming the compute monsters WHAT WE SEE NOW

• 77% of enterprises have at least one application or a portion of their enterprise computing infrastructure in the cloud – IDG

• Popular cloud compute tasks aren’t HPC

• Popular cloud compute applications aren’t HPC either

• Hyperscale Clouds are A LOT of compute
• But is it really genuine HPC?
• One size doesn’t fit all

COMMON HYPERSCALE CLOUD ATTRIBUTES

• To scale for generic compute tasks, the hyperscale clouds need generic infrastructure to be competitive and offer excellent ROIs:
  – Virtual machine environments
  – Ethernet connectivity
  – No physical location limitations
  – Limited specialist hardware

• Some resulting compromises
  – Data traffic jams
  – No HPC support
  – Pricing transparency

FUTURE TREND FOCUS: “URGENT HPC”

“URGENT HPC”: HPC in the cloud for real-time, data-intensive emergency decision-making with strict time constraints
• Responding to accidents, weather conditions and environmental disasters, e.g. tsunamis
• GPUs have now allowed simulations to be carried out “Faster Than Real Time” (FTRT)

URGENT HPC & THE UGLY TRUTH

• Urgent HPC is costly

• Needs to be scalable

• Compute heavy

• Hyperscalers are not eco friendly

TrueHPC CLOUD SOLUTION: hpcDIRECT

• 50% cheaper and 30% faster than hyperscalers

• Tailored for your HPC tasks

• Expert full life-cycle support

• No need to over-provision: build your on-prem cluster for your day to day run rate, move peaks to a sustainably sourced cloud

• Not just energy smart, source hardware sustainably - EcoCompute

hpcDIRECT: SUSTAINABLE CLOUD COMPUTING

• Using 100% renewable energy from Iceland

• Free cooling 365 days per year

• Lowest carbon footprint

CUSTOMER FEEDBACK

EXPERIENCE trueHPC WITH VERNE GLOBAL

SIMONE WARREN Vice President - Global Customer Strategy [email protected]

Training the future leaders in AI and advanced computing

Gert Aarts

UK Government investment in AI

• Investment in AI played dominant role in Industrial Strategy 2017
• One of four Grand Challenges: AI & Data Economy, Future of Mobility, Clean Growth, Ageing Society
• Relevance for academic research and training:
  – Turing AI Fellowships
  – Turing AI Acceleration Fellowships
  – Turing AI World-Leading Fellowships
  – AI Masters
  – UKRI Centres for Doctoral Training in AI

Wider context:
  – AI Sector Deal
  – Office for AI
  – Alan Turing Institute
  – …

UKRI Centres for Doctoral Training in AI

• 16 CDTs across the UK
• Each CDT trains > 50 students in 5 cohorts (first cohorts started 10/2019)
• 4-year fully funded PhD programmes
• Strong involvement from non-academic partners (industry, data companies, research labs, government partners, …)

– At least 800 new PhD positions between 2019 and 2024
– Funded by UK Research and Innovation (UKRI), across all Research Councils
– £100M investment from Government, additional funding from partners

UKRI Centres for Doctoral Training in AI

16 CDTs: focus areas of research

UKRI CDT in Artificial Intelligence, Machine Learning & Advanced Computing

Consortium between 4 Welsh Universities, and Supercomputing Wales

cdt-aimlac.org @AimlaCcommunity

UKRI CDT in Artificial Intelligence, Machine Learning & Advanced Computing

• Interdisciplinary CDT, revolving around AI, ML and computing
• Research projects linked by use of common methods
• Three research themes:
  – Data from large science facilities (particle physics, astronomy, cosmology)
  – Biological, health sciences (medical imaging, health records, bioinformatics)
  – Novel mathematical, physical, and computer science approaches (data, hardware, software)

T1: data from large science facilities

Particle physics, astronomy, cosmology: example projects

• Dark matter detection at LZ experiment in South Dakota
  – Event reconstruction from raw waveform data

• Real-time gravitational wave detection
  – Identify signals from merging neutron stars and black holes

• Phases of matter from Monte Carlo data
  – Classify phases and symmetry breaking in theories without order parameters

B.P. Abbott et al, PRL 116 (2016) 061102

T2: biological, health and clinical sciences

Medical imaging, health data, bioinformatics: example projects

• Development of breast cancer abnormalities
  – Mammographic morphology, manifold modelling

• Early prediction of cancer from electronic health records
  – Processing of primary and secondary EHRs for Wales

• Tracking and treatment of sexually transmitted diseases
  – Linking of genomic sequence data to population-level health data

A. Oliver et al, Knowledge-Based Systems 28 (2012) 68

T3: novel mathematical, physical, and computer science approaches

Data, hardware, software, algorithms: example projects

• Spiking neural networks for enhanced situational awareness
  – In collaboration with Crime and Security Institute

• Precision and convergence of ML, using HPC and GPUs
  – Linking HPC experience with progress in AI

• Artificial intelligence for immersive analytics
  – Context-aware, context-adaptive and predictive interfaces for AI

2019 cohort

– 13 AIMLAC students
– 3 students on related DI-CDT (Data-Intensive Science, STFC funded)

CDT proposition

• Research project in internationally leading groups, at cutting edge of science, resulting in PhD thesis
• Train PhD students to become fluent in, in our case, AI, machine learning, data science, computing

– Emphasis on computational and transferable skills training, delivered to the cohorts, to enhance wider skills set
– Students from diverse academic backgrounds – computer science, physics, mathematics, data science
– Ongoing engagement, including placements, with external partners to develop and fine-tune expectations from industry

External partners

• Knowledge exchange
• AI in industrial/commercial setting
• Shape student experience
• Equality and diversity
• Responsible innovation
• Student placements

Essential component: cohort training

• Training necessary to bring students to a common level
• Develop cohort-based activities
• Transfer knowledge across disciplines
• Remote team work, transferable skills, cooperation

plays essential role in designing and delivering training and support

Essential component: cohort training

Two-day training sessions at partner universities:

– Induction: September, Cardiff
– Software Carpentry: November, Aberystwyth
– Data Carpentry & Responsible Innovation: January, Bristol
– Data Visualisation: March, Bangor
– Science Conference: June, Swansea

Coding challenge: team work across all 5 universities, using virtual tools

Essential component: cohort training

Induction event, Cardiff, Sept 24-26

Software Carpentry, Aberystwyth, Nov 18-20

• Cohort training co-developed and delivered by SCW, especially the RSEs
• Training adapted to needs of the cohort
• Delivered at different levels if required
• Splendid source of support and technical expertise

A big thank you to Ed Bennett, Mark Dawson, Michele Mesiti and Colin Sauze

Poll on our AIMLAC Slack channel, before the Software Carpentry meeting

Parallel training sessions:
– Introduction to Unix Shell
– High-performance Numpy
– Intro to version control with Git
– Intermediate Git
– Intro to Supercomputing Wales
– Intro to parallel processing

Dense two-day hands-on programme

Coding challenge

• Develop collaborative skills across multiple locations
• Three teams: members from each university and mixture of disciplines
• Task to develop software suite: RSEs play role of customer and technical advisors, meetings organised via Slack

– See poster session for some outputs

SCW played an integral part in the CDT proposal, both with regard to access and support for hardware as well as RSE support and training

Informal feedback emphasised the importance of HPC/RSE facilities

Future

– 2020 cohort currently being recruited, see website cdt-aimlac.org
– Three more cohorts in 2021, 2022, 2023
– Continuing SCW support after 2021
– Extend relationship with external partners
– UKRI: liaise with 15 other CDTs and Alan Turing Institute

Contact

– CDT manager: Rhian Morris, CDT research support officer: Roz Toft
– cdt-aimlac.org, @AimlaCcommunity, [email protected]

And finally, a PhD is not only about coding…

Aberystwyth, November 2019