Piz Daint: a modern research infrastructure for computational science

Thomas C. Schulthess

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 1 source: A. Fichtner, ETH Zurich

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 2 Data from many stations and earthquakes

source: A. Fichtner, ETH Zurich

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 3 Very large simulations allow inverting large data sets to generate high-resolution model or earth’s mantle

20 km 70 km

2.80 3.55 4.05 4.60

vs [km/s] vs [km/s]

source: A. Fichtner, ETH Zurich

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 4 The density of water and ice : rung 5 Ice floats! first early science result on “Piz Daint” (Joost VandeVondele, Jan. 2014)

MP2 liquid water: ~1020 g/l

MP2 ice: Est. ~960 g/l (273K)

high-accuracy quantum simulation produceDoes correct ice results float? for Yes! water

Accurate estimate of the density of liquid water :

the importance of van der Waals interactions ! NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 5 Jacob's ladder in DFT: Jacobs ladder of chemical accuracy (J. Perdew) a hierarchy of models for VXC Full many-body Schrödinger Equation: e2 w(i, j)= ⇥ri ⇥rj Each rung includes more ingredients in the model: | | (H E)(1,...,N )=0 = ⇤r, ⇥ ● increases accuracy { } N 2 N ● 2 1 increasesH = computational+ v(i) demands.+ w (i, j) 2mri 2 i=1 ✓ ◆ i=j=1 X 6 X

Rung 5)Ice Virtual floats withorbitals new “rung 5” simulations using MP2-baseddouble hybrid simulations : RPA / MP2 / GWwith CP2K Rung 4) Occupied orbitals Daint (VandeVondele & Hutter) hyper / hybrid : HF exchange Rung 3) Kinetic energy density Rosa meta : tau Rung 2) Gradients of the density Ice didn’t float with previous simulations using Palü “rungGGA 4” hybrid functionals Rung 1) Density LDA @CSCS Kohn-Sham Equation with Local Density Approximation:

2 ~ 2 + v (⇤r) ⇥ (⇤r)= ⇥ (⇤r) 2mr LDA i i i ✓ ◆

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 6 ModellingThe interactions challenge between of waterwater requires quantum simulations with extreme accuracy Weak interactions dominate properties! Hydrogen-bonding most prominent.

Energy scales Total energy: -76.438.... a.u.. Needed accuracy 0.0001 a.u. total energy: -76.438… a.u.

hydrogenRoughly 99.9999% bonds: accuracy ~0.0001 a.u. ! (or error cancellation) needed! required accuracy: 99.9999%

source: J. VandeVondele, ETH Zurich

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 7 Pillars of the scientific method

Mathematics / Simulation (1) Synthesis of models and data: recognising characteristic features of complex systems with calculations of limited accuracy (e.g. inverse problems) (2) Solving theoretical problems with high precision: complex structures emerge from simple rules (natural laws), more accurate predictions from “beautiful” theory (in the Penrose sense) Theory (models) Experiment (data)

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 8 Pillars of the scientific method

Mathematics / Simulation (1) Synthesis of models and data: recognising characteristic features of complex systems with calculations of limited accuracy (e.g. inverse problems) (2) Solving theoretical problems with high precision: complex structures emerge from simple rules (natural laws), more accurate predictions from “beautiful” theory Note(in the the changing Penrose sense) role of high-performance computing: HPC is now an essential tool for science, used by all scientists Theory(for better or (models) worse), rather than beingExperiment limited to the domain (data) of applied mathematics and providing numerical solution to theoretical problems only few understand

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 9 CSCS’ new flagship system “Piz Daint,” and one of Europe’s most powerful petascale

Presently the world’s most energy efficient petascale !

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 10 dimension. The backplane connects each Aries to each of the (AOCs) operating at 12.5 Gbps per lane. For small systems other 15 Aries within that chassis. electrical cables are used for the blue links. The 4 NICs on each Aries connect to eight router ports for packet injection into the network. When intra-group traffic is uniformly distributed over one dimension, a portion will stay local to each router: one part in 16 for the Green dimension and one part in 6 for the Black dimension. The design of the Cascade group (Fig. 4) ensures that the ratio of intra-group to inter-group bandwidths meets or exceeds the factor of two that is desirable for Dragonfly networks. With 10 optical ports per dimension. The backplane connects each Aries to each of the (AOCs) operating at 12.5 Gbps per lane. For small systems Aries the global bandwidth of a full network exceeds the other 15 Aries within that chassis. electrical cables are used for the blue links. injection bandwidth – all of the traffic injected by a node can be routed to other groups. The 4 NICs on each Aries connect to eight router ports for packet injection into the network. When intra-group traffic is D. System Configurations

uniformly distributed over one dimension, a portion will stay Cascade systems are constructed from a number of 2- Figurelocal 3. to eachA Cascade router: chassis one comprises part in 16 16 four for- node the blades Green with dimension an Aries cabinet groups, each containing 384 nodes. The global links perand blade. one The part chassis in 6 backplanefor the Blackprovides dimension. all-to-all connectivity The design between of thethe from each group are divided into bundles of equal size, with blades.Cascade Each group blade provides(Fig. 4) 5 ensures electrical thatconnectors the ratio used of to connectintra-group to other to one bundle from each group connecting to each of the other chassisinter -withingroup the bandwidths group. Each meetsAries also or providesexceeds 10 the global factor links. of These two thatlinks are taken to connectors on the chassis backplane. groups. The number of inter group cables per bundle is given is desirable for Dragonfly networks. With 10 optical ports per by: Aries the global bandwidth of a full network exceeds the injectionThe second bandwidth dimension, – all ofre ferredthe traffic to as injected the Black by adimension, node can ������ ������ ��� ����� Network Interface Controllers (NICs) and a 48-port high radix implementation,dimension. since each The tile isbackplane identical, and connects produces aeach Aries to each of the (AOCs) operating at 12.5 Gbps per lane. For small systems Hierarchicalrouter. setup of regularCray’s structure forconsists bephysical Cascade routed implementation. of to three other electricalgroups. The router architecture links tiles from each Aries within a ����� ����� ������ ≤ are arrangedother in a 6chassis ×158 matrix. Aries to Forty within its of peer the that tiles, in chassis. eachreferred of to theas other chassis within that electrical cables are used for the blue������ links. − 1 network tiles, connectgroup.D. toSystem the SerDes Configurations which connect devices This number may be reduced substantially in systems that together. Eight of the tiles, called processor tiles connect to The 4 NICs on each Aries connect to eight router ports for Cascade systems are constructedChassis from with a 16 number blades of 2- the four Aries NICs. Four of the processor tiles are shared packetdo not injection require fullinto globalthe network. bandwidth. When One intra advantage-group traffic of is Figure 3. A Cascade chassis comprises 16 four-node blades betweenwith an Aries NIC0 and NIC1,cabinet the groups, other four each are shared containing between 384 nodes. The global links reducing the number of inter-group cables is that it makes the per blade. The chassis backplane provides all-to-all connectivityNIC2 between and NIC3. the Trafficfrom is multiplexedeach group ov erare the dividedprocessor intotiles bundles of equal size, with uniformlysystems easier distributed to upgrade. over one For dimension, example, a a six portion group will (12 -stay blades. Each blade provides 5 electrical connectors used to connecton a packetto other-by -packetone basis. bundle This frommechanism each distributes group connecting the to each of the other local to each router: one part in 16 for the Green dimension chassis within the group. Each Aries also provides 10 global links.traffic These according links to load, allowing NICs to exceed their fair cabinet) system might be initially configured with bundles of are taken to connectors on the chassis backplane. share of bandwidth ongroups. non-uniform The numbercommunications of inter patterns group cables per bundle is given and12 opticalone part cables, in 6 for using the 60Black of the dimension. maximum The of design 240 ports of the such as gather and scatter.by: Cascade group (Fig. 4) ensures that the ratio of intra-group to FigureBlade 1. Awith Cascade 4 bduallade which socket consists of nodes 8 sockets, organized as 4 available. If the system is upgraded to eight groups (16- The dualsecond-socket nodes,dimension, and a single re Ariesferred ASIC. to as the Black dimension, ������ ������ ��� ����� intercabinets)-group then bandwidths new cable meets bundles or exceeds are added, the connectingfactor of two the that consists of three electrical links from each Aries within a ����� ����� ������ ≤ is desirable for Dragonfly networks. With 10 optical ports per Each of the four Aries NICs provides an independent PCI- ������ − 1 new groups together and connecting each of the existing Expresschassis Gen3 tox16 its host peer interface. in each In the of first the Cascade other blade chassis within that Ariesgroups the to the global two bandwidth new ones. Theseof a fullcables network can be exceedsadded to the designgroup. these interfaces connect to four independent dual This number may be reduced substantially in systems that injectionthose in place bandwidth without – the all need of the to rewiretraffic theinjected system. by The a node total can socket Xeon nodes. Each node has a pair of Sandy Bridge or do not require full global bandwidth. One advantage of Ivy Bridge processors and one SouthBridge. There are 8 DDR- benumber routed of to optical other cables groups. required is given by: NIC 0 NIC 1 NIC 2 NIC 3 3 memory channels per node, each with a single DIMM. reducing the number of inter-group cables is that it makes the Memory capacity is up to 128 GBytes per node (Fig. 2). systems easier to upgrade. For example, a six group (12- D. System����� ����� Configurations ������ × (������ − 1) × ������ ⁄ 2 Future Cascade blade designs might include other x86 cabinet) system might be initially configured with bundles of processors and different types of accelerators. SmallCascade and systems medium are size constructed systems can from be upgraded a number with of 2-

DDR3 12 optical cables, using 60 of the maximum of 240 ports DDR3 DDR3 Figure 3. DDR3 A Cascade chassis comprises 16 four-node blades with an Aries cabinetsingle chassis groups, granularity. each containing All groups 384 except nodes. the The last globalmust be links The node issues network requests by writing them across Xeon Xeon Figure 4. XeonStructureXeon of a Cascade electrical group. Each row represents the 16 per blade.available. The chassis If backplane the system provides is upgraded all-to-all connectivity to eight groups between (16the - the host interface to the NIC. The NIC packetizes these Ariescabinets) in a chassis, then with new 4 nodes cable attached bundles to each, are and added, connected connecting by the chassis the ffull.rom Largereach groupsystems are must divided be comprise into bundlesd of a wholeof equal number size, of with blades.backplane Each blade (green provides links). 5 Each electrical column connectors represents used an Aries to connect in one ofto theother six requests with virtual global addresses and issues the packets to new groups together and connecting each of the existing onecabinets bundle or a fromwhole each number group of groups. connecting to each of the other DDR3 DDR3 DDR3 the network. Packets are routed across the network to a DDR3 chassis chassiswithin theof a group. two-cabinet Each group,Aries alsoconnected provides by electrical 10 global cables links. (black These links). links destination NIC. The destination NIC executes the operation Xeon groupsXeon are to taken the to twoXeon connectors newXeon ones. on the These chassis cables backplane. can be added to groups. The number of inter group cables per bundle is given specified in the request packet and returns a response to the those in place without the need to rewire the system. The total by: III. CASCADE ROUTING source. Figure 2. A singleThe Ariesnumber second system -ofon -dimension, aoptical-chip device cables provides re requiredferrednetwork to is asgiven the byBlack: dimension, In a Cascade network request packets are routed from Support for I/O is provided by an I/O blade comprising connectivity for the four nodes on a Cascade blade. source node to destination node.������ Response ������ ��� packets ����� are two single-socket nodes and a single Aries ASIC. Each node consists of����� three ����� electrical ������ links× (������ from − each1) × Aries ������ within ⁄ 2 a ����� ����� ������ ≤ supports two PCI-Express Gen 3 x8 low-profile cards. An I/O C. Networkchassis Packaging to its peer in each of the other chassis within that independently routed back to the source������ node;− there1 is no Small and medium size systems can be upgraded with requirement for request and response traffic to follow the same blade is the same form factor as a compute blade. A chassis Each routergroup. provides both intra-group links that connect it This number may be reduced substantially in systems that can beFigure populated 4. Structure with a mix of aof Cascade compute electrical and I/O group.blades. Each row representsto other routersthe 16 in itssingle group andchassis inter- groupgranularity. links (also All known groups except the last must be path. Packets are routed deterministically or adaptively along Electrical group (2 cabinets) with 6 chassis do not require full global bandwidth. One advantage of CabinetAries in cooling a chassis, is implemented with 4 nodes usingattached side to-to each,-side and air flow,connected as by global the chassis links) that full.connect Larger it to the systems other groups. must The be groupcomprise d of a whole number of either a minimal or non-minimal path. Minimal routing within with airbackplane moving in (green series links). through Each all columncabinets represents in a row. anHeat Aries is in onesize of controls the six the scalabilitycabinets of theor asystem whole and number was influenced of groups. reducinga group willthe number always takeof inter at most-group one cables Green is and that one it makes Black the by many factors. To reduce cost, only electrical links are used extractedchassis via of water a two coils-cabinet embedded group, connected within each by cabinet.electrical Air cables (black links). Figure 5. The global (blue) links connect Dragonfly groups together. In a Source: G. Fannes et al., SC’12 proceedings (2012)within a group. The maximum group size is governed by the systemshop. Minimal easier routing to upgrade. between Forgroups example, will route a minimally six group in (12 - flow is accomplished by six horizontally mounted fans in Cascade largesystem system withIII. these 8 Clinks electricalASCADE are active R OUTINGgroups optical cables. (16 cabinets) separate enclosures placed after every pair of cabinets. maximum reach of the electrical cables, which is in turn cabothbinet) the system source andmight target be initially groups, andconfigured will take with exactly bundles one of limited by the signaling rate.In aRestriction Cascades on network theNorduGrid dimension request 2014, of Helsinki, packets May 20, are 2014 routedT. Schulthess from 11 12global optical (Blue) cables, link. Note using that 60 minimal of the routing maximum implies of a 240direct ports B. Network Topology Groups are connected via optical links, referred to as Blue the mechanical infrastructuresource limitnode the numberto destination of Aries routers node. Response packets are available.route between If thea source system and isa target, upgraded not the to minimal eight groups number (16 - The Cascade network design builds on the per chassis. The numberlinks (Fig. of ports 5) o. nEach the Aries Aries router provides and the 10 of these links for a total independently routed back to the source node; there is no cabinets)of hops required. then new Minimal cable p aths bundles may differ are added, in hop connectingcount if, for the BlackWidow system [8], one of the first systems to take balance between ofinter 960-group per group. and global Blue bandwidth links are also combined into sets of 4 links, advantage of high radix routers [5]. The Cray Dragonfly influenced the choicerequirement of group size. for request and response traffic to follow the same newinstance, groups one togetherrouting path and doesconnecting not require each a G ofree then and/or existing network design provides scalable global bandwidth while whichpath. restrictsPackets theare maximumrouted deterministically size of the system or adaptively to 241 groupsalong High speed, low cost electrical cables are used to construct groupsBlack hop to the in thetwo source new ones. and/or These destination cables groups can be due added to to minimizing the number of expensive optical links. Ideally, a (92,544either anodes) minimal. The or bluenon- minimallinks use path. 12X Minimal Active Opticalrouting withinCables router would have a sufficient number of ports to connect to groups containing up to 384 nodes housed within a pair of thoseplacement in place of the without Blue g lobalthe need link usedto rewire between the groups.system. The total a group will always take at most one Green and one Black all other routers in the system, with every node one hop away Cascade cabinets. These inter-group connections operate at 14 number of optical cables required is given by: from everyFigure other 5. nodeThe global - a network (blue) diameterlinks connect of one. Dragonfly This is notgro ups together.Gbps per In lane a and hop require. Minimal a maximum routing cable between length of 2.3groups will route minimally in large system these links are active optical cables. practicable for large systems, but if a group of routers, acting metres. Optical cablesboth are the used source for the and long target links between groups, and will take exactly one ����� ����� ������ × (������ − 1) × ������ ⁄ 2 in concert as a single very-high radix router, pool their links, groups. The Cascadeglobal design has(Blue) no external link. Noteswitches that and minimal half routing implies a direct then the Groups group as are a whole connected has enough via optical links to links, be directly referre dthe to number as B lueof optical links as an equivalent fat-tree. connected to all other groups in the system. This is the key route between a source and a target, not the minimal number Small and medium size systems can be upgraded with links (Fig. 5). Each Aries provides 10 of these links forThere a totalare three chassis (Fig. 3) within a Cascade cabinet, idea behind the Dragonfly network. Figure 4.of hopsStructure required. of a Cascade Minimal electrical paths group.may differ Each rowin hop represents count theif, for16 single chassis granularity. All groups except the last must be of 960 per group. Blue links are combined into setseach of contain4 links,ing 16 blades,instance, for a totalone ofrouting 192 nodes path per cabinet.does not require a Green and/or The Aries router is an evolution of the tiled design used in A two-dimensionalAries in all a- chassis,to-all structure with 4 is nodes used withinattached each to two each, and connected by the chassis full. Larger systems must be comprised of a whole number of which restricts the maximum size of the system to 241 groups Black hop in the source and/or destination groups due to the Cray(92,544 BlackWidow nodes) . systemThe blue [6] andlinks the use Cray 12X XE6 Active system Opticalcabinet Cables groupbackplane (Fig. 4) .(green The chassislinks). Each backplane column provides represents an Aries in one of the six cabinets or a whole number of groups. [3]. Using a tile-based micro-architecture simplifies connectivity inchassis the firstplacement of a dimension, two-cabinet of the referred group, Blue to connected g aslobal the Greenlink by used electrical between cables groups. (black links). III. CASCADE ROUTING In a Cascade network request packets are routed from source node to destination node. Response packets are independently routed back to the source node; there is no requirement for request and response traffic to follow the same path. Packets are routed deterministically or adaptively along either a minimal or non-minimal path. Minimal routing within a group will always take at most one Green and one Black Figure 5. The global (blue) links connect Dragonfly groups together. In a hop. Minimal routing between groups will route minimally in large system these links are active optical cables. both the source and target groups, and will take exactly one global (Blue) link. Note that minimal routing implies a direct Groups are connected via optical links, referred to as Blue route between a source and a target, not the minimal number links (Fig. 5). Each Aries provides 10 of these links for a total of hops required. Minimal paths may differ in hop count if, for of 960 per group. Blue links are combined into sets of 4 links, instance, one routing path does not require a Green and/or which restricts the maximum size of the system to 241 groups Black hop in the source and/or destination groups due to (92,544 nodes). The blue links use 12X Active Optical Cables placement of the Blue global link used between groups. Regular multi-core vs. hybrid-multi-core blades Initial Multi-core blade Final hybrid CPU-GPU blade

48 port router 48 port router

NIC0 NIC1 NIC2 NIC3 NIC0 NIC1 NIC2 NIC3

Xeon Xeon Xeon Xeon Xeon Xeon Xeon Xeon DDR3 DDR3 DDR3 DDR3 DDR3 DDR3 DDR3 DDR3

QPDC QPDC H-PDC H-PDC

Xeon Xeon Xeon Xeon GK110 GK110 GK110 GK110 DDR3 DDR3 DDR3 DDR3 GDDR5 GDDR5 GDDR5 GDDR5

4 nodes configured with: 4 nodes configured with: > 2 SandyBridge CPU > 1 Intel SandyBridge CPU > 32 GB DDR3-1600 RAM > 32 GB DDR3-1600 RAM ! > 1 NVIDIA K20X GPU ! > 6GB GDDR5 memory ! ! Peak performance of blade: 1.3 TFlops Peak performance of blade: 5.9 TFlops

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 12 A brief history of “Piz Daint”

• Installed 12 cabinets with dual-SandyBridge nodes in Oct./Nov. 2012 • ~50% of final system size to test network at scale (lessons learned from Gemini & XE6) • Rigorous evaluation of three node architectures based on 9 applications • dual-Xeon vs. Xeon/Xeon-Phi vs. Xeon/Kepler • joint study with Cray from Dec. 2011 through Nov. 2012 • Five applications were used for system design • CP2K, COSMO, SPECFEM-3D, GROMACS, Quantum ESPRESSO • CP2K & COSMO-OPCODE were co-developed with the system • Moving performance goals: hybrid Cray XC30 has to beat the regular XC30 by 1.5x • Upgrade to 28 cabinets with hybrid CPU-GPU nodes in Oct./Nov. 2013 • Accepted in Dec. 2013 • Early Science with fully performance hybrid nodes: January through March 2014

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 13 NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 14 source: David Leutwyler

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 15 Speedup of the full COSMO-2 production problem (apples to apples with 33h forecast of Meteo Swiss) Monte Rosa Tödi Piz Daint Piz Daint Cray XE6 Cray XK7 Cray XC30 Cray XC30 hybrid (GPU) (Nov. 2011) (Nov. 2012) (Nov. 2012) (Nov. 2013) 4x 4x

3x 3x

1.67x 3.36x

2x 2x 1.77x New HP2C funded code 1.49x 2.5x

1.4x 1.35x 1x 1x Current production code

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 16 Energy to solution (kWh / ensemble member)

Cray XE6 Cray XK7 Cray XC30 Cray XC30 hybrid (GPU) (Nov. 2011) (Nov. 2012) (Nov. 2012) (Nov. 2013) 6.0

Current production code

1.41x 1.75x 4.5

New HP2C funded code 6.89x 3.0 3.93x 1.49x 2.51x 2.64x 1.5

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 17 COSMO: current and new (HP2C developed) code

main (current / Fortran) main (new / Fortran)

dynamics (C++)

stencil library boundary physics dynamics (Fortran) conditions & physics (Fortran) X86 GPU halo exchg. (Fortran) with OpenMP / OpenACC Generic Shared Comm. Infrastructure Library MPI MPI or whatever

system system

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 18 velocities

pressure temperature Physical model Mathematical description water turbulence Discretization / algorithm

Domain science (incl. applied mathematics) lap(i,j,k) = –4.0 * data(i,j,k) + data(i+1,j,k) + data(i-1,j,k) + Code / implementation data(i,j+1,k) + data(i,j-1,k);

Code compilation “Port” serial code to supercomputers > vectorize > parallelize Computer engineering A given supercomputer > petascaling (& computer science) > exascaling > ...

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 19 velocities

pressure temperature Physical model Mathematical description water turbulence Discretization / algorithm

Domain science (incl. applied mathematics) lap(i,j,k) = –4.0 * data(i,j,k) + data(i+1,j,k) + data(i-1,j,k) + Code / implementation data(i,j+1,k) + data(i,j-1,k);

Code compilation “Port” serial code to supercomputers > vectorize Architectural options / design > parallelize Computer engineering A given supercomputer > petascaling (& computer science) > exascaling > ...

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 20 velocities

pressure temperature Physical model Mathematical description water turbulence Discretization / algorithm

Domain science (incl. applied mathematics) Tools & lap(i,j,k) = –4.0 * data(i,j,k) + Libraries data(i+1,j,k) + data(i-1,j,k) + Code / implementation data(i,j+1,k) + data(i,j-1,k);

Code compilation Optimal algorithm Auto tuning

Architectural options / design Computer engineering (& computer science)

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 21 Model development velocities based on Python or equivalent dynamic language pressure temperature Physical model Mathematical description water turbulence Discretization / algorithm Domain science Tools &

lap(i,j,k) = –4.0 * data(i,j,k) + Libraries data(i+1,j,k) + data(i-1,j,k) + Code / implementation data(i,j+1,k) + data(i,j-1,k);

Code compilation Optimal algorithm Auto tuning Computer engineering (& computer science) Architectural options / design

co-design

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 22 COSMO in five year: current and new (2019) code

main (current / Fortran) main (new / “Python”)

physics dynamics (C++ or “Python”) (“Python”) grid tools boundary some tools dynamics (Fortran) conditions & physics (Fortan / C++) BE1 BE… halo exchg. (Fortran) other tools Generic Shared Infrastructure Comm. BE.. Library MPI MPI or whatever

system system

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 23 Thank you!

NorduGrid 2014, Helsinki, May 20, 2014 T. Schulthess 24