
Nonhydrostatic Modeling with NICAM

HIROFUMI TOMITA
Research Institute for Global Change, JAMSTEC
Advanced Institute for Computational Science, RIKEN

Contents
• Numerical method of the NICAM dynamical core
  – Horizontal discretization: icosahedral grid, modified by spring dynamics (Tomita et al. 2001, 2002 JCP)
  – Nonhydrostatic framework: mass and total energy conservation (Satoh 2002, 2003 MWR)
• Computational strategy
  – Domain decomposition on a parallel computer
  – Example of computational performance
• Problems: current problems, future problems
• Summary

NICAM project (~2000)
Aim:
• Construction of a GCRM (global cloud-resolving model) for climate simulation
  – Suitable for the future computing environment
    » Massively parallel supercomputers
    » First target: Earth Simulator 1 (40 TFLOPS peak, 640 computer nodes)
Strategy:
• Horizontal discretization:
  – Icosahedral grid (quasi-homogeneous)
  – Second-order accuracy everywhere over the globe
• Dynamics:
  – Nonhydrostatic, with mass and total energy conservation
• Physics:
  – Imported from MIROC (one of the Japanese IPCC models), except for cloud microphysics

Horizontal discretization
Grid arrangement: Arakawa A-grid type
• Velocity and mass defined at the triangle vertices
• Control volume: formed by connecting the centers of the surrounding triangles
  – Hexagons; pentagons at the 12 icosahedral vertices
Advantages
• Easy to implement
• No computational mode: same number of grid points for velocity and mass
Disadvantage
• Non-physical 2-grid-scale structure, e.g. bad geostrophic adjustment
(Figure: glevel-3 grid and control volumes)

Horizontal differential operator (e.g. divergence)
1. The vector field is given at the vertices: u(P_i).
2. Interpolate u at the control-volume vertices Q_i:

   u(Q_i) \approx \frac{\alpha\, u(P_0) + \beta\, u(P_i) + \gamma\, u(P_{1+\mathrm{mod}(i,6)})}{\alpha + \beta + \gamma}

3. Apply the Gauss theorem:

   \nabla \cdot u(P_0) \approx \frac{1}{A(P_0)} \sum_{i=1}^{6} b_i\, \frac{u(Q_i) + u(Q_{1+\mathrm{mod}(i,6)})}{2} \cdot n_i

Second-order accuracy? No: the allocation points are not the gravitational centers of the control volumes (default grid).

Modified icosahedral grid (1): reconstruction of the grid by spring dynamics, to reduce the grid noise
1. STD-grid: generated by the recursive grid division.
2. Spring dynamics: the grid points are connected by springs,

   M \frac{d w_0}{dt} = \sum_{i=1}^{6} k\,(d_i - \bar{d})\, e_i - \alpha\, w_0, \qquad \frac{d r_0}{dt} = w_0

3. SPR-grid: solve the spring dynamics → the system calms down to the static balance.

Modified icosahedral grid (2): gravitational-centered relocation, to raise the accuracy of the numerical operators
1. SPR-grid: generated by the spring dynamics (●).
2. CV: defined by connecting the gravitational centers of the triangle elements (▲).
3. SPR-GC-grid: the grid points are moved to the gravitational centers of their control volumes (●).
→ The second-order accuracy of the numerical operators is guaranteed at all grid points.

Improvement of the operator accuracy
Test function: Heikes & Randall (1995).
Global error:

   l_2(x) = \frac{\{ I[(x(\lambda,\theta) - x_T(\lambda,\theta))^2] \}^{1/2}}{\{ I[x_T(\lambda,\theta)^2] \}^{1/2}}

Local error:

   l_\infty(x) = \frac{\max_{\lambda,\theta} |x(\lambda,\theta) - x_T(\lambda,\theta)|}{\max_{\lambda,\theta} |x_T(\lambda,\theta)|}

Results:
• l_2 norm: STD-grid almost 2nd order (●); SPR-GC-grid perfect 2nd order (○)
• l_∞ norm: STD-grid not 2nd order (▲); SPR-GC-grid perfect 2nd order (△)

Nonhydrostatic framework
Design of our nonhydrostatic modeling
Governing equations:
• Fully compressible system: from acoustic waves to planetary waves
• Flux form: finite volume method; conservation of mass and energy
• Deep atmosphere: all metric terms and Coriolis terms included
Solver:
• Split-explicit method: slow modes with the large time step, fast modes with small time steps
• HEVI (Horizontal Explicit & Vertical Implicit): leads to a 1D Helmholtz equation
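The spring relaxation above can be sketched in code. Below is a minimal 1-D analogue (a chain of points with pinned endpoints rather than the icosahedral mesh; the values of `k`, `alpha`, `m` and the forward-Euler integration are illustrative choices, not the NICAM implementation, which is written in Fortran):

```python
import numpy as np

def spring_relax(x, dbar, k=1.0, alpha=2.0, m=1.0, dt=0.05, steps=2000):
    """Damped spring relaxation: M dw/dt = sum_i k (d_i - dbar) e_i - alpha w,
    dr/dt = w, integrated until the system calms down to the static balance.

    x    : initial point positions on a line; the endpoints stay fixed
    dbar : natural length of the springs
    """
    w = np.zeros_like(x)
    for _ in range(steps):
        f = np.zeros_like(x)
        for j in range(1, len(x) - 1):     # interior points only
            for nb in (j - 1, j + 1):      # the two spring neighbours
                d = abs(x[nb] - x[j])      # spring length d_i
                e = np.sign(x[nb] - x[j])  # unit vector e_i towards neighbour
                f[j] += k * (d - dbar) * e
        w = w + dt * (f - alpha * w) / m   # forward Euler for dw/dt
        x = x + dt * w                     # forward Euler for dr/dt
    return x
```

At the static balance the net spring force vanishes, which for the chain means uniform spacing: the 1-D counterpart of the smoothed, quasi-homogeneous icosahedral grid.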
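The divergence operator and the gravitational-centered relocation fit together: on the SPR-GC grid each control-volume vertex Q_i is the gravitational center of the triangle (P_0, P_i, P_{1+mod(i,6)}), so equal interpolation weights are exact for linear fields, which is what restores second-order accuracy. A minimal planar sketch (flat geometry instead of the sphere; the centroid construction and equal weights encode the SPR-GC assumption):

```python
import numpy as np

def divergence_p0(p, u, alpha=1.0, beta=1.0, gamma=1.0):
    """Divergence at P0 of an A-grid hexagonal control volume (planar sketch).

    p : (7, 2) positions of P0 (row 0) and its 6 neighbours P1..P6, CCW order
    u : (7, 2) velocity vectors at those points
    The CV vertices Qi are taken as the triangle centroids (SPR-GC grid).
    """
    w = alpha + beta + gamma
    nxt = lambda i: 1 + (i % 6)  # cyclic neighbour index 1..6
    # CV vertices Qi and interpolated velocities u(Qi)
    q  = np.array([(p[0] + p[i] + p[nxt(i)]) / 3.0 for i in range(1, 7)])
    uq = np.array([(alpha * u[0] + beta * u[i] + gamma * u[nxt(i)]) / w
                   for i in range(1, 7)])
    # control-volume area A(P0), by the shoelace formula
    area = 0.5 * abs(sum(q[i, 0] * q[(i+1) % 6, 1] - q[i, 1] * q[(i+1) % 6, 0]
                         for i in range(6)))
    # Gauss theorem: edge value = mean of the two endpoint values
    div = 0.0
    for i in range(6):
        edge = q[(i+1) % 6] - q[i]
        n = np.array([edge[1], -edge[0]])         # outward normal * edge length
        div += 0.5 * (uq[i] + uq[(i+1) % 6]) @ n  # b_i (u(Qi)+u(Qi+1))/2 . n_i
    return div / area
```

For a linear velocity field both the centroid interpolation and the trapezoidal edge quadrature are exact, so the discrete divergence reproduces the analytic value.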
Governing equations (fast-mode terms on the L.H.S., slow-mode terms on the R.H.S.):

\frac{\partial R}{\partial t} + \nabla_h \cdot \frac{V_h}{\gamma} + \frac{\partial}{\partial \xi}\left( \frac{W}{G^{1/2}} + G^3 \cdot \frac{V_h}{\gamma} \right) = 0    (1)

\frac{\partial V_h}{\partial t} + \nabla_h \frac{P}{\gamma} + \frac{\partial}{\partial \xi}\left( G^3 \frac{P}{\gamma} \right) = ADV_h + F_{Coriolis}    (2)

\frac{\partial W}{\partial t} + \gamma^2 \frac{\partial}{\partial \xi}\left( \frac{P}{G^{1/2}\gamma^2} \right) + R g = ADV_z + F_{Coriolis}    (3)

\frac{\partial E}{\partial t} + \nabla_h \cdot \left( h \frac{V_h}{\gamma} \right) + \frac{\partial}{\partial \xi}\left[ h \left( \frac{W}{G^{1/2}} + G^3 \cdot \frac{V_h}{\gamma} \right) \right] - \frac{V_h}{R} \cdot \left[ \nabla_h \frac{P}{\gamma} + \frac{\partial}{\partial \xi}\left( G^3 \frac{P}{\gamma} \right) \right] - \frac{W}{R} \gamma^2 \frac{\partial}{\partial \xi}\left( \frac{P}{G^{1/2}\gamma^2} \right) + W g = Q_{heat}    (4)

Prognostic variables:
• density: R = \gamma^2 G^{1/2} \rho
• horizontal momentum: V_h = \gamma^2 G^{1/2} \rho v_h
• vertical momentum: W = \gamma^2 G^{1/2} \rho w
• internal energy: E = \gamma^2 G^{1/2} \rho e_{in}
Metrics (terrain-following coordinate):
• G^{1/2} = \partial z / \partial \xi
• G^3 = (\nabla_h \xi)_z
• \xi = H (z - z_s) / (H - z_s)

Temporal scheme (in the case of RK2)
Assumption: the variables at t = A are known.
• Obtain the slow-mode tendency S(A).
1. 1st step: integrate the prognostic variables from A to B with the HEVI solver, using S(A).
   → Obtain the tentative values at t = B and the slow-mode tendency S(B).
2. 2nd step: return to A and integrate the prognostic variables from A to C, using S(B).
   → Obtain the variables at t = C.

Small-step integration
In the small-step integration there are 3 steps:
1. Horizontal explicit step: update of the horizontal momentum.
2. Vertical implicit step: update of the vertical momentum and density.
3. Energy correction step: update of the energy.
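The RK2 large-step/small-step splitting above can be sketched generically. The sketch below treats `slow` and `small_step` as black boxes (stand-ins for the slow-mode tendency and one HEVI small step; the names and the toy usage are illustrative, not NICAM's Fortran interfaces):

```python
def rk2_split(q, slow, small_step, dt, ns):
    """One RK2 large step of the split-explicit scheme.

    slow(q)                : slow-mode tendency S of the state q
    small_step(q, s, dtau) : one fast small step with the frozen slow tendency s
    ns                     : number of small steps per large step (dtau = dt/ns)
    """
    dtau = dt / ns
    s_a = slow(q)                 # S(A)
    qb = q
    for _ in range(ns // 2):      # 1st step: A -> B (half interval) using S(A)
        qb = small_step(qb, s_a, dtau)
    s_b = slow(qb)                # S(B), the tentative midpoint tendency
    qc = q                        # 2nd step: return to A ...
    for _ in range(ns):           # ... and integrate A -> C using S(B)
        qc = small_step(qc, s_b, dtau)
    return qc
```

With no fast modes, i.e. `small_step(q, s, dtau) = q + dtau * s`, the scheme collapses to the classical midpoint RK2 method, which is a useful sanity check.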
Horizontal explicit step
• The horizontal momentum is updated explicitly:

V_h^{t+(n+1)\Delta\tau} = V_h^{t+n\Delta\tau} + \Delta\tau \left[ \left( -\nabla_h \frac{P}{\gamma} - \frac{\partial}{\partial \xi}\left( G^3 \frac{P}{\gamma} \right) \right)^{t+n\Delta\tau} + \left( \frac{\partial V_h}{\partial t} \right)_{slow}^{[t \ \mathrm{or}\ t+\Delta t/2]} \right]    (5)

The first term in the brackets is the fast mode; the slow-mode tendency is given.

Small-step integration (2): vertical implicit step
• The equations for R, W, and E (the latter written for P) can be written as:

\frac{R^{t+(n+1)\Delta\tau} - R^{t+n\Delta\tau}}{\Delta\tau} + \frac{\partial}{\partial \xi}\left( \frac{W^{t+(n+1)\Delta\tau}}{G^{1/2}} \right) = G_R    (6)

\frac{W^{t+(n+1)\Delta\tau} - W^{t+n\Delta\tau}}{\Delta\tau} + \gamma^2 \frac{\partial}{\partial \xi}\left( \frac{P^{t+(n+1)\Delta\tau}}{G^{1/2}\gamma^2} \right) + R^{t+(n+1)\Delta\tau} g = G_W    (7)

\frac{P^{t+(n+1)\Delta\tau} - P^{t+n\Delta\tau}}{\Delta\tau} + \frac{\partial}{\partial \xi}\left( \frac{W^{t+(n+1)\Delta\tau}}{G^{1/2}} c_s^{2,\,t+n\Delta\tau} \right) + \frac{R_d}{C_V} \tilde{g}\, W^{t+(n+1)\Delta\tau} = \frac{R_d}{C_V} G_E    (8)

• Coupling Eqs. (6), (7), and (8), we obtain the 1D Helmholtz equation for W:

\frac{W^{t+(n+1)\Delta\tau}}{\gamma^2} - \frac{\partial}{\partial \xi}\left[ \frac{1}{G^{1/2}\gamma^2} \frac{\partial}{\partial \xi}\left( \Delta\tau^2 c_s^{2,\,t+n\Delta\tau} \frac{W^{t+(n+1)\Delta\tau}}{G^{1/2}} \right) \right] - \frac{\partial}{\partial \xi}\left( \Delta\tau^2 \frac{R_d}{C_V} \tilde{g} \frac{W^{t+(n+1)\Delta\tau}}{G^{1/2}\gamma^2} \right) + \Delta\tau^2 \frac{g}{\gamma^2} \frac{\partial}{\partial \xi}\left( \frac{W^{t+(n+1)\Delta\tau}}{G^{1/2}} \right) = \text{R.H.S. (source terms)}    (9)

• Eq. (9) → W;  Eq. (6) → R;  Eq. (8) → E.

Small-step integration (3): energy correction step
(Total energy) = (internal energy) + (kinetic energy) + (potential energy)
• We consider the equation of the total energy:

\frac{\partial E_{total}}{\partial t} + \nabla_h \cdot \left[ (h + k + \Phi) \frac{V_h}{\gamma} \right] + \frac{\partial}{\partial \xi}\left[ (h + k + \Phi) \left( \frac{W}{G^{1/2}} + G^3 \cdot \frac{V_h}{\gamma} \right) \right] = 0    (10)

where E_{total} = \gamma^2 G^{1/2} \rho (e_{in} + k + \Phi).
• Additionally, Eq. (10) is solved in flux form:

E_{total}^{t+(n+1)\Delta\tau} = E_{total}^{t+n\Delta\tau} - \Delta\tau \left\{ \nabla_h \cdot \left[ (h + k + \Phi) \frac{V_h}{\gamma} \right] + \frac{\partial}{\partial \xi}\left[ (h + k + \Phi) \left( \frac{W}{G^{1/2}} + G^3 \cdot \frac{V_h}{\gamma} \right) \right] \right\}^{t+(n+1)\Delta\tau}

• The kinetic and potential energies are known from the previous steps.
• Recalculate the internal energy:

E^{t+(n+1)\Delta\tau} = E_{total}^{t+(n+1)\Delta\tau} - \rho^{t+(n+1)\Delta\tau} \gamma^2 G^{1/2} (k + \Phi)^{t+(n+1)\Delta\tau}

Large-step integration
The large-step tendency has 2 main parts:
1. Coriolis term: formulated straightforwardly.
2. Advection term: requires some care because of the curvature of the earth.
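Because Eq. (9) couples grid points only in the vertical, each column yields a tridiagonal system after discretization. A stripped-down sketch keeping only the acoustic part of the Helmholtz operator (the metric factors γ, G^{1/2} and the g, g̃ couplings are dropped; uniform c_s² and Dirichlet W = 0 boundaries are simplifying assumptions):

```python
import numpy as np

def solve_vertical_implicit(rhs, cs2, dtau, dz):
    """Solve (I - dtau^2 c_s^2 d^2/dz^2) W = rhs on one column with
    second-order centred differences and W = 0 above and below.
    Thomas algorithm: O(n) per column, as needed in a HEVI solver."""
    n = len(rhs)
    r = dtau**2 * cs2 / dz**2
    lo = np.full(n, -r)            # sub-diagonal
    di = np.full(n, 1.0 + 2.0*r)   # diagonal
    up = np.full(n, -r)            # super-diagonal
    w = rhs.astype(float)
    for i in range(1, n):          # forward elimination
        m = lo[i] / di[i-1]
        di[i] -= m * up[i-1]
        w[i] -= m * w[i-1]
    w[-1] /= di[-1]                # back substitution
    for i in range(n - 2, -1, -1):
        w[i] = (w[i] - up[i] * w[i+1]) / di[i]
    return w
```

The implicit vertical treatment is what removes the vertically propagating acoustic waves from the CFL limit, so Δτ is restricted only by the horizontal grid spacing.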
Advection of momentum
Use a Cartesian coordinate whose origin is at the center of the earth. The advection terms of V_h and W are calculated as follows:
1. Construct the 3-dimensional momentum V from V_h and W.
2. Express V by its 3 components (V_1, V_2, V_3) in the fixed coordinate; these components are scalars.
3. Obtain the vector whose components are the 3 divergences,

   \left( \nabla \cdot (v V_1),\ \nabla \cdot (v V_2),\ \nabla \cdot (v V_3) \right), \qquad \text{where } v = V / (\gamma^2 G^{1/2} \rho)

4. Split it again into a horizontal vector and a vertical component → ADV_h, ADV_z.

Computational strategy and performance

Computational strategy (1): domain decomposition
1. By connecting two neighboring icosahedral triangles, 10 rectangles are constructed (rlevel-0).
2. For each rectangle, 4 sub-rectangles are generated by connecting the diagonal mid-points (rlevel-1).
3. The process is repeated (rlevel-n).
(Figures: region division at levels 0-3)

Computational strategy (2): load balancing
Example (rlevel-1): 40 regions on 10 processes.
Situation: the polar regions need less computation, the equatorial regions much more.
Each process manages the regions of one color, covering both polar and equatorial regions → the load imbalance is avoided.

Computational strategy (3): vectorization
Structure in one region: the icosahedral grid as a whole looks unstructured, but each region can be treated as a structured grid → Fortran 2D arrays → vectorized efficiently. Mapping the 2D arrays to 1D arrays gives a higher vector operation length.

Computational performance (1)
Computational performance depends on the computer architecture, the degree of code tuning, and so on. Here: performance on the old Earth Simulator.
Earth Simulator:
• Massively parallel supercomputer based on the NEC SX-6 architecture
  – 640 computational nodes, with 8 vector processors per node
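The load-balancing idea above can be sketched as bookkeeping. The round-robin assignment below is a stand-in for the actual color assignment in the figure (illustrative only, not the NICAM code); the point is that every process receives regions scattered over the globe, mixing cheap polar and expensive equatorial work:

```python
def region_mapping(rlevel, nprocs):
    """Map the regions of an rlevel-n decomposition to processes.

    Each of the 10 icosahedral rectangles is divided into 4**rlevel regions,
    giving 10 * 4**rlevel regions in total; they are dealt out round-robin so
    that no process holds a geographically contiguous (and hence
    load-imbalanced) block of regions.
    """
    nregions = 10 * 4**rlevel
    assert nregions % nprocs == 0, "regions must divide evenly among processes"
    return {p: list(range(p, nregions, nprocs)) for p in range(nprocs)}
```

For the rlevel-1 example in the text, the 40 regions are dealt to 10 processes, 4 regions each.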
  – Peak performance of 1 CPU: 8 GFLOPS
  – Total peak performance: 8 GFLOPS × 8 CPUs × 640 nodes = 40 TFLOPS
  – Crossbar network
Target simulation for the measurement:
• 1-day simulation of the Held & Suarez dynamical-core experiment

Computational performance (2): scalability of NICAM (strong scaling)
Configuration:
• Horizontal resolution: glevel-8 (30 km); vertical layers: 100 (fixed)
• Computer nodes: increased from 10 to 80
Results (figure): green = ideal speed-up line, red = actual speed-up line.

Computational performance (3): performance against the horizontal resolution (weak scaling)
Configuration: as the grid level increases by one, the number of grid points increases by a factor of 4, and so does the number of CPUs; since the time step is halved, the elapse time should increase by a factor of 2.

glevel (grid intv.) | Number of PNs (peak performance) | Elapse time [sec] | Average time [msec] | GFLOPS (ratio to peak [%])
6 (120 km) | 5 (320 GFLOPS)     | 48.6 | 169 | 140 (43.8)
7 (60 km)  | 20 (1280 GFLOPS)   | 97.4 | 169 | 558 (43.6)
8 (30 km)  | 80 (5120 GFLOPS)   | 195  | 169 | 2229 (43.5)
9 (15 km)  | 320 (20480 GFLOPS) | 390  | 169 | 8916

Results: the elapse time indeed increases by a factor of 2 per grid level.
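The "ratio to peak" column of the weak-scaling table follows directly from the node counts, since each node peaks at 8 CPUs × 8 GFLOPS = 64 GFLOPS; a quick cross-check with the numbers quoted above:

```python
# Sustained GFLOPS over peak GFLOPS for the measured glevel-6..8 runs.
runs = {6: (5, 140.0), 7: (20, 558.0), 8: (80, 2229.0)}  # glevel: (nodes, GFLOPS)
PEAK_PER_NODE = 8 * 8.0  # 8 vector CPUs x 8 GFLOPS on the old Earth Simulator
ratio = {g: round(100.0 * gf / (nodes * PEAK_PER_NODE), 1)
         for g, (nodes, gf) in runs.items()}
# ratio reproduces the table: about 43.5-43.8 % of peak at every resolution
```

The essentially flat ratio is the point: the sustained efficiency does not degrade as the model is scaled up by a factor of 16 in grid points and processors.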