Trilinos/NOX Nonlinear Solver

. Capabilities:
  • Newton-Based Nonlinear Solver
    - Linked to Trilinos linear solvers for scalability
    - Matrix-Free option
  • Anderson Acceleration for Fixed-Point iterations
  • Globalizations for improved robustness
    - Line Searches, Trust Region, Homotopy methods
  • Customizable: C++ abstractions at every level
  • Extended by LOCA package
    - Parameter continuation, Stability analysis, Bifurcation tracking

. Download: Part of Trilinos (trilinos.sandia.gov)

. Further information: Andy Salinger
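To make the "Anderson Acceleration for Fixed-Point iterations" capability concrete, here is a minimal sketch of depth-1 Anderson acceleration in plain Python. It does not use the NOX API: the map `g`, the helper `dot`, and the function `anderson_depth1` are hypothetical names chosen for this illustration, and the two-variable test problem is an assumed toy contraction.

```python
import math

def g(x):
    """Hypothetical contractive fixed-point map on R^2 (toy problem)."""
    return [0.25 * math.cos(x[1]), 0.25 * math.sin(x[0]) + 0.3]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def anderson_depth1(g, x0, tol=1e-10, max_iter=500):
    """Anderson acceleration that mixes the two most recent iterates (m = 1)."""
    g_prev = g(list(x0))
    f_prev = [gi - xi for gi, xi in zip(g_prev, x0)]   # residual f = g(x) - x
    x = g_prev                                         # first step: plain fixed-point update
    for k in range(1, max_iter + 1):
        gx = g(x)
        f = [gi - xi for gi, xi in zip(gx, x)]
        if math.sqrt(dot(f, f)) < tol:
            return x, k
        # Least-squares mixing coefficient: minimize ||f - gamma * (f - f_prev)||
        df = [a - b for a, b in zip(f, f_prev)]
        denom = dot(df, df)
        gamma = dot(f, df) / denom if denom > 0.0 else 0.0
        x_next = [gi - gamma * (gi - gpi) for gi, gpi in zip(gx, g_prev)]
        g_prev, f_prev = gx, f
        x = x_next
    raise RuntimeError("Anderson iteration did not converge")

x_star, iterations = anderson_depth1(g, [0.0, 0.0])
```

With `gamma = 0` the update reduces to the plain fixed-point step `x = g(x)`; larger depths store more residual differences and solve a small least-squares problem instead of the closed-form coefficient used here.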

NOX and ML are part of the larger Trilinos solver stack: linear solvers, equation solvers, analysis tools

[Figure: Trilinos solver stack diagram. Analysis Tools (black-box): Optimization, UQ (sampling), Parameter Studies. Analysis Tools (embedded) / Equation Solvers: Nonlinear Solver (NOX), Time Integration, Continuation, Sensitivity Analysis, Stability Analysis, Optimization, UQ Solver. Linear Solvers: Linear Algebra, Iterative Solvers, Direct Solvers, Algebraic Preconditioners, Multilevel Preconditioners (ML). Interfaces: Linear Solver Interface, Nonlinear Model Interface, Solved Problem Interface. "Your Model Here".]

FASTMath SciDAC Institute

Trilinos/NOX: Robustness for Ice Sheet Simulation: PISCEES SciDAC Application project (BER-ASCR)

. Ice sheets modeled by the nonlinear Stokes equations
  • Initial solve is fragile: full Newton fails
  • Homotopy continuation on the regularization parameter "γ" saves the day

[Figure: Greenland Ice Sheet surface velocities (constant friction model), shown for γ = 10^-1.0, 10^-2.5, 10^-6.0, and 10^-10.]
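The idea of homotopy continuation on a regularization parameter can be sketched on a scalar toy problem. The residual below is not the ice sheet model: it is an assumed stand-in, u / sqrt(u² + γ) − 0.5 = 0, chosen because it becomes nearly a step function as γ shrinks. The function names, the starting guess, the half-decade step in γ, and the divergence bound are all assumptions made for this illustration.

```python
import math

def residual(u, gamma):
    """Toy regularized residual; nearly a step function in u for small gamma."""
    return u / math.sqrt(u * u + gamma) - 0.5

def residual_prime(u, gamma):
    s = u * u + gamma
    return gamma / (s * math.sqrt(s))

def newton(u, gamma, tol=1e-12, max_iter=50):
    """Undamped Newton; returns (value, converged)."""
    for _ in range(max_iter):
        r = residual(u, gamma)
        if abs(r) < tol:
            return u, True
        u -= r / residual_prime(u, gamma)
        if abs(u) > 1e3:          # assumed bound used to declare divergence
            return u, False
    return u, False

def continuation(u0, exponents):
    """Solve a sequence of problems gamma = 10**e, reusing each solution as the next guess."""
    u = u0
    for e in exponents:
        u, ok = newton(u, 10.0 ** e)
        if not ok:
            return u, False
    return u, True

# Direct attempt at the target value gamma = 1e-10 from the guess u = 1.
direct_u, direct_ok = newton(1.0, 1e-10)

# Continuation from gamma = 1 down to 1e-10 in half-decade steps.
exponents = [-0.5 * k for k in range(21)]
cont_u, cont_ok = continuation(1.0, exponents)
```

In this toy setting the direct Newton solve leaves the basin of attraction on its first step, while the continuation path keeps every intermediate guess close enough to the next solution for Newton to converge. The step size in γ matters: larger jumps can fail for the same reason the direct solve does.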

Multigrid in Trilinos

. ML: algebraic multigrid (AMG)
  • Aggregation-based AMG for
    - Poisson, elasticity, convection-diffusion, eddy current (E&M)
  • Various coarsening and data rebalancing options
  • Smoothers: SOR, polynomial, ILU, block variants, user-provided

. MueLu: templated multigrid framework
  • Same main algorithms as above, plus energy minimizing AMG
  • Leverages Trilinos templated sparse linear algebra stack
    - Can use kernels optimized for various compute node types
    - Templated scalar type allows for mixed precision, UQ, …
  • Advanced data reuse possibilities, extensible by design

. Download/further information: www.trilinos.org
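As a rough illustration of aggregation-based AMG, the sketch below runs a two-grid cycle on a small 1D Poisson matrix using pure Python. It is not ML or MueLu code: the fixed-size aggregates, the piecewise-constant prolongator, the Gauss-Seidel smoother, and every function name are assumptions made for this example, and a production solver would use sparse storage and recurse over several levels.

```python
import math

def poisson_1d(n):
    """Dense tridiag(-1, 2, -1) matrix stored as a list of rows."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def gauss_seidel(A, x, b, sweeps=1):
    """In-place Gauss-Seidel smoothing."""
    n = len(b)
    for _ in range(sweeps):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (coarse-grid solve)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def two_grid_cycle(A, x, b, agg_size=2):
    n = len(b)
    agg = [i // agg_size for i in range(n)]      # aggregate index of each fine node
    nc = agg[-1] + 1
    x = gauss_seidel(A, x, b)                    # pre-smoothing
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
    rc = [0.0] * nc                              # restriction: r_c = P^T r
    for i in range(n):
        rc[agg[i]] += r[i]
    Ac = [[0.0] * nc for _ in range(nc)]         # Galerkin operator: A_c = P^T A P
    for i in range(n):
        for j in range(n):
            Ac[agg[i]][agg[j]] += A[i][j]
    ec = solve_dense(Ac, rc)
    x = [xi + ec[agg[i]] for i, xi in enumerate(x)]   # prolongate and correct
    return gauss_seidel(A, x, b)                 # post-smoothing

def residual_norm(A, x, b):
    return math.sqrt(sum((bi - ai) ** 2 for bi, ai in zip(b, matvec(A, x))))

n = 16
A = poisson_1d(n)
b = [1.0] * n
x = [0.0] * n
r0 = residual_norm(A, x, b)
for _ in range(40):
    x = two_grid_cycle(A, x, b)
reduction = residual_norm(A, x, b) / r0
```

The prolongator here simply copies each coarse value to every node in its aggregate. Smoothed-aggregation and energy-minimizing variants improve on this by smoothing the prolongator columns, which this sketch does not attempt.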

ML and MueLu: Application highlights

ML scales up to 512K BG/Q & 128K cores in Drekar simulations

[Figure panels: SARANS turbulence, Tokamak, MHD generator (courtesy E. Cyr/SNL)]

MueLu scales up to 524K BG/Q cores in Sierra/Nalu simulations

[Figure: PPE Solve Weak Scaling, Turbulent OpenJet (Re=6600). Vertical axis: Parallel Efficiency: Time/(Krylov Iteration), 0.0 to 1.0. Horizontal axis: # MPI Processes, 128 to 524,288. Series: Edge-based, Element-based. (courtesy P. Lin/SNL)]

PARPACK

. Capabilities:
  • Compute a few eigenpairs of a Hermitian or non-Hermitian matrix
  • Both standard and generalized eigenvalue problems
  • Extremal and interior eigenvalues
  • Reverse communication allows easy integration with the application
  • MPI/BLACS communication
. Download: http://www.caam.rice.edu/software/ARPACK/
. Further information: beyond PARPACK
  • EIGPEN (based on penalty trace minimization for computing many eigenpairs)
  • Parallel multiple shift-invert interface for computing many eigenpairs
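The reverse communication pattern can be illustrated with a Python generator: the solver never sees the matrix and instead hands a vector back to the caller whenever it needs a matrix-vector product. The sketch below is not the PARPACK interface. It uses plain power iteration rather than an implicitly restarted Arnoldi/Lanczos method, and the names `power_iteration_rc`, `dominant_eigenpair`, and `apply_A` are hypothetical.

```python
import math

def power_iteration_rc(v0, tol=1e-13, max_iter=1000):
    """Power iteration written in reverse-communication style.

    Each `yield` hands a vector v to the caller, who must send back w = A v.
    The generator's return value is (eigenvalue estimate, eigenvector estimate).
    """
    norm = math.sqrt(sum(x * x for x in v0))
    v = [x / norm for x in v0]
    lam = 0.0
    for _ in range(max_iter):
        w = yield v                                        # caller computes A v
        new_lam = sum(vi * wi for vi, wi in zip(v, w))     # Rayleigh quotient
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
        if abs(new_lam - lam) < tol:
            return new_lam, v
        lam = new_lam
    return lam, v

def dominant_eigenpair(apply_A, v0):
    """Driver loop: the application owns the operator and its data layout."""
    solver = power_iteration_rc(v0)
    v = next(solver)
    while True:
        try:
            v = solver.send(apply_A(v))
        except StopIteration as done:
            return done.value

def apply_A(v):
    """Application-side operator: tridiag(-1, 2, -1) applied without forming a matrix."""
    n = len(v)
    return [2.0 * v[i] - (v[i - 1] if i > 0 else 0.0) - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

lam, vec = dominant_eigenpair(apply_A, [1.0, 0.5, 1.0 / 3.0])
```

Because the solver only requests products, the application is free to store the operator however it likes (distributed, matrix-free, or assembled), which is the integration benefit the slide refers to.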

PARPACK

. Hybrid MPI/OpenMP implementation
. MATLAB interface available (eigs)

MFDn: Nuclear Configuration Interaction Calculation

. MFDn (Many-body Fermion Dynamics for nuclear physics) is the most robust and efficient configuration interaction code for studying nuclear structure
. Solves the nuclear quantum many-body problem

. PARPACK allows low excitation states to be extracted from a low dimensional Krylov subspace

[Figure: M-scheme basis space dimension (log scale, 10^0 to 10^10) versus Nmax (0 to 14) for 4He, 6Li, 8Be, 10B, 12C, 16O, 19F, 23Na, and 27Al.]
