A Machine Learning-Guided Adaptive Algorithm to Reduce The

https://doi.org/10.5194/gmd-2020-425 Preprint. Discussion started: 16 February 2021 c Author(s) 2021. CC BY 4.0 License. A machine learning-guided adaptive algorithm to reduce the computational cost of atmospheric chemistry in Earth System models: application to GEOS-Chem versions 12.0.0 and 12.9.1 Lu Shen1, Daniel J. Jacob1, Mauricio Santillana2, 3, Kelvin Bates1, Jiawei Zhuang1, Wei Chen4 5 1John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA 2Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA, USA 3Department of Pediatrics, Harvard Medical School, Boston, MA, USA 4Center for Functional Nanomaterials, Brookhaven National Laboratory, Upton, NY 11973, USA 10 Correspondence to: Lu Shen ([email protected]) Abstract. Atmospheric composition plays a crucial role in determining the evolution of the atmosphere, but the high computational cost has been the major barrier to include atmospheric chemistry into Earth system models. Here we present an adaptive and efficient algorithm that can remove this barrier. Our approach is inspired by unsupervised machine learning clustering techniques and traditional asymptotic analysis ideas. We first partition species into 13 blocks, using a novel 15 machine learning approach that analyzes the species network structures and their production and loss rates. Building on these blocks, we pre-select 20 submechanisms, as defined by unique assemblages of the species blocks, and then pick locally on the fly which submechanism to use based on local chemical conditions. In each submechanism, we isolate slow species and unimportant reactions from the coupled system. Application to a global 3-D model shows that we can cut the computational costs of the chemical integration by 50% with accuracy losses smaller than 1% that do not propagate in time. Tests show that 20 this algorithm is highly chemically coherent making it easily portable to new models without compromising its performance. Our algorithm will significantly ease the computational bottleneck and will facilitate the development of next generation of earth system models. 1 https://doi.org/10.5194/gmd-2020-425 Preprint. Discussion started: 16 February 2021 c Author(s) 2021. CC BY 4.0 License. 25 1 Introduction There is a strong motivation to couple atmospheric chemistry with meteorology and surface processes in Earth system models (ESMs) because it exerts strong forcing and feedbacks on the radiative budget of the Earth both directly and indirectly (CLIMA et al., 2016), but this is challenging because of the high computational cost (National Research Council, 2012). Global atmospheric chemistry mechanisms typically include over a hundred chemical species coupled through 30 kinetics, and integrating the chemical evolution of that system requires solving a large and stiff system of differential equations (Brasseur and Jacob, 2017). However, characterizing the chemical composition in most regions of the atmosphere does not in fact require solving for the full chemical complexity of the mechanism. Here we present an adaptive, stable and chemically coherent algorithm for solving atmospheric chemistry in ESMs that reduces the computational cost in half, with losses in accuracy less than 1% that do not propagate forward in time. Our algorithm is based on general principles that can 35 be easily applied to a wide range of mechanisms. Previous approaches of simplifying atmospheric chemistry mechanisms all involve some loss of accuracy or generality (Brasseur and Jacob, 2017). Reducing the dimension of the coupled system can be obtained by decreasing the number of species (Sportisse and Djouad, 2000), isolating long-lived species (Young and Boris, 1977), and removing unimportant reactions (Brown-Steiner et al., 2018). However, the importance of a species or a reaction varies in different atmospheric 40 conditions, so these schemes are not well adapted to global models. Some studies (Jacobson 1995; Rastigeyev et al., 2007) use different subsets of the full chemical mechanism for different regions with specified or locally determined boundaries, but this has limited success because the atmosphere has a continuum of chemical regimes, and geographic boundaries between regimes should be dynamic rather than pre-defined. An adaptive method to define mechanism subsets locally and on the fly has been proposed by Santillana et al. (2010) but incurs very expensive computational overhead (Santillana et al., 45 2010). The overhead can be avoided by compiling a library of pre-defined mechanism subsets, but a challenge is to select these subsets in a manner that is chemically coherent and portable across mechanisms (Shen et al., 2020). Another direction involves the use of machine learning to replace the traditional chemical solver, but this method is subject to error growth over time and can only be applied to short-term simulations (Keller and Evans, 2019). In this work, we use machine learning clustering ideas to guide us build an ensemble of chemically coherent subsets 50 (chemical regimes) of a full chemical mechanism, with the appropriate regimes to be selected in the atmospheric model locally and on the fly. The success of this approach requires the chemical mechanism to have high generalizability and efficiency, which have not yet been achieved in previous studies. To address the generalizability issue, we enforce chemical coherence with a new clustering approach that includes the chemical distance between species in the full mechanism (obtained by analyzing the network structure of reactants and products) as part of the optimization. Species clusters are 55 identified using a machine learning optimization approach that not only considers whether a species is fast or slow (a species is slow if both its production and loss rates are smaller than a threshold), but also minimizes the chemical distance between 2 https://doi.org/10.5194/gmd-2020-425 Preprint. Discussion started: 16 February 2021 c Author(s) 2021. CC BY 4.0 License. species to make the resulting adaptive mechanism chemically coherent. To improve the efficiency, we explore the potential to simplify the chemical mechanism by separately removing both slow species and unimportant reactions from the coupled system dynamically. Compared to previous studies (Shen et al., 2020), our algorithm is chemically coherent, accurate, and 60 can halve the computational cost of the atmospheric chemistry integration. It can be ported to other chemical mechanisms or models with very little effort, which will significantly facilitate the development of the next generation of ESMs. 2 Model description 2.1 GEOS-Chem model 65 We use the GEOS-Chem version 12.0.0 global 3-D model for tropospheric and stratospheric chemistry (https://doi.org/10.5281/zenodo.1343547) with a horizontal resolution of 4°×5° and 72 pressure levels extending from surface to 0.01 hPa, driven by MERRA2 assimilated meteorology. The full chemistry mechanism in the model has 228 species and 724 reactions, including coupled gas-phase and aerosol chemistry for the troposphere and stratosphere (Sherwen et al., 2016; Eastham et al., 2014). The model includes 20% gridboxes below 2km and 50% gridboxes below 12 km. The 70 chemical reactions are integrated using a 4th-order Rosenbrock solver designed for high performance with the Kinetic Pre- Processor (Sandu et al., 1997; Damian et al., 2002). In part of this study, we test the performance robustness of our reduced algorithm by porting it to GEOS-Chem version 12.9.1 (https://doi.org/10.5281/zenodo.3950473). This new version of model has 262 species and 850 reactions, including improved organic nitrate chemistry (Fisher et al., 2018), isoprene chemistry (Bates and Jacob, 2019), and halogen chemistry 75 (Wang et al., 2019). From version 12.0.0 to 12.9.1, we need to remove 49 old species and add 83 new species. We use 12 CPUs in a shared-memory Open Message Passing (Open-MP) parallel environment to test the performance of our algorithm throughout this study. 2.2 Definition of slow, long-lived species and unimportant reactions We separate the atmospheric species as fast or slow based on their production and loss rates relative to a threshold δ: fast if th 80 either Pi(n) ≥ δ or Li(n) ≥ δ, slow if Pi(n) < δ and Li(n) < δ (Pi and Li refer to the production and loss rates of the i species, δ is a threshold, and n is the concentrations of all species). The hydroxyl radical (OH) has a daytime concentration of the order of 106 molecules cm-3 and a lifetime of 1s. Thus, species with production and loss rates smaller than 102-103 molecules cm-3 s-1 are unlikely to influence other species in the coupled scheme (Santillana et al., 2010; Shen et al., 2020). In this study, we use δ from 500 to 1500 molecules cm-3 s-1 to partition the fast and slow species. Similarly, species with a lifetime longer than 85 10 days are considered as long-lived. We pre-select a limited number (M) of submechanisms for which we pre-define the Jacobian matrix. In each submechanism, 3 https://doi.org/10.5194/gmd-2020-425 Preprint. Discussion started: 16 February 2021 c Author(s) 2021. CC BY 4.0 License. if a reaction is slower than 10 molecules cm-3 s-1 over all gridboxes that select this mechanism, this reaction is considered as unimportant and will be removed from the submechanism. We also have tested using other thresholds to remove the slow reactions, and we find any threshold higher that 10 molecules cm-3 s-1 will significantly compromise the accuracy. 90 2.3 Explicit analytical solution for slow or long-lived species We solve the fast coupled species using the Rosenbrock solver to ensure high accuracy. For the slow or long-lived species, we approximate the evolution of concentrations using an explicit analytical solution that assumes first-order loss (Santillana et al., 2010), written as �� # = � − � = � − � � (1) �� # # # # # �# (�) �#(�) 123(4)∆4 95 �#(� + ∆�) = + (�#(�) − )� (2) �#(�) �#(�) where ni is the concentration of species i, Pi and Li are the production and loss rates, ki is the rate coefficient of the first-order loss, and Δt is the time step.

A Machine Learning-Guided Adaptive Algorithm to Reduce The

Climate Models and Their Evaluation

Documentation and Software User’S Manual, Version 4.1

Atmospheric Dispersion of Radioactive Material in Radiological Risk Assessment and Emergency Response

Lecture 29. Introduction to Atmospheric Chemical Transport Models

A Brief Summary of Plans for the GMAO Core Priorities and Initiatives for the Next 5 Years

DJ Karoly Et

Climate Reanalysis

Atmospheric Dispersion Modeling of the February 2014 Waste Isolation

The New Hadley Centre Climate Model (Hadgem1): Evaluation of Coupled Simulations

The Dynamic Response of the Global Atmosphere-Vegetation Coupled System

Chaos Theory and Its Application in the Atmosphere

Atmospheric Dispersion Modeling to Inform a Landfill Methane