Meteorology-driven variability of air pollution (PM1) revealed with explainable machine learning

Roland Stirnberg1,2, Jan Cermak1,2, Simone Kotthaus3, Martial Haeffelin3, Hendrik Andersen1,2, Julia Fuchs1,2, Miae Kim1,2, Jean-Eudes Petit4, and Olivier Favez5

1Institute of Meteorology and Climate Research, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
2Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
3Institut Pierre Simon Laplace, École Polytechnique, CNRS, Institut Polytechnique de Paris, Palaiseau, France
4Laboratoire des Sciences du Climat et de l’Environnement, CEA/Orme des Merisiers, Gif sur Yvette, France
5Institut National de l’Environnement Industriel et des Risques, Parc Technologique ALATA, Verneuil en Halatte, France

Correspondence: Roland Stirnberg (
[email protected])

Abstract. Air pollution, in particular high concentrations of particulate matter smaller than 1 µm in diameter (PM1), continues to be a major health problem, and meteorology is known to substantially influence atmospheric PM concentrations. However, the scientific understanding of how complex interactions of meteorological factors lead to high pollution episodes is inconclusive, as the effects of meteorological variables are not easy to separate and quantify. In this study, a novel, data-driven approach based on empirical relationships is used to characterise and better understand the meteorology-driven component of PM1 variability. A tree-based machine learning model is set up to reproduce concentrations of speciated PM1 at a suburban site southwest of Paris, France, using meteorological variables as input features. The model captures the majority of the variance of mean afternoon total PM1 concentrations (coefficient of determination (R²) of 0.58), with model performance depending on the individual PM1 species predicted.