GAW Report No. 181

Joint Report of COST Action 728 and GURME

Overview of Tools and Methods for Meteorological and Mesoscale Model Evaluation and User Training

For more information, please contact:
World Meteorological Organization
Research Department
Atmospheric Research and Environment Branch
7 bis, avenue de la Paix – P.O. Box 2300 – CH 1211 Geneva 2 – Switzerland
Tel.: +41 (0) 22 730 81 11 – Fax: +41 (0) 22 730 81 81

E-mail: [email protected] – Website: http://www.wmo.int/pages/prog/arep/index_en.html Report Joint of COST Action 728 GURME Report and 181 GAW No. WMO/TD - No. 1457 © World Meteorological Organization, 2008 © COST Office, 2008, ISBN 978-1-905313-59-4

The right of publication in print, electronic and any other form and in any language is reserved by WMO. Short extracts from WMO publications may be reproduced without authorization provided that the complete source is clearly indicated. Editorial correspondence and requests to publish, reproduce or translate this publication (articles) in part or in whole should be addressed to:

Chairperson, Publications Board
World Meteorological Organization (WMO)
7 bis avenue de la Paix
P.O. Box No. 2300
CH-1211 Geneva 2, Switzerland
Tel.: +41 22 730 8403
Fax: +41 22 730 8040
E-mail: [email protected]

NOTE

The designations employed in WMO publications and the presentation of material in this publication do not imply the expression of any opinion whatsoever on the part of the Secretariat of WMO concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries.

Opinions expressed in WMO publications are those of the authors and do not necessarily reflect those of WMO. The mention of specific companies or products does not imply that they are endorsed or recommended by WMO in preference to others of a similar nature which are not mentioned or advertised.

This document (or report) is not an official publication of WMO and has not been subjected to its standard editorial procedures. The views expressed herein do not necessarily have the endorsement of the Organization.

European COoperation in Science and Technology (COST)

COST, which is supported by the EU RTD Framework Programme, is the oldest and widest European intergovernmental network for cooperation in research. Established by the Ministerial Conference in November 1971, COST is presently used by the scientific communities of 35 European countries to cooperate in common research projects supported by national funds. The funds provided by COST support the COST cooperation networks (COST Actions) through which, with EUR 30 million per year, more than 30,000 European scientists are involved in research having a total value which exceeds EUR 2 billion per year. This is the financial worth of the European added value which COST achieves.

A “bottom-up approach” (the initiative of launching a COST Action comes from the European scientists themselves), “à la carte participation” (only countries interested in the Action participate), “equality of access” (participation is open also to the scientific communities of countries not belonging to the European Union) and “flexible structure” (easy implementation and light management of the research initiatives) are the main characteristics of COST.

As precursor of advanced multidisciplinary research, COST has a very important role for the realisation of the European Research Area (ERA) anticipating and complementing the activities of the Framework Programmes, constituting a “bridge” towards the scientific communities of emerging countries, increasing the mobility of researchers across Europe and fostering the establishment of “Networks of Excellence” in many key scientific domains such as: Biomedicine and Molecular Biosciences; Food and Agriculture; Forests, their Products and Services; Materials, Physical and Nanosciences; Chemistry and Molecular Sciences and Technologies; Earth System Science and Environmental Management; Information and Communication Technologies; Transport and Urban Development and Individuals, Societies, Cultures and Health. It covers basic and more applied research and also addresses issues of pre-normative nature or of societal importance. For further information visit: www.cost.esf.org

ESF logo - ESF provides the COST Office through an EC contract EU logo - COST is supported by the EU RTD Framework Programme


Joint Report of COST Action 728 (Enhancing Mesoscale Meteorological Modelling Capabilities for Air Pollution and Dispersion Applications)

and

GURME (GAW Urban Research and Environment Project)

OVERVIEW OF TOOLS AND METHODS FOR METEOROLOGICAL AND AIR POLLUTION MESOSCALE MODEL EVALUATION AND USER TRAINING

Editors: K. Heinke Schlünzen (Meteorological Inst., ZMAW, University of Hamburg, Germany), Ranjeet S. Sokhi (University of Hertfordshire, UK)

Contributors: Elissavet Bossioli, Peter Builtjes, Bruce Denby, Marco Deserti, John Douros, Barbara Fay, Gertie Geertsema, Marko Kaasik, Kristina Labancz, Volker Matthias, Ana Isabel Miranda, Nicolas Moussiopoulos, Viel Ødegaard, Denise Pernigotti, Christer Persson, Roberto San Jose, K. Heinke Schlünzen, Ranjeet Sokhi, Joanna Struzewska, Alessio D'Allura, Maria Athanassiadou, A. Arvanitis, Alexander Baklanov, Sylvia Bohnenstengel, J Elissavet Bossioli, Giovanni Bonafè, Carlos Borrego, Anabela Carvalho, Ulrich Damrath, Edouard Debry, Jaime Diéguez, Sandro Finardi, Bernard Fisher, Stefano Galmarini, Hubert Glaab, Steen C. Hoe, Nutthida Kitwiroon, Liisa Jalkanen, P. Louka, Alexander Mahura, Helena Martins, D R Middleton, Millán Millán, Alexandra Monteiro, Lina Neunhäuserer, Jose Luis Palau, Ulrike Pechinger, Gorka Perez-Landa, Martin Piringer, Denise Pernigotti, Víctor Prior, Maria Tombrou, C. Simonidis, Leiv Håvard Slørdal, Ariel Stein, Jens Havskov Sørensen, Y. Yu

Electronic version: November 2008 Website: www.cost728.org

WMO/TD-No. 1457 November 2008

Table of Contents

EXECUTIVE SUMMARY ...... i

1. INTRODUCTION...... 1

2. COST728 MESOSCALE MODEL INVENTORY...... 3

3. SUMMARY OF MESOSCALE MODEL APPLICATIONS ...... 7

4. DETERMINATION OF MODEL UNCERTAINTY...... 12
4.1 Monte Carlo Meteorological and Air Quality Data Uncertainty Analysis ...... 12
4.2 Sensitivity Analysis...... 16
4.2.1 Meteorological and Photochemical Ensemble Simulations...... 16
4.2.2 Input Parameters Sensitivity Analysis (Topography, Land-use)...... 16
4.2.3 Adjoint Modelling Approach...... 17
4.2.4 Sensitivity of Model Results to Nesting ...... 18
4.2.5 Sensitivity of UAP Forecasts to Meteorological Input and Resolution...... 18

5. MODEL QUALITY INDICATORS...... 21
5.1 Quality Indicators for Evaluating Meteorological Parameters ...... 21
5.1.1 Observation Availability ...... 21
5.1.2 Observation Error ...... 21
5.1.3 Recommended Quality Indicators for Different Meteorological Parameters...... 22
5.2 Quality Indicators for Air Quality Model Evaluation...... 24
5.2.1 Statistical Parameters for Concentrations ...... 24
5.2.2 EPA Quality Indicators...... 25
5.2.3 EU Directives Modelling Quality Objectives ...... 25
5.2.4 Application Examples ...... 26

6. VALIDATION DATASETS ...... 29
6.1 Model Validation Datasets and Selection Criteria...... 29
6.2 Mesoscale Model Validation Datasets and COST728 ...... 30
6.3 Other Efforts for the Harmonisation and Standardisation of Validation Datasets ...... 31

7. MODEL VALIDATION AND EVALUATION EXERCISES...... 35
7.1 Mesoscale Meteorological Model Validation and Evaluation Studies ...... 35
7.1.1 Use of the European Tracer Experiment (ETEX) for Model Evaluations...... 37
7.1.2 Meteorological Simulations over the Greater Athens Area Using MM5 and MEMO Mesoscale Models ...... 37
7.1.3 Evaluation of MEMO Using the ESCOMPTE Pre-campaign Dataset ...... 39
7.1.4 Modelling of SOA in the MARS-MUSE Dispersion Model...... 40
7.1.5 Photochemical Simulations over the Greater Athens Area ...... 41
7.1.6 Mesoscale Meteorological Model Inter-comparison and Evaluation in FUMAPEX ...... 42
7.1.7 Evaluation of COSMO-IT for Air Quality Forecast and Assessment Purposes ...... 44
7.1.8 Evaluation of MM5-CMAQ Systems for an Episode over the UK...... 46
7.1.9 Evaluation of the MM5-CMAQ-EMIMO Modelling System in Spain...... 48
7.2 Concentrations of Chemical Species ...... 49

8. MODEL EVALUATION METHODOLOGIES...... 51

9. USER TRAINING...... 54
9.1 User Training in Different Countries...... 54
9.2 Summary on User Training ...... 56
9.3 Recommendations for User Training ...... 57
9.3.1 Model User...... 57
9.3.2 Model Developer...... 57

10. Conclusions ...... 58

References ...... 60

Annex A: Glossary of terms...... 65
Annex B: Entries to the web based model inventory...... 66
Annex C: Estimates for measurement and model uncertainty ...... 68
Annex D: Statistical measures for meteorological parameters ...... 70
Annex E: Statistical measures for concentrations...... 74
Annex F: Evaluation of different wavelengths ...... 76
Annex G: Detailed evaluation results from FUMAPEX (FP5 project) ...... 77
Annex H: Details on the evaluation of COSMO_IT for air quality and assessment purposes...... 79
Annex I: Detailed results of the evaluation of CMAQ for an episode over UK...... 83
Annex J: Detailed evaluation results for MM5-CMAQ-EMIMO over Spain...... 85
Annex K: Structure of meta-database for model evaluation exercises ...... 87
Annex L: Entries in meta-database for model evaluation exercises...... 91
Annex M: Summary tables on model validation and evaluation ...... 95
Annex N: Mesoscale model user training...... 98
Annex O: Structure of WMO GURME air quality forecasting training course...... 106

EXECUTIVE SUMMARY

This report provides an overview of current methodologies and tools for mesoscale meteorological model validation and result evaluation, of available validation datasets, and of user training. This overview will assist in the wider aim of COST728 to enhance European capabilities on meteorological models for air pollution dispersion applications. This report is meant as a first, but important, step for developing protocols for evaluating the use of mesoscale atmospheric models for pollution transport studies and for developing procedures for model quality assurance based on scientific and fundamental principles. Three different time scales are considered:

• Episodes (a few days).
• Single cases that concern meteorological situations relevant for determining statistical values.
• Extended periods/years, on an hour-by-hour to daily averaged basis, to determine air quality concentrations relevant for the EU Directives.

This report provides a summary of the existing models and their capabilities (Section 2) and of selected mesoscale model applications undertaken by COST728 partners (Section 3). Information on how to determine model uncertainty (Section 4) and how to evaluate model performance (Section 5) is given, together with a review of available, well documented, three-dimensional datasets of known quality for model evaluation (Section 6). A summary of earlier evaluation exercises is given in Section 7; the validation methodologies and datasets include those for meteorological parameters that are relevant for concentration forecasts. Concepts for model evaluation that are based on fundamental physical principles rather than on single-case applications are summarized in Section 8.

The evaluation of models is a necessary, but not by itself sufficient, step to ensure reliable model results. Mesoscale models are too complex to be applied without deep knowledge of and experience in their application. There is currently no consensus on the extent or depth of training that would be required for a non-expert to competently use mesoscale models. Mesoscale model user training being undertaken by COST728 partners is summarized in Section 9. First conclusions on how to evaluate mesoscale models and how to perform training are given in Section 10. A glossary of terms follows in Annex A.


1. INTRODUCTION

The objective of this report is to provide an overview of current methodologies and tools for the evaluation of mesoscale meteorological models, of available datasets, and of user training. This overview will assist in the wider aim of COST728 to enhance European capabilities on meteorological models for air pollution dispersion applications. This report is meant as a first, but important, step for developing protocols for evaluating the use of mesoscale atmospheric models for pollution transport studies and for developing procedures for model quality assurance based on scientific and fundamental principles. Three different time scales are considered:

• Episodes (a few days).
• Single cases that concern meteorological situations relevant for determining statistical values.
• Extended periods/years, on an hour-by-hour to daily averaged basis, to determine air quality concentrations relevant for the EU Directives.

This report is based on literature reviews, questionnaires and on the web-based metadata databases of COST728. The basic structure of the five-part meta-database system is shown in Figure 1. COST728 collaborates closely with COST 732 and ACCENT with respect to the model meta-database.

[Diagram: five linked meta-databases – model inventory, model applications (episodes, sensitivity analyses), validation datasets (metadata), model validation & evaluation exercises (with evaluation methodologies), and user training – connected through a guideline for QA and external links]

Figure 1. Distributed database of COST728 to collect meta-data: “Model inventory” (Section 2), “Model applications” (Section 3), “Validation datasets” (Section 6), “Validation and evaluation exercises” (Section 7), “User training” (Section 9)

The “model inventory” has been set up by University of Hamburg (UHH*) and the meta-databases on “validation datasets” and “model validation and evaluation exercises” have been set up by Aristotle University Thessaloniki (AUTH†). User training information is given in Annex N. The interlinked information from the database will be used to define a protocol for conducting quality assurance of mesoscale models.

This report provides a summary of the existing models and their capabilities (Section 2), a summary of selected mesoscale model applications undertaken by COST728 partners (Section 3), information on how to determine model uncertainty (Section 4) and how to evaluate model

* http://www.cost728.org † http://pandora.meng.auth.gr/mqat

performance (Section 5), and a review of available, well documented, three-dimensional datasets of known quality for model evaluation (Section 6). Model validation datasets include air pollution episode datasets resulting from or used in earlier projects (e.g. ESCOMPTE * , FUMAPEX † , COST715‡, CITY-DELTA§, TFS**).

A summary of earlier validation and evaluation exercises is given in Section 7. Evaluation methodologies and validation datasets include those for meteorological parameters that are relevant for concentration forecasts. Within COST728 the impact of meteorological input uncertainties on concentration and meteorological model output is of primary concern. Uncertainty estimates for other datasets, such as emissions, chemistry and kinetic data, will not be specifically investigated within COST728; instead, the uncertainty values will be extracted from other sources. Evaluation concepts that are based on fundamental physical principles rather than on single-case applications are summarized in Section 8.

The evaluation of models is a necessary, but not by itself sufficient, step to ensure reliable model results. Mesoscale models are too complex to be applied without deep knowledge of and experience in their application. There is currently no consensus on the extent or depth of training that would be required for a non-expert to competently use mesoscale models. Mesoscale model user training being undertaken by COST728 partners is summarized in Section 9. First conclusions on how to evaluate mesoscale models and how to perform training are given in Section 10. A glossary of terms follows in Annex A.

As mentioned earlier, the information discussed in this report has been derived mainly from existing methods and tools being used by the COST728 partners. This also relates to mesoscale meteorological model applications and user training aspects. Consequently, the report does not attempt to provide a fully comprehensive review covering all mesoscale models being employed or training being offered in Europe or elsewhere. However, the work cited here substantially reflects the wider research and applications being undertaken within European organizations.

* ESCOMPTE † EC 5FP project FUMAPEX: Integrated Systems for Forecasting Urban Meteorology, Air Pollution and Population Exposure; web-site: http://fumapex.dmi.dk ‡ COST715 § CITY-DELTA ** TFS: Tropospheric Research Programme funded by the German Research Minister (1996-2002)

2. COST728 MESOSCALE MODEL INVENTORY

K. Heinke Schlünzen(1), Roberto San Jose(2)
(1) ZMAW, University of Hamburg, Meteorological Institute, Hamburg, Germany
(2) Computer Science School, Technical University of Madrid, Madrid, Spain

A reliable forecast of meteorological data (wind direction and speed, turbulence parameters, humidity, radiation, boundary layer height, influence of heterogeneous terrain) is one of the preconditions for a reliable forecast of concentrations (Schlünzen, 2002). Therefore, COST728 places its emphasis on improving the meteorological models used in atmospheric dispersion studies. In this chapter an overview of the current capabilities of these models is given.

A web-based model inventory with detailed information on model capabilities is provided by COST728 (accessible from http://www.cost728.org). The inventory includes models for the microscale (models resolving the canopy layer and obstacles therein), mesoscale (regional models, domain covering at least 100 x 100 km2) and macroscale (hemispheric and global models) and covers meteorology, chemistry and transport models as well as models that simultaneously simulate meteorology, transport and chemistry. Table 1 summarizes the models relevant for COST728. Additional information on these entries can be found in Annex B.
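The inventory described above is essentially a queryable collection of metadata records, one per model, tagged by scale and capability. The following sketch illustrates the idea only; the field names (`scale`, `capabilities`) and the example entries are invented for illustration and do not reproduce the actual COST728 database schema.

```python
from dataclasses import dataclass

# Hypothetical inventory record; the real COST728 web database
# distinguishes many more attributes per model.
@dataclass
class ModelEntry:
    name: str
    scale: str           # "microscale" | "mesoscale" | "macroscale"
    capabilities: tuple  # subset of ("meteorology", "chemistry", "transport")

INVENTORY = [
    ModelEntry("MM5", "mesoscale", ("meteorology",)),
    ModelEntry("WRF_CHEM", "mesoscale", ("meteorology", "chemistry", "transport")),
    ModelEntry("MITRAS", "microscale", ("meteorology",)),
]

def select(inventory, scale=None, capability=None):
    """Filter inventory entries by scale and/or capability."""
    return [m for m in inventory
            if (scale is None or m.scale == scale)
            and (capability is None or capability in m.capabilities)]

# All mesoscale entries that also treat chemistry:
meso_chem = select(INVENTORY, scale="mesoscale", capability="chemistry")
```

Such filtered views correspond to the kind of summary tables (per model type) that the web inventory exposes.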

Table 1. Summary of the regional Eulerian meteorology models included in the web based model inventory (status 30.10.2007)

Model – Comments
ADREA – Meso-microscale model
ALADIN (AU, PL) – Comprehensive air quality model based on ALADIN forecast data
ARPS – Advanced Regional Prediction System
BOLCHEM – Meteorology model with chemistry included
GESIMA – Non-hydrostatic mesoscale model
HIRLAM (_Enviro, _NH) – Regional model, limited area
COSMO (COSMO_CH§§, COSMO_Climate model***, COSMO_EU†††, COSMO_DE‡‡‡, COSMO_IT§§§) – Non-hydrostatic mesoscale model (German versions COSMO_EU and COSMO_DE, Swiss Alpine version COSMO_CH, Italian version COSMO_IT)
MC2 – Non-hydrostatic, limited area model, semi-Lagrangian
MEMO (ES, GR, PT) – Non-hydrostatic mesoscale model
MERCURE – Limited area
MESO-NH – Non-hydrostatic limited area model; optional chemistry on-line (then called Meso-NHC)
METRAS – Non-hydrostatic community model with (passive) tracer and pollen transport; part of multi-scale model system M-SYS (meso-/microscale meteorology and chemistry); simplified non-research public domain version
MM5 (GR, PT, UK, GER) – Non-hydrostatic model; globally used
RAMS – Non-hydrostatic model, limited area
SAIMM – Prognostic non-hydrostatic model
TAPM – Very fast; simplified linear chemistry
UM – Global and regional domains; climatic variables
WRF_ARW – Mesoscale meteorological model; improved physics and new modules with respect to MM5
WRF_CHEM – Chemistry included on-line; hemispheric domain

§§ Formerly named aLMo *** Formerly named CLM ††† Formerly named LME ‡‡‡ Formerly named LM §§§ Formerly named LAMI

The model qualities are summarized in several tables as part of the inventory for the different model types (see web based model meta-database at http://www.cost728.org/). Overview tables on the equations solved, on parametrizations and solution techniques used, as well as on initialisation and nesting techniques used within the models can be found on the website. Furthermore, summary tables provide details on the validation and evaluation of the models (Section 7). The database is open for updates and new entries. Changes in the existing entries are possible at any time; the summary tables are updated automatically once changes are made to the entries.

Most models provide several options for solving the basic equations, the approximations used and the applied parametrizations. In this document only a summary is given of how they are normally applied by the research institutes contributing to COST728. Currently (as of 30.10.2007) 18 different mesoscale meteorology model families (Table 1) have been introduced in the model meta-database; five of these are hydrostatic models (ALADIN, BOLCHEM, GME, HIRLAM, TAPM). Twelve models calculate precipitation from prognostic equations (ARPS, GESIMA, COSMO, MC2, MERCURE, MESO-NH, METRAS, MM5, RAMS, TAPM, UM, WRF), another four diagnose precipitation (ALADIN, BOLCHEM, GME, HIRLAM, MEMO) and two models do not calculate precipitation at all (ADREA, SAIMM). Cloud cover is diagnostically calculated in nine of the 15 models calculating clouds (ALADIN, BOLCHEM, GME, HIRLAM, COSMO, MEMO, MERCURE, MESO-NH, UM). With the exception of two models (ALADIN, UM) all calculate turbulent kinetic energy from a prognostic equation (ADREA, ARPS, BOLCHEM, COSMO climate model, ENVIRO-HIRLAM, GESIMA, HIRLAM, COSMO, M-SYS, MC2-AQ, MCCM, MEMO, MERCURE, Meso-NH, METRAS, MM5, RAMS, SAIMM, TAPM, WRF) and four models also calculate dissipation from a prognostic equation (ADREA, MERCURE, SAIMM, TAPM). Inversion heights are calculated using prognostic equations by six models (ADREA, MC2, MEMO, MM5, SAIMM, TAPM) and diagnosed in another nine (ALADIN, ARPS, BOLCHEM, HIRLAM, COSMO_EU, MERCURE, MESO-NH, METRAS, RAMS). Details on the parametrizations used in the models can be found in the COST728-WG1 report, and on modelling systems in the COST728-WG2 report (Baklanov et al., 2008). Further information can be found on the COST 728 website****.
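The counts in the paragraph above (how many models treat precipitation prognostically, diagnostically, or not at all) are exactly the kind of summary the database can regenerate automatically whenever an entry changes. A minimal sketch, with invented entries and attribute values purely for illustration:

```python
from collections import Counter

# Illustrative entries only; attribute names and values are assumptions,
# not the actual COST728 database schema.
entries = {
    "COSMO":  {"precipitation": "prognostic", "tke": "prognostic"},
    "ALADIN": {"precipitation": "diagnostic", "tke": "none"},
    "ADREA":  {"precipitation": "none",       "tke": "prognostic"},
}

def summarise(entries, attribute):
    """Count how many models use each treatment of one attribute.
    Re-running this after an entry is edited keeps the summary table
    consistent with the entries, with no manual bookkeeping."""
    return Counter(meta[attribute] for meta in entries.values())

precip = summarise(entries, "precipitation")
```

Running `summarise` over each tabulated attribute reproduces one row of the automatically updated summary tables.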

Among the non-hydrostatic models, five models (COSMO, MC2, MM5 family members, UM, WRF) use no approximations (non-hydrostatic, fully compressible models), five the anelastic approximation (GESIMA, MERCURE, MESO-NH, METRAS, NH-HIRLAM) and six (BOLCHEM, GESIMA, MEMO, METRAS, RAMS, SAIMM) the Boussinesq approximation. ALADIN, BOLCHEM, HIRLAM, SAIMM and TAPM are hydrostatic models.

Table 2 lists the meteorological and chemical models according to the classical approach of solving the dispersion equation: Lagrangian †††† and Eulerian. Additionally, the classification includes the terms mesoscale, global and microscale for the Eulerian type of models in regard to the classical meteorological approach of the extension of the domain covered by the model simulation. The information on the models is again mostly based on the COST model inventory‡‡‡‡, which does not include OPANA and CFS. OPANA is documented in the European Environmental Agency model inventory§§§§ and CFS is the operational climate model from NCEP (USA)*****.

**** http://www.cost728.org/ †††† Lagrangian concerns here the model type and not the method of numerical solution (e.g. Lagrangian advection schemes) ‡‡‡‡ http://www.cost728.org §§§§ http://pandora.meng.auth.gr/mds/showlong.php?id=113 ***** http://cfs.ncep.noaa.gov/

Table 2. Summary of the Eulerian and Lagrangian transport and chemistry & transport models included in the web based model inventory (status 05.02.2007)

Chemistry & transport model – Comments

Eulerian mesoscale models
AURORA – Employs exactly the same grid as its meteorological 'driver' (the ARPS model)
BOLCHEM – Can operate using two different gas-phase chemistry schemes: SAPRC-90 and CB-IV
CAMx (_ALADIN, _MM5) – Many applications; several chemical schemes; optimized numerics; OSAT and PA
CALGRID – Meteorology is taken from CALMET; simple chemistry
CHIMERE – Meteorology with MM5 and COSMO-IT
CMAQ – Meteorology with MM5; several chemical schemes and process analysis (PA)
EMEP – European applications
Enviro_HIRLAM
EPISODE – Combined Eulerian and Gaussian approaches; limited area; simple linear chemistry; diagnostic meteorology or from MM5
LOTOS-EUROS – Ozone and aerosol chemistry; European domain; 4 vertical layers
MARS – Model for the Atmospheric Dispersion of Reactive Species
MATCH – Basic atmospheric chemistry
MC2-AQ – Mesoscale Compressible Community - Air Quality
MCCM – Mesoscale Climate Chemistry Model
MECTM††††† – 3D Eulerian photochemistry and aerosol chemistry community model; part of model system M-SYS; meteorology from METRAS
MOCAGE – With chemical data assimilation
MUSE – Multilayer dispersion model; photochemistry and PM; meteorology from MEMO
OFIS – Two-layer 2-dimensional Eulerian photochemical model
OPANA – Eulerian chemical model based on MM5-CMAQ, including the EMIMO emission model
SILAM – Dual Eulerian & Lagrangian Monte-Carlo modelling system with modular chemistry and data assimilation options; HIRLAM & ECMWF meteorology
TCAM – Aerosol chemistry; limited area model
TREX – Limited area; simple chemistry
WRF/Chem – Weather Research and Forecasting Chemistry model

Lagrangian models
FLEXPART – Particle Lagrangian model; European domain
FLEXTRA – Trajectory model; European domain
LPDM – Lagrangian particle dispersion model
SILAM – Lagrangian dispersion model with a high-precision iterative advection algorithm and a Monte-Carlo random-walk representation of atmospheric diffusion

Eulerian microscale models
MICTM‡‡‡‡‡ – 3D microscale Eulerian photochemistry model; meteorology from MITRAS

Eulerian global models
CFS – Climate Forecast System; an operational climate system

††††† Part of model system M-SYS ‡‡‡‡‡ Part of model system M-SYS

As of February 2007, 47 different transport or transport-and-chemistry model families (Annex B) are included in the database. In the mesoscale, 29 of the models (Table 2) include chemical reactions (AERMOD, ALADIN-CAMx, AURORA, BOLCHEM, CALGRID, CALPUFF, CAMx_ALADIN/_MM5, CHIMERE, CMAQ, EMEP, EPISODE, FARM, LOTOS-EUROS, MARS, MATCH, MC2, MCCM, MECTM, Meso-NH, MOCAGE, MUSE, OFIS, OPANA, RCG, SILAM, TAPM, TCAM, TREX, WRF/Chem), and 13 include aerosol chemistry (ALADIN-CAMx, CHIMERE, CMAQ, ENVIRO-HIRLAM, FARM, MECTM, MESO-NH, MOCAGE, MUSE, OFIS, RCG, TCAM, WRF/Chem). The meteorology is taken from various other models and interpolated to the grid used in the CTM. Some chemistry transport models use the same grid as their meteorological drivers (ALADIN-CAMx, CMAQ, LPDM, MATCH, MC2-AQ, MECTM, MOCAGE, SILAM, TREX) and thereby avoid additional errors caused by the interpolation. Ten mesoscale models avoid the problem by calculating both meteorology and chemistry (BOLCHEM, CALMET/CALPUFF, CALMET/CAMx, MC2-AQ, MCCM, Meso-NH, RCG, TAPM, WRF/Chem).

3. SUMMARY OF MESOSCALE MODEL APPLICATIONS

Joanna Struzewska(1), K. Heinke Schlünzen(2)
(1) Warsaw University of Technology, Faculty of Environmental Engineering, Institute of Environmental Engineering Systems, Warsaw, Poland
(2) ZMAW, University of Hamburg, Meteorological Institute, Hamburg, Germany

The main purposes of mesoscale air quality models are to quantify the concentration levels of primary and secondary gaseous and particle pollutants, to assess the loading of acidifying compounds to the different parts of the ecosystem, and to understand the physical and chemical processes involved in formation, transport and deposition of these compounds.

In the past, air quality models were divided into two categories: policy decision support models and research models. Modelling tools designed to provide input for policy purposes had simplified descriptions of physical and chemical processes and were used to carry out simulations over long time periods or for multiple scenarios. Research models, which include a complex description of atmospheric processes, required large computer resources to carry out long-term integrations or multiple runs for policy applications. At present, due to increasing computational capabilities, state-of-the-art air quality models are being used in decision-making processes. Even long-term assessments of abatement measures are to be evaluated with models that give reliable results under a variety of environmental conditions. This requirement is also crucial for operational and semi-operational systems used to inform the public on air quality or on the possible occurrence of smog episodes.

Table 3. Mesoscale air quality models' application

Long term – Policy support:
• Long term assessment (concentration, deposition, exposure)
• Emission reduction policy
Long term – Research:
• Trends, seasonal and interannual variation of trace species concentrations
• Climatological transport pathways
• Regional climate change impacts and feedbacks
Short term – Policy support:
• Public information (on-line forecast, alerts on episodes)
• Emergency response
Short term – Research:
• Chemical processes studies (e.g. SOA formation, removal processes, photooxidants formation)
• Biogenic and natural emissions variation (VOC, primary aerosols)
• Impact of meteorological processes on pollutants transformation, transport and dispersion

For short term scenarios, air quality model applications might be connected with the description of dispersion characteristics, chemical transformation and removal. There is an ongoing effort to increase the understanding of the fundamental physical and chemical processes that govern pollutant transformation and transport in the atmosphere. This effort is important to improve and develop comprehensive parametrization schemes for meteorological and air quality modelling systems.

In the mesoscale, the air flow depends both on dynamics and on energy balance heterogeneities (i.e. spatial variation of surface characteristics, terrain slope). For air pollution dispersion, thermal effects are especially important during periods characterised by weak synoptic forcing, which – due to poor ventilation – are favourable for the formation of pollution episodes. Uncertainties are mostly found in the flow fields above complex terrain, when precipitation and clouds develop, and in the structure of the planetary boundary layer. Hence, current research activity in this area is dedicated to better describing surface fluxes over complex terrain, turbulent mixing and the height of the planetary boundary layer. In addition, convective processes and cloud formation are considered to be important factors influencing pollutant distributions.

Meteorological conditions not only impact the pollutants dispersion, but also the chemical transformation processes, the intensity of biogenic emissions and the efficiency of dry and wet removal. This requires proper treatment of surface layer characteristics and correct approaches for radiation and condensation. Figure 2 schematically shows the dependence of calculated concentrations on meteorological parameters and on chemical/physical processes that depend on pollutant and other characteristics. The Figure also indicates the meteorological, physical and chemical parameters and processes that should be treated in a mesoscale model.

[Diagram: concentration at the centre, linked to meteorological parameters (wind, turbulence, radiation, temperature, humidity, clouds) and to pollutant characteristics via chemical transfers (e.g. chemical reactions) and physical transfers (e.g. aerosol growth); emission (gas, particle, point, area, ...) enters at the top and deposition (wet and dry) leaves at the bottom, each depending on pollutant, release and surface characteristics]

Figure 2. Sketch of necessary meteorological parameters (in green circle) and pollutant characteristics (in blue cube) including their dependency on external characteristics in the outermost field (adapted from Schlünzen & Krell, 1994)

Comprehensive chemical mechanisms (often with a few hundred reactions) involving a large number of chemical species (~70 or more) are included in current air quality models. Considering the type of problem modelled, however, three major applications can be distinguished: summer photochemical pollution episodes, aerosol formation and distribution, and the assessment of acidification and eutrophication (Table 4).

Table 4. Model applications – types of air quality problems

Type of problem | Model applications

Photooxidants
• Impact of local circulations (breeze, urban heat island, flow over complex terrain)
• Summer smog downwind of urban areas (urban plume)
• Pan-European summer photochemical episodes
• Long-range transport of ozone and its precursors

Aerosols
• Urban winter smog formation
• SOA formation
• Long-range transport of naturally emitted aerosols
• Aerosol size segregation and speciation

SOx, NOx, NHx
• Critical loads (deposition)
• Wet removal (rain and fog)

All institutions involved in the COST728 Action run significant research programmes in the field of dispersion meteorology and air quality. Based on the results of an internal survey, universities are mostly research oriented, while meteorological services, owing to their national responsibilities, combine scientific activity with more practical and policy-oriented applications (Table 5). It is worth noting that there is a clear tendency towards "one-atmosphere" models, in which all the different aspects mentioned in Table 4 and Figure 2 are treated within the same model system.

Table 5 provides an overview of the partner activities within COST728. These activities reflect the types of applications in which air quality models are applied in Europe. Note that not all research work is listed for the different institutions.

Table 5. Main contributions of the participating institutions to the research activities of COST728

Application type, spatial coverage and period of the Table 5 columns:
Dispersion meteorology | local to regional scale | short term
Air quality | global to regional scale | long term
Air quality | regional scale | short and long term
Chemistry | regional to local scale | short and long term

Research activities (rotated column headers in the original layout): ABL height parametrization; Urban surface layer; Surface fluxes; Dispersion influenced by local meteorology†; Short-term air quality episodes; Air quality assessment over long periods; Climate change impacts on air quality; Long-range and transboundary transport; Urban air quality; Anthropogenic and biogenic emissions; Long-range transport of natural aerosols*; Aerosol processes; Dry and wet removal; Feedback to meteorology; Emission & transport processes‡

Institution
Aristotle University of Thessaloniki X X X
ARPA Hydro-Meteorological Service (Emilia-Romagna region, Italy) X X X X
Bogazici University X X

Bulgarian Academy of Sciences X X X

Czech Technical University X X

Danish Meteorological Institute X X X X X X X X

Deutscher Wetterdienst (German Weather Service) X X X

Dokuz Eylul University X

Earth System Research Lab, Global Systems Div. (ESRL/GSD) X X X X X X X X

* Natural aerosol: dust, sea salt, pollens, particles from biomass burning † Local circulations and/or complex terrain ‡ gas pollutants and particles (incl. radionuclides)


Finnish Meteorological Institute X X X X X X X

Flemish Institute for Technological Research X X X X X X

Fundación CEAM X X

GKSS Research Center X X

Graz University of Technology X X X

Hungarian Meteorological Service X X X

Instituto de Meteorologia, Lisboa X X

Istanbul Technical University X X X X

KNMI Royal Netherlands Meteorological Institute X X X

MAQNet, York University, Canada X X X X

Meteo-France X X

Meteorological Institute, ZMAW, University of Hamburg X X X X X X X X X

National Institute of Meteorology and Hydrology X X X

Norwegian Meteorological Institute X X X

Paul Scherrer Institute X X X X

Swedish Meteorological and Hydrological Institute X X X X

Technical University of Madrid (UPM) X X X X

TNO-MEP X X X X X X X X X

UK X X

Universitat Politècnica de Catalunya X X

University of Athens (NKUOA) X

University of Aveiro X X X X X

University of Brescia X X X X X

University of Hertfordshire X X X X X X X X

University of Sofia X X

University of Tartu X

University of West Macedonia X X X

US EPA Atmospheric Modelling Division X X X X X X X X X X X

Warsaw University of Technology X X X X

4. DETERMINATION OF MODEL UNCERTAINTY

Ana Isabel Miranda(1), Anabela Carvalho(1), Richard Tavares(1), Peter Builtjes(2), Víctor Prior(1), Carlos Borrego(1), K. Heinke Schlünzen(3), Barbara Fay(4), Viel Ødegaard(5)

(1) CESAM & Department of Environment and Planning, University of Aveiro, 3810-193 Aveiro, Portugal
(2) TNO, Dept. of Air Quality and Climate Change, Utrecht, the Netherlands, and Free Univ. Berlin, Inst. of Meteorology, Berlin, Germany
(3) ZMAW, University of Hamburg, Meteorological Institute, Hamburg, Germany
(4) Deutscher Wetterdienst, Offenbach, Germany
(5) Norwegian Meteorological Institute, Blindern, Oslo, Norway

The main purpose of this chapter is to present a state-of-the-art review on the impact of model errors on meteorological data relevant for concentration calculations, giving some case studies as examples. Hence, this review deals with model uncertainty estimation methodologies, namely, those related to meteorological outputs important for air quality simulation.

Uncertainties associated with air quality model simulations are varied and complex (Fine et al., 2003). Despite the need to quantify these uncertainties (Dabberdt et al., 2004), few attempts have been made to investigate meteorological uncertainties and their role in limiting the expected accuracy of deterministic air quality simulations. The impact of uncertainties in meteorological inputs has been particularly difficult to assess because of the complex correlations, both in the spatio-temporal evolution of the individual meteorological inputs and among the meteorological inputs (Sathya, 2003). On the one hand, meteorology may control or influence emission rates of chemical species and aerosol formation processes (Seaman, 2000), owing to the strong dependence of reaction rates on relative humidity, solar energy, temperature and the presence of liquid water; on the other hand, chemical species concentrations are influenced by thermodynamical processes. Boundary layer structure is strongly related to chemical species concentrations in air quality modelling systems, especially concerning mixed-layer depth, boundary layer stability, turbulent mixing intensity and the lower tropospheric three-dimensional wind field (Shafran et al., 2000). These quantities are determined by atmospheric processes that must be simulated accurately, namely horizontal and vertical transport, turbulent mixing and convection. Seaman (2000) listed the principal meteorological state variables usually supplied to air quality models:

• Horizontal and vertical wind components.
• Temperature.
• Water vapour mixing ratio.
• Cloud fraction and liquid water content.
• Solar actinic flux.
• Sea level pressure.
• Boundary layer depth.
• Turbulence intensity.
• Surface fluxes for heat, moisture and momentum.

With such a large number of input fields, estimating the uncertainties in the model outputs is not a trivial exercise. In principle, there are two main methods to investigate model uncertainty, Monte Carlo analysis (Section 4.1) and sensitivity studies (Section 4.2).

4.1 Monte Carlo Meteorological and Air Quality Data Uncertainty Analysis

Monte Carlo analysis is one of the most commonly used methods to estimate the effect of uncertainties in model input variables, since it is based on quite simple principles (Hanna et al., 1998; Hanna et al., 2001; Bergin et al., 1999). It may be applied to a complete set of more than 100 input parameters, and it allows the use of standard nonparametric statistical tests concerning confidence intervals. Several studies have included Monte Carlo simulations with perturbed meteorological and photochemical variables (Hanna et al., 2001; Beekmann and Derognat, 2003), which attempt to span the range of uncertainties of the input parameters by quasi-random sampling from a specified probability distribution for each parameter, as well as adjoint linear sensitivity studies of meteorological and photochemical variables about a control parameter set (Menut, 2003). According to Zhang et al. (2005), all these studies have limitations in the manner of treating

meteorological variability. The Monte Carlo simulations apply adjustments to meteorological fields that are uniform in space and time, thereby ignoring the true scales of meteorological variability and the differences in meteorological uncertainty across scales (Hogrefe et al., 2001). The linear sensitivities computed by the adjoint technique are valid only in the neighbourhood of the control simulation and, in the case of sensitivity to wind, that neighbourhood is likely to be quite small (Yegnan et al., 2002).

The Monte Carlo uncertainty analysis deals with only one component of the total model uncertainty: the uncertainty in the inputs to the model. In the Monte Carlo procedure, a model is run a large number of times. Each time, new values for all the input variables whose variability is considered are selected from their respective pre-defined uncertainty distributions using a suitable sampling technique such as Simple Random Sampling (SRS) or Latin Hypercube Sampling (LHS), and the model outputs are recorded. The ensemble of model outputs may then be subjected to statistical analysis to ascertain the uncertainty in model predictions due to input uncertainties.
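As a minimal sketch of this procedure (in Python, using Simple Random Sampling, a purely hypothetical response function in place of an air quality model, and invented input distributions), the propagation of input uncertainty to an output percentile range can be written as:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy "model": peak ozone (ppm) as a simple nonlinear function of
# emissions, temperature and mixing height. Illustration only; this stands in
# for a full photochemical model run.
def toy_peak_ozone(nox_emis, voc_emis, temp_k, mix_height_m):
    return (0.05 * (voc_emis / nox_emis) ** 0.3
            * np.exp(0.03 * (temp_k - 288.0))
            * (1000.0 / mix_height_m) ** 0.5)

n_runs = 1000

# Simple Random Sampling from assumed input distributions:
# log-normal for emissions and mixing height, normal for temperature.
nox = np.exp(rng.normal(np.log(50.0), 0.347, n_runs))     # factor-of-2 95% range
voc = np.exp(rng.normal(np.log(100.0), 0.347, n_runs))
temp = rng.normal(295.0, 1.5, n_runs)                      # +/- 3 K 95% range
zmix = np.exp(rng.normal(np.log(1200.0), 0.203, n_runs))   # factor-of-1.5 range

# Run the "model" once per sampled input set and record the outputs.
ozone = toy_peak_ozone(nox, voc, temp, zmix)

# Output uncertainty: percentiles of the ensemble of model outputs.
p2_5, p50, p97_5 = np.percentile(ozone, [2.5, 50, 97.5])
print(f"2.5th / 50th / 97.5th percentiles: {p2_5:.3f} / {p50:.3f} / {p97_5:.3f} ppm")
```

Latin Hypercube Sampling differs only in how the input samples are drawn (stratified rather than purely random); the propagation and the statistical analysis of the output ensemble are identical.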

The following example summarises the work of Hanna et al. (2001), started in 1997, to illustrate the Monte Carlo simple random sampling methodology applied to input data uncertainties and their impacts on photochemical model results. UAM-V is the air quality model used in this study.

It is essential to obtain knowledge of the uncertainty of each specific input variable. The first step is to identify the input parameters that will be considered in the Monte Carlo experiment; the second is to associate with each input parameter a distribution function (shape and key parameters such as median and variance). Hanna et al. (2001) gathered all this information through an expert elicitation in which around 20 experts were asked, via a web page, to give estimates of uncertainties based on their experience. The experts had to give estimates of the uncertainty range that would include 95% of the possible values (i.e., from the 2.5th percentile to the 97.5th percentile of the cumulative distribution function, CDF). Table 6 gives the information compiled by Hanna et al. (2001) for the considered input variables: their 95% uncertainty ranges, their assumed distribution functions and the standard deviations of the natural logarithm of the input variable (for log-normal distributions) or of the input variable itself (for normal distributions).

No correlations between input variables were considered, due to lack of information. Moreover, there are some constraints on independent random value estimation for input wind speed and direction for each site at the same instant. Almost all of the 128 input variables considered in the Monte Carlo experiment are described, by assumption, by a log-normal distribution function. Exceptions are wind direction, ambient temperature, relative humidity and cloud cover, which are assumed to follow a normal distribution.

Table 6. Uncertainty ranges (including 95% of the data) and associated sigmas (standard deviations of log-transformed data) for some of the 128 UAM-V input variables studied in the Monte Carlo runs by Hanna et al. (2001). An uncertainty range defined by plus and minus a "factor of F" encompasses 95% of the data. For small uncertainty factors (i.e., less than 2), a factor of 1 + x can be read as "plus and minus 100x %"

Variable | Uncertainty range (includes 95% of data) | Sigma (log-normal unless noted)
Initial ozone concentration | Factor of 3 | 0.549
Initial NOx concentration | Factor of 5 | 0.805
Initial VOC concentration | Factor of 5 | 0.805
Top ozone concentration | Factor of 1.5 (50%) | 0.203
Top NOx concentration | Factor of 3 | 0.549
Top VOC concentration | Factor of 3 | 0.549
Side ozone concentration | Factor of 1.5 | 0.203
Side NOx concentration | Factor of 3 | 0.549
Side VOC concentration | Factor of 3 | 0.549
Major point NOx emissions | Factor of 1.5 | 0.203
Major point VOC emissions | Factor of 1.5 | 0.203
Wind speed | Factor of 1.5 | 0.203
Wind direction | ± 40° | 20° (normal)
Ambient temperature | ± 3 K | 1.5 K (normal)
H2O concentration (as RH) | 30% | 15.0% (normal)
Vertical diffusivity (8 AM-6 PM; < 1000 m AGL) | Factor of 1.3 (30%) | 0.131
Vertical diffusivity (all other times and heights) | Factor of 3 | 0.549
Rainfall amount | Factor of 2 | 0.347
Cloud cover (tenths) | 30% | 15% (normal)
Cloud liquid water content | Factor of 2 | 0.347
Area biogenic NOx emissions | Factor of 2 | 0.347
Area biogenic VOC emissions | Factor of 2 | 0.347
Area mobile NOx emissions | Factor of 2 | 0.347
Area mobile VOC emissions | Factor of 2 | 0.347
Area low point VOC emissions | Factor of 2 | 0.347
Other area NOx emissions | Factor of 2 | 0.347
Other area VOC emissions | Factor of 2 | 0.347
NO2, HCHOr, HCHOs, ALDs and O3-O1D photolysis rates | Factor of 2 | 0.347
CB-4 reactions 1-94 | Factor of 1.01 to 3.02 (median 1.80, mode 2.5) | 0.10 to 0.55 (median 0.30, mode 0.46)
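The sigmas listed in Table 6 follow directly from the 95% uncertainty ranges: for a log-normal variable, a "factor of F" range places the 97.5th percentile a factor F above the median, i.e. approximately two standard deviations above it on the log scale, so sigma = ln(F)/2. A short check of this relation (Python):

```python
import math

def sigma_from_factor(f):
    # 95% range "factor of f": ln(f) spans about 2 sigma on the log scale,
    # so the standard deviation of the log-transformed variable is ln(f)/2.
    return math.log(f) / 2.0

# Reproduce the sigmas listed in Table 6 for the factor-type ranges.
for f, table_sigma in [(1.5, 0.203), (2.0, 0.347), (3.0, 0.549), (5.0, 0.805)]:
    print(f"factor of {f}: sigma = {sigma_from_factor(f):.3f} (Table 6: {table_sigma})")
```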

Some additional estimates on input and output data uncertainty have been given by COST728 experts in Annex C.

The simple random sample Monte Carlo exercise was performed for a period containing an ozone episode, 12-14 July 1995, over a domain covering the north-eastern part of the United States and referred to as the Ozone Transport Assessment Group (OTAG) domain. This domain is divided into 11 sub-domains in which the results were analysed. Four emission input files were considered: 1995 emissions, 2007 estimated emissions, 50% NOx reductions on anthropogenic emissions in 2007 and, also for that year, 50% VOC reductions on anthropogenic emissions. The sets of 128 random perturbation numbers for the 100 Monte Carlo runs were identical across the four base emissions scenarios.

Gross uncertainties in the output results were determined with a high level of confidence. The variance of the output variables can be well defined, allowing one, for example, to determine the range of variance in the predicted maximum daily hourly average ozone concentrations with 100 Monte Carlo runs.

To illustrate the consistent spread of the distributions, Table 7 lists the 2.5th, 50th, and 97.5th percentiles of the distributions of predicted maximum hourly averaged ozone concentration (ppm), over the 12-14 July 1995 period, for the 11 sub-domains and for the entire OTAG domain (whole-domain) from the 100 Monte Carlo runs with the median year-2007 projected emissions.

Table 7. 2.5th, 50th and 97.5th percentiles of the cumulative distribution function of the 100 Monte Carlo predictions of maximum hourly averaged ozone concentration (ppm) for the 11 sub-domains and for the entire OTAG domain (12 km grid), for the 12-14 July 1995 ozone period and for year-2007 median emissions (Hanna et al., 2001)

Sub-domain | 2.5th percentile | 50th (median) | 97.5th percentile
Atlanta | 0.09 ppm | 0.17 ppm | 0.32 ppm
Balt-Wash | 0.08 ppm | 0.14 ppm | 0.22 ppm
Nashville | 0.07 ppm | 0.12 ppm | 0.21 ppm
Chicago | 0.07 ppm | 0.12 ppm | 0.19 ppm
Louisville | 0.06 ppm | 0.12 ppm | 0.19 ppm
Pittsburgh | 0.07 ppm | 0.12 ppm | 0.18 ppm
Philly | 0.07 ppm | 0.11 ppm | 0.19 ppm
New York | 0.06 ppm | 0.11 ppm | 0.19 ppm
New England | 0.06 ppm | 0.11 ppm | 0.19 ppm
Charlotte | 0.07 ppm | 0.11 ppm | 0.18 ppm
St. Louis | 0.06 ppm | 0.09 ppm | 0.15 ppm
Whole Domain | 0.13 ppm | 0.19 ppm | 0.32 ppm

In another study, Beekmann and Derognat (2003) applied a Bayesian Monte Carlo (BMC) uncertainty analysis to a case-study simulation (7 August 1998 and 16-17 July 1999) of photochemical smog formation in the Ile-de-France region during the ESQUIF campaign. The uncertainty assessment is based on the chemistry transport model CHIMERE, covering the European continental scale with nesting options for several urban areas. The study addresses the overall model uncertainty due to several model input parameters (emissions, meteorological parameters, rate constants, photolysis frequencies). The Bayesian variant of Monte Carlo analysis allows larger weights to be attributed to those individual simulations which give a better fit to observations. The authors obtained the following results:

• Uncertainties in the simulated ozone maxima (O3 max) for the 3 days, both for the baseline and for the 50% reduced emissions scenario, are reduced by a factor between 1.5 and 2.7 by the measurement constraint, and range between ±15 and ±30% (when expressed as relative differences between the 50th and the 10th or 90th percentiles).
• Uncertainties in the simulated differential sensitivity of ozone formation to NOx and VOC emission reductions are reduced by a factor between 1.8 and 3.1 by the measurement constraint, and range between ±4 and ±10 ppb (when averaged over the plume). The measurement constraint induces little change in daily surface ozone (DSO) for 7 August, shifts it to even more positive values for 16 July, and shifts it to negative values for 17 July (larger probability of a sensitivity to NOx emission reductions).
• The constraint by ozone measurements in the urban area and in the plume is, in most cases, sufficient to efficiently reduce uncertainties in O3 maxima for the baseline and for the 50% reduced anthropogenic emissions scenario. Additional nitrogen species measurements (NOy, NOx) in the plume are necessary to reduce the uncertainties in the DSO; additional constraints by VOC and wind measurements only slightly change the results.
• Sensitivity tests with modifications in the BMC method (varying uncertainty ranges for input parameters, a log-normal instead of a normal distribution of the uncertainty in observations, and a larger number of Monte Carlo simulations) confirm that the results are robust.
• Changes in the simulated sensitivity to NOx and VOC emission reductions are related to the modified a posteriori distributions of emissions, i.e., a smaller average VOC/NOx emission ratio for 16 July (-22%) and a larger one for 17 July (+27%). A possible underestimation of the PBL height on 17 July would cause a posteriori NOx emissions to decrease to a lesser extent (-11% instead of -21%).
The median wind speed and propane-equivalent carbon/NOy ratio obtained with the standard constraint alone are in good agreement with corresponding measurements from the DIMONA flight.
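The weighting step that distinguishes the Bayesian variant from plain Monte Carlo can be sketched as follows (Python; the ensemble values, the observation and the Gaussian error model are hypothetical, not taken from the ESQUIF study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical prior ensemble of Monte Carlo ozone simulations at one
# observation site (ppb), all members initially carrying equal weight.
simulated = rng.normal(90.0, 15.0, 500)
observed, obs_sigma = 100.0, 5.0   # observation and its assumed uncertainty

# Bayesian Monte Carlo: weight each ensemble member by its likelihood given
# the observation (Gaussian error model), then normalise the weights.
weights = np.exp(-0.5 * ((simulated - observed) / obs_sigma) ** 2)
weights /= weights.sum()

# Members that fit the observation better now dominate the statistics.
prior_mean = simulated.mean()
posterior_mean = np.sum(weights * simulated)
print(f"prior mean {prior_mean:.1f} ppb -> posterior mean {posterior_mean:.1f} ppb")
```

With several observations, the exponent becomes a sum over sites and times; posterior percentiles are then computed from the weighted, rather than the equally weighted, ensemble, which is how the constrained uncertainty ranges quoted above are obtained.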

4.2 Sensitivity Analysis

An alternative approach to investigating model uncertainty is the use of sensitivity studies.

4.2.1 Meteorological and Photochemical Ensemble Simulations

Zhang et al. (2007) present an ensemble approach to evaluate the impacts of meteorological uncertainties on ozone pollution. The purpose of the study is to investigate the sensitivity of Eulerian grid model ozone simulations to small perturbations of meteorological variables that are realistic in structure and evolution. Through this ensemble approach it is demonstrated that the sensitivity of ozone to such perturbations is substantial and constitutes a serious limitation on deterministic photochemical simulations. The study demonstrates the impact of such uncertainties on ozone pollution predictability through ensemble forecasts using current state-of-the-art meteorological and photochemical prediction models. The strong correlation of peak ozone with initial wind and temperature uncertainties clearly demonstrates the importance of an accurate representation of meteorological conditions for local prediction. The paper illustrates the real need for probabilistic evaluation and forecasting of air pollution, in particular for regulatory purposes.

4.2.2 Input Parameters Sensitivity Analysis (Topography, Land Use)

This case study concerns the influence of thermally induced circulations on photochemical model results. The applied numerical system includes the models MM5 and MARS. The model domain covers the north-western part of mainland Portugal (40 × 40 grid cells with 5 × 5 km resolution). The simulated period for the pollutant dispersion in the atmosphere comprised two consecutive summer days, 15 and 16 July 2000, under the influence of a thermal low-pressure system located over the Iberian Peninsula. More details on this modelling study can be found in Carvalho et al. (2006).

The sensitivity analysis of the vertical ozone concentration fields was carried out using the factor analysis developed by Stein and Alpert (1993). According to these authors, 2^n simulations are required to correctly evaluate the contribution of, and the interaction between, the n factors. In this study a set of four simulations was performed (Table 8). The modified characteristics are constant terrain height (flat terrain at 0 m) and constant land use, defined as mixed shrub and grassland (code 9 of the United States Geological Survey database).

Table 8. Sensitivity analysis options

Simulation | Land use | Topography
f12 | USGS | USGS
f2 | USGS | Null
f1 | Constant | USGS
f0 | Constant | Null

The ozone concentration fields obtained are labelled according to the four simulations (Table 8). Hence, fields are obtained for the control simulation (f12), for the flat terrain simulation (f2), for the constant land use category simulation (f1) and for the simulation where both factors are set constant, i.e., flat terrain and constant land use category (f0). To detect the contribution of these factors to the vertical ozone distribution it is necessary to analyse the fields defined by:

f̂0 = f0
f̂1 = f1 − f0
f̂2 = f2 − f0
f̂12 = f12 − (f1 + f2) + f0

where f̂0 is the ozone concentration not related to either of the two factors under analysis, f̂1 is the ozone concentration induced by the topography, f̂2 shows the influence of heterogeneous land use, and f̂12 gives information on the non-linear interaction of these two factors on the ozone concentration field. It is possible to conclude that vertical gradients almost vanish for the simulation where flat terrain and constant land use are considered, although some

ozone spots appear below 5 km altitude. For these conditions, higher ozone concentrations are observed at higher altitudes over land. An enhancement of ozone concentrations is simulated over the ocean (western part of the domain), being more pronounced when topography is the inducing factor (f̂1). At noon, ozone concentrations, as well as their horizontal extension, increase in the western part of the domain due to both factors (f̂1 and f̂2) for the two simulated days. This feature is also detected at 18 UTC, but is less intense. On the other hand, land use effects (f̂2) are more related to decreasing ozone values over land (centre and east of the domain). Topography is a leading factor in ozone transport to the higher levels in the western part of the domain.
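The Stein and Alpert (1993) factor separation used above can be written compactly; the sketch below (Python, with small made-up 2-D concentration fields in place of the MM5/MARS output) applies the definitions for two factors:

```python
import numpy as np

def factor_separation(f0, f1, f2, f12):
    """Stein and Alpert (1993) two-factor separation.

    f0  : run with both factors off (flat terrain, constant land use)
    f1  : run with factor 1 only (topography)
    f2  : run with factor 2 only (land use)
    f12 : control run with both factors
    """
    fhat0 = f0                      # part unrelated to either factor
    fhat1 = f1 - f0                 # pure topography contribution
    fhat2 = f2 - f0                 # pure land use contribution
    fhat12 = f12 - (f1 + f2) + f0   # non-linear interaction term
    return fhat0, fhat1, fhat2, fhat12

# Made-up ozone fields on a tiny grid (ug/m3), for illustration only.
f0 = np.full((2, 2), 80.0)
f1 = f0 + np.array([[5.0, 2.0], [0.0, -3.0]])
f2 = f0 + np.array([[1.0, 4.0], [2.0, 0.0]])
f12 = f0 + np.array([[9.0, 5.0], [1.0, -4.0]])

fhat0, fhat1, fhat2, fhat12 = factor_separation(f0, f1, f2, f12)

# By construction the four terms sum back to the control simulation.
print(np.allclose(fhat0 + fhat1 + fhat2 + fhat12, f12))  # prints True
```

Note that the four contributions are exactly complementary: their sum reproduces the control run, which is why 2^n simulations suffice to separate n factors.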

The air quality stations selected for the MARS ozone uncertainty estimation are Teixugueira, Estarreja and Coimbra, representative of a rural, an industrial and an urban location, respectively. The coefficient of variation (CV; Annex E) was the parameter selected for the uncertainty evaluation (Figure 3).
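The CV itself is simply the standard deviation of the hourly concentration series normalised by its mean. A minimal sketch (Python, with invented hourly ozone series rather than the actual station data):

```python
import numpy as np

def coefficient_of_variation(hourly_conc):
    # CV (%) = standard deviation / mean * 100 of the hourly series;
    # a larger CV means larger relative variability around the mean.
    c = np.asarray(hourly_conc, dtype=float)
    return 100.0 * c.std() / c.mean()

# Hypothetical hourly ozone series (ug/m3) at two contrasting stations.
rural = [60, 95, 120, 140, 110, 70, 40, 55]   # strong diurnal cycle
urban = [80, 85, 90, 95, 92, 88, 84, 82]      # NOx titration damps the cycle
print(f"rural CV = {coefficient_of_variation(rural):.1f} %")
print(f"urban CV = {coefficient_of_variation(urban):.1f} %")
```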

[Figure 3: bar chart of CV (%), on a 0-60% scale, at the air quality stations Coimbra, Avanca and Teixugueira, for the four runs: control simulation (f12), flat terrain (f2), constant land use (f1), and flat terrain with constant land use (f0).]

Figure 3. Coefficient of variation (CV) of the hourly ozone concentration results for each run, due to the input changes in topography and land use listed in Table 8

As expected, the air quality stations located at rural or industrial/rural sites show greater CV values for all performed simulations; these may reach values higher than 50% for the control simulation at Teixugueira. Over these sites an enhancement of variability is also observed when constant land use is introduced. The simulated ozone results at the urban air quality station of Coimbra are slightly more variable around the mean values when flat terrain is considered. This result agrees with that obtained by Stein and Alpert (1993).

4.2.3 Adjoint Modelling Approach

According to Menut (2003), several sensitivity studies have been performed in which a single perturbation is applied to an input parameter and its influence on the modelled concentration is diagnosed. These methodologies are powerful for uncertainty investigations. However, such studies provide limited information on the ranking of parameters by magnitude of impact. In addition, no information can be derived on the time and location of the most important contribution of a chosen parameter. Adjoint modelling uses a different approach, since the sensitivity of one pollutant to all parameters is estimated within a single model integration. Menut (2003) focused on the sensitivity of O3, Ox and NOx in the surface layer.

4.2.4 Sensitivity of Model Results to Nesting*

Lenz et al. (2000) present a study in which the sensitivity of the results of a high-resolution chemistry transport model is analysed with respect to nesting the model into a larger scale model. The simulations for meteorology were performed with the mesoscale transport and fluid model METRAS (Schlünzen, 1990; Schlünzen et al., 1996), and the simulations for transport and chemistry with the mesoscale chemistry transport model MECTM (Müller et al., 2000). The model study was performed for the second measuring campaign of the TRACT experiment, which took place in the border area of south-western Germany, north-eastern France and northern Switzerland on 16 September 1992 (Zimmermann, 1995). Airplane measurements of different meteorological and chemical quantities were taken along two flight patterns in the model area. These measurements have been used for comparison with the model results and to assess the importance of model nesting for selected quantities. Cumulative distribution functions were calculated, and hit rates with respect to measurement reliability were determined. METRAS and MECTM were nested into the larger scale models MM5 (Grell et al., 1993) and CTM2 (Hass, 1991) from the EURAD model system (Ebel et al., 1997). For the prognostic meteorological variables, the method of nudging was used (Schlünzen et al., 1996). MECTM is nested via time-dependent boundary conditions interpolated from the trace gas concentrations of CTM2 (Niemeier, 1997; Müller et al., 2000). To determine the sensitivity of the model system METRAS/MECTM to model nesting, four model simulations were performed:

• Full nesting: nesting of meteorological and chemical variables.
• Nesting of meteorology, but chemical variables not nested.
• Meteorology not nested, but chemical variables nested.
• No nesting: neither meteorological nor chemical variables nested.

In general, the nesting of the meteorological part of METRAS into a larger scale model enhances the precision of the forecast of the meteorological variables. However, the forcing data adopted from the larger scale model must be of good quality, because the nested model METRAS can only partly correct deficiencies in the forcing data. In the considered case, the forecast of NOx and O3 concentrations depends more on a correct description of the meteorological boundary values than on the concentration fluxes across the lateral boundaries. The simulated NOx concentrations are rather insensitive to the nesting of trace gas concentrations, which is probably a result of the poor performance of the forcing data in the considered case. For the prediction of O3 concentrations, the nesting of meteorology has to be accompanied by the nesting of chemical variables. Thus, for both gases it is concluded that nesting meteorology is at least as relevant as nesting concentrations. Further sensitivity studies, as well as simulations for other periods, are needed to confirm these findings.
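A hit rate of the kind used in such model-measurement comparisons can be sketched as follows (Python; the modelled/measured values and the allowed deviation are invented for illustration, not taken from the TRACT campaign):

```python
import numpy as np

def hit_rate(model, obs, allowed_dev):
    """Fraction of model values within an allowed deviation of the
    observations; the allowed deviation stands in for the measurement
    reliability, so values inside it count as hits."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean(np.abs(model - obs) <= allowed_dev))

# Hypothetical aircraft-leg comparison: modelled vs measured O3 (ppb),
# with a 10 ppb allowed deviation as an assumed measurement accuracy.
modelled = [48, 52, 61, 70, 66, 55]
measured = [50, 50, 55, 85, 64, 57]
print(f"hit rate = {hit_rate(modelled, measured, 10.0):.2f}")  # 5 of 6 legs hit
```

Comparing such hit rates across the four nesting configurations gives a single summary number per variable for ranking the configurations.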

4.2.5 Sensitivity of UAP Forecasts to Meteorological Input and Resolution†

Numerical weather prediction (NWP) models are increasingly used as providers of meteorological data for urban air quality (UAQ) or urban air pollution (UAP) models. UAQ forecasts are used as a decision-making tool by local authorities. The near-surface wind and temperature fields are the main forcing of UAP models, both directly and indirectly, in defining the turbulence regime and parameters. Attempts to improve the input of temperature, wind and turbulence parameters in the boundary layer are evaluated in terms of their effect on the UAQ forecasts.

In the EU FP5 project FUMAPEX‡ (Integrated Systems for Forecasting Urban Meteorology, Air Pollution and Population Exposure, 2002-2005), six partners participated in a mesoscale meteorological model inter-comparison and validation exercise with their distinct model chains. Simulations were performed for 10 different pollution episodes in 4-5 target cities (Helsinki, Oslo, Bologna, Valencia, (Torino)) with varying models and partners. The episodes are characteristic of the regions: winter inversion-induced and spring particle episodes in Helsinki, Oslo and Bologna, and summer ozone episodes in Bologna and Valencia. The models were nested as follows: CEAM-

* This work was conducted within the BMBF funded tropospheric research programme TFS. † This work was conducted as part of the FUMAPEX FP5 EU project. ‡ http://fumapex.dmi.dk

RAMS (40, 13, 4.5, 1.5 km), DMI-HIRLAM (5, 1.4 km), DNMI-HIRLAM (10 km) and MM5 (9, 3, 1 km), DWD COSMO-EU (formerly LM) and ARPA COSMO-IT (7, 2.8, 1.1 km), FMI-HIRLAM (33, 22 km), and MM5 UH (Univ. of Hertfordshire) (81, 27, 9, 3, 1 km).

Table 9. Horizontal resolution

Torino | RAMS 4 km, 1 km
Valencia | RAMS 9 km, 4.5 km; COSMO-EU 7 km/35l, 2.8 km/45l, 1.1 km/45l; CAMx 12 km, 4 km
Oslo | COSMO-EU 7 km/35l, 2.8 km/45l, 1.1 km/45l
Helsinki | COSMO-EU 7 km/35l, 2.8 km/45l, 1.1 km/45l
Copenhagen | HIRLAM 15 km, 5 km, 1.4 km

The sensitivity of the UAQ forecasts to NWP model horizontal resolution (Table 9) is examined in the simulations of the FUMAPEX target cities Bologna and Torino (model system RAMS/FARM), Valencia (model systems RAMS/CAMx and COSMO-EU-trajectories), Oslo (model systems MM5/AirQUIS, COSMO-EU-trajectories and COSMO-EU/LPDM), Helsinki (model systems COSMO-EU-trajectories and COSMO-EU/LPDM) and Copenhagen (model system HIRLAM/DERMA) (Ødegaard et al., 2005). Furthermore, the sensitivity of the forecast concentrations to vertical resolution (Table 10), forecast length, improved parameterization and the introduction of an urban surface was investigated.

Table 10. Vertical resolution

Valencia:   COSMO-EU 7 km/35l, 2.8 km/45l, 1.1 km/45l
Oslo:       COSMO-EU 7 km/35l, 2.8 km/45l, 1.1 km/45l; MM5 17l, 26l
Helsinki:   COSMO-EU 7 km/35l, 2.8 km/45l, 1.1 km/45l

The influence of an increase in horizontal (and, in some models, vertical) resolution without adapted urban parameterisations at resolutions below 10 km is largest for the model results interpolated to the station locations. No height corrections were made. For increasing horizontal resolution, the model intercomparison revealed sharper gradients in the wind and temperature fields, higher maximum wind speeds, increased channelling of trajectories, and increased vertical velocity in steep terrain. The latter might be an artefact resulting from a model problem. The impact can be summarised for the different types of stations as follows:

For coastal stations (Helsinki / Valencia): Improvements are mainly due to a better land/sea description of the coastline and the associated soil type distribution, which affects the surface fluxes via its specific thermal and hydrological characteristics. Similar effects may also occur at inland stations, but with smaller impact. A substantial influence of varying surface properties and physiographic parameters with increased resolution near the coast was found for FMI-HIRLAM, DMI-HIRLAM and DWD COSMO-EU (Fay et al., 2004, 2005).

For mountainous areas (Oslo / Valencia / Bologna): Large impacts and often some improvement (e.g. for 2 m temperature and 10 m wind) at station locations are mainly due to the more detailed orography, which leads to an improved representation of topographic effects such as blocking, shading, flow around mountains, increased channelling in valleys, more clearly defined convergence/divergence lines, improved foehn simulations and mesoscale circulations. The changes/improvements are very distinct when looking at horizontal model fields of temperature, wind speed and direction.

In all other cases, the impact of an increased model resolution on the investigated parameters is small for surface parameters but may increase for vertical profiles of meteorological parameters or parameters showing some height integration (e.g. total cloud cover, planetary boundary layer height). Thus, the general results of previous studies about increasing model resolution (as reviewed e.g. in Mass et al., 2001) are confirmed for the mesoscale models participating in FUMAPEX and the considered urban regions and air pollution episodes. The results are described in detail in Fay et al. (2004, 2005, 2006) and Fay and Neunhäuserer (2006).

Increasing the vertical resolution did not reveal such clear results, since in most experiments horizontal and vertical resolution were changed simultaneously. Clean tests, in which only the vertical resolution is varied, are required; otherwise it is difficult to determine which change has the greater impact, and on which results. Increasing the vertical resolution of an air pollution model also increases the complexity of the input data, since dispersion depends heavily on the height of the emissions.

The forecast length was 48 hours for all cities except Torino, where it was 72 hours. The general NWP result that forecast error increases with forecast length is not valid for meteorological input to UAQIFS; instead, the error level depends on the meteorological processes to be described. Urban processes are difficult to reproduce with NWP simulations on both short and long time scales. Initial errors and spin-up problems are relevant at high resolution. The initial imbalances are due to differences in resolution between the host model analysis and the UAQ model.

To study the impact of PBL parameterizations on concentrations, a strong inversion case was studied for Oslo. All PBL schemes tested appear capable of handling it, i.e. in stable cases they guarantee small (near-zero) vertical exchange of heat and momentum. This finding is based only on MM5 results, so it cannot yet be generalised.

5. MODEL QUALITY INDICATORS The statistical analysis to evaluate model performance and to estimate uncertainties comprises a set of parameters giving information about the ability of the model to predict the tendency of observed values, errors in the simulation of average and peak observed values, and the type of errors (systematic or unsystematic).

5.1 Quality Indicators for Evaluating Meteorological Parameters Viel Ødegaard Norwegian Meteorological Institute, Blindern, Oslo, Norway

Evaluation of meteorological parameters as simulated by mesoscale models should be performed by comparison with observations. Annex D summarizes the most commonly used evaluation measures for meteorological parameters. The availability of observations is limited with respect to spatial resolution and the number of parameters. Moreover, an observation is not necessarily representative of the area surrounding the observational site. When running mesoscale models, one important aim is to obtain simulations that come closer to the observed extreme (peak) values than the smoother simulations from synoptic-scale NWP models. Statistical measures often favour smooth fields. These problems are discussed in detail below.

5.1.1 Observation Availability Mesoscale models output a considerable amount of information on meteorological parameters, including parameters that are not observed. Evaluation of models is inevitably limited to those parameters which are observed. Common parameters observed by ground observation networks are mean sea level pressure, 2 m temperature, 10 m wind, 2 m relative humidity and rainfall. From manual observation sites, cloud coverage, fog and snow depth/snow coverage are often available as well. Precipitation and wind can be derived from radars. Surface (sea surface) temperature, wind direction, cloud cover, cloud top temperature and snow cover are commonly derived from satellite data.

Observations can be based on the stationary ground network of synoptic stations or they can be supplied by remote sensing instruments, including those carried on satellites. The advantage of remotely sensed observations, such as radar or satellite measurements, is usually the higher spatial resolution compared to the ground network. Horizontal resolution down to 1 km is available from many data sources. These data sources will play an important role in future observation systems, and great effort is being made within the meteorological community to make optimum use of these data. For evaluation purposes, the main challenge lies in the interpretation of the monitored variables in terms of meteorological parameters, e.g. how to derive air temperature from radiation data, and in developing reliable methods for estimating precipitation amounts from radar reflections.

5.1.2 Observation Error The ground observing network supplies data which can be directly compared to model simulations. However, the ground network has a spatial resolution which is far lower than that of mesoscale models. In addition, the data are point wise and, in general, not representative of the larger surrounding area. The knowledge of the observation error (including measurement error and error due to the representativeness of the measurements) is important for data assimilation as well as for model evaluation.

To reduce the measurement error, data control routines are used to remove values outside the accepted range. The accepted range must cover all possible observation values and should preferably be derived from long time series of observations (several years) at the specific location. The accepted range should also be physically sound, e.g. relative humidity should be in the range 0-100%; the dew point temperature has a range related to the temperature itself and should therefore be expressed as dew point depression, since the dew point temperature is always equal to or lower than the ambient temperature.
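Such data control checks can be sketched as follows (a minimal illustration; the function names and thresholds are assumptions, not taken from the report):

```python
def qc_range_check(value, lower, upper):
    """Flag whether a value lies inside the accepted (climatological) range."""
    return lower <= value <= upper

def dewpoint_depression_ok(t, td):
    """Physical consistency: dew point must not exceed the ambient temperature,
    i.e. the dew point depression (t - td) must be non-negative."""
    return (t - td) >= 0.0
```

In practice the `lower`/`upper` bounds would come from multi-year station statistics rather than fixed constants.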

The observation error also includes the representativeness of the measurements. WMO rules for the set-up of meteorological sites ensure that the sites are representative of the surrounding area (a few kilometres) and of some time interval (10-minute averages). Local influences such as trees, buildings or urban growth, the latter being especially relevant for long-term analyses of data, may affect the measurements. As a result of shading or increased insolation, e.g. in steep terrain, temperature can have very local representativity. Within urban areas most measurements are of local representativity only: meteorological data are strongly influenced by buildings, while concentration data additionally depend on the emission distribution, which is of very local nature in urban areas.

5.1.3 Recommended Quality Indicators for Different Meteorological Parameters The quality indicators are described in Annex D. Special problems arising from different event timings in model results and measurements are discussed in Annex F.

5.1.3.1 Mean Sea Level Pressure Air quality is not directly related to mean sea level pressure (mslp). The parameter is important for establishing the correct weather regime and thus determines many other parameters to which air quality is sensitive. The quality of mslp forecasts is a general model quality indicator. RMSE and STDE are widely used and compared.

An overall measure for model performance is the hit rate H using an allowed deviation of e.g. 1.7 hPa (Table 24). Caution should be exercised with the BIAS of mslp, since mslp is not observed directly: the pressure is reduced from observation height to sea level using, among other variables, the temperature, and the model mslp is reduced similarly. One source of BIAS is that these reduction procedures are inconsistent. Likewise, values of H might be small because the error caused by the pressure reduction is large.

5.1.3.2 Wind Speed Mesoscale models tend to have problems capturing the full amplitude of the wind speed (ff). An overall measure for model performance is the hit rate H using e.g. an allowed deviation of 1 m s-1 (Annex D). However, air pollution episodes occur in low wind speed cases, which need to be evaluated separately. To evaluate the model's ability to capture low wind speeds, the observations of these cases should be treated separately in the statistical computations. BIAS, hit ratio HR and false alarm ratio FAR can be calculated for wind speeds in the interval 0 to 1.5 m s-1.
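The hit rate and the low-wind categorical scores can be sketched as follows (a minimal illustration; the exact definitions of H, HR and FAR are given in Annex D, and the function names here are assumptions):

```python
def hit_rate(obs, mod, allowed_dev):
    """Hit rate H (%): fraction of pairs whose absolute error is within the
    allowed deviation, e.g. 1 m/s for wind speed."""
    hits = sum(1 for o, m in zip(obs, mod) if abs(m - o) <= allowed_dev)
    return 100.0 * hits / len(obs)

def low_wind_hr_far(obs, mod, lo=0.0, hi=1.5):
    """HR and FAR for the categorical low-wind event (ff in [lo, hi] m/s)."""
    hits = misses = false_alarms = 0
    for o, m in zip(obs, mod):
        o_ev = lo <= o <= hi   # event observed
        m_ev = lo <= m <= hi   # event forecast
        if o_ev and m_ev:
            hits += 1
        elif o_ev:
            misses += 1
        elif m_ev:
            false_alarms += 1
    hr = hits / (hits + misses) if (hits + misses) else float("nan")
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else float("nan")
    return hr, far
```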

Ozone episodes occur frequently in high pressure situations with sea breeze circulation, which has a diurnal cycle. Evaluation of the diurnal cycle in the model can utilize BIAS calculated for each time of the day. When very strong winds are observed the absolute value of the error is often large. Normalized measures are recommended for evaluation of long time series, which include all observations, and when the extreme events are not in focus.

5.1.3.3 Wind Direction An overall measure for model performance for wind direction (dd) is the hit rate H, using e.g. an allowed deviation of DA=30° (see Annex D). This value was suggested by Cox et al. (1998) for the evaluation of weather forecast models. It is relatively large for air quality applications and should be reduced to DA=10° for those, if the averaging interval of the comparison data is several tens of minutes. When using BIAS for wind direction, the wind observations must be sorted by direction, e.g. into eight sectors. In areas with complex orography and surface properties the BIAS will tend to be pronounced for some wind directions, in particular those arising from surface properties that are not resolved in the applied model. In complex orography the wind cases may be distributed over different directions in observations and model data. The wind vector error is therefore a better measure; for this purpose the direction weighted wind error DIST can be used.

5.1.3.4 Temperature Air pollution episodes are not forced by the temperature itself but by the vertical (and horizontal) temperature gradient. Intense air pollution episodes occur when there is a temperature inversion, and the height of the inversion level is the crucial parameter. Evaluation of the inversion level requires temperature observations at several levels, e.g. from observation towers or radio soundings. These observations are limited and have insufficient coverage: the vertical resolution of radio sounding data is often too coarse, and there are few towers with sufficient height to capture inversion levels. The hit ratio (HR) and false alarm ratio (FAR) of near-surface inversions is a possible measure.

Air temperature at the 2 m level (T2m) is strongly tied to surface properties, and the model error can vary considerably in areas with complex surface properties. BIAS calculated separately for e.g. coastal, urban, mountainous or forested areas points directly to model deficiencies and allows specific corrections of model errors. Because of the strong influence of local conditions on the temperature, the standard deviation of error (STDE) often has the same magnitude at the end of the forecast as at the beginning, owing to the model's inability to capture the local conditions.

The sea breeze regime results in a diurnal cycle of the temperature which is as pronounced as the diurnal cycle of the wind; the diurnal cycle of the BIAS reveals model deficiencies. Normalized measures should be avoided for temperature, since an error in high temperature cases should count the same as an error in low temperature cases. This is accounted for when using the hit rate H with an allowed deviation of 2 K (Annex D).

5.1.3.5 Cloud Cover Air pollution episodes occur in both clear and cloudy conditions, but radiative processes like ozone production are active in direct sunlight. Cloud cover is a parameter with relatively large errors in NWP models as well as in measurements. RMSE, STDE and BIAS are often used to assess model prediction of cloud cover.

5.1.3.6 Humidity Air quality modelling is not very sensitive to atmospheric humidity. Evaluation should be performed on the dew point temperature, since this parameter is less sensitive to error than the temperature itself. An overall evaluation of model performance can be made with the hit rate (H), using an allowed deviation of 2 K (Annex D).

However, the dew point temperature is not a directly calculated parameter, because the saturation vapour pressure is a non-linear function of temperature. If the horizontal variation of humidity is large, the model error tends to be large. As for temperature, the dependence of the humidity and the humidity error on surface properties is strong, and important surface properties that influence the observations might be unresolved in the model. This is easily recognized when inspecting the BIAS at sites where surface properties vary.

5.1.3.7 Precipitation Air quality is highly dependent on the scavenging effect of precipitation. An adequate evaluation measure is the false alarm ratio FAR of rain/no rain (Annex D). If the model tends to have too smooth precipitation fields or the event is rare, FAR is close to 1; hence, focusing on FAR alone can yield very similar values for very different reasons. As an additional measure, HR or H can provide valuable insight into the model performance. The lower limit for categorizing rain must be high enough that the rainfall can effectively wash out pollutants and bind dust; 1 mm / 6 hours is perhaps a reasonable value. However, the spatial representativity of precipitation data is very small when using rain gauge data. Therefore, quantitative comparisons with these data should only be performed for at least monthly values (95% of measurements within a factor of 1.4; Annex C). As for wind speed, the error tends to be large when the observed amounts are large; therefore, normalized values should be used when calculating BIAS and RMSE for long time series.
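The rain/no-rain categorical scores at the suggested 1 mm / 6 h threshold can be sketched as follows (an illustration only; function names are assumptions, and the formal HR/FAR definitions are those of Annex D):

```python
def rain_contingency(obs_mm, mod_mm, threshold=1.0):
    """HR and FAR for the rain/no-rain event, with 'rain' defined as a
    6-hourly accumulation at or above the threshold (mm)."""
    hits = misses = false_alarms = 0
    for o, m in zip(obs_mm, mod_mm):
        o_rain, m_rain = o >= threshold, m >= threshold
        if o_rain and m_rain:
            hits += 1
        elif o_rain:
            misses += 1
        elif m_rain:
            false_alarms += 1
    hr = hits / (hits + misses) if (hits + misses) else float("nan")
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else float("nan")
    return hr, far
```

For a rare event, most forecast rain occurrences are false alarms, which is why FAR approaches 1 regardless of model quality.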

5.2 Quality Indicators for Air Quality Model Evaluation Ana Isabel Miranda, Alexandra Monteiro, Helena Martins, Carlos Borrego CESAM & Department of Environment and Planning, University of Aveiro, 3810-193 Aveiro, Portugal

This section presents a collection of quality indicators currently used in the evaluation of concentration values calculated with air quality models, together with examples of their application. Part of this compilation and analysis work was conducted within the European project Air4EU (Borrego et al., 2006). The methods for statistical model evaluation discussed in this section are the most commonly used statistical parameters, including those used by the EPA and those incorporated in the EU Framework Directive.

5.2.1 Statistical Parameters for Concentrations Discussion on the evaluation of air quality models and on the development of general evaluation methods has been carried out by many scientists. However, standard evaluation procedures and also performance standards still do not exist. Traditionally, model predictions are directly compared to observations, but this may cause misleading results because uncertainties in observations and model predictions arise from different sources (Chang and Hanna, 2004). Hanna et al. (1993) recommended a set of quantitative statistical performance measures for evaluating models, which have been widely used in many studies and have been adopted as a common European model evaluation framework (Olesen, 2001). Recently the hit rate (H) (Trukenmüller et al., 2004; VDI, 2005) has been added to these measures (Olesen, 2007). The main statistical parameters used as quality indicators are presented in Annex E.

The parameters defined in Annex E are not exhaustive; others can be defined and used according to the purpose and emphasis of the study. Multiple performance measures should be applied and considered in any model evaluation exercise, as each measure has advantages and disadvantages and there is not a single measure that is universally applicable to all conditions. Since, for most atmospheric pollutant concentrations, the distribution is close to lognormal, the Gaussian distribution based measures fractional BIAS (FB) and normalized mean square error (NMSE) may be overly influenced by infrequently occurring high observed and/or predicted concentrations, whereas the logarithmic measures geometric mean BIAS (MG) and geometric variance (VG) may provide a more balanced treatment of extreme high and low values. MG and VG may be overly influenced by extremely low values, near the instrument thresholds, and are undefined for zero values. Factor of two (FAC2) is the most robust measure as it is not overly influenced by outliers.
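The measures discussed above can be sketched as follows, using the commonly cited Chang-and-Hanna-style definitions (an assumption here, since the exact formulas are those of Annex E; the sign convention for FB, positive meaning underprediction, is also an assumption):

```python
import math

def hanna_measures(obs, mod):
    """FB, NMSE, MG, VG and FAC2 for paired observed/predicted concentrations.
    MG and VG require strictly positive values and are undefined for zeros."""
    n = len(obs)
    mo = sum(obs) / n                      # observed mean
    mp = sum(mod) / n                      # predicted mean
    fb = (mo - mp) / (0.5 * (mo + mp))     # fractional bias
    nmse = sum((o - p) ** 2 for o, p in zip(obs, mod)) / (n * mo * mp)
    mg = math.exp(sum(math.log(o) for o in obs) / n
                  - sum(math.log(p) for p in mod) / n)   # geometric mean bias
    vg = math.exp(sum((math.log(o) - math.log(p)) ** 2
                      for o, p in zip(obs, mod)) / n)    # geometric variance
    fac2 = sum(1 for o, p in zip(obs, mod) if 0.5 <= p / o <= 2.0) / n
    return fb, nmse, mg, vg, fac2
```

A perfect model yields FB = 0, NMSE = 0, MG = 1, VG = 1 and FAC2 = 1; the contrast between the linear (FB, NMSE) and logarithmic (MG, VG) measures is exactly the outlier-sensitivity trade-off described above.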

FB is a measure of mean relative BIAS; both FB and MG indicate only systematic errors, whereas NMSE is a measure of mean relative scatter and both NMSE and VG reflect systematic and unsystematic (random) errors. The correlation coefficient (r) reflects the linear relationship between two variables and is thus insensitive to both additive and multiplicative factors. The hit rate (H) is a measure independent of the error distribution and is the only error measure that can consider both absolute and relative measurement uncertainty, by selecting corresponding values for W and A (Annex E).

Elbir (2003) proposed a statistical analysis that includes the index of agreement (IOA), which measures the degree to which the observed deviations about the observed mean correspond, in magnitude and sign, to the predicted deviations about the predicted mean; it is sensitive both to differences between observed and predicted values and to proportionality changes. IOA varies from 0.0 (theoretical minimum) to 1.0 (perfect agreement between observed and predicted values) and gives the degree to which model predictions are error free.
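A sketch of the IOA following Willmott's widely used formulation (assumed here; the exact variant used by Elbir (2003) may differ in detail):

```python
def index_of_agreement(obs, mod):
    """Willmott's index of agreement: 1 - sum of squared errors divided by the
    'potential error' term built from deviations about the observed mean."""
    mo = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, mod))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, mod))
    return 1.0 - num / den if den else 1.0
```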

Schlünzen and Meyer (2007) and Trukenmüller et al. (2004) used hit rates (H_EU) based on the accuracy requirements defined in the EU daughter directives on air quality (Section 5.2.3) in their evaluations. In contrast to the EU directives, however, they kept the timing in their evaluation. When all values are within the allowed difference, H_EU is 100%.

5.2.2 EPA Quality Indicators EPA (1996) presents a compilation of a series of photochemical model simulations and validation exercises conducted within the United States. Validation focuses on the models' ability to predict the domain-wide peak ozone concentration and the concentrations at all locations with observed ozone concentrations above 60 ppb. These quality indicators are described in Table 11 (valid only for ozone), including the acceptable values, which are merely indicative because they were defined on the basis of the tests performed.

Table 11. EPA quality indicators for air quality model performance evaluation regarding ozone (P = predicted value, O = observed value, N = number of values)

Normalized accuracy of the maximum 1-hour concentration, unpaired in space and time:
  Au = (Pmax - Omax) / Omax                        acceptable values: +/- 15-20%

Mean normalized BIAS of all predicted and observed concentration pairs with Co > 60 ppb:
  MNB60 = (1/N) * sum_i (Pi - Oi) / Oi             acceptable values: +/- 5-15%

Mean normalized gross error of all predicted and observed concentration pairs with Co > 60 ppb:
  MNG60 = (1/N) * sum_i |Pi - Oi| / Oi             acceptable values: +/- 30-35%

This group of parameters complements the measures in Section 5.2.1, since it evaluates the model's capability to simulate peaks, which is particularly important for the evaluation of atmospheric pollution episodes, as described in the example in Section 5.2.4.1.
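The three EPA measures of Table 11 can be computed as follows (a minimal sketch; the function name and the cutoff handling are illustrative):

```python
def epa_ozone_measures(obs, mod, cutoff=60.0):
    """Au (peak accuracy, unpaired in space and time), and MNB60/MNG60 over
    all pairs whose observed concentration exceeds the cutoff (ppb)."""
    au = (max(mod) - max(obs)) / max(obs)
    pairs = [(o, p) for o, p in zip(obs, mod) if o > cutoff]
    n = len(pairs)
    mnb = sum((p - o) / o for o, p in pairs) / n        # signed (bias)
    mng = sum(abs(p - o) / o for o, p in pairs) / n     # unsigned (gross error)
    return au, mnb, mng
```

Because Au compares maxima regardless of where or when they occur, a model can score well on Au while misplacing the peak entirely, which is why MNB60 and MNG60 are used alongside it.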

5.2.3 EU Directives Modelling Quality Objectives The Air Quality Framework Directive (FWD) sets a general policy framework for ambient air quality. For this purpose, a set of long-term objectives for air quality is established by the legislation. Monitoring and modelling are identified as air quality management tools, and the uncertainty of the monitoring data and modelling results is one of the essential issues of the FWD. The FWD and Daughter Directives establish requirements for air quality modelling, including the definition of the Modelling Quality Objectives, as a measure of modelling results acceptability. In this context, the uncertainty for modelling and objective estimation is defined as the maximum deviation of the measured and calculated concentration levels, over the period for calculating the appropriate threshold, without taking into account the timing of the events. The quality objectives defined for each quality indicator are listed in Table 12.

Table 12. Modelling Quality objectives established by European Directives

SO2, NO2, NOx:  hourly mean 50-60%; daily mean 50%; annual mean 30%   (1999/30/EC)
PM10, Pb:       annual mean 50%                                       (1999/30/EC)
CO:             8-hour mean 50%                                       (2000/69/EC)
Benzene:        annual mean 50%                                       (2000/69/EC)
Ozone:          8-hour daily maximum 50%; 1-hour average 50%          (2002/3/EC)

Model quality measures described in the above EU Directives have been interpreted as the relative maximum error without timing (RME), which is the largest of all percentile (p) concentration differences, normalized by the respective measured value:

RME = max(Pp - Op) / Op   (1)

The question of timing is relevant for those target values defined as a number of allowed exceedances of a given threshold concentration. Moreover, the model quality objectives for the allowed uncertainty are given as a relative uncertainty, without clear guidance on how to calculate it. It can be assumed that the respective measured value is used to normalize the absolute difference between the measured and calculated concentration levels at the maximum deviation. Another possibility would be to take the maximum relative deviation, but this approach could shift the emphasis to the very low measured concentration ranges, where the largest relative deviations between observations and calculations usually occur; this could be the main reason for non-compliance with the uncertainty requirements for annual mean values. Further problems in interpreting the model uncertainty requirements arise because no distinction is made between short-term and long-term model application uncertainty analyses, the former being at an advantage owing to the number of paired-in-time results. An alternative model error measure was proposed by Stern and Flemming (2004), defining the quality indicator as the concentration difference at the percentile p corresponding to the allowed number of exceedances of the limit value, normalized by the observation (Relative Percentile Error - RPE):

RPE = (Pp - Op) / Op   (2)

This measure is more robust than the error defined in the EU Directive: it also evaluates the model performance in the high concentration ranges, but without the sensitivity to outliers. Since with eq. (2) the model uncertainty is examined in the concentration range of the limit values, there is also a direct link to the EU Directives.
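Both error measures can be sketched using sorted values as empirical percentiles (an interpretation only: the directives give no computational recipe, and the rank-based percentile indexing here is an assumption):

```python
def rme_rpe(obs, mod, p_index):
    """RME per eq. (1): largest relative difference over all rank-matched
    (sorted) percentile pairs. RPE per eq. (2): relative difference at one
    percentile, given here as a 0-based rank into the sorted series, which
    would correspond to the allowed number of exceedances of the limit value."""
    so, sm = sorted(obs), sorted(mod)
    rme = max(abs(m - o) / o for o, m in zip(so, sm))
    rpe = abs(sm[p_index] - so[p_index]) / so[p_index]
    return rme, rpe
```

RME is dominated by whichever percentile pair disagrees most (typically the extreme tail), while RPE pins the comparison to the regulatory percentile, which is the robustness argument made above.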

5.2.4 Application Examples

5.2.4.1 Application of Quality Indicators to Portugal In order to test and illustrate these model quality measures, a one-year simulation with the chemistry-transport MODEL 1 was used. MODEL 1 was applied in regional scale mode, covering Portugal with a resolution of 10 km for the entire year 2001 (Borrego et al., 2005). The model results were compared with measured data from 23 sites of the national air quality monitoring network according to the EU directives thresholds. Table 13 presents the average of the relative maximum error (RME) and of the relative error at the percentile corresponding to the allowed number of exceedances of the limit value threshold (RPE) for the background and all the monitoring sites, for each pollutant indicator defined by the EU Directives.

Table 13. Average of RME and RPE for the background and all the monitoring sites, for each pollutant indicator defined by the EU Directives

SO2, human health protection (hourly mean):   RME 79*,  percentile 99.73 (25th max 1h mean),       RPE 34*,  RPE 40**
SO2, human health protection (daily mean):    RME 66*,  percentile 99.18 (4th max 24h mean),       RPE 57*,  RPE 69**
SO2, vegetation protection:                   RME 33*,  annual mean,                               RPE 33*,  RPE 46**
SO2, vegetation protection:                   RME 44*,  winter mean,                               RPE 44*,  RPE 58**
NO2, human health protection (hourly mean):   RME 81*,  percentile 99.79 (19th max 1h mean),       RPE 39*,  RPE 48**
NO2, human health protection (annual mean):   RME 47*,  annual average,                            RPE 47*,  RPE 50**
O3, human health protection (8h running daily mean): RME 69*, percentile 93.15 (26th max 8h daily mean), RPE 16*, RPE 35**
O3, vegetation protection:                    RME 71*,  AOT40,                                     RPE 49*,  RPE 65**

All values in %. * considering only background monitoring stations; ** considering all monitoring stations

Concerning the hourly and daily average indicators, the analysis of the relative maximum error (RME) defined by the EU directives reveals that it is calculated at the highest measured value. In these cases, the assessment of the model uncertainty depends on the model performance in a concentration range with an extremely small probability. This also means that the model uncertainty assessment could be based on an outlier concentration caused by an error of the monitoring unit or an extreme weather situation. In fact, and in contrast to the RME (eq. 1), the error measure RPE (eq. 2) shows almost complete compliance with the legislated uncertainty requirement of 50% for all the pollutant indicators. These conclusions are in agreement with other model evaluation studies of similar or even higher complexity (Stern and Flemming, 2004; Hass et al., 2003). The analysis of Table 13 also reveals the problem of the heterogeneity of the observed concentration fields and the importance of selecting monitoring sites that are adequate and representative for the model resolution, since it is impossible for a grid model to simulate all stations with the required accuracy.

The same methodology for estimating the model quality measures of the EU Directives should be applied for local scale applications. Restrictions will appear for models whose feasible temporal application is limited to several days, in combination with pollutants whose averaging period is one year, such as PM, Pb and benzene.

Comparing the EPA performance measures of two different air quality models (referred to as MODEL 1 and MODEL 2) reveals differences in performance. The models were applied to an ozone episode that occurred in Portugal from 27 to 31 May 2001. During this period, exceedances of the O3 limit value (180 µg m-3) were registered at 5 air quality monitoring stations in Portugal, three of them considered background stations and two located in industrial areas. Table 14 presents the calculated EPA quality indicators.

Table 14. EPA quality indicators obtained for MODEL 1 and MODEL 2 simulation

            Average for all stations    Average for background stations
Parameter   Model 1     Model 2         Model 1     Model 2
Au          18.0        46.6            10.1        26.5
MNB60       -0.8        0.1             -1.1        0.1
MNE60       0.0         0.1             0.0         0.1

The results show that, while the quality indicator values are acceptable for Model 1 (Table 14), Au is not acceptable for Model 2.

5.2.4.2 Application of EU Quality Indicators for the Southern North Sea Region Schlünzen and Meyer (2007) investigated the impact of the meteorological situation and chemistry on dry deposition to the southern North Sea using the high-resolution model system M-SYS, which employs METRAS (meteorology) and MECTM (chemistry). For the evaluation of the METRAS meteorology results, the hit rates (H) and correlation coefficients (r) were calculated (Table 15; values from Schlünzen and Meyer, 2007). Values for the desired accuracy criteria DA were taken as given in Table 26 (Annex D) for the meteorological variables. SK gives the results of Schlünzen and Katzfey (2003) for comparison. Temperature and dew point temperature as well as wind direction are well simulated, while the agreement for wind speed is poorer (Table 15). The hit rates for wind speed ff are between 27% and 39%. These values are in the same range as the hit rates found by Cox et al. (1998), who obtained hit rates for ff of 22-41% for 12-26 h forecasts.

For the same case, Schlünzen and Meyer (2007) calculated hit rates H_EU for the concentrations using the accuracy requirements defined in the EU daughter directives on air quality (European Communities, 1999, 2002; Table 12). Based on the directives, the maximum deviation of the measured and calculated concentration levels must not exceed 50-60% of the hourly limit value for sulphur dioxide (SO2: hourly limit value 350 μg m-3) and nitrogen dioxide (NO2: hourly limit value 200 μg m-3), and 50% of the hourly threshold value for ozone (O3: hourly information threshold value 180 μg m-3). Hit rates based on 15% of the maximum measured values were additionally calculated to obtain more detailed information on the error spread (Table 16). H_EU is 100% except for NO (H_EU = 99%) and O3 (H_EU = 94.8%). For the evaluation the timing was kept. H_15 values are in the same range as found for meteorological data. BIAS values range from about 10% of the measured mean values (SO2, ozone) to 50% (NH3). NH3 maximum values show large differences, probably a result of too low emissions or missing transport from the NH3-emitting regions of western Germany bordering the eastern part of the model domain.

Table 15. Correlation coefficients and hit rates (in %) for the variables T (temperature), Td (dew point temperature), ff (wind speed), dd (wind direction) based on routine meteorological data (RD) and two specially equipped field sites (WAO, MPN; de Leeuw et al., 2003)

          Correlation coefficient r            Hit rate H
Variable  RD      WAO     MPN     DA           SK     RD     WAO    MPN
T         0.91    0.95    0.89    2 ºC         86     77     89     73
Td        0.85    0.91    0.94    2 ºC         76     87     94     79
ff        0.26    0.21    0.64    1 m s-1      39     34     27     58
dd        0.74    0.78    0.80    30º          67     74     64     63

Table 16. Number of measurements (No), mean based on measurements (MeMean in μg m-3), BIAS of measured and simulated data (BIAS in μg m-3), maximum value based on measurements (MeMax in μg m-3) and model results (MoMax in μg m-3), hit rate (H_EU in %) with desired accuracy DA_EU (in μg m-3) defined as 50% of threshold values as given by EU directives, hit rate (H_15 in %) with desired accuracy DA_15 (in μg m-3) defined as 15% of MeMax

       No     MeMean   BIAS   MeMax   MoMax   DA_EU   H_EU    DA_15   H_15
NO     3531   8.0      3.1    247     196     100     99      37      91
NO2    4697   21.8     5.4    100     128     100     100     15      64
NH3    561    6.8      3.3    81      22      100     100     12      89
SO2    5000   8.7      0.9    133     108     175     100     20      96
O3     4777   51       -5.5   185     194     90      94.8    28      38
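As an illustration of how the quantities in Table 16 are formed, the following sketch computes BIAS, H_EU (desired accuracy of 50% of the EU hourly threshold) and H_15 (15% of the measured maximum) for a few invented ozone values; only the 180 μg m-3 threshold comes from the directives:

```python
import numpy as np

# Hourly limit/information threshold values (ug m-3) from the EU daughter directives
EU_HOURLY_LIMIT = {"SO2": 350.0, "NO2": 200.0, "O3": 180.0}

def bias(model, obs):
    """Mean difference, model minus observation."""
    return float(np.mean(np.asarray(model) - np.asarray(obs)))

def hit_rate(model, obs, da):
    """Percentage of pairs with |model - obs| <= desired accuracy DA."""
    return 100.0 * np.mean(np.abs(np.asarray(model) - np.asarray(obs)) <= da)

obs = np.array([40.0, 120.0, 180.0, 90.0])   # invented hourly O3 observations
mod = np.array([55.0, 100.0, 150.0, 95.0])   # invented model counterparts

da_eu = 0.5 * EU_HOURLY_LIMIT["O3"]  # DA_EU: 50% of the 180 ug m-3 threshold
da_15 = 0.15 * obs.max()             # DA_15: 15% of the measured maximum
print(bias(mod, obs))                # -7.5 ug m-3
print(hit_rate(mod, obs, da_eu))     # 100.0
print(hit_rate(mod, obs, da_15))     # 75.0
```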

6. VALIDATION DATASETS
John Douros(1), Kristina Labancz(2), Nicolas Moussiopoulos(1)
(1) Aristotle University, Thessaloniki, Greece
(2) Hungarian Meteorological Service, Department for Atmospheric Environment, Budapest, Hungary

The issue of model evaluation is of particular importance both for research model applications and for air quality management applications. Discussions of model performance often revolve around the definitions of several core concepts such as “validation” or “verification”. Model evaluation consists of a number of elements, including the usefulness and reliability of the model and its results. For a model to be useful, it must reflect the behaviour of the real-world atmospheric processes being simulated with a pre-defined level of accuracy that is acceptable for the intended purpose of use. A model is regarded as reliable if the implementation of the calculations involved reproduces the conceptual model of the system to be simulated. Model evaluation is necessary in order to identify the strengths and weaknesses of a model and to assess the efficiency of its use in providing realistic results for air quality management and assessment. Therefore, before any model is applied for a particular research or management purpose, it must be evaluated so that its suitability for the specific application is ensured. While validation incorporates the more traditional elements of model testing, e.g. comparisons with analytic solutions or more qualitative evaluations of the model behaviour, the evaluation process involves as a necessary step the comparison of model results with observations using quantitative measures. The measured values intended to be used for model evaluation are referred to as “validation datasets” when they are produced and used specifically within a foreseen model evaluation procedure.

6.1 Model Validation Datasets and Selection Criteria
Model validation datasets are produced within model development laboratories (e.g. in wind tunnels), within field experiments dedicated to producing validation datasets that check the performance of models for a specific model application, or they are derived from monitoring datasets. Several issues arise in selecting the appropriate model validation datasets for particular evaluation purposes. For each specific case the required data completeness (suitable size, temporal and spatial coverage, a minimum number of data gaps, and consideration of any compilation procedures that may have caused data to be eliminated), quality and accuracy have to be specified. These requirements vary according to the intended model application, as well as with the model properties, such as model scale and parametrizations.

Although some requirements may differ depending on the application, the requirement for Quality Assurance and Quality Control (QA/QC) of the produced datasets is applicable in all cases. This is because all validation datasets should satisfy some general quality criteria in order for the model evaluation exercise to be realistic and to be useful to the modelling community. QA/QC assures that the relevant measurements performed meet some pre-defined standards of quality with a stated level of confidence. It should be emphasised that the function of QA/QC is not to achieve the highest possible data quality. Rather, it is a set of activities enabling the measurements to comply with the specific Data Quality Objectives for the particular monitoring programme. The main parts of the Quality System are:

• Quality assurance (QA): the management of the activities within the data acquisition, and the setting of overall objectives and criteria.
• Quality control (QC): the procedures of the day-by-day operations and data validation.
• Quality assessment: the external validation of the implementation of the quality system.

Assuming that these procedures are followed during the validation dataset generation, the data reported for a particular monitoring station will have a stated level of accuracy and precision, a specified area of representativeness, and a sufficient time coverage, as defined by the Data Quality Objectives (DQO). The EU Air Quality Directives* specify DQO, and certain data quality related requirements. DQO requirements are given for:

* http://www.europa.eu.int/comm/environment/air/ambient.htm

• Minimum accuracy and data capture for monitoring, as well as for modelled data and objective estimation.
• Location of monitoring stations.
• Minimum number of stations.
• Reference monitoring methods.

Therefore, before the generation of a validation dataset, DQO for accuracy, precision, data capture and time coverage should be defined, which must comply with the EU AQ Directives and with the evaluation objectives. Site selection criteria for the location of the monitoring stations must be established taking into account the nature of the particular campaign. Moreover, a documented calibration programme and a data validation procedure complying with Decision 97/101/EC (EC, 1997) should be followed, in order to ensure that the criteria for quality are met.
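A data capture check of the kind required by such DQO can be sketched as follows. The 90% objective and the hourly series are illustrative assumptions, not values taken from the Directives:

```python
import numpy as np

def data_capture(values, expected_count):
    """Percentage of valid (non-missing) values relative to the expected count."""
    valid = int(np.count_nonzero(~np.isnan(values)))
    return 100.0 * valid / expected_count

# Illustrative: one week of hourly measurements with a 15-hour gap (NaN = missing)
series = np.full(168, 20.0)
series[10:25] = np.nan

capture = data_capture(series, expected_count=168)
dqo_met = capture >= 90.0   # assumed 90% data capture objective
print(round(capture, 1), dqo_met)   # 91.1 True
```

In practice the same check would be run per pollutant and per averaging period, against the data capture percentage stated in the applicable Directive.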

Although the definition of “quality” and “accuracy” of the validation dataset may slightly vary depending on the specific application purpose and user group, in general accuracy refers to the closeness of results of observations to the true values (or the values accepted as being true). This implies that observations of most spatial phenomena are usually only considered to be estimates of the true value. Quality is a wider term that includes accuracy, and can simply be defined as the fitness for purpose of a specific dataset. Data that are appropriate for the purpose of evaluating a model for one application may not be suitable for the evaluation of the model for another application. Therefore, the definition for acceptable data quality varies depending on the scale, accuracy, and extent of the dataset, as well as the quality of other datasets to be used. Five components are most commonly associated with data quality definitions, which are:

• Lineage.
• Positional accuracy.
• Numerical accuracy.
• Logical consistency (data should be presented in a consistent and unambiguous form).
• Completeness.

The difference between observed and true (or accepted as being true) values indicates the accuracy of the observations. All model validation datasets should be adequately documented, and their availability and relevant contact details should be provided in the documentation. It is also important to include any information relevant to the error status of the data (such as statistical error indices), so that users can form their own judgement of the quality and reliability of the data, and thus make a scientifically-based decision on the suitability of the data for the intended application. Model validation datasets independent of those used for calibration should be employed for model evaluation. Every effort should be made to evaluate the model across the range of conditions for which it will be run. Model evaluation and analysis of model errors must be undertaken for the key variables required from the modelling study. It should be noted that within a project or team, a cycle of information exchange and feedback needs to be developed among model developers, technology developers, and technology users, in spite of the difficulty in sharing proprietary information.

6.2 Mesoscale Model Validation Datasets and COST728
Several European projects and working groups have produced model validation datasets as well as model validation databases that are available upon request to other project members, academic institutions or authoritative bodies for model assessment and model intercomparison studies. The present action, COST728, provides a framework for the use of model validation datasets. By reference to existing model results, the strengths and weaknesses of current approaches and common successes or failures (if any) will be established within COST728. WG4 in particular aims at developing tools and methodologies that can be applied to evaluate mesoscale meteorological models for air pollution and dispersion applications. Within the scope of this aim, a database (meta-database) has been compiled (Figure 1) that includes information on available, well-documented air quality and meteorological datasets produced or used in earlier projects (http://pandora.meng.auth.gr/mqat/). The meta-database is an ongoing activity; it is envisaged as a valuable data source for model users and demonstrates the wide range of projects related to, or meant to support, model evaluation. Its sustainable operation and frequent updating are therefore of major importance for the modelling community world-wide. The model validation datasets established within COST728 WG4 (from, e.g. FUMAPEX, CITY-DELTA, ESCOMPTE, MESOCOM, VALIUM) and elsewhere (e.g. the EUMetNet Short Range Numerical Weather Prediction programme, SRNWP) will then be reviewed to highlight, amongst others, which model parametrizations are thought to be most critical in each case. Some examples of other projects and initiatives on model validation datasets are summarized in Annex I. The table presents the datasets produced and/or used particularly for model evaluation purposes that are currently uploaded into the meta-database.

Similar objectives as for the COST728 meta-database are also behind JRC’s DAM meta-database†. The main purpose of JRC’s DAM meta-database is to facilitate access to valuable information regarding available datasets for any model developer or user who intends to evaluate his/her modelling tool. DAM is intended to be an interface between modellers and the information available through existing web sites or contact points. Although DAM cannot be considered capable of answering every possible type of request from model users or developers, it aims at being as complete as possible; continuous updating is as important here as it is for the COST728 meta-database.

6.3 Other Efforts for the Harmonisation and Standardisation of Validation Datasets
A notable effort to harmonise model evaluation studies using the same evaluation datasets has been undertaken by the Harmonisation initiative‡. In 1991, a European initiative was launched to enhance cooperation and standardisation of atmospheric dispersion models for regulatory purposes. The initiative responded to the increased need for several new dispersion models with advanced parametrizations to be developed in a well-organized manner and to emerge as practical, generally accepted tools for policy and decision makers. In this context, a series of workshops has been organised within the initiative during the last 10 years to promote the use of new-generation models within atmospheric dispersion modelling, and to improve the "modelling culture". A central activity of the "Harmonisation" initiative, closely related to the conferences, is the development and distribution of the so-called Model Validation Kit. The Model Validation Kit is intended to be used for the evaluation of atmospheric dispersion models. It is a collection of four field datasets together with software suitable for incorporating the data in model evaluation. The Kit is recommended as a practical tool serving as a common frame of reference for model performance evaluation. It addresses the classic problem of dispersion from a single point source. The package was updated to Version 2.0 in October 2005, and an extensive set of web pages, from which the Kit can also be downloaded, provides details on its contents§. An American Society for Testing and Materials (ASTM)** standard guide on model evaluation, published in November 2000, represents an alternative approach to that of the Model Validation Kit.
However, results obtained with the Model Validation Kit should be interpreted with care, because the Kit does not explicitly address the stochastic nature of observed concentrations. The ASTM standard guide contains detailed discussions of the framework and procedures for model evaluation. The framework is general in the sense that it does not refer to a certain model type or to a specific concentration variable. However, an Annex to the guide specifies an example in which the framework is used. This example deals with the classic problem of a plume emitted from an isolated point source.

Another initiative, the European Network of Excellence on Atmospheric Composition Change, ACCENT, operating since March 2004, aims to promote a common European strategy for research on atmospheric composition change, to facilitate this research and to optimise two-way interaction with policy-makers and the general public. Within ACCENT, the Transport and Transformation of Pollutants (T&TP) project aims at bringing together the European community of researchers of atmospheric sciences to identify current problems of understanding and promoting

† http://rem.jrc.cec.eu.int/dam/ ‡ http://www.harmo.org/default.asp § http://www.harmo.org ** http://www.harmo.org/astm

research work and to improve the performance of models for analysis and forecasting on global, regional and local scales. In this context, the refinement of methods for assessing urban and local scale air pollution levels is of particular importance. In particular, methods are required for source apportionment, for ensuring compliance to the air quality legislation and for the analysis of air pollution episodes. Towards this aim, the generation of model validation datasets on sites with different characteristics is required, in order to improve the quantitative level of confidence for model predictions.

The case of the MITRAS model provides an example where the model evaluation data were generated within an academic group to assess the performance of a model developed by the same community††. Based on the mesoscale model METRAS, the microscale model MITRAS has been developed by Schlünzen et al. (2003) in a consortium of four partners within the tropospheric research programme funded by BMBF. MITRAS is a community model. The Meteorological Institute, Centre for Ocean and Climate Research, University of Hamburg, Germany, coordinated the model development, implemented the modules produced within the consortium into MITRAS, and was also responsible for model validation. The datasets specifically produced for model validation were obtained from wind tunnel (CEDVAL) experiments. These controlled CEDVAL wind tunnel experiments were tailored for the evaluation of microscale models. This approach cannot be transferred to mesoscale models, since the scaling does not allow the use of wind tunnel data, and field experiments cannot be performed with controlled external boundary conditions.

The GAIM‡‡ Task Force was formed as an overarching framework activity of the International Geosphere-Biosphere Programme (IGBP) for coordinating and promoting the different multi-disciplinary research components that can be combined into an integrated view of the Earth System, using both data and models as tools. In order to assess the validity of Earth system models, it is critical to understand the sensitivity of the system to each of the input data, and to conduct sensitivity analyses of dynamic vegetation models, ocean carbon cycle models, GCMs and hydrologic models, as well as of simple Earth system models, with respect to the various input climate and ecological data. GAIM will also address some of the more theoretical issues involved in complex model development, coupling and evaluation. In particular, for the evaluation procedure, the minimum necessary resolution for model validation datasets will be determined and inverse methods for applying model validation datasets will be established.

Another example is the Working Group on iodine, established in the framework of the EMRAS programme§§, which continues some of the more traditional work of previous international programmes (VAMP – Validation of Environmental Model Predictions, BIOMOVS – BIOspheric Model Validation Study, BIOMASS – BIOsphere Modelling and ASSessment) on increasing confidence in methods and models for the assessment of radiation exposure related to environmental releases, for the purposes of radiation protection of the public and the environment. The preparation of appropriate model validation datasets is therefore an essential component of this work. In this case, the validation dataset formed a comprehensive database that took into consideration the different model variables, the temporal and spatial resolution of the simulations, and other elements needed to justify the analysis of the model evaluation results. In particular, it was suggested to prepare a large database including the air concentration of iodine over Warsaw, meteorological data (such as precipitation, wind trajectories and temperature), the soil concentration of iodine for several locations, time periods and durations, and the iodine concentration in grass, specified leafy vegetables (lettuce) and milk. Epidemiological data were also monitored in this case, such as the thyroid burden of inhabitants of the affected district; information about age, sex, date of thyroid blocking, diet (milk consumption) and physical activity was associated with each measurement by interview.

†† http://www.mi.uni-hamburg.de/mitras ‡‡ http://gaim.unh.edu/Structure/GAIM_Plan/index.html §§ http://www-ns.iaea.org/projects/emras/emras-background.htm

The Biospheric Model Validation Study – Phase II (BIOMOVS II***), previously mentioned, was an international cooperative programme that examined the accuracy of predictions of environmental assessment models. Model evaluation was based on calculations made by individual participants for ten test case scenarios focusing on both short- and long-term releases of radioactivity from facilities such as power reactors. Model predictions were compared with each other and, where possible, with independent field observations, and reasons were sought for any observed differences. Confidence intervals on predictions and differences between predictions and observations were often less than a factor of 10, although there was much variability among models and scenarios. Model performance depended not only on the formulation and parameter values of the model, but also on the experience and assumptions of the user. The study demonstrated the need for harmonised validation datasets to better explain and justify model structure and application and to assess sources of uncertainty. A key recommendation was that assessments should not be undertaken in isolation by one individual modeller using one model.

The VAMP (Validation of Environmental Model Predictions) programme which ran from 1988 to 1994, aimed at examining the widespread distribution of radionuclides in the environment after the Chernobyl accident. The results of the measuring and monitoring campaigns performed in this context established a basis for evaluating the predictions of mathematical models. The VAMP programme proved to be very successful and involved over 100 scientists from many different countries. The exercises in VAMP provided a unique opportunity for testing the accuracy of model predictions, following a common evaluation methodology. In some cases, existing models and transfer coefficients were found to give a reasonable representation of the transfer of radionuclides through the environment. In other cases, previous generic assumptions regarding, for example, dietary intakes and food sources, proved to be inappropriate for application to a particular environment. In the model testing studies, there was a general trend towards over-prediction. One of the most likely reasons for this is associated with the use to which these models are normally put, that is, they are most commonly used for the purpose of comparing radiation doses received by critical population groups from releases of radionuclides from operating practices with dose limits. In this application, there is a need to be sure that doses do not exceed the dose limit and so the assumptions and parameter values in the models tend to be selected in a way which will make underestimation unlikely.

The NAME Data Management activities are being coordinated by the NCAR Earth Observing Laboratory (EOL)†††. EOL has established and maintains the NAME Project‡‡‡ including the data management pages and final project archive. The NAMAP-2 Project§§§ has recently been initiated to bring together a variety of modelling comparisons and evaluation efforts. Designated modelling group participants are being identified, as are experimental details such as standard protocols, formats, and model validation datasets. This project will take advantage of the large amount of special research and verification datasets collected during the 2004 NAME EOP to improve model development. The NAME data group in EOL will facilitate archiving and coordination of NAMAP-2 model and validation datasets.

SATURN**** (a project of Eurotrac-2, 1997-2002) comprised different experimental activities, such as local scale field experiments, urban scale field experiments and large urban scale field campaigns. Comprehensive model validation datasets resulted from these field campaigns, which were planned in urban areas representative of conditions prevailing in different parts of Europe. Each of these campaigns satisfied the following criteria: the scientific aims were well-defined (for instance, the analysis of specific physical or chemical processes, or checking a suggested working hypothesis); the field measurement programme included all quantities necessary to address the given scientific problem; and the temporal and spatial resolution was sufficient for establishing a dataset applicable for model validation purposes. The evaluated field

*** http://info.casaccia.enea.it/evanet-hydra/Cadarache/VAMP%20BIOMOVS%20assessment/VAMP-BIOMOVS.htm ††† formerly the Field Operations and Data Management Group of UCAR’s Joint Office for Science Support ‡‡‡ http://www.joss.ucar.edu/name §§§ http://www.joss.ucar.edu/name/namap2/ **** http://www.gsf.de/eurotrac/sp-sat-f.htm , http://aix.meng.auth.gr/lhtee/saturn.htm

campaign results are summarised in the COST728 WG4 dataset metadata inventory and accessible via http://www.cost728.org/.

Some further examples of field datasets selected and used for model validation and evaluation purposes in Europe and the U.S. †††† are presented in Annex I, along with data information and availability details.

†††† http://camp.gmu.edu/FieldDatasetInventory.htm

7. MODEL VALIDATION AND EVALUATION EXERCISES

Several validation and evaluation exercises are reported in the model meta-database*. More details are given in the model validation and evaluation exercises database (Annex I). A comprehensive collation of exercises has not been possible, and the focus has been on selecting from those undertaken by the groups participating in COST728. Even for the COST728 groups, however, this summary is not complete, in the sense that not all model validation exercises conducted by the groups are listed. The validation exercises reported nevertheless serve to reflect the spectrum of evaluation attempts performed by some key groups in Europe. They comprise different approaches to demonstrate the usefulness of a model for answering particular scientific questions. The approaches that were used for the different models are listed in Annex M.

7.1 Mesoscale Meteorological Model Validation and Evaluation Studies
Volker Matthias
GKSS Research Centre, Geesthacht, Germany

Mesoscale meteorological models are particularly suited for analysing complex meteorology and air pollution situations. One important application has been the analysis of air pollution episodes. As part of such applications, models have been evaluated with a range of methods. We distinguish between four types of model validation:
i) Comparison to analytic solutions.
ii) Comparison to reference datasets.
iii) Model intercomparisons.
iv) Additional efforts (case studies).

Of the 15 mesoscale meteorological models reported in the database, MM5 is operated by three different groups and therefore appears three times. Seven mesoscale meteorological models were compared against analytic solutions; mostly the wind field was investigated for flows over mountainous terrain. Eight models were compared to reference datasets. These include standard meteorological measurements such as wind, temperature and humidity at ground level, as well as vertical profiles from radiosoundings and radar. In particular, the operational models from the weather services, such as COSMO-EU, COSMO-DE and GME, undergo daily comparisons with measurements. Additionally, well-documented and quality-tested datasets from extensive field campaigns were used to evaluate the models. This is documented, e.g. for METRAS, using data from TRACT, BERLIOZ and FLUMOB (see Section 6) and from experiments in the Arctic.

Model intercomparison studies are reported for 10 models. The operational weather forecast models are very extensively tested by comparing their results to daily NWP forecasts from other European weather services. Other evaluation exercises include the EU FP5 project FUMAPEX, COST715 and MESOCOM. MESOCOM (Thunis et al., 2003) included seven models and used idealised cases, which provide information on the variability of the model results and on possible reasons for the differences, but not on the ‘correctness’ or accuracy of the model results. In FUMAPEX, measurements were also taken into account: seven models, most of them operational NWP models, were applied to pollution episodes in different European cities. For more information on the FUMAPEX results see also Sections 4.2.5, 7.1.6 and 7.1.7 of this overview report.

By far the largest share of the activities falls under what are here called ‘additional validation and evaluation efforts’, reported for 11 models. Almost every publication of model results includes at least a paragraph on the evaluation of the model by comparing the results to measurements. However, these comparisons have not always been conducted very systematically, and variables other than temperature, wind and humidity come into play. Comparisons of cloud liquid water content were made for GME, COSMO-EU, MM5, NHHIRLAM and UM, for example. Boundary layer height was compared to, among others, lidar and radar measurements for

* http://www.mi.uni-hamburg.de/costmodinv

COSMO-EU, MM5 and UM. METRAS was also tested in regions of the Earth other than the mid-latitudes, namely the sub-tropics and the Arctic.

This section describes the results of model evaluation studies conducted by COST728 partners. Several meteorological and air quality models have been evaluated for modelling episodes. These models include CALMET, COSMO-IT, MEMO, MM5, MARS/MUSE, MM5/AirQUIS, M-SYS and MM5/CMAQ. Table 17 shows the model evaluations employed by model users within COST728. These models are not in themselves representative of the whole field but they do provide real examples of how mesoscale models are being evaluated for studying episodic conditions. Examples for sensitivity study results can be found in Section 4.2.

Table 17. Summary of model evaluation studies undertaken recently by COST728 members

Model name / Analytic solutions / Comparison with observations / Model intercomparison / Statistical analysis / Sensitivity test on model setting

ADREA X X X
ALADIN/A X
ALADIN/PL X
ARPS X X X
COSMO-CH X X
ARPS X X X
BOLCHEM X X
CALMET X X X
COSMO-EU X X X
GESIMA X
GME X X X X
Hirlam X
COSMO-IT X X X X
COSMO-DE X X X
COSMO-EU_MH X X
MARS/MUSE X X X
M-SYS X X
MC2-AQ X X
MCCM
MEMO X X X X
MERCURE X X X
Meso-NH X X X
METRAS X X X X X
MM5 (GR) / AirQUIS / CMAQ (UH) X X X X
NHHIRLAM X X
RAMS X X
RCG X X
SAIMM X
UM X X
WRF/Chem X X

The following studies also indicate that a full model evaluation exercise would involve the following elements:

• Comparison of model results with observations.
• Intercomparison of models for the same cases.
• Statistical quantification of the model performance.
• Sensitivity analysis of the outputs to changes in input parameters and model formulations or scheme options.
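The last element, sensitivity analysis, can be illustrated with a simple one-at-a-time perturbation of input parameters. The `toy_model` function below is a hypothetical stand-in for an actual mesoscale model run, and the input values are purely illustrative:

```python
import numpy as np

def toy_model(params):
    """Hypothetical stand-in for a mesoscale model run, returning one scalar output."""
    u, z0 = params["u_geo"], params["z0"]
    return u * np.log(10.0 / z0)   # a crude 10 m wind proxy, for illustration only

base = {"u_geo": 8.0, "z0": 0.1}   # illustrative inputs: wind forcing, roughness length
reference = toy_model(base)

# One-at-a-time sensitivity: perturb each input by +10% and record the output change
for name in base:
    perturbed = dict(base)
    perturbed[name] *= 1.10
    change = 100.0 * (toy_model(perturbed) - reference) / reference
    print(f"{name}: {change:+.1f}% output change")
```

In a real exercise, the perturbed quantity would be a physical input field or a scheme option, and the recorded output would be the meteorological variable of interest for the episode being studied.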

Not all model evaluation examples given in the following sections have employed all of the above methods. This in itself suggests that more thorough studies of mesoscale model evaluation for meteorological and air pollution applications are required.

7.1.1 Use of the European Tracer Experiment (ETEX) for Model Evaluations
Christer Persson
SMHI, Norrköping, Sweden

Within ETEX, two successful atmospheric dispersion experiments on the meso- and European scale were carried out in October and November 1994: ETEX-1 and ETEX-2. In each experiment an inert, non-water-soluble tracer was released to the atmosphere from a site in western France. Each release lasted 12 hours, and air samples were taken during the 72 hours after the release at 168 stations in 17 European countries. Upper-air tracer measurements were also made from three aircraft.

During ETEX-1 the tracer plume was emitted into an unstable atmosphere and first transported north-eastwards across Europe; during the second and third days it reached a deformation zone with weak winds and a rather complex atmospheric transport situation. After 2.5 days the tracer plume was stretched in a broad band from Norway south to Bulgaria. In ETEX-2 the tracer emission took place in stable warm air and a very strong south-westerly wind (Nodop et al., 1998).

Although old and based on a passive tracer, these experiments can still be very valuable for model evaluations on meso- and European scales. ETEX-1 and -2 are two of the few existing sets of information related to controlled long-range dispersion of tracers. These datasets are today easily available within the ENSEMBLE project for nuclear emergency preparedness at JRC, Ispra†. The ENSEMBLE software can be used in a convenient way for model evaluations based on these datasets, including some statistical measures. Meteorological data needed for dispersion model calculations, in the form of gridded 3-D analyses/forecasts every 3 h as well as observations, can be obtained from ECMWF (European Centre for Medium-Range Weather Forecasts‡) and also, sometimes at higher resolution than from ECMWF, from several European National Meteorological Services. ENSEMBLE is an RTD FP5 project supported by the European Commission; it is described, for example, in Galmarini et al. (2004) and is based on work within the ENSEMBLE Consortium§. Evaluations of user-oriented measures of effectiveness for transport and dispersion model predictions have been performed by Warner et al. (2004, 2005). This work is based on the ETEX experiments, and the quite extensive evaluation methods are described in those papers.

7.1.2 Meteorological Simulations over the Greater Athens Area Using MM5 and MEMO Mesoscale Models Nicolas Moussiopoulos(1), John Douros(1), George Tsegas(1), Evangelia Fragkou(1), Anabela Carvalho(2), Carlos Borrego(2) (1) Laboratory of Heat Transfer and Environmental Engineering, Aristotle University Thessaloniki, Thessaloniki, Greece (2) Department of Environment and Planning, University of Aveiro, Aveiro, Portugal

MEMO simulations have been performed in complex terrain areas, including the greater areas of Athens, Greece and Marseille, France. In the case of Athens, MEMO was evaluated against observations and also compared to the 3D Eulerian, limited-area, non-hydrostatic, terrain-following MM5 model system. The Greater Athens Area (GAA) presents several terrain irregularities and large water bodies. It is located in an oblong basin, surrounded by mountains on three sides and open towards the Saronikos Bay to the southwest. The local wind circulations caused by this complex topography, particularly the sea breeze, greatly influence air

† http://rem.jrc.cec.eu.int/etex/ ‡ http://ecmwf.org/ § http://ensemble.jrc.it

pollution circulation in the GAA. The period between the 16th and 19th of July 2002 was simulated, for which a complete set of meteorological observations was available. Comparisons with observations were carried out at 10 different stations in and around the GAA, mainly suburban stations plus two urban stations, Patision and Marousi. Both MEMO and MM5 reproduced the afternoon sea breeze, although in the case of MM5 the flow was generally more homogeneous, whereas MEMO simulated the sea breeze in two different cells of the peninsula (Figure 4). More specifically, a south-easterly change in wind direction in the Gulf of Petalion encouraged the development of a sea breeze cell in the Mesogia Plain, which was hardly apparent in the case of MM5.


Figure 4. MM5 (a) and MEMO (b) predicted wind (m s-1) over terrain (m), 19th of July 2002, 14:00

Both MEMO and MM5 followed the observed wind speed diurnal pattern successfully (Figure 5a,b). The wind speed BIAS reveals a tendency of both models to overestimate, with MEMO's overestimations appearing slightly more pronounced than MM5's. However, the correlation coefficient of the time series makes evident MEMO's capability to follow the diurnal variation of wind speed more accurately almost everywhere in the GAA. Regarding temperature, both MM5 and MEMO were able to capture the diurnal pattern, as well as the gradual decreasing trend during the simulation period (Figure 5c,d). This decrease in temperature was both observed and predicted at all stations, in agreement with the prevailing synoptic conditions. At most stations, MEMO was closer to the observations. MM5 performed well in the beginning of the simulation, but it significantly underestimated the temperatures during the last two days of the modelled period.


Figure 5. Surface wind speed (a, b) and temperature (c, d) at the stations of Marousi (a, c) and Liosia (b, d) for the whole simulation period

7.1.3 Evaluation of MEMO Using the ESCOMPTE Pre-campaign Dataset N. Moussiopoulos(1) , I. Douros(1), P. Louka(1), C. Simonidis(2), A. Arvanitis(1) (1) Laboratory of Heat Transfer and Environmental Engineering, Aristotle University Thessaloniki, Thessaloniki, Greece (2) Institute of Technical Thermodynamics, University of Karlsruhe, Karlsruhe, Germany

The Greater Marseille Area (GMA) is a challenging area for mesoscale simulations since it includes certain distinct geographical formations (the sea, the Southern Alps, the Rhone valley) which directly influence the local circulation. The selected case study was the period between June 29 and July 1, 2000, i.e. a summer period for which, depending on the meteorological conditions, the formation of photochemical smog was favoured. Three out of a total of 15 measuring stations were considered, at locations where differences in local meteorological characteristics can be expected, namely a location by the sea (Marseille), a location further inland (Tarascon) and a location far from the sea (Carpentras). In order to assess the sensitivity of the model results to the grid resolution, two different cell sizes were used, namely 4×4 km2 and 2×2 km2.

In general, the simulated values for both resolutions are comparable, with the higher resolution, not surprisingly, capturing more detail. Taking a closer look at each selected station, it is shown that for the station of Marseille, located by the sea, the correlation with measurements is better for the MEMO run with the 2 km resolution than for the 4 km resolution, especially for wind speed and temperature, as the local flow is more accurately captured. The model performance for temperature is found to be better for Marseille and Tarascon, while the correlation is considerably higher for daytime values than for night-time values at the station of Carpentras, for which the night-time temperature is overpredicted. As Carpentras is located on a mountain slope, this is possibly due to an underestimation of the radiative heat flux from the ground associated with the

land-use categorisation implemented in the model. Overprediction of the night-time temperature is generally observed at mountainous stations in similar positions.


Figure 6. BIAS for wind speed (a) and air temperature (b) during the simulation period (29 June to 1 July 2000). Red bars denote midnight

The statistical analysis generally suggests that there is an overestimation of wind speed early in the morning of the last day of the simulation (Figure 6a), which, however, does not correspond to a poor prediction of the wind direction during the same time period (not shown). Temperature, on the other hand, reveals the night-time overestimation trend (Figure 6b), which was already evident from the diurnal profiles at the selected stations. Generally, the BIAS at a grid spacing of 4 km is comparable to that at 2 km, a fact that does not justify the much higher computational effort associated with the higher resolution.

7.1.4 Modelling of SOA in the MARS-MUSE Dispersion Model Edouard Debry, Ioannis Douros and Nicolas Moussiopoulos Laboratory of Heat Transfer and Environmental Engineering, Aristotle University Thessaloniki, Thessaloniki, Greece

The MARS/MUSE (Moussiopoulos, 1995) 3D Eulerian mesoscale photochemical modelling system was evaluated for the area of Milan. In the version examined, a modal aerosol model was incorporated for secondary organic aerosols (SOA), in which coagulation, condensation/evaporation and nucleation were solved for each mode of the aerosol distribution. Simulation results were compared with sets of measured data for PM10, O3, NO2 and NO.

The simulation time is one week starting on the 1st of April 1999. Initial and boundary conditions are provided by predictions of the EMEP model. The model was able to reproduce the average level of observations quite well (small BIAS); however, it could not adequately follow the diurnal variation of PM10 at the stations of Magenta and Limito (large BIAS, small correlation coefficient; Table 18). Secondary organics produce an increase in PM10 concentration but do not change the PM10 behaviour significantly. This suggests that secondary organics rapidly reach an equilibrium between the gas and aerosol phases. The simulation with SOA slightly reduces the error relative to the simulation without SOA, except for the Limito station, but in both cases the RMSE and correlation coefficients remain of the same order of magnitude (Table 18). The impact of SOA on the BIAS values is more evident.

Table 18. Statistical analysis of the MARS/MUSE simulations with and without SOA

Station                     BIAS     RMSE    Correlation coefficient r
Limito (with SOA)         +13.81    24.99     0.15
Limito (without SOA)      +11.42    23.22     0.14
Meda (with SOA)            -2.44    16.86     0.19
Meda (without SOA)         -4.94    17.34     0.15
Vimercate (with SOA)        0.38    13.90     0.43
Vimercate (without SOA)    -2.15    13.38     0.44
Magenta (with SOA)         -3.46    15.15    -0.02
Magenta (without SOA)      -4.99    15.42    -0.03

The time average and standard deviation of PM10 and PM2.5 concentrations at Meda station for observations and simulations can be examined with or without SOA. The fact that the SOA increment also appears for PM2.5 predictions in equivalent proportions indicates that secondary organics are mainly associated with finer aerosols.

7.1.5 Photochemical Simulations over the Greater Athens Area Elissavet Bossioli, Maria Tombrou-Tzella University of Athens, Department of Applied Physics, Laboratory of Meteorology, Athens, Greece

In the study of Bossioli et al. (2007), several factors that influence the ozone concentration levels over the Greater Athens Area (GAA) are examined by applying the three-dimensional photochemical Urban Airshed Model UAM-V, off-line coupled with the Penn State/NCAR mesoscale meteorological model MM5-v3.6. The initial scenario (Base Case) constitutes the basis for all the numerical experiments. In particular, this scenario considers both the meteorological data and the emission data of anthropogenic origin in their primary form, with no further modification.

The Base Case scenario (BC) reproduces the observed ozone patterns, but it underestimates the observed peaks in most of the downwind suburban stations (Figure 7). Based on the Base Case scenario, several numerical experiments were performed focusing on: i) a better representation of the anthropogenic emissions; ii) the incorporation of the spatial and hourly distribution of the biogenic non methane hydrocarbon emission rates; iii) the adoption of two speciation profiles for the anthropogenic NMVOC emissions; iv) the effect of the urban sector introduced via a simplified urbanized meteorological dataset; v) the application of the MM5 model without nesting in order to isolate the synoptic effects from the local circulation evolution and vi) the effect of the ozone boundary inflows.

The performance of the UAM-V model is evaluated using statistical measures for management studies recommended by the US EPA (1991). The measures used are the mean normalized BIAS (MNB), the mean normalized error (MNE), the unpaired peak prediction accuracy (Au) and the spatially-paired peak estimation accuracy (As). All the calculated values are based on the hourly prediction-observation differences normalized by the observed ozone concentrations for all the monitoring stations. The MNB and MNE measures have been calculated for three cut-off concentration levels (110, 80 and 40 μg m-3). The As values were calculated both for the entire set of stations and for each category separately.
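These measures can be computed directly from paired hourly series. Below is a minimal sketch in Python/NumPy of the MNB, MNE and Au definitions given above (the function names and example values are illustrative, not taken from the UAM-V code):

```python
import numpy as np

def mnb_mne(obs, mod, cutoff):
    """Mean normalized bias and error, using only hours where the
    observed concentration is at or above the cut-off (ug/m3)."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    mask = obs >= cutoff
    # prediction-observation differences normalized by the observations
    rel = (mod[mask] - obs[mask]) / obs[mask]
    return rel.mean(), np.abs(rel).mean()

def unpaired_peak_accuracy(obs, mod):
    """Au: relative difference of the peak values, unpaired in space and time."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    return (mod.max() - obs.max()) / obs.max()
```

Applying `mnb_mne` at several cut-offs, as in the study, simply restricts the averaging to progressively more polluted hours.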



Figure 7. Measured and predicted mean hourly O3 concentrations at (a) the urban traffic station of Athinas (measurements at Patision are also included), (b) the urban background station of N.Smirni, (c) the suburban industrial station of Geoponiki, (d) the suburban station of Likovrisi, and the suburban background stations of (e) Liossia, and (f) Demokritos for various scenarios. BC: Base Case; BCB: BC plus biogenic emissions; BCB_spec (standard): BCB with different NMVOC speciation profiles (the meteorological model is applied with nesting); BCB_spec (urbanized): BCB_spec with the effect of the urban sector introduced via a simplified urbanized meteorological dataset; BCB_spec (no nesting): the meteorological model is applied with no nesting. Figure from Bossioli et al. (2007)

7.1.6 Mesoscale Meteorological Model Inter-comparison and Evaluation in FUMAPEX Barbara Fay(1), Viel Odegaard(2) (1) Deutscher Wetterdienst, Offenbach, Germany (2) Det Norske Meteorologiske Institutt, Blindern, Oslo, Norway

As already outlined in 4.2.5, an extensive model inter-comparison was performed within the EU FP5 project FUMAPEX. Evaluations were performed separately for selected episodes and for a full year (long-term evaluation). A few results are highlighted here; more details can be found in Annex G and Fay et al. (2005).

A main outcome of the evaluations performed in FUMAPEX is an insufficient simulation of temperature inversions in all models. This is attributed to the following model deficiencies:

• Model set-up (insufficient vertical resolution, hydrostatic modelling (HIRLAM), terrain-following coordinates).
• Physiographic parameters (incl. deficient land-sea mask).
• Soil and surface parameterisations (invariant snow properties, false soil moisture, lack of urbanised parameterisations).
• Cloud parameterisation.
• Surface evaporation (overestimated).
• Simulation of strong stability (deficient turbulence parameterisation, overpredicted vertical exchange and vertical shear of the horizontal wind).
• Data assimilation (missing snow and sea ice, insufficient vertical soundings, soil and surface parameters and urban observations).

For the one-year evaluation and the summer and winter seasons, the ranges of the parameter scores in FUMAPEX are shown in Table 19 for COSMO-EU/COSMO-IT, DNMI HIRLAM and MM5, and FMI HIRLAM. Model performance for episode forecasting seems to depend mainly on the model's ability to forecast the specific meteorological episode features at sometimes complex locations and even for extreme meteorological conditions, and on the station representativeness and observation quality. The performance depends much less on whether the location is urban, suburban or rural.

Table 19. Range of scores for forecast lengths below 48 hours for wind speed at 10 m (FF 10 m), temperature at 2 m (T 2 m) and dew point temperature at 2 m (Td 2 m)

                                      FF 10 m (m s-1)   T 2 m (°C)     Td 2 m (°C)
BIAS (1 year of data, not seasonal)    -1.0 to +1.2     -1.5 to +2.0   -2.0 to +3.5
RMSE, year                              1.5 to 2.3       1.8 to 4.2     1.4 to 4.8
RMSE, summer                            1.0 to 2.9       1.2 to 4.3     1.2 to 8.0
RMSE, winter                            1.2 to 3.9       1.6 to 4.5     1.6 to 6.6

Comparing the results of the different models for the different episodes in terms of their skill in forecasting air pollution episodes, the models apparently perform better in predicting the summer episodes than the winter/spring inversion episodes. In some regions, like Valencia, summer episode conditions are very frequent and possibly include less unusual or extreme meteorological conditions than in most other areas. This picture is also confirmed when the episode performance is evaluated against the background of longer-term statistical scores. These results clearly show the scope, but also the limitations, of even highly resolving mesoscale NWP models, especially for the sometimes extreme episode conditions. Very strong inversions and stability, complex orography, and superimposed valley-mountain and land-sea breeze systems, combined also with larger-scale circulations, may decrease model performance and challenge model predictability. Information on the model evaluation strategy used in FUMAPEX and detailed single-model evaluation statistics (full tables and graphs, for the whole year and the seasons) and their interpretation for all models and episodes are compiled in Fay et al. (2004, 2005).

7.1.7 Evaluation of COSMO-IT for Air Quality Forecast and Assessment Purposes M. Deserti(1), G. Finzi(6), S. Bande(2), G. Bonafè(1), E. Minguzzi(1), M. Stortini(1), E. Angelino(3), M.P. Costa(3), G. Fossati(3), E. Peroni(3), G. Pession(4), F. Dalan(5), S. Pillon(5), C. Carnevale(6), E. Pisoni(6), G. Pirovano(7), M. Bedogni(8) (1) ARPA Emilia Romagna ([email protected]) (2) ARPA Piemonte (3) ARPA Lombardia (4) ARPA Valle d’Aosta (5) ARPA Veneto (6) DEA, Università degli Studi di Brescia (7) CESI RICERCA S.p.A. (8) Mobility and Environment Agency of Milan

The prognostic meteorological model COSMO-IT is the Italian version of the non-hydrostatic limited-area model COSMO-EU (formerly named Lokal Modell; Steppeler, 2003). It is run twice a day with a horizontal resolution of about 7 km, and provides meteorological forecasts and analyses for Italy. Two validation exercises were performed with the aim of evaluating the performance of the operational COSMO_IT model as input for producing forecast and/or hindcast simulations of air quality in the Po Valley (northern Italy). The first exercise was performed by the Hydrometeorological Service of Emilia-Romagna (SIM) in the framework of the EU FP5 FUMAPEX project (Fay et al., 2004); model output (forecasts and analyses) was compared with routine meteorological data and with data from special measurement campaigns. To assess the performance of the coarse model (7 km horizontal resolution), the model outputs were also compared with high-resolution outputs during pollution episodes over three different areas represented in Figure 8 (COSMO_IT domains at 1.1 km and 2.8 km horizontal resolution). The second exercise was carried out by the Meteorology Centre of Teolo (CMT) in the framework of the Italian CTN-ACE project (Deserti et al., 2004) and compared model output (reanalysis) with routine meteorological data over a one-year period. In addition, COSMO_IT 7 km results were compared with results from a diagnostic mass-consistent meteorological model (CALMET, 4 km horizontal resolution) run on a sub-domain (CALMET Domain, Figure 8).

Figure 8. Model domains for the Italian evaluation exercises

In Annex H, results of the COSMO_IT evaluation are given in tabular form for the special campaigns and for the long term. They can be summarized as follows:

• COSMO_IT forecasts were generally better over flat terrain than in the mountains (complex terrain).

• The results are less good during peak pollution episodes (atmospheric stability, low or calm wind, clear sky).
• Long-term verification shows that T 2 m temperature forecasts are of acceptable quality (mean absolute error MAE < 2.5 K, except for urban stations); wind speed forecasts are generally acceptable, especially over flat terrain.
• Results for wind are dependent on forecast time and season (Figure 9).
• In the Po Valley winds are frequently overestimated, while in the Apennine mountains winds are frequently underestimated, leading to incorrect air pollution episode forecasting.
• The forecast of wind direction is generally poor (the direction MAE ranges between 30° and 80°) and depends strongly on the station (Po Valley better than the Apennine mountains), the season (summer is the worst) and the wind sector (225°-270° sector during the night, 45°-90° sector during the day, 0°-45° sector during the day in the mountains).

Figure 9. Wind roses for most of the selected surface stations: observed (left) and COSMO_IT (right)

The following general conclusions can be drawn from the evaluation of the suitability of COSMO_IT results (Annex H) for AQ forecasts:

• Errors in temperature and humidity forecasting in the PBL are partly due to an incorrect partitioning of surface heat fluxes into sensible and latent heat fluxes. These errors lead to a strong underestimation of the surface temperature inversions (Figure 10). A more detailed soil texture field and an operational routine for soil moisture initialization could reduce these errors.
• The errors in the T 2 m daily cycle, the cold and moist BIAS in the PBL, the overestimation of the 10 m wind over flat terrain and the poor treatment of cases with extreme thermal stratification are problems related to the turbulence scheme implemented in the version of COSMO_IT used. These problems will probably be solved in the near future by reorganizing the turbulence scheme and tuning some parameters of the PBL scheme, changing the interpolation of model variables to synoptic levels, reducing the depth of the lowest model layer, and testing and implementing improved schemes for soil moisture analysis.

COSMO_IT was also run at 2.8 km and 1.1 km horizontal resolution. These simulations show that wind field structures become more detailed and realistic. In addition, an impact was found on turbulence and the variability of vertical velocity, but, due to the lack of experimental data, it was not possible to validate these effects with observations. Therefore, it cannot be concluded from this study that an increase of horizontal resolution improves the accuracy of the meteorological input for air quality models.


Figure 10. Temperature profile at “San Pietro Capofiume” station (rural) on the 21st of June 2002 at noon (a) and at midnight (b): observed (black line) and forecast with three different horizontal resolutions: 7 km (blue line), 2.8 km (green line) and 1.1 km (red line)

Both episodic and long-term verification show that forecasts were generally better in rural than in urban regions. This result shows the need to account for urbanization. The FUMAPEX project has suggested several strategies and techniques for NWP urbanization; nevertheless, it should be considered that any type of urbanization of COSMO_EU could enhance turbulence in the urban areas. This could improve the model’s ability to forecast mixing height and other related turbulence parameters, but could also reduce its ability to forecast inversions. This is, at the moment, one of the most critical problems for peak pollution episode forecasting.

7.1.8 Evaluation of MM5-CMAQ Systems for an Episode over the UK Y Yu(1), R S Sokhi(1), Nutthida Kitwiroon(1), Bernard Fisher(2), D R Middleton(3) (1) Centre for Atmospheric and Instrumentation Research (CAIR), University of Hertfordshire, UK (2) Environment Agency, Reading, UK (3) Met Office, Exeter, UK

The MM5/CMAQ system was applied to an air pollution episode on 22-28 June 2001 to examine its performance characteristics for simulating regional ozone, NO2 and SO2. Further details of the model setup and analysis can be found in Yu et al. (2006, 2008). This section focuses on the results of the model evaluation exercise. A range of statistical measures was used for the MM5 and CMAQ evaluation. These were calculated for the innermost domain of MM5 and CMAQ for the whole simulated period in June 2001. Model values used in these calculations are extracted from the first model level (about 14 m AGL) for the CMAQ evaluations.

7.1.8.1 MM5 Performance Table 20 lists the results of the statistics for MM5. These values reflect averages over space (all monitoring stations in the innermost MM5 domain) and time (all hours in the simulation episode). All the statistics indicate a good overall agreement between observations and model predictions, especially for 2 m temperature and 10 m wind direction, with correlation coefficients of 0.94 and 0.8, respectively. Considering the low wind speeds observed during this period, the model wind speed performance is very good and comparable to values found in the literature (Zhong et al., 2003).

Table 20. Performance statistics for the meteorological predictions with 3-km grid spacing

Score                    2 m temperature (°C)   10 m wind speed (m s-1)   10 m wind direction (degree)
Mean Obs.                       18.2                    3.4                       155
Mean Sim.                       18.8                    3.0                       158
Total # (N)                     4599                    4414                      4391
Corr. Coeff. R                  0.94                    0.59                      0.8
BIAS                            0.7                     -0.3                      7.3
NMB%                            3.7                     -8.8                      4.7
MAE                             1.44                    1.2                       28.2
NME%                            7.6                     36.6                      18.2
RMSE                            1.7                     1.5                       42.6
Index of Agreement IOA          0.97                    0.75                      0.93
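The scores in Table 20 follow standard definitions (IOA is the Willmott index of agreement). A minimal sketch of how such scores can be computed from paired observation/model series, in Python/NumPy (the function name is illustrative):

```python
import numpy as np

def evaluation_scores(obs, mod):
    """BIAS, NMB, MAE, NME, RMSE, correlation R and Willmott's
    index of agreement (IOA) for paired obs/model series."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    d = mod - obs
    # IOA denominator: potential error relative to the observed mean
    denom_ioa = ((np.abs(mod - obs.mean()) + np.abs(obs - obs.mean())) ** 2).sum()
    return {
        "BIAS": d.mean(),
        "NMB%": 100.0 * d.sum() / obs.sum(),
        "MAE": np.abs(d).mean(),
        "NME%": 100.0 * np.abs(d).sum() / obs.sum(),
        "RMSE": np.sqrt((d ** 2).mean()),
        "R": np.corrcoef(obs, mod)[0, 1],
        "IOA": 1.0 - (d ** 2).sum() / denom_ioa,
    }
```

Computed per station, or pooled over all stations and hours, these are the quantities reported in the tables of this section.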

7.1.8.2 CMAQ Performance Measured hourly air quality data at 22 monitoring stations were used in the model evaluation for CMAQ. Qualitatively, the model simulates the diurnal O3 and NO2 concentration patterns very well at all sites. Figure 11 compares the measured O3 and NO2 time series with the modelled results extracted from the first model level (about 14 m AGL) at two representative sites. At both sites, the model captures the O3 night time lows quite well, but it tends to underpredict daytime peaks during high ozone days, for example on 24 June at London Bexley and on 25/26 June at Harwell.


Figure 11. Comparison of measured and modelled time series of O3 (a, b) and NO2 (c, d) concentrations at London Bexley (a, c) and Harwell (b, d). Modelled O3 and NO2 concentrations were taken from the first model level (about 14 m AGL) for 23 to 28 June 2001

Statistical parameters indicate a satisfactory overall model performance (Annex I). The CMAQ model was able to reproduce the observed temporal and spatial variations of O3 (Table 33) and NO2 (Table 34). On average, the model slightly under-predicts O3 concentrations, with a BIAS of -3.6 μg m-3 and an MNB of 30 % for the 3 km resolution and a BIAS of -0.8 μg m-3 and an MNB of 39.4 % for the 9 km resolution. For O3 the 9-km and the 3-km resolution simulations gave comparable model performance. However, the model tends to miss very high peak O3 values. The causes of this disagreement should be investigated as part of future work. In the case of NO2, the model shows an under-prediction, with a BIAS of -11.8 μg m-3 and an MNB of -14.7 % for the 3 km resolution and a BIAS of -13.6 μg m-3 and an MNB of -14.0 % for the 9 km resolution. The model performs better for ozone than for primary pollutants such as NO2. For NO2, the 3-km resolution generally gives better predictions than the 9-km resolution simulation.

7.1.9 Evaluation of the MM5-CMAQ-EMIMO Modelling System in Spain Roberto San Jose Computer Science School - Technical University of Madrid, Campus de Montegancedo, Madrid, Spain

The MM5-CMAQ air quality modelling system has been used with the EMIMO model, developed by UPM in 2001 with several versions afterwards (San José et al., 2004, 2005, 2006, 2007; Sokhi et al., 2006). The system has been used in Spain to carry out a large set of air quality impact studies for new industrial sites (combined cycle power plants, incinerators, etc.). In most of the cases the system has been implemented on a domain configuration of 400 x 400 km2, 100 x 100 km2 and 24 x 24 km2 centred on the industrial plant, with 9 km, 3 km and 1 km spatial resolution, respectively. Recently, the system has been applied over a whole year; in earlier applications it was only applied for 5 days, a month, or 60 days. The system is also being applied in forecasting mode to provide air quality forecasts to several Spanish cities, such as Madrid and Las Palmas de Gran Canaria, and to UK cities (Leicester City Council). These systems are operating on a daily basis. The system is implemented on sophisticated cluster platforms to provide real-time forecasts of the industrial emissions of electric and cement companies, to support decision-making by the commercial partners and policy makers. In the latter case, the system is run in parallel with different scenarios and the impact of industrial emissions is assessed by calculating the differences between scenarios. For each application the system has been calibrated by comparing the concentration results with observational data provided by the different air quality monitoring stations.

In Figure 12 the hourly area-averaged observed ozone data obtained from the Madrid Community monitoring network and the corresponding averages modelled with MM5-CMAQ-EMIMO are compared for the year 2005. The difference between the mean observed and mean modelled values is 1.1 µg m-3, corresponding to an uncertainty of approximately 0.04 %. This result is obtained with a 3 km spatial resolution model domain nested within a 9 km and a 50 km European CMAQ model domain.

Figure 12. MM5-CMAQ-EMIMO model simulation over the Madrid (Spain) domain with 3 km spatial resolution. Comparison between observed and modelled ozone data averaged over 23 different monitoring stations in the Madrid Community for 2005 (365 x 24 hours = 8760 data points)

More details on results of the MM5-CMAQ-EMIMO system can be found in Annex J. The system has been used in hindcast mode (air quality impact studies) and in forecast mode (real-time forecasting systems looking 96-120 hours into the future). Figure 13 shows two examples in real-time forecasting mode.


Figure 13. Ozone observations versus modelled data produced by the MM5-CMAQ-EMIMO air quality modelling system (3 km resolution) operating in real-time 24-hour forecasting mode for (a) 20-28 August 2006 in Torrejón (Madrid Community) and (b) 7-28 August 2006 at the Torrejón monitoring station (Madrid Community)

The results obtained by the ESMG (Environmental Software and Modelling Group) of the UPM-FI show similar characteristics to those obtained in applications in UK (see Section 7.1.8).

7.2 Concentrations of Chemical Species Besides the studies mentioned in Section 7.1, numerous studies have been published in which the performance of a single, specific regional-scale Chemical Transport Model (CTM) has been described. The outputs of these CTMs are compared to observations, statistical analysis is carried out, and conclusions are drawn to indicate that “there is reasonable agreement between model results and observations”.

Over the last couple of years several model validation and model intercomparison studies have been carried out in Europe, in which several models participated, in contrast to the usual single-model evaluation studies. The large advantage of such a set-up is that the models are also tested against each other, and that a more open discussion originates in which the strong and the weak points of the different models are analysed. One of the first regional-scale studies was reported by Hass et al. (1996, 1997), in which four photo-oxidant models were compared and validated against O3 observations for a 2-day episode in 1990. The results of this study were used in a later study by Delle Monache and Stull (2003) to investigate the possibilities of ensemble modelling. Roemer et al. (2003) performed a study with ten different CTMs focussing on ozone trends over Europe. This study was followed by a study to intercompare aerosol modelling over Europe (Hass et al., 2003), in which six different CTMs participated. In the framework of the review of the EMEP model, seven models were evaluated and intercompared both for gas-phase species and aerosols (van Loon et al., 2004). Within the project EURODELTA, led by JRC-Ispra, long-term ozone simulations from seven different models are compared and evaluated to analyse their ensemble average, their combined uncertainty and their overall performance (van Loon, 2006; Vautard, 2006).

Considering the studies listed above, a number of common characteristics can be seen:

• The model evaluations focussed on trace gas and aerosol characteristics, and did not explicitly consider meteorology. Obviously, specific studies have been performed to evaluate meteorological models, also focussed on air quality applications; see, for example, Seaman (2000). In the studies listed above both prognostic / NWP meteorology and diagnostic meteorology are used as input, but the impact of using these two types of input data is not explicitly studied.
• In general, common types of statistical parameters are used in the analysis, such as the correlation coefficient r, RMSE, NMSE, fractional BIAS, and the standard deviation STDE. Sometimes PDFs of differences between observations and model results, or frequency analyses, with appropriate plots and tables, are calculated.
• The emission data, the main driving force of air quality, are harmonised between the different models in the intercomparison studies.
• The models taking part in these studies should be considered operational, deterministic models, which perform hour-by-hour calculations over several years. This means that they are not empirical models, for which model evaluation is quite different, nor models which focus on the evaluation of process studies.
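Two of the statistical parameters named above, the fractional BIAS and the NMSE, can be sketched as follows; a minimal Python/NumPy illustration (the function name is not taken from any of the cited studies):

```python
import numpy as np

def fb_nmse(obs, mod):
    """Fractional bias FB = 2(Cm - Co)/(Cm + Co) and normalized mean
    square error NMSE = <(Cm - Co)^2> / (mean(Cm) * mean(Co))."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    fb = 2.0 * (mod.mean() - obs.mean()) / (mod.mean() + obs.mean())
    nmse = ((mod - obs) ** 2).mean() / (mod.mean() * obs.mean())
    return fb, nmse
```

Both measures are dimensionless, which is what makes them convenient for comparing models across species and stations with very different concentration levels.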

Summarising and presenting the results of model intercomparison and validation studies is often quite complicated due to the large amount of data produced. At JRC-Ispra, in the framework of the City-Delta project, a tool has been developed based on so-called Taylor diagrams, in which the statistical evaluation of models can be presented in a coherent way. At TNO, within the EMEP-review project, a system has been developed by which the statistical analysis of the results of model validation and intercomparison can be presented in a convenient, tabular form.
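The statistics behind a point on a Taylor diagram — normalised standard deviation, correlation and centred RMS difference — can be computed as in the sketch below. This illustrates only the underlying relation (Taylor, 2001); it is not the JRC tool itself, and the function name is our own.

```python
import numpy as np

def taylor_coordinates(obs, mod):
    """Statistics underlying a Taylor diagram point for one model run.

    Returns (normalised standard deviation, correlation, normalised
    centred RMS difference); the three satisfy the law-of-cosines
    relation on which the diagram is constructed.
    """
    obs, mod = np.asarray(obs, float), np.asarray(mod, float)
    r = np.corrcoef(obs, mod)[0, 1]
    sigma_o, sigma_m = obs.std(), mod.std()
    # centred RMS difference (the bias is removed from both series)
    crmsd = np.sqrt(np.mean(((mod - mod.mean()) - (obs - obs.mean())) ** 2))
    return sigma_m / sigma_o, r, crmsd / sigma_o
```

In units of the observed standard deviation the three coordinates obey crmsd² = 1 + s² − 2·s·r, which is why a single diagram can display all three statistics for many models at once.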

In practice, many mistakes made during model application and model improvement studies stem from small coding and input errors. One way to avoid at least part of these mistakes is a test-set or "quick-scan" system that makes it possible to test quickly the behaviour of new and updated model versions. Such a quick-scan system might be used in a common way by different modelling groups and might contain a kind of standard test-set. For chemical box models, such a system has been developed by Poppe and Kuhn (1996).
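A minimal sketch of such a quick-scan is given below: the new model version is run on a fixed test case and its summary quantities are compared against stored reference results within prescribed tolerances. All variable names, test quantities and tolerances here are hypothetical and would be replaced by an agreed standard test-set.

```python
def quick_scan(new_results, reference, tolerances):
    """Compare a new model version's test-case output against stored
    reference values; return the variables whose deviation exceeds
    the prescribed tolerance (empty list = version passes)."""
    failures = []
    for var, ref_value in reference.items():
        diff = abs(new_results[var] - ref_value)
        if diff > tolerances.get(var, 1e-6):
            failures.append((var, diff))
    return failures

# hypothetical reference results and tolerances for a fixed episode run
reference = {"o3_peak_ppb": 82.4, "t2m_mean_K": 288.7}
tolerances = {"o3_peak_ppb": 0.5, "t2m_mean_K": 0.1}
new_run = {"o3_peak_ppb": 82.6, "t2m_mean_K": 289.4}
failed = quick_scan(new_run, reference, tolerances)
# the 2-m temperature deviates by 0.7 K and would be flagged
```

Because the test case, reference data and tolerances are fixed, such a scan can be rerun automatically after every code change, which is what makes it usable across different modelling groups.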

8. MODEL EVALUATION METHODOLOGIES

K. Heinke Schlünzen
ZMAW, University of Hamburg, Meteorological Institute, Hamburg, Germany

The current status of model evaluation mainly concerns comparisons of single components with data or analytic solutions, model intercomparisons, or comparisons with evaluated reference data (Section 7). After some early attempts at generic evaluation protocols (Model Evaluation Group, 1994) and at mesoscale model evaluation guidelines (Schlünzen, 1997), several further approaches followed during the past few years. The SATURN project (Moussiopoulos, 2003; Borrego et al., 2003) initiated work in this area. Activities associated with more general model evaluation include:

• The Clean Air for Europe (CAFE) project under the 6th Environment Action Programme, which strives to develop a thematic strategy on air pollution.
• The City-Delta project, organised by the Joint Research Centre of the EC, which focuses on urban background concentrations in several European cities.
• The ENV-e-City project, which aims to improve access to environmental data; meteorology for air pollution assessments is a pilot application area.
• The Network of Excellence ACCENT*.

As to what the actual procedures of a model evaluation protocol should consist of, the initial attempt by the Model Evaluation Group (1994) concluded that the following steps should be followed:

a. A complete description of the model.
b. A complete description of the database used for the evaluation of the model.
c. Scientific evaluation: a description of the equations employed to describe the physical and chemical processes that the model has been designed to include.
d. Code verification, including sensitivity analysis and model intercomparison.
e. Model validation, including comparison with experimental data and statistical analysis on the basis of selected measures.
f. User-oriented assessment, which essentially includes a documentation of the code, including best practice guidelines.

A similar list is used in a guideline for evaluating obstacle-resolving microscale models (VDI, 2005). The above six points are aimed mainly at the model developer. Following the structure of VDI (2005), they can be summarized in three groups:

1. General evaluation: includes points a and f and can be performed off the computer without deep scientific knowledge.
2. Scientific evaluation: includes points b and c and can also be performed off the computer but needs scientific knowledge.
3. Benchmark tests: include points d and e, which need computer simulations and detailed comparisons.

The results of these three evaluation steps should be summarized in an evaluation protocol (Figure 14). In a second part of the evaluation, control steps should be suggested to ensure that the model user receives reliable results. These can be part of a best practice guideline (point f of the above list).

* http://www.accent-network.org/

[Figure: Part I, to be applied by the model developer — objective, 1. General evaluation, 2. Scientific evaluation and 3. Benchmark tests, compiled into an evaluation protocol; Part II, to be applied by the model user — operational evaluation.]

Figure 14. Structure of an evaluation guideline (from Schlünzen et al., 2007)

The different steps given in Figure 14 are detailed in Figure 15. While the general evaluation is generic and need not be more specific for the mesoscale, all other steps need to be defined specifically for the mesoscale and the intended applications (e.g. integration period).

(a) 1. General evaluation — checking the comprehensibility:
1.1 Documentation must be available, consisting of a short model description, an extended model description, a user manual and a technical reference.
1.2 Source code open for inspection.
1.3 Three publications in refereed journals.

(b) 2. Scientific evaluation (specification needs to be scale dependent) — identify the processes required in the model:
• Model equations
• Model approximations
• Parameterisations
• Boundary conditions
• Initialisation
• Input data
• …

(c) 3. Benchmark tests (specification needs to be scale dependent):
3.1 Quality indicators: correlation coefficient, NRMSE, standard deviation, hit rate, etc.; PDF of differences between observations and model results; frequency analysis.
3.2 Definition of validation test cases: specification of grid structure; definition of time scale; definition of horizontal and vertical resolution; specification of input and comparison data; quality assurance of data, spatial representativeness, flagging (use previous intercomparisons and new studies); prescribed fixed input data (emissions, land-use, meteorology, boundary conditions, etc.); specified evaluation criteria, evaluation variables and error tolerances.
3.3 Definition of sensitivity tests (these should reflect the purpose): all specifications as for the validation test cases.
3.4 On-line quality tests (operational checks for numerical problems): no numerical oscillations in time; mass conservation; no exceedance of threshold values (e.g. negative humidity/concentrations); …

(d) Σ Evaluation protocol — compiles all evaluation results on one page.

(e) Operational evaluation (specification needs to be scale dependent):
• Demands on model grid structure
• Use of the operational on-line quality checks of the model
• Quality control of model results: no 2Δx-oscillations (inspection of cross sections); check of "independence" of model results from resolution and model area size (5% differences allowed); check of model results for plausibility and, whenever possible, quantitative comparison with measurements and results of other models
• Documentation of the model evaluation and of model limitations

Figure 15. Details of the different parts of a generic evaluation guideline: general evaluation (a), scientific evaluation (b), benchmark tests (c), evaluation protocol (d) and operational evaluation (e) (figures from Schlünzen et al., 2007)

Steps 1 to 3 should be performed by a model developer and be summarized in an evaluation protocol, while the operational evaluation should also be performed by the model user. The details of what to check are, however, again scale specific and need to be defined within an evaluation protocol.

The structure outlined in Figure 14 and Figure 15 is already applied in VDI (2005) for the evaluation of microscale models. It is currently used as the structure for VDI (2008) and outlines the structural approach to be used in COST728. Here, models are to be evaluated with respect to their suitability for air pollution dispersion applications. For this purpose special focus will be on meteorological parameters (e.g. wind direction, PBL height, radiation) and, in addition, on concentrations. The model resolution for which the guideline is to be developed is 1-16 km in the horizontal. Focus will lie on model results for hourly data within time periods between a few days and up to one year.
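The on-line quality tests named in the operational evaluation (e.g. no negative humidity or concentrations, mass conservation) can be sketched as simple runtime guards applied each time step. The function name, arguments and tolerance below are our own assumptions, not part of any of the cited guidelines.

```python
import numpy as np

def online_quality_checks(field, total_mass_before, total_mass_after,
                          mass_tol=1e-6):
    """Illustrative per-time-step checks of the kind an operational
    evaluation might apply; returns a list of detected problems."""
    problems = []
    # threshold check: specific humidity or concentrations must not go negative
    if np.any(field < 0.0):
        problems.append("negative values in field")
    # mass conservation check: relative drift of the domain-integrated total
    drift = abs(total_mass_after - total_mass_before) / abs(total_mass_before)
    if drift > mass_tol:
        problems.append("mass conservation violated (drift=%.2e)" % drift)
    return problems

# hypothetical example: one negative concentration and a 0.5% mass drift
checks = online_quality_checks(np.array([0.1, 0.2, -0.01]), 100.0, 100.5)
```

A check for 2Δx oscillations could be added in the same style, e.g. by inspecting the amplitude of the shortest resolvable wave in cross sections of the prognostic fields.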

9. USER TRAINING

Marko Kaasik(1), Ranjeet Sokhi(2), K. Heinke Schlünzen(3), Gertie Gertsema(4), Barbara Fay(5), Liisa Jalkanen(6)

(1) University of Tartu, Institute of Environmental Physics, Tartu, Estonia
(2) Centre for Atmospheric and Instrumentation Research (CAIR), University of Hertfordshire, UK
(3) ZMAW, University of Hamburg, Meteorological Institute, Hamburg, Germany
(4) KNMI, Section Observations and Modelling / Department Research Applied Models, De Bilt, Netherlands
(5) Deutscher Wetterdienst, Offenbach, Germany
(6) WMO, 7 bis avenue de la Paix, Postale N° 2300, Geneva, Switzerland

User training is an important aspect of model implementation and application, especially for routine or operational situations or for assessment studies. In order to gauge the status of user training provision within Europe, a questionnaire about mesoscale model user training was distributed to COST728 participants. This consultation resulted in 10 responses from 6 countries (Annex N). The main responses were collected in November 2006, with some updates in October 2007. Different responses from the same country represent either different institutions or different models in the same institution. Regular user training exists in Finland, Germany, Portugal and the United Kingdom. Formal user training dedicated to mesoscale models was reported not to exist in Estonia, Norway and the Netherlands; in these countries the users are trained individually, through individual supervision and consultation.

9.1 User Training in Different Countries

Estonia
Regular training courses for users of any specific model do not exist. There is a course, "Numerical Methods in Meteorology" (32 hours, for M.Sc. students), at the University of Tartu, covering the basics of numerical integration of the equations of the atmosphere and of parametrizations. Students get hands-on experience individually, supervised by researchers of the atmospheric dynamics working group while preparing their B.Sc., M.Sc. and Ph.D. theses. The number of students specialising in atmospheric modelling (one per year on average) is not sufficient to carry out a specialised course. The basic air quality model is SILAM and the basic meteorological model is HIRLAM. Most students acquire skills as model developers at B.Sc. or Ph.D. level.

France
Two training courses were reported for France. The Meso-NH training course mainly focuses on accidental releases of radioactivity and is aimed at Meso-NH users. The course takes 25 hours, covering code modification as well as visualization of results and Meso-NH-Chemistry. Half of the time is dedicated to practical work on all the topics. Web sites are available for the Meso-NH training course†, as well as for the training course for CHIMERE‡. Each new CHIMERE user can run the model for a real test case by using the web site, supported by the model documentation, in which a chapter is dedicated to this test run.

Germany
Training in at least the theory of mesoscale atmospheric modelling is given at most universities in Germany where meteorology can be studied as a major (Berlin, Bonn, Cologne, Frankfurt, Hamburg, Hannover, Karlsruhe, Kiel, Leipzig, Mainz). Some of the institutes have a research focus on mesoscale meteorology and add hands-on lectures for all students to the theoretical training. This is most fully developed at the University of Hamburg, where undergraduate students start to use mesoscale models with only basic knowledge of the theory of modelling. The course is extended within a new Master course, starting in 2008, by adding to the current lecture curriculum another 28 hours of lectures dedicated to numerical schemes and to physical modelling.

† http://mesonh.aero.obs-mip.fr/mesonh/
‡ http://euler.lmd.polytechnique.fr/chimere/

The example model used for applications is the in-house developed mesoscale model METRAS, which is also used for consultancy work in Europe. Lectures are therefore also open to consultants.

Another training institution is the German Weather Service (Deutscher Wetterdienst, DWD), which provides training in the use of the non-hydrostatic COSMO_EU of the COSMO consortium on small-scale modelling and of DWD's High Resolution Model HRM (Majewski, 2001), which is the operational NWP model in 9 countries worldwide. While the COSMO_EU user training used to be a one-day workshop of mainly practical training in operating the COSMO_EU with different run-time options, for university students and future COSMO_EU users, it has recently tended towards more specialised and theoretical training, e.g. in advanced numerical methods. The HRM user training may be a 1- to 2-week course at DWD or abroad, aimed at HRM users, often from less scientifically advanced countries, who operate the HRM. It comprises intensive theoretical lectures and hands-on training on all model aspects. It is supplemented by visits of DWD members abroad to install the HRM at the specific institutes and perform hands-on training. Groups of HRM scientists from abroad also regularly spend about a month at DWD for specific HRM work and research.

Portugal
In Portugal the University of Aveiro, in collaboration with the Institute for the Environment and Development, organises training events with the TAPM model. The majority of the participants come from institutions connected to the government and attend mainly to learn to apply and understand the model, its features and its capabilities. At their institutions these users will be able to install the model, run it and interpret the results obtained. The adequate assessment and interpretation of the model predictions on the basis of the input parameters was one of the main goals of the training events. In the future, these training events may be repeated for other air quality models operated at the University of Aveiro, such as MARS, CAMx and CHIMERE.

The Netherlands
Regular user-consultation sessions are held in which new developments are communicated together with the reasons for these changes and their consequences. Regular training courses for new users and developers are not available; if necessary, new users can follow courses at ECMWF. Model users and developers are academics, mostly with a Ph.D. in physics, so training is in effect provided by the universities. Dedicated training for special purposes is organised on an ad-hoc basis. Training of model developers is mainly on the job, i.e. information and knowledge are acquired through close collaboration with experienced developers.

Forecasters are model output users who need a profound understanding of the characteristics of NWP models. Nowadays forecasters are academics holding a degree in physics with a major in meteorology. New forecasters receive training courses either in-house or at sister institutes (e.g. ECMWF). Experience is acquired through the operational tasks. Information on model updates is given in meetings, via a Dutch magazine dedicated to meteorology and via the intranet. The in-house courses cover different aspects of meteorology and typically last from half a day to several days.

UK
Within the UK most, if not all, users of mesoscale models are within the atmospheric science research community, represented by the National Centre for Atmospheric Science (NCAS§). All main UK atmospheric science research groups are affiliated to NCAS, which also provides the infrastructure and support for its members. In addition to the research community, other organizations are also interested in the use of mesoscale models for air pollution problems. These include the Government's Department for Environment, Food and Rural Affairs (DEFRA), the Environment Agency and industrial users such as those within the power industry. Although individual research groups can employ any mesoscale model, NCAS only provides support for the Unified Model (UM), which has been developed by the UK Met Office. Details on the NCAS Computational Modelling Support can be found on the web**.

§ http://www.ac.uk
** http://ncas-cms.nerc.ac.uk/

The users (research, policy and industrial) with a particular interest in air pollution applications of mesoscale models belong to the MESOMAQ network (Mesoscale Modelling for Air Quality applications††). A list server has been set up for MESOMAQ, which aids communication and the exchange of ideas in this field‡‡. As part of MESOMAQ, new developments are underway to enable users to employ UM meteorological fields to drive the CMAQ model. In addition to some universities providing science-based training in mesoscale modelling, NCAS also organises training sessions on the Unified Model (UM). Future initiatives are underway to extend this form of training.

USA
Comprehensive training is provided within the USA by CMAS (Community Modelling and Analysis System) on the use of the Models-3 system§§. Although much of the training is held within the USA, training has recently also been provided by CMAS within Europe (e.g. supported by ACCENT). Models-3 training normally includes an introduction to CMAQ and the emissions processor SMOKE. Separate events have been held on MM5 training. With the development of the new Weather Research and Forecasting (WRF) model, training on it is regularly being offered***.

WMO
The WMO GAW Urban Research Meteorology and Environment (GURME) project Training Team has developed a training course on Air Quality Forecasting (AQF†††). The five-day course was delivered in Lima, Peru, in July 2006 for participants from Latin American countries; see Annex O for the course content. The use of satellite data for assessing aerosol properties was included in training in January 2006; the presentations are available for download at http://www.wmo.int/pages/prog/arep/gaw/urban_training_finals_en.htm. This topic is also planned to be part of appropriate future AQF courses. In December 2008 in Pune, India, participants from South Asian countries received training on AQF, including air quality impacts on health and agriculture and AQ management for policy support, from local and foreign instructors. A course is planned to be held together with AIRNow International in Shanghai in spring 2009 to enhance capabilities for air quality forecasting.

9.2 Summary on User Training

The inventory seems too incomplete to provide an overall evaluation of user training or to highlight which country might provide better user training in Europe and elsewhere. Therefore, the following summary should not be interpreted as an evaluation of the training needs and provisions in the different countries but as an effort to derive possible advantages and shortcomings from the information available.

In Germany user training is carried out at the Deutscher Wetterdienst (models COSMO_EU and HRM) and at the universities; as an example, detailed information was given for the University of Hamburg (METRAS model). In the United Kingdom training exists for the UM model, in Finland at the Finnish Meteorological Institute (SILAM model) and in Portugal at the University of Aveiro together with the Institute for the Environment and Development (TAPM model). The two courses reported for France both have web-based documentation.

The number of academic hours per course varies greatly, from 8 to 90, with an average of 33 hours. Introductory courses are held 1-2 times per year, and some more specific and individual parts up to 4 times per year. In Germany the courses are long and detailed, based on 1-2 teachers per course (plus occasional invited teachers). In other countries the specific parts of the course are divided between experts. Teachers are either atmospheric scientists (meteorologists) or computer specialists. The number of participants varies from 6-20 per

†† http://ncasweb.leeds.ac.uk/mesomaq/
‡‡ http://www.ncas.ac.uk/mailman/listinfo/mesomaq
§§ http://www.cmascenter.org/
*** http://www.wrf-model.org
††† http://www.wmo.ch/web/arep/gaw/urban.html
* http://www.cost728.org

basic course and 6-35 per country. The highest number per country is in Germany. However, relatively small Finland is remarkable with 20 participants once, or sometimes even twice, per year. Advanced or specific courses have fewer participants in general, but in the UK there are seminars with 20-40 participants held 2-3 times per year.

The objectives of the lectures include training for operational modelling and for research. Hands-on lectures constitute 33-75% of the volume of a course (a higher fraction for shorter courses). Only the courses at the University of Hamburg and DWD (the most extensive ones) end with an examination. In general the trainers expect that a skilled user can run the model independently and understands the basic construction of the model and the general input-output relationships. In Portugal the user must be able to install the TAPM model, understand its basic formulations and interpret the results obtained.

9.3 Recommendations for User Training

In general, we recommend distinguishing two levels of training courses: (1) for model users (operational and practical in character) and (2) for model developers (more emphasis on the science of the model and its application).

9.3.1 Model User

The preliminary knowledge expected from model users includes basic computer skills (data processing, data formats) and, desirably, some knowledge of processes in the atmosphere and of environmental management.

During the course they must be trained to run the existing programme, to make the necessary changes to the input, and to understand and critically evaluate the results. Such training is targeted at practical application and, to some extent, at testing the model. It is desirable that a model user be able to install the model. However, in the case of a complicated model consisting of several modules of different types, where the model package needs to be configured for a particular operating-system configuration, installation is beyond the skills of an ordinary user; the skills needed for installation then lie between those of an ordinary user and those of a developer.

9.3.2 Model Developer

The preliminary knowledge expected (in addition to the user skills) from people to be trained as model developers is an understanding of processes in the atmosphere and their mathematical representation, and of the programming language(s) used to write the code. Typically, an academic degree in atmospheric sciences is expected. Alternatively, for developing certain parts of the code (e.g. data assimilation, interfaces for users and other software modules), advanced skills in IT and programming are desirable.

The course includes overviews of the functions of the modules and the connections between them, and of the functionality and mathematical formulation of each module and how they are programmed. Hands-on exercises must be sufficient to learn to change, debug and compile the code and to control the changes. As a result, the developer must be able to follow the consequences of any changes in the code (or in the part of the code he or she is expected to develop) and critically evaluate the output: whether the changes in the code made it better (results closer to reality, a test case, an analytical solution, etc.) or worse, and by how much. In addition, some creativity in defining problems and planning further steps is expected from a developer.

10. CONCLUSIONS

This report provides an overview of the range of mesoscale models being used for air pollution dispersion applications by the COST728 partners. The emphasis of this report is on existing methodologies for mesoscale meteorological model evaluation and related applications. Results from several validation and evaluation exercises are summarized, and an overview of user training in different countries is presented. The report contains the basic information for developing recommendations for model evaluation that will be specified in the next phase of work. In addition, a formal evaluation protocol will be developed. A goal of this protocol is to lead to a model quality assurance process that is based on scientific and fundamental principles. The protocols will be target oriented and therefore different for the three time scales considered in COST728:

• Episodes (a few days).
• Single cases that concern meteorological situations relevant for determining statistical values.
• Extended periods/years, on an hour-by-hour to daily averaged basis, to determine air quality concentrations relevant for the EU Directives.

The time-scale-oriented evaluation protocol will follow the structure given in Figure 14. Test cases are currently being defined for COST728 that will allow testing of the evaluation protocol. The targets of the protocol are both scientific and user oriented. From the user's point of view the objectives are:

• To define quality standards for meteorological data usable for performing air quality assessment (concentration hindcast).
• To assess the quality of meteorological input data for forecasting air pollution for the next 2-3 days (mainly PM10 daily average, O3 hourly average and 8-hour running mean, NO2 hourly average) from the regional to the local scale (urban agglomerations, about 5×5 km2 horizontal resolution).
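The forecast quantities named above can be derived from hourly model output as sketched below. This is an illustration only; it ignores the data-capture rules and averaging-window conventions of the actual EU Directives, and the function names are our own.

```python
import numpy as np

def o3_running_8h_max(hourly_o3):
    """Maximum 8-hour running mean of hourly ozone (the quantity to
    which the EU ozone target value refers; windowing conventions of
    the Directive are not reproduced here)."""
    x = np.asarray(hourly_o3, float)
    means = np.convolve(x, np.ones(8) / 8.0, mode="valid")
    return float(means.max())

def pm10_daily_mean(hourly_pm10):
    """Daily average PM10 from the 24 hourly values of one day."""
    return float(np.mean(hourly_pm10))
```

For example, applied to 24 hourly values these return the highest of the 17 complete 8-hour windows and the simple 24-hour mean, respectively.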

Several points can be highlighted from the report:

• A web-based inventory of mesoscale models, with details on model characteristics, has been created by COST728 and is accessible from the web*. It is hoped that the database will also be useful to the wider mesoscale modelling community and will act as a focal point for users requiring technical information on mesoscale models in a summarised format. The model inventory is already additionally used by the microscale and global-scale modelling communities.

• Selected applications of mesoscale models have been reviewed. These include research and policy-related applications spanning local to regional scales. The perspective, however, has been from the evaluation viewpoint, with the aim of providing a first overview of the methods being used by COST728 partners to test and evaluate their models. University partners are mostly research oriented, while meteorological services, owing to their responsibilities, combine scientific activity with more practical and policy-oriented applications. The examples cited in this report show a clear tendency towards unified or integrated atmospheric models, sometimes referred to as 'one-atmosphere' models, where all the main aspects are treated within the same modelling framework.

• Model uncertainty can be estimated either through Monte Carlo analysis or through sensitivity studies. Examples are given where the model uncertainty has been estimated for meteorological parameters as well as for pollutant concentrations. Input parameters can be varied through sensitivity analysis to estimate the resultant change in the output concentrations. This technique has also been applied to investigate the influence of model configuration and settings, including changes in initial and boundary conditions, nesting of domains, resolution and the choice of parametrization schemes.

• A range of model performance quality indicators has been examined, including the standard error STDE, BIAS, correlation coefficient r and hit rates H. Case studies have been given to show how these indicators should be used for meteorological and air quality applications. More recent indicators such as the Relative Percentile Error (RPE) are also examined and could be more robust than the RME indicated in the first EU Air Quality Directives. The latter measure is currently being reformulated and will probably soon be replaced by a measure similar to RME (Eq. 1), but using the limit values in the denominator.

• It is important that model validation datasets be independent of those used for setting up or calibrating models. To provide confidence in model performance, a model should be tested across the range of conditions relevant to its intended applications. In order to facilitate model evaluation, datasets have been collated in a meta-database developed within COST728 (accessible via www.cost728.org). Such needs have been examined within the wider context of model applications and other European and international initiatives such as ACCENT, IGBP, BIOMOVS II and VAMP. Validation datasets need to be of known quality.

• Although the above concepts and framework are continuously being developed within this COST Action, some selected examples of individual studies are provided to show the range of approaches being adopted by the wider community. This report will form the basis for developing more common recommendations and protocols for evaluating mesoscale models. It will lead to model quality assurance procedures based on scientific and fundamental principles. It is intended that these will be employed in the joint case studies being planned by COST728.

• A limited overview of training for mesoscale modelling in some European countries has been conducted.
It was evident even from this small survey that the level of training is quite disparate within the EU. Regular user training seems to exist only in a few countries, including Finland, France, Germany, Portugal and the United Kingdom.

REFERENCES

AQEG, 2004: Nitrogen Dioxide in the United Kingdom. Air Quality Expert Group (AQEG) First Report, Department of the Environment, London, UK.†

Baklanov, A., Fay, B., Kaminski, J., 2007: Overview on existing integrated (off-line and on-line) mesoscale systems in Europe. Available from the web site http://www.cost728.org.

Beekmann, M., Derognat, C., 2003: Monte Carlo uncertainty analysis of a regional-scale chemistry model constrained by measurements from the Atmospheric Pollution Over the Paris Area (ESQUIF) campaign. Journal of Geophysical Research, 108, 8559, doi:10.1029/2003JD003391.

Bergin, M.S., Noblet, G.S., Petrini, K., Dhieux, J.R., Milford, J.B., Harley, R.A., 1999: Formal uncertainty analysis of a Lagrangian photochemical air pollution model. Environmental Science and Technology, 33(7), 1116-1126.

Bohnenstengel, S. and Schlünzen, K.H., 2006: A locality index to classify meteorological situations with respect to precipitation. Submitted to Journal of Applied Meteorology, in review.

Bonafè, G. and Jonghen, S., 2006: LAMI verification for air quality forecast and assessment purposes: case studies, special measurement campaigns, long-term evaluation. ARPA-SIM Internal Report (available from www.arpa.emr.it/sim).

Borrego, C., Schatzmann, M. and Galmarini, S., 2003: Quality assurance of air pollution models. Chapter 7 in SATURN (Studying Atmospheric Pollution in Urban Areas), ed. Moussiopoulos, Springer.

Borrego, C., Miranda, A., Costa, A., Monteiro, A., Ferreira, J., Martins, H., Tchepel, O., Carvalho, A., 2006: AIR4EU Milestone Report 6.5 - Cross-Cutting 2: Uncertainties of Models & Monitoring, July 2006, Portugal.

Borrego, C., Monteiro, A., Ferreira, J., Miranda, A.I., Costa, A.M., Sousa, M., 2005: Modelling uncertainty estimation procedures for air quality assessment. In Proceedings of the 3rd International Symposium on Air Quality Management at Urban, Regional and Global Scales (AQM), 26-30 September 2005, Istanbul, Turkey. Eds. S. Topçu, M.F. Yardim, A. Bayram, T. Elbir and C. Kahya, Vol. I, pp. 210-219.

Bossioli, E., Tombrou, M., Dandou, A., Soulakellis, N., 2007: Simulation of the effects of critical factors on ozone formation and accumulation in the greater Athens area. Journal of Geophysical Research, 112, D02309.

Carvalho, A.C., Carvalho, A., Gelpi, I., Barreiro, M., Borrego, C., Miranda, A.I., Pérez-Muñuzuri, V., 2006: Influence of topography and land use on pollutant dispersion in the Atlantic coast of the Iberian Peninsula. Atmospheric Environment, 40(21), 3969-3982.

Chang, J.C. and Hanna, S.R., 2004: Air quality model performance evaluation. Meteorol. Atmos. Phys., 87, 167-196.

Cox, R., Bauer, B.L. and Smith, T., 1998: Mesoscale model intercomparison. Bull. Am. Meteorol. Soc., 79, 265-283.

Dabberdt, W.F., Carroll, M.A., Baumgardner, D., Carmichael, G., Cohen, R., Dye, T., Ellis, J., Grell, G., Grimmond, S., Hanna, S., Irwin, J., Lamb, B., Madronich, S., McQueen, J., Meagher, J., Odman, T., Pleim, J., Peter, H., Westphal, D.L., 2004: Meteorological research needs for improved air quality forecasting: Report of the 11th Prospectus Development Team of the U.S. Weather Research Program. Bulletin of the American Meteorological Society, 85(4), 563-586.

Delle Monache, L. and Stull, R., 2003: An ensemble air-quality forecast over western Europe during an ozone episode. Atmos. Environ., 37, 3469-3474.

Deserti, M., Lollobrigida, F., Angelino, E., Bonafè, G., Minguzzi, E., Stortini, M., Cascone, C., De Maria, R., Clemente, M., Mossetti, S. and Angius, S., 2004: Modelling techniques for air quality evaluation and management in Italy: the work of the national Topic Centre. Proceedings of the 9th Int. Conf. on Harmonisation within Atmospheric Dispersion Modelling for Regulatory Purposes, 197-201.

Ebel, A., Elbern, H., Feldmann, H., Jakobs, H.J., Kessler, C., Memmesheimer, M., Oberreuter, A. and Piekorz, G., 1997: Air pollution studies with the EURAD model system (3): EURAD - European Air Pollution Dispersion Model System. Mitteilungen aus dem Institut für Geophysik und Meteorologie der Universität zu Köln, Heft 120.

†. http://www.defra.gov.uk

60 EC, 1997: Council Decision 97/101/EC, establishing a reciprocal exchange of information and data from networks and individual stations measuring ambient air pollution within the Member States, OJ L 035, 05.02.1997, 14-22, and its Amended Annexes to 97/101/EC, Commission Decision 2001/752/EC, OJ L 282, 26.10.2001, 69-76.‡ EC, 1999: First Daughter Directive, Council Directive 1999/30/EC, relating to limit values for sulphur dioxide, nitrogen dioxide and oxides of nitrogen, particulate matter, and lead in ambient, OJ L 163, 29.06.1999, 41-60.§ EC, 2002: Third Daughter Directive, Council Directive 2002/3/EC, relating to ozone in ambient air, OJ L 67, 09.03.2002, 14-30.** Elbir, T., 2003: Comparison of model predictions with the data of an urban air quality monitoring network in Izmir, Turkey; Atmospheric Environment 37 (2003) 2149–2157. Fay, B. and L. Neunhäuserer (2006) Evaluation of high-resolution simulations with the non-hydrostatic numerical weather prediction model Lokalmodell for urban air pollution episodes in Helsinki, Oslo and Valencia. Atm. Chem. and Phys.,6. SRef-ID: 1680-7324/acp/2006-6-2107, 2107-2128. Fay, B., L. Neunhäuserer, J.L. Palau, G. Perez-Landa, J.J. Dieguez, V. Ødegaard, G. Bonafe, S. Jongen, A. Rasmussen, B. Amstrup, A. Baklanov, U. Damrath (2005) Evaluation and inter-comparison of operational mesoscale models for FUMAPEX target cities. EU-project FUMAPEX Report D3.4, DWD, Offenbach, Germany, 110pp. Fay, B., L. Neunhäuserer, J.L. Palau, J.J. Dieguez, V. Ødegaard, N. Bjergene, M. Sofiev, M. Rantamäki, I.Valkama, J. Kukkonen, A. Rasmussen, A. Baklanov (2004) Model simulations and preliminary analysis for three air pollution episodes in Helsinki. EU-project FUMAPEX Report D3.3, DWD, Offenbach, Germany, 60pp. Fay, B., Neunhäuserer L., Baklanov, A., Bonafé, G., Jongen, S., Kukkonen, J., Ødegaard, V., Palau, J. L., Perez-Landa, G.,Rantamäki, M., Rasmussen, A., Sokhi, R. S., Yu, Y. 
(2006) Final results of the model inter-comparison of high-resolution simulations with numerical weather prediction models for 8 urban air pollution episodes in 4 European cities in the FUMAPEX project. Proceedings for oral pres. at 28th ITM, May 2006, Leipzig, Germany. Fay, B., Neunhäuserer, L., Ødegaard, V., Sofiev, M., Valkama, I., Kukkonen, I., Palau, J.L., Pérez-Landa, G., Bonafé, G., Rasmussen, A., Baklanov, A., 2004.: Evaluating and inter-comparing operational NWP and mesoscale models for forecasting urban air pollution episodes in FUMAPEX. 4th Annual Meeting of the European Meteorological Society. Nice, France, 27-30 Sep 2004. Fine, J.; Vuilleumier, L.; Reynolds, S.; Roth, P., and Brown, N., 2003:. Evaluating Uncertainties in Regional Photochemical Air Quality Modeling. Annual Reviews. Environmental Resources. 28:59-106. Downloaded from http://arjournals.annualreviews.org/. Galmarini, S. , Bianconi, R. , Addis, R. , Andronopoulos, S. , Astrup, P. , Bartzis, J.C., Bellasio, R. , Buckley, R. , Champion, H. , Chino, M. , D'Amours, R. , Davakis, E., Eleveld, H. , Glaab, H. , Manning, A. , Mikkelsen, T. , Pechinger, U. , Polreich, E., Pradanova, M. , Slaper, H. , Syrakov, D. , Terada, H. , van der Auwera, L. (2004). Ensemble dispersion forecasting, Part II: Application and evaluation. Atmospheric Environment, 38, 28, 4619-4632. Galmarini, S., Bianconi, R. , Klug, W., Mikkelsen, T. , Addis, R. , Andronopoulos, S. , Astrup, P. , Baklanov, A., Bartiniki, J., Bartzis, J.C., Bellasio, R., Bomyay, F., Buckley, R., Bouzom, M., Champion, H. , D'Amours, R. , Davakis, E., Eleveld, H. , Geertsema, G.T., Glaab, H., Kollax, M., Ilvonen, M., Manning, A. , Pechinger, U., Persson, C., Polreich, E., Potemski, S., Pradanova, M., Saltbones, J., Slaper, H., Sofiev, M.A., Syrakov, D. , Sørensen, J.H., van der Auwera, L., Valkama, I., Zelazny, R., 2004: Ensemble dispersion forecasting, Part I: Concept, approach and indicators. Atmospheric Environment, 38, 28, 4607-4617. 
Grell, A.G., Dudhia, J. and Stauffer, D.R., 1993: A Description of the Fifth-Generation PENN STATE/NCAR MESOSCALE MODEL (MM5), NCAR Technical Note 398+IA. National Center for Atmospheric Research, Boulder, Colorado, USA. Hanna, S.R., Chang, J.C, Strimaitis, D.G., 1993: Hazardous gas model evaluation with field observations. Atmos Environ 27A: 2265-2285.

‡ http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31997D0101:EN:NOT http://air-climate.eionet.europa.eu/announcements/country_tools/aq/aq-dem/docs/2001_752_EC.pdf § http://eur-lex.europa.eu/LexUriServ/site/en/oj/1999/l_163/l_16319990629en00410060.pdf ** http://eur-lex.europa.eu/pri/en/oj/dat/2002/l_067/l_06720020309en00140030.pdf

61 Hanna, S.R., Chang, J.C. and Fernau, M.E., 1998 Monte Carlo estimates of uncertainties in predictions by a photochemical grid model due to uncertainties in input variables. Atmospheric Environment 32 (21), 3619-3628. Hanna, S.R., Zhigang, L., Frey, H.C.; Wheeler, N., Vukovich, J., Arunachalam, S., Fernau, M., Hansen, D.A., 2001: Uncertainties in predicted ozone concentrations due to input uncertainties for the UAM-V photochemical grid model applied to the July 1995 OTAG domain. Atmospheric Environment 35, 891-903. Hass, H., 1991: Description of the EURAD Chemistry-Transport-Model Version 2 (CTM2), Mitteilungen aus dem Institut für Geophysik und Meteorologie der Universität zu Köln, Heft 83. Hass, H., Builtjes, P.J.H., Simpson, D., and Stern, R., 1997: Comparison of model results obtained with several European regional air quality models, Atmos. Environ., 31, No. 19, 3259-3279. Hass, H., van Loon, M., Kessler, C., Matthijsen, J., Sauter, F., Stern, R., Zlatev, R., Langner, J., Fortescu, V., Schaap, M., 2003: Aerosol modeling: results and intercomparison from European Regional-scale Modeling Systems. A contribution to the EUROTRAC-2 subproject GLOREAM. EUROTRAC report. Hogrefe, C., Rao, S.T., Kasibhatla, P., Hao, W., Sistla, G., Mathur, R., and McHenry, J., 2001: Evaluating the performance of regional-scale photochemical modelling systems: Part II – ozone predictions. Atmospheric Environment, 35, 4175-4188. Kukkonen J., Valkonen E., Walden J., Koskentalo T., Aarnio P., Karppinen A., Berkowicz R.and Kartastenpää R. 2001: A measurement campaign in a street canyon in Helsinki and comparison of results with predictions of the OSPM model. Atmos. Environ. 35-2, pp 231-243. Kukkonen, J., Valkonen E., Walden J., Koskentalo T., Karppinen A., Berkowicz R. and Kartastenpää R., 2000: Measurements and modelling of air pollution in a street canyon in Helsinki. Environmental Monitoring and Assessment 65 (1/2):371-379. Kunz, R. and Moussiopoulos, N. 
1995: Simulation of the wind field in Athens using refined boundary conditions. Atmos. Environ. 29, 3575-3591. Lenz, C.-J., Müller, F. and Schlünzen, K.H., 2000: The sensitivity of mesoscale chemistry transport model results to boundary values. Env. Monitoring and Assessment, 65, 287 -298. Majewski, D.,2001: HRM – User’s Guide. Document by Deutscher Weterdienst, Offenbach, Germany, 73 p. Menut, L., 2003: Adjoint modelling for atmospheric pollution sensitivity at regional scale. Journal of geophysical research, 108, 8562, doi: 10.1029/2002JD002549. Model Evaluation Group, 1994: Guideline for model developers’ and model evaluation protocol. European Community, DG XII, Major Technological Hazards Programme, Brussels, Belgium. Moussiopoulos, N. (ed.), 2003: Air Quality in Cities, SATURN/EUROTRAC-2, Subproject Final Report, Springer, Berlin, 298 pp. Moussiopoulos, N., 1995: The eumac zooming model, a tool for local-to-regional air quality studies. Meteor. Atmos. Phys., 57, 115-133. Müller, F., Schlünzen, K.H. and Schatzmann, M., 2000: Test of numerical solvers for chemical reaction mechanisms in 3D air quality models. Environmental Modelling & Software, 15, 639-646. Neunhäuserer, L., Fay, B., Raschendorfer, M. (2007): Towards urbanisation of the non-hydrostatic numerical weather prediction model Lokalmodell (LM). Bound. Lay. Met. 124, 81-97. Niemeier, U., 1997: Chemische Umsetzungen in einem hochauflösenden mesoskaligen Modell - Bestimmung geeigneter Randwerte und Modellanwendungen, Berichte aus dem Zentrum für Meeres- und Klimaforschung, Reihe A, 28. Zentrum für Meeres- und Klimaforschung der Universität Hamburg, Meteorologisches Institut. Nodop, K., Klug, W., Kulmala, A., Dop, H.V., Pretel, J., Addis, R., Fraser, G., Girardi, G.G.F., Inoue, Y., and Kelly N., 1998: ETEX: a European tracer experiment; observations, dispersion modelling and emergency response. Atmos. Environ. 32(24), 4089 – 4094. 
Nurmi, Pertti (1994) Recommendations on the verification of local weather forecasts, Technical Memorandum 200, European Centre for Medium-Range Weather Forecasts, Shinfield Park, Reading. http://www.ecmwf.int/publications/library/do/references/show?id=86094 Ødegaard, V., A. D'Allura, A. Baklanov, J. Dieguez, B. Fay, S. Finardi, H. Glaab, S.C. Hoe, M. Millan, A. Mahura, L. Neunhauserer, J.L. Palau, G. Perez, L.H. Slørdal, A. Stein, J. Havskov Sørensen (2005) Study of sensitivity of UAP forecasts to meteorological input, met.no report 13/2005. http://met.no/english/r_and_d_activities/publications/2005/13_2005/abstract_13_2005.html. Olesen, H.R., 2001: Ten years of harmonization activities: past, present and future. 7th Int. Conf. On Harmonization within Atmospheric Dispersion Modelling for Regulatory Purposes, Belgirate, Italy. National Environmental Research Institute, Roskilde, Denmark. Web page: www.harmo.org.

62 Olesen, H.R., 2007: Computing hit rate. National Environmental Research Institute, Roskilde, Denmark. Web page: www.harmo.org. Pernigotti, D., Sansone, M. and Ferrario, M., 2005: Validation of one-year LAMI model re-analysis on the Po- Valley, northern Italy. Comparison to CALMET model output on the sub-area of the Veneto Region. Proceedings of the 10th International Conference on Harmonisation within Atmospheric Dispersion Modelling for Regulatory Purposes, HARMO 10, Sissi (Crete, Greece) 17-20 October 2005. Poppe, D. and Kuhn, M., 1996: Intercomparison of the gas-phase chemistry of several chemistry and transport models. EUROTRAC –ISS Report. Roemer, M., Beekmann, M., Bergström, R., Boersen, G., Feldmann, H., Flatøy, F., Honore, C., Langner, J., Jonson, J.E., Matthijsen, J., Memmesheimer, M., Simpson, D., Smeets, P., Solberg, S., Stern, R., Stevenson, D., Zandveld, P. and Zlatev, Z., 2003: Ozone trends according to ten dispersion models. EUROTRAC-2 Special Report, EUROTRAC International Scientific Secretariat, GSF – National Research Center for Environment and Health, Munich, Germany. San José, R., Pérez, J.L., González, R.M., 2004: A mesoscale study of the impact of industrial emissions by using the MM5-CMAQ modelling system. Intern. J. of Environment and Pollution, 22, 1 – 2, 144 – 162. San José, R., Stohl, A., Karatzas, K., Bohler, T., James, P. , Pérez, J.L., 2005: A modelling study of an extraordinary night time episode over Madrid domain. Environmental Modelling and Software, 20, 5, 587-593. San José, R., Pérez, J.L., González, R.M., 2006: The use of MM5-CMAQ for an incinerator air quality impact assessment for metals, PAH, dioxins and furans: Spain case study. Lecture Notes, Large–Scale Scientific Computations, pp. 498-505. Springer-Verlag GmbH. Computer Science. Vol 3743. San Jose, R., Pérez, J.L., González, R.M., 2007: An operational real time air quality modelling system for industrial plants. Environmental Modelling and Software , 22, 297-307. 
Sathya, V., 2003: Uncertainty analysis in air quality modelling – the impact of meteorological input uncertainties. PhD Thesis. École Polytechniquye Féderale de Lausanne. Schlünzen, K.H., 1990: Numerical studies on the inland penetration of sea breeze fronts at a coastline with tidally flooded mudflats, Beitr. Phys. Atmosph., 63, 243-256. Schlünzen, K.H., 1997: On the validation of high-resolution atmospheric mesoscale models , J. Wind Engineering and Industrial Aerodynamics, 67 & 68 , 479-492. Schlünzen, K.H., 2002: Simulation of transport and chemical transformations in the atmospheric boundary layer - review on the past 20 years developments in science and practice. Meteorol. Zeitschrift, 11, 303 - 313. Schlünzen, K.H., Bigalke, K., Lüpkes, C., Niemeier, U., and von Salzen, K., 1996: Concept and realization of the mesoscale transport- and fluid-model 'METRAS', Meteorologisches Institut, Univerität Hamburg, Germany, METRAS Techn. Rep. 5, 156. Schlünzen, K.H., Builtjes, P., Deserti, M., Douros, J., Kaasik, M., Labancz, K., Matthias, V., Miranda, A.I., Moussiopoulos, N., Ødegaard, V., San Jose, R., Sokhi, R., Sofiev, M., Struzewska, J., 2007: Model evaluation methodologies for mesoscale atmospheric models. DACH 2007, 10.-14.09.2007, Hamburg, extended abstract on the web. †† Schlünzen, K.H., Hinneburg, D., Knoth, O., Lambrecht, M., Leitl, B., Lopez, S., Lüpkes, C., Panskus, H., Renner, E., Schatzmann, M., Schoenemeyer, T., Trepte, S. and Wolke, R., 2003: Flow and transport in the obstacle layer - First results of the microscale model MITRAS. J. Atmos. Chem., 44, 113-130. Schlünzen, K.H., Katzfey, J.J., 2003: Relevance of subgrid-scale land-use effects for mesoscale models. – Tellus 55A, 232–246. Schlünzen, K.H., Krell, U., 1994: Mean and local transport in air. In: Circulation and contaminant fluxes in the North Sea, Springer Verlag, Berlin, p.317-344. 
Schlünzen, K.H., Meyer, E.M.I, 2007: Impacts of meteorological situations and chemical reactions on daily dry deposition of nitrogen into the Southern North Sea. Atmospheric Environment, 41-2, 289-302. Scire, J. et al., 2000: A user’s guide for the CALMET Meteorological Model. Seaman, N.L., 2000: Meteorological modeling for air-quality assessments. Atmospheric Environment 34, 2231-2259. Shafran, P.C., Seaman, N.L. and Gayno, G.A., 2000: Evaluation of Numerical Predictions of Boundary Layer Structure during the Lake Michigan Ozone Study. Journal of Applied Meteorology 39 (3), 412–426.

†† http://meetings.copernicus.org/dach2007/download/DACH2007_A_00399.pdf

63 Sokhi, R.S., San José, R., Kitwiroon, N., Fragkou, E., Pérez, J.L., Middleton, D.R., 2006: Prediction of ozone levels in London using the MM5-CMAQ modelling system. Environmental Modelling and Software, 21, 4, 566-576. Stein, U., Alpert, P., 1993: Factor separation in numerical simulations. Journal of the Atmospheric Sciences 50 (14), 2107-2115. Steppeler, J., Doms, G., Schättler, U., Bitzer, H.W., Gassmann, A., Damrath, U., Gregoric, G., 2003: Meso- gamma scale forecasts using the nonhydrostatic model LM. Meteorology and Atmospheric Physics, 82, 75-96. Stern, J.; Flemming, R., 2004: Formulation of criteria to be used for the determination of the accuracy of model calculations according to the requirements of the EU Directives for air quality – Examples using the chemical transport model REM-CALGRID, Freie Universität Berlin, Institut für Meteorologie. Thunis, R., Galmarini, S., Martilli, A., Clappier, A., Andronopoulos, S., Bartzis, J., Vlachogianni, M., deRidder, K., Moussiopoulos, N., Sahm, P., Almbauer, R., Sturm, P., Oettl, D., Dierer, S., and Schlünzen, K.H., 2003: MESOCOM: An inter-comparison exercise of mesoscale flow models applied to an ideal case simulation. Atmos. Environ., 37, 363 - 382. Trukenmuller, A.; Grawe, D. and Schlunzen, K.H., 2004: A model system for the assessment of ambient air conforming to EC Directives. Meteo. Zeist. 13, 387-394. US EPA, 1991: Guideline for regulatory application of the Urban Airshed Model. EPA-450/4-91-013. United States Environmental Protection Agency, Research Triangle Park, NC 27711, July 1991. US EPA, 1996: Compilation of Photochemical Models’ Performance Statistics for 11/94 Ozone SIP Applications. EPA-454/R-96-004. US EPA, Office of Air Quality Planning and Standards, Research Triangle Park, NC 27711, 156 pp. web page: http://nepis.epa.gov/pubtitle.htm. van Loon, M., 2006: Evaluation of long-term ozone simulations from seven regional air quality models and their ensemble average. 
Submitted to Atmospheric Environment. van Loon, M., Roemer, M.G.M. and Builtjes, P.J.H., 2004: Model inter-comparison in the framework of the review of the Unified EMEP model. Report prepared by TNO Environment, Energy and Process Innovation, Apeldoorn, The Netherlands (forthcoming) (http://www.mep.tno.nl/EMEP_review). Vautard, R., 2006. Is regional air quality model diversity representative of uncertainty for ozone simulation? Submitted to Geophysical Research Letters. Vautard, R., Honore, C., Beekmann, M., Rouil, L., 2005: Simulation of ozone during the August 2003 heat wave and emission control scenarios, Atmospheric Environment, 39, no16, pp. 2957-2967. VDI, 2005: Environmental meteorology – Prognostic microscale wind field models – evaluation for flow around buildings and obstacles. VDI Guideline 3783, Part 9, VDI Düsseldorf, Germany. VDI, 2008: Environmental meteorology – Prognostic mesoscale wind field models – Evaluation for dynamically and thermally induced flow fields. VDI Guideline 3783, Part 7, VDI Düsseldorf, Germany, in preparation. Warner, S., Platt, N. and Haegy, J.F., 2004: Applications of user–oriented measure of effectiveness to transport and dispersion model predictions of the European tracer experiment. Atmospheric Environment, 38, 6789-6801. Warner, S., Platt, N. and Haegy, J.F., 2005: Comparison of transport and dispersion model predictions of the European tracer experiment - user oriented measures of effectiveness. Atmospheric Environment, 39, 4425-4437. Yegnan, A., Williamson, D.G., Graettinger, A.J., 2002: Uncertainty analysis in air dispersion modelling. Environmental modelling and Software, 17 (7), 639-649. Yu, Y., Sokhi, R.S. and Middleton, D.R., 2006: Estimating contributions of Agency-regulated sources to secondary pollutants using CMAQ and NAME III models. Report for the UK Environment Agency. Yu, Y., Sokhi, R. S., Kitwiroon, N., Middleton, D. R. 
and Fisher, B., 2008: Performance characteristics of MM5-SMOKE-CMAQ for a summer photochemical episode in Southeast England, United Kingdom. Atmospheric Environment, 42, 4870-4883. http://dx.doi.org/10.1016/j.atmosenv.2008.02.051 Zhang, F., Bei, N., Nielsen-Gammon, J.W., Li, G., Zhang, R., Stuart, A., Aksoy, A., 2007: Impacts of meteorological uncertainties on ozone pollution predictability estimated through meteorological and photochemical ensemble forecasts. J. Geophys. Res., 112, D04304, doi:10.1029/2006JD007429. Zhong, S.Y. and Fast, J., 2003: An evaluation of the MM5, RAMS, and Meso-Eta models at subkilometer resolution using VTMX field campaign data in the Salt Lake Valley. Monthly Weather Review, 131 (7): 1301-1322. Zimmermann, H., 1995: Field Phase Report of the TRACT Field Measurement Campaign, EUROTRAC International Scientific Secretariat, Garmisch-Partenkirchen, Germany.

ANNEX A

Glossary of Terms

Accuracy*): Extent of agreement between a value to be determined and a reference value.
Deviation from result*): Difference between a model result and the reference value.
Evaluation*)†): Assessment of the response of a model and the associated programmes with respect to its performance characteristics, including comparison with measured data, probabilistic and statistical analysis, process analysis and sensitivity analysis. Typically, comparisons are made against a set of standards.
Macroscale model: Hemispheric and global models.
Mesoscale model: Regional model covering domain scales of the order of a few 100 km to a few 1000 km.
Model*): Description of atmospheric and associated processes according to physical principles, using fundamental physical equations, assumptions, approximations, and parametrizations. The equation systems of the models described are solved by means of numerical methods with specified boundary and initial values.
Microscale model: Model which resolves the canopy layer and obstacles.
Programme*): Translation of the model onto a computer using a computer programming language.
Model calculation*): Use of the programme for a specific application.
Validation*)†): Testing the extent to (or the accuracy with) which a programme describes, within the formal scope of the model, the phenomena for which it was developed. This can include a comparison with measured data. To validate a model, scope-specific criteria need to be defined.
Verification*): Act of confirming that the model exhibits the specified behaviour in terms of output results and process analysis for a given case. A model cannot be verified in general (for all possible cases) but might be verifiable for single cases.
Precision‡): Extent of agreement between independent measurements or independent model results for the same situation.
Repeatability*): Extent of agreement of the results of two model experiments performed under the same conditions (same computer, compiler, person, input).
Representativeness: Range of validity of a measurement over space and time.
Reproducibility*): A measure of how well the model results can be reproduced, e.g. by another person, by following a procedure that defines how the model experiment is to be performed.

* definition based on VDI (2005) and clarified for the mesoscale
† clarifications of the definition given by Schlünzen (1997) are used here
‡ definition taken as an abridged version from DIN ISO 6879 and adapted for mesoscale models

ANNEX B

ENTRIES TO THE WEB BASED MODEL INVENTORY

The tables summarize all entries in the web-based model inventory* (Table 21, Table 22, Table 23) for the different scales.

Table 21. Entries in the web based model inventory – microscale models (status 05.02.2007)

The inventory columns are: meteorology | transport or chemistry & transport | meteorology & chemistry & transport (row label: microscale - COST 732). The models entered are (some appear in more than one category): ADREA, AERMOD, Chensi, M-SYS, MERCURE, Meso-NH, MICTM, MIMO, MITRAS, NAME, RCG, STAR-CD, VADIS.

Table 22. Entries in the web based model inventory – global scale models (status 05.02.2007)

Macroscale:
meteorology: GME, Hirlam, UM
transport or chemistry & transport: CHIMERE, CHIMERE (ARPA-IT), EMEP, FLEXPART, FLEXPART/A, GEOS-Chem, GOCART, IMPACT, LPDM, MATCH, MOCAGE, SILAM
meteorology & chemistry & transport: CAM-CHEM

* http://www.cost728.org

Table 23. Entries in the web based model inventory – mesoscale models (status 24.10.2007)

The inventory columns are: meteorology | transport or chemistry & transport | meteorology & chemistry & transport (row label: mesoscale - COST728). The models entered are (many appear in more than one category): ADREA, AERMOD, ALADIN-CAMx, ALADIN/A, ALADIN/PL, ARPS, AURORA, BOLCHEM, CAC, CALGRID, CALMET/CALPUFF, CALMET/CAMx, CAMx, CHIMERE, CHIMERE (ARPA-IT), CLM, CMAQ, CMAQ (GKSS), COSMO-2, COSMO-7, COSMO_IT, EMEP, ENVIRO-HIRLAM, EPISODE, EURAD-IM, FARM, FLEXPART, FLEXPART V6.4, FLEXPART/A, GESIMA, GME, Hirlam, LAMI*, LM-MUSCAT, LME, LME_MH, LOTOS-EUROS, LPDM, M-SYS, MARS (UoT-GR), MARS (UoA-PT), MATCH, MC2-AQ, MCCM, MECTM, MEMO (UoT-GR), MEMO (UoA-PT), MERCURE, Meso-NH, METRAS, MM5 (met.no), MM5 (UoA-GR), MM5 (UoA-PT), MM5 (UoH-UK), MM5 (GKSS), MOCAGE, MUSE, NAME, NHHIRLAM, OFIS, RAMS, RCG, SAIMM, SILAM, TAPM, TCAM, TREX, UM, WRF-ARW, WRF/Chem.

* LAMI is the old name of COSMO-IT

ANNEX C

ESTIMATES FOR MEASUREMENT AND MODEL UNCERTAINTY

Table 24. Estimates by Heinke Schlünzen (Meteorological Institute, University of Hamburg) and Sylvia Bohnenstengel (Max-Planck-Institut für Meteorologie, Hamburg)

Columns: variable | uncertainty estimate for the variable from measurements (includes 95 % of data; initial and comparison data) | estimate for the variable from model results (includes 95 % of data). Rows with a single value give one estimate only.

Ozone concentration at surface: ± 15 %* | factor of 2†
NOx concentration at surface: ± 15 %* | factor of 2†
NO/NO2 speciation at source: ± 10 %
VOC concentration at surface: ± 30 % | factor of 3†
Top ozone concentration: ± 25 % | factor of 2
Top NOx concentration: ± 25 % | factor of 2
Top VOC concentration: ± 50 % | factor of 2
Side ozone concentration: ± 15 % | factor of 2†
Side NOx concentration: ± 15 % | factor of 2†
Side VOC concentration: ± 30 % | factor of 3†
Major point NOx emissions: hourly base: factor of 5 | factor of 2
Major point VOC emissions: hourly base: factor of 8 | factor of 3†
Wind speed: ± 1 m/s; 15 % | ± 3 m/s to ± 5 m/s
Wind direction: ± 30° | ± 60°
Ambient temperature: ± 1.5 K | ± 4 K
Dewpoint temperature: ± 1.5 K | ± 4 K
H2O concentration (as RH): absolute ± 5 %, relative ± 10 % RH | absolute ± 10 %, relative ± 20 % RH
Vertical diffusivity (8 AM-6 PM; < 1000 m AGL): factor of 10
Vertical diffusivity (all other times and heights): factor of 5
Rainfall amount‡: daily value factor of 3.3, weekly value factor of 2.1, monthly value factor of 1.4, annual values 2 %§ | daily value factor of 4, weekly value factor of 3, monthly value factor of 1.5, annual values 5 %
Cloud cover (tenths): ± 30 % | ± 80 %
Cloud liquid water content (upper air): factor of 10000 | factor of 10000
Surface pressure: ± 170 Pa | ± 5 hPa
Area biogenic NOx emission: factor of 2** | factor of 3
Area biogenic VOC emission: factor of 2** | factor of 3
Area mobile NOx emission: factor of 3*
Area mobile VOC emission: factor of 3*
Area low point VOC emission: factor of 3*
Other area NOx emissions: factor of 3*
Other area VOC emissions: factor of 3*

* This includes uncertainties resulting from a lack of representativeness of the sites (traffic sites are left out)
† Within the urban canopy layer uncertainties will be larger
‡ It is assumed here that single-station data are used for comparison. The area-representative model results are also assumed to be compared with single-station measurements (not radar or satellite data)
§ Bohnenstengel & Schlünzen (2007): A locality index to classify meteorological situations with respect to precipitation. Submitted to Journal of Applied Meteorology
** Concerns the principal relation of biogenic emissions to vegetation (does not include temperature/humidity/radiation errors)


Table 24 (continued)

NO2, HCHOr, HCHOs, ALDs and O3-O1 photolysis rates: factor of 2* | factor of 3†

Table 25. Estimates by Ranjeet Sokhi, University of Hertfordshire

Columns: variable; uncertainty range (where given); uncertainty estimate for the variable (includes 95 % of data); sigma (log-normal unless noted).

Initial ozone concentration: factor of 3 (sigma 0.549)
Initial NOx concentration: factor of 5 (sigma 0.805)
Initial VOC concentration: factor of 5 (sigma 0.805)
Top ozone concentration: factor of 1.5 (50 %) (sigma 0.203)
Top NOx concentration: factor of 3 (sigma 0.549)
Top VOC concentration: factor of 3 (sigma 0.549)
Side ozone concentration: factor of 1.5 (sigma 0.203)
Side NOx concentration: factor of 3 (sigma 0.549)
Side VOC concentration: factor of 3 (sigma 0.549)
Major point NOx emissions: factor of 1.5 (sigma 0.203)
Major point VOC emissions: factor of 1.5 (sigma 0.203)
Wind speed: ± 10 %; factor of 1.5 (sigma 0.203)
Wind direction: ± 10°; ± 40° (sigma 20°, normal)
Ambient temperature: ± 2 K; ± 3 K (sigma 1.5 K, normal)
H2O concentration (as RH): 30 % (sigma 15.0 %, normal)
Vertical diffusivity (8 AM-6 PM; < 1000 m AGL): factor of 1.3 (30 %) (sigma 0.131)
Vertical diffusivity (all other times and heights): factor of 3 (sigma 0.549)
Rainfall amount: factor of 2 (sigma 0.347)
Cloud cover (tenths): 30 % (sigma 15 %, normal)
Cloud liquid water content: factor of 2 (sigma 0.347)
Area biogenic NOx emission: factor of 2 (sigma 0.347)
Area biogenic VOC emission: factor of 2 (sigma 0.347)
Area mobile NOx emission: factor of 2 (sigma 0.347)
Area mobile VOC emission: factor of 2 (sigma 0.347)
Area low point VOC emission: factor of 2 (sigma 0.347)
Other area NOx emissions: ± 8 %; factor of 2 (sigma 0.347)
Other area VOC emissions: ± 10 %; factor of 2 (sigma 0.347)
NO2, HCHOr, HCHOs, ALDs, and O3-O1: −15 %; factor of 2 (sigma 0.347)
Photolysis rates: factor of 2 (sigma 0.347)
CB-4 reactions 1-94: factor of 1.01 to 3.02, median 1.80, mode 2.5 (sigma 0.10 to 0.55, median 0.30, mode 0.46)

* assuming hourly emission data.
† including errors in cloudiness and upper-air concentrations.

ANNEX D

STATISTICAL MEASURES FOR METEOROLOGICAL PARAMETERS

The statistical measures most commonly used with numerical weather prediction (NWP) models are discussed below in relation to meteorological parameters from NWP simulations. For some parameters the error inevitably increases with the measured value. Normalizing the measure is one way to prevent the large errors associated with a few extreme observations from dominating the error measure.

In the following, predicted values are denoted Pi and observed values Oi, for each single site and each time i. In total, the dataset comprises N values.

D-1 ERROR (Ei)
The difference (Ei) between simulation and observation is calculated for each site and each time by

Ei = Pi − Oi  (3)

This error should be zero for an ideal forecast.

D-2 AVERAGE VALUES
The average values of the measurements (Ō) and the model results (P̄) are calculated as given in Eq. (4) and (5), respectively.

Ō = (1/N) Σ_{i=1..N} Oi  (4)

P̄ = (1/N) Σ_{i=1..N} Pi  (5)

The two averages should be the same for an ideal forecast.

D-3 STANDARD DEVIATIONS
The standard deviations of the measurements and the model results are calculated as given in Eq. (6) and (7), respectively.

σO = √[ (1/N) Σ_{i=1..N} (Oi − Ō)² ]  (6)

σP = √[ (1/N) Σ_{i=1..N} (Pi − P̄)² ]  (7)

The two standard deviations should be the same for an ideal forecast.

D-4 BIAS
The average difference (BIAS) of all Ei for each forecast length is calculated as

BIAS = (1/N) Σ_{i=1..N} Ei = P̄ − Ō  (8)

The BIAS should be zero for an ideal forecast.

BIAS gives a measure of the sign of the error of the simulations. This is particularly useful for model developers, as it points to weak parts of the model. In addition, BIAS is easily corrected with statistical post-processing. It is important to note that BIAS can vary in time and space. Breaking the data up by time of year, time of day and individual location gives more information and better possibilities for correction. Moreover, BIAS gives information on systematic model errors for particular observed values, e.g. low wind speed cases, if the data are sorted accordingly when

calculating the BIAS.

D-5 STANDARD DEVIATION OF ERROR (STDE)
The standard deviation of error (STDE) evaluates the non-systematic part of the error and is a measure of model predictability.

STDE = √[ (1/N) Σ_{i=1..N} ((Pi − P̄) − (Oi − Ō))² ]  (9)

The STDE should be zero for an ideal forecast. STDE usually increases with forecast length. If STDE has about the same magnitude throughout the simulation, this can be interpreted as the error being saturated already at the initial time due to model deficiencies.
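As a minimal illustration of Eq. (3), (4), (5), (8) and (9), the quantities can be computed from a pair of series as follows. The numbers are invented for this sketch (e.g. 2 m temperature in °C at one site over six forecast hours), and the variable names are chosen here for readability only:

```python
import math

# Invented paired series of predicted (P) and observed (O) values
P = [12.0, 13.5, 15.0, 16.5, 15.5, 14.0]
O = [11.0, 13.0, 15.5, 17.0, 16.0, 13.5]
N = len(P)

E = [p - o for p, o in zip(P, O)]                  # errors, Eq. (3)
O_mean = sum(O) / N                                # Eq. (4)
P_mean = sum(P) / N                                # Eq. (5)

# BIAS, Eq. (8): the mean error equals the difference of the means
bias = sum(E) / N
assert abs(bias - (P_mean - O_mean)) < 1e-12

# STDE, Eq. (9): standard deviation of the error (non-systematic part)
stde = math.sqrt(sum(((p - P_mean) - (o - O_mean)) ** 2
                     for p, o in zip(P, O)) / N)

print(f"BIAS = {bias:+.3f}  STDE = {stde:.3f}")
```

Both quantities would be zero for an ideal forecast; here the small positive BIAS indicates a slight systematic overprediction.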

D-6 SKILL VARIANCE (SKVAR)
The skill variance (SKVAR) evaluates the ability of the model to reproduce the variance of the observed data. It is sometimes also called the normalized standard deviation.

SKVAR = σP / σO  (10)

The SKVAR should be one for an ideal forecast.

D-7 ROOT MEAN SQUARE ERROR (RMSE)
The total error (RMSE) results from STDE and BIAS:

RMSE = √[ (1/N) Σ_{i=1..N} (Pi − Oi)² ] = √[ (1/N) Σ_{i=1..N} Ei² ] = √( BIAS² + STDE² )  (11)

Root mean square error (RMSE) is simply a combination of BIAS and STDE and expresses the total model error. The RMSE should be zero for an ideal forecast. It is a useful measure for comparing, e.g., two different models. One should bear in mind that the squaring implies that a few large errors have relatively more impact on the measure than many small errors.
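The decomposition in Eq. (11) can be checked numerically; the forecast/observation pairs below are invented for illustration:

```python
import math

# Invented forecast/observation pairs
P = [2.0, 4.0, 6.0, 8.0]
O = [1.0, 5.0, 5.0, 9.0]
N = len(P)

E = [p - o for p, o in zip(P, O)]
bias = sum(E) / N
# Note (Pi - P_mean) - (Oi - O_mean) = Ei - bias, so Eq. (9) reduces to:
stde = math.sqrt(sum((e - bias) ** 2 for e in E) / N)
rmse = math.sqrt(sum(e * e for e in E) / N)

# The decomposition RMSE^2 = BIAS^2 + STDE^2 holds exactly
assert abs(rmse ** 2 - (bias ** 2 + stde ** 2)) < 1e-12
```

For these values the errors alternate (+1, −1, +1, −1), so the BIAS vanishes and the entire RMSE is non-systematic (STDE).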

D-8 CORRELATION COEFFICIENT (r)
The correlation coefficient (r) is closely related to the STDE; however, r is dimensionless while STDE has the dimension of the measured parameter. Caution is needed when applying r to very long time series, e.g. several years: the annual cycle of temperature is usually so large that r is dominated by this large-scale structure, which hides smaller-scale errors such as those in the diurnal cycle. Caution is also needed because a large systematic error (BIAS) is not expressed by r. It is calculated as:

r = [ (1/N) Σ_{i=1..N} (Oi − Ō)(Pi − P̄) ] / (σO σP)  (12)

The correlation coefficient r should be one for an ideal forecast.

D-9 HIT RATE (H)

Hit rate (H) is defined as the fraction of the simulated data whose values lie within an acceptable range DA of the simultaneous observations. H is particularly useful as an overall measure of model performance and is one of the few measures that do not assume a Gaussian distribution of the errors. The hit rate can be interpreted as the probability of detection (POD).

H = \frac{1}{N} \sum_{i=1}^{N} n_i    with    n_i = 1 for |E_i| \le DA, n_i = 0 else    (13)

DA is the desired accuracy (Table 26). H allows the comparison of model results for quite different meteorological situations. Table 26 gives guidance on ranges that can be used for categorizing values of some weather parameters. The hit rate H should be 100% for an ideal forecast.
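For a quick check of eq. 13, the hit rate is a one-liner. The temperatures below are invented for illustration; DA = ±2 °C follows the Table 26 guidance for temperature:

```python
import numpy as np

def hit_rate(p, o, da):
    """Fraction of forecasts whose absolute error is within DA (eq. 13)."""
    e = np.abs(np.asarray(p, float) - np.asarray(o, float))
    return np.mean(e <= da)

# Invented 2 m temperatures (deg C), for illustration only
obs = np.array([10.1, 12.4, 15.0, 9.8, 11.2])
mod = np.array([11.0, 15.1, 14.2, 9.0, 13.9])

h = hit_rate(mod, obs, da=2.0)  # 3 of 5 errors are within +/-2 deg C -> 0.6
```

Because H is just a counting measure, it is insensitive to how far outside DA the misses fall, which is exactly why it makes no Gaussian assumption.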

Table 26. Desired accuracy DA (values taken from Cox et al., 1998)*

Variable                    Desired accuracy DA
Temperature (°C)            ± 2
Dew point depression (°C)   ± 2
Wind speed (m s-1)          ± 1 for ff < 10 m s-1; ± 2.5 for ff > 10 m s-1
Wind direction              ± 30°
Pressure (hPa)              ± 1.7

Precipitation categories are suggested to have a desired accuracy dependent on the precipitation amount (0-1mm/day, 1-5mm/day, 5-10mm/day, 10-25 mm/day, above 25 mm/day). A similar approach can be used for wind speed.

D-10 HIT RATIO (HR)

For precipitation the hit ratio HR is also a very useful measure. It describes the model's capability of simulating discrete events (e.g. rain or no rain). Using the contingency table counts a-d (Table 27), HR can be calculated using eq. 14. The optimum value for HR is 1.

HR = \frac{a + d}{a + b + c + d}    (14)

Table 27. Contingency table (yes means value inside interval DA)

                     observed event yes   observed event no
forecast event yes           a                    b
forecast event no            c                    d

For the calculation of an overall HR, e.g. a hit in all categories of the event, the accuracy ranges used in the contingency tables are suggested to be constant for parameters such as temperature, dew point temperature and pressure. For wind speed and precipitation the accuracy ranges should be larger for large amounts/speeds, as the variability in the observed values increases with increasing amounts/speeds. For precipitation the following ranges are suggested: 0 to 1 mm/day, 1 to 5 mm/day, 5 to 10 mm/day, 10 to 25 mm/day, above 25 mm/day. For wind the suggested ranges are 0 to 2 m/s, 2 to 5 m/s, 5 to 10 m/s, 10 to 20 m/s, above 20 m/s. The optimum value for HR is 1.

D-11 FALSE ALARM RATIO (FAR)

The false alarm ratio (FAR) should also be calculated to ensure that the frequency of extreme events is not over-predicted. Using the classification given in Table 27, FAR can be calculated as:

FAR = \frac{b}{a + b}    (15)

Values for FAR range from 0 to 1; the optimal score is 0.

______

* Values are also used in Schlünzen and Katzfey (2003), Trukenmüller et al. (2004), Schlünzen and Meyer (2006)

An alternative definition is:

FAR1 = \frac{b}{b + d}    (16)

Again, the optimum value is 0.

D-12 DIRECTION WEIGHTED WIND ERROR (DIST)

The direction weighted wind error (DIST) takes into account both wind intensity and direction. With u_i and v_i being the wind vector components (either measured (O_i) or predicted (P_i)) at a specific site and time, and N being the total number of observations, DIST is defined as:

DIST = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left[ (u_{Pi} - u_{Oi})^2 + (v_{Pi} - v_{Oi})^2 \right] }    (17)

The DIST should be zero for an ideal forecast.
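Eq. 17 can be sketched directly from the wind components. As a usage note, DIST penalizes direction errors even when the speed is perfect; the values below are illustrative:

```python
import numpy as np

def dist(u_p, v_p, u_o, v_o):
    """Direction weighted wind error (eq. 17):
    RMS magnitude of the vector wind difference."""
    du = np.asarray(u_p, float) - np.asarray(u_o, float)
    dv = np.asarray(v_p, float) - np.asarray(v_o, float)
    return np.sqrt(np.mean(du ** 2 + dv ** 2))

# A pure 30 degree direction error at 5 m/s, with perfect speed:
speed, ang = 5.0, np.deg2rad(30.0)
d = dist([speed * np.cos(ang)], [speed * np.sin(ang)], [speed], [0.0])
# geometrically d = 2 * speed * sin(15 deg), i.e. about 2.59 m/s
```

This is the advantage of DIST over separate speed and direction scores: the vector difference captures both error sources in one physically meaningful number.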

D-13 MEAN ABSOLUTE ERROR (MAE)

MAE = \frac{1}{N} \sum_{i=1}^{N} |E_i|    (18)

The MAE takes only non-negative values and is less sensitive to large errors than the root mean square error. The MAE should be zero for an ideal forecast.

D-14 PROBABILITY OF DETECTION (POD)

In order to evaluate the model's ability to forecast a particular event, e.g. rain (event yes) / no rain (event no), the probability of detection POD (Nurmi, 1994) is commonly used. Using the classification given in Table 27, POD is defined as:

POD = \frac{a}{a + c}    (19)

The value of POD lies between 0 and 1; the optimal score is 1. POD can be interpreted as the number of correct alarms in relation to the number of occurring events.

Combining POD and FAR gives:

POD + FAR > 1: systematic overestimation
POD + FAR = 1: no bias                        (20)
POD + FAR < 1: systematic underestimation

D-15 HANSSEN-KUIPERS SKILL SCORE (KSS)

POD and FAR1 are combined to give the Hanssen-Kuipers skill score (KSS):

KSS = POD - FAR1    (21)

The KSS ranges from -1 to 1; the optimal score is 1.
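The categorical scores of sections D-10, D-11, D-14 and D-15 all derive from the four counts of Table 27 and can be computed together. The counts below are invented for illustration, and the hit ratio is taken as the fraction of all a+b+c+d cases:

```python
def contingency_scores(a, b, c, d):
    """Categorical scores from the Table 27 counts:
    a hits, b false alarms, c misses, d correct rejections."""
    n = a + b + c + d
    pod = a / (a + c)        # eq. 19, probability of detection
    far = b / (a + b)        # eq. 15, false alarm ratio
    far1 = b / (b + d)       # eq. 16, alternative definition
    hr = (a + d) / n         # eq. 14, hit ratio (fraction correct)
    kss = pod - far1         # eq. 21, Hanssen-Kuipers skill score
    return {"POD": pod, "FAR": far, "FAR1": far1, "HR": hr, "KSS": kss}

# Invented counts for a rain / no-rain forecast, for illustration only
scores = contingency_scores(a=30, b=10, c=5, d=55)

# eq. 20 check: 40 rain forecasts against 35 observed rain events,
# so POD + FAR > 1 signals systematic overestimation of the event
overestimated = scores["POD"] + scores["FAR"] > 1.0
```

A usage note: FAR and FAR1 answer different questions (fraction of alarms that were false versus fraction of non-events that triggered an alarm), which is why KSS uses FAR1.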

ANNEX E

STATISTICAL MEASURES FOR CONCENTRATIONS

This section includes the statistical measures that are frequently used for evaluating concentration forecasts. Some of the parameters are already defined in Annex D; the formulas are repeated here only for clarity. As in Annex D, predicted values are denoted by P_i, observed values by O_i, and N values are considered.

Table 28. Quality indicators for air quality model performance evaluation

Average observed value (4): \bar{O} = \frac{1}{N} \sum_{i=1}^{N} O_i

Average modelled value (5): \bar{P} = \frac{1}{N} \sum_{i=1}^{N} P_i ; ideal value: same as \bar{O}

Standard deviation of measurements (6): \sigma_O = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (O_i - \bar{O})^2 }

Standard deviation of model results (7): \sigma_P = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (P_i - \bar{P})^2 } ; ideal value: same as \sigma_O

Average normalized absolute BIAS (22): ANB = \left| \bar{P} - \bar{O} \right| / \bar{O} ; ideal value: 0.0

Mean normalised BIAS (23): MNB = \frac{1}{N} \sum_{i=1}^{N} \frac{P_i - O_i}{O_i} ; ideal value: 0.0

Mean normalised error (24): MNE = \frac{1}{N} \sum_{i=1}^{N} \frac{|P_i - O_i|}{O_i} ; ideal value: 0.0

Standard deviation of error (9): STDE = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} [ (P_i - \bar{P}) - (O_i - \bar{O}) ]^2 } ; ideal value: 0.0

Fractional bias (25): FB = \frac{\bar{P} - \bar{O}}{0.5 (\bar{P} + \bar{O})} ; ideal value: 0.0

Geometric mean bias (26): MG = \exp\left( \frac{1}{N} \sum_{i=1}^{N} \ln P_i - \frac{1}{N} \sum_{i=1}^{N} \ln O_i \right) ; ideal value: 1.0

Geometric variance (27): VG = \exp\left( \frac{1}{N} \sum_{i=1}^{N} (\ln P_i - \ln O_i)^2 \right) ; ideal value: 1.0

Skill variance (10): SKVAR = \sigma_P / \sigma_O ; ideal value: 1.0

Root mean square error (11): RMSE = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (P_i - O_i)^2 } = \sqrt{ BIAS^2 + STDE^2 } ; ideal value: 0.0

Normalized mean square error (28): NMSE = \frac{ \frac{1}{N} \sum_{i=1}^{N} (P_i - O_i)^2 }{ \bar{P} \, \bar{O} } ; ideal value: 0.0

Correlation coefficient (12): r = \frac{ \frac{1}{N} \sum_{i=1}^{N} (O_i - \bar{O})(P_i - \bar{P}) }{ \sigma_O \sigma_P } ; ideal value: 1.0

Coefficient of variation (29): CV = STDE / \bar{O} ; ideal value: 0.0

Fraction of predictions within a factor of two of observations (30): FAC2 = \frac{1}{N} \sum_{i=1}^{N} n_i with n_i = 1 for 0.5 \le P_i / O_i \le 2, n_i = 0 else ; ideal value: 1.0

Hit rate (31): H = \frac{1}{N} \sum_{i=1}^{N} n_i with n_i = 1 for |E_i| / O_i \le A or |E_i| \le DA, n_i = 0 else, where A is the desired relative accuracy and DA the minimum desired absolute accuracy ; ideal value: 1.0

Index of agreement (32): IOA = 1 - \frac{ \sum_{i=1}^{N} (P_i - O_i)^2 }{ \sum_{i=1}^{N} ( |P_i - \bar{P}| + |O_i - \bar{O}| )^2 } ; ideal value: 1.0

Unpaired peak concentration accuracy (33): A_u = \frac{ P_{max} - O_{max} }{ O_{max} }, where P_{max}, O_{max} are unpaired maxima (no timing / spacing considered) ; ideal value: 0.0

Spatially-paired peak concentration accuracy (34): A_s = \frac{ P_{max,x} - O_{max,x} }{ O_{max,x} }, where P_{max,x}, O_{max,x} are maxima paired in space (but not in time) ; ideal value: 0.0
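Several of the Table 28 indicators take only a few lines to compute. The sketch below (plain NumPy; the sample concentrations are invented for illustration) implements FB, NMSE, MG, FAC2 and IOA as tabulated; MG assumes strictly positive data:

```python
import numpy as np

def fb(p, o):
    """Fractional bias (eq. 25)."""
    return (p.mean() - o.mean()) / (0.5 * (p.mean() + o.mean()))

def nmse(p, o):
    """Normalized mean square error (eq. 28)."""
    return np.mean((p - o) ** 2) / (p.mean() * o.mean())

def mg(p, o):
    """Geometric mean bias (eq. 26); data must be strictly positive."""
    return np.exp(np.mean(np.log(p)) - np.mean(np.log(o)))

def fac2(p, o):
    """Fraction of predictions within a factor of two (eq. 30)."""
    ratio = p / o
    return np.mean((ratio >= 0.5) & (ratio <= 2.0))

def ioa(p, o):
    """Index of agreement (eq. 32), as tabulated."""
    num = np.sum((p - o) ** 2)
    den = np.sum((np.abs(p - p.mean()) + np.abs(o - o.mean())) ** 2)
    return 1.0 - num / den

# Invented hourly concentrations, for illustration only
obs = np.array([20.0, 35.0, 50.0, 80.0, 65.0])
mod = np.array([25.0, 30.0, 60.0, 70.0, 90.0])
```

A usage note: FB and MG/VG complement each other; the logarithmic measures are preferred for concentration data spanning orders of magnitude, where a single high value would dominate FB and NMSE.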

ANNEX F

EVALUATION OF DIFFERENT WAVELENGTHS

Given two curves, for example of observed and simulated wind direction, one can calculate the standard deviation of the difference between them in the form of the standard deviation of the error (STDE). When the simulated wind direction has a phase displacement (an error) of 60° compared to the observations, the STDE approaches the value obtained when the simulation is a straight line (Figure 16). Intuitively there is more information in a curve with the correct amplitude displaced by 60° than in a field without any variation. This shows that STDE is not always the best measure.

Horizontal displacement of minimum/maximum values might occur as a result of physiographic properties that are not fully resolved in the numerical model. A similar exercise with the correlation coefficient instead of STDE gives the same result. How to deal with this property of the statistical measures is not clear; several other measures have been suggested, but the traditional statistical measures are still widely used. Comparing frequency distributions (or doing spectral analysis) of observations and model results is one method of evaluating models, but it does not measure how well the model captures the day-to-day observed values at a specific location. When using, for example, the hit rate it is at least ensured that the simulated data are within an allowed difference of the measured values.
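The Figure 16 exercise is easy to reproduce numerically. The sketch below uses a unit-amplitude sine sampled every 10° (illustrative, not the report's data); it shows the STDE growing with phase displacement until, at 60°, it matches the STDE obtained against a flat line:

```python
import numpy as np

def stde(p, o):
    """Standard deviation of error (eq. 9)."""
    return np.sqrt(np.mean(((p - p.mean()) - (o - o.mean())) ** 2))

t = np.deg2rad(np.arange(0, 360, 10))
obs = np.sin(t)  # "observed" curve: one full cycle

for shift in (15, 30, 45, 60):
    sim = np.sin(t - np.deg2rad(shift))
    print(shift, stde(sim, obs))
# at 60 deg the shifted curve and the flat line give the same STDE (about 0.71),
# even though the shifted curve retains the full observed amplitude
print("flat", stde(np.zeros_like(t), obs))
```

Analytically, the shifted-sine STDE is sqrt(2)*|sin(shift/2)| while the flat-line STDE equals the standard deviation of the observations (1/sqrt(2) for a unit sine), so the two coincide exactly at a 60° displacement, which is the point the annex makes.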

Figure 16. Standard deviation of error calculated for the difference between observations (solid) and simulations (broken) when the simulations have a phase displacement of 15° (top), 30° (second plot), 45° (third plot) and 60° (fourth plot) relative to the observations, and if the simulation is a straight line (lower)

ANNEX G

DETAILED EVALUATION RESULTS FROM FUMAPEX (FP 5 PROJECT)

A summary of the results from the various intercomparisons is provided here, separated into the episode evaluation and the long-term evaluation.

Episode Evaluation Results The main results for the episodes are:

• Poor forecast of meteorological inversions in most models: - underpredicted inversion strength dominant in Northern European areas with inversion-induced episodes (Helsinki Dec 1995; Oslo Jan 2003); - for all models: underpredicted inversion strength also in Helsinki spring dust episodes; - for most models: underpredicted inversion strength in the Po valley for inversion-induced winter episodes and for night-time inversions in summer ozone episodes. • Overpredicted surface and 2 m temperatures in many models/cases for extreme inversion episodes. • Underpredicted stability (a key meteorological episode-predictor) for all models in cases of (very) stable stratification/inversions, with excessive vertical exchange. • Overpredicted 10 m wind speeds in calm or low wind conditions for all models (an episode-predictor especially for inversion-induced episodes; coincides with / leads to reduced inversion strength and stability). • False wind direction forecasts, often in combination with inversions and overpredicted wind speeds, may lead to erroneous temperature advection, especially in regions with large temperature gradients: - in mountain-valley systems like the Po valley, or - in coastal areas with frozen land / open sea in Scandinavian winters (e.g. Helsinki Dec 1995 for FMI HIRLAM, DWD COSMO_EU, CEAM RAMS, partly DMI HIRLAM; Oslo Jan 2003 for DWD COSMO_EU (and partly DNMI MM5)). • Successful forecast of the episode-predictors and various wind field structures determining pollutant concentrations (e.g. rising maximum temperatures, drainage winds, time-dependent convergence lines inland and at sea, combined sea breeze and upslope circulations with pollutant injections) for the Valencia ozone episode, for both participating models (CEAM RAMS, DWD COSMO_EU).
• A very complex wind situation with poor predictability seems to prevail for the city of Bologna (2002 episodes) due to the very variable occurrence and superposition of local-scale, mesoscale and synoptic-scale influences between an Apennine valley-mountain circulation and a larger Po valley circulation. Both participating models (ARPA COSMO_IT, DMI HIRLAM) succeed in forecasting nocturnal drainage winds and mountain-valley circulations in the Apennines; ARPA COSMO_IT also forecasts the sudden wind speed increases marking interruptions or the end of episodes.

Long-term Model Evaluation Results

The meteorological services ARPA-SIM, DMI, DNMI, DWD and FMI participated in the longer-term statistical evaluation (usually 1 year) of 2 m temperature (Figure 17), 10 m wind and 2 m relative humidity, performed to analyse a more representative data sample and place the episode results in a longer-term context. In particular, the results for the 50 stations for COSMO_EU/COSMO_IT were also investigated in grouped categories: urban, suburban, rural, Po valley and Apennine mountains (Section 7.1.7).

The different models perform well or poorly depending on: • The chosen station or group of stations. • The meteorological parameter and time of day (forecast hour). • The chosen statistical score (BIAS or RMSE). • Partly also the season, for the same parameter.

Comparing the size of the parameter deviations from the observations during the episodes with the evaluated one-year statistical scores for the models, some unusually poor model performance is detected for some of the episodes. This also points to these episodes showing some extreme behaviour of the meteorological parameters. The episodes and models concerned are:

• Helsinki Dec 1995 (all models). • Oslo Jan 2003 (DNMI MM5, DMI HIRLAM, DWD COSMO_EU = all models participating). • Bologna Jan (DMI HIRLAM, ARPA COSMO_IT = all models participating).

Figure 17. Example statistics of the model inter-comparison of 2 m temperature RMSE (°C) versus forecast hour: summer (a) and winter (b). If only one value is given in the position of forecast hours 22 to 27, it is valid as an average over the whole 48 h forecast (for all DNMI results only 24 forecast hours are available). Piet(rur), Panig(sub) and Met(urb) are simulated with COSMO_IT; Blind 1, 3, 10 and Tryv 1, 3, 10 with DNMI MM5 (1 and 3 km) and DNMI HIRLAM (H10 = 10 km), as are H10 Helsinki and Copenhagen

Model performance for episode forecasting seems to depend mainly on the model's ability to forecast the specific meteorological episode features in sometimes complex locations, even for extreme meteorological conditions, and on the station representativeness and observation quality. The performance depends much less on whether the location is urban, suburban or rural.

ANNEX H

DETAILS ON THE EVALUATION OF COSMO_IT FOR AIR QUALITY AND ASSESSMENT PURPOSES

Meteorological parameters playing a key role in the Po Valley winter pollution episodes are temperature inversions and wind fields. Summer episodes are characterised by gradually increasing maximum temperature values and very weak winds, including local circulations. The key parameters evaluated during some pollution episodes are temperature inversion, temperature and relative humidity at 2 m, the surface energy budget, wind speed and direction at 10 m, wind profiles, cloudiness and turbulent kinetic energy. The experimental data used to evaluate the model were collected during a micrometeorological field campaign carried out at San Pietro Capofiume, a rural site in the Po Valley, 25 km from Bologna, during winter 2004/2005 and spring 2005. Solar and infrared radiation data have been collected by means of a radiometer; sensible heat flux, friction velocity and Monin-Obukhov length data have been collected by means of a sonic anemometer. The site is a flat grassland area surrounded by farmland.

The data from this campaign have been compared with the routine analysis of COSMO_IT (LAMA), which uses data from standard meteorological stations for assimilation. At the same rural location soil moisture data have been collected by means of a TDR probe and compared with the model soil moisture. Soil moisture is a relevant parameter for air quality purposes: it modifies the partitioning of the surface heat fluxes between sensible and latent heat, which determines the temperature profile of the PBL and consequently its stability. In some chemical transport models (CTM) the soil moisture also plays a key role in the calculation of the resuspension of aerosols.

The evaluation of the standard meteorological parameters (temperature, wind speed, wind direction and humidity) for the 48 h forecasts starting at 00 UTC was performed for a long-term period covering the four seasons (April 2003 - March 2004). The evaluation was separated by area, using data (Table 29 - Table 31) from the following stations:

• Bologna (Bonafè and Jonghen, 2006): 3 stations (1 rural, 1 urban, 1 suburban). • Emilia Romagna (Bonafè and Jonghen, 2006): 32 temperature stations (28 in the Po Valley + 4 in the Apennines), 13 wind stations (12+1), 1 radiosounding station. • Veneto, Piemonte and Emilia-Romagna (Pernigotti et al., 2005): 42 wind stations. For the Veneto test area the COSMO_IT analyses were also compared with the CALMET (Scire et al., 2000) wind fields (diagnostic fields based on 21 wind stations).

Table 29. Summary of the COSMO_IT results for the pollution episodes

Parameters/features | Where / when | COSMO_IT results
Inversion | Po Valley | strongly underestimated
T2m and RH2m daily cycles | close to Bologna, summer | too smooth
T in the PBL | close to Bologna, summer | underestimated
Urban heat island | urban areas | not reproduced
Wind at 10 m | Po Valley | more frequently over- than underestimated
             | valleys in the Alps | always underestimated
Wind canalization | valleys in the Apennines, night, summer | underestimated, not very accurate
                  | valleys in the Alps, night, summer | 1.1 km: well reproduced, but often it lasts too long
Cloudiness | | sometimes different with different resolution
T2m over the Alps | winter episode | 7 km: 15-18°C underestimated; km: 4-8°C underestimated (but prognostic T does not change)
TKE | close to Bologna, winter episode | increases with finer resolution
Spatial variability of the vertical velocity | close to Bologna, winter episode | increases with finer resolution

79 Table 30. Summary of the results of the evaluation of COSMO_IT analysis against observed data collected during special campaigns

Parameters/features | When | COSMO_IT results
Friction velocity | | mean daily course is reproduced
                  | late afternoon, evening | ~0.2 m/s overestimation
Monin-Obukhov length | | overpredicts the occurrence of unstable conditions; underpredicts the occurrence of stable conditions
                     | evening | stabilization of the surface layer often occurs too late
SHF | afternoon | random errors: ~200 W/m2 instead of ~100 W/m2
    | night | [-20, -50] W/m2 instead of [0, -10] W/m2
Infrared radiation budget | | when observed is [-80, 0] W/m2, simulated is often [-120, -75] W/m2
Visible radiation budget | | good agreement between observations and simulation; more than half of the errors are positive and very small
Soil moisture | | good relation between observation and simulation; systematic overprediction of ~0.03 m3/m3

Table 31. The most relevant errors detected through the long-term evaluation

Parameters | Where | When | COSMO_IT results
T2m | Bologna urban area | night | strong underestimation (~3°C)
    |                    | evening, winter/spring/summer | large RMSE (~4°C)
    |                    | summer/winter | underestimation (~2.5°C)
    | Apennines | spring | too strong daily cycle
    |           | winter | very large RMSE (~6°C)
    | Po Valley | summer | too smooth daily cycle
    | all stations | spring/winter | large RMSE (>3°C)
    |              | summer | RMSE grows with verification time
RH2m | Bologna urban area | night, spring/summer | overestimation (>20%), large RMSE (~30%)
     | Apennines | summer | overestimation (~20%)
     | Po Valley | summer | too smooth daily cycle
     | all stations | summer | RMSE grows with verification time
Wind speed | Bologna urban area | | overestimation (~1 m/s)
           |                    | morning, spring | large RMSE (>2 m/s)
           | at the mouth of the Reno Valley | summer | underestimation (~1 m/s), large RMSE (~2 m/s)
           |                    | morning, spring | large RMSE (>2 m/s)
           | Apennines | | large RMSE (>2 m/s)
           |           | afternoon, summer | underestimation (~1 m/s)
           |           | night, autumn | underestimation (~1 m/s)
           | Po Valley | winter | large RMSE (~2.5 m/s)
           |           | autumn/winter | overestimation (~0.5 m/s)
Wind direction | at the mouth of the Reno Valley | night, autumn/winter/spring | very large RMSE (~90°)
               |                    | night | SSW winds poorly forecast
               | all stations | | large RMSE (~70°)
               | Apennines | noon | NNE winds poorly forecast
               | Po Valley | | N wind frequency overestimated
Inversions | eastern part of the Po Valley | summer | strong underprediction
           |                               | autumn/winter/spring | underprediction
           | central part of the Po Valley | summer | underprediction

The scores calculated are the mean absolute error (MAE), BIAS and RMSE; for wind direction the mean absolute error, the hit rate HR for 45° sectors, and wind roses for the wind speed classes <1 m s-1, <2 m s-1, <4 m s-1, <7 m s-1, <10 m s-1 and <20 m s-1. The evaluation is separated for the 1st, 2nd and 3rd day of forecast. For the statistical evaluation of wind in comparison with CALMET results and measured data for the Veneto area, BIAS (m/s), RMSE (m/s), skill variance SKVAR and the correlation coefficient r were used, together with the wind direction hit rates for ±30° (H30°) and ±60° (H60°). The DIST score and the combined hit rates H for wind speed and direction (H0.5m/s&30° and H1m/s&60°) were calculated to take into account both wind speed and direction.

Table 32 summarizes the results of the comparison between COSMO_IT and CALMET, reporting the statistical parameters at the verification stations: the first three columns under COSMO_IT refer to the Po Valley domain (42 stations); the fourth column refers to COSMO_IT evaluated only at the surface stations inside the CALMET domain, to be compared with the CALMET column, which refers to the CALMET model (21 stations).

Table 32. Medians and skill scores for the distributions of some statistical values for the comparison between wind speed data and COSMO_IT output over the test areas of Piemonte (PIE), Emilia-Romagna (EMR), Veneto (VEN), and CALMET

                 COSMO_IT                               CALMET
                 PIE      EMR      VEN      CALMET domain
# DATA           8687     8327     8758     8758         8759
MODEL MEAN       2.15     2.49     2.69     2.61         2.18
STATIONS MEAN    1.71     2.17     1.92     1.92         1.92
MODEL DEV        1.48     1.56     1.70     1.68         1.46
STATIONS DEV     1.21     1.56     1.37     1.44         1.44
BIAS (m/s)       0.52     0.31     0.74     0.69         0.31
RMSE (m/s)       1.60     1.44     1.63     1.58         1.40
SKVAR            1.32     1.02     1.24     1.17         1.03
r (correlation)  0.39     0.62     0.65     0.65         0.66
H30°             0.31     0.43     0.42     0.44         0.42
H60°             0.52     0.66     0.70     0.70         0.70
DIST (m/s)       2.13     2.07     2.04     2.00         1.79
H0.5m/s&30°      0.16     0.21     0.19     0.20         0.23
H1m/s&60°        0.36     0.44     0.41     0.43         0.51

(Median values are based on all the available surface stations.)

As can be seen from Table 32, COSMO_IT systematically overestimates the wind speed, especially in the Veneto and Piemonte regions; CALMET performs slightly better but still overestimates the wind. It has to be noted that for Piemonte the correlation coefficient (r) and the hit rates (H30° and H60°) are particularly poor; in particular, COSMO_IT does not seem to model the north-western wind component correctly.

The surface energy budget and dispersion parameters were verified against data from measurement campaigns. Results show that the mean daily course of the friction velocity (Figure 18a,b) and sensible heat fluxes (Figure 18c,d) are well reproduced, but the sensible heat flux is overestimated. Soil moisture in the upper ground level is in good agreement with data during wet periods (Figure 19), but is overestimated during dry periods (BIAS ~0.03 m3/m3 during summer, ~0.08 m3/m3 during winter).



Figure 18. Daily course of the quartiles (minimum, 1st quartile, median, 3rd quartile and maximum) of the friction velocity (a, b) and the sensible heat flux (c, d), simulated (a, c) and observed (b, d), for a rural site in the Po Valley for 27.03.05-11.04.05

Figure 19. Water content [m3/m3] in the upper 10 cm of the soil, observed (black dots) and simulated by COSMO_IT (assimilation cycle, red line), at a rural site in the Po Valley

ANNEX I

DETAILED RESULTS OF THE EVALUATION OF CMAQ FOR AN EPISODE OVER UK

Figure 20 shows scatter plots of measured–modelled O3 pairs (a) and NO2 pairs (b) for all the modelling hours and sites for the 3-km resolution simulation. Table 33 and Table 34 summarize the corresponding O3 and NO2 performance statistics for both 9-km and 3-km resolution simulations.


Figure 20. Measured versus modelled O3 (a) and NO2 (b) concentration for all data pairs. Modelled O3 and NO2 concentrations were from the first model level (about 14 m AGL)

On average, the model slightly under-predicts O3 concentrations, with a BIAS of -3.6 μg m-3 and a MNB of 30% for the 3 km resolution and a BIAS of -0.8 μg m-3 and a MNB of 39.4% for the 9 km resolution (Table 33). The mean absolute error (MAE) and mean normalized error (MNE) values over all hours and sites are 24 μg m-3 and 60%, respectively, for the 3 km resolution and 25 μg m-3 and 70%, respectively, for the 9 km resolution. In total, nearly 81% of all modelled concentrations are within a factor of 2 of the corresponding measured O3 concentrations. The 9-km and the 3-km resolution simulations give comparable model performance for most of the stations, but better performance is achieved by the 3-km resolution simulation at suburban and urban centre stations, where the effects of local sources are more important.

Table 33. Quantitative performance statistics for near surface O3 predictions with both 9-km and 3- km resolution for the innermost domain, i.e. domain 4

Scores           All sites      Rural          Sub urban      Urban BG       Urban centre
                 3 km   9 km    3 km   9 km    3 km   9 km    3 km   9 km    3 km   9 km
Obs (μg m-3)       73.3           88             64.6           81.8           56.5
Mod (μg m-3)     69.6   72.4    82.8   81.7    66.3   71.7    73     74.3    57     62.7
Corr. coeff. r   0.69   0.66    0.76   0.73    0.65   0.63    0.69   0.69    0.56   0.53
BIAS (μg m-3)    -3.6   -0.8    -5.3   -6.3    1.7    7.0     -8.8   -7.5    0.5    6.1
MNB%             30.4   39.4    8.4    8.1     67.2   85.4    11.3   12.8    41.1   60.3
MAE (μg m-3)     24.2   25.2    21.5   22.3    24.6   26.6    25.7   25.4    23.8   25.6
MNE%             59.8   69.9    29.6   31.0    94.6   108.4   43.3   43.1    75.6   87.1
RMSE (μg m-3)    31.2   31.9    28.7   29.8    30.6   31.7    33.3   33.0    30     31.7
σo (μg m-3)        42.5           41.2           40             44.3           34.7
σp (μg m-3)      28.1   26.9    22.8   22.4    27.3   25.9    27     26.9    28.6   28

In the case of NO2 the model shows an under-prediction of NO2 concentrations, with a BIAS of -11.8 μg m-3 and a MNB of -14.7% for the 3 km resolution and a BIAS of -13.6 μg m-3 and a MNB of -14.0% for the 9 km resolution. The mean absolute error (MAE) and mean normalized error (MNE) values over all hours and sites are 18.5 μg m-3 and 50.5%, respectively, for the 3 km resolution and 20.0 μg m-3 and 55.7%, respectively, for the 9 km resolution. Over all the sites, nearly 62% of modelled concentrations are within a factor of 2 of the corresponding measured values. In general the 3-km resolution simulation gives better predictions than the 9-km resolution simulation, especially for urban areas.

Table 34. Quantitative performance statistics for near surface NO2 predictions with both 9-km and 3-km resolution for the 3-km innermost domain, i.e. domain 4

Statistics       All sites      Rural          Sub urban      Urban BG       Urban centre
(μg m-3)         3 km   9 km    3 km   9 km    3 km   9 km    3 km   9 km    3 km   9 km
Obs                40.7           17.8           38.2           41.1           61.5
Mod              28.8   27.1    10.0   12.8    26.7   23.7    30.2   29.0    42.6   39.3
Corr. coeff. r   0.64   0.58    0.53   0.52    0.61   0.52    0.61   0.56    0.5    0.47
BIAS             -11.8  -13.6   -7.8   -4.9    -12.1  -14.5   -11.3  -12.4   -17.4  -22.5
MNB%             -14.7  -14.0   -39.1  -8.9    -20.1  -26.2   -6.5   -4.1    -10.6  -22.2
MAE              18.5   20.0    10.1   9.6     16.8   19.1    19.1   20.2    27.1   29.4
MNE%             50.5   55.7    57.6   57.4    43.5   49.7    53.9   61.3    49.5   49.4
RMSE             25.8   24.8    14.1   12.9    23.5   26.5    26.2   27.9    34.6   37.6
σo                 29.22          13.3           25.4           29.0           32.3
σp               23.1   22.2    10.6   10.5    20.1   18.3    22.9   22.8    26.5   25.8

ANNEX J

DETAILED EVALUATION RESULTS FOR MM5-CMAQ-EMIMO OVER SPAIN

Figure 21 shows the relation between observed and modelled data using MM5-CMAQ-EMIMO for the 9 km spatial resolution model domain. The correlation coefficient r for the 8760 values is r = 0.80, which is quite acceptable.

Figure 21. MM5-CMAQ-EMIMO model simulation over Madrid (Spain) domain with 9km spatial resolution. Comparison between observed and modelled ozone data averaged over 23 different monitoring stations in Madrid Community for 2005 (365 x 24 hours = 8760 data) and also averaging the modelled data over the different grid cells

Figure 22 and Figure 23 show four examples of the high degree of correlation between monitored and modelled ozone data by using MM5-CMAQ-EMIMO.


Figure 22. Ozone time series observations versus modelled data for stations (a) Móstoles (Madrid Community) and (b) Estanca (Aragón Community) with correlation coefficients 0.80 and 0.90, respectively


Figure 23. Ozone time series observations versus modelled data (MM5-CMAQ-EMIMO) for Majadahonda and Aranjuez in Madrid Community

ANNEX K

STRUCTURE OF META-DATABASE FOR MODEL EVALUATION EXERCISES Krisztina Labancz(1), Ana Isabel Miranda(2), Ulrike Pechinger(3), Martin Piringer(3), Peter Builtjes(4), Nicolas Moussiopoulos(5), John Douros(5)

(1) Hungarian Meteorological Service, Department for Atmospheric Environment, Budapest, Hungary (2) CESAM & Department of Environment and Planning, University of Aveiro, 3810-193 Aveiro, Portugal (3) Zentralanstalt für Meteorologie und Geodynamik, Wien, Austria (4) TNO, Dept. of Air Quality and Climate Change, Utrecht, the Netherlands and Free Univ. Berlin, Inst. of Meteorology, Berlin, Germany (5) Laboratory of Heat Transfer and Environmental Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece

The database is structured as given below and can be found at: http://pandora.meng.auth.gr/mqat/

Notes: - comments are in italics - tables represent SQL data table variables - text cannot be searched for, but appears in the case of any other query - internal links between the SQL tables are under construction
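As a purely hypothetical illustration of the note that "tables represent SQL data table variables", Section 1 of the structure below could map onto an SQL table as sketched here. All field names are invented for illustration; the actual schema at pandora.meng.auth.gr is not reproduced in this report:

```python
import sqlite3

# Hypothetical mapping of "1. General information" to an SQL table;
# column names are illustrative, not the real meta-database schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE general_information (
        dataset_name TEXT PRIMARY KEY,  -- 1.3 name for registering
        dataset_type TEXT,              -- 1.1 Meteorology / Air Quality
        purpose      TEXT,              -- 1.2 validation / intercomparison / application
        availability TEXT,              -- 1.4 free / on agreement / restricted
        measurement  TEXT               -- 1.5 regular monitoring / campaign
    )
""")
conn.execute(
    "INSERT INTO general_information VALUES (?, ?, ?, ?, ?)",
    ("ETEX", "Meteorology", "Data validation", "Free", "Measurement campaign"),
)
rows = conn.execute(
    "SELECT dataset_type FROM general_information WHERE dataset_name = 'ETEX'"
).fetchall()
```

The design point is simply that each numbered section of the questionnaire becomes one searchable SQL table, while free-text fields (Section 1.3) remain unqueryable, as the notes state.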

1. General information

1.1 Dataset type

Meteorology Air Quality

1.2 Purpose of dataset

Data validation Intercomparison Application

1.3 General information – currently cannot be queried

Name of dataset (for registering in the database) Name of site(s) / area Name of institute / organization / co-operation / anything responsible Contact person Published information and references - lists, links, pdfs

1.4 Availability Free Free but for extraction cost On agreement Restricted

1.5 Measurement type

Regular monitoring Measurement campaign or experiment

87 2. General description

2.1 Topographical details

Flat Mountainous Coastal Suburban Urban Rural

2.2 Approximate horizontal distance between stations

100 m 1 km 10 km 1000 km

2.3 Policy issue – only if intercomparison or applications

Meteorological processes Urban meteorology Summer meteorology Winter meteorology Climate change Stratospheric ozone depletion Acidification Eutrophication Tropospheric ozone Summer smog Winter smog Episode forecast

3. Details of measurement data

3.1 Temporal representation of measurements

Daily Hourly etc. see http://pandora.meng.auth.gr/mqat/

88 3.2 Measurement data origin

Meteorological mast observations Surface observations Upper air radiosonde station Tethered balloon Sodar vertical profiles Aircraft reports Satellite retrievals Roof level stations Street level stations

3.3 Monitoring station type

Urban Traffic Suburban Industrial Rural Background

3.4 Meteorological parameters – only if meteorology

u, v, w wind components Wind speed and wind direction etc. see http://pandora.meng.auth.gr/mqat/

3.5 Quality assurance protocols – yes or no

3.6 City, Country, Capital

4. Models

4.1 Model application purpose

Meteorological application Meteorology assessment Regulatory purposes and compliance Emergency planning Public information Scientific research

4.2 Model type – both meteorological and air quality

Diagnostic Prognostic etc. see http://pandora.meng.auth.gr/mqat/

4.3 Scale Street < 1 km Urban 1-10 km Meso < 100 km Regional < 1000 km Global > 1000 km

4.4 Statistical indices

Maximum observed and modelled values nth percentile prediction etc. see in http://pandora.meng.auth.gr/mqat/

5. Source conditions – only if air quality

5.1 Pollutants considered

Non-reactive gases Reactive gases etc. see http://pandora.meng.auth.gr/mqat/

5.2 Source description

Point source Elevated source Line source Area source Gridded emissions

5.3 Emission data

Total volume flow: available / not available
Emission rate: available / not available
Release height: available / not available
Gas exit temperature and velocity: available / not available
Release duration: available / not available

ANNEX L

ENTRIES IN META-DATABASE FOR MODEL EVALUATION EXERCISES

Table 35. Datasets in the meta-database of COST728 (status: 05.February, 2007)

Each entry gives: dataset (location; duration): station type; station background; temporal resolution; parameters measured.

AUT / LHTEE Meteorological station (Thessaloniki, Greece; 25/10/2005-10/07/2006): roof-level met. station; urban background; hourly; wind components, temperature and relative humidity.

Bologna 2001/2002 urban/rural PBL micrometeorological campaigns (Bologna, Italy; 03/05/2001-15/03/2002): sodar, sonic anemometer, roof-level stations, surface stations; suburban, urban and rural stations; less than an hour; wind field, peak wind speed, temperature, mixing height, surface heat fluxes, vertical turbulence.

ARPA-SIM Bologna 2005/2006 urban PBL micrometeorological campaigns (Bologna, Italy; 29/04/2005-06/04/2006): micrometeorological mobile station, roof-level station; urban background; hourly; wind field, incoming solar radiation, net radiation, vertical turbulence, surface heat flux.

CTN_ACE (Po Valley, Italy; 01/04/2003-31/03/2004): surface meteorological stations; rural; hourly; wind field, peak wind speed, relative humidity, incoming solar radiation, net radiation.

Daily Air Quality Newsletter in Athens (Athens, Greece; 01/01/2005-ongoing): surface, street-level stations; urban background, traffic, suburban, rural, coastal, mountainous; mean daily and maximum daily values; SO2, NO2, CO, O3, PM10, black smoke.

European Tracer EXperiment (ETEX) (Bulgaria, Denmark, France, Germany, U.K.; 23/10/1994-28/10/1994 and 14/11/1994-19/11/1994): surface, upper-air radiosonde, tethered balloon, sodar; varied; 3-hourly, less than an hour; wind field, peak wind speed, temperature, dew point temperature, potential temperature, relative humidity, pressure, precipitation, cloud cover fraction.

Measurements and statistics data (U.K.; 08/03/1992-ongoing): urban background, traffic, industrial; hourly, daily, less than an hour; CO, PM, PM10, SO2, NO2.

Met Office - Land Surface Observation Stations Data (U.K.; 01/01/1983-ongoing): surface, met. mast; varied; hourly, less than hourly, mean daily, maximum daily; wind field, peak wind speed, temperature, dew point temperature, potential temperature, relative humidity, pressure, precipitation, cloud cover fraction.

Monitoring network for PBL in the Veneto Region (Po Valley) (Po Valley, Italy; 01/03/2005-30/11/2010): surface, met. mast, upper-air radiosonde, tethered balloon, sodar; flat, mountainous, urban background, rural; varied; wind components, temperature, vertical potential temperature gradient, vertical turbulence, mixing height.

Po Valley 2004/2005 rural PBL micrometeorological campaigns (Po Valley, Italy; 11/11/2004-10/04/2005): surface stations; flat, rural; hourly; wind field, incoming solar radiation, net radiation, vertical turbulence, surface heat flux.

Southern North Sea June 1998 (Brussels, Amsterdam, London; 16/06/1998-20/06/1998): surface stations; flat, coastal, suburban, urban (background, traffic, industrial), rural; hourly, three-hourly (varied); wind field, temperature, dew point temperature, pressure.

Stockholm (Stockholm, Sweden; 01/01/2005-27/06/2006): mast, surface, roof-level, street-level stations, sodar; flat, coastal, suburban, urban background; hourly, annual; wind field, temperature, relative humidity.

Table 36. Summary of datasets used in different model evaluations

Each entry gives the data archive, the data information, and its use and availability.

Metropolitan Tracer Experiment (METREX): The METREX experiment consisted of 6-hour emissions of perfluorocarbons, released simultaneously from two different locations every 36 hours. The tracer release locations were in suburban Washington, D.C., while 8-hour air samples were collected at three locations within the urban area. The experiment ran for one full year. In addition, monthly air concentration samples were collected at about 60 locations throughout the region. Use and availability: acquisition only; downloaded from a publicly available source: http://www.arl.noaa.gov/ss/transport/tracer.html

ETEX (European Tracer Experiment): Perfluorocarbon tracers were released from Monterfil, Brittany, France, in October and November 1994 and tracked for 72 hours across 17 European countries by a network of 168 ground stations. Upper-air measurements were also made by three aircraft. Three samplers were located in the North Sea. The average spacing between two sampling stations in the resulting configuration was about 80 km. Each station was designed to sample over a period of 72 consecutive hours (24 three-hour samples), with the sampling starting time progressively delayed from west to east. The stations closest to the source started sampling 3 hours before the release start; the most distant stations ended sampling 90 hours after the release start. Overall, some 9000 samples were successfully collected in the two experiments. In October, a westerly air flow carried the plume to the northeast across Europe; in November, the plume went east. In the course of the two releases, the tracer clouds were sampled as far away as Poland, Sweden and Bulgaria. The experiment recorded tracer concentrations at ground level and in the upper air, routine and special meteorological conditions, and the trajectories of constant-altitude balloons. Use and availability: OMEGA model evaluation; http://rem.jrc.cec.eu.int/etex/

Birmingham: Finite-duration releases of PMCH (perfluoromethylcyclohexane) and PMCP (perfluoromethylcyclopentane) in the city of Birmingham, U.K., for three field experiments. The main purpose was to test new instrumentation. Dispersion over 1-10 km. Available: release and concentration data, surface meteorology, profiler data. A report and journal papers provide sufficient information about the data. Use and availability: data used for urban dispersion model validation.

OTAG (Ozone Transport and Assessment Project): Consisted of a nine-day period in the eastern U.S. in July 1995 when regional ozone concentrations were high. The dataset mainly contains meteorological data from the conventional National Weather Service network, and predictions by the MM5 and RAMS mesoscale meteorological models. Use and availability: data used for evaluation of the performance of MM5 and RAMS. Hanna, S.R., et al., 2001: Use of Monte Carlo uncertainty analysis to evaluate differences in observed and predicted ozone concentrations. To appear in J. Environ. Poll.

Kit Fox: Includes data from one field site (Nevada Test Site) and three wind tunnels (U. of Arkansas, U. of Surrey, and EPA). The wind tunnel tests are designed to mimic the field tests, which involved ground-level releases (duration ~ 20 s to several minutes) of CO2 in a nested obstacle array. The measuring arcs for the field tests are up to 225 m downwind. The field programme lasted for about a month. Use and availability: extensive model evaluation (HGSYSTEM) done with the data; data used to develop new vertical entrainment laws. Datasets and reports are available. Hanna, S.R. et al., Atmos. Env., 35(13), 2223-2229 and 2231-2242, 2001.

MVP (Model Validation Programme): Consisted of three three-week sessions of intensive field measurements at Cape Canaveral and Vandenberg AFB, where SF6 tracer was released (both puffs and plumes) and sampled from both aerial and ground-based platforms. The plumes and puffs were tracked by two sampling aircraft, six ground sampling vehicles, and three infrared camera teams. On-site meteorological data were supplemented by the use of a specially equipped meteorological aircraft, three ground meteorology stations, two acoustic sodars, and two acoustic anemometers. Use and availability: data being analysed; to be used in model evaluation (HPAC, CALPUFF, and VLSTRACK) in the future.

LROD (Long-Range Overwater Diffusion): Consisted of a series of airborne SF6 releases from an Air Force C-130 transport aircraft, which flew perpendicular to the prevailing wind direction and released lines of SF6 in the marine boundary layer in the Pacific Ocean northwest of Kauai, Hawaii. A second aircraft with a continuous SF6 analyzer sampled the plume at distances up to 100 km downwind, with six small boats also tracking the plume to similar distances. Meteorological data were collected by the NOAA/ATDD Long-EZ airborne Mobile Flux Platform (MFP) during horizontal transects and gentle ascents and descents. Use and availability: data used to study along-wind dispersion.

Kincaid: The Kincaid power plant is located in Illinois and surrounded by flat farmland with some lakes. Tracer gas SF6 was released from the plant stack for approximately 350 hours during three three-week periods. Roughly 200 monitors were deployed on concentric arcs from 0.5 to 50 km from the stack. Meteorological data were measured at one 100-m tower (including turbulence data), one Doppler acoustic sounder, slow-rise temperature sounders, and several 10-m towers. The plumes were also scanned by lidar. There were also 30 SO2 monitors operated for about one year. Use and availability: has been used to evaluate a dispersion model (HPDM) for power-plant plumes.

ANNEX M

SUMMARY TABLES ON MODEL VALIDATION AND EVALUATION

Table 37. Analytic solutions – Meteorology

Model name u v w T qv qlc qsc qlr zi Other Remarks
GESIMA x x x x mountain lee waves (hydrostatic and non-hydrostatic)
GME x x shallow water version, Williamson test
METRAS x x x potential temperature; Long's solution
NHHIRLAM x x
RAMS x x x x
SAIMM x x x x x x
UM x x x x Various standard dynamical core tests.

Table 38. Evaluated reference Dataset – Meteorology

Model name u v w T qv qlc qsc qlr zi Other Remarks
COSMO_CH x x x x x Measurements of SYNOP, GPS, radar, radiosoundings
COSMO climate model x x x x surface energy budget components
GME x x x x x daily evaluation against operational global measurements, standard NWP verification
Hirlam x x x x
COSMO_EU x x x x x daily measurements
COSMO_EU_MH x x windprofiler/RASS measurements
METRAS x x x x x x TRACT, BERLIOZ, FLUMOB
UM x x x x x x x x Various standard dynamical core tests including global. GCSS intercomparisons with other CRM/LES models.

Table 39. Model intercomparison – Meteorology

Model name u v w T qv qlc qsc qlr zi Other Remarks
ALADIN/A x x x x
COSMO_CH x COSMO exchange of precipitation fields
COSMO climate model x x surface energy budget components
GME x x x x x all standard NWP variables; daily evaluation against operational NWP forecasts of other met. services (ECMWF, MeteoFrance, UK Met Office, NCEP, ...)
COSMO_EU x x x x x daily with other operational NWP models of other European met. services
COSMO_EU_MH x x with parcel methods according to COST715, WG2 (Fisher et al., 1998), EU FP5 project FUMAPEX
MEMO (UoA-PT) x x x x
METRAS x x x x x MESOCOM, TFS
NHHIRLAM x x
RAMS x x x x x x

Table 40. Additional validation and evaluation efforts – Meteorology

Model name u v w T qv qlc qsc qlr zi Other Remarks
ALADIN/A x x x x
GESIMA x x comparison with wind farm sites
GME x x x x x x x all standard NWP variables; APE (Aquaplanet Experiment, with Univ. of Mainz), GCSS (Pacific Cross-section Intercomparison) etc.
Hirlam x x x x
COSMO_EU x x x x x x x x x research experiments
COSMO_EU_MH x x radiosoundings, EU FP5 project FUMAPEX
MEMO (UoA-PT) x x x x
METRAS x x x x x x DWD surface measurements; applied validation concept of Schlünzen (1997), J. Wind Engineering and Industrial Aerodynamics, 67 & 68, 479-492
MM5 (UoA-GR) x x x x x x x
MM5 (UoA-PT) x x x x x x x

MM5 (GKSS) x x x x x x cloud fraction; comparisons to radiosondes and ground data are under way; liquid water and cloud fraction have been compared to other model results and to remote sensing data (van Lipzig et al., 2006)
NHHIRLAM x x x x x x x evaluations of forecasts using observations
RAMS x x
UM x x x x x x x Operational verification plus verification against e.g. cloud radar/lidar.
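Several entries in the tables above refer to "standard NWP verification" against observations. As a minimal illustration only (not the procedure of any particular group named in the tables), the core scores reduce to bias, root-mean-square error and correlation; the sample values below are invented:

```python
import math

# Minimal sketch of basic verification scores (bias, RMSE, correlation)
# of modelled values against observations. Sample numbers are invented
# for illustration; real verification uses large matched datasets.

def verify(obs, mod):
    """Return (bias, rmse, corr) of modelled vs. observed values."""
    n = len(obs)
    bias = sum(m - o for o, m in zip(obs, mod)) / n
    rmse = math.sqrt(sum((m - o) ** 2 for o, m in zip(obs, mod)) / n)
    mo, mm = sum(obs) / n, sum(mod) / n
    cov = sum((o - mo) * (m - mm) for o, m in zip(obs, mod))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_m = sum((m - mm) ** 2 for m in mod)
    corr = cov / math.sqrt(var_o * var_m)
    return bias, rmse, corr

obs = [2.1, 3.4, 1.8, 4.0, 2.9]   # invented observed series
mod = [2.4, 3.1, 2.0, 4.3, 3.2]   # invented modelled series
bias, rmse, corr = verify(obs, mod)
```

A positive bias indicates systematic overprediction; the RMSE combines systematic and random error, and the correlation measures how well the model reproduces the observed variability.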

ANNEX N

MESOSCALE MODEL USER TRAINING

Table 41. Mesoscale model user training – Replies of institutions without user training

Information given by: Marko Kaasik, Estonia; Leonor Tarrason and Dag Bjoerge, Norway; Gertie Geertsema, the Netherlands; Anonymous, UK.

1) Is there a training programme at your institution? If yes, continue with the questions in Block A; if not, please answer the questions of Block B.
Estonia: No. We have HIRLAM and a few AQMs (SILAM, MATCH, AirViro, AEROPOL), but the Estonian user community is too small to carry out any courses as such. The usual way of teaching is individual supervision.
Norway: No. The universities in Oslo and Bergen give lectures on general NWP. In Oslo, some students use versions of HIRLAM, the operational model at the Norwegian Meteorological Institute, for their theses. Similarly, MM5 is used in Bergen in cooperation with the private company "Storm Weather Center".
The Netherlands: No. For the meteorological model, daily use provides the training of the operational meteorologists. There is a regular users' consultation in which new developments are communicated, together with the reasons for these changes, and their consequences are illustrated.
UK: No. A good discussion about the necessity of user training is obscured by the workload on the people who should give this training.

A) Tools and/or methods for model user training exist
Models used: SILAM, MATCH, AirViro, AEROPOL; HIRLAM (High Resolution Limited Area Model); RAMS (Regional Atmospheric Modelling System); UM (hopefully mesoscale limited-area versions to be running soon)

B) What do you expect of user training

Table 42. Mesoscale model user training – Replies of institutions with user training

Information given by: Mikhail Sofiev, Finland; Marina Covre, France; Heinke Schlünzen, UHH, Germany; Barbara Fay, DWD, Germany; Barbara Fay, DWD, Germany. Answers below are separated by semicolons in this order.

1) Is there a training programme at your institution? If yes, continue with the questions in Block A; if not, please answer the questions of Block B. All five: Yes.

A) Tools and/or methods for model user training exist
Models used: SILAM; Meso-NH; METRAS (mesoscale) and MITRAS (microscale); Local Model (COSMO_EU); HRM (High Resolution Model)

Title of training programme: Radiation exercise and SILAM user training; Meso-NH training course (May 2006); Mesoscale modelling: a) introduction, b) Master course, c) PhD course; Lokalmodell (COSMO_EU); HRM Training Workshop
Institution: Finnish Meteorological Institute; Meteo-France (Toulouse, France); Meteorological Institute, Univ. Hamburg; German Weather Service, Offenbach; German Weather Service, Offenbach

Duration in total (in hours): 8; 25; a) 28 h, b) 42 h, c) 20 h; 8; 65 h
Frequency (e.g. every year, at most once per five years, once): 1-2/year; 2/year; a, b) 1/year, c) 1-3/year; 1/year; 1/year

Teacher name: several teachers (e.g. Pilvi Siljamo, Tuula Summanen, Minna Rantamäki, Mikhail Sofiev); (1) user support team at CNRM: Christine Lac, Isabelle Mallet, Jeanine Payart, Juan Escobar, (2) specialized atmospheric scientists: S. Malardel, P. Le Moigne, C. Mari; Heinke Schlünzen (and invited teachers); Ulrich Schättler, Jürgen Steppeler; Detlev Majewski

Background (e.g. atmospheric scientist, computer specialist, give details): atmospheric scientist, computer specialist; numerical methods, computer science, meteorologists; meteorologist; mathematics (numerics)/computer scientist; meteorologist
Average number of participants: 20; 12; a) 15, b) 10, c) 1; 10; 15
Objective (e.g. training for weather forecast, consultants - give details): training for radioactive accidents; training for radioactive accidents; training for Bachelor, Master and PhD students of atmospheric sciences and environmental engineers, courses to be taken in sequence a, b, c; training of current/future COSMO_EU users, mainly from co-operating universities and from the small-scale modelling consortium COSMO, advanced COSMO_EU training for consultants; -
9) Topics addressed (please mark where appropriate and give details plus number of training hours): Meso-NH: initiation into performing simulations with the atmospheric research model Meso-NH, aimed at people who intend to use Meso-NH. METRAS: a) basic learning of limited-area numerical model structure, b) theory of mesoscale modelling, investigation of numerical and parametrization impacts on model results, c) specific training with respect to PhD thesis topic. COSMO_EU: theory, 2 h: equations, approximations, numerics, coordinate system, initialisation, data assimilation, boundary conditions, model nesting, model validation, parametrization. HRM: theory, 10 h: equations, approximations, numerics, coordinate system, initialisation, data assimilation, boundary conditions, model nesting, model validation, parametrization.
Model system overview: 1 h; about 12.5 hours on Meso-NH and application examples (model dynamics, physics and externalized surface, equations, approximations, parametrizations, numerical solutions, initial and boundary value impact, computer procedures from the code modification to the results' visualization, Meso-NH-Chemistry as an optional on-line module); a) 4, b) 28, c) 6; 1 h; 1 h
Hands-on lectures, model setup and run (e.g. one-way nesting, two-way nesting, data assimilation etc.): -; about 12.5 hours, practical works on all the topics for half the time (on ideal and real cases); a) 2, b) 2, c) 4; 2 h; 10 h

Hands-on lectures, preparation of input data: 1; -; a) 4, b) 1, c) 4; 1; 12
Hands-on lectures, output interpretation incl. evaluation: 4; -; a) 10, b) 4, c) 2; 2; 14
Post-processing: 1; -; a) 2, b) 1, c) 4; 1; 12
Other topics: 1; practical training; a) 6, b) 7; practical training, model evaluation, model performance; 6 h: operational scheduling, research, HRM network, HRM future

Tool for training available? (give details): computers, www-browser; the tutorial class takes place at Meteo-France, students work on computers of Meteo-France (Fujitsu, NEC); models, computers; yes, system of test modules, computers; yes, HRM model systems, computers

Final examination? No; no; yes (test cases to simulate and interpret); no; yes (test cases to simulate and interpret)

Other issues: -; the courses are given in French, information on the web: http://mesonh.aero.obs-mip.fr/mesonh/, contact information: [email protected]; -; no forecaster training; no forecaster training

B) What do you expect of user training
Please describe your expectations on a trained model user: Weather forecasters can run the SILAM model and understand the model's output; consultants and students should be able to run a model and understand the impact of input data, numerics/numerical schemes and parametrizations on the model output.

Table 42 (continued). Mesoscale model user training – Replies of institutions with user training

Information given by: Joanna Struzewska, Poland; Ana Isabel Miranda, UA, Portugal; Anonymous, UK. Answers below are separated by semicolons in this order.

1) Is there a training programme at your institution? If yes, continue with the questions in Block A; if not, please answer the questions of Block B. All three: Yes.

A) Tools and/or methods for model user training exist
Models used: EK100W, TAPM, UM, ADMS-3

Title of training programme: Application of engineering software in air pollution analysis; Air quality assessment with the TAPM model; a) UM Introduction Course, b) Individual Mentoring Programme, c) PhD Talks, d) Seminars/talks
Institution: ATMOTERM s.a.; University of Aveiro (UA) / Institute for the Environment and Development (IDAD); Department of Meteorology, University of Reading, UK

Duration in total (in hours): 8-16; 6.5 h; a) 6-8 hrs, b) 8-10 hrs, c) 2 hrs, d) 2 hrs
Frequency (e.g. every year, once per five years, once): 1 per two years; a) 1-2 per year, b) 3-4 per year, c) 1 per year, d) 2-3 per year

Teacher name: Marek Kuczer, Marek Rosicki; Ana Isabel Miranda, Carlos Borrego (UA), Miguel Coutinho, Clara Ribeiro (IDAD); a) Dr Lois Steenman-Clark, b) Dr Changgui Wang, Dr Lois Steenman-Clark, c) and d) various
Background (e.g. atmospheric scientist, computer specialist, give details): atmospheric physics scientist, computer specialist, air protection; professors at the Department of Environment and Planning, University of Aveiro, Portugal, atmospheric scientists; computer specialists
Average number of participants: 25; 6; a) 15, b) 2, c) 5-10, d) 20-40
Objective (e.g. training for weather forecast, consultants - give details): training for air quality assessment; training for air quality assessment with the TAPM model; training for research capability

9) Topics addressed (please mark where appropriate and give details plus number of training hours): a) theory on modelling of air pollution dispersion, b) application of engineering software in air pollution analysis
Model system overview: the TAPM model description concerning its main capabilities, assumptions, parameterizations, input data requirements and outputs; √
Hands-on lectures, model setup and run (e.g. one-way nesting, two-way nesting, data assimilation etc.): TAPM was applied to different study cases; √
Hands-on lectures, preparation of input data: for the different study cases, topography, land-use, meteorology and pollutant emissions were prepared considering the input data requirements; √
Hands-on lectures, output interpretation: the estimated pollutant concentrations were analysed in order to detect local characteristics of pollution patterns, synoptic and mesoscale forcings on pollutant distribution, and the contribution of pollutant emission rates to air pollution episodes; √

Post-processing: √

Other topics: tools for data manipulation; performance of models

Tool for training available? (give details): computers, models; the TAPM model is installed at the University of Aveiro and during the training events the participants had the opportunity to apply it, and the model was also installed at the Institute for the Environment and Development in order to be applied by the trained model users; web-based materials, notes for students, PowerPoint
Final examination? No; no; no

Other issues: on-going support after training

B) What do you expect of user training
Please describe your expectations on a trained model user: A trained model user should be capable of installing and operating the model for which he or she was trained. The correct assessment of the simulated study cases is also an important skill that the trained user should acquire. The scientific background, physical approximations and all the modules that constitute the model should also be understood by the user. Users are also expected to be able to use the models and tools independently for their research projects.

Answers on user training:

CHIMERE user training: For CHIMERE, the choice was to propose a complete training on the web: http://euler.lmd.polytechnique.fr/chimere/. Each new user can run the model for a real test case (the heat wave of summer 2003 over western Europe) using only the web site and the model documentation (a PDF file available on the web site). In this documentation, a chapter is dedicated to this first run. A complete tutorial is proposed, including:

- The download of prepared meteorological files (MM5 runs).
- Surface emission fluxes, originating from EMEP and regridded over the CHIMERE domain.
- Boundary conditions, provided by several groups: MOZART, GOCART and LMDz-INCA.
- How to install the codes (prerequisite software such as NetCDF, LAM-MPI, F90 compilers etc.) and compile the model for the first time.
- A suite of programmes dedicated to plotting the chemical concentration fields and to checking that the user's results are the same as those published in the paper by Vautard et al. (2005).

Finally, user support is provided via an e-mail address, [email protected], and a mailing list exists where every new user can register to receive updated information about the model development.

User training in the Netherlands: A regular users' consultation is held in which new developments are communicated, together with the reasons for these changes, and their consequences are illustrated. Regular training courses for new users and developers are not available. If necessary, new users follow courses at ECMWF. Model users and developers are mostly academics with a PhD in physics; training has thus been provided by universities. Dedicated training for special purposes is on an ad-hoc basis. Training of model developers is mainly on the job; that is, information and knowledge are acquired through close collaboration with experienced developers.

Forecasters are model output users who need a profound understanding of the characteristics of NWP models. Nowadays, forecasters are academics holding a degree in physics with a major in meteorology. New forecasters receive training courses either in-house or at sister institutes (e.g. ECMWF). Experience is acquired through the operational tasks. Information on model updates is given in meetings, via a Dutch magazine dedicated to meteorology and via the intranet. In the in-house courses, different aspects of meteorology are covered. These courses typically last half a day to a few days.

User training recommendation: The preliminary knowledge expected from people to be trained as forecasters is an understanding of atmospheric processes on synoptic and more local scales. Knowledge of the translation of these processes into a numerical model is useful insofar as this translation bears on the quality and characteristics of the model output. Forecasters need to have knowledge of uncertainties in observations and be able to interpret observations of different signatures, such as synoptic, ground-based, radar and satellite information. The skills necessary are the capability to understand and interpret vast amounts of information on different scales and in different formats (text, graphical representations). A forecaster does not need to be able to run the model; the model data are available.

A course for a model data user should give an overview of observations, models, model output and the translation of this information into text and graphics. For observations, the different types and their uncertainties have to be discussed. Atmospheric processes and their translation into numerical models are important, with an emphasis on the different models and their effect on the characteristics of the model output. The forecaster must be able to translate model output into text and graphics for the lay person who needs to be informed about the weather forecast, such as the public and/or decision makers. The forecaster must be able to follow the effects of significant changes in the code on the output. A forecaster needs to be able to assess the quality of a model forecast in a short time period.

ANNEX O

STRUCTURE OF WMO GURME AIR QUALITY FORECASTING TRAINING COURSE AS GIVEN IN LIMA, PERU IN 2006

Section 1 Acknowledgements
Section 2 Executive Overview
Section 3 Introduction and Overview
Section 4 What Are We Forecasting
Section 4 Exercises
Section 5 How Are Forecasts Used
Section 6 Health Effects
Section 7 Chemical Aspects of Air Pollution
Section 7 Exercises
Section 8 Pollutant Monitoring
Section 8 Exercises
Section 9 Pollutant Lifecycle and Trends
Section 10 Air Pollution Meteorology
Section 10 Exercises
Section 11 Case Studies
Section 12 Air Quality Forecasting Tools
Section 12a Madrid Forecast Model
Section 12b1 About the TAPM Model
Section 12b3 Background and setup of TAPM
Section 13 Developing a Forecast Programme
Section 13 Exercises
Section 14 Daily Forecast Operations
Section 15 References

GLOBAL ATMOSPHERE WATCH REPORT SERIES

1. Final Report of the Expert Meeting on the Operation of Integrated Monitoring Programmes, Geneva, 2 -5 September 1980.

2. Report of the Third Session of the GESAMP Working Group on the Interchange of Pollutants Between the Atmosphere and the Oceans (INTERPOLL-III), Miami, USA, 27-31 October 1980.

3. Report of the Expert Meeting on the Assessment of the Meteorological Aspects of the First Phase of EMEP, Shinfield Park, U.K., 30 March - 2 April 1981.

4. Summary Report on the Status of the WMO Background Air Pollution Monitoring Network as at April 1981.

5. Report of the WMO/UNEP/ICSU Meeting on Instruments, Standardization and Measurements Techniques for Atmospheric CO2, Geneva, 8-11 September 1981.

6. Report of the Meeting of Experts on BAPMoN Station Operation, Geneva, 23–26 November 1981.

7. Fourth Analysis on Reference Precipitation Samples by the Participating World Meteorological Organization Laboratories by Robert L. Lampe and John C. Puzak, December 1981.

8. Review of the Chemical Composition of Precipitation as Measured by the WMO BAPMoN by Prof. Dr. Hans-Walter Georgii, February 1982.

9. An Assessment of BAPMoN Data Currently Available on the Concentration of CO2 in the Atmosphere by M.R. Manning, February 1982.

10. Report of the Meeting of Experts on Meteorological Aspects of Long-range Transport of Pollutants, Toronto, Canada, 30 November - 4 December 1981.

11. Summary Report on the Status of the WMO Background Air Pollution Monitoring Network as at May 1982.

12. Report on the Mount Kenya Baseline Station Feasibility Study edited by Dr. Russell C. Schnell.

13. Report of the Executive Committee Panel of Experts on Environmental Pollution, Fourth Session, Geneva, 27 September - 1 October 1982.

14. Effects of Sulphur Compounds and Other Pollutants on Visibility by Dr. R.F. Pueschel, April 1983.

15. Provisional Daily Atmospheric Carbon Dioxide Concentrations as Measured at BAPMoN Sites for the Year 1981, May 1983.

16. Report of the Expert Meeting on Quality Assurance in BAPMoN, Research Triangle Park, North Carolina, USA, 17-21 January 1983.

17. General Consideration and Examples of Data Evaluation and Quality Assurance Procedures Applicable to BAPMoN Precipitation Chemistry Observations by Dr. Charles Hakkarinen, July 1983.

18. Summary Report on the Status of the WMO Background Air Pollution Monitoring Network as at May 1983.

19. Forecasting of Air Pollution with Emphasis on Research in the USSR by M.E. Berlyand, August 1983.

20. Extended Abstracts of Papers to be Presented at the WMO Technical Conference on Observation and Measurement of Atmospheric Contaminants (TECOMAC), Vienna, 17-21 October 1983.

21. Fifth Analysis on Reference Precipitation Samples by the Participating World Meteorological Organization Laboratories by Robert L. Lampe and William J. Mitchell, November 1983.

22. Report of the Fifth Session of the WMO Executive Council Panel of Experts on Environmental Pollution, Garmisch-Partenkirchen, Federal Republic of Germany, 30 April - 4 May 1984 (WMO TD No. 10).

23. Provisional Daily Atmospheric Carbon Dioxide Concentrations as Measured at BAPMoN Sites for the Year 1982. November 1984 (WMO TD No. 12).

24. Final Report of the Expert Meeting on the Assessment of the Meteorological Aspects of the Second Phase of EMEP, Friedrichshafen, Federal Republic of Germany, 7-10 December 1983. October 1984 (WMO TD No. 11).

25. Summary Report on the Status of the WMO Background Air Pollution Monitoring Network as at May 1984. November 1984 (WMO TD No. 13).

26. Sulphur and Nitrogen in Precipitation: An Attempt to Use BAPMoN and Other Data to Show Regional and Global Distribution by Dr. C.C. Wallén. April 1986 (WMO TD No. 103).

27. Report on a Study of the Transport of Sahelian Particulate Matter Using Sunphotometer Observations by Dr. Guillaume A. d'Almeida. July 1985 (WMO TD No. 45).

28. Report of the Meeting of Experts on the Eastern Atlantic and Mediterranean Transport Experiment ("EAMTEX"), Madrid and Salamanca, Spain, 6-8 November 1984.

29. Recommendations on Sunphotometer Measurements in BAPMoN Based on the Experience of a Dust Transport Study in Africa by Dr. Guillaume A. d'Almeida. September 1985 (WMO TD No. 67).

30. Report of the Ad-hoc Consultation on Quality Assurance Procedures for Inclusion in the BAPMoN Manual, Geneva, 29-31 May 1985.

31. Implications of Visibility Reduction by Man-Made Aerosols (Annex to No. 14) by R.M. Hoff and L.A. Barrie. October 1985 (WMO TD No. 59).

32. Manual for BAPMoN Station Operators by E. Meszaros and D.M. Whelpdale. October 1985 (WMO TD No. 66).

33. Man and the Composition of the Atmosphere: BAPMoN - An international programme of national needs, responsibility and benefits by R.F. Pueschel, 1986.

34. Practical Guide for Estimating Atmospheric Pollution Potential by Dr. L.E. Niemeyer. August 1986 (WMO TD No. 134).

35. Provisional Daily Atmospheric CO2 Concentrations as Measured at BAPMoN Sites for the Year 1983. December 1985 (WMO TD No. 77).

36. Global Atmospheric Background Monitoring for Selected Environmental Parameters. BAPMoN Data for 1984. Volume I: Atmospheric Aerosol Optical Depth. October 1985 (WMO TD No. 96).

37. Air-Sea Interchange of Pollutants by R.A. Duce. September 1986 (WMO TD No. 126).

38. Summary Report on the Status of the WMO Background Air Pollution Monitoring Network as at 31 December 1985. September 1986 (WMO TD No. 136).

39. Report of the Third WMO Expert Meeting on Atmospheric Carbon Dioxide Measurement Techniques, Lake Arrowhead, California, USA, 4-8 November 1985. October 1986.

40. Report of the Fourth Session of the CAS Working Group on Atmospheric Chemistry and Air Pollution, Helsinki, Finland, 18-22 November 1985. January 1987.

41. Global Atmospheric Background Monitoring for Selected Environmental Parameters. BAPMoN Data for 1982, Volume II: Precipitation chemistry, continuous atmospheric carbon dioxide and suspended particulate matter. June 1986 (WMO TD No. 116).

42. Scripps reference gas calibration system for carbon dioxide-in-air standards: revision of 1985 by C.D. Keeling, P.R. Guenther and D.J. Moss. September 1986 (WMO TD No. 125).

43. Recent progress in sunphotometry (determination of the aerosol optical depth). November 1986.

44. Report of the Sixth Session of the WMO Executive Council Panel of Experts on Environmental Pollution, Geneva, 5-9 May 1986. March 1987.

45. Proceedings of the International Symposium on Integrated Global Monitoring of the State of the Biosphere (Volumes I-IV), Tashkent, USSR, 14-19 October 1985. December 1986 (WMO TD No. 151).

46. Provisional Daily Atmospheric Carbon Dioxide Concentrations as Measured at BAPMoN Sites for the Year 1984. December 1986 (WMO TD No. 158).

47. Procedures and Methods for Integrated Global Background Monitoring of Environmental Pollution by F.Ya. Rovinsky, USSR and G.B. Wiersma, USA. August 1987 (WMO TD No. 178).

48. Meeting on the Assessment of the Meteorological Aspects of the Third Phase of EMEP IIASA, Laxenburg, Austria, 30 March - 2 April 1987. February 1988.

49. Proceedings of the WMO Conference on Air Pollution Modelling and its Application (Volumes I-III), Leningrad, USSR, 19-24 May 1986. November 1987 (WMO TD No. 187).

50. Provisional Daily Atmospheric Carbon Dioxide Concentrations as Measured at BAPMoN Sites for the Year 1985. December 1987 (WMO TD No. 198).

51. Report of the NBS/WMO Expert Meeting on Atmospheric CO2 Measurement Techniques, Gaithersburg, USA, 15-17 June 1987. December 1987.

52. Global Atmospheric Background Monitoring for Selected Environmental Parameters. BAPMoN Data for 1985. Volume I: Atmospheric Aerosol Optical Depth. September 1987.

53. WMO Meeting of Experts on Strategy for the Monitoring of Suspended Particulate Matter in BAPMoN - Reports and papers presented at the meeting, Xiamen, China, 13-17 October 1986. October 1988.

54. Global Atmospheric Background Monitoring for Selected Environmental Parameters. BAPMoN Data for 1983, Volume II: Precipitation chemistry, continuous atmospheric carbon dioxide and suspended particulate matter (WMO TD No. 283).

55. Summary Report on the Status of the WMO Background Air Pollution Monitoring Network as at 31 December 1987 (WMO TD No. 284).

56. Report of the First Session of the Executive Council Panel of Experts/CAS Working Group on Environmental Pollution and Atmospheric Chemistry, Hilo, Hawaii, 27-31 March 1988. June 1988.

57. Global Atmospheric Background Monitoring for Selected Environmental Parameters. BAPMoN Data for 1986, Volume I: Atmospheric Aerosol Optical Depth. July 1988.

58. Provisional Daily Atmospheric Carbon Dioxide Concentrations as measured at BAPMoN sites for the years 1986 and 1987 (WMO TD No. 306).

59. Extended Abstracts of Papers Presented at the Third International Conference on Analysis and Evaluation of Atmospheric CO2 Data - Present and Past, Hinterzarten, Federal Republic of Germany, 16-20 October 1989 (WMO TD No. 340).

60. Global Atmospheric Background Monitoring for Selected Environmental Parameters. BAPMoN Data for 1984 and 1985, Volume II: Precipitation chemistry, continuous atmospheric carbon dioxide and suspended particulate matter.

61. Global Atmospheric Background Monitoring for Selected Environmental Parameters. BAPMoN Data for 1987 and 1988, Volume I: Atmospheric Aerosol Optical Depth.

62. Provisional Daily Atmospheric Carbon Dioxide Concentrations as measured at BAPMoN sites for the year 1988 (WMO TD No. 355).

63. Report of the Informal Session of the Executive Council Panel of Experts/CAS Working Group on Environmental Pollution and Atmospheric Chemistry, Sofia, Bulgaria, 26 and 28 October 1989.

64. Report of the consultation to consider desirable locations and observational practices for BAPMoN stations of global importance, Bermuda Research Station, 27-30 November 1989.

65. Report of the Meeting on the Assessment of the Meteorological Aspects of the Fourth Phase of EMEP, Sofia, Bulgaria, 27 and 31 October 1989.

66. Summary Report on the Status of the WMO Global Atmosphere Watch Stations as at 31 December 1990 (WMO TD No. 419).

67. Report of the Meeting of Experts on Modelling of Continental, Hemispheric and Global Range Transport, Transformation and Exchange Processes, Geneva, 5-7 November 1990.

68. Global Atmospheric Background Monitoring for Selected Environmental Parameters. BAPMoN Data For 1989, Volume I: Atmospheric Aerosol Optical Depth.

69. Provisional Daily Atmospheric Carbon Dioxide Concentrations as measured at Global Atmosphere Watch (GAW)-BAPMoN sites for the year 1989 (WMO TD No. 400).

70. Report of the Second Session of EC Panel of Experts/CAS Working Group on Environmental Pollution and Atmospheric Chemistry, Santiago, Chile, 9-15 January 1991 (WMO TD No. 633).

71. Report of the Consultation of Experts to Consider Desirable Observational Practices and Distribution of GAW Regional Stations, Halkidiki, Greece, 9-13 April 1991 (WMO TD No. 433).

72. Integrated Background Monitoring of Environmental Pollution in Mid-Latitude Eurasia by Yu.A. Izrael and F.Ya. Rovinsky, USSR (WMO TD No. 434).

73. Report of the Experts Meeting on Global Aerosol Data System (GADS), Hampton, Virginia, 11 to 12 September 1990 (WMO TD No. 438).

74. Report of the Experts Meeting on Aerosol Physics and Chemistry, Hampton, Virginia, 30 to 31 May 1991 (WMO TD No. 439).

75. Provisional Daily Atmospheric Carbon Dioxide Concentrations as measured at Global Atmosphere Watch (GAW)-BAPMoN sites for the year 1990 (WMO TD No. 447).

76. The International Global Aerosol Programme (IGAP) Plan: Overview (WMO TD No. 445).

77. Report of the WMO Meeting of Experts on Carbon Dioxide Concentration and Isotopic Measurement Techniques, Lake Arrowhead, California, 14-19 October 1990.

78. Global Atmospheric Background Monitoring for Selected Environmental Parameters BAPMoN Data for 1990, Volume I: Atmospheric Aerosol Optical Depth (WMO TD No. 446).

79. Report of the Meeting of Experts to Consider the Aerosol Component of GAW, Boulder, 16 to 19 December 1991 (WMO TD No. 485).

80. Report of the WMO Meeting of Experts on the Quality Assurance Plan for the GAW, Garmisch-Partenkirchen, Germany, 26-30 March 1992 (WMO TD No. 513).

81. Report of the Second Meeting of Experts to Assess the Response to and Atmospheric Effects of the Kuwait Oil Fires, Geneva, Switzerland, 25-29 May 1992 (WMO TD No. 512).

82. Global Atmospheric Background Monitoring for Selected Environmental Parameters BAPMoN Data for 1991, Volume I: Atmospheric Aerosol Optical Depth (WMO TD No. 518).

83. Report on the Global Precipitation Chemistry Programme of BAPMoN (WMO TD No. 526).

84. Provisional Daily Atmospheric Carbon Dioxide Concentrations as measured at GAW-BAPMoN sites for the year 1991 (WMO TD No. 543).

85. Chemical Analysis of Precipitation for GAW: Laboratory Analytical Methods and Sample Collection Standards by Dr Jaroslav Santroch (WMO TD No. 550).

86. The Global Atmosphere Watch Guide, 1993 (WMO TD No. 553).

87. Report of the Third Session of EC Panel/CAS Working Group on Environmental Pollution and Atmospheric Chemistry, Geneva, 8-11 March 1993 (WMO TD No. 555).

88. Report of the Seventh WMO Meeting of Experts on Carbon Dioxide Concentration and Isotopic Measurement Techniques, Rome, Italy, 7-10 September 1993, (edited by Graeme I. Pearman and James T. Peterson) (WMO TD No. 669).

89. 4th International Conference on CO2 (Carqueiranne, France, 13-17 September 1993) (WMO TD No. 561).

90. Global Atmospheric Background Monitoring for Selected Environmental Parameters GAW Data for 1992, Volume I: Atmospheric Aerosol Optical Depth (WMO TD No. 562).

91. Extended Abstracts of Papers Presented at the WMO Region VI Conference on the Measurement and Modelling of Atmospheric Composition Changes Including Pollution Transport, Sofia, 4 to 8 October 1993 (WMO TD No. 563).

92. Report of the Second WMO Meeting of Experts on the Quality Assurance/Science Activity Centres of the Global Atmosphere Watch, Garmisch-Partenkirchen, 7-11 December 1992 (WMO TD No. 580).

93. Report of the Third WMO Meeting of Experts on the Quality Assurance/Science Activity Centres of the Global Atmosphere Watch, Garmisch-Partenkirchen, 5-9 July 1993 (WMO TD No. 581).

94. Report on the Measurements of Atmospheric Turbidity in BAPMoN (WMO TD No. 603).

95. Report of the WMO Meeting of Experts on UV-B Measurements, Data Quality and Standardization of UV Indices, Les Diablerets, Switzerland, 25-28 July 1994 (WMO TD No. 625).

96. Global Atmospheric Background Monitoring for Selected Environmental Parameters WMO GAW Data for 1993, Volume I: Atmospheric Aerosol Optical Depth.

97. Quality Assurance Project Plan (QAPjP) for Continuous Ground Based Ozone Measurements (WMO TD No. 634).

98. Report of the WMO Meeting of Experts on Global Carbon Monoxide Measurements, Boulder, USA, 7-11 February 1994 (WMO TD No. 645).

99. Status of the WMO Global Atmosphere Watch Programme as at 31 December 1993 (WMO TD No. 636).

100. Report of the Workshop on UV-B for the Americas, Buenos Aires, Argentina, 22-26 August 1994.

101. Report of the WMO Workshop on the Measurement of Atmospheric Optical Depth and Turbidity, Silver Spring, USA, 6-10 December 1993, (edited by Bruce Hicks) (WMO TD No. 659).

102. Report of the Workshop on Precipitation Chemistry Laboratory Techniques, Hradec Kralove, Czech Republic, 17-21 October 1994 (WMO TD No. 658).

103. Report of the Meeting of Experts on the WMO World Data Centres, Toronto, Canada, 17 - 18 February 1995, (prepared by Edward Hare) (WMO TD No. 679).

104. Report of the Fourth WMO Meeting of Experts on the Quality Assurance/Science Activity Centres (QA/SACs) of the Global Atmosphere Watch, jointly held with the First Meeting of the Coordinating Committees of IGAC-GLONET and IGAC-ACE, Garmisch-Partenkirchen, Germany, 13 to 17 March 1995 (WMO TD No. 689).

105. Report of the Fourth Session of the EC Panel of Experts/CAS Working Group on Environmental Pollution and Atmospheric Chemistry (Garmisch, Germany, 6-11 March 1995) (WMO TD No. 718).

106. Report of the Global Acid Deposition Assessment (edited by D.M. Whelpdale and M-S. Kaiser) (WMO TD No. 777).

107. Extended Abstracts of Papers Presented at the WMO-IGAC Conference on the Measurement and Assessment of Atmospheric Composition Change (Beijing, China, 9-14 October 1995) (WMO TD No. 710).

108. Report of the Tenth WMO International Comparison of Dobson Spectrophotometers (Arosa, Switzerland, 24 July - 4 August 1995).

109. Report of an Expert Consultation on 85Kr and 222Rn: Measurements, Effects and Applications (Freiburg, Germany, 28-31 March 1995) (WMO TD No. 733).

110. Report of the WMO-NOAA Expert Meeting on GAW Data Acquisition and Archiving (Asheville, NC, USA, 4-8 November 1995) (WMO TD No. 755).

111. Report of the WMO-BMBF Workshop on VOC Establishment of a “World Calibration/Instrument Intercomparison Facility for VOC” to Serve the WMO Global Atmosphere Watch (GAW) Programme (Garmisch-Partenkirchen, Germany, 17-21 December 1995) (WMO TD No. 756).

112. Report of the WMO/STUK Intercomparison of Erythemally-Weighted Solar UV Radiometers, Spring/Summer 1995, Helsinki, Finland (WMO TD No. 781).

112A. Report of the WMO/STUK ’95 Intercomparison of broadband UV radiometers: a small-scale follow-up study in 1999, Helsinki, 2001, Addendum to GAW Report No. 112.

113. The Strategic Plan of the Global Atmosphere Watch (GAW) (WMO TD No. 802).

114. Report of the Fifth WMO Meeting of Experts on the Quality Assurance/Science Activity Centres (QA/SACs) of the Global Atmosphere Watch, jointly held with the Second Meeting of the Coordinating Committees of IGAC-GLONET and IGAC-ACE, Garmisch-Partenkirchen, Germany, 15-19 July 1996 (WMO TD No. 787).

115. Report of the Meeting of Experts on Atmospheric Urban Pollution and the Role of NMSs (Geneva, 7-11 October 1996) (WMO TD No. 801).

116. Expert Meeting on Chemistry of Aerosols, Clouds and Atmospheric Precipitation in the Former USSR (Saint Petersburg, Russian Federation, 13-15 November 1995).

117. Report and Proceedings of the Workshop on the Assessment of EMEP Activities Concerning Heavy Metals and Persistent Organic Pollutants and their Further Development (Moscow, Russian Federation, 24-26 September 1996) (Volumes I and II) (WMO TD No. 806).

118. Report of the International Workshops on Ozone Observation in Asia and the Pacific Region (IWOAP, IWOAP-II), (IWOAP, 27 February-26 March 1996 and IWOAP-II, 20 August-18 September 1996) (WMO TD No. 827).

119. Report on BoM/NOAA/WMO International Comparison of the Dobson Spectrophotometers (Perth Airport, Perth, Australia, 3-14 February 1997), (prepared by Robert Evans and James Easson) (WMO TD No. 828).

120. WMO-UMAP Workshop on Broad-Band UV Radiometers (Garmisch-Partenkirchen, Germany, 22 to 23 April 1996) (WMO TD No. 894).

121. Report of the Eighth WMO Meeting of Experts on Carbon Dioxide Concentration and Isotopic Measurement Techniques (prepared by Thomas Conway) (Boulder, CO, 6-11 July 1995) (WMO TD No. 821).

122. Report of Passive Samplers for Atmospheric Chemistry Measurements and their Role in GAW (prepared by Greg Carmichael) (WMO TD No. 829).

123. Report of WMO Meeting of Experts on GAW Regional Network in RA VI, Budapest, Hungary, 5 to 9 May 1997.

124. Fifth Session of the EC Panel of Experts/CAS Working Group on Environmental Pollution and Atmospheric Chemistry (Geneva, Switzerland, 7-10 April 1997) (WMO TD No. 898).

125. Instruments to Measure Solar Ultraviolet Radiation, Part 1: Spectral Instruments (lead author G. Seckmeyer) (WMO TD No. 1066).

126. Guidelines for Site Quality Control of UV Monitoring (lead author A.R. Webb) (WMO TD No. 884).

127. Report of the WMO-WHO Meeting of Experts on Standardization of UV Indices and their Dissemination to the Public (Les Diablerets, Switzerland, 21-25 July 1997) (WMO TD No. 921).

128. The Fourth Biennial WMO Consultation on Brewer Ozone and UV Spectrophotometer Operation, Calibration and Data Reporting, (Rome, Italy, 22-25 September 1996) (WMO TD No. 918).

129. Guidelines for Atmospheric Trace Gas Data Management (Ken Masarie and Pieter Tans), 1998 (WMO TD No. 907).

130. Jülich Ozone Sonde Intercomparison Experiment (JOSIE, 5 February to 8 March 1996), (H.G.J. Smit and D. Kley) (WMO TD No. 926).

131. WMO Workshop on Regional Transboundary Smoke and Haze in Southeast Asia (Singapore, 2 to 5 June 1998) (Gregory R. Carmichael). Two volumes.

132. Report of the Ninth WMO Meeting of Experts on Carbon Dioxide Concentration and Related Tracer Measurement Techniques (Edited by Roger Francey), (Aspendale, Vic., Australia).

133. Workshop on Advanced Statistical Methods and their Application to Air Quality Data Sets (Helsinki, 14-18 September 1998) (WMO TD No. 956).

134. Guide on Sampling and Analysis Techniques for Chemical Constituents and Physical Properties in Air and Precipitation as Applied at Stations of the Global Atmosphere Watch. Carbon Dioxide (WMO TD No. 980).

135. Sixth Session of the EC Panel of Experts/CAS Working Group on Environmental Pollution and Atmospheric Chemistry (Zurich, Switzerland, 8-11 March 1999) (WMO TD No. 1002).

136. WMO/EMEP/UNEP Workshop on Modelling of Atmospheric Transport and Deposition of Persistent Organic Pollutants and Heavy Metals (Geneva, Switzerland, 16-19 November 1999) (Volumes I and II) (WMO TD No. 1008).

137. Report and Proceedings of the WMO RA II/RA V GAW Workshop on Urban Environment (Beijing, China, 1-4 November 1999) (prepared by Greg Carmichael) (WMO TD No. 1014).

138. Reports on WMO International Comparisons of Dobson Spectrophotometers: Part I – Arosa, Switzerland, 19-31 July 1999; Part II – Buenos Aires, Argentina (29 Nov. – 12 Dec. 1999); and Part III – Pretoria, South Africa (18 March – 10 April 2000) (WMO TD No. 1016).

139. The Fifth Biennial WMO Consultation on Brewer Ozone and UV Spectrophotometer Operation, Calibration and Data Reporting (Halkidiki, Greece, September 1998) (WMO TD No. 1019).

140. WMO/CEOS Report on a Strategy for Integrating Satellite and Ground-based Observations of Ozone (WMO TD No. 1046).

141. Report of the LAP/COST/WMO Intercomparison of Erythemal Radiometers (Thessaloniki, Greece, 13-23 September 1999) (WMO TD No. 1051).

142. Strategy for the Implementation of the Global Atmosphere Watch Programme (2001-2007), A Contribution to the Implementation of the Long-Term Plan (WMO TD No. 1077).

143. Global Atmosphere Watch Measurements Guide (WMO TD No. 1073).

144. Report of the Seventh Session of the EC Panel of Experts/CAS Working Group on Environmental Pollution and Atmospheric Chemistry and the GAW 2001 Workshop (Geneva, Switzerland, 2 to 5 April 2001) (WMO TD No. 1104).

145. WMO GAW International Comparisons of Dobson Spectrophotometers at the Meteorological Observatory Hohenpeissenberg, Germany (21 May – 10 June 2000, MOHp2000-1), (23 July – 5 August 2000, MOHp2000-2), (10 – 23 June 2001, MOHp2001-1) and (8 to 21 July 2001, MOHp2001-2). Prepared by Ulf Köhler (WMO TD No. 1114).

146. Quality Assurance in Monitoring Solar Ultraviolet Radiation: the State of the Art (WMO TD No. 1180).

147. Workshop on GAW in RA VI (Europe), Riga, Latvia, 27-30 May 2002 (WMO TD No. 1206).

148. Report of the Eleventh WMO/IAEA Meeting of Experts on Carbon Dioxide Concentration and Related Tracer Measurement Techniques (Tokyo, Japan, 25-28 September 2001) (WMO TD No. 1138).

149. Comparison of Total Ozone Measurements of Dobson and Brewer Spectrophotometers and Recommended Transfer Functions (prepared by J. Staehelin, J. Kerr, R. Evans and K. Vanicek) (WMO TD No. 1147).

150. Updated Guidelines for Atmospheric Trace Gas Data Management (prepared by Ken Masarie and Pieter Tans) (WMO TD No. 1149).

151. Report of the First CAS Working Group on Environmental Pollution and Atmospheric Chemistry (Geneva, Switzerland, 18-19 March 2003) (WMO TD No. 1181).

152. Current Activities of the Global Atmosphere Watch Programme (as presented at the 14th World Meteorological Congress, May 2003) (WMO TD No. 1168).

153. WMO/GAW Aerosol Measurement Procedures: Guidelines and Recommendations (WMO TD No. 1178).

154. WMO/IMEP-15 Trace Elements in Water Laboratory Intercomparison (WMO TD No. 1195).

155. 1st International Expert Meeting on Sources and Measurements of Natural Radionuclides Applied to Climate and Air Quality Studies (Gif sur Yvette, France, 3-5 June 2003) (WMO TD No. 1201).

156. Addendum for the Period 2005-2007 to the Strategy for the Implementation of the Global Atmosphere Watch Programme (2001-2007), GAW Report No. 142 (WMO TD No. 1209).

157. JOSIE-1998 Performance of EEC Ozone Sondes of SPC-6A and ENSCI-Z Type (Prepared by Herman G.J. Smit and Wolfgang Straeter) (WMO TD No. 1218).

158. JOSIE-2000 Jülich Ozone Sonde Intercomparison Experiment 2000. The 2000 WMO international intercomparison of operating procedures for ECC-ozone sondes at the environmental simulation facility at Jülich (Prepared by Herman G.J. Smit and Wolfgang Straeter) (WMO TD No. 1225).

159. IGOS-IGACO Report - September 2004 (WMO TD No. 1235).

160. Manual for the GAW Precipitation Chemistry Programme (Guidelines, Data Quality Objectives and Standard Operating Procedures) (WMO TD No. 1251).

161. 12th WMO/IAEA Meeting of Experts on Carbon Dioxide Concentration and Related Tracers Measurement Techniques (Toronto, Canada, 15-18 September 2003).

162. WMO/GAW Experts Workshop on a Global Surface-Based Network for Long Term Observations of Column Aerosol Optical Properties, Davos, Switzerland, 8-10 March 2004 (edited by U. Baltensperger, L. Barrie and C. Wehrli) (WMO TD No. 1287).

163. World Meteorological Organization Activities in Support of the Vienna Convention on Protection of the Ozone Layer (WMO No. 974).

164. Instruments to Measure Solar Ultraviolet Radiation: Part 2: Broadband Instruments Measuring Erythemally Weighted Solar Irradiance (WMO TD No. 1289).

165. Report of the CAS Working Group on Environmental Pollution and Atmospheric Chemistry and the GAW 2005 Workshop, 14-18 March 2005, Geneva, Switzerland (WMO TD No. 1302).

166. Joint WMO-GAW/ACCENT Workshop on The Global Tropospheric Carbon Monoxide Observations System, Quality Assurance and Applications (EMPA, Dübendorf, Switzerland, 24 – 26 October 2005) (edited by J. Klausen) (WMO TD No. 1335).

167. The German Contribution to the WMO Global Atmosphere Watch Programme upon the 225th Anniversary of GAW Hohenpeissenberg Observatory (edited by L.A. Barrie, W. Fricke and R. Schleyer) (WMO TD No. 1336).

168. 13th WMO/IAEA Meeting of Experts on Carbon Dioxide Concentration and Related Tracers Measurement Techniques (Boulder, Colorado, USA, 19-22 September 2005) (edited by J.B. Miller) (WMO TD No. 1359).

169. Chemical Data Assimilation for the Observation of the Earth’s Atmosphere – ACCENT/WMO Expert Workshop in support of IGACO (edited by L.A. Barrie, J.P. Burrows, P. Monks and P. Borrell) (WMO TD No. 1360).

170. WMO/GAW Expert Workshop on the Quality and Applications of European GAW Measurements (Tutzing, Germany, 2-5 November 2004) (WMO TD No. 1367).

171. A WMO/GAW Expert Workshop on Global Long-Term Measurements of Volatile Organic Compounds (VOCs) (Geneva, Switzerland, 30 January – 1 February 2006) (WMO TD No. 1373).

172. WMO Global Atmosphere Watch (GAW) Strategic Plan: 2008 – 2015 (WMO TD No. 1384).

173. Report of the CAS Joint Scientific Steering Committee on Environmental Pollution and Atmospheric Chemistry (Geneva, Switzerland, 11-12 April 2007) (WMO TD No. 1410).

174. World Data Centre for Greenhouse Gases Data Submission and Dissemination Guide (WMO TD No. 1416).

175. The Ninth Biennial WMO Consultation on Brewer Ozone and UV Spectrophotometer Operation, Calibration and Data Reporting (Delft, Netherlands, 31 May – 3 June 2005) (WMO TD No. 1419).

176. The Tenth Biennial WMO Consultation on Brewer Ozone and UV Spectrophotometer Operation, Calibration and Data Reporting (Northwich, United Kingdom, 4-8 June 2007) (WMO TD No. 1420).

177. Joint Report of COST Action 728 and GURME – Overview of Existing Integrated (off-line and on-line) Mesoscale Meteorological and Chemical Transport Modelling in Europe (ISBN 978-1-905313-56-3) (WMO TD No. 1427).

178. Plan for the implementation of the GAW Aerosol Lidar Observation Network GALION (Hamburg, Germany, 27-29 March 2007) (WMO TD No. 1443).

179. Intercomparison of Global UV Index from Multiband Radiometers: Harmonization of Global UVI and Spectral Irradiance (WMO TD No. 1454).

180. Towards a Better Knowledge of Umkehr Measurements: A Detailed Study of Data from Thirteen Dobson Intercomparisons (WMO TD No. 1456).
