Forecasting Demonstration Project - Sydney 2014

Editors: Alan Seed, Aurora Bell, Peter Steinle, Susan Rennie

May 2019

Bureau Research Report – 046

FDP- SYDNEY 2014

Sponsor

Dr Ray Canterford, Division Head; Hazards, Warnings and Forecasts

Project Staff

Aurora Bell, Deryn Griffiths, Peter Steinle, Charles Sanders, Yi Xiao, Alan Seed, Phil Purdam, Nathan Faggian, Shaun Cooper, Sandy Dance, Morwenna Griffiths, Kevin Cheong, Mark Curtis, Justin Peter, Susan Rennie, Michael Foley, Tim Hume, James Sofra, Phillip Riley, Chris Ryan, Beth Ebert, Martin Cope, Alan Wain, Andrew McCrindell, Harald Richter, Hei Meng Wong, Tennessee Leeuwenburg, David Scurrah, Tom Pagano, Jack Wells, Andrew Donaldson, James Kelly, Ian Senior, John Bally

Project Managers

Claire Cass, Aoife Murphy, Stephen Lellyett, Howard Jacobs

Operational Forecasters

Mick Logan, Kylie Egan, Peter Clegg, Dean Narramore, David Grant, Rob Taggart, Katarina Kovacevic, Claire Yeo, Phil King, James Taylor, Sarah Chadwick, Melanie Web, Mohammed Nabi, Lily Simeonova, Alicia Tuppack

Enquiries should be addressed to:

Dr Alan Seed Bureau of Meteorology GPO Box 1289, Melbourne 3001,

Contact Email: [email protected]

National Library of Australia Cataloguing-in-Publication entry

Editors: Alan Seed, Aurora Bell, Peter Steinle, Susan Rennie

Title: Forecasting Demonstration Project – Sydney 2014

ISBN: 978-0-9942757-1-4

Series: Bureau Research Report - BRR046

Copyright and Disclaimer

© 2019 Bureau of Meteorology. To the extent permitted by law, all rights are reserved, and no part of this publication covered by copyright may be reproduced or copied in any form or by any means except with the written permission of the Bureau of Meteorology.

The Bureau of Meteorology advises that the information contained in this publication comprises general statements based on scientific research. The reader is advised and needs to be aware that such information may be incomplete or unable to be used in any specific situation. No reliance or actions must therefore be made on that information without seeking prior expert professional, scientific and technical advice. To the extent permitted by law, the Bureau of Meteorology (including each of its employees and consultants) excludes all liability to any person for any consequences, including but not limited to all losses, damages, costs, expenses and any other compensation, arising directly or indirectly from using this publication (in part or in whole) and any information or material contained in it.

Contents

1. Executive Summary
2. Introduction
3. Aims, objectives and activities
4. SREP systems tested during FDP
5. Downstream prototype systems
6. Pre-FDP Workshops
7. FDP Wiki-blog and Share-Point
8. FDP Participants
9. FDP Visits
10. FDP Presentations
11. FDP Mid-day Model Discussion
12. Summary of the operational phase of the FDP
13. Verification and Evaluation
13.1 Rainfields3
13.2 Convective Weather Outlook
13.3 Aviation Evaluation
13.4 Objective Verification
13.5 Subjective and Qualitative Evaluation
14. FDP Major Findings
14.1 Summary
14.2 CSM RUC
14.3 Nowcast Systems
14.4 Radar Network
14.5 Subjective Evaluation
14.5.1 Strengths of the RUC
14.5.2 Actions to make the RUC fit for purpose
14.6 Knowledge gaps in the use of RUC in a Nowcasting Service
15. Appendices
15.1 List of Acronyms
15.2 Point Based Verification
15.2.1 Method
15.2.2 Guidance
15.2.3 Verification Regions
15.2.4 Verification Scores
15.2.5 Direct Model Output: ACCESS City and ACCESS RUC
15.2.6 Post Processed Guidance
15.3 Use of Rapid Update Cycle Predictions in the GFE
15.3.1 Introduction
15.3.2 Incorporating RUC Guidance into the GFE
15.3.3 Outcomes from the Forecast Demonstration Project
15.3.4 Limitations
15.3.5 Successes and Transition to Operations
15.4 Performance of the Met Office Convective Diagnosis Procedure during the Sydney Forecast Demonstration Project
15.4.1 Introduction
15.4.2 CDP diagnostic in MOGREPS-G
15.4.3 Objective lightning verification
15.4.4 Lightning case study
15.4.5 Conclusions
15.4.6 References
15.5 NTFGS/Calibrated Thunderstorm Probabilities: Evaluation
15.5.1 Introduction
15.5.2 Reliability of Calibrated Thunder
15.5.3 Relative operating characteristic of Calibrated Thunder
15.5.4 Summary
15.5.5 References
15.6 The Air Quality Sub-Project of the Sydney Forecast Demonstration Project
15.6.1 Introduction
15.6.2 Methodology
15.6.3 Performance assessment
15.6.4 Preliminary conclusions
15.6.5 Acknowledgements
15.6.6 References
15.7 Radar wind assimilation and data quality
15.8 Aviation Evaluation
15.8.1 Aviation Objectives and Aims
15.8.2 Operations and Demonstrations
15.8.3 Evaluation of FDP Tools for Enhanced Thunderstorm Prediction in the TMA
15.8.4 FDP tools and outputs for wind affecting runway changes
15.8.5 FDP Tools for accuracy in predicting alternate and landing minima
15.8.6 Conclusions and further study
15.9 Subjective Evaluation
15.9.1 Purpose, Methods, and Data of Subjective Evaluation in FDP-Sydney2014
15.9.2 Cognitive Task Analysis – examples of outcomes
15.9.3 Conclusions
15.9.4 References
15.10 On the Use of Convective Scale Models in Nowcasting - the forecaster's role and training needs
15.10.1 Scale and predictability
15.10.2 Assessing the convective mode and evolution: a problem at different scales
15.10.3 What are the parameters relevant to the convection mode that the RUC takes from the parent model?
15.11 Post-FDP Benefits

List of Tables

Table 1 FDP Summary of “feature of the day” record and evaluation for the period 29 September-27 October 2014
Table 2 Summary of the evaluation of the RUC by Aviation Services
Table 3 Summary of the subjective evaluation of the Rapid Update Cycle NWP model during the forecasting process: A. analysis and diagnostics, B. Very Short-Range Forecasts (0-12 hours), C. Warning for severe weather.
Table 4 Summary of the major finding for each FDP system
Table 5 The domain average scores (and worst performance) for the ACCESS City and ACCESS RUC models during the FDP for 2 m temperature.
Table 6 The domain average scores (and worst performance) for the ACCESS City and ACCESS RUC models during the FDP for 10 m wind speed forecasts.
Table 7 The domain average scores (and worst performance) for the ACCESS City and ACCESS RUC models during the FDP for screen dew point forecasts.
Table 8 The domain average scores (and worst performance) for the GOCF Dev, GOCF RUC and GFE forecasts during the FDP for 2 m temperature.
Table 9 The domain average (and worst performance) for the GOCF Dev and GFE forecasts of wind speed throughout the FDP.
Table 10 The domain average score (and worst performance) for the GOCF RUC and GFE forecasts of dew point throughout the FDP.
Table 11 Forecast locations for which ensemble forecasts were generated.
Table 12 Reliability of ensemble-based forecasts for probability of daytime FFDI exceeding a value of 25 for 43 sites in NSW during Dec2014-Feb2015.
Table 13 Example of data representation for the day of 29th of September 2014.
Table 14 Examples of developing Situational Awareness by the forecaster
Table 15 Background knowledge for the effective use of a CSM RUC - Training needs
Table 16 Analysis and Diagnosis in the Forecast Process based on CSM RUC.

List of Figures

Figure 1 The location of the radars located within the greater FDP domain. The inner box shows the location of the fixed 1.5 km resolution area used by the forecasters, the outer domain is the variable resolution region used to nest the RUC into ACCESS G.
Figure 2 FDP Familiarisation workshop for NSW forecasters. From the left Peter Steinle, Kylie Egan, Zach Berry-Porter, and Sarah Chadwick
Figure 3 FDP Familiarisation workshop for NSW forecasters. From the left Kylie Egan, Katarina Kovacevic, Mick Logan, and Phil Purdam.
Figure 4 An example of using GPATS to verify the Convective Weather Outlook
Figure 5 Time series of RMSE for forecasts of 2 m temperature at 12-hour lead time for coastal stations
Figure 6 Time series of RMSE for forecasts of 2 m temperature at 12-hour lead time for inland stations
Figure 7 Time series of RMSE for forecasts of 2 m temperature at 12-hour lead time for mountain stations
Figure 8 Histograms of the domain average RMSE, residual and STD scores for the 2 m temperature forecasts from the ACCESS City and ACCESS RUC models during the FDP.
Figure 9 Histograms of the domain average RMSE, residual and STD scores for the 10 m wind speed forecasts from the ACCESS City and ACCESS RUC models during the FDP.
Figure 10 Histograms of the domain average RMSE, residual and STD scores for the screen dew point forecasts from the ACCESS City and ACCESS RUC models during the FDP.
Figure 11 Histograms of the domain average RMSE, residual and STD scores for the 2 m temperature forecasts from the GOCF Dev, GOCF RUC and the GFE during the FDP.
Figure 12 Histograms of the domain average RMSE, residual and STD scores for wind speed forecasts from the GOCF Dev and the GFE during the FDP.
Figure 13 Histograms of the domain average RMSE, residual and STD scores for screen dew point forecasts from the GOCF RUC and the GFE during the FDP.
Figure 14 Typical diagnostic plot for lightning and hail for Australia assessed during the FDP. Identical plots focused on the FDP region were also produced.
Figure 15 Map of 33 airports located in the FDP region (red dashed area).
Figure 16 Receiver Operating Characteristic (ROC) Lightning index > 1 for 06Z run at T+24.
Figure 17 Reliability diagram Lightning index > 1 for 06Z run at T+24.
Figure 18 CDP lightning diagnostics using a) instantaneous rainfall rate and b) the maximum rainfall rate for 6Z on 1/11/14 T+24.
Figure 19 ACCESS-R12 calibrated thunder 15-hour forecast for thunderstorm probabilities during 6-9 UTC 08 November 2014. The forecast probabilities are shaded in purple. The observed GPATS lightning flash density is overlaid in brighter yellow and purple shades.
Figure 20 Reliability of the thunder probability forecasts based on ACCESS-R12 (top) and RUC (bottom). The observed lightning frequency is taken from GPATS observations (mainly cloud-to-ground strokes) and are applied to a 40 km x 40 km box centred on the observed stroke latitude and longitude.
Figure 21 Relative Operating Characteristic (ROC) of the November 2014 15-hour thunder probability forecasts based on ACCESS-R12. The green squares show the performance of the traditional NTFGS combined surface-based and elevated thunder diagnostic.
Figure 22 Ensemble meteograms for FFDI, GFDI, wind speed, and wind direction for Katoomba, NSW, starting on 9 November 2014. The box-whiskers diagrams show the 25-75th percentile range (box), the full range (whiskers), and the median value (red line) and the control forecast (green line).
Figure 23 As in Figure 22 for precipitation, air temperature, relative humidity, and total cloud cover.
Figure 24 Forecast probability of FFDI in the range 0-12 and exceeding thresholds of 12, 25, 50, 75, and 100 at hour 114 (4.75 days) of the forecast.
Figure 25 Modelling domains for the Chemical Transport Model (CTM) nested inside ACCESS-R and ACCESS-RUC.
Figure 26 Air quality forecast display showing daily maximum value of PM2.5 forecast for 00-23 UTC on 22 April 2015.
Figure 27 23h forecasts for PM2.5 and AQI for forecasts initialised at 00UTC 22 April 2015.
Figure 28 Consecutive 1-day forecasts of carbon monoxide, ozone, nitrogen dioxide and sulphur dioxide at Randwick, NSW, from model runs initialised at 00UTC on 19, 20, 21, and 22 April 2015.
Figure 29 Mean forecast and observed diurnal cycle of FFDI averaged over 43 sites in NSW during Dec2014-Feb2015.
Figure 30 Observed and modelled 1-h concentration time series for nitrogen dioxide (top), PM2.5 (middle), ozone (bottom) for the period 10–25 October 2013 for Chullora air monitoring station in Sydney. Two model simulations are shown: ‘Anthropogenic’ excludes all ambient fire emissions from the model, ‘Urban+RFS’ include all fires in the simulation.
Figure 31 Consecutive 1-day forecasts (red) and observations (blue) of ozone concentrations at Liverpool, NSW, during the week beginning 6 April 2015.
Figure 32 FDP domain and location of radars. Dashed box is the 1.5 km fixed resolution domain. Dashed circles show assimilated area of coverage.
Figure 33 Example of radar classification, showing precipitation near the Kurnell radar.
Figure 34 Mean speed bias (model – observation) for each radar. Data points left to right are for 0.9°, 2.4°, 4.2° and 5.6° elevation scans. Dashed lines are standard deviations. Error bars are the standard error of the means. Results are for precipitation (prp) and insects (clear air: ca).
Figure 35 The temporal and spatial mean innovations for each scan elevation (lines) and mean across all elevations (circles) as well as the standard deviations (dashed lines/open circles).
Figure 36 Subjective evaluation is a qualitative analysis of qualitative and quantitative data. In our case, it was about how experts feel about the CSM RUC performance.
Figure 37 The surface temperatures from ACCESS R12 for the Sydney area for 29 September 2014.
Figure 38 The surface temperatures from the RUC for the Sydney area for 29 September 2014.
Figure 39 Example of a 10-minute rainfall forecast in RUC (green colours) and the 6-minute rain rates from Rainfields3 (blue colours) for the 14 October 2014 case study.
Figure 40 Wind speed and direction for 27 October 2014
Figure 41 Wind speed and direction at 08:30Z 27 October.
Figure 42 05Z 27 October. Top: MSLP and dewpoint temperature. Bottom: MSLP and wind barbs.
Figure 43 Depiction of model grid resolution for Australian models.

1. EXECUTIVE SUMMARY

Findings

• High-resolution numerical weather prediction (NWP) can contribute to improved services because it can forecast the timing, location, and mode of convection; model the distribution of wind, dew point, and temperature in complex topography including valleys, mountains, and coasts; depict low cloud and fog; and improve the forecaster’s understanding of the meteorological situation.

• Differences in the calibration and performance of the five radars in the Sydney area limit our ability to treat them as an integrated observation network.

• Lack of integration between the current nowcasting applications is a major handicap to their future use.

• Training provided by the Bureau of Meteorology Training Centre will need to be adapted to equip forecasters with the conceptual models needed for Day 0 mesoscale nowcasting based on high-resolution NWP.

• There is considerable value in taking both scientists and experienced forecasters off-line so that together they can evaluate new NWP models and other operational systems in real time and in a structured way.

Existing Initiatives

• Reimagining the role of the regional forecast centres, with an emphasis on 0–12 h forecasting.

• ACCESS-City systems upgraded to 1.5 km convection-allowing models, with a downscaler system operational in FY17/18 and a system with data assimilation preparing for operations in FY19/20.

• The success of the ACCESS-City systems and their potential for supporting nowcasting led to the planning of a National Mesoscale Analysis System, scheduled for operational implementation by FY21/22.

• The dual-pol upgrade project will improve the quality of the clear air Doppler observations that are needed for assimilation. Data from the first dual-pol radars became available in FY17/18.

Recommendations

• Support science for improving mesoscale NWP, for example land surface modelling at mesoscale, optimising the assimilation of radar, satellite and other observation sources, and mesoscale ensembles.

• Implement a quality assurance framework for the radar network.

• Improve and extend the nowcasting service through integrated applications, improved mesoscale meteorological understanding, and task-oriented training.

• Develop an integrated nowcasting workbench to integrate observations, NWP forecasts, and warnings.

• Develop the conceptual models and physical understanding of Australian mesoscale meteorology.

• Update the training of severe weather forecasters in mesoscale meteorology and forecasting in the 0–12-hour lead time based on the new tools that are available.

• Develop a culture of continuous learning in the forecast centres.

2. INTRODUCTION

The aim of the Strategic Radar Enhancement Project Forecasting Demonstration Project (SREP FDP) was to test how the high-resolution rapid update cycle NWP (ACCESS RUC) and the radar applications that were developed during SREP could be transitioned into operations. This was achieved through collaboration between operations and research that was focused on real-time forecasting and evaluation activities. The real-time trial was conducted in the NSW Sydney Regional Office over the period 29 September 2014 – 5 December 2014 when active severe weather events were likely.

The Bureau, in common with most National Meteorological Services, has found that it is difficult to move research into operations, and that this process can be slow and fraught. Experience in the USA at the Spring Experiment shows that rapid infusion of new science and technology into operational forecasting requires direct, focused interactions between research scientists, numerical model developers, information technology specialists, operational forecasters, and end users.

It has become evident that the type of severe weather that occurs (heavy rainfall, tornadoes, hail, or damaging winds) is often closely related to the convective mode (or morphology) of the storms, such as discrete cells, squall lines (or quasi-linear convective systems), or multicellular convective systems. In addition, there are classes of thunderstorms, such as supercells and bow echoes, which produce a disproportionate number of tornadoes or widespread straight-line wind damage events. Accurate severe weather forecasts depend on forecasters being able to predict not only where and when severe thunderstorms will develop and how they will evolve over the next several hours, but also the convective modes that are most likely to occur.

It is clear from the initial verification of ACCESS RUC and published results on other mesoscale NWP systems that the benefits in mesoscale NWP modelling are largely found in the first 12 hours of the forecast. The positive influence of data assimilation over a small scale (including observations from 8 radars, three of which are at the domain boundary, Figure 1) is confined to about the first 6 hours of the forecast. Therefore, given the expense of running ACCESS RUC, there was no advantage in using small domain ACCESS RUC for medium range forecasts as this is done more effectively by lower resolution NWP forecasts over a larger domain.

While forecast accuracy has been shown to improve with increasing resolution, significant forecast errors remain at the scales that are of interest to applications that use mesoscale numerical weather predictions. The current trend is therefore towards high-resolution ensemble systems, in which the spread of the ensemble represents the uncertainty in the forecasts; the Met Office, for example, runs a 2.2 km ensemble. Developing the capacity to assimilate radar data into a deterministic NWP model is an important first step towards the eventual goal of running a high-resolution ensemble NWP model.
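As a minimal illustration of how ensemble spread expresses forecast uncertainty, the sketch below computes the mean and spread across members at a single point and lead time. The member values are invented for illustration and do not come from any FDP system, and taking spread as the sample standard deviation is only one common convention.

```python
import statistics

def ensemble_mean_and_spread(members):
    """Return (mean, spread) for one variable at one point and lead time.

    Spread is taken here as the sample standard deviation across members
    (an assumption; other definitions of spread are also in use).
    """
    if len(members) < 2:
        raise ValueError("need at least two ensemble members")
    return statistics.mean(members), statistics.stdev(members)

# Hypothetical hourly rainfall (mm) from two 6-member ensembles.
tight = [4.0, 4.5, 5.0, 4.2, 4.8, 4.5]   # members agree: low uncertainty
loose = [0.0, 1.0, 12.0, 3.0, 9.0, 2.0]  # members disagree: high uncertainty

for name, members in (("tight", tight), ("loose", loose)):
    mean, spread = ensemble_mean_and_spread(members)
    print(f"{name}: mean={mean:.2f} mm, spread={spread:.2f} mm")
```

Both hypothetical ensembles have the same mean, so a deterministic summary would treat them identically; only the spread shows that the second forecast is far less certain.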

There is an increasing consensus that probabilistic stream flow forecasts require large ensembles of rainfall forecasts at high space/time resolution as input. For example, the Met Office uses STEPS to generate a 100-member ensemble out to 12 hours that is used for flood warning.

The FDP was designed to provide a setting to facilitate interactions between scientists and forecasters. This allowed participants to better understand the scientific, technical, and operational challenges that are associated with using NWP with high resolution and an hourly update cycle, as well as the use of the improved radar products, for prediction of weather at high resolution out to 12 hours.

3. AIMS, OBJECTIVES AND ACTIVITIES

FDP Aims

• Assess the utility of ACCESS RUC to provide more detailed and useful guidance out to a maximum lead-time of 12 hours on environmental conditions (primarily in the lower troposphere) and subsequent thunderstorm development and evolution.

• Show how SREP radar science outputs can be delivered in real-time within and through existing forecast systems and tools so as to improve the Bureau’s very short-term forecast and warning services.

• Investigate how the internal forecast process for convective weather could be further optimised to realise the full benefits of new capabilities.

• Gather information on how to transition the SREP science into an operational service.

• Determine if the new models, systems and tools can contribute to improving the actual services and identify new services that can be developed based on the new FDP prototypes.

FDP Objectives

• Run ACCESS RUC in real time with an hourly update cycle and deliver these forecasts in real time to the FDP in the Regional Office for the period 29 September to 5 December 2014.

• Evaluate the ability of ACCESS RUC to depict the pre-storm and storm low level boundaries (e.g. fronts, discontinuity lines, sea breezes) and the convective mode.

• Evaluate the ability of ACCESS RUC to deliver 10-minute point forecasts of surface wind, temperature, humidity etc. for fire weather and aviation applications.

• Evaluate the utility of the hourly update cycle to provide continuous improvements (lead time and accuracy) to the convective weather outlook before the initiation of convection.

• Evaluate the performance of Rainfields3 in performing quality control on the radar reflectivity and radial velocity for assimilation into ACCESS RUC, in generating quantitative rainfall estimates, and as input to the ensemble quantitative precipitation forecasts (Rainfields QPF).

• Demonstrate the use of SREP science products from ACCESS RUC and Rainfields3 in existing operational tools (GFE, VW, GOCF, NTFGS, TIFS, TITAN, 3DRapic).

• Use the material collected during the experiment to identify the potential services and guidance from SREP science that could be transitioned into operations.

FDP planned activities to support achievement of objectives:

• Pre-FDP workshops and training

• Operational phase (delivered by forecasters): daily forecasting activities (7 am–7 pm, 7 days a week) with creation of Convective Outlooks based on RUC (100 days September-December 2014), trial of SREP systems. The Operational Plan book prepared by Aurora Bell is available internally.

• Feedback-evaluation phase: daily evaluation (wiki-blog), midday discussions, occasional interviews

• Communication phase: visits and presentations

• Post-FDP workshops and training

4. SREP SYSTEMS TESTED DURING FDP

ACCESS RUC description

The RUC used 3D-Var assimilation on an hourly analysis-forecast cycle, based on the configurations described in Lean et al. (2008) and Dixon et al. (2009). The forecast model used version 8.2 of the Unified Model on a regular latitude-longitude horizontal grid with a spacing of 0.0135 degrees (approximately 1.5 km) in each direction, and 70 vertical levels, up to a model top at 40 km. The 3D-Var was run on a reduced grid of 0.027 degrees (3 km). The set of observations assimilated was similar to those used in the global and regional model at the time (see Operations Bulletin 98, http://intweb.bom.gov.au/nmoc/stan/opsbull/opsbull93-aps1g.pdf for a comprehensive list). The differences from the regional data assimilation included:

• A restricted set of satellite sounding channels to focus on low peaking channels and those sensitive to water vapour

• The addition of radial winds from Doppler-enabled radars within the domain

• The use of observations every 10 minutes from automatic weather stations.

The restricted set of satellite sounding channels was due to the top model level being at 40 km rather than 80 km as in the regional model.
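The grid spacings quoted above for the forecast model and the 3D-Var grid can be checked with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km and an unrotated latitude-longitude grid, and uses 34°S as a rough latitude for Sydney; all three are assumptions made for illustration, not values taken from the report.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption: spherical Earth)

def grid_spacing_km(spacing_deg, latitude_deg=0.0):
    """Return (north_south_km, east_west_km) for a regular lat-lon grid."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = spacing_deg * km_per_degree
    # Meridians converge towards the poles, so east-west spacing shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

# Latitude of -34 degrees is an illustrative value for the Sydney area.
for label, spacing in (("forecast model", 0.0135), ("3D-Var grid", 0.027)):
    ns, ew = grid_spacing_km(spacing, latitude_deg=-34.0)
    print(f"{label}: {spacing} deg = {ns:.2f} km N-S, {ew:.2f} km E-W")
```

Under these assumptions the north-south spacing reproduces the quoted 1.5 km and 3 km figures, while the east-west spacing in kilometres is somewhat smaller at Sydney's latitude.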

For a given forecast base time (T), observations from T−30 minutes to T+30 minutes were extracted at T+55 minutes. This allowed the assimilation and forecast model to be completed within 80 to 100 minutes of the nominal analysis time (T). This forecast was then used as a background for the next assimilation/forecast cycle with a base-time of T+1 hour. The forecast model was nested in the most recently available run of the regional model.
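The timing of one cycle can be laid out as a simple schedule. The sketch below uses only the offsets quoted in the paragraph above; the function name, dictionary keys, and the example base time are hypothetical and chosen for illustration.

```python
from datetime import datetime, timedelta

def ruc_cycle_times(base_time):
    """Key times for one hourly cycle, using the offsets quoted in the text."""
    return {
        "obs_window_start": base_time - timedelta(minutes=30),
        "obs_window_end": base_time + timedelta(minutes=30),
        "obs_extraction": base_time + timedelta(minutes=55),
        "earliest_completion": base_time + timedelta(minutes=80),
        "latest_completion": base_time + timedelta(minutes=100),
        "next_base_time": base_time + timedelta(hours=1),
    }

# Hypothetical base time, chosen for illustration only.
times = ruc_cycle_times(datetime(2014, 10, 14, 6, 0))
for name, value in times.items():
    print(f"{name:20s} {value:%Y-%m-%d %H:%M}")
```

On these figures, a forecast from base time T finishes no later than T+100 minutes, which is before the observations for the T+1 hour cycle are extracted at T+115 minutes, so the background is in hand when the next assimilation starts.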

References:

Dixon M, Li Z, Lean H, Roberts N, Ballard S. 2009. Impact of data assimilation on forecasting convection over the United Kingdom using a high-resolution version of the Met Office Unified Model. Mon. Weather Rev. 137: 1562–1584.

Lean HW, Clark PA, Dixon M, Roberts NM, Fitch A, Forbes R, Halliwell C. 2008. Characteristics of high-resolution versions of Met Office Unified Model for forecasting convection over the United Kingdom. Mon. Weather Rev. 136: 3408–3424.

Rainfields3

SREP funded the development of Rainfields3 and an enhanced radar data quality control algorithm to classify radar echoes into rain and a number of non-meteorological classes. Rainfields3 represents a significant enhancement to the Bureau’s operational radar rainfall estimation capability owing to an architectural design that improves its scalability to the entire 58-radar network, flexibility in configuring the quality control chain for each radar in the network, and improved algorithms for echo classification, real-time correction of the mean vertical profile of reflectivity, spatially varying gauge adjustment in real time, and a real-time radar quality metric.

Figure 1 The location of the radars located within the greater FDP domain. The inner box shows the location of the fixed 1.5 km resolution area used by the forecasters, the outer domain is the variable resolution region used to nest the RUC into ACCESS G.

5. DOWNSTREAM PROTOTYPE SYSTEMS

GFE: Generate fire weather indices and hazards based on ACCESS RUC, and alert when thresholds are reached.

Visual Weather: View ACCESS RUC, Calibrated Thunder, and Rainfields3 products together with the other real-time observations.

GOCF: Use ACCESS RUC to downscale wind, temperature, and rainfall forecasts, and use these fields in the GFE to calculate the Fire Danger Index.

Calibrated Thunder: Use ACCESS RUC to generate the Calibrated Thunder Convective Weather Outlook.

TITAN: Combine Rainfields3 tracks with the cell tracking and use Rainfields3 quality-controlled radar reflectivity as input.

TIFS: Use Rainfields3 probability forecasts and FDP TITAN tracks to generate imminent storm threat areas.

3D-Rapic: Visualise Rainfields3 radar reflectivity and Doppler quality-controlled products and rainfall nowcasts.

Rainfields QPF: Downscale ACCESS RUC rainfall forecasts from 1.5 km, 10 min resolution to the radar resolution of 1 km, 6 min, and blend them with radar advection nowcasts to generate ensembles at a resolution of 6 min, 1 km out to 6 hours, updated every 6 minutes.
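To make the Rainfields QPF time axis and the idea of blending concrete, the sketch below lists the 6-minute lead times out to 6 hours and combines a radar nowcast with an NWP forecast at a single pixel. The report does not describe the blending scheme here, so the linear weight is a placeholder assumption and not the Rainfields method; all function names and rainfall values are hypothetical.

```python
STEP_MIN = 6        # forecast time step quoted in the text
MAX_LEAD_MIN = 360  # 6 hours, as quoted in the text

def lead_times_minutes():
    """Lead times 6, 12, ..., 360 minutes."""
    return list(range(STEP_MIN, MAX_LEAD_MIN + 1, STEP_MIN))

def nowcast_weight(lead_min):
    """Placeholder weight for the radar nowcast: 1 at lead 0, 0 at 6 hours.

    A linear decay is assumed purely for illustration.
    """
    return max(0.0, 1.0 - lead_min / MAX_LEAD_MIN)

def blend(nowcast_mm, nwp_mm, lead_min):
    """Weighted average of nowcast and NWP rainfall at one pixel."""
    w = nowcast_weight(lead_min)
    return w * nowcast_mm + (1.0 - w) * nwp_mm

leads = lead_times_minutes()
print(len(leads), "lead times per forecast")
for lead in (6, 180, 360):
    print(lead, "min:", round(blend(10.0, 2.0, lead), 2), "mm")
```

The intent is only to show the shape of the problem: the radar advection nowcast dominates at short lead times and the NWP forecast takes over as lead time grows.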

6. PRE-FDP WORKSHOPS

The first feedback workshop was held on 18 June 2012 as part of the SREP project. The aim was to prepare for the NWP science to be used in the FDP.

Four workshops were held in the lead-up to the FDP (December 2013, April 2014, May 2014, and June 2014) to ensure that the systems would be ready and to understand more fully how the various systems could be used in the FDP.

The final workshop was held on 22–25 September 2014, the week before the start of the FDP. The objective was to take the Sydney forecasters who would be contributing to the FDP through the daily workflow. Kylie Egan of SARO (acting in a BMTC role) prepared and presented the material for two (identical) 2-day workshops.

Figure 2 FDP Familiarisation workshop for NSW forecasters. From the left Peter Steinle, Kylie Egan, Zach Berry-Porter, and Sarah Chadwick

Figure 3 FDP Familiarisation workshop for NSW forecasters. From the left Kylie Egan, Katarina Kovacevic, Mick Logan, and Phil Purdam.

Pre-FDP workshops were organised for the participant forecasters so as to develop a clearer vision of the workflow while familiarising them with the aims and intent of the FDP. Many of the forecasters were not available at the times of the workshops due to operational priorities and special training was organised for them. Each inter-state forecaster required a day for familiarisation on arrival in Sydney using the FDP Handbook and a day working alongside an experienced FDP forecaster.

7. FDP WIKI-BLOG AND SHARE-POINT

The blog was a major tool for recording the findings of each day and the forecasters were allocated a significant amount of time each day to record their thoughts on the Wiki. This strategy proved to be very successful and the Wiki represents a significant resource for future research. It could be worth considering how to use a Wiki-blog in a structured way in the Forecast Centres and future Demonstrations.

8. FDP PARTICIPANTS

It became apparent early in the planning for the FDP that the availability of operational forecasters was a significant issue that would need careful attention. An agreement on how to manage the selection of the participating forecasters was reached with the Assistant Director Services, with the understanding that all participants would be available for recall to operational duties should the need arise.

Forecasters: Mick Logan (NSW), Rob Taggart (NSW), Katarina Kovacevic (NSW), Mohammed Nabi (NSW), Sarah Chadwick (NSW), Kylie Egan (SA), Peter Clegg (WA), Dean Narramore (QLD), David Grant (QLD), Phil King (VIC), James Taylor (VIC), Melanie Web (TAS), Lily Simeonova (SAMU), Alicia Tuppack (AWS).

Developers (HO): Harald Richter, Peter Steinle, Yi Xiao, Susan Rennie, Jin Lee, Alan Seed, Aurora Bell, Michael Foley, Phil Purdam

International Developer: Rebecca Stretton (Verification Scientist, Met Office)

9. FDP VISITS

The following visits were made to the FDP operational room:

• Robert Vertessy (DIR) with the Bureau Executive

• Alasdair Hainsworth, Assistant Director Services (ADS)

• Barry Hanstrum

• Terry Hart

• NSW Hydrology team

• Beth Ebert with RFS for smoke modelling workshop

• Andrew McCrindell with Aviation Industry representatives

• Stephen Lellyett with a delegation from the Chinese Meteorological Administration (CMA)

• Fiona Johnson from Dept. Civil and Environmental Engineering, UNSW visited Alan Seed

10. FDP PRESENTATIONS

Several presentations were made during the FDP to facilitate communication and training. These presentations are available internally.

• Peter Steinle gave the RFC a review of the Rapid Update Cycle model and what could be expected of it

• Harald Richter gave the RFC a presentation on Calibrated Thunder and then, on request, presented a radar analysis of the storm of 14 October (severe wind and rain associated with an East Coast Low and southerly change)

• Jin Lee gave a talk on data assimilation to the RFC

• James Taylor presented case studies to the RFC and to the FDP team

• Alan Seed presented an analysis of the relative calibration of the Sydney radars to the RFC

• Aurora Bell presented a seminar on the use of high-resolution NWP for the mesoscale analysis

• Rebecca Stretton gave a presentation to the FDP team on the Met Office Convective Development Potential product

• Alan Seed gave a presentation to the Aviation and CMA delegations

11. FDP MID-DAY MODEL DISCUSSION

A daily model discussion was held from 12:00–1:00 pm to bring together FDP forecasters and scientists to discuss the performance of the RUC for the previous day’s “feature of the day” and to nominate a feature for the current day. The discussion focused on aspects of the FDP activities, including the subjective evaluation of the previous day’s RUC forecasts, the formulation and rationale of the two morning outlooks, and perceptions of the usefulness of the mesoscale and near-storm scale models that were used in the forecast preparation.

The meeting was held every weekday and was chaired by Alan Seed. It was an important opportunity for the CAWCR scientists and FDP forecasters to discuss features in the RUC performance over the past day and for forecasters to evaluate how their outlook from the previous day verified. During the 10-week experiment there were 47 reviews of the RUC performance.

A subjective and informal scoring system was used to note when the RUC was a significant improvement on the operational models and when it was significantly worse. The final score, based on representation of severe weather situations only, was 9-3 in favour of the RUC. The RUC was significantly better than the operational models on the two most significant days: the extreme heat in Sydney on 29 September and the East Coast Low with extreme wind and rain on 14 October.
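The informal tally described above can be sketched in a few lines of Python. This is an illustration only: the function name `tally_scores` and the daily labels are hypothetical placeholders, not the FDP record.

```python
from collections import Counter


def tally_scores(daily_labels):
    """Count days judged significantly better or worse than the operational models.

    Any other label (for example "equivalent") does not contribute to the score.
    """
    counts = Counter(daily_labels)
    return counts["better"], counts["worse"]


# Hypothetical labels for five review days (NOT the FDP data).
example_labels = ["better", "equivalent", "worse", "better", "better"]
better, worse = tally_scores(example_labels)
print(f"{better}-{worse} in favour of the RUC")
```

Days judged equivalent are deliberately left out of the score, which matches the idea of noting only significant differences.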

12. SUMMARY OF THE OPERATIONAL PHASE OF THE FDP

Forecasters had to use two consecutive runs of the Convective Scale Model-Rapid Update Cycle (CSM-RUC) to generate two products: a Convective Outlook of the day (afternoon) and a specific 3-h Detail window inside the Convective Outlook, in the afternoon. The objectives were to evaluate whether the use of hourly runs improved the Convective Outlook and whether the high-resolution model helped to better describe the evolution of convection. The forecasters also issued cell-based nowcasts on days with significant convection.

At the end of each day the forecasters evaluated how the RUC handled mesoscale features and the mode of convection, compared with observations, radar, satellite and the other operational models, and recorded their findings in the FDP blog.

The results of this analysis were reviewed during the mid-day discussion on the following day to determine if the CSM-RUC was clearly better, equivalent, or clearly worse than the operational model in forecasting the feature of that day. The results are summarised in Table 1.

Table 1 FDP Summary of “feature of the day” record and evaluation for the period 29 September-27 October 2014

29 September
Features: Sea breeze interaction with trough at the North; T max Sydney: high profile forecast, near record for max temp
Comments: RUC solves breeze better than R12, keeps it longer; sea breeze is more advanced in the North; OCF performed poorly; sea breeze good on Kurnell; dew point issues
Evaluation: Major improvement

30 September
Features: Strong sea breeze, convection in the North; synoptic change overnight
Comments: Rainfields3 anaprop; Calibrated Th issues; RUC overdid but other models underdid it; synoptic change overnight, all models had weaker winds over land
Evaluation: Minor improvement

01 October
Features: Strong sea breeze; the change goes up the coast - synoptic feature; cold front over the sea, gale force winds
Comments: Very clear breeze on Kurnell 14:15 LT; well depicted breeze winds in the RUC
Evaluation: All models performed well

02 October
Features: Westerlies and sea breeze
Evaluation: All models performed well

03 October
Features: Sea breeze interaction with a trough in the South (more advanced inland in the South)
Comments: Low visibility not properly captured by RUC and all models; stream showers
Evaluation: All models equivalent

6 October
Features: Karman vortex in low clouds at Mt Neville in Northerly
Comments: Vortex in VIS satellite
Evaluation: Performed better

7 October
Features: Low cloud feature in VIS and RUC
Evaluation: Performed slightly worse

8 October
Features: Blustery change; sea breeze; roll clouds spanning along wind shear
Comments: Change too early in RUC by 3 hr, all models had it well. Stream shower behind the change nicely depicted in RUC, timing was good; RUC had precip over the ocean but too close to the coast; Spot issued based on RUC at Armidale, but the change was a failure (the change was quite sharp, to investigate); Alan recalibrated STEPS - bad in shallow convection; problems with ZR relation in maritime mass, stream showers.
Evaluation: Performed worse; synoptic timing error

9 October
Features: Feature that produces lined showers over the sea - coastal line
Comments: Precip OK in RUC, no other models did precip well; convection mode OK, offshore showers, the RUC saw the land breeze
Evaluation: Major improvement

10 October (Friday)
Features: Sea breeze
Comments: Good timing and shift inland

13 October
Features: Upper level front: linear convection; dry line in the NW - linear convection from the surface that arrived in Sydney, bowing of the line or wave-like line
Comments: Pre-storm day, convection in the NW on a line, no conceptual model known; broken RUC; different calibration for radars. Initiation was off for the dryline but recovered later. The line mode of the convection took everybody by surprise and RUC didn’t make a squall line. Access R placed it differently. The convection on the Mid North Coast was good - other models didn’t see that. Bowing in the convection line seen on radar, not on RUC
Evaluation: All models bad: did not represent the dry line convection properly

14 October
Features: East Coast Low and Southerly change in waves (prefrontal) with wind obs record (87 kt) and FF warnings were issued for heavy rainfall. Vortices along the shear line (coastally trapped Kelvin waves)
Comments: RUC was good with the wind intensity and convection mode (rotation) and rain rate; RUC showed the complex structure of the event; RFC conceptual model for that event was too simple. Harald was in RFC with the sev wx forecaster and monitored by radar. The RUC fields and obs were at the same space and time scale but forecasters couldn’t integrate them in real time.
Evaluation: Major improvement; important case study

16 October
Features: Orographically triggered isolated storm
Comments: 10 RUC updates had the same storm, but nothing happened.
Evaluation: To examine in more detail

17 October
Features: Coastal Wind Convergence – trough line. Easterlies bring moisture to initiate TS on Tablelands: other models had dry air from westerlies
Comments: RUC failed with the Westerlies in the first runs but later runs readjusted and didn’t have the Easterlies anymore. Feature over the sea (not land breeze) that initiates TS on waters – unknown conceptual model.
Evaluation: To examine in more detail

20 October
Features: Southerly change (synoptic scale - cold front) in the night - unpredicted by RFC - gale warning at short notice overnight. Uniform SE flow over all domain.
Comments: RUC captured it. Spot: RUC better job due to the funnelling on the terrain. Changes along the coast are better in RUC but inland underdoes it.
Evaluation: Major improvement

21 October
Features: Orographically triggered TS with outflows triggering new storms
Comments: RUC showed the mode of convection
Evaluation: Major improvement

22 October
Features: Convergence on the hills, isolated convection; outflows triggering new cells
Comments: Satellite and GPATS confirmed TS
Evaluation: Major improvement

23 October
Features: Convection initiated on the ranges due to topography - stationary on the ranges; sea breeze inhibits convection along the coast. Supercell and hail.
Comments: The outlook has a coastal zone free of convection; very good verification but significant convection overnight that RUC didn’t forecast. No models picked that up. (RUC identified the mode)
Evaluation: To examine in case study

24 October
Features: Convection in the afternoon delayed by morning cloudiness. Mode of convection: multicell and supercell
Evaluation: Case study: comparison of RUC convection with other models.

27 October
Features: Warm gust, westerly wind change
Comments: RUC did well with the arrival of the change but underestimated the speeds.
Evaluation: Case study: fire indices

13. VERIFICATION AND EVALUATION

13.1 Rainfields3

The calibration issue with the Terrey Hills radar affected the quality of the radar rainfall estimates in the FDP domain for much of the FDP. The issue was present during the major rainfall event on 14 October and was resolved by 22 October. A second issue, clutter breaking through at irregular intervals, was finally resolved on 20 November.

13.2 Convective Weather Outlook

GPATS lightning was used to verify the Convective Weather Outlooks in Visual Weather. The template that is used to perform this verification could be used by the Forecast Centres to verify their Outlooks after the completion of the FDP.

Figure 4 An example of using GPATS to verify the Convective Weather Outlook
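The report does not state which scores the Visual Weather template produces. As a hedged illustration, one common way to verify a yes/no outlook area against lightning observations is a contingency count with the standard categorical scores (probability of detection, false alarm ratio, critical success index). The grid-cell representation, the function name `outlook_scores` and the sample cells below are assumptions made for this sketch, not the FDP method.

```python
def outlook_scores(outlook_cells, lightning_cells):
    """Score a yes/no thunderstorm outlook against observed lightning.

    Both arguments are sets of grid-cell identifiers (an assumed representation):
    cells inside the outlook area, and cells with at least one observed stroke.
    """
    hits = len(outlook_cells & lightning_cells)          # forecast and observed
    misses = len(lightning_cells - outlook_cells)        # observed, not forecast
    false_alarms = len(outlook_cells - lightning_cells)  # forecast, not observed

    def ratio(numerator, denominator):
        # Return NaN when a score is undefined (no events in the denominator).
        return numerator / denominator if denominator else float("nan")

    return {
        "pod": ratio(hits, hits + misses),
        "far": ratio(false_alarms, hits + false_alarms),
        "csi": ratio(hits, hits + misses + false_alarms),
    }


# Hypothetical cells, for illustration only.
outlook = {(1, 1), (1, 2), (2, 1), (2, 2)}
lightning = {(1, 2), (2, 2), (3, 2)}
scores = outlook_scores(outlook, lightning)
print(scores)
```

In this toy case there are two hits, one miss and two false alarms, so the outlook detects two of the three lightning cells while half of the forecast cells see no lightning.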

13.3 Aviation Evaluation

Alicia Tuppack and Lily Simeonova conducted an evaluation of the RUC over a three-week period. This interaction with aviation forecasters was extremely valuable to the FDP. The findings from the 12-day evaluation by Alicia Tuppack and a group of aviation forecasters are presented in an extended report available internally. The main conclusions are summarised in Table 2.

Table 2 Summary of the evaluation of the RUC by Aviation Services

Phenomena: Comment

Sea Breeze: Did well on timing close to the coast and is able to propagate the shift inland, but with less accuracy

Southerly Change: Does not handle southerly change at coast well

Katabatic Wind: RUC appeared to have better timing with regard to the start and cessation of the stronger winds. Upper wind too fast.

Fog: Able to differentiate between mist, fog, and low cloud

Temperature, Dew Point, Moisture: Surface inversions not modelled properly, but temperature profile seems to be accurate

13.4 Objective Verification

Shaun Cooper

Objective verification was carried out using three sets of Automatic Weather Stations as ground truth (Coastal, Inland, and Mountains) to determine how the RUC performed in areas affected by coastal or orographic circulations. Three basic metrics were calculated: the bias, which is the difference between the forecast and the observation averaged across all stations in the group for a particular day; the average root mean square error (RMSE) for all stations in the group on a particular day; and the standard deviation of the forecast error (which, unlike the RMSE, is not affected by the bias term).

The verification included temperature, dew point temperature, and wind speed. A number of basic statistics were calculated, but the primary focus was on the bias and the root mean square error for each set of stations at a lead time of 12 hours.
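The three metrics can be illustrated with a minimal Python sketch. It assumes the errors from all stations in a group are pooled for one day and uses the population form of the standard deviation; the report does not give these details, and the function name and sample values are hypothetical.

```python
import math


def verification_metrics(forecasts, observations):
    """Return (bias, rmse, std_error) for paired forecasts and observations."""
    if len(forecasts) != len(observations) or not forecasts:
        raise ValueError("need equal-length, non-empty forecast and observation lists")
    errors = [f - o for f, o in zip(forecasts, observations)]
    n = len(errors)
    bias = sum(errors) / n                                   # mean forecast error
    rmse = math.sqrt(sum(e * e for e in errors) / n)         # includes the bias
    std_error = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)  # bias removed
    return bias, rmse, std_error


# Hypothetical 2 m temperatures (deg C) at three stations, for illustration only.
bias, rmse, std_error = verification_metrics([21.0, 23.0, 25.0], [20.0, 21.0, 22.0])
print(bias, rmse, std_error)
```

With these definitions the squared RMSE equals the squared bias plus the squared standard deviation of the error, which is why the standard deviation is unaffected by a constant offset between forecast and observation.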

Figure 5 shows the time series of the RMSE for coastal stations for the period 21 October to 20 November. It can be seen that the RUC consistently provides the most accurate coastal forecasts for temperature at 12-hour lead time.

Figure 5 Time series of RMSE for forecasts of 2m temperature at 12-hour lead time for coastal stations

The situation for inland stations is less clear-cut: the RUC adds little value over the ACCESS-SY forecasts (Figure 6), although both NWP models outperform the current GFE at these locations at 12-hour lead time. In the mountains, the RUC outperformed ACCESS-SY on 20 days in the month 21 October–20 November (Figure 7). One can therefore conclude that the RUC provides more reliable forecasts at the coast and in the mountains than the current operational ACCESS-SY model and performs similarly in the inland areas. It is not obvious that post-processing the RUC output using the OCF, or indeed the GFE, will add much value at these lead times.
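A day-by-day comparison of this kind can be sketched as follows. The function name, the date keys and the RMSE values are hypothetical; the sketch only shows how a count such as "better on N days" might be derived from two daily RMSE series.

```python
def count_days_better(rmse_a, rmse_b):
    """Count days on which model A has a strictly lower RMSE than model B.

    Both arguments map a date string to that day's RMSE. Only dates present
    in both series are compared. Returns (days_a_better, days_compared).
    """
    common_dates = sorted(set(rmse_a) & set(rmse_b))
    wins = sum(1 for day in common_dates if rmse_a[day] < rmse_b[day])
    return wins, len(common_dates)


# Hypothetical daily RMSE values (deg C), for illustration only.
ruc = {"2014-10-21": 1.2, "2014-10-22": 1.5, "2014-10-23": 2.0}
access_sy = {"2014-10-21": 1.6, "2014-10-22": 1.4, "2014-10-23": 2.3, "2014-10-24": 1.9}
wins, compared = count_days_better(ruc, access_sy)
print(f"Model A better on {wins} of {compared} days")
```

Ties and days missing from either series are not counted as wins, so the result is a conservative count.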

Figure 6 Time series of RMSE for forecasts of 2m temperature at 12-hour lead time for inland stations

Figure 7 Time series of RMSE for forecasts of 2m temperature at 12-hour lead time for mountain stations

13.5 Subjective and Qualitative Evaluation

Aurora Bell

A subjective evaluation of the model was conducted during and after the FDP. The aim was to test the capability of the Convective Scale Model with Rapid Update Cycle (CSM-RUC) to assist the forecasters in performing their job. For this type of model, this means testing the capability of the model to support the forecaster in identifying the mesoscale “feature of the day”, that is, the critical structures likely to affect the weather in the following 24 hours. The objective was to assess the possibility of improving existing services and of developing new services.

Subjective evaluation, being about the qualitative analysis of both qualitative and quantitative data, captures aspects that cannot be quantified when using an objective method. We wanted to evaluate how experts "feel" about the CSM-RUC performance, and the qualitative data analysed were extracted from notes and transcriptions during midday discussion, blog, forms, and interviews, using Cognitive Task Analysis, Critical and Successful Event and Knowledge Audit methods (more details in the Subjective Evaluation Appendix).

There is general agreement (e.g. Meteorological Service of Canada forecasters forums, Sills, 2009) that the weather analysis and forecast task has six elements (Bosart, 2003): forecasters must consider what recently happened and why, what is happening and why, and what will happen and why. Increasingly good numerical weather prediction encourages forecasters to focus only on what will happen (Bosart 2003, LaDue 2011). There are many reasons why forecasters do not engage in all six elements (e.g. workload, lack of training), but finding them was not in the scope of this evaluation.

One of the most relevant conclusions of the subjective analysis was that including the verification of the previous forecast, and self-reflection on what went well or wrong, in the work schedule improved the forecasters' engagement with the weather. This helped the forecaster to identify when the model generated a forecast that was likely to be bad, and still extract useful information from it (e.g. with convection, the model might miss the location and time but still give useful information about the severity of the convection mode).

We have also noticed that when there is a "social" component to this verification process, i.e. when forecasters can talk, brainstorm, discuss with each other, provide feedback (in person or via blogs and forums) then the quality of the engagement process increases.

Forecasters were asked to fill in evaluation forms (see the Appendices on Operations and Blog) and they could comment freely in an online blog. The comments were about their daily experience with the RUC while they were preparing convective outlooks as per the workflow schedule. The blog was a very successful tool for collecting feedback as the forecasters felt they had more freedom in what to comment on. We analysed how the RUC would affect the way forecasters understand the evolution of the meteorological situation at Day Zero (+12 hours). The forecasters had to analyse the 10-minute outputs from at least two consecutive runs of the RUC and produce a “Convective Outlook” based on the RUC. As part of building situational awareness, they had to identify “features” in the RUC outputs that might have a role in convection. The subjective evaluation highlighted that the detailed analysis based on the RUC enhanced the forecasters' situational awareness for the last hours and allowed for greater forecaster intervention in the Convective Outlook product and in the nowcasting process.

The blog, the mid-day discussions, forms, and interviews with the forecasters were analysed using the Cognitive Task Analysis (CTA) method. CTA is the study of what people know, how they think, how they organize and structure information, and how they learn when pursuing an outcome they are trying to achieve. Forecasters were interviewed using the Critical Event and Successful Event method, using open questions about the strengths and limitations of the model in different situations.

The aim of this process was to identify the role of the CSM-RUC in providing the forecasters with guidance about the predictability of the specific meteorological situation, whether the trigger was at synoptic, mesoscale or storm scale. The results of the evaluation (presented in Table 3 below) are structured in three categories, following the stages of the Nowcasting process during a Severe Weather Workflow: A. analysis and diagnostics, B. Very Short-Range Forecasts (0-12 hours), C. Warning for severe weather.

Examples of data used for subjective evaluation are provided in the Annex on Subjective Evaluation. In Table 3, the forecasters' experience during phases A, B, and C of the forecasting process (subjective evaluation) is paired with its translation into the scientific performance of the model. The subjective evaluation was split in two: positive (strengths) and negative (limitations). The limitations of the model were converted into "requirements", which will be used by the NWP researchers to improve the model performance.

Table 3 Summary of the subjective evaluation of the Rapid Update Cycle NWP model during the forecasting process: A. analysis and diagnostics, B. Very Short-Range Forecasts (0-12 hours), C. Warning for severe weather.

A. Forecasters' experience with analysis and diagnostics, and the corresponding strengths of the RUC

Forecasters' experience: Forecasters combined the RUC outputs with satellite, radar and observations to make a realistic representation of the actual meteorological state of the atmosphere. Forecasters used the detailed representation of topography in the RUC to identify mesoscale and local-scale features in pressure, wind, moisture, cloudiness and temperature fields.
Strength of the RUC: RUC has a detailed representation of terrain and coastal features that allows better interpretation at mesoscale of all the information available.

Forecasters' experience: Forecasters identified how valley winds and convergence areas change in different types of larger scale flows. They have noticed how the sea breeze can advance deep inland or stay closer to the coast depending on the general flow.
Strength of the RUC: Surface data is sparse, satellite is only qualitative, radar has less coverage, but all this information is enhanced by overlapping in the RUC.

Forecasters' experience: RUC helped the forecasters to gain a detailed image of a larger scale flow in areas with fewer observations.

Forecasters' experience: Forecasters obtained a good understanding of the environment in which some clouds develop and how they could be used to assess the state of the atmosphere. They observed Horizontal Convective Rolls in the visible channel in satellite and noticed that they are oriented along the wind shear in the RUC.

Forecasters' experience: Forecasters from other regions, unfamiliar with the area, could handle the mesoscale details because they were properly described in the RUC. Geographic features resolved by the model were terrain, coastlines, bodies of water, lakes, dams, rivers, inlets.

Forecasters' experience: Forecasters could inform the RFC on the possibility of a record of maxima. Good description of maximum temperature over different surfaces; temperatures over complex terrain under a stationary upper ridge.
Strength of the RUC: The surface temperature is easy to use in the RUC.

Success with predicting a temperature record.

Forecasters' experience: Forecasters could identify the possibility of a record in the maximum temperature of the day at a specific location and they could assess the time when this will happen and how local maxima vary across the large urban areas of Sydney.

Forecasters' experience: The aviation forecasters had to evaluate the change of the winds overnight and noticed that RUC appears to handle the development of north-westerly katabatic winds well in the Sydney Basin.
Strength of the RUC: The RUC was good with katabatic drainage flow under no synoptic stress.

Forecasters' experience: The forecasters could describe the overnight flow. The overnight forecast model shows some downslope flow and channeled wind along the valleys.
Strength of the RUC: Better upper winds around orography (at low levels and upper levels in the mountain lee).

Forecasters' experience: Forecasters could evaluate the evolution in time of wind parameters. The RUC seemed to outperform other models when predicting the start and cessation of stronger upper westerly winds (above 925 hPa) causing turbulent conditions.
Strength of the RUC: The wind is easy to use in the RUC, at all levels.

Forecasters' experience: Forecasters could validate their conceptual model on how a cold change would go faster up the river than on the land.
Strength of the RUC: Advance of the cold changes and wind changes through the complex topography.

Forecasters' experience: Forecasters noticed mesoscale climatology of precipitation. In easterly wind the RUC shows a rain shadow immediately downwind of the mountains.
Strength of the RUC: Orographic precipitation prediction is improved.

Forecasters' experience: Forecasters appreciated the orographic precipitation enhancement.

Forecasters' experience: Forecasters used the RUC to differentiate between fog, mist and low cloud.
Strength of the RUC: RUC describes better than other models the moisture in the low levels of the atmosphere.

Forecasters' experience: The RUC excelled in the timing of onset and clearance of thunderstorms from the Sydney metropolitan area, as well as the general trend in thunderstorm intensity.
Strength of the RUC: Good depiction of storms life-cycle.

Forecasters' experience: RUC 10-minute precipitation fields verified well with radar for areas of convection and general intensity in the short term
Strength of the RUC: The best use of the RUC 10-minute mode was in a “nowcasting” approach (

Forecasters' experience: In one case they have identified a supercell-like structure in precipitation field output at 10 minutes.
Strength of the RUC: Outputs at 10 minutes look like reflectivity and were good to assess the mode of convection.

Forecasters' experience: Forecasters had to estimate the total duration of rain and analyzed both convective line and stratiform trail precipitation in intensity and extension. The RUC description was very similar to what radar was showing.
Strength of the RUC: RUC manages the prediction of the mode of convection and describes the presence of the stratiform trail.

Forecasters' experience: Initiation of convection was late by a couple hours on the dry line but good on all other boundaries. The forecasters noticed that convective initiation and sustenance in the RUC vary from day to day probably because of how the parent model describes the larger scale forcing.
Strength of the RUC: Convection in the RUC is properly described as "possible" event but is very sensitive to lateral conditions of the model.

Forecasters' experience: Forecasters could predict afternoon rain in Sydney when no other model would mention it, because they believed RUC scenario that isolated cells will develop in a squall line and move towards east.
Strength of the RUC: RUC had a good upscale growth of convection.

Forecasters' experience: Forecasters observed that RUC can build large convective complex systems from lines of individual cells. In two cases in the North and West of the domain the outflow associated to these systems was too strong and expanded too far.
Strength of the RUC: Errors in precipitation upscale growth of RUC may be higher in weak shear environment when entrainment plays a larger role.

Forecasters' experience: Aviation forecasters could predict correctly the early cessation of evening precipitation in an unstable air mass ahead of a sharply convergent wind shift with an upper-trough moving across. The situation had a high impact on the activity.

Forecasters' experience: Forecasters gained information which coarser resolution models cannot provide on strongly forced or terrain-driven mesoscale structures.

A. Forecasters' experience with analysis and diagnostics, and the corresponding limitations of the RUC

Forecasters' experience: Some coastal observations did not fit well in the model description of wind. Forecasters did trust more the wind data from the model than from some of the observations.
Limitation of the RUC: Representativeness of coastal stations needs to be considered. Radar is overshooting the PBL in the coastal area. More observations at low levels are needed, or to plan lower scans with better clutter rejection.

Forecasters' experience: Forecasters did not trust the dew points and in general predicting moisture in the RUC.
Limitation of the RUC: The thermodynamic parameters are difficult to use due to the representativeness of the RUC sounding data in areas where significant forcing is present.

The model has a problem with establishing surface inversions and related surface moisture conditions.

The PBL schemes need more research and calibration.

Soil moisture needs to be included.

Coupling with ocean needs to be included. or situations that involve physics-sensitive scenarios, especially interactions among different physics parameterizations such as low stratus deck erosion which depends on PBL mixing, cloudy radiative transfer, and microphysics

Forecasters' experience: Forecasters could not appreciate the predictability of a specific situation and could not evaluate the parent model (was not accessible).
Limitation of the RUC: Convection in the RUC is very sensitive to lateral conditions.

Errors in precipitation upscale growth of RUC may be higher in weak shear environment when entrainment plays a larger role.

Requirement from RUC:

(A) To provide an accurate analysis of the current weather condition with a resolution of a few kilometers.

B. Forecasters' experience with VSR forecasts, and the corresponding strengths of the RUC

Forecasters' experience: Forecasters used loops of 10-minute steps of RUC outputs to make a suggestive description of the "story" from local-scale up to synoptic scale. They could understand the evolution of the weather through the forecast period and identify significant episodes. They could focus on a specific feature and decide on the policy for the day.
Strength of the RUC: Good diurnal cycle of temperature and geographical distribution of thermic gradients in strong synoptic forcing. Good estimate of the values of the temperature.

Forecasters' experience: Forecasters prepared in the morning an Afternoon Outlook product. This product was updated after one hour. The longer-range RUC model output (T+6hrs) provided forecasters with a timely understanding of the type of convection, local triggers, and general daily pattern of thunderstorms expected.
Strength of the RUC: The RUC provided the best description of the mesoscale state for the afternoon. Forecasters agreed that when the parent model provides the right conditions, then the added information from the RUC is valuable.

The hourly updates of the forecasts had the realistic appearance of the precipitation field, even if major cells were missing.

Forecasters' experience: Forecasters could issue a warning for strong easterly winds ahead of a southerly change based on the RUC.
Strength of the RUC: The model was good with resolving strong winds with a mesoscale surface gradient. The model was good with strong winds aloft mixing down into the boundary layer.

Forecasters' experience: Modes of convection identified: individual small cells, individual supercells, splitting cells, narrow lines, bowing segments, wide lines with trailing stratiform regions, systems with rapid forward propagation, back building systems, training cells, individual cells, rear inflow jet and surface outflow boundary in a convective complex, structure containing leading convective line and trailing stratiform area, width and length of the stratiform area and width of leading convective line, size and shape of large convective cells, a line of large, discrete cells, a solid squall line without pronounced embedded cellular structure, rapidly-moving, long-lived isolated cells
Strength of the RUC: Forecasters appreciated RUC skill in predicting the mesoscale characteristics of convection such as whether convection would organize into lines or remain as isolated cells. Traditionally, with larger scale guidance the forecasters would infer these from predictions of the storm environment. RUC offers realistic explicit predictions of these characteristics.

RUC is very good with convective mode at Day zero.

Forecasters' experience: Forecasters could use the first steps of the run to have a good picture of the convective mode.
Strength of the RUC: The RUC has short spin-up time, the precipitation looks real even at short lead times.

Forecasters' experience: Even if the arrival of the cold front was not precise in the RUC, the forecasters noticed the rotational structure of the wind along the cold front. Different runs persisted with resolving "supercell-type" winds.
Strength of the RUC: The model was very good with resolving the wind structure of a cold front and would have helped with wind advisories despite the lack of accuracy in time.

During FDP the forecasters did not have 10-minute outputs of the upper levels, but only hourly outputs.

The primary value of RUC is in indicating the type of event which is possible.

Forecasters would have high confidence in a major event that is described in the RUC, would carefully assess the forecast scenario and evaluate all possibilities and communicate with emergency managers and this heightens the situational awareness. Forecasters noticed that the RUC forecast is most useful when it is interpreted for its mesoscale characteristics such as convective mode and storm coverage.

B. Forecasters' experience with VSR forecasts, and the corresponding limitations of the RUC

Forecasters' experience: The RUC did not improve the movement of strong low-level southerly changes well which reduced forecaster confidence in the value of rapid update cycles.
Limitation of the RUC: The rapid updates had a very small effect on the new forecasts. Assimilation of radar data had very small effect on the new runs.

Forecasters' experience: Longer range precipitation fields (>T+6hrs) did not verify well with radar or lightning data, however fields could be used by forecasters as a guide to the severity of rainfall.
Limitation of the RUC: The RUC forecast loses quality at longer lead times.

Forecasters' experience: Forecasters did not observe a major improvement due to the "hot start", with one exception. They could not observe how assimilating the radar observations can improve the circulation at the convective scale. They have concluded that appropriate assimilation methods must be used if RUC has to be used in Nowcasting.
Limitation of the RUC: GAPS: The right methods for Rapid Update Cycles need to be implemented. Tools to evaluate their impact need to be developed. Assimilation needs to be supported with good quality data and appropriate tools to inform on these standards need to be developed. Coastal observations need to be supplemented.

(B) Requirement:

To provide hourly accurate updates of the current weather and VSR forecasts, particularly with severe storms

C. Forecasters' experience with warning for severe weather, and the corresponding limitations of the RUC

Forecasters' experience: Forecasters noticed how timing and strength of the sea breeze experienced at a given inland location depends on the prevailing flow and how the parent model handles this.
Limitation of the RUC: The model was not very good with timing and location in situations under weak large-scale forcing and are not geographically forced, and situations that result from secondary interactions such as convection-outflow boundary interactions

Forecasters' experience: The RUC did not handle the movement of strong low-level southerly changes well during the evaluation which reduced forecaster confidence in coastal wind predictions.
Limitation of the RUC: RUC struggled with predicting accurate timings for wind shifts in coastal areas. Need to develop a tool that expresses the forecaster's confidence in the forecast.

Initiation in weak forcing, precise time and location of individual cells, Winds over complex terrain when a mid-level wave approaches in a split flow pattern, late afternoon precipitation in an unstable air mass ahead of a weakly convergent wind shift with a decaying wave aloft.

Forecasters' experience: Nighttime precipitation with a short wave aloft in easterly flow (upper info from the sea is missing)
Limitation of the RUC: Forecasters would not issue a warning based on the RUC predictions due to the lack of accuracy in time and location. They would need an ensemble of RUC forecasts if "warning on forecasts" has to be implemented.

(C) Requirement:

To provide frequent accurate timing and location of forecasted precipitation systems.

Results of the Subjective Evaluation of the RUC during the FDP as they translate into user requirements for the RUC and for the Nowcasting Service (requirements A, B and C in Table 5).

References

Pliske, R., D.W. Klinger, R. Hutton, B. Crandall, B. Knight, and G. Klein. 1997. Understanding Skilled Weather Forecasting: Implications for Training and the Design of Forecasting Tools, 122. Klein Associates, Inc.

Bosart, L. F. (2003). Whither the weather analysis and forecasting process? Weather and Forecasting, 18, 520-529.

Sills, D.M.L. (2009). On the MSC forecasters forums and the future role of the human forecaster. Bulletin of the American Meteorological Society, 90(5), 619-627.

LaDue, D. (2011), How meteorologists learn to forecast the weather: Understanding human learning in complex domains, 20th Symposium on Education, 91st American Meteorological Society Annual Meeting, Article 2.4, Boston, MA: AMS. (Downloaded from https://ams.confex.com/ams/91Annual/webprogram/Parer184220.html)

14. FDP MAJOR FINDINGS

14.1 Summary

• High resolution NWP will be able to contribute to improved forecast services because it can

o forecast the timing, location, and mode of convection,

o model the distribution of wind, dew point, temperature in the context of complex topography including valleys, mountains, and coasts,

o depict low cloud and fog, and

o improve the forecaster’s understanding of the meteorological situation.

• Differences in the calibration and performance of the five radars in the Sydney area limit our ability to treat them as an integrated observation network

• Lack of integration between the current nowcasting applications is a major handicap

• Current BMTC training and the way that the Regional Forecast Centres use the training do not fully equip operational forecasters with the conceptual models that are needed for Day 0 nowcasting at the mesoscale

• There is considerable value in taking both scientists and experienced forecasters off-line so that together they can evaluate new NWP models and other operational systems in real-time and in a structured way

Table 4 Summary of the major finding for each FDP system

System: Major finding

RUC: Invaluable tool for mesoscale forecasting.

Rainfields: There are limits to the accuracy of quality control for non-dual pol radars and this limit has been reached in Rainfields3. A longer period is needed for a formal quantitative evaluation.

Visual Weather: VW was an invaluable tool for viewing model forecasts and data but needs more investment in people who are able to develop and maintain the system.

Calibrated Thunder: Need to modify the set of ingredients to include storm attributes when using convection permitting NWP to provide the guidance.

Subjective Evaluation: Invaluable tool for understanding model performance based on a structured use of experienced forecasters.

TITAN: The latest version of TITAN is better in the initiation phase.

TIFS: Needs significant modification before it can support a cell-based nowcasting service outside the metropolitan area.

3D-Rapic: Forecasters had a wide range in skill in using the most recent version of the application. Refresher training in the forecast centres is needed, and a redesign of the application should include a careful analysis of usability and graphical design.

14.2 CSM RUC

CSM RUC resolution is important to describe the details in the modes of convection. It is also very useful when forecasting temperatures on summer days with low synoptic forcing. RUC would have contributed significantly to the warnings issued by the RFC during the FDP.

RUC will contribute to the development of mesoscale conceptual models that can improve the quality of forecasts and warnings.

14.3 Nowcast Systems

TIFS and 3D-Rapic were difficult to extend for FDP and we need to evaluate their design and use over the medium/long term. It is not easy for the interstate forecasters to understand the geographic underlays for the various systems (place names, names of AWS stations, elevation, etc.) and to move from one platform to the next while keeping the same location on the screen.

There is a need to integrate the nowcasting platforms and improve the visualisation of data and forecasts. Forecasters find it very difficult to navigate simultaneously through different applications, and this is reflected in the lack of time to properly analyse the details.

14.4 Radar Network

Radar quality control is very important for data assimilation and identifying clutter in the volumes helps the forecasters in severe weather situations.

There were several hardware and software failures in the radars during the FDP and this had a negative impact on the quality control system. The systems and processes to monitor and mitigate radar issues need to be strengthened.

The forecasters in the Regional Forecast Centre were not always aware that a particular radar was having a problem, what was being done to fix the issue, and when the radar was back within operating specification.

14.5 Subjective Evaluation

14.5.1 Strengths of the RUC

The identified strengths of the RUC were used to promote the implementation of the model into training and future operations. Based on these strengths, it was considered that the model can already be used, without the rapid update cycle, to describe the meteorological features that are triggered by orography and the diurnal cycle in more detail than is already available in operations.

14.5.2 Actions to make the RUC fit for purpose

The subjective evaluation identified the way that the RUC could be used in very short range forecasts, and the actions that are needed to make the RUC fit for that purpose:

(A) Requirement: To provide an accurate analysis of the current weather condition with a resolution of a few kilometres. Actions: better description of land surface properties, coupling with an ocean model for better sea surface temperatures.

(B) Requirement: To provide hourly accurate updates of the analysis and forecast. Actions: research into better mesoscale data assimilation and rapid update cycling. This should include improving the use of existing data, and the use of other data sources, including third party observations.

(C) Requirement: To provide frequent accurate timing and location of forecasted convective systems. Actions: introduce ensembles to provide information on uncertainty, provide a tool to measure predictability.

14.6 Knowledge gaps in the use of RUC in a Nowcasting Service

The value of the NWP model for operational forecasting can be increased by developing a “conceptual model of the NWP behaviour” and then integrating this understanding into the forecast process.

The following gaps were identified:

• A collation of meteorological conceptual models for the Australian climate, regions and topography, that present the scale and the predictability of a specific phenomenon (TC, southerly busters, breezes etc.).

• A conceptual model of NWP behaviour that explains how the model behaves in a specific meteorological situation (e.g. strong inversion, strong convection, retrograde motion etc.) and how this relates to the parent model in that circulation. This means that the forecasters understand how the model deals with physics, what the limitations of the implemented parametrization schemes are, and what calibration was performed on the model in a specific region.

• A descriptive knowledge of the NWP system that explains the technical differences between different runs, the types of data that are assimilated and at what times, and the hours of assimilation and lateral boundaries, so that the forecaster knows which run has the “better” set of data and which initializes better for the feature of the day.

• An understanding of how the errors in the boundary conditions from the parent nesting model are propagated into the RUC.

• An understanding of the dependence of model forecast errors (particularly the initiation and location of convection) on the initiation time of the forecast.

• Developing a process to monitor the situation to confirm the validity of the conceptual model of the day and to initiate a revision if required.

15. APPENDICES

15.1 List of Acronyms

ACCESS-SY Operational ACCESS Sydney NWP system at 4 km resolution

ACCESS Australian Community Climate and Earth System Simulator

AWS Aviation Weather Services

BMTC Bureau of Meteorology Training Centre

CSM Convective Scale Model

FDP Forecast Demonstration Project

RUC Rapid Update Cycle

GFE Graphical Forecast Editor (software for viewing forecast data)

GPATS Global Position And Tracking Systems (lightning)

(G)OCF (Global) Operational Consensus Forecast

HO Head Office

NTFGS National Thunderstorm Forecast Guidance System

NWP Numerical Weather Prediction

NSWRO New South Wales Regional Office

RFC Regional Forecasting Centre

R12/ACCESS-R Regional limited area model with 12 km resolution

RMSE Root Mean Square Error

STD Standard deviation of forecast error

SAMU Sydney Airport Meteorological Unit

SARO South Australia Regional Office

SREP Strategic Radar Enhancement Project

TIFS Thunderstorm Interactive Forecast System

TITAN Thunderstorm Identification, Tracking, Analysis and Nowcasting

TS Thunderstorm

TAF Terminal Aerodrome Forecast

VW Visual Weather (software for viewing forecast data)

ZR relation Relationship between radar reflectivity and rainfall rate

WOSB Weather, Oceans and Services Branch

3DRapic 3D Radar Visualisation software

15.2 Point Based Verification

Shaun Cooper

15.2.1 Method

The Verify system was used to complete the verification of forecasts from several sources during the Sydney Forecast Demonstration Project (FDP). The 00Z forecasts from each source of guidance were compared to point-based observations and several scores were computed. The sources of guidance, the scores computed, and the different verification domains are described below.

The verification was completed once per day using the 12-hour lead time forecasts. The scores for the different verification regions were computed and kept, allowing both an analysis of the performance over different regions and the calculation of the domain average performance.

15.2.2 Guidance

Five forecast guidance streams have been verified during the FDP, ranging from Direct Model Output (DMO) and post-processed NWP data to the official GFE forecasts. These sources of guidance are:

ACCESS-SY 4km - ACCESS City (Sydney) 4km NWP.

ACCESS-RUC - Rapid Update Cycle ACCESS 1.5 km NWP.

GOCF (RUC downscaling) - The Gridded OCF forecast with RUC downscaling. The RUC model is upscaled to the GOCF resolution and the difference between the RUC and GOCF is computed. This residual is added to the RUC model to generate the downscaled GOCF field.

GOCF - development version - Developmental version of the Gridded OCF (uses the ACCESS-C based MSAS analysis, see Operations Bulletin 103).

GFE - Official NSW GFE forecast.

15.2.3 Verification Regions

Several observation points within the FDP domain have been used to complete the point-based verification of the forecast guidance. These points have been split into 3 groups: Coastal, Mountain and Inland:

Coastal - stations within a few kilometres of the coast. The forecast error at these stations is likely to particularly reflect the skill of models in forecasting a sea-breeze during the warmer months of the year.

Mountain - stations on the Great Dividing Range, or in other high terrain. In selecting these stations, both the high elevation stations (over about 400 m) and some lower elevation stations in areas of rapidly varying elevation have been chosen. Areas of rapidly varying elevation have been selected using the “roughness” parameter. This is the standard deviation of the elevation of the nine MSAS grid points including and surrounding the METAR station.

Inland - stations that are neither coastal nor mountainous.

Domain Average – The average of all stations.
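The “roughness” parameter used to select Mountain stations can be sketched as follows. This is a minimal illustration only: the function name, the nested-list grid layout, the elevation values and the use of the population (rather than sample) standard deviation are all assumptions, not details taken from the Verify system or MSAS.

```python
import statistics


def roughness(elevation, row, col):
    """Standard deviation (m) of elevation over the nine grid points
    including and surrounding (row, col).

    Assumes (row, col) is not on the edge of the grid.
    """
    block = [
        elevation[r][c]
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
    ]
    # Population standard deviation is an assumption of this sketch.
    return statistics.pstdev(block)


# Hypothetical elevations (m) around two stations.
flat = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
steep = [[50, 200, 400], [100, 300, 600], [150, 450, 900]]

print(roughness(flat, 1, 1))   # flat terrain gives zero roughness
print(roughness(steep, 1, 1))  # rapidly varying terrain gives a large value
```

A station would then be classed as Mountain if it lies above about 400 m or if its roughness exceeds some chosen threshold; the report does not state the threshold used.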

15.2.4 Verification Scores

Three scores have been computed using these data: Residual, Root Mean Square Error (RMSE) and the standard deviation of forecast error (STD):

Residual - The forecast residual is the difference between the forecast and the observation (fcst - obs).

RMSE - The average Root Mean Square Error for the verification region.

Standard Deviation of Forecast Error (STD) - The RMSE is the square root of the Mean Squared Error (MSE), which is the average of the squared difference between the forecast and the observation. The MSE can be decomposed into a variance component and a bias squared component, i.e. MSE = Var + Bias². The standard deviation of the forecast error (the square root of the variance component) is a way of mitigating the effect of forecast bias on the MSE and RMSE scores.

For example, if every point within a region contained a bias of 1 degree, the RMSE of this region would be 1 degree. However, the standard deviation of the error would be 0. Thus, if the bias were to be corrected the forecast would be perfect.
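The three scores and the decomposition of the MSE into variance and squared bias can be illustrated with a short sketch. The function name and the sample values are hypothetical, and the code is not the Verify implementation; it simply reproduces the 1 degree bias example using the definitions given in this section.

```python
import math


def verification_scores(forecasts, observations):
    """Return the mean residual (bias), RMSE and STD of the forecast error."""
    residuals = [f - o for f, o in zip(forecasts, observations)]  # fcst - obs
    n = len(residuals)
    bias = sum(residuals) / n
    mse = sum(r * r for r in residuals) / n
    variance = sum((r - bias) ** 2 for r in residuals) / n
    return {
        "bias": bias,
        "mse": mse,
        "variance": variance,
        "rmse": math.sqrt(mse),
        "std": math.sqrt(variance),
    }


# Hypothetical observations (degrees C) and a forecast that is 1 degree too warm everywhere.
obs = [18.0, 21.5, 25.0, 16.25]
fcst = [o + 1.0 for o in obs]

scores = verification_scores(fcst, obs)
print(scores["rmse"], scores["std"])  # RMSE equals the bias; STD is zero

# A forecast with both bias and scatter satisfies MSE = Var + Bias^2.
mixed = verification_scores([19.5, 21.0, 27.0, 17.0], obs)
print(mixed["mse"], mixed["variance"] + mixed["bias"] ** 2)
```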

15.2.5 Direct Model Output: ACCESS City and ACCESS RUC

The Sydney Forecast Demonstration Project (FDP) ran from Monday September 29, 2014 through to Friday December 5, 2014. Two sources of Numerical Weather Prediction (NWP) guidance were used during the FDP, the ACCESS City (Sydney) model and the ACCESS Rapid Update Cycle (RUC) model.

The ACCESS City model has a resolution of approximately 4 km spatially and 1 hour temporally, out to 75 hours lead time. ACCESS City is run four times a day: 00Z, 06Z, 12Z and 18Z. The ACCESS RUC model has a spatial resolution of 1.5 km and a 10-minute temporal resolution. ACCESS RUC was run hourly throughout the FDP with the maximum forecast lead time varying depending on the model run. The shortest run had a maximum lead time of 11 hours and the longest, 38 hours. ACCESS RUC has a data assimilation cycle while ACCESS City does not.

Real-time, point-based verification of three forecast variables from the NWP models was completed during the FDP. The variables verified were the 2 m temperature, wind speed and the screen dew point temperature. Three scores were computed for each of these variables: the root mean square error (RMSE), the residual, and the standard deviation of forecast error (STD). The site observations were split into three domains (coastal, inland and mountain sites), with each score for each variable computed over each domain.

Temperature (2 m)

The domain average RMSE, residual and STD of the ACCESS City and RUC models are presented in Table 5, along with the worst performance in brackets. These data show that the ACCESS RUC model had a lower average RMSE and STD and a 0.07 °C higher residual. Furthermore, the worst performance values are lower for the ACCESS RUC model for the residual and STD fields whilst being 0.01 °C higher for the RMSE field.

The domain average RMSE of the ACCESS RUC model was lower than that of the ACCESS City model on 66.1% of occasions, while the domain average STD was less on 71.2% of occasions.

Table 5 The domain average scores (and worst performance) for the ACCESS City and ACCESS RUC models during the FDP for 2 m temperature.

Score      ACCESS City (°C)   ACCESS RUC (°C)
RMSE       1.83 (2.60)        1.75 (2.61)
Residual   0.38 (1.99)        0.45 (1.72)
STD        1.63 (2.31)        1.52 (2.16)

Histograms of the two NWP models for the different verification scores are presented in Figure 8. The distributions of both models are quite similar for each score, with no outliers.

Figure 8 Histograms of the domain average RMSE, residual and STD scores for the 2 m temperature forecasts from the ACCESS City and ACCESS RUC models during the FDP.

Wind Speed (10 m)

The domain average RMSE, residual and STD of the ACCESS City and RUC models for the wind speed forecasts are presented in Table 6, along with the worst performance in brackets. These data show that the ACCESS City model had a slightly lower average RMSE, residual and STD than the ACCESS RUC model. However, the worst performance values are lower for the ACCESS RUC model for each of the scores.

The domain average RMSE of the ACCESS RUC model was lower than that of the ACCESS City model on 37.3% of occasions, while the domain average STD was less on 44.1% of occasions.

Table 6 The domain average scores (and worst performance) for the ACCESS City and ACCESS RUC models during the FDP for 10 m wind speed forecasts.

Score      ACCESS City (m s⁻¹)   ACCESS RUC (m s⁻¹)
RMSE       1.89 (4.17)           2.00 (3.90)
Residual   0.08 (-2.11)          0.56 (1.54)
STD        1.72 (3.96)           1.79 (3.78)

Histograms of the two NWP models for the different verification scores are presented in Figure 9. The distributions for the two models are skewed for each of the scores. When broken into their individual domains, outliers in the distributions are obvious. In particular, the forecasts from both models for October 14, 2014 were poor. This was a hard day to forecast in general and is an ideal candidate for a case study.

Figure 9 Histograms of the domain average RMSE, residual and STD scores for the 10 m wind speed forecasts from the ACCESS City and ACCESS RUC models during the FDP.

Screen Dew Point Temperature

The domain average RMSE, residual and STD of the ACCESS City and RUC models of screen dew point temperature forecasts are presented in Table 7, along with the worst performance in brackets. These data show that the ACCESS RUC model had a lower average RMSE and residual than the ACCESS City model, while the ACCESS City STD was 0.09 °C lower than ACCESS RUC. ACCESS City had smaller worst performances than ACCESS RUC for RMSE and STD while ACCESS RUC worst residual was lower than that of ACCESS City.

The domain average RMSE of the ACCESS RUC model was lower than that of the ACCESS City model on 54.2% of occasions, while the domain average STD was less on 44.1% of occasions.

Table 7 The domain average scores (and worst performance) for the ACCESS City and ACCESS RUC models during the FDP for screen dew point forecasts.

Score      ACCESS City (°C)   ACCESS RUC (°C)
RMSE       2.37 (4.98)        2.28 (5.13)
Residual   1.15 (3.34)        -0.15 (2.77)
STD        1.88 (3.62)        1.97 (3.95)

Histograms of the two NWP models for the different verification scores are presented in Figure 10. The distributions of both models are quite similar for RMSE and STD; however, ACCESS RUC's residual distribution is centred much closer to zero than that of ACCESS City.

Figure 10 Histograms of the domain average RMSE, residual and STD scores for the screen dew point forecasts from the ACCESS City and ACCESS RUC models during the FDP.

15.2.6 Post Processed Guidance

It was quite difficult to draw conclusions by directly comparing the forecasts from GOCF Dev, GOCF RUC and the GFE at the point-based observation sites as has been done here. This is because of the nature of the different forecast products.

The GOCF is an ensemble mean of NWP forecasts at different spatial resolutions that have been bias corrected with respect to the Mesoscale Surface Analysis Scheme (MSAS) gridded analysis. It is forecasting the average conditions within a particular grid cell. It is shown to perform very well when compared to this gridded 'truth', MSAS. It is not designed to make point forecasts within a particular grid cell. This must be kept in mind when drawing conclusions from the data presented here. The performance of the GOCF at these point observation locations may be unfair to the GOCF as it is attempting to forecast the areal average conditions, not point forecasts. Verifying site-based forecasts at these locations during the FDP would have allowed an extra source of comparison and would allow more conclusions to be drawn.

Comparing the GOCF Dev directly with GOCF RUC is also fraught with danger. GOCF RUC is produced by upscaling the ACCESS RUC output to the GOCF Dev resolution. The residual between the GOCF Dev and upscaled RUC field is then computed. This residual is then added to the ACCESS RUC field to produce a downscaled GOCF RUC forecast. In the process, the GOCF Dev field is assumed to be bias free, thus the application of the residual field to the ACCESS RUC field would remove any bias. However, the GOCF Dev field is tuned to MSAS, not to point data. As the ACCESS RUC is not attempting to forecast the MSAS field, as GOCF Dev is, it is difficult to draw conclusions from a direct comparison between the two at observation points. Keeping this in mind, there are still some interesting results from the FDP.
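The GOCF RUC downscaling steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the operational code: upscaling is taken to be a simple block average, the fine grid is assumed to nest exactly inside the coarse grid by an integer factor, the residual is taken as GOCF Dev minus upscaled RUC and is spread back to the fine grid by repeating each coarse value, and all function names and values are hypothetical.

```python
def upscale(fine, factor):
    """Block-average a fine grid onto a coarse grid (assumed upscaling method)."""
    rows, cols = len(fine) // factor, len(fine[0]) // factor
    return [
        [
            sum(
                fine[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ) / factor ** 2
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def downscale_gocf(ruc_fine, gocf_coarse, factor):
    """Add the coarse residual (GOCF minus upscaled RUC) to the fine RUC field."""
    ruc_coarse = upscale(ruc_fine, factor)
    residual = [
        [g - r for g, r in zip(g_row, r_row)]
        for g_row, r_row in zip(gocf_coarse, ruc_coarse)
    ]
    # Each fine cell receives the residual of the coarse cell that contains it.
    return [
        [
            ruc_fine[i][j] + residual[i // factor][j // factor]
            for j in range(len(ruc_fine[0]))
        ]
        for i in range(len(ruc_fine))
    ]


# Hypothetical 4x4 RUC temperatures (degrees C) and a 2x2 GOCF field.
ruc = [
    [20.0, 21.0, 24.0, 25.0],
    [22.0, 23.0, 26.0, 27.0],
    [18.0, 18.0, 30.0, 31.0],
    [19.0, 19.0, 32.0, 33.0],
]
gocf = [[21.0, 26.0], [19.0, 30.5]]

gocf_ruc = downscale_gocf(ruc, gocf, 2)
print(upscale(gocf_ruc, 2))  # coarse averages now match the GOCF field
```

Under these assumptions the downscaled field keeps the fine-scale structure of the RUC while its coarse-cell averages match the GOCF Dev field, which is why any bias of GOCF Dev relative to point observations carries over.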

Temperature (2 m)

The domain average RMSE, Residual and STD for the GOCF Dev, GOCF RUC and GFE are presented in Table 8, along with the worst performances in parentheses. These data show that the GOCF RUC has the lowest average RMSE followed by GFE and GOCF Dev. GOCF RUC also had the smallest worst performance followed by the GOCF Dev and GFE. The GFE's worst performance is almost 1.5 °C worse than the worst GOCF Dev forecast and almost 2.25 °C worse than the worst GOCF RUC forecast. The best average residual was from the GOCF Dev followed by the GFE and GOCF RUC. Again, the GFE had the largest (magnitude) worst performance followed by GOCF RUC and GOCF Dev.

In terms of STD it was GOCF RUC that had the lowest average followed by the GFE and GOCF Dev. Although the worst performance was again from the GFE it was only slightly larger than the worst performance of GOCF Dev while GOCF RUC worst was much less than both.

Table 8 The domain average scores (and worst performance) for the GOCF Dev, GOCF RUC and GFE forecasts during the FDP for 2 m temperature.

Score      GOCF Dev (°C)   GOCF RUC (°C)   GFE (°C)
RMSE       2.71 (4.02)     2.06 (3.25)     2.25 (5.49)
Residual   -0.06 (2.13)    0.62 (2.30)     -0.36 (-3.57)
STD        2.27 (3.57)     1.79 (2.81)     1.89 (3.75)

Histograms of the three sources of forecasts for the different verification metrics are presented in Figure 11. This shows that, although the average performance for the GFE in terms of RMSE was worse than GOCF Dev and GOCF RUC, this is due to a couple of large forecast errors. Most of the distribution is closer to zero than the GOCF Dev distribution and close to that of GOCF RUC. Outliers in the residual and STD GFE distributions are also obvious.

Figure 11 Histograms of the domain average RMSE, residual and STD scores for the 2 m temperature forecasts from the GOCF Dev, GOCF RUC and the GFE during the FDP.

Wind Speed (10 m)

Note: there were no wind forecasts from GOCF RUC.

The domain average RMSE, Residual and STD for the GOCF Dev and GFE are presented in Table 9, along with the worst performances in parentheses. These data show that the average RMSEs are the same; however, the GFE has a larger worst performance.

The best average residual was from the GOCF Dev. Again, the GFE had the largest (magnitude) worst performance. In terms of STD it was GFE that had the lowest average, and the worst performance was from GOCF Dev.

Table 9 The domain average (and worst performance) for the GOCF Dev and GFE forecasts of wind speed throughout the FDP.

Score      GOCF Dev (m s⁻¹)   GFE (m s⁻¹)
RMSE       2.13 (4.02)        2.13 (5.14)
Residual   -0.18 (1.87)       -1.12 (-3.75)
STD        1.95 (3.76)        1.71 (3.45)

Histograms of the two sources of forecasts for the different verification metrics are presented in Figure 12. This shows that, although the average performance for the GFE in terms of RMSE was the same as GOCF Dev, there are a number of forecasts with large RMSEs. Outliers in the residual and STD GFE distributions are also obvious.

Figure 12 Histograms of the domain average RMSE, residual and STD scores for wind speed forecasts from the GOCF Dev and the GFE during the FDP.

Screen Dew Point Temperature

Note: There were no GOCF Dev dew point forecasts.

The domain average RMSE, Residual and STD for the GOCF RUC and GFE are presented in Table 10, along with the worst performances in parentheses. These data show that the average RMSE from the GOCF RUC was lower than that of the GFE, and the GFE had a larger worst performance.

The best average residual was from the GFE and the worst performance was from GOCF RUC. In terms of STD it was GFE that had the very slightly lower score but very slightly higher worst performance.

Table 10 The domain average score (and worst performance) for the GOCF RUC and GFE forecasts of dew point throughout the FDP.

Score      GOCF RUC (°C)   GFE (°C)
RMSE       2.06 (3.25)     2.13 (4.33)
Residual   0.47 (2.99)     -0.09 (2.62)
STD        1.84 (3.43)     1.81 (3.44)

Histograms of the two sources of forecasts for the different verification metrics are presented in Figure 13. These plots show that the distributions of the RMSE are very similar; however, the GFE does have some large forecast errors. The GFE's residual histogram is centred very close to zero while the GOCF RUC distribution is shifted slightly to the right. The STD distributions are very similar to each other.

Figure 13 Histograms of the domain average RMSE, residual and STD scores for screen dew point forecasts from the GOCF RUC and the GFE during the FDP.

15.3 Use of Rapid Update Cycle Predictions in the GFE

Michael Foley

15.3.1 Introduction

The focus of the GFE Subproject of the Sydney 2014 FDP was on application of RUC guidance to the fire weather process in the GFE. The RUC guidance was incorporated into a development version of the GFE to explore the usefulness of hourly NWP updates for fire weather forecasting and to trial the benefits of the extra high spatial resolution for preparation of spot fire forecasts. New GFE fire weather summary displays and alerting were developed to enable the forecaster to keep abreast of the increased volume of data, with updated forecast information arriving every hour.

The new RUC and ACCESS-R based NTFGS guidance for thunderstorm probability was also provided to the GFE, and forecasters experimented with producing gridded forecasts of thunderstorm probability.

15.3.2 Incorporating RUC Guidance into the GFE

The RUC guidance was remapped from 1.5 km to the 6 km horizontal resolution used by the operational GFE grid for NSW. GFE forecast services require gridded information across the whole state. To experiment with how guidance such as RUC, which only partially covers this domain, could be incorporated into the GFE, it was embedded within a similar model, in this case the most recent run of the 12 km resolution ACCESS-R model, which covered the rest of the domain, with blending at the edges for a seamless result. The guidance was presented at the hourly steps supported by the operational GFE.
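One way to embed a limited-area field within a larger one with blending at the edges is sketched below. The report does not describe the blending scheme used in the GFE, so the linear ramp, its width, the function names and the sample grids are all assumptions; both fields are assumed to have already been remapped to the same grid.

```python
def edge_weight(i, j, n_rows, n_cols, ramp):
    """Weight given to the inner field at inner-grid cell (i, j).

    Rises linearly from the inner domain edge and reaches 1.0 once the
    cell is `ramp` cells or more from the nearest edge.
    """
    dist = min(i, j, n_rows - 1 - i, n_cols - 1 - j)  # cells from nearest edge
    return min(1.0, (dist + 1) / (ramp + 1))


def embed(outer, inner, row0, col0, ramp=2):
    """Embed `inner` into a copy of `outer` with its top-left corner at (row0, col0)."""
    blended = [row[:] for row in outer]
    n_rows, n_cols = len(inner), len(inner[0])
    for i in range(n_rows):
        for j in range(n_cols):
            w = edge_weight(i, j, n_rows, n_cols, ramp)
            blended[row0 + i][col0 + j] = (
                w * inner[i][j] + (1.0 - w) * outer[row0 + i][col0 + j]
            )
    return blended


# Hypothetical fields: a 9x9 outer grid at 10.0 and a 7x7 inner grid at 20.0.
outer = [[10.0] * 9 for _ in range(9)]
inner = [[20.0] * 7 for _ in range(7)]

blended = embed(outer, inner, 1, 1)
print(blended[0][0], blended[1][1], blended[2][2], blended[4][4])
```

Values move smoothly from the outer field at the boundary of the inner domain to the inner field in its interior, which avoids a visible discontinuity where the two models disagree.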

Forecasters construct gridded fire weather forecasts in the GFE by using objective guidance and manual editing tools to produce their forecasts of temperature, humidity and wind, and then running tools which use these inputs together with gridded information on fuel state to calculate forest and grassland fire danger indices. The tools also determine a grid of ‘fire weather hazards’ which indicate the grid cells which have met or exceeded the fire danger index thresholds for severe, extreme or catastrophic fire danger ratings at some time of the day.

So that forecasters would not need to go through all these steps to determine the implications for fire weather of each hourly RUC run, automatic pre-calculation of the forest and grassland fire danger indices and the resulting fire weather hazards was performed by the GFE. The fuel state information used (drought factor for forest and curing and fuel load for grassland) was taken from the operational GFE forecasts. For comparison with operationally-available guidance, the same fire weather outputs were pre-calculated for ACCESS-C.
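The pre-calculation for a single forest grid cell might look like the sketch below. The report does not give the equations or thresholds used in the GFE; the sketch assumes the commonly quoted Noble et al. (1980) equation form of the McArthur Mark 5 Forest Fire Danger Index and rating thresholds of 50, 75 and 100 for severe, extreme and catastrophic. The function names and input values are hypothetical.

```python
import math


def forest_fire_danger_index(temp_c, rel_hum_pct, wind_kmh, drought_factor):
    """McArthur Mark 5 FFDI in the Noble et al. (1980) equation form (assumed)."""
    return 2.0 * math.exp(
        -0.450
        + 0.987 * math.log(drought_factor)
        - 0.0345 * rel_hum_pct
        + 0.0338 * temp_c
        + 0.0234 * wind_kmh
    )


# Assumed forest rating thresholds, highest first; not taken from the report.
HAZARD_THRESHOLDS = [(100.0, "catastrophic"), (75.0, "extreme"), (50.0, "severe")]


def hazard_for_cell(hourly_ffdi):
    """Highest rating met or exceeded at some time of the day, or None."""
    peak = max(hourly_ffdi)
    for threshold, label in HAZARD_THRESHOLDS:
        if peak >= threshold:
            return label
    return None


# Hypothetical hourly guidance for one cell: (temperature C, RH %, wind km/h).
hourly_weather = [(28.0, 30.0, 20.0), (35.0, 15.0, 40.0), (31.0, 25.0, 30.0)]
hourly_ffdi = [forest_fire_danger_index(t, rh, v, 10.0) for t, rh, v in hourly_weather]

print([round(x, 1) for x in hourly_ffdi])
print(hazard_for_cell(hourly_ffdi))
```

Running the same calculation over every grid cell and every new NWP run gives the hazard grids that a comparison or alerting tool can then difference against the current forecast.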

The forecaster was supplied with a new hazard comparison tool which enabled a rapid graphical overview of hazard outcomes for the day for different NWP runs or between NWP guidance and the existing GFE forecast. The hazard comparison tool also ran automatically on arrival of the latest RUC guidance, generating an alert dialog if needed based on comparison between this run, the current forecast grids, and certain previous runs of RUC guidance.

A new thunderstorm probability parameter was added to the GFE. It could be populated with the new RUC or ACCESS-R based NTFGS thunderstorm probabilities or using an existing smart tool which derived probabilities from a lagged ensemble of old NTFGS forecasts. A simple smart tool was also written to map from thunderstorm coverages (chance/isolated/scattered/widespread) to nominal probabilities, allowing loose comparison with the forecasts in official GFE weather grids.
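The two ways of populating the thunderstorm probability grid can be illustrated as below. The nominal probability values, the "none" category, the function names and the simple member-counting approach for the lagged ensemble are all assumptions made for this sketch; the report does not state the values or method used by the GFE smart tools.

```python
# Hypothetical nominal probabilities (%) for each coverage term;
# the report does not state the values used in the GFE smart tool.
NOMINAL_PROBABILITY = {
    "none": 0,
    "chance": 10,
    "isolated": 20,
    "scattered": 40,
    "widespread": 70,
}


def coverage_to_probability(coverage_grid):
    """Map a grid of coverage terms to a grid of nominal probabilities (%)."""
    return [[NOMINAL_PROBABILITY[term] for term in row] for row in coverage_grid]


def lagged_ensemble_probability(member_grids):
    """Percentage of lagged runs with a thunderstorm (1) in each grid cell.

    Simple member counting is an assumption; the existing smart tool
    may weight or smooth the runs differently.
    """
    n = len(member_grids)
    rows, cols = len(member_grids[0]), len(member_grids[0][0])
    return [
        [100.0 * sum(m[i][j] for m in member_grids) / n for j in range(cols)]
        for i in range(rows)
    ]


official = [["none", "scattered"], ["isolated", "widespread"]]
print(coverage_to_probability(official))

# Four successive (lagged) runs, each a 1x2 grid of thunderstorm yes/no.
runs = [[[1, 0]], [[1, 0]], [[0, 0]], [[1, 1]]]
print(lagged_ensemble_probability(runs))
```

Because the coverage terms map to only a handful of nominal values, the resulting grid supports only a loose comparison with guidance probabilities, as noted above.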

15.3.3 Outcomes from the Forecast Demonstration Project

During the FDP, the RUC guidance and new tools in the GFE were trialled on days of increased fire weather risk by using the new guidance as the basis for gridded fire weather products, as well as in production of 12 hour-long spot fire weather forecasts. The spot locations chosen were a mixture of real forecast requests (for comparison with operational forecasts) and observation locations (to enable evaluation against direct observations of conditions). Automated alerts were then used for monitoring of the forecasts.

As it turned out, the period of the Forecast Demonstration Project was quiet in terms of fire weather conditions, and much of the focus of the project was on rainfall and thunderstorm prediction. Nevertheless, fire weather forecast grids were prepared on 7 days and 11 spot fire forecasts were done.

Subjective feedback was obtained from operational forecasters involved in the trial. The alerting and visualization of changes in hazard areas were useful for monitoring forecasts, in particular through comparing the latest version of guidance to that which had been used to prepare the forecast. The detailed depiction of wind features such as sea breezes interacting with topography helped forecasters build better understanding of meteorological processes significant for spot fire forecasts.

As part of the thunderstorm outlook process in the FDP, forecasters prepared 3-hourly thunderstorm probability forecast grids for the rest of the day in the GFE. The FDP is the first time that we have had a more than rudimentary ability to look at thunderstorm probabilities in the GFE, and quantifying the thunderstorm probabilities was an unfamiliar activity for forecasters. It did prompt some useful discussion about definitions of the thunderstorm probability grid and the relationship between this and rainfall probabilities.

15.3.4 Limitations

The alerting based on new runs of RUC was not as impactful as had been expected. This was partly because the successive rapid updates did not exhibit much change in forecast with the assimilation of new observations; the main changes occurred when the run of the model supplying boundary conditions changed every 6 hours. In addition, the model-based alerting in the GFE needs to account better for the fact that the RUC runs were of various lengths and could have different amounts of coverage of the peak fire weather period of the day.

Thunderstorm probabilities in the GFE are desired by the Weather Forecasting Branch and are consonant with the move to more probabilistic descriptions of precipitating weather phenomena in GFE products. They could also form a useful service in their own right. However, the quality of thunderstorm probability guidance will need to improve before this is realistic to undertake in GFE operations.

15.3.5 Successes and Transition to Operations

For the FDP we solved the technical issues involved with handling guidance in the GFE for domains smaller than the GFE grid domain. This has already been transitioned to operations to allow use of ACCESS-C guidance in the GFE forecast process. The benefits of the mesoscale detail in the 1.5 km RUC guidance for spot fire weather forecasting in the GFE will become available in operations later this year with the APS2 upgrade of ACCESS-C to the same spatial resolution.

The pre-calculation of fire weather guidance proved useful and we would expect to transition this to operations in due course, perhaps as part of forecast process automation work in the GFE. Hazard comparison and alerting tools should also be transitioned to operations once further refinement has been undertaken.


15.4 Performance of the Met Office Convective Diagnosis Procedure during the Sydney Forecast Demonstration Project

Rebecca Stretton and Piers Buchanan, Met Office, UK

15.4.1 Introduction

Throughout the Sydney Forecast Demonstration Project (FDP) the Met Office supplied and monitored probabilistic forecasts of severe weather diagnostics of lightning and hail over Australia. The forecasts were produced by the Convective Diagnosis Procedure (CDP) run on the global configuration of the Met Office Global and Regional Ensemble Prediction System (MOGREPS-G).

Following the FDP, objective verification and case studies were carried out to assess the skill of the forecasts and to identify possible improvements.

15.4.2 CDP diagnostic in MOGREPS-G

MOGREPS-G runs at approximately 33km horizontal resolution at mid-latitudes with 70 vertical levels, for 12 ensemble members, out to 7 days. The CDP produces probabilistic risk forecasts for convective hazards such as lightning, hail and tornadoes and the risk of exceeding thresholds of surface-based CAPE (Convective Available Potential Energy), Lifted Index and precipitable water. Figure 14 shows an example of a CDP forecast for lightning and hail across Australia. Currently the CDP runs real-time in experimental mode with the 0Z and 6Z runs of MOGREPS-G for T+24, 36, 42, 48 and 60 hours.

Figure 14 Typical diagnostic plot for lightning and hail for Australia assessed during the FDP. Identical plots focused on the FDP region were also produced.

MOGREPS-G model data is used to generate a lightning index of 0, 1 or 10 for each ensemble member. A probability is then derived from the number of members exceeding a given index value. A lightning index of 10 indicates a deep convectively unstable environment where lightning is possible, an index of 1 indicates a risk of lightning and the default index of 0 shows lightning is unlikely. The index is determined by model values of CAPE, relative humidity, precipitation rate and convective cloud conditions. The hail diagnostic uses a method reported by Hand and Cappelluti (2011) to generate a forecast of hail size for each ensemble member. The probability of hail exceeding a size of 10, 25 or 50mm is calculated by summing the number of members that exceed these values at each grid point. Full details of the ingredients of each diagnostic can be found in Stretton et al (2014).
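The member-counting step described above can be illustrated with a minimal sketch. The member values are hypothetical, and the sketch assumes that a member counts towards the probability when its value reaches or exceeds the threshold; the CDP itself may define exceedance differently.

```python
from typing import Sequence


def exceedance_probability(member_values: Sequence[float], threshold: float) -> float:
    """Fraction of ensemble members whose value reaches or exceeds the threshold."""
    if not member_values:
        raise ValueError("need at least one ensemble member")
    hits = sum(1 for value in member_values if value >= threshold)
    return hits / len(member_values)


# Hypothetical lightning index values (0, 1 or 10) from 12 members at one grid point.
lightning_index = [0, 0, 1, 1, 10, 0, 1, 0, 0, 10, 1, 0]
p_index_1 = exceedance_probability(lightning_index, 1)
p_index_10 = exceedance_probability(lightning_index, 10)

# Hypothetical hail sizes (mm) from the same 12 members.
hail_size_mm = [0, 5, 12, 30, 0, 8, 26, 55, 0, 0, 15, 3]
hail_probs = {t: exceedance_probability(hail_size_mm, t) for t in (10, 25, 50)}
```

With only 12 members the probability can take just 13 distinct values, which is why the number of points on the verification curves is limited.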

Verification of the lightning diagnostic has been done previously in a project for the UK Civil Aviation Authority. Objective site-specific verification of lightning forecasts was produced for the 2013 and 2014 summers at 1116 civil airports across Europe.

15.4.3 Objective lightning verification

The objective verification of the lightning diagnostic focuses on the 06Z model run T+24 forecast of the probability of the lightning index exceeding 1. These forecasts provided the most relevant and recent guidance on the potential of convection over the FDP region. Forecasts are verified against strikes detected by the Global Position and Tracking Systems Pty. Ltd. (GPATS) lightning location network. As in the verification of lightning across Europe, a site-specific approach is taken, using 33 airports located within the FDP region as shown in Figure 15. A lightning event is said to occur at a given site if strikes are detected within a 50km radius during the 6-hour validity period of the forecast (03-09Z).
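The site-specific event definition can be sketched as follows. This is an illustration only: the strike records are hypothetical, the haversine great-circle distance is an assumed choice of distance measure, and the validity window is treated as closed at the start and open at the end.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def lightning_event(site, strikes, start_hour, end_hour, radius_km=50.0):
    """True if any strike lies within radius_km of the site during [start_hour, end_hour)."""
    site_lat, site_lon = site
    return any(
        start_hour <= hour < end_hour
        and great_circle_km(site_lat, site_lon, lat, lon) <= radius_km
        for hour, lat, lon in strikes
    )


# Sydney Airport, with hypothetical strikes given as (hour UTC, latitude, longitude).
sydney_airport = (-33.9465, 151.1731)
strikes = [(4.5, -34.10, 151.30), (11.0, -33.95, 151.17), (5.0, -35.30, 149.10)]
event_observed = lightning_event(sydney_airport, strikes, 3, 9)
```

Only the first strike satisfies both conditions here: the second falls outside the 03-09Z window and the third is well beyond 50 km.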

Figure 15 Map of 33 airports located in the FDP region (red dashed area).

Verification results are assessed by producing a Receiver Operating Characteristic (ROC) curve and reliability diagram. The ROC curve assesses the forecast’s ability to discriminate between a lightning event and a non-lightning event. The reliability diagram indicates how well the predicted probabilities correspond to the observed frequencies.


Figure 16 Receiver Operating Characteristic (ROC) Lightning index >1 for 06Z run at T+24.

The ROC curve in Figure 16 shows the lightning forecast has skill, especially given its resolution and global capability. Due to the small number of ensemble members the range of forecast probabilities is limited; therefore, the number of points along the curve is also limited. The reliability diagram (Figure 17) shows good forecast reliability, with slight under-forecasting at low probabilities (0-30%). This result agrees with the day-to-day monitoring of the forecasts, in which a more reserved area of lightning risk was predicted than was observed. For higher probabilities there was a mixture of over- and under-forecasting, but these account for a very small number of the forecasts.

Figure 17 Reliability diagram Lightning index > 1 for 06Z run at T+24.

15.4.4 Lightning case study

In the afternoon of November 1st, 2014 a significant thunderstorm passed over the Sydney area. There were 3,177 lightning strikes detected within 50km of Sydney airport between 03Z and 09Z. The MOGREPS-G T+24 hour lightning forecast showed no signal over Sydney, just a small risk to the south-east of Canberra (Figure 18a). This lightning diagnostic was produced using the model forecast of instantaneous rainfall rate.

Figure 18 CDP lightning diagnostics using a) instantaneous rainfall rate and b) the maximum rainfall rate for 6Z on 1/11/14 T+24.

Figure 18b shows the lightning diagnostic if the maximum rainfall rate from the previous 3 hours was used instead. This forecast displays a much higher probability of the lightning index exceeding 1 for the Sydney area and beyond. This appears to give a much more reasonable signal for severe convection in the New South Wales area, although more work will be necessary to evaluate whether this adapted CDP diagnostic improves the forecasting of lightning or leads to more false alarms.

This case study also highlights how minimal the difference is between the two lightning indices. In many cases the greater thresholds associated with lightning index 10 do not capture a more concentrated, severe region of lightning potential. As a result, only the probability of lightning index 1 is used for the objective verification.

15.4.5 Conclusions

Overall the objective verification carried out for the MOGREPS-G CDP lightning diagnostic indicates skill and reliability given model resolution, lead time and the rarity of lightning. There is also a great similarity between the results for Europe and New South Wales, Australia (FDP region) which is encouraging for its performance in different areas of the world.

The key findings from studying the CDP throughout the FDP are the lack of difference between lightning index 1 and 10 and the impact of the model parameter used for rainfall rate. Firstly, until more research can be done into suitable thresholds for more severe convective environments, the probability of the lightning index exceeding 1 will be the most useful forecast product. Secondly, the lightning diagnostic has now been modified to be based on maximum rainfall rate instead of instantaneous rainfall rate. Due to the long validity periods this may provide a more accurate picture of the greatest risk of convection throughout the whole forecast period. Further assessment will be done to monitor the impact of this change to the diagnostic. As discussed, the initial case studies show a dramatic increase in forecast probabilities, which may cause a large number of false alarms in the verification results.

Further work includes the set up and assessment of the CDP on MOGREPS-UK for the Met Office Forecasting Experiment in June 2015. Similar to the FDP, forecasters and scientists will collaborate to assess the performance of a wide range of Met Office models and products.

15.4.6 References

Hand, W. H. and G. Cappelluti, 2010: A global hail climatology using the UK Met Office convection diagnosis procedure (CDP) and model analyses. Met. Apps, doi:10.1002/met.236.

Stretton, Buchanan et al. (2014), ‘Probabilistic Global Convective Hazard Forecasts & Verification’. Preprints, 94th American Meteorological Society Annual Meeting, Atlanta GA, Amer. Meteor. Soc. [Available online at https://ams.confex.com/ams/94Annual/webprogram/Paper234216.html]


15.5 NTFGS/CALIBRATED THUNDERSTORM PROBABILITIES: EVALUATION

Harald Richter

15.5.1 Introduction

The Bureau of Meteorology has utilized a NWP-based post-processing algorithm, the National Thunderstorm Forecast Guidance System (NTFGS; Hanstrum 2004) since 2003 in aid of operational thunderstorm forecasting. The NTFGS gathers a range of ingredients conducive to thunderstorms, thresholds them, and defines areas of thunder as those where the NWP model output exceeds the thresholds of all the ingredients concurrently.
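The concurrent-threshold logic can be sketched in a few lines. The ingredient names, values and thresholds below are hypothetical and are not the NTFGS ingredients; the sketch also assumes each ingredient is oriented so that larger values are more favourable for thunderstorms.

```python
def thunder_mask(fields, thresholds):
    """Per-grid-point flag: True only where every ingredient meets its threshold."""
    names = list(thresholds)
    n_points = len(fields[names[0]])
    return [
        all(fields[name][i] >= thresholds[name] for name in names)
        for i in range(n_points)
    ]


# Hypothetical ingredient fields at four grid points (illustrative values only).
fields = {
    "cape_j_per_kg": [50, 800, 1200, 900],
    "precipitable_water_mm": [30, 12, 28, 25],
    "ascent_index": [1.0, 1.0, 0.2, 0.9],
}
thresholds = {"cape_j_per_kg": 500, "precipitable_water_mm": 20, "ascent_index": 0.5}

mask = thunder_mask(fields, thresholds)
```

Each of the first three points fails exactly one ingredient, so only the fourth point is flagged. This illustrates why the approach yields a yes/no area rather than a probability.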

Based on work first published by Bright et al. (2005), a refinement of the NTFGS approach known as “calibrated thunder” employs two additional quality-promoting elements for the prediction of thunderstorms based on NWP output. Instead of using a single deterministic model only, it uses a 5-member lag ensemble constructed from ACCESS-R12, an operational model with reliable data delivery. Also, calibrated thunder calibrates its predictions of lightning against the Global Positioning and Tracking System (GPATS) lightning observations gathered over the previous 30 days. An example is shown in Figure 19.
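The idea of calibrating a lagged-ensemble signal against recent lightning observations can be sketched as below. This is a simplified illustration and not the Bright et al. (2005) scheme: it assumes the predictor is simply the number of lagged members forecasting thunder, and that calibration replaces each count with the observed lightning frequency for that count over the training window. All training data are hypothetical.

```python
from collections import defaultdict


def fit_calibration(training_cases):
    """Map each member count to the observed lightning frequency in the training sample.

    training_cases is an iterable of (members_forecasting_thunder, lightning_observed).
    """
    counts = defaultdict(lambda: [0, 0])  # member count -> [events, cases]
    for member_count, observed in training_cases:
        counts[member_count][0] += int(observed)
        counts[member_count][1] += 1
    return {count: events / cases for count, (events, cases) in counts.items()}


def calibrated_probability(member_count, table, n_members=5):
    """Calibrated probability, falling back to the raw fraction for unseen counts."""
    return table.get(member_count, member_count / n_members)


# Hypothetical training sample standing in for the previous 30 days of cases.
training = (
    [(0, False)] * 18 + [(0, True)] * 2
    + [(3, True)] * 3 + [(3, False)] * 7
    + [(5, True)] * 4 + [(5, False)] * 1
)
table = fit_calibration(training)
p_three_members = calibrated_probability(3, table)
```

A short training window leaves some member counts with few or no cases, which is one way a calibration of this kind can become volatile.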

The Sydney Forecast Demonstration Project sparked the coupling of calibrated thunder to a second NWP model, the ACCESS Rapid Update Cycle (RUC) with a grid spacing of only 1.5 km. This convection-allowing model was run hourly over the eastern third of New South Wales during the period of the FDP, 29 September – 5 December 2014.

This report will briefly show selected verification results that compare the ACCESS-R12 and RUC versions of calibrated thunder as well as the performance of the traditional NTFGS thunderstorm output.


Figure 19 ACCESS-R12 calibrated thunder 15-hour forecast for thunderstorm probabilities during 6-9 UTC 08 November 2014. The forecast probabilities are shaded in purple. The observed GPATS lightning flash density is overlaid in brighter yellow and purple shades.

15.5.2 Reliability of Calibrated Thunder

The reliability of calibrated thunder for the ACCESS-R12 and the RUC versions has been tested on a monthly basis for the 15-hour afternoon forecasts (Figure 20) and the 9-hour night-time forecast (not shown). Making this diurnal distinction in the evaluation is useful as the thunderstorm mode in summer generally changes from surface-based thunderstorms in the afternoon to elevated modes overnight. In Figure 20 the predicted probabilities are binned into 5% intervals centred on 5%, 10%, 15%, … 95%. The lowest probability bin is 0 to 2.5%.
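The binning used for the reliability diagrams can be sketched as follows. The forecast and observation pairs are hypothetical, and the treatment of probabilities above 97.5% (assigned to a bin centred on 100%) is an assumption, since the text does not describe the top bin.

```python
import math

BIN_WIDTH = 0.05  # 5% bins centred on 5%, 10%, ...; the lowest bin covers 0 to 2.5%


def bin_centre(probability):
    """Centre of the bin containing the forecast probability (0.0 for the lowest bin)."""
    index = math.floor(probability / BIN_WIDTH + 0.5)
    return round(index * BIN_WIDTH, 2)


def reliability_table(forecasts, observations):
    """Observed event frequency for each occupied forecast-probability bin."""
    totals = {}
    for probability, observed in zip(forecasts, observations):
        centre = bin_centre(probability)
        events, cases = totals.get(centre, (0, 0))
        totals[centre] = (events + int(observed), cases + 1)
    return {centre: events / cases for centre, (events, cases) in sorted(totals.items())}


# Hypothetical forecast probabilities and matching lightning observations.
forecasts = [0.01, 0.02, 0.04, 0.06, 0.11, 0.12, 0.09, 0.10]
observations = [False, False, False, True, False, True, False, False]
table = reliability_table(forecasts, observations)
```

A forecast system is reliable when the observed frequency in each bin is close to the bin centre; bins containing few cases, typically the highest probabilities, give noisy estimates.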

The ACCESS-R12 system produced very reliable 15-hour forecasts for October 2014. Only the highest forecast probability (60%) was not reliable, possibly due to a small sample size. The RUC-based forecasts show more scatter and, for October, generally under-predicted the thunder probabilities. In November, the RUC-based calibrated thunder system over-predicted thunderstorms, which demonstrates that a system run over a small domain (eastern third of NSW) can lead to volatile calibration outcomes.


Figure 20 Reliability of the thunder probability forecasts based on ACCESS-R12 (top) and RUC (bottom). The observed lightning frequency is taken from GPATS observations (mainly cloud-to-ground strokes) and is applied to a 40 km x 40 km box centred on the observed stroke latitude and longitude.

15.5.3 Relative operating characteristic of Calibrated Thunder

To compare the performance of calibrated thunder with the commensurate performance of the NTFGS, a verification method had to be chosen that allows for the comparison of deterministic and probabilistic approaches at the same time.

Figure 21 shows the Relative Operating Characteristic (ROC) curve for all 15-hour ACCESS-R12 (AR12) based calibrated thunder forecasts with base hour 18 UTC for November 2014. The forecast probabilities were divided into no-event/event categories against the thresholds 0.01%, 1%, 2%, 3%, 5%, 7%, 10% and 20%. A 2x2 contingency table was produced for each of these nine dichotomous categorical thunder forecasts, followed by the computation of a false alarm rate and a hit rate. The ROC diagram then plots the hit rate against the false alarm rate for each 15-hour forecast (with base time 18 UTC), with the thick blue line marking the performance of the averaged forecast. The green squares marking the NTFGS performance from the first lag ensemble member lie mainly to the right of the blue curve, indicating that the majority of NTFGS forecasts performed more poorly than the average calibrated thunder forecast.
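The construction of ROC points from thresholded probabilities can be sketched as below. The forecast and observation pairs are hypothetical, and the sketch takes "false alarm rate" to mean the probability of false detection, false alarms / (false alarms + correct negatives), which is the usual ROC abscissa.

```python
def contingency(forecasts, observations, threshold):
    """2x2 contingency counts for probabilities converted to yes/no at the threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for probability, observed in zip(forecasts, observations):
        predicted = probability >= threshold
        if predicted and observed:
            hits += 1
        elif predicted:
            false_alarms += 1
        elif observed:
            misses += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def roc_points(forecasts, observations, thresholds):
    """List of (threshold, false alarm rate, hit rate), one entry per threshold."""
    points = []
    for threshold in thresholds:
        h, m, f, c = contingency(forecasts, observations, threshold)
        hit_rate = h / (h + m) if (h + m) else 0.0
        false_alarm_rate = f / (f + c) if (f + c) else 0.0
        points.append((threshold, false_alarm_rate, hit_rate))
    return points


# Thresholds listed in the text, expressed as fractions rather than percentages.
thresholds = [0.0001, 0.01, 0.02, 0.03, 0.05, 0.07, 0.10, 0.20]
# Hypothetical forecast probabilities and matching observations.
forecasts = [0.0, 0.005, 0.015, 0.04, 0.08, 0.12, 0.25, 0.30]
observations = [False, False, False, True, False, True, True, True]
points = roc_points(forecasts, observations, thresholds)
```

A deterministic yes/no forecast such as the NTFGS yields a single contingency table and therefore a single point on the same diagram, which is what allows the two approaches to be compared directly.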


Figure 21 Relative Operating Characteristic (ROC) of the November 2014 15-hour thunder probability forecasts based on ACCESS-R12. The green squares show the performance of the traditional NTFGS combined surface-based and elevated thunder diagnostic.

15.5.4 Summary

Both RUC-based and AR12-based calibrated thunder perform better than the traditional NTFGS. The AR12-based calibrated thunder system performs significantly better than the RUC-based version, possibly due to its calibration and evaluation over the whole Australian domain. Also, the AR12 calibrated thunder is reliable for daytime and most night-time forecasts (i.e., the elevated storm mode).

For the AR12-based system, the short (30 day) calibration period leads to some volatility for any national calibration, especially at times when events are sparse (overnight; winter). The short calibration period leads to particularly large volatility for the local calibration, which suggests that any RUC-based calibrated thunder needs a bigger calibration domain or a much longer training period. As a convection-allowing model, the RUC allows the "storm attribute" approach (Sobash et al. 2011) to convective forecasting, which might be a more promising approach than running calibrated thunder over a small computational domain.

15.5.5 References

Hanstrum, B. N., 2004: A National NWP based thunderstorm and severe thunderstorm forecasting guidance system. Preprints, International Conference on Storms, AMOS 11th National Conference 5-9 July 2004, Brisbane, Australia, 31-36.

Bright, D. R., M. S. Wandishin, R. E. Jewell, and S. J. Weiss, 2005: A physically based parameter for lightning prediction and its calibration in ensemble forecasts. Preprints, Conf. on Meteor. Applications of Lightning Data.


Sobash, R. A., J. S. Kain, D. R. Bright, A. R. Dean, M. C. Coniglio, and S. J. Weiss, 2011: Probabilistic Forecast Guidance for Severe Thunderstorms Based on the Identification of Extreme Phenomena in Convection-Allowing Model Forecasts. Wea. and Forecasting, 26, pp. 714-728.


15.6 The Air Quality Sub-Project of the Sydney Forecast Demonstration Project

Beth Ebert1, Martin Cope2, Alan Wain1, Sunhee Lee2, David Smith1

1Bureau of Meteorology 2CSIRO Ocean & Atmosphere Flagship

15.6.1 Introduction

Air pollution from industrial sources, vehicular traffic, and smoke contributes to poor respiratory health for many residents in urban and rural areas of Australia. Forecasting of air quality therefore can potentially contribute to better health outcomes through advance warning of hazardous air pollution conditions. The NSW Office of Environment and Heritage (OEH) and the NSW Rural Fire Service (RFS) are two agencies with a strong interest in the air quality of the Greater Sydney Region through monitoring, prediction, and fire suppression and fuel reduction burning.

Air quality prediction is typically underpinned by NWP and chemical transport modelling of air toxics and aerosols prescribed by anthropogenic and natural (i.e. dust and sea salt) source emission inventories. Smoke from bushfires and fuel reduction burns is an additional (usually transient) source of particulates and can be captured by parameterising smoke emissions with the use of real-time descriptive data such as satellite hot spots and fire agency line scan data.

CAWCR conducted a 3-year $1.6M project, co-funded by CAWCR and the Victorian Department of Environment, Land, Water & Planning, to develop an improved smoke emission and transport forecasting capability. The modelling component is based on the ACCESS atmospheric model and the CSIRO Chemical Transport Model (CTM), effectively rebuilding and improving upon the capability of the former Australian Air Quality Forecast System (AAQFS; Cope et al., 2004). Additionally, the HYSPLIT dispersion model can be run interactively to simulate the dispersion of smoke from planned fuel reduction burns. This project also trials the use of medium range ensembles from the ACCESS global ensemble (ACCESS-GE) to provide useful information on the likely suitability of weather conditions for fuel reduction burning.

The SREP Forecast Demonstration Project offered an opportunity to leverage the Victorian project to develop and demonstrate similar capability in New South Wales, with OEH and RFS as external collaboration and funding partners. Further improvements could be expected using the high resolution (1.5km) ACCESS rapid update cycle (RUC) model which assimilates Doppler radar, and the inclusion of real-time (daily) burnt area updates from RFS to provide more accurate estimates of ambient fire emissions. Additionally, RFS expressed an interest in using the RUC output to drive the Phoenix fire spread model (Tolhurst et al., 2008) to assess whether there might be improvements over the current GFE-based fire spread modelling.

The FDP air quality sub-project demonstrated components of a 3-tiered approach to smoke and air quality prediction (Cope et al., 2014). For one to two-week advance planning for fuel reduction burns, (Tier 1) 10-day ensemble-based forecasts of weather variables, fire danger indices, and their uncertainty were generated from the ACCESS-GE global ensemble. State-of-the-airshed forecasts (Tier 2) used a regional air quality forecasting system based on the CTM run from ACCESS-R and ACCESS-RUC, which provided 3-day and 1-day quantitative predictions of smoke and air chemistry respectively. This provides the background air quality for Tier 3 interactive smoke dispersion simulations using the HYSPLIT model, though this was not demonstrated in the FDP.

All experimental products were updated daily and graphical displays made available through a registered user website, http://reg.bom.gov.au/general/reg/smoke/newsmoke/ (login details available upon request). RUC direct model output was provided daily to RFS for input to the Phoenix fire spread model. Qualitative and quantitative forecast verification was done by CAWCR scientists and RFS/OEH partners through comparison with meteorological observations from the Bureau's automatic weather station (AWS) network and air quality observations from monitoring networks operated by OEH and NSW Environmental Protection Agency (EPA).

The forecasting system evolved during the course of the FDP and afterward, with new products and improved displays developed in consultation with RFS and OEH. Further improvements to the modelling system were made in the post-FDP period. Experimental forecasts continue to be produced each day for evaluation, and re-forecasts for the whole 3-month period are being regenerated using the latest version of the system.

This short report describes the modelling methodology, shows examples of the products developed and demonstrated in this sub-project, gives some preliminary verification results, and concludes with some recommendations for further work.

15.6.2 Methodology

a. Ensemble forecasting of fire weather

The experimental ACCESS global ensemble, ACCESS-GE, has been under development in the Bureau's R&D Branch since 2009. It currently runs once per day at 12UTC, comprising 24 ensemble members and producing forecasts out to 10 days at approximately 60km spatial resolution.

The model output for 2m air temperature, relative humidity, 10m wind speed and direction, precipitation, and cloud cover was used to generate ensemble meteograms of 10-day forecasts at 43 sites in New South Wales (Table 11).

Table 11 Forecast locations for which ensemble forecasts were generated.

Forecast Location      Elevation (m)   Latitude    Longitude
Sydney Airport         6               -33.9465    151.1731
Terry Hills            199             -33.6908    151.2253
Richmond               206.3           -20.7017    143.1167
Mount Boyce            1080            -33.6185    150.2741
Lismore                9.21            -28.8305    153.2601
Grafton                25.09           -29.7583    153.0297
Coffs Harbour          5               -30.3189    153.1162
Taree                  8               -31.8896    152.512
Williamtown            9               -32.7932    151.8359
Cessnock               61              -32.7886    151.3377
Scone                  221.4           -32.0335    150.8264
Moss Vale              678.4           -34.5253    150.4217
Nowra                  109             -34.9469    150.5353
Moruya                 4               -35.9004    150.1437
Bega                   41              -36.6722    149.8191
Cooma                  930             -36.2939    148.9725
Bombala                760.5           -37.0016    149.2336
Goulburn               640             -34.8085    149.7312
Braidwood              665.2           -35.4253    149.7835
Armidale               1079            -30.5273    151.6158
Orange                 947.4           -33.3813    149.127
Mudgee                 471             -32.5628    149.6149
Tenterfield            838             -29.05      152.02
Inverell               664             -29.7752    151.0819
Tamworth               394.9           -31.0742    150.8362
Moree                  213             -29.4898    149.8471
Coonamble              181.3           -30.9776    148.3798
Dubbo                  284             -32.2206    148.5753
Parkes                 323.3           -33.1281    148.2428
Condobolin             192.6           -33.0682    147.2133
Young                  379.6           -34.2493    148.2475
Tumbarumba             645             -35.78      148.01
Wagga Wagga            212             -35.1583    147.4575
Airport                163.5           -36.069     146.9509
Deniliquin             94              -35.5575    144.9458
Hay                    92              -34.5412    144.8345
Griffith               134             -34.2487    146.0695
Wentworth/Balranald    61              -34.64      143.56
Ivanhoe                100             -32.8831    144.3088
Tibooburra             176.4           -29.4448    142.0567
Cobar                  218             -31.5388    145.7964
Bowral                 690             -34.49      150.4
Katoomba               1015            -33.71      150.31

Ensemble output was also used to forecast the McArthur Forest Fire Danger Index (FFDI) and Grass Fire Danger Index (GFDI). The FFDI is represented as an exponential function of temperature, relative humidity and wind speed, multiplied by a power of drought factor with preceding coefficient. Five parameters are required in the formula attributed to Noble (1980):

$$\mathrm{FFDI} = \alpha D^{\beta} \exp\left[\epsilon T - \theta\, RH + \varphi U\right]$$

• D is the drought factor, in the range 0-10
• T is the air temperature in °C
• RH is the relative humidity, as a percentage
• U is the 10-metre wind speed in km/h

with parameters taking the values α = 1.275256303, β = 0.987, ϵ = 0.0338, θ = 0.0345, φ = 0.0234. The formula for GFDI takes a similar structure, the main difference being a square root dependence for both wind speed and relative humidity inside the exponential. Additional parameters involve ‘curing’ and ‘grass fuel load’, in the McArthur Mark 4 formula

$$\log_{10}\mathrm{GFDI} = \alpha_1 + \beta_1 \log_{10} Q - \gamma_1 (100 - C)^{\delta_1} + \epsilon_1 T - \theta_1 \sqrt{RH} + \varphi_1 \sqrt{U}$$

• Q is the fuel loading in tonnes per hectare (range 0-20)
• C is the degree of curing, as a percentage

with parameters α1 = -0.6615, β1 = 1.027, γ1 = 0.004096, δ1 = 1.536, ϵ1 = 0.01201, θ1 = 0.09577, φ1 = 0.2789.

Creating an ensemble forecast of fire danger indices requires blending the wind, temperature, and relative humidity as supplied by ACCESS-GE, with drought factor, fuel loading, and curing, available as forecast daily values in the Australian Digital Forecast Database (ADFD).
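The two index formulas can be evaluated with a short sketch. It uses the parameter values quoted above and assumes that relative humidity enters both indices with a negative coefficient, as in the published McArthur formulations, so that drier air gives a higher index. The function names and input values are illustrative only.

```python
import math


def ffdi(drought_factor, temp_c, rh_pct, wind_kmh):
    """Forest Fire Danger Index in the exponential form attributed to Noble (1980)."""
    return (
        1.275256303
        * drought_factor ** 0.987
        * math.exp(0.0338 * temp_c - 0.0345 * rh_pct + 0.0234 * wind_kmh)
    )


def gfdi(fuel_load_t_per_ha, curing_pct, temp_c, rh_pct, wind_kmh):
    """Grass Fire Danger Index (McArthur Mark 4 form); fuel load must be positive."""
    log10_gfdi = (
        -0.6615
        + 1.027 * math.log10(fuel_load_t_per_ha)
        - 0.004096 * (100.0 - curing_pct) ** 1.536
        + 0.01201 * temp_c
        - 0.09577 * math.sqrt(rh_pct)
        + 0.2789 * math.sqrt(wind_kmh)
    )
    return 10.0 ** log10_gfdi


# Hypothetical hot, dry and windy conditions for one ensemble member.
forest_index = ffdi(drought_factor=10, temp_c=35, rh_pct=15, wind_kmh=30)
grass_fully_cured = gfdi(fuel_load_t_per_ha=4.5, curing_pct=100, temp_c=35, rh_pct=15, wind_kmh=30)
grass_partly_cured = gfdi(fuel_load_t_per_ha=4.5, curing_pct=37, temp_c=35, rh_pct=15, wind_kmh=30)
```

Applying these functions to each ensemble member in turn gives the ensemble of index values. The curing term is strongly non-linear, so a low curing value suppresses the GFDI by orders of magnitude even when the weather inputs are severe.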

The ensemble forecasts were used by RFS to look ahead to identify days and locations with benign or potentially dangerous fire weather conditions. The fire danger indices were displayed both as meteograms and charts of probability of exceeding critical values.
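The statistics behind the meteograms and probability charts can be sketched as follows, assuming a 24-member ensemble at a single site and lead time. The member values are hypothetical, the quartile method is one of several reasonable conventions, and the thresholds are those used in the FFDI probability charts.

```python
import statistics


def box_whisker_summary(members):
    """Minimum, quartiles and maximum of the ensemble, as drawn in a box-whisker plot."""
    q25, median, q75 = statistics.quantiles(members, n=4, method="inclusive")
    return {"min": min(members), "q25": q25, "median": median, "q75": q75, "max": max(members)}


def exceedance_probabilities(members, thresholds):
    """Fraction of members strictly above each threshold."""
    return {t: sum(1 for value in members if value > t) / len(members) for t in thresholds}


# Hypothetical FFDI values from 24 ensemble members at one site and lead time.
ffdi_members = [6, 9, 11, 14, 16, 18, 20, 22, 24, 26, 27, 28,
                30, 31, 32, 33, 35, 36, 38, 40, 42, 45, 51, 58]

summary = box_whisker_summary(ffdi_members)
probabilities = exceedance_probabilities(ffdi_members, (12, 25, 50, 75, 100))
```

Repeating the summary for each lead time gives one box per time step of a meteogram, while repeating the exceedance calculation at every grid point gives a probability chart for a single lead time.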

Figure 22 to Figure 24 show examples of the ensemble products provided to RFS and OEH from the forecast made on 9 November 2014. This case was selected because an actual forest fire event occurred at Warrimoo (approximately 30km from Katoomba) midway through the 10-day forecast period. In Figure 23, the forecast daily temperature maxima steadily climb after day two, as the relative humidity minima decrease, culminating with a median temperature forecast above 35°C on day 5 at T+114. At this time, a relative humidity median forecast of just over 15% provides another ingredient for high fire danger. The wind speed forecast (Figure 22) indicates a rapid increase on day 5, adding another key ingredient for high fire danger on that day. For forest fire danger, this is confirmed by a prominent local spike at T+114, with median value above 30, or “very high”. However, the corresponding grass fire danger undergoes a negligibly small spike which has been attributed to a low value of grass curing (37%).

6-hourly predicted wind direction forecasts, plotted as shown in the bottom row of Figure 22, are also available online.


Figure 22 Ensemble meteograms for FFDI, GFDI, wind speed, and wind direction for Katoomba, NSW, starting on 9 November 2014. The box-whiskers diagrams show the 25-75th percentile range (box), the full range (whiskers), and the median value (red line) and the control forecast (green line).


Figure 23 As in Figure 22 for precipitation, air temperature, relative humidity, and total cloud cover.


Figure 24 Forecast probability of FFDI in the range 0-12 and exceeding thresholds of 12, 25, 50, 75, and 100 at hour 114 (4.75 days) of the forecast.

b. Regional air quality forecasting

The CSIRO Chemical Transport Model (CTM) combines meteorological input from NWP with Australia-wide observations of burning fires, estimates of dust and sea salt particle emissions, and an inventory of anthropogenic emissions for New South Wales to forecast the concentrations of the criteria species PM2.5 (particles with aerodynamic diameter < 2.5 microns), ozone, nitrogen dioxide, and carbon monoxide for periods out to 72 hours. In order to model the criteria species, the CTM treats the transport, chemical transformation, and wet and dry deposition of about 100 gas and aerosol phase species. The aerosol species are divided into PM2.5 and PM2.5–10 size fractions. Over 30 aerosol chemical components are tracked, including elemental carbon, organic carbon (divided into 9 volatility bins), sulphate, nitrate, ammonia, sodium and chloride.

The nested domains used in the regional air quality forecasting system are shown in Figure 25. ACCESS-R meteorological forecasts and emissions derived from ECMWF's MACC system are used to produce continental scale air quality forecasts at 90 km resolution. An NSW domain uses an anthropogenic emissions inventory supplied by OEH (http://tinyurl.com/ljlphof) and an approach based on Meyer et al. (2008) to predict smoke emissions from known fires. The approach uses fire scar time series data (when available) together with a data base of fuel loading and assumptions regarding burning efficiency to determine the biomass burning rate. This is combined with gas and aerosol species emission factors to estimate the flux of selected combustion species for the duration of the fire.
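The chain from burnt area to species flux can be illustrated with a generic bottom-up calculation. This is a sketch of the general form only, not necessarily the formulation of Meyer et al. (2008), and every numerical value below is hypothetical.

```python
def species_emission_kg(burnt_area_ha, fuel_load_t_per_ha, burning_efficiency,
                        emission_factor_g_per_kg):
    """Emitted mass (kg) = area x fuel load x fraction burnt x emission factor."""
    fuel_consumed_kg = burnt_area_ha * fuel_load_t_per_ha * 1000.0 * burning_efficiency
    return fuel_consumed_kg * emission_factor_g_per_kg / 1000.0


def emission_flux_kg_per_s(total_emission_kg, fire_duration_h):
    """Mean emission rate over the duration of the fire."""
    return total_emission_kg / (fire_duration_h * 3600.0)


# Hypothetical fire: 100 ha burnt, 10 t/ha of fuel, half of it consumed,
# and an emission factor of 10 g of the species per kg of fuel burnt.
total_kg = species_emission_kg(100.0, 10.0, 0.5, 10.0)
flux = emission_flux_kg_per_s(total_kg, fire_duration_h=5.0)
```

In this form each fire scar update changes the burnt area term directly, which is why more frequent burnt area updates would be expected to improve the emission estimates.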


The two innermost domains use the city-scale ACCESS model (ACCESS-RUC during the FDP period, ACCESS-C otherwise) to produce air quality forecasts at approximately 3 km and 1 km resolution centred over Sydney. The CTM is initialised using data from an earlier forecast and, due to the sparseness of air pollution observations, does not include a chemical analysis cycle.

Figure 25 Modelling domains for the Chemical Transport Model (CTM) nested inside ACCESS-R and ACCESS-RUC.

The regional air quality forecasting system was implemented on the research supercomputer for testing in CAWCR, and on the operational supercomputer for routine production runs. Daily CTM runs are made using the output from the 00UTC ACCESS runs. Forecasts for NO2, SO2, O3, and CO were available during the FDP period. The forecasts for particles (PM2.5 and PM10), smoke, and Air Quality Index were only implemented post-FDP and so were not available in real time for use or evaluation.

A suite of forecast charts and time series are generated with examples shown in Figure 26, Figure 27 and Figure 28. As noted earlier, these displays evolved during the project as forecasts for additional variables were produced and improved displays were implemented in consultation with OEH and RFS.


Figure 26 Air quality forecast display showing daily maximum value of PM2.5 forecast for 00-23 UTC on 22 April 2015.

Figure 27 23h forecasts for PM2.5 and AQI for forecasts initialised at 00UTC 22 April 2015.


Figure 28 Consecutive 1-day forecasts of carbon monoxide, ozone, nitrogen dioxide and sulphur dioxide at Randwick, NSW, from model runs initialised at 00UTC on 19, 20, 21, and 22 April 2015.

15.6.3 Performance assessment

Preliminary verification of the ensemble forecasts against AWS observations is quite encouraging, particularly given the coarse scale of the forecasts and the fact that no local bias correction or downscaling process was attempted. The ensemble spread and skill comparisons (not shown) indicate that the spread of the ensemble members underestimates the error of the ensemble mean for predictions of surface variables. This is quite common for large-scale ensembles, which struggle to resolve the detailed variability of surface weather.
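The spread and skill comparison can be sketched as below, assuming the common convention of comparing the root-mean-square ensemble spread with the root-mean-square error of the ensemble mean over a set of cases. The data are hypothetical and the ensembles are reduced to three members for brevity.

```python
import math
import statistics


def spread_and_skill(ensemble_forecasts, observations):
    """Return (RMS ensemble spread, RMSE of the ensemble mean) over all cases."""
    variances = []
    squared_errors = []
    for members, observed in zip(ensemble_forecasts, observations):
        ensemble_mean = statistics.fmean(members)
        variances.append(statistics.variance(members))
        squared_errors.append((ensemble_mean - observed) ** 2)
    return math.sqrt(statistics.fmean(variances)), math.sqrt(statistics.fmean(squared_errors))


# Hypothetical 2 m temperature forecasts (three members per case) and observations.
cases = [[20, 21, 22], [15, 16, 17], [30, 31, 32]]
observed = [24, 13, 34]

spread, rmse = spread_and_skill(cases, observed)
underdispersive = spread < rmse
```

In a well-calibrated ensemble the two quantities are of similar size; a spread well below the error, as in this example, is the underdispersive behaviour described above.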

To assess bias, the forecast ensemble mean values were compared to the observations using scatter plots and time series of the diurnal cycle. Figure 29 shows the forecast (ensemble mean) and observed values of FFDI averaged for the 43 sites in NSW during December 2014-February 2015. There is excellent correspondence between the forecast and observed values, although there is a small positive bias in daytime values that increases with lead time. Forecast GFDI values during this same period appear to have been underestimated by 25-30%.


[Figure 29 plot: observed and forecast FFDI (y-axis, 0-14) against forecast lead time (x-axis, 0-240 h).]

Figure 29 Mean forecast and observed diurnal cycle of FFDI averaged over 43 sites in NSW during Dec2014-Feb2015.

The reliability of the ensemble-based probability forecasts is quite encouraging. Results are shown in Table 12 for probability forecasts of daytime FFDI exceeding 25. Perfect reliability would have observed frequencies equal to the forecast probability at all lead times. Although the higher values of the forecast probabilities are overestimated (overconfident) compared to the observed frequencies, the differences are not enormous, and the performance does not deteriorate very much with increasing lead time.

Table 12 Reliability of ensemble-based forecasts for probability of daytime FFDI exceeding a value of 25 for 43 sites in NSW during Dec2014-Feb2015.

Forecast        Observed frequency
probability     Day 2    Day 4    Day 6    Day 8    Day 10
0.00            0.03     0.02     0.02     0.02     0.02
0.20            0.32     0.19     0.18     0.22     0.24
0.40            0.40     0.44     0.37     0.41     0.40
0.60            0.60     0.45     0.44     0.47     0.46
0.80            0.65     0.71     0.65     0.60     0.50
1.00            0.82     0.81     0.76     0.64     0.70

Subjective evaluation of the forecasts from users in RFS and OEH indicated that the ensemble predictions were very useful for characterising the fire weather conditions in coming days. Further quantitative verification of the ensemble forecasts, including examining how forecast performance varies by variable and by season, is in progress.

The air quality system was first tested on the State Mine bushfire case from October 2013, with good results. Figure 30 shows the model performance for scenarios where ambient fires are included in or excluded from the simulation. The smoke is predicted to make contributions to all of the species including ozone, which is a secondary pollutant formed from the reaction of oxides of nitrogen and volatile organic compounds emitted from both anthropogenic sources and burning vegetation. These results provide confidence that the modelling system was working correctly.


[Figure 30: three time-series panels for October 2013 (x-axis 10 to 24 October). Panels: nitrogen dioxide (concentration, ppb), PM2.5 (concentration, µg m−3) and ozone (concentration, ppb). Each panel shows the Observed, Anthropogenic and Urban+RFS series.]

Figure 30 Observed and modelled 1-h concentration time series for nitrogen dioxide (top), PM2.5 (middle), ozone (bottom) for the period 10–25 October 2013 for Chullora air monitoring station in Sydney. Two model simulations are shown: ‘Anthropogenic’ excludes all ambient fire emissions from the model, ‘Urban+RFS’ includes all fires in the simulation.

The primary evaluation of the air quality forecasts is through time series and statistical comparison of forecast and observed concentrations of species at 17 OEH/EPA monitoring sites in the Sydney, Hunter and Illawarra regions. A recent example of the time series comparison that is made available to air quality forecast users each week is shown in Figure 31; it shows good correspondence between the forecast and observed ozone concentrations at Liverpool.


Figure 31 Consecutive 1-day forecasts (red) and observations (blue) of ozone concentrations at Liverpool, NSW, during the week beginning 6 April 2015.

Further verification of the air quality forecasts will include evaluation of categorical forecasts for smoke events, statistical error quantification, comparisons of model and satellite smoke plume locations (where satellite data are available), and sensitivity to plume rise.

RFS ran three fire modelling cases using ACCESS RUC data to provide meteorological input to the Phoenix fire spread model. Matthews and Louis (2015) evaluated both the meteorological and fire spread predictions from these cases. They found that for one of the cases the RUC data provided better information than the official forecasts from the 6 km ADFD; for another the two sources resulted in roughly similar meteorological and fire forecasts; and for a third case the ADFD gave better predictions. Many more cases will need to be run and evaluated to establish whether the ACCESS RUC model provides additional value over the official forecasts.

15.6.4 Preliminary conclusions

The FDP air quality sub-project provided an opportunity to develop and demonstrate medium-range ensemble fire weather forecasts and short-range regional air quality forecasts based on the ACCESS model and the Chemical Transport Model. These form two tiers of a three-tiered smoke prediction system under development in CAWCR. Our collaboration partners in RFS and OEH have been quite positive about the usefulness of the system for their decision making, and they are particularly eager to make use of ensemble approaches which will increasingly be pursued as the preferred approach for informing risk-based decisions.

Although the performance evaluation has not yet been completed, preliminary evaluation suggests that the ensemble and air quality forecasts are performing well. Further improvements have been identified. For example, site-specific bias correction of the ensemble forecasts, perhaps using the most recent 30-day period, would certainly improve their accuracy. The air quality forecasts could be improved through the use of updated emissions inventories and refining the assumptions about the fire burn periods. Experiments show that the smoke dispersion is sensitive to the plume rise, so a good strategy may be to run ensemble predictions with different plumes.
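The site-specific bias correction suggested above could take many forms. The sketch below assumes the simplest one, an additive correction that subtracts the mean forecast error at a site over the most recent 30 days. The function name, the additive form and the sample values are assumptions for illustration only, not a method specified in this report.

```python
def bias_corrected(forecast, past_forecasts, past_observations, window=30):
    """Subtract the mean forecast error over the most recent `window` days.

    past_forecasts and past_observations are matching daily values for one
    site, ordered from oldest to newest.
    """
    pairs = list(zip(past_forecasts, past_observations))[-window:]
    if not pairs:
        return forecast  # no history yet: leave the forecast unchanged
    mean_error = sum(f - o for f, o in pairs) / len(pairs)
    return forecast - mean_error

# Hypothetical site history: forecasts that run 2 FFDI units too high.
past_f = [12.0, 20.0, 31.0, 18.0]
past_o = [10.0, 18.0, 29.0, 16.0]
corrected = bias_corrected(25.0, past_f, past_o)
```

A multiplicative or quantile-based correction might suit a bounded, skewed index such as FFDI better; the additive form is used here only because it is the easiest to read.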

Feedback from forecast users in OEH and RFS has been very helpful in refining the products and their display. However, by not having all the air quality products up and running in time for the FDP period, we missed the opportunity to receive real-time feedback from users (especially at OEH) on forecast performance during the period when RUC-based forecasts were available. We must therefore rely more heavily on air quality hindcasts and post-real time evaluation of their performance.

The ensemble fire weather predictions have been extended to sites in Victoria and Tasmania, though forecast evaluation for those locations has not yet started. The forecasts continue to be produced on a "best effort" basis in order to collect more samples for evaluation and further refine the products and displays in anticipation of future operational implementation.

Similarly, the regional air quality forecasts are also running routinely on a best effort basis using the operational ACCESS-C forecasts to provide input for the CTM. This air quality forecast system has potential for use in operational air quality prediction as well as further business development. Discussions with the Bureau's Environmental Information Branch and Commercial Weather Services have been initiated.

On the research front, we wish to strengthen links with other smoke and air quality researchers at UNSW and other universities including in the new Clean Air and Urban Landscapes hub of the National Environmental Science Program. The high temporal and spectral resolution of the Himawari-8 geostationary satellite offers exciting potential for smoke detection to use in verifying smoke forecasts and as input for inverse modelling techniques, which could better define the initial smoke plume characteristics and thus improve the forecasts.

15.6.5 Acknowledgements

We wish to thank the NSW Rural Fire Service and Office of Environment and Heritage for funding support, provision of fire data, emissions inventories, air quality and meteorological monitoring data, and many useful exchanges during this project. Mohammed Nabi and Simon Louis provided useful feedback on the prototype forecast products. Michael Foley assisted with testing the fire weather calculations.

15.6.6 References

Cope, M.E., Lee, S., Manins, P.C., Puri, K., Azzi, M., Carras, J., Lilley, W., Hess, G.D., Tory, K., Young, M., Ng, L., Wong, N., Walsh, S., Nelson, P., 2004: The Australian Air Quality Forecasting System. Part I: Project Description and Early Outcomes. J. Appl. Meteorol., 43, 649-662.

Cope, M., and co-authors, 2014: A three-tiered smoke forecasting system for managing air pollution from planned burns. AFAC14, Wellington, NZ, 2-5 September 2014.

Matthews, S. and Louis, S., 2015: Trial of Rapid Update Cycle numerical weather data with Phoenix fire spread predictions. Preliminary report, RFS.

Myer, C.P., Luhar, A.K., and Mitchell, R.M., 2008: Biomass burning emissions over northern Australia constrained by aerosol measurements: I—Modelling the distribution of hourly emissions. Atmospheric Environment, 42, 1629-1646.

Noble, I.R., Barry, G.A.V. and Gill, A.M., 1980: McArthur's fire-danger meters expressed as equations. Australian Journal of Ecology, 5, 201-203.

Tolhurst, K., Shields, B. and Chong, D., 2008: Phoenix: Development and application of a bushfire risk management tool. Australian Journal of Emergency Management, 23, 47-54.

15.7 Radar wind assimilation and data quality

Susan Rennie, Peter Steinle, Alan Seed, Mark Curtis, Yi Xiao

During the Sydney Forecast Demonstration Project (FDP), Doppler radar velocity observations were assimilated into the high-resolution ACCESS-Sydney FDP domain. This was the first time that radar wind observations were assimilated in a pseudo-operational system. Nine radars provided coverage to the full domain, with six situated in the inner 1.5 km resolution region (Figure 32).

In order to assimilate radar observations, an automated method of selecting valid observations from the radar scan was required. Radar scans contain echoes from many sources including precipitation, insects and other aerofauna, as well as ground and sea surfaces. Only echoes from which a good estimate of the wind can be obtained should be assimilated. In the lead-up to the FDP, a classification algorithm was developed which used several filters and a Bayesian classifier to identify different echo types. An example of the output is shown in Figure 33. The output of this particle identification algorithm was also provided to forecasters via the FDP version of 3D-Rapic, along with a ‘clean’ reflectivity product containing only precipitation echoes.

Figure 32 FDP domain and location of radars. Dashed box is the 1.5 km fixed resolution domain. Dashed circles show assimilated area of coverage.

Figure 33 Example of radar classification, showing precipitation near the Kurnell radar.

Some parts of this report are © Copyright 2018 AMS. Rennie, S., Steinle, P., Seed, A., Curtis, M., and Xiao, Y. (2018) Assessment of Doppler Radar Radial Wind Observation Quality from Different Echo Sources for Assimilation during the Sydney 2014 Forecast Demonstration Project. Journal of Atmospheric and Oceanic Technology, 35, 1605-1620, https://doi.org/10.1175/JTECH-D-17-0183.1

The particle identification was used to extract radar wind observations from precipitation echoes and insect echoes. Under conditions where bias from insect flight is not substantially greater than the observation error, the insect echoes may also provide wind information. The assimilation of this ‘clear air’ echo during the FDP was the first extended test with the latest quality control algorithm in place. During assimilation, the particle identification information allowed precipitation and clear air observations to be treated separately. Analysis and observation monitoring were also handled separately for these echo types.

Observation monitoring was achieved by comparison of the observations with model-equivalent values. A detailed analysis is published in Rennie et al. (2018). The speed bias was estimated for each radar scan assimilated. The average bias of all scans at each assimilated elevation is shown in Figure 34. The bias (model minus observation) was often positive when surface clutter was present, as this decreased the observed velocity. In Figure 34, the elevations with recurring ground clutter have the largest bias. This indicates where effort in improving the quality control should be directed.

[Figure 34 annotations: "Clutter not filtered on-site and not classified correctly"; "Not enough data because only part of scan in domain".]

Figure 34 Mean speed bias (model – observation) for each radar. Data points left to right are for 0.9°, 2.4°, 4.2° and 5.6° elevation scans. Dashed lines are standard deviations. Error bars are the standard error of the means. Results are for precipitation (prp) and insects (clear air: ca).
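The summary statistics plotted in Figure 34 (mean, standard deviation and standard error of the mean of the per-scan bias at each elevation) can be sketched as follows. The function name, the data layout, the sample values and the assumed units of m/s are illustrative only; the monitoring code used during the FDP is not reproduced here.

```python
import math
from statistics import mean, stdev

def bias_summary(scan_biases):
    """Summarise per-scan speed biases (model minus observation).

    Units are assumed to be m/s. Returns the mean, the sample standard
    deviation, the standard error of the mean and the number of scans.
    """
    n = len(scan_biases)
    m = mean(scan_biases)
    sd = stdev(scan_biases) if n > 1 else 0.0
    return {"mean": m, "std": sd, "stderr": sd / math.sqrt(n), "n": n}

# Hypothetical per-scan biases for one radar, keyed by elevation in degrees.
biases = {0.9: [1.0, 2.0, 3.0, 2.0], 2.4: [0.5, -0.5, 0.0, 0.0]}
summary = {elev: bias_summary(values) for elev, values in biases.items()}
```

In this made-up example the 0.9° scan has a positive mean bias, the pattern the text associates with recurring ground clutter, while the 2.4° scan is unbiased on average.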

The radar observations undergo additional quality control before assimilation, which removes suspect pixels and spatially averages the observations. The average observation minus background (O−B) of these processed observations, also known as the innovations, indicates the quality of values actually assimilated. The mean O−B for all scans is shown in Figure 35. From this result it is apparent that there is not a substantial difference between the quality of precipitation and clear air observations using the current quality control algorithms.

Figure 35 The temporal and spatial mean innovations for each scan elevation (lines) and mean across all elevations (circles) as well as the standard deviations (dashed lines/open circles).

The impact of assimilating radial velocities has not been tested with a controlled experiment because the processing and assimilation of radial velocity is still undergoing development. The quality control requirements, particularly the removal of permanent clutter echo, were not adequately met until shortly before the FDP. During the FDP, various case studies demonstrated the limitations of Doppler wind assimilation. Radars only measure near-surface winds very close to the radar, so cannot comprehensively observe low-level wind changes, including sea breezes and southerly changes. However, there were cases where wind changes that were visible in the raw radar data were removed during one of the quality control stages. Retention of wind information could be improved with more advanced quality control using dual polarisation, which would also allow the lowest elevation scan to be used (currently surface clutter is not sufficiently removed in that scan). Additionally, the wind analysis increments showed that wind information is spread far across the domain, which means that small-scale wind information is not retained at high resolution. Improvements to the covariances that control the spread of information during assimilation will probably be made for observations in high-resolution assimilation before controlled observation impact experiments are run.

Reference

Rennie, S., Steinle, P., Seed, A., Curtis, M., and Xiao, Y. (2018) Assessment of Doppler Radar Radial Wind Observation Quality from Different Echo Sources for Assimilation during the Sydney 2014 Forecast Demonstration Project. Journal of Atmospheric and Oceanic Technology, 35, 1605-1620, https://doi.org/10.1175/JTECH-D-17-0183.1


15.8 Aviation Evaluation

15.8.1 Aviation Objectives and Aims

The Aviation Sub-project contributed to the FDP through demonstration of the applicability and value of SREP science to meet the needs of the aviation industry. Specifically, the demonstration sought to show how FDP outputs could be applied to the optimisation of airport and airline operations at Sydney Airport.

The aims of the Aviation FDP Sub-project were to:

1) Demonstrate FDP tools and outputs to aviation forecasters and stakeholders.

2) Evaluate FDP tools for enhanced thunderstorm prediction in the terminal area (TMA).

3) Evaluate FDP tools and outputs for accurate forecast of wind affecting runway changes.

4) Evaluate FDP tools for accuracy in predicting parameters that relate to alternate and landing minima at Sydney Airport.

15.8.2 Operations and Demonstrations

Aviation forecasters from the Sydney Airport Meteorological Unit (SAMU) participated in the SREP trial over a 3-week period. They produced TAFs for Sydney, Bankstown, Camden and Badgerys Creek, and compared their forecasts with the on-shift forecaster TAFs at SAMU after the event. This report contains the key outcomes of the trial from an aviation forecaster perspective. It was very beneficial to have an aviation forecaster involved in this evaluation process because the detail required of aviation forecasts helped identify some areas of development that need to be addressed prior to operational release.

A demonstration of the SREP-FDP was provided to aviation industry stakeholders on Monday 24th November 2014 at the Major Airports Verification Workshop held in the Bureau of Meteorology Sydney Regional Office. A presentation on the SREP-FDP project and the results of aviation forecaster analyses was given at the demonstration and at the Major Airports Consultative Meeting on 18th February. In addition, a practical demonstration of the system's capabilities for weather forecasting was given in real time on 24th November.

15.8.3 Evaluation of FDP Tools for Enhanced Thunderstorm Prediction in the TMA

The increase in computing power in the last five years has led to increased resolution in computer modelling. To date, model grid points have been positioned at 12.5 km spacing. The ability to run an operational model at 1.5 km resolution is new to Australia and requires significant computing power. This computing power will be routinely available on an operational basis when the Bureau’s new supercomputer comes online in 2016.


The ability of a model to resolve detail to a 1.5 km resolution provides meteorologists with improved modelling of meso-scale weather features. Current operational forecasting models in Australia focus on environmental proxies for thunderstorms. The high resolution associated with the RUC allows the model to compute thunderstorm parameters internal to the storm cell rather than proxy parameters for cells within the larger grid squares. Successive model runs at 1 hour spacing when compared together can provide probability parameters that can be displayed as probabilistic forecasts for the likelihood of thunderstorms.
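One simple way to turn successive hourly runs into a probability, in the spirit of the paragraph above, is a time-lagged ensemble: count the fraction of recent runs in which a storm diagnostic exceeds a threshold at a given place and valid time. The sketch below assumes that approach. The function name, the diagnostic values and the threshold are hypothetical, and the calibrated NTFGS product discussed in this section may be derived quite differently.

```python
def lagged_probability(runs, threshold):
    """Fraction of successive runs exceeding `threshold`.

    runs: values of a storm diagnostic from several hourly runs, all valid
    at the same time and the same grid point.
    """
    if not runs:
        raise ValueError("need at least one model run")
    return sum(value > threshold for value in runs) / len(runs)

# Hypothetical diagnostic values from five consecutive hourly runs.
values = [0.0, 12.5, 30.0, 41.0, 8.0]
prob = lagged_probability(values, threshold=10.0)
```

Here three of the five runs exceed the threshold, giving a raw probability of 0.6. Raw frequencies of this kind generally need calibration against observations before being issued as probabilistic forecasts.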

FDP forecasters observed that:

• National Thunderstorm Forecasting Guidance System (NTFGS) using the RUC to produce a calibrated thunderstorm parameter was able to predict areas of convection that were not predicted well by other models.

• There were large areas of false alarms in the NTFGS RUC output.

• Thunderstorm probabilities were estimated to be 10–20% too high.

• The RUC model seemed to perform better with convection in the absence of upper level dynamical forcing; it predicted initiation along local boundaries, outflows, and convergence between north-easterlies near the coast and north-westerlies inland.

• The model appeared to excel in the timing of onset and clearance of thunderstorms from within 45 nm radius of Sydney airport.

• The model was approximately an hour early in clearing thunderstorms from within 5 nm of Sydney airport, though this is a significant improvement on current operational models.

• The RUC 10-min output (as opposed to hourly output) is best used once an event has started to occur, particularly when used in conjunction with the radar and observational data. It assisted with understanding of the mode of convection, possible triggers, and the area and severity of thunderstorms expected.

• It is assessed that this product has good potential to contribute to TMA forecasts and convective forecasts for the airspace between major airports within the model domain. There is good potential for reducing the size of PROB30 thunderstorm brackets in aerodrome TAFs within the domain of the model (up to 4 hours clearance time improvement on 2 occasions during the trial). This needs to be examined further prior to operational implementation.

15.8.4 FDP tools and outputs for wind affecting runway changes

The ability of the RUC to resolve features to 1.5 km resolution has led forecasters to investigate the usefulness of the model for significant wind changes. In the trial, wind changes which could result in a runway change at Sydney airport were identified as:


• Sea breeze

• Southerly change

• Katabatic wind

• Upper winds

The ability of the model to provide 10-minute time steps at high resolution is a significant improvement on current models which could lead to better forecast skill.

FDP forecasters observed that:

• The RUC modelled the timing of the sea breeze wind shift well in coastal locations.

• The RUC was able to propagate a sea breeze (without synoptic reinforcement) inland to areas which are not modelled well by current models; however, the model timing of the boundary was often incorrect.

• The RUC was good at forecasting the timing, strength and propagation of synoptically reinforced sea breezes across the Sydney Basin.

• Generally, successive model runs appeared to better forecast the timing of a sea breeze change.

• The changes in boundary conditions within the RUC model every 6 h (04 Z, 10 Z, 16 Z, 20 Z) made the most impact on correctly forecasting the timing of sea breeze wind changes.

• RUC is not accurate in predicting the timing of southerly changes. At this stage forecasters would have little confidence in this aspect of the model.

• Forecasters are currently better able to time the progression of a southerly change using climatology and linear now-casting techniques.

• There were mixed results with the inland propagation of southerly wind changes.

• RUC seems to accurately predict north-westerly katabatic winds in the Sydney basin.

• The high-resolution RUC model was using topographical effects in its mesoscale calculations of wind around Mt Victoria.

• The RUC model upper winds appear to be slightly strong (~5 kts) on average.

• The RUC model appeared to be more accurate in predicting the timing of the start and cessation of strong and gale force winds in the upper levels (925/850 hPa).

There seems to be good potential for future development of the RUC to better resolve wind changes. Whilst the RUC model showed some advantages, such as in forecasting sea breezes and katabatic winds, its inability to predict southerly changes well is of concern and needs to be rectified before any reliable wind change prediction benefits can be realised.


15.8.5 FDP Tools for accuracy in predicting alternate and landing minima

During the trial forecasters were able to examine the model and provide accurate forecasts for:

• Precipitation

• Fog and low cloud

• Moisture

Preliminary investigations suggest that:

• Long range precipitation fields (T+6 h) verified well with radar in terms of convective type and general intensity. The model was not able to forecast specific point initiation zones of un-organised convective activity.

• The RUC model provides an expected overview for the day with regards to rainfall type (convective vs non-convective), possible intensity and movement.

• The precipitation fields gave an indication of the severity and movement of convective cells as the model translated the cells using forecast winds. However, at T+3 h each RUC storm cell lasted longer on the model than observed on radar.

• The ability of the model to predict the specific location of heavy falls after storm initiation was very accurate in the first three hours but diminished beyond 6hrs (T+3 h).

• The RUC 10-min displays are best used for time frames less than T+3 h, particularly when used in conjunction with the radar and observational data. However, only a few scenarios were investigated and further validation is needed.

• RUC appears to be able to predict the difference between fog and low cloud well, but more forecaster assessment is required in winter.

• RUC also appears to be able to distinguish mist from fog to some extent.

• The RUC model is not currently able to produce significant surface inversion in the lowest levels.

• Generally, the RUC predicts air temperature profiles well.

• Forecast temperatures and dewpoints from the RUC in the western Sydney Basin are consistently too high, particularly in the early hours of the morning.

15.8.6 Conclusions and further study

The assessments of forecasters during the trial indicate that further research is needed in the following areas:


• Thunderstorms: Further investigation of how the model differentiates heat-based convection versus synoptically forced convection. Thunderstorm diagnostics need further investigation.

• Sea breeze: Need to validate the propagation of sea breezes inland. Frictional factors in the model appear to be too low.

• Strong low-level coastal wind changes: Need to investigate further the parameters in the model which are causing timing issues on major wind changes in the low levels, such as southerly changes.

• Katabatic winds: The RUC appears to handle katabatic winds well but further investigation in different conditions would be recommended.

• Upper Winds: The RUC appears to handle strengthening upper westerly winds (>925hPa) well but more evidence of how the model behaves in westerly wind conditions is required.

• Precipitation: There were mixed results using the precipitation fields of the model and further investigation is needed for different scenarios.

• Fog forecasting: Forecasters need to verify the capability further in a winter fog season. Initial results show promise.

• Low level temperatures, dewpoint temperatures and surface inversions: A correction to the surface temperature and dewpoint parameters may lead to improvements in inversions; however, an excess of moisture in the atmospheric profile needs to be addressed.

The progress of further work funded by aviation would need to be properly scoped and presented to the industry prior to commitment of aviation resources to work associated with the project.

15.9 Subjective Evaluation

Aurora Bell, Weather and Ocean Services Branch

15.9.1 Purpose, Methods, and Data of Subjective Evaluation in FDP- Sydney2014

The purpose of subjective evaluation was to test the capability of the Convection Scale Model (CSM) with Rapid Update Cycle (RUC) to assist the forecasters in performing their job, with a focus on the capability of the RUC to support the forecaster in gathering the mesoscale “feature of the day”.

The final aim was to assess the possibility to improve actual services and to develop new ones.


Subjective evaluation is a qualitative analysis of qualitative and quantitative data (Figure 36). In our case, it was about how experts feel about the CSM RUC performance.

Figure 36. Analysis and data types. Adapted from Bernard, Ryan and Wutich, Analysing Qualitative Data, Sage Publications, 2010.

The data analysed were texts and meteorological information written and captured by the forecasters in blogs and case studies, and transcriptions of interviews and mid-day discussions.

Forecasters were asked to comment in a blog and to fill in forms (see Appendix on Operations) about their experience with the RUC. Blogging was performed at the end of the day and took on average one hour per day. The blog (see Appendix on Blog) was a very successful tool for collecting feedback, as the forecasters felt they had more freedom in what to comment on. We analysed how the RUC would affect the way forecasters understand the evolution of the meteorological situation at Day Zero (+12 hours). The forecasters had to analyse in detail the 10-minute outputs from at least two consecutive runs of the RUC. The analysis time was around 30 minutes per run at the beginning of the experiment and decreased towards 10 minutes as forecasters became more used to it. After the analysis they had to produce a “Convective Outlook” based on the RUC. They also had to identify “features” that might have a role in convection in that particular situation. The subjective evaluation highlighted that the detailed analysis enhanced the forecasters' situational awareness for very short-period and local weather events and allowed for greater intervention of the forecaster in the “Convective Outlook” product and in the “nowcasting process”.

During shifts with no severe weather, forecasters wrote case studies of some challenging situations. On average a case study was written in less than a day.

The blog, the case studies, the mid-day discussions and interviews with the forecasters were analysed using the Cognitive Task Analysis (CTA) method (Klein Associates, 2003). CTA is the study of what people know, how they think, how they organize and structure information, and how they learn when pursuing an outcome they are trying to achieve. Forecasters were interviewed using the Critical Event and Successful Event methods, with open questions. The responses have been grouped into the following topics: strengths and limitations of RUC, training needs in CSM-RUC, training needs in mesoscale knowledge, gaps in interfaces usability, procedures and operational arrangements, benefits and challenges of FDP, feature of the day and expert cues.

The aim of this process was to highlight the strengths and limitations of the RUC in providing guidance in a given meteorological situation, and to observe whether those strengths and limitations depend on the scale of the trigger (synoptic, mesoscale or storm scale) during the different stages of the Severe Weather Workflow, e.g. during the Analysis and Diagnostics stage, while preparing the Very Short-Range Forecasts, and while preparing the Warning for severe weather.

Examples of data used for subjective evaluation are provided below, from the blog:

29 September 2014

An example of the extra detail that the high resolution provides is the 29 September forecast for daily maximum temperature in Sydney, where the RUC forecast that the sea breeze would not keep the temperatures low (see Figure 37 and Figure 38). The ACCESS R12 forecast was 28 °C, RUC 32 °C, observed 32.9 °C, which is near the record for September.

Figure 37 The surface temperatures from ACCESS R12 for the Sydney area for 29 September 2014.


Figure 38 The surface temperatures from the RUC for the Sydney area for 29 September 2014.

The main interest was in the sea breeze in Sydney and the resulting maximum temperature. The Sydney record for September is 34.6 °C. The increased resolution of the RUC guidance can now resolve large inland water bodies. The dewpoints match the Warragamba Dam, Sydney's main water source.

In the table below is an example of data representation for the day, analysing how forecasters look at data, identifying cognitive themes and trends, identifying phases (which cues the forecaster noticed; how those cues were assessed; how the forecaster generated projections of how the situation might unfold; information he/she sought out; strategies for dealing with conflicting information).


Table 13 Example of data representation for the day of 29th of September 2014.

11 October 2014

The RUC did really well on this day, nailing the showers and storms on the ranges in NE NSW, as well as the early start times of the precipitation compared to other models. Also, correctly forecast was the inland movement of the precipitation during the day, as the coast became stable thanks to a strong sea breeze; ECMWF and Access-R didn’t really have this. TIFS was producing cells based on clutter. This clutter was also affecting the rain field products. TITAN "likes" the clutter breakthrough and populates it heavily. Another point of interest for overnight was the offshore convergence line that developed around 19–20Z, peaking around 22–23Z off the NE NSW coast; no other model has this. Looping the RUC 10-minute rainfall fields was helpful in developing and confirming a conceptual model for TS.

13 October 2014: A significant day for convection.

RUC had some wins and losses today. It nailed the showers and thunderstorms in N NSW, both the morning line of storms and the storms developing in the Mid north coast ahead of this morning line. The dryline exploding over the inland was well forecast, except for the fact that it was 2 hours too late on initiation of convection on the dryline. Looking at the RUC 10-minute precipitation fields, it nailed the mid north coast convection very well, but failed on timing and location of the storms developing on the dry line by about 2 hours over the inland regions.

14 October 2014

Synopsis:

Another busy day with the low developing off the south NSW coast, deepening and moving north during the day. Storms developed in NE NSW ahead of the trough which was connected to the low and quickly moved offshore around lunch time. Shear line/trough/strong southerly change moving up the coast, very heavy rain with over 100 mm falling and wind gusts over 100 km h−1. Three possible modes of convection today: (1) surface-driven storms over the Mid North Coast and Northern Rivers associated with instability and surface W/NE wind convergence about the eastern slopes; (2) cold pool convection during the afternoon over central eastern parts of the state (mainly ranges); and (3) maritime convection associated with the small east coast low (mostly about the shear zone) may brush the coastal fringe, more likely in the evening.

Observations for the 14th of October, notice the wide spread through the day:

Temperature at Observatory Hill seemed to be well forecast today by the RUC. RFC went for 19 °C, Access went for 16 °C, RUC went for 18 °C. It reached 17.5 °C.

WIND

Highest gusts in the Sydney area Tuesday night/Wednesday morning:

161 km h−1 (87 kts) Wattamolla at 9:02 pm
115 km h−1 (62 kts) Kurnell at 8:52 pm
107 km h−1 (58 kts) Sydney Airport at 10:45 pm, and still 96 km h−1 around 4 am
104 km h−1 (56 kts) Bellambi at 16:56
100 km h−1 (54 kts) Sydney Harbour at 2:04 am
102 km h−1 Norah Head at 4:20 am Wednesday

Elsewhere on Tuesday:
• 111 km h−1 (60 kts) Montague Island at 11:45 am
• 106 km h−1 (57 kts) Ulladulla at 10:28 am
• 104 km h−1 (56 kts) Kiama at 6:20 pm
• 91 km h−1 (49 kts) Wollongong at 6:34 pm, Moss Vale at 7:04 pm
• 82 km h−1 (44 kts) Point Perpendicular at 4:46 pm
• 80 km h−1 (43 kts) Cabramurra at 11:13 am
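The gust observations are quoted in both km h−1 and knots. As a quick consistency check, the following minimal sketch converts the km h−1 values to knots using the exact definition 1 kt = 1.852 km h−1 and compares them with the knot values quoted in the lists. The function name `kmh_to_knots` is illustrative and not part of any Bureau system.

```python
KMH_PER_KNOT = 1.852  # exact: one international knot is 1.852 km/h


def kmh_to_knots(speed_kmh: float) -> int:
    """Convert a speed in km/h to knots, rounded to the nearest whole knot."""
    return round(speed_kmh / KMH_PER_KNOT)


# Gusts in km/h paired with the knot values quoted in the observation lists.
reported = {
    161: 87, 115: 62, 107: 58, 104: 56, 100: 54,  # Sydney area
    111: 60, 106: 57, 91: 49, 82: 44, 80: 43,     # elsewhere on Tuesday
}

for kmh, kts in reported.items():
    status = "consistent" if kmh_to_knots(kmh) == kts else "check"
    print(f"{kmh} km/h = {kmh_to_knots(kmh)} kts (quoted {kts} kts): {status}")
```

All ten quoted pairs agree to the nearest knot under this conversion.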

RAINFALL to 9 am on Wednesday:

In the Sydney area, highest falls since 9 am Tuesday:
• 143 mm at Sans Souci (highest October daily rainfall on record)
• 142 mm at Cronulla (highest October daily rainfall on record)
• 124 mm at Marrickville
• 121 mm at Canterbury
• 116 mm at Strathfield
• 101 mm at Sydney Airport

Strathfield recorded 94 mm in 3 hours on Tuesday evening, a 1-in-100-year event. At Marrickville, Canterbury and Sydney Airport the 2- and 3-hour rainfall was generally around a 1-in-20-year event. The Cooks River catchment averaged 109 mm over a 12-hour period (mostly in 4 hours).

Highest rainfall totals elsewhere in the 24 hours from 9 am Tuesday to 9 am Wednesday were mostly in the Ulladulla area:
• Lake Conjola: 171 mm
• Ulladulla: 147 mm
• Jerrawangla: 126 mm
• Lake Tabourie: 120 mm
• Fisherman's Repeater: 119 mm
• Burrill Lake: 112 mm

WAVE HEIGHTS at 7 am Wednesday:
• Sydney: significant wave height 6 m, with maximum near 9 m
• Crowdy Head: significant wave height 3 m, with maximum of 4 m

SNOW Significant snowfalls were also reported in the Blue Mountains with 10 cm at Oberon and up to 20 cm claimed in some areas. Snow was reported as far north as the Barrington Tops.

Discussion: In regard to the low and shear zone, ACCESS-R nailed the thunderstorms in the NE and the location, timing and movement of the shear zone and rain up the South Coast into the Central Coast. The RUC had the storms too far west and too slow to move offshore in the NE, and again was too slow and too far south with the heavy rain and shear zone moving up the coast.

ACCESS and EC had it really well, but the detail was missing, and this is where the RUC adds value. The RUC, which did have the timing and location out by speeding up the change, seemed to nail, run after run, the mode and structure of the line and the vortices forming on the shear convergence line. This was forecast for around the Sydney region, and that's exactly what occurred in the Sydney basin in the 09 to 11Z time frame. It's this extra detail that would allow us to better define our warnings and highlight the regions where we think the worst effects may be. Both ACCESS and EC show a huge 100 km region that could be bad, while RUC shows about a 20-30 km wide region, if that. This would be a great case study for a practice exercise on an advanced forecasters course and to show the advantages of high-resolution models. Big win for the RUC.

The shear zone and southerly change moving up the coast were captured well by all models. The interesting thing to note was that the morning runs of RUC (18–00Z) did well with the timing and location of the change and shear zone, placing them in the Sydney basin by 10Z, which is what occurred (Figure 39). The following runs (01, 03, 05, 06Z etc.) pushed the change further and further up the coast with each run, so the forecasts became more and more wrong at shorter lead times. The RUC should be capturing the situation better as it is updating hourly and getting closer to the event. It totally missed the stalling of the shear zone just south of Sydney and wasn't correct till about 15Z, when the storm caught up to the model. It should have been the other way around.

The major win for the RUC was that, while timing and location were wrong, which seems common with many forms of high-resolution modelling, it captured perfectly the structure of the shear line and the vortices breaking off and spinning into the coast, run after run, and that's exactly what happened and caused most of the issues for Sydney (flooding, wind damage etc.). So, examining 1-hourly data (or 3-hourly in the case of EC), the 00Z RUC gave the clearest broad story (taking position and timing into account).


Figure 39 Example of a 10-minute rainfall forecast in RUC (green colours) and the 6-minute rain rates from Rainfields3 (blue colours) for the 14 October 2014 case study.

27 October 2014

An example of RUC’s major improvement: 27 October 2014. Gusty westerlies over land ahead of a southerly change over the waters. RFC issued the wind gust warning at 1:10 pm (see below), after the gusts had been observed.


All the models underestimated the wind speed. RUC depicted the gusts advancing over the CBD (Figure 40) in the 21:00 run which would have helped the forecasters to issue the warning earlier.

Figure 40 Wind speed and direction for 27 October 2014

IDN28500: Australian Government Bureau of Meteorology, New South Wales

Severe Weather Warning, for damaging winds for people in the Illawarra, Central Tablelands, Southern Tablelands, Snowy Mountains and Australian Capital Territory forecast districts.

Issued at 1:10 pm EDT on Monday 27 October 2014.

Gusty west to north-westerly winds ahead of southerly change expected to move through south-eastern New South Wales during Monday afternoon and evening.

Weather Situation

DAMAGING WINDS around 60 km/h with peak gusts of 90 km/h are forecast for parts of the Illawarra, Central Tablelands, Southern Tablelands, Snowy Mountains and Australian Capital Territory forecast districts…


The southerly change itself was depicted in the same 21:00 run at 08Z (Figure 41):

Figure 41 Wind speed and direction at 08:30Z 27 October.
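The case notes mix model times in UTC (for example 08Z) with warning and observation times in local time (for example 1:10 pm EDT). The sketch below shows one way to put them on a common clock. It assumes a fixed offset of UTC+11 (Australian Eastern Daylight Time), which applies because daylight saving was in force in NSW on the October 2014 dates discussed. The helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC+11 offset (AEDT), valid for the October 2014 cases only.
AEDT = timezone(timedelta(hours=11), "AEDT")


def utc_to_aedt(when_utc: datetime) -> datetime:
    """Express a timezone-aware UTC time in AEDT."""
    return when_utc.astimezone(AEDT)


def aedt_to_utc(when_local: datetime) -> datetime:
    """Express a timezone-aware AEDT time in UTC."""
    return when_local.astimezone(timezone.utc)


# Valid time of the wind field in Figure 41 (08:30Z on 27 October 2014).
change = datetime(2014, 10, 27, 8, 30, tzinfo=timezone.utc)
print(utc_to_aedt(change).strftime("%H:%M %Z"))   # 19:30 AEDT

# Issue time of the Severe Weather Warning (1:10 pm local time).
warning = datetime(2014, 10, 27, 13, 10, tzinfo=AEDT)
print(aedt_to_utc(warning).strftime("%H:%MZ"))    # 02:10Z
```

With the same offset, the 09 to 11Z window noted for 14 October corresponds to 8 pm to 10 pm local time, which brackets the peak gusts recorded at Kurnell and Wattamolla.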

Other fields: the 2 m temperature in RUC for the western suburbs was a bit high in the morning (34 °C), but this is hard to verify. The cloudiness was probably not well captured, which would be one reason for the higher temperatures.

The dew point features in RUC were really spectacular, showing the westerly advection of lower dew points over the lower part of the Sydney area, between the two humid areas (to the south over the reservoirs and to the north over the Colo River) (Figure 42).


Figure 42 05Z 27 October. Top: MSLP and dewpoint temperature. Bottom: MSLP and wind barbs.


15.9.2 Cognitive Task Analysis – examples of outcomes

Cognitive task analysis (CTA) is a type of analysis aimed at understanding tasks that require a lot of cognitive activity from the user, such as decision-making, problem-solving, and judgement. We have used it to evaluate the impact of the RUC on the development and evolution of mental models in the decision-making process. In other words, we wanted to reveal what forecasters and experts know and how they think when analysing the NWP outputs. We used the Critical Decision Method to get experienced forecasters to describe some of the challenges they faced when using the RUC to identify the “feature of the day”. The process was meant to reveal the factors and cues noticed by forecasters, the goals they adopted, and the actions used to assess the feature. While it is easy to “observe” the actions that the forecasters take, the decision processes underlying these actions are not so obvious.

In the following table we have observed how the forecaster performed the task of developing Situational Awareness.

Table 14 Examples of developing Situational Awareness by the forecaster

Background Feature Cue Assessment (A)/ Conclusion Concern (C)/ Projection (P)/ Course of Action (COA)


3 Nov 2014 Early morning off RUC and AccC Radar boundary, Features over shore feature, treat features NWP had the land are High differently. boundary generally pressure wind (NE-E) displaced (A) depicted in the ridge convergence over same way by Sydney, Dry line was pretty ACCESS C and ACCESS C and close (A) wind change RUC models RUC; represent the Dry air OK in the over sea ACCESS convergence west (A) C is slightly NE-E differently RUC cooled the better, but over Southern temp in the sea temps Sydney breeze (good (microphysics) point) but ACCESS are better in C hasn’t. (A) RUC.

Made a long loop with RUC in VW Assimilation of (COA) data didn’t solve Assimilation of the issue (C) data didn’t solve

the issue (C)


5 Nov 2014 Initiation, 3 consecutive placement of runs in RUC,

convection, orographic initiation is

forecast with 1- RUC doesn’t 10 mins capture Initiation accuracy. around the main RUC did not feature. model the wide

Clearance spread conv around the main frontal

feature.

Enough mid- level instability After thermal

or other instability, other mechanisms to mechanisms continue to remain in RUC. initiate The clearance is convection Gale wind not quick enough. after the main

band of TS. RUC ACCESS C does it has a lagged better

clearance of TS

No convection Norah Head modeled in RUC Gale, due to an ACCESS C and with the southerly outflow that RUC model change; ACCESS C reinforced the differently the had convection Action: check with synoptic flow. upper satellite, upper RUC captures troposphere divergence moved pockets of gale (to teach quicker offshore. wind. forecasters about this !)

In reality there Timing of the southerly was no convection. change not convection mode, correct modelled (by

Multicellular RUC + ACCESS tending to lines C, ACCESS R). along features DA didn’t fix it! with embedded big problem ! organized cells (short lived rotation, FF, funnel cloud, l

6 Nov 2014 Checking for Initiation good RUC is good conceptual model, with surface and ‘Geoff Feren” lifting – rule of thumb Initiation placement good, but early ~40 mins.

10 Nov 2014 Convective RUC A: RUC The nesting two troughs: initiation on NT precipitation overestimated model for RUC and wind initiation of was the source

overlaid with convection in NT, of the problem Trough from radar didn’t was only Cu in (overestimated the NWS to verify RUC’s satellite-VIS Td) the CT initiation for the Obs: Aviation NT trough; this thought Td were was a failure for A: RUC underestimated. Trough from all guidance. overestimated Td in the westerly NT to CT flow over the NT, Many runs by a couple of indicate degrees. All guidance initiation in NT; forecast C: Td problem in Assim didn’t fix precipitation RUC the problems on both (initiation and P: assimilation will troughs. timing). The convection solve it or the RUC might be a little bit in the NWS trough verifies early. The RFC Might be a PBL but is slightly trusts more parametrization early. the pp on the issue. NT-CT trough COA: investigated

the Td in all TBD: study the

Isolated cell guidance and behavior of the

spreading along found out that all PBL schemes in the trough, overestimated Td the RUC merging into in TN.

multicellular

Convective mode convection and producing an

outflow that

spreads through the NE during late afternoon, evening, night.


Strength: Convection Multicellular evolution convection + large outflow

11 Low cloud In S/SEly flow, A : use VIS images RUC solves November low cloudiness and obs to assess better the

develops into low cloudiness. orography and the valleys solves low C: NWP do not along the east cloudiness in resolve these side of the S/SEly flows features in detail ranges along the east P: low cloudiness side of the

and drizzle might ranges. not be properly “RUC has done forecast; radar very good job” might not pick up the drizzle.

COA: check the total cloud in RUC,

compare with satellite and obs. Storms on afternoon/evening 11th November Concern: Satellite A trough slightly Operational pictures are misplaced in the mets had valid for 15- RUC (but no other trouble placing 20mins after models had it it based on the nominal time either). surface obs- Not enough could really only The clearing of the obs: we should pick it by storms. Comparing use the RUC as looking at RUC forecasts from analysis. 21Z on Nov 10th convection in with ACCESS-R satellite images) Clearing of the storms was forecasts from 18Z. Note that this area better in RUC is not really Assessment: than in AccessR. Also note that covered by APS1- SY satellite pictures are valid for 15- APS1-R has just a 20mins after very broad area of nominal time precip – no indication of


shower structure and seems to hang

around for too long.

Comparing hourly precip as well (as field is in common for RUC and APS1- R)

Convective RUC has a bias C: this might be Initiation in initiation, is a due to bit too early. parametrization of entrainment of dry air in convective cloud and or parametrization of evaporation of precip from high based convective cloud

Td behavior in low The RUC doesn’t C: the nesting terrain during depict the rapid model for RUC is rapid drying processes of an experimental drying. one and can’t be seen. ACCESS R can be a proxy.

A: No other guidance captured this fast drying over the SW of the RUC domain.

15.9.3 Conclusions

The subjective evaluation highlighted that the detailed analysis enhanced the forecaster's situational awareness for very short-period and local weather events and allowed for greater intervention of the forecaster in the “Convective Outlook product” and in the “nowcasting process”.

One of the most important conclusions of the FDP: experts are crucial in developing declarative knowledge, trainers are important in developing procedural knowledge, and both are needed.

Forecasters said that they became more confident in the use of the model when they received explanations from experts, and that they understood more about numerical prediction, STEPS and Rainfields in general than when they received the “in-house” training from trainers. During the FDP there was systematic interaction between forecasters and experts that helped the forecasters to extract more information from the outputs, even though the experts provided explanations that were significantly more abstract and theoretically oriented than those of the non-experts. Somehow, the forecasters were able to solve “transfer problems” quickly and effectively.

Conceptual knowledge alone, however, is insufficient for generating effective performance. The non-expert trainers can provide more concrete, procedural explanations, which can facilitate higher performance by trainees when they attempt to perform the original target task (using the same example: in order to assess the potential for bow-echo development we first need to develop clear procedures on how to evaluate the wind shear in the cloud layer and its relation to the low-level boundary orientation).

The abstractions provided by the experts lacked key details and process information necessary for optimal performance. The most effective learning occurs when all necessary information is available to the learner in the form of instruction and/or prior knowledge.

We had experts that didn’t need detailed explanations.

The experts’ performance relied on processes that were very different from their declarative knowledge of their practice.

Event review was one of the primary strategies for learning.

As NWP improves, the role of the forecaster is moving from tweaking the model output to interpreting it for a multitude of decision makers. To fulfil this role, the forecaster needs high-quality learning to understand weather processes so that she/he can assist decision makers or other professionals to identify how weather impacts their domain.

15.9.4 References

Bernard, H.R, G.W. Ryan and A.Y. Wutich, 2010: Analysing Qualitative Data, Sage Publications.

Doswell C.A. III, 2003: What does it take to be a good forecaster? (http://www.flame.org/~cdoswell/forecasting/Forecaster_Qualities.html)

Doswell C.A. III, 1986: The human element in weather forecasting, (http://www.flame.org/~cdoswell/publications/Human/Human_Element.html)

Erkkilä, Timo, 2008: About the nature of the forecaster profession and the human contribution to very short range forecasts, The European Forecasters – Newsletter, available at: http://www.euroforecaster.org/newsletter14/erkkila.pdf

Gaia, M. and L. Fontannaz, 2008: The human side of weather forecasting, The European Forecaster, 13, 17-20

Klein Associates Inc., 2003: Cognitive task analysis of the warning forecaster task, (http://wdtb.noaa.gov/modules/CTA/Final123102rev030108.pdf)

Kox et al, Perception and use of uncertainty in severe weather warnings by emergency services in Germany, Atmospheric Research, 158-159, 292-301.

LaDue D., 2011: How meteorologists learn to forecast the weather: understanding human learning in complex domains, 20th Symposium on Education, 91st American Meteorological Society Annual Meeting

Proceedings of the WMO Regional Association VI (Europe) Conference on Social and Economic Benefits of Weather, Climate and Water Services, Lucerne, Switzerland, 3-4 October 2011 available at: https://www.wmo.int/pages/prog/amp/pwsp/documents/PWS_23_ROE-1_en.pdf

Stuart, N.A., D.M. Schultz, and G. Klein, 2007: Maintaining the Role of Humans in the Forecast Process: Analyzing the Psyche of Expert Forecasters. Bull. Amer. Meteor. Soc., 88, 1893–1898, https://doi.org/10.1175/BAMS-88-12-1893

15.10 On the Use of Convective Scale Models in Nowcasting - the forecaster's role and training needs

Aurora Bell

15.10.1 Scale and predictability

Energy cascades in the atmosphere from planetary scale (waves and jets, short waves and jet streaks, advecting air masses), down to local terrain effects and sea breezes, and further down to microclimates where vegetation roughness and surface evaporation play an important role. This implies that the weather that is experienced at a point on the surface depends on processes that operate on a wide range of scales.

1. The features at large synoptic scale (large synoptic systems) are quite predictable at 24 hours, and the use of a deterministic approach to evaluating the synoptic situation is sufficient for lead times that are less than 12 hours.

2. Moving to smaller scales, the mesoscales, the features we need to describe are less predictable; the models therefore have less skill and ensembles start to be useful.

3. If we go to even smaller scales, like the individual shower scale, the NWP has little skill, the features behave as unpredictable noise, and very large ensembles are required. Before moving into ensembles at the small scale, we must first evaluate whether the model is right in a deterministic sense, that is, whether the features it describes are realistic.

A Convective Scale Model is one of many forecast tools and should not be used in isolation but should be used together with observations and the meteorological conceptual model of the forecast situation. In common with any other tool or guidance used in forecasting, the forecaster needs to be equipped with the understanding of the model limitations, the different capabilities of the operational models used for guidance (Figure 43) and how these will impact on the forecast of the day. The model should not be used in the operational forecast process without first developing and then integrating this understanding into the forecast process. We will call this background knowledge the “conceptual model of the NWP behaviour”.

The Sydney 2014 FDP identified that the Convective Scale Model has considerable skill in depicting the initiation of convection along low-level boundaries (mainly along the synoptic boundaries: drylines, cold fronts, prefrontal troughs, coastal troughs), and also the mode and the evolution of convection as it moves through complex terrain and through changes in the thermodynamic processes of the lower troposphere. These skills depend strongly on how well the parent model positions the boundary and its subsequent movement. Therefore, the way in which the lateral boundary conditions of the Convective Scale Model are taken from the parent model has a major impact on the quality of the forecast. We do not discuss here the role of the assimilation of data.

Figure 43 Depiction of model grid resolution for Australian models.

We present here a proposal for how to exploit this observed strength of the Convective Scale Model to deliver an improved forecasting service. Over time other strengths of the model will become apparent as the model is improved and the forecasters gain experience in using it, and therefore the way that the model is used will evolve over time.


The task of assessing the behaviour of the Convective Scale Model in relation to convective mode evolution of storms initiating along a synoptic boundary builds on the following knowledge:

• A collection of meteorological conceptual models for the Australian climate, particular regions and topography, that present the scale and the predictability of a specific phenomenon (TC, southerly busters, breezes etc.)

• A conceptual model of Convective Scale Model behaviour that explains how the model behaves in a specific meteorological situation (e.g. strong inversion, strong convection, retrograde motion etc) and what is the relation with the parent model in that particular type of circulation. This means that the forecasters understand how the model deals with physics, what are the limitations of the implemented parametrization schemes and what was the calibration performed on the model in a specific region.

• A descriptive knowledge that explains what the technical differences between different runs are, types of data that are assimilated and at what times, hours of assimilations and lateral boundaries, so the forecaster knows which run has “better” set of data, which initializes better for the feature of the day (Table 15).

Table 15 Background knowledge for the effective use of a CSM RUC - Training needs

Domain: Data Assimilation background knowledge
Problem: In Australia the mesoscale circulations and flows are strongly different between night and daytime. Some older runs can perform better than newer runs.
Forecaster's story: The forecaster needs to be aware of the impact of assimilating observations into the model.
Actions to be performed: Build a descriptive knowledge on the differences between runs, hours of assimilations and lateral boundaries, so the forecaster knows which run has “better” set of data, which initializes better for the feature of the day.

Domain: CSM RUC background knowledge
Problem: The CSM RUC doesn't solve all convective scale, parametrization is still used.
Forecaster's story: The forecaster needs to be aware of the artefacts induced by different parametrization schemes.
Actions to be performed: Build a conceptual model of NWP behaviour that explains how the model deals with physics, limitations of the implemented parametrization schemes and the calibration in a specific region.

Domain: Australian Mesoscale Meteorology background knowledge
Problem: The complexity of combination of different scales aspects in the same topic of convection.
Forecaster's story: The forecaster needs to understand the predictability of the "feature of the day".
Actions to be performed: Use of Australian Mesoscale Conceptual models (ex: the Australian dryline, the East Coast Lows, the Southerly buster, the large zonally oriented cold fronts, bay breeze, sea breeze, river breeze etc.)

• How to evaluate the performance of the models at the initiation: a complex series of steps to assess the correctness of the parent model by overlapping recent observations of large synoptic scale features (satellite, radar, automatic station) on the first forecast steps, and to assess the type of the synoptic and mesoscale forcing that acts in the time window of interest. This is to assess how good the initialization of the high-resolution model is.

• A decision on what model run to use and how to use it (e.g. deterministic approach with subjective interpretation, subjective mean of different solutions).

• Continuous monitoring of the situation to confirm the validity of the conceptual model of the day and to initiate a revision if required, e.g. Analysis and Diagnosis (Table 16).

Table 16 Analysis and Diagnosis in the Forecast Process based on CSM RUC.

Task: Assess the quality of initialization of the CSM RUC.
Cognitive output: Building confidence in the model
Cognitive level: Analysis, Diagnostic; Interrogate, Investigate
Activity: Assess the correctness of the parent model by overlapping and comparing recent observations of large synoptic scale features (satellite, radar, automatic station) on first forecast steps
Output, frequency: Decision, policy. Repeat for every new run

Task: Assess the scale (predictability) of the feature of the day.
Cognitive output: Understanding the risk of the day
Cognitive level: Analysis, Diagnostic; Interrogate, Investigate
Activity: Assess the type and strength of the synoptic and mesoscale forcing that act in the time window of interest.
Output, frequency: Policy. Repeat with every new run

Task: Assess the uncertainty of the feature of the day
Cognitive output: Decide how to communicate the risk of the day.
Cognitive level: Diagnostic
Activity: Decide if the CSM is good to use and how to use it (ex: deterministic approach with subjective interpretation, subjective mean of different solution, using the predictable parts of it, using the mode and evolution of convection, using the detailed description of features etc.).
Output, frequency: Decide how to use the model. Repeat with every new run

Task: Production stage
Cognitive output: Projection-Forecast
Cognitive level: Objective/Subjective
Activity: Produce, Update & Disseminate Forecast for a specific time or for an interval
Output, frequency: Users oriented; Automatic/Manual

Task: Monitoring
Cognitive output: Situational awareness
Cognitive level: Map Analysis, Mesoscale analysis
Activity: Continuous monitoring of the situation to confirm the validity of the conceptual model of the day and to initiate a revision if required, by answering questions like: what has happened, why has it happened, etc.
Output, frequency: Repeat hourly
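The process in Table 16 can also be written down as a simple checklist structure, which makes the repeat frequency of each task explicit. The following is only an illustrative sketch: the class name `Step`, the list `FORECAST_PROCESS` and the helper `steps_due` are hypothetical and not part of any operational system, and the trigger labels are shorthand for the frequencies given in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    task: str
    cognitive_output: str
    repeat: str  # shorthand for the frequency listed in Table 16


# Tasks and cognitive outputs transcribed from Table 16.
FORECAST_PROCESS = [
    Step("Assess the quality of initialization of the CSM RUC",
         "Building confidence in the model", "every new run"),
    Step("Assess the scale (predictability) of the feature of the day",
         "Understanding the risk of the day", "every new run"),
    Step("Assess the uncertainty of the feature of the day",
         "Decide how to communicate the risk of the day", "every new run"),
    Step("Production stage", "Projection-Forecast",
         "not stated in the table"),
    Step("Monitoring", "Situational awareness", "hourly"),
]


def steps_due(trigger: str) -> list[str]:
    """Return the tasks that Table 16 says to repeat on the given trigger."""
    return [step.task for step in FORECAST_PROCESS if step.repeat == trigger]


print(steps_due("every new run"))
print(steps_due("hourly"))
```

Grouping by trigger shows that the three assessment tasks are tied to the arrival of a new model run, while monitoring runs on the clock.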


15.10.2 Assessing the convective mode and evolution: a problem at different scales

One of the challenges in weather forecasting is the anticipation of convective initiation and the subsequent mode of evolution of thunderstorms. Because the convective mode often has a strong influence on the severe weather threat, it is crucial for a severe weather forecaster to accurately predict the modes and evolution of deep convection. This problem has two aspects:

1. To develop conceptual models for convective initiation and evolution at the mesoscale that are specific for a region and a synoptic situation and to include them in the forecast process.

2. To develop conceptual models of NWP behaviour: NWP models have their constraints and might not represent these processes properly, or can have “a specific behaviour” in a specific situation, and the forecasters need to learn about these when it comes to introducing new models into operations.

During the FDP the forecasters had to forecast the convective mode evolution at very short range (0-3 h), based on the RUC output. We discuss here only the situations in which storms initiated along a synoptic boundary (dryline, cold front, prefrontal trough, coastal trough) in conditions of surface-based instability, and only the “discrete” and “linear” modes (we do not discuss the mixed mode). In the model, the storms would generally remain isolated along or near the boundary line, but on a few occasions a very rapid upscale growth into contiguous lines of storms was observed.

The forecasters noticed the high skill of the RUC in depicting both the initiation of the storms, usually as discrete cells, and the evolution of the mode of convection from discrete cells into linear modes, combinations of discrete and linear modes, or even more complex systems. They also appreciated the depiction of the advection and propagation components of the cells' movement along a boundary line that generated forward or backward building of the line, and acclaimed the usefulness of this conceptual model of movement of the convective line in assessing the hazard potential. There were also situations when the initiation and evolution of convection was completely or partially missed by the RUC model. The forecasters tried to understand the context of these “missed” situations. Here, we present how the forecaster could use a conceptual model of such behaviour of the model, and of its relation to the parent model, in the forecast process based on CSM RUC.

15.10.3 What are the parameters relevant to the convection mode that the RUC takes from the parent model?

The forecast of the convective mode and evolution (discrete or linear) depends on the evaluation of several parameters, from the synoptic scale to the mesoscale and the storm scale, such as wind, temperature and moisture. While the wind parameters are easy to use in the RUC, the thermodynamic parameters are difficult to use because of the limited representativeness of the RUC sounding data in areas where significant forcing is present. For the wind profile, the relevant parameters are:

• The vertical shear profile orientation with respect to the initiating boundary (shear oriented normal to a line of forcing is associated with upscale linear growth while shear oriented at an oblique angle to the boundary results in discrete cells);

• The orientation of the low-level boundary (the low-level forcing) with respect to the environmental flow has a role in modulating the organization of the precipitation distribution associated with radar-observed linear mesoscale convective systems (the motion of the boundary…);

• Cold pool strength and longevity, which are determined by the orientation of the cloud-layer shear (precipitation efficiency). The strength of the cold pool also depends on temperature and moisture parameters but, as mentioned earlier, we will not discuss the influence of the thermodynamic parameters here.
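The orientation of the shear with respect to the boundary, listed above, can be quantified by splitting the shear vector into along-boundary and cross-boundary components. The following is a minimal sketch under stated assumptions: winds are given as (u, v) components in m s−1 at two levels chosen by the forecaster, the boundary is treated as a straight line whose orientation is given as a bearing clockwise from north, and the function names are illustrative. It computes the geometry only and does not classify the convective mode.

```python
import math


def shear_vector(u_low, v_low, u_high, v_high):
    """Bulk shear between two levels: upper wind minus lower wind (m/s)."""
    return (u_high - u_low, v_high - v_low)


def boundary_relative_shear(shear, boundary_bearing_deg):
    """Split a shear vector into along- and cross-boundary components.

    boundary_bearing_deg is the orientation of the boundary line, measured
    clockwise from north (0 means a north-south oriented boundary).
    Returns (along, cross, angle), where angle is the acute angle in degrees
    between the shear vector and the boundary (90 means normal to it).
    """
    theta = math.radians(boundary_bearing_deg)
    tx, ty = math.sin(theta), math.cos(theta)  # unit vector along the boundary
    su, sv = shear
    along = su * tx + sv * ty
    cross = su * ty - sv * tx                  # projection on the normal (ty, -tx)
    if math.hypot(su, sv) == 0.0:
        return along, cross, 0.0
    angle = math.degrees(math.atan2(abs(cross), abs(along)))
    return along, cross, angle


# Hypothetical example: north-south boundary, shear pointing due east.
shear = shear_vector(0.0, 5.0, 10.0, 5.0)
along, cross, angle = boundary_relative_shear(shear, 0.0)
print(f"along = {along:.1f} m/s, cross = {cross:.1f} m/s, angle = {angle:.0f} deg")
```

In this example the shear is entirely cross-boundary, that is, oriented normal to the line of forcing.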

During FDP we were looking at three types of convective modes: linear, discrete and mixed. The linear mode can be “simple” or “severe” (squall line or bowing segments).

Processes that play a role in the convective mode are:

1. The trailing stratiform region (controlled by synoptic scale: 3D wind, mesoscale: pressure under the stratiform region, rear-inflow jets, storm scale: cold pool, downdrafts)

When forecasting linear modes, it is very useful to know whether the convection will develop a trailing stratiform region, as this continuous precipitation adds to the accumulation and can turn heavy rainfall into a threat that is not very obvious. This is also assessed from the 3D structure of the wind (and pressure) in the precipitating system and has synoptic and mesoscale components that impact on storm structure (strength of the cold pool and downdrafts). Rear-inflow jets are used to predict severe linear modes such as squall line propagation and bowing segments.

2. The strength of the mesoscale linear forcing

The low-level convergence generates vertical motion along the boundaries. The strength of the low-level convergence (ex: mass convergence in the lowest 90 hPa) dictates the magnitude and duration of the deep mesoscale lift along the boundary and thus the magnitude of the mesoscale forcing. Deep, persistent mesoscale ascent can reduce the dry entrainment by moistening the thermodynamic profiles and can weaken or eliminate convective inhibition associated with capping inversions. Very rapid upscale linear growth is frequently observed within the zone of strong convergence and mesoscale ascent (like from the merging of two lines).
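As an illustration of how low-level convergence can be diagnosed from gridded winds, the sketch below computes horizontal convergence, −(∂u/∂x + ∂v/∂y), with centred finite differences on a regular grid. It is a single-level sketch with assumed inputs (plain nested lists and a uniform grid spacing in metres); a layer value such as the lowest 90 hPa mentioned above would need an average over the model levels in that layer, which is not shown. The function name is illustrative.

```python
def convergence(u, v, dx, dy):
    """Horizontal convergence -(du/dx + dv/dy) in s^-1 at interior grid points.

    u, v are 2-D lists indexed [j][i] (j increasing northward, i eastward),
    in m/s; dx and dy are the grid spacings in metres. Edge points, where a
    centred difference is not available, are returned as None.
    """
    ny, nx = len(u), len(u[0])
    out = [[None] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dudx = (u[j][i + 1] - u[j][i - 1]) / (2.0 * dx)
            dvdy = (v[j + 1][i] - v[j - 1][i]) / (2.0 * dy)
            out[j][i] = -(dudx + dvdy)
    return out


# Hypothetical 3 x 3 example: westerlies meeting easterlies along the middle column.
u = [[5.0, 0.0, -5.0]] * 3
v = [[0.0, 0.0, 0.0]] * 3
conv = convergence(u, v, dx=1500.0, dy=1500.0)  # 1.5 km spacing is an assumption
print(f"{conv[1][1]:.2e} s^-1")
```

Positive values indicate convergence; in the example the opposing 5 m s−1 flows give a value of about 3.3 × 10⁻³ s⁻¹ at the centre point.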

3. Types of initiating synoptic boundaries


It has been observed that prefrontal troughs and especially drylines tend to produce persistent discrete modes more often than cold fronts, which are more frequently associated with convective lines. The lift along a dryline is confined to a narrow zone with localized areas of enhanced low-level convergence while in the higher levels there is a strong (westerly-easterly) component of the mid-level flow. Therefore, the residence time in the updrafts is shorter. When the middle-level flow component is large, the residence time is even shorter. Also, the proximity to elevated mixed layers results in strong convective inhibition. If the deep-layer forcing is not sustained, the development of the convective cells will be isolated or widely scattered along the dryline.

The wedge of the advancing cold air associated with a cold front generates slab-like ascent that is stronger in autumn and spring, when the density discontinuity can be more pronounced, resulting in a higher frequency of linear evolution. Cold fronts are associated with short-wave troughs or higher-amplitude upper troughs. The latter are more likely to have a weaker normal component of mean wind and shear along much of their length, which can promote storms remaining within the zone of forcing along the boundary for longer periods of time.

4. Storm motion with respect to the initiating synoptic boundary

Storms evolve differently if they initiate in environments where external mesoscale forcing is persistent and strong rather than weak. Storm motion is due to both advection and propagation; therefore the orientation of the mean cloud-layer wind vectors with respect to the boundary, via advection, has a strong influence on the length of time updrafts remain near a boundary. Storms that remain within the zone of linear mesoscale convergence have a propensity to grow upscale at a faster rate. The distance between the storm and the boundary is also important: the larger the distance, the more likely the storm is to move away from the boundary.
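The dependence of residence time on the orientation of the storm motion relative to the boundary can be sketched with simple vector geometry. The example below is a hedged illustration, not a method used in the FDP: the function names, the idea of a fixed half-width for the convergence zone, and the sample numbers are all assumptions. It treats the storm motion and the boundary motion as steady, and takes the storm motion vector as given (whether it arises from advection alone or from advection plus propagation).

```python
import math


def boundary_normal_speed(u, v, boundary_azimuth_deg):
    """Component (m/s) of a motion vector normal to a straight boundary.

    u, v: eastward and northward components of the motion (m/s).
    boundary_azimuth_deg: orientation of the boundary, in degrees
    clockwise from north (0 = north-south oriented boundary).
    """
    az = math.radians(boundary_azimuth_deg)
    # Unit vector along the boundary (east, north components).
    tx, ty = math.sin(az), math.cos(az)
    # Unit normal: the tangent rotated 90 degrees clockwise.
    nx, ny = ty, -tx
    return u * nx + v * ny


def residence_time_s(u, v, boundary_azimuth_deg, zone_half_width_m,
                     boundary_normal_motion=0.0):
    """Time (s) for a storm starting on the boundary to leave a convergence
    zone of the given half-width, assuming steady motion. Returns infinity
    when the storm has no motion relative to the boundary in the normal
    direction."""
    relative = abs(boundary_normal_speed(u, v, boundary_azimuth_deg)
                   - boundary_normal_motion)
    if relative < 1e-9:
        return math.inf
    return zone_half_width_m / relative


# Illustrative case: north-south boundary, 10 km half-width zone.
print(residence_time_s(10.0, 0.0, 0.0, 10_000.0))  # motion across the boundary
print(residence_time_s(0.0, 10.0, 0.0, 10_000.0))  # motion along the boundary
```

In this simplified picture, a storm moving parallel to the boundary never leaves the zone of forcing, while a storm with a large boundary-normal component leaves it quickly, which matches the qualitative argument above.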

5. Capping inversion or lid strength in warm sector

The inversion is stronger in environments where the storms remain discrete, whether they stay on the boundary or move away from it. For the linear mode, the inversion is weaker.

Convective mode evolution involves many variables and processes that occur on the synoptic, meso and storm scales. Storms can evolve along outflow boundaries within the warm sector, along complex terrain features and synoptic boundaries, and in many combinations of these.

15.11 Post-FDP Benefits

A nowcasting training event was organized after the FDP, on 16-20 May 2016, to prepare the forecasting community for the use of the Convective Scale Model (APS2, a high-resolution model without the hourly rapid update cycle), using different meteorological situations that developed after the FDP over different regions of Australia. All material is available on the BMTC Latitude internal site. Eighteen forecasters from BNOC and the RO used the APS2 to explore four mesoscale cases of high-impact weather events (a low cloud/fog case in Melbourne relevant for aviation, the Kurnell Tornado, a squall line in Darwin and the Pinery Fire in SA).

Several oral presentations and posters have been delivered, both at international conferences (American Meteorological Society, Australian Meteorological and Oceanographic Society, European radar conference) and to the internal meteorological community; these can be accessed internally.

Conclusions from the FDP have been used in strategy development during Re-imagining and development of the Extreme Weather Desk. The FDP was a useful template for running a "Hazardous Weather Testbed" out of the Extreme Weather Desk during November 2018, which allowed forecasters to test the new Rapid Update model with new convective weather diagnostics.
