
University of Calgary

PRISM: University of Calgary's Digital Repository

Graduate Studies

The Vault: Electronic Theses and Dissertations

2017

Topographic and Geographic Influences on Near-surface Temperature under Different Seasonal Weather Types in Southwestern Alberta

Wood, Wendy Helen

Wood, W. H. (2017). Topographic and Geographic Influences on Near-surface Temperature under Different Seasonal Weather Types in Southwestern Alberta (Unpublished doctoral thesis). University of Calgary, Calgary, AB. doi:10.11575/PRISM/28465 http://hdl.handle.net/11023/3686 doctoral thesis

University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission. Downloaded from PRISM: https://prism.ucalgary.ca

UNIVERSITY OF CALGARY

Topographic and Geographic Influences on Near-surface Temperature under Different

Seasonal Weather Types in Southwestern Alberta

by

Wendy Helen Wood

A THESIS

SUBMITTED TO THE FACULTY OF GRADUATE STUDIES

IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE

DEGREE OF DOCTOR OF PHILOSOPHY

GRADUATE PROGRAM IN GEOGRAPHY

CALGARY, ALBERTA

APRIL, 2017

© Wendy Helen Wood 2017

ABSTRACT

Near-surface temperature variability is influenced by geographic and terrain characteristics. My research examines how these influences vary by weather type. This knowledge is used to determine the best methods for modelling temperature in the mountains and prairies in southwestern Alberta, using data collected as part of the Foothills Climate Array (FCA) study.

A weather classification system was developed for the area using multivariate statistical analysis, and six weather patterns were identified. Missing temperature data in the FCA are gap-filled using regression equations generated using the most closely correlated station for each site, where correlations are calculated by seasonal weather type. Seasonal weather type correlations improve estimates by ~7% over monthly correlations. The biggest improvements (10 to 20%) occur for chinook and cool-wet days. Cold Arctic air days and hot anticyclonic days in summer show the lowest improvement, indicating strong within-type variability for these weather types. These weather types also show the most variable temperature lapse rates, with frequent inversions.

Local weighted regression models outperform multivariate regression models by between 4 and 8% in the mountains. Daily temperature and elevation are not always strongly correlated, most notably during Arctic cold spells. This is true for both minimum and maximum temperatures in the mountains. Therefore, regression models using elevation as the only predictor perform poorly, particularly in winter months. Vertical and horizontal separation are the most important factors in choosing local neighbours, with vertical separation being most important for minimum temperatures and for winter months. Relative elevation and slope, as indicators of cold air pooling potential, influence the selection of local neighbours for minimum and mean temperature models.

Spatial proximity is the most important factor determining temperature relatedness in the prairies. Minimum temperatures are strongly influenced by urban and relative elevation effects. Sites located within the city of Calgary are warmer than those in the outlying areas, and temperatures are warmer away from low-lying areas. Seasonal variability is stronger than weather type variability in the prairies. Therefore, kriging is suggested as an appropriate method for estimating temperature in the prairies, with models parameterised monthly.

ACKNOWLEDGEMENTS

This thesis would not have been possible without the support and encouragement of many people. In particular, I thank both my husband Nick and my supervisor Dr. Shawn Marshall. Nick, you are a saint for putting up with me. Shawn, it has been a pleasure and privilege to know you as a friend and supervisor. Your quest for sound science is an inspiration. Thank you for your patience and for quietly challenging me to do better.

Thanks also to my committee members, Dr. Stefania Bertazzon and Dr. John Yackel. Stefania tried hard to instill statistical correctness in both my methods and writing; I tried hard, but can never measure up to her high standards. It was always a pleasure talking about the weather with John, someone who shares my interest in our highly changeable weather here in Alberta.

Rick Smith provided technical support and was a good listener for all my problems, as well as being a source of many good ideas. Terri Whitehead did an amazing job of looking after the field work and data management. My life was a lot easier knowing I could go to Terri with questions, and with her good organisational skills she always provided the answers. A host of summer students were responsible for field work and data collection.

I also thank the Geography department staff, in particular Paulina Medori, who always kept me informed of degree requirements and deadlines, allowing me to lose myself in data analysis, and Robin Poitras, who helped me produce beautiful posters from my random collection of text and images.

I am grateful for the financial support I received through Shawn Marshall's Canada Research Chair funding, and for scholarships from the Alberta government and the University of Calgary Faculty of Graduate Studies.


TABLE OF CONTENTS

Abstract ...... ii
Acknowledgements ...... iii
Table of Contents ...... iv
List of Figures ...... vii
List of Tables ...... xii
Abbreviations and Symbols ...... xvi
1. Introduction ...... 1
1.1 Thesis Objectives ...... 3
1.2 Thesis outline ...... 5
2. Background ...... 6
2.1 Climate processes and temperature ...... 6
2.1.1 Temperature Lapse Rates ...... 7
2.1.2 Surface Energy Fluxes ...... 9
2.1.3 Terrain Influences ...... 12
2.2 Weather Typing ...... 13
2.3 Interpolation and Regression Models ...... 17
3. Study Area and Data Quality Analysis ...... 22
3.1 Study Area ...... 22
3.2 Instrument Calibration and Accuracy ...... 27
3.2.1 Site setup and maintenance ...... 27
3.2.2 Instrumentation ...... 28
3.2.3 Instrument calibration ...... 29
3.2.4 Vented calibration tests ...... 33
3.2.5 Station relocation ...... 36
3.3 Quality Control ...... 37
3.3.1 Field checks ...... 39
3.3.2 Time shifts ...... 39
3.3.3 Spikes ...... 40
3.3.4 Extreme values ...... 41
3.3.5 Snow burial ...... 42


3.3.6 Neighbourhood consistency ...... 42
3.3.7 Field notes manual review ...... 45
3.3.8 Final review ...... 47
3.4 Summary ...... 49
4. Synoptic weather patterns in southwestern Alberta ...... 51
4.1 Cold dry weather ...... 52
4.2 Chinook ...... 54
4.3 Frontal conditions ...... 56
4.4 Cool wet weather ...... 58
4.5 Hot, high-pressure ridges ...... 59
4.6 Summary ...... 61
5. Statistical weather classification for southwestern Alberta ...... 62
5.1 Introduction ...... 62
5.1.1 Principal Component Analysis ...... 64
5.1.2 Cluster Analysis ...... 65
5.1.3 Discriminant Function Analysis ...... 66
5.1.4 Meteorologically-based (subjective) classification ...... 66
5.2 Methods ...... 68
5.2.1 Variable selection and data preparation ...... 68
5.2.2 Classification methods ...... 71
5.2.3 FCA data analysis by weather type ...... 72
5.3 Results ...... 72
5.3.1 Principal component and cluster analysis ...... 72
5.3.2 Discriminant function analysis ...... 73
5.3.3 FCA study period weather types ...... 76
5.4 Discussion ...... 87
5.5 Summary ...... 92
6. Terrain and weather-type influences on daily temperature variability ...... 93
6.1 Introduction ...... 93
6.2 Methods ...... 97
6.2.1 Method 1: Location and terrain attributes influencing temperature ...... 100


6.2.2 Method 2: Location and terrain attributes influencing station pair correlations ...... 100
6.3 Results ...... 101
6.3.1 Determination of optimum buffer size for calculating relative elevation ...... 101
6.3.2 Temperature – site attributes correlations ...... 103
6.3.3 Station pair correlation ...... 110
6.3.4 Location and terrain differences between correlated stations ...... 112
6.3.5 Standardized terrain differences ...... 116
6.3.6 Land surface type ...... 119
6.4 Discussion ...... 120
6.4.1 Prairies ...... 121
6.4.2 Mountains ...... 122
6.4.3 Relative elevation and cold air pooling ...... 123
6.5 Summary ...... 128
7. Gap filling the FCA data and landscape models for estimating daily temperature ...... 130
7.1 Introduction ...... 130
7.2 Methods ...... 133
7.2.1 Correlated station regression ...... 133
7.2.2 Landscape temperature interpolation models ...... 134
7.3 Results ...... 136
7.3.1 Correlated station temperature/temperature regression ...... 136
7.3.2 Mountain temperature interpolation models ...... 140
7.3.3 Prairie temperature interpolation models ...... 149
7.3.4 Comparison of interpolation methods ...... 152
7.4 Discussion ...... 159
7.5 Summary ...... 165
8. Conclusions ...... 167
References ...... 172


LIST OF FIGURES

Figure 2.1. Solar radiation components: 1. direct, 2. diffuse, and 3. reflected direct (Kumar et al., 1997). ...... 10
Figure 2.2. Variation in annual incoming shortwave radiation as a function of slope angle for north and south slopes in the northern hemisphere at latitudes ranging from 0 to 90° (Barry, 2001). ...... 11
Figure 3.1. Foothills Climate Array study area. Crosses indicate mountain sites and dots are prairie sites. The City of Calgary municipal boundary is shown as a black outline. Sites within the boundary are considered urban sites. ...... 22
Figure 3.2. Distribution of topographic measures (aspect, slope, and elevation) for FCA sites (upper panel) and overall 250-m resolution DEM derived measures (lower panel). ...... 24
Figure 3.3. Monthly climate normals (1981-2010) at the Environment Canada Calgary airport station for maximum, minimum, and average temperatures. Dashed lines indicate annual climate normals. ...... 26
Figure 3.4. Site location examples: prairie, forested, urban, and mountain. ...... 28
Figure 3.5. WRS (red) and 15 Veriteq loggers for a 5-day calibration test in May 2005. (a) actual temperatures and (b) temperature difference calculated as logger – WRS. ...... 30
Figure 3.6. Median temperature differences for each test where differences are calculated for all hours 00h to 24h, hours between 10h and 16h (day), and 00h and 06h (night). Box width is proportional to the square root of the number of observations, box height indicates the interquartile range, and the solid line within each box is the median value. ...... 31
Figure 3.7. WRS (red) and 17 loggers for a 6-day calibration test in June 2005, where 27 mm of rain fell on June 28. (a) actual temperatures and (b) temperature difference calculated as logger – WRS. Offsets greater than 2°C occur during heavy rain events. ...... 33
Figure 3.8. The average temperature difference between Veriteq loggers and the WRS at different wind speeds and incoming shortwave radiation values. Box width equals the square root of the sample size. ...... 35
Figure 3.9. Distribution plots of the difference between station daily mean temperature and the average of the 10 most highly correlated neighbour stations for stations which were moved during the study period. GT indicates data after the move and LT before the move. ...... 37
Figure 3.10. Site FA0234 shown in red has spikes lasting 2 and 3 hours on July 6 and 8, 2005, as well as persistent aberrant behaviour on July 7. ...... 40
Figure 3.11. Site FA0332 shown in red exceeded the Calgary monthly extreme on August 24, 2007. All other days with unusual minima are still within the extreme value threshold, but this sensor is clearly erratic, with anomalies on other days flagged by the neighbourhood test. ...... 41
Figure 3.12. Station FA0715 shown in red has a reduced diurnal range indicating snow burial. ...... 42
Figure 3.13. Site FA0517, shown in red, is a high-elevation site with lower temperatures compared to both vertical and horizontal neighbours. On May 27, 2010, the mean is more than 5 standard deviations lower than the group mean, but visual review indicates this is acceptable. ...... 44
Figure 3.14. Site FA0123, shown in red, normally agrees well with neighbours, but has some unusual data exceeding 5 standard deviations on March 22, 24, and 26, 2010. Data were deleted for these days. ...... 45
Figure 3.15. Site FA0618 shown in red has unusually high daytime maximum temperatures prior to August 7, 2007 when it was found lying on the ground during the annual site visit. ...... 46
Figure 3.16. Site FA0234 field notes indicated data are suspect and that the sensor was replaced. ...... 47
Figure 3.17. Map indicating the percentage of available (excluding missing and bad) data for each site, as shown by symbol size and colour. ...... 49
Figure 4.1. 500 hPa geopotential heights (m) for cA conditions on January 4 and 13, 2005. ...... 53
Figure 4.2. 500 hPa geopotential heights (m) for chinook conditions on January 30, 2009. ...... 55
Figure 4.3. Surface hourly meteorological measurements at Calgary for a cold front passing through on the morning of January 27, 2008. East-west winds are plotted in red, with positive values indicating a west wind, and north-south winds are shown in blue, positive indicating north winds. Temperature is shown in black. The green line marks the approximate time of the front. ...... 57
Figure 4.4. 500 hPa geopotential heights (m) for a cyclonic rain event on June 8, 2005. ...... 58
Figure 4.5. 500 hPa geopotential heights (m) for warm conditions on August 18, 2008. ...... 59
Figure 5.1. Smoothed temperature values at 09:00. The black symbols represent a 40-year mean and the solid line is the 11-day moving average of the 40-year mean. ...... 69
Figure 5.2. Boxplots of standardized temperature (TavgStd) and specific humidity (QavgStd) values for the six weather classes. The box edges indicate the 25th and 75th percentiles and the thick black line is the median value for each weather type. ...... 73
Figure 5.3. Discriminant function score scatterplots showing weather type separation. ...... 75
Figure 5.4. Average hourly measurements for temperature (T), relative humidity (RH), east-west wind (uEW), specific humidity (q), north-south wind (vNS) and pressure (P) for the top 10 days for each weather type. ...... 79


Figure 5.5. Lapse rates by weather type and month for Tmean. Months are grouped into seasonal bins indicated by colour. The dashed line indicates the median lapse rate. ...... 81
Figure 5.6. Lapse rates by weather type and month for Tmax. Months are grouped into seasonal bins indicated by colour. The dashed line indicates the median lapse rate. ...... 82
Figure 5.7. Lapse rates by weather type and month for Tmin. Months are grouped into seasonal bins indicated by colour. The dashed line indicates the median lapse rate. ...... 83
Figure 5.8. Mean temperature/elevation plots for the top 10 days for each weather type. The best fit line and the average lapse rate are shown. Mountain stations are in red, prairie stations are green, and city of Calgary stations are black. ...... 84
Figure 5.9. Interpolated mean temperature surfaces calculated using the average temperature of the top 10 days for each weather type. Points indicate FCA sites and the city of Calgary is shown as a black outline. ...... 85
Figure 5.10. Interpolated mean temperature surfaces calculated using the average temperature of the top 10 days for each weather type lapsed to an elevation of 1350 m. Points indicate FCA sites and the city of Calgary is shown as a black outline. ...... 86
Figure 5.11. A comparison between the number of days for each month for DFA (upper panel) and SSC (lower panel) weather types from 2005 to 2010. ...... 90
Figure 5.12. The number of days for each DFA weather type for each year from 2005 to 2010. ...... 91
Figure 6.1. Quantile-quantile plots for selected stations for 2005-2010. The straight line indicates a theoretical normal distribution. ...... 99
Figure 6.2. Elevation difference between site elevation and the lowest point within a 1000 m radius of the site (relative elevation, RE) plotted against average minimum temperature for warm-season Ht and Nl days at all sites between 1400 and 1800 m. A linear best fit line, regression equation, and adjusted R² are shown for each plot. The slope of the regression indicates the rate of change in temperature per 100-m change in RE. ...... 103
Figure 6.3. Spatial patterns of the 20 most highly correlated neighbours for a selected mountain site (FA0415) and prairie site (FA0436) for different weather types. Darker shades indicate higher correlations. The black square indicates the selected site. ...... 112
Figure 6.4. Average difference between site and neighbour terrain and location attributes for the 15 most highly-correlated neighbours, grouped by temperature measure for mountain and prairie sites. Aspect and slope are measured in degrees, vsep and RE are in meters, and hsep and cddist are in kilometers. ...... 114
Figure 6.5. Differences in terrain attributes between sites and the 15 most correlated neighbours, by season. vsep and RE are measured in meters, hsep and cddist in kilometers. Cool season (blue) – November to February, warm season (red) – May to August, moderate season (green) – March, April, September, October. ...... 115


Figure 6.6. Differences in terrain attributes between sites and the 15 most correlated neighbours, by weather type. vsep and RE are measured in meters, hsep and cddist in kilometers. ...... 116
Figure 6.7. Average standardized differences between site and neighbour terrain and location attributes for the 15 most correlated neighbours, grouped by temperature measure and season for mountain and prairie sites. Note that hsep, vsep, and cddist use a vertical scale from 0 to 1, and RE, slope, and aspect use a different vertical scale (0 to 0.2). ...... 118
Figure 6.8. Summed standardized differences between site and neighbour terrain and location attributes for the 15 most highly-correlated neighbours, grouped by temperature measure and weather type (left panel) and month (right panel) for mountain sites. Colours indicate the different terrain and location attributes. ...... 119
Figure 6.9. Correlation coefficient between relative elevation and minimum temperature for 200-m elevation bins for (a) cool, (b) warm, and (c) moderate months. The coloured dashed lines indicate median correlations for each elevation bin. Elevation bin names indicate the upper limit of the altitude range, e.g., LT1600 includes sites between 1400 and 1600 m altitude. GT2200 includes sites at altitudes above 2200 m and includes two high-altitude sites above 2600 m. Box widths are proportional to the square roots of the number of observations in the groups. ...... 126
Figure 6.10. Regression coefficient (RE lapse rate) for minimum temperature, with relative elevation as a predictor for 200-m elevation bins for (a) cool, (b) warm, and (c) moderate months. The coloured dashed lines indicate median coefficients for each elevation bin. Positive lapse rates indicate temperatures increase as relative elevation increases, i.e., the higher a site is above a low point/valley bottom. Elevation bin names indicate the upper limit of the altitude range, e.g., LT1600 includes sites between 1400 and 1600 m altitude. GT2200 includes sites at altitudes above 2200 m. ...... 127
Figure 7.1. Error distributions for the weather type (a) and month (b) estimates. Bars have a width of 1°C. ...... 139
Figure 7.2. Mean absolute error (MAE) for each site as a function of elevation. ...... 139
Figure 7.3. Standardized regression coefficients used as weighting factors for each month and temperature measure. Values are normalized and all weighting factors sum to 1. ...... 149
Figure 7.4. Empirical semi-variogram function to model July maximum temperature variance. ...... 152
Figure 7.5. MAE for each mountain site relative to site elevation by method: (a) multivariate regression, (b) elevation-only, (c) topographic weighting, local regression, and (d) correlation-weighted regression. Best fit lines are shown where a solid line indicates a significant (0.05 level) slope, i.e., MAE varies with elevation. ...... 155


Figure 7.6. MAE by month and weather type for the 6 interpolation methods for mountain sites. ...... 156
Figure 7.7. Binned residual counts for the six interpolation methods. Vertical scale is logarithmic. ...... 157
Figure 7.8. Lapse-rate distribution by temperature measure for the global and local regression methods. The dashed vertical line indicates the environmental lapse rate of 6.5 °C/km. ...... 159
Figure 7.9. Mean absolute error (°C) distribution maps for mean, maximum, and minimum temperatures, for all stations (a) and mountain sites (b). ...... 161


LIST OF TABLES

Table 2.1. Interpolation methods, example studies and limitations/benefits. ...... 21
Table 3.1. Annual average of daily maximum, minimum, and average temperatures at the Environment Canada Calgary airport station during the FCA data collection period. ...... 26
Table 3.2. Number of sites by year and month from setup in July 2004 to takedown in December 2010. ...... 27
Table 3.3. Test result statistics by year showing the total number of tests performed, the average difference between the hourly logger and WRS temperatures for all hours 00h to 24h, hours between 10h and 16h, and 00h and 06h. ...... 30
Table 3.4. Average offset (logger-WRS temperature) by year for loggers undergoing four or five calibration tests. ...... 32
Table 3.5. The average hourly temperature difference between loggers and the WRS at different wind speeds and incoming shortwave radiation values. Darker shading indicates a bigger positive difference. ...... 36
Table 3.6. Number of site/days failing each quality control test. A null flag indicates good data. ...... 48
Table 3.7. Number of sites with different percentages of available data. ...... 48
Table 4.1. Surface daily meteorological measurements (T - temperature, RH - relative humidity, P - pressure, uWE and vNS are east-west (west positive) and north-south (north positive) wind vectors, q - specific humidity) at Calgary for cA conditions on January 4 and 13, 2005 and surrounding days. ...... 53
Table 4.2. Surface daily meteorological measurements at Calgary for chinook conditions on January 30, 2009 and surrounding days. ...... 56
Table 4.3. Surface 6-hourly meteorological measurements at Calgary for chinook conditions on January 30, 2009 and surrounding days. ...... 56
Table 4.4. Surface daily meteorological measurements at Calgary for a cyclonic rain event occurring between June 5-8, 2005. q is specific humidity. ...... 59
Table 4.5. Surface daily meteorological measurements at Calgary for unseasonably warm conditions from August 15-18, 2008. ...... 60
Table 4.6. Selected qualitative surface meteorological characteristics associated with different weather types in ...... 61
Table 5.1. Meteorological variables used for principal component (PCA) and discriminant function analysis (DFA) to create weather types. ...... 68


Table 5.2. vNS15 – north-south wind vector at 15:00; uEW21 – east-west wind vector at 21:00; T – temperature; RH – relative humidity; q – specific humidity; DPT – dew point temperature; range – difference between daily minimum and maximum values; avg – daily average values calculated from hourly measures; trend – difference between value at 23:00 and 00:00 in a day. ...... 74
Table 5.3. Classification accuracy table. The percentage of correctly classified days for each weather type is shown in bold. ...... 76
Table 5.4. The number of days per month for each weather type. ...... 77
Table 5.5. The number of events for each weather type and the average and maximum duration. ...... 77
Table 5.6. Selected average standardized meteorological characteristics for each weather type with the top defining values shown in bold. ...... 78
Table 5.7. SSC weather types and the equivalent DFA types. ...... 89
Table 6.1. The number of days for each weather type grouped by cool (Nov-Feb), moderate (Mar, Apr, Sep, Oct), and warm months (May-Aug). ...... 101
Table 6.2. Mean correlation coefficients calculated between minimum temperature and site relative elevation are shown for multiple weather types, altitude ranges, and buffer sizes. Elevation bins indicate the upper altitude of the bin, e.g., LT1600 includes sites between 1400 and 1600 m. ...... 102
Table 6.3. Correlation between aggregated temperature measures and topographic attributes for mountain and prairie sites. Correlations for cddist, slp250 and RE are partial correlations, controlling for elevation. Darker shading indicates stronger negative coefficient values. ...... 105
Table 6.4. Correlation between daily temperature measures and elevation calculated by (a) month for prairie sites and (b) month and (c) weather type for mountain sites. ...... 106
Table 6.5. Correlation between daily temperature measures and location variables (xproj and yproj) calculated by (a) month and (b) weather type for mountain sites, and (c) month and (d) weather type for prairie sites. ...... 107
Table 6.6. Correlation between daily temperature measures and distance from the continental divide calculated by (a) month and (b) weather type for prairie sites, and (c) month and (d) weather type for mountain sites. ...... 108
Table 6.7. Correlation between daily temperature measures and terrain measures (relative elevation, slope angle and aspect) calculated by (a) month and (b) weather type for prairie sites, and (c) month and (d) weather type for mountain sites. ...... 110
Table 6.8. Average correlations for each temperature measure and period for the five most highly correlated neighbours. ...... 111


Table 6.9. Percentage of sites selected for each surface type. The majority percentage for each site surface type is shown in bold. ...... 120
Table 7.1. Overall RMSE (°C) by station class (mountain and prairie) for daily temperature measures (minimum, maximum, mean) estimated using the most highly-correlated station by correlation period (year, month, weather type). The percentage improvement using weather type correlations compared to annual and monthly correlations is also shown. ...... 137
Table 7.2. Mean absolute errors (°C) of temperature estimates calculated using the most highly correlated station by month (mnth) and weather type (wt), shown for each (a) month and (b) weather type. ...... 138
Table 7.3. Correlation coefficients between potential predictor variables for mountain sites. Absolute values exceeding 0.5 are shown in bold. ...... 140
Table 7.4. Coefficients and standard errors for the cool_Ch models with and without a spatial autocorrelation (SA) term. ...... 141
Table 7.5. Significant multivariate regression coefficients for Tmax. elevKM is the change in temperature for every kilometer elevation increase, cd10 is the change in temperature for every 10 kilometers moving east of the continental divide, x10 is the change in temperature for every 10 kilometers moving east, and asp10 is the change in temperature for every 10 degrees east or west of north. Land cover designations lst g, r and w are the changes in temperatures for grassland, rock, and wetland sites, respectively, relative to forested sites. ...... 143
Table 7.6. Significant multivariate regression coefficients for Tmean models for mountain sites. Predictor variables are the same as for Table 7.5. slp is the change in temperature for every degree incline from the horizontal sites. ...... 144
Table 7.7. Significant multivariate regression coefficients for Tmin models for mountain sites. Table definitions are as in Table 7.5. ...... 145
Table 7.8. Average mean absolute error, elevation coefficient (lapse rate °C/km), and adjusted R² for all daily mountain site models. ...... 146
Table 7.9. Mean absolute error (MAE) for multiple station selection and weighting methods used in local regression models for April 2008 minimum temperatures. Methods in bold are used on the full dataset. ...... 148
Table 7.10. Correlations between topographic attributes at prairie sites. Absolute values exceeding 0.5 are shown in bold. ...... 150
Table 7.11. Adjusted R² and significant regression coefficients for elevation (elevKM), easting (x10km), northing (y10km), relative elevation (RE100m) and land surface type (LSTu) for prairie site temperature models by period. ...... 151


Table 7.12. Overall error statistics by temperature measure for all interpolation methods. globalElev – global elevation linear regression; multiMnth – global multivariate regression models with predictors varied monthly; multiWT – global multivariate regression models with predictors varied by weather type; localElev – local elevation linear regression; topoWgt – local weighted elevation regression models with weights calculated by site topographic similarity measures; corrWgt – local weighted elevation regression models with weights calculated by site temperature correlations; TTmnth – temperature/temperature regressions using correlated stations calculated by month; TTwt – temperature/temperature regressions using correlated stations calculated by weather type; krig – temperatures estimated using kriging. ...... 153
Table 7.13. Median lapse rates (°C/km) for the global elevation, multivariate parameterised by month and weather type, local elevation and local topographically weighted regression models. ...... 158


ABBREVIATIONS AND SYMBOLS

ANOVA – analysis of variance
CA – cluster analysis
DALR – dry adiabatic lapse rate
DEM – digital elevation model
DFA – discriminant function analysis
DPT – dew point temperature
DST – daylight saving time
EC – Environment Canada
ELR – environmental lapse rate
ENSO – El Niño Southern Oscillation
FCA – Foothills Climate Array
GCM – Global Climate Model
GIDS – Gradient Inverse Distance Squared
GPS – Global Positioning System
IDW – inverse distance weighting
LAI – leaf area index
LST – land surface type
MAE – mean absolute error
ME – mean error
MST – mountain standard time
NCEP – National Centers for Environmental Prediction
NN – nearest neighbour
NRCAN – Natural Resources Canada
P – pressure
PCA – principal component analysis
PNA – Pacific / North American teleconnection
RE – relative elevation
RH – relative humidity
RMSE – root mean square error
SALR – saturated adiabatic lapse rate
SOM – self-organizing map
SSC – spatial synoptic classification


WD – wind direction in degrees measured from north
WMO – World Meteorological Organization
WRS – University of Calgary weather research station
WS – wind speed in m/s or km/h

asp500 – aspect calculated from a 500 m resolution DEM
aspect – difference in aspect (°)
CD – cold dry weather type
cddist – difference in the distance to the continental divide (km)
Ch – chinook weather
cp – specific heat capacity (J/kg°C) at a constant pressure
cv – specific heat capacity (J/kg°C) at a constant volume
CW – cool wet weather
d – change in a measure, e.g., dP is the change in pressure
DPTrange – difference between the daily maximum and minimum dew point temperature
g – gravitational acceleration (m s⁻²)
hsep – straight-line horizontal distance between stations (km)
Ht – hot high pressure ridge weather
L↑ – outgoing longwave radiation (W m⁻²)
L↓ – incoming longwave radiation (W m⁻²)
lstg – grass/shrub land surface
lstr – rock/rubble land surface
lstu – urban land surface
lstw – wetland land surface
Nl – normal weather
P – pressure
Q – heat added to the air mass
q – specific humidity (g/kg)
qavg – daily average specific humidity
R – gas constant of dry air
ρ – air density (kg/m³)
R² – coefficient of determination
RE – difference in cold-air ponding potential (m)


RHavg – daily average relative humidity (%)
RHrange – difference between the daily maximum and minimum relative humidity (%)
RHtrend – difference between relative humidity at 23:00 and 00:00 hours (%)
Rn – net all-wave radiation (W m⁻²)
S↑ – outgoing shortwave radiation (W m⁻²)
S↓ – incoming shortwave radiation (W m⁻²)
slope – difference in slope angle between sites (°)
slp250 – slope calculated from a 250 m resolution DEM
Sn – net shortwave radiation
Tavg – daily average temperature at Calgary calculated as the average of hourly values (°C)
TavgStd – standardized temperature
Tmax – daily maximum temperature (°C)
Tmean – daily mean temperature calculated as the average of hourly values (°C)
Tmin – daily minimum temperature (°C)
Tr – transition weather – cold front
Trange – difference between the daily maximum and minimum temperature
Ts – surface temperature (°C)
Ttrend – difference between temperature at 23:00 and 00:00 hours
u – east-west wind vector (m/s or km/h)
U – internal energy of an air parcel
uEW21 – east-west wind speed at 21:00 hours (m/s or km/h)
v – north-south wind vector (m/s or km/h)
vNS15 – north-south wind speed at 15:00 hours (m/s or km/h)
vsep – difference in altitude between sites (m)
W – work done on or by the air mass
xproj – longitude in UTM projected coordinates (m)
yproj – latitude in UTM projected coordinates (m)
z – height
εs – surface emissivity
σ – standard deviation
σ – Stefan-Boltzmann constant
δ – difference between attribute values for site pairs
µ – mean


1. INTRODUCTION

Weather and climate can be studied at a range of scales, both spatially and temporally. For studying long term global changes, annual means of a particular weather variable may suffice. For applications such as hydrological or ecological studies, higher spatial resolutions and hourly, daily, or monthly measures are more appropriate. Numerous modelling applications use surface temperature as an input variable, e.g., forest fire monitoring, hydrology (snow melt), agriculture (growing degree days) and ecology (pest control; vegetation phenology). However, weather stations are seldom of high spatial density and they do not usually sample variations in topography, tending to be concentrated at lower elevations and near population centres. Therefore, these applications rely on interpolated temperatures for input and commonly use lapse rate adjustments to estimate temperature.

Lapse rate refers to the rate of change of meteorological variables with altitude. Temperature generally decreases with increasing altitude. The global environmental lapse rate is given as ˗6.5°C/km (Barry, 2001), but lapse rates have been shown to vary by temperature measure (minimum or maximum), by month (e.g., Blandford et al., 2008), and by weather type (Cullen and Marshall, 2011; Pepin, 2001). In addition, understanding how climate varies at different scales under the influence of different weather systems can help in improving global climate model (GCM) simulations, improve on techniques used to downscale GCM output to regional-scale agricultural, hydrological, and ecological studies, and provide a better understanding of how climate change may impact local areas having complex topography (Beniston, 2006; Daly et al., 2010).
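As a minimal illustration of a lapse rate adjustment, the following Python sketch shifts a reference-station temperature to a target elevation using a single constant lapse rate. The function name, the example station values, and the use of the global value of −6.5°C/km as a default are assumptions made for illustration only; this is not the method developed in this thesis, and the appropriate lapse rate may vary by temperature measure, month, and weather type.

```python
def adjust_temperature(t_ref_c, z_ref_m, z_target_m, lapse_rate_c_per_km=-6.5):
    """Estimate temperature at a target elevation from a reference station.

    lapse_rate_c_per_km is negative when temperature decreases with height.
    The default is the global environmental value quoted in the text.
    """
    dz_km = (z_target_m - z_ref_m) / 1000.0
    return t_ref_c + lapse_rate_c_per_km * dz_km


# Hypothetical example: 10 °C at a 1000 m station, target site at 2000 m.
estimate = adjust_temperature(10.0, 1000.0, 2000.0)
print(f"Estimated temperature at 2000 m: {estimate:.1f} °C")
```

Passing a different `lapse_rate_c_per_km` (for example, a positive value on an inversion day) changes the estimate accordingly, which is why the choice of lapse rate matters for interpolation.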

Factors impacting climate depend on the scale of interest (Barry, 2001; Daly, 2006). On a global scale, temperature decreases moving poleward from the equator and is affected by continentality. For regional studies, elevation is considered the dominant control, but on a local scale, topographic effects such as cold air drainage become important. Methods and variables used for temperature interpolation depend on the scale of the study. For example, Nalder and Wein (1998) used latitude, longitude, and elevation in estimating monthly means in the boreal forest in . For a mountainous area in the French Alps, Carrega (1995) used a variety of terrain measures (slope, aspect, relative elevation, altitude,


and distance from the ocean) as predictor variables in estimating mean monthly minimum and maximum temperatures, using multivariate regression equations. The significant predictor variables varied by month and temperature measure. On a daily scale, Stahl et al. (2006a) used multiple regression and weighted averaging methods with different lapse rate corrections to account for elevation effects. All studies found errors varied by month and temperature measure, with the largest errors during winter months and for minimum temperatures.

Models such as PRISM (parameter-elevation regressions on independent slope models), a climate interpolation model developed by Daly et al. (2002) and MTCLIM (Mountain microclimate simulation model, Hungerford et al., 1989) explicitly account for terrain influences on temperature. PRISM uses a weighted regression where stations having similar topographic characteristics (slope, aspect, and relief) are weighted more highly. MTCLIM adjusts values based on variations in solar radiation due to slope and aspect, as well as vegetation effects, as these affect the surface energy balance.

Most temperature interpolation methods are based on statistical techniques, sometimes without regard to terrain or elevation parameters (e.g., inverse distance weighting; Nalder and Wein, 1998), sometimes including elevation adjustments (e.g., from regression-determined lapse rates (Cullen and Marshall, 2011) or using co-kriging with elevation (Ishida and Kawashima, 1993)), or in some cases using multivariate regression models rather than spatial interpolations (e.g., Carrega, 1995). In each example, there is little focus on the effect that different meteorological processes may have on temperature patterns. Interpolation models are often parameterised on a monthly basis (e.g., Daly et al., 2007; Nkemdirim, 1996). These studies show that estimation errors vary seasonally and monthly, indicating that factors affecting meteorological covariance vary in time.
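To make the simplest of these techniques concrete, the sketch below implements inverse distance weighting in Python with no terrain or elevation terms. The function name, the station tuple layout, and the default power of 2 are assumptions chosen for illustration; published implementations differ in how neighbours are selected and weighted.

```python
import math


def idw_estimate(target_xy, stations, power=2.0):
    """Inverse distance weighted estimate at target_xy.

    stations is a list of (x, y, value) tuples in any consistent
    projected coordinate system (hypothetical layout for this sketch).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            # Target coincides with a station: return its value directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations (x km, y km, temperature °C).
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 30.0, 5.0)]
print(f"{idw_estimate((2.0, 1.0), stations):.2f} °C")
```

Because the weights depend only on horizontal distance, two sites at very different elevations are treated identically, which is the limitation that elevation adjustments are intended to address.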

Studies have investigated temperature interpolation methods and lapse rate variability under different synoptic conditions, with varying success. Huth and Nemesova (1995) classified days into weather classes to estimate missing data using lapse rates calculated per weather class, and errors were reduced when compared with using lapse rates calculated using all data. Pepin et al. (1999) reported significant differences in lapse rates for different synoptic types in the UK, when types were defined for a local area, but less


difference using UK-wide weather classes. Consistent with this, Courault and Monestiez (1999) found little improvement using lapse rates calculated for European-scale circulation patterns, when applied to a study area in southeastern France. Blandford et al. (2008) studied basin-scale (10 000 km²) lapse rates on a seasonal, monthly, and weather-type basis and found only a weak relationship between weather types and lapse rates. Weather types used by Blandford et al. (2008) were spatial synoptic classes as defined by Sheridan (2002). In contrast, Cullen and Marshall (2011) reported significantly different lapse rates for two commonly occurring winter weather types in the , namely chinook winds and Arctic cold spells.

The influence of synoptic type on lapse rate varies by time of day as well (i.e., for minimum vs. maximum temperatures). These studies indicate that factors influencing spatial temperature variability will vary by weather type and temperature measure, but results are more significant using weather classification systems appropriate for the study area (i.e., local to regional scales), rather than large-scale air-mass classifications.

The Foothills Climate Array (FCA) study covers an area of approximately 24 000 km² in southwestern Alberta, extending from the in the west to 50 km east of Calgary. Terrain altitudes range from 800 m in the prairies to 3500 m in the mountains, and the FCA weather stations sample an elevation range from 890 to 2880 m. The landscape of the study area varies from alpine peaks and slopes, through rolling foothills composed of forests and meadows, to flat grassland and cropland in the prairies. Hourly temperature was recorded at ~220 stations between 2004 and 2010.

1.1 Thesis Objectives

The high spatial and temporal resolution of the FCA makes it ideal for studying surface meteorological variations at a regional scale. The primary goal of my research is to determine terrain influences on temperature under different regional weather systems, and to better understand spatial and temporal variations in temperature in the Alberta foothills. I have two main applications of this information: (i) to develop methods for estimating missing data, and (ii) to generate landscape-scale temperature models.


The first objective is necessary for many meteorological applications, where it is common for some data to be missing. This is even more true for automatic and backcountry weather stations such as the FCA instrument network. Gaps and erroneous values exist in the data due to instrument malfunctions, field disturbance, and other causes. Damage to sites and instrument failure have resulted in approximately 9% of the data being lost, leaving spatial and time gaps in the data. This limits the use of the data (e.g., determination of monthly or annual means). Stooksbury et al. (1999) showed that three-day data gaps can result in errors of ±1°C in the calculation of monthly means, with larger errors during winter and in continental interior locations. Missing or erroneous data can also cause poor performance in interpolation or modelling of temperature surfaces.

The second objective, temperature modelling, is widely used for studying natural systems that are affected by temperature, such as hydrology, ecology, and avalanche studies. In mountain regions, landscape-scale temperatures are typically estimated from remote, low-elevation reference stations, or they are ‘downscaled’ from coarse-resolution climate models (e.g., Schoof and Pryor, 2001; Hay and Clark, 2003). Within the FCA study domain fewer than ten long-term temperature records are available as reference stations for this kind of modelling (e.g., Environment Canada stations at Calgary, Banff, Kananaskis, Nakiska), while a typical present-day global climate model with a resolution of 1° would have about four grid cells covering the region. Landscape-scale temperature models based on such sparse data require a detailed understanding of how terrain effects and weather systems interact to govern spatial temperature patterns. The FCA dataset offers an opportunity to develop and test such models.

To address these research objectives, I develop a synoptic weather classification system using discriminant function analysis for the area. Spatial patterns and scales of temperature correlation are examined to determine topographic (e.g., slope and aspect) and physiographic (e.g., surface type) controls on temperature and how these vary as a function of weather system. This provides meteorologically-informed statistical methods for estimating weather variables where no data is available, as well as improved understanding of weather and climate patterns in complex terrain. The process reveals the underlying


structure of regional temperature variability in the study region and its relation to prevailing weather systems.

1.2 Thesis outline

I have written a ‘traditional’ thesis which builds through each chapter. My focus is on methods rather than applications. The contents are as follows:

Chapter 1: Introduction - rationale for the research, objectives, and thesis outline.

Chapter 2: Literature review: near surface temperature and its global, regional and local influences (solar, surface energy balance, winds, valleys, land surface type); statistical weather typing; temperature interpolation methods.

Chapter 3: Study area and data collection; equipment and calibration tests; data quality control.

Chapter 4: Meteorological characteristics of specific southwestern Alberta weather types.

Chapter 5: Statistical weather classification. I describe the development of a statistical weather classification system for southwestern Alberta using surface meteorological measurements. Spatial temperature patterns and lapse rates by weather type are described.

Chapter 6: Station pair temperature correlations are analysed in relation to station location, terrain and surface type characteristics. Results are used to determine regional and local controls on temperature and how these vary monthly and by weather type.

Chapter 7: I develop models for gap filling the FCA data and landscape temperature interpolation. Models are parameterized by month and weather type using different topographic, geographic and surface type measures as identified in Chapter 6. Analysis of error measures for different models by month and weather type indicates the most important topographic parameters. I also discuss the difference between monthly and weather type model performance and the successes and limitations of the weather typing.

Chapter 8: Conclusions and future work.


2. BACKGROUND

Primarily this research is about near-surface temperature variability under different weather conditions, and the creation of near-surface temperature models from point measurements. This chapter provides an overview of general climate processes and solar radiation, weather type classification systems, and interpolation methods. Greater detail about the specific methods used in this study is provided in the relevant chapters.

2.1 Climate processes and temperature

Processes occurring in the troposphere are the dominant cause of weather experienced on the Earth. The troposphere, the lowest layer in the Earth’s atmosphere, extends from the surface to between 9 (at the poles) and 12 (at the equator) kilometers and is characterized by vertical mixing and decreasing temperature with height. Atmospheric processes occur at a range of space and time scales. In general, space and time scales at which processes operate are related, i.e., spatially large processes occur over long time periods, e.g., Rossby waves in the upper troposphere extend for thousands of kilometers and can last for weeks. Similarly, spatially small processes occur on short time scales, e.g., thunderstorms extending over tens of kilometers last a few hours. In addition, atmospheric phenomena and the causative factors operate over a continuum of scales.

Differential heating of the Earth’s surface, whereby lower latitudes receive more solar radiation than higher latitudes, particularly during winter, occurs due to the tilt of the Earth’s axis and the orbit of the Earth around the Sun. Surface temperature gradients due to differential heating generate pressure gradients, and pressure gradients result in air movement, as air flows from areas of high pressure to areas of low pressure.

Atmospheric thermodynamics are discussed in detail in several texts (e.g., Tsonis, 2002). Here I define some basic physical principles related to temperature, and use these to show that the rate of change in temperature with altitude is a consequence of thermodynamics.

Air temperature is defined as a measure of the average kinetic energy (or speed) of the molecules in an air parcel (Ahrens, 2008). Absolute zero (0 K) refers to the temperature at which the motion of molecules ceases (kinetic energy = 0). At Earth surface temperatures, molecules are moving rapidly and temperature is a macroscopic measure of the average


kinetic energy of billions of molecules of the gases that make up air. The internal energy of an air parcel with mass m is defined to be

$$ U = m c_v T, \tag{1} $$

for temperature T and specific heat capacity cv (J kg⁻¹ °C⁻¹) at a constant volume.

Air parcels follow the ideal gas law in equation 2,

$$ P = \rho R T, \tag{2} $$

where P is pressure, R is the gas constant of dry air, and ρ is air density (ρ = m/V for an air parcel with mass m and volume V). Warming a parcel of air causes it to expand, making the air less dense. Similarly, cooling an air mass causes an increase in air density.

Pressure decreases with increasing height and the rate of change varies according to the hydrostatic equation (3).

dP/dz = ρg, (3) where dP is the change in pressure with height (dz) and g is gravitational acceleration.

2.1.1 Temperature Lapse Rates

Lapse rates are defined as the change in meteorological variables with height. In this thesis I am primarily concerned with temperature, so I use the term ‘lapse rate’ to refer to the change of temperature with altitude. Temperature generally decreases with height, as rising air expands due to decreasing pressure. Thermodynamically, the change in temperature with elevation is derived by considering conservation of energy and the rates of change of pressure, density, and internal energy with elevation in the atmosphere.

The first law of thermodynamics can be written as

$$ dU = dQ - W, \tag{4} $$

where dU is the change in internal energy, dQ is heat added to the air mass (e.g., through radiation, sensible heat flux, or condensation of water vapour), and W expresses the work done on or by the air mass and is equal to PdV.


If no energy is added to the air parcel, e.g., through radiation or through latent energy release associated with changes in state, dQ = 0, therefore dU = −W. Substituting the first term with equation 1 yields

$$ m c_v \, dT = -P \, dV. \tag{5} $$

Expressing the change in PV as d(PV) = PdV + VdP and replacing the second term in equation 5 yields

$$ m c_v \, dT = V \, dP - d(PV). \tag{6} $$

Writing the ideal gas law in equation 2 as PV = mRT, the change in PV can be expressed as d(PV) = d(mRT) = mRdT, and replacing the third term in equation 6 yields

$$ m c_v \, dT = V \, dP - m R \, dT. \tag{7} $$

Rearranging terms in equation 7 gives VdP = m(cv + R)dT, where cv + R = cp, the specific heat capacity at a constant pressure. Using the relationship ρ = m/V, equation 7 can be written as

$$ \frac{dP}{\rho} = c_p \, dT. \tag{8} $$

Raising the air parcel by an amount dz and replacing the first term with the hydrostatic equation (3), in which pressure decreases with height so that dP = −ρg dz, yields −g dz = cp dT.

This gives what is called the dry adiabatic lapse rate (DALR),

$$ \frac{dT}{dz} = -\frac{g}{c_p} = -9.8\,^{\circ}\mathrm{C/km}. \tag{9} $$
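The numerical value of the DALR can be checked with a short Python sketch. The constants below (cv ≈ 717 J kg⁻¹ K⁻¹, R ≈ 287 J kg⁻¹ K⁻¹, g = 9.81 m s⁻²) are commonly quoted approximate values for dry air and are assumptions of this example rather than values stated in the thesis.

```python
G = 9.81        # gravitational acceleration, m s^-2 (assumed)
CV_DRY = 717.0  # specific heat at constant volume, J kg^-1 K^-1 (assumed)
R_DRY = 287.0   # gas constant of dry air, J kg^-1 K^-1 (assumed)

# Specific heat at constant pressure follows from cp = cv + R.
CP_DRY = CV_DRY + R_DRY


def dry_adiabatic_lapse_rate(g=G, cp=CP_DRY):
    """Return dT/dz = -g / cp converted from K/m to °C/km."""
    return -g / cp * 1000.0


dalr = dry_adiabatic_lapse_rate()
print(f"cp = {CP_DRY:.0f} J/kg/K, DALR = {dalr:.2f} °C/km")

# Temperature of an unsaturated parcel lifted 1.5 km from 20 °C.
print(f"parcel temperature after 1.5 km ascent: {20.0 + dalr * 1.5:.1f} °C")
```

With these inputs the result is close to −9.8°C/km; small differences reflect the rounding of g and cp.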

The DALR is a constant and is exact, but of course it is an approximation to assume that an air parcel will not exchange heat with its environment. The DALR nonetheless gives a good estimate of the change of temperature in unsaturated air that is rising or sinking. As air rises and cools under saturated conditions, water condenses and releases latent energy: dQ > 0. This reduces the rate of cooling in equation 9. The actual rate of cooling depends on the temperature of the air, as warmer air can hold more water vapour and releases more latent energy, reducing the rate of cooling. Cold saturated air cools at a saturated adiabatic


lapse rate (SALR) close to that of the DALR, whereas warm saturated air cools at a SALR close to −5°C/km.

It should also be noted that these processes and equations refer to the free atmosphere, with the lapse rates calculated as a function of the behaviour of an ideal gas. Surface temperatures are not the same as those in the free air, but are driven by the surface energy balance at a location: the net energy that is available in association with fluxes of radiation, sensible, and latent heat (e.g., Oke, 1987). Excesses of energy will warm the surface, while it cools in response to an energy deficit. The environmental lapse rate (ELR) is different from the adiabatic lapse rates experienced by air that is rising or sinking. It is the actual rate of cooling that occurs with altitude, and varies with location, time and altitude. On average, the ELR has a value of about −6.5°C/km in the troposphere. The near-surface air temperature that is measured by the FCA instruments, typically at a height of 1.5 m, responds to a combination of influences of the local surface temperature and the larger-scale, free-air atmospheric conditions. Hence lapse rates as measured by the near-surface conditions in the FCA should not be expected to adhere to free-air lapse rates.
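A near-surface lapse rate can be estimated from station data as the least-squares slope of temperature against elevation. The Python sketch below illustrates this under stated assumptions: the function name and both sets of station values are hypothetical, and a single linear fit ignores the other terrain influences discussed in this thesis.

```python
def lapse_rate_from_stations(elevations_m, temperatures_c):
    """Least-squares slope of temperature against elevation, in °C/km.

    Negative values mean temperature decreases with height;
    positive values indicate a temperature inversion.
    """
    n = len(elevations_m)
    mean_z = sum(elevations_m) / n
    mean_t = sum(temperatures_c) / n
    sxy = sum((z - mean_z) * (t - mean_t)
              for z, t in zip(elevations_m, temperatures_c))
    sxx = sum((z - mean_z) ** 2 for z in elevations_m)
    return sxy / sxx * 1000.0


# Hypothetical station elevations (m) and temperatures (°C) for two days.
elevations = [900.0, 1400.0, 1900.0, 2400.0]
mixed_day = [12.0, 9.0, 6.0, 3.0]             # cooling with height
inversion_day = [-20.0, -17.0, -15.0, -14.0]  # cold air at low elevations

print(f"mixed day: {lapse_rate_from_stations(elevations, mixed_day):.1f} °C/km")
print(f"inversion day: {lapse_rate_from_stations(elevations, inversion_day):.1f} °C/km")
```

The second case shows why a fixed free-air value can be a poor choice for near-surface data: the fitted slope changes sign when cold air occupies the lower elevations.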

2.1.2 Surface Energy Fluxes

Surface energy and diabatic heat exchange with air masses are largely associated with radiation fluxes. The net all-wave radiation, Rn, can be expressed as a function of incoming (S↓) and outgoing (S↑) shortwave radiation and incoming (L↓) and outgoing (L↑) longwave radiation:

$$ R_n = (S{\downarrow} - S{\uparrow}) + (L{\downarrow} - L{\uparrow}). \tag{10} $$

Net shortwave radiation, Sn, is the difference between incoming shortwave radiation (S↓) and outgoing shortwave radiation (S↑). S↓ consists of three components, direct, diffuse and reflected, as shown in Figure 2.1.


Figure 2.1. Solar radiation components: (1) direct, (2) diffuse, and (3) reflected direct (Kumar et al., 1997).

Potential S↓ follows regular diurnal and annual cycles, with the value being a function of latitude, day of year, and time of day. Local slope configuration (angle and aspect) influences these values. Slope and aspect effects are most important for determining the amount of direct S↓ under clear-sky conditions. Figure 2.2 shows the percentage of annual direct radiation received on north- and south-facing slopes of 10° and 30° in the northern hemisphere relative to a horizontal surface. Under cloudy skies, diffuse radiation is the most important component of S↓ and the sky view factor influences the amount received. Sky view factor refers to the proportion of sky visible from a location relative to the full hemispheric view. Slope and aspect again affect this measure, but there is no preferential slope or aspect effect. S↑ is a function of albedo, the ratio of outgoing to incoming shortwave radiation. For example, dark, wet soil has an albedo of ~0.05, coniferous forest 0.05-0.2, and fresh snow 0.95 (Oke, 1987). Therefore S↑ will be greater over snow-covered surfaces, reducing Sn.
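The effect of albedo on net shortwave radiation can be illustrated with a short Python sketch. The albedo values are those quoted above from Oke (1987); the incoming flux of 500 W m⁻² and the function name are hypothetical values chosen only to show the calculation.

```python
def net_shortwave(incoming_w_m2, albedo):
    """Net shortwave radiation: incoming minus reflected (albedo * incoming)."""
    outgoing = albedo * incoming_w_m2
    return incoming_w_m2 - outgoing


incoming = 500.0  # hypothetical incoming shortwave radiation, W m^-2

# Albedo values as quoted in the text (Oke, 1987).
surfaces = {"dark wet soil": 0.05, "fresh snow": 0.95}
for name, albedo in surfaces.items():
    print(f"{name}: net shortwave = {net_shortwave(incoming, albedo):.0f} W/m^2")
```

For the same incoming flux, the snow surface retains only a small fraction of the energy absorbed by dark soil, which is the reduction in net shortwave radiation described above.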


Figure 2.2. Variation in annual incoming shortwave radiation as a function of slope angle for north and south slopes in the northern hemisphere at latitudes ranging from 0 to 90° (Barry, 2001).

Net longwave radiation, Ln, is the difference between incoming longwave radiation (L↓) and outgoing longwave radiation (L↑). Warmer surface temperatures increase L↑ as per the Stefan-Boltzmann law

$$ L{\uparrow} = \varepsilon_s \sigma T_s^4, \tag{11} $$

where Ts is the surface temperature, εs the surface emissivity, and σ the Stefan-Boltzmann constant. L↓ is determined by atmospheric properties, as these determine the amount of absorbed longwave radiation which is then re-radiated. High water vapour content and cloudy skies increase L↓. Net radiation is dominated by incoming solar radiation during the day, except under cloudy conditions where a large amount is reflected before reaching the Earth’s surface, and over surfaces having a high albedo. During winter and over snow-covered surfaces, outgoing longwave radiation may exceed net shortwave radiation, resulting in a negative daily radiation balance. Negative radiation balances are also typical overnight.
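The radiation balance can be sketched numerically in Python as follows. The example evaluates outgoing longwave radiation from the Stefan-Boltzmann law, with surface temperature converted to kelvin, and then sums the four radiation terms of equation 10. The emissivity of 0.95 and all flux values are hypothetical inputs chosen for illustration; only the Stefan-Boltzmann constant is a standard physical value.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def outgoing_longwave(surface_temp_c, emissivity):
    """Outgoing longwave radiation (W m^-2) from the Stefan-Boltzmann law."""
    temp_k = surface_temp_c + 273.15  # the law requires absolute temperature
    return emissivity * SIGMA * temp_k ** 4


def net_radiation(s_in, s_out, l_in, l_out):
    """Net all-wave radiation: (S_in - S_out) + (L_in - L_out)."""
    return (s_in - s_out) + (l_in - l_out)


# Hypothetical clear summer midday over a surface at 15 °C.
l_out_day = outgoing_longwave(15.0, emissivity=0.95)
print(f"day:   Rn = {net_radiation(600.0, 120.0, 300.0, l_out_day):.0f} W/m^2")

# Hypothetical clear winter night over snow at -20 °C (no shortwave input).
l_out_night = outgoing_longwave(-20.0, emissivity=0.95)
print(f"night: Rn = {net_radiation(0.0, 0.0, 180.0, l_out_night):.0f} W/m^2")
```

With these inputs the daytime balance is positive and the night-time balance is negative, matching the qualitative behaviour described in the text.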

Under calm, sunny conditions during the day, strong surface heating by S↓ occurs, resulting in positive net radiation. Rn is partitioned into sensible and latent heat fluxes. Latent heat is the energy consumed or released during phase changes, i.e., heat is released during


condensation and consumed during evaporation. Latent heat exchanges depend on the moisture content of the air and surface. Sensible heat results in temperature changes. As the ground is heated it warms a thin layer of air directly in contact with the ground. Convective processes result in the dispersal of this warm air.

The ratio of sensible to latent heat is termed the Bowen ratio and is a function of surface type and soil moisture content. Dry un-vegetated surfaces have a high Bowen ratio, indicating more sensible heat available for heating the air in contact with the surface. Turbulent flow develops in the planetary boundary layer, and during the day under strong heating, the layer can extend 1-2 km (Barry and Chorley, 2010). This allows convective transfer of sensible heat, thereby increasing near-surface temperature. Vegetated surfaces or moist soils will have a lower Bowen ratio, indicating a stronger latent heat term due to evaporative processes. With less sensible heat available, near-surface temperatures are lower over moist surfaces. When relative humidity is high and under moist soil conditions, latent heat is released through condensation, resulting in warmer overnight temperatures. Lee slopes of mountain ranges tend to be drier, resulting in a reduced latent heat flux component.
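The role of the Bowen ratio can be illustrated with a small Python sketch that splits net radiation into sensible and latent heat fluxes. It assumes that the two turbulent fluxes sum to the net radiation, neglecting ground heat flux and storage; that simplification, the function name, and the example values are assumptions made for illustration.

```python
def partition_net_radiation(net_radiation_w_m2, bowen_ratio):
    """Split net radiation into (sensible, latent) heat fluxes.

    Assumes sensible + latent = net radiation (ground heat flux neglected)
    and bowen_ratio = sensible / latent.
    """
    latent = net_radiation_w_m2 / (1.0 + bowen_ratio)
    sensible = net_radiation_w_m2 - latent
    return sensible, latent


# Hypothetical midday net radiation of 400 W m^-2 over two surface types.
for surface, bowen in [("dry, unvegetated", 3.0), ("moist, vegetated", 0.5)]:
    sensible, latent = partition_net_radiation(400.0, bowen)
    print(f"{surface}: sensible = {sensible:.0f}, latent = {latent:.0f} W/m^2")
```

The higher Bowen ratio leaves more energy as sensible heat, which is why near-surface air warms more over dry surfaces than over moist ones for the same net radiation.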

2.1.3 Terrain Influences

Varying slope geometry in mountainous areas results in differential surface heating and cooling generating mountain wind systems (Whiteman et al., 1989). Mountain winds can have a distinct diurnal cycle, with air being drawn into the mountains during the day and flowing away from the mountains during the night. Cross-slope winds develop in valleys oriented north-south, with east-facing slopes warming preferentially in the morning, resulting in west winds perpendicular to the valley axis. The process is reversed in the afternoon allowing cross-slope east winds to develop. Mountain slopes oriented towards the sun are heated during the day, warming the air in contact with the slope and generating upslope winds. Upslope winds and cross-slope winds develop under the influence of a net positive radiation balance resulting from strong incoming shortwave radiation. A reverse downslope circulation develops at night due to a net negative radiation balance, resulting from strong outgoing longwave radiation.


Mountain and valley winds develop in response to slope winds. Away from the heating source, warmed slope air descends into the valley, heating valley air through sensible heat flow. The heated valley air is warmer than the plains air and results in an up-valley wind during the day. At night, slopes cool due to longwave radiation emission, creating a cool dense air layer which flows downslope, resulting in down-valley winds. Wind speed and direction will be a function of local slope geometries and net radiation. Mountain wind systems are strongest under cloudless, calm conditions, allowing strong shortwave radiative heating during the day and longwave cooling during the night, creating pressure gradients.

Local terrain depressions and valley bottoms may act as traps for sinking cold air. Cold air commonly drains downslope and ponds overnight, with minimum temperatures recorded late in the night (or close to sunrise), after many hours of cumulative cooling or cold air drainage and pooling. Particularly during winter when radiative heating is weak, daytime thermally induced flows may not be strong enough to move or heat these cold pools, resulting in persistent cold spots and temperature inversions in complex terrain.

Factors influencing temperature, as discussed, suggest topographic and terrain characteristics that need to be addressed when creating temperature interpolation models. These include land surface type, slope aspect and angle, and terrain relief.

2.2 Weather Typing

Synoptic climatology deals with the relationship between atmospheric circulation and the surface environment (Barry and Perry, 1973; Yarnal et al., 2001). Synoptic typing or classification, whereby classes identifiable with circulation patterns are identified using surface and upper air meteorological measurements, is commonly used for these kinds of investigations. This allows multiple complex variables to be treated as a single unit and assumes the surface response varies systematically/similarly for different synoptic classes. The goal is to minimize variability within groups and maximize variability between groups. Classifying weather types assumes that: (i) weather types exhibit sufficiently separate meteorological signatures, and (ii) types can be uniquely defined using the selected variables.
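The goal of minimizing within-group variability and maximizing between-group variability can be quantified for a single variable as a ratio of sums of squares. The Python sketch below is an illustrative diagnostic only, not the classification procedure used in this thesis; the function name and the daily mean temperatures are hypothetical.

```python
from statistics import mean


def variance_ratio(groups):
    """Ratio of between-group to within-group sum of squares.

    groups is a list of lists, one list of values per weather class.
    Larger ratios indicate better separated classes for this variable.
    """
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    between = sum(len(group) * (mean(group) - grand_mean) ** 2
                  for group in groups)
    within = sum((value - mean(group)) ** 2
                 for group in groups for value in group)
    return between / within


# Hypothetical daily mean temperatures (°C) grouped two different ways.
well_separated = [[-20.0, -18.0, -22.0], [10.0, 12.0, 8.0]]
poorly_separated = [[-20.0, 10.0, -18.0], [12.0, -22.0, 8.0]]

print(f"well separated:   {variance_ratio(well_separated):.2f}")
print(f"poorly separated: {variance_ratio(poorly_separated):.2f}")
```

The same six values give a much larger ratio when days are grouped by similar conditions, which is the property a useful weather classification should have.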


Synoptic classifications have been used extensively for a variety of purposes in meteorological studies. An early use was for weather forecasting, but more recently classifications have been used to investigate extreme weather events (Casola and Wallace, 2007; Cheng et al., 2010), air quality (Kalkstein and Corrigan, 1986), water quality (Sheridan et al., 2013), and for downscaling studies (Enke and Spekat, 1997). Classifications have been carried out at planetary (Konrad, 1998), continental (Courault and Monestiez, 1999), regional (Esteban et al., 2006) and local (Pepin et al., 1999) scales.

Huth et al. (2008) reviewed classification methods used to create atmospheric circulation patterns. Three major forms of classification were identified, namely subjective, mixed, and objective. Subjective and mixed methods require knowledge of weather types for a region or use physical thresholds, such as flow direction and strength, to define types. Objective or automated methods apply some form of statistical analysis, for example principal component analysis and cluster analysis, to create classes. A criticism leveled at subjective or manual methods is that they are time consuming and less repeatable (Yarnal et al., 2001; Sheridan, 2002). However, automated methods rely on several subjective decisions during the process. Therefore, they may be more repeatable than manual methods, but they are seldom truly objective (Huth et al., 2008). In addition, results from automated classifications may be more difficult to interpret than those from manual classifications (Sheridan, 2002).

Manual classification methods have been used to identify specific weather events using one or more specified criteria. For example, Nkemdirim (1996) analyzed frequency and strength of chinook wind events in southern Alberta, where chinook days were identified as having a higher than normal daily maximum temperature, a rapid drop in relative humidity, wind speed > 4.5 m/s and wind direction between SSW and WNW. Similarly, Cullen and Marshall (2011) identified chinook events based on an abrupt rise and then fall from unseasonably high temperatures, low relative humidity, gusty west winds, and a drop in pressure. In the same study, cold polar air events were identified as days having an average temperature below −15°C, accompanied by high pressure. These kinds of classifications require events with well-defined surface meteorological characteristics, and only classify a subset of days.
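A threshold-based rule of this kind can be expressed in a few lines of Python. The sketch below is loosely modelled on the chinook criteria attributed to Nkemdirim (1996) above. The wind speed threshold (4.5 m/s) and direction sector (SSW to WNW, taken here as 202.5° to 292.5°) follow the text, whereas the function name and the 20% threshold used for a "rapid drop" in relative humidity are hypothetical placeholders, since the source does not give a value.

```python
def is_chinook_day(tmax_c, normal_tmax_c, rh_drop_pct,
                   wind_speed_ms, wind_dir_deg, min_rh_drop_pct=20.0):
    """Return True if a day meets all four illustrative chinook criteria.

    min_rh_drop_pct is a hypothetical threshold for a 'rapid drop' in
    relative humidity; it is not taken from the cited study.
    """
    warmer_than_normal = tmax_c > normal_tmax_c
    humidity_dropped = rh_drop_pct >= min_rh_drop_pct
    windy = wind_speed_ms > 4.5
    # SSW (202.5°) to WNW (292.5°), measured clockwise from north.
    westerly = 202.5 <= wind_dir_deg <= 292.5
    return warmer_than_normal and humidity_dropped and windy and westerly


# Hypothetical winter days.
print(is_chinook_day(8.0, -2.0, 35.0, 9.0, 250.0))   # warm, dry, strong west wind
print(is_chinook_day(8.0, -2.0, 35.0, 9.0, 90.0))    # same, but easterly wind
```

Days that fail any one criterion are left unclassified, which reflects the point made above that such rules classify only a subset of days.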


Hybrid classification methods involve subjective identification of ‘type’ days for each specified class and then use the characteristics of the type days to classify all days into classes using objective statistical methods, e.g., discriminant function analysis (DFA). McCutchan (1978) used 500-mb height measurements, sea level pressure, dew point temperature, and daily maximum temperature anomalies to identify and classify days into five different weather classes in southern California for May to October. A subset of the days was used to derive discriminant functions using further surface and upper-air measurements. The remaining days, those not used to train the model, were classified using the discriminant functions. Each day was classified according to the class having the highest probability of occurrence, taking into account the relative occurrence frequencies for each class. The discriminant functions were able to correctly predict more than 80% of the subjectively classified days.

Sheridan (2002) developed a hybrid weather type classification scheme (SSC2) for North America, based on the seasonal spatial synoptic classification (Kalkstein et al., 1996), whereby meteorological characteristics of seed days representative of weather types are extracted and used to classify remaining days into weather types based on air mass characteristics. The SSC2 identifies seven different synoptic-scale systems, based on their temperature and humidity characteristics, including transition days. Rather than the typical characterization of air masses based on their source region, SSC2 weather types are classified by local surface characteristics. Theoretical seed days are created using data from 14-day windows for each season. The 14-day period is selected to be the coldest (winter) and warmest (summer) two weeks in the year and two weeks in the middle for autumn and spring, and the seed day window period can be varied by location. Seed days for each class within each two-week period are selected using predefined characteristic conditions for the location. In this way extreme days are excluded. Bower et al. (2007) developed a similar classification for western Europe, illustrating a benefit of this kind of classification: it is not location specific.

Methods for objective statistical classifications include different clustering techniques, principal component analysis (PCA), self-organizing maps (SOMs) and specification of thresholds. Classes may be based on spatial data, in which case a set of spatial patterns associated with different weather types is created. Alternatively, classes may be based on point data, whereby weather types are associated with a set of surface and optionally upper air measurements.

Spatial classifications using gridded surface pressure use either map correlation (Kirchhofer technique) or principal component analysis and cluster analysis methods. The Kirchhofer method has been criticized for the subjective choice of significant correlation values. Frakes and Yarnal (1997) attempted to resolve this issue by using a hybrid method whereby grid patterns for known synoptic types were predefined and used to pattern match remaining grids. The Kirchhofer method has been applied to investigate the link between snowpack anomalies and synoptic type frequencies and variations in synoptic and planetary scale indices e.g., ENSO and PNA (Moore and McKendry, 1996). Stahl et al. (2006b) found temperature and precipitation anomalies in were preferentially associated with synoptic weather systems classified using PCA.

Casola and Wallace (2007) applied Ward’s clustering to spatial gridded geopotential height data to identify patterns associated with extreme cold weather in the Pacific-North American sector. However, applying the method to other areas and weather associations did not produce consistent patterns. Bardosy et al. (2002) used fuzzy rules based on thresholds to identify circulation patterns associated with different temperature and precipitation characteristics. Circulation patterns and the fuzzy rules were found to be different for the best identification of temperature and precipitation groups. Cassano et al. (2006) used SOMs to identify circulation patterns associated with extreme weather events in Alaska. Physically reasonable patterns were identifiable, where the large scale patterns were representative of spatial pressure patterns rather than surface weather conditions. Extreme wind and temperature events were found to be preferentially associated with particular patterns. The authors conclude that this indicates the method can be used to investigate links between large scale patterns and local weather conditions. However, an earlier study by Courault and Monestiez (1999) found daily interpolated temperature surfaces were not improved when interpolation models were parameterized by circulation patterns. A block clustering method was used to create classes from gridded geopotential height data covering Europe. The 1000 hPa level was used as it showed the highest
correlation with surface climate measurements. However, classes were not necessarily relevant to the local study area, and in fact were not able to be linked with known local weather types.

Beck and Philipp (2010) compared multiple circulation type classification methods developed for the North Atlantic European region. The methods included principal component analysis, pattern correlations, different clustering techniques, and manual classifications. The study found that no one method performed best for all applications. Performance varied by area, month and climate variable. Therefore, no single preferred method was recommended; rather the method should be selected based on the purpose of the study, and in some cases custom classifications are recommended. However, PCA or threshold-based methods performed best in terms of high between-class variability for temperature.

Studies using coarse scale weather variables for classifications, or regional weather types, found the link with local surface characteristics was not as strong as when more local classifications were used. For example, Jones et al., (2010) identified synoptic conditions conducive to in southern California using NCEP reanalysis data. However, the low resolution of the NCEP (2.5° ~250 km) dataset did not fully identify mesoscale and local topographic effects associated with the winds. Blandford et al., (2008) found basin scale lapse rates varied more by season than by SSC2 class. In addition, studies by Huth and Nemesova (1995) and Pepin et al. (1999) found local scale classifications more appropriate for investigations of local lapse rate variability. Therefore, I have chosen to create a weather type classification system for the FCA area using surface meteorological measurements at Calgary. Methods used for the classification include principal component, cluster and discriminant function analysis.

2.3 Interpolation and Regression Models

A number of techniques are available for interpolating data and the performance of the methods varies according to the spatial and temporal scale of the variable. Burrough and McDonnell (1998) conclude that most interpolation methods generate similar results when data points are closely spaced and plentiful, but with sparse data or rapidly varying terrain, the choice of method and parameters becomes more important. The techniques also need to account for the spatial and temporal variability of surface meteorological measurements under different weather systems.

Baltazar-Cervantes and Claridge (2002) compared linear interpolation, cubic splines, and Fourier series for infilling short periods of hourly temperature data. Linear interpolation was found to work best for short gaps between one and six hours. Spectral methods (Hocke and Kämpfer, 2009; Kondrashov and Ghil, 2006) can be applied to data series having a periodic signal. These methods interpolate time series data for individual data records.
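As an illustration of single-record linear infilling, the following is a minimal sketch and not the procedure of the cited studies. The function name `fill_short_gaps`, the use of `None` to mark missing hours, and the default `max_gap` of six hours (taken from the gap range quoted above) are assumptions made for the example.

```python
def fill_short_gaps(values, max_gap=6):
    """Linearly interpolate runs of None that are at most max_gap long.

    Gaps touching the start or end of the series, and gaps longer than
    max_gap, are left unfilled (assumed behaviour for this sketch).
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i                          # first missing index of this gap
        while i < n and filled[i] is None:
            i += 1
        end = i                            # first valid index after the gap, or n
        gap_len = end - start
        if start == 0 or end == n or gap_len > max_gap:
            continue                       # no bracketing values, or gap too long
        left, right = filled[start - 1], filled[end]
        step = (right - left) / (gap_len + 1)
        for k in range(gap_len):
            filled[start + k] = left + step * (k + 1)
    return filled
```

For example, `fill_short_gaps([1.0, None, None, 4.0])` fills the two missing hours on the straight line between the bracketing observations.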

Spatial interpolation is applied to datasets to create data at unsampled sites, thus creating a more spatially complete dataset. Common interpolation techniques include nearest neighbour (NN), inverse distance weighting (IDW), splines, kriging and regression.

The NN technique assigns the value of the closest sampled location to the unsampled location and is best suited to nominal data (Hartkamp, 1999). The influence of elevation on temperature must be accounted for in areas of varying topography, for example by using a constant lapse rate or one calculated from the data (Stahl et al., 2006a). Splines, kriging, and regression produce continuous surfaces and are appropriate for generating temperature data. The IDW technique considers a number of points within a neighbourhood of the unknown point location, with their contributions weighted inversely according to the distance between the point and the unknown location point. The rate of decay can be varied by applying a power function to the separation distance. Higher powers increase the weighting of nearer points. However, this relationship breaks down in areas of complex topography or surface heterogeneity, where horizontally close stations may have big vertical separations, introducing large temperature differences. Therefore, IDW is generally applied once temperatures have been converted to a common elevation (Dodson and Marks, 1997; Shen et al., 2001; Stahl et al., 2006a). This introduces the problem of first establishing an appropriate temperature/elevation relationship.
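The elevation-adjusted IDW procedure described above can be sketched as follows. This is an illustrative example only: the function name, the tuple layout of `stations`, the 0 m reference level, and the default lapse rate of -6.5 °C per km are assumptions, and in practice the temperature/elevation relationship would first have to be established from the data.

```python
import math

def idw_temperature(stations, x, y, elev, lapse_rate=-0.0065, power=2.0):
    """Estimate temperature at (x, y, elev) by IDW on elevation-adjusted values.

    stations: list of (x_m, y_m, elev_m, temp_c) tuples.
    lapse_rate: assumed constant temperature change per metre (deg C / m).
    power: exponent applied to distance; higher values favour nearer stations.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, s_elev, temp in stations:
        t_ref = temp - lapse_rate * s_elev      # reduce to the 0 m reference level
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            # Target coincides with a station: use that station's value directly.
            return t_ref + lapse_rate * elev
        weight = 1.0 / dist ** power
        weighted_sum += weight * t_ref
        weight_total += weight
    # Interpolate at the reference level, then return to the target elevation.
    return weighted_sum / weight_total + lapse_rate * elev
```

Setting `power=2.0` gives inverse distance squared weighting; the horizontal weighting is only meaningful once the vertical temperature differences have been removed, which is why the adjustment is applied before the weights.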

For large areas with small variations in topography, a combination of multiple regression using horizontal location and elevation and inverse distance squared weighted interpolation (Gradient Inverse Distance Squared or GIDS) has been used (Nalder and Wein, 1998). The splining method fits a polynomial function between sampled points to generate a smooth surface. Thin-plate smoothing splines are best suited to modelling smoothly varying surfaces, e.g., monthly mean temperatures rather than daily values (Shen et al., 2001). As such they have been used to interpolate global monthly precipitation and temperature values (Hijmans et al., 2005). However, even when elevation is included as an independent variable along with latitude and longitude, the resulting surface models the global shape rather than matching prediction points (Shen et al., 2001).
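The GIDS approach mentioned at the start of this paragraph can be sketched as below. This is a hedged illustration of the estimator as it is commonly written, not code from Nalder and Wein (1998): the gradient coefficients `cx`, `cy`, and `ce` are assumed to have been obtained beforehand from a multiple linear regression of temperature on easting, northing, and elevation, and all names are hypothetical.

```python
import math

def gids_estimate(stations, x, y, elev, cx, cy, ce):
    """Gradient-plus-inverse-distance-squared estimate at (x, y, elev).

    stations: list of (x, y, elev, value) tuples for neighbouring stations.
    cx, cy, ce: regression gradients of the variable with respect to
                x, y, and elevation (assumed to be fitted elsewhere).
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, s_elev, value in stations:
        # Shift the station value to the target location using the gradients.
        adjusted = value + (x - sx) * cx + (y - sy) * cy + (elev - s_elev) * ce
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return adjusted                 # target coincides with a station
        weight = 1.0 / dist ** 2            # inverse distance squared weight
        numerator += weight * adjusted
        denominator += weight
    return numerator / denominator
```

Because a single set of gradients is applied everywhere, the sketch also makes the method's main limitation visible: the elevation relationship is treated as constant across the area.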

Mosier et al. (2014) combined anomaly adjustments with cubic polynomial interpolation to create finer resolution climate grids from coarse climate normal grids and monthly data points. Anomalies between the monthly data and climate normals are generated. Fine scale delta surfaces are generated by applying polynomial interpolation functions. The final surface is created by reapplying the interpolated anomalies to the climate normal grid. However, the authors note a potential limitation of the method is that it assumes regional circulation patterns do not change with time. Without a dense grid of temperature measurements, it is also difficult to assess the validity and accuracy of the regionally interpolated products for the present-day control period.

Similar to IDW, kriging also considers a number of points within a neighbourhood, but the relative contributions are based on a semivariogram, which models the distance/variance function of the climate variable calculated using all points within a predefined range. Kriging assumes stationary or near-stationary data, a requirement that is seldom met for daily climate data over large areas (Shen et al., 2001). Therefore, kriging is more applicable to interpolation of monthly/annual means or regional applications. Courault and Monestiez (1999) estimated daily minimum and maximum temperatures using kriging applied to raw and elevation-adjusted data. Better results were achieved using the elevation-adjusted data. This again requires an elevation/temperature relationship to be established. Co-kriging, as used by Ishida and Kawashima (1993), explicitly controls for elevation by including elevation as a co-variate. Neighbourhood weights are based on a cross-semivariogram modelled on the covariance between elevation and temperature.
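The distance/variance function underlying kriging can be illustrated with an empirical semivariogram, in which the semivariance for a distance bin is half the mean squared difference between all station pairs separated by that distance. The sketch below is an assumption-laden example (equal-width distance bins, hypothetical function name and data layout) and omits the model-fitting step that kriging requires.

```python
import math

def empirical_semivariogram(points, bin_width, max_dist):
    """Return (bin_centre, semivariance, n_pairs) for each non-empty distance bin.

    points: list of (x, y, value) tuples.
    Pairs separated by max_dist or more are ignored.
    """
    n_bins = int(math.ceil(max_dist / bin_width))
    squared_diff_sums = [0.0] * n_bins
    pair_counts = [0] * n_bins
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            dist = math.hypot(xi - xj, yi - yj)
            if dist >= max_dist:
                continue
            b = int(dist // bin_width)
            squared_diff_sums[b] += (zi - zj) ** 2
            pair_counts[b] += 1
    result = []
    for b in range(n_bins):
        if pair_counts[b]:
            gamma = squared_diff_sums[b] / (2 * pair_counts[b])
            result.append(((b + 0.5) * bin_width, gamma, pair_counts[b]))
    return result
```

A semivariance that keeps rising with distance rather than levelling off is one practical sign that the stationarity assumption noted above is not met.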

The interpolation methods already discussed are based more on geostatistical relationships than physically-based (meteorological) controls. This is partially addressed with GIDS, co-kriging, and spline-based interpolations with elevation included as an independent variable, but elevation effects are not constant in space or time, and other systematic terrain and meteorological influences are not readily captured by these methods. As an alternative, local or global regression methods generate formulae from selected predictor variables, which are used to calculate values at unsampled locations. In this way some of the geographic or terrain factors affecting temperature are explicitly accounted for, although meteorological processes or the characteristic spatial patterns of different weather systems are not necessarily captured.

When estimating missing values using regression equations based on multiple correlated stations, it is important to check for correlations between the predictor variables (stations) so that the regression assumption of no multicollinearity is not violated. Artificial neural networks (ANN) have been used by Coulibaly and Evora (2007) to estimate missing daily minimum and maximum temperatures. Inputs into the neural network are based on a station correlation matrix, but collinearity is not an issue with this method, allowing 3-4 neighbouring stations to be used as input. The study compared the performance of different neural network methods, but did not compare their performance to more traditional interpolation approaches. Another benefit of neural network methods is the ability to model non-linear relationships (Snell et al., 2000). Snell et al. (2000) found neural networks outperformed IDW, nearest neighbour and spatial average techniques. However, these comparison techniques have been shown to be poor estimators in areas of complex topography. In addition, neural networks are very much a “black box” approach and give less indication of the physical controls on temperature under different weather systems.
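A simple screening step for collinearity among candidate predictor stations is sketched below. It is illustrative only: the function names, the dictionary layout, and the correlation threshold of 0.95 are assumptions, and a fuller treatment would use a diagnostic such as the variance inflation factor.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def collinear_pairs(series_by_station, threshold=0.95):
    """List predictor station pairs whose |r| exceeds the (assumed) threshold.

    series_by_station: dict mapping station name to a list of temperatures
                       on a common set of days.
    """
    names = sorted(series_by_station)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(series_by_station[first], series_by_station[second])
            if abs(r) > threshold:
                flagged.append((first, second, r))
    return flagged
```

Flagged pairs indicate predictors that carry nearly the same information, in which case only one of them would normally be retained in a multiple regression.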

The interpolation and regression methods discussed for temperature modelling, examples of studies in which they have been used, and limitations or benefits of the methods are shown in Table 2.1. The methods used in this study – kriging, global and local regression, and weighted local regression – are shown in bold.

| Method | Study | Scale and variable | Limitations/Benefits |
|---|---|---|---|
| Linear interpolation | Baltazar and Claridge, 2006 | hourly temperature | single record interpolation, only works for short gaps (<4 hours) |
| Spectral analysis | Kondrashov and Ghil, 2006 | | single record interpolation, requires data series having a periodic signal |
| Nearest neighbour | Stahl et al., 2006a | daily minimum and maximum temperature | need to correct for elevation in areas of varying topography |
| Inverse distance weighting | Dodson and Marks, 1997; Shen, 2001 | daily minimum and maximum temperature | spatial interpolation, need to correct for elevation in areas of varying topography |
| Gradient inverse distance squared | Nalder and Wein, 1998 | monthly temperature | assumes a constant relationship between elevation and temperature across the area |
| Spline | Hijmans et al., 2005 | monthly mean, minimum and maximum temperature | surface is too smooth |
| **Kriging** | Courault and Monestiez, 1999 | daily minimum and maximum temperature | need to correct for elevation in areas of varying topography |
| Co-kriging | Ishida and Kawashima, 1993 | hourly temperature | using elevation as a co-variable excludes other topographic influences on climate |
| Anomaly+spline | Mosier et al., 2014 | monthly mean temperature and precipitation | assumes constant spatial climate patterns, smooth surface |
| **Regression** | Carrega, 1995; Esteban, 2009; Huth and Nemesova, 1995 | daily/monthly minimum and maximum temperatures | requires stationary data and assumes a constant relationship between elevation and temperature across the area |
| **Local regression** | Nkemdirim, 1996; Eischeid, 2000 | daily minimum and maximum temperature | implicitly accounts for factors other than elevation affecting temperature |
| **Weighted local regression** | Daly et al., 2003, 2007, 2008 | monthly, daily temperature and precipitation | explicitly accounts for factors other than elevation affecting temperature |
| Artificial neural networks | Coulibaly and Evora, 2007 | daily minimum and maximum temperature | weaker performance in areas of complex topography |

Table 2.1. Interpolation methods, example studies and limitations/benefits.

3. STUDY AREA AND DATA QUALITY ANALYSIS

In this chapter I provide an overview of the terrain and general climate of the Foothills Climate Array area in southwestern Alberta. This is followed by details about site setup, instrument calibration, data downloads, and quality control of the data.

3.1 Study Area

Data collected during the University of Calgary Foothills Climate Array (FCA) study are used in this research. The study area, shown in Figure 3.1, covers approximately 24 000 km², stretching from ~50 km east of Calgary to the Rocky Mountains in the west and spanning ~120 km north-south, centred on approximately 51°N and 114.5°W.

Figure 3.1. Foothills Climate Array study area. Crosses indicate mountain sites and dots are prairie sites. The City of Calgary municipal boundary is shown as a black outline. Sites within the boundary are considered urban sites.

The survey consists of 12 east-west lines, spaced approximately 10 km apart. Along each line the sites are separated by approximately 5 km. Between 200 and 230 stations were in operation during the main recording period from July 2005 to June 2010. Site locations sampled the varying topography and land surfaces of the area. I use the terms topography and terrain interchangeably. In both cases I am referring to the relative relief of the area and slope attributes: slope angle and aspect, altitude, and position, i.e., valley bottom, mountain top, or mid-slope.

The Rocky Mountains, with peak altitudes up to 3500 m, form the boundary between British Columbia and Alberta, and are aligned northwest to southeast. The Bow River is a major river basin in the study area. The southeast-flowing river cuts a path through the mountains before turning east and exiting the mountains southeast of Banff. In the mountains, the floor of the Bow Valley has an altitude between 1600 m and 1300 m, with peaks rising more than 1000 m on either side. The Kananaskis River, the second large river basin in the study area, flows north before joining the Bow River as it exits the mountains. There are also several smaller tributaries of the Bow River and numerous small mountain creeks and alpine lakes.

The upper slopes of the mountains are above treeline and consist of rock and rubble with little vegetation. Below treeline, coniferous forest and alpine meadows occur. The foothills comprise lower-elevation rolling hills, with coniferous and aspen forests interspersed with grassy meadows. The prairies are mostly shrub, grassland, and cultivated cropland, with farming the dominant land use. Ten sites are situated within the Calgary municipal boundary. At the eastern edge of the survey, east of Calgary, elevations drop to 900 m. Thus the area and site locations comprise a wide range of elevation, topography – flat prairies, rolling foothills and high mountains – and surface types: grassland, urban, forest, alpine vegetation, and bare rock.

The sites are representative of the overall topography of the area as indicated by two-sample Kolmogorov–Smirnov tests at the 95% confidence level. The two-sample Kolmogorov–Smirnov test determines whether two samples have similar cumulative probability distribution functions. To compare the FCA sites with the full region, aspect (slope direction) is grouped into eight cardinal direction classes, slope into 5° bins, and elevation into 200 m bins. The elevation, slope angle, and aspect counts by bin for all sites are shown in Figure 3.2, along with the same terrain attributes for each 250-m grid cell within the study area. The distribution of sites follows the distribution of terrain in the study area, although the steepest slopes and highest elevations are under-represented by the sample sites.
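The two-sample Kolmogorov–Smirnov comparison can be sketched as follows. This is a minimal illustration, not the analysis code used for the thesis: it works on raw (unbinned) values, uses the large-sample approximation for the critical value, and the function name and the default `alpha` of 0.05 are assumptions.

```python
import math

def ks_two_sample(sample_a, sample_b, alpha=0.05):
    """Return (D, critical_value, reject) for a two-sample K-S test.

    D is the largest absolute difference between the two empirical
    cumulative distribution functions. The critical value uses the
    asymptotic approximation c(alpha) * sqrt((n + m) / (n * m)).
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    d_stat = 0.0
    i = j = 0
    for x in sorted(set(a) | set(b)):
        while i < n and a[i] <= x:      # count of a-values <= x
            i += 1
        while j < m and b[j] <= x:      # count of b-values <= x
            j += 1
        d_stat = max(d_stat, abs(i / n - j / m))
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return d_stat, critical, d_stat > critical
```

Here `sample_a` might hold site elevations and `sample_b` the elevations of the DEM grid cells; a result with `reject` equal to `False` is consistent with the sites sampling the terrain distribution.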

Overall, the diversity of elevation, topography, and surface types contributes to complex spatial and temporal variability of the recorded meteorological variables. The dense sampling grid allows for the investigation of local topographic effects on temperature.

Figure 3.2. Distribution of topographic measures (aspect, slope, and elevation) for FCA sites (upper panel) and overall 250-m resolution DEM derived measures (lower panel).

Lower altitudes and slope angles are more common in the prairies in the east. With more relative relief, it is anticipated that influences on temperature will be different in the high-altitude mountains relative to the flat prairies. Therefore, sites are classified as mountain, prairie, and urban. Sites within the Calgary municipal boundary are classified as urban. Both altitude and elevation variability are used to separate mountain and prairie sites. Site classification is a subjective process, the aim being to produce homogeneous areas without site class outliers. Elevation variability (relative elevation) is calculated as the difference between the site altitude and the minimum altitude within a 250 m buffer around each site. The specific parameters used to classify stations are:

- Prairie: altitude < 1250 m and relative elevation < 50 m
- Mountain/foothills: altitude > 1250 m or relative elevation > 50 m

In this way 143 stations are classified as mountain/foothills, 83 classified as prairie, and 10 are urban sites.
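The classification rules above can be expressed as a short sketch. The function names and the boolean flag for the municipal boundary are hypothetical, and the treatment of sites exactly on a threshold (altitude of 1250 m or relative elevation of 50 m), which the stated rules do not cover, is an assumption: the sketch assigns them to mountain/foothills.

```python
def relative_elevation(site_altitude, buffer_altitudes):
    """Site altitude minus the minimum altitude within the 250 m buffer."""
    return site_altitude - min(buffer_altitudes)

def classify_site(altitude, rel_elev, in_calgary_boundary=False):
    """Classify a site as 'urban', 'prairie', or 'mountain/foothills'."""
    if in_calgary_boundary:
        return "urban"
    if altitude < 1250 and rel_elev < 50:
        return "prairie"
    # Everything else, including threshold values (assumed), is mountain/foothills.
    return "mountain/foothills"
```

For example, a site at 1100 m with a buffer minimum of 1090 m has a relative elevation of 10 m and is classified as prairie.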

The Rocky Mountains, forming the boundary between British Columbia and Alberta, influence the eastward flow of weather systems as they approach Alberta from the Pacific, and this limits the moderating influence of maritime polar air to areas west of the Rocky Mountains, giving Alberta its continental climate. Calgary broadly falls into the Dfc and Dfb Köppen classes (Ahrens, 2008), which are defined based on average annual and monthly temperature and precipitation values. A “D” class is defined as a “moist mid-latitude climate with severe winters”. The southeastern part of the FCA falls into the Dfb class, and is characterized as “humid continental with cool summers”. The remainder of the area falls into the Dfc class, characterized as “subpolar” having long, cold winters. Precipitation occurs year round, but with most occurring in summer. Winters are usually cold, windy, and snowy.

Nkemdirim (1998) gives more specific details about the climate of Calgary, describing it as “cold temperate”. Chinook winds occur in winter, and when frequent, can raise the average winter temperature above the normal −8°C. Summers are moderate, with a mean temperature of 15°C. Thunderstorms are a major source of summer precipitation, and snow occurs from fall through to spring, contributing 35% to annual precipitation.

Monthly climate normal (1981-2010) mean, minimum, and maximum temperatures from Environment Canada for Calgary are shown in Figure 3.3. The continental location, away from the moderating influence of large water bodies, allows for a high diurnal range. The high latitude results in a large annual range in solar radiation and a similar large annual temperature range. However, the mean temperature for any given year seldom matches the climatic average. Annual averages of daily mean, minimum, and maximum temperatures recorded by Environment Canada at Calgary airport for the years during which data were recorded as part of the FCA (2004-2010) are shown in Table 3.1. The coldest year was 2009, at 0.4°C below normal, and the warmest was 2006, at 0.9°C above normal. Average temperatures during the FCA period were 0.3°C above the 30-year normal, with the warmer conditions mostly associated with minimum temperatures. Average minimum, maximum, and mean temperatures for all years are within one standard deviation of “normal” values; therefore, overall, the FCA study period can be considered to be relatively normal in the context of the reference period, 1981-2010.

Figure 3.3. Monthly climate normals (1981-2010) at the Environment Canada Calgary airport station for maximum, minimum, and average temperatures. Dashed lines indicate annual climate normals.

Table 3.1. Annual average of daily maximum, minimum, and average temperatures at the Environment Canada Calgary airport station during the FCA data collection period.

3.2 Instrument Calibration and Accuracy

3.2.1 Site setup and maintenance

Setup of the FCA data loggers began in spring 2004 and was completed in summer 2005. Data recording continued until fall 2010 when the majority of the stations were taken down. Table 3.2 shows the number of sites collecting data each month from initial setup to takedown. Prairie sites were more accessible and were visited twice annually for site maintenance and data downloads, during the spring and fall field seasons. Mountain sites were less accessible, with many in remote locations accessible only in summer by long hikes. Therefore, these sites were only visited once per year for maintenance and data downloads.

Maintenance included cleaning the rain gauges, radiation shields, and sensors, as these become dirty over time through exposure to dust, pollen, insects, etc. Field notes and photographs were taken to document the physical location and condition of the sites during each visit. Sites do not conform to World Meteorological Organization (WMO) standards, which specify that climate recording sites should be level, away from vegetation and buildings, and not in areas of variable topography (WMO, 2008). As the purpose of the study is to examine topographic influences on weather, rather than looking at long-term trends, these variances are acceptable and are in fact part of the project design.

Table 3.2. Number of sites by year and month from setup in July 2004 to takedown in December 2010.

Example site locations are shown in Figure 3.4. In the prairies, instruments were mounted 1.5 m above ground level, but above 2000 m instruments were mounted 2-3 m above the ground to minimize snow burial. Nonetheless several instruments were buried or knocked over. Both prairie and mountain sites also suffered damage due to interactions with wildlife, farm animals, people, and weather events.

Data were downloaded in the field and basic visual quality control tests performed. These included identification of spikes, unusual extremes, and comparison with neighbouring stations. Where data looked unusual, the logger was replaced and the questionable logger brought back for calibration testing. Site location coordinates and altitude were recorded by GPS on each visit. Minor discrepancies and errors were noted, which introduces some uncertainty in site locations. In addition, some sites suffering ongoing damage were moved to different locations. In general, these were less than 100 m away, but for moves greater than 100 m and elevation changes more than 50 m, data before and after the move were checked to see whether a single site location was adequate. This process is described further when sources of data uncertainty are addressed.

Figure 3.4. Site location examples: prairie, forested, urban, and mountain.

3.2.2 Instrumentation

Each site in the FCA recorded rainfall, temperature, and relative humidity. Rainfall was recorded using a HOBO tipping bucket rain gauge manufactured by Onset Computer Corporation. Temperature and relative humidity were recorded at 1-hour intervals, with instantaneous measurements taken at the top of the hour, using SP-2000 temperature-relative humidity data loggers from Veriteq Instruments Inc. The data loggers are mounted inside radiation shields manufactured by Onset Scientific Ltd. to protect the loggers from direct sunlight and allow air circulation. Reported accuracy for the data loggers is ±0.25°C between −25°C and +70°C, with a resolution of 0.02°C at +25°C. Daily maximum (Tmax) and minimum (Tmin) temperatures were extracted from the hourly data, and the daily mean temperature (Tmean) was calculated as the average of hours 0 to 23.
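The derivation of the daily values from hourly records can be sketched as follows. The function name is hypothetical, and requiring all 24 hourly readings for a day is an assumption of this example; the handling of incomplete days is not specified in this section.

```python
def daily_summary(hourly_temps):
    """Return (Tmax, Tmin, Tmean) from the 24 hourly readings for hours 0 to 23."""
    if len(hourly_temps) != 24:
        raise ValueError("expected 24 hourly values (hours 0 to 23)")
    t_max = max(hourly_temps)
    t_min = min(hourly_temps)
    t_mean = sum(hourly_temps) / 24.0
    return t_max, t_min, t_mean
```

Because the readings are instantaneous values at the top of each hour, Tmax and Tmin computed this way may miss extremes that occur between readings.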

3.2.3 Instrument calibration

Instruments were calibrated at the University of Calgary weather research station (WRS) before being set up in the field, and on an ongoing basis during the study. Calibration tests consisted of sensors set up at the WRS and recording instantaneous temperature at one, two, or five-minute intervals for one- to two-week periods. Data were aggregated to hourly intervals and compared with World Meteorological Organization (WMO) standard aggregated hourly temperature measurements at the WRS using a Campbell Scientific HMP35CF sensor mounted in a Gill Model 41004-5 12 plate radiation shield. Over the period of the study, 398 loggers were deployed at 234 unique locations. Each logger underwent between one and five calibration tests for a total of 671 tests between 2004 and 2011. Examples of 15 logger tests in May 2005 are shown in Figure 3.5. The average difference between logger and WRS temperatures is less than 0.2°C for all loggers, and hourly differences never exceed 1°C for the test period.

During the test period, if no absolute value of the hourly temperature difference between logger and WRS exceeded 3°C, a test was considered “good” and no further investigation required. 94 percent of tests were good. Average hourly differences were calculated for all good tests for all hours in a day (24-hr), for hours between 00 and 06h (night), and between 10 and 16h (day). Average differences by year are shown in Table 3.3. The average difference between the loggers and WRS is –0.1°C. Tests after 2007 show more negative offsets relative to the WRS values, possibly due to the WRS logger changing mid-2007 (using a Campbell Scientific HMP45C212 sensor), and more tests being conducted during winter, where there may be less daytime heating effect in the unventilated sensors as discussed in the vented calibration tests in section 3.2.4.
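The evaluation of a single calibration test described above can be sketched as below. The function name and record layout are hypothetical, and treating the night and day windows as the half-open intervals 00h to 06h and 10h to 16h is an assumption.

```python
def evaluate_calibration_test(records, max_abs_diff=3.0):
    """Summarise one calibration test.

    records: list of (hour_of_day, logger_temp, wrs_temp) hourly values.
    A test is 'good' if no absolute hourly difference exceeds max_abs_diff.
    """
    diffs = [(hour, logger - wrs) for hour, logger, wrs in records]

    def mean(values):
        return sum(values) / len(values) if values else None

    return {
        "good": all(abs(d) <= max_abs_diff for _, d in diffs),
        "mean_24h": mean([d for _, d in diffs]),
        "mean_night": mean([d for h, d in diffs if 0 <= h < 6]),
        "mean_day": mean([d for h, d in diffs if 10 <= h < 16]),
    }
```

Differences are computed as logger minus WRS, so negative means correspond to a cold bias in the logger relative to the reference sensor.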

Figure 3.5. WRS (red) and 15 Veriteq loggers for a 5-day calibration test in May 2005. (a) actual temperatures and (b) temperature difference calculated as logger – WRS.

Average temperature difference, logger − WRS (°C)

| Year | No. Tests | 00h00-24h00 | 10h00-16h00 | 00h00-06h00 |
|---|---|---|---|---|
| 2004 | 216 | -0.02 | 0.13 | -0.09 |
| 2005 | 67 | -0.01 | 0.1 | -0.08 |
| 2006 | 6 | -0.13 | -0.04 | -0.2 |
| 2007 | 38 | -0.05 | 0.16 | -0.16 |
| 2008 | 48 | -0.2 | -0.25 | -0.16 |
| 2009 | 73 | -0.22 | -0.26 | -0.2 |
| 2010 | 140 | -0.21 | -0.23 | -0.19 |
| 2011 | 83 | -0.23 | -0.36 | -0.2 |

Table 3.3. Test result statistics by year showing the total number of tests performed, the average difference between the hourly logger and WRS temperatures for all hours 00h to 24h, hours between 10h and 16h, and 00h and 06h.

Boxplots showing the distribution of average hourly differences for each test for the three periods are shown in Figure 3.6. Differences are in general negative, indicating loggers have a small cold bias relative to the WRS. Absolute value differences seldom exceed 0.5°C with little difference between day and night tests. While the average difference varies by year, there is no systematic change, indicating performance has not degraded with time. However, different loggers are tested each year so differences are not directly comparable.

Figure 3.6. Median temperature differences for each test where differences are calculated for all hours 00h to 24h, hours between 10h and 16h (day), and 00h and 06h (night). Box width is proportional to the square root of the number of observations, box height indicates the interquartile range, and the solid line within each box is the median value.

A further examination of loggers that underwent four or five tests between 2004 and 2011 (Table 3.4) also shows no systematic drift with time. Nor is the average error per logger for each test the same for different test periods. The instruments appear to be accurate within ~0.3°C, similar to the manufacturer-reported accuracy, and well below what can be reasonably expected in landscape-scale temperature modelling (e.g., monthly errors from 0.5 to 1.3°C in Cullen and Marshall, 2011). Therefore, it does not seem necessary to apply a correction to data recorded by each logger.

Table 3.4. Average offset (logger-WRS temperature) by year for loggers undergoing four or five calibration tests.

Six percent of tests had hourly differences exceeding 3°C. Bad tests often occurred during periods of heavy rain as shown in Figure 3.7. Some, but not all, loggers appear to read too high in rainy conditions. These loggers were still deployed in the field as they perform adequately when it is dry and quality control routines, which are described in detail later in the chapter, are designed to identify bad data, based on physically unreasonable readings and station correlation with neighbours.

Figure 3.7. WRS (red) and 17 loggers for a 6-day calibration test in June 2005, where 27 mm of rain fell on June 28. (a) actual temperatures and (b) temperature difference calculated as logger – WRS. Offsets greater than 2°C occur during heavy rain events.

3.2.4 Vented calibration tests

Studies have shown that temperature loggers in unventilated radiation shields used to protect the sensor from direct solar radiation, as used in the FCA study, may read too high under calm sunny conditions (Nakamura and Mahrt, 2005). Temperature sensors and shields have different associated errors depending on the design of the shield and the type of sensor. The design of the shield should ensure that the air within the shield is at the same temperature as the surrounding air (WMO, 2008). Shields may rely on natural ventilation from prevailing winds or may be artificially ventilated using a fan. However, naturally ventilated sensors may show a warm bias when wind speeds are less than 1 m s⁻¹. Studies have investigated the magnitude of differences between temperatures recorded by naturally ventilated sensors and those recorded by artificially ventilated sensors under different wind and solar radiation conditions. Daytime temperature differences are the greatest due to solar heating. This heating effect may be from direct heating of the sensor (Georges and Kaser, 2002) or indirect heating of the shield which then heats the air inside the shelter. For incoming solar radiation of 1084 W/m² with multi-plate shields similar to those used in the FCA study, Whiteman et al. (2000) report root mean square errors of 1.5°C at wind speeds <1 m s⁻¹, 0.7°C at 2 m s⁻¹ and 0.4°C at 3 m s⁻¹. Nakamura and Mahrt (2005) produced a correction model using measured temperature, wind speed, solar radiation and air properties – density and specific heat at a specified air pressure. Huwald et al. (2009) expanded this model to include surface albedo, as errors appeared greater over surfaces having a high albedo (e.g., sites with snow cover showed errors up to 10°C). Both studies recommend that the correction model should be created for each sensor/shield type.

This is not expected to be a problem for our sensors, given that they exhibit a small cold bias rather than a warm bias (Figure 3.6). To be sure, we performed an experiment at the University of Calgary WRS from October 2012 to September 2013 to quantify the effect of using naturally ventilated sensors, as used in the FCA study, compared to a mechanically ventilated sensor used at the WRS. The Veriteq loggers were set to record instantaneous temperature at five-minute intervals. These were aggregated to hourly values and compared with aggregated hourly temperature measurements for the WRS reference sensor, which records averaged data at one-minute intervals. The average difference between hourly average logger and WRS temperatures was 0.1°C for the test period.

This result does not control for the influence of wind speed and incoming shortwave radiation. These effects were examined by grouping hourly average wind speed and incoming shortwave radiation measured at the WRS between 08h00 and 18h00 into categories. Distributions of hourly average temperature differences (logger − WRS) for each category are shown in Figure 3.8. Consistent with the theoretical expectation, larger differences are associated with higher radiation and lower wind speed values. However, outliers (large temperature differences) also occur for the low and moderate solar radiation categories (<300 and 300 to 600 W/m²).
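The grouping described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the radiation class edges (300 and 600 W/m²) come from the text, but the wind speed class edges, the record layout and the function names are assumptions, as is treating 18h00 as part of the daytime window.

```python
from collections import defaultdict
from statistics import mean

# Radiation class edges (W/m^2) follow the text; the wind class edges (m/s)
# are assumed for illustration only.
RAD_EDGES = [300.0, 600.0]
WIND_EDGES = [1.0, 2.0, 3.0]


def bin_index(value, edges):
    """Return the index of the class that `value` falls in (0 = lowest class)."""
    return sum(value >= edge for edge in edges)


def mean_difference_by_class(records):
    """Average (logger - WRS) difference per (wind class, radiation class).

    `records` is an iterable of (hour, wind_speed, radiation, logger_t, wrs_t).
    Only hours from 08h00 to 18h00 are used (inclusive, an assumption).
    """
    groups = defaultdict(list)
    for hour, wind, rad, logger_t, wrs_t in records:
        if 8 <= hour <= 18:
            key = (bin_index(wind, WIND_EDGES), bin_index(rad, RAD_EDGES))
            groups[key].append(logger_t - wrs_t)
    return {key: mean(vals) for key, vals in groups.items()}
```

Each key of the returned dictionary identifies one wind/radiation cell, so the result maps directly onto a two-way table of mean differences.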

Figure 3.8. The average temperature difference between Veriteq loggers and the WRS at different wind speeds and incoming shortwave radiation values. Box width equals the square root of the sample size.

Differences averaged for the whole year for the combined wind speed and incoming shortwave radiation categories are shown in Table 3.5. The maximum difference of 0.7°C occurs at higher solar radiation (greater than 600 W/m²) and wind speeds less than 1 m s⁻¹. For high wind speed and low solar radiation, the average difference is −0.2°C.

This is strongly systematic and points to a relatively simple correction if hourly shortwave radiation and wind data are available at a site. These are not available in the FCA, so this is a source of error that we must tolerate. However, the mean error in hourly temperature associated with the worst-case conditions is +0.7°C, and the daily average errors associated with unventilated sensors will be much less (solar radiation is less than 600 W/m² for most of the day, and throughout the winter). Hence this source of error is likely to be insignificant for daily average and minimum (overnight) temperatures. It might affect maximum temperature measurements at sites located in sheltered locations; these may experience additional warming relative to sites in exposed locations due to reduced ventilation, particularly in the summer months, but differences will seldom exceed 1°C.

Table 3.5. The average hourly temperature difference between loggers and the WRS at different wind speeds and incoming shortwave radiation values. Darker shading indicates a bigger positive difference.

3.2.5 Station relocation

During each site visit, site coordinates and altitude were recorded by GPS. Field notes recorded site relocations and comments regarding coordinate accuracy. Any discrepancies between multiple coordinates were noted and verified against field notes, and the coordinates deemed most reliable are used in the analysis. Sites moved by less than 100 m horizontally are considered to be the same location. Five sites were moved by more than 100 m, resulting in elevation changes up to 40 m. These sites were investigated for any statistically significant differences in temperatures before and after the move.

• FA0315 was moved in June 2005, but no data exist prior to the move; no further action is required.
• FA0920 was moved on July 26, 2005; six months of data from 2004 exist prior to the move. Data analysis uses data beginning in July 2005, therefore this is not considered a move and no further action is required.
• FA0120 was moved on August 30, 2007.
• FA0618 was moved on August 7, 2007.
• FA0918 was moved on July 13, 2008.

Boxplots showing the distribution of the station daily mean temperature minus the average of its top 10 correlated neighbour stations daily mean values before (LT) and after (GT) the move date for the moved stations are shown in Figure 3.9. Distributions of the temperature difference for the before and after move groups are not normally distributed, therefore the non-parametric Wilcoxon-Mann-Whitney test, which determines whether the two samples come from the same population, was run. In all cases the test indicated sample means before and after the move were significantly different.
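For readers unfamiliar with the test, the sketch below shows one way a Wilcoxon-Mann-Whitney comparison of the before and after groups could be computed. It is an illustration only: the software used for the thesis is not stated, the function names are hypothetical, and the p-value uses the large-sample normal approximation without a tie correction.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(before, after):
    """Return (U, z, two-sided p) using the normal approximation."""
    n1, n2 = len(before), len(after)
    ranks = average_ranks(list(before) + list(after))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, z, p
```

Here `before` and `after` would hold the daily differences between a station and the mean of its neighbours on either side of the move date.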

Figure 3.9. Distribution plots of the difference between station daily mean temperature and the average of the 10 most highly correlated neighbour stations for stations which were moved during the study period. GT indicates data after the move and LT before the move.

The median difference between a station and its 10 most highly correlated neighbours for two data periods corresponding to station move dates was calculated for two randomly selected stations which were not moved. The maximum difference between the two periods is 0.11°C. For the moved stations, FA0918 has a difference of 0.14°C between the periods before and after the move; this is within the accuracy range of the loggers, so no further action is warranted. FA0618 has a difference of 1.25°C between periods before and after the move, and FA0120 has a difference of 0.92°C. These differences exceed the average difference between sensors. Rather than treating each site as two different locations, data prior to the move were excluded from the analysis.

3.3 Quality Control

Assessment of data quality prior to data analysis is an essential step to ensure only valid data are used. Hourly temperatures were recorded at all sites and downloaded annually or semi-annually. Veriteq data loggers store downloaded data in a proprietary file format, with data files named according to site and time of download. All raw data files were loaded into the Spectrum software package (V3.7c) and exported as text files. In Microsoft Excel, header information was extracted and data files combined into a single file for each year and loaded into a SQL Server database. While sensors generally functioned well, periodically they malfunctioned or were damaged in the field, resulting in bad data.

Wade (1987) identified four general sources of measurement error, namely: instrument failure, drift, bias, and random error. Calibration tests identified that loggers do not show significant bias or drift, so these will not be corrected for. Instrument failure is readily identifiable as missing data or unrealistic values. Random errors are more difficult to identify, and showed up during calibration tests where loggers measure a few degrees too high for a few hours or days. Sensors would then return to normal, with good accuracy, so I do not wish to completely reject the sensors that are prone to this random error. A challenge in identifying bad data is to differentiate between extreme events and actual bad data.

Durre et al. (2010) detail comprehensive automated quality assurance procedures for daily meteorological measurements, as being applied operationally in the Global Historical Climatology Network (GHCN). Quality control procedures are designed to identify as many errors as possible with few false positives (that is, valid data flagged as bad data). Tests used in the procedure include: (i) physically reasonable bounds; (ii) internal consistency – is the daily value within statistical bounds for that day in the year; (iii) external consistency – is a value within reasonable limits of surrounding stations; (iv) multiple duplicate or repetitive values; (v) unusually large changes in daily minimum and maximum values. Similar tests are used in this study.

Quality control procedures for the FCA data include a sequence of automated (A) and manual (M) data checks applied to an entire file, hourly, daily, or monthly data, and run in the order shown below.

1. Field checks (M - entire file)
2. Time shifts (M - hourly)
3. Spikes (A - hourly)
4. Extreme values (A - hourly)
5. Snow burial (A/M - monthly)
6. Neighbourhood consistency (M/A - daily)
7. Review questionable loggers based on field notes and calibration tests (M - daily/hourly)
8. Final review (M - daily/hourly)

3.3.1 Field checks

The Spectrum software used for data downloading allows data from multiple loggers to be displayed on the same graph. During the download process each logger’s data was compared with one or more near neighbours. Downloads were characterized as “good”, “some bad” or “bad” using this comparison. Where a download was characterized as bad, the entire file was excluded from the compiled file. Files characterized as “some bad” were included in the compiled file, and SQL queries were used to delete sections readily identified as bad based on a visual review.

3.3.2 Time shifts

The loggers do not contain a clock and take the start and end time from the computer doing the setup and download. The Spectrum v3.7c software uses the computer download time to assign a time to each measurement. Multiple machines were used for field downloads and at times the clocks were set to the wrong time zone, or alternated between daylight saving time (DST) and Mountain Standard Time (MST). Times were commonly out by several minutes. In addition, on occasion some loggers malfunctioned and missed recording data for hours or days at a time, as seen in comparisons with neighbours, and this was noted in the field notes.

Data loggers are able to store ~18 months of data recorded at one-hour intervals. If the time between site visits was too long, or a logger was inadvertently set to record at shorter time intervals, the memory filled up and no more data were recorded. By comparing multiple neighbours, and reviewing field notes which included the actual download time and whether time was DST or MST, files with time shifts and missing data were identified. Once the data were loaded into the database, SQL queries applied the necessary time shifts. Where possible, periods of missing data were identified and measurement times adjusted to align data with neighbours.
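The time-shift step was carried out with SQL queries that are not reproduced here. As a rough equivalent, the sketch below shows the two operations involved, assuming hourly records held as (timestamp, temperature) pairs; the function names and data layout are hypothetical.

```python
from datetime import datetime, timedelta


def shift_timestamps(rows, offset_hours):
    """Return rows with every timestamp moved by `offset_hours` (may be negative)."""
    delta = timedelta(hours=offset_hours)
    return [(stamp + delta, temp) for stamp, temp in rows]


def find_gaps(rows, step=timedelta(hours=1)):
    """Return (previous, next) timestamp pairs that are more than one step apart."""
    stamps = [stamp for stamp, _ in rows]
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > step]
```

A file stamped in DST would be converted to MST with `shift_timestamps(rows, -1)`, and `find_gaps` lists the places where a logger skipped one or more hourly readings.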

3.3.3 Spikes

A directional step test as used by Hall et al. (2008) identifies consecutive measurements exceeding a user defined limit. Limits vary by location (climate region), measurement interval, and direction (rise or fall). Decreases of up to 9°C in five minutes have been observed in the Oklahoma mesonet, but the maximum increase observed is 6°C. However, Graybeal et al. (2004) found step tests were capturing real frontal events, whereas the real data problems were predominantly one hour spikes in the data. In southern Alberta, chinook winds and cold fronts both cause large rises and falls in hourly temperature measures, but these conditions usually persist after the step change.

A review of FCA data identified by step tests indicates spikes, either up or down, lasting one or more hours are real errors. A single hour spike may not influence daily mean values enough to be identified as unusual relative to neighbours (neighbourhood consistency check described later), but spikes lasting four or more hours will. Therefore, a spike test to identify spikes lasting from one to three hours was applied to FCA data. A spike was defined as a value exceeding subsequent or previous measures up to three hours apart by 5°C either up or down. As an example, Figure 3.10 shows spikes identified for site FA0234.
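One possible reading of this spike definition is sketched below. It is an interpretation, not the code used in the study: an hour is flagged when it departs by more than 5°C, in the same direction, from a value up to three hours earlier and a value up to three hours later, with the two lags constrained so that the excursion lasts at most three hours. Function names are hypothetical.

```python
def _is_spike(temps, i, threshold, max_hours):
    """Check hour i against earlier/later values bracketing a short excursion."""
    for p in range(1, max_hours + 1):
        # q is limited so that the excursion length p + q - 1 <= max_hours
        for q in range(1, max_hours - p + 2):
            if i - p < 0 or i + q >= len(temps):
                continue
            before, after = temps[i - p], temps[i + q]
            if before is None or after is None:
                continue
            d_before, d_after = temps[i] - before, temps[i] - after
            same_direction = d_before * d_after > 0
            if abs(d_before) > threshold and abs(d_after) > threshold and same_direction:
                return True
    return False


def find_spikes(temps, threshold=5.0, max_hours=3):
    """Return indices of hourly values that look like spikes of 1..max_hours hours.

    Missing values are given as None and are never flagged.
    """
    return [i for i, value in enumerate(temps)
            if value is not None and _is_spike(temps, i, threshold, max_hours)]
```

Under this reading a step change that persists, such as a chinook onset or a cold front, is not flagged, because no later value within the window returns to the earlier level.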

Figure 3.10. Site FA0234 shown in red has spikes lasting 2 and 3 hours on July 6 and 8, 2005, as well as persistent aberrant behaviour on July 7.

3.3.4 Extreme values

Extreme values were identified using Environment Canada monthly extreme minimum and maximum values for Calgary and adding or subtracting a set amount to account for variations due to differences in elevation. Hourly values were flagged as extremes when the value exceeded the Calgary monthly maximum extreme by more than 5°C or was lower than the extreme minimum by more than 10°C. The amount by which the extreme was exceeded is included in the flag. An example of an extreme outlier is shown in Figure 3.11 where FA0332 was 10°C cooler than the Calgary average on August 24 2007 at 09:00. I found the extreme-value test to be less useful than the spike or neighbourhood consistency tests. Spike tests are more sensitive, as a sensor may spike, but still be within the limits of the extreme test. Where extremes are exceeded, it is generally for multiple hours in a day, which will cause the daily mean to fail the neighbourhood consistency test as well.
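The extreme-value rule amounts to a pair of asymmetric bounds around the monthly extremes. A minimal sketch follows; the function name and the flag strings are hypothetical, and the monthly extremes are supplied by the caller rather than taken from any real Environment Canada record.

```python
def extreme_flag(temp, month_min, month_max, low_margin=10.0, high_margin=5.0):
    """Return None for an acceptable value, else a flag string with the exceedance.

    month_min and month_max are the reference station's monthly extreme minimum
    and maximum; the margins follow the 10 and 5 degree allowances in the text.
    """
    upper = month_max + high_margin
    lower = month_min - low_margin
    if temp > upper:
        return f"extreme_high_{temp - upper:.1f}"
    if temp < lower:
        return f"extreme_low_{lower - temp:.1f}"
    return None
```

Embedding the exceedance in the flag, as the text describes, lets a reviewer sort flagged hours by how far outside the bounds they fall.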

Figure 3.11. Site FA0332 shown in red exceeded the Calgary monthly extreme on August 24 2007. All other days with unusual minima are still within the extreme value threshold, but this sensor is clearly erratic, with anomalies on other days flagged by the neighbourhood test.

3.3.5 Snow burial

Some of the high mountain sites were buried by snow during late winter. Snow burial is seen through a small diurnal range, as shown in Figure 3.12 for site FA0715. However, high elevation sites tend to have a low diurnal range, especially during cold weather. Therefore, rather than identifying runs of days with low diurnal range, the snow burial test flags entire months where at least 25 days had a diurnal range of less than 3°C. Days on either side of the flagged months are examined for snow burial signal and manually flagged. In general, snow burial was identified during field checks and blocks of data deleted.
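The monthly snow burial flag can be sketched as below, assuming daily minimum and maximum temperatures for one site keyed by date. The function name and data layout are hypothetical, and the example data are synthetic.

```python
from collections import defaultdict
from datetime import date


def snow_burial_months(daily, max_range=3.0, min_days=25):
    """Return sorted (year, month) pairs flagged as possible snow burial.

    `daily` maps a datetime.date to a (tmin, tmax) tuple for one site.
    A month is flagged when at least `min_days` days have a diurnal
    range below `max_range` degrees.
    """
    low_range_days = defaultdict(int)
    for day, (tmin, tmax) in daily.items():
        if tmax - tmin < max_range:
            low_range_days[(day.year, day.month)] += 1
    return sorted(ym for ym, count in low_range_days.items() if count >= min_days)


# Synthetic example: 26 low-range days in March, normal diurnal range in April.
example = {date(2006, 3, d): (-5.0, -3.5) for d in range(1, 27)}
example.update({date(2006, 4, d): (-8.0, 2.0) for d in range(1, 31)})
print(snow_burial_months(example))
```

Working at the monthly level avoids flagging short cold spells at high elevation sites, at the cost of requiring the manual check of adjacent days described above.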

Figure 3.12. Station FA0715 shown in red has a reduced diurnal range indicating snow burial.

3.3.6 Neighbourhood consistency

Neighbourhood consistency checks are used to identify unusual values at a site relative to neighbouring stations. The method, threshold values, and number of stations used depend on station density, topography of the area, and the weather variable being checked. In all cases, estimated values calculated from neighbouring sites are compared to observed values and large deviations are flagged as potential errors.

Shafer et al. (2000) used a weighted average of neighbouring stations to calculate a value for the station being checked. Differences exceeding three standard deviations were flagged as suspicious. Rather than assuming that nearest stations will be most highly correlated, Hubbard et al. (2007) ran regressions for multiple station pairs, where each pair included the station being checked. A weighted mean estimate was calculated using all regression estimates. Confidence limits to check whether the estimate was within an acceptable range were defined using a weighted average of all standard errors. The optimum threshold was determined to be three standard errors. When large differences are observed, the source of error may be the observation or one of the stations used in the estimation. Miller and Benjamin (1992) used a method, optimal interpolation, whereby each station was removed in turn from the regression. If the error remained large for all neighbour combinations, the observation was the source of the error. But if the error was reduced when a neighbour was removed, the neighbour was considered the source of the error and not used in future checks. However, all regression methods may not work as well in areas of high topographic variability or in rapidly changing weather, e.g., frontal conditions (Kunkel et al., 2005; Hubbard et al., 2007). To overcome these problems, Durre et al. (2010) used a 3-day window average value to identify large differences between estimate and observation, and Lanzante (1996) suggested using vertical neighbours as an alternative to horizontal proximity.

The FCA dataset has a greater station density and topographic range than the datasets used in the other quality control studies mentioned. As a means of identifying the most appropriate neighbours, two spatial neighbourhoods were defined. A horizontal proximity neighbourhood was defined using all sites within a 25 km radius of a site (50 km for boundary sites), and a vertical neighbourhood was defined using sites in 200 m elevation bands, e.g., 1400 to 1600 m, 1600 to 1800 m. All sites above 2200 m are grouped together; therefore, this group includes the two high elevation sites above 2800 m, with the remainder of the sites being less than 2500 m. In all cases, groups consist of at least 10 sites.
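The two neighbourhood definitions can be illustrated with a short sketch. It assumes site coordinates in decimal degrees and uses the haversine great-circle distance, which is a choice made here for illustration; the thesis does not state how distances were computed. Function names and the site dictionary layout are hypothetical.

```python
import math


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def elevation_band(elev_m, width=200, top=2200):
    """Return the lower edge of the band; everything at or above `top` shares one band."""
    if elev_m >= top:
        return top
    return int(elev_m // width) * width


def neighbourhoods(site, sites, radius_km=25.0):
    """Return (horizontal, vertical) neighbour id lists for `site`.

    `sites` maps id -> (lat, lon, elev_m). Use radius_km=50.0 for boundary sites.
    """
    lat, lon, elev = sites[site]
    horizontal = [s for s, (la, lo, _) in sites.items()
                  if s != site and distance_km(lat, lon, la, lo) <= radius_km]
    vertical = [s for s, (_, _, el) in sites.items()
                if s != site and elevation_band(el) == elevation_band(elev)]
    return horizontal, vertical
```

The two lists are built independently, so a site can belong to a station's vertical neighbourhood while lying well outside its horizontal one.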

The spatial proximity test accounts for local variability and the elevation band accounts for elevation consistency. The test examines daily minimum, maximum, and average values for each site compared to the average and standard deviations calculated from all sites within elevation and horizontal proximity neighbourhoods. In calculating the neighbourhood average and standard deviation for each day, the sites with the lowest and highest values within the group are excluded, as are all site/days flagged as errors in previous quality control tests.

Site/days are flagged as suspect if any of their daily minimum, maximum, or average values differs from both horizontal proximity and elevation band neighbourhood means by more than five standard deviations. All suspect site/days are manually reviewed and flagged as natural variability or bad data. For sites identified as bad data using a five-standard deviation threshold, manual review generally indicated erratic sensor performance and the threshold for bad data was reduced to three standard deviations. All days for these sites are flagged as bad where any of the daily minimum, maximum, or average exceeds the group mean values by three or more standard deviations. The high altitude (2816 m) site FA0517 is consistently colder than both its horizontal and vertical neighbours, as shown in Figure 3.13. However, the diurnal pattern is in agreement with neighbours and this is considered natural variability and reliable data.
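The flagging rule can be sketched for a single daily statistic as follows. This is a simplified illustration with hypothetical function names; it assumes the two input lists already exclude the site being checked and any site/days flagged by earlier tests, and that each group has at least four values.

```python
from statistics import mean, stdev


def trimmed_stats(values):
    """Mean and standard deviation after dropping the single lowest and highest value."""
    trimmed = sorted(values)[1:-1]
    return mean(trimmed), stdev(trimmed)


def is_suspect(value, horizontal_values, vertical_values, n_sd=5.0):
    """True when `value` lies more than n_sd standard deviations from BOTH group means."""
    for group in (horizontal_values, vertical_values):
        group_mean, group_sd = trimmed_stats(group)
        if abs(value - group_mean) <= n_sd * group_sd:
            return False  # consistent with at least one neighbourhood
    return True
```

Calling `is_suspect` with `n_sd=3.0` corresponds to the tighter threshold applied to sites already identified as erratic.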

Figure 3.13. Site FA0517, shown in red, is a high-elevation site with lower temperatures compared to both vertical and horizontal neighbours. On May 27, 2010, the mean is more than 5 standard deviations lower than the group mean, but visual review indicates this is acceptable.

In contrast, site FA0123 shows just a few days with unusually high values as seen in Figure 3.14. These are considered suspect data and flagged for exclusion.

Figure 3.14. Site FA0123, shown in red, normally agrees well with neighbours, but has some unusual data exceeding 5 standard deviations on March 22, 24, and 26, 2010. Data were deleted for these days.

3.3.7 Field notes manual review

Field notes indicated any obvious issues with the site itself or unusual data.

Sensor on ground: Any stations where it was noted that the sensor had fallen over were examined in relation to elevation and proximity neighbours as in the neighbour consistency check. Sensors at ground level record too-high maxima and too-low minima during summer, e.g., FA0618 in Figure 3.15, and show as snow burial in winter. Bad data still present are manually flagged as bad.

Figure 3.15. Site FA0618 shown in red has unusually high daytime maximum temperatures prior to August 7, 2007 when it was found lying on the ground during the annual site visit.

Bad data: If field notes indicate some suspicious data for a particular sensor download, and the sensor had any bad calibration test results, the neighbourhood consistency test is rechecked using a three standard deviation threshold. If data look suspicious, site/days are manually flagged as bad. An example of site FA0234 is shown in Figure 3.16. Some days are near normal, but overall data look suspect and are flagged as bad for the full month.

Figure 3.16. Site FA0234 field notes indicated data are suspect and that the sensor was replaced.

3.3.8 Final review

Apart from entire files or obvious bad data blocks excluded from the compiled file, data are flagged as bad rather than being deleted. A different flag is assigned for each quality control test failure. The final review looks at groups of data with flags turned on or off to verify that tests have correctly identified bad data and that no bad data remain. Any further bad data identified in this final review, most commonly seen in days on either side of failed test days or where very few days remain in a month, are also flagged as bad. As the topography of the FCA and the climate of southwestern Alberta show high variability, quality control checks were applied subjectively and sometimes leniently in order to retain interesting data; therefore, bad data may be retained as well. The final numbers of site/days per year flagged as bad are shown in Table 3.6. Bad data peaked in 2007 and 2008, with the majority of problems being entire files flagged for deletion, commonly a result of sensor failure or the site being damaged.

badDataFlag               2005   2006   2007   2008   2009   2010    Total
NULL                     39744  79188  75981  76771  78215  37716   387615
_constVal                   14     25      3      6      3      7       58
_partial                    29     25     36     39     51     31      211
_snowburial                 61    151    151    152      -      -      515
_spike                      34     94    105     86     98     54      471
_spike_partial               -      -      1      -      1      -        2
badCmnt_manualDelete       242    202     61      -      -    185      690
badSD                      378    479    974    446    519    346     3142
bulkDelete                2096   4233   7258   7328   5934   3626    30475
gndcmnt_manualDelete       115    344    279    239     40      -     1017
vizReview_manualDelete      58     98     14      -      -     77      247
Total                    42771  84839  84863  85067  84861  42042   424443

Table 3.6. Number of site/days failing each quality control test, by year. A null flag indicates good data.

On average, 91% of data were good and 9% of data were flagged as bad and excluded from the analysis. The percentage of data available from mountain, prairie, and urban sites is shown in Table 3.7. Sixty-five percent of both mountain and prairie sites have at least 90% complete data. Figure 3.17 shows a map indicating sites and percentage of available data. Sites with more than 90% available data are evenly distributed across the study area.

Table 3.7. Number of sites with different percentages of available data.

Figure 3.17. Map indicating the percentage of available (excluding missing and bad) data for each site, as shown by symbol size and colour.

3.4 Summary

The Veriteq data loggers used in the FCA study underwent calibration tests before being deployed in the field, as well as during the study and at takedown. Calibration tests indicate a cold bias of sensors relative to the WMO-standard weather research station temperature, but the bias is generally less than 0.3°C for mean temperatures. Some loggers have mean errors exceeding 2°C during periods of rainy weather. However, not all loggers are affected, and quality control procedures are able to identify these errors.

Sensors showed no consistent drift with time, therefore no correction is applied. While the loggers show little difference between day and night performance, the vented calibration test shows a warm bias under calm, sunny conditions. However, the bias was generally less than 1°C, was not consistent under the same conditions, and a correction is difficult to apply without in situ wind and radiation data. Therefore no correction is applied, but this source of error may affect maximum temperatures at sheltered sites on calm, clear summer days.

Automatic and manual quality control procedures identified bad data. Types of errors include random sensor malfunctions where sensors read too high or low for a few hours. These types of errors are identified by time series analysis and comparisons with neighbouring sensors, and affect short periods of data. Longer periods of bad data occur when sensors permanently malfunction or stations are buried by snow or are knocked or blown down. These bad data were identified during data downloads and by comparisons with neighbouring sensors.

Overall, ~9% of the data collected at the ~220 sites between 2005 and 2010 are missing or bad, and are excluded from further analysis. Missing or bad data affect more than 70% of the sites, and missing data are distributed randomly in the study area. While the dense station network provides some redundancy, and the percentage of missing data is not high, gap filling to create a complete dataset has benefits for applications requiring monthly means or for creating interpolated temperature surfaces. Gap filling methods are discussed in Chapter 7.

4. SYNOPTIC WEATHER PATTERNS IN SOUTHWESTERN ALBERTA

Canadian weather is dominated by the westerly wind belt of the mid to high latitudes and the polar easterlies in the high Arctic. Upper tropospheric winds determine the movement of the air masses affecting daily surface weather across the region. Air masses situated over western Canada in winter include continental polar (cP), continental arctic (cA), and maritime polar (mP). In summer, maritime polar air dominates with incursions of continental tropical (cT) and continental polar (cP) air. Godson (1950) determined the average temperatures of summer and winter air masses at the 850 hPa pressure level between 45 and 50°N in North America as follows: in winter, cA = −31°C, cP = −18°C, and mP is between −1 and +5°C; in summer, mP = +11°C and cT = +22°C.

The position of the polar front determines the air mass in place. During summer, with the polar front further north, mP air is most common in southwestern Canada. During winter, cP/cA air occurs when the polar front moves south during meridional (north-south) flow patterns and mP air enters the area when the polar front is further north, under a zonal (west-east) flow pattern, or when southerly winds accompany ridging events. Under zonal flow conditions, western Alberta experiences warmer, dry winters as moist Pacific air is modified by its passage over the Rocky Mountains, losing much of its moisture on the windward side and warming adiabatically as it descends to the prairies. A ridge of high pressure is often located above the mountain range and a trough downwind. Therefore, the Rocky Mountains are a source area for lee cyclogenesis and chinook winds (Chung et al., 1976; Stewart et al., 1995; Nkemdirim, 1996; Flesch and Reuter, 2012). In addition, the mountains block cold Arctic air from moving west and can result in extended periods of cold weather in Alberta.

In the next sections I describe several typical weather patterns that occur in southern Alberta. This is not exhaustive of the weather systems that are possible, and in reality a given day may include a mixture of the conditions described below. These are nonetheless the main large-scale weather patterns that frequent the study region.

4.1 Cold dry weather

Under meridional flow conditions, a trough of low pressure extends well south, bringing frigid dry air to the prairies. The air may be continental arctic (cA) or continental polar (cP) air. Arctic air originates over snow- and ice-covered surfaces in northern Canada, Alaska, Eurasia, and the Arctic Ocean in winter. While the air is dry, it can still produce light snow showers, and the frequency of cA events contributes to the frequency of days with precipitation on the prairies without contributing much to the amount of precipitation (Nkemdirim, 1988). Northerly winds bring in cold, dense air and the subsiding cold air causes surface pressure to rise, resulting in a shallow cold-core surface anticyclone with low pressure aloft. The atmospherically-stable Arctic air is associated with clear conditions. Strong surface cooling at night produces extremely low minimum temperatures. This can generate a persistent low-level temperature inversion as subsiding air warms but is stably stratified, creating a warmer layer above the cold surface layer. In addition, the dense dry air from the northeast is shallow and can push under the warmer westerly flow, resulting in upper-air inversions. Therefore, rapid temperature changes are possible over short horizontal and vertical distances.

Cold air events, during which inversions commonly occur, are a frequent weather type during winter in southern Alberta. These air masses move in from the north or the east, and the blocking effect of the Rocky Mountains can limit the westward penetration of the cold air, adding to the spatial variability of these events. Cullen and Marshall (2011) found negative surface temperature anomalies were highest in the eastern prairies and lower closer to the continental divide during this type of weather event.

Two examples of Arctic air incursions are shown in Figure 4.1. On January 4, 2005, under a weaker trough, the air mass was unable to completely dislodge the warmer air in the west, allowing Banff to record a daily maximum temperature of −11°C, 2°C warmer than in Calgary. However, a strong upper level trough was present during the event occurring around January 13, 2005, allowing the event to be stronger, last longer, and extend further south and west. A comparison of surface conditions for the two events is shown in Table 4.1. The January 13 event was accompanied by a stronger northerly wind, lower specific humidity, higher pressure, and colder temperatures. Calgary recorded a minimum temperature of −33.9°C and that at Banff was −41.9°C (Environment Canada, 2016). Average daily minimum and maximum temperatures for Calgary in January are −15.1°C and −2.8°C respectively. Both 2005 Arctic air events had temperatures at least 10°C below normal, while the daily range remained in the normal range of ~12°C. A strong north wind ushered in the cold air on January 13, after which the winds subsided, allowing radiative heating of the air mass under clear skies. However, Arctic events are sometimes accompanied by persistent northerly winds, continuously refreshing the cold air, in which case the daily range is reduced. A positive surface pressure anomaly of 1.4 kPa was observed on January 14.

Figure 4.1. 500 hPa geopotential heights (m) for cA conditions on January 4 and 13, 2005.

date      maxT(C)  minT(C)  meanRH(%)  meanP(kPa)  uWE(km/h)  vNS(km/h)  q(g/kg)
20050103    -13.5    -26.3         69       89.93        1.3        3.6      0.7
20050104    -15.4    -26           71       90.07        1.1      -12.2      0.6
20050105     -3      -21.5         62       88.74        4.5       -8.3     1.13
20050112    -16.7    -31.8         68       88.69        4.6       24.9     0.42
20050113    -22.2    -33.9         57       89.74        2.8        0.5     0.24
20050114    -20.1    -31.4         56       90.2         5.5       -4.5     0.28

Table 4.1. Surface daily meteorological measurements (T - temperature, RH - relative humidity, P - pressure, uWE and vNS are east-west (west positive) and north-south (north positive) wind vectors, q - specific humidity) at Calgary for cA conditions on January 4 and 13, 2005 and surrounding days.

Continental Arctic air (cA) events occurring from 1990 to 2010 were examined to get an idea of the frequency and character of these weather systems. Arctic air events were defined as days having a minimum temperature less than −15°C and at least 10°C below the normal minimum for the month. The analysis is not intended to be a rigorous evaluation of all cA events, but to highlight some features for this type of weather system. On average, 17 days per year recorded cA events, generally from November to March. There are occasional occurrences in April and October. A high of 52 cA days occurred in 1996 and no cA days were recorded in 1999. The highest average number of cA days per month (4) occurs in January; 14 cA days occurred in January 1996 and March 2002. The majority of events last 1 or 2 days, but these systems can persist for more than a week; one event over this 20-year period lasted 11 days.
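The cA day definition and the event durations can be illustrated with a short sketch. It assumes the threshold is −15°C (the sign is taken from the winter context) and that daily minima are held in chronological order; the function names are hypothetical.

```python
def is_ca_day(tmin, normal_tmin, abs_limit=-15.0, anomaly=10.0):
    """True when tmin is below abs_limit and at least `anomaly` below the monthly normal."""
    return tmin < abs_limit and tmin <= normal_tmin - anomaly


def event_lengths(flags):
    """Lengths of consecutive runs of True in a chronological list of daily flags."""
    lengths, run = [], 0
    for flag in flags:
        if flag:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths
```

With the January normal minimum of −15.1°C quoted above, the Calgary minimum of −26.3°C on January 3, 2005 qualifies as a cA day, whereas −21.5°C on January 5 does not, because it is less than 10°C below normal.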

These statistics illustrate the variability inherent in cA weather systems. None of the daily minimum, maximum, or mean temperatures reported by Environment Canada at Calgary airport is normally distributed during winter months. In all cases the data are negatively skewed, with means 1 to 3°C below the median, indicating the influence of strong cA events on Calgary’s climate normals.

4.2 Chinook winds

Large mountain ranges positioned 90° to the predominant wind direction can cause strong downslope winds known as chinooks in western Canada. Westerly chinook winds are warm and dry, and can result in large temperature swings of greater than 30°C in a day (Nkemdirim, 1988), especially if the warmed Pacific air is replacing cold Arctic air (Hare, 1974). Chinook events are variable in space and time, but the conditions under which they occur are associated with specific synoptic patterns. They originate as a moist westerly wind off the Pacific. As the air rises on the windward side of the mountains, it cools at the wet adiabatic lapse rate as it loses moisture. When it descends in the lee of the mountains it is much drier, so warms at the dry adiabatic rate.
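The net warming produced by moist ascent followed by dry descent can be illustrated with a simple calculation. This is a minimal sketch under strong simplifying assumptions: the air is taken to be saturated for the whole ascent, the moist adiabatic rate is fixed at an assumed representative 6.0°C per km (in reality it varies with temperature and pressure), and the elevations are hypothetical round numbers rather than values from the study area.

```python
DRY_LAPSE_C_PER_KM = 9.8    # dry adiabatic lapse rate
MOIST_LAPSE_C_PER_KM = 6.0  # assumed representative moist adiabatic rate


def lee_temperature(t_windward_c, windward_elev_km, crest_elev_km, lee_elev_km,
                    moist_rate=MOIST_LAPSE_C_PER_KM,
                    dry_rate=DRY_LAPSE_C_PER_KM):
    """Temperature of a parcel after saturated ascent to the crest and
    unsaturated descent to the lee-side elevation."""
    # Ascent: cooling at the (slower) moist adiabatic rate.
    t_crest = t_windward_c - moist_rate * (crest_elev_km - windward_elev_km)
    # Descent: warming at the (faster) dry adiabatic rate.
    return t_crest + dry_rate * (crest_elev_km - lee_elev_km)


# Hypothetical case: 0 deg C air at 1 km rises to a 3 km crest and
# descends back to 1 km on the lee side.
t_lee = lee_temperature(0.0, 1.0, 3.0, 1.0)
print(round(t_lee, 1))
```

With these assumed rates the parcel arrives about 7.6°C warmer than it started, purely because it cools more slowly on the way up than it warms on the way down.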

Nkemdirim (1996) identified chinook days as having a higher than normal daily maximum temperature, a rapid drop in relative humidity, wind speed > 4.5 m s⁻¹ and wind direction between SSW and WNW. Chinook winds were found to occur mostly in winter. Chinook winds have a wave-like structure (wavelength varying from 60-70 km), with the highest temperature increases associated with the trough of the wave where it touches the ground. Nkemdirim (1996) found a greater variation in the strength of the anomaly than in the frequency of occurrence, indicating a chinook event passes through the entire area, but the position of the wave varies, making for extreme spatial and temporal temperature variations during a chinook event. Cullen and Marshall (2011) also found that temperature anomalies during chinook events were spatially variable, with lower anomalies closer to the continental divide. The strongest winds are experienced where the barrier is narrow or winds are channeled through a gap in the barrier (Whiteman, 2000), as occurs in southern Alberta.

Chinook events are embedded in the background westerly flow, and while they may resemble the arrival of a warm front, the increased temperatures are more a result of the strong adiabatically warmed westerly flow, rather than the warm air mass following a warm front. In addition, relative humidity is lower during a chinook event. Figure 4.2 shows the synoptic pattern typical of a strong chinook event that occurred on January 30, 2009.

Figure 4.2. 500 hPa geopotential heights (m) for chinook conditions on January 30, 2009.

Surface meteorological measurements illustrating the chinook event are shown in Table 4.2. Days preceding and following the event are included in the table. Of note are the change to a strong southwest wind on January 30, a rapid rise in the mean daily temperature, and a drop in relative humidity.


date      maxT(C)  minT(C)  meanT(C)  meanRH(%)  meanP(kPa)  uWE(km/h)  vNS(km/h)
20090128  2.2      -6.3     -1.5      43         88.75       23.9       -0.6
20090129  8        -6.3     2.1       47         88.91       21.5       -3
20090130  10       5.8      7.9       35         88.44       25.5       -14.2
20090131  5.5      -8.1     -2.6      50         88.37       18.7       6.7

Table 4.2. Surface daily meteorological measurements at Calgary for chinook conditions on January 30, 2009 and surrounding days.

Table 4.3 shows 6-hourly surface measurements for this chinook event. Prior to the chinook, relative humidity is above 50%, the wind is calm, and the temperature is about −5°C. At the onset of the chinook around midday on January 29, the wind switches to the west and, at the peak of the chinook 24 hours later, relative humidity reaches a low of 27%, wind is strong southwesterly, and the temperature reaches a maximum of 9.3°C.

date      hour  T(C)  RH(%)  P(kPa)  uEW(km/h)  vNS(km/h)  Weather
20090129  0     -5.5  54     89.27   0          0          Cloudy
20090129  6     -4.2  57     88.89   0          -4         Cloudy
20090129  12    6.6   36     88.8    22         0          Cloudy
20090129  18    6.7   41     88.86   43         0          Mainly Clear
20090130  0     7     40     89      41         0          Mostly Cloudy
20090130  6     7.7   36     89      26.3       -9.6       Cloudy
20090130  12    9.3   27     88.57   27.7       -16        Cloudy
20090130  18    6.5   40     87.93   12         -20.8      Mainly Clear
20090131  0     5.5   45     87.52   32.9       12         Cloudy
20090131  6     -4.8  79     88.2    5.4        -4.5       Clear
20090131  12    0.1   44     88.38   16.4       -9.5       Mostly Cloudy
20090131  18    -3    37     88.59   27.6       -4.9       Mostly Cloudy

Table 4.3. Surface 6-hourly meteorological measurements at Calgary for chinook conditions on January 30, 2009 and surrounding days.

4.3 Frontal conditions

Any time one air mass gives way to another, a frontal zone of unstable conditions may be present. Pressure, temperature, and wind speed and direction change rapidly. The amount of change occurring during the frontal passage depends on the temperature/density contrasts between the air masses. As these are greater during winter, winter fronts are more easily identified. Prior to the arrival of a cold front, temperatures are warm in a westerly or southerly flow, with clear or cloudy conditions. As the front passes through, winds become northerly, clouds and precipitation move in, and pressure and temperature drop. After the passage of the cold front, pressure rises, wind speeds and precipitation decrease, specific humidity is low, and temperatures remain cold.

Between January 26 and 28, 2008, moderate Pacific air gave way to cold Arctic air. The front passed through during the morning of January 27. Winds became northerly, clouds and snow moved in, and pressure and temperature dropped. Arctic air settled in on January 28 with a rise in pressure, wind speeds and snow decreasing, low specific humidity, and temperatures remaining very cold. Figure 4.3 shows hourly surface measurements of humidity, temperature, pressure, and u (east-west) and v (north-south) wind vectors.

Figure 4.3. Surface hourly meteorological measurements at Calgary for a cold front passing through on the morning of January 27, 2008. East-west winds are plotted in red, with positive values indicating a west wind, and north-south winds are shown in blue, positive indicating north winds. Temperature is shown in black. The green line marks the approximate time of the front.

Walker (1961) found that sites at different elevations show more microclimate effects and greater variations during frontal situations. Inversions may exist due to the low angle of the Arctic front, where higher elevation sites sit in warmer Pacific air and lower sites are in the colder Arctic air.


4.4 Cool wet weather

Cool wet days are associated with low pressure systems. Cyclones develop in the lee of the Rocky Mountains year round (Chung et al., 1976), or they migrate eastwards from the Pacific, passing through the Rockies in southern Alberta or northern . Their strength intensifies when they occur in conjunction with a deep mid troposphere trough, with associated upper level divergence. The uplift associated with the low pressure system generates widespread precipitation. Low level easterly upslope flow enhances precipitation, often in the form of spring snowstorms (Stewart, 1995). Cyclonic spring and early summer rains in Alberta can bring days of continuous precipitation (Nkemdirim, 1988). In addition to widespread stratiform precipitation, storms with thunder and hail often accompany these storm events due to forced convection (Flesch and Reuter, 2012).

Extensive rain and flooding occurred in June 2005 during four cyclonic storm events in southern Alberta. The maximum point-measured rainfall (248 mm) occurred from June 5-8 (Flesch and Reuter, 2012), as a low-pressure system stalled over the region. The location and strength of the cyclone at the 500 hPa pressure level on June 8 is shown in Figure 4.4 and surface meteorological measurements at Calgary for the duration of the storm are shown in Table 4.4. The diurnal temperature range is reduced, with maximum temperatures up to 10°C below normal (20°C) and minimum temperatures close to normal (7°C) for June. Surface winds are light northeasterly, but upper air winds are southwesterly, bringing in humid mT air from the south. As expected, relative humidity is very high, close to 100%.

Figure 4.4. 500 hPa geopotential heights (m) for a cyclonic rain event on June 8, 2005.


date      maxT(C)  minT(C)  meanRH(%)  meanP(kPa)  uWE(km/h)  vNS(km/h)  Precip(mm)  q(g/kg)
20050605  14.2     10.3     92         88.58       -10.7      0.5        36.6        8.81
20050606  11.9     8.4      89         88.61       -2         9.4        18          7.6
20050607  8.5      5.8      98         88.55       -13.8      23.5       46.2        7.18
20050608  12.3     5.8      89         88.73       -1.9       7.3        10.2        7.07

Table 4.4. Surface daily meteorological measurements at Calgary for a cyclonic rain event occurring between June 5-8, 2005. q is specific humidity.

4.5 Hot, high-pressure ridges

Surface heating over the southern United States in summer creates a shallow thermal surface low pressure. Upper-level ridging allows warm continental tropical (cT) air to be advected from the south into Alberta. Anticyclonic flow prevails over southwestern Alberta, with high pressure and subsidence. Specific humidity is high for the prairies; however, this is a function of warm air being able to hold more water vapour, rather than indicating high relative humidity. Daytime heating also drives evaporation and can cause instability; therefore, there may be some precipitation associated with local convective uplift, but amounts are generally small. Surface winds are light and southerly. Clear skies allow for a higher than normal diurnal temperature range. Under the calm, clear conditions accompanying unseasonably warm weather, daily maximum temperatures have a similar distribution to shortwave radiation values (McCutchan, 1976). As an example, warm conditions associated with this pattern occurred from August 15-18, 2008. The synoptic pattern is shown in Figure 4.5.

Figure 4.5. 500 hPa geopotential heights (m) for warm conditions on August 18, 2008.


Daily meteorological measurements are shown in Table 4.5. Minimum temperatures are up to 5°C above long-term normals and the maximum temperature on the 18th was 10°C above normal. Specific humidity (q) is high for the prairies, with values of ~10 g/kg, above the average of 8 g/kg. Relative humidity is moderate, between 40-60% compared to an average 63%. Surface winds are light from the southeast. The diurnal range is higher than normal, 15-20°C compared to 13°C, indicating clear skies.

date      maxT(C)  minT(C)  meanT(C)  meanRH(%)  meanP(kPa)  uWE(km/h)  vNS(km/h)  Precip(mm)  q(g/kg)
20080815  28.3     11.3     20.6      57         89.83       -4.6       -10.3      0           8.96
20080816  28.6     12       22        53         89.28       -2.4       6.8        0           9.31
20080817  29.5     14.8     22        60         89.13       -4.7       -11.2      2.4         10.67
20080818  33.7     14.9     24.2      48         88.51       -4.1       -14.3      0           9.29
20080819  25.8     16.4     20.9      52         88.12       2.8        2.2        0.2         9.05
20080820  21.4     11.8     16.6      61         87.93       5.3        -3.3       0.2         8.01

Table 4.5. Surface daily meteorological measurements at Calgary for unseasonably warm conditions from August 15-18, 2008.

Calgary minimum, maximum, and mean temperatures for summer months from Environment Canada for the period 1990 to 2010 are near normally distributed with similar mean and median values. This indicates that warm events do not affect average temperatures as significantly as cold events affect winter temperatures, due to both frequency and intensity of the events. Barry and Chorley (2010) report an average temperature of 25°C for cT air at the 850 hPa pressure level, compared to 18°C for interior mP air. Therefore, there is not such a large difference in temperature between the contrasting summer air masses.

For the period 1990 to 2010, unseasonably warm summer days were defined as days having a maximum temperature greater than 25°C and an average temperature at least 5°C higher than the monthly average. On average, there are 13 unseasonably warm summer days per year. Summer months (JJA) average 2-3 warm days per year, September averages 4 unseasonably warm days per year, and 1.5 warm days occur in May. There is only a very small positive skew in the distribution of monthly mean and maximum temperatures for May and September, indicating unseasonably cold weather likely occurs just as often during late spring and early fall, due to transitional weather patterns.


4.6 Summary

Surface climate measurements vary in response to the prevailing weather type. Selected weather types occurring in southern Alberta and qualitative surface and 500-hPa pressure level conditions typical of each are summarized in Table 4.6. Characteristics of southern Alberta weather types inform a weather classification system, described in Chapter 5, to classify the prevailing daily weather system during the FCA study period. In reality, of course, weather system characteristics are not always ‘textbook’; the intensity of conditions varies within each weather type, adding to the complexity in identifying the surface signal associated with different weather systems. Moreover, most of these weather types occur year-round, but their seasonal signatures can vary (for example, summer chinooks occur but do not necessarily produce warmer conditions in the warm months).

Wind Surface Diurnal Weather 500 hPa Pressure direction Relative Specific temperature temperature type pattern anomaly and humidity humidity anomaly range strength strong strong Cold dry low/average positive N/NE low trough negative strong strong Chinook ridge low positive W/SW Transition variable high variable large cyclone Cool wet in negative low negative E high high southern Alberta Hot, high- strong pressure positive high positive S/calm high ridge ridge Table 4.6. Selected qualitative surface meteorological characteristics associated with different weather types in southern Alberta.


5. STATISTICAL WEATHER CLASSIFICATION FOR SOUTHWESTERN ALBERTA

In this chapter I discuss the development of a weather classification system for southwestern Alberta. First, classification methods and general weather types are reviewed. This is followed by a discussion on data selection and preprocessing prior to classification using statistical methods. Both automated (principal component analysis and cluster analysis) and hybrid (classifying a training subset for use with discriminant function analysis) classification methods are tested. Results from each method are analysed and final weather types defined and described. Lapse rates are calculated for each weather type using FCA data, and the spatial distribution of mean daily temperature for each type is discussed.

5.1 Introduction

Weather type classifications differ in terms of the methods and input variables used, and the objects being classified. Huth et al. (2008) termed methods using surface measurements only as weather typing, and methods using upper-air measurements as air-mass classifications. In the same review, three major forms of classification are identified, namely subjective, mixed, and objective. Subjective methods involve the manual specification of spatial patterns (Pielke et al., 1987) or meteorological criteria (Nkemdirim, 1996; Cullen and Marshall, 2011) associated with known weather types or air masses. Objective or automated methods (e.g., Huth and Nemesova, 1995; Cheng et al., 2010) apply statistical clustering methods to principal components extracted from multiple surface and upper air meteorological measurements. Mixed or hybrid methods (e.g., McCutchan, 1978; Sheridan, 2002) manually classify a subset of data and then generate discriminant functions based on surface and upper-air measures to automatically classify the remaining data. Both manual and hybrid classifications rely on prior knowledge of the number of classes or weather types, whereas objective methods use statistical tests to identify the optimal number of classes.

Climate variables used for weather type classification should indicate the characteristics of the air mass, e.g., temperature and humidity, and flow characteristics, e.g., wind direction and speed. Commonly used variables include temperature, dew point temperature (DPT), wind direction and speed, sea level pressure, and cloud cover (Davis and Kalkstein, 1990; Cheng et al., 2010). DPT rather than relative humidity is used as it is less sensitive to changes in temperature (Davis and Kalkstein, 1990). Alternatively, specific humidity (q) can be used. Sea level pressure and wind indicate flow patterns, and cloud cover gives some indication of solar radiation and precipitation. In addition, Sheridan (2002) included daily DPT, sea level pressure range, and wind direction change to identify transition or frontal days. On occasion, upper-air variables have been used as well (e.g., Cheng et al., 2010), but these did not improve final clusters.

Surface meteorological variables such as wind, humidity, pressure, and temperature all include a seasonal signal. Therefore, to create a classification system applicable year round, it is first necessary to remove the seasonal signal from the variables. This allows a classification to be developed using anomalies rather than absolute values. Data values can be converted to z-scores (standardised) by subtracting the population mean and dividing by the population standard deviation. This assumes the population mean and standard deviation are known. Sheridan et al. (2013) calculated spatial anomalies for surface sea level pressure by subtracting the daily area mean from gridded daily values. This produced daily spatial anomaly maps used to identify circulation patterns. Davis and Kalkstein (1990) removed the latitudinal signal by subtracting 6- to 8-year means of three-hourly measurements. The authors suggest that 5- to 10-year means are sufficient to indicate ‘normal’ values and the choice of time period is not critical when the purpose is simply to provide a baseline from which anomalies will be calculated.
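The z-score conversion described above can be sketched in a few lines. The example treats a baseline sample as the population, so it uses the population standard deviation; the function name and the numbers are made up for illustration and are not taken from any of the cited studies.

```python
from statistics import fmean, pstdev


def z_scores(values, baseline):
    """Standardise `values` using the mean and population standard
    deviation of `baseline`."""
    mu = fmean(baseline)
    sigma = pstdev(baseline)
    return [(v - mu) / sigma for v in values]


# Made-up baseline with mean 5 and population standard deviation 2.
baseline = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# A value equal to the mean maps to 0; values two standard deviations
# above or below the mean map to +2 and -2.
z = z_scores([5.0, 9.0, 1.0], baseline)
print(z)
```

Because each variable is rescaled by its own spread, standardised temperature, pressure, and wind anomalies become directly comparable even though their original units differ.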

Enke and Spekat (1997) calculated daily means using a 17-day moving average for data from 1951-1994 in order to calculate the annual cycle of meteorological variables. Using a moving average reduces the effects of short-term fluctuations when the data collection period is short. Cheng et al. (2010) standardised hourly surface measurements using mean and standard deviation of the dataset to maintain seasonality, but remove scale effects. However, their study was only looking at summer rain events, using data from April to November.


5.1.1 Principal Component Analysis

Daily weather variables are highly correlated, for example temperature is strongly inversely correlated with relative humidity. In addition, the use of hourly measurements introduces correlation between time measurements of the same variable, e.g., temperature at 06:00 is generally correlated with temperature at 03:00. Principal component analysis (PCA) reduces multiple correlated variables to orthogonal (uncorrelated) components while maintaining the variability present in the original variables. The components are extracted based on correlations between variables and should indicate underlying processes in the data, whereby related variables vary simultaneously under the same process. Therefore, the resulting components should have some physical meaning as indicated by which variables load highly on which components. In order to correctly identify underlying processes, all variables identifying the processes should be included in the analysis; otherwise some variables may have higher importance than reality or load on incorrect components.

As PCA is often used for exploratory analysis, it is possible to use data that do not strictly meet the requirements for analysis, e.g., normality, linearity, few outliers, and little multicollinearity, although a better solution results when the data do meet the requirements (Tabachnick and Fidell, 1996). However, the dataset must include sufficient observations (1000 is excellent, Tabachnick and Fidell, 1996) and there must be some significant correlations (> 0.3) between variables for PCA to be effective.

Difficulties with the PCA method are the number of choices to be made in doing the analysis and the lack of objective measures of how good the solution is. Decisions to be made include: method used to extract components, whether or not to rotate components, the rotation method, and the number of components to retain. Different measures and tests are available to aid in making objective decisions during the analysis, e.g., scree plots to select the number of components, one of the more influential decisions to be made. A common criterion is to retain components with eigenvalues greater than one. Retaining more components yields a solution explaining more of the variance, but results may be difficult to interpret. Loadings indicate the correlation between variables and components, and high loadings are used to explain the variables which identify components/processes.
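To make the extraction and retention steps concrete, the following is a minimal sketch of an unrotated PCA on the correlation matrix, retaining components with eigenvalues greater than one. It uses NumPy only and synthetic data: four made-up variables built as two correlated pairs, so two components are expected to be retained. Variable names, the random seed, and the `pca_kaiser` helper are all assumptions for illustration; no rotation is applied, and this is not the software used in the thesis.

```python
import numpy as np


def pca_kaiser(X):
    """Unrotated PCA on the correlation matrix of X (rows = days,
    columns = variables), keeping components with eigenvalue > 1."""
    # Standardise each variable (population standard deviation).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    R = np.corrcoef(Z, rowvar=False)
    # eigh is appropriate for a symmetric matrix; sort descending.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    # Loadings: correlations between variables and retained components.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    # Component scores for each day.
    scores = Z @ eigvecs[:, keep]
    explained = eigvals[keep] / eigvals.sum()
    return eigvals, loadings, scores, explained


# Synthetic anomalies: temperature and relative humidity are inversely
# related; wind and pressure form a second, independent pair.
rng = np.random.default_rng(0)
n = 500
temp = rng.normal(size=n)
rh = -0.8 * temp + 0.6 * rng.normal(size=n)
wind = rng.normal(size=n)
press = 0.7 * wind + 0.7 * rng.normal(size=n)
X = np.column_stack([temp, rh, wind, press])

eigvals, loadings, scores, explained = pca_kaiser(X)
print("eigenvalues:", np.round(eigvals, 2))
print("variance explained by retained components:", np.round(explained, 2))
print("loadings:\n", np.round(loadings, 2))
```

In the loadings matrix, each retained component should load highly on one of the two variable pairs, which is the sense in which components are interpreted as underlying processes.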


5.1.2 Cluster Analysis

Component scores from the PCA are used as input for a cluster analysis in weather typing. Cluster analysis (CA) is used to group observations based on similarity amongst variables. There are two broad categories of clustering methods, hierarchical and non-hierarchical. Hierarchical methods may be agglomerative (each observation starts as its own cluster and groupings are gradually formed) or divisive, where each record is gradually separated from the full group. A measure of similarity or dissimilarity is defined, but the number of clusters need not be known a priori. Non-hierarchical methods require the specification of initial cluster centers and number of clusters. Non-hierarchical methods allow objects to move between clusters during the process. As for PCA, there are several decisions to be made when running a CA, e.g., cluster method, number of clusters, whether to use PCA components or convert back to surface measurements.

Schoof and Pryor (2001) tested five different clustering techniques applied to PC scores and found the average linkage method (the average squared Euclidean distance as computed between all members of each cluster) performed the best without grouping too many records into a single cluster. The optimum number of clusters was selected based on an inflection point on a plot showing number of clusters and separation between clusters. In addition, the clusters had to represent interpretable synoptic patterns.

Kalkstein et al. (1987) compared three hierarchical, agglomerative methods, namely: Ward’s clustering, centroid, and average linkage, and also found average linkage to produce the best compromise between the equal-sized groups produced by Ward’s clustering and the centroids method, which produced one large group and many small groups with very few members. Other studies (Davis and Kalkstein, 1990; Davis and Walker, 1992) have used a two-step clustering procedure, where average linkage is used to generate cluster means and number of clusters and this is used as input into k-means clustering to produce the final clusters, thereby minimizing within-group variability and maximizing between-group variability.
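The second stage of the two-step procedure, k-means started from cluster means supplied by an earlier hierarchical step, can be sketched as follows. This is an illustration only: the hierarchical (average linkage) stage is not implemented, the seed centres are hypothetical stand-ins for its output, and the component scores are synthetic.

```python
import numpy as np


def kmeans(scores, init_centers, max_iter=100):
    """Basic k-means: assign each day to the nearest centre, recompute
    centres, and repeat until the centres stop moving."""
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(max_iter):
        # Squared Euclidean distance from every day to every centre.
        d2 = ((scores[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # New centre = mean of members; an empty cluster keeps its centre.
        new_centers = np.array([
            scores[labels == k].mean(axis=0) if np.any(labels == k)
            else centers[k]
            for k in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic two-component scores for three well-separated groups of days.
rng = np.random.default_rng(1)
true_means = np.array([[-3.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
scores = np.vstack([m + 0.5 * rng.normal(size=(50, 2)) for m in true_means])

# Hypothetical seed centres, standing in for means from a first-stage
# average linkage clustering.
seeds = np.array([[-2.0, 0.5], [2.0, -0.5], [0.5, 3.0]])

labels, centers = kmeans(scores, seeds)
print(np.bincount(labels))
print(np.round(centers, 2))
```

Seeding k-means in this way removes its dependence on random starting centres while still letting days move between clusters, which hierarchical methods do not allow.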


5.1.3 Discriminant Function Analysis

Discriminant function analysis (DFA) creates functions based on independent variables which are best able to differentiate between records in classes as defined by the dependent variable. These functions indicate which variables are important discriminators and can be used to classify new records. DFA can be used in combination with PCA and CA. For example, Cheng et al. (2010) showed that DFA, applied after weather types have been classified using PCA and CA, generates clusters with smaller within-cluster variance and greater between-cluster variance. Alternatively, days can be manually classified and discriminant functions derived using multiple independent variables. Two methods of DFA are available: the “enter” method (Sheridan, 2002; Kalkstein et al., 1996) uses all independent variables as determined by the user, and the “step-wise” method uses statistical tests to determine the significant variables (McCutchan, 1978).
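As an illustration of the hybrid approach, the sketch below fits linear discriminant functions to a manually labelled training set and uses them to classify new days, with all supplied variables entered at once. It assumes a pooled within-class covariance matrix and priors equal to the class proportions, which is one common formulation of linear discriminant analysis; the training data, class labels, and function names are synthetic or hypothetical, and step-wise variable selection is not shown.

```python
import numpy as np


def fit_lda(X, y):
    """Fit one linear discriminant function per class using a pooled
    within-class covariance matrix."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Residuals of each record from its own class mean.
    resid = X - means[np.searchsorted(classes, y)]
    pooled = resid.T @ resid / (len(X) - len(classes))
    priors = np.array([(y == c).mean() for c in classes])
    inv = np.linalg.inv(pooled)
    coef = means @ inv
    intercept = -0.5 * np.einsum("ij,ij->i", coef, means) + np.log(priors)
    return classes, coef, intercept


def predict_lda(model, X):
    """Assign each record to the class with the highest discriminant score."""
    classes, coef, intercept = model
    return classes[(X @ coef.T + intercept).argmax(axis=1)]


# Synthetic training days described by two anomalies:
# [temperature anomaly (deg C), relative humidity anomaly (%)].
rng = np.random.default_rng(2)
chinook = rng.normal([8.0, -15.0], 3.0, size=(40, 2))    # warm and dry
cold_dry = rng.normal([-12.0, -5.0], 3.0, size=(40, 2))  # cold
X_train = np.vstack([chinook, cold_dry])
y_train = np.array(["Ch"] * 40 + ["CD"] * 40)

model = fit_lda(X_train, y_train)

# Two new, unlabelled days.
new_days = np.array([[9.0, -18.0], [-15.0, -2.0]])
pred = predict_lda(model, new_days)
print(pred)
```

The coefficients in `coef` indicate how strongly each variable contributes to separating the classes, which is how discriminant functions identify the important discriminators.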

5.1.4 Meteorologically-based (subjective) classification

Regardless of the classification method used, it is still necessary to have some idea of expected weather types. Local weather types rather than regional air masses have been found to be more appropriate when studying local climate variability (Pepin et al., 1999), therefore I use this type of classification. Five typical weather patterns occurring in southern Alberta were described in detail in Chapter 4. Here I summarise them briefly again. Normal days (Nl) represent average conditions.

Cold dry weather (CD)

Cold air originates over snow- and ice-covered surfaces in the Arctic region in winter and enters Alberta from the north. The atmospherically-stable Arctic air is associated with clear conditions. Therefore, strong surface cooling at night can produce extremely low minimum temperatures. This can generate persistent surface temperature inversions as subsiding air warms, creating a warmer layer above the cold surface layer. In addition, the cold, dense cP air mass is typically shallow, and can push under the warmer westerly flow, resulting in upper air inversions.


Chinook winds (Ch)

Chinook winds are warm and dry westerly winds descending from the Rocky Mountains, and are characterized by a rapid rise in temperature and decrease in relative humidity. Descending air on the lee side of the mountains is much drier, so warms at the dry adiabatic rate. Chinook winds occur mostly in winter and have a wave-like structure, with the highest temperature increases associated with the trough of the wave where it touches the ground. Strong spatial and temporal temperature variations occur during a chinook event.

Transition conditions (Tr)

Any time one air mass gives way to another, a frontal zone of unstable conditions may be present. Pressure, temperature, wind speed and direction change rapidly. The amount of change occurring during the frontal passage depends on the temperature/density contrasts between the air masses. As these are greater in general during winter, winter fronts are more easily identified. In addition, warm fronts are difficult to distinguish from chinook events – in a sense, chinooks are a type of warm front, but they are not a classical air mass frontal interaction (e.g., Ahrens, 2008). Therefore, for the purposes of the FCA weather type classification, transition days refer to cold fronts.

Cool wet weather (CW)

Cool wet days are associated with low pressure systems. A surface low pressure, in conjunction with an upper level trough, may get blocked over the region by high pressure to the north (Flesch and Reuter, 2012). This can produce an extended period of widespread precipitation. Uplift due to the mountains can result in upslope storms producing heavy rain or snow in the spring.

Hot, high-pressure ridges (Ht)

Upper-level ridging allows warm, continental tropical air from the southern United States to be advected into Alberta. Anticyclonic flow prevails over southwestern Alberta with high pressure and subsidence. These high-pressure ridges can be persistent, bringing clear skies and anomalous warmth. Specific humidity is high for the prairies; however, this is a function of warm air being able to hold more water vapour, rather than indicating high relative humidity. Surface winds are light and southerly. Clear skies allow for a higher than normal diurnal temperature range.

5.2 Methods

5.2.1 Variable selection and data preparation

Daily weather classification is based on daily and hourly surface weather measurements available from Environment Canada (EC) at Calgary airport and the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) Reanalysis data set (Kalnay et al., 1996) as shown in Table 5.1. Wind speed and direction are converted to east-west (u) and north-south (v) vectors using the equations:

u = WS * sin (WD)

v = WS * cos (WD) where WD is wind direction in degrees, measured from north, and WS is wind speed. By definition, west and north winds are positive while south and east winds are negative values. Daily trends are calculated as the difference between values at 23:00 and 00:00 (the change over 23 hours), and the daily range is calculated from the difference between daily minimum and maximum values.
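The conversion can be sketched as below, assuming WD is the direction the wind blows from and applying the stated sign convention (west and north winds positive). The minus sign on u is what makes a west wind (WD = 270°) positive under that convention. The function name is hypothetical.

```python
import math


def wind_components(ws, wd_deg):
    """Convert wind speed and direction (degrees from north, direction the
    wind blows from) to east-west (u) and north-south (v) components,
    with west and north winds positive."""
    rad = math.radians(wd_deg)
    u = -ws * math.sin(rad)  # west wind (WD = 270) gives positive u
    v = ws * math.cos(rad)   # north wind (WD = 0 or 360) gives positive v
    return u, v


# A 20 km/h west wind, a 10 km/h north wind, and a 30 km/h southwest wind.
for ws, wd in [(20.0, 270.0), (10.0, 360.0), (30.0, 225.0)]:
    u, v = wind_components(ws, wd)
    print(f"WS={ws:5.1f} WD={wd:5.1f} -> u={u:6.1f} v={v:6.1f}")
```

A southwest wind therefore has positive u and negative v, the same sign pattern as the chinook day (January 30, 2009) in Table 4.2.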

Variable                               Period                               Source              Method
Surface temperature, relative          selected hours 03, 09, 15, 21;       Environment Canada  PCA, DFA
humidity, pressure, dew point          daily minimum, maximum, average,
temperature, specific humidity         trend, range
                                       largest hourly rise and fall                             PCA
Surface east-west wind,                selected hours 03, 09, 15, 21;       Environment Canada  PCA, DFA
north-south wind                       daily minimum, maximum, average
East-west wind at the 850-hPa level,   selected hours 00, 06, 12, 18        NCEP                PCA
north-south wind at the 850-hPa
level, cloud cover

Table 5.1. Meteorological variables used for principal component (PCA) and discriminant function analysis (DFA) to create weather types.


The seasonal signature in the selected variables is removed in order to use variables in a year round classification. Daily anomalies, referred to as standardized data, are created by subtracting 11-day moving average means calculated from daily and hourly data values for the period 1971-2010. Using an 11-day moving average of the 40-year mean produces a smooth annual cycle for the selected variables. An example for temperature is shown in Figure 5.1.

Figure 5.1. Smoothed temperature values at 09:00. The black symbols represent a 40-year mean and the solid line is the 11-day moving average of the 40-year mean.
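The smoothing and anomaly steps described above can be sketched as follows. The sketch assumes a 365-day cycle that wraps around the year end (so early January is averaged with late December), which is one plausible way to handle the window at the edges; the multi-year daily means are synthetic, and the function names are hypothetical.

```python
import math


def smoothed_cycle(daily_means, window=11):
    """Centred moving average of multi-year daily means, treating the
    annual cycle as circular."""
    n = len(daily_means)
    half = window // 2
    return [
        sum(daily_means[(i + k) % n] for k in range(-half, half + 1)) / window
        for i in range(n)
    ]


def anomaly(value, day_index, cycle):
    """Departure of an observation from the smoothed value for that day."""
    return value - cycle[day_index]


# Synthetic stand-in for a 40-year mean of 09:00 temperature on each day
# of the year: an annual cosine plus alternating day-to-day noise.
raw = [
    -8.0 * math.cos(2 * math.pi * d / 365) + (1.5 if d % 2 else -1.5)
    for d in range(365)
]
cycle = smoothed_cycle(raw)

# Anomaly for a hypothetical observation of -20 deg C on day index 12.
print(round(anomaly(-20.0, 12, cycle), 1))
```

Averaging across neighbouring calendar days suppresses the day-to-day noise left in the multi-year mean, so anomalies are measured against a smooth seasonal baseline.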

Studies using long-term data series need to address issues such as missing data, instrument changes, site relocation, and changes in observation procedures as these can produce non-climatic trends in the data. Homogenized datasets, whereby non-climatic changes are identified and corrected for, have been created using raw Environment Canada (EC) climate data for use in climate trend analysis studies. These include monthly means of surface air temperature (Vincent et al., 2012), daily mean temperature using interpolation algorithms applied to monthly adjustments (Vincent et al., 2002), and mean monthly wind speed (Wan et al., 2010). A comparison of the raw monthly mean temperatures for the period 1960-2010 with the homogenized dataset shows differences never exceed 1°C for any month and the average absolute-value monthly difference is less than 0.3°C. As this study uses hourly data, relative anomalies are more important than absolute values, and my interest is weather classification rather than trend or climate change detection. Therefore these uncertainties are acceptable and raw EC data are used for the classification. The method of reporting wind direction changed in 1970, so I use data from 1970-2010 to create a baseline dataset from which daily anomalies are calculated.


Missing data are addressed as follows. Only 10 days from the 40-year period to be used in creating baseline data are missing data for some or all variables, with between 2 and 20 hours missing. These days are excluded in the calculation of the baseline mean and standard deviation data. However, relative humidity (RH) data are missing for several days in 2008 and 2009. These days are required to be classified as part of the FCA data collection period, therefore the missing RH data are recreated using a linear interpolation from hours on either side.
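Linear interpolation across a gap in an hourly series can be sketched as below. The function is a hypothetical helper, not the procedure used in the thesis: missing hours are marked with `None`, interior gaps are filled from the nearest valid hours on either side, and gaps at the start or end of the series are left unfilled because they have only one neighbour. The relative humidity values are made up.

```python
def fill_gaps_linear(values):
    """Fill interior runs of None by linear interpolation between the
    nearest valid neighbours; leading and trailing gaps stay None."""
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is None:
            start = i - 1          # last valid index before the gap
            j = i
            while j < n and filled[j] is None:
                j += 1             # first valid index after the gap
            if start >= 0 and j < n:
                left, right = filled[start], filled[j]
                span = j - start
                for k in range(i, j):
                    filled[k] = left + (right - left) * (k - start) / span
            i = j
        else:
            i += 1
    return filled


# Made-up hourly relative humidity (%) with a two-hour gap.
rh = [54, 57, None, None, 36, 41]
print(fill_gaps_linear(rh))
```

For the two missing hours this gives 50.0 and 43.0, evenly spaced between the bracketing values of 57 and 36.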

When using long time series of data, especially in a changing climate, the existence of trends is a possibility and these may need to be removed before calculating anomalies. Annual trends calculated for monthly means of daily mean, minimum, and maximum values indicate wind vectors and wind speed show significant decreasing trends and temperature and average specific humidity show significant increasing trends. However, looking at trends in daily values per month, only wind vectors show a consistent trend in all months. Temperature only has a significant increasing trend in January and specific humidity in June and July. In this study the focus is on anomalies rather than absolute values, over a relatively short period of time. Therefore, for the purpose of calculating anomalies using baseline data, the data are not detrended. The period for which anomalies will be calculated, 2005-2010, is brief relative to the baseline period, 1970-2010, and not removing a trend should have a similar effect on all anomalies.

Variables are selected based on their ability to indicate air mass characteristics (temperature and moisture variables), dynamic processes (wind variables), and stability (largest hourly rise or fall measurements, daily trend, and range). Surface meteorological variables used in other studies include: temperature, dew point temperature, mean cloud cover, and mean sea level pressure at four selected hours during the day (Sheridan, 2002; Kalkstein and Corrigan, 1986); diurnal temperature and dew point temperature range, and largest four hourly change in wind direction and pressure (Sheridan, 2002); and wind speed and direction (Kalkstein and Corrigan, 1986). Cheng et al. (2004), in a study identifying freezing rain events, used similar static (excluding change and range) variables for all hours of the day. The study also used multiple measures at several different geopotential heights.

The purpose of principal component analysis is to reduce multiple variables to fewer components, where the components are linear combinations of the original variables and capture much of the overall variability in the original variables. While PCA requires some level of correlation between variables, it should not include highly correlated variables as this may result in unreliable factor loadings (Garson, 2013). Hourly measurements are highly correlated, therefore as a compromise, initial variables that I examined for their discriminating capability included: daily mean, range and trend, biggest one-hour rise and fall, hourly values at 0300, 0900, 1500 and 2100 for temperature, surface pressure, wind speed and direction, dew point temperature, relative humidity, specific humidity, cloud cover, and winds at the 850-hPa pressure level. The selected hourly values capture the diurnal cycle. All variables are used as input to the PCA. A subset of variables is used for the discriminant function analysis.
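The daily summary variables listed above can be derived from 24 hourly values as in the following sketch. The function name `daily_summary` and the dictionary keys are hypothetical; the trend follows the definition given with Table 5.2 (value at 23:00 minus value at 00:00), and the largest fall is returned as the most negative one-hour change.

```python
from typing import Dict, Sequence


def daily_summary(hourly: Sequence[float]) -> Dict[str, float]:
    """Summarize one day of hourly values (index 0 = 00:00, index 23 = 23:00)."""
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    # One-hour changes between consecutive observations
    steps = [b - a for a, b in zip(hourly, hourly[1:])]
    return {
        "avg": sum(hourly) / 24,
        "range": max(hourly) - min(hourly),
        "trend": hourly[23] - hourly[0],   # 23:00 minus 00:00
        "max_rise": max(steps),            # biggest one-hour rise
        "max_fall": min(steps),            # biggest one-hour fall (most negative)
        "h03": hourly[3],
        "h09": hourly[9],
        "h15": hourly[15],
        "h21": hourly[21],
    }


# Example: a day with an abrupt warming at noon followed by partial cooling
temps = [0.0] * 12 + [5.0] + [2.0] * 11
print(daily_summary(temps))
```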

5.2.2 Classification methods

Two classification methods are used in this analysis. First, an unsupervised classification using cluster analysis is applied to principal components extracted from multiple correlated meteorological measurements. An unsupervised classification is used when the number of classes is not known a priori and final classes are based on creating groups where variability within groups is less than variability between groups. Different combinations of the meteorological variables are used as input to the PCA. Statistical tests are used to determine the optimum number of principal components and clusters. Different clustering methods are tested. Results are evaluated based on the stability of the solution, i.e., groups of days maintain similar classes having identifiable physical characteristics when different principal components or clusters are used, and a reasonable number of days match manually classified days.

The second method uses a supervised classification technique, discriminant function analysis (DFA), applied to raw meteorological measurements. In this case the number of classes is known prior to the classification process. Data from 10 Oct 2013 to 30 Sept 2014 were used to manually classify each day into one of six types – cold dry (CD), cool wet (CW), chinook (Ch), transition (Tr – cold front), hot (Ht) and normal (Nl). Classification used Calgary Environment Canada standardized hourly data for pressure, temperature, relative humidity, specific humidity, and wind speed/direction. The same variables used in the PCA are available for DFA. However, DFA is sensitive to multicollinearity between the predictor variables, and the relative importance of predictor variables will be incorrect in the presence of correlated variables. Therefore, variables were selected by examining their discriminatory ability using analysis of variance (ANOVA) statistical tests and boxplot visual analysis.

5.2.3 FCA data analysis by weather type

Lapse rate variability within each weather type is described using daily lapse rates calculated for the FCA mountain sites. The top 10 most typical days, as identified by the highest post-probability scores, are used to visualize mean temperatures for the FCA area for each weather type. Daily mean temperatures are converted to an elevation of 1350 m (the mean elevation for all FCA sites) using the average lapse rate calculated from the 10 most typical days for each weather type. An inverse distance squared weighting interpolation algorithm using 25 neighbours is then used to create a temperature surface.
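The two steps in this paragraph, adjusting temperatures to a reference elevation and interpolating with inverse distance squared weighting, can be sketched as below. This is an illustrative outline only: the function names, the planar coordinate system, and the sign convention (lapse rate given as dT/dz in °C/km, negative when temperature falls with height) are assumptions.

```python
import math
from typing import List, Tuple


def lapse_to_reference(temp_c: float, elev_m: float, lapse_c_per_km: float,
                       ref_elev_m: float = 1350.0) -> float:
    """Adjust a station temperature to the reference elevation.
    lapse_c_per_km is dT/dz, negative when temperature decreases with height."""
    return temp_c + lapse_c_per_km * (ref_elev_m - elev_m) / 1000.0


def idw(x: float, y: float, stations: List[Tuple[float, float, float]],
        power: float = 2.0, k: int = 25) -> float:
    """Inverse distance weighted estimate at (x, y) from the k nearest
    stations, each given as (x, y, value)."""
    nearest = sorted((math.hypot(x - sx, y - sy), v) for sx, sy, v in stations)[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]               # exactly on a station
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


# Example: a 2350 m site at 5.0 C with dT/dz = -6 C/km, lapsed to 1350 m
print(lapse_to_reference(5.0, 2350.0, -6.0))
# Example: midpoint between two stations
print(idw(1.0, 0.0, [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]))
```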

5.3 Results

5.3.1 Principal component and cluster analysis

Principal component analyses with different combinations of input variables and varying numbers of retained components were used as input for the cluster analysis. Multiple clustering methods and cluster sizes were tested, and statistical tests were used to determine the optimum number of clusters. However, no methods or variable combinations achieved physically identifiable groupings that could be readily interpreted. Classes were not stable when different clustering methods were used or different numbers of principal components were retained, and agreement with manually classified days was low. These challenges are likely a result of too much overlap between different weather types; days rarely match a ‘textbook’ description of a weather type, and weather variables cover a continuum rather than discrete values. Therefore, a supervised classification method, discriminant function analysis, was applied.

5.3.2 Discriminant function analysis

Data from 10 Oct 2013 to 30 Sept 2014 were used to manually classify each day into one of six weather types (classes), which were used as the training dataset. This training dataset was used to create a set of discriminant functions in order to classify all days in the FCA study period, July 2005 to June 2010. Data to be used for DFA should meet the following requirements: within each class the independent variables should be linearly related and normally distributed, there should be no highly correlated variables and no outliers, and the variance and correlation matrices of the groups should be similar (Garson, 2012). There are high correlations between multiple variables; therefore, only a subset of uncorrelated variables is used in the discriminating functions. The largest hourly rise and fall values are strongly skewed and are excluded from the analysis. Weather types do show differing values for the meteorological measurements. Boxplots showing the variability in standardized values for average temperature and specific humidity are shown in Figure 5.2.

Figure 5.2. Boxplots of standardized temperature (TavgStd) and specific humidity (QavgStd) values for the six weather classes. The box edges indicate the 25 and 75 percentiles and the thick black line is the median value for each weather type.

All variables are significant discriminators between at least two weather classes, as shown by ANOVA tests. To remove highly correlated variables and select those with the strongest discriminatory power, separate t-tests of means were run for each variable for each pair of classes. The top 10 discriminating variables for each class pair were identified and highly correlated variables removed, resulting in 10 variables in the final DFA solution, namely: vNS at 15:00; uEW at 21:00; daily average temperature, relative humidity, and specific humidity; daily range for temperature, relative humidity, and dew point temperature; and daily temperature and relative humidity trend.
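A pairwise screening of this kind might look like the sketch below, which ranks variables for one pair of classes by the absolute value of a two-sample t statistic. The use of Welch's unequal-variance form, the ranking by |t|, and the names `welch_t` and `rank_variables` are assumptions for illustration; the text does not state which form of the t-test was used.

```python
import math
from statistics import mean, variance
from typing import Dict, List, Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample t statistic with unequal variances (Welch's form).
    Requires at least two values per sample and non-zero pooled variance."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_variables(class_a: Dict[str, List[float]], class_b: Dict[str, List[float]],
                   top_n: int = 10) -> List[str]:
    """Return up to top_n variable names ordered by |t| for this class pair."""
    scores = {v: abs(welch_t(class_a[v], class_b[v])) for v in class_a if v in class_b}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Example with made-up standardized values for two classes
hot = {"Tavg": [1.0, 1.2, 0.8], "RHavg": [0.0, 0.5, -0.5]}
cold = {"Tavg": [-1.0, -1.2, -0.8], "RHavg": [0.1, 0.4, -0.4]}
print(rank_variables(hot, cold))
```

Removing highly correlated variables from the ranked list would be a separate step, for example by dropping any variable whose correlation with a higher-ranked variable exceeds a chosen threshold.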

A Wilks’ lambda test is used to determine the overall significance of the model and of each discriminating function. For six classes, five discriminating functions are generated. The structure matrix shown in Table 5.2 shows the correlation between each independent variable and discriminant function. This can be used to assign meaning to the functions. Ideally, correlations should be greater than 0.5 for each independent variable included, but in some cases I retain additional variables because they increase the overall classification accuracy. As an example, the north-south wind vector vNS15 is included as it improves the accuracy of the transition type.

             DF1      DF2      DF3      DF4      DF5
vNS15       0.020   -0.295   -0.319   -0.134    0.207
uEW21       0.052   -0.019   -1.009   -0.373    0.144
Trange     -0.142   -0.742   -0.948    0.170   -0.413
Ttrend     -0.036    0.776    0.179   -0.091    0.278
RHrange     0.006    0.023    0.021    0.178    1.277
RHtrend     0.175   -0.227    0.141   -0.056    0.604
DPTrange   -0.002   -0.418    0.357    0.017   -0.591
qavg       -0.624    0.181   -0.404   -1.128   -0.716
Tavg       -1.386   -0.154    1.089    0.611    0.652
RHavg       0.547   -0.465    0.388   -0.167    0.996

Table 5.2. vNS15 – north-south wind vector at 15:00; uEW21 – east-west wind vector at 21:00; T – temperature; RH – relative humidity; q – specific humidity; DPT – dew point temperature; range – difference between daily minimum and maximum values; avg – daily average values calculated from hourly measures; trend – difference between value at 23:00 and 00:00 in a day.

Figure 5.3 shows scatterplots of the discriminant function scores for each record. DF1 separates types based mostly on average temperature. DF2 is weighted on temperature range and trend, thus separating Nl and Tr days (large range and falling temperature). DF3 is weighted with temperature range and east-west wind vector, and discriminates between Ch (stronger west wind and larger temperature range) and Ht days. DF4 is weighted on specific humidity and separates CD (low humidity) from CW (high humidity) days.

Figure 5.3. Discriminant function score scatterplots showing weather type separation.

The accuracy for the final model is 81% using jackknife cross-validation, where each day is left out in turn and prediction functions are calculated with the remaining data and used to predict the omitted day. Alternatively, training can be performed on a subset of the data and accuracy assessed by applying the prediction functions to the remaining data. Multiple runs, where a random sample of half the data is used for training and the functions are applied to classify the remaining data, yield overall accuracies between 75 and 85%. The CD and Nl types are consistently well predicted, between 80 and 90%, while the remaining types vary from 50 to 90%, with only the Tr type dropping below 50%. Using the original jackknife accuracy assessment, when the vNS15 variable is included the Tr type has 53% correctly classified compared with 47% when the variable is excluded. Table 5.3 shows the final class accuracies as calculated using the jackknife assessment. Chinook days are most commonly misclassified as hot days (26%), cool wet days can be misclassified as cold dry days (11%), and 18% of hot days are misclassified as normal days. Misclassifications are most likely a result of the overlap between weather types and days being a mixture of types. For example, lighter winds during a multi-day chinook event may more closely resemble hot days.
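The jackknife (leave-one-out) procedure described in this section can be sketched generically as follows. To keep the example self-contained, a simple nearest-centroid rule stands in for the discriminant functions; it is not DFA, and all names here are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]   # (feature vector, class label)


def nearest_centroid(train: List[Sample], x: Sequence[float]) -> str:
    """Stand-in classifier (NOT DFA): assign x to the class whose mean
    feature vector is closest in squared Euclidean distance."""
    sums, counts = {}, {}
    for feats, label in train:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += f
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = "", math.inf
    for label, acc in sums.items():
        dist = sum((xi - s / counts[label]) ** 2 for xi, s in zip(x, acc))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(data: List[Sample],
                       predict: Callable[[List[Sample], Sequence[float]], str]) -> float:
    """Leave each sample out in turn, train on the rest, and predict the
    omitted sample. Returns the fraction predicted correctly."""
    hits = 0
    for i, (feats, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if predict(train, feats) == label:
            hits += 1
    return hits / len(data)


# Example with two well-separated made-up classes
days = [([0.0], "CD"), ([0.1], "CD"), ([0.2], "CD"),
        ([5.0], "Ht"), ([5.1], "Ht"), ([5.2], "Ht")]
print(jackknife_accuracy(days, nearest_centroid))
```

Per-class accuracies such as those in Table 5.3 would follow from tallying predicted against actual labels inside the same loop.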

                  Predicted
Actual    CD    Ch    CW    Ht    Nl    Tr
CD        93     0     0     0     4     3
Ch         0    66     0    26     9     0
CW        11     0    79     0     5     5
Ht         0     5     0    75    18     2
Nl         1     4     3     1    87     4
Tr        13     7     7     0    20    53

Table 5.3. Classification accuracy table, the percentage of correctly classified days for each weather type is shown in bold.

5.3.3 FCA study period weather types

The number of days per month for each weather type for the period July 1, 2005 to June 30, 2010 is shown in Table 5.4. Most weather types occur year-round, except cool-wet days which do not occur during the cool months (November to February). Cold-dry and chinook days are more common during the cool months and hot days occur most often from May to September. Normal and transition days occur equally in all months. Approximately 50% of days are classified as normal, with hot and cold-dry days being the next most frequent weather types.

                     Weather type
Month         CD    Ch    CW    Ht    Nl    Tr
January       16    27     -    17    85    10
February      27    16     -     6    79    13
March         21    26     5    17    77     9
April         24    11    13    15    79     8
May           20     9    15    24    78     9
June          15     4    19    21    86     5
July           8     3     8    42    87     7
August        18     3    23    23    79     9
September     13     3    11    38    78     7
October       24    14    12    13    87     5
November      22    26     -    20    79     3
December      38    18     -    14    75    10
Total days   246   160   106   250   969    95

Table 5.4. The number of days per month for each weather type.

I define a weather event as multiple consecutive days of the same weather type. Using this definition, Table 5.5 shows the number of events for each weather type, the average duration of events and the maximum duration of a weather event. The transition type, as expected, tends to last just one day. Normal events have the longest average and maximum durations, but both cold-dry and hot events can last up to 10 days. The average duration for all weather types is less than four days, indicating the frequent changes in weather experienced in southwestern Alberta.

Weather   Number      Average    Maximum
type      of events   duration   duration
CD        133         3.6        12
Ch         50         2.3         6
CW         52         2.7         6
Ht        135         3          10
Nl        495         3.8        17
Tr          7         2           2

Table 5.5. The number of events for each weather type and the average and maximum duration.
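The event definition can be illustrated with a short sketch that groups a sequence of daily weather types into runs and summarizes them. The function name `event_stats` is hypothetical, and treating "multiple consecutive days" as runs of at least two days is an assumption based on the definition in the text.

```python
from itertools import groupby
from typing import Dict, List


def event_stats(daily_types: List[str], min_days: int = 2) -> Dict[str, Dict[str, float]]:
    """Count events (runs of at least min_days consecutive days of the same
    type) and report their mean and maximum duration per weather type."""
    runs: Dict[str, List[int]] = {}
    for wtype, group in groupby(daily_types):
        length = sum(1 for _ in group)
        if length >= min_days:
            runs.setdefault(wtype, []).append(length)
    return {
        wtype: {
            "events": len(lengths),
            "mean_duration": sum(lengths) / len(lengths),
            "max_duration": max(lengths),
        }
        for wtype, lengths in runs.items()
    }


# Example: a single transition day does not count as an event
days = ["Nl", "Nl", "Nl", "Tr", "CD", "CD", "Nl", "Nl"]
print(event_stats(days))
```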

Weather types show recognizable meteorological scenarios. The average values calculated for all days in each weather type for selected standardized measurements are shown in Table 5.6. Cold-dry days are characterized by below average temperature and specific humidity, and a smaller daily temperature range. Chinook days have a positive temperature trend, higher than average temperature, and strong west winds. Cool wet days show below average temperatures, reduced daily temperature range, and weak east winds. Hot days experience above average temperatures and specific humidity. Transition days have a strong downward temperature trend and moderately strong north winds.

       qavg    uEW21   vNS15   Tavg    Trange   Ttrend
CD    -0.98   -0.4     0.14   -1.46   -0.73    -0.02
Ch     0.55    1.27   -0.36    0.96    0.77     0.84
CW     0.73    0.15    0.73   -1      -1.55    -0.37
Ht     1.01   -0.17   -0.4     1.32    0.5     -0.03
Nl     0.14   -0.14   -0.1     0.11   -0.01     0.06
Tr     0.02    0.15    1.16   -0.41    0.24    -1.52

Table 5.6. Selected average standardized meteorological characteristics for each weather type with the top defining values shown in bold.

Post-probabilities can be reported when using discriminant functions to assign weather types to days. Values indicate the likelihood of a day belonging to a particular weather type. High post-probabilities indicate days closely matching typical conditions for a weather type. An examination of the 10 days having the highest post-probabilities for each weather type is used to show typical meteorological conditions as measured at Calgary and spatial temperature patterns for the FCA area. Average hourly values for selected Environment Canada measurements at the Calgary station are shown in Figure 5.4 for the top 10 days for each weather type. Cold dry days have low temperatures (below -20°C) and low diurnal range, pressure is high, winds are light northerly and specific humidity is very low. Transition days show strong north winds, pressure trending up and temperature trending down. Hot days show light winds, high temperature (above 20°C), high specific humidity, and a strong diurnal cycle for temperature and relative humidity. Relative humidity approaches 100% on cool wet days and the diurnal range is low. Chinook days have strong west winds and a rapid rise in temperature and drop in relative humidity.

Figure 5.4. Average hourly measurements for temperature (T), relative humidity (RH), east-west wind (uEW), specific humidity (q), north-south wind (vNS) and pressure (P) for the top 10 days for each weather type.

Lapse rates are calculated for daily minimum, maximum, and mean temperatures using only mountain stations to avoid over-representation of low altitudes when prairie sites are included. In this discussion, I refer to a rapid decrease in temperature with increasing altitude as a steep negative lapse rate, a shallow negative lapse rate has a slower decrease in temperature, and an increase in temperature with increasing altitude is a positive lapse rate or inversion. Lapse rates vary by weather type, but show seasonal variability as well.
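One common way to obtain a daily lapse rate is an ordinary least-squares fit of temperature against elevation across stations, as sketched below. The function name `lapse_rate` and the use of a simple linear fit are assumptions for illustration; the sign convention matches the wording above, with a negative value for cooling with height and a positive value for an inversion.

```python
from typing import Sequence


def lapse_rate(elev_m: Sequence[float], temp_c: Sequence[float]) -> float:
    """Least-squares slope of temperature on elevation, in degrees C per km.
    Negative: temperature decreases with height. Positive: inversion."""
    n = len(elev_m)
    if n != len(temp_c) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_z = sum(elev_m) / n
    mean_t = sum(temp_c) / n
    sxx = sum((z - mean_z) ** 2 for z in elev_m)
    if sxx == 0:
        raise ValueError("all stations are at the same elevation")
    sxy = sum((z - mean_z) * (t - mean_t) for z, t in zip(elev_m, temp_c))
    return 1000.0 * sxy / sxx


# Example: three mountain sites cooling with height, then an inversion
print(lapse_rate([1000.0, 2000.0, 3000.0], [10.0, 4.0, -2.0]))
print(lapse_rate([1000.0, 2000.0, 3000.0], [-20.0, -15.0, -12.0]))
```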

The distributions of daily lapse rates by weather type and month for mean, maximum and minimum temperatures are shown in Figures 5.5 to 5.7. Months are grouped into seasonal bins based on similarity of lapse rates. The median lapse rates are 5.3, 8.2 and 2.3°C/km for mean, maximum, and minimum temperatures respectively. All weather types show some variability by month, but the cold dry days show the shallowest lapse rates with the highest variability during the cool months of November to February. This is due to the frequency of inversions, which occur on 75% of days for minimum temperatures and more than 25% of days for maximum temperatures. Hot days also show positive lapse rates for minimum temperatures on more than 50% of days during the warm months of June to September. Chinook days showing positive lapse rates in August and September are likely misclassified hot days. Normal days show some positive lapse rates in November, December, August and September. These are likely normal days with some characteristics of CD (Nov-Dec) and Ht (Aug-Sept) days. Cool wet days show the lowest variability for all temperature measures and all seasons.

While the FCA data do show significant differences in mean lapse rates between some seasonal weather types, there is still high variability within all types and seasons. As a means to reduce within-type variability, subsets of days were examined. First, I looked only at days having a high post-probability score, that is, days whose characteristics are closest to the typical values for a weather type. Variability still remains high. Alternatively, I looked only at days which were part of a multi-day weather event. This examines whether lapse rates for days within multi-day events are more similar than those for single-day weather events, where a day may be transitioning between types. Again, this did not reduce the variability. It appears that lapse rate variability is an intrinsic part of weather types.

Figure 5.5. Lapse rates by weather type and month for Tmean. Months are grouped into seasonal bins indicated by colour. The dashed line indicates the median lapse rate.

Figure 5.6. Lapse rates by weather type and month for Tmax. Months are grouped into seasonal bins indicated by colour. The dashed line indicates the median lapse rate.

Figure 5.7. Lapse rates by weather type and month for Tmin. Months are grouped into seasonal bins indicated by colour. The dashed line indicates the median lapse rate.

Interpolated mean temperature surfaces for the FCA area are created using inverse distance weighting. Surfaces are created for raw temperatures and temperatures lapsed to the median elevation of all FCA sites, 1350 m. The average daily lapse rate for the 10 most typical days for each weather type is used. Lapse rates are calculated using mountain sites only. Average temperatures and lapse rates for the 10 days for each weather type are shown in Figure 5.8. Cold dry and transition days show a weak linear relationship between temperature and elevation and the average daily lapse rate for the CD days is positive. Chinook days show moderate scatter with a strong normal (negative) lapse rate, except for prairie sites. Cool wet days show the strongest temperature/elevation relationship, including for prairie sites.

Figure 5.8. Mean temperature/elevation plots for the top 10 days for each weather type. The best fit line is shown and the average lapse rate. Mountain stations are in red, prairie stations are green, and city of Calgary stations are black.

Interpolated surfaces for the average daily mean temperature for the 10 most typical days for each weather type for actual temperatures and lapsed temperatures are shown in Figures 5.9 and 5.10. Actual temperatures indicate the altitude of the region, with high temperatures (low altitudes) in the east and low temperatures (high altitudes) in the west for the cool-wet and hot weather types. But cold-dry days show a reverse pattern, with warmer temperatures in the west. The cooler temperatures in the east indicate the north-east source of the cold dry air. Chinook days show the warmest temperatures in a north-south zone from Calgary west to the foothills, with cooler temperatures east of Calgary. The temperature range across the region is on average 10°C with a smaller range of 6°C for cold dry days and a larger range of 12°C for chinook days.

Figure 5.9. Interpolated mean temperature surfaces calculated using the average temperature of the top 10 days for each weather type. Points indicate FCA sites and the city of Calgary is shown as a black outline.

Figure 5.10. Interpolated mean temperature surfaces calculated using the average temperature of the top 10 days for each weather type lapsed to an elevation of 1350 m. Points indicate FCA sites and the city of Calgary is shown as a black outline.

If altitude were the only factor controlling temperature, and the rate of change in temperature with increasing altitude were constant through the altitude range in the area, the lapsed surfaces shown in Figure 5.10 could be expected to show little spatial structure and a limited temperature range. While the average temperature range is reduced to ~7°C, temperature is still spatially variable. Chinook days continue to show a warm central north-south zone, but transition days, which are characterized by strong winds, show the most spatial consistency. The varying spatial patterns by weather type indicate that there are factors other than altitude influencing temperature, and these vary by weather type.

5.4 Discussion

Weather types in southwestern Alberta are recognizable and have distinct meteorological characteristics. However, classification using PCA and CA was not successful in producing stable results or easily physically identifiable classes. Therefore, the final weather classification used DFA. Classification accuracy was high, but there were still some misclassified days. Alternative variables and functions may have altered results, but Kalkstein et al. (1996) tested multiple combinations of input variables and found little difference between solutions using between 5 and 21 predictors. Days are classified according to the majority weather type within a 24-hour period from 00:00–23:00. However, weather operates as a continuum, rather than discrete 24-hour units, and this will always result in some misclassification.

Misclassification is most apparent for chinook and transition weather types, which have the lowest accuracies at 66 and 53 percent respectively. Chinooks are most often misclassified as hot days, and transition days as cold dry or normal days. Both weather types are characterized by trending variables (relative humidity, temperature, and pressure) and occur most often as single day rather than multi-day events. Trends may start late in the day and continue into the next day, but the classification functions will only identify the majority weather type occurring during a day. Chinook events may also be misclassified as hot days during winter, as a multi-day event will show as a persistent gusty westerly wind with high temperature and low humidity. While hot days are generally calm, the use of single hourly average wind vectors for the classification may not catch the characteristic wind gusts, as winds may periodically become lighter during a chinook. Therefore, a day within a multi-day chinook event may look more like a hot day. By my definition, transition days represent the onset of cold, dry weather, and any weather type can transition into a cold-dry type. Transition days, representing cold fronts, are generally rapidly moving events lasting a day at the most. Therefore, a transition occurring overnight may be classified as a normal day followed by a cold-dry day.

In a study examining the strength and frequency of chinooks in southern Alberta, Nkemdirim (1996) found that chinooks occurred year-round, but with the highest frequency in winter, with 48-50 days per winter for the Calgary region. My DFA classification has on average 30 chinook days per year, with most occurring between November and March. This is lower than that found by Nkemdirim (1996), but ~26% of chinook days were misclassified as hot days and this would account for some of the discrepancy.

In the same study, maximum signal strength was found to occur east of longitude 114°W at latitude 50.5°N, the latitude of Calgary. The FCA data do not extend beyond 113.5°W, but show the maximum signal strength occurring between 115 and 114°W. It is possible that the limited easterly extent of the FCA study is not capturing the full chinook signal, and the 10 days between 2005 and 2010 that are used to show the spatial extent are not representative of the typical chinook spatial structure. Nkemdirim (1996) used data from 1951 to 1990. In addition, the altitude/temperature plots in Figure 5.8 show considerable scatter. In particular, the prairie sites show an inverse lapse rate, whereas mountain sites show a normal decrease in temperature with increasing elevation. Applying a lapse rate correction based on mountain sites has likely exaggerated the cooling seen east of Calgary in Figure 5.10.

The higher density of FCA stations does allow for a more detailed view of the chinook spatial structure. The FCA is resolving more structure than is possible through the Environment Canada network, and the spatial patterns observed here may be an accurate representation of chinook structure in this region. Further analysis using the full set of chinook days may provide new insights into the surface temperature patterns during chinooks.

A comparison with the spatial synoptic classification (SSC) daily weather types for Calgary as defined by Sheridan (2002) shows some similarities between DFA weather types and SSC types, but also differences. The biggest difference is in the relative proportions of the types. SSC types, like DFA types, are weather type classifications, as both classification systems are based on surface measurements. However, the SSC is based on actual measurements using “seed” days for each weather type selected during the four seasons, where the seed days are chosen to represent average conditions for the season and weather type. In contrast, DFA types are created using anomalies and represent more “extreme” conditions. SSC types more closely resemble air masses than regional named weather types. SSC types and their most likely DFA equivalents are shown in Table 5.7, and Figure 5.11 illustrates the relative proportions of each type for the two classifications. Dry moderate is the most frequent SSC type, accounting for 33% of days during the FCA study period. It corresponds with the DFA “normal” weather type, which accounts for just over 50% of days. Dry polar and moist moderate account for just over 20% and less than 10% of days respectively, whereas their DFA equivalents, cold-dry and cool-wet days, account for 14 and 6 percent respectively.

SSC              DFA
Dry Moderate     Normal
Dry Polar        Cold dry
Dry Tropical     Hot/chinook
Moist Moderate   Cool wet
Moist Polar      Cool wet?
Moist Tropical   -
Transition       Transition

Table 5.7. SSC weather types and the equivalent DFA types.

SSC dry tropical days are most similar to hot days, but while these are common in summer for the DFA classification, there are very few days of this type in the SSC classification. The DFA classification is specific to the FCA study area, in particular the chinook type which has no direct equivalent in the SSC classification, but is grouped in with the dry tropical type. This type is also underrepresented in the SSC. On an annual basis, the greater similarity in the number of days for the SSC types may result in even greater variability in lapse rates and surface temperature structure for SSC types than is present for the DFA types. However, the large number of days in the DFA normal type may make it too heterogeneous in terms of identifying topographic influences specific to the weather type.

Figure 5.11. A comparison between the number of days for each month for DFA (upper panel) and SSC (lower panel) weather types from 2005 to 2010.

Annually there are only small differences in the number of days of each weather type, as shown in Figure 5.12. 2009 experienced more cold-dry days than normal, and 2006 more hot days. The annual average temperatures are influenced by the number of days of each weather type, with 2009 recording the coolest annual average temperature of 4°C and 2006 the warmest at 5.3°C during the study period. The climate normal (1981-2010) average temperature for Calgary is 4.4°C.

Lapse rates show significant variability between weather types. In particular, chinooks show the steepest lapse rates and cold-dry days have the most frequent inversions. This is in agreement with the results reported by Cullen and Marshall (2011). However, minimum temperature lapse rates for chinook days also show inversions during summer months, similar to the lapse rate distribution for hot days in summer. This is unexpected in that minimum temperature inversions are most common under calm conditions, which are not generally associated with chinooks. While chinooks and hot days likely do occur year round, hot days in winter may be closer to chinooks and chinook days in summer may be closer to hot days. Possible misclassification of these two types may have inflated the lapse rate variability within these weather types.

Figure 5.12. The number of days for each DFA weather type for each year from 2005 to 2010.

As shown by other studies (e.g., Shea et al., 2004; Stahl et al., 2006a; Rolland, 2003), lapse rates vary by temperature measure and show a seasonal cycle. Maximum temperatures were found to have steeper lapse rates than minimum temperature lapse rates, with the steepest maximum temperature lapse rates occurring in summer. Minimum temperatures have the steepest lapse rates in spring. Overall these observations hold true in this study, but there are some notable exceptions by weather type. Cool-wet days show little seasonal variability. The usual seasonal bins (DJF, MAM, JJA, SON) are not generally applicable to all weather types in this region. An extended winter season, defined as NDJF, better represents the variability in cold-dry lapse rates, with shallow winter lapse rates. Summer, defined as JJAS, has shallow lapse rates for minimum temperatures. Blandford et al. (2008) found lapse rates calculated by season and weather type using the SSC classification did not improve temperature estimates. But as previously shown, when compared with DFA weather types, SSC types are more general, and the study used standard season definitions. Therefore, it is likely that seasonal SSC types may show more overlap in lapse rates than that seen in seasonal DFA weather types. Similarly, Pepin et al. (1999) found considerable overlap in local lapse rates grouped using Lamb’s weather types defined for the UK, but using a local weather type classification increased the separation between weather type lapse rates.

5.5 Summary

Discriminant function analysis using standardized surface meteorological measurements from the Environment Canada Calgary station is able to reliably identify physically meaningful weather types in southwestern Alberta. There are six weather types, with classification accuracies varying from 93% for the cold-dry type to 53% for transition days. Principal component and cluster analysis were less successful at identifying recognizable regional weather types, as shown by poor agreement with manually classified days, and different methods producing very different classifications.

DFA types have distinctive meteorological characteristics, and lapse rates differ between the types. However, variability remains high within types as well, in particular the cold-dry class. In addition, there is seasonal variability within the lapse rates for the different weather types, indicating weather type characteristics vary in time as well. Surface temperature patterns also show spatial variability not explained by altitude differences alone. These spatial patterns also vary by weather type.

Other landscape topographic attributes, such as valley/slope position, slope angle and aspect, and location (e.g., latitude and longitude) may explain some of the spatial structure seen in the temperature surfaces. Chinook days show an east-west trend and the city of Calgary is warmer than the surrounding area, with the cold-dry and hot days showing the strongest warming. The small elevation range (890 m to 1250 m) in the prairies and the scatter in elevation/temperature plots (indicating a weak linear relationship between temperature and altitude) indicate lapse rate models may not be appropriate in the prairies. Alternative interpolation methods, such as inverse distance weighting or kriging may work better. In the next chapter I will examine correlations between station pair temperatures by month and weather type, and how correlations are affected by topographic setting and weather type. This information will provide insight into factors controlling temperature in both the mountains and prairies.


6. TERRAIN AND WEATHER-TYPE INFLUENCES ON DAILY TEMPERATURE VARIABILITY

The spatial variability of daily temperature for different weather types was introduced in the previous chapter. In this chapter I look at terrain and land surface type controls on daily temperature and how these vary by month and under different weather types. Correlations between location and terrain attributes and daily temperature measures are calculated in order to determine direct topographic influences on temperature. Significantly correlated terrain attributes are used as suggested predictor variables in multiple regression temperature interpolation models described in Chapter 7.

In addition, I calculate the correlation between all station-pair temperatures for days grouped by month and by weather type. Using the most highly correlated neighbours for each site, I calculate the difference between site terrain attributes for each station and its neighbours. The relationship between station correlations and terrain attribute differences indicates the importance of different topographic influences on temperature, by month and by weather type. The assumption is that highly correlated sites will share similar topographic attributes. The results are used to select stations and assign station weights in local weighted temperature/elevation regression models, described in Chapter 7.

6.1 Introduction

The four main geographic factors influencing climate are latitude, continentality, altitude, and topography (Barry, 2001). For regional studies, altitude and topography are the dominant influences. Topography (e.g., relative relief, slope angle and aspect) affects climate elements at a range of scales. The importance of different climate forcing factors varies according to the scale of interest and the complexity of the topography of the study area (Daly, 2006). The overall size and orientation of mountain ranges relative to the prevailing winds impact large-scale processes, thus large topographic features may act as barriers to the movement of air masses. With respect to minimum and maximum temperatures, Daly (2006) found elevation to be highly important at spatial scales less than 50 km and moderately important at scales greater than 50 km. The effects of terrain, e.g., terrain barriers impeding flow, operate at similar scales to elevation. Relative relief and topographic shape, e.g., valleys, are important on a regional scale. In winter under calm conditions, inversions are common in valleys at scales less than 50 km due to cold air drainage (Bolstad et al., 1998; Rolland, 2003). Daly (2006) considered cold air drainage to be most important at scales of 1 km and only moderately important at scales of 10 km and more. Slope and aspect affect near-surface temperatures at scales of less than 1 km and possibly at larger scales, due to solar radiation and wind influences (McCutchan, 1976; Barry, 2001; Daly et al., 2007).

Southwestern slopes tend to be warmer than north-facing slopes in the northern hemisphere (Beniston, 2006; Bolstad et al., 1998). The solar deficit on northern slopes increases with increasing slope and is greatest during summer when overall radiation is strongest. The influence of slope on incoming direct solar radiation is strong at high latitudes. At 50° latitude, the average latitude of the FCA study area, annual potential solar radiation is about 10% greater on southerly slopes with a 10-degree slope angle than on a flat surface, and almost 40% greater on 30-degree slopes (Barry, 2008). Differences between north and south slopes are greatest around the summer solstice. Temperature differences up to 3.5°C between warm south-west and cool north-east forested slopes have been observed (Bonan, 2008).

Land cover affects local climate due to variations in albedo, surface roughness, and moisture content (Bonan, 2008). Albedo affects the amount of radiation absorbed, resulting in variable surface heating. Snow covered surfaces, having a high albedo, have lower daily minimum and maximum temperatures relative to bare surfaces (Leathers et al., 1995), with deeper snow cover resulting in lower temperatures (Baker et al., 1992). Land cover also affects the surface energy balance, whereby sensible heat may be converted to latent heat in wet or vegetated areas, but heats bare soil and rock, resulting in higher temperatures for non-vegetated areas. Bare slopes show large differences in temperature for different slope aspects, but temperatures in closed-canopy forests show minimal slope effects. Temperatures in forests are cooler during the day than in surrounding open areas or openings within the forest due to reduced turbulent heating (Karlsson, 2000). Nighttime temperatures in forests are warmer due to reduced outgoing longwave radiation and reduced vertical mixing.


The Rocky Mountains can block or divert the eastward flow of weather systems as they approach Alberta from the Pacific, confining the moderating influence of maritime polar air to areas west of the Rocky Mountains. In addition, the mountains often block cold Arctic air from moving west, causing persistent inflections in the jet stream (i.e., an upper-level trough east of the mountains) and resulting in extended periods of cold weather in Alberta. On a mesoscale, mountains have local topographic climates related to slope and aspect. The valleys experience a large diurnal range in temperature, and wind speed and direction affect cold air drainage, which can result in positive lapse rates (temperature increases with increasing elevation; Barry and Chorley, 2010). Daly et al. (2010) found that while surface temperatures on exposed hillsides and ridges were well-correlated with synoptic circulation conditions, conditions in sheltered valleys were subject to cold-air pooling and therefore less correlated with regional or synoptic conditions.

Some studies have looked at the varying response of climate variables under different synoptic conditions. The spatial patterns and diurnal variation in temperature, wind, dew point temperature, and relative humidity were found to vary significantly by weather type in southern California (McCutchan, 1978). Courault and Monestiez (1999) used different functions to convert station elevations to a constant (sea level) elevation: a fixed lapse rate, and a lapse rate dependent on the season and circulation pattern. Maximum temperature lapse rates showed a relationship with circulation pattern and thus improved the interpolation results relative to using a constant lapse rate. However, minimum temperature lapse rates showed less of a relationship with circulation pattern, which the authors attributed to using a large-scale classification (continental-scale circulation patterns) for a regional-scale application.

Temperature inversions occur frequently when Arctic air enters the area from the northeast. These are due both to surface cooling under clear conditions and to shallow, dense Arctic air pushing under the warmer westerly flow. Cullen and Marshall (2011) found that negative temperature anomalies in the Alberta foothills were highest in the eastern prairies of Alberta and lower closer to the continental divide during Arctic air events. The same study reported spatially variable temperature anomalies during chinook events, with lower anomalies closer to the continental divide. Therefore, temperature variability may be a function of a station’s easterly location (or distance from the continental divide) as well as its altitude.

Unusually warm temperatures are common when anticyclonic flow prevails over southwestern Alberta, with high pressure and subsidence. Under the calm, clear conditions accompanying unseasonably warm weather, daily maximum temperatures have a similar distribution to global radiation values (McCutchan, 1976). Therefore, slope and aspect may be important factors in determining local spatial variability in daily maximum temperature patterns during hot and dry weather.

Fridley (2009) noted that temperature differences between different landscape positions do not remain constant year round. Nkemdirim (1996) created monthly regression equations between closely correlated stations to recreate daily minimum and maximum temperatures in southern Alberta. This method implicitly accounts for seasonally-varying elevation and topographic influences on temperature, by generating regression equations from neighbouring stations based on correlations between stations. However, Courault and Monestiez (1999) noted that station correlations vary with wind direction and topographic location, indicating that the best-correlated station may vary during any month. I hypothesize that sites showing highly-correlated temperatures should show similar terrain characteristics for the sites. Examining terrain characteristics for highly-correlated sites will determine the importance of different topographic influences on daily temperature measures under different weather types, and will be used to inform the development of landscape interpolation models.

The number of stations and range of topographic settings of the FCA study provide a rich dataset for studying topographic influences on daily temperature measures under different weather conditions. Based on previous work and physical reasonableness in the study area, the following terrain variables and their influence on temperature variability are examined: slope angle and aspect, relative elevation, land surface type, altitude, horizontal location, and distance from the continental divide. I use two methods to determine topographic influences on temperature. Method one calculates the correlation between temperatures and terrain attributes, as a direct measure of terrain influences on temperature. Correlations are used to indicate important predictor variables to be included in multivariate regression models to estimate daily temperature. Method two identifies terrain attributes that influence the correlation between station temperatures.

As will be shown in Chapter 7, local weighted regression models using stations selected and weighted based on correlation perform well in estimating temperature. Therefore, identifying the importance of different topographic influences on station correlations is a means to determine weightings for stations used in weighted local regression models developed in Chapter 7. Both methods are applied separately for prairie and mountain stations, and for days grouped by month and weather type, to determine how terrain influences vary by month and weather type.

6.2 Methods

Data collected between July 2005 and June 2010 are used in the analysis. In July 2005, with station setup almost complete, there were 120 (209) mountain (total) stations in place. Station takedown began in 2010 and in June 2010 there were 132 (208) mountain (total) stations still in place. For all months between July 2005 and June 2010 there were at least 120 (200) stations operational. While the same stations are not present every month, it is assumed that having 90% of stations available will give a representative and unbiased set of data for analysis when investigating the relationship between station correlations and terrain attributes. Correlations between temperature and terrain attributes are calculated using an aggregated, gap-filled dataset. The gap-filling method uses regression equations developed from the most highly correlated neighbour station, and is described in detail in Chapter 7.

Topographic influences are investigated by month and weather type. The same weather type can and does occur in all months of the year. While it is hypothesised that weather types have similar characteristics irrespective of month, there will be differences due to changing air mass, insolation, and landscape characteristics. For example, Godson (1950) reported the maritime polar (mP) air mass to have a temperature of about +2°C in winter and +11°C in summer. Land surface characteristics also change seasonally. In Chapter 5 it was shown that lapse rates vary by weather type and month, with similar lapse rates for the cool months (Nov-Feb) and warm months (June-Aug). Therefore, weather types are grouped seasonally as follows: cool (Nov-Feb), warm (May-Aug), and moderate (Mar, Apr, Sep, Oct), creating 17 seasonal weather type groups (the cool-wet type does not occur during cold months).
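The seasonal grouping described above can be sketched in Python as follows. This is an illustration only, not the code used in this study; the function names are hypothetical, and the labels follow the `season_type` pattern (e.g., cool_CD) used in the tables of this chapter.

```python
# Illustrative sketch; names are hypothetical and not taken from the thesis.
WEATHER_TYPES = ["CD", "Ch", "CW", "Ht", "Nl", "Tr"]


def season_for_month(month):
    """Map a calendar month (1-12) to the cool/mod/warm season defined above."""
    if month in (11, 12, 1, 2):
        return "cool"
    if month in (5, 6, 7, 8):
        return "warm"
    if month in (3, 4, 9, 10):
        return "mod"
    raise ValueError("invalid month: %r" % (month,))


def seasonal_groups():
    """List seasonal weather type labels, omitting cool_CW (no such days occur)."""
    groups = []
    for season in ("cool", "mod", "warm"):
        for weather_type in WEATHER_TYPES:
            if (season, weather_type) == ("cool", "CW"):
                continue
            groups.append(season + "_" + weather_type)
    return groups
```

Six weather types in three seasons, less the absent cool-season CW group, give the 17 seasonal weather type groups.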

Correlation coefficients can be calculated using Pearson’s method for normally distributed data and Spearman’s rank correlation for non-normally distributed data. Quantile-quantile (q-q) plots show whether a sample matches a particular statistical distribution. A straight line indicates the theoretical distribution, and data points should roughly follow the straight line if the data is normally distributed. Figure 6.1 shows q-q plots for selected stations for the entire study period for Tmax, Tmean, and Tmin. While data is not normally distributed for these examples, with some skew, correlations calculated using Pearson and Spearman methods return similar results. Similarly, topographic measures show some skew, but histograms indicate that distributions approximate the normal distribution. Therefore, Pearson’s correlation coefficients are used in this analysis.
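The comparison between the two coefficients can be sketched in plain Python as follows. This is a minimal illustration under the usual textbook definitions (Spearman's coefficient computed as Pearson's coefficient on average ranks); it is not the software used for the analysis, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

For a monotonic but non-linear relationship, `spearman` returns 1 while `pearson` is slightly lower; for mildly skewed data the two are typically close, which is the situation described above.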

A 10-m resolution digital elevation model (DEM) for the FCA study area was downloaded from AltaLIS (2012). Site topographic characteristics – slope, aspect, and relative elevation – are calculated from the digital elevation model and are scale-dependent. Sensitivity analysis is used to determine the optimum buffer size to calculate relative elevation. Relative elevation (RE) is a measure of cold-air pooling potential. Cold-air pooling can occur at a range of scales from localized depressions to large valleys.

To determine the dominant scale at which cold-air pooling occurs in the study area, I evaluated a range of buffer sizes from 100 m to 5000 m around each mountain site. For each buffer, the difference in elevation between the site and the lowest point within the buffer was calculated, and is referred to as relative elevation (RE). Sites were grouped into 200-m elevation bins, named LT1400, LT1600, etc. to GT2200. LT1400 refers to sites with altitudes between 1200 and 1400 m, LT1600 includes sites between 1400 and 1600 m, and GT2200 includes stations above 2200 m and includes the two highest sites above 2800 m. Correlations between minimum temperature and RE are calculated for each buffer size, elevation bin, and weather type to determine the optimum buffer size and the elevation range and weather types where cold-air pooling occurs. It is assumed that temperature variations within 200-m elevation bins will be more influenced by RE than elevation, whereas temperature variations across elevation bins depend more on absolute elevation.
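A minimal sketch of the relative elevation calculation and the elevation binning is given below. It assumes the DEM is held as a simple list of rows on a square grid and that the buffer is circular; the function names, the data layout, and the treatment of bin boundaries (a site at exactly 1400 m falls in LT1600) are assumptions for illustration, not details taken from the study.

```python
import math


def relative_elevation(dem, row, col, cell_size, buffer_m):
    """Site elevation minus the lowest DEM elevation within buffer_m of the site.

    dem is a list of rows of elevations (m); cell_size is the grid spacing (m).
    """
    reach = int(buffer_m // cell_size)
    site = dem[row][col]
    lowest = site
    for r in range(max(0, row - reach), min(len(dem), row + reach + 1)):
        for c in range(max(0, col - reach), min(len(dem[0]), col + reach + 1)):
            # Keep only cells whose centres lie inside the circular buffer.
            if math.hypot(r - row, c - col) * cell_size <= buffer_m:
                lowest = min(lowest, dem[r][c])
    return site - lowest


def elevation_bin(altitude_m):
    """Label 200-m bins by their upper bound, e.g. 1400-1600 m -> 'LT1600'."""
    if altitude_m >= 2200:
        return "GT2200"
    upper = (int(altitude_m // 200) + 1) * 200
    return "LT%d" % upper
```

Calling `relative_elevation` with several values of `buffer_m` for the same site reproduces the sensitivity analysis in outline: small buffers respond to local depressions, larger ones to valley-scale relief.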


Figure 6.1. Quantile-quantile plots for selected stations for 2005-2010. The straight line indicates a theoretical normal distribution.

As DEMs are aggregated to larger grid sizes, slopes become shallower and terrain becomes smoother (Grohmann, 2015). Dubayah (1994) modelled solar radiation at a range of scales and found values showed strong autocorrelation for grid sizes less than 300 m and no autocorrelation beyond 1000 m. I calculated slope and aspect using 100, 250, and 500-m resolution DEMs. The correlation of temperature with slope and aspect was calculated for all DEM resolutions and temperature measures. Aspect is calculated as

folded aspect = aspect, where aspect ≤ 180

folded aspect = 360 – aspect, where aspect > 180

giving values between 0 and 180 degrees. This accounts for differences in solar radiation due to slope direction, where east and west slopes receive the same amount of solar radiation (neglecting the influences of topographic shading and cloud cover).
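The aspect transformation can be written as a short function. This is an illustrative sketch; the function name is hypothetical, and the modulo step (so that 360 is treated as 0) is an added assumption.

```python
def fold_aspect(aspect_deg):
    """Fold a compass aspect (degrees clockwise from north) onto 0-180 degrees.

    East- and west-facing slopes map to the same value (90), north maps to 0
    and south to 180.
    """
    a = aspect_deg % 360.0
    return a if a <= 180.0 else 360.0 - a
```

For example, `fold_aspect(90)` and `fold_aspect(270)` both return 90, so east and west slopes are treated identically.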

Land cover classes were extracted from a 25-m DEM produced by Natural Resources Canada (NRCAN). The 234 FCA sites yielded 12 different classes. These were verified with site comments and satellite imagery and coalesced into five classes: rock/rubble, wetland, forest, grass/shrub, and urban. Ten sites within the city of Calgary boundary are classified as urban. The agricultural fields and grasslands of the prairies are classified as grass/shrub.

6.2.1 Method 1: Location and terrain attributes influencing temperature

Correlations between daily temperature measures (Tmin, Tmax, and Tmean) and terrain attributes are calculated for days aggregated into monthly and weather-type groups. As missing data can adversely affect aggregated temperature values (Stooksbury et al., 1999), the dataset was first gap-filled. The following terrain attributes are considered: altitude, distance from the continental divide, relative elevation, slope, and aspect, at multiple DEM resolutions. Correlations are also calculated for location attributes, xproj (longitude) and yproj (latitude).

6.2.2 Method 2: Location and terrain attributes influencing station pair correlations

The local weighted regression models developed in Chapter 7 require specification of weights and selection of stations to include in the model. Station pair correlations are a successful method for weighting and selection of neighbours for these models. To examine whether highly-correlated sites are located in similar terrain, differences are calculated between continuous terrain measures for each station and its neighbours. In this way, attribute differences can be used to select and weight neighbours.

Horizontal separation (hsep) measures the straight-line horizontal distance between stations, vertical separation (vsep) measures the difference in altitude between sites, RE measures the difference in cold-air ponding potential, cddist measures the difference in the distance to the continental divide, slope is the difference in slope angle between sites, measured in degrees, and aspect (slope direction) measures the difference in aspect in degrees. Land surface type (LST) is a categorical variable and its influence on neighbour correlations is determined by counting the number of top-correlated neighbours having the same LST as the site.
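The station-pair difference measures can be sketched as follows. The dictionary keys and function names are hypothetical, and the use of absolute differences is an assumption; the text specifies only that differences between a site and its neighbours are calculated.

```python
import math


def pair_differences(site, neighbour):
    """Attribute differences between a site and one neighbour.

    Each station is a dict with keys x, y (projected coordinates, m), elev,
    re, cddist, slope and aspect. Absolute differences are assumed here.
    """
    return {
        "hsep": math.hypot(site["x"] - neighbour["x"], site["y"] - neighbour["y"]),
        "vsep": abs(site["elev"] - neighbour["elev"]),
        "RE": abs(site["re"] - neighbour["re"]),
        "cddist": abs(site["cddist"] - neighbour["cddist"]),
        "slope": abs(site["slope"] - neighbour["slope"]),
        "aspect": abs(site["aspect"] - neighbour["aspect"]),
    }


def same_lst_count(site, neighbours):
    """Count neighbours sharing the site's land surface type (key 'lst')."""
    return sum(1 for n in neighbours if n["lst"] == site["lst"])
```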

Not all stations have data on all days; therefore, correlations are calculated using a different number of days for each site for any period. Rather than focusing on only the most closely correlated station, the 10 to 20 most correlated stations are used to analyse the relationship between terrain characteristics of stations and their most correlated neighbours. Gap filled data is not used here, as station correlations are used to select stations for gap filling, and this would bias the results.
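A minimal sketch of this neighbour selection is shown below. It assumes each station's record is a list of daily values with `None` marking missing days, and computes each correlation only over days on which both stations report. The function names and the `min_days` threshold are hypothetical choices for illustration.

```python
import math


def pairwise_pearson(a, b, min_days=3):
    """Pearson r over days where both series have data (None = missing)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < min_days:
        return None
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def top_neighbours(site_id, series, n=10):
    """Return up to n (station_id, r) pairs for site_id, most correlated first."""
    scored = []
    for other_id, values in series.items():
        if other_id == site_id:
            continue
        r = pairwise_pearson(series[site_id], values)
        if r is not None:
            scored.append((other_id, r))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:n]
```

Because each pair uses its own set of overlapping days, coefficients for different neighbours of the same site may rest on different sample sizes, as noted above.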

6.3 Results

Table 6.1 shows the number of days for cool, moderate, and warm seasons for each weather type for the period July 1 2005 to June 30 2010. Correlations are calculated for each month and seasonal weather type, yielding 29 ‘periods’ (12 months and 17 seasonal weather types).

                  Weather Type
Season    CD     Ch     CW     Ht     Nl     Tr
cool      106    82     -      65     313    35
mod       82     51     43     87     317    30
warm      61     26     64     108    325    31

Table 6.1. The number of days for each weather type grouped by cool (Nov-Feb), moderate (Mar, Apr, Sep, Oct), and warm months (May-Aug).

6.3.1 Determination of optimum buffer size for calculating relative elevation

Relative elevation (RE) is used to indicate the potential for cold-air ponding. RE for each site was determined by calculating the difference between the site altitude and the lowest altitude within multiple buffers (100 m – 5000 m) of the site. Only mountain sites were used in the analysis. A small buffer, e.g., 100 m, will capture localized depressions and larger buffers will capture more regional-scale valleys. Results show a weak positive correlation (0.3) between minimum temperature and RE for all data. This indicates that for sites having similar altitudes, sites higher above a depression will have higher minimum temperatures than those near valley bottoms. However, the overall correlation is influenced by low correlations for sites at altitudes greater than 2000 m and certain weather types, e.g., warm CD and CW days, as shown in Table 6.2. The overall coefficient increases to 0.6 when only including weather types prone to cold-air pooling and sites below 2000 m.

All data

weather type   mean     elevation bin   mean     buffer (m)   mean
cool_CD        0.36     GT2200          -0.43    100          0.31
cool_Ch        0.25     LT1400          0.47     250          0.36
cool_Ht        0.27     LT1600          0.54     500          0.36
cool_Nl        0.36     LT1800          0.56     1000         0.34
cool_Tr        0.21     LT2000          0.44     2000         0.29
mod_CD         0.27     LT2200          0.23     3000         0.25
mod_Ch         0.31                              5000         0.20
mod_CW         0.18
mod_Ht         0.44
mod_Nl         0.37
mod_Tr         0.15
warm_CD        0.24
warm_Ch        0.45
warm_CW        0.10
warm_Ht        0.47
warm_Nl        0.40
warm_Tr        0.31
overall        0.3

Selected elevation bins and weather types

weather type   mean     elevation bin   mean     buffer (m)   mean
cool_CD        0.53     LT1400          0.51     100          0.51
cool_Nl        0.57     LT1600          0.62     250          0.63
mod_Ht         0.60     LT1800          0.69     500          0.65
mod_Nl         0.59     LT2000          0.58     1000         0.65
warm_Ch        0.63                              2000         0.64
warm_Ht        0.65                              3000         0.60
warm_Nl        0.62                              5000         0.52
overall        0.6

Table 6.2. Mean correlation coefficients calculated between minimum temperature and site relative elevation are shown for multiple weather types, altitude ranges, and buffer sizes. Elevation bins indicate the upper altitude of the bin, e.g., LT1600 includes sites between 1400 and 1600 m.

For a 1000-m buffer, RE values range from 0 to 1000 m, with the correlation coefficient reaching a maximum when sites with RE greater than 500 m are excluded. This may indicate an upper limit of 500 m for the depth of most cold pools. Buffer sizes from 250 m to 2000 m have similar correlation coefficients. Therefore, 1000 m is selected as the optimum buffer size for which cold air ponding will be calculated. Figure 6.2 shows the relationship between average minimum temperature and RE for all sites with altitudes between 1400 and 1800 m, for the normal and hot weather types during the warm season. Minimum temperatures increase by 1-2°C for every 100 m of relative elevation during warm Nl days and by 2-3°C per 100 m for warm Ht days. The rate of increase is stronger for the lower elevation bin (LT1600, 1400-1600 m).
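The rate of change quoted above corresponds to the slope of an ordinary least-squares fit of minimum temperature on RE, scaled to 100 m. A minimal sketch follows; the input values are synthetic and chosen only to make the arithmetic easy to follow (they are not FCA data), and the function name is hypothetical.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic values for illustration only (not FCA data).
re_m = [0, 50, 100, 150, 200]          # relative elevation, m
tmin_c = [2.0, 3.0, 4.0, 5.0, 6.0]     # average minimum temperature, deg C

slope, intercept = ols_slope_intercept(re_m, tmin_c)
rate_per_100m = slope * 100            # deg C per 100 m of relative elevation
```

With these synthetic inputs the fitted slope is 0.02°C per metre, i.e., a rate of 2°C per 100 m of relative elevation.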


Figure 6.2. Elevation difference between site elevation and the lowest point within a 1000-m radius of the site (relative elevation, RE) plotted against average minimum temperature for warm-season Ht and Nl days at all sites between 1400 and 1800 m. A linear best fit line, regression equation, and adjusted R² are shown for each plot. The slope of the regression indicates the rate of change in temperature per 100-m change in RE.

6.3.2 Temperature – site attributes correlations

Correlations between temperature and locational and terrain site attributes are calculated for the aggregated data set, i.e., the average daily maximum, minimum, and mean temperatures for July 2005 to June 2010, and for days grouped by month and by seasonal weather type. Site attributes used include elevation, location attributes (xproj, yproj, cddist), and terrain attributes (slp250, asp500, RE). Tables shown in this section only display significant correlation coefficients.

All aggregated data

Elevation

Correlations between aggregated temperature data and terrain attributes are shown in Table 6.3. Elevation shows the strongest negative correlation with maximum and mean temperatures in the mountains. As shown in Table 6.3, elevation is only weakly correlated with temperature in the prairies, with a positive correlation for Tmin and negative correlation for Tmax.

Location

Xproj is moderately positively correlated with Tmax, Tmin, and Tmean in the mountains, but only with Tmax in the prairies. This indicates temperature increases moving east. Yproj is inversely correlated with Tmax and Tmean in the prairies, indicating a temperature decrease moving north. Distance from the continental divide, slope, and relative elevation are all correlated with elevation, therefore partial correlations controlling for elevation are calculated for these measures. Cddist has a similar pattern to yproj, being inversely correlated with Tmax and Tmean in the prairies, indicating a temperature decrease moving further from the continental divide. Tmax shows the strongest relationship.
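A partial correlation controlling for a single variable can be computed from the three pairwise Pearson coefficients. The sketch below uses the standard first-order formula; the text does not state which implementation was used in the analysis, so this is an illustration under that assumption, with hypothetical function names.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z.

    Here x would be a temperature measure, y a terrain attribute such as
    RE, slope or cddist, and z the site elevation.
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

If two variables are correlated only because both depend on the control variable, the ordinary coefficient is non-zero but the partial coefficient is close to zero.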

Terrain

RE is moderately correlated with Tmean and Tmin in both the prairies and the mountains, with temperatures warming higher above terrain low points. This indicates cold air pooling effects occur in the low terrain of the prairies as well as the mountains. Slope and RE are correlated in the mountains, and therefore slope correlations in the mountains are very similar to those for RE. There are no significant correlations between aspect and any daily temperature measure.


Table 6.3. Correlation between aggregated temperature measures and topographic attributes for mountain and prairie sites. Correlations for cddist, slp250 and RE are partial correlations, controlling for elevation. Darker shading indicates stronger negative coefficient values.

Data grouped by month and seasonal weather type

Variable seasonal or weather-type topographic influences on temperature are not captured when looking at aggregated temperature correlations. Therefore, correlations are calculated by month and weather type as well. All slope measures (100, 250, and 500-m DEM resolutions) produce similar results; therefore, coefficients are only shown for slope calculated from a 250-m resolution DEM. Correlations are strongest for aspect calculated from a 500-m resolution DEM.

Elevation

Correlations between temperature and elevation for mountain and prairie sites are shown in Table 6.4. There are significant correlations between daily temperature measures and elevation for each month and weather type for mountain sites. Maximum temperature and elevation have a strong negative correlation (< -0.8, i.e., temperature decreasing with increasing elevation) for all months and weather types, except for the cool_CD type, which has a weaker coefficient (0.38). Similarly, there is a slight weakening of the correlation coefficient during the cool months. Correlations are much weaker for minimum temperatures (> 0.3 from July to September), and are not significant from December to February and for cool_Tr, warm_Ch and warm_Ht weather types. The correlation is positive for both mean and minimum temperatures for the cool_CD type, indicating a prevalence of positive lapse rates (inversions). Interestingly, the CD type in the moderate and warm seasons has relatively strong negative correlations between minimum temperature and elevation, as does the transitional weather type. Only the CW type shows a strong negative correlation for minimum temperatures across all seasons.

The relationship between elevation and temperature is not as strong in the prairies. In agreement with the weaker aggregated correlation coefficients in the prairies, there are fewer months with significant correlations than for mountain sites, and the sign of the coefficient varies by month. Correlations are positive from November to March for all temperature measures and negative from April to October for Tmax and Tmean. Weather-type correlations (not shown) have a similar pattern with positive coefficients in the cool season weather types and negative coefficients during the warm season.

Table 6.4. Correlation between daily temperature measures and elevation calculated by (a) month for prairie sites and (b) month and (c) weather type for mountain sites.

Location

Correlations with location measures (xproj, yproj, cddist – Tables 6.5 and 6.6) show a consistent direction year-round in the mountains, but vary seasonally in the prairies. For mountain sites, from November to January, Tmean and Tmin decrease moving north as shown by weak negative correlation coefficients (absolute value generally < 0.2) for yproj. There are no significant correlations for other months or Tmax. However, Tmax for the CD type in the cool-season shows a cooling trend moving north. In the prairies, Tmax (absolute value from 0.3 to 0.9) and Tmean (absolute value from 0.24 to 0.78) decrease moving north, except in May and June. During the summer, warm air moves into the area from the south, and cold air moves in from the north in winter, both of which may result in year-round cooling moving north. The result will also be related to the decrease in insolation moving north.

Table 6.5. Correlation between daily temperature measures and location variables (xproj and yproj) calculated by (a) month and (b) weather type for mountain sites, and (c) month and (d) weather type for prairie sites.

In the mountains, all temperatures are positively correlated with xproj for all months, indicating warming moving further east. Coefficients are strongest for Tmax, in particular during the winter months, with values greater than 0.7. Weather types show a similar pattern, except the cool-season CD type, where Tmin shows a warming trend moving west. In contrast, temperatures in the prairies show a moderate cooling trend moving east from December to February, and Tmean and Tmax increase from April to September. Arctic air masses move into the area from the northeast, which is likely the reason for the cooling trend moving north and east in winter in the prairies, with less impact in the mountains further west.

Table 6.6. Correlation between daily temperature measures and distance from the continental divide calculated by (a) month and (b) weather type for prairie sites, and (c) month and (d) weather type for mountain sites.

Correlations between temperature and distance from the continental divide in the mountains are moderate to weak (generally absolute values < 0.5) with both weather type and seasonal differences. Temperature decreases as distance from the continental divide increases for Tr, CD and CW weather types, but increases for Ht, Ch and Nl types. An exception is the warm-season Nl type where Tmax decreases and Tmin increases with increasing cddist. Tmin also has no significant correlations for the CD weather type. There are fewer significant correlations between temperature and cddist when calculated by month. Where significant, correlations are positive, except for Tmax from June to August.

Tmax is strongly correlated with cddist in the prairies, mirroring the pattern shown with yproj, but with stronger coefficients. Fewer months have significant correlations between cddist and Tmean. Correlations are significant for less than half the months and weather types for Tmin and are both positive and negative. Cddist and yproj are correlated due to the NW-SE alignment of the Rocky Mountains, therefore it is difficult to separate the influence of these two attributes.

Terrain

Apart from moderate and cool-season Tr days which show a weak correlation (< 0.3) between Tmin and slope, there are no significant correlations between any temperature measure and slope or aspect in the prairies (Table 6.7). In the mountains there is a weak positive correlation (0.17) between Tmax and aspect in June and warm-season Nl days. Both Tmin and Tmean show positive correlations with RE in the prairies and the mountains with coefficients varying from 0.25 to 0.5. In addition, Tmax is weakly correlated (<0.2) with RE from November to January. However, there are some exceptions by weather type, with warm and moderate-season CD and CW days showing weak negative correlations between RE and Tmax and Tmean. As these are partial correlations, controlling for elevation effects, this behaviour cannot be explained as a function of the strong correlation between RE and elevation, so remains unexplained. Slope is also positively correlated with Tmean and Tmin in the mountains. Tmean only increases with increasing slope from February to September, whereas Tmin is positively correlated with slope year round with slightly stronger coefficients (0.3 to 0.5). Tmax shows a weak negative correlation (absolute value <0.2), whereby temperature decreases with increasing slope, in October and for moderate-season Ch and Nl days.


Table 6.7. Correlation between daily temperature measures and terrain measures (relative elevation, slope angle and aspect) calculated by (a) month and (b) weather type for prairie sites, and (c) month and (d) weather type for mountain sites.

6.3.3 Station pair correlation

Table 6.8 shows the average correlation between station pairs for the five most closely correlated neighbours, calculated for different time periods (month, weather type, and year) for mountain and prairie sites. Overall correlations are strong, above 0.98, with little difference between mountain and prairie correlations. Minimum temperature station/neighbour correlations drop below 0.96 during summer months, warm-season hot days, and most cool-season weather types.
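As a rough sketch of how the most closely correlated neighbours can be ranked separately for each period, the following assumes daily series stored per station and one period label per day. The station names (S1 to S3), the labels, and the `top_neighbours` helper are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def top_neighbours(series, labels, period, site, k=5):
    """Return the k stations best correlated with `site` on days labelled `period`."""
    days = [i for i, lab in enumerate(labels) if lab == period]
    target = [series[site][i] for i in days]
    scores = []
    for other, values in series.items():
        if other == site:
            continue
        scores.append((pearson(target, [values[i] for i in days]), other))
    scores.sort(reverse=True)          # highest correlation first
    return [(name, r) for r, name in scores[:k]]

# Hypothetical daily temperatures for three stations over eight days.
series = {
    "S1": [1, 2, 3, 4, 5, 3, 6, 2],
    "S2": [2, 4, 6, 8, 1, 5, 2, 7],
    "S3": [4, 1, 3, 2, 10, 6, 12, 4],
}
labels = ["CD"] * 4 + ["Ch"] * 4       # one weather-type label per day

print(top_neighbours(series, labels, "CD", "S1", k=1))
print(top_neighbours(series, labels, "Ch", "S1", k=1))
```

In this toy data the best neighbour of S1 changes with the weather-type label, which is the behaviour the period-specific correlations are meant to capture.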


Table 6.8. Average correlations for each temperature measure and period for the five most highly correlated neighbours.

The spatial pattern of highly-correlated neighbours and differences between station-pair terrain attributes are used to indicate how the terrain may be influencing temperature. Figure 6.3 shows the spatial pattern of neighbours for two representative sites. FA0415 is a westerly mountain site and FA0436 is an easterly prairie site. The 20 most correlated neighbour locations for three weather types are plotted, with highly-correlated sites shown in black and weaker correlations shown in grey. Prairie neighbours have a more circular grouping, while mountain neighbours are elongate and parallel the continental divide. The top-correlated neighbours vary by temperature measure and correlation period.


Figure 6.3. Spatial patterns of the 20 most highly correlated neighbours for a selected mountain site (FA0415) and prairie site (FA0436) for different weather types. Darker shades indicate higher correlations. The black square indicates the selected site.

6.3.4 Location and terrain differences between correlated stations

In this section I examine site attribute differences between closely-correlated neighbours as a means to identify attributes contributing to high correlations. Terrain attributes, as used here, refer to the difference between site and neighbour terrain attributes, rather than actual site attributes. Large values indicate that a site and its neighbours have very different attributes, and small values indicate that a site is similar to its neighbours.

Figure 6.4 shows terrain-attribute differences between each site and its 15 most highly correlated neighbour stations averaged by station class (mountain or prairie) and temperature measure. Both horizontal separation and distance from the continental divide are highest for Tmin and lowest for Tmax for both mountain and prairie sites. The average hsep is higher in the mountains than the prairies. The higher hsep for Tmin indicates factors other than spatial proximity are important for Tmin. Note that for easier comparison, standardized difference measures are presented in the next subsection; dimensional variables are shown here to give a sense of the actual values.

Due to less terrain variability in the prairies, terrain measures (vsep, RE, slope and aspect) have a smaller range than in the mountains. Differences between factors are very small for the prairies, and while the most correlated neighbours for Tmin are always the most different, it is possible that differences for variables other than hsep are simply due to the larger hsep. In the prairies, nearby sites are more likely to have similar elevations and distances from the continental divide. Therefore, a smaller hsep translates to a smaller vsep as well, and the fact that vsep is smaller for Tmax than for Tmin does not mean that vsep is more important in determining correlations for Tmax than for Tmin.

There appears to be greater terrain influence in the mountains, with significant differences between the neighbourhood attributes for Tmin and Tmax. Slope, RE, and vsep are high for Tmax and lower for Tmin and Tmean in the mountains. This indicates that slope, RE, and vsep are less important in influencing correlated-neighbour site selection for Tmax in the mountains, although differences in slope are relatively small. In contrast, RE and vsep appear to be important to neighbourhood selection for Tmean and Tmin in the mountains. Differences in aspect are small, and look to be unimportant in both the prairies and mountains.

Terrain-attribute differences also vary by season (Figure 6.5), with hsep, vsep, and RE all showing strong seasonal signals in the mountains. There are higher values of hsep and lower values of vsep and RE in the cool season compared to the warm season, for all temperature variables. This indicates that RE and vsep are more important in determining site correlations in the cool season, whereas horizontal proximity is important in the warm season. Differences are less systematic for Tmin, particularly with RE; consistent low values indicate the importance of RE on minimum temperatures year-round. Sensitivity to cddist seems to be mixed, but differences between seasons and temperature variables are small (ca. 2 km).


Figure 6.4. Average difference between site and neighbour terrain and location attributes for the 15 most highly-correlated neighbours, grouped by temperature measure for mountain and prairie sites. Aspect and slope are measured in degrees, vsep and RE are in meters, and hsep and cddist are in kilometers.

In the prairies, all terrain attributes show a similar seasonal pattern for all temperature measures, i.e., low values for the cool season and higher values for the warm season. Site similarity is therefore more important in the cool season; the most correlated neighbours are spatially proximal with similar absolute and relative elevations. These are less important in summer.


Figure 6.5. Differences in terrain attributes between sites and the 15 most correlated neighbours, by season. vsep and RE are measured in meters, hsep and cddist in kilometers. Cool season (blue) – November to February, warm season (red) – May to August, moderate season (green) – March, April, September, October.

Highly-correlated neighbours also vary by weather type (Figure 6.6), with separation for most attributes being highest for chinook days in the mountains. Therefore, correlated site selection is less dependent on site attributes than, for example, the CD type. During CD weather events, vsep is consistently low for Tmin, Tmax, and Tmean. This indicates that the most correlated sites come from similar elevations during CD weather systems, whereas elevation is less important as a predictor for temperature during chinook events. RE is low for Tmean and Tmin and higher for Tmax, consistent with the assumption that RE is a measure of cold-air ponding potential, which influences Tmin. Hot, normal, and CD days have the lowest values of RE for Tmin and Tmean; the selection of most correlated neighbours is strongly influenced by RE for these weather types.


Figure 6.6. Differences in terrain attributes between sites and the 15 most correlated neighbours, by weather type. vsep and RE are measured in meters, hsep and cddist in kilometers.

In the prairies, chinook events also show the largest separations for Tmax for all terrain attributes. This is consistent with strong spatial variability in temperature seen during chinooks. Interestingly, while hsep and cddist are high for chinooks for Tmin and Tmean, the highest separations occur for cool-wet weather, in particular for vsep and RE. It is possible that this is due to low spatial variability of temperature during this weather type; therefore, temperature at most sites is highly correlated, and horizontal proximity and other site characteristics are less important. Overall, there is considerable structure in Figure 6.6, which is difficult to generalize but is indicative of variable influences of location, elevation, and relative elevation on spatial temperature patterns as a function of weather type.

6.3.5 Standardized terrain differences

Terrain attributes are measured on different scales and the range of values differs for mountain and prairie sites. As a means to more easily compare average values between attributes and station classes, attribute differences were “standardized” using the 50 most highly correlated neighbours for each site. Attribute differences are not normally distributed, all showing positive skew, but I am more interested in relative values, so this is considered acceptable. The standardized difference is calculated as

$$\delta_{std} = \frac{\delta - \mu}{\sigma}$$

where $\mu$ and $\sigma$ are the mean and standard deviation of the 50 most highly correlated neighbours, $\delta$ is the attribute value (the difference between two sites), and $\delta_{std}$ is the standardized value.

If an attribute is influencing correlation, then its standardized value is less than 0 (i.e., terrain differences between site and a neighbour are less than the mean difference, which for standardized values is 0). The more negative the standardized value is, the greater the influence of the factor on correlation. If a factor has little influence on correlation, the standardized value is close to 0 (i.e., terrain differences are what one would expect from random site selection). A large positive value would indicate that the preferred neighbours have large differences in that attribute, but this situation is not expected.
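The standardization and its interpretation can be sketched as follows. This is an illustration under stated assumptions: attribute differences are taken as non-negative and ordered from the most to the least correlated neighbour, the population standard deviation is used (the text does not say which form was applied), and the function name and the ten-neighbour values are hypothetical.

```python
from statistics import mean, pstdev

def mean_standardized_difference(diffs_by_rank, n_ref=50, n_top=15):
    """Average standardized difference over the n_top best-correlated neighbours.

    diffs_by_rank: site/neighbour attribute differences, ordered from the
    most to the least correlated neighbour. The mean and standard deviation
    come from the first n_ref entries.
    """
    ref = diffs_by_rank[:n_ref]
    mu, sigma = mean(ref), pstdev(ref)   # population SD is an assumption
    return mean([(d - mu) / sigma for d in diffs_by_rank[:n_top]])

# Hypothetical differences for ten neighbours of one site (not FCA data).
hsep_km = [4, 6, 5, 9, 12, 15, 14, 20, 22, 25]                # grows with rank
aspect_deg = [90, 10, 170, 40, 120, 60, 150, 20, 100, 80]     # no pattern

hsep_score = mean_standardized_difference(hsep_km, n_ref=10, n_top=3)
aspect_score = mean_standardized_difference(aspect_deg, n_ref=10, n_top=3)
print(f"hsep: {hsep_score:.2f}, aspect: {aspect_score:.2f}")
```

In this toy case the best-correlated neighbours are also the closest ones, so the hsep score is clearly negative, whereas the aspect score stays near zero.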

Figure 6.7 shows average standardized values for the 15 most correlated neighbours for each site, grouped by terrain measure, station class, season, and temperature measure. hsep has the strongest influence on correlation in the prairies for all temperature measures and seasons, followed by cddist and vsep. Slope and RE both have similar values, but are only about 25% of the strength of hsep. Aspect is not significant.

In the mountains, hsep has the strongest influence for all temperature measures during warm weather, but vsep has equal or stronger influence on Tmin and Tmean in the moderate and cool seasons. Slope and RE influence Tmean and Tmin, but are only 20-30% the strength of hsep. Aspect has a measurable effect on Tmax during the cool and moderate seasons, but is only 15% of the strength of vsep.


Figure 6.7. Average standardized differences between site and neighbour terrain and location attributes for the 15 most correlated neighbours, grouped by temperature measure and season for mountain and prairie sites. Note that hsep, vsep, and cddist use a vertical scale from (0 to 1), and RE, slope, and aspect use a different vertical scale (0 to 0.2).

As prairie sites show a consistent year-round pattern, and horizontal separation and distance from the continental divide are the most important factors influencing station correlation, I now focus on mountain sites. In Figure 6.8 I have summed the standardized differences for all terrain attributes for each temperature measure grouped by weather type and by month for the mountain sites. Each colour in a bar represents the average standardized difference for a particular terrain or location attribute. Larger summed values may indicate correlations are more influenced by terrain attributes. Summed values for monthly groupings tend to be higher than those for weather type groupings, in particular for minimum temperatures. The relative sizes of each of the averaged terrain attributes show a strong annual pattern, the same as was seen in Figure 6.7 for seasonal differences. While there are some high summed values for certain weather types, the higher values and strong annual pattern indicate there is greater consistency in terrain attributes influencing correlations by month than by weather type.

Figure 6.8. Summed standardized differences between site and neighbour terrain and location attributes for the 15 most highly-correlated neighbours, grouped by temperature measure and weather type (left panel) and month (right panel) for mountain sites. Colours indicate the different terrain and location attributes.

6.3.6 Land surface type

The land surface affects the surface energy balance, which in turn affects near-surface temperature. Table 6.9 shows the percentage of sites selected for each surface type for the 15 most correlated neighbours. These percentages are based on monthly correlation groups and all temperature measures. For all surface types except wetland, sites preferentially select neighbours having the same surface type. However, site/neighbour temperature correlations are not substantially lower for neighbours having different land surface types.


Class | stnLST | ngbLST: forest, grassShrub, rock-rubble, urban, wetland
urban | urban | 44, 1, 55
mountain | forest | 60, 19, 14, 7
mountain | grassShrub | 18, 61, 15, 1, 5
mountain | rock-rubble | 20, 16, 64
mountain | wetland | 67, 27, 2, 3
prairie | grassShrub | 97, 3

Table 6.9. Percentage of sites selected for each surface type. The majority percentage for each site surface type is shown in bold.

6.4 Discussion

My primary purpose in examining the relationship between topographic attributes and temperature is to inform landscape temperature models developed in Chapter 7. Correlations between temperature and topographic measures provide guidance as to which measures should be included in multivariate regression models. They provide an indication of physical variables influencing near-surface temperature and the strength of the influence. In addition, this information is useful for interpolation and statistical downscaling of, for example, coarse-scale climate models (e.g., Wilby et al., 2004; Fowler et al., 2007).

The correlation between temperature and a geographic or topographic attribute indicates how temperature responds as the attribute value changes. For example, mean temperature is positively correlated with easting in the mountains; therefore, temperatures increase moving east. Topographic difference measures and how they relate to temperature correlations between stations inform the selection and weighting of stations to be used in weighted regression models of temperature. The correlation between two time-series of station temperatures indicates whether the stations respond in a synchronous way under different weather conditions. The relationship between topographic attribute differences and station temperature correlations indicates the attribute range over which temperatures behave synchronously. For example, when highly-correlated neighbours are selected over a narrow range of relative elevation, it indicates that temperature behaviour diverges at different sites as a function of relative elevation.


Daily temperatures in the FCA are highly correlated across sites, with similar values for both prairies and mountains. However, the relationship between temperature and terrain and geographic attributes varies by temperature measure, season, and weather type, and differs for mountain and prairie sites. The prairies are characterized by low relief with site altitudes varying from 900 to 1250 m. Therefore, terrain effects are minimal and horizontal separation is the dominant influence on temperature correlations. The average horizontal separation between the five most highly correlated sites in the prairies is 16 km for Tmax and 19 km for Tmin. The same values for Tmax and Tmin are 26 km and 32 km for mountain sites. The higher separation values in the mountains indicate factors other than spatial proximity are influencing site correlations and therefore temperature.

6.4.1 Prairies

In general, weather-type correlations mirror monthly correlations with all cool season types having the same sign, but with differing strengths, and similarly for warm season types. Therefore, it appears as if seasonal influences are stronger than weather-type influences in the prairies, and monthly or seasonal relationships are adequate to describe temperature, independent of weather type.

Maximum temperature is strongly negatively correlated with distance from the continental divide year-round. Correlations between Tmax and elevation are positive in the cool months (Nov-Feb) and negative the rest of the year, whereas correlations between Tmax and xproj show the opposite signs (negative in the cool months and positive the rest of the year). Correlations between Tmax and yproj are negative year-round, but not significant in May and June. There are no significant correlations between Tmax and relative elevation. Minimum temperature shows few significant correlations with elevation, xproj, yproj, or cddist, but Tmin is moderately correlated with RE year-round; i.e., temperatures are lower in depressions and valleys. Correlations between Tmean and terrain and geographic variables are a combination of those shown by Tmax and Tmin. Correlations with elevation, xproj, and yproj are similar to those shown by Tmax. However, there are fewer significant correlations with cddist (mostly the warm months) and weak to moderate correlations with RE for most months.

These correlation patterns may be explained by both topography (elevation decreases moving east and further from the continental divide) and the direction from which air masses enter the area. In winter, cold Arctic air often moves in from the northeast, and is seen as a cooling effect in the easterly stations. This also shows as a cooling effect for the northernmost stations. In summer, the cooling effect moving north may be due to the southerly source of warm air masses, as well as the decline in insolation with latitude. However, local topographic depressions appear to have the strongest influence on minimum temperatures, as seen by the systematic correlation with RE. This effect is evident year-round for all but the cool-wet weather type.

Nalder and Wein (1998) found that distance-based interpolation methods (kriging and IDW) performed well in areas of relatively flat terrain and similar surface types. Therefore, as geographic variables show strong correlations with temperature, and hsep is the strongest indicator of correlated stations, it is likely that these methods will suffice for temperature interpolation in the prairies. However, RE could be considered as a co-variable for both Tmean and Tmin.

6.4.2 Mountains

With higher topographic variability in the mountains, correlations between temperature and terrain attributes (RE, slope, and aspect) are more frequently statistically significant and stronger in the mountains than the prairies. Horizontal separation is the dominant control on station temperature correlations in the prairies year-round; however, vertical separation is the stronger control during the cool months in the mountains. In contrast with the prairies, where seasonal variations in temperature/location correlations dominate weather-type variations, there are more noticeable differences by weather type in the mountains. For example, temperatures increase moving away from the continental divide during hot, chinook, and normal weather, but decrease during transition, cold-dry, and cool-wet weather.

Elevation is moderately to strongly correlated with Tmean and Tmax year-round; elevation-only regression models may therefore perform adequately for these temperature measures. However, the relationship varies by weather type. In particular, the cool_CD type is anomalous, with Tmean and Tmin increasing with elevation; for all other weather types, temperature decreases with elevation. In addition, Tmax is only weakly correlated with elevation for cool_CD days, likely due to deep inversions where a warmer air mass overlies shallow cold air. Highly correlated stations for the cool_CD weather type are also characterised by small vertical separation. Cullen and Marshall (2011) identified a persistent inversion up to 2000 m altitude when continental polar air masses are present in the FCA area.

Other weather types and geographic attributes do not show such large differences. For example, Tmax is only weakly correlated with xproj for the cool-wet type, whereas other moderate and warm season weather types show stronger correlations. Thus, variations in the magnitude of the coefficients by weather type for the same season indicate topographic and geographic influences do vary by weather type as well as seasonally.

It is not always possible to assign a physical reason to seasonal and weather-type patterns in temperature and topographic attribute correlations. This is likely due to the many different combinations of attribute values for all sites, making it difficult to identify influences of individual attributes. It will sometimes be a simple example of correlation rather than causality. In addition, attributes may act together; for example, south-facing slopes in the northern hemisphere report warmer temperatures than north-facing slopes (Whiteman, 2000). However, neither slope nor aspect showed positive correlations with maximum temperature. Therefore, it may be necessary to address combinations of attributes when determining topographic influences on temperature. For example, the effect of aspect may only be apparent for a specific range of slope angles.

6.4.3 Relative elevation and cold air pooling

The weak correlation between Tmin and elevation indicates regression models using elevation as the sole predictor may not perform well for minimum temperatures. One of the reasons for the weak correlation is cold air pooling, where overnight surface cooling allows cold air to flow downslope and collect in a valley bottom, resulting in higher temperatures at increased heights above the valley floor. This has been well documented (e.g., Trewin, 2005; Whiteman et al., 2004; Lundquist et al., 2008; Reeves and Stensrud, 2009).

Relative elevation is a measure of cold air drainage potential which operates at a range of scales. For a small-scale (2.5-km) study, a 50-m buffer was found to be the optimum indicator of cold pool (CP) potential (Chung et al., 2006). Carrega (1995) found a radius of 100 m was a significant predictor of cold air ponding potential for monthly minimum temperatures in a study area of similar size and topographic range to that of the FCA. Lundquist et al. (2008) found that the choice of radius used to identify sites prone to cold air ponding, based on DEM-derived slope and curvature surfaces, was very important. The study looked at several different valley configurations and found the best results used a radius equal to half the mean peak-to-peak distance. For the FCA area, 1000 m was selected as the buffer size based on sensitivity analysis using buffers ranging in size from 100 m to 5000 m. However, valleys of different sizes exist in the area; thus, a single buffer will not necessarily capture all cold air pooling sites.
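To make the buffer idea concrete, the sketch below computes relative elevation as the height of a cell above the lowest cell within a given radius. It assumes a regular DEM grid held as nested lists and a circular buffer measured between cell centres; the `relative_elevation` function, the 500-m cell size, and the toy DEM are hypothetical and do not reproduce the processing used for the FCA.

```python
def relative_elevation(dem, row, col, cell_size, radius):
    """Height (m) of cell (row, col) above the lowest DEM cell within `radius` m."""
    reach = int(radius // cell_size)          # search window in cells
    lowest = dem[row][col]
    for r in range(max(0, row - reach), min(len(dem), row + reach + 1)):
        for c in range(max(0, col - reach), min(len(dem[0]), col + reach + 1)):
            dist = cell_size * ((r - row) ** 2 + (c - col) ** 2) ** 0.5
            if dist <= radius:                # keep only cells inside the circle
                lowest = min(lowest, dem[r][c])
    return dem[row][col] - lowest

# Toy DEM (m) with a valley through the middle; cells are 500 m across.
dem = [
    [1500, 1480, 1450, 1480, 1500],
    [1460, 1420, 1400, 1430, 1470],
    [1440, 1390, 1350, 1400, 1450],
    [1470, 1430, 1410, 1440, 1480],
    [1520, 1490, 1460, 1490, 1530],
]

for radius in (500, 1000):
    print(radius, relative_elevation(dem, 0, 0, cell_size=500, radius=radius))
```

The same corner cell has a different relative elevation for each radius, which is why the buffer size matters when valleys of different widths are present.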

In addition, the depth of the cold layer will be a result of the duration of the cooling (Tabony, 1985) and the temperature of the surface, with increased development of cold pools expected during winter with long nights and snow covered surfaces. Kirchner et al. (2013) found cold pools developed up to a depth of 500 to 600 m during winter months in an alpine valley, with shallower depths in spring. Similarly, my analysis found temperatures increased up to 500 m above topographic low points and decreased thereafter. Land cover may also influence cold air pooling, as Gustavsson et al. (1998) found temperatures were colder in forested areas compared with exposed areas, which they attributed to reduced turbulence in the forest.

It seems unusual that although RE and Tmin are moderately positively correlated, RE is not a more highly weighted attribute in determining station temperature correlations. In order to better understand the effect of relative elevation on cold air pooling in the FCA and how it might be accounted for in temperature interpolation models, I examined the relationship between relative elevation and minimum temperature for different weather types at different elevations for mountain sites. Sites are grouped into 200 m elevation bins and the correlation between minimum temperature and relative elevation calculated for each elevation bin (Figure 6.9).
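The binning step can be sketched as below. The sketch assumes bins are labelled by their upper limit, as in the figure captions, and that a site lying exactly on a boundary falls into the higher bin; the site values and helper names are hypothetical, and the open-ended GT2200 bin is not handled.

```python
from collections import defaultdict
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def bin_label(elevation, width=200):
    """Label a bin by its upper limit, e.g. 1450 m -> 'LT1600'."""
    return f"LT{(int(elevation) // width + 1) * width}"

def correlation_by_bin(sites, width=200):
    """Correlate Tmin with relative elevation separately within each elevation bin."""
    groups = defaultdict(list)
    for elevation, rel_elev, tmin in sites:
        groups[bin_label(elevation, width)].append((rel_elev, tmin))
    return {
        label: pearson([p[0] for p in pairs], [p[1] for p in pairs])
        for label, pairs in groups.items()
    }

# Hypothetical sites: (elevation m, relative elevation m, Tmin deg C).
sites = [
    (1450, 10, -8.0), (1500, 120, -5.0), (1580, 250, -3.5),
    (1650, 30, -6.0), (1700, 200, -5.5), (1790, 400, -5.0),
]
result = correlation_by_bin(sites)
print(result)
```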

Correlations are strongest during the warm months (May-Aug) for sites below 2000 m, with some difference between elevation bands, and strong variability between weather types. Correlations are weak and highly variable on transition and cool-wet days. Hot, chinook, and normal days during warm months show the most consistent and strong positive correlations, with median values greater than 0.5. As discussed in Chapter 5, chinook days during the warm months may be a subtle variation of the hot days which develop under anticyclonic conditions. This is in agreement with Pepin and Kidd (2006), who found inversions to occur most frequently under anticyclonic conditions in the Pyrenees.

While correlations indicate the strength of the relationship, regressions, whereby relative elevation is used as a predictor of minimum temperature, indicate the actual rate of change in temperature for unit change in relative elevation. The lapse rate for minimum temperature as a linear function of relative elevation for each weather type and elevation bin is shown in Figure 6.10. There is strong variability in RE lapse rates between weather types and elevation bands. For chinook, hot, and normal days, the average lapse rate is ~5°C/100m for sites below 1400 m. For sites at higher elevations the rate drops to below 2°C/100m.
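A minimal sketch of an RE lapse rate follows, assuming an ordinary least-squares fit of minimum temperature on relative elevation, with the slope rescaled from °C per metre to °C per 100 m. The five data points are invented so that the slope is easy to verify by hand; they are not FCA observations, and `ols_slope` is a hypothetical helper.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Invented sites in one elevation bin on one weather type.
rel_elev = [0, 20, 40, 60, 80]          # m above the local low point
tmin = [2.0, 3.0, 4.0, 5.0, 6.0]        # deg C

re_lapse_rate = ols_slope(rel_elev, tmin) * 100   # deg C per 100 m
print(f"RE lapse rate: {re_lapse_rate:.1f} deg C per 100 m")
```

A positive value means minimum temperature rises with height above the valley floor, which is the sign convention used for the RE lapse rates reported here.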

Positive lapse rates indicate that for sites having similar altitudes, minimum temperatures increase with height above the lowest point within a 1000 m buffer of each site. It is apparent that the influence of relative elevation on minimum temperature is also a function of altitude. Sites at lower altitudes experience greater cooling at lower relative elevations compared with those at higher altitudes. Above 2000 m the effect is negligible. This may be due to mountain summits in the study area being between 2500 and 3000 m, with less surface area available for the generation of cold air. Also, there may not be effective cold air ponding at higher elevations, similar to classical slope environments; cold air is likely to have an outlet and will drain to lower elevations. Therefore, sites at different altitudes, but similar relative elevations, may not be highly correlated.


Figure 6.9. Correlation coefficient between relative elevation and minimum temperature for 200-m elevation bins for (a) cool, (b) warm, and (c) moderate months. The coloured dashed lines indicate median correlations for each elevation bin. Elevation bin names indicate the upper limit of the altitude range, e.g., LT1600 includes sites between 1400 and 1600 m altitude. GT2200 includes sites at altitudes above 2200 m and includes two high-altitude sites above 2600 m. Box widths are proportional to the square roots of the number of observations in the groups.


Figure 6.10. Regression coefficient (RE lapse rate) for minimum temperature, with relative elevation as a predictor for 200-m elevation bins for (a) cool, (b) warm, and (c) moderate months. The coloured dashed lines indicate median coefficients for each elevation bin. Positive lapse rates indicate temperatures increase as relative elevation increases, i.e., the higher a site is above a low point/valley bottom. Elevation bin names indicate the upper limit of the altitude range, e.g., LT1600 includes sites between 1400 and 1600 m altitude. GT2200 includes sites at altitudes above 2200 m.


6.5 Summary

Daily temperatures in the FCA are highly correlated across sites. The most highly-correlated neighbours for each site vary by temperature measure and time period (month and weather type). Correlations are weaker for minimum temperatures, indicating more complex spatial and topographic controls relative to mean and maximum temperatures. Examination of the terrain characteristics for highly-correlated stations indicates that geographically- and topographically-similar sites have similar temperature patterns. In the prairies, horizontal separation has the strongest correlations year-round. In the mountains, horizontal separation has the strongest correlations for all but the cool months, when vertical separation dominates. Aspect, relative elevation, and slope all show weak correlations.

Elevation, relative elevation, slope, location, and land surface type all influence station temperature, particularly in the mountains. The strongest correlations are seen for maximum temperatures during the warm months. In the prairies there is less systematic influence of terrain on temperature, probably due to the low topographic variability. However, minimum temperatures do show a year-round increase with increasing relative elevation.

Elevation is the dominant influence on maximum and mean temperatures in the mountains, with temperatures generally decreasing with increasing elevation. However, minimum and mean temperatures increase with increasing elevation for the cool_CD type. Other geographic attributes, e.g., easting, have a similar or stronger relationship with minimum temperatures. Geographic relationships for the cool_CD weather type and cool months tend to differ from other weather systems and periods. For example, minimum temperatures are generally positively correlated with easting, but the correlation is negative during cool_CD days, indicating warmer temperatures moving west. Similarly, the only significant correlations between mean and minimum temperatures and northing occur during the cool months, with temperature decreasing moving north.

Slope and relative elevation are strongly correlated, and the two variables show similar relationships with minimum temperatures, i.e., minimum temperatures increase with increasing slope and relative elevation. The influence of relative elevation, a proxy for cold air pooling, is best seen when calculated within a 1000-m buffer of each site. Effects are strongest within the lowest 100 m above a valley bottom, diminishing with increased height above a topographic low point until effects are no longer apparent 500 m above the valley bottom. This is true for altitudes up to 2000 m. However, the rate of increase in temperature varies with altitude. Temperature increases of 5°C/100m occur on hot days during the warm months at sites below 1400 m, but the rate of increase drops to 2°C/100m at higher altitudes. This altitude-dependent relationship may reduce the strength of the partial correlation between minimum temperature and relative elevation, which varies between 0.2 and 0.45.

The lack of significant correlations with aspect is perhaps surprising; maximum temperatures on south-facing sites can be expected to be warmer during the day. The influence of aspect may be masked by other attributes and their interactions. Highly correlated sites commonly have the same land surface type. The exception is the wetland sites, for which the dominant neighbour land surface type is forest. As there are only eight wetland sites in the study area, this in part accounts for the lower percentage of wetland neighbours.

In the mountains, the relationships between topographic setting and temperature vary for different time periods and weather types. Temperature measures at stations are affected by multiple combinations of factors, and it is sometimes difficult to determine the relative importance of the different topographic influences. In addition, the terrain measures themselves are correlated and interacting in ways that are not necessarily linear or separable. There are nevertheless some systematic geographic and topographic influences as a function of month and weather type. This information will be used to inform the parameterization of landscape temperature interpolation models, explored in the next chapter.


7. GAP FILLING THE FCA DATA AND LANDSCAPE MODELS FOR ESTIMATING DAILY TEMPERATURE

There are two primary goals in Chapter 7, both of which build on the station correlations and the topographic and geographic influences on temperature presented in Chapter 6:

(1) Gap-filling the FCA dataset.

(2) Developing and comparing temperature interpolation models using global multivariate and local weighted temperature/elevation regression methods.

The following research questions are addressed:

• For gap-filling, are monthly or weather type estimates better?

• How do local and global regression models compare for interpolation?

• How do topographic predictors vary by month and weather type?

• How well does the use of topographic similarities in determining station selection and weights for a weighted regression compare to selecting stations using nearest neighbours or station temperature correlations?

• How do regression models compare with kriging in the prairies?

7.1 Introduction

Nkemdirim (1996) created monthly regression equations using closely correlated stations to recreate daily minimum and maximum temperatures in southern Alberta. Eischeid and Pasteris (2000) used between one and four of the most closely correlated neighbouring stations to estimate daily minimum and maximum temperatures for the western United States, using a version of general linear least squares regression estimation based on the least absolute deviations criterion. This method is less sensitive to outliers than least squares regression and better handles skewed data (Eischeid and Pasteris, 2000). Both studies calculate correlations on a monthly basis. However, Courault and Monestiez (1999) noted that station correlations vary with wind direction and topographic location, indicating that the most correlated station may vary during any month. I test gap-filling methods using the most closely correlated stations calculated by month and by seasonal weather type.

Topographic influences on temperature can be modelled using multivariate regression with different topographic and terrain predictor variables. For example, Esteban et al. (2009) included elevation, latitude, continentality, and solar radiation, Tveito and Forland (1999) considered elevation and distance to coast, and Carrega (1995) modelled temperature with elevation, slope, aspect, and altitude range within 100 m of each site (equivalent to my relative elevation). Rolland (2003) accounted for cold-air ponding effects by modelling slope and valley bottom sites separately.

Regression functions may be calculated globally, where model parameters are derived from all sample locations within a dataset, or locally, where parameters are derived from smaller neighbourhoods or sample locations having characteristics similar to the unsampled location (Bolstad et al., 1998). Global regressions assume that the relationships between temperature and the independent variables are constant in space and time. In the simplest and most common case, where temperature is modelled as a univariate function of elevation, this is equivalent to the assumption that lapse rates are constant. However, lapse rates have been shown to vary by month and season (Blandford et al., 2008; Bolstad et al., 1998), topographic location (Bolstad et al., 1998; Rolland, 2003; Minder et al., 2010) and weather type (Cullen and Marshall, 2011; Esteban et al., 2009; Huth and Nemesova, 1995; Lundquist and Cayan, 2007).

Daly et al. (2002) developed a local linear weighted regression technique, PRISM (parameter-elevation regressions on independent slope models), to model spatial patterns of climate elements based on the influence of terrain and location factors. Rather than using multiple predictor variables, the model uses elevation as the dominant factor in controlling climate and varies the weight of a station entering the regression based on its topographic characteristics. In addition, the model allows the relationship between temperature and elevation to vary with altitude by using a two-layer model to account for a temperature inversion layer. The number of stations used in the regression can be varied, and a station’s weight in the regression model varies according to the distance from the unsampled location and the similarity in geographic and terrain characteristics with the unsampled location. The spatial variability of lapse rates is modelled by using a local neighbourhood and applying different weights to stations. This allows temperature to be modelled as a linear function of elevation, but the slope of the function (lapse rate) varies locally.

Stahl et al. (2006a), in a study using several regression and weighted average methods to estimate daily minimum and maximum temperatures over British Columbia, reported varying errors by elevation band for all methods. While the study attributed the largest errors to the absence of sufficient high-elevation stations, Cullen and Marshall (2011) also encountered varying lapse rates by elevation, and suggested calculating lapse rates within multiple vertical layers. PRISM models are parameterized to allow for the monthly variation of lapse rates. This method has been used to model mean monthly minimum and maximum temperatures (Daly et al., 2003, 2008) and daily minimum and maximum temperatures (Daly et al., 2007).

MTCLIM is a program designed for extrapolation of meteorological measurements in mountainous terrain using elevation, slope, aspect, and leaf area index (Hungerford et al., 1989). Leaf area index (LAI) is used as an indicator of land cover influence, as absorption of radiation increases with increased leaf area (Bonan, 2008). Stahl et al. (2006a) suggested future studies should focus on topographic influences on interpolation techniques under varying synoptic conditions at a range of spatial scales. Techniques should differentiate between local effects, e.g., cold air ponding, and larger-scale effects, e.g., the air mass in place.

Different methods of cross-validation can be used when evaluating the performance of predictive models. Where data are plentiful, the dataset can be divided into training and testing datasets. The training data are used to develop the model, and performance is evaluated using the testing dataset. The model can be evaluated using multiple training and test datasets. However, it is important that the training and testing datasets are independent of one another (von Storch and Zwiers, 1999). Alternatively, a method termed “leave one out cross-validation” by Esteban et al. (2009) or jackknife cross-validation (Daly et al., 2007) can be used. Using this method, the model is run as many times as there are data points, excluding a different data point for each run, and using the remainder of the data to estimate the removed point. Various error measures can be calculated by comparing predicted and observed values, e.g., root mean square error (RMSE) or mean absolute error (MAE).
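The jackknife procedure described above can be illustrated with a minimal Python sketch. It assumes a univariate temperature/elevation regression as the model being validated; the function names and the sample values are hypothetical and are not taken from the FCA analysis.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def jackknife_errors(elev_km, temp_c):
    """Leave each station out in turn and estimate it from the remainder."""
    errors = []
    for i in range(len(elev_km)):
        xs = elev_km[:i] + elev_km[i + 1:]
        ys = temp_c[:i] + temp_c[i + 1:]
        intercept, slope = fit_line(xs, ys)
        # Error is estimate minus actual temperature.
        errors.append(intercept + slope * elev_km[i] - temp_c[i])
    return errors


def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean square error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical one-day sample: station elevations (km) and Tmax (deg C).
elev_km = [1.1, 1.3, 1.5, 1.8, 2.0, 2.3]
temp_c = [14.2, 12.9, 11.0, 9.4, 7.5, 5.9]
errors = jackknife_errors(elev_km, temp_c)
print(f"MAE = {mae(errors):.2f} C, RMSE = {rmse(errors):.2f} C")
```

Because each station is excluded from the fit that estimates it, the resulting errors reflect how the model might perform at an unsampled location rather than how well it fits the data used to build it.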

In this chapter, station/neighbour temperature regression models are developed to fill gaps in the FCA data. In addition, I generate multivariate regression models and weighted linear regression models, with models parameterized by month or weather type, to estimate daily temperatures in the mountains. In the prairies I use global multivariate models and kriging to generate temperature surfaces. Models are compared using error statistics calculated from jackknife cross-validation.

7.2 Methods

7.2.1 Correlated station regression

Station/neighbour temperature regression models are developed for the most highly-correlated station/neighbour pairs to fill in gaps in the FCA data. Station temperatures are generally highly correlated, but the coefficient and most highly-correlated station vary by month and weather type. Therefore, correlations between all station temperatures were calculated for days grouped by month, seasonal weather type (three seasons), and year. As the normal weather type (Nl) comprises more than 50% of days, and correlations and topographic influences show an annual cycle, correlations were also calculated for Nl days grouped by month. Thus, temperature estimates using correlated stations were calculated using three methods of calculating correlations – year (one group), monthly (12 groups), and seasonal weather type (26 groups).

For each station/day, the most highly-correlated neighbour station with data available is used as the predictor variable. Because of missing data, different neighbours can be used to estimate the same station data for the same period (where period is a year, month, or seasonal weather type). A regression/prediction equation is generated for each station/neighbour/period. Error statistics are calculated using the difference between estimated and actual temperatures for each station/day. These include mean absolute error (MAE), mean error (ME), root mean square error (RMSE), extreme maximum and minimum errors, and the number of sites and days with absolute errors exceeding thresholds.
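A minimal Python sketch of this gap-filling step is given here for a single station and a single period. It assumes that all series passed in cover the same days within one period (a year, month, or seasonal weather type) and that missing values are marked as None; the function names, station labels, and sample values are hypothetical.

```python
def paired(a, b):
    """Return (a, b) value pairs for days on which both series have data."""
    return [(x, y) for x, y in zip(a, b) if x is not None and y is not None]


def pearson(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fill_gaps(target, neighbours):
    """Fill None entries using the best-correlated neighbour with data that day."""
    models = []
    for name, series in neighbours.items():
        pairs = paired(series, target)
        xs = [p[0] for p in pairs]
        ys = [p[1] for p in pairs]
        models.append((pearson(xs, ys), name, fit_line(xs, ys)))
    models.sort(key=lambda m: m[0], reverse=True)  # most correlated first

    filled = list(target)
    for day, value in enumerate(target):
        if value is not None:
            continue
        for _, name, (intercept, slope) in models:
            x = neighbours[name][day]
            if x is not None:  # fall back to the next neighbour if missing
                filled[day] = intercept + slope * x
                break
    return filled


# Hypothetical daily temperatures (deg C) for one period; None marks a gap.
target = [5.0, 6.0, None, 8.0, 9.0, None]
neighbours = {
    "A": [4.0, 5.0, 6.0, 7.0, 8.0, None],
    "B": [2.0, 4.5, 5.0, 5.5, 9.0, 10.0],
}
filled = fill_gaps(target, neighbours)
print(filled)
```

In this sample, the first gap is filled from neighbour A, which is the most highly correlated, while the second gap falls back to neighbour B because A has no data on that day. This is why different neighbours can be used to estimate the same station within one period.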

7.2.2 Landscape temperature interpolation models

(i) Mountains

Two different methods are used to account for topographic influences on temperature. Global multivariate regression models use multiple topographic attributes as predictor variables to model temperature. Local weighted regression models use elevation as the single predictor variable, with weights assigned to stations based on topographic similarity measures, spatial proximity, or correlations relative to the station being estimated.

(a) Multivariate regression

Correlations between different temperature measures and topographic attributes vary in strength and sign by month and weather type. Aspect shows no significant correlations, but it is still included as a potential predictor variable as it has been shown to be significant in other studies (e.g., Carrega, 1995; Cullen and Marshall, 2011). Variables used as potential predictors include: elevation, relative elevation, slope, aspect, distance from the continental divide, and land-surface type. All measures are continuous variables except land-surface type. Four land-surface classes are used: forest, rock/rubble, shrub/grass, and wetland. As discussed in Chapter 6, terrain influences vary by temperature measure and period. Therefore, models are created for each temperature measure and period.

Models including all predictor variables are run for gap-filled temperatures aggregated by period. These models are used to suggest the best predictor variables for each period. Gap-filled data are used to avoid errors introduced by aggregation when data are missing (Stooksbury, 1999). Regression diagnostic tests – testing residuals for normality, homoscedasticity, and spatial autocorrelation – are run on the model residuals. Significant predictor variables identified for each aggregated model are used in the daily regression models. Error statistics are then calculated from the daily model residuals.

(b) Weighted regression

Local weighted regression models, using elevation as the predictor variable, allow for a spatially-variable lapse rate. For each station, a neighbourhood of stations is selected and weighted according to some measure of similarity to the station being estimated. Tests were run using between 15 and 25 neighbours to determine the effect of varying the number of neighbours. Three methods for selection and weighting of stations were examined: horizontal proximity, correlation, and topographic similarity.

Horizontal proximity is similar to inverse distance weighting, and assumes that proximity is the major influence on temperature relatedness. Correlation coefficients as weights implicitly account for multiple factors – topographic and locational – influencing temperature relatedness. Weights based on topographic similarity measures are developed using the correlation analysis (Chapter 6) between station-pair correlations and topographic attribute differences. Monthly correlations are used in this analysis, as weather-type correlations did not perform as well. Topographic differences – horizontal separation, vertical separation, relative elevation difference, slope difference, aspect difference, and distance from the continental divide – are calculated for all station pairs. Regressions are run for each station, with correlation as the dependent variable and difference measures as the independent variables, to determine the importance of each difference measure on station-pair correlations. Standardized coefficients indicate the relative importance of each difference measure on correlation, and are used as parameter weighting factors. For each station, each difference measure is normalized to have values between 0 (biggest difference) and 1 (smallest difference). Final station weights are calculated by summing the product of normalized difference measures and parameter weighting factors. The weight for a neighbour will vary depending on the month and the station for which the local regression is being calculated.
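The final weighting step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used in the study: the parameter weighting factors are taken here to be the magnitudes of the standardized coefficients, only two difference measures are shown, and all names and values are hypothetical.

```python
def normalise_similarity(diffs):
    """Map differences to 0..1, where 1 is the smallest and 0 the biggest difference."""
    lo, hi = min(diffs), max(diffs)
    if hi == lo:
        return [1.0] * len(diffs)
    return [(hi - d) / (hi - lo) for d in diffs]


def topo_weights(differences, factors):
    """Sum the products of normalized difference measures and weighting factors.

    differences: {measure: [absolute difference for each neighbour]}
    factors:     {measure: parameter weighting factor (assumed non-negative)}
    """
    n = len(next(iter(differences.values())))
    weights = [0.0] * n
    for measure, diffs in differences.items():
        for i, similarity in enumerate(normalise_similarity(diffs)):
            weights[i] += factors[measure] * similarity
    return weights


# Hypothetical differences between one station and three neighbours.
differences = {
    "hsep": [5.0, 20.0, 35.0],     # horizontal separation (km)
    "vsep": [300.0, 50.0, 100.0],  # vertical separation (m)
}
# Hypothetical weighting factors for one month.
factors = {"hsep": 0.5, "vsep": 0.3}
weights = topo_weights(differences, factors)
print([round(w, 2) for w in weights])
```

In this sample the second neighbour receives the largest weight: it is further away than the first neighbour, but much closer in elevation. Changing the factors for a different month changes the ranking of the neighbours.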

Using the three methods of neighbour selection and weighting, temperatures are estimated at each station for each day. Error statistics, calculated as estimate minus actual temperature, are calculated for each method, giving an indication of the best method for station selection and weighting. The overall performance is also compared with error statistics from the global regression models.

(ii) Prairies

The prairies are characterized by low relief, and elevation is not significantly correlated with temperature for all periods. Therefore, local weighted regression models with elevation as the only predictor are not used in the prairies. Rather, kriging is used to estimate temperatures, as horizontal separation shows the strongest relationship with temperature correlation.

Temperatures show some significant correlations with other topographic attributes, but the influence of the terrain is weaker than for the mountain stations and does not systematically vary by season or weather type. In addition, several topographic attributes are correlated with one another, indicating possible multicollinearity, depending on which variables are included in the models. Multiple combinations of predictor variables are tested and the final predictor variables are selected based on maximizing R² and minimizing multicollinearity.

7.3 Results

7.3.1 Correlated station temperature/temperature regression

Temperatures are estimated using the most highly-correlated stations, where station-pair correlations are calculated for different periods – year, month, and weather type. Residuals are calculated as the difference between estimated and actual temperatures. Root mean square errors by station class, correlation period and temperature measure are shown in Table 7.1. Estimates using correlations calculated by weather type produce the lowest errors, with improvements of 11 to 20% over annual correlations, and about 7% over monthly correlations. Overall errors are lower in the prairies than the mountains for maximum and mean temperatures, but there is little difference for minimum temperatures. Errors are lowest for mean temperatures, and highest for minimum temperatures.

                     Correlation period            pctImprove   pctImprove
                     year    month   weather type  (yr-wt)      (mnth-wt)
mountain   Tmax      1.23    1.06    0.99          19.8         6.6
           Tmean     0.85    0.74    0.69          18.4         6.8
           Tmin      1.30    1.19    1.11          14.9         6.7
prairie    Tmax      0.94    0.84    0.78          16.8         7.1
           Tmean     0.67    0.59    0.55          17.4         6.8
           Tmin      1.25    1.19    1.11          10.9         6.7

Table 7.1. Overall RMSE (°C) by station class (mountain and prairie) for daily temperature measures (minimum, maximum, mean) estimated using the most highly-correlated station by correlation period (year, month, weather type). The percentage improvement using weather type correlations compared to annual and monthly correlations is also shown.

Mean absolute errors by month and weather type are shown in Table 7.2, and follow the same pattern as for overall errors in Table 7.1. Weather-type estimates are better than those based on monthly correlations, with Tmean having the lowest errors followed by Tmax and Tmin. Errors in the mountains are slightly higher than in the prairies, and errors for all temperature measures are highest for the cold months, November to February. Errors by weather type (Table 7.2b) show improvements that vary by weather type. Improvements range from 2-7% for the CD and Ht types, 6-12% for the Ch, 12-20% for the Tr and CW types, and 5-7% for the Nl type.

Table 7.2. Mean absolute errors (°C) of temperature estimates calculated using the most highly correlated station by month (mnth) and weather type (wt), shown for each (a) month and (b) weather type.

Histograms of temperature estimation errors are shown in Figure 7.1, for all days and sites. Error distributions for both methods are similar, with slightly higher (greater than 10°C) extreme errors for the monthly estimates. No sites have mean absolute errors (averaged for all days) exceeding 2°C, and 90%, 95%, and 98% of site/day errors for all methods and Tmin, Tmax and Tmean respectively, are less than 2°C. Errors for Tmin show no relationship with site altitude and are similar for mountain and prairie sites (Figure 7.2). Errors are lower for prairie sites for Tmax and Tmean. Errors in Tmax and Tmean estimates increase with elevation.

Figure 7.1. Error distributions for the weather type (a) and month (b) estimates. Bars have a width of 1°C.

Figure 7.2. Mean absolute error (MAE) for each site as a function of elevation.

7.3.2 Mountain temperature interpolation models

Multivariate regression

Potential predictor variables include: elevation, easting (xproj), northing (yproj), distance from the continental divide (cddist), aspect calculated from a 500-m DEM (asp500), slope calculated from a 250-m DEM (slp250), relative elevation (RE), and land surface type. Some predictor variables are highly correlated, as shown in Table 7.3. Regressions were run for each temperature measure for data aggregated by period (month and weather type). Different combinations of predictor variables were tested and final variables selected based on the average adjusted R² value and lower multicollinearity as indicated by variance inflation factors. Combinations of elevation, xproj, cddist, asp500, slp250, and land surface type are used in the final models. In total, 87 models were run.

            elevation   RE      yproj   xproj   cddist  asp500  slp250
elevation   1
RE          0.7         1
yproj       -0.12       -0.03   1
xproj       -0.47       -0.18   -0.52   1
cddist      -0.61       -0.22   0.49    0.49    1
asp500      0.12        0.16    0.08    -0.18   -0.09   1
slp250      0.62        0.61    -0.06   -0.41   -0.47   0.15    1

Table 7.3. Correlation coefficients between potential predictor variables for mountain sites. Absolute values exceeding 0.5 are shown in bold.

Residuals were tested for normality and homoscedasticity, and results show that assumptions are in most cases valid. Moran’s I tests show spatial autocorrelation in about half the models. All models were rerun incorporating a spatial autocorrelation term modelled with either a Gaussian or exponential variogram function. Models incorporating a spatial term are compared with the non-spatial models using the Akaike Information Criterion (AIC). Differences seldom exceed 2, indicating little difference between the models (Burnham and Anderson, 2002). The largest difference is seen in the cool_Ch model. All coefficients remain significant, but coefficients and standard errors show small differences (Table 7.4). For other weather types, differences are minor; for example, the elevation coefficient for the warm_CW model changes from 5.3°C/km to 5.2°C/km. Therefore, model coefficients shown are for non-spatial models.

             with SA term            no SA term
             coefficient   stdErr    coefficient   stdErr
elevation    -6.6          3.0       -6.2          3.7
asp500       0.03          0.01      0.04          0.02
xproj        0.03          0.01      0.04          0.00
cddist       0.03          0.01      0.04          0.01

Table 7.4. Coefficients and standard errors for the cool_Ch models with and without a spatial autocorrelation (SA) term.

Significant coefficients at the 0.05 level for the final Tmax models are shown in Table 7.5. Elevation, aspect, and easting are the most common significant predictors for Tmax for all periods. Adjusted R² values are close to 0.9 for all models except for cool_CD, where R² drops to 0.4. The elevation coefficient is always negative, with magnitudes varying from 1.5°C/km for cool_CD up to 9°C/km for some warm months and weather types.

Aspect is measured in 10s of degrees east or west of north. The coefficient is always positive, indicating temperature increases by between 0.04 and 0.09°C for each 10 degrees away from north, or 0.7 to 1.6°C for a south-facing vs. north-facing site. Carrega (1995) reported similar values, with the highest coefficients occurring during the summer months. However, the FCA study coefficients are highest in February, March, September and October. Slope is never a significant predictor for Tmax. In contrast, Carrega (1995) reported positive slope coefficients from March to October and negative coefficients from November to February.
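The conversion from the aspect coefficient to a south-facing vs. north-facing difference is simple arithmetic, shown here as a short Python sketch. The function name is hypothetical; the only assumption is that a south-facing site lies 180 degrees, or 18 steps of 10 degrees, away from north.

```python
def south_vs_north_difference(coef_per_10deg):
    """Temperature difference (deg C) between south- and north-facing sites."""
    steps = 180 / 10  # a south-facing site is 180 degrees away from north
    return coef_per_10deg * steps


# Range of aspect coefficients quoted in the text (deg C per 10 degrees).
for coef in (0.04, 0.09):
    diff = south_vs_north_difference(coef)
    print(f"{coef:.2f} C per 10 degrees gives {diff:.2f} C")
```

The two results, 0.72°C and 1.62°C, round to the 0.7 to 1.6°C range given in the text.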

The easting coefficient is generally positive, indicating temperatures increase moving eastwards by between 0.1 and 0.4°C/10km. The warm and moderate CW and CD weather types have negative easting coefficients. The cddist coefficient, measured in °C/10km, varies between -0.32 and 0.37. The highest positive coefficients (warming moving further from the continental divide) occur in the cool Ch and Ht weather types. The highest negative coefficients (cooling moving further from the continental divide) are seen for the warm and moderate CW and CD weather types, consistent with the results for easting.

The land surface type coefficient indicates change relative to a forested site for different surface types, namely: lstg – grass/shrub, lstr – rock/rubble, lstw – wetland. The main effect is one of cooling in grass/shrub sites (lstg) by between half and one degree during some warm months and warm weather types. February and March and cool_CD also show relative cooling at grass/shrub sites. This seems unusual, in that Oke (1987) found forested sites to show cooler daytime temperatures relative to exposed areas, due to reduced solar radiation. Bonan (2008) ascribed temperature variations in forests and clearings, where temperatures tend to be cooler during the day and warmer at night, to reduced incoming shortwave radiation during the day and reduced outgoing longwave radiation at night due to a reduced sky view. However, the amount of heating or cooling depended on multiple factors, including the size of the clearing, height of trees, and leaf area. For large clearings (diameter > 50 m), the heating effect is reduced because winds can mix the air. FCA sites classified as forest refer to a variety of locations, e.g., dense conifer, mixed forest, and clearings. This may explain why no daytime cooling is evident.

Period R² elevKM cd10 x10 asp10 slp lstg lstr lstw
Jan 0.87 -4.5 0.16 0.34 0.05
Feb 0.86 -5.7 0.15 0.08 -0.64
Mar 0.94 -7.9 0.10 0.07 -0.43
Apr 0.95 -8.2 0.05
May 0.96 -9.0 0.04
Jun 0.94 -8.6 -0.07 0.04 -0.56
Jul 0.92 -7.8 -0.19 0.06 0.04 -0.54
Aug 0.91 -7.6 -0.24 0.04 0.05
Sep 0.92 -7.2 0.13 0.08
Oct 0.94 -7.3 0.21 0.08
Nov 0.91 -5.8 0.13 0.33 0.05
Dec 0.84 -3.4 0.15 0.36 0.04
cool_CD 0.40 -1.3 -0.19 0.12 0.07 -0.37
cool_Ch 0.91 -6.2 0.37 0.43 0.04
cool_Ht 0.89 -5.4 0.37 0.46 0.04
cool_Nl 0.90 -5.3 0.11 0.29 0.06
cool_Tr 0.91 -5.6 0.22 0.03
mod_CD 0.90 -6.7 -0.17 -0.13 0.06
mod_Ch 0.95 -8.7 0.23 0.24 0.07
mod_CW 0.92 -7.0 -0.28 -0.12 0.05
mod_Ht 0.93 -7.7 0.11 0.24 0.08
mod_Nl 0.94 -8.0 0.16 0.07
mod_Tr 0.93 -6.3 -0.29 0.14 0.05
warm_CD 0.94 -7.9 -0.27 -0.07 0.04
warm_Ch 0.94 -9.1 0.10 0.18 0.05 -0.62
warm_CW 0.88 -6.5 -0.32 -0.08 0.03
warm_Ht 0.91 -8.3 0.08 0.04 -0.98 -0.74
warm_Nl 0.93 -8.5 -0.09 0.05 0.04 -0.52
warm_Tr 0.94 -8.6 -0.16 0.16 0.04 -0.36

Table 7.5. Significant multivariate regression coefficients for Tmax. elevKM is the change in temperature for every kilometer elevation increase, cd10 is the change in temperature for every 10 kilometers moving east of the continental divide, x10 is the change in temperature for every 10 kilometers moving east, and asp10 is the change in temperature for every 10 degrees east or west of north. Land cover designations lst g, r and w are the changes in temperatures for grassland, rock, and wetland sites, respectively, relative to forested sites.

Elevation, slope, and easting are the most common significant predictors for Tmean for most periods (Table 7.6). The average adjusted R² value is 0.80, but it drops to 0.41 for the cool_CD model, where elevation is not a significant predictor. Elevation coefficients are always negative, with a shallower average lapse rate of 4.8°C/km, compared with 6.9°C/km for Tmax. Slope and easting coefficients are always positive, indicating warming as slopes steepen and in the eastern part of the FCA. Aspect is a significant predictor in fewer than half of the models, and values are less than 0.03°C per 10 degrees. Land surface type is seldom a significant predictor, but wetland sites (lstw) show cooling by 0.7°C relative to forest sites during February and warm_Ht weather.

Period R² elevKM cd10 x10 asp10 slp lstg lstr lstw
Jan 0.66 -2.7 0.11 0.33 0.07
Feb 0.62 -3.0 0.19 0.07 -0.41 -0.72
Mar 0.91 -5.8 0.17 0.03 0.05 -0.32
Apr 0.95 -6.3 0.11 0.02 0.04
May 0.95 -6.3 0.06 0.12 0.02 0.04
Jun 0.94 -6.1 0.10 0.04
Jul 0.89 -5.0 0.13 0.06
Aug 0.83 -4.4 0.15 0.06
Sep 0.79 -4.1 0.21 0.03 0.07
Oct 0.85 -4.8 0.20 0.03 0.05
Nov 0.80 -4.4 0.29 0.04
Dec 0.63 -2.0 0.13 0.35 0.08
cool_CD 0.41 -0.14 0.07 0.08
cool_Ch 0.79 -4.4 0.20 0.44 0.05
cool_Ht 0.81 -4.0 0.28 0.46 0.05
cool_Nl 0.71 -3.4 0.09 0.30 0.07
cool_Tr 0.74 -4.0 -0.16 0.16 0.05 -0.41
mod_CD 0.94 -5.3 -0.10 -0.03 0.02 0.04
mod_Ch 0.88 -5.6 0.18 0.30 0.05
mod_CW 0.92 -5.5 -0.17 0.02 0.02
mod_Ht 0.80 -4.0 0.20 0.34 0.08
mod_Nl 0.88 -5.4 0.19 0.03 0.05
mod_Tr 0.87 -5.3 -0.27 0.07 0.03
warm_CD 0.96 -6.1 -0.16 0.01 0.02 0.16
warm_Ch 0.85 -4.9 0.22 0.29 0.08
warm_CW 0.91 -5.3 -0.16 0.31
warm_Ht 0.83 -4.8 0.15 0.21 0.08 -0.78
warm_Nl 0.91 -5.8 0.14 0.06

Table 7.6. Significant multivariate regression coefficients for Tmean models for mountain sites. Predictor variables are the same as for Table 7.5. slp is the change in temperature for every degree incline from the horizontal.

Results for Tmin are shown in Table 7.7. Adjusted R² values are generally below 0.6 for Tmin models, but reach 0.8 to 0.9 for the moderate and warm CW weather types. Elevation is not a significant predictor during the cool months or for the moderate and warm Ht weather types. For the remainder of the periods, elevation is significant, with an average coefficient of 3°C/km. Slope is a significant positive predictor for all models, with values varying from 0.04 to 0.19, indicating an increase of up to 1.3°C for a 10-degree increase in slope during the cool months.

Period R² elevKM cd10 x10 asp10 slp lstg lstr lstw
Jan 0.31 0.34 0.11 -1.09 -1.44
Feb 0.28 0.24 0.13 -0.84 -1.60
Mar 0.55 -3.8 0.23 0.12 -1.33
Apr 0.66 -4.4 0.17 0.10 -1.01
May 0.56 -3.3 0.11 0.14 0.10 0.01
Jun 0.49 -2.9 0.15 0.16 0.12
Jul 0.35 -2.3 0.23 0.13
Aug 0.32 -1.4 0.15 0.21 0.14
Sep 0.25 -1.6 0.21 0.13
Oct 0.40 -3.0 0.14 0.10
Nov 0.45 -2.8 0.25 0.10 -0.61 -1.17
Dec 0.33 0.37 0.11 -0.94 -1.32
cool_CD 0.54 1.3 0.14 -1.36
cool_Ch 0.41 -2.8 0.39 0.11
cool_Ht 0.56 -2.6 0.18 0.43 0.10
cool_Nl 0.35 -1.8 0.32 0.14 -1.49
cool_Tr 0.32 -2.3 -0.17 0.1 0.12 -0.68 -1.39
mod_CD 0.64 -4.3 0.09 -0.84
mod_Ch 0.43 -3.2 0.32 0.12
mod_CW 0.83 -4.9 -0.07 0.05 0.06
mod_Ht 0.34 0.29 0.35 0.14
mod_Nl 0.44 -3.1 0.19 0.11 -1.04
mod_Tr 0.66 -4.4 -0.18 0.06 -0.65
warm_CD 0.76 -4.3 0.06
warm_Ch 0.32 0.34 0.32 0.19
warm_CW 0.89 -4.9 0.04
warm_Ht 0.36 0.35 0.30 0.18 -1.52
warm_Nl 0.39 -2.8 0.22 0.13
warm_Tr 0.64 -4.2 0.11 0.10

Table 7.7. Significant multivariate regression coefficients for Tmin models for mountain sites. Table definitions are as in Table 7.5.

As slope and relative elevation are correlated, slope is probably an indicator of cold air pooling with respect to its relation with Tmin. Easting is significant for most models and is always positive, varying from 0.05 to 0.4°C/10km. Where land surface type is a significant predictor, the main effect is one of cooling at wetland sites during cool months and cool weather types. Many wetland sites are close to topographic low points (relative elevation less than 30 m); cold air might pond here for the same reason that water pools. Therefore, this cooling effect may be due to cold air pooling. Rock/rubble sites (lstr) also show a cooling effect relative to forest sites from December to February, possibly related to higher albedo over snow-covered, open sites.

Regressions were run on a daily basis using the significant predictors for each period. Average model statistics – mean absolute error, adjusted R², and the elevation coefficient (lapse rate) – are shown in Table 7.8. Multivariate models parameterized by either month or weather type perform better than models using elevation as the only predictor, but model performance by weather type is only slightly better than the monthly results. Mean temperature models are strongest, with an adjusted R² value of 0.80 for the weather-type models. The adjusted R² is less than 0.5 for minimum temperature multivariate models, and drops below 0.25 using elevation as the sole predictor variable, indicating the weakness of elevation as a predictor for minimum temperature.

Model                       Measure   MAE    Elevation coefficient   R²
weather type multivariate   Tmax      1.09   -6.8                    0.77
                            Tmean     0.95   -4.5                    0.80
                            Tmin      1.60   -2.3                    0.43
monthly multivariate        Tmax      1.14   -6.8                    0.75
                            Tmean     0.96   -4.5                    0.74
                            Tmin      1.62   -2.1                    0.40
elevation only              Tmax      1.38   -7.5                    0.66
                            Tmean     1.21   -4.8                    0.57
                            Tmin      1.86   -1.8                    0.24

Table 7.8. Average mean absolute error, elevation coefficient (lapse rate °C/km), and adjusted R² for all daily mountain site models.

Weighted regression

Local weighted regression models are only run for mountain sites, as this method uses elevation as the single predictor variable, with other terrain influences included in the station weighting that determines the elevation lapse rate. Elevation was seldom a significant predictor in multivariate regression models in the prairies, so lapse-rate based models cannot be expected to perform well here. Local regression model testing was initially done using one month of data to determine the optimal neighbourhood size. Varying the number of stations used in the regression determines the level of “localness”; fewer stations result in more localized parameters. Models were tested using between 15 and 25 stations. Both the mean absolute error and the adjusted R² increase by ~5% using 25 neighbours. I have chosen to go with a more local model using 15 stations.

Stations to include in the model may be selected based on horizontal proximity, on temperature correlations with the site being estimated, or on topographic similarity. In addition, stations may be unweighted (contributing equally), or they can be weighted based on horizontal separation, correlation, or topographic similarity. For each method, a local regression model using elevation as the predictor is calculated for each station using 15 neighbours, and this is used to predict the station value.

Spatial proximity models select the nearest 15 neighbours. All neighbours are weighted equally in the unweighted (unwgt) method. Alternatively, neighbours are weighted by (i) the square of the inverse of the distance between stations (idw2), (ii) the correlation between station and neighbour (stdcorr), where the 15 correlation values are normalized to have values between 0 (least correlated) and 1 (most correlated), or (iii) the topographic similarity between station and neighbour (stdTopoWgt), where the 15 topographic similarity values are normalized to have values between 0 (least similar) and 1 (most similar).
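A minimal Python sketch of a local regression with nearest-neighbour selection follows, covering the unwgt, idw2, and stdcorr weighting options. It is an illustration only: the function names and the station values are hypothetical, the example uses four neighbours rather than 15 to keep the sample short, and the stdTopoWgt option is omitted.

```python
def weighted_line(xs, ys, ws):
    """Weighted least squares fit of y = intercept + slope * x."""
    total = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def normalise(values):
    """Rescale values to run from 0 (lowest) to 1 (highest)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def local_estimate(target_elev_km, stations, k=15, scheme="unwgt"):
    """Estimate temperature at target_elev_km from the k nearest stations.

    stations: list of (distance_km, elev_km, temp_c, corr) tuples.
    Returns the estimate and the local temperature/elevation slope (deg C/km).
    """
    nearest = sorted(stations, key=lambda s: s[0])[:k]
    if scheme == "unwgt":
        ws = [1.0] * len(nearest)
    elif scheme == "idw2":
        ws = [1.0 / s[0] ** 2 for s in nearest]
    elif scheme == "stdcorr":
        ws = normalise([s[3] for s in nearest])
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    intercept, slope = weighted_line(
        [s[1] for s in nearest], [s[2] for s in nearest], ws
    )
    return intercept + slope * target_elev_km, slope


# Hypothetical neighbours: (distance km, elevation km, Tmin deg C, correlation).
stations = [
    (2.0, 1.2, 1.9, 0.95),
    (5.0, 1.6, 0.4, 0.90),
    (8.0, 2.0, -1.2, 0.80),
    (12.0, 1.4, 1.1, 0.85),
    (30.0, 2.4, -3.5, 0.60),
]
for scheme in ("unwgt", "idw2", "stdcorr"):
    temp, slope = local_estimate(1.8, stations, k=4, scheme=scheme)
    print(f"{scheme}: {temp:.2f} C at 1.8 km, slope {slope:.2f} C/km")
```

Note that normalizing correlations to the range 0 to 1 gives the least correlated of the selected neighbours a weight of zero, so that station does not contribute to the fitted lapse rate.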

The mean absolute errors for different methods of station selection and weighting are summarized in Table 7.9 for a sample application in April 2008. Models were run for several selected months and different temperature measures. On average, the relative performance of the methods was similar to that shown for April 2008 minimum temperature in Table 7.9.

Three models were selected to be run on the full data set. The unweighted spatial proximity method (unwgt) is effectively a local lapse rate model, so it serves as a ‘reference case’ that represents common practice. The stdcorr method has the lowest errors, and shows the improvement that can be achieved using correlated neighbours both for station selection and weighting. The stdTopoWgt method attempts to emulate the correlated-neighbour method by selecting topographically-similar neighbours. However, while performance is slightly better than for nearest neighbours, it is not as good as the correlated-neighbour method. I selected it because it represents the realistic situation where one needs to estimate temperatures at locations where no temperature measurements exist.

Station selection        Station weight   MAE
spatial proximity        unwgt            1.38
spatial proximity        idw2             1.41
spatial proximity        stdcorr          1.15
spatial proximity        stdTopoWgt       1.33
correlation              unwgt            0.91
correlation              stdcorr          0.85
topographic similarity   unwgt            1.32
topographic similarity   stdTopoWgt       1.27

Table 7.9. Mean absolute error (MAE) for multiple station selection and weighting methods used in local regression models for April 2008 minimum temperatures. Methods in bold are used on the full dataset.

Selection of topographically similar neighbours

Topographic similarity measures are calculated as a function of both topographic differences between stations and the relative importance of the difference measure on correlation (see Chapter 6). The relative importance of differences was determined by running regressions on temperature/temperature correlation coefficients using topographic differences as predictors. Five difference measures are used as predictors: horizontal separation (hsep), vertical separation (vsep), distance from the continental divide (cddist), relative elevation (RE), and slope (slp).

Standardized regression coefficients indicate the relative importance of the predictor variables. Relative importance varies by month, as shown in Figure 7.3. hsep is the strongest predictor for Tmax for most months, with vsep having equal or higher importance during the cool months. Slope and relative elevation have limited influence on the correlations, and cddist has a moderate weight. Seasonal patterns are similar for hsep for all three temperature measures, with horizontal proximity contributing to more of the variance in the summer months. Overall the ratio of weightings for hsep/vsep is highest for Tmax, indicating that location is a better predictor of maximum temperatures, particularly in the summer, whereas elevation is more important for mean and minimum temperatures. vsep is larger than hsep from October to February.

Figure 7.3. Standardized regression coefficients used as weighting factors for each month and temperature measure. Values are normalized and all weighting factors sum to 1.

All station pairs are assigned weights for each month and temperature measure, calculated as follows:

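$$\mathit{wgt}_{ijsv} = (\mathit{hsep}_{ij} \cdot \mathit{wf}_{hsep,sv}) + (\mathit{vsep}_{ij} \cdot \mathit{wf}_{vsep,sv}) + (\mathit{cddist}_{ij} \cdot \mathit{wf}_{cd,sv}) + (\mathit{slp}_{ij} \cdot \mathit{wf}_{slp,sv}) + (\mathit{RE}_{ij} \cdot \mathit{wf}_{RE,sv})$$

where $\mathit{wgt}_{ijsv}$ is the final weight for each temperature measure ($v$) and station pair ($ij$) for each month ($s$), and $\mathit{wf}_{sv}$ is the monthly weighting factor for each temperature and separation measure.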

The top 15 weighted neighbours are selected as predictor stations for each day.
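The weight calculation and neighbour selection can be sketched as follows. This is an illustration only: the function names are hypothetical, the numbers are invented, and the sketch assumes that each separation measure has already been scaled so that a larger value means a more similar station pair.

```python
def pair_weight(measures, factors):
    """Sum of each separation measure times its monthly weighting factor."""
    return sum(measures[name] * factors[name] for name in factors)


def top_neighbours(weights_by_station, k=15):
    """Return the k station ids with the largest final weights."""
    ranked = sorted(weights_by_station.items(),
                    key=lambda item: item[1], reverse=True)
    return [station for station, _ in ranked[:k]]


# Hypothetical weighting factors for one month and temperature measure;
# as in Figure 7.3, they are normalized to sum to 1.
factors = {"hsep": 0.40, "vsep": 0.30, "cddist": 0.20, "slp": 0.05, "RE": 0.05}

# Hypothetical scaled similarity scores for one station pair.
measures = {"hsep": 0.8, "vsep": 0.6, "cddist": 0.5, "slp": 0.2, "RE": 0.4}

weight = pair_weight(measures, factors)
selected = top_neighbours({"stnA": 0.20, "stnB": 0.90, "stnC": 0.50}, k=2)
```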

7.3.3 Prairie temperature interpolation models

Multivariate regression

Correlation coefficients between potential predictor variables are shown in Table 7.10. Several topographic variables are strongly correlated with one another. For example, elevation is correlated with easting, relative elevation, slope, and distance from the continental divide, with absolute coefficients of about 0.5 or greater; northing and distance from the continental divide are also correlated with one another, as are relative elevation and slope.

            elevation   RE      yproj   xproj   cddist   asp500   slp250
elevation    1
RE           0.5        1
yproj       -0.14       0.1     1
xproj       -0.84      -0.42   -0.2     1
cddist      -0.64      -0.17    0.78    0.43    1
asp500       0.06      -0.03    0.01   -0.01    0.02     1
slp250       0.49       0.58   -0.06   -0.45   -0.35     0.14     1

Table 7.10. Correlations between topographic attributes at prairie sites; absolute values exceeding 0.5 are shown in bold.

Multiple combinations of variables were tested. With low relief in the prairies and little seasonal pattern to correlations between temperature and topographic attributes, I used a reduced set of temperature models and parameters. Specifically: xproj, yproj, RE, and land surface type for Tmean; xproj, RE, and land surface type for Tmin; xproj, yproj, and land surface type for Tmax from December to February; and yproj, land surface type, and elevation for Tmax from March to November.

Significant predictor coefficients and adjusted R2 values are shown in Table 7.11. The average adjusted R2 is 0.66 for Tmax and 0.43 for Tmin. These values are lower than the mountain model values, indicating that the subtler temperature variations in the prairies are difficult to model as a function of simple position and terrain attributes. Maximum and mean temperatures cool moving north by between 0.05 and 0.2°C for every 10-km change, and cool by between 0.2 and 0.3°C for every 10 km moving east from December to February. There is a warming trend of 0.1 to 0.3°C for every 10 km moving east for mean and minimum temperatures from April to September. Relative elevation has a warming effect for sites above depressions of up to 3.3°C/100m for mean temperatures in the winter, and between 3.2 and 6.5°C/100m for minimum temperatures. Put another way, a depression or low point in the terrain with 10 m of relief, such as Confederation Park in Calgary, gives minimum temperatures 0.3 to 0.6°C lower than the surrounding terrain. The effect on mean temperatures is small in spring and summer, about 0.1°C, but mean winter temperatures in low-relief valley bottoms will be 0.3°C cooler.
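As a rough check on the 10 m relief example, the relative-elevation coefficients can be scaled linearly to 10 m. This is a sketch that assumes the effect stays linear over such small height differences:

$$\Delta T_{min} \approx \frac{3.2 \text{ to } 6.5\,^{\circ}\mathrm{C}}{100\ \mathrm{m}} \times 10\ \mathrm{m} = 0.32 \text{ to } 0.65\,^{\circ}\mathrm{C}, \qquad \Delta T_{mean,\,winter} \approx \frac{3.3\,^{\circ}\mathrm{C}}{100\ \mathrm{m}} \times 10\ \mathrm{m} = 0.33\,^{\circ}\mathrm{C}$$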

The land surface type coefficient, which is the warming effect associated with urban sites, is generally less than 1°C for maximum temperatures, between 1 and 2°C for mean temperatures, and between 1.5 to 2.5°C for minimum temperatures.

Period | Tmax: R2 elevKM LSTu x10km y10km | Tmin: R2 LSTu RE100m x10km | Tmean: R2 LSTu RE100m x10km y10km
Jan | 0.81 0.68 -0.34 -0.19 | 0.43 1.52 6.5 | 0.55 0.96 3.3 -0.23 -0.12
Feb | 0.81 1.38 -0.31 -0.15 | 0.51 2.37 5.8 | 0.67 1.70 2.6 -0.18 -0.09
Mar | 0.62 0.97 -0.17 | 0.35 2.07 4.1 | 0.58 1.41 1.8 -0.15
Apr | 0.34 -6.6 -0.07 | 0.34 2.34 3.6 0.18 | 0.57 1.43 1.3 0.21 -0.03
May | 0.42 -7.8 -0.06 | 0.41 2.01 3.7 0.27 | 0.38 1.01 0.22
Jun | 0.43 -7.1 -0.06 | 0.47 1.84 3.2 0.30 | 0.30 0.85 0.24
Jul | 0.75 -6.8 0.93 -0.11 | 0.52 2.36 3.4 0.21 | 0.64 1.61 1.1 0.20 -0.05
Aug | 0.88 -7.6 0.64 -0.15 | 0.46 2.30 4.3 0.17 | 0.65 1.52 1.7 0.21 -0.05
Sep | 0.75 -7.5 -0.11 | 0.42 2.17 5.2 0.13 | 0.52 1.23 2.3 0.22
Oct | 0.82 -4.6 -0.10 | 0.40 2.13 4.8 | 0.35 1.15 2.0 0.12 -0.03
Nov | 0.64 -1.6 0.44 -0.14 | 0.38 1.66 5.3 | 0.46 0.96 2.2 -0.11
Dec | 0.66 0.81 -0.29 -0.11 | 0.50 1.79 6.5 -0.19 | 0.50 1.06 3.3 -0.22 -0.07

Table 7.11. Adjusted R2 and significant regression coefficients for elevation (elevKM), easting (x10km), northing (y10km), relative elevation (RE100m) and land surface type (LSTu) for prairie site temperature models by period.

Kriging

Two thirds of the prairie multivariate regression models have spatially auto-correlated residuals. In addition, spatial proximity is the strongest predictor of temperature correlation; therefore, kriging was used as a further interpolation method. Ordinary kriging with a single semi-variogram function was used for all models. The semi-variogram function is shown in Figure 7.4 for July maximum temperatures, for an exponential function with nugget 0.4°C, sill 0.8°C, and range 30 km. The range represents the limit of correlation between sites, and the sill represents the maximum variance between sites. The nugget represents the minimum variance between close sites, sometimes considered to be the instrument or sampling error. Mean absolute errors are lower than those for multivariate models, but higher than temperature/temperature regression models.
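For illustration, an exponential semi-variogram with the quoted July Tmax parameters can be written as below. The parameterisation is an assumption: the sketch treats the sill as the total sill (including the nugget) and the range as the practical range at which about 95% of the sill is reached, which may differ from the convention used in the software behind Figure 7.4. The function name is hypothetical.

```python
from math import exp


def exponential_semivariogram(h_km, nugget=0.4, sill=0.8, range_km=30.0):
    """Semi-variance at separation h_km for an exponential model.

    Assumes `sill` is the total sill and `range_km` is the practical range
    (the factor of 3 places about 95% of the partial sill at that distance).
    """
    if h_km <= 0.0:
        return 0.0  # identical locations have zero semi-variance by definition
    partial_sill = sill - nugget
    return nugget + partial_sill * (1.0 - exp(-3.0 * h_km / range_km))


# Semi-variance rises from just above the nugget towards the sill.
gamma_near = exponential_semivariogram(1.0)
gamma_range = exponential_semivariogram(30.0)
gamma_far = exponential_semivariogram(300.0)
```

Under these assumptions, station pairs separated by more than about 30 km contribute almost no additional information to the kriging weights, which is consistent with the range being described as the limit of correlation between sites.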

Figure 7.4. Empirical semi-variogram function to model July maximum temperature variance.

7.3.4 Comparison of interpolation methods

Average errors

Error statistics are shown in Table 7.12 for the different models used to estimate mountain and prairie daily temperatures. Mean absolute error (MAE) indicates the average absolute value of the residuals (estimate – actual). Mean error (ME) is the average value of the residuals and is an indication of bias in the model. Root mean square error (RMSE) is the square root of the average value of the squared residuals, and is more sensitive to outliers.
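The three statistics follow directly from these definitions. The sketch below uses invented values and a hypothetical function name; it shows how a single large residual raises RMSE well above MAE.

```python
from math import sqrt


def error_stats(estimates, actuals):
    """Return (MAE, ME, RMSE) for residuals defined as estimate - actual."""
    residuals = [e - a for e, a in zip(estimates, actuals)]
    n = len(residuals)
    mae = sum(abs(r) for r in residuals) / n       # average error magnitude
    me = sum(residuals) / n                        # bias (positive = overestimate)
    rmse = sqrt(sum(r * r for r in residuals) / n) # penalizes large residuals
    return mae, me, rmse


# Invented daily values (degrees C); the residuals are 1, -1, 0 and 4.
mae, me, rmse = error_stats([1.0, 2.0, 3.0, 8.0], [0.0, 3.0, 3.0, 4.0])
```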

The temperature/temperature regression models perform the best, with the lowest MAE for both mountain and prairie sites and all temperature measures. Elevation-only regression models have the highest errors for all temperature measures for both mountain and prairie sites. Local weighted regression models perform better than global multivariate models in the mountains. On average kriging models have lower errors than global multivariate models in the prairies. Errors are lowest for Tmean and highest for Tmin. There is little difference in performance between the global monthly and weather-type multivariate models, but errors are slightly lower for models parameterised by weather type. Errors for the local topographic neighbour models are slightly lower than nearest neighbour models, with the biggest improvement (~9%) for the Tmin models. For most models the ME is zero, indicating no bias. Exceptions are the local weighted regressions, with weights calculated using temperature correlations, which show some positive bias for Tmean and Tmin.

Table 7.12. Overall error statistics by temperature measure for all interpolation methods. globalElev – global elevation linear regression; multiMnth – global multivariate regression models with predictors varied monthly; multiWT – global multivariate regression models with predictors varied by weather type; localElev – local elevation linear regression; topoWgt – local weighted elevation regression models with weights calculated by site topographic similarity measures; corrWgt – local weighted elevation regression models with weights calculated by site temperature correlations; TTmnth – temperature/temperature regressions using correlated stations calculated by month; TTwt – temperature/temperature regressions using correlated stations calculated by weather type; krig – temperatures estimated using kriging.

Residuals

Residuals vary across the full elevation range in the mountains (Figure 7.5). Other studies (e.g., Stahl et al., 2006a) report the highest errors associated with the highest elevations. In the FCA, this is true for Tmax, but Tmin errors are highest for the lower elevations. In general, the residual plots have a high degree of scatter with only weak relationships with altitude. MAE by station has no errors exceeding 2°C for any temperature measure using the temperature/temperature regression method used for gap-filling. In contrast, all interpolation methods using elevation or multiple variables as predictors have at least one station with a MAE greater than 2°C, with the elevation-only method having the greatest number of stations with high MAEs.

Figure 7.5. MAE for each mountain site relative to site elevation by method: (a) multivariate regression, (b) elevation-only, (c) topographic weighting, local regression, and (d) correlation-weighted regression. Best fit lines are shown where a solid line indicates a significant (0.05 level) slope, i.e., MAE varies with elevation.

Residuals vary by month and weather type (Figure 7.6). Tmin and Tmean residuals show an annual cycle with highest residuals (absolute values) during the cold months and the lowest values in the shoulder seasons. Tmax residuals have a weaker monthly cycle, except the global elevation models. The CW weather type has the lowest residuals for all temperature measures. Tmin residuals are highest for chinook and hot weather types.

Figure 7.6. MAE by month and weather type for the 6 interpolation methods for mountain sites.

The majority of residual values, 90%, 95% and 98% for Tmin, Tmax, and Tmean, respectively, fall within ±2°C, but there are some high values, ±10°C or more (Figure 7.7). The majority of the extreme errors are positive (i.e., the model overestimates actual temperature), and occur from November to February, with twice as many extreme errors for Tmin relative to Tmax. Stahl et al. (2006a) found that models were unable to accurately estimate temperatures across the full temperature range, with low temperatures being underestimated and high temperatures being overestimated. However, extreme errors in the FCA occur across the full temperature range and are more an indication of the extreme daily temperature variability that can occur during winter in the study area.

Figure 7.7. Binned residual counts for the six interpolation methods. Vertical scale is logarithmic.

Lapse rates

Most mountain models use elevation as a predictor variable. Exceptions are the monthly multivariate models for Tmin for December, January and February, the weather type models for Tmean (cool_CD) and Tmin (warm_HT, warm_Ch and mod_Ht). The median lapse rates for each temperature measure and three global and two local interpolation methods are shown in Table 7.13. Maximum temperature lapse rates are steeper than the environmental lapse rate of 6.5°C/km and Tmean and Tmin lapse rates are shallower (slower decrease in temperature with increasing elevation). Median lapse rates do differ between methods, but not significantly.

         globalElev   multiMnth   multiWT   localElev   topoWgt
Tmax     -8.2         -7.5        -7.6      -7.5        -7.3
Tmean    -5.3         -5.2        -5.4      -4.5        -4.8
Tmin     -2.3         -3.3        -3.2      -1.3        -2.7

Table 7.13. Median lapse rates (°C/km) for the global elevation, multivariate parameterised by month and weather type, local elevation and local topographically weighted regression models.

The global models use a single daily lapse rate for the study area, with daily values ranging from -12 to +12 °C/km. The local elevation regression models produce spatially variable lapse rates, with a unique lapse rate for each site. Lapse rates were not constrained to lie within a defined range; therefore, there are some physically unreasonable values. However, more than 90% of values lie between -10 and +10 °C/km. Lapse-rate density distribution plots for two global regression methods and the local regression methods are shown in Figure 7.8. Distributions for the global elevation-only regression lapse rates are broader than those for the global multivariate methods. The local weighted method has the broadest distribution curves, indicating strong spatial and topographic variability in lapse rates. However, lapse rates calculated with elevation as the sole predictor are not exactly comparable with those calculated as part of a multivariate or local regression.

Figure 7.8. Lapse-rate distribution by temperature measure for the global and local regression methods. The dashed vertical line indicates the environmental lapse rate of 6.5 °C/km.

7.4 Discussion

Estimation errors vary between methods and station class (prairie and mountain). However, relative performance is consistent across all temperature measures for the different methods. For all methods, errors are lowest for mean temperature estimates and highest for minimum temperature estimates. Errors show an annual cycle, with Tmax and Tmean errors being higher in winter than summer. Tmin errors are high in both summer and winter, and lower in the shoulder seasons. Blandford et al. (2008) reported a similar pattern in minimum temperature lapse rates, with shallower lapse rates during summer into early fall, which the authors attributed to the frequent occurrence of inversions. The presence of inversions contributes to complex spatial temperature patterns, and therefore higher errors, and these occur preferentially under certain weather types in both winter and summer.

Regression equations developed using the most closely-correlated station have the lowest errors, where correlations are calculated for seasonal weather type groupings. However, this method is appropriate for gap-filling rather than interpolation. A study estimating temperatures in the western United States using correlated stations (Eischeid et al., 2000) reported RMS errors between 1.4 and 1.8°C for maximum temperatures and 1.5 to 2°C for minimum temperatures. Equivalent errors for the FCA data are 1.1°C for Tmin and 0.9°C for Tmax. Estimating daily temperatures in British Columbia, Stahl et al. (2006a) reported RMS errors of 1.8 and 2.2°C for Tmax and Tmin respectively for the best performing methods in the year 2000. However, errors were higher for years having fewer stations and in particular where there were few high altitude stations. The high FCA station density and altitude range allows for greater prediction accuracy.

Errors in Tmax are higher in the mountains (RMSE 1.0°C) than the prairies (0.8°C), but there is no difference for Tmin (1.1°C). The highest mean absolute errors in the mountains for Tmax and Tmean occur in the far west of the study area, where altitudes and terrain variability are greatest (Figure 7.9). However, there is no systematic bias, with both positive and negative errors occurring at high altitudes; nor are the highest errors associated with the highest altitudes, as found by Stahl et al. (2006a).

Correlations for minimum temperatures are slightly lower in the prairies than the mountains. Relative elevation and land surface type (an urban warming effect) are the most common predictor variables in the multivariate regression models for minimum temperatures. Even in the low relief of the prairies, local cold pockets do occur, causing spatial variability in minimum temperatures. In agreement with this, Mahrt et al. (2001) observed cold air flows developing in shallow gullies.

Figure 7.9. Mean absolute error (°C) distribution maps for mean, maximum, and minimum temperatures, for all stations (a) and mountain sites (b).

It is interesting that correlations between mean temperatures are highest, and the errors in mean temperature estimates are the lowest. By calculating mean temperatures as the average of hourly measurements, it is possible that the inconsistent behaviour of factors influencing minimum (e.g., relative elevation) and maximum temperatures (e.g., aspect) is limited when looking at mean temperature. Local, short-duration temperature fluctuations will average out over a day, e.g., the effects of variable winds and cloud cover, and it is perhaps natural that daily average temperatures will be more coherent over the region than Tmin or Tmax. Blandford et al. (2008) found errors were lowest for Tmax, with Tmean between Tmin and Tmax. In this case, Tmean was calculated as the average of daily Tmin and Tmax, and this may not be sufficient to remove both Tmin and Tmax influences. My values of Tmean represent the actual daily average, and are not calculated from Tmin and Tmax.

When the purpose is interpolation rather than gap-filling, other techniques such as regression or kriging are used. In the mountains, multivariate regression models perform better than elevation-only models, and in the prairies, kriging models have the lowest errors. Cullen and Marshall (2011) concluded that multivariate regression for modelling monthly mean temperatures added complexity with little improvement in accuracy, compared to elevation-only models. However, the models in that study combined the mountain and prairie sites, and as shown in this study, predictor variables and seasonal or weather type influences are different in the prairie and mountain sites. Therefore, methods of interpolation and choice of predictor variables perform better when applied in topographically consistent areas. Cullen and Marshall (2011) also did not consider the systematic effects of different weather types.

I created multivariate regression models using mountain sites only, with predictor variables varied by both month and weather type. Carrega (1995) modelled mean monthly minimum and maximum temperatures in southeastern France using multivariate regression. While this study and the studies by Carrega (1995) and Cullen and Marshall (2011) have some predictor variables in common, namely elevation, slope, and aspect, each study included some additional variables. Cullen and Marshall (2011) also used land surface type, Carrega (1995) used relative elevation and distance from coast, and I used distance from the continental divide, latitude, and longitude. Therefore, it is difficult to make a direct comparison of coefficient values, but some generalizations regarding significant predictors and their values can be made.

The most common predictor variables in the mountains for Tmax were elevation, aspect and xproj. I found the elevation coefficient to be highest in summer and lowest in winter for Tmax, as did Carrega (1995). However, while aspect and slope were significant predictors year round in southeastern France, in the FCA only aspect was significant. Possibly because the French study only used 24 sites compared to the 143 FCA sites, there were fewer slope/aspect combinations, which allowed for a more consistent influence.

Cullen and Marshall (2011) reported significant aspect coefficients for Tmean in September and October, but slope was never a significant predictor. In contrast, I found aspect was never a significant predictor for Tmean, and slope was significant for all months. This could be a result of the different combinations of variables used in the models, and of my looking only at mountain sites. Slope and aspect were never significant predictors in my prairie multivariate regression models.

Carrega (1995) found relative elevation to be a strong significant predictor of minimum temperature in winter, with temperatures increasing on average by 6°C for every 100 m above a valley bottom. As slope and relative elevation are highly-correlated variables in the FCA, and models using slope rather than relative elevation had greater explanatory power, my models used slope, with the most common predictor variables being slope, xproj, and elevation. However, in Chapter 5 I showed daily minimum temperatures increasing by between 2 and 6°C/100m, with the rate of increase being a function of weather type and altitude.

Kriging models created in the prairies used a single isotropic semi-variogram function. Nonetheless, they still outperformed multivariate regressions. However, significant multivariate predictor variables indicate the kriging models can be improved by considering additional covariables, for example relative elevation and land surface type. An urban warming effect was significant for all months for Tmean and Tmin. Other studies, e.g., Gedzelman et al. (2003), showed an overnight warming effect of urban environments, with warming being strongest in late summer and fall and weaker in winter and spring. Warming effects are weakest in early winter and early summer in the FCA. The urban heat island develops most strongly during calm, clear weather; therefore, frequent chinooks during winter may reduce the effect, as may the prevalence of cool, wet days in early summer. However, Cullen and Marshall (2011) did report a significant warming coefficient for monthly mean temperatures in June.

While elevation-only (lapse-rate) models had higher errors than multivariate models, it is still interesting to compare values with other studies, as these kinds of models are commonly used for interpolation. Lapse rates calculated using FCA mountain sites vary by both weather type and month. Monthly Tmax lapse rates vary from 5.6 to 8.9°C/km, whereas Tmax weather type lapse rates vary from 1°C/km (cool_CD) to 10°C/km (chinook), with other weather types having values between 4 and 8.5°C/km. Tmean monthly lapse rates vary from 3.1 to 6.5°C/km, and 4 to 7°C/km for seasonal weather types, except the cool_CD with a value of +1.5 °C/km. Tmin lapse rates vary between 0°C/km and 3.8°C/km for months and 0°C/km to 4.3°C/km for weather types, with +3.4 °C/km for cool_CD.

Blandford et al. (2008) concluded that lapse rates were not significantly different between weather types. However, as discussed in Chapter 5, this is possibly a result of using more general SSC weather types. In the same study, suggested monthly lapse rates are given for each temperature measure, with a proviso that these are specific to the study area. Monthly FCA lapse rates tended to be steeper by up to 2°C/km for all months and temperature measures than those calculated by Blandford et al. (2008) for an area of 10 000 km2 in south-central Idaho with similar altitude range to the FCA study. However, the monthly lapse rate values I calculated are in agreement with those calculated for monthly mean temperatures (Cullen and Marshall, 2011), indicating that lapse rates can and do vary by region.

Overall, my results indicate that applications using daily temperatures, and using lapse rates for interpolation, should consider using weather-type specific lapse rates. This is particularly important for some weather types, such as cold-dry air masses and chinook conditions. Both of these are common winter weather types in southwestern Alberta, and while their effects may average out where both occur with the same frequency, application of monthly lapse rates can lead to high daily errors. For example, for mean temperature, the mean December lapse rate is 3.1°C/km, that for cool_Ch is 6.5°C/km, and that for cool_CD is +1.5°C/km. During the FCA study period, the coolest year (2009) had a December lapse rate of +1.4°C/km and the December lapse rate in the warmest year (2006) was 3.2°C/km. These averages reflect the frequency of cool_CD and chinook conditions in each of these months.
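The size of the daily error introduced by a monthly lapse rate can be illustrated with the December Tmean values quoted above. The sketch assumes a sign convention in which unsigned values are decreases with height (entered as negative) and the signed cool_CD value is an inversion; the station temperature, elevations and function name are hypothetical.

```python
def adjust_temperature(temp_c, elev_from_m, elev_to_m, lapse_rate_c_per_km):
    """Shift a temperature between elevations using a signed lapse rate.

    lapse_rate_c_per_km is dT/dz: negative means cooling with height,
    positive means an inversion (assumed sign convention).
    """
    return temp_c + lapse_rate_c_per_km * (elev_to_m - elev_from_m) / 1000.0


# Hypothetical case: -10.0 C observed at 1200 m, estimate wanted at 2200 m.
december_mean = adjust_temperature(-10.0, 1200.0, 2200.0, -3.1)  # monthly rate
chinook_day = adjust_temperature(-10.0, 1200.0, 2200.0, -6.5)    # cool_Ch rate
cold_dry_day = adjust_temperature(-10.0, 1200.0, 2200.0, 1.5)    # cool_CD rate

# Error from applying the monthly rate on a cold-dry day, per km of relief.
monthly_vs_cold_dry = cold_dry_day - december_mean
```

Under these assumptions, applying the monthly rate on a cold-dry day underestimates the higher site by 4.6°C per kilometre of elevation difference, and overestimates it by 3.4°C on a chinook day.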

Local weighted regression models have lower errors (MAE 0.5-1.0°C) than global multivariate models (1.0-1.6°C), with the greatest improvement achieved when correlations are used to select stations in each local neighbourhood. However, improvements are more modest (7-10%) when topographic characteristics are used to choose the neighbourhood. Therefore, further work is needed to identify related stations and uncover what attributes explain the highly-correlated neighbourhood. What is it about these sites that makes them good predictors? The correlation-based local models require a dense station network, which is generally unavailable, and a better understanding of the landscape and site properties that govern temperature variations is needed for temperature modelling in the absence of a dense network.

High estimation errors (> 5°C), while uncommon, do occur across the area for all models, mostly affecting minimum temperatures. The greatest number of high errors are positive, indicating that actual temperatures are lower than estimates. High errors are not restricted to a specific altitude, temperature range, or area, but the majority occur from November to February for all weather types. Therefore, it is an indication of the strong spatial temperature variability that occurs during winter in the mountains and foothills of southwestern Alberta. This is likely due to the contrast between two common winter weather types, Arctic cold days and warm chinooks, and the highly variable terrain. Other areas that experience highly-contrasting weather types and complex terrain will likely show similar high errors, particularly in the case where both weather types may be present over part of the day or part of the region.

7.5 Summary

Weather types improve model performance vs. mean annual or monthly models, but there is still a lot of inherent variability within weather types. In addition, weather types have distinct seasonal characteristics. For gap-filling, models using the most closely correlated station produce the lowest estimation errors on average, where correlations are determined for each seasonal weather type. Mean absolute errors range from 0.40°C (Tmean in the prairies) to 0.84°C (Tmin in the prairies), which is close to the estimated accuracy of the sensors of 0.5°C. Lapse rates also vary significantly by weather type, particularly in winter where cold-dry days have a lapse rate of +1.5°C/km for Tmean, whereas chinook days have a lapse rate of 6.5°C/km.

All global and local neighbourhood interpolation models have some extreme errors, and these occur most commonly for minimum temperatures during winter. Elevation-only models produce the greatest number of high errors. Local weighted regression models, where stations are selected based on correlation, produce the fewest extreme errors.

With a dense network of stations with a multiyear data record, local weighted regression models using correlated stations produce more accurate interpolated temperature surfaces than do multivariate global regression models. However, where stations are sparse and global models are all that can be used, multivariate models perform better than lapse rate (elevation regression) models. In areas with low terrain variability, kriging performs better than multivariate models.

8. CONCLUSIONS

The research presented in this thesis has focussed on methods rather than applications. In this concluding chapter I summarise the methods used and where they performed best. Where results were not so strong, I give suggestions for further work. I follow with ideas for applications where findings of this research may provide improved methods and parameters.

Due to sparse temperature networks or the need to downscale weather forecasts or climate model scenarios to more local areas of interest, temperature interpolation models are required. Performance of these models varies depending on the topography of the area, the type of measurement (daily, monthly etc.), and data availability. Few studies have included the importance of weather systems on the spatial variability of temperature in combination with geographic and topographic effects. Therefore, with this research I addressed this gap.

The following research objectives were addressed:

• Develop a weather classification framework for southwestern Alberta
• Determine the best method for gap-filling the FCA data
• Identify topographic influences on temperature
• Develop landscape-scale temperature interpolation models incorporating weather types and topography

My hypothesis was that temperature controls vary systematically depending on the weather type, and gap-filling methods and interpolation models will perform best when parameterized by weather type. The first step in my research was to develop a statistical weather classification method. Five distinct weather types were identified, namely cold-dry, chinook, cool-wet, hot and transition days. Together these days comprised 50% of days during the FCA study period. “Normal” days made up the remaining 50%. Discriminant function analysis produces more physically recognizable days than cluster analysis applied to principal components, based on a suite of hourly and daily meteorological variables.

Gap-filling, using the most closely correlated station to generate regression equations to estimate missing data, produces the lowest errors compared with multivariate topographic and local weighted regression models. Station selections based on seasonal weather-type groupings give lower errors than those calculated from monthly groupings. However, correlations should be calculated monthly for the normal weather type, which comprises 50% of days. This indicates there are strong monthly as well as weather-type controls on temperature, resulting in correlated stations varying both by month and by weather type. For daily temperature estimates over all sites, root mean square errors vary from 0.69°C (mean temperature) to 1.11°C (minimum temperature) in the mountains, and 0.55°C (mean) to 1.11°C (minimum) in the prairies. All sites have mean absolute errors less than 2°C, and 90%, 95% and 98% of daily errors for all sites are less than 2°C for Tmin, Tmax, and Tmean, respectively.
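The gap-filling approach can be sketched for a single group of days (for example, all days of one seasonal weather type). The helper names and the sample series are hypothetical, and the sketch assumes the donor station has data on every day that is missing at the target site.

```python
from math import sqrt


def _paired(target, donor):
    """(donor, target) pairs for days on which both stations have data."""
    return [(d, t) for t, d in zip(target, donor)
            if t is not None and d is not None]


def _moments(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return mx, my, sxx, syy, sxy


def pearson_r(pairs):
    _, _, sxx, syy, sxy = _moments(pairs)
    return sxy / sqrt(sxx * syy)


def fit_line(pairs):
    """Least-squares fit of target = intercept + slope * donor."""
    mx, my, sxx, _, sxy = _moments(pairs)
    slope = sxy / sxx
    return my - slope * mx, slope


def gap_fill(target, donors):
    """Fill None values in target using the most closely correlated donor."""
    best = max(donors, key=lambda name: pearson_r(_paired(target, donors[name])))
    intercept, slope = fit_line(_paired(target, donors[best]))
    filled = [t if t is not None else intercept + slope * d
              for t, d in zip(target, donors[best])]
    return best, filled


# Invented daily temperatures (degrees C) for days of one weather type.
target = [1.0, 3.0, None, 7.0, 9.0]
donors = {"stnA": [0.0, 1.0, 2.0, 3.0, 4.0], "stnB": [5.0, 1.0, 4.0, 2.0, 3.0]}
best_station, filled_series = gap_fill(target, donors)
```

In practice the function would be applied separately to each seasonal weather-type grouping (and monthly for the normal type), so that the chosen donor station can differ between groupings.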

Improvements in estimation errors compared to monthly correlations vary by weather type, with cool-wet days showing the biggest improvement, up to 20%. Cold-dry and hot days show improvements of 5-7%. This may be an indication of variability inherent within these weather types, as seen in lapse-rate distributions associated with the different weather types. While inversions are most common on cold-dry days and this type has the shallowest lapse rate, it also shows the greatest variability for lapse rates, resulting in weaker estimations. Similarly, while inversions are common for minimum temperatures on hot days, there is strong variability in lapse rates for this weather type.

Inversions most commonly develop under calm, clear-sky conditions. Therefore, incorporating additional meteorological measurements, such as wind and cloud cover, as used by Sheridan et al. (2010), to further subdivide these weather types into days conducive to the formation of inversions would potentially improve temperature models. However, these measurements are not readily available and tend to be very localized. Additional data are being collected in the Bow Valley near Banff and Canmore and this may indicate whether these conditions can be identified, and how applicable they are to the wider geographic area.

Lapse rates do show systematic seasonal and weather-type variability. Based on results from this research, I would recommend that applications requiring daily temperature estimates use lapse rates for the applicable daily weather type, if available, when
interpolating from nearby stations or downscaling reanalysis data. This may be particularly important for studies using thresholds, for example mountain pine beetle die-off.
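
The recommendation above can be sketched in a few lines of Python. The example estimates one lapse rate per weather type as the negative slope of temperature against elevation, then uses the rate for the day's weather type to adjust a reference temperature to a new elevation. The record layout, function names, and the fallback to the fixed 6.5°C/km rate are assumptions for illustration. Pooling all records of a weather type into one regression is a simplification; a real analysis would more likely fit each day separately and then summarize by type.

```python
from collections import defaultdict

FIXED_LAPSE_C_PER_KM = 6.5  # fixed rate, used here only as a fallback


def lapse_rates_by_type(records):
    """Estimate a lapse rate (deg C per km, positive = cooling with height)
    for each weather type.

    records -- iterable of (weather_type, elevation_m, temperature_c)
    """
    groups = defaultdict(list)
    for wtype, elev_m, temp_c in records:
        groups[wtype].append((elev_m / 1000.0, temp_c))
    rates = {}
    for wtype, pts in groups.items():
        n = len(pts)
        mz = sum(z for z, _ in pts) / n
        mt = sum(t for _, t in pts) / n
        szz = sum((z - mz) ** 2 for z, _ in pts)
        if szz == 0:
            continue  # all stations at one elevation: slope undefined
        slope = sum((z - mz) * (t - mt) for z, t in pts) / szz
        rates[wtype] = -slope  # a negative rate indicates an inversion
    return rates


def adjust_to_elevation(t_ref_c, z_ref_m, z_target_m, wtype, rates):
    """Shift a reference temperature to a target elevation using the
    weather-type lapse rate, or the fixed rate if the type is unknown."""
    lapse = rates.get(wtype, FIXED_LAPSE_C_PER_KM)
    return t_ref_c - lapse * (z_target_m - z_ref_m) / 1000.0
```

For threshold-based applications, the difference between a type-specific rate and the fixed rate can decide whether an estimated temperature crosses the threshold, particularly on inversion-prone days.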

Correlations between temperature and topographic measures indicate that climate-surface type interactions occur on multiple scales, e.g., daily and seasonal. Elevation shows a relatively consistent and expected relationship with temperature, although this relationship varies by weather type and temperature measure. Other topographic measures have less systematic relationships with temperature. For example, aspect is not significantly correlated with temperature, but it is a significant predictor for summer Tmax models.

The minimal influence of terrain factors may be because these operate at different spatial scales, which I do not capture in my methodology. For instance, slope and aspect calculated at a single DEM resolution may miss the appropriate scale of terrain-temperature interactions, and the relevant scale may change with time and location, depending on the specific processes and air mass interactions at play. Similarly, using a single land surface type for each site for the whole study period may not be appropriate. Over the study period there will have been cycles of seasonal leaf-on and leaf-off in deciduous forests, bare soil to crops of various heights in agricultural lands, inter-annual crop change, long-term changes in forest growth, and variable snow cover over the study area.

Correlated stations may also be keying off factors other than the topographic and geographic attributes that were used to identify related sites. Other potential temperature influences include soil moisture, canopy closure, albedo, and temporal vegetation changes. Additional measurements of these variables could be helpful to further reduce errors in gap-filled and modelled temperatures.

Correlations between stations are weakest for the cold-dry weather type. Temperatures during cold-dry weather show strong spatial variability in both horizontal and vertical dimensions. It is the only weather type where maximum temperatures show a weak correlation with elevation, indicating the presence of persistent and deep inversions. A two-layer atmosphere may improve the model; however, further work is necessary to determine the level separating the layers, and whether this level can be determined at all or is itself too variable.

Cold-air pooling is a common issue when modelling temperature. While the physical cause of cold-air pooling is well understood, determining when it occurs and how strongly it develops is difficult. Different temporal and spatial scales are involved, and the location and magnitude of the pooling is not constant. Variability within weather types results in variability in cold-air pooling sites. More detailed analysis is required to determine the vertical depth and rate of cooling within the cold layer under different weather conditions. In addition, there are expected physical relationships between cold-pool development and valley size and drainage catchment (i.e., contributing area). In a regional study such as the FCA, this may be related to relative and absolute elevation, but there may be better proxies for this process.

Sub-setting the data into mountain and prairie sites indicates that there are different influences on temperature in the two areas, with topographic influences stronger in the high-relief mountains, and spatial proximity being dominant in the low-relief prairies. Selection of the optimum interpolation method therefore varies, depending on the relief in the area of interest. Regression models provide the best estimates in the mountains, whereas kriging gives the lowest errors in the prairies.

In mountainous terrain, elevation-only models perform poorly, especially for minimum temperatures. Where station data are plentiful, local weighted regression models using a neighbourhood of correlated stations provide the lowest error estimates. However, in the absence of a dense station network, including other topographic and geographic predictor variables in a multivariate regression model reduces estimation errors relative to elevation-only models.

Correlations between different terrain and geographic parameters vary by month and by weather type. However, using topographic similarity to determine highly correlated stations gives only weak prediction of the most-correlated neighbours. This indicates that either required variables are not used (not measured) or variability operates at different spatial and temporal scales. Therefore, for gap-filling missing data, using the most closely correlated stations calculated by seasonal weather type is worth the additional work of creating a weather classification. However, the improvement gained for landscape interpolation models is smaller and, without further work, may not justify the effort.

I finish with some suggestions for continuing this research, and applications where the gap-filled data can be used.

1. Correlations performed well in identifying stations with similar behaviour, but the method to emulate correlations based on topographic similarity was weak. Suggestions for improvements include: calculating topographic measures at multiple spatial scales; combining measures, e.g., slope and aspect; using a two-layer atmosphere as a means of determining the upper limit of an inversion layer that results from surface influences; and treating relative elevation as a function of absolute elevation and as a nonlinear or categorical variable.

2. The FCA is a spatially dense network spanning a large altitude range, incorporating multiple terrain and surface types, and recording data for 5 years. Other regional weather monitoring networks exist, but do not always sample the same diversity of landscape as the FCA. For example, WegenerNet in Austria (Kirchengast et al., 2014) has a higher spatial density and has been recording data since 2007, but has an altitude range of less than 300 m. However, studies like the FCA are expensive and labour intensive. It would therefore be interesting to decimate the FCA to find out what spatial density is necessary, and what relief and altitude range should be sampled, to adequately capture temperature variability. Do such recommendations vary by weather type?

3. Using the gap-filled FCA data, a detailed study of a specific weather type, e.g., chinooks, would be interesting, to expand on research by Nkemdirim (1996). A spatially dense network may be able to identify fine scale spatial patterns not previously captured.

4. Applications to other research areas should be explored, e.g., in hydrology, glaciology, avalanche studies, or ecology (fire conditions, pest management, etc.). These fields frequently rely on downscaling, using measurements from climate reanalyses or distant, low-altitude temperature data. The FCA dataset could be used to evaluate the performance of different downscaling methods – elevation only with fixed lapse rate (6.5°C/km), monthly lapse rate as calculated in this study, or a daily variable lapse rate using weather-type lapse rates determined in this study.
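
The comparison of downscaling methods suggested in item 4 could be scored with the same kinds of error measures reported earlier in this chapter: root mean square error, mean absolute error, and the share of daily errors below 2°C. The Python sketch below is one minimal way to do that. The function names and the dictionary of predictions keyed by method name are hypothetical, and no values from the thesis are used.

```python
from math import sqrt


def score(observed, predicted, threshold_c=2.0):
    """Summarize prediction errors for one method.

    Returns RMSE, MAE, and the fraction of days whose absolute error
    is below `threshold_c` (deg C).
    """
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    return {
        "rmse": sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        "within": sum(abs(e) < threshold_c for e in errors) / n,
    }


def rank_methods(observed, predictions_by_method):
    """Score each method against the observations, lowest RMSE first."""
    scores = {name: score(observed, pred)
              for name, pred in predictions_by_method.items()}
    return sorted(scores.items(), key=lambda item: item[1]["rmse"])
```

Here `predictions_by_method` might hold one list of estimates for each of the fixed, monthly, and weather-type lapse-rate schemes, all evaluated against the same withheld station observations.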

REFERENCES

Ahrens, C. D. 2008. Essentials of meteorology: an invitation to the atmosphere. (5th ed.) Brooks/Cole.

Baker, D. G., Ruschy, D. L., Skaggs, R. H. and Wall, D. B. 1992. Air temperature and radiation depressions associated with a snow cover. Journal of Applied Meteorology. 31(3): 247-254.

Baltazar, J. C. and Claridge, D. E. 2002. Restoration of short periods of missing energy use and weather data using cubic spline and Fourier series approaches: Qualitative analysis. Energy Systems Laboratory (http://esl.tamu.edu); Texas A&M University. Available electronically from http://hdl.handle.net/1969.1/4575.

Bárdossy, A., Stehlík, J. and Caspary, H. J. 2002. Automated objective classification of daily circulation patterns for precipitation and temperature downscaling based on optimized fuzzy rules. Climate Research. 23(1): 11-22.

Barry, R. G. 2001. Mountain weather & climate. Psychology Press.

Barry, R. G. and Chorley, R. J. 2010. Atmosphere, weather, and climate. (9th ed.) Routledge.

Barry, R. G. and Perry, A. H. 1973. Synoptic climatology: methods and applications. Routledge Kegan & Paul.

Beck, C. and Philipp, A. 2010. Evaluation and comparison of circulation type classifications for the European domain. Physics and Chemistry of the Earth, Parts A/B/C. 35(9): 374-387.

Beniston, M. 2006. Mountain weather and climate: A general overview and a focus on climatic change in the Alps. Hydrobiologia. 562: 3-16.

Blandford, T. R., Humes, K. S., Harshburger, B. J., Moore, B. C., Walden, V. P. and Ye, H. 2008. Seasonal and synoptic variations in near-surface air temperature lapse rates in a mountainous basin. Journal of Applied Meteorology and Climatology. 47(1): 249-261.

Bolstad, P., Swift, L., Collins, F. and Regniere, J. 1998. Measured and predicted air temperatures at basin to regional scales in the southern Appalachian mountains. Agricultural and Forest Meteorology. 91(3-4): 161-176.

Bonan, G. 2008. Ecological climatology: concepts and applications. Cambridge University Press. 2nd edition.

Bower, D., McGregor, G. R., Hannah, D. M. and Sheridan, S. C. 2007. Development of a spatial synoptic classification scheme for western Europe. International Journal of Climatology. 27(15): 2017-2040.

Burnham, K. P. and Anderson, D. R. 2002. Model Selection and Multimodel Inference: a practical information-theoretic approach. Springer-Verlag. 2nd ed.

Burrough, P. A. and McDonnell, R. A. 1998. Principles of Geographical Information Systems. Oxford University Press, Oxford.

Carrega, P. 1995. A Method for the Reconstruction of Mountain Air Temperatures with Automatic Cartographic Applications. Theoretical and Applied Climatology. 52(1-2): 69-84.

Casola, J. H. and Wallace, J. M. 2007. Identifying weather regimes in the wintertime 500-hPa geopotential height field for the Pacific-North American sector using a limited-contour clustering technique. Journal of Applied Meteorology and Climatology. 46(10): 1619-1630.

Cassano, E. N., Lynch, A. H., Cassano, J. J. and Koslow, M. R. 2006. Classification of synoptic patterns in the western Arctic associated with extreme events at Barrow, Alaska, USA. Climate Research. 30(2): 83-97.

Cheng, C. S., Li, G., Li, Q. and Auld, H. 2010. A Synoptic Weather Typing Approach to Simulate Daily Rainfall and Extremes in Ontario, Canada: Potential for Climate Change Projections. Journal of Applied Meteorology and Climatology. 49(5): 845-866.

Cheng, C., Auld, H., Li, G., Klaassen, J., Tugwood, B. and Li, Q. 2004. An automated synoptic typing procedure to predict freezing rain: An application to Ottawa, Ontario, Canada. Weather and Forecasting. 19(4): 751-768.

Chung, U., Seo, H. H., Hwang, K. H., Hwang, B. S., Choi, J., Lee, J. T. and Yun, J. I. 2006. Minimum temperature mapping over complex terrain by estimating cold air accumulation potential. Agricultural and Forest Meteorology. 137(1-2): 15-24.

Chung, Y., Hage, K. and Reinelt, E. 1976. Lee Cyclogenesis and Air-Flow in Canadian Rocky Mountains and East Asian Mountains. Monthly Weather Review. 104(7): 879-891.

Coulibaly, P. and Evora, N. D. 2007. Comparison of neural network methods for infilling missing daily weather records. Journal of Hydrology. 341(1-2): 27-41.

Courault, D. and Monestiez, P. 1999. Spatial interpolation of air temperature according to atmospheric circulation patterns in southeast France. International Journal of Climatology. 19(4): 365-378.

Cullen, R. M. and Marshall, S. J. 2011. Mesoscale Temperature Patterns in the Rocky Mountains and Foothills Region of Southern Alberta. Atmosphere-Ocean. 49(3): 189-205.

Daly, C. 2006. Guidelines for assessing the suitability of spatial climate data sets. International Journal of Climatology. 26(6): 707-721.

Daly, C., Gibson, W., Taylor, G., Johnson, G. and Pasteris, P. 2002. A knowledge-based approach to the statistical mapping of climate. Climate Research. 22(2): 99-113.

Daly, C., Helmer, E. and Quinones, M. 2003. Mapping the climate of Puerto Rico, Vieques and Culebra. International Journal of Climatology. 23(11): 1359-1381.

Daly, C., Conklin, D. R. and Unsworth, M. H. 2010. Local atmospheric decoupling in complex topography alters climate change impacts. International Journal of Climatology. 30(12): 1857-1864.

Daly, C., Halbleib, M., Smith, J. I., Gibson, W. P., Doggett, M. K., Taylor, G. H., Curtis, J. and Pasteris, P. P. 2008. Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. International Journal of Climatology. 28(15): 2031-2064.

Daly, C., Smith, J. W., Smith, J. I. and McKane, R. B. 2007. High-resolution spatial modeling of daily weather elements for a catchment in the Cascade Mountains, United States. Journal of Applied Meteorology and Climatology. 46(10): 1565-1586.

Davis, R. and Kalkstein, L. 1990. Development of an Automated Spatial Synoptic Climatological Classification. International Journal of Climatology. 10(8): 769-794.

Davis, R. and Walker, D. 1992. An Upper-Air Synoptic Climatology of the Western United-States. Journal of Climate. 5(12): 1449-1467.

Dodson, R. and Marks, D. 1997. Daily air temperature interpolated at high spatial resolution over a large mountainous region. Climate Research. 8(1): 1-20.

Dubayah, R. C. 1994. Modeling a solar radiation topoclimatology for the Rio Grande River Basin. Journal of Vegetation Science. 5(5): 627-640.

Durre, I., Menne, M. J., Gleason, B. E., Houston, T. G. and Vose, R. S. 2010. Comprehensive Automated Quality Assurance of Daily Surface Observations. Journal of Applied Meteorology and Climatology. 49(8): 1615-1633.

Eischeid, J. K. and Pasteris, P. A. 2000. Creating a Serially Complete, National Daily Time Series of Temperature and Precipitation for the western United States. Journal of Applied Meteorology. 39(9): 1580.

Enke, W. and Spekat, A. 1997. Downscaling climate model outputs into local and regional weather elements by classification and regression. Climate Research. 8(3): 195-207.

Environment Canada. 2016. Canadian Climate Normals for 1981 to 2000. Environment Canada. (on line) http://www.climate.weatheroffice.ec.gc.ca/climate_normals/index_e.html (accessed April 2016).

Esteban, P., Martin-Vide, J. and Mases, M. 2006. Daily atmospheric circulation catalogue for Western Europe using multivariate techniques. International Journal of Climatology. 26(11): 1501-1515.

Esteban, P., Ninyerola, M. and Prohom, M. 2009. Spatial modelling of air temperature and precipitation for Andorra (Pyrenees) from daily circulation patterns. Theoretical and Applied Climatology. 96(1-2): 43-56.

Flesch, T. K. and Reuter, G. W. 2012. WRF Model Simulation of Two Alberta Flooding Events and the Impact of Topography. Journal of Hydrometeorology. 13(2): 695-708.

Fowler, H., Blenkinsop, S. and Tebaldi, C. 2007. Linking climate change modelling to impacts studies: recent advances in downscaling techniques for hydrological modelling. International Journal of Climatology. 27(12): 1547-1578.

Frakes, B. and Yarnal, B. 1997. A procedure for blending manual and correlation-based synoptic classifications. International Journal of Climatology. 17(13): 1381-1396.

Fridley, J. D. 2009. Downscaling Climate over Complex Terrain: High Finescale (< 1000 m) Spatial Variation of Near-Ground Temperatures in a Montane Forested Landscape (Great Smoky Mountains). Journal of Applied Meteorology and Climatology. 48(5): 1033-1049.

Garson, G. D. 2012. Discriminant Function Analysis. Asheboro, NC: Statistical Associates Publishers.

Garson, G. D. 2013. Factor Analysis. Asheboro, NC: Statistical Associates Publishers.

Gedzelman, S. D., Austin, S., Cermak, R., Stefano, N., Partridge, S., Quesenberry, S. and Robinson, D. A. 2003. Mesoscale aspects of the urban heat island around New York City. Theoretical and Applied Climatology. 75(1-2): 29-42.

Georges, C. and Kaser, G. 2002. Ventilated and unventilated air temperature measurements for glacier-climate studies on a tropical high mountain site. Journal of Geophysical Research-Atmospheres. 107(D24): 4775.

Godson, W. L. 1950. The structure of North American weather systems. Proceedings of the Royal Meteorological Society. 3. 89-106.

Graybeal, D. Y., DeGaetano, A. T. and Eggleston, K. L. 2004. Improved Quality Assurance for Historical Hourly Temperature and Humidity: Development and Application to Environmental Analysis. Journal of Applied Meteorology. 43(11): 1722-1735.

Grohmann, C. H. 2015. Effects of spatial resolution on slope and aspect derivation for regional-scale analysis. Computers & Geosciences. 77: 111-117.

Gustavsson, T., Karlsson, M., Bogren, J. and Lindqvist, S. 1998. Development of temperature patterns during clear nights. Journal of Applied Meteorology. 37(6): 559-571.

Hall Jr, P. K., Morgan, C. R., Gartside, A. D., Bain, N. E., Jabrzemski, R. and Fiebrich, C. A. 2008. Use of climate data to further enhance quality assurance of Oklahoma Mesonet observations. 20th Conf. on Climate Variability and Change.

Hare, F. K. and Thomas, M. K. 1974. Climate Canada. J. Wiley & Sons Canada.

Hartkamp, A. D., De Beurs, K., Stein, A. and White, J. W. 1999. Interpolation Techniques for Climate Variables. NRG-GIS Series 99-01: 34 pp. Mexico, D.F.: CIMMYT.

Hay, L. E. and Clark, M. 2003. Use of statistically and dynamically downscaled atmospheric model output for hydrologic simulations in three mountainous basins in the western United States. Journal of Hydrology. 282(1): 56-75.

Hijmans, R., Cameron, S., Parra, J., Jones, P. and Jarvis, A. 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology. 25(15): 1965-1978.

Hocke, K. and Kämpfer, N. 2009. Gap filling and noise reduction of unevenly sampled data by means of the Lomb-Scargle periodogram. Atmospheric Chemistry and Physics. 9(12): 4197-4206.

Hubbard, K. G., Guttman, N. B., You, J. and Chen, Z. 2007. An improved QC process for temperature in the daily cooperative weather observations. Journal of Atmospheric and Oceanic Technology. 24(2): 206-213.

Hungerford, R. D., Nemani, R. R., Running, S. W. and Coughlan, J. C. 1989. MTCLIM: a mountain microclimate simulation model. USDA Forest Service Research Paper INT. 414: 52.

Huth, R. and Nemesova, I. 1995. Estimation of Missing Daily Temperatures - can a Weather Categorization Improve its Accuracy. Journal of Climate. 8(7): 1901-1916.

Huth, R., Beck, C., Philipp, A., Demuzere, M., Ustrnul, Z., Cahynova, M., Kysely, J. and Tveito, O. E. 2008. Classifications of Atmospheric Circulation Patterns Recent Advances and Applications. Trends and Directions in Climate Research. 1146: 105-152.

Huwald, H., Higgins, C. W., Boldi, M., Bou-Zeid, E., Lehning, M. and Parlange, M. B. 2009. Albedo effect on radiative errors in air temperature measurements. Water Resources Research. 45: W08431.

Ishida, T. and Kawashima, S. 1993. Use of Cokriging to Estimate Surface Air-Temperature from Elevation. Theoretical and Applied Climatology. 47(3): 147-157.

Jones, C., Fujioka, F. and Carvalho, L. M. 2010. Forecast skill of synoptic conditions associated with Santa Ana winds in Southern California. Monthly Weather Review. 138(12): 4528-4541.

Kalkstein, L. and Corrigan, P. 1986. A Synoptic Climatological Approach for Geographical Analysis - Assessment of Sulfur-Dioxide Concentrations. Annals of the Association of American Geographers. 76(3): 381-395.

Kalkstein, L., Nichols, M., Barthel, C. and Greene, J. 1996. A new spatial synoptic classification: Application to air-mass analysis. International Journal of Climatology. 16(9): 983-1004.

Kalkstein, L., Tan, G. and Skindlov, J. 1987. An Evaluation of 3 Clustering Procedures for use in Synoptic Climatological Classification. Journal of Climate and Applied Meteorology. 26(6): 717-730.

Kalnay, E., Kanamitsu, M., Kistler, R., Collins, W., Deaven, D., Gandin, L., Iredell, M., Saha, S., White, G., Woollen, J., Zhu, Y., Leetma, A., Reynolds, R., Chelliah, M., Ebisuzaki, W., Higgins, W., Janowiak, J., Mo, K. C., Ropelewski, C., Wang, J., Jenne, R. and Joseph, D. 1996. The NCEP/NCAR 40-year reanalysis project. Bulletin of the American Meteorological Society. 77: 437-470.

Karlsson, I. M. 2000. Nocturnal Air Temperature Variations between Forest and Open Areas. Journal of Applied Meteorology. 39(6): 851-862.

Kirchengast, G., Kabas, T., Leuprecht, A., Bichler, C. and Truhetz, H. 2014. WegenerNet: A pioneering high-resolution network for monitoring weather and climate. Bulletin of the American Meteorological Society. 95(2): 227-242.

Kirchner, M., Faus-Kessler, T., Jakobi, G., Leuchner, M., Ries, L., Scheel, H. E. and Suppan, P. 2013. Altitudinal temperature lapse rates in an Alpine valley: trends and the influence of season and weather patterns. International Journal of Climatology. 33(3): 539-555.

Kondrashov, D. and Ghil, M. 2006. Spatio-temporal filling of missing points in geophysical data sets. Nonlinear Processes in Geophysics. 13(2): 151-159.

Konrad, C. E. 1998. Persistent Planetary Scale Circulation Patterns and their Relationship with Cold Air Outbreak Activity Over the Eastern United States. International Journal of Climatology. 18: 1209-1221.

Kumar, L., Skidmore, A. K. and Knowles, E. 1997. Modelling topographic variation in solar radiation in a GIS environment. International Journal of Geographical Information Science. 11(5): 475-497.

Kunkel, K., Easterling, D., Hubbard, K., Redmond, K., Andsager, K., Kruk, M. and Spinar, M. 2005. Quality control of pre-1948 cooperative observer network data. Journal of Atmospheric and Oceanic Technology. 22(11): 1691-1705.

Lanzante, J. 1996. Resistant, robust and non-parametric techniques for the analysis of climate data: Theory and examples, including applications to historical radiosonde station data. International Journal of Climatology. 16(11): 1197-1226.

Leathers, D. J., Ellis, A. W. and Robinson, D. A. 1995. Characteristics of temperature depressions associated with snow cover across the northeast United States. Journal of Applied Meteorology. 34(2): 381-390.

Lundquist, J. D. and Cayan, D. R. 2007. Surface temperature patterns in complex terrain: Daily variations and long-term change in the central Sierra Nevada, California. Journal of Geophysical Research: Atmospheres. 112(D11).

Lundquist, J. D., Pepin, N. and Rochford, C. 2008. Automated algorithm for mapping regions of cold-air pooling in complex terrain. Journal of Geophysical Research: Atmospheres. 113(D22).

Mahrt, L. 2006. Variation of surface air temperature in complex terrain. Journal of Applied Meteorology and Climatology. 45(11): 1481-1493.

McCutchan, M. 1976. Diagnosing and Predicting Surface-Temperature in Mountainous Terrain. Monthly Weather Review. 104(8): 1044-1051.

McCutchan, M. 1978. Model for Predicting Synoptic Weather Types Based on Model Output Statistics. Journal of Applied Meteorology. 17(10): 1466-1475.

Miller, P. and Benjamin, S. 1992. A System for the Hourly Assimilation of Surface Observations in Mountainous and Flat Terrain. Monthly Weather Review. 120(10): 2342-2359.

Minder, J. R., Mote, P. W. and Lundquist, J. D. 2010. Surface temperature lapse rates over complex terrain: Lessons from the Cascade Mountains. Journal of Geophysical Research: Atmospheres. 115(D14).

Moore, R. D. and McKendry, I. G. 1996. Spring snowpack anomaly patterns and winter climatic variability, British Columbia, Canada. Water Resources Research. 32(3): 623-632.

Mosier, T. M., Hill, D. F. and Sharp, K. V. 2014. 30-Arcsecond monthly climate surfaces with global land coverage. International Journal of Climatology. 34(7): 2175-2188.

Nakamura, R. and Mahrt, L. 2005. Air temperature measurement errors in naturally ventilated radiation shields. Journal of Atmospheric and Oceanic Technology. 22(7): 1046-1058.

Nalder, I. and Wein, R. 1998. Spatial interpolation of climatic Normals: test of a new method in the Canadian boreal forest. Agricultural and Forest Meteorology. 92(4): 211-225.

Nkemdirim, L. 1988. On the Frequency of Precipitation-Days in Calgary, Canada. Professional Geographer. 40(1): 65-76.

Nkemdirim, L. 1996. Canada's chinook belt. International Journal of Climatology. 16(4): 441-462.

Oke, T. R. 1987. Boundary layer climates (2nd ed.) Methuen.

Pepin, N. and Kidd, D. 2006. Spatial temperature variation in the Eastern Pyrenees. Weather. 61(11): 300-310.

Pepin, N. 2001. Lapse rate changes in northern England. Theoretical and Applied Climatology. 68(1-2): 1-16.

Pepin, N., Benham, D. and Taylor, K. 1999. Modeling lapse rates in the maritime uplands of northern England: Implications for climate change. Arctic, Antarctic, and Alpine Research. 151-164.

Pielke, R., Garstang, M., Lindsey, C. and Gusdorf, J. 1987. Use of a Synoptic Classification Scheme to Define Seasons. Theoretical and Applied Climatology. 38(2): 57-68.

Reeves, H. D. and Stensrud, D. J. 2009. Synoptic-scale flow and valley cold pool evolution in the western United States. Weather and Forecasting. 24(6): 1625-1643.

Rolland, C. 2003. Spatial and seasonal variations of air temperature lapse rates in Alpine regions. Journal of Climate. 16(7): 1032-1046.

Schoof, J. and Pryor, S. 2001. Downscaling temperature and precipitation: A comparison of regression-based methods and artificial neural networks. International Journal of Climatology. 21(7): 773-790.

Shafer, M., Fiebrich, C., Arndt, D., Fredrickson, S. and Hughes, T. 2000. Quality assurance procedures in the Oklahoma Mesonetwork. Journal of Atmospheric and Oceanic Technology. 17(4): 474-494.

Shea, J., Marshall, S. and Livingston, J. 2004. Glacier distributions and climate in the Canadian Rockies. Arctic, Antarctic, and Alpine Research. 36(2): 272-279.

Shen, S., Dzikowski, P., Li, G. and Griffith, D. 2001. Interpolation of 1961-97 daily temperature and precipitation data onto Alberta polygons of ecodistrict and soil landscapes of Canada. Journal of Applied Meteorology. 40(12): 2162-2177.

Sheridan, P., Smith, S., Brown, A. and Vosper, S. 2010. A simple height-based correction for temperature downscaling in complex terrain. Meteorological Applications. 17(3): 329-339.

Sheridan, S. C. 2002. The redevelopment of a weather-type classification scheme for North America. International Journal of Climatology. 22(1): 51-68.

Sheridan, S. C., Pirhalla, D. E., Lee, C. C. and Ransibrahmanakul, V. 2013. Evaluating Linkages of Weather Patterns and Water Quality Responses in South Florida Using a Synoptic Climatological Approach. Journal of Applied Meteorology and Climatology. 52(2): 425-438.

Snell, S., Gopal, S. and Kaufmann, R. 2000. Spatial interpolation of surface air temperatures using artificial neural networks: Evaluating their use for downscaling GCMs. Journal of Climate. 13(5): 886-895.

Stahl, K., Moore, R. D., Floyer, J. A., Asplin, M. G. and McKendry, I. G. 2006a. Comparison of approaches for spatial interpolation of daily air temperature in a large region with complex topography and highly variable station density. Agricultural and Forest Meteorology. 139(3-4): 224-236.

Stahl, K., Moore, R. and McKendry, I. 2006b. The role of synoptic-scale circulation in the linkage between large-scale ocean-atmosphere indices and winter surface climate in British Columbia, Canada. International Journal of Climatology. 26(4): 541-560.

Stewart, R. E., Bachand, D., Dunkley, R. R., Giles, A. C., Lawson, B., Legal, L., Miller, S. T., Murphy, B. P., Parker, M. N., Paruk, B. J. and Yau, M. K. 1995. Winter Storms over Canada. Atmosphere-Ocean. 33(2): 223-247.

Stooksbury, D., Idso, C. and Hubbard, K. 1999. The effects of data gaps on the calculated monthly mean maximum and minimum temperatures in the continental United States: A spatial and temporal study. Journal of Climate. 12(5): 1524-1533.

Tabachnick, B. G. and Fidell, L. S. 1996. Using Multivariate Statistics. Allyn & Bacon. 4th ed.

Tabony, R. 1985. The variation of surface temperature with altitude. Meteorological Magazine. 114(1351): 37-48.

Trewin, B. C. 2005. A notable frost hollow at Coonabarabran, New South Wales. Australian Meteorological Magazine. 54(1): 15.

Tsonis, A. A. 2002. An introduction to atmospheric thermodynamics. Cambridge University Press.

Tveito, O. E. and Førland, E. J. 1999. Mapping temperatures in Norway applying terrain information, geostatistics and GIS. Norsk Geografisk Tidsskrift-Norwegian Journal of Geography. 53(4): 202-212.

Vincent, L. A., Wang, X. L., Milewska, E. J., Wan, H., Yang, F. and Swail, V. 2012. A second generation of homogenized Canadian monthly surface air temperature for climate trend analysis. Journal of Geophysical Research: Atmospheres. 117(D18).

Vincent, L. A., Zhang, X., Bonsal, B. and Hogg, W. 2002. Homogenization of daily temperatures over Canada. Journal of Climate. 15(11): 1322-1334.

Von Storch, H. and Zwiers, F. W. 2001. Statistical analysis in climate research. Cambridge University Press.

Wade, C. G. 1987. A quality control program for surface mesometeorological data. Journal of Atmospheric and Oceanic Technology. 4(3): 435-453.

Walker, E. R. 1961. A Synoptic Climatology for Parts of the Western Cordillera. (No. PIM 35). McGill University, Montreal, Quebec.

Wan, H., Wang, X. L. and Swail, V. R. 2010. Homogenization and trend analysis of Canadian near-surface wind speeds. Journal of Climate. 23(5): 1209-1225.

Whiteman, C. D., Allwine, K. J., Fritschen, L. J., Orgill, M. M. and Simpson, J. R. 1989. Deep valley radiation and surface energy budget microclimates. Part II: Energy budget. Journal of Applied Meteorology. 28(6): 427-437.

Whiteman, C. D., Pospichal, B., Eisenbach, S., Weihs, P., Clements, C. B., Steinacker, R., Mursch-Radlgruber, E. and Dorninger, M. 2004. Inversion breakup in small Rocky Mountain and Alpine basins. Journal of Applied Meteorology. 43(8): 1069-1082.

Whiteman, C. D. 2000. Mountain Meteorology: Fundamentals and Applications. Oxford University Press.

Wilby, R., Charles, S., Zorita, E., Timbal, B., Whetton, P. and Mearns, L. 2004. Guidelines for use of climate scenarios developed from statistical downscaling methods.

World Meteorological Organization. 2008. Guide to meteorological instruments and methods of observations. 7th edition. World Meteorological Organization, Geneva, Switzerland.

Yarnal, B., Comrie, A., Frakes, B. and Brown, D. 2001. Developments and prospects in synoptic climatology. International Journal of Climatology. 21(15): 1923-1950.
