INTERNATIONAL JOURNAL OF CLIMATOLOGY Int. J. Climatol. 28: 629–641 (2008) Published online 12 June 2007 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/joc.1561

Mesoscale climate analysis: identification of anemological regions and wind regimes

M. Burlando (a,b)*, M. Antonelli (a,b) and C. F. Ratto (a,b)
(a) Department of Physics, University of Genoa, Via Dodecaneso, Genoa, Italy
(b) National Consortium of Universities for Physics of Atmosphere and Hydrosphere (CINFAI), Italy

ABSTRACT: Following the idea that the climatological study of a physical variable should aim at the comprehension of its mean state as well as the characterization of its dynamics, cluster analysis has been applied to study the wind climate of Corsica (France) in order to identify the anemological regions (mean state) and the wind regimes (weather variability) which characterize its coastal areas. The analysis is based on a 3-year-long time-series of wind velocity measurements from 11 anemometric stations located along the perimeter of the island. Since the present study was preliminary to the subsequent assessment of the wind potential of Corsica, we have worked only with wind intensities. Nevertheless, at the end of our analysis, we have also considered wind directions for the final interpretation of the results. The anemological regions are defined through the comparison of 15 different clustering techniques resulting from the combination of three distance measures and five agglomerative methods. As confirmed by geographical considerations, the results identify three distinct anemological regions: the eastern region (ER), the north-western region (NWR), and the south-western region (SWR). The wind regimes are identified by means of a two-stage classification scheme based on a hierarchical cluster analysis followed by a partitional clustering. The final classification identifies eight regimes: the four wind regimes corresponding to the main weather patterns of Western Europe, as proposed by Plaut and Simonnet, and another four clusters corresponding to breeze regimes. Copyright © 2007 Royal Meteorological Society

KEY WORDS anemological regions; hierarchical and partitional clustering; mesoscale wind climate; wind regimes

Received 15 May 2006; Revised 25 February 2007; Accepted 15 April 2007

* Correspondence to: M. Burlando, Department of Physics, University of Genoa, Via Dodecaneso 33, 16146 Genoa, Italy. E-mail: burlando@fisica.unige.it

1. Introduction

The interest in regional climatological studies aimed at defining recurrent weather patterns or delineating zones of similar climate has recently grown considerably in the international scientific community. In particular, this interest is promoted by the need for defining objective techniques for climate classification and monitoring climate changes. In this framework, the present study is focused on two main questions concerning the classification of meso-β scale (after Orlanski, 1975) wind fields: the regionalization of a territory from an anemological point of view and the identification of its main wind regimes.

First attempts at the regionalization of wind climate and classification of wind regimes based on the analysis of an ensemble of wind or meteorological fields date back to the late 1980s, when automated methods relying on statistical procedures to reduce large, multivariate datasets into distinct weather types began to take place thanks to increasing computational resources. In particular, two main mathematical approaches have been used extensively to study wind data: linear techniques, such as the orthogonal eigen-modes decomposition, and cluster analysis. Linear approaches rely on the hypothesis that, in the multidimensional phase space of the wind patterns (defined like the vector field of the wind, i.e. the map which assigns each point of the space the corresponding wind vector), the mean wind field is at the centre of the phase space, while wind regimes (the most recurrent wind patterns) are identified by specific directions defined by eigenvectors. In this case, it should be possible to reveal strong correlations among spatially stationary meteorological wind conditions. Ludwig et al. (2004), for example, have been able to identify the most important wind patterns in valleys south of the Great Salt Lake (Utah) from empirical orthogonal function (EOF) analysis. In particular, they clearly distinguished the circulation of up- and down-valley flows for negative and positive coefficients of EOF1, respectively. However, they could not identify satisfactorily the secondary wind patterns because these are neither necessarily orthogonal to EOF1 nor coupled in spatial structures with inverse polarity, as constrained by the linear decomposition. Non-linear approaches, on the contrary, are based on the identification of attractors which correspond to weather regimes (Lorenz, 1963), defined as peaks of the probability density function in the climatological phase space. Cluster analysis, in particular, is a multivariate statistical technique based on the assumption

that a collection of events can be grouped into a small number of representative states according to a given criterion of similarity (Everitt, 1977). This is why cluster analysis has been widely adopted in climatology for grouping stations or grid points to define climate regions, as well as for grouping meteorological patterns into classes or climate regimes.

In defining climate regions through clustering techniques, measurements from a set of stations or gridded data from numerical simulations or data assimilation are generally used. For example, Davis and Kalkstein, in 1990, developed a spatial synoptic surface climatology for the continental U.S. to group locations with homogeneous weather conditions. Fovell and Fovell (1993) accomplished a regionalization of the U.S. in climate zones through the hierarchical cluster analysis of temperature and precipitation data over 50 years. On the basis of the same databases, Bunkers et al. (1996) redefined the climate regions in the U.S. northern Plains through a hierarchical clustering method followed by a pseudo-hierarchical iterative procedure to optimize the final classification through element reassignment. Finally, Mimmack et al. (2001) applied a hierarchical clustering to cumulated monthly rainfalls to define rainfall regions in South Africa.

Alternatively, a great number of examples concern the use of cluster analysis to identify climate regimes by grouping recurrent time-series of contemporary meteorological observations or gridded data. In 1987, Kalkstein et al. used a combination of principal component analysis and clustering to study the time-series of seven meteorological variables collected in a single surface location, in order to develop a synoptic index to classify the temporal succession of synoptic situations. Davis and Walker (1992) applied a similar technique to upper-air synoptic climatology using thermal, moisture and flow data from a rawinsonde to classify seasonal and inter-annual synoptic scale variations in hydrodynamic and thermodynamic conditions. Eder et al. (1994) proposed a two-stage clustering classification scheme designed to elucidate the dependence of ozone on meteorology. Mengelkamp et al. (1997) studied the large-scale wind climatology making a cluster analysis of 850 hPa wind fields and vertical temperature gradients from time-series of radiosonde data to develop a statistical dynamical downscaling between upper-air and surface wind fields by means of a regional atmospheric model. The same procedure was applied by Mengelkamp (1999) for studying the wind potential of a complex terrain area. More recently, cluster analysis has been used for the identification of sets of forecasts in ensemble prediction systems (for example: Stephenson and Doblas–Reyes, 2000) and also to study climate dynamics (Smith et al., 1999; Cassou et al., 2004; Straus and Molteni, 2004).

It is worth noting that all the aforementioned studies rely either on the identification of climate regions or on the classification of climate regimes, but they usually do not deal with both these topics together. Climate regions are defined as zones of similar average meteorological conditions, but the atmosphere does not merely evolve around a mean state; on the contrary it spends most of the time among a few peculiar large-scale states. The study of weather regimes is therefore necessary in order to understand the interactions between the forcing mechanisms at synoptic and local scales which mainly contribute to defining different climate regions. The idea that climate must be studied not only through the comprehension of its mean state but also through the understanding of its dynamics is the leading point of the present research into the wind climate regions and regimes of Corsica.

In the next section an overview of the main synoptic weather regimes of Western Europe and a short description of the territory under study are reported, in order to give an idea of the expected large- and local-scale forcing to surface flow fields. Section 3 describes the available anemological data, their standardization in order to calculate the distances for clustering, as well as a short presentation of clustering methods. It is worth noting that the present study was preliminary to the assessment of the wind energy potential of Corsica, so that we have based our analysis only on wind intensity patterns, i.e. scalar fields of the wind intensity mapped into the physical space. The vector fields of wind intensity and direction have been used just for the final interpretation of the results. In Section 4 a great number of clustering algorithms to define wind climate regions are compared, and the corresponding results are shown. Then, the methodology for the identification of different wind climate regimes and the resulting classification are reported in Section 5. Conclusions are drawn in Section 6.

2. Synoptic and local forcing to wind climate

As already stated, the present study was the first part of a more general research concerning the assessment of the wind potential of Corsica. In this framework, we had the necessity of defining the main large-scale synoptic flows as well as the local forcing on wind patterns.

In Western Europe there are approximately two main states for the atmosphere: the westerly or zonal flows modulated by the advection of Atlantic lows, and the long-lived blocking anticyclonic configurations over North Sea or Scandinavia. Plaut and Simonnet (2001) applied the dynamical cluster algorithm (Michelangeli et al., 1995) to large-scale patterns of the sea level pressure (SLP) and geopotential heights at 700 and 500 hPa tropospheric levels, in order to identify the main weather regimes over Western Europe. They distinguished five different regimes at sea-level which correspond to the main pressure configurations of the aforementioned westerly or easterly flows aloft. Following Vautard (1990), Plaut and Simonnet named Atlantic ridge (AR) the SLP regime when the high-pressure over mid-latitude North Atlantic induces north-westerlies over Western and Central Europe; Greenland anti-cyclone (GA) when a dipole of high-pressure over Greenland and low-pressure over mid-latitude North Atlantic causes

south-westerly flows; zonal flow (ZO) when a low-pressure over Iceland and a high-pressure over the Azores enhance the purely zonal flow over the North Atlantic; blocking (BL) when the high-pressure over Central Europe forces south-easterlies; and west blocking (WBL) when an anti-cyclonic cell centred over Scotland favours north-easterly flows.

AR, GA, and WBL configurations give rise, over Corsica, to the well-known wind patterns called maestro, libeccio, sirocco, and gregale. In particular, in Figure 1 the synoptic near-surface baric conditions over Western Europe, i.e. the 925 hPa geopotential height, and the corresponding surface wind flow fields at 10 m above ground level (a.g.l.) over Corsica are shown for the

Figure 1. Correspondences between weather regimes over Western Europe and local surface wind flow fields over Corsica. Panels show 925 hPa geopotential heights (left) and wind fields at 10 m a.g.l. (right).

aforementioned wind patterns. The 925 hPa geopotential heights are from the NCEP/DOE AMIP-II reanalysis (Kistler et al., 2001), while surface wind fields are simulations from the Limited Area Model BOLAM (Buzzi et al., 1994). The uppermost panels show the correspondence between AR-like configuration and maestro wind over Corsica (the event occurred on 21st January 2005); GA-like configuration is then compared to libeccio (21st October 2004); finally, WBL-like configuration gives rise to both sirocco (21st February 2004) and gregale (23rd November 2005) depending on the position of the low-pressure over Mediterranean.

In the centre of Western Europe, Corsica consists of a territory about 175 km wide in latitude and 80 km in longitude (Figure 2). It is characterized by a very complex topography, and the main mountain chain crosses the whole island from north to the southern edge with more than 2000 m high peaks. Such high topography is expected to strongly influence the sea-level wind patterns of the island, determining at least different climatological regions between the western and eastern side, and inducing recurrent wind patterns forced by the meridional mountain chain of the island. Keeping in mind the morphology of the territory and the aforementioned schematization of synoptic surface wind flows, we expect that a successful clustering is able on one hand to distinguish at least two main wind climate regions on the west and east coasts as a consequence of the topographical blocking of ZO (Section 4), and on the other hand to reproduce the main large-scale patterns (Section 5) of the Plaut and Simonnet's classification.

Figure 2. Topography of Corsica. The position of the considered eleven anemometric stations along the shoreline of the island is also shown. Stations are numbered counterclockwise from 1 (Solenzara) to 11 (Figari). [Map axes: longitude 8.6 to 9.6 deg, latitude 41.4 to 43.0 deg; colour scale: height 0 to 2500 m. Station labels: 1 - Solenzara, 2 - Alistro, 3 - Bastia, 4 - Cap Corse, 5 - Oletta, 6 - Ile Rousse, 7 - Calvi, 8 - Ajaccio, 9 - Pila-Canale, 10 - Sartene, 11 - Figari.]

3. Cluster analysis of anemological data

We analysed time-series of the wind intensity of a 3-year period from 11 anemometric stations located along the shoreline of Corsica. The spatial distribution of the stations is shown in Figure 2, which indicates a reasonably even distribution along the whole perimeter of the island.

All stations belong to the Météo-France network and record the horizontal wind intensity and direction at 10 m (a.g.l.). Wind measurements are averages over the last 10 min of the hour, and datasets have a sampling rate every hour (24 times per day from 00 UTC to 23 UTC) or every 3 h (8 times per day from 00 UTC to 21 UTC) from 1st October 1996 to 30th September 1999. Cluster analysis applies to fixed-dimension vectors in order to have well-defined distances between elements in the measure space, so that it would be desirable to have synchronous databases. Therefore, while it is not important to have the total number of available measurements per station, i.e. the entire length of the original databases, we have retained only the measurements simultaneously collected in all stations as working datasets. Consequently, each dataset would count a maximum of $n_{max} = 8760$ contemporary observations (3 years × 365 days × 8 measurements per day), but the number of available measurements per station is somewhat lower than $n_{max}$ because of recording interruptions for maintenance or damage. Actually, in spite of the aforementioned random interruptions, the $m = 11$ time-series of contemporary measurements consist of $n = 7277$ values (∼83% of $n_{max}$).

It is worth noting that every measurement consists of two values in the form of wind intensity and direction $(v, \theta)$ or, equally, horizontal wind components $(v_1, v_2)$. However, for the reasons already explained, we decided to work only with wind intensities. Besides, we had expected that the identification of both climate regions and climate regimes could be obtained through the use of wind intensities alone. Indeed, as already mentioned in Section 1, wind climate regions are defined as zones of similar average conditions so that it sounds reasonable that wind intensities hold enough information to identify different climatological zones. Moreover, intensities should also be able to distinguish wind climate regimes, as topography and local forcing are expected to shelter and force wind flows as a function of the wind direction so that surface wind patterns can be recognized like different patchworks of high and low wind intensities in correspondence of the stations. Nevertheless, after having based our analyses just on wind intensity data, we will recover the wind directions for the final interpretation of wind climate regimes.

3.1. Wind intensity standardization

Let us represent all information, i.e. wind intensities, in the form of a $n \times m$ matrix $V \equiv \{v_{il}\}$ (with $1 \le l \le m$ and $1 \le i \le n$ where, as stated above, $m = 11$ is the number of stations and $n = 7277$ is the number of retained data per station). Following this notation, two

different time-series of wind intensity corresponding to stations $r$ and $s$ can be expressed as the column vectors $v_r \equiv \{v_{ir} : r \text{ fixed}\}$ and $v_s \equiv \{v_{is} : s \text{ fixed}\} \in \mathbb{R}^n$. Since cluster analysis evaluates the similarity between time-series through a metric measure of the distance between simultaneous observations, stations with comparable mean and variance risk being similar simply because differences in wind intensities are small in absolute terms. In this case, two vectors $v_r$ and $v_s$ are likely to be clustered together even though the corresponding time-series are poorly correlated, unless wind intensities are standardized. Actually, all the stations that we are considering lie in coastal areas, but they are heterogeneous as for the distance from the coast, the position in crest or valley, and the height above ground level, so that it is likely that two stations might have similar mean wind intensities simply because the characteristics of their location are similar. Some kind of standardization is then required in order to de-localize the time-series and analyse comparable datasets. As Mimmack et al. (2001) pointed out, the usual standardization based on subtraction of sample mean and division by standard deviation

$$ v'_{ir} = \frac{v_{ir} - \bar{v}_r}{\sigma_r} \tag{1} $$

where $\bar{v}_r$ represents the sample mean, i.e. time-average, of the wind intensities collected in station $r$, and $\sigma_r$ is the corresponding standard deviation, might be inaccurate for zero-bound data like wind intensities (Wilks, 1995). Moreover, datasets having unitary standard deviations could be clustered together just because the variability of the corresponding time-series has been reduced. This is true at least when distance measures are Euclidean.

A more appropriate delocalization is obtained by defining the new variables (Kaufmann and Whitemann, 1999)

$$ \tilde{v}_{ir} = \frac{v_{ir}}{\bar{v}_r} \tag{2} $$

Applying Equation (2), all normalized time-series have mean $\bar{\tilde{v}}_r = 1$, while different standard deviations $\sigma_r$ are retained. In particular, we will use the set of vectors $\tilde{V} = \{\tilde{v}_1 \cdots \tilde{v}_r \cdots \tilde{v}_m : \tilde{v}_r \in \mathbb{R}^n\}$ to define the wind climate of Corsica (Section 4). In order to identify the wind regimes, a further normalization will be required (Section 5).

3.2. Distance measure and clustering

Cluster analysis is a widely used numerical technique which seeks to allocate objects into groups following some kind of criteria defined a priori. In our case, we have a set of elements, e.g. $\tilde{V}$, and we wish to partition its elements, $\tilde{v}$, into $k$ groups, or clusters, such that a couple of elements in the same cluster is more similar than a couple of elements belonging to distinct clusters. This clustering operation is obtained in two subsequent steps:

• the first step consists in defining a distance measure $d$ in order to establish similarities between pairs of elements. In particular, $d$ has to be a metric measure and satisfy the properties of symmetry ($d_{ij} = d_{ji}\ \forall\ 1 \le i, j \le n$), non-negativity ($d_{ij} \ge 0$), identification ($d_{ii} = 0$), definiteness ($d_{ij} = 0$ only if $i = j$), and triangle inequality ($d_{ij} \le d_{il} + d_{lj}$).
• in the second step, an information loss function is defined which represents the clustering procedure by a map $\varphi : \{1, \ldots, m\} \to \{1, \ldots, k\}$.

There exist many different clustering algorithms depending on the choice of $d$ and $\varphi$, but every algorithm can be generally classified into hierarchical or partitional methods. In agglomerative hierarchical methods, all elements are initially considered independent clusters which are being aggregated iteratively. On the contrary, partitional methods assign elements to groups, once the number of final clusters has been specified, through successive re-arranging of the elements from an initial partition of items into groups or successive associations of items to an initial set of seed points which will form the nuclei of the clusters. Generally, non-hierarchical methods apply faster to large datasets because large matrices of distances do not have to be determined and stored during computation.

4. Identification of anemological regions

Let us analyse the set of elements $\tilde{V}$ (Section 3.1) to define the wind climate regions of Corsica. Actually, there is no general agreement on the most appropriate distance measure $d$ to use, as well as on the most suitable mapping function $\varphi$. For instance, Mimmack et al. (2001) compared Euclidean and Mahalanobis distances to define rainfall regions, and they concluded that Mahalanobis distances are less appropriate than Euclidean distances when the latter are calculated on major principal components. Nevertheless, the biggest problem of Euclidean distance measures is that they do not take the correlation between variables explicitly into account, while it is expected for stations lying in the same anemological region to be highly correlated.

4.1. Methodology

As far as the singling out of different wind regions is concerned, we compared three different distance measures, $d : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, in order to identify the most appropriate correlation measure between pairs of vectors. The first distance is the Euclidean distance measure, which does not include any information about correlation between stations $r$ and $s$

$$ d_2(\tilde{v}_r, \tilde{v}_s) = \left[ \sum_{i=1}^{n} (\tilde{v}_{ir} - \tilde{v}_{is})^2 \right]^{1/2} \tag{3} $$

Obviously, $d_2$ increases as the difference between the two standard deviations, $\sigma_r$ and $\sigma_s$, of vectors $\tilde{v}_r$ and

$\tilde{v}_s$ increases, out of the uncertainty determined by the high-dimensionality of the vectors.

Then, we have considered the squared Pearson distance, based on the linear correlation between time-series

$$ d_{2\rho}(\tilde{v}_r, \tilde{v}_s) = 1 - \rho_{rs}^2 \tag{4} $$

where $\rho_{rs}$ is the linear correlation coefficient. It is worth noting that $\rho_{rs}$ does not change if non-normalized variables are used instead of variables normalized according to Equation (1) or (2). Furthermore, when $d_{2\rho} \to 0$, simultaneous wind measurements behave according to the linear model $v'_{ir} = \rho_{rs} v'_{is}$.

Finally, we used the distance defined by means of the Cramér (1946) index, $\psi$, as

$$ d_{\psi}(\tilde{v}_r, \tilde{v}_s) = 1 - \psi_{rs}^2 \tag{5} $$

The index $\psi^2$ is defined as $\chi^2/\chi^2_{max}$, where $\chi^2$ is the chi-squared value of the contingency table of $\tilde{v}_r$ and $\tilde{v}_s$, and $\chi^2_{max} = n(p - 1)$ is the maximum chi-squared value, being $n$ the number of data and $p$ the minimum between the number of rows and columns of the contingency table. Note that $\chi^2 = 0$ when the variables are statistically independent, while $\chi^2 = \chi^2_{max}$ and $d_{\psi} \to 0$ in case of perfect correlation.

As far as the map $\varphi$ is concerned, we tested the following agglomerative hierarchical algorithms corresponding to five different mapping methods: single linkage, complete linkage, group average, Ward's minimum variance (Ward, 1963), and McQuitty's (Mc Quitty, 1966). All these methods build up the aggregations according to the following scheme:

• at the beginning, the algorithm finds two clusters $A$ and $B$ such that $d(A, B)$ is minimal;
• clusters $A$ and $B$ are merged in a single cluster $C = A \cup B$ and, for every remaining cluster $D$, $d(C, D)$ is computed;
• the algorithm ends in $n - 1$ steps when all elements are grouped inside a single cluster.

All the aforementioned methods calculate the distance measure between groups satisfying the recurrence formula by Lance and Williams (1967) that specifies how similarities are computed when agglomerating clusters

$$ d(C, D) = \alpha_A\, d(A, D) + \alpha_B\, d(B, D) + \beta\, d(A, B) + \gamma\, |d(A, D) - d(B, D)| \tag{6} $$

Table I shows the values of the parameters in Equation (6) for each clustering algorithm. $N_A$, $N_B$ and $N_D$ are the number of elements in cluster $A$, $B$ and $D$, respectively.

Table I. Parameters of Equation (6) corresponding to different agglomerative hierarchical classification algorithms.

Algorithm          α_A                      α_B                      β                      γ
Single linkage     1/2                      1/2                      0                      −1/2
Complete linkage   1/2                      1/2                      0                      1/2
Mc Quitty          1/2                      1/2                      0                      0
Group average      N_A / N_(A∪B)            N_B / N_(A∪B)            0                      0
Ward               N_(A∪D) / N_(A∪B∪D)      N_(B∪D) / N_(A∪B∪D)      −N_D / N_(A∪B∪D)       0

4.2. The wind climate regions of Corsica

As stated in the previous section, we tested the coupling of three distance measures (Euclidean, squared Pearson, Cramér index) and five information-loss functions (single linkage, complete linkage, group average, Ward's, and Mc Quitty's) for a total number of 15 different hierarchical clustering algorithms. The most effective representation of hierarchical aggregation is the tree-like structure, i.e. the dendrogram, which illustrates successive merges among clusters. Figure 3 shows an example of dendrogram obtained for the squared Pearson distance coupled with Ward's method. Following the order of aggregation shown in the tree, at the top level the algorithm distinguishes two main regions: the cluster of the eastern stations (Solenzara, Alistro, Bastia) and the cluster of all the other stations (Figure 2). Moving to a further subdivision, the biggest cluster is separated into a cluster of the north-western stations (Cap Corse, Ile Rousse, Calvi) and a cluster of the south-western stations (Ajaccio, Pila–Canale, Sartene, Figari). At this stage, Oletta, located on the north-eastern side of the island, represents an unexpected exception because it is aggregated to the south-western cluster which lies at the opposite side of the island.

The visual analysis of all dendrograms shows similarities between the clustering algorithms in particular for the subdivision between the eastern and western sides of the island. The element of vagueness and subjectivity of the analysis consists in the choice of the number of

Figure 3. Cluster dendrogram of successive aggregations between stations for the squared Pearson distance and the Ward's method. [Dendrogram; vertical axis: Distance, 0.4 to 1.4; leaves: the 11 stations.]
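The recurrence in Equation (6) and the parameters in Table I can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the four station labels and the six-value series are hypothetical, chosen only so that the example runs with the standard library alone. Under those assumptions, the sketch normalizes each series by its mean (Equation (2)), computes the squared Pearson distance (Equation (4)), and merges clusters with the Lance and Williams update, using Ward's parameters by default as in Figure 3.

```python
import math

def normalize_by_mean(series):
    """Equation (2): divide each value by the time-average of its series."""
    mean = sum(series) / len(series)
    return [v / mean for v in series]

def squared_pearson_distance(x, y):
    """Equation (4): 1 - rho^2, with rho the linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    rho = sxy / math.sqrt(sxx * syy)
    return 1.0 - rho * rho

def lance_williams(method, d_ad, d_bd, d_ab, n_a, n_b, n_d):
    """Equation (6) with the Table I parameters; returns d(A u B, D)."""
    if method == "single":
        a_a, a_b, beta, gamma = 0.5, 0.5, 0.0, -0.5
    elif method == "complete":
        a_a, a_b, beta, gamma = 0.5, 0.5, 0.0, 0.5
    elif method == "mcquitty":
        a_a, a_b, beta, gamma = 0.5, 0.5, 0.0, 0.0
    elif method == "average":
        a_a, a_b, beta, gamma = n_a / (n_a + n_b), n_b / (n_a + n_b), 0.0, 0.0
    elif method == "ward":
        total = n_a + n_b + n_d
        a_a, a_b = (n_a + n_d) / total, (n_b + n_d) / total
        beta, gamma = -n_d / total, 0.0
    else:
        raise ValueError(f"unknown method: {method}")
    return a_a * d_ad + a_b * d_bd + beta * d_ab + gamma * abs(d_ad - d_bd)

def agglomerate(series_by_station, method="ward"):
    """Merge the closest clusters until one remains; return the merge history."""
    norm = {name: normalize_by_mean(s) for name, s in series_by_station.items()}
    clusters = [frozenset([name]) for name in series_by_station]
    dist = {}
    for i, a in enumerate(clusters):
        for b in clusters[i + 1:]:
            (name_a,), (name_b,) = a, b
            dist[frozenset((a, b))] = squared_pearson_distance(norm[name_a], norm[name_b])
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters A, B with minimal d(A, B).
        pairs = ((p, q) for i, p in enumerate(clusters) for q in clusters[i + 1:])
        a, b = min(pairs, key=lambda pq: dist[frozenset(pq)])
        d_ab = dist[frozenset((a, b))]
        merged = a | b
        clusters = [c for c in clusters if c not in (a, b)]
        # Compute d(C, D) for C = A u B and every remaining cluster D.
        for d in clusters:
            dist[frozenset((merged, d))] = lance_williams(
                method, dist[frozenset((a, d))], dist[frozenset((b, d))],
                d_ab, len(a), len(b), len(d))
        clusters.append(merged)
        merges.append((sorted(a), sorted(b), d_ab))
    return merges

# Synthetic wind intensities (hypothetical values, not data from the paper).
stations = {
    "east_1": [2.0, 4.0, 6.0, 3.0, 5.0, 7.0],
    "east_2": [1.1, 2.0, 3.2, 1.4, 2.6, 3.4],
    "west_1": [6.0, 2.0, 5.0, 1.0, 7.0, 3.0],
    "west_2": [3.1, 0.9, 2.4, 0.6, 3.6, 1.4],
}
merges = agglomerate(stations)
for left, right, distance in merges:
    print(left, "+", right, "at distance", round(distance, 3))
```

With these synthetic series the two strongly correlated pairs merge first, and the last merge joins the two pairs at a much larger distance, which is the kind of structure a dendrogram such as Figure 3 displays. Passing `method="single"` or another Table I name to `agglomerate` swaps only the update parameters, not the merge loop.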

clusters to be retained. We made some hypotheses for decreasing levels of aggregation and we concluded that the most realistic level is the level which retains five clusters, because some algorithms still aggregate Oletta to the south-western stations when just four clusters are retained (e.g. Figure 3).

The aggregations obtained for the 15 different algorithms after retaining five clusters per algorithm are reported in Table II. The first column reports the sets of stations, numbered 1–11 counterclockwise along the perimeter of the island from Solenzara to Figari, which are clustered together, in decreasing order of their frequency of occurrence. Columns 2–4 show how many times the clusters appear, i.e. the corresponding frequency of occurrence, as a function of the distance measure. Note that in each of these columns the number of clusters amounts to 25, coming from the five different information-loss functions, i.e. five dendrograms per distance, each cut into five clusters. The last column, which is the sum of columns 2–4, reports how many times the corresponding cluster occurs in absolute terms.

Table II. Frequency of occurrence of the clusters obtained from the 15 different clustering methods.

Cluster                     d_2    d_2ρ   d_ψ    Total
{5}                         5      5      5      15
{1, 2, 3}                   4      5      5      14
{4, 6, 7}                   –      5      1      6
{9, 10, 11}                 1      5      –      6
{8}                         –      5      –      5
{4}                         1      –      4      5
{8, 9, 10, 11}              –      –      4      4
{4, 6}                      4      –      –      4
{6, 7}                      –      –      4      4
{9}                         4      –      –      4
{7, 8, 10, 11}              3      –      –      3
{1, 2, 3, 7, 8, 10, 11}     1      –      –      1
{8, 9, 10}                  –      –      1      1
{7, 8}                      1      –      –      1
{6}                         1      –      –      1
{11}                        –      –      1      1

From Table II some conclusions can be drawn.

• All the clustering algorithms agree upon isolating Oletta, {5}, from any other station. The inability to aggregate this station to the neighbouring ones is due to the low spatial representativeness of the corresponding dataset. The reasons of this behaviour will be discussed at the end of Section 4.3.
• Almost all the clustering methods, i.e. 14 counts over 15, seem to recognize the set of the stations Solenzara, Alistro, Bastia {1, 2, 3} as a stable cluster, so that it seems realistic to assume this set as representative of an Eastern anemological Region (ER).
• The cluster of the stations Cap Corse, Ile Rousse, Calvi {4, 6, 7} occurs 6 times, and the associations {4, 6} and {6, 7} also present high frequency of occurrence. The classification of these three stations into a North-Western Region (NWR) seems reasonable as supported by recurrent aggregations.
• The cluster of the south-western stations Pila–Canale, Sartene, and Figari {9, 10, 11} also occurs 6 times, and the cluster {8, 9, 10, 11} adds another four counts to the hypothesis of the existence of a South-Western Region (SWR).
• All the five clustering algorithms involving $d_{2\rho}$ agree in classifying Ajaccio, {8}, as a stand-alone station, while it is grouped 4 times with south-western stations in the set {8, 9, 10, 11} and 3 times with Calvi in the set {7, 8, 10, 11}. We suppose that Ajaccio is a transitional station between NWR and SWR.

It is worth noting that not all the distance measures behave stably with the five information-loss functions. The squared Pearson distance measure shows the most regular behaviour, and its classification is fully consistent with all the aggregation methods: the composition of the five clusters does not change with $\varphi$. On one hand, the Euclidean distance is probably too weak a condition to be independent of the aggregation method, because it is a purely 'geometrical' distance and does not take into account any degree of correlation between vectors. On the other hand, the Cramér index distance is probably too sensitive to non-linear correlations of large-scale patterns. The squared Pearson distance, involving only linear correlations, is probably the most representative distance because it is tuned, through linear correlations, towards local scale phenomena. As a consequence, the association of different $\varphi$ to distances $d_2$ and $d_{\psi}$ produces more variable solutions than $d_{2\rho}$ does (Table II).

4.3. Climatological interpretation of the wind climate regions of Corsica

Corsica has a typical Mediterranean climate. During summer, the climate is dominated by the presence of subtropical high-pressure cells, with dry subsiding air capping the colder surface marine layer and creating stable atmospheric conditions at least all along the coasts. Therefore, from late spring to early autumn the sea and land breezes are the prevailing winds. During winter, the subtropical high shifts to the south and its influence is replaced by the sequence of extratropical cyclones travelling within the storm track associated with the polar jet stream. In this period, the intermittency between westerly and easterly flows over Corsica is associated to the cyclones movement from west to east, as well as to occasional secondary Alpine cyclogenetic events in the Gulf of Lion or in the Ligurian Sea, or blocking situations.

On the regional scale, the summer climate depends mainly on the incoming solar radiation, which does not vary very much within the meridional extension of Corsica, i.e. 1.5° of latitude. As a consequence, the summer wind climate does not change considerably all along the coast as it consists mainly of local winds forced by the thermal gradient between land and sea. On

the contrary, the winter climate is strongly influenced by the topography, which shelters the western and part of the northern coast from easterly flows, and the eastern coast from westerlies. For instance, SWR is sheltered from Gregale and partially from Sirocco, NWR is partially sheltered from Sirocco and Libeccio, and ER is sheltered from Maestro and Libeccio (Figure 1). In this respect, the topography acts like a directional filter of winter flow fields and contributes strongly to defining different anemological regions over Corsica.

Having stated the major role of winter flows in defining different regional wind climates over Corsica, it is worth noting that the anemometric stations sheltered with respect to the synoptic forcing cannot be considered representative of the regional wind climate because they turn out to be influenced only by local forcing. It is well known that the spatial representativeness of an anemometric station, intended as the degree to which the recordings of a meteorological instrument are able to resolve the large-scale wind variability, strongly depends on the territory where it is sited (World Meteorological Organisation, 1996). Some station records may disproportionately resolve characteristics of their immediate surroundings when the sensor is sheltered by topographic obstacles or roughness elements with respect to the larger scale flow. This is the case of Oletta, whose measurements have less utility in representing climate or weather of the surrounding area with respect to other stations since it is placed inside a valley sheltered from the main synoptic winds. As a result, its dataset is almost uncorrelated with large-scale flows, and local winds are by far the prevailing wind regimes. This is also the reason why all the clustering algorithms agree upon isolating Oletta from any other station but Pila–Canale or Sartene, which have similar locations inside valleys. However, the aggregation of Oletta with Pila–Canale and/or Sartene occurs only when retaining four clusters per algorithm (for example Figure 3). In particular, Oletta has fairly high values of the correlation coefficient especially with Sartene during summer because sea and mountain breezes, which appear almost synchronously all over Corsica, are well recognizable in the datasets of both these stations. On the contrary, Oletta is not correlated with closer stations, like Cap Corse, Ile Rousse or Bastia, since they are much more influenced by the synoptic wind regimes. On the other hand, the aggregation of Oletta with stations located on the other side of the mountain chain would be misleading as for the subdivision of Corsica in macro-regions of similar wind climate: this is one of the reasons which have convinced us to stop at the aggregation level presented in Table II.

5. Classification of anemological regimes

As with the identification of anemological regions, there exist no unique criteria for regime identification and classification. Weber and Kaufmann in 1995 proposed an automated classification scheme for wind fields based on a hierarchical cluster analysis. They used n = 744 contemporaneous wind observations from m = 49 different points over a 55 × 55 km² spatial domain as vector elements, $x_i \in \mathbb{R}^m$ with $1 \le i \le n$, defining the distance matrix $d(x_i, x_j)$ between pairs of elements, i.e. wind fields, through a non-Euclidean distance. The authors tested different information-loss functions φ in the framework of hierarchical methods and concluded that the complete linkage method was the most suitable because it was able to build clusters of comparable dimensions. In 1996, Kaufmann and Weber extended their automated classification to longer datasets, n = 8784, introducing a further refinement of the clustering procedure through a two-stage classification: the hierarchical cluster analysis, according to the complete linkage method for detection and exclusion of outliers, provides a first-guess classification into groups; the centroids of the first-guess clusters become initial seeds for the k-means non-hierarchical method. Indeed, in purely hierarchical clustering, if an erroneous classification groups two clusters in an early stage of the process, there cannot be reclassification at any later stage. Partitional methods, which do not involve the tree-like construction process, allow elements to be reassigned among clusters until some metric relative to the centroids of the clusters is minimized, like the sum of the variance over all clusters or the total distance between elements and their centroids. The metric to minimize and the choice of the distance measure determine the shape of the final optimum clusters. This automated wind field classification was applied again by Kastendeuch and Kaufmann in 1997, who improved the procedure to estimate the matrix of transition probabilities between wind fields, and by Kaufmann and Whitemann in 1999 for classification of wintertime wind regimes in the Grand Canyon Region.

5.1. Methodology

In the present study, we retained the general guidelines corresponding to the two-stage automated classification procedure by Kaufmann and Weber (1996), with some differences for the distance measure and the mapping methods of the cluster analysis.

Let us define two different wind patterns of the wind intensity at times i and j as the row vectors $\tilde{v}_i = \{\tilde{v}_{il} : i \text{ fixed}\}$, $\tilde{v}_j = \{\tilde{v}_{jl} : j \text{ fixed}\} \in \mathbb{R}^m$. Following Kastendeuch and Kaufmann (1997), in order to distinguish equivalent wind patterns which differ by a scaling factor, we adopted the further normalization

$$\hat{v}_{il} = \frac{\tilde{v}_{il}}{\langle \tilde{v} \rangle_i} \qquad (7)$$

where $\langle \tilde{v} \rangle_i$ stands for the space-average of simultaneous wind intensities at time i. The set of vectors $\hat{V} = \{\hat{v}_1 \cdots \hat{v}_i \cdots \hat{v}_n : \hat{v}_i \in \mathbb{R}^m\}$ will be used to identify the wind climate regimes of Corsica.

As for the distance between elements, we chose the Euclidean distance measure $d : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$ analogously to the distance between wind fields used at National Meteorological Center to characterize forecast

errors (Toth, 1991). In particular, we considered the following distance between wind patterns i and j

$$d_2(\hat{v}_i, \hat{v}_j) = \left[ \sum_{l=1}^{m} (\hat{v}_{il} - \hat{v}_{jl})^2 \right]^{1/2} \qquad (8)$$

and we calculated from Equation (8) the initial n × n symmetric positive-definite distance matrix d, so that groups are depicted to lie in Euclidean spaces, and wind fields are replaced by orthogonal coordinates.

Finally, Weber and Kaufmann (1995) pointed out that it is important for the clustering of wind regimes to tend to build similar-size clusters, i.e. sizes of the same order of magnitude. Indeed, we have focused on the classification of the most frequent wind patterns, e.g. the wind regimes of Section 2, which are expected to show comparable frequency of occurrence. As a consequence, methods based on a least-squares criterion, such as the hierarchical Ward's minimum variance method and the partitional k-means method, are particularly appropriate to identify wind regimes as they aim at finding compact, spherical clusters. Ward's method, in particular, has this property and it is especially good at recovering the cluster structure when clusters are not elongated. Therefore, we used this method in the first stage of Weber and Kaufmann's classification procedure, followed by the iterative k-means algorithm of Hartigan and Wong (1979) to refine the hierarchical cluster analysis and complete the second stage. In particular, we used a global optimization method to reassign some elements to different clusters: at each step new centroids and the metric to minimize are determined, then the procedure continues until no more rearrangements take place and the optimum assignment of clusters is found. It is worth noting that Hartigan and Wong's algorithm chooses clusters in order to minimize the total variance, in accordance with the criterion of Ward's hierarchical method used in the first stage.

5.2. The wind climate regimes of Corsica

In the following we present the results of the classification method (see previous section) applied to the coastal surface wind patterns of Corsica. In particular, the two-stage clustering technique is outlined in the following three points:

• the distance measure is defined through the Euclidean distance of Equation (8);
• Ward's minimum variance aggregation method is chosen as hierarchical clustering in the first stage;
• k-means partitional method based on the minimisation of the total variance is applied in the second stage.

After the first stage has been completed, and all the n = 7277 wind patterns are aggregated into a single cluster, the criterion to define the final number of clusters to be retained as representative of the main climatological wind regimes of the island remains an open question. As suggested by Kaufmann and Weber (1996), the value of the metric associated with the clustering algorithm, which is an indicator of the loss of information when two clusters merge, can be assumed as an objective measure for this purpose. The values of the total variance, i.e. the sum of cluster variances, when clusters merge along the tree-like structure of the dendrogram are plotted in Figure 4 as a function of the number of clusters. The diagram shows a strong reduction of total variance at steps corresponding to N = 4, N = 6 and N = 8 clusters, while it decreases slowly after the latter value. We chose N = 8 because it represents the last really abrupt step, as after this point the decreasing trend becomes more gradual, while for N = 4 or N = 6 too much information is lost. With this choice, the cluster size ranges from 385 to 1570 elements.

Before the second stage of clustering is performed, Kaufmann and Weber suggest excluding outliers, i.e. wind patterns which may be attributed to transition between diurnal and nocturnal cycle, thunderstorms or front passages or weak wind situations. For each cluster, we analysed the frequency distribution, f(x), of the values of variance associated with the wind regimes with respect to the corresponding centroid, $x = \lVert \hat{v} - \hat{v}_c \rVert^2 \ge 0$. The empirical functions have highly populated central bodies and low-frequency tails which represent the outliers: thresholds between central body and outliers lie in the range 6.5–12.0. We did not filter out outliers because they correspond to ∼1% of the total number of elements. This is likely due to the prevalence of breezes instead of weak wind conditions in the coastal areas of Corsica, so that outliers are not expected to influence considerably the following clustering process.

In the second stage, when the hierarchical aggregation has defined a first guess of the clusters, the corresponding centroids are assumed as the seeds for the k-means clusterization. After reclassification, 38% of the elements turn out to have been reassigned, and the smallest cluster

[Figure 4 plot: sum of cluster variances, i.e. total variance (Ward's method), on a scale from 200 to 1200, versus number of clusters from 5 to 25; steps marked at N=4, N=6 and N=8.]

Figure 4. Values of total variance corresponding to successive cluster merging for the last 25 aggregations.
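As a rough illustration of the first stage, the normalization of Equation (7), the distance of Equation (8) and the merging curve summarised in Figure 4 can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the procedure actually run by the authors (who worked in R): the function names are hypothetical, the within-cluster sum of squares is assumed as a stand-in for the "total variance", and the naive pairwise search is far too slow for n = 7277 patterns.

```python
from itertools import combinations

def normalize(pattern):
    """Eq. (7): divide each station intensity by the space-average at that time."""
    mean = sum(pattern) / len(pattern)
    return [v / mean for v in pattern]

def sq_dist(a, b):
    """Square of the Euclidean distance of Eq. (8)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_total_variance(patterns):
    """Agglomerate with Ward's criterion (assumed form: error sum of squares).

    Returns {number_of_clusters: total within-cluster sum of squares}.
    """
    # Each cluster is stored as (size, centroid); singletons contribute zero.
    clusters = [(1, list(p)) for p in patterns]
    total = 0.0
    history = {len(clusters): total}
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            (na, ca), (nb, cb) = clusters[i], clusters[j]
            # Increase of the within-cluster sum of squares if i and j merge.
            delta = na * nb / (na + nb) * sq_dist(ca, cb)
            if best is None or delta < best[0]:
                best = (delta, i, j)
        delta, i, j = best
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        merged = (na + nb,
                  [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        total += delta
        history[len(clusters)] = total
    return history

# Toy data (invented): four "wind patterns" over two stations.
raw = [[2, 4], [4, 8], [6, 2], [3, 1]]
curve = ward_total_variance([normalize(p) for p in raw])
for n_clusters in sorted(curve):
    print(n_clusters, round(curve[n_clusters], 4))
```

In the toy data the first two patterns differ only by a scaling factor, so after normalization they coincide and merge at zero cost; the only abrupt step in the curve is the last merge, which is the kind of jump used in the text to choose N.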


[Figure 5 panels: (a) histograms of clusters 1 to 8, count per month of the year; (b) histograms of clusters 1 to 8, count per hour of the day.]

Figure 5. Seasonal (left) and daily (right) distributions of the eight wind regimes of Corsica.

counts 579 elements, while the maximum-size cluster groups 1519 wind fields. Figure 5 shows the seasonal and daily cycles of the resulting clusters. Some clusters show a marked seasonal cyclicity, like clusters 1 and 6 in summer, 8 in winter, 2 in spring and autumn, whereas other clusters, like 3, 4, 5, and 7, do not present a well-defined seasonal nature. However, low populated clusters associated with less common events, like numbers 3, 4 and 5, could be misinterpreted, as only three years of measurements were available. Many clusters present daily cyclicity as well. Clusters 3, 6, and 7 are distinctly diurnal with maximum frequency in the morning (at ∼10 UTC), around noon (∼12 UTC) and early in the afternoon (∼15 UTC), respectively. On the contrary, clusters 1, 4, 5, and 8 seem to be associated with nocturnal phenomena. Cluster 2 does not show any preferential occurrence.

5.3. Patterns of the wind climate regimes

To have a deeper understanding of the eight clusters, and verify the reliability of these results, we recovered for each cluster the corresponding components (v1, v2) of the horizontal wind (Section 3) and we calculated the average wind vectors, i.e. the vector of the mean wind components, at measurement sites. Figure 6 shows the average wind vectors corresponding to the wind climate regimes of Corsica, as identified after the two-stage clustering. The length of wind vectors is proportional to the square root of the modulus of the corresponding average wind vectors. From these pictures, clusters can be separated into two distinct classes: thermally forced wind regimes and synoptically driven wind regimes.

Clusters 1, 4, 6, and 8 are characterized by flows which are converging towards or diverging from the inland areas, corresponding to the regimes of sea breeze (cluster 6) and land breeze (clusters 1, 4, and 8), respectively. Cluster 6 is typically a summer diurnal regime, flows are directed from sea to land, and it has a high frequency of occurrence (21%). On the contrary, clusters 1, 4, and 8 are nocturnal and flows are mainly directed seaward. In particular, clusters 1 and 8 can be interpreted as two different versions of land breezes predominant during summer and winter, respectively (Figure 5(a)), while cluster 4 is a further pattern which does not present any preferential season. Altogether, land breezes have a frequency of occurrence of 27%.

Clusters 2, 3, 5, and 7, instead, represent the wind regimes driven by the synoptic meteorological situation. They can be divided into easterly (clusters 2 and 3) and westerly flows (clusters 5 and 7). Cluster 2, which corresponds to the well-known gregale wind (WBL configuration in Section 2), does not show any dependence on the time of the day but shows a preferential appearance in spring and autumn. Cluster 3 presents the typical pattern of the sirocco wind (WBL configuration), with strong flow channelling in the southern part of the island and recirculation on the lee side of the mountain chain. It is worth noting that both these easterly patterns have a minimum of frequency of occurrence during summer, when meteorological blocking over eastern Europe is rare. Finally, maestro and libeccio winds (AR and GA configurations) merge together in clusters 5 and


[Figure 6 panels: Cluster 1 (~8%), Cluster 2 (~15%), Cluster 3 (~10%), Cluster 4 (~11%), Cluster 5 (~11%), Cluster 6 (~21%), Cluster 7 (~16%), Cluster 8 (~8%). Axes: longitude 8.5 to 9.5 deg, latitude 41.5 to 43.0 deg.]

Figure 6. Average wind vectors at measurements sites for all the 8 clusters. Percent values represent the frequency of occurrence associated to each cluster. Clusters 1, 4, 6 and 8 identify the breeze regimes. Clusters 2 and 3 correspond to easterly flows, while clusters 5 and 7 represent westerly flows.
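The average wind vectors of Figure 6 are, per cluster, the means of the two horizontal components at each site. A minimal sketch for a single station is given below; the function names and the toy numbers are hypothetical, and the arrow scaling assumes that "square root of the average wind vector" refers to the square root of its modulus.

```python
import math
from collections import defaultdict

def average_wind_vectors(components, labels):
    """Mean (v1, v2) per cluster at one station.

    components: list of (v1, v2) pairs, one per time step.
    labels: cluster id assigned to each time step.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (v1, v2), k in zip(components, labels):
        acc = sums[k]
        acc[0] += v1
        acc[1] += v2
        acc[2] += 1
    return {k: (a[0] / a[2], a[1] / a[2]) for k, a in sums.items()}

def arrow_length(mean_vector):
    """Plotted length, assumed proportional to sqrt of the vector modulus."""
    return math.sqrt(math.hypot(mean_vector[0], mean_vector[1]))

def frequency_of_occurrence(labels):
    """Fraction of time steps falling in each cluster."""
    counts = defaultdict(int)
    for k in labels:
        counts[k] += 1
    return {k: c / len(labels) for k, c in counts.items()}

# Toy data (invented): four time steps, two clusters.
components = [(1.0, 0.0), (3.0, 0.0), (0.0, -2.0), (0.0, -4.0)]
labels = [6, 6, 1, 1]
means = average_wind_vectors(components, labels)
print(means)
print({k: arrow_length(v) for k, v in means.items()})
print(frequency_of_occurrence(labels))
```

Averaging the components rather than the speeds means that opposing winds within a cluster cancel, so a short arrow can indicate either weak winds or variable directions at that site.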

7, because they have very similar patterns over Corsica. Maestro winds, in particular, are typically north-westerly winds over southern France, but they blow counterclockwise around the Alps so that the wind turns out to be mainly westerly over Corsica, like libeccio. Actually, the main difference between clusters 5 and 7 is that the former is typically nocturnal, while the latter is prevalently diurnal, so that they show the superimposition of land and sea breezes, respectively.

Finally, we have analysed the frequency distribution of wind direction for all the eight wind regimes in order to evaluate how much the direction regimes, which show up so clearly in the aforementioned interpretations, can be considered physically consistent, and not just artificially produced by the clustering procedure. In particular, we have compared the frequency distribution of wind direction of the complete database with the corresponding distribution for each cluster. As an example, Figure 7 shows the result of this analysis for cluster 7. In each panel, which refers to an anemometric station, the x-axis corresponds to the wind rose subdivided into 12 sectors (0°–30°, 30°–60°, ..., 330°–360°), while the y-axis reports the frequency of wind direction per sector. The stations placed on the northern and western sides of the island (Cap Corse, Ile Rousse, Calvi, Ajaccio, Pila–Canale, Sartene, Figari) show, for cluster 7, a peak of frequencies of the wind coming from the third (180°–270°) and/or fourth (270°–360°) quadrants higher than in the corresponding complete datasets, and a consequent reduction from the first (0°–90°) and second (90°–180°) quadrants. The remaining stations (Solenzara, Alistro, Bastia, Oletta) have frequency distributions of the wind direction comparable between the different datasets. In general, each cluster is characterized by some distributions showing an evident peak which corresponds to the direction of the wind vector as shown in Figure 6. This behaviour suggests that only a few stations per cluster contribute to identifying the corresponding wind regime, and they define a subspace of $\hat{V}$ which constitutes the typical mark of the regime.

6. Conclusions

In the present paper, we focus on the setting up of as objective a methodology as possible to describe the climatology of meso-β scale (Orlanski, 1975) wind fields. In the scientific literature, a great number of papers are available for the definition of climate regions (Davis and Kalkstein, 1990; Fovell and Fovell, 1993; Bunkers et al., 1996; Mimmack et al., 2001) as well as for the classification of climate regimes (Kalkstein et al., 1987; Davis and Walker, 1992; Eder et al., 1994; Mengelkamp et al., 1997; Stephenson and Doblas–Reyes, 2000; Smith et al., 1999; Cassou et al., 2004; Straus and Molteni, 2004) through cluster analysis. Nevertheless, as far as the authors know, studies of wind climate specifically dedicated to wind regions are not available, while a great number of studies mainly deal with wind regimes (Weber and Kaufmann, 1995; Kaufmann and Weber, 1996; Kastendeuch and Kaufmann, 1997; Kaufmann and Weber, 1998; Kaufmann and Whitemann, 1999). Actually, as explained in Section 1, we do believe that the climate of a zone should be defined through the characterization of its average state together with a few typical states which represent the weather variability. Therefore, we suggest that the full characterization of wind climate should be studied through two distinct


[Figure 7 panels, one per station: Solenzara, Sartene, Pila Canale, Oletta, Ile Rousse, Figari, Cap Corse, Calvi, Bastia, Alistro, Ajaccio. x-axis: direction sector (000–030 to 330–360); y-axis: frequency (0.0 to 0.4); series: cluster 7 and complete database.]

Figure 7. Frequency distributions of wind direction of the entire dataset and cluster 7 for each anemometric station.

and complementary analyses based on the identification of both wind climate regions (average state) and wind climate regimes (weather variability). Finally, it is worth noting that at the beginning of the paper we made the assumption that the modulus of the wind velocity retains enough information to distinguish wind climate regions as well as wind climate regimes, so that wind directions can be ignored in the clustering procedure. All the analyses presented here have been made by means of time-series of the wind intensities, and final results seem to confirm this hypothesis.

Before performing the cluster analysis, we looked in the literature for studies related to the wind climate of Western Europe. The reason for this research is twofold: on one hand we had to identify the meteorological forcing at synoptic scale which plays a role in determining typical surface wind flows, in order to understand what the results of cluster analysis should look like; on the other hand we have used previously reported results to compare them qualitatively with our new outcomes and thus test their reasonableness. In particular, we referred to the classification proposed by Plaut and Simonnet in 2001 (Section 2), which distinguishes some basic surface pressure configurations corresponding, over Corsica, to the well-known wind patterns called maestro, libeccio, sirocco, and gregale.

In our attempt at regionalization of the wind climate of Corsica (Section 4), we performed 15 different cluster analyses of the available wind data, in order to test different techniques and identify the most suitable one. In particular, the squared Pearson distance measure, based on the linear correlation between variables, proved to be particularly stable and reliable as the corresponding classification did not depend on the specific clustering algorithm. As for the interpretation of the clustering, it is worth noting that in Western Europe there exist on average two main states for the atmosphere: the common middle-latitude westerly flows and easterly flows corresponding to blocking configurations over Central and Eastern Europe. Taking into account the topography of Corsica, which mainly consists of a meridional mountain chain, alternating zonal flows are expected to create different climatic zones at least on the west and east sides of the island. The final results of the clustering are in accordance with these hypotheses in that they distinguish three wind climate regions: an Eastern Region, a North-Western Region and a South-Western Region.

As for the classification of wind regimes (Section 5), we followed the two-stage clustering procedure suggested by Kaufmann and Weber (1996), even though some differences and refinements have been applied. In particular, we based the hierarchical aggregation method on Ward's minimum variance technique applied to a Euclidean distance, while the partitional method consisted of the k-means method with minimization of the total variance, in order to assume the same metric to define clusters during the first and second stages. As we hoped, the classification of wind climate regimes of Corsica recovered the large-scale classification of Plaut and Simonnet, in that we have been able to distinguish the synoptically driven wind patterns of maestro, libeccio, sirocco, and gregale in clusters 2, 3, 5 and 7 of Figure 6. In addition, another four clusters corresponding to thermally forced sea and land breezes were identified.
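The second stage summarised above, in which the centroids of the first-guess clusters seed a k-means refinement, can be sketched as follows. This is a minimal sketch under stated assumptions: it uses plain Lloyd-style iterations rather than the Hartigan and Wong (1979) algorithm adopted in the paper, and the function names and toy data are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of patterns."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans_refine(points, first_guess, max_iter=100):
    """Refine a first-guess labelling (e.g. from a hierarchical stage).

    Returns the final labels and the fraction of reassigned elements.
    """
    labels = list(first_guess)
    ids = sorted(set(labels))
    cents = {}
    for _ in range(max_iter):
        for k in ids:
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an emptied cluster keeps its previous centroid
                cents[k] = centroid(members)
        # Assign every pattern to its nearest centroid.
        new_labels = [min(ids, key=lambda k: sq_dist(p, cents[k]))
                      for p in points]
        if new_labels == labels:
            break  # no more rearrangements take place
        labels = new_labels
    moved = sum(a != b for a, b in zip(first_guess, labels))
    return labels, moved / len(points)

# Toy data (invented): one element is misplaced by the first guess.
points = [[0.0], [0.1], [0.2], [5.0], [5.1]]
first_guess = [0, 0, 1, 1, 1]
labels, fraction_moved = kmeans_refine(points, first_guess)
print(labels, fraction_moved)
```

The returned fraction plays the role of the share of reassigned elements quoted in Section 5.2, here computed on invented data. Because both stages minimise a sum of squared distances to centroids, the refinement can only lower, or leave unchanged, the quantity that the hierarchical stage was already reducing.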


Finally, it is worth noting that a few wind regimes seem to greatly contribute to determining different anemological regions. In general, we expect synoptically driven wind regimes to make a greater contribution to determining different anemological regions than thermally driven wind regimes, in spite of their comparable overall frequency of occurrence of 52 and 48%, respectively. Indeed, the synoptically driven wind regimes usually correspond to high-wind spatially correlated events characterized by the presence of large sheltered areas on the lee side of the mountains, as they depend on the interaction between large-scale forcing and the topography of the island. On the contrary, the contribution of breeze regimes is less important as low winds are usually poorly correlated. In cluster 2, for example, the average wind vectors corresponding to stations in the Eastern Region blow southward, the wind vectors in the NWR blow south-westward, while the remaining stations in the SWR are sheltered. Analogously, ER is identified in clusters 2, 3, 4, 5, 6, 7, and 8. NWR appears in clusters 2, 5, 6, and 7. SWR is distinguishable in clusters 2, 3, 5, 6, 7, and 8. Therefore, except in NWR of cluster 3, all the synoptically driven wind regimes together with the sea breeze regime (cluster 6) identify the three anemological regions, whereas the lower-intensity land breezes contribute only slightly to determining different wind climate regions.

Acknowledgements

The present research has been supported by the French "Agence de l'Environnement et de la Maîtrise de l'Energie" (ADEME) and "Collectivité Territoriale de Corse". The continuous assistance and encouragement by Mr. Philippe Istria is gratefully acknowledged. Special thanks to Mr. Federico Cassola and Professor Roberto Festa for their precious suggestions in our discussions on meteorological and technical aspects. All elaborations have been made by means of the R software for statistical computing and graphics (http://www.R-project.org). Professor A. Speranza of CINFAI is gratefully acknowledged.

References

Bunkers MJ, Miller JR, De Gaetano AT. 1996. Definition of climate regions in the Northern Plains using an objective cluster modification technique. Journal of Climate 9: 130–146.
Buzzi A, Fantini M, Malguzzi P, Nerozzi F. 1994. Validation of a limited area model in cases of Mediterranean cyclogenesis: surface fields and precipitation scores. Meteorology and Atmospheric Physics 53: 137–153.
Cassou C, Terray L, Hurrell JW, Deser C. 2004. North Atlantic Winter Climate Regimes: Spatial Asymmetry, Stationarity with Time, and Ocean Forcing. Journal of Climate 17: 1055–1068.
Cramér H. 1946. Mathematical Methods of Statistics. Princeton University Press: New Jersey, USA.
Davis RE, Kalkstein LS. 1990. Development of an automated spatial synoptic climatological classification. International Journal of Climatology 10: 769–794.
Davis RE, Walker DR. 1992. An Upper-Air Synoptic Climatology of the Western United States. Journal of Climate 5: 1449–1467.
Eder BK, Davis JM, Bloomfield P. 1994. An Automated Classification Scheme Designed to Better Elucidate the Dependence of Ozone on Meteorology. Journal of Applied Meteorology 33: 1182–1199.
Everitt B. 1977. Cluster Analysis. HEB: London, UK.
Fovell RG, Fovell M-YC. 1993. Climate Zones of the Conterminous United States Defined Using Cluster Analysis. Journal of Climate 6: 2103–2135.
Hartigan JA, Wong MA. 1979. A K-means clustering algorithm. Applied Statistics 28: 100–108.
Kalkstein LS, Tan G, Skindlov JA. 1987. An evaluation of objective clustering procedures for use in synoptic climatological classification. Journal of Climate and Applied Meteorology 26: 717–730.
Kastendeuch PP, Kaufmann P. 1997. Classification of summer wind fields over complex terrain. International Journal of Climatology 17: 521–534.
Kaufmann P, Weber RO. 1996. Classification of Mesoscale Wind Fields in the Field Experiment. Journal of Applied Meteorology 35: 1963–1979.
Kaufmann P, Weber RO. 1998. Directional Correlation Coefficient for Channeled Flow and Application to Wind Data over Complex Terrain. Journal of Atmospheric and Oceanic Technology 15: 89–97.
Kaufmann P, Whitemann CD. 1999. Cluster-Analysis Classification of Wintertime Wind Patterns in the Grand Canyon Region. Journal of Applied Meteorology 38: 1131–1147.
Kistler R, Kalnay E, Collins W, Saha S, White G, Woollen J, Chelliah M, Ebisuzaki W, Kanamitsu M, Kousky V, van den Dool H, Jenne R, Fiorino M. 2001. The NCEP-NCAR 50-Year Reanalysis: Monthly Means CD-ROM and Documentation. Bulletin of the American Meteorological Society 82: 247–267.
Lance GN, Williams WT. 1967. A general theory of classificatory sorting strategies: 1. hierarchical systems. Computer Journal 9: 373–380.
Lorenz EN. 1963. Deterministic nonperiodic flow. Journal of the Atmospheric Sciences 20: 130–141.
Ludwig FL, Horel J, Whiteman CD. 2004. Using EOF Analysis to Identify Important Surface Wind Patterns in Mountain Valleys. Journal of Applied Meteorology 43: 969–983.
McQuitty LL. 1966. Similarity analysis by reciprocal pairs for discrete and continuous data. Educational and Psychological Measurement 26: 825–831.
Mengelkamp H-T. 1999. Wind Climate Simulation over Complex Terrain and Wind Turbine Energy Output Estimation. Theoretical and Applied Climatology 63: 129–139.
Mengelkamp H-T, Kapitza H, Pfluger U. 1997. Statistical-dynamical downscaling of wind climatologies. Journal of Wind Engineering and Industrial Aerodynamics 67–68: 449–457.
Michelangeli PA, Vautard R, Legras B. 1995. Weather regimes: recurrence and quasi stationarity. Journal of the Atmospheric Sciences 52: 1237–1256.
Mimmack GM, Mason SJ, Galpin JS. 2001. Choice of Distance Matrices in Cluster Analysis: Defining Regions. Journal of Climate 14: 2790–2797.
Orlanski I. 1975. A rational division of scales for atmospheric processes. Bulletin of the American Meteorological Society 56: 529–530.
Plaut G, Simonnet E. 2001. Large-scale circulation classification, weather regimes, and local climate over France, the Alps and Western Europe. Climate Research 17: 303–324.
Smith P, Ide K, Ghil M. 1999. Multiple Regimes in Northern Hemisphere Height Fields via Mixture Model Clustering. Journal of the Atmospheric Sciences 56: 3704–3723.
Stephenson DB, Doblas-Reyes FJ. 2000. Statistical methods for interpreting Monte Carlo ensemble forecasts. Tellus A 52: 300–322.
Straus DM, Molteni F. 2004. Circulation Regimes and SST Forcing: Results from Large GCM Ensembles. Journal of Climate 17: 1641–1656.
Toth Z. 1991. Intercomparison of circulation similarity measures. Monthly Weather Review 119: 55–64.
Vautard R. 1990. Multiple weather regimes over the North Atlantic: analysis of precursors and successors. Monthly Weather Review 118: 2056–2081.
Ward JH. 1963. Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association 58: 236–244.
Weber RO, Kaufmann P. 1995. Automated Classification Scheme for Wind Fields. Journal of Applied Meteorology 34: 1133–1141.
Wilks DS. 1995. Statistical Methods in the Atmospheric Sciences. Academic Press: San Diego, CA, USA.
World Meteorological Organization. 1996. Guide to Meteorological Instruments and Methods of Observations, WMO-No. 8.
