Disentangling different types of El Nino˜ episodes by evolving climate network analysis

A. Radebach,1, 2, 3, 4 R. V. Donner,1, 5, ∗ J. Runge,1, 2 J. F. Donges,1, 2, 6 and J. Kurths1, 2, 7 1Potsdam Institute for Climate Impact Research, P.O. Box 60 12 03, 14412 Potsdam, Germany 2Department of Physics, Humboldt University Berlin, Newtonstraße 15, 12489 Berlin, Germany 3Mercator Research Institute on Global Commons and Climate Change, Torgauer Straße 12-15, 10829 Berlin, Germany 4Macroeconomic Sustainability Assessment, Economics of Climate Change, Technical University Berlin, Straße des 17. Juni 145, 10623 Berlin, Germany 5Max Planck Institute for Biogeochemistry, Hans-Knoll-Straße¨ 10, 07745 Jena, Germany 6Stockholm Resilience Centre, Stockholm University, Kraftriket¨ 2B, 11419 Stockholm, Sweden 7Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Aberdeen AB243UE, United Kingdom (Dated: October 18, 2018) Complex network theory provides a powerful toolbox for studying the structure of statistical interrelationships between multiple time series in various scientific disciplines. In this work, we apply the recently proposed climate network approach for characterizing the evolving correlation structure of the ’s climate system based on reanalysis data of surface air temperatures. We provide a detailed study on the temporal variability of several global climate network characteristics. Based on a simple conceptual view on red climate networks (i.e., networks with a comparably low number of edges), we give a thorough interpretation of our evolving climate network characteristics, which allows a functional discrimination between recently recognized different types of El Nino˜ episodes. Our analysis provides deep insights into the Earth’s climate system, particularly its global response to strong volcanic eruptions and large-scale impacts of different phases of the El Nino˜ Southern Oscillation (ENSO).

PACS numbers: 92.60.Ry, 92.10.am, 89.75.Hc, 05.45.Tp

I. INTRODUCTION tially extended dynamical system “climate” based on signifi- cant statistical associations extracted from the multitude of en- tangled interactions in the original system [19, 24]. Thus, cli- During the last years, complex network theory [1–3] has mate network analysis opens a new perspective on the Earth’s found wide use not only in the social sciences, engineering, complex climate system. and biology, but also in Earth and environmental sciences. Pioneering work on fundamental aspects of many real-world The bridge from complex network theory to the climate complex networks has triggered an enormous interest in ap- system is based on two fundamental identifications. First, a plying graph-theoretical concepts for the characterization of distinct set of climatological time series obtained at fixed lo- complex geophysical systems. Among others, prominent ex- cations on the Earth are interpreted as vertices of the climate amples include applications in hydrology [4], seismology [5– network. Second, relevant statistical associations between the 8], soil sciences [9–11], and geoscientific time series analy- time series are represented by the network’s edges. The cli- sis [12–18]. Recently, also climatologists started to discover mate network resulting from this approach is then subject to the instruments of complex network theory [19, 20]. Having certain well established (but still actively progressing) statisti- lead to novel insights into the climate system, this promising cal methods originated in complex network theory [1–3, 62]. new branch of climate science is on its way to refine and con- While, as sketched above, this approach is a relatively young solidate its tools [21–61]. one in the climate context, the same structural identification is nowadays widely used in neuroscience, leading to so-called In order to understand the functioning of the climate sys- functional brain networks [63–67] based on statistical asso- tem, relevant underlying physical processes and their interac- ciations between electromagnetic recordings at different parts tions have to be identified. For this purpose, a widely appli- of the brain. cable approach is performing a careful statistical analysis of existing climate data and successively refining existing math- Recent research on climate networks has either investigated arXiv:1310.5494v1 [physics.data-an] 21 Oct 2013 ematical models. Here, we focus on the statistical aspect several measures of the static network relying on the complete only. Traditionally, this problem has been addressed by meth- time span of observations [30, 42, 44] or considered the tem- ods from multivariate statistics, such as empirical orthogonal poral variability of only one specific measure [25–28, 45, 46]. function (EOF) analysis and related techniques. In order to In this work, we combine these two approaches to analyze study spatio-temporal climate variability from a different per- the time-evolution of the global climate system from a com- spective, the climate network approach has been introduced plex network perspective using a set of complementary net- for obtaining a spatially discretized representation of the spa- work characteristics. A similar approach has been recently ap- plied in the analysis of long-term variability in epileptic brain networks [68]. We emphasize that the approach of evolving networks (i.e., complex network structures representing the ∗Corresponding author: [email protected] system’s state within several consecutive windows in time) as 2 used in this work is conceptually related with, but distinctively statistical properties such as the variance are not removed. different from temporal networks [69]. Notably, the concept Moreover, interannual shifting of seasons [71–73] is not con- of temporal networks explicitly mixes topological and tem- sidered. For technical reasons, all leap days are removed from poral information, whereas both are clearly separated in the the resulting time series, which has only negligible effects on present study. the results as long as only lag-zero statistical associations be- In this paper, we present several methodological improve- tween different sites are studied. Regarding non-zero lags, the ments with respect to previous works on climate networks, corresponding effects are found to be statistically negligible such as an alternative type of spatial grid for the network con- as well. struction, which avoids distortions of the climate network’s properties due to the grid geometry [32]. Subsequently, we apply our modified approach to reanalysis data of surface air B. Climate network construction temperature around the globe, spanning the time period be- tween 1948 and 2009. The meaning of characteristic graph 1. Identification of vertices (nodes) properties of the climate network in terms of the underlying physical system as well as their temporal variability when ob- The first step in the construction of a climate network is tained from running windows in time are systematically stud- the appropriate identification of vertices. For example, the lo- ied and discussed in the context of known large-scale climate cations at which the considered time series are available can events such as El Nino˜ episodes or strong volcanic eruptions. be directly used as the spatial locations of network vertices. This paper is organized as follows: In Sect. II, we describe When operating with station data [29], this leads to an ir- the data set used in this work. Afterwards, the construction regular spatial distribution of vertices with a large variety of and statistical description of climate networks is discussed in nearest-neighbor distances. However, even for reanalysis data some detail. The results of an evolving climate network anal- sets or climate models, the arrangement of vertices in the pub- ysis are presented in Sect. III. Subsequently, we demonstrate lished data sets is commonly only regular with respect to the the robustness of our findings regarding various methodologi- difference angles in both longitudinal and latitudinal coordi- cal options in Sect. IV and put them into a climatological con- nates. This results in a significant spread in the actual spatial text in Sect. V. Finally, the main conclusions obtained from distances between neighboring vertices in low and high lati- the presented work are summarized (Sect. VI). tudes. Specifically, average inter-vertex distances are smaller close to the poles than in low- regions (see Fig. 1A,B). Such a heterogeneous distribution of vertices is known to in- II. MATERIALS AND METHODS duce severe distortions in the topological properties of spa- tially embedded networks [74, 75]. Even more, for reanalysis A. Description of the data data, information for high is typically provided with rather large uncertainty, since there are hardly any direct mea- As stated above, climate networks are complex networks surements that can be assimilated into the underlying climate based on statistical associations between climatological time model. As a consequence, there are many vertices with less series obtained at several locations on the Earth. In this study, reliable data in the polar regions. we use air temperatures obtained from the NCEP/NCAR re- In order to correct for the geometrically induced effects, analysis I data set [70], which cover the time period 1948- Heitzig et al. [74] recently introduced a specific class of 2009 (i.e., 62 years) with a daily resolution on an angularly vertex-weighted network measures explicitly taking informa- regular 2.5◦ × 2.5◦ grid. Specifically, we investigate air tem- tion on the spatial distribution of vertices and, hence, their peratures obtained at sigma level 0.995 (i.e., the atmospheric neighborhood size into account. This concept can be under- height where 99.5% of the surface air pressure is attained), stood as a sophisticated generalization of area-weighted con- shortly referred to as surface air temperatures (SAT) in the nectivity measures previously studied in the context of climate following. network analysis [20, 21, 30, 40, 60]. The annual cycle of solar insolation is known to induce the As an alternative approach, in this work we project the leading-order variation of air temperatures. Since we are in- available spatially distributed SATA time series onto a dif- terested in dynamical interactions within the Earth’s climate, ferent type of grid with a higher of homogeneity and this dominant externally triggered effect is not of interest. In isotropy on the sphere (i.e., a grid where the typical nearest- order to properly remove the effect of seasonality from obser- neighbor grid point distances as well as the numbers of neigh- vational time series, a number of different methods may be boring vertices are the same almost everywhere) by means used [29]. Here, we restrict ourselves to removing the long- of interpolation [32]. Specifically, we use a quasi-isotropic term mean annual cycle (base period are the 62 years of the icosahedral grid [76] (see Fig. 1C,D), which is constructed record) from the observational data separately for all consid- as follows: First, the vertices of an icosahedron are projected ered locations. For this purpose, we subtract the long-term onto the sphere, yielding 12 initial grid points with constant mean values for each day of the year, a procedure known as spacing. As anchor points defining the icosahedron, we use phase averaging [29]. Of course, the resulting first-order sur- North and South Pole as well as a third point at 26.56◦N, 0◦E face air temperature anomalies (SATA) only account for sea- (the choice of this third reference point on the zero meridian sonality in the mean, while annual variations in higher-order is convenient, but arbitrary); all other initial grid points fol- 3

A B climate sciences, we conjecture that a possible bias (given its existence) can be widely neglected. A detailed examination of this point is, however, beyond the scope of the present study.

2. Identification of edges (links)

Having thus defined the vertices of the climate network, in C D a second step, the corresponding connectivity is established. This step requires two basic ingredients: the selection of a pair-wise measure of statistical association between time se- ries obtained at each grid point (vertex), and the definition of an appropriate threshold criterion determining which of these associations are statistically relevant. Specific associ- ation measures previously used for climate network construc- tion include the linear (Bravais-Pearson) correlation coeffici- cent [19], (cross-) mutual information [30, 31], a phase syn- chronization index based on the normalized Shannon entropy FIG. 1: (Color online) Geographical neighborhoods of (A,C) a high- of the associated phase difference time series [28], the (cross- latitude (North Atlantic between Svalbard and Northern Greenland) ) mutual information of order patterns [40, 59, 60, 79], event and (B,D) a low-latitude (close to Singapore) grid point given on ◦ ◦ synchronization [41–43], transfer entropy [56], or graphical (A,B) a standard (angularly regular) 2.5 × 2.5 grid with 10,226 models for identifying “causal” climate networks [47, 48, 80, 10,242 grid points and (C,D) an icosahedral grid with grid points 81]. We refer to the corresponding references for details. Of (i.e., n = 5 completed refinement steps), respectively. The cho- sen grid points (red) are connected to all other grid points that are course, other association measures could be used here as well. closer than 500 kilometers. For the standard grid, the high-latitude In all cases, the resulting matrix of normalized pair- grid point has 56 geographical neighbors, whereas the one at low wise statistical associations, e.g., cross-correlation coeffi- latitude has only 8. In contrast, for the icosahedral grid, both grid cients (here within a given time window) is considered as the points have 16 neighbors each. Obviously the heterogeneous spatial weight matrix W = (Wij) of a fully connected weighted distribution of grid points in the standard grid determines the huge graph. In order to obtain a climate network representation (as difference between both neighborhoods, whereas the homogeneous a simple unweighted graph), thresholding is applied to this spatial distribution of grid points of the icosahedral grid enhances the matrix to infer the climate network’s adjacency (connectivity) comparability of vertex properties in different regions of the Earth. matrix A = (Aij) defined as

A = Θ(W − W ∗ ) − δ . (1) low from symmetry. The same procedure applies to the edges ij ij ij ij of the icosahedron (forming 20 equilateral triangles), which ∗ are also projected onto the sphere. Second, every projected Here, Wij is a threshold deciding whether or not the associ- triangle is partitioned into four smaller triangles with approx- ation between vertices i and j is considered statistically rel- imately the same area on the sphere by bisecting the projected evant, Θ(·) is the Heaviside function, and δij Kronecker’s edges. At each bisection point, a new vertex is introduced. delta. In principle, this thresholding can be performed in two This procedure of grid refinement is repeated as often as de- different ways: sired. The number of vertices grows as N = 5 · 22n+1 + 2 with n being the number of completed refinement steps, i.e., (i) On the one hand, it is possible to locally select an ap- N = 42, 162, 642, 2,562 and 10,242 for n = 1,..., 5. Con- propriate threshold separately for each pair of vertices version of the available data is performed using a standard [34, 44, 79], where the significance is determined inde- bilinear interpolation scheme using the four angularly regu- pendently by taking the individual time series’ proba- lar grid points of the quadrilateral surrounding the respective bility distribution and auto-covariance structure into ac- icosahedral grid point [77]. count, for example, by means of AAFT surrogates [82] We emphasize that for the SATA data used in this study, or block-bootstrapping [83]. In this spirit, local thresh- the described spatial interpolation does not cause any con- olding has the important conceptual advantage of rep- siderable errors, since the SATA variability at geographically resenting only the statistically significant interrelation- close points is typically very similar. However, interpolation ships with respect to some specific null model. can generally induce spurious correlations [78], which are not necessarily spatially homogeneous. Since the framework used (ii) On the other hand, the threshold can be defined glob- ∗ ∗ in this paper is based on correlations between time series from ally, i.e., Wij ≡ W [19, 27, 31]. This can be achieved different locations (see below), we cannot completely rule out by considering a fixed quantile of the empirical distribu- a possible effect on the resulting climate network properties. tion p(W••) [130] of all weights Wij (e.g., determined Given the wide-spread use of such interpolation approaches in by the significance of associations of a proper statistical 4

model), which results in an edge density In contrast to these local measures, in this work we are mostly interested in characterizing temporal changes of the N−1 N , X X N climate network topology on a global scale, which primar- ρ = A (2) ij 2 ily calls for the study of scalar-valued network characteris- i=1 j=i+1 tics evolving in time. Of course, one has to be aware of the fact that changes of such global characteristics always reflect (i.e., the fraction of possible edges realized in the net- changes at a local scale. work). Obviously, global thresholding is computation- Temporal changes in climate networks have already been ally by far less demanding than local thresholding and considered by different authors. Tsonis and Swanson [21] allows a more direct comparison of network patterns ob- compared the number and geographic length distribution of tained at different parts of the globe. edges as well as the spatial connectivity pattern for El Nino˜ (EN) and La Nina˜ (LN) years. They found that under EN Notably, both approaches are not equivalent, since global conditions, the global climate network contains considerably thresholding can lead to spurious results in the presence of fewer and geographically shorter edges when considering a strong serial dependences (e.g., auto-correlations) in some in- fixed threshold W ∗ for network construction. Using a more dividual time series [44]. Nevertheless, in this work, we will subtle approach, Yamasaki and co-workers [25–28] confirmed restrict ourselves exclusively to global thresholding in order to a considerable global impact of El Nino˜ on the climate net- reduce the computational efforts. Note that in general, thresh- work in terms of the appearance and disappearance of edges olding results in a loss of information about the exact strengths (“blinking links”). of pair-wise associations. Hence, different thresholds repre- Here, we mainly focus on the time evolution of three global sent different levels of considered association strength (or dif- network characteristics, which are widely used in complex ferent significance levels in case of local thresholding) and re- network research: sult in different edge densities of the derived networks. Conse- quently, looking at climate networks with different edge den- (i) The average path length L [2, 3] measures the mean sities highlights distinct aspects and intrinsic scales of the un- shortest () graph distance between all pairs of derlying association structure of the climate system. vertices in the network, i.e., the average smallest num- ber of edges to be traversed to cover the distance between two randomly chosen vertices on the graph, C. Network quantifiers N 1 1 X L = L with L = L , (5) After having transformed the available climate data into a N i i N − 1 ij complex network representation, the next step is to character- j=1 ize the resulting discrete structures. For this purpose, there is a with Lij denoting the length of the shortest path (i.e., the large amount of statistical characteristics quantifying different number of edges) between vertices i and j, and Lii = 0 aspects of network topology on both local (vertex or edge) and by definition. Note that for ensembles of spatially em- global scale [1–3, 62]. In recent research on climate networks, bedded networks with the same edge density, transfer of much attention has been spent on the probability distributions connectivity between spatial scales (i.e., changes in the and spatial patterns of vertex characteristics, such as degree edge length distribution) can change the average path

N length. However, spatial redistribution of edges alone X (i.e., even without transfer between spatial scales) can k = A , (3) i ij lead to similar changes in L. j=1 (ii) The network transitivity T [3] – sometimes also referred area-weighted connectivity [20, 21, 40, 44], or betweenness to as the (Barrat-Weigt) clustering coefficient [2, 84] – centrality [30, 31, 44]. characterizes the degree of transitivity in the connec- Besides such measures characterizing exclusively network tivity relations in the network relative to the maximally topology (connectivity), there are those quantifying certain as- possible value (or, put differently, the global density of pects of the spatially embedded geometry of the graph. These closed “triangles” in the network): measures rely on the geographical distance matrix d = (dij) which stores the shortest spatial distances (along on PN AijAikAjk the sphere) between all pairs of vertices i and j. Notable ex- T = i,j,k=1 . (6) PN A A amples are the edge length distribution p(d••|A•• = 1) – in i,j,k=1,j6=k ij ik the following understood as referring to present edges – and the maximal edge length per vertex (iii) The global (Watts-Strogatz) clustering coefficient C [85] measures the average density of triangles centered at all max di = max(dij|Aij = 1). (4) vertices of a network, j N PN 1 X j,k=1 AijAikAjk The latter quantity allows identifying vertices possessing C = C with C = , (7) N i i k (k − 1) long-range connections (teleconnections). i=1 i i 5

. where C is the local clustering coefficient of vertex i. C ? 0 N i i. e., H (G , G ) = 2c 2 . Recall that c is the number of is conceptually related with, but distinct from T and ac- edges that are present in the network with the lower edge den- tually captures a different property of the network under sity but not in the network with the higher edge density. For study. Particularly, T does not explicitly take the degree the mutual comparison of different climate networks the latter of each vertex into account, whereas C does. part, which we refer to as the corrected Hamming distance

For spatially embedded networks such as climate networks, ? 0 0 H (G , G ) = H(G , G ) − ∆ρ, (10) the possible ranges of the aforementioned global character- istics are often predetermined by the associated spatial con- is of particular interest, since it measures the structural dis- straints [52, 75, 86–89], which calls for a careful interpretation similarity one would find for two networks of the same edge of the corresponding results. For example, the small-world density, disregarding the effect of different numbers of edges. property (i.e., high global clustering coefficient and short av- By definition, we have 0 ≤ H?(G , G 0) ≤ H(G , G 0) ≤ 1, i.e., erage path length [85]) common to many real-world networks the same range as for the original Hamming distance. can be induced by the spatial embedding alone [75]. In the context of complex networks representing subse- quent snapshots of the evolving network topology, i.e., G = 0 Gt and G = Gt−δt with t denoting some time interval of inter- D. Characterization of graph dissimilarity est and δt being a fixed time increment, the Hamming distance

0 In addition to the scalar network characteristics discussed Ht,t−δt = H(G , G ) (11) above, for studying dynamical changes in climate network topology, it is useful to consider a measure for comparing two (as well as its counter-part corrected for the effect of differ- networks with the same set of vertices. This is traditionally ent edge densities) can be interpreted as the relative change achieved by the Hamming distance [30, 90] in connectivity between subsequent networks, i.e., a discrete “network derivative” given the direct analogy with the classi- N−1 N ,   cal difference quotient. This viewpoint is of particular interest 0 X X 0 N H(G , G ) = Aij − A , (8) in the context of evolving climate networks, since strong dif- ij 2 i=1 j=i+1 ferences between networks obtained for subsequent time in- tervals point to a (temporary) global-scale instability of the 0 where G and G are the two graphs to be compared with ad- spatial interdependence structure of the considered climato- 0 jacency matrices A and A , respectively. By definition, we logical observable. have H = 0 for identical networks, and H = 1 for net- works being inverse with respect to the presence and absence of edges. Note that H treats the combined presence and ab- III. RESULTS sence of edges in the two networks symmetrically, i.e., an in- version of both networks does not alter the result. A. Methodological setting Trivially two networks with different numbers of edges al- ways have H > 0. Hence, separating the corresponding effect In order to study the signatures of annual- to decadal-scale from a “real” difference in the placement of (present) edges variability in the climate network, we determine the underly- provides additional insights into network topology. Let us de- ing connectivity as described in Sect. II for running windows fine of a given width w in time and study the temporal variability 0 of the resulting global network characteristics. a = |{(i, j)|i < j ∧ Aij = 0 ∧ Aij = 0}| 0 For comparing the topological properties of evolving cli- b = |{(i, j)|i < j ∧ Aij = 1 ∧ Aij = 0}| mate networks, two different methodological settings are pos- 0 c = |{(i, j)|i < j ∧ Aij = 0 ∧ Aij = 1}| sible: d = |{(i, j)|i < j ∧ A = 1 ∧ A0 = 1}|, ij ij (i) On the one hand, the global threshold W ∗ used for edge where |S| is the number of elements of the set S, i.e., a is generation can be kept constant. In this case, we expect the number of edges absent in both networks, d is the number variations in the number of edges present in the network of edges present in both networks, and b and c refer to the as previously found by other authors [21, 25, 27, 28] re- respective numbers of edges present in exactly one of both lated with the global signature of ENSO variability. We 0 N will specifically discuss this situation in Sect. IV B. networks. This implies H(G , G ) = (b + c)/ 2 . Let ρ = N 0 N (b + d)/ 2 and ρ = (c + d)/ 2 be the edge densities (ii) On the other hand, many complex network characteris- of both networks. Without loss of generality, ρ ≥ ρ0 (i.e., tics depend on the number of vertices and edges present b ≥ c). With the edge density difference ∆ρ := |ρ − ρ0| = in the network (cf. our discussion on the Hamming dis- N (b − c)/ 2 ≥ 0, we obtain tance in Sect. II D). Hence, comparing the properties of climate networks with different numbers of edges and N thoroughly interpreting the corresponding results can be H(G , G 0) = ∆ρ + 2c = ∆ρ + H?(G , G 0), (9) 2 a non-trivial task. Therefore, it is desirable to keep the 6

edge density ρ of the networks fixed when studying their time evolution [131]. In this case, the threshold W ∗ A varies in time. A higher threshold thus implies that the empirical distribution p(W••) of the considered pair- wise statistical association measure is shifted towards ∗ higher values of W••. Thus, periods with increased W indicate that there is a higher fraction of strong statisti- cal associations in the climate system, i.e., the obtained edges represent stronger mutual interdependences. In the following, we study the resulting properties of the global SATA network based on the reanalysis data set (time resolution ∆t = 1 day) projected onto an icosahedral grid with N = 10,242 vertices (i.e., n = 5 refinement steps of the B grid construction algorithm described in Sect. II B). For the network evolution, running windows of width w = 1 year and offset ∆w = 30 days are considered. Network connectivity is established based on the lag-zero cross-correlations Cij(s = 0) between all pairs (i, j) of records (the alternative case of maximum cross-correlation after allowing for non-zero lags will be discussed in Sect. IV A). Only the 0.5% strongest pair- wise associations between time series are considered as edges (ρ = 0.005). Such sparse climate networks have been in- troduced and partly studied in previous works [19, 21, 30– 32, 49], where ρ ∼ 0.01 ... 0.05 or W ∗ = 0.5 have been typical choices. A brief discussion of climate networks with higher edge densities can be found in Sect. IV D [132]. In order to guarantee that the climate network at a given point in time only considers dynamical information of its past, we will display the network measures at the endpoint of the FIG. 2: (Color online) Spatial distribution of the degree k• for two associated running window. typical time windows (A) without (May 1960 to April 1961) and (B) with (May 1982 to April 1983) marked localized structures for the surface air temperature anomalies (SATA) network obtained using B. A conceptual view on sparse climate networks the setting described in Sect. III A.

Before investigating the time-dependence of global climate network characteristics, let us have a detailed look at the spa- tial patterns associated with the connectivity of these net- points are typically accompanied by stronger associations be- works. In the following, we will provide a general discussion tween the respective temporal climate variability. Hence, the of these patterns. Thereby, we obtain a conceptual view on substrate lattice describes “trivial” spatial correlations due to sparse climate networks, which will subsequently prove to be typical (synoptic-scale) atmospheric patterns. We emphasize helpful for understanding the temporal variability of evolving that this type of structure is always present in our climate net- climate network properties. works and behaves relatively static, i.e., its edges do not fluc- When looking at the evolution of spatial connectivity pat- tuate much in time. Further research should clarify the rela- terns, we find two prototypical phases of the sparse climate tion to the concept of a “skeleton of strongly correlated links” network (Fig. 2A,B). For certain episodes, it reveals one (or as introduced in [25]. more) distinct strongly connected region(s), i.e., with vertices Second, there are regions of larger spatial extension (≈ having extraordinarily high degrees ki (Fig. 2B), while such 3000 − 9000 km), which display very high internal connec- are not present during other periods (Fig. 2A). Inspired by this tivity [19, 30]. The presence of such “localized structures” observation, we propose a simple idealized view on this phe- indicates that the spatial correlation length is significantly en- nomenon: Certain instances of an evolving climate network hanced within a confined region, i.e., beyond typical syn- – constructed in the way outlined in Sect. III A – exhibit (at optic scales. Hence, the corresponding connectivity covers least) two types of (temporarily) coexisting structures. both short (synoptic-scale) and intermediate distances (see First, there is a “substrate lattice” which reflects strong Sect. III C for a detailed discussion). Note that localized struc- short-range associations between mutually close grid points tures appear only episodically in the evolving SATA network affected by the same atmospheric circulation patterns. Typi- (cf. Fig. 2A,B and [21]), but typically repeatedly in the same cally, we observe an approximately exponential decay of the region (especially the Eastern Equatorial Pacific). As a conse- strengths of statistical associations between vertices with in- quence, we expect them to contribute significantly to the cli- creasing distance [30], since shorter distances between grid mate network connectivity when considering the full 62 years- 7

A B A B

C D C D

E F E F

G H

FIG. 3: Kernel estimates of the probability density functions p(·) (obtained using a Gaussian kernel function with a bandwidth follow- ing Scott’s rule [91]) of (A,B) vertex degree k•, (C,D) local cluster- ing coefficient C•, and (E,F) edge lengths d••|A•• = 1 for the same FIG. 4: (Color online) Joint probability distributions p(·, ·) of (A,B) time intervals and setting as in Fig. 2. In all cases, the empirical dis- strength of statistical association W•• and geographical distance d•• tributions have been normalized by the distribution of edge lengths between all pairs of vertices (i.e., without thresholding), and (C,D) of a fully connected graph in order to eliminate purely geometric ef- max degree k• and maximum edge length d• , (E,F) average shortest- fects. max path length per vertex L• and maximum edge length d• , and (G,H) local clustering coefficient C• and degree k• of all vertices (i.e., after thresholding), for the same time intervals and setting as in Fig. 2. All long records, which is supported by other studies [30, 44]. distributions are represented as histograms using 100 (90 for W••) The postulated separation of the climate network into sub- equidistant bins. The grey lines in (A,B) depict the thresholds above which edges are established (here: 0.5% of all possible pairs of ver- strate lattice and localized structures is supported by the tices). Wherever appropriate, the distance-dependences of the distri- (evolving) edge length distribution (see Fig. 3E,F), showing butions have been corrected as in Fig. 3. one dominant peak for short-range edges (substrate lattice) and far less longer connections. In addition, there are edges of lengths that exceed the typical extension of the described terized by low connectivity [19–21, 30] (see also Fig. 3A,B). localized structures, which are denoted as “teleconnections” Furthermore, the localized structures in the Eastern Equatorial and interrelate climate variability at distant parts of the globe. Pacific (Fig. 2B) related with ENSO variability closely resem- We hypothesize that the latter show more “ordered” place- ble the corresponding results of recent studies [19, 30]. No- ment during certain climatic episodes, although they cannot tably, these observations hold for our analysis using an icosa- be clearly separated by means of the edge length distribution hedral grid, whereas former studies were based on on a stan- only. dard (angularly regular) grid. Localized structures seem to be favored starting points of Notably, our conceptual view refers to the membership of long-range edges. For example, the phasing on the El Nino˜ vertices to one or another category, but is induced by the Southern Oscillation (ENSO) in the Equatorial Pacific (see placement and temporal behavior of edges. In turn, analyz- Fig. 2B) is known to have considerable influence on climate ing fields of (topological or geographical) vertex properties variability in distant parts of the Earth [92], such as the Indian does only provide implicit information on the edges. How- monsoon system [93, 94]. ever, even though the degree field (Fig. 2) does not describe The proposed qualitative view is consistent with previous the spatial distribution of long-range connections (Fig. 3E,F), results for static (time-independent) climate networks, which our idealized conceptual view holds, since the joint distribu- max clearly demonstrated that the majority of vertices is charac- tion of maximal edge length per vertex d• and degree k• 8

1983 tween the variations of both L and T in the climate network 1961 does most probably originate from complex rewiring pro- cesses driven by climate variability, although we cannot fully rule out minor effects due to changing auto-correlations. In the following, we will provide a detailed graph-theoretical in- terpretation of these results, whereas the climatological mech- anisms beyond the obtained temporal variability pattern will be discussed in detail in Sect. V.

1. Association strengths and spatial scales

Since the SATA networks studied in this work solely rely on those pairs of time series the statistical association be- tween which exceeds W ∗, the evolving joint probability den- sity function p(d••,W••) (Fig. 4A,B) reveals first deep in- sights into relevant spatio-temporal modes of climate vari- ability. Specifically, this distribution can be qualitatively de- composed into the components introduced in Sec. III B: The substrate lattice manifests itself as dominant strong and rather persistent associations at small edge lengths. For larger edge lenghts, there is a more or less continuous distribution of as- sociation values. During some periods (e.g., in Fig. 4B), the corresponding distribution of statistical association values for FIG. 5: (Color online) Threshold W ∗, average path length L, tran- distant vertices is shifted towards higher values, indicating the sitivity T , global clustering coefficient C and (standard) Hamming presence of localized structures. distance Ht,t−1 between networks obtained for successive periods Considering the evolution of p(d••,W••), we see as a first in time for the evolving SATA network (using the settings given in approximation a net amplification of association values for ∗ Sec. III A). Note that due to the fixed ρ, we have Ht,t−1 = Ht,t−1. several time windows, leading to the peaks in the threshold Vertical bars indicate fall and winter seasons (SON-DJF) with max- W ∗ visible in Fig. 5. This amplification is not uniform with imal intensity of EN (red) and LN (blue; darker colors represent respect to the spatial scale d . Consequently, not only the stronger episodes according to the Nino˜ 3.4 index, cf. [95, 96]). •• degree distribution p(k•) (Fig. 6A), but also the edge length distribution varies substantially with time (cf. Fig. 6B). For several time windows exhibiting strong peaks in the evolving (Fig. 4C,D) shows that the vast majority of vertices with small scalar network characteristics, we observe more long edges degree has indeed almost no long-range edges (dmax ≤ 2500 i – but yet cannot find a clear separation of the longer spatial km). Further relationships between network properties will be scales. This suggests that these time windows are accompa- discussed below. nied by the emergence of localized structures and, hence, hub vertices (Fig. 6A). Recall that localized structures consist of vertices with very high degrees (cf. Fig. 3B) and exhibit high C. Temporal variability of global network properties internal connectivity. Typically, but not necessarily, the as- sociated structures are located in the Equatorial Pacific (cf. Performing an evolving climate network analysis as de- Fig. 2B). scribed above, we first observe that the two network mea- sures L and T as well as the Hamming distance Ht,t−1 widely change in parallel with each other and with the threshold W ∗, with characteristic peaks from a certain constant base level 2. Transitivity (Fig. 5). We emphasize that this co-evolution is ex ante non- trivial, since these three measures capture distinctively differ- By forming groups of vertices with very high degree (com- ent network properties. In turn, the variability of the global monly in the presence of localized structures), the network‘s clustering coefficient C is strongly anti-correlated with that of total number of connected triples rises, since the possible the aforementioned characteristics, which also deserves fur- number of triples centered at one particular vertex i grows ther discussion since C captures a similar network property as with its degree ki as ki(ki −1)/2. As a consequence, although T . the denominator in Eq. (6) peaks at those time windows Because the total number of edges has been kept fixed, within which the strength of statistical associations is ampli- all scalar network characteristics are not affected if the edge fied (peaking W ?), the transitivity T still increases because density ρ is varied within a certain range still corresponding the total number of closed triangles (numerator in Eq. (6)) in- to a “sparse” connectivity. Hence, the strong similarity be- creases even stronger than the number of connected triples. 9

note that the variability of C is by about one order of magni- A tude smaller than that of T , another observation that calls for 1983 explanation. In order to resolve the reason for the behavior described above, a deeper look into the probability distribution of C• (Fig. 3C,D) gives a two-fold finding: For several time win- dows we observe a secondary maximum of p(C•) at higher C• 1961 as well as a shift of the primary maximum towards smaller values. If we furthermore consider the dependence on the ver- tices’ degrees k•, we find that the hubs show a broad range of higher C• values than the vertices exclusively belonging to the substrate lattice (cf. Fig. 4H). Still, the vast majority of vertices with low degrees show declining C• during periods with marked localized structures. This causes the global clus- B tering coefficient to drop (even though a notable fraction of vertices increase their C•). We can exclude that the observed drops in C have been induced by vertices i of degree ki = 0 or ki = 1, since only five of such vertices emerge in the entire time-evolution of the SATA network. According to the general shape of the probability density p(d••,W••) (Fig. 4A,B), for a fixed time window we can ex- pect that the distributions of k• and C• will exhibit remarkable changes as the edge density ρ is varied. As a consequence, we hypothesize that the general behavior of C (for fixed ρ as a function of time) is much more strongly affected by the spe- cific choice of ρ than that of T . The validity of this hypothesis, and particularly the dependence of the distinct anti-correlation between T and C on the chosen edge density, will be further FIG. 6: (Color online) Evolution of (A) degree distribution p(k•) discussed in Sec. IV D. and (B) edge length distribution p(d••|A•• = 1) with the settings given in Sect. III A. In panel (B), the distributions obtained for all time windows have been corrected as in Fig. 3.

4. Average path length In analogy with the local clustering coefficient (Eq. (7)), we can formally split the transitivity (Eq. (6)) into (non- Ad hoc it seems counter-intuitive that a spatially embedded normalized) “local” transitivities T by just decomposing the i network (with periodic boundary conditions and fixed edge sum in the nominator of Eq. (6) as density) exhibits a rising topological path length when there N PN is a transfer of connectivity towards longer spatial scales. 1 X j,k=1 AijAikAjk T = T with T = , Specifically, in spatially embedded networks (e.g., airline N i i 1 PN i=1 N i,j,k=1,j6=k AijAik transportation), longer edges typically act as shortcuts. Thus, (12) the presence of such long-range connections is particularly re- i.e., Ti gives the ratio between the number of triangles cen- flected in shortest path-based quantities. In our SATA net- tered at vertex i and the average number of connected triples works, the same observation holds in each time window: ver- i max centered at all vertices. We find that vertices with high de- tices i with di & 2500 km have always minimal Li. How- max gree in the localized structures contribute stronger (i.e., with ever, at the same time we observe a total shift of p(d• , L•) higher Ti) to the overall transitivity than those exclusively be- towards higher L• values in these time windows (Fig. 4E,F). longing to the substrate lattice. We deduce, that this can be caused by a more redundant, partially parallel geographical placement of long-range edges compared to the base-level situation (cf. Fig. 2). This ex- 3. Global clustering coefficient planation is consistent with the physical continuity of the cli- mate system: Spatially close points tend to behave similar and Unlike network transitivity, the global clustering clustering thus correlate group-wise with others. Another possible cause coefficient C – as the arithmetic mean of all local clustering is the loss of edge density in the substrate lattice, enlarging coefficients – drops when localized structures emerge. This shortest paths starting or ending at (the majority of) vertices i behavior appears somewhat unexpected, since both charac- with small ki. In a nutshell, building up lots of parallel high- teristics quantify conceptually related properties and exhibit ways by dismantling rural roads does stretch shortest path- values within the interval [0, 1]. In connection with this fact, ways in the entire frame. 10

5. Hamming distance

In a similar spirit as for the global network characteristics discussed above, the peaks of the Hamming distance Ht,t−1 (Eq. (11)) coinciding with those of W ∗ can be explained as indicators of a persistent redistribution of edges between dif- ferent spatial scales, which is known to be a typical signature of EN episodes [21, 25, 27]. Specifically, Ht,t−1 exhibits a double-peak structure around time windows characterized by single peaks of the other network characteristics (T , C and L). Given our interpretation of the Hamming distance as a “net- work derivative”, this finding is consistent with the expected behavior: large values coincide with time periods where the SATA network connectivity is changing considerably, i.e., in parallel with the emergence and disappearance of localized structures.

D. Possible implications for local network organization

Following the observations described in Sect. III C in com- bination with our conceptual view on sparse climate networks FIG. 7: Linear cross-correlation functions C(s) between the differ- (Sect. III B), we are able to derive some preliminary insights ent evolving climate network characteristics from Fig. 5. Positive into the spatial organization of the association structure of (negative) s refer to the situation of the measure in the row leading SATA fields on the local (network) scale, which complement (lagging) the measure in the column. recent findings [31]. For this purpose, let us examine Fig. 4 in some more detail. In many examples of complex networks [97–101], hubs aligned in parallel with the preferred direction of atmospheric have a tendency to contribute to a lower fraction of trian- flow have on average stronger statistical associations over a gles than vertices with intermediate degree. In the climate wider spatial range than those in the perpendicular direction. network, this effect is only visible for those vertices i with A detailed examination of the associated climate network con- the highest degrees (i.e., ki & 450 in Fig. 4H), which be- nectivity patterns on the local scale is, however, beyond the long to densely connected and spatially localized structures scope of this study and will be subject of future work. We con- (see Fig. 2B). As discussed in Sect. III C, these hubs have jecture that in addition to established vertex characteristics, higher local clustering coefficients than the substrate lattice, purely geometric measures related to the spatial anisotropy of even though the associated spatial scales captured by the adja- connections [102, 103] can provide relevant complementary cent edges are considerably larger than the typical “correlation information for this purpose. range” (i.e., synoptic scales of up to O(103 km)) within which mutual associations are on average statistically relevant. No- tably, this effect acts against the decrease of the local clus- tering coefficients in the substrate lattice, which dominates E. Statistical interdependences between measures the resulting signature in the global clustering coefficient for sparse climate networks with an edge density of 0.5% as con- Beyond the qualitative interpretation of the observed simi- sidered here. larities and dissimilarities of various characteristics of evolv- In order to derive an alternative explanation, note again that ing climate networks, we will next provide a quantitative in the presence of localized structures, vertices belonging to assessment of the statistical interrelationships between these the substrate lattice are characterized by a lower average de- measures, empirically supplementing our arguments from the gree than otherwise, since the total edge density ρ is conserved former sections. For this purpose, Fig. 7 displays the linear (compare the left and right panels in Figs. 3 and 4). We sug- cross-correlation functions between the temporal variability gest that this finding could indicate that the connectivity in of different measures. For the global network characteristics, the substrate lattice becomes less isotropic, but rather reflects the most significant (positive or negative) interdependences the actual preferred directions of atmospheric dynamics (e.g., are found when considering the same time window. In con- westerlies, trade winds, etc.). This hypothesis is supported trast, for the Hamming distance, the maximum correlations by recent findings of Palusˇ et al. [44] who, by using a differ- with the other considered measures show a delay between ent climate network approach, observed that a stronger trans- 3∆w and 6∆w (i.e., of 3-6 months), underlining the distinc- port of air masses during positive phases of the North Atlantic tively different meaning of Ht,t−1 as a “network derivative” Oscillation (NAO) enhances the network connectivity in the indicating structural changes before and after their most sig- directly affected areas. Particularly, it is likely that vertices nificant reflection in the global network characteristics. In this 11 respect, the observed delay reflects the typical time-scale as- 1983 sociated with the emergence and disappearance of localized 1961 structures and, thus, of the underlying climate phenomena (cf. Sect. V).

IV. ROBUSTNESS OF THE RESULTS

The results described so far have been obtained using one specific setting of methodological options for climate network construction. In the following, we test the qualitative robust- ness of the obtained results using different methodological choices.

A. Non-zero lags

Our results in Sect. III refer to climate networks based on lag-zero statistical associations between first-order desea- soned SAT time series at distinct parts on the globe. Since atmospheric circulation patterns always travel with a finite ve- locity, the same variations usually affect different grid points at different times, so that the cross-correlation function Cij(s) between different grid points i,j could peak at non-zero mu- tual lags (s 6= 0). In order to study the impact of such lags on FIG. 8: (Color online) As in Fig. 5 for SATA networks based on the the topology and time evolution of the SATA network, in the maximum cross-correlation for time lags s ∈ [0, 30] days. Note that W ∗ is larger than in Fig. 5 for s = 0 as expected. following we replace the lag-zero cross-correlation Cij(s = 0) as the criterion for edge creation by the maximum value of the cross-correlation function Cij(s) for time lags s ≤ 30 days. This choice allows considering typical large-scale at- et al. [57], who observed higher correlations during EN peri- mospheric wave phenomena that could mediate between the ods than under “normal” climate conditions, and even stronger temperature variability at distant parts on the Earth, and re- pair-wise associations during LN phases in agreement with spects the typical lifetime of weather regimes. Keeping all [21, 44]. other parameters of our analysis the same as above, the re- On the other hand, [25, 27] considered a setting with a fixed sulting variations of climate network properties are shown in threshold rather than a fixed edge density. Since W ∗ and ρ Fig. 8. are closely interrelated, the decrease in the number of edges As for the lag-zero case, we observe a sharp increase of during EN phases reported in these studies would coincide the threshold W ∗ at the previously identified time windows, with a decrease in W ∗ if ρ is kept constant, which is different which mainly coincide with certain phases of the El Nino˜ from our results. One possible reason for this difference is that Southern Oscillation (ENSO, see Sect. V). Related to this we prefer not to normalize the estimated association strengths finding, we note that other authors (e.g., [25, 27]) even found Wij by the standard deviations of the measure taken over all signatures corresponding to a decrease rather than increase in considered time lags as in [25, 27]. Specifically, we can argue W ∗. that the absolute value of the maximum statistical association On the one hand, Tsonis and Swanson [21] as well as Palus and its magnitude relative to the fluctuations over a range of et al. [44] compared climate networks obtained for EN and delays (as defined in [25, 27]) provide complementary results. LN phases and found higher connectivity during LN in com- A decreasing relative magnitude in parallel with an increas- parison with EN episodes. For a fixed edge density, this would ing absolute value indicates stronger statistical associations correspond to higher values of W ∗. However, the aforemen- for most other delays. Here, we keep the two quantities (i.e., tioned studies did not explicitly consider the neutral state as absolute value of maximal statistical association and standard opposed to both EN and LN conditions. In turn, the results of deviation of associations over a certain range of s) separated. our evolving network analysis displayed in Figs. 5 and 8 do This point of view is supported by [57], who found that the not allow a systematic confirmation or rejection of any partic- temporal fluctuations in the network connectivity obtained us- ular asymmetry between the values of W ∗ or the considered ing the approach of [25, 27] do not necessarily reflect changes network characteristics for all EN and LN phases, although in the coupling between different regions. Even more, this there are two LN episodes (1973/74 and 1988/89) that ex- method appears to have a lower degree of robustness under hibit higher W ∗ values than during all EN phases. We note changes of its basic parameters than other approaches for cli- that these results are consistent with recent findings of Martin mate network construction [57]. 12

For the temporal variation of the considered network char- 1983 acteristics, we find no qualitative deviations from the findings 1961 previously obtained for the lag-zero SATA network (compare Figs. 5 and 8). A detailed inspection of the delays associated with the maximum cross-correlation (not shown) or, alterna- tively, the maximum mutual information [104] reveals that besides an exceptionally strong peak around s = 0, almost all delays contribute with comparable frequencies. This ob- servation demonstrates that statistically relevant atmospheric interactions appear predominantly on very short time scales, reflecting the presence of particularly strong interactions be- tween geographically close grid points, and subsequently on all other (here considered) time scales. By this, we qualita- tively reproduce [25, 27]. Note that the absence of marked changes on the global network scale in comparison with the lag-zero case does not necessarily imply that there are no changes at the local scale. A deeper discussion of the asso- ciated fields of local (vertex) characteristics is to be resumed in future work.

B. Fixed thresholds

As initially discussed in Sect. III A, there are two possible and theoretically justified options for selecting a global thresh- old W ∗ in evolving climate network analysis. While all pre- FIG. 9: (Color online) As in Fig. 5 for SATA networks with a fixed threshold W ∗ chosen as the minimal threshold from the evolution vious considerations have focused on a fixed edge density ρ ∗ for fixed edge density of ρ = 0.005 (see topmost panel in Fig. 5), and, hence, a variable threshold W , in the following we con- W ∗ ≈ 0.4958 – a value which was also used in [19–21]. Thus, ρ = ∗ sider the alternative choice of a fixed threshold W , which in 0.005 is a lower boundary for the evolving edge density, ensuring turn implies that ρ becomes time-dependent. We emphasize that the network does not tend towards disintegration. In addition to ∗ that the resulting variations of W and ρ, respectively, are di- the standard Hamming distance Ht,t−1 (lower panel, black line), the ∗ ∗ rectly interrelated, since maxima of both W (for fixed ρ) and corrected Hamming distance Ht,t−1 is shown (red line). ρ (for fixed W ∗) indicate a shift of the distribution p(W ) of association strengths towards larger values. Consequently, the temporal variability pattern of ρ (see Fig. 9) inferred from lag- tering coefficient C (from drops to peaks) is due to the fact zero-based cross-correlation Cij(s = 0) is similar to that of that the additional edges (which preferentially connect ver- W ∗ in Fig. 5. tices within the localized structures) do not substantially affect A detailed inspection of the different global network char- the edge structure of the substrate lattice. Consequently, we do acteristics (cf. Fig. 9) shows that the behavior of the transitiv- not find a shift of the primary maximum of p(k•, C•) towards ity T is qualitatively similar to the case of fixed edge density smaller values of C•, while the secondary maximum shows ρ, whereas the global clustering C exhibits peaks instead of the formerly described behavior contributing to an overall in- drops in the previously identified time windows, and the aver- crease in C. age shortest path length L lacks the formerly observed peaks The average shortest path length L is governed by two com- from a constant base level. peting mechanisms. While the effect pointed out in Sec. III C Clearly, the occurrence of localized structures as described (i.e., a spatial redistribution of edges leading to a more redun- in Sect. III B takes place in the case of fixed threshold as well. dant placement, yielding an overall increasing path length) is Specifically, in the presence of such structures, we observe a still present, additionally occurring edges trivially reduce the considerably higher edge density. Hence, edges are not only lengths of shortest paths. A further detailed investigation of spatially redistributed, but additional significant associations the distribution of shortest path lengths per vertex could sepa- emerge. rate the two effect but exceeds the scope of this work. Since the transitivity T shows the same signal as before, Since the Hamming distance Ht,t−1 as well as its density- ∗ our considerations from Sec. III C still apply. Specifically, the corrected counter-part Ht,t−1 indicate general connectivity emergence of localized structures results in a marked increase changes, both have high values whenever localized structures in local transitivities (even overshadowing a potential decrease emerge or disappear. In general, the amplitudes of Ht,t−1 are of local transitivity at the majority of lower-degree vertices). about a factor of 3 larger than in the case of fixed edge den- Thus, additional edges (of lower association strengths) follow sities (Figs. 5 and 8). This observation is probably related to the formerly discussed mechanisms. the fact that the edge densities in the considered fixed thresh- The switch in the qualitative behavior of the global clus- old scenario are bound from below by the value used in the 13

fixed edge density scenario. Hence, for most time intervals, 1983 there are considerably more edges contained in the networks * 1961 of Fig. 9 than in those of Figs. 5 and and 8 (up to about three times more). Since the higher the edge density, the more edges can be rewired between two subsequent time intervals (note that the range of possible Hamming distances is bound from above by the sum of the edge densities of the two networks to be compared if the latter is smaller than 1), the difference in the realized edge densities could explain the observed behav- ior. Comparing the classical and corrected Hamming distance, we find that the contribution of edge rewiring mostly domi- nates the effect of a changing edge density ρ, i.e., both Ht,t−1 ∗ and Ht,t−1 display qualitatively the same variability. How- ever, in some time windows (e.g., in 1957, 1976, or 1993), the corrected version of this measure remains at a much lower level (cf. Fig. 9), indicating that during these periods, the changes in the edge density ρ are particularly relevant as well.

FIG. 10: (Color online) Global network measures (without Hamming C. Further methodological options distance) for denser climate networks. The displayed results corre- spond to a fixed edge density ρ = 0.15 (left axes, gray lines) and a ∗ Besides the methodological choices discussed above, there fixed threshold W = 0.2997 (right axes, black lines), repectively. are further options that can be used for modifying the setting of our evolving climate network analysis. In the following, we just briefly note some of the possibilities that have been D. Denser climate networks tested within the course of the described work (see [104] for examples), but are not discussed here in detail since they lead All previous considerations referred to sparse evolving cli- to results that are qualitatively equivalent to those already pre- mate networks, i.e., networks with a very low edge den- sented above: sity. While there have been several studies using this set- ting (e.g., [19, 21, 30–32, 49]), one might argue for analyzing • use another statistical association measure, e.g., the denser networks (as used in [50, 52]) in order to obtain pos- nonlinear (cross-) mutual information or Spearman sibly better statistical estimates of network characteristics. In correlation coefficient (functions) instead of linear the following, we discuss to which extent our results described Pearson correlation, further measures are possible above are modified in case of higher edge densities. (cf. Sect. II B 2), As Fig. 10 shows, we find that many of the previously dis- • change the temporal resolution of the considered time cussed signatures of global network characteristics are qual- series data (e.g., 6 hours or one month), itatively robust when considering higher fixed edge densi- ties up to ρ = 0.25, as well as lower fixed thresholds (e.g., • change the size w of the running windows in time used W ∗ = 0.2997, corresponding to ρ ∈ (0.016, 0.072)). This for the evolving network analysis within a reasonable observation partially confirms the previous result of [23] that range. “the effect of different correlation thresholds (between 0.4 and 0.6) does not affect the conclusions reached”. Specifically, we In contrast to the aforementioned options, the choice of the make the following observations: spatial resolution of the icosahedral grid (e.g., use n = 4 or n = 6 instead of n = 5 refinement steps, cf. Sect. II B) is cru- (i) In accordance with its previously discussed robust- cial. Using a coarser grid than presented here leads to a sig- ness against different methodological choices (e.g., in nificantly increased fraction of (almost) disconnected nodes Sec. IV B), we find the transitivity T to be the most ro- (ki = 0 or ki = 1) for the edge densities used in this study, bust measure. Not only for relatively large edge densi- especially in the presence of localized structures. This effect ties, but also at extremely high thresholds (W ∗ = 0.9, then biases other network measures (e.g., the global cluster- corresponding to ρ ∈ (0.0004, 0.0010)), the evolution ing coefficient C) and reduces accessible information about of Fig. 5 is confirmed. the network’s structure. In turn, considering a finer grid would require data provided with a higher spatial resolution. For the (ii) In contrast to this, the global clustering coefficient C large-scale global climate characteristics we are interested in, shows a marked sensitivity to variations of the edge den- the considered resolution is reasonable, whereas consideration sity: it drops during certain time intervals for low edge of specific regional atmospheric processes and associated sta- densities (e.g., ρ . 0.01, cf. Sec. III C), but peaks for tistical association patterns [41–43, 52] would call for a denser higher ones. The critical edge density at which this be- grid. havior switches is determined by p(d••,W••): when 14

W ∗ lies above the typical association strength of in- A. ENSO vs. volcanic eruptions termediate and longer edges, we find the behavior of Sec. III C. In turn, for lower thresholds, edges are per- In order to understand why some time intervals display ex- manently present at all spatial scales, leading to gener- treme values of various SATA network characteristics even ally higher C• in the substrate lattice and, hence, episodic without any associated ENSO phase, we first note that not peaks of C since vertices in localized structures exhibit only EN/LN episodes, but also strong volcanic eruptions have very high C•. a considerable large-scale impact on the Earth’s climate sys- tem [93]. To our best knowledge, a corresponding effect (iii) Finally, since the considered climate networks are asso- on climate networks has not yet been described elsewhere. ciated with a continuous dynamics close to the Earth’s Specifically, as we will explain below, both types of “events” surface, high edge densities lead to very low average can lead to the emergence of marked localized structures in the shortest path lengths (up to L = O(1)). This makes L a climate network. If a sufficiently large amount of aerosols is less informative measure at high edge densitities, even if injected into the stratosphere in the course of a strong volcanic the basic signature described in Sec. III C is still present eruption, it can eventually stay there for a relevant period of at ρ = 0.15. time (depending on the specific conditions) leading to a large- scale temporary co-evolution of SAT variability in terms of Conclusively, lower edge densities are generally at least as a common cooling trend over a possibly large region. Due informative as higher ones, while the former rely on proba- to the corresponding relevant physical processes, this mecha- bly more reliable statistical associations (with respect to any nism requires a certain period of time. Hence, the associated kind of significance test). Information on climate dynamics signatures in the climate network properties can only be ob- becomes partially distorted at extremely low edge densities served with some delay. In this respect, strong volcanic erup- (ρ . 0.005), where the fraction of disconnected vertices be- tions can have a similar impact on the global climate system as comes non-negligible) and blurred for extremely high edge EN/LN episodes in terms of a marked co-variation of climatic densities (ρ & 0.15). observables over a relatively large part of the globe. Figure 11 demonstrates that the described effect is particu- larly well visible in the SATA network for the year 1992/93, i.e., the time period succeeding the largest stratospheric V. DISENTANGLING ENSO VARIABILITY aerosol injection of the 20th century, the Mount Pinatubo eruption in June 1991 [109], which had a distinct impact on Recent studies have revealed a distinct influence of ENSO global temperatures [110, 111]. Like strong EN/LN episodes, variability on the topological properties of SATA networks this period has been characterized by a markedly localized [21, 25–28, 57]. Here, we study the corresponding re- structure in the climate network emerging from the Philip- lationship more deeply. Specifically, we hypothesize that pines and then spreading over vast parts of South-East Asia the temporal variability of the climate network characteris- and the Western Pacific [104], which is manifested in the cli- tics discussed in Sect. III and IV is mainly determined by mate network in terms of pronounced extreme values of all the large-scale connectivity patterns associated with ENSO- considered network quantifiers (see Figs. 5, 8, 9 and 10). A related global climate episodes. similar effect on the global network characteristics can be ob- The latter hypothesis is further supported by Figs. 5, 8, 9 served following the El Chichon´ and Mount Agung eruptions and 10, where recent EN and LN episodes have been dis- in 1982 and 1963/64, respectively, the second and third largest played for a better comparison. Here, we observe a strik- injections of volcanic aerosols into the stratosphere within ing coincidence between the emergence of enhanced local- the time interval under consideration in this work. However, ized structures in the SATA network and the timing of cer- the El Chichon´ eruption approximately coincides itself with tain ENSO phases. However, a detailed inspection of these a strong EN episode (1982/83), so that the resulting vari- figures reveals that pronounced maxima (minima) of the dif- ability in the global SATA network properties cannot be un- ferent scalar network characteristics as well as the Hamming equivocally attributed to any of the two phenomena without distance do not always coincide unequivocally with EN and further detailed investigations of the associated local struc- LN episodes, as it was reported for another climate network tures. In contrast, the Mount Agung eruption clearly preceded approach [27]. For example, for the relatively strong 1990/91 a marked EN episode, resulting in a triple-peak signature in EN episode, we find no marked signature in the evolving cli- the evolving SATA network transitivity (Fig. 11) instead of mate network characteristics. In turn, some marked extreme the double-peak structure exhibited by some other EN events values of all considered measures are found in the time peri- (e.g., 1982/83 and 1997/98) or the single-peak pattern asso- ods 1988/89 and 1992/93, which have been characterized by ciated with the Mount Pinatubo eruption. We conjecture that a strong LN episode and the aftermath of the Mount Pinatubo this multi-peak structure highlights the emergence and disap- eruption, respectively. In the following, we will discuss the pearance of localized structures at different spatial locations climatological reasons for this complex behavior and demon- relatively shortly after each other. strate how the signatures in different network characteristics We emphasize that the considerable effect of strong vol- can be utilized for disentangling the signatures of different canic eruptions on the global SATA network characteristics types of ENSO phases. has not been studied elsewhere so far, i.e., the aforementioned 15

1983 different for EN and LN episodes and volcanic eruptions. On 1961 the one hand, volcanic activity with a considerable strato- spheric aerosol injection results in a consistent regional cool- ing trend due to reduced solar insolation, inducing generally stronger spatial correlations within a confined region. On the other hand, both extreme phases of ENSO variability (i.e., EN and LN) lead to a synchronization of variability within large areas of the globe (due to some internal dynamics of the coupled atmosphere-ocean system [92]). As a consequence of their similar signatures in the SATA network, these differ- ent types of events apparently cannot be distinguished by just studying individual network characteristics. However, consid- ering the temporal variability of a variety of complementary climate network characteristics provides a more holistic pic- ture than earlier works focusing on one parameter only [25– 28, 45, 46]. Notably, this conceptual idea could be important for the general understanding of the potentials of the climate network approach. Specifically, regarding the particular prob- lem of disentangling the signatures of ENSO variability in the Mount Pinatubo El Chichon Mount Agung evolving SATA network properties, the simultaneous study of multiple characteristics has the potential to identify some gen- eral mechanisms. Further methodological improvements such as the consideration of more sophisticated statistical associ- ation measures of time series (e.g., [80, 81, 112]) remain a subject of future work. FIG. 11: (Color online) Evolving SATA network transitivity T and Hamming distance Ht,t−1 for the same setting as in Fig. 5. In ad- dition, the global mean temperature anomalies T¯, the Nino˜ 3.4 in- C. Discriminating different types of El Nino˜ episodes dex (base period: 1948-2010, vertical lines represent thresholds of ◦ ±0.4 C as an indication of EN and LN episodes [105], respectively), Beyond our previous considerations, recent research pro- and the stratospheric aerosol optical depth SOD (i.e., the global vided considerable evidence that there are actually two qual- monthly mean optical thickness at 550 nm wavelength, cf. [106]) are shown. Vertical bars indicate fall and winter seasons (SON- itatively different types of EN episodes [107, 108, 113, 114]. DJF) with maximal intensity of EN (red) and LN (blue), respectively. On the one hand, many EN phases follow the traditional EN Symbols (bullets, shifted towards the top: CT; asterisks: WP) in- pattern with strong positive sea-surface temperature anoma- dicate EN episodes unambigously classified in the literature (e.g., lies starting in the Eastern Equatorial Pacific and then suc- [107, 108]). Note that there is no 1:1 correspondence between the cessively propagating westward. This class of events partic- shown definition of EN/LN based on the Nino˜ 3.4 index and other ularly includes the two strongest EN episodes (1997/98 and methods. 1982/83) recorded in the time period studied in this work with respect to the Nino˜ 3.4 index [95, 96] as well as the 1972/73 EN episode [107, 108]. On the other hand, over the results offer new directions for further research and serve as last decades there has been an increasing number of EN-like an additional proof of the usefulness of the climate network phases which are characterized by large sea-surface tempera- approach in general. ture anomalies in the Central Pacific (but smaller ones in the Eastern Pacific), including the 1990/91, 1994/95, 2002/2003 and 2004/05 EN episodes. The latter type of events has been B. El Nino˜ vs. La Nina˜ phases referred to as dateline El Nino˜ [113] or El Nino˜ Modoki [114] by different authors. Here, we adopt the terminology used by In addition to volcanic eruptions, we find that both EN and Kug et al. [108] distinguishing between Cold Tongue (CT) LN episodes can cause comparable (or even higher) peaks in and Warm Pool (WP) episodes corresponding to the tradi- the different evolving climate network properties, since they tional EN pattern centered in the Eastern Pacific (EP) and the can also be associated with certain localized structures in the Central Pacific (CP) pattern [107], respectively. climate network. Specifically, Palusˇ et al. [44] reported a As a novel aspect not covered in previous research, we will confinement of these structure to the tropical Pacific for EN, discuss the signatures of the two aforementioned EN types but an extension to all tropical areas during LN phases. These in the global SATA network characteristics next. Indeed, we findings relating to geographical aspects not explicitly studied are able to identify some distinctive features of the climate in this work appear largely consistent with our results. network associated with CT/EP episodes: We emphasize that the physical mechanisms beyond the First, peaks of L, T and H perfectly coincide with the tim- emergence of localized structures are, however, completely ing of the CT/EP episodes, which is not the case for WP/CP 16 events (cf. Figs. 5 and 11) where such peaks are widely absent. Second, for strong CT/EP episodes the Hamming distance H fluctuates at a high level for more than one year, indicat- ing a persistent redistribution (i.e., fluctuations or “blinking”, cf. [27]) of edges. For WP/CP episodes such fluctuations are present as well, but exhibit a considerably smaller amplitude and shorter duration. With respect to the known recent his- tory of ENSO, we note that the three strongest unambigously classified CT/EP episodes (1972/73, 1982/83 and 1997/98) within the studied time interval have been directly followed by considerable LN phases (i.e., the associated Nino˜ 3.4 in- dex exhibits a particularly marked drop from strongly positive to strongly negative values, which is not as strong for WP/CP events). This sudden shift in the ENSO phase enhances the proposed mechanism of a sustained rewiring in the evolving climate networks during these periods. Notably, L, T and H also show pronounced maxima dur- ing some strong LN episodes (this applies to both “isolated” LN episodes as in 1988/89 and LN phases directly following an CT/EP episode – cf. the double-peak pattern of the corre- sponding maxima in Figs. 5 and 8), which is due to the emer- gence of (though a possibly different type of) localized struc- tures in the climate network. However, for other LN phases, FIG. 12: Linear cross-correlation functions C(s) between the evolv- such peaks are absent. This observation suggests the existence ing SATA network characteristics transitivity T and Hamming dis- of two different climatological mechanisms, which could re- tance Ht,t−1 (obtained based on lag-zero cross-correlation with fixed ¯ sult in a classification of LN episodes in a similar way as for ρ = 0.005), the global mean temperature anomaly T , the absolute EN phases. We leave a more detailed investigation and dis- value of the Nino˜ 3.4 index, and the SOD index. cussion of this idea for future work. The distinctively different behavior of SATA networks dur- (e.g., [95, 108, 113–123]). ing the two types of EN episodes can probably be explained by reconsidering the associated typical spatio-temporal pat- terns. On the one hand, CT/EP episodes exhibit a relatively sharp, regionally confined pattern leading to a common SAT D. Network characteristics and climate-related indices trend starting from the Eastern Equatorial Pacific and then propagating westward. In this spirit, the spatio-temporal sig- In order to further support our previous results, the cross- nature in the SAT field resembles a wave travelling through correlation functions between the evolving climate network the Equatorial Pacific from East to West, leading to a succes- characteristics T and Ht,t−1 on the one hand, and the global sive synchronization of tropical climate variability over an in- average temperature anomalies T¯, the absolute value of the creasingly large region. On the other hand, the typical pattern Nino˜ 3.4 index as well as the stratospheric aerosol optical of WP/CP episodes commonly appears like a diffuse pulse depth SOD (as indicators of ENSO and volcanic activity, re- spreading from one region in the Central Equatorial Pacific spectively) on the other hand have been computed (Fig. 12). into different directions. The associated EOF patterns display Notably, we do not find any systematic effect of the average more fuzzy spatial structures and are less well localized in temperature anomalies on the evolving climate networks. This space than those of CT/EP episodes [107, 108, 114]. Due indicates that the general global warming trend is not directly to this spatio-temporal footprint, the spatial correlations in reflected in the corresponding network properties. However, the Equatorial Pacific change their magnitude in a much less such trends are practically only relevant for time scales clearly coordinated and marked way than under the influence of the above the window sizes of one year studied in this work. In sharper pattern associated with CT/EP episodes. As a conse- turn, dynamical characteristics such as captured by evolving quence, there is only a relatively minor redistribution of con- climate network analysis reveal signatures that go clearly be- nectivity in the SATA network, explaining the weaker signa- yond the behavior of global mean temperatures. tures in the considered network characteristics. Regarding the impact of volcanic activity, the obtained re- Notably, at this point we are not able to give a complete sults demonstrate a considerable influence on the SATA net- classification of EN episodes based on our complex network work topology. Despite recent findings suggesting a possible characteristics. This is particularly due to the fact that there effect of climatic processes on volcanic activity on longer time are EN episodes of mixed characteristics already known in scales [124, 125], we can practically rule out a significant cli- the literature, so that there is no unique reference for classifi- matic forcing of volcanism at the time scales considered in cation. Moreover, the definition of EN itself is partly ambigu- this work. Specifically, the network transitivity shows similar ous and depends on the specific method or index of choice variations as the SOD index with a delay of about 8 months, 17 which is in reasonable agreement with the typical lifetime of this results in the emergence of marked localized structures, volcanic particles in the stratosphere and the expected delay for ENSO particularly in the tropical Pacific. Note that the of SAT changes due to a reduction of solar irradiation. Note, results presented in this work go significantly beyond those however, that the SOD index only shows some distinct events of previous research. As a main new achievement, we can and remains close to 0 for most of the time. In this case, linear not only clearly distinguish between the signatures of differ- cross-correlation is not the best-suited measure for character- ent ENSO phases, but also differentiate different types of EN izing the co-occurrence of volcanic eruptions and peaks in the and LN episodes, which has not yet been possible by other SATA network characterics. In contrast, event-based charac- climate network approaches. Specifically, we have developed teristics such as event synchronization [41, 42, 126] or coinci- some initial understanding of similarities and differences be- dence analysis [14] are tailored for such purposes, but require tween the climate network reflections of physical mechanisms a larger number of events than those recorded in the studied acting during strong volcanic eruptions and different types of data sets. EN and LN episodes. More interestingly, cross-correlation analysis reveals con- Beyond the specific consideration of ENSO variability, our siderable interdependences between the SATA network prop- results have led to a substantially improved understanding of erties and the absolute value of the Nino˜ 3.4 index, character- the structures present in climate networks based on surface air izing the deviation of sea-surface temperature anomalies from temperatures. As a general finding, we have proposed a new the standard values in some defined region of the Pacific as- simple conceptual view of the climate network based on the sociated with ENSO. From Fig. 12, we find that network tran- alternating presence of different types of structures: the sub- sitivity T and Hamming distance Ht,t−1 show distinct max- strate lattice mainly capturing short-range connections versus ima preceding the peak amplitude of EN and LN by 6 and enhanced localized structures (i.e., densely connected parts of 5 months, respectively, with correlation values of about 0.4. the network covering larger spatial scales). In this respect, the Moreover, secondary maxima of the cross-correlation func- temporal variability of the climate network topology can be tions with smaller amplitude are found 8 and 5 months af- understood as an effect of a persistent redistribution of con- ter the correspondent peaks of the ENSO index, respectively. nectivity between these different types of structures. While we can unequivocally attribute this finding for the Ham- We note that our approach is distinctively different from ming distance as resulting from the largest rate of redistri- those previously used by other authors in the sense that we bution of connectivity within the SATA network, the corre- have considered a multiplicity of comprehensive measures sponding signature of T seems to rather relate to the pres- from complex network theory. Only this consideration of ence of common SAT trends over a substantial region associ- complementary characteristics allowed deriving a holistic un- ated with both the emergence and disappearance of localized derstanding of the underlying dynamical processes. Moti- structures in the Equatorial Pacific and beyond. (In turn, for vated by its successful application, we suggest further using the SOD index there exists only one maximum, indicating not only the global characteristics of climate networks as stud- that the disappearance of the associated characteristic struc- ied in this work, but also the associated spatial patterns of tures in the SAT field behaves fundamentally different than (both topological and geometric) vertex properties (i.e., in- for EN/LN episodes.) It will be subject of future studies to formation on the placement of edges in physical space) for what extent this information, in combination with the distinct future investigations on the detailed spatial backbones of dif- temporal variability profiles of different SATA network mea- ferent climate episodes. Initial results in this direction can sures, does not only provide important novel insights into the be found in [104]. We conjecture that this evolving network function of the climate system in general, but can be specif- approach has great potentials for supplementing other studies ically exploited for anticipating or even predicting type and based on traditional methods of multivariate statistics such as strength of approaching EN and LN episodes [61]. EOF analysis. As underlined by our analysis, different ENSO phases have a distinct impact on the spatial organization of the global cli- VI. CONCLUSIONS mate system. Since ENSO is a coupled atmosphere-ocean phenomenon, we additionally suggest the consideration of In this work, we have introduced a novel viewpoint on inter- complementary climatological observables (e.g., geopotential annual climate variability in terms of evolving climate net- height, sea-surface temperatures, sea-level pressure, etc.) in work analysis, i.e., studying the variation of a set of comple- corresponding future analyses. A methodological extension mentary global climate network characteristics with time. Our that is particularly tailored for such investigations are coupled analysis has provided deep new insights into the functionality climate networks [32, 49]. of the global climate system and impacts of different types of climate episodes, particularly such related to ENSO variabil- ity. Our findings particularly highlight the effect of “classi- Acknowledgments cal” El Nino˜ (EN) and some La Nina˜ (LN) episodes as well as very strong volcanic eruptions on the global climate sys- This work has been funded by the Leibniz association tem. Specifically, all three types of “events” lead to a common (project ECONS), the Federal Ministry for Education and large-scale temperature trend (i.e., some kind of synchroniza- Research (BMBF) via the Potsdam Research Cluster for tion) over a considerably large region. In the climate network, Georisk Analysis, Environmental Change and Sustainabil- 18 ity (PROGRESS), the German Research Foundation (DFG their website at http://www.esrl.noaa.gov/psd/. project “Interactions and complex structures in the dynam- We thank Norbert Marwan and Kira Rehfeld for inspiring dis- ics of changing climate”), the German Federal Environmen- cussions and support, and Roger Grzondziel and Ciaron Lin- tal Agency, and the EU FP7 projects SUMO, LINC and stead for help with the IBM iDataPlex Cluster at the Pots- MEDIATION. AR, JR and JFD acknowledge financial sup- dam Institute for Climate Impact Research. Complex network port of the German National Academic Foundation. JR measures have been calculated using the software packages was partially supported by the German Environmental Foun- igraph [127] and pyunicorn [128]. Finally, we are in- dation. NCEP reanalysis data have been provided by the debted to two anonymous referees for their insightful com- NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, from ments on an earlier version of this paper.

[1] R. Albert and A.-L. Barabasi,´ Rev. Mod. Phys. 74, 47 (2002). [29] R. V. Donner, T. Sakamoto, and N. Tanizuka, in Nonlinear [2] M. E. J. Newman, SIAM Rev. 45, 167 (2003). Time Series Analysis in the Geosciences: Applications in Cli- [3] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. matology, and Solar-Terrestrial Physics, edited Hwang, Phys. Rep. 424, 175 (2006). by R. V. Donner and S. M. Barbosa (Springer, Berlin Heidel- [4] I. Zaliapin, E. Foufoula-Georgiou, and M. Ghil, J. Geophys. berg, 2008), pp. 124–154. Res. Earth Surface 115, F00A15 (2010). [30] J. F. Donges, Y. Zou, N. Marwan, and J. Kurths, Eur. Phys. J. [5] S. Abe and N. Suzuki, Europhys. Lett. 65, 581 (2004). ST 174, 157 (2009). [6] M. Baiesi and M. Paczuski, Phys. Rev. E 69, 066106 (2004). [31] J. F. Donges, Y. Zou, N. Marwan, and J. Kurths, Europhys. [7] J. Davidsen, P. Grassberger, and M. Paczuski, Phys. Rev. E 77, Lett. 87, 48007 (2009). 066104 (2008). [32] J. F. Donges, H. C. H. Schultz, N. Marwan, Y. Zou, and [8] A. Jimenez,´ K. F. Tiampo, A. M. Posadas, F. Luzon,´ and J. Kurths, Eur. Phys. J. B 84, 635 (2011). R. Donner, Eur. Phys. J. ST 174, 181 (2009). [33] K. Steinhaeuser, N. V. Chawla, and A. R. Ganguly, SIGKDD [9] H. Vogel and K. Roth, Adv. Wat. Res. 24, 233 (2001). Explorations 12, 25 (2010). [10] A. Santiago, J. P. Cardenas,´ J. C. Losada, R. M. Benito, A. M. [34] K. Steinhaeuser, N. V. Chawla, and A. R. Ganguly, Statist. Tarquis, and F. Borondo, Nonlin. Proc. Geophys. 15, 893 Anal. Data Mining 4, 497 (2011). (2008). [35] K. Steinhaeuser, A. R. Ganguly, and N. V. Chawla, Clim. Dyn. [11] S. J. Mooney and D. Korosak,ˇ Soil Sci. Soc. Amer. J. 73, 1094 39, 889 (2012). (2009). [36] K. Steinhaeuser, N. V.Chawla, and A. R. Ganguly, in Proceed- [12] J. B. Elsner, T. H. Jagger, and E. A. Fogarty, Geophys. Res. ings of the 2010 Conference on Intelligent Data Understand- Lett. 36, L16702 (2009). ing, CIDU 2010, edited by A. N. Srivastava, N. V. Chawla, [13] J. F. Donges, R. V. Donner, K. Rehfeld, N. Marwan, M. H. P. S. Yu, and P. Melby (NASA Ames Research Center, 2010), Trauth, and J. Kurths, Nonlin. Proc. Geophys. 18, 545 (2011). pp. 16–26. [14] J. F. Donges, R. V. Donner, M. H. Trauth, N. Marwan, H. J. [37] A. Pelan, K. Steinhaeuser, N. Chawla, D. de Alwis Pitts, and Schellnhuber, and J. Kurths, Proc. Natl. Acad. Sci. USA 108, A. Ganguly, in 2011 IEEE Symposium on Computational In- 20422 (2011). telligence and Data Mining (CIDM) (2011), pp. 239–245. [15] R. V. Donner and J. F. Donges, Acta Geophys. 60, 589 (2012). [38] J. Kawale, S. Liess, A. Kumar, M. Steinbach, A. R. Gan- [16] L. Telesca and M. Lovallo, Europhys. Lett. 97, 50002 (2012). guly, N. F. Samatova, F. H. M. Semazzi, P. K. Snyder, and [17] J. O. Pierini, M. Lovallo, and L. Telesca, Physica A 391, 5041 V. Kumar, in Proceedings of the 2011 Conference on Intelli- (2012). gent Data Understanding, CIDU 2011, edited by A. N. Srivas- [18] J. H. Feldhoff, R. V. Donner, J. F. Donges, N. Marwan, and tava, N. V. Chawla, and A. S. Perera (NASA Ames Research J. Kurths, Phys. Lett. A 376, 3504 (2012). Center, 2011), pp. 30–44. [19] A. A. Tsonis and P. J. Roebber, Physica A 333, 497 (2004). [39] J. Kawale, M. Steinbach, and V. Kumar, in Proceedings of [20] A. A. Tsonis, K. L. Swanson, and P. J. Roebber, Bull. Amer. the Eleventh SIAM International Conference on Data Mining, Meteor. Soc. 87, 585 (2006). SDM 2011 (SIAM / Omnipress, 2011), pp. 107–118. [21] A. A. Tsonis and K. L. Swanson, Phys. Rev. Lett. 100, 228502 [40] M. Barreiro, A. C. Marti, and C. Masoller, Chaos 21, 013101 (2008). (2011). [22] A. A. Tsonis, K. L. Swanson, and G. Wang, J. Clim. 21, 2990 [41] N. Malik, N. Marwan, and J. Kurths, Nonlin. Proc. Geophys. (2008). 17, 371 (2010). [23] A. Tsonis, G. Wang, K. Swanson, F. Rodrigues, and L. Costa, [42] N. Malik, B. Bookhagen, N. Marwan, and J. Kurths, Clim. Clim. Dyn. 37, 933 (2011). Dyn. 39, 971 (2012). [24] A. A. Tsonis and K. L. Swanson, Nonlin. Proc. Geophys. 19, [43] N. Boers, B. Bookhagen, N. Marwan, J. Kurths, and 559 (2012). J. Marengo, Geophys. Res. Lett. 40, 4386 (2013). [25] A. Gozolchiani, K. Yamasaki, O. Gazit, and S. Havlin, Euro- [44] M. Palus,ˇ D. Hartman, J. Hlinka, and M. Vejmelka, Nonlin. phys. Lett. 83, 28005 (2008). Proc. Geophys. 18, 751 (2011). [26] A. Gozolchiani, S. Havlin, and K. Yamasaki, Phys. Rev. Lett. [45] Y. Berezin, A. Gozolchiani, and S. Havlin, Scientif. Rep. 2, 107, 148501 (2011). 666 (2012). [27] K. Yamasaki, A. Gozolchiani, and S. Havlin, Phys. Rev. Lett. [46] O. Guez, A. Gozolchiani, Y. Berezin, S. Brenner, and 100, 228501 (2008). S. Havlin, Europhys. Lett. 98, 38006 (2012). [28] K. Yamasaki, A. Gozolchiani, and S. Havlin, Prog. Theor. [47] I. Ebert-Uphoff and Y. Deng, Geophys. Res. Lett. 39, L19701 Phys. Suppl. 179, 178 (2009). (2012). 19

[48] I. Ebert-Uphoff and Y. Deng, J. Clim. 25, 5648 (2012). [79] J. Runge, Master’s thesis, Humboldt-University Berlin (2010). [49] A. Feng, Z. Gong, Q. Wang, and G. Feng, Theor. Appl. Cli- [80] J. Runge, J. Heitzig, V. Petoukhov, and J. Kurths, Phys. Rev. matol. 109, 635 (2012). Lett. 108, 258701 (2012). [50] L. Carpi, P. Saco, O. Rosso, and M. Ravetti, Eur. Phys. J. B [81] J. Runge, J. Heitzig, N. Marwan, and J. Kurths, Phys. Rev. E 85, 389 (2012). 86, 061121 (2012). [51] L. C. Carpi, P. M. Saco, O. A. Rosso, and M. G. Ravetti, [82] T. Schreiber and A. Schmitz, Physica D 142, 346 (2000). in Dynamics of Information Systems: Mathematical Founda- [83] A. C. Davison and D. V. Hinkley, Bootstrap methods and their tions, edited by A. Sorokin, R. Murphey, M. T. Thai, and P. M. application, Cambridge series on statistical and probabilistic Pardalos (Springer, 2012), vol. 20 of Springer Proceedings in mathematics (Cambridge Univ. Press, Cambridge, 1997). Mathematics & Statistics, pp. 157–173. [84] A. Barrat and M. Weigt, Eur. Phys. J. B 13, 547 (2000). [52] A. Rheinwalt, N. Marwan, J. Kurths, P. Werner, and F.-W. Ger- [85] D. J. Watts and S. H. Strogatz, Nature 393, 440 (1998). stengarbe, Europhys. Lett. 100, 28002 (2012). [86] S. Bialonski, M. Wendler, and K. Lehnertz, PLoS ONE 6, [53] I. Fountalis, A. Bracco, and C. Dovrolis, Clim. Dyn. (in press). e22826 (2011). [54] K. Steinhaeuser and A. A. Tsonis, Clim. Dyn. (in press). [87] J. Dall and M. Christensen, Phys. Rev. E 66, 016121 (2002). [55] J. Hlinka, D. Hartman, M. Vejmelka, D. Novotna,´ and [88] R. V. Donner, J. Heitzig, J. F. Donges, Y. Zou, and J. Kurths, M. Palus,ˇ Clim. Dyn. (in press). Eur. Phys. J. B 84, 653 (2011). [56] J. Hlinka, D. Hartman, M. Vejmelka, J. Runge, N. Marwan, [89] J. F. Donges, J. Heitzig, R. V. Donner, and J. Kurths, Phys. J. Kurths, and M. Palus,ˇ Entropy 15, 2023 (2013). Rev. E 85, 046105 (2012). [57] E. A. Martin, M. Paczuski, and J. Davidsen, Europhys. Lett. [90] R. W. Hamming, Bell Syst. Techn. J. 29, 147 (1950). 102, 48003 (2013). [91] D. W. Scott, Multivariate Density Estimation (John Wiley & [58] M. van der Mheen, H. A. Dijkstra, A. Gozolchiani, M. den Sons, Inc., New York, 1992). Toom, Q. Feng, J. Kurths, and E. Hernandez-Garcia, Geophys. [92] A. J. Clarke, An Introduction to the Dynamics of El Nino˜ & Res. Lett. 40, 2714 (2013). the Southern Oscillation (Academic Press, London, 2008). [59] G. Tirabassi and C. Masoller, Europhys. Lett. 102, 59003 [93] D. Maraun and J. Kurths, Geophys. Res. Lett. 32, L15709 (2013). (2005). [60] J. Deza, M. Barreiro, and C. Masoller, Eur. Phys. J. ST 222, [94] I. I. Mokhov, D. A. Smirnov, P. I. Nakonechny, S. S. Kozlenko, 511 (2013). E. P. Seleznev, and J. Kurths, Geophys. Res. Lett. 38, L00F04 [61] J. Ludescher, A. Gozolchiani, M. I. Bogachev, A. Bunde, (2011). S. Havlin, and H. J. Schellnhuber, Proc. Natl. Acad. Sci. USA [95] K. E. Trenberth, Bull. Amer. Meteor. Soc. 78, 2771 (1997). 110, 11742 (2013). [96] N. A. Rayner, D. E. Parker, E. B. Horton, C. K. Folland, L. V. [62] L. d. F. Costa, F. A. Rodrigues, G. Travieso, and P. R. V. Boas, Alexander, D. P. Rowell, E. C. Kent, and A. Kaplan, J. Geo- Adv. Phys. 56, 167 (2007). phys. Res. Atmos. 108, D4407 (2003). [63] A. R. McIntosh, C. L. Grady, L. G. Ungerleider, J. V. Haxby, [97] E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and S. I. Rapoport, and B. Horwitz, J. Neurosci. 14, 655 (1994). A.-L. Barabasi, Science 297, 1551 (2002). [64] C. Zhou, L. Zemanova, G. Zamora, C. C. Hilgetag, and [98] E. Ravasz and A.-L. Barabasi,´ Phys. Rev. E 67, 026112 J. Kurths, Phys. Rev. Lett. 97, 238103 (2006). (2003). [65] K. A. Schindler, S. Bialonski, M. T. Horstmann, C. E. Elger, [99] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes, Phys. and K. Lehnertz, Chaos 18, 033119 (2008). Rev. E 65, 066122 (2002). [66] E. Bullmore and O. Sporns, Nat. Rev. Neurosci. 10, 186 [100] G. Szabo,´ M. Alava, and J. Kertesz,´ Phys. Rev. E 67, 056102 (2009). (2003). [67] G. Zamora-Lopez,´ C. Zhou, and J. Kurths, Front. Neurosci. 5, [101]A.V azquez,´ Phys. Rev. E 67, 056104 (2003). 83 (2011). [102] S. H. Y. Chan, R. V. Donner, and S. Lammer,¨ Eur. Phys. J. B [68] M.-T. Kuhnert, C. E. Elger, and K. Lehnertz, Chaos 20, 84, 563 (2011). 043126 (2010). [103] H. Kutza, Master’s thesis, Humboldt University of Berlin [69] P. Holme and J. Saramaki,¨ Phys. Rep. 519, 97 (2012). (2012). [70] E. Kalnay, M. Kanamitsu, R. Kistler, W. Collins, D. Deaven, [104] A. Radebach, Master’s thesis, Humboldt University of Berlin L. Gandin, M. Iredell, S. Saha, G. White, J. Woollen, et al., (2010). Bull. Amer. Meteor. Soc. 77, 437 (1996). [105] K. E. Trenberth and D. P. Stepaniak, J. Clim. 14, 1697 (2001). [71] D. J. Thomson, Science 268, 59 (1995). [106] M. Sato, J. E. Hansen, M. P. McCormick, and J. B. Pollack, J. [72] M. Palus,ˇ D. Novotna,´ and P. Tichavsky,´ Geophys. Res. Lett. Geophys. Res. Atmos. 98, 22987 (1993). 32, L12805 (2005). [107] H.-Y. Kao and J.-Y. Yu, J. Clim. 22, 615 (2009). [73] D. Rybski, A. Holsten, and J. P. Kropp, Physica A 390, 680 [108] J.-S. Kug, F.-F. Jin, and S.-I. An, J. Clim. 22, 1499 (2009). (2011). [109] M. P. McCormick, L. W. Thomason, and C. R. Trepte, Nature [74] J. Heitzig, J. F. Donges, Y. Zou, N. Marwan, and J. Kurths, 373, 399 (1995). Eur. Phys. J. B 85, 38 (2012). [110] A. Robock and J. P. Mao, J. Clim. 8, 1086 (1995). [75] S. Bialonski, M.-T. Horstmann, and K. Lehnertz, Chaos 20, [111] D. E. Parker, H. Wilson, P. D. Jones, J. R. Christy, and C. K. 013134 (2010). Folland, Int. J. Climatol. 16, 487 (1996). [76] R. Heikes and D. A. Randall, Monthly Weather Rev. 123, 1862 [112] B. Pompe and J. Runge, Phys. Rev. E 83, 051122 (2011). (1995). [113] N. K. Larkin and D. E. Harrison, Geophys. Res. Lett. 32, [77] P. W. Jones, A user’s guide for SCRIP: A spherical coordinate L13705 (2005). remapping and interpolation package, Los Alamos National [114] K. Ashok, S. K. Behera, S. A. Rao, H. Weng, and T. Yamagata, Laboratory, Los Alamos, New Mexico, USA (1998). J. Geophys. Res. Oceans 112, C11007 (2007). [78] K. Rehfeld, N. Marwan, J. Heitzig, and J. Kurths, Nonlin. [115] G. N. Kiladis and H. van Loon, Monthly Weather Rev. 116, Proc. Geophys. 18, 389 (2011). 120 (1988). 20

[116] H. H. Hendon, E. Lim, G. Wang, O. Alves, and D. Hudson, mann, A. Zech, J. Feldhoff, A. Rheinwalt, H. Kutza, A. Rade- Geophys. Res. Lett. 36, L19713 (2009). bach, et al., Geophys. Res. Abstr. 15, 3558 (2013). [117] S.-W. Yeh, J.-S. Kug, B. Dewitte, M.-H. Kwon, B. P. Kirtman, [129] P. J. Laurienti, K. E. Joyce, Q. K. Telesford, J. H. Burdette, and F.-F. Jin, Nature 461, 511 (2009). and S. Hayasaka, Physica A 390, 3608 (2011). [118] H.-M. Kim, P. J. Webster, and J. A. Curry, Science 325, 77 [130] We adopt the notation X• and X•• throughout the paper (2009). whenever we do not refer to a specific vertex, edge, or pair [119] M. J. McPhaden and X. Zhang, Geophys. Res. Lett. 36, of vertices when discussing a certain property X. L13703 (2009). [131] Similar considerations apply, for example, to the case of com- [120] W. Kim, S.-W. Yeh, J.-H. Kim, J.-S. Kug, and M. Kwon, Geo- plex networks obtained from time series of the same system, phys. Res. Lett. 38, L15809 (2011). but with different control parameters, or different parts of the [121] Z.-Z. Hu, A. Kumar, B. Jha, W. Wang, B. Huang, and same time series [13]. B. Huang, Clim. Dyn. 38, 2017 (2012). [132] We note that there is no generally accepted criterion for net- [122] H.-F. Graf and D. Zanchettin, J. Geophys. Res. Atmosph. 117, work sparsity so far. In the context of growing networks, ρ . D01102 (2012). O(N −1) is often considered as sparse, which is supported by [123] K. Miyakoda, A. Cherchi, A. Navarra, S. Masina, and the fact that self-organizing networks in various fields show a J. Ploshay, J. Meteorol. Soc. Japan Ser. II 90, 35 (2012). tendency of an approximate scaling behavior ρ ∼ N −1 [129]. [124] S. Kutterolf, M. Jegen, J. X. Mitrovica, T. Kwasnitschka, However, in our case of a fixed network size, these considera- A. Freundt, and P. J. Huybers, Geology 41, 227 (2013). tions may not apply, so that we distinguish between sparse and [125] B. McGuire, Phil. Trans. R. Soc. A 368, 2311 (2010). dense networks using more heuristic and context-dependent [126] R. Quian Quiroga, T. Kreuz, and P. Grassberger, Phys. Rev. E considerations. Notably, given our network size N, we would 66, 041904 (2002). have ρ ≈ 0.0008, which is about one order of magnitude be- [127] G. Csardi´ and T. Nepusz, InterJournal CX.18, 1695 (2006). low the operational range used in our basic setting. [128] J. F. Donges, J. Heitzig, J. Runge, H. C. Schultz, M. Wieder-