Aristotle University of Thessaloniki Faculty of Sciences Mathematics Department

Pest- time series analysis and the Development of Population Causal networks

Petros Damos

Dissertation for the fulfilment of the M.Sc. Degree in Web Science and Bio-Networks

Veroia 2012 Population Networks Petros Damos

The dissertation was completed in the Department of Mathematics at Aristotle University of Thessaloniki, defended before and approved by the following members of the scientific committee:

Dimitris Kugiumtzis (Thesis Advisor) Mathematics Department, Faculty of Science Aristotle University of Thessaloniki

Stefanos Sgardellis (Member) Biology Department Faculty of Science Aristotle University of Thessaloniki

John Halley (Member) Biological applications and Technology Department University of Ioannina


Contents

Preface
Abstract
Περίληψη
1. Introduction
1.1 Graph theory and ecological networks
1.2 Structural and Causal networks
1.3 Transforming Time Series into Complex Networks
1.3.1 Cycle networks
1.3.2 Recurrence networks
1.3.3 Correlation and causal networks
1.4 Principal concepts of Population dynamics
1.4.1 Deterministic population models
1.4.2 Stochastic population models
1.4.3 Population emergence and spatial synchronisation
1.5 Scope of the dissertation
2. Study system and population dynamics
2.1 Species
2.1.1 The summer fruit tortrix Adoxophyes orana
2.1.2 The peach twig borer Anarsia lineatella
2.1.3 The oriental fruit moth Grapholitha molesta
2.2 Sampling sites
2.3 Species monitoring and data registration
2.4 Moth population and weather data time series
3. Stochastic modelling of insect population cycling and seasonality
3.1 Problem setting and solving algorithm
3.2 Basic background on univariate time series analysis
3.2.1 Moth population dynamics regarded as a stochastic process
3.2.2 Autocorrelation and partial autocorrelation
3.2.3 Autocorrelation in the frequency domain and power spectra
3.2.4 Autoregressive models
3.2.5 Model comparison and validation
3.5 Results
3.5.1 Seasonality and population feedbacks
3.5.2 Spectral analysis
3.5.3 Parameter optimisation
3.5.4 Diagnostic checking and residual error analysis
3.5.5 Model validation
3.6 Discussion
4. Multivariate moth population analysis and causal networks
4.1 Ecological networks and state of the art
4.2 Weighted and binary causal network construction algorithm
4.2 Basic background on multivariate time series analysis
4.2.1 Cross correlations
4.2.2 Partial correlations
4.2.3 Granger causality measures - preliminaries
4.2.4 The Granger Causality Index (GCI)
4.2.5 The Conditional Granger Causality Index (CGCI)
4.3 Time series networks
4.3.1 Correlation networks and undirected links
4.3.2 Causality networks and directed links
4.3.3 Graph theoretic network measures
4.3.3 Ecological network analysis and standard graph metrics
4.6 Results
4.6.1 Cross and partial cross correlation networks
4.6.2 Granger causality and conditional Granger causal networks
4.6.3 Causal force directed network layouts
4.6.4 Graph theoretic metrics
4.6.5 Landscape topology of moth population networks
4.7 Discussion
5. Concluding Remarks
5.1 Population cycling and seasonality
5.2 Moth population causal networks
5.3 Population networks in advancement of pest information systems
References


Preface

In this dissertation we use multivariate time series analysis combined with graph theory to describe moth population dynamics and their causal relations. From a phenomenological perspective, the spatiotemporal structure of the insect populations is treated as a dynamical (physical) system, and efforts are made to elucidate the mechanisms that could explain how its elements function and why they are arranged in a particular network form. From a population ecology perspective, efforts are made to draw inferences from the experimental results that are useful in pest management and can improve web-based information systems.

The author would like to express his most sincere thanks to those who helped him work through the above topics, and especially to Assoc. Prof. D. Kugiumtzis, who was instrumental in enabling this work, in particular by introducing several topics of multivariate time series analysis, such as the concept of causality, which exploits the natural ordering of variables. Thanks are also due to Prof. S. Sgardellis for the interesting discussions we had concerning some current issues of ecological networks, and to Prof. J. Halley for being a member of the advisory board and evaluating the current work. The author would also like to thank Prof. I. Antoniou, director of the Post Graduate Web Science Program of the Mathematics Department at Aristotle University of Thessaloniki, for his inspirational lectures and discussions on the phenomenological perspectives of network functioning in complex systems. The author also acknowledges the help provided by the agronomists of the public confederation ALMME® in collecting part of the data used to illustrate some representative region-specific ecological networks. Any errors that remain in the current work are my sole responsibility.


Abstract

Over the last years there has been growing interest in graphical models, and in particular in those based on directed graphs, as a general framework to describe and infer causal relationships. In this dissertation we consider stochastic modelling of moth population time series and multivariate causal network analysis. Our starting points are stochastic autoregressive population models, and in particular the use of ARMA(p,q) and SARMA(p,q)x(P,Q)S models in describing the moth phenology of three closely related pest species. This is followed by a presentation of the model-fitting results and a discussion of the heuristic benchmarks used to assess the predictive performance of the models. The significant feedbacks estimated by these models formed the basis for the multivariate causal analysis that follows. Assuming that the ecological variables observed at successive time points constitute a biological system, the system is represented by weighted and binary networks in which significant connections are identified by examining correlation and Granger causality indices. In this manner it was possible to identify the direction of the edges (connections), which represent significant interactions among the ecological time series, the latter being the nodes. Based on the empirical data of spatially distributed moth populations, the approach is further illustrated by generating force-directed layouts. The resulting networks were finally projected onto the landscape topology to infer region-specific population interactions and to detect population ''hot spots''. Modularity is detected by the graphical models as well as by estimating graph theoretic metrics.
Empirical results are presented and in most cases found to be consistent with the theoretical hypothesis that particular locations (nodes) in the study regions are more active, in the sense that populations either appear regularly at high levels (hot spots) and/or point to nearby locations in which high population activity usually follows. Moreover, interactions are observed only among populations that belong to the same species; such intraspecific life cycle synchronisation could be related to regional interactions among subpopulations that improve mating success and host allocation by the species. The resulting graphical models provide means to describe the spatio-temporal arrangement of pest populations. This information is useful in improving pest management options as part of a wider web-based pest information and forecasting expert system.

Keywords: Ecological networks, graph theory, stochastic population models, Adoxophyes orana, Anarsia lineatella, Grapholitha molesta, pest forecasting, spatial population interactions


Περίληψη (Summary)

This work deals with the construction of biological networks following the analysis of multivariate time series of meteorological indices and insect populations. Specifically, the relationship between populations of lepidopteran crop pests is examined using data collected over the last decade from representative areas of the prefecture of Imathia. The aim was to investigate spatial interactions among the biotic (population) variables, as well as the effect of temperature and relative humidity on the populations. First, the description of population emergence is studied by means of linear autoregressive stochastic models, with the aim of investigating periodicity in the emergence. The existence of periodicity in particular is an open question, while it is also a basic prerequisite for the development of multivariate stochastic models and for the construction of the biological networks that follows. Here it is addressed by including in the investigation linear autoregressive ARMA(p,q) models and seasonal SARMA(p,q)x(P,Q)S models. The investigation proceeds stepwise, first of the periodicity, then of the appropriate order of the ARMA model and finally of the SARMA model, using the autocorrelation and partial autocorrelation functions as well as information criteria. The subsequent construction of the networks was based on parametric and randomization significance tests as well as on Granger causality measures. According to the proposed methodology, each time series was represented as a node, while the significant causal relations defined the links. The results of the analysis are summarised by the topological depiction of weighted and unweighted graphs whose nodes are abiotic and biotic variables. According to the results of the present analysis, there is no spatial interaction between different insect species, but only among populations that belong to the same species.
Although this phenomenon may be due to various factors, the synchronisation of the population dynamics of the same species in adjacent areas could provide advantages regarding the probability of successful reproduction and the exploitation of the available resources by the species. The results of this work also contain information on the geographical areas (nodes) of 'high importance' and on their interconnectivity as a function of the geographical relief of the observation area. The investigation of spatial interactions among pest insect populations, combined with the use of phenological models, forms the basis for the development and/or optimisation of a wider web-based crop pest forecasting system, providing valid forecasts of geographical spread with the aim of timely and targeted pest management.

Keywords: Ecological networks, graph theory, stochastic population models, Adoxophyes orana, Anarsia lineatella, Grapholitha molesta, forecasting system, spatial population interactions


1. Introduction

1.1 Graph theory and ecological networks

The simplest representation of a system can take the form of a graph, in which its components are represented by nodes and its interactions by edges [Bonnington and Little 1995, Bondy and Murty 2008]. The interactions can be represented algebraically by adjacency matrices. From a statistical standpoint, the degree of relation between nodes and edges corresponds to a specific probability density function. The two approaches can be combined using basic principles of combinatorial mathematics. The representation of a network using mathematical formalisms has significantly facilitated the study of complex systems, which uses such topological descriptions in an attempt to understand and characterise their structural behaviour. Such features typically include small-world properties [Watts and Strogatz 1998], scale-free degree distributions and hierarchical modularity [Barabasi and Albert 1999], or degree distributions that follow power laws [Zhang 2009] (e.g. Figure 1.1). In principle, complex systems are depicted graphically as complex networks of passive or active (i.e. mutually interacting) subsystems. Here an undirected, usually unweighted, complex network G is introduced, consisting of n vertices and e edges, which is conveniently represented by the binary adjacency matrix A, where Ai,j = 1 if vertex i connects to vertex j, and Ai,j = 0 if the edge (i, j) does not exist [Bondy and Murty 2008]. However, deeper studies involve detailed statistical descriptions to reveal the topological features of a network, and since this is a new area of research, a variety of new statistical measures have lately been suggested [Albert & Barabasi, 2002; Newman, 2003; Costa et al., 2007].
These statistical measures are applied to quantify the properties of complex networks in various scientific disciplines in order to understand the structure and dynamics of such networks [Wang and Chen, 2002; Boccaletti et al., 2006]. Additionally, one can treat transition probabilities between discrete states as links and construct dynamic transition networks. In the search for unifying network patterns and dynamics, the advent of computers and sophisticated algorithms has provided effective means to study very complex networks [Borgatti & Everett 1997; Strogatz 2001; Barabasi et al. 2002; Dorogovtsev et al. 2003; Montoya et al. 2006]. Based on these approaches, numerous applications of complex networks have been considered in the literature, including general network representations [Pardalos et al. 1997, Amaral et al., 2000; Guimera et al., 2005], the derivation of network structures from data on social interactions [Freeman, 1979], the assessment of functional connectivity in the brain from spatially distributed (multi-channel) neurophysiological measurements [Zhou et al., 2006, Parker 2006], the construction of complex network representations for engineering problems [Ceder and Wilson 1986, Onnela et al. 2007], the study of continuous systems such as atmospheric dynamics [Donges et al., 2009], and more [Aebersold et al. 2003, Antoniou and Tsompa 2008].

With an increasing emphasis on studying system processes in the life sciences, biological networks have recently gained much attention [Alon 2007]. In biology, the graph theoretic network approach includes, among others, the study of protein interactions, metabolic networks and gene regulatory networks [Barabasi and Oltvai 2004, Fu et al. 2006, Vitkup et al. 2006, Alon 2007, Greenberg et al. 2008, Leclerc 2008]. Since most biological systems are characterised by complexity, this has led to the development of computational and mathematical techniques that allow the modelling of biological networks and the investigation of specific patterns that characterise a system (Figure 1.1).
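As a minimal sketch of the binary adjacency matrix A described above, the following Python fragment builds A for a small undirected, unweighted graph and derives the vertex degrees from it. The four-vertex edge list and the helper name `adjacency_matrix` are illustrative assumptions and do not come from the dissertation data.

```python
import numpy as np

def adjacency_matrix(n_vertices, edges):
    """Binary adjacency matrix of an undirected, unweighted graph."""
    A = np.zeros((n_vertices, n_vertices), dtype=int)
    for i, j in edges:
        A[i, j] = 1
        A[j, i] = 1  # undirected graph: the matrix is symmetric
    return A

# Hypothetical example graph with 4 vertices and 4 edges
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
A = adjacency_matrix(4, edges)

# The degree of a vertex is the number of edges attached to it (row sum of A)
degree = A.sum(axis=1)
print(A)
print(degree)
```

The degree sequence obtained this way is the raw material for the degree distributions (scale-free, power law) mentioned in the text.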


Figure 1.1 Typical examples of network structures. a) Random, b) Small world [after Strogatz 2001], and c) Scale-free networks. Lines indicate links between nodes (circles) in a network. The greater the number of links to/from a node, the darker the node [adopted from Karlsson 2007].

Recently, network theory has also been extended to the modelling of complex ecological entities, such as food webs or mutualistic plant-animal interactions [Bascompte et al 2003, Guimaraes et al. 2006, Olesen et al. 2006, Bascompte et al 2006, Bascompte and Jordano 2007]. In most of these studies, graph theory has been used to depict the relations within the ecological network of interest and to support some sort of biological inference. In principle, most of these large-scale investigations depict static relations among members of the food webs and are used in comparative analyses. Traditionally, ecological networks are divided into the following broad types¹: food webs, mutualistic networks and host-parasitoid interaction networks.

¹ We are here mostly concerned with networks from traditional population and community studies [Elton 1927, MacArthur 1955] and not with 'whole ecosystem' approaches, which additionally emphasise energy fluxes, including biomass and nutrient cycles, rather than taxonomic units [Lindeman 1942, Odum 1953].

From a population and community standpoint, food webs focus on trophic links among organisms, particularly predator-prey relations, as well as primary consumer-basal resource feeding relationships [Hall and Raffaelli 1993]. Moreover, food webs can be subdivided into community webs, which include all links among organisms in a defined community, and smaller source or sink webs [Ings et al. 2008]. The incorporation of abundance data results in trivariate webs, which are characterised by recurrent patterns [Müller et al. 1999]. Apart from the trophic network structure, trivariate webs provide means to infer energy fluxes. The general flow pattern of such food webs runs from many small individuals at the base of the web upwards to larger, rarer species at the top, with a concentration of resources into a progressively smaller number of nodes [Cohen 2003, Ings et al. 2006].

On the other hand, mutualistic networks do not aim to describe population dynamics or energy fluxes per se, but define the nexus of ecosystem services such as pollination and seed dispersal. Among the mutualistic networks studied so far, three systems seem to have received much attention [Ings et al. 2006, after modification]:

i. Pollination networks; these use graphical models to map interactions between plants and their animal pollinators (mostly insects and birds). Here the plants provide food and the animals provide pollination services.
ii. Frugivore networks; these study interactions between plants and animal seed dispersers. Here, in return for food, the animals facilitate plant dispersal.
iii. Ant-plant networks; these focus on the relations between ants and the plants that provide them with food sources and/or domatia, while the ants provide protection for the plants (i.e. against herbivorous insects).
iv. Host-parasitoid interaction networks; these could be regarded as a special case of food webs, since they concentrate on the special type of 'predator-prey' feeding relationship between parasitoids and their hosts.
These systems are particularly well suited to the description of quantitative networks, in which populations and interactions can be expressed in the same units (individuals/m²) [van Veen et al. 2006]. Figure 1.2 presents a commonly observed greenhouse community, which consists of two pest species (a thrips and a mite) and four beneficial species (one predatory insect and three predatory mites).


Figure 1.2 Part of the artificial food web on cucumber in greenhouses. Shown are two pest species, the western flower thrips (F. occidentalis) and the two-spotted spider mite (T. urticae), and the natural enemies used to control them. Natural enemies of thrips are the predatory mite N. cucumeris and the generalist predatory bug O. laevigatus. The predatory mites P. persimilis and N. californicus are used to control spider mites. Arrows indicate direct effects between members of different trophic levels (i.e. predation and herbivory) [after Venzon et al. 2001].

v. Eco-population causal networks; here we introduce a special kind of network. In particular, the term eco-population causal network is used throughout this work to refer to a special type of mixed network in which nodes represent any kind of biotic and abiotic variables. Conceptually, the eco-population network should be considered as a 'conceptual-bipartite' graph between clusters of biotic variables and abiotic variables. However, the degree of interaction and its direction are not given a priori, as in previous studies, but are defined by statistical hypothesis testing. The structure and related complexity of such a network are expected to be directly related to the number of variables introduced as well as to their significant interactions. This topic is covered in detail in Chapter 4.

Thus, we do not separate the two fundamental conceptual components of an ecosystem, namely the biotic and abiotic variables. In addition, nodes correspond to populations instead of individual units. These characteristics mark a departure from most ecological network approaches. Nevertheless, beyond simple measures of connectance or link-species richness relations, there are several other major structural properties of ecological networks that are worth studying. For instance, graph theoretic measures, such as the distribution of links among species, are examined to infer network stability and how robustness is affected under different regimes. According to the seminal work of May [May 1972, 1973], stable complex ecological networks are characterised by the following mathematical condition:

$$ I \, (S\,C)^{1/2} < 1 \qquad (1.1.1) $$

where S is the number of species (nodes), I is the mean interaction strength between connected species and C is the connectance, which corresponds to the number of realized links l among those possible, so that:

$$ C = \frac{l}{S^{2}} \qquad (1.1.2) $$

Additionally, graphical depictions provide information on how population communities are structured. Figure 1.3, for instance, depicts four types of ecological network structures [Ings et al. 2006]. In particular, network A is clustered, having within-chain omnivory, in contrast to network B, which is not clustered (e.g. predators 3 and 6 feed on different food chains). Network C is nested, since the diet of consumer 8 is a subset of that of consumer 7, which, in turn, is a subset of consumer 6's diet, whereas network D is not nested, since the diets of consumers 6, 7 and 8 overlap equally.
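The stability condition can be checked numerically. The sketch below assumes the commonly cited form of May's criterion, I·(S·C)^(1/2) < 1, with connectance taken as C = l/S²; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math

def connectance(n_links, n_species):
    """Connectance C = l / S^2: realized links among those possible."""
    return n_links / n_species ** 2

def is_may_stable(n_species, mean_strength, c):
    """May's criterion (assumed form): stable if I * sqrt(S * C) < 1."""
    return mean_strength * math.sqrt(n_species * c) < 1.0

# Hypothetical community: 10 species, 20 realized links
S, l = 10, 20
C = connectance(l, S)

weak = is_may_stable(S, 0.5, C)    # weak mean interaction strength
strong = is_may_stable(S, 1.0, C)  # stronger mean interaction strength
print(C, weak, strong)
```

For a fixed number of species and links, increasing the mean interaction strength I pushes the community across the stability boundary, which is the qualitative message of the criterion.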


Figure 1.3 Examples of clustering (A–B) and nestedness (C–D) in ecological networks. Arrows point from resource to consumer [adopted from Ings et al. 2006].

Mutualistic networks in particular seem to be nested (i.e. a two-mode network in mutualistic plant-animal relations). This means that specialists interact with species that form well-defined subsets of those with which generalists interact. Topologically, this pattern is expressed by a network structure in which a core of generalists interact among themselves, and a tail of specialists interacts with the most generalist species [Bascompte et al. 2003]. Nestedness is also related to body size, especially in aquatic ecosystems.

1.2 Structural and Causal networks

When considering the above-mentioned ecological networks, we cannot ignore the additional, potentially confounding effects of seasonality and spatial sampling [Petanidou et al. 2008, Dupont et al. 2009]. These also include perturbations arising from human-induced environmental changes. For instance, ecological networks are mostly described during periods when abundances are highest, which for most webs in temperate climates will be the summer months [e.g. Woodward et al. 2005a, b]. Moreover, when the aim is to model the functionality of populations as superorganisms rather than taxonomic units, most current ecological networks probably fail to incorporate inherent properties of the system. These properties are usually related to seasonal changes in essential behavioural properties of the organisms and their functioning (e.g. mating behaviour and reproduction, social defence interactions, metapopulations, etc.). In other words, networks in ecology have seldom been built on stochastic foundations that incorporate seasonal changes in network relations. For example, a species could be linked more strongly to another only during a particular observation period. However, the significance of the dependence may differ if we account for population changes observed over long periods due to random, mostly unknown, perturbations. The use of a sophisticated statistical framework, such as time series network analysis, could thus permit a quantitative assessment of the structural properties of a network rather than relying on a priori defined connectances. Based on the above thoughts, and according to the construction method, it is convenient to divide ecological networks into the following two categories:

Structural networks: we place here most of the current ecological networks, in which the degree of species interaction is given a priori. Here, seasonal constraints and changes in species interactions are not considered. Therefore most of these networks are binary and their adjacency matrices describe relations among individuals of different taxa.
Because population communities in nature are characterised by assemblages of species that co-occur in a dynamic rather than static fashion, structural models may deviate from reality. However, their asset is that relations can be biologically interpreted.

Causal networks: these are borrowed from econometrics and neuroscience, although throughout the thesis we make efforts to evaluate their utility in describing a posteriori ecological relations among biological entities. Most often the degree of relation is quantified based on stochastic multivariate autoregressive models and probabilistic hypothesis testing. With this approach we can treat both weighted and binary networks. However, in contrast to structural networks, only non-trivial significant relations among objects are considered, while adjacency matrices describe relations among populations (e.g. long-term ecological time series) and not individuals. Nevertheless, a drawback is that they can also indicate significant correlations among objects that do not necessarily reflect a biological interaction. Moreover, because correlations by themselves are crude (they do not provide explanations), in some cases there is a need for theories and related assumptions that connect fundamental results to the corresponding phenomenological aspects of population interactions.

Although several experiments in the life sciences provide time series of simultaneously recorded variables, these are not always available for relatively long-term population studies. However, if they are available (and reliable), researchers are interested in revealing synchronisation between the time series of interest (e.g. predator-prey populations, environmental stochasticity and population synchronisation). Furthermore, in many systems it is important not only to detect synchronisation but also to define causal relationships (i.e. cause-effect, drive-response) among components of the system. Figure 1.4 is an example that depicts the differences in network configuration between a structural connectivity network and a causal connectivity network, while Figure 1.5 describes some simple network configurations and the corresponding causal connectivity patterns for three different causal densities.

The causal density (Cd) is a typical graph metric that characterises the network's dynamics: it represents the fraction of interactions among nodes that are causally significant. An unbounded 'weighted' version of Cd also exists, which additionally takes into account the varying contribution of each causally significant interaction [Seth et al. 2005; Seth 2008].
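A small numerical sketch may help to fix the idea of causal density. It assumes that Cd is the number of significant directed interactions divided by the n(n-1) possible ones, and that the weighted version sums the magnitudes of the significant interactions with the same normalisation; both normalisations, the function names and the three-node matrices are assumptions made for illustration only.

```python
import numpy as np

def causal_density(significant):
    """Fraction of the n(n-1) possible directed interactions that are significant."""
    sig = np.array(significant, dtype=bool)
    n = sig.shape[0]
    np.fill_diagonal(sig, False)  # self-interactions are not counted
    return sig.sum() / (n * (n - 1))

def weighted_causal_density(magnitudes, significant):
    """Unbounded variant: sum of significant magnitudes over n(n-1) (assumed form)."""
    sig = np.array(significant, dtype=bool)
    mag = np.array(magnitudes, dtype=float)
    n = sig.shape[0]
    np.fill_diagonal(sig, False)
    return mag[sig].sum() / (n * (n - 1))

# Hypothetical 3-node example: entry [i, j] refers to the interaction i -> j
significant = [[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]]
magnitudes = [[0.0, 0.4, 0.1],
              [0.0, 0.0, 0.2],
              [0.05, 0.0, 0.0]]

cd = causal_density(significant)
cd_w = weighted_causal_density(magnitudes, significant)
print(cd, cd_w)
```

Here two of the six possible directed interactions are significant, so Cd is one third, while the weighted value depends on how strong those two interactions are.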


Figure 1.4 (A): Causal connectivity (weighted graph) vs structural connectivity (binary graph) and the related networks. Arrow width reflects the magnitude of causal influence. The networks differ in their construction principles: the causal network is based on a posteriori relations derived by hypothesis testing, whereas the structural network is based on a priori defined relations. (B): The effect of environmental stochasticity is reflected in the adjacency matrix of multivariate causal connectivity. Note that the same relations are weighted to different degrees depending on the construction approach [adopted from Seth 2008].

Figure 1.5 Simple networks (top row) and corresponding causal connectivity patterns (bottom row). (A) Fully connected network (complete graph). (B) Fully disconnected network. (C) Randomly connected network. Black arrows correspond to bidirectional connections and grey arrows to unidirectional connections. The width of each arrow (and the size of the arrowhead) reflects the magnitude of the causal interaction (Cd: causal density) [adopted from Seth 2008].


The key challenge in most problems is to determine the functional connectivity of the underlying mechanism. Thus, if the interest is to investigate whether connectivity exists in a set of unknown population variables that evolve randomly in time, and to study the properties of the system in terms of a network, one should first clarify which of the links among the variables are not trivial. Such problems were first addressed in econometrics to describe causal interactions among stock markets [Granger 1969, Granger and Newbold 1986]. Borrowed from econometrics, such computational methods have recently also been applied to analyse serial neural data and to reveal dynamic relations in very complex biological systems [Seth 2010].

1.3 Transforming Time Series into Complex Networks

There has been growing interest in graphical models over the last years, and in particular in those based on transforming time series into networks in order to draw inferences about the topological features and functionality of the system. Recently, several approaches have been proposed for transforming time series data into complex network representations [Pearl 1995, 2000; Lauritzen 1989, 2000; Dawid 2000]. Based on their construction method, these networks can be roughly distinguished into the following classes:

1.3.1 Cycle networks

In principle these are cycle-proximity networks, introduced by Zhang and Small [2006] to study the topological features of pseudo-periodic time series by means of complex networks. In this case, the individual cycles contained in a time series are identified and represented by the vertices of an undirected network. Edges between pairs of vertices are then established if the corresponding segments of the trajectory behave very similarly. For quantifying the proximity of cycles in phase space, a generalization of the cross correlation coefficient applicable to cycles of possibly different lengths has been suggested (e.g. when modelling the chaotic Lorenz system, which is characterized by a double-scroll topology of the attractor with pronounced chaotic oscillations, this structure is well represented in the adjacency matrix A of the corresponding cycle network based on the x-coordinate time series). Because linear and periodic systems have cycle networks that appear random, while chaotic and nonlinear systems generate highly structured networks, the vertex and edge properties of the resulting networks can be used to distinguish between distinct classes of dynamical systems. This class of networks, as well as the recurrence networks that follow, is relevant to univariate time series analysis.
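The construction can be sketched in a few lines of Python. This is only an illustrative simplification and not the exact procedure of Zhang and Small [2006]: cycles are assumed to be delimited by successive local maxima, the similarity of two cycles is assumed to be the maximum Pearson correlation obtained when the shorter cycle is slid along the longer one, and the threshold of 0.9 is arbitrary.

```python
import numpy as np

def split_into_cycles(x):
    """Cut the series at its local maxima; each segment between peaks is one cycle."""
    x = np.asarray(x, dtype=float)
    peaks = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    return [x[a:b] for a, b in zip(peaks[:-1], peaks[1:])]

def cycle_similarity(c1, c2):
    """Maximum Pearson correlation of the shorter cycle slid along the longer one."""
    short, long_ = (c1, c2) if len(c1) <= len(c2) else (c2, c1)
    best = -1.0
    for shift in range(len(long_) - len(short) + 1):
        window = long_[shift:shift + len(short)]
        r = np.corrcoef(short, window)[0, 1]
        best = max(best, r)
    return best

def cycle_network(x, threshold=0.9):
    """Undirected network: vertices are cycles, edges join sufficiently similar cycles."""
    cycles = split_into_cycles(x)
    n = len(cycles)
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if cycle_similarity(cycles[i], cycles[j]) > threshold:
                A[i, j] = A[j, i] = 1
    return A

# Noise-free sine wave with period 20: all cycles are identical
t = np.arange(200)
A = cycle_network(np.sin(2 * np.pi * t / 20))
print(A.shape, A.sum())
```

For this noise-free sine wave every cycle has the same shape, so the sketch links all pairs of cycles. In measured data the edges depend on how noise and the underlying dynamics perturb the individual cycles, which is what the network properties are meant to capture.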

1.3.2 Recurrence networks

A recurrence network is a complex network whose adjacency matrix is given by the recurrence matrix of a time series. Since the recurrence matrix can be defined in different ways, there are distinct subtypes of recurrence networks, characterized by different structural properties. Recurrences are usually described in the form of a binary matrix R, where Ri,j = 1 if the state xj is a neighbour of xi in phase space, and Ri,j = 0 otherwise. In the k-nearest-neighbour variant, every (possibly embedded) observation vector is considered as a vertex i, which is then linked to those k other vertices j that have the shortest distances to it. This means that directed edges leave every vertex and, unlike in cycle and correlation networks, the adjacency matrix of a k-nearest-neighbour network is generally asymmetric. Complex systems with different types of dynamics exhibit distinct structural properties, which can be further characterized in terms of their associated small-scale as well as large-scale features [Marwan 2002, Marwan et al. 2007]. Here the interest is to visualise the concept of recurrence for the analysis of dynamical systems (e.g. chaotic dynamical systems, unstable periodic orbits, or dynamical invariants). For more details refer to Eckmann et al. [1987] and the references given there.
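The two constructions described above can be illustrated with a short sketch. It is not the implementation used in this work: the function names, the Euclidean distance, the embedding parameters (m, tau), the threshold eps and the noisy sine series are all assumptions chosen only for illustration.

```python
import numpy as np

def embed(x, m=2, tau=1):
    """Time-delay embedding: each row is a state vector (x_t, x_{t+tau}, ...)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau:i * tau + n] for i in range(m)])

def pairwise_distances(states):
    """Euclidean distance between every pair of state vectors."""
    diff = states[:, None, :] - states[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def epsilon_recurrence_matrix(states, eps):
    """R[i, j] = 1 if state j lies within distance eps of state i (i != j)."""
    r = (pairwise_distances(states) <= eps).astype(int)
    np.fill_diagonal(r, 0)  # no self-loops in the network
    return r

def knn_adjacency(states, k):
    """A[i, j] = 1 if j is one of the k nearest neighbours of i (directed)."""
    d = pairwise_distances(states)
    np.fill_diagonal(d, np.inf)  # a vertex is never its own neighbour
    nearest = np.argsort(d, axis=1)[:, :k]
    a = np.zeros(d.shape, dtype=int)
    rows = np.repeat(np.arange(len(states)), k)
    a[rows, nearest.ravel()] = 1
    return a

# Toy pseudo-periodic series (assumed data, not the moth counts).
t = np.arange(200)
x = np.sin(2 * np.pi * t / 25) + 0.1 * np.random.default_rng(0).normal(size=200)
states = embed(x, m=2, tau=3)
R = epsilon_recurrence_matrix(states, eps=0.3)  # symmetric by construction
A = knn_adjacency(states, k=4)                  # generally asymmetric
```

In this sketch the threshold-based matrix R is symmetric, whereas in A every vertex has exactly k outgoing edges, which is why the k-nearest-neighbour adjacency matrix is generally not symmetric.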

1.3.3. Correlation and causal networks

Since this class deals with multivariate processes Xi,t, with i variables that evolve simultaneously in discrete time steps t, correlation and causal networks differ from cycle and recurrence networks. The network is constructed here by embedding the individual state vectors Xi of each time series in the m-dimensional phase space. Significant correlations among variables are then taken as the edges between the vertices of an undirected complex network. Specifically, if the Pearson correlation coefficient between two vertices (i.e. time series) is larger than a given threshold r, the vertices are considered to be connected. One then constructs an adjacency matrix, which is used to visualize the correlation network embedded in an abstract two-dimensional space. For causal networks, the approach generally involves different statistical inference measures to detect causality among the time series of interest. The most commonly used concept of direct causality is Granger causality [Granger 1969], assessed with Fisher's significance test, which exploits the natural time ordering to achieve a causal ordering of the variables. More precisely, one time series is said to be Granger causal for another series if the latter can be better predicted using all available information than if the information apart from the former series had been used. Granger causality (often referred to as G-causality) is a powerful technique that reveals connectivity from time series data [Eichler 2000, Ding et al. 2006]. To conclude, most of the above graph construction methods aim to construct binary matrices (i.e. adjacency matrices) that encode fundamental topological properties of the underlying system, which are then evaluated by statistical measures. The construction of causal networks is discussed in detail in Chapter 4.
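A minimal sketch of both ideas follows: a thresholded correlation network and a pairwise Granger-type F statistic obtained by comparing a restricted and a full least-squares autoregression. The function names, the lag order p = 2, the threshold of 0.5 and the simulated series are assumptions made for illustration; the measures actually used in this work are those described in Chapter 4.

```python
import numpy as np

def correlation_network(data, r_min):
    """Undirected adjacency: link i-j if the Pearson r of columns i, j exceeds r_min."""
    corr = np.corrcoef(data, rowvar=False)
    adj = (corr > r_min).astype(int)
    np.fill_diagonal(adj, 0)
    return adj

def lagged(x, p):
    """Columns x_{t-1}, ..., x_{t-p}, aligned with the targets x_t, t = p..n-1."""
    n = len(x)
    return np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])

def rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def granger_f(x, y, p):
    """F statistic for 'x Granger-causes y' using p lags of each series."""
    target = y[p:]
    ones = np.ones((len(target), 1))
    restricted = np.hstack([ones, lagged(y, p)])  # past of y only
    full = np.hstack([restricted, lagged(x, p)])  # past of y and of x
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - full.shape[1]
    return ((rss_r - rss_f) / p) / (rss_f / dof)

# Toy data (assumed): x drives y with a one-step delay, not the reverse.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()
f_xy = granger_f(x, y, p=2)  # expected to be large
f_yx = granger_f(y, x, p=2)  # expected to be small

z = x + 0.3 * rng.normal(size=n)  # strongly correlated with x
w = rng.normal(size=n)            # unrelated series
adj = correlation_network(np.column_stack([x, z, w]), r_min=0.5)
```

A directed edge from x to y would be drawn when the F statistic exceeds the critical value of the F distribution with (p, dof) degrees of freedom; the correlation network, in contrast, carries no direction.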

1.4 Principal concepts of Population dynamics

Population dynamics refers to the change in population size (number of individuals) or population density (number of individuals per unit area) over time. Population dynamics are influenced by four fundamental demographic processes: birth, death, immigration (individuals moving into the population) and emigration (individuals moving out of the population) [Kot 2001]. Two categories of models are used to simulate population growth. The first is the classical deterministic approach, while the second is stochastic (referred to as conventional by Royama [2005]) and is based on the recognition that population processes are random processes that can generally be approximated by autoregressive models. Throughout Chapters 3 and 4 we are concerned only with the stochastic approach; the deterministic approach is therefore presented only briefly, for review purposes and because of some analogies with the stochastic models. A third, intermediate approach is more recent and theoretical and is based on the assumptions of deterministic chaos [May 1975]. Although the latter is outside the scope of the current work, a short example is presented for completeness.

1.4.1 Deterministic population models

Traditionally, the growth of a homogeneous single-species population is described by exponential, logistic or Gompertz growth models. In the simplest form it is assumed that all changes in the population result from births and deaths and that the per capita birth rate b and death rate d are constant:

$$\frac{1}{N}\frac{dN}{dt} = b - d \qquad (1.4.1.1)$$

If we denote the intrinsic rate of population increase by $r = b - d$, then:

$$\frac{dN}{dt} = rN \qquad (1.4.1.2)$$

The above equations are linear first-order differential equations, and integration gives the solution:

$$N(t) = N_0 e^{rt} \qquad (1.4.1.3)$$

However, if density-dependent and density-independent factors are considered as the primary factors regulating populations, and K represents the carrying capacity, then we arrive at:

$$\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right) \qquad (1.4.1.4)$$

Equation 1.4.1.4 has the exact solution:

$$N(t) = \frac{K}{1 + \left(\frac{K}{N_0} - 1\right)e^{-rt}} \qquad (1.4.1.5)$$

Figure 1.7 is a schematic representation of density-dependent self-regulation in populations with no stochastic trends, while Figure 1.8 illustrates a typical delayed density regulation in a predator-prey system with no stochastic trends. However, in most cases the system is more complicated and further interactions should be considered. Additionally, by using the same deterministic skeleton but adding random shocks, we obtain the stochastic versions of the above models (see 1.4.2).
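As a check on the closed-form logistic solution, the following sketch integrates the logistic equation numerically with a simple Euler scheme and compares the result with the exact expression. The parameter values (r, K, N0), the step size and the function names are assumed purely for illustration.

```python
import numpy as np

def exponential(t, n0, r):
    """Exponential growth, N(t) = N0 * exp(r t)."""
    return n0 * np.exp(r * t)

def logistic(t, n0, r, k):
    """Closed-form logistic solution, N(t) = K / (1 + (K/N0 - 1) exp(-r t))."""
    return k / (1.0 + (k / n0 - 1.0) * np.exp(-r * t))

def euler_logistic(n0, r, k, dt, steps):
    """Forward-Euler integration of dN/dt = r N (1 - N/K)."""
    n = np.empty(steps + 1)
    n[0] = n0
    for i in range(steps):
        n[i + 1] = n[i] + dt * r * n[i] * (1.0 - n[i] / k)
    return n

# Assumed illustrative parameters.
r, k, n0 = 0.5, 100.0, 5.0
dt, steps = 0.01, 3000
t = np.arange(steps + 1) * dt
numeric = euler_logistic(n0, r, k, dt, steps)
exact = logistic(t, n0, r, k)
max_err = np.max(np.abs(numeric - exact))  # small for a small step size
```

Both curves start at N0 and saturate at the carrying capacity K, whereas the exponential model grows without bound for r > 0.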

[Diagram: chain of population states Nt-1, Nt, Nt+1, ..., Nt+n]

Figure 1.7 Schematic representation of density-dependent self-regulation in populations [from Damos 2012].

[Diagram: two coupled chains of states, Nt-1, Nt, Nt+1, ..., Nt+n and Pt-1, Pt, Pt+1, ..., Pt+k]

Figure 1.8 Schematic representation of density-dependent self-regulation and the related delayed density regulation in a predator-prey system [from Damos 2012, after modification].

In the case where population N is affected by interaction with a second population P (i.e. predation), the dynamics can be described by the following simple system of differential equations:

$$\frac{dN}{dt} = rN - cNP, \qquad \frac{dP}{dt} = bNP - mP \qquad (1.4.6)$$

where r, c, b and m are constants. The time series of a typical predator-prey interaction are illustrated in Figure 1.9a. The phase-space trajectories for population time series with different periods are shown in Figure 1.9b, where each point of the prey population maps to the corresponding point of its predator. Different trajectories represent different combinations among the population dynamics of the two interacting species. For more details refer to Logan and Hain [1991].
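The closed phase-space trajectories of the predator-prey system can be reproduced with a short numerical sketch. It assumes the classical Lotka-Volterra form with loss terms -cNP and -mP, uses a hand-written fourth-order Runge-Kutta integrator, and takes arbitrary illustrative parameter values; none of these choices comes from the original study.

```python
import numpy as np

def lotka_volterra(state, r, c, b, m):
    """Right-hand side: dN/dt = rN - cNP, dP/dt = bNP - mP."""
    n, p = state
    return np.array([r * n - c * n * p, b * n * p - m * p])

def rk4(f, state, dt, steps, **params):
    """Classical fourth-order Runge-Kutta integration with a fixed step."""
    out = np.empty((steps + 1, len(state)))
    out[0] = state
    for i in range(steps):
        s = out[i]
        k1 = f(s, **params)
        k2 = f(s + 0.5 * dt * k1, **params)
        k3 = f(s + 0.5 * dt * k2, **params)
        k4 = f(s + dt * k3, **params)
        out[i + 1] = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return out

def invariant(n, p, r, c, b, m):
    """Quantity conserved along each trajectory of the system."""
    return b * n - m * np.log(n) + c * p - r * np.log(p)

# Assumed illustrative parameters and initial densities.
params = dict(r=1.0, c=0.1, b=0.02, m=0.5)
traj = rk4(lotka_volterra, np.array([40.0, 9.0]), dt=0.01, steps=5000, **params)
prey, predator = traj[:, 0], traj[:, 1]
v = invariant(prey, predator, **params)
drift = float(np.ptp(v))  # close to zero: the orbit is closed
```

Plotting predator against prey gives one closed orbit per initial condition, and different initial conditions give the nested trajectories of a phase-space plot.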

Figure 1.9 Time series of a typical predator-prey interaction (a) and related phase-space plots of different population combinations (b) [Kot 2001].

Finally, in cases where a population is governed by negative feedback that acts instantaneously, the population dynamic system settles on a characteristic attractor (e.g. a point attractor or a limit cycle attractor). In some cases, strange attractors can also arise under different parametric conditions (Figure 1.10). For more details refer to May [1973, 1974, 1975], Logan and Hain [1991] and Kot [2001].


Figure 1.10 Typical characteristics of population dynamics with steady-state, periodic and chaotic behaviour [May 1975]. The model is X(t+1) = rX(t)(1 - X(t)). For parameter values r < 3 the system is stable, whereas for r > 3 the point equilibrium bifurcates into a two-point cycle and so on, until at larger r the behaviour becomes unpredictable ('random behaviour'). The latter is referred to as deterministic chaos. Thus population dynamics change in relation to the species-specific per capita rate of increase. Note also the presence of some ordered windows in the chaotic region.
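The three regimes named in the caption can be reproduced by iterating the map directly. In the sketch below the values r = 2.5, 3.2 and 3.9, the initial state and the helper names are assumptions chosen to give one example of each regime.

```python
def logistic_map(r, x0=0.2, n=1000):
    """Iterate X(t+1) = r X(t) (1 - X(t)) and return the whole orbit."""
    x = x0
    orbit = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

def tail_values(r, keep=8, digits=4):
    """Distinct values (after rounding) visited at the end of a long orbit."""
    return sorted({round(x, digits) for x in logistic_map(r)[-keep:]})

steady = tail_values(2.5)   # a single value: stable point equilibrium
cycle = tail_values(3.2)    # two values: two-point cycle
chaotic = tail_values(3.9)  # many values: irregular, chaotic orbit
```

For r = 2.5 the orbit settles on the equilibrium 1 - 1/r = 0.6, for r = 3.2 it alternates between two values, and for r = 3.9 no repeating pattern appears in the last iterations.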

1.4.2 Stochastic population models

From a time series perspective, because most species interact with other species, biological processes are in general multivariate. This simply means that the process of a particular species of interest depends on its own density as well as on the density of the interacting species (as defined previously). However, a series of observations is usually available only for the target species. Although this appears to create problems in describing and analysing population processes, many observed ecological time series can in practice be approximated effectively by a univariate, density-dependent time series model with p lagged terms [Royama 2005]:

$$x_t = f_p(x_{t-1}, x_{t-2}, \ldots, x_{t-p}) + \varepsilon_t \qquad (1.4.2.1)$$

in which xt is the population and εt is a random, density-independent disturbance (shock) [Royama 1984, Royama 1992, Berryman 2007]. Figure 1.10 depicts the population dynamics over time in a locally stable system and Figure 1.11 depicts the time evolution of the variance in some representative stationary and non-stationary population processes [Karlsson 2007]. Natural population time series also exhibit red-shifted power spectra [Steele 1985, Sugihara 1995] and in some cases white power spectra [Halley 1996]. The order of the density dependence, p, depends on the food web structure, and p = 1 or 2 is in many cases adequate to study certain probabilistic processes. Combinations of parameter values are related to the different modes exhibited by the population process (Figure 1.12).
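For the linear case, the density-dependent model above reduces to an autoregressive process of order p. The sketch below simulates such a series and recovers its coefficients by ordinary least squares; the coefficients (0.6, -0.3), the series length and the function names are assumed for illustration and do not correspond to any of the moth series.

```python
import numpy as np

def simulate_ar(coefs, n, sigma=1.0, burn=200, seed=0):
    """Simulate x_t = sum_k coefs[k-1] * x_{t-k} + eps_t with Gaussian shocks."""
    rng = np.random.default_rng(seed)
    p = len(coefs)
    x = np.zeros(n + burn + p)
    eps = rng.normal(0.0, sigma, size=len(x))
    for t in range(p, len(x)):
        past = x[t - p:t][::-1]  # x_{t-1}, ..., x_{t-p}
        x[t] = np.dot(coefs, past) + eps[t]
    return x[-n:]  # discard the burn-in period

def fit_ar(x, p):
    """Least-squares estimates: intercept first, then the p lag coefficients."""
    n = len(x)
    lags = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])
    design = np.hstack([np.ones((n - p, 1)), lags])
    beta, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
    return beta

true_coefs = np.array([0.6, -0.3])  # assumed second-order density dependence
series = simulate_ar(true_coefs, n=2000)
estimate = fit_ar(series, p=2)      # approximately [0, 0.6, -0.3]
```

With p = 2 the negative second coefficient corresponds to delayed density dependence, which is what produces cyclic fluctuations in the simulated series.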

Figure 1.10 The dynamics of a population over time in a locally stable system. The horizontal line indicates the equilibrium abundance of the population. d denotes the magnitude of a disturbance. Dotted line: return to the equilibrium population level through damped oscillations. Line: monotonic return to the equilibrium population level [after Karlsson 2007].

Figure 1.11 Schematic representation of stationary and non-stationary noise models. White noise and autoregressive red noise (AR) are stationary, which means that the variance of population fluctuations is constant with time (white noise) or becomes constant with time (AR). In pink and brown noise the variance of population fluctuations (see footnote 2) grows indefinitely with time, and these processes are therefore non-stationary [after Karlsson 2007].

2 Natural population time series also exhibit red-shifted (pink) power spectra [Steele 1985, Pimm & Redfearn 1988, Ariño & Pimm 1995a, Pimm 1991a, Sugihara 1995, Halley 1996].


Figure 1.12 The parameter space (a1, a2) for the linear model

$$X_{t+1} = a_0 + a_1 X_t + a_2 X_{t-1} + \varepsilon_t$$

divided into regions (marked with Roman numerals) in which the generated series exhibit distinct patterns, corresponding to qualitatively different possible behaviours of the underlying dynamic model [adapted from Royama 2005 and modified according to Fleming 2002].
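A rough way to place a parameter pair in such a diagram is to inspect the roots of the characteristic equation of a second-order linear model. The sketch assumes the model is written exactly as X(t+1) = a0 + a1 X(t) + a2 X(t-1) + shock, with a1 and a2 acting directly on the lagged values; it distinguishes only stable from unstable and complex from real roots, and does not reproduce the numbered regions of the figure.

```python
import numpy as np

def classify_ar2(a1, a2):
    """Return (stable, complex_roots) for X(t+1) = a0 + a1 X(t) + a2 X(t-1) + shock.

    Stability requires both roots of z^2 - a1 z - a2 = 0 to lie inside the
    unit circle; complex roots indicate (damped or growing) cycles.
    """
    roots = np.roots([1.0, -a1, -a2])
    stable = bool(np.all(np.abs(roots) < 1.0))
    complex_roots = bool(a1 ** 2 + 4.0 * a2 < 0.0)
    return stable, complex_roots

monotonic = classify_ar2(0.5, 0.2)   # real roots, stable
cyclic = classify_ar2(1.0, -0.5)     # complex roots, damped cycles
explosive = classify_ar2(0.5, 0.6)   # one root outside the unit circle
```

If the coefficients are instead defined on the log rate of change, as in some formulations, the pair must first be converted to this form before the classification applies.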

Usually a measure of population density is transformed to a logarithmic scale to meet the theoretical requirements of population dynamics, although this also depends on the type of investigation and on the deviation from normality in empirical studies [Royama 1984, Royama 1992, Fleming et al. 2002, Berryman and Lima 2007]. In order to estimate the population density factors, the deterministic and stochastic components of the autoregressive equations can be isolated (or the stochastic component set to its mean value, 0). This simplifies the analysis and the parameter equations that are initially used to examine the effects of the deterministic factor alone on spatial transmutations (e.g. spatial synchronisation). It is also worth noting that in most cases the underlying model used to describe population abundances is a discrete-time stochastic Gompertz model. The statistical properties of the stochastic Gompertz model are well known. The probability distribution, for instance, is a normal distribution with mean and variance that change as functions of time. The stationary distribution is the stochastic version of the equilibrium in the deterministic model and is an important statistical manifestation of density dependence in the population growth model. Nevertheless, since throughout this work we are mostly interested in detecting the order of stationary population time series, the direct application of AR models, as well as of seasonal models, is covered in detail in Chapter 3.
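The time-dependent mean and variance mentioned above can be made concrete under one common parametrisation, which is an assumption here: on the log scale the discrete-time stochastic Gompertz model is written as x(t+1) = a + b x(t) + shock, with Gaussian shocks of standard deviation sigma and |b| < 1. The parameter values and function names in the sketch are illustrative.

```python
import numpy as np

def gompertz_moments(x0, a, b, sigma, t):
    """Mean and variance of x_t given x_0, for x_{t+1} = a + b x_t + eps_t."""
    mean = b ** t * x0 + a * (1.0 - b ** t) / (1.0 - b)
    var = sigma ** 2 * (1.0 - b ** (2 * t)) / (1.0 - b ** 2)
    return mean, var

def simulate_gompertz(x0, a, b, sigma, steps, reps, seed=0):
    """Simulate `reps` independent log-abundance trajectories for `steps` steps."""
    rng = np.random.default_rng(seed)
    x = np.full(reps, float(x0))
    for _ in range(steps):
        x = a + b * x + rng.normal(0.0, sigma, size=reps)
    return x

# Assumed illustrative parameters.
x0, a, b, sigma = 1.0, 1.0, 0.7, 0.3
mean_5, var_5 = gompertz_moments(x0, a, b, sigma, t=5)
sample = simulate_gompertz(x0, a, b, sigma, steps=5, reps=20000)

# As t grows, the moments converge to those of the stationary distribution.
stationary_mean = a / (1.0 - b)
stationary_var = sigma ** 2 / (1.0 - b ** 2)
```

The stationary mean a / (1 - b) coincides with the equilibrium of the deterministic skeleton, which is the sense in which the stationary distribution is the stochastic counterpart of the deterministic equilibrium.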

1.4.3 Population emergence and spatial synchronisation

In the stochastic context briefly introduced in 1.4.2, population processes are considered weakly stationary, i.e. expectations are constant for all t and the cross-covariance functions depend on the time lag and not on absolute time t (see Chapter 3 for more details). Based on these assumptions, one can introduce a measure of association between each pair of variables and detect any synchronisation patterns. The most common association measure is the autocorrelation function. The concept of periodically correlated processes was introduced by Gladyshev [1961]. These early theoretical results were essential for the development of the class of periodic linear autoregressive moving average models, which has proved useful and appropriate for modelling time series that exhibit periodic autocorrelation structures. This feature cannot be accounted for by standard ARMA models [Pagano 1978, Tiao and Grupe 1980, Bentarsi and Aknouche 2005]. Moreover, many species of phytophagous forest insects undergo synchronous population cycles in which outbreaks occur simultaneously over large regions, and the spatial extent of damage may have devastating consequences [Myers 1988]. Thus the detection of population synchronisation among nearby and other regions is of special interest in pest management. Most of these notions are covered in detail in the following Chapters and are based on empirical results in which two or more time series of local moth populations cycle in quite similar manners. Synchrony among the populations (i.e. similar fluctuation patterns) is assessed first in terms of their autocorrelation functions and next in terms of their frequencies (i.e. coincidence in phase). In addition, significant correlations and direct causal measures are then used to indicate locations which, from a practical standpoint, are referred to as hot spots, since they are able to drive (at least from a statistical standpoint) pest population emergence in nearby locations.
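One simple way to look for synchrony, and for a possible lead of one location over another, is the sample cross-correlation function between two series. The sketch below is an illustration under assumed conditions: the function name is hypothetical, the data are simulated, and the second series is constructed to follow the first with a delay of three sampling intervals.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Approximate corr(x_t, y_{t+k}) for k = -max_lag..max_lag.

    Both series are standardised once over their full length; each lag then
    uses the mean product of the overlapping parts.
    """
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    out = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            out[k] = float(np.mean(x[:n - k] * y[k:]))
        else:
            out[k] = float(np.mean(x[-k:] * y[:n + k]))
    return out

# Toy data (assumed): site y repeats site x three intervals later, plus noise.
rng = np.random.default_rng(3)
x = rng.normal(size=300)
y = np.roll(x, 3) + 0.2 * rng.normal(size=300)

ccf = cross_correlation(x, y, max_lag=10)
best_lag = max(ccf, key=ccf.get)  # lag at which the association is strongest
```

A peak at a positive lag suggests that fluctuations at the first location precede those at the second, which is the kind of evidence that would mark the first location as a candidate hot spot, subject to the causal measures of Chapter 4.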

1.5 Scope of the dissertation

In applied entomology, empirical approaches are often used to construct models of population development. These models are used as prediction models in the framework of Integrated Pest Management (see footnote 3) [Damos 2012]. In general, the procedure involves narrowing down all the factors that affect development to the most crucial one, which is then used to reveal empirical dependences of the developmental variables on this limiting factor. A function, linear or non-linear, that describes the data with the highest accuracy is fitted to these relations, and its predictive power is evaluated using new datasets [Damos et al. 2011, Damos and Savopoulou-Soultani 2012]. However, there is a new impetus towards integrating the different approaches in order to construct population models and network depictions for studying ecological relations. Hence, across all network types, the importance of the specific configuration of complexity and of interaction strengths is recognised and used to draw inferences about the functionality of ecological systems, such as the presence of modules, the evaluation of interaction strengths and the detection of directionality among non-trivial interactions. In this context, the current work addresses the challenge of constructing causal ecological networks using relatively new tools of multivariate time series analysis borrowed from econometrics and neuroscience. We further apply the proposed mathematical formalism to detect spatial relations among populations of three closely related species, as well as the effect of environmental stochastic trends. We thus treat the potential interactions among abiotic and biotic variables as a multivariate vector autoregressive system, which is bounded within the particular study area and is finally represented as a network.

3 Integrated Pest Management (IPM) is a decision-based process involving the coordinated use of multiple tactics for optimizing the control of all classes of pests (insects, pathogens, vertebrates and weeds) in an ecologically and economically sound manner. In IPM, population models are at the core of decision making about which of the available control options to apply in order to manage pests successfully [Damos 2012].

Based on well-defined statistical tools for quantifying relations among ecological objects, we are able to move beyond simple structural relations describing species-averaged data and start exploring relations among populations, together with the potential effects of diverse factors that interact with the system, such as abiotic variables. To the best of our knowledge, the current approach thus departs from most standard graphical descriptions, which are made through bipartite networks in which nodes (species) are linked (interact) only with nodes of the opposite guild (i.e. plants with animals and vice versa) [Albert and Barabási, 2002].

The method has two advantages:
1. Conceptual: we approach the ecological network concept from a holistic standpoint by considering both abiotic and biotic variables. This is done because populations, and especially poikilothermic organisms, are directly affected by abiotic variables such as temperature.
2. Constructional: connectance is defined a posteriori, only if significant correlation among ecological time series is indicated, rather than on the basis of an a priori artifact.

The analysis of spatial interactions among species is fundamental to understanding the spatial organisation and functionality of populations. The population dynamics of species in nearby regions seldom act in an isolated manner; rather, region-specific abiotic factors often interact, while potential inter- and intraspecific interactions are also involved. Functions of uncharacterized mixed relations may be described and predicted through time series analysis and network theory. Identifying causal structure within time series data is important during the exploratory analysis phase and constitutes a preliminary step for further multivariate modelling of the system of interest. Additionally, mapping and topological projection of the population networks onto the study sites may reveal landscape effects and locations of higher importance. The utility of such information for studying colonisation dynamics and/or predicting region-specific spatiotemporal pest outbreaks is evident. The rest of this dissertation is structured as follows. After the experimental setting, which is described in Chapter 2, the work is divided into two main parts. The first, Chapter 3, provides the mathematical background that is a prerequisite for studying time series and proceeding to network analysis. Here the aims are the detection of population order, seasonality and synchronisation. In the second part (Chapter 4), the theoretical foundations and results are used to study causal relations among the ecological variables. Here we study what might be regarded as central issues in spatiotemporal population ecology based on multivariate network analysis. Finally, in Chapter 5 an effort is made to draw some general conclusions on the properties of the constructed eco-population network and on the practical utility of this kind of information for pest management and for the improvement of web-based information and pest forecasting systems.


2. Study system and moth population dynamics

2.1 Species

Three lepidopteran pest species were studied: Anarsia lineatella Zeller (Lepidoptera: Gelechiidae), Grapholitha molesta Busck (Lepidoptera: Tortricidae) and Adoxophyes orana Fischer von Röslerstamm (Lepidoptera: Tortricidae). Their eco-biology is briefly discussed below.

2.1.1 The summer fruit tortrix Adoxophyes orana

The summer fruit tortrix Adoxophyes orana (Fischer von Röslerstamm) (Lepidoptera: Tortricidae) attacks a wide variety of plants, with a preference for rosaceous plants, especially apple, pear and peach. The species is, however, reported to feed and develop on more than 50 plant species of multiple families, including fruits, forest trees and ornamentals. It has three to four generations per year, depending on the prevailing regional temperatures. The species is a leafroller: larvae of the first generation damage mostly leaves, while larvae of the subsequent generations also damage fruits. One larva can usually attack more than one nearby fruit. The species overwinters as a larva in a cocoon shelter in bark crevices. This species, like the following ones, has a regular presence in Northern Greece and is considered a regular pest of fruit orchards.

2.1.2 The peach twig borer Anarsia lineatella

The peach twig borer, Anarsia lineatella Zeller (Lepidoptera: Gelechiidae), is one of the major economic pests of stone fruits in central and southern Europe. It is reported to be oligophagous, preferring mainly peaches, apricots and almonds. In southern Europe (e.g. Greece), A. lineatella has three or, usually, four generations per year, depending on prevailing temperatures. The species overwinters in bark crevices as second- or third-instar larvae, forming hibernacula. Larvae become active in spring and can cause early-season injury by burrowing into new twigs. Later, during summer, newly hatched larvae of the following generations feed mainly on fruit, causing significant damage. In most cases, the above species appear simultaneously during a growing season, and detailed species-specific phenology is therefore essential for management success.

2.1.3 The oriental fruit moth Grapholitha molesta

The oriental fruit moth G. molesta is considered a regular problem in stone fruits, as well as in apples when stone fruit orchards are nearby. Apart from peaches (Prunus persica) and nectarines, G. molesta attacks a great variety of hosts, including apricots (Prunus armeniaca), almonds (Prunus amygdalus), quince (Cydonia oblonga), pears (Pyrus sp.), plums and cherries (Prunus sp.), as well as woody ornamental plants. The oriental fruit moth has three full generations and occasionally a partial fourth and fifth generation in Southern Europe. Flight patterns, however, are very confusing and generations are difficult to distinguish. The moths overwinter as full-grown larvae in cocoons located in various sites, mostly tree bark crevices and weed stems, but also trash on the ground, fruit containers and packing sheds. G. molesta has a marked daily flight period in the evening. In peach orchards, the species lays the majority of its eggs on the leaf surfaces, and first-generation larvae usually damage two or three shoots. Larvae of subsequent generations damage both shoots and fruits. Larvae of these generations are the major cause of wormy fruit at harvest, often with little or no external sign of injury. Additionally, about half of the injury in late-ripening peach varieties is characterized by no visible entrance (concealed injury). G. molesta also attacks apples, but is usually considered a regular pest of stone fruits.

All the above Lepidoptera have a wide distribution in Europe, North America and northern Asia and are thus considered among the most important pests for fruit production worldwide [Balachowsky 1966; Damos and Savopoulou-Soultani 2008, 2010]. The damage caused by the first-generation larvae of A. lineatella and G. molesta is similar, as both attack young peach shoots, while A. orana is a leaf roller. During the growing season, larvae of later generations attack fruits in a species-specific way [Balachowsky 1966]. From a systematic standpoint, the above species belong to the larger group of Heterocera, which in most cases includes smaller multivoltine moth species [Damos and Savopoulou-Soultani 2011]. These species are diurnal and usually complete more than three generations per year. The above micro-moths can also regularly be observed in resting positions, in which their wings are held horizontally against the substrate (in contrast to the larger butterflies of the Rhopalocera division, in which the wings are held vertically) [Damos and Savopoulou-Soultani 2011].

2.2 Sampling sites

Observations were carried out in a population sampling network established in Northern Greece, in Veroia in the prefecture of Imathia (40.32° N, 22.18° E). The observation network covers an agricultural landscape of approximately 25.000 km2. The agro-ecosystem consists of plots of several crops, among which fruit orchards are randomly distributed and represent approximately half of the observed area. The moth observation network consists of 13 observation points. Traps were placed in locations of each observation plot that were representative in terms of landscape architecture. In particular, the network was established with sample points deployed in a wide-mesh net, with a maximum density of ~3 traps per ~20 hectares. Most of the experimental fruit orchards belonged to farmers of the public peach confederations (ALMME ®), and some were close to the wider experimental area covered by the Agricultural and Pomological Research Institute of Naousa (NAGREF ®). The analysis was based on combined data sets provided by the cooperatives, together with regular recordings performed personally in the above locations during the past decade, covering the period 2004-2012.

2.3 Species monitoring and data registration

Adult moth time series data were collected with sex pheromone traps. Cardboard delta traps (pheromone traps: Trece Inc., Salinas, CA, USA) were placed in each study location. Separate traps were used for each moth species, with sticky inserts baited with mixtures of synthetic sex pheromones (A. lineatella: (E)-5-decenyl acetate and (E)-5-decen-1-ol; G. molesta: (Z)-8-dodecenyl acetate; A. orana: (Z)-11-tetradecenyl acetate and (Z)-9-tetradecenyl acetate). Traps were hung 1.5 m above the ground before the start of the first flight and were inspected for moth catches from April until October. Pheromone dispensers were replaced every 6 weeks and sticky boards every 2-4 weeks. Moth traps were inspected regularly, every 3 days, throughout the years. Trapped moths were recorded at each time interval and then carefully removed.

2.4. Moth population and weather data time series

Figures 2.1 and 2.2 depict the time series of the moth population variables, while Figure 2.3 presents the time series of the two environmental variables recorded during the same time intervals (mean temperature and relative humidity). The ecological time series showed cyclic dynamics throughout the observation period. Moreover, in most cases a repeated pattern is observable, although it is difficult to judge whether the observed moth populations fluctuate with a constant frequency. Apart from the frequency of population emergence, the population levels of A. orana and A. lineatella were higher and relatively constant during the past decade, in contrast to those of G. molesta. Since all three species are multivoltine, they complete more than one generation during a growing season. Moreover, in all cases the ecological time series are characterised by time windows of no or very low moth activity. Based on the moth population emergence data, the related phenology patterns are similar and in most cases non-overlapping moth generations were observed. Generally, the observed moth population patterns are typical of insects and related organisms that are active in temperate climates.

[Figure 2.1: eight panels, X1 to X8 (A. orana). Vertical axis: moth population (0-100). Horizontal axis: time scale (2003-2011), sampling intervals 0-300.]

Figure 2.1 Moth population time series data (X1-X8: A. orana) as observed in a typical agricultural landscape in Northern Greece. Time scale represents continuous population counts performed at 3-day intervals throughout the years 2004-2011.

[Figure 2.2: five panels, Y1 to Y3 (A. lineatella) and Z1 to Z2 (G. molesta). Vertical axis: moth population (0-100). Horizontal axis: time scale (2003-2011), sampling intervals 0-300.]

Figure 2.2 Moth population time series data (X1-X8: A. orana, Y1-Y3: A. lineatella and Z1-Z2: G. molesta) as observed in a typical agricultural landscape in Northern Greece. Time scale represents continuous population counts performed at 3-day intervals throughout the years 2004-2011.

[Figure 2.3: two panels, mean temperature (°C) and relative humidity (%). Horizontal axis: time scale (2003-2011), sampling intervals 0-300.]

Figure 2.3 Weather time series data: mean temperature (°C) and relative humidity (%). Time scale represents weather data recorded at 3-day intervals throughout the years 2004-2011.


3. Stochastic modelling of insect population cycling and seasonality

3.1 Problem setting and solving algorithm Many populations fluctuate in synchrony, and potential driving forces behind such patterns of spatial correlation include predation and predator switching [Ydenberg 1987, Ims and Steen 1990, Ims and Andreassen 2000], dispersal and environmental synchrony, or the Moran effect [Moran1953]. Nevertheless, although several works have addressed the challenge to answer the question whether species spatiotemporal synchronous population cycles over large regions can be described best by using autoregressive models, there are still concerns on how to detect and precise estimate the order and periodicity of the particular process [Buonaccorsi et al. 2001, Berryman and Lima 2007]. The presence of periodically, or seasonally correlated patterns in ecological time series provides information on the number of additional seasonal variables which should be incorporated in the candidate model. However, in spite of the primary importance and the increasing need for the estimation purpose of seasonal (periodic) parameters in autoregressive time series models, they have received less attention in population modelling. Furthermore, most of these models are interested in direct feedbacks and do not examine the effect of seasonal feedbacks. Generally because in natural populations the presence of seasonal feedbacks tends to be difficult to be detected, the examination of seasonal effects in regulating population growth is usually neglected. Based on the general assumption that ecological time series contain clues to the underlying causal structure of the process [Berryman and Lima 2007], the scope of the current Chapter is to discover the degree of self dependence, seasonality and order, which gives rise to population dynamics in moth species. 
Moreover, since seasonality is too frequently ignored (as in the case of insect species), the construction of reliable models goes hand in hand with an urgent need to apply and evaluate methods for fitting ecologically realistic seasonal autoregressive models. Therefore, our approach to understanding seasonality and describing moth population dynamics combines statistics and population ecology. Generally, model identification and parameter optimisation consist of specifying the appropriate structure and order of a model. Moreover, following the principle of parsimony, the interest here is to evaluate whether the additional parameters of a more complicated model (i.e. more time lags and/or seasonal feedbacks) add to the explanatory power compared with simpler models. Therefore, a four-step procedure is proposed to construct candidate models and to choose the best-fitting model for each ecological time series. The algorithm is presented in the scheme of Figure 3.1, and its construction is based on the classical methods of model identification proposed by Box and Jenkins [1976].

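[Figure 3.1 flow chart. Input: time series $\{x_t\}_{t=1}^{n}$. Step A: PACF and ACF, giving candidate orders p = 0, 1, …, p0 and q = 0, 1, …, q0. Step B: AIC and FPE over p = 0, 1, …, p0, q = 0, 1, …, q0 and S. Step Γ: ARMA(p,q) extended to SARMA(p,q), with AIC and FPE over P = 0, 1, …, p0 and Q = 0, 1, …, q0. Step Δ: final SARMA model, checked with residual errors and NRMSE.]

Figure 3.1 Flow chart of the logical operations for the construction of moth population models (details in the text).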


In the first step (A) we use the autocorrelation function (ACF) to detect seasonality and the order of the moving average (MA) part, while the partial autocorrelation function (PACF) is used to judge the significant time lags of the autoregressive (AR) process. In the second step (B) we use the previously indicated significant time lags to define the orders of several candidate models and proceed to a statistical comparison based on well-known information criteria. Two closely related measures of goodness of fit were used, Akaike's information criterion (AIC) and Akaike's final prediction error (FPE), where smaller values of AIC and FPE indicate a better model. The comparison relied on a routine that estimated the AIC and BIC values for candidate ARMA models of different orders. At the next step (Γ) the routine is repeated, this time to compare the performance of candidate seasonal autoregressive moving average (SARMA) models. Finally, at the last step (Δ), the descriptive power of the constructed models is examined using the normalised root mean square error (NRMSE) and residual error (RE) correlation plots [Kugiumtzis 2010]. The mathematical formalism of each statistical measure is described in detail in the following sections.

3.2 Basic background on univariate time series analysis

The time series analysis followed here draws on a range of disciplines, all built on the general assumption of a stochastic or deterministic process that evolves in time [Box and Jenkins 1970, Kay 1988, Tong 1990, Kantz and Schreiber 1997, Timmer 2008]; these are now briefly described.

3.2.1 Moth population dynamics regarded as a stochastic process

We considered moth phenology patterns as a univariate stochastic process $\{x_t\}$, for which the 3-day interval weekly counts $(x_1, x_2, \ldots, x_n) = \{x_t\}_{t=1}^{n}$ form the time series, i.e. the realisation of the stochastic process that is available through observation. In general, the stochastic process is described by an n-dimensional probability distribution, so that the relationship between the realisation (e.g. moth emergence) and the stochastic process (e.g. model) is analogous to that between the sample and the population in classical statistics [Damos et al. 2011]. Because specifying the complete form of the probability distribution is in general too ambitious, we content ourselves with the first and second moments. In time series analysis the first and second moments are, respectively, the expected mean:

$$\mu_x = E[x_t] \qquad (3.2.1.1),$$

which is estimated as:

$$\bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t \qquad (3.2.1.2),$$

and the variance:

$$\sigma_x^2 = \mathrm{Var}[x_t] = E[(x_t - \mu_x)^2] \qquad (3.2.1.3),$$

estimated as:

$$s_x^2 = \frac{1}{n-1}\sum_{t=1}^{n} (x_t - \bar{x})^2 \qquad (3.2.1.4).$$

Under normality assumptions the above statistics completely characterise the properties of the moth ecological time series. However, since the study is based on experimentally derived field data, in which the first and second moments exist, we consider the above defined stochastic process to be weakly, or wide-sense, stationary, so that:

$$E(x_t) = \mu_x \qquad (3.2.1.5),$$

and

$$E[(x_t - \mu_x)(x_{t+s} - \mu_x)] = \gamma(s) = \gamma(-s) \qquad (3.2.1.6),$$

for all t and s, with the variance of the process given by γ(0) [Royama 1992, 2005]. This assumption is closer to reality and further implies that $E(x_1) = E(x_2) = \ldots = E(x_t) = \mu_x$, $V(x_1) = V(x_2) = \ldots = V(x_t) = \sigma_x^2$ and $\mathrm{cov}(x_t, x_{t+k}) = \mathrm{cov}(x_{t+l}, x_{t+k+l}) = \gamma(k)$, i.e. the mean and variance are constant at all observation times and the autocovariance at lag k depends only on the lag and not on t. This is the standard definition of weak stationarity.
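As a rough illustration of these estimators, the following Python sketch computes the sample mean (3.2.1.2) and the sample variance (3.2.1.4), and compares the moments of consecutive segments as an informal check of weak stationarity. The function names, the segment-based check and the example counts are assumptions made for illustration; they are not the routine used in this study.

```python
def sample_mean(x):
    """Estimate of the process mean, eq. (3.2.1.2)."""
    return sum(x) / len(x)


def sample_variance(x):
    """Unbiased estimate of the process variance, eq. (3.2.1.4)."""
    m = sample_mean(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - 1)


def split_moments(x, parts=2):
    """Mean and variance of consecutive equal-length segments.

    Under weak stationarity the segment moments should be similar.
    Any leftover tail shorter than one segment is ignored.
    """
    size = len(x) // parts
    segments = [x[i * size:(i + 1) * size] for i in range(parts)]
    return [(sample_mean(s), sample_variance(s)) for s in segments]


# Hypothetical trap counts, used only to show the calls.
counts = [3, 5, 8, 4, 2, 0, 1, 6, 9, 5, 3, 1]
print(sample_mean(counts), sample_variance(counts))
print(split_moments(counts, parts=2))
```

Large differences between the segment moments would suggest that the weak stationarity assumption deserves a closer look before fitting a linear model.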

3.2.2 Autocorrelation and partial autocorrelation

We now consider the question of detecting seasonality in moth phenology. This information is important because it is later used to select between a simple linear autoregressive model and a seasonal linear autoregressive model to describe population cycling. The order of a stochastic process, which is needed to construct appropriate linear models, is generally not known for a given data set, and several methods of model selection have therefore been used to estimate it [Shibata 1976, Potscher and Srinivasan 1994]. Here two common diagnostics, the autocorrelation function (ACF) and the partial autocorrelation function (PACF), are used [Fuller 1976, Hamilton 1994, Wei 2006].

Consider the covariance between $x_t$ and $x_{t+k}$:

$$\gamma_k = \mathrm{Cov}(x_t, x_{t+k}) = E[(x_t - \mu_x)(x_{t+k} - \mu_x)] \qquad (3.2.2.1),$$

and the correlation between $x_t$ and $x_{t+k}$:

$$\rho_k = \frac{\mathrm{Cov}(x_t, x_{t+k})}{\sqrt{\mathrm{Var}(x_t)}\,\sqrt{\mathrm{Var}(x_{t+k})}} = \frac{\gamma_k}{\gamma_0} \qquad (3.2.2.2),$$

where we note that $\mathrm{Var}(x_t) = \mathrm{Var}(x_{t+k}) = \gamma_0$ [Wei 2006]. As a function of k, $\gamma_k$ is the autocovariance function, while $\rho_k$ is called the autocorrelation function (ACF). They represent the covariance and the correlation between $x_t$ and $x_{t+k}$ from the same process, separated by k time lags [Wei 2006]. In addition to the ACF, the partial autocorrelation function (PACF) is used to investigate the correlation between $x_t$ and $x_{t+k}$ after their mutual linear dependency on the intervening variables $x_{t+1}, x_{t+2}, \ldots, x_{t+k-1}$ has been removed [Wei 2006]. The PACF corresponds to the following conditional correlation [Hamilton 1994, Wei 2006]:

$$\mathrm{Corr}(x_t, x_{t+k} \mid x_{t+1}, \ldots, x_{t+k-1}) \qquad (3.2.2.3).$$

From an ecological perspective, the PACF is useful for identifying the time lags at which population feedbacks are operating.
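The two diagnostics can be sketched in a few lines of Python. The sketch below estimates the sample ACF of eq. (3.2.2.2) and obtains the PACF with the Durbin-Levinson recursion, one standard way of computing the conditional correlation in eq. (3.2.2.3). The function names, the example series and the approximate 95% white-noise band of ±1.96/√n used to flag lags are illustrative assumptions, not the routine applied to the moth data.

```python
import math


def acf(x, max_lag):
    """Sample autocorrelations rho_0..rho_max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    gamma0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / n / gamma0
        for k in range(max_lag + 1)
    ]


def pacf(x, max_lag):
    """Partial autocorrelations phi_kk for k = 1..max_lag (Durbin-Levinson)."""
    rho = acf(x, max_lag)
    phi_prev = []  # coefficients of the AR(k-1) fit
    out = []
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append phi_kk.
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        out.append(phi_kk)
    return out


def significant_lags(values, n, first_lag=1):
    """Lags whose value lies outside the approximate band +/- 1.96/sqrt(n)."""
    bound = 1.96 / math.sqrt(n)
    return [first_lag + i for i, v in enumerate(values) if abs(v) > bound]


# Hypothetical cyclic series, used only to show the calls.
series = [1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4, 5, 4, 3, 2]
print(acf(series, 4))
print(significant_lags(pacf(series, 4), len(series)))
```

By construction the partial autocorrelation at lag 1 equals the autocorrelation at lag 1, which gives a quick sanity check of any implementation.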

3.2.3 Autocorrelation in the frequency domain and power spectra

Power spectrum analysis (PSA) reveals the intensity of feedbacks in the frequency domain (i.e. 1/lags) [Welch 1967]. The power spectrum of a signal represents the contribution of every frequency of the spectrum to the power of the overall signal. Seasonal patterns, which are not always observable in the time-delayed ACF and PACF correlograms, often emerge after estimation of the power spectrum and provide additional information that supports precise order selection. The approach is presented here because it was also applied for comparison. We thus consider the ecological time series as a periodic waveform, which can be described by the following well-known frequency transformation:

$$x_t = a_0 + \sum_{k=1}^{M}\left[a_k \cos(2\pi k f n) + b_k \sin(2\pi k f n)\right] \qquad (3.2.3.1),$$

where $a_0$ is the mean value and $a_k$, $b_k$ are the amplitudes of each oscillation at the harmonic frequencies $\omega_k = 2\pi k f$, with $n$ the sample index. In all cases Welch's averaged modified periodogram method of spectral estimation was applied, and the spectral density is calculated in units of power per radian per sample [Welch 1967].
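The idea behind Welch's method can be sketched as follows: the series is cut into overlapping segments, each segment is tapered with a Hamming window, a periodogram is computed per segment, and the periodograms are averaged. The Python sketch below is a minimal illustration under stated assumptions: the segment length is derived from the requested number of segments, each segment is mean-centred before tapering, the spectrum is returned on a linear scale up to a constant factor, and all function names are hypothetical. It is not the MATLAB routine used for the figures.

```python
import cmath
import math


def hamming(m):
    """Symmetric Hamming window of length m."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (m - 1)) for i in range(m)]


def welch_psd(x, n_segments=8, overlap=0.5):
    """Averaged modified periodogram; returns (frequencies, psd).

    Frequencies are in cycles per sample, from 0 to 0.5 (one-sided).
    """
    n = len(x)
    # Segment length such that n_segments overlapping segments span the series.
    seg_len = int(n / (1 + (n_segments - 1) * (1 - overlap)))
    step = max(1, int(seg_len * (1 - overlap)))
    win = hamming(seg_len)
    norm = sum(w * w for w in win)
    n_freq = seg_len // 2 + 1
    psd = [0.0] * n_freq
    count = 0
    for start in range(0, n - seg_len + 1, step):
        seg = x[start:start + seg_len]
        mean = sum(seg) / seg_len
        # Assumption: remove the segment mean so the zero frequency does not dominate.
        tapered = [(v - mean) * w for v, w in zip(seg, win)]
        for k in range(n_freq):
            s = sum(
                tapered[t] * cmath.exp(-2j * math.pi * k * t / seg_len)
                for t in range(seg_len)
            )
            psd[k] += abs(s) ** 2 / norm
        count += 1
    psd = [p / count for p in psd]
    freqs = [k / seg_len for k in range(n_freq)]
    return freqs, psd


# Synthetic series with a period of 8 samples, used only to show the call.
signal = [math.sin(2 * math.pi * t / 8) for t in range(144)]
freqs, psd = welch_psd(signal)
peak = max(range(len(psd)), key=lambda k: psd[k])
print(freqs[peak])  # frequency of the dominant cycle, in cycles per sample
```

A peak at frequency f corresponds to a cycle of 1/f samples, which is how a seasonal period can be read off the spectrum and compared with the lags suggested by the ACF.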

3.2.4 Autoregressive models

We further describe the moth time series $x_t$ as a linear combination of past population values up to order p. This results in an autoregressive model AR(p) of the form [Wei 2006, Timmer 2008]:

$$x_t = \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \ldots + \phi_p x_{t-p} + \varepsilon_t = \phi_0 + \sum_{i=1}^{p} \phi_i x_{t-i} + \varepsilon_t \qquad (3.2.4.1),$$

where $\boldsymbol{\phi} = (\phi_0, \phi_1, \ldots, \phi_p)$ is the vector of the coefficients of the AR(p) model and $\varepsilon_t$ is the prediction error, or white noise, normally distributed with mean 0 and variance $\sigma_\varepsilon^2$, $\varepsilon_t \sim W(0, \sigma_\varepsilon^2)$. For each time lag p, the partial autocorrelation is estimated by the parameter $\phi_{p,p}$ (the two indices refer to the parameter at lag p in the model of order p) of the AR(p) model. From a population-ecological perspective, linear second-order autoregressive models have been widely used to predict the logarithm of population density; however, since moth population phenology is characterised by time intervals in which no captures were observed, the stochastic modelling procedure was applied to non-transformed data. It is convenient to generalise the AR(p) model further by also including lagged terms of order q of the white noise process. This results in the following mixed linear autoregressive moving average model ARMA(p,q) [Kantz and Schreiber 1997, Box and Jenkins 1970, Timmer 2008]:

$$x_t = \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \ldots + \phi_p x_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \ldots + \theta_q \varepsilon_{t-q} \qquad (3.2.4.2).$$

Moreover, when the population $x_t$ is affected not only by short feedbacks of order p and q of x and ε, but also by long-term seasonal past values, the ARMA model can be extended to a seasonal ARMA model, or SARMA(p,q)x(P,Q)s, with period s:

$$
\begin{aligned}
x_t = {} & \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \ldots + \phi_p x_{t-p} + \phi_{p+1} x_{t-s} + \ldots + \phi_{2p+1} x_{t-s-p} + \ldots \\
& + \phi_{P(p+1)} x_{t-Ps} + \ldots + \phi_{(P+1)(p+1)-1} x_{t-Ps-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \ldots + \theta_q \varepsilon_{t-q} \\
& + \theta_{q+1} \varepsilon_{t-s} + \ldots + \theta_{2q+1} \varepsilon_{t-s-q} + \ldots + \theta_{Q(q+1)} \varepsilon_{t-Qs} + \ldots + \theta_{(Q+1)(q+1)-1} \varepsilon_{t-Qs-q}
\end{aligned}
\qquad (3.2.4.3).
$$

In the above equation, the dependence of the process on its lagged past values, of order p for the autoregressive (AR) part and of order q for the moving average (MA) part, is repeated for P and Q periods in the past for the AR and MA part, respectively. Finally, the parameters of the models were estimated by fitting each model to a training set taken from the original time series (i.e. 8 years of time series recordings), using the ordinary least squares (OLS) method. The estimation is fairly transparent, although it was carried out automatically by a MATLAB program.
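To make the lag structure of eq. (3.2.4.3) concrete, the following Python sketch lists the lags of the autoregressive part of a seasonal model and estimates its coefficients by ordinary least squares through the normal equations. It covers only the AR and seasonal AR terms (q = Q = 0); the function names `sar_lags`, `solve` and `fit_ols` are hypothetical, and the small Gaussian elimination stands in for whatever solver the MATLAB program used.

```python
def sar_lags(p, P, s):
    """Lags of the AR part of eq. (3.2.4.3): 1..p, then j*s..j*s+p for j = 1..P."""
    lags = list(range(1, p + 1))
    for j in range(1, P + 1):
        lags.extend(range(j * s, j * s + p + 1))
    return lags


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (m[i][n] - tail) / m[i][i]
    return sol


def fit_ols(x, lags):
    """OLS estimates [phi_0, phi for each lag] of x_t = phi_0 + sum phi_i x_{t-lag_i} + e_t."""
    start = max(lags) if lags else 0
    rows = [[1.0] + [x[t - k] for k in lags] for t in range(start, len(x))]
    y = x[start:]
    m = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(m)]
    return solve(xtx, xty)


# Lags of a model with p = 2, P = 2 and period s = 16.
print(sar_lags(2, 2, 16))

# Noise-free AR(1) series x_t = 1 + 0.5 x_{t-1}, used only to check the fit.
series = [0.0]
for _ in range(11):
    series.append(1.0 + 0.5 * series[-1])
print(fit_ols(series, [1]))
```

For p = 2, P = 2 and s = 16 the lag list is 1, 2, 16, 17, 18, 32, 33, 34, which is the set of lags that appears in the X6 equation of Table 3.1.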

3.2.5 Model comparison and validation

Two closely related measures of goodness of fit were used, Akaike's information criterion (AIC) and Akaike's final prediction error (FPE), which are respectively [Akaike 1973]:

$$\mathrm{AIC}(k) = \log s^2 + \frac{2k}{n} \qquad (3.2.5.1)$$

and

$$\mathrm{FPE}(k) = s^2\,\frac{n+k}{n-k} \qquad (3.2.5.2),$$

where $s^2$ is the estimated error dispersion (i.e. that of $\sigma_\varepsilon^2$), k is the number of model parameters and n is the number of observations. Finally, the descriptive power of the constructed models was examined using the normalised root mean square error (NRMSE) and residual error (RE) correlation plots. The NRMSE is given by:

$$\mathrm{NRMSE} = \sqrt{\frac{\sum_{t=l+1}^{n} (x_t - \hat{x}_t)^2}{\sum_{t=l+1}^{n} (x_t - \bar{x})^2}} \qquad (3.2.5.3),$$

where $\hat{x}_t$ is the estimator (prediction) of $x_t$ given the time series up to time step t-1, and l is the highest time lag of the model (i.e. for SARMA(p,q)x(P,Q)s it is Ps+p). Values of NRMSE close to 0 indicate very good model performance, while values close to 1 indicate that the prediction is no better than predicting with the mean value of the process.
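The three measures are simple enough to write out directly. The Python sketch below follows eqs. (3.2.5.1) to (3.2.5.3); the function names are hypothetical, and the mean in the NRMSE denominator is taken over the evaluated points only, which is an assumption of this sketch.

```python
import math


def aic(s2, k, n):
    """Akaike's information criterion, eq. (3.2.5.1)."""
    return math.log(s2) + 2.0 * k / n


def fpe(s2, k, n):
    """Akaike's final prediction error, eq. (3.2.5.2)."""
    return s2 * (n + k) / (n - k)


def nrmse(x, x_hat, l=0):
    """Normalised root mean square error, eq. (3.2.5.3).

    x_hat[t] is the one-step prediction of x[t]; the first l points,
    for which the model has no complete history, are skipped.
    """
    obs = x[l:]
    pred = x_hat[l:]
    mean = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean) ** 2 for o in obs)
    return math.sqrt(num / den)


observed = [1.0, 2.0, 3.0, 4.0]
print(nrmse(observed, observed))     # perfect prediction
print(nrmse(observed, [2.5] * 4))    # predicting with the mean
print(aic(1.0, 2, 100), fpe(2.0, 1, 3))
```

The two NRMSE calls reproduce the limiting cases described in the text: a perfect prediction gives 0 and a prediction by the mean gives 1.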

3.5 Results

3.5.1 Seasonality and population feedbacks

Figures 3.2 and 3.3 present the ACFs and PACFs, respectively. In most cases a periodic pattern is observed in which peaks (i.e. high autocorrelation) in the ACF die out rapidly, following seasonal patterns that are repeated every kth time interval. Moreover, the highest autocorrelations are observed at the 1st and 2nd time lags and then decay to zero in most data sets. In addition, since in the same data sets the PACF follows a similar pattern, with large values at the first and possibly the second time lag, it is likely that both autoregressive and moving average processes are present in the ecological time series. Overall, based on the ACFs and PACFs, the ecological time series give the following impressions. The variables Xi and Yi, which correspond to the data sets of the species A. orana and A. lineatella respectively, have a strong lag-1 AR component in most cases, although there is some ambiguity in assessing the precise order of the MA process. In the data sets of the variables Zi, which correspond to the G. molesta population, it is also quite difficult to assess whether an exact period exists. Nevertheless, the shape of the ACFs and PACFs gives a first impression of the structure of the candidate (S)ARMA model, since in most cases no more than three AR and/or MA parameters seem to be required.

[Figure 3.2 plots: one ACF panel per series, X1-X8 (A. orana), Y1-Y3 (A. lineatella) and Z1-Z2 (G. molesta); x-axis: time lags (5 to 40); y-axis: moth population ACF.]

Figure 3.2 Autocorrelation functions (ACFs) of the ecological time series corresponding to three adult moth populations (X1-X8: A. orana, Y1-Y3: A. lineatella and Z1-Z2: G. molesta). Time series are separated by k lags (time units).

[Figure 3.3 plots: one PACF panel per series, X1-X8 (A. orana), Y1-Y3 (A. lineatella) and Z1-Z2 (G. molesta); x-axis: time lags (5 to 40); y-axis: moth population PACF.]

Figure 3.3 Partial autocorrelation functions (PACFs) of the ecological time series corresponding to three adult moth populations (X1-X8: A. orana, Y1-Y3: A. lineatella and Z1-Z2: G. molesta). Time series are separated by k lags (time units).


3.5.2 Spectral analysis

Figure 3.4 depicts the power spectra of the 13 moth population time series, i.e. the distribution (over frequency) of the power contained in each time series when treated as a signal of a finite data set, while Figure 3.5 shows the respective Hamming windows that were used. Here each data set is decomposed into its harmonic components, which can be regarded as a partition of the variance of the series among oscillating components of different frequencies (i.e. periods). In terms of the plotted normalised frequency, there are no signals buried in wide-band noise. Moreover, the periodogram peaks appear quite similar for most data sets, which indicates that almost the same frequencies contribute the most to the variance in each of the thirteen time series that were analysed. For instance, the power spectral densities of X1, X2, X3, X4 and X5 are quite similar, as are those of Y1-Y2 and Z1-Z2, respectively. Thus, periodicities are detected in all data sets and, in most cases, species that belong to the same genera cycle in the same manner.

[Figure 3.4 plots: one PSD panel per series, X1-X8, Y1-Y3 and Z1-Z2; x-axis: normalized frequency (×π rad/sample), 0 to 1; y-axis: power/frequency (dB/rad/sample), 0 to 40.]

Figure 3.4 Welch's power spectral density (PSD) estimates of the real-valued, one-sided ecological time series (X1-X8: A. orana, Y1-Y3: A. lineatella and Z1-Z2: G. molesta), based on Welch's averaged, modified periodogram method. The PSD estimator divides the time series data into eight sections of equal length (segments with 50% overlap), computes a modified periodogram of each segment, and then averages the PSD estimates (power spectral density is calculated in units of power per radian per sample).

[Figure 3.5 plots: one window panel per series, X1-X8, Y1-Y3 and Z1-Z2; x-axis: normalized frequency (×π rad/sample), 0 to 0.8; y-axis: magnitude (dB), -100 to 50.]

Figure 3.5 Hamming windows for each ecological time series, used to compute the power spectra of Figure 3.4 (length: 64, sampling: symmetric, main lobe width (-3 dB): 0.039063).

3.5.3 Parameter optimisation

Since in most cases it is virtually impossible to judge whether an additional parameter adds to the explanatory power of a candidate model, all possible combinations of potential ARMA models with orders p, q = 0, …, n and P, Q = 1, …, n were compared on the basis of the AIC and FPE, according to the proposed algorithm and its subroutines (see Figure 3.1). Figure 3.6 is a typical example, which presents, on a two-dimensional state space, the effect of the different parameters of a SAR(p,P) model on the AIC value for each combination (i.e. autoregressive part p = 0, 1, 2, 3 and seasonal part P = 0, 1, 2, 3). From a statistical standpoint it is thus evident that, although a seasonal pattern exists in the ecological time series (cf. the ACFs), its incorporation into the model structure does not, in most cases, add to the explanatory power (i.e. the AIC for P = 0 is the same as for P > 0). Moreover, in some cases in which the ecological process is thought to have 2nd and/or 3rd order feedbacks, it could also be described with similar accuracy by a 1st order AR process.
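The parsimony argument above can be expressed as a small selection rule: among all (p, P) combinations whose AIC lies within a tolerance of the minimum, prefer the one with the fewest parameters. The Python sketch below illustrates this under stated assumptions: the tolerance value, the tie-breaking by total order and the AIC values in the example table are all hypothetical and do not come from the moth data.

```python
def most_parsimonious(aic_table, tol=0.05):
    """Pick the lowest-order model whose AIC is within tol of the best AIC.

    aic_table maps (p, P) tuples to AIC values. Ties in total order are
    broken by the tuple itself, so that (1, 0) is preferred to (0, 1)...
    only if both are candidates; this ordering is an arbitrary assumption.
    """
    best = min(aic_table.values())
    near = [orders for orders, value in aic_table.items() if value - best <= tol]
    return min(near, key=lambda o: (sum(o), o))


# Hypothetical AIC values for a grid of SAR(p, P) candidates.
example = {
    (0, 0): 3.20,
    (1, 0): 2.50,
    (2, 0): 2.49,
    (1, 1): 2.48,
    (3, 3): 2.47,
}
print(most_parsimonious(example))
```

With these illustrative numbers the rule returns the first-order model without a seasonal part, because the higher-order candidates improve the AIC by less than the tolerance.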

[Figure 3.6 plots: one panel per series, labelled X1-X8, Y1-Y3, Z1-Z2, W1 and W2; x-axis: order p of the AR part (0 to 3); y-axis: AIC (2 to 3.5); one curve per order of the seasonal part, P = 0, 1, 2, 3.]

Figure 3.6 Output of a routine run in MATLAB to assess the explanatory power of different SAR(p,P) models in describing the 13 moth population data sets, based on the AIC.


In Figure 3.7, simulations similar to those presented in Figure 3.6 were performed, this time based on the FPE. Although the FPE seems in some cases to be 'statistically more sensitive' than the AIC (e.g. X3, X7, X8), the differences are very slight and the simulations gave similar results. Once again, there are only very slight improvements in the models' explanatory power if the AR and seasonal orders are increased from 1 to 3. Overall, because researchers are often unable to know the true model for a given data set, the presented parameter optimisation procedure provides reasonable grounds for choosing the best-fitting models.

[Figure 3.7 plots: one panel per series, labelled X1-X8, Y1-Y3, Z1-Z2, W1 and W2; x-axis: order p of the AR part (0 to 3); y-axis: FPE (5 to 25); one curve per order of the seasonal part, P = 0, 1, 2, 3.]

Figure 3.7 Output of a routine run in MATLAB to assess the explanatory power of different SAR(p,P) models in describing the 13 moth population data sets, based on the FPE.


Based on the above procedure, the final structure of the model for each time series data set is presented in Table 3.1. This group of models incorporates the best estimates for each ecological time series data set in a statistical sense. Hence each of the moth population data sets can be described by a corresponding linear autoregressive model. In some cases the incorporation of seasonality does seem to add to the explanatory power of the model in describing a particular data set. For instance, the seasonal models describe with high accuracy the emergence patterns and moth abundance for the variables X6, X7, X8, Y1 and Y3.

Table 3.1 Parameter estimates and model performance statistics of the 'SARMA training models' describing the moth phenology of three species. Models were generated from a training data set corresponding to counts registered during 2004-2009.

| Variable | Species (Location) | Model (Xi,p,q,P,Q,s) | Equation |
|---|---|---|---|
| X1 | A. orana (Tsourouki) | (X1,1,0,0,0,16) | $x_t = 3.249 - 0.559\,x_{t-1}$ |
| X2 | A. orana (Lykopi) | (X2,1,0,0,0,16) | $x_t = 2.843 - 0.664\,x_{t-1}$ |
| X3 | A. orana (Kavaki) | (X3,1,0,0,0,16) | $x_t = 3.240 - 0.639\,x_{t-1}$ |
| X4 | A. orana (Vromes) | (X4,1,0,0,0,16) | $x_t = 3.787 - 0.563\,x_{t-1}$ |
| X5 | A. orana (Paliomana) | (X5,2,0,0,0,16) | $x_t = 2.455 - 0.476\,x_{t-1} - 0.161\,x_{t-2}$ |
| X6 | A. orana (Metohi) | (X6,2,0,2,0,16) | $x_t = 1.577 - 0.572\,x_{t-1} - 0.072\,x_{t-2} - 0.025\,x_{t-16} - 0.039\,x_{t-17} + 0.030\,x_{t-18} + 0.005\,x_{t-32} - 0.002\,x_{t-33} + 0.015\,x_{t-34}$ |
| X7 | A. orana (Vergina 1) | (X7,1,0,2,0,16) | $x_t = 1.503 - 0.633\,x_{t-1} - 0.078\,x_{t-16} - 0.022\,x_{t-17} - 0.015\,x_{t-32} - 0.008\,x_{t-34}$ |
| X8 | A. orana (Vergina 2) | (X8,1,0,2,0,16) | $x_t = 1.834 - 0.502\,x_{t-1} - 0.0427\,x_{t-16} - 0.063\,x_{t-17} + 0.019\,x_{t-32} - 0.025\,x_{t-33}$ |
| Y1 | A. lineatella (Galeneika) | (Y1,1,0,1,0,8) | $x_t = 6.250 - 0.273\,x_{t-1} + 0.003\,x_{t-8} - 0.014\,x_{t-9}$ |
| Y2 | A. lineatella (Paliomana) | (Y2,1,0,0,0,8) | $x_t = 4.575 - 0.497\,x_{t-1}$ |
| Y3 | A. lineatella (Vergina 2) | (Y3,3,0,2,0,8) | $x_t = 3.090 - 0.605\,x_{t-1} - 0.023\,x_{t-2} - 0.058\,x_{t-3} - 0.043\,x_{t-8} + 0.01\,x_{t-9} - 0.001\,x_{t-10} + 0.003\,x_{t-11} - 0.007\,x_{t-16} + 0.021\,x_{t-17} - 0.013\,x_{t-18} + 0.023\,x_{t-19}$ |
| Z1 | G. molesta (Paliomana) | (Z1,1,0,0,0,8) | $x_t = 1.234 - 0.745\,x_{t-1}$ |
| Z2 | G. molesta (Metohi) | (Z2,1,0,0,0,8) | $x_t = 1.491 - 0.741\,x_{t-1}$ |
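To show how a row of Table 3.1 is read as a prediction rule, the following Python sketch evaluates a one-step-ahead prediction from a list of coefficients and lags, and computes the mean implied by a stationary first-order model, $\phi_0/(1-\phi_1)$. The coefficients are taken from the printed equations at face value; the function names and the input values of the past counts are hypothetical.

```python
def ar_one_step(coeffs, lags, history):
    """One-step prediction phi_0 + sum phi_i * x_{t-lag_i}.

    coeffs = (phi_0, one coefficient per lag); history holds past values
    with the most recent observation last.
    """
    return coeffs[0] + sum(c * history[-lag] for c, lag in zip(coeffs[1:], lags))


def ar1_stationary_mean(phi0, phi1):
    """Mean of a stationary AR(1) process, phi_0 / (1 - phi_1)."""
    if abs(phi1) >= 1.0:
        raise ValueError("AR(1) process is not stationary for |phi_1| >= 1")
    return phi0 / (1.0 - phi1)


# X1 row of Table 3.1: x_t = 3.249 - 0.559 x_{t-1}
x1_coeffs = (3.249, -0.559)
print(ar_one_step(x1_coeffs, [1], [2.0]))   # prediction after a value of 2.0
print(ar1_stationary_mean(*x1_coeffs))      # implied long-run mean

# Y1 row of Table 3.1: lags 1, 8 and 9 (p = 1, P = 1, s = 8)
y1_coeffs = (6.250, -0.273, 0.003, -0.014)
print(ar_one_step(y1_coeffs, [1, 8, 9], [1.0] * 9))
```

For the X1 row the implied mean is 3.249 / 1.559, roughly 2.08, which gives a quick plausibility check of the fitted intercept against the scale of the modelled series.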


3.5.4 Diagnostic checking and residual errors

Model performance statistics are presented in Figure 3.7 by means of the NRMSE, which evaluates how closely each model of the final group fits its training data set. In particular, the smaller the NRMSE, the more closely the model follows the data (e.g. if a model goes through each data point exactly, the NRMSE is zero). Among the final group of models, the best fits were obtained for the data sets X6, X7, X8 and Y3, Z1, Z2, while larger mismatches between observed and estimated data points were found mostly for the variables X3, X4, X5 and Y1, Y2, in which the respective models are overfitting.

[Figure 3.7 bar chart: NRMSE (0 to 1) per model; y-axis: model performance statistics; x-axis: model structure (Var, p,q,P,Q,s), labelled sarma (X1,1,0,0,0,12), (X2,1,0,0,0,12), (X3,1,0,0,0,12), (X4,1,0,0,0,12), (X5,2,0,0,0,10), (X6,2,0,2,0,17), (X7,1,0,2,0,17), (X8,1,0,2,0,17), (Y1,1,0,1,0,8), (Y2,1,0,0,0,8), (Y3,3,0,2,0,8), (Z1,1,0,0,0,8), (Z2,1,0,0,0,8).]

Figure 3.7 NRMSE of the constructed SARMA models in describing moth population data (data correspond to those of Table 3.1).

An additional comparison of the statistical performance of the final group of models, based on their AIC and FPE values, was also made and is presented in Figure 3.8. As expected, the AIC and FPE results complement those of the NRMSE. As in the case of the NRMSE, smaller values of AIC and FPE indicate better model performance. Additional information that can be drawn from Figure 3.8 is that the FPE appears to be more sensitive than the AIC. In addition, for higher-order stochastic processes, AICc also tended to select models of lower order.

[Figure 3.8 plot: AIC (axis 0 to 6) and FPE (axis 0 to 160) for each model; x-axis: SARMA model structure (Xi,p,q,P,Q,s), with the models labelled as in Table 3.1.]

Figure 3.8 AIC and FPE for the final group of models constructed to describe moth population data (dashed lines drawn for comparison; data correspond to those of Table 3.1).

Apart from the above statistical measures for evaluating the models, it is often necessary to use descriptive statistics in order to get a sense of the distributional patterns of the errors. Figures 3.9a and 3.9b are the 'prediction error' history charts of the models for each moth population data set. These charts depict the residuals of the fit for each data set.

[Figure 3.9a plots: residuals versus the order of the data, one panel per response X1-X8; x-axis: observation order (1 to 280); y-axis: residual.]

Figure 3.9a Residual error time history of the fitted models on the observed moth population time series (series X1-X8: A. orana).

[Figure 3.9b plots: residuals versus the order of the data, one panel per response Y1-Y3 and Z1-Z2; x-axis: observation order (1 to 280); y-axis: residual.]

Figure 3.9b Residual error time history of the fitted models on the observed moth population time series (series Y1-Y3: A. lineatella and Z1-Z2: G. molesta).

[Figure 3.10a plots: residuals versus the fitted values, one panel per response X1-X8; x-axis: fitted value; y-axis: residual.]

Figure 3.10a Residual error charts of the fitted models on the observed moth population time series (series X1-X8: A. orana).

[Figure 3.11b plots: residuals versus the fitted values, one panel per response Y1-Y3 and Z1-Z2; x-axis: fitted value; y-axis: residual.]

Figure 3.11b Residual error charts of the fitted models on the observed moth population time series (series Y1-Y3: A. lineatella and Z1-Z2: G. molesta).

Moreover, in order to investigate the effects of measurement error on the constructed models, the residuals are presented in Figures 3.11a and 3.11b. The charts show the residual errors plotted against the predicted values. In all cases the residual errors appear to be normally distributed and no specific pattern is observable, which supports the quality of the final models.
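A visual reading of residual charts can be complemented by a few summary numbers. The Python sketch below computes the mean, skewness and excess kurtosis of a residual series, together with its lag-1 autocorrelation; values near zero for all four are what one would expect from roughly normal, uncorrelated residuals. The function names and the example residuals are hypothetical, and this check is offered as an illustration rather than as the diagnostic used in this study.

```python
def residual_shape(res):
    """Mean, skewness and excess kurtosis of a residual series."""
    n = len(res)
    mean = sum(res) / n
    m2 = sum((r - mean) ** 2 for r in res) / n
    m3 = sum((r - mean) ** 3 for r in res) / n
    m4 = sum((r - mean) ** 4 for r in res) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, skewness, excess_kurtosis


def lag1_autocorrelation(res):
    """Lag-1 autocorrelation; values far from 0 suggest leftover structure."""
    n = len(res)
    mean = sum(res) / n
    dev = [r - mean for r in res]
    return sum(dev[t] * dev[t + 1] for t in range(n - 1)) / sum(d * d for d in dev)


# Hypothetical residuals, used only to show the calls.
residuals = [0.3, -0.2, 0.1, -0.4, 0.5, -0.1, 0.0, -0.3, 0.2, -0.1]
print(residual_shape(residuals))
print(lag1_autocorrelation(residuals))
```

A clearly non-zero lag-1 autocorrelation would indicate that the fitted model has not captured all of the short-term feedback in the series.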

3.5.5 Model validation

In order to check the prediction capability of the final group of (S)AR(MA) models for each time series, a model validation procedure was followed. The available data were first divided into two parts. The first part covered the period from 2003 to 2007 and was used for model training and parameter calibration, while the remaining part, from 2007 to 2011, served only for validation. Figure 3.12 depicts the prediction performance of the trained models in describing the unseen moth population data sets. In all cases the trained model performed quite well and the slight deviations are within reasonable limits.
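The validation procedure can be sketched as a split of the series into a training part and a validation part, followed by one-step-ahead prediction on the validation part with coefficients estimated on the training part only. The Python sketch below does this for a first-order model fitted by least squares; the function names, the half-and-half split and the synthetic sinusoidal series are assumptions made for illustration.

```python
import math


def split(x, fraction=0.5):
    """Split a series into a training part and a validation part."""
    cut = int(len(x) * fraction)
    return x[:cut], x[cut:]


def fit_ar1(x):
    """Least squares estimates (phi_0, phi_1) of x_t = phi_0 + phi_1 x_{t-1} + e_t."""
    prev, cur = x[:-1], x[1:]
    mean_prev = sum(prev) / len(prev)
    mean_cur = sum(cur) / len(cur)
    sxy = sum((a - mean_prev) * (b - mean_cur) for a, b in zip(prev, cur))
    sxx = sum((a - mean_prev) ** 2 for a in prev)
    phi1 = sxy / sxx
    return mean_cur - phi1 * mean_prev, phi1


def one_step_nrmse(train, valid):
    """NRMSE of one-step predictions on valid, using a model fitted on train."""
    phi0, phi1 = fit_ar1(train)
    # Each prediction uses the true previous observation.
    history = [train[-1]] + valid[:-1]
    preds = [phi0 + phi1 * h for h in history]
    mean = sum(valid) / len(valid)
    num = sum((v - p) ** 2 for v, p in zip(valid, preds))
    den = sum((v - mean) ** 2 for v in valid)
    return math.sqrt(num / den)


# Synthetic cyclic series with a period of 8 samples, used only to show the calls.
series = [5 + 3 * math.sin(2 * math.pi * t / 8) for t in range(128)]
train, valid = split(series)
print(one_step_nrmse(train, valid))
```

Because the coefficients are never re-estimated on the validation part, an NRMSE well below 1 on that part indicates that the model generalises beyond the data used for calibration.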

[Figure 3.12: observed versus predicted moth population, one panel per series (X1-X8: A. orana; Y1-Y3: A. lineatella; Z1, Z2: G. molesta); x-axis: time scale (2008-2011), y-axis: moth population.]

Figure 3.12 Prediction capability of the stochastic (S)ARMA models in describing moth population time series (X1-X8: A. orana, Y1-Y3: A. lineatella and Z1-Z2: G. molesta). Models were trained on the data set covering the years 2003-2007 and validated on the remaining data set from 2007-2011.


3.6 Discussion

Time series data, whether ecological or not, stationary or not, have particular features that require special methods for analysis and biological interpretation. The analysis here differs from procedures employed in other investigations in that it tries to incorporate a realistic description of the seasonal trends in moth population dynamics. This was examined using a relatively large dataset for three Lepidoptera species that were regularly observed during the last decade over a wide agricultural landscape. Since the proposed models are empirically based, their formulation required the statistical analyses presented above to identify the species-specific parameters. The preconditions were that each model should follow the principle of parsimony (be relatively simple), represent the time evolution of the ecological time series, regarded as a linear stochastic process, and predict phenological events, such as the seasonal variations in moth emergence patterns and abundance, with high statistical accuracy. Therefore, to meet the scope of this investigation it was essential to detect short-term time lags, which were then used to define the order of each autoregressive model. In addition, compared to related studies, this work also focuses on the detection of long-term seasonal feedbacks and on whether incorporating this additional seasonal dimension improves model performance. To facilitate calculations on ecological time series we proposed a simple stepwise parameter optimisation algorithm based on classic statistical information criteria. The optimisation procedure revealed that simply structured models performed better in most cases, since the moth population time series displayed very short time lags.
In addition, compared to relevant studies which used the ACFs and PCFs to detect ecological feedbacks, we also focus on the use of power spectrum analysis (PSA). In particular, PSA provides a statistical means of quantifying patterns of ecological time series in a rigorous fashion and can thereby be used to identify contrasting patterns in time series which are not always visible based on ACFs and PCFs. Indeed, the PSA confirmed the presence of periodic patterns in the particular time series and, additionally, showed relatively similar patterns in most data sets, which suggests population cycling among species that belong to the same genera. In general, the use of power spectral density analysis and the related periodograms for explanatory purposes is a fundamental tool in time series analysis [Chatfield 1989, Lau et al. 1995, Torrence and Compo 1998]. Both approaches were able to detect the seasonal trends and periodic properties of the moth population time series. Moreover, in most cases the results obtained by PSA were reasonable when compared to the ACFs and PCFs. For pest management, the time series analyses and the related model construction should be considered valuable not only for identifying contrasting population dynamics, but also as a practical means to predict and project future population patterns, useful for decision making in plant protection.

From a mechanistic perspective, although the constructed models differ in parameter values for each time series, it is safe to claim that at least two best-fitting structures can be identified. For instance, eight of the thirteen variables, namely X1, X2, X3, X4 (A. orana), Y1, Y2 (A. lineatella) and Z1, Z2 (G. molesta), appeared to have an AR(1) structure, while the remaining variables were better described by a $\mathrm{SAR}(p)_Q$ model structure. More precisely, variable X6 displayed a $\mathrm{SAR}(2)_2$ structure, X7 and X8 a $\mathrm{SAR}(1)_2$, while Y1 and Y2 had a $\mathrm{SAR}(1)_1$ and $\mathrm{SAR}(3)_2$ structure, respectively. Moreover, if we take into account that the AIC and FPE values of the seasonal and non-seasonal models are quite similar, so that the seasonal terms can be omitted, it is safe to claim that the moth population dynamics that were studied are governed by an AR structure. However, the present study also demonstrated the existence of cyclical patterns of moth populations through the seasons, and even though most of the presented models are non-seasonal and empirical by nature, their structure incorporates the biological features of the time series and thereby provides some important ecological conclusions. For instance, the moth populations appear to be governed by autoregressive dynamics with a first-order feedback which is periodically repeated.

In conclusion, the model estimates of the current work and the seasonality detection provide means to evaluate the spatiotemporal dynamics and life cycles of the examined pests. The time lag analysis showed a remarkable consistency with prior independent estimates of the number of generations obtained by direct observation of the above species [Damos and Savopoulou 2010]. Additionally, seasonal patterns of series belonging to the same species appeared well synchronised over the observation region. It is also interesting that density dependence was identified in most cases and that the incorporation of the observed seasonal parameters, although giving fits closer to the observed data, does not necessarily enhance model performance. However, all three species exhibited high populations during the past decade and at least two of them are clearly governed by periodic signals. Additionally, the erratic nature of the moth population data was captured well in most cases by the proposed autoregressive modelling approach, since the evaluation statistics and the validation of the models indicate a very successful model fit.

Forecasting of moth population emergence and seasonal phenology, as described in this study for all three peach pests, is quite practical for decision making. For instance, since the populations of A. orana and A. lineatella appeared to be more synchronised, their simultaneous control is most likely possible. In contrast, the seasonal dynamics of G. molesta are characterised by different degrees of self-dependence. The observed differences in micro-moth feedback processes could be attributed to some extent to inherent demographic characteristics, although other factors may also play a role.


4. Multivariate moth population analysis and causal networks

4.1 Ecological networks and state of the art

Because most ecological networks deal with taxonomic units rather than populations, their construction is based mostly on phenomenological rather than statistically derived, mechanistic approaches that define non-trivial interactions. This means that they map snapshots of known interactions between species and do not involve any kind of dynamic population analysis that can statistically define active relations among objects. In traditional community studies, for instance, the nodes are individuals, regarded as taxonomic units, and the links connecting them represent some kind of relation [Solé and Montoya 2006, Dune et al. 2002, Krause et al. 2003]. However, population communities in nature are characterised by assemblages of species that co-occur in a dynamic rather than static fashion. Hence, when we approach the functionality of populations in terms of a super organism rather than taxonomic units, structural networks probably fail to incorporate obscure information which affects the initial properties of the system (i.e. mating behaviour and reproduction, social defence interactions, migration etc.). In other words, structural networks in ecology have seldom been built by taking into account the spatiotemporal variations of the species of interest, so as to give a clear view of the relationships between network architecture and the functioning of individuals within populations.

Here we address the challenge of constructing causal ecological networks by using relatively new tools of multivariate time series analysis borrowed from econometrics and neuroscience [Seth 2008]. We further apply the proposed mathematical formalism to detect spatial relations among populations of three closely related moth species as well as the effect of environmental variability (often referred to as environmental stochasticity). We thus treat the potential interactions among abiotic and biotic variables as a multivariate vector autoregressive system which is bounded in the particular study area and is represented as a network.

Traditional studies of ecological networks mostly fail to propose a way of statistically quantifying direct relations among the ecological variables of interest. Since multivariate time series analysis has been expanded and provides means of studying time-series-related network configurations of very complex biological systems, we use this approach to construct ecological networks. Thus the current work can be considered a departure from most ecological network configurations, which are mostly descriptive in the sense that nodes represent biological entities and any sort of a priori observed relation among the nodes is given by an edge. The statistical background of causal networks provides tools which permit a quantitative assessment of the structural properties of systems composed of different interacting entities [Donner et al. 2010]. This provides a basis for topological descriptions of very complex systems [Granger 1969; Ding et al. 2006; Seth et al. 2005, 2008] and enhances their descriptive power.

4.2 Weighted and binary causal network construction algorithm

Figure 4.1 depicts the stepwise procedure that we propose to analyse moth population time series and to construct network topologies from spatially distributed insect population time series. In principle, correlation and causal networks are constructed based on hypothesis tests of significance. Specifically, if the Pearson correlation coefficient between two time series is larger than a given threshold r, the corresponding vertices are considered to be connected. The approach can be extended by using MVAR models together with several statistical inference measures to detect causalities. The concept of direct causality most commonly used is that of Granger causality [Granger 1969], which exploits the natural time ordering to achieve a causal ordering of the variables. More precisely, one time series is said to be Granger (G) causal for another series if the latter can be better predicted using all available information than if the information apart from the former series had been used. Thus, G causality is a powerful technique that reveals connectivity from time series data [Eichler 2000, Ding et al. 2006].


[Figure 4.1: flow chart with two branches. Correlation measures (cross correlations, CRCO; partial correlations, PACO) lead to weighted correlation networks. Causality measures (Granger causality index, GCI; conditional Granger causality index, CGCI) lead to probability values and weighted causal networks. In both branches a statistical test (parametric or non-parametric; P > α) and the parametric false discovery rate then lead to the binary correlation networks and the binary causal networks (adjacency matrix).]

Figure 4.1 Flow chart of the logical operations that were followed to construct correlation and causal, weighted and non-weighted networks, respectively (the stepwise procedure is covered in detail in the text).

Hence, in this work efforts are made to depict the topological features of ecological networks and to characterise the causal relationships among their objects, namely moth populations and abiotic variables, based on cross correlations and Granger causality measures combined with graph theory. We consider that this approach has the advantage of allowing an informative comparison of topological (e.g. structural) and causal connectivity among the 'ecological objects' and can thereby be used to infer the functionality and properties of the underlying ecological system. In these causality graphs the vertices, representing the components of the time series, are connected by arrows according to the specific causality relations between the variables, whereas lines correspond to contemporaneous conditional association. After reviewing the principles of Granger causality analysis [Granger 1969; Ding et al. 2006; Seth 2008] we illustrate the causal network approach by analysing 15 ecological time series (13 moth population time series and two abiotic variables: temperature and relative humidity). The scope is to introduce the use of causal time series network analysis in the study of time-varying ecological networks having nodes that reflect the initial properties of the actual population (e.g. ecological time series). Since data availability is mostly scarce, as a case study we used the ecological time series of chapter 2 to detect causal topological interactions of the respective moth populations of interest. In addition, because insects and related arthropods are directly affected by temperature and relative humidity, a potential driving effect of these abiotic variables was also examined. Afterwards, based on standard statistics borrowed from graph theory, we check whether the architecture of each developed population network was significantly affected by the choice of causality measure. Finally, efforts are made to infer the structure of the empirically derived causal networks in relation to the specific landscape topology.

4.2 Basic background on multivariate time series analysis

In chapter 3 we have discussed the theoretical considerations in stochastic modelling of univariate ecological time series. Here we provide means of transforming ecological time series into networks based on statistically defined measures to detect causal relations among the ecological variables of interest.

4.2.1 Cross correlations

If we consider each of the ecological time series as a univariate process $\{x_t\}$, in which the moth counts $x = [x_1, x_2, \ldots, x_n] = \{x_t\}_{t=1}^{n}$ form a vector that represents the process realisation available through observation (i.e. Chapter 2), we can proceed to a multivariate correlation analysis as follows. Let C be the (symmetric) covariance matrix whose elements are the covariances among all available variables, which are the potential interaction nodes of the ecological network [Hamilton, 1994]:

$$
C_{ij} = \begin{bmatrix}
\mathrm{cov}(x_1, x_1) & \mathrm{cov}(x_1, x_2) & \cdots & \mathrm{cov}(x_1, x_j) \\
\mathrm{cov}(x_2, x_1) & \mathrm{cov}(x_2, x_2) & \cdots & \mathrm{cov}(x_2, x_j) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(x_i, x_1) & \mathrm{cov}(x_i, x_2) & \cdots & \mathrm{cov}(x_i, x_j)
\end{bmatrix} \qquad (4.2.1.1).
$$


We further introduce a similarity measure denoted by sim(xi;xj), which corresponds to cov(xi;xj), to quantify the level of association between two such nodes i and j. A link is assigned if the level of sim(xi;xj) constitutes a non-trivial association between i and j.

However, since sim(xi;xj) is not directly observable, it has to be inferred from the observations of xi and xj. For simplicity we used the Pearson correlation coefficient as a standard similarity measure:

$$
\mathrm{Corr}(x_i, x_j) = r_{x_i x_j} = \frac{s_{x_i x_j}}{s_{x_i}\, s_{x_j}} \qquad (4.2.1.2),
$$

where $s_{x_i x_j}$ stands for the sample covariance of the variables xi and xj, and $s_{x_i}$, $s_{x_j}$ for the sample standard deviations of xi and xj, respectively.
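As a minimal sketch of equations (4.2.1.1) and (4.2.1.2), the following computes the sample covariance matrix and the resulting Pearson correlation matrix for a set of series. The function name `pearson_matrix` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def pearson_matrix(data):
    """data has shape (n_time, n_series); returns the matrix of r_ij = s_ij / (s_i s_j)."""
    centred = data - data.mean(axis=0)
    cov = centred.T @ centred / (data.shape[0] - 1)   # sample covariance matrix C
    sd = np.sqrt(np.diag(cov))                        # sample standard deviations
    return cov / np.outer(sd, sd)

# Synthetic stand-in: two series sharing a common signal, one independent series.
rng = np.random.default_rng(1)
common = rng.normal(size=200)
data = np.column_stack([
    common + 0.3 * rng.normal(size=200),
    common + 0.3 * rng.normal(size=200),
    rng.normal(size=200),
])
R = pearson_matrix(data)
print(np.round(R, 2))
```

The result agrees with `np.corrcoef(data, rowvar=False)`; the explicit form is shown only to mirror the notation of the text.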

4.2.2 Partial correlations

Cross correlations are very suitable for detecting dependence between a pair of variables (e.g. population Xi on population Xj, vice versa, or both). However, since we include abiotic variables as well (e.g. W1: temperature and/or W2: relative humidity), it is most probable that both Xi and Xj depend on another variable Xk, or even on m other variables (nodes) $X_K = \{X_{k_1}, \ldots, X_{k_m}\}$, where $K = \{k_1, \ldots, k_m\}$. We consider such cases as trivial correlations, which most likely do not suggest a link (i,j). Therefore, to maintain only links of direct dependence among the ecological variables, we also use the following partial correlation measure [Hamilton, 1994, Wei 2006]:

$$
\rho_{ij|K} = \frac{\sigma_{ij|K}}{\sqrt{\sigma_{ii|K}\,\sigma_{jj|K}}} \qquad (4.2.2.1),
$$

where $\sigma_{ij|K}$, $\sigma_{ii|K}$ and $\sigma_{jj|K}$ are components of the 2x2 partial covariance matrix

$$
\Sigma_{11|2} = \begin{bmatrix} \sigma_{ii|K} & \sigma_{ij|K} \\ \sigma_{ij|K} & \sigma_{jj|K} \end{bmatrix} \qquad (4.2.2.2),
$$

and $\Sigma_{11|2}$ is defined as $\Sigma_{11|2} = \Sigma_{11} - \Sigma_{12}\,\Sigma_{22}^{-1}\,\Sigma_{21}$.

The matrices $\Sigma_{11}$, $\Sigma_{12}$, $\Sigma_{22}$ and $\Sigma_{21}$ are components of the partitioned covariance matrix

$$
\mathrm{Cov}(W) = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \qquad (4.2.2.3),
$$

of all involved variables partitioned as $W = [W_1, W_2]$, with $W_1 = [X_i, X_j]$ and $W_2 = X_K$.

Thus, if Xi (i.e. population of species 1) and Xj (i.e. population of species 2) are independent conditional on Xk (i.e. a weather variable), then $\rho_{ij|K}$ should ideally be zero.
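The partitioned-covariance construction of equations (4.2.2.1)-(4.2.2.3) can be sketched as follows. The helper `partial_corr` and the synthetic common-driver example are hypothetical and serve only to show how a strong plain correlation can vanish once the common driver is conditioned on.

```python
import numpy as np

def partial_corr(data, i, j, cond):
    """Partial correlation of columns i and j given the columns listed in cond."""
    cov = np.cov(data, rowvar=False)
    a = [i, j]
    S11 = cov[np.ix_(a, a)]
    S12 = cov[np.ix_(a, cond)]
    S22 = cov[np.ix_(cond, cond)]
    # Sigma_{11|2} = Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21
    S11_2 = S11 - S12 @ np.linalg.solve(S22, S12.T)
    return S11_2[0, 1] / np.sqrt(S11_2[0, 0] * S11_2[1, 1])

# Synthetic example: w drives both xi and xj, which are otherwise unrelated.
rng = np.random.default_rng(2)
w = rng.normal(size=500)
xi = w + 0.5 * rng.normal(size=500)
xj = w + 0.5 * rng.normal(size=500)
data = np.column_stack([xi, xj, w])

r_plain = np.corrcoef(xi, xj)[0, 1]
r_partial = partial_corr(data, 0, 1, [2])
print(f"plain r = {r_plain:.2f}, partial r given w = {r_partial:.2f}")
```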

4.2.3 Granger causality measures - preliminaries

Probably the most common way to examine a causal relationship between two variables is Granger causality, proposed by Granger [1969]. Granger causality is based on the general concept, due to Norbert Wiener [1956], that a causal influence should manifest itself in improving the predictability of the driven process when the driving process is observed [Krumin and Shoham 2010]. A measurable reduction in the unexplained variance of the driven process (say Yi) as a result of including the causal (driving) process (say Xi) in linear autoregressive modelling marks the existence of a causal influence from Xi to Yi in the time domain [Granger 1969, Krumin and Shoham 2010]. The measure involves the estimation of the following multivariate vector autoregressions (MVAR):

$$
X_t = \sum_{i=1}^{p} \alpha_i Y_{t-i} + \sum_{i=1}^{p} \beta_i X_{t-i} + \varepsilon_{1,t} \qquad (4.2.3.1)
$$

$$
Y_t = \sum_{i=1}^{p} \lambda_i Y_{t-i} + \sum_{i=1}^{p} \delta_i X_{t-i} + \varepsilon_{2,t} \qquad (4.2.3.2)
$$

where $\varepsilon_{1,t}$ and $\varepsilon_{2,t}$ are residuals (disturbances) that are uncorrelated at the same time, and p is the maximum number of lagged observations included in the autoregressive model. In addition, $\alpha_i$, $\beta_i$, $\lambda_i$ and $\delta_i$ are the coefficients of the model (i.e. the contribution of each lagged variable to the predicted values of X and Y). Equation (4.2.3.1) states that the ecological variable (i.e. moth population) Xi is determined by the lagged values of Yi and Xi, and so does equation (4.2.3.2) in the opposite direction, with Yi instead of Xi as the dependent variable.

In order to provide useful heuristics for interpreting the empirical ecological time series it is necessary to move beyond the simple two-variable vector autoregressive model and consider more variables, which incorporate aspects of a more complex system. To address this need we extend our analysis to more variables. Specifically, let $X_k = [X_1, X_2, \ldots, X_k]^T$ denote the realisations of k ecological variables, where T denotes matrix transposition. The process is then described by the following multivariate vector linear autoregressive process (MVAR):

$$
\mathbf{X}_t = \sum_{i=1}^{p} \mathbf{A}_i \mathbf{X}_{t-i} + \mathbf{E}_t \qquad (4.2.3.3)
$$

where, in the usual MVAR notation, the $\mathbf{A}_i$ are the k x k coefficient matrices and $\mathbf{E}_t$ is the vector of residuals.

4.2.4 The Granger Causality Index (GCI)

The GCI is the pairwise linear Granger causality in the time domain and is defined as [Ding et al. 2006]:

$$
\mathrm{GCI}_{X \to Y} = \ln \frac{\sigma^2(\varepsilon_{1,t})}{\sigma^2(\varepsilon_{2,t})} \qquad (4.2.4.1),
$$

where $\sigma^2(\varepsilon_{1,t})$ is the unexplained variance (prediction error variance) of Yi in its own autoregressive model, whereas $\sigma^2(\varepsilon_{2,t})$ is its unexplained variance when a joint MVAR model for both Yi and Xi is constructed. It is expected that $\mathrm{GCI}_{X_i \to Y_i} > 0$ when Xi influences Yi, and $\mathrm{GCI}_{X_i \to Y_i} = 0$ when it does not. In practice, $\mathrm{GCI}_{X_i \to Y_i}$ is compared to a threshold value, which can be determined using a variety of methods (e.g. using surrogate data).
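A minimal sketch of the GCI, assuming ordinary least squares for both the restricted (own past only) and the unrestricted (own past plus driver) model. The helpers `lagged`, `resid_var` and `gci`, the intercept term and the synthetic coupled system are illustrative assumptions, not the implementation used in this work.

```python
import numpy as np

def lagged(series, p):
    """Matrix with columns series[t-1], ..., series[t-p] for t = p, ..., n-1."""
    n = len(series)
    return np.column_stack([series[p - k: n - k] for k in range(1, p + 1)])

def resid_var(target, regressors):
    """Residual variance of an OLS fit (an intercept is added here as an assumption)."""
    X = np.column_stack([np.ones(len(target)), regressors])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.var(target - X @ beta)

def gci(x, y, p):
    """GCI for X -> Y: ln(restricted residual variance / unrestricted residual variance)."""
    target = y[p:]
    restricted = resid_var(target, lagged(y, p))
    unrestricted = resid_var(target, np.column_stack([lagged(y, p), lagged(x, p)]))
    return np.log(restricted / unrestricted)

# Synthetic system in which x drives y but not the other way round.
rng = np.random.default_rng(3)
n = 500
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.6 * x[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

gci_xy = gci(x, y, p=2)
gci_yx = gci(y, x, p=2)
print(f"GCI x->y = {gci_xy:.3f}, GCI y->x = {gci_yx:.3f}")
```

Because the two models are nested and fitted on the same sample, the in-sample index is never negative; a value close to zero therefore still has to be judged against a threshold, as noted in the text.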

4.2.5 The Conditional Granger Causality Index (CGCI)

One disadvantage of the GCI is that indirect effects of the other variables are not taken into account (e.g. when examining whether X1 Granger causes X2, the activity of all other variables X3, ..., Xn is ignored). The multivariate extension is the conditional Granger causality index (CGCI), introduced because it is very useful in revealing the causal interactions among sets of nodes by eliminating common-input artifacts in the ecological variables [Ding et al. 2006].

Thus, for K data sets let $\{x_t, y_t, z_t\}_{t=1}^{n}$ be the population time series for the three moth species and $\{w_t\}_{t=1}^{n} = \{w_{1,t}, w_{2,t}, \ldots, w_{K-2,t}\}_{t=1}^{n}$ the weather series. If we consider the population Xi as the driving system and the population Yi as the response system, we can further condition both on the system Wi, $W_i = \{W_1, W_2, \ldots, W_{K-2}\}$. Then the restricted MVAR model, in which Xi is absent, is:

$$
y_t = \sum_{i=1}^{p} a_i y_{t-i} + \sum_{i=1}^{p} A_i z_{t-i} + \varepsilon_{1,t} \qquad (4.2.5.1).
$$

Respectively, the unrestricted MVAR model, in which Xi is present, is:

$$
y_t = \sum_{i=1}^{p} a_i y_{t-i} + \sum_{i=1}^{p} b_i x_{t-i} + \sum_{i=1}^{p} A_i z_{t-i} + \varepsilon_{2,t} \qquad (4.2.5.2),
$$

where $z_t$ denotes the conditioning variables.

The conditional Granger causality index equals:

$$
\mathrm{CGCI}_{X \to Y|Z} = \ln \frac{\sigma^2(\varepsilon_{1,t})}{\sigma^2(\varepsilon_{2,t})} \qquad (4.2.5.3).
$$

If we remove the self dependencies of each variable (prewhitening), the conditional Granger causality index is:

$$
\mathrm{CGCI}_{X \to Y|Z} = \ln \frac{\sigma^2(\hat{\varepsilon}_{1,t})}{\sigma^2(\hat{\varepsilon}_{2,t})} \qquad (4.2.5.4),
$$

where $\sigma^2(\hat{\varepsilon}_{i,t})$ are now the variances of the prewhitened error estimators. Additionally, the idea of Granger causality can be extended by using information-based causality measures such as transfer entropy (TE) and partial transfer entropy (PTE) [Papana et al. 2011].
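The effect of conditioning can be sketched on a synthetic common-input example, in which a third series z drives both x and y. The helpers below (`lagged`, `resid_var`, `cgci`), the intercept term and the data are assumptions made for illustration; passing an empty conditioning set reduces the index to the pairwise GCI.

```python
import numpy as np

def lagged(series, p):
    """Matrix with columns series[t-1], ..., series[t-p] for t = p, ..., n-1."""
    n = len(series)
    return np.column_stack([series[p - k: n - k] for k in range(1, p + 1)])

def resid_var(target, regressors):
    """Residual variance of an OLS fit (an intercept is added here as an assumption)."""
    X = np.column_stack([np.ones(len(target)), regressors])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.var(target - X @ beta)

def cgci(x, y, z, p):
    """CGCI for X -> Y given the columns of z (shape (n, m); m may be 0)."""
    target = y[p:]
    base = np.column_stack([lagged(y, p)] + [lagged(col, p) for col in z.T])
    restricted = resid_var(target, base)                                   # x absent
    unrestricted = resid_var(target, np.column_stack([base, lagged(x, p)]))  # x present
    return np.log(restricted / unrestricted)

# Common input: z drives x (lag 1) and y (lag 2); x has no direct effect on y.
rng = np.random.default_rng(4)
n = 600
z = np.zeros(n)
x = np.zeros(n)
y = np.zeros(n)
for t in range(2, n):
    z[t] = 0.7 * z[t - 1] + rng.normal()
    x[t] = 0.8 * z[t - 1] + 0.5 * rng.normal()
    y[t] = 0.8 * z[t - 2] + 0.5 * rng.normal()

pairwise = cgci(x, y, np.empty((n, 0)), p=2)   # no conditioning: pairwise GCI
conditional = cgci(x, y, z[:, None], p=2)      # conditioned on z
print(f"pairwise = {pairwise:.3f}, conditional on z = {conditional:.3f}")
```

In this construction the pairwise index is clearly positive although x has no direct influence on y, while the conditional index is close to zero.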

4.3 Time series networks

To describe the ecological network it is convenient to generate a graph view, which is the simplest mental model for visual explanation. This implies the use of some sort of similarity measure to detect non-trivial links among ecological variables. These are briefly described in the following sections.

4.3.1 Correlation networks and undirected links

Here significant interactions between variables are represented by edges, which are estimated by paired or multivariate statistical comparisons that detect non-trivial links. We are concerned mostly with regular parametric comparisons, although non-parametric randomisation tests will also be briefly discussed for completeness.


Parametric testing of correlation

Let $\rho_{x_i, x_j} = \rho_{ij}$ be the true Pearson correlation coefficient of Xi and Xj, with estimate:

$$
r_{ij} = \frac{\sum_{t=1}^{n} (x_{i,t} - \bar{x}_i)(x_{j,t} - \bar{x}_j)}{(n-1)\, s_i s_j} \qquad (4.3.1),
$$

where $\bar{x}_i$ and $\bar{x}_j$ are the sample means of the first and second variable, respectively, $s_i$ and $s_j$ the standard deviations of each variable and n the series length. Written as a standard similarity measure, the Pearson correlation coefficient is:

$$
\rho_{ij} : r_{ij} = \frac{s_{ij}}{s_i s_j} \qquad (4.3.2),
$$

where $s_{ij}$ is the sample covariance of Xi and Xj. The correlation coefficient takes a value between -1 and +1. Generally, if one variable tends to increase as the other decreases, the correlation coefficient is negative. Conversely, if the two variables tend to increase together, the correlation coefficient is positive. However, although the Pearson correlation coefficient measures the degree of linear relationship between two variables, it does not by itself provide a threshold for significance. A quantitative measure for reporting the correlation between two variables is therefore the p-value used in hypothesis tests, where we either reject or fail to reject a null hypothesis. For Pearson's correlation coefficient, the following test is performed at a particular significance level (i.e. reject the null hypothesis if the p-value < 0.05), with null hypothesis $H_0: \rho_{ij} = 0$ and alternative $H_1: \rho_{ij} \neq 0$. Thus, if we assume that each paired data set is normally distributed and stationary, or:

$$
(X_i, X_j) \sim N\left([\mu_i, \mu_j],\ [\sigma_i^2, \sigma_j^2],\ \rho_{ij}\right) \qquad (4.3.3),
$$

where μ and σ2 are the standard notation for mean and variance, respectively, we proceed to successive two-sided t-tests:

$$
t = r_{ij} \sqrt{\frac{n-2}{1 - r_{ij}^2}} \sim t_{n-2} \qquad (4.3.4)
$$

Additionally, significant correlation can be detected with the z-statistic:

$$
z = \tanh^{-1}(r_{ij}) = \frac{1}{2} \log\left[\frac{1 + r_{ij}}{1 - r_{ij}}\right] \sim N\left(0, \frac{1}{n-3}\right) \qquad (4.3.5)
$$

Tests are performed using several statistical packages, while rigorous calculation and proof require knowledge of mathematical statistics beyond the scope of the current work; the reader should refer to Sokal and Rohlf [1996] or related statistical textbooks for more details.
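The two test statistics (4.3.4) and (4.3.5) can be computed as in the following sketch. The function `corr_tests` is hypothetical; to stay within the standard library it reports a two-sided p-value only for the Fisher z-statistic (standardised by multiplying with the square root of n - 3), using the normal approximation. A p-value for the t-statistic would require the t-distribution from a statistics package.

```python
import math
import numpy as np

def corr_tests(x, y):
    """Return r, the t-statistic (4.3.4), the standardised Fisher z and its p-value."""
    n = len(x)
    r = float(np.corrcoef(x, y)[0, 1])
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    z = math.atanh(r) * math.sqrt(n - 3)       # ~ N(0, 1) under H0
    p_z = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal p-value
    return r, t, z, p_z

# Tiny illustrative sample (not thesis data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
r, t, z, p_z = corr_tests(x, y)
print(f"r = {r:.2f}, t = {t:.3f}, z = {z:.3f}, p(z) = {p_z:.3f}")
```

With only five observations the correlation of this toy sample is not significant at the 0.05 level despite being large, which illustrates why the coefficient alone is not a sufficient criterion for a link.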

Non parametric testing of correlation

In cases of small sample size and absence of information concerning possible deviation from normality, it is wiser to also apply a non-parametric randomised multiple comparison test [Kugiumtzis 2002a, b, 2008]. Because non-parametric tests make no normality assumptions, they are useful for detecting significant interactions in ecological time series which are not wide-sense stationary and deviate from normality.

Here we derive the null distribution of $r_{ij}$ from resampled pairs consistent with $H_0: \rho_{ij} = 0$. Starting from the original pair of ecological time series $(x_i, x_j)$ we generated B randomised pairs $(x_i^{*b}, x_j^{*b})$, b = 1,…,B. Although this random permutation destroys the time order, it preserves the distribution of the original time series. We then compute $r_{ij}^{*b}$ on each pair, and the ensemble $\{r_{ij}^{*b}\}_{b=1}^{B}$ forms the empirical null distribution of $r_{ij}$. H0 is rejected if the sample $r_{ij}$ lies outside this distribution, i.e. in its tails. In the same manner we assess the significance of partial correlations to detect non-trivial links, by testing the null hypothesis $H_0: \rho_{ij|K} = 0$ against the alternative $H_1: \rho_{ij|K} \neq 0$. Randomisations with 100 surrogates were carried out, and the non-parametric comparisons were performed for confirmation.
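The randomisation test can be sketched as follows. The function `permutation_pvalue`, the two-sided rejection rule and the "+1" correction of the empirical p-value are assumptions of this illustration; the text only specifies that B randomised pairs form the null distribution.

```python
import numpy as np

def permutation_pvalue(x, y, B=100, seed=0):
    """Empirical two-sided p-value of r_xy from B randomly permuted pairs."""
    rng = np.random.default_rng(seed)
    r_obs = np.corrcoef(x, y)[0, 1]
    # Each surrogate pair keeps the marginal distributions but destroys time order.
    r_null = np.array([
        np.corrcoef(rng.permutation(x), rng.permutation(y))[0, 1]
        for _ in range(B)
    ])
    exceed = np.sum(np.abs(r_null) >= abs(r_obs))
    return r_obs, (exceed + 1) / (B + 1)

# Synthetic, strongly correlated pair standing in for two population series.
rng = np.random.default_rng(5)
a = rng.normal(size=100)
b = a + 0.3 * rng.normal(size=100)
r_obs, p_val = permutation_pvalue(a, b, B=100)
print(f"r = {r_obs:.2f}, permutation p-value = {p_val:.4f}")
```

With B = 100 surrogates the smallest attainable p-value is 1/101, so a larger B is needed if the p-values are later subjected to a multiple-testing correction.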


4.3.2 Causality networks and directed links

The MVAR framework that was briefly introduced is a powerful set of time- and frequency-domain statistical tools for inferring directional and causal information flow based on Granger's framework, and can further be associated with the construction of directed networks. In the following section we introduce some preliminaries for the construction of the causal weighted and binary networks.

Parametric testing of causality and weighted networks

In the case where the ecological variable (i.e. moth population) Xi is determined by the lagged variable Yi and vice versa, equations 4.2.3.1 and 4.2.3.2, we can jointly test with an F-statistic whether the estimated lagged coefficients αi and δi are different from zero; rejecting the two null hypotheses statistically confirms causal relationships between Xi and Yi. The F-test is typically one-tailed and refers to the F-distribution: if the observed F-statistic exceeds the critical value of the F-distribution for the chosen significance level and degrees of freedom, the null hypothesis is rejected.
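As a hedged illustration of the joint test described above, the F-statistic for the null hypothesis that all p lagged coefficients of the candidate driver are zero can be written in the standard restricted-versus-unrestricted form. The symbols $\mathrm{RSS}_r$, $\mathrm{RSS}_u$ and N are introduced here for illustration and do not appear in the original text; the degrees of freedom assume the intercept-free bivariate model of equations (4.2.3.1) and (4.2.3.2) fitted on N = n - p usable observations, and the F-distribution holds only approximately for autoregressive models.

```latex
F = \frac{(\mathrm{RSS}_r - \mathrm{RSS}_u)/p}{\mathrm{RSS}_u/(N - 2p)} \;\sim\; F_{p,\,N-2p}
```

Here $\mathrm{RSS}_r$ is the residual sum of squares of the model without the lagged driver terms and $\mathrm{RSS}_u$ that of the full model.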

The False discovery rate and binary networks

Since we are dealing here with more than two ecological variables (i.e. 15), and considering that the proposed analysis can be extended to more, it is important to detect probability values that, although significant, could be false positives. The false discovery rate is a multiple testing procedure used to correct the significance of multiple comparison tests [Benjamini and Hochberg, 1995]. Since we have n > 2 variables, we perform pairwise tests simultaneously to detect significant correlation among the ecological time series, which results in m = n(n-1)/2 probability values. However, some of them are not significant and should be rejected. We therefore find the maximum probability value, say $p_k$, for which it holds:

$$
p_k \le \frac{\alpha k}{m} \qquad (4.2.6.1),
$$

where α is a particular significance level (usually set to 0.05) and the p-values are ordered from the smallest to the largest: $p_1 \le p_2 \le p_3 \le \ldots \le p_m$. The tests with p-values up to $p_k$ are retained as significant. Since only these significant values are indicated, the result is a binary adjacency matrix from which several graph measures can be calculated.
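A minimal sketch of the Benjamini-Hochberg step for an undirected network follows. The function `bh_adjacency` and the small p-value matrix are invented for illustration; the number of simultaneous tests (written m in the code) is the number of distinct pairs.

```python
import numpy as np

def bh_adjacency(pvals, alpha=0.05):
    """Binary adjacency matrix from a symmetric (n, n) matrix of pairwise p-values."""
    n = pvals.shape[0]
    iu = np.triu_indices(n, k=1)            # each unordered pair once
    p = pvals[iu]
    m = len(p)                              # number of simultaneous tests
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    A = np.zeros((n, n), dtype=int)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])   # largest rank meeting the bound
        keep = order[: k + 1]               # all tests up to that rank are retained
        A[iu[0][keep], iu[1][keep]] = 1
        A = A + A.T
    return A

# Illustrative p-values for four nodes (made-up numbers).
P = np.ones((4, 4))
pairs = {(0, 1): 0.001, (0, 2): 0.009, (0, 3): 0.04,
         (1, 2): 0.03, (1, 3): 0.2, (2, 3): 0.6}
for (i, j), p in pairs.items():
    P[i, j] = P[j, i] = p
A = bh_adjacency(P, alpha=0.05)
print(A)
```

In this example the pairs with p = 0.03 and p = 0.04 fall below the uncorrected 0.05 level but do not survive the correction, so only two links remain.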

4.3.3 Graph theoretic network measures

By definition a graph G consists of a set of vertices $V(G) = \{v_1, v_2, \ldots, v_n\}$ and a set of edges $E(G) = \{e_1, e_2, \ldots, e_m\}$, forming a pair of disjoint sets G = (V, E) such that $E \subseteq [V]^2$. Thus the graph is an ordered pair (V(G), E(G)) of vertices and edges, together with an incidence function ψG that associates with each edge of G an unordered pair of vertices of G [Bondy and Murty 1976, Bonningto and Little 1995]. To proceed with the construction of graphs based on multivariate time series analysis it is useful to introduce some matrix notation which mathematically formalises the causal network that we intend to construct.

Thus, we define Cij, Pij and SigPij to be n x n matrices with elements aij: the first contains the correlations, $a_{ij} : \mathrm{corr}(x_i, x_j)$, the second the p-values, $a_{ij} : p_{ij}$, and the third the significant p-values (derived using either parametric or non-parametric comparisons). The elements of Pij and SigPij are non-negative. Furthermore, we considered low p-values as having more influence and used them as the weights of the weighted versions of the causality networks. Such a directed network takes into account the varying contribution of each significant causal interaction among the variables of interest.

Finally, we consider the adjacency matrix $A = (a_{ij})_{n \times n}$, with entries determined by the false discovery rate:

$$
a_{ij} := \begin{cases} 1 & \text{if } \{i, j\} \in E \\ 0 & \text{otherwise} \end{cases} \qquad (4.3.3.1),
$$

where {i, j} denotes any pair of vertices that are linked to each other by an edge. A is binary and is used to record the most probable, non-trivial edges joining pairs of vertices.


4.3.3 Ecological network analysis and standard graph metrics

Based on the above mathematical formalism, the interest is now to compress the information that accrues from the statistically derived population networks in order to extract their features. The measures used are standard graph metrics; they are to some extent as simple as descriptive statistics and can therefore be used as qualitative criteria to reveal complex interactions and the modularity of ecological networks. The following representative graph metrics were used:

(i) Number of nodes. Here we consider as nodes all ecological time series with respect to location and species, either biotic (X1-X8: species 1, Y1-Y3: species 2 and Z1, Z2: species 3) or abiotic (W1: mean temperature and W2: relative humidity).

(ii) Number of edges. Here we consider the correlations obtained from the predefined correlation measures, which are represented by the matrices C, P, Ps and A.

(iii) Average node degree ($\bar{d}$). By definition, the degree d(u) of a node u is the number of edges linked to u. The average degree is:

$$\bar{d} = \frac{1}{n} \sum_{u \in V} d(u) \qquad (4.3.3.1)$$

If we denote the minimum degree by $\delta(G) := \min\{d(u) : u \in V\}$ and the maximum degree by $\Delta(G) := \max\{d(u) : u \in V\}$, then $\delta(G) \le \bar{d}(G) \le \Delta(G)$, and it can be shown that for any graph with m edges:

$$\sum_{u \in V} d(u) = 2m \qquad (4.3.3.2)$$

(iv) Clustering coefficient (also referred to as cliquishness). The clustering coefficient is a density measure of local connections:

$$\bar{C} = \frac{1}{n} \sum_{i=1}^{n} C_i \qquad (4.3.3.3)$$

where $C_i$ is the local clustering coefficient. For a directed graph in which vertex i has a neighbourhood $N_i$ of $k_i$ vertices, and $e_{jk}$ denotes an edge connecting vertices j and k, $C_i$ is given as:

$$C_i = \frac{|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i (k_i - 1)} \qquad (4.3.3.4)$$

Since for an undirected graph $e_{ij}$ and $e_{ji}$ are considered identical, $C_i$ is respectively:

$$C_i = \frac{2\,|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i (k_i - 1)} \qquad (4.3.3.5)$$

C is a measure of the degree to which nodes in a graph tend to cluster together and is used to define network modularity [Watts and Strogatz 1998]. In a completely connected network, every node is connected to every other node. Such networks are symmetric in that every node has in-links from and out-links to all others.

(v) Connectivity ($k_i$). Connectivity measures how strongly each node is connected:

$$k_i = \sum_{j \ne i} k_{ij} \qquad (4.3.3.6)$$

In unweighted networks, the connectivity $k_i$ equals the number of nodes j that are directly linked to node i. The maximum connectivity is defined as:

$$k_{\max} = \max_j (k_j) \qquad (4.3.3.7)$$

(vi) Network diameter (D). The diameter of G is the maximum eccentricity among the vertices of G. The eccentricity of a vertex v is the maximum distance from v to any other vertex, that is,

$$e(v) = \max\{d(v,w) : w \in V(G)\} \qquad (4.3.3.8)$$

Thus,

$$\mathrm{diameter}(G) = \max\{e(v) : v \in V(G)\} \qquad (4.3.3.9)$$

Equivalently, the diameter is the longest of all the calculated shortest path lengths (see the next metric). The diameter is thus representative of the linear size of a network.

(vii) Shortest paths and characteristic path length. A shortest path between two vertices (or nodes) of a graph is a path for which the sum of the weights of its constituent edges is minimal. Several algorithms solve the shortest path problem, such as Dijkstra's shortest path algorithm. The characteristic path length is calculated by finding the shortest path between all pairs of nodes, adding the lengths up, and then dividing by the total number of pairs:

$$S_p = \frac{\sum \text{shortest paths}}{\text{number of pairs}} \qquad (4.3.3.10)$$

This measure represents the average number of steps it takes to get from one node of the network to another.

(viii) Network radius (R). This is the minimum eccentricity of any vertex. Therefore,

$$\mathrm{radius}(G) = \min\{e(v) : v \in V(G)\} \qquad (4.3.3.11)$$

(ix) Network centralisation ($C_x$) [Yoon et al. 2006]. The network centralization (also known as degree centralization) is given by:

$$\mathrm{Centralization} = \frac{n}{n-2}\left(\frac{k_{\max}}{n-1} - \mathrm{Density}\right) \approx \frac{k_{\max}}{n} - \mathrm{Density} \qquad (4.3.3.12)$$

The centralization is 1 for a network with star topology; by contrast, it is 0 for a network in which every node has the same connectivity. A regular grid network such as a square has centralization 0.

(x) Network density. By definition, this is the ratio of existing edges to potential edges. The density D of a network is the ratio of the number of edges E to the number of possible edges, the latter given by the binomial coefficient $\binom{n}{2}$:

$$D = \frac{2E}{n(n-1)} \qquad (4.3.3.13)$$

Density measures the fraction of interactions among nodes that are causally significant [Seth et al. 2005, 2008].

(xi) Network heterogeneity. This measure reflects the tendency of a network to contain hub nodes [Dong and Horvath 2007]. In principle, the network heterogeneity measure is based on the variance of the connectivity, although definitions differ among researchers. Here it is defined as the coefficient of variation of the connectivity:

$$\mathrm{Heterogeneity} = \frac{\sqrt{\mathrm{var}(k)}}{\mathrm{mean}(k)} \qquad (4.3.3.14)$$

This heterogeneity measure is invariant with respect to multiplying the connectivity by a scalar.
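The degree-based metrics listed above can be sketched in a few lines for an undirected, unweighted network given as a 0/1 adjacency matrix. This is a minimal illustration and not the software used in the study; the function names are hypothetical, the population variance is assumed in the heterogeneity, and nodes with fewer than two neighbours are assumed to have a local clustering coefficient of 0.

```python
import math


def degrees(adj):
    """Connectivity k_i of every node: row sums of the 0/1 adjacency matrix."""
    return [sum(row) for row in adj]


def average_degree(adj):
    k = degrees(adj)
    return sum(k) / len(k)


def density(adj):
    """D = 2E / (n (n - 1)); the degree sum equals 2E (eq. 4.3.3.2)."""
    n = len(adj)
    return sum(degrees(adj)) / (n * (n - 1))


def centralization(adj):
    """n / (n - 2) * (k_max / (n - 1) - Density), as in eq. 4.3.3.12."""
    n = len(adj)
    return n / (n - 2) * (max(degrees(adj)) / (n - 1) - density(adj))


def heterogeneity(adj):
    """Coefficient of variation of the connectivity (population variance assumed)."""
    k = degrees(adj)
    mean_k = sum(k) / len(k)
    var_k = sum((x - mean_k) ** 2 for x in k) / len(k)
    return math.sqrt(var_k) / mean_k


def clustering_coefficient(adj):
    """Mean of the local coefficients C_i for an undirected graph (eq. 4.3.3.5)."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        k_i = len(nbrs)
        if k_i < 2:
            continue  # assumed convention: C_i = 0 when fewer than two neighbours
        # Count the edges that join pairs of neighbours of node i.
        links = sum(adj[a][b] for pos, a in enumerate(nbrs) for b in nbrs[pos + 1:])
        total += 2 * links / (k_i * (k_i - 1))
    return total / n


# A 5-node star: node 0 is the hub, nodes 1-4 are leaves.
star = [[0, 1, 1, 1, 1],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0]]
```

As a plausibility check on the formulas, the first column of Table 4.1 (15 nodes, 63 links) gives an average degree of 2·63/15 = 8.4 and a density of 2·63/(15·14) = 0.6, which are the values reported there.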

4.6 Results

4.6.1 Cross and partial cross correlation networks

Fig. 4.1a depicts the correlation matrices Pij (i.e. the significant measure values, with insignificant values set to zero) and the related weighted networks, with respect to each of the statistical similarity measures used to construct them (i.e. CRCO and PACO). Analogously to Fig. 4.1a, Fig. 4.1b depicts the p-values and the resulting weighted networks for the CRCO and PACO similarity measures, respectively. In both cases, deeper (hot) colours, either in matrices or networks, indicate higher correlations and significances for each pair of variables tested, while light colours and thinner edges are related to lower correlations and significances. These graphical depictions provide a first impression of which ecological variables are more strongly correlated. However, although the α = 0.05 cut-off threshold was used for the construction of the SigPij matrices, it is difficult to judge which variables have stronger connections in the related networks, since these potentially include false positives due to the multiple comparisons. Fig. 4.1c presents the adjacency matrix that results after the FDR correction. These binary values (0 or 1) were used to generate the final ecological networks for each of the two similarity measures. In addition, nodes with higher degrees are drawn larger, and nodes with higher clustering coefficients are drawn in darker colours.

[Figure 4.1, panels (a)-(c): matrices of the 15 time series (both axes: series 1-15) and the corresponding CRCO and PACO networks. Panel titles: (a) parametric p-values; (b) significant measure values (entries with parametric p-value > 0.05 set to 0); (c) FDR (a=0.05) binary matrices.]

Figure 4.1 Cross correlation (CRCO) and partial cross correlation (PACO) statistical similarity measure matrices and the respective network configurations. (a) Significant measure values, (b) p-values and (c) binary values from the False Discovery Rate (FDR) test. Matrices correspond to K=15 ecological time series of length n=285 (3 Julian day intervals; series 1-2: weather data, series 3-15: moth population data).


Since the CRCO-FDR and PACO-FDR networks represent only the significant correlations among the variables, some interesting patterns can be observed. In particular, significant correlations are mostly observed among similar nodes, so that populations of the same species, or the weather variables, are interlinked. For instance, nodes 11, 12 and 13, which belong to populations of A. lineatella, form a triangle. Nodes 14 and 15, which represent populations of G. molesta, are connected, whilst the populations of A. orana (nodes 3, 4, 5, 6, 7, 8, 9 and 10) define a subgraph. Finally, the two weather variables are also connected, a trend which is more evident in the PACO-FDR networks.

4.6.2 Granger causality and conditional Granger causal networks

Figure 4.2 presents the Granger causality (GCI) and conditional Granger causality (CGCI) matrices as well as the respective directed causal networks. In particular, Fig. 4.2a depicts the significant measure values of the GCI and the CGCI, while Fig. 4.2b presents the p-values after the pairwise multivariate analysis and parametric hypothesis testing, for each case respectively. As in Fig. 4.1, different colours indicate different degrees of statistical correlation among each pair of variables compared (i.e. black boxes in matrices and brown edges in networks). In general, network configurations similar to those of the CRCO and PACO networks are observed. However, since the Granger matrices are not symmetric, the GCI and CGCI causal networks are denser and directed. Based on these network configurations and the respective probability values, it is relatively difficult to indicate which links are non-trivial. Although a threshold of 0.05 is quite acceptable in two-sample testing, there is an increased risk of false positives when performing multiple comparisons. If we perform k independent significance tests, each using a critical value corresponding to a significance level of α, then the probability of making no type I error in any of the k tests is $(1-\alpha)^k$. Thus the probability of making at least one type I error is $\alpha' = 1 - (1-\alpha)^k$, which increases with the number k of tests. Therefore, as previously, to avoid the multiple testing problem, we adjusted the p-value of each test according to the false discovery rate. It is also worth noting that the FDR approach is less conservative than the Bonferroni correction and therefore has greater power to detect truly significant results [Sokal and Rohlf 1996]. Moreover, in the GCI and CGCI causal networks of Fig. 4.2c, big nodes represent variables with higher degrees, while deep-coloured nodes have higher clustering coefficients. Based on these criteria, variables 5 and 10, which correspond to populations of A. orana, have higher out-degrees and clustering coefficients and could be regarded as hubs. In the CGCI-FDR network at least two subgraphs can be observed. The first consists of populations of A. orana, whilst the second consists of variables 14 and 15, which both belong to the same species, G. molesta. Moreover, the two abiotic variables 1 and 2 are related.

Finally, Figure 4.3 depicts two different simulated random network configurations. Fig. 4.3a is a small-world network in which the number of vertices was set beforehand to 15 and the number of linked neighbours to 3, with a 50% replacement probability. Fig. 4.3b is an Erdős-Rényi random network (which usually has a Poisson degree distribution) [Erdős and Rényi 1960], in which the average degree was set to 3 and the number of potential vertices to 10, whilst the initial probability of adding edges was 0.5 and a=0.05. These networks are given for comparison and were generated with characteristics similar to those of the population networks.
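The multiple-testing argument above, $\alpha' = 1 - (1-\alpha)^k$, can be made concrete with a short calculation. The sketch below assumes independent tests, as the text does; the function names are hypothetical, and taking one test per unordered pair of the 15 series is an illustrative assumption about k.

```python
def familywise_error(alpha, k):
    """Probability of at least one type I error in k independent tests."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_level(alpha, k):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / k


def n_pairs(n_series):
    """Number of unordered pairs among n_series variables."""
    return n_series * (n_series - 1) // 2


k = n_pairs(15)                       # 105 pairwise comparisons for 15 time series
print(k, familywise_error(0.05, k))   # chance of at least one false positive
print(bonferroni_level(0.05, k))      # the much stricter per-test Bonferroni level
```

With 105 comparisons the probability of at least one false positive at α = 0.05 exceeds 0.99, which is why uncorrected p-value networks cannot be read link by link.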


[Figure 4.2, panels (a)-(c): matrices of the 15 time series (both axes: series 1-15) and the corresponding GCI and CGCI networks. Panel titles: (a) parametric p-values; (b) significant measure values (entries with parametric p-value > 0.05 set to 0); (c) FDR (a=0.05) binary matrices.]

Figure 4.2 Granger causality (GCI) and conditional Granger causality (CGCI) similarity measure matrices and the respective network configurations. (a) Significant measure values, (b) p-values and (c) binary values from the False Discovery Rate (FDR) test. Matrices correspond to K=15 ecological time series of length n=285 (3 Julian day intervals; series 1-2: weather data, series 3-15: moth population data).


Figure 4.3 Simulated random network configurations: (a) small world (number of vertices=15, number of linked neighbours on each side of a vertex=3, replacement probability=0.5) and (b) Erdős-Rényi random network (average degree of vertices=3, number of vertices=10, edges are set between nodes with equal probability 0.5, a=0.05). Note that the configuration of (a) is closer to the experimentally derived networks of significant correlation measures (i.e. Fig. 4.1a and Fig. 4.2a), while configuration (b) is closer to the p-value networks (e.g. Fig. 4.1b and 4.2b).
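Reference networks of the two kinds shown in Figure 4.3 can be generated with a few lines of code. The sketch below is an assumption-laden illustration: the text does not name the generator or software used, so a Watts-Strogatz style rewiring is assumed for the small-world network and the G(n, p) model for the Erdős-Rényi network; the function names and the seed are hypothetical.

```python
import random


def small_world(n, k_side, p, seed=0):
    """Ring lattice with k_side neighbours on each side, then random rewiring.

    Each lattice edge is replaced with probability p by an edge from the same
    origin vertex to a vertex it is not yet linked to (Watts-Strogatz style).
    """
    rng = random.Random(seed)
    edges = set()
    for i in range(n):
        for step in range(1, k_side + 1):
            j = (i + step) % n
            edges.add((min(i, j), max(i, j)))
    for i in range(n):
        for step in range(1, k_side + 1):
            j = (i + step) % n
            old = (min(i, j), max(i, j))
            if old not in edges or rng.random() >= p:
                continue
            # Candidate targets: any other vertex not already linked to i.
            candidates = [t for t in range(n)
                          if t != i and (min(i, t), max(i, t)) not in edges]
            if not candidates:
                continue
            t = rng.choice(candidates)
            edges.remove(old)
            edges.add((min(i, t), max(i, t)))
    return edges


def erdos_renyi(n, p, seed=0):
    """G(n, p): every unordered pair is linked independently with probability p."""
    rng = random.Random(seed)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p}


# Parameter values quoted in the caption of Figure 4.3.
sw_edges = small_world(n=15, k_side=3, p=0.5)
er_edges = erdos_renyi(n=10, p=0.5)
```

Rewiring moves edges without adding or removing any, so the small-world network keeps the 45 edges of the original ring lattice, whereas the number of edges of the G(n, p) network varies from one realisation to the next.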


4.6.3 Causal force directed network layouts

Figures 4.4 and 4.5 are force-directed layouts that correspond to the same networks described in 4.6.4; their construction was based on a time lag of τ=1. In these configurations the initial properties of the biological variables are more easily identified. All configurations clearly show the presence of subgraphs that consist of populations belonging to the same species.

Figure 4.4 Force-directed network layouts for the correlation (CRCO and PACO) and causality (GCI and CGCI) parametric analyses. Links correspond to correlations that remain significant after the false discovery rate correction (a=0.05 threshold; Xi: A. orana, Yi: A. lineatella and Zi: G. molesta, Temp: mean temperature, RH: relative humidity).


The weather variables do not seem to have a significant driving role in most cases. This probably suggests that, although the environmental variables (especially temperature) have a great impact on insect population ecology, their effect is diminished in the current network analysis. Moreover, some significant cross-correlations in the population time series that could be attributed to population autocovariances are removed after prewhitening.

4.6.4 Graph theoretic metrics

The network properties were compared with respect to each correlation measure that was used to construct them (i.e. CRCO, PACO, GCI and CGCI). In addition to the binary networks, we decided to depict weighted networks as well, since they carry information on the degree of dependence among different populations, which is usually neglected when ecological relations are interpreted on the basis of binary networks alone.

Thus, for each matrix (i.e. Cij, Pij and Aij) the set of graph metrics is presented in Table 4.1. Since the configurations of the weighted networks are similar to that of a complete graph (i.e. all possible links among nodes are included), they have the same metrics. In contrast, the structural differences of the FDR networks are reflected in their graph metrics. For instance, the number of interactions (edges) is considerably higher in the CRCO-FDR networks than in the PACO-FDR networks, and this trend is reflected in the average degree of each graph. This is not counter-intuitive, since PACO restricts the correlations captured by CRCO. In other words, it suggests that some of the correlations between the initial population variables reflected in the CRCO-FDR networks may depend on the same third variable. Moreover, since the PACO-FDR networks keep only the links of direct dependence, the related graphs display simpler network configurations. The average clustering coefficient was also lower for the PACO-FDR network, indicating a lower degree to which the population variables tend to cluster together. It also provides a picture of the modular organisation of each network configuration, which for the CRCO-FDR networks seems to be closer to scale-free configurations. However, based on the number of connected components, the connectivity patterns of the CRCO-FDR and PACO-FDR networks are equal. This means that the groups of ecological variables that are connected to each other by significant links, and not connected to additional variables in the graph, are quite similar in these cases. Generally, the number of connected components indicates the connectivity of a network: a lower number of connected components suggests stronger connectivity. Network diameter and radius (i.e. the minimum eccentricity of any vertex) were higher in the PACO-FDR network, which thus has the greatest distances between pairs of ecological variables. For disconnected network configurations the diameter corresponds to the maximum of the diameters of the connected components (for complete networks, for example, the diameter is 1, since every node is connected to every other node). Centralisation, although higher in the CRCO-FDR than in the PACO-FDR networks, was very similar in the two. For structural networks in ecology, centralisation can have a direct biological interpretation, since it measures the extent to which a network is centred on particular species (hubs, i.e. heavily linked nodes in graph theory) around which other species connect [Everett and Borgatti 1999, DuPont and Nielsen 2006, DuPont et al. 2009]. Networks with high centrality tend to be star-shaped (e.g. pollination networks [Jordano et al. 2006]). We used network density as a comparative measure of network size. Because this index relates the total number of interactions to the number of possible interactions, it takes values from 0 to 1 (e.g. the weighted graphs, being complete, have density 1). As previously, the PACO-FDR networks exhibit lower values than the CRCO-FDR networks, in which more than half of the possible links are present. However, according to the network heterogeneity index, the tendency to contain hub nodes was higher in the PACO-FDR networks.
The metrics of the GCI-FDR and CGCI-FDR networks were in general lower than those of the CRCO-FDR and PACO-FDR networks, with the exception of the network heterogeneity. Thus, although GCI and CGCI resulted in simpler networks, their configurations expressed a higher tendency to contain hub nodes (e.g. variables 7, 8 and 10, which all correspond to populations of A. orana, are highly interconnected). Moreover, as in the case of removing the partial dependences among cross-correlations, the CGCI-FDR analysis yielded adjacency matrices with fewer unities than the GCI-FDR matrices. This trend is reflected in all graph metrics displayed in Table 4.1.
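The path-based quantities discussed above (number of connected components, diameter of a possibly disconnected network, number of shortest paths and characteristic path length) can be sketched with a breadth-first search. This is an illustration for an undirected, unweighted network and not the tool used in the study; the function names are hypothetical. Shortest paths are counted over ordered pairs of distinct nodes, which is consistent with the 210 = 15·14 paths reported for the connected 15-node networks in Table 4.1.

```python
from collections import deque


def bfs_distances(adj_list, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj_list[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_metrics(n, edges):
    """Connected components, diameter and characteristic path length."""
    adj_list = {u: set() for u in range(n)}
    for u, v in edges:
        adj_list[u].add(v)
        adj_list[v].add(u)
    all_dist = {u: bfs_distances(adj_list, u) for u in range(n)}
    # Nodes in the same component share the same set of reachable nodes.
    components = len({frozenset(d) for d in all_dist.values()})
    # Eccentricity is taken within each node's own component, so the diameter
    # of a disconnected network is the largest component diameter.
    diameter = max(max(d.values()) for d in all_dist.values())
    # Ordered pairs of distinct nodes that are joined by some path.
    lengths = [d for u in range(n) for v, d in all_dist[u].items() if v != u]
    char_path = sum(lengths) / len(lengths) if lengths else 0.0
    return {"components": components,
            "diameter": diameter,
            "n_shortest_paths": len(lengths),
            "characteristic_path_length": char_path}


# A 3-node path plus a separate linked pair: two connected components.
metrics = path_metrics(5, [(0, 1), (1, 2), (3, 4)])
```

For this toy network the function reports two components, a diameter of 2, eight shortest paths and a characteristic path length of 1.25.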

4.6.5 Landscape topology of moth population networks

Figure 4.5 shows the landscape topological projections of the GCI-FDR and CGCI-FDR causality networks. Each node represents the location at which the moth populations were captured during all observation years, and arrows represent the significant correlations. In the topological projection of the GCI-FDR network a higher number of interactions was observed than in that of the CGCI-FDR network (Figure 4.5), since some of these correlations were removed by the CGCI-FDR analysis. In all cases no significant inter-species correlations were indicated; according to the analysis, only populations that belong to the same species seem to interact.

Table 4.1 Graph metrics of moth population networks of non-prewhitened time series, with respect to each correlation and causality measure that was applied to construct them.

Graph statistic               Correlation networks      Causality networks
                              CRCO-FDR    PACO-FDR      GCI-FDR     CGCI-FDR
Nodes (n)                     15          15            15          15
Links (m)                     63          23            19          9
Average degree (d)            8.4         3.067         2.133       1.2
Clustering coefficient (C)    0.786       0.333         0.253       0.156
Connectivity (Cc)             1           1             3           1
Diameter (D)                  3           6             1           3
Radius (R)                    2           3             3           1
Centralization (Cx)           0.132       0.159         0.236       0.148
Number of shortest paths      210 (100%)  210 (100%)    56 (26%)    34 (16%)
Characteristic path length    1.448       2.886         2.25        1.647
Network density               0.6         0.219         0.152       0.086
Network heterogeneity (h)     0.237       0.346         0.703       1.758


Moreover, based on the topological network configurations of Figure 4.5, it is apparent that some nodes, which represent populations with respect to observation region, are more strongly correlated than others. In the CGCI-FDR network (Figure 4.5b) it appears that the A. orana population X8 Granger causes X2 and X3, and that population X6 Granger causes X3. Moreover, the A. lineatella population Y2 Granger causes Y1, while population Y3 does not interact, and the G. molesta population Z1 Granger causes Z2.

[Figure 4.5, two panels: landscape projections of the sampling locations (nodes X1-X8, Y1-Y3, Z1, Z2), each shown with its FDR (a=0.05) binary matrix (both axes: series 1-15); panel titles: GCI par=1 and CGCI par=1.]

Figure 4.5 Projections of causal population networks, (a) GCI-FDR network and (b) CGCI-FDR network, with respect to sampling points and landscape topology (Xi: A. orana, Yi: A. lineatella and Zi: G. molesta; weather variables are not included).


Moreover, apart from the information on intra-species interactions (e.g. the force-directed layouts), the topological projection of the networks also carries information related to the landscape architecture. Figure 4.6, for instance, indicates possible relations between the spatial arrangement of the biological variables and the landscape structure. Based on the landscape properties, the experimental area can be divided into two major areas, Northwest and Southeast, separated by the Aliakmon River. However, according to the GCI-FDR and CGCI-FDR analyses, interactions across the physical river border are more numerous than interactions among populations within each subregion. This could be the result of several factors and is therefore difficult to interpret. Nevertheless, the results support to some extent the hypothesis that moth population emergence in one location is synchronised with that of nearby populations which emerge a few days later.

[Figure 4.6: landscape projection of the sampling locations (nodes X1-X8, Y1-Y3, Z1, Z2). Legend: inter-river correlations; on river side correlations.]

Figure 4.6 Extrapolation of the CGCI-FDR networks in relation to the topological features of the landscape. The observation region (Veria valley) is divided into two major subregions by the presence of a physical border (Aliakmon River) and according to the properties of the nodes. Hubs correspond to population hot spots. Dashed lines represent significant interactions between the two subregions (i.e. inter-river correlations), while semi-dashed lines represent the significant population interactions indicated inside each subregion.


4.7 Discussion

Based on the general appearance of most networks, a few general remarks are worth noting: 1. Of the two abiotic variables, temperature displays higher degrees and clustering coefficients than relative humidity. Nevertheless, no weather effects were indicated in the prewhitened GCI-FDR and CGCI-FDR networks. 2. Causal relations were indicated only among populations of the same species. 3. Concerning the biotic variables, populations of A. orana have higher degrees and clustering coefficients. However, this could also be related to the higher number of A. orana variables compared to the other species. 4. The GCI-FDR and CGCI-FDR networks affirm that in most cases node-variables which belong to the same entities tend to be significantly associated. 5. The CRCO and PACO networks display graph configurations which are closer to the random scale-free networks.

Moreover, the proposed approach provides a useful means of quantifying the functionality of ecological networks and has some substantial advantages. Specifically, causal measures overcome certain limitations associated with structural networks by (i) introducing mathematical properties which serve to define ecological interactions and relations, (ii) naturally incorporating causal interactions of a probabilistic rather than subjective nature, and (iii) conceptually incorporating the dynamics and properties of the ecological variables. These evolve simultaneously with the surrounding environment and are not captured by a priori, non-weighted, constant relations.

Restricted to an ecological perspective, and leaving mathematical considerations aside, it is worth outlining some issues related to moth population dynamics. For example, the study revealed that populations belonging to the same species appear to interact over a wider area. Moreover, local populations exhibit a (Granger) causal effect on further populations located in nearby regions. In some cases (i.e.
variables X5, X6 and X8), there are populations which can be characterised as hubs and act as multi-driving variables which point to more than one location, while significant reverse connections were also observed. Supporting evidence for this conclusion is that populations mostly interact across barriers (the river) rather than along the same river bank. However, it is difficult to interpret why moth populations mostly interact across the two regions, through a physical barrier, and not within nearby in-habitat regions. One hypothesis is that, since the 'driving population variables' are located mostly close to the river sides, they are favoured by local micro-environmental conditions and emerge a few days earlier and/or develop more quickly. For poikilothermic organisms, and insects in particular, an environment of moderate temperatures and rich in humidity favours reproduction and development [Damos and Savopoulou-Soultani 2008a,b, 2012]. Moreover, in chapter 3 the ACF and PACF indicated very closely lagged population synchronisation of moths that belong to the same species, which seems to affect the current MVAR causal network results. Whether this trend is only an intra-species, short-lagged, multivariate population synchronisation which is captured by the causal network, or additionally describes the ability of moth populations to move over short distances, is unknown. For instance, it has been shown, both in the field and in the laboratory, that short in-habitat flights are age dependent, with the probability of long flights being highest between 2 and 7 days after moth emergence. For A. lineatella this period corresponds to the first week of the adult life span, before and during the major egg-laying activities [Damos and Savopoulou-Soultani 2012a]. Additionally, synchronisation may offer successful mating opportunities [Damos et al. 2011].
In conclusion, the analysis described in the current chapter demonstrates the usefulness of causal network perspectives in describing the functional connectivity of some representative moth populations and marks a shift from a phenomenological towards a more mechanistic approach. The causal networks provide an alternative means of distinguishing driving from modulatory connections, since the approach does not depend on a priori assumptions of connectivity. Moreover, combined with graph theory, the approach provides valuable insights into the structural and functional organization of the related ecological networks. However, its use for comparing network topologies is not without challenges; the major difficulty arises from the fact that network structure and the related measures depend on the number of nodes and edges in a way that is specific to the final network topologies. To address this, one should extend the current analysis to more of the available empirical data, in order to look more deeply into the modulation of causal interactions and gain more insight into the functionality of population networks. Despite these concerns, the current work provides a general means to study several kinds of causal ecological interactions and pinpoints region-specific network features by detecting the location of hubs (hot spots) or the occurrence of communities (modules) in the moth population network. The latter can be beneficial for the rational management of moth pests based on ecological perspectives.


5. Concluding Remarks

5.1 Population cycling and seasonality

In this work we have considered the seasonal life cycle fluctuations of multivoltine insects within the context of stochastic processes and autoregressive models, and we have analysed for the first time ecological time series of three host-related micro moths: Adoxophyes orana (Lepidoptera: Tortricidae), Anarsia lineatella (Lepidoptera: Gelechiidae) and Grapholitha molesta (Lepidoptera: Tortricidae). All of the above species appear simultaneously in fruit orchards. Hence, detecting model order and seasonality may improve the rational management of this moth complex. Additionally, since stochastic modelling provides useful insights into the nature and relative importance of endogenous and exogenous effects on population dynamics [Dennis et al. 1994, Hilborn and Mangel 1997, Ives and Jansen 1998, Kendall et al. 1999, Ives et al. 2003, Dennis et al. 2006], we also aimed to detect the type and extent of feedback actions in order to construct spatially distributed ecological networks (see Chapter 3) based on time series analysis and statistical causality measures. From a statistical standpoint, the time series of the moth population system should ideally reflect the technical and conceptual framework of the process, and any kind of time-lagged feedback can be used to draw inferences about its behaviour under certain boundary conditions. In the current work we have found that most of the observed moth population processes are AR(1) stochastic processes. In ecology, this represents a first-order negative population feedback. First-order negative feedbacks are considered to be the result of intraspecific interactions, such as competition for food or space [Berryman and Lima 2007], while second-order feedbacks are related to interspecific interactions between consumers and their resources [Royama 1992, Berryman 1999, Turchin 2003]. These first- and second-order negative feedbacks are also referred to as level 1 and level 2 negative feedbacks and have also been observed in related studies of other species [Levins 1974, Lande et al. 2006, Berryman and Lima 2007].


In fact the demographic and environmental stochasticity in age-specific vital rates of A. orana, A. lineatella and G.molesta acting through time lags of the species specific life history. These results are generally in accordance to related studies performed for other species [Pollard et al. 1987, Dennis and Taper 1994, Ives et al. 2003, Dennis et al. 2006, Dennis et al. 2010]. Thus, a first order lagged structure which was observed in most data sets of the current work indicate probably that the underlying mechanisms that drive the observed dynamics should be addressed to the species intrinsic capacities to increase rather than to interactions with other species. Generally, the greater the number of lags, the more key interacting species there are likely to be [Royama 1992]. Moreover, the AR(1) models that were constructed in this work are in principle very closely connected to the discrete-time, stochastic Gompertz models which represent the density dependent growth of a population, while the noise process which is included represents environmentally induced fluctuations in the logarithmic population growth rate. Actually, the Gompertz model cast on the logarithmic scale and following a simple parameterisation results to an AR(1) model [Dennis et al 2010]. Furthermore, as in the case of the stochastic Gompertz, the AR(1) the moth population models that were structured, were based on the same assumption; that population abundance is known without error, with the random fluctuations being driven only by ecological processes. One other important information that revealed the current study is that in most cases populations belong to the same species cycling by similar manners. This result is supported by the patterns of the ACF and PCF. Thus, in most cases, population of A. orana and A. 
lineatella appeared to be highly synchronized and a seasonal (pseudo-periodic) pattern was observed, while in populations of G. molesta a periodic trend is more difficult to observe. These results are in accordance with prior studies reporting that populations of A. orana and A. lineatella are usually characterized by a fixed number of generations throughout the growing season [Damos and Savopoulou-Soultani 2010], in contrast to G. molesta, whose population patterns and number of generations are often unpredictable and fluctuate from year to year. From a practical standpoint, considering that the extent of damage by pest species, either in time or space, may have devastating consequences [Myers, 1988, 1993, 1998], detecting the seasonality of species populations is important not only for model construction but also for predicting the extent of the damage and initiating pest management measures at specific time points.
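The correspondence between the stochastic Gompertz model and the AR(1) model noted above can be illustrated numerically. The following is a minimal sketch, assuming the common parameterisation N(t+1) = N(t) exp(a + b ln N(t) + E(t)) with Gaussian noise E(t); the parameter values, the seed and the function names are illustrative assumptions and are not taken from the moth data. On the logarithmic scale X(t) = ln N(t) the model becomes X(t+1) = a + c X(t) + E(t) with c = 1 + b, which is an AR(1) process, so a least-squares fit of X(t+1) on X(t) should approximately recover c.

```python
import math
import random


def simulate_gompertz(a, b, sigma, n_steps, n0=10.0, seed=1):
    """Simulate abundances N(t) from a discrete-time stochastic Gompertz model.

    N(t+1) = N(t) * exp(a + b * ln N(t) + E(t)),  E(t) ~ Normal(0, sigma^2).
    All parameter values used below are illustrative assumptions.
    """
    rng = random.Random(seed)
    abundances = [n0]
    for _ in range(n_steps):
        n_t = abundances[-1]
        growth = a + b * math.log(n_t) + rng.gauss(0.0, sigma)
        abundances.append(n_t * math.exp(growth))
    return abundances


def fit_ar1(series):
    """Least-squares fit of x(t+1) = intercept + slope * x(t)."""
    x_prev, x_next = series[:-1], series[1:]
    mean_prev = sum(x_prev) / len(x_prev)
    mean_next = sum(x_next) / len(x_next)
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(x_prev, x_next))
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    slope = cov / var
    intercept = mean_next - slope * mean_prev
    return intercept, slope


a, b, sigma = 1.2, -0.4, 0.2          # hypothetical Gompertz parameters
abundances = simulate_gompertz(a, b, sigma, n_steps=5000)
log_abundances = [math.log(n) for n in abundances]

intercept_hat, slope_hat = fit_ar1(log_abundances)
print("expected AR(1) coefficient c = 1 + b:", 1 + b)
print("estimated AR(1) coefficient:", round(slope_hat, 3))
```

A negative b (so that c lies between 0 and 1) corresponds to the first-order negative feedback discussed in this section; the fit is carried out on log abundances because the equivalence holds only on that scale.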

5.2 Moth population causal networks

In this work, we have developed causal population models based on time series analysis combined with graph theory. Additionally, we have considered not only biotic variables (i.e. populations) but also abiotic variables (i.e. weather data). As a case study we have considered the question of evaluating causal relations among moth populations. In particular, we have studied the spatial interactions among a set of ecological time series including two abiotic variables and thirteen moth populations. In this context, we propose several causality measures to assess the significance of correlations and we demonstrate the application of these measures to construct the related causal networks. The resulting causal relations among populations are further projected onto the landscape topology. We have found in particular that (i) populations that ‘interact’ belong to the same species (intraspecies causal population relations); (ii) weather variables (namely temperature and relative humidity) appear to have no causal relation to the populations, which is an interesting finding considering that moths are insects and, as in all poikilothermic organisms, their growth is directly affected by temperature; and (iii) populations ‘interact’ (i.e. exhibit a form of causal relation) across physical barriers such as the Aliakmon River. This is to some degree unexpected for terrestrial organisms, since one would expect interactions to appear between adjacent locations. Concerning moth population dispersal of the above species, a comparison of the current results on an ecological-behavioural basis is difficult to make because there are no related studies. One reason for this is the great difficulty in tracking moth dispersal in the field. This limits the evaluation of hypotheses about moth dispersal in the particular study region. Generally, it is known that there are differences in movement behaviour between intra-habitat or appetitive short-distance flight and inter-habitat or ‘migratory’ long-distance flight for insect groups such as noctuid moths [Sappington and Showers, 1991]. While short-distance (i.e. within-orchard) appetitive movement is mostly performed by larval stages, it is most likely that inter-habitat long-distance movement is related to specific moth traits such as reproductive maturity and mating status. An obvious inter-habitat migratory behaviour is observed in aphid species which are characterised by alary dimorphism [Nottingham and Hardie, 1989]. It is also believed that species inhabiting temporary habitats are more likely to express migratory behaviour [Southwood, 1977]. Furthermore, predator avoidance or reduced activity and feeding rates also affect ecological relations [Sih 1993, Werner 1993], while foraging behaviour is also beginning to be recognized as playing an important role within ecological structures, since it alters the strength of interactions within the network [Beckerman et al. 1997, Schmitz et al. 1997, Beckerman et al. 2006]. These findings are related to the current study and support the idea that moth populations may move over short landscape distances to mate or feed. Such migratory behaviours could be captured by the networks. Additionally, the fact that all three pests share the same host and that only intraspecies interactions have been observed strengthens the possibility of mating behaviour.
Thus the aforementioned factors, as well as several others that were not investigated in this study [Damos and Savopoulou-Soultani 2010, 2012], can affect moth movement and, consequently, the inference of causal relations. In some cases (e.g. populations of A. lineatella and G. molesta), significant reverse connections were also observed. Such modular relations are the result of non-symmetric adjacency matrices and are generally difficult to interpret. However, in terms of functional connectivity they may provide an intuitive and valuable representation of the multiple population interactions that can be realised within the system. Therefore, the network time series analysis we followed is intended mainly for the investigation of causal relations per se, rather than for inference about flight behaviour and the spatial arrangement of the species' populations. Moreover, by extending the population network to include more variables, it should be possible to explore inter- and intraspecies relations in larger ecological systems, and to identify specific population traits that might constrain their structures. From a practical standpoint and from a network perspective, some region-specific populations can be characterised as hubs (e.g. populations of A. orana in locations Paliomana, Metohi and Vergina 2 are heavily linked to other regions), since they emerge as multi-driving variables which point to more than one location. Based on this information one can judge whether these multi-driving variables are the cause of a lagged moth population synchronisation and indicate the emergence of populations a few days later in nearby regions. One additional and essential feature of the current work was to relate the conditional association of multivariate variables to a network based on sophisticated modelling approaches, in contrast to a priori assumptions that define the degree of connectance among taxonomic units. Indeed, ecological networks are most likely to express temporal dynamics [Traveset and Saez 1997, Olesen et al. 2008]. Long-term observation of pollination networks, for example, revealed fluctuations in species interactions [Petanidou et al. 2008, Dupont et al. 2009], while the sampling procedure and landscape properties during observation may also affect the network structure [Goldwasser and Roughgarden 1997, Banasek-Richter et al. 2004]. Although the constructed networks do not fall into any ecological category in a strict sense (e.g. food webs), the introduction of multivariate population time series analysis may open a new window on the study of complex ecological network organization in which both abiotic and biotic dynamic variables are considered.
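The notions of hub populations and of reverse connections discussed above can be made concrete with a small directed adjacency matrix. The sketch below assumes the convention that an entry equal to 1 in row i and column j means that population i drives population j; the node labels and the matrix itself are hypothetical and do not reproduce the networks estimated in this work. Hubs are taken here, as an assumption, to be nodes with an out-degree of at least 2, i.e. driving variables that point to more than one location.

```python
# Hypothetical node labels and directed adjacency matrix (not the thesis data).
# Convention assumed here: adjacency[i][j] == 1 means node i drives node j.
labels = ["P1", "P2", "P3", "P4", "P5"]
adjacency = [
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
]


def out_degrees(matrix):
    """Number of outgoing links per node (row sums)."""
    return [sum(row) for row in matrix]


def in_degrees(matrix):
    """Number of incoming links per node (column sums)."""
    return [sum(row[j] for row in matrix) for j in range(len(matrix))]


def find_hubs(matrix, names, min_out_degree=2):
    """Nodes that drive at least `min_out_degree` other nodes."""
    return [name for name, k in zip(names, out_degrees(matrix)) if k >= min_out_degree]


def reciprocal_pairs(matrix, names):
    """Pairs of nodes linked in both directions ('reverse connections')."""
    n = len(matrix)
    return [(names[i], names[j])
            for i in range(n) for j in range(i + 1, n)
            if matrix[i][j] and matrix[j][i]]


def is_symmetric(matrix):
    """True only if every link is matched by a link in the opposite direction."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


print("out-degree:", dict(zip(labels, out_degrees(adjacency))))
print("in-degree:", dict(zip(labels, in_degrees(adjacency))))
print("hubs:", find_hubs(adjacency, labels))
print("reciprocal pairs:", reciprocal_pairs(adjacency, labels))
print("symmetric:", is_symmetric(adjacency))
```

In this toy matrix P1 and P3 act as hubs and are also linked in both directions, which is the kind of non-symmetric structure with a reverse connection that is difficult to interpret biologically.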
To apply the method we used the structural information of the autoregressive models developed in Chapter 3 to define the optimum temporal ordering of the simultaneously measured population time series. We then used this information to define the order of the MVAR model and proceeded to the construction of correlation and causality networks. In addition, by removing the correlations inherent to the time series, as well as the partial multivariate interactions among the series, we arrived at the final network configurations. These network configurations represent the most significant relations among the population variables from a statistical point of view, and their graph metrics were further compared. Although FDR analysis results in a binary adjacency matrix, in this work we present both weighted and binary graphs. However, the transformation of connectivity values from a continuous to a binary scale generally entails difficulties and has some drawbacks. In particular, while the binary scale clearly enhances contrast, it also hides potentially important information, namely how far connectivity values lie below or above the threshold. Nevertheless, the general appearance of the weighted population graphs is qualitatively different from that of the binary ones, although very similar connectivity patterns are repeated in both graph categories. The modules are depicted in both the circular and the force-directed layouts. Besides, because weighted graph analysis seeks to preserve the aforementioned information, while the selection of appropriate thresholds is a prerequisite for binary graphs, we finally decided to present all types of network configuration and to indicate differences between networks solely through structural changes. The graph topologies of the causality networks that were finally projected over the observation landscape were characterised using various graph measures. In some cases small-world properties appeared in the network configurations. However, it is essential to note that other properties can also arise, considering that most of these measures depend on network size. For instance, if the network size is altered (e.g. by adding more population variables and species) and the number of nodes and/or connections changes, then one cannot exclude the possibility that the new graph measures will differ even if the network topology remains identical [Dunne et al. 2002].
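The step from continuous connectivity values to a binary graph can be sketched as follows. The example assumes that each ordered pair of variables has a p-value from some causality test and applies the Benjamini-Hochberg step-up procedure [Benjamini and Hochberg 1995] at a chosen level; the p-values, the three-variable size and the function names are invented for illustration, and the code is not the implementation used in this work.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up procedure: sort the m p-values, find the largest rank k with
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda idx: p_values[idx])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


def fdr_adjacency(p_matrix, alpha=0.05):
    """Binary adjacency matrix from a matrix of pairwise p-values.

    p_matrix[i][j] is the (hypothetical) p-value for the link i -> j;
    diagonal entries are ignored.
    """
    n = len(p_matrix)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    decisions = benjamini_hochberg([p_matrix[i][j] for i, j in pairs], alpha)
    adjacency = [[0] * n for _ in range(n)]
    for (i, j), keep in zip(pairs, decisions):
        adjacency[i][j] = int(keep)
    return adjacency


# Invented p-values for three variables; None marks the unused diagonal.
p_matrix = [
    [None, 0.001, 0.300],
    [0.020, None, 0.040],
    [0.600, 0.012, None],
]
binary = fdr_adjacency(p_matrix, alpha=0.05)
for row in binary:
    print(row)
```

In this toy example the link from the second to the third variable (p = 0.040) would pass an uncorrected 0.05 threshold but is dropped after FDR control, and the binary matrix no longer shows how close that value was to the cut-off. This is the loss of information that motivates presenting weighted graphs alongside binary ones.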
Some caution still needs to be exercised when considering general patterns in the moth population network structures, because they rely upon a relatively small number of variables. First, only assumptions can be made as to whether the same relationships hold in other areas. Second, several factors that affect actual population dynamics were not considered in the study, and therefore biological inference remains rudimentary. Third, correlations by themselves are crude (they do not provide explanations), and there is a need for theories and related assumptions that connect our fundamental results to the corresponding phenomenological aspects of population interactions. For instance, similarity and causality measures are of a statistical nature, and there are cases in which significant correlations do not necessarily reflect physical causal chains. Nevertheless, we feel that the current quantitative population analysis tracks the major properties of the ecological system. Generally, across all network types, researchers seek ways to quantify interactions [Bersier et al. 2002; Woodward et al. 2005], while a huge array of different ‘interaction strength’ measures is used depending upon the questions asked and the system considered [Friston 1994, Friston et al. 2003, Berlow et al. 2004]. Hence, the quantitative development of ecological networks described in the current work could be further expanded to include more variables which, ideally, are embedded in larger networks of causal factors reflecting several biotic and environmental constraints. From a practical standpoint, since we compare only structures of similar size (i.e. number of nodes), and in all cases the same threshold was set to convert significant probability values into edges (i.e. pairwise comparisons and FDR for α=0.05), the resulting connectivity and density patterns can be compared on a like-for-like basis and may reveal hidden modules. Moreover, we used a fixed edge density (with the same number of nodes), matching that of the empirically derived networks, to construct representative random networks (i.e. fixing the probability of existence of an edge) for comparison with the experimentally derived modules. We note in particular that the generated random networks have similar structure and average degree to most weighted population networks.
Empirical networks often exhibit small-world characteristics and, indeed, their graph measures will probably have values that lie somewhere between those of a lattice and those of a random network [Bernadette et al. 2010], as in the current work.
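The comparison with random networks of matched size and density can be sketched as follows. The example assumes an Erdős-Rényi-type directed random graph in which every possible link exists independently with a probability equal to the edge density of the empirical network; the 'empirical' matrix, the seed, the number of replicates and the function names are hypothetical.

```python
import random


def edge_density(matrix):
    """Fraction of the n*(n-1) possible directed links that are present."""
    n = len(matrix)
    n_edges = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return n_edges / (n * (n - 1))


def average_degree(matrix):
    """Mean number of outgoing links per node."""
    n = len(matrix)
    return sum(matrix[i][j] for i in range(n) for j in range(n) if i != j) / n


def random_directed_graph(n_nodes, link_probability, rng):
    """Directed random graph: each off-diagonal link exists with the given probability."""
    return [[1 if i != j and rng.random() < link_probability else 0
             for j in range(n_nodes)]
            for i in range(n_nodes)]


# Hypothetical 'empirical' binary network with 5 nodes and 6 links.
empirical = [
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
density = edge_density(empirical)

rng = random.Random(42)
n_replicates = 2000
random_degrees = [
    average_degree(random_directed_graph(len(empirical), density, rng))
    for _ in range(n_replicates)
]
mean_random_degree = sum(random_degrees) / n_replicates

print("empirical density:", density)
print("empirical average degree:", average_degree(empirical))
print("mean average degree of random networks:", round(mean_random_degree, 2))
```

Because the random graphs are built with the same number of nodes and the same link probability, their average degree matches the empirical one by construction; measures such as clustering or path length, which are not fixed by the density, are the ones that can meaningfully be compared against this null model.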

5.3 Population networks in advancement of pest information systems

By definition, agricultural ecosystems (agro-ecosystems) are spatially and functionally coherent units of agricultural activity, and include the living and nonliving components involved in that unit as well as their interactions. In this context, the development of ecological networks that examine and incorporate the effects of abiotic and biotic variables simultaneously provides an integrated approach to modelling spatiotemporal connectivity among ecological objects. Additionally, from a holistic perspective, agricultural processes, and pest management in particular, should be considered as the result of actions that take into account all available information in a combined manner, based on the contribution of all branches of agriculture, the biological sciences and computational applied mathematics. For pest management in particular, the development of population models has been driven by the changes in pest management that farmers regularly face [Dent 1995, Damos 2012], and the results of the current work may add to the knowledge required in agro-ecology and Integrated Pest Management. In particular, since Integrated Pest Management (IPM) is a decision-based process involving the coordinated use of multiple tactics for optimizing the control of all classes of pests (insects, pathogens, weeds, vertebrates) in an ecologically and economically sound manner [Dent 1995, Ehler 2006, IPM Europe 2000], information on the spatiotemporal ‘interactions’ among nearby regions should be of special interest. Although traditional IPM programs use current, comprehensive information on the life cycles of pests and their temporal evolution, their spatial interactions, as well as their relations with additional environmental variables, are usually neglected.
Therefore, this information, in combination with other approaches such as geostatistical analysis and grid sampling, may be used in decision making to manage pests by the most economical means, and with the least possible hazard to people, property, and the environment.

The development and evaluation of IPM is also improved by the use of expert pest information systems, which have lately become a dynamic area of research aiming to incorporate spatiotemporal and other population-related information [Jones 1989, Hassan et al. 1994, Gilman and Green 1998, Green and Klomp 1998, Hannon and Ruth 2009, Plénet et al. 2009, Karabatakis and Damos 2012, Damos et al. 2012]. In this respect, the research knowledge and innovative modelling procedures of the current work offer new options in pest modelling and plant protection science which are in accordance, at least theoretically, with the principles of Integrated Pest Management, precision farming and sustainable agriculture. Ecological network models, combined with other technologies such as real-time pest modelling and the latest Web3 technologies (e.g. semantics, wireless data transmission and smart devices), provide grounds for the development of an efficient information system and intensive data extrapolation. Moreover, combined with grid sampling, population network information provides important feedback for the temporally and locally specific rational use of pesticide and energy inputs. Although the major scope of the current work was to study the spatial arrangement of pest population networks, we should also stress some benefits and major obstacles associated with the incorporation of such results in pest information systems. Related web-based information systems lack compatibility with closely related experimentally derived data sets and do not directly address region-specific population dynamics. Moreover, most of them focus on the development of decision tools for a particular pest and region, and simply provide informational brochures, pesticide recommendations, fact sheets, extension guides and bibliographic references, without providing a framework for multiple interpretations based on ecological facts.
This simply means that it is difficult to integrate multidisciplinary and fragmented knowledge into a functional expert information system.

Sustainable agriculture involves the recognition of site-specific differences within fields and the corresponding adjustment of management actions; these variations can further be traced to management practices, soil properties and/or environmental characteristics. Hence, it is not an exaggeration to state that sustainability is attainable only through the integration of all available new research information and technologies in a complementary manner. Moreover, a great deal of research attention has been devoted by the Food and Agriculture Organization of the United Nations (FAO) to the sustainability of industrial farm systems by means of reducing all kinds of energy inputs. In this context, the current work explores the use of new modelling approaches for studying agricultural systems that are useful in pest decision making and in the application of management actions, namely to indicate regions of high pest activity and to reveal possible population interactions in relation to landscape architecture. The idea is to go beyond conventional practices and to explore and develop agro-ecosystems with minimal dependence on high agrochemical and energy inputs, sustaining a healthy and rich environmental system for future generations [Gliessman 1998, Hatfield 2000]. Such systems will permit self-organization, in which ecological interactions and synergisms between biological components provide the mechanisms for the system to sponsor its own soil fertility, productivity and crop protection [Altieri 1995, Altieri and Rosset 1995, Altieri and Nicholls 2005]. In conclusion, since ecological systems are very complex, we obviously do not claim that the current work provides all the answers needed to fully explain pest spatiotemporal arrangement and connectivity.
However, we do believe that the proposed methods and the related results, if validated over wider areas (by incorporating more data sets) and appropriately incorporated into current web forecasting systems [Damos and Karabatakis 2012], can significantly improve the functionality of those systems. Pest-insect landscape ecology and related farm system managers could benefit from the current time series network analysis. In particular, information on pest populations may significantly improve collaborative efforts among professionals in neighbouring areas or countries, thus greatly enhancing the quality of information that should be channelled to specific areas of rural development and on a wider scale. The latter will enhance the spatiotemporal prediction of pest intensity and improve integrated management. Benefits are expected to become directly visible through the rational use of resources, such as reduced energy flows to agro-ecosystems (e.g. pesticides), and through decreased production costs and improved productivity, quality and commodity prices. Additionally, our next goal is the study of potential population distribution and the development of pest risk mapping modules, in which the population models are implemented in a real-time GIS forecasting grid environment that allows for spatial (regional or global) simulation and mapping of species spatial interactions.

References

Aebersold, R. and M. Mann, 2003. Mass spectrometry-based proteomics. Nature, 422, 198–207.

Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Automat. Control, 19, 716-723.

Albert, R. and A. Barabasi, 2002. Statistical mechanics of complex networks, Rev. Mod. Phys. 74, 47–97.

Altieri, M.A., 1995. Agroecology: The Science of Sustainable Agriculture. Boulder, CO: Westview Press.

Alon U. 2007. An introduction to systems biology. Design and Principles of Biological Circuits. Chapman and Hall/CRC Mathematical and Computational Biology Series. NY.

Altieri, M.A. and P.M. Rosset, (1995). Agroecology and the conversion of large-scale conventional systems to sustainable management. International Journal of Environmental Studies 50: 165-185.

Altieri, M.A. and C.I. Nicholls, 2005. Agroecology and the Search for a Truly Sustainable Agriculture. United Nations Environment Programme, Environmental Training Network for Latin America and the Caribbean, 1st edition.

Amaral L. A. N., A. Scala, M. Barthelemy, and H. E. Stanley, 2000. Classes of small-world networks. PNAS, 97, 11149–11152.

Antoniou I. E. and Tsompa E. T. 2008. Statistical Analysis of Weighted Networks, Discrete Dynamics in Nature and Society, 1-16.

Balachowsky, A.S., 1966. Entomologie Appliquée à l’Agriculture. Traité. Tome II. Lépidoptères. Masson et Cie éditeurs, Saint Germain, Paris.

Banasek-Richter, C., Cattin, M.-F. and L. F. Bersier, 2004. Sampling effects and the robustness of quantitative and qualitative food-web descriptors. Journal of Theoretical Biology, 226, 23–32.

Barabasi, A.L. and R. Albert, 1999. Emergence of scaling in random networks. Science, 286, 509–511.

Barabási A.L., H. Jeong, Z. Neda, E. Ravasz, A. Schubert, T. Vicsek, 2002. Evolution of the social network of scientific collaborations. Physica A 311, 590-614

Barabási, A. L. and Z. N. Oltvai, 2004. Network biology: understanding the cell's functional organization. Nature Reviews Genetics, 5, 101–113.

Bascompte J., P. Jordano, C.J. Melian and J. M. Olesen, 2003. The nested assembly of plant-animal mutualistic networks. Proc Natl Acad Sci, 100, 9383–9387.

Bascompte, J., P. Jordano and J. M. Olesen. 2006. Asymmetric coevolutionary networks facilitate biodiversity maintenance. Science, 312, 431–433.

Bascompte, J. and P. Jordano, 2007. Plant-animal mutualistic networks: The architecture of biodiversity. Annu. Rev. Ecol. Evol. Syst., 38, 567–593.

Beckerman, A.P., Uriarte, M. and O.J. Schmitz, 1997. Experimental evidence for a behavior-mediated trophic cascade in a terrestrial food chain. Proceedings of the National Academy of Sciences, USA, 94, 10735–10738.

Beckerman, A.P., Petchey, O.L. and Warren, P.H., 2006. Foraging biology predicts food web complexity. Proceedings of the National Academy of Sciences, USA, 103, 13745–13749.

Bentarzi M. and A. Aknouche, 2005. Calculation of the Fisher Information Matrix for periodic ARMA models. Communications in Statistics-Theory and Methods, 34, 891-903.

Benjamini, Y. and Y. Hochberg, 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B (Methodological), 57, 289-300.

Berlow, E.L., Neutel, A.M., Cohen, J.E., et al., 2004. Interaction strengths in food webs: issues and opportunities. Journal of Animal Ecology, 73, 585–598.

Bersier, L.F., Banasek-Richter, C. & Cattin, M. (2002) Quantitative descriptors of food web matrices. Ecology, 83, 2394–2407.

Berryman, A. A. 1994. Population dynamics: forecasting and diagnosis from time series. Pages 119–128 in K. E. F. Watt, S. A. Leather, and D. M. Hunter, editors. Individuals, populations and patterns in ecology. Intercept, Andover, England.

Berryman, A. A. 1999. Principles of population dynamics and their application. Stanley Thornes, Cheltenham, UK.

Berryman A. and M. Lima, 2007. Detecting the order of population dynamics from time series: nonlinearity causes spurious diagnosis. Ecology, 88, 2121–2123.

Boccaletti, S., V. Latora, Y. Moreno, M. Chavez and D. U. Hwang, 2006. Complex networks: Structure and dynamics. Phys. Rep., 424, 175–308.

Bondy, J. A. and U.S.R. Murty, 1982. Graph theory with applications. Elsevier Science Publishing Co., Inc.

Bondy J.A. and Murty U.S.R., 2008. Graph Theory. Springer, US

Bonnington C.P. and C.H.C. Little, 1995. The Foundations of Topological Graph Theory. Springer, New York.

Box G. and G. Jenkins, 1970. Time series analysis: Forecasting and control, San Francisco: Holden-Day.

Borgatti, S. P. and Everett, M. G. 1997. Network analysis of 2-mode data. Social Networks, 19(3): 243-269.

Box, G.E.P. and G.M. Jenkins, 1976. Time series analysis: forecasting and control. Holden-Day, San Francisco.

Box, G.E.P., G.M. Jenkins and G.C. Reinsel, 1976. Time series analysis. John Wiley & Sons.

Buonaccorsi J. P., Elkinton J. S., Evans S.R., Liebhold A. M. 2001. Measuring and testing spatial synchrony. Ecology 82, 1628–1679.

Blüthgen, N., Menzel, F., Hovestadt, T. & Fiala, B. (2007) Specialization, constraints, and conflicting interests in mutualistic networks. Current Biology, 17, 341–346.

Ceder, A. and Wilson N. H. M., 1986. Bus network design. Transportation Research Part B: Methodological, 20, 331–344.

Chatfield J. R. 1989. The analysis of time series: an introduction. Chapman & Hall, London

Costa, L. d. F., F.A. Rodrigues, G. Travieso and P. R. V. Boas, 2007. Characterization of complex networks: A survey of measurements, Adv. Phys. 56, 167–242.

Cohen, J. E., T. Jonsson and S.R. Carpenter, 2003. Ecological community description using the food web, species abundance, and body size. Proceedings of the National Academy of Sciences, 100, 1781–1786.

Damos, P., Savopoulou-Soultani, M., (2008a). Temperature dependent bionomics and modeling of Anarsia lineatella (Lepidoptera: Gelechiidae) in the laboratory. J. Econ. Entomol. 101, 1557-1567.

Damos, P. and Savopoulou-Soultani M. (2008b). Des notes sur la préférence de ponte par Anarsia lineatella (Lepidoptera: Gelechiidae) sur le pêcher et des autres hôtes en laboratoire. IOBC/wprs Bulletin, 37, 35-42

Damos P. and M. Savopoulou-Soultani, 2010. Development and statistical evaluation of models in forecasting major lepidopterous peach pest complex for integrated pest management programs. Crop Protection, 29, 1190-1199.

Damos P. and M. Savopoulou-Soultani, 2011a. Microlepidoptera of Economic Significance in Fruit Production: Challenges, Constrains and Future Perspectives for Integrated Pest Management. In: Moths: Types, Ecological Significance and Control. Editor: Luis Cauterruccio, Nova Science Publications, (Chapter 3).

Damos P., A. Rigas and M. Savopoulou-Soultani, 2011b. Application of Markov Chains and Brownian motion models on Insect Ecology. In Brownian Motion: Theory, Modelling and Applications, Editors: Robert C. Earnshaw and Elizabeth M. Riley, Imprint: Nova Science Publications, ISBN: 978-1-61209-537-0, (Chapter 2: pp. 71-104).

Damos, P. and M. Savopoulou-Soultani, 2012. Temperature-Driven Models for Insect Development and Vital Thermal Requirements. Psyche Volume 2012, 1-13.

Damos, P. 2012. Insect efficient progeny distribution and demographic Entropy. Demographics and Stochastic Modelling Techniques and Data Analysis, International Conferences, Crete, June 5 - 8, Greece.

Damos, P. 2013. Current Issues in integrated pest management of Lepidoptera Pest threats in Industrial Crop Models. In Lepidoptera: Classification, Behavior and Ecology. Editors: Elia Guerritore and Johannes DeSare. NovaScience, NY.

Damos P. and S. Karabatakis, 2012. Real time pest modelling through the World Wide Web: decision making from theory to praxis. VIII. International Conference on Integrated Fruit Production, 7-12 October 2012, Kusadasi/Turkey, IOBC/wprs.

Damos, P., M. Oikonomou, Ch. Bratzas and I. Antoniou, 2012. Agro-Semantics knowledge representation via Open Linked data (OLD) cloud: a case study in Integrated Crop Production, ESDO/MIBES proceedings 2012, 106-118 (English abstract).

Dawid, A. P., 2000. Causal inference without counterfactuals. Journal of the American Statistical Association, 86, 9-26.

Dennis, B., and M. L. Taper, 1994. Density dependence in time series observations of natural populations: estimation and testing. Ecological Monographs, 64, 205–224.

Dennis B., J. M. Ponciano, S. R. Lele, M. L. Taper, D.F. Staples, 2006. Estimating density dependence, process noise and observation error. Ecological Monographs, 76, 323–341.

Dennis, B., J.M. Ponciano, and M. L. Taper, 2010. Replicated sampling increases efficiency in monitoring biological populations. Ecology, 91, 610–620

Dent, D. 1995. Integrated Pest Management. Chapman and Hall, London

Ding M., Chen Y. and Bressler S., 2006. Granger causality: basic theory and application to neuroscience. In: Schelter B., Winterhalder M., Timmer J. (eds) Handbook of time series analysis. Wiley, Weinheim, pp 438-460.

Dong J. and Horvath S., 2007. Understanding Network Concepts in Modules. BMC Systems Biology, 1:24.

Donner R.V., Y. Zou, J.F. Donges, N. Marwan, J. Kurths, 2010. Ambiguities in recurrence-based complex network representations of time series. Physical Review E, 81, 015101(R).

Dorogovtsev S.N., J.F.F. Mendes, A.N. Samukhin, 2003. Principles of statistical mechanics of uncorrelated random networks. Nuclear Physics B 666, 396–416

Donges, J. F., Y. Zou, N. Marwan and J. Kurths, 2009. Complex networks in climate dynamics: Comparing linear and nonlinear network construction methods. European Phys. J., 174, 157–179.

Dunne, J.A., R.J. Williams and Martinez N.D. (2002) Food-web structure and network theory: the role of connectance and size. Proceedings of the National Academy of Sciences, USA, 99, 12917–12922.

Dupont, Y. L. and Nielsen B. O., (2006) Species composition, feeding specificity and larval trophic level of flower visiting insects in fragmented versus continuous heathlands in Denmark. Biological Conservation. 131: 475-485.

Dupont Y. L., Padrón B., Olesen J. M. and Petanidou T., 2009. Spatio-temporal variation in the structure of pollination networks. Oikos 118, 1261-1269.

Eckmann, J.-P., S. O. Kamphorst and D. Ruelle, 1987. Recurrence plots of dynamical systems. Europhysics Letters 5: 973–977.

Ehler, L. E. 2006. Perspective Integrated pest management (IPM): definition, historical development and implementation, and the other IPM. Pest Manag Sci 62:787–789.

Eichler, M., 2000. Granger-causality graphs for multivariate time series. Technical report, University of Heidelberg, Germany.

Elton, C.S., 1927. Animal Ecology. Sidgwick and Jackson, London.

Erdős, P. and A. Rényi, 1960. On the evolution of random graphs. Publ Math Inst Hung Acad Sci, 5: 17

Everett, M. G. and Borgatti, S. P. (1999). The centrality of groups and classes. J. Math. Sociol. 23: 181-201.

Fleming R. A., H. J. Barclay and J. N. Candau, 2002. Scaling-up an autoregressive time-series model (of spruce budworm population dynamics) changes its qualitative behaviour. Ecological Modelling, 149, 127–142.

Freeman, L.C., 1979. Centrality in networks: I. Conceptual clarification. Social Networks 1, 215–239.

Friston, K., 1994. Functional and effective connectivity in neuroimaging: a synthesis. Hum Brain Mapp, 2, 56-78.

Friston, K., L. Harrison and W. Penny, 2003. Dynamic causal modelling. Neuroimage, 19, 4: 1273-1302.

Fu, B.Y., et al., 2006. Complex genetic networks underlying the defensive system of rice (Oryza sativa L.) to Xanthomonas oryzae pv. oryzae. Proc Natl Acad Sci 103: 7994–7999.

Fuller, W., 1976. Introduction to statistical time series (Wiley, New York).

Galeano, P. and Peña, D. , 2007. Improved model selection criteria for SETAR time series models. Journal of Statistical Planning and Inference, 137, 2802-2814.

Gladyshev, E. G., 1961. Periodically correlated random sequences. Soviet Math 2:385–388.

Goldwasser, L. and Roughgarden, J., 1997. Sampling effects and the estimation of food-web properties. Ecology, 78, 41–54.

Granger, C. W. J., 1969. Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37: 424-438.

Granger, C. W. J. and P. Newbold, 1986. Forecasting economic time series, second edition (Orlando, Academic Press).

Greenberg A. J., S. R. Stockwell and A. G. Clark, 2008. Evolutionary Constraint and Adaptation in the Metabolic Network of Drosophila. Mol Biol Evol 25: 2537–2546.

Guimaraes Jr P. R., V. Rico-Gray, S. Furtado dos Reis et al., 2006. Asymmetries in specialization in ant-plant mutualistic networks. Proc R Soc Lond B 273: 2041–2047.

Guimera, R. and Amaral, L.A.N., 2005. Cartography of complex networks: modules and universal roles. J. Stat. Mech.: Theor. Exp., P02001.

Guimera, R., Mossa, S., Turtschi, A. and Amaral, L., 2005. The worldwide air transportation network: Anomalous centrality, community structure, and cities’ global roles. Proc. Nat. Acad. Sci., 102, 7794–7799.

Gilman, E. F. & J. L. Green. (1998). Efficient, collaborative, inquiry-driven electronic information systems. HortTechnology 8: 297-300.

Gliessman, S.R. (1998). Agroecology: ecological processes in sustainable agriculture. Michigan: Ann Arbor Press.

Green, D.G. & N. I. Klomp, 1998. Environmental informatics - a new paradigm for coping with complexity in nature. Complexity International, Vol. 6.

Hall, S. J. and D. G. Raffaelli, 1993. Food webs – theory and reality. Advances in Ecological Research, 24, 187–239.

Halley J.M. 1996. Ecology, evolution and 1/f noise. Trends in Ecology and Evolution, 11, 33-37.

Hamilton, J.D., 1994. Time series analysis (Princeton University Press, Princeton, N.J.).

Hannon, B. and M. Ruth (2009). Dynamic modeling of diseases and pests. Springer, ISBN: 978-0-387-09559-2

Hatfield, J. L, (2000). Precision Agriculture and Environmental Quality: Challenges for Research and Education.

Hassan, R., J.D. Corbett and K. Njoroge (1994). Combining Geo-referenced survey data with agroclimate attributes to characterize maize production systems in Kenya. Chapter 4, The Kenya Maize Data Base Project, to be published by CIMMYT (International Maize and Wheat Improvement Center, Mexico), 1996.

Hilborn, R., and M. Mangel, 1997. The ecological detective: confronting models with data. Princeton University Press, Princeton, New Jersey, USA.

Ims, R. A. and Steen, H. 1990. Geographical synchrony in microtine population cycles: a theoretical evaluation of the role of nomadic avian predators. Oikos, 57: 381–387.

Ims, R. A. & Andreassen, H. P. 2000. Spatial synchronization of vole population dynamics by predatory birds. Nature 408, 194–196.

Ings, T. C., J. M. Montoya, J. Bascompte et al., 2008. Ecological networks – beyond food webs. Journal of Animal Ecology, 1-17.

IPM Europe (2000). For the harmonisation of European support to developing countries in the use of IPM to improve agricultural sustainability.

Ives, A.R., K.C. Abbott and N. L. Ziebarth, 2010. Analysis of ecological time series with ARMA(p,q) models. Ecology, 91, 858–871.

Ives A. R. and V. A. A. Jansen 1998. Complex dynamics in stochastic tritrophic models. Ecology, 79, 1039-1052.

Jones, J.W. (1989). Integrating models with expert systems and data bases for decision making, p. 194-211. In Climate & agriculture - system approaches to decision making, A. Weiss, Ed. Charleston, SC, 5-7 March 1989, 256 p.

Jordano, P. et al. (2006) The ecological consequences of complex topology and nested structure in pollination webs. In: Waser, N. M. and Ollerton, J. (eds), Plant-pollinator interactions. Univ. of Chicago Press, pp. 173-199.

Kantz, H. and Schreiber, T., 1997. Nonlinear Time Series Analysis. Cambridge University Press, Cambridge.

Kay, P. and C. J. Fillmore 1999, Grammatical constructions and linguistic generalizations: The What’s X doing Y? construction, Language 75: 1-33.

Karabatakis S. and P. Damos 2012. Supporting Integrated Pest Management using meteorological networks and Information Technology through the World Wide Web. 12 annual meeting of the Entomological Society of America, 12-14 Nov., Tennessee, US.

Karlsson, P. 2007. Food Webs, Models and Species Extinctions in a Stochastic Environment. Doctoral dissertation, Lund University.

Kendall, B. E., C. J. Briggs, W. W. Murdoch, P. Turchin, S. P. Ellner, E. McCauley, R. M. Nisbet, and S. N. Wood, 1999. Why do populations cycle? A synthesis of statistical and mechanistic modeling approaches. Ecology, 80, 1789–1805.

Kot, M., 2001. Elements of Mathematical Ecology. Cambridge University Press, NY.

Krause, A. E., Frank, K. A., Mason, D. M., Ulanowicz, R. E. & Taylor, W.W. (2003a) Compartments revealed in food-web structure. Nature, 426, 282–285.

Krumin, M. and Shoham, S., 2010. Multivariate Autoregressive Modeling and Granger Causality Analysis of Multiple Spike Trains. Computational Intelligence and Neuroscience, Volume 2010, Article ID 752428, 9 pages.

Kugiumtzis, D., 2002a. Surrogate Data Test for Nonlinearity Using Statically Transformed Autoregressive Process. Physical Review E, 66, 025201.

Kugiumtzis, D., 2002b. Surrogate Data Test on Time Series. In A. Soofi, L. Cao (eds.), Modelling and Forecasting Financial Data, Techniques of Nonlinear Dynamics, chapter 12, pp. 267–282. Kluwer Academic Publishers.

Kugiumtzis, D. 2008. Evaluation of Surrogate and Bootstrap Tests for Nonlinearity in Time Series. Studies in Nonlinear Dynamics & Econometrics, 12(4).

Lande R., S. Engen, B.E. Saether, T. Coulson. 2006. Estimating Density Dependence from Time Series of Population Age Structure. The American Naturalist, 168, 10-12.

Lau K. M., Weng H. 1995. Climatic signal detection using wavelet transform: how to make a time series sing. Bull Am. Meteorol. Soc. 76, 2391–2402.

Lauritzen, S. L., 2000. Causal inference from graphical models. In E. Barndorff-Nielsen, D.R. Cox, and C. Klüppelberg (eds.) Complex Stochastic Systems, CRC Press, London.

Leclerc, R. T. (2008) Survival of the sparsest: robust gene networks are parsimonious. Mol Syst Biol 4: 213–218.

Levins R., 1974. The qualitative analysis of partially specified systems. Ann NY Acad Sci 231:123–138.

Logan J. A. and F. P. Hain 1991. Chaos and Insect Ecology. Papers presented at the symposium: Does Chaos Exist in Ecological Systems?, IUFRO XIX World Congress, Montreal, Canada, August 5-11, 1990. Information series 91-3. ISSN 0742-7425.

Lindeman, R. L., 1942. The trophic-dynamic aspect of ecology. Ecology, 23, 399–418.

MacArthur, R., 1955. Fluctuations of animal populations and measure of community stability. Ecology, 36, 533–536.

Marwan, N., Thiel, M. and N. R. Nowaczyk, 2002. Cross recurrence plot based synchronization of time series. Nonlin. Process. Geophys. 9, 325–331.

Marwan, N., M. C. Romano, M. Thiel and J. Kurths, 2007. Recurrence plots for the analysis of complex systems. Phys. Rep., 438, 237–329.

May, R. M., 1972. Will a large complex system be stable? Nature, 238, 413–414.

May, R. M., 1973. Stability and Complexity in Model Ecosystems. Princeton University Press, Princeton, New Jersey.

May, R. M. 1974. Biological populations with nonoverlapping generations: stable points, stable cycles, and chaos. Science, 186, 645-647.

May, R. M. 1975. Biological populations obeying difference equations: stable points, stable cycles and chaos. Journal of Theoretical Biology, 49, 511-524.

Montoya, J. M., S.L. Pimm and R. V. Solé, 2006. Ecological networks and their fragility. Nature, 442, 259-264.

Moran, P. A. P. 1953. The statistical analysis of the Canadian lynx cycle. II. Synchronization and meteorology. Australian Journal of Zoology, 1, 291–298.

Myers, J. H. 1988. Can a general hypothesis explain population cycles in forest Lepidoptera? Advances in Ecological Research, 18, 179–242.

Myers, J. H. 1993. Population outbreaks in forest Lepidoptera. American Scientist, 81, 240–251.

Myers, J. H. 1998. Synchrony in outbreaks of forest Lepidoptera: a possible example of the Moran effect. Ecology, 79, 1111–1117.

Müller, C. B., I. C. T. Adriaanse, R. Belshaw and H. C. J. Godfray, 1999. The structure of an aphid-parasitoid community. Journal of Animal Ecology, 68, 346–370.

Newman, M. E. J., 2003. The structure and function of complex networks. SIAM Review, 45, 167–256.

Nottingham, S.F. and Hardie, J. (1989) Migratory and targeted flight in seasonal forms of the black bean aphid, Aphis fabae. Physiological Entomology, 14, 451-458.

Odum, E.P., 1953. Fundamentals of Ecology. Saunders, Philadelphia.

Olesen J. M., J. Bascompte, Y. L. Dupont and P. Jordano 2006. The smallest of all worlds: pollination networks. J. Theor. Biol., 240, 270–276.

Onnela J. P., et al. (2007) Structure and tie strengths in mobile communication networks. Proc Natl Acad Sci 104: 7332–7336.

Olesen, J. M. et al. (2008) Temporal dynamics in a pollination network. Ecology 89: 1573-1582.

Pagano, M., 1978. On periodic and multiple autoregression. Ann. Statist. 6:1310–1317.

Papana A., D. Kugiumtzis and P.G. Larsson (2011) Reducing the bias of causality measures. Physical Review E, 83: 036207.

Pardalos, P. M., D. W. Hearn and W. W. Hager (1997) Network optimization. Springer Verlag.

Parker, D. 2006. Complexities and uncertainties of neuronal function. Philos Trans R Soc Lond B Biol Sci 361: 81–99.

Pearl, J., 1995. Causal diagrams for empirical research (with discussion). Biometrika, 82, 669-710.

Pearl, J., 2000. Causality, Cambridge University Press, Cambridge, UK.

Petanidou, T. et al. (2008) Long-term observation of a pollination network: fluctuation in species and interactions, relative invariance of network structure and implicate

Plénet, D., Giauque, P., Navarro, E., Millan, M., Hilaire, C., Hostainou, E., Lyoussoufi, A., Samie, J. F. (2009). Using on-field data to develop the EFI information system to characterise agronomic productivity and labour efficiency in peach (Prunus persica L. Batsch) orchards in France. Agricultural Systems, 1-3, 1-10.

Pollard, S.D., A.M. MacNab & R.R. Jackson, 1987. Communication with chemicals: Pheromones and spiders. Pp. 133–141, In Ecophysiology of spiders. (W. Nentwig, ed.). Springer-Verlag, Berlin.

Pötscher B. M. and S. Srinivasan, 1994. A comparison of ordered estimation procedures for ARMA models. Statistica Sinica, 4, 429-50

Ranta E, Kaitala V, Lundberg P. 1997. The spatial dimension in population fluctuations. Science. 278, 1621–1623.

Ranta E, Kaitala V, Lindstrom J., 1999. Spatially autocorrelated disturbances and patterns in population synchrony. Proc. R. Soc. B. 266, 1851–1856.

Royama, T. 1977. Population persistence and density dependence. Ecological Monographs. 47, 1–35.

Royama, T., 1984. Population dynamics of the spruce budworm Choristoneura fumiferana. Ecological Monographs, 54, 429– 462.

Royama, T., 1992. Analytical population dynamics. Chapman and Hall, London UK.

Royama, T., 2005. Moran effect of nonlinear population processes. Ecol. Monographs., 75, 277-293.

Sappington, T.W. & Showers, W.B. (1992) Reproductive maturity, mating status, and long-duration flight behavior of Agrotis ipsilon (Lepidoptera: Noctuidae) and the conceptual misuse of the oogenesis-flight syndrome by entomologists. Environmental Entomology, 21.

Schmitz, O.J., Beckerman, A.P. & Obrien, K.M. (1997) Behaviorally mediated trophic cascades: effects of predation risk on food web interactions. Ecology, 78, 1388–1399.

Seth A. K., B. J. Baars and D. B. Edelman (2005) Criteria for consciousness in humans and other mammals. Conscious Cogn 14(1): 119–139.

Seth, A. K. (2008) Causal networks in simulated neural systems. Cogn Neurodyn 2: 49–64.

Seth A. K., 2010. A MATLAB toolbox for Granger causal connectivity analysis. Jour. Neur. Meth., 186: 262–273.

Shibata, R., 1976. “Selection of the Order of an Autoregressive Model by Akaike Information Criterion”, Biometrika, 63, pp. 117-126.

Sih, A. (1993) Effects of ecological interactions on forager diets: competition, predation risk, parasitism and prey behaviour. Diet Selection: An Interdisciplinary Approach to Foraging Behaviour (ed. R.N. Hughes), pp. 182–211. Blackwell Scientific Publications, Oxford, UK.

Solé, R.V. & Montoya, J.M. (2006) Ecological network meltdown from habitat loss and fragmentation. Ecological Networks: Linking Structure to Dynamics in Food Webs (eds M. Pascual & J. Dunne), pp. 305–323. Oxford University Press, Oxford, UK.

Sokal R.R. and Rohlf F.J., (1996). Biometry, 3rd edition, Freeman and Company, NY.

Southwood, T.R.E. (1977) Habitat, the templet for ecological strategies? Journal of Animal Ecology, 46, 337-365.

Steele J.H. 1985. A comparison of terrestrial and marine ecological systems. Nature, 313, 355-358.

Strogatz, S. H., 2001. "Exploring complex networks". Nature 410, 6825: 268–276.

Sutcliffe, O. L., C. D. Thomas, and D. Moss, 1996. Spatial synchrony and asynchrony in butterfly population dynamics. Journal of Animal Ecology, 65, 85–95.

Sugihara G. 1995. From out of the blue. Nature, 378, 559–560.

Tiao, G. C. and M. R Grupe, 1980. Hidden periodic autoregressive-moving average models in time series data. Biometrika 67:365–373.

Timmer J., 2008. Handbook of time series analysis. Wiley, Weinheim, pp 438–460.

Tong, H., 1990. Non-linear time series: a dynamical system approach. Oxford University Press, Oxford.

Traveset, A. and Saez, E. (1997) Pollination of Euphorbia dendroides by lizards and insects: spatio-temporal variation in patterns of flower visitation. Oecologia 111: 241-248.

Torrence C, Compo G.P., 1998. A practical guide to wavelet analysis. Bull Am Meteorol Soc 79, 61–78.

Turchin, P. 1990. Rarity of density dependence or population regulation with lags? Nature 344:660-663.

Turchin, P., 2003. Complex population dynamics: a theoretical/empirical synthesis. Princeton University Press, Princeton, New Jersey, USA.

van Veen, F. J. F., Morris, R.J. and H. C. J. Godfray, 2006. Apparent competition, quantitative food webs and the structure of phytophagous insect communities. Annual Review of Entomology, 51, 187–208.

van Wijk, B.C.M., Stam C.J., Daffertshofer A., 2010. Comparing Brain Networks of Different Size and Connectivity Density Using Graph Theory. PLoS ONE 5(10): e13701. doi:10.1371/journal.pone.0013701

Vecchia, A. V., 1985. Periodic autoregressive-moving average (PARMA) modeling with application to water resources. Water Resour. Bull. 21: 721–730.

Venzon, M., A. Pallini and A. Janssen, 2001. Interactions Mediated by Predators in Arthropod Food Webs. Neotropical Entomology 30, 1-9.

Vitkup D, Kharchenko P, Wagner A (2006) Influence of metabolic network structure and function on enzyme evolution. Genome Biol 7: R39.

Wang, X. and G. Chen, 2002. Synchronization in small-world dynamical networks, Int. J. Bifurcation and Chaos, 12, 187–192.

Watts, D. J. and S. H. Strogatz, 1998. Collective dynamics of ‘small-world’ networks. Nature, 393, 440–442.

Werner, E.E. & McPeek, M.A. (1994) Direct and indirect effects of predators on two anuran species along an environmental gradient. Ecology, 75, 1368–1382.

Wei, W.W.S., 2006. Time series analysis. Univariate and Multivariate Methods. 2nd ed. Pearson Education Inc., NY.

Welch, P.D., 1967. "The Use of Fast Fourier Transform for the Estimation of Power Spectra: A Method Based on Time Averaging Over Short, Modified Periodograms", IEEE Transactions on Audio Electroacoustics, AU-15, 70–73.

Wiener, N. (1956) The theory of prediction. In: Beckenbach EF (ed) Modern mathematics for engineers, McGraw-Hill, NY.

Wold, H., 1938. A study in the analysis of stationary time series (second edition, 1954) (Almqvist and Wiksell, Uppsala).

Woodward, G., Speirs, D. C. and A. G. Hildrew, 2005a. Quantification and resolution of a complex, size-structured food web. Advances in Ecological Research, 36, 85–135.

Woodward, G., Ebenman, B., Emmerson, M., Montoya, J.M., Olesen, J.M., Valido, A. and P.H. Warren, 2005b. Body size in ecological networks. Trends in Ecology & Evolution, 20, 402–409.

Ydenberg, R. C. 1987. Nomadic predators and geographical synchrony in microtine population cycles. Oikos, 50, 270–272.

Yoon, J., Blumer, A., Lee, K. (2006) An algorithm for modularity analysis of directed and weighted biological networks based on edge-betweenness centrality. Bioinformatics, 22, 3106-3108.

Zhang, J. and M. Small, 2006. Complex network from pseudoperiodic time series: Topology versus dynamics. Phys. Rev. Lett. 96, 238701.

Zhang, A., 2009. Protein interaction networks: computational analysis. Cambridge University Press, UK.

Zhou, C., Z. Zemanova, G. Zamora, C. C. Hilgetag and J. Kurths, 2006. Hierarchical organization unveiled by functional connectivity in complex brain networks. Phys. Rev. Lett., 97, 238103.
