Mechanical Inference in Dynamic Ecosystems

by

R. E. Langendorf

B.S., Bates College, 2010

A thesis submitted to the

Faculty of the Graduate School of the

University of Colorado in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

Department of Environmental Studies

2018

This thesis entitled: Mechanical Inference in Dynamic Ecosystems written by R. E. Langendorf has been approved for the Department of Environmental Studies

Prof. Daniel F. Doak

Prof. Sharon K. Collinge

Prof. Aaron Clauset

Prof. James A. Estes

Prof. Mark Novak

Date

The final copy of this thesis has been examined by the signatories, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above mentioned discipline.

Langendorf, R. E. (Ph.D., Environmental Studies)

Mechanical Inference in Dynamic Ecosystems

Thesis directed by Prof. Daniel F. Doak

Empirical studies of graphs have contributed enormously to our understanding of complex systems, growing into a more scientific exploration of communities spanning the physical, biological, and social sciences called network science. As the quantity and types of networks have grown, so has their heterogeneity in quality and specificity, resulting in a wealth of datasets that are not matched by existing theoretical methods. This is especially true in ecology, where the majority of interactions are indirect and unobservable even in well-studied systems. As a result, ecologists continue to grapple with three fundamental questions. Most basically, (i) ‘How do ecosystems function?’ I answered this question by comparing networks to each other such that poorly-studied systems can be understood through their similarity to well-understood ones and to theoretical models. To do this I created the alignment algorithm netcom, which recasts ecosystem processes as statistical dynamics of diffusion kernels originating from a network’s constituent nodes. Using netcom I constructed a supervised classifier which can distinguish processes in both synthetic and empirical network data. While this kind of inference works on currently available network data, I have shown how causality can serve as a more effective and unifying currency of ecological interaction. Measures of causality are even able to identify complex interactions across organizational scales of communities, answering the longstanding question (ii) ‘Can community structure causally determine dynamics of constituent species?’ Moreover, causal inference can be readily combined with existing modeling frameworks to quantify dynamic interactions at the same scale as the underlying data. In this way we can answer the question (iii) ‘Which species in an ecosystem cause which other species?’ These tools are part of a paradigm shift in ecology that offers the potential to make more reliable management decisions for dynamic ecosystems in real time using only observational data.

Dedication

For Camp Bear.

Acknowledgements

Dan Doak

Committee members

Collaborators

The Doak lab

The University of Colorado

Environmental Studies Program

IQ Biology Program

Family

Friends

Roommates

Gilbert and Bear

My mandolin

Planet Bluegrass

Bowers & Wilkins

Weber

3821 Paseo Del Prado

Funding generously provided by the National Science Foundation

IGERT 1144807

GK-12 0841423

DGE-1144083

Contents

Chapter

1 Overview

1.1 Can comparisons of communities infer underlying processes and their dynamics?

1.2 Can community structure causally determine dynamics of constituent species?

1.3 Can we infer real-time ecological interactions using observational data?

2 Aligning Statistical Dynamics Captures Biological Network Functioning

2.1 Introduction

2.2 Methods

2.2.1 Conceptual rationale

2.2.2 Networks

2.2.3 Alignment algorithm

2.2.4 Analyses

2.2.5 Data availability

2.2.6 Code availability

2.3 Results

2.3.1 Comparison with other aligners

2.3.2 Node centralities

2.3.3 Tracking network dynamics

2.3.4 Functional network classification

2.4 Discussion

3 Can community structure causally determine dynamics of constituent species? A test using a host-parasite community

3.1 Introduction

3.2 Methods

3.2.1 Overview

3.2.2 Causal Inference with CCM

3.2.3 Data

3.2.4 Community Properties

3.3 Results

3.3.1 Causal Community Matrix

3.3.2 Cause vs Effect

3.4 Discussion

4 Understanding interspecific causation in multi-species systems: development of a general approach and application to dynamics of the endangered vernal pool plant Lasthenia conjugens

4.1 Introduction

4.2 Methods

4.2.1 Data

4.2.2 S-maps

4.2.3 Causality

4.2.4 Causally-filtered S-maps

4.3 Results

4.3.1 Lolium multiflorium

4.3.2 Eryngium vaseyi

4.3.3 Water-Mediated Interactions

4.4 Discussion

5 The future of causal community ecology

5.1 How do incomplete or aggregate community descriptions affect our ability to infer causal interactions?

5.2 How does causality propagate through a community?

5.3 Can we compare causality inferred in different systems?

5.4 What is the relationship between causal strength and interaction strength?

5.5 To what extent can we substitute spatial replication for temporal replication in inferring causality?

5.6 Does the principle of nil causality result in a trade-off in sensitivity?

5.7 What is the asymptotic behavior of causal inference?

5.8 Can causal inference predict the effects of adding or removing a variable in a system?

5.9 What will convince people that causality can be inferred from observational data?

Bibliography

Appendix

A Chapter 1 Appendix

A.1 Supplemental Figures

B Chapter 2 Appendix

B.1 Supplemental Methods

B.1.1 Determining an Embedding Dimension

B.1.2 Testing for Nonlinearity

B.1.3 Applying CCM to Community Properties

B.1.4 Real World Shadow Manifolds

B.2 Supplemental Figures

C Chapter 3 Appendix

C.1 Supplemental Figures

Tables

Table

3.1 Community properties

A.1 Theoretical models

A.2 Induced Conserved Structure (ICS) scores

A.3 Sources of networks in Fig. 2.6

Figures

Figure

2.1 Example networks

2.2 Network alignment algorithm

2.3 Centrality comparisons

2.4 Edge dynamics

2.5 Network dynamics

2.6 Biological network state-space

2.7 Theoretical network classification

2.8 Applied network classification

3.1 Slovakia’s rodent-ectoparasite community

3.2 Causal community matrix

3.3 Cause vs effect

4.1 Addressing multicollinearity in Pool 300

4.2 Interspecific effects on Lasthenia conjugens

4.3 Lolium multiflorium’s effects on Lasthenia conjugens

4.4 Eryngium vaseyi’s effects on Lasthenia conjugens

4.5 Water-mediated interactions

A.1 Ordination stress in Fig. 2.5

A.2 Ordination stress in Fig. 2.6

A.3 Alignment as a function of network size

A.4 Ordination stress in Fig. 2.8

B.1 CCM assumptions

B.2 Smoothness assumption

B.3 Correlation vs causation

B.4 Individual node effects on a community property

B.5 Community detection without causality

B.6 Real world shadow manifolds

C.1 Temporal interactions

C.2 Spatial interactions

C.3 Density-dependent interactions

C.4 Water-mediated growth rates of Lasthenia conjugens

C.5 Direct effects of early and late rainfall on Lasthenia conjugens

C.6 Relationship between Eryngium vaseyi and Lolium multiflorium

C.7 Probability of maximum S-map weight per observation

C.8 Transfer entropy vs interaction strength

C.9 Species-specific transfer entropy distributions

Chapter 1

Overview

Ecology is a study of interactions. Interactions between organisms, between populations, between species, and between all of these and their environments. As the study of the largest biological scales, ecology also encompasses interactions at finer organizational scales, between cellular, molecular, and chemical physiologies of organisms and their environments, but also interactions across all of the organizational scales of life. The immensity of these interactions creates dynamics that are challenging to explain or predict, and that ecologists have struggled to collate into general macrobiological laws. Even so, there are strikingly nonrandom patterns in the structures of these interactions, which are surprisingly reproducible. In one of the most ambitious ecological experiments [220], Simberloff and Wilson compared the species composition of six small islands in the Florida Keys before and after defaunation with methyl bromide. Different arthropod species recolonized the islands, but in less than a year they converged on the same distribution across seven trophic classes. New kinds of inference are required to explain how interactions between organisms and their environment can produce such dynamics. This dissertation contributes to the exciting progress ecologists are making toward such explanations by answering three general questions:

1.1 Can comparisons of communities infer underlying processes and their dynamics?

Properties of community structure are often assumed to be important because they are nonrandom with respect to community types [21, 142, 236], which are themselves nonrandom with respect to generative processes [142, 254]. While these are undoubtedly fascinating results about the natural history of communities, they offer no explanation for their formation or maintenance. Moreover, this method of classifying communities comes at a cost. Community properties reduce higher-dimensional data to single dimensions that disregard information in unpredictable ways, and often multiple generative models can produce the same patterns, making them mostly useful for falsification. An alternative approach is to directly compare communities and use these distances to infer community functioning. I have therefore worked to answer the question “What is the distance between two communities?”

To answer this question I created a novel alignment algorithm [139, 142]. netcom models communities as networks and compares the uncertainty through time of diffusion kernels originating from each node in a pair of networks. Using netcom I have been able to functionally classify communities by comparing them to theoretical generative models, quantify the importance of community members by simulating their removal and comparing the resulting community with the original observed one, and even track simulated community dynamics with critical transitions by aligning an evolving network to itself through time. This work suggests that nonparametric community comparisons can be used to understand the structure and underlying functioning of complex, dynamic communities.

1.2 Can community structure causally determine dynamics of constituent species?

Simberloff and Wilson’s island colonization experiment [220] suggests downward causation, from the structures of those communities to their constituent populations. This conclusion is contrary to the long-held paradigm that causation goes up organizational scales, best summarized by Robert Ricklefs when he wrote that “The ability of the community to resist change [is] the sum of the individual properties of component populations... Relationships between predators and prey, and between competitors, can affect the inherent stability of the community, but trophic structure does not evolve to enhance community stability.” It would seem simple to address this hypothesis experimentally, but systematically altering a community property involves manipulating its constituent populations. This confounding of treatment effects makes it difficult to independently measure effects on those populations. An added difficulty is the general issue of separating causality from correlation, which is particularly acute when asking about the relationships between constituents and aggregate measures of the same community.

If we cannot experimentally test for interactions across a community’s organizational scales, we must rely on observational data. Fortunately, ecologists have recently made strides in inferring causality in state space reconstructions of observational time series data. I have used one of these methods, Convergent Cross Mapping (CCM) [232], to test a commonly-studied host-parasite community [31, 32, 131, 195, 227] for across-scale causality, and found [141] that rodent host species formed a causal feedback loop with how connected and clustered the community was as a whole. This in turn drove the dynamics of the parasite species, which showed little to no causal autonomy. These findings are the first evidence of top-down forcing across organizational scales of a community based on causality inferred from observational data. We cannot know how common downward causation is in ecology, but our findings suggest that pairwise interspecific interactions are insufficient to describe a community’s dynamics as they ignore higher-order interactions with systemic properties that may be both descriptive and ecologically meaningful.

1.3 Can we infer real-time ecological interactions using observational data?

Much of the struggle to explain and predict community dynamics stems from modeling them with static interaction strengths, and then parameterizing these models with correlational methods that cannot discern real interactions from covarying species which are similarly reliant on a shared environment or common biotic driver. This is exacerbated by the inherent incompleteness of virtually all community data. Biotic and abiotic players that matter are often overlooked and their importance assigned to covariates. While static models based on such data sometimes make reasonable predictions, they are less helpful in identifying the causes of a particular species’ dynamics or how effects from one species can propagate across a community.

Some of the scientists who developed CCM have successfully used the same state space reconstruction techniques to calculate real-time interaction strengths as a time series of Jacobians (community matrices), which they call an S-map [57, 231]. While dynamic, S-maps still do not address the tendency of regression to claim covarying species are interacting. I have addressed this shortcoming by combining S-maps with a novel causal filter based on transfer entropy [214]. Unlike CCM, transfer entropy is a proxy for causality that makes no assumptions about species dynamics and is therefore applicable to entire communities. Using these causally filtered S-maps, I have identified spatiotemporal dynamics in the interspecific interactions of the federally endangered vernal pool plant Lasthenia conjugens [140], and how these interactions have been mediated by early- and late-season rainfall.

What follows is the development and application of methods to improve how we infer ecological interactions, how we characterize community structures, and how we identify relationships between individual species and the structures of the communities they reside within. These tools are part of a paradigm shift in ecology that offers the ability to make management decisions for dynamic ecosystems in real time using only observational data.

Chapter 2

Aligning Statistical Dynamics Captures Biological Network Functioning

2.1 Introduction

Alignment methods attempt to find overlap in pairs of non-Euclidean data as a means to compare complex systems and identify similarities within them. There are many kinds of alignments, but they vary in their adoption and utility across fields and data types. Consider the success of sequence alignments like BLAST [5], which is queried more than 100,000 times every day [163]. Network alignments exist, but their adoption has been slow and fractured. This is the result of two prevailing challenges. (1) Network data varies dramatically in both size and detail, spanning orders of magnitude in numbers of nodes and ranging from simple unweighted, undirected networks [251] to more realistic ones akin to a system of differential equations [209]. It is precisely this generality of network models that makes network aligners both very tempting to employ and nontrivial to justify. (2) True global network alignment is NP-complete and quickly becomes computationally intractable as network size increases [37, 46], so all aligners rely on heuristics, and there is justifiable disagreement on what it means for two networks, or nodes within them, to be similar. Even so, there is simply too much network data on physical systems [1, 6, 122], natural communities [10, 116, 218, 236, 252], and social dynamics [17, 79, 130, 262] to not consider the utility of network alignments more carefully.

The first issue is likely to get worse as network scientists continue to study increasingly different kinds of systems. Then, the key challenge to making network alignments more widely applicable is the difficulty in deciding what it means for networks to be similar. Simple measures of distance such as $\sqrt{\sum_i (\vec{x}_{1i} - \vec{x}_{2i})^2}$ are unlikely to capture dynamics of coupled nonlinear systems $\vec{x}_{1i}$ and $\vec{x}_{2i}$, and often assume the two networks are the same size, which is rarely the case. Instead, some additional measure $m(\vec{x}_{1i}, \vec{x}_{2i})$ is needed to serve as a mapping between the systems. The unresolved question is what $m$ should be.

This problem was originally couched in graph theory with the goal of creating a bijection between two graphs such that every node in each graph gets paired with exactly one node in the other graph [26, 197]. These mapping functions can be used to recover a scrambled graph, but are unable to address degrees of node or graph similarity. More recent developments by network scientists have focused on approximate solutions that emphasize recovering a system’s functioning over its exact topology [45], with some notable successes [132, 133, 165, 172, 236]. Even so, judging the quality of an alignment remains a challenge, and most measures emphasize the graph theoretic goal of topological similarity over emergent functional similarity, even when studying the latter. Unsurprisingly then, network alignments have mostly improved at producing topological alignments [37]. It remains less clear how well to expect these approaches to perform at identifying functional similarities or predicting system trajectories in networks modeling complex systems structured by different processes occurring at vastly different spatial and temporal scales.

We have developed a principled and non-topological network aligner to explore the utility of comparing networks by their dynamics rather than properties of their edge topologies. Our approach uses the entropies of simulated diffusion kernels emanating from each node in two networks to make a global pairwise-node alignment between them. We contextualized our method by comparing it to ten recently compared network aligners [37], explored its behavior relative to known network dynamics, and tested its ability to functionally classify both common synthetic networks as well as biological systems ranging from gene regulatory networks all the way up in scale to food webs.

Figure 2.1: The two networks used as a demonstrative example in the algorithm description. These networks are identical except for the reversal of a single unidirectional edge between nodes I and IV, which is dashed for emphasis.

2.2 Methods

2.2.1 Conceptual rationale

Details of the algorithm are illustrated (see Fig. 2.2) using the following two networks (Fig. 2.1), which emphasize our general approach: system dynamics matter more to the practical utility of network alignments than literal topology, which matters more for codifying systems than for understanding them.

The unidirectional edge between the first and fourth nodes has been reversed, but otherwise these two networks are topologically identical. If aligning two networks is formulated topologically, as a kind of layout problem where the aligner seeks to match up the edges with the minimum number of discrepancies, then these two networks should be classified as nearly identical. We argue that they are in fact quite different, and embody the kind of functioning we hypothesized a dynamics-oriented approach to network alignment would more successfully capture. The left network is a cycle whereas the right network has source-sink dynamics [200], which can be seen in their node-specific equilibria where the left network can be in any state but the right network can only be in states III or IV.

Comparing networks node-wise has this advantage of being able to trace systemic differences to individual components of each system, which is why the approach implemented here relies on capturing the dynamics of the two networks being aligned from the perspective of each node. Consider trying to compare two hypothetical cities of houses connected by roads. Our approach is to pairwise compare each house with those in the other city by creating a house-specific signature. To do so we quantified the predictability of the location of a person at various times after they left their house, assuming they move randomly. This predictability across all houses captures much of the way each city is organized and, we hypothesized, functions. We aligned networks using this conceptual rationale, with nodes as houses, edges as roads, and random diffusion representing people leaving their houses and walking around the city to other houses. The mechanics of this, which are conceptually akin to flow algorithms [172] and Laplacian dynamics [169], are depicted in an example alignment in Fig. 2.2 using the two networks in Fig. 2.1.

2.2.2 Networks

A network (or graph) $G$ was defined as $N$ nodes connected by $E$ edges of potentially varying weights determined by a mapping function $w : E \to \mathbb{R}$. That is, $G := (N, E_w)$. Note that we did not consider node properties. These networks can also be represented as a transition matrix $A$, where the element $A_{ij}$ is the edge from node $i$ to node $j$ and an edge of weight zero indicates no edge.

We used only static networks. While dynamic edge and node functions require a more simulation-oriented approach than that developed here, the general method is amenable to additional input functions that specify how edges and/or nodes change as a function of time and/or each other, which other network aligners are not. Similarly, this study used only connected networks, but the method can also be applied to networks comprised of disconnected subgraphs between which all edges have weight zero.

2.2.3 Alignment algorithm

This family of network alignment algorithms proceeds in 5 general steps to produce a node-level pairwise alignment between two input networks. This creates a bijection between the nodes of two identically sized networks and an injection if one is larger. No single model can perfectly characterize a network, which is why this method should be viewed as a family of flexible methods that all align two networks using the same conceptual rationale, but which can differ in their implementations to accommodate varying network types as well as analytic goals. We have distinguished these Options from the core algorithm and indicated our implementation.

Figure 2.2: Aligning two networks by comparing the predictability of dynamics originating from their constituent nodes. Analytically this is equivalent to (Step 1) turning a network into a Markov process by row-normalizing its matrix representation to unity, (Step 2) exponentiating this Markov process matrix to different powers, which is identical to sampling the diffusion kernel at different time steps, (Step 3) quantifying the predictability of the system across these time steps using a normalized version of Shannon’s entropy, (Step 4) storing in a cost matrix the numerically integrated difference between these predictability-over-time curves for all pairwise starting points in the two networks, and finally (Step 5) using the Hungarian algorithm to find an optimal alignment between nodes based on how the entropy of the system changed over time for diffusion emanating from each node. Sample output from the R package NetCom is presented at the bottom right.

2.2.3.1 Step 1/5

Convert the $N_1 \times N_1$ and $N_2 \times N_2$ square transition matrix representations of the two networks being compared into Markov processes by normalizing each row sum to unity [64]. For each network’s new transition matrix $A_x^{(1)}$, $\sum_{j=1}^{N_x} A^{(1)}_{x_{ij}} = 1 \;\; \forall \; i \in 1, 2, \ldots, N_x$. The $i,j$th position is then the probability of moving from the $i$th component of the network to the $j$th in a single unit of time, from here on denoted by parenthetical powers as $A_x^{(1)}$.

Empirical networks vary greatly in their size, reflecting a range of system complexity but also an inherent noise in the current network data. We chose to allow differently sized networks to be compared, but with an optional penalty created by adding disconnected nodes to the smaller network, colloquially called padding. This approach allows the use of common assignment methods like the Hungarian algorithm which require a square cost matrix (Section 2.2.3.5, Step 5/5), and creates a penalty that grows approximately linearly with the gap in size, as these additional nodes will always align poorly to nodes of degree > 0. Moreover, this penalty can be ignored without changing the alignment scores between actual nodes in the networks, as was done in all of our analyses, to make size-invariant comparisons of system functioning.
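As a rough illustration of the row normalization in Step 1, the sketch below works on an adjacency matrix stored as nested Python lists. It is not taken from the NetCom package: the function name `row_normalize` and the decision to leave all-zero rows (for example, disconnected padding nodes) unchanged are assumptions made for this example.

```python
def row_normalize(adjacency):
    """Scale each row to sum to 1; all-zero rows (e.g. padding nodes) stay zero."""
    normalized = []
    for row in adjacency:
        total = sum(row)
        if total == 0:
            # Assumption: a node with no outgoing edges keeps an all-zero row.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([weight / total for weight in row])
    return normalized


# Hypothetical 3-node directed network with edges 0->1, 0->2, 1->2, 2->0.
A = [[0, 1, 1],
     [0, 0, 1],
     [1, 0, 0]]
P = row_normalize(A)
```

Each row of `P` can then be read as the one-step transition probabilities out of the corresponding node.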

Option: Normalizing each row to unity will alter the ratio of edge weights between nodes with different out-degrees. This can be prevented by adding a row-specific constant to each diagonal element $A^{(1)}_{x_{ii}}$ such that $\sum_j A^{(1)}_{x_{ij}} = K$ for all nodes $i$ in the network. We did not use this normalization because our analyses included only unweighted networks where edges are nominally present or absent and are not considered comparable in strength.
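The diagonal-constant option above can be sketched as follows. The text does not say how K is chosen or whether the matrix is rescaled afterwards, so two assumptions are made here: K is taken to be the largest row sum, and the result is divided by K to obtain a matrix whose rows sum to one. The function name `pad_diagonal_to_constant` is likewise hypothetical.

```python
def pad_diagonal_to_constant(adjacency):
    """Add a row-specific constant to each diagonal entry so every row sums to K."""
    K = max(sum(row) for row in adjacency)  # assumed choice of K: largest row sum
    padded = [list(row) for row in adjacency]
    for i, row in enumerate(padded):
        row[i] += K - sum(row)
    return padded, K


# Hypothetical weighted network with row sums 4, 1 and 3.
W = [[0, 2, 2],
     [0, 0, 1],
     [3, 0, 0]]
padded, K = pad_diagonal_to_constant(W)

# Assumed final step: divide by K so rows sum to 1 while the ratios
# between off-diagonal edge weights in different rows are preserved.
P = [[w / K for w in row] for row in padded]
```

In this example the edge 0->1 is twice as heavy as the edge 1->2 both before and after the transformation, which plain row normalization would not preserve.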

Option: Note that for directed networks we use the row-to-column orientation of the adjacency matrix in the style of Ulanowicz [124], but column-to-row matrix models can be employed simply by transposing the adjacency matrices $A_x^{(1)}$ before proceeding. Our implementation creates a signature for each node by mapping diffusion emanating outward, but depending on the application it might make more sense to capture how diffusion kernels arrive at a given node. A linear combination of both can be considered, though this requires exponentiating both $A_x^{(1)}$ and $A_x^{T(1)}$, adding to the computational complexity of the algorithm. For smaller analyses where the investigator is unsure which interpretation better applies, $\alpha A_x^{(t)} + \beta A_x^{T(t)}$ should provide a more generally unbiased alignment, where $\alpha + \beta = 1$ and $\alpha, \beta > 0$.

2.2.3.2 Step 2/5

Separately exponentiate the adjacency matrices to different powers $t \in \mathbb{N}$, where the resultant matrices $A_x^{(t)}$ are the probabilities of interactions transferring some currency (e.g. energy, information) from the $i$th member of the system to the $j$th in exactly $t$ units of time. In this way enumerating different powers of $A$ is equivalent to discretely sampling a diffusion kernel.

Option: Which powers $A$ should be raised to is application specific. We have used $t \in \{2^y\} \;\forall\; y \in \mathbb{N}$ such that $t_{max} \geq 2D$ and $t_{max-1} < 2D$, where $D$ is the larger diameter of the two networks being compared. That is, the largest element of $t$ was always at least twice the network’s diameter. This set of time steps was chosen to ensure every cycle could be traversed at least once. Additionally, sampling logarithmically helped preserve the uniqueness of the origin node as many diffusion patterns tend toward the same stationary distribution regardless of their initial configuration, while emphasizing direct and local interactions that may matter more in real-world functioning. This is similar to sampling every time step with a non-uniform weighting function, but is computationally less expensive. Note that this method of exponentiating the normalized transition matrix only works for $t \in \mathbb{N}$. Continuous-time Markov models [224] can be used to model fractional matrix powers, but the added complexity is less worthwhile here where each system’s state is sampled discretely regardless.
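The logarithmic sampling of the diffusion kernel described above can be sketched in a few lines of Python. This is a minimal illustration rather than the NetCom implementation: the helper names `mat_mul`, `sample_times`, and `diffusion_kernels` are hypothetical, and starting the sequence at t = 1 (that is, y = 0) is an assumption, since the text does not state whether the natural numbers include zero.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as nested lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def sample_times(diameter):
    """Powers of two 1, 2, 4, ... ending at the first value >= 2 * diameter."""
    times = [1]
    while times[-1] < 2 * diameter:
        times.append(times[-1] * 2)
    return times


def diffusion_kernels(P, times):
    """Return {t: P**t} by repeated squaring; `times` must be powers of two."""
    kernels = {}
    current, power = P, 1
    for t in sorted(times):
        while power < t:
            current = mat_mul(current, current)
            power *= 2
        kernels[t] = current
    return kernels


# Hypothetical row-normalized 3-node directed cycle, which has diameter 2.
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
times = sample_times(diameter=2)
kernels = diffusion_kernels(P, times)
```

Because each sampled power is the square of the previous one, only one matrix multiplication is needed per time step, which is one reason sampling at powers of two is cheaper than sampling every step.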

2.2.3.3 Step 3/5

The diffusion kernels generated by Step 2 need to be compared. For the two example networks in Fig. 2.2, their kernels could be directly compared in Euclidean space. However this limits the method to aligning networks of the same size. Instead, we used the normalized entropy [217] of each row of each $A_x^{(t)}$ matrix, creating a sequence of $T$ entropies for each node in each network describing diffusion emanating from that particular node. All together, this produces a characteristic $N_x \times T$ matrix for each network, here denoted $S_x$, where each node’s signature entropy-over-time curve is the same length, allowing nodes in differently sized networks to be compared.

$$S_{x_{n,t}} = \frac{-\sum_{j=1}^{N_x} A^{(t)}_{x_{n,j}} \ln\left(A^{(t)}_{x_{n,j}}\right)}{H(1_{N_x})} \qquad (2.1)$$

Note that the entropies were size-normalized and only calculated including nodes originally in each network, even for those diffusion kernels emanating from padding nodes added to the smaller network. The size normalization $H(1_{N_x})$ is important to prevent incorporating the size bias inherent in Shannon’s formulation of entropy. To see this bias, consider the entropies of {1, 1} and {1, 1, 1}. Entropy increases with size because the added states the system can be in reduce information content and predictability. We therefore normalized all entropies by the maximum entropy of a system of the same size. This normalizing constant can be thought of as the null model, where the system is entirely random and maximally unpredictable.
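The size bias and its correction can be made concrete with a short sketch. It assumes the normalizing constant is the entropy of a uniform distribution over the same number of states, which equals the natural log of that number; the function names are hypothetical and zero-probability entries are skipped by the usual convention that 0 ln 0 = 0.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in nats, treating 0 * ln(0) as 0."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def normalized_entropy(probabilities, size):
    """Entropy divided by ln(size), the maximum entropy for `size` states."""
    if size < 2:
        return 0.0  # a single-state system is perfectly predictable
    return shannon_entropy(probabilities) / math.log(size)


# Uniform distributions over 2 and 3 states: raw entropy grows with size...
raw_two = shannon_entropy([1 / 2, 1 / 2])
raw_three = shannon_entropy([1 / 3, 1 / 3, 1 / 3])

# ...while the normalized entropies are both (numerically) 1.
norm_two = normalized_entropy([1 / 2, 1 / 2], size=2)
norm_three = normalized_entropy([1 / 3, 1 / 3, 1 / 3], size=3)

# One row of a diffusion kernel in a 3-node network.
row_signature = normalized_entropy([0.5, 0.5, 0.0], size=3)
```

Applying `normalized_entropy` to every row of every sampled kernel would give one entropy-over-time curve per node, with values on a common 0 to 1 scale regardless of network size.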

Option: The diffusion kernels can be captured using a measure other than entropy. Here we treat network functioning as statistical non-randomness, quantified by random diffusion within the network. This has the advantage of both a rigorous theoretical foundation and demonstrated applicability across multiple orders of scale [9, 76, 119]. Other measures (e.g. 1-D: Simpson index [221], Gini coefficient [84]; >1-D: ordination techniques) can and should be adopted depending on the systems being compared and the intended use of the alignment.

2.2.3.4 Step 4/5

Store the Euclidean distance between rows of the two matrices $S_1$ and $S_2$ in a square cost (or distance) matrix $C$ with dimensions equal to the number of nodes in the larger of the two networks being aligned. The diagonals of $C$ will always be zero, and it will always be off-diagonal symmetric.

$$C_{ab} = \sqrt{\sum_{t \in T} \left(S_{x1_{a,t}} - S_{x2_{b,t}}\right)^2} \qquad (2.2)$$
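A minimal sketch of the cost matrix in Step 4, assuming the two signature matrices are nested lists whose rows all have the same number of time samples. The function name `cost_matrix` is hypothetical, and padding the smaller network to obtain a square matrix is not shown.

```python
import math


def cost_matrix(S1, S2):
    """Euclidean distance between every row of S1 and every row of S2."""
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(row1, row2)))
             for row2 in S2]
            for row1 in S1]


# Hypothetical entropy-over-time signatures: 2 nodes per network, 2 time samples.
S1 = [[0.0, 0.5],
      [1.0, 1.0]]
S2 = [[0.0, 0.5],
      [0.4, 0.2]]
C = cost_matrix(S1, S2)
```

Entry `C[a][b]` is small when node `a` of the first network and node `b` of the second have similar entropy-over-time curves, here zero for the identical first rows.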

2.2.3.5 Step 5/5

Run the Hungarian algorithm [66,129,134,135,171] on the cost matrix C to find the optimal node-level alignment between the two networks. The arithmetic mean of these pairwise node alignments was considered the overall alignment score between the two networks.
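The assignment step can be illustrated on a small cost matrix. The sketch below does not implement the Hungarian algorithm itself; as a stand-in it exhaustively searches all permutations, which finds the same minimum-cost assignment but is only feasible for a handful of nodes. The function name `best_alignment` and the example costs are hypothetical.

```python
from itertools import permutations


def best_alignment(C):
    """Exhaustively find the row-to-column assignment with minimum total cost."""
    n = len(C)
    best_perm = min(permutations(range(n)),
                    key=lambda perm: sum(C[i][perm[i]] for i in range(n)))
    # Overall alignment score: arithmetic mean of the matched pairwise costs.
    score = sum(C[i][best_perm[i]] for i in range(n)) / n
    return list(best_perm), score


# Hypothetical 3 x 3 cost matrix between the nodes of two networks.
C = [[0.9, 0.1, 0.5],
     [0.2, 0.8, 0.7],
     [0.6, 0.4, 0.05]]
assignment, score = best_alignment(C)
```

Here `assignment[i]` is the node in the second network matched to node `i` of the first. For realistic network sizes a polynomial-time solver is needed, for example SciPy's `scipy.optimize.linear_sum_assignment`.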

Option: The size penalty for aligning networks with differing numbers of nodes can be assessed here by including the full bijection output from the Hungarian algorithm. We chose to ignore this penalty in our analyses to reduce any size bias associated with heterogeneous network data, and therefore only averaged the elements of the cost matrix associated with nodes originally in both networks.

Option: The Hungarian algorithm guarantees an optimal solution to the assignment problem, and therefore the best possible alignment given the cost matrix C from Step 4. However, its $O(n^3)$ runtime makes it computationally expensive for applications involving very large networks. To limit biases in our analyses, and because matrix multiplication is almost as expensive a process [52], we exclusively used the Hungarian algorithm. The cost matrix C can however be fed into any of the many methods of solving the assignment problem that use heuristics to run faster [37].

Option: It will often be more appropriate to consider the probability of having made an alignment of the same quality rather than the raw alignment score. Consider obtaining an alignment score of 0.01. It is unclear if this is a surprisingly good alignment or inevitable given the types of networks under consideration. Testing for this requires creating stochastic versions of the two networks under consideration which can then be aligned to bootstrap an alignment distribution from which a p-value can be empirically calculated. Rather than introduce the bias of choosing a means of stochastically generating similar networks, and to avoid the added computational complexity, we used raw alignment scores in our analyses. However there is a need for null models in the alignment literature that deserves more attention if network aligners are to be applied more commonly across studies and even fields.
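The empirical p-value mentioned in this option could be computed as follows once a set of null alignment scores is in hand. This is only a sketch: the null scores below are invented, how they would be generated is exactly the open question raised above, and the add-one correction is one common convention rather than part of our method. The function name `empirical_p_value` is hypothetical.

```python
def empirical_p_value(observed_score, null_scores):
    """One-sided empirical p-value for an alignment score (lower = better).

    Counts null alignments that are at least as good as the observed one and
    applies an add-one correction so the estimate is never exactly zero.
    """
    as_good = sum(1 for s in null_scores if s <= observed_score)
    return (as_good + 1) / (len(null_scores) + 1)

# Hypothetical scores from aligning stochastic versions of the two networks.
null_scores = [0.005, 0.02, 0.03, 0.04, 0.008, 0.05, 0.012, 0.06, 0.07]
p = empirical_p_value(0.01, null_scores)
```

With these made-up values an alignment score of 0.01 would not be especially surprising, which illustrates why a raw score is hard to interpret without a null distribution.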

2.2.4 Analyses

Alignments produced by this method need to be reliable such that surprises can be thought of as hypothesis generating rather than methodological artifacts. We attempted to assess this reliability rigorously using synthetic data in several ways.

2.2.4.1 Comparison with other aligners

Given the computational complexity of matrix exponentiation, if our approach is not meaningfully different from current network aligners why not continue developing the same kinds of functional inference with an already established method? Clark and Kalita [37] recently used a subset of the synthetic protein-protein interaction (PPI) networks in the NAPAbench database, which was created to test network alignments [208], to compare ten common network aligners. As this subset of the network alignment literature is perhaps the most robust and functionally-oriented, we chose to focus on the same methods in providing context. We calculated the same Induced Conserved Structure (ICS) [190] of the alignments produced by the approach detailed here and compared them to the corresponding scores Clark and Kalita [37] presented.

$$\mathrm{ICS} = \frac{|f(E_1) \cap f(E_2)|}{|E_{G_2[f(N_1)]}|} \tag{2.3}$$

The numerator of ICS is the number of edges preserved under the alignment, and the denominator is the number of edges in the subgraph of the second network G2 induced by the mapping from the first network G1 (those edges between nodes in G2 mapped to by nodes in G1). This not only measures how many edges are preserved by an alignment, but penalizes aligning sparse and dense regions which can lead to better alignments through combinatorics alone. There were 30 alignments total, 10 each within the following three generative network models: duplication with random mutation [189], duplication-mutation-complementation [245], and crystal growth [127]. In all 30 cases the alignment was made between a network with 3000 nodes and a network with 4000 nodes.
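The verbal definition of ICS given above can be sketched in a few lines. The example treats edges as directed pairs and the alignment as a dictionary from nodes of G1 to nodes of G2; the function name `induced_conserved_structure`, the toy networks, and the choice to return zero when the induced subgraph has no edges are all assumptions made for illustration.

```python
def induced_conserved_structure(edges1, edges2, mapping):
    """ICS of an alignment, following the verbal definition in the text.

    edges1, edges2: sets of (source, target) pairs for G1 and G2.
    mapping: dict sending each aligned node of G1 to a node of G2.
    """
    # Edges of G1 carried into G2 by the alignment.
    mapped_edges = {
        (mapping[u], mapping[v]) for u, v in edges1 if u in mapping and v in mapping
    }
    # Edges of G2 whose endpoints are both images of G1 nodes.
    image_nodes = set(mapping.values())
    induced_edges = {
        (u, v) for u, v in edges2 if u in image_nodes and v in image_nodes
    }
    if not induced_edges:
        return 0.0  # convention chosen for this sketch
    return len(mapped_edges & induced_edges) / len(induced_edges)

g1_edges = {(1, 2), (2, 3)}
g2_edges = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}
alignment = {1: "a", 2: "b", 3: "c"}
ics = induced_conserved_structure(g1_edges, g2_edges, alignment)
```

In this toy case both edges of G1 are preserved, but the induced subgraph of G2 contains a third edge, so mapping a sparse region onto a denser one lowers the score.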

2.2.4.2 Node centralities

One of the most common characterizations of a network is the distribution of the centralities of its nodes and/or edges [175]. These characterizations help identify systems that are driven primarily by a few very important components, compared to more neutral systems [110] where nodes are mostly similar. While alignment algorithms are not primarily measures of centrality, they can be used to measure the importance of a node or edge by aligning the original network to itself after removing one, similar to knocking out a gene [71, 93] or locally excluding a species [184]. We compared this alignment-based centrality with 3 common ones (degree, eigenvector, and betweenness), Katz for its conceptual similarity [123, 175], and PageRank for its ubiquitous familiarity [183]. Each of the 200 nodes considered was uniformly randomly selected from randomly generated directed Erdos-Renyi networks [69] with 1000 nodes. Half of these were for a noisy static edge density of p ∼ N(0.3, 0.01), whereas the other half had a uniformly random edge density of p ∼ U(0.2, 0.8). This allowed us to explore the relative similarity of node importance between different centralities across a realistic range of model complexity.

2.2.4.3 Tracking network dynamics

More similar networks should align better, with alignment scores closer to zero. Here we defined similarity as an edge-based Hamming distance [94], where the number of edges required to turn one network into the other was used as a baseline measure of network similarity.

This allowed us to test the alignment’s behavior under known, albeit synthetic, conditions. We simulated 100 directed Erdos-Renyi networks with 100 nodes and an edge density of 0.5, and then randomly removed or added edges before aligning back to the original network. This was done 100 times (once for each network) up to 128 edges at log2 intervals to allow for more replication without sacrificing too much resolution at the likely more biologically relevant smaller levels of change. Additionally, to test the robustness of alignments to network dynamics where the order of lost or gained interactions can matter [41, 61], edges were removed or added in 3 ways using their respective edge betweenness as a measure of importance [85]: randomly, from least to greatest importance, and from greatest to least importance. The importance of each edge was recalculated following each removal or addition to account for systemic changes created by removing or adding edges.
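The baseline similarity used here, an edge-based Hamming distance, amounts to counting the edges present in exactly one of the two networks. A minimal sketch, assuming both networks share the same node labels and are stored as sets of directed edges (the function name `edge_hamming_distance` is hypothetical):

```python
def edge_hamming_distance(edges_a, edges_b):
    """Number of edge additions or removals needed to turn one network into the other."""
    return len(set(edges_a) ^ set(edges_b))

original = {(1, 2), (2, 3), (3, 1)}
perturbed = {(1, 2), (3, 1), (1, 3)}  # edge (2, 3) removed, edge (1, 3) added
distance = edge_hamming_distance(original, perturbed)
```

Because the edges are directed, (1, 3) and (3, 1) count as different edges in this sketch.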

We also created two network trajectories to demonstrate how an evolving network can be tracked by aligning all pairs of time steps. The first was a 100 node Erdos-Renyi random network [69] with edge probability 0.5 that at each time step randomly flipped a single edge from present to absent or absent to present. The second was a 100 node Barabasi-Albert Preferential Attachment random network [2] which was randomly scrambled into an Erdos-Renyi network. At each time step an edge was uniformly randomly picked and then uniformly randomly labeled as present or absent regardless of its previous state. Note that the same edge could be picked multiple times.
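The first trajectory can be sketched as follows. This is an illustrative reconstruction, not the simulation code behind our figures: it assumes directed networks without self-loops, stores each time step as a set of edges, and uses hypothetical function names throughout.

```python
import random

def random_directed_network(n_nodes, edge_prob, rng):
    """Directed Erdos-Renyi network as a set of (source, target) edges, no self-loops."""
    return {
        (u, v)
        for u in range(n_nodes)
        for v in range(n_nodes)
        if u != v and rng.random() < edge_prob
    }

def flip_one_edge(edges, n_nodes, rng):
    """Pick one ordered node pair uniformly at random and toggle that edge."""
    u, v = rng.sample(range(n_nodes), 2)
    return set(edges) ^ {(u, v)}

def trajectory(n_nodes, edge_prob, n_steps, seed=0):
    """List of edge sets: the starting network followed by n_steps single-edge flips."""
    rng = random.Random(seed)
    steps = [random_directed_network(n_nodes, edge_prob, rng)]
    for _ in range(n_steps):
        steps.append(flip_one_edge(steps[-1], n_nodes, rng))
    return steps

steps = trajectory(n_nodes=100, edge_prob=0.5, n_steps=50, seed=1)
```

Aligning every pair of entries in `steps` would then give the matrix of alignment scores from which the trajectory is drawn. The second trajectory differs only in that the picked edge is relabeled as present or absent at random rather than toggled.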

2.2.4.4 Functional network classification

We explored the generality of our alignment algorithm’s ability to distinguish different kinds of systems functionally by pairwise aligning 307 empirical networks spanning micro and macro-biology with 120 reference networks generated from four common theoretical models (30 each across a range of sizes). All 427 networks are listed in Table S3. Note that networks were first divided into connected components, if necessary, before being aligned.

For network alignments to be a useful kind of inference they should be able to identify functionally similar communities that are likely to react similarly to the same stimulus. That is, they ought to be able to perform supervised learning to classify unknown networks functionally.

To test this we used synthetic data to classify unknown networks solely by aligning them to known networks. All networks were generated from one of four commonly-cited generative models: directed Erdos-Renyi [69], Barabasi-Albert Preferential Attachment [2], Duplication and Divergence [113], and Watts-Strogatz Small World [251] networks. Using the Variable Version scheme listed in Table S1 to allow for conservatively noisy data, we generated 1000 networks with exactly 100 nodes, uniformly randomly according to one of the four models. This process was then repeated. Each of the 1000 new networks was aligned to all 1000 of the previously generated known networks. First the model type was predicted as the minimum of the average distance to all networks within each type. Then, the parameter itself was predicted as an average of the parameter values of those networks belonging to the predicted type weighted exponentially by their alignment scores with the unknown network. An exponential weight of λ = 100 was used to place most of the predictive power on those networks that aligned well with the unknown network because network similarity is likely nonlinear with respect to underlying parameters.
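The two-stage prediction described above can be sketched as follows. The exact weighting function is not spelled out in the text, so this example assumes weights of the form exp(−λ · score), shifted by the best score for numerical stability (which leaves the weighted average unchanged). The function name `classify`, the model labels, and the example values are hypothetical.

```python
import math
from collections import defaultdict

def classify(scores, known_models, known_params, decay=100.0):
    """Predict the model type and parameter of one unknown network.

    scores[i] is the alignment score between the unknown network and known
    network i (lower = better). decay plays the role of lambda in the text.
    """
    # Stage 1: model type with the smallest average alignment score.
    by_model = defaultdict(list)
    for score, model in zip(scores, known_models):
        by_model[model].append(score)
    predicted = min(by_model, key=lambda m: sum(by_model[m]) / len(by_model[m]))

    # Stage 2: exponentially weighted average of parameters within that type.
    best = min(by_model[predicted])
    weight_sum, weighted_params = 0.0, 0.0
    for score, model, param in zip(scores, known_models, known_params):
        if model == predicted:
            w = math.exp(-decay * (score - best))  # assumed weighting form
            weight_sum += w
            weighted_params += w * param
    return predicted, weighted_params / weight_sum

scores = [0.01, 0.02, 0.30, 0.25]
models = ["ER", "ER", "PA", "PA"]
params = [0.4, 0.6, 1.0, 2.0]
model, param = classify(scores, models, params)
```

With λ = 100, a difference of only 0.01 in alignment score already reduces a network's weight by a factor of e, so the prediction is dominated by the best-aligned known networks.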

As most empirical networks vary in size even when describing the same process, we repeated this analysis with networks having an integer number of nodes ranging uniformly randomly between 10 and 100. The actual and predicted model type and parameter were then plotted against each other (Fig. 2.7).

To explore the real-world utility of aligning networks by simulating their dynamics we attempted to classify 100 ecological networks assembled by the Systems Ecology and Ecoinformatics laboratory at the University of North Carolina Wilmington [25]. This subset of the 307 empirical networks in Fig. 2.6 was annotated both traditionally, by composition, and functionally, allowing us to test our approach’s ability to recover functional classifications. Additionally, they were assembled by the same research group, limiting differences in the nature of the models. We chose to focus on ecological communities both for their age, as ecologists have been using networks to describe communities for decades [54, 99, 230], and for their inherent noise [178] as a way to ensure conservative results. In addition to the pairwise alignments, we also attempted to ascertain the underlying processes in each network by aligning them to randomly generated networks of the same size from each of the four reference network models as in Fig. 2.7. These networks ranged from 4 to 125 nodes with an average and standard deviation of 26.66 ± 27.14. Each model’s parameter state space was divided evenly into 101 bins (including the endpoints), and each of these models was used to randomly generate 100 networks. A dimension reduced plot of the pairwise alignment scores between these 100 ecological networks along with the averages of the alignments to the four reference models are plotted in Fig. 2.8. A Shepard plot of the deviations in the dimension reduced plot is given in Figure S4.

2.2.5 Data availability

All biological networks used in this study were publicly available as of February 20, 2017 and listed in Table S3. All reference networks were simulated using the R igraph package [50], version 1.0.1, except for the Duplication and Divergence networks which were generated as in [113] but with an additional parameter governing the probability of a duplicated node linking to the original node [83, 245]. See Table S1 for details.

2.2.6 Code availability

The alignment algorithm developed here is part of the upcoming R package NetCom. All analyses were run using the following command: NetCom(..., ..., base = 2, characterization = "entropy", normalization = FALSE)

2.3 Results

2.3.1 Comparison with other aligners

Alignments produced by our method were fundamentally different from any of those produced by the ten algorithms Clark and Kalita [37] studied (all alignment scores are listed in Table S2). The highest individual alignment’s ICS score across all types was only 0.046, in contrast to all ten network aligners Clark and Kalita [37] tested which mostly produced scores above 0.5. Natalie 2.0 [67] even managed to average above 0.8. No other aligner, even the older GRAAL [132] which performed the worst, produced ICS scores below 0.1. The differences in ICS scores produced by the method developed here suggest current aligners preserve network topology at the expense of similarities in their dynamics.

2.3.2 Node centralities

Despite producing novel alignments, the alignment scores of Erdos-Renyi networks aligned with themselves following the removal of a single node were positively correlated with common measures of centrality (degree, eigenvector, betweenness, Katz [123, 175], and PageRank [183]) when the networks had a noisy but static edge probability (Fig. 2.3). However, under a uniformly random model these correlations weakened and even became negative (betweenness). This disagreement was captured by the empirical centrality distributions of each measure, which are presented along the diagonal in Fig. 2.3. When the edge probability was noisily static all six centralities were unimodal and qualitatively symmetric. However when that parameter was allowed to vary randomly they behaved differently. Degree centralities became more uniform, eigenvector and betweenness centralities skewed toward more and less important nodes respectively, Katz centralities shifted toward less important nodes but remained symmetric, and PageRank and the alignment-based centrality remained symmetric and unimodal about approximately the same mean node importance. This centrality invariance with respect to the variability of underlying parameters makes the approach developed here more amenable to between-study comparisons and extrapolations of node importance that deserves further study.

2.3.3 Tracking network dynamics

Alignments monotonically worsened, though not symmetrically, as more edges were removed or added independent of whether this was done randomly or in order of edge betweenness (Fig. 2.4). This held across three orders of magnitude in the number of edges changed suggesting the potential to track systemic changes by aligning a network to itself through time. This approach may help identify critical points as well as assess general system stability, as demonstrated in Fig. 2.5, and can be equivalently applied along other dimensions (e.g. spatial) if the underlying dynamic is not temporal in nature. See Figure S1 for a Shepard plot of all deviations from the true alignment scores between each pair of time steps.

2.3.4 Functional network classification

Comparing networks using their simulated dynamics created novel alignments, correlated with well-studied measures of node importance, and behaved smoothly with respect to underlying changes in a network. We therefore hypothesized our approach would be capable of identifying functionally similar communities that are likely to react similarly to a given stimulus.

Figure 2.3: A comparison of five common measures of node importance with each other and the diffusion-based alignment algorithm presented here (Alignment). Each of the 1000 points per panel was a randomly selected node from a randomly generated 100 node directed Erdos-Renyi network. The lower diagonal (blue) networks had a stochastic edge probability of p ∼ N(0.3, 0.01) and the upper diagonal (green) networks had a stochastic edge probability of p ∼ U(0.2, 0.8). The diagonal contains the numerically smoothed empirical centrality distributions, with the same blue/green colorings (dashed lines correspond to the upper diagonal). Note the color-coded centrality-specific axes.

Figure 2.4: The change in the alignment score of a network aligned with itself following edge removals or additions. All networks began as 100 node directed Erdos-Renyi networks with an edge density of 0.5. Edges were then removed (negative x-axis values) or added (positive x-axis values) randomly (green), starting with the most important (red), or least important (blue). Boxes show the median, interquartile range (IQR), and confidence interval (1.5*IQR/sqrt(n)). n = 100 networks per box.

Figure 2.5: Non-metric multidimensional scaling of network dynamics captured by aligning all pairs of time steps. a) An Erdos-Renyi network (N = 100, p = 0.5) randomly changing a single edge at each of the 1000 time steps. Stress = 0.07. b) A 100 node network formed by linear preferential attachment randomly changing a single edge at each of the 1000 time steps. Stress = 0.0002. See Figure S1 for Shepard plots of all deviations.

As seen in Fig. 2.6, our approach was largely able to distinguish between 13 types of systems present in an assembled database of 427 networks spanning micro- and macro-biology (see Table S3 for details). Additionally, the difference in the number of nodes between two networks was unrelated to their alignment score (Figure S3) indicating that our prioritization of dynamics over topology produced size-invariant alignments. Figure S2 contains the corresponding Shepard plot of all deviations in Fig. 2.6 from the true higher-dimension alignment scores.

We leveraged our approach’s ability to differentiate network types in a classifier that was largely successful at inferring the model and parameter of unknown networks by aligning them to known networks (Fig. 2.7). When network size was fixed every network type was correctly predicted, as seen by all points lying within the diagonal squares. Even when network size was allowed to vary, only 18 networks (1.8%) were assigned to the wrong model type. All of these networks except for one had fewer than 30 nodes and were Erdos-Renyi models, indicating that smaller random networks may appear as other types of developing systems in their functioning. In general the alignments were most successful at within-model parameter identification of Erdos-Renyi networks, and struggled to identify preferential attachment and small world networks with high probabilities of attachment and rewiring respectively.

We applied this classifier to a subset of the biological networks that were curated by the same research group [25] and labeled with traditional as well as functional classifications which we successfully recovered (Fig. 2.8). This analysis assumes these 100 ecological networks derive from some combination of only four models, which is unknowable in this context and almost surely incorrect. However, these alignments are able to infer if any of the proposed models account for their dynamics. The magnitudes of the alignment scores are of less use here because they are not reliably normalized, but changes in alignment scores following changes in a model’s parameter offer evidence of that process being at play in the network being aligned. As an example, the cycling estuary networks all aligned better with Erdos-Renyi networks as the probability of an edge increased, and aligned worse with Duplication and Divergence networks as the probability of divergence increased. In contrast to this they aligned consistently well and poorly with Small World and Preferential Attachment networks respectively. This consistency across parameter values indicates that the particular parameter is unrelated to the functioning of the estuary communities, and that any high quality alignments occurred by chance or a shared similarity to a different community driver. In this way, much like regressing covariates, generative models can be assessed as predictors of network data.

Figure 2.6: Non-metric multidimensional scaling of pairwise alignment scores between 307 biological networks and 120 common generative network models, totaling 13 distinct types of systems. Each point is a single network with its size proportional to its number of nodes. The four synthetic types are: Erdos-Renyi, Preferential Attachment, Duplication and Divergence, and Small World. The nine biological types are: enzymatic pathways, gene regulations, protein-gene interactions, protein-protein interactions, C. elegans metabolism, a Macaque’s brain (visuotactile regions), the human diseasome, trophic ecosystems, and biogeochemical ecosystems. Stress = 0.11 (see Figure S2 for a Shepard plot of all deviations).

Figure 2.7: Classification and parameter prediction of unknown networks. All of the 1000 networks in each panel were created from one of four generative models, with a single uniformly random parameter: Erdos-Renyi’s edge density, Preferential Attachment’s attachment power, Duplication and Divergence’s probability of divergence, and Small World’s probability of rewiring. Each network was classified by aligning it with 1000 known networks and averaging their parameters weighted exponentially by their alignment score (λ = 100). All 1000 networks in the left panel and the 1000 networks they were each aligned to had exactly 100 nodes, whereas in the right panel these networks had a uniformly random integer number of nodes between 10 and 100. There are no networks in the diamond shading which serves only to demarcate the four network types.

2.4 Discussion

We have developed an algorithm that aligns networks using diffusion kernels to simulate temporal dynamics. This approach was shown to characterize nodes in a network in accordance with common measures of centrality, and to behave smoothly with respect to the number of changes in a network’s edges allowing system trajectories to be captured. We successfully mapped a functional state space of four common network models and nine types of biological systems across a range of mechanisms and sizes, and were able to predict the underlying process in a single-model network of unknown origin. Unlike most statistical inference which aligns data from a single system to a proposed model, our approach need not even involve a model. Indeed, it is more akin to machine learning in that it looks for similarities in patterns of network dynamics across many pairwise comparisons offering a nonparametric means of tracking and classifying complex systems.

Why attempt to align networks by simulating dynamics on static models inferred from data that was never intended to capture dynamics so literally? Because network data with a temporal component is often challenging to collect, so most network data are currently static. These data may only address the structure of a system’s direct interactions, but these can in turn shed light on the more complicated effects of indirect interactions which have been shown to play a critical role in the assembly and control of biological systems [121, 159, 215, 257]. Our approach is an attempt to infer systemic functioning from the same common network data.

Figure 2.8: Classification and model prediction of the 100 ecological networks in the enaR database [25]. a) Non-metric multidimensional scaling of the pairwise alignment scores between all 100 networks. These plots are identical except for their labeling and coloring, which indicate the traditional classifications (left) and functional classifications (right) as labeled by their curators at the Systems Ecology and Ecoinformatics laboratory at the University of North Carolina Wilmington. Stress = 0.16 (see Figure S4 for a Shepard plot of all deviations). b) Average alignment scores of each network (column) with randomly generated networks from each of the four theoretical reference models in Fig. 2.7 (see Table S1 for details). The parameter space for each of the network models was divided evenly into 101 bins (rows), each of which was used to create 100 reference networks which were aligned to the ecological networks. The parameters were: Erdos-Renyi’s edge density, Preferential Attachment’s attachment power, Duplication and Divergence’s probability of divergence, and Small World’s probability of rewiring. Networks are arranged within the two classifications to keep pairs that aligned well with each other close together to emphasize patterns in their alignments with the reference networks. Darker colors indicate better alignments.

The graph and subgraph isomorphism problems have been adopted by network scientists such that measures of alignment quality still emphasize topology over dynamics or functioning. From a practical perspective, of needing to assess and predict system trajectories, measures like ICS seem to hold less potential. If indirect effects matter as much as direct ones, then global network alignment ought to be reformulated away from data-centric topological models toward ones of system dynamics. Consider an invasive species like the zebra mussel. Physiologically, even genetically, zebra mussels vary little between the Eurasian bodies of water they are native to and the North American lakes and rivers they have invaded [228]. However from a systemic perspective they are different species. Similarly, two very distinct species can function almost identically in their respective communities. This idea was most famously formulated in the concept of a keystone species decades ago by Robert Paine [184,185]. We would argue that just as there are functionally similar species, there are also functionally similar communities. Moreover, this is just as true in microbiology as ecology. It is encouraging to see that aligning networks by their dynamics captured some of their underlying functioning, even when the dynamics were simulated using static data.

There are however network aligners which were designed to generate heuristic solutions to the graph and subgraph isomorphism problems. The two networks in Fig. 2.1 illustrate why our approach should be used cautiously in this context. While diffusion from each node in Network 1 produced a unique pattern of entropy over time, only one node in Network 2 did. The trajectories of diffusion kernels emanating from nodes 2, 3, and 4 in the second network are identical, making them all pairwise interchangeable in the final alignment. Our analyses used only a single best alignment, but considering the uniqueness of this alignment may offer additional insight into its inferential and predictive utility. Similarly, considering the probability of an alignment score given the two networks being aligned may offer additional insight. However this involves aligning permutations of the two networks to bootstrap an empirical distribution of alignment scores, which is computationally expensive and requires justifying a particular permutation procedure.

It is important to note a coupled benefit and drawback of the simulation-oriented approach developed here. Most other approaches to aligning networks assume static data, where the node and edge properties are fixed. This is logistically and analytically convenient. The approach developed here intentionally prioritizes the underlying conceptual approach over its computational complexity, which is rate limited by two steps. The Hungarian algorithm runs in cubic time with the number of nodes in a network [65], though this can be substituted for any heuristic solution to the assignment problem [37] which leaves matrix exponentiation as the rate-limiting algorithmic step. The current best upper bound on the complexity of matrix multiplication is $O(n^{2.37})$, where n is the number of nodes [145]. This is only a modest improvement on the almost three decades old algorithm published by Coppersmith and Winograd [47] and therefore seems unlikely to improve dramatically in the future. In practice this limits our approach to networks of a few thousand nodes. However, the assumption that network data is static has already limited the analysis of noisier systems where interactions are continually changing [19]. Our approach can be used to align networks with explicit temporal components by having node and/or edge properties change mid-diffusion. The same approach can also be used to model communities more stochastically. The diffusion kernels used here are only the deterministic mean-field approximations of the true stochastic interactions. If those interactions have an asymmetric distribution of strengths, Jensen’s inequality [115] implies the mean rates of diffusion between nodes will give a biased view of the system’s dynamics. These limitations make a diffusion-based alignment more tractable alongside the long-term goals of network science.
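The bias implied by Jensen’s inequality can be illustrated with a scalar stand-in for the matrix-valued diffusion kernel. The sketch below assumes simple exponential decay at a given rate and an invented, asymmetric set of rates; it is not a simulation of any network in this study, and the names `retained_fraction`, `mean_field`, and `true_average` are hypothetical.

```python
import math

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def retained_fraction(rate, time=1.0):
    """Fraction of a quantity remaining after exponential decay (a convex function of rate)."""
    return math.exp(-rate * time)

# Hypothetical interaction strengths: mostly weak, one strong (asymmetric).
rates = [0.1, 0.1, 0.1, 1.7]

# Mean-field view: apply the dynamics to the average rate.
mean_field = retained_fraction(mean(rates))

# Stochastic view: average the dynamics over the individual rates.
true_average = mean([retained_fraction(r) for r in rates])

# Jensen's inequality for a convex function: E[f(X)] >= f(E[X]).
bias = true_average - mean_field
```

Because the decay function is convex in the rate, averaging the rates first underestimates the average retained fraction, and the gap widens as the distribution of rates becomes more skewed.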

Our findings argue that as network science continues to grow as a field, encompassing an increasingly diverse set of biological systems as well as physical and social communities, there will be a growing need for unification. The practicality of this study’s underlying goal to identify different types of systemic functioning using statistical network dynamics will depend heavily on the way network scientists incorporate biases into their data collection and analyses. There is no single correct way to compare networks so it is important to decide what kind of comparison will be most insightful relative to the questions being asked of the networks before comparing them. As an example, consider the goal of making size-invariant comparisons and what role a system’s size might actually play in its functioning. For models like the one proposed by Erdos and Renyi [69] there is an assumed size-invariance because the density of edges is both random and size-invariant. However some systems are actually not size-invariant. Consider two systems formed through preferential attachment, one young and the other old. We would like to classify them similarly regardless of their age, but they function quite differently. Time will have turned the older system into a less equitable system that spreads information (or energy or any currency of interest) in a more predictable top-down manner. They are the result of the same process, but should they be classified together? Network scientists need to better understand how tools like an alignment algorithm are actually used, and agree more on the ways they envision them shedding light on systems of all kinds.

Chapter 3

Can community structure causally determine dynamics of constituent species? A test using a host-parasite community

3.1 Introduction

A long-standing question in community ecology is whether aggregate or emergent properties of communities exert strong feedbacks on their constituent populations, rather than simply arising from the numbers and dynamics of the species themselves. Concern with this question goes back at least to the very early 1900s and Clements’ conception of successional plant communities as superorganisms ([38]), and continues today with both explicit and implicit expectations that system properties will alter dynamics of players within ecological communities. For example, efforts to characterize static ([63]) and dynamic ([236]) networks are usually predicated on the assumption that these measures of community state in turn influence the behaviors of species or other actors within the network. There have even been calls to conserve these community properties ([241]), based on the assumption that they are not just measures of community states, but actually drive ecological and evolutionary processes.

In spite of the expectation that community structure matters for populations of constituent species, testing this idea has proven difficult. Convincing tests would require a demonstration that community properties have causal effects on populations that are above and beyond those exerted by the summed effects of each interacting species. Said differently, community properties must be shown to be more than a convenient bookkeeping tool, but to causally influence dynamics of at least some parts of the system. While this would seem a simple problem to address experimentally, community properties do not exist physically. They are multivariate statistics of a community’s state based on its constituents (for overviews of these ideas see [198], [14]). Testing for these effects requires systematically altering a community-level property, but this can only be easily accomplished by manipulating its constituent populations. This confounding of treatment effects prevents independent measurements of effects on those populations. An added difficulty is the general issue of separating causality from correlation, which is particularly acute when asking about the relationships between constituents and aggregate measures of the same system.

To see these problems, consider a host-parasite community where parasites of multiple species compete for hosts of multiple species and vary in both their host specificity and the time they spend on any one individual host, the numbers of which are regulated by species-specific environmental factors. In some years specialist parasites may face almost no competition from highly competitive generalists because of temporal or spatial niche partitioning, but in years with more niche overlap the abundances of specialists may decline dramatically due to increased competition. Importantly, these fluctuations need not be driven by any particular interspecific interaction. Even with considerable niche overlap no individual generalist species may be able to monopolize the desired host species of our specialist parasites to the point where their abundance will decrease. It is only in the aggregate that the specialist parasite species faces high competition, so that the degree to which the community at large interacts is the primary driver of the fluctuations of this one species. This hypothetical community is not a self-determining superorganism, but it also cannot be understood without considering the roles its community-level properties play in the dynamics of its constituent populations.

These types of feedbacks between levels of organization are likely to at least partially explain the difficulty of understanding complex system behaviors like hysteresis and alternative stable states ([108, 147, 160, 212, 233]). Moreover, there is already a precedent in epidemiology for the importance of community structure. The dilution effect, where increased species richness makes a particular interspecific interaction less likely, has been shown to reduce transmission in a range of systems. For instance, host heterogeneity mediates the risk of Lyme disease to humans by reducing tick-borne transmission of Borrelia burgdorferi ([180]), and decreases the prevalence of Ribeiroia and associated deformations in Bufo americanus ([117]). Heterogeneity can also aid viral expansion. [126] found that extreme heterogeneity in avian host transmission could increase the relative reproductive ratio of West Nile Virus by an order of magnitude. The structure of host contact networks has even influenced immunization protocols where targeting high degree (contact) individuals has been shown to reduce the total number of vaccinations needed to eliminate a disease from a population in humans ([210]) and canids ([102]). However, there has been comparatively little research on the effects of these types of structures on the abundances of host populations themselves.

Here we apply recently developed methods that seek to directly assess the causal interactions among different parts of a community and show that these methods can be used on interactions of community properties as well as community members. We analyze an example system to show how assessment of the importance of community properties can be carefully tested and, more specifically, to provide a recipe for understanding and using recently developed methods that are not widely understood by most ecologists. The specific tool we use, Convergent Cross Mapping (CCM) ([232]), is one of a body of methods that use time series of population fluctuations to test for causal interactions between different variables. In the ecology literature these techniques are collectively being referred to as Empirical Dynamic Modeling (EDM) ([53, 260]), but they are rooted in work on nonlinear dynamics from the latter half of the 20th century ([49, 182, 234]). While these methods seem similar to many traditional approaches to inference about species interactions ([114, 143, 177]), they are fundamentally different, being concerned with causality per se rather than, for example, interaction strength. As we describe, these differences confer both strengths and weaknesses for ecological analysis.

We use CCM to test for across-scale causality in a Slovakian rodent-ectoparasite community that has been subject to several other network and community analyses ([31, 32, 131, 195]). This is an appealing system to test for causal effects of community-level structures on constituent populations because species abundances and parasitic interactions were observed at the same resolution, with both temporal and spatial replication. Analysis of a host-parasite system is also relevant to the recent literature focused on whether and how host community properties can determine disease dynamics ([117, 118, 126, 180, 191, 210]). This same community was recently found to have structures similar to those produced by random interactions proportional to the relative abundances of its hosts and parasites ([31]), but the conclusion that the community’s structures were the result of neutral interactions assumed those structures could not have shaped the host or parasite species pools. In addition to our main goal of testing for across-scale causality, we revisit this expectation through the lens of causal inference.

3.2 Methods

3.2.1 Overview

Convergent Cross Mapping (CCM) ([232]) is a promising method within the EDM framework. These methods use time series to infer causal relationships between pairs of variables belonging to a common deterministic dynamical system. CCM uses time lags to reconstruct system dynamics into a “manifold” (see [234]) from the perspective of each variable, which are each called “shadow” manifolds, and asks how well they predict each other as the amount of data increases. A shadow manifold is a multidimensional reconstruction of an entire system’s dynamics from the perspective of a single constituent variable’s time series, with each dimension representing a different time-lagged value from that single time series. If species A’s dynamics caused those of species B, more data on the dynamics of species B (the effect) will increasingly add predictive power about what the population of species A (the cause) was doing. Note that the directionality of this is counterintuitive, and that the reverse need not be true, making CCM sensitive to the direction of causation. Most importantly, CCM does not infer the underlying mechanism or functional form of any inferred causal interactions, only whether they exist in the first place. This feature makes it extremely flexible and complementary to many traditional community modeling frameworks, although it also means that the method cannot give information on many aspects of interactions important to ecology.
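The shadow manifold and cross-mapping ideas described above can be illustrated with a minimal Python sketch. It is a simplified, simplex-style nearest-neighbor cross map written for illustration only; it is not the implementation in rEDM or multispatialCCM, and the function names (`delay_embed`, `cross_map_skill`, `coupled_logistic`), the exponential neighbor weighting, and the coupled logistic map used as toy data are all assumptions introduced here.

```python
import math


def delay_embed(series, dim, tau=1):
    """Return (times, vectors) for a time-delay embedding of `series`.

    Each vector is (x[t], x[t - tau], ..., x[t - (dim - 1) * tau]).
    """
    start = (dim - 1) * tau
    times = list(range(start, len(series)))
    vectors = [[series[t - k * tau] for k in range(dim)] for t in times]
    return times, vectors


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def cross_map_skill(effect, cause, dim, tau=1, lib_size=None):
    """Estimate `cause` from the shadow manifold of `effect`.

    Returns the correlation between estimated and observed values of
    `cause`. `lib_size` (if given) truncates the library and should be
    larger than dim + 1.
    """
    times, vectors = delay_embed(effect, dim, tau)
    if lib_size is not None:
        times, vectors = times[:lib_size], vectors[:lib_size]
    predicted, observed = [], []
    for i, point in enumerate(vectors):
        # dim + 1 nearest neighbors on the effect's manifold (leave-one-out)
        neighbours = sorted(
            (math.dist(point, other), times[j])
            for j, other in enumerate(vectors)
            if j != i
        )[: dim + 1]
        nearest = max(neighbours[0][0], 1e-12)
        weights = [math.exp(-d / nearest) for d, _ in neighbours]
        estimate = sum(w * cause[t] for w, (_, t) in zip(weights, neighbours))
        predicted.append(estimate / sum(weights))
        observed.append(cause[times[i]])
    return pearson(predicted, observed)


def coupled_logistic(n, x=0.4, y=0.2):
    """Toy data (an assumption): two coupled logistic maps, x forcing y more than y forces x."""
    xs, ys = [], []
    for _ in range(n):
        xs.append(x)
        ys.append(y)
        x, y = x * (3.8 - 3.8 * x - 0.02 * y), y * (3.5 - 3.5 * y - 0.1 * x)
    return xs, ys


if __name__ == "__main__":
    xs, ys = coupled_logistic(300)
    # Convergence check: does skill at recovering the cause (x) from the
    # effect's (y) manifold improve as the library grows?
    for size in (25, 50, 100, 200):
        print(size, cross_map_skill(effect=ys, cause=xs, dim=2, lib_size=size))
```

Note that the putative effect supplies the manifold and the putative cause is the quantity being estimated, which mirrors the counterintuitive directionality described in the text.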

There are multiple approaches to the assessment of causal relationships ([91, 148, 157]), with methods ranging from those already applied and subjected to scrutiny to those only very recently derived and less tested on empirical systems. CCM is particularly appealing because of its prior use in ecology. To date, CCM has been used to address causal interactions between the climate and ecosystem productivity ([28]), disease outbreaks ([51, 56]), catch dynamics ([96]), algal blooms ([164]), and even the timing of hibernation ([70]). It has also been used to study interspecific biotic interactions such as between fish abundances and coral reef accretion rates ([48]), and between moth defoliation of spruce and subsequent bark beetle infestations ([88]).

Despite these past uses, it is important to understand that CCM works only for deterministic systems with coupled variables resulting in chaotic nonlinear dynamics. Decades after Hassell’s work classifying single-species insect dynamics it remains unclear how common chaos is in the natural world and therefore how broadly applicable this tool may be for ecological inference ([97, 98, 239]). More to the point, any use of CCM must include careful assessment of its underlying assumptions. Below, we first describe how we used CCM and represent its outputs, and then detail the data and measures of community properties that we have used. In Appendix B we provide a detailed recipe for the steps employed to test CCM’s assumptions, which are only described briefly in the Causal Inference with CCM section that follows.

3.2.2 Causal Inference with CCM

CCM infers bidirectional causation between pairs of time series, which in our example includes host and parasite species but also community properties, by looking for an improvement in the ability of each time series’ shadow manifold to predict the other as the number of observations increases. By permuting each time series and reapplying CCM we can bootstrap a probability of seeing an increase in predictive ability as great or greater than that observed to obtain a p-value for the null hypothesis H0: variable A did not cause variable B. Note that significance is not necessarily associated with the strength of the detected causality, a relationship which to our knowledge has not been characterized. CCM only addresses the presence of causation, not its sign (positive or negative) or functional form. This is the first of two important limitations, the other being that it does not apply to all systems (see the following paragraph). Also note that while this is a frequentist p-value, its relationship to sample size is fundamentally different from the p-values that arise in most statistical applications. The convergent part of cross mapping, the improvement in prediction with an increase in data, is the signature of causality. No matter how long uncoupled variables are observed, they will never show this improvement in prediction. In this way increased sample size actually offers a more trustworthy p-value rather than guaranteeing significance, as with most p-values arising from standard statistical tests. To estimate these p-values, we used 100 permutations of each time series as suggested by [232]. As CCM is bidirectional this produced two p-values for each pair of time series, assessing whether the first time series caused the second and vice versa. We have assembled these p-values into a column-oriented causal community matrix C where $C_{rc} = p\text{-value}_{c \text{ caused } r}$. The columns, c, of this matrix indicate the causing done by each species and community property, and the rows, r, are the causing they each experienced.
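As a rough illustration of how such permutation p-values could be assembled into a column-oriented matrix, the following Python sketch may help. It is not the procedure implemented in multispatialCCM: the function names are hypothetical, the `statistic` argument stands in for whatever measure of improvement in cross-map skill is used, shuffling only the putative cause is an assumption, and the p-value is taken as the plain fraction of permutations matching or exceeding the observed statistic.

```python
import random


def permutation_p_value(cause, effect, statistic, n_perm=100, seed=0):
    """Fraction of permutations whose statistic is >= the observed one.

    `statistic(cause, effect)` is any user-supplied function returning a
    number, for example an increase in cross-map skill with library size.
    Assumption: only the putative cause is shuffled.
    """
    rng = random.Random(seed)
    observed = statistic(cause, effect)
    hits = 0
    for _ in range(n_perm):
        shuffled = list(cause)
        rng.shuffle(shuffled)
        if statistic(shuffled, effect) >= observed:
            hits += 1
    return hits / n_perm


def causal_matrix(series_by_name, statistic, n_perm=100, seed=0):
    """Build C with C[r][c] = p-value that column c caused row r.

    The diagonal is left as None because self-causation is not tested.
    """
    names = list(series_by_name)
    matrix = {r: {c: None for c in names} for r in names}
    for r in names:
        for c in names:
            if r != c:
                matrix[r][c] = permutation_p_value(
                    series_by_name[c], series_by_name[r], statistic, n_perm, seed
                )
    return matrix
```

Each off-diagonal pair is tested twice, once in each direction, which matches the two p-values per pair of time series described above.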

There are two important assumptions which must be met for CCM to work: (1) the time series of interest must involve deterministically coupled variables, resulting in chaotic nonlinear dynamics. As [232] discuss, causality in separable (linear) stochastic systems can be tested using Granger causality ([91]) or measures of information transfer between time series ([148, 214]). This is the other limitation of CCM, that it cannot be used to infer causality in all systems or even all members of a single system, as was the case here. Additionally, (2) the shadow manifolds of each variable need to capture the entirety of the underlying system’s dynamics. For example, CCM cannot infer causality involving a species that was almost never observed, regardless of the quality of observations of other species. We tested both of these assumptions for each of the 26 rodent species, 61 ectoparasite species, and 9 community properties measured for the system. As CCM has never been tested on community properties, we further considered assumptions Takens made in his delay embedding theorem ([234]) that are likely to apply to time series of population dynamics but not necessarily to those of community properties, most notably that the shadow manifolds are smooth (twice differentiable). We did not test for autocorrelated red noise because even though [232] and [260] warn of the potential to confuse autocorrelated red noise and deterministic chaotic behavior, CCM should correctly infer no causation regardless ([35]). In Appendix B we provide a recipe for testing each of these assumptions and applying CCM to time series of population fluctuations as well as community properties. For a more complete tutorial on CCM, including sample code, we recommend the vignette for the R package rEDM ([260]). In the Discussion, we also provide an overview of the method’s strengths and weaknesses and further avenues of research to allow better inferences using EDM methods.

In addition to examining causal interactions between individual pairs of hosts, parasites, and community properties, we tested for causal relationships between these three functional groups. This was accomplished by scrambling the individual pairwise causal p-values (the cells in Fig. 3.2) 10,000 times to bootstrap a probability of the cumulative pairwise group level values (the sum of the cells in each panel in Fig. 3.2) occurring by chance given their empirical distribution. The results of this permutation test are shown as the shading around each panel in Fig. 3.2.
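The group-level scrambling test can be sketched in Python as follows. This is a hedged illustration rather than the code used for the analysis: the function name, the dictionary layout of cells keyed by (row, column), and the choice to count a scramble as a hit when its panel sum is at most the observed sum (smaller summed p-values meaning stronger group-level causality) are assumptions made here.

```python
import random


def group_level_p_value(cell_values, panel_cells, n_perm=10_000, seed=0):
    """Probability that a panel's summed p-values arise by chance.

    cell_values: dict mapping (row, column) -> pairwise causal p-value.
    panel_cells: the (row, column) keys belonging to one panel.
    Assumption: a scramble counts as a hit when the panel sum is as
    small as or smaller than the observed panel sum.
    """
    rng = random.Random(seed)
    keys = list(cell_values)
    values = [cell_values[k] for k in keys]
    position = {k: i for i, k in enumerate(keys)}
    panel_idx = [position[k] for k in panel_cells]
    observed = sum(values[i] for i in panel_idx)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)  # scramble p-values across all cells
        if sum(values[i] for i in panel_idx) <= observed:
            hits += 1
    return hits / n_perm
```

Shuffling the full set of cells keeps the empirical distribution of p-values fixed while breaking any association between a p-value and the panel it sits in.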

While CCM is often used for only a single long set of observations, it can also be applied to multiple sets of time series observed in different sites. For the data set used here, the best sampled location was observed 32 times. This is long for an ecological record, but short in the context of causal inference. Moreover, most locations were visited far less (see below); three sites were sampled only twice. We addressed this balance of temporal and spatial replication by using the multispatial version of CCM in the R package multispatialCCM ([35, 201]), and checked the quality of the resulting spatiotemporal manifolds in aggregate using leave-one-out cross validation assessed with a correlation coefficient (ρ), adjusted R², mean absolute error (MAE), and root mean squared error (RMSE) in line with [260]. All shadow manifolds were constructed with a time lag of τ = 1. This maximized the number of data points usable in the state space reconstructions, but was also the most biologically appropriate time lag because the dynamics of the parasitic interactions occurred on shorter scales than those at which the system was sampled (Fig. B.2).

3.2.3 Data

We have used these methods to reanalyze the dynamics of a rodent-ectoparasite community where the host-parasite interactions were directly measured along with relative host and parasite abundances. The data (Fig. 3.1) detail 113 trapping efforts of rodents across 10 locations in Slovakia between 1983 and 2001. Altogether this included 26 species of rodents from genera Apodemus, Arvicola, Clethrionomys, Crocidura, Glis, Microtus, Muscardinus, Micromys, Mus, Neomys, Ondatra, Rattus, Sorex, and Talpa, and 61 ectoparasite species spanning mites, fleas, and ticks. Each spatio-temporal observation corresponds with a trapping effort and has three components: relative host abundances, relative parasite abundances, and pairwise host-parasite interactions. Hosts were recorded as the proportion of each rodent species across all traps at each location. Parasites were recorded as the proportion of each parasite species across all hosts trapped at each location. Their interactions were quantified as the fraction of each parasite species on each host species across all traps at each location. Altogether this produced 113 unique triplets of relative rodent and parasite abundances and their interactions, six of which were temporally isolated and therefore excluded. Note that no effort was made to sample parasites independent of the host trappings. This likely resulted in a bias toward generalists (more chances to be observed) and more obligate species (more likely to be observed per trapping effort), and resulted in an unknown degree of dependence between the parasite populations and the host-parasite interactions.

This bipartite community was originally detailed by [227] and the data made freely available in the Dryad Digital Repository, resulting in subsequent analyses in contexts including neutrality ([31, 32]), community dissimilarity ([195]), and coexistence ([131]). However, the published data do not include species identities or locations (which included forests, grasslands, and mountains; [31]) except as indexed by categorical codes. Additionally, these locations were not equally sampled, but instead were observed the following numbers of times, in increasing order: 2, 2, 2, 7, 10, 10, 13, 14, 15, and 32. For further details see [226] and [31].

3.2.4 Community Properties

We calculated 9 common measures of community structure (Table 3.1) for each of the 107 host-parasite networks, where each network corresponds to a unique spatiotemporal observation of the community, using the function networklevel in the R package bipartite ([58, 59, 201]). These measures include indices to quantify the degree of connectivity between species as well as how even, nested, and clustered those connections were. Of these 9 measures, only three met the assumptions of CCM (see the “Causal Inference with CCM” section above): Connectance, Clustering Coefficient, and Interaction Evenness. To define these common network properties note that E is the weight of an edge (a direct interaction between two species, here the fraction of each parasite species on each host species across all traps at each location), and |E| and |V| are the numbers of edges (interactions) and vertices (species) in the network respectively. Additionally, an N path is a series of N edges connected continuously such that if the network were drawn on paper you could trace a path along them without having to lift your pencil. Except for Interaction Evenness, which was calculated using only realized interactions in accordance with [20] and [242], each interaction matrix was expanded to include all possible interactions between the 26 host and 61 parasite species, with zeros for all of these additional interactions, to prevent any size bias. Note that the Clustering Coefficient of a bipartite network involves 4 paths instead of 3 paths.
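To make two of these properties concrete, here is a small Python sketch that computes them from a weighted host-by-parasite matrix. It is an illustration under stated assumptions, not the networklevel function of the bipartite package: the function names are hypothetical, connectance follows the verbal definition in Table 3.1 (realized interactions as a fraction of all possible host-parasite pairs), and Interaction Evenness is the entropy of the realized edge weights divided by log(|E|).

```python
import math


def connectance(matrix):
    """Fraction of possible host-parasite pairs with a nonzero interaction.

    `matrix` is a list of rows (hosts) by columns (parasites); it should
    already be expanded to include every host and parasite species.
    """
    n_hosts, n_parasites = len(matrix), len(matrix[0])
    realized = sum(1 for row in matrix for weight in row if weight > 0)
    return realized / (n_hosts * n_parasites)


def interaction_evenness(matrix):
    """Shannon entropy of realized edge weights, normalized by log(|E|).

    Only realized (nonzero) interactions are used. Returns 0.0 when there
    are fewer than two realized edges, where the ratio is undefined.
    """
    weights = [weight for row in matrix for weight in row if weight > 0]
    if len(weights) < 2:
        return 0.0
    total = sum(weights)
    entropy = -sum((w / total) * math.log(w / total) for w in weights)
    return entropy / math.log(len(weights))
```

For example, a matrix in which every realized interaction has the same weight gives an Interaction Evenness of 1 (up to floating-point rounding), while a matrix dominated by one heavy edge gives a value closer to 0.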

3.3 Results

As is the case in other ecological communities (e.g. [39]), the majority of the Slovakian rodent-ectoparasite community did not meet the assumptions of CCM. We can only apply CCM to the dynamics of 3 hosts, 7 parasites, and 3 community properties that were observed well

Figure 3.1: Data on the Slovakian host-parasite community. A) Time series of 26 rodent species, 61 ectoparasite species, and 9 community properties replicated at 10 spatial locations in Slovakia beginning in 1983. The community properties were calculated from interaction matrices, one for each spatio-temporal point. B) Spatially and temporally averaged interactions between the rodents and ectoparasites. The color of each cell is the relative probability of parasitism averaged across space and time such that darker cells indicate species that interacted more strongly. Only 25.2% of all possible interactions occurred at least once, and the strongest interaction involved parasite 39 which on average was found on host 2 85% of the time. This spatio-temporal average community has a Connectance of 0.25, a Clustering Coefficient of 0.16, and an Interaction Evenness of 0.65.

Community Properties

Property: Connectance
Formula: $\frac{|E|}{|V|_h + |V|_p}$
Meaning: The fraction of possible interactions that occurred.
References: [175]

Property: Weighted Connectance
Formula: $\frac{\frac{1}{2}\left[\frac{1}{|V|_h}\sum_h \frac{1}{d} + \frac{1}{|V|_p}\sum_p \frac{1}{d}\right]}{|V|_h + |V|_p}$, where $d = -\sum_h \sum_p \frac{E_{hp}}{\sum E} \log \frac{E_{hp}}{\sum E}$
Meaning: Connectance for weighted interactions. Calculated as weighted link density divided by the number of host and parasite species in the community.
References: [242], [31]

Property: Clustering Coefficient
Formula: $\frac{\sum E_{\text{closed 4 paths}}}{\sum E_{\text{all 4 paths}}}$, where an N path is a series of N contiguous edges
Meaning: The fraction of species sharing a single interactor which also share two interactors. A clustering coefficient of one indicates a perfectly transitive community.
References: [175], [251]

Property: Interaction Evenness
Formula: $\frac{-\sum_h \sum_p \frac{E_{hp}}{\sum E} \log\left(\frac{E_{hp}}{\sum E}\right)}{\log(|E|)}$
Meaning: The entropy of a network’s edges normalized by the maximum entropy possible for a network of the same size.
References: [194]

Property: Alatalo Interaction Evenness
Formula: $\frac{\left(\sum s_k^2\right)^{-1} - 1}{\prod s_k^{s_k} - 1}$, where $s_k$ is the proportion of (host or parasite) species k
Meaning: A measure of evenness that is robust to outliers.
References: [170]

Property: $H_2'$ Specialization
Formula: $H_2' = \frac{H_{2max} - H_2}{H_{2max} - H_{2min}}$, where $H_2 = -\sum_h \sum_p a_{hp} \ln(a_{hp})$ and $a_{hp} = \frac{E_{hp}}{\sum E}$
Meaning: The deviation of observed interactions from those that would be expected given the system members’ marginal totals.
References: [22]

Property: Network Asymmetry
Formula: $\frac{|V|_p - |V|_h}{|V|_h + |V|_p}$
Meaning: Asymmetry between source and sink members of the community.
References: [22]

Property: Nestedness
Formula: Calculated algorithmically
Meaning: The “temperature” of a matrix. This is the deviation from a perfectly nested system where edges in row i and column j contain those edges in row i + 1 and column j + 1.
References: [8]

Property: Weighted NODF Nestedness
Formula: Calculated algorithmically
Meaning: Size normalized nestedness based on decreasing fill and paired overlap.
References: [3], [4]

Table 3.1: Community properties considered as causes of constituent host and parasite species dynamics. h and p are host and parasite species respectively. Only Connectance, Clustering Coefficient, and Interaction Evenness met the assumptions of CCM and were actually tested for across-scale causality.

enough to allow the construction of a shadow manifold which then displayed the signature of chaotic deterministic dynamics. Thus, we were able to test 21 of the 1186 observed interactions.

While the species that met these criteria were among the most abundant, their interactions were also representative of the entire community (Fig. 3.1). We were able to account for the entirety of this sub-community’s dynamics using their shadow manifolds (predicted vs observed: ρ = 0.92, adjusted R² = 0.85; model error: MAE = 0.08, RMSE = 0.07), and identified causal drivers both within and across its organizational scales.

3.3.1 Causal Community Matrix

Figure 3.2 depicts the Slovakian rodent-ectoparasite community as a causal community matrix where darker cells are more significant causal interactions. Only five interactions were significant at the α = 0.05 level and 11 at the α = 0.1 level, though 64% (non-white cells) were more likely causal interactions than expected purely by chance. The color indicates the sign of each interaction, with green and blue cells indicating negatively and positively correlated time series respectively. These correlations are not intended to accurately capture the sign of direct causal interactions, which is the motivation for using causal inferences like CCM. Unfortunately CCM only addresses the presence of causation, not its sign or functional form, and the sign of interactions is obviously important. So, while any individual color in Figure 3.2 is unverifiable, in aggregate they add context to CCM’s findings. It may be that strong enough convergence in cross mapping skill guarantees the trustworthiness of a simple correlation coefficient, but to our knowledge this relationship has yet to be rigorously tested for CCM.

The direct interactions between pairs of rodent and ectoparasite species exhibit a great deal of variability in the kinds of causal interactions. Host 3 caused population growth for all but one of the parasites, host 7 only played an important role in the dynamics of a single parasite species, and host 23 was responsible for population declines in three parasite populations. The parasites’ causal portfolio was equally diverse. Most had no effect on their hosts, but several showed both positive and negative influences. Parasite 9 in particular had a strong positive effect

Figure 3.2: Causal community matrix where columns cause rows. Darker shades are more likely causal interactions and shaded borders indicate the bootstrapped probability of causality at the functional group level. Green and blue are negative and positive Pearson correlation coefficients respectively to indicate the direction of the inferred causal forcing. The three community properties are: C = Connectance, CC = Clustering Coefficient, and IE = Interaction Evenness. The four upper left/center panels form a causal version of the traditional community matrix of interspecific interactions. Note that CCM cannot be used to infer self-causation, hence the diagonal is shown with hatched boxes to indicate that no test was performed.

on the population of host 7 which was reciprocated, suggesting some form of mutual facilitation, possibly sustained by parasite 9 preventing parasitism of host 7 by more detrimental parasite species. There was also high variability in the inferred causal interactions between pairs of hosts and between pairs of parasites. Whereas the rodent populations appeared to have no effect on each other, effects of parasites on each other ranged from competition (e.g. 3-24 and 10-24) to facilitation (e.g. 22-24).

CCM also inferred a range of interactions involving the three community properties, including support for them as causal agents as well as receivers of causal effects. Similar to the hypothetical community in the introduction, most of the parasites did better in a more connected, clustered, and even community (blue cells in the mid-right grouping of Figure 3.2). This likely reflects that these are generalists which were limited by host availability. This was not universally the case, as parasite 31 was unaffected by the community’s structure. Combined with the strong negative effect it had on host 3, which itself did better when the community was more connected and clustered, it is likely that an increase in community connectance increased niche overlap and competition between generalist parasites which led to less intense parasitism of host 3 by parasite 31, but not to such an extent that parasite 31 was unable to find hosts as evidenced by its lack of a relationship with the community’s structure. In contrast, host 7 was helped by parasite 9 but hurt by a more connected and clustered community. These two systemic properties likely increased the parasitism of host 7 by parasites 10 and 27, neither of which competed with parasite 9 and therefore were able to overcome its positive effect on host 7. There was also evidence of the community’s structure amplifying the negative effects of parasites on their hosts. An example of this is the causal and correlative relationships between parasite 27, host 7, and the connectance of the community. Both connectance and parasite 27 were negatively correlated with host 7, but they were positively correlated with each other. While these correlations may be the result of indirect effects with other species and community properties, this is evidence that properties of the community may have indirectly hurt host populations by helping parasites which were detrimental to their hosts.

CCM also inferred strong causal interactions between community properties, but this likely results from a lack of independence rather than true causation (as in Figure 3.2). An example of this is that the Connectance and Clustering Coefficient of a fully connected community (every pair of species interacts) are both 1. While these properties capture different kinds of structures in a community, they are still formulas which use the same underlying network data and are therefore not independent. The community properties were outliers both in how correlated they were and how certain CCM was that they were causally related (Fig. B.3). In communities with at least several dozen interacting species these properties can fluctuate with a practical degree of independence from any individual constituent population (e.g. Fig. B.4), but not from each other. We therefore recommend ignoring the bottom-right panel in Fig. 3.2 until there are trustworthy null models for the expected causality between network properties of the same system.

Scrambling all 169 causal p-values revealed relationships at the level of functional groups (shaded borders in Figure 3.2). These results suggest that community properties caused the dynamics of the hosts, which in turn also caused the dynamics of the community properties and very likely (p = 0.07) those of the parasites. However, the parasite populations did not causally affect the hosts or the structures of the community at large, acting instead as causal sinks.

3.3.2 Cause vs Effect

Similar patterns to these group-level trends can also be visualized for individual species and community properties in plots of the summed effects from and toward each one, as measured by CCM p-values, as well as the balance of how much causing vs receiving causal effects each one experienced (Fig. 3.3). All three hosts did more causing than they experienced. The same was true for only a single parasite species, most of which acted as causal sinks in the aggregate, with all but one lying above the 45° line in the ∑ Cause vs ∑ Effect scatterplot. The parasites also showed a non-significant but positive relationship between the amount of causing they were responsible for and the amount they experienced (r = 0.47). While this may reflect the differences between generalist and specialist parasite species, it suggests the parasite species were involved in causal loops where even when they were stronger drivers of community dynamics, they also experienced stronger causation from the host species and community properties. In contrast, the hosts (r = -0.54) and community properties (r = -0.45) were able to be stronger causal drivers without experiencing greater causation themselves, indicating an increased level of causal autonomy not experienced by the parasites, though these were also non-significant relationships. The evenness of the community’s interactions was a marginal causal sink, unlike the other two community-level properties. Quantifying net causation also revealed that host 3 and parasite 24 were the most and least causal community members. In this way causal inference can generate hypotheses about species that may act as keystone or dominant species in a community versus those that may be indicator species.
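The summed cause and effect measures plotted in Figure 3.3 can be sketched from a causal p-value matrix as below. This is a hedged Python illustration with a hypothetical function name. It takes P = 1 − C, sums each variable’s column for the causing it did and its row for the causing it experienced, and normalizes by N − 1. The net causality formula, (cause − effect) / (cause + effect), is an assumption chosen here because it is bounded between -1 and 1; the exact normalization used for Figure 3.3 may differ.

```python
def cause_effect_summary(C):
    """Summarize a causal p-value matrix where columns cause rows.

    C[r][c] is the p-value for "variable c caused variable r"; diagonal
    entries are ignored because self-causation is not tested.
    Returns one dict per variable with its normalized summed cause,
    summed effect, and net causality (assumed formula, see lead-in).
    """
    n = len(C)
    summary = []
    for v in range(n):
        # Causing done by v: down column v of P = 1 - C.
        cause = sum(1 - C[r][v] for r in range(n) if r != v) / (n - 1)
        # Causing experienced by v: across row v of P = 1 - C.
        effect = sum(1 - C[v][c] for c in range(n) if c != v) / (n - 1)
        total = cause + effect
        net = (cause - effect) / total if total > 0 else 0.0
        summary.append({"cause": cause, "effect": effect, "net": net})
    return summary
```

Under this convention a variable with positive net causality did more causing than it experienced, and a variable with negative net causality behaved as a causal sink.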

3.4 Discussion

We have presented evidence of top-down forcing across organizational scales in a host-parasite community based on causal inference methods that utilize observational time series data. Whereas a previous analysis of the same Slovakian rodent-ectoparasite community found its structure was likely the result of random interactions proportional to the relative abundances of its constituent populations ([31]), we have shown this conclusion depends on an assumption that is likely to be incorrect, that the community structure is entirely emergent. Instead, the community’s structures and constituent rodent populations appear to drive each other in a causal loop that in turn drives the dynamics of ectoparasite populations. The result is a feedback loop across organizational scales where structures of the community act as “both a pusher and a mover” in the language of [105]. His claim that such “strange loops” distinguish life is particularly relevant here, as some have viewed community properties merely as useful proxies for biological processes but not themselves biologically causal. We cannot know how general our results are, but they suggest that pairwise interspecific interactions are insufficient to describe a community’s dynamics as they ignore higher-order interactions with systemic properties that may be both descriptive and ecologically meaningful.

Our results illustrate the promise of EDM approaches to disentangling the relationships between species and also community properties. In particular, our application of CCM shows how causal methods can help to distinguish correlative from causal relationships, especially when combining individual and group-level inferences. Consider that there was no evidence of causal interactions between any of the rodent populations despite Host 3 being negatively correlated with Hosts 7 and 23. Instead, community properties drove all three, but positively in the case

Figure 3.3: The cause and effect relationships for each of the 3 hosts, 7 parasites, and 3 community properties. A) $\sum \text{Cause} = \frac{\sum_{\text{row}} P}{N-1}$ and $\sum \text{Effect} = \frac{\sum_{\text{column}} P}{N-1}$, where P = 1 − C, C is a matrix of causal p-values (e.g. Figure 3.2), and N is the number of interacting variables. We subtract these values from 1 so that larger values intuitively correspond to stronger causation, and normalize by N − 1 because CCM cannot test self-causation so both C and P have no diagonal values. The scatter plot shows normalized causing done by (Cause) and experienced by (Effect) each species and community property. Points below the ∑ Cause = ∑ Effect line are species or community properties that did more causing than they experienced, and vice versa for those above the line. B) Net causality captures the aggregate causal hierarchy of the community. This measure is normalized by the total causing done by each variable and so ranges between -1 and 1 where positive values indicate causal autonomy and negative values indicate causal dependence. It equals 0 when a variable causally drove other variables exactly as much as it experienced causation from other variables. This net causal hierarchy suggests community properties play an important role in the dynamics of both hosts and parasites, and that parasites do little to structure the community.

of Host 3 and negatively for Hosts 7 and 23. This resulted in the apparent competition between rodent species. Apparent interactions of all kinds, not just competition, are likely ubiquitous in ecological modeling as suggested by the lack of a relationship between the covariance and causal structures of the data (Fig. B.3; Fig. B.5). Aside from assessing causation across organizational scales, the prevalence of correlations that do not show causal relationships in this particular community supports the need to preface the use of many correlational approaches to community analysis with some form of causal inference. 
As most kinds of causal inference do not infer the functional form of the interaction, and therefore say nothing about the actual mechanisms responsible for the observed patterns of causation, they should be viewed as complementary to traditional parametric modeling rather than as a replacement.
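The normalized Cause and Effect sums defined in the caption of Figure 3.3 can be illustrated with a short Python sketch. This is not the code used in the analysis. The function name, the toy p-values, and the orientation of the matrix (entry `C[i, j]` holds the p-value for variable i causing variable j, so that row sums measure causing done) are assumptions made for illustration.

```python
import numpy as np

def cause_effect(C):
    """Normalized Cause and Effect sums from a matrix of causal p-values.

    Assumes C[i, j] is the p-value for 'variable i causes variable j'.
    Diagonal entries are ignored because self-causation is not tested.
    """
    C = np.asarray(C, dtype=float)
    n = C.shape[0]
    P = 1.0 - C                       # larger values = stronger causation
    np.fill_diagonal(P, 0.0)          # no diagonal values
    cause = P.sum(axis=1) / (n - 1)   # row sums: causing done
    effect = P.sum(axis=0) / (n - 1)  # column sums: causation experienced
    return cause, effect

# Hypothetical p-values for three variables (NaN marks the untested diagonal).
C = np.array([[np.nan, 0.01, 0.05],
              [0.50, np.nan, 0.20],
              [0.90, 0.80, np.nan]])
cause, effect = cause_effect(C)
```

In this toy matrix the first variable has the largest Cause and the last has the largest Effect, so they would fall on opposite sides of the Cause = Effect line.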

While EDM methods show promise in furthering community analyses, their interpretation still requires caution. In particular, future applications of the EDM framework and CCM will have to contend with four poorly studied phenomena: (i) causal propagation, (ii) embedding dimension heterogeneity, (iii) incomplete community descriptions, and (iv) false positives. First (i), properties of the Slovakian rodent-ectoparasite community strongly caused the dynamics of its host species which in turn strongly caused the parasite species dynamics. Ought we then to expect CCM to conclude that the community properties caused the parasite dynamics through a causal transitivity? We found a p-value of 0.35 for community properties having caused the parasites’ dynamics. Is this evidence of weak causation or the propagation of strong causation through the hosts on to the parasites? Depending on the strength of the causal interactions between community properties and host populations, these may actually be the same thing.

Consider the case where species A determines species B’s dynamics so completely that they exhibit a tight temporal synchrony. Can species B then be considered the cause of anything?

Unfortunately, methods like Structural Equation Modeling (SEM) cannot address loops which are an inherent feature of causal interactions. For causal inference to be applied successfully to large systems of any kind, null models describing the propagation of causal interactions will need to be developed. Identifying the underlying causal hierarchy of a system, as in Figure 3.3, should help distinguish between direct and propagated causality in tightly coupled systems.

Second (ii), the embedding dimensions of the three hosts, seven parasites, and three community properties that met the assumptions of CCM were not identical or even normally distributed: 2, 3, 4, 4, 4, 5, 5, 6, 6, 6, 9, 9, 9. This heterogeneity is likely common, but its effects on EDM analyses have yet to be properly studied. Either all 13 interactors belong to the same causal system but each interacts with only a subset of it, such that their dynamics embed better in lower-dimensional shadow manifolds, or they belong to multiple causal systems. CCM should correctly conclude there was no causation in the latter case, but to our knowledge this has not yet been reliably studied in any of the EDM methods.

Third (iii), in many communities only a minority of the constituent time series will meet the assumptions of CCM. Fewer than 12% of the species in the host-parasite community studied here met these assumptions, resulting in fewer than 2% of the observed host-parasite interactions being assessed by CCM. It is almost certain that many of the unassessed interactions are in actuality important causal interactions. Coupled with our poor understanding of causal propagation (i), it may be that apparently important species like Host 3 are in fact proxies for causal drivers which did not meet CCM’s assumptions or were not measured. Either different kinds of causal inference will have to be combined into a unified causal framework or a more general unifying notion of causality will have to be adopted. In either case, more broadly applicable methods of causal inference ought to be considered more carefully, such as the ones proposed by [214] and [148] which quantify the transfer of information between two time series. While promising, these methods cannot yet deal with higher-dimensional (N > 2) systems and also need more testing in an ecological context to be sure of their power to correctly predict causality across the wide range of ecological system behaviors.

Lastly (iv), we would like a theoretical measure of causality to adhere to what Liang calls “the principle of nil causality: an event is not causal to another event if the evolution of the latter does not depend on the former” [150]. Absent this principle, a method for detecting causality is particularly susceptible to inferring bidirectional causality in a one-way causal relationship. This is troubling because we found evidence of reciprocal causal relationships (e.g. Host 7 and clustering coefficient; Fig. 3.2). CCM does not adhere to this principle, and at this time there are no systematic studies on how it deviates. Moreover, there is already evidence of information feedback in a one-way causation resulting in CCM incorrectly inferring a reciprocal causal relationship [229]. We suspect this issue, which is not unique to CCM [222], is especially prevalent when analyzing community properties. Liang’s causal formalism using information flow does follow the principle of nil causality, but it currently only does so for deterministic two-dimensional systems [150]. This remains one of the most important unresolved areas of causal theory for systems of arbitrary dimension and stochasticity. To deal with this issue, heuristics like CCM will have to be reformulated [162] and normalized against appropriate, system-specific null models of information feedback [149].

In ecology, most applications of CCM have been to test a small number of purported drivers of an individual species of interest. However, the promise of these methods lies in better understanding typical communities, with dozens to hundreds of species and much less certainty about many of their relationships. As we show here, some insight into more complex communities can come from these approaches, but our findings suggest the assumptions of CCM also create barriers to the characterization of entire communities. While considerable work still needs to be done to make more flexible and general tests for causality, we have shown here that these approaches can help resolve the long-standing debate regarding the importance of community structure in determining the dynamics of individual species.

Chapter 4

Understanding interspecific causation in multi-species systems: development of a general approach and application to dynamics of the endangered vernal pool plant Lasthenia conjugens

4.1 Introduction

A common goal of both community and population ecology is estimating the proximate drivers of species abundances. However, separating correlated dynamics from causal relationships has stymied progress toward this goal, particularly in studies of communities where a multitude of species could, or could not, meaningfully influence the dynamics of every other population. While experiments can be used to distinguish and estimate the strength of causal relationships, they are often difficult to perform and rarely consider more than a handful of the possible abiotic and biotic drivers of species abundances. Regardless, a clear understanding of community structure, as well as the formulation of meaningful conservation plans, requires understanding causal relationships, particularly in response to legal protections for endangered species [144].

The most common way to model proximate effects in multi-species systems is with a community matrix. The conceptual framework that forms the basis for this approach rests on Lotka-Volterra dynamics [152, 247]. Consider a predator Y and its prey X, whose rates of change in abundance are governed by the equations $\frac{dX}{dt} = Ax - Bxy$ and $\frac{dY}{dt} = Cxy - Dy$. Community matrices combine all pairs of interactions into a square matrix with density dependent terms along the diagonal: $\begin{pmatrix} A & -B \\ C & -D \end{pmatrix}$. Unfortunately there is no consensus on how to estimate these interaction strengths. There are established protocols for many direct interactions, such as using gut contents to estimate predation rates [154] or ectoparasite loads to estimate parasitism [227], but no consensus on how to estimate indirect interactions (e.g. competition, mutualism). The myriad of methods that do exist generally utilize trait comparisons [137, 199], phylogenetic similarities [202, 243], or geographic overlap/co-occurrence patterns [7, 13, 181, 216]. These all have merit, but they do not separate causal relationships from associative ones. Even time series analyses [57, 114] tend to ignore the “inescapable fact that probability theory, the official mathematical language of many empirical sciences, does not permit us to express sentences such as ‘Mud does not cause rain’ ” [192].

More recently, causality itself has been proposed as a currency of interaction that can infer indirect interactions from observational data. While there is similarly no agreement on how best to infer causality, and no single universally applicable method, those that do exist all attempt to identify causal relationships even in the face of confounding correlations of dynamics due to common forcing from factors such as weather. This is especially important in ecology where multicollinearity is ubiquitous and too often inappropriately modeled with generalized linear models and post-hoc information theoretic model selection [23, 90]. Unfortunately, the relationship between causal strength and interaction strength is poorly understood, and likely varies between causal inferences. In addition, the outputs of all causal tests with which we are familiar are not translatable into per capita effects, such as those that comprise the community matrix, or any similar direct measure of effects on dynamics [143, 179]. As such, these methods do not reliably provide estimates of interaction strength that can be directly applied to estimate rates of change of a target species’ abundance.

A second major limitation of the usual ways of characterizing drivers of species abundances is that these effects are likely to shift with densities as well as environmental conditions. As Paine showed experimentally [258] and argued as a general point [186], the natural history of virtually all communities makes a single static community matrix inadequate. However, using typically available ecological data to make estimates not only of multiple species interactions, but also the variation in these effects, is a daunting challenge. One recently suggested approach is Empirical Dynamic Modeling (EDM; [259]), which has shown promise as a nonparametric framework capable of estimating dynamic interactions [57]. EDM methods eschew a precise mathematical model of a system in favor of an empirical reconstruction of the system’s dynamics, which is called an attractor or manifold. In this framework each species and abiotic factor is treated as an axis in a higher-dimensional plot. Through time this plot maps the system’s “evolution semi-flow” [234]. These evolving flows are often too chaotic to be described as a mathematical function, but if the manifold is sufficiently dense (well sampled, or filled in) it can be used to recover reliable estimates of the system’s instantaneous dynamics. These estimates are Jacobians, or a kind of community matrix containing all pairwise interactions in the system [143]. Rather than a single matrix for the whole system, this approach generates a time series of Jacobians, one from the perspective of each data point. This sequence is called an S-map [231]. While this approach has shown promise in its dynamic characterizations of community interactions [57], it does not allow a direct test to separate causation from correlation.

Thus, while they attack two fundamental shortcomings of community interaction characterizations, causal inference and S-maps address the tendency of species abundances to covary and the tendency of their interactions to change separately. Here we develop a novel combination of these methods. Our overall goal is to look across large numbers of possibly interacting species and quantify the influences of the most likely drivers of the dynamics of a target species. Specifically, we focus on the following: (i) developing a method that combines causal inference and S-maps in a way that yields more flexible and rigorous approaches to assessing the strength of causal interspecific interactions, (ii) using this approach to analyze a typical system in which time series data on abundances of multiple species are available but experimental data are not, and (iii) comparing the results we obtain against those from a non-causal S-map as well as data from other studies of the same system to gauge what improvements or shortcomings our combination of causal and dynamic inference has. For this work, we focus on Lasthenia conjugens in a vernal pool system with 14 years of data on 16 species in 247 pools. While here we focus exclusively on the effects other species have on Lasthenia conjugens, as a way to address the conservation of a federally listed species, the broader goal of our work is to engender this type of analysis for entire communities.

4.2 Methods

4.2.1 Data

We use causally-filtered S-maps to study interspecific drivers of endangered Lasthenia conjugens populations growing in vernal pools at Travis Air Force Base (AFB) in Solano County, California, USA in the Sacramento Valley near the town of Fairfield (38°15′00″ N, 122°00′00″ W, 6 m elevation). This area receives approximately 50 cm of annual precipitation, almost entirely during the wet season from December to April. The site has approximately 100 naturally occurring vernal pools which are host to one of the few remaining populations of the annual plant Lasthenia conjugens (Contra Costa goldfields). However, all of the data used here come from 247 experimentally constructed pools previously used to study Lasthenia conjugens restoration, invasion, and community assembly [41–43]. These pools were constructed in December 1999 and seeded in 1999, 2000, and 2001. We have used data from 2002 to 2016, during which time Lasthenia conjugens declined in most of the pools.

In each of these 15 years all 247 pools were sampled using a 100 cell quadrat, where the frequency of each species was quantified as the percent of cells in which an individual of that species was visible. Note that an individual plant can occupy more than one cell, as can multiple individuals of different species. So while each species ranged between 0 and 100 percent, cumulative community composition ranged from 0 to 400 percent. Of these, 0 and 100 percent were the most common, with cumulative compositions between them uniformly distributed and cumulative compositions greater than 100 percent exponentially declining.

The most important abiotic driver of population dynamics in this system is water [41–43, 82]. Lasthenia and most other species in this system are winter annuals (or perennials with similar phenology). Germination starts with the fall rains, and growth is limited in the spring by dry-down following cessation of the winter rains. To capture the potentially differing effects of fall and spring rainfall, we divided annual precipitation into early (October - December) and late (January - April) periods. Rainfall data were left in their original units (inches), so while these models included rainfall effects, their magnitudes are not comparable with the inferred interspecific effects. See Fig. C.5 for the direct effects of early and late season rainfall on Lasthenia conjugens.

4.2.2 S-maps

S-maps [57, 231, 259] are sequences of Jacobians, or matrices of interactions where the ijth element is the partial derivative $\frac{\partial i}{\partial j}$ for species i and j, and the iith elements along the diagonal are population growth rates of species i. These are the same values in one of the most used kinds of community matrix, a common way of representing species interactions [143, 179]. However, a single static community matrix is usually estimated and analyzed, while S-maps contain one Jacobian for each observation. These Jacobians are calculated as a multivariate regression predicting the system’s transition from $X_t$ to $X_{t+1}$, but where every other observation $X_k$ has weight $w_{kt}$ equal to its exponentially-weighted Euclidean distance $\lVert \cdot \rVert$ to the system’s state at time t.

$$ w_{kt} = \exp\left( \frac{-\theta \lVert X_k - X_t \rVert}{\overline{\lVert X_k - X_t \rVert}} \right) \tag{4.1} $$

We used an exponential decay of θ = 10.22 which minimized the normalized mean absolute error (nMAE) of leave-one-out cross validation in accordance with [57]. This distance is not temporal, or spatial in the case of spatial replication, but rather the distance in the state-space where the system’s manifold was reconstructed, and is scaled exponentially because system manifolds are often nonlinear and even chaotic [97, 98, 239]. In accordance with [57], the focal observation is left out from each multivariate regression. Fig. C.7 shows that even the strongest weight in each S-map tends to be close to zero (the maximum is one because $e^0 = 1$), and falls off exponentially. Using these weights dramatically improves the estimated partial derivatives in each Jacobian, especially for communities with chaotic dynamics occurring faster than they are observed [57].

The strength of this weighting relative to the state-space distances between observations can be seen in Fig. C.7. In this study these weights spanned all observations, across all 14 years and also all 247 pools, as there is evidence in support of combining spatial and temporal replication to infer causality [36]. This works by finding the regression estimates

$$ \hat{\beta}_t = \left( X^T \langle W_t \rangle X \right)^{-1} X^T \langle W_t \rangle Y \tag{4.2} $$

where $\langle W_t \rangle$ is a square matrix with weights $w_{kt}$ on its diagonal and $\hat{\beta}$ is the least squares solution to $Y = \hat{\beta} X$. These coefficients, $\hat{\beta}_t$, are the estimates of the partial derivatives $\frac{\partial i}{\partial j}$ in the community matrix.
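A minimal Python sketch of the locally weighted regression in Eqs. 4.1 and 4.2 may help make the procedure concrete. It is not the implementation used in this chapter (which relied on weighted beta regressions in R). It assumes ordinary weighted least squares, distances scaled by their mean as in the standard S-map weighting, and an added intercept column. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def smap_coefficients(X, Y, t, theta):
    """Weighted least-squares coefficients around observation t.

    X : (n, m) array of system states
    Y : (n,) array of the target variable one step ahead
    The focal observation t is left out of its own regression.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    keep = np.arange(len(X)) != t                  # leave the focal row out
    d = np.linalg.norm(X[keep] - X[t], axis=1)     # state-space distances
    w = np.exp(-theta * d / d.mean())              # weights; mean scaling is an assumption
    Xd = np.column_stack([np.ones(keep.sum()), X[keep]])  # intercept is an assumption
    XtW = Xd.T * w                                 # equals Xd.T @ diag(w)
    return np.linalg.solve(XtW @ Xd, XtW @ Y[keep])

# Synthetic, exactly linear system: any weighting recovers the same coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
Y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
beta = smap_coefficients(X, Y, t=10, theta=2.0)    # intercept, then one slope per column
```

Repeating the call for every t yields one coefficient vector per observation, which is the sequence of local Jacobian rows that the S-map describes. In a nonlinear system the coefficients would differ between observations, more so as θ increases.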

Eq. 4.2 is the general formula for calculating an S-map, but the data on the Lasthenia conjugens vernal pool plant community require a more complicated model because species frequencies were quantified as the percent of cells in a quadrat they occupied. We converted these data to densities, by normalizing to unity, and then used weighted beta regression. Lasthenia conjugens was often unobserved, and sometimes occurred in all 100 quadrat cells, so we used weighted zero-one-inflated beta regressions to characterize effects on Lasthenia conjugens densities, which was accomplished with the R package GAMLSS [203]. In these models, we fit the future density of Lasthenia conjugens in pool p as a function of its past density in the same pool, the past densities of other species in the same pool, and the early and late rainfall that year. Pool size (small, medium, or large) and pool ID were included as categorical variables.

$$ L_{p,t+1} \sim L_{p,t} + X_{1,p,t} + \cdots + X_{N,p,t} + \text{Rainfall}_{\text{early},t} + \text{Rainfall}_{\text{late},t} + \text{factor}(\text{Pool Size}_p) + \text{factor}(\text{Pool ID}_p) \tag{4.3} $$

4.2.3 Causality

Calculating the Jacobians in an S-map with multivariate regression makes them susceptible to “mirage” correlations, where variables that do not interact covary because of common drivers [90]. This in turn makes any individual $\frac{\partial i}{\partial j}$ unreliable as an indicator of actual causal effects. Ideally, then, we would use a method of assessing true causation to filter or weight the results of an S-map analysis. Convergent Cross Mapping (CCM; [232]) is a natural choice because it too derives from the EDM framework. Unfortunately, CCM assumes a degree of deterministic coupling (chaotic nonlinear dynamics) that is not ubiquitous in ecological systems. Often many observed species in a community will not be testable with CCM [141], and weights for predictor variables should include all predictors. Additionally, both CCM and Granger causality [91] collapse the data’s temporal component down to a single value. Using one of them as a weight for the predictors in an S-map therefore results in a single weight for each predictor across all time points, preventing the S-map from capturing dynamic interactions. Liang’s information flow [150] is extremely promising as an alternative in its adherence to the principle of nil causality, where “an event is not causal to another event if the evolution of the latter does not depend on the former”, but is currently only applicable to two-dimensional systems.

We have instead used a well-studied information theoretic measure of causality which first assumes no information is being transferred between a potential causer and causee and then measures how wrong this assumption is as a Kullback entropy [136], also known as a KL divergence and here denoted with the letter T because the most common formalism, Schreiber’s [214], called this idea transfer entropy.

$$ T_{\text{species } B \to \text{species } A} = \sum_t p\left(A_{t+1}, A_t^{(c)}, B_t^{(d)}\right) \log \frac{p\left(A_{t+1} \mid A_t^{(c)}, B_t^{(d)}\right)}{p\left(A_{t+1} \mid A_t^{(c)}\right)} \tag{4.4} $$

Here A and B are species abundances (or densities) and p is the probability of the system being in a particular state. Note that this state spans multiple time points. Schreiber described the superscripts c and d as the order of the Markov process assumed in modeling the data in discrete time steps, from t to t + 1. Within the EDM framework these values can be calculated as the manifold’s embedding dimension, such that $A_t^{(c)} = \{A_t, A_{t-\tau}, A_{t-2\tau}, \cdots, A_{t-(c-1)\tau}\}$. The values of c and d are challenging to determine and larger values make calculating the probabilities that comprise transfer entropy computationally intractable. Fortunately, the vernal pool plant community studied here contains mostly annuals, making c = d = 1 a natural choice.

Schreiber recommends approximating the probabilities that comprise transfer entropy using generalized correlation integrals estimated with a step kernel Θ averaged across all observations.

$$ \hat{p}_r(A_{t+1}, A_t, B_t) = \frac{1}{|t'| - 1} \sum_{t' \neq t} \Theta\left( \left\lVert \begin{pmatrix} A_{t+1} - A_{t'+1} \\ A_t - A_{t'} \\ B_t - B_{t'} \end{pmatrix} \right\rVert - r \right) \tag{4.5} $$

This estimates each probability as the proportion of system states that were within some distance r from the state of interest, namely $\{A_{t+1}, A_t, B_t\}$. We have instead calculated these probabilities using an exponentially decaying kernel estimation where each observation gives every other observation probability mass exponentially weighted by their Euclidean distance, to parallel the weighting used in S-maps (Eq. 4.1).

$$ \hat{p}_r(A_{t+1}, A_t, B_t) = \frac{1}{|t'| - 1} \sum_{t'} \exp\left( \frac{-\theta \lVert (A_{t+1}, A_t, B_t) - (A_{t'+1}, A_{t'}, B_{t'}) \rVert}{\overline{\lVert (A_{t+1}, A_t, B_t) - (A_{t'+1}, A_{t'}, B_{t'}) \rVert}} \right) \tag{4.6} $$

The decay parameter θ serves a similar purpose here to the exponential weighting of data in S-maps: to share information between observations using the Euclidean distance between them. We have therefore used the same value for both: θ = 10.22. Note that this works identically for the other two probabilities needed to calculate transfer entropy, $p(A_{t+1} \mid A_t, B_t) = \frac{p(A_{t+1}, A_t, B_t)}{p(A_t, B_t)}$ and $p(A_{t+1} \mid A_t) = \frac{p(A_{t+1}, A_t)}{p(A_t)}$, only with two- and one-dimensional distances.
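The exponential-kernel estimate of transfer entropy described above (Eqs. 4.4 and 4.6, with c = d = 1) can be sketched in Python as follows. This is an illustrative sketch rather than the code used for the analysis. It assumes that distances are scaled by their mean over all pairs of distinct observations, that each observation is excluded from its own estimate, and that the sum in Eq. 4.4 runs over observed time points. The function names and the toy series are hypothetical.

```python
import numpy as np

def kernel_probability(points, theta):
    """Exponential-kernel probability estimate for each row of `points`."""
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    off_diagonal = ~np.eye(n, dtype=bool)
    kernel = np.exp(-theta * d / d[off_diagonal].mean())  # mean scaling is an assumption
    kernel[~off_diagonal] = 0.0                           # exclude self (assumption)
    return kernel.sum(axis=1) / (n - 1)

def transfer_entropy(a, b, theta):
    """Transfer entropy from series b to series a with c = d = 1.

    Returns the total and the per-time-point terms of the sum.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_next, a_now, b_now = a[1:], a[:-1], b[:-1]
    p_full = kernel_probability(np.column_stack([a_next, a_now, b_now]), theta)
    p_ab = kernel_probability(np.column_stack([a_now, b_now]), theta)
    p_aa = kernel_probability(np.column_stack([a_next, a_now]), theta)
    p_a = kernel_probability(a_now, theta)
    local = p_full * np.log((p_full / p_ab) / (p_aa / p_a))
    return local.sum(), local

# Toy series in which B drives A, plus a constant series that carries no information.
rng = np.random.default_rng(1)
b = rng.random(60)
a = np.empty(60)
a[0] = rng.random()
for t in range(59):
    a[t + 1] = 0.5 * a[t] + 0.5 * b[t]
te_total, te_local = transfer_entropy(a, b, theta=1.0)
te_null, _ = transfer_entropy(a, np.zeros(60), theta=1.0)
```

A constant candidate driver adds nothing to any distance, so the two conditional probabilities coincide and `te_null` is zero up to rounding. The pairwise distance array grows with the square of the number of observations, so a dataset the size of the one used here would need a blockwise computation.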

4.2.4 Causally-filtered S-maps

Transfer entropy has three properties that make it especially well suited to causally correcting S-maps. (i) It makes no assumptions about the dynamics of the time series of species A or B and is therefore universally applicable. (ii) Transfer entropy can account for common drivers Z of species A and B “by conditioning the probabilities under the logarithm to each $z_n \in Z$ as well” [214]. For example, $T_{B \to A} = \sum_t p\left(A_{t+1}, A_t^{(c)}, B_t^{(d)}\right) \log \frac{p\left(A_{t+1} \mid A_t^{(c)}, B_t^{(d)}, z_t^{(e)}\right)}{p\left(A_{t+1} \mid A_t^{(c)}, z_t^{(e)}\right)}$ prevents species B from appearing causal to species A when they were both caused by a common environmental driver z, and therefore contain overlapping information irrelevant to their interspecific interactions [222]. (iii) Transfer entropy is a sum. By splitting this sum apart we can quantify causality at the temporal resolution of the data. While this almost certainly reduces its sensitivity, each calculation still contains information about every observation. The features of this fine-grained use of transfer entropy have yet to be studied and deserve further attention.

A temporally split transfer entropy therefore forms a one-to-one correspondence (bijection) with an S-map. We considered two ways of combining these two analyses. The first was a causal filter where entries in the S-map Jacobians are proportional to the magnitude of the corresponding marginal transfer entropies. We decided against this approach because of a desire to decouple the amount of information moving between variables and their biological effects on each other.

Instead, we calculated the Jacobian entries as the expected S-map coefficients across all possible community compositions where species are included proportional to their marginal transfer entropies. This models the system as a weighted average across an infinite number of realizations of the target species’ dynamics, here Lasthenia conjugens, but in each realization only some of the other species are included, with a probability proportional to their marginal transfer entropy. Simulating this approach is intractable in this system because many of the species have relatively tiny transfer entropies with Lasthenia conjugens and therefore never occur by chance.

However, the expected value of each interspecific interaction can be calculated as a weighted average of S-map coefficients, with one S-map for each set ρ of observed species x ∈ X in the power set P(X) of all observed species. Each S-map is weighted by the product of the transfer entropies (normalized by ϕ) from the species x in its corresponding set ρ to Lasthenia conjugens, denoted by the transfer entropy $T_x$. We excluded the null set ∅ such that Lasthenia conjugens had to interact with at least one of the observed species. Note that this treats $T_{x_i}$ as independent from all $T_{x_{j \neq i}}$.
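The power-set weighting just described can be sketched in Python for a single time point. This sketch is illustrative only. It assumes that the normalizing constant ϕ is the sum of the subset weights and that a species contributes nothing to the average from subsets that exclude it. The function names, species labels, transfer entropies, and the stand-in `toy_fit` (which takes the place of fitting one weighted regression per subset) are all hypothetical. `math.prod` requires Python 3.8 or later.

```python
from itertools import combinations
from math import prod

def nonempty_subsets(species):
    """All subsets of `species` except the empty set."""
    for size in range(1, len(species) + 1):
        yield from combinations(species, size)

def causally_weighted_coefficients(species, te, fit_subset):
    """Average subset-specific coefficients, weighted by products of transfer entropies.

    te         : {species: transfer entropy to the target species}
    fit_subset : callable returning {species: coefficient} for a given subset
    """
    subsets = list(nonempty_subsets(species))
    raw = [prod(te[s] for s in rho) for rho in subsets]
    phi = sum(raw)                                  # normalizing constant (assumed form)
    expected = {s: 0.0 for s in species}
    for rho, weight in zip(subsets, raw):
        coefs = fit_subset(rho)
        for s in rho:                               # absent species add nothing
            expected[s] += (weight / phi) * coefs[s]
    return expected

# Toy stand-in for a regression fit: every included species gets coefficient 1.
def toy_fit(rho):
    return {s: 1.0 for s in rho}

te = {"x1": 0.5, "x2": 0.1}
expected = causally_weighted_coefficients(["x1", "x2"], te, toy_fit)
```

With constant coefficients the result is the weighted share of subsets that contain each species, so the species with the larger transfer entropy retains more of its coefficient. The number of subsets doubles with every added species, so the full power set of a larger community is costly to enumerate.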

Causally-filtered S-map estimates $\hat{\beta}_c$ of the interactions of variables x ∈ X on a variable y of interest, here the density of Lasthenia conjugens, at time t are therefore defined as the regression estimates:

(4.7)

The weights on the diagonal of the matrices $\langle W_{\rho t} \rangle$ are at the heart of this method. They vary both for each time t, in accordance with the S-map method, and for each subset of species observed at each time t, proportional to the product of their marginal transfer entropies. As with traditional S-maps, these weights span all observations, across all 14 years and also all 247 pools, which treats temporal and spatial replication as independent observations of the community’s dynamics. While this assumption is almost never valid, it makes estimates of the probabilities that comprise transfer entropy (Eq. 4.4) more robust, and there is evidence in support of combining spatial and temporal replication to infer causality [36].

4.3 Results

To introduce the types of results and inferences from this method, we first discuss the results of the causally-filtered S-maps for just one pool. Predictions from Pool 300 provide a typical example of the multicollinearity in the dynamics of these vernal pool plant populations, how this results in unreliable S-map estimates of interaction strengths, and how our causally-filtered extension corrects for this. As seen in the top panel of Fig. 4.1, Lasthenia conjugens remained at very low densities until 2007 when it then rose to over a quarter of the community’s composition before crashing in 2010. The population rebounded the following year, rising to its highest density of 0.5 in 2012, before crashing again the following year. This trend was not unique to Lasthenia conjugens. Downingia concolor, Crassula aquatica, and the exotic species Lolium multiflorum similarly rose and fell in density twice. Separable models, like regression, cannot distinguish causes from covariates, even when weighted by community state similarity as S-maps are. The result is a parsing of variance across all covariates, including those that are not causally related.

This can be seen in the difference between the middle and bottom panels in Fig. 4.1. The uncorrected S-map (middle panel) suggests that Lasthenia conjugens had a small (< 10) and mostly constant annual growth rate, and that interspecific interactions were mostly unimportant except for a few very important exceptions which drove its dynamics. Achyrachaena mollis helped Lasthenia conjugens establish in this pool, but along with Lupinus bicolor had a strong negative effect on Lasthenia conjugens in 2008. Curiously this competition was not reflected in the dynamics of Lasthenia conjugens which increased in density. More recently, Lasthenia conjugens appeared only to interact with Plagiobothrys stipitatus and Psilocarphus oregonus.

These estimates are unconvincing for two reasons. First, previous analyses show that Lolium multiflorum and Hordeum marinum, the two exotic species, played an important role in Lasthenia conjugens’ decline [42], but the S-map did not recover these interactions. Second, while interspecific interactions are dynamic, constant weak interactions punctuated by strong ephemeral ones with different species seem biologically unrealistic. Consider the complexity of mechanisms which would produce near identical weak interactions with a dozen co-occurring species most of the time, but then momentarily strong interactions with alternating individual species. This is more likely a statistical artifact.

The causally-filtered S-map offers a more parsimonious explanation of Lasthenia conjugens’ dynamics. As seen in the bottom panel of Fig. 4.1, they were more a result of intraspecific interactions, which were likely driven by early- and late-season rainfall which had strong effects on both Lasthenia conjugens’ annual growth rate (Fig. C.4) and the interspecific effects of other species in the community on Lasthenia conjugens (Fig. 4.5). However, rainfall was measured in inches, making the resulting strengths of early- and late-season rainfall on Lasthenia conjugens (Fig. C.5) incomparable with (smaller than) the inferred interspecific interaction strengths. Also more parsimonious was the finding that those interspecific interactions that were present were consistent, and with species we know from previous work [42, 73] did interact with Lasthenia conjugens: the exotic Lolium multiflorum and Eryngium vaseyi which was the only observed perennial species. As Fig. 4.2 shows, these trends were consistent across all 247 pools, where the exotic and perennial species had the strongest effects on Lasthenia conjugens.

Figure 4.1: Observed population fluctuations and inferred interactions in vernal pool 300. Top) Observed densities of Lasthenia conjugens, 14 other native plant species, and 2 exotic plant species from 2002 to 2015. Middle) Interactions between Lasthenia conjugens, the 16 other plant species, and both early and late season precipitation inferred using an S-map. These values are the $\hat{\beta}_t$ coefficients in the S-map’s multivariate zero-one-inflated beta regressions, which measure the effect of a change in the density of each species on the annual growth rate of Lasthenia conjugens. These inferred interactions incorrectly parse Lasthenia conjugens’ variance across covarying but non-interacting species. Bottom) Identical to the middle panel only using causally-filtered S-maps, which appear to more correctly parse the community’s covariance structure to identify Lasthenia conjugens’ independence, and the exotic Lolium multiflorum and perennial Eryngium vaseyi as the main interspecific drivers of its dynamics.

S-maps, and our causal extension of them, are powerful in that they estimate interactions at the same resolution as the underlying data. We can therefore more closely examine patterns in the interactions of the exotic and perennial species with Lasthenia conjugens. The following two sections are devoted to dissecting these results for the entire dataset, though not for Hordeum marinum which only had a strong effect on Lasthenia conjugens in 2004 (Fig. C.1).

4.3.1 Lolium multiflorum

Lolium multiflorum’s effect on Lasthenia conjugens was spatially uniform (Fig. 4.3, top right panel). While there was variance among pools in the strength of its interaction, this variance was not directional. This contrasts with the 16 native species growing in these pools with Lasthenia conjugens which had a tendency to interact more strongly in more northern pools (Fig. C.2). We suspect this resulted from a northern downward slope. This slope is difficult to see with the naked eye and therefore unlikely to affect dispersal, but it could have affected drainage across the site leading to an accumulation of water in the more northern pools.

While Lolium multiflorium did not co-occur at high densities with Lasthenia conjugens, its effect on Lasthenia conjugens was independent of either of their densities (Fig. 4.3, bottom left panel). It could be that Lolium multiflorium’s thatch accumulated too quickly in these pools for either of their densities to matter. Alternatively, this system may be driven by rainfall to such an extent that the co-occurrence of other species is comparatively unimportant to the survival and reproduction of Lasthenia conjugens. The top left and bottom right panels of Fig. 4.3 show that these interactions were predominantly water-mediated, with the strongest interactions occurring at intermediate levels of both early and late season rainfall. Lasthenia conjugens’ growth rate was highest at intermediate levels of precipitation (Fig. C.4), making it also the most susceptible to interspecific effects. In the first five years of observations, up to 2006, rainfall drove this interaction. As the years got wetter Lolium multiflorium’s effect on Lasthenia conjugens weakened and almost disappeared, before strengthening in the drier 2006 and 2007. Lolium multiflorium antagonized Lasthenia conjugens at lower levels of precipitation, but the fluctuations in precipitation between 2010 and 2015 had little influence on Lolium multiflorium’s effect on Lasthenia conjugens.

Figure 4.2: Interspecific effects on Lasthenia conjugens, inferred with causally-filtered S-maps, aggregated across all 247 pools for all 14 years. Hordeum marinum’s effect on Lasthenia conjugens mostly occurred in 2004 (Fig. C.1).

Figure 4.3: Aggregate effects of Lolium multiflorium on Lasthenia conjugens’ annual growth rate (β̂_c). Top Left) Annual interspecific interactions. Boxes span interquartile ranges (IQR), with the inside bar at medians and whiskers extending to the most extreme data within ±1.5 × IQR. Top Right) Average interspecific interactions in each vernal pool. Only pools where Lolium multiflorium was observed are plotted. Bottom Left) Density-mediated interactions between Lolium multiflorium and Lasthenia conjugens. Bottom Right) Interactions as a function of early- and late-season precipitation.

4.3.2 Eryngium vaseyi

Eryngium vaseyi’s effect on Lasthenia conjugens was also spatially homogeneous, with four exceptions. In four pools in the north and east of the group south of the abandoned airstrip, Eryngium vaseyi benefited Lasthenia conjugens, on average. Drainage likely cannot explain these outliers because the effect was not present in surrounding pools, and there was no directional gradient.

Unlike with Lolium multiflorium, Eryngium vaseyi’s interactions with Lasthenia conjugens were density dependent, but only Eryngium vaseyi’s density affected these interactions. In a two-way regression explaining Eryngium vaseyi’s interactions with Lasthenia conjugens, Lasthenia conjugens’ density had a p-value of 0.65, Eryngium vaseyi’s density had a p-value of 0.004, and the interaction between their densities had a p-value of 0.056. This can be seen in the preponderance of blue points along the vertical axis of Fig. 4.4 (bottom left panel), where Eryngium vaseyi’s density was close to zero. This may be evidence that, as a perennial plant, Eryngium vaseyi alters the physical makeup of vernal pools in ways that are advantageous to Lasthenia conjugens at low densities, or that Eryngium vaseyi prevented exotic establishment or the accumulation of their thatch. Eryngium vaseyi and Lolium multiflorium tended not to co-occur and showed a complementary tradeoff in the strength of their effects on Lasthenia conjugens (Fig. C.6). However, Eryngium vaseyi occurred at low densities even when Lolium multiflorium occurred at high densities, possibly keeping Lolium multiflorium’s thatch from completely hindering Lasthenia conjugens’ survival.

Water mediated Eryngium vaseyi’s effect on Lasthenia conjugens the same as it did Lolium multiflorium’s (Fig. 4.4, bottom right panel). It had the strongest negative effect on Lasthenia conjugens at intermediate levels of precipitation, and its weaker positive effects, which were more numerous than Lolium multiflorium’s, were water-independent. The top left panel of Fig. 4.4 indicates that late season rain drove these interactions. 2006 and 2010 were the wettest years between January and April (the late season) and were also years where Eryngium vaseyi’s effect on Lasthenia conjugens was positive. This trend was inconsistent though. 2011 was dry all the way from October through April (early- and late-season) yet these interactions were still positive.

Figure 4.4: Aggregate effects of Eryngium vaseyi on Lasthenia conjugens’ annual growth rate (β̂_c). Top Left) Annual interspecific interactions. Boxes span interquartile ranges (IQR), with the inside bar at medians and whiskers extending to the most extreme data within ±1.5 × IQR. Top Right) Average interspecific interactions in each vernal pool. Only pools where Eryngium vaseyi was observed are plotted. Bottom Left) Density-mediated interactions between Eryngium vaseyi and Lasthenia conjugens. Bottom Right) Interactions as a function of early- and late-season precipitation.

4.3.3 Water-Mediated Interactions

It is no surprise that water mediated interspecific interactions between these vernal pool plant populations, as their life histories have co-evolved with the annual rainfall cycle. We therefore also considered how water mediated all of Lasthenia conjugens’ interspecific interactions.

As Fig. 4.5 shows, the species that interacted with Lasthenia conjugens fell into two general kinds of interaction strength distributions: stronger interactions at intermediate levels of precipitation and stronger interactions at extreme levels of precipitation. The three strongest interactors, the perennial Eryngium vaseyi and the two exotics Hordeum marinum and Lolium multiflorium, were all in the first category. Those species that fell into the second category also differed in being inconsistent in the way they interacted with Lasthenia conjugens. Downingia concolor and Plagiobothrys stipitatus were like the perennial and two exotic species in that they had a negative effect on Lasthenia conjugens, but Pleuropogon californicus and Deschampsia danthonioides helped Lasthenia conjugens. Also unlike those species in the first category, the sign of these species’ interactions tended to flip at intermediate levels of precipitation.

4.4 Discussion

Figure 4.5: Lasthenia conjugens’ interspecific interactions as a function of early- and late-season precipitation, defined as their effect on Lasthenia conjugens’ annual growth rate (β̂_c). The darkest colors in the legend indicate 5 or more overlapping points.

We have presented a way to infer traditional ecological interaction strengths amidst multicollinearity in a real community of conservation interest by combining S-maps [57], which are time series of Jacobians (matrices of pairwise interaction strengths), with transfer entropies [214], which measure causality. These causally-filtered S-maps can be inferred from observational data and estimate interaction strengths at the same resolution as the data being used. As we have demonstrated, this approach offers a way to infer drivers of important species, such as the endangered Californian vernal pool plant Lasthenia conjugens, in communities with substantial covariance structures without the benefit of costly or impractical experimental manipulations.
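As a rough illustration of the S-map component discussed here, the sketch below fits a locally weighted linear regression at every time point, so that the fitted coefficients form a time series of interaction strengths. It is a minimal sketch under stated assumptions, not the analysis used in this chapter: it uses ordinary weighted least squares rather than zero-one-inflated beta regression, it omits the causal filter entirely, and the names `solve_linear`, `smap_coefficients`, and `theta` are chosen here for illustration.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def smap_coefficients(states, targets, theta):
    """Return one vector [intercept, b_1, ..., b_k] per time point.

    states[t] holds the k predictor densities at time t; targets[t] is the
    response (for example, the focal species' growth into the next year).
    """
    n = len(states)
    rows = [[1.0] + list(s) for s in states]
    p = len(rows[0])
    coefs = []
    for t in range(n):
        dists = [math.dist(states[t], s) for s in states]
        mean_d = sum(dists) / n
        # States close to the focal state get exponentially more weight;
        # theta = 0 would reduce this to a single global linear model.
        w = [math.exp(-theta * d / mean_d) for d in dists]
        # Weighted normal equations: (X' W X) beta = X' W y.
        xtwx = [[sum(w[i] * rows[i][j] * rows[i][k] for i in range(n))
                 for k in range(p)] for j in range(p)]
        xtwy = [sum(w[i] * rows[i][j] * targets[i] for i in range(n))
                for j in range(p)]
        coefs.append(solve_linear(xtwx, xtwy))
    return coefs


# Synthetic check (invented data): the response depends linearly on two
# "species", so every local fit should recover the same coefficients.
rng = random.Random(1)
states = [(rng.random(), rng.random()) for _ in range(40)]
targets = [0.5 * x - 0.3 * y for x, y in states]
coefs = smap_coefficients(states, targets, theta=2.0)
```

In a nonlinear system the rows of `coefs` would differ from one time point to the next, which is what allows interactions to be read at the resolution of the data rather than as one averaged community matrix.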

Inferring interaction strengths at the resolution of individual data points, arrayed spatially and temporally, rather than with a single averaged community matrix, offers tremendous advantages. We identified interspecific drivers of the endangered vernal pool plant Lasthenia conjugens in time and space, as well as how species density and levels of early and late season precipitation mediated these interactions. While the importance of the two exotic species (Lolium multiflorium and Hordeum marinum) was already known [42], our approach offers a more nuanced interpretation of when and how these species affected Lasthenia conjugens. For instance, the strengths of these two interspecific interactions were largely dependent on early and late season rainfall, with stronger antagonism at intermediate levels. This drove the early takeover of these vernal pool communities by Lolium multiflorium, which creates a thatch [73] that prevents native species like Lasthenia conjugens from rebounding in drier or wetter years, or using their seedbank to re-establish. Interestingly, many of the other native species interacted with Lasthenia conjugens more strongly in those extreme dry or wet years, some antagonistically but some beneficially.

The high resolution of interactions inferred by both S-maps and our causal extension allows community dynamics to be mapped precisely through space and time. For instance, we found evidence of interactions being more beneficial to Lasthenia conjugens in more northern pools, though with exceptions such as Veronica peregrina (Fig. C.2). We were also able to identify transient water-mediated interspecific dynamics, such as with Lolium multiflorium before and after 2006. Across all 14 native and 2 exotic species, we found that interspecific effects on Lasthenia conjugens were the exception (Fig. C.1). Hordeum marinum, the other exotic species, mostly exerted a small negative effect on Lasthenia conjugens, but in 2004 caused dramatic declines. This kind of punctuated interspecific interactivity also occurred with Achyrachaena mollis, Downingia concolor, and Lasthenia glaberrima, emphasizing the danger of using a single community matrix to model interspecific interactions. Similarly, an historical reliance on individual community matrices has hindered our understanding of density dependence, which has been exacerbated by the ways multicollinearity masks direct effects. Using causally-filtered S-maps we were able to show that, with the exception of Eryngium vaseyi, density dependence was largely unimportant to interspecific effects on Lasthenia conjugens.

Causally-filtered S-maps add to a growing body of causal inference methods in ecology that can be used to infer ecological processes in the face of noisy data often collected at scales larger than those at which the processes occur. While there is evidence that we can and should make conservation decisions with these methods [55, 56, 229], there is still no unified theory of causality, and often species within a single system will meet the assumptions of different causal methods differently [141]. The closest to a unified theory of causality is called information flow, or information transfer [148], which is the only theory we are aware of that adheres to the principle of nil causality, that “an event is not causal to another event if the evolution of the latter does not depend on the former” [150]. Unfortunately, information flow has only been developed to the stage where it is usable for real data from two-dimensional systems [148]. Additionally, none of the causal theories we are aware of explicitly address the relationship between causal strength and interaction strength. This is especially important when combining causal inference with methods like S-maps that infer community matrices of interaction strengths. As mentioned in the Causally-filtered S-maps section of the Methods, our formulation of a causal correction for S-maps decouples the strength of causation from the resulting per capita interaction strengths, but often these are assumed to be identical. This was not the case for transfer entropies and per density interaction coefficients in this system (Fig. C.8), where a small or large interspecific transfer of information sometimes manifested a large or small per capita biological effect respectively. This relationship is complicated by causal noise and the resulting distribution around nil causality (Fig. C.8). Observation and process noise create causal noise, but we are currently unaware of a null model for causality which accounts for this. Another complication is that some causal inferences measure the certainty of causation, not its strength. Convergent Cross Mapping (CCM) is one such method [232]. These inferences should be explicitly decoupled in future work that estimates interaction strength using causal inference.
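To make the distinction between information transfer and effect size concrete, the following sketch computes a plug-in transfer entropy between two discrete series. It is an illustration under simplifying assumptions rather than the estimator used in this chapter: the series are already discretized, the history length is one time step, and the function name `transfer_entropy` is chosen here for illustration.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in transfer entropy (in bits) from source to target, lag 1.

    TE = sum p(y1, y0, x0) * log2[ p(y1 | y0, x0) / p(y1 | y0) ]
    with y1 = target[t + 1], y0 = target[t], x0 = source[t].
    """
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_yx = Counter(zip(target[:-1], source[:-1]))  # (y0, x0)
    pairs_yy = Counter(zip(target[1:], target[:-1]))   # (y1, y0)
    singles = Counter(target[:-1])                     # y0
    te = 0.0
    for (y1, y0, x0), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_yx[(y0, x0)]
        p_given_self = pairs_yy[(y1, y0)] / singles[y0]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te


# Invented example: y copies x with a one-step delay, so information
# flows from x to y but not from y to x.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
```

Because the estimate depends only on how often symbols co-occur, relabeling the states of `y` (for instance, rescaling the densities before binning them the same way) leaves it unchanged. This is one way to see why a transfer of information need not track the magnitude of a per capita effect.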

An accompanying measure of uncertainty would strengthen both the original S-map method and our causal extension. While the goal of these methods is to infer dynamic interactions, it may be that in noisy and poorly-sampled systems the exact values they produce are not reliable.

In the system studied here, densities were measured using a single quadrat per pool per year, which incorrectly assumed homogeneity. The reappearance of Lasthenia conjugens in pools following years where it was absent may reflect complicated seed bank dynamics at play in these pools [41] and a storage effect that helps plants survive stressful years [72, 250]. It may also reflect dispersal from other individuals in a part of the pool that was not sampled. A sequence of binary (unweighted) networks, where interspecific interactions are classified as present or absent at each time point, would better model these unobserved dynamics than a single weighted average matrix. These sequences could be produced by repeatedly randomizing the data to bootstrap a distribution of transfer entropies and S-map weights. Together these would then create a distribution of causally-filtered S-map interaction strengths capable of testing the significance of the true interaction strengths against a null hypothesis that they were equal to zero.
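The randomization idea in this paragraph can be sketched as a permutation test. The example below is only a schematic under stated assumptions: a lagged correlation stands in for the transfer entropy or S-map coefficient, the data are invented, and the function names are hypothetical. Simple shuffling also destroys temporal autocorrelation, so a real analysis would likely need surrogate series that preserve it.

```python
import random


def lagged_correlation(x, y):
    """Pearson correlation between x[t] and y[t + 1]."""
    a, b = x[:-1], y[1:]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def permutation_p_value(x, y, statistic, n_perm=999, seed=0):
    """P-value for |statistic(x, y)| against shuffled copies of x."""
    rng = random.Random(seed)
    observed = abs(statistic(x, y))
    exceed = 0
    for _ in range(n_perm):
        shuffled = x[:]
        rng.shuffle(shuffled)  # breaks any real link between x and y
        if abs(statistic(shuffled, y)) >= observed:
            exceed += 1
    # Counting the observed value itself keeps the p-value above zero.
    return (exceed + 1) / (n_perm + 1)


# Invented data: y follows x one step later with a little noise.
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(60)]
y = [0.0] + [0.8 * v + rng.gauss(0.0, 0.1) for v in x[:-1]]
p_value = permutation_p_value(x, y, lagged_correlation)
```

Replacing `lagged_correlation` with a causally-filtered interaction strength would give the kind of null distribution described above, at the cost of refitting the model once per permutation.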

While we have demonstrated that causal inference can help S-maps overcome ecological multicollinearity, combining the two comes at a cost. S-maps parse variance among covarying but non-interacting species because this does a better job of accounting for systemic dynamics. Causally-filtered S-maps are averages of models with complexity bounded from above by the normal S-map model. In this way S-maps are an upper bound on the goodness-of-fit of causally-filtered S-maps. If the goal is to predict the density or abundance of a species of interest at a particular point in space and time, causally-filtered S-maps should not be used. However, conservation requires explanation as well as prediction. If we wish to intervene in the decline of an endangered species, identifying the causes of its dynamics is more important than predicting the future of its dynamics.

Chapter 5

The future of causal community ecology

Ecology is poised for a major shift in how observational data are used to address classic questions. The potential for this shift is rooted in the diagnostic promise of causality as a currency of interaction. While the current methods of causal inference warrant adoption, the field is in its infancy. Rather than end this dissertation with a review of what I describe in earlier chapters, I use this final chapter to discuss future work needed in the fields I have contributed to in Chapters 3 and 4. Answering the following questions will help usher in a much needed era of causal community ecology, where conservation policy can be better designed and monitored in real time with observational data.

5.1 How do incomplete or aggregate community descriptions affect our ability to infer causal interactions?

Causal inference ostensibly identifies true causation between two variables regardless of an unknown third variable that is causal to either or both of them. This claim requires more rigorous verification as it is at the heart of causal inference’s proposed utility. Similarly, we do not understand the role aggregation plays in masking causality. Consider a system that was observed at a coarser scale than the causation was occurring, such as when data are collected at higher taxonomic levels (e.g. genus or family), or when the causation occurs only as a subset of a species’ behaviors (e.g. human predation). Thus far causal inference has been developed mostly from the perspective of causal theory, but as it becomes an applied set of empirical methods, understanding the effects of the kinds of data used will be increasingly important for its adoption and intelligent use.

5.2 How does causality propagate through a community?

Just as traditionally-inferred interactions between species generate indirect effects, captured by the negative inverse of a Jacobian community matrix [143], so too do causal interactions. This has not been studied in part because the relationship between causal strength and interaction strength is not fully understood. Modeling causal propagation also requires an understanding of causal autonomy, or what is meant by direct causation. If species A drives species B so completely that they synchronize, can species B be considered the cause of any other species C? This particularly matters when identifying drivers of a species of interest for the purpose of management intervention. If causal inference only identifies causally autonomous species, we might decide to intervene at the beginning of a complex causal chain of interactions when it would be less damaging to the entire community to intervene with the most proximate, non-autonomous interactor.
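The negative inverse mentioned at the start of this section can be illustrated with a small invented community. The sketch below assumes the convention that `jacobian[i][j]` is the direct effect of species j on species i, and the helper names `invert` and `net_effects` are hypothetical. Whether an analogous calculation is meaningful for causal strengths is exactly the open question raised here.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def net_effects(jacobian):
    """Negative inverse of the Jacobian: direct plus indirect effects."""
    return [[-v for v in row] for row in invert(jacobian)]


# Invented three-species chain: A affects B, B affects C, and there is
# no direct effect of A on C (jacobian[2][0] is zero).
jacobian = [
    [-1.0, 0.0, 0.0],
    [0.5, -1.0, 0.0],
    [0.0, 0.5, -1.0],
]
net = net_effects(jacobian)
# net[2][0] is nonzero even though jacobian[2][0] is zero, because the
# effect of A reaches C through B.
```

In this example the net effect of A on C is the product of the two direct links (0.5 × 0.5), which is the kind of propagation a causal analogue would need to reproduce or replace.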

5.3 Can we compare causality inferred in different systems?

Part of the allure of causality as a currency of interaction is its universality. Any observational data collected at the same times can be tested for causation, even if the data were never intended to be tested together. One day this may lead to a global causal network that tracks causes and their effects in real time, but not without being able to normalize causal inferences across systems. So far only the information flow formalism of causality [150] has a normalized version [149], and this only encompasses two-dimensional systems that are largely deterministic. Causal inference is already being applied in epidemiology [51, 56], climate science [75, 244], neurobiology [213, 255, 256], economics [60], sociology [155], linguistics [112], and even astronomy [238], but only in isolation. This is in part because we do not know what biases accompany inference between variables of vastly different dimensionality [141] or kinds of dynamics. In order to begin constructing that global causal network we need null models for the strength of inferred causality as a function of the kinds of variables under consideration and the systems they derive from.

5.4 What is the relationship between causal strength and interaction strength?

Assuming we can normalize causal strengths, these can be used to identify those drivers most responsible for effects in variables of interest, such as endangered species. However, this does not explain the direct effects of those causes. As I have shown [140], the biological effects of a causal interaction are often uncorrelated with the strength of their causal relationship. That is, large amounts of information transferred between two variables need not result in a large absolute change in either. Moreover, methods like Convergent Cross Mapping measure the certainty of causation, but make no claim about its strength, let alone the magnitude of the resulting effect.

Understanding the relationships between causal and interaction strengths is important because we are almost always interested in the latter. To the extent that this is true, the approach I take in Chapter 4, using causal analysis not as an end point but as a filter to separate interactions from correlations, is likely to be its greatest utility.

5.5 To what extent can we substitute spatial replication for temporal replication in inferring causality?

Imagine two cyclists riding single file along a snowy street. Can we determine which track corresponds to the leader (cause) and which to the follower (effect) from an aerial photograph of their tracks? This is the simplest scenario, where the system’s dynamics unfold monotonically through space. While dispersing species often create a wave-like front [249], competition behind the leading edge results in replacement in species composition, making it more challenging to infer the relationships that caused that turnover. In the extreme case of competitive exclusion it is impossible to infer causality from a single aerial photograph because the entire system converges to a single species and there is no second set of observations to look for effects in. This scenario also complicates causal inference in that the dynamics occur in more than one dimension. All methods of inferring causality use sets of paired observations where, importantly, the order matters. There is currently no way to convert two-dimensional spatial data to a one-dimensional sequence of observations that can be tested for causality. Solving these problems would allow causality to be inferred faster, making it a more viable foundation for policy decisions. This work is already underway [36].

5.6 Does the principle of nil causality result in a trade-off in sensitivity?

Causal theory has no consensus axioms, but the principle of nil causality is one of the few that has garnered attention [141]. It states that “an event is not causal to another event if the evolution of the latter does not depend on the former” [150]. Eliminating false positives is important to the widespread trust and adoption of causal inference, as is a more rigorous axiomatic causal theory, but we do not fully understand the accompanying trade-offs. This is true in even the simplest case, of classifying a unidirectional relationship between two variables as causal or not, and is especially concerning because in probabilistic statistics eliminating false positives also eliminates true positives. Except for Cross Map Smoothness [157], for which ROC curves have been described, these statistical trade-offs have been ignored.
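The trade-off described above can be made concrete with a toy calculation. The sketch below uses invented causal-strength scores for pairs whose true status is assumed known, and the function name `roc_points` is hypothetical. It shows only the generic statistical trade-off, not the behavior of any particular causal method.

```python
def roc_points(null_scores, causal_scores, thresholds):
    """Return (threshold, false positive rate, true positive rate) triples."""
    points = []
    for th in thresholds:
        # A pair is called causal when its inferred strength exceeds th.
        fpr = sum(s > th for s in null_scores) / len(null_scores)
        tpr = sum(s > th for s in causal_scores) / len(causal_scores)
        points.append((th, fpr, tpr))
    return points


# Hypothetical inferred causal strengths, invented for illustration only.
null_scores = [0.00, 0.01, 0.02, 0.03, 0.05, 0.08]    # truly non-causal pairs
causal_scores = [0.02, 0.04, 0.07, 0.10, 0.20, 0.40]  # truly causal pairs
points = roc_points(null_scores, causal_scores, [0.0, 0.03, 0.08])
```

With these invented numbers, the strictest threshold removes every false positive but also discards half of the truly causal pairs, which is the sensitivity cost that a strict reading of nil causality may impose.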

5.7 What is the asymptotic behavior of causal inference?

How much data is required to infer causality? How is the answer affected by the temporal scale of the causal relationship? And how is this affected by observation and process noise? These questions are part of a broader need to quantify uncertainty in inferred causation by defining null models for causality. Consider the previously discussed principle of nil causality, and that noisy observations can produce distributions of causation with variance about zero [140]. Without a theoretical null distribution it is impossible to know if nonzero causation is part of the tail of the nil causality distribution or a separate instance of real causation. Fortunately, there are already studies of how noise affects Convergent Cross Mapping [39, 168].

5.8 Can causal inference predict the effects of adding or removing a variable in a system?

One of the longstanding issues in community ecology is that constituent variables change and interact in previously unobserved ways when even a single variable is added or removed. This prevents prediction. Part of the appeal of causal inference is that it identifies mechanisms and so ought to be more resilient to changing community compositions. However, I am not aware of evidence that this is true, or how wrong it is. It may be that the answer to this question is ‘no’ because causality is inherently context dependent and not guaranteed to work the same in different contexts. This question is of particular relevance as Earth’s climate continues to change and create novel species assemblages in novel environments [106].

5.9 What will convince people that causality can be inferred from observational data?

Politicians often justify ideological policy by claiming we cannot separate causation from correlation and therefore cannot know the causes of trends we hope to change. For causal inference to make a real difference, people of all backgrounds will have to trust these methods and the underlying theory. Proving that causality can be inferred from observational data will require more than recovering parameters in theoretical models. These methods have to be tested in systems that are understood experimentally and have little intersection with important social norms [120]. Most importantly, causal inference is buried under jargon and behind academic paywalls. Free and interactive online tools can help, where people can explore causal relationships in their own data with a graphical interface and build an intuition for the tremendous value and potential of causal inference.

Bibliography

[1] Ajay Aggarwal, Walter Scott, Eric Rustici, David Bucciero, Andrew Haskins, and Wallace Matthews. Method and apparatus for determining a communications path between two nodes in an internet protocol (ip) network, October 7 1997. US Patent 5,675,741.

[2] Réka Albert and Albert-László Barabási. Statistical mechanics of complex networks. Reviews of modern physics, 74(1):47, 2002.

[3] Mário Almeida-Neto, Paulo Guimarães, Paulo R Guimarães, Rafael D Loyola, and Werner Ulrich. A consistent metric for nestedness analysis in ecological systems: reconciling concept and measurement. Oikos, 117(8):1227–1239, 2008.

[4] Mário Almeida-Neto and Werner Ulrich. A straightforward computational approach for measuring nestedness using quantitative matrices. Environmental Modelling & Software, 26(2):173–178, 2011.

[5] Stephen F Altschul, Warren Gish, Webb Miller, Eugene W Myers, and David J Lipman. Basic local alignment search tool. Journal of molecular biology, 215(3):403–410, 1990.

[6] Luís A Nunes Amaral, Antonio Scala, Marc Barthelemy, and H Eugene Stanley. Classes of small-world networks. Proceedings of the national academy of sciences, 97(21):11149–11152, 2000.

[7] Miguel B Araújo and Alejandro Rozenfeld. The geographic scaling of biotic interactions. Ecography, 37(5):406–415, 2014.

[8] Wirt Atmar and Bruce D Patterson. The measure of order and disorder in the distribution of species in fragmented habitat. Oecologia, 96(3):373–382, 1993.

[9] Engin Avci. An expert system based on wavelet neural network-adaptive norm entropy for scale invariant texture classification. Expert Systems with Applications, 32(3):919–926, 2007.

[10] Joseph G Azofeifa and Robin D Dowell. A generative model for the behavior of rna polymerase. Bioinformatics, page btw599, 2016.

[11] Albert-Laszlo Barabási, Hawoong Jeong, Zoltan Néda, Erzsebet Ravasz, Andras Schubert, and Tamas Vicsek. Evolution of the social network of scientific collaborations. Physica A: Statistical mechanics and its applications, 311(3):590–614, 2002.

[12] Albert-Laszlo Barabasi and Zoltan N Oltvai. Network biology: understanding the cell’s functional organization. Nature reviews genetics, 5(2):101–113, 2004.

[13] A K Barner, K Coblentz, S Hacker, and B Menge. Fundamental contradictions among common estimates of non-trophic species interactions. Ecology, 2018, In press.

[14] Jordi Bascompte. Networks in ecology. Basic and Applied Ecology, 8(6):485–490, 2007.

[15] Jordi Bascompte, Pedro Jordano, and Jens M Olesen. Asymmetric coevolutionary networks facilitate biodiversity maintenance. Science, 312(5772):431–433, 2006.

[16] Edward B Baskerville and Sarah Cobey. Does influenza drive absolute humidity? Proceedings of the National Academy of Sciences, 114(12):E2270–E2271, 2017.

[17] Michael Batty. The size, scale, and shape of cities. science, 319(5864):769–771, 2008.

[18] Stephen J Beckett. Improved community detection in weighted bipartite networks. Royal Society open science, 3(1):140536, 2016.

[19] Eric L Berlow, Anje-Margiet Neutel, Joel E Cohen, Peter C De Ruiter, BO Ebenman, Mark Emmerson, Jeremy W Fox, Vincent AA Jansen, J Iwan Jones, Giorgos D Kokkoris, et al. Interaction strengths in food webs: issues and opportunities. Journal of animal ecology, 73(3):585–598, 2004.

[20] Louis-Félix Bersier, Carolin Banašek-Richter, and Marie-France Cattin. Quantitative descriptors of food-web matrices. Ecology, 83(9):2394–2407, 2002.

[21] Leonora S Bittleston, Naomi E Pierce, Aaron M Ellison, and Anne Pringle. Convergence in multispecies interactions. Trends in ecology & evolution, 31(4):269–280, 2016.

[22] Nico Blüthgen, Florian Menzel, Thomas Hovestadt, Brigitte Fiala, and Nils Blüthgen. Specialization, constraints, and conflicting interests in mutualistic networks. Current biology, 17(4):341–346, 2007.

[23] Benjamin M Bolker, Mollie E Brooks, Connie J Clark, Shane W Geange, John R Poulsen, M Henry H Stevens, and Jada-Simone S White. Generalized linear mixed models: a practical guide for ecology and evolution. Trends in ecology & evolution, 24(3):127–135, 2009.

[24] Émile Borel. Sur quelques points de la théorie des fonctions. In Annales scientifiques de l’École Normale Supérieure, volume 12, pages 9–55, 1895.

[25] Stuart R Borrett and Matthew K Lau. enar: An r package for ecosystem network analysis. Methods in Ecology and Evolution, 5(11):1206–1213, 2014.

[26] Nicolas Bourbaki. General topology, part 1. Hermann, Paris and Addison-Wesley, 1966.

[27] George EP Box. Robustness in the strategy of scientific model building. Robustness in statistics, 1:201–236, 1979.

[28] ENJ Brookshire and Tad Weaver. Long-term decline in grassland productivity driven by increasing dryness. Nature communications, 6, 2015.

[29] Atul J Butte and Isaac S Kohane. Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. In Pac Symp Biocomput, volume 5, pages 418–429, 2000.

[30] Francis Cailliez. The analytical solution of the additive constant problem. Psychometrika, 48(2):305–308, 1983.

[31] EF Canard, N Mouquet, D Mouillot, M Stanko, D Miklisova, and D Gravel. Empirical evaluation of neutral interactions in host-parasite networks. The American Naturalist, 183(4):468–479, 2014.

[32] Elsa Canard, Nicolas Mouquet, Lucile Marescot, Kevin J Gaston, Dominique Gravel, and David Mouillot. Emergence of structural patterns in neutral trophic networks. PLoS One, 7(8):e38295, 2012.

[33] Yong Cao, D Dudley Williams, and Nancy E Williams. How important are rare species in aquatic community ecology and bioassessment? OCEANOGRAPHY, 43(7), 1998.

[34] Ziyue Chen, Jun Cai, Bingbo Gao, Bing Xu, Shuang Dai, Bin He, and Xiaoming Xie. Detecting the causality influence of individual meteorological factors on local pm2.5 concentration in the jing-jin-ji region. Scientific Reports, 7, 2017.

[35] Adam Clark. multispatialCCM: Multispatial Convergent Cross Mapping, 2014. R package version 1.0.

[36] Adam Thomas Clark, Hao Ye, Forest Isbell, Ethan R Deyle, Jane Cowles, G David Tilman, and George Sugihara. Spatial convergent cross mapping to detect causal relationships from short time series. Ecology, 96(5):1174–1181, 2015.

[37] Connor Clark and Jugal Kalita. A comparison of algorithms for the pairwise alignment of biological networks. Bioinformatics, 30(16):2351–2359, 2014.

[38] Frederic Edward Clements. Research methods in ecology. University Publishing Company, 1905.

[39] Sarah Cobey and Edward B Baskerville. Limits to causal inference with state-space reconstruction for infectious disease. PloS one, 11(12):e0169050, 2016.

[40] JE Cohen. Ecologists co-operative web bank (ecoweb), version 1.0 [machine-readable database]. New York: Rockefeller University, 1989.

[41] Sharon K Collinge and Chris Ray. Transient patterns in the assembly of vernal pool plant communities. Ecology, 90(12):3313–3323, 2009.

[42] Sharon K Collinge, Chris Ray, and Fritz Gerhardt. Long-term dynamics of biotic and abiotic resistance to exotic species invasion in restored vernal pool plant communities. Ecological Applications, 21(6):2105–2118, 2011.

[43] Sharon K Collinge, Chris Ray, and Jaymee T Marty. A long-term comparison of hydrology and plant community composition in constructed versus naturally occurring vernal pools. Restoration Ecology, 21(6):704–712, 2013.

[44] ENCODE Project Consortium et al. An integrated encyclopedia of dna elements in the human genome. Nature, 489(7414):57–74, 2012.

[45] Donatello Conte, Pasquale Foggia, Carlo Sansone, and Mario Vento. Thirty years of graph matching in pattern recognition. International journal of pattern recognition and artificial intelligence, 18(03):265–298, 2004.

[46] Stephen A Cook. The complexity of theorem-proving procedures. In Proceedings of the third annual ACM symposium on Theory of computing, pages 151–158. ACM, 1971.

[47] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. Journal of symbolic computation, 9(3):251–280, 1990.

[48] Katie L Cramer, Aaron O’Dea, Tara R Clark, Jian-xin Zhao, and Richard D Norris. Prehistorical and historical declines in caribbean coral reef accretion rates driven by loss of parrotfish. Nature communications, 8:14160, 2017.

[49] James P Crutchfield and Bruce S McNamara. Equations of motion from a data series. Complex systems, 1(417-452):121, 1987.

[50] Gabor Csardi and Tamas Nepusz. The igraph software package for complex network research. InterJournal, Complex Systems:1695, 2006.

[51] Ray F David, Amir E BozorgMagham, David G Schmale, Shane D Ross, and Linsey C Marr. Identification of meteorological predictors of fusarium graminearum ascospore release using correlation and causality analyses. European journal of plant pathology, 145(2):483–492, 2016.

[52] Alexander Munro Davie and Andrew James Stothers. Improved bound for complexity of matrix multiplication. Proceedings of the Royal Society of Edinburgh: Section A Mathematics, 143(02):351–369, 2013.

[53] Donald L DeAngelis and Simeon Yurek. Equation-free modeling unravels the behavior of complex ecological systems. Proceedings of the National Academy of Sciences, 112(13):3856–3857, 2015.

[54] Donald Lee DeAngelis, WM Post, and George Sugihara. Current Trends in Food Web Theory: Report on a Food Web Workshop, October 25-27, 1982, Fontana Village Inn, North Carolina. Oak Ridge National Laboratory, 1983.

[55] Ethan R Deyle, Michael Fogarty, Chih-hao Hsieh, Les Kaufman, Alec D MacCall, Stephan B Munch, Charles T Perretti, Hao Ye, and George Sugihara. Predicting climate effects on pacific sardine. Proceedings of the National Academy of Sciences, 110(16):6430–6435, 2013.

[56] Ethan R Deyle, M Cyrus Maher, Ryan D Hernandez, Sanjay Basu, and George Sugihara. Global environmental drivers of influenza. Proceedings of the National Academy of Sciences, page 201607747, 2016.

[57] Ethan R Deyle, Robert M May, Stephan B Munch, and George Sugihara. Tracking and forecasting ecosystem interactions in real time. In Proc. R. Soc. B, volume 283, page 20152258. The Royal Society, 2016.

[58] C. F. Dormann, J. Fruend, N. Bluethgen, and B. Gruber. Indices, graphs and null models: analyzing bipartite ecological networks. The Open Ecology Journal, 2:7–24, 2009.

[59] C. F. Dormann, B. Gruber, and J. Fruend. Introducing the bipartite package: Analysing ecological networks. R News, 8(2):8–11, 2008.

[60] Florian Dost. A non-linear causal network of marketing channel system structure. Journal of Retailing and Consumer Services, 23:49–57, 2015.

[61] James A Drake. Community-assembly mechanics and the structure of an experimental species ensemble. American Naturalist, pages 1–26, 1991.

[62] Jordi Duch and Alex Arenas. Community detection in complex networks using extremal optimization. Physical review E, 72(2):027104, 2005.

[63] Jennifer A Dunne, Richard J Williams, and Neo D Martinez. Food-web structure and network theory: the role of connectance and size. Proceedings of the National Academy of Sciences, 99(20):12917–12922, 2002.

[64] Richard Durrett. Essentials of stochastic processes. Springer Science & Business Media, 2012.

[65] Jack Edmonds and Richard M Karp. Theoretical improvements in algorithmic efficiency for network flow problems. Journal of the ACM (JACM), 19(2):248–264, 1972.

[66] Jeno Egerváry. Matrixok kombinatorius tulajdonságairól [hungarian, with german summary]. Matematikai és Fizikai Lapok, 38:16–28, 1931.

[67] Mohammed El-Kebir, Jaap Heringa, and Gunnar W Klau. Lagrangian relaxation applied to sparse global network alignment. In IAPR International Conference on Pattern Recognition in Bioinformatics, pages 225–236. Springer, 2011.

[68] Charles S Elton. Animal ecology. University of Chicago Press, 1927.

[69] Paul Erdős and Alfréd Rényi. On random graphs, i. Publicationes Mathematicae (Debrecen), 6:290–297, 1959.

[70] AL Evans, Navinder J Singh, Andrea Friebe, Jon Martin Arnemo, TG Laske, O Fröbert, Jon E Swenson, and S Blanc. Drivers of hibernation in the brown bear. Frontiers in zoology, 13(1):7, 2016.

[71] Martin J Evans. The cultural mouse. Nature medicine, 7(10):1081–1083, 2001.

[72] José M Facelli, Peter Chesson, and Nicola Barnes. Differences in seed biology of annual plants in arid lands: a key ingredient of the storage effect. Ecology, 86(11):2998–3006, 2005.

[73] Akasha M Faist and Stower C Beals. Invasive plant feedbacks promote alternative states in california vernal pools. Restoration Ecology, 26(2):255–263, 2018.

[74] Daniel P Faith, Peter R Minchin, and Lee Belbin. Compositional dissimilarity as a robust measure of ecological distance. Vegetatio, 69(1-3):57–68, 1987.

[75] Lili Feng, Zhiqing Jia, and Qingxue Li. The dynamic monitoring of aeolian desertification land distribution and its response to climate change in northern china. Scientific reports, 6:39563, 2016.

[76] Robert Fergus, Pietro Perona, and Andrew Zisserman. Object class recognition by unsupervised scale-invariant learning. In Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on, volume 2, pages II–264. IEEE, 2003.

[77] Ronald A Fisher, A Steven Corbet, and Carrington B Williams. The relation between the number of species and the number of individuals in a random sample of an animal population. The Journal of Animal Ecology, pages 42–58, 1943.

[78] Miguel A Fortuna, Raúl Ortega, and Jordi Bascompte. The web of life. arXiv preprint arXiv:1403.2575, 2014.

[79] James H Fowler. Legislative cosponsorship networks in the us house and senate. Social Networks, 28(4):454–465, 2006.

[80] Javier Galeano, Juan M Pastor, and Jose M Iriondo. Weighted-interaction nestedness estimator (wine): a new estimator to calculate over frequency matrices. Environmental Modelling & Software, 24(11):1342–1346, 2009.

[81] R. Gentleman. yeastExpData: Yeast Experimental Data, 2016. R package version 0.20.0.

[82] Fritz Gerhardt and Sharon K Collinge. Exotic plant invasions of vernal pools in the central valley of california, usa. Journal of Biogeography, 30(7):1043–1052, 2003.

[83] Todd A Gibson and Debra S Goldberg. Improving evolutionary models of protein interaction networks. Bioinformatics, 27(3):376–382, 2011.

[84] Corrado Gini. Variabilità e mutabilità. Reprinted in Memorie di metodologica statistica (Ed. Pizetti E, Salvemini, T). Rome: Libreria Eredi Virgilio Veschi, 1, 1912.

[85] Michelle Girvan and Mark EJ Newman. Community structure in social and biological networks. Proceedings of the national academy of sciences, 99(12):7821–7826, 2002.

[86] Henry Allan Gleason. Further views on the succession-concept. Ecology, 8(3):299–326, 1927.

[87] Kwang-Il Goh, Michael E Cusick, David Valle, Barton Childs, Marc Vidal, and Albert-László Barabási. The human disease network. Proceedings of the National Academy of Sciences, 104(21):8685–8690, 2007.

[88] Devin W Goodsman, Jeric S Goodsman, Daniel W McKenney, Victor J Lieffers, and Nadir Erbilgin. Too much of a good thing: landscape-scale facilitation eventually turns into competition between a lepidopteran defoliator and a bark beetle. Landscape ecology, 30(2):301–312, 2015.

[89] John C Gower. Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika, pages 325–338, 1966.

[90] Michael H Graham. Confronting multicollinearity in ecological multiple regression. Ecology, 84(11):2809–2815, 2003.

[91] Clive WJ Granger. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society, pages 424–438, 1969.

[92] Dominique Gravel, François Massol, Elsa Canard, David Mouillot, and Nicolas Mouquet. Trophic theory of island biogeography. Ecology letters, 14(10):1010–1016, 2011.

[93] Bradford Hall, Advait Limaye, and Ashok B Kulkarni. Overview: generation of gene knockout mice. Current protocols in cell biology, pages 19–12, 2009.

[94] Richard W Hamming. Error detecting and error correcting codes. Bell System technical journal, 29(2):147–160, 1950.

[95] Donna Jeanne Haraway. Crystals, Fabrics, and Fields: Metaphors of Organicism in 20th-Century Developmental Biology. Yale University Press, 1976.

[96] W Harford, SR Sagarese, MA Nuttall, M Karnauskas, H Liu, M Lauretta, M Schirripa, and JF Walter. Can climate explain temporal trends in king mackerel (scomberomorus cavalla) catch-per-unit-effort and landings? Technical report, Tech. Rep., SEDAR, SEDAR38-AW-04. SEDAR, North Charleston, SC, 2014.

[97] Michael P Hassell, John H Lawton, and RM May. Patterns of dynamical behaviour in single-species populations. The Journal of Animal Ecology, pages 471–486, 1976.

[98] Alan Hastings, Carole L Hom, Stephen Ellner, Peter Turchin, and H Charles J Godfray. Chaos in ecology: is mother nature a strange attractor? Annual review of ecology and systematics, 24(1):1–33, 1993.

[99] Alan Hastings, Kevin S McCann, and Peter C de Ruiter. Introduction to the special issue: theory of food webs. Theoretical Ecology, 9(1):1–2, 2016.

[100] Alan Hastings and Thomas Powell. Chaos in a three-species food chain. Ecology, 72(3):896–903, 1991.

[101] Karl Havens. Scale and structure in natural food webs. Science (Washington), 257(5073):1107–1109, 1992.

[102] DT Haydon, DA Randall, L Matthews, DL Knobel, LA Tallents, MB Gravenor, SD Williams, JP Pollinger, S Cleaveland, MEJ Woolhouse, et al. Low-coverage vaccination strategies for the conservation of endangered species. Nature, 443(7112):692, 2006.

[103] Eduard Heine. Die elemente der functionenlehre. J. reine angew. Math., 74:172–188, 1871.

[104] Arthur E Hoerl and Robert W Kennard. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1):55–67, 1970.

[105] Douglas R. Hofstadter. Godel, Escher, Bach: An Eternal Golden Braid. Basic Books, Inc., New York, NY, USA, 1979.

[106] Sally J Holbrook, Russell J Schmitt, and John S Stephens. Changes in an assemblage of temperate reef fishes associated with a climate shift. Ecological Applications, 7(4):1299–1310, 1997.

[107] Paul W Holland. Weighted ridge regression: Combining ridge and robust regression methods, 1973.

[108] Crawford S Holling. Resilience and stability of ecological systems. Annual review of ecology and systematics, 4(1):1–23, 1973.

[109] FY Hu, DY Tao, E Sacks, BY Fu, P Xu, J Li, Y Yang, K McNally, GS Khush, AH Paterson, et al. Convergent evolution of perenniality in rice and sorghum. Proceedings of the National Academy of Sciences, 100(7):4050–4054, 2003.

[110] Stephen P Hubbell. The unified neutral theory of biodiversity and biogeography (MPB-32), volume 32. Princeton University Press, 2001.

[111] Thomas C Ings, José M Montoya, Jordi Bascompte, Nico Blüthgen, Lee Brown, Carsten F Dormann, François Edwards, David Figueroa, Ute Jacob, J Iwan Jones, et al. Review: Ecological networks–beyond food webs. Journal of Animal Ecology, 78(1):253–269, 2009.

[112] Jeremy Irvin, Daniel Spokoyny, and Fermín Moscoso del Prado Martín. Dynamical systems modeling of the child–mother dyad: Causality between child-directed language complexity and language development. In Proceedings of the 38th Annual Conference of the Cognitive Science Society, Austin, TX, 2016.

[113] Iaroslav Ispolatov, PL Krapivsky, and A Yuryev. Duplication-divergence model of protein interaction network. Physical review E, 71(6):061911, 2005.

[114] AR Ives, B Dennis, KL Cottingham, and SR Carpenter. Estimating community stability and ecological interactions from time-series data. Ecological monographs, 73(2):301–330, 2003.

[115] Johan Ludwig William Valdemar Jensen. Sur les fonctions convexes et les inégalités entre les valeurs moyennes. Acta mathematica, 30(1):175–193, 1906.

[116] Hawoong Jeong, Sean P Mason, A-L Barabási, and Zoltan N Oltvai. Lethality and centrality in protein networks. Nature, 411(6833):41–42, 2001.

[117] Pieter TJ Johnson, Richard B Hartson, Donald J Larson, and Daniel R Sutherland. Diversity and disease: community structure drives parasite transmission and host fitness. Ecology Letters, 11(10):1017–1026, 2008.

[118] Pieter TJ Johnson, Daniel L Preston, Jason T Hoverman, and Katherine LD Richgels. Biodiversity decreases disease through predictable changes in host community competence. Nature, 494(7436):230, 2013.

[119] Frédéric Jurie and Cordelia Schmid. Scale-invariant shape features for recognition of object categories. In Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, volume 2, pages II–90. IEEE, 2004.

[120] Dan M Kahan, Hank Jenkins-Smith, and Donald Braman. Cultural cognition of scientific consensus. Journal of Risk Research, 14(2):147–174, 2011.

[121] Christina Kahramanoglou, Aswin SN Seshasayee, Ana I Prieto, David Ibberson, Sabine Schmidt, Jurgen Zimmermann, Vladimir Benes, Gillian M Fraser, and Nicholas M Luscombe. Direct and indirect effects of h-ns and fis on global gene expression control in escherichia coli. Nucleic acids research, 39(6):2073–2091, 2011.

[122] Karel Joseph Kansky. Structure of transportation networks: relationships between network geometry and regional characteristics. 1963.

[123] Leo Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1):39–43, 1953.

[124] James J Kay, Lee A Graham, and Robert E Ulanowicz. A detailed guide to network analysis. In Network Analysis in Marine Ecology, pages 15–61. Springer, 1989.

[125] Brian P Kelley, Roded Sharan, Richard M Karp, Taylor Sittler, David E Root, Brent R Stockwell, and Trey Ideker. Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Proceedings of the National Academy of Sciences, 100(20):11394–11399, 2003.

[126] A Marm Kilpatrick, Peter Daszak, Matthew J Jones, Peter P Marra, and Laura D Kramer. Host heterogeneity dominates west nile virus transmission. Proceedings of the Royal Society of London B: Biological Sciences, 273(1599):2327–2333, 2006.

[127] Wan Kyu Kim and Edward M Marcotte. Age-dependent evolution of the yeast protein interaction network suggests a limited role of gene duplication and divergence. PLoS Comput Biol, 4(11):e1000232, 2008.

[128] Hiroaki Kitano. Computational systems biology. Nature, 420(6912):206–210, 2002.

[129] Dénes König. Über graphen und ihre anwendung auf determinantentheorie und mengenlehre. Mathematische Annalen, 77(4):453–465, 1916.

[130] Gueorgi Kossinets and Duncan J Watts. Empirical analysis of an evolving social network. science, 311(5757):88–90, 2006.

[131] Boris R Krasnov, David Mouillot, Georgy I Shenbrot, Irina S Khokhlova, and Robert Poulin. Abundance patterns and coexistence processes in communities of fleas parasitic on small mammals. Ecography, 28(4):453–464, 2005.

[132] Oleksii Kuchaiev, Tijana Milenković, Vesna Memišević, Wayne Hayes, and Nataša Pržulj. Topological network alignment uncovers biological function and phylogeny. Journal of the Royal Society Interface, 7(50):1341–1354, 2010.

[133] Oleksii Kuchaiev and Nataša Pržulj. Integrative network alignment reveals large regions of global network similarity in yeast and human. Bioinformatics, 27(10):1390–1396, 2011.

[134] Harold W Kuhn. The hungarian method for the assignment problem. Naval research logistics quarterly, 2(1-2):83–97, 1955.

[135] Harold W Kuhn. A tale of three eras: The discovery and rediscovery of the hungarian method. European Journal of Operational Research, 219(3):641–651, 2012.

[136] Solomon Kullback. Information theory and statistics. Courier Corporation, 1997.

[137] Georges Kunstler, Sébastien Lavergne, Benoît Courbaud, Wilfried Thuiller, Ghislain Vieilledent, Niklaus E Zimmermann, Jens Kattge, and David A Coomes. Competitive interactions between forest trees are driven by species’ trait hierarchy, not phylogenetic or functional similarity: implications for forest community assembly. Ecology letters, 15(8):831–840, 2012.

[138] Renaud Lambiotte, J-C Delvenne, and Mauricio Barahona. Laplacian dynamics and multiscale modular structure in networks. arXiv preprint arXiv:0812.1770, 2008.

[139] Ryan Langendorf. netcom: Dynamic Network Alignment, 2017. R package version 1.0.4.

[140] Ryan E Langendorf, Sharon K Collinge, and Dan F Doak. Interspecific causation of the endangered vernal pool plant Lasthenia conjugens. PLOS Computational Biology, In Preparation.

[141] Ryan E Langendorf and Daniel F Doak. Can community structure causally determine dynamics of constituent species? a test using a host-parasite community. The American Naturalist, 2018, In revision.

[142] Ryan E Langendorf and Debra S Goldberg. Aligning statistical dynamics captures biological network functioning. Network Science, Under Review.

[143] Mark S Laska and J Timothy Wootton. Theoretical concepts and empirical approaches to measuring interaction strength. Ecology, 79(2):461–476, 1998.

[144] Public Law. Law 93-205 (dec. 28, 1973). Endangered species act of, 87, 1973.

[145] François Le Gall. Powers of tensors and fast matrix multiplication. In Proceedings of the 39th international symposium on symbolic and algebraic computation, pages 296–303. ACM, 2014.

[146] Richard Levins and Richard Lewontin. Dialectics and reductionism in ecology. Synthese, 43(1):47–78, 1980.

[147] Richard C Lewontin. The meaning of stability. In Brookhaven symposia in biology, volume 22, page 13, 1969.

[148] X. S. Liang. Unraveling the cause-effect relation between time series. Physical Review E, 90(5):052150, 2014.

[149] X. S. Liang. Normalizing the causality between time series. Physical Review E, 92(2):022126, 2015.

[150] X. S. Liang. Information flow and causality as rigorous notions ab initio. Physical Review E, 94(5):052201, 2016.

[151] Edward N Lorenz. Deterministic nonperiodic flow. Journal of the atmospheric sciences, 20(2):130–141, 1963.

[152] Alfred J Lotka. Elements of physical biology, william and wilkins, baltimore, 1925. reissued as elements of mathematical biology, 1956.

[153] James E Lovelock and Lynn Margulis. Atmospheric homeostasis by and for the biosphere: the gaia hypothesis. Tellus, 26(1-2):2–10, 1974.

[154] Jonathan G Lundgren and Janet K Fergen. Predator community structure and trophic linkage strength to a focal prey. Molecular ecology, 23(15):3790–3798, 2014.

[155] Chuan Luo, Xiaolong Zheng, and Daniel Zeng. Causal inference in social media using convergent cross mapping. In Intelligence and Security Informatics Conference (JISIC), 2014 IEEE Joint, pages 260–263. IEEE, 2014.

[156] Haisu Ma. COSINE: COndition SpecIfic sub-NEtwork, 2014. R package version 2.1.

[157] Huanfei Ma, Kazuyuki Aihara, and Luonan Chen. Detecting causality from nonlinear dynamics with short-term time series. Scientific reports, 4, 2014.

[158] Robert MacArthur. Fluctuations of animal populations and a measure of community stability. ecology, 36(3):533–536, 1955.

[159] Kazuaki Matsui, Shigeki Kono, Asuka Saeki, Nobuyoshi Ishii, Man-Gi Min, and Zen’ichiro Kawabata. Direct and indirect interactions for coexistence in a species-defined microcosm. Hydrobiologia, 435(1):109–116, 2000.

[160] Robert M May. Thresholds and breakpoints in ecosystems with a multiplicity of stable states. Nature, 269(5628):471–477, 1977.

[161] Robert McCredie May. Stability and complexity in model ecosystems, volume 6. Princeton University Press, 1973.

[162] James M McCracken and Robert S Weigel. Convergent cross-mapping and pairwise asymmetric inference. Physical Review E, 90(6):062903, 2014.

[163] Scott McGinnis and Thomas L Madden. Blast: at the core of a powerful and diverse set of sequence analysis tools. Nucleic acids research, 32(suppl 2):W20–W25, 2004.

[164] John A McGowan, Ethan R Deyle, Hao Ye, Melissa L Carter, Charles T Perretti, Kerri D Seger, Alain Verneil, and George Sugihara. Predicting coastal algal blooms in southern california. Ecology, 98(5):1419–1433, 2017.

[165] Tijana Milenkovic and Natasa Przulj. Uncovering biological network function via graphlet degree signatures. arXiv preprint arXiv:0802.0556, 2008.

[166] Dennis J Minchella and Marilyn E Scott. Parasitism: a cryptic determinant of animal community structure. Trends in Ecology & Evolution, 6(8):250–254, 1991.

[167] Peter R Minchin. An evaluation of the relative robustness of techniques for ecological ordination. In Theory and models in vegetation science, pages 89–107. Springer, 1987.

[168] Dan Mønster, Riccardo Fusaroli, Kristian Tylén, Andreas Roepstorff, and Jacob F Sherson. Causal inference from noisy time-series data: testing the convergent cross-mapping algorithm in the presence of noise and external influence. Future Generation Computer Systems, 73:52–62, 2017.

[169] Peter J Mucha, Thomas Richardson, Kevin Macon, Mason A Porter, and Jukka-Pekka Onnela. Community structure in time-dependent, multiscale, and multiplex networks. science, 328(5980):876–878, 2010.

[170] CB Muller, ICT Adriaanse, R Belshaw, and HCJ Godfray. The structure of an aphid–parasitoid community. Journal of Animal Ecology, 68(2):346–370, 1999.

[171] James Munkres. Algorithms for the assignment and transportation problems. Journal of the society for industrial and applied mathematics, 5(1):32–38, 1957.

[172] Elena Nabieva, Kam Jim, Amit Agarwal, Bernard Chazelle, and Mona Singh. Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps. Bioinformatics, 21(suppl 1):i302–i310, 2005.

[173] László Négyessy, Tamás Nepusz, László Kocsis, and Fülöp Bazsó. Prediction of the main cortical areas and connections involved in the tactile function of the visual cortex by network analysis. European Journal of Neuroscience, 23(7):1919–1930, 2006.

[174] Mark Newman. Analysis of weighted networks. Physical review E, 70(5):056131, 2004.

[175] Mark Newman. Networks: an introduction. Oxford university press, 2010.

[176] Mark EJ Newman. Clustering and preferential attachment in growing networks. Physical review E, 64(2):025102, 2001.

[177] Mark Novak and J Timothy Wootton. Using experimental indices to quantify the strength of species interactions. Oikos, 119(7):1057–1063, 2010.

[178] Mark Novak, J Timothy Wootton, Daniel F Doak, Mark Emmerson, James A Estes, and M Timothy Tinker. Predicting community responses to perturbations in the face of imperfect knowledge and network complexity. Ecology, 92(4):836–846, 2011.

[179] Mark Novak, Justin D Yeakel, Andrew E Noble, Daniel F Doak, Mark Emmerson, James A Estes, Ute Jacob, M Timothy Tinker, and J Timothy Wootton. Characterizing species interactions to understand press perturbations: what is the community matrix? Annual Review of Ecology, Evolution, and Systematics, 47:409–432, 2016.

[180] Richard S Ostfeld and Felicia Keesing. Biodiversity series: the function of biodiversity in the ecology of vector-borne zoonotic diseases. Canadian Journal of Zoology, 78(12):2061–2078, 2000.

[181] Otso Ovaskainen, Jenni Hottola, and Juha Siitonen. Modeling species co-occurrence by multivariate logistic regression generates new hypotheses on fungal interactions. Ecology, 91(9):2514–2521, 2010.

[182] Norman H Packard, James P Crutchfield, J Doyne Farmer, and Robert S Shaw. Geometry from a time series. Physical review letters, 45(9):712, 1980.

[183] Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The pagerank citation ranking: bringing order to the web. 1999.

[184] Robert T Paine. Food web complexity and species diversity. American Naturalist, pages 65–75, 1966.

[185] Robert T Paine. A note on trophic complexity and community stability. The American Naturalist, 103(929):91–93, 1969.

[186] Robert T Paine. Ecological determinism in the competition for space: the robert h. macarthur award lecture. Ecology, 65(5):1339–1348, 1984.

[187] Robert T Paine. Road maps of interactions or grist for theoretical development? Ecology, 69(6):1648–1654, 1988.

[188] RT Paine. Food-web analysis through field measurement of per capita interaction strength. Nature, 355(6355):73, 1992.

[189] Romualdo Pastor-Satorras, Eric Smith, and Ricard V Solé. Evolving protein interaction networks through gene duplication. Journal of Theoretical biology, 222(2):199–210, 2003.

[190] Rob Patro and Carl Kingsford. Global network alignment using multiscale spectral signatures. Bioinformatics, 28(23):3105–3114, 2012.

[191] Sara H Paull, Sejin Song, Katherine M McClure, Loren C Sackett, A Marm Kilpatrick, and Pieter TJ Johnson. From superspreaders to disease hotspots: linking transmission across hosts and space. Frontiers in Ecology and the Environment, 10(2):75–82, 2012.

[192] Judea Pearl. Graphs, causality, and structural equation models. Sociological Methods & Research, 27(2):226–284, 1998.

[193] Karl Pearson. Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58:240–242, 1895.

[194] Evelyn C Pielou. The measurement of diversity in different types of biological collections. Journal of theoretical biology, 13:131–144, 1966.

[195] Timothée Poisot, Elsa Canard, David Mouillot, Nicolas Mouquet, and Dominique Gravel. The dissimilarity of species interaction networks. Ecology letters, 15(12):1353–1361, 2012.

[196] David M Post, M Elizabeth Conners, and Debra S Goldberg. Prey preference by a top predator and the stability of linked food chains. Ecology, 81(1):8–14, 2000.

[197] Steven D Prager and William A Reiners. Historical and emerging practices in ecological topology. ecological complexity, 6(2):160–171, 2009.

[198] Stephen R Proulx, Daniel EL Promislow, and Patrick C Phillips. Network thinking in ecology and evolution. Trends in Ecology & Evolution, 20(6):345–353, 2005.

[199] Jonathan N Pruitt and Maud CO Ferrari. Intraspecific trait variants determine the nature of interspecific interactions in a habitat-forming species. Ecology, 92(10):1902–1908, 2011.

[200] H Ronald Pulliam. Sources, sinks, and population regulation. American naturalist, pages 652–661, 1988.

[201] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2017.

[202] Enrico L Rezende, Pedro Jordano, and Jordi Bascompte. Effects of phenotypic complementarity and phylogeny on the nested structure of mutualistic networks. Oikos, 116(11):1919–1929, 2007.

[203] R. A. Rigby and D. M. Stasinopoulos. Generalized additive models for location, scale and shape (with discussion). Applied Statistics, 54:507–554, 2005.

[204] WE Ritter. The unity of the organism, 2 vols, 1919.

[205] Miguel A Rodríguez-Gironés and Luis Santamaría. A new algorithm to calculate the nestedness temperature of presence–absence matrices. Journal of Biogeography, 33(5):924–935, 2006.

[206] Ryan A. Rossi and Nesreen K. Ahmed. The network data repository with interactive graph analytics and visualization. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.

[207] Ryan A Rossi and Nesreen K Ahmed. An interactive data repository with visual analytics. ACM SIGKDD Explorations Newsletter, 17(2):37–41, 2016.

[208] Sayed Mohammad Ebrahim Sahraeian and Byung-Jun Yoon. A network synthesis model for generating protein interaction network families. PloS one, 7(8):e41474, 2012.

[209] Erina Sakamoto and Hitoshi Iba. Inferring a system of differential equations for a gene regulatory network by using genetic programming. In Evolutionary Computation, 2001. Proceedings of the 2001 Congress on, volume 1, pages 720–726. IEEE, 2001.

[210] Marcel Salathé and James H Jones. Dynamics and control of diseases in networks with community structure. PLoS computational biology, 6(4):e1000736, 2010.

[211] Marten Scheffer, Stephen R Carpenter, Timothy M Lenton, Jordi Bascompte, William Brock, Vasilis Dakos, Johan Van De Koppel, Ingrid A Van De Leemput, Simon A Levin, Egbert H Van Nes, et al. Anticipating critical transitions. science, 338(6105):344–348, 2012.

[212] Marten Scheffer, Steve Carpenter, Jonathan A Foley, Carl Folke, and Brian Walker. Catastrophic shifts in ecosystems. Nature, 413(6856):591, 2001.

[213] Karin Schiecke, Britta Pester, Diana Piper, Franz Benninger, Martha Feucht, Lutz Leistritz, and Herbert Witte. Nonlinear directed interactions between hrv and eeg activity in children with tle. IEEE Transactions on Biomedical Engineering, 63(12):2497–2504, 2016.

[214] Thomas Schreiber. Measuring information transfer. Physical review letters, 85(2):461, 2000.

[215] Martin Schuster and E Peter Greenberg. A network of networks: quorum-sensing gene regulation in pseudomonas aeruginosa. International journal of medical microbiology, 296(2):73–81, 2006.

[216] Esther Sebastián-González, José Antonio Sánchez-Zapata, Francisco Botella, and Otso Ovaskainen. Testing the heterospecific attraction hypothesis with time-series data on species co-occurrence. Proceedings of the Royal Society of London B: Biological Sciences, 277(1696):2983–2990, 2010.

[217] Claude Elwood Shannon. A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review, 5(1):3–55, 2001.

[218] Roded Sharan and Trey Ideker. Modeling cellular machinery through biological network comparison. Nature biotechnology, 24(4):427–433, 2006.

[219] Roded Sharan, Igor Ulitsky, and Ron Shamir. Network-based prediction of protein function. Molecular systems biology, 3(1):88, 2007.

[220] Daniel S Simberloff and Edward O Wilson. Experimental zoogeography of islands: the colonization of empty islands. Ecology, 50(2):278–296, 1969.

[221] Edward H Simpson. Measurement of diversity. Nature, 1949.

[222] Dmitry A Smirnov. Spurious causalities with transfer entropy. Physical Review E, 87(4):042917, 2013.

[223] Michael E Soulé, James A Estes, Joel Berger, and Carlos Martinez Del Rio. Ecological effectiveness: conservation goals for interactive species. Conservation Biology, 17(5):1238–1250, 2003.

[224] Matthew Spencer and Edward Susko. Continuous-time markov models for species interactions. Ecology, 86(12):3272–3278, 2005.

[225] Brian J Spiesman and Brian D Inouye. The consequences of multiple indirect pathways of interaction for species coexistence. Theoretical Ecology, 8(2):225–232, 2015.

[226] M Stanko and D Miklisova. Data from: Empirical evaluation of neutral interactions in host-parasite networks, 2014.

[227] Michal Stanko, Dana Miklisová, Joëlle Goüy de Bellocq, and Serge Morand. Mammal density and patterns of ectoparasite species richness and abundance. Oecologia, 131(2):289–295, 2002.

[228] CA Stepien, CD Taylor, and KA Dabrowska. Genetic variability and phylogeographical patterns of a nonindigenous species invasion: a comparison of exotic vs. native zebra and quagga mussel populations. Journal of Evolutionary Biology, 15(2):314–328, 2002.

[229] Adolf Stips, Diego Macias, Clare Coughlan, Elisa Garcia-Gorriz, and X. S. Liang. On the causal structure between co2 and global temperature. Scientific reports, 6, 2016.

[230] Donald R Strong. Special feature: Food web theory: A ladder for picking strawberries. Ecology, 69(6):1647–1647, 1988.

[231] George Sugihara. Nonlinear forecasting for the classification of natural time series. Phil. Trans. R. Soc. Lond. A, 348(1688):477–495, 1994.

[232] George Sugihara, Robert May, Hao Ye, Chih-hao Hsieh, Ethan Deyle, Michael Fogarty, and Stephan Munch. Detecting causality in complex ecosystems. science, 338(6106):496–500, 2012.

[233] John P Sutherland. Multiple stable points in natural communities. The American Naturalist, 108(964):859–873, 1974.

[234] Floris Takens. Detecting strange attractors in turbulence. In Dynamical systems and turbulence, Warwick 1980, pages 366–381. Springer, 1981.

[235] Arthur G Tansley. The use and abuse of vegetational concepts and terms. Ecology, 16(3):284–307, 1935.

[236] Elisa Thébault and Colin Fontaine. Stability of ecological communities and the architecture of mutualistic and trophic networks. Science, 329(5993):853–856, 2010.

[237] FPJF Thomas, R Poulin, Jean-François Guégan, Y Michalakis, and F Renaud. Are there pros as well as cons to being parasitized? Parasitology today, 16(12):533–536, 2000.

[238] Anastasios A Tsonis, Ethan R Deyle, Robert M May, George Sugihara, Kyle Swanson, Joshua D Verbeten, and Geli Wang. Dynamical evidence for causality between galactic cosmic rays and interannual variation in global temperature. Proceedings of the National Academy of Sciences, 112(11):3253–3256, 2015.

[239] Peter Turchin and Andrew D Taylor. Complex dynamics in ecological time series. Ecology, 73(1):289–305, 1992.

[240] Jessica B Turner and James B McGraw. Can putative indicator species predict habitat quality for american ginseng? Ecological Indicators, 57:110–117, 2015.

[241] Jason M Tylianakis, Etienne Laliberté, Anders Nielsen, and Jordi Bascompte. Conservation of species interaction networks. Biological conservation, 143(10):2270–2279, 2010.

[242] Jason M Tylianakis, Teja Tscharntke, and Owen T Lewis. Habitat modification alters the structure of tropical host–parasitoid food webs. Nature, 445(7124):202–205, 2007.

[243] Alfonso Valiente-Banuet and Miguel Verdú. Temporal shifts from facilitation to competition occur between closely related taxa. Journal of Ecology, 96(3):489–494, 2008.

[244] Egbert H Van Nes, Marten Scheffer, Victor Brovkin, Timothy M Lenton, Hao Ye, Ethan Deyle, and George Sugihara. Causal feedbacks in climate change. Nature Climate Change, 5(5):445–448, 2015.

[245] Alexei Vázquez, Alessandro Flammini, Amos Maritan, and Alessandro Vespignani. Modeling of protein interaction networks. Complexus, 1(1):38–44, 2002.

[246] Vito Volterra. Variazioni e fluttuazioni del numero d’individui in specie animali conviventi. C. Ferrari, 1927.

[247] Vito Volterra. Variations and fluctuations of the number of individuals in animal species living together. ICES Journal of Marine Science, 3(1):3–51, 1928.

[248] Caroline S Wagner and Loet Leydesdorff. Network structure, self-organization, and the growth of international collaboration in science. Research policy, 34(10):1608–1618, 2005.

[249] Peter D Walsh, Roman Biek, and Leslie A Real. Wave-like spread of ebola zaire. PLoS biology, 3(11):e371, 2005.

[250] Robert R Warner and Peter L Chesson. Coexistence mediated by recruitment fluctuations: a field guide to the storage effect. The American Naturalist, 125(6):769–787, 1985.

[251] Duncan J Watts and Steven H Strogatz. Collective dynamics of ‘small-world’ networks. Nature, 393(6684):440–442, 1998.

[252] John G White, Eileen Southgate, J Nichol Thomson, and Sydney Brenner. The structure of the nervous system of the nematode caenorhabditis elegans. Philos Trans R Soc Lond B Biol Sci, 314(1165):1–340, 1986.

[253] Thorsten Wiegand, Florian Jeltsch, Ilkka Hanski, and Volker Grimm. Using pattern-oriented modeling for revealing hidden information: a key for reconciling ecological theory and application. Oikos, 100(2):209–222, 2003.

[254] Richard J Williams and Neo D Martinez. Simple rules yield complex food webs. Nature, 404(6774):180–183, 2000.

[255] Axel Wismüller, Anas Z Abidin, Adora M D’Souza, and Mahesh B Nagarajan. Mutual connectivity analysis (mca) for nonlinear functional connectivity network recovery in the human brain using convergent cross-mapping and non-metric clustering. In Advances in Self-Organizing Maps and Learning Vector Quantization, pages 217–226. Springer, 2016.

[256] Axel Wismüller, Xixi Wang, Adora M D’Souza, and Mahesh B Nagarajan. A framework for exploring non-linear functional connectivity and causality in the human brain: Mutual connectivity analysis (mca) of resting-state functional mri with convergent cross-mapping and non-metric clustering. arXiv preprint arXiv:1407.3809, 2014.

[257] J Timothy Wootton. Predicting direct and indirect effects: an integrated approach using experiments and path analysis. Ecology, 75(1):151–165, 1994.

[258] JT Wootton, ME Power, RT Paine, and CA Pfister. Effects of productivity, consumers, competitors, and el nino events on food chain patterns in a rocky intertidal community. Proceedings of the National Academy of Sciences, 93(24):13855–13858, 1996.

[259] Hao Ye, Richard J Beamish, Sarah M Glaser, Sue CH Grant, Chih-hao Hsieh, Laura J Richards, Jon T Schnute, and George Sugihara. Equation-free mechanistic ecosystem forecasting using empirical dynamic modeling. Proceedings of the National Academy of Sciences, 112(13):E1569–E1576, 2015.

[260] Hao Ye, Adam Clark, Ethan Deyle, Oliver Keyes, and George Sugihara. rEDM: Applications of Empirical Dynamic Modeling from Time Series, 2016. R package version 0.5.4.

[261] Hao Ye, Ethan R Deyle, Luis J Gilarranz, and George Sugihara. Distinguishing time-delayed causal interactions using convergent cross mapping. Scientific reports, 5, 2015.

[262] Wayne W Zachary. An information flow model for conflict and fission in small groups. Journal of anthropological research, pages 452–473, 1977.

Appendix A

Chapter 1 Appendix

A.1 Supplemental Figures

Table A.1: Model specifications for synthetic networks.

Table A.2: Induced Conserved Structure (ICS) scores for the alignments produced by the method developed here when applied to the same 30 pairs of networks recently compared by Clark and Kalita [37]. The three network families were Duplication with Random Mutation (DMR), Duplication-Mutation-Complementation (DMC), and Crystal Growth (CG). Each pair of networks comprised one 3000 node network and one 4000 node network.

Figure A.1: Shepard plots for the non-metric multidimensional scaling of network dynamics in Fig. 2.5. (A) Stress = 0.07. (B) Stress = 0.0002.

Figure A.2: Shepard plot for the non-metric multidimensional scaling of pairwise alignments between the 307 empirical networks and 120 reference networks in Fig. 2.6. Stress = 0.11.

Figure A.3: Relationship between the difference in network size (nodes) and the quality of their alignment, indicating the alignments were size-invariant.

Figure A.4: Shepard plot for the non-metric multidimensional scaling of pairwise alignments between the 100 empirical ecological networks in Fig. 2.8. Stress = 0.16.

ID | Classification | Source | Details

1 - 30 | Erdos-Renyi | R igraph package [50] | 10 each with 10, 100, and 1000 nodes from the Static Version in Table S1

31 - 60 | Preferential Attachment | R igraph package [50] | 10 each with 10, 100, and 1000 nodes from the Static Version in Table S1

61 - 90 | Duplication and Divergence | [83, 113, 245] | 10 each with 10, 100, and 1000 nodes from the Static Version in Table S1

91 - 120 | Small World | R igraph package [50] | 10 each with 10, 100, and 1000 nodes from the Static Version in Table S1

121 - 218 | Gene Regulatory Networks | ENCODE Consortium [44] and Dr. Yijun Ruan | 100 randomly selected connected components from the following Accession Numbers: ENCFF001THV, ENCFF001THX, ENCFF001THY, ENCFF001THZ, ENCFF001TIA, ENCFF001TIB, ENCFF001TID, ENCFF001TIE, ENCFF001TIF, ENCFF001TIJ, ENCFF001THU, ENCFF001THT, ENCFF001TIG, ENCFF001THW, ENCFF001TIC

219 - 276 | Trophic Ecosystems | R enaR package [25] |

277 - 318 | Biogeochemical Ecosystems | R enaR package [25] |

319 | Macaque Brain | R igraphdata package [173] | Visuotactile regions

320 | C. elegans Metabolism | Network Repository [62, 206] |

321 | Human Diseasome | Network Repository [87, 206] |

322 | Yeast Protein-Protein Interactions | Network Repository [116, 206] |

323 | Protein-Protein Interactions | R COSINE package [156] |

324 - 327 | Protein-Gene Interactions | R yeastExpData package [81] |

328 - 427 | Enzymatic Pathways | Network Repository Cheminformatics [206] | g2, g12, g30, g35, g39, g54, g57, g59, g62, g64, g70, g72, g97, g101, g115, g117, g121, g122, g129, g132, g137, g141, g142, g144, g149, g158, g162, g167, g170, g171, g175, g189, g192, g201, g203, g204, g206, g208, g217, g230, g245, g246, g247, g259, g263, g264, g275, g298, g299, g302, g304, g310, g314, g321, g332, g337, g345, g354, g362, g363, g365, g375, g377, g378, g387, g392, g397, g401, g410, g412, g415, g421, g423, g438, g439, g444, g458, g459, g464, g474, g481, g509, g516, g520, g532, g533, g535, g537, g543, g544, g558, g560, g562, g563, g574, g578, g588, g592, g596, g598

Table A.3: Sources for all 427 networks in Fig. 2.6. All networks were freely available as of the publication date.

Appendix B

Chapter 2 Appendix

B.1 Supplemental Methods

Here we provide a recipe for testing the assumptions of Convergent Cross Mapping (CCM) and applying it to time series of populations as well as community properties. All but the tests of CCM’s applicability to community properties were adapted from the vignette for the R package rEDM ([260]). In our work they were run using functions in the multispatialCCM R package ([35]), which applies CCM to spatially replicated time series, making causal inference possible for shorter time series ([36]).

B.1.1 Determining an Embedding Dimension

As described in the Methods section of the main text, CCM infers bidirectional causation between two time series by looking for an improvement in the ability of each time series’ shadow manifold to predict the other as the number of observations increases. The first step in applying CCM is therefore to construct each time series’ shadow manifold. These are higher-dimensional scatter plots (e.g. Figure B.6) where the first axis is the data at time t, the second axis is a lagged version of the data at time t − τ, the third axis is another lagged version of the data at time t − 2τ, and so on. The number of these lagged dimensions is called the embedding dimension and is typically denoted with the letter E. This dimension is the number of independent interactors underlying each time series’ dynamics and reflects the complexity of the full system. It is intuitive to conceive of each dimension as a species (or systemic property), but often the dimensions are combinations of covarying variables. We identified E as the dimension that maximized the correlation coefficient (here Pearson’s formulation) between predicted and actual system states using leave-one-out cross-validation. We used the function SSR_pred_boot ([35]) and varied the value of E to find this maximum, which is stored as rho in the output. Note that SSR_pred_boot requires values not only for E but also for predstep and τ: predstep determines how many time steps forward each cross-validation attempts to predict, and τ sets the lag size. We used a value of one for both because, as discussed in the main text, the dynamics of the Slovakian rodent-ectoparasite community appeared to occur faster than the data were collected. These parameters can only take integer values, so predstep = τ = 1 made our predictions as close as possible to the community’s actual dynamics.

Identifying an embedding dimension this way assumes that predictive power is unimodal with respect to embedding dimension and that the peak of this relationship is the number of dimensions within which the system’s dynamics are best represented. When there were multiple similar peaks we tested for causality using each separately, though, in line with [232], we saw no difference and therefore present only the best. If embedding dimension has no effect on cross-validation, the reconstructed manifold is too incomplete or flawed to perform inference on. These time series were excluded. See Figure B.1 for examples using the three community properties that met these assumptions.
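The procedure above can be illustrated with a minimal Python sketch. It is not the implementation of SSR_pred_boot; it assumes a simplified nearest-neighbour forecast (the E + 1 closest points on the shadow manifold, exponentially weighted by distance) and a toy logistic-map series in place of the field data. All function names (logistic_map, delay_embed, loo_forecast_skill, best_embedding_dimension) are hypothetical and introduced only for illustration.

```python
import math

def logistic_map(n, r=3.8, x0=0.4):
    """Toy chaotic series used only as stand-in data (an assumption)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

def delay_embed(series, E, tau=1):
    """Shadow manifold: (t, (x[t], x[t - tau], ..., x[t - (E - 1) * tau]))."""
    start = (E - 1) * tau
    return [(t, tuple(series[t - k * tau] for k in range(E)))
            for t in range(start, len(series))]

def pearson(a, b):
    """Pearson correlation; returns 0.0 for degenerate input."""
    n = len(a)
    if n < 2:
        return 0.0
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def loo_forecast_skill(series, E, tau=1, predstep=1):
    """Leave-one-out correlation between predicted and observed states."""
    points = [(t, v) for t, v in delay_embed(series, E, tau)
              if t + predstep < len(series)]
    predicted, actual = [], []
    for t, v in points:
        # Leave the focal point out, then keep its E + 1 nearest neighbours.
        nearest = sorted((math.dist(v, w), s) for s, w in points if s != t)[:E + 1]
        if len(nearest) < E + 1:
            continue
        d_min = nearest[0][0]
        if d_min > 0:
            weights = [math.exp(-d / d_min) for d, _ in nearest]
        else:
            weights = [1.0 if d == 0 else 0.0 for d, _ in nearest]
        forecast = sum(w * series[s + predstep]
                       for w, (_, s) in zip(weights, nearest)) / sum(weights)
        predicted.append(forecast)
        actual.append(series[t + predstep])
    return pearson(predicted, actual)

def best_embedding_dimension(series, max_E=5, tau=1, predstep=1):
    """Return the E with the highest forecast skill, plus all skills."""
    skill = {E: loo_forecast_skill(series, E, tau, predstep)
             for E in range(1, max_E + 1)}
    return max(skill, key=skill.get), skill

best_E, skill_by_E = best_embedding_dimension(logistic_map(200))
```

Plotting skill_by_E against E gives a curve analogous to those in Figure B.1; a flat curve would correspond to the excluded time series.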

B.1.2 Testing for Nonlinearity

The primary assumption of CCM is that the system in question is deterministic with coupled variables resulting in chaotic nonlinear dynamics. As [232] discuss, causality in stochastic linear systems, which are separable, can be tested using Granger causality ([91]), or perhaps by measures of information transfer between time series ([148, 214]). Testing this assumption involves looking for evidence of chaotic nonlinearity in each time series’ shadow manifold, which is why an embedding dimension must first be found for each time series. This test uses the same function SSR_pred_boot, only here E is the identified embedding dimension and predstep is varied (τ remains constant). The correlation between predicted and actual system states declines rapidly in chaotic nonlinear systems because nearby trajectories diverge. We excluded all species and community properties that did not show this decline. Additionally, we considered those with declining predictive ability which did not level off near zero to be drifting stochastically and excluded them as well. [260] suggest using predstep values from one through ten, though we tested as many as the function would allow. This number will depend on the length of each time series, the value of τ, and the embedding dimension.
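The decline in predictive skill with forecast horizon can be sketched as follows. This Python example is a simplified stand-in, not the SSR_pred_boot routine: it assumes a single-nearest-neighbour analogue forecast, a fixed embedding dimension, and a toy logistic-map series rather than the community data. The names logistic_map, nearest_neighbour_skill, and skill_by_horizon are hypothetical.

```python
import math

def logistic_map(n, r=3.8, x0=0.4):
    """Toy chaotic series used only as stand-in data (an assumption)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

def pearson(a, b):
    """Pearson correlation; returns 0.0 for degenerate input."""
    n = len(a)
    if n < 2:
        return 0.0
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def nearest_neighbour_skill(series, E, predstep, tau=1):
    """Forecast each point from its closest neighbour on the shadow manifold."""
    start = (E - 1) * tau
    last = len(series) - predstep  # exclusive bound so t + predstep is valid
    vectors = {t: tuple(series[t - k * tau] for k in range(E))
               for t in range(start, last)}
    predicted, actual = [], []
    for t, v in vectors.items():
        # Leave the focal point out and follow its nearest neighbour forward.
        nearest = min((s for s in vectors if s != t),
                      key=lambda s: math.dist(v, vectors[s]))
        predicted.append(series[nearest + predstep])
        actual.append(series[t + predstep])
    return pearson(predicted, actual)

def skill_by_horizon(series, E, horizons=range(1, 11), tau=1):
    """Forecast skill for each prediction horizon (predstep)."""
    return {h: nearest_neighbour_skill(series, E, h, tau) for h in horizons}

decay = skill_by_horizon(logistic_map(300), E=2)
```

For a chaotic series the values in decay should fall as the horizon grows; a series whose skill stays high, or declines without approaching zero, would be excluded under the criteria described above.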

The rEDM vignette ([260]) includes an additional test to distinguish deterministic systems with couplings and nonlinear dynamics from purely stochastic systems with autocorrelated red noise. Unfortunately, this test involves the s_map function in the R package rEDM ([260]), which does not currently have a counterpart for spatially replicated data in the multispatialCCM R package ([35]). [36] point out that CCM should correctly infer no causation regardless, so we did not attempt to test for autocorrelated red noise, but this would be a useful addition to the multispatialCCM R package.

It is worth considering how representative the 3 host and 7 parasite species that met the assumptions of CCM were of the full 26 parasite and 61 host species community. They were among the more abundant, ranked 3, 4, 6 and 3, 4, 5, 7, 10, 11, 16 respectively. The rates of parasitism were also higher than average for the hosts, ranked 3, 5, 6, but more representative among the parasites, ranked 1, 3, 6, 8, 10, 31, 42. These biases likely reflect CCM’s need for data as much as they do any sparsity of chaotic nonlinear dynamics in nature, but this hypothesis is not testable.

B.1.3 Applying CCM to Community Properties

We further tested the application of CCM to community properties by considering the assumptions of the delay embedding theorem underpinning CCM ([234]), which may not apply to properties of communities. Takens assumed a smooth observation function y : M → R and a smooth vector field X experiencing a diffeomorphic flow ϕt : M → M on a compact manifold M, which here are the state space reconstructions of the community’s dynamics: the shadow manifold of each species. Takens also specified that smooth means twice differentiable. So for CCM to be applicable to community properties, their state space reconstructions must be compact manifolds with twice differentiable dynamics.

Embedded manifolds in the EDM framework exist in Euclidean space so by the Heine-Borel theorem these manifolds are compact if they are both closed and bounded ([24, 103]). The three community properties that met the other assumptions of CCM range between zero and one and so are clearly bounded, but even those that do not are practically bounded by a reasonable limit to the number of community members. This is akin to assuming shadow manifolds constructed from population abundances are bounded by their respective carrying capacities even if they are practically unknowable. Their time series also form closed manifolds because the interactions were based on parasite loads which can only take integer values. There are therefore a finite number of possible interaction values between zero and one after normalization. The complement of this set is open so these manifolds will always be closed. This will not always be true however, depending on the kind of interaction and the way it was measured. Many interaction strengths are defined on R and either do not include their limit points (zero and one) or contain discontinuities which result in open manifolds. Takens did point out ([234]) that these assumptions can be ignored if the observations y : M → R are a proper function, meaning any reconstruction of a compact dataset in R is also compact, but this is an important distinction between community properties and populations to consider in future applications of CCM and methods based on Takens’ delay embedding theorem.

Takens’ smoothness assumption is harder to test. None of the time series here, hosts and parasites included, were twice differentiable because the data were collected more slowly than the dynamics of the community were occurring. So instead of directly testing the smoothness of the community properties’ time series, we compared them to those of the rodents and ectoparasites. To do this we used correlation coefficients between adjacent time steps, where a more positive correlation indicates smoother dynamics. We also computed pairwise Kolmogorov-Smirnov tests using the changes between adjacent time steps to help determine if they came from the same underlying distribution. See Figure B.2 for results of these two tests, which indicate the time series of the community properties were as smooth as those of the host and parasite populations. Unfortunately there is no way to know if, with more frequent observations, the population fluctuations would become smoother but those of the community properties would not. Populations can only change in size by one individual in the limit of observation frequency, but community properties can have discontinuities from both impossible values and large structural changes to a community following even a small change in an important constituent population.
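The two smoothness proxies described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the code used for Figure B.2: the correlation is assumed to be Pearson’s, only the two-sample Kolmogorov-Smirnov statistic (not its p-value) is computed, and the function names lag1_correlation, first_differences, and ks_statistic are hypothetical.

```python
import math
from bisect import bisect_right

def pearson(a, b):
    """Pearson correlation; returns 0.0 for degenerate input."""
    n = len(a)
    if n < 2:
        return 0.0
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def lag1_correlation(series):
    """Correlation between adjacent time steps; closer to one is smoother."""
    return pearson(series[:-1], series[1:])

def first_differences(series):
    """Changes between adjacent time steps (discrete rates of change)."""
    return [b - a for a, b in zip(series, series[1:])]

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # Both empirical CDFs are step functions, so the largest gap
    # is attained at one of the pooled observations.
    return max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
               for x in a + b)

def pairwise_ks(series_by_name):
    """KS statistic on first differences for every pair of time series."""
    names = sorted(series_by_name)
    diffs = {name: first_differences(series_by_name[name]) for name in names}
    return {(p, q): ks_statistic(diffs[p], diffs[q])
            for i, p in enumerate(names) for q in names[i + 1:]}
```

Applied to a dictionary of host, parasite, and community-property time series, lag1_correlation would give the quantity in the left panel of Figure B.2 and pairwise_ks the statistics summarized in the right panel.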

B.1.4 Real World Shadow Manifolds

To our knowledge, CCM has only been visualized using toy theoretical examples, like the Lorenz system (see Fig. 2 in [232]). This has been rationalized by arguing that these theoretical examples more clearly and generally illustrate the method, and that shadow manifolds for real communities almost always embed in greater than 3 dimensions. However, not showing some representation of them has hindered non-experts’ ability to build intuition for what is going on. Additionally, many of the assumptions of CCM are difficult to assess with a test. Visual inspection is tremendously helpful, even of a projection into 3 dimensions.

Figure B.6 uses splines to show one of the community property shadow manifolds, constructed from sequential time lags of the community’s Interaction Evenness, alongside a more traditional shadow manifold constructed from time lags of the relative abundances of Host 7. This only depicts three of the four embedding dimensions identified for both Host 7 and Interaction Evenness, but qualitatively shows the shadow manifolds CCM uses to infer causality. This particular system is dominated by rotational dynamics, possibly about a semi-stable state, but with evident nonlinear behavior such as saddle points. Nearby observations appear to have similar dynamics and there is little redundancy between the trajectories of the individual time series, supporting our use of the multispatial form of CCM.

B.2 Supplemental Figures

Figure B.1: Testing the assumptions of CCM. Of the 9 community properties tested, only these 3 passed: each had an embedding dimension that produced the best predictions and within which the quality of predictions decayed to zero with increasing temporal distance, indicating nonlinearity. These embedding dimensions are highlighted with blue stars.

Figure B.2: Proxies for smoothness assumed by the EDM framework, plotted to test the applicability of CCM to community properties. Green are host species, blue are parasite species, and red are community properties. The left panel contains correlation coefficients between adjacent points in each time series. Values closer to one indicate smoother dynamics. The right panel contains pairwise test statistics from the Kolmogorov-Smirnov test on changes between adjacent time steps to compare discrete rates of change. Larger values (bar widths) indicate greater differences in the distributions of changes between adjacent time steps. Their sum, the total width of each bar, connotes the cumulative difference between a given time series’ rates of change and those of every other time series.

Figure B.3: Comparison of the correlations and causal p-values for all 156 pairs of time series using the 3 hosts, 7 parasites, and 3 community properties that met the assumptions of the EDM framework. The lack of structure suggests no relationship between the covariance structure of the data and the underlying causal interactions. Note that the strong positive correlations between community properties appear to be outliers, casting doubt on CCM’s assessment that they were all very likely causally related to each other.

Figure B.4: Difference in the Clustering Coefficient (CC) of randomly generated preferential attachment networks with random attachment powers p ∼ U(0, 3) where a single node’s edges were either always present (P) or absent (A). A network generated with p = 0 is an Erdos-Renyi random network. 100 pairs of networks were generated for each integer α diversity between 2 and 100.

Figure B.5: Comparison of communities within the host-parasite network and three measures of their interactions: the observed relative interaction frequencies (left panel), pairwise host-parasite correlation coefficients (middle panel), and pairwise host-parasite causal interactions identified by CCM (right panel). Red rectangles indicate communities identified from the relative interaction frequencies (left panel) with the function computeModules in the R package Bipartite ([58, 201]) using its default settings, which uses Beckett’s algorithm ([18]) on Newman’s measure of modularity for a bipartite network ([174]). Host and parasite species are arranged according to community membership, and are the same in all three panels. Note the absence of blue squares in the right panel where every causal interaction at least involved the host rodent species driving the dynamics of their parasites. Shading indicates an interaction involving at least one species that did not meet the assumptions of CCM and was therefore not tested.

Figure B.6: Reconstructed shadow manifolds of the host-parasite community using time lags of Host 7 (top panel) and the Interaction Evenness of the community (bottom panel). In both cases only the first three of the four embedding dimensions (time lags) are plotted. Here τ = 1. Colors correspond to the sampling locations, each of which was interpolated using splines at 1000 evenly spaced points per location. Shading indicates the direction of time, with darker colors representing more recent observations. Black circles mark the underlying data.

Appendix C

Chapter 3 Appendix

C.1 Supplemental Figures

Figure C.1: Annual interspecific interactions of 14 native and 2 exotic (bottom right) vernal pool plant species with Lasthenia conjugens, measured as their per density effect on Lasthenia conjugens’ annual growth rate. Boxes span interquartile ranges (IQR), with the inside bar at medians and whiskers extending to the most extreme data within ±1.5 ∗ IQR. Missing bars indicate the species was not observed in any pool that year.

Figure C.2: Average interspecific interactions in each vernal pool of 14 native and 2 exotic (bottom right) vernal pool plant species with Lasthenia conjugens, measured as their per density effect on Lasthenia conjugens’ annual growth rate. Missing pools indicate the species was not observed between 2002 and 2015 in that pool.

Figure C.3: Density had no effect on interspecific interactions. Each X-axis is the density of Lasthenia conjugens and each Y-axis is the density of that panel’s species. Colors denote the intensity of the effect of that species on Lasthenia conjugens. Except for Eryngium vaseyi’s beneficial interactions when at low density, density had no effect on these interactions.

Figure C.4: Annual per density growth rates of Lasthenia conjugens as a function of early and late season precipitation. The darkest colors in the legend represent 20 or more overlapping points.

Figure C.5: Distributions of direct effects of early and late rainfall on Lasthenia conjugens.

Figure C.6: The relationship between Eryngium vaseyi and Lolium multiflorum and their effects on Lasthenia conjugens. Darker regions are greater density of points, where black spots are at least 10 overlapping points. Left) If one of these species was present the other was more likely absent, seen as darker bands along the axes. The exception was when Lolium multiflorum was at a density of one, where Eryngium vaseyi was likely still present though at low densities. This reflects that Eryngium vaseyi is a perennial species. There was no relationship between the densities of these two species when both were present (interior of the left panel). Right) There was a tradeoff in the way these two species interacted with Lasthenia conjugens. As one had a great effect, the other’s effect tended to weaken, reflecting their density relationship where they tended not to co-occur.

Figure C.7: The proportion of S-maps with a maximum weight of a given value. Each S-map was a multivariate regression of one observation on every other observation, and so had 3108 S-map weights (349 observations were excluded because they contained no plants). These weights ranged between zero and one. No observations had a maximum weight of exactly one because no two observations were identical.

Figure C.8: The relationship between causality and interaction strength for all interactions with Lasthenia conjugens (top left) and interactions aggregated across species (top right), years (bottom left) and pools (bottom right). Causalities are bits of transferred entropy and interaction strengths are the causally-filtered S-map coefficients. Note the different scales for each panel.

Figure C.9: Species-specific distributions of transfer entropy. Dashed red lines are where there was no transferred entropy. The relative height of these bars is listed as a percent in the upper right corner of each panel to keep the rest of the distribution visible. Note the species-specific axes.