Introduction and Motivation; Models of Trait Evolution Along Phylogenies

Phylogenetic Comparative Methods

Laura Salter Kubatko, Departments of Statistics and Evolution, Ecology, and Organismal Biology, The Ohio State University, [email protected]

May 25, 2010

Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Models of Trait Evolution Along Phylogenies Introduction

- Thus far, we have focused on estimation of the phylogeny as the primary goal of our analysis.

- Often, however, the phylogeny itself is not of interest; rather, it is a nuisance parameter.

- One setting in which this occurs is when we wish to study relationships among traits, either discrete or continuous, across taxa.

- We'll begin by examining some hypothetical data to motivate the method.

Example 1: Correlation between discrete traits

- Consider the following traits for 10 taxa:

2 x 2 table of counts of taxa for each combination of states of the two traits:

    0   6
    4   0

Fisher's exact test gives a p-value of 0.0048.
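The p-value can be checked directly from the hypergeometric distribution. The sketch below assumes the table is read as rows (0, 6) and (4, 0), and uses the common two-sided convention of summing the probabilities of all tables that are no more probable than the observed one. The function name `fisher_exact_2x2` is ours, not from the lecture.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # with all row and column totals held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

p_value = fisher_exact_2x2(0, 6, 4, 0)
print(round(p_value, 4))  # prints 0.0048
```

Here the observed table is the single most extreme one, so the p-value is 1/210.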

- But ...

- The correlation is completely explained by the phylogeny: the tree has 18 branches, so if each trait changed exactly once and every branch is equally likely to carry a change, the probability that the two changes fall on the same branch is 1/18 ≈ 0.056.

Example 2: Correlation between continuous traits

- Consider the following traits for 14 taxa:

[Figure: scatter plot of Character 2 against Character 1]

- But ...

[Figure: scatter plot of Character 2 against Character 1]

Brownian Motion

- The problem is that the taxa are not independent outcomes of the evolutionary process; they are all related by the phylogeny.

- We need to model the evolution of traits along the phylogeny and adjust appropriately for correlation due to shared evolutionary history.

- The simplest model of trait evolution along a phylogeny is the Brownian motion (BM) model.

- BM is named after Robert Brown (1773-1858), who observed that pollen grains suspended in solution "jiggled" continually. Later theoretical work is due to Einstein and Wiener, among others.

- BM was first applied to model trait evolution along phylogenies by Edwards and Cavalli-Sforza in 1964.

- Consider a particle moving in a single dimension (say, along the x-axis).

- Measure the position of the particle at small intervals of time.

- The movement of the particle in each interval is assumed to:
  - be independent of the movement in other intervals of time
  - have mean 0
  - have constant variance, regardless of the position of the particle

- Consider n distinct intervals of time.

- After n steps, the net displacement is the sum of the displacements at each step.

- Let $s^2$ be the variance of the displacement in each interval. The variance of the net displacement is then $n s^2$ (since the displacements in different intervals are independent).

- Now let $s^2 \to 0$ as $n \to \infty$ such that their product $n s^2$ stays constant. This is the BM or Wiener process.

- What is important for us is the distribution of the net displacement after $t$ units of time.

- Let $\sigma^2$ be the variance per unit time. Then after $t$ units of time, the variance of the net displacement is $\sigma^2 t$.

- In addition, the net displacement across an interval of $t$ units of time is distributed as $N(0, \sigma^2 t)$.
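The limiting argument can be illustrated numerically. The following is a minimal sketch, not part of the lecture: it sums many small independent steps of size plus or minus $s$ (mean 0, variance $s^2$) with $n s^2 = \sigma^2 t$, and compares the sample mean and variance of the net displacement with 0 and $\sigma^2 t$. The values of `sigma2`, `t`, the step count, and the seed are arbitrary choices for illustration.

```python
import random
import statistics

def net_displacement(sigma2, t, n_steps, rng):
    """Sum n_steps independent +/- s steps, where n_steps * s^2 = sigma2 * t."""
    step = (sigma2 * t / n_steps) ** 0.5
    # Each step has mean 0 and variance step**2, regardless of current position.
    return sum(rng.choice((-step, step)) for _ in range(n_steps))

rng = random.Random(882)        # fixed seed so the run is repeatable
sigma2, t = 0.5, 4.0            # illustrative values, not from the lecture
draws = [net_displacement(sigma2, t, n_steps=100, rng=rng) for _ in range(10000)]

sample_mean = statistics.mean(draws)
sample_var = statistics.variance(draws)
print(sample_mean, sample_var)  # expected to be close to 0 and sigma2 * t = 2.0
```

Any step distribution with mean 0 and variance $s^2$ would serve here; coin-flip steps are used only because they make it clear that normality of the net displacement comes from the sum, not from the individual steps.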

BM Along a Phylogeny

- How do we apply this to trait evolution along a phylogeny?

- Assume that displacements on different branches of the tree are independent and that traits evolve along each branch according to the BM model.

[Figure: rooted tree with root x0, tips x1 to x7, and internal nodes x8 to x12; the branch leading to node xi has length vi. In Newick form: (((x1:v1,x2:v2)x8:v8,x3:v3)x9:v9,(x4:v4,(x5:v5,(x6:v6,x7:v7)x10:v10)x11:v11)x12:v12)x0;]

BM Along a Phylogeny: An Example

- Let $x_i$ denote the trait value at node $i$, and let $v_i$ denote the length of the branch leading to node $i$.

- As an example, look at the value of the trait at external node 5, $x_5$.

- Assume that the state at the root, $x_0$, is fixed.

- Note that $x_5 = x_0 + (x_{12} - x_0) + (x_{11} - x_{12}) + (x_5 - x_{11})$.

- The last three terms are independent draws from normal distributions with mean 0 and variances $\sigma^2 v_{12}$, $\sigma^2 v_{11}$, and $\sigma^2 v_5$, respectively.

- We have
  - $E(x_5) = x_0$
  - $\mathrm{Var}(x_5) = \sigma^2 v_{12} + \sigma^2 v_{11} + \sigma^2 v_5$

- Similarly, for $x_7$, we have
  - $E(x_7) = x_0$
  - $\mathrm{Var}(x_7) = \sigma^2 v_{12} + \sigma^2 v_{11} + \sigma^2 v_{10} + \sigma^2 v_7$

- Note: $x_5$ and $x_7$ are not independent; they share a history up to node $x_{11}$.
  - $\mathrm{Cov}(x_5, x_7) = \sigma^2 v_{12} + \sigma^2 v_{11}$
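The variance and covariance formulas for x5 and x7 can be checked by simulation. The following is a minimal sketch, assuming the paths x0 → x12 → x11 → x5 and x11 → x10 → x7 implied by the formulas above. The branch lengths, `sigma2`, and the seed are arbitrary illustrative values, since the lecture keeps the $v_i$ symbolic.

```python
import random
import statistics

rng = random.Random(2010)       # fixed seed so the run is repeatable
sigma2 = 1.0                    # illustrative variance per unit time
# Illustrative lengths for the branches on the paths to x5 and x7.
v = {5: 0.3, 7: 0.2, 10: 0.4, 11: 0.5, 12: 0.6}
x0 = 0.0                        # root state, treated as fixed

def bm_step(length):
    """Displacement along one branch: a draw from N(0, sigma2 * length)."""
    return rng.gauss(0.0, (sigma2 * length) ** 0.5)

x5_draws, x7_draws = [], []
for _ in range(50000):
    x12 = x0 + bm_step(v[12])   # shared by both tips
    x11 = x12 + bm_step(v[11])  # shared by both tips
    x10 = x11 + bm_step(v[10])  # on the path to x7 only
    x5_draws.append(x11 + bm_step(v[5]))
    x7_draws.append(x10 + bm_step(v[7]))

m5, m7 = statistics.mean(x5_draws), statistics.mean(x7_draws)
var5 = statistics.variance(x5_draws)
var7 = statistics.variance(x7_draws)
cov57 = sum((a - m5) * (b - m7) for a, b in zip(x5_draws, x7_draws)) / (len(x5_draws) - 1)
print(var5, var7, cov57)        # theory: 1.4, 1.7, and 1.1 for these branch lengths
```

Only the increments on the shared branches (lengths v12 and v11) appear in both tip values, which is why the covariance is the shared path length times `sigma2`.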


BM Along a Phylogeny

- Now do this for all pairs of tips on the tree.

- The character values at the tips are jointly multivariate normal with expectation $x_0$ and covariances based on their shared history.

- For our example, the variance-covariance matrix (up to the factor $\sigma^2$) is:

\[
\begin{pmatrix}
v_1 + v_8 + v_9 & v_8 + v_9 & v_9 & 0 & 0 & 0 & 0 \\
v_8 + v_9 & v_2 + v_8 + v_9 & v_9 & 0 & 0 & 0 & 0 \\
v_9 & v_9 & v_3 + v_9 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & v_4 + v_{12} & v_{12} & v_{12} & v_{12} \\
0 & 0 & 0 & v_{12} & v_5 + v_{11} + v_{12} & v_{11} + v_{12} & v_{11} + v_{12} \\
0 & 0 & 0 & v_{12} & v_{11} + v_{12} & v_6 + v_{10} + v_{11} + v_{12} & v_{10} + v_{11} + v_{12} \\
0 & 0 & 0 & v_{12} & v_{11} + v_{12} & v_{10} + v_{11} + v_{12} & v_7 + v_{10} + v_{11} + v_{12}
\end{pmatrix}
\]
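Each entry of this matrix is the total length of the branches shared by the root-to-tip paths of the two tips. The sketch below builds the matrix that way for the example tree. The `parent` map encodes the topology implied by the matrix entries, the numeric branch lengths are placeholders (the lecture keeps them symbolic), and the names `parent`, `path_to_root`, `shared_length`, and `S` are ours.

```python
# parent[i] is the node directly above node i; node 0 is the root (x0).
# Branch i is the branch leading to node i, with length v[i].
parent = {1: 8, 2: 8, 3: 9, 8: 9, 9: 0,
          4: 12, 5: 11, 6: 10, 7: 10, 10: 11, 11: 12, 12: 0}
# Placeholder branch lengths: v_1 = 1.0, v_2 = 2.0, ..., v_12 = 12.0.
v = {i: float(i) for i in parent}

def path_to_root(node):
    """Branches (named by their lower node) between `node` and the root."""
    path = []
    while node != 0:
        path.append(node)
        node = parent[node]
    return path

def shared_length(i, j):
    """Total length of the branches common to the paths from the root to i and to j."""
    common = set(path_to_root(i)) & set(path_to_root(j))
    return sum(v[k] for k in common)

tips = range(1, 8)
S = [[shared_length(i, j) for j in tips] for i in tips]
for row in S:
    print(row)
```

With these placeholder lengths, the entry for tips 5 and 7 is v11 + v12 = 23.0, matching the corresponding entry of the matrix; multiplying `S` by $\sigma^2$ gives the variance-covariance matrix of the tip values.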

Inference Under the BM Model

- Suppose that $p$ characters are observed; each follows the multivariate normal distribution.

- We can write the likelihood as a product over characters (assuming the characters evolve independently along the phylogeny).

- We want to estimate the parameters, e.g., $x_{0i}$ and $\sigma_i^2$ for $i = 1, \dots, p$.

- However, we run into a problem: the likelihood goes to $\infty$ as a branch length goes to 0 (see Ch. 23 in the text for a worked example).
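For reference, the likelihood described above can be written out explicitly. This is a sketch in notation introduced here rather than taken from the lecture: $\mathbf{x}_i$ is the vector of tip values for character $i$, $n$ is the number of tips, $\mathbf{1}$ is a vector of ones, and $S$ is the matrix of shared branch lengths, so that character $i$ has variance-covariance matrix $\sigma_i^2 S$.

```latex
L(x_{01}, \dots, x_{0p}, \sigma_1^2, \dots, \sigma_p^2)
  = \prod_{i=1}^{p} (2\pi)^{-n/2} \, \lvert \sigma_i^2 S \rvert^{-1/2}
    \exp\!\left( -\frac{1}{2} (\mathbf{x}_i - x_{0i}\mathbf{1})^{\top}
                 (\sigma_i^2 S)^{-1} (\mathbf{x}_i - x_{0i}\mathbf{1}) \right)
```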

- Some possible solutions:
  - Assume a
  - Recode the data as differences between character states at the tips (contrasts)

- Look more closely at the second case: let $C$ be an $(n - 1) \times n$ matrix of contrasts.

- Let $V$ be the original variance-covariance matrix (multiplied by $\sigma^2$).

- Then for the data recoded as differences in trait values, which we denote by the vector $y$, we have

\[ y = Cx \sim N(0, C V C^{T}) \]
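A small numerical example may help show why the mean of $y$ is 0 and what $C V C^{T}$ looks like. The sketch below is an illustration under stated assumptions, not the lecture's construction: it uses only tips 1 to 3 of the example tree, placeholder branch lengths, and a simple contrast matrix of successive differences ($x_1 - x_2$ and $x_2 - x_3$). The helper functions `matmul` and `transpose` are ours.

```python
def matmul(A, B):
    """Plain-Python matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

sigma2 = 1.0
v1, v2, v3, v8, v9 = 1.0, 1.0, 3.0, 2.0, 1.5   # placeholder branch lengths

# Variance-covariance matrix of tips 1-3 (upper-left block of the full matrix).
V = [[sigma2 * (v1 + v8 + v9), sigma2 * (v8 + v9), sigma2 * v9],
     [sigma2 * (v8 + v9), sigma2 * (v2 + v8 + v9), sigma2 * v9],
     [sigma2 * v9, sigma2 * v9, sigma2 * (v3 + v9)]]

# (n - 1) x n contrast matrix: each row sums to zero.
C = [[1.0, -1.0, 0.0],
     [0.0, 1.0, -1.0]]

x0 = 7.0                                  # any root value gives the same result
mean_x = [[x0], [x0], [x0]]               # E(x) = x0 * (vector of ones)
mean_y = matmul(C, mean_x)                # E(y) = C E(x)
cov_y = matmul(matmul(C, V), transpose(C))
print(mean_y)  # [[0.0], [0.0]]
print(cov_y)   # [[2.0, -1.0], [-1.0, 6.0]]
```

Because every row of `C` sums to zero, the root value cancels out of the mean. The variance of $x_1 - x_2$ is $v_1 + v_2$, with the shared branches dropping out, but the off-diagonal entry is nonzero, so these particular differences are still correlated with each other.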

- Next, we'll discuss computing the likelihood under this model.

- We need a notion of independent contrasts.