
Introduction: PCMs Modeling PCMs with Hybridization Some ongoing work

Modeling Phylogenetic Comparative Methods with Hybridization

Tony Jhwueng

NIMBioS Interdisciplinary Seminar

Jan 25 2011

Outline:

1. Introduction: Phylogenetic Comparative Methods (PCMs).
2. Modeling PCMs with Hybridization. Develop possible comparative methods when there are ancient hybridization events in addition to the usual speciation events.
3. Some ongoing work.

Phylogenetic Comparative Methods

(Figure: a phylogenetic tree of Human (Akha), Chimpanzee, and Gorilla, with divergence times 4.7 million years ago and 7.2 million years ago (Takahata et al., 1995).)

Comparative Data

Examples: body mass (adult male, kg) and brain mass (adult male, gram).

$$X_{\text{body}} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 55.5 \\ 56.7 \\ 172.4 \end{pmatrix}, \qquad Y_{\text{brain}} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 1361 \\ 440 \\ 570 \end{pmatrix}.$$

Data from Jerison (1973)

Phylogenetic Comparative Methods (PCMs)

Phylogenetic Comparative Methods (PCMs) are statistical methods that incorporate a phylogenetic tree when analyzing comparative data in the ecology and evolution literature.

Phylogenetic Comparative Methods (PCMs)

Phylogenetic comparative methods are commonly applied to such questions as:
1. What is the slope of an allometric scaling relationship? e.g. how does brain mass vary in relation to body mass?
2. What was the ancestral state of a trait? (Schluter et al. 1997; Hardy 2006; Ronquist 2004.) e.g. where did endothermy (warm-blooded animals) evolve in the lineage that led to mammals?

A PCM developed from Evolutionary Perspective

Rely directly on explicit assumptions regarding the evolutionary process.
1. FIC (Felsenstein 1985): derived directly from population genetic theory and requires an assumption that the traits of interest have evolved via the Brownian motion process.

Brownian Motion process: $dy_t = \sigma\, dB_t$

σ (rate parameter) measures the intensity of the random fluctuations in the evolutionary process.

(Figure: simulated Brownian motion trajectories in two panels, BM: σ = 1 and BM: σ = 3; trait value y(t) from −400 to 400 against time t from 0 to 8000.)
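The Brownian motion process above can be simulated on a discrete time grid. The following is a minimal sketch, assuming an Euler discretization with unit time step; the function name `simulate_bm`, the step size, and the seed are illustrative choices and are not taken from the slides.

```python
import math
import random


def simulate_bm(sigma, n_steps, dt=1.0, y0=0.0, seed=None):
    """Simulate dy_t = sigma * dB_t on a grid of n_steps steps of size dt."""
    rng = random.Random(seed)  # local generator so runs are reproducible
    path = [y0]
    for _ in range(n_steps):
        # A Brownian increment over dt is Normal(0, dt), scaled by sigma.
        path.append(path[-1] + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return path


# Two trajectories over the same time span as the plotted panels.
path_slow = simulate_bm(sigma=1.0, n_steps=8000, seed=1)
path_fast = simulate_bm(sigma=3.0, n_steps=8000, seed=1)
```

Using the same seed for both calls makes the σ = 3 path the σ = 1 path scaled by 3, which shows the role of σ as the intensity of the fluctuations.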

The Comparative Data is NOT Statistically Independent

Since the species are related by shared evolutionary history, it may not be reasonable to view comparative data as independent, identically distributed realizations of the same stochastic process.
The distribution of comparative data depends on the stochastic process assumed for trait evolution.

Representing Phylogenetic Tree by Similarity Matrix G

Scale the phylogenetic tree so that the length from the root to each tip is 1.
The relationship between the traits of paired species is measured by the shared branch length (time).

(Figure: tree with tips y1 (Human), y2 (Chimp.), y3 (Gorilla). Human and Chimp. share a branch of length 0.6 and then each has a tip branch of length 0.4; the Gorilla branch has length 1.)

$$G_3 = \begin{pmatrix} 1 & 0.6 & 0 \\ 0.6 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Rows and columns are ordered y1, y2, y3.
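The shared-branch-length rule can be sketched in a few lines. This example assumes each tip is described by the list of edges on its root-to-tip path; the edge names and the `similarity_matrix` helper are hypothetical, and the branch lengths are the ones shown on this slide.

```python
def similarity_matrix(paths, lengths):
    """Entry (a, b) is the total length of edges shared by the two root-to-tip paths."""
    tips = list(paths)
    return [
        [sum(lengths[e] for e in set(paths[a]) & set(paths[b])) for b in tips]
        for a in tips
    ]


# Edge lengths of the scaled tree (root-to-tip distance is 1 for every tip).
lengths = {"root-HC": 0.6, "HC-Human": 0.4, "HC-Chimp": 0.4, "root-Gorilla": 1.0}
paths = {
    "Human": ["root-HC", "HC-Human"],
    "Chimp": ["root-HC", "HC-Chimp"],
    "Gorilla": ["root-Gorilla"],
}
G3 = similarity_matrix(paths, lengths)
```

The diagonal entries are the full root-to-tip lengths, and an off-diagonal entry is nonzero only when two tips share part of their history.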

FIC (Felsenstein, 1985): trait has evolved under the Brownian motion

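Under Brownian motion, the variance of the trait value is proportional to time.

(Figure: the same tree, with tips y1 (Human), y2 (Chimp.), y3 (Gorilla), tip branches of length 0.4, a shared branch of length 0.6, and a Gorilla branch of length 1.)

$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \sim \mathrm{MVN}\left( \begin{pmatrix} \mu \\ \mu \\ \mu \end{pmatrix},\ \sigma^2 \begin{pmatrix} 1 & 0.6 & 0 \\ 0.6 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right)$$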

Other PCMs

Developed from evolutionary perspective
2. PGLS (Martins and Hansen 1997, Butler and King 2004, Hansen 2008): expands the assumptions of BM to allow for other evolutionary scenarios, such as the OU process for stabilizing selection.
3. PMM (Lynch 1991; Housworth et al. 2004): derived from quantitative genetics, and partitions phenotypic variation into phylogenetically heritable and nonheritable components.

Developed from statistical perspective
4. ARM: spatial autoregressive method (Cheverud et al. 1985, Gittleman and Kot 1990).
5. PVR: phylogenetic eigenvector regression (Diniz-Filho et al. 1998).

Some useful textbooks and software for PCMs:

The Comparative Method in Evolutionary Biology by Harvey and Pagel, 1991.
Inferring Phylogenies by Joseph Felsenstein, 2004, Ch 26.
Analysis of Phylogenetics and Evolution with R by Emmanuel Paradis, 2006.

Software:
PHYLIP (Joseph Felsenstein)
COMPARE (Emília Martins)
BROWNIE (Brian O’Meara)
Various R packages at http://cran.r-project.org/web/views/Phylogenetics.html.

Modeling PCMs with Hybridization

Hybridization is common in nature: Sunflower

Wild Sunflowers in a Field

(photo by Erin Silversmith)

Hybridization is common in nature: Sunflower

Helianthus annuus. Helianthus petiolaris.

Hybridization is common in nature: Sunflower

L. H. Rieseberg (1991) provided evidence that H. anomalus is a species derived from H. annuus and H. petiolaris.

Hybridization is common in nature: Cichlid

Lake Tanganyika (African Great Lake: the second largest freshwater lake in the world).

(picture from Wikipedia)

Hybridization is common in nature: Cichlid

A typical shell-nest constructed by large Lamprologus callipterus males. These aggregations attract different species of obligatory and facultative gastropod-shell-breeders, which consequently live and breed in closest vicinity. (photo from Koblmüller et al. 2007)

Hybridization is common in nature: Cichlid

Hybrid cichlids from Lake Tanganyika

Lamprologus meleagris (photo by Hagblom, Fredrik). Lamprologus speciosus (photo by Slaboch, Roman).

Hybridization is common in nature: Cichlid

Hybrid cichlids from Lake Tanganyika

Neolamprologus fasciatus (photo by Jensen, Johnny). Neolamprologus multifasciatus (photo by Gagliardi, Flavio).

Purpose of this project

Hybrid species are known for sharing some common phenotypes with their parents.

Question 1: When studying the trait of a group of related species involving multiple hybrids, are hybrids constrained to lie between their parents, or does hybridization allow them to break free from these constraints?

(Figure: a trait-value axis with Species 1 and Species 2 marked, and three candidate positions of the hybrid trait, each marked "?".)

Purpose of this project

Some exotic species can evolve rapidly especially after hybridization (Barrett and Richardson 1982).

Question 2: When studying the trait of a group of related species involving multiple hybrids, does hybridization increase the rate of evolution?

Modeling PCMs with Hybridization

If evolution involves ancient hybridizations (reticulate evolutionary events), incorporate the phylogenetic network, instead of the phylogenetic tree, into the comparative analysis.

Linder and Rieseberg 2004.

Assume the nonhybrid taxa are normal diploid organisms, in which each chromosome consists of a pair of homologs. In a diploid hybridization event, the hybrid inherits one of the two homologs of each chromosome from each of its two parents.

(Figure: Species 1, Hybrid, Species 2.)

A 3 taxa network

A nucleotide inherited from the A parent (species 1 at t1) of hybrid B (hybrid at t1) will be part of the subtree in which species 1 and the hybrid are sister taxa.

(Figure: 3-taxon network with root O, nodes A, B, C after time t1, and tips Species 1, hybrid, Species 2 after a further time t2.)

A 3 taxa network

A nucleotide inherited from the C parent (species 2 at t1) of hybrid B (hybrid at t1) will be part of the subtree in which species 2 and the hybrid are sister taxa.

(Figure: 3-taxon network with root O, nodes A, B, C after time t1, and tips Species 1, hybrid, Species 2 after a further time t2.)

A 3 taxa network

(Figure: 3-taxon network with root O, nodes A, B, C after time t1, and tips Species 1, hybrid, Species 2 after a further time t2.)

The affinity between the hybrid and other species

Denote r as the trait of the hybrid.

Denote x1 and x2 as the traits of species 1 and 2, respectively. Define

$$r = \mu + \beta(x_1 + x_2 - 2\mu),$$

where β is called the hybrid parameter. Then

$$\operatorname{cov}(r, z) = \operatorname{cov}(\beta(x_1 + x_2), z), \qquad z = x_1, r, x_2.$$

The new PCMs are associated with the hybrid parameter β.
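The covariance rule above can be worked through for the 3-taxon network. This is a sketch under assumptions that are not spelled out on the slide: the two parents evolve independently by Brownian motion with rate σ² for time t1 before the hybridization, r is defined at the moment of hybridization, and all three lineages then evolve independently for a further time t2.

```latex
\begin{aligned}
\operatorname{Var}(x_1(t_1)) &= \operatorname{Var}(x_2(t_1)) = \sigma^2 t_1,
  \qquad \operatorname{Cov}(x_1(t_1), x_2(t_1)) = 0, \\
\operatorname{Cov}(r, x_1) &= \operatorname{Cov}\bigl(\beta(x_1(t_1) + x_2(t_1)),\, x_1(t_1)\bigr)
  = \beta \sigma^2 t_1, \\
\operatorname{Var}(r) &= \beta^2 \operatorname{Var}\bigl(x_1(t_1) + x_2(t_1)\bigr) + \sigma^2 t_2
  = \sigma^2 \bigl(2\beta^2 t_1 + t_2\bigr).
\end{aligned}
```

With t1 = 0.4 and t2 = 0.6 this gives Cov(r, x1) = 0.4βσ² and Var(r) = (0.8β² + 0.6)σ², the entries that appear in the network matrix of the BM model.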

Phylogenetic Tree and Phylogenetic Network in BM model

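(Figure: left, the tree for y1 (Human), y2 (Chimp.), y3 (Gorilla), with a shared branch of length 0.4, tip branches of length 0.6, and a Gorilla branch of length 1; right, the network for x1 (H. a.), r (H. an.), x2 (H. p.), with parental branches of length 0.4 and tip branches of length 0.6.)

Tree (rows and columns ordered y1, y2, y3):

$$\begin{pmatrix} 1 & 0.4 & 0 \\ 0.4 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Network (rows and columns ordered x1, r, x2):

$$\begin{pmatrix} 1 & 0.4\beta & 0 \\ 0.4\beta & 0.6 + 0.8\beta^2 & 0.4\beta \\ 0 & 0.4\beta & 1 \end{pmatrix}$$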

Investigating the hybrid effect

Test of assumption:
Question 1: Using trait data with evolutionary evidence (network), does the trait of the hybrid at origin fall between its parents with statistical significance?

$$r = \mu + \beta(x_1 + x_2 - 2\mu)$$

Hypothesis Testing: $H_0: \beta = \tfrac{1}{2}$

A simple approach for 3 taxa case

(Figure: 3-taxon network with tip trait values 3, 7, 11, tip branches of length 0.6, and parental branches of length 0.4.)

Input trait data: (3, 7, 11)

A simple approach for 3 taxa case

Statistical Model

$$\begin{pmatrix} 3 \\ 7 \\ 11 \end{pmatrix} \sim \mathrm{MVN}\left( \begin{pmatrix} \mu \\ \mu \\ \mu \end{pmatrix},\ \sigma^2 \begin{pmatrix} 1 & 0.4\beta & 0 \\ 0.4\beta & 0.6 + 0.8\beta^2 & 0.4\beta \\ 0 & 0.4\beta & 1 \end{pmatrix} \right)$$

A simple approach for 3 taxa case

Parameters are estimated by maximum likelihood analysis and confidence intervals are obtained from simulation.

$\hat{\mu} = 6.78$ (5.94, 8.32)
$\hat{\sigma} = 4.12$ (2.97, 5.88)
$\hat{\beta} = 0.47$ (0.31, 0.65)

Answer for Question 1 ($H_0: \beta = 0.5$): Since 0.5 falls in the confidence interval (0.31, 0.65), the null hypothesis is not rejected. For this data set (3, 7, 11), there is no significant evidence that the hybrid trait is more extreme than those of its parents.

Investigating the hybrid effect

Question 2: Does hybridization increase the rate of evolution?

(Figure: two copies of the network for x1 (H. a.), r (H. an.), x2 (H. p.), with tip branches of length 0.6 and parental branches of length 0.4.)

Investigating the hybrid effect

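Test for different rates of evolution (McPeek, 1995; O’Meara et al. 2006).

Same rate:

$$\sigma^2 \begin{pmatrix} 1 & 0.4\beta & 0 \\ 0.4\beta & 0.8\beta^2 + 0.6 & 0.4\beta \\ 0 & 0.4\beta & 1 \end{pmatrix}$$

Different rate on the hybrid branch:

$$\sigma^2 \begin{pmatrix} 1 & 0.4\beta & 0 \\ 0.4\beta & 0.8\beta^2 & 0.4\beta \\ 0 & 0.4\beta & 1 \end{pmatrix} + \eta^2 \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0.6 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$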

Investigating the hybrid effect

Statistical models

$$Y \sim \mathrm{MVN}(\mu \mathbf{1},\ \sigma^2 G_\beta) \quad \text{(same rate)}$$

vs.

$$Y \sim \mathrm{MVN}(\mu \mathbf{1},\ \sigma^2 (G_\beta - J) + \eta^2 J) \quad \text{(diff. rate)}$$

Hypothesis Testing:

$$H_0: \sigma^2 = \eta^2$$

A simple approach for 3 taxa case

Answer for Question 2 ($H_0: \sigma^2 = \eta^2$):

$$Y \sim \mathrm{MVN}(\mu \mathbf{1},\ \sigma^2 G_\beta)$$

vs.

$$Y \sim \mathrm{MVN}(\mu \mathbf{1},\ \sigma^2 (G_\beta - J) + \eta^2 J)$$

By the likelihood ratio test (LRT): since

$$-2 \log \frac{L_{\hat{\beta}}}{L_{\hat{\beta}, \hat{\eta}}} = 2.69 < \chi^2_{df=1} = 3.84 \quad (P \text{ value} = 0.101),$$

we fail to reject the null hypothesis. Hence the difference in the rate of evolution between the hybrid and the other species is not significant.
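The quoted P value can be checked against the chi-square distribution with one degree of freedom. This is a minimal sketch that uses only the reported statistic 2.69; the helper names are illustrative, and the identity P(χ² > x) = erfc(√(x/2)) is specific to df = 1.

```python
import math


def lrt_statistic(loglik_null, loglik_alt):
    """Likelihood ratio statistic from two maximized log-likelihoods."""
    return -2.0 * (loglik_null - loglik_alt)


def chi2_sf_df1(x):
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_sf_df1(2.69)
print(round(p_value, 3))  # about 0.101, in line with the reported P value
```

The same helper gives a tail probability of about 0.05 at the critical value 3.84, which is why 2.69 falls short of significance at the 5% level.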

Phylogenetic Network

(Figure: phylogenetic network with tips 1 to 19 and X1, X2, X3, and internal nodes 20 to 50, with node 50 at the root.)

Algorithm for general construction of similarity matrix.

1. Input: phylogenetic network in eNewick format.
2. Construct the similarity matrix through the following recursive formula (BM):

$$G_{\beta,k} = K_{k-1}\, G_{\beta,k-1}\, K_{k-1}^t + t_{k-1} I_k$$

where K depends on hybridization ($K_h$) or speciation ($K_s$).

Quick look for algorithm of 4 taxa

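(Figure: 4-taxon network. Root 9 splits into nodes 5 and 8 over time t1; node 8 splits into 7 and 4 over time t2; nodes 5 and 7 hybridize to form 6; the lineages then extend over time t3 to tips 1, 2, 3, 4.)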

[9] → [5, 8]: Speciation

$$G_2 = \begin{pmatrix} t_1 & 0 \\ 0 & t_1 \end{pmatrix}$$

Rows and columns are ordered 5, 8.

[5, 8] → [5, 7, 4]: Speciation

$$G_3 = \begin{pmatrix} t_1 + t_2 & 0 & 0 \\ 0 & t_1 + t_2 & t_1 \\ 0 & t_1 & t_1 + t_2 \end{pmatrix}$$

Rows and columns are ordered 5, 7, 4.

[5, 7, 4] → [5, 6, 7, 4]: Hybridization

$$G_{\beta,4} = \begin{pmatrix} t_1 + t_2 & \beta(t_1 + t_2) & 0 & 0 \\ \beta(t_1 + t_2) & 2\beta^2(t_1 + t_2) & \beta(t_1 + t_2) & \beta t_1 \\ 0 & \beta(t_1 + t_2) & t_1 + t_2 & t_1 \\ 0 & \beta t_1 & t_1 & t_1 + t_2 \end{pmatrix}$$

Rows and columns are ordered 5, 6, 7, 4.

[5, 6, 7, 4] → [1, 2, 3, 4]: Elongation

$$G_{\beta,4} = \begin{pmatrix} t_1 + t_2 + t_3 & \beta(t_1 + t_2) & 0 & 0 \\ \beta(t_1 + t_2) & t_3 + 2\beta^2(t_1 + t_2) & \beta(t_1 + t_2) & \beta t_1 \\ 0 & \beta(t_1 + t_2) & t_1 + t_2 + t_3 & t_1 \\ 0 & \beta t_1 & t_1 & t_1 + t_2 + t_3 \end{pmatrix}$$

Rows and columns are ordered 1, 2, 3, 4.
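The four steps above can be reproduced by applying the recursion G_k = K G_{k-1} K^t + t I once per step. The following is a sketch under assumptions: the K matrices are one reading of the speciation and hybridization steps (a speciation row copies the parent, a hybrid row puts weight β on each parent), the root starts from a 1×1 zero matrix, and the function names are hypothetical.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def step(G, K, dt):
    """One recursion step: K G K^t + dt * I."""
    H = matmul(matmul(K, G), transpose(K))
    for i in range(len(H)):
        H[i][i] += dt
    return H


def build_4taxa(beta, t1, t2, t3):
    G1 = [[0.0]]                                   # root node 9
    G2 = step(G1, [[1.0], [1.0]], t1)              # [9] -> [5, 8], speciation
    Ks = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]      # node 8 splits into 7 and 4
    G3 = step(G2, Ks, t2)                          # [5, 8] -> [5, 7, 4]
    Kh = [[1.0, 0.0, 0.0],
          [beta, beta, 0.0],                       # hybrid 6 = beta * (5 + 7)
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
    G4 = step(G3, Kh, 0.0)                         # [5, 7, 4] -> [5, 6, 7, 4]
    I4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    return step(G4, I4, t3)                        # elongation to tips 1, 2, 3, 4


G = build_4taxa(beta=0.5, t1=0.2, t2=0.3, t3=0.5)
```

The hybridization step adds no time, so only the K matrix changes the covariance structure there; elongation leaves the off-diagonal entries untouched and adds t3 to every diagonal entry.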

Statistical Model and Parameters Estimation

Statistical model

$$Y \sim \mathrm{MVN}(\mu \mathbf{1},\ \sigma^2 G_\beta)$$

Negative log likelihood function

$$\ell(\mu, \sigma^2, \beta \mid Y) = \frac{n}{2} \log 2\pi + \frac{n}{2} \log \sigma^2 + \frac{1}{2} \log |G_\beta| + \frac{1}{2\sigma^2} (Y - \mu \mathbf{1})^t G_\beta^{-1} (Y - \mu \mathbf{1})$$

MLE estimation through a derivative-free method: the golden section method.
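The negative log likelihood and a golden section search can be sketched as follows for the 3-taxon example. Several things here are assumptions rather than the method used in the talk: µ and σ² are replaced by their generalized least squares estimates for each β so that the search is one-dimensional, the search bracket [0, 1] is arbitrary, and all function names are hypothetical.

```python
import math


def g_beta(beta, t1=0.4, t2=0.6):
    """3-taxon network similarity matrix (order x1, r, x2)."""
    return [[t1 + t2, beta * t1, 0.0],
            [beta * t1, 2 * beta ** 2 * t1 + t2, beta * t1],
            [0.0, beta * t1, t1 + t2]]


def solve_logdet(G, b):
    """Solve G x = b by Gaussian elimination; also return log|det G|."""
    n = len(G)
    A = [row[:] + [b[i]] for i, row in enumerate(G)]
    logdet = 0.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        logdet += math.log(abs(A[c][c]))
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (A[i][n] - sum(A[i][k] * x[k] for k in range(i + 1, n))) / A[i][i]
    return x, logdet


def neg_log_lik(mu, sigma2, beta, y):
    n = len(y)
    resid = [yi - mu for yi in y]
    ginv_resid, logdet = solve_logdet(g_beta(beta), resid)
    quad = sum(r * g for r, g in zip(resid, ginv_resid))
    return (0.5 * n * math.log(2 * math.pi) + 0.5 * n * math.log(sigma2)
            + 0.5 * logdet + quad / (2 * sigma2))


def profile_estimates(beta, y):
    """GLS estimates of mu and sigma^2 for fixed beta (assumption of this sketch)."""
    ginv_y, _ = solve_logdet(g_beta(beta), y)
    ginv_1, _ = solve_logdet(g_beta(beta), [1.0] * len(y))
    mu = sum(ginv_y) / sum(ginv_1)
    resid = [yi - mu for yi in y]
    ginv_resid = [a - mu * b for a, b in zip(ginv_y, ginv_1)]
    sigma2 = sum(r * g for r, g in zip(resid, ginv_resid)) / len(y)
    return mu, sigma2


def golden_section(f, a, b, tol=1e-6):
    """Derivative-free minimization of f on [a, b]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


y = [3.0, 7.0, 11.0]


def profile_nll(beta):
    mu, sigma2 = profile_estimates(beta, y)
    return neg_log_lik(mu, sigma2, beta, y)


beta_hat = golden_section(profile_nll, 0.0, 1.0)  # bracket chosen for illustration
```

Golden section search only guarantees a local minimum inside the bracket, so in practice the profile likelihood would be inspected on a grid before trusting the result.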

Application: Koblmüller et al. (2007)

Cichlids in Lake Tanganyika: 5 hybrids (in bold)

Phylogenetic Network: redrawn from Koblmüller et al. (2007)

(Figure: phylogenetic network with tips 1 to 19 and X1, X2, X3, and internal nodes 20 to 50, with node 50 at the root.)

Input

eNewick format:

((((((1 : 0.495, ((((2 : 0.215, (5 : 0.215)27 : 0)26 : 0.1, (14 : 0.315)34 : 0)33 : 0.055, (X1 : 0.1, (3 : 0.1)21 : 0)20 : 0.27)37 : 0.06, ((21 : 0, 4 : 0.1)22 : 0.26, (27 : 0, (X2 : 0.17, (6 : 0.17)24 : 0)23 : 0.045)28 : 0.145)36 : 0.07)38 : 0.065)39 : 0.11, 7 : 0.605)41 : 0.045, ((8 : 0.575, 9 : 0.575)40 : 0.06, 10 : 0.635)42 : 0.015)43 : 0.125, ((((24 : 0, 11 : 0.17)25 : 0.075, 12 : 0.245)29 : 0.475, (13 : 0.68, (34 : 0, (15 : 0.275, (16 : 0.275)31 : 0)30 : 0.04)35 : 0.365)44 : 0.04)45 : 0.02, (31 : 0, X3 : 0.275)32 : 0.465)46 : 0.035)47 : 0.11, (17 : 0.835, 18 : 0.835)48 : 0.05)49 : 0.115, 19 : 1)50 : 0;

Trait data: total lengths of cichlids (cm) (Froese and Pauly, 2010), www.fishbase.org

Y = (y1, y2, ..., y19) = (13.5, 12.4, 7, 5.8, 5.5, 16, 15, 25, 6.1, 6.5, 5.5, 5, 7.8, 15, 4.3, 4, 8.6, 10.3, 8.5)

Some results

The model does not fit the raw data, so the data are log transformed.
MLE estimators: $\hat{\beta} = 0.43$, $\hat{\mu} = 2.06$, $\hat{\sigma} = 0.69$.
Since $\hat{\beta} = 0.43 < 0.5$, the trait values of those hybrid species are not more variable than those of the other species.
Since 0.5 falls in $(\hat{\beta}_{25}, \hat{\beta}_{975}) = (5.57 \cdot 10^{-5}, 1.05)$, there is no significant evidence that the total length of those hybrids is more extreme than that of their parents.

Some ongoing work in this project

1. Contribute R packages bmhyd (BM model) and ouhyd (OU model).
2. Modeling with more parameters: $R_i = \beta_i (X_i + Y_i)$, $i = 1, 2, \dots, d$.
3. Improve these methods to allow adaptive radiations.
4. Develop a sampling strategy to study the performance (bias and power of β) of the BM and OU models.
5. Model events such as horizontal transfers that affect traits but are biologically different from hybridization.

Acknowledgement:
NIMBioS
Elizabeth Housworth (Dept. of Mathematics, Indiana University Bloomington)
Brian O’Meara (EEB, Univ. of Tennessee, Knoxville)
Emília Martins (Dept. of Biology, Indiana University Bloomington)
Vasileios Maroulas (Dept. of Math, Univ. of Tennessee, Knoxville)

Thanks!

Bias and Power analysis for the hybrid parameter

Assess the bias and power of the parameter β using phylogenetic networks with 5, 9, 17, 33, 65 and 129 taxa, each containing one ancient hybrid.

The time at which the ancient hybridization occurs is set to t1 = 0.1, 0.5, 0.9 in separate runs.

A sample network for simulation

(Figure: sample network with 17 taxa and 1 hybrid; the time axis is marked t1, t2, t3, t4.)

Bias analysis of BM model

(Figure: bias of β, from −3 to 3, plotted against β from −3 to 3. Red: t1 = 0.1; black: t1 = 0.5; blue: t1 = 0.9.)

Power analysis of BM model

(Figure: power of the test of β = 0.5, from 0 to 1, plotted against β from −5 to 5. Upper curve: t1 = 0.9; middle curve: t1 = 0.5; lower curve: t1 = 0.1.)