
Splits and Phylogenetic Networks

Daniel H. Huson

Paris, June 21, 2005

Contents
1. Phylogenetic trees
2. Splits networks
3. Consensus networks
4. Hybridization and reticulate networks
5. Recombination networks

Phylogenetic Networks
Bandelt (1991): Network displaying evolutionary relationships
[Overview diagram, built up over several slides; numbers refer to the parts of this talk]
Top row: Other types of phylogenetic networks; Splits networks (2); Phylogenetic trees (1); Reticulate networks
Middle row: Median networks (from sequences); Consensus (super) networks (3, from trees); Hybridization networks (4); Recombination networks (5); Augmented trees; Special case: “Galled trees”
Bottom row: Any graph representing evolutionary data; Split decomposition, Neighbor-net (from distances); Ancestral recombination graphs
Annotations (terminology of Dan Gusfield): “Phylogenetic network”, “Generalized phylogenetic network”, “More generalized phylogenetic network”

Part I: Phylogenetic trees

Phylogenetic Trees
Ernst Haeckel, Tree of Life 1866

Let X = {x1,...,xn} denote a set of taxa. A phylogenetic tree T (or X-tree) is given by labeling the leaves of a tree by the set X:
Taxa: Cow, Fin Whale, Blue Whale, Harbor Seal, Rat, Mouse, Chimp, Human, Gorilla
Taxa + tree ⇒ phylogenetic tree

Unrooted vs Rooted Trees
• Unrooted tree: mathematically and algorithmically easier to deal with
• Rooted tree, rooted using Chicken as outgroup: biologically relevant, defines clades of related taxa

A Simple Model of Evolution

[Figure: evolutionary tree drawn along a time axis, from the sequence of the common ancestor to the present-day sequences at the leaves]
• Mutations along branches
• Speciation events at nodes

Tree Reconstruction Problem
[Figure: given only the present-day sequences, which evolutionary tree?]

Tree of Life Based on 16S rRNA
(Doolittle, 2000)

Part II: Splits networks

Splits Networks

Represents incompatible signals in data, from:
• Sequences, e.g.:
  Median network (Bandelt et al 1994)
  Spectral analysis (Hendy and Penny 1993)
• Distances, e.g.:
  Split decomposition (Bandelt and Dress 1992)
  Neighbor-Net (Bryant and Moulton 2002)
• Trees, e.g.:
  Consensus network (Holland and Moulton 2003)
  Super network (H., Dezulian, Kloepper and Steel 2004)
  Bootstrap network (H., implemented in SplitsTree4)

Sequences to Splits Network
If characters have only 2 states and are not too conflicting: interpret columns as splits and draw the full splits network

Distances to Splits Network
Split decomposition or Neighbor-Net produces a network from distances

Trees to Splits Network
A collection of trees can be represented by a consensus network or super network

Bootstrap Network

Draw all splits that have positive bootstrap score

Split Decomposition & Bootstrap Network
Compare the result of Split Decomposition with an NJ tree and bootstrap network:
[Figure: three panels on the taxa A.cerana, A.mellifer, A.dorsata, A.andrenof, A.florea and A.koschev: Bio-NJ tree; Bootstrap network; Splits network obtained via the Split Decomposition]
Bill Martin: Splits networks show which signals tree reconstruction methods are fighting over…

Rooted Splits Networks
Splits network can be “rooted”, e.g. using an outgroup (Gambette and H., manuscript)

Better layout of splits networks
Philippe Gambette, currently visiting Tuebingen from ENS Cachan

Part III: Consensus networks

Gene Trees Can Differ
Also allow gene duplication and loss:
[Figure: after a duplication, a gene tree G with leaves A’, A’’ and B’ can differ (≠) from the tree on the taxa x1, x2, x3]

Gene Trees vs Species Trees

Differing gene trees give rise to “mosaic sequences”

Gene A Gene B Gene C Gene D

Consensus of Different Gene Trees
For a given set of species, different genes lead to different trees
How to form a consensus of the trees?
• Consensus trees
• Consensus networks
• Consensus super networks

The Splits of a Tree
Every edge of a tree defines a split of the taxon set X:
[Figure: tree on x1,...,x8 with an edge e highlighted]
x1,x3,x4,x6,x7 vs x2,x5,x8

The Split Encoding of a Tree
Tree T: [Figure: tree on the taxa a, b, c, d, e]
Split encoding Σ(T):
5 trivial splits: one for each leaf edge
2 non-trivial splits: one for each internal edge

Compatibility

Two splits A1|B1 and A2|B2 of X are compatible, if

∅ ∈ {A1∩A2, A1∩B2, B1∩A2, B1∩B2}

Two compatible splits:
[Figure: the taxa x1,...,x9 of X divided by the two splits A1|B1 and A2|B2]

Two incompatible splits:
[Figure: the taxa of X divided by two splits A1|B1 and A2|B2 for which all four intersections A1∩A2, A1∩B2, B1∩A2 and B1∩B2 are non-empty]
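The split encoding and the compatibility test can be sketched in a few lines of Python. This is only an illustration: the tree below (inner nodes `u`, `v`, `w`) is a hypothetical topology on the taxa a, b, c, d, e, not necessarily the tree T shown on the slide, and the function names are made up for this sketch.

```python
from itertools import combinations

def splits_of_tree(edges, taxa):
    """Split encoding of an unrooted tree: one split {A, B} per edge."""
    taxa = frozenset(taxa)
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    splits = set()
    for u, v in edges:
        # Collect everything reachable from u without crossing the edge (u, v).
        seen, stack = {u}, [u]
        while stack:
            node = stack.pop()
            for nxt in adjacency[node]:
                if nxt not in seen and (node, nxt) != (u, v):
                    seen.add(nxt)
                    stack.append(nxt)
        side = frozenset(seen & taxa)
        splits.add(frozenset({side, taxa - side}))
    return splits

def compatible(split1, split2):
    """Two splits are compatible if at least one of the four intersections is empty."""
    return any(not (p & q) for p in split1 for q in split2)

# Hypothetical tree on a..e with inner nodes u, v, w (an assumption for illustration).
tree_edges = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"),
              ("v", "w"), ("d", "w"), ("e", "w")]
encoding = splits_of_tree(tree_edges, "abcde")
nontrivial = [s for s in encoding if min(len(side) for side in s) >= 2]

# Two splits that cannot both come from the same tree.
split_ab = frozenset({frozenset("ab"), frozenset("cde")})
split_ac = frozenset({frozenset("ac"), frozenset("bde")})
```

Here `encoding` holds 5 trivial and 2 non-trivial splits, all pairwise compatible, whereas `split_ab` and `split_ac` are incompatible.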

Consensus of Trees
Six gene trees: [Figure]

Σ(1/2): majority consensus: splits contained in more than 50% of trees
Σ(1/6): splits contained in more than one tree
Σ(0): splits contained in at least one tree
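The three thresholds are instances of one rule: Σ(p) keeps every split that occurs in more than a fraction p of the trees. A minimal sketch, in which the six small trees are invented for illustration (they are not the six gene trees of the slide) and splits are written as plain labels such as `"ab|cde"`:

```python
from collections import Counter
from fractions import Fraction

def consensus_splits(tree_splits, threshold):
    """Splits contained in more than `threshold` (a fraction) of the trees."""
    counts = Counter(s for splits in tree_splits for s in set(splits))
    n = len(tree_splits)
    return {s for s, c in counts.items() if c > threshold * n}

# Hypothetical non-trivial splits of six gene trees on the taxa a..e.
six_trees = [
    {"ab|cde", "de|abc"},
    {"ab|cde", "de|abc"},
    {"ab|cde", "cd|abe"},
    {"ab|cde", "ce|abd"},
    {"ac|bde", "de|abc"},
    {"ac|bde", "bd|ace"},
]

majority = consensus_splits(six_trees, Fraction(1, 2))
more_than_one_tree = consensus_splits(six_trees, Fraction(1, 6))
all_splits = consensus_splits(six_trees, 0)
```

In this toy input `majority` contains only `ab|cde`, while `more_than_one_tree` additionally contains `de|abc` and `ac|bde`. Since `ab|cde` and `ac|bde` are incompatible, Σ(1/6) has to be drawn as a network rather than a tree.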

Example of A Super Network (Plants)
[Figure: five partial plant trees and the resulting super network]
Joint work with Kim McBreen and Pete Lockhart

Z-Closure Method
Idea: Extend partial splits.

Z-rule: two partial splits A1|B1 and A2|B2 with A1∩A2 ≠ ∅, A2∩B1 ≠ ∅, B1∩B2 ≠ ∅ and A1∩B2 = ∅ are replaced by

A1|(B1∪B2) and (A1∪A2)|B2

Repeatedly apply to completion.
Return all full splits.
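A small sketch of the Z-rule applied to completion. It is a naive quadratic loop written for readability, not the implementation used in SplitsTree4, and the two partial splits in the example are hypothetical.

```python
from itertools import combinations, permutations

def _apply_z_rule_once(splits):
    """Return the split set after one effective Z-rule application, or None."""
    for p, q in combinations(splits, 2):
        for a1, b1 in permutations(p):        # both orientations of p
            for a2, b2 in permutations(q):    # both orientations of q
                if a1 & a2 and a2 & b1 and b1 & b2 and not (a1 & b2):
                    extended = {frozenset({a1, b1 | b2}),
                                frozenset({a1 | a2, b2})}
                    if extended != {p, q}:
                        return (splits - {p, q}) | extended
    return None

def z_closure(partial_splits, taxa):
    """Apply the Z-rule until nothing changes and return the full splits."""
    taxa = frozenset(taxa)
    current = {frozenset({frozenset(a), frozenset(b)}) for a, b in partial_splits}
    while True:
        updated = _apply_z_rule_once(current)
        if updated is None:
            break
        current = updated
    return {s for s in current if frozenset().union(*s) == taxa}

# Hypothetical input: ab|cd from a tree on {a,b,c,d}, bc|de from a tree on {b,c,d,e}.
full_splits = z_closure([("ab", "cd"), ("bc", "de")], "abcde")
```

For this input the closure extends the two partial splits to the full splits ab|cde and abc|de.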

[Huson, Dezulian, Kloepper and Steel, 2004]

Example
Five fungal trees from [Pryor, 2000] and [Pryor, 2003]
Trees:
• ITS (two trees)
• SSU (two trees)
• Gpd (one tree)
Numbers of taxa differ: “partial trees”

Example

4beta25: User manual now available from www..org!

Individual Gene Trees

• ITS00: 46 taxa
• ITS03: 40 taxa
• SSU00: 29 taxa
• SSU03: 40 taxa
• Gpd03: 40 taxa

Gene Trees as Super Network

Z-closure: a fast super-network method

[Figures: super networks for increasing sets of input trees]
• ITS00 + ITS03
• ITS03 + SSU00
• ITS00 + ITS00 + SSU03
• ITS00 + ITS03 + SSU03 + Gpd03
• ITS00 + ITS03 + SSU00 + SSU03 + Gpd03

Part IV: Hybridization and reticulate networks

Hybridization
Occurs when two organisms from different species interbreed and combine their genomes
[Figure: Water hemp and Pigweed. Copyright © 2003 University of Illinois]

Speciation by Hybridization 1
In allopolyploidization, two different lineages produce a new species that has the complete nuclear genomes of both parental species:
[Figure: Linder et al. 2004]

Speciation by Hybridization 2
In diploid (or homoploid) hybrid speciation, each of the parents produces normal gametes (haploid) to produce a normal diploid hybrid:
[Figure: Linder et al. 2004]

Horizontal Gene Transfer
There are a number of known mechanisms by which bacteria can exchange genes:
• Transformation
• Conjugation
• Transduction

http://www.pitt.edu/~heh1/research.html

A Simple Model of Reticulate Evolution

[Figure sequence: a reticulate network on the taxa b1, a, h, c, b3, in which h descends from the two lineages P and Q]

Tree for gene g1: starting from the ancestral g1, the g1-tree is the “P”-variant

Tree for gene g2: the g2-tree is the “Q”-variant

Reticulate Networks and Trees
The evolutionary history associated with any given gene is a tree
A network N with k reticulations gives rise to 2^k different gene trees
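The count 2^k comes from choosing, independently for each reticulation node, which of its two parents the gene was inherited from. A minimal sketch under stated assumptions: the network is given as a child-to-parents map, and the node names (`root`, `x`, `y`, `r`) and the topology are hypothetical, loosely modelled on the b1, a, h, c, b3 example with parents P and Q.

```python
from itertools import product

def induced_trees(parents):
    """Yield one tree (child -> parent map) per choice of parent at each reticulation."""
    reticulations = sorted(n for n, ps in parents.items() if len(ps) == 2)
    for choice in product((0, 1), repeat=len(reticulations)):
        tree = {n: ps[0] for n, ps in parents.items() if len(ps) == 1}
        for node, pick in zip(reticulations, choice):
            tree[node] = parents[node][pick]
        yield tree

# Hypothetical network N: the reticulation node r has parents P and Q; h hangs below r.
network = {
    "x": ("root",), "y": ("root",),
    "b1": ("x",), "P": ("x",), "a": ("P",),
    "b3": ("y",), "Q": ("y",), "c": ("Q",),
    "r": ("P", "Q"), "h": ("r",),
}
gene_trees = list(induced_trees(network))
```

With k = 1 this yields two trees: in the first, r (and hence h) descends from P, in the second from Q. Nodes left with a single child would still have to be suppressed to obtain the usual tree drawing.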

[Figure: the network N with lineages P and Q, and the two gene trees it induces on b1, a, h, c, b3: the P-tree and the Q-tree]

Rooted Reticulate Network

Definition: Let X be a set of taxa. A rooted reticulate network N on X is a connected, directed acyclic graph with:
• precisely one node of indegree 0, the root,
• all other nodes are tree nodes of indegree 1, or reticulation nodes of indegree 2,
• every edge is a tree edge joining two tree nodes, or a reticulation edge from a tree node to a reticulation node, and
• the set of leaves consists of tree nodes and is labeled by X.
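The node conditions of the definition translate directly into a check on an edge list. The sketch below covers the root, indegree, acyclicity and leaf conditions only; the edge condition is not checked separately, and the example network and its node names are hypothetical.

```python
from collections import deque

def is_rooted_reticulate_network(edges, taxa):
    """Check root, indegree, acyclicity and leaf conditions for a list of (parent, child) edges."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        indegree[child] += 1
        children[parent].append(child)
    roots = [n for n in nodes if indegree[n] == 0]
    if len(roots) != 1 or any(d > 2 for d in indegree.values()):
        return False
    # Kahn's algorithm: every node is visited exactly when the graph is acyclic.
    remaining = dict(indegree)
    queue, visited = deque(roots), 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    if visited != len(nodes):
        return False
    leaves = {n for n in nodes if not children[n]}
    # Leaves must be tree nodes (indegree 1) and carry exactly the labels in X.
    return leaves == set(taxa) and all(indegree[n] == 1 for n in leaves)

# Hypothetical network: reticulation node r with parents P and Q, leaf h below r.
valid_edges = [("root", "x"), ("root", "y"), ("x", "b1"), ("x", "P"),
               ("P", "a"), ("P", "r"), ("y", "b3"), ("y", "Q"),
               ("Q", "c"), ("Q", "r"), ("r", "h")]
# Same network, but the leaf h itself has indegree 2, which the definition excludes.
leaf_reticulation = [e for e in valid_edges if e != ("r", "h") and e[1] != "r"]
leaf_reticulation += [("P", "h"), ("Q", "h")]
taxa = {"b1", "a", "h", "c", "b3"}
```

Connectivity needs no separate test here: in a finite acyclic graph with a single node of indegree 0, every node is reachable from that root.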

[Figure: rooted reticulate network with leaves a, b, c, d, e, f, g, h, reticulation nodes r1, r2, r3 and the root]

Most Parsimonious Network
Problem: Given a set of trees T, determine a reticulate network N such that T ⊆ T(N) and N contains a minimum number of reticulation nodes.

In full generality, this is known to be a computationally hard problem [Wang et al 2001, Bordewich and Semple 2004].

Independent Reticulations

Reticulation nodes ri, rj ∈ N are independent, if they are not contained in a common cycle:
[Figure: network with reticulation nodes r1, r2, r3]

Independent reticulations are also called galls, and a network only containing galls is also called a galled tree [Gusfield et al. 2003]

SPRs and Reticulations

Observation [Maddison 1997]: If N contains only one reticulation r, then it corresponds to a “sub-tree prune and regraft” (SPR) operation:
[Figure: reticulate network N with reticulation r and the corresponding SPR]

SPR-Based Algorithm
Given two bifurcating trees, compute their SPR distance:
• If = 0, return the tree
• If = 1, return the reticulate network
• Else, return fail

Generalized to networks with multiple independent reticulations [Nakhleh et al 2004]
Maximum agreement forest approach (Semple et al 2005)

Splits-Based Approach

A new splits-based approach [Huson, Kloepper, Lockhart and Steel 2005]:

gene tree1, gene tree2 ⇒ splits network of all splits ⇒ reticulate network

Multiple Independent Reticulations

[Figure panels: Two reticulations ⇒ four different gene trees; all splits of the input trees; Reticulate network that induces all input trees]

Multiple and Overlapping Reticulations

[Figure panels: Input trees; all splits; Reticulate network that induces all input trees]

Decomposition Theorem
Each incompatibility component can be considered independently:

[Figure: splits network with the 1. component and the 2. component marked]
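One way to make the components concrete is as the connected components of the incompatibility graph, whose nodes are the splits and whose edges join incompatible pairs. The following sketch assumes that reading; the five splits on the taxa a..h are invented for illustration.

```python
def incompatible(split1, split2):
    """Two splits are incompatible if all four intersections are non-empty."""
    return all(p & q for p in split1 for q in split2)

def incompatibility_components(splits):
    """Connected components of the incompatibility graph on the given splits."""
    splits = list(splits)
    unvisited = set(range(len(splits)))
    components = []
    while unvisited:
        start = unvisited.pop()
        component, stack = [start], [start]
        while stack:
            i = stack.pop()
            linked = {j for j in unvisited if incompatible(splits[i], splits[j])}
            unvisited -= linked
            component.extend(linked)
            stack.extend(linked)
        components.append([splits[i] for i in sorted(component)])
    return components

def make_split(part, taxa="abcdefgh"):
    """Helper (hypothetical): the split part | rest on the given taxa."""
    side = frozenset(part)
    return frozenset({side, frozenset(taxa) - side})

example_splits = [make_split("ab"), make_split("bc"),   # incompatible pair
                  make_split("ef"), make_split("fg"),   # second incompatible pair
                  make_split("abcd")]                   # compatible with all others
components = incompatibility_components(example_splits)
component_sizes = sorted(len(c) for c in components)
```

The example has two components of two splits each, which would be analysed separately, and one split that conflicts with nothing and stays tree-like.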

[Gusfield & Bansal, 2005] [Huson, Kloepper, Lockhart & Steel, 2005]

Decomposition Theorem
Consider a component:

Algorithm
Find decomposition R ∪ B as a set of reticulate taxa and backbone taxa

Necessary condition: splits restricted to B must correspond to a tree
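The necessary condition suggests a brute-force search over small sets R: remove R, restrict every split to the backbone B, and test whether the restricted splits are pairwise compatible. The sketch below does only that; it does not perform the later check of how reticulation cycles overlap, and the example splits (from two hypothetical gene trees on a, b, c, d, h) are invented.

```python
from itertools import combinations

def compatible(split1, split2):
    """Compatible if at least one of the four intersections is empty."""
    return any(not (p & q) for p in split1 for q in split2)

def restrict(split, backbone):
    """Restrict a split to the backbone taxa; None if one side becomes empty."""
    a, b = (side & backbone for side in split)
    return frozenset({a, b}) if a and b else None

def candidate_reticulate_sets(splits, taxa, max_size):
    """Yield every R with |R| <= max_size whose backbone splits form a tree."""
    taxa = frozenset(taxa)
    for size in range(1, max_size + 1):
        for r in combinations(sorted(taxa), size):
            backbone = taxa - frozenset(r)
            restricted = {restrict(s, backbone) for s in splits} - {None}
            if all(compatible(s, t) for s, t in combinations(restricted, 2)):
                yield frozenset(r)

def make_split(part, taxa="abcdh"):
    side = frozenset(part)
    return frozenset({side, frozenset(taxa) - side})

# Splits of two hypothetical gene trees: h groups with a in one, with d in the other.
component = [make_split("ah"), make_split("abh"), make_split("ab"), make_split("abc")]
candidates = list(candidate_reticulate_sets(component, "abcdh", max_size=1))
```

For this input the only candidate of size 1 is R = {h}. The number of sets tried grows as n^k for n taxa and sets of size up to k, which is consistent with a running time that is polynomial for fixed k.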

Consider all choices for R of size 1 [Gusfield et al., 2003, 2004]:
• R={t7} … not a tree, R not good
• R={c} … not a tree, R not good
• R={t5} … not a tree, R not good

Consider all choices for R of size 2 [H., Kloepper, Lockhart and Steel, 2005]:
• R={t6,t7} … not a tree, R not good
• R={b,c} ☺ … is a tree, R is a candidate

Check Candidate
For R={b,c}, check that reticulation cycles overlap correctly along a path:

Network Construction
Modify splits network to represent reticulations:

Splits-Based Algorithm
Input: Set of trees T, not necessarily bifurcating, can be partial trees; parameter k
Output: All reticulate networks N for which every incompatibility component can be explained by at most k overlapping reticulations
Complexity: polynomial for fixed k

Application to Real Data

New Zealand Ranunculus (buttercup) species

[Figure panels: Nuclear ITS region; Chloroplast J_SA region]

Current algorithms are sensitive to false branches in the input trees and here initially no reticulation is detected.
This splits network suggests that R.nivicola may be a hybrid of the evolutionary lineages on the left- and right-hand sides.
However, interactive removal of five confusing splits (figure annotations: “one” and “four splits here”) leads to the detection of an appropriate reticulation.

[Figure panels: Splits network for both genes; Reticulate network]

Part V: Recombination networks

Recombination

Recombination is studied in population genetics [24, 20, 16, 46, 47, 48] and there ancestral recombination graphs (ARGs) are used for statistical purposes.

Chromosomal Recombination

We will study the combinatorial aspects of chromosomal (meiotic) recombination and will consider recombination networks rather than ARGs.

Simplifying assumptions: all sequences have a common ancestor, and any position can mutate at most once.

Example of a Recombination Network

Alignment A:
a: 100110 000000
b: 010101 000000
r: 001101 100000
c: 000000 110100
d: 000000 111010
o: 000000 000000 (outgroup)

[Figure: recombination network for A, rooted at 000000 000000, with leaves a, b, r, c, d and the outgroup o; internal nodes are labeled by sequences (000100 000000, 000101 000000, 000101 100000, 000000 100000, 000000 110000) and edges by the positions that mutate along them (1,5; 2; 3; 4; 6; 7; 8; 9,11; 10)]

Recombination Network

Tree-based approach [Gusfield et al. 2003] for computing galled trees:
For each component:
• Determine whether removing one taxon produces a perfect phylogeny
• If so, arrange taxa in gall
• Return description of network
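The first step can be illustrated on the alignment A from the recombination network example. The sketch below is a simplified stand-in for the component-wise test: it treats the whole alignment as one component and uses the standard criterion for binary characters with an all-zero ancestor, namely that the sets of taxa carrying state 1 must be pairwise disjoint or nested. Function names are made up for this sketch.

```python
from itertools import combinations

def admits_perfect_phylogeny(sequences):
    """Rooted perfect phylogeny test for 0/1 sequences with an all-zero ancestor."""
    names = list(sequences)
    length = len(sequences[names[0]])
    carriers = [frozenset(n for n in names if sequences[n][i] == "1")
                for i in range(length)]
    # Every pair of characters must be disjoint or nested.
    return all(not (s & t) or s <= t or t <= s
               for s, t in combinations(carriers, 2))

def removable_taxa(sequences):
    """Taxa whose removal leaves an alignment that admits a perfect phylogeny."""
    return [name for name in sequences
            if admits_perfect_phylogeny(
                {n: s for n, s in sequences.items() if n != name})]

# Alignment A from the example slide (blanks between the two blocks removed).
alignment = {
    "a": "100110000000",
    "b": "010101000000",
    "r": "001101100000",
    "c": "000000110100",
    "d": "000000111010",
    "o": "000000000000",
}
recombinant_candidates = removable_taxa(alignment)
```

On this alignment the full data set fails the test (positions 4 and 7 conflict through r), and r is the only taxon whose removal leaves a perfect phylogeny.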

Splits-based approach [Huson & Kloepper 2005] for computing overlapping networks:
• Determine a reticulate network as described above.
• Compute the labeling of nodes and edges.

[Lyngsø, Song and Hein, to appear in WABI 2005]: Branch and bound approach to computing unrestricted recombination network

Example 1, Data

Input: Presence (0) or absence (1) of a given restriction site in a 3.2kb region of variable chloroplast DNA in Pistacia [Parfitt & Badenes 1997]:

P.lentiscus       01110010100000000111000000010000
P.weinmannifolia  11001110100000010111000000010000
P.chinensis       01011000100000000111001100010000
P.integerrima     01011010100000000111001100010000
P.terebinthus     00011010000000001111101100010000
P.atlantica       01011011000000000111001100010000
P.mexicana        01011110100000010111010000010000
P.texana          01011110100000010111010000010000
P.khinjuk         01011010000000000111001100010000
P.vera            01011010000000000111001100010000
Schinus molle     01011010011111100000000011101111

Example 1, Recombination Network

Load this data into SplitsTree4 and select RecombinationNetwork to obtain:
[Figure: recombination network]

Example 1, Single Crossover

Combinatorially, this can be explained using only one single-crossover recombination:
[Figure]

However: recombination of chloroplast is unlikely

Example 2, Data

Input: Restriction maps of the rDNA cistron (length ≈ 10kb) of twelve species of mosquitoes using eight 6bp recognition restriction enzymes [Kumar et al, 1998]:

Aedes albopictus      11110101010100010101010010
Aedes aegypti         11110101000100010101000010
Aedes seatoi          11110101010100010101010000
Aedes flavopictus     11110101010100010101010010
Aedes alcasidi        11110101010100010101010000
Aedes katherinensis   11110101010100010101010000
Aedes polynesiensis   11110101000100010101010010
Aedes triseriatus     10110101000110010101000000
Aedes atropalpus      10110101000100010111000010
Aedes epactius        10110101000100010111000010
Haemagogus equinus    10110101000110010101010000
Armigeres subalbatus  10110101000100010101000000
Culex pipiens         11110111000100011101001011
Tripteroides bambusa  11110111000100010101000010
Sabethes cyaneus      11110101001100010101010000
Anopheles albimanus   11011101100101110101110100

Example 2, Median Network

This data set was analyzed using different tree-reconstruction methods with inconclusive results. The associated splits network (or median network [Bandelt et al, 1995] in this context), with edges labeled by the corresponding mutations:

[Figure: median network on all 16 taxa, rooted near Anopheles albimanus; edge labels include 2; 7; 10; 11; 13; 19; 22; 25; 17,23,26 and 3,5,9,14-15,21,24]

Example 2, Subset

Recombination scenarios based on the complete data set look unconvincing. However, trial-and-error removal of two taxa, Aedes triseriatus and Armigeres subalbatus, gives rise to a simpler splits network:

[Figure: splits network on the remaining 14 taxa, with the same edge labels]

Example 2, Recombination Network

A possible recombination scenario is given by:

[Figure: recombination network on the 14 taxa, with edges labeled by mutations]

Here, Haemagogus equinus appears to arise by a single-crossover recombination, and a second such recombination leads to A.albopictus and A.flavopictus.

Example 3, Data
19 restriction endonucleases were used to analyze patterns of cleavage site variation in the mtDNA of Zonotrichia. 7 taxa, 122 characters [Zink et al, 1991]

'Zonotrichia_querula' 11100011111110001111001111111110000111100011101010110001111100111111110001001111100111111011110011011111000
'Z._atricapilla' 11100011111100001100001111111100000111110011100010111001111100111111110001001111110111111011111011011011110
'Z._leucophrys' 11100011111100001100001111111100000111110011110010111001111100111111110001001111110111111011111011011011110
'Z._albicollis' 11100011111100001100001011111101000011101011100010111101111110111111111001001111100111110011110011011011010
'Z._capensis--Bolivia' 01110011100100001000001111111000110110000011101010111101111100110111100101101111100011111001110110111011000
'Z._capensis--Costa_Rica' 11100111100101101000001111111000110110000011100010111101111100110111100101111111100011011001110110111011000
'J._hyemalis' 11101011100100011000110011111000001111000111101111110011100101010101110011001111000001100101110011011011001

However: recombination of mtDNA unlikely

Example 3, Median Network

The unrooted splits network for a dataset of restriction sites in the mtDNA of Zonotrichia:
[Figure]

Example 3, Significant Differences

• The rooted splits network for the same data set, but suppressing all splits that are only supported by one site in the data:
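Suppressing weakly supported splits amounts to reading every non-constant binary column as a split, counting how many columns yield the same split, and keeping those with at least two supporting sites. A minimal sketch on invented toy sequences (not the Zonotrichia data); the function names and the `min_sites` parameter are assumptions for this illustration.

```python
from collections import Counter

def character_splits(sequences):
    """Map each split induced by a binary column to its number of supporting sites."""
    names = list(sequences)
    taxa = frozenset(names)
    length = len(sequences[names[0]])
    weights = Counter()
    for i in range(length):
        ones = frozenset(n for n in names if sequences[n][i] == "1")
        if ones and ones != taxa:  # constant columns define no split
            weights[frozenset({ones, taxa - ones})] += 1
    return weights

def supported_splits(sequences, min_sites=2):
    """Keep only the splits supported by at least `min_sites` columns."""
    return {s: w for s, w in character_splits(sequences).items() if w >= min_sites}

# Toy data: columns 1-3 support t1,t2 | t3,t4; columns 4 and 5 are singletons.
toy = {"t1": "11000", "t2": "11010", "t3": "00101", "t4": "00100"}
all_weights = character_splits(toy)
kept = supported_splits(toy)
```

Here three splits are found in total, and only the split t1,t2 | t3,t4 (supported by three sites) survives the filter.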

[Figure]

Recombination Network

Possible recombination scenario involving two non-independent reticulations:
[Figure]

Summary

Incompatible signals in gene trees can be usefully displayed using splits networks

A reticulate network may be extracted by combinatorial analysis of individual components

Implementations of many tree and network methods are available in SplitsTree4