
PHYLOGENETIC TREES IN

Haeckel’s tree of life from The Evolution of Man (1879)

WHAT WE WILL COVER

➤ Basic information about phylogenetic trees,

➤ Why we use them,

➤ what is an ,

pipeline,

➤ Example from literature

Copyright ©2018, J. Szyda & B.Kosińska-Selbi

WHAT IS A PHYLOGENETIC TREE?

➤ A diagram that represents evolutionary relationships among organisms.

➤ Phylogenetic trees are hypotheses, not definitive facts.

➤ The pattern of branching in a phylogenetic tree reflects how species/groups evolved from a series of common ancestors.
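To make the idea of branching from common ancestors concrete, here is a minimal sketch that stores a small rooted tree as child-to-parent links and looks up the most recent common ancestor of two taxa. The taxa `A` to `D` and the internal node names are made-up labels for illustration only; they do not describe any real group.

```python
# Hypothetical rooted tree stored as child -> parent links.
# A, B, C, D are made-up taxa; n1, n2 and root are made-up ancestors.
PARENT = {
    "A": "n1",
    "B": "n1",
    "n1": "n2",
    "C": "n2",
    "n2": "root",
    "D": "root",
}


def path_to_root(node, parent=PARENT):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def most_recent_common_ancestor(x, y, parent=PARENT):
    """Return the first node on x's path to the root that is also on y's path."""
    ancestors_of_y = set(path_to_root(y, parent))
    for node in path_to_root(x, parent):
        if node in ancestors_of_y:
            return node
    raise ValueError("nodes are not in the same tree")


print(most_recent_common_ancestor("A", "B"))
print(most_recent_common_ancestor("A", "D"))
```

In this toy tree, A and B share a more recent ancestor with each other than either does with D, which is exactly the kind of statement a branching pattern encodes.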

https://www.khanacademy.org/science/biology/her/tree-of-life/a/phylogenetic-trees

ROOTED / UNROOTED TREES

https://www.ncbi.nlm.nih.gov/Class/NAWBIS/Modules/Phylogenetics/phylo9.html

WHY WE USE THEM AND WHY THEY ARE IMPORTANT

Phylogenetics is important because it enriches our understanding of how genes, genomes, species (and molecular sequences more generally) evolve.

Through phylogenetics, we learn not only how the sequences came to be the way they are today, but also general principles that enable us to predict how they will change in the future.

This is not only of fundamental importance but also extremely useful for numerous applications.

EMBL-EBI - https://www.ebi.ac.uk/training/online/course/introduction-phylogenetics/what-phylogenetics

JIN XIONG

Chapter 10 - Phylogenetics Basics

Chapter 11 - Phylogenetic Tree Construction Methods and Programs

VOCABULARY

Sequence homology is the biological homology between protein or DNA sequences, defined in terms of shared ancestry in the evolutionary history of life. Two segments of DNA can have shared ancestry either because of a speciation event (orthologs), or because of a duplication event (paralogs).

JIN XIONG / SEQUENCE ALIGNMENT

Chapter 3. Pairwise Sequence Alignment

“It is an important first step toward structural and functional analysis of newly determined sequences. As new biological sequences are being generated at exponential rates, sequence comparison is becoming increasingly important to draw functional and evolutionary inference of a new protein with proteins already existing in the database.”

“The most fundamental process in this type of comparison is sequence alignment. This is the process by which sequences are compared by searching for common character patterns and establishing residue–residue correspondence among related sequences. Pairwise sequence alignment is the process of aligning two sequences and is the basis of database similarity searching and multiple sequence alignment.”
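As a rough illustration of pairwise alignment, the sketch below implements global alignment by dynamic programming (the Needleman-Wunsch approach). The scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for readability, not values taken from the book; real tools use substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for pairing residue a[i-1] with residue b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # residue-residue pair
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = needleman_wunsch("ACGT", "AGT")
print(score)
print(top)
print(bottom)
```

The two returned strings have equal length, and each column is either a residue-residue correspondence or a gap, which is the "residue–residue correspondence" the quotation describes.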

PIPELINE

➤ Choose your group and markers,

➤ collect data (BLAST from NCBI) or use your own sequenced data,

➤ perform an alignment to remove or cut data (Seaview),

➤ choose the model (jModelTest),

➤ create the phylogenetic tree (MEGA / MrBayes)

TREE-BUILDING METHODS:

The most popular and frequently used methods of tree building can be classified into two major categories: phenetic methods based on distances and cladistic methods based on characters.

1) Distance-Based Methods:

- Neighbor Joining Method (NJ),

- Weighted Neighbor-Joining (Weighbor),

- Fitch-Margoliash (FM) and Minimum Evolution (ME) Methods

2) Character-Based Methods:

- Maximum Parsimony (MP),

- Maximum Likelihood (ML)
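To illustrate the distance-based category, here is a minimal sketch of neighbor joining that works directly on a matrix of pairwise distances. The five taxa `a` to `e` and their distances are an invented toy example, and the internal node names (`node0`, `node1`, ...) are generated labels; programs such as MEGA implement the method with many refinements that are not shown here.

```python
def neighbor_joining(names, dist):
    """Return (child, parent, branch_length) edges of an unrooted NJ tree.

    names: list of taxon labels; dist: symmetric matrix with a zero diagonal.
    """
    # Copy the matrix into a dict of dicts so new nodes can be added by name.
    d = {a: {b: dist[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        total = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Pick the pair minimising Q(a, b) = (n - 2) d(a, b) - total(a) - total(b).
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - total[a] - total[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from the joined pair to their new parent node u.
        len_a = d[a][b] / 2 + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))
        # Distances from u to every remaining node.
        d[u] = {u: 0.0}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = (d[a][c] + d[b][c] - d[a][b]) / 2
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))  # connect the last two nodes
    return edges


# Invented toy distances between five taxa.
taxa = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
for child, parent, length in neighbor_joining(taxa, matrix):
    print(child, parent, length)
```

Character-based methods (MP, ML) work on the aligned sites themselves rather than on a distance matrix, so they are not covered by this sketch.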

http://guava.physics.uiuc.edu/~nigel/courses/598BIO/498BIOonline-essays/hw2/files/hw2_li.pdf

EMBL’s PIPELINE

SHORT INTRODUCTION

The family Equidae, along with Rhinocerotidae and Tapiridae, is one of the three extant families of odd-toed ungulates. The Equidae are richly represented in the fossil record throughout the past 55 million years, starting with dog-sized taxa from the North American Eocene. Equid lineages then spread globally and experienced successive episodes of radiations and extinctions during the early Miocene, the late Miocene, and at the end of the Pleistocene, becoming adapted to a variety of environments with remarkable variations in body size. While several dozen extinct equid genera have been described, all extant equid species are classified in the same genus, Equus. It includes the African wild ass, Equus africanus, which is the progenitor of the domestic donkey, E. asinus, and three living species of zebra, all endemic to Africa.

http://www.theequinest.com/mesohippus/

NO PHYLOGENETIC STUDY HAS BEEN UNDERTAKEN USING THE COMPLETE MITOCHONDRIAL SEQUENCES OF THE SEVEN EXTANT SPECIES OF EQUUS!

AVAILABLE MITOCHONDRIAL GENOMES:

> domestic horses (142 individuals),

> Przewalski horses (seven individuals),

> Tibetan Kiang,

> domestic donkeys since 1996.

SECOND-GENERATION SEQUENCING TECHNOLOGIES

> complete mitogenome sequences from 2-4 individuals of other living equid species.

> 14 novel and complete mitogenomes comprising all the taxonomic diversity of extant equids

> complete mitogenomes sequenced from 3 extinct equid lineages: New World Stilt-Legged horses (NWSLH), the Sussemione (E. ovodovi), and the Quagga.

PREVIOUS STUDIES

Short fragments of mitochondrial DNA

Short fragments of genomic DNA

This study: use of complete mitogenomes, a new and innovative approach

WHY IS SO MUCH RESEARCH BASED ON THE MITOCHONDRIAL GENOME?

“Most of the studies have focused on the mitochondrial D-loop region, the most variable part of mtDNA due to increased substitution rate than in the rest of the mtDNA genome which serves as a better genetic marker to assess the diversity. Mitochondrial DNA (mtDNA) possesses several favorable characteristics, including large quantity in the cell, small genome size, haploid, maternal inheritance with extremely low probability of paternal leakage, higher mutation rate than nuclear DNA, and amenable to change mainly through mutation rather than recombination. All these features make mtDNA a useful and one of the most frequently used markers in molecular systematic and has been widely employed to address questions of genetic diversity, population structure and population evolution of animals including equines.”

MATERIALS AND METHODS

17 complete mitogenomes: 14 modern samples representing all extant species within the genus Equus, plus 3 ancient samples from three extinct lineages.

For five out of the 14 modern samples, mitogenomes were generated by PCR amplification followed by a combination of Sanger and GS FLX sequencing.

The DNA extracts from the remaining nine modern samples and three of the ancient samples were converted into sequencing libraries and shotgun-sequenced using the Illumina HiSeq2000 platform.

PIPELINE / MODERN SAMPLES

> trim ILLUMINA reads with AdapterRemoval,

> align the trimmed reads with BWA to published genomes or genomes generated via PCR,

> remove all reads that mapped to multiple positions and reads with mapping quality score <25 (SAMtools),

> remove duplicated sequences and filter (MarkDuplicates),

> generate a final alignment for each mitogenome and visually correct any potential local misalignments.

PIPELINE / ANCIENT SAMPLES

> extract samples from dry tissues,

> build ILLUMINA libraries following ILLUMINA’s procedure,

> assemble mitogenomes (minimum coverage 2x),

> call the final consensus sequence using SAMtools.
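Two thresholds from these pipelines, the mapping quality cut-off of 25 and the minimum coverage of 2x, can be illustrated with a short sketch. This is not how SAMtools works internally: the function names, the simplified inputs (a mapping quality plus a count of mapped positions, and a pileup given as one string of observed bases per position) and the majority-rule base call are all assumptions made for illustration.

```python
from collections import Counter


def keep_read(mapq, n_mapped_positions, min_mapq=25):
    """Keep a read only if it maps to one position with quality >= min_mapq.

    Both arguments are simplified stand-ins for information that a real
    pipeline would read from an alignment file.
    """
    return n_mapped_positions == 1 and mapq >= min_mapq


def call_consensus(pileup, min_coverage=2):
    """Call one base per position from a toy pileup.

    pileup: list of strings, each holding the bases observed at one position.
    Positions covered by fewer than min_coverage reads are reported as 'N'.
    The most frequent base is called (assumed majority rule).
    """
    consensus = []
    for column in pileup:
        if len(column) < min_coverage:
            consensus.append("N")
        else:
            base, _count = Counter(column).most_common(1)[0]
            consensus.append(base)
    return "".join(consensus)


print(keep_read(mapq=30, n_mapped_positions=1))
print(keep_read(mapq=20, n_mapped_positions=1))
print(call_consensus(["AAA", "C", "GGT", "TT", ""]))
```

Masking low-coverage positions as `N` rather than guessing a base is one common way to avoid calling a consensus from a single, possibly damaged, ancient read.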

ANCIENT/MODERN SAMPLES

Tissue samples were provided by the Smithsonian Institution (Washington D.C.), the Musée des Confluences (Lyon), the Russian Academy of Sciences, the Government of Yukon, or zoological gardens following official agreements with the Natural History Museum of Denmark.

RESULTS AND DISCUSSION

> They successfully recovered the complete mitogenome of three extinct equid taxa (NWSLH, Sussemione, and Quagga).

> They also characterized partial mitochondrial contigs (11–288 bp in length, for a total of 7,108 bp) for a second NWSLH (JW328).

> The average depth-of-coverage is 387.3 for modern samples and 57.2 for ancient samples.

> Minimal pairwise distances were observed between conspecific individuals, suggesting that the sequences reported here are in agreement with the known sequence diversity among equids.

CONCLUDING REMARKS!