Whole-genome phylogenetics, gene regulation and the origin of evolutionary novelty
Feather photos: J. Trimble, MCZ “Beast Legends”: A six part adventure in science and myth
Fijian shark god Griffin Kraken
Wild man
Dragon Terror bird Graphics by Invisible Pictures, Inc. Beast Legends – Griffin episode Evolutionary change: genes or gene regulation? Taste receptors in mammals X X Taste receptors on the tongue Birds inherited only the umami (meat) receptor from their dinosaur ancestors
el receptor del X gusto dulce se perdió incluso en colibríes ! Hummingbirds can taste sugar due to changes in the gene other birds use to taste meat (or insects) Can taste sugar? Taste receptor T1R1 T1R3 hummingbird ✗
swift
✗
chicken Baldwin et al. 2014. Science 345: 929-933 Non-coding Dark matter of the genome: a regulatory network?
Karyotype of an Emu CNEEs: evolutionarily conserved non-coding enhancer regions CNEEs = conserved non-exonic elements
View of a segment of human chromosome 10 using UCSC Genome Browser
Janes et al. (2011) Genome Biol. Evol. 3:102–113 Noncoding enhancers: long-range control of gene expression
Levine et al. 2014 Cell 157: 13-25 Phylogenetic hidden Markov model detects CNEEs using Phastcons*
*Siepel et al. 2005. Genome Res. 15:1034-1050 A role for gene regulation in the origin of feathers
Quanguo et al. (2010) Science Anchiornis Chen et al. (1998) Nature Sinosauroptyerx Archaeopteryx Feather photos: J. Trimble, MCZ Conserved non-exonic elements (CNEEs) act as enhancers for feather genes
• 400 abstracts • 193 feather genes • 13,307 feather gene CNEEs (2.2% of total)
Bird, amniote- and tetrapod-specific CNEEs near SHox
Lowe et al. 2015 Mol. Biol. Evol. 32:23–28 High origination rates of feather CNEEs, but not feather genes, when feathers evolved
feathers evolve 1.5 220 mya genes Relative CNEEs number of 1.0 origins
0.5Origin of most feather genes Feather gene pre-dates enhancers feathers evolved 0.0 at almost the same time as feathers Birds Turtles r MammalsReptiles Amphibians Chickens/ducks Crocodilians Bony vertebrates ancient time recent Lowe et al. 2015 Mol. Biol. Evol. 32:23–28 Bird-specific regulatory evolution: what makes a bird a bird? Bird-specific CNEEs associated with genes for limb and body size evolution
Chromosome position (chicken)
Lowe et al. 2015 Mol. Biol. Evol. 32:23–28 Bird-specific CNEEs associated with genes for limb and body size evolution
Lowe et al. 2015 Mol. Biol. Evol. 32:23–28 Bird-specific CNEEs associated with genes for limb and body size evolution
Lee et al. 2014. Science 345:562 Correlates of GC content in ~1000 mammalian coding regions
Romiguier et al. 2010. Genome Res. 20: 1001-1009 Mechanistic hypotheses for high GC content in a lineage
GC content
Isochores/Chromo Rate of biased somal location and gene conversion Population size and length (BGC) and life history effects recombination
Hughes & Hughes 1995 Nature Lynch & Connery 2003 Science Sessions & Larson 1987 Evolution Consequences of repair of double strand breaks induced by BGC or recombination
Frequency of GC alleles among gametes from a heterozygous individual:
(b=BGC bias; Ne = effective population size) Covariation of GC content and body mass in birds
large small large small populations populations populations populations Nabholz et al./Avian Phylogenomics Consortium Effects of GC variation on phylogenomic analysis
Including only genes with low-GC Including only genes with high-GC variation among lineages variation among lineages
Avian Phylogenomics Consortium Passerines
Parrots
Owls
Cormorants Penguins
Rails Gulls, Auks
Pigeons Grebes Ducks, Geese Ratites Jarvis et al. 2014, Science CNEEs and the convergent evolution of flightlessness in Palaeognathae Skeletal modifications for flightlessness
Little-spotted kiwi sternum
Emu and ostrich keelless sterna
De Bakker et al. 2013. Nature 500, 445–448. Convergent losses of flight allow comparative genomics to identify genomic regions for flightlessness
Extinct
Extant
Mitchell et al 2014 11 new palaeognath genomes
Little bush moa Rowi Kiwi Great-spotted Kiwi Little Spotted Kiwi
Southern Cassowary
Emu Lesser Rhea Greater Rhea
Elegant-crested Tinamou Thicket tinamou Chilean tinamou Image (all CC): David Cook; Quartl; Jim, the Photographer, Tim Sackton 42-species whole genome alignment for birds using ProgressiveCactus
DMRT1 Relative rates of different noncoding markers
Edwards et al. 2017. Syst. Biol. 66: 1028 Phylogenomic markers cover c. 3% of total genome length 12,676 CNEEs
5,016 Introns
3,158 Ultraconserved elements (UCEs)
Cloutier et al. 2019. Syst. Biol. 10.1093/sysbio/syz019 Relationships of rheas unclear
Haddrath & Baker (2012) Mitchell et al. (2014) Smith et al. (2013) 27 nuclear loci mtDNA 60 nuclear loci + mtDNA
Moa Moa Tinamou Moa Tinamou Elephant bird 49 Tinamou 1 Kiwi Kiwi Rhea 63 Emu 1 Emu Kiwi
0.67-1 Cassowary Cassowary Emu Rhea Rhea Cassowary Ostrich Ostrich Ostrich Coalescent* analyses resolve the position of rheas and reveal an ancient rapid radiation
0.061
12,676 CNEEs - 4,794,620 bp 0.073 5,016 introns - 27,890,802 bp 3,158 UCEs - 8,498,759 bp
Total: 20,850 loci; 41,184,181 bp
Branch lengths in coalescent units
*MP-EST: Liu et al. 2010. BMC Evol. Biol. Consistent accumulation of phylogenetic signal using MP-EST
Cloutier et al. 2019. Syst. Biol. 10.1093/sysbio/syz019 Gene tree distribution suggests a near polytomy at base of ratites Cassowary/ Rheas Emu Kiwis Major topology
Cassowary/ Kiwis Emu Rheas Minor topology 1
Cassowary/ Emu Kiwis Rheas Minor topology 2
Cloutier et al. 2019. Syst. Biol. 10.1093/sysbio/syz019 Anomaly zone: most common gene tree does not match the species tree
Cloutier et al. 2019. Syst. Biol. 10.1093/sysbio/syz019 Corroboration of coalescent tree from transposable elements • Chicken Repeat1 (CR1) retroelement insertions • (Virtually) homoplasy free • Binary presence/absence • No model misspecification • No GC/rate variation bias • BUT- subject to incomplete lineage sorting TS TS 5' flank D CR1 D 3' flank A TS TS 5' flank D CR1 D 3' flank B 5' flank 3' flank C Example CR1 insertion Most CR1s support the coalescent species tree topology
Cloutier et al. 2019. Syst. Biol. 10.1093/sysbio/syz019 Evolutionary change: genes or gene regulation? Searching for evidence of genome-wide amino acid convergence in ratites No evidence for genome-wide convergent amino acid substitutions in ratites
R2 = 0.95 ~1-2% of protein-coding genes show evidence of ratite-specific positive selection
271 genes (10% FDR) 104 genes (1% FDR)
Sackton et al. 2019. Science 364: 74-78 Non-coding Dark matter of the genome: a regulatory network?
Karyotype of an Emu CNEEs: evolutionarily conserved non-coding enhancer regions CNEEs = conserved non-exonic elements 284,001 long (* > 50 bp) CNEEs in data set
View of a segment of human chromosome 10 using UCSC Genome Browser
Janes et al. (2011) Genome Biol. Evol. 3:102–113 Convergent loss of function of CNEEs in ratite lineages Ratites neognaths and Tinamous
250 bp Branch-specific Bayesian model of noncoding rate accelerations
for noncoding element i
Z
a = probability of gain of conserved state
b = probability of loss of conserved state
Hu, Z., et al. 2019. Mol. Biol. Evol. 36: 1086 A convergently accelerated CNEE detected with a novel Bayesian method
Anole Turtle2 Turtle1 Gharial Crocodile Alligator2 SNPs Alligator1 Ostrich Moa Tinamou4 Tinamou3 Tinamou2 Tinamou1 Rhea2 Rhea1 accelerated Emu Cassowary Kiwi3 branch Kiwi2 Kiwi1 Mallard Turkey Chicken Mesite Pigeon conserved Cuckoo Swift Hummingbird Killdeer branch Crane Ibis Fulmar Penguin2 Penguin1 Eagle neutral Courol Woodpecker Falcon Budgerigar branch Crow Ground−tit Flycatcher Finch BF1: 72 BF2: 6 r1=0.27, r2=3.08 Branch lengths relative to conserved rate Hu, Z., et al. 2019. Mol. Biol. Evol. 36: 1086 Additional examples of convergently accelerated CNEEs
Hu, Z., et al. 2019. Mol. Biol. Evol. 36: 1086 Rapid regulatory evolution near developmental genes
Sackton et al. 2019. Science 364: 74-78 Homing in on regulators for flightlessness through convergence
4,260 CNEEs 1,270 CNEEs 252 CNEEs 66 CNEEs iw i ed T inamo u ed T inamo u h_ Moa -cres t po tt ed_ K hroa t iw i _Spo tt ed_Kiw i e- t ed T inamo u ed T inamo u iw i hi t rea t ed T inamo u h_ Moa G Rowi_Kiw i Southern_Cassowar y Em u Chilean_ TI na mou Elegen t Thicket_Tinamo u W _Rh ea Lesser Greater_Rhe a Ostr ich Li tt le_ Bus Li tt le_ S ed T inamo u -cres t po tt ed_ K hroa t h_ Moa _Spo tt ed_Kiw i iw i e- t -cres t po tt ed_ K hroa t hi t ed T inamo u _Spo tt ed_Kiw i rea t ed T inamo u e- t G Southern_Cassowar y Em u Chilean_ TI na mou Elegen t Thicket_Tinamo u W Greater_Rhe a Rowi_Kiw i _Rh ea Lesser Ostr ich Li tt le_ Bus Li tt le_ S hi t rea t h_ Moa G Rowi_Kiw i Southern_Cassowar y Em u Chilean_ TI na mou Elegen t Thicket_Tinamo u W _Rh ea Lesser Greater_Rhe a Ostr ich Li tt le_ Bus Li tt le_ S -cres t po tt ed_ K hroa t _Spo tt ed_Kiw i e- t hi t rea t G Chilean_ TI na mou Elegen t Thicket_Tinamo u Rowi_Kiw i Southern_Cassowar y Em u W _Rh ea Lesser Greater_Rhe a Ostr ich Li tt le_ Bus Li tt le_ S 0. 7 0. 7 0. 7 0. 7
CNEE relaxed Same CNEE relaxed Same CNEE relaxed Same CNEE relaxed on 1 lineage in 2 lineages in 3 lineages in 4 lineages Ratite genomes exhibit higher numbers of convergently accelerating CNEEs than do vocal learners or random trios of taxa
140 ratites random pairs/trios 120 vocal learners
Number of 100 convergently accelerated 80 CNEEs 60
40
20
0 2 3 Number of convergent lineages Assay for Transposase-Accessible Chromatin ATAC-Seq identifies DNA with open chromatin, accessible to transcription factors
Stage HH24-25 chickens and rheas
Buenrostro et al. 2015. Curr Protoc.Biol. 2015; 109: 21.29.1–21.29.9. Differences in ATAC-seq peaks between rhea and chicken suggest changes in limb gene regulation Ratite accelerated element 1317692 is contained under chicken ATAC peaks …
5 kb galGal 11,278,500 11,279,000 11,279,500 11,280,000 11,280,500 11,281,000 11,281,500 11,282,000 11,282,500 11,283,000 11,283,500 11,284,000 11,284,500 11,285,000 11,285,500 11,286,000 11,286,500 11,287,000 11,287,500 CNEECNEE Chick HL 90.2857 _
0 _ Chicken 82.4286 _ 0 _ hind limb86.6667 _ 0 _
Chick FL 98.5714 _
0 _ Chicken 93.6667 _ 0 _ 84.8571 _
Forelimb 0 _
Rhea HL 14.8333 _
0 _ 18.5 _
0 _ RheaRhea FL 17 _
0 _ forelimb27.5 _
0 _ ... but rhea is missing this peak Combined information from multiple sources suggests candidate enhancers for flightlessness phenotypes
Rate acceleration ATAC-seq
16 340 33,554 42 CNEEs 71 44,396
62,299
Chip-seq (from Seki et al. 2017)
Sackton et al. 2019. Science 364: 74-78 Volant version of CNEE drives gene expression in the developing forelimb of chicken but flightless version does not Conclusions
• Ratite relationships require methods accommodating gene tree heterogeneity • Noncoding CNEEs show more evidence for convergence than do coding regions • Functional marks (ATAC-seq) suggests noncoding CNEEs are functional and related to species differences Acknowledgements Cliff Tabin Tim Sackton Phil Grayson Scott Edwards Chad Eliason
Statistics: Zhirui Hu Jun Liu
Grant numbers EAR-1355343/DEB-1355292 Michele Clamp Allan Baker Julia Clarke Alison Cloutier