Decline and contingency, bases of biological evolution
(Déclin et contingence, bases de l'évolution biologique)

Bernard Dujon

Colloquium "Sciences de la vie, sciences de l'information" (Life sciences, information sciences)
Centre culturel international de Cerisy-la-Salle, 17-24 September 2016

Ernst Haeckel, 1834-1919

1866: Unicellular (without nuclei); Unicellular (with nuclei); Pluricellular
2016: Symbioses; Multiple; Reversible

Gregor Johann Mendel, 1822-1884

1866: Genotype; Phenotype
1953: DNA; RNA; Proteins
2016: Gene and allele interactions; complex RNA interactions (splicing, editing, processing, RNAi, trans-generational …); intrinsic variance

Charles Robert Darwin (1809-1882): Regular selection acting on limited variations

?

Hugo Marie de Vries (1848-1935): Abrupt genetic changes

Species or individual n° 1
Species or individual n° 2
Species or individual n° 3
Common core of genes (ancestral origin)
Restricted set of genes (origin ?)

ca. 100 genes mutated and a few genes missing

CNV in the human genome

Yoon et al., (2009) Genome Res. 19: 1586-1592

Segmental deletions; segmental duplications
Gene gains: Duplications, De novo formation, Horizontal acquisitions
Gene losses: Pseudogenes, Deletions
Fission / fusion

Each genome is only a snapshot in time within continual changes, not an optimized structure

Major lessons from genomics

1: Genomes are (much) too big

The C-value paradox (Swift, 1950)

2: There are too many genes in genomes

Genomes are too big

Amoeba dubia: 670 000 Mb
Fritillaria assyriaca: 100 000 Mb
Necturus lewisi: 100 000 Mb
Gonyaulax grindleyi: 98 000 Mb
Zea mays: 3 000 Mb
Homo sapiens: 2 900 Mb
Vitis vinifera: 487 Mb
Paramecium tetraurelia: 100 Mb
Saccharomyces cerevisiae: 12 Mb

Fold differences marked on the figure: 55 000 X; 1000 X; 1000 X

Genome sizes and content
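As a rough check on the fold differences marked on the genome-size figure, the listed values can be divided pairwise. This is a minimal sketch: the sizes are the ones quoted on the slide, but the two pairings are assumptions, since the extracted text does not say which species each "X" label connects.

```python
# Genome sizes in Mb, as quoted on the slide.
GENOME_SIZES_MB = {
    "Amoeba dubia": 670_000,
    "Fritillaria assyriaca": 100_000,
    "Necturus lewisi": 100_000,
    "Gonyaulax grindleyi": 98_000,
    "Zea mays": 3_000,
    "Homo sapiens": 2_900,
    "Vitis vinifera": 487,
    "Paramecium tetraurelia": 100,
    "Saccharomyces cerevisiae": 12,
}


def fold_difference(larger: str, smaller: str) -> float:
    """Return how many times bigger one listed genome is than another."""
    return GENOME_SIZES_MB[larger] / GENOME_SIZES_MB[smaller]


# Assumed pairings (not stated in the source text).
amoeba_vs_yeast = fold_difference("Amoeba dubia", "Saccharomyces cerevisiae")
necturus_vs_paramecium = fold_difference("Necturus lewisi", "Paramecium tetraurelia")

print(f"Amoeba dubia / S. cerevisiae: {amoeba_vs_yeast:,.0f} X")
print(f"Necturus lewisi / P. tetraurelia: {necturus_vs_paramecium:,.0f} X")
```

Under these assumed pairings the ratios come out close to the 55 000 X and 1000 X labels, which suggests, without proving it, that the labels compare the largest genomes with the smallest ones in each group.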

Genes are complex mosaics (exons and introns)
Many genes encode RNA, not proteins
Genomes contain many other elements than genes:
- pseudogenes and traces of ancient sequences
- mobile elements and their remnants
- NUMTs, NUPTs, NUPAVs
- structural elements of chromosomes (CEN, TEL)
- transcription is pervasive

Figure labels: Coding exons (exome), 1.9 % of genome; Introns; Mobile elements; Pseudogenes; Other. Counts shown on the figure: > 100 000; > 1 100 000; ~25 000.
Regulations and evolution: 98.1 %
Legend: Coding exons; UTR and introns; Pseudogenes; Mobile elements and remnants; DNA; RNA

Too many genes in genomes

The genome of Saccharomyces cerevisiae

13.4 Mb (> 70 % coding) ~ 5 800 protein-coding genes

44 % of genes are not unique (paralogs)

Many genes escaped systematic genetic screenings

1996

No genome is minimal

In each genome many genes are dispensable

The systematic gene deletion collection

< 18 % of genes are essential for cellular life

> 50 % of the non-essential genes generate no detectable phenotype when deleted

Core and pan-genomes

Variability + dispensability --> core- and pan-genomes

From 61 E. coli genomes fully sequenced (Lukjancenko et al., 2010 Microb. Ecol. 60: 708-720)

ca. 4 000 genes / individual genome

pan-genome: 15 574 gene families
core-genome: 993 gene families

Medini et al., 2005 «The microbial pan-genome» Curr. Op. Genetics & Development 15: 589-594

Closed species: e.g. B. anthracis Open species: e.g. S. agalactiae

> What is a species ?
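The core- and pan-genome notions above reduce to set operations over gene families: the pan-genome is the union across genomes, the core-genome the intersection. The sketch below illustrates this on hypothetical toy strains (the strain names and family labels are invented for illustration) and then derives two ratios from the E. coli figures quoted on the slide. Dividing gene families by genes per genome is only an approximation.

```python
def pan_and_core(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Pan-genome = union of all gene families; core-genome = intersection."""
    family_sets = list(genomes.values())
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core


# Hypothetical toy strains and gene-family labels, for illustration only.
toy_genomes = {
    "strain_1": {"famA", "famB", "famC", "famD"},
    "strain_2": {"famA", "famB", "famE"},
    "strain_3": {"famA", "famB", "famC", "famF"},
}
pan, core = pan_and_core(toy_genomes)
print("pan-genome:", sorted(pan))
print("core-genome:", sorted(core))

# Figures quoted on the slide for 61 E. coli genomes.
PAN_FAMILIES = 15_574
CORE_FAMILIES = 993
GENES_PER_GENOME = 4_000  # "ca. 4 000 genes / individual genome"

core_share_of_pan = CORE_FAMILIES / PAN_FAMILIES
core_share_of_genome = CORE_FAMILIES / GENES_PER_GENOME  # approximation
print(f"core / pan: {core_share_of_pan:.1%}")
print(f"core / individual genome: {core_share_of_genome:.1%}")
```

With the quoted numbers, the core represents only a small fraction of the pan-genome and roughly a quarter of any individual genome, which is what makes the species question on the slide pointed.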

Burst of novel lineages
Gradual reductive evolution in each lineage

Phase of complexification: brief, sporadic
Phase of reductive evolution: long, nearly clock-like
Adapted from Wolf and Koonin (2013) Bioessays 35: 829-837

Reductive evolution

Loss of genes and loss of functions play the key role in evolution

Parasitism, commensalism, symbiosis: , Chlorarachniophytes, Cryptophytes

Role of sex, loss of sex and horizontal acquisitions: Bdelloid rotifers

Free-living eukaryotes:

Microsporidia

Corradi and Slamovits, 2010, Brief. Funct. Genomics 10: 115-124.

2001, vol. 414: 450-453

Genome size: 2.5 Mb
Total genes: 1 997

Microsporidia

Complex ancestor
Intermediate forms: small number of genes, large genomes
Final forms: small number of genes, small genomes
Processes: gene loss; intron loss; genome size reduction; horizontal acquisitions

Horizontal gene acquisitions in Encephalitozoon hellem
Horizontal acquisition of genes involved in nucleic-acid metabolism
Alexander et al. 2016 PNAS 113: 4116-4121

Secondary endosymbiosis

Ancestral eukaryote Ancestral chromalveolate Guillardia theta

Primary endosymbiosis Secondary endosymbiosis

Ancestral rhizaria Bigelowiella natans

Gould, 2012, Nature 492: 46-48

Genomics of the eukaryotes

[Figure: eukaryotic supergroups Viridiplantae, Excavata, Rhizaria, Chromalveolata and Unikonts, with Bigelowiella natans and Guillardia theta labelled; counts shown on the figure: 2 313, 84, 8 748, 5 666]

Keeling et al., 2005 Trends in Ecology and Evolution, 20: 670-676

Secondary endosymbiosis

Curtis et al. (2012) Nature 492: 59-65

Nucleomorph: 3 chromosomes: 196, 181, 174 kb

inverted repeats at chromosome ends (rDNA, ubiquitin-conjugating enzyme gene)

Douglas et al., 2001 Nature 410: 1091-1096

Nucleomorph: 3 chromosomes: 141, 134, 98 kb

inverted repeats at chromosome ends (rDNA, DnaK pseudogenes)

Gilson et al., 2006 PNAS 103: 9566-9571

Genome size (Mb): 87.2 / 94.7
Split CDS (%): 80 / 86
Mean exons / gene: 6.4 / 8.8
Mean intron size (nuc.): 110 / 184

Asexual metazoa

World-wide expansion
Survive total desiccation --> genome fragmentation and regeneration
Asexual reproduction

Figure labels: alleles of one ohnolog; pairs of ohnologs

Nature (2013) 503: 453-457

Total genome size 244 Mb (including 26 Mb homozygous 2x)
49,300 protein-coding genes
Numerous homologous blocks (colinear regions) forming two groups:
- pairs of ohnologs (mean 74 % identity) corresponding to an ancient genome duplication
- pairs of alleles (mean 96 % identity) for each member of the pair of ohnologs

Figure caption: Example of a genomic quartet of 4 scaffolds.

Their coexistence in the same genome forms quartets that, altogether, cover 40 % of the entire genome (---> a locally tetraploid genome)

Asexual metazoa
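The two groups of colinear blocks are separated by their mean identity (74 % for ohnologs, 96 % for alleles), so a pair can be provisionally labelled by comparing its identity to a cutoff. The sketch below is only an illustration: the midpoint cutoff, the scaffold names and the toy identities are all hypothetical and are not taken from the slides or from the cited paper.

```python
OHNOLOG_MEAN_IDENTITY = 74.0  # percent, quoted on the slide
ALLELE_MEAN_IDENTITY = 96.0   # percent, quoted on the slide

# Hypothetical cutoff: midpoint of the two means, not a value from the source.
CUTOFF = (OHNOLOG_MEAN_IDENTITY + ALLELE_MEAN_IDENTITY) / 2


def classify_pair(percent_identity: float) -> str:
    """Label a pair of colinear blocks from its percent identity."""
    return "allele" if percent_identity >= CUTOFF else "ohnolog"


# Hypothetical quartet of 4 scaffolds: A1/A2 and B1/B2 are allelic pairs,
# while A-versus-B comparisons are ohnologous.
toy_pairs = {
    ("A1", "A2"): 96.3,
    ("B1", "B2"): 95.1,
    ("A1", "B1"): 73.8,
    ("A2", "B2"): 75.0,
}
labels = {pair: classify_pair(identity) for pair, identity in toy_pairs.items()}
for pair, label in labels.items():
    print(pair, label)

# Share of the 244 Mb genome covered by quartets (40 % on the slide).
GENOME_MB = 244
quartet_coverage_mb = GENOME_MB * 0.40
print(f"quartets cover about {quartet_coverage_mb:.1f} Mb")
```

A real analysis would work from alignments and synteny rather than a single threshold; the point here is only that the two identity classes are far enough apart to be distinguished.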

The genomic structure is incompatible with conventional meiosis

Allelic pairs are found on the same chromosomes
Colinear regions do not extend to entire chromosome scale

Multiple traces of gene conversion between gene copies reassort alleles without meiosis
Multiple traces of horizontal gene acquisition of non-metazoan origin throughout the genome (8%)

What are yeasts ?

Unicellular forms of modern fungi (devoid of fruiting bodies) adapted to rapid and unlimited clonal proliferation so long as nutrients are available.

Exponential mitotic growth

Generation 0: 1 cell
Generation 1: 2 cells (M B)
Generation 2: 4 cells (M B M B)
Generation 3: 8 cells (M B M B M B M B)
etc…

Generation 10: 1 024 cells
Generation 20: 1 048 576 cells
Generation 30: ~10^9 cells
Generation 50: ~10^15 cells
Generation 100: ~10^30 cells (approx. one week of growth)
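The generation series above is plain doubling, 2^n cells after n generations from a single founder. A minimal sketch, assuming perfectly synchronous divisions and no cell death (an idealization, as on the slide), reproduces the listed values and the doubling time implied by "approx. one week" for 100 generations:

```python
import math


def cells_after(generations: int) -> int:
    """Cells descended from one founder after n synchronous doublings."""
    return 2 ** generations


for n in (0, 1, 2, 3, 10, 20, 30, 50, 100):
    print(f"Generation {n}: {cells_after(n):.3e} cells")

# 100 generations in about one week implies this mean doubling time.
HOURS_PER_WEEK = 7 * 24
doubling_time_h = HOURS_PER_WEEK / 100
print(f"Implied doubling time: {doubling_time_h:.2f} h")


def generations_to_reach(cell_count: float) -> int:
    """Smallest number of doublings giving at least cell_count cells."""
    return math.ceil(math.log2(cell_count))


print(generations_to_reach(1e9), generations_to_reach(1e15), generations_to_reach(1e30))
```

The orders of magnitude 10^9, 10^15 and 10^30 are first reached at generations 30, 50 and 100 respectively, matching the rounded values in the list.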

Generation 100 corresponds to ~10 trillion tons, i.e. a crust several centimeters high covering all the world's continents

Natural yeasts and experimental models

> 1 500 species

Fermentable sugars: , , , maltose, melibiose, raffinose, lactose …

Oxidative utilization of a very large variety of organic compounds, including:
Non-fermentable sugars: , arabinose, ribose, rhamnose, fucose …
Non-fermentable alcohols: methanol, , propanol, , erythritol, ribitol, arabitol, …
Amino-sugars: glucosamine, acetyl-glucosamine, galactosamine …
Organic acids: lactic, succinic, citric, malic …
Other compounds: acetone, ethyl acetate, glucuronic acid, anthracene …

Need organic Nitrogen (but some species are able to assimilate nitrate or nitrite)

Terrestrial habitats: leaves, roots, trunks, fruits, plant exudates, insect guts, soils
Saccharomyces, Lachancea, Naumovozyma, Kluyveromyces, Kazachstania, Eremothecium, Schizosaccharomyces …

Aquatic habitats: fresh or saline water, planktonic or surfaces of plants and animals
Rhodotorula, Sporobolomyces, Debaryomyces, Metschnikowia, Leucosporidium …

Human, animal or plant pathogens or commensals
Antagonist effect against various plant diseases
Saccharomyces, Candida, , Cryptococcus, Taphrina, Protomyces, Eremothecium, Galactomyces …

The world of yeasts

[Figure: fungal phylogeny plotted against the approximate number of generations from present (axis from 10^12 to 10^6, with the human-chimpanzee separation marked); branch points at ~1500 MYr, ~1100 MYr and ~1000 MYr; flagellated cells, then loss of flagellum; lineage labels include Taphrinomycotina and Dikarya]

«modern» fungi (Dikarya), lineage: yeasts
- "Budding yeasts": numerous species, e.g. Saccharomyces, Debaryomyces, Yarrowia
- Filamentous fungi with fruiting bodies: rare species, e.g. Hortaea
- "Fission yeasts": few species, e.g. Schizosaccharomyces
- Fungi with fruiting bodies: some species, e.g. Rhodotorula
- Fungi with fruiting bodies: few species, e.g. Malassezia
- Fungi with fruiting bodies: some species, e.g. Cryptococcus
Recurrent formation of other yeasts

«primitive» fungi: Chytridiomycota, Microsporidia and Glomeromycota, Neocallimastigomycota, Blastocladiomycota

What are yeasts ?

Contrary to common intuition, yeasts are not a homogeneous group of closely related, evolutionarily primitive eukaryotes. They have emerged recurrently from distinct lineages of more complex fungi.

[Figure: mycelium of Ascomycota or Basidiomycota (~ linear growth) compared with yeast cells, M and B, multiplying by mitosis (~ exponential growth)]

Yeast genomes are highly evolved structures that have lost a number of ancestral features.

Saccharomycotina genome signatures

Lineage: genome size (Mb) / coding capacity (%) / gene numbers / split genes (%)
ZT, WGD and KLE clades: 9 - 14 / > 70 / 4 700 - 6 000 / 3 - 5
«CTG» code: 12 - 14 / > 70 / 6 100 - 6 400 / 6 - 7
Methylotrophs: 12 - 14 / > 70 / 5 000 - 6 000 / 15
Basal lineages: 12 - 24 / ~ 45 / 6 100 - 6 800 / 15 - 35

Species shown, in order: Zygosaccharomyces rouxii, Torulaspora delbrueckii, Saccharomyces cerevisiae, Saccharomyces paradoxus, Saccharomyces mikatae, Saccharomyces kudriavzevii, Saccharomyces uvarum, Nakaseomyces (Candida) glabrata, Nakaseomyces bacillisporus, Kazachstania exigua, Kazachstania servazzii, Kazachstania africana, Naumovozyma castellii, Naumovozyma dairenensis, Tetrapisispora phaffii, Vanderwaltozyma polyspora, Lachancea thermotolerans, Lachancea waltii, Lachancea kluyveri, Kluyveromyces lactis, Eremothecium gossypii, Eremothecium cymbalariae, Debaryomyces hansenii, Meyerozyma guilliermondii, Millerozyma sorbitophila, Scheffersomyces stipitis, Spathaspora passalidarum, Candida parapsilosis, Candida orthopsilosis, Lodderomyces elongisporus, Candida dubliniensis, Clavispora lusitaniae, Dekkera bruxellensis, Ogataea polymorpha, Ogataea parapolymorpha, Kuraishia capsulata, Komagataella pastoris, Lindnera jadinii (Candida utilis), Nadsonia fulvescens, Blastobotrys adeninivorans, Yarrowia lipolytica, Geotrichum candidum

Annotations: loss of HP1-mediated chromatin modification; loss of canonical RNAi machinery

[Figure: gene losses and gene gains, with Debaryomyces hansenii and Yarrowia lipolytica labelled; counts shown on the figure: 329, 1115, 1443, 2823, 369, 85, 834]
Gene losses: deletions; pseudogenes
Gene gains: duplications & sub- or neo-functionalization; hybridizations; horizontal acquisitions; de novo formation

Saccharomyces cerevisiae

Functional losses associated with gene loss
Sugar utilization: GAL7 (UDP transferase), GAL10 (epimerase), GAL1 (kinase), GAL3 (activator), SUC2 (invertase)
Phosphate metabolism: PHO3, PHO5, PHO11, PHO12 (acid phosphatases), PHO89 (transporter)
Nicotinic acid biosynthesis: BNA1, BNA2 (dioxygenases), BNA4 (monooxygenase), BNA5 (kynureninase), BNA6 (phosphoribosyl transferase), BNA7 (kynurenine formamidase)
Allantoin metabolism: DAL1, DAL2 (allantoinases), DAL3 (ureidoglycolate lyase)
Transporters: OPT1, OPT2 (oligopeptide transporters)
Etc …

Functional gains associated with gene loss
Figure labels: external nicotinic acid; permeases; synthetic enzymes; nicotinic acid; NAD+; general metabolism
In C. glabrata, the intracellular concentration of NAD+ depends directly on the external concentration of nicotinic acid ---> regulates specific adhesion to epithelia (Domergue et al., 2005 Science 308, 866-870)

[Figure: centromere structures across yeast lineages]
Labels: canonical histone H3; centromeric variant (CenP-A); IR; heterochromatin; CDE I, CDE II, CDE III; Ty5 and LTR remnants
Sizes shown: 0.13 kb; 5 kb; > 10 kb
Centromere types: eukaryotic-type centromeres; regional centromeres; point centromeres
Lineages: Taphrinomycotina; basal; methylotrophs; CTG; Saccharomycetaceae (ZT clade, KLE clade, post WGD)
Annotations: loss of classical heterochromatin modification; recruitment of plasmid elements
Malik and Henikoff, 2009, Cell 138: 1067-1082

Specifically Retained Ancestral Genes (SRAG)

Geotrichum candidum: 24.8 Mb, 6 804 CDS, 35 % with intron(s).

Groups compared: Saccharomycotina, Pezizomycotina, Taphrinomycotina, Basidiomycota

Pseudo-orphans with discordant phylogenies

280 genes have no homolog in any sequenced Saccharomycotina, but have homologs in Pezizomycotina and / or Basidiomycota:
- 17 correspond to horizontal transfers from filamentous fungi (low sequence divergence with outgroup)
- 263 correspond to SRAG (same sequence divergence with outgroup as other genes)
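The split of the 280 pseudo-orphans rests on one criterion: divergence from the outgroup that is unusually low suggests a recent horizontal transfer, whereas divergence comparable to other genes suggests an ancestral gene that was specifically retained. The sketch below checks the arithmetic with the figures quoted for Geotrichum candidum and illustrates the criterion. The function name, the divergence scale and the tolerance are hypothetical, chosen only for illustration.

```python
PSEUDO_ORPHANS = 280          # genes without homolog in other Saccharomycotina
HORIZONTAL_TRANSFERS = 17     # low divergence with outgroup
TOTAL_CDS = 6_804             # Geotrichum candidum CDS count quoted on the slide

srag_count = PSEUDO_ORPHANS - HORIZONTAL_TRANSFERS
srag_percent = 100 * srag_count / TOTAL_CDS
print(f"SRAG: {srag_count} genes, {srag_percent:.1f} % of CDS")


def classify_pseudo_orphan(
    divergence: float,
    typical_divergence: float,
    tolerance: float = 0.1,  # hypothetical margin, not from the source
) -> str:
    """Label a pseudo-orphan from its sequence divergence with the outgroup."""
    if divergence < typical_divergence - tolerance:
        return "horizontal transfer"
    return "SRAG"


# Hypothetical divergence values on an arbitrary 0-1 scale.
print(classify_pseudo_orphan(0.20, typical_divergence=0.60))
print(classify_pseudo_orphan(0.58, typical_divergence=0.60))
```

The computed share agrees with the 3.9 % reported for Geotrichum candidum.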

Debaryomyces hansenii: 113 SRAGs (2.0 %). Amino-acid, carbon metabolism, transport …
Geotrichum candidum: 263 SRAGs (3.9 %). Transcriptional regulation, cell cycle, cellulose / pectin hydrolysis …
Yarrowia lipolytica: 232 SRAGs (3.6 %). Extracellular proteases, oxido-reduction …

Summary

[Figure: a complex ancestor gives rise, through gene loss and decay, to evolved form 1, evolved form 2, evolved form 3, etc…; novel lineages branch off]
Processes shown: horizontal gene acquisitions; introgressions; duplications, sub-functionalization, neo-functionalization, exon-shuffling; hybridizations, endosymbiosis; genome duplication accompanied by selection; exon-shuffling; de novo gene formation; etc …
Legend: ----- adaptations; --- major innovations

Genomes are never minimal
Evolution is mostly reticulate