Deeply Conserved Synteny Resolves Early Events in Vertebrate Evolution
Total Page:16
File Type:pdf, Size:1020Kb
ARTICLES https://doi.org/10.1038/s41559-020-1156-z Deeply conserved synteny resolves early events in vertebrate evolution Oleg Simakov 1,2,13 ✉ , Ferdinand Marlétaz 1,11,13, Jia-Xing Yue 3,12, Brendan O’Connell 4, Jerry Jenkins 5, Alexander Brandt6, Robert Calef7, Che-Huang Tung8, Tzu-Kai Huang8, Jeremy Schmutz 5, Nori Satoh 9, Jr-Kai Yu 8, Nicholas H. Putnam 7, Richard E. Green4 and Daniel S. Rokhsar 1,6,10 ✉ Although it is widely believed that early vertebrate evolution was shaped by ancient whole-genome duplications, the number, timing and mechanism of these events remain elusive. Here, we infer the history of vertebrates through genomic comparisons with a new chromosome-scale sequence of the invertebrate chordate amphioxus. We show how the karyotypes of amphioxus and diverse vertebrates are derived from 17 ancestral chordate linkage groups (and 19 ancestral bilaterian groups) by fusion, rearrangement and duplication. We resolve two distinct ancient duplications based on patterns of chromosomal conserved syn- teny. All extant vertebrates share the first duplication, which occurred in the mid/late Cambrian by autotetraploidization (that is, direct genome doubling). In contrast, the second duplication is found only in jawed vertebrates and occurred in the mid–late Ordovician by allotetraploidization (that is, genome duplication following interspecific hybridization) from two now-extinct progenitors. This complex genomic history parallels the diversification of vertebrate lineages in the fossil record. n the 1970s, Ohno1 proposed that vertebrates arose through Results and discussion a process involving one or more genome-wide duplications. Amphioxus chromosomes reflect ancestral chordate linkages. IThis hypothesis received early support from the discovery of As an invertebrate chordate whose lineage diverged before the multiple vertebrate Hox clusters compared with one invertebrate emergence of vertebrates, amphioxus species have often served as cluster2 and the finding of numerous vertebrate gene families with a proxy for the ancestral proto-vertebrate condition22, and provide members distributed across multiple chromosomes3,4. Further a critical outgroup for analysing vertebrate-specific gene duplica- evidence came from the discovery of paralogous (that is, dupli- tions2–4,10 and the evolution of vertebrate gene regulation23. To cated) blocks of linked genes on multiple chromosomes within robustly infer the proto-vertebrate karyotype and the genomic the human genome5–8, culminating in the discovery of wide- changes that accompanied the invertebrate-to-vertebrate transition, spread quadruply conserved synteny of the human genome9,10. we produced a chromosome-scale genome assembly of amphioxus These studies support the so-called ‘2R’ scenario of two rounds of (the Florida lancelet Branchiostoma floridae). We combined exist- whole-genome duplication during vertebrate evolution. ing shotgun data10 with new in vitro24 and in vivo25 chromatin con- However, the number, timing and mechanism of these duplica- formation capture sequences that enable megabase-scale scaffolds tion events are still debated3,10–14. Alternatives to the 2R hypothesis to be accurately linked together to reconstruct chromosomes24,25 include the recent proposal of a single whole-genome duplication (Methods, Supplementary Notes 1 and 2 and Extended Data with “additional large paralogy regions being the product of rare Fig. 1a). The resulting chromosome-scale assembly of B. floridae segmental duplications occurring both before and after”, based represents a substantial improvement over the original draft on comparative analyses of the sea lamprey genome13,15. Others genome sequence, which achieved only megabase-scale scaffolds10, have suggested a series of large segmental duplications without and megabase-scale assemblies of other amphioxus species23,26. Our any genome-wide events16,17, although this is a minority view. assembly assigns 94.5% of genes to the 19 B. floridae chromosomes Contributing to this uncertainty are discrepancies in the inferred BFL1–19. We validated the chromosome-scale accuracy of the new chromosomal organization of the proto-vertebrate ancestor. By B. floridae assembly by generating a dense meiotic linkage map analysing gene linkages within and among selected bony vertebrate made from the F1 progeny of two wild parents (Supplementary genomes (Euteleostomi), some authors have suggested the exis- Note 3 and Extended Data Fig. 1b)10,22. tence of 10–13 proto-vertebrate (that is, before any duplications) To examine the conservation of syntenic relationships, we con- chromosomes13,15,18–20, although other studies10,14,21 have inferred 17 structed Oxford dot plots comparing the chromosomal positions ancestral chromosomes. of orthologous genes between genomes of amphioxus and multiple 1Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan. 2Department of Neuroscience and Developmental Biology, University of Vienna, Vienna, Austria. 3Université Côte d’Azur, CNRS, INSERM, IRCAN, Nice, France. 4Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA. 5HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA. 6Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA. 7Dovetail Genomics, Scotts Valley, CA, USA. 8Institute of Cellular and Organismic Biology, Academia Sinica, Taipei, Taiwan. 9Marine Genomics Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan. 10Chan Zuckerberg Biohub, San Francisco, CA, USA. 11Present address: Centre for Life’s Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK. 12Present address: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China. 13These authors contributed equally: Oleg Simakov, Ferdinand Marlétaz. ✉e-mail: [email protected]; [email protected] NatURE EcoLogy & EVOLUTION | www.nature.com/natecolevol ARTICLES NATURE ECOLOGY & EVOLUTION vertebrates (Fig. 1a and Extended Data Figs. 2 and 3) and inverte- between distinct patterns of conserved synteny across the chicken brates (Fig. 1b and Extended Data Fig. 4). These plots clearly dis- (chromosome code: GGA), gar (chromosome code: LOC), human play dense rectangular blocks of dots that represent units of deeply (chromosome code: HSA), frog (chromosome code: XTR), sea lam- conserved synteny. Here, we use the original meaning of ‘synteny’27 prey and scallop (vertical dashed lines Fig. 1 and Extended Data to represent physical linkage without regard to gene order. The Figs. 2 and 3). BFL2 exhibits an alternating block pattern of CLGJ uniform distribution of orthologues within these blocks implies that while physical linkage is conserved, gene order within syn- tenic units has become largely scrambled since the amphioxus and other lineages diverged from each other more than half a billion a years ago. This gene order scrambling is the result of accumulated BFL1 BFL2 BFL3 BFL4 BFL5 BFL6 BFL7 BFL8 BFL9 BFL10 BFL11 BFL12 BFL13 BFL14 BFL15 BFL16 BFL17 BFL18 BFL19 LOC28 LOC27 LOC26 LOC25 inversions and other intra-chromosomal rearrangements over LOC24 LOC23 time, as observed between Drosophila species across increasing LOC22 6,000 LOC21 28,29 LOC20 evolutionary distances . LOC19 LOC18 LOC17 Comparison of the chromosomes of amphioxus and the scal- LOC16 30 5,000 LOC15 lop Patinopecten yessoensis (chromosome code: PYE) —the most LOC14 LOC13 complete chromosomal assembly of a marine invertebrate until LOC12 amphioxus, albeit with only 80% of genes assigned to linkage 4,000 LOC11 CLGA groups—shows the remarkable stability of bilaterian chromosomes. LOC10 Many amphioxus and scallop are in 1:1 correspondence, and others LOC9 CLGB Spotted gar 3,000 LOC8 are related by the limited rearrangements described below (Fig. 1b). CLGC This observation directly confirms the deep conservation of synteny LOC7 LOC6 CLGD we previously hypothesized based on networks of conserved link- 2,000 LOC5 21,30–33 CLGE ages observed among draft genomes of diverse invertebrates LOC4 CLGF (see Extended Data Fig. 4). LOC3 1,000 CLGG We identified 17 distinct patterns of conserved amphioxus– LOC2 CLGH vertebrate synteny (Supplementary Note 4). Each pattern repre- LOC1 sents a group of genes whose linkage has been preserved since the A J C J C Q O I E D F H K B G N M P L B J BO CLGI GGAZ divergence of the vertebrate and cephalochordate lineages, and is GGA33 CLGJ GGA28 GGA27 GGA26 identified with an ancestral chordate linkage group (CLG). Each GGA25 6,000 GGA24 GGA22 GGA23 CLGK GGA21 CLG is assigned a letter (A–Q, in decreasing order of number of GGA20 GGA19 CLGL genes) and, for ease of representation, colour, consistently used GGA18 GGA17 5,000 GGA15 CLGM throughout. Vertical dashed lines show sharp boundaries between GGA14 GGA13 segments of different CLG ancestry in amphioxus; these are con- GGA12 CLGN GGA11 sistent in all comparisons among both vertebrates and inverte- 4,000 GGA10 CLGO GGA9 brates (Fig. 1 and Extended Data Figs. 2–4). The CLGs defined GGA8 CLGP GGA7 here by amphioxus–vertebrate comparisons also are found as Chicken GGA6 intact units in the scallop (Fig. 1b and Supplementary Note 4), 3,000 CLGQ GGA5 which implies that they represent even more ancient bilaterian or GGA4 metazoan conserved syntenic units. 2,000 The amphioxus karyotype (n = 19) is derived