Sea Anemone Genomes Reveal Ancestral Metazoan Chromosomal Macrosynteny
Total Page:16
File Type:pdf, Size:1020Kb
Sea anemone genomes reveal ancestral metazoan chromosomal macrosynteny Ulrich Technau ( [email protected] ) University of Vienna https://orcid.org/0000-0003-4472-8258 Sophia Robb Stowers Grigory Genikhovich University of Vienna https://orcid.org/0000-0003-4864-7770 Juan Montenegro University of Vienna Witney Fropf Stowers Lukas Weinguny University of Vienna Shuonan He Stowers Institute for Medical Research https://orcid.org/0000-0001-6709-9648 Shiyuan Chen Stowers Institute for Medical Research https://orcid.org/0000-0001-9805-2906 Jessica Lovegrove-Walsh University of Vienna https://orcid.org/0000-0002-4362-4177 Eric Hill Stowers Katerina Ragkousi Stowers Daniela Praher University of Vienna David Fredman University of Vienna Yehu Moran The Hebrew University of Jerusalem https://orcid.org/0000-0001-9928-9294 Matthew Gibson Stowers Bob Zimmermann University of Vienna https://orcid.org/0000-0003-2354-0408 Page 1/32 Biological Sciences - Article Keywords: sea anemones, chromosome-level genome assemblies, Nematostella vectensis, Scolanthus callimorphus Posted Date: August 17th, 2021 DOI: https://doi.org/10.21203/rs.3.rs-796229/v1 License: This work is licensed under a Creative Commons Attribution 4.0 International License. Read Full License Page 2/32 Abstract Draft genome sequences of non-bilaterian species have provided important insights into the evolution of the metazoan gene repertoire. However, there is little information about the evolution of gene clusters, genome architectures and karyotypes during animal evolution. Here we report chromosome-level genome assemblies of two related anthozoan cnidarians, the sea anemones, Nematostella vectensis and Scolanthus callimorphus. We nd a robust set of 15 chromosomes with a clear one-to-one correspondence of the chromosomes between the two species. We show that, in contrast to Bilateria, Hox and NK clusters of investigated cnidarians are disintegrated, indicating that microsynteny conservation is largely lost. In line with that, we nd no evidence for topologically associated domains, suggesting fundamental difference in long-range gene regulation compared to vertebrates. However, both sea anemone genomes show remarkable chromosomal conservation with other cnidarians, several bilaterians and the sponge Ephydatia muelleri, allowing us to reconstruct the putative cnidarian and metazoan chromosomes, consisting of 19 and 16 ancestral linkage groups, respectively. These data suggest that large parts of the ancestral metazoan genome have been retained in chromosomes of some extant lineages, yet, higher order gene regulation may have evolved only after the cnidarian-bilaterian split. Introduction Draft genomes comprising hundreds to thousands of scaffolds, while helpful to identify gene repertoires, were neither suitable to reconstruct the evolutionary history of chromosomes nor did they always allow researchers to investigate long-range cis-regulation of genes. Recent chromosome-level genome assemblies refueled the opportunity to compare the content and localization of homologous genes between distantly related species. This led to the reconstruction of ancestral linkage groups among Chordata1,2 and Spiralia3,4. However, these analyses remained restricted to bilaterians and did not provide insights into the origin of metazoan chromosomes or their diversication. In this regard, representatives of Cnidaria, the sister clade to Bilateria, are crucial. Cnidaria constitute a large clade of basally branching Metazoa, dating back between 590 and 690 Mya5– 7. Their robust phylogenetic position as sister to Bilateria makes them the key group to study the evolution of bilaterian features, such as axis organization, mesoderm formation and central nervous system development8. The genome of the edwardsiid sea anemone Nematostella vectensis became the rst non-bilaterian animal genome to have a draft scaffold-level sequenced in 2007, and revealed uncanny conservation of gene content to vertebrates as well as rst hints for macrosyntenic conservation9. By now, genomes of the representatives of all ve cnidarian classes have become available10–18 providing valuable insight into various aspects of the cnidarian gene complement and genome organization. However, these genomes originated from fairly distantly related species, and a cornerstone to genomic inquiry has long been the conservation signals of recently diverged taxa19. The Page 3/32 “starlet sea anemone” Nematostella (Figure 1a), which has been developed into an important model organism, belongs to the family of Edwardsiidae within the Actiniaria. Yet, to date no genome sequence of another edwardsiid sea anemone has been reported. An interesting and closely related sea anemone of the edwardsiid family is the “worm sea anemone” Scolanthus callimorphus (Figure 1b), dwelling in European seawater20,21, which according to our molecular clock calculations has separated from Nematostella approximately 174 Mio years ago (EDF 1, see Materials and Methods for details). Main Text High Quality Chromosome-Level Assemblies of Two Edwardsiid Genomes Using short-read sequencing and a k-mer coverage model, we estimated the genome length of Nematostella at 244 Mb (EDF 2), which is substantially shorter than previously suggested at 450 Mb9. This discrepancy could be in part attributed to the previous use of four haplotypes in sequencing. The genome of the sea anemone Exaiptasia pallida is similar in length to Nematostella22, while the estimated 414 Mb of the Scolanthus genome is at present the largest sequenced actiniarian genome, mainly due to expansions of repetitive elements. This indicates that the genome lengths among Actiniaria may be more dynamic than suggested from earlier analyses (Figure 1c). Using PacBio long-read sequencing and high- throughput conformation capture (Hi-C), we then assembled chromosome-level of Nematostella and Scolanthus genomes, which far surpass the quality of the published Nematostella genome in terms of contiguity, correctness, mappability and completeness (see Supplementary Text, EDF 3-5 for details). In order to compare the genomic location of homologous genes between the edwardsiids, we utilized the previously sequenced Scolanthus transcriptome23 and we determined 24,625 gene models (see Materials and Methods for details). For Nematostella, we sequenced several transcriptome libraries from various developmental stages, which were assembled earlier as NVE gene models24. To further improve the gene annotation, in particular with respect to isoforms and untranslated regions, we used a combination of IsoSeq and RNAseq data, which allowed us to identify 24,525 gene models and 36,280 transcripts. BUSCO analysis showed that the transcriptome contains 96.1% of expected metazoan conserved sequences, which represents an increase over the previous NVE gene models (90.6% complete BUSCOs) (EDF 6). Additional comparison to previously cloned complete CDS (EDT 7) showed that 261 of the 277 sequences (94.2%) were present. Of the missing 16 sequences, 15 could be condently aligned to the reference genome and have been manually added to the annotation les. To facilitate the usage of the newly assembled genomes, we established a publicly accessible Genome Browsers. Both new genome assemblies and associated data are available for browsing, downloading, Page 4/32 and BLAST at SIMRbase (https://simrbase.stowers.org). The Nematostella vectensis genome assembly, referred to as Nvec200, has an abundance of aligned track data, including the newly generated gene models and a large collection of published RNAseq and ChIP-seq analysis as well as 145 ultra-conserved non-coding elements (UCNEs) shared between Nematostella and Scolanthus (Supplementary Text; EDT 8). Chromosomal Organization of the NK and extended Hox gene clusters The chromosome-level assembly of the Nematostella genome allowed us to re-address the evolution of specic gene clusters. Prominent examples of clusters of homeodomain transcription factor coding genes ancestral for Bilateria include the SuperHox cluster, the ParaHox cluster, the NK/NK-like cluster as well as NK2 group genes located separately25–27. It has been hypothesized that all of them originated from a single gene cluster, which then disintegrated during evolution (for review see27). Our analysis revealed that Nematostella possesses a separate ParaHox cluster of two genes, (Gsx and Xlox/Cdx) on chromosome 10, and a SuperHox cluster on chromosome 2 containing Hox, Evx, Mnx, and Rough, as well as more distant Mox and Gbx28 (Figures 3 and 4, EDT6). We identied an NK cluster on chromosome 5 containing NK1, NK5, Msx, NK4, NK3, NK7, NK6, and more distant Ladybird, a Tlx-like gene and, intriguingly, Hex, which is also linked to the NK cluster in the hemichordate Saccoglossus kowalevskii29 and in the cephalochordate Branchiostoma oridae. Similar to Bilateria, the NK2 genes were clustered separately and found on chromosome 2 (Figures 3 and 4, EDT6). In contrast, in earlier branching sponges, neither ParaHox nor extended Hox cluster genes exist, and only the NK cluster is present with a single NK2/3/4 gene, two NK5/6/7 genes, an Msx ortholog, as well as possible Hex and Tlx orthologs30, (EDF 7). Taken together, this allows us to propose that the last common ancestor of Cnidaria and Bilateria possessed an NK-cluster on a chromosome different from the one carrying the SuperHox cluster, and a separate NK2 cluster, which might have been on the same chromosome as the SuperHox cluster (Figure 4). The hypothesized SuperHox-NK Megacluster25,