Chromosome-Level Assembly Reveals Extensive Rearrangement In
Aberystwyth University

Chromosome-level assembly reveals extensive rearrangement in saker falcon and budgerigar, but not ostrich, genomes

O'Connor, Rebecca E.; Farré, Marta; Joseph, Sunitha; Damas, Joana; Kiazim, Lucas; Jennings, Rebecca; Bennett, Sophie; Slack, Eden A.; Allanson, Emily; Larkin, Denis M.; Griffin, Darren K.

Published in: Genome Biology
DOI: 10.1186/s13059-018-1550-x
Publication date: 2018

Citation for published version (APA): O'Connor, R. E., Farré, M., Joseph, S., Damas, J., Kiazim, L., Jennings, R., Bennett, S., Slack, E. A., Allanson, E., Larkin, D. M., & Griffin, D. K. (2018). Chromosome-level assembly reveals extensive rearrangement in saker falcon and budgerigar, but not ostrich, genomes. Genome Biology, 19, [171]. https://doi.org/10.1186/s13059-018-1550-x

Document License: CC BY

O’Connor et al.
Genome Biology (2018) 19:171 https://doi.org/10.1186/s13059-018-1550-x

RESEARCH Open Access

Chromosome-level assembly reveals extensive rearrangement in saker falcon and budgerigar, but not ostrich, genomes

Rebecca E O’Connor1†, Marta Farré2†, Sunitha Joseph1, Joana Damas2, Lucas Kiazim1, Rebecca Jennings1, Sophie Bennett1, Eden A Slack2, Emily Allanson2, Denis M Larkin2† and Darren K Griffin1*†

Abstract

Background: The number of de novo genome sequence assemblies is increasing exponentially; however, relatively few contain one scaffold/contig per chromosome. Such assemblies are essential for studies of genotype-to-phenotype association, gross genomic evolution, and speciation. Inter-species differences can arise from chromosomal changes fixed during evolution, and we previously hypothesized that a higher fraction of elements under negative selection contributed to avian-specific phenotypes and avian genome organization stability. The objective of this study is to generate chromosome-level assemblies of three avian species (saker falcon, budgerigar, and ostrich) previously reported as karyotypically rearranged compared to most birds. We also test the hypothesis that the density of conserved non-coding elements is associated with the positions of evolutionary breakpoint regions.

Results: We used reference-assisted chromosome assembly, PCR, and lab-based molecular approaches to generate chromosome-level assemblies of the three species. We mapped inter- and intrachromosomal changes from the avian ancestor, finding no interchromosomal rearrangements in the ostrich genome, despite it being previously described as chromosomally rearranged. We found that the average density of conserved non-coding elements in evolutionary breakpoint regions is significantly reduced. Fission evolutionary breakpoint regions have the lowest conserved non-coding element density, and intrachromosomal evolutionary breakpoint regions have the highest.
Conclusions: The tools used here can generate inexpensive, efficient chromosome-level assemblies, with > 80% assigned to chromosomes, which is comparable to genomes assembled using high-density physical or genetic mapping. Moreover, conserved non-coding elements are important factors in defining where rearrangements, especially interchromosomal, are fixed during evolution without deleterious effects.

Keywords: Chromosome-level genome assembly, Genome evolution, CNE, EBR

* Correspondence: [email protected]
† Rebecca E O’Connor, Marta Farré, Denis M Larkin and Darren K Griffin contributed equally to this work.
1School of Biosciences, University of Kent, Canterbury, UK
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

The number of de novo (new species) genome sequence assemblies is increasing exponentially (e.g., [1, 2]). Improved technologies are generating longer reads, greater read depths, and ultimately assemblies with fewer, longer contigs per genome [3, 4]; however, the ability to assemble a genome with the same number of scaffolds or contigs as chromosomes (“chromosome-level” assembly) remains the ultimate aim of a de novo sequencing effort. This is for several reasons, among them the requirement for an established order of DNA markers as a pre-requisite for revealing genotype-to-phenotype associations for marker-assisted selection and breeding, e.g., in species regularly bred for food production, companionship, or conservation purposes [5].

Chromosome-level assemblies were rapidly established for agricultural animals (chicken, pig, cattle, sheep) [6–9] in part because they were assembled as maps prior to (e.g., Sanger) sequencing. Species used for food consumption in developing countries (e.g., goat, camel, yak, buffalo, ostrich, quail); animals bred for conservation (e.g., falcons and parrots), and companion animals (e.g., pet birds) are still however poorly represented, in part because they were initially assembled using NGS data alone. New techniques, e.g., optical mapping [10], BioNano [11], Dovetail [12], and PacBio long-read sequencing [13], make significant steps towards this. Recent progress on the goat genome for instance resulted in a chromosome-level assembly using PacBio long-read sequencing [2]; others however encounter technical issues: BioNano contigs fail to map across multiple DNA nick site regions, centromeres, or large heterochromatin blocks, and PacBio requires starting material of hundreds of micrograms of high molecular weight DNA, thereby limiting its usage. To achieve a chromosome-level assembly therefore often requires a combination of technologies to integrate the sequence data, e.g., Hi-C [14], linkage mapping, pre-existing chromosome-level reference assemblies, and/or molecular cytogenetics [15, 16]. To this end, we made use of bioinformatic approaches, e.g., the Reference-Assisted Chromosome Assembly (RACA) algorithm [17]. RACA however is limited in needing a closely related reference species for comparison [17] and further mapping of superscaffolds physically to chromosomes. We therefore recently developed an approach where RACA produces sub-chromosome-sized predicted chromosome fragments (PCFs) which are subsequently verified and mapped to chromosomes using molecular methods [15]. In so doing, we previously established a novel, integrated approach that allows de novo assembled genomes to be mapped directly onto the chromosomes of interest and displayed the information [...]

[...] degree of homology with the chicken (like other ratite birds) revealed by cross species chromosome painting [25–27], it however purportedly has 26 previously undetected interchromosomal rearrangements when compared to the ancestral avian karyotype as revealed by sequence assembly analysis of optical mapping data [28]. For these three species, we used our previously described approach combining computational algorithms for ordering scaffolds into predicted chromosome fragments (PCFs) which we then physically mapped directly to the chromosomes of interest using a set of avian universal bacterial artificial chromosome (BAC) probes [15].

Chromosome-level assemblies also inform studies of evolution and speciation given that inter-species differences arise from chromosomal changes fixed during evolution [29–35]. In recent studies, we have used (near) chromosome-level assemblies to reconstruct ancestral karyotypes and trace inter- and intrachromosomal changes that have occurred to generate the karyotypes of extant species [28, 36]. Theories explaining the mechanisms of chromosomal change in vertebrates include a role for repetitive sequences used for non-allelic homologous recombination (NAHR) in evolutionary breakpoint regions (EBRs) [37] and the proximity of DNA regions in chromatin [38]. During gross genome (karyotype) evolution, unstable EBRs delineate stable homologous synteny blocks (HSBs) and we have established that the largest HSBs are maintained non-randomly and highly enriched for conserved non-coding elements (CNEs) [9–11, 15, 39]. We recently proposed the hypothesis that a higher fraction of elements under negative selection involved in gene