The Whale Shark Genome Reveals How Genomic and Physiological Properties Scale with Body Size
Total Page:16
File Type:pdf, Size:1020Kb
The whale shark genome reveals how genomic and physiological properties scale with body size Jessica A. Webera,1, Seung Gu Parkb,c,1, Victor Luriad,1, Sungwon Jeonb,c, Hak-Min Kimb,c, Yeonsu Jeonb,c, Youngjune Bhakb,c, Je Hun June, Sang Wha Kimf,g, Won Hee Hongh, Semin Leeb,c, Yun Sung Choe, Amir Kargeri, John W. Cainj, Andrea Manicak, Soonok Kiml, Jae-Hoon Kimm, Jeremy S. Edwardsn,2,3, Jong Bhakb,c,e,2,3, and George M. Churcha,2,3 aDepartment of Genetics, Harvard Medical School, Boston, MA 02115; bKorean Genomics Center, Ulsan National Institute of Science and Technology, 44919 Ulsan, Republic of Korea; cDepartment of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology, 44919 Ulsan, Republic of Korea; dDepartment of Systems Biology, Harvard Medical School, Boston, MA 02115; eClinomics Inc., 44919 Ulsan, Republic of Korea; fLaboratory of Aquatic Biomedicine, College of Veterinary Medicine, Seoul National University, 08826 Seoul, Republic of Korea; gResearch Institute for Veterinary Science, College of Veterinary Medicine, Seoul National University, 08826 Seoul, Republic of Korea; hHanwha Marine Biology Research Center, 63642 Jeju, Republic of Korea; iIT–Research Computing, Harvard Medical School, Boston, MA 02115; jDepartment of Mathematics, Harvard University, Cambridge, MA 02138; kDepartment of Zoology, University of Cambridge, CB2 3EJ Cambridge, United Kingdom; lNational Institute of Biological Resources, 37242 Incheon, Republic of Korea; mCollege of Veterinary Medicine and Veterinary Medical Research Institute, Jeju National University, 63243 Jeju, Republic of Korea; and nDepartment of Chemistry and Chemical Biology, UNM Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87131 Contributed by George M. Church, June 8, 2020 (sent for review December 24, 2019; reviewed by Manuel Corpas and Xiaohua Huang) The endangered whale shark (Rhincodon typus) is the largest fish BMRs, decreased growth rates, longer generational times, and on Earth and a long-lived member of the ancient Elasmobranchii increased body sizes (3, 4). clade. To characterize the relationship between genome features The whale shark (Rhincodon typus) is the largest extant fish, and biological traits, we sequenced and assembled the genome of reaches lengths of 20 m (5) and 42 tons in mass (6) and has a the whale shark and compared its genomic and physiological fea- maximum lifespan estimated at 80 y (6). Worldwide populations tures to those of 83 animals and yeast. We examined the scaling have been declining, and the whale shark has been classified as relationships between body size, temperature, metabolic rates, an endangered species by the International Union for Conser- and genomic features and found both general correlations across vation of Nature. Whale sharks are one of three species of filter- the animal kingdom and features specific to the whale shark ge- feeding sharks that use modified gill rakers to sieve plankton and nome. Among animals, increased lifespan is positively correlated small nektonic prey from the water column in a method to body size and metabolic rate. Several genomic traits also sig- nificantly correlated with body size, including intron and gene Significance length. Our large-scale comparative genomic analysis uncovered general features of metazoan genome architecture: Guanine and We sequenced and analyzed the genome of the endangered cytosine (GC) content and codon adaptation index are negatively whale shark, the largest fish on Earth, and compared it to the correlated, and neural connectivity genes are longer than average genomes of 84 other species ranging from yeast to humans. genes in most genomes. Focusing on the whale shark genome, we We found strong scaling relationships between genomic and identified multiple features that significantly correlate with life- physiological features. We posit that these scaling relation- span. Among these were very long gene length, due to introns ships, some of which were remarkably general, mold the ge- being highly enriched in repetitive elements such as CR1-like long nome to integrate metabolic constraints pertaining to body interspersed nuclear elements, and considerably longer neural size and ecological variables such as temperature and depth. genes of several types, including connectivity, activity, and neuro- Unexpectedly, we also found that the size of neural genes is degeneration genes. The whale shark genome also has the second strongly correlated with lifespan in most animals. In the whale slowest evolutionary rate observed in vertebrates to date. Our shark, large gene size and large neural gene size strongly comparative genomics approach uncovered multiple genetic fea- correlate with lifespan and body mass, suggesting longer gene tures associated with body size, metabolic rate, and lifespan and lengths are linked to longer lifespans. showed that the whale shark is a promising model for studies of neural architecture and lifespan. Author contributions: J.A.W., V.L., Y.S.C., J.B., and G.M.C. designed research; H.-M.K., S.W.K., W.H.H., Y.S.C., S.K., and J.-H.K. performed research; J.A.W., S.G.P., V.L., S.J., whale shark | lifespan | body size | metabolic rate | neural genes H.-M.K., Y.J., Y.B., J.H.J., S.L., A.K., and J.W.C. analyzed data; and J.A.W., S.G.P., V.L., S.J., A.M., J.S.E., J.B., and G.M.C. wrote the paper. he relationships between body mass, longevity, and basal Reviewers: M.C., Cambridge Precision Medicine Limited; and X.H., University of California San Diego. Tmetabolic rate (BMR) across diverse habitats and taxa have The authors declare no competing interest. been researched extensively over the last century and have led to generalized rules and scaling relationships that explain many Published under the PNAS license. physiological and genetic trends observed across the tree of life. Data deposition: The whale shark whole-genome project data have been deposited at INSDC: International Nucleotide Sequence Database Collaboration (accession no. While the largest extant animals on the planet are aquatic, the QPMN00000000). The version described in this paper is version QPMN01000000. DNA impact of marine habitats on body size and other physiological sequencing reads have been uploaded to the National Center for Biotechnology Infor- and genetic characteristics is only beginning to be discovered (1). mation Sequence Read Archive (SRP155581). The C++ code used for the Markov Cluster (MCL) algorithm was uploaded to the GitHub repository (https://github.com/jsungwon/ In an effort to better understand the selective pressures imposed MCL-clustering). on body size in marine environments, studies of endothermic 1J.A.W., S.G.P., and V.L. contributed equally to this work. aquatic mammals have shown that selection for larger body sizes 2J.S.E., J.B., and G.M.C. contributed equally to this work. has been driven by the minimization of heat loss (2). In ecto- 3To whom correspondence may be addressed. Email: [email protected], thermic vertebrates, however, the relationship between envi- [email protected], or [email protected]. ronmental temperature and body size is more complex. In these This article contains supporting information online at https://www.pnas.org/lookup/suppl/ species, metabolic rate is directly dependent on temperature, doi:10.1073/pnas.1922576117/-/DCSupplemental. and decreased temperatures are correlated with decreased First published August 4, 2020. 20662–20671 | PNAS | August 25, 2020 | vol. 117 | no. 34 www.pnas.org/cgi/doi/10.1073/pnas.1922576117 Downloaded by guest on October 1, 2021 convergent with that of the baleen whales (1). Unlike the two Correlation of Physiological Characteristics with Genome Features smaller filter-feeding shark species (Cetorhinus maximus, Mega- across 85 Taxa. Body mass is intrinsically linked to physiological chasma pelagios) that inhabit colder temperate waters with in- traits such as lifespan and BMR (23). To better understand how creased prey availability, whale sharks have a cosmopolitan genomic traits correlate with physiological and ecological pa- tropical and warm subtropical distribution and have rarely been rameters such as body weight, lifespan, temperature, and meta- sighted in areas with surface temperatures less than 21 °C (7–9). bolic rate, we compared the whale shark to 83 animals and yeast However, recent global positioning system (GPS) tagging studies (SI Appendix, Tables S15 and S16) using physiological and ge- have revealed that they routinely dive to mesopelagic (200 to nomic data (Fig. 1 and SI Appendix, Figs. S3–S6 and Table S16). 1,000 m) and bathypelagic (1,000 to 4,000 m) zones to feed, Across the 85 species examined, we find a strong positive cor- facing water temperatures less than 4 °C (10). Observations of relation with significant P values between the log-transformed increased surface occupation following deeper dives have led to values for body weight and maximum lifespan (Spearman’s cor- the suggestion that thermoregulation is a primary driver for their relation coefficient ρ = 0.787, Fig. 2A and SI Appendix, Table occupation of the warmer surface waters (7, 11). Since larger S17A) and BMR (SI Appendix, Table S17A and Fig. S9A, ρ = body masses retain heat for longer periods of time, the large 0.962, exponent B = 0.688; SI Appendix, Fig. S24; n = 84 species, body mass of whale sharks may slow their cooling upon diving yeast is excluded), consistent with previous reports (23). Com- and maximize their dive times to cold depths, where food is parisons of physiological traits and genome characteristics across abundant, and could thus play a key role in metabolic regulation. these 84 animals and yeast revealed several genetic features that Body size, environmental temperature, metabolic rate, and also scaled with body weight. Among these, total gene length, generation time are all correlated with variations in evolutionary intron length, and genome size all show a moderate statistical rates (12, 13). Since many of these factors are interconnected, correlation with body mass, lifespan, and BMR (ρ = 0.4 to 0.7) modeling studies have shown that observed evolutionary rate (Fig.