bioRxiv preprint doi: https://doi.org/10.1101/829887; this version posted November 4, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Genomics of a complete butterfly continent

Jing Zhang2,*, Qian Cong3,*, Jinhui Shen2, Paul A. Opler4 and Nick V. Grishin1,2,#

1Howard Hughes Medical Institute and 2Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA; 3Institute for Protein Design and Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA; 4Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO, 80523, USA.

*These authors contributed equally to this work; #Corresponding author:
[email protected]

Never before have we had the luxury of choosing a continent, picking a large phylogenetic group of animals, and obtaining genomic data for every one of its species. Here, we sequence all 845 species of butterflies recorded from North America north of Mexico. Our comprehensive approach reveals the pattern of diversification and adaptation in this phylogenetic lineage as it spread over the continent, a pattern that cannot be seen in a sample of selected species. We observe bursts of diversification that generated the taxonomic ranks: subfamily, tribe, subtribe, genus, and species. The older burst, around 70 Mya, gave rise to the butterfly subfamilies, with the major evolutionary inventions being unique phenotypic traits shaped by strong positive selection and gene duplications. The more recent burst, around 5 Mya, is caused by explosive radiation in diverse butterfly groups and is associated with diversification in transcription and mRNA regulation, morphogenesis, and mate selection.