ARTICLE

DOI: 10.1038/s41467-018-03281-1 OPEN The genomic and functional landscapes of developmental plasticity in the American

Sheng Li1, Shiming Zhu1, Qiangqiang Jia1, Dongwei Yuan2,3, Chonghua Ren1, Kang Li1, Suning Liu1, Yingying Cui1, Haigang Zhao2,3, Yanghui Cao2, Gangqi Fang2,3, Daqi Li4, Xiaoming Zhao4, Jianzhen Zhang4, Qiaoyun Yue5, Yongliang Fan6, Xiaoqiang Yu1, Qili Feng1 & Shuai Zhan2 1234567890():,; Many cockroach have adapted to urban environments, and some have been serious pests of public health in the tropics and subtropics. Here, we present the 3.38-Gb genome and a consensus gene set of the , americana. We report insights from both genomic and functional investigations into the underlying basis of its adaptation to urban environments and developmental plasticity. In comparison with other , expansions of gene families in P. americana exist for most core gene families likely associated with environmental adaptation, such as chemoreception and detoxification. Multiple pathways regulating metamorphic development are well conserved, and RNAi experiments inform on key roles of 20-hydroxyecdysone, juvenile hormone, insulin, and decapentaplegic signals in regulating plasticity. Our analyses reveal a high level of sequence identity in genes between the American cockroach and two species, advancing it as a valuable model to study the evolutionary relationships between and .

1 Guangzhou Key Laboratory of Development Regulation and Application Research, Institute of Insect Science and Technology & School of Life Sciences, South China Normal University, Guangzhou 510631, China. 2 CAS Key Laboratory of Insect Developmental and Evolutionary Biology, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai 200032, China. 3 University of Chinese Academy of Sciences, Beijing 100049, China. 4 Research Institute of Applied Biology, Shanxi University, Taiyuan 030006, China. 5 Zhongshan Entry- Exit Inspection and Quarantine Bureau Technology Center, Zhongshan 528403, China. 6 State Key Laboratory of Crop Stress Biology for Arid Areas and Key Laboratory of Integrated Pest Management on the Loess Plateau of Ministry of Agriculture, Northwest A&F University, Yangling 712100, China. These authors contributed equally: Sheng Li, Shiming Zhu, Qiangqiang Jia, Dongwei Yuan, Chonghua Ren. Correspondence and requests for materials should be addressed to S.L. (email: [email protected]) or to S.Z. (email: [email protected])

NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1

any species of cockroaches (Insecta, ) are full-length P. americana transcriptomes. In the first version of the Mserious cosmopolitan pests of public health. The gene set (OGS1.0), we reported 21,336 protein-coding genes, 95% American cockroach, Periplaneta americana (Blatto- of which were detected to be expressed (Supplementary Table 4). dae), is one of the largest insect species that lives in close Assessment of completeness revealed the gene set is also of high proximity with humans (Supplementary Fig. 1). Despite its quality (Supplementary Table 4). Overall, P. americana encodes a common name, the American cockroach was introduced to the typical blattodean gene repertoire, of which 90% genes match United States of America from Africa in the early 16th century entries of Blattodea homologs, and 84% and 82% have homologs and has since spread throughout the world; it is famous for its in another cockroach species (the German cockroach, Blattella household name “Xiao Qiang” (a small pest with strong living germanica5) and a termite species (the dampwood termite, vitality) in China and “Waterbug” in America. The American nevadensis6), respectively (Supplementary Table 5). cockroach prefers indoor environments with access to food Features of the P. americana genome, such as the exon number sources, but can be found outdoors in moist, shady, and warm per gene or transcript length, are comparable to those of other areas. It consumes a great variety of substrates, with a particular insects, except for intron length (Supplementary Table 4). The preference for fermenting foods1. It also transfers disease-causing median length of introns predicted from the P. americana organisms such as bacteria, protozoa, and viruses; and triggers genome is 3 Kb, which is half of that of L. migratoria and greater allergic reactions and asthma in certain individuals2. The than those of other sequenced insects. This finding supports the American cockroach is thus a serious urban pest in both warm positive correlation between intron size and genome size as and humid regions. Beyond serving as a pest, this cockroach is previously described4. also important in traditional Chinese medicine, well documented in Chinese medical encyclopedia, such as Ben Cao Gang Mu and Genome evolution in Blattodea. We compared the gene reper- Shen Nong Ben Cao Jing. Moreover, its ethanol extract has been toires of 12 representative insect species (Fig. 1a; Supplementary developed as a prescribed drug (Kang Fu Xin Ye) for wound Table 6), including three sequenced blattodean species: (1) the 5 healing and tissue repair (National Drug Standard WS3-B-3674- German cockroach, B. germanica (Ectobiidae) ; (2) the damp- 2000(Z)). wood termite, Z. nevadensis (Termopsidae)6; and (3) the The American cockroach provides a potential model to study growing termite, ()7.We the biology of a hemimetabolous insect with rapid growth, identified 479 Blattodea-specific orthologs, representing metamorphosis after variable molts (6-14), high fecundity, and approximately 1000 genes in each gene set of the two cock- remarkable tissue regeneration capability1. Here, we report the roaches, P. americana and B. germanica. These two cockroach genomic resources of three common Periplaneta species focusing genomes encode the most genes across the species we analyzed. on the genome assembly, consensus gene set, and evolutionary Unlike the pea aphid (not analyzed here)8, which encodes the analyses of P. americana. We also present the functional studies largest number of genes in insects investigated to date, these and analyses of gene families and signaling pathways likely cockroach genomes encode much fewer species-specific genes involved in major aspects of environmental adaptation and (Fig. 1a). In contrast, the increase in gene number in cockroaches developmental plasticity in the American cockroach (Supple- is mainly attributed to their expansion of universal genes in the mentary Fig. 2). 12 species we analyzed (Supplementary Table 7). The gene sets of P. americana and B. germanica contained 13,555 and 10,107 multicopy universal genes, respectively, while the highest number Results and Discussion of such genes in other insects was only 7708 (L. migratoria) Genome assembly and annotation. We sequenced three mem- (Fig. 1a). bers of the Periplaneta and generated >1 Tb of data We also identified approximately 2000 single-copy universal (Supplementary Table 1). For de novo assembly, we sequenced P. genes for each species and selected 538 out of them to reconstruct americana to 295 × fold coverage (Supplementary Table 1). Two the phylogeny (Fig. 1a). The analysis showed the lineage of other sibling species, P. australasiae (the Australian cockroach) Blattodea (including Isoptera)9 is monophyletic. In this lineage, and P. fuliginosa (the smokybrown cockroach), were sequenced to the American cockroach has a closer relationship with the two approximately 40 × fold coverage for genome comparisons termites than the German cockroach. We then focused (Supplementary Table 1). The assembly of P. americana yielded comparative analyses on gene families in six available Blattodea 3.38 Gb of reference genomic sequence, with an N50 length of 333 genomes (see Methods). Compared with P. americana, the two kb (Supplementary Table 2). The assembled size is consistent sibling species, P. fuliginosa and P. australasiae, presented with a previous estimation by flow cytometry3. We assessed the approximately 88% amino acid identity to orthologous proteins. completeness and quality of the assembly using both its tran- Notably, we found that P. americana shares only 75% sequence scripts and conserved genes from a variety of resources (Sup- identity with B. germanica, which is lower than that with the two plementary Table 2 and 3). Since these sequences of all genome-sequenced termites, i.e., 79% to Z. nevadensis and 80% to independent resources could be recovered at a high coverage M. natalensis. To exclude potential biases, we further limited the (98–100%; Supplementary Table 3), the assembly is suggested to comparisons within the common aligned blocks across all be a high-quality representation of the P. americana genome. analyzed species, showing an overall higher level of sequence In comparison with the sequenced genomes of other insect identities between the American cockroach and the two termites species, the American cockroach genome is the second largest (Fig. 1b). This finding suggests that the American cockroach is after that of the locust, Locusta migratoria4. Similar to the more closely related to the termites, at least the two we analyzed, genome of L. migratoria, approximately 60% of the genome of P. than to the German cockroach. We found the German cockroach americana is composed of repetitive elements (Supplementary and the termite (Z. nevadensis) share 9633 and 9573 orthologs Table 2), supporting the hypothesis that repetitive elements drive with the American cockroach, respectively. Of the 7640 common the evolution of genome sizes4. Other genomic features of orthologs, we found approximately two-thirds of the American P. americana, such as the percentages of GC content and repetitive cockroach genes are more closely related to the termite in elements, are typical of Blattodea (Supplementary Table 2). sequence identity, while only one-third are more closely related to An official gene set was generated by integrating ab initio the German cockroach, strengthening the above finding (Fig. 1c). predictions, alignments of insect gene homologs, and evidence of Genes conserved more between the American cockroach and the

2 NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1 ARTICLE

a b Locusta migratoria B. germanica Blattella germanica Z. nevadensis

Periplaneta americana Periplaneta M. natalensis

Macrotermes natalensis within Blattodea

Zootermopsis nevadensis 70 80 90 100 % Identity with P. americana Pediculus humanus c Rhodnius prolixus 100 Universal single-copy Bemisia tabaci Universal multiple-copy Present at half of species 80 Apis mellifera Blattodea only Tribolium castaneum Patchy 60 Specific duplication Danaus plexippus Homology w/o clustering Pame and Znev % Identity between Drosophila Species-specific genes 40 melanogaster 0.05 0 10,000 20,000 30,000 # Genes 40 60 80 100 % Identity between Pame and Bger

Fig. 1 Orthology and genome evolution of Blattodea. a Orthology assignment of four blattodean species and eight other representative insect species (Supplementary Table 6). Bars are subdivided to represent different types of orthology clusters as indicated. Universal groups represent common gene families across all species analyzed, but absence in at most one genome is tolerated; of them, “single-copy” tolerates absence or duplication in a single genome, while “multiple-copy” indicates other universal genes. “Blattodea only” indicates unique presence to the blattodean lineage and in at least three genomes. “Specific” groups indicate specific presence or duplication in only one species. “Homology” indicates partial homology detected with E < 10−5 but no orthology grouped. Remaining orthologs were assigned to “Present at half of species” or “Patchy”, depending on whether presence is in at least six genomes or not. The phylogeny was calculated using maximum-likelihood analyses of a concatenated alignment of 538 exactly single-copy proteins. The tree was rooted using the sister clades of Blattodea and Orthoptera species as described previously70. Bootstrap values based on 100 replicates are equal to 100 for each node. b Distribution of sequence identities. The boxplots delineate the interval between the first and the third quartiles of the identity distribution between P. americana and one indicated blattodean species. Notch indicates the median value. Amino acid sequence identity was calculated based on multiple alignment of each universal single-copy ortholog (5911 in total) in four blattodean species. The red dashed line indicates the median value within Periplaneta species (96%). c Comparison of sequence identities on each ortholog cluster. A total of 7640 1:1:1 orthologs were identified among P. americana, B. germanica, and Z. nevadensis. Each dot represents such an ortholog. Value on x-axis indicates the sequence identity between P. americana (Pame) and B. germanica (Bger), while value on y-axis indicates that between P. americana and Z. nevadensis (Znev). Red line indicates the smoother of a locally weighted regression, with the coefficient of determination as 0.99999. Dashed line indicates positions where the sequence identity between P. americana and B. germanica is identical to that between P. americana and Z. nevadensis termites were found significantly over-represented in 29 path- Chemoreception systems provide attractive models to under- ways, including a number of classic functional components in stand how organisms adapt to environments, because they lie insects, such as development, nutrition, and immunity (Supple- between external environmental signals and internal physiological mentary Table 8). The high level of sequence identity between the responses10. Chemosensory stimuli are mainly recognized by American cockroach and the termites provides a solid relation- members of three related insect-specific chemosensory receptor ship between cockroaches and termites and enhances the families: olfactory receptors (ORs), gustatory receptors (GRs), evolutionary significance of P. americana by filling a gap during and ionotropic glutamate receptors (IRs), while the odorant- – the evolution of available blattodean resources. Instead, genes binding proteins (OBPs) bind and transfer odors to ORs11 13.We more conserved between the American cockroach and the manually annotated these gene families, and compared them German cockroach were significantly enriched in only six among cockroaches, termites, and Drosophila (Fig. 2a–c; Supple- pathways. Interestingly, two of them are related to signal mentary Figs. 3 and 4). A total of 154 ORs were found in the transduction (Supplementary Table 9), suggesting a similar role P. americana genome, while other blattodean species were found in processing environmental information between the two with only half as many ORs (Fig. 2a; Supplementary Fig. 3). These common cockroaches. expanded ORs could help P. americana to more easily detect traces (odors) of foods, especially fermenting foods, which the Environmental adaptation. Our analyses then focused on gene American cockroach prefers. Furthermore, we found 522 GRs in families likely associated with the unique biology of the American P. americana, which represents the greatest expansion of GRs in cockroach and its success in adapting to urban environments. The the insect species reported to date. Interestingly, 329 of these GRs American cockroach is an omnivorous scavenger and has adapted formed a specific clade in the phylogeny and were annotated as to human lifestyles and food sources. Adaptation to host and potential bitter receptors (Fig. 2a, b). Being able to identify bitter environment is mainly mediated by chemical communication and tastes is generally considered as a self-protection system to subsequent abilities to tolerate chemical and biological factors, tolerate bitter and toxic foods14, and expansion of bitter receptors such as toxins or pathogens. We therefore began analyses of has been observed in insect herbivores with adaptation to a great signaling pathways that are involved in chemoreception, detox- number of plant secondary metabolites15. The massive expansion ification, and immunity (Figs. 2 and 3), which include expanded of bitter receptors in P. americana may not only explain how this gene families relative to other insect species by automated ana- omnivorous and opportunistic species is able to adapt to diverse lyses (Supplementary Tables 7 and 10). diets in a range of environments, but also enhance the potential

NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications 3 . 7

3 NMDA PameIR 500

DmGr63a-PB ZnevIR 110 BgerIR507 PaGR116 PameIR542 PaP450-29

ZnevGR07 BgerIR49 DmNmdar2-PD ZnevGR48 DmNmdar2-PF ZnP450 315a1 2 DmNmdar2-PE ZnevGR09 DmNmdar2-PC ZnevGR40 DmNmdar2-PB DmNmdar2-PA Dmsad-P A ZnevGR41 DmNmdar2-PG DmNmdar1-PA ZnevGR08 BgerIR 191 PaP450-210 ZnevGR44 BgerIR 167 BgerIR 166 ZnevGR45 ZnevIR45 ZnP450 49a1PA PameIR 293 ZnevGR06 ZnevIR98 ZnevGR47 PameIR 639 BgerIR 108 PaP450-71 PaGR402 DmGluRIID-PA PaGR237 DmGluRIIE-PB DmCG5621-PC BgP450-99 ZnevGR24 DmCG5621-PA DmCG5621-PD ZnevGR21 PameIR291 ZnP450 CYP44 BgerIR 545 BgerGR109 ZnevIR 6 BgerGR24 BgerIR 544 PameIR 331 PaP450-7 ZnevGR25 PameIR 545 ZnevGR26 BgerIR462 PameIR221 ZnP450 302a1

ZnevGR22 ZnevIR 4 Dmclumsy-PB PaGR380 Dmclumsy-PA PaP450-211

DmGr21a-PB DmCG3822-PA BgerIR546 DmGr21a-PA ZnevIR5 PameIR 135 Dmdib-P A BgerGR110 PameIR 517 ZnevGR20 CO DmGluRIIC-PD DmGluRIIB-PI PaP450-229 PaGR260 DmGluRIIA-PI ZnevGR85 DmGluRIA-PA ZnevIR 40 BgP450-107 PaGR239 BgerIR370 PameIR245 ZnevGR43 DmGluRIB-PA DmGluRIB-PB Dmshd-P C ZnevGR42 DmGluRIB-PC ZnevGR60 DmGluRIB-PD DmCG11155-PB Dmshd-P D PaGR238 DmEk ar-PD BgerGR147 DmEkar-PB DmEkar-PA Dmshd-P B PaGR423 ZnevIR90 PameIR 87 BgerGR146 BgerIR 195 DmCG11155-PD DmCyp12b2-PA ZnevGR63 DmCG11155-PA PaGR452 PameIR 120 ZnevIR85 DmCyp12d1-p-PB DmGr64a-PA BgerIR360 DmGr64b-PA DmIr8a-PA PameIR114 DmCyp12d1-p-PA DmGr64d-PA BgerIR 604 PameIR113 DmGr64d-PC ZnevIR 35 ZnevIR11 DmCyp12d1-d-PA DmGr64c-PB BgerIR 2 DmGr61a-PA PameIR49 DmIr25a-PE DmCyp12e1-P C DmGr61a-PB DmIr25a-PC DmGr5a-PA DmIr25a-PD DmIr25a-PB DmCyp12c1-P B DmGr64f-PA ZnevIR 101 ZnevIR39 iGluRs DmGr64e-PA DmIr93a-PD BgerIR5 DmCyp12c1-P A DmGr64e-PB PameIR 323 PaGR492 ZnevIR 89 ZnevIR 96 DmCyp12a4-PA PaGR493 PameIR295 BgerGR196 DmIr87a-PB DmIr60a-PA DmCyp12a5-PA BgerGR199 DmIr75d-PC DmIr84a-PB BgerGR195 DmIr64a-PA ZnevGR75 PameIR76 PaP450-212 PameIR77 BgerGR198 BgerIR 114 BgerIR50 BgP450-73 ZnevGR92 PameIR549 PaGR276 ZnevIR 93 PameIR 125 BgP450-102 BgerGR36 ZnevIR 97 BgerIR 531 ZnevGR93 BgerIR 241 PaGR253 PameIR 144 BgP450-103 ZnevIR44 ZnevGR82 PameIR257 BgerIR78 PaP450-122 BgerGR238 ZnevIR91 PaGR170 ZnevIR92 PameIR 548 PaP450-121 BgerGR235 PameIR130 PameIR 131 BgerGR234 ZnevIR 94 ZnevGR83 ZnevIR 95 ZnP450 301a1PC BgerIR 113 PaGR290 PameIR 78 PameIR79 BgP450-105 PaGR169 PameIR 551 PameIR15 BgerGR237 -transferase, ABC PameIR550 PaP450-120 ZnevGR74 PameIR 515 ZnevIR 114 PaGR262 BgerIR133

ZnevGR84 BgerIR 134 PaP450-119 BgerIR203 PaGR118 BgerIR 205

BgerIR342 ZnP450 49a1PB ZnevGR50 BgerIR 344 BgerGR41 BgerIR 345 BgerIR357 BgerIR358 DmCyp49a1-P C BgerGR76 S Sugar BgerIR 356 BgerGR99 BgerIR38 BgerGR75 PameIR 626 DmCyp49a1-P D ZnevIR 104 BgerGR10 PameIR627 ZnevIR 124 DmCyp49a1-PE BgerGR72 ZnevIR 105 BgerGR73 PameIR625 BgerIR355 BgerGR101 PameIR 572 DmCyp49a1-PA ZnevIR48 BgerGR98 BgerIR 340 BgerGR100 BgerIR 235 DmCyp301a1- PA ZnevIR 47 BgerGR231 PameIR 569 ZnevIR 49 PaP450-104 BgerGR184 PameIR571 ZnevGR72 BgerIR210 PameIR150 PaGR482 PameIR 387 BgP450-38 BgerGR54 ZnevIR 51 PameIR 570 PaGR76 BgerIR 209 BgP450-98 BgerIR175 PaGR476 PameIR 568 PaGR119 ZnevIR 50 PaP450-116 DmIr31a-PC PaGR474 DmIr31a-PD DmIr75a-PB PaGR254 DmIr75c-PA ZnP450 301a1PA PaGR475 DmIr75b-PA PameIR205 PaGR497 ZnevIR 26 PaP450-225 ZnevIR 2 PaGR54 PameIR 493 PaGR50 ZnevIR 69 BgP450-27 PameIR 1 PaGR500 PameIR 617 PameIR 83 PaGR51 PameIR82 Dmspo-P C PaGR49 PameIR 334 PameIR 9 PaGR92 ZnevIR55 Dmspo-P A PaGR52 PameIR 304 PameIR 380 PaGR91 PameIR 132 ZnP450 307a1 PameIR 133 PaGR53 PameIR 381 BgerGR240 PameIR 378 PameIR 379 PaP450-82 BgerGR229 ZnevIR 57 PameIR 2 BgerGR239 BgerIR 327 BgP450-131 BgerGR78 PameIR 176 PameIR 482 BgerGR228 ZnevIR58 BgP450-62 ZnevIR145 PaGR03 PameIR 399 PaGR501 ZnevIR 56 PameIR483 Dmphm-P B PaGR287 PameIR 484 ZnevIR 54 PaGR289 PameIR 290 Dmphm-P A PaGR288 BgerIR51 PameIR597 BgerGR102 PameIR 45 PaP450-206 PameIR 61 PaGR286 PameIR 62 PaGR498 PameIR 22 PameIR369 BgP450-121 ZnevGR51 PameIR 481 BgerIR481 PaGR285 BgerIR482 ZnP450 306a1 PaGR125 BgerIR 588 BgerIR 589 PaGR126 PameIR 315 PaP450-165 PameIR 541 PaGR123 ZnevIR 134 PaGR124 BgerIR 55 BgerIR54 ZnP450 18a1 PaGR127 BgerIR 319 PaGR284 BgerIR436 BgerIR 437 DmCyp18a1-PA PaGR282 BgerIR 418 PameIR 394 PaGR283 PameIR615 DmCyp18a1-PB BgerIR215 PaGR221 PameIR 136 PaGR220 PameIR 283 ZnevIR72 PaP450-88 PaGR222 PameIR 499 PaGR223 PameIR 485 PameIR127 BgP450-53 PaGR481 PameIR 613 PameIR203 ZnevGR34 PameIR 143 ZnP450 304a1PB PameIR140 BgerGR166 PameIR 138 PaGR536 ZnevIR 80 PameIR 487 PaP450-137 BgerGR165 PameIR 142 PaGR235 ZnevIR 65 PameIR 324 PaP450-16 PaGR488 BgerIR 302 PaGR487 BgerIR303 BgerIR548 PaP450-136 BgerGR37 PameIR40 PameIR 41 BgerGR206 PameIR 42 PameIR233 BgP450-33 PaGR530 ZnevIR 75 ZnevGR18 PameIR 511 BgerIR46 BgP450-35 BgerGR52 BgerIR 485 PameIR 145 PaGR510 ZnevIR 126 BgP450-34 BgerGR08 PameIR84 BgerIR 115 ZnevGR16 PameIR330 BgerIR 585 PaP450-1 BgerGR175 ZnevIR 36 PaGR345 BgerIR 584 BgerIR 568 PaP450-144 PaGR357 BgerIR 149 PaGR179 BgerIR82 PameIR 50 DmCyp304a1-PA BgerGR241 ZnevIR123 PameIR 137 PaGR133 ZnevIR 66 PameIR 352 PaP450-98 PaGR486 Zn evIR 67 ZnevGR62 PameIR 584 BgerIR 487 PaP450-221 ZnevGR61 PameIR 128 PaGR516 PameIR 325 ZnevIR 68 ZnP450 303a1 PaGR181 ZnevIR76 PameIR 620 PaGR499 ZnevIR 73 ZnevIR 74 BgP450-108 PaGR515 PameIR 239 PaGR343 PameIR244 PameIR 292 PaP450-201 PaGR531 PameIR 240 PameIR 219 PaGR485 ZnevIR71

P. americana B. germanica Z. nevadensis D. melanogaster DmCyp303a1-PA BgerGR121 BgerIR 382 PameIR 268 PaGR484 PameIR 299 PameIR351 DmCyp303a1-PB PaGR532 PameIR565 BgerGR141 ZnevIR 115 BgerIR 183 PaP450-90 BgerGR185 BgerIR184 ZnevGR49 ZnevIR64 PameIR 560 ZnP450 305a1PC BgerGR156 PameIR116 PaGR241 ZnevIR 122 PameIR 195 PaP450-89 BgerGR168 BgerIR590 BgerIR 176 PaGR483 PameIR 514 ZnevIR3 DmCyp305a1-PA www.nature.com/naturecommunications PaGR88 BgerIR 34 BgerGR119 BgerIR 532 PameIR85 BgP450-49 PaGR93 ZnevIR 78 PaGR89 PameIR 478 PameIR 423 PaP450-232 BgerGR02 BgerIR 533 ZnevIR106 BgerGR03 PameIR86

PameIR 486 ZnP450 2J6 | BgerGR04 ZnevIR 10 BgerGR05 PameIR335 ZnevIR 140 ZnP450 2J2 PaGR490 ZnevIR 22 PameIR 39 ZnevGR54 BgerIR350 ZnevGR55 PameIR 453 ZnP450 2J3 ZnevIR139 PaGR154 ZnevIR 138 PaGR233 ZnevIR128 PaP450-26 PameIR25 Counts of chemosensory- and ZnevGR56 PameIR259 PameIR260 PaP450-10 ZnevGR57 ZnevIR125 PameIR 322 PaGR252 BgerIR 534 BgerGR142 PameIR374 PaP450-11

PameIR 618 Antennal IRs BgerGR176 BgerIR 118 BgerIR249 PaP450-36 PaGR55 BgerIR529 ZnevGR17 ZnevIR84 BgerIR117 PaP450-8 PaGR527 PameIR 635 ZnevIR 141 BgerGR201 PameIR294 BgerGR202 ZnevIR133 PaP450-9 BgerIR 112 PaGR528 PameIR 353 ZnevIR 146 PaP450-27 PaGR526 PameIR200 a PaGR42 DmIr7f-PA DmIr7d-PA PaP450-28 PaGR43 DmIr7c-PA DmIr7c-PB PaGR80 DmIr7e-PA ZnevGR52 DmIr7g-PA DmCyp311a1-PB DmIr7b-PB PaGR351 DmIr7a-PA DmIr11a-PB DmCyp318a1-PA BgerGR22 DmIr92a-PC BgerGR155 PameIR 288 PameIR411 BgerGR154 ZnevIR 62 DmCyp313b1-PA PaGR82 BgerIR 168 DmIr68a-PA BgerGR153 DmIr68a-PB DmCyp313b1-PB PameIR 64 PaGR81 PameIR65 BgerGR115 ZnevIR 87 DmCyp313a4-PA BgerIR20 ZnevGR02 DmIr21a-PB BgerIR 430 PaGR278 BgerIR431 DmCyp313a1-PA BgerGR173 BgerIR 432 BgerIR 428 BgerGR214 BgerIR 429 DmCyp313a5-PA BgerIR 427 BgerGR50 ZnevIR129 BgerGR211 ZnevIR 130 DmCyp313a3-PA PameIR 109 BgerGR49 BgerIR 433 PameIR 10 BgerGR178 BgerIR 619 DmCyp313a2-PB BgerGR212 PameIR 452 ZnevIR 28 BgerGR213 PameIR146 PaP450-6 PaGR215 BgerIR 18 BgerIR 598 PaGR210 PameIR 628 PaP450-161 ZnevIR 135 PaGR216 PameIR 454 PaGR217 BgerIR 318 DmIr76a-PE PaP450-77 PaGR213 DmIr76a-PA DmIr76a-PB PaGR214 DmIr41a-PC PaP450-192 PaGR212 PameIR 341 PameIR 342 PaGR128 PameIR 3 PaP450-92 PameIR309 PaGR129 BgerIR67 PaGR219 BgerIR66 ZnevIR 132 PaP450-93 BgerGR179 PameIR592 PameIR 344 BgerGR67 PameIR591 BgP450-65 BgerGR88 PameIR208 PameIR 209 BgerGR172 BgerIR 386 PaP450-185 PameIR 207 BgerGR171 PameIR 236 PaGR381 ZnevIR 53 BgerIR182 BgP450-50 ZnevGR73 BgerIR 180 BgerIR181 BgerGR116 PameIR 148 PaP450-25 ZnevGR19 ZnevIR 136 BgerIR 495 BgerGR127 PameIR 246 BgP450-128 PameIR 467 BgerGR128 ZnevIR 12 BgerGR124 BgerIR313 BgerIR566 ZnP450 4C1PP BgerGR51 BgerIR 582 BgerIR 583 BgerGR84 PameIR398 BgP450-39 BgerGR85 BgerIR 157 BgerIR158

BgerGR86 BgerIR159 PaP450-51 . OR olfactory receptor, GR gustatory receptor, ZnevIR 42 BgerGR70 BgerIR156 BgerGR68 BgerIR161 PameIR 516 PaP450-3 BgerGR69 BgerIR160 BgerGR227 PameIR 424 ZnevIR 41 BgP450-58 ZnevGR53 DmIr56a-PA DmIr40a-PF PaGR539 DmIr60e-PD BgP450-57 DmIr67c-PA PaGR547 DmIr67b-PA PaGR543 DmIr48b-PA DmIr48c-PB ZnP450 4c3PA PaGR541 DmIr56c-PA PaGR548 DmIr56d-PA DmIr62a-PA BgP450-56 PaGR546 DmIr51b-PA DmIr56b-PA PaGR545 DmIr94b-PB BgP450-55 PaGR544 DmIr94c-PB DmIr94a-PA BgerGR204 DmIr47a-PA DmIr67a-PB PaP450-57 BgerGR205 DmIr94f-PB BgerGR16 DmIr54a-PA DmIr20a-PA PaP450-32 BgerGR17 DmIr94h-PB BgerGR18 DmIr94g-PB DmIr60d-PB BgP450-91 BgerGR20 DmIr60b-PA DmIr94e-PA BgerGR27 DmIr94d-PA DmIr52d-PB DmCyp316a1-PB BgerGR144 DmIr52c-PA BgerGR145 DmIr52b-PB DmIr52a-PB DmCyp4aa1-PB BgerGR136 DmIr85a-PA BgerGR135 DmIr68b-PA BgerIR240 ZnP450 4aa1 BgerGR134 DmIr100a-PA DmIr10a-PC BgerGR104 PameIR20 ZnevIR 59 PaP450-146 BgerGR83 BgerIR 218 BgerGR44 PameIR 436 BgerIR309 PaP450-145 BgerGR46 BgerIR 475 BgerGR117 PameIR 253 ZnevIR 21 DmCyp4p3-PA BgerGR118 PameIR 94 PameIR605 ZnevGR39 BgerIR 47 PameIR 465 DmCyp4p1-PA PaGR401 ZnevIR 25 PaGR400 BgerIR 569 PameIR 266 DmCyp4p2-PA PaGR78 BgerIR53 PaGR399 PameIR 230 ZnevIR 24 DmCyp4ac3-PA PaGR397 PameIR 495 BgerIR 7 PaGR398 PameIR 104 ZnevIR 46 DmCyp4ac2-PB PaGR395 PameIR 474 PaGR396 PameIR 223 ZnevIR143 DmCyp4ac1-PA BgerGR132 PameIR 222 BgerGR131 PameIR 225 BgerIR371 DmCyp4ad1-PA PaGR383 BgerIR 372 PameIR224 PaGR412 BgerIR 530 PaGR408 ZnevIR 144 DmCyp312a1-PB PameIR182 PaGR411 PameIR184 PameIR 185 DmCyp312a1-PA PaGR415 PameIR188 PaGR409 PameIR 183 PameIR 63 DmCyp4d1-PA PaGR417 PameIR 186 PameIR 187 PaGR416 BgerIR 476 PaGR407 BgerIR 90 DmCyp4d1-PB BgerIR 15 PaGR406 BgerIR 528 BgerIR 244 DmCyp4e2-PB PaGR413 BgerIR130 PaGR414 BgerIR129 BgerIR 140 DmCyp4e2-PA PaGR405 BgerIR242 BgerIR 141 PaGR404 BgerIR274 PaGR410 BgerIR 277 DmCyp4e1-PA BgerIR 276 PaGR403 BgerIR 275 BgerIR 497 DmCyp4e3-PA PaGR418 BgerIR3 PaGR94 BgerIR 602 BgerIR 127 BgerGR157 BgerIR 155 DmCyp4ae1-PA BgerIR4 BgerGR55 BgerIR 486 BgerGR160 BgerIR 45 DmCyp4d8-PA BgerIR 349 BgerGR159 BgerIR496 BgerIR527 DmCyp4d20-PA BgerGR143 BgerIR 576 ZnevGR58 BgerIR 126 BgerIR 577 PaGR529 BgerIR 348 DmCyp4d21-PA ZnevGR32 BgerIR 25 BgerIR148 BgerGR244 BgerIR 471 DmCyp4d14-PB BgerIR 116 BgerGR48 BgerIR63 PaGR95 BgerIR580 DmCyp4d14-PA BgerIR 338 PaGR257 BgerIR 105 BgerIR 101 PaGR185 BgerIR 102 DmCyp4d2-PB PaGR36 BgerIR104 BgerIR 103 PaGR186 BgerIR 100 DmCyp4d2-PA BgerIR 14 PaGR183 BgerIR611 PaGR184 BgerIR 609 DmCyp4s3-PA BgerIR612 PaGR298 BgerIR 610 BgerIR419 PaGR318 BgerIR 262 PaP450-222 PaGR250 BgerIR 400 BgerIR 257 PaGR244 BgerIR 464 PaP450-94 BgerIR311 PaGR07 BgerIR 312 PaGR243 BgerIR87 PaP450-182 BgerIR 314 PaGR06 BgerIR608 BgerIR 607 PaGR05 BgerIR 299 PaP450-84 PaGR04 BgerIR 300

BgerIR 524 were selected for maximum-likelihood phylogenetic analysis, PaGR461 BgerIR 575 PaP450-199 BgerIR 414 PaGR168 BgerIR 552 PaGR465 BgerIR326 PaP450-60 BgerIR 551 PaGR520 BgerIR 578 BgerIR150 PaGR466 BgerIR 56 PaP450-202 PaGR519 BgerIR57 BgerIR 92 PaGR153 BgerIR393 PaP450-142 BgerIR 392 PaGR304 BgerIR 390 PaGR292 BgerIR 389 PaP450-216 BgerIR 294 PaGR293 BgerIR293 PaGR291 BgerIR 373 BgerIR 131 PaP450-224 PaGR303 BgerIR324 BgerIR 119 PaGR47 BgerIR 325 PaP450-223 PaGR301 BgerIR 98 BgerIR 97 PaGR309 BgerIR 385 PaP450-148 BgerIR 65 PaGR308 BgerIR 626 PaGR302 BgerIR625 BgerIR624 ZnP450 4C1PI PaGR307 BgerIR 273 BgerIR 272 PaGR305 BgerIR 271 PaP450-147 PaGR306 BgerIR 623 BgerIR 622 PaGR455 BgerIR 621 PaP450-69 BgerIR 501 PaGR459 BgerIR 172 PaGR456 BgerIR490 BgerIR 601 PaP450-228 PaGR463 BgerIR 213 BgerIR171 PaGR464 BgerIR 170 ZnP450 4C1PB PaGR167 BgerIR600 BgerIR43 PaGR462 BgerIR 173 PaP450-99 BgerIR174 PaGR460 BgerIR 500 PaGR458 BgerIR 570 PaP450-70 BgerIR 380 PaGR473 BgerIR 379 BgerIR381 PaGR472 BgerIR 606 BgP450-28 BgerIR515 PaGR313 DOI: 10.1038/s41467-018-03281-1 BgerIR 514 PaGR471 BgerIR 221 ZnP450 4C1PA BgerIR 220 PaGR297 BgerIR 266 PaGR296 BgerIR 217 BgP450-127 BgerIR 216 PaGR295 BgerIR383 BgerIR 154 PaGR392 BgerIR506 BgP450-29 BgerIR 505

PaGR294 | BgerIR 504 PaGR469 BgerIR 238 PaP450-149 BgerIR 239 PaGR314 BgerIR 237 PaGR525 BgerIR 8 ZnP450 4g15PB BgerIR 635 PaGR131 BgerIR636 PaGR467 BgerIR 446 BgerIR 22 DmCyp4g1-PA PaGR130 BgerIR 599 PameIR 7 PaGR468 PameIR 220 DmCyp4g15-PC PaGR312 PameIR6 PameIR 216 PaGR317 PameIR 218 DmCyp4g15-PA PameIR 383 NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1 PaGR470 PameIR 377 PaGR320 PameIR 375 PameIR376 BgP450-114 PaGR319 PameIR 296 PameIR8 PaGR384 PameIR401 ZnP450 4g15PA PaGR310 PameIR403 PameIR5 PaGR454 PameIR 556 PameIR 402 PaP450-193 PaGR321 PameIR 557 and other blattodean species. PaGR315 PameIR 558 PameIR 4 PaP450-226

PaGR246 PameIR 235 PameIR 297 PaGR247 PameIR 298 BgP450-82 PaGR311 PameIR317 PameIR 274 PaGR388 PameIR319 PameIR 275 BgP450-86 PaGR248 PameIR318 PaGR385 PameIR 506 PameIR 316 PaP450-58 PaGR387 PameIR 505 PaGR245 PameIR 12 PameIR 258 ZnP450 4c3PD PaGR225 ZnevIR 43 BgerIR 413 PaGR394 BgerIR615 BgerIR 613 PaP450-156 PaGR386 PameIR 57 PaGR389 PameIR 58 PameIR 55 ZnP450 4C1PJ PaGR249 PameIR 56 PaGR111 BgerIR196 BgerIR 231 BgP450-130 PaGR112 BgerIR228 BgerIR229 PaGR299 BgerIR230 BgerIR 225 PaP450-152 PaGR300 BgerIR226 PaGR01 BgerIR 227 BgerIR 99 PaP450-95 PaGR02 PameIR 448 PaGR21 BgerIR 187 BgerIR520 PaP450-96 PaGR362 BgerIR 516 PaGR363 BgerIR517 BgerIR 518 PaP450-103 PaGR364 BgerIR 536 BgerIR 519 PaGR201 BgerIR 579 P. americana BgerIR 332 PaP450-198 PaGR366 BgerIR 333 PaGR369 BgerIR 573 BgerIR 574 PaP450-80 PaGR198 BgerIR58 PaGR373 BgerIR59 BgerIR469 PaP450-79 PaGR194 BgerIR 443 BgerIR444 PaGR368 BgerIR445 PameIR 262 PaP450-74 PaGR200 PameIR 263 PaGR367 PameIR 264 PameIR 54 BgP450-81 PaGR371 PameIR 261 PaGR196 PameIR16 PameIR 18 BgP450-93 PaGR197 PameIR19 PameIR265 PaGR365 PameIR573 PameIR277 BgP450-95 PaGR202 PameIR 24 PaGR193 PameIR 449 PameIR 46 BgP450-51 PaGR374 PameIR 37 PaGR370 PameIR 155 PameIR 106 PaGR192 PameIR 107 PaP450-209 PameIR 105 PaGR375 BgerIR412 PaGR182 PameIR 616 ZnP450 4c3PB PameIR619 PaGR236 BgerIR27 PameIR 156 DmCyp4c3-PA PaGR191 BgerIR39 PaGR376 BgerIR388 BgerIR 40 PaGR188 BgerIR41 PaP450-72 BgerIR178 PaGR203 BgerIR 179 PaGR157 BgerIR454 PaP450-18 BgerIR 44 PaGR206 BgerIR329 BgerIR 330 PaP450-183 PaGR347 BgerIR 406 PaGR156 BgerIR 407 BgerIR 331 PaGR190 BgerIR 153 PaP450-227 BgerIR328 PaGR208 BgerIR351 PaGR158 BgerIR395 PaP450-33 BgerIR 135 PaGR209 BgerIR75 BgerIR434 PaP450-87 PaGR204 BgerIR 629 PaGR205 BgerIR336 BgerIR 424 PaGR159 BgerIR259 PaP450-166 BgerIR88 PaGR346 BgerIR 509 PaGR207 BgerIR258 ZnP450 4C1PN BgerIR 510 PaGR187 BgerIR 630 BgerIR 165 PaP450-66 PaGR189 BgerIR 594 PaGR166 BgerIR 335 BgerIR 596 P450 PaGR155 BgerIR164 PaP450-175 BgerIR 337 PaGR165 BgerIR423 PaGR83 BgerIR 595 PaP450-65 BgerIR 513 PaGR378 BgerIR 631 BgerIR416 PaP450-176 PaGR379 BgerIR 511 PaGR504 BgerIR89 BgerIR512 PaGR75 BgerIR188 PaP450-138 PaGR506 BgerIR189 BgerIR206 PaGR256 BgerIR 96 PaP450-139 BgerIR473 PaGR508 BgerIR207 PaGR507 BgerIR 208 ZnP450 4C1PQ PameIR391 PaGR79 PameIR 89 BgerIR84 PaGR505 PameIR 491 BgP450-8 PaGR509 PameIR 420 PameIR 269 PaGR74 PameIR 501 ZnP450 4C1PM PameIR 523 P. americana PaGR338 PameIR 442 PaGR340 BgerIR 463 BgerIR465 ZnP450 4C1PG PaGR339 BgerIR 466 BgerIR 420 PaGR323 ZnevIR38 PaP450-91 PaGR322 ZnevIR33 BgerIR 489 PaGR341 PameIR 640 PaP450-107 ZnevIR137 PaGR115 BgerIR85 PaGR494 PameIR 502 PameIR 429 PaP450-123 PaGR359 BgerIR 417 PameIR 206 PaGR495 PameIR 44 PaP450-157 PaGR358 ZnevIR70 PameIR 226 PaGR361 PameIR 477 PaP450-108 PameIR590 PaGR360 PameIR 134 PaGR326 PameIR 243 PameIR278 PaP450-100 PaGR103 ZnevIR 107 PameIR 13 PaGR12 BgerIR617 DmCyp317a1-PC PaGR334 BgerIR618 BgerIR232 PaGR147 BgerIR 397 DmCyp317a1-PB BgerIR 398 PaGR72 BgerIR399 PaGR45 BgerIR 121 BgerIR 122 DmCyp317a1-PA PaGR144 BgerIR 368 BgerIR 369 PaGR67 PameIR416 DmCyp6u1-PA PaGR96 PameIR 161 PameIR 158 PaGR16 PameIR159 DmCyp6u1-PC PameIR 160 PaGR69 BgerIR 408 PaGR59 BgerIR 64 BgerIR248 DmCyp6u1-PB PaGR17 BgerIR 245 PaGR61 BgerIR 565 BgerIR 564 DmCyp310a1-PA PaGR23 BgerIR562 BgerIR563 PaGR143 BgerIR 247 ZnP450 reductase PaGR60 BgerIR 561 BgerIR560 PaGR68 PameIR 526 PameIR232 PaP450-78 PaGR08 PameIR 555 PaGR137 PameIR 432 ZnevIR60 DmCyp308a1-PA

PaGR25 Bitter GRs BgerIR508 PameIR 368 PaGR87 PameIR367 DmCyp309a1-PC PaGR524 PameIR 289 PameIR 29 PaGR64 PameIR92 PameIR 276 DmCyp309a1-PB PaGR86 PameIR 451 PaGR135 PameIR608 PameIR 247 DmCyp309a2-PA PaGR97 PameIR 362 PaGR19 PameIR14 PameIR 473 DmCyp28c1-PB PaGR141 PameIR 612 PameIR443 PaGR138 PameIR 395 PameIR 26 DmCyp28c1-PA PaGR65 PameIR 248 PaGR523 PameIR 363 PameIR282 DmCyp28a5-PA PaGR136 PameIR 385 PaGR26 PameIR 490 PameIR 384 DmCyp28d1-PA PaGR140 PameIR 404 PameIR 509 PaGR62 PameIR510 PameIR 173 DmCyp28d2-PA PaGR24 BgerIR374 PaGR139 PameIR 340 PameIR397 PaP450-141 PaGR63 PameIR 609 PaGR85 PameIR610 PameIR469 PaP450-110 PaGR18 PameIR 470 PameIR 267 PaGR73 PameIR 409 PameIR 281 PaP450-109 PaGR522 PameIR 408 PaGR255 PameIR280

PameIR 419 BgP450-72 cation in PaGR393 PameIR 406 PaGR173 PameIR 310 PameIR 631 ZnP450 9e2PC PaGR377 PameIR 312 PameIR 313 Gustatory receptors PaGR171 PameIR 632 BgP450-71 PaGR172 PameIR 633 PameIR 311 PaGR28 PameIR194 PameIR 308 PaP450-189

PaGR29 Ionotropic receptors PameIR 307 PaGR11 PameIR 305 PameIR306 BgP450-69 fi PaGR175 PameIR 466 PameIR 388 PaGR22 PameIR 212 PaGR146 PameIR193 BgP450-68 PameIR 389 PaGR57 PameIR390 PameIR 214 BgP450-67 PaGR71 PameIR 215 PaGR31 PameIR 302 PameIR 303 BgP450-70 PaGR66 PameIR 189 PameIR 440 PaGR113 PameIR 300

PaGR09 PameIR 190 ZnP450 9e2PB -aspartate receptors, Mito the mitochondrial clan PameIR439 PaGR114 PameIR 301 PaGR46 PameIR 437 PaP450-48 PameIR438 PaGR13 BgerIR593 BgerIR 550 PaGR30 BgerIR 106 PaP450-49 PaGR335 BgerIR470 BgerIR499 PaGR419 BgerIR 391 DmCyp6d2-PA BgerIR169

PaGR14 BgerIR 222 D PaGR70 BgerIR297 DmCyp6d5-PA BgerIR298 PaGR58 BgerIR 260 BgerIR77 PaGR145 BgerIR 295 DmCyp6d4-PA PaGR15 BgerIR 1 BgerIR296 PaGR521 BgerIR553 DmCyp6a14-PE BgerIR212 (2018) 9:1008 PaGR174 BgerIR 586 PaGR10 BgerIR 587 DmCyp6a14-PC BgerIR201 PaGR48 BgerIR 204 BgerIR 200 PaGR325 BgerIR 37 DmCyp6a13-PA PaGR100 BgerIR364 BgerIR363 PaGR344 BgerIR 202 DmCyp6a16-PC BgerIR343 PaGR342 BgerIR 346 PaGR27 BgerIR 347 DmCyp6a16-PB

BgerIR29 | PaGR110 BgerIR 633 BgerIR 362 PaGR56 BgerIR472 DmCyp6a2-PA PaGR391 BgerIR 502 BgerIR 503 PaGR354 BgerIR 151 DmCyp6a21-PA BgerIR632 PaGR355 BgerIR 404 PaGR353 BgerIR 405 DmCyp6a9-PB BgerIR442 PaGR34 BgerIR 440 BgerIR 265 PaGR272 BgerIR 439 DmCyp6a8-PA PaGR162 BgerIR425 BgerIR 461 PaGR163 BgerIR94 DmCyp6a18-PA BgerIR409 PaGR271 BgerIR 365 PaGR161 BgerIR143 DmCyp6a18-PB BgerIR460 PaGR270 BgerIR 522 BgerIR 95 PaGR160 BgerIR 478 DmCyp6a22-PB PaGR268 BgerIR321 BgerIR 93 PaGR269 BgerIR 147 DmCyp6a22-PA PaGR164 BgerIR521 BgerIR 193 PaGR33 BgerIR 441 DmCyp6a22-PC BgerIR 366 PaGR32 BgerIR 322 PaGR273 BgerIR 194 DmCyp6a20-PB BgerIR 323 PaGR274 BgerIR 144 BgerIR 477 PaGR35 BgerIR426 DmCyp6a19-PA PaGR275 BgerIR 367 BgerIR 378 PaGR263 BgerIR 146 DmCyp6a19-PB BgerIR 411 PaGR435 BgerIR 447 PaGR109 BgerIR 320 DmCyp6a23-PA BgerIR 252 PaGR106 BgerIR83 PameIR 34 PaGR107 PameIR 339 DmCyp6a17-PB PaGR101 PameIR595 BgerIR21 PaGR108 PameIR 28 DmCyp6a17-PA ZnevIR 108 PaGR102 BgerIR76 PaGR337 BgerIR236 ZnP450 6a14 PameIR636 PaGR324 PameIR 358 BgerIR422 PaGR105 PameIR400 PaP450-37 PaGR98 PameIR 177 BgerIR421 PaGR99 PameIR70 BgP450-66 ZnevIR 81 PaGR336 ZnevIR 23 PaGR444 ZnevIR 19 BgP450-101 BgerIR 603 PaGR329 BgerIR 554

BgerIR 132 -methyl- PaGR333 PameIR 622 PaP450-111 PaGR230 PameIR 414 PameIR 623 Divergent IRs PaGR259 PameIR 624 BgP450-79 ZnevIR 147 PaGR226 PameIR97 PaGR330 PameIR 96 BgP450-78 PameIR 167 PaGR331 PameIR67 PameIR 166 PaGR443 PameIR169 BgP450-2 PaGR227 PameIR 168 PameIR 359 PaGR229 PameIR 547 ZnP450 6k1PE PaGR120 PameIR 361 PameIR 360 N PaGR445 PameIR 204 BgP450-32 PameIR 599 PaGR228 PameIR 59 PaGR121 PameIR122 PameIR 123 PaP450-230 PaGR258 PameIR 546 PameIR 198 PaGR232 PameIR 197 ZnP450 6k1PC PaGR231 PameIR 121 PameIR 496 PaGR332 PameIR621 PaP450-125 PameIR 508 PaGR122 BgerIR559 PaGR266 ZnevIR 131 PameIR 507 PaP450-86 PaGR267 ZnevIR 148 PameIR 27 PaGR264 BgerIR 301 ZnP450 6a13PA PaGR265 PameIR 332 PameIR 147 PaGR240 ZnevIR102 BgP450-1 BgerIR334 PaGR440 ZnevIR20 PaGR439 PameIR 426 PameIR 52 PaP450-173 PaGR442 ZnevIR 103 BgerIR339 PaGR438 PameIR 630 PaP450-40 PaGR436 PameIR 567 ZnevIR 15 PaGR429 ZnevIR112 PaP450-41 ZnevIR 30 PaGR425 BgerIR438 PaGR427 ZnevIR37 ZnevIR 117 PaP450-39 PaGR428 PameIR 441 PaGR450 BgerIR317 PameIR 111 BgP450-115 PaGR448 PameIR 199 PameIR 471 PaGR430 ZnevIR 77 PaP450-124 BgerIR 547 PaGR446 BgerIR310 PaGR431 BgerIR359 PameIR 191 PaP450-128 PaGR441 PameIR 227 PaGR437 PameIR 574 PameIR 575 PaP450-126 PaGR434 PameIR 396 PaGR433 ZnevIR88 BgerIR125 PaP450-127 PaGR432 PameIR382 BgerIR190 PaGR426 ZnevIR116 BgerIR 91 BgP450-118 PaGR424 PameIR527 PaGR449 PameIR355 BgerIR 483 BgP450-116 PaGR451 BgerIR492 PaGR447 BgerIR32 PameIR 213 BgP450-117 BgerGR188 BgerIR 74 PameIR638 BgerGR07 ZnevIR 52 PameIR 637 PaP450-134 PaGR480 ZnevIR 100 PaGR148 BgerIR453 PameIR 108 PaP450-133 BgerGR77 PameIR 181 ZnevGR30 PameIR 255 PameIR 479 PaP450-131 PaGR151 PameIR180 ZnevIR 113 PaGR517 PameIR 525 BgerIR 394 PaP450-130 PaGR150 BgerIR 243 PaGR149 BgerIR627 BgerIR 628 BgP450-125 PaGR279 PameIR 364 PaGR534 BgerIR219 BgerIR 270 BgP450-126 PaGR281 PameIR 21 ZnevIR7 PaGR242 BgerIR 60 PaGR261 PameIR 117 BgP450-124 BgerIR 455 PaGR84 ZnevIR99 PameIR 529 BgP450-123 PaGR420 PameIR530 PaGR352 ZnevIR 109 BgerIR 79 BgP450-122 PaGR535 BgerIR558 PameIR 273 PaGR280 BgerIR 80 PaGR422 PameIR489 ZnP450 6k1PF PameIR 413 PaGR421 PameIR 320 PameIR 488 PaP450-13 PaGR327 PameIR 593 PaGR538 BgerIR6 PameIR229 PaGR537 BgerIR 192 PaP450-115 PameIR 600 PaGR348 PameIR601 PaGR349 PameIR603 PaP450-67 PameIR 602 BgerGR94 PameIR 141 ZnevIR 29 PaP450-68 ZnevGR86 ZnevIR27 ZnevGR87 PameIR 494 BgerIR 375 PaGR512 BgerIR 376 PaP450-114 BgerIR177 PaGR513 PameIR 163 ZnevGR03 PameIR 162 PaP450-62 PameIR 251 PaGR176 PameIR165 PameIR250 PaP450-63 PaGR177 PameIR 164 PaGR514 PameIR 249 PameIR47 BgerGR28 BgerIR 197 PaP450-160 BgerIR198 PaGR350 BgerIR 28 ZnevGR94 BgerIR 557 PaP450-194 BgerIR152 BgerGR80 BgerIR 278 BgerIR52 BgP450-138 BgerGR114 BgerIR402 BgerGR113 BgerIR 214 BgerIR279

PaGR180 BgerIR540 BgP450-23 CYP3 CYP4 CYP2 Mito BgerIR539 PaGR477 BgerIR449 PaGR502 BgerIR543 PaP450-207 BgerIR542 BgerGR95 BgerIR616 BgerIR541 ZnP450 6j1PD PaGR132 BgerIR 81 ZnevGR88 PameIR 151 PameIR 157 BgerGR106 PameIR 124 PaP450-19 BgerGR108 BgerIR308 BgerIR597 BgerGR107 BgerIR 592 PaP450-20 BgerIR 403 PaGR489 ZnevIR 32 PaGR234 PameIR 202 PaP450-22 BgerIR 162 BgerGR23 BgerIR267 BgerIR268 PaGR38 BgerIR163 PaP450-21 PaGR39 BgerIR 269 BgerIR 42 PaGR37 BgerIR387 PaP450-31 BgerIR139 PaGR479 BgerIR224 PaGR277 BgerIR567 PaP450-112 BgerIR 352 PaGR478 BgerIR354 BgerIR353 ZnevGR31 BgerIR 33 PaP450-113 ZnevGR33 BgerIR 361 BgerIR 474 BgerGR193 BgerIR307 PaP450-195 BgerIR 199 BgerGR232 BgerIR 494 PaGR251 BgerIR 19 PaP450-196 BgerIR 136 PaGR491 BgerIR137 BgerIR 138 BgerGR150 BgerIR 450 BgP450-106 BgerGR151 BgerIR 62 BgerIR 61 BgerGR148 BgerIR 234 PaP450-208 BgerIR 233 BgerGR149 BgerIR 24 BgerGR129 BgerIR 457 ZnP450 6j1PC BgerIR 456 BgerGR243 BgerIR 523 BgerIR 525 BgerGR120 BgerIR 30 ZnP450 6j1PE BgerGR47 BgerIR 31 BgerIR 480 BgerGR89 BgerIR 23 PaP450-17 BgerIR 479 BgerGR90 BgerIR526 BgerGR162 BgerIR 306 PaP450-44 BgerIR634 BgerGR91 BgerIR305 BgerIR 304 NATURE COMMUNICATIONS BgerGR167 BgerIR 17 PaP450-23 BgerGR111 BgerIR 16 BgerIR70 BgerGR203 BgerIR 69 PaP450-42 BgerIR 73 BgerGR138 BgerIR68 BgerGR139 BgerIR 71 BgerIR 72 PaP450-43 BgerGR137 BgerIR 123 BgerGR161 BgerIR 549 BgerIR 124 PaP450-45 BgerGR92 BgerIR 211 BgerIR 538 BgerGR62 BgerIR 110 PaP450-46 BgerGR65 ZnevIR16 ZnevIR 18 BgerGR96 ZnevIR 63 ZnevIR17 BgP450-60 BgerGR97 PameIR 445 BgerGR163 PameIR 446 PameIR444 ZnP450 6k1PA BgerGR122 PameIR 434 PameIR 435 BgerGR29 PameIR 38 BgP450-59 BgerGR30 PameIR 170 BgerIR537 BgerGR164 BgerIR 109 PameIR405 PaP450-214 BgerGR152 PameIR284 BgerGR61 PameIR 463 PameIR 366 PaP450-85 BgerGR31 PameIR 365 BgerGR158 PameIR 577 PameIR 80 PaP450-47 BgerGR57 PameIR 579 BgerGR63 PameIR 75 PameIR81 ZnP450 6k1PD BgerGR186 ZnevIR 8 ZnevIR 9 BgerGR187 PameIR 462 PameIR 71 PaP450-215 BgerGR79 PameIR 72 BgerGR09 PameIR 472 PameIR 338 PaP450-52 BgerGR93 PameIR 480 BgerGR181 PameIR 337 PameIR 576 DmCyp6w1-PB BgerGR225 PameIR 287 PameIR 95 BgerGR200 PameIR 349 PameIR 286 DmCyp6w1-PA BgerGR180 PameIR 348 BgerGR26 PameIR 336 PameIR 350 DmCyp6g2-PA BgerGR207 PameIR 578 BgerGR105 ZnevIR 121 PameIR 103 DmCyp6g1-PB BgerGR177 PameIR 554 PameIR 279 BgerGR189 PameIR 552 PameIR553 DmCyp6g1-PA BgerGR191 ZnevIR 142 BgerGR33 PameIR 522 ZnevIR 111 DmCyp6t3-PA BgerGR64 BgerIR 637 BgerGR32 ZnevIR 61 PameIR 347 DmCyp6t1-PA BgerGR01 ZnevIR 14 ZnevIR 118 BgerGR34 BgerIR 107 PameIR 172 DmCyp6v1-PC BgerGR56 PameIR 468 BgerGR58 PameIR 393 BgerIR468 DmCyp6v1-PB BgerGR60 PameIR 53 ZnevIR 120 BgerGR59 ZnevIR 83 DmGr33a-PA ZnevIR 119 DmCyp6v1-PA ZnevIR 82 DmGr77a-PA PameIR 346 PameIR 345 BgP450-136 DmGr93c-PA PameIR 93 DmGr93b-PB BgerIR458 PameIR 35 BgP450-54 DmGr93d-PA PameIR 559 BgerIR 459 DmGr92a-PA ZnevIR 34 DmGr93a-PA PameIR 386 ZnP450 6k1PB BgerIR 555 DmGr94a-PA PameIR 285 PameIR 492 PaP450-190 DmGr97a-PA PameIR 425 ZnevIR 1

OR GR IR OBPDmGr97a-PB P450 CCE GST ABC BgerIR435 PaP450-191 DmGr36b-PA PameIR 217 DmGr36c-PA PameIR 234 BgerIR 250 DmGr36a-PA BgerIR493 DmCyp9b2-PA BgerIR 251 DmGr59c-PA BgerIR581 DmGr59b-PA BgerIR384 DmCyp9b1-PA BgerIR 484 DmGr59a-PA PameIR 566 PameIR 594 DmCyp9f2-PA DmGr22b-PA PameIR 271 DmGr22f-PA PameIR 154 PameIR 228 DmGr22c-PA PameIR 192 DmCyp9h1-PA PameIR 412 DmGr22a-PA PameIR 455 DmGr22e-PA PameIR 456 DmCyp9c1-PA PameIR 356 DmGr22d-PB PameIR357 PameIR 370 BgP450-119 PaGR178 PameIR 314 PaGR77 PameIR562 PameIR 629 ZnevGR95 PameIR 119 PaP450-12 PameIR 152 DmGr43a-PA PameIR 153 DmGr43a-PB PameIR 118 BgP450-46 PameIR 498 DmGr43a-PC PameIR 231 BgerIR591 BgP450-134 BgerGR183 BgerIR415 , representing GRs, IRs, and P450, respectively. Phylogenetic relationships of other gene families are shown in Supplementary Figs. BgerGR226 PameIR 90 PameIR 544 BgP450-20 DmGr28b-PE PameIR543 PameIR 237 DmGr28b-PD ZnevIR 86 DmGr28b-PA PameIR 606 BgP450-16 PameIR 450 DmGr28b-PB PameIR 174 PameIR 11 BgP450-18 DmGr28a-PA PameIR210 ZnevGR14 PameIR 634 PameIR372 BgP450-19 ZnevGR12 PameIR 60 d PameIR 115 ZnevGR11 PameIR 91 ZnevGR15 PameIR 596 BgP450-17 BgerIR 620 ZnevGR13 PameIR 464 PameIR 66 BgP450-37 BgerGR210 BgerIR128 PaGR511 PameIR 461 BgerIR452 PaP450-117 0 PaGR453 ZnevIR 13 PameIR 74 BgerGR221 – PameIR 371 BgerGR215 ZnevIR 127 BgP450-42 ZnevIR 79 DmGr39a-PC PameIR 614 DmGr39a-PB PameIR 171 BgP450-41 PameIR 418 DmGr39a-PD PameIR 36 PameIR 457 BgP450-43 DmGr68a-PA PameIR 460 DmGr32a-PA PameIR 68 PameIR 69 DmGr8a-PA PameIR 458 PaP450-14 PameIR 430 b DmGr98c-PA PameIR459 DmGr98d-PA PameIR431 BgP450-14 BgerIR9 DmGr98b-PA BgerIR 10 BgerIR11 BgP450-12 DmGr23a-PC BgerIR 12

DmGr23a-PB PameIR 51 PameIR 417 DmGr2a-PB PameIR 447 PaP450-53 PameIR 241 DmGr2a-PC PameIR 410 BgerGR11 PameIR 512 PaP450-102 PameIR 513 600 400 200 BgerGR66 ZnevIR31 BgerIR 86

BgerGR125 BgerIR35 PaP450-204 BgerGR87 BgerIR36 BgerIR142 BgerGR39 PameIR 33 PaP450-54 PameIR 254 BgerGR40 PameIR 129 BgerGR38 PameIR 321 BgP450-44 PameIR 23 Gene count Gene BgerGR123 PameIR 48 PameIR 518 BgerGR133 PameIR 520 PaP450-218 BgerGR43 PameIR 476 PameIR 475 BgerGR224 PameIR 519 BgP450-15 PameIR 427 PaGR44 PameIR 112 BgerGR192 PameIR 149 ZnP450 9e2PA BgerIR 451 BgerGR242 PameIR 563 PameIR 564 BgerGR112 PameIR 586 PaP450-168 DmGr98a-PA PameIR 326 PameIR 327 DmGr89a-PA PameIR 587 PaP450-135 DmGr58c-PA PameIR 588 PameIR 98 BgerGR209 PameIR 32 PaP450-169 PameIR 31 PaGR224 PameIR 30

a PaGR518 PameIR 497 PameIR 503 PaP450-162 DmGr66a-PB PameIR 583 PameIR 580 DmGr58b-PA PameIR 581 PaP450-164 BgerGR174 PameIR 504 PameIR 73 DmGr47a-PB PameIR 582 PaP450-163 PameIR 561 DmGr9a-PA PameIR 272 PameIR 539

DmGr85a-PA cation-related gene families in the genomes of three blattodean species and PameIR 537 PaP450-170 DmGr58a-PA PameIR 536 PameIR 535 DmGr10b-PA PameIR 538 PaP450-171 DmGr10a-PB PameIR 211 PameIR 528 DmGr57a-PA PameIR 242 PaP450-97 PameIR 415 Gene families involved in chemoreception and detoxi d c b DmGr47b-PA PameIR 531 DmGr59e-PA PameIR252 PameIR 540 PaP450-5 DmGr59f-PA PameIR 534 PameIR 533 DmGr39b-PA fi PameIR 532 PaP450-172 ATP-binding cassette. Of these, three gene families with massive expansions in as shown in IR ionotropic receptor, OBP odorant-binding protein, P450 cytochromes P450 (CYP), CCE carboxyl/choline esterases, GST glutathione detoxi iGluRs ionotropic glutamate receptors, NMDA 4 Fig. 2 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1 ARTICLE

a b Treated with S.aureus

IMD TOLL JAK-STAT Treated with cockroach saline solution a 100 b PGRP-SD a

PGRP-SA GNBP3 80 b GNBP1 c ModSP PGRP-SC2 60 PGRP-SC1a Wound / Stress c Grass PGRP-LB Spirit Easter Upd2 40 % Mortality c c Upd1 PGRP-LE c 8 Upd3 Pro-Spz 20 d PGRP-LC PGN Spz PGRP-LF 0 TOLL DOME CK Toll-1A Toll-1B Myd88 Dorsal dsRNA injected for RNAi 11 Pirk 11 IMD Cytosol 10 9 Pellino Cytosol 12 Pelle c lap2 P Stat92E Ptp61F 1 RNAi Dorsal / CK 1 3 ben 45 0.8 Gprk2 Dorsal P Dnr1 2 66 Stat92E Cactus PIAS 77 P Stat92E * caspar 0.6 * *

Relish ** Dorsal RanBP3 0.4 ** Sickie

Deaf1 Relative expression Nucleus Nucleus 0.2 Ken Cnot4

AFP Stress-response 0 Attacin Termicins genes AMPs Defensins AMPs PaPrp-1 AFP Drosomycin Attacin Paprp-1 Termicin-1Termicin-2 Defensin-1Defensin-2Defensin-3Defensin-4Defensin-5 Drosomycin Expansion Present Drosophila-specific

Fig. 3 Gene repertoire of the innate immune system and functional analyses of the Toll pathway in P. americana. a Representation of three main innate immune signaling pathways (Imd, Toll, and JAK-STAT, proposed in Drosophila)22–24 in P. americana. Genes from expanded gene families in P. americana are highlighted in red. Since all genes absent in P. americana were also found to be absent in other insect species, they were defined as Drosophila-specific components and outlined by gray dashed lines. 1, fas-associated DD protein; 2, death-related Ced-3/Nedd2-like protein; 3, effete; 4, TAK1-binding protein 2; 5, TGF-β-activated kinase 1; 6, kenny; 7, immune response deficient 5; 8, spätzle-processing enzyme; 9, myeloid differentiation primary response 88; 10, tube; 11, hopscotch; 12, socs36E; AMP, antimicrobial peptide; DOME, domeless; Dnr1, defense repressor 1; GNBP, Gram-negative binding protein; Gprk2, G protein-coupled receptor kinase 2; Grass, Gram-positive-specific serine protease; Iap2, inhibitor of apoptosis 2; IMD, immune deficiency; ModSP, modular serine protease; PGN, peptidoglycan; PGRP, peptidoglycan recognition protein; Pirk, poor IMD response upon knock in; Spz, spätzle; Spirit, serine protease immune response integrator; Stat92E, signal transducer and activator of transcription protein at 92E. Detailed information is additionally shown in Supplementary Tables 11–15. b Functional verification of genes in the Toll pathway against a classic Gram-positive bacterium, Staphylococcus aureus. Four major genes in the Toll pathway were knocked down by RNAi, compared to injection of control dsRNA (CK, a 92 bp non-coding sequence from the pSTBlue-1 vector). Corresponding mortality is shown upon S. aureus infection. Injection of cockroach saline solution (CSS, as a control) or S. aureus was performed 24 h after dsRNA injection. Error bars indicate standard deviation of three replicates. Bars labeled with different lowercase letters indicate significant difference between the two samples, with p < 0.05, one-way analysis of variance (ANOVA). c Functional relationship between the Toll pathway and expression of AMPs. Dorsal, a key component in the Toll pathway whose depletion caused the greatest mortality (shown in b), was knocked down by RNAi. Correspondingly, the relative expression of 11 antimicrobial peptide genes were measured and shown in c. Error bars represent s.d. of three replicates. Two-tailed Student's t-test: *p < 0.05; **p < 0.01 value of this cockroach in the context of feeding habitat evolution, other insects (Fig. 2a; Supplementary Fig. 4), suggesting the i.e., from omnivores to herbivores, in Blattodea. The IR gene transport of odorant molecules may be functionally conserved in family also has experienced a substantial expansion in the P. cockroaches. Considering these results, we hypothesize that americana genome, in which we found a total of 640 candidate substantial expansions in chemoreception families may contri- IRs (Fig. 2c), much more than that in the termite genome (148 in bute to the ability of the American cockroach to precisely Z. nevadensis; Fig. 2a). IRs mediate neuronal communication at discriminate environmental signals. synapses throughout vertebrate and invertebrate nervous sys- A detoxification system includes various enzymes and tems16. It was reported that IRs, in Drosophila, are expressed in xenobiotic transporters that are crucial for insects to overcome neurons associated with coeloconic sensilla on the antenna and numerous toxins19. We identified 178 cytochrome P450s, 90 – mediate responses to volatile chemical cues and temperature16 18. carboxyl/choline esterases, 39 glutathione transferases, and 115 We propose that IRs might also greatly contribute to environ- ATP-binding cassette transporters in P. americana (Fig. 2a, d; mental adaptation in cockroaches. By contrast, we found the Supplementary Figs. 5–7). These associated families also show a number of OBPs in blattodean species (with the most in P. pattern of general expansion in P. americana. We focused on americana) is dramatically reduced compared to Drosophila and P450s since their expansion is greatest in the American

NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1 cockroach, compared with other blattodean species. Phylogenetic the other AMP genes (Fig. 3c), which is consistent with the analysis of P450s across blattodean species clearly represents four crucial role of the Toll pathway in innate immunity of P. major clans, i.e., the CYP2, the CYP3, the CYP4, and the americana. Taken together, our results provide solid evidence mitochondrial (Mito) clade (Fig. 2d). We found that most P450 that the Toll pathway and AMPs play essential roles in the genes in P. americana are clustered with the lineages of CYP3 American cockroach in fighting invading pathogens. Impor- (79/178 genes) and CYP4 (62/178 genes) (Fig. 2d). The members tantly, dorsal and other key components in the Toll pathway of the CYP3 clan are highly diverse in their ability to metabolize a could be potential molecular targets for controlling this serious variety of naturally occurring compounds, and the expression of pest of public health. several genes in the CYP3 and CYP4 clans can be induced by various xenobiotics20. Thus, the expansions of these two clans may benefit cockroaches in insecticide resistance and survival in Development and regeneration. High developmental plasticity is extreme conditions21. presumed crucial to the success of cockroaches to survive and Cockroaches generally live in moist and unsanitary areas and succeed in many environments. Of all common cockroach spe- are particularly fond of fermenting foods1; thus, they have cies, the American cockroach has the largest body size, up to 53 numerous opportunities to be exposed to microbes and patho- mm in length; molts 6–14 times before metamorphosis; and has gens. Insects exclusively rely on the innate immune system to the longest lifecycle, up to approximately 700 days (Supplemen- combat infecting microbes. The humoral response of innate tary Fig. 1). Insect molting and metamorphosis are coordinately immunity is mediated mainly by three major signaling pathways: regulated by 20-hydroxyecdysone (20E) and juvenile hormone Imd22, Toll23, and Janus kinase-signal transducer and activator of (JH), and JH prevents 20E-induced metamorphosis during the – transcription (JAK-STAT)24. Upon infection by Gram-negative larval and nymph stages27 30. Insulin/insulin-like growth factors bacteria as well as Gram-positive bacteria and fungi, the Imd and and 20E are the two major mechanisms that antagonize each Toll pathways are activated, respectively, resulting in synthesis other to define the final body size by regulating larval and nymph and secretion of antimicrobial peptides (AMPs) into the growth31,32. We found that crucial biosynthesis and signaling hemolymph, where they can kill invading microorganisms25,26. pathways for regulating insect development, such as 20E, JH, We found that all key components in the Imd, Toll, and JAK- insulin, chitin metabolism, AMPK, and TOR (Supplementary STAT pathways, as well as effectors, are well represented in the P. Tables 16–21), are well represented in P. americana. Notably, we americana genome. Compared with other insects, many genes in found two key genes involved in JH biosynthesis and metabolism innate immunity, particularly in the Toll pathway, have been (Jhamt and Jhe) and insulin-like peptide (Ilp) genes are sub- extensively expanded (Fig. 3a; Supplementary Fig. 8; Supplemen- stantially expanded (Supplementary Fig. 11). In addition, the tary Tables 11–15). Gram-negative binding proteins (GNBPs) are cuticle protein family appears to be one of the most expanded pattern recognition proteins responsible for the detection of gene families in P. americana (Supplementary Tables 7, 10 pathogens and the activation of the Toll pathway. We found 12 and 22). GNBP1-like and 2 GNBP3-like genes, more than in any of those To understand how molting, metamorphosis, and growth are insect species examined (maximum of 6 GNBPs in Z. nevadensis). regulated by these upstream signals, we performed RNAi The Drosophila genome encodes 9 Toll proteins, while P. experiments to disrupt the 20E, JH, and insulin signals during americana genome encodes 14. Other components of the Toll the nymph stages of P. americana. We observed visible molting pathway have also experienced gene duplications in P. americana defects and eventually death upon RNAi knockdown of EcR and (Fig. 3a and Supplementary Table 11, such as easter, spaetzle, RXR, the two genes encoding the 20E nuclear receptor complex pellino, pelle, and cactus). We identified 11 AMPs in P. (Fig. 4a). We found precocious metamorphosis occurring when americana genome, including defensins, termicins, attacin, Met and Kr-h1, which encode the JH receptor and the JH drosomycin, Pro-rich peptide (Paprp-1), and Anti-fungus peptide downstream anti-metamorphic factor, respectively, were depleted (AFP) (Fig. 3a; Supplementary Table 15). We performed by RNAi (Fig. 4b). The 11 Jhamt genes and 5 Jhe genes in P. experiments by injecting the American cockroach with microbes americana genome should flexibly regulate JH titers and thus the to test the induction of AMPs by measuring antimicrobial activity number of molts (6–14), providing plasticity in metamorphic of the cockroach crude extracts. We found strong antimicrobial development. For the insulin signaling pathway, we examined activity after injection with Escherichia coli (Gram-negative three key genes (InR, PI3K, and TOR) and found that RNAi bacterium), moderate antimicrobial activity with Staphylococcus knockdown of each gene significantly retarded the growth rate of aureus (Gram-positive bacterium), and weak antimicrobial the nymphs (Fig. 4c). The expansion of the Ilp gene family (seven activity with Candida albicans (fungi). Together, these findings in the American cockroach) might mediate rapid growth of body suggest cockroach AMPs are potentially broad spectrum size in the American cockroach when food resource is rich. These (Supplementary Fig. 9). Furthermore, injection of these microbes phenotypes suggest that 20E, JH, and insulin are mainly was able to upregulate the expression of all 11 AMPs to different responsible for the regulation of molting, metamorphosis, and degrees (Supplementary Fig. 10). Particularly high induction was growth, respectively, although further investigation is required to observed for attacin and Paprp-1 after the injection with E. coli, understand their interactions. defensin-3 and defensin-5 with S. aureus, and AFP and termicin- The American cockroach reproduces periodically within the 2 with C. albicans (Supplementary Fig. 10). We further long adult stage, up to 600 days. In female adult insects, ovary investigated how the expanded Toll signaling pathway induces maturation can be regulated by insulin, JH, or 20E by regulating AMP expression and protects the American cockroach from vitellogenesis, DNA replication, cell proliferation, and generation – microbial invasion. RNA interference (RNAi) knockdown of four of germline stem cells33 35. In some insect species, insulin important genes (Toll-1A, Toll-1B, Myd88, and dorsal) in the Toll signaling affects JH biosynthesis or JH signaling to regulate ovary pathway significantly increased the mortality of this cockroach maturation36,37. Note that facultative parthenogenesis is a upon S. aureus injection, with nearly complete mortality found reproductive strategy common in the American cockroach and after dorsal RNAi (Fig. 3b). We also examined the effect of dorsal termites, but absent in the German cockroach38, again supporting RNAi on the expression of AMP genes. Importantly, dorsal that the American cockroach is more closely related to the RNAi not only significantly decreased the expression of defensin- termites than to the German cockroach. A periodic reproductive 3 and defensin-5, but also decreased the expression of most of cycle, including periodic ovary growth, oocyte maturation, and

6 NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1 ARTICLE

ab

5 mm 5 mm 1 mm

1 mm Survival Death Death Treatment n Treatment n Larva Adult (molted) (partially molted) (non-molted) CK dsRNA 60 60 (100%) – CK dsRNA 25 25 (100%) – – Met dsRNA 71 40 (56.34%)* 31(43.66%)* EcR dsRNA 38 15 (39.47%)* 3 (7.89%)* 20 (52.63%)* Kr-h1 dsRNA 79 60 (75.95%)* 19 (24.05%)* RXR dsRNA 31 20 (64.52%)* 2 (6.45%)* 11 (35.48%)* c d

CK dsRNA 40 5 mm Tor dsRNA 35 InR dsRNA PI3K dsRNA 30 1 mm dsRNA CK Vg Dsx InR PI3K TORMet Kr-h1 EcR RXR Chitinase 25 *** *** 8 5 20 *** *** *** 6 4 15 *** *** 3 *** 10 *** 4 *** (mm) 2 *** 5 2 ****** *** *** Injected time point *** 1 *** *** *** ****** ***

% Gonadosomatic index *** 0 Primary oocytelength % Cumulative growth rate of body weight 0 7 14 21 0 0 dsRNA dsRNA Vg Vg CK Dsx InR Met EcR CK Dsx InR Met Days after molting PI3KTOR Kr-h1 RXR PI3KTOR Kr-h1EcRRXR Chitinase Chitinase

Fig. 4 Functional studies of pathways regulating development and reproduction in the American cockroach. a Regulation of molting by ecdysone receptor (EcR) and retinoid X receptor (RXR) genes in the 20E signal pathway. Mortality rates were checked after RNAi treatment of indicated numbers of individuals (n). CK, control dsRNA corresponds to that in Fig. 3. b Regulation of metamorphosis by methoprene-tolerant (Met) and kruppel homolog 1 (Kr- h1) in the juvenile hormone (JH) pathway. Adult proportion index was checked after the JH singling was disrupted. c Regulation of growth by the insulin signaling pathway genes. Insulin-like receptor (InR), phosphoinositide 3-kinase (PI3K), and target of rapamycin (TOR) genes were repeatedly depleted by dsRNA injection during 3 weeks. Three injections were given and cumulative growth rate of body weight (%) was calculated. Student’s t-test: ***p < 0.001. d Morphology change of ovary maturation during the first reproductive cycle in virgin females. Vitellogenin (Vg), double-sex (Dsx), and nine genes involved in the pathways of insulin, JH, and 20E were knocked down by RNAi. Gonadosomatic index and primary oocyte length were used to evaluate the ovary maturation degree. All the data were calculated as the mean value of three replicates. Error bars represent standard deviation. Two-tailed Student’s t-test: *p<0.05; ***p<0.001. Details of all involved pathways are shown in Supplementary Tables 16–18 egg laying, was observed in newly emerged virgin females The American cockroach has a strong capability of limb (Supplementary Fig. 12). We focused on how the gonadosomatic regeneration during the nymph stages39, which is the main reason index and primary oocyte length in the first reproductive cycle are to call it “Xiao Qiang” in China. We re-examined its regeneration regulated by different upstream signals in virgin females. RNAi of missing leg segments after one molting by systematic knockdown of vitellogenin (Vg) completely abolished ovary amputation of the metathoracic limb, including all five podites maturation, while RNAi knockdown of double-sex (Dsx,a (Fig. 5a). The ability of P. americana to regenerate the missing transcriptional factor regulating Vg expression in many insects) limb and the degree of recovery depend on the trauma severity had a less significant inhibition (Fig. 4d; Supplementary Fig. 12). indices (Fig. 5a). Importantly, trochanter and coxa are the two Importantly, RNAi knockdown of key genes in the insulin most important podites in leg regeneration, providing the pathway (InR, PI3K, and TOR) and the JH pathway (Met and Kr- possibility of studying the morphological and molecular details h1) abolished ovary maturation (Fig. 4d), similar to the inhibitory of cell proliferation and differentiation in this biological process effect of Vg RNAi (Fig. 4d; Supplementary Fig. 13), but RNAi (Fig. 5a). A number of important signaling pathways have been knockdown of EcR, RXR,orChitinase had no inhibitory effects suggested to be involved in wound healing and tissue repair in (Fig. 4d). RNAi experiments in female adults demonstrated that Drosophila and vertebrates, including Decapentaplegic (Dpp), Jun both insulin and JH, but not 20E, are crucial for the regulation of N-terminal kinase (JNK), Grainy head (GRH), Wingless (Wg), – ovary maturation, at least at the first reproductive cycle. Notch, Hippo, and Hedgehog (Hh)40 43. Based on our manual Apparently, it is of interest to understand the molecular annotation, we found most key components of these seven mechanisms underlying the interplay between insulin and JH pathways exist in the P. americana genome (Supplementary during the first reproductive cycle as well as the hormone Tables 23–29), indicating that these pathways are evolutionarily network connecting reproductive cycles. We suppose whether the conserved in insects. We also found gene expansions in the GRH, American cockroach can grow fast or slow, molt more or less, and Wg, and Notch pathways (Supplementary Fig. 10; Supplementary reproduce abundantly or not depends on its living conditions, Tables 25–27)inP. americana and speculated that these which is consistent with its strong hormone-controlled adapta- pathways may contribute to the impressive ability of this tion ability. cockroach to regenerate lost appendages. Dpp and Mad

NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1

a b Amputation CK Ta Ti Fe Tr 1/2 Tr 2/3 Co Femur

Body 5 mm Coxa Trochanter dsRNA Femur 1 mm CK Dpp Mad

Tibia

Tarsus 5 mm 5 mm

Diagram of a typical leg 1 mm 1 mm Regeneration 4+ 3+ 3+ 3+ 3+ ++ + – 3+ – –

Fig. 5 Leg regeneration ability and its regulation by the decapentaplegic (Dpp) pathway in P. americana. a Limb regeneration confirmation under different trauma severity indices. The diagram in the box represents a typical leg with indicated parts in which we performed the amputation experiment. Regeneration ability is shown in the bottom, according to the wild-type size. 4+, wild-type size; from 3+ to +, incomplete sizes; −, null. b Function of Dpp signaling pathway in leg regeneration. No leg regenerated (−) when Dpp and mothers-against-dpp (Mad) genes were knocked down by RNAi. CK, control dsRNA corresponds to that in Fig. 3

(mothers-against-dpp) are the ligand and downstream transcrip- were picked from the colony, respectively, and placed in separate containers and tion factor, respectively, in the Dpp pathway. We next used RNAi provided with rat food and water. Operations were performed after anaesthetizing to reduce the expression of Dpp and Mad to determine their roles with CO2. For treatment, three biological replicates were performed with 30 ani- mals involved in each replicate. in leg regeneration. When the tarsus–tibia–femur segments were removed and these two genes were depleted by RNAi, the Genome and transcriptome sequencing. DNA for de novo sequencing was all regeneration of missing limbs was completely prohibited after one extracted from a single female adult. Libraries of stepwise increased inserts were molting (Fig. 5b). These results show that the Dpp pathway is constructed following the standard protocol. All libraries were sequenced on necessary for wound healing and tissue repair during cockroach Illumina sequencing platforms by Berry Genomics Co. Ltd. In summary, a total of leg regeneration, which is considerably helpful for cockroaches to 1.06 Tb data were generated for three species of Periplaneta (Supplementary recover from the body damage or hurt. As mentioned above, Table 1). To examine the transcriptome, we also sequenced full-length transcripts using the PacBio sequencing system (Pacific Biosciences). A total of 17.5 Gb Kang Fu Xin Ye, an ethanol extract of the American cockroach, transcriptome sequencing data were generated from three independent libraries, has been developed as a prescribed drug for wound healing and with inserts of 1–2 Kb, 2–3 Kb, and 3–6 Kb, for the messenger RNA pool of all tissue repair. We are currently investigating whether there is a stages of cockroach development (Supplementary Table 1). “growth factor” (Supplementary Table 30) connecting leg regeneration in the American cockroach to its ethanol extract Genome assembly. Initial assemblies were generated by DiscovarDeNovo that is used for wound healing and tissue repair in humans. (v52488; https://software.broadinstitute.org/software/discovar/blog/) with default settings. SSPACE (v3)44 was then used to join contigs into scaffolds based on In summary, our genomic and functional analyses in the linking evidence of mate pair libraries, step by step ranging from 600 bp to 13 kb American cockroach provided insights into its success in the insert size, under default parameters. Gaps within these scaffolds were iteratively adaptation to urban environments and the biology of develop- filled with paired-end reads of short-size inserts using GapCloser available in mental plasticity in cockroaches. High efficiency of RNAi in the SOAPdenovo45. Main properties of the resulting assembly are comparable to other American cockroach unlocks its potential as a genetic model blattodean species and related species of large genome size. Notably, our assembly presents the longest N50 size in contigs compared with other related species system for investigating cockroach biology. The harm of (Supplementary Table 2). We evaluated the completeness of our assembly using American cockroaches is becoming more serious with the threat sequence of different independent resources, which include both nucleotide of global warming. Our study may shed light on both controlling sequences of P. americana itself and protein sequence of proposed conservation and making use of this insect. A paucity of genomic resources in across species, as follows: long transcriptomic reads, with a mean length of 1.5 Kb, that were generated by PacBio; de novo assembled transcripts, based on Illumina Blattodea has precluded the discovery of evolutionary signatures sequencing reads (SRA#DRP001319) using Trinity r20140717 with default settings; of euociality in this hemimetabolous clade. The American expressed sequence tags downloaded from NCBI (National Center for Bio- cockroach system should be informative for addressing eusoci- technology Information); uploaded nucleotide sequences of P. americana in ality and feeding habitats in Blattodea, given that it has a closer GenBank; 458 CEGMA46 proteins (copies of Drosophila) that were recognized as potential conserved core proteins across species. These sequences were aligned relationship with termites than the German cockroach. Genomic back to the assembled genome using genblastA v1.0.1 with “-e 1e-10 −r1”47 and studies of additional species in Blattodea should be forthcoming, their recovered proportions were subsequently calculated by an in-house PERL especially the subsocial and wood-feeding species in Cryptocercus script. BUSCO v348 was also utilized in quantitative measuring for the assessment (Cryptocercidae), to illuminate the transition from cockroaches to of genome assemblies, using insecta_odb9 as lineage input. The evaluation prop- erties of the American cockroach genome assembly were presented in Supple- termites. mentary Tables 2–4.

Methods Genome annotation. Repeats, including repetitive sequences and transposable . The line of P. americana (the American cockroach) was provided by Dr. elements, were identified using RepeatMasker v 4.0.549 (http://www.repeatmasker. Huiling Hao, whose line has been maintained with inbreeding for 30 years. Samples org) against both de novo repeat library, which was built by RepeatModeler v1.0.4 of P. australasiae (the Australian cockroach) and P. fuliginosa (the smokybrown (http://www.repeatmasker.org), and the set of Repbase v2014013150. cockroach) were provided by Professor Jianchu Mo (Zhejiang University, China). The official gene set (OGS1; Supplementary Table 4) was based on a GLEAN The colony was kept at 29 °C and a relative humidity of 70–80% in plastic cages consensus model51, which combined transcriptome (Periplaneta), homology and animals were fed commercial rat food and water at libitum. To obtain pools of (Blattella5, Zootermopsis6, Locusta4, Drosophila, and UniProt), and ab initio sets synchronized animals, newly molted male and female adults and mid-age larvae (AUGUSTUS52, SNAP53, and GENSCAN54) as described55. Model training for

8 NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1 ARTICLE each ab initio program was initially based on assembled transcripts and followed by drying in vacuum drying oven, then resuspended in 0.1 vol. (the volume of with an iteratively trained approach. Automatic annotations were performed using supernatant without drying treatment) double distilled water to obtain AMP crude − BLASTP (E < 10 5) against a number of public databases as listed in Supple- extracts. mentary Table 5. Local run of InterProScan v5.13-52.056 with all implemented methods was utilized to assign domains, Gene Ontology terms, as well as KO terms Assay of antimicrobial activity. The radial diffusion assay was performed to test to gene models. the antimicrobial activities of AMP crude extracts, as described previously with slight modifications66. Sterile Petri dishes (9 cm diameter) received 7.5 ml of LB Orthology and phylogenomics. A total of 12 representative insect species, solid medium, containing ~2 × 105 logarithmic-phase cells of a given bacterial including P. americana, were selected for orthology analysis. Related protein sets of strain. Wells (5 mm diameter) were cut into the freshly poured plates after the involved species are listed in Supplementary Table 6. Automatic orthology was solidification of the agar. Each well received a 20 μl aliquot of the fraction expected determined using the OrthoMCL pipeline57. In brief, we first filtered out redundant to contain antibacterial molecules. The plates were incubated overnight at 37 ℃, splice variants to keep the longest isoform for each protein set; then, all-against-all and the diameters of the clear zones were recorded, after subtraction of the well − protein comparisons were performed using BLASTP with E < 10 5; finally, High- diameter. scoring Segment Pair (HSPs) were processed by MCL v10-201 to define orthologs, inparalogs, and co-orthologs. A total of 538 exactly single-copy universal genes RNAi in vivo. For RNAi, 300–500 bp fragments of the target genes were PCR were used to reconstruct the phylogeny. Multiple alignments of protein sequences amplified from the complementary DNA. Then, the T7 promoter sequence 58 for each ortholog group were performed using Muscle v 3.8.31 . Of these align- attached primers were used for PCR amplification of the double-stranded RNA 59 ments, the conserved blocks were extracted using Gblocks v 0.91b and subse- (dsRNA) templates. The dsRNA was synthesized using T7 RiboMAX Express quently concatenated to 12 super genes with 75,839 amino acids. The species tree RNAi kit (Promega) according to the manufacturer’s instructions67. A 92 bp non- 60 was calculated using PhyML v3.1 with the WAG model with 100 replicates of coding sequence from the pSTBlue-1 vector (dsCK) was used as control dsRNA68. bootstrap analysis. dsRNA was quantified using a Nanodrop spectrophotometer (Thermo Scientific) and was injected into the abdomen of cockroaches. All dsRNA synthesis primers Genome-wide sequence identity. We focused on pairwise sequence identity in used in this study are listed in Supplementary Table 31. In the experiments on amino acids between species of Blattodea, including B. germanica5 (v0.5.3), Z. innate immunity, for each gene, newly emerged adult males within 10 days were nevadensis6 (v2.2), and three species of Periplaneta in this study. For P. australasiae injected with a dose of 3 μg dsRNA per . In the experiments on molting, and P. fuliginosa,wefirst de novo assembled preliminary contigs using idba metamorphosis, and growth, newly emerged mid-age larvae were injected with a v1.1.161 following default parameters. All protein-coding genes were then aligned dose of 3 μg dsRNA per animal at 0, 7, and 14 days after molting. In the experi- to each set of contigs using genblastA v1.0.1 with “-e 1e-5 -r 1”47. GeneWise ments on ovary development, newly emerged adult females were injected with 3 μg v2.2.062 was finally utilized to predict the gene models in these two genomes, dsRNA per animal at 2, 4, and 6 days after eclosion. In the experiments on limb respectively, for each input seed of P. americana. Orthology across these blattodean regeneration, the section below the femur was amputated, and dsRNAs were species was analyzed as described above. In order to exclude potential biases, we injected one day before amputation, then injection was operated once every other limited the comparison of sequence identity within the commonly aligned blocks day for four times. Controls were injected with the same volume and dose of dsCK. across all analyzed species, which were generated by MUSCLE58 and Gblocks59 as Three biological replicates were performed. The efficiencies of RNAi are listed in described above. Sequence identity was directly calculated by an in-house PERL Supplementary Fig. 2. script. Nonparametric regression was calculated using the lowess function imple- mented in R. Total RNA extraction and quantitative real-time PCR. After 24 h upon dsRNA injection or microorganism infection, abdominal fat body tissues of cockroaches Gene families. We performed manual curation for each gene of the families and were collected, and total RNA was extracted using EASYspin Plus RNA extraction pathways that we addressed in this study; approximately 2000 genes of the special kit (Aidlab, China) according to the manufacturer’s instructions. Reverse tran- interest in the biology of the American cockroach were manually annotated. Most scription was performed by using Reverse Transcriptase M-MLV (Takara) as gene families were based on known models in Drosophila. For informative com- previously described69. Quantitative real-time PCR (qRT-PCR) was performed ® parison, we included the corresponding information of the B. germanica gene set5. using SYBR Select Master Mix (Applied Biosystems) and the Applied Biosystems™ To avoid discordant information between these two studies5, we used the currently QuantStudio™ 6 Flex Real-Time PCR System. Actin was chosen as a reference gene latest version of the gene set and associated annotation, i.e., v0.6.2, of B. germanica. for qRT-PCR analysis68. All qRT-PCR primers used in this study are listed in Multiple alignments were performed by ClustalX v2.2.063 and manually curated for Supplementary Table 31. well-aligned blocks. Phylogenetic analysis was conducted using the maximum- 64 likelihood algorithm and JTT model implemented in MEGA7 for 100 replicates Amputation and microscopy observation. The tarsus, tibia, femur, trochanter, of bootstraps. coax of 20 mid-age larvae were amputated in 2 days after molting, respectively. Because identification of chemosensory receptor genes is commonly fi Sections and regeneration legs were photographed using a Nikon DS-Ri2 camera. problematic by automated predictions, we identi ed this class of genes in the Cockroaches were dissected in CSS, under an OLYMPUS SZ61 microscope. For genomes using TBLASTN searches with known receptors of Drosophila and each female animal, the ovary was dissected on day 7; body weight, ovarian weight, Zootermopsis as queries, followed by iteration. Gene structures within the genomic fi −5 62 and primary oocyte lengths were recorded at the same time, respectively. Images of loci with signi cant hits (E < 10 ) were predicted using GeneWise v2.2.0 with ovary and ovariole were captured with a Nikon DS-Ri2 camera and Nikon SMZ25 pseudogene masked. We carefully checked the multiple alignments and removed microscope. The length of ovarioles was measured using NIS-Elements BR poor aligned regions before phylogeny analysis. Given the large data set of involved 64 4.50.00 software. For larvae, fat body and coxa were collected at 48 h post injection sequence, we used neighbor-joining algorithm implemented in MEGA7 for 100 of dsRNA and flash-frozen in liquid nitrogen to prevent RNA degradation and replicates. stored at −80 ℃ until further processing. For all experiments described in this For comparisons in large gene families, different approaches and/or cutoffs in paper, at least three biological repeats were performed, and 30 animals were used in independent studies may cause systematic bias. Thus, we utilized a consensus each biological repeat. approach to re-identify large gene families in published genomes to generate a comparable result in this study, such as chemoreceptors and detoxification-related gene families. We note that these annotations do not represent the official genome Data availability. All sequence data from the American cockroach genome project annotations of the corresponding species. Gene function enrichment analysis was have been deposited at DDBJ/ENA/GenBank under the accession PGRX00000000. conducted using an online resource (http://www.omicshare.com/), under the All relevant data are available from the authors. default instructions. Received: 16 August 2017 Accepted: 2 February 2018 AMP induction. Bacteria of E. coli (PTA-5952), S. aureus (ATCC29213), and C. albicans (ATCC29213) were cultured in LB liquid medium. Overnight bacteria cultures were centrifuged and resuspended in the cockroach saline solution (CSS)65 = = containing E. coli (OD600 0.25), S. aureus (OD600 1.0), and C. albicans (OD600 = 1.0). For AMP induction, adult male American cockroaches emerged within 10 days were injected using sharp micro syringe in posterolateral abdomens with 3 References bacterial suspensions of E. coli, S. aureus, C. albicans, or CSS. 1. Bell, W. J. & Adiyodi, K. The American Cockroach (Springer Science & Business Media, New York, 1982). Crude purification. For crude purification of AMPs, the whole animals were 2. Govindaraj, D. et al. Immunogenic peptides: B & T cell epitopes of per a 10 – ground to a fine powder in the constant presence of liquid nitrogen to prevent allergen of periplaneta americana. Mol. Immunol. 80,24 32 (2016). protease degradation. The powder was then transferred in 10 volumes (mass/vol.) 3. Hanrahan, S. J. & Johnston, J. S. New genome size estimates of 134 species of – 0.1 M acetic acid and extracted for 24 h under gentle shaking66. After centrifuga- . Chromosome Res. 19, 809 823 (2011). tion at 13,000 × g for 15 min again, the supernatant was boiled for 15 min, and then 4. Wang, X. et al. The locust genome provides insight into swarm formation and centrifuged at 13,000 × g for 15 min again, the supernatant was collected followed long-distance flight. Nat. Commun. 5, 2957 (2014).

NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1

5. Harrison, M. C. et al. Hemimetabolous genomes reveal molecular basis of 37. Xu, J., Sheng, Z. & Palli, S. R. Juvenile hormone and insulin regulate trehalose termite . Nat. Ecol. Evol. 2, 557–566 (2017). homeostasis in the red flour beetle, Tribolium castaneum. PLoS Genet. 9, 6. Terrapon, N. et al. Molecular traces of alternative social organization in a e1003535 (2013). termite genome. Nat. Commun. 5, 3636 (2014). 38. Katoh, K. et al. Group-housed females promote production of asexual ootheca 7. Poulsen, M. et al. Complementary symbiont contributions to plant in American cockroaches. Zool. Lett. 3, 3 (2017). decomposition in a fungus-farming termite. Proc. Natl. Acad. Sci. USA 111, 39. Truby, P. R. Blastema formation and cell division during cockroach limb 14500–14505 (2014). regeneration. Development 75, 151–164 (1983). 8. Consortium, I. A. G. Genome sequence of the pea aphid Acyrthosiphon 40. Belacortu, Y. & Paricio, N. Drosophila as a model of wound healing and tissue pisum. PLoS Biol. 8, e1000313 (2010). regeneration in vertebrates. Dev. Dyn. 240, 2379–2404 (2011). 9. Djernæs, M., Klass, K.-D. & Eggleton, P. Identifying possible sister groups of 41. Das, S. Morphological, molecular, and hormonal basis of limb regeneration Cryptocercidae+Isoptera: a combined molecular and morphological across pancrustacea. Integr. Comp. Biol. 55, 869–877 (2015). phylogeny of Dictyoptera. Mol. Phylogenet. Evol. 84, 284–303 (2015). 42. Hayashi, S., Yokoyama, H. & Tamura, K. Roles of Hippo signaling pathway in 10. Arguello, J. R. et al. Extensive local adaptation within the chemosensory size control of organ regeneration. Dev. Growth Differ. 57, 341–351 (2015). system following Drosophila melanogaster’s global expansion. Nat. Commun. 43. Mito, T. et al. Involvement of hedgehog, wingless, and dpp in the initiation of 7, 11855 (2016). proximodistal axis formation during the regeneration of insect legs, a 11. Dahanukar, A., Hallem, E. A. & Carlson, J. R. Insect chemoreception. Curr. verification of the modified boundary model. Mech. Dev. 114,27–35 (2002). Opin. Neurobiol. 15, 423–430 (2005). 44. Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D. & Pirovano, W. Scaffolding 12. Bargmann, C. I. Comparative chemosensation from receptors to ecology. pre-assembled contigs using SSPACE. Bioinformatics 27, 578–579 (2010). Nature 444, 295 (2006). 45. Li, R. et al. De novo assembly of human genomes with massively parallel short 13. Benton, R. Multigene family evolution: perspectives from insect read sequencing. Genome Res. 20, 265–272 (2010). chemoreceptors. Trends Ecol. Evol. 30, 590–600 (2015). 46. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate 14. Herness, M. S. & Gilbertson, T. A. Cellular mechanisms of taste transduction. core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007). Annu. Rev. Physiol. 61, 873–900 (1999). 47. She, R., Chu, J. S., Wang, K., Pei, J. & Chen, N. GenBlastA: enabling BLAST to 15. Pearce, S. et al. Genomic innovations, transcriptional plasticity and gene loss identify homologous gene sequences. Genome Res. 19, 143–149 (2009). underlying the evolution and divergence of two highly polyphagous and 48. Simao, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, invasive Helicoverpa pest species. BMC Biol. 15, 63 (2017). E. M. BUSCO: assessing genome assembly and annotation completeness with 16. Benton, R., Vannice, K. S., Gomez-Diaz, C. & Vosshall, L. B. Variant single-copy orthologs. Bioinformatics 31, 3210–3212 (2015). ionotropic glutamate receptors as chemosensory receptors in Drosophila. Cell 49. Saha, S., Bridges, S., Magbanua, Z. V. & Peterson, D. G. Empirical comparison 136, 149–162 (2009). of ab initio repeat finding programs. Nucleic Acids Res. 36, 2284–2294 (2008). 17. Silbering, A. F. et al. Complementary function and integrated wiring of the 50. Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive evolutionarily distinct Drosophila olfactory subsystems. J. Neurosci. 31, elements in eukaryotic genomes. Mob. DNA 6, 11 (2015). 13357–13375 (2011). 51. Elsik, C. G. et al. Creating a honey consensus gene set. Genome Biol. 8, 18. Chen, C. et al. Drosophila ionotropic receptor 25a mediates circadian clock R13 (2007). resetting by temperature. Nature 527, 516 (2015). 52. Stanke, M., Tzvetkova, A. & Morgenstern, B. AUGUSTUS at EGASP: using 19. Simon, J. Y. Encyclopedia of Entomology 687–699 (Springer, Dordrecht, 2004) EST, protein and genomic alignments for improved gene prediction in the 20. David, J. P. et al. Transcriptome response to pollutants and insecticides in the human genome. Genome Biol. 7, S11 (2006). dengue vector Aedes aegypti using next-generation sequencing technology. 53. Korf, I. Gene finding in novel genomes. BMC Bioinforma. 5, 59 (2004). BMC Genom. 11, 216 (2010). 54. Burge, C. & Karlin, S. Prediction of complete gene structures in human 21. Ffrench-Constant, R. H. The molecular genetics of insecticide resistance. genomic DNA. J. Mol. Biol. 268,78–94 (1997). Genetics 194, 807–815 (2013). 55. Zhan, S., Merlin, C., Boore, J. L. & Reppert, S. M. The monarch butterfly 22. Myllymaki, H., Valanne, S. & Ramet, M. The Drosophila imd signaling genome yields insights into long-distance migration. Cell 147, 1171–1185 pathway. J. Immunol. 192, 3455–3462 (2014). (2011). 23. Valanne, S., Wang, J. H. & Ramet, M. The Drosophila Toll signaling pathway. 56. Jones, P. et al. InterProScan 5: genome-scale protein function classification. J. Immunol. 186, 649–656 (2011). Bioinformatics 30, 1236–1240 (2014). 24. Myllymaki, H. & Ramet, M. JAK/STAT pathway in Drosophila immunity. 57. Li, L., Stoeckert, C. J. & Roos, D. S. OrthoMCL: identification of ortholog Scand. J. Immunol. 79, 377–385 (2014). groups for eukaryotic genomes. Genome Res. 13, 2178–2189 (2003). 25. Igboin, C. O., Griffen, A. L. & Leys, E. J. The Drosophila melanogaster host 58. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and model. J. Oral Microbiol. 4, 10368 (2012). high throughput. Nucleic Acids Res. 32, 1792–1797 (2004). 26. Kim, I. W. et al. De novo transcriptome analysis and detection of 59. Talavera, G. & Castresana, J. Improvement of phylogenies after removing antimicrobial peptides of the American cockroach Periplaneta americana divergent and ambiguously aligned blocks from protein sequence alignments. (Linnaeus). PLoS One 11, e0155304 (2016). Syst. Biol. 56, 564–577 (2007). 27. Jindra, M., Palli, S. R. & Riddiford, L. M. The juvenile hormone signaling 60. Guindon, S. et al. New algorithms and methods to estimate maximum- pathway in insect development. Annu. Rev. Entomol. 58, 181–204 (2013). likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 28. Yamanaka, N., Rewitz, K. F. & O’Connor, M. B. Ecdysone control of 307–321 (2010). developmental transitions: lessons from Drosophila research. Annu. Rev. 61. Peng, Y., Leung, H. C., Yiu, S. M. & Chin, F. Y. IDBA-UD: a de novo Entomol. 58, 497–516 (2013). assembler for single-cell and metagenomic sequencing data with highly 29. Belles, X. & Santos, C. G. The MEKRE93 (Methoprene tolerant-Krüppel uneven depth. Bioinformatics 28, 1420–1428 (2012). homolog 1-E93) pathway in the regulation of insect metamorphosis, and the 62. Birney, E., Clamp, M. & Durbin, R. Genewise and genomewise. Genome Res. homology of the pupal stage. Insect Biochem. Mol. Biol. 52,60–68 (2014). 14, 988–995 (2004). 30. Liu, S. et al. Antagonistic actions of juvenile hormone and 20- 63. Larkin, M. A. et al. Clustal W and Clustal X version 2.0. Bioinformatics 23, hydroxyecdysone within the ring gland determine developmental transitions 2947–2948 (2007). in Drosophila. Proc. Natl. Acad. Sci. USA 115, 139–144 (2018). 64. Kumar, S., Stecher, G. & Tamura, K. MEGA7: Molecular Evolutionary 31. Colombani, J. et al. Antagonistic actions of ecdysone and insulins determine Genetics Analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, final size in Drosophila. Science 310, 667–670 (2005). 1870–1874 (2016). 32. Nijhout, H. F. & Callier, V. Developmental mechanisms of body size and 65. Titlow, J. S., Majeed, Z. R., Hartman, H. B., Burns, E. & Cooper, R. l. Neural wing-body scaling in insects. Annu. Rev. Entomol. 60, 141–156 (2015). circuit recording from an intact cockroach nervous system. J. Vis. Exp. 81, 33. Ables, E. T. & Drummond-Barbosa, D. The steroid hormone ecdysone e50584 (2013). functions with intrinsic chromatin remodeling factors to control female 66. Bulet, P. et al. Insect immunity. Isolation from a coleopteran insect of a novel germline stem cells in Drosophila. Cell Stem Cell 7, 581–592 (2010). inducible antibacterial peptide and of new members of the insect defensin 34. Guo, W. et al. Juvenile hormone-receptor complex acts on Mcm4 and Mcm7 family. J. Biol. Chem. 266, 24520–24525 (1991). to promote polyploidy and vitellogenesis in the migratory locust. PLoS Genet. 67. Tian, L. et al. 20-hydroxyecdysone upregulates Atg genes to induce autophagy 10, e1004702 (2014). in the Bombyx fat body. Autophagy 9, 1172–1187 (2013). 35. Wang, D. et al. LIN-28 balances longevity and germline stem cell number in 68. Gomez-Orte, E. & Belles, X. MicroRNA-dependent metamorphosis in Caenorhabditis elegans through let-7/AKT/DAF-16 axis. Aging Cell 16, hemimetabolan insects. Proc. Natl. Acad. Sci. USA 106, 21678–21682 (2009). 113–124 (2017). 69. Zhou, S. et al. Two Tor genes in the silkworm Bombyx mori. Insect Mol. Biol. 36. Tatar, M. et al. A mutant Drosophila insulin receptor homolog that extends 19, 727–735 (2010). life-span and impairs neuroendocrine function. Science 292, 107–110 70. Misof, B. et al. Phylogenomics resolves the timing and pattern of insect (2001). evolution. Science 346, 763–767 (2014).

10 NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-03281-1 ARTICLE

Acknowledgements Reprints and permission information is available online at http://npg.nature.com/ We thank Drs. Huiling Hao and Jianchu Mo for providing cockroach samples. We also reprintsandpermissions/ appreciate Dr. Erich Bornberg-Bauer for the early access to the genome data of Blattella germanica. This study was supported by the National Science Foundation of China Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in fi (Grant No. 31620103917, 31522053, 31330072, 31572325, and 31672370), the National published maps and institutional af liations. Key Research and Development Program of China (Grant No. 2016YFD0101900 and 2015CB755700), and the Chinese Academy of Sciences (Grant No. 173176001000162007, QYZDB-SSW-SMC029). Open Access This article is licensed under a Creative Commons Author contributions Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give Sh.L. conceived the project. Sh.L. and Shu.Z. designed and led the project. D.Y. and Ya.C. appropriate credit to the original author(s) and the source, provide a link to the Creative prepared DNA and RNA for sequencing. Shu.Z. assembled the genome and generated Commons license, and indicate if changes were made. The images or other third party the gene set. Sh.L., Shi.Z., Q.J., D.Y., K.L., Su.L., C.R., H.Z., G.F., D.L., X.Z., and Shu.Z. ’ annotated the gene families. Shi.Z., Q.J., K.L., Su.L., and C.R. performed functional material in this article are included in the article s Creative Commons license, unless experiments. Sh.L., Yi.C., J.Z., Q.Y., Y.F., X.Y., Q.F., and Shu.Z. interpreted the data. Sh. indicated otherwise in a credit line to the material. If material is not included in the ’ L., Shi.Z., Q.J., D.Y., C.R., and Shu.Z. wrote the manuscript. Yi.C., X.Y., and Q.F. article s Creative Commons license and your intended use is not permitted by statutory improved the manuscript. regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/ Additional information licenses/by/4.0/. Supplementary Information accompanies this paper at https://doi.org/10.1038/s41467- 018-03281-1. © The Author(s) 2018

Competing interests: The authors declare no competing interests.

NATURE COMMUNICATIONS | (2018) 9:1008 | DOI: 10.1038/s41467-018-03281-1 | www.nature.com/naturecommunications 11