Electronic supplementary material

Evolutionary rates are correlated between symbiont and mitochondrial genomes

Daej A. Arab1, Thomas Bourguignon1,2,3, Zongqing Wang4, Simon Y. W. Ho1, & Nathan Lo1

1School of Life and Environmental Sciences, University of Sydney, Sydney, Australia 2Okinawa Institute of Science and Technology Graduate University, Tancha, Onna-son, Okinawa, Japan 3Faculty of Forestry and Wood Sciences, Czech University of Life Sciences, Prague, Czech Republic 4College of Plant Protection, Southwest University, Chongqing, China

Sampling, DNA extraction, and sequencing

A list of samples and collection data for each cockroach and outgroup examined is provided in table S1. For the majority of taxa examined in this study, partial Blattabacterium genomic data were available from a previous study of cockroach mitochondrial genomes [1]. In some cases, new genomic data were obtained from fat bodies of individual , as follows. DNA was extracted using a DNeasy Blood and Tissue kit (Qiagen), according to the manufacturer’s protocol. Individual DNA samples were tagged with unique barcode combinations, mixed in equimolar concentration, and 150 bp paired-end-reads-sequenced with an Illumina HiSeq4000, following the methods described in ref. [1]. All specimens examined in this study are stored at the Okinawa Institute of Science and Technology, Japan. The genome sequences of Blattabacterium outgroups were obtained from GenBank and included three strains of Sulcia muelleri (accession numbers CP002163, AP013293, and CP010828), one Flavobacterium gilvum (CP017479), one Lutibacter sp. (CP017478), one Tenacibaculum dicentrachi (CP013671), and one Polaribacter sp. (LT629752). In total, we obtained 104 orthologous genes consistently present across the taxa that we examined.

Phylogenetic analysis

The 104 genes selected for analysis were found in 95% of all taxa; genes that were missing more than two taxa were excluded. All taxa had over 90% of the selected genes, except for Aeluropoda insignis which only had ~80% of all genes. Missing genes were presumed to be because of uneven sequencing coverage of samples and the relatively low sequencing coverage used, rather than the actual absence of these genes from their genomes. All genes were concatenated into a 107,187 bp data set. We aligned the protein-coding Blattabacterium genes at the amino acid level using MAFFT v7.305b [2], then back- translated them into nucleotide sequences using PAL2NAL [3]. The same alignment methods were used for the cockroach mtDNA genome analyses. Outgroups for the cockroach mtDNA genome analyses were as described in ref. [1]. The same phylogenetic methods were used for both the cockroach mtDNA and Blattabacterium data sets. Tests in DAMBE indicated that rd 3 codon positions in the mitochondrial dataset were saturated (NumOTU = 32, ISS = 0.804,

ISS.CAsym =0.809). Although the Blattabacterium sequences were not significantly saturated at 3rd codon sites (NumOTU = 32, ISS = 0.649, ISS.CAsym =0.819), we did not use it for our phylogenetic analyses because the test statistic was close to the critical value. Therefore, we performed phylogenetic analyses using alignments of 1st and 2nd codon sites only. First, we inferred the phylogeny using maximum likelihood in RAxML v8.2 [4], using 1000 bootstrap replicates to estimate node support. We used the GTR+G substitution model for Blattabacterium combined 1st and 2nd codon sites. The cockroach mtDNA data set was partitioned into 1st+2nd codon sites and 12S+16S+tRNAs, with each subset assigned a separate GTR+G substitution model.

Tests for congruence between host and symbiont datasets

We used ParaFit in R 3.5.1 [5] to quantify congruence between host and symbiont topologies. We first created matrices of patristic distances calculated from maximum-likelihood host and symbiont phylogenies and a host-symbiont association matrix, then implemented a global test with 999 permutations, using the ParafitGlobal value and a p-value threshold of 0.05 to determine significance.

Molecular dating

The molecular clock was calibrated using minimum age constraints based on the fossil record, implemented as uniform priors on node ages. Soft maximum bounds were determined using phylogenetic bracketing [6]. Altogether we used four fossils to calibrate four nodes (table S2). We selected fossils following suggested criteria for justifying fossil calibrations [7]. A birth-death process was used for the tree prior. A single Markov chain Monte Carlo (MCMC) analysis was performed, with the tree and parameter values sampled every 104 MCMC steps over a total of 108 steps. The initial 10% of trees were discarded as burn-in. A maximum-clade-credibility tree was obtained using TreeAnnotator in the BEAST software package [8]. Acceptable sample sizes and convergence to the stationary distribution were checked using Tracer 1.7 [8].

Simulations

We simulated the evolution of sequences from both host and symbiont along the Blattabacterium tree topology inferred using RAxML, with evolutionary parameters obtained from our separate RAxML analyses of the original data. Tree lengths for host and symbiont were rescaled according to their relative rates, but the relative branch rates were maintained between the two trees. We performed comparisons of evolutionary rates inferred from these synthetic data, using the methods described above. These analyses were carried out a total of three times.

Supplementary Table S1. A list of samples and collection data for each cockroach examined. Species Family Sample ID Accession Collecting locality Collector Date number Aeluropoda insignis B002 TBA Breeding colony of Kyle Kandilian N/A N/A Allacta australiensis AUS Allacta James Cook University, Rainforest site, Queensland, David Rentz 22-Jun-15 TBA Australia Amazonina sp. Ectobiidae Z256E TBA , Bosque Protector del Alto Nangaritza Frantisek Juna Apr-2016 Anallacta methanoides Ectobiidae B057 TBA Breeding colony of Kyle Kandilian N/A N/A Anaplecta calosoma Anaplectidae Cockroach contig Kuranda, Queensland, Australia David Rentz 17-Nov- TBA 1688 15 Anaplecta omei Anaplectidae Anaplecta omei TBA Mt. Emeishan, Sichuan Province, China Zongqing Wang 01-Jul-13 Balta sp. Ectobiidae Balta_sp. Cairns, Australia David Rentz 18-Dec- TBA 15 Beybienkoa kurandanensis Ectobiidae Cairns, Australia David Rentz 18-Dec- Beybienkoa_karandanensis TBA 15 giganteus Blaberidae BGIGA CP003535- GenBank N/A N/A CP003536 Blaberidae B056 TBA Breeding colony of Kyle Kandilian N/A N/A orientalis BOR CP003605- GenBank N/A N/A CP003606 germanica Ectobiidae BGE CP001487 GenBank N/A N/A Carbrunneria paramaxi Ectobiidae Carbru TBA Cairns, Australia David Rentz 05-Oct-15 Chorisoserrata sp. Ectobiidae CHORI TBA Yunnan Province, China Zongqing Wang Cosmozosteria sp. Blattidae B117 TBA Cape Upstart, Australia James Walker 13-Oct-15 hirtus Cryptocercidae HIR TBA Mt.Taibaishan, Shaanxi Province,China N/A 06-Oct-13 Cryptocercus punctulatus Cryptocercidae CPU CP003015- GenBank N/A N/A CP003016 Deropeltis paulinoi Blattidae B069 TBA Breeding colony of Kyle Kandilian N/A N/A sp. Ectobiidae Z254C TBA Slovenia Frantisek Juna Apr-2016 Ectoneura hanitschi Ectobiidae Ectoneura_hanitschi James Cook University, Rainforest site, Queensland, David Rentz 18-Dec- TBA Australia 15 maya Blaberidae B095 TBA Arcadia, Florida, USA Kyle Kandilian 07-Jul-09 distanti Blaberidae B025 TBA Breeding colony of Kyle Kandilian N/A Euphyllodromia sp. Ectobiidae Z257 TBA Ecuador, Podocarpus National Park Frantisek Juna Apr-2016 B081 TBA Breeding colony of Kyle Kandilian N/A N/A Eurycotis decipiens Blattidae B071 TBA Breeding colony of Kyle Kandilian N/A N/A Galiblatta cribrosa Blaberidae Z98 TBA Nouragues, French Guiana N/A 14-Jun-15 Blaberidae B030 Breeding colony of Kyle Kandilian N/A N/A TBA grandidieri Gyna capucina Blaberidae Z139GY TBA Ebogo, Cameroon Frantisek Juna 08-Sep-15 Ectobiidae B083 Torreya State Park, Bristol, Florida, USA 07-Jul-09 deropeltiformis Lamproblatta sp. Lamproblattidae LA male TBA Petit saut, French Guiana Frantisek Juna 08-Jul-09 Laxta sp. Blaberidae AUS2 Olney State Forest, New South, Wales, Australia Nathan Lo and 25-Aug- TBA Thomas 15 Bourguignon Macropanesthia Blaberidae B092 Breeding colony of Kyle Kandilian N/A N/A TBA rhinoceros Isoptera MADAR CP003000, GenBank N/A N/A CP003095 sp. Ectobiidae ECMD1 TBA Podocarpus National Park, Ecuador Frantisek Juna Apr-2016 Melanozosteria sp. Blattidae Melanozosteria_sp. Cairns, Australia David Rentz 18-Dec- TBA 15 Methana sp. Blattidae AUS1 North Manly, New South Wales, Australia Nathan Lo 01-Aug- TBA 15 Nauphoeta cinerea Blaberidae BNCIN CP005488- GenBank - Kambhampati et al. 2013 N/A N/A CP005489 Neolaxta mackerrasae Blaberidae B107 TBA Paluma Range, Australia David Rentz 15-Oct-15 Opisthoplatia orientalis Blaberidae Z15100 TBA Breeding colony of J. Hromádka N/A N/A nivea Blaberidae B044 TBA Breeding colony of Kyle Kandilian N/A N/A Panesthia angustipennis Blaberidae Z138 Breeding colony in Czech Republic., orig Vietnam N/A 01-Aug- TBA 15 Panesthia sp. Blaberidae Panesthia_sp TBA Bubeng, province Yunnan, China N/A 08-Jul-09 Paranauphoeta Blaberidae PARA N/A N/A N/A TBA circumdata Paratemnopteryx Ectobiidae B061 TBA Breeding colony of Kyle Kandilian N/A N/A virginica Ectobiidae B102 TBA Breeding colony of Kyle Kandilian N/A N/A americana Blattidae BPLAN CP001429- GenBank N/A N/A CP001430 Phyllodromica sp. Ectobiidae Phil Czech Republic Thomas 01-Aug- TBA Bourguignon 15 Platyzosteria sp. Blattidae AUS3 Olney State Forest, New South, Wales, Australia Nathan Lo and 25-Aug- TBA Thomas 15 Bourguignon Protagonista lugubris Blattidae Cockroach contig Mt.Diaoluoshan,Hainan, China Zongqing Wang 25-May- TBA 4907 15 femapterus Blaberidae B048 TBA Breeding colony of Kyle Kandilian N/A N/A Rhabdoblatta sp. Blaberidae RHA TBA Kuranda, Queensland, Australia David Rentz 16-Sep-15 Shelfordella lateralis Blattidae B080 TBA Breeding colony of Kyle Kandilian N/A N/A regularis Corydiidae B091 Palm plantation between Puducherry and Auroville, Kyle Kandilian N/A TBA India Tryonicus parvus Tryonicus parvus Olney State Forest, New South, Wales, Australia Nathan Lo and 10-Mar- TBA Thomas 16 Bourguignon Sulcia muelleri Flavobacteriaceae CARI CP002163 GenBank N/A N/A Sulcia muelleri Flavobacteriaceae PSPU AP013293 GenBank N/A N/A Sulcia muelleri Flavobacteriaceae CARI CP002165 GenBank N/A N/A Flavobacterium gilvum Flavobacteriaceae EM1308 CP017479 GenBank N/A N/A Lutibacter sp. Flavobacteriaceae LPB0138 CP017478 GenBank N/A N/A Tenacibaculum Flavobacteriaceae AY7486TD CP013671 GenBank N/A N/A dicentrachi Polaribacter sp. Flavobacteriaceae LT629752 KT25b GenBank N/A N/A Supplementary Table S2. Fossils used to calibrate the estimates of divergence times of major cockroach clades.

Species Age (Ma) / min. Calibration group Soft max. bound Ref. Comments on soft max. bound age constraint (97.5% for group probability) Baissatermes lapideus 137 Cryptocercus + Isoptera 235 [10] Triassoblatta argentina, earliest fossil of Mesoblattinidae [11] Balatronis libanensis 125 Blattidae + Tryonicidae 235 [12] Triassoblatta argentina, earliest fossil of Mesoblattinidae [11] Ischnoptera gedanensis 33.9 Ischnoptera+sister 145 [13] First modern cockroach: Zhujiblatta [14] Epilampra 41.3 Epilampra+Galiblatta 145 [15] First modern cockroach: Zhujiblatta [14] Supplementary figures

Supplementary Figure S1. Time-calibrated phylogenetic tree of Blattabacterium. The tree was inferred using Bayesian analysis in BEAST using 12 protein-coding genes with the 3rd codon sites excluded. Node bars indicate 95% credibility intervals of node ages. Colours of branches and species names represent different cockroach families: light green = Ectobiidae, teal = Blaberidae, blue = Corydiidae, olive green = Blattidae, maroon = Tryonicidae, pink = Cryptocercidae, red = , dark green = Anaplectidae, and orange = Lamproblattidae.

Supplementary Figure S2. Time-calibrated phylogenetic tree of cockroaches. The tree was inferred using Bayesian analysis in BEAST using whole cockroach mitochondrial genomes with the 3rd codon sites of protein-coding genes excluded. Node bars indicate 95% credibility intervals of node ages. Colours of branches and species names represent different cockroach families: light green = Ectobiidae, teal = Blaberidae, blue = Corydiidae, olive green = Blattidae, maroon = Tryonicidae, pink = Cryptocercidae, red = termites, dark green = Anaplectidae, and orange = Lamproblattidae.

Supplementary Figure S3. Time-calibrated phylogenetic tree based on a combined Blattabacterium and cockroach mitochondrial data set. The tree was inferred using Bayesian analysis in BEAST using 12 protein-coding genes from Blattabacterium and whole mitochondrial genomes from cockroaches, with the 3rd codon sites of protein-coding genes excluded. Node bars indicate 95% credibility intervals of node ages. Colours of branches and species names represent different cockroach families: light green = Ectobiidae, teal = Blaberidae, blue = Corydiidae, olive green = Blattidae, maroon = Tryonicidae, pink = Cryptocercidae, red = termites, dark green = Anaplectidae, and orange = Lamproblattidae.

References

1. Bourguignon T, Tang Q, Ho SYW, Juna F, Wang Z, Arab DA, Cameron SL, Walker J, Rentz D, Evans TA. 2018 Transoceanic dispersal and plate tectonics shaped global cockroach distributions: evidence from mitochondrial phylogenomics. Mol. Biol. Evol. 35, 970–983. 2. Katoh K, Standley DM. 2013 MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780. 3. Suyama M, Torrents D, Bork P. 2006 PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 34, W609–W612. 4. Stamatakis A. 2014 RAxML version 8: a tool for phylogenetic analysis and post- analysis of large phylogenies. Bioinformatics 30, 1312–1313. 5. R Core Team. 2014 R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/. 6. Ho SYW, Phillips MJ. 2009 Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Syst. Biol. 58, 367–380. 7. Parham JF, Donoghue PC, Bell CJ, Calway TD, Head JJ, Holroyd PA, Inoue JG, Irmis RB, Joyce WG, Ksepka DT. 2011 Best practices for justifying fossil calibrations. Syst. Biol. 61, 346–359. 8. Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. 2018 Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst. Biol. 67, 901–904. 9. Drummond AJ, Suchard MA, Xie D, Rambaut A. 2012 Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973. 10. Engel MS, Grimaldi DA, Krishna K. 2007 Primitive termites from the early of Asia (Isoptera). Stuttg. Beitr. Naturkd B 371, 1–32. 11. Martins-Neto RG, Mancuso A, Gallego OF. 2005 The fauna from Argentina. Blattoptera from the (Bermejo Basin), La Rioja Province. Ameghiniana 42, 705–723. 12. Sendi H, Azar D. 2017 New aposematic and presumably repellent bark cockroach from Lebanese amber. Cretaceous Res. 72, 13–17. 13. Shelford R. 1910 On a collection of Blattidæ preserved in Amber, from Prussia. Zool. J. Linn. Soc. 30, 336–355. 14. Lin Q. 1980 Fossil from the Mesozoic of Zhejiang and Anhui. In Nanjing Institute of Geology and Palaeontology. Division and Correlation of Stratigraphy of Mesozoic Volcanic Sediments from Zhejiang and Anhui, pp. 223–224. Beijing: Science Press. 15. Beccaloni G. 2007 species file online. Version 1.0/4.1. URL http://blattodea.speciesfile.org