Page 1 of 48 Systematic Biology

1 Robert S. de Moya

2 Department of Entomology

3 University of Illinois Urbana-Champaign

4 505 S. Goodwin Ave.

5 Urbana, IL 61801 6 For Peer Review OnlyEmail: [email protected] 7

8 Phylogenomics of parasitic and non-parasitic lice (Insecta: ): Combining sequence data

9 and Exploring compositional bias solutions in Next Generation Datasets

10 Robert S. de Moya1,2, Kazunori Yoshizawa3, Kimberly K.O. Walden1, Andrew D. Sweet4,

11 Christopher H. Dietrich2, Kevin P. Johnson2

12 1Department of Entomology, University of Illinois Urbana-Champaign, 505 S. Goodwin Ave.,

13 Urbana, IL 61801, USA

14 2Illinois Natural History Survey, Prairie Research Institute, University of Illinois, Champaign, IL

15 61820, USA

16 3Systematic Entomology, School of Agriculture, Hokkaido University, Sapporo 060-8589, Japan

17 4Department of Entomology, Purdue University, 901 W. State St., West Lafayette, IN 47907,

18 USA

19

http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 2 of 48

de Moya et al.

20 Abstract

21

22 The order Psocodea is a diverse lineage comprising both parasitic (Phthiraptera)

23 and non-parasitic members (). The extreme age and ecological diversity of the group

24 may be associated with major genomic changes, such as base compositional biases expected to 25 affect phylogeneticFor inference. Peer Divergent morphologyReview between Only parasitic and non-parasitic 26 members has also obscured the origins of within the order. We conducted a

27 phylogenomic analysis on the order Psocodea utilizing both transcriptome and

28 sequencing to obtain a data set of 2,370 orthologous genes. All phylogenomic analyses,

29 including both concatenated and coalescent methods suggest a single origin of parasitism within

30 the order Psocodea, resolving conflicting results from previous studies. This phylogeny allows us

31 to propose a stable ordinal level classification scheme that retains significant taxonomic names

32 present in historical scientific literature and reflects the evolution of the group as a whole. A

33 dating analysis, with internal nodes calibrated by fossil evidence, suggests an origin of parasitism

34 that predates the K-Pg boundary. Nucleotide compositional biases are detected in third and first

35 codon positions and result in the anomalous placement of the as sister to

36 when all nucleotide sites are analyzed. Likelihood-mapping and quartet sampling

37 methods demonstrate that base compositional biases can also have an effect on quartet-based

38 methods.

39

40 Keywords:

41 Quartet sampling, Psocoptera, Phthiraptera, Illumina, Recoding methods

42

2 http://mc.manuscriptcentral.com/systbiol Page 3 of 48 Systematic Biology

Phylogenomics of Psocodea

43 Introduction

44

45 The era of phylogenomic analysis has provided access to new data types originating from

46 different genomic regions (e.g. ultra-conserved elements (UCE) and single-copy protein-coding

47 orthologs) (Jarvis et al. 2014; Prum et al. 2015) or from post transcriptional processes (i.e. 48 transcriptomes) (MisofFor et al. Peer2014; Johnson Review et al. 2018a). Availability Only of certain data types may 49 be contingent on the quality or age of a specimen. For example, RNA degrades quickly, and

50 transcriptome-based analyses are not typically feasible for old or fixed specimens (Houseley and

51 Tollervey 2009; Bossert et al. 2019). Transcriptomes are obtainable for fresh specimens

52 preserved in an appropriate buffer (e.g. RNA-laterTM) that inhibits the activity of RNases which

53 degrade RNA (Houseley and Tollervey 2009). Transcriptomes correspond to coding gene

54 sequences and are free of non-coding introns, and thus align well. Alternatively, most next

55 generation methods that produce whole genome sequences include a fragmentation step prior to

56 library construction (Alkan et al. 2011) that may mimic degradation processes that occur in

57 specimens that are dry or old (Bossert et al. 2019). DNA-based genome sequencing is not limited

58 by the amount of RNA present in a cell and can produce many reads across the genome (Johnson

59 2019). However, raw genomic DNA sequence data contain intron and non-coding data (Jarvis et

60 al. 2014), but these can be excised prior or masked following alignment with transcriptome data.

61 High volume sequencing technologies have often been described and implemented in

62 phylogenomic studies; however, approaches to combine sequences derived from different next

63 generation sequencing technologies have been less developed. A recent phylogenomic analysis

64 of Apidae (bees) successfully combined transcriptome, genome, and UCE data to produce a

3 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 4 of 48

de Moya et al.

65 robust topology (Bossert et al. 2019). However, this is one of few studies to combine

66 transcriptome and partial genome data in a single phylogenomic analysis.

67 Phylogenomic analyses have helped resolve many contentious relationships, but have

68 also accentuated the need to test for compositional (and other) biases in molecular data sets that

69 may be amplified by the inclusion of millions of base pairs of nucleotides and can lead to strong 70 support for misleadingFor hypotheses Peer (Romiguier Review et al. 2016; Bossert Only et al. 2017, 2019; Simion et 71 al. 2017; Laumer et al. 2018; Simon et al. 2018; Vasilikopoulos et al. 2019). Before the

72 widespread use of phylogenomic analysis, Sanger sequencing-based phylogenetics sought to

73 optimize the topology by testing hierarchical models of evolution (Posada and Crandall 1998).

74 Weaknesses of these models can be exposed (Duchêne et al. 2017), due to large scale

75 compositional biases that exist in codon positions found across thousands of loci, which violate

76 model assumptions of stationarity (Simion et al. 2017; Laumer et al. 2018). These biases can

77 create phylogenetic artifacts that appear well supported given traditional clade support values

78 (i.e. bootstrap support) but is actually misleading because the large amount of data simply

79 converges upon a stable topology due to underlying weaknesses in model assumptions.

80 Base compositional biases (%GC) have long been known to influence the results of

81 phylogenetic analyses (Galtier and Gouy 1995; Jermiin et al. 2004; Bossert et al. 2017) and

82 several methods for reducing the influence of such biases on phylogenetic inference have been

83 proposed (Jermiin et al. 2004; Sheffield et al. 2009; Regier et al. 2010; Ishikawa et al. 2012;

84 Zwick et al. 2012; Simmons 2017). However, most methods that incorporate time-heterogeneous

85 approaches (Philippe et al. 2011; Roure and Philippe 2011) are extremely computationally

86 intensive and would be difficult to apply to large phylogenomic data sets, although they have

87 been applied in mitochondrial phylogenomics with success (Sheffield et al. 2009). Alternative

4 http://mc.manuscriptcentral.com/systbiol Page 5 of 48 Systematic Biology

Phylogenomics of Psocodea

88 methods include recoding techniques (Simmons 2017) which use IUPAC ambiguity codes to

89 mask variable codon positions that code for a silent mutation, such as RY recoding (Ishikawa et

90 al. 2012) and degeneracy methods (Regier et al. 2010; Zwick et al. 2012). Another solution is to

91 discard possible saturated data, for example removal of the third codon positions from an

92 alignment (Breinholt and Kawahara 2013) or even the first and third codon postitions (Misof et 93 al. 2014). These twoFor methods Peer are effective Review for concatenated datasets;Only however, coalescent 94 analyses may also be influenced by compositional biases in individual genes (Romiguier et al.

95 2016; Bossert et al. 2017, 2019). A further solution is to analyze amino acid sequences, although

96 it is possible that underlying base compositional biases can result in amino acid biases as well

97 (Foster et al. 1997). In addition, molecular models for the evolution of amino acids are much

98 more computationally intensive and may not be feasible for analysis of large genomic data sets,

99 because a twenty-one amino acid model (two coding strategies for serine) (Zwick et al. 2012) is

100 much more complex relative to nucleotide models based on four bases (Posada and Crandall

101 1998). Here we explore some of these issues using a combined genome and transcriptome data

102 set for a group of (Psocodea) known to have strong variation in base compositional biases

103 across taxa (Johnson et al. 2003; Yoshizawa and Johnson 2013).

104 The insect order Psocodea encompasses the two historically recognized groups

105 Psocoptera (free-living bark lice) and Phthiraptera (parasitic lice) that were once considered

106 separate orders. Members of Psocodea have an extensive fossil record that extends into the

107 Lower Cretaceous (Mockford et al. 2013) and molecular divergence time estimates place their

108 origin in the Paleozoic (~404 Ma) (Misof et al. 2014; Johnson et al. 2018a; Yoshizawa et al.

109 2019). The order also encompasses species with a range of feeding preferences, from detritus,

110 plant material (i.e. pollen, decaying leaves), and microflora (i.e. cyanobacteria films, fungal, and

5 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 6 of 48

de Moya et al.

111 lichen) in non-parasitic members (Broadhead and Wapshere 1966; New 1970, 1987; Broadhead

112 and Richards 1982); to obligate ectoparasitism on birds and mammals (i.e. skin debris, feathers,

113 blood/skin secretions) (Price et al. 2003; Clayton et al. 2015). The ecological diversity and age of

114 the group have likely contributed to large-scale compositional biases that have previously been

115 detected between parasitic and non-parasitic members (Johnson et al. 2003; Yoshizawa and 116 Johnson 2013; JohnsonFor et al. Peer 2018a). These Review known compositional Only biases provide an opportunity 117 to examine the effects such biases may have on phylogenomic analyses. Other groups of

118 organisms are also known to show such biases (Cox et al. 2014; Romiguier et al. 2016; Bossert

119 et al. 2017; Skinner et al. 2020), thus understanding the potential effects and methods to account

120 for such biases will have relevance to many phylogenomic studies.

121 The order Psocodea also represents an ideal taxon for examining the effect of combining

122 whole genome and transcriptome derived sequence data. Parasitic lice are known to have

123 reduced genome sizes (Pittendrigh et al. 2006; Johnston et al. 2007; Kirkness et al. 2010) and are

124 minute insects which typically produce small amounts of RNA. humanus has one of

125 the smallest insect recorded, at 108 Mbp (Kirkness at al. 2010), and coverage estimates

126 from Illumina genome sequencing indicate small genome sizes may be a general feature of

127 Psocodea (100 – 400 Mbp, unpublished). While it has been possible in the past to sequence

128 transcriptome-based data for parasitic Phthiraptera based on pooling many individuals (Johnson

129 et al. 2018b), data is more readily obtained by whole genome sequencing (Allen et al. 2017;

130 Boyd et al. 2017; Sweet et al. 2018). The reduced genome size of parasitic lice makes it possible

131 to produce high quality assemblies from multiplexed samples on a single sequencing lane with

132 whole genome-based sequencing methods (Allen et al. 2017; Boyd et al. 2017; Sweet et al. 2018;

133 Johnson 2019). In contrast, non-parasitic Psocodea (bark-lice) are typically larger in body

6 http://mc.manuscriptcentral.com/systbiol Page 7 of 48 Systematic Biology

Phylogenomics of Psocodea

134 volume and produce higher copy transcript sequences (Johnson et al. 2018a). Less total sequence

135 data is needed for transcriptome sequencing, thus is more economical than whole genome

136 sequencing. (Johnson 2019). Therefore, there is a cost advantage to combining transcriptome and

137 whole genome data in phylogenomic analyses that combine parasitic and non-parasitic Psocodea.

138 Our study is the first to test the utility of combining these different data types in a study of 139 Psocodea phylogeny,For and this Peer general approach Review should be applicable Only to many groups of 140 organisms.

141 Although the monophyly of the lineage comprising both Phthiraptera and Psocoptera is

142 well established based on morphological criteria (Lyal 1985; Yoshizawa and Lienhard 2010),

143 inconsistent taxonomic treatment of the two groups continues (Emeljanov et al. 2001; Scholtz

144 2016; Durden 2019; Wang et al. 2019). Psocoptera traditionally consists of three recognized

145 suborders (, Psocomorpha, and ) (Lienhard and Smithers 2002) and

146 Phthiraptera has four previously recognized suborders (, ,

147 Rhynchophthirina, and Anoplura) (Price et al. 2003). Based on molecular and morphological

148 evidence, Phthiraptera is derived from within the Troctomorpha (Lyal 1985; Johnson et al. 2004;

149 Yoshizawa and Lienhard 2010; Johnson et al. 2018a), but inconsistent use of the subordinal

150 ranks that divide the traditional orders Psocoptera and Phthiraptera can be found in modern

151 literature (Emeljanov et al. 2001; Scholtz 2016). Adding to the confusion, the origin of

152 parasitism remains in question (Yoshizawa and Johnson 2010) because phylogenetic analyses of

153 a ribosomal gene suggested that Phthiraptera could be polyphyletic (Johnson et al. 2004).

154 To explore the phylogenetic relationships within Psocodea, we assembled a large

155 phylogenomic data set (2,370 orthologous genes) derived from whole genome and transcriptome

156 sequencing using a customized pipeline. Using this dataset, comprising more than two million

7 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 8 of 48

de Moya et al.

157 base pairs of nucleotide data, we examined the effects of large base compositional biases on

158 phylogenetic inference. We used the results from our assessment of the influence of base

159 composition on tree topology to conduct a dating analysis accounting for these biases to explore

160 the origins of parasitism in this group.

161 162 Materials & MethodsFor Peer Review Only 163

164 Taxonomic Sampling

165

166 Sampling was aimed at resolving deep level relationships between historically recognized

167 orders or suborders that comprise the insect order Psocodea. One focus of the sampling was

168 resolving whether or not the parasitic lice (Phthiraptera) form a monophyletic assemblage

169 (Johnson et al. 2004; Yoshizawa and Johnson 2010). Sampling included a broad array of

170 parasitic species and the closest non-parasitic members known as the (Mockford

171 1993). In total 112 individuals were sampled, encompassing all currently recognized suborders

172 and infraorders (Table 1). Prior studies established that Trogiomorpha is monophyletic and is the

173 sister taxon of the remainder of Psocodea (Johnson et al. 2004; Yoshizawa et al. 2006), so we

174 used this suborder as the root. This was done because the sister taxon of Psocodea is currently

175 unclear (Misof et al. 2014; Johnson et al. 2018a), thus we avoided outgroups that are highly

176 divergent from the ingroup to circumvent alignment difficulties and potential long branch

177 attraction artifacts.

178

179 Next Generation Sequencing and Orthology Inference

8 http://mc.manuscriptcentral.com/systbiol Page 9 of 48 Systematic Biology

Phylogenomics of Psocodea

180

181 Given the inherent difficulties of obtaining large quantities of freshly preserved tiny

182 insects, such as lice, we developed a pipeline to use whole genome sequencing to obtain

183 orthologs belonging to a set included in a previous dataset derived from transcriptome data

184 (Johnson et al. 2018a). In some cases, both genome and transcriptome sequences were available 185 for the same species.For This allowed Peer us to verifyReview that these two Onlydata types placed respective species 186 in the same phylogenetic position.

187 Whole genome data were obtained following genomic DNA extraction procedures and

188 using Illumina sequencing technologies. Specimens were stored in 95% ethanol at -80 ºC. From

189 these, genomic DNA was extracted using a Qiagen DNAeasy extraction kit. The protocol was

190 slightly modified with an extended 48-hour incubation step and use of 52 µl of elution buffer.

191 DNA was quantified using a Qubit 3.0 fluorometer. The extractions were then sonicated with a

192 Covaris M220 to an average size of 300-400 nt. A Kapa Library Preparation Kit (Kapa

193 Biosystems) was used to produce pair-end libraries. The libraries were then pooled into

194 equimolar concentrations, quantified by qPCR. Each sample was sequenced for 151-161 cycles

195 on a Hiseq2500 (Illumina) with a TruSeq or HiSeq SBS sequencing rapid kit to produce 160 nt

196 reads. Fastq files were produced with Casava 1.8.2 or bcl2fastq v2.17.1.14. Adaptors and low-

197 quality bases were removed using the FASTX Toolkit 0.0.14 (Gordon and Hannon 2010). All

198 sequencing took place at W.M. Keck Center at the University of Illinois, Urbana-Champaign. A

199 gene set of 2,395 protein-coding orthologs previously used for phylogenomic analyses of

200 hemipteroid insects was identified in the annotated genome of the human body , P.

201 humanus (Johnson et al. 2018a). This ortholog set was used as a reference in aTRAM 1.0 (Allen

202 et al. 2015) for local assembly of individual orthologs. This software uses tblastn searches to

9 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 10 of 48

de Moya et al.

203 identify reads matching the gene of interest and assembles them locally. Parameters for aTRAM

204 loci assembly were set to three iterations, fraction one, and the ABySS de novo assembler

205 (Simpson et al. 2009). Exon sequences assembled by aTRAM were then annotated and stitched

206 together if needed using an Exonerate-based (Slater and Birney 2005) pipeline (Allen et al.

207 2017). Transcriptome assemblies and inferred ortholog transcripts were previously published 208 (Table 1, Johnson etFor al. 2018a). Peer Review Only 209

210 Phylogenomic Analyses

211

212 Nucleotide sequences inferred as being orthologous from whole genome sequence data

213 were translated with Geneious 11.1.15 (Kearse et al. 2012). Translated whole genome and

214 transcriptome sequences were aligned with PASTA 1.8.0 by amino acid with memory usage

215 increased (2048 MB) and otherwise default parameters (Mirarab et al. 2014a). Nucleotide

216 sequences were retrieved using a custom python script to produce a final nucleotide alignment

217 based upon the amino acid alignments (Allen et al. 2017). Exonerate inserts ambiguous N’s

218 between combined exon data, therefore excess N’s were recoded to gaps before subsequent

219 masking. Multiple sequence gene alignments (MSAs) were masked on a nucleotide level with

220 trimAl (Capella-Gutiérrez et al. 2009) using a 40% gap threshold. Subsequently, MSAs that

221 included less than 50% of individuals were eliminated from subsequent analyses. Final

222 concatenated supermatrices of 2,370 gene sequences were produced with SequenceMatrix 1.8

223 (Vaidya et al. 2011).

224 Several phylogenetic analyses were performed using the final nucleotide supermatrix.

225 First, all nucleotide sites were analyzed in both partitioned and unpartitioned maximum

10 http://mc.manuscriptcentral.com/systbiol Page 11 of 48 Systematic Biology

Phylogenomics of Psocodea

226 likelihood (ML) frameworks. Second, nucleotide sites were recoded using degeneracy coding

227 (Regier et al. 2010; Zwick et al. 2012) for phylogenetic analyses. Finally, nucleotide

228 supermatrices of 1) first and second codon positions and 2) second codon positions only were

229 produced from Geneious (Kearse et al. 2012). Partitioned analyses were performed on the all

230 nucleotide site and degeneracy recoded supermatrices. The optimal partitioning scheme was 231 determined with PartitionFinderFor Peer 2.1.1 (Lanfear Review et al. 2017) and Only the implemented version of 232 RAxML 8.2.11 (Stamatakis 2014) with the following parameters: branch lengths linked, GTR +

233 G model, BIC model selection, rcluster search and max set to 100.

234 A series of ML phylogenetic analyses were performed on the resulting supermatrices

235 using ExaML 3.0.21 (updated: 6/4/2018) (Kozlov et al. 2015) and RAxML 8.2.11 (Stamatakis

236 2014). To save computation time, the ML hill-climbing algorithm was performed in ExaML with

237 a gamma model and 100 rapid bootstrap replicates were completed with RAxML using the GTR

238 + G model with four GAMMA categories. For each bootstrap search, we tested for bootstrap

239 convergence using RAxML (Pattengale et al. 2009). In all cases, convergence was reached by 50

240 replicates, so the 100 bootstrap replicates are sufficient to provide a reliable estimator of

241 bootstrap proportions. To ensure that the most likely topology was obtained, eight separate tree

242 searches were performed with ExaML, each with different starting input trees derived from

243 RAxML (four parsimony based and four random start topologies). Bootstrap support (BS) for the

244 most likely topology obtained was then mapped using SumTrees 4.1.0 (Sukumaran and Holder

245 2015). These methods were used to analyze all supermatrices produced (all nucleotide sites,

246 degeneracy recoding, second codon positions only, and third codon positions removed).

247 To account for possible incongruence among genes due to incomplete lineage sorting or

248 other biases masked by concatenation, we also performed coalescent gene/species-tree analyses

11 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 12 of 48

de Moya et al.

249 using Astral 5.5.9 (Mirarab et al. 2014b; Mirarab and Warnow 2015). To infer individual gene

250 trees as basis, each of the 2,370 MSAs were analyzed with RAxML using GTR + G and 100

251 rapid bootstrap replicates. Estimation of individual gene trees was performed using both 1) all

252 sites and 2) the degeneracy recoded data for each gene. Resulting bipartition files produced from

253 RAxML were used as in input for Astral analyses using default parameters with branch support 254 calculated based onFor local posterior Peer probability Review (LPP) (Sayyari Only and Mirarab 2016). 255 To evaluate support for conflicting topologies surrounding the phylogenetic position of

256 Amphientometae (see Results) due to potential biases in the concatenated dataset, quartet

257 sampling (Pease et al. 2018) and four cluster likelihood-mapping (quartet mapping, Strimmer

258 and Haeseler 1997) methods were employed. Four cluster likelihood-mapping was performed in

259 IQ-TREE 1.6.5 (Nguyen et al. 2015) testing all possible quartets, tree search skipped, GTR + G

260 model, and the following quartets defined: Trogiomorpha, Psocomorpha, Amphientometae, and

261 Nanopsocetae. Likelihood-mapping in IQ-TREE was performed on the nucleotide supermatrix

262 with 1) all sites, 2) degeneracy recoding, 3) first and second positions only, and 4) second codon

263 positions only. Four cluster likelihood-mapping is not computationally feasible across all

264 branches in large datasets. However, Pease et al. (2018) developed a quartet sampling method

265 which performs four cluster likelihood mapping across each node of the tree but using a random

266 subsample of all possible quartet combinations for that node. We used quartet sampling (Pease et

267 al. 2018) to evaluate support for conflicting topologies across all phylogenetic branches. Quartet

268 sampling was performed using a log likelihood cutoff value of 2 and 200 replicates per branch on

269 the supermatrices for 1) all sites, 2) degeneracy recoding, 3) first and second positions only, and

270 4) second codon positions only. For ease of comparison, we provide a summary of all

271 phylogenetic analyses performed (Table 2).

12 http://mc.manuscriptcentral.com/systbiol Page 13 of 48 Systematic Biology

Phylogenomics of Psocodea

272 Guanine and cytosine content (GC%) were calculated per gene and codon position from

273 the masked MSAs with a custom python script (Allen et al. 2015, 2017). Following GC%

274 calculation, biases were visualized with box and whisker plots produced from RStudio 1.1.453

275 (RS Team 2015). Distribution of the GC% obtained for each individual gene were arranged in

276 ascending order per individual sampled based on the median GC% score obtained. This process 277 was repeated for first,For second, Peer and third codonReview positions. Only 278 Phylogenetic dating analyses using relaxed clock methods were performed with

279 MCMCTree in the PAML package under a correlated rates model (Yang 2007) on a topology

280 resulting from the ML searches of the partitioned degeneracy-coded dataset. A total of nine

281 internal calibration points with soft bounds were based on fossil evidence or previous dating

282 analyses (Wappler et al. 2004; Mockford et al. 2013; Johnson et al. 2018a,b). The internal

283 minimum age calibrations based on fossil evidence include the following: split of Atropetae (120

284 Ma), Psocomorpha (84 Ma), (33.9 Ma), (33.9), Amphientometae (145

285 Ma), + Phthiraptera (99 Ma), (44 Ma), + ( +

286 Pediculus) (20-25 Ma), and P. schaeffi + P. humanus (5-7 Ma) (Wappler et al. 2004; Mockford

287 et al. 2013; Johnson et al. 2018b). A maximum root age calibration for the split between

288 Trogiomorpha and the remainder of Psocodea was set to 328 Ma based on a previous dating

289 analysis (Johnson et al. 2018a) and was used to estimate the rate of substitution across the

290 topology. These described calibrations are not completely independent of calibrations previously

291 employed (Johnson et al. 2018a) but do include additional calibration points relevant to our

292 current taxon sampling. A reversible (GTR) model was implemented for the analysis. The

293 stationarity of two separate MCMC runs was visualized with Tracer 1.7.1 (Rambaut et al. 2018).

294

13 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 14 of 48

de Moya et al.

295 Results

296

297 In total 2,370 genes were successfully aligned yielding a supermatrix of 2,945,181 bp

298 including all three codon positions. Transcriptome and whole genome sequences aligned well,

299 facilitating subsequent phylogenetic analyses. On average, each individual sampled had data 300 present for 95% of Forgenes sampled Peer (Table Review1). Only 301 Topologies from maximum likelihood (ML) phylogenetic analyses of the concatenated

302 sequence dataset varied depending on the methods used for coding or removing nucleotides.

303 Much of this variation centered around the placement of the Amphientometae, an infraorder of

304 Troctomorpha comprising free-living taxa. Amphientometae was recovered as sister to the

305 suborder Psocomorpha with maximum support (100% BS), which contains only non-parasitic

306 taxa, when 1) all nucleotide sites or 2) first and second codon positions were analyzed. However,

307 under degeneracy recoding or analysis of second codon positions only, Amphientometae was

308 recovered as sister to the remainder of the Troctomorpha with maximum support (100% BS), the

309 suborder into which it has traditionally been placed (Fig. 1).

310 Other than the placement of Amphientometae (i.e. monophyly of Troctomorpha),

311 relationships between the other major lineages within Psocodea were generally stable across

312 analyses. Psocomorpha was always recovered as monophyletic (100% BS). Within

313 Troctomorpha, Phthiraptera (parasitic lice) was always recovered as monophyletic regardless of

314 coding method (100% BS). The family Liposcelididae was also always recovered as

315 monophyletic and as the sister taxon of all parasitic lice (100% BS) as predicted from

316 morphology (Lyal 1985). Nanopsocetae (, , and Liposcelididae

317 plus Phthiraptera) was also always recovered as monophyletic (100% BS).

14 http://mc.manuscriptcentral.com/systbiol Page 15 of 48 Systematic Biology

Phylogenomics of Psocodea

318 Among Phthiraptera, there was variation across analyses in the position of some mammal

319 lice (Anoplura, , and Rhynchophthirina). In particular, Rhynchophthirina

320 (elephant lice) was sister to the chewing louse family Trichodectidae when all nucleotide sites

321 were analyzed (100% BS). However, under 1) degeneracy coding, 2) first and second codon

322 positions only, and 3) second codon positions only, Rhynchophthirina was recovered as sister to 323 Anoplura (sucking Forlice) (100% Peer BS). Either Review of these placements Only resulted in paraphyly of what is 324 traditionally considered to be Ischnocera (one of the suborders of chewing lice): Trichodectidae

325 (parasitizing mammals) and (parasitizing mainly birds). Thus, our results also

326 support the existence of a larger mammal infesting clade comprising the Trichodectidae,

327 Rhynchophthirina, and Anoplura, which corroborates recent analyses (Johnson et al. 2018b; de

328 Moya et al. 2019; Song et al. 2019). The other traditional chewing louse suborder, Amblycera,

329 was recovered as monophyletic across all analyses and sister to the remainder of Phthiraptera.

330 Within the Psocomorpha (bark lice, all free-living), relationships between some

331 infraorders showed variation across analyses. In particular, the infraorder was

332 not supported as monophyletic across all ML analyses. Two families of Homilopsocidea

333 ( and ) were most unstable in their placement and each of them

334 sometimes grouped with the Caeciliusetae depending on the method of analysis. However,

335 despite the poor support for the monophyly of the Homilopsocidea, the infraorder was

336 consistently recovered as sister to other members of the Caeciliusetae across ML analyses (100%

337 BS). The and were recovered as sister taxa across ML analyses (89-100%

338 BS) and together sister to Philotarsetae (100% BS). In general, support values are higher within

339 the Psocomorpha when all nucleotide sites are analyzed with a ML approach. The single sampled

340 member of the Archipsocetae () was always recovered as sister to the remainder of

15 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 16 of 48

de Moya et al.

341 the Psocomorpha (100% BS), as in prior morphological (Yoshizawa 2002) and molecular studies

342 (Yoshizawa and Johnson 2014).

343 Within the bark louse suborder Trogiomorpha, there was little variation in the results

344 among analyses. The infraorder Prionoglaridetae was recovered as paraphyletic across all

345 analyses, but this paraphyletic relationship was poorly supported in the degeneracy (6% BS) and 346 second codon positionFor only (33%Peer BS) analyses. Review However, a paraphyleticOnly Prionoglaridetae was 347 supported with maximum bootstrap support (100% BS) when all nucleotide sites or first and

348 second codon positions were analyzed. The remainder of Trogiomorpha was embedded within

349 this paraphyletic assemblage of Prionoglaridetae. The infraorders Atropetae and Psyllipsocetae

350 were each recovered as monophyletic and sister lineages across all ML analyses (100% BS).

351 Coalescent gene/species tree analyses (Astral) of individual gene trees across the 2,370

352 orthologous gene data set yielded similar branching patterns and measures of clade support

353 relative to concatenated ML analyses of the same data type. Most nodes received maximum

354 support (1.0 LPP). As in the ML analyses of all sites for the concatenated data set, coalescent

355 analyses of gene trees from all nucleotide sites display maximum (1.0 LPP) support for a sister

356 relationship between Psocomorpha and Amphientometae (Supplemental Fig. 2). However, when

357 degeneracy recoded gene trees are analyzed (Supplemental Fig. 3), there is maximum support

358 (1.0 LPP) for a monophyletic Trocotomorpha including the Amphientometae. Relationships

359 among members of the Psocomorpha displayed some instability when degeneracy recoded data

360 was analyzed in a coalescent context. However, a similar topology relative to the ML analyses is

361 obtained for members of Psocomorpha when all sites are analyzed in a coalescent context.

362 Results of four-cluster likelihood-mapping show the distribution of discordant topologies

363 between different methods of analyses for the placement of the Amphientometae (Fig. 2). When

16 http://mc.manuscriptcentral.com/systbiol Page 17 of 48 Systematic Biology

Phylogenomics of Psocodea

364 all nucleotide sites are analyzed 65.2% of quartets sampled favor a sister relationship between

365 the Amphientometae and Psocomorpha. Similarly, when first and second positions are analyzed

366 the support declines, but 50.2% of quartets still support a sister relationship between the

367 Amphientometae and Psocomorpha. In contrast, when the degeneracy recoded or second codon

368 positions only data are analyzed, 43.8% and 42.5% of quartets sampled respectively, support a 369 sister relationship betweenFor thePeer Amphientometae Review and Nanopsocetae, Only while only 30.1% and 30.8% 370 support Amphientometae with Psocomorpha.

371 Quartet sampling analyses are able to assess support from quartets (four cluster

372 likelihood-mapping) across all nodes in the tree. Using slightly different metrics, these analyses

373 estimate the frequency of discordant topologies across the resultant ML topology tested. When

374 all nucleotide sites are analyzed, a weak majority of quartets support a sister Psocomorpha +

375 Amphientometae (0.23 QC) (Fig. 3). In contrast, when using the degeneracy recoded dataset, a

376 slight majority of quartets sampled support a monophyletic Troctomorpha, including the

377 Amphientometae (0.01 QC) (Fig. 4). Quartet sampling also provides an estimate of which nodes

378 are most stable given the data type and topology analyzed. For example, monophyly of the

379 parasitic louse clade that includes Philopteridae, Trichodectidae, Anoplura, and

380 Rhynchophthirina is supported by all quartets sampled across all nucleotide sites and degeneracy

381 recoded analyses (1.0 QC) and no discordant topologies are detected (NA QD).

382 Visualization of the distribution of GC content for first, second, and third codon positions

383 revealed substantial compositional biases at all positions between suborders or infraorders of

384 Psocodea. Third codon positions showed the most variation in compositional biases (Fig. 5).

385 Members of the Amphientometae possess some of the highest levels of GC content for third

386 codon positions, similar to the pattern observed in Psocomorpha. In contrast members of the

17 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 18 of 48

de Moya et al.

387 Nanopsocetae tend to be more AT rich at third codon positions, similar to patterns observed in

388 third codon positions of the Trogiomorpha. First and second codon positions showed similar

389 patterns of compositional biases, but with much lower levels of variance around the medians

390 relative to third positions (Figs. 6 and 7). First codon positions showed more variation in the

391 medians relative to second codon positions. However, members of the Psocomorpha and 392 Amphientometae wereFor suggested Peer to have Reviewthe highest levels of Only GC content in first and second 393 positions and the Nanopsocetae and Trogiomorpha tended to be more AT rich in first and second

394 codon positions (Fig. 6 and 7).

395 Divergence time analysis indicates the main diversification of extant lineages of parasitic

396 lice occurred approximately 60 Ma (74-123: 95% Myr) following the K-PG boundary mass

397 extinction event. The origin of parasitism within Psocodea could have occurred a maximum of

398 115 Ma (95-148: 95% Myr) based on estimated divergence of the parasitic lineage from non-

399 parasitic members of Psocodea. Divergences between suborders of Psocodea are estimated to

400 have occurred in the lower Jurassic with the deepest split between extant suborders occurring

401 192 Ma (154-255: 95% Myr) (Fig. 1).

402

403 Discussion

404

405 Combining Data and Compositional Biases

406

407 Few phylogenomic studies have explored the results of combining whole genome and

408 transcriptome-based data (Bossert et al. 2019). Higher level phylogenomic studies of insects

409 have used transcriptome sequencing to produce large data matrices of thousands of genes (Misof

18 http://mc.manuscriptcentral.com/systbiol Page 19 of 48 Systematic Biology

Phylogenomics of Psocodea

410 et al. 2014; Peters et al. 2017; Johnson et al. 2018a; Simon et al. 2019; Wipfler et al. 2019).

411 Transcriptome sequencing relies on freshly preserved material, from which RNA can be

412 extracted. Contigs can be annotated for single copy ortholog genes, and methods exist to account

413 for splice variants to resolve these to a single gene sequence (Petersen et al. 2017). New methods

414 have also been developed to individually assemble and annotate single copy genes from shotgun 415 Illumina genome sequencesFor (AllenPeer et al. 2015,Review 2017). Whole Onlygenome data can include non- 416 coding sequences, but poorly aligned regions can be masked when aligned against transcriptome

417 sequences. Thus, it is possible to combine whole genome and transcriptome data to develop large

418 phylogenomic datasets. Here we used a customized bioinformatic pipeline to combine

419 transcriptome and genome data to produce a shared set of nuclear orthologs.

420 Large-scale compositional biases appeared to have some effect on the phylogenetic

421 results for certain taxonomic groups within Psocodea. Most of the instability in our results

422 centered around the placement of Amphientometae, a group of free-living bark lice traditionally

423 placed in the suborder Troctomorpha based upon morphological synapomorphies (Mockford

424 1993; Lienhard and Smithers 2002; Yoshizawa and Lienhard 2010). However, in our

425 phylogenomic analyses, large base compositional biases resulted in alternative placements of

426 Amphientometae under different nucleotide recoding methods. This is most evident when all

427 nucleotide sites are analyzed in which the Amphientometae are placed with 100% bootstrap

428 support with Psocomorpha, resulting in paraphyly of Troctomorpha. Examination of GC content

429 of third codon positions across the alignment (Fig. 5) shows that Amphientometae are similar in

430 GC composition to Psocomorpha. This same pattern of GC biases is also seen in first and second

431 codon positions (Figs. 6 and 7), although when second codon positions alone are analyzed,

432 Amphientometae is recovered as sister to the remainder of the Troctomorpha, similar to the

19 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 20 of 48

de Moya et al.

433 result using degeneracy recoding. However, when third codon positions are removed,

434 Amphientometae is still placed as the sister of Psocomorpha, suggesting modest base

435 composition biases at first codon positions also affect the results.

436 Four cluster likelihood-mapping (quartet mapping) and quartet sampling analyses of the

437 concatenated data demonstrate that quartet-based analyses may also be affected by substantial 438 compositional biases.For Likelihood-mapping Peer Review analyses produce resultsOnly similar to the ML tree 439 topology itself for a given data type. For example, regarding the position of Amphientometae,

440 degeneracy recoded, and second codon positions produce similar scores and in agreement with

441 the phylogenetic placement of this group with Nanopsocetae (Fig. 2). Degeneracy recodes all

442 nucleotides present in the alignment using IUPAC ambiguity codes so all codons that code for a

443 synonymous mutation are nearly identical (Zwick et al. 2012). This recoding method helps

444 account for the large amount of saturation that can take place at most notably third (Fig. 5) but

445 also first (Fig. 6) codon positions, as well as accounting for some variation in base composition.

446 However, given the similar phylogenetic results when degeneracy recoded and second codon

447 positions only are analyzed, it appears that much of the signal supporting monophyletic

448 Troctomorpha is present in second codon positions. Compositional biases and saturation of third

449 and first codon positions can also skew quartet-based analyses. It appears that the relatively high

450 percent GC base composition in both Amphientometae and Psocomorpha (blue and green in

451 Figs. 5, 6), particularly at first and third codon positions, drives the majority of quartets to

452 support a relationship between these two taxa (Fig. 2), similar to the full phylogeny derived from

453 these data types. Although second base positions also show a similar pattern in the ranking of

454 base composition frequencies (blue and green in Fig. 7), this variation spans only a few percent

455 difference and is apparently not enough to influence the resulting tree.

20 http://mc.manuscriptcentral.com/systbiol Page 21 of 48 Systematic Biology

Phylogenomics of Psocodea

456 Conflicting topologies in regard to the phylogenetic position of Amphientometae between

457 recoded data types demonstrates the limitations of using a simplified model of evolution when

458 analyzing millions of base pairs of data. Third positions are known to saturate at high rates due to

459 degeneracy of the genetic code that allows for the emergence of silent mutations. This point is

460 obvious when considering that a stable topology is obtained when degeneracy recoded data is 461 analyzed (Fig. 1). NearlyFor identical Peer results Review are obtained when secondOnly codon positions only are 462 analyzed (Supplemental Fig. 1). Thus, when phylogenomic analyses are limited by computing

463 power, it may be best to consider alternate methods that reduce data set sizes and recoding

464 strategies to reduce rate heterogeneity and compositional bias that may exist at first and third

465 codon positions between distantly related taxa.

466

467 Dating Analysis and the Origin of Parasitism

468

469 After accounting for heterogeneity by using the degeneracy recoded matrix, the dating

470 analysis provides insight into the evolution of parasitic and non-parasitic members within

471 Psocodea. We obtained an estimate for some of the earliest divergences within parasitic lice (i.e.

472 shortly after the K-Pg boundary) similar to that found by a recent phylogenomic study (Johnson

473 et al. 2018b) that used mostly calibration points from cospeciation events rather than from fossil

474 free-living bark lice. Our estimate for the earliest divergence within parasitic lice (~100 Ma) was

475 similar to this previous estimate (between 90 and 100 Ma) (Johnson et al. 2018b). The dating

476 estimates completed in this study suggest generally younger origins for non-parasitic members

477 than have been previously reported (Misof et al. 2014; Johnson et al. 2018b; Yoshizawa et al.

478 2019). The calibrations for analyses produced in this present study are fossil-based minimum age

21 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 22 of 48

de Moya et al.

479 constraints (Mockford et al. 2013) with a maximum age constraint at the root based on previous

480 Bayesian estimates (Johnson et al. 2018a). Differences in calibration methods may account for

481 some of older divergence estimates for non-parasitic members reported in other studies (Misof et

482 al. 2014; Johnson et al. 2018a; Yoshizawa et al. 2019). The dating analysis suggests the origin of

483 parasitism may have occurred a maximum of 115 Ma, predating the K-Pg boundary (66 Ma) 484 (Fig. 1). Our 95% confidenceFor Peer interval for Reviewthe origin of parasitism Only also predates the K-Pg 485 boundary (95-148 Ma). Therefore, it remains a possibility that the ancient host of the first

486 parasitic louse may have been an endothermic dinosaur, although fossil evidence would be

487 needed to confirm this. However, the common ancestor of the clade that eventually evolved

488 parasitism may have been non-parasitic. Therefore, parasitism may have originated anytime

489 between 115 Ma and the initial diversification of parasitic lice (100 Ma).

490

491 Implications for the Taxonomic Classification of Psocodea

492

493 Our phylogenomic analyses, plus existing morphological evidence (Lyal 1985; Mockford

494 1993; Yoshizawa and Johnson 2010) help establish a stable subordinal level classification

495 scheme for the order Psocodea (Table 3). Given that previous analyses have suggested, both

496 based on morphological and molecular evidence, that free living Psocoptera and parasitic

497 Phthiraptera together form a monophyletic lineage (Lyal 1985; Johnson et al. 2004, 2018a), we

498 recognize Psocodea as a single order encompassing both traditional Psocoptera and Phthiraptera.

499 Below the level of order, the goal of this classification scheme is to reflect the higher-level

500 phylogeny, but also retain as many widely used historical names as possible. In particular, given

22 http://mc.manuscriptcentral.com/systbiol Page 23 of 48 Systematic Biology

Phylogenomics of Psocodea

501 the widespread usage of Phthiraptera, we seek to retain this name, which necessitates changes in

502 taxonomic rank for certain groups within Troctomorpha.

503 A monophyletic Phthiraptera (parasitic lice) is derived from within the Troctomorpha and

504 sister to the family Liposcelididae across all analyses. All analyses suggest that the origin of

505 parasitism occurred once within the Troctomorpha. Thus, it is necessary to recognize 506 Phthiraptera at a lowerFor taxonomic Peer rank than Review the suborder Troctomorpha Only from which these 507 parasites are derived. Given their widespread usage and acceptance (Lienhard and Smithers

508 2002), we prefer to retain the three historical suborders of bark lice (Trogiomorpha,

509 Psocomorpha, and Troctomorpha) within Psocodea. We prefer to retain the infraordinal

510 taxonomic ranks within Psocomorpha and Trogiomorpha given that these groups are generally

511 supported by our analyses. Given that parasitic lice are also embedded within the traditional

512 Infraorder Nanopsocetae, this would further reduce the rank of Phthiraptera. As a solution to this

513 issue, we elect to divide Nanopsocetae into three infraorders (Table 3). Under this scheme, we

514 also preserve many of the traditional subordinal names within Phthiraptera, and they are now

515 placed at the rank of Parvorder. This scheme also allows us to retain all other existing subordinal

516 parasitic louse names, including Amblycera, Rhynchophthirina, and Anoplura.

517 One final concern with regards to classification is the status of Ischnocera. Ischnocera

518 (chewing lice) as currently defined (to include both Philopteridae and Trichodectidae) (Price et

519 al. 2003) was paraphyletic across all analyses. This paraphyly has also been detected in previous

520 studies (Johnson et al. 2018b; Song et al. 2019). Therefore, we suggest that Ischnocera be

521 retained to recognize the bulk of diversity (i.e. Philopteridae, ~3,000 species) (Price et al. 2003)

522 and that the Parvorder Trichodectera (Song et al. 2019) be recognized for the less diverse

523 mammal infesting clade Trichodectidae (~400 species) (Price et al. 2003) (Table 3).

23 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 24 of 48

de Moya et al.

524 Supplemental Material

525 Data files and other materials, can found at the Dryad data repository at http://datadryad.org,

526 https://doi.org/10.5061/dryad.c59zw3r50

527

528 Acknowledgments 529 We would like to thankFor Rodrigo Peer L. Ferreira Review for field support andOnly supplying samples. We would 530 like to thank Charles Lienhard for information and discussion. We thank the following

531 individuals for assistance in obtaining samples: R. Wilson, D.H. Clayton, T.D. Galloway, E.

532 Osnas, T. Spradling, L. Mugisha, A. K. Saxena, T. Chesser, E. DiBlasi, F. Daunt, V. Smith, R.

533 Furness, E.L. Mockford, J. Jankowski, J.M. Allen, R. Faucett, B. O’Shea, K.G. McCracken, R.E.

534 Junge, J.D. Weckstein, A. Lawrence, K.C. Bell, M.S. Leonardi, and M. Bowser. Photo credit to

535 J. Gausas/Terrestrial Parasite Tracker Project.

536

537 Funding

538 This work was supported by JSPS Grants no. 15H04409 and 19H03278 to K.Y.; US NSF DEB-

539 0612938, DEB-1342604, XSEDE DEB-16002, DEB-1855812, DEB-1925487 to K.P.J.; and US

540 NSF DEB-1239788 to K.P.J. and C.H.D.

541

24 http://mc.manuscriptcentral.com/systbiol Page 25 of 48 Systematic Biology

Phylogenomics of Psocodea

542 References

543 Alkan C., Sajjadian S., Eichler E.E. 2011. Limitations of next-generation genome sequence 544 assembly. Nat Methods. 8:61–65.

545 Allen J.M., Boyd B., Nguyen N.-P., Vachaspati P., Warnow T., Huang D.I., Grady P.G.S., Bell 546 K.C., Cronk Q.C.B., Mugisha L., Pittendrigh B.R., Leonardi M.S., Reed D.L., Johnson 547 K.P. 2017. Phylogenomics from Whole Genome Sequences Using aTRAM. Syst. Biol. 548 66:786–798. 549 Allen J.M., Huang ForD.I., Cronk Peer Q.C., Johnson Review K.P. 2015. aTRAM Only - automated target restricted 550 assembly method: a fast method for assembling loci across divergent taxa from next- 551 generation sequencing data. BMC Bioinformatics. 16.

552 Bossert S., Murray E.A., Almeida E.A.B., Brady S.G., Blaimer B.B., Danforth B.N. 2019. 553 Combining transcriptomes and ultraconserved elements to illuminate the phylogeny of 554 Apidae. Mol Phylogenet Evol. 130:121–131.

555 Bossert S., Murray E.A., Blaimer B.B., Danforth B.N. 2017. The impact of GC bias on 556 phylogenetic accuracy using targeted enrichment phylogenomic data. Mol. Phylogenet 557 Evol. 111:149–157.

558 Breinholt J.W., Kawahara A.Y. 2013. Phylotranscriptomics: Saturated Third Codon Positions 559 Radically Influence the Estimation of Trees Based on Next-Gen Data. Genome Biol Evol. 560 5:2082–2092.

561 Broadhead E., Richards A.M. 1982. The Psocoptera of East Africa—a taxonomic and ecological 562 survey. Biol J Linn Soc. 17:137–216.

563 Broadhead E., Wapshere A.J. 1966. Mesopsocus Populations on in England--The 564 Distribution and Dynamics of Two Closely-Related Coexisting Species of Psocoptera 565 Sharing the Same Food Resource. Ecol Monogr. 36:327–388.

566 Capella-Gutiérrez S., Silla-Martínez J.M., Gabaldón T. 2009. trimAl: a tool for automated 567 alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 25:1972–1973.

568 Clayton D.H., Bush S.E., Johnson K.P. 2015. Coevolution of Life on Hosts: Integrating Ecology 569 and History. Chicago; London: University of Chicago Press.

570 Cox C.J., Li B., Foster P.G., Embley T.M., Civáň P. 2014. Conflicting Phylogenies for Early 571 Land Plants are Caused by Composition Biases among Synonymous Substitutions. Syst 572 Biol. 63:272–279.

573 Duchêne D.A., Duchêne S., Ho S.Y.W. 2017. New Statistical Criteria Detect Phylogenetic Bias 574 Caused by Compositional Heterogeneity. Mol Biol Evol. 34:1529–1534.

25 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 26 of 48

de Moya et al.

575 Durden L.A. 2019. Chapter 7 - Lice (Phthiraptera). In: Mullen G.R., Durden L.A., editors. 576 Medical and Veterinary Entomology (Third Edition). Academic Press. p. 79–106.

577 Emeljanov A.F., Golub N.V., Kuznetsova V.G. 2001. (Evolutionary Transformation of Testes 578 and Ovaries in Booklice, Birdlice, and Sucking Lice (Psocoptera, Phthiraptera: 579 , Anoplura). 81:20.

580 Foster P.G., Jermiin L.S., Hickey D.A. 1997. Nucleotide composition bias affects amino acid 581 content in proteins coded by mitochondria. J Mol Evol 44:282-288.

582 Galtier N., Gouy M. 1995. Inferring phylogenies from DNA sequences of unequal base 583 compositions.For PNAS. Peer 92:11317–11321. Review Only 584 Gordon A., Hannon G.J. 2010. Fastx-toolkit. .

585 Houseley J., Tollervey D. 2009. The Many Pathways of RNA Degradation. Cell. 136:763–776.

586 Ishikawa S.A., Inagaki Y., Hashimoto T. 2012. RY-Coding and Non-Homogeneous Models Can 587 Ameliorate the Maximum-Likelihood Inferences from Nucleotide Sequence Data with 588 Parallel Compositional Heterogeneity. Evol Bioinform Online. 8:EBO.S9017.

589 Jarvis E.D., Mirarab S., Aberer A.J., Li B., Houde P., Li C., Ho S.Y.W., Faircloth B.C., Nabholz 590 B., Howard J.T., Suh A., Weber C.C., Fonseca R.R. da, Li J., Zhang F., Li H., Zhou L., 591 Narula N., Liu L., Ganapathy G., Boussau B., Bayzid M.S., Zavidovych V., Subramanian 592 S., Gabaldón T., Capella-Gutiérrez S., Huerta-Cepas J., Rekepalli B., Munch K., 593 Schierup M., Lindow B., Warren W.C., Ray D., Green R.E., Bruford M.W., Zhan X., 594 Dixon A., Li S., Li N., Huang Y., Derryberry E.P., Bertelsen M.F., Sheldon F.H., 595 Brumfield R.T., Mello C.V., Lovell P.V., Wirthlin M., Schneider M.P.C., Prosdocimi F., 596 Samaniego J.A., Velazquez A.M.V., Alfaro-Núñez A., Campos P.F., Petersen B., 597 Sicheritz-Ponten T., Pas A., Bailey T., Scofield P., Bunce M., Lambert D.M., Zhou Q., 598 Perelman P., Driskell A.C., Shapiro B., Xiong Z., Zeng Y., Liu S., Li Z., Liu B., Wu K., 599 Xiao J., Yinqi X., Zheng Q., Zhang Y., Yang H., Wang J., Smeds L., Rheindt F.E., Braun 600 M., Fjeldsa J., Orlando L., Barker F.K., Jønsson K.A., Johnson W., Koepfli K.-P., 601 O’Brien S., Haussler D., Ryder O.A., Rahbek C., Willerslev E., Graves G.R., Glenn T.C., 602 McCormack J., Burt D., Ellegren H., Alström P., Edwards S.V., Stamatakis A., Mindell 603 D.P., Cracraft J., Braun E.L., Warnow T., Jun W., Gilbert M.T.P., Zhang G. 2014. 604 Whole-genome analyses resolve early branches in the tree of life of modern birds. 605 Science. 346:1320–1331.

606 Jermiin L.S., Ho S.Y.W., Ababneh F., Robinson J., Larkum A.W.D. 2004. The Biasing Effect of 607 Compositional Heterogeneity on Phylogenetic Estimates May be Underestimated. Syst 608 Biol. 53:638–643.

609 Johnson K.P. 2019. Putting the genome in insect phylogenomics. Curr Opin Insect Science. 610 36:111–117.

26 http://mc.manuscriptcentral.com/systbiol Page 27 of 48 Systematic Biology

Phylogenomics of Psocodea

611 Johnson K.P., Cruickshank R.H., Adams R.J., Smith V.S., Page R.D.M., Clayton D.H. 2003. 612 Dramatically elevated rate of mitochondrial substitution in lice (Insecta: Phthiraptera). 613 Mol Phylogenet Evol. 26:231–242.

614 Johnson K.P., Dietrich C.H., Friedrich F., Beutel R.G., Wipfler B., Peters R.S., Allen J.M., 615 Petersen M., Donath A., Walden K.K.O., Kozlov A.M., Podsiadlowski L., Mayer C., 616 Meusemann K., Vasilikopoulos A., Waterhouse R.M., Cameron S.L., Weirauch C., 617 Swanson D.R., Percy D.M., Hardy N.B., Terry I., Liu S., Zhou X., Misof B., Robertson 618 H.M., Yoshizawa K. 2018a. Phylogenomics and the evolution of hemipteroid insects. 619 PNAS. 115:12775–12780. 620 Johnson K.P., NguyenFor N., Sweet Peer A.D., Boyd Review B.M., Warnow T.,Only Allen J.M. 2018b. Simultaneous 621 radiation of bird and mammal lice following the K-Pg boundary. Biol Lett. 14:20180141.

622 Johnson K.P., Yoshizawa K., Smith V.S. 2004. Multiple origins of parasitism in lice. P R Soc B- 623 Biol Sci. 271:1771–1776.

624 Kearse M., Moir R., Wilson A., Stones-Havas S., Cheung M., Sturrock S., Buxton S., Cooper A., 625 Markowitz S., Duran C., Thierer T., Ashton B., Meintjes P., Drummond A. 2012. 626 Geneious Basic: An integrated and extendable desktop software platform for the 627 organization and analysis of sequence data. Bioinformatics. 28:1647–1649.

628 Kozlov A.M., Aberer A.J., Stamatakis A. 2015. ExaML version 3: a tool for phylogenomic 629 analyses on supercomputers. Bioinformatics. 31:2577–2579.

630 Lanfear R., Frandsen P.B., Wright A.M., Senfeld T., Calcott B. 2017. PartitionFinder 2: New 631 Methods for Selecting Partitioned Models of Evolution for Molecular and Morphological 632 Phylogenetic Analyses. Mol Biol Evol. 34:772–773.

633 Laumer C.E., Gruber-Vodicka H., Hadfield M.G., Pearse V.B., Riesgo A., Marioni J.C., Giribet 634 G. 2018. Support for a clade of Placozoa and Cnidaria in genes with minimal 635 compositional bias. eLife. 7:e36278.

636 Lienhard C., Smithers C.N. 2002. Psocoptera (Insecta): world catalogue and bibliography. 637 Psocoptera (Insecta): world catalogue and bibliography.

638 Lyal C.H.C. 1985. Phylogeny and classification of the Psocodea, with particular reference to the 639 lice (Psocodea: Phthiraptera). Syst Entomol. 10:145–165.

640 Mirarab S., Nguyen N., Guo S., Wang L.-S., Kim J., Warnow T. 2014a. PASTA: Ultra-Large 641 Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences. J Comput 642 Biol. 22:377–386.

643 Mirarab S., Reaz R., Bayzid M.S., Zimmermann T., Swenson M.S., Warnow T. 2014b. 644 ASTRAL: genome-scale coalescent-based species tree estimation. Bioinformatics. 645 30:i541–i548.

27 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 28 of 48

de Moya et al.

646 Mirarab S., Warnow T. 2015. ASTRAL-II: coalescent-based species tree estimation with many 647 hundreds of taxa and thousands of genes. Bioinformatics. 31:i44–i52.

648 Misof B., Liu S., Meusemann K., Peters R.S., Donath A., Mayer C., Frandsen P.B., Ware J., 649 Flouri T., Beutel R.G., Niehuis O., Petersen M., Izquierdo-Carrasco F., Wappler T., Rust 650 J., Aberer A.J., Aspöck U., Aspöck H., Bartel D., Blanke A., Berger S., Böhm A., 651 Buckley T.R., Calcott B., Chen J., Friedrich F., Fukui M., Fujita M., Greve C., Grobe P., 652 Gu S., Huang Y., Jermiin L.S., Kawahara A.Y., Krogmann L., Kubiak M., Lanfear R., 653 Letsch H., Li Y., Li Z., Li J., Lu H., Machida R., Mashimo Y., Kapli P., McKenna D.D., 654 Meng G., Nakagaki Y., Navarrete-Heredia J.L., Ott M., Ou Y., Pass G., Podsiadlowski 655 L., Pohl H., von Reumont B.M., Schütte K., Sekiya K., Shimizu S., Slipinski A., 656 Stamatakis ForA., Song PeerW., Su X., Szucsich Review N.U., Tan M., Only Tan X., Tang M., Tang J., 657 Timelthaler G., Tomizuka S., Trautwein M., Tong X., Uchifune T., Walzl M.G., 658 Wiegmann B.M., Wilbrandt J., Wipfler B., Wong T.K.F., Wu Q., Wu G., Xie Y., Yang 659 S., Yang Q., Yeates D.K., Yoshizawa K., Zhang Q., Zhang R., Zhang W., Zhang Y., 660 Zhao J., Zhou C., Zhou L., Ziesmann T., Zou S., Li Y., Xu X., Zhang Y., Yang H., Wang 661 J., Wang J., Kjer K.M., Zhou X. 2014. Phylogenomics resolves the timing and pattern of 662 insect evolution. Science. 346:763.

663 Mockford E.L. 1993. North American Psocoptera (Insecta). North American Psocoptera 664 (Insecta).

665 Mockford E.L., Lienhard C., Yoshizawa K., Mockford E.L., Lienhard C., Yoshizawa K. 2013. 666 Revised classification of “Psocoptera” from Cretaceous amber, a reassessment of 667 published information. Insecta matsumurana. New series: journal of the Faculty of 668 Agriculture Hokkaido University, series entomology. 69:1–26.

669 de Moya R.S., Allen J.M., Sweet A.D., Walden K.K.O., Palma R.L., Smith V.S., Cameron S.L., 670 Valim M.P., Galloway T.D., Weckstein J.D., Johnson K.P. 2019. Extensive host- 671 switching of avian feather lice following the Cretaceous-Paleogene mass extinction 672 event. Comm Biol. 2:1–6.

673 de Moya R.S. 2020. Psocodea Phylogenomic dataset, v2, Dryad, Dataset, 674 https://doi.org/10.5061/dryad.c59zw3r50 675 676 New T.R. 1970. The Relative Abundance of Some British Psocoptera on Different Species of 677 Trees. J Anim Ecol. 39:521–540.

678 New T.R. 1987. Biology of the Psocoptera. Orient Insects. 21:1–109.

679 Nguyen L.-T., Schmidt H.A., von Haeseler A., Minh B.Q. 2015. IQ-TREE: A Fast and Effective 680 Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Mol Biol Evol. 681 32:268–274.

682 Pattengale N.D., Alipour M., Bininda-Emonds O.R.P., Moret B.M.E., Stamatakis A. 2009. How 683 Many Bootstrap Replicates Are Necessary? Lect N Bioinformat.:184–200.

28 http://mc.manuscriptcentral.com/systbiol Page 29 of 48 Systematic Biology

Phylogenomics of Psocodea

684 Pease J.B., Brown J.W., Walker J.F., Hinchliff C.E., Smith S.A. 2018. Quartet Sampling 685 distinguishes lack of support from conflicting support in the green plant tree of life. Am J 686 Bot. 105:385–403.

687 Peters R.S., Krogmann L., Mayer C., Donath A., Gunkel S., Meusemann K., Kozlov A., 688 Podsiadlowski L., Petersen M., Lanfear R., Diez P.A., Heraty J., Kjer K.M., Klopfstein 689 S., Meier R., Polidori C., Schmitt T., Liu S., Zhou X., Wappler T., Rust J., Misof B., 690 Niehuis O. 2017. Evolutionary History of the Hymenoptera. Curr Biol. 27:1013–1018.

691 Petersen M., Meusemann K., Donath A., Dowling D., Liu S., Peters R.S., Podsiadlowski L., 692 Vasilikopoulos A., Zhou X., Misof B., Niehuis O. 2017. Orthograph: a versatile tool for 693 mapping codingFor nucleotide Peer sequences Review to clusters of orthologous Only genes. BMC 694 Bioinformatics. 18.

695 Philippe H., Brinkmann H., Lavrov D.V., Littlewood D.T.J., Manuel M., Wörheide G., Baurain 696 D. 2011. Resolving Difficult Phylogenetic Questions: Why More Sequences Are Not 697 Enough. PLOS Biology. 9:e1000602.

698 Posada D., Crandall K.A. 1998. MODELTEST: testing the model of DNA substitution. 699 Bioinformatics. 14:817–818.

700 Price R.D., Hellenthal R.A., Palma R.L., Johnson K.P., Clayton D.H. 2003. Chewing Lice: 701 World Checklist and Biological Overview. Champaign, Ill.: Illinois Natural History 702 Survey.

703 Prum R.O., Berv J.S., Dornburg A., Field D.J., Townsend J.P., Lemmon E.M., Lemmon A.R. 704 2015. A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA 705 sequencing. Nature. 526:569–573.

706 Rambaut A., Drummond A.J., Xie D., Baele G., Suchard M.A. 2018. Posterior Summarization in 707 Bayesian Phylogenetics Using Tracer 1.7. Syst Biol. 67:901–904.

708 Regier J.C., Shultz J.W., Zwick A., Hussey A., Ball B., Wetzer R., Martin J.W., Cunningham 709 C.W. 2010. relationships revealed by phylogenomic analysis of nuclear 710 protein-coding sequences. Nature. 463:1079–1083.

711 Romiguier J., Cameron S.A., Woodard S.H., Fischman B.J., Keller L., Praz C.J. 2016. 712 Phylogenomics Controlling for Base Compositional Bias Reveals a Single Origin of 713 Eusociality in Corbiculate Bees. Mol Biol Evol. 33:670–678.

714 Roure B., Philippe H. 2011. Site-specific time heterogeneity of the substitution process and its 715 impact on phylogenetic inference. BMC Evol Biol. 11:17.

716 RS Team. 2015. RStudio: integrated development for R. .

717 Sayyari E., Mirarab S. 2016. Fast Coalescent-Based Computation of Local Branch Support from 718 Quartet Frequencies. Mol Biol Evol. 33:1654–1668.

29 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 30 of 48

de Moya et al.

719 Scholtz C.H. 2016. The higher classification of southern African insects. Afr Entomol. 24:545– 720 555.

721 Sheffield N.C., Song H., Cameron S.L., Whiting M.F. 2009. Nonstationary Evolution and 722 Compositional Heterogeneity in Beetle Mitochondrial Phylogenomics. Syst Biol. 58:381– 723 394.

724 Simion P., Philippe H., Baurain D., Jager M., Richter D.J., Di Franco A., Roure B., Satoh N., 725 Quéinnec É., Ereskovsky A., Lapébie P., Corre E., Delsuc F., King N., Wörheide G., 726 Manuel M. 2017. A Large and Consistent Phylogenomic Dataset Supports Sponges as the 727 Sister GroupFor to All OtherPeer . Review Curr Biol. 27:958–967. Only 728 Simmons M.P. 2017. Relative benefits of amino-acid, codon, degeneracy, DNA, and purine- 729 pyrimidine character coding for phylogenetic analyses of exons - Simmons - 2017 - 730 Journal of Systematics and Evolution - Wiley Online Library. Available from 731 http://onlinelibrary.wiley.com/wol1/doi/10.1111/jse.12233/full.

732 Simon S., Blanke A., Meusemann K. 2018. Reanalyzing the Palaeoptera problem - The origin of 733 insect flight remains obscure. Arthropod Struct Dev. 47:328–338.

734 Simon S., Letsch H., Bank S., Buckley T., Donath A., Liu S., Machida R., Meusemann K., Misof 735 B., Podsiadlowski L., Zhuo X., Wipfler B., Bradler S. 2019. Old World and New World 736 Phasmatodea: Phylogenomics Resolve the Evolutionary History of Stick and Leaf 737 Insects. Front Ecol Evol. 7.

738 Simpson J.T., Wong K., Jackman S.D., Schein J.E., Jones S.J.M., Birol İ. 2009. ABySS: A 739 parallel assembler for short read sequence data. Genome Res. 19:1117–1123.

740 Skinner R.K., Dietrich C.H., Walden K.K.O., Gordon E., Sweet A.D., Podsiadlowski L., 741 Petersen M., Simon C., Takiya D.M., Johnson K.P. 2020. Phylogenomics of 742 Auchenorrhyncha (Insecta: ) using transcriptomes: examining controversial 743 relationships via degeneracy coding and interrogation of gene conflict. Syst Entomol. 744 45:85–113.

745 Slater G.S.C., Birney E. 2005. Automated generation of heuristics for biological sequence 746 comparison. BMC Bioinformatics. 6:31.

747 Song F., Li H., Liu G.-H., Wang W., James P., Colwell D.D., Tran A., Gong S., Cai W., Shao R. 748 2019. Mitochondrial Genome Fragmentation Unites the Parasitic Lice of Eutherian 749 Mammals. Syst Biol. 68:430–440.

750 Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of 751 large phylogenies. Bioinformatics. 30:1312–1313.

752 Strimmer K., Haeseler A. von. 1997. Likelihood-mapping: A simple method to visualize 753 phylogenetic content of a sequence alignment. PNAS. 94:6815–6819.

30 http://mc.manuscriptcentral.com/systbiol Page 31 of 48 Systematic Biology

Phylogenomics of Psocodea

754 Sukumaran J., Holder M.T. 2015. SumTrees: Phylogenetic Tree Summarization. 4.0.0. Available 755 from https://github.com/jeetsukumaran/DendroPy.

756 Vaidya G., Lohman D.J., Meier R. 2011. SequenceMatrix: concatenation software for the fast 757 assembly of multi-gene datasets with character set and codon information. Cladistics. 758 27:171–180.

759 Vasilikopoulos A., Balke M., Beutel R.G., Donath A., Podsiadlowski L., Pflug J.M., Waterhouse 760 R.M., Meusemann K., Peters R.S., Escalona H.E., Mayer C., Liu S., Hendrich L., Alarie 761 Y., Bilton D.T., Jia F., Zhou X., Maddison D.R., Niehuis O., Misof B. 2019. 762 Phylogenomics of the superfamily Dytiscoidea (Coleoptera: Adephaga) with an 763 evaluation ofFor phylogenetic Peer conflict Review and systematic error. Only Mol Phylogenet Evol. 135:270– 764 285.

765 Wang R., Yao Y., Ren D., Shih C. 2019. Psocoptera – Barklice and Booklice. Rhythms of Insect 766 Evolution. John Wiley & Sons, Ltd. p. 185–188.

767 Wappler T., Smith V.S., Dalgleish R.C. 2004. Scratching an ancient : an Eocene 768 fossil. P R Soc B-Biol Sci. 271:S255–S258.

769 Wipfler B., Letsch H., Frandsen P.B., Kapli P., Mayer C., Bartel D., Buckley T.R., Donath A., 770 Edgerly-Rooks J.S., Fujita M., Liu S., Machida R., Mashimo Y., Misof B., Niehuis O., 771 Peters R.S., Petersen M., Podsiadlowski L., Schütte K., Shimizu S., Uchifune T., 772 Wilbrandt J., Yan E., Zhou X., Simon S. 2019. Evolutionary history of Polyneoptera and 773 its implications for our understanding of early winged insects. Proc Natl Acad Sci USA. 774 116:3024–3029.

775 Yang Z. 2007. PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol Biol Evol. 776 24:1586–1591.

777 Yoshizawa K. 2002. Phylogeny and higher classification of suborder Psocomorpha (Insecta: 778 Psocodea: ‘Psocoptera’). Zool J Linn Soc. 136:371–400.

779 Yoshizawa K., Johnson K.P. 2010. How stable is the “Polyphyly of Lice” hypothesis (Insecta: 780 Psocodea)?: A comparison of phylogenetic signal in multiple genes. Mol Phylogenet 781 Evol. 55:939–951.

782 Yoshizawa K., Johnson K.P. 2013. Changes in base composition bias of nuclear and 783 mitochondrial genes in lice (Insecta: Psocodea). Genetica. 141:491–499.

784 Yoshizawa K., Johnson K.P. 2014. Phylogeny of the suborder Psocomorpha: congruence and 785 incongruence between morphology and molecular data (Insecta: Psocodea: ‘Psocoptera’). 786 Zool J Linn Soc. 171:716–731.

787 Yoshizawa K., Lienhard C. 2010. In search of the sister group of the true lice: A systematic 788 review of booklice and their relatives, with an updated checklist of Liposcelididae 789 (Insecta: Psocodea). Arthropod Syst Phylo. 68:181–195.

31 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 32 of 48

de Moya et al.

790 Yoshizawa K., Lienhard C., Johnson K.P. 2006. Molecular systematics of the suborder 791 Trogiomorpha (Insecta: Psocodea: ‘Psocoptera’). Zool J Linn Soc. 146:287–299.

792 Yoshizawa K., Lienhard C., Yao I., Ferreira R.L. 2019. Cave insects with sex-reversed genitalia 793 had their most recent common ancestor in West Gondwana (Psocodea: : 794 Speleketorinae). Entomol Sci. 22:334–338.

795 Zwick A., Regier J.C., Zwickl D.J. 2012. Resolving Discrepancy between Nucleotides and 796 Amino Acids in Deep-Level Arthropod Phylogenomics: Differentiating Serine Codons in 797 21-Amino-Acid Models. PLOS ONE. 7:e47450. 798 For Peer Review Only 799

32 http://mc.manuscriptcentral.com/systbiol Page 33 of 48 Systematic Biology

Phylogenomics of Psocodea

800 Figure Captions

801

802 Fig. 1: The result of phylogenomic analyses using degeneracy recoded nucleotide data. Clade

803 support is depicted as bootstrap support. The timescale provides an estimate of divergences

804 suggested by MCMCtree dating analyses using correlated rates. Taxonomic names marked with 805 an asterisk representFor samples Peer that are derived Review from transcriptomes. Only Names which lack the 806 asterisk represent samples derived from shotgun whole genome sequencing. The H. following

807 names indicate the taxon is classified within the Homilopsocidea.

808

809 Fig. 2: The result of likelihood-mapping derived from IQtree. Results show the percentage of

810 quartets derived from analyses supporting relationships among the Trogiomorpha, Psocomorpha,

811 Nanopsocetae, and Amphientometae.

812

813 Fig. 3: A cladogram of the result of quartet sampling based on the analysis of all nucleotide sites.

814 Clade support is depicted as: Quartet Concordance (QC)/Quartet Differential (QD)/Quartet

815 Informativeness (QI). Taxonomic names marked with an asterisk represent samples that are

816 derived from transcriptomes. Names which lack the asterisk represent samples derived from

817 shotgun whole genome sequencing. The H. following names indicate the taxon is classified

818 within the Homilopsocidea.

819

820 Fig. 4: A cladogram of the result of quartet sampling based on the analysis of recoded

821 nucleotides using degeneracy methods. Clade support is depicted as: Quartet Concordance

822 (QC)/Quartet Differential (QD)/Quartet Informativeness (QI). Taxonomic names marked with an

33 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 34 of 48

de Moya et al.

823 asterisk represent samples that are derived from transcriptomes. Names which lack the asterisk

824 represent samples derived from shotgun whole genome sequencing. The H. following names

825 indicate the taxon is classified within the Homilopsocidea.

826

827 Fig. 5: Box and whisker plot showing the distribution of GC content in third codon positions 828 from the alignmentsFor analyzed. Peer Review Only 829

830 Fig. 6: Box and whisker plot showing the distribution of GC content in first codon positions from

831 the alignments analyzed.

832

833 Fig. 7: Box and whisker plot showing the distribution of GC content in second codon positions

834 from the alignments analyzed.

835

34 http://mc.manuscriptcentral.com/systbiol Page 35 of 48 Systematic Biology

Phylogenomics of Psocodea

836 Table Captions

837 Table 1: A summary of all species of Psocodea sampled and data analyzed. Transcriptome

838 sequences are available from a previous study (Johnson et al. 2018a).

839

840 Table 2: A summary of the phylogenetic analyses completed, and respective data type analyzed.

841 For Peer Review Only

842 Table 3: A comparison between historical ordinal taxonomic schemes for Psocoptera and

843 Phthiraptera to the newly proposed classification scheme for a single order Psocodea.

844

845 Supplemental Figure Captions

846

847 Supplemental Fig. 1: Maximum likelihood results of the concatenated maximum likelihood

848 analysis of second codon positions only. Taxa marked with stars represent transcriptome-based

849 samples.

850

851 Supplemental Fig. 2: The ASTRAL coalescent result of 2,370 gene alignments of all nucleotide

852 sites. Clade support is based upon local posterior probabilities. Taxa marked with stars represent

853 transcriptome-based samples.

854

855 Supplemental Fig. 3: The ASTRAL coalescent result of 2,370 degeneracy recoded gene

856 alignments. Clade support is based upon local posterior probabilities. Taxa marked with stars

857 represent transcriptome-based samples.

858

35 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 36 of 48 100 Osculotes curta 100 Pessoaiella absita 100 Megaginus tataupensis 100 Strongylocotes lipogonus 100 Campanulotes compar 100 Campanulotes compar * 100 Goniodes ortygis 36 100 Saemundssonia lari Quadraceps punctatus 100 Penenirmus auritus 100 Alcedoecus sp. 100 100 Craspedorrhynchus sp. * 100 Brueelia antiqua Degeeriella rufa Philopteridae 100 Chelopistes texanus 100 Oxylipeurus chiniri 98 Anatoecus icterodes 90 Fulicoffula longipila 99 Halipeurus diversus 100 Docophoroides brevis 100 100 Ibidoecus bisignatus Pectinopygus varius 90 Falcolipeurus marginalis 99 40 Bothriometopus macrocnemis Trichophilopterus babakotophilus Craspedonirmus immer 100 100 Columbicola columbae Troctomorpha Columbicola columbae * 100 100 Proechinopthirus fluctus 100 Antarctophthirus microchir Echinophthirius horridus * 100 100 eurysternus 100 Hoplopleura arboricola spicatus 100 Neohaematopinus pacificus Anoplura 100 100 100 Pediculus humanus Ref. 100 Pediculus schaeffi 100 100 100 Pthirus gorillae Pthirus pubis Pedicinus badius 100 100 100 elephantis Rhynchophthirina Haematomyzus elephantis * 100 Geomydoecus ewingi * 100 Geomydoecus aurei Trichodectidae Stachiella larseni 98 Laemobothrion tinnunculi 52 Heterodoxus spiniger 100 100 For Peer Review OnlyRicinus sp. 100 Macrogyropus costalimai Amblycera 100 Cummingsia maculata 100 Myrsidea sp. Osborniella crotophagae 100 Menopon gallinae * 100 100 Embidopsocus sp. 2 100 Embidopsocus sp. 2 * 100 Embidopsocus sp. 1 * Liposcelididae 100 Liposcelis brunnea 100 Liposcelis pearmani Liposcelis bostrychophila * 100 Tapinella sp. 100 100 Pachytroctes maculosus Pachytroctidae 100 Peritroctes sp. 100 Badonnelia titei Badonnelia titei * Sphaeropsocidae 100 Musapsocus sp. 100 Compsocus elegans 100 100 Stimulopalpus japonicus Amphientometae Stimulopalpus japonicus * Epitroctes sp. 54 Kilauella sp. 100 Propsocus pulchripennis 100 Idatenopsocus orientalis 54 amabilis 100 Nepiomorpha sp. Homilopsocidea 77 Mesopsocus unipunctatus * 100 Elipsocus kuriliensis * 100 100 97 Lachesilla contraforcepeta * Lachesilla abiesicola * briggsi * 100 100 Matsumuraiella radiopicta * 100 cruciatus * 100 Amphipsocus japonicus * 100 Valenzuela badiostigma * Caeciliusetae Xanthocaecilius sommermanae Psocomorpha 97 100 Asiopsocus sonorensis phaeopterus * H. 100 100 Longivalvus nubilus * 100 100 Ptycta johnsoni * Neoblaste papillosus * Psocetae Hemipsocus chroloticus * 100 Psilopsocus sp. 100 100 100 Cladiopsocus ocotensis 100 Neurostigma sp. 100 Loneura mombachensis Epipsocetae Bertkauia sp. * 100 100 Bryopsocus townsendi 100 clarus 100 100 Calopsocus reticulatus Philotarsetae Heterocaecilius solocipennis * Aaroniella sp. * nomas Archipsocetae 100 100 Cerobasis guestfalica * 100 * Atropetae Echmepteryx hageni * 100 Rhyopsocus sp. 6 100 Dorypteryx domestica Psyllipsocetae Psyllipsocus ramburii * 100 Trogiomorpha 100 Neotrogla sp. Neotrogla aurora * Prionoglaridetae Speleketor irwini Prionoglaris stygia

175 1 50 125 100 75 50 25 0 MYA

Mesozoic Cenozoic

Jursassic Cretaceous Paleogene Neogene http://mc.manuscriptcentral.com/systbiol Quartenary Page 37 of 48 (Nanopsocetae, Amphientometae)-(Psocomorpha,Systematic Biology Trogiomorpha)

(Nanopsocetae, Trogiomorpha)-(Amphientometae,For Psocomorpha)Peer Review(Nanopsocetae, Only Psocomorpha)-(Amphientometae, Trogiomorpha)

23.7% 32.4%

0.0% 0.0%

0.0% 0.0% 0.0% 0.0%

65.2% 0.0% 11.0% 50.2% 0.0% 17.4%

All Nucleotide Sites 3rd Codon Positions Removed

43.8% 42.5%

0.0% 0.0%

0.1% 0.1% 0.0% 0.0%

30.1% 0.0% 26.0% 30.8% 0.1% 26.6%

http://mc.manuscriptcentral.com/systbiol Degen Recoded Nucleotides 2nd Codon Positions Only Systematic Biology 1/NA/1 Osculotes curta Page 38 of 48 .58/.72/1 Pessoaiella absita .05/0.65/1 Strongylocotes lipogonus .74/.29/1 Megaginus tataupensis Campanulotes compar * .56/0/1 Campanulotes compar .19/.82/1 .38/0/1 Goniodes ortygis -.25/.73/1 Quadraceps punctatus . 64/1/1 .92/0/1 Saemundssonia lari Craspedorrhynchus sp. * .24/.69/.99 .30/.13/1 Alcedoecus sp. .80/.40/1 Penenirmus auritus .19/.91/1 Brueelia antiqua Degeeriella rufa Philopteridae .76/0/1 Chelopistes texanus -.12/.38/1 Oxylipeurus chiniri -.08/.55/1 Anatoecus icterodes .13/.94/.99 Fulicoffula longipila .32/0/1 .33/.52/1 Docophoroides brevis -.18/.30/.99 Halipeurus diversus .13/0/1 Ibidoecus bisignatus Pectinopygus varius .09/.87/1 -.41/.07/.99 Falcolipeurus marginalis .49/0/1 Bothriometopus macrocnemis Craspedonirmus immer 1/NA/1 .06/.92/1 Columbicola columbae Columbicola columbae * Trichophilopterus babakotophilus 1/NA/1 1/NA/1 Proechinopthirus fluctus .55/.96/1 Antarctophthirus microchir 1/NA/1 Echinophthirius horridus * .05/.69/1 Haematopinus eurysternus .56/.53/.99 Hoplopleura arboricola .62/.15/1 Linognathus spicatus Anoplura .42/.26/1 Neohaematopinus pacificus 1/NA/1 Pediculus humanus Ref. 1/NA/1 Pediculus humanus 1/NA/1 Pediculus schaeffi 1/NA/1 .38/.91/1 Pthirus gorillae . 95/0/1 Pthirus pubis .13/.56/1 For Peer Review Only Pedicinus badius 1/NA/1 Geomydoecus aurei 1/NA/1 Geomydoecus ewingi * Trichodectidae .01/.90/.99 Stachiella larseni 1/NA/1 Haematomyzus elephantis * Rhynchophthirina Haematomyzus elephantis .83/0/1 .27/.23/1 Macrogyropus costalimai .02/0/1 Cummingsia maculata .05/0.67/1 .23/.60/1 Laemobothrion tinnunculi sp. .34/.04/1 Amblycera 1/NA/1 Osborniella crotophagae Menopon gallinae * .36/.43/1 Myrsidea sp. Heterodoxus spiniger 1/NA/1 -.15/.31/1 Liposcelis pearmani .71/0/1 Liposcelis brunnea .82/0.75/1 Liposcelis bostrychophila * 1/NA/1 Embidopsocus sp. 2 * Liposcelididae . 47/0/1 Embidopsocus sp. 2 .83/0/1 Embidopsocus sp. 1 * .32/.09/1 Pachytroctes maculosus Pachytroctidae 1/NA/1 Peritroctes sp. Tapinella sp. 1/NA/1 Badonnelia titei * Sphaeropsocidae Badonnelia titei -.82/.04/1 Kilauella sp. .25/.21/1 Anomopsocus amabilis -.09/.52/1 Propsocus pulchripennis .37/.06/1 .88/.40/1 Idatenopsocus orientalis Nepiomorpha sp. Homilopsocidea -.01/.95/1 .45/.92/1 Elipsocus kuriliensis * Mesopsocus unipunctatus * 1/NA/1 Lachesilla abiesicola * Lachesilla contraforcepeta * 1/NA/1 .78/.17/1 -.03/0/.99 Matsumuraiella radiopicta * .69/.47/1 * .58/.54/1 1/NA/1 Amphipsocus japonicus * Valenzuela badiostigma * Caeciliusetae .07/.60/1 -.24/.40/1 Asiopsocus sonorensis Xanthocaecilius sommermanae -.26/0.13/1 Ectopsocus briggsi * H. Peripsocus phaeopterus * H. .88/.80/1 .92/0/1 .95/0/1 Ptycta johnsoni * .15/.43/1 Longivalvus nubilus * .62/.07/1 Neoblaste papillosus * Psocetae Hemipsocus chroloticus * -.02/.78/1 Psilopsocus sp. -.01/.85/1 .64/.17/1 Cladiopsocus ocotensis . 89/0/1 .79/.80/1 Neurostigma sp. Epipsocetae Loneura mombachensis Bertkauia sp. * .60/0/1 .03/.98/1 .37/.17/.99 Calopsocus reticulatus Heterocaecilius solocipennis * .32/.51/1 Philotarsetae .23/.66/1 .12/.40/1 Bryopsocus townsendi Aaroniella sp. * Archipsocus nomas Archipsocetae .81/0/1 Compsocus elegans -.23/0/1 Musapsocus sp. .58/.88/1 1/NA/1 Stimulopalpus japonicus * Amphientometae Stimulopalpus japonicus Epitroctes sp. .27/0/1 .47/.91/1 Cerobasis guestfalica * .03/.98/1 Lepinotus patruelis * Atropetae .15/.73/1 Echmepteryx hageni * Rhyopsocus sp. 1/NA/1 -.04/.64/.99 Psyllipsocus ramburii * Psyllipsocetae Dorypteryx domestica 1/NA/1 .62/.64/1 Neotrogla aurora * http://mc.manuscriptcentral.com/systbiol.88/.80/1 Neotrogla sp. Speleketor irwini Prionoglaridetae Prionoglaris stygia 1/NA/1 Page 39 of 48 Systematic Biology.54/.47/1 Osculotes curta .34/.08/1 Pessoaiella absita -.02/.76/.99 Megaginus tataupensis -.03/0/1 Strongylocotes lipogonus Campanulotes compar Campanulotes compar * .32/.59/1 .28/0/1 Goniodes ortygis 1/NA/1 Saemundssonia lari .31/.72/.99 .64/.17/1 Quadraceps punctatus Penenirmus auritus .33/.92/1 .79/.60/.99 Alcedoecus sp. -.08/.49/.99 Craspedorrhynchus sp. * .15/.76/1 Brueelia antiqua Degeeriella rufa .95/0/1 Chelopistes texanus Philopteridae .24/.24/.99 Oxylipeurus chiniri -.03/.85/.99 Anatoecus icterodes .26/.66/1 Fulicoffula longipila .57/.43/1 Halipeurus diversus .05/.75/.98 .39/.03/1 Docophoroides brevis Ibidoecus bisignatus .29/0/1 Pectinopygus varius -.23/.33/1 Falcolipeurus marginalis .02/.86/.99 -.66/.02/1 .39/.06/1 Bothriometopus macrocnemis Trichophilopterus babakotophilus Craspedonirmus immer 1/NA/1 Columbicola columbae 1/NA/1 Columbicola columbae * 1/NA/1 Proechinopthirus fluctus .22/.61/.99 Antarctophthirus microchir Echinophthirius horridus * .11/.66/.99 1/NA/1 .21/0/1 Haematopinus eurysternus .82/.22/1 Hoplopleura arboricola Linognathus spicatus Anoplura .33/.96/1 .17/.48/1 Neohaematopinus pacificus 1/NA/1 Pediculus humanus 1/NA/1 Pediculu humanus Ref. .06/.87/.99 Pediculus schaeffi 1/NA/1 Pthirus gorillae .76/0/.99 Pthirus pubis .52/.87/.99 .35/.92/1 Pedicinus badius For Peer Review Only1/NA/1 Haematomyzus elephantis * Haematomyzus elephantis Rhynchophthirina .12/0/1 Geomydoecus ewingi * 1/NA/1 Geomydoecus aurei Trichodectidae Stachiella larseni -.39/.05/1 -.01/.99/1 Laemobothrion tinnunculi .20/.65/1 Heterodoxus spiniger -.03/.70/1 Ricinus sp. 1/NA/1 . 95/0/1 Macrogyropus costalimai Amblycera Cummingsia maculata .17/.55/1 Myrsidea sp. .08/0/1 Osborniella crotophagae .69/0/.99 Menopon gallinae * .60/.58/1 Embidopsocus sp. 2 .49/.39/.99 Embidopsocus sp. 2 * .33/.53/1 Embidopsocus sp. 1 * 1/NA/1 Liposcelis brunnea Liposcelididae Liposcelis pearmani .65/.38/1 Liposcelis bostrychophila * .19/0/1 Tapinella sp. 1/NA/1 Pachytroctidae .01/.98/1 .18/.73/.99 Pachytroctes maculosus Peritroctes sp. 1/NA/1 Badonnelia titei Badonnelia titei * Sphaeropsocidae .35/0/1 Musapsocus sp. 0/0/1 1/NA/1 Compsocus elegans .49/.85/1 Stimulopalpus japonicus Amphientometae Stimulopalpus japonicus * Epitroctes sp. .21/.42/.98 Kilauella sp. .48/0/.99 Propsocus pulchripennis .22/.25/1 Idatenopsocus orientalis . 01/0/1 -.81/.07/.99 Anomopsocus amabilis Homilopsocidea .27/.62/1 Nepiomorpha sp. -.03/.71/1 Mesopsocus unipunctatus * 1/NA/1 Elipsocus kuriliensis * .67/.42/1 Lachesilla contraforcepeta * .09/.67/.99 Lachesilla abiesicola * .55/.71/1 1/NA/1 Ectopsocus briggsi * 1/NA/1 Matsumuraiella radiopicta * .44/1/.99 Graphopsocus cruciatus * Amphipsocus japonicus * Caeciliusetae . 95/0/1 Valenzuela badiostigma * .03/.93/.99 Xanthocaecilius sommermanae .08/.93/.98 Asiopsocus sonorensis Peripsocus phaeopterus * H. 1/NA/1 . 78/0/1 1/NA/1 Longivalvus nubilus * .32/.30/1 Ptycta johnsoni * .49/.20/1 Neoblaste papillosus * Psocetae Hemipsocus chroloticus * -.04/.65/1 .75/.57/1 Psilopsocus sp. .64/.57/1 Cladiopsocus ocotensis Neurostigma sp. Epipsocetae 1/NA/1 .62/.82/1 Loneura mombachensis .24/.16/1 Bertkauia sp. * .05/.96/1 Bryopsocus townsendi .07/0/1 Trichopsocus clarus .28/.96/1 .10/0/1 Calopsocus reticulatus Philotarsetae Heterocaecilius solocipennis * Aaroniella sp. * 1/NA/1 Archipsocus nomas Archipsocetae .46/.47/.99 Cerobasis guestfalica * Lepinotus patruelis * .03/.84/1 Atropetae .19/.65/.99 Echmepteryx hageni * .75/.14/1 Rhyopsocus sp. -.15/.39/.99 Dorypteryx domestica Psyllipsocetae .18/0/1 Psyllipsocus ramburii * .64/.90/1 Neotrogla sp. .65/.18http://mc.manuscriptcentral.com/systbiol/1 Neotrogla aurora * Speleketor irwini Prionoglaridetae Prionoglaris stygia Systematic Biology Page 40 of 48 1.00

0.75

Nanopsocetae

For Peer Review Only Trogiomorpha

0.50 Psocomorpha

Amphientometae GC Content of third codon positions 0.25

0.00 sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. 1 sp. 2 sp. 2 Tapinella Ricinus Peritroctes Kilauella Myrsidea Pthirus pubis Bertkauia Neotrogla Epitroctes Aaroniella Alcedoecus Trichopsocus clarus Rhyopsocus Psilopsocus Ptycta johnsoni Ptycta Osculotes curta Musapsocus Pthirus gorillae Badonnelia titei Badonnelia titei Neurostigma Degeeriella rufa Brueelia antiqua Pedicinus badius Speleketor irwini Goniodes ortygis Nepiomorpha Stachiella larseni Neotrogla aurora Pediculus schaeffi Menopon gallinae Pessoaiella absita Liposcelis brunnea Ectopsocus briggsi Pediculus humanus Pediculus Prionoglaris stygia Lepinotus patruelis Pediculus humanus Oxylipeurus chiniri Archipsocus nomas Penenirmus auritus Compsocus elegans Saemundssonia lari Embidopsocus Embidopsocus Embidopsocus Geomydoecus aurei Chelopistes texanus Longivalvus nubilus Longivalvus Anatoecus icterodes Halipeurus diversus Pectinopygus varius Liposcelis pearmani Echmepteryx hageni Ibidoecus bisignatus Fulicoffula longipila Elipsocus kuriliensis Neoblaste papillosus Neoblaste Geomydoecus ewingi Geomydoecus Lachesilla abiesicola Linognathus spicatus Heterodoxus spiniger Cerobasis guestfalica Psyllipsocus ramburii Pachytroctes maculosus Dorypteryx domestica Docophoroides brevis Bryopsocus townsendi Quadraceps punctatus Asiopsocus sonorensis Cummingsia maculata Falcolipeurus marginalis Campanulotes compar Campanulotes Campanulotes compar Calopsocus reticulatus Megaginus tataupensis Anomopsocus amabilis Craspedorrhynchus Columbicola columbae Columbicola columbae Cladiopsocus ocotensis Amphipsocus japonicus Craspedonirmus immer Loneura mombachensis Valenzuela badiostigma Valenzuela Peripsocus phaeopterus Peripsocus Hemipsocus chroloticus Hoplopleura arboricola Graphopsocus cruciatusGraphopsocus Stimulopalpus japonicus Stimulopalpus Idatenopsocus orientalis Stimulopalpus japonicus Proechinopthirus fluctus Propsocus pulchripennis Osborniella crotophagae Strongylocotes lipogonus Echinophthirius horridus Macrogyropus costalimai Mesopsocus unipunctatus Liposcelis bostrychophila Haematomyzus elephantis Haematomyzus Haematomyzus elephantis Laemobothrion tinnunculi Haematopinus eurysternus Lachesilla contraforcepeta Matsumuraiella radiopicta Matsumuraiella Antarctophthirus microchir Neohaematopinus pacificus Heterocaecilius solocipennis Bothriometopus macrocnemis Xanthocaecilius sommermanae Trichophilopterus babakotophilus

http://mc.manuscriptcentral.com/systbiol

Species Page 41 of 48 Systematic Biology 1.00

0.75

For Peer Review Only Nanopsocetae Trogiomorpha 0.50 Psocomorpha

Amphientometae GC content of 1st codon positions

0.25

0.00 sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. 2 sp. 2 sp. 1 Tapinella Ricinus Peritroctes Kilauella Myrsidea Pthirus pubis Bertkauia Neotrogla Epitroctes Aaroniella Alcedoecus Trichopsocus clarus Rhyopsocus Psilopsocus Ptycta johnsoni Osculotes curta Musapsocus Pthirus gorillae Badonnelia titei Badonnelia titei Neurostigma Degeeriella rufa Brueelia antiqua Pedicinus badius Goniodes ortygis irwini Speleketor Nepiomorpha Stachiella larseni Neotrogla aurora Pediculus schaeffi Pessoaiella absita Menopon gallinae Liposcelis brunnea Ectopsocus briggsi Pediculus humanus Pediculus Prionoglaris stygia Lepinotus patruelis Pediculus humanus Oxylipeurus chiniri Archipsocus nomas Penenirmus auritus Saemundssonia lari Compsocus elegans Embidopsocus Embidopsocus Geomydoecus aurei Chelopistes texanus Longivalvus nubilus Longivalvus Embidopsocus Anatoecus icterodes Halipeurus diversus Pectinopygus varius Neoblaste papillosus Neoblaste Echmepteryx hageni Liposcelis pearmani Ibidoecus bisignatus Fulicoffula longipila Elipsocus kuriliensis Geomydoecus ewingi Geomydoecus Linognathus spicatus Lachesilla abiesicola Heterodoxus spiniger Cerobasis guestfalica Psyllipsocus ramburii Pachytroctes maculosus Dorypteryx domestica Docophoroides brevis Bryopsocus townsendi Quadraceps punctatus Cummingsia maculata Asiopsocus sonorensis Falcolipeurus marginalis Campanulotes compar Campanulotes Campanulotes compar Calopsocus reticulatus Megaginus tataupensis Anomopsocus amabilis Craspedorrhynchus Craspedorrhynchus Columbicola columbae Columbicola columbae Cladiopsocus ocotensis Amphipsocus japonicus Craspedonirmus immer Loneura mombachensis Valenzuela badiostigma Valenzuela Peripsocus phaeopterus Peripsocus Hoplopleura arboricola Hemipsocus chroloticus Graphopsocus cruciatusGraphopsocus Stimulopalpus japonicus Stimulopalpus Stimulopalpus japonicus Idatenopsocus orientalis Proechinopthirus fluctus Propsocus pulchripennis Osborniella crotophagae Strongylocotes lipogonus Echinophthirius horridus Macrogyropus costalimai Mesopsocus unipunctatus Liposcelis bostrychophila Haematomyzus elephantis Haematomyzus Haematomyzus elephantis Laemobothrion tinnunculi Haematopinus eurysternus Lachesilla contraforcepeta Matsumuraiella radiopicta Matsumuraiella Antarctophthirus microchir Neohaematopinus pacificus Heterocaecilius solocipennis Bothriometopus macrocnemis Xanthocaecilius sommermanae Trichophilopterus babakotophilus

http://mc.manuscriptcentral.com/systbiol

Species Systematic Biology Page 42 of 48 1.00

0.75 For Peer Review Only

0.50 Nanopsocetae

Trogiomorpha

Psocomorpha GC content of 2nd codon positions

0.25 Amphientometae

0.00 sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. 1 sp. 2 sp. 2 Tapinella sp. Ricinus Peritroctes Kilauella Myrsidea Pthirus pubis Bertkauia Neotrogla Epitroctes Aaroniella Alcedoecus Trichopsocus clarus Rhyopsocus Psilopsocus Ptycta johnsoni Osculotes curta Musapsocus Pthirus gorillae Badonnelia titei Badonnelia titei Neurostigma Degeeriella rufa Brueelia antiqua Speleketor irwini Pedicinus badius Goniodes ortygis Nepiomorpha Stachiella larseni Neotrogla aurora Pediculus schaeffi Menopon gallinae Pessoaiella absita Liposcelis brunnea Ectopsocus briggsi Pediculus humanus Pediculus Prionoglaris stygia Lepinotus patruelis Embidopsocus Embidopsocus Embidopsocus Oxylipeurus chiniri Archipsocus nomas Pediculus humanus Penenirmus auritus Saemundssonia lari Compsocus elegans Geomydoecus aurei Chelopistes texanus Longivalvus nubilus Longivalvus Anatoecus icterodes Halipeurus diversus Pectinopygus varius Liposcelis pearmani Echmepteryx hageni Ibidoecus bisignatus Fulicoffula longipila Elipsocus kuriliensis Neoblaste papillosus Neoblaste Geomydoecus ewingi Geomydoecus Lachesilla abiesicola Linognathus spicatus Heterodoxus spiniger Cerobasis guestfalica Psyllipsocus ramburii Pachytroctes maculosus Dorypteryx domestica Docophoroides brevis Bryopsocus townsendi Quadraceps punctatus Cummingsia maculata Asiopsocus sonorensis Falcolipeurus marginalis Campanulotes compar Campanulotes Campanulotes compar Calopsocus reticulatus Megaginus tataupensis Anomopsocus amabilis Craspedorrhynchus Columbicola columbae Columbicola columbae Cladiopsocus ocotensis Amphipsocus japonicus Craspedonirmus immer Loneura mombachensis Valenzuela badiostigma Valenzuela Peripsocus phaeopterus Peripsocus Hemipsocus chroloticus Hoplopleura arboricola Graphopsocus cruciatusGraphopsocus Stimulopalpus japonicus Stimulopalpus Stimulopalpus japonicus Idatenopsocus orientalis Proechinopthirus fluctus Propsocus pulchripennis Osborniella crotophagae Strongylocotes lipogonus Strongylocotes Echinophthirius horridus Macrogyropus costalimai Mesopsocus unipunctatus Liposcelis bostrychophila Haematomyzus elephantis Haematomyzus Haematomyzus elephantis Laemobothrion tinnunculi Haematopinus eurysternus Lachesilla contraforcepeta Matsumuraiella radiopicta Matsumuraiella Antarctophthirus microchir Neohaematopinus pacificus Heterocaecilius solocipennis Bothriometopus macrocnemis Xanthocaecilius sommermanae Trichophilopterus babakotophilus

http://mc.manuscriptcentral.com/systbiol

Species Page 43 of 48 Systematic Biology

Psocodea Taxnomic Sampling Scheme

Suborder or Order Family Taxon Data Type Total BP # Genes SRA Accession Anoplura Haematopinidae Haematopinus eurysternus WGS 2549976 2357 SRR5308123 Anoplura Echinophthiriidae Proechinopthirus fluctus WGS 2253525 2335 SRR5308138 Anoplura Echinophthiriidae Antarctophthirus microchir WGS 2440791 2349 SRR5088465 Anoplura Echinophthiriidae Echinopthirus horridus RNA seq 785123 1599 SRR2051484 Anoplura Linognathidae Linognathus spicatus WGS 2567331 2332 SRR5308129 Anoplura Polyplacidae Neohaematopinus pacificus WGS 2787132 2364 SRR5088469 Anoplura Hoplopleuridae Hoplopleura arboricola WGS 2336922 2359 SRR5088468 Anoplura Pedicinidae Pedicinus badius WGS 2344773 2356 SRR5308136 Anoplura Pthiridae Pthirus gorillae WGS 2789559 2368 SRR5088474 Anoplura Pthiridae Pthirus pubis WGS 2754231 2365 SRR5088475 Anoplura Pediculidae Pediculus schaeffi WGS 2562033 2364 SRR1182279 Anoplura Pediculidae Pediculus humanus WGS 2277984 2354 SRR5088472 Anoplura Pediculidae Pediculus humanus Reference 2945068 2370 PRJNA19807 Rhynchophthirina Haematomyzidae Haematomyzus elephantis WGS 2439042 2358 SRR5308122 Rhynchophthirina Haematomyzidae Haematomyzus elephantis RNA seq 1194114 1867 SRR2051491 Ischnocera Trichodectidae Stachiella larseni WGS 2524617 2353 SRR5308143 Ischnocera Trichodectidae Geomydoecus aurei WGS 2507592 2355 SRR5308121 Ischnocera Trichodectidae Geomydoecus ewingi RNA seq 2596821 2310 SRR1821919 Ischnocera Philopteridae Trichophilopterus babakotophilus WGS 2497134 2359 SRR5308144 Ischnocera Philopteridae Bothriometopus macrocnemis WGS 1835208 2259 SRR5088466 Ischnocera Philopteridae Craspedonirmus immer WGS 2193294 2348 SRR5308116 Ischnocera Philopteridae Columbicola columbae WGS 2225532 2356 SRR5308115 Ischnocera Philopteridae Columbicola columbae RNA seq 1968492 2218 SRR1821984 Ischnocera Philopteridae DoForcophoroides brevis PeerWGS 2313018 235 5 ReviewSRR5308117 Only Ischnocera Philopteridae Halipeurus diversus WGS 2281860 2354 SRR5308124 Ischnocera Philopteridae Fulicoffula longipila WGS 2129253 2345 SRR5308119 Ischnocera Philopteridae Anatoecus icterodes WGS 2208450 2338 SRR5308111 Ischnocera Philopteridae Falcolipeurus marginalis WGS 2280087 2350 SRR5308118 Ischnocera Philopteridae Ibidoecus bisignatus WGS 2341512 2353 SRR5308126 Ischnocera Philopteridae Pectinopygus varius WGS 2034798 2315 SRR5308135 Ischnocera Philopteridae Chelopistes texanus WGS 2589447 2363 SRR5308114 Ischnocera Philopteridae Oxylipeurus chiniri WGS 2668434 2368 SRR5308134 Ischnocera Philopteridae Degeeriella rufa WGS 2485305 2353 SRR5088467 Ischnocera Philopteridae Brueelia antiqua WGS 2700864 2357 SRR5308112 Ischnocera Philopteridae Penenirmus auritus WGS 2497635 2361 SRR5308137 Ischnocera Philopteridae Alcedoecus sp. WGS 2417226 2359 SRR5308110 Ischnocera Philopteridae Quadraceps punctatus WGS 2566107 2362 SRR5308139 Ischnocera Philopteridae Saemundssonia lari WGS 2501502 2353 SRR5308141 Ischnocera Philopteridae Goniodes ortygis WGS 2498694 2356 SRR5308120 Ischnocera Philopteridae Campanulotes compar WGS 2336796 2355 SRR5308113 Ischnocera Philopteridae Campanulotes compar RNA seq 2134521 2251 SRR1821983 Ischnocera Philopteridae Strongylocotes lipogonus WGS 2739207 2363 SRR5308142 Ischnocera Philopteridae Megaginus tataupensis WGS 2582559 2364 SRR5308131 Ischnocera Philopteridae Pessoaiella absita WGS 2403510 2358 SRR5308145 Ischnocera Philopteridae Osculotes curta WGS 2480868 2363 SRR5308133 Ischnocera Philopteridae Craspedorrhynchus sp. RNA seq 2560936 2301 SRR1821912 Amblycera Ricinus sp. WGS 2187996 2310 SRR5308140 Amblycera Menoponidae Osborniella crotophagae WGS 2399703 2342 SRR5088470 Amblycera Menoponidae Myrsidea sp. WGS 1997502 2284 SRR5308132 Amblycera Menoponidae Menopon gallinae RNA seq 2405619 2263 SRR921619 Amblycera Boopiidae Heterodoxus spiniger WGS 2451921 2340 SRR5308125 Amblycera Cummingsia maculata WGS 2314959 2335 SRR5308146 Amblycera Macrogyropus costalimai WGS 2432211 2347 SRR5308130 Amblycera Laemobothrion tinnunculi WGS 2425458 2344 SRR5308127 Troctomorpha Liposcelididae Liposcelis brunnea WGS 2112042 2329 SRR5308128 Troctomorpha Liposcelididae Liposcelis pearmani WGS 2199234 2346 SRR5308268 Troctomorpha Liposcelididae Liposcelis bostrychophila RNA seq 2125494 2201 SRR921613 Troctomorpha Liposcelididae Embidopsocus sp. 2 WGS 2330187 2353 SRR5308269 Troctomorpha Liposcelididae Embidopsocus sp. 2 RNA seq 2557295 2318 SRR5134727 Troctomorpha Liposcelididae Embidopsocus sp. RNA seq 963150 1709 SRR2051486 Troctomorpha Pachytroctidae Pachytroctes maculosus WGS 2317935 2313 SRR5308279 Troctomorpha Pachytroctidae Tapinella sp. WGS 2370138 2336 SRR5308286 Troctomorpha Pachytroctidae Peritroctes sp. WGS 2442492 2340 SRR5308280 Troctomorpha Sphaeropsocidae Badonnelia titei WGS 2394783 2343 SRR5308262 Troctomorpha Sphaeropsocidae Badonnelia titei RNA seq 2183178 2186 SRR2051472 Troctomorpha Stimulopalpus japonicus WGS 2270973 2300 SRR5088476 Troctomorpha Amphientomidae Stimulopalpus japonicus RNA seq 1828280 2061 SRR2051511 Troctomorpha Musapsocidae Musapsocus sp. WGS 2363169 2348 SRR5308275 Troctomorpha Compsocus elegans WGS 2367267 2322 SRR5308266 Troctomorpha Electrentomidae Epitroctes sp. WGS 2400144 2330 SRR5308270 Psocomorpha Kilauella sp. WGS 2329824 2313 SRR5308272 Psocomorpha Elipsocidae Nepiomorpha sp. WGS 2570349 2309 SRR5308276 Psocomorpha Elipsocidae Propsocus pulchripennis WGS 2325882 2309 SRR5308281 Psocomorpha Elipsocidae Elipsocus kuriliensis RNA seq 1843671 2158 SRR2051485 Psocomorpha Calopsocus reticulatus WGS 2301513 2288 SRR5308264 Psocomorpha Pseudocaeciliidae Bryopsocus townsendi WGS 2188992 2268 SRR5308263 Psocomorpha Pseudocaeciliidae Heterocaecilius solocipennis RNA seq 1906914 2126 SRR2051493 Psocomorpha Trichopsocus clarus WGS 2332443 2298 SRR5308287 Psocomorpha Cladiopsocus ocotensis WGS 2312040 2290 SRR5308265 Psocomorpha Loneura mombachensis WGS 2059170 2240 SRR5308274 Psocomorpha Archipsocidae Archipsocus nomas WGS 2373123 2304 SRR5308260 Psocomorpha Neurostigma sp. WGS 2158530 2267 SRR5308277 Psocomorpha Epipsocidae Bertkauia sp. RNA seq 1686357 1939 SRR2051473 Psocomorpha Asiopsocus sonorensis WGS 2368782 2298 SRR5308261 Psocomorpha Caeciliusidae Xanthocaecilius sommermanae WGS 2256093 2284 SRR5308288 Psocomorpha Caeciliusidae Valenzuela badiostigma RNA seq 1980198 2179 SRR2051514 Psocomorpha Anomopsocus amabilis WGS 2308974 2294 SRR5308259 Psocomorpha Lachesillidae Lachesilla contraforcepeta RNA seq 2164547 2279 SRR1821927 Psocomorpha Lachesillidae Lachesilla abiesicola RNA seq 1658454 2106 SRR2051497 Psocomorpha Idatenopsocus orientalis WGS 2362071 2302 SRR5308271 Psocomorpha Mesopsocidae Mesopsocus unipunctatus RNA seq 1546809 1987 SRR2051502 Psocomorpha Matsumuraiella radiopicta RNA seq 2026917 2212 SRR2051500 Psocomorpha Psilopsocus sp. WGS 2111205 2239 SRR5308283 Psocomorpha Graphopsocus cruciatus RNA seq 1820073 2095 SRR2051490 Psocomorpha Amphipsocus japonicus RNA seq 2010662 2177 SRR2051466 Psocomorpha Peripsocidae Peripsocus phaeopterus RNA seq 1665303 2038 SRR2051507 Psocomorpha Ectopsocidae Ectopsocus briggsi RNA seq 1870215 2125 SRR645929 Psocomorpha Psocidae Longivalvus nubilus RNA seq 1299474 1716 SRR2051498 Psocomorpha Psocidae Ptycta johnsoni RNA seq 1602536 2006 SRR1821962 Psocomorpha Psocidae Neoblaste papillosus RNA seq 1716579 2052 SRR2051505 Psocomorpha Hemipsocus chloroticus RNA seq 1638022 2025 SRR2051492 Psocomorpha Aaroniella sp. RNA seq 1578200 1891 SRR2051465 Trogiomorpha Prionoglarididae Speleketor irwini WGS 1898916 2200 SRR5308285 Trogiomorpha Prionoglarididae Prionoglaris stygia WGS 2282502 2290 SRR5308282 Trogiomorpha Prionoglarididae Neotrogla sp. WGS 1977267 2244 SRR5308278 Trogiomorpha Prionoglarididae Neotrogla aurora RNA seq 1725501 2005 SRR5134732 Trogiomorpha Dorypteryx domestica WGS 2330406 2308 SRR5308267 Trogiomorpha Psyllipsocidae Psyllipsocus ramburii RNA seq 2394803 2277 SRR5134716 Trogiomorpha Rhyopsocus sp. WGS 2012127 2213 SRR5308284 Trogiomorpha Lepinotus patruelis RNA seq 2391270 2286 SRR5134710 Trogiomorpha Trogiidae Cerobasis guestfalica RNA seq 1947314 2186 SRR2051476 Trogiomorpha Echmepteryx hageni RNA seq 1546566 1859 SRR1821982 http://mc.manuscriptcentral.com/systbiol Systematic Biology Page 44 of 48

Summary of Analyses Completed Data Type Maximum Likelihood Astral Quartet Sampling Likelihood Mapping All Sites Partitioned and Concatenated Yes Yes Yes Third Positions Removed Concatenated Only No Yes Yes Second Positions Only Concatenated Only No Yes Yes Degeneracy Recoded Partitioned and Concatenated Yes Yes Yes

For Peer Review Only

http://mc.manuscriptcentral.com/systbiol Page 45 of 48 Systematic Biology

Traditional Classficiation: Newly Proposed Classification: Order Suborder Infraorder Order Suborder Infraorder Parvorder Psocoptera: Psocodea: Trogiomorpha: Trogiomorpha: Prionoglaridetae Prionoglaridetae Psyllipsocetae Psyllipsocetae Atropetae Atropetae Psocomorpa: Psocomorpa: Archipsocetae Archipsocetae Philotarsetae Philotarsetae Epipsocetae Epipsocetae Psocetae Psocetae For PeerCaeciliuse taReviewe OnlyCaeciliusetae Homilopsocidea Homilopsocidea Troctomorpha: Troctomorpha: Amphientometae Amphientometae Nanopsocetae Sphaeropsocetae Phthiraptera: Pachytroctetae Amblycera Liposcelidetae Anoplura Phthiraptera: Rhynchophthirina Amblycera Ischnocera Anoplura Rhynchophthirina Trichodectera Ischnocera

http://mc.manuscriptcentral.com/systbiol 100 SystematicStrongylocotes Biology lipogonus Page 46 of 48 100 Megaginus tataupensis 100 Pessoaiella absita 97 Osculotes curta 100 Campanulotes compar * 100 Campanulotes compar 100 Goniodes ortygis 92 100 Saemundssonia lari Quadraceps punctatus 100 100 Penenirmus auritus 100 96 Alcedoecus sp. Craspedorrhynchus sp. * 100 Brueelia antiqua Degeeriella rufa 100 Chelopistes texanus 100 Oxylipeurus chiniri 98 Halipeurus diversus 75 Docophoroides brevis 92 Fulicoffula longipila 100 Anatoecus icterodes 100 100 Pectinopygus varius Ibidoecus bisignatus 74 Falcolipeurus marginalis 84 70 Bothriometopus macrocnemis Craspedonirmus immer 100 Trichophilopterus babakotophilus 100 Columbicola columbae * 100 Columbicola columbae 100 Proechinopthirus fluctus 100 Antarctophthirus microchir Echinophthirius horridus * 100 Haematopinus eurysternus 100 100 Hoplopleura arboricola 100 Linognathus spicatus 100 Neohaematopinus pacificus 100 Pediculus humanus 100 Pediculus humanus 100 100 100 100 Pthirus gorillae Pthirus pubis For Peer ReviewPedicinus badius Only 100 100 100 Haematomyzus elephantis * Haematomyzus elephantis 100 Geomydoecus aurei 100 Geomydoecus ewingi * 99 Stachiella larseni 73 Laemobothrion tinnunculi 100 Heterodoxus spiniger 100 Ricinus sp. 100 Macrogyropus costalimai 100 Cummingsia maculata 100 Osborniella crotophagae 100 Myrsidea sp. 100 Menopon gallinae * 100 Liposcelis brunnea 100 Liposcelis pearmani 100 Liposcelis bostrychophila * 100 Embidopsocus sp. 2 100 Embidopsocus sp. 2 * 100 Embidopsocus sp. * 100 Tapinella sp. 100 Pachytroctes maculosus Peritroctes sp. 100 100 Badonnelia titei * Badonnelia titei 100 Compsocus elegans 100 Musapsocus sp. 100 100 Stimulopalpus japonicus * Stimulopalpus japonicus Epitroctes sp. 100 100 Matsumuraiella radiopicta * 100 Graphopsocus cruciatus * Amphipsocus japonicus * 100 Valenzuela badiostigma * 78 100 Asiopsocus sonorensis 92 Xanthocaecilius sommermanae 100 Peripsocus phaeopterus * 100 Lachesilla abiesicola * 78 Lachesilla contraforcepeta * 100 81 Ectopsocus briggsi * 4 Idatenopsocus orientalis 100 Kilauella sp. 100 Anomopsocus amabilis Propsocus pulchripennis Nepiomorpha sp. 100 100 Elipsocus kuriliensis * 100 Mesopsocus unipunctatus * 100 100 Longivalvus nubilus * 100 Ptycta johnsoni * 100 Neoblaste papillosus * Hemipsocus chroloticus * 89 100 Psilopsocus sp. 100 Cladiopsocus ocotensis 100 Neurostigma sp. 100 Loneura mombachensis Bertkauia sp. * 100 100 Heterocaecilius solocipennis * 100 Calopsocus reticulatus 100 100 Trichopsocus clarus Bryopsocus townsendi Aaroniella sp. * 100 Archipsocus nomas 100 Cerobasis guestfalica * 100 Lepinotus patruelis * 100 Echmepteryx hageni * 100 Rhyopsocus sp. 33 Psyllipsocus ramburii * 100 Dorypteryx domestica 100 100 Neotrogla sp. Neotrogla aurora * http://mc.manuscriptcentral.com/systbiol Speleketor irwini Prionoglaris stygia Page 47 of 48 1 SystematicStrongylocotes Biology lipogonus 1 Megaginus tataupensis 1 Pessoaiella absita 1 Osculotes curta 1 Campanulotes compar 1 Campanulotes compar * 1 Goniodes ortygis .81 Alcedoecus sp. 1 Craspedorrhynchus sp. * 1 1 Penenirmus auritus 1 Quadraceps punctatus 1 Saemundssonia lari Brueelia antiqua 1 Chelopistes texanus 1 Oxylipeurus chiniri 1 1 Docophoroides brevis .61 Halipeurus diversus .98 1 Falcolipeurus marginalis 1 Pectinopygus varius .67 1 Ibidoecus bisignatus .86 Anatoecus icterodes 1 Columbicola columbae * 1 1 Columbicola columbae Craspedonirmus immer Bothriometopus macrocnemis Trichophilopterus babakotophilus 1 Antarctophthirus microchir 1 1 Echinophthirius horridus * 1 Haematopinus eurysternus 1 Hoplopleura arboricola 1 Linognathus spicatus 1 1 1 Pediculus humanus 1 Pediculus humanus 1 1 Pthirus pubis 1 1 Pthirus gorillae For PeerPedicinus Review badius Only 1 Haematomyzus elephantis * 1 1 Haematomyzus elephantis 1 Geomydoecus ewingi * 1 Geomydoecus aurei Stachiella larseni 1 Menopon gallinae * 1 Osborniella crotophagae 1 Myrsidea sp. 1 1 Cummingsia maculata 1 Macrogyropus costalimai 1 Laemobothrion tinnunculi 1 Ricinus sp. Heterodoxus spiniger 1 Liposcelis brunnea 1 1 Liposcelis pearmani 1 Liposcelis bostrychophila 1 Embidopsocus sp. 2 1 Embidopsocus sp. 2 * 1 Embidopsocus sp. * Pachytroctes maculosus 1 Peritroctes sp. 1 Tapinella sp. Badonnelia titei * 1 Badonnelia titei Nepiomorpha sp. 1 Elipsocus kuriliensis * 1 Kilauella sp. 1 Mesopsocus unipunctatus * 1 Idatenopsocus orientalis .82 1 Propsocus pulchripennis Anomopsocus amabilis 1 Lachesilla abiesicola * 1 a * 1 Graphopsocus cruciatus * 1 1 Matsumuraiella radiopicta * 1 Amphipsocus japonicus * 1 .99 Valenzuela badiostigma * 1 Xanthocaecilius sommermanae .71 Asiopsocus sonorensis Ectopsocus briggsi * .81 Peripsocus phaeopterus * 1 Ptycta johnsoni * 1 1 Longivalvus nubilus * .98 Neoblaste papillosus * 1 Hemipsocus chroloticus * .97 Psilopsocus sp. 1 Loneura mombachensis 1 Cladiopsocus ocotensis 1 1 Neurostigma sp. 1 Bertkauia sp. * 1 Heterocaecilius solocipennis * 1 Calopsocus reticulatus 1 Bryopsocus townsendi Trichopsocus clarus 1 1 Aaroniella sp. * Archipsocus nomas 1 Stimulopalpus japonicus 1 Stimulopalpus japonicus * 1 Compsocus elegans 1 Musapsocus sp. Epitroctes sp. 1 Lepinotus patruelis * 1 a * 1 Echmepteryx hageni * 1 Rhyopsocus sp. 1 Dorypteryx domestica 1 Psyllipsocus ramburii * 1 Neotrogla aurora * 1 Neotrogla sp. http://mc.manuscriptcentral.com/systbiol 1 Speleketor irwini Prionoglaris stygia Systematic1 BiologyStrongylocotes lipogonus Page 48 of 48 1 Megaginus tataupensis 1 Osculotes curta 1 Pessoaiella absita 1 Campanulotes compar * 1 Campanulotes compar 1 Goniodes ortygis .71 Penenirmus auritus .92 Alcedoecus sp. 1 1 Quadraceps punctatus 1 Saemundssonia lari 1 Craspedorrhynchus sp. * Brueelia antiqua 1 Degeeriella rufa Chelopistes texanus 1 Oxylipeurus chiniri 1 1 Halipeurus diversus .84 Docophoroides brevis .99 Fulicoffula longipila 1 Pectinopygus varius .71 1 .57 Ibidoecus bisignatus Falcolipeurus marginalis 1 Anatoecus icterodes Craspedonirmus immer .65 Columbicola columbae * 1 Columbicola columbae .94 Bothriometopus macrocnemis Trichophilopterus babakotophilus 1 Proechinopthirus fluctus 1 Antarctophthirus microchir 1 Echinophthirius horridus * 1 Haematopinus eurysternus 1 Linognathus spicatus 1 Hoplopleura arboricola 1 Neohaematopinus pacificus 1 1 Pediculus humanus 1 Pediculus humanus 1 1 Pthirus pubis 1 1 Pthirus gorillae For Peer ReviewPedicinus badius Only 1 1 Haematomyzus elephantis 1 Haematomyzus elephantis * 1 Geomydoecus ewingi * 1 Geomydoecus aurei Stachiella larseni 1 Menopon gallinae * 1 Osborniella crotophagae 1 Myrsidea sp. 1 1 Cummingsia maculata .97 Macrogyropus costalimai .84 Ricinus sp. 1 Laemobothrion tinnunculi Heterodoxus spiniger 1 Liposcelis brunnea 1 Liposcelis pearmani 1 1 Liposcelis bostrychophila * 1 Embidopsocus sp. 2 * 1 Embidopsocus sp. 2 Embidopsocus sp. * 1 Pachytroctes maculosus 1 Peritroctes sp. 1 1 Tapinella sp. Badonnelia titei * 1 Badonnelia titei 1 Stimulopalpus japonicus * .41 Stimulopalpus japonicus 1 Musapsocus sp. 1 Compsocus elegans Epitroctes sp. 1 Mesopsocus unipunctatus * 1 Idatenopsocus orientalis Propsocus pulchripennis 1 Nepiomorpha sp. .74 Elipsocus kuriliensis * 1 1 Kilauella sp. .91 Anomopsocus amabilis 1 Lachesilla contraforcepeta * 1 .96 Lachesilla abiesicola * Ectopsocus briggsi * 1 1 Matsumuraiella radiopicta * 1 Graphopsocus cruciatus * 1 Amphipsocus japonicus * 1 Valenzuela badiostigma * 1 Xanthocaecilius sommermanae .98 Asiopsocus sonorensis Peripsocus phaeopterus * 1 Longivalvus nubilus * 1 1 Ptycta johnsoni * 1 Neoblaste papillosus * Hemipsocus chroloticus * 1 Psilopsocus sp. 1 Neurostigma sp. .35 Bertkauia sp. * 1 Cladiopsocus ocotensis 1 1 Loneura mombachensis 1 1 Heterocaecilius solocipennis * 1 Calopsocus reticulatus Bryopsocus townsendi 1 Trichopsocus clarus 1 Aaroniella sp. * Archipsocus nomas 1 Cerobasis guestfalica * 1 Lepinotus patruelis * 1 Echmepteryx hageni * 1 Rhyopsocus sp. 1 Dorypteryx domestica 1 Psyllipsocus ramburii * 1 Neotrogla sp. 1 Neotrogla aurora * http://mc.manuscriptcentral.com/systbiol 1 Speleketor irwini Prionoglaris stygia