1 2 3 4 1 Integrated metatranscriptomics and for the 5 6 2 characterization of bacterial in unfed Ixodes ricinus 7 8 3 9 10 4 Angélica Hernández-Jarguín1&, Sandra Díaz-Sánchez1&, Margarita Villar1, José de la 11 12 1,2* 13 5 Fuente 14 15 6 16 17 7 1SaBio. Instituto de Investigación en Recursos Cinegéticos IREC-CSIC-UCLM-JCCM, 18 19 8 Ronda de Toledo s/n, 13005 Ciudad Real, Spain. 20 21 9 2Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma 22 23 10 State University, Stillwater, OK 74078, USA. 24 25 11 26 27 12 &The first two authors contributed equally to the work reported in this paper. 28 29 30 13 31 32 14 *Correspondence: José de La Fuente. SaBio. Instituto de Investigación en Recursos 33 34 15 Cinegéticos, IREC-CSIC-UCLM-JCCM, 13005 Ciudad Real, Spain; e-mail: 35 36 16 [email protected] 37 38 17 39 40 18 Running title: Metaomics of tick bacterial microbiota 41 42 43 44 45 46 47 48 49 50 51 52 53 54 1 55 56 57 58 59 60 20 Abstract 61 62 21 An innovative metaomics approach integrating metatranscriptomics and metaproteomics 63 64 22 was used to characterize bacterial communities in the microbiota of the Lyme borreliosis 65 66 23 spirochete vector, Ixodes ricinus (Acari: Ixodidae). Whole internal tissues and salivary 67 68 24 glands from unfed larvae and female ticks, respectively were used. Reused I. ricinus RNA- 69 70 25 sequencing data for metranscriptomics analysis together with metaproteomics provided a 71 72 26 better characterization of tick bacterial microbiota by increasing bacteria identification and 73 74 75 27 support for identified bacteria with putative functional implications. The results showed the 76 77 28 presence of symbiotic, commensal, soil, environmental, and pathogenic bacteria in the I. 78 79 29 ricinus microbiota, including previously unrecognized commensal and soil 80 81 30 . The results of the metaomics approach may have implications in the 82 83 31 characterization of putative mechanisms by which pathogen infection manipulates tick 84 85 32 microbiota to facilitate infection. Metaomics approaches integrating different 86 87 33 datasets would provide a better description of tick microbiota compositions, and insights 88 89 34 into tick interactions with microbiota, pathogens and hosts. 90 91 92 35 93 94 36 Keywords: metatranscriptomics; metaproteomics; metaomics; tick; microbiota; biofilm 95 96 37 97 98 99 100 101 102 103 104 105 106 107 108 109 110 2 111 112 113 114 115 116 39 1. Introduction 117 118 40 The microbiota plays an important role in several processes affecting human and animal 119 120 41 health, agriculture, environment, and host-pathogen interactions (Kau et al., 2011; Schwabe 121 122 42 and Jobin, 2013; Philippot et al., 2013; Bouchez et al., 2016). Next-generation sequencing 123 124 43 or omics technologies can be used for microbiota characterization under different 125 126 44 experimental and natural conditions. have been used to characterize the 127 128 45 microbiota in different hosts including both model and nonmodel organisms such as 129 130 131 46 humans and tick vectors (Clay et al., 2008; Andreotti et al., 2011; Carpi et al., 2011; 132 133 47 Vayssier-Taussat et al., 2015; Qiu et al., 2014; Williams-Newkirk et al., 2014; van Treuren 134 135 48 et al., 2015; Narasimhan and Fikrig, 2015; Yoon et al., 2015; Abraham et al., 2017; Heintz- 136 137 49 Buschart and Wilmes, 2017; Greay et al., 2017; Varela-Stokes et al., 2017; Xiang et al., 138 139 50 2017). Different metatranscriptomics approaches have been also applied to the study of 140 141 51 microbial communities in arthropod vectors and vertebrate hosts (Mäder et al., 2011; 142 143 52 Johannson et al., 2013; Vayssier-Taussat et al., 2013; Razzauti et al., 2015; Luo et al., 144 145 53 2017). Recently, metaproteomics and metabolomics have emerged as powerful tools for the 146 147 148 54 characterization of dynamic host- interactions, particularly in combination with 149 150 55 metagenomics and metatranscriptomics approaches (Tanca et al, 2013; 2014; Franzosa et 151 152 56 al., 2015; Aguiar-Pulido et al., 2016; Cheng et al., 2017). Furthermore, metaomics or the 153 154 57 integration of different omics approaches allows network-based analyses to describe the 155 156 58 complexity and function of different biological processes involved in host/tick-pathogen 157 158 59 and host/tick-microbiome interactions (Franzosa et al., 2015; Villar et al, 2015; Narasimhan 159 160 60 and Fikrig, 2015), and the discovery of new targets for prevention and control of tick-borne 161 162 61 diseases (Abraham et al., 2017; Narasimhan et al., 2017; Xiang et al., 2017). 163 164 165 166 3 167 168 169 170 171 62 Ixodes ricinus (Linnaeus 1758) (Acari: Ixodidae) are obligate hematophagous ectoparasites 172 173 63 and vectors of multiple pathogens such as Borrelia spp. (Lyme borreliosis and hard tick- 174 175 176 64 borne relapsing fever), Anaplasma phagocytophilum (human granulocytic anaplasmosis), 177 178 65 tick-borne encephalitis virus (TBE), and Babesia spp. (babesiosis) (de la Fuente et al., 179 180 66 2008; 2017). Additionally, I. ricinus have a diverse community of commensal and 181 182 67 symbiotic microorganisms which exert multiple effects on tick fitness, nutrition, 183 184 68 development, reproduction, defense against environmental stress, immunity and 185 186 69 transmission of tick-borne pathogens (Bonnet et al., 2017; de la Fuente et al., 2017). The I. 187 188 70 ricinus microbiome was first characterized using a metagenomics approach (Carpi et al., 189 190 71 2011; Nakao et al., 2013; Bonnet et al., 2014). Vayssier-Taussat et al. (2013) characterized 191 192 193 72 the bacterial community of I. ricinus using a whole transcriptomics approach, resulting in a 194 195 73 better identification of previously unknown bacteria and accurate identification of potential 196 197 74 pathogens. This method also provides a better understanding of the tick-microbiome 198 199 75 interactions when compared to metagenomics. Additionally, reusing RNA sequencing 200 201 76 (RNA-seq) data has been also used as an efficient strategy for the screening of pathogens in 202 203 77 ticks (Zhuang et al., 2014a). 204 205 78 In this study, we used the integration of metatranscriptomics and metaproteomics for the 206 207 79 characterization of the tick bacterial microbiota in unfed I. ricinus. Reused I. ricinus RNA- 208 209 210 80 seq data for metranscriptomics analysis together with metaproteomics provided a better 211 212 81 characterization of tick microbiome by increasing bacterial identification and support for 213 214 82 identified bacteria with putative functional implications. 215 216 83 2. Materials and methods 217 218 84 2.1. Tick samples and processing. 219 220 221 222 4 223 224 225 226 227 85 Tick samples were obtained and processed as previously described (Genomic Resources 228 229 86 Development Consortium et al., 2014). Briefly, I. ricinus unfed larvae and adult females 230 231 232 87 were obtained from the reference laboratory colony maintained at the tick rearing facility of 233 234 88 the Institute of Parasitology of the Biology Centre of the Academy of Sciences of the Czech 235 236 89 Republic. Whole internal tissues and salivary glands from 300 larvae and 30 female ticks, 237 238 90 respectively were combined and used for RNA-seq. All ticks were washed with a series of 239 240 91 solutions composed of tap water, 3% hydrogen peroxide, two washes of distilled water, 241 242 92 70% ethanol and two more washes with distilled water prior to dissection for DNA, RNA 243 244 93 and extraction. Total DNA, RNA and were extracted using Tri Reagent 245 246 94 (Sigma-Aldrich, St. Louis, MO, USA) according to manufacturer instructions. RNA was 247 248 249 95 further purified with the RNeasy MinElute Cleanup Kit (Qiagen, Valencia, CA, USA) and 250 251 96 characterized using the Agilent 2100 Bioanalyzer (Santa Clara, CA, USA) in order to 252 253 97 evaluate the quality and integrity of RNA preparations. DNA and RNA concentrations were 254 255 98 determined using the Nanodrop ND-1000 (NanoDrop Technologies Wilmington, Delaware 256 257 99 USA). Proteins were resuspended in 20 mM Tris-HCl pH 7.5 with 4% SDS and protein 258 259 100 concentration was determined using the BCA Protein Assay kit (Thermo Scientific, 260 261 101 Rockford, IL, USA) with bovine serum albumin (BSA) as standard. 262 263 102 2.2. Integrated metaomics experimental design. 264 265 266 103 An integrated metatranscriptomics and metaproteomics approach was developed for the 267 268 104 characterization of I. ricinus bacterial microbiota (Fig. 1). Reused I. ricinus RNA-seq data 269 270 105 (Genomic Resources Development Consortium et al., 2014) derived from combined female 271 272 106 salivary glands and larvae were the basis for metatranscriptomics analysis that resulted in 273 274 107 the database of identified bacterial genera. This database was then used to generate the 275 276 277 278 5 279 280 281 282 283 108 Uniprot protein database for application as a variant of the proteomics informed by 284 285 109 transcriptomics (PIT) approach (Evans et al., 2012) in the metaproteomics analysis. 286 287 288 110 2.3. Metatranscriptomics for the identification of bacterial species in the tick 289 290 111 microbiome. 291 292 112 A metatranscriptomics pipeline was developed based on the reused I. ricinus RNA-seq data 293 294 113 (Fig. 1). Tick RNA-seq analysis was conducted as previously described (Genomic 295 296 114 Resources Development Consortium et al., 2014). The I. ricinus transcriptome, raw reads 297 298 115 and assembly results can be accessed at dryad entries doi: 10.5061/dryad.9js92/1 - doi: 299 300 116 10.5061/dryad.9js92/8. The metatranscriptomics database was then generated from the 301 302 117 19,831,942 I. ricinus unaligned reads that did not match to the I. scapularis reference 303 304 305 118 genome (assembly JCVI_ISG_i3_1.0; 306 307 119 http://www.ncbi.nlm.nih.gov/nuccore/NZ_ABJB000000000) (Genomic Resources 308 309 120 Development Consortium et al., 2014). The unaligned reads were extracted from the BAM 310 311 121 files (the binary version of the SAM file, a tab-delimited text file that contains sequence 312 313 122 alignment data) that resulted after the assembly of I. ricinus transcriptome using the 314 315 123 SAMtools (Li et al., 2009; Li, 2011; http://samtools.sourceforge.net). Then, the unaligned 316 317 124 reads were searched against a bacterial sequence database constructed with genome and/or 318 319 125 species-specific ribosomal RNA (rRNA) sequences downloaded from the NCBI 320 321 322 126 (https://www.ncbi.nlm.nih.gov) (Supplementary file 1 – Table 1). The bioinformatics 323 324 127 approach to identify bacterial sequences was done in two steps. First, the LAST genome- 325 326 128 scale sequence comparison tool (http://last.cbrc.jp) was used to search against the bacterial 327 328 129 database previously constructed. The reads containing poly-A tails were discarded. As cut- 329 330 130 off criteria for genome-scale sequence comparison we applied minimum alignment length 331 332 131 of 100 nucleotides, e-value 0.001, with a word size of 11, and a minimum of 70% sequence 333 334 6 335 336 337 338 339 132 identity. Then, the putative bacterial reads detected with LAST were further confirmed by 340 341 133 BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch) 342 343 344 134 (Frith et al., 2010 a,b; Kiełbasa et al., 2011). BLAST assignments were done by using the 345 346 135 10 best BLAST hits (BBH) for each putative bacteria previously assigned by LAST. The 347 348 136 sequences with hits matching to bacteria were confirmed as identified bacterial sequences, 349 350 137 discarding the rest. Manual filtering was applied to remove those sequences with similarity 351 352 138 to functional domains. The metatranscriptomics database was constructed taking the 353 354 139 number of count reads or identifications (IDs) assigned to each identified bacterial 355 356 140 sequences, and normalized against the total number of IDs (Supplementary file 2 – Dataset 357 358 141 1). 359 360 361 142 2.4. Metaproteomics for bacterial protein identification. 362 363 143 The metaproteomics pipeline included de novo identification using a protein database 364 365 144 constructed based on the bacterial genera identified by metatrascriptomics, a variant of the 366 367 145 PIT approach (Evans et al., 2012) (Fig. 1). Protein extracts (150 μg per sample) were on-gel 368 369 146 concentrated by SDS-PAGE as previously described (Villar et al., 2015). The unseparated 370 371 147 protein band was visualized by staining with GelCode Blue Stain Reagent (Thermo 372 373 148 Scientific), excised, cut into 2x2 mm cubes and digested overnight at 37 °C with 60 ng/µl 374 375 149 sequencing grade trypsin (Promega, Madison, WI, USA) at 5:1 protein:trypsin (w/w) ratio 376 377 378 150 in 50 mM ammonium bicarbonate, pH 8.8 containing 10% (v/v) acetonitrile (Shevchenko et 379 380 151 al., 2006). The resulting tryptic peptides from the gel band were extracted by 30 min- 381 382 152 incubation in 12 mM ammonium bicarbonate, pH 8.8. Trifluoroacetic acid was added to a 383 384 153 final concentration of 1% and the peptides were finally desalted onto OMIX Pipette tips 385 386 154 C18 (Agilent Technologies, Santa Clara, CA, USA), dried-down and stored at -20 °C until 387 388 155 mass spectrometry analysis. The desalted protein digests were resuspended in 0.1% formic 389 390 7 391 392 393 394 395 156 acid and analyzed by reverse phase (RP)-liquid chromatography (LC)-mass spectrometry 396 397 157 (MS)/MS (RP-LC-MS/MS) using an Easy-nLC II system coupled to an ion trap LCQ Fleet 398 399 400 158 mass spectrometer (Thermo Scientific). The peptides were concentrated (on-line) by 401 402 159 reverse phase chromatography using a 0.1x20 mm C18 RP pre-column (Thermo Scientific), 403 404 160 and then separated using a 0.075 x 100 mm C18 RP column (Thermo Scientific) operating 405 406 161 at 0.3 ml/min. Peptides were eluted using a 180-min gradient from 5 to 35% solvent B in 407 408 162 solvent A (solvent A: 0.1% formic acid in water, solvent B: 0.1% formic acid in 409 410 163 acetonitrile). Electrospray ionization (ESI) was done using a Fused-silica PicoTip Emitter 411 412 164 ID 10 mm (New Objective, Woburn, MA, USA) interface. Peptides were detected in survey 413 414 165 scans from 400 to 1600 amu (1 mscan), followed by three data dependent MS/MS scans 415 416 417 166 (Top 3), using an isolation width of 2 mass-to-charge ratio units, normalized collision 418 419 167 energy of 35%, and dynamic exclusion applied during 30 sec periods. The MS/MS raw files 420 421 168 were searched against a compiled database containing the Uniprot Ixodidae taxonomy 422 423 169 (134,957 entries in February 2017) together with a database created from the bacterial 424 425 170 genera identified by metatranscriptomics (4,185,346 Uniprot entries in February 2017) 426 427 171 using the SEQUEST algorithm (Proteome Discoverer 1.4, Thermo Scientific). The 428 429 172 following constraints were used for the searches: tryptic cleavage after Arg and Lys, up to 430 431 173 two missed cleavage sites, and tolerances of 1 Da for precursor ions and 0.8 Da for MS/MS 432 433 434 174 fragment ions and the searches were performed allowing optional Met oxidation and Cys 435 436 175 carbamidomethylation. A false discovery rate (FDR) < 0.01 was considered as condition 437 438 176 for successful peptide assignments and at least two peptides per protein were the necessary 439 440 177 condition for protein identification. After discarding Ixodidae assignations, peptides 441 442 178 corresponding to bacterial genera were grouped and the total number of peptide spectrum 443 444 179 matches (PSMs) for each bacterial genera were normalized against the total number of 445 446 8 447 448 449 450 451 180 PSMs (Supplementary file 3 – Dataset 2). The ontology (GO) annotations for 452 453 181 biological process (BP) were done in proteins identified in tick-borne pathogens (TBPs; 454 455 456 182 Anaplasma, Borrelia, Ehrlichia and Rickettsia genera) according to Uniprot 457 458 183 (http://www.uniprot.org) (Supplementary file 3 – Dataset 2). 459 460 184 2.5. Phylogenetic and taxonomic abundance analyses. 461 462 185 The metatrasncriptomics and metaproteomics bacteria relative abundance and taxonomic 463 464 186 analyses were done using a Heatmap Tool associated with a pruned phylogenetic tree 465 466 187 generated with the platform phyloT (http://phylot.biobyte.de) based on NCBI taxonomy, 467 468 188 and visualized using the Interactive Tree of Life software v3.4.3 (http://itol.embl.de) 469 470 189 (Letunic and Bork, 2011). A correlation analysis was conducted in Microsoft Excel 471 472 473 190 (version 12.0) between bacterial genera identification by metatranscriptomics (IDs) and 474 475 191 metaproteomics (PSMs). The Pearson's correlation coefficient was calculated using the 476 477 192 Pearson's Correlation Coefficient Calculator 478 479 193 (http://www.socscistatistics.com/tests/pearson/Default2.aspx) (r = 0.5). 480 481 194 2.6. Validation of tick bacterial microbiota identifications by real-time PCR. 482 483 195 DNA from the same unfed larvae and female salivary glands samples were used for 484 485 196 validation of tick bacterial microbiota identification by real-time PCR. Specific 486 487 197 oliginucleotide primers for the bacteria Rickettsia spp., Ehrlichia spp., Spotted Fever Group 488 489 490 198 (SFG) Rickettsia, Anaplasma spp., Anaplasma phagocytophilum, Borrelia spp., Wolbachia 491 492 199 spp., Candidatus Midichloria mitochondrii and Pseudomonas putida were used (Table 1). 493 494 200 The iScript One-Step was used to perform the real-time PCR with SYBR Green and the 495 496 201 iQ5 thermal cycler (Bio-Rad, Hercules, CA, USA) following manufacturer's 497 498 202 recommendations. A dissociation curve was run at the end of the reaction to ensure that 499 500 203 only one amplicon was formed and that the amplicons denatured consistently in the same 501 502 9 503 504 505 506 507 204 temperature range for every sample (Ririe et al., 1997). DNA levels were normalized 508 509 205 against tick 16S rRNA and Ixodes rps4 following the conditions previously reported 510 511 512 206 by Zivkovic et al. (2009) and Koči et al. (2013). Normalization was performed using the 513 514 207 genNorm method (ddCT method as implemented by Bio-Rad iQ5 Standard Edition, 515 516 208 Version 2.0) (Livak and Schmittgen, 2001). 517 518 209 3. Results and discussion 519 520 210 3.1. Metatranscriptomics bacteria identification in I. ricinus microbiota. 521 522 523 211 The metatranscriptomics analysis identified a total of 450 reads that matched with specific 524 525 212 bacterial genomes distributed among 8 phyla and 38 genera, including uncultured bacteria 526 527 213 (Table 2, Fig. 2 and Supplementary file 2 – Dataset 1). The most represented phyla 528 529 214 identified were Proteobacteria with 21 genera, followed by Actinobacteria and Firmicutes 530 531 215 represented by 7 and 4 genera, respectively (Table 2). Other phyla such as Tenericutes, 532 533 216 Spirochaetes, Fusobacteria and Bacteroidetes were also identified but with lower diversity 534 535 217 (Table 2 and Figure 2). As expected, most of the bacteria identified by metatranscriptomics 536 537 218 have been previously described as apart of the microbiota in different tick species (Table 538 539 540 219 3). However, other genera such as Actinomyces, Amycolatopsis, Bidifidobacterium, 541 542 220 Giliamella, Kurthia, Mesorhizobium and Variovorax have not been previously reported in 543 544 221 ticks (Table 3). Identified bacteria included tick endosymbionts such as Candidatus 545 546 222 Midichloria, Wolbachia, Francisella, Spiroplasma and Rickettsiella, commensals such as 547 548 223 Escherichia, Staphylococcus and Streptococcus, soil and environmental microorganisms 549 550 224 such as Acinetobacter, Arthrobacter, Bradyrhizobium, Sphingomonas and Pseudomonas, 551 552 225 human pathogens such as Brucella and Enterococcus, and TBPs such as Anaplasma, 553 554 226 Ehrlichia, Rickettsia, Neorickettsia and Borrelia (Table 2, Fig. 2 and Table 3). Among the 555 556 557 558 10 559 560 561 562 563 227 newly identified bacteria in tick microbiota, commensal (Actinomyces, Bidifidobacterium, 564 565 228 Giliamella) and soil (Amycolatopsis, Kurthia, Mesorhizobium and Variovorax) 566 567 568 229 microorganisms were present (Table 2, Fig. 2 and Table 3). 569 570 230 3.2. Integration of metatranscriptomics and metaproteomics approaches. 571 572 231 The metatranscriptomics and metaproteomics results were integrated to provide a 573 574 232 metaomics approach to bacteria identifications in the I. ricinus microbiota. A total of 575 576 233 10,845 PSMs were assigned to different bacterial genera present in the metatranscriptomics 577 578 234 database (Supplementary file 3 – Dataset 2). Metaproteomics not only provided support to 579 580 235 metatranscriptomics results by identifying 87% of the identified bacterial genera (Table 2), 581 582 236 but also increased bacteria identification for different genera (Fig. 3A, Supplementary file 2 583 584 585 237 – Dataset 1 and Supplementary file 3 – Dataset 2). However, metaproteomics may results in 586 587 238 some peptide assignments that could match to several related species, which requires 588 589 239 further analyses with sequences of peptides used for protein identity for better 590 591 240 definition at the species level (Tanca et al., 2013; 2014; Fernández de Mera et al., 2017). 592 593 241 Therefore, the integration of metaproteomics with metatranscriptomics provides a better 594 595 242 resolution at the species level. For example, although several Rickettsiella spp. were 596 597 243 identified at metatranscriptomics and metaproteomics levels, only a Rickettsiella 598 599 244 endosymbiont of Ixodes spp. was identified by both analyses (Supplementary file 2 – 600 601 602 245 Dataset 1 and Supplementary file 3 – Dataset 2). These bacteria are closely related to 603 604 246 pathogenic Rickettsiella spp. (Cordaux et al., 2007), and the metaomics approach provided 605 606 247 a better support for the presence of Rickettsiella endosymbionts in the I. ricinus microbiota. 607 608 248 Differences in bacterial identification by metranscriptomics and metaproteomics may be 609 610 249 also due to differences in RNA and protein levels (Fig. 2 and Table 2), which could be 611 612 250 affected by post-transcriptional and post-translational modifications (Fan et al., 2013; 613 614 11 615 616 617 618 619 251 Ayllón et al., 2015; Villar et al., 2015). This suggestion was supported by the multiple 620 621 252 bacterial proteins that were identified by metaproteomics when compared to results of 622 623 624 253 metatranscriptomics analysis (Supplementary file 2 – Dataset 1 and Supplementary file 3 – 625 626 254 Dataset 2). Additional support to integrated metatranscriptomics and metaproteomics 627 628 255 results was provided at the DNA level by PCR (Fig. 3B). 629 630 256 3.3. Putative functional implications of integrated metaomics results. 631 632 257 This study is a “proof-of-concept” for the metaomics approach to tick microbiome 633 634 258 characterization. Integrated metatranscriptomics and metaproteomics results were 635 636 259 functionally more relevant than those obtained by metranscriptomics alone, suggesting that 637 638 260 identified bacteria might form part of the active microbial community in unfed I. ricinus. 639 640 641 261 Differences in bacterial microbiota composition have been attributed to variations between 642 643 262 tick species, collection sites, sex and developmental stages, feeding status, and pathogen 644 645 263 infection (Williams-Newkirk et al., 2014; van Treuren et al., 2015; Bonnet et al., 2017; 646 647 264 Abraham et al., 2017; Xiang et al., 2017). A correlation analysis between 648 649 265 metatranscriptomics and metaproteomics results revealed the absence of correlation for the 650 651 266 entire bacterial microbiome, and for certain phyla such as Proteobacteria and 652 653 267 Actinobacteria (Fig. 3C). However, for Firmicutes and TBPs a positive correlation was 654 655 268 obtained between normalized RNA IDs and protein PSMs (Fig. 3C). Most of the 656 657 658 269 metatranscriptomics data corresponded to rRNA, which is the predominant material in the 659 660 270 ribosome and essential for protein synthesis (Cole et al., 2003). Therefore, a positive 661 662 271 correlation between rRNA and protein levels may reflect that these bacteria were 663 664 272 metabolically active in unfed I. ricinus. These commensal and environmental bacteria 665 666 273 (Firmicutes) and TBPs may interact to affect multiple processes in the tick such as tick 667 668 274 fitness, nutrition, development, reproduction, defense against environmental stress, 669 670 12 671 672 673 674 675 275 immunity and vector competence (Narasimhan and Fikrig, 2015; Bonnet et al., 2017; de la 676 677 276 Fuente et al., 2017; Abraham et al., 2017). 678 679 680 277 The GO analysis for BP was conducted on proteins identified in TBPs (Anaplasma, 681 682 278 Borrelia, Ehrlichia and Rickettsia genera) (Supplementary file 3 – Dataset 2) that showed a 683 684 279 positive correlation between metatranscriptomics and metaproteomics results (Fig. 3C). 685 686 280 The results showed that excluding unknown proteins, energy metabolism was the most 687 688 281 abundant BP in all bacterial genera (Fig. 4). Another BP represented in all bacteria was 689 690 282 protein synthesis (Fig. 4). Finally, BPs represented in some but not all bacterial genera 691 692 283 included regulation of tick host gene expression, bacteria-tick interactions, DNA replication 693 694 284 and transcription, DNA repair, nucleoside metabolism, cell wall biosynthetic process, cell 695 696 697 285 division, and motility (Fig. 4). These results further supported that some of the identified 698 699 286 bacteria may be metabolically active, and involved in tick-bacteria interactions. 700 701 287 The I. ricinus ticks used in this study were obtained from an uninfected reference 702 703 288 laboratory colony. Then, why TBPs such as A. phagocytophilum and Borrelia spp. were 704 705 289 identified by integrated metatranscriptomics and metaproteomics analysis? Two possible 706 707 290 responses to this question are (a) that the colony may be infected with transovarially 708 709 291 transmitted TBPs or (b) that although these bacteria were identified as TBPs, they may 710 711 292 represent non-pathogenic genetic variants of these pathogens. In support to the last 712 713 714 293 suggestion, it has been shown the presence of non-pathogenic species and/or variants in 715 716 294 both A. phagocytophilum and Borrelia spp. (Anderson et al., 1990; Massung et al., 2002; 717 718 295 2003; Portillo et al., 2005; Al-Khedery and Barbet, 2014; Stokes et al., 2016). Nevertheless, 719 720 296 their role as commensals or pathogenic potential is unknown. 721 722 297 Recently, Abraham et al. (2017) demonstrated that A. phagocytophilum manipulates tick 723 724 298 microbiota through induction of I. scapularis antifreeze glycoprotein (IAFGP) that results 725 726 13 727 728 729 730 731 299 in alteration of bacterial biofilm formation to facilitate infection. Additionally, they showed 732 733 300 that A. phagocytophilum alters the composition of the tick microbiota after midgut infection 734 735 736 301 (Abraham et al., 2017). Based on our results and using commensal (Streptococcus) and 737 738 302 Anaplasma bacteria that were identified in the I. ricinus microbiota as putatively 739 740 303 metabolically active (Firmicutes and TBPs with a positive correlation between normalized 741 742 304 RNA IDs and protein PSMs; Fig. 3C), we hypothesized that Anaplasma infection may be 743 744 305 also facilitated by interfering with biofilm formation through reduction of the levels of 745 746 306 biofilm matrix binding proteins (MBPs) and/or the presence of bacteria producing MBPs. 747 748 307 Biofilm MBPs play structural roles in the biofilm formation and are produced by several 749 750 308 commensal bacteria (Fong and Yildis, 2015), but in this study were identified only in 751 752 753 309 Streptococcus (A0A0M4K0V8, A0A0M4JKY5, and E1M3U5; normalized PSMs = 0.06, 754 755 310 Supplementary file 3 – Dataset 2). However, this hypothesis needs to be addressed by 756 757 311 studies in ticks uninfected and experimentally infected with pathogens such as A. 758 759 312 phagocytophilum. 760 761 313 4. Conclusions 762 763 764 314 The innovative metaomics approach used in this study resulted in the characterization of 765 766 315 the bacterial microbiota in unfed I. ricinus. The approach proposed in this study for the 767 768 316 characterization of tick microbiota was based on reused RNA-seq data. Although this is a 769 770 317 limitation when compared to metagenomics approaches, it provides the opportunity to reuse 771 772 318 already existing data for microboiota characterization. The reuse of RNA-seq data not 773 774 319 targeted at bacterial sequences probably affects the identification of some bacterial taxa. 775 776 320 However, the integration of metatrancriptomics and metaproteomics approaches provided a 777 778 321 better characterization of tick bacterial microbiome when compared to RNA-seq alone by 779 780 781 782 14 783 784 785 786 787 322 increasing bacteria identification and support for identified bacteria with putative functional 788 789 323 implications. The results corroborated previous reports on the presence of symbiotic, 790 791 792 324 commensal, soil, environmental, and pathogenic bacteria in the I. ricinus microbiota. 793 794 325 Additionally, previously unrecognized commensal and soil microorganisms were identified 795 796 326 in unfed I. ricinus. The results of the metaomics approach suggested new mechanisms by 797 798 327 which pathogen infection affects biofilm formation to manipulate tick microbiota and 799 800 328 facilitate infection. Metaomics approaches integrating different omics datasets from 801 802 329 metagenomics, metatranscriptomics and metaproteomics studies would provide a better 803 804 330 description of tick microbiota composition, and insights into functional implication of tick 805 806 331 interactions with microbiota, pathogens and hosts. 807 808 809 332 810 811 333 Conflict of interest statement 812 813 334 The authors declare that there are no conflicts of interest. 814 815 335 816 817 818 336 Acknowledgments 819 820 337 We thank Raquel Tobes and Marina Manrique (Oh no sequences! Research group, Era7 821 822 338 Bioinformatics, Granada, Spain) for technical assistance with the metatranscriptomics 823 824 825 339 analysis. This work was financially supported by the H2020 COllaborative Management 826 827 340 Platform for detection and Analyses of (Re-) emerging and foodborne outbreaks in Europe 828 829 341 (COMPARE) Grant 643476. MV was supported by the Research Plan of the University of 830 831 342 Castilla- La Mancha (UCLM), Spain. 832 833 343 834 835 836 344 Supplementary information 837 838 15 839 840 841 842 843 345 Supplementary file 1 – Table 1. Bacterial sequence database constructed with genome 844 845 346 and/or species-specific rRNA sequences. 846 847 848 347 Supplementary file 2 - Dataset 1. Metatranscriptomics database of tick bacterial 849 850 348 microbiota. 851 852 349 Supplementary file 3 - Dataset 2. Metaproteomics database of tick bacterial microbiota. 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 16 895 896 897 898 899 900 351 References 901 902 352 Abraham, N.M., Liu, L., Jutras, B.L., Yadav, A.K., Narasimhan, S., Gopalakrishnan, V., 903 904 353 Ansari, J.M., Jefferson, K.K., Cava, F., Jacobs-Wagner, C., Fikrig, E., 2017. Pathogen- 905 906 354 mediated manipulation of arthropod microbiota to promote infection. Proc. Natl. Acad. Sci. 907 908 355 U. S. A. 114, E781-E790. 909 910 911 356 Aguiar-Pulido, V., Huang, W., Suarez-Ulloa, V., Cickovski, T., Mathee, K., Narasimhan, 912 913 357 G., 2016. Metagenomics, metatranscriptomics, and metabolomics approaches for 914 915 358 microbiome analysis. Evol. Bioinform. Online. 12(Suppl 1), 5-16. 916 917 918 359 Al-Khedery, B., Barbet, A.F., 2014. Comparative genomics identifies a potential marker of 919 920 360 human-virulent Anaplasma phagocytophilum. Pathogens. 3, 25-35. 921 922 361 Anderson, J.F., Barthold, S.W., Magnarelli, L.A., 1990. Infectious but nonpathogenic 923 924 362 isolate of . J. Clin. Microbiol. 28, 2693-2699. 925 926 927 363 Andreotti, R., Perez de Leon, A.A., Dowd, S.E., Guerrero, F.D., Bendele, K.G., Scoles, 928 929 364 G.A., 2011. Assessment of bacterial diversity in the cattle tick Rhipicephalus (Boophilus) 930 931 365 microplus through tag-encoded pyrosequencing. BMC Microbiol. 11, 6. 932 933 934 366 Ayllón, N., Villar, V., Galindo, R.C., Kocan, K.M., Šíma, R., López, J.A., Vázquez, J., 935 936 367 Alberdi, P., Cabezas-Cruz, A., Kopáček, P., de la Fuente, J., 2015. Systems biology of 937 938 368 tissue-specific response to Anaplasma phagocytophilum reveals differentiated apoptosis in 939 940 369 the tick vector Ixodes scapularis. PLoS Genet. 11, e1005120. 941 942 370 Benson, M.J., Gawronski, J.D., Eveleigh, D.E., Benson, D.R., 2004. Intracellular symbionts 943 944 371 and other bacteria associated with deer ticks (Ixodes scapularis) from Nantucket and 945 946 947 372 Wellfleet, Cape Cod, Massachusetts. Appl. Environ. Microbiol. 70, 616-620. 948 949 950 17 951 952 953 954 955 373 Bonnet, S., Michelet, L., Moutailler, S., Cheval, J., Hébert, C., Vayssier-Taussat, M., Eloit, 956 957 374 M., 2014. Identification of parasitic communities within European ticks using next- 958 959 960 375 generation sequencing. PLoS Negl. Trop. Dis. 8, e2753. 961 962 376 Bonnet, S., Binetruy, F., Hernández-Jarguín, A., Duron, O., 2017. The tick microbiome: 963 964 377 Why non-pathogenic microorganisms matter in tick biology and pathogen transmission. 965 966 378 Front. Cell. Infect. Microbiol. 7, 236. 967 968 969 379 Bouchez, T., Blieux, A.L., Dequiedt, S., Domaizon, I., Dufresne, A., Ferreira, S., Godon, 970 971 380 J.J., Hellal, J., Joulian, C., Quaiser, A., Martin-Laurent, F., Mauffret, A., Monier, J. M., 972 973 381 Peyret, P., Schmitt-Koplin, P., Sibourg, O., D’oiron, E., Bispo, A., Deportes, I., Grand, C., 974 975 382 Cuny, P., Maron, P.A., Ranjard, L., 2016. Molecular microbiology methods for 976 977 383 environmental diagnosis. Environ. Chem. Lett. 14, 423-441. 978 979 384 Budachetri, K., Browning, R. E., Adamson, S. W., Dowd, S. E., Chao, C.C., Ching, W.M., 980 981 385 Karim, S., 2014. An insight into the microbiome of the Amblyomma maculatum (Acari: 982 983 386 Ixodidae). J. Med. Entomol. 51, 119-129. 984 985 986 387 Cafiso, A., Bazzocchi, C., De Marco, L., Opara, M.N., Sassera, D., Plantard, O., 2016. 987 988 388 Molecular screening for Midichloria in hard and soft ticks reveals variable prevalence 989 990 389 levels and bacterial loads in different tick species. Ticks Tick Borne Dis. 6, 1186-1192. 991 992 993 390 Carpi, G., Cagnacci, F., Wittekindt, N. E., Zhao, F., Qi, J., Tomsho, L. P., Drautz, D.I., 994 995 391 Rizzoli, A., Schuster, S.C., 2011. Metagenomic profile of the bacterial communities 996 997 392 associated with Ixodes ricinus ticks. PLoS ONE. 6, e25604. 998 999 1000 393 Cheng, K., Ning, Z., Zhang, X., Li, L., Liao, B., Mayne, J., Stintzi, A., Figeys, D., 2017. 1001 1002 394 MetaLab: an automated pipeline for metaproteomic data analysis. Microbiome. 5, 157. 1003 1004 1005 1006 18 1007 1008 1009 1010 1011 395 Clay, K., Klyachko, O., Grindle, N., Civitello, D., Oleske, D., Fuqua, C., 2008. Microbial 1012 1013 396 communities and interactions in the lone star tick, Amblyomma americanum. Mol. Ecol. 17, 1014 1015 1016 397 4371-4381. 1017 1018 398 Cole, J.R., Chai, B., Marsh, T.L., Farris, R.J., Wang, Q., Kulam, S.A., Chandra, S., 1019 1020 399 McGarrell, D.M., Schmidt, T.M., Garrity, G.M., Tiedje, J.M., Ribosomal Database Project, 1021 1022 400 2003. The Ribosomal Database Project (RDP-II): previewing a new autoaligner that allows 1023 1024 1025 401 regular updates and the new prokaryotic taxonomy. Nucleic Acids Res. 31, 442-443. 1026 1027 402 Cordaux, R., Paces-Fessy, M., Raimond, M., Michel-Salzat, A., Zimmer, M., Bouchon, D., 1028 1029 403 2007. Molecular characterization and evolution of arthropod- 1030 1031 404 pathogenic Rickettsiella bacteria. Appl. Environ. Microbiol. 73, 5045-5047. 1032 1033 1034 405 Courtney, J.W., Kostelnik, L.M., Zeidner, N.S., Massung, R.F., 2004. Multiplex real-time 1035 1036 406 PCR for detection of Anaplasma phagocytophilum and Borrelia burgdorferi. J. Clin. 1037 1038 407 Microbiol. 42, 3164-3168. 1039 1040 1041 408 de la Fuente, J., Estrada-Peña, A., Venzal, J.M., Kocan, K.M., and Sonenshine, D.E., 2008. 1042 1043 409 Overview: Ticks as vectors of pathogens that cause disease in humans and animals. Front. 1044 1045 410 Biosci. 13, 6938-6946. 1046 1047 1048 411 de la Fuente, J., Antunes, S., Bonnet, S., Cabezas-Cruz, A., Domingos, A.G., Estrada-Peña, 1049 1050 412 A., Johnson, N., Kocan, K.M., Mansfield, K.L., Nijhof, A.M., Papa, A., Rudenko, N., 1051 1052 413 Villar, M., Alberdi, P., Torina, A., Ayllón, N., Vancova, M., Golovchenko, M., Grubhoffer, 1053 1054 414 L., Caracappa, S., Fooks, A.R., Gortazar, C., Rego, R.O.M., 2017. Tick-pathogen 1055 1056 415 interactions and vector competence: identification of molecular drivers for tick-borne 1057 1058 416 diseases. Front. Cell. Infect. Microbiol. 7, 114. 1059 1060 1061 1062 19 1063 1064 1065 1066 1067 417 Evans, V.C., Barker, G., Heesom, K.J., Fan, J., Bessant, C., Matthews, D.A., 2012. De 1068 1069 418 novo derivation of proteomes from transcriptomes for transcript and protein identification. 1070 1071 1072 419 Nat. Methods. 9, 1207-1211. 1073 1074 420 Fan, Y., Thompson, J. W., Dubois, L. G., Moseley, M. A., Wernegreen, J.J., 2013. 1075 1076 421 Proteomic analysis of an unculturable bacterial endosymbiont (Blochmannia) reveals high 1077 1078 422 abundance of chaperonins and biosynthetic enzymes. J. Proteome Res. 12, 704-718. 1079 1080 1081 423 Fernández de Mera, I.G., Ruiz-Fons, F., Mangold, K.J., Gortázar, C., de la Fuente, J., 1082 1083 424 2013. Spotted Fever Group rickettsiae in questing ticks, Central Spain. Emerg. Infect. Dis. 1084 1085 425 19, 1163-1165 1086 1087 1088 426 Fernández de Mera, I.G., Chaligiannis, I., Hernández-Jarguín, A., Villar, M., Mateos- 1089 1090 427 Hernández, L., Papa, A., Sotiraki, S., Ruiz-Fons, F., Cabezas-Cruz, A., Gortázar, C., de la 1091 1092 428 Fuente, J., 2017. Combination of RT-PCR and proteomics for the identification of 1093 1094 429 Crimean-Congo hemorrhagic fever virus in ticks. Heliyon. 3, e00353. 1095 1096 1097 430 Franzosa, E.A., Hsu, T., Sirota-Madi, A., Shafquat, A., Abu-Ali, G., Morgan, X.C., 1098 1099 431 Huttenhower, C., 2015. Sequencing and beyond: integrating molecular 'omics' for microbial 1100 1101 432 community profiling. Nat. Rev. Microbiol. 13, 360-372. 1102 1103 433 Frith, M. C., Hamada, M., Horton, P., 2010a. Parameters for accurate genome alignment. 1104 1105 434 BMC Bioinformatics. 11, 80. 1106 1107 1108 435 Frith, M. C., Wan, R., Horton, P., 2010b. Incorporating sequence quality data into 1109 1110 436 alignment improves DNA read mapping. Nucleic Acids Res. 38, e100. 1111 1112 1113 1114 1115 1116 1117 1118 20 1119 1120 1121 1122 1123 437 Gall, C. A., Reif, K. E., Scoles, G. A., Mason, K. L., Mousel, M., Noh, S. M., Brayton, K. 1124 1125 438 A., 2016. The bacterial microbiome of Dermacentor andersoni ticks influences pathogen 1126 1127 1128 439 susceptibility. ISME J. 10, 1846-1855. 1129 1130 440 Genomic Resources Development Consortium, Contreras, M., de la Fuente, J., Estrada- 1131 1132 441 Peña, A., Grubhoffer, L., Tobes, R., 2014. Transcriptome sequence divergence between 1133 1134 442 Lyme disease tick vectors, Ixodes scapularis and Ixodes ricinus. Genomic Resources Notes 1135 1136 1137 443 accepted 1 April 2014 – 31 May 2014. Mol. Ecol. Resour. 14, 1095. 1138 1139 444 Greay, T.L., Gofton, A.W., Paparini, A., Ryan, U.M., Oskam, C.L., Irwin, P.J., 2018. 1140 1141 445 Recent insights into the tick microbiome gained through next-generation sequencing. 1142 1143 446 Parasit. Vectors. 11, 12. 1144 1145 1146 447 Heintz-Buschart, A., Wilmes, P., 2017. Human gut microbiome: function matters. Trends 1147 1148 448 Microbiol. In press. 1149 1150 1151 449 Johansson, H., Dhaygude, K., Lindström, S., Helanterä, H., Sundström, L., Trontti, K., 1152 1153 450 2013. A Metatranscriptomic approach to the identification of microbiota associated with the 1154 1155 451 ant (Formica exsecta). PLoS ONE. 8: e79777. 1156 1157 1158 452 Kau, A.L., Ahern, P.P., Griffin, N.W., Goodman, A.L., Gordon, J.I., 2011. Human 1159 1160 453 nutrition, thegut microbiome and the immune system. Nature. 474, 327-336. 1161 1162 454 Kiełbasa, S. M., Wan, R., Sato, K., Horton, P., Frith, M. C., 2011. Adaptive seeds tame 1163 1164 455 genomic sequence comparison. Genome Res. 21, 487-493. 1165 1166 456 Koči, J., Šimo, L., and Park, Y., 2013. Validation of internal reference genes for real-time 1167 1168 1169 457 quantitative polymerase chain reaction studies in the tick, Ixodes scapularis (Acari: 1170 1171 458 Ixodidae). J. Med. Entomol. 50, 79-84. 1172 1173 1174 21 1175 1176 1177 1178 1179 459 Kondo, N., Ijichi, N., Shimada, M., Fukatsu, T., 2002. Prevailing triple infection with 1180 1181 460 Wolbachia in Callosobruchus chinensis (Coleoptera: Bruchidae). Mol. Ecol. 11, 167-180. 1182 1183 1184 461 Kurilshikov, A., Livanova, N. N., Fomenko, N. V., Tupikin, A. E., Rar, V. A., Kabilov, M. 1185 1186 462 R., Tikunova, N. V., 2015. Comparative metagenomic profiling of symbiotic bacterial 1187 1188 463 communities associated with Ixodes persulcatus, Ixodes pavlovskyi and Dermacentor 1189 1190 464 reticulatus ticks. PloS ONE. 10, e0131413. 1191 1192 1193 465 Letunic, I., Bork, P., 2011. Interactive Tree Of Life v2: online annotation and display of 1194 1195 466 phylogenetic trees made easy. Nucleic Acids Res. 39, 475-478. 1196 1197 1198 467 Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, 1199 1200 468 G., Durbin, R., 1000 Genome Project Data Processing Subgroup, 2009. The sequence 1201 1202 469 alignment/map format and SAMtools. Bioinformatics. 25, 2078-2079. 1203 1204 1205 470 Li, H., 2011. A statistical framework for SNP calling, mutation discovery, association 1206 1207 471 mapping and population genetical parameter estimation from sequencing data. 1208 1209 472 Bioinformatics. 27, 2987-2993. 1210 1211 473 Liu, L., Li, L., Liu, J., Yu, Z., Yang, X., Liu, J., 2016. Population dynamics of multiple 1212 1213 1214 474 symbionts in the hard tick, Dermacentor silvarum Olenev (Acari: Ixodidae). Ticks Tick 1215 1216 475 Borne Dis. 7, 188-192. 1217 1218 476 Livak, K.J., Schmittgen, T.D., 2001. Analysis of relative gene expression data using real 1219 1220 477 time quantitative PCR and the 2-ΔΔ CT method. Methods. 25, 402-408. 1221 1222 1223 478 Luo, J., Liu, M.X., Ren, Q.Y., Chen, Z., Tian, Z.C., Hao, J.W., Wu, F., Liu, X.C., Luo, 1224 1225 479 J.X., Yin, H., Wang, H., Liu, G.Y., 2017. Micropathogen community analysis 1226 1227 1228 1229 1230 22 1231 1232 1233 1234 1235 480 in Hyalomma rufipes via high-throughput sequencing of small RNAs. Front. Cell. Infect. 1236 1237 481 Microbiol. 7, 374. 1238 1239 1240 482 Mäder, U., Nicolas, P., Richard, H., Berriéres, P., Aymerich, S., 2011. Comprehensive 1241 1242 483 identification and quantification of microbial transcriptomes by genome-wide unbiased 1243 1244 484 methods. Curr. Opin. Biotechnol. 22, 32-41. 1245 1246 485 Martin, A.R., Brown, G.K., Dunstan, R.H., Roberts, T.K., 2005. Anaplasma platys: an 1247 1248 1249 486 improved PCR for its detection in dogs. Exp. Parasitol. 109, 176-180. 1250 1251 487 Massung, R.F., Mauel, M.J., Owens, J.H., Allan, N., Courtney, J.W., Stafford, K.C. 3rd., 1252 1253 488 Mather, T.N., 2002. Genetic variants of Ehrlichia phagocytophila, Rhode Island and 1254 1255 489 Connecticut. Emerg. Infect. Dis. 8, 467-472. 1256 1257 1258 490 Massung, R.F., Priestley, R.A., Miller, N.J., Mather, T.N., Levin, M.L., 2003. Inability of a 1259 1260 491 variant strain of Anaplasma phagocytophilum to infect mice. J. Infect. Dis. 188, 1757-1763. 1261 1262 1263 492 Moreno, C.X., Moy, F., Daniels, T.J., Godfrey, H.P., Cabello, F.C., 2006. Molecular 1264 1265 493 analysis of microbial communities identified in different developmental stages of Ixodes 1266 1267 494 scapularis ticks from Westchester and Dutchess Counties, New York. Environ. Microbiol. 1268 1269 495 8, 761-72. 1270 1271 1272 496 Nakao, R., Abe, T., Nijhof, A. M., Yamamoto, S., Jongejan, F., Ikemura, T., Sugimoto, C., 1273 1274 497 2013. A novel approach, based on BLSOMs (Batch Learning Self- Organizing Maps) to the 1275 1276 498 microbiome analysis of ticks. ISME J. 7, 1003-1015. 1277 1278 499 Narasimhan, S., Fikrig, E., 2015. Tick microbiome: the force within. Trends Parasitol. 31, 1279 1280 1281 500 315-323. 1282 1283 1284 1285 1286 23 1287 1288 1289 1290 1291 501 Narasimhan, S., Rajeevan, N., Liu, L., Zhao, Y.O., Heisig, J., Pan, J., Eppler-Epstein, R., 1292 1293 502 Deponte, K., Fish, D., Fikrig, E., 2014. of the tick vector Ixodes scapularis 1294 1295 1296 503 modulate colonization of the Lyme disease spirochete. Cell Host Microbe. 15, 58-71. 1297 1298 504 Narasimhan, S., Schuijt, T.J., Abraham, N.M., Rajeevan, N., Coumou, J., Graham, M., 1299 1300 505 Robson, A., Wu, M.J., Daffre, S., Hovius, J.W., Fikrig, E., 2017. Modulation of the tick gut 1301 1302 506 milieu by a secreted tick protein favors Borrelia burgdorferi colonization. Nat. Commun. 8, 1303 1304 507 184. 1305 1306 508 Noda, A.A., Rodríguez, I., Mondeja, B., Fernándeze, C., 2013. Design, optimization and 1307 1308 509 evaluation of a polymerase chain reaction for detection of Borrelia spp. Adv. Clin. Exp. 1309 1310 510 Med. 5, 639-653. 1311 1312 1313 511 Oteo, J.A., Portillo, A., Santibáñez, S., Blanco, J.R., Pérez-Martínez, L., Ibarra, V., 2006. 1314 1315 512 Cluster of cases of human Rickettsia felis infection from Southern Europe (Spain) 1316 1317 513 diagnosed by PCR. J. Clin. Microbiol. 44, 2669-2671. 1318 1319 1320 514 Philippot, L., Raaijmakers, J.M., Lemanceau, P., van der Putten, W.H., 2013. Going back 1321 1322 515 to the roots: the of the . Nat. Rev. Microbiol. 11, 789-799. 1323 1324 516 Ponnusamy, L., González, A., Van Treuren, W., Weiss, S., Parobek, C., Juliano, J.J, 1325 1326 517 Knight, R., Roe, R.M., Apperson, C.S., Meshnick, S.R., 2014. Diversity of Rickettsiales in 1327 1328 518 the microbiome of the lone star tick, Amblyomma americanum. Appl. Env. Microbiol. 80, 1329 1330 1331 519 354-359. 1332 1333 520 Portillo, A., Santos, A.S., Santibáñez, S., Pérez-Martínez, L., Blanco, J.R., Ibarra, V., Oteo, 1334 1335 521 J.A., 2005. Detection of a non-pathogenic variant of Anaplasma phagocytophilum in Ixodes 1336 1337 522 ricinus from La Rioja, Spain. Ann. N. Y. Acad. Sci. 1063, 333-336. 1338 1339 1340 1341 1342 24 1343 1344 1345 1346 1347 523 Qiu, Y., Nakao, R., Ohnuma, A., Kawamori, F., Sugimoto, C., 2014. Microbial population 1348 1349 524 analysis of the salivary glands of ticks: a possible strategy for the surveillance of bacterial 1350 1351 1352 525 pathogens. PLoS ONE. 9, e103961. 1353 1354 526 Razzauti, M., Galan, M., Bernard, M., Maman, S., Klopp, C., Charbonnel, N., Vayssier- 1355 1356 527 Taussat, M., Eloit, M., Cosson, J.F., 2015. A comparison between transcriptome 1357 1358 528 sequencing and 16S metagenomics for detection of bacterial pathogens in wildlife. PLoS 1359 1360 1361 529 Negl. Trop. Dis. 9, e0003929. 1362 1363 530 Ririe, K.M., Rasmussen, R.P., Wittwer, C.T., 1997. Product differentiation by analysis of 1364 1365 531 DNA melting curves during the polymerase chain reaction. Anal. Biochem. 245, 154-160. 1366 1367 1368 532 Ruiz-Fons, F., Fernández de Mera, I., G., Acevedo, P., Gortázar, C., de la Fuente, J., 2012. 1369 1370 533 Factors driving the abundance of Ixodes ricinus ticks and the prevalence of zoonotic I. 1371 1372 534 ricinus-borne pathogens in natural foci. Appl. Environ. Microbiol. 8, 2669-2676. 1373 1374 1375 535 Salter, S. J., Cox, M. J., Turek, E. M., Calus, S. T., Cookson, W. O., Moffatt, M.F., Turner, 1376 1377 536 P., Parkhill, J., Loman, N.J., Walker, A.W., 2014. Reagent and laboratory contamination 1378 1379 537 can critically impact sequence-based microbiome analyses. BMC Biol. 12, 87. 1380 1381 1382 538 Schabereiter-Gurtner, C., Lubitz, W., Rolleke, S., 2003. Application of broad-range 16S 1383 1384 539 rRNA PCR amplification and DGGE fingerprinting for detection of tick-infecting bacteria. 1385 1386 540 J. Microbiol. Methods. 52, 251-260. 1387 1388 541 Schevchenko, A., Tomas, H., Havlis, J., Olsen, J.V., Mann, M., 2006. In-gel digestion for 1389 1390 542 mass-spectrometric characterization of proteins and proteomes. Nat. Protoc. 1, 2856-2860. 1391 1392 1393 543 Schwabe, R.F., Jobin, C., 2013. The microbiome and cancer. Nat. Rev. Cancer. 13, 800- 1394 1395 544 812. 1396 1397 1398 25 1399 1400 1401 1402 1403 545 Selvaratnam, S., Schoedel, B. A., McFarland, B. L., Kulpa, C. F., 1995. Application of 1404 1405 546 reverse transcriptase PCR for monitoring expression of the catabolic dmpN gene in a 1406 1407 1408 547 phenol-degrading sequencing batch reactor. Appl. Environ. Microbiol. 61, 3981-3985. 1409 1410 548 Stokes, J.V., Moraru, G.M., McIntosh, C., Kummari, E., Rausch, K., Varela-Stokes, A.S., 1411 1412 549 2016. Differentiated THP-1 cells exposed to pathogenic and nonpathogenic Borrelia 1413 1414 550 species demonstrate minimal differences in production of four inflammatory cytokines. 1415 1416 1417 551 Vector Borne Zoonotic Dis. 16, 691-695. 1418 1419 552 Tanca, A., Palomba, A., Deligios, M., Cubeddu, T., Fraumene, C., Biosa, G., Pagnozzi, D., 1420 1421 553 Addis, M.F., Uzzau, S., 2013. Evaluating the impact of different sequence databases on 1422 1423 554 metaproteome analysis: Insights from a lab-assembled microbial mixture. PLoS One. 8, 1424 1425 1426 555 e82981. 1427 1428 556 Tanca, A., Palomba, A., Pisanu, S., Deligios, M., Fraumene, C., Manghina, V., Pagnozzi, 1429 1430 557 D., Addis, M.F., Uzzau, S., 2014. A straightforward and efficient analytical pipeline for 1431 1432 558 metaproteome characterization. Microbiome. 2, 49. 1433 1434 1435 559 Trout Fryxell, R., DeBruyn, J., 2016. The microbiome of Ehrlichia-infected and uninfected 1436 1437 560 lone star ticks (Amblyomma americanum). PLoS ONE. 11, 1-19. 1438 1439 1440 561 van Overbeek, L., Gassner, F., van der Plas, C. L., Kastelein, P., Nunes-da Rocha, U., 1441 1442 562 Takken, W., 2008. Diversity of Ixodes ricinus tick-associated bacterial communities from 1443 1444 563 different forests. FEMS Microbiol. Ecol. 66, 72-84. 1445 1446 564 Van Treuren, W., Ponnusamy, L., Brinkerhoff, R.J., Gonzalez, A., Parobek, C.M., Juliano, 1447 1448 1449 565 J.J., Andreadis, T.G., Falco, R.C., Ziegler, L.B., Hathaway, N., Keeler, C., Emch, M., 1450 1451 566 Bailey, J.A., Roe, R.M., Apperson, C.S., Knight, R., Meshnick, S.R., 2015. Variation in the 1452 1453 1454 26 1455 1456 1457 1458 1459 567 microbiota of Ixodes ticks with regard to geography, species, and sex. Appl. Environ. 1460 1461 568 Microbiol. 81, 6200–6209. 1462 1463 1464 569 Varela-Stokes, A.S., Park, S.H., Kim, S.A., Ricke, S.C., 2017. Microbial communities 1465 1466 570 in North American ixodid ticks of veterinary and medical importance. Front. Vet. Sci. 4, 1467 1468 571 179. 1469 1470 572 Vayssier-Taussat, M., Moutailler, S., Michelet, L., Devillers, E., Bonnet, S., Cheval, J., 1471 1472 1473 573 Hébert, C., Eloit, M., 2013. Next deneration sequencing uncovers unexpected bacterial 1474 1475 574 pathogens in ticks in Western Europe. PLoS One. 8, e81439. 1476 1477 575 Vayssier-Taussat, M., Kazimirova, M., Hubalek, Z., Hornok, S., Farkas, R., Cosson, J.F., 1478 1479 576 Bonnet, S., Vourch, G., Gasqui, P., Mihalca, A.D., Plantard, O., Silaghi, C., Cutler, S., 1480 1481 1482 577 Rizzoli, A., 2015. Emerging horizons for tick-borne pathogens: from the “one pathogen- 1483 1484 578 one disease” vision to the pathobiome paradigm. Future Microbiol. 10, 2033-2043. 1485 1486 579 Villar, M., Ayllón, N., Alberdi, P., Moreno, A., Moreno, M., Tobes, R., Mateos-Hernández, 1487 1488 580 L., Weisheit, S., Bell-Sakyi, L., de la Fuente, J., 2015. Integrated metabolomics, 1489 1490 581 transcriptomics and proteomics identifies metabolic pathways affected by Anaplasma 1491 1492 1493 582 phagocytophilum infection in tick cells. Mol. Cell. Proteomics. 14, 3154-3172. 1494 1495 583 Williams-Newkirk, A.J., Rowe, L.A., Mixson-Hayden, T.R., Dasch, G.A., 2014. 1496 1497 584 Characterization of the bacterial communities of life stages of free living lone star ticks 1498 1499 585 (Amblyomma americanum). PLoS ONE. 9, e102130. 1500 1501 1502 586 Xiang, L., Pożniak, B., Cheng, T., 2017. Bacteriological analysis of saliva from partially or 1503 1504 587 fully engorged female adult Rhipicephalus microplus by next-generation sequencing. 1505 1506 588 Antonie Van Leeuwenhoek. 110, 105-113. 1507 1508 1509 1510 27 1511 1512 1513 1514 1515 589 Xu, X. L., Cheng, T. Y., Yang, H., Yan, F., 2015. Identification of intestinal bacterial flora 1516 1517 590 in Rhipicephalus microplus ticks by conventional methods and PCR-DGGE analysis. Exp. 1518 1519 1520 591 Appl. Acarol. 66, 257-268. 1521 1522 592 Yoon, S.S., Kim, E.K., Lee, W.J., 2015. Functional genomic and metagenomic approaches 1523 1524 593 to understanding gut microbiota-animal mutualism. Curr. Opin. Microbiol. 24, 38-46. 1525 1526 594 Zhuang, L., Zhang, Z., An, X., Fan, H., Ma, M., Anderson, B.D., Jiang, J., Liu, W., Cao, 1527 1528 595 W., Tong, Y., 2014a. An efficient strategy of screening for pathogens in wild-caught ticks 1529 1530 596 and mosquitoes by reusing small RNA deep sequencing data. PLoS One. 9, e90831. 1531 1532 597 Zhuang, L., Wang, C.Y., Tong, Y.G., Tang, F., Yang, H., Liu, W., Cao, W.C., 2014b. 1533 1534 598 Discovery of Rickettsia species in Dermacentor niveus Neumann ticks by investigating the 1535 1536 1537 599 diversity of bacterial communities. Ticks Tick Borne Dis. 5, 564-568. 1538 1539 600 Zivkovic, Z., Blouin, E.F., Manzano-Roman, R., Almazán, C., Naranjo, V., Massung, R.F., 1540 1541 601 Jongejan, F., Kocan, K.M., de la Fuente, J., 2009. Anaplasma 1542 1543 602 phagocytophilum and Anaplasma marginale elicit different gene expression responses in 1544 1545 1546 603 cultured tick cells. Comp. Funct. Genomics. 2009, 705034. 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 28 1567 1568 1569 1570 1571 1572 605 Figure legends 1573 1574 606 Figure 1. Metaomics experimental design. An integrated metatranscriptomics and 1575 1576 607 metaproteomics approach was developed for the characterization of I. ricinus bacterial 1577 1578 1579 608 microbiota. Reused I. ricinus RNA-seq data was the basis for metatranscriptomics analysis 1580 1581 609 that resulted in the database of identified bacterial genera. This database was then use to 1582 1583 610 generate the Uniprot protein database used in metaproteomics analysis. 1584 1585 611 Figure 2. Phylogenetic and taxonomic abundance analyses. Phylogenetic pruned tree 1586 1587 612 and associated heatmap showing the relative abundance of bacterial genomes (number of 1588 1589 613 count reads or identifications, IDs) and the correspondent peptide assignments (peptide 1590 1591 614 spectrum matches, PSMs) at genus level. The analyses were done using a Heatmap Tool 1592 1593 615 associated with a pruned phylogenetic tree generated with the platform phyloT 1594 1595 1596 616 (http://phylot.biobyte.de) based on NCBI taxonomy, and visualized using the Interactive 1597 1598 617 Tree of Life software v3.4.3 (http://itol.embl.de). 1599 1600 618 Figure 3. Comparative analysis of metranscriptomics and metaproteomics results. (A) 1601 1602 619 Representation of the bacterial phyla with the corresponding number of genera (species) 1603 1604 620 identified at RNA/protein levels. (B) Validation of integrated metatranscriptomics and 1605 1606 621 metaproteomics results by PCR. Bacterial DNA levels were determined by real-time PCR 1607 1608 622 and normalized against tick 16S rRNA and Ixodes rps 4. Oligonucleotide primers and real- 1609 1610 623 time PCR conditions are described in Table 2. DNA levels are shown in arbitrary units as 1611 1612 1613 624 normalized Ct values. Two experiments were conducted with similar results. (C) 1614 1615 625 Correlation analysis was conducted between normalized number of count reads or 1616 1617 626 identifications (IDs; metatranscriptomics) and peptide spectrum matches (PSMs; 1618 1619 627 metaproteomics) results (Table 2) corresponding to all bacterial genera (N=38), 1620 1621 1622 29 1623 1624 1625 1626 1627 628 Proteobacteria (N=21), Actinobacteria (N=7) and Firmicutes (N=4) phyla represented by 4 1628 1629 629 genera or more, and TBPs (genera Anaplasma, Ehrlichia, Rickettsia, and Borrelia). The 1630 1631 2 1632 630 linear correlation coefficients (R ) and Pearson correlation coefficients (*r > +0.5) are 1633 1634 631 shown. For reference, Streptococcus and Anaplasma genera are shown in black and red 1635 1636 632 rhombuses, respectively. 1637 1638 633 Figure 4. Protein annotation in identified tick-borne pathogens. The GO analysis for 1639 1640 634 BP was conducted on proteins identified in TBPs (Anaplasma, Borrelia, Ehrlichia and 1641 1642 635 Rickettsia genera). Quantitative representation of BP abundance (%) was done using the 1643 1644 636 total number of PSMs represented on each BP. 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 30 1679 1680 1681 1682 1683 638 Table 1. Oligonucleotide primers and real-time PCR conditions. 1684 1685 Annealing 1686 1687 Organism Primer sequences (5-′3′) temperature 1688 Gene References 1689 1690 16S GACAAGAAGACCCTA Zivkovic et Ixodidae 42 °C 1691 rRNA ATCCAACATCGAGGT al., 2009 1692 1693 GGTGAAGAAGATTGTCAAGCAGAG Koči et al., Ixodes spp. rps 4 54 °C 1694 TGAAGCCAGCAGGGTAGTTTG 2013 1695 1696 AGAGTTTGATCCTGGCTCAG Fernández de 1697 16S AACGTCATTATCTTCCTTGC 54 °C Mera et al., 1698 Rickettsia spp. 1699 rRNA 2013 1700 ATGGCGAATATTTCTCCAAAA Oteo et al., 1701 1702 SFG Rickettsia ompA AGTGCAGCATTCGCTCCCCCT 52 °C 2006 1703 Anaplasma spp. 16S CAGAGTTTGATCCTGGCTCAGAACG 1704 Ruiz-Fons et 1705 rRNA GAGTTTGCCGGGACTTCTTCTGTA 42 °C 1706 al., 2012 1707 1708 ATGGAAGGTAGTGTTGGTTATGGTATT 1709 Courtney et Anaplasma msp2 TTGGTCTTGAAGCGCTCGTA 60 °C 1710 al., 2004 1711 phagocytophilum 1712 GGTACCYACAGAAGAAGTCC 1713 Martin et al., 1714 Ehrlichia spp. 16S TAGCACTCATCGTTTACAGC 54 °C 2005 1715 rRNA 1716 1717 Pseudomonas ATCACCGACTGGGACAAGTGGGAAGACC Selvaratnam dmpN 50 °C 1718 putida TGGTATTCCAGCGGTGAAACGGCGG et al., 1995 1719 1720 GGGTCCAATAAGTGATGAAGAAAC Kondo et al., Wolbachia spp. wsp 55 °C 1721 TTAAAACGCTACTCCAGCTTCTGC 2002 1722 1723 Candidatus CAAAAGTGAAAGCCTTGGGC 16S Cafiso et al., 1724 Midichloria TGAGACTTAAAYCCCAACATC 58 °C 1725 rRNA 2016 1726 mitochondrii 1727 TAGATGAGTCTGCGTCTTATTA 1728 16S Noda et al., Borrelia spp. CTTACACCAGGAATTCTAACTT 58 °C 1729 rRNA 2013 1730 1731 1732 1733 1734 31 1735 1736 1737 1738 1739 640 Table 2. Relative bacteria IDs and PSMs obtained at genus level. 1740 1741 641 1742 1743 1744 Phylum Genus IDs PSMs 1745 1746 Proteobacteria Acinetobacter 0.22 0.16 1747 1748 Anaplasma 10.67 0.73 1749 1750 Bradyrhizobium 2.44 11.07 1751 1752 Brevibacterium 1.78 0.77 1753 1754 Brucella 0.22 0.38 1755 1756 Candidatus Midichloria 15.55 0.05 1757 1758 1759 Candidatus Neoehrlichia 8.22 0.00 1760 1761 Ehrlichia 12.89 0.63 1762 1763 Escherichia 0.89 4.67 1764 1765 Francisella 0.22 0.00 1766 1767 Gilliamella 0.22 1.23 1768 1769 Klebsiella 0.22 4.24 1770 1771 Lysobacter 0.22 0.00 1772 1773 Mesorhizobium 0.22 1.34 1774 1775 1776 Neorickettsia 0.67 0.06 1777 1778 Pseudomonas 0.67 14.72 1779 1780 Rickettsia 0.44 0.18 1781 1782 Rickettsiella 1.78 0.15 1783 1784 Sphingomonas 0.22 9.55 1785 1786 Variovorax 0.22 3.92 1787 1788 1789 1790 32 1791 1792 1793 1794 1795 Wolbachia 0.67 0.00 1796 1797 1798 Actinobacteria Actinomyces 1.33 2.90 1799 1800 Arthrobacter 0.44 4.44 1801 1802 Amycolatopsis 0.22 0.23 1803 1804 Bifidobacterium 0.44 0.68 1805 1806 Corynebacterium 1.55 5.70 1807 1808 Propionibacterium 0.89 0.31 1809 1810 Rhodococcus 0.22 7.17 1811 1812 Firmicutes Enterococcus 0.44 5.06 1813 1814 1815 Kurthia 0.22 0.15 1816 1817 Staphylococcus 0.44 0.48 1818 1819 Streptococcus 0.89 12.20 1820 1821 Spirochaetes Borrelia 20.67 1.23 1822 1823 Treponema 0.22 0.00 1824 1825 Bacteroidetes Mucilaginibacter 0.22 1.72 1826 1827 Fusobacteria Leptotrichia 0.22 0.27 1828 1829 1830 Tenericutes Spiroplasma 10.67 1.03 1831 1832 Uncultured bacteria Uncultured bacteria 2.22 2.63 1833 1834 642 1835 1836 643 Normalized data is shown as the number of count reads or identifications/ peptide spectrum 1837 1838 644 matches (IDs/PSMs) divided by the total number of IDs/PSMs x 100. 1839 1840 1841 1842 1843 1844 1845 1846 33 1847 1848 1849 1850 1851 646 Table 3. Biological information about bacterial genera identified by 1852 1853 647 metatranscriptomics in I. ricinus, and previously reported in different tick species. 1854 1855 1856 648 1857 1858 Bacterial genera Biological information Tick spp. References 1859 Actinomyces Commensals of the caecum Data not found 1860 gut flora and oral cavities in 1861 human. Responsible of 1862 abscesses formation in the 1863 mouth, lungs and 1864 gastrointestinal tract. 1865 I. scapularis Benson et al., 2004; 1866 I. ricinus Moreno et al., 2006; 1867 1868 I. ovatus van Overbeek et al., 1869 Soil organisms. Some I. persulcatus 2008; Qiu et al., species are opportunistic Haemaphysalis 2014; Clay et al., 1870 Acinetobacter 1871 pathogens causing human flava 2008; Zhuang et al., 1872 infections. Amblyomma 2014b; Narashiman 1873 americanum et al., 2014; 2017; 1874 Dermacentor Abraham et al., 1875 niveus 2017 1876 Soil bacteria with antibiotic Data not found 1877 Amycolatopsis and anti-inflammatory 1878 properties. 1879 I. scapularis Benson et al., 2004; 1880 Tick-borne intracellular I. persulcatus Moreno et al., 2006; 1881 bacterial pathogens causing Anaplasma I. pavlovskyi Kurilshikov et al., 1882 diseases in humans and I. ricinus 2014; van Overbeek 1883 animals. 1884 et al., 2008 Soil bacteria associated to D. niveus Zhuang et al., 1885 Arthrobacter 1886 bioremediation processes. 2014b 1887 Commensal and symbiotic 1888 bacteria in the human body. Bidifidobacterium Data not found 1889 Produce lactic acid that 1890 modulates the intestinal pH. 1891 Rhipicephalus Andreotti et al., 1892 microplus 2011; Carpi et al., 1893 I. ricinus 2011;Vayssier- 1894 Vector-borne bacteria I. scapularis Taussat et al., 2013; 1895 responsible for Lyme Borrelia I. persulcatus van Overbeek et al., 1896 disease and relapsing fever I. pavlovskyi 2008; Schabereiter- 1897 in mammals. 1898 A. americanum Gurtner et al., 2003; 1899 Moreno et al., 2006; 1900 Kurilshikov et al., 1901 1902 34 1903 1904 1905 1906 1907 2014 1908 1909 Clay et al., 2008 I. ovatus Qiu et al., 2014 1910 Soil bacteria, present in the 1911 Bradyrhizobium I. persulcatus roots of plant fixing N2. 1912 H. flava 1913 I. ovatus Qiu et al., 2014; 1914 I. scapularis Narashiman et al., Soil bacteria, some species 1915 Brevibacterium 2014; 2017; can be found in human skin. 1916 Abraham et al., 1917 2017 1918 Pathogenic bacteria causing I. ricinus Carpi et al., 2011 1919 Brucella disease in human and 1920 animals. 1921 1922 A. americanum Ponnusamy et al., I. ricinus 2014; Trout Fryxell 1923 Candidatus Tick endosymbiotic and DeBuyn, 2016; 1924 Midichloria bacteria. 1925 van Overbeek et al., 1926 2008 1927 I. ricinus Carpi et al., 2011; 1928 I. pavlovskyi Vayssier-Taussat et 1929 Candidatus Tick endosymbiotic al., 2013; van 1930 Neoehrlichia bacteria. Overbeek et al., 1931 2008; Kurilshikov 1932 et al., 2014 1933 R. microplus Andreotti et al., 1934 Saprophytes, some species I. ovatus 2011; Qiu et al., 1935 *Corynebacterium are pathogenic for plants I. persulcatus 2014 1936 and animals. 1937 H. flava 1938 I. scapularis Benson et al., 2004; I. persulcatus Kurilshikov et al., 1939 Tick-borne intracellular I. ovatus 2014; Qiu et al., 1940 bacterial pathogens causing 1941 Ehrlichia H. flava 2014; Xu et al., diseases in humans and 1942 D. reticulatus 2015; Clay et al., animals. 1943 R. microplus 2008 1944 A. americanum 1945 Commensals of digestive R. microplus Andreotti et al., 1946 tract, opportunistic I. ovatus 2011; Qiu et al., 1947 Enterococcus pathogens causing I. persulcatus 2014; Clay et al., 1948 septicemia and urinary tract A. americanum 2008 1949 infection in mammals. 1950 1951 Commensals of digestive R. microplus Andreotti et al., 1952 and urinary tracts, D. silvarum 2011; Liu et al., 1953 Escherichia opportunistic pathogens D. niveus 2016; Zhuang et al., 1954 causing diarrhea to I. persulcatus 2014b; Qiu et al., 1955 dysentery in mammals. 2014 1956 Francisella/ Intracellular bacterial D. reticulatus Kurilshikov et al., 1957 1958 35 1959 1960 1961 1962 1963 Francisella-like pathogens transmitted by D. andersoni 2014; Gall et al., 1964 1965 endosymbiont vectors (ticks, mosquitoes, I. ricinus 2016; Vayssier- 1966 flies), and causing tularemia. I. ovatus Taussat et al., 2013; 1967 I. persulcatus Nakao et al., 2013; 1968 A. maculatum Budachetri et al., 1969 2014 1970 Giliamella Bee endosymbiotic bacteria. Data not found 1971 Saprophytes in soil and R. microplus Andreotti et al., 1972 water, commensals of 2011 1973 gastrointestinal tract, Klebsiella 1974 opportunistic pathogens 1975 responsible for septicemia 1976 and pneumonia in mammals 1977 Environmental bacteria, Data not found 1978 1979 present in mammal feces, soil and water. 1980 Kurthia 1981 Opportunistic pathogens for 1982 humans causing 1983 endocarditis. 1984 Natural flora in humans, I. persulcatus Qiu et al., 2014 1985 Leptotrichia some species cause 1986 opportunistic infections. 1987 Soil bacteria, present in the Data not found 1988 Mesorhizobium roots of plant fixing N2. 1989 Mucilaginibacter Environmental bacteria H. longicornis Liu et al., 2016 1990 1991 I. scapularis Benson et al., 2004; 1992 I. ovatus Moreno et al., 2006; 1993 I. affinis van Treuren et al., 1994 I. persulcatus 2015; Qiu et al., 1995 I. ricinus 2014; Nakao et al., 1996 I. pavlovskyi 2013; van Treuren 1997 D. andersoni et al., 2015; 1998 D. reticulatus Kurilshikov et al., Intracellular bacteria, 1999 D. niveus 2014; Zhuang et al., transmitted by vectors 2000 H. longicornis 2014b; Williams- Neorickettsia/ (ticks, fleas, chiggers, lice), 2001 H. formosensis Newkirk et al., Rickettsia responsible for human 2002 H. flava 2014; Carpi et al., diseases such as spotted 2003 A.testudinarium 2011; van Overbeek fever and typhus. 2004 A. americanum et al., 2008; 2005 A. maculatum Vayssier-Taussat et 2006 R. microplus al., 2013; 2007 2008 Schabereiter- 2009 Gurtner et al., 2003; 2010 Liu et al., 2016; 2011 Gall et al., 2016; 2012 Ponnusamy et al., 2013 2014 36 2015 2016 2017 2018 2019 2014; Trout Fryxell 2020 2021 and DeBuyn, 2016; 2022 Clay et al., 2008; 2023 Budachetri et al., 2024 2014; Xiang et al., 2025 2017; Xu et al., 2026 2015 2027 I. ricinus Carpi et al., 2011; 2028 Commensals of human gut I. ovatus Qiu et al., 2014 Propionibacterium 2029 and skin. I. persulcatus 2030 H. flava 2031 R. microplus Andreotti et al., 2032 I. ricinus 2011; Xu et al., 2033 I. scapularis 2015; Schabereiter- 2034 I. persulcatus Gurtner et al., 2003; 2035 Saprophytes in soil, I. pavlovsky Carpi et al., 2011; 2036 opportunistic pathogens for I. ovatus Moreno et al., 2006; 2037 Pseudomonas humans and plants, plant H. longicornis Qiu et al., 2014; 2038 growth promoters. 2039 D. niveus Kurilshikov et al., 2040 A. americanum 2014; Liu et al., 2041 2016; Zhuang et al., 2042 2014b; Clay et al., 2043 2008 2044 R. microplus Andreotti et al., 2045 I. ricinus 2011; Carpi et al., 2046 Saprophytes in soil and I. persulcatus 2011; Schabereiter- 2047 water, one species is I. ovatus Gurtner et al., 2003; *Rhodococcus 2048 pathogenic for animals I. pavlovsky Kurilshikov et al., 2049 causing pneumonia. I. scapularis 2014; Qiu et al., 2050 2014; Moreno et al., 2051 2052 2006 2053 Rickettsiella Tick endosymbiotic I. pavlovsky Kurilshikov et al., 2054 endosymbiont bacteria. A. variegatum 2014; Nakao et al., 2055 2013 2056 I. scapularis Benson et al., 2004; Environmental bacteria and 2057 I. ovatus Qiu et al., 2014; Liu bioremediation agents, some 2058 Sphingomonas I. persulcatus et al., 2016 specimens cause clinical 2059 H. longicornis infections in humans. 2060 H. flava 2061 D. niveus Zhuang et al., Commonly isolated from 2062 Sphingobium I. ovatus 2014b; Qiu et al., 2063 soil 2014 2064 2065 Symbionts in the gut I. ovatus Qiu et al., 2014 2066 Spiroplasma hemolymph, few species are I. persulcatus 2067 pathogenic for mice H. flava 2068 (cataracts and neurological 2069 2070 37 2071 2072 2073 2074 2075 damage) 2076 2077 R. microplus Andreotti et al., 2078 Saprophytes in soil, D. nievus 2011; Xu et al., 2079 commensals of skin and I. ricinus 2015; Zhuang et al., 2080 Staphylococcus mucosal surfaces, I. ovatus 2014b;Schabereiter- 2081 opportunistic pathogens 2082 I. persulcatus Gurtner et al., 2003; (septicemia, food poisoning) 2083 H. flava Qiu et al., 2014 2084 Saprophytes in soil and R. microplus Andreotti et al., 2085 water, commensals of skin I. scapularis 2011; Benson et al., 2086 and mucosal surfaces, 2004 Streptococcus 2087 opportunistic pathogens 2088 (septicemia, meningitis, 2089 pneumonia) 2090 2091 Soil bacterium associated Data not found 2092 Variovorax with bioremediation 2093 processes 2094 R. microplus Andreotti et al., I. scapularis 2011; Benson et al., 2095 Mutualistic bacteria of many 2096 Wolbachia I. ricinus 2004; Carpi et al., insects and nematodes 2097 2011; van Overbeek 2098 et al., 2008 2099 649 2100 2101 650 Table modified from Razzauti et al. (2015). 2102 2103 651 *These genera were previously identified as contamination of DNA extraction kits reagents 2104 2105 2106 652 and ultrapure water systems, which may lead to erroneous identifications in bacterial 2107 2108 653 assignments (Salter et al., 2014). 2109 2110 654 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 38 2127 2128

Supplementary file 1 – Table 1. Bacterial sequence database constructed with genome and/or species-specific rRNA sequences.

Accession Number Taxonomic Assignments (NCBI) NC_019395 Acidipropionibacterium acidipropionici ATCC 4875, complete genome NC_014374 Acidilobus saccharovorans 345-15, complete genome NC_013926 Aciduliprofundum boonei T469, complete genome KU991570.1 Acinetobacter tjernbergiae strain T40 16S ribosomal RNA gene, partial sequence NC_009053 Actinobacillus pleuropneumoniae L20 serotype 5b complete genome CP014232.1 Actinomyces oris strain T14V, complete genome NR_117358.1 Actinomyces oris strain ATCC 27044 16S ribosomal RNA gene, partial sequence LT223615.1 Actinomyces odontolyticus partial 16S rRNA gene, strain Marseille-P2379 LT576385.1 Actinomyces sp. Marseille-P2825 partial 16S rRNA gene, strain Marseille-P2825 NC_022521 Aeropyrum camini SY1 = JCM 12091 DNA, complete genome. NC_003062 Agrobacterium fabrum str. C58 chromosome circular, complete sequence NC_018011 Alistipes finegoldii DSM 17242, complete genome NC_014318 Amycolatopsis mediterranei U32 chromosome, complete genome CP009110.1 Amycolatopsis methanolica 239, complete genome NC_004842 Anaplasma marginale str. St. Maries, complete genome CP006847.1 Anaplasma marginale str. Dawn genome NC_007797 Anaplasma phagocytophilum HZ, complete genome KX236049.1 Anaplasma phagocytophilum isolate LYD52 16S ribosomal RNA gene, partial sequence CP015376.1 Anaplasma phagocytophilum str. Norway variant2, complete genome KM021418.1 Anaplasma phagocytophilum str. BovBat18 23S ribosomal RNA gene, partial sequence KM021425.1 Anaplasma platys str. ChieCal05 23S ribosomal RNA gene, partial sequence KT364327.1 Anaplasma sp. "ivoriense TCI149" 23S ribosomal RNA gene, partial sequence

1 NC_011567 Anoxybacillus flavithermus WK1, complete genome NC_014218 Arcanobacterium haemolyticum DSM 20595, complete genome NC_008541 Arthrobacter sp. FB24, complete genome. KX168140.1 Arthrobacter subterraneus str. SBSK-401 16S ribosomal RNA gene, partial sequence NC_021171 Bacillus sp. 1NLA3E, complete genome NC_003909 Bacillus cereus ATCC 10987, complete genome NC_005363 Bdellovibrio bacteriovorus complete genome, strain HD100 KU593501.1 Bifidobacterium animalis str. DR2-1 16S ribosomal RNA gene, partial sequence CP011965.1 Bifidobacterium longum subsp. longum strain CCUG30698, complete genome NC_008277 Borrelia afzelii PKo, complete genome NR_121981.1 Borrelia anserina str. BA2 23S ribosomal RNA gene, complete sequence KM269460.1 Borrelia bissettii str. M7p 23S ribosomal RNA gene NC_001318 Borrelia burgdorferi B31 chromosome, complete genome NC_017418 Borrelia burgdorferi N40, complete genome NC_011229 Borrelia duttonii Ly, complete genome NC_010673 Borrelia hermsii DAH, complete genome NC_022079 Borrelia miyamotoi LB-2001, complete genome KU196080.1 Borrelia miyamotoi str. Sonom53 16S ribosomal RNA gene, partial sequence NC_011244 Borrelia recurrentis A1, complete genome NC_008710 Borrelia turicatae 91E135, complete genome NC_006156 Borreliella bavariensis PBi chromosome linear, complete sequence KX034021.1 Bradyrhizobium sp. strain TC29 16S ribosomal RNA gene, partial sequence LC167484.1 Bradyrhizobium elkanii gene for 16S ribosomal RNA, partial sequence, str: PHM 1 NC_004463 Bradyrhizobium japonicum USDA 110 chromosome, complete genome. CP014869.1 Brevibacterium linens str. BS258, complete genome KX168131.1 Brevibacterium oceani str. SBSK-404 16S ribosomal RNA gene, partial sequence

2 CP007758.1 Brucella canis str. RM6/66 chromosome 1, complete sequence KX529832.1 Brucella canis str. YH-C16 16S ribosomal RNA gene, partial sequence NC_019751 Calothrix sp. PCC 6303, complete genome. NC_008599 Campylobacter fetus subsp. fetus 82-40, complete genome NC_017299 Clostridium botulinum H04402 065, complete genome sequence NC_017096 Caldisericum exile AZM16c01 DNA, complete genome. NC_019791 Caldisphaera lagunensis DSM 15908, complete genome. NC_002163 Campylobacter jejuni subsp. jejuni NCTC 11168 = ATCC 700819 chr. complete genome NC_013194 Candidatus Accumulibacter phosphatis clade IIA str. UW-1, complete genome NC_013771 Candidatus Atelocyanobacterium thalassa isolate ALOHA, complete genome NC_010482 Candidatus Korarchaeum cryptofilum OPF8, complete genome NC_015722 Candidatus Midichloria mitochondrii IricVA, complete genome HF568841.1 Candidatus Midichloria mitochondrii partial 16S rRNA gene, strain MA7 KU865475.1 Candidatus Neoehrlichia mikurensis isolate LN5 16S rRNA gene, partial sequence JX406180.1 Candidatus Rickettsiella isopodorum str. JKI D244/2012 16S ribosomal RNA (rrs) gene NC_020135 Candidatus Uzinura diaspidicola str. ASNER, complete genome NC_011916 Caulobacter crescentus NA1000, complete genome NC_014151 Cellulomonas flavigena DSM 20109, complete genome NC_009480 Clavibacter michiganensis subsp. michiganensis NCPPB 382 complete genome NC_009495 Clostridium botulinum A str. ATCC 3502 chromosome, complete genome NC_022538 complete chromosome Acholeplasma palmae NC_012590 Corynebacterium aurimucosum ATCC 700975, complete genome NC_002935 Corynebacterium diphtheriae NCTC 13129, complete genome NC_003450 Corynebacterium glutamicum ATCC 13032 chromosome, complete genome NR_121991.1 Corynebacterium glycinophilum str. AJ 3170 23S rRNA gene KU319425.1 Corynebacterium sp. C2-18 16S rRNA gene, partial sequence

3 NC_007164 Corynebacterium jeikeium K411 complete genome NC_014329 Corynebacterium pseudotuberculosis FRC41, complete genome NC_015673 Corynebacterium resistens DSM 45100, complete genome NC_021663 Corynebacterium terpenotabidum Y-11, complete genome NC_019753 Crinalium epipsammum PCC 9333, complete genome. NC_010530 Cupriavidus taiwanensis str. LMG19424 chromosome 2, complete genome NC_019778 Cyanobacterium stanieri PCC 7202, complete genome NC_013939 Deferribacter desulfuricans SSM1 DNA, complete genome NC_013216 Desulfotomaculum acetoxidans DSM 771, complete genome NC_015589 Desulfotomaculum ruminis DSM 2154, complete genome NC_007354 Ehrlichia canis str. Jake, complete genome NC_007799 Ehrlichia chaffeensis str. Arkansas, complete genome KT308164.1 Ehrlichia chaffeensis isolate 1246 16S rRNA gene, partial sequence NC_023063 Ehrlichia muris AS145, complete genome. KP702294.1 Ehrlichia muris 23S rRNA gene, complete sequence NC_005295 Ehrlichia ruminantium str. Welgevonden, complete genome KX185055.1 Enterococcus faecium str. LUB950217 16S rRNA gene, partial sequence NC_015601 Erysipelothrix rhusiopathiae str. Fujisawa DNA, complete genome CP015855.1 str. EDL933-1 genome J01859.1 Escherichia coli 16S rRNA, complete sequence NC_021019 Eubacterium cylindroides T2-87 draft genome NC_017461 Fervidicoccus fontis Kam940 chromosome, complete genome NC_006570 Francisella tularensis subsp. tularensis SCHU S4 chromosome, complete genome JQ740890.1 Francisella-like endosymbiont of Ixodes ricinus 16S rRNA gene, partial sequence NR_122000.1 Gilliamella apicola str. wkB1 23S rRNA, complete sequence NC_010125 Gluconacetobacter diazotrophicus PAl 5 complete genome

4 NC_018581 Gordonia sp. KTR9, complete genome NC_015153 Haemophilus influenzae F3031 complete genome NC_013422 Halothiobacillus neapolitanus c2, complete genome NC_015564 Hoyosella subflava DQS3-9A1, complete genome NC_015588 Isoptericola variabilis 225, complete genome NC_013174 Jonesia denitrificans DSM 20603, complete genome KX691737.1 Klebsiella sp. strain 17LKA 16S rRNA gene, partial sequence NC_010617 Kocuria rhizophila DC2201 DNA, complete genome NC_012785 Kosmotoga olearia TBF 19.5.1, complete genome CP013217.1 Kurthia sp. 11kri321, complete genome KC904244.1 Kurthia sp. LAM0618 16S rRNA gene, partial sequence NC_006087 Leifsonia xyli subsp. xyli str. CTCB07, complete genome NC_013192 Leptotrichia buccalis DSM 1135, complete genome NC_006055 Mesoplasma florum L1 chromosome, complete genome NC_002678 Mesorhizobium loti MAFF303099 DNA, complete genome NC_023044 Methanobacterium sp. MB1 complete sequence NC_021355 Methanobrevibacter sp. AbM4, complete genome NC_000909 Methanocaldococcus jannaschii DSM 2661, complete genome NC_003551 Methanopyrus kandleri AV19, complete genome NC_009712 Methanoregula boonei 6A8, complete genome NC_019943 Methanoregula formicica SMSP, complete genome NC_007796 Methanospirillum hungatei JF-1, complete genome NC_000916 Methanothermobacter thermautotrophicus str. Delta H, complete genome NC_015636 Methanothermococcus okinawensis IH1, complete genome NC_018485 Methylocystis sp. SC2 complete genome NC_014246 Mobiluncus curtisii ATCC 43063, complete genome

5 NC_014147 Moraxella catarrhalis BBH18, complete genome NC_020418 Morganella morganii subsp. morganii KT, complete genome NC_002944 Mycobacterium avium subsp. paratuberculosis str. k10, complete genome NC_023036 Mycobacterium neoaurum VKM Ac-1815D, complete genome NC_011025 Mycoplasma arthritidis 158L3-1, complete genome NC_007633 Mycoplasma capricolum subsp. capricolum ATCC 27343, complete genome NC_013511 Mycoplasma hominis ATCC 23114 chromosome complete genome NC_006360 Mycoplasma hyopneumoniae 232, complete genome NC_005364 Mycoplasma mycoides subsp. mycoides SC str. PG1 chromosome, complete genome NC_004432 Mycoplasma penetrans HF-2 DNA, complete genome NC_015153 Mycoplasma suis KI3806 complete genome KU597223.1 Mucilaginibacter sp. str. JWp32 16S rRNA gene, partial sequence NR_076635.1 Neorickettsia risticii str. Illinois 23S rRNA gene, complete sequence NC_007798 Neorickettsia sennetsu str. Miyayama, complete genome NC_007406 Nitrobacter winogradskyi Nb-255, complete genome NC_015222 Nitrosomonas sp. AL212, complete genome NC_010085 Nitrosopumilus maritimus SCM1 chromosome, complete genome NC_006361 Nocardia farcinica IFM 10152 DNA, complete genome NC_008699 Nocardioides sp. JS614, complete genome NC_018524 Nocardiopsis alba ATCC BAA-2165, complete genome NC_005303 Onion yellows phytoplasma OY-M DNA, complete genome NC_008711 Paenarthrobacter aurescens TC1, complete genome NC_015702 Parachlamydia acanthamoebae UV-7, complete genome NC_020514 Paraglaciecola psychrophila 170, complete genome. NC_014414 Parvularcula bermudensis HTCC2503 str. HTCC2503, complete genome NC_014537 Parvularcula bermudensis HTCC2503 str. HTCC2503, complete genome

6 NC_016605 Pediococcus claussenii ATCC BAA-344, complete genome NC_008609 Pelobacter propionicus DSM 2379, complete genome NC_012440 Persephonella marina EX-H1, complete genome NC_011144 Phenylobacterium zucineum HLK1, complete genome NC_012962 Photorhabdus asymbiotica ATCC43949 complete genome NC_019689 Pleurocapsa sp. PCC 7327, complete genome NC_015501 Porphyromonas asaccharolytica DSM 20707, complete genome NC_004578 Pseudomonas syringae pv. tomato str. DC3000 chromosome, complete genome NC_012660 Pseudomonas fluorescens SBW25 complete genome NC_023064 Pseudomonas sp. TKP, complete genome NC_021085 Propionibacterium acnes HL096PA1, complete genome KX108930.1 Propionibacterium acnes str. IR-TUMS/BPG6 16S ribosomal RNA gene NC_015177 Pseudopedobacter saltans DSM 12145, complete genome NC_018142 Pseudopropionibacterium propionicum F0230a, complete genome NC_008709 Psychromonas ingrahamii 37, complete genome NC_003364 Pyrobaculum aerophilum str. IM2 chromosome, complete genome NC_015931 Pyrolobus fumarii 1A, complete genome CP016819.1 Rhodococcus sp. p52, complete genome AM179867.1 Rhodococcus sp. A83 A83 A83 A83 partial 16S rRNA gene, isolate A83 NC_005296 Rhodopseudomonas palustris CGA009 complete genome NC_007940 Rickettsia bellii RML369-C, complete genome NC_007109 Rickettsia felis URRWXCal2, complete genome NC_006142 Rickettsia typhi str. Wilmington, complete genome KP994764.1 Rickettsiella endosymbiont of Ixodes tasmani isolate Ixo tasmani3 23S ribosomal KT697685.1 Rickettsiella sp. RKTSLLA_T3262 16S rRNA gene, partial sequence NC_014920 Rothia dentocariosa ATCC 17931, complete genome

7 NC_013715 Rothia mucilaginosa DY-18 DNA, complete genome NC_009092 Shewanella loihica PV-4, complete genome NC_020527 Sinorhizobium meliloti 2011 plasmid pSymA, complete sequence NC_012587 Sinorhizobium fredii NGR234 chromosome, complete genome KX672814.1 Sphingomonas sp. K-16 16S rRNA gene, partial sequence NC_009511 Sphingomonas wittichii RW1, complete genome LN019399.1 Spirometra erinaceieuropaei genome assembly S_erinaceieuropaei NC_008048 Sphingopyxis alaskensis RB2256, complete genome NC_022998 Spiroplasma apis B31, complete genome NC_021833 Spiroplasma diminutum CUAS-1, complete genome NC_021846 Spiroplasma taiwanense CT-1, complete genome NC_021280 Spiroplasma chrysopicola DF-1, complete genome KP967685.1 Spiroplasma sp. Bratislava 1 16S rRNA gene, partial sequence DQ004912.1 Spiroplasma ixodetis str. Y-30 16S rRNA gene and 23S rRNA FJ824553.1 Spiroplasma sp. GSU5508 23S rRNA gene, partial sequence NC_002976 Staphylococcus epidermidis RP62A, complete genome NC_007168 Staphylococcus haemolyticus JCSC1435 DNA, complete genome NC_020164 Staphylococcus warneri SG1, complete genome NC_003098 Streptococcus pneumoniae R6 chromosome, complete genome NC_012004 Streptococcus uberis 0140J complete genome KX679404.1 Streptococcus sp. HTS29 16S rRNA gene, partial sequence NC_002754 Sulfolobus solfataricus P2, complete genome NC_007775 Synechococcus sp. JA-3-3Ab, complete genome NC_012997 Teredinibacter turnerae T7901, complete genome NC_022093 Thermofilum sp. 1910b, complete genome NC_017954 Thermogladius cellulolyticus 1633, complete genome

8 NC_009616 Thermosipho melanesiensis BI429, complete genome NC_018012 Thiocystis violascens DSM 198, complete genome NC_002967 Treponema denticola ATCC 35405 chromosome, complete genome NC_014158 Tsukamurella paurometabola DSM 20162, complete genome NC_002162 Ureaplasma parvum serovar 3 str. ATCC 700970, complete genome LT607803.1 Variovorax sp. HW608 genome assembly, chromosome: I AB300597.1 Variovorax boronicumulans gene for 16S rRNA, partial sequence NC_014643 Vulcanisaeta distributa DSM 14429, complete genome NC_015759 Weissella koreensis KACC 15510, complete genome NC_010981 Wolbachia endosymbiont of Culex quinquefasciatus Pel str wPip complete genome NC_002978 Wolbachia endosymbiont of Drosophila melanogaster, complete genome NC_018267 Wolbachia endosymbiont of Onchocerca ochengi complete genome NC_006833 Wolbachia endosymbiont str. TRS of Brugia malayi, complete genome NC_006526 Zymomonas mobilis subsp. mobilis ZM4, complete genome KT151398.1 Uncultured Lysobacter sp. clone AT204 16S rRNA gene, partial sequence JQ654170.1 Uncultured Treponema sp. clone 5:7P56 16S rRNA gene, partial sequence KX630121.1 Uncultured bacterium clone 16S(V3-V4)-7606 16S rRNA gene, partial sequence

9