<<

The ISME Journal (2013) 7, 1003–1015 & 2013 International Society for Microbial Ecology All rights reserved 1751-7362/13 www.nature.com/ismej ORIGINAL ARTICLE A novel approach, based on BLSOMs (Batch Learning Self-Organizing Maps), to the analysis of

Ryo Nakao1,8, Takashi Abe2,3,8, Ard M Nijhof4, Seigo Yamamoto5, Frans Jongejan6,7, Toshimichi Ikemura2 and Chihiro Sugimoto1 1Division of Collaboration and Education, Research Center for Zoonosis Control, Hokkaido University, Sapporo, Hokkaido, Japan; 2Nagahama Institute of Bio-Science and Technology, Nagahama, Shiga, Japan; 3Graduate School of Science & Technology, Niigata University, Nishi-ku, Niigata, Japan; 4Institute for Parasitology and Tropical Veterinary Medicine, Freie Universita¨t Berlin, Ko¨nigsweg 67, Berlin, Germany; 5Miyazaki Prefectural Institute for Public Health and Environment, Miyazaki, Japan; 6Utrecht Centre for -borne Diseases (UCTD), Department of Infectious Diseases and Immunology, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands and 7Department of Veterinary Tropical Diseases, Faculty of Veterinary Science, University of Pretoria, Onderstepoort, South Africa

Ticks transmit a variety of viral, bacterial and protozoal , which are often zoonotic. The aim of this study was to identify diverse tick , which may contain as-yet unidentified pathogens, using a metagenomic approach. DNA prepared from /- enriched fractions obtained from seven tick species, namely testudinarium, Amblyomma variegatum, Haemaphysalis formosensis, Haemaphysalis longicornis, Ixodes ovatus, Ixodes persulcatus and Ixodes ricinus, was subjected to pyrosequencing after whole- amplification. The resulting sequence reads were phylotyped using a Batch Learning Self- Organizing Map (BLSOM) program, which allowed phylogenetic estimation based on similarity of oligonucleotide frequencies, and functional annotation by BLASTX similarity searches. In addition to bacteria previously associated with human/ diseases, such as Anaplasma, , Borrelia, Ehrlichia, and , BLSOM analysis detected microorganisms belonging to the phylum in some tick species. This was confirmed by pan-Chlamydia PCR and analysis. Gene sequences associated with bacterial pathogenesis were also identified, some of which were suspected to originate from horizontal gene transfer. These efforts to construct a database of tick microbes may lead to the ability to predict emerging tick-borne diseases. Furthermore, a comprehensive understanding of tick microbiomes will be useful for understanding tick biology, including vector competency and interactions with pathogens and symbionts. The ISME Journal (2013) 7, 1003–1015; doi:10.1038/ismej.2012.171; published online 10 January 2013 Subject Category: microbe-microbe and microbe-host interactions Keywords: BLSOMs; emerging diseases; ; microbiomes; symbionts; ticks

Introduction environmental changes (Parola, 2004; Beugnet and Marie´, 2009). Meanwhile, reports of novel tick-borne Ticks (: Ixodida) are amongst the most pathogens are also increasing (Spolidorio et al., well-known blood-sucking . They can 2010; Yu et al., 2011). Above all, many of the transmit a variety of pathogens, including , emerging rickettsial pathogens identified in bacteria and protozoa, to humans and . recent decades were first discovered in ticks and Opportunities for human/animal contact in natural subsequently recognized as the causative agents of foci of tick-borne infection are increasing, along animal or human diseases (Parola et al., 2005). This with expanded distributions of vector ticks due to indicates that active surveillance for potential pathogens in ticks may be a feasible approach to Correspondence: C Sugimoto, Division of Collaboration and predict and combat emerging tick-borne diseases. Education, Research Center for Zoonosis Control, Hokkaido Given the diversity of tick species and their wide University, Kita-20, Nishi-10, Kita-ku, Sapporo, Hokkaido geographic distribution (Bowman and Nuttall, 001-0020, Japan. 2008), it is likely that as-yet unidentified pathogens E-mail: [email protected] are harboured by ticks. 8These authors contributed equally to this work. Received 1 August 2012; revised 7 November 2012; accepted Morphological detection by electron microscopy 4 December 2012; published online 10 January 2013 has shown that Rickettsia- and Wolbachia-like Tick microbiomes revealed by BLSOMs R Nakao et al 1004 bacteria infect a variety of tick species (Roshdy, diversity and relative abundance of different phylo- 1968; Lewis, 1979), although these bacteria are not type members of uncultured microorganisms in a necessarily pathogenic in either mammalian hosts wide variety of environmental metagenomic sam- or ticks. The availability of molecular genetic tools ples (Abe et al., 2005; Uehara et al., 2011). Of note, such as PCR and real-time PCR has increased the similar SOM-based methods were recently used to number of reports on a variety of microorganisms in obtain clear phylotype-specific classification of ticks (Noda et al., 1997; Benson et al., 2004; van metagenomic sequences (Chan et al., 2008; Martin Overbeek et al., 2008); however, their biological et al., 2008; Dick et al., 2009; Weber et al., 2011). roles and potential to cause disease in mammals In the present study, we used BLSOM to char- remain poorly understood. Interactions between acterize diverse microbiomes in seven tick species. arthropods and their symbionts have been relatively The ticks harboured a variety of bacteria, including well studied in aphids. A variety of symbionts are those previously reported to be associated with reported to have detrimental, neutral or beneficial human and animal diseases, and others not effects on the aphid hosts; for example, through previously reported from ticks. This study was pathogenicity (Grenier et al., 2006), providing intended as a first step towards the discovery of nutrition (Nakabachi and Ishikawa, 1999), altering emerging tick-borne pathogens and a systematic gene expression associated with body colour understanding of the relationship between ticks (Tsuchida et al., 2010) and, most intriguingly, and their microbiomes. displaying protective functions against pathogens and predators (Scarborough et al., 2005). It has been suggested that certain symbiotic microbes might also Materials and methods affect the vector competency of their host (Weiss and Aksoy, 2011). A full characterization of Tick species microorganisms harboured by ticks will provide Ticks were collected from the field by dragging valuable information towards a better understanding flannel sheets over the vegetation. Three tick of tick biology in terms of the relationship between species, namely Amblyomma testudinarium (AT), ticks and their microbiomes, including microbes Haemaphysalis formosensis (HF) and Haemaphysa- that are potential pathogens in mammals. lis longicornis (HL) were collected in March 2009 in Recent progress in second generation sequencing Miyazaki, Japan. This is an area where Rickettsia technologies has led to the discovery of previously japonica, the causative agent of Japanese spotted unknown in a variety of samples, includ- fever, and Rickettsia tamurae, the ing arthropods (Cox-Foster et al., 2007; Warnecke group rickettsia recently associated with human et al., 2007; Suen et al., 2010), since uncultured disease (Imaoka et al., 2011), are known to be microorganisms represent the vast majority of endemic (Morita et al., 1990; S Yamamoto, personal symbiotic, commensal and pathogenic microorgan- communication). Two Ixodes species, Ixodes ovatus isms. Attempts to assess the microbial diversity in (IO) and Ixodes persulcatus (IP), were collected in two tick species, Rhipicephalus microplus and June 2010 in Hokkaido, Japan, where Borrelia Ixodes ricinus, using novel sequencing technologies, burgdorferi sensu lato (s.l.)andAnaplasma phago- resulted in the identification of microbes (including cytophilum, the causative agents of Lyme disease bacterial species) previously unreported in ticks and human granulocytic , respectively, (Andreotti et al., 2011; Carpi et al., 2011). However, are endemic (Miyamoto et al., 1992; Murase et al., since the data processing methods used in these two 2011). Ixodes ricinus (IR) was collected in August studies were based on conventional homology-based 2010 in Soesterberg, The Netherlands, where sequence searches, the discoveries were limited to Lyme disease is endemic (Hofhuis et al., 2006). microorganisms for which genomic information was Amblyomma variegatum (AV) ticks were collected already available in the database used. An alter- in The Gambia in 2005 and subsequently main- native approach to characterizing DNA fragments tained under laboratory conditions at the Utrecht might offer improvements for investigating micro- Centre for Tick-borne Diseases, The Netherlands. bial diversity in ticks that are expected to harbour Adult AV ticks from the second laboratory genera- poorly characterized microorganisms. tion were used in this study. The collected ticks Batch Learning Self-Organizing Map (BLSOM) is a were pooled according to species, life stage and method that follows a learning sex (except for HF and HL) and specific pool IDs process and generates a map independently of the (AVf, AVm, IOf, IOm, IPf, IPm, IRf, ATn, HFfmn order of data input (Kanaya et al., 2001; Abe et al., and HLfmn) were provided as indicated in Table 1 2003). This method was designed to separate and (‘f’, ‘m’ and ‘n’ represent female, male and nymph, cluster genomic sequence fragments based on the respectively). similarity of oligonucleotide frequencies without any other taxonomical information. Since BLSOM does not require orthologous sequence sets and DNA preparation sequence alignments, it could provide a new The live ticks were washed twice with 70% ethanol systematic strategy for revealing the microbial supplemented with 1% povidone-iodine (Meiji

The ISME Journal Tick microbiomes revealed by BLSOMs R Nakao et al 1005 Table 1 Collection details for the ticks used in the present study

Pool ID Tick species Collection site Adult Nymph Total

Female Male

AVf Amblyomma variegatum Laboratory colonya 20 20 AVm Amblyomma variegatum Laboratory colonya 20 20 IOf Ixodes ovatus Hokkaido, Japan 59 59 IOm Ixodes ovatus Hokkaido, Japan 110 110 IPf Ixodes persulcatus Hokkaido, Japan 56 56 IPm Ixodes persulcatus Hokkaido, Japan 75 75 IRf Ixodes ricinus Soesterberg, The Netherlands 55 55 ATn Amblyomma testudinarium Miyazaki, Japan 57 57 HFfmn Haemaphysalis formosensis Miyazaki, Japan 32 24 513 569 HLfmn Haemaphysalis longicornis Miyazaki, Japan 16 20 84 120 aThis laboratory colony was originally obtained from The Gambia and has been maintained at Utrecht Centre for Tick-borne Diseases, Utrecht University, The Netherlands.

Seika, Tokyo, Japan) solution and rinsed three times with distilled water to decontaminate the tick body surface. Ticks were then ground with 4.8 mm stainless steel beads (TOMY, Tokyo, Japan) using the Micro Smash MS-100R (TOMY). Tick homogenates were treated with a membrane lysis buffer [10 mM Tris-HCl (pH 8.0), 150 mM NaCl, 10 mM MgCl2, 0.1% IGEPAL CA-630 (Sigma Chemical Co., St. Louis, MO)] for 10 min on ice and centrifuged at 400 Â g for 25 min. The supernatant was centrifuged at 20 000 Â g for 30 min and the pellet containing the bacterial/archaeal cells was resuspended in a DNase buffer [25 mM Tris-HCl (pH 8.0), 10 mM MgCl2,0.8U/ml DNase I (Takara, Shiga, Japan)] to remove any contaminating host-cell DNA. After 60 min at 37 1C, the reaction was stopped by adding 25 mM EDTA (pH 8.0). The solution was filtered through a 5.0 mm pore-size membrane filter (Millipore, Bed- ford, MA) by centrifugation at 1000 Â g. The filtrate was centrifuged at 20 000 Â g for 30 min and the resulting pellet was washed twice with PBS. All centrifugation steps were performed at 4 1C. To lyse Figure 1 Workflow for bacterial/archaeal purification and a wide range of bacterial/archaeal species, this study metagenomic analysis. The present strategy employed a process utilized an achromopeptidase, which is reported to for purifying of bacteria/archaea, which comprised centrifugation, have broad-spectrum bacteriolytic activity (Ezaki DNase treatment and filtration, to enrich bacterial/archaeal cells and Suzuki, 1982). The bacterial/archaeal pellet was from tick homogenates prior to whole-genome amplification and pyrosequencing. treated with an achromopeptidase buffer [10 mM Tris-HCl (pH 8.0), 10 mM NaCl, 1 U/ml achromopep- tidase (WAKO, Osaka, Japan)] for 60 min at 37 1C. Pyrosequencing Bacterial/archaeal DNA was then extracted using the Genomic DNA from each tick pool was subjected to NucleoSpin Tissue XS Kit (Macherey-Nagel, Du¨ ren, pyrosequencing on a Roche/454 Genome Sequencer Germany) according to the manufacturer0s instruc- FLX Titanium (Roche Applied Science/454 Life tions. The genomic DNA was then subjected to Science, Brandford, CT) at Hokkaido System whole-genome amplification using the GenomiPhi Science (Sapporo, Japan). Two GS FLX shotgun V2 DNA Amplification Kit (GE Healthcare, Chalfont libraries were prepared using a standard protocol St Giles, UK) according to the manufacturer’s from the manufacturer and analysed in four inde- instructions. The DNA concentration was measured pendent sequencing runs: the first library contained using the Quant-iT dsDNA BR assay with a Qubit three pools (ATn, HFfmn and HLfmn) with different Fluorometer (Invitrogen, Carlsbad, CA). All proce- multiplex identifiers and was analysed on 1/8 of the dures were conducted under sterile conditions in a PicoTiterPlate (Roche Applied Science/454 Life flow cabinet. A schematic flow diagram showing the Science), while the second library contained seven extraction process of bacterial/archaeal DNA from pools (AVf, AVm, IOf, IOm, IPf, IPm and IRf) and ticks is presented in Figure 1. was analysed on 1/4 of the plate. The metagenomic

The ISME Journal Tick microbiomes revealed by BLSOMs R Nakao et al 1006 sequences were deposited in the DNA Data Bank of correctly classified to the , phylum and Japan (DDBJ) (http://www.ddbj.nig.ac.jp) Sequence genus levels, respectively (Supplementary Figure S1). Read Archive under the accession no. DRA000590.

Pan-Chlamydia PCR, sequencing and data analysis BLSOM analysis A pan-Chlamydia PCR was conducted on the tick Self-Organizing Map (SOM) is an unsupervised species in which bacteria belonging to the phylum neural network algorithm that implements a char- Chlamydiae had been detected by BLSOM analysis. acteristic nonlinear projection from the high-dimen- PCR amplifications were performed using three sional space of input data onto a two-dimensional different sets of primers targeting the 16S rRNA array of weight vectors (Kohonen, 1990). We used gene (Supplementary Table 1). The successfully the ‘Batch Learning SOM’ (BLSOM), which is a amplified products were cloned into a pGEM-T modified version of the conventional SOM for vector (Promega, Madison, WI). Each clone genome informatics that makes the learning process was sequenced using the BigDye Terminator version and creation of the resulting map independent of the 3.1 Cycle Sequencing Kit (Applied Biosystems, order of data input (Kanaya et al., 2001; Abe et al., Foster City, CA) and an ABI Prism 3130x genetic 2003). The initial weight vectors were defined by analyzer (Applied Biosystems) according to the principal component analysis instead of random manufacturer’s instructions. The DNA sequences values. BLSOM learning was conducted as obtained were submitted to the DDBJ under acces- described previously (Abe et al., 2003), and the sion nos. AB725685–AB725705. BLSOM program was obtained from UNTROD Inc., The obtained sequences were aligned using Japan ([email protected]). ClustalW2 (Larkin et al., 2007) and the resulting To estimate phylotypes of the metagenomic alignments were used to generate an uncorrected sequences, three types of large-scale BLSOMs, pairwise distance matrix using the MOTHUR namely Kingdom-, Bacteria/Archaea- and Genus program version 1.25.0 (Schloss et al., 2009). An group-BLSOM, were constructed in advance, using average neighbour clustering algorithm in the sequences deposited in DDBJ/EMBL/GenBank as MOTHUR program was used to assign sequences previously described (Abe et al., 2003). The cluster- into operational taxonomic units (OTUs), with a ing conditions were adopted from the previously 97% sequence similarity cut-off. A representative optimised parameters to minimise the computation sequence from each OTU was taxonomically identi- time without loss of clustering power (Abe et al., fied using the Classifier tool available on the 2005). Kingdom-BLSOM was constructed with tetra- Ribosomal Database Project (RDP) database (Cole nucleotide frequencies for 5-kb sequences from et al., 2009) and was compared against the NCBI the whole-genome sequences of 111 , GenBank database using a BLASTn search. 2813 bacteria/archaea, 1728 mitochondria, 110 chloroplasts and 31 486 viruses. To obtain more detailed phylotype information for bacterial/ Functional annotation and characterization archaeal sequences, Bacteria/Archaea- and Genus The metagenomic sequences were annotated using group-BLSOM were constructed with a total of BLASTX (Altschul et al., 1990) against the Clusters of Orthologous Groups (COG) database (Tatusov 3 500 000 5-kb sequences from 3157 species, for À 5 which at least 10 kb of sequence was available from et al., 1997) with an E-value threshold of 1 Â 10 DDBJ/EMBL/GenBank. and were classified according to COG functional Mapping of metagenomic sequences longer than categories. We also identified putative virulence- 300 bp on Kingdom-BLSOMs, after normalization of associated factors using the MvirDB database (Zhou the sequence length, was conducted by finding the et al., 2007), which consists of sequence information lattice point with the minimum Euclidean distance in from multiple microbial databases of protein toxins, the multidimensional space. To identify further virulence factors and antibiotic resistance genes. detailed phylogenies of the metagenomic sequences Sequences associated with ‘antibiotic resistance’, that had been mapped to the bacterial/archaeal ‘pathogenicity islands’ and ‘virulence protein’ were territories on Kingdom-BLSOM, these sequences were further phylotyped using BLSOM as described successively mapped on Bacteria/Archaea-BLSOM. above. All sequence reads were used to identify Similar stepwise mappings of metagenomic putative virulence-associated factors in BLASTX- sequences on the Genus group-BLSOM were subse- based analyses, while only sequences longer than quently conducted to obtain further detailed phylo- 300 bp were phylotyped. genetic information. In order to evaluate the accuracy of BLSOMs for taxonomic classification of metage- Results nomic sequences, the BLSOM analysis was applied to three previously published simulated datasets Kingdom classification of varying complexities (Mavromatis et al., 2007). Bacterial/archaeal cells in tick homogenates were When sequences longer than 300 bp were mapped, purified using centrifugation, DNase treatment and approximately 90%, 70% and 40%–50% were filtration. DNA was extracted from bacteria/archaea-

The ISME Journal Tick microbiomes revealed by BLSOMs R Nakao et al 1007 enriched fractions and was subjected to pyrosequen- (HLfmn) of the reads were successfully assigned to cing after whole-genome amplification. The total specific phyla (Table 2). The number of different sequence reads for each species ranged from 21 213 phyla obtained from a single library ranged from 20 (HFfmn) to 42 258 (AVf), with average lengths (HLfmn)–28 (AVm, IOf and IPf). The composition of between 143.5 bp (IPm) and 391.4 bp (AVm) bacterial/archaeal phyla in each tick pool is shown (Table 2). The sequence reads longer than 300 bp in Figure 3. Each tick species had a different were classified as bacterial/archaeal, eukaryotic, microbial composition. Differences were also found , mitochondrial or chloroplast using King- between males and females within the same tick dom-BLSOM (Figure 2). The percentage of the reads species. Sequences classified into the phyla Firmi- categorized as bacteria/archaea ranged from 37.8% cutes and Gammaproteobacteria constituted nearly (AVm) to 72.3% (HLfmn), indicating that the present half of the total sequences in most tick species. protocol could efficiently enrich bacterial/archaeal Sequences classified into the phylum Alphaproteo- cells from tick homogenates. An average of 35% of bacteria were also observed in all tick species, while the eukaryotic reads were clustered together with those classified into the phylum Chlamydiae were those of other arthropods, which were included in observed at high abundance only in IPf, IPm and the Kingdom-BLSOM construction, while the HFfmn. remaining reads were with a variety of eukaryotes including protists and mammals (data not shown). Genus classification Between 85.4% (AVm) and 92.3% (ATn) of the Phylum classification sequences characterized into certain bacterial/ The bacterial/archaeal sequence reads were further archaeal phyla were successfully characterized to classified at the phylum level using Bacteria/ the genus level using Genus group-BLSOM Archaea-BLSOM. Between 97.2% (AVf) and 99.4% (Table 2). The number of different genera obtained

Table 2 Summary of pyrosequencing and BLSOM classifications

AVf AVm IOf IOm IPf IPm IRf ATn HFfmn HLfmn

Total no. of reads 42 258 27 582 37 667 35 544 37 136 33 789 39 810 21 639 21 213 24 249 Average read length (nt) 163.6 391.4 379.8 154.6 371.7 143.5 149.9 149.0 185.0 187.5 No. of long reads (Z300 bp) 8160 22 201 29 804 2962 28 618 4210 5438 2906 4502 5186 No. of reads assigned to bacteria/ 3951 8389 16 239 1620 18 711 2468 3782 1222 2946 3750 archaea (%)a (48.4) (37.8) (54.5) (54.7) (65.4) (58.6) (69.5) (42.1) (65.4) (72.3) No. of reads assigned to a bacterial/ 3841 8272 16 088 1598 18 556 2439 3748 1213 2925 3728 archaeal phyla (%)b (97.2) (98.6) (99.1) (98.6) (99.2) (98.8) (99.1) (99.3) (99.3) (99.4) No. of phyla obtained 27 28 28 23 28 25 22 27 24 20 No. of reads assigned to known bacterial/ 3419 7068 14 410 1410 16 566 2131 3398 1119 2605 3319 archaeal genera (%)c (89.0) (85.4) (89.6) (88.2) (89.3) (87.4) (90.7) (92.3) (89.1) (89.0) No. of reads assigned to candidate 71 (1.8) 62 (0.7) 700 (4.4) 66 (4.1) 669 (3.6) 98 (4.0) 76 (2.0) 13 (1.1) 89 (3.0) 75 (2.0) bacterial/archaeal genera (%) No. of bacterial/archaeal genera obtained 158 223 215 143 185 162 133 201 138 156 aCalculated using the formula: (No. of reads assigned to bacteria/archaea)/{No. of long reads (Z300 bp)} Â 100. bCalculated using the formula: (No. of reads assigned to any phyla)/(No. of reads assigned to bacteria/archaea) Â 100. cCalculated using the formula: (No. of reads assigned to any genera)/(No. of reads assigned to any phyla) Â 100.

Figure 2 Kingdom classification of the metagenomic sequences from each tick pool. Sequence reads longer than 300 bp were classified into bacteria/archaea, eukaryotes, viruses, mitochondria or chloroplasts using Kingdom-BLSOM. The percentage of sequences in each category is provided. The pool ID is shown in the centre of each pie chart.

The ISME Journal Tick microbiomes revealed by BLSOMs R Nakao et al 1008

Figure 3 Phylum classification of the metagenomic sequences from each tick pool. Sequence reads classified as bacterial/archaeal using Kingdom-BLSOM were further classified at the phylum level using Bacteria/Archaea-BLSOM. Thirty-three phyla are indicated by different colours. The pool ID is shown on the left of the graph.

from a single library ranged from 133 (IRf) to 223 Pan-Chlamydia PCR and sequencing analysis (AVm). The twenty most prevalent bacterial/ DNA extracted from five tick pools, IPf, IPm, HFf, archaeal genera within each tick species are listed HFm and HFn, was tested by pan-Chlamydia PCR in Table 3, while a complete list of bacterial/archaeal using three different primer sets. Only one primer genera recovered from each tick pool is shown in set (16S FOR2 and 16S REV2) produced PCR Supplementary Table 2. A total of 64 different products of the expected size (approximately genera were detected and some of the dominant 260 bp) in all tested samples (data not shown). bacterial/archaeal genera were commonly observed A total of 14 different clones of partial 16S rRNA between tick species. Some of the sequences could gene sequences were recovered, four of which were not be assigned to known taxa. For example, 4.7% of identified in both tick species (Table 4). The the sequences in AVm were characterized to the MOTHUR program assigned these 14 sequences into phylum Euryarchaeota, but could not be character- five OTUs, the representative sequences of which ized at the genus level (Table 3). In accordance with were predicted to belong to the genera Neochlamy- the results of phylum classification, many of the dia, Parachlamydia or Simkania based on RDP dominant bacteria/archaea belonged to genera database classification. BLASTn analysis of a repre- within the phyla and Gammaproteobac- sentative sequence for each OTU showed highest teria. The sequences of known tick-borne pathogens similarity (88%–98%) to Chlamydial 16S rRNA gene such as Anaplasma, Bartonella, Borrelia, Ehrlichia, sequences recovered from various animal sources Francisella and Rickettsia, as well as those of known including cockroach, sea bass, koala, cat and leafy tick-symbionts such as , and seadragon (Table 4). Wolbachia, were also listed as top 20 genera (Table 3). Only three pools (IPf, IPm and HFfmn) contained bacterial sequences classified into the Functional annotation family Chlamydiaceae (genera Chlamydia and Chla- Functional annotation of the bacterial/archaeal mydophila) as dominant taxa/populations. sequences using BLASTX against the COG database

The ISME Journal Tick microbiomes revealed by BLSOMs R Nakao et al 1009 (3.7) (2.7) (3.4) (2.5) (3.1) (2.5) (2.4) (3.0) (1.9) (2.2) (2.7) (1.8) (1.9) (1.7) (7.1) (1.5) (5.1) (1.5) (4.6) (4.3) indicated similar functional gene compositions between pools, except for AVm (Supplementary Figure 2). More than 80% of the AVm sequences were annotated as ‘Replication, recombination and repair’-related genes. This result may indicate that Coxiella Cyanothece Mycoplasma Streptococcus Rickettsia Staphylococcus Campylobacter Lactobacillus teobacteria) Nostoc Grimontia the sequences in this library were not randomly (4.2) (2.5) (3.5) (2.3) (3.3) (2.1) (3.0) (1.3) (1.5) (2.8) (10.5) (1.2) (8.5) U (Gammapro- (1.2) (6.7) (5.5) amplified during the whole-genome amplification process. Sequences associated with putative viru- lence-associated factors were identified using BLASTX against the MvirDB database. The number of identified sequences differed greatly between Campylobacter Lactobacillus Bacillus Sulfurihydrogenibium Clostridium Wolbachia Bartonella Thermoanaerobacter Chlamydophila Borrelia Cyanothece Streptococcus Sulfurimonas Chlamydia Helicobacter Prochlorococcus libraries: IOm showed the highest number (6715) and ATn the lowest (365) (Figure 4). Most of these (1.6) (1.2) (1.6) (1.2) U (Euryarchaeota) (2.2) (1.2) (1.5) (1.0) (1.2) (1.4) (1.0) U (Firmicutes) (1.2) (1.2) C (Chlamydiae) (1.4) (1.0) C () (1.2) (17.2) (1.4) (1.0) (10.0) (1.0) (2.1) (2.1) (1.8) were categorized as either ‘antibiotic resistance’, ‘’ or ‘virulence protein’-related genes. Both IOm and IPf contained a large number of sequences associated with ‘virulence protein’ teobacteria) Amycolatopsis Orientia Micromonospora Francisella Bordetella Streptococcus Bacillus Borrelia Methylobacillus teobacteria) Rickettsia Staphylococcus Synechococcus Wigglesworthia Clostridium (n ¼ 6515 and 2189, respectively), in comparison to other pools. Sequences associated with ‘pathogeni- (3.9) U (Alphapro- (2.5) (3.8) (2.4) (2.4) (3.0) (2.3) (2.3) (2.9) (1.9) (2.3) (1.7) (5.3) (2.7) U (Gammapro- (1.7) (5.3) (1.7) (4.7) (4.3) (4.1) city island’ were observed to some extent in all pools, with the highest number obtained from IPf (n ¼ 2072), followed by IOf, AVm and HLfmn. The origins of these virulence-associated sequences were Grimontia Thiothrix Bacillus Actinobacillus Coxiella Enterococcus Francisella Mycoplasma Salmonella Borrelia Rickettsia Acinetobacter Lactobacillus Clostridium Streptococcus Ureaplasma teobacteria) predicted using Bacteria/Archaea-BLSOM, and five (3.9) (2.4) (3.4) (2.3) (2.0) (3.1) (1.6) (1.6) (3.0) (1.5) (1.6) (1.4) (7.6) (5.4) (1.3) (5.0) (4.7) (4.4) U (Gammapro- of the most dominant origins of the sequences associated with ‘antibiotic resistance’, ‘pathogeni- city island’ and ‘virulence protein’ in each tick pool are shown in Table 5. The most common phyla were Firmicutes, , Gammaproteobac- Prochlorococcus Bacillus Mycoplasma Clostridium Thermoanaerobacter Borrelia teobacteria) Bartonella Chlamydia Francisella Lactobacillus Sulfurihydrogenibium Ureaplasma Chlamydophila Phytoplasma Streptococcus Campylobacter Helicobacter teria and Chlamydiae, while many of the sequences could not be phylotyped by BLSOM and remained (3.5) (2.4) (3.0) (2.3) (2.7) (1.8) U (Gammapro- (2.7) (1.2) (1.9) (2.6) U (Euryarchaeota) (2.8) (1.1) C (Chlamydiae) (1.3) (1.2) (8.8) (8.5) (1.1) (6.9) (5.9) (5.5) unclassified.

Discussion Prochlorococcus Sulfurihydrogenibium Borrelia Francisella Clostridium Bartonella Bacillus Sulfurimonas Wolbachia Thermoanaerobacter Alkaliphilus Lactobacillus Chlamydophila Streptococcus Lamprocystis Helicobacter Chlamydia Campylobacter The present study investigated the diversity of (3.1) (2.3) (3.1) (2.2) (2.7) (1.9) U (Euryarchaeota) (2.0) (1.4) (1.6) C (Chlamydiae) (2.0) (2.6) (1.4) (1.4) (2.4) (1.3) (11.0) (7.3) (1.3) (7.1) (3.1) (3.1) microbial communities in ticks using pyrosequen- cing technology coupled with a composition-based data processing approach called BLSOM. In order to provide an overview of microbiomes associated with Desulfovibrio Borrelia teobacteria) Thiothrix Streptococcus Ehrlichia Lactobacillus Staphylococcus Phytoplasma Listeria Campylobacter Francisella teobacteria) Ureaplasma Rickettsia Bacillus Mycoplasma Clostridium Helicobacter different tick species, ticks were pooled according to species and subjected to the analyses. We showed (3.4) (1.9) (2.9) U (Gammapro- (1.5) (1.4) (1.3) (2.0) (1.1) (1.2) (1.9) (1.1) C (Alphapro- (19.7) (1.2) (15.4) (8.2) (5.8) (4.0) that ticks harboured a variety of bacteria, including those previously associated with human/animal diseases such as Anaplasma, Bartonella, Borrelia, Ehrlichia, Francisella and Rickettsia, as well as Francisella Orientia Borrelia teobacteria) Streptococcus Wolbachia Campylobacter teobacteria) Bacillus Staphylococcus Phytoplasma Rickettsia Ureaplasma Mycoplasma Clostridium Ehrlichia potential pathogens such as Chlamydia. This is a first attempt to apply BLSOM to the detection of (3.1) (2.9) (2.5) C (Gammapro- (2.7) U (Euryarchaeota) (2.1) (2.3) (1.9) (2.6) (1.4) U (Gammapro- (1.7) (2.6) (1.4) (13.4) (1.6) (1.3) C (Bacteroidetes) (1.1) U (Euryarchaeota) (1.3) (5.1) (1.2) C (Tenericutes) (1.0) (3.6) (3.2) potential pathogens. Since this approach can be directly applied to other vector arthropods of medical and veterinary importance, it has great potential for the detection of diverse microbes, teobacteria) Arcanobacterium Hydrogenovibrio Actinobacillus Aeromonas teobacteria) Micromonospora Geobacter Streptococcus Salmonella Rickettsiella Clavibacter Methylobacterium Pseudoalteromonas Actinoalloteichus Treponema Synechococcus including as-yet unidentified microorganisms, and to pre-empt emerging infectious diseases. (3.7) U (Gammapro- (2.4) U () (2.6) (3.7) (3.4) (2.4) (2.1) U (Alphapro- (2.7) (2.0) (2.1) (2.5) (1.8) (8.6) (1.8) (6.2) (1.8) (4.0) U (Euryarchaeota) (4.7) (4.0) (4.0) Unlike the conventional sequence homology searches used in most metagenomic studies, BLSOM

Top 20 genera associated with the metagenomic sequences from each tick pool does not require orthologous sequences for phylo- genetic classification of metagenomic sequences. Grimontia Campylobacter Thiocystis Luteimonas Lactobacillus Thiothrix Dichelobacter Neisseria Salmonella Phytoplasma teobacteria) AVfBorrelia AVm IOf IOm IPf IPm IRf ATn HFfmn HLfmn Providencia Mycoplasma Vibrio Staphylococcus Actinobacillus Coxiella This is a great advantage, especially when applied 6 11 7 12 U (Tenericutes) (2.4) 8 13 14 9 17 15 10 18 U (Gammapro- 1 Table 3 Numbers in parentheses indicate the percentage. U, Unclassified; C, Candidatus. 16 U (Euryarchaeota) (2.0) 19 2 20 3 4 5 to a group composed of poorly characterized

The ISME Journal h SEJournal ISME The 1010

Table 4 Phylogenetic analysis of 16S rRNA gene sequences amplified by pan-Chlamydia PCR

Operational Sequence a a taxonomic ID Clones assigned to each OTU RDP result BLASTn search result BLSOMs by revealed microbiomes Tick unit (OTU) IPf IPm HFf HFm HFn Family (%)b Genus (%)b Identity (%)c Source Accession no.

11–1d Yes Yes Yes Simkaniaceae (80%) Simkania (80%) Rhabdochlamydia crassificans strain Cockroach (Blatta AY928092 CRIB01 (98%) orientalis) 1–2 Yes 1–3 Yes Nakao R 1–4 Yes 1–5 Yes 22–1d Yes Yes Yes Yes Parachlamydiaceae Neochlamydia (22%) Uncultured Chlamydiae bacterium Sea bass (Dicen- FJ376381

(53%) clone UFC3 (90%) trarchus labrax) al et 2–2 Yes Yes 2–3 Yes 2–4 Yes 33–1d Yes Parachlamydiaceae Neochlamydia (73%) Uncultured Chlamydiales bacterium Koala (Phascolarctos AY167120 (89%) isolate UKC7 (90%) cinereus) 3–2 Yes Yes 44–1d Yes Parachlamydiaceae Neochlamydia (67%) Uncultured clinical Neochlamydia Cat (Felis silvestris AY225593 (95%) sp. clone WB13 (92%) catus) 4–2 Yes 55–1d Yes Parachlamydiaceae Parachlamydia (74%) Uncultured Chlamydiales CRG20 Leafy seadragon (Phy- AY013396 (76%) (88%) codurus eques)

aAnalysis was conducted using a representative sequence for each OTU. bPercentages indicate bootstrap values. cPercentages indicate the similarity between two sequences. dRepresentative sequence. Tick microbiomes revealed by BLSOMs R Nakao et al 1011

Figure 4 Sequences associated with virulence-associated fac- tors. Sequences associated with putative virulence-associated factors were identified by BLASTX against the MvirDB database (Zhou et al., 2007), applying a cut-off value of 1e-5. The resulting data are shown in seven categories denoted by different colours. The numbers at the bottom of the graph indicate the number of sequences. The numbers in parentheses indicate the percentage of total sequence reads. The pool ID is shown on the left of the graph. microorganisms (Abe et al., 2005). In addition, ticks have highly variable genome sizes, ranging from nearly one third to over two times the size of the human genome, but only one whole-genome sequencing project is underway (Nene, 2009). This indicates that alignment-based techniques may encounter difficulties in sorting out tick-derived sequences in the metagenomic libraries. On the other hand, BLSOM can separate bacterial/archaeal sequences from those of eukaryotes with high accuracy (Supplementary Figure S1) and thus have great potential for detecting symbiotic and commen- sal microorganisms in eukaryotic hosts with scarce or no genomic information. Despite the fact that Kingdom-BLSOM did not include tick genome sequences as these were not available at the time of construction, an average of 35% of the eukaryotic sequences were clustered together with those of other arthropods and thus likely to be derived from tick cells. Once it becomes available, the inclusion of a tick genome in BLSOM construction would facilitate the identification of eukaryotic reads as being tick-derived. In a previous pyrosequencing study, in which the whole bodies of I. ricinus were directly subjected to DNA and RNA extraction, only 0.095% of the genomic DNA and 0.30% of the cDNA-derived sequences were assigned to known bacterial taxa (Carpi et al., 2011). Although this shotgun strategy seems ideal for capturing a snapshot of a microbial community with minimum bias introduced during sample preparation, there is a concern that genomic data for bacteria/archaea present in very low numbers may be masked by the huge amount of host-genome sequences. To address this, our strategy included a process of microbial enrichment using centrifugation, DNase treatment and filtration, to

The ISME Journal Tick microbiomes revealed by BLSOMs R Nakao et al 1012 enrich bacterial/archaeal cells from tick homoge- research is warranted to improve the understanding nates prior to pyrosequencing. The microbial pro- of their biology, especially with respect to their files obtained in this study, therefore, may not pathogenic potential for humans and animals. completely reflect the whole picture regarding Genus group-BLSOM identified 133–223 different microbial composition in the tested ticks due to genera from each single tick pool (Table 2). The the biases introduced during the microbial presence of a variety of bacterial/archaeal genera in enrichment and whole-genome amplification pro- ticks was demonstrated in previous metagenomic cesses, especially in the case of AVm, where the studies; in R. microplus and I. ricinus, 120 and 108 sequences in the library were not likely to be different genera were identified, respectively, using randomly amplified (Supplementary Figure 2). a rDNA-based amplicon pyrosequencing approach Nonetheless, the fact that nearly half of the sequence (Andreotti et al., 2011; Carpi et al., 2011). This reads were identified as bacterial/archaeal, and were discrepancy may be explained by the fact that these associated with a diverse range of bacterial/archaeal two approaches rely on different analytic principles. phyla, indicated sufficient coverage by our method One possible explanation is that BLSOM might have to obtain an overview of tick microbiomes. misclassified genome segments introduced through Sequences of genera containing several tick-borne horizontal gene transfer (HGT) (Abe et al., 2005). pathogens, such as Anaplasma, Borrelia and Rick- Tamames and Moya (2008) reported evidence of ettsia, were present in the tested samples (Table 3 HGT in 0.8%–1.5% and 2%–8% of the sequences in and Supplementary Table 2). Further identification environmental metagenomic libraries using phylo- to the species level can be achieved by the use of genetic and composition-based methods, respec- traditional methods such as conventional species- tively, indicating that the present approach may specific PCR. This can be done to confirm the require further refinement; for example, by combin- presence of tick-borne bacterial species such as A. ing BLSOM with rDNA-based phylogenetic classifi- phagocytophilum, B. burgdorferi s.l., R. japonica cation as proposed elsewhere (Abe et al., 2005). It is and R. tamurae, which have previously been also possible that the Genus group-BLSOM has detected in the tested tick species (Morita et al., overestimated the microbial diversity in the tested 1990; Miyamoto et al., 1992; Hofhuis et al., 2006; samples as a result of a decreased classification Murase et al., 2011). The data obtained in the specificity on the genus level (Supplementary present study, however, are useful for broadening Figure S1). This drawback can be remedied by using our understanding of phylogenetic diversity among only longer sequences to improve the specificity of bacterial genera, and the extent of their habitats. For the analysis or by using more genome sequences example, the genera Chlamydia and Chlamydophila associated with the habitat under study during were predicted to be harboured by IPf, IPm and BLSOM construction (Abe et al., 2005; Weber HFfmn in high abundance (Table 3). Further et al., 2011). pan-Chlamydia PCR and sequencing analysis sup- The present study included the ticks maintained ported the existence of microorganisms possibly under laboratory conditions. The microbial diversity related to the genera Neochlamydia, Parachlamydia of those ticks was comparable to that of those and Simkania (Table 4). Some representatives of collected in the field in terms of the number of these genera are implicated in diseases affecting different genus identified (Table 2). Environmental their host animals (von Bomhard et al., 2003; Meijer factors and developmental stages were reported to et al., 2006; Corsaro et al., 2007). Since only the affect microbial communities in ticks (Clay et al., sequences of the genera Chlamydia and Chlamydo- 2008; van Overbeek et al., 2008; Andreotti et al., phila were available from the phylum Chlamydiae at 2011; Carpi et al., 2011). Therefore, a further the time of BLSOM construction, those of the genera comparison of microbial diversity between labora- Neochlamydia, Parachlamydia and Simkania were tory colony and field ticks of the same species may not included in Genus group-BLSOM, so sequences be useful to elucidate environmental factors playing from the phylum Chlamydiae were allocated to key roles on microbial compositions. IO and IP were either genus Chlamydia or Chlamydophila. This sampled at the same site and time and have a technical limitation will be overcome as sequence common host preference for sika deer (Cervus data from a wider range of microorganisms become nippon) in the sampling area (Isogai et al., 1996), available. Although possible transmission of Chla- allowing a comparison of microbial compositions mydial bacteria to and humans through ticks between these two tick species. IO harboured was reported in several previous studies (Caldwell Rickettsia and Ehrlichia, but IP harboured Barto- and Belden, 1973; McKercher et al., 1980; Facco nella, at high abundance, despite the presence of et al., 1992), there has been no molecular genetic Borrelia and Francisella in both tick species at evidence to support these observational studies. similar abundance (Table 3). These findings may be Thus, this is the first molecular genetic evidence useful for improving the understanding of the vector supporting the presence of Chlamydial organisms in potential of different tick species. In addition to ticks. Given the wide genetic diversity and host these species differences, microbial communities range, including several arthropods, of members of are expected to vary between individual ticks as this phylum of bacteria (Horn, 2008), further reported elsewhere (Andreotti et al., 2011; Carpi

The ISME Journal Tick microbiomes revealed by BLSOMs R Nakao et al 1013 et al., 2011). One aspect to be addressed, which may the unclassified sequences were introduced through help to explain these differences, is the interaction ancient HGT, and lost their oligonucleotide fre- between microbial lineages. Indeed, antagonistic quency in the recipient genome (either tick or other interactions between microbial lineages in ticks, microbes). Indeed, the oligonucleotide composition including pathogenic Rickettsia, were demonstrated of sequences introduced through ancient HGT drift previously (Macaluso et al., 2002; de la Fuente et al., over time (Brown, 2003), makes it difficult to 2003). Thus, one of the most interesting challenges correctly phylotype those sequences using composi- for future studies on tick microbiomes might be the tion-based methods such as BLSOM. Furthermore, association between microbial composition in ticks numerous reports have associated bacterial viru- and the potential for transmission. The lence factors with HGT (Pallen and Wren, 2007; Ho pictures of whole microbiomes harboured by each Sui et al., 2009; Juhas et al., 2009). Another tick species generated in the present study provide explanation is that these sequences simply origi- the foundation for a statistical correlation analysis nated from as-yet unclassified microorganisms. between pathogens and certain microbial lineages in Regardless, the detection of virulence-associated individual ticks. Such studies could lead to the factors in ticks may suggest either the existence or identification of microbiological factors affecting possible emergence of tick-borne pathogens and, the prevalence and persistence of pathogenic thus, warrants further investigation to assess poten- lineages in ticks. tial risks to human and animal health. The sequences associated with putative virulence- In conclusion, these results provide a foundation associated factors were identified from metagenomic for the construction of a database of tick microbes, libraries using BLASTX-based methods (Figure 4). which may aid in the prediction of emerging It should be noted, however, that some of the tick-borne diseases. The present approach can be factors, such as type III secretion systems, are extended to detect or predict emerging pathogens in responsible for both bacterial pathogenesis and other vector arthropods such as mosquitoes and , including mutualism and flies. A comprehensive understanding of tick micro- (Coombes, 2009), and thus not all factors are directly biomes will also be useful for understanding tick or necessarily related to human/animal pathogens. biology, including vector competence and interac- The present study predominantly identified se- tions with pathogens. quences related to antibiotic resistance (Figure 4). It has been recognized for some time that antibiotic Acknowledgements resistance genes are not only associated with pathogens but also with a wide variety of environ- We thank everyone who helped with tick collection, mental microbes (Allen et al., 2010). Several lines of especially Nariaki Nonaka, Jacqueline Schipper, Joseph W. evidence indicate that some of the clinically relevant Magona, Joseph M. Kamau, Noboru Sasaki and Noriko resistance mechanisms in pathogenic organisms Nakao. The first author was supported by a research grant originated from such environmental bacteria through fellowship from the Japanese Society for the Promotion of Science (JSPS) for young scientists. This work was HGT (Wright, 2010). Direct contact between patho- supported by a Grant-in-Aid for JSPS fellows and for gens and resistance genes in commensals harboured Scientific Research from the Ministry of Education, Culture, by ticks could therefore represent a mechanism for Sports, Science and Technology of Japan (MEXT), the the emergence of resistant pathogens. program of Funding Research Center for Emerging and Re- Several studies described mutualistic relation- emerging Infectious Disease, MEXT. The BLSOM calcula- ships between microbes and host arthropods tion was done, in part, by the Earth Simulator of the Japan (Currie et al., 1999; Kaltenpoth et al., 2005; Scott Agency for Marine-Earth Science and Technology. et al., 2008). It is plausible that the tick microbiome also affects aspects of tick physiology. This was for example demonstrated by a reduction in reproduc- tive fitness of Amblyomma americanum females References following antibiotic treatment, suggesting that the Abe T, Kanaya S, Kinouchi M, Ichiba Y, Kozuki T, Ikemura T. microbe plays a role in tick growth and development (2003). Informatics for unveiling hidden genome signa- (Zhong et al., 2007). Further studies should consider tures. Genome Res 13: 693–702. the potential contributions of tick microbiomes to Abe T, Sugawara H, Kinouchi M, Kanaya S, Ikemura T. the survival of ticks in the environment; this could (2005). Novel phylogenetic studies of genomic provide novel targets for the control of ticks and sequence fragments derived from uncultured microbe tick-borne pathogens. mixtures in environmental and clinical samples. DNA BLSOM was used to estimate the phylogenetic Res 12: 281–290. origins of the sequences associated with putative Allen HK, Donato J, Wang HH, Cloud-Hansen KA, Davies J, Handelsman J. (2010). Call of the wild: antibiotic virulence-associated factors; however, many of these resistance genes in natural environments. Nat Rev remained unclassified (Table 5). This may indicate Microbiol 8: 251–259. that those sequences had retained their functions, Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. but lost sequence characteristics such as oligonu- (1990). Basic local alignment search tool. J Mol Biol cleotide frequency. One possible explanation is that 215: 403–410.

The ISME Journal Tick microbiomes revealed by BLSOMs R Nakao et al 1014 Andreotti R, Pe´rez de Leo´n AA, Dowd SE, Guerrero FD, chrysanthemi 3937) is a pathogen of the pea aphid. Bendele KG, Scoles GA. (2011). Assessment of Appl Environ Microbiol 72: 1956–1965. bacterial diversity in the cattle tick Rhipicephalus Ho Sui SJ, Fedynak A, Hsiao WW, Langille MG, Brinkman (Boophilus) microplus through tag-encoded pyrose- FS. (2009). The association of virulence factors with quencing. BMC Microbiol 11:6. genomic islands. PLoS One 4: e8094. Bowman AS, Nuttall PA. (2008). Ticks: Biology, Disease Hofhuis A, van der Giessen JWB, Borgsteede FHM, and Control. Cambridge University Press: Cambridge. Wielinga PR, Notermans DW, van Pelt W. (2006). Benson MJ, Gawronski JD, Eveleigh DE, Benson DR. Lyme borreliosis in the Netherlands: strong increase in (2004). Intracellular symbionts and other bacteria GP consultations and hospital admissions in past associated with deer ticks (Ixodes scapularis) from 10 years. Euro Surveillance: European Communicable Nantucket and Wellfleet, Cape Cod, Massachusetts. Disease Bulletin 11: E060622.2. Appl Environ Microbiol 70: 616–620. Horn M. (2008). Chlamydiae as symbionts in eukaryotes. Beugnet F, Marie´ JL. (2009). Emerging arthropod-borne Annu Rev Microbiol 62: 113–131. diseases of companion animals in Europe. Vet Para- Imaoka K, Kaneko S, Tabara K, Kusatake K, Morita E. sitol 163: 298–305. (2011). The First Human Case of Rickettsia tamurae Brown JR. (2003). Ancient horizontal gene transfer. Nat Infection in Japan. Case Rep Dermatol 3: 68–73. Rev Genet 4: 121–132. Isogai E, Isogai H, Masuzawa T, Postic D, Baranton G, Caldwell HD, Belden EL. (1973). Studies of the role Kamewaka Y et al. (1996). Borrelia burgdorferi of Dermacentor occidentalis in the transmission of sensu lato in an endemic environment: wild sika deer bovine chlamydial abortion. Infect Immun 7: 147–151. (Cervus nippon yesoensis) with infected ticks and Carpi G, Cagnacci F, Wittekindt NE, Zhao F, Qi J, Tomsho antibodies. Microbiol Immunol 40: 13–19. LP et al. (2011). Metagenomic profile of the bacterial Juhas M, van der Meer JR, Gaillard M, Harding RM, Hood communities associated with Ixodes ricinus ticks. DW, Crook DW. (2009). Genomic islands: tools of PLoS One 6: e25604. bacterial horizontal gene transfer and evolution. FEMS Chan C-KK, Hsu AL, Tang S-L, Halgamuge SK. (2008). Microbiol Rev 33: 376–393. Using growing self-organising maps to improve the Kaltenpoth M, Go¨ttler W, Herzner G, Strohm E. (2005). binning process in environmental whole-genome Symbiotic bacteria protect wasp larvae from fungal shotgun sequencing. J Biomed Biotechnol 2008: infestation. Curr Biol 15: 475–479. Doi:10.1155/2008/513701. Kanaya S, Kinouchi M, Abe T, Kudo Y, Yamada Y, Nishi T Clay K, Klyachko O, Grindle N, Civitello D, Oleske D, et al. (2001). Analysis of codon usage diversity of Fuqua C. (2008). Microbial communities and interac- bacterial genes with a self-organizing map (SOM): tions in the lone star tick, Amblyomma americanum. characterization of horizontally transferred genes Mol Ecol 17: 4371–4381. with emphasis on the E. coli O157 genome. Gene Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ et al. 276: 89–99. (2009). The Ribosomal Database Project: improved Kohonen T. (1990). The self-organizing map. Proceedings alignments and new tools for rRNA analysis. Nucleic of the IEEE pp 1464–1480. Acids Res 37: D141–D145. Larkin MA, Blackshields G, Brown NP, Chenna R, Coombes BK. (2009). Type III secretion systems in McGettigan PA, McWilliam H et al. (2007). Clustal W symbiotic adaptation of pathogenic and non-patho- and Clustal X version 2.0. Bioinformatics 23: genic bacteria. Trends Microbiol 17: 89–94. 2947–2948. Corsaro D, Thomas V, Goy G, Venditti D, Radek R, Lewis D. (1979). The detection of rickettsia-like micro- Greub G. (2007). ‘Candidatus Rhabdochlamydia organisms within the ovaries of female Ixodes ricinus crassificans’, an intracellular bacterial pathogen of ticks. Z Parasitenkd 59: 295–298. the cockroach Blatta orientalis (Insecta: Blattodea). Macaluso KR, Sonenshine DE, Ceraul SM, Azad AF. Syst Appl Microbiol 30: 221–228. (2002). Rickettsial infection in Dermacentor variabilis Cox-Foster DL, Conlan S, Holmes EC, Palacios G, Evans (Acari: ) inhibits transovarial transmission of JD, Moran NA et al. (2007). A metagenomic survey of a second Rickettsia. J Med Entomol 39: 809–813. microbes in honey bee colony collapse disorder. Martin C, Diaz NN, Ontrup J, Nattkemper TW. (2008). Science 318: 283–287. Hyperbolic SOM-based clustering of DNA fragment Currie CR, Mueller UG, Malloch D. (1999). The agricul- features for taxonomic visualization and classifica- tural pathology of ant fungus gardens. Proc Natl Acad tion. Bioinformatics 24: 1568–1574. Sci USA 96: 7998–8002. Mavromatis K, Ivanova N, Barry K, Shapio H, Goltsman E, de la Fuente J, Blouin EF, Kocan KM. (2003). Infection McHardy AC et al. (2007). Use of simulated data sets exclusion of the rickettsial pathogen Anaplasma to evaluate the fidelity of metagenomic processing marginale in the tick vector Dermacentor variabilis. methods. Nat Methods 4: 495–500. Clin Diagn Lab Immunol 10: 182–184. McKercher DG, Wada EM, Ault SK, Theis JH. (1980). Dick GJ, Andersson AF, Baker BJ, Simmons SL, Thomas BC, Preliminary studies on transmission of Chlamydia to Yelton AP et al. (2009). Community-wide analysis of micro- cattle by ticks (Ornithodoros coriaceus). Am J Vet Res bial genome sequence signatures. Genome Biol 10:R85. 41: 922–924. Ezaki T, Suzuki S. (1982). Achromopeptidase for lysis of Meijer A, Roholl PJ, Ossewaarde JM, Jones B, Nowak BF. anaerobic gram-positive cocci. J Clin Microbiol 16: (2006). Molecular evidence for association of 844–846. chlamydiales bacteria with epitheliocystis in leafy Facco F, Grazi G, Bonassi S, Magnani M, Di Pietro P. seadragon (Phycodurus eques), silver perch (Bidyanus (1992). Chlamydial and rickettsial transmission bidyanus), and barramundi (Lates calcarifer). Appl through tick bite in children. Lancet 339: 992–993. Environ Microbiol 72: 284–290. Grenier AM, Duport G, Page`s S, Condemine G, Rahbe´ Y. Miyamoto K, Nakao M, Uchikawa K, Fujita H. (1992). (2006). The phytopathogen dadantii ( Prevalence of Lyme borreliosis spirochetes in ixodid

The ISME Journal Tick microbiomes revealed by BLSOMs R Nakao et al 1015 ticks of Japan, with special reference to a new microbiome with high plant biomass-degrading capa- potential vector, Ixodes ovatus (Acari: Ixodidae). city. PLoS Genet 6: e1001129. J Med Entomol 29: 216–220. Tamames J, Moya A. (2008). Estimating the extent of Morita C, Yamamoto S, Tsuchiya K, Yoshida Y, Yabe T, horizontal gene transfer in metagenomic sequences. Kawabata N et al. (1990). Prevalence of spotted fever BMC 9: 136. group rickettsia antibody in Apodemus speciosus Tatusov RL, Koonin EV, Lipman DJ. (1997). A genomic captured in an endemic focus in Miyazaki Prefecture, perspective on protein families. Science 278: 631–637. Japan. Jpn J Med Sci Biol 43: 15–18. Tsuchida T, Koga R, Horikawa M, Tsunoda T, Maoka T, Murase Y, Konnai S, Hidano A, Githaka NW, Ito T, Takano Matsumoto S et al. (2010). Symbiotic bacterium A et al. (2011). Molecular detection of Anaplasma modifies aphid body color. Science 330: 1102–1104. phagocytophilum in cattle and Ixodes persulcatus Uehara H, Iwasaki Y, Wada C, Ikemura T, Abe T. (2011). A ticks. Vet Microbiol 149: 504–507. novel bioinformatics strategy for searching indust- Nakabachi A, Ishikawa H. (1999). Provision of riboflavin rially useful genome resources from metagenomic to the host aphid, Acyrthosiphon pisum, by endosym- sequence libraries. Genes Genet Syst 86: 53–66. biotic bacteria, Buchnera. J Insect Physiol 45: 1–6. van Overbeek L, Gassner F, van der Plas CL, Kastelein P, Nene V. (2009). Tick genomics–coming of age. Front Biosci Nunes-da Rocha U, Takken W. (2008). Diversity of 14: 2666–2673. Ixodes ricinus tick-associated bacterial communities Noda H, Munderloh UG, Kurtti TJ. (1997). from different forests. FEMS Microbiol Ecol 66: 72–84. of ticks and their relationship to Wolbachia spp. von Bomhard W, Polkinghorne A, Lu ZH, Vaughan L, and tick-borne pathogens of humans and animals. Vo¨gtlin A, Zimmermann DR et al. (2003). Detection of Appl Environ Microbiol 63: 3926–3932. novel chlamydiae in cats with ocular disease. Am J Vet Pallen MJ, Wren BW. (2007). Bacterial pathogenomics. Res 64: 1421–1428. Nature 449: 835–842. Warnecke F, Luginbu¨ hl P, Ivanova N, Ghassemian M, Parola P. (2004). Tick-borne rickettsial diseases: emerging Richardson TH, Stege JT et al. (2007). Metagenomic risks in Europe. Comp Immunol Microbiol Infect Dis and functional analysis of hindgut microbiota of a 27: 297–304. wood-feeding higher termite. Nature 450: 560–565. Parola P, Davoust B, Raoult D. (2005). Tick- and -borne Weber M, Teeling H, Huang S, Waldmann J, Kassabgy M, rickettsial emerging zoonoses. Vet Res 36: 469–492. Fuchs BM et al. (2011). Practical application of Roshdy MA. (1968). A rickettsialike microorganism in the self-organizing maps to interrelate biodiversity and tick Ornithodoros savignyi: Observations on its struc- functional data in NGS-based metagenomics. ISME J 5: ture and distribution in the tissues of the tick. 918–928. J Invertebr Pathol 11: 155–169. Weiss B, Aksoy S. (2011). Microbiome influences on Scarborough CL, Ferrari J, Godfray HC. (2005). Aphid insect host vector competence. Trends Parasitol 27: protected from pathogen by . Science 514–522. 310: 1781. Wright GD. (2010). Antibiotic resistance in the environ- Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, ment: a link to the clinic? Curr Opin Microbiol 13: Hollister EB et al. (2009). Introducing mothur: open- 589–594. source, platform-independent, community-supported Yu XJ, Liang MF, Zhang SY, Liu Y, Li JD, Sun YL et al. software for describing and comparing microbial (2011). Fever with thrombocytopenia associated communities. Appl Environ Microbiol 75: 7537–7541. with a novel bunyavirus in China. N Engl J Med 364: Scott JJ, Oh DC, Yuceer MC, Klepzig KD, Clardy J, Currie 1523–1532. CR. (2008). Bacterial protection of beetle-fungus Zhong J, Jasinskas A, Barbour AG. (2007). Antibiotic mutualism. Science 322: 63. treatment of the tick vector Amblyomma americanum Spolidorio MG, Labruna MB, Mantovani E, Brandao PE, reduced reproductive fitness. PLoS One 2: e405. Richtzenhain LJ, Yoshinari NH. (2010). Novel spotted Zhou CE, Smith J, Lam M, Zemla A, Dyer MD, Slezak T. fever group , Brazil. Emerg Infect Dis 16: (2007). MvirDB–a microbial database of protein toxins, 521–523. virulence factors and antibiotic resistance genes Suen G, Scott JJ, Aylward FO, Adams SM, Tringe SG, for bio-defence applications. Nucleic Acids Res 35: Pinto-Toma´sAAet al. (2010). An insect herbivore D391–D394.

Supplementary Information accompanies the paper on The ISME Journal website (http://www.nature.com/ ismej)

The ISME Journal