JOURNAL OF VIROLOGY, Dec. 2010, p. 13004–13018 Vol. 84, No. 24 0022-538X/10/$12.00 doi:10.1128/JVI.01255-10 Copyright © 2010, American Society for Microbiology. All Rights Reserved.

Metagenomic Analysis of the Viromes of Three North American Species: Viral Diversity among Different Bat Species That Share a Common Habitatᰔ Eric F. Donaldson,1†* Aimee N. Haskew,2 J. Edward Gates,2† Jeremy Huynh,1 Clea J. Moore,3 and Matthew B. Frieman4† Department of Epidemiology, University of North Carolina, Chapel Hill, North Carolina 275991; University of Maryland Center for Environmental Science, Appalachian Laboratory, Frostburg, Maryland 215322; Department of Biological Sciences, Oakwood University, Huntsville, Alabama 358963; and Department of Microbiology and Immunology, University of Maryland at Baltimore, Baltimore, Maryland 212014

Received 11 June 2010/Accepted 24 September 2010

Effective prediction of future viral zoonoses requires an in-depth understanding of the heterologous viral population in key species that will likely serve as reservoir hosts or intermediates during the next viral epidemic. The importance of as natural hosts for several important viral zoonoses, including Ebola, Marburg, Nipah, Hendra, and and severe acute respiratory syndrome-coronavirus (SARS-CoV), has been established; however, the large viral population diversity (virome) of bats has been partially deter- mined for only a few of the ϳ1,200 bat species. To assess the virome of North American bats, we collected fecal, oral, urine, and tissue samples from individual bats captured at an abandoned railroad tunnel in Maryland that is cohabitated by 7 to 10 different bat species. Here, we present preliminary characterization of the virome of three common North American bat species, including big brown bats (Eptesicus fuscus), tricolored bats (Perimyotis subflavus), and little brown myotis (Myotis lucifugus). In samples derived from these bats, we identified viral sequences that were similar to at least three novel group 1 CoVs, large numbers of insect and plant sequences, and nearly full-length genomic sequences of two novel bacteriophages. These observa- tions suggest that bats encounter and disseminate a large assortment of viruses capable of infecting many different , insects, and plants in nature.

The sum total of all viruses found in a species is called its populations. Several recent human epidemics have resulted virome (4), and in recent years, next-generation sequencing from viruses that originated in zoonotic reservoirs, including methodologies have been employed in a number of meta- the following: human immunodeficiency virus (HIV), which genomics studies designed to assess the viromes of different probably emerged from chimpanzees (32, 39, 54); H1N1 of animal and plant species (41, 45, 47, 53, 67, 80, 81, 88, 89). In 2009/2010, which likely originated in pigs and birds (5, 31, 71); several cases, novel viruses were detected in many types of and the severe acute respiratory syndrome coronavirus (SARS- samples, including human blood (10), human diarrhea (29, 30, CoV), which likely originated from the Chinese 36, 46, 58), human respiratory secretions (2, 3, 33, 88), equine (43, 64). Further studies assessing the worldwide distribution feces (15), grapevines (20), and, most recently, bat guano (47). of bat coronavirus (BtCoV) has identified several novel CoVs These studies have demonstrated that numerous viruses are in a variety of bat species (9, 16, 18, 26, 34, 43, 44, 50, 55, 57, naturally present in a given species and provide further evi- 64, 74, 77, 91, 95). In fact, BtCoV sequences have greatly dence that many viral epidemics occur as the result of a cross- expanded CoV phylogeny from the three traditional groups species transmission of viruses from one species to another. known as group 1 (alphacoronaviruses; including human CoVs Monitoring of animals harvested as bushmeat in sub-Sa- such as HCoV-229E and HCoV-NL63), group 2 (betacorona- haran Africa has revealed that zoonotic viruses frequently spill viruses; including human CoVs such as SARS-CoV and over into the human populations that hunt, butcher, or manage HCoV-OC43), and group 3 (gammacoronaviruses; including the animals; and these results have provided significant insights avian CoVs). into origins, geographic distributions, temperate limitations, In addition to CoVs, 12 of the 28 (ϳ43%) emerging viruses and emergence potential of several important human patho- included on the NIAID list of emerging and reemerging patho- gens (90). These studies, and others like them, have demon- gens in biodefense categories A to C (http://www.niaid.nih.gov strated that viruses can be relatively benign in a given host but /topics/emerging/pages/list.aspx) have been found in samples have devastating consequences when they emerge into human derived from multiple species of bats (13). Moreover, several other virus genera have been identified in bat samples, includ- ing coronaviruses, , lyssaviruses, rubulaviruses, fla- * Corresponding author. Mailing address: Department of Epidemi- viviruses, bunyaviruses, polyomaviruses, phleboviruses, oribivi- ology, University of North Carolina, Chapel Hill, NC 27599. Phone: ruses, and orthoreoviruses, as well as uncharacterized viruses (919) 966-3881. Fax: (919) 966-0584. E-mail: eric_donaldson@med Arenaviridae Picornaviridae .unc.edu. of the , , and families † E.F.D., J.E.G., and M.B.F. contributed equally to the study design. (13). Given the depth of viral richness observed in bats, sur- ᰔ Published ahead of print on 6 October 2010. prisingly little research has been conducted to determine which

13004 VOL. 84, 2010 METAGENOMIC ANALYSIS OF NORTH AMERICAN BATS 13005

viruses are specific to bats, which viruses persistently infect in paper sacks secured to a rope line for 10 to 15 min, allowing enough time for different bat species, and which viruses are trafficked from one the excretion of fresh fecal boluses. population to another with eventual dissemination to different Sterile forceps were used to retrieve fecal boluses from the paper sacks, and these were placed into tubes containing 1 ml of universal transport medium species or reservoir hosts (12, 56, 75, 96). The one bat virome (UTM; Copan Diagnostics, Inc., Murrieta, CA). We used miniature flocked study published to date focused on bat guano obtained from a swabs to acquire oral samples by placing the swab into the mouth cavity of each mixture of bat species in the southern United States (47). bat and gently rubbing the mucosa. Oral swabs containing bat saliva were im- Bats comprise the second largest order of , with mediately placed in 2 ml of UTM (Copan Diagnostics, Inc., Murrieta, CA) (26). All vials were labeled with a unique identifier for each bat, based upon the date approximately 1,200 species making up nearly 25% of the (e.g., 12/Aug/2010 or Aug/12/2010), location, bat species, sex, reproductive con- ϳ5,000 species in the class Mammalia (78, 79). Bats are further dition, and age. Samples were stored in a cooler on ice while in the field and classified in the order Chiroptera, which is subdivided into two placed in storage at 4°C at the University of Maryland Center for Environmental suborders: the Yinpterochiroptera (formerly Megachiroptera) Science (UMCES) Appalachian Laboratory. Each captured bat was also assessed which contains the , and the (for- to determine species, weight (g), forearm length (mm), wing score, sex, repro- ductive state, and age (49, 60). All captured bats were marked by hair clipping to merly Microchiroptera), which includes the majority of micro- facilitate identification of recaptures. Once the data were collected, the bat was bat families (13, 78, 79). Bat species occur on every continent, released. with the exception of Antarctica, and some species Precautions taken to prevent the spread of white nose syndrome. Since white- migrate up to 1,287 km. In addition, the are insec- nose syndrome (WNS), a deadly malady of hibernating bats, had been reported in hibernacula in nearby Pennsylvania, West Virginia, and Virginia, the sampling tivorous, exhibit exceptionally long life spans of 25 to 35 years, crew followed standard disinfection protocols based on recommendations put and live in panmictic populations often comprised of millions forth by the U.S. Fish and Wildlife Service in April 2008. The harp traps, holding of bats in densely packaged aggregates of greater than 300 bats bags, nets, and boots were sterilized at the end of each night with a 10% bleach per square foot (13). solution. In addition, holding bags, boots, and leather gloves were also washed at In eastern North America, the most common microbats are the end of each week. Latex gloves were worn when bats were handled, and a new pair was used for each individual bat. All objects that each bat came in from the family , genus Myotis, although big contact with were disposed of or immediately sterilized with a 10% bleach brown bats (Eptesicus fuscus) are the most commonly captured solution. A wing damage index was used to assess the wing condition of bats species during trapping and hibernacula surveys in Maryland possibly affected by WNS (63). The presence of scar or necrotic tissue, flaking (61). In addition, tricolored bats (Perimyotis subflavus) (61) and/or dehydrated skin, and wing tears or other abnormalities was recorded and and little brown myotis (Myotis lucifugus) are frequently cap- photographed. Sample processing. The fecal samples were vortexed to completely resuspend tured from tunnels and caves during hibernaculum surveys in the fecal material into solution, and 1 ml of Hank’s balanced salt solution (HBSS; the state (6). Samples for this study were obtained by capturing Gibco, Auckland, New Zealand) was added to each sample, and samples were individual bats at Indigo tunnel, which is an abandoned rail- further vortexed to create a less viscous solution. The samples were then cen- ϫ road tunnel located within the Ridge and Valley physiographic trifuged at 10,000 g using a benchtop microcentrifuge for 2 min, and super- natants were transferred to a fresh tube. This step was repeated twice. Samples province in the Chesapeake and Ohio Canal National Histor- were then pooled by species, sex, age, or sample type by adding 1 ml of each ical Park near Little Orleans, Allegany County, MD. The tun- sample into the appropriately pooled sample tube. The remaining supernatants nel is cohabitated by 7 to 10 different bat species. In 2009, for each sample were divided into two tubes and stored at Ϫ80°C. The six pooled samples were collected from seven different bat species during samples were then placed in a 10-ml syringes, and each pool was filtered through ␮ the summer roosting and fall swarm. a 0.45- m-pore-size syringe filter (Fisher Scientific, Pittsburgh, PA). The filtered supernatants were then brought up to volume using HBSS to fill Beckman SW41 In this paper, we describe a preliminary study that focused Ultracentrifuge tubes (Beckman Coulter, Inc., Brea, CA) to capacity. Tubes were on samples from 41 bats captured on a single sampling night, weighed, and HBSS was added such that all tubes weighed exactly the same. The representing three bat species, including big brown bats, tri- samples were then centrifuged at 50,000 ϫ g for3hat10°C. Supernatants were ␮ colored bats, and the little brown myotis. Fecal and oral sam- poured into 10% bleach solution, and the pellet was resuspended in 100 lof HBSS and frozen at Ϫ80°C until further processing could be performed. ples were processed and grouped into six pools; viral RNA and Nuclease treatment and RNA extraction. To reduce the amount of contami- DNA were extracted, reverse transcribed, bar coded, and am- nating RNA and DNA present, each sample (116 ␮l) was treated with 14 U of plified; and the cDNA was analyzed by 454 sequencing to DNase Turbo (Ambion, Austin, TX), 25 U of benzonase (Novagen, Darmstadt, assess the viral sequences present in each. Phylogenetic ana- Germany), and 20 U of RNase One (Promega, Madison, WI); samples were ␮ ϫ lyses demonstrated that an abundance of bacteriophage, plant, brought up to a final volume of 140 lin10 DNase buffer (Ambion, Austin, TX) and incubated at 37°C for 2 h. Samples were then immediately processed and insect viruses were present in all fecal samples. In addition, with a QiaAmp Viral RNA Mini Kit (Qiagen, Valencia, CA) using the manu- three fecal sample pools contained strong matches to at least facturer’s protocol to extract viral RNA and DNA that were protected from three novel group 1 CoVs, and a single pool comprised of viral nuclease digestion by the viral capsid (1). Viral RNA and DNA were eluted to sequences extracted from oral swab samples contained multi- a final volume of 60 ␮l and stored at Ϫ80°C prior to use. Reverse transcription and single primer amplification. Primers that added ple hits to novel betaherpesviruses. unique multiplex identification tags (bar codes) were ordered using the se- quence-independent single primer amplification (SISPA) protocol, which has been described previously (3, 24, 25). Briefly, the bar code primers were designed MATERIALS AND METHODS as random hexamers containing novel 22-nucleotide (nt) tags. The primers used Study area. Bat samples for this study were collected from the northeast in this study have been published elsewhere (3, 24, 25). The primers containing entrance of Indigo tunnel located near Little Orleans, MD (GPS coordinates: the random hexamer were used to prime the reverse transcription reaction, using easting 726475; northing 4391037). The tunnel was originally constructed in 1904 one unique bar code primer for each of the six pooled samples. The same 22-nt and was used primarily by the Western Maryland Railway. Use of the tunnel was tag was used without the random hexamer extension to prime the PCR ampli- discontinued in 1975 although the structure remains largely intact. fication stage. Bat sampling. To capture the bats, two forest strainer harp traps (1.8 m by Reverse transcription. Viral RNA was reverse transcribed using a SuperScript 2.3 m; Bat Conservation and Management, Carlisle, PA) were placed side by side III reverse transcriptase kit (Invitrogen, Carlsbad, CA) following the manufac- in the tunnel entrance to catch bats entering or exiting the tunnel. The harp traps turer’s protocol. Briefly, 50 to 200 ng of RNA from each of the pooled samples were completely surrounded with tarpaulin and bird netting to prevent bats from and one bar code primer containing the random hexamer (10 ␮M) were incu- bypassing the traps. Captured bats were retrieved from the harp traps and held bated in the presence of dimethyl sulfoxide (DMSO) at 72°C for 5 min and then 13006 DONALDSON ET AL. J. VIROL. placed on ice. Dithiothreitol (DTT; 100 mM), deoxynucleoside triphosphates by searching both databases. BLAST searches were conducted at the nucleotide (dNTPs; 10 mM), RNAse Out (8 U; Invitrogen, Carlsbad, CA), and SuperScript level using blastn and discontiguous MegaBLAST and at the amino acid level III reverse transcriptase (100 U) were then added along with 5ϫ First Strand using blastx and tblastX to translate either the query sequence, the database, or buffer to bring the final reaction volume to 20 ␮l. This volume was incubated at both. Contigs matching the same viral sequences were binned together for 25°C for 10 min, at 50°C for 50 min, and then at 85°C for 10 min to deactivate genome assembly. Contigs of interest were manually refined using either Codon the reverse transcriptase. Code Aligner or Geneious to generate consensus sequences that contained few Klenow reaction. A Klenow reaction was then performed to convert the cDNA or no ambiguities. Viral genomes were downloaded for each of the viruses that into double-stranded DNA (dsDNA) with interspersed sequences of the 22-nt on matched most closely to any of the contigs, and these genomes were used as one DNA strand. For this reaction, a Klenow polymerase (2.5 U; New England reference sequences to guide assembly of full-length genomes. Coronavirus ref- Biolabs, Ipswich, MA) was added to the cDNA reaction mixture, which was then erence sequences were downloaded from PATRIC (http://patric.vbi.vt.edu/) incubated at 37°C for 60 min, followed by 75°C for 10 min. (72), and all other viral reference sequences were obtained from GenBank Phosphatase and exonuclease treatment. Phosphatase and exonuclease were (http://www.ncbi.nlm.nih.gov/) (8). used to remove phosphates and single-stranded bases from the newly synthesized Comparison of contigs. Contigs with similarity to known viruses were com- dsDNA by adding1Uofshrimp alkaline phosphatase ([SAP] USB, Cleveland, pared to a reference set of viral genomes representing the diversity within the OH)and2Uofexonuclease (USB, Cleveland, OH) to the samples at the end of viral family and genus of the virus most closely related to the contig. Initially, the Klenow reaction along with phosphatase buffer and water to a final volume genomic sequences and contigs were compared to determine the region of the of 50 ␮l. The reaction mixture was then incubated at 37°C for 60 min, followed viral genome that matched to the contig, and these regions were extracted from by 72°C for 15 min. the reference viral genomes along with 200 flanking nucleotides on either end to PCR amplification and DNA extraction. Amplification was conducted using an allow for insertion and deletion variations. The contig sequence and sequences Accuprime PCR Kit (Invitrogen, Carlsbad, CA) to ensure primer binding to from the reference set were then compared by multiple sequence alignment specific template sequences. The reaction mixture contained 10 ␮l of the sample using Clustal X, version 2.011 (42), and MUSCLE (27) as implemented in the from the SAP-Exo reaction mixture, the primer specific for the bar code without Geneious package. Contigs from different pools were individually compared to the random hexamer portion (12.5 ␮M), Accuprime Taq polymerase (1 U), 10ϫ the reference genomes, and if contigs from different pools matched the same Accuprime buffer 1, and water to a final volume of 50 ␮l. This mixture was then region in the viral reference set, multiple alignments were conducted including amplified using the following cycling conditions: 94°C for 2 min, followed by 35 all of the related contigs. All alignments were manually refined and trimmed to cycles of 94°C for 30 s, 55°C for 30 s, and 68°C for 1 min, with a final 10 min at represent a robust alignment of the novel contigs and the known reference by 68°C. The PCRs were then analyzed by gel electrophoresis on a 1% agarose sequences. gel, and a DNA smear ranging from ϳ0.5 to 1 kb was excised and extracted using Phylogenetic analysis. Maximum-likelihood phylogenetic trees were generated a QiaQuick Gel Extraction Kit (Qiagen, Valencia, CA). Purified DNA samples using the PhyML program (37) as implemented in the Geneious package. All were submitted for sequencing to the University of North Carolina (UNC) High trees were generated using the MtREV substitution model with estimated tran- Throughput Sequencing facility using Roche 454 Life Sciences FLX Titanium sition/transversion ratios for DNA models and the proportion of invariable sites chemistry (454 Life Sciences of Roche, Branford, CT). The DNA concentration fixed at zero. Boot strapping was conducted generating 100 boostrapped data was determined by spectrophotometry for each of the six pooled samples, and sets, and a consensus tree was generated using Consensus from the Phylip equal concentrations of each pool were combined into a single tube that was package (28). Trees were visualized in the Geneious tree viewer (Drummond et submitted to the facility. However, the sequencing facility determined concen- al., Biomatters, Auckland, New Zealand), and the Seaview tool (35) was used for tration using a fluorometer and determined the concentration was too low, and editing and rearranging branches. Alignments and trees were generated using additional pool 1 sample was added to bring the final concentration of all pooled either nucleotide sequences, amino acid sequences, or both. samples to ϳ1 ␮g. Cloning of novel group 1 CoV replicase fragment. Primers were designed Assembly of sequence reads. The large sequence files generated by 454 pyro- based upon 454 sequence reads that assembled to the BtCoV-HKU2 genome, sequencing were converted from the SFF format to Fasta format using the GS and these primers were used to amplify a ϳ2,540-nucleotide fragment of the De Novo Assembler software from Roche (454 Life Sciences of Roche, Bran- novel group 1 CoV replicase open reading frame 1ab (ORF1ab). Briefly, forward ford, CT). UNIX and PERL scripts were used to segregate the bar-coded reads primer CoV_16486F (5-GCGTATGTGTTGTCTTGGTC-3) and reverse primer into individual bins, allowing for a 2-nucleotide truncation from the read ends. CoV_19013R (5-ACTAACTGCAACCCCGTTAA-3) were used in an attempt The program Codon Code Aligner (CodonCode Corporation, Dedham, MA) to amplify cDNA templates from each of the eight individual samples from pool was used to trim the bar code sequences from the ends of the reads. Contigs were 1. Each PCR mixture contained 2 ␮l of cDNA (derived from reverse transcribing generated de novo using three methods: (i) Codon Code Aligner (CodonCode viral RNA extracted from each sample), forward and reverse primers (10 mM), Corporation, Dedham, MA) using default settings and running multiple itera- dNTPs (10 mM), Accuprime Taq polymerase (1 U; Invitrogen, Carslbad, CA), tions, (ii) Geneious (version 3.0; A. Drummond, J., B. Ashton, M. Cheung, 10ϫ buffer 1, and water to a final volume of 50 ␮l. These mixtures were then J. Heled, M. Kearse, R. Moir, S. Stones-Havas, T. Thierer, and A. Wilson, amplified using the following cycling conditions: 94°C for 5 min, followed by 40 Biomatters, Auckland, New Zealand) using the high-sensitivity assembly method cycles of 94°C for 1 min, 50°C for 1 min, and 72°C for 3 min, with a final 10 min with default settings, and (iii) DNAstar (DNASTAR, Inc., Madison, WI) using at 72°C. PCR fragments were excised, purified using a Qiagen gel extraction kit the NGEN assembler with default settings. Contigs analyzed in this study were (Qiagen, Valencia, CA), cloned into the TopoXL vector (Invitrogen, Carlsbad, manually corrected, and overlaps between reads were assessed to verify proper CA), grown up in competent TOP10 cells, screened by restriction digestion, and assembly by visualizing the assemblies and generating consensus sequences by sequenced via Sanger sequencing at the UNC Genome Analysis core facility. removing spurious single nucleotide insertions that were likely sequence errors. Nucleotide sequence accession numbers. GenBank accession numbers for Single base insertions that occurred a single time in a given column were re- CoV genes used in phylogenies are HQ585081 to HQ585086. GenBank accession moved, provided the inserted base was not present in the reference genome or numbers for all other CoV sequences are HQ585087 to HQ585105. Those for all in any other sequence reads. All contigs described in this report contained other viral genomes are HQ585106 to HQ585116. complete agreement across all read junctions. Annotation of contigs. Assembled contigs were then assessed for identity using a stand-alone version of the Basic Local Alignment Search Tool (BLAST) (14) RESULTS downloaded from the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/) or using BLAST as implemented in the Geneious Sampling. Sampling was conducted from 24 July through 29 package (Drummond et al., Biomatters, Auckland, New Zealand). Briefly, contig September 2009, at Indigo tunnel, alternating between the sequences were individually used to query the nonredundant (nr) nucleotide database and the viral RefSeq database (59). Contigs characterized in this study northeast and the southeast entrances. A total of 512 samples were verified using both search types, with E values set to reduce the number of were collected from seven species of bats during this time, random matches. Since search integrity is highly dependent upon the database including 305 fecal samples, 130 DNA samples, 46 oral swabs, used, contigs were annotated against the viral RefSeq database to identify po- and 24 urine samples (Table 1). Forty-one of these samples tential matches to known viruses, and then these were verified by searching the were selected for this initial screen, and these included fecal larger and more comprehensive nonredundant nucleotide database. As general criteria, E values less than 10eϪ04 and bit scores greater than 40 were the and oral samples from three species including big brown bats, established thresholds as nearly all hits meeting these criteria were corroborated tricolored bats, and little brown myotis. All of these samples VOL. 84, 2010 METAGENOMIC ANALYSIS OF NORTH AMERICAN BATS 13007

TABLE 1. Individual bat fecal samples collected in 2009

No. of bat samples collected by sex and age

Species Common name Male Female Undetermined a Total Adult Juvenile Adult Juvenile sex Eptesicus fuscus 36 35 25 33 0 129 Lasiurus borealis 0 1 0 0 0 1 Lasiurus cinereus 0 1 0 0 0 1 Myotis leibii Eastern small-footed myotis 3 13 2 5 0 23 Myotis lucifugus Little brown myotis 24 15 11 9 2 61 Northern long-eared myotis 17 7 7 9 1 41 Perimyotis subflavus 15 26 3 5 0 49

Total 95 98 48 61 3 305

a Bat escaped before sex could be determined.

were collected during the night of 12 August to 13 August, and ever, this preliminary assessment demonstrated that the vi- samples were pooled based upon species, sex, age, and sample rome of bats varies by species, age, and sample type (Fig. 1). type (Table 2) and sequenced by the University of North Caro- BLAST searches were also conducted using the nonredun- lina High Throughput Sequencing facility using the 454 Roche dant (nr) nucleotide database at NCBI and using programs FLX Titanium chemistry. blastn, blastx, and discontiguous MegaBLAST to identify se- Next-generation sequencing results. A total of 611,745 se- quences that matched to known viruses and to confirm the quence reads were generated that contained one of the six BLAST results using the RefSeq database. The strongest unique bar code sequences, and these were segregated into matches were to insect, plant, and bacteriophage sequences, bins based upon these unique sequences; the bar code tags and this initial search was used to identify potential reference were trimmed from the ends, and sequences shorter than 50 nt genomes for viral genome assembly. However, many of the were removed, resulting in a total of 576,274 trimmed reads sequences were highly divergent at the nucleotide level, indi- (Table 3). The average size of each trimmed read was 256 nt. cating that most of these viruses are novel, distantly related Contigs were assembled for each bin, and Ͼ87% of the reads variants of known viruses. In some cases, translation of contigs were assembled into contigs, with pool 1 containing the largest to amino acid sequences provided greater reliability in finding number of sequence reads (308,026) and contigs (4,044) (Table matches and identifying sequence reads for building larger 3). Of note, pool 1 contained 4-fold more sequence reads than contigs. In addition to insect, plant, and bacteriophage se- any of the other bar-coded samples, due to the additional DNA quences, several sequence reads were most closely related to added to bring the pooled sample up to the required concen- group 1 BtCoVs. tration (see Materials and Methods). Identification of novel coronavirus sequences. Seventy-six Initial characterization of the sequence reads. Each bar individual sequence reads matched strongly to CoV sequences code bin was formatted as a BLAST database, and the viral (Table 4). Pool 1 contained the majority of CoV-like sequences RefSeq database from NCBI containing all known viruses was with 68 sequence reads, 62 of which formed 14 contigs (Table used as input query to search each bar code pool database 4). Seven of these contigs matched most closely to group 1 running tblastx on a local version of stand-alone BLAST (E- BtCoV-HKU2, while the additional contigs were more closely value cutoff of 10eϪ04). This strategy identified matches to related to other group 1 bat or human CoVs (HCoV) (Table known viruses, and a total of roughly 45,000 sequence reads 4). In pool 3, two additional contigs comprised of two and from all pools (7 to 8%) contained sequence similarity to a three reads, respectively, were unique and most closely related variety of different viruses in the RefSeq database (Fig. 1) to HCoV-NL63 (Table 4). In addition, pool 2 had three CoV- although many of these hits represented large numbers of like reads, two of which formed a contig that was unique and reads that matched to one specific virus or viral gene. How- most closely related to group 1 CoV porcine epidemic diarrhea

TABLE 2. Species and pools used in this study

Pool Size Date Species Common name Sexa Ageb Sample type no. (no. of samples) (mo/day/yr) 1 8 8/12/09 Eptesicus fuscus Big brown bat F J Fecal 2 5 8/12/09 Eptesicus fuscus Big brown bat M A Fecal 3 9 8/12/09 Eptesicus fuscus Big brown bat M J Fecal 4 7 8/12/09 Myotis lucifugus Little brown myotis Mix Mix Fecal 5 4 8/12/09 Perimyotis subflavus Tricolored bat M J Fecal 6 8 8/12/09 Eptesicus fuscus Big brown bat F J Oral

a F, female, M, male; mix, mixed pool of males and females. b J, juvenile; A, adult; mix, mixed pool of adults and juveniles. 13008 DONALDSON ET AL. J. VIROL.

TABLE 3. Overview of sequence reads and contigs

Read data Avg no. of Pool no. % No. of contigs Total no. Avg length (nt) No. assembled No. unassembled reads/contig Assembleda 1 308,026 263 268,811 39,215 87 4,044 66 2 70,275 237 63,061 7,214 90 1,135 55 3 26,741 241 24,032 2,709 90 966 24 4 76,580 228 62,153 14,427 81 1,111 55 5 44,838 253 40,774 4,064 91 886 46 6 49,814 237 44,653 5,161 90 641 69

Total 576,274 256 503,484 72,790 87 8,783 52.5

a Rounded to the nearest whole number. virus (PEDV), while the single sequence read was most closely used to generate a multiple alignment with the reference set related to BtCoV-HKU2 (Table 4). Interestingly, all of these that spanned 1,085 nt of the ORF1a region of the replicase viral sequences were significantly different from known CoVs, gene in the region of nonstructural protein 3 (nsp3) (Fig. 2B). indicating that these are novel CoVs that we have named This contig clustered most closely to HCoV-229E and shared Appalachian Ridge CoV (ARCoV). These will be further des- 53.4% pairwise nucleotide identity with the replicase gene of ignated in this study by appending the pool number from which this virus. Also from pool 1, contig Pool1.C.837 was used to they originated. Further, CoV sequences were found only in generate a multiple alignment of the nucleocapsid gene based the big brown bat samples and were predominantly associated upon 833 nt (Fig. 2C). This ARCoV.Pool1.nucleocapsid frag- with juvenile bats. ment was closest to BtCoV-1A and shared 50.5% similarity to Multiple alignments were conducted for four contigs repre- BtCoV-1A based upon pairwise nucleotide identity. While senting four different regions of the CoV genome, comparing Pool1.C.1148 was most closely related to HCoV-229E and novel viral sequences to CoV reference sequences acquired via Pool1.C.837 was most closely related to BtCoV-1A, it is download from the PATRIC website (Fig. 2A) (http://patric possible that these two contigs are from the same novel .vbi.vt.edu/). Maximum-likelihood trees were generated for group 1 CoV. each of the manually corrected multiple alignments, and in all In pool 3, two contigs were used to assess phylogeny with cases, the novel CoV sequences found in this study were most contig Pool3.C.766, used to create a 764-nucleotide fragment closely related to group 1 CoVs (Fig. 2). A pool 1 contig that of the spike gene that was aligned to the reference set, and this was 1,148 nt in length (designated contig Pool1.C.1148) was contig was shown by phylogenetic analysis to be most closely

FIG. 1. Distribution of sequence read matches by pool. The 576,274 sequence reads were compared to the viral RefSeq database using tblastx with an E-value limit of 10eϪ04 to determine which reads were associated with which viruses. Each pool described in Table 2 is represented as a pie chart, with the percentage of viruses for each category colored according to the key. VOL. 84, 2010 METAGENOMIC ANALYSIS OF NORTH AMERICAN BATS 13009

TABLE 4. Overview of CoV sequence reads and contigs

Pool no. Sequence no. No. of reads Length (nt) Best match (virus) Region E value Bit score 1 Pool1.C.1148 4 1,148 HCoV-229E Replicase 7.32EϪ42 96.4 1 Pool1.C.837 5 837 BtCoV-512 Nucleocapsid 1.72EϪ39 91.8 1 Pool1.C.676 4 676 BtCoV-HKU2 Replicase 2.77EϪ101 378.2 1 Pool1.C.454 2 454 BtCoV-HKU2 Replicase 1.52EϪ77 198.6 1 Pool1.239577 1 443 BtCoV-512 Replicase 1.50EϪ23 79 1 Pool1.C.414 12 414 BtCoV-HKU2 Replicase 3.97EϪ19 87.7 1 Pool1.287740 1 412 PRCoV Spike 2.64EϪ22 98.2 1 Pool1.C.412 4 412 BtCoV-HKU2 Replicase 6.32EϪ25 106.9 1 Pool1.C.368 8 368 BtCoV-HKU2 Replicase 4.73EϪ39 106 1 Pool1.C.362 4 362 PEDV Replicase 2.19EϪ70 257.7 1 Pool1.C.348 2 348 BtCoV-512 Replicase 1.55EϪ29 86.3 1 Pool1.C.328 4 328 HCoV-229E Replicase 8.31EϪ40 84 1 Pool1.293985 1 284 HCoV-NL63 Replicase 4.03EϪ26 110.1 1 Pool1.5088 1 270 PEDV Replicase 5.07EϪ38 149.5 1 Pool1.66604 1 265 HCoV-229E Replicase 1.44EϪ35 100 1 Pool1.C.265 5 265 BtCoV-HKU2 Membrane 2.59EϪ34 137.2 1 Pool1.C.264 2 264 HCoV-229E Replicase 5.74EϪ20 89.5 1 Pool1.265225 1 207 PRCoV Spike 9.08EϪ23 98.2 1 Pool1.C.149 3 149 PEDV Replicase 1.21EϪ22 96.8 1 Pool1.C.146 3 146 BtCoV-HKU2 Replicase 1.15EϪ22 96.8 2 Pool2.C.344 2 344 PEDV Membrane 2.03EϪ39 84 2 Pool2.21252 1 257 BtCoV-HKU2 Replicase 2.42EϪ32 73 3 Pool3.C.766 3 764 HCoV-NL63 Spike 6.77EϪ42 84.9 3 Pool3.C.450 2 450 HCoV-NL63 Membrane 5.51EϪ38 150.4

related to BtCoV-HKU8 (41.1% pairwise identity at the nu- of CoV genomes to determine its relatedness to other CoVs. cleotide level) (Fig. 2D). Contig Pool3.C.450 was used to assess Phylogenetic analysis suggests that the novel group 1 CoV in the membrane gene by comparing it to the reference set (Fig. pool 1 is related to BtCoV-HKU2 and HCoV-229E (Fig. 4B 2E). This gene fragment was most closely related to the and C). BtCoV-HKU2 membrane gene with 42.1% pairwise identity. Novel insect viruses from the Iflaviridae. Several contigs Likewise, these contigs could be part of a second novel group from pools 1 to 5 (fecal samples) were determined to be strong 1 CoV associated with pool 3. matches (E value of Ͻ10eϪ20; bit scores of Ͼ100) to two Interpool variation in the spike gene. A sequence read of related picorna-like viruses of insects. In fact, four out of the 412 nt in pool 1 (Pool1.287740) also matched to the same, six pools contained differently sized contigs with an overlap- albeit smaller, portion of the spike gene fragment identified in ping region of 1,449 nt that matched a similar region of known pool 3, indicating that two similar CoVs were found in both insect viruses. In all cases, the nucleotide and amino acid pools 1 and 3. This 412-nucleotide region shared a 96.4% similarities were distant enough to suggest that these were pairwise similarity at the nucleotide level and 98.2% pairwise likely to be novel insect viruses most closely related to two identity at the amino acid level between the pool 1 and pool 3 honeybee (Apis spp.) viruses: deformed wing virus (DWV) and variants, and both were 41% similar at the nucleotide level to sacbrood virus (SBV), both of which are members of the Ifla- BtCoV-HKU2 (Fig. 3). viridae family. These results indicate that at least three novel group 1 CoVs Phylogenetic analysis using the 1,449-nucleotide region were detected in big brown bat fecal samples. No CoV se- demonstrated that these novel viruses clustered with other quences were present in samples analyzed for little brown viruses of the Iflaviridae family, with all being most closely myotis or tricolored bats, and no CoV sequences were detected related to honeybee viruses (Fig. 5). While honeybee viruses in the saliva samples of big brown bats. were the closest relatives, the percent identity at the amino PCR amplification of novel group 1 CoV replicase fragment. acid level ranged between 27.7% and 41.9%. This suggests that The BtCoV-HKU2 genome was used as a reference genome the insect viruses identified in the different samples were dis- for assembling the 68 sequence reads identified in pool 1, and tant relatives of the honeybee viruses that probably infect an 45 of these reads mapped to this genome (Fig. 4A). Primers unidentified arthropod. In addition to this larger contig se- were designed using 454 reads that assembled to the genome, quence, several other reads from all five pools had smaller and PCR was conducted to fill in the gaps between 454 reads matches to members of the Iflaviridae and the , that mapped to the reference genome. One primer pair was with the closest relatives being the aphid and cricket lethal used to successfully amplify 2,540 nt from genomic position paralysis viruses (data not shown). 16486 to 19013 (based on BtCoV-HKU2 numbering), verifying Detection of multiple novel plant viruses of the that a novel CoV was present (Fig. 4A). Interestingly, pool 1 is family. Two contigs from pool 3 and pool 5 contained regions comprised of eight juvenile female big brown bats, but only one of 1902 nt that matched most closely to the grapevine fleck of these samples was positive for this novel CoV. Sanger se- virus (GFkV). In both cases, the contigs were assembled with quencing was used to determine the sequence of the PCR a high degree of coverage, with the pool 3 contig comprised of fragment, and the sequence was compared to the reference set 4,659 sequence reads and the pool 5 contig assembled using 13010 DONALDSON ET AL. J. VIROL.

FIG. 2. Novel group 1 CoV phylogeny. At least three novel group 1 CoVs were identified in this study as CoV sequences were found in three pools (pools 1 to 3). (A) The CoV genome is comprised of five genes found in all CoVs, including the replicase that makes up the first two-thirds of the genome, and four structural genes including the spike (S), the envelope (E), the membrane (M), and the nucleocapsid (N) genes. Black bars represent the sequences identified and compared in this study. Maximum-likelihood analyses were conducted using the CoV contig sequences identified in this study in comparison to known CoV reference genomes. (B) Pool 1 contained a contig of 1,148 nt that was compared to other CoV VOL. 84, 2010 METAGENOMIC ANALYSIS OF NORTH AMERICAN BATS 13011

FIG. 3. Variation between CoV spike genes found in different pools. A sequence read of 412 nt in length from pool 1, Pool 1.287740, matched to the larger spike gene fragment assembled from pool 3, indicating that two related CoVs were found in pools 1 and 3. This 412-nt region shared 96.4% pairwise similarity at the nucleotide level and 98.2% pairwise identity at the amino acid level between the pool 1 and pool 3 variants, and both were most closely related to BtCoV-HKU2. The tree was generated via maximum-likelihood analysis, with 100 bootstrap replicates, and is shown as a proportional cladogram.

364 sequence reads. Interestingly, these viruses shared 99.9% Marafivirus genus. In fact, phylogenetic analysis suggests that pairwise identity with one another and 56% pairwise nucleo- these novel plant viruses may represent a new genus of the tide identity with GFkV. GFkV is primarily a graft-transmis- Tymoviridae family (Fig. 6). Many other sequence reads from sible virus, without an insect vector, which suggests that the pools 1 to 5 also contained hits to many of the plant viruses in novel plant virus sequences are a distant relative of this virus. this figure (Fig. 6) and likely represent novel plant viruses of Phylogenetic analysis of this region demonstrated that the virus different plant species found in this region of Maryland. is similar to other plant viruses in the Tymoviridae family and Identification of novel bacteriophage sequences. Several of that these are distinct viruses most closely related to the the pools contained large contig sequences that matched dif-

reference genomes. An alignment of the most conserved 1,085 nt suggests that this contig is most closely related to group 1 HCoV-229E. (C) A second 833-nt contig in pool 1 matched most closely to the nucleocapsid gene of BtCoV-1A. (D) A 764-nt contig in pool 3 matched most closely to the spike gene of BtCoV-HKU8, a group 1 CoV. (E) A 450-nt contig in pool 3 was most closely related to the membrane gene of BtCoV-HKU2. Values at branch points represent bootstrap values based upon 100 replicates. Branches with values of Ͻ70 should be interpreted with caution. Trees are shown as proportional cladograms. Classic CoV group numbers are shown as roman numerals. Bat viruses are indicated with the prefix Bt. HCoV indicates human virus. MHV, murine hepatitis virus; FIPV, feline infectious peritonitis virus; AIBV, avian infectious bronchitis virus; FIPV, feline infectious peritonitis virus; BCoV, bovine coronovirus; PRCoV, porcine respiratory coronovirus. 13012 DONALDSON ET AL. J. VIROL.

FIG. 4. Amplification of the replicase fragment of ARCoV from pool 1. (A) A total of 45 sequence reads derived from 454 sequencing of the pool 1 sample assembled to the BtCoV-HKU2 genome, and these were used to guide PCR amplification of a 2,540-nt fragment of ORF1ab of ARCoV. Green, 454 reads; blue, amplicons derived by PCR and Sanger sequencing; gray, replicase gene; pink, structural and accessory genes. (B) Phylogenetic analysis of the 2,540-nt fragment of the replicase shows that the ARCoV from pool 1 is most closely related to the group 1 CoVs, with its nearest neighbor being BtCoV-HKU2. (C) Phylogenetic analysis of the 833 amino acids suggests the ARCoV replicase sequence is most closely related to HCoV-229E. Trees are shown as proportional cladograms. Values at branch points represent bootstrap values based upon 100 replicates. Branches with values of Ͻ70 should be interpreted with caution. Classic CoV group numbers are shown as roman numerals.

ferent bacteriophages, with pool 1 containing several contigs were closely related to this phage, with nucleotide pairwise that most closely matched to the enterobacteria phage K1F identities that ranged between 97 to 99%. The full-length ter- (EbP-K1F) of the family, the Autographivirinae minase gene (1,184 nt) was used to compare the novel phage subfamily, and the T7-like genus. EbP-K1F was used as a refer- sequence from pool 5 to seven previously described APSE ence for assembling the reads from pool 1, and a total of 9,384 phage sequences (Fig. 7B) (21, 52). The phylogenetic analysis reads assembled to 88.3% of the EbP-K1F genome with a mean suggested that the pool 5 APSE-like sequence is a novel bac- coverage of ϳ82 reads. While phage evolution has been shown to teriophage most closely related to APSE-4 and APSE-6. be mosaic in nature and therefore not accurately represented by Detection of herpesvirus in bat saliva. In the pool 6 sample, phylogenetics (38), we constructed a phylogenetic tree using the the predominant viral species were most closely related to full-length DNA polymerase genes (2,115 nt) of the novel pool 1 genera within the Herpesviridae family, sub- phage and several reference genome sequences from the T7-like family. A 798-nt contig was translated to a 266-amino acid genus to demonstrate how closely related this novel sequence was sequence that was compared by multiple alignment and phy- to other viruses of the T7 lineage (Fig. 7A). logenetics to several representative sequences of the Betaher- In pool 5, the predominant bacteriophage sequence was pesvirinae subfamily. This sequence matched most closely to most closely related to Acyrthosiphon pisum secondary endo- tupaiid herpesvirus 1 (TuHV-1) (53.9% pairwise identity), also symbiont 1 (APSE-1) bacteriophage, a group 1 virus of the known as tree shrew herpesvirus, for which a genus has not Podoviridae family. A total of 289 sequence reads assembled to been assigned, and the rat (RCMV) of the the APSE-1 reference genome, covering roughly 55% of the Muromegalovirus genus (53.8% pairwise identity) (Fig. 8). Sev- APSE-1 genome with a mean coverage of ϳ2. Four contigs eral additional contigs were similar to the Muromegalovirus greater than 1,200 nt from four different regions of the genome genus, which is comprised of rat and mouse . VOL. 84, 2010 METAGENOMIC ANALYSIS OF NORTH AMERICAN BATS 13013

FIG. 5. Novel insect viruses of the Iflaviridae family. Six unique FIG. 7. Two novel bacteriophage sequences identified in different 1,449-nt fragments from contigs assembled using sequence from four pools. Several pools contained sequences that matched most closely to different pools (pools 1, 2 [n ϭ 3], 4, and 5) were aligned to known different bacteriophages. (A) In pool 1, a phage most closely related to insect viruses of the Iflaviridae and Dicistroviridae families and com- enterobacteria phage (EbP) K1F was identified, and a phylogenetic pared by maximum-likelihood analysis. Phylogenetic analysis demon- tree was generated using maximum likelihood to compare the DNA strated that these novel sequences clustered with other viruses of the polymerase gene of the novel pool 1 phage to that of other members Iflaviridae family, with all being most closely related to SBV and DWV, of the T7-like genus. The tree demonstrates that the novel phage is both of which are viruses of honeybees. Abbreviations are as follows: most closely related to EbP-K1F. KP, Kluyvera phage Kvp1; VP, vibrio- InV, insect virus; KBV, Kashmir bee virus; ALPV, aphid lethal paral- phage VP4; YP, Yersinia phage. Values at branch points represent ysis virus; BQCV, black queen colony virus; CPV, cricket paralysis bootstrap values based upon 100 replicates. Branches with values of virus. Values at branch points represent bootstrap values based upon Ͻ70 should be interpreted with caution. Scale bar represents 0.1 sub- 100 replicates. Branches with values of Ͻ70 should be interpreted with stitutions per nucleotide site. (B) In pool 5, the predominant bacte- caution. Scale bar represents 1 substitution per nucleotide position. riophage sequence was most closely related to Acyrthosiphon pisum secondary endosymbiont 1 (APSE-1) bacteriophage. The full-length terminase gene was used to compare the novel phage sequence to DISCUSSION seven previously described APSE phage sequences. The phylogenetic analysis shows that the pool 5 APSE-like sequence is a novel bacte- From 1980 to 2007, 87 new pathogens were detected in riophage most closely related to APSE-4 and APSE-6. Values at humans, and 58 of these were viruses, including 45 single- branch points represent bootstrap values based upon 100 replicates. Branches with values of Ͻ70 should be interpreted with caution. Scale stranded RNA viruses. Bunyavirus was the largest single virus bar represents 0.01 substitutions per nucleotide site. family found in bats (94). These observations underscore the

importance of characterizing the virome of as many bat species as possible, focusing particularly on bats that are frequently found in or near human habitats. The goal of this study was to characterize the viromes of three bat species that share a roost in an abandoned railroad tunnel in Maryland using next-gen- eration sequencing technology to determine the entire popu- lation of viruses found in each species and to determine how viruses differ between bats segregated by species, sex, and age. A set of 41 bat samples was selected and pooled based on important ecological factors including age (juvenile or adult), sex, and species. These parameters were selected for the fol- lowing reasons: (i) juvenile bats captured in the summer and fall have not yet undergone hibernation and therefore may be subject to viruses that would be restricted by a decrease in body temperature; (ii) male and female bats of many microbat spe- FIG. 6. Novel plant viruses of the Tymoviridae family. Two nearly cies live separately during the spring when females establish identical plant virus sequences were identified in pool 1 and pool 3, maternity colonies away from the male population, and these and a 1,902-nt segment of each was most closely related to viruses of populations may encounter different viruses; (iii) different spe- the Tymoviridae family. These viruses appear to form a new genus in the family as a sister clade to the Maculavirus genus. Abbreviations are cies forage in different areas, attract different ectoparasites, as follows: PlV, plant virus; GFkV, grapevine fleck virus; CSDV, citrus and frequently target different arthropods as food, providing sudden death-associated virus; OBDV, oat blue dwarf virus; KYMV, different opportunities for viruses to infect different species. Kennedia yellow mosaic virus; OMV, okra mosaic virus; TYMV, tur- Therefore, five pools were established to assess viromes of nip yellow mosaic virus; EMV, eggplant mosaic virus; PhyMV, Physalis mosaic virus. Numbers at branch points represent bootstrap values fecal samples from adult male, juvenile male, and juvenile based upon 100 iterations. Scale bar represents 0.01 substitutions per female big brown bats; a mixed population of little brown nucleotide site. myotis; and juvenile male tricolored bats (Table 2). In addi- 13014 DONALDSON ET AL. J. VIROL.

57, 64, 65, 77, 92, 93, 95). In addition to SARS-CoV, two additional CoV cross-species transmission events have been documented, both of which resulted in human viruses that continue to circulate, including HCoV-OC43, a close relative of bovine CoV (BCoV) that crossed the bovine species barrier to infect humans approximately 120 years ago, as estimated by a theoretical molecular clock model (82), and HCoV-229E, which hypothetically emerged from African bats (55). Molec- ular clock analysis was also used to estimate that HCoV-229E emerged roughly 200 years ago, and the authors of this work suggest that all CoVs may have emerged from bats (55). Given that many CoVs continue to circulate in several different spe- cies of bats, including bats of the Old and New World, the next CoV spillover into humans or important domestic animals is only a matter of time and ecological opportunity. CoV sequences have been found in the United States in the big brown bat (26) and in a Texas roost shared by Brazilian FIG. 8. Phylogeny of a novel herpesvirus found in pool 6. Several contigs from pool 6 matched most closely to viruses in a variety of free-tailed bats (Tadarida brasiliensis), the cave myotis (Myotis genera of the Betaherpesvirinae subfamily. A 267-amino acid frag- velifer), evening bats (Nycticeius humeralis), and tricolored bats ment found in pool 6 was aligned and compared to sequences of (47), but in both cases only small segments of the large repli- several reference viruses of the Betaherpesvirinae. Phylogenetic case gene were recovered. In this study, we identified 76 CoV analyses indicated that this amino acid sequence may be from a sequence reads that formed 17 contigs (Fig. 2 and Table 4), novel bat cytomegalovirus that forms a new genera between and Muromegalovirus. HHV, human herpesvirus; CCMV, and these were most closely related to known group 1 CoVs, chimpanzee cytomegalovirus; PanHV, panine herpesvirus; CeHV, cer- including BtCoV-HKU2 that was isolated from a Rhinolophus copithecine herpesvirus; MacHV, macacine herpesvirus; PCMV, por- bat species in Hong Kong (44). Of note, the sequence identity cine cytomegalovirus; TuHV, tupaiid herpesvirus; RCMV, rat cyto- of these CoV sequences to their closest CoV match ranged megalovirus; MoCMV, mouse cytomegalovirus; MurHV, murid herpesvirus; AHV, atoline herpesvirus; SQCMV, squirrel monkey cy- between 41.1 and 53.4% at the nucleotide level (38.7 to 89.1% tomegalovirus. Values at branch points represent bootstrap values at the amino acid level), suggesting that these are novel group based upon 100 replicates. Branches with values of Ͻ70 should be 1 viruses that we have designated ARCoV. These sequence interpreted with caution. Scale bar represents 0.1 substitutions per similarities are within the same range as the novel BtCoVs nucleotide site. found in Southeast Asian and African bats (44, 55). The ARCoV sequence was verified by reverse transcription- PCR (RT-PCR), using 454 sequence reads to design primers tion, a single pool was established to examine the virome of that successfully amplified a 2,540-nt fragment of ORF1ab oral secretions from juvenile female big brown bats (Table 2). (Fig. 4). We are currently using a primer-walking strategy in an The overall sequence results from each pool were compared to attempt to ascertain the full-length CoV genomes from the known virus sequences to determine a preliminary virome for different samples (21). Once an entire genome is derived, we each pooled sample, and these results indicated that the het- will synthetically reconstruct the virus and determine if it rep- erologous viral populations were largely different for each pool licates in bats (7). (Fig. 1). No emerging human viruses were detected in these CoVs were detected in three of the pools, indicating that at samples although some sequence reads were distantly related least three different CoVs were present, and pool 1 contained to human viruses, including enteric viruses such as rotavirus, several contigs that matched more closely to different group 1 enterovirus, and CoVs. It is important to note that this analysis CoVs (Table 4 and Fig. 2), suggesting that more than one CoV gives a preliminary overview of the differences in viromes be- may have been present in those samples. However, it is possi- tween different species and sample types; however, a more ble that these are all fragments of the same novel strain of CoV comprehensive characterization of these viromes will require that is commonly associated with big brown bats. Other studies multiple sequencing runs of more individual bats and genome have found CoV coinfection in individual animals (40, 48); assembly and annotation to verify the viral populations asso- however, the study design precluded determining if this was ciated with each species. the case in our samples. In addition, a similar section of the Novel group 1 CoVs identified in juvenile big brown bats. spike gene was found in pools 1 and 3, and these sequences SARS-CoV emerged from bats as a highly infectious and were 98.2% identical at the amino acid level, indicating that a deadly disease caused by a CoV that crossed the species barrier similar virus was found in two different pools of big brown bats from a zoonotic reservoir into the human population, which (Fig. 3). occurred in the wet markets of the Guangdong province of Novel viruses associated with arthropods of Cicadellidae China (17). During the outbreak of 2002 to 2003, the overall family in the order Hemiptera. In five of the pools (pools 1 to mortality rate was over 8%, but for those over age 60, the death 5; all of the fecal samples) there was an abundance of se- rate approached 60% (73). Although SARS-CoV has not re- quences that matched most closely to viruses of insects and emerged since the original outbreak, the discovery of new plants. Most of the plant virus homologs were related to known human CoVs and SARS-like viruses in bats has been occurring viruses vectored by arthropods that transmit the virus to plants. with increasing frequency (9, 16, 18, 19, 26, 34, 43, 44, 50, 55, This is not surprising, given that there are 56 genera and 141 VOL. 84, 2010 METAGENOMIC ANALYSIS OF NORTH AMERICAN BATS 13015 known families of plant viruses, and most depend upon an EbP-K1F was originally isolated from human sewage in 1984 insect vector for transmission to other plants (11). The most (83), and it is known to encode an endosialidase that allows it common insect vectors to transmit plant viruses are known as to infect Escherichia coli strains expressing the K1 antigen, the leafhoppers, which are a group of insects with piercing- which normally serves as a physical barrier to phage adsorption sucking mouthparts and rows of spine-like setae (hairs) in their and aids in immune evasion (70). Such E. coli strains are often hind tibiae. There are over 20,000 species of leafhoppers, associated with human illnesses, including septicemia, urinary which belong to the family Cicadellidae in the order tract infections, and meningitis (69). YP.Berlin phage is known Hemiptera (Tree of Life Web Project [http://tolweb.org to infect the bacterium Yersinia pestis, the bacterium that /Cicadellidae/10853/2005.07.15]). It has been estimated that causes bubonic plague in humans. To our knowledge, Y. pestis in a single summer season the average maternity colony of is not commonly found in animals in the eastern United States, approximately 150 bats can easily consume over 50,000 leaf- and its primary reservoir is thought to be rodents in the west- hoppers (86). This suggests that bats likely eat leafhoppers ern United States (http://www.cdc.gov/ncidod/dvbid/plague/). infected with important insect and/or plant viruses that are In pool 5, the predominant viral sequences were most closely then shed in the bat feces. It is not known if these insect and related to APSE-1 bacteriophage, which is a lambda-like phage plant viruses replicate in bats; however, the results of this study of the family, possessing a circular 36.5-kb dsDNA show that intact, perhaps infectious, virus does pass through genome (66). APSE-1 infects and lyses a secondary endosym- the bat gastrointestinal tract, as evidenced by genome resis- biont of the pea aphid Acyrthosiphon pisum, which has also tance to degradative nucleases. been found on black locust trees that are common in western The most common insect viruses observed in this study be- Maryland. Interestingly, aphids rely upon a primary endosym- longed to the picorna-like viruses of the Iflaviridae and Dicis- biont, such as Buchnera, to provide essential dietary nutrients, troviridae families. Large segments of genomes (1,449 nt) most and many aphids also contain secondary symbionts, such as the closely related to the DWV or SBV of the Iflaviridae were enterobacteric “Candidatus Hamiltonella defensa,” which is a detected in four of the six pools (Fig. 5). Interestingly, DWV facultative symbiont that helps protect the aphid against ento- has been associated with honeybee colony collapse disorder mopathogenic fungi and parasitoid wasps. The APSE phages (CCD) when the virus is transmitted by the ectoparasitic mite, are thought to be an obligate component of the “Ca. Hamil- Varroa destructor (23). SBV is also a virus of honeybees. How- tonella” life cycle, and it has been proposed that APSE and ever, since the sequence identity of these viruses is so low (less “Ca. Hamiltonella” form a mutualistic symbiosis with Acyrtho- than 42% at the amino acid level) compared to DWV and siphon pisum, whereby toxin genes expressed by a particular SBV, we hypothesize that these are novel viruses of an uniden- APSE strain provide protection against eukaryotic parasites tified arthropod. (21, 22, 52, 68). Two nearly identical plant virus sequences were identified in The APSE phage found in this study is distinctly unique and pool 1 and pool 3 representing a 1,902-nucleotide segment that most closely related to APSE-4 and APSE-6, when the full- most closely matched viruses of the Tymoviridae family (Fig. 6). length terminase genes of all seven known APSEs are com- Interestingly, these viral homologs were also distant relatives, pared (Fig. 7B). This phage did not contain the toxins CdtB with roughly 56% pairwise identity to other viral sequences in and Stx associated with protection of the sweet pea aphid from the family, and these viruses likely form a new genus in the parasites. Based upon the information known about APSE Tymoviridae family as a sister clade to the Maculavirus genus phage, we hypothesize that this is a novel APSE that was likely (Fig. 6). isolated from a strain of “Ca. Hamiltonella” and that both are Bacteriophage sequences suggest unexpected bacterial flora probably mutualisitic symbionts of an unknown arthropod that in bats. The strongest matches to any viral sequences were was eaten by the bat. This observation suggests that bats may bacteriophage sequences that were derived from bat fecal sam- play an important role in distributing viruses and bacteria in ples (pools 1 to 5). The populations of bacteriophage appeared the environment that might be beneficial to other organisms. to vary by pool; however, this may be due to sequence coverage Identification of T7-like phage related to bacteriophages of and depth, and we will be investigating this observation more bacterial species associated with human disease suggests that closely with additional samples. In pool 1, the predominant bats may be harboring and trafficking bacterial species known bacteriophage sequence matched most closely to EbP-K1F to be important human pathogens. To address this question, although portions of the sequence were more closely related to we are currently assessing the bacterial populations in these Yersinia phage Berlin (Fig. 7A, YP.Berlin). Sequence reads bat samples by 16S rRNA gene profiling of fecal DNA from from this phage were assembled using the EbP-K1F genome as the different species of bats collected in Maryland and else- a reference, and the assembly aligned to 88.3% of the EbP- where. K1F genome, with a mean coverage of ϳ82 reads. Phylogenetic Detection of novel herpesviruses of the Betaherpesvirus ge- analysis of the full-length DNA polymerase genes (2,115 nt) of nus. The predominant viral species observed in the oral swab the novel pool 1 phage in comparison to other bacteriophage samples (pool 6) matched most closely to cytomegaloviruses of of the T7-like genus indicated that the novel phage is distinctly small mammals, including the rat cytomegalovirus and the different from EbP-K1F although this phage appears to be its tupaiid herpesvirus (TuHV). This phylogenetic tree was con- closest relative (Fig. 7A). Given the mosaic nature of the bac- structed using amino acid sequences because the nucleotide teriophages (38), we speculate that this virus may contain sequences were too distant to reliably align the sequences (Fig. genes from both the EbP-K1F and YP.Berlin parental strains, 8). Phylogenetic analyses of the 267-amino-acid fragment and we are currently completing the sequence of this genome found in pool 6 along with several reference viruses of the using traditional PCR-based approaches. Betaherpesvirinae indicated that this amino acid sequence may 13016 DONALDSON ET AL. J. VIROL. be a novel bat cytomegalovirus that forms a new genus between help with sample preparation protocols and reaction conditions. We Roseolovirus and Muromegalovirus (Fig. 8). Herpesviruses have thank the field technicians from the Appalachian Laboratory, includ- ing Angela Sjollema, Jennifer Saville, and Lisa Smith, who helped with been detected in bats in a number of studies (51, 62, 76, 84, 85, sample collections. 87), and in European bats gamma- and betaherpesviruses were This work was supported by NIH grant U54 AI057157 from the detected by pan-herpes consensus PCR (87). The predominant Southeastern Regional Center of Excellence for Emerging Infections herpesviruses strains found in this study were more closely and Biodefense. related to mammalian species besides bats, and we expect that The contents of this report are solely the responsibility of the au- thors and do not necessarily represent the official views of the NIH. different species of bats will contain different types of herpes- viruses. REFERENCES Adventures in metagenomics. The use of 454 sequencing 1. Allander, T., S. U. Emerson, R. E. Engle, R. H. Purcell, and J. Bukh. 2001. and other methods, as well as other emerging sequencing tech- A virus discovery method incorporating DNase treatment and its application nologies that hold great promise for drastically reducing the to the identification of two bovine parvovirus species. Proc. Natl. Acad. Sci. U. S. A. 98:11609–11614. cost of megasequencing, will continue to drive the discovery of 2. Allander, T., T. Jartti, S. Gupta, H. G. Niesters, P. Lehtinen, R. Osterback, novel viruses and help elucidate the complex viral ecology that T. Vuorinen, M. Waris, A. Bjerkner, A. Tiveljung-Lindell, B. G. van den plays a role in the emergence of infectious disease and in Hoogen, T. Hyypia, and O. Ruuskanen. 2007. Human bocavirus and acute wheezing in children. Clin. Infect. Dis. 44:904–910. spread of virulence factors. In this study, we used 454 sequenc- 3. Allander, T., M. T. Tammi, M. Eriksson, A. Bjerkner, A. Tiveljung-Lindell, ing technology to analyze samples from 41 bats, generating and B. Andersson. 2005. Cloning of a human parvovirus by molecular screen- over 600,000 sequence reads of over 250 nt in length. This ing of respiratory tract samples. Proc. Natl. Acad. Sci. U. S. A. 102:12891– 12896. abundance of sequence information provided many insights 4. Anderson, N. G., J. L. Gerin, and N. L. Anderson. 2003. Global screening for into the virome of bats and bat ecology; however, no full-length human viral pathogens. Emerg. Infect. Dis. 9:768–774. 5. Arias, C. F., M. Escalera-Zamudio, L. Soto-Del Rio Mde, A. G. Cobian- viral genomes were detected. These results were not unique, as Guemes, P. Isa, and S. Lopez. 2009. Molecular anatomy of 2009 influenza our results were relatively similar to those obtained by Li et al., virus A (H1N1). Arch. Med. Res. 40:643–654. who recently conducted a study analyzing the guano of bats 6. Barbour, R. W., and W. H. Davis. 1969. Bats of America. University Press of Kentucky, Lexington, KY. from the southwestern United States (47). However, we be- 7. Becker, M. M., R. L. Graham, E. F. Donaldson, B. Rockx, A. C. Sims, T. lieve that virus discovery must focus upon deriving entire viral Sheahan, R. J. Pickles, D. Corti, R. E. Johnston, R. S. Baric, and M. R. genomic sequences that can be used by the scientific commu- Denison. 2008. Synthetic recombinant bat SARS-like coronavirus is infec- tious in cultured cells and in mice. Proc. Natl. Acad. Sci. U. S. A. 105:19944– nity to characterize the viruses and compare them to others. 19949. Toward that end, we are employing traditional Sanger se- 8. Benson, D. A., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and E. W. Sayers. 2009. GenBank. Nucleic Acids Res. 37:D26–D31. quencing to fill in the gaps of several of our novel viral ge- 9. Brandao, P. E., K. Scheffer, L. Y. Villarreal, S. Achkar, N. Oliveira Rde, O. nomes. Additionally, we believe that individual bat-to-bat Fahl Wde, J. G. Castilho, I. Kotait, and L. J. Richtzenhain. 2008. A coro- comparisons will allow us to elucidate the cross-species trans- navirus detected in the vampire bat Desmodus rotundus. Braz. J. Infect. Dis. 12:466–468. mission dynamics between pathogen and host, as well as be- 10. Breitbart, M., and F. Rohwer. 2005. Method for discovering novel DNA tween different populations in a shared habitat. viruses in blood using viral particle selection and shotgun sequencing. Bio- We have encountered several issues with assembling the techniques 39:729–736. 11. Brunt, A. A., K. Crabtree, M. J. Dallwitz, A. J. Gibbs, L. Watson, and E. J. viral sequences in this study because in most cases the viral Zurcher. 20 August 1996, posting date. Plant viruses online: descriptions and homologs are 50 to 60% identical at the nucleotide level to the lists from the VIDE database. University of Idaho, Moscow, ID. closest sequence match in the database. This suggests that 12. Bunde, J. M., E. J. Heske, N. E. Mateus-Pinilla, J. E. Hofmann, and R. J. Novak. 2006. A survey for West Nile virus in bats from Illinois. J. Wildl. Dis. most sequences analyzed in this study belong to novel viruses, 42:455–458. but there are no known reference sequences available to guide 13. Calisher, C. H., J. E. Childs, H. E. Field, K. V. Holmes, and T. Schountz. 2006. Bats: important reservoir hosts of emerging viruses. Clin. Microbiol. assembly. This problem has also been encountered with novel Rev. 19:531–545. CoV sequences isolated from bats in China and elsewhere that 14. Camacho, C., G. Coulouris, V. Avagyan, N. Ma, J. Papadopoulos, K. Bealer, have completely redefined the traditional CoV phylogeny be- and T. L. Madden. 2009. BLASTϩ: architecture and applications. BMC Bioinformatics 10:421. cause many of these viruses are too diverse to cluster into the 15. Cann, A. J., S. E. Fandrich, and S. Heaphy. 2005. Analysis of the virus traditional three group system (18, 44, 57, 64, 93). population present in equine faeces indicates the presence of hundreds of To circumvent this problem, translated BLAST searches uncharacterized virus genomes. Virus Genes 30:151–156. 16. Carrington, C. V., J. E. Foster, H. C. Zhu, J. X. Zhang, G. J. Smith, N. were used to identify matches at the amino acid level that did Thompson, A. J. Auguste, V. Ramkissoon, A. A. Adesiyun, and Y. Guan. not assemble well at the nucleotide level, despite clear se- 2008. Detection and phylogenetic analysis of group 1 coronaviruses in South American bats. Emerg. Infect. Dis. 14:1890–1893. quence conservation noted at the amino acid level. Tools to 17. Chinese SARS Molecular Epidemiology Consortium. 2004. Molecular evo- address divergent genome assembly are currently lacking, and lution of the SARS coronavirus during the course of the SARS epidemic in development of new tools for assessing divergent genomes is China. Science 303:1666–1669. 18. Chu, D. K., J. S. Peiris, H. Chen, Y. Guan, and L. L. Poon. 2008. Genomic essential. In addition to defining and characterizing the vi- characterizations of bat coronaviruses (1A, 1B and HKU8) and evidence for romes of different bat species, our goal is to establish a refer- co-infections in Miniopterus bats. J. Gen. Virol. 89:1282–1287. ence set of viral genomes that have originated in bats that will 19. Chu, D. K., L. L. Poon, K. H. Chan, H. Chen, Y. Guan, K. Y. Yuen, and J. S. Peiris. 2006. Coronaviruses in bent-winged bats (Miniopterus spp.). J. Gen. allow future investigators access to these reference genomes to Virol. 87:2461–2466. aid in genome assembly. 20. Coetzee, B., M. J. Freeborough, H. J. Maree, J. M. Celton, D. J. Rees, and J. T. Burger. 2010. Deep sequencing analysis of viruses infecting grapevines: virome of a vineyard. Virology 400:157–163. ACKNOWLEDGMENTS 21. Degnan, P. H., and N. A. Moran. 2008. Diverse phage-encoded toxins in a protective insect endosymbiont. Appl. Environ. Microbiol. 74:6782–6791. We thank the UNC High Throughput Sequencing Facility for help 22. Degnan, P. H., and N. A. Moran. 2008. Evolutionary genetics of a defensive with 454 sequencing, J. Craig Venter Institute and David Spiro for help facultative symbiont of insects: exchange of toxin-encoding bacteriophage. with bar-coding protocols, and Joseph Victoria and Eric Delwart for Mol. Ecol. 17:916–929. VOL. 84, 2010 METAGENOMIC ANALYSIS OF NORTH AMERICAN BATS 13017

23. de Miranda, J. R., and E. Genersch. 2010. Deformed wing virus. J. Invertebr. 47. Li, L., J. G. Victoria, C. Wang, M. Jones, G. M. Fellers, T. H. Kunz, and E. Pathol. 103:S48–S61. Delwart. 2010. Bat guano virome: predominance of dietary viruses from 24. Djikeng, A., R. Halpin, R. Kuzmickas, J. Depasse, J. Feldblyum, N. Senga- insects and plants plus novel mammalian viruses. J. Virol. 85:6955–6965. malay, C. Afonso, X. Zhang, N. G. Anderson, E. Ghedin, and D. J. Spiro. 48. Masters, P. S., and P. J. Rottier. 2005. Coronavirus reverse genetics by 2008. Viral genome sequencing by random priming methods. BMC Genom- targeted RNA recombination. Curr. Top. Microbiol. Immunol. 287:133–159. ics 9:5. 49. Menzel, M. A., J. M. Menzel, S. B. Castleberry, J. Ozier, W. M. Ford, J. W. 25. Djikeng, A., and D. Spiro. 2009. Advancing full length genome sequencing Edwards, and E. W. Pearson. 2002. Illustrated key to skins and skulls of bats for human RNA viral pathogens. Future Virol. 4:47–53. in the Southeastern and Mid-Atlantic States. Research note NE-376. USDA 26. Dominguez, S. R., T. J. O’Shea, L. M. Oko, and K. V. Holmes. 2007. Detec- Forest Service, Northeastern Research Station, Newtown Square, PA. tion of group 1 coronaviruses in bats in North America. Emerg. Infect. Dis. 50. Misra, V., T. Dumonceaux, J. Dubois, C. Willis, S. Nadin-Davis, A. Severini, 13:1295–1300. A. Wandeler, R. Lindsay, and H. Artsob. 2009. Detection of polyoma and 27. Edgar, R. C. 2004. MUSCLE: a multiple sequence alignment method with corona viruses in bats of . J. Gen. Virol. 90:2015–2022. reduced time and space complexity. BMC Bioinformatics 5:113. 51. Molnar, V., M. Janoska, B. Harrach, R. Glavits, N. Palmai, D. Rigo, E. Sos, 28. Felsenstein, J. 1989. PHYLIP: phylogeny inference package (version 3.2). and M. Liptovszky. 2008. Detection of a novel bat gammaherpesvirus in 5:164–166. Hungary. Acta Vet. Hung. 56:529–538. 29. Finkbeiner, S. R., A. F. Allred, P. I. Tarr, E. J. Klein, C. D. Kirkwood, and 52. Moran, N. A., P. H. Degnan, S. R. Santos, H. E. Dunbar, and H. Ochman. D. Wang. 2008. Metagenomic analysis of human diarrhea: viral detection and 2005. The players in a mutualistic symbiosis: insects, bacteria, viruses, and discovery. PLoS Pathog. 4:e1000011. virulence genes. Proc. Natl. Acad. Sci. U. S. A. 102:16919–16926. 30. Frank, D. N., and N. R. Pace. 2008. Gastrointestinal microbiology enters the 53. Ng, T. F., C. Manire, K. Borrowman, T. Langer, L. Ehrhart, and M. Breit- metagenomics era. Curr. Opin. Gastroenterol. 24:4–10. bart. 2009. Discovery of a novel single-stranded DNA virus from a sea turtle 31. Fraser, C., C. A. Donnelly, S. Cauchemez, W. P. Hanage, M. D. Van Kerk- fibropapilloma by using viral metagenomics. J. Virol. 83:2500–2509. hove, T. D. Hollingsworth, J. Griffin, R. F. Baggaley, H. E. Jenkins, E. J. 54. Peeters, M., M. L. Chaix, and E. Delaporte. 2008. Genetic diversity and Lyons, T. Jombart, W. R. Hinsley, N. C. Grassly, F. Balloux, A. C. Ghani, phylogeographic distribution of SIV: how to understand the origin of HIV. N. M. Ferguson, A. Rambaut, O. G. Pybus, H. Lopez-Gatell, C. M. Alpuche- Med. Sci. (Paris) 24:621–628. (In French.) Aranda, I. B. Chapela, E. P. Zavala, D. M. Guevara, F. Checchi, E. Garcia, 55. Pfefferle, S., S. Oppong, J. F. Drexler, F. Gloza-Rausch, A. Ipsen, A. Seebens, S. Hugonnet, and C. Roth. 2009. Pandemic potential of a strain of influenza M. A. Muller, A. Annan, P. Vallo, Y. Adu-Sarkodie, T. F. Kruppa, and C. A (H1N1): early findings. Science 324:1557–1561. Drosten. 2009. Distant relatives of severe acute respiratory syndrome coro- 32. Gao, F., E. Bailes, D. L. Robertson, Y. Chen, C. M. Rodenburg, S. F. navirus and close relatives of human coronavirus 229E in bats, Ghana. Michael, L. B. Cummins, L. O. Arthur, M. Peeters, G. M. Shaw, P. M. Sharp, Emerg. Infect. Dis. 15:1377–1384. and B. H. Hahn. 1999. Origin of HIV-1 in the chimpanzee Pan troglodytes 56. Pilipski, J. D., L. M. Pilipskl, and L. S. Risley. 2004. West Nile virus troglodytes. Nature 397:436–441. antibodies in bats from New Jersey and New York. J. Wildl. Dis. 40:335–337. 33. Gaynor, A. M., M. D. Nissen, D. M. Whiley, I. M. Mackay, S. B. Lambert, G. 57. Poon, L. L., D. K. Chu, K. H. Chan, O. K. Wong, T. M. Ellis, Y. H. Leung, Wu, D. C. Brennan, G. A. Storch, T. P. Sloots, and D. Wang. 2007. Identi- S. K. Lau, P. C. Woo, K. Y. Suen, K. Y. Yuen, Y. Guan, and J. S. Peiris. 2005. fication of a novel polyomavirus from patients with acute respiratory tract Identification of a novel coronavirus in bats. J. Virol. 79:2001–2009. infections. PLoS Pathog. 3:e64. 58. Preidis, G. A., and J. Versalovic. 2009. Targeting the human microbiome 34. Gloza-Rausch, F., A. Ipsen, A. Seebens, M. Gottsche, M. Panning, J. Felix with antibiotics, probiotics, and prebiotics: gastroenterology enters the meta- Drexler, N. Petersen, A. Annan, K. Grywna, M. Muller, S. Pfefferle, and C. genomics era. Gastroenterology 136:2015–2031. Drosten. 2008. Detection and prevalence patterns of group I coronaviruses 59. Pruitt, K. D., T. Tatusova, and D. R. Maglott. 2007. NCBI reference se- in bats, northern Germany. Emerg. Infect. Dis. 14:626–631. quences (RefSeq): a curated non-redundant sequence database of genomes, 35. Gouy, M., S. Guindon, and O. Gascuel. 2010. SeaView version 4: a multi- transcripts and proteins. Nucleic Acids Res. 35:D61–D65. platform graphical user interface for sequence alignment and phylogenetic 60. Racey, P. 2008. Reproductive assessment in bats, p. 31–45. In T. H. Kuntz tree building. Mol. Biol. Evol. 27:221–224. (ed.), Ecological and behavioral methods in the study of bats. Smithsonian 36. Greninger, A. L., C. Runckel, C. Y. Chiu, T. Haggerty, J. Parsonnet, D. Institution Press, Washington, DC. Ganem, and J. L. DeRisi. 2009. The complete genome of klassevirus: a novel 61. Raesly, R. L., and J. E. Gates. 1987. Winter habitat selection by north in pediatric stool. Virol. J. 6:82. temperate cave bats. Am. Midl. Nat. 118:15–31. 37. Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate algorithm to 62. Razafindratsimandresy, R., E. M. Jeanmaire, D. Counor, P. F. Vasconcelos, estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696–704. A. A. Sall, and J. M. Reynes. 2009. Partial molecular characterization of 38. Hatfull, G. F., D. Jacobs-Sera, J. G. Lawrence, W. H. Pope, D. A. Russell, alphaherpesviruses isolated from tropical bats. J. Gen. Virol. 90:44–47. C. C. Ko, R. J. Weber, M. C. Patel, K. L. Germane, R. H. Edgar, N. N. Hoyte, 63. Reichard, J. D. 2008. Wing-damage index used for characterizing wing con- C. A. Bowman, A. T. Tantoco, E. C. Paladin, M. S. Myers, A. L. Smith, M. S. dition of bats affected by white-nose syndrome. National Speleological So- Grace, T. T. Pham, M. B. O’Brien, A. M. Vogelsberger, A. J. Hryckowian, ciety, Huntsville, AL. http://www.caves.org/WNS/Bat Wings.pdf. J. L. Wynalek, H. Donis-Keller, M. W. Bogel, C. L. Peebles, S. G. Cresawn, 64. Ren, W., W. Li, M. Yu, P. Hao, Y. Zhang, P. Zhou, S. Zhang, G. Zhao, Y. and R. W. Hendrix. 2010. Comparative genomic analysis of 60 mycobac- Zhong, S. Wang, L. F. Wang, and Z. Shi. 2006. Full-length genome se- teriophage genomes: genome clustering, gene acquisition, and gene size. J. quences of two SARS-like coronaviruses in horseshoe bats and genetic Mol. Biol. 397:119–143. variation analysis. J. Gen. Virol. 87:3355–3359. 39. Holmes, E. C. 2001. On the origin and evolution of the human immunode- 65. Rihtaric, D., P. Hostnik, A. Steyer, J. Grom, and I. Toplak. 2010. Identifi- ficiency virus (HIV). Biol. Rev. Camb. Philos Soc. 76:239–254. cation of SARS-like coronaviruses in horseshoe bats (Rhinolophus hippo- 40. Kottier, S. A., D. Cavanagh, and P. Britton. 1995. First experimental evi- sideros) in Slovenia. Arch. Virol. 155:507–514. dence of recombination in infectious bronchitis virus. Recombination in 66. Rohwer, F., and R. Edwards. 2002. The Phage Proteomic Tree: a genome- IBV. Adv. Exp. Med. Biol. 380:551–556. based for phage. J. Bacteriol. 184:4529–4535. 41. Kristensen, D. M., A. R. Mushegian, V. V. Dolja, and E. V. Koonin. 2010. 67. Roossinck, M. J., P. Saha, G. B. Wiley, J. Quan, J. D. White, H. Lai, F. New dimensions of the virus world discovered through metagenomics. Chavarria, G. Shen, and B. A. Roe. 2010. Ecogenomics: using massively Trends Microbiol. 18:11–19. parallel pyrosequencing to understand virus ecology. Mol. Ecol. 19(Suppl. 42. Larkin, M. A., G. Blackshields, N. P. Brown, R. Chenna, P. A. McGettigan, 1):81–88. H. McWilliam, F. Valentin, I. M. Wallace, A. Wilm, R. Lopez, J. D. Thomp- 68. Sandstrom, J. P., J. A. Russell, J. P. White, and N. A. Moran. 2001. Inde- son, T. J. Gibson, and D. G. Higgins. 2007. Clustal W and Clustal X version pendent origins and horizontal transfer of bacterial symbionts of aphids. 2.0. Bioinformatics 23:2947–2948. Mol. Ecol. 10:217–228. 43. Lau, S. K., P. C. Woo, K. S. Li, Y. Huang, H. W. Tsoi, B. H. Wong, S. S. 69. Scholl, D., S. Adhya, and C. Merril. 2005. Escherichia coli K1’s capsule is a Wong, S. Y. Leung, K. H. Chan, and K. Y. Yuen. 2005. Severe acute respi- barrier to bacteriophage T7. Appl. Environ. Microbiol. 71:4872–4874. ratory syndrome coronavirus-like virus in Chinese horseshoe bats. Proc. Natl. 70. Scholl, D., and C. Merril. 2005. The genome of bacteriophage K1F, a T7-like Acad. Sci. U. S. A. 102:14040–14045. phage that has acquired the ability to replicate on K1 strains of Escherichia 44. Lau, S. K., P. C. Woo, K. S. Li, Y. Huang, M. Wang, C. S. Lam, H. Xu, R. coli. J. Bacteriol. 187:8499–8503. Guo, K. H. Chan, B. J. Zheng, and K. Y. Yuen. 2007. Complete genome 71. Smith, G. J., D. Vijaykrishna, J. Bahl, S. J. Lycett, M. Worobey, O. G. Pybus, sequence of bat coronavirus HKU2 from Chinese horseshoe bats revealed a S. K. Ma, C. L. Cheung, J. Raghwani, S. Bhatt, J. S. Peiris, Y. Guan, and A. much smaller spike gene with a different evolutionary lineage from the rest Rambaut. 2009. Origins and evolutionary genomics of the 2009 swine-origin of the genome. Virology 367:428–439. H1N1 influenza A epidemic. Nature 459:1122–1125. 45. Letarov, A., and E. Kulikov. 2009. The bacteriophages in human- and animal 72. Snyder, E. E., N. Kampanya, J. Lu, E. K. Nordberg, H. R. Karur, M. Shukla, body-associated microbial communities. J. Appl. Microbiol. 107:1–13. J. Soneja, Y. Tian, T. Xue, H. Yoo, F. Zhang, C. Dharmanolla, N. V. Dongre, 46. Li, L., J. Victoria, A. Kapoor, O. Blinkova, C. Wang, F. Babrzadeh, C. J. J. J. Gillespie, J. Hamelius, M. Hance, K. I. Huntington, D. Jukneliene, J. Mason, P. Pandey, H. Triki, O. Bahri, B. S. Oderinde, M. M. Baba, D. N. Koziski, L. Mackasmiel, S. P. Mane, V. Nguyen, A. Purkayastha, J. Shallom, Bukbuk, J. M. Besser, J. M. Bartkus, and E. L. Delwart. 2009. A novel G. Yu, Y. Guo, J. Gabbard, D. Hix, A. F. Azad, S. C. Baker, S. M. Boyle, Y. picornavirus associated with gastroenteritis. J. Virol. 83:12002–12006. Khudyakov, X. J. Meng, C. Rupprecht, J. Vinje, O. R. Crasta, M. J. Czar, A. 13018 DONALDSON ET AL. J. VIROL.

Dickerman, J. D. Eckart, R. Kenyon, R. Will, J. C. Setubal, and B. W. Sobral. Mizutani. 2010. Novel betaherpesvirus in bats. Emerg. Infect. Dis. 16:986– 2007. PATRIC: the VBI PathoSystems Resource Integration Center. Nu- 988. cleic Acids Res. 35:D401–D406. 85. Watanabe, S., N. Ueda, K. Iha, J. S. Masangkay, H. Fujii, P. Alviola, T. 73. Stadler, K., V. Masignani, M. Eickmann, S. Becker, S. Abrignani, H. D. Mizutani, K. Maeda, D. Yamane, A. Walid, K. Kato, S. Kyuwa, Y. Tohya, Y. Klenk, and R. Rappuoli. 2003. SARS—beginning to understand a new virus. Yoshikawa, and H. Akashi. 2009. Detection of a new bat gammaherpesvirus Nat. Rev. Microbiol. 1:209–218. in the Philippines. Virus Genes 39:90–93. 74. Stockman, L. J., L. M. Haynes, C. Miao, J. L. Harcourt, C. E. Rupprecht, 86. Whitaker, J. O., Jr. 1993. Bats, beetles, and bugs: more big brown bats mean T. G. Ksiazek, T. B. Hyde, A. M. Fry, and L. J. Anderson. 2008. Coronavirus less agricultural pests. Bats 11:23. antibodies in bat biologists. Emerg. Infect. Dis. 14:999–1000. 87. Wibbelt, G., A. Kurth, N. Yasmum, M. Bannert, S. Nagel, A. Nitsche, and B. 75. Tajima, S., T. Takasaki, S. Matsuno, M. Nakayama, and I. Kurane. 2005. Ehlers. 2007. Discovery of herpesviruses in bats. J. Gen. Virol. 88:2651–2655. Genetic characterization of Yokose virus, a flavivirus isolated from the bat in 88. Willner, D., M. Furlan, M. Haynes, R. Schmieder, F. E. Angly, J. Silva, S. Japan. Virology 332:38–44. Tammadoni, B. Nosrat, D. Conrad, and F. Rohwer. 2009. Metagenomic 76. Tandler, B. 1996. Cytomegalovirus in the principal submandibular gland of analysis of respiratory tract DNA viral communities in cystic fibrosis and the , Myotis lucifugus. J. Comp. Pathol. 114:1–9. non-cystic fibrosis individuals. PLoS One 4:e7370. 77. Tang, X. C., J. X. Zhang, S. Y. Zhang, P. Wang, X. H. Fan, L. F. Li, G. Li, 89. Willner, D., R. V. Thurber, and F. Rohwer. 2009. Metagenomic signatures of B. Q. Dong, W. Liu, C. L. Cheung, K. M. Xu, W. J. Song, D. Vijaykrishna, 86 microbial and viral metagenomes. Environ. Microbiol. 11:1752–1766. L. L. Poon, J. S. Peiris, G. J. Smith, H. Chen, and Y. Guan. 2006. Prevalence 90. Wolfe, N. D., C. P. Dunavan, and J. Diamond. 2007. Origins of major human and genetic diversity of coronaviruses in bats from China. J. Virol. 80:7481– infectious diseases. Nature 447:279–283. 7490. Woo, P. C., S. K. Lau, Y. Huang, and K. Y. Yuen. 78. Teeling, E. C., M. S. Springer, O. Madsen, P. Bates, S. J. O’Brien, and W. J. 91. 2009. Coronavirus diversity, 234: Murphy. 2005. A molecular phylogeny for bats illuminates biogeography and phylogeny and interspecies jumping. Exp. Biol. Med. 1117–1127. the fossil record. Science 307:580–584. 92. Woo, P. C., S. K. Lau, K. S. Li, R. W. Poon, B. H. Wong, H. W. Tsoi, B. C. 79. Tudge, C. 2000. The variety of life: a survey and a celebration of all the Yip, Y. Huang, K. H. Chan, and K. Y. Yuen. 2006. Molecular diversity of creatures that have ever lived. Oxford University Press, Oxford, England. coronaviruses in bats. Virology 351:180–187. 80. Victoria, J. G., A. Kapoor, K. Dupuis, D. P. Schnurr, and E. L. Delwart. 93. Woo, P. C., M. Wang, S. K. Lau, H. Xu, R. W. Poon, R. Guo, B. H. Wong, K. 2008. Rapid identification of known and new RNA viruses from animal Gao, H. W. Tsoi, Y. Huang, K. S. Li, C. S. Lam, K. H. Chan, B. J. Zheng, and tissues. PLoS Pathog. 4:e1000163. K. Y. Yuen. 2007. Comparative analysis of twelve genomes of three novel 81. Victoria, J. G., A. Kapoor, L. Li, O. Blinkova, B. Slikas, C. Wang, A. Naeem, group 2c and group 2d coronaviruses reveals unique group and subgroup S. Zaidi, and E. Delwart. 2009. Metagenomic analyses of viruses in stool features. J. Virol. 81:1574–1585. samples from children with acute flaccid paralysis. J. Virol. 83:4642–4651. 94. Woolhouse, M., and E. Gaunt. 2007. Ecological origins of novel human 82. Vijgen, L., E. Keyaerts, E. Moes, I. Thoelen, E. Wollants, P. Lemey, A. M. pathogens. Crit. Rev. Microbiol. 33:231–242. Vandamme, and M. Van Ranst. 2005. Complete genomic sequence of human 95. Yuan, J., C. C. Hon, Y. Li, D. Wang, G. Xu, H. Zhang, P. Zhou, L. L. Poon, coronavirus OC43: molecular clock analysis suggests a relatively recent zoo- T. T. Lam, F. C. Leung, and Z. Shi. 2010. Intraspecies diversity of SARS-like notic coronavirus transmission event. J. Virol. 79:1595–1604. coronaviruses in Rhinolophus sinicus and its implications for the origin of 83. Vimr, E. R., R. D. McCoy, H. F. Vollger, N. C. Wilkison, and F. A. Troy. 1984. SARS coronaviruses in humans. J. Gen. Virol. 91:1058–1062. Use of prokaryotic-derived probes to identify poly(sialic acid) in neonatal 96. Zhang, H., X. Yang, and G. Li. 1998. Detection of dengue virus genome neuronal membranes. Proc. Natl. Acad. Sci. U. S. A. 81:1971–1975. RNA in some kinds of animals caught from dengue fever endemic areas in 84. Watanabe, S., K. Maeda, K. Suzuki, N. Ueda, K. Iha, S. Taniguchi, H. Hainan Island with reverse transcription-polymerase chain reaction. Zhon- Shimoda, K. Kato, Y. Yoshikawa, S. Morikawa, I. Kurane, H. Akashi, and T. ghua Shi Yan He Lin Chuang Bing Du Xue Za Zhi 12:226–228. (In Chinese.)