JVI Accepted Manuscript Posted Online 21 June 2017 J. Virol. doi:10.1128/JVI.00680-17 Copyright © 2017 American Society for Microbiology. All Rights Reserved.
Downloaded from http://jvi.asm.org/ on May 27, 2018 by UNIV OF WESTERN AUSTRALIA M209

High Resolution Meta-Transcriptomics Reveals the Ecological
Dynamics of Mosquito-Associated RNA Viruses in Western
Australia

Mang Shi^a, Peter Neville^b,c, Jay Nicholson^b,c,d, John-Sebastian Eden^a,e, Allison Imrie^c*,
Edward C. Holmes^a*

^a Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre,
School of Biological Sciences and Sydney Medical School, The University of Sydney,
Sydney, Australia.
^b Environmental Health Directorate, Public Health Division, Department of Health,
Government of Western Australia, Australia.
^c School of Biomedical Sciences, The University of Western Australia, Australia.
^d Center for Vectorborne Diseases, Department of Pathology, Microbiology and Immunology,
School of Veterinary Medicine, University of California, Davis, USA.
^e Centre for Virus Research, The Westmead Institute for Medical Research, Sydney, Australia.

*Corresponding authors:
Edward C. Holmes – [email protected]
Allison Imrie – [email protected]

Word count: Abstract – 247, Importance – 119, Main Text – 4835
Running title: Ecology of the Mosquito Virome
ABSTRACT Mosquitoes harbour a high diversity of RNA viruses, including many that
impact human health. Despite a growing effort to describe the extent and nature of the
mosquito virome, little is known about how these viruses persist, spread, and interact with
both their hosts and other microbes. To address this issue we performed a
meta-transcriptomics analysis of 12 Western Australian mosquito populations structured by
species and geographic location. Our results identified the complete genomes of 24 species of
RNA viruses from a diverse range of viral families and orders, among which 19 are newly
described. Comparisons of viromes revealed a striking difference between the two mosquito
genera, with viromes of mosquitoes from the Aedes genus exhibiting substantially less
diversity and lower abundance than those of the Culex genus, within which viral abundance
reached 16.87% of the total non-rRNA. In addition, there was little overlap in viral diversity
between the two genera, although the viromes were very similar among the three Culex
species studied, suggesting that host taxon plays a major role in structuring virus diversity. In
contrast, we found no evidence that geographic location played a major role in shaping RNA
virus diversity, and several viruses discovered here exhibited high similarity (95-98%
nucleotide identity) to those from Indonesia and China. Finally, using abundance level and
phylogenetic relationships we were able to distinguish potential mosquito viruses from those
present in co-infecting bacteria, fungi, and protists. In sum, our meta-transcriptomics
approach provides important insights into the ecology of mosquito RNA viruses.

IMPORTANCE Studies of virus ecology have generally focused on individual viral
species. However, recent advances in bulk RNA sequencing make it possible to utilize
meta-transcriptomic approaches to reveal both the complete diversity of viruses and their
relative abundance. We used such a meta-transcriptomic approach to determine key aspects of
the ecology of mosquito viruses in Western Australia. Our results show that RNA viruses are one
of the most important components of the mosquito transcriptome, and we identified 19 new
virus species from a diverse set of virus families. A key result was that host genetic
background plays a more important role in shaping virus diversity than sampling location,
with Culex species harbouring more viruses at greater abundance than those from Aedes
mosquitoes.
Mosquitoes (Diptera: Culicidae) act as vectors for a number of disease agents that infect
humans and domestic animals, including malaria parasites, dengue virus, Chikungunya virus,
and Zika virus. However, in addition to their role as transmission vectors, mosquitoes harbour
a far larger virome, including many viruses that are confined to these insects, such that they are
“insect-specific” (1, 2). Although these insect-specific viruses have no direct impact on
public health, they may modulate the transmission of viruses that are pathogenic to
vertebrates (3). The development of metagenomic sequencing approaches has therefore led to
a re-evaluation of the mosquito virome, including the recent discovery of viruses in the
families Bunyaviridae (4-8), Rhabdoviridae (6, 9-11), Orthomyxoviridae (6, 12), Flaviviridae
(13-15), Mesoniviridae (16), and Reoviridae (8, 17), as well as in the unclassified Chuvirus (6)
and Negevirus (18) groups. In addition, metagenomics surveys have discovered viruses in
families not previously known to infect mosquitoes, such as the Iflaviridae, Dicistroviridae,
Totiviridae, Chrysoviridae, and Narnaviridae (8, 19-22). Although these viruses have not
been isolated or characterized in vivo, their host association is supported by the presence of
related endogenous viruses in the genomes of various mosquito species (8). Hence, it is clear
that mosquitoes harbour a substantial viral diversity, the majority of which may not be
associated with vertebrates (1, 2).
Despite our expanding knowledge of the mosquito virome, there have been few
studies of ecological aspects of these viruses within their hosts (1). It has been suggested that
most of these newly discovered viruses share features that distinguish them from “classic”
human pathogens, including (i) an inability to infect vertebrates or vertebrate cell lines, (ii) a
high prevalence, (iii) prolonged host infection, and (iv) vertical transmission (1, 2, 23). Based
on these features, these mosquito viruses have been referred to as “commensal” microbes (3).
In reality, however, little is known about their natural infection status (e.g. abundance,
frequency of superinfection), host specificity in relation to different mosquito species,
geographic distribution and movement, and interactions with hosts and other microbes that
may be present within a specific host.
To reveal more of the natural ecology of mosquito RNA viruses we employed a
meta-transcriptomics approach to characterise the entire RNA environment excluding ribosomal
RNA (rRNA) within a mosquito sample. Meta-transcriptomics has several advantages over
approaches such as cell culture, consensus PCR, and metagenomics methods based on viral
particle purification (24, 25), and has proven successful in characterizing the RNA viromes of
diverse invertebrates (6, 8, 14, 20). Specifically: (i) it reveals the entire RNA virome, with
sufficient coverage to reconstruct complete viral genomes, including those from co-infecting
parasites; (ii) it provides a reliable quantification and assessment of both viral and host
RNAs; and (iii) it is relatively simple, requiring minimal sample processing. Most
importantly, meta-transcriptomics provides more information than the genome sequence
alone, allowing a straightforward characterization of viral diversity and ecology.
To infer aspects of virome ecology among mosquito species sampled from different
geographic locations we characterized the total transcriptome of 12 mosquito populations,
comprising five species collected from four locations in Western Australia. In particular, we
determined the number, type, and abundance of each virus within the context of the host
transcriptome and that of other microbial symbionts/parasites, and addressed whether these
parameters varied by species and/or sampling location.

RESULTS
The mosquito virome. We characterized the total transcriptome of 12 mosquito pools,
representing five species of mosquitoes sampled from four geographic locations in Western
Australia (Fig. 1). RNA sequencing of ribosomal RNA (rRNA)-depleted libraries resulted in 40-47
million reads per pool, which were assembled de novo into 159,861 to 225,352 contigs.
Subsequent BLAST analyses revealed the complete genomes of 24 species of RNA viruses, of
which 19 are newly described here. These virus species fell into a wide range of RNA virus
groups, including those that fell within existing families and orders, namely the
Bunyaviridae, Mononegavirales, Orthomyxoviridae, Narnaviridae, Mesoniviridae,
Partitiviridae, Reoviridae, Totiviridae, and Chrysoviridae, as well as several newly described
groups: Qinvirus (a highly divergent group of negative-sense RNA viruses (8)), the
Partiti-like viruses, the Luteo-like viruses and the Negev-like viruses (Table 1). Importantly, these
viruses were unlikely to represent endogenous viral elements (EVEs) as they were present as
complete genomes without any interruption by frame-shifts, nonsense mutations, repeat
sequences, reverse transcriptases, or other features that are common to EVEs.
For each library, the number of virus species varied from 1 to 10 (Table 1). The
total viral abundance (i.e. frequency) in each library also varied, from 0.013% to 16.87% of
total non-rRNA reads within the pool (Table 1). In comparison, the host gene RPL32, which is
often used as a reference gene in quantitative PCR assays, showed consistent abundance levels
across all libraries (0.034 – 0.065%; Table 2). This suggests that the huge variation in
viral number and abundance is unlikely to be an artefact of sample processing or nucleic acid
extraction. Indeed, for individual viral species the abundance levels were comparable across
libraries, including both highly abundant viruses such as Culex phasma-like virus (1.632 –
4.113%) and those of lower abundance like Culex mononega-like virus 2 (0.011 – 0.034%).
Overall, for all the Culex pools, the total abundance levels of viral RNA were above 4% of
total non-rRNA, suggesting that RNA viruses can make up a substantial part of the RNA
environment in mosquitoes.
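The abundance comparison described above can be sketched in a few lines of Python. This is only an illustration of the reasoning (viral abundance expressed as a percentage of non-rRNA reads, with a host reference gene used to check that the variation is not a processing artefact). The read counts, pool names, and function names are hypothetical and are not taken from the study, and the study's actual quantification pipeline is not described in this section.

```python
from statistics import mean, pstdev


def percent_of_non_rrna(reads_mapped: int, total_non_rrna_reads: int) -> float:
    """Relative abundance as a percentage of the non-rRNA reads in one pool."""
    if total_non_rrna_reads <= 0:
        raise ValueError("total_non_rrna_reads must be positive")
    return 100.0 * reads_mapped / total_non_rrna_reads


def coefficient_of_variation(values) -> float:
    """Population standard deviation divided by the mean (unitless spread)."""
    values = list(values)
    return pstdev(values) / mean(values)


# Hypothetical read counts for three pools (NOT data from the study).
pools = {
    "pool_A": {"non_rrna": 40_000_000, "virus": 1_800_000, "RPL32": 16_000},
    "pool_B": {"non_rrna": 42_000_000, "virus": 6_300_000, "RPL32": 21_000},
    "pool_C": {"non_rrna": 45_000_000, "virus": 9_000, "RPL32": 27_000},
}

virus_pct = [percent_of_non_rrna(p["virus"], p["non_rrna"]) for p in pools.values()]
rpl32_pct = [percent_of_non_rrna(p["RPL32"], p["non_rrna"]) for p in pools.values()]

# A stable reference gene should vary far less across pools than the viral fraction.
cv_virus = coefficient_of_variation(virus_pct)
cv_rpl32 = coefficient_of_variation(rpl32_pct)
print(f"CV of viral abundance: {cv_virus:.2f}")
print(f"CV of RPL32 abundance: {cv_rpl32:.2f}")
```

If the reference gene varied as widely as the viral fraction, differences in extraction or library preparation would be a plausible explanation; a much smaller spread for the reference gene argues against that.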
Also of note was that some of the viruses were highly prevalent. In particular, Culex
phasma-like virus (CPLV) and Wuhan mosquito virus 6 (WHMV6) appeared in all of the
Culex pools, while Culex mononega-like virus 1 and 2 (CMLV1 and 2), Zhejiang mosquito
virus 3 (ZJMV3), and Hubei chryso-like virus 1 (HBCLV1) appeared in most of the Culex
pools. WHMV6, ZJMV3, and HBCLV1 were also prevalent in the Culex species from China
(6). Importantly, each of these viruses had consistent abundance levels across different
libraries and was absent from the Aedes pools, suggesting that they are unlikely to result
from contamination. This observation highlights the persistence of some viral infections in
Culex mosquitoes, to the extent that infections are the norm rather than the exception.

Virome ecology. Our analysis revealed substantial differences between the Aedes and Culex
genera in terms of virus composition and abundance. Generally, the Aedes mosquitoes
contained fewer viruses than the Culex mosquitoes (Fig. 2). Although the Ae.
camptorhynchus pool from South Guildford contained seven viral species, all were at low
abundance and of uncertain host association (see below). More striking was that the total
viral abundance was much lower in the Aedes pools (0.013 – 0.391%) than the Culex pools
(4.508 – 16.87%), an observation that was consistent across sampling locations.
The differences between the two mosquito genera were also reflected in the types of
the viruses they harboured (Fig. 3A). Of the 24 viral species discovered, only two – Wilkie
qin-like virus (WQLV) and Wilkie narna-like virus 1 (WNLV1) – were shared between the
Aedes and Culex pools (Table 1). However, the fact that these viruses had low abundance and
co-appeared with a group of related fungal pathogens rendered them more likely to be associated
with fungi than mosquitoes (see below). The lack of similarity between the Aedes and Culex
viromes was in marked contrast to the number of common viral species found among the
three Culex species (Fig. 3 and Table 1). Notably, Cx. quinquefasciatus shared five of its six
viruses with the other two Culex species despite the substantial genetic distance between
these hosts (Fig. 1). Conversely, no viruses were shared between Ae. camptorhynchus and Ae.
alboannulatus, although only one virus was discovered in Ae. alboannulatus.
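One simple way to express the virome comparisons above is as set overlap between host species. The sketch below uses the Jaccard index, which is our choice for illustration and is not stated to be the measure used in the study; the host labels and virus names are placeholders rather than the contents of Table 1.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Shared species divided by all species seen in either virome."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical presence/absence data (placeholders, not Table 1).
viromes = {
    "Culex_sp1": {"virus_A", "virus_B", "virus_C", "virus_D"},
    "Culex_sp2": {"virus_A", "virus_B", "virus_C", "virus_E"},
    "Aedes_sp1": {"virus_F", "virus_G"},
}

# Pairwise overlap between every pair of host species.
overlap = {
    (h1, h2): jaccard(viromes[h1], viromes[h2])
    for h1, h2 in combinations(sorted(viromes), 2)
}

for (h1, h2), score in overlap.items():
    shared = sorted(viromes[h1] & viromes[h2])
    print(f"{h1} vs {h2}: Jaccard = {score:.2f}, shared = {shared}")
```

With real presence/absence data, high within-genus scores and near-zero between-genus scores would correspond to the pattern reported in Fig. 3.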
Also of note was that there was a substantial overlap between the viromes from the
three locations that harboured Culex species (Fig. 3B). The fourth location, Leschenault
Peninsula, contained only Ae. camptorhynchus mosquitoes whose virome was very limited.
Hence, there is seemingly a lack of geographic structure to the RNA virome at the scale of
this study. Indeed, the geographic distribution of each of these viral species may be much
broader and involve locations outside of Australia. In particular, several of the viruses we
identified shared high genetic identity with those found in disparate geographic locations,
including Wuhan mosquito virus 6 (98% nucleotide identity), Zhejiang mosquito virus 3
(96%), Hubei chryso-like virus 1 (97%), and Shuangao chryso-like virus 1 (97%), which were
also identified in China (6), as well as Ngewotan virus (99%) from Indonesia (18).

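The nucleotide identity values quoted above are pairwise percentages. A minimal sketch of such a calculation for two pre-aligned sequences follows. The function name and the toy sequences are hypothetical, and skipping gapped columns is an assumption made here; the section does not say how the reported identities were computed.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped; this is an
    assumption of the sketch, not a rule taken from the study.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment with one mismatch in ten ungapped columns.
print(percent_identity("ACGUACGUAC", "ACGUACGAAC"))
```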
Evolutionary history of the newly identified RNA viruses. While the majority of the
viruses identified from this study exhibited relatively close relationships to viruses previously
described in mosquitoes, dipteran insects, or other related arthropods, six clustered
with fungal viruses (see below; Fig. 4-6). The clustering of mosquito-associated viruses from
different countries or mosquito species was apparent at many places within the phylogenies
and sometimes these monophyletic groups contained substantial genetic diversity, suggestive
of a long-term association between the viruses and their mosquito hosts. Notably, the
mosquito-associated clusters often contained multiple viral lineages associated with single or
multiple host species/genera (Fig. 4-6), with no clear pattern of virus-host co-divergence,
although this needs to be examined with a much larger sample size.

Negative-sense RNA viruses. We discovered eight putative negative-sense RNA viruses,
representing all the major taxonomic categories (Table 1). Among these, six were related to
previously described mosquito viruses, while the remaining two viruses either grouped with a
fungal virus (WOLV1, Ophioviridae) or were of uncertain host association (WQLV, Qinvirus)
(Fig. 4). In the RNA-dependent RNA polymerase (RdRp) phylogeny, CPLV clustered within the
recently proposed phasmavirus group (family Bunyaviridae) (36), whose host range is currently
limited to arthropods (6, 8). Its closest relative was Wuhan mosquito virus 2 identified from
Culex mosquitoes in China. CPLV showed a genome structure typical of phasmaviruses, which
have substantially shorter glycoprotein-encoding segments than other bunyaviruses.
Culex rhabdo-like virus (CRLV), CMLV1, and CMLV2 were related to viruses from
the order Mononegavirales. CRLV was from the Dimarhabdovirus group and related to
North Creek virus that was isolated from Cx. sitiens (Wiedemann) sampled on the east coast
of Australia (10) (Fig. 4), while CMLV1 and 2 grouped with Xincheng Mosquito virus in a
currently unclassified clade. Interestingly, CMLV1 had a bi-segmented genome arrangement
that occurs only rarely in the Mononegavirales (37) (Fig. 4), although the most closely related
viruses – CMLV2 and Xincheng Mosquito virus – both had unsegmented genomes.
Aedes alboannulatus orthomyxo-like virus (AAOLV) and WHMV6 belonged to two
separate mosquito-associated clusters within the family Orthomyxoviridae. WHMV6 was
initially identified in Culex mosquitoes from China (6, 8), and we were able to reveal two
more genome segments, containing a glycoprotein gene and a gene of unknown function, in
addition to those described previously, making a total of 6 segments. Although AAOLV was
only discovered in one pool, it had moderately high
abundance (0.217%) and clustered with viruses identified from the other mosquito hosts in
China (Fig. 4), which suggested a potential association with Ae. alboannulatus.

Positive-sense RNA viruses. The positive-sense RNA viruses discovered in this study fell
within the Narnaviridae, Mesoniviridae (Nidovirales), Negev-like viruses, and
Luteoviridae-related viruses (Fig. 5). The Negev-like viruses were initially identified in
mosquitoes (18), and the group has now been expanded to include viruses from a number of other
arthropod species. Based on the RdRp, the Negev-like viruses form part of a larger group
referred to as the alpha-like supergroup (38) or Hepe-Virga-like group (8), which includes the
Togaviridae, Virgaviridae, Hepeviridae, and Tymovirales, amongst others. We identified four
divergent viruses within the Negev-like virus
group. Among these, Culex Negev-like virus 2 and 3 (CNLV2 and 3) were closely related to
viruses identified in mosquitoes and had a similar genome structure to the prototype Negev
virus (Fig. 5). In contrast, Culex Negev-like virus 1 (CNLV1) was distantly related to a virus
identified from nematodes (Fig. 5). However, since CNLV1 had moderately high abundance
and appeared in three Culex pools that contained no traces of nematode genes, its host
association was more likely mosquitoes. The Aedes camptorhynchus Negev-like virus
(ACNLV) showed a distant relationship with Muthill virus and Marsac virus identified from
flies (Fig. 5). Its genome had several unique features, including a permuted RdRp domain, a
potential stop codon read-through site between the helicase and RdRp domains, and a
distinctive (and longer) set of genes downstream of the replicase.
We also identified four viruses from the Narnaviridae (Fig. 5). Of these, Zhejiang
mosquito virus 3 (ZJMV3) was highly abundant and prevalent across all the Culex pools,
while the other three viruses were of low abundance and clustered with fungal pathogens.
Viruses closely related to ZJMV3 have been identified in China (8), France (19), and the
United States (20), and can be distinguished from other narnaviruses because of their
dual-coding genome structure, characterized by two open reading frames (ORFs) that cover the
complete length of both the sense and anti-sense genome (Fig. 5). One of the ORFs encodes
the RdRp, while the other has no detectable homology to any known gene. Importantly, this
feature was conserved across a divergent phylogenetic group, including more distantly related
viruses such as Hubei narna-like virus 20 (Fig. 5).
We also identified a virus related to the Luteo-Sobemo-like group whose host range
has recently expanded from plants to include arthropods, nematodes, molluscs, and protists
(8). Specifically, we identified a single member of this group, Culex luteo-like virus (CLLV),
that was related to Hubei sobemo-like virus 41 previously identified in mosquitoes from
China. Despite its relatively low abundance, it was identified in the three Culex pools that did
not contain any abundant cellular parasites (Table 2), suggesting that it is most likely
associated with Culex mosquitoes. The genome of CLLV contained two segments, encoding
the replicase and the capsid (identified by structural BLAST). The replicase segment contained a
ribosomal frameshift site before the coding regions of the RdRp, typical of the members of
the Luteo-Sobemo-like group (8).

Double-stranded RNA viruses. We identified seven double-stranded RNA viruses, from the
Chrysoviridae (n = 2), Totiviridae (n = 1), Reoviridae (n = 1), and Partitiviridae (n = 3).
With the exception of the three viruses from the Partitiviridae, all these viruses were related
to those identified from mosquito or other arthropod hosts (Fig. 6). Hubei chryso-like virus 1
(HBCLV1) and Shuangao chryso-like virus 1 (SCLV1), initially identified from mosquitoes
in China, were now found to be prevalent in Culex mosquitoes from Western Australia. Their
complete genomes, as revealed in this study, contained four segments similar to the prototype
genome of the Chrysoviridae (Fig. 6). In the case of the Totiviridae we identified the Aedes
camptorhynchus toti-like virus (ACTLV), which, like the other totiviruses, has an
unsegmented genome comprising two major ORFs. Finally, the only reovirus identified here
– Aedes camptorhynchus reo-like virus – was related to Hubei reo-like virus 11 from
dragonflies, which in turn formed a distant sister clade to viruses of the genus Phytoreovirus.

Revealing host associations. The total transcriptomes described here not only contained
virus transcripts, but also abundantly expressed host genes and those from other intra-host
microbes such as bacteria, archaea, fungi, and protists. To reveal the presence and diversity of
these microbes we searched within the assembled transcripts for the presence of abundantly
expressed marker genes of cellular organisms. In this way we were able to identify several
dominant microbes within the mosquito host: some were related to parasites known to cause
infections in humans (e.g. Leishmania), whereas others included intracellular symbiotic
bacteria such as Wolbachia strain wPip (Table 2). Generally, the abundance levels of genes
from the (non-viral) microbes were orders of magnitude lower than those of the mosquito
hosts (Table 2). In addition, we identified a group of related fungi, which we termed
‘Unknown sp1, 2, and 3’, in three of the pools including both Ae. camptorhynchus and Cx.
globocoxitus. The abundance level of these fungi was relatively high: in one of the Ae.
camptorhynchus pools the abundance of the fungal cox1 gene reached 0.125%, compared to
0.669% for that of mosquitoes. Interestingly, two viruses (WQLV and WPLV2) found in both
Ae. camptorhynchus and Cx. globocoxitus co-appeared with these fungi (Tables 1 and 2), and
the viruses and fungi had matching evolutionary histories (Fig. 7). Furthermore, WQLV and
WPLV2 both grouped with fungal viruses rather than mosquito or arthropod viruses (Fig. 3
and 5). Collectively, these results provide strong evidence that these viruses were more likely
to be associated with fungi than mosquitoes.
Finally, to provide a summary of potential host association for all the viruses
discovered here, we considered several key attributes that are relevant to host association:
abundance level, prevalence, host association of close relatives, and co-appearance with other
cellular microbes within the hosts (Table 3). Among the 24 viruses identified here, 16 were
likely to be associated with mosquitoes under these criteria, whereas eight were more likely
associated with other hosts, although this clearly requires additional confirmation.
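The four attributes listed above lend themselves to a simple rule-based summary. The sketch below is purely illustrative: the class, the field names, the thresholds, and the voting rule are all hypothetical, and the study does not state that host association was assigned with numeric cut-offs of this kind.

```python
from dataclasses import dataclass


@dataclass
class VirusEvidence:
    name: str
    abundance_pct: float               # % of non-rRNA reads
    pools_positive: int                # pools of the candidate host with the virus
    pools_total: int                   # pools of the candidate host examined
    relatives_infect_arthropods: bool  # host association of close relatives
    co_occurs_with_microbe: bool       # tracks a fungus, protist, etc. across pools


def likely_mosquito_associated(
    v: VirusEvidence,
    min_abundance_pct: float = 0.01,  # hypothetical threshold
    min_prevalence: float = 0.5,      # hypothetical threshold
    min_votes: int = 3,               # hypothetical decision rule
) -> bool:
    """Count how many of the four attributes point towards a mosquito host."""
    votes = [
        v.abundance_pct >= min_abundance_pct,
        v.pools_positive / v.pools_total >= min_prevalence,
        v.relatives_infect_arthropods,
        not v.co_occurs_with_microbe,
    ]
    return sum(votes) >= min_votes


# Two made-up examples (not viruses from Table 3).
abundant = VirusEvidence("example_virus_1", 2.0, 4, 5, True, False)
rare = VirusEvidence("example_virus_2", 0.0005, 2, 12, False, True)
print(likely_mosquito_associated(abundant))
print(likely_mosquito_associated(rare))
```

In practice these lines of evidence are weighed together rather than reduced to a vote, so a sketch like this is best read as a way of making the criteria explicit, not as a classifier.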

DISCUSSION
We have used a meta-transcriptomics approach to reveal key aspects of the ecology of RNA
viruses in mosquitoes from Western Australia. Of particular interest was the high diversity, high
prevalence, and relatively high abundance for a number of the RNA viruses from multiple
virus groups. Hence, these results highlight the capacity of Culex mosquitoes to tolerate high
levels of viral RNA, as has been described for other invertebrates (6, 8). Indeed, given the
very high prevalence of these viruses it seems intuitively unlikely that these viruses are
associated with severe disease in their hosts, and we propose that the most likely status for
these viruses is either sub-lethal infection or commensalism. This is supported by the observation
that viruses have been detected in both laboratory mosquito colonies and insect cell lines that
show little loss of fitness (39, 40), although this clearly requires additional study.
Our results also revealed a striking difference between the viral diversities harboured
by the Aedes and Culex genera of mosquitoes: infections in the former group are sporadic and
there is little resemblance between the different populations, although clearly this needs to be
examined with more data. Similarly, among the previously described vector-borne viruses,
there is little overlap in the viruses carried by Aedes and Culex mosquitoes, such that the
diversity of mosquito-borne flaviviruses can be further subdivided into Culex- or
Aedes-associated phylogenetic groups (41, 42). The three Culex species studied here (Cx.
quinquefasciatus, Cx. australicus and Cx. globocoxitus) all form part of the Culex pipiens
complex and are closely related in their cox1 gene sequences (43, 44). Hence, the similarity
in viromes between the three Culex species may in part reflect their close evolutionary
relationships, which may in turn dictate similarities in the cellular environment,
immunological response, and perhaps ecological niche (45). The two mosquito genera also
exhibit a large discrepancy in virus numbers and abundance, which is robust across all
comparisons despite the relatively small sample size (Fig. 2).
In contrast to the difference in viromes between genera, the Culex virome was
relatively homogeneous among the species and across the regions sampled. Furthermore, a
number of the viruses discovered here were not only found in Western Australia, but also in
regional countries like China and Indonesia, indicating that they infect hosts over a wide
geographical area. As the viruses present in these different countries are very similar (95-98%
nucleotide identity), such limited genetic distance tentatively suggests that these
viruses were introduced by windblown mosquitoes (46, 47), by cyclones from neighbouring
regions (48), or were inadvertently spread by humans, rather than being the result of ancient
mosquito dispersal. Conversely, on current data there appears to be relatively little overlap
between the mosquito viromes sampled from Western Australia and other parts of Australia
(3, 10), which may reflect the different mosquito species present in these localities. In
addition, a previous survey of viruses in eastern Australia was performed after the viruses
were passaged in cell culture, which may eliminate some of the viruses present in the
original sample (10). A more complete characterization of virome ecology in Australia
evidently requires larger scale sampling covering more geographic locations and mosquito
species. Similarly, the current study relied on the collection of mosquitoes with Encephalitis
Virus Surveillance CO2 traps that are likely to be biased in the species of mosquitoes
collected. A broader sampling of mosquito fauna to determine the overall diversity of viruses
will evidently require the use of a variety of trapping techniques that reflect specific mosquito
habits or attraction to collection traps.
Although our study was directed toward mosquitoes (20), it was striking that we
identified a number of RNA viruses that were likely associated with hosts other than
mosquitoes. Specifically, potential non-mosquito viruses were revealed through phylogenetic
analysis (i.e. that they clustered with viruses from fungi rather than from mosquitoes),
evidence of co-divergence with their microbial hosts, and their low abundance (Table 3, Fig.
7). The abundance of these confirmed and suspected microbial viruses was generally below
0.001% of the total non-rRNA reads (i.e. so low that they are unlikely to be associated with
mosquitoes), although the highest (WQLV) reached 0.074%. This, in turn, suggests that viral
abundance level is a useful indication of host association, although it should be examined in
the context of the type and quantity of the dominant microbes within the sample.
Finally, it is important to note that among the various virus species discovered here,
none fell into the category of “vector-borne” viruses that are known to infect humans or other
mammalian hosts. Indeed, in a previous metagenomics survey of mosquitoes and ticks, most
of the viruses discovered either clustered with “arthropod-specific” viruses or were
uncharacterized (1, 2), and the sequencing of nearly 200 mosquitoes revealed only two
known vector-borne viruses (6, 8). This suggests that human and vertebrate pathogens
represent only a tiny fraction of the mosquito virome, although it is possible that they exist at
very low copy numbers if they exhibit low levels of replication. Whatever the cause, the
observation that vector-borne viruses are rare further indicates how the characterization of the
mosquito virome provides important insight into the ecology and evolution of insect viruses.

MATERIALS AND METHODS
345 Sample collection. A total of 519 adult mosquitoes were collected in 2015 from four
346 locations in Western Australia considered to be of significant public health risk in relation to
347 mosquito-borne diseases including Ross River (RRV) and Barmah Forest viruses (BFV). The
348 four locations comprised (i) South Guildford, an eastern suburb of the Perth Metropolitan
349 Region located on the Swan River; (ii) Leschenault Peninsula, near Australind and (iii) Point
350 Douro, Bunbury, both of which are tidally driven inlet sites and approximately 160 km and
15 351 175 km southwest of Perth, respectively; and (iv) Siesta Park, in Dunsborough,
352 approximately 250 km southwest of Perth (Fig. 1). Mosquitoes were collected using Downloaded from
353 Encephalitis Virus Surveillance carbon dioxide (EVS CO2) traps that were set at each
354 location for approximately 12 hours. Each trap was baited with dry ice to attract mosquitoes.
355 Upon trap collection, the mosquitoes were euthanized by placing each collection on dry ice to
356 kill and preserve the mosquitoes and RNA. Mosquitoes were then placed in labelled vials and http://jvi.asm.org/ 357 left on dry ice until returned to the laboratory, where the samples were placed in a -80°C
358 freezer.
359 Mosquito species identification was initially carried out by experienced field
360 biologists using taxonomic keys (26) and dissecting microscopes on cold tables, later verified
361 by analysing the cytochrome c oxidase subunit I (cox1) gene (Fig. 1). The majority of the
362 mosquitoes collected in this study were from five species: Ae. camptorhynchus (Thomson),
363 Ae. alboannulatus (Macquart), Culex globocoxitus (Dobrotworsky), Cx. australicus
364 (Dobrotworsky and Drummond), and Cx. quinquefasciatus (Say). As Cx. globocoxitus and
365 Cx. australicus cannot be distinguished by cox1 gene sequences (Fig. 1), they were identified
366 using two main morphological diagnostic features – the tergal banding patterns and median
367 patches of dark scales on the sternites. Specifically, Cx. globocoxitus has tergal banding
368 without lateral constrictions and no dark patches of scales on the sternites, while Cx.
369 australicus has lateral constrictions on tergal bands and has prominent patches of dark scales
370 on the sternites (26). All mosquito samples were then categorized by species and geographic
371 locations and stored at -80°C before RNA extraction.
372
373 Sample processing and sequencing. RNA extraction and sequencing were carried out on 12
374 pools of mosquitoes, with each pool containing 5-10 representative female mosquitoes from
375 the same geographic region and species (Table 1). Prior to homogenization, each mosquito
376 pool was washed three times with 1 ml of sterile, RNA- and DNA-free PBS solution (GIBCO) to
377 remove external microbes. The samples were then homogenized in 600 µl of lysis buffer
378 using a TissueRuptor (Qiagen). Total RNA was extracted using an RNeasy Plus Mini Kit
379 following the manufacturer's instructions. The quality of the extracted RNA was evaluated
380 using an Agilent 2100 Bioanalyzer (Agilent Technologies). All extractions performed in this
381 study had an RNA integrity number (RIN) greater than 8.7. Sequencing libraries were constructed using a TruSeq
382 total RNA Library Preparation Kit (Illumina) with the host rRNA removed using a Ribo-Zero-Gold
383 (Human-Mouse-Rat) Kit (Illumina). Paired-end (100 bp) sequencing of each
384 library was then performed on the HiSeq 2500 platform (Illumina). All library preparation and
385 sequencing procedures were carried out by the Australian Genome Research Facility (AGRF).
386
387 RNA virus discovery. Sequencing reads were de-multiplexed and trimmed for quality with
388 Trimmomatic (27) before de novo assembly using Trinity (28). The resulting contigs were
389 first compared against the database of all reference RNA virus proteins downloaded from
390 GenBank using Blastx with an e-value cut-off of 1E-5. Potential viral contigs were then
391 compared to the entire non-redundant nucleotide (nt) and protein (nr) databases to remove
392 false-positives. The quality-filtered virus contigs with unassembled overlaps were then
393 merged using the SeqMan program implemented in the Lasergene software package v7.1
394 (DNAstar). To confirm the assembly results, reads were mapped back to the virus genomes
395 with Bowtie2 (29) and inspected using the Integrative Genomics Viewer (30) for any
396 assembly errors. The final sequences of the virus genomes were obtained from the majority
397 consensus of the mapping assembly.
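
The e-value screen described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes the Blastx results were written in the standard 12-column tabular layout (query in column 1, subject in column 2, e-value in column 11), treats a hit exactly at the cut-off as passing, and uses hypothetical contig and protein names.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # cut-off stated in the text


def best_hits_below_cutoff(tabular_text, evalue_cutoff=EVALUE_CUTOFF):
    """Return {query: (subject, evalue)}, keeping the lowest e-value hit per query.

    Assumes the 12-column BLAST tabular layout: column 1 is the query (contig),
    column 2 the subject (reference protein), and column 11 the e-value.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or malformed lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > evalue_cutoff:
            continue  # hit is weaker than the cut-off
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical contig and protein names, for illustration only.
example = (
    "contig_1\tprotein_A\t45.0\t200\t110\t2\t1\t600\t1\t200\t1e-50\t180\n"
    "contig_1\tprotein_B\t30.0\t150\t105\t3\t1\t450\t5\t154\t1e-08\t60\n"
    "contig_2\tprotein_C\t25.0\t80\t60\t1\t10\t249\t3\t82\t0.002\t35\n"
)
print(best_hits_below_cutoff(example))
```

Contigs that survive this screen would still need the second comparison against the nt and nr databases described above before being treated as viral.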
398
399 Virus genome annotation. The potential open reading frames (ORFs) of the newly identified
400 virus genomes were predicted based on those from the closest reference virus genomes. To
401 characterize the functional domains within each ORF we performed a domain-based Blast
402 search against the Conserved Domain Database (CDD) with an expected value threshold of
403 1E-5. The potential functions of the remaining ORFs were predicted by homology with other
404 known viral proteins. A potential viral glycoprotein from families of negative-sense RNA
405 viruses was identified based on the presence of (i) an N-terminal signal domain, (ii) a C-terminal
406 or mid-point transmembrane domain, and (iii) putative glycosylation sites.
407 For viruses with multiple segments, non-RdRp segments were usually
408 identified by homology to the proteins of related reference viruses. Other potential segments
409 with no detectable homology were identified using an in silico approach that utilizes information on RNA
410 quantity, protein structure, and/or conserved genome termini. To determine whether these
410 quantity, protein structure, and/or conserved genome termini. To determine whether these on May 27, 2018 by UNIV OF WESTERN AUSTRALIA M209 411 segments belonged to the same virus, we checked: (i) the sequencing depth of the segments,
412 (ii) the presence of conserved genome termini, (iii) the co-appearance with the RdRp
413 segments, and (iv) the phylogenetic positions of related viral proteins.
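
Criterion (ii), the presence of conserved genome termini, can be illustrated with a short sketch. The text does not state how conservation was scored, so the exact-match comparison below is an assumption made for illustration only; the function names and example sequences are hypothetical, and a real comparison might tolerate mismatches.

```python
def shared_prefix_length(sequences):
    """Length of the longest prefix that is identical in every sequence."""
    if not sequences:
        return 0
    length = 0
    for column in zip(*sequences):  # stops at the shortest sequence
        if len(set(column)) != 1:
            break
        length += 1
    return length


def conserved_termini(segments):
    """Return (5' shared length, 3' shared length) across candidate segments."""
    seqs = [s.upper() for s in segments]
    five_prime = shared_prefix_length(seqs)
    # Reversing each sequence turns the shared 3' end into a shared prefix.
    three_prime = shared_prefix_length([s[::-1] for s in seqs])
    return five_prime, three_prime


# Hypothetical segment sequences sharing 6 nt at the 5' end and 5 nt at the 3' end.
segments = ["ACGTACGGGGTTGCA", "ACGTACTTTCCCTTGCA"]
print(conserved_termini(segments))
```

Long shared termini across candidate segments would support, but not prove, that they belong to one virus, which is why the text combines this check with sequencing depth, co-appearance, and phylogenetic position.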
414
415 Identification of other microbes within mosquitoes. To identify abundant bacteria, fungi,
416 and protists within the mosquito populations sampled we searched the assembled
417 transcriptome for a collection of key marker genes that are abundantly and stably expressed
418 in eukaryotes and prokaryotes. Specifically, we looked for the cox1 and Ribosomal Protein
419 L32 (RPL32) genes to identify eukaryotes (including the mosquito host) and the DNA gyrase
420 subunit B (gyrB) and Recombinase A protein (recA) genes to identify prokaryotes. The
421 contigs discovered were then confirmed with (i) a blastx search against the nr database and (ii)
422 read mapping. The quality-screened contigs were then trimmed to contain only coding
423 regions for quantification (see below).
424
425 RNA quantification. To help determine the abundance of RNA transcripts we estimated the
426 percentage of total reads that mapped to target genomes/genes. The sequences used for
427 mapping involved viral genomes as well as the mosquito and microbial marker genes
428 identified above. The mapping was performed using Bowtie2 (31). The mapping results were
429 manually checked with IGV (30) for potential assembly errors.
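
The abundance measure described above is a simple proportion: abundance (%) = 100 × (reads mapped to the target) / (total reads in the library). The sketch below shows that calculation; the function names and read counts are hypothetical, and how the mapped-read counts are extracted from the Bowtie2 output is left outside the example.

```python
def percent_of_total_reads(mapped_reads, total_reads):
    """Abundance as the percentage of all reads in a library that map to a target."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mapped_reads <= total_reads:
        raise ValueError("mapped_reads must lie between 0 and total_reads")
    return 100.0 * mapped_reads / total_reads


def library_abundance(mapped_counts, total_reads, digits=3):
    """Map each target name to its abundance (% of total reads), rounded."""
    return {
        target: round(percent_of_total_reads(count, total_reads), digits)
        for target, count in mapped_counts.items()
    }


# Hypothetical read counts for one library of 20 million reads.
counts = {"virus_genome": 776_200, "host_RPL32": 8_600}
print(library_abundance(counts, total_reads=20_000_000))
```

Expressing every target as a share of the same library total is what allows virus, host, and microbial marker abundances to be compared within a pool.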
430
431 Phylogenetic analyses. We used the amino acid sequences of the viral replicase (i.e.
432 RNA-dependent RNA polymerase) to determine the evolutionary history of the newly discovered
433 viruses. For comparison, we included previously published viral protein sequences
434 representative of each of the relevant phylogenetic groups (e.g. virus family). This also
435 included all the previously described mosquito viruses within these groups. Within each
436 group, the replicase proteins were aligned using the E-INS-i algorithm in MAFFT (version 7)
437 (32). Ambiguously aligned regions were subsequently removed using TrimAl (33). Based on
438 the sequence alignment, the best-fit model of amino acid substitution was determined using
439 ProtTest 3.4 (34). Phylogenetic trees were then estimated using the maximum likelihood
440 method (ML) implemented in PhyML version 3.0 (35), utilizing the best-fit substitution
441 model and the Subtree Pruning and Regrafting (SPR) branch-swapping algorithm. Support
442 for individual nodes on the phylogenetic tree was assessed using an approximate likelihood
443 ratio test (aLRT) with the Shimodaira-Hasegawa-like procedure as implemented in PhyML.
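
To make the alignment-trimming step concrete, the sketch below removes alignment columns that consist mostly of gaps. It is a deliberately simplified stand-in and not TrimAl's algorithm: the gap-fraction rule, the 0.5 threshold, and the function name are assumptions chosen for illustration.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5):
    """Remove columns whose gap fraction exceeds max_gap_fraction.

    `alignment` maps sequence names to aligned sequences of equal length,
    with '-' marking gaps. Returns a new dict with the same names.
    """
    names = list(alignment)
    rows = [alignment[name] for name in names]
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        index
        for index, column in enumerate(zip(*rows))
        if column.count("-") / len(column) <= max_gap_fraction
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Hypothetical three-sequence protein alignment; the third column is mostly gaps.
aligned = {"virus_a": "M--LV", "virus_b": "MK-LV", "virus_c": "MKALV"}
print(drop_gappy_columns(aligned))
```

Only the third column is removed here, because two of its three residues are gaps; the second column, with a single gap, is retained.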
444
445 Accession numbers. The raw sequence reads generated in this study are available at the
446 NCBI Sequence Read Archive (SRA) database under BioProject accession PRJNA388696.
447 All virus genome sequences generated in this study have been deposited in GenBank under
448 the accession numbers MF176241 to MF176391.
449
450 ACKNOWLEDGMENTS
451 We thank the staff of the Environmental Health Directorate of the Department of Health,
452 Western Australia for the collection of mosquitoes from the Southwest of Western Australia.
453 In addition, we thank the staff at the City of Swan (especially Neil Harries and James
454 McCallum) for the collection of mosquitoes from the east of Perth. The authors also wish to
455 acknowledge The University of Sydney HPC service for
456 providing high performance computing resources that have contributed to the research results
457 reported within this paper. J-SE is supported by an NHMRC Early Career Fellowship
458 (GNT1073466) and ECH is supported by an NHMRC Australia Fellowship (GNT1037231).
459
460 REFERENCES
461 1. Bolling BG, Weaver SC, Tesh RB, Vasilakis N. 2015. Insect-specific virus
462 discovery: significance for the arbovirus community. Viruses 7:4911-4928.
463 2. Vasilakis N, Tesh RB. 2015. Insect-specific viruses and their potential impact on
464 arbovirus transmission. Curr Opin Virol 15:69-74.
465 3. Hall RA, Bielefeldt-Ohmann H, McLean BJ, O'Brien CA, Colmant AM,
466 Piyasena TB, Harrison JJ, Newton ND, Barnard RT, Prow NA, Deerain JM,
467 Mah MG, Hobson-Peters J. 2017. Commensal viruses of mosquitoes: host
468 restriction, transmission, and interaction with arboviral pathogens. Evol Bioinform
469 Online 12:35-44.
470 4. Marklewitz M, Handrick S, Grasse W, Kurth A, Lukashev A, Drosten C,
471 Ellerbrok H, Leendertz FH, Pauli G, Junglen S. 2011. Gouleako virus isolated
472 from West African mosquitoes constitutes a proposed novel genus in the family
473 Bunyaviridae. J Virol 85:9227-9234.
474 5. Marklewitz M, Zirkel F, Rwego IB, Heidemann H, Trippner P, Kurth A, Kallies
475 R, Briese T, Lipkin WI, Drosten C, Gillespie TR, Junglen S. 2013. Discovery of a
476 unique novel clade of mosquito-associated bunyaviruses. J Virol 87:12850-12865.
477 6. Li CX, Shi M, Tian JH, Lin XD, Kang YJ, Chen LJ, Qin XC, Xu J, Holmes EC,
478 Zhang YZ. 2015. Unprecedented genomic diversity of RNA viruses in arthropods
479 reveals the ancestry of negative-sense RNA viruses. eLife 4:e05378.
480 7. Marklewitz M, Zirkel F, Kurth A, Drosten C, Junglen S. 2015. Evolutionary and
481 phenotypic analysis of live virus isolates suggests arthropod origin of a pathogenic
482 RNA virus family. Proc Natl Acad Sci USA 112:7536-7541.
483 8. Shi M, Lin XD, Tian JH, Chen LJ, Chen X, Li CX, Qin XC, Li J, Cao JP, Eden
484 JS, Buchmann J, Wang W, Xu J, Holmes EC, Zhang YZ. 2016. Redefining the
485 invertebrate RNA virosphere. Nature 540:539-543.
486 9. Kuwata R, Isawa H, Hoshino K, Tsuda Y, Yanase T, Sasaki T, Kobayashi M,
487 Sawabe K. 2011. RNA splicing in a new rhabdovirus from Culex mosquitoes. J Virol
488 85:6185-6196.
489 10. Coffey LL, Page BL, Greninger AL, Herring BL, Russell RC, Doggett SL,
490 Haniotis J, Wang C, Deng X, Delwart EL. 2014. Enhanced arbovirus surveillance
491 with deep sequencing: identification of novel rhabdoviruses and bunyaviruses in
492 Australian mosquitoes. Virology 448:146-158.
493 11. Walker PJ, Firth C, Widen SG, Blasdell KR, Guzman H, Wood TG, Paradkar
494 PN, Holmes EC, Tesh RB, Vasilakis N. 2015. Evolution of genome size and
495 complexity in the Rhabdoviridae. PLoS Pathog 11:e1004664.
496 12. Presti RM, Zhao G, Beatty WL, Mihindukulasuriya KA, da Rosa AP, Popov VL,
497 Tesh RB, Virgin HW, Wang D. 2009. Quaranfil, Johnston Atoll, and Lake Chad
498 viruses are novel members of the family Orthomyxoviridae. J Virol 83:11599-11606.
499 13. Qin XC, Shi M, Tian JH, Lin XD, Gao DY, He JR, Wang JB, Li CX, Kang YJ,
500 Yu B, Zhou DJ, Xu J, Plyusnin A, Holmes EC, Zhang YZ. 2014. A tick-borne
501 segmented RNA virus contains genome segments derived from unsegmented viral
502 ancestors. Proc Natl Acad Sci USA 111:6744-6749.
503 14. Shi M, Lin XD, Vasilakis N, Tian JH, Li CX, Chen LJ, Eastwood G, Diao XN,
504 Chen MH, Chen X, Qin XC, Widen SG, Wood TG, Tesh RB, Xu J, Holmes EC,
505 Zhang YZ. 2015. Divergent viruses discovered in arthropods and vertebrates revise
506 the evolutionary history of the Flaviviridae and related viruses. J Virol 90:659-669.
507 15. Ladner JT, Wiley MR, Beitzel B, Auguste AJ, Dupuis AP, 2nd, Lindquist ME,
508 Sibley SD, Kota KP, Fetterer D, Eastwood G, Kimmel D, Prieto K, Guzman H,
509 Aliota MT, Reyes D, Brueggemann EE, St John L, Hyeroba D, Lauck M,
510 Friedrich TC, O'Connor DH, Gestole MC, Cazares LH, Popov VL, Castro-
511 Llanos F, Kochel TJ, Kenny T, White B, Ward MD, Loaiza JR, Goldberg TL,
512 Weaver SC, Kramer LD, Tesh RB, Palacios G. 2016. A multicomponent animal
513 virus isolated from mosquitoes. Cell Host Microbe 20:357-367.
514 16. Nga PT, Parquet Mdel C, Lauber C, Parida M, Nabeshima T, Yu F, Thuy NT,
515 Inoue S, Ito T, Okamoto K, Ichinose A, Snijder EJ, Morita K, Gorbalenya AE.
516 2011. Discovery of the first insect nidovirus, a missing evolutionary link in the
517 emergence of the largest RNA virus genomes. PLoS Pathog 7:e1002215.
518 17. Attoui H, Jaafar FM, Belhouchet M, Tao SJ, Chen BQ, Liang GD, Tesh RB, de
519 Micco P, de Lamballerie X. 2006. Liao ning virus, a new Chinese seadornavirus that
520 replicates in transformed and embryonic mammalian cells. J Gen Virol 87:199-208.
521 18. Vasilakis N, Forrester NL, Palacios G, Nasar F, Savji N, Rossi SL, Guzman H,
522 Wood TG, Popov V, Gorchakov R, Gonzalez AV, Haddow AD, Watts DM, da
523 Rosa AP, Weaver SC, Lipkin WI, Tesh RB. 2013. Negevirus: a proposed new
524 taxon of insect-specific viruses with wide geographic distribution. J Virol 87:2475-
525 2488.
526 19. Cook S, Chung BY, Bass D, Moureau G, Tang S, McAlister E, Culverwell CL,
527 Glucksman E, Wang H, Brown TD, Gould EA, Harbach RE, de Lamballerie X,
528 Firth AE. 2013. Novel virus discovery and genome reconstruction from field RNA
529 samples reveals highly divergent viruses in dipteran hosts. PLoS One 8:e80720.
530 20. Chandler JA, Liu RM, Bennett SN. 2015. RNA shotgun metagenomic sequencing
531 of northern California (USA) mosquitoes uncovers viruses, bacteria, and fungi. Front
532 Microbiol 6:185.
533 21. Cholleti H, Hayer J, Abilio AP, Mulandane FC, Verner-Carlsson J, Falk KI,
534 Fafetine JM, Berg M, Blomstrom AL. 2016. Discovery of novel viruses in
535 mosquitoes from the Zambezi valley of Mozambique. PLoS One 11:e0162751.
536 22. Frey KG, Biser T, Hamilton T, Santos CJ, Pimentel G, Mokashi VP, Bishop-
537 Lilly KA. 2016. Bioinformatic characterization of mosquito viromes within the
538 eastern United States and Puerto Rico: discovery of novel viruses. Evol Bioinform
539 Online 12:1-12.
540 23. Junglen S, Drosten C. 2013. Virus discovery and recent insights into virus diversity
541 in arthropods. Curr Opin Microbiol 16:507-513.
542 24. Mokili JL, Rohwer F, Dutilh BE. 2012. Metagenomics and future perspectives in
543 virus discovery. Curr Opin Virol 2:63-77.
544 25. Radford AD, Chapman D, Dixon L, Chantrey J, Darby AC, Hall N. 2012.
545 Application of next-generation sequencing technologies in virology. J Gen Virol
546 93:1853-1868.
547 26. Liehne PFS. 1991. An Atlas of the Mosquitoes of Western Australia. Health
548 Department of Western Australia.
549 27. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina
550 sequence data. Bioinformatics 30:2114-2120.
551 28. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis
552 X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A,
553 Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N,
554 Regev A. 2011. Full-length transcriptome assembly from RNA-Seq data without a
555 reference genome. Nat Biotechnol 29:644-652.
556 29. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat
557 Methods 9:357-359.
558 30. Thorvaldsdottir H, Robinson JT, Mesirov JP. 2013. Integrative Genomics Viewer
559 (IGV): high-performance genomics data visualization and exploration. Brief
560 Bioinform 14:178-192.
561 31. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat
562 Methods 9:357-359.
563 32. Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software
564 version 7: improvements in performance and usability. Mol Biol Evol 30:772-780.
565 33. Capella-Gutierrez S, Silla-Martinez JM, Gabaldon T. 2009. trimAl: a tool for
566 automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics
567 25:1972-1973.
568 34. Darriba D, Taboada GL, Doallo R, Posada D. 2011. ProtTest 3: fast selection of
569 best-fit models of protein evolution. Bioinformatics 27:1164-1165.
570 35. Guindon S, Gascuel O. 2003. A simple, fast, and accurate algorithm to estimate large
571 phylogenies by maximum likelihood. Syst Biol 52:696-704.
572 36. Ballinger MJ, Bruenn JA, Hay J, Czechowski D, Taylor DJ. 2014. Discovery and
573 evolution of bunyavirids in arctic phantom midges and ancient bunyavirid-like
574 sequences in insect genomes. J Virol 88:8783-8794.
575 37. Dietzgen RG, Kondo H, Goodin MM, Kurath G, Vasilakis N. 2017. The family
576 Rhabdoviridae: mono- and bipartite negative-sense RNA viruses with diverse genome
577 organization and common evolutionary origins. Virus Res 227:158-170.
578 38. Koonin EV. 1991. The phylogeny of RNA-dependent RNA polymerases of positive-
579 strand RNA viruses. J Gen Virol 72:2197-2206.
580 39. Stollar V, Thomas VL. 1975. An agent in the Aedes aegypti cell line (Peleg) which
581 causes fusion of Aedes albopictus cells. Virology 64:367-377.
582 40. Bolling BG, Vasilakis N, Guzman H, Widen SG, Wood TG, Popov VL. 2015.
583 Insect-specific viruses detected in laboratory mosquito colonies and their potential
584 implications for experiments evaluating arbovirus vector competence. Am J Trop
585 Med Hyg 92:422-428.
586 41. Coffey LL, Forrester N, Tsetsarkin K, Vasilakis N, Weaver SC. 2013. Factors
587 shaping the adaptive landscape for arboviruses: implications for the emergence of
588 disease. Future Microbiol 8:155-176.
589 42. Huang YJ, Higgs S, Horne KM, Vanlandingham DL. 2014. Flavivirus-mosquito
590 interactions. Viruses 6:4703-4730.
591 43. Batovska J, Blacket MJ, Brown K, Lynch SE. 2016. Molecular identification of
592 mosquitoes (Diptera: Culicidae) in southeastern Australia. Ecol Evol 6:3001-3011.
593 44. Smith JL, Fonseca DM. 2004. Rapid assays for identification of members of the
594 Culex (Culex) pipiens complex, their hybrids, and other sibling species (Diptera:
595 Culicidae). Am J Trop Med Hyg 70:339-345.
596 45. Streicker DG, Turmelle AS, Vonhof MJ, Kuzmin IV, McCracken GF,
597 Rupprecht CE. 2010. Host phylogeny constrains cross-species emergence and
598 establishment of rabies virus in bats. Science 329:676-679.
599 46. Johansen CA, Van den Hurk AF, Ritchie SA, Zborowski P, Nisbet DJ, Paru R,
600 Bockarie MJ, MacDonald J, Drew AC, Khromykh TI, MacKenzie JS. 2000.
601 Isolation of Japanese Encephalitis virus from mosquitoes (Diptera: Culicidae)
602 collected in the Western Province of Papua New Guinea, 1997-1998. Am J Trop Med
603 Hyg 62:631-638.
604 47. Ritchie S, Rochester W. 2001. Wind blown mosquitoes and introduction of Japanese
605 Encephalitis into Australia. Emerg Infect Dis 5:900-903.
606 48. Inglis TJJ, O’Rielly L, Merritt AJ, Levy A, Heath C. 2011. Review: The aftermath
607 of the western Australian melioidosis outbreak. Am J Trop Med Hyg 84:851-857.
608 FIGURE LEGENDS
609 FIG 1 Information on the host and geographic location (south-western Australia) of the
610 mosquito samples collected in this study. Upper panel: maximum likelihood phylogeny of the
611 Cytochrome C Oxidase (cox1) gene from mosquito samples collected in this study. The name
612 of each sequence gives the sampling location and the host species
613 identified in the field. Lower panel: locations of the four sampling sites, marked by solid black
614 dots.
615
616 FIG 2 An overview of the diversity and abundance of the RNA viruses discovered. From the
617 top to bottom we show four column graphs depicting the number of viruses, the composition
618 of viral families, the abundance of the total virome, and the abundance of the host RPL32 gene in
619 each of the 12 pools sequenced here. The mosquito species and location information for each
620 pool are shown at the top of the figure.
621
622 FIG 3 The similarity of viromes between (A) host species and (B) geographic locations. The
623 size of the circle is proportional to the total number of viruses discovered in each mosquito
624 species (A) or geographic location (B). Within the circle, information on the host species or
625 geographic location and the number of viruses (in parentheses) is provided. The thickness of
626 the line connecting the circles reflects the number of viruses shared between species or
627 geographic locations. The number of shared viruses is shown next to the line.
628
629 FIG 4 Evolutionary history and genomic features of the negative-sense RNA viruses
630 discovered. The maximum likelihood phylogenetic trees show the position of newly
631 discovered viruses (solid black circles) in the context of representatives of their closest
632 relatives. The names of mosquito viruses identified in previous studies are marked in red and
633 contain the information of the mosquito species from which they were sampled (square
634 brackets). The genome structures of these newly discovered viruses are shown next to their
635 corresponding phylogenies. Predicted ORFs of these genomes are labelled with information
636 of the potential protein or protein domain they encode.
637
638 FIG 5 Evolutionary history and genomic features of the positive-sense RNA viruses
639 discovered. The legend is the same as that of Figure 4.
640
641 FIG 6 Evolutionary history and genomic features of the double-stranded RNA viruses
642 discovered. The legend is the same as that of Figure 4.
643
644 FIG 7 The matching tree topologies of the Wilkie qin-like viruses and a group of fungi (cox1
645 gene) discovered in three mosquito pools. Pool information is given in the middle of the two
646 phylogenies, both of which are mid-point rooted for clarity only.
647 Table 1. The presence and abundance of viruses from different mosquito species and locations (% total reads)
Virus name | Classification | Ae. camptorhynchus LocA | Ae. camptorhynchus LocB | Ae. camptorhynchus LocC | Ae. camptorhynchus LocD | Ae. alboannulatus LocD | Cx. globocoxitus LocA | Cx. globocoxitus LocC | Cx. globocoxitus LocD | Cx. australicus LocA | Cx. australicus LocC | Cx. australicus LocD | Cx. quinquefasciatus LocD
Culex phasma-like virus (CPLV) | Bunyaviridae | 0 | 0 | 0 | 0 | 0 | 3.881 | 4.113 | 3.547 | 3.908 | 2.659 | 3.952 | 1.632
Culex mononega-like virus 1 (CMLV1) | Mononegavirales | 0 | 0 | 0 | 0 | 0 | 0.193 | 0.068 | 0.191 | 0 | 0.059 | 0.063 | 0
Culex mononega-like virus 2 (CMLV2) | Mononegavirales | 0 | 0 | 0 | 0 | 0 | 0.011 | 0.022 | 0 | 0.021 | 0.034 | 0.016 | 0.009
Culex rhabdo-like virus (CRLV) | Rhabdoviridae | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.217 | 0.138 | 0 | 0 | 0.169
Wuhan mosquito virus 6 (WHMV6)¹ | Orthomyxoviridae | 0 | 0 | 0 | 0 | 0 | 1.035 | 1.494 | 3.340 | 1.353 | 1.756 | 1.358 | 1.380
Aedes alboannulatus orthomyxo-like virus (AAOLV) | Orthomyxoviridae | 0 | 0 | 0 | 0 | 0.217 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Wilkie qin-like virus (WQLV) | Qinvirus (New -ve sense) | 0 | 0.008 | 0 | 0.014 | 0 | 0.074 | 0 | 0 | 0 | 0 | 0 | 0
Wilkie ophio-like virus (WOLV) | Ophioviridae | 0 | 0 | 0 | 0.003 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Culex negev-like virus 1 (CNLV1) | Negev virus-related | 0 | 0 | 0 | 0 | 0 | 0.286 | 0.236 | 0 | 0 | 0 | 0.501 | 0
Culex negev-like virus 2 (CNLV2) | Negev virus-related | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2.645 | 0 | 0 | 0
Culex negev-like virus 3 (CNLV3) | Negev virus-related | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.092 | 0 | 0 | 0
Aedes camptorhynchus negev-like virus (ACNLV) | Negev virus-related | 0 | 0 | 0.389 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Culex luteo-like virus (CLLV) | Luteoviridae-related | 0 | 0 | 0 | 0 | 0 | 0 | 0.031 | 0.050 | 0 | 0 | 0.044 | 0
Point-Douro narna-like virus (PDNLV) | Narnaviridae | 0 | 0 | 0 | 0 | 0 | 0.036 | 0 | 0 | 0 | 0 | 0 | 0
Zhejiang mosquito virus 3 (ZJMV3)² | Narnaviridae | 0 | 0 | 0 | 0 | 0 | 0.840 | 0.449 | 1.510 | 0.342 | 0 | 0.080 | 2.181
Wilkie narna-like virus 1 (WNLV1) | Narnaviridae | 0 | 0 | 0.002 | 0.009 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Wilkie narna-like virus 2 (WNLV2) | Narnaviridae | 0 | 0 | 0 | 0.013 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Ngewotan virus³ | Mesoniviridae | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.326 | 0 | 0 | 0
Wilkie partiti-like virus 1 (WPLV1) | Partitiviridae-related | 0 | 0 | 0 | 0.002 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Wilkie partiti-like virus 2 (WPLV2) | Partitiviridae-related | 0 | 0 | 0 | 0.005 | 0 | 0.002 | 0 | 0 | 0 | 0 | 0 | 0
Leschenault partiti-like virus (LPLV) | Partitiviridae-related | 0 | 0.006 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Aedes camptorhynchus reo-like virus (ACRLV) | Reoviridae | 0 | 0.132 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Aedes camptorhynchus toti-like virus (ACTLV) | Totiviridae | 0.013 | 0 | 0 | 0.001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Hubei chryso-like virus 1 (HBCLV1)² | Chrysoviridae | 0 | 0 | 0 | 0 | 0 | 0.108 | 0.142 | 0.131 | 0.044 | 0 | 0.027 | 0
Shuangao chryso-like virus 1 (SCLV1)² | Chrysoviridae | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.141
All viruses | | 0.013 | 0.146 | 0.391 | 0.047 | 0.217 | 6.464 | 6.555 | 8.987 | 16.870 | 4.508 | 6.095 | 5.513
648
649 ¹Li et al. 2015 (6); ²Shi et al. 2016 (8); ³Vasilakis et al. 2013 (18).
650 Table 2. The most abundant genes from mosquitoes and other microbial organisms present in mosquitoes
Organism | Gene | Ae. camptorhynchus LocA | Ae. camptorhynchus LocB | Ae. camptorhynchus LocC | Ae. camptorhynchus LocD | Ae. alboannulatus LocD | Cx. globocoxitus LocA | Cx. globocoxitus LocC | Cx. globocoxitus LocD | Cx. australicus LocA | Cx. australicus LocC | Cx. australicus LocD | Cx. quinquefasciatus LocD
Mosquito (principal host) | cox1 | 0.455 | 0.669 | 0.335 | 0.346 | 0.437 | 1.114 | 0.851 | 0.606 | 0.587 | 0.830 | 0.866 | 0.499
Mosquito (principal host) | RPL32 | 0.041 | 0.040 | 0.034 | 0.039 | 0.045 | 0.043 | 0.057 | 0.053 | 0.051 | 0.054 | 0.065 | 0.069
Fungi: Unknown sp1 | cox1 | 0 | 0 | 0 | 0 | 0 | 0.032 | 0 | 0 | 0 | 0 | 0 | 0
Fungi: Unknown sp1 | RPL32 | 0 | 0 | 0 | 0 | 0 | 0.00028 | 0 | 0 | 0 | 0 | 0 | 0
Fungi: Unknown sp2 | cox1 | 0 | 0 | 0 | 0.026 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Fungi: Unknown sp2 | RPL32 | 0 | 0 | 0 | 0.00065 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Fungi: Unknown sp3 | cox1 | 0 | 0.125 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Fungi: Unknown sp3 | RPL32 | 0 | 0.00126 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Fungi: Microsporidia sp | RPL32 | 0 | 0 | 0.00008 | 0.00033 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Protist: Leishmania sp | cox1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00022 | 0 | 0 | 0.00006
Protist: Leishmania sp | RPL32 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00027 | 0 | 0 | 0
Protist: Trypanosoma sp | RPL32 | 0 | 0 | 0.00005 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Nematode: Mermithidae sp | cox1 | 0 | 0.00153 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Nematode: Onchocercidae sp | cox1 | 0 | 0 | 0.00138 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Bacteria: Zymobacter palmae | gyrB | 0.00018 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Bacteria: Zymobacter palmae | recA | 0.00049 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Bacteria: Wolbachia wPip | gyrB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00028
Bacteria: Wolbachia wPip | recA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00069
651
652 Table 3. Criteria used to identify viruses likely associated with mosquitoes
Columns: Virus Name; Relatively high abundance level (>0.1% of total RNA in the library); Found in >2 libraries; Close relatives are mosquito or insect viruses; Positive association with mosquitoes
Culex phasma-like virus  Yes  Yes  Yes  Strong
Culex mononega-like virus 1  Yes  Yes  Yes  Strong
Culex mononega-like virus 2  Yes  Yes  Strong
Culex rhabdo-like 1  Yes  Yes  Yes  Strong
Wuhan mosquito virus 6  Yes  Yes  Yes  Strong
Aedes alboannulatus orthomyxo-like virus  Yes  Yes  Strong
Wilkie qin-like virus  Yes  Low: viruses co-appear with fungi
Wilkie ophio-like virus 1  Low
Culex negev-like virus 1  Yes  Yes  Yes  Strong
Culex negev-like virus 2  Yes  Yes  Strong
Culex negev-like virus 3  Yes  Yes  Strong
Culex negev-like virus 4  Yes  Yes  Strong
Culex luteo-like virus  Yes  Yes  Strong
Point-Douro narna-like virus  Low
Culex narna-like virus  Yes  Yes  Yes  Strong
Wilkie narna-like virus 1  Low
Wilkie narna-like virus 2  Low
Nam Dinh virus  Yes  Yes  Strong
Wilkie partiti-like virus 1  Low
Wilkie partiti-like virus 2  Yes  Low: viruses co-appear with fungi
Leschenault partiti-like virus  Low
Aedes alboannulatus reo-like virus  Yes  Yes  Strong
Aedes alboannulatus toti-like virus  Yes  Strong
Culex chryso-like virus  Yes  Yes  Yes  Strong
Culex quinquefasciatus chryso-like virus  Yes  Yes  Strong
653
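
Two of the criteria in Table 3 are numerical and can be evaluated directly from the per-library abundances in Table 1. The sketch below is one possible reading, not the authors' procedure: it assumes that "relatively high abundance" means exceeding 0.1% in at least one library, that a virus counts as present in a library whenever its abundance is above zero, and that the % total reads in Table 1 can stand in for % of total RNA. The function name is hypothetical.

```python
ABUNDANCE_THRESHOLD = 0.1  # % of total RNA in a library, from the Table 3 header
MIN_LIBRARIES = 2          # "found in >2 libraries"


def abundance_criteria(percent_by_library):
    """Return (high_abundance, found_in_more_than_two_libraries) for one virus."""
    high_abundance = any(p > ABUNDANCE_THRESHOLD for p in percent_by_library)
    libraries_present = sum(1 for p in percent_by_library if p > 0)
    return high_abundance, libraries_present > MIN_LIBRARIES


# Rows copied from Table 1 (% of total reads in each of the 12 libraries).
table1_rows = {
    "Culex phasma-like virus": [0, 0, 0, 0, 0, 3.881, 4.113, 3.547, 3.908, 2.659, 3.952, 1.632],
    "Culex mononega-like virus 2": [0, 0, 0, 0, 0, 0.011, 0.022, 0, 0.021, 0.034, 0.016, 0.009],
    "Ngewotan virus": [0, 0, 0, 0, 0, 0, 0, 0, 4.326, 0, 0, 0],
}
for name, row in table1_rows.items():
    print(name, abundance_criteria(row))
```

The third criterion, whether close relatives are mosquito or insect viruses, depends on the phylogenies and is not captured by this calculation.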
[FIG 1 image: cox1 phylogeny of the mosquito pools (upper panel) and map of the sampling sites relative to Perth (lower panel). Location key: LocA, Point Douro; LocB, Leschenault Peninsula; LocC, Siesta Park; LocD, South Guildford.]
[FIG 2 image: four column graphs for the 12 pools. Y axes: number of viruses (low and high abundance); proportion of each major virus family/order (% of total viral RNA); abundance of RNA viruses (% of total RNA in the library); abundance of the host RPL32 gene (% of total RNA in the library).]
[FIG 3 image: network diagrams of viruses shared between (A) host species and (B) geographic locations.]
[FIG 4 image: phylogeny and genome structure panels. Bunyaviridae: Phasmavirus related; Rhabdoviridae: Dimarhabdovirus related; Mononegavirales: Borna- and Nyamivirus related; Orthomyxoviridae; Ophioviridae related; Qinvirus.]
[FIG 5 image: phylogeny and genome structure panels. Negev virus related; Narnaviridae; Luteoviridae related; Mesoniviridae.]
100 Culex negev-like virus 2 (CNLV2) A: FtsJ-like methyltransferase R’: Permuted RdRp RT: Read-through 0.5 100 Negev virus [Culex] H: RNA helicase M: Membrane protein 100 Brejeira virus [Culex] 100 Loreto virus [Anopheles albimanus] Culex negev-like virus 1 94 Wuhan house centipede virus 1 100 Wuhan insect virus 8 VA H R 10859bp
Beihai barnacle virus 2 Downloaded from 100 Culex negev-like virus 3 (CNLV3) Culex negev-like virus 2 100 Goutanap virus [Culicidae] 100 9324bp Wallerfield virus [Culex] VA HR M 100 Tanay virus [Culex quinquefasciatus] Hubei virga-like virus 7 Culex negev-like virus 3 98 Citrus leprosis virus C V A H RM 9176bp 100 Muthill virus [Drosophila immigrans] 100 Marsac virus [Scaptodrosophila deflexa] Aedes camptorhynchus negev-like virus (ACNLV) http://jvi.asm.org/ 100 Aedes camptorhynchus negev-like virus 94 Hubei virga-like virus 17 Culex negev-like virus 1 (CNLV1) V H R’ 11470bp Xinzhou nematode virus 1 80 99 RT 100 Wuhan heteroptera virus 1 Narnaviridae 100 Hubei virga-like virus 16 Wuhan insect virus 9 99 Cassava virus C Beihai anemone virus 1 100 Ourmiavirus 91 0.5 Ourmia melon virus on May 27, 2018 by UNIV OF WESTERN AUSTRALIA M209 Hubei virga-like virus 23 98 80 Wuhan spider virus 7 95 Bofa virus 100 Wenzhou shrimp virus 10 Hubei virga-like virus 21 Hubei mosquito virus 3 Boutonnet virus Uncultured virus AGW51782 [Culicine sp] 99 Hubei virga-like viurs 9 99 Zhejiang mosquito virus 3 [Culex sp] (ZJMV3) 100 Narnaviridae environmental sample [Culex pipiens] 89 99 Uncultured virus 2 AJT39597 [Culex pipiens] Zhejiang mosquito virus 3 (Narnaviridae) 98 Hubei narna-like virus 20 Saccharomyces 20S RNA narnavirus RdRp Point-Douro narna-like virus (PDNLV) 99 Narnavirus 3205bp 99 Saccharomyces 23S RNA narnavirus 100 Phytophthora infestans RNA virus 4 100 Wilkie narna-like virus 1 (WNLV1) Beihai barnacle virus 10 99 Beihai narna-like virus 23 Culex luteo-like virus Beihai narna-like virus 22 100 99 Wilkie narna-like virus 2 (WNLV2) Seg1 R 2830bp Wuhan horsefly Virus 3 FS: frame shift Hubei narna-like virus 23 Ophiostoma mitovirus 4 99 Mitovirus Seg2 S 1400bp 98 Ophiostoma mitovirus 3a RT Mesoniviridae Luteoviridae related 99 Ngewotan virus [Culex vishnui] 99 Ngewotan virus [Culex australicus] 99 Alphamesonivirus 1 [Culex pipiens] 100 La Tardoire virus 100 0.5 0.1 Houston 
virus [Aedes albopictus] 99 Wuhan house centipede virus 5 100 Nam Dinh virus Hubei diptera virus 14 Cavally virus [Aedes harrisoni] 100 Culex luteo-like virus (CLLV) 100 100 Bontang virus [Culex tritaeniorhynchus] 100 Hubei sobemo-like virus 41 [Culicine sp] 100 100 Karang Sari virus [Culex vishnui] Humaita-Tubiacanga virus 100 Kamphang Phet virus Wuchan romanomermis nematode virus 3 90 Hana virus [Culex sp] 99 Wenzhou shrimp virus 9 Casuarina virus [Coquillettidia xanthogaster] Hubei sobemo-like virus 39 [Culicine sp] Meno virus Sanxia water strider virus 12 Nse virus Downloaded from
[Figure: phylogenetic trees with accompanying genome organisation diagrams for double-stranded RNA viruses. Tip labels, bootstrap support values, scale bars, and segment lengths are not reproduced here.]
Panel titles, with selected virus labels:
Totiviridae, Chrysoviridae related: Aedes camptorhynchus toti-like virus (ACTLV), Shuangao chryso-like virus 1 (SCLV1), Hubei chryso-like virus 1 (HBCLV1)
Partitiviridae related I: Wilkie partiti-like virus 1 (WPLV1), Wilkie partiti-like virus 2 (WPLV2)
Partitiviridae related II: Leschenault partiti-like virus (LPLV)
Phytoreovirus (Reoviridae) related: Aedes camptorhynchus reo-like virus
[Figure: two phylogenies titled "Qinvirus" and "Fungi: Cox1", labelled by sampling location and mosquito host: LocA Culex globocoxitus, LocB Aedes camptorhynchus, LocD Aedes camptorhynchus. Scale bars: 0.05.]