JVI Accepted Manuscript Posted Online 21 June 2017 J. Virol. doi:10.1128/JVI.00680-17 Copyright © 2017 American Society for Microbiology. All Rights Reserved.

High Resolution Meta-Transcriptomics Reveals the Ecological Dynamics of Mosquito-Associated RNA Viruses in Western Australia

Mang Shi^a, Peter Neville^b,c, Jay Nicholson^b,c,d, John-Sebastian Eden^a,e, Allison Imrie^c*, Edward C. Holmes^a*

^a Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Biological Sciences and Sydney Medical School, The University of Sydney, Sydney, Australia.

^b Environmental Health Directorate, Public Health Division, Department of Health, Government of Western Australia, Australia.

^c School of Biomedical Sciences, The University of Western Australia, Australia.

^d Center for Vectorborne Diseases, Department of Pathology, Microbiology and Immunology, School of Veterinary Medicine, University of California, Davis, USA.

^e Centre for Research, The Westmead Institute for Medical Research, Sydney, Australia.

*Corresponding authors:
Edward C. Holmes – [email protected]
Allison Imrie – [email protected]

Word count: Abstract – 247, Importance – 119, Main Text – 4835

Running title: Ecology of the Mosquito Virome

Downloaded from http://jvi.asm.org/ on May 27, 2018 by UNIV OF WESTERN AUSTRALIA M209

ABSTRACT Mosquitoes harbour a high diversity of RNA viruses, including many that impact human health. Despite a growing effort to describe the extent and nature of the mosquito virome, little is known about how these viruses persist, spread, and interact with both their hosts and other microbes. To address this issue we performed a meta-transcriptomics analysis of 12 Western Australian mosquito populations structured by species and geographic location. Our results identified the complete genomes of 24 species of RNA viruses from a diverse range of viral families and orders, among which 19 are newly described. Comparisons of viromes revealed a striking difference between the two mosquito genera, with viromes of mosquitoes from the Aedes genus exhibiting substantially less diversity and lower abundance than those of the Culex genus, within which viral abundance reached 16.87% of the total non-rRNA. In addition, there was little overlap in viral diversity between the two genera, although the viromes were very similar among the three Culex species studied, suggesting that host taxon plays a major role in structuring virus diversity. In contrast, we found no evidence that geographic location played a major role in shaping RNA virus diversity, and several viruses discovered here exhibited high similarity (95-98% nucleotide identity) to those from Indonesia and China. Finally, using abundance level and phylogenetic relationships we were able to distinguish potential mosquito viruses from those present in co-infecting bacteria, fungi, and protists. In sum, our meta-transcriptomics approach provides important insights into the ecology of mosquito RNA viruses.

IMPORTANCE Studies of virus ecology have generally focused on individual viral species. However, recent advances in bulk RNA sequencing make it possible to utilize meta-transcriptomic approaches to reveal both complete virus diversity and their relative abundance. We used such a meta-transcriptomic approach to determine key aspects of the ecology of mosquito viruses in Western Australia. Our results show that RNA viruses are one of the most important components of the mosquito transcriptome, and we identified 19 new virus species from a diverse set of virus families. A key result was that host genetic background plays a more important role in shaping virus diversity than sampling location, with Culex species harbouring more viruses at greater abundance than those from Aedes mosquitoes.

Mosquitoes (Diptera: Culicidae) act as vectors for a number of disease agents that infect humans and domestic , including , dengue virus, Chikungunya virus, and Zika virus. However, in addition to their role as transmission vectors, mosquitoes harbour a far larger virome, including many viruses that are confined to these insects, such that they are “insect-specific” (1, 2). Although these insect-specific viruses have no direct impact on public health, they may modulate the transmission of viruses that are pathogenic to vertebrates (3). The development of metagenomic sequencing approaches has therefore led to a re-evaluation of the mosquito virome, including the recent discovery of viruses in the families Bunyaviridae (4-8), (6, 9-11), (6, 12), (13-15), Mesoniviridae (16), (8, 17), as well as in the unclassified Chuvirus (6) and Negevirus (18) groups. In addition, metagenomics surveys have discovered viruses in families not previously known to infect mosquitoes, such as the , , , , and Narnaviridae (8, 19-22). Although these viruses have not been isolated or characterized in vivo, their host association is supported by the presence of related endogenous viruses in the genomes of various mosquito species (8). Hence, it is clear that mosquitoes harbour a substantial viral diversity, the majority of which may not be associated with vertebrates (1, 2).

Despite our expanding knowledge of the mosquito virome, there have been fewer studies of ecological aspects of these viruses within their hosts (1). It has been suggested that most of these newly discovered viruses share features that distinguish them from “classic” human pathogens, including (i) an inability to infect vertebrates or vertebrate cell lines, (ii) a high prevalence, (iii) prolonged host infection, and (iv) vertical transmission (1, 2, 23). Based on these features, these mosquito viruses have been referred to as “commensal” microbes (3). In reality, however, little is known about their natural infection status (e.g. abundance, frequency of superinfection), host specificity in relation to different mosquito species, geographic distribution and movement, and interactions with hosts and other microbes that may be present within a specific host.

To reveal more of the natural ecology of mosquito RNA viruses we employed a meta-transcriptomics approach to characterise the entire RNA environment excluding ribosomal RNA (rRNA) within a mosquito sample. Meta-transcriptomics has several advantages over approaches such as cell culture, consensus PCR, and metagenomics methods based on viral particle purification (24, 25), and has proven successful in characterizing the RNA viromes of diverse invertebrates (6, 8, 14, 20). Specifically: (i) it reveals the entire RNA virome, with sufficient coverage to reconstruct complete viral genomes, including those from co-infecting parasites; (ii) it provides a reliable quantification and assessment of both viral and host RNAs; and (iii) it is relatively simple, requiring minimal sample processing. Most importantly, meta-transcriptomics provides more information than the genome sequence alone, allowing a straightforward characterization of viral diversity and ecology.

To infer aspects of virome ecology among mosquito species sampled from different geographic locations we characterized the total transcriptome of 12 mosquito populations, comprising five species collected from four locations in Western Australia. In particular, we determined the number, type, and abundance of each virus within the context of the host transcriptome and that of other microbial symbionts/parasites, and addressed whether these parameters varied by species and/or sampling location.

RESULTS

The mosquito virome. We characterized the total transcriptome of 12 mosquito pools, representing five species of mosquitoes sampled from four geographic locations in Western Australia (Fig. 1). RNA sequencing of ribosomal RNA (rRNA)-depleted libraries resulted in 40-47 million reads per pool, which were assembled de novo into 159,861 to 225,352 contigs. Subsequent BLAST analyses revealed the complete genomes of 24 species of RNA viruses, of which 19 are newly described here. These virus species fell into a wide range of RNA virus groups, including those that fell within existing families and orders, namely the Bunyaviridae, , Orthomyxoviridae, Narnaviridae, , , Reoviridae, Totiviridae, Chrysoviridae, as well as in several newly described groups: Qinvirus (a highly divergent group of negative-sense RNA viruses (8)), the Partiti-like viruses, the Luteo-like viruses and the Negev-like viruses (Table 1). Importantly, these viruses were unlikely to represent endogenous viral elements (EVEs) as they were present as complete genomes without any interruption by frame-shifts, nonsense mutations, repeat sequences, reverse transcriptases, or other features that are common to EVEs.

For each library, the number of virus species varied from 1 to 10 (Table 1). The abundance (i.e. frequency) of each virus also varied from 0.013% to 16.87% of total non-rRNA reads within the pool (Table 1). In comparison, the host gene RPL32, which is often used as a reference gene in quantitative PCR assays, showed consistent abundance levels across all libraries (from 0.034 – 0.065%, Table 2). This suggests that the large variation in viral number and abundance is unlikely to be an artefact of sample processing or nucleic acid extraction. Indeed, for individual viral species the abundance levels were comparable across libraries, including both highly abundant viruses such as Culex phasma-like virus (1.632 – 4.113%) and those of lower abundance like Culex mononega-like virus 2 (0.011 – 0.034%). Overall, for all the Culex pools, the total abundance levels of viral RNA were above 4% of total non-rRNA, suggesting that RNA viruses can make up a substantial part of the RNA environment in mosquitoes.
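As a rough illustration of how abundance figures of this kind can be read, the following Python sketch expresses the reads assigned to a virus or host gene as a percentage of the non-rRNA reads in a library. It assumes that abundance is simply mapped reads divided by total non-rRNA reads; the function name, the pool, and all read counts are hypothetical and are not values from Table 1 or Table 2.

```python
def percent_abundance(mapped_reads: int, total_reads: int, rrna_reads: int) -> float:
    """Return mapped reads as a percentage of the non-rRNA reads in a library."""
    non_rrna = total_reads - rrna_reads
    if non_rrna <= 0:
        raise ValueError("library contains no non-rRNA reads")
    return 100.0 * mapped_reads / non_rrna


# Hypothetical pool: 45 million reads, of which 5 million are residual rRNA.
pool = {"total": 45_000_000, "rrna": 5_000_000}

# Hypothetical read counts for one virus and one host reference gene.
counts = {"virus_A": 1_600_000, "RPL32": 20_000}

abundance = {
    name: percent_abundance(n, pool["total"], pool["rrna"])
    for name, n in counts.items()
}

for name, pct in abundance.items():
    print(f"{name}: {pct:.3f}% of non-rRNA reads")
```

Normalising against a stably expressed host gene, as done with RPL32 in the text, is what allows abundance values to be compared between libraries.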

Also of note was that some of the viruses were highly prevalent. In particular, Culex phasma-like virus (CPLV) and Wuhan mosquito virus 6 (WHMV6) appeared in all of the Culex pools, while Culex mononega-like virus 1 and 2 (CMLV1 and 2), Zhejiang mosquito virus 3 (ZJMV3), and Hubei chryso-like virus 1 (HBCLV1) appeared in most of the Culex pools. WHMV6, ZJMV3, and HBCLV1 were also prevalent in the Culex species from China (6). Importantly, each of these viruses had consistent abundance levels across different libraries and was absent from the Aedes pools, suggesting that they are unlikely to result from contamination. This observation highlights the persistence of some viral infections in Culex mosquitoes, to the extent that infections are the norm rather than the exception.

Virome ecology. Our analysis revealed substantial differences between the Aedes and Culex genera in terms of virus composition and abundance. Generally, the Aedes mosquitoes contained fewer viruses than the Culex mosquitoes (Fig. 2). Although the Ae. camptorhynchus pool from South Guildford contained seven viral species, all were at low abundance and of uncertain host association (see below). More striking was that the total viral abundance was much lower in the Aedes pools (0.013 – 0.391%) than the Culex pools (4.508 – 16.87%), an observation that was consistent across sampling locations.

The differences between the two mosquito genera were also reflected in the types of the viruses they harboured (Fig. 3A). Of the 24 viral species discovered, only two – Wilkie qin-like virus (WQLV) and Wilkie narna-like virus 1 (WNLV1) – were shared between the Aedes and Culex pools (Table 1). However, the fact that these viruses had low abundance and co-appeared with a group of related fungal pathogens made them more likely to be associated with fungi than with mosquitoes (see below). The lack of similarity between the Aedes and Culex viromes was in marked contrast to the number of common viral species found between the three Culex species (Fig. 3 and Table 1). Notably, Cx. quinquefasciatus shared five of the six viruses with the other two Culex species despite the substantial genetic distance between these hosts (Fig. 1). Conversely, no viruses were shared between Ae. camptorhynchus and Ae. alboannulatus, although only one virus was discovered in Ae. alboannulatus.
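The comparison of viromes between hosts described above amounts to intersecting sets of virus species. The Python sketch below shows one way such a comparison could be made from a presence/absence record. The host and virus labels are placeholders rather than data from Table 1, and the Jaccard index is an illustrative summary statistic that the text itself does not report.

```python
from itertools import combinations


def shared_viruses(viromes: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Map each pair of hosts to the set of virus species found in both."""
    return {
        (a, b): viromes[a] & viromes[b]
        for a, b in combinations(sorted(viromes), 2)
    }


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared species divided by all species seen in either host."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Placeholder presence/absence data (hypothetical hosts and viruses).
viromes = {
    "host_A": {"v1", "v2", "v3"},
    "host_B": {"v2", "v3", "v4"},
    "host_C": {"v5"},
}

pairs = shared_viruses(viromes)
for (a, b), common in pairs.items():
    print(a, b, sorted(common), round(jaccard(viromes[a], viromes[b]), 2))
```

A pair of hosts with many shared species and a high index would correspond to the pattern seen among the Culex species, while an empty intersection corresponds to the pattern seen between the two Aedes species.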

Also of note was that there was a significant overlap between the viromes from the three locations that harboured Culex species (Fig. 3B). The fourth location, Leschenault Peninsula, contained only Ae. camptorhynchus mosquitoes whose virome was very limited. Hence, there is seemingly a lack of geographic structure to the RNA virome at the scale of this study. Indeed, the geographic distribution of each of these viral species may be much broader and involve locations outside of Australia. In particular, several of the viruses we identified shared high genetic identity with those found in disparate geographic locations, including Wuhan mosquito virus 6 (98% nucleotide identity), Zhejiang mosquito virus 3 (96%), Hubei chryso-like virus 1 (97%), and Shuangao chryso-like virus 1 (97%), which were also identified in China (6), as well as Ngewotan virus (99%) from Indonesia (18).
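For readers unfamiliar with the nucleotide identity values quoted above, the Python sketch below computes percent identity between two sequences that have already been aligned. The text does not state how identity was calculated, so this is only one common convention: columns containing a gap in either sequence are skipped, and the function name and example sequences are hypothetical.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored (an assumption).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical 10-nt aligned fragments differing at one position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
print(f"{identity:.1f}% nucleotide identity")
```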

Evolutionary history of the newly identified RNA viruses. While the majority of the viruses identified from this study exhibited relatively close relationships to viruses previously described in either mosquitoes, Dipteran insects, or other related , six clustered with fungal viruses (see below; Fig. 4-6). The clustering of mosquito-associated viruses from different countries or mosquito species was apparent at many places within the phylogenies and sometimes these monophyletic groups contained substantial genetic diversity, suggestive of a long-term association between the viruses and their mosquito hosts. Notably, the mosquito-associated clusters often contained multiple viral lineages associated with single or multiple host species/genera (Fig. 4-6), with no clear pattern of virus-host co-divergence, although this needs to be examined with a much larger sample size.

Negative-sense RNA viruses. We discovered eight putative negative-sense RNA viruses, representing all the major taxonomic categories (Table 1). Among these, six were related to previously described mosquito viruses, while the remaining two viruses either grouped with a fungal virus (WOLV1, Ophioviridae) or were of uncertain host association (WQLV, Qinvirus) (Fig. 4). In the RdRp phylogeny, CPLV clustered within the recently proposed phasmavirus group (family Bunyaviridae) (36), whose host range is currently limited to arthropods (6, 8). Its closest relative was Wuhan mosquito virus 2 identified from Culex mosquitoes in China. CPLV showed a genome structure typical of phasmaviruses, which have substantially shorter glycoprotein-encoding segments than other bunyaviruses.

Culex rhabdo-like virus (CRLV), CMLV1, and CMLV2 were related to viruses from the order Mononegavirales. CRLV was from the Dimarhabdovirus group and related to North Creek virus that was isolated from Cx. sitiens (Wiedemann) sampled on the east coast of Australia (10) (Fig. 4), while CMLV1 and 2 grouped with Xincheng Mosquito virus in a currently unclassified clade. Interestingly, CMLV1 had a bi-segmented genome arrangement that occurs only rarely in the Mononegavirales (37) (Fig. 4), although the most closely related viruses – CMLV2 and Xincheng Mosquito virus – both had unsegmented genomes.

Aedes alboannulatus orthomyxo-like virus (AAOLV) and WHMV6 belonged to two separate mosquito-associated clusters within the family Orthomyxoviridae. WHMV6 was initially identified in Culex mosquitoes from China (6, 8), and we were able to reveal two more genome segments, containing a glycoprotein gene and an unknown protein gene, in addition to those described previously, making a total of six segments. Although AAOLV was only discovered in one pool, it had moderately high abundance (0.217%) and clustered with viruses identified from the other mosquito hosts in China (Fig. 4), which suggested a potential association with Ae. alboannulatus.

Positive-sense RNA viruses. The positive-sense RNA viruses discovered in this study fell within the Narnaviridae, Mesoniviridae (), Negev-like viruses, and Luteoviridae-related viruses (Fig. 5). The Negev-like viruses were initially identified in mosquitoes (18), and the group has since expanded to include a number of other species. Based on the RdRp, the Negev-like viruses form part of a larger group referred to as the alpha-like supergroup (38) or the Hepe-Virga-like group (8), which includes the Togaviridae, , , amongst others. We identified four divergent viruses within the Negev-like virus group. Among these, Culex Negev-like virus 2 and 3 (CNLV2 and 3) were closely related to viruses identified in mosquitoes and had a similar genome structure to the prototype Negev virus (Fig. 5). In contrast, Culex Negev-like virus 1 (CNLV1) was distantly related to a virus identified from nematodes (Fig. 5). However, since CNLV1 had moderately high abundance and appeared in three Culex pools that contained no traces of nematode genes, its host was more likely to be mosquitoes. The Aedes camptorhynchus Negev-like virus (ACNLV) showed a distant relationship with Muthill virus and Marsac virus identified from (Fig. 5). Its genome had several unique features, including a permuted RdRp domain, a potential stop codon read-through site between the helicase and RdRp domains, and a distinctive (and longer) set of genes downstream of the replicase.

We also identified four viruses from the Narnaviridae (Fig. 5). Of these, Zhejiang mosquito virus 3 (ZJMV3) was highly abundant and prevalent across all the Culex pools, while the other three viruses were of low abundance and clustered with fungal pathogens. Viruses closely related to ZJMV3 have been identified in China (8), France (19), and the United States (20), and can be distinguished from other narnaviruses because of their dual-coding genome structure, characterized by two open reading frames (ORFs) that cover the complete length of both the sense and anti-sense genome (Fig. 5). One of the ORFs encodes the RdRp, while the other had no homology to any gene. Importantly, this feature was conserved across a divergent phylogenetic group, including more distantly related viruses such as Hubei narna-like virus 20 (Fig. 5).

We also identified a virus related to the Luteo-Sobemo-like group whose host range has recently expanded from plants to include arthropods, nematodes, molluscs, and protists (8). Specifically, we identified a single member of this group, Culex luteo-like virus (CLLV), that was related to Hubei sobemo-like virus 41 previously identified in mosquitoes from China. Despite its relatively low abundance, it was identified in the three Culex pools that did not contain any abundant cellular parasites (Table 2), suggesting that it is most likely associated with Culex mosquitoes. The genome of CLLV contained two segments, encoding the replicase and the (identified by structural blast). The replicase segment contained a site before the coding regions of the RdRp, typical of the members of the Luteo-Sobemo-like group (8).

Double-stranded RNA viruses. We identified seven double-stranded RNA viruses, from the Chrysoviridae (n = 2), Totiviridae (n = 1), Reoviridae (n = 1), and Partitiviridae (n = 3). With the exception of the three viruses from the Partitiviridae, all these viruses were related to those identified from mosquito or other arthropod hosts (Fig. 6). Hubei chryso-like virus 1 (HBCLV1) and Shuangao chryso-like virus 1 (SCLV1), initially identified from mosquitoes in China, were now found to be prevalent in Culex mosquitoes from Western Australia. Their complete genomes, as revealed in this study, contained four segments similar to the prototype genome of the Chrysoviridae (Fig. 6). In the case of the Totiviridae we identified the Aedes camptorhynchus toti-like virus (ACTLV), which, like the other totiviruses, has an unsegmented genome comprising two major ORFs. Finally, the only reovirus identified here – Aedes camptorhynchus reo-like virus – was related to Hubei reo-like virus 11 from dragonflies, which in turn formed a distant sister clade to viruses of the genus .

Revealing host associations. The total transcriptomes described here not only contained virus transcripts, but also abundantly expressed host genes and those from other intra-host microbes such as bacteria, archaea, fungi, and protists. To reveal the presence and diversity of these microbes we searched within the assembled transcripts for the presence of abundantly expressed marker genes of cellular organisms. In this way we were able to identify several dominant microbes within the mosquito host: some were related to parasites known to cause infections in humans (e.g. Leishmania), whereas others included intracellular symbiotic bacteria such as Wolbachia strain wPip (Table 2). Generally, the abundance levels of genes from the (non-viral) microbes were orders of magnitude lower than those of the mosquito hosts (Table 2). In addition, we identified a group of related fungi, which we termed ‘Unknown sp1, 2, and 3’, in three of the pools including both Ae. camptorhynchus and Cx. globocoxitus. The abundance level of these fungi was relatively high: in one of the Ae. camptorhynchus pools the abundance of the fungal cox1 gene reached 0.125%, compared to 0.669% for that of mosquitoes. Interestingly, two viruses (WQLV and WPLV2) found in both Ae. camptorhynchus and Cx. globocoxitus co-appeared with these fungi (Tables 1 and 2), and the viruses and fungi had matching evolutionary histories (Fig. 7). Furthermore, WQLV and WPLV2 both grouped with fungal viruses rather than mosquito or arthropod viruses (Fig. 3 and 5). Collectively, these results provide strong evidence that these viruses were more likely to be associated with fungi than mosquitoes.

Finally, to provide a summary of potential host association for all the viruses discovered here, we considered several key attributes that are relevant to host association: abundance level, prevalence, host association of close relatives, and co-appearance with other cellular microbes within the hosts (Table 3). Among the 24 viruses identified here, 16 were likely to be associated with mosquitoes under these criteria, whereas eight were more likely associated with other hosts, although this clearly requires additional confirmation.
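The four attributes listed above can be thought of as independent lines of evidence. The Python sketch below combines them with a simple majority rule purely for illustration. The text does not describe a formal scoring scheme, so the thresholds, the voting rule, the class and function names, and the two example records are all hypothetical and should not be read as the criteria used to build Table 3.

```python
from dataclasses import dataclass


@dataclass
class VirusEvidence:
    name: str
    abundance_pct: float         # percent of non-rRNA reads
    pools_positive: int          # pools in which the virus was detected
    pools_total: int             # pools of the candidate host examined
    relatives_host: str          # host group of the closest known relatives
    cooccurs_with_microbe: bool  # detected only alongside another microbe?


def likely_mosquito_associated(
    v: VirusEvidence,
    min_abundance_pct: float = 0.01,  # hypothetical threshold
    min_prevalence: float = 0.5,      # hypothetical threshold
) -> bool:
    """Return True when at least three of the four attributes favour mosquitoes."""
    prevalence = v.pools_positive / v.pools_total if v.pools_total else 0.0
    votes = [
        v.abundance_pct >= min_abundance_pct,
        prevalence >= min_prevalence,
        v.relatives_host in {"mosquito", "arthropod"},
        not v.cooccurs_with_microbe,
    ]
    return sum(votes) >= 3


# Two hypothetical records, not taken from Table 3.
virus_x = VirusEvidence("virus_X", 2.0, 6, 7, "mosquito", False)
virus_y = VirusEvidence("virus_Y", 0.0005, 2, 12, "fungus", True)

print(virus_x.name, likely_mosquito_associated(virus_x))
print(virus_y.name, likely_mosquito_associated(virus_y))
```

In practice these attributes would be weighed together with the phylogenetic evidence rather than reduced to a vote count, which is why the text notes that the assignments require additional confirmation.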

DISCUSSION

We have used a meta-transcriptomics approach to reveal key aspects of the ecology of RNA viruses in mosquitoes from Western Australia. Of particular interest was the high diversity, high prevalence, and relatively high abundance of a number of the RNA viruses from multiple virus groups. Hence, these results highlight the capacity of Culex mosquitoes to tolerate high levels of viral RNA, as has been described for other invertebrates (6, 8). Indeed, given the very high prevalence of these viruses it seems intuitively unlikely that they are associated with severe disease in their hosts, and we propose that the most likely status for these viruses is either sub-lethal infection or commensalism. This is supported by the observation that viruses have been detected in both laboratory mosquito colonies and insect cell lines that show little loss of fitness (39, 40), although this clearly requires additional study.

Our results also revealed a striking difference between the viral diversities harboured by the Aedes and Culex genera of mosquitoes: infections in the former group are sporadic and there is little resemblance between the different populations, although clearly this needs to be examined with more data. Similarly, among the previously described vector-borne viruses, there is little overlap in the viruses carried by Aedes and Culex mosquitoes, such that the diversity of mosquito-borne can be further subdivided into Culex- or Aedes-associated phylogenetic groups (41, 42). The three Culex species studied here (Cx. quinquefasciatus, Cx. australicus and Cx. globocoxitus) all form part of the Culex pipiens complex and are closely related in their cox1 gene sequences (43, 44). Hence, the similarity in viromes between the three Culex species may in part reflect their close evolutionary relationships, which may in turn dictate similarities in the cellular environment, immunological response, and perhaps ecological niche (45). The two mosquito genera also exhibit a large discrepancy in virus numbers and abundance, which is robust across all comparisons despite the relatively small sample size (Fig. 2).

In contrast to the difference in viromes between genera, the Culex virome was relatively homogeneous among the species and across the regions sampled. Furthermore, a number of the viruses discovered here were not only found in Western Australia, but also in regional countries like China and Indonesia, indicating that they infect hosts over a wide geographical area. As the viruses present in these different countries are very similar (95-98% nucleotide identity), such limited genetic distance tentatively suggests that these viruses were introduced by windblown mosquitoes (46, 47), by cyclones from neighbouring regions (48), or were inadvertently spread by humans, rather than being the result of ancient mosquito dispersal. Conversely, on current data there appears to be relatively little overlap between the mosquito viromes sampled from Western Australia and other parts of Australia (3, 10), which may reflect the different mosquito species present in these localities. In addition, a previous survey of viruses in eastern Australia was performed after the viruses were passaged in cell culture, which may eliminate some of the viruses present in the original sample (10). A more complete characterization of virome ecology in Australia evidently requires larger scale sampling covering more geographic locations and mosquito species. Similarly, the current study relied on the collection of mosquitoes with Encephalitis Virus Surveillance CO2 traps that are likely to be biased in the species of mosquitoes collected. A broader sampling of mosquito fauna to determine the overall diversity of viruses will evidently require the use of a variety of trapping techniques that reflect specific mosquito habits or attraction to collection traps.

Although our study was directed toward mosquitoes (20), it was striking that we identified a number of RNA viruses that were likely associated with hosts other than mosquitoes. Specifically, potential non-mosquito viruses were revealed through phylogenetic analysis (i.e. that they clustered with viruses from fungi rather than from mosquitoes), evidence of co-divergence with their microbial hosts, and their low abundance (Table 3, Fig. 7). The abundance of these confirmed and suspected microbial viruses was generally below 0.001% of the total non-rRNA reads (i.e. so low that they are unlikely to be associated with mosquitoes), although the highest (WQLV) reached 0.074%. This, in turn, suggests that viral abundance level is a useful indication of host association, although it should be examined in the context of the type and quantity of the dominant microbes within the sample.

Finally, it is important to note that among the various virus species discovered here, none fell into the category of “vector-borne” viruses that are known to infect humans or other mammalian hosts. Indeed, in a previous metagenomics survey of mosquitoes and ticks, most of the viruses discovered either clustered with “arthropod-specific” viruses or were uncharacterized (1, 2), and the sequencing of nearly 200 mosquitoes revealed only two known vector-borne viruses (6, 8). This suggests that human and vertebrate pathogens represent only a tiny fraction of the mosquito virome, although it is possible that they exist at very low copy numbers if they exhibit low levels of replication. Whatever the cause, the observation that vector-borne viruses are rare further indicates how the characterization of the mosquito virome provides important insight into the ecology and evolution of insect viruses.

MATERIALS AND METHODS

Sample collection. A total of 519 adult mosquitoes were collected in 2015 from four locations in Western Australia considered to be of significant public health risk in relation to mosquito-borne diseases including Ross River (RRV) and Barmah Forest viruses (BFV). The four locations comprised (i) South Guildford, an eastern suburb of the Perth Metropolitan Region located on the Swan River; (ii) Leschenault Peninsula, near Australind and (iii) Point Douro, Bunbury, both of which are tidally driven inlet sites and approximately 160 km and 175 km southwest of Perth, respectively; and (iv) Siesta Park, in Dunsborough, approximately 250 km southwest of Perth (Fig. 1). Mosquitoes were collected using Encephalitis Virus Surveillance carbon dioxide (EVS CO2) traps that were set at each location for approximately 12 hours. Each trap was baited with dry ice to attract mosquitoes. Upon trap collection, the mosquitoes were euthanized by placing each collection on dry ice, which also preserved the specimens and their RNA. Mosquitoes were then placed in labelled vials and left on dry ice until returned to the laboratory, where the samples were placed in a -80°C freezer.

Mosquito species identification was initially carried out by experienced field biologists using taxonomic keys (26) and dissecting microscopes on cold tables, and later verified by analysing the cytochrome c oxidase subunit I (cox1) gene (Fig. 1). The majority of the mosquitoes collected in this study were from five species: Ae. camptorhynchus (Thomson), Ae. alboannulatus (Macquart), Culex globocoxitus (Dobrotworsky), Cx. australicus (Dobrotworsky and Drummond), and Cx. quinquefasciatus (Say). As Cx. globocoxitus and Cx. australicus cannot be distinguished by cox1 gene sequences (Fig. 1), they were identified using two main morphological diagnostic features – the tergal banding patterns and median patches of dark scales on the sternites. Specifically, Cx. globocoxitus has tergal banding without lateral constrictions and no dark patches of scales on the sternites, while Cx. australicus has lateral constrictions on tergal bands and prominent patches of dark scales on the sternites (26). All mosquito samples were then categorized by species and geographic locations and stored at -80°C before RNA extraction.

Sample processing and sequencing. RNA extraction and sequencing were carried out on 12 pools of mosquitoes, with each pool containing 5-10 representative female mosquitoes from the same geographic region and species (Table 1). Prior to homogenization, each mosquito pool was washed three times with 1 ml of sterile, RNA- and DNA-free PBS solution (GIBCO) to remove external microbes. The samples were then homogenized in 600 µl of lysis buffer using a TissueRuptor (Qiagen). Total RNA was extracted using an RNeasy Plus Mini Kit following the manufacturer's instructions. The quality of the extracted RNA was evaluated using an Agilent 2100 Bioanalyzer (Agilent Technologies). All extractions performed in this study had a RIN value greater than 8.7. Sequencing libraries were constructed using a TruSeq total RNA Library Preparation Kit (Illumina), with the host rRNA removed using a Ribo-Zero-Gold (Human-Mouse-Rat) Kit (Illumina). Paired-end (100 bp) sequencing of each library was then performed on the HiSeq 2500 platform (Illumina). All library preparation and sequencing procedures were carried out by the Australian Genome Research Facility (AGRF).

RNA virus discovery. Sequencing reads were de-multiplexed and trimmed for quality with Trimmomatic (27) before de novo assembly using Trinity (28). The resulting contigs were first compared against a database of all reference RNA virus proteins downloaded from GenBank using Blastx with an E-value cut-off of 1E-5. Potential viral contigs were then compared to the entire non-redundant nucleotide (nt) and protein (nr) databases to remove false positives. The quality-filtered virus contigs with unassembled overlaps were then merged using the SeqMan program implemented in the Lasergene software package v7.1 (DNAstar). To confirm the assembly results, reads were mapped back to the virus genomes with Bowtie2 (29) and inspected using the Integrative Genomics Viewer (30) for any assembly errors. The final sequences of the virus genomes were obtained from the majority consensus of the mapping assembly.
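The first screening step described above (Blastx against reference RNA virus proteins with an E-value cut-off of 1E-5) can be sketched as a filter over tabular BLAST output. This is a minimal illustration rather than the script used in this study: it assumes the standard 12-column tabular format, and the function name and the example hits are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # cut-off stated in the text


def best_viral_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {contig_id: (subject_id, evalue)} with the best hit per contig.

    Assumes the default 12-column BLAST tabular layout:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        contig, subject, evalue = row[0], row[1], float(row[10])
        if evalue > cutoff:
            continue  # hit is weaker than the cut-off
        if contig not in best or evalue < best[contig][1]:
            best[contig] = (subject, evalue)
    return best


# Hypothetical hits: contig_1 has two hits below the cut-off, contig_2 has none.
example = "\n".join([
    "contig_1\tprotA\t45.2\t300\t150\t4\t1\t900\t10\t310\t1e-50\t200",
    "contig_1\tprotB\t30.0\t100\t70\t2\t1\t300\t5\t105\t1e-8\t60",
    "contig_2\tprotC\t25.0\t50\t37\t1\t1\t150\t1\t50\t0.5\t20",
])
hits = best_viral_hits(example)
```

Contigs that survive this filter would still need the second comparison against the nt and nr databases to exclude false positives.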

Virus genome annotation. The potential open reading frames (ORFs) of the newly identified virus genomes were predicted based on those from the closest reference virus genomes. To characterize the functional domains within each ORF we performed a domain-based Blast search against the Conserved Domain Database (CDD) with an expected value threshold of 1E-5. The potential functions of the remaining ORFs were predicted by homology with other known viral proteins. A potential viral glycoprotein from families of negative-sense RNA viruses was identified based on the presence of (i) an N-terminal signal domain, (ii) a C-terminal or mid-point transmembrane domain, and (iii) putative glycosylation sites.

For those viruses with multiple segments, non-RdRp segments were usually identified by homology to the proteins of related reference viruses. Other potential segments with no detectable homology were identified using an in silico approach that utilizes information on RNA quantity, protein structure, and/or conserved genome termini. To determine whether these segments belonged to the same virus, we checked: (i) the sequencing depth of the segments, (ii) the presence of conserved genome termini, (iii) the co-appearance with the RdRp segments, and (iv) the phylogenetic positions of related viral proteins.
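Two of the segment-assignment checks listed above, co-appearance with the RdRp segment and comparable sequencing depth, can be illustrated with a short sketch. The text does not give explicit thresholds, so none are applied here; the function names, pool labels, and abundance values are hypothetical.

```python
def co_occurs(rdrp_abundance, candidate_abundance):
    """True if the candidate segment is detected in exactly the same
    libraries as the RdRp segment (and in at least one library)."""
    rdrp_libs = {lib for lib, value in rdrp_abundance.items() if value > 0}
    cand_libs = {lib for lib, value in candidate_abundance.items() if value > 0}
    return bool(rdrp_libs) and rdrp_libs == cand_libs


def depth_ratio(rdrp_abundance, candidate_abundance):
    """Candidate/RdRp abundance ratio in every library where both are present.
    Roughly constant ratios are consistent with a single virus."""
    return {
        lib: candidate_abundance[lib] / rdrp_abundance[lib]
        for lib in rdrp_abundance
        if rdrp_abundance[lib] > 0 and candidate_abundance.get(lib, 0) > 0
    }


# Hypothetical abundances (% of total reads) across three pools.
rdrp = {"pool1": 2.0, "pool2": 0.0, "pool3": 1.0}
segment = {"pool1": 4.0, "pool2": 0.0, "pool3": 2.0}
unrelated = {"pool1": 0.0, "pool2": 3.0, "pool3": 0.0}

same_virus_candidate = co_occurs(rdrp, segment)
ratios = depth_ratio(rdrp, segment)
```

Such checks only flag candidates; the conserved termini and phylogenetic position criteria still have to be examined separately.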

Identification of other microbes within mosquitoes. To identify abundant bacteria, fungi, and protists within the mosquito populations sampled we searched the assembled transcriptome for a collection of key marker genes that are abundantly and stably expressed in eukaryotes and prokaryotes. Specifically, we looked for the cox1 and Ribosomal Protein L32 (RPL32) genes to identify eukaryotes (including the mosquito host) and the DNA gyrase subunit B (gyrB) and Recombinase A protein (recA) genes to identify prokaryotes. The contigs discovered were then confirmed with (i) a Blastx search against the nr database and (ii) read mapping. The quality-screened contigs were then trimmed to contain only coding regions for quantification (see below).

RNA quantification. To help determine the abundance of RNA transcripts we estimated the percentage of total reads that mapped to target genomes/genes. The sequences used for mapping involved viral genomes as well as the mosquito and microbial marker genes identified above. The mapping was performed using Bowtie2 (31). The mapping results were manually checked with IGV (30) for potential assembly errors.
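The abundance measure described above is a simple proportion, sketched below. The read counts are hypothetical and the function names are illustrative; in practice the mapped-read counts would come from the Bowtie2 alignments.

```python
def percent_of_total_reads(mapped_reads, total_reads):
    """Abundance of one target as a percentage of all reads in the library."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


def library_abundance(mapped_by_target, total_reads):
    """Per-target percentages plus their sum (e.g. an 'all viruses' total)."""
    per_target = {
        target: percent_of_total_reads(count, total_reads)
        for target, count in mapped_by_target.items()
    }
    return per_target, sum(per_target.values())


# Hypothetical mapped-read counts for one library of 1,000,000 reads.
virus_counts = {"virus_A": 38810, "virus_B": 1930}
per_target, all_viruses = library_abundance(virus_counts, 1_000_000)
```

Because every value is expressed relative to the same library total, the per-virus percentages can be summed to give the total virome abundance of a pool.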

Phylogenetic analyses. We used the amino acid sequences of the viral replicase (i.e. RNA-dependent RNA polymerase) to determine the evolutionary history of the newly discovered viruses. For comparison, we included previously published sequences representative of each of the relevant phylogenetic groups (e.g. virus family). This also included all the previously described mosquito viruses within these groups. Within each group, the replicase proteins were aligned using the E-INS-i algorithm in MAFFT (version 7) (32). Ambiguously aligned regions were subsequently removed using trimAl (33). Based on the sequence alignment, the best-fit model of amino acid substitution was determined using ProtTest 3.4 (34). Phylogenetic trees were then estimated using the maximum likelihood (ML) method implemented in PhyML version 3.0 (35), utilizing the best-fit substitution model and the Subtree Pruning and Regrafting (SPR) branch-swapping algorithm. Support for individual nodes on the phylogenetic tree was assessed using an approximate likelihood ratio test (aLRT) with the Shimodaira-Hasegawa-like procedure as implemented in PhyML.
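As a rough illustration of what removing ambiguously aligned regions involves, the sketch below drops alignment columns that are mostly gaps. This is a deliberately simplified stand-in and not the trimAl algorithm; the 0.5 gap-fraction threshold, the function name, and the toy sequences are all assumptions made for the example.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5):
    """Remove columns whose fraction of gap characters exceeds the threshold.

    alignment: dict mapping sequence name to an aligned sequence; all
    sequences must have the same length.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")

    keep = [
        i for i in range(length)
        if sum(seq[i] == "-" for seq in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Toy alignment: the third column is a gap in two of three sequences.
toy = {"seq_a": "MK-LV", "seq_b": "MR-LV", "seq_c": "MKALV"}
trimmed = drop_gappy_columns(toy)
```

The trimmed alignment, not the raw one, would then be passed to model selection and tree estimation.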

Accession numbers. The raw sequence reads generated in this study are available at the NCBI Sequence Read Archive (SRA) database under BioProject accession PRJNA388696. All virus genome sequences generated in this study have been deposited in GenBank under the accession numbers MF176241 to MF176391.

ACKNOWLEDGMENTS

We thank the staff of the Environmental Health Directorate of the Department of Health, Western Australia for the collection of mosquitoes from the Southwest of Western Australia. In addition, we thank the staff at the City of Swan (especially Neil Harries and James McCallum) for the collection of mosquitoes from the east of Perth. The authors also wish to acknowledge the HPC service at The University of Sydney for providing high performance computing resources that have contributed to the research results reported within this paper. J-SE is supported by an NHMRC Early Career Fellowship (GNT1073466) and ECH is supported by an NHMRC Australia Fellowship (GNT1037231).

REFERENCES

1. Bolling BG, Weaver SC, Tesh RB, Vasilakis N. 2015. Insect-specific virus discovery: significance for the arbovirus community. Viruses 7:4911-4928.
2. Vasilakis N, Tesh RB. 2015. Insect-specific viruses and their potential impact on arbovirus transmission. Curr Opin Virol 15:69-74.
3. Hall RA, Bielefeldt-Ohmann H, McLean BJ, O'Brien CA, Colmant AM, Piyasena TB, Harrison JJ, Newton ND, Barnard RT, Prow NA, Deerain JM, Mah MG, Hobson-Peters J. 2017. Commensal viruses of mosquitoes: host restriction, transmission, and interaction with arboviral pathogens. Evol Bioinform Online 12:35-44.
4. Marklewitz M, Handrick S, Grasse W, Kurth A, Lukashev A, Drosten C, Ellerbrok H, Leendertz FH, Pauli G, Junglen S. 2011. Gouleako virus isolated from West African mosquitoes constitutes a proposed novel genus in the family Bunyaviridae. J Virol 85:9227-9234.
5. Marklewitz M, Zirkel F, Rwego IB, Heidemann H, Trippner P, Kurth A, Kallies R, Briese T, Lipkin WI, Drosten C, Gillespie TR, Junglen S. 2013. Discovery of a unique novel clade of mosquito-associated bunyaviruses. J Virol 87:12850-12865.
6. Li CX, Shi M, Tian JH, Lin XD, Kang YJ, Chen LJ, Qin XC, Xu J, Holmes EC, Zhang YZ. 2015. Unprecedented genomic diversity of RNA viruses in arthropods reveals the ancestry of negative-sense RNA viruses. eLife 4:e05378.
7. Marklewitz M, Zirkel F, Kurth A, Drosten C, Junglen S. 2015. Evolutionary and phenotypic analysis of live virus isolates suggests arthropod origin of a pathogenic RNA virus family. Proc Natl Acad Sci USA 112:7536-7541.
8. Shi M, Lin XD, Tian JH, Chen LJ, Chen X, Li CX, Qin XC, Li J, Cao JP, Eden JS, Buchmann J, Wang W, Xu J, Holmes EC, Zhang YZ. 2016. Redefining the invertebrate RNA virosphere. Nature 540:539-543.
9. Kuwata R, Isawa H, Hoshino K, Tsuda Y, Yanase T, Sasaki T, Kobayashi M, Sawabe K. 2011. RNA splicing in a new rhabdovirus from Culex mosquitoes. J Virol 85:6185-6196.
10. Coffey LL, Page BL, Greninger AL, Herring BL, Russell RC, Doggett SL, Haniotis J, Wang C, Deng X, Delwart EL. 2014. Enhanced arbovirus surveillance with deep sequencing: identification of novel rhabdoviruses and bunyaviruses in Australian mosquitoes. 448:146-158.
11. Walker PJ, Firth C, Widen SG, Blasdell KR, Guzman H, Wood TG, Paradkar PN, Holmes EC, Tesh RB, Vasilakis N. 2015. Evolution of genome size and complexity in the Rhabdoviridae. PLoS Pathog 11:e1004664.
12. Presti RM, Zhao G, Beatty WL, Mihindukulasuriya KA, da Rosa AP, Popov VL, Tesh RB, Virgin HW, Wang D. 2009. Quaranfil, Johnston Atoll, and Lake Chad viruses are novel members of the family Orthomyxoviridae. J Virol 83:11599-11606.
13. Qin XC, Shi M, Tian JH, Lin XD, Gao DY, He JR, Wang JB, Li CX, Kang YJ, Yu B, Zhou DJ, Xu J, Plyusnin A, Holmes EC, Zhang YZ. 2014. A tick-borne segmented RNA virus contains genome segments derived from unsegmented viral ancestors. Proc Natl Acad Sci USA 111:6744-6749.
14. Shi M, Lin XD, Vasilakis N, Tian JH, Li CX, Chen LJ, Eastwood G, Diao XN, Chen MH, Chen X, Qin XC, Widen SG, Wood TG, Tesh RB, Xu J, Holmes EC, Zhang YZ. 2015. Divergent viruses discovered in arthropods and vertebrates revise the evolutionary history of the Flaviviridae and related viruses. J Virol 90:659-669.
15. Ladner JT, Wiley MR, Beitzel B, Auguste AJ, Dupuis AP, 2nd, Lindquist ME, Sibley SD, Kota KP, Fetterer D, Eastwood G, Kimmel D, Prieto K, Guzman H, Aliota MT, Reyes D, Brueggemann EE, St John L, Hyeroba D, Lauck M, Friedrich TC, O'Connor DH, Gestole MC, Cazares LH, Popov VL, Castro-Llanos F, Kochel TJ, Kenny T, White B, Ward MD, Loaiza JR, Goldberg TL, Weaver SC, Kramer LD, Tesh RB, Palacios G. 2016. A multicomponent virus isolated from mosquitoes. Cell Host Microbe 20:357-367.
16. Nga PT, Parquet Mdel C, Lauber C, Parida M, Nabeshima T, Yu F, Thuy NT, Inoue S, Ito T, Okamoto K, Ichinose A, Snijder EJ, Morita K, Gorbalenya AE. 2011. Discovery of the first insect nidovirus, a missing evolutionary link in the emergence of the largest RNA virus genomes. PLoS Pathog 7:e1002215.
17. Attoui H, Jaafar FM, Belhouchet M, Tao SJ, Chen BQ, Liang GD, Tesh RB, de Micco P, de Lamballerie X. 2006. Liao ning virus, a new Chinese that replicates in transformed and embryonic mammalian cells. J Gen Virol 87:199-208.
18. Vasilakis N, Forrester NL, Palacios G, Nasar F, Savji N, Rossi SL, Guzman H, Wood TG, Popov V, Gorchakov R, Gonzalez AV, Haddow AD, Watts DM, da Rosa AP, Weaver SC, Lipkin WI, Tesh RB. 2013. Negevirus: a proposed new taxon of insect-specific viruses with wide geographic distribution. J Virol 87:2475-2488.
19. Cook S, Chung BY, Bass D, Moureau G, Tang S, McAlister E, Culverwell CL, Glucksman E, Wang H, Brown TD, Gould EA, Harbach RE, de Lamballerie X, Firth AE. 2013. Novel virus discovery and genome reconstruction from field RNA samples reveals highly divergent viruses in dipteran hosts. PLoS One 8:e80720.
20. Chandler JA, Liu RM, Bennett SN. 2015. RNA shotgun metagenomic sequencing of northern California (USA) mosquitoes uncovers viruses, bacteria, and fungi. Front Microbiol 6:185.
21. Cholleti H, Hayer J, Abilio AP, Mulandane FC, Verner-Carlsson J, Falk KI, Fafetine JM, Berg M, Blomstrom AL. 2016. Discovery of novel viruses in mosquitoes from the Zambezi valley of Mozambique. PLoS One 11:e0162751.
22. Frey KG, Biser T, Hamilton T, Santos CJ, Pimentel G, Mokashi VP, Bishop-Lilly KA. 2016. Bioinformatic characterization of mosquito viromes within the eastern United States and Puerto Rico: discovery of novel viruses. Evol Bioinform Online 12:1-12.
23. Junglen S, Drosten C. 2013. Virus discovery and recent insights into virus diversity in arthropods. Curr Opin Microbiol 16:507-513.
24. Mokili JL, Rohwer F, Dutilh BE. 2012. Metagenomics and future perspectives in virus discovery. Curr Opin Virol 2:63-77.
25. Radford AD, Chapman D, Dixon L, Chantrey J, Darby AC, Hall N. 2012. Application of next-generation sequencing technologies in virology. J Gen Virol 93:1853-1868.
26. Liehne PFS. 1991. An Atlas of the Mosquitoes of Western Australia. Health Department of Western Australia.
27. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114-2120.
28. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29:644-652.
29. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357-359.
30. Thorvaldsdottir H, Robinson JT, Mesirov JP. 2013. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14:178-192.
31. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357-359.
32. Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30:772-780.
33. Capella-Gutierrez S, Silla-Martinez JM, Gabaldon T. 2009. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25:1972-1973.
34. Darriba D, Taboada GL, Doallo R, Posada D. 2011. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics 27:1164-1165.
35. Guindon S, Gascuel O. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52:696-704.
36. Ballinger MJ, Bruenn JA, Hay J, Czechowski D, Taylor DJ. 2014. Discovery and evolution of bunyavirids in arctic phantom midges and ancient bunyavirid-like sequences in insect genomes. J Virol 88:8783-8794.
37. Dietzgen RG, Kondo H, Goodin MM, Kurath G, Vasilakis N. 2017. The family Rhabdoviridae: mono- and bipartite negative-sense RNA viruses with diverse genome organization and common evolutionary origins. Virus Res 227:158-170.
38. Koonin EV. 1991. The phylogeny of RNA-dependent RNA polymerases of positive-strand RNA viruses. J Gen Virol 72:2197-2206.
39. Stollar V, Thomas VL. 1975. An agent in the cell line (Peleg) which causes fusion of Aedes albopictus cells. Virology 64:367-377.
40. Bolling BG, Vasilakis N, Guzman H, Widen SG, Wood TG, Popov VL. 2015. Insect-specific viruses detected in laboratory mosquito colonies and their potential implications for experiments evaluating arbovirus vector competence. Am J Trop Med Hyg 92:422-428.
41. Coffey LL, Forrester N, Tsetsarkin K, Vasilakis N, Weaver SC. 2013. Factors shaping the adaptive landscape for arboviruses: implications for the emergence of disease. Future Microbiol 8:155-176.
42. Huang YJ, Higgs S, Horne KM, Vanlandingham DL. 2014. -mosquito interactions. Viruses 6:4703-4730.
43. Batovska J, Blacket MJ, Brown K, Lynch SE. 2016. Molecular identification of mosquitoes (Diptera: Culicidae) in southeastern Australia. Ecol Evol 6:3001-3011.
44. Smith JL, Fonseca DM. 2004. Rapid assays for identification of members of the Culex (Clex) pipiens complex, their hybrids, and other sibling species (Diptera: Culicidae). Am J Trop Med Hyg 70:339-345.
45. Streicker DG, Turmelle AS, Vonhof MJ, Kuzmin IV, McCracken GF, Rupprecht CE. 2010. Host phylogeny constrains cross-species emergence and establishment of rabies virus in bats. Science 329:676-679.
46. Johansen CA, Van den Hurk AF, Ritchie SA, Zborowski P, Nisbet DJ, Paru R, Bockarie MJ, MacDonald J, Drew AC, Khromykh TI, MacKenzie JS. 2000. Isolation of Japanese Encephalitis virus from mosquitoes (Diptera: Culicidae) collected in the Western Province of Papua New Guinea, 1997-1998. Am J Trop Med Hyg 62:631-638.
47. Ritchie S, Rochester W. 2001. Wind blown mosquitoes and introduction of Japanese Encephalitis into Australia. Emerg Infect Dis 5:900-903.
48. Inglis TJJ, O’Rielly L, Merritt AJ, Levy A, Heath C. 2011. Review: The aftermath of the western Australian melioidosis outbreak. Am J Trop Med Hyg 84:851-857.

FIGURE LEGENDS

FIG 1 Information on the host and geographic location (south-western Australia) of the mosquito samples collected in this study. Upper panel: maximum likelihood phylogeny of the cytochrome c oxidase subunit I (cox1) gene from mosquito samples collected in this study. The name of each sequence includes the sampling location and the host species identified in the field. Lower panel: locations of the four sampling sites, marked by solid black dots.

FIG 2 An overview of the diversity and abundance of the RNA viruses discovered. From top to bottom we show four column graphs depicting the number of viruses, the composition of viral families, the abundance of the total virome, and the abundance of the host RPL32 gene in each of the 12 pools sequenced here. The mosquito species and location information for each pool are shown at the top of the figure.

FIG 3 The similarity of viromes between (A) host species and (B) geographic locations. The size of each circle is proportional to the total number of viruses discovered in each mosquito species (A) or geographic location (B). Within each circle, information on the host species or geographic location and the number of viruses (in parentheses) is provided. The thickness of the line connecting the circles reflects the number of viruses shared between species or geographic locations. The number of shared viruses is shown next to the line.
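The shared-virus counts that determine the line thickness in FIG 3 amount to pairwise set intersections, which can be sketched as follows. The group labels and virus names are placeholders, and the function name is illustrative; the actual presence/absence data would be taken from Table 1.

```python
from itertools import combinations


def shared_virus_counts(presence):
    """Count viruses shared by each pair of groups.

    presence: dict mapping a group (host species or location) to the set
    of viruses detected in it. Returns {(group_a, group_b): n_shared}.
    """
    return {
        (a, b): len(presence[a] & presence[b])
        for a, b in combinations(sorted(presence), 2)
    }


# Placeholder data: three groups and four viruses.
presence = {
    "host_A": {"v1", "v2", "v3"},
    "host_B": {"v2", "v3"},
    "host_C": {"v4"},
}
shared = shared_virus_counts(presence)
circle_sizes = {group: len(viruses) for group, viruses in presence.items()}
```

The same function applies unchanged whether the groups are host species (panel A) or sampling locations (panel B).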

FIG 4 Evolutionary history and genomic features of the negative-sense RNA viruses discovered. The maximum likelihood phylogenetic trees show the position of the newly discovered viruses (solid black circles) in the context of representatives of their closest relatives. The names of mosquito viruses identified in previous studies are marked in red and include the mosquito species from which they were sampled (in square brackets). The genome structures of the newly discovered viruses are shown next to their corresponding phylogenies. Predicted ORFs of these genomes are labelled with the potential protein or protein domain they encode.

FIG 5 Evolutionary history and genomic features of the positive-sense RNA viruses discovered. The legend is the same as that of Figure 4.

FIG 6 Evolutionary history and genomic features of the double-stranded RNA viruses discovered. The legend is the same as that of Figure 4.

FIG 7 The matching tree topologies of the Wilkie qin-like viruses and a group of fungi (cox1 gene) discovered in three mosquito pools. Pool information is given between the two phylogenies, both of which are mid-point rooted for clarity only.


Table 1. The presence and abundance of viruses from different mosquito species and locations (% total reads)

| Virus Name | Classification | Ae. camptorhynchus LocA | Ae. camptorhynchus LocB | Ae. camptorhynchus LocC | Ae. camptorhynchus LocD | Ae. alboannulatus LocD | Cx. globocoxitus LocA | Cx. globocoxitus LocC | Cx. globocoxitus LocD | Cx. australicus LocA | Cx. australicus LocC | Cx. australicus LocD | Cx. quinquefasciatus LocD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Culex phasma-like virus (CPLV) | Bunyaviridae | 0 | 0 | 0 | 0 | 0 | 3.881 | 4.113 | 3.547 | 3.908 | 2.659 | 3.952 | 1.632 |
| Culex mononega-like virus 1 (CMLV1) | Mononegavirales | 0 | 0 | 0 | 0 | 0 | 0.193 | 0.068 | 0.191 | 0 | 0.059 | 0.063 | 0 |
| Culex mononega-like virus 2 (CMLV2) | Mononegavirales | 0 | 0 | 0 | 0 | 0 | 0.011 | 0.022 | 0 | 0.021 | 0.034 | 0.016 | 0.009 |
| Culex rhabdo-like virus (CRLV) | Rhabdoviridae | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.217 | 0.138 | 0 | 0 | 0.169 |
| Wuhan mosquito virus 6 (WHMV6)¹ | Orthomyxoviridae | 0 | 0 | 0 | 0 | 0 | 1.035 | 1.494 | 3.340 | 1.353 | 1.756 | 1.358 | 1.380 |
| Aedes alboannulatus orthomyxo-like virus (AAOLV) | Orthomyxoviridae | 0 | 0 | 0 | 0 | 0.217 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Wilkie qin-like virus (WQLV) | Qinvirus (New -ve sense) | 0 | 0.008 | 0 | 0.014 | 0 | 0.074 | 0 | 0 | 0 | 0 | 0 | 0 |
| Wilkie ophio-like virus (WOLV) | Ophioviridae | 0 | 0 | 0 | 0.003 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Culex negev-like virus 1 (CNLV1) | Negev virus-related | 0 | 0 | 0 | 0 | 0 | 0.286 | 0.236 | 0 | 0 | 0 | 0.501 | 0 |
| Culex negev-like virus 2 (CNLV2) | Negev virus-related | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2.645 | 0 | 0 | 0 |
| Culex negev-like virus 3 (CNLV3) | Negev virus-related | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.092 | 0 | 0 | 0 |
| Aedes camptorhynchus negev-like virus (ACNLV) | Negev virus-related | 0 | 0 | 0.389 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Culex luteo-like virus (CLLV) | Luteoviridae-related | 0 | 0 | 0 | 0 | 0 | 0 | 0.031 | 0.050 | 0 | 0 | 0.044 | 0 |
| Point-Douro narna-like virus (PDNLV) | Narnaviridae | 0 | 0 | 0 | 0 | 0 | 0.036 | 0 | 0 | 0 | 0 | 0 | 0 |
| Zhejiang mosquito virus 3 (ZJMV3)² | Narnaviridae | 0 | 0 | 0 | 0 | 0 | 0.840 | 0.449 | 1.510 | 0.342 | 0 | 0.080 | 2.181 |
| Wilkie narna-like virus 1 (WNLV1) | Narnaviridae | 0 | 0 | 0.002 | 0.009 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Wilkie narna-like virus 2 (WNLV2) | Narnaviridae | 0 | 0 | 0 | 0.013 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Ngewotan virus³ | Mesoniviridae | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.326 | 0 | 0 | 0 |
| Wilkie partiti-like virus 1 (WPLV1) | Partitiviridae-related | 0 | 0 | 0 | 0.002 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Wilkie partiti-like virus 2 (WPLV2) | Partitiviridae-related | 0 | 0 | 0 | 0.005 | 0 | 0.002 | 0 | 0 | 0 | 0 | 0 | 0 |
| Leschenault partiti-like virus (LPLV) | Partitiviridae-related | 0 | 0.006 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Aedes camptorhynchus reo-like virus (ACRLV) | Reoviridae | 0 | 0.132 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Aedes camptorhynchus toti-like virus (ACTLV) | Totiviridae | 0.013 | 0 | 0 | 0.001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Hubei chryso-like virus 1 (HBCLV1)² | Chrysoviridae | 0 | 0 | 0 | 0 | 0 | 0.108 | 0.142 | 0.131 | 0.044 | 0 | 0.027 | 0 |
| Shuangao chryso-like virus 1 (SCLV1)² | Chrysoviridae | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.141 |
| All viruses | | 0.013 | 0.146 | 0.391 | 0.047 | 0.217 | 6.464 | 6.555 | 8.987 | 16.870 | 4.508 | 6.095 | 5.513 |

¹Li et al. 2015 (6); ²Shi et al. 2016 (8); ³Vasilakis et al. 2013 (18).


Table 2. The most abundant genes from mosquitoes and other microbial organisms present in mosquitoes

| Organisms | Gene | Ae. camptorhynchus LocA | Ae. camptorhynchus LocB | Ae. camptorhynchus LocC | Ae. camptorhynchus LocD | Ae. alboannulatus LocD | Cx. globocoxitus LocA | Cx. globocoxitus LocC | Cx. globocoxitus LocD | Cx. australicus LocA | Cx. australicus LocC | Cx. australicus LocD | Cx. quinquefasciatus LocD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mosquito (principal host) | cox1 | 0.455 | 0.669 | 0.335 | 0.346 | 0.437 | 1.114 | 0.851 | 0.606 | 0.587 | 0.830 | 0.866 | 0.499 |
| Mosquito (principal host) | RPL32 | 0.041 | 0.040 | 0.034 | 0.039 | 0.045 | 0.043 | 0.057 | 0.053 | 0.051 | 0.054 | 0.065 | 0.069 |
| Fungi: Unknown sp1 | cox1 | 0 | 0 | 0 | 0 | 0 | 0.032 | 0 | 0 | 0 | 0 | 0 | 0 |
| Fungi: Unknown sp1 | RPL32 | 0 | 0 | 0 | 0 | 0 | 0.00028 | 0 | 0 | 0 | 0 | 0 | 0 |
| Fungi: Unknown sp2 | cox1 | 0 | 0 | 0 | 0.026 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Fungi: Unknown sp2 | RPL32 | 0 | 0 | 0 | 0.00065 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Fungi: Unknown sp3 | cox1 | 0 | 0.125 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Fungi: Unknown sp3 | RPL32 | 0 | 0.00126 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Fungi: Microsporidia sp | RPL32 | 0 | 0 | 0.00008 | 0.00033 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Protist: Leishmania sp | cox1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00022 | 0 | 0 | 0.00006 |
| Protist: Leishmania sp | RPL32 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00027 | 0 | 0 | 0 |
| Protist: Trypanosoma sp | RPL32 | 0 | 0 | 0.00005 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Nematode: sp | cox1 | 0 | 0.00153 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Nematode: Onchocercidae sp | cox1 | 0 | 0 | 0.00138 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Bacteria: Zymobacter palmae | gyrB | 0.00018 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Bacteria: Zymobacter palmae | recA | 0.00049 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Bacteria: Wolbachia wPip | gyrB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00028 |
| Bacteria: Wolbachia wPip | recA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00069 |


Table 3. Criteria used to identify viruses likely associated with mosquitoes

Criteria: relatively high abundance level (>0.1% of total RNA in the library); found in >2 libraries; close relatives are mosquito or insect viruses. Outcome: positive association with mosquitoes.

Virus Name: criteria marked "Yes"; positive association with mosquitoes
Culex phasma-like virus: Yes, Yes, Yes; Strong
Culex mononega-like virus 1: Yes, Yes, Yes; Strong
Culex mononega-like virus 2: Yes, Yes; Strong
Culex rhabdo-like 1: Yes, Yes, Yes; Strong
Wuhan mosquito virus 6: Yes, Yes, Yes; Strong
Aedes alboannulatus orthomyxo-like virus: Yes, Yes; Strong
Wilkie qin-like virus: Yes; Low: viruses co-appear with fungi
Wilkie ophio-like virus 1: Low
Culex negev-like virus 1: Yes, Yes, Yes; Strong
Culex negev-like virus 2: Yes, Yes; Strong
Culex negev-like virus 3: Yes, Yes; Strong
Culex negev-like virus 4: Yes, Yes; Strong
Culex luteo-like virus: Yes, Yes; Strong
Point-Douro narna-like virus: Low
Culex narna-like virus: Yes, Yes, Yes; Strong
Wilkie narna-like virus 1: Low
Wilkie narna-like virus 2: Low
Nam Dinh virus: Yes, Yes; Strong
Wilkie partiti-like virus 1: Low
Wilkie partiti-like virus 2: Yes; Low: viruses co-appear with fungi
Leschenault partiti-like virus: Low
Aedes alboannulatus reo-like virus: Yes, Yes; Strong
Aedes alboannulatus toti-like virus: Yes; Strong
Culex chryso-like virus: Yes, Yes, Yes; Strong
chryso-like virus: Yes, Yes; Strong

[FIG 1. Upper panel: cox1 phylogeny of the mosquito samples, labelled by sampling location and field species identification. Lower panel: map of the four sampling sites relative to Perth. Location keys: LocA, Point Douro; LocB, Leschenault Peninsula; LocC, Siesta Park; LocD, South Guildford.]

[FIG 2. Column graphs for the 12 pools: number of viruses (low and high abundance), proportion of each major virus family/order (% of total viral RNA), abundance of RNA viruses (% of total RNA in the library), and abundance of the host RPL32 gene (% of total RNA in the library).]

[FIG 3. Network diagrams of the viruses shared between (A) host species and (B) geographic locations.]

[FIG 4. Phylogenies and genome structures of the negative-sense RNA viruses: Bunyaviridae (Phasmavirus related), Rhabdoviridae (Dimarhabdovirus related), Mononegavirales (Borna- and Nyamivirus related), Orthomyxoviridae, Ophioviridae related, and Qinvirus.]

[FIG 5. Phylogenies and genome structures of the positive-sense RNA viruses: Negev virus related, Narnaviridae, Luteoviridae related, and Mesoniviridae. ORF key: V, viral methyltransferase; A, FtsJ-like methyltransferase; H, RNA helicase; R, RdRp; R', permuted RdRp; M, membrane protein; S, S domain capsid; RT, read-through; FS, frame shift.]

Beihai barnacle virus 2 Downloaded from 100 Culex negev-like virus 3 (CNLV3) Culex negev-like virus 2 100 Goutanap virus [Culicidae] 100 9324bp Wallerfield virus [Culex] VA HR M 100 Tanay virus [Culex quinquefasciatus] Hubei virga-like virus 7 Culex negev-like virus 3 98 Citrus leprosis virus C V A H RM 9176bp 100 Muthill virus [Drosophila immigrans] 100 Marsac virus [Scaptodrosophila deflexa] Aedes camptorhynchus negev-like virus (ACNLV) http://jvi.asm.org/ 100 Aedes camptorhynchus negev-like virus 94 Hubei virga-like virus 17 Culex negev-like virus 1 (CNLV1) V H R’ 11470bp Xinzhou nematode virus 1 80 99 RT 100 Wuhan heteroptera virus 1 Narnaviridae 100 Hubei virga-like virus 16 Wuhan insect virus 9 99 virus C Beihai anemone virus 1 100 91 0.5 Ourmia melon virus on May 27, 2018 by UNIV OF WESTERN AUSTRALIA M209 Hubei virga-like virus 23 98 80 Wuhan spider virus 7 95 Bofa virus 100 Wenzhou shrimp virus 10 Hubei virga-like virus 21 Hubei mosquito virus 3 Boutonnet virus Uncultured virus AGW51782 [Culicine sp] 99 Hubei virga-like viurs 9 99 Zhejiang mosquito virus 3 [Culex sp] (ZJMV3) 100 Narnaviridae environmental sample [Culex pipiens] 89 99 Uncultured virus 2 AJT39597 [Culex pipiens] Zhejiang mosquito virus 3 (Narnaviridae) 98 Hubei narna-like virus 20 Saccharomyces 20S RNA RdRp Point-Douro narna-like virus (PDNLV) 99 Narnavirus 3205bp 99 Saccharomyces 23S RNA narnavirus 100 Phytophthora infestans RNA virus 4 100 Wilkie narna-like virus 1 (WNLV1) Beihai barnacle virus 10 99 Beihai narna-like virus 23 Culex luteo-like virus Beihai narna-like virus 22 100 99 Wilkie narna-like virus 2 (WNLV2) Seg1 R 2830bp Wuhan horsefly Virus 3 FS: frame shift Hubei narna-like virus 23 Ophiostoma 4 99 Mitovirus Seg2 S 1400bp 98 Ophiostoma mitovirus 3a RT Mesoniviridae Luteoviridae related 99 Ngewotan virus [Culex vishnui] 99 Ngewotan virus [Culex australicus] 99 Alphamesonivirus 1 [Culex pipiens] 100 La Tardoire virus 100 0.5 0.1 Houston virus [Aedes albopictus] 99 Wuhan house 
centipede virus 5 100 Nam Dinh virus Hubei diptera virus 14 Cavally virus [Aedes harrisoni] 100 Culex luteo-like virus (CLLV) 100 100 Bontang virus [Culex tritaeniorhynchus] 100 Hubei sobemo-like virus 41 [Culicine sp] 100 100 Karang Sari virus [Culex vishnui] Humaita-Tubiacanga virus 100 Kamphang Phet virus Wuchan romanomermis nematode virus 3 90 Hana virus [Culex sp] 99 Wenzhou shrimp virus 9 Casuarina virus [Coquillettidia xanthogaster] Hubei sobemo-like virus 39 [Culicine sp] Meno virus Sanxia water strider virus 12 Nse virus Downloaded from

[Figure: phylogenetic trees with genome diagrams for double-stranded RNA viruses. Tip labels and node support values are omitted; scale bars 0.5.]
Totiviridae, Chrysoviridae related. Aedes camptorhynchus toti-like virus (ACTLV): RdRp, 6363 bp. Shuangao chryso-like virus 1 (SCLV1): Seg 1 (RdRp) 3544 bp; Seg 2 (Protease) 3197 bp; Seg 3 3154 bp; Seg 4 3145 bp. Other labeled viruses: Hubei chryso-like virus 1 (HBCLV1).
Partitiviridae related I. Wilkie partiti-like virus 1 (WPLV1); Wilkie partiti-like virus 2 (WPLV2).
Phytoreovirus (Reoviridae) related. Aedes camptorhynchus reo-like virus.
Partitiviridae related II. Leschenault partiti-like virus (LPLV).

[Figure: paired phylogenetic trees, panel titles "Qinvirus" and "Fungi: Cox1". Tip labels: LocD Aedes camptorhynchus; LocA Culex globocoxitus; LocB Aedes camptorhynchus. Scale bars 0.05.]