Genetics: Early Online, published on September 21, 2017 as 10.1534/genetics.117.300310
1
2
3 TITLE:
4
5 Insights into DDT resistance from the Drosophila melanogaster Genetic Reference Panel
6
7
8 AUTHORS:
9
10 Joshua M. Schmidt1,2,*, Paul Battlay2,*, Rebecca S. Gledhill-Smith2, Robert T. Good2, Chris
11 Lumb2, Alexandre Fournier-Level2, Charles Robin2
12
13
14 1. Department of Evolutionary Genetics, Max Planck Institute for Evolutionary
15 Anthropology, Leipzig, 04103, Saxony, Germany.
16 2. * School of BioSciences and the Bio21 Institute, University of Melbourne,
17 Melbourne, 3010, Victoria, Australia.
18 * These authors contributed equally.
19
20
21
22
23
24
25
1
Copyright 2017. 26
27
28
29 Short running title:
30 DDT resistance in the DGRP
31
32 Keywords:
33 Drosophila Genetic Reference Panel (DGRP), DDT, CG10737, Cyp6w1, triallele
34
35 Corresponding author:
36 Charles Robin, The Bio21 Institute, 30 Flemington Road, Parkville, Melbourne Victoria,
37 3010. Australia.
38 Ph:61 3 8344 2349
39 Email: [email protected]
40
41
42
43
44
45
46
47
48
49
50
2
51 ABSTRACT
52 Insecticide resistance is considered a classic model of microevolution, where a strong
53 selective agent is applied to a large natural population, resulting in a change in frequency of
54 alleles that confer resistance. While many insecticide resistance variants have been
55 characterized at the gene level, they are typically single genes of large effect identified in
56 highly resistant pest species. In contrast, multiple variants have been implicated in DDT
57 resistance in Drosophila melanogaster, however only the Cyp6g1 locus has previously been
58 shown to be relevant to field populations. Here we use genome-wide association studies to
59 identify DDT-associated polygenes and use selective sweep analyses to assess their adaptive
60 significance. We identify and verify two candidate DDT resistance loci. A largely
61 uncharacterized gene, CG10737, has a function in muscles that ameliorates the effects of
62 DDT, while a putative detoxifying P450, Cyp6w1, shows compelling evidence of positive
63 selection.
64
65
66
67
68
3
69 INTRODUCTION
70 Many insights into the genetic basis of adaptation have been gained by genetic analyses of
71 resistance to man-made chemicals; whether that be bacteria to antibiotics (Zhang et al. 2011),
72 weeds to herbicides (Baucom 2016), fungi to fungicides (Schoustra et al. 2005), or insects to
73 insecticides (Crow 1957, McKenzie 1996). Typically, the genetics of such traits are thought
74 to be strongly monogenic and thus distinct from most other traits whose variation is governed
75 by the combined effect of multiple loci of small effect (Lande 1983, Roush and McKenzie
76 1987, Allen et al. 2010). This contention is central to the debate over the genetics of DDT
77 resistance in D. melanogaster. As it is not considered a pest, this model insect may have
78 never been exposed to the high selection intensities thought necessary for the evolution of
79 major-gene-based resistance (MacNair 1991, McKenzie and Batterham 1994). Despite most
80 genetic investigations supporting a polygenic architecture, (Crow 1954, Dapkus and Merrell
81 1977, King and Sømme 1958, Shepanski et al. 1977) or more precisely weak polygenicity
82 (Dapkus 1992; i.e. at most one or two loci per chromosome arm), a gene of major effect,
83 Cyp6g1, was identified and described as ‘necessary and sufficient’ for DDT resistance
84 (Daborn et al. 2002). Proposed solutions to this apparent contradiction between ‘polygenic’
85 and ‘monogenic’ schools include delineating a difference between field and laboratory
86 evolved resistance (Ffrench- Constant 2013) or suggesting that Cyp6g1 is only inconsistently
87 associated with resistance (Kuruganti et al. 2007).
88
89 In the specific case of the long-term DDT-selected strain 91-R, there is a strong argument for
90 the polygenicity of DDT resistance. 91-R is over 1500 times more resistant to DDT than
91 another lab strain, Canton-S (Strycharz et al. 2013); levels far beyond those attributable to
92 Cyp6g1 alone. Much of the recent work aimed at identifying polygenes associated with DDT
93 resistance has focussed on contrasting these resistant and susceptible lab strains, and has
4
94 allowed the generation of transcriptomic (Pedra et al. 2004), proteomic (Festucci-Buselli et
95 al. 2005), physiological (Strycharz et al. 2013) and genomic (Steele et al. 2014, Steele et al.
96 2015) datasets of differentiated factors. These candidates, however, remain to be validated,
97 and their contribution to either the selection response in the lab, or to field resistance, have
98 yet to be quantified. Until now Cyp6g1 has remained the only DDT-resistance locus with
99 molecularly defined alleles that contribute to DDT-related genetic variation observed in field
100 populations.
101
102 Previously Schmidt et al. (2010) demonstrated that an allelic series at the Cyp6g1 locus is
103 sweeping through D. melanogaster populations. The resistant Cyp6g1 variant described by
104 Daborn et al. (2002), which harbored a partial Accord transposable element insertion
105 upstream of the Cyp6g1 locus, was subsequently found to be invariably associated with a
106 duplication of the Cyp6g1 locus (Schmidt et al. 2010). It is still unclear to what extent the
107 upregulation of Cyp6g1 observed at this allele is due to the Accord element (Chung et al.
108 2007) or to the presence of multiple Cyp6g1 copies, or both. However, it is clear that the
109 ancestral Cyp6g1 allele (Cyp6g1-M) is at very low frequencies throughout the world and has
110 largely been replaced by haplotypes carrying the duplication and various transposable
111 element insertions. The Cyp6g1-AA allele (which has two copies of the gene both bearing the
112 Accord element) and the Cyp6g1-BA allele (where one of the duplicated copies contains an
113 additional HMS-Beagle transposable element insertion) both confer resistance to DDT
114 relative to the Cyp6g1-M allele, although they show only subtle phenotypic differences
115 between one another. A fourth allele, Cyp6g1-BP, is at high frequencies in populations in
116 northeastern Australia, and provides a further increase in resistance, in both a set of
117 isochromosomal lines and in a field population (Schmidt et al. 2010). In the latter, the
5
118 Cyp6g1-BP allele accounts for approximately 16% of the total variation in the DDT
119 resistance phenotype.
120
121 The analysis of field variation at the Cyp6g1 locus provides two arguments suggesting that
122 other loci contribute to DDT resistance in the field. Firstly, as Cyp6g1 allelic variation does
123 not account for all the 20-50% heritability expected for insecticide resistance (McKenzie
124 2000), there is much genetic variation that is not accounted for (20-68%). Secondly, as a
125 succession of Cyp6g1 alleles provides a series of adaptive events at one locus, there has been
126 time and selective pressure for other adaptive events arising elsewhere in the genome.
127
128 DDT affects the nervous system by targeting the voltage-gated sodium channel resulting in
129 uncontrolled nerve firing. In D. melanogaster, the alpha subunit of the voltage-gated sodium
130 channel is encoded by para, and target-site resistance to DDT has been described in para
131 mutants by Pittendrigh et al. (1997) and Lindsay et al. (2008). However, para variation
132 affecting DDT resistance has not yet been identified in outbred populations of D.
133 melanogaster, nor has it been implicated in the phenotypes of well-studied DDT-resistant
134 laboratory strains (Dapkus and Merrell 1977; Steele et al. 2014).
135
136 DDT-induced mortality is typically preceded by the loss of peripheral muscle control and
137 manifests as a ‘knockdown’ phenotype in which the fly lies paralyzed with appendages
138 twitching, or exhibits uncontrolled flight. Here we use the Drosophila Genetic Reference
139 Panel (DGRP; Mackay et al. 2012; Mackay and Huang 2017), a resource for the dissection of
140 quantitative traits to characterize the genetic architecture of both knockdown and mortality
141 caused by DDT exposure using predominantly field-derived genetic variation from a single
142 population. This population has shown great phenotypic diversity for, and furnished genetic
6
143 associations with, a diverse range of traits including insecticide resistance (Mackay et al.
144 2012, Huang, Massouras et al. 2014, Battlay et al. 2016). We perform genome-wide
145 association studies on two DDT-related traits to gain insights into insecticide biology and
146 insecticide-driven selection using a classic insecticide model.
147
148
149 MATERIALS AND METHODS
150 Fly stocks: DGRP lines were obtained from the Bloomington Stock Center, and were
151 maintained on corn meal media within a 22-25 degree Celsius temperature range. For the
152 DDT assays, larval density was controlled by setting up vials with 10 mated females that
153 were allowed to lay for three days, then transferred to fresh media for a further three days to
154 establish an additional brood to provide enough progeny for replication. At this stage 2-3
155 replicates per line were established. Adults were allowed to eclose for 2-3 days, and females
156 were collected to food-containing holding vials after brief CO2 anaesthetization. After 2-3
157 days recovery adult females were assayed, thus the age range was 4-6 days for flies assayed.
158
159 DDT assays: Glass scintillation vials were inoculated with 200ul of Acetone/DDT solution at
160 a concentration of 0.5 µg/µL, giving a final contact assay amount of 100 µg of DDT. Cotton
161 wool moistened with 10% sucrose solution was used to stopper the scintillation vials. Assays
162 were setup between 7-8 am, roughly corresponding with the onset of the light phase of the
163 diurnal cycle. Time of setup was recorded, and each vial was scored for the presence of flies
164 exhibiting either knockdown or mortality at approximately one- hour intervals. Knockdown
165 here is defined as either flies permanently seen to be in a prone position with jolting
166 movement of legs or wings, or a prolonged inability to right themselves from a prone position
167 after tapping of the scintillation vial on the lab bench. Mortality was determined as the point
7
168 all external movement ceased. Data was recorded in data sheets with time points as columns
169 and DGRP line number as rows. Assay vials contained 10 females, and there were at least 3
170 replicates per DGRP line. A total of 184 DGRP lines were assayed. In total the assay time
171 was 30 hours. Phenotypes were scored through 1-15, 24 and 30 hours.
172
173 Estimation of heritability: At least three replicates of ten flies per vial were analyzed per
174 DGRP line, which allowed the calculation of broad sense heritability (H2). Following the
175 method of Clowers et al. (2010), heritability was estimated from the variance components of
176 a linear model of the form: Phenotype = Population mean + Line effect + error. As these lines
177 are inbred, the components of the Genetic variance (e.g. additive or dominant), cannot be
178 separated, thus these estimates are of broad sense (H2) heritability. An ANOVA was
179 performed in SPSS, and the variance components were estimated as the Mean Sum of
180 Squares. Total phenotypic variance was estimated as Genetic Variance + Environmental
2 181 Variance. H was thus estimated as Gv/Gv+Ev. This was calculated for each discreet time
182 point.
183
184 Genome-wide Association Studies: Phenotypes were submitted to the Mackay Lab DGRP2
185 website for GWAS (http://dgrp.gnets.ncsu.edu/; Huang, Massouras et al. 2014). To perform
186 our own GWAS for DDT resistance traits, we used the PLINK GWAS program (PLINK
187 v1.9, https://www.cog-genomics.org/plink2; Chang et al. 2015). For genotypes, we initially
188 downloaded the DGRP freeze 1 SNP tables
189 (http://www.hgsc.bcm.tmc.edu/projects/dgrp/freeze1_July_2010/snp_calls/). These tables
190 encoded ‘haploid’ variant calls, with heterozygous positions encoded with the relevant
191 ambiguity codes. To make PLINK compatible files, we used perl scripts to modify these SNP
192 tables into diploid genotypes. We were surprised by a large number of sites that contained
8
193 more than two alleles. The initial DGRP freeze 1 GWAS tool removed all such sites, but
194 assuming that many of these represented low frequency sites or errors, we ranked each
195 variant by frequency and kept the two most frequent ones. The one major difference between
196 our initial results and that of the DGRP freeze 1 GWAS tool was the tri-allelic variant at
197 Cyp6w1.
198
199 However, our analyses here are based on the newer DGRP freeze 2 release. For this we
200 downloaded the dgrp2.bed file (PLINK format) from the DGRP2 website
201 (http://dgrp2.gnets.ncsu.edu/data/website/dgrp2.bed). Thus, these genotypes are identical to
202 those used in the DGRP webtool, except for choices in variant filtering as indicated in the
203 PLINK command line below. It should also be noted that now no heterozygous genotypes
204 are present, and that the two most common variants at tri-allelic sites are also present in this
205 bed file.
206
207 We used the following base PLINK command to perform GWAS (linear regression of
208 genotype phenotype association assuming an additive model of allelic effects) after initially
209 filtering for SNPs with MAF >= 0.05 and at least 70% genotyping rate (given that a tri-allelic
210 site with equal proportions of all three genotypes would have a rate of ~ 66% in the DGRP
211 freeze 2 data).
212
213 plink --allow-extra-chr --allow-no-sex --bfile dgrp2 --covar
214 dgrp2_ESTRAT_PCA20.txt --linear --map3 --no-fid --no-parents --no-pheno --
215 no-sex --aperm 5 1000000
216
217 The ‘aperm’ option was used to generate empirical estimates of the p-value for each SNP. To
9
218 calculate the family wise error rate (FWER) for a given p-value or level of significance, we
219 generated 5,000 random permutations of the 4hrkd and 24hrm phenotype data and performed
220 the same linear association model as for the observed phenotypes. For each permutation we
221 record the lowest observed p-value. The FWER is then the number of random permutations
222 that generated a p-value lower than or equal to that observed, for each SNP i.e. that chance of
223 at least one false positive at this level of significance. After filtering, 1,776,058 SNPs were
224 tested for each PLINK GWAS.
225
226 We performed Principal Components Analysis on the DGRP data using PLINK after pre-
227 filtering genotypes for minor allele frequency (>0.05), missing rate (<0.70) and linkage
228 disequilibrium (LD) pruning (r2 < 0.2) using the indep pairwise command, with both window
229 and step size of 500 variants. To control for confounding cryptic relatedness in the DGRP, we
230 used the first 20 Principle Components as co-variables in all GWAS, as indicated above with
231 the ‘--covar dgrp2_ESTRAT_PCA20.txt’ option in the PLINK command. As inversions and
232 wolbachia infection status can also influence the phenotypes of the DGRP lines, we used the
233 phenotypes adjusted for these factors outputted from the DGRP2 website. Thus, our results
234 for the DGRP freeze 1 and DGRP freeze 2 differ due to now correcting for population
235 structure, adjusting phenotypes for inversion and wolbachia infection status, and the removal
236 of heterozygous genotypes in the DGRP freeze 2. Finally, for annotating variants we used the
237 annotations made available on the DGRP2 website
238 (http://dgrp2.gnets.ncsu.edu/data/website/dgrp.fb557.annot.txt).
239
240 To test specific variants at Cyp6g1 we used PCR primers described in Schmidt et al. (2010)
241 and/or local alignments of Illumina and 454 reads to determine the Cyp6g1 genotypes. For
242 Cyp6w1, our DGRP freeze 1 GWAS indicated the presence of a triallelic site at
10
243 2R:6174944[V6]. We used the re-processed DGRP genotype data from the Drosophila
244 Genome Nexus (Lack et. al. 2015) files to extract genotype data for this position (described
245 further under the iHS and nSL tests methods), as the genotypes for the third variant state are
246 removed from the DGRP freeze 2 data. For both these genes, the sites of interest are triallelic
247 in the DGRP. We generated files with the ancestral allele and derived alleles, both singularly
248 and combined after collapsing to the same allelic code, included to test for associations with
249 DDT phenotypes. In the particular case of Cyp6g1 we introduced a variable site to code
250 Cyp6g1-M, Cyp6g1-AA, or Cyp6g1-BA alleles.
251
252 iHS and nSL tests of recent positive selection: We downloaded the haploid FASTA
253 alignments for 205 DGRP2 genomes, described by Lack et. al. (2015), that are contained in
254 the Drosophila Genome Nexus (DGN; http://johnpool.net/genomes.html). We applied the
255 supplied masking filter for identity by decent (IBD) regions, but not the masks for admixture.
256 For ease of downstream processing we converted these FASTA alignments into VCF files
257 using custom bash and perl scripts. Note, these scripts kept the genotype information at
258 triallelic sites.
259
260 After IBD masking and the pre-filtering of excessively heterozygous regions in the DGN data
261 (using the scripts available from http://johnpool.net/genomes.html), the genomes varied
262 drastically in the proportion of sites with missing data. We excluded chromosomes with more
263 than 20% missing data, leaving us with 134 lines for chromosome 3L. For other
264 chromosomes, we randomly sampled lines without replacement to n=134. In the compilation
265 of VCF files, we excluded sites with greater than 15% missing genotypes in the DGRP2.
266 There were of course missing data remaining after these two steps, still rendering them
267 unsuitable for haplotype tests. We filled in missing genotypes by randomly sampling from
11
268 called genotypes, in proportion to their population frequency. In the main this procedure will
269 reduce local linkage disequilibrium, and thus is conservative vis. a vis. LD based tests of
270 selection, and is similar to the method employed by Garud et. al. (2015).
271
272 We used two tests of selection that both utilize signatures of LD around a beneficial allele.
273 This LD can be measured by the extended haplotype homozygosity (EHH) statistic (Sabeti et.
274 al. 2002). EHH summarises the probability that two chromosomes bearing the beneficial
275 allele are also identical by state at a nearby neutral variant. Increased LD will thus also
276 increase the EHH statistic. The integrated haplotype score (iHS; Voight et. al., 2006), extends
277 EHH by calculating the area under the EHH curve (iHH, and thus integrated) for both the
278 derived and ancestral variant, and then finding the ratio of iHHDerived/iHHAncestral. If LD
279 for both the derived and ancestral allele is approximately the same this ratio should be close
280 to 1, whereas large departures could indicate the action of recent positive selection. As LD is
281 also influenced by the age of mutations, iHS scores are standardised in bins of derived allele
282 frequency, as under a neutral model frequency is indicative of allele age (on average). nSL
283 (number of segregating sites by length; Ferrer-Admetlla et. al., 2014) is similar in spirit to
284 iHS, but instead of EHH the base statistic is the average count of SNPs for which two
285 haplotypes are identical. We use it here because this statistic may be less biased towards
286 regions of low recombination, and have an increased ability to detect soft sweeps (ie the
287 beneficial allele is segregating on more than one distinct haplotype at the time of selection).
288 To calculate both iHS and nSL we used the versions implemented in selscan with default
289 parameters for each (Szpiech and Hernandez 2014). We kept bi-allelic SNPs with 5% < DAF
290 < 95%. The ancestral state was estimated as the homologous Drosophila simulans reference
291 genome allele (aligned to D. melanogaster, also downloaded from
292 http://johnpool.net/genomes.html). Both statistics were normalised in 1% frequency bins per
12
293 chromosome, and a p-value per site was calculated from the empirical distribution of
294 normalised scores (|iHS| |nSL|) per chromosome. Genetic map positions were estimated using
295 the recombination maps generated by Comeron et. al. (2012).
296 For the triallelic sites at Cyp6g1 and Cyp6w1, we manually constructed haplotype files
297 containing the set of one of the derived haplotypes and the ancestral haplotypes. These sets
298 are a subset of those for biallelic sites, and p-values for each of these was calculated as above.
299 For the extended haplotype structure for the Cyp6w1_GLY allele, the extended haplotype
300 homozygosity (EHH) curve does not decay below 0.05 within the default window size, a
301 reflection of both increased haplotype structure but also a region of increased heterozygosity
302 downstream of Cyp6w1 which is filtered out in most DGRP2 individuals and introduces a
303 gap >200kb. To compute an iHS value for this variant we relaxed the default parameters, so
304 that the max gap is 250 kb. Manual inspection of the EHH curves (figure 4) suggests that the
305 high iHS statistic for this variant is due to haplotype structure and not this gap in the
306 estimated haplotypes.
307
308 CG10737 RNAi: Line CG10737-KK (VDRC ID 106383) that has a UAS followed by a
309 hairpin sequence matching CG10737 with no recorded off-targets was obtained from the
310 VRDC. It was crossed to Mef2-GAL4 (Bloomington ID 27390). Line y, w[1118];
311 P(attP,y+,w[3’]) (VDRC ID 60100) was used a control and also crossed to the Mef2-GAL4
312 driver. The offspring of these crosses were evaluated for the DDT resistance phenotypes, 3-
313 hour knockdown and 24-hour mortality. The screens for each cross were performed over a
314 range of DDT concentrations using replicates of 10 females aged 4 to 6 days old. For every
315 DDT concentration, a minimum of 80 females were screened for 3-hour knockdown and a
316 minimum of 50 females were screened for 24-hour mortality. The relationship between the 3-
13
317 hour knockdown resistance phenotypes and DDT concentration was evaluated using
318 XLSTAT to perform a Probit analysis, a log normal regression of the knockdown data
319 (Sakuma 1998). Dosage mortality curves were constructed for the relevant crosses and their
320 controls and the ED50, the dose that will theoretically knockdown 50% of the population, was
321 estimated. To ascertain that the gene had indeed been knocked down total RNA was extracted
322 from 20-30 adult females using the TRIsure-reagent. The extracted RNA was treated with
323 RNA-free DNase (NEB) and residual genomic DNA contamination was assessed for each
324 sample by attempting to amplify a PCR product from the RNA. The samples were converted
325 to cDNA using 1µg of RNA in 20µl reaction with Anchored Oligo (dT)20 and MuLV Reverse
326 Transcriptase (NEB) according to the manufacturer’s instructions.
327
328 Cyp6w1 transgenic overexpression: UAS-Cyp6w1 constructs were designed using the y; cn
329 bw sp; reference strain Cyp6w1 coding sequence as a template. Constructs containing each
330 state of Cyp6w1 triallele identified in the 24hrm GWAS were created by mutating the
331 VAL370 codon in the reference sequence to ALA370 and GLY370. All three constructs were
332 manufactured by Biomatik USA (Delaware, USA), and were supplied cloned into plasmid
333 pUASTattB.
334
335 The UAS-Cyp6w1 constructs were transformed into the y1 M{vas-int.Dm}ZH- 2A w*; (EPS)
336 M{36P3-RFP.attP}ZH-86Fb recipient strain, which has a defined integration site on
337 Chromosome 3, using the attP–attB system and QC31 integrase (Bischof et al. 2007). To
338 generate heterozygous Cyp6w1/tubulin-GAL4 or serrate flies, virgin female homozygous
339 UAS-Cyp6w1 flies were collected on light CO2, stored at 22-25 degrees Celsius for four days
340 to ensure virginity, and then 10 females per vial were crossed to tubulin-GAL4/serrate males.
341 At the same time vials with either 10 UAS-Cyp6w1 or tubulin- GAL4/serrate females were
14
342 established. Adults for all these fly lines were allowed to eclose for 2-3 days, and females
343 were collected to food-containing holding vials after brief CO2 anesthesia.
344
345 Each line was tested on a minimum of 5 doses of DDT. A minimum of 5 replicates of 20
346 individuals per dose per line were treated in the bio-assay. Glass scintillation vials were
347 inoculated with 200µL of Acetone/DDT solution with concentrations ranging from 5×10-5 to
348 1.0µg/µL, giving a final contact assay range of 0.01 to 200µg of DDT. Cotton wool
349 moistened with 10% sucrose solution was used to stopper the scintillation vials. Flies were
350 assayed at 25 degrees Celsius for 24 hours. Data for each tested line was individually
351 analyzed using PriProbit (Sakuma 1998) to estimate the LC50 and generate data for plotting
352 dosage mortality curves and 95 per cent confidence intervals.
353
354 91-R and 91-C genome examination: .bam files containing alignments of 91-R and 91-C
355 sequencing reads to the y; cn bw sp; reference genome were obtained from the sequence read
356 archive (Leinonen et al. 2010; SRR1237973; SRR1237974). Alignments at relevant loci were
357 visualized using IGV software (Thorvaldsdóttir, Robinson & Mesirov 2013) to determine
358 presence of SNPs and discordant reads consistent with structural variation.
359
360 Raw phenotypic data is available in the Supplementary material file S1.
361
362
363 RESULTS
364 Genome-wide association studies: 184 lines of the Drosophila Genetic Reference Panel
365 (DGRP) were exposed to a dose of 100µg of DDT and monitored over 30 hours. Two
366 phenotypes were recorded; percentage knocked-down and percentage dead. The median of
15
367 the 50% knockdown time for the DGRP lines was 15 hours, while the median of 50%
368 mortality time was 18 hours.
369
370 We chose to focus on 24-hour mortality (24hrm), as it is a standard assay in D. melanogaster
371 insecticide resistance literature (Daborn et al. 2001, Schmidt et al. 2010). We also observed
372 minimal zero-dose control mortality and maximal broad-sense heritability (H2= 0.8) at this
373 time point. Of our knockdown timepoints, we chose four-hour knockdown (4hrkd) for
374 further analysis, as the H2 of our assay reached an early peak at this time point and it thus
375 provides a robust yet contrasting dataset to the 24-hour mortality (fig. 1A; file S1). Based on
376 the number of lines phenotyped and genotyped (n=179), the number of replicates (r=3) and
377 the broad-sense heritability of the trait (H2=0.8), the method of Mackay and Huang (2017)
378 was used to estimate the power to detect variants for a range of effect sizes. Variants with an
379 effect of 12% of H2 (~0.1 units of the trait's standard deviation) would be detected with 50%
380 chance and variants with an effect of 22% of H2 (~0.18 units of the trait's standard
381 deviation) with 95% chance, respectively (File S4).
382
383 The 4hrkd phenotype may reveal the intrinsic variation in insecticide uptake routes and
384 nervous system sensitivities, while the later 24hrm phenotype may encompass variation in the
385 physiological response to the insecticide such as inducible detoxification mechanisms. 4hrkd
386 and 24hrm show distinctly different distributions within the DGRP. The population mean for
387 4hrkd is 21 per cent with standard deviation of 19 per cent, while 24hrm has a mean mortality
388 of 58 per cent, with standard deviation 29 per cent. The genetic correlation between the two
389 traits was estimated to be 0.42 (fig. 1B).
390
16
A ●●●●●●●●●●●●●●●● 100 100 ●●●●● ● ●●●● ● ●●
●●●●●
●● ● ● ●●●● ● ●●●●●●● ●●●●●●●● ● 80 ● 80 ●●●● ●● ●● ●● ●●● ● ●● ● ●● ●●●●
●●● ● ●●●●● ●●● ● ) ● ) ●● t t ●●● ●●●● n
n ●●●●●● 60 ● 60 ●
● e ●●●● e ● ●● ●●● c ●●● c
●● r ●●●●●●●●● r ●● ●● e
e ● ●●●●● ●●●● p
p ●
●● ( ●● (
●●●●● ● ●● ●●●
40 m 40 ●● ●●● ●
kd ●●●● ●●● ● ● ●●●●●● hr
hr ●●●●● ●●●●●
4 ●●●●●●●● ●●●● ● ● 24 ●● ●●●●●● ● ● ●●●●●●●● ● ● ●●●●●●●●● ●
●●● ●●●● 20 ●●●● 20 ● ●●●●●●●●● ●●●● ●●●●●●●●● ●● ●●●●●●●● ●●●●●
●●●●●●●●●● ● ●●● ●●●●●●●●●●●●● ●● ●● ● ●●●● ● ●●●●●●●●●●●●●● ●● 0 ●●●●●●●●●●●●●●●●● 0 ●●●●●
0 50 100 150 200 0 50 100 150 200 4hrkd rank 24hrm rank B
● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 100 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 80 ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●
k 60 ● ● ● ● ● ● ●
n ● ● ● ● ● ● ● a
r ● ● ● ● ● ● ● ● ●
m ● ● ● ● ●
hr ● ● ● ● 40 ● ● ● ● ● ● ● ● 24 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 20 ● ● ● ●
● ● ● ● ● ● ● ● ● ● 0 ● ●
0 20 40 60 80 100 4hrkd rank 391
392 Figure 1: DDT Knockdown and Mortality. (A) The mean percentage four-hour knockdown
393 and 24-hour mortality for each DGRP line (based on at least three replicates of 10 individuals
394 each) with lines ordered from lowest to highest for each trait. (B) The scatterplot shows the
395 correlation between 24-hour mortality and four-hour knockdown. Flies that are knocked
396 down early can recover and exhibit late mortality.
397
398 We performed GWAS on our four-hour knockdown and 24-hour mortality phenotypes using
17
399 both the DGRP2 webtool (Huang, Massouras et al. 2014) and a custom implementation using
400 PLINK (version 1.9). We did not observe a significant (ANOVA p > 0.05) effect on DDT
401 induced knockdown or mortality of Wolbacchia infection (knockdown: r2= 0.013 and
402 mortality: r2= 0.010), inversions with a frequency greater than 5% (knockdown: r2 range = -
403 0.145 - 0.176 and mortality: r2 range = -0.264 - 0.136) or genome size
404 (knockdown: r2=0.005 and mortality: r2=0.004). All four GWAS detected variants with p-
405 values less than the arbitrary genome-wide significance threshold (1×10-5; file S2; Mackay et
406 al. 2012). None of the variants associated with 24hrm were also associated with the 4hrkd
407 phenotype. Only 24hrm with the PLINK pipeline showed a variant with family-wise error
408 rate (FWER) < 0.05, in a cytochrome P450 gene Cyp6w1. Using a less conservative cutoff of
409 p<1×10-6, 4hrkd showed 8 variants. Of note, one of these is in CG10737 a gene that has
410 previously been implicated in DDT resistance.
411
412 CG10737: Two SNPs separated by 30 nucleotides occur in the intron of CG10737, and are
413 highly associated with 4hrkd in both PLINK and DGRP2 GWAS; they are in complete LD
414 with each other and have slightly different p-values of association because of missing data
415 (2R:19169399[V6], 2R:19169429[V6]; file S2). CG10737 is predicted to encode a protein
416 with C1 and C2 domains that are capable of diacylglycerol binding and calcium dependent
417 targeting respectively, and is therefore postulated to be involved in intracellular signal
418 transduction (Attrill et al. 2015; Mitchell et al. 2014). A previous study highlighted that
419 CG10737 is one of five ‘lipid metabolism’ genes shown to be differentially expressed among
420 two DDT-resistant lines and a susceptible lab line. Specifically, CG10737 was expressed at
421 significantly lower levels in the DDT-resistant Wisc-1 line than the susceptible Canton-S line
422 (Pedra et al. 2004). This led us to hypothesize that a reduction of CG10737 transcript would
423 lead to increased resistance to DDT-knockdown. To test this hypothesis, we crossed a VDRC
18
424 UAS-CG10737-KK line to knockdown this gene using a Mef2-GAL4 driver, which targets the
425 muscles. The justification for this driver is four-fold; (i) according to the modENCODE
426 datasets CG10737 is highly expressed in the muscle associated tissues (crop, hindgut, heart,
427 carcass and male accessory gland) and is considered muscle-enriched in two microarray
428 analyses (Zhan et al. 2007, Weake et al. 2008), (ii) genes that are co-expressed with
429 CG10737 are known muscle genes (Myosin heavy chain, myosin light chain 1 and 2,
430 troponin and myosuppressins) (iii) chromatin co-immunoprecipitation experiments show that
431 MEF2 binds to the CG10737 cis regulatory region (Sandmann et al. 2006, Cunha et al. 2010)
432 and (iv) Bonn (2010) demonstrated that antibodies to CG10737 stained less intensely in the
433 somatic and visceral mesoderm when in a mef2 null background.
434
435 The progeny of the Mef2-GAL4 x UAS-CG10737-KK cross were assayed for DDT
436 knockdown resistance over a range of doses. The dose of DDT required to knockdown 50%
437 of flies (ED50) was more than twice as high for UAS-CG10737-KK lines as it was for the
438 control lines (109 ug/vial versus <47ug/vial; fig. 2). These findings support the hypothesis
439 that CG10737 transcript reduction increases resistance to DDT knockdown.
440
19
125
100
75
50
ED50 (ug/mL) 25
0
CG10737-KK 60100 CG10737-KK × w1118 × mef2-GAL4 × mef2-GAL4
441
442 Figure 2: RNAi knockdown confirms that lowering CG10737 transcript abundance increases
443 DDT resistance, and that this effect can be mediated by manipulating muscle expression.
444 Flies in which CG10737 is knocked down in muscles using the mef2-GAL4 driver crossed to
445 the VDRC CG10737-KK line are significantly more resistant to DDT than control flies which
446 are the result of substituting the relevant genetic background line for either mef2-GAL4
447 (w1118) or the CG10737-KK line (60100) in the cross.
448
449 Cyp6w1: This gene is the standout candidate as judged by significance of association with
450 variation in 24-hour mortality. The associated SNP (2R:6174944[V6]) results in a coding
451 sequence change in an enzyme belonging to a family capable of detoxification and which is
20
452 most abundantly expressed in the fly fat body, a known detoxification tissue. It is also
453 expressed in the spermatheca, heart, head and carcass (Chintapalli et al. 2007), and Pedra et
454 al. (2004) found it to be overexpressed in the DDT-resistant 91-R line relative to the Canton-
455 S susceptible line. The variant site associated with DDT 24hrm in Cyp6w1 is particularly
456 interesting as three different allelic states at this site are observed within the DGRP, and each
457 state encodes a different amino acid. Such triallelic sites, which represent 2.5% of variable
458 sites in the DGRP, get excluded by many analyses (either explicitly- by excluding sites with
459 more than two states, or, implicitly- because the most abundant two states combined are not
460 enough to pass abundance thresholds).We surmise that this latter factor is the reason we do
461 not recover this variant using the Mackay DGRP2 webtool (Huang, Massouras et al. 2014;
462 http://dgrp.gnets.ncsu.edu/), as less than 75 per cent of lines have a called genotype for this
463 position. In contrast, we used a lower missing genotype threshold of 60 per cent in our
464 PLINK GWAS pipeline. We initially identified 2R:6174944[V6], as a bi-allelic TC transition
465 resulting in a Val370Ala substitution. By investigating this site in the Drosophila genome
466 DGRP2 data, we found a third state, G, that is at a frequency of 14% in the DGRP, and
467 results in a Val370Gly substitution. Combining the effect of the C and G alleles, the p-value
468 for association with DDT mortality is 7.88×10-7. Individually the G allele is insignificant
469 genome-wide (p=0.015), but taken together these results suggest that it too could influence
470 DDT resistance.
471
472 To test what effect the three amino acid states at site 370 of CYP6W1 have on DDT
473 resistance three transgenic lines were generated such they that only differed at that single site.
474 The three lines were crossed to a GAL4 driver line under the control of tubulin promoter
475 sequence, so that the UAS-Cyp6w1 isoforms were expressed in a ubiquitous manner. All
476 three of the Cyp6w1 transgenes were at least 12 times more resistant to DDT than their intra-
21
477 cross sibs when under the control of the tubulin-GAL4 driver (Figure 3; LD50 [the dose
478 predicted to result in 50% mortality] for Cyp6w1 overexpression lines: Cyp6w1_VAL =
479 14ug/vial, Cyp6w1_GLY = 12ug/vial and Cyp6w1_ALA = 28ug/vial). This indicates that
480 overexpression of Cyp6w1 in itself can result in a DDT resistance. However, an additional
481 effect is seen for the Ala370 amino acid substitution, as flies overexpressing this isoform are
482 at least two times more resistant than either Val370 or Gly370 counterparts (fig. 3). Thus, the
483 Ala370 substitution appears to increase DDT resistance, supporting the results of our 24-hour
484 mortality GWAS.
485
c 40
30 b 20 Cyp6w1_ALA Cyp6w1_GLY
10 a a Cyp6w1_VAL LD50 (ug/mL) 0 UAS-Cyp6w1 intra-cross UAS-Cyp6w1 parent control × Tub-GAL4 486
487 Figure 3: Toxicology of transgenic lines overexpressing the three allelic forms of Cyp6w1
488 (far right) and various controls. All three allelic forms of Cyp6w1 confer DDT resistance
489 relative to controls, and Cyp6w1_ALA confers a further significant increase in DDT
490 resistance relative to the other two alleles.
491
492 As our transgenic analyses showed that Cyp6w1 overexpression can yield resistance and
493 Cyp6w1 transcripts are more abundant in Pedra et al. (2004), we examined Cyp6w1
494 transcription in females from DGRP transcriptome dataset (Huang et al. 2015). However
22
495 there was no strong correlation between Cyp6w1 transcription level and DDT mortality or
496 knockdown data, nor between Cyp6w1 transcription level and the triallelic states of Cyp6w1.
497 Similarly, we did not find that any of the other eight detoxification or four lipid metabolism
498 genes that were focused on by Pedra et al. (2004) to have strong correlations between the
499 DGRP transcription data sets and either DDT traits we examined (file S3).
500
501 Selective sweep tests at DDT-associated loci: To assess whether 4hrkd and 24hrm-
502 associated (p<1×10-5) variants from the PLINK GWAS exhibited evidence of being the
503 targets of selection we used the integrated Haplotype Score (iHS) test (Voight et al. 2006,
504 Sabeti et al. 2002). We also calculated the related statistic nSL (number of segregating sites
505 by length; Ferrer-Admetlla et. al. 2014), which defines haplotype lengths via pairwise
506 differences and may have higher power to detect soft selective sweeps. The only derived
507 variant yielding a significant iHS or nSL score was the Glycine allele of Cyp6w1 (|iHS| =
508 3.82, p-value = 0.00085; fig. 4; |nSL| = 3.04, p-value = 0.0028). The other derived variants
509 did not show significant departures from expectation, including the second derived allele at
510 Cyp6w1 – the alanine allele that confers increased DDT resistance in our transgenic
511 experiments.
512
513 To ascertain the power of these iHS and nSL approaches with this dataset we also examined
514 Cyp6g1, the genomic region around which Catania et al. (2004), Schlenke and Begun (2004)
515 and Garud et al. (2015) noted a reduction in genetic diversity indicative of a selective sweep.
516 While there is an allelic series at this locus (Schmidt et al. 2010), the two predominate alleles
517 in the DGRP are Cyp6g1-AA and Cyp6g1-BA, and the EHH for both of these alleles and for
518 the ancestral Cyp6g1-M allele were determined. Clearly both derived Cyp6g1 alleles exhibit
519 large extended haplotypes. Both of these are significant compared to ancestral Cyp6g1-M
23
520 allele (fig. 4; Cyp6g1-AA : |iHS| = 2.45, p-value = 0.018; |nSL| = 2.61, p-value = 0.011;
521 Cyp6g1-BA : |iHS| = 1.96, p-value = 0.027; |nSL| = 2.47, p-value = 0.012).
A B 1.0 1.0 Cyp6w1−Val Cyp6g1−M Cyp6w1−Ala Cyp6g1−AA Cyp6w1−Gly Cyp6g1−BA 0.8 0.8 0.6 0.6 EHH EHH 0.4 0.4 0.2 0.2
−0.05 0.00 0.05 0.10 0.15 −0.2 −0.1 0.0 0.1 0.2 0.3
genetic distance from focal variant (cM) genetic distance from focal variant (cM) 522
523 Figure 4: Extended Haplotype Homozygosity (EHH) plots, (A) around the Cyp6w1 locus
524 using the triallelic second site of codon 370 as the focal variant, and (B) around the Cyp6g1
525 locus using transposable element insertion sites as the focal variant.
526
527 Further evidence that insecticide resistance alleles may be adaptive comes from elevated
528 levels of population differentiation (Taylor et al. 1995). We examined Cyp6w1 allele
529 frequencies in African, European, Australian and North American populations (Pool et al.
530 2012, Martins et al. 2014, Bergman and Haddrill 2015, Reinhardt et al. 2014; table 1); most
531 populations indicate segregation of each of the triallelic states, except for Portugal, which
532 does not exhibit Ala370. The frequency of the variants, however, shows marked population
533 differentiation, perhaps indicating geographical variation in selection intensity by DDT or
534 other insecticides.
535
24
536 Table 1: Population frequencies of CYP6W1 codon 370 tri-alleles
Allele Frequency Data Sample Continent Population Source Size VAL ALA GLY
Pool et al. Africa Combined 139 0.9 0.07 0.3 2012 Reinhardt Queensland 17 0.53 0.03 0.46 et al. 2014 Australia Reinhardt Tasmania 15 1 0 0 et al. 2014 Haddrill and France - A 50 0.78 0.02 0.2 Bergman 2012 Europe Pool et al. France - B 8 0.5 0 0.5 2012 Martins et Portugal 12 0.95 0 0.05 al. 2014 Reinhardt Maine 16 0.69 0.25 0.06 et al. 2014 North North Mackay et 162 0.54 0.32 0.14 America Carolina al. 2012 Reinhardt Florida 16 0.63 0.31 0.06 et al. 2014 537
538
539 DISCUSSION
540 The Drosophila genetic reference panel (DGRP) is a powerful resource for identifying natural
541 variation contributing to a range of phenotypes (Mackay and Huang 2017). However, it is of
542 particular interest to the study of insecticide resistance due to the important role insecticides
543 have apparently played in the evolutionary history of this population; two of the strongest
544 peaks of selection in the DGRP, genome wide, are Ace and Cyp6g1 (Garud et al. 2015).
545
25
546 Ace is the molecular target of organophosphate and carbamate insecticides, and within the
547 DGRP segregate three of the four substitutions in Ace known to confer resistance to these
548 insecticide classes in D. melanogaster (Menozzi et al. 2004, Battlay et al. 2016). Cyp6g1 has
549 likewise been demonstrated to provide resistance to insecticides, including the neonicotinoids
550 imidacloprid and nitenpyram (Daborn et al. 2001, Daborn et al. 2007, Joussen et al. 2008)
551 and the organophosphate azinphos-methyl (Battlay et al. 2016). There is also evidence that
552 Cyp6g1 may contribute to resistance in other organophosphates including malathion,
553 parathion, and diazinon (Kikkawa 1961, Pyke 2004). But the best studied of Cyp6g1’s
554 insecticide substrates remains DDT (Daborn et al. 2002, Daborn et al. 2007, Joussen et al.
555 2008, Schmidt et al. 2010); derived alleles of Cyp6g1 have previously been shown to
556 contribute significantly to DDT resistance, with the Cyp6g1-BP allele explaining ~16% of the
557 total variance in DDT resistance in a field population from Queensland, Australia (Schmidt et
558 al. 2010).
559
560 While DDT is not used in much of the world today, DDT resistance is a trait of interest for
561 two important reasons: Firstly, it may help to explain the strong and recent sweeps at the
562 Cyp6g1 locus (Schmidt et al. 2010, Kolaczkowski et al. 2011, Garud et al. 2015), and
563 secondly it was the subject of the classic work by Crow (1956), Sokal and Hunter (1954) and
564 others into microevolution. We performed GWAS of two DDT resistance phenotypes in
565 hopes of dissecting the quantitative nature of a selective response to DDT exposure.
566
567 Cyp6g1 was not identified in the GWAS: The association of Cyp6g1 allelic variation with
568 either the 4hrkd or 24hrm DDT phenotypes is not significant, even at the genome-wide
569 significance threshold. This is partly attributable to the Cyp6g1 sweep itself; resistant
570 Cyp6g1-AA and Cyp6g1-BA alleles are present in all but nine DGRP lines (all nine were
26
571 phenotyped in this study), while the Cyp6g1-BP allele, which confers the most drastic
572 increase in DDT resistance (Schmidt et al. 2010), is absent from the population. Although
573 much of the power to detect the effect of Cyp6g1 resistance is gone from the DGRP, Battlay
574 et al. (2016) have previously shown that a haplotype in linkage disequilibrium with the
575 ancestral Cyp6g1-M allele was strongly associated with susceptibility to a low dose of the
576 organophosphate insecticide azinphos-methyl in DGRP larvae. One reason Cyp6g1 was
577 associated with the azinphos-methyl phenotype and not our DDT phenotypes could be the
578 relative contributions of Cyp6g1 overexpression to these traits: Daborn et al. (2007)
579 demonstrated a 3-fold increase in adult DDT LD50 when Cyp6g1 was overexpressed using the
580 GAL4-UAS system, while Battlay et al. (2016) observed a 6.5-fold increase in larval
581 azinphos-methyl LD50 in the same crosses. It is also possible that the dose of the DDT used in
582 this study did not maximize the chances of recovering the Cyp6g1 variation present in the
583 DGRP. Steele et al. (2014) also reported an absence of Cyp6g1 signal in their comparison of
584 91-R and 91-C genome sequences. This is due, however, to Steele et al. (2014) limiting their
585 study to analysis of open reading frames. We revisited genome sequence alignments
586 generated by Steele et al. (2014), and found that evidence of the Accord LTR insertion into
587 the 5’ UTR of Cyp6g1 (Daborn et al. 2002) as well as copy number variation in both Cyp6g1
588 and Cyp6g2 (Schmidt et al. 2010), both identifiers of Cyp6g1 resistance alleles, are present in
589 91-R but not 91-C.
590
591 The two detoxification enzymes implicated in DDT mortality: Cyp6g1 and Cyp6w1 could
592 interact in in parallel or in series. Epistatic interactions between alleles at the two loci were
593 examined but unfortunately, we did not have the power to test for such interaction in the
594 DGRP because there is only one DGRP line (Ral-486) that is homozygous for the susceptible
595 Cyp6g1-M allele and homozygous for the resistant Cyp6w1-Ala allele. We have however
27
596 assessed pairwise epistatic interactions of the GWAS nominally associated variants listed in
597 the combined mortality and knockdown lists, and none were significantly extreme to pass the
598 Bonferroni correction threshold (~0.000015) for either the knockdown (lowest value: 0.0002)
599 or the mortality (lowest value: 0.0015) traits.
600
601 Variation at CG10737 contributes to four-hour knockdown but not 24-hour mortality:
602 The GWAS presented here demonstrate that the two DDT-associated traits under
603 examination have distinct genetic architectures, and for each trait we have generated
604 transgenic evidence to support the involvement of a top candidate in DDT phenotype
605 variation. CG10737 has previously been implicated as a DDT resistance candidate, being one
606 of 158 genes shown to be significantly differentially regulated between DDT-resistant 91-R
607 and a standard susceptible lab stock (Canton-S; Pedra et al. 2004). Moreover, inspection of
608 the 91-R and 91-C genome sequences generated by Steele et al. (2014) shows that both
609 CG10737 GWAS SNPs, associated with increased knockdown susceptibility, are present in
610 91-C but absent in 91-R. Using a completely independent approach we have found that these
611 non-coding variants in CG10737, present in ~10% of the DGRP, are strongly associated with
612 4hrkd. Our RNAi knockdown of CG10737 increased DDT resistance, consistent with Pedra
613 et al.’s (2004) observation that CG10737 transcript was less abundant in the resistant 91-R
614 line. We note that the significantly associated GWAS variants also lie in a region that
615 modENCODE found to bind the Trithorax-like (Trl) GAGA factor, suggesting an untested
616 mechanism linking the naturally occurring polymorphisms to transcriptional abundance.
617
618 An important insight gained through these experiments is that they demonstrate a role of
619 CG10737 in the muscles of flies, and a relationship between DDT and muscle biology.
620 CG10737 knockdown in the muscles resulted in a two to four-fold increase in resistance,
28
621 demonstrating that DDT perturbation can be ameliorated by genetic manipulation limited to
622 the muscles.
623
624 The molecular function of CG10737 has not been well characterized. It is a gene with 14
625 coding exons and produces at least 12 alternatively spliced mRNA isoforms. It putatively
626 encodes a transmembrane protein with C1 and C2 calcium/lipid binding domain and it has
627 been annotated as ‘protein kinase c-like’ although it lacks a catalytic domain. C1 domains
628 have cysteine motifs that form zinc fingers and typically bind diacylglycerol (Colón-
629 González and Kazanietz 2006) while C2 domains typically bind calcium ions and are
630 associated with protein-protein and protein-membrane interactions (Ochoa 2001). Given the
631 important role that calcium plays in synapse and muscle signaling perhaps a decrease in the
632 expression level of CG10737 makes flies less sensitive to Ca2+ ion signaling. It thereby may
633 reduce the expression of the DDT-knockdown phenotype by mediating the effect of over-
634 stimulation of muscle cells caused by the uncontrolled firing of nerves due to DDT’s effect
635 on the voltage gated sodium channel, PARA, rather than influencing nerve functionality
636 directly.
637
638 While CG10737 is associated with DDT-knockdown in the GWAS it was not significantly
639 associated with 24-hour mortality GWAS. Furthermore, a re-examination of the DDT-
640 knockdown phenotype for DGRP lines over the initial time course of the experiment found
641 that variants in this gene were only strongly associated with knockdown over a narrow time
642 window (3-5 hours), suggesting that the gene’s role in knockdown resistance is transient.
643 Additionally, in the RNAi experiments the dose that effectively interrogated the temporally
644 defined knockdown was so high that the flies did not live long enough to accurately score 24-
645 hour mortality. This is attributable to genetic background differences between the lines going
29
646 into the RNAi crosses and the DGRP lines. These issues highlight the differences in genetic
647 architecture of subtly different DDT-related traits. While the knockdown trait may give us
648 insight into the perturbations of DDT onto insect physiology, and the function of previously
649 uncharacterized genes, one has to ask how relevant this variation is to the field exposure to
650 DDT. We were unable to detect any signature of selection around variants in this gene, or
651 indeed any high frequency derived knockdown-associated variants, that would indicate
652 microevolutionary change in the recent history of the Raleigh population of D. melanogaster
653 – the source of the DGRP. However, this may reflect an issue of power to detect subtle
654 changes in frequency of pre-existing variants rather than an irrelevance of the phenotype to
655 survivorship in the field. We also note that in early DDT selection experiments in D.
656 melanogaster by Hunter and Sokal (1954) pupation height was correlated with DDT
657 resistance and that Riedl et al. (2007) mapped a pupal height QTL to a narrow region
658 encompassing CG10737.
659
660 Triallelic variation at Cyp6w1 separates DDT-resistant allele from a selective sweep
661 signal: Like CG10737, Cyp6w1 was previously identified as a DDT resistance candidate by
662 Pedra et al. (2004), who found it had high transcriptional abundance in 91-R relative to the
663 susceptible Canton-S. Again, inspection of the Steele et al. (2014) genomes revealed that 91-
664 R also carries the DDT resistance-associated Cyp6w1_ALA allele, whereas 91-C carries the
665 Cyp6w1_GLY allele. While our transgenic manipulation of this gene shows that increased
666 expression of three different alleles of Cyp6w1 can indeed lead to increased resistance to
667 DDT, it is amino acid variants that appear to explain the difference in resistance between
668 DGRP lines. The associated variant is a rare triallelic site where each state encodes a
669 different amino acid (Valine GTG, Alanine GCG or Glycine GGG). No sites are in linkage
670 disequilibrium with it that are also significantly associated with 24-hour mortality, and the
30
671 sharp and immediate decrease in Extended Haplotype Homozygosity for both the Val370 and
672 Ala370 variants makes it highly unlikely that this variant is a proxy for an undetected, linked
673 regulatory change polymorphism. Amino acid site 370 of CYP6W1 is predicted to occur in
674 Substrate Recognition Domain SRS5 (Gotoh 1992, Zawaira et al. 2011), of an enzyme in a
675 gene family frequently associated with insecticide resistance. It is therefore completely
676 credible that it acts to metabolize DDT, perhaps in a similar way to its paralog CYP6G1
677 (Joußen et al. 2008, Hoi et al. 2014).
678
679 Inspection of CYP6W1 site 370 in related species suggest that Val370 is the ancestral state of
680 this site; the Ala370 and Gly370 substitutions are therefore both derived. The Ala370 variant
681 is strongly associated (p=5×10-10) with reduced 24-hour morality in our GWAS, and this is
682 supported by our transgenic work, which found overexpression of Cyp6w1_ALA increased
683 DDT resistance relative to Cyp6w1_VAL and Cyp6w1_GLY. Calculation of iHS and nSL did
684 identify a significant signal of selection at Cyp6w1_ALA, but it was associated with the other
685 derived allele, Gly370. This allele is not significantly associated with 24-hour mortality in
686 our GWAS, and transgenic overexpression of Cyp6w1_GLY does confer resistance to DDT
687 relative to Cyp6w1_VAL. There is no evidence, therefore, that DDT is responsible for the
688 selective footprint identified at Cyp6w1, and we propose that the Gly370 mutation increases
689 fitness under a different selective pressure, possibly by increasing the affinity of Cyp6w1 to
690 another insecticide. The extreme population differentiation observed (TABLE 1) is also
691 consistent with multiple selective agents acting on this site.
692
693 GWAS-associated variants in 91-R and 91-C: Interestingly, resistance-associated variants
694 in CG10737 and Cyp6w1 from our GWAS were present in 91-R but not 91-C. To see if any
695 of our other DGRP resistance candidates were differentiated between 91-R and 91-C, we
31
696 manually searched the genome sequences of these lines generated by Steele et al. (2014) for
697 each of our GWAS candidates (p<1×10-5; file S2). 50 distinct GWAS candidate variants
698 varied between 91-R and 91-C. Of these, 17 (including the two SNPs in CG10737 and the
699 Cyp6w1 triallele) varied in the expected direction, with the ‘resistant’ allele in 91-R and the
700 ‘susceptible’ allele in 91-C (table 2). Thus, these variants may have been segregating in the
701 population used to found 91R and 91C and potentially contributed to the strong selection
702 response observed in 91R.
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
32
721 Table 2: GWAS resistance variants enriched in 91R but not 91C
Phenotype Pipeline Location [V6] p-value Gene Site class 91-C 91-R
non- 24hrm PLINK 2R:6,174,944 1.72E-09 Cyp6w1 T C synonymous 4.28E- DGRP2, 4hrkd 2R:19,169,429 07, CG10737 intron T G PLINK 7.58E-07
4hrkd DGRP2 2L:14,188,350 1.64E-06 smi35A intron G G/C
4hrkd PLINK 3L:4,059,748 1.75E-06 - intergenic T C
4hrkd PLINK 2L:5,600,056 1.97E-06 - intergenic C A/C
4hrkd PLINK 3L:12,404,415 3.00E-06 CG32103 synonymous C C/T
4hrkd DGRP2 3R:30,968,896 3.70E-06 - intergenic A ATA
5.45E- DGRP2, 4hrkd 2R:19,169,399 06; CG10737 intron T C PLINK 7.62E-06
24hrm PLINK 2L:10,546,377 5.68E-06 Trim9 intron A A/G
4hrkd DGRP2 3L:7,102,962 6.79E-06 form3 intron C C/T
4hrkd PLINK 3L:6,935,976 7.88E-06 - intergenic A A/G
24hrm PLINK 3L:15,854,279 8.76E-06 pHCl intron G A
4hrkd DGRP2 2L:8,232,517 9.49E-06 Pvr intron C/A C
4hrkd DGRP2 2L:4,086,206 9.57E-06 ed intron T G
4hrkd DGRP2 2L:19,231,491 1.66E-05 - intergenic A G
2.02E- DGRP2, 4hrkd 3L:7,495,781 05; Cyp4d8 intron G T PLINK 7.38E-06
4hrkd DGRP2 3R:25,716,964 2.15E-05 LpR2 intron A C
722
723
33
724 Other DDT resistance genes: Three substitutions in Cyp6a2 increasing its DDT
725 metabolism capacity have been described in RalDDTR (Amichot et al. 2004), a lab strain
726 selected for DDT resistance over 50 generations (Cùany et al. 1990). The first two
727 substitutions (R335S and L336V) are absent from the DGRP, as well as population samples
728 from Africa (DPGP; Pool et al. 2012), France and Portugal in Europe, Florida and Maine in
729 the USA, and Queensland and Tasmania in Australia (Bergman and Haddrill 2015; Pool et al.
730 2012; Martins et al. 2014; Reinhardt et al 2014). The third substitution, V476L, is found in
731 around a quarter of DGRP and DPGP sequences (24.1% of DGRP, 26.5% of screened
732 DGRP, 32.2% of DPGP), but this variant does not affect DDT metabolism by itself (Amichot
733 et al. 2004), and was not associated with either 4hrkd or 24hrm at p<1×10-5. Similarly, the
734 resistance-linked Cyp6a2 promoter variant described by Wan et al. (2013) is present in the
735 DGRP (68.8% of DGRP, 68.6% of screened DGRP), but was not associated with either of
736 our phenotypes at p<1×10-5.
737
738 Another P450, Cyp12d1, is overexpressed in the DDT-resistant Wisc-1 strain relative to
739 Canton-S (Pedra et al. 2004), and has been shown to increase DDT resistance when
740 overexpressed transgenically (Daborn et al. 2007), as well as reducing resistance when
741 knocked down with RNAi (Gellatly et al. 2015). We did not find associations with Cyp12d1
742 in any of our four GWAS, nor did we find strong correlations between our phenotypes and
743 Cyp12d1 copy number variation or transcription level (file S3). Steele et al. (2014) likewise
744 found no differentiation in Cyp12d1 coding regions between 91-R and 91-C, and our
745 inspection of these genomes found no differentiation in copy number between the two strains.
746
747 Cyp12d1 has been shown to be inducible by DDT (Brandt et al. 2002, Festucci- Buselli et al.
748 2005, Willoughby et al. 2006), as well as other xenobiotics (Willoughby et al. 2006, Misra et
34
749 al. 2011). Our results do not rule out the involvement of this gene in DDT resistance in the
750 DGRP, but show that genomic variation at Cyp12d1, and its transcription level without
751 induction, do not correlate with the DDT-resistance phenotypes measured in this study.
752
753 In the case of para, neither the kdr nor super-kdr resistance mutations are present in the
754 DGRP, nor any of the analogous sites conferring resistance in mutagenized lines
755 characterized by Pittendrigh et al. (1997) or Lindsay et al. (2008).
756
757
758 CONCLUSIONS
759 There are three main lines of evidence used in the literature to associate genes with DDT
760 resistance in D. melanogaster: Differentiation between resistant and susceptible laboratory
761 strains, association of natural alleles with resistance in outbred populations, and transgenic
762 manipulation in a controlled genetic background. Until now, only Cyp6g1 has bridged the
763 divides between these approaches. With this study, Cyp6w1 and CG10737 can be added to
764 this list. These results alone support a polygenic model and are reinforced by the possibility
765 that other genes from our GWAS, or other studies, also contribute to DDT resistance but have
766 yet to be verified. We have focused on two specific DDT-related traits; four-hour knockdown
767 and 24-hour mortality, both in adult females with a single dose of DDT, and they give only
768 partially correlated results. Which of these has greater relevance in the field, however, is not
769 clear. Furthermore, there are many alternate parameter values and indeed parameters (eg.
770 temperature; Fournier-Level et al. 2016) that may make lab-based DDT assays more
771 accurately reflect the variation most relevant to field survivorship. Perhaps one of the most
772 compelling diagnostics for a relevant assay would be if the loci uncovered showed the
773 molecular signs of selection. Previous studies have revealed that the major DDT-resistance
35
774 locus of some populations, Cyp6g1, shows the hallmarks of a selective sweep. We did not,
775 however, observe signs of selection around our DGRP GWAS candidates. While we did
776 detect signs of a sweep at Cyp6w1 it was associated with a second derived variant, not with
777 the variant associated with DDT resistance. Given that Cyp6g1 is also capable of providing
778 resistance to other insecticides, the failure to find sweeps at DDT-specific resistance loci
779 strengthens the possibility that other insecticides have contributed to driving the Cyp6g1
780 selective sweeps in field populations of Drosophila melanogaster. In the case of DDT
781 resistance alleles that negatively impact fitness, relaxed DDT selection since the insecticide’s
782 deregulation may explain their absence from the population. Finally, it is also a possibility
783 that statistics such as iHS and nSL are insensitive to the slight changes in allele frequency that
784 might occur if a trait is largely polygenic.
785
786
787 ACKNOWLEDGEMENTS
788 This work was supported by ARC Discovery Grant DP0985013. Jason Somers contributed to
789 the cloning and transgenic delivery of the Cyp6w1 constructs. We thank Trudy Mackay for
790 her advice and support for this work, and the anonymous reviewers for suggestions that
791 improved this manuscript.
792
36
793 LITERATURE CITED
794 Allen, H. L., Estrada, K., Lettre, G., Berndt, S. I., Weedon, M. N., Rivadeneira, F., ... &
795 Ferreira, T. (2010). Hundreds of variants clustered in genomic loci and biological pathways
796 affect human height. Nature, 467(7317), 832-838.
797
798 Amichot, M., Tares, S., Brun‐Barale, A., Arthaud, L., Bride, J. M., & Bergé, J. B. (2004).
799 Point mutations associated with insecticide resistance in the Drosophila cytochrome P450
800 Cyp6a2 enable DDT metabolism. European Journal of Biochemistry, 271(7), 1250-1257.
801
802 Attrill, H., Falls, K., Goodman, J. L., Millburn, G. H., Antonazzo, G., Rey, A. J., ... &
803 FlyBase Consortium. (2015). FlyBase: establishing a Gene Group resource for Drosophila
804 melanogaster. Nucleic acids research, gkv1046.
805
806 Battlay, P., Schmidt, J. M., Fournier-Level, A., & Robin, C. (2016). Genomic and
807 Transcriptomic Associations Identify a New Insecticide Resistance Phenotype for the
808 Selective Sweep at the Cyp6g1 Locus of Drosophila melanogaster. G3: Genes| Genomes|
809 Genetics, 6(8), 2573-2581.
810
811 Baucom, R. S. (2016). The remarkable repeated evolution of herbicide resistance. American
812 journal of botany, 103(2), 181-183.
813
814 Bergman, C. M., & Haddrill, P. R. (2015). Strain-specific and pooled genome sequences for
815 populations of Drosophila melanogaster from three continents. F1000Research, 4.
816
817 Bischof, J., Maeda, R. K., Hediger, M., Karch, F., & Basler, K. (2007). An optimized
37
818 transgenesis system for Drosophila using germ-line-specific φC31 integrases. Proceedings of
819 the National Academy of Sciences, 104(9), 3312-3317.
820
821 Bonn, B. Myosin heavy chain like (Mhcl) agiert während der Embryonalentwicklung und
822 Myogenese von Drosophila melanogaster in Redundanz zu Zipper, die Funktion des C2-
823 Domänen-Proteins CG10737-P während der Muskelentwicklung bleibt unklar. PhD Thesis,
824 Marburg: Philipps-Universität, 2010
825
826 Brandt, A., Scharf, M., Pedra, J. H. F., Holmes, G., Dean, A., Kreitman, M., & Pittendrigh,
827 B. R. (2002). Differential expression and induction of two Drosophila cytochrome P450
828 genes near the Rst (2) DDT locus. Insect molecular biology, 11(4), 337-341.
829
830 Browning, S. R., & Browning, B. L. (2007). Rapid and accurate haplotype phasing and
831 missing-data inference for whole-genome association studies by use of localized haplotype
832 clustering. The American Journal of Human Genetics, 81(5), 1084-1097.
833
834 Catania, F., Kauer, M. O., Daborn, P. J., Yen, J. L., fFrench‐Constant, R. H., & Schlötterer,
835 C. (2004). World‐wide survey of an Accord insertion and its association with DDT resistance
836 in Drosophila melanogaster. Molecular ecology, 13(8), 2491-2504.
837
838 Chang, C. C., Chow, C. C., Tellier, L. C., Vattikuti, S., Purcell, S. M., & Lee, J. J. (2015).
839 Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience,
840 4(1), 7.
841
38
842 Chintapalli, V. R., Wang, J., & Dow, J. A. (2007). Using FlyAtlas to identify better
843 Drosophila melanogaster models of human disease. Nature genetics, 39(6), 715-720.
844
845 Chung, H., Bogwitz, M. R., McCart, C., Andrianopoulos, A., Batterham, P., & Daborn, P. J.
846 (2007). Cis-regulatory elements in the Accord retrotransposon result in tissue-specific
847 expression of the Drosophila melanogaster insecticide resistance gene Cyp6g1. Genetics,
848 175(3), 1071-1077.
849
850 Clowers, K. J., Lyman, R. F., Mackay, T. F., & Morgan, T. J. (2010). Genetic variation in
851 senescence marker protein-30 is associated with natural variation in cold tolerance in
852 Drosophila. Genetics research, 92(02), 103-113.
853
854 Colón-González, F., & Kazanietz, M. G. (2006). C1 domains exposed: from diacylglycerol
855 binding to protein–protein interactions. Biochimica et Biophysica Acta (BBA)-Molecular and
856 Cell Biology of Lipids, 1761(8), 827-837.
857
858 Comeron, J. M., Ratnappan, R., & Bailin, S. (2012). The many landscapes of recombination
859 in Drosophila melanogaster. PLoS genetics, 8(10), e1002905.
860
861 Crow, J. F. (1954). Analysis of a DDT-resistant strain of Drosophila. Journal of Economic
862 Entomology, 47(3), 393-398.
863
864 Crow, J. F. (1956). Genetics of DDT resistance in Drosophila. In Cytologia Proc. Int. Genet.
865 Symp (pp. 408-409).
866
39
867 Crow, J. F. (1957). Genetics of insect resistance to chemicals. Annual review of entomology,
868 2(1), 227-246.
869
870 Cùany, A., Pralavorio, M., Pauron, D., Berge, J. B., Fournier, D., Blais, C., ... & Lange, R.
871 (1990). Characterization of microsomal oxidative activities in a wild-type and in a DDT
872 resistant strain of Drosophila melanogaster. Pesticide Biochemistry and Physiology, 37(3),
873 293-302.
874
875 Cunha, P. M., Sandmann, T., Gustafson, E. H., Ciglar, L., Eichenlaub, M. P., & Furlong, E.
876 E. (2010). Combinatorial binding leads to diverse regulatory responses: Lmd is a tissue-
877 specific modulator of Mef2 activity. PLoS Genet, 6(7), e1001014.
878
879 Daborn, P., Boundy, S., Yen, J., & Pittendrigh, B. (2001). DDT resistance in Drosophila
880 correlates with Cyp6g1 over-expression and confers cross-resistance to the neonicotinoid
881 imidacloprid. Molecular Genetics and Genomics, 266(4), 556-563.
882
883 Daborn, P. J., Yen, J. L., Bogwitz, M. R., Le Goff, G., Feil, E., Jeffers, S., ... & Feyereisen,
884 R. (2002). A single P450 allele associated with insecticide resistance in Drosophila. Science,
885 297(5590), 2253-2256.
886
887 Daborn, P. J., Lumb, C., Boey, A., Wong, W., & Batterham, P. (2007). Evaluating the
888 insecticide resistance potential of eight Drosophila melanogaster cytochrome P450 genes by
889 transgenic over-expression. Insect biochemistry and molecular biology, 37(5), 512-519.
890
40
891 Dapkus, D., & Merrell, D. J. (1977). Chromosomal analysis of DDT-resistance in a long-term
892 selected population of Drosophila melanogaster. Genetics, 87(4), 685-697.
893
894 Dapkus, D. (1992). Genetic localization of DDT resistance in Drosophila melanogaster
895 (Diptera: Drosophilidae). Journal of economic entomology, 85(2), 340-347.
896
897 Ferrer-Admetlla et al. 2014. On detecting incomplete soft or hard selective sweeps using
898 haplotype structure. Mol. Biol. Evol. doi: 10.1093/molbev/msu077.
899
900 Festucci‐Buselli, R. A., Carvalho‐Dias, A. S., Oliveira‐Andrade, D., Caixeta‐Nunes, C., Li,
901 H. M., Stuart, J. J., ... & Pittendrigh, B. R. (2005). Expression of Cyp6g1 and Cyp12d1 in
902 DDT resistant and susceptible strains of Drosophila melanogaster. Insect molecular biology,
903 14(1), 69-77.
904
905 fFrench-Constant, R. H. (2013). The molecular genetics of insecticide resistance. Genetics,
906 194(4), 807.
907
908 Fournier-Level, A., Neumann-Mondlak, A., Good, R. T., Green, L. M., Schmidt, J. M., &
909 Robin, C. (2016). Behavioural response to combined insecticide and temperature stress in
910 natural populations of Drosophila melanogaster. Journal of evolutionary biology, 29(5),
911 1030-1044.
912
913 Garud, N. R., Messer, P. W., Buzbas, E. O., & Petrov, D. A. (2015). Recent selective sweeps
914 in North American Drosophila melanogaster show signatures of soft sweeps. PLoS Genet,
915 11(2), e1005004.
41
916
917 Gellatly, K. J., Yoon, K. S., Doherty, J. J., Sun, W., Pittendrigh, B. R., & Clark, J. M. (2015).
918 RNAi validation of resistance genes and their interactions in the highly DDT-resistant 91-R
919 strain of Drosophila melanogaster. Pesticide biochemistry and physiology, 121, 107-115.
920
921 Gotoh, O. (1992). Substrate recognition sites in cytochrome P450 family 2 (CYP2) proteins
922 inferred from comparative analyses of amino acid and coding nucleotide sequences. Journal
923 of Biological Chemistry, 267(1), 83-90.
924
925 Hoi, K. K., Daborn, P. J., Battlay, P., Robin, C., Batterham, P., O’Hair, R. A., & Donald, W.
926 A. (2014). Dissecting the insect metabolic machinery using twin ion mass spectrometry: a
927 single P450 enzyme metabolizing the insecticide imidacloprid in vivo. Analytical chemistry,
928 86(7), 3525-3532.
929
930 Huang, W., Massouras, A., Inoue, Y., Peiffer, J., Ràmia, M., Tarone, A. M., ... & Magwire,
931 M. M. (2014). Natural variation in genome architecture among 205 Drosophila melanogaster
932 Genetic Reference Panel lines. Genome research, 24(7), 1193-1208.
933
934 Huang, W., Carbone, M. A., Magwire, M. M., Peiffer, J. A., Lyman, R. F., Stone, E. A., ... &
935 Mackay, T. F. (2015). Genetic basis of transcriptome diversity in Drosophila melanogaster.
936 Proceedings of the National Academy of Sciences, 112(44), E6010-E6019.
937
938 Joußen, N., Heckel, D. G., Haas, M., Schuphan, I., & Schmidt, B. (2008). Metabolism of
939 imidacloprid and DDT by P450 CYP6G1 expressed in cell cultures of Nicotiana tabacum
42
940 suggests detoxification of these insecticides in Cyp6g1‐overexpressing strains of Drosophila
941 melanogaster, leading to resistance. Pest management science, 64(1), 65-73.
942
943 Kikkawa, H. (1961). Genetical studies on the resistance to parathion in Drosophila
944 melanogaster. Annu Rep Sci Wks Osaka Univ, 9, 1-20.
945
946 King, J. C., & Sømme, L. (1958). Chromosomal analyses of the genetic factors for resistance
947 to DDT in two resistant lines of Drosophila melanogaster. Genetics, 43(3), 577.
948
949 Kolaczkowski, B., Kern, A. D., Holloway, A. K., & Begun, D. J. (2011). Genomic
950 differentiation between temperate and tropical Australian populations of Drosophila
951 melanogaster. Genetics, 187(1), 245-260.
952
953 Kuruganti, S., Lam, V., Zhou, X., Bennett, G., Pittendrigh, B. R., & Ganguly, R. (2007).
954 High expression of Cyp6g1, a cytochrome P450 gene, does not necessarily confer DDT
955 resistance in Drosophila melanogaster. Gene, 388(1), 43-53.
956
957 Lack, J. B., Cardeno, C. M., Crepeau, M. W., Taylor, W., Corbett-Detig, R. B., Stevens, K.
958 A., ... & Pool, J. E. (2015). The Drosophila genome nexus: a population genomic resource of
959 623 Drosophila melanogaster genomes, including 197 from a single ancestral range
960 population. Genetics, 199(4), 1229-1241.
961
962 Lande, R. (1983). The response to selection on major and minor mutations affecting a
963 metrical trait. Heredity, 50(1), 47-65.
964
43
965 Leinonen, R., Sugawara, H., & Shumway, M. (2010). The sequence read archive. Nucleic
966 acids research, 39(suppl_1), D19-D21.
967
968 Lindsay, H. A., Baines, R., Lilley, K., Jacobs, H. T., & O'Dell, K. M. (2008). The dominant
969 cold-sensitive Out-cold mutants of Drosophila melanogaster have novel missense mutations
970 in the voltage-gated sodium channel gene paralytic. Genetics, 180(2), 873-884.
971
972 Low, W. Y., Feil, S. C., Ng, H. L., Gorman, M. A., Morton, C. J., Pyke, J., ... & Gooley, P.
973 R. (2010). Recognition and detoxification of the insecticide DDT by Drosophila
974 melanogaster glutathione S-transferase D1. Journal of molecular biology, 399(3), 358-366.
975
976 Mackay, T. F., Richards, S., Stone, E. A., Barbadilla, A., Ayroles, J. F., Zhu, D., ... &
977 Richardson, M. F. (2012). The Drosophila melanogaster genetic reference panel. Nature,
978 482(7384), 173-178.
979 Mackay, T.F.C. and W. Huang (2017). Charting the genotype-phenotype map:lessons from
980 the Drosophila melanogaster Genetic Reference Panel. WIREs Dev. Biol. e289 doi:
981 10.1002/wdev.289.
982
983 Macnair, M. R. (1991). Why the evolution of resistance to anthropogenic toxins normally
984 involves major gene changes: the limits to natural selection. Genetica, 84(3), 213-219.
985
986 Martins, N. E., Faria, V. G., Nolte, V., Schlötterer, C., Teixeira, L., Sucena, É., & Magalhães,
987 S. (2014). Host adaptation to viruses relies on few genes with different cross-resistance
988 properties. Proceedings of the National Academy of Sciences, 111(16), 5938-5943.
989
44
990 McKenzie, J. A., & Batterham, P. (1994). The genetic, molecular and phenotypic
991 consequences of selection for insecticide resistance. Trends in Ecology & Evolution, 9(5),
992 166-169.
993
994 McKenzie, J. A. (1996). Ecological and evolutionary aspects of insecticide resistance. RG
995 Landes.
996
997 McKenzie, J. A. (2000). The character or the variation: the genetic analysis of the insecticide-
998 resistance phenotype. Bulletin of entomological research, 90(01), 3-7.
999
1000 Menozzi, P., Shi, M. A., Lougarre, A., Tang, Z. H., & Fournier, D. (2004). Mutations of
1001 acetylcholinesterase which confer insecticide resistance in Drosophila melanogaster
1002 populations. BMC evolutionary biology, 4(1), 1.
1003
1004 Misra, J. R., Horner, M. A., Lam, G., & Thummel, C. S. (2011). Transcriptional regulation of
1005 xenobiotic detoxification in Drosophila. Genes & development, 25(17), 1796-1806.
1006
1007 Mitchell, A., Chang, H. Y., Daugherty, L., Fraser, M., Hunter, S., Lopez, R., ... & Sangrador-
1008 Vegas, A. (2014). The InterPro protein families database: the classification resource after 15
1009 years. Nucleic acids research, gku1243.
1010
1011 Mueller, J. C., & Andreoli, C. (2004). Plotting haplotype-specific linkage disequilibrium
1012 patterns by extended haplotype homozygosity. Bioinformatics, 20(5), 786-787.
1013
45
1014 Ochoa, W. F., Garcia-Garcia, J., Fita, I., Corbalan-Garcia, S., Verdaguer, N., & Gomez-
1015 Fernandez, J. C. (2001). Structure of the C2 domain from novel protein kinase Cϵ. A
1016 membrane binding model for Ca 2+-independent C2 domains. Journal of molecular biology,
1017 311(4), 837-849.
1018
1019 Pedra, J. H. F., McIntyre, L. M., Scharf, M. E., & Pittendrigh, B. R. (2004). Genome-wide
1020 transcription profile of field-and laboratory-selected dichlorodiphenyltrichloroethane (DDT)-
1021 resistant Drosophila. Proceedings of the National Academy of Sciences of the United States of
1022 America, 101(18), 7034-7039.
1023
1024 Pittendrigh, B., Reenan, R., & Ganetzky, B. (1997). Point mutations in the Drosophila
1025 sodium channel gene para associated with resistance to DDT and pyrethroid insecticides.
1026 Molecular and General Genetics MGG, 256(6), 602-610.
1027
1028 Pool, J. E., Corbett-Detig, R. B., Sugino, R. P., Stevens, K. A., Cardeno, C. M., Crepeau, M.
1029 W., ... & Langley, C. H. (2012). Population genomics of sub-Saharan Drosophila
1030 melanogaster: African diversity and non-African admixture. PLoS Genet, 8(12), e1003080.
1031
1032 Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., ... & Sham,
1033 P. C. (2007). PLINK: a tool set for whole-genome association and population-based linkage
1034 analyses. The American Journal of Human Genetics, 81(3), 559-575.
1035
1036 Pyke, F. M., Bogwitz, M. R., Perry, T., Monk, A., Batterham, P., & McKenzie, J. A. (2004).
1037 The genetic basis of resistance to diazinon in natural populations of Drosophila melanogaster.
1038 Genetica, 121(1), 13-24.
46
1039
1040 Reinhardt, J. A., Kolaczkowski, B., Jones, C. D., Begun, D. J., & Kern, A. D. (2014). Parallel
1041 geographic variation in Drosophila melanogaster. Genetics, 197(1), 361-373.
1042
1043 Riedl, C. A., Riedl, M., Mackay, T. F., & Sokolowski, M. B. (2007). Genetic and behavioral
1044 analysis of natural variation in Drosophila melanogaster pupation position. Fly, 1(1), 23-32.
1045
1046 Roush, R. T., & McKenzie, J. A. (1987). Ecological genetics of insecticide and acaricide
1047 resistance. Annual review of entomology, 32(1), 361-380.
1048
1049 Sabeti, P. C., Reich, D. E., Higgins, J. M., Levine, H. Z., Richter, D. J., Schaffner, S. F., ... &
1050 Ackerman, H. C. (2002). Detecting recent positive selection in the human genome from
1051 haplotype structure. Nature, 419(6909), 832-837.
1052
1053 Sakuma, M. (1998). Probit analysis of preference data. Applied entomology and zoology,
1054 33(3), 339-347.
1055
1056 Sandmann, T., Jensen, L. J., Jakobsen, J. S., Karzynski, M. M., Eichenlaub, M. P., Bork, P.,
1057 & Furlong, E. E. (2006). A temporal map of transcription factor activity: mef2 directly
1058 regulates target genes at all stages of muscle development. Developmental cell, 10(6), 797-
1059 807.
1060
1061 Schlenke, T. A., & Begun, D. J. (2004). Strong selective sweep associated with a transposon
1062 insertion in Drosophila simulans. Proceedings of the National Academy of Sciences of the
1063 United States of America, 101(6), 1626-1631.
47
1064
1065 Schmidt, J. M., Good, R. T., Appleton, B., Sherrard, J., Raymant, G. C., Bogwitz, M. R., ... &
1066 Robin, C. (2010). Copy number variation and transposable elements feature in recent,
1067 ongoing adaptation at the Cyp6g1 locus. PLoS Genet, 6(6), e1000998.
1068
1069 Schoustra, S. E., Slakhorst, M., Debets, A. J., & Hoekstra, R. F. (2005). Comparing artificial
1070 and natural selection in rate of adaptation to genetic stress in Aspergillus nidulans. Journal of
1071 evolutionary biology, 18(4), 771-778.
1072
1073 Shepanski, M. C., Glover, J., & Kuhr, R. J. (1977). Resistance of Drosophila melanogaster to
1074 DDT. Journal of economic entomology, 70(5), 539-543.
1075
1076 Sokal, R. R., & Hunter, P. E. (1954). Reciprocal selection for correlated quantitative
1077 characters in Drosophila. Science, 119(3097), 649-651.
1078
1079 Szpiech, Z. A. and Hernandez, R. D. 2014. selscan: an efficient multithreaded program to
1080 perform EHH- based scans for positive selection. Molecular Biology and Evolution, 31:
1081 2824–2827.
1082
1083 Steele, L. D., Muir, W. M., Seong, K. M., Valero, M. C., Rangesa, M., Sun, W., ... &
1084 Pittendrigh, B. R. (2014). Genome-wide sequencing and an open reading frame analysis of
1085 dichlorodiphenyltrichloroethane (DDT) susceptible (91-C) and resistant (91-R) Drosophila
1086 melanogaster laboratory populations. PloS one, 9(6), e98584.
1087
48
1088 Steele, L. D., Coates, B., Valero, M. C., Sun, W., Seong, K. M., Muir, W. M., ... &
1089 Pittendrigh, B. R. (2015). Selective sweep analysis in the genomes of the 91-R and 91-C
1090 Drosophila melanogaster strains reveals few of the ‘usual suspects’ in
1091 dichlorodiphenyltrichloroethane (DDT) resistance. PloS one, 10(3), e0123066.
1092
1093 Strycharz, J. P., Lao, A., Li, H., Qiu, X., Lee, S. H., Sun, W., ... & Clark, J. M. (2013).
1094 Resistance in the highly DDT-resistant 91-R strain of Drosophila melanogaster involves
1095 decreased penetration, increased metabolism, and direct excretion. Pesticide biochemistry
1096 and physiology, 107(2), 207-217.
1097
1098 Tang, A. H., & Tu, C. P. D. (1995). Pentobarbital-induced changes in Drosophila glutathione
1099 S-transferase D21 mRNA stability. Journal of Biological Chemistry, 270(23), 13819-13825.
1100
1101 Taylor, M. F. J., Shen, Y., & Kreitman, M. E. (1995). A population genetic test of selection at
1102 the molecular level. Science, 270(5241), 1497.
1103
1104 Thorvaldsdóttir, H., Robinson, J. T., & Mesirov, J. P. (2013). Integrative Genomics Viewer
1105 (IGV): high-performance genomics data visualization and exploration. Briefings in
1106 bioinformatics, 14(2), 178-192.
1107
1108 Voight, B. F., Kudaravalli, S., Wen, X., & Pritchard, J. K. (2006). A map of recent positive
1109 selection in the human genome. PLoS Biol, 4(3), e72.
1110
49
1111 Wan, H., Liu, Y., Li, M., Zhu, S., Li, X., Pittendrigh, B. R., & Qiu, X. (2014). Nrf2/Maf‐
1112 binding‐site‐containing functional Cyp6a2 allele is associated with DDT resistance in
1113 Drosophila melanogaster. Pest management science, 70(7), 1048-1058.
1114
1115 Wang, K., Li, M., & Hakonarson, H. (2010). ANNOVAR: functional annotation of genetic
1116 variants from high-throughput sequencing data. Nucleic acids research, 38(16), e164-e164.
1117
1118 Weake, V. M., Lee, K. K., Guelman, S., Lin, C. H., Seidel, C., Abmayr, S. M., & Workman,
1119 J. L. (2008). SAGA‐mediated H2B deubiquitination controls the development of neuronal
1120 connectivity in the Drosophila visual system. The EMBO journal, 27(2), 394-405.
1121
1122 Willoughby, L., Chung, H., Lumb, C., Robin, C., Batterham, P., & Daborn, P. J. (2006). A
1123 comparison of Drosophila melanogaster detoxification gene induction responses for six
1124 insecticides, caffeine and phenobarbital. Insect biochemistry and molecular biology, 36(12),
1125 934-942.
1126
1127 Zawaira, A., Ching, L. Y., Coulson, L., Blackburn, J., & Chun Wei, Y. (2011). An expanded,
1128 unified substrate recognition site map for mammalian cytochrome P450s: analysis of
1129 molecular interactions between 15 mammalian CYP450 isoforms and 868 substrates. Current
1130 drug metabolism, 12(7), 684-700.
1131
1132 Zhan, M., Yamaza, H., Sun, Y., Sinclair, J., Li, H., & Zou, S. (2007). Temporal and spatial
1133 transcriptional profiles of aging in Drosophila melanogaster. Genome research, 17(8), 1236-
1134 1243.
1135
50
1136 Zhang, Q., Lambert, G., Liao, D., Kim, H., Robin, K., Tung, C. K., ... & Austin, R. H.
1137 (2011). Acceleration of emergence of bacterial antibiotic resistance in connected
1138 microenvironments. Science, 333(6050), 1764-1767.
1139
1140
1141 FIGURE LEGENDS
1142 Figure 1: DDT Knockdown and Mortality. (A) The mean percentage four-hour knockdown
1143 and 24-hour mortality for each DGRP line (based on at least three replicates of 10 individuals
1144 each) with lines ordered from lowest to highest for each trait. (B) The scatterplot shows the
1145 correlation between 24-hour mortality and four-hour knockdown. Flies that are knocked
1146 down early can recover and exhibit late mortality.
1147
1148 Figure 2: RNAi knockdown confirms that lowering CG10737 transcript abundance increases
1149 DDT resistance, and that this effect can be mediated by manipulating muscle expression.
1150 Flies in which CG10737 is knocked down in muscles using the mef2-GAL4 driver crossed to
1151 the VDRC CG10737-KK line are significantly more resistant to DDT than control flies which
1152 are the result of substituting the relevant genetic background line for either mef2-GAL4
1153 (w1118) or the CG10737-KK line (60100) in the cross.
1154
1155 Figure 3: Toxicology of transgenic lines overexpressing the three allelic forms of Cyp6w1
1156 (far right) and various controls. All three allelic forms of Cyp6w1 confer DDT resistance
1157 relative to controls, and Cyp6w1_ALA confers a further significant increase in DDT
1158 resistance relative to the other two alleles.
1159
51
1160 Figure 4: Extended Haplotype Homozygosity (EHH) plots, (A) around the Cyp6w1 locus
1161 using the triallelic second site of codon 370 as the focal variant, and (B) around the Cyp6g1
1162 locus using transposable element insertion sites as the focal variant.
1163
1164 File S1: PLINK-formatted phenotypes generated in this study. Knockdown and mortality
1165 were measured hourly at 1-24 hours, and at 30 hours.
1166
1167 File S2: Variants associated with 4hrkd and 24hrm phenotypes using DGRP2 and custom
1168 PLINK GWAS pipelines, p<1×10-5. Also includes calls for each variant in 91-R and 91-C
1169 genomes (generated by Steele et al. [2014]).
1170
1171 File S3: Phenotype to transcription level tests performed using data from the DGRP
1172 transcriptome (Huang et al. 2015).
1173
1174 File S4: The power to detect association given the number of lines, the heritability and the
1175 minimum number of replicates per line (Mackay and Huang 2017).
1176
1177
1178
1179
1180
1181
1182
1183
52
1184 Table 1: Population frequencies of CYP6W1 codon 370 tri-alleles
Allele Frequency Data Sample Continent Population Source Size VAL ALA GLY
Pool et al. Africa Combined 139 0.9 0.07 0.3 2012 Reinhardt Queensland 17 0.53 0.03 0.46 et al. 2014 Australia Reinhardt Tasmania 15 1 0 0 et al. 2014 Haddrill and France - A 50 0.78 0.02 0.2 Bergman 2012 Europe Pool et al. France - B 8 0.5 0 0.5 2012 Martins et Portugal 12 0.95 0 0.05 al. 2014 Reinhardt Maine 16 0.69 0.25 0.06 et al. 2014 North North Mackay et 162 0.54 0.32 0.14 America Carolina al. 2012 Reinhardt Florida 16 0.63 0.31 0.06 et al. 2014 1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
53
1195 Table 2: GWAS resistance variants enriched in 91R but not 91C
Phenotype Pipeline Location [V6] p-value Gene Site class 91-C 91-R
non- 24hrm PLINK 2R:6,174,944 1.72E-09 Cyp6w1 T C synonymous 4.28E- DGRP2, 4hrkd 2R:19,169,429 07, CG10737 intron T G PLINK 7.58E-07
4hrkd DGRP2 2L:14,188,350 1.64E-06 smi35A intron G G/C
4hrkd PLINK 3L:4,059,748 1.75E-06 - intergenic T C
4hrkd PLINK 2L:5,600,056 1.97E-06 - intergenic C A/C
4hrkd PLINK 3L:12,404,415 3.00E-06 CG32103 synonymous C C/T
4hrkd DGRP2 3R:30,968,896 3.70E-06 - intergenic A ATA
5.45E- DGRP2, 4hrkd 2R:19,169,399 06; CG10737 intron T C PLINK 7.62E-06
24hrm PLINK 2L:10,546,377 5.68E-06 Trim9 intron A A/G
4hrkd DGRP2 3L:7,102,962 6.79E-06 form3 intron C C/T
4hrkd PLINK 3L:6,935,976 7.88E-06 - intergenic A A/G
24hrm PLINK 3L:15,854,279 8.76E-06 pHCl intron G A
4hrkd DGRP2 2L:8,232,517 9.49E-06 Pvr intron C/A C
4hrkd DGRP2 2L:4,086,206 9.57E-06 ed intron T G
4hrkd DGRP2 2L:19,231,491 1.66E-05 - intergenic A G
2.02E- DGRP2, 4hrkd 3L:7,495,781 05; Cyp4d8 intron G T PLINK 7.38E-06
4hrkd DGRP2 3R:25,716,964 2.15E-05 LpR2 intron A C
1196
54