| INVESTIGATION

Pervasive Strong Selection at the Level of Codon Usage Bias in Drosophila melanogaster

Heather E. Machado,*,1 David S. Lawrie,† and Dmitri A. Petrov‡ *Cancer, Ageing, and Somatic Mutation, Wellcome Sanger Institute, Hinxton CB10 1SA, UK, †Department of and Evolutionary Biology, University of California, Irvine, California 92697-3958, and ‡Department of Biology, , California 94305-5020 ORCID IDs: 0000-0002-1523-3937 (H.E.M.); 0000-0002-6782-9986 (D.S.L.); 0000-0002-3664-9130 (D.A.P.)

ABSTRACT Codon usage bias (CUB), where certain codons are used more frequently than expected by chance, is a ubiquitous phenomenon and occurs across the tree of life. The dominant paradigm is that the proportion of preferred codons is set by weak selection. While experimental changes in codon usage have at times shown large phenotypic effects in contrast to this paradigm, genome-wide population genetic estimates have supported the weak selection model. Here we use deep genomic population sequencing of two Drosophila melanogaster populations to measure selection on synonymous sites in a way that allowed us to estimate the prevalence of both weak and strong purifying selection. We find that selection in favor of preferred codons ranges from weak (|Nes| 1) to strong (|Nes| . 10), with strong selection acting on 10–20% of synonymous sites in preferred codons. While previous studies indicated that selection at synonymous sites could be strong, this is the first study to detect and quantify strong selection specifically at the level of CUB. Further, we find that CUB-associated polymorphism accounts for the majority of strong selection on synonymous sites, with secondary contributions of splicing (selection on alternatively spliced genes, splice junctions, and spliceosome-bound sites) and transcription factor binding. Our findings support a new model of CUB and indicate that the functional importance of CUB, as well as synonymous sites in general, have been underestimated.

KEYWORDS codon usage bias; CUB; selection; splicing; synonymous sites; Drosophila

HE degeneracy of the genetic code leads to protein-coding selection, the and strength of natural selection acting Tmutations that do not affect amino acid composition. to maintain CUB is disputed. Despite this, such synonymous mutations often have conse- The most common explanations for CUB postulate selec- quences for phenotype and fitness. The first evidence of the tion on either the rate or the accuracy with which ribosomes functionality of synonymous sites came from the discovery of translate mRNA to protein (Hershberg and Petrov 2008). The codon usage bias (CUB), where, for a given amino acid, certain existence of selection at synonymous sites at the level of codons are used more frequently in a genome than expected translation is supported by several key observations. First, by chance (Grantham et al. 1981; Ikemura 1981). While the the preference toward particular “preferred” codons is con- consensus in the field is that CUB is often driven by natural sistent across genes within a particular genome suggesting a global, genome-wide process, and not preference for the use of particular codons within specific genes (Grantham et al. 1980; Chen et al. 2004). Second, optimal codons tend to Copyright © 2020 Machado et al. doi: https://doi.org/10.1534/genetics.119.302542 correspond to more abundant tRNAs, suggesting a functional Manuscript received October 9, 2019; accepted for publication December 12, 2019; relationship between translation and CUB (Post et al. 1979; published Early Online December 23, 2019. Available freely online through the author-supported open access option. Ikemura 1981, 1982; Qian et al. 2012). Third, preferred co- This is an open-access article distributed under the terms of the Creative Commons dons are more abundant in highly expressed genes than in Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), the rest of the genome (Gouy and Gautier 1982; Bulmer which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 1991; Novoa and Ribas de Pouplana 2012), consistent with Supplemental material available at figshare: https://doi.org/10.25386/genetics. selection being proportional to mRNA transcript abundance. 11316794. 1Corresponding author: Cancer, Ageing, and Somatic Mutation, Wellcome Sanger Finally, constrained amino acid positions tend to contain pre- Institute, Hinxton CB10 1SA, UK. E-mail: [email protected] ferred codons more frequently, suggesting a link between

Genetics, Vol. 214, 511–528 February 2020 511 CUB and translational accuracy (Escherichia coli: Stoletzki and the test independent of the demographic history of a pop- Eyre-Walker 2007; Drosophila melanogaster: Akashi 1994; ulation. These and similarly constructed tests have been used mammals: Drummond and Wilke 2008). In addition to speed to estimate the strength of selection on CUB in D. mela- and accuracy, there is evidence that other processes are af- nogaster, and have typically failed to find any evidence of fected by codon composition, such as cotranslational folding strong selection on CUB (Singh et al. 2007; Zeng and (Pechmann and Frydman 2013), RNA stability (Presnyak et al. Charlesworth 2009, 2010; Clemente and Vogl 2012a; 2015; Carneiro et al. 2019), and transcription (Carlini and Campos et al. 2013). However, strong purifying selection Stephan 2003; Newman et al. 2016; Zhou et al. 2016). results in an enrichment of very low allele frequency vari- Beyond the level at which selection operates to generate ants, requiring very deep population sequencing to allow for CUB, it is important to consider how strong selection at the detection of its effects in SFS data. In the absence of very synonymous sites is likely to be. This question has been deep and accurate population sequencing, an alternative addressed with population genetics approaches introduced method is to utilize information about the proportion of sites by the seminal papers of Li and Bulmer (Li 1987; Bulmer that are polymorphic (polymorphism level). Since both 1991). The Li–Bulmer model proposes that the observed, strong purifying selection and a decreased mutation rate relative proportion of codons can be explained by the balance can lower the polymorphism level, the selected class of sites of mutation, selection (in favor of preferred codons), and would have to be compared with a neutral reference that is random genetic drift. This model assumes a constant selec- matched for mutation rate and levels of linked selection. tion coefficient per codon or codon preference group, and Thus, the limit of detection for strong selection in the afore- predicts that the strength of selection in favor of preferred mentioned studies, which either ignored polymorphism codons should be on the order of the reciprocal of the effec- level or did not have a control for it, was set by the lowest tive population size (Nes 1). The predicted weak selection allele frequency class in the dataset (set by the number of should be detectable as a slight deviation in the site frequency individuals sampled). spectrum (SFS). Mutations from preferred to unpreferred Intriguingly, a study by Lawrie et al. (2013) that did in- codons should reach comparatively lower frequencies in the corporate polymorphism level and SFS with the use of population than those in the opposite direction. Such devia- matched neutral controls did find evidence of strong purify- tions have, in fact, been observed in many organisms that ing selection in synonymous sites, but was unable to correlate show clear CUB (D. melanogaster: Zeng and Charlesworth this substantial selection with CUB. As the authors did not 2009; Caenorhabditis remanei: Cutter and Charlesworth have enough variants in their study to test the SFS of pre- 2006; E. coli: Sharp et al. 2010), supporting the conclusion ferred sites separately, they could not test for weak selection that selection at synonymous sites is weak but detectable. on CUB or distinguish strong selection from lethal CUB- While weak selection driving CUB may match the intuition associated mutations. that a synonymous change should not have a large phenotypic Here, we test the prevalence of strong purifying selection effect, there is abundant experimental evidence that this is not on CUB in two distinct D. melanogaster populations (the always the case. For example, optimizing the codon compo- DGRP Freeze 2 dataset and a high diversity African popula- sition of the viral protein BPV1 increases the heterologous tion). We accomplish this by comparing the polymorphism translation of the protein in humans by .1000 fold (Zhou level and SFS of fourfold degenerate synonymous sites in et al. 1999). In humans, a change in codon composition in the preferred and unpreferred codons to that of a short intron gene KRAS, from rare to common codons, increases KRas neutral reference. The neutral reference is produced by protein expression and is associated with tumorigenicity matching each fourfold site to a short intron site that is lo- (Lampson et al. 2013). In D. melanogaster, changing a small cated within 1 kb and has the same nucleotide at the position number of preferred codons to unpreferred codons in the of interest and at the 59 and 39 neighboring sites. This creates alcohol dehydrogenase (Adh) gene resulted in substantial a neutral reference that is subject to the same mutation rate changes in gene expression and in ethanol tolerance and environment of linked selection as the fourfold sites. We (Carlini and Stephan 2003; Carlini 2004). The authors find evidence that there is a distribution of selection strengths estimated that such unpreferred codons are subject to on CUB, ranging from weak to strong. Our findings of strong s . 1024—much larger than the Li–Bulmer predictions of selection on CUB directly conflict with previous models of the strength selection strength on synonymous sites. The CUB that predict uniformly weak selection. Further, we find Bulmer 1991 paper acknowledged this generally as a puzzle that CUB explains a large proportion of the signal of strong in need of solving, as discrepancies between their theoretical selection on synonymous sites, indicating that the functional predictions of the strength of selection and that of mechanis- effects of CUB have been generally underestimated. tic models of CUB were already apparent. We can characterize the selection acting on CUB using Materials and Methods population genetic data. One commonly used method of Sequence data estimating level of selection is to compare the SFS of a putatively selected class of sites to that of a neutral reference. We used sequence data from two D. melanogaster population This approach is powerful, as the neutral reference can make samples, one from North America (DGRP Freeze 2), consisting

512 H. E. Machado, D. S. Lawrie, and D. A. Petrov of 200 inbred lines (Mackay et al. 2012), and one from Africa We produced 200 such matched 4D/SI datasets, eachwith (Zambia), consisting of 197 haploid embryos (Lack et al. the same 871,218 DGRP or 754,503 Zambia 4D sites, and an 2015), downloaded from the Drosophila Genome Nexus average of 288K SI sites for DGRP and 244K SI sites for (http://www.johnpool.net/genomes.html). To reduce the Zambia (each SI site is matched to an average of 3 4D sites). effect of sequencing and mapping error, for each individual Note that we find this decision of SI re-use to be important. we filtered out all sites with low mapping quality (MAPQ For example, if we instead remove 4D sites for which there , 20). We also filtered out sites within 10 bp of an indel. is not a unique SI match, this results in a skew in the tri- Since mapping errors are more common in regions around nucelotide composition, and has a systematic effect on 4D/SI indels, and since introns have a greater number of indels, polymorphism levels. Correcting for this skew by further including these regions could have artificially inflated downsampling restores the polymorphism levels but results the short intron polymorphism level. We also eliminated in too few sites to perform the ML analyses (Figure S2). For polymorphic sites with more than two alleles (1% of confidence interval estimation, each dataset was resampled polymorphisms). with replacement. When estimating selection parameters Per population, we downsampled sites to a uniform cov- and their confidence intervals on these resampled sets, erage of 160 haplotypes, and excluded sites with .160 hap- linkage between 4D sites within these sets would be a lotypes. We considered only the four major autosomal violation of the SFS assumptions that posit independence chromosome arms because of systematic differences be- between sites. However, as a consequence of the structure of tween D. melanogaster autosomes and X chromosomes thedata,theaboveresamplingschemeworkswellaslinkage (Singh et al. 2005), and polarized polymorphic sites by par- in D. melanogaster breaks down across short distances simony, identifying the ancestral state as the allele found in (Feder et al. 2012), and the 4D sites used (those with a the D. simulans v2 reference genome (Hu et al. 2013). We matching nearby SI site) are sufficiently sparsely populated used the D. melanogaster reference allele for cases where the across the genome. ancestry was ambiguous, either because there was no direct Maximum-likelihood estimation of selection parameters D. simulans alignment, or because neither allele was present from SFS in D. simulans. While instances of mispolarization can occur, we note that as the polarization is not used for unfolding We employed a variation of the site SFS method described in spectra, mispolarization has little to no effect on the shape Lawrie et al. (2013). The method uses both SNP density and of the SFS (i.e., no exchange of mutations from low-to-high frequency information of SFS to calculate the distribution of frequency), and, therefore, will have little effect on the fitness effects (DFE) for a test set of sites given a “neutral” maximum-likelihood (ML) estimates. Fourfold degenerate reference—in this case, the DFE for 4D synonymous sites synonymous (4D) sites and intronic regions were identified with SI sites as the reference. For the purposes of ML estima- from Flybase annotations (release 5.5; www.flybase.org). tion, the spectra are folded. The DFE itself is modeled as a The total number of 4D sites in our two datasets was categorical distribution, where the program estimates effec-

1,976,830 for DGRP and 1,862,290 for Zambia. We classi- tive selection coefficients (g = Nes) and the percentages of fied short introns (SI) as introns .86 bp in length, and ex- sites ( f ) evolving under those selection coefficients for a pre- cluded the first and last 8 bp of each intron, as these regions determined number and type of selection categories. This has are known to be under constraint (Haddrill et al. 2005; the advantage of not assuming a particular distribution Halligan and Keightley 2006; Clemente and Vogl 2012b). shape, such as gamma or lognormal, but comes at the cost The total number of SI sites was 550,587 for DGRP and of additional free parameters per additional categories. For 446,462 for Zambia. example, a three category model that has a neutral class

We created the SI control dataset by matching each 4D site ( f0) + a weak selection class ( fW,0. gW .210) + a strong to a SI site. To control for mutation rate differences between selection class ( fS, 210 . gS .2inf) requires four free pa- 4D sites and their matched controls, we required each rameters to fully describe it ( f0 =12 fW 2 fS, g0 = 0). The matched SI site to have the same ancestral allele and the method also estimates the per site effective mutation rate, same neighboring nucleotides (3 bp context) as the 4D site. u (4Nem), for the SI spectra. While estimating parameters We matched blind to the direction or strand (i.e., matching on SFS, the effective population size, Ne, is held constant, with the forward, reverse, reverse complement, or comple- allowing the other parameters to be fit to the data relative ment SI sequence). To control for the effect of linked selec- to that Ne. tion on the level of 4D polymorphism, we also required each Demography, linked selection, and other forces affecting matched SI site to be within 1000 bp of the 4D site, such that both 4D and SI sites, can skew the spectra and bias the SI control would be subject to the same linked selective pres- estimation of the above DFE parameters. To compensate, sure from nonsynonymous sites as the 4D sites. We found we used frequency-dependent correction factors, ax, which 1000 bp to be a sufficiently small distance, as we found no adjusts the probability of seeing a site with a SNP at fre- significant correlation between SI polymorphism and dis- quency of x in the sample 2 p(x|model) (Eyre-Walker et al. tance between the 4D sites and the matched intron over 2006). The likelihood (l) of the SFS under the model’s the range of 0–1000 bp (Supplemental Material, Figure S1). framework is shown below:

Strong Selection on Codon Usage Bias 513 ⇀ ⇀ ⇀ ⇀ ⇀ ⇀ ⇀ In Equation 5, binompdf (k; N , z) is the binomial probabil- lfullðSFS4D; SFSSI u;g; f;aÞ¼lðSFS4D u;g; f;aÞ 3 lðSFSSI u;aÞ (1) s ity of choosing k out of Ns derived alleles at a frequency z in

1 the population. Equation 6 is the standard Wright-Fisher ⇀ ⇀ Y2 ⇀ ⇀ ⇀ ⇀ ⇀ k0 ⇀ kx ... fi lðSFS4D u;g; f;aÞ¼pð0 u;g; f;a; L; NsÞ ðaxpðx u;g; f; L; NsÞÞ (2) model for calculating, gc(z| ), the probability of nding a x¼ 1 Ns mutation at a frequency z in a diploid population with co- : ∶¼ ; ∶¼ # ; dominant alleles where the fitness of the derived, mutant where a 1 1 Ns of sample frequenciesP ∶¼ # Ns ; ∶¼ allele is 1+2s the fitness of the ancestral allele (Wright kx of sites at frequency x L x kx 1938). 1 X2 ⇀ ⇀ ⇀ ⇀ ⇀ pð0 u;g; f;a; L; NsÞ¼1 2 axpðx u;g; f; L; NsÞ (3) Model parameters for demography and linkage x¼ 1 Ns All inferences were made with an Ne of 2000. This parameter ⇀ X was chosen to be big enough for the binomial sampling ð ; ⇀; ; ; Þ¼ ½ ð j ; ; ; ; Þþ ðð 2 Þj ; ; ; ; Þ p x u g f L Ns pc x Ne m gc Lc Ns pc 1 x Ne m gc Lc Ns (4) c assumptions to hold, while still being computationally trac- ⇀ P table. Assuming a constant effective population size is con- : ∶¼ ; ð : ;⇀; ; ; Þ∶¼ ðÞ: j ; ; ; ; where Lc fcL p 0 5u g f L Ns pc 0 5 Ne m gc Lc Ns venient for generating theoretical spectra, but deviations of c As in Lawrie et al. 2013, the likelihood of the model to the putatively neutral SI SFS from the theoretical neutral explain the full data set is the product of the likelihoods of the SFS are expected to exist due to an organism’sdemographic model explaining the 4D and SI data independently (Equa- history, as well as factors such as linkage to sites under tion 1). The model itself is defined by u, the array of g coef- selection. While our paired-bootstrap method endeavors ficients for each selection class (c), the array of f fraction of to ensure a similar overall level of linked selection affecting sites in each class, and the array of a demographic-correction 4D and SI spectra, any residual skew from linkage must still parameters. For each set of sites, the likelihood is calculated be corrected for when estimating selection parameters. using the multinomial distribution where the probability of To account for this deviation of the SI SFS from the theo- seeing a site at a given sample frequency, p(x|...), in the retical neutral, we performed a ML fit of offsets (a values) population is modified by an a parameter and exponentiated for each allele frequency bin. Frequency-dependent correc- by the actual number of sites at frequency x (Equation 2). In tion factors produce a fidelity in re-estimating statistics from the analyses where all sites, including monomorphic, are in- simulated spectra comparable to commonly used, computa- cluded in the analysis the probability of seeing a site in the tionally tractable, demographic models (Tataru et al. 2017). monomorphic, “zero”, class is 1—probability of seeing a poly- The frequency spectra were divided, according to a power morphism (Equation 3). Equation 4 shows the folded proba- law, into six separate allele frequency bins with the same ax bility of finding a mutation at frequency x in the sample over within each bin (Table S1). For speed, the demographic all selection classes. The likelihood equations for the short parameters were first estimated on the SI data alone before intron sites are the same as Equations 2–4 above except that the rest of the model was estimated on the entirety of the the model is one of pure neutrality ( f0 =1,g0 = 0), but it data set [tests did not reveal any significant differences be- shares the same u and a parameters. tween estimating the a parameters on the whole data set, vs. A major difference from Lawrie et al. (2013) is in calculating on the SI data set (not shown)]. pc(x|...), the probability of seeing a mutation at frequency x in Model parameters for selection the finite sample for a given selection class c.Lawrieet al. (2013) usedanapproximationthatresultedintheeffectsoffinite sam- We tested five different ML models: (1) neutral, (2) neutral pling being canceled out. This approximation allowed for the + lethal, (3) neutral + 1 selection coefficient, (4) neutral + faster calculation of probabilities and worked well for estimating selection + lethal, and (5) neutral + 2 selection coefficients. the number of sites in each selection category and for estimating For ML estimation without polymorphism level data, see the strength of weak selection regimes. However, for strong pu- Supplemental Material, Text S2. The neutral + 2 selection rifying selection, the approximation causes our method to un- coefficients model requires a parameter that is the boundary derestimate the strength of selection by 10–30%. In our condition between weak and strong selection classes. We current study, we calculated the probability pc(x|...), accounting tested a broad range of boundary conditions and found Nes = for the effects of finite sampling using the binomial distribution: 210 to permit all ML peaks to be reached for differentiating 2Ne21 X2Ne strong and weak selection. Thus, selection categories with p ðxN ; m; g ; L ; N Þ¼ g ðzm; g ; L Þ 3 binompdfðN x; N ; zÞ c e c c s c c c s s (5) Nes ,210 classify strong purifying selection. The lethal z¼ 1 2Ne class is defined purely by a drop in polymorphism density, 8 — 2 ð 2 Þ with no concomitant excess of rare alleles essentially in- > 2 4gc 1 z < 3 1 e 6¼ fi 2mLc 2 gc 0 nitely strong purifying selection. In practice, it represents g ðz m; g ; L Þ¼ zð1 2 zÞð1 2 e 4gc Þ c c c :> all selection effects stronger than what can be observed with = ¼ 2mLc z gc 0 the SFS, and is indistinguishable from a drop in mutation- (6) rate relative to the neutral control. This latter property is

514 H. E. Machado, D. S. Lawrie, and D. A. Petrov why it is important to differentiate strong, but finite, puri- should be viewed as descriptors of the true underlying DFE, fying selection from lethality. rather than a model of the DFE itself. The ML program requires seed values for selection Power analysis and model validation strength, selection proportion, lethal proportion, and theta. To identify the highest likelihood model, we used the In order to assess our power in differentiating strong se- Matlab function fminsearch, an implementation of the lection from a lethal class or 4D/SI mutational differences, Nelder–Mead simplex method (Nelder and Mead 1965), we performed power analyses of our ML method of selection with multiple seed values to bolster the chance of finding estimation. We did this by creating theoretical spectra the global maximum (see Table S2 for seed values). To for a range of selection strengths and proportions with determine the best fit model, we performed a chi-squared theta values reflecting those we found in our ML inferences likelihood ratio test of the ML scores. Since the chi-squared for the DGRP (0.01) and Zambia (0.035) populations. test is approximate for determining significance in a com- We then estimated selection for these spectra using a posite-likelihood (Coffman et al. 2016), we also use confi- theoretical neutral reference with the same theta value dence intervals. To calculate 95% confidence intervals, we and number of sites. We performed a chi-squared likelihood performed a rank bootstrap, sampling with replacement ratio test with one degree of freedom comparing the each of the 200 matched 4D and SI datasets, performing 2-category selection model (neutral + one selection class) our ML estimate of selection and using the 5th and the with the neutral + lethal model. This analysis demonstrates 195th rank values for each ML score, proportion of selec- how an increasing number of SNPs, increasing polymor- tion, and strength of selection. phism level (e.g., larger theta), and a greater proportion of sites under selection increase our power to distin- Model interpretation guish strong selection from lethality/mutational differ- Our program takes a nonparametric approach to modeling ences (Figure S4, A and B). selection and demography. In the case of demography, the We then tested the power and biases of using a categor- frequency-dependent correction factors skew the mutation– ical DFE in combination with frequency-dependent correc- selection equilibrium SFS to fit the data. While in the true, tion factors to account for demography when demography unknown, evolutionary history of the sites being studied, was complex. We simulated sites with two selection cate- the amount of effective selection on deleterious mutations gories evolving both under mutation–selection equilibrium fluctuates over time due to demographic events and linkage, and under a bottleneck/growth demographic scenario the inferred Nes values returned by our method are the mu- which had been estimated for Zambia (Table 2 in Singh tation–selection balance equivalent—a single, constant Nes et al. 2013: third codon model) and then attempted to value that captures the average amount of effective selec- re-infer the simulation parameters. These Wright-Fisher tion experienced by the typical deleterious mutation in simulations were run using a modified version of the pro- these sites. gram GO Fish (Lawrie 2017). In Figure S4, C–G, we find For selection, there is some unknown distribution of that, overall, the estimation of the percentage of sites in the selection coefficients over the sites, referred to as the non-neutral selection category is robust to demographic Distribution of Fitness Effects (DFE), which we have rep- effects. One exception is when the selected sites are very resented with either one or two selection masses. ML in- weakly deleterious, in which case the proportion of such ference using this categorical model estimates what the sites is underestimated in the bottleneck/growth demo- proportion of sites, and how strong the selection in each graphic model. Likewise, our estimates of the strength of category has to be in order to produce the same overall SFS purifying selection become increasingly conservative as the true, unknown DFE. It has been shown in the liter- (underestimated) for simulations with stronger selection ature that the use of such point estimates to represent against deleterious mutations. Further, we lose power to the DFE provides an unbiased description for a range of distinguish finite strong purifying selection from lethality real underlying DFEs (Eyre-Walker and Keightley 2007; quicker than when sites are evolving under mutation– Kousathanas and Keightley 2013). A simple example selection equilibrium. We also ran three selection category of the behavior of our inference model can be found in simulations with similar results (not shown). The primary Figure S3. Here, we performed the ML inference using cause for the increased difficulty in making these estima- only two selection categories (neutral and deleterious) tions for the bottleneck/growth demography model with on simulated data with true underlying DFEs with three respect to mutation–selection equilibrium is that muta- selection categories: neutral, weakly deleterious, and strongly tions in different selection regimes respond to the same deleterious. The lone inferred deleterious selection category nonequilibrium demography differently, causing different is forced to represent the total amount of selection in the skews in the SFS for each. The frequency-dependent correc- system with varying degrees of success for the different sce- tion factors meanwhile infer a global skew that modifies the narios, but, in all cases, the interpretability of a single non- spectra of all selection regimes the same way. neutral selection category can be problematic. In contrast to Most of the potential biases mentioned above are small, parametric DFE inference, these inferred selection categories and all of them are conservative. Supposing, for example,

Strong Selection on Codon Usage Bias 515 that the demographic model used above is indeed an accu- may not hold for all datasets. While we demonstrate that this rate reflection of the evolutionary history of the Zambian relationship does generally apply to our data, the linear re- population: as compared to what we report in Table 1 for lationship, and, thus, this analysis may not be applicable to Zambia preferred 4D sites, the true strength of selection in other datasets (e.g., where there are large differences in se- the strong categories may actually be underestimated; lection strengths across datasets). meanwhile, the confidence intervals for the proportion of Identification of putatively functional regions sites in the weak selection category are already quite wide, so a small underestimation of their tally does not change Codon usage bias: We calculated the relative synonymous much. codonusage(RSCU)foreachcodonastheobservedfre- To further validate our methodology, we checked for quency of a codon in the dataset divided by the expected consistency between real Zambia and DGRP data sets and usage if all four codons were used equally (0.25) (Sharp and the expected spectra generated from our model using an- Li 1986). We classified each 4D site as being in a preferred other SFS inference method, DFE-alpha (Eyre-Walker and (highest RSCU for the amino acid) or unpreferred codon Keightley 2009). We find a good fit between DFE-alpha (lowest three RSCUs for the amino acid). The amino acids parameter estimates for theobservedspectraandour and their respective preferred codons are as follows: alanine model-generated spectra (see Supplemental Material, Text GCC, glycine GGC, leucine CTG, proline CCC, threonine S1). ACC,andvalineGTG.Forpolymorphic4Dsites,weused the ancestral allele to designate the codon and preferred Polymorphism ratio estimate of strong selection state. We identified a total of 850,973 (366,458 with SI In order to make a precise estimate of purifying selection using controls) and 794,471 (312,523 with SI controls) 4D sites our SFS-based ML method, we require a large number of sites in preferred codons for DGRP and Zambia, respectively. (. 100K). When we have few sites, we can use alternative When analyzing sets of 4D sites, all sites with the same methods for estimating purifying selection. One proxy for the ancestral codon are grouped together, regardless of any de- amount of strong purifying selection is the depletion of poly- rived polymorphism. morphism in a selected class compared with a neutral class. We additionally measured the polymorphism ratio for in- We quantified this depletion as the “polymorphism ratio”, dividual 4D mutations (e.g., CCC/CCA), and examined the which is a variation of the classic statistic, PN /PS, for compar- relationship between the resulting RSCU change and poly- ing the density of neutral and selected segregating sites. morphism ratio. In order to appropriately calculate the poly- However, since polymorphism can at times be greater at morphism ratio for each codon change, we matched 4D sites 4D sites given certain selection scenarios, we opted to use to all SI sites with the same possible states. For example, for a statistic that was symmetric with regard to the 4D or the class of 4D sites of an ancestral “CCC” proline codon and a SI polymorphism depletion and use the log of this ratio: derived “CCA” proline codon, we matched the 4D proline “C” log(PSI /P4D), where PSI and P4D are counts of the number monomorphic sites and “C/A” polymorphic sites to either SI of polymorphic sites after the matching of SI and 4D sites. “C” monomorphic sites or “C/A” polymorphic sites (or the Because strong purifying selection is expected to remove complement), as well as matching for distance and muta- polymorphism from the dataset, we tested whether this simple tional context. polymorphism log-ratio statistic, which quantifies the deple- tion of polymorphism in 4D sites relative to SI sites, can Transcription factor binding sites: We used modEncode capture the number of 4D sites under strong purifying selec- chromatin immunoprecipitation sequencing (ChIP-seq) exper- tion. As we show using simulated data (Figure S5A), this iments to assess the contribution of transcription factor binding statistic provides a largely unbiased estimate of strong puri- (TFB) sites to the signal of purifying selection on synonymous fying selection, even when the number of sites is significantly sites (http://intermine.modencode.org). This dataset repre- less than what is required for the ML estimates. Note that there sents 25 experiments, testing 15 transcription factor targets is only a roughly one-to-one relationship between polymor- (antibodies: odg-GFP, anti-trem, Sin3A-RC, Su(var)3-9, KW4- phism ratio and the amount of strong purifying selection, and PCL-D2, KW3-D-D2, KW3-Trl-D2, bon (GP37), HP1 antibody this varies with the strength of selection (Figure S5A). Weak (ab24726), HP1-Covance, KW4Hr39-D1, KW3-Kr-D2, KW3- selection does affect the polymorphism ratio as well; however, CG8478-D1, KW3-hkb-D1, KNI-D2,KW3-Trl-D2; modENCODE it does so to a much lesser extent. In our datasets, the submissions 3229, 3230, 3232, 3234, 3237, 3238, 3239, 3240, correlation between our ML estimates and polymorphism 3241, 3242, 3243, 3245, 3390, 3391, 3392, 3393, 3394, 3395, ratio measures most closely resembles the pattern predicted 3396, 3398, 3399, 3400, 3401, 3402, 3403). We consider a by strong purifying selection (Figure S5B). We therefore use “transcription factor bound region” to be any region with evi- this metric as a proxy for the proportion of sites under strong dence for TFB in any of the noncontrol experiments (minimum purifying selection for subsets of sites, specifically when there binding score: 50). We identified a total of 294,703 (129,611 are too few sites to perform ML inference. It is important to with SI controls) and 289726 (118,463 with SI controls) tran- remember that the linear relationship between polymorphism scription factor (TF) bound 4D sites for DGRP and Zambia, ratio and proportion of sites under strong purifying selection respectively.

516 H. E. Machado, D. S. Lawrie, and D. A. Petrov Spliceosome binding: We used modEncode RNA immuno- estimate of purifying selection, we use the polymorphism precipitation sequencing (RIP-seq) experiments targeting pu- ratio to calculate a rough estimate the number of sites under tative spliceosome proteins to assess the contribution of strong purifying selection in each subset. We estimate the spliceosome binding to the signal of purifying selection number of sites under strong purifying selection by calculating (http://intermine.modencode.org). The experiments tested the increase in polymorphism ratio in a functional class above for RNA-protein binding of a total of 30 putative splicing the background polymorphism ratio (the polymorphism ratio proteins. We considered a region to be bound if it had a for sites in a functional class minus the polymorphism ratio for binding score of 5 or greater in any of the experiments. This the remaining sites), multiplied by the total number of sites in left a total of 321,290 (153,037 with SI controls) and the functional class (pre-SI matching). 316,740 (137,967 with SI controls) spliceosome-bound 4D Conservation scores sites for DGRP and Zambia, respectively. We calculated the level of conservation of each 4D site across a Alternative splicing: We distinguished between genes with 10-species Drosophila phylogeny. We excluded D. mela- and without alternative splicing using the analysis in nogaster from the phylogenetic analysis in order to avoid a Brown et al. (2014). We considered any gene with more confounding effect of D. melanogaster polymorphism on both than one transcript as alternatively spliced. We found a polymorphism ratio and phyloP score. The PRANK multiple total of 1,196,063 (635,814 with SI controls) and 1,136,535 sequence alignments of the 10 species (D. simulans, D. sechel- (556,673 with SI controls) 4D sites in alternatively spliced lia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, genes for DGRP and Zambia, respectively. D. persimilis, D. virilis, D. mojavensis, D. grimshawi) were gen- erously provided by Dr. Sandeep Venkataram. We calculated Splice junctions: We used the splice junctions identified by the probability of conservation for each 4D site using the Brooks et al. (2015). We found 18,410 (13,447 with SI con- phyloP function of the PHAST software (method=“likelihood trols) and 17,528 (11,920 with SI controls) 4D sites in splice ratio test”)(Cooperet al. 2005). junctions for DGRP and Zambia, respectively. Recombination rate

Ribosomal occupancy: We estimated ribosomal occupancy We used D. melanogaster recombination rate estimates from using the ribosomal profiling experiments conducted by Dunn Comeron et al. (2012). The recombination rate used is the et al. (2013). We first normalized each pooled experiment file mean across 100-kb windows, and is expressed in cM/Mb/ (GEO accession GSE49197) by dividing the number of counts female meiosis. We split the 4D sites into three recombina- in each region by the total number of counts across regions. tion rate categories: low [bottom third: (0–1.08) cM/Mb/ All regions with zero counts for either the footprinting or meiosis], medium [middle third: (1.08–2.71) cM/Mb/meio- expression experiments were excluded. We estimated ribo- sis], and high [top third: (2.71–14.8) cM/Mb/meiosis]. somal occupancy by dividing the normalized ribosomal foot- Data availability print values by the normalized expression values (for each fi DNA strand separately). The top and bottom 1 percentile of The authors state that all data necessary for con rming the ribosomal occupancy scores were omitted from downstream conclusions presented in the article are represented fully analysis, leaving translational efficiency scores for 1,391,585 within the article. All data used in this study are publicly 4D sites. We defined high ribosomal occupancy as the available, and are referenced in Materials and Methods. The fi top 25% of values (represented by 116,645 and 104,168 Supplemental Materials le, including Supplemental Text, SI-matched 4D sites for DGRP and Zambia, respectively). Figures and Tables, is available at FigShare. Matlab code for the ML estimation of selection is available on Github: Frequency of preferred codons: We calculated the frequency https://github.com/DL42/SFS_DFE_categorical. The modi- fi of preferred codons (FOP) per gene. As before, preferred ed version of the GO Fish program used to generate the codons were defined as the most frequent codon for a given Wright-Fisher simulations is also available on Github: amino acid. The FOP was calculated with our 4D datasets, https://github.com/DL42/3P_custom_synonymous. Supple- fi such that codons that did not appear in our datasets (e.g., mental material is available at gshare: https://doi.org/ those without 4D sites) did not contribute to the FOP calcu- 10.25386/genetics.11316794. lation. Sites were classified as being in genes with either low (bottom quartile), medium (middle two quartiles), or high Results (top quartile) FOP. The average proportion of preferred co- Sequence data and neutral controls dons for sites in low, medium, and high FOP genes was 28, 42, and 54%, respectively. We identified all 4D synonymous sites and putatively neutral SI sites in two datasets, one of an African (Zambia) and one ofa Estimating the number of sites under strong purifying North American (DGRP Freeze (2) D. melanogaster popula- selection in functional class subsets: As most of the func- tion (see Materials and Methods). We used short introns as tional class subsets have too few sites to perform the ML our neutral reference, as D. melanogaster short introns have

Strong Selection on Codon Usage Bias 517 Table 1 Nested maximum-likelihood models tested for the Zambia and DGRP datasets Population Dataset Model Prop. S1 -Ns S1 Prop. S2 -Ns S2 Delta LL P

Zambia Full n —————— n + l 0.09 (0.09–0.10) Inf ——255 1*102111 n + s 0.12 (0.11–0.14) 26 (14–44) ——303 3*10222 n + s + l 0.10 (0.06–0.90) 14 (5–33) 0.03 (0–0.06) Inf 307* 3*1023 n + s + s 0.06 (0–0.158) 7 (0.1–10) 0.07 (0.03–0.13) 81 (18-Inf) 306 1 DGRP Full n —————— n + l 0.13 (0.12–0.14) Inf ——239 1*102104 n + s 0.14 (0.12–0.15) Inf (75-Inf) ——241* 0.04 n + s + l 0.11 (0–0.87) 166 (0-Inf) 0.03 (0–0.14) Inf 241 1 n+s+s 0(0–0.01) 0.2 (0.1–6) 0.14 (0.12–0.15) Inf (76-Inf) 241 1 Zambia Preferred n —————— n + l 0.18 (0.16–0.19) Inf ——446 2*102194 n + s 0.30 (0.27–0.33) 7 (4–11) ——669 9*10299 n + s + l 0.23 (0.18–0.28) 3 (1–6) 0.07 (0.03–0.10) Inf 691 1*10211 n + s + s 0.82 (0.19–0.86) 0.3 (0.2–2) 0.16 (0.10–0.20) 29 (15–75) 699* 9*1025 DGRP Preferred n —————— n + l 0.25 (0.24–0.26) Inf ——512 2*102223 n + s 0.29 (0.27–0.31) 29 (12–57) ——574 1*10228 n + s + l 0.14 (0.08–0.21) 3 (1–10) 0.15 (0.09–0.19) Inf 597* 2*10211 n + s + s 0.15 (0.11–0.79) 2 (0.1–5) 0.19 (0.14–0.24) 116 (52-Inf) 598 0.1 n: neutral; l: lethal; s: selection. Maximum-likelihood parameter estimates per model (median of 200 sets of matched 4D and SI sites). Values in parentheses are 95% bootstrap confidence intervals. Model comparison was performed with chi2 goodness of fit test. *Best fit model with P , 0.05. The delta log-likelihood (Delta LL) is with respect to the neutral model, while the P value is with respect to the nested model above. been previously found to be under minimal selective con- Reduced 4D polymorphism relative to SI polymorphism: straint (Haddrill et al. 2005; Parsch et al. 2010; Clemente The decrease in 4D polymorphism compared with the SI and Vogl 2012b). We matched 4D sites to SI sites based on controls can be expressed as the “polymorphism ratio”, de- ancestral nucleotide, mutational context, and location (within fined as the natural logarithm of the SI polymorphism to 4D 1000 bp). As the number of 4D sites was greater than the polymorphism ratio. We find this statistic to correlate well number of SI sites, we allowed SI sites to be matched to with the proportion of sites under strong purifying selection multiple 4D sites. We confirmed that our results are robust estimated by our ML method (Figure S5; see Materials and to this matching strategy by also performing an analysis Methods for discussion). A positive polymorphism ratio is due matching only one 4D site per SI site, and found our results to decreased 4D polymorphism (relative to SI polymorphism) to be qualitatively unchanged (but with decreased power, and can be indicative of strong purifying selection. We found Figure S2). We performed the 4D/SI matching 200 separate a reduction in 4D polymorphism in both the Zambia and times, producing 200 SI control sets. DGRP datasets (polymorphism ratio = 0.10 and 0.14, for Zambia and DGRP, respectively; Figure 1). We found an even Many synonymous sites in preferred codons are under greater reduction in 4D polymorphism for preferred codons strong selection (polymorphism ratio = 0.19 and 0.29 for Zambia and DGRP, In order to detect the presence of purifying selection on respectively), and almost no reduction in 4D polymorphism synonymous sites, we compared the synonymous 4D SFS in unpreferred codons (polymorphism ratio = 0.01 for Zam- and polymorphism level (the proportion of polymorphic sites) bia and DGRP), suggesting that preferred codons are under to that of the matched SI controls. Purifying selection will more strong purifying selection than unpreferred codons. affect the shape of the SFS and the polymorphism level in a number of ways. First, purifying selection in general removes ML estimates of purifying selection: The strong reduction in genetic variation from a population, resulting in a decrease in 4D polymorphism is suggestive of strong purifying selection the polymorphism level. The effect of purifying selection on operating on 4D sites; however, this measure is (1) under- the shape of the SFS is a function of the strength of selection. powered in detecting weak selection, and (2) does not provide

Weak purifying selection (Nes .210) decreases the den- robust quantitation of selection strength or proportion of sites sity of the SFS at intermediate allele frequencies, and en- under selection. We used a ML estimation to quantify the riches low frequency variants. Strong purifying selection strength and extent of purifying selection (see Materials and

(Nes ,210) results in an enrichment of very low allele fre- Methods for discussion), testing five different selection mod- quency variants, making a skew in the SFS detectable only els: (1) neutral, (2) neutral + lethal, (3) neutral + 1 selection when a large number of individuals have been sampled. Very coefficient, (4) neutral + 1 selection coefficient + lethal, and strong selection (and lethal mutations: Nes = 2Inf) will not (5) neutral + 1 weak selection coefficient + 1 strong selec- affect the shape of the SFS, and will result primarily in a tion coefficient (Table 1 and Figure 1 and see Materials and decrease in the polymorphism level. Methods). For the full Zambia dataset, the best fit model is the

518 H. E. Machado, D. S. Lawrie, and D. A. Petrov Figure 1 Polymorphism and selection estimates differ by CUB group. SFS (cumulative), polymorphism ratio, and best-fit selection estimates for fourfold synonymous (4D) and matched short intron control (SI) sites for the full dataset (top), for preferred codons (middle), and for unpreferred codons (bottom). L: lethal.

neutral + selection + lethal model, which inferred strong from neutrality, and from lethality. The corresponding model selection (10% at Nes = 214) with a small lethal class. The for DGRP likewise supports a range of selection effects in lack of a weak selection estimate for the full dataset is con- preferred sites (15% at Nes = 22, 19% at Nes = 2116). sistent with previous findings that the signal of weak selec- Estimating selection in the DGRP dataset is, again, hampered tion is too low for detection when including all sites. by lower power as this model is not a significantly better fit Similarly, for the full DGRP dataset, our best-fit selection than the neutral + selection + lethal model, with the 95% estimate also inferred a strong selection class; however, the confidence intervals overlapping lethality for the strong se- bootstrap 95% confidence interval includes lethality, indicat- lection class. Though lacking power, these DGRP results pro- ing that the neutral + 1 selection coefficient model was not vide independent support for the findings in Zambia of a wide significantly better than the neutral + lethal model (13% range of selection strengths affecting mutations in preferred lethal). The DGRP dataset has half the number of polymor- 4D sites in D. melanogaster. phisms as the Zambia dataset (47K in DGRP), and, thus, the 4D sites in unpreferred codons show a corresponding lack strongest selection that can routinely be distinguished from of purifying selection, with the neutral model having the best lethality is Nes 260 (for power analyses, see Figure S4). fit to the data. This is consistent with the low polymorphism With fewer sites, the test lacks power in the DGRP data, but ratios for these datasets (0.01 for both Zambia and DGRP, its results are consistent with the presence of strong purifying Figure 1). The enrichment of sites under selection in the set of selection in 4D sites as found using Zambia data. preferred codons and the lack of selection found in the set of When we narrow the focus to 4D sites in preferred codons, a unpreferred codons indicates that selection on CUB is a major class ofsites putatively enrichedfor purifying selection, the ML component of the total amount of purifying selection on syn- analysis is able to discriminate more features of the distribu- onymous sites, and that the identification of both a weak and tion of selection coefficients operating on 4D sites. The best-fit a strong selection class for preferred codons indicates that model for Zambia preferred codons is the neutral + 2 selection selection on CUB may not be limited to weak selection. coefficient model (82% at Nes = 20.3, 16% at Nes = 229), Phylogenetic conservation scores support finding of indicating a range of weak and strong selection coefficients strong selection on CUB acting at the preferred sites. The 95% confidence interval for the proportion of sites in the weak selection category is large If our ML and polymorphism ratio estimates truly do reflect (19–86%), but, importantly, both the weak and strong puri- selection levels, we might also expect our estimates to corre- fying selection classes are distinguishable from each other, late well with signatures of long-term selection, such as

Strong Selection on Codon Usage Bias 519 phylogenetic conservation. We calculated phyloP phylo- genetic conservation scores across a 10-species Drosophila phylogeny (excluding D. melanogaster). The phyloP score measures the extent of conservation or divergence per site, with positive values representing conservation and negative values representing divergence. We asked if there was a cor- relation between the proportion of sites we identified to be under purifying selection and the level of phylogenetic con- servation. We found a strong correlation between polymor- phism ratio and phyloP conservation score of 4D sites (Zambia: R2 = 0.96, P , 2*10216; DGRP: R2 = 0.94, P , 2*10216; Figure 2). We also performed ML estimates of the proportion of sites under selection for 4D sites in low (lower quartile), medium (middle two quartiles), or high (upper quartile) phyloP scores. Again, we observe the same relation- ship of increasing purifying selection with increasing con- Figure 2 Phylogenetic conservation correlates with selection estimates. servation (Figure 2). The correlation of phylogenetic Correlation between the phyloP conservation score across a Drosophila conservation with our estimates of purifying selection phylogeny, and the proportion of sites under selection, as estimated by the polymorphism ratio (lines) and the maximum-likelihood (ML) method supports the relevance of our estimates to long-term (triangles). Dark red: Zambia, light blue: DGRP. The polymorphism ratio constraint. was estimated in sliding windows of 100K SNPs. The ML estimates were made for three groups: the lowest quartile, the middle two quartiles, and Level of preference for a codon predicts proportion of the highest quartile of phyloP scores. ML estimates are plotted against the sites under strong selection median phyloP score for each group. Our findings suggested that a substantial proportion of synon- More selection on synonymous sites due to CUB than ymous sites in preferred codons were under strong purifying due to other processes selection. Since the biased usage of codons actually exists on a continuum, rather than binary designations of “preferred” and Several processes other than those related to CUB have also “unpreferred”, we next asked whether or not the level of biased been hypothesized to act on synonymous sites. In order to usage (for a particular codon) correlates with the amount of assess the relative importance of various processes driving the strong selection observed. We used the relative synonymous observed selection on synonymous sites, we tested several codon usage (RSCU) as a measure of the level of codon putatively functional classes of sites for enrichment of puri- preference (Sharp and Li 1986). We measured RSCU for fying selection. In addition to preferred codons, we tested TF each 4D codon and compared that to the polymorphism bound regions, alternatively spliced genes, RNA binding pro- ratio. We found a strong positive relationship between tein (RBP) bound regions, splice junctions, and high ribosomal RSCU and polymorphism ratio (Zambia: R2 = 0.56, P = occupancy regions. We calculated the polymorphism ratio for 4*1027;DGRP:R2 = 0.63, P = 4*1028;Figure3). each functional class, and compared it to that of the set of sites We next asked if the change in RSCU, from ancestral to excluding the functional class—the latter providing the back- derived, correlated with polymorphism ratio. We hypothe- ground level. We found a significantly elevated (above back- sized that mutations to a less preferred state (positive RSCU ground) polymorphism ratio not only for preferred codons change) would show evidence for strong purifying selection but also for alternatively spliced genes, spliceosome bound (positive polymorphism ratio), whereas mutations to a more regions, splice junctions, and TF bound regions in both the preferred state (negative RSCU change) would be positively Zambia and the DGRP populations (Figure 4 and Figure S8). selected for, and have an increased level of 4D polymorphism We then estimated the relative contribution of each func- relative to the SI control (negative polymorphism ratio). We tional class to the signal of strong purifying selection (see found a strong, positive relationship between RSCU change Materials and Methods). We combined the three splicing-re- and polymorphism ratio, with negative polymorphism ratios lated classes (alternatively spliced genes, spliceosome bound for strongly preferred derived mutations on unpreferred an- regions, and splice junctions), leaving three general groups cestral codons (Figure 3). Under the assumption that SI putatively under purifying selection: CUB, splicing, and tran- sites are neutral, negative polymorphism ratios (i.e., greater scription factor binding. In the Zambia dataset, we estimated levels of polymorphism at 4D than SI sites) can result from there to be 150K sites under strong purifying selection asso- particular mutation and selection regimes (McVean and ciated with CUB, 38K with splicing, and 4K with transcription Charlesworth 1999; Lawrie et al. 2011), such as positive se- factor binding (Figure 4). The DGRP dataset showed similar lection on 4D sites increasing 4D polymorphism. Our results trends: 217K sites under strong purifying selection associated support the hypothesis of purifying selection on the strongest with CUB, 100K with splicing, and 13K with transcription unpreferred changes and positive selection on the strongest factor binding. In summary, we found that CUB explained preferred changes. the greatest number of 4D sites under purifying selection,

520 H. E. Machado, D. S. Lawrie, and D. A. Petrov Figure 3 Level of CUB correlates with strong selection measurements. Top: Polymorphism ratio for each codon as a function of the level of bias for the codon (relative synonymous codon usage: RSCU) (median over 200 matched controls). Bottom: Polymorphism ra- tio for each ancestral/derived codon pair as a function of the change in RSCU.

representing 2–4 times as many sites as splicing and tran- rate. While there is a greater proportion of preferred codons scription factor binding combined. in high recombination rate regions (42.1% and 42.6% for We also measured the polymorphism ratio for the sites least Zambia and DGRP, respectively) than in low recombination likely to be under selection. We excluded the two largest rate regions (40.1% and 41.2% for Zambia and DGRP, re- contributors to selection on synonymous sites: preferred co- spectively; both chi2 P , 10215), we found no evidence of dons and alternatively spliced genes. The set of unpreferred increased strong purifying selection on preferred codons in codons in nonalternatively spliced genes consisted of 137K high recombination rate regions compared with those in low sites in Zambia and 158K sites in DGRP, and represented the recombination rate regions (Figure 5), nor any general in- 4D sites least likely to be under strong selection. Interestingly, crease in selection with recombination rate (Figure S7). we found that this set of 4D sites had more polymorphism than Features of strong selection on CUB by gene type or their SI matched control set (negative polymorphism ratio), genic location indicating greater purifying selection in short introns and/or the presence of positive selection on these 4D sites. Preferred codons may be under greater purifying selection in some genes than in others. We asked if a greater proportion of Recombination rate does not influence strong selection preferred codons were under strong purifying selection in on CUB genes with high CUB compared to genes with low CUB. One Previous studies have found mixed evidence of a correlation measure of the amount CUB in a gene is the FOP. We calcu- between recombination rate and CUB (Kliman and Hey 1993; lated FOP per gene and asked if, as expected, there was a Marais et al. 2001; Singh et al. 2005; Campos et al. 2012, stronger signal of purifying selection on 4D sites in genes with 2013). We therefore tested for increased levels of purifying higher FOP. We found a trend toward a larger polymorphism selection on preferred 4D sites as a function of recombination ratio for 4D sites in high FOP genes (Zambia: 0.103; DGRP:

Strong Selection on Codon Usage Bias 521 Figure 4 (A) Extent of strong purifying selection as measured by the polymorphism ratio for each class of site (gray) and the dataset excluding the focal class sites (white) (Zambia dataset). The number of sites in a focal class is listed below the corresponding bar. The red dashed line is the polymorphism ratio for the full dataset. Error bars represent 2 SE. (B) Relative proportion of synonymous sites under strong purifying selection due to splicing, CUB, or transcription factor (TF) binding.

0.150) compared with low FOP genes (Zambia: 0.087; DGRP: end . middle: Zambia P = 8*1025, DGRP P = 0.03). How- 0.131), albeit the trend was not significant (Zambia: t-test P = ever, this pattern was also observed in unpreferred codons (t- 0.19; DGRP: t-test P = 0.26; Figure S6). test start . middle: Zambia P = 0.05, DGRP P = 0.04; t-test We then evaluated the patterns of CUB-associated poly- end . middle: Zambia P = 0.06, DGRP P = 0.5), indicating morphism by grouping 4D sites into three categories: pre- that this effect may be unrelated to CUB. Alternatively, this ferred, unpreferred with mutations to another unpreferred could be a result of purifying selection on synonymous sites state, unpreferred with mutations to the preferred state. We important for splicing. We next assessed polymorphism ratio found no trend of polymorphism ratio verses FOP for preferred as a function of the exon position along the gene (either first codons, indicating that a similar proportion of preferred exon, last exon, intermediate exons, or exons of single-exon codons were under strong selection in genes with low overall genes). No consistent patterns were observed with location of biased codon usage compared with genes with high bias and the exon (Figure 7). consequently,that a larger number of preferred codons in high FOP genes are subject to strong selection (Figure 6). Discussion Interestingly,we found a pattern of negative polymorphism Strong and weak purifying selection on CUB ratios for unpreferred codons specifically in high FOP genes, which was particularly pronounced for sites that were ances- We find evidence that selection on CUB is not limited to weak trally unpreferred with derived preferred mutations. This selection, and find that 10–20% of 4D sites in preferred co- pattern was much stronger in high FOP genes than low dons are under strong purifying selection. Our study builds FOP genes (Zambia: t-test P = 0.02; DGRP: t-test P = on methodology developed in Lawrie et al. (2013), recapit- 3*1025). This is important for the interpretation of Figure ulating their major result of strong purifying selection on S6, as these negative polymorphism ratios at unpreferred synonymous sites, and extending the analysis to identify codons in high FOP genes lead to lower overall polymor- functional associations. We were able to gain a finer view phism ratios than would be expected given the larger number by (1) use of multiple, deeply sampled datasets, (2) ancestral of preferred codons subject to strong selection in such genes polarization of alleles, and (3) strict filtering of sites with low (Figure S6). These patterns are generally consistent with quality or near indels to reduce noise. Although Lawrie and stronger positive selection in favor of preferred codons in colleagues found evidence of strong selection on 4D sites, the high FOP genes (Jackson et al. 2017). underlying processes examined could not account for this CUB has also been shown to vary depending on the location signal. While our finding of strong purifying selection on in the gene (Plotkin and Kudla 2011). We first asked if pre- 4D sites is consistent with the findings of Lawrie et al. ferred codons vary in the amount of strong purifying selection (2013), our study finds that CUB accounts for the majority that they are under as a function of the location in the exon. of this selection, and, in conjunction with splicing and TFB, We measured the polymorphism ratio for each class of codon can fully account for the decreased levels of 4D polymor- preference at the start (first quartile) of an exon, the middle phism. In addition to finding that many 4D sites are subject of an exon (second and third quartile), or the end of an exon to strong selection, we also find evidence that a substantial (fourth quartile). In preferred codons, we found a trend to- proportion of 4D sites are under weak purifying selection ward increased polymorphism ratio at the start and the end on CUB, which is consistent with the signal of weak selec- of exons, compared with the middle of the exons (Figure 7; tion previously observed in D. melanogaster (Zeng and t-test start . middle: Zambia P = 0.1, DGRP P = 0.01; t-test Charlesworth 2009, 2010; Campos et al. 2013).

522 H. E. Machado, D. S. Lawrie, and D. A. Petrov ratios for codons that we would expect to be under the greatest amount of positive selection, i.e., codons with highly unpre- ferred ancestral states and highly preferred derived mu- tations. Positive selection could be driving an influx of mutations that alter 4D sites from generally less-fit unpre- ferred states to generally more-fit preferred states. The neg- ative polymorphism ratios are also consistent with excess purifying selection on the control SI sites relative to the tested 4D sites. Figure 5 Polymorphism ratio by codon preference and recombination It is well established that certain genes, particularly those rate (RR). RR groups are classified into low (lowest third), medium (middle with high expression, tend to have a greater proportion of third), and high (top third). U: unpreferred; P: preferred. Arrows separate preferred codons (Gouy and Gautier 1982; Bulmer 1991; ancestral (pre-arrow) and derived (postarrow) states. Error bars are 2 SE. Novoa and Ribas de Pouplana 2012). We measured the poly- For methodological reasons, many previous methods iden- morphism ratio for sites in genes of low, medium, and high tified only weak selection on CUB. Strong purifying selection is FOP. From this analysis, we have three major findings: (1) not detectable with methods that use only the polymorphic the proportion of preferred codons under strong purifying SFS, without sufficiently high depth of population sequencing selection is relatively constant across genes (Figure 6), (2) (Zeng and Charlesworth 2009, 2010; Campos et al. 2013) or there is evidence for increased positive selection for derived methods that incorporate polymorphism level, but assume a preferred mutations in high FOP genes (Figure 6 and Jackson DFE that can provide misleading biological inferences if the et al. 2017), and (3) the contribution of excess 4D polymor- assumed distribution does not match the true DFE, e.g., the phism from these derived preferred mutations in high FOP gamma distribution in Eyre-Walker and Keightley (2007), genes is an example of how the polymorphism ratio measure Andolfatto et al. (2011) and Kousathanas and Keightley of strong purifying selection can be dampened by positive (2013). Other inference methods that incorporate polymor- selection. To more fully articulate the third point, polymor- phism level, but without making use of a neutral reference, phism ratio in high FOP genes is the combination of two would also have difficulty detecting strong purifying selec- competing processes: more ancestral preferred codons in- tion (Zeng and Charlesworth 2009, 2010; Campos et al. creasing the polymorphism ratio, and more positive selection 2013). In our analysis, we use the polymorphism level and for derived preferred codons reducing the polymorphism ra- SFS to make point estimates of selection strengths, which is tio. The opposing actions result in no significant increase in robust to a range of real underlying DFEs (Eyre-Walker and overall polymorphism ratio from low to high FOP genes (Fig- Keightley 2007; Kousathanas and Keightley 2013). This al- ure S6), despite the expectation that a larger number of pre- lows us to detect selection occurring at both the weak and the ferred codons should produce higher estimates of selection in strong range of selection coefficients, such as in the set of high FOP genes (Jackson et al. 2017). Zambia preferred codons, where we detect peaks of selection Selection on other functional classes coefficients at Nes = 20.3 as well as at Nes = 229. Thus, our method is able to infer a wide range of selection strengths We find splicing to be the second-most important process operating in 4D synonymous sites. underlying purifying selection on synonymous sites. We tested three classes of sites putatively enriched for selection Polymorphism ratio correlates with the level of CUB per due to splicing: alternatively spliced genes, spliceosome- codon and per gene bound regions, and splice junctions. Of these three classes, Since we control for mutation rate and local determinants of although alternatively spliced genes explain the greatest polymorphism such as linked selection and recombination, we amount of selection on synonymous sites (90K sites), owing find that the polymorphism ratio of the SI to 4D sites can be a to the large number of sites in alternatively spliced genes, we good proxy for the proportion of sites under strong purifying find that splice junctions have the greatest proportion of sites selection, as seen in simulations as well as empirically in the under selection (45% under strong purifying selection), relationship between polymorphism ratio and both the ML followed by spliceosome-bound regions. Splicing is known estimates of selection and the level of phylogenetic conser- to be a critical function for proper development and function vation (Figure 2 and Figure S5). Further, the estimated pro- of an organism. Similar research in humans has likewise portion of sites under strong selection is highly correlated found evidence of widespread, strong evolutionary constraint with the extent of CUB, as measured by the relative synony- affecting 4D sites in exonic splice enhancers (Savisaar and mous codon usage (RSCU) (Figure 3). The change in RSCU Hurst 2018). from ancestral to derived also correlates with the proportion There is also evidence for strong selection on TF bound 4D of sites under strong selection. These results further support sites. We estimate that 3K 4D sites are under strong selec- our conclusion of strong purifying selection on CUB. tion due to TFB. To identify TF bound sites, we used ChIP-seq We also notice negative polymorphism ratios in the RSCU experiments targeted at 16 different TFs. With a larger analysis, where we find strongly negative polymorphism breadth of TFB data, 4D sites in TF-bound regions may prove

Strong Selection on Codon Usage Bias 523 more appropriate match than if we had not polarized by ancestral state. Since we match locally (, 1000 bp), 4D sites and their matched SI controls will also be subject to the same local mutation and recombination rate effects. Further, these results should not be confounded by biased gene conversion as this phenomenon appears to be absent in D. melanogaster (Robinson et al. 2014; Jackson et al. 2017), and the local 4D/ SI matching would correct for it even if it were active. Finally, we observe that our estimates of purifying selection in poly- Figure 6 Polymorphism ratio by class of codon preference and the fre- morphism correlate both with long-term signals of purifying fi quency of preferred codons (FOP). FOP groups are classi ed into low selection in divergence (Figure 2), and with the putative (lowest quartile), medium (middle two quartiles), and high (top quartile). U: unpreferred; P: preferred. Arrows separate ancestral (pre-arrow) and functionality of a class of sites (Figure 4), such as preferred derived (postarrow) states. Error bars are 2 SE. codons, splice junctions, RBP bound regions, and alterna- tively spliced genes, supporting the claim that our results to be under a greater amount of selection than we can detect reflect the action of selection, rather than being artifactual. here. Potential biases in selection estimates We find that CUB, splicing, and TFB are sufficient to explain the polymorphism differences between 4D and SI control Our estimates of purifying selection on 4D sites may be sites, indicating that these processes also explain the bulk conservative, underestimating the true amount of selection of strong purifying selection acting on synonymous sites. on 4D sites. This could be the case if there was any constraint However, it is important to note that our measures are only on the SI controls, or if there was positive selection on the 4D correlative with the functional class being tested, such that we sites themselves. There were two methodological decisions cannot say that these processes directly underlie the selection. that may have contributed to constraint in short introns. Both In addition, there are likely multiple other processes acting on Halligan and Keightley (2006) and Parsch et al. (2010) found synonymous variants that we have not included. Other that short introns (,65 bp and ,120 bp, respectively) have processes that have been shown, or hypothesized, to act on the least constraint on bases 8–30. As we included a larger 4D sites include transcriptional regulation (Newman et al. portion of the intron, it is possible that we have also included 2016), protein folding (Rodriguez et al. 2018), and RNA SI sites under a greater amount of conservation. We also transcript stability (Presnyak et al. 2015; Carneiro et al. excluded regions surrounding indels (10 bp on either side) 2019) and secondary structure (Gebert et al. 2019). Given in order to reduce false polymorphisms due to mismapping. the explanatory power of our results, we suggest that these This more strongly affects short introns (as they are more other processes are either less affected by synonymous vari- permissive to indels than coding regions) and will select for ation or that they are correlated with the processes already more conserved SI regions. Purifying selection acting on the tested. control SI sites can result in a negative polymorphism ratio, as can positive selection acting on 4D sites. As previously Controlling for linked selection and mutation rate discussed, we actually do observe a negative polymorphism One caveat to our polymorphism level based method of ratio in certain data subsets. Regardless of whether the ex- estimating selection is that multiple processes can reduce planation for a negative polymorphism ratio is purifying se- the observed level of polymorphism of a site. These include lection on control SI sites or positive selection on 4D sites, linked selection, low recombination rate, a reduced mutation both dampen our ability to detect purifying selection. rate or selection on the site itself. In order to isolate the effects Our estimates of purifying selection can also be biased as of selection on 4D sites, we controlled the distance between the result of demographic forces. For example, a simulated matched 4D and SI sites so that, on average, the sites in the 4D demographic scenario for Zambia shows that our ML model and SI site frequency spectra were experiencing the same rate fails to completely correct for the bottleneck/growth demo- of recombination and mutation as well as comparable levels of graphic scenario, and may result in reduced estimates of the linked selection. We found that, with an increasing distance of proportion of weakly deleterious sites and the strength of up to 1000 bp from the focal 4D site to its SI control, there strong purifying selection (Figure S4). Additionally, the sen- was no systematic change in polymorphism in the SI control, sitivity of our ML model to estimate multiple selection cate- indicating that, overall, the matched controls were under a gories is dependent on the number of sites under selection. sufficiently similar amount of linked selection (Figure S1). In Deeper population sampling and a larger set of sites support- order to account for mutational differences, we added a fur- ing more selection categories could reveal greater detail about ther requirement that matched controls have the same 3 bp the true, underlying DFE of 4D sites. mutational context as the 4D sites. In Drosophila, there is a New model of CUB significant effect of 3 bp context on mutation rate (Sharp and Agrawal 2016). For polymorphic sites we used the ancestral Our finding that selection on CUB ranges from weak to allele for matching (polarized from D. simulans), providing a strong directly contradicts the standard Li–Bulmer model of

524 H. E. Machado, D. S. Lawrie, and D. A. Petrov mutations with similar fitness and phenotypic effect sizes to missense mutations, often enriched in genes with high CUB, or causing significant changes in the Codon Adaptation Index (Sharon et al. 2018; She and Jarosz 2018). We propose a new model where the strength of selection per codon varies from nonexistent to strong within a gene, with the level of CUB in a gene set primarily by the distribution of selection coefficients across sites. Genes that have high CUB under our model would have more sites subject to strong selection in favor of preferred codons compared to genes with low CUB, as we in fact see in the data. This eliminates the problem of setting the proportion of preferred codons by fine- tuning the strength of selection at all preferred sites to a

particular value of s 1/Ne under the Li–Bulmer model. In addition, under our model, a substantial proportion of pre- Figure 7 Polymorphism ratio by codon preference and exonic position. ferred codons is subject to such strong purifying selection Polymorphism ratio by class of codon preference and the position along (s .. 1/Ne), that reduction in effective population size by the exon (top) or the exon position along the gene (bottom). U: unpre- orders of magnitude due either to demographic shifts or mod- ferred; P: preferred. Arrows separate ancestral (pre-arrow) and derived (postarrow) states. Error bars are 2 SE. ulation in the strength of genetic draft would still not abolish CUB, as many preferred sites would still remain subject to selection on CUB. The Li–Bulmer model assumes a constant strong selection (s . 1/Ne). At the other extreme, a substan- selection coefficient for a codon, and, given the intermediate tial increase in effective population size would not generate proportion of preferred codons observed in many species, complete CUB as many preferred sites may not be subject to predicts that selection on CUB is weak (Li 1987; Bulmer purifying selection at all. 1991). However, the Li–Bulmer model has not always agreed If this model is correct, the key question that remains is with the data. First, since population sizes vary by several what determines whether a particular synonymous site is orders of magnitude across species, the selection coefficient subject to strong, weak, or no selection in favor of preferred would have to vary inversely by several orders of magnitude codons. Specifically, the sites under very strong selection as well in order to result in the observed intermediate levels might play a disproportionately important role by, for exam- of CUB (Hershberg and Petrov 2008). There is no intuitive ple, being essential for cotranslational folding, transcription, reason to think that the selection coefficient would be in- RNA stability, translational efficiency, or translational accu- versely related to the population size, or that it should vary racy.This would suggest that the location of such synonymous by several orders of magnitude. Second, if selection is weak, sites should be largely conserved across species, as we in fact there should be more CUB in high recombination rate re- detect to some extent by showing a correlation between gions. This prediction comes from the increased effect of polymorphism ratio and phylogenetic constraint in the Hill-Robertson interference (linked selection) in low recom- Drosophila genus (Figure 2). bination rate regions (Felsenstein 1974; Charlesworth and Conclusion Campos 2014). While there is some evidence for a correlation between CUB and recombination rate in D. melanogaster We find evidence that CUB is under a substantial amount of (Kliman and Hey 1993; Campos et al. 2012), this is not true purifying selection in D. melanogaster, and that this is not for the D. melanogaster X chromosome (Singh et al. 2005; limited to weak selection. Our finding that there is a distri- Campos et al. 2013), and the correlations that have been bution of fitness effects for CUB, ranging from weak to strong found can be explained by mutation rate variation (Marais selection, resolves the contradiction between the intermedi- et al. 2001). Third, though previous studies of polymorphism ate frequencies of preferred codons observed in most species, data have inferred selection on synonymous sites to be weak and the population-size independence of said frequencies. (in keeping with Li–Bulmer), these patterns of polymorphism We also reconcile the observations that changes in synony- have also suggested a more complex selection model among mous codons can have large phenotypic effects, but that ge- 4D codons than simply preferred vs. unpreferred, contraven- nomic methods have identified only weak selection. We ing the model’s standard formulation (Singh et al. 2007; Zeng suggest that the reasons previous studies did not find evi- 2010; Clemente and Vogl 2012a). Fourth, there is experi- dence for strong selection on CUB are methodological. Our mental evidence that changes in one or more synonymous use of a test that includes the polymorphism level, while codons can have large phenotypic and fitness effects, suggest- controlling for mutation rate and linked selection, provides ing that selection on CUB is not always weak (Zhou et al. sufficient power for identifying strong purifying selection. 1999; Carlini and Stephan 2003; Carlini 2004; Sharon et al. While this study was performed in Drosophila, the impor- 2018; She and Jarosz 2018; Yannai et al. 2018). Indeed, tance of a new model of CUB is general, as both CUB and experiments in yeast have found large numbers of synonymous the assumption of constant weak selection on CUB is

Strong Selection on Codon Usage Bias 525 widespread. Further, this study underscores the importance lution in Drosophila. Annu. Rev. Genet. 48: 383–403. https:// of CUB, and of synonymous variation in general, to the fitness doi.org/10.1146/annurev-genet-120213-092525 of an organism, and opens research directions to further un- Chen, S. L., W. Lee, A. K. Hottes, L. Shapiro, and H. H. McAdams, 2004 Codon usage between genomes is constrained by ge- derstand this phenomenon. nome-wide mutational processes. Proc. Natl. Acad. Sci. USA 101: 3480–3485. https://doi.org/10.1073/pnas.0307827100 Clemente, F., and C. Vogl, 2012a Evidence for complex selection Acknowledgments on four-fold degenerate sites in Drosophila melanogaster. J. Evol. Biol. 25: 2582–2595. https://doi.org/10.1111/jeb.12003 The authors would like to thank Arbel Harkpak, Jamie Clemente, F., and C. Vogl, 2012b Unconstrained in Blundell, Marc Feldman, Elizabeth Hadly, and Jonathan short introns? an analysis of genome-wide polymorphism and Pritchard for helpful discussion of the project. We would like divergence data from Drosophila. J. Evol. Biol. 25: 1975–1990. to thank Emily Ebel, Alison Feder, Sandeep Venkataram, https://doi.org/10.1111/j.1420-9101.2012.02580.x Coffman, A. J., P. H. Hsieh, S. Gravel, and R. N. Gutenkunst, Kerry Geiler-Samerotte, Chris McFarland, David Enard, Tori 2016 Computationally efficient composite likelihood statistics Arch, and the Editor and the anonymous reviewers for very for demographic inference. Mol. Biol. Evol. 33: 591–593. helpful feedback on the manuscript. The GO Fish population https://doi.org/10.1093/molbev/msv255 genetics simulations were run on a Titan Xp donated by the Cooper, G. M., E. A. Stone, G. Asimenos, E. D. Green, S. Batzoglou NVIDIA Corporation. et al., 2005 Distribution and intensity of constraint in mam- malian genomic sequence. Genome Res. 15: 901–913. https:// Paper babies: Oliver and Christopher Lawrie and Ellie de doi.org/10.1101/gr.3577405 Glanville Blundell. Comeron, J. M., R. Ratnappan, and S. Bailin, 2012 The many landscapes of recombination in Drosophila melanogaster. PLoS Genet. 8: e1002905. https://doi.org/10.1371/journal.pgen. 1002905 Literature Cited Cutter, A. D., and B. Charlesworth, 2006 Selection intensity on preferred codons correlates with overall codon usage bias in Akashi, H., 1994 Synonymous codon usage in Drosophila mela- Caenorhabditis remanei. Curr. Biol. 16: 2053–2057. https:// nogaster: natural selection and translational accuracy. Genetics doi.org/10.1016/j.cub.2006.08.067 136: 927–935. Drummond, D. A., and C. O. Wilke, 2008 Mistranslation-induced Andolfatto, P., K. M. Wong, and D. Bachtrog, 2011 Effective pop- protein misfolding as a dominant constraint on coding-sequence ulation size and the efficacy of selection on the X chromosomes evolution. Cell 134: 341–352. https://doi.org/10.1016/j.cell. of two closely related Drosophila species. Genome Biol. Evol. 3: 2008.05.042 114–128. https://doi.org/10.1093/gbe/evq086 Dunn,J.G.,C.K.Foo,N.G.Belletier,E.R.Gavis,andJ.S.Weissman, Brooks, A. N., M. O. Duff, G. May, L. Yang, M. Bolisetty et al., 2013 Ribosome profiling reveals pervasive and regulated stop 2015 Regulation of alternative splicing in Drosophila by codon readthrough in Drosophila melanogaster. eLife 2: e01179. 56 RNA binding proteins. Genome Res. 25: 1771–1780. https://doi.org/10.7554/eLife.01179 https://doi.org/10.1101/gr.192518.115 Eyre-Walker, A., and P. D. Keightley, 2007 The distribution of Brown, J. B., N. Boley, R. Eisman, G. E. May, M. H. Stoiber et al., fitness effects of new mutations. Nat. Rev. Genet. 8: 610–618. 2014 Diversity and dynamics of the Drosophila transcriptome. https://doi.org/10.1038/nrg2146 Nature 512: 393–399. https://doi.org/10.1038/nature12962 Eyre-Walker, A., and P. D. Keightley, 2009 Estimating the rate of Bulmer, M., 1991 The selection-mutation-drift theory of synony- adaptive molecular evolution in the presence of slightly delete- mous codon usage. Genetics 129: 897–907. rious mutations and population size change. Mol. Biol. Evol. 26: Campos, J. L., B. Charlesworth, and P. R. Haddrill, 2012 Molecular 2097–2108. https://doi.org/10.1093/molbev/msp119 evolution in nonrecombining regions of the Drosophila mela- Eyre-Walker, A., M. Woolfit, and T. Phelps, 2006 The distribution nogaster genome. Genome Biol. Evol. 4: 278–288. https:// of fitness effects of new deleterious amino acid mutations in doi.org/10.1093/gbe/evs010 humans. Genetics 173: 891–900. https://doi.org/10.1534/ge- Campos, J. L., K. Zeng, D. J. Parker, B. Charlesworth, and P. R. netics.106.057570 Haddrill, 2013 Codon usage bias and effective population Feder, A. F., D. A. Petrov, and A. O. Bergland, 2012 LDx: estima- sizes on the X chromosome vs. the autosomes in Drosophila tion of linkage disequilibrium from high-throughput pooled re- melanogaster. Mol. Biol. Evol. 30: 811–823. https://doi.org/ sequencing data. PLoS One 7: e48588. https://doi.org/10.1371/ 10.1093/molbev/mss222 journal.pone.0048588 Carlini, D. B., 2004 Experimental reduction of codon bias in the Felsenstein, J., 1974 The evolutionary advantage of recombina- Drosophila alcohol dehydrogenase gene results in decreased tion. Genetics 78: 737–756. ethanol tolerance of adult flies. J. Evol. Biol. 17: 779–785. Gebert, D., J. Jehn, and D. Rosenkranz, 2019 Widespread selec- https://doi.org/10.1111/j.1420-9101.2004.00725.x tion for extremely high and low levels of secondary structure in Carlini, D. B., and W. Stephan, 2003 In vivo introduction of un- coding sequences across all domains of life. Open Biol. 9: preferred synonymous codons into the Drosophila Adh gene 190020. https://doi.org/10.1098/rsob.190020 results in reduced levels of ADH protein. Genetics 163: 239– Gouy, M., and C. Gautier, 1982 Codon usage in bacteria: correla- 243. tion with gene expressivity. Nucleic Acids Res. 10: 7055–7074. Carneiro, R. L., R. D. Requião, S. Rossetto, T. Domitrovic, and F. L. https://doi.org/10.1093/nar/10.22.7055 Palhano, 2019 Codon stabilization coefficient as a metric to Grantham, R., C. Gautier, M. Gouy, R. Mercier, and A. Pavé, gain insights into mRNA stability and codon bias and their re- 1980 Codon catalog usage and the genome hypothesis. Nu- lationships with translation. Nucleic Acids Res. 47: 2216–2228. cleic Acids Res. 8: r49–r62. https://doi.org/10.1093/nar/ https://doi.org/10.1093/nar/gkz033 8.1.197-c Charlesworth, B., and J. Campos, 2014 The relations between Grantham, R., C. Gautier, M. Gouy, M. Jacobzone, and R. Mercier, recombination rate and patterns of molecular variation and evo- 1981 Codon catalog usage is a genome strategy modulated for

526 H. E. Machado, D. S. Lawrie, and D. A. Petrov gene expressivity. Nucleic Acids Res. 9: r43–r74. https:// Marais, G., D. Mouchiroud, and L. Duret, 2001 Does recombina- doi.org/10.1093/nar/9.1.213-b tion improve selection on codon usage? Lessons from nematode Haddrill, P. R., B. Charlesworth, D. L. Halligan, and P. Andolfatto, and fly complete genomes. Proc. Natl. Acad. Sci. USA 98: 5688– 2005 Patterns of intron sequence evolution in Drosophila are 5692. https://doi.org/10.1073/pnas.091427698 dependent upon length and GC content. Genome Biol. 6: R67. McVean, G. A. T., and B. Charlesworth, 1999 A population genet- https://doi.org/10.1186/gb-2005-6-8-r67 ic model for the evolution of synonymous codon usage: patterns Halligan, D. L., and P. D. Keightley, 2006 Ubiquitous selective and predictions. Genet. Res. 74: 145–158. https://doi.org/ constraints in the Drosophila genome revealed by a genome- 10.1017/S0016672399003912 wide interspecies comparison. Genome Res. 16: 875–884. Nelder, J. A., and R. Mead, 1965 A simplex method for https://doi.org/10.1101/gr.5022906 function minimization. Comput. J. 7: 308–313. https://doi.org/ Hershberg, R., and D. A. Petrov, 2008 Selection on codon bias. 10.1093/comjnl/7.4.308 Annu.Rev.Genet.42:287–299. https://doi.org/10.1146/ Newman,Z.R.,J.M.Young,N.T.Ingolia,andG.M.Barton, annurev.genet.42.110807.091442 2016 Differences in codon bias and GC content contribute to the Hu, T. T., M. B. Eisen, K. R. Thornton, and P. Andolfatto, 2013 A balanced expression of TLR7 and TLR9. Proc. Natl. Acad. Sci. USA second-generation assembly of the Drosophila simulans genome 113: E1362–E1371. https://doi.org/10.1073/pnas.1518976113 provides new insights into patterns of lineage-specific divergence. Novoa, E. M., and L. Ribas de Pouplana, 2012 Speeding with Genome Res. 23: 89–98. https://doi.org/10.1101/gr.141689.112 control: codon usage, tRNAs, and ribosomes. Trends Genet. Ikemura, T., 1981 Correlation between the abundance of 28: 574–581. https://doi.org/10.1016/j.tig.2012.07.006 Escherichia coli transfer RNAs and the occurrence of the respec- Parsch, J., S. Novozhilov, S. S. Saminadin-Peter, K. M. Wong, and P. tive codons in its protein genes: a proposal for a synonymous Andolfatto, 2010 On the utility of short intron sequences as a codon choice that is optimal for the E. coli translational system. reference for the detection of positive and negative selection in J. Mol. Biol. 151: 389–409. https://doi.org/10.1016/0022- Drosophila. Mol. Biol. Evol. 27: 1226–1234. https://doi.org/ 2836(81)90003-6 10.1093/molbev/msq046 Ikemura, T., 1982 Correlation between the abundance of yeast Pechmann, S., and J. Frydman, 2013 Evolutionary conservation transfer RNAs and the occurrence of the respective codons in of codon optimality reveals hidden signatures of cotranslational protein genes: differences in synonymous codon choice patterns folding. Nat. Struct. Mol. Biol. 20: 237–243. https://doi.org/ of yeast and Escherichia coli with reference to the abundance of 10.1038/nsmb.2466 isoaccepting transfer RNAs. J. Mol. Biol. 158: 573–597. https:// Plotkin, J. B., and G. Kudla, 2011 Synonymous but not the same: doi.org/10.1016/0022-2836(82)90250-9 the causes and consequences of codon bias. Nat. Rev. Genet. 12: Jackson, B. C., J. L. Campos, P. R. Haddrill, B. Charlesworth, and K. 32–42. https://doi.org/10.1038/nrg2899 Zeng, 2017 Variation in the intensity of selection on codon Post, L. E., G. D. Strycharz, M. Nomura, H. Lewis, and P. P. Dennis, bias over time causes contrasting patterns of base composition 1979 Nucleotide sequence of the ribosomal protein gene clus- evolution in Drosophila. Genome Biol. Evol. 9: 102–123. ter adjacent to the gene for RNA polymerase subunit beta in https://doi.org/10.1093/gbe/evw291 Escherichia coli. Proc. Natl. Acad. Sci. USA 76: 1697–1701. Kliman, R. M., and J. Hey, 1993 Reduced natural selection asso- https://doi.org/10.1073/pnas.76.4.1697 ciated with low recombination in Drosophila melanogaster. Mol. Presnyak, V., N. Alhusaini, Y. H. Chen et al., 2015 Codon optimal- Biol. Evol. 10: 1239–1258. ity is a major determinant of mRNA stability. Cell 160: 1111– Kousathanas, A., and P. D. Keightley, 2013 A comparison of mod- 1124. https://doi.org/10.1016/j.cell.2015.02.029 els to infer the distribution of fitness effects of new mutations. Qian, W., J. R. Yang, N. M. Pearson, C. Maclean, and J. Zhang, Genetics 193: 1197–1208. https://doi.org/10.1534/genetics. 2012 Balanced codon usage optimizes eukaryotic translational 112.148023 efficiency. PLoS Genet. 8: e1002603. https://doi.org/10.1371/ Lack, J. B., C. M. Cardeno, M. W. Crepeau, W. Taylor, R. B. Corbett- journal.pgen.1002603 Detig et al., 2015 The Drosophila genome nexus: a population Robinson, M. C., E. A. Stone, and N. D. Singh, 2014 Population genomic resource of 623 Drosophila melanogaster genomes, genomic analysis reveals No evidence for GC-biased gene con- including 197 from a single ancestral range population. Genet- version in Drosophila melanogaster. Mol. Biol. Evol. 31: 425– ics 199: 1229–1241. https://doi.org/10.1534/genetics.115. 433. https://doi.org/10.1093/molbev/mst220 174664 Rodriguez, A., G. Wright, S. Emrich, and P. L. Clark, 2018 %Min- Lampson, B., N. Pershing, J. Prinz, J. R. Lacsina, W. F. Marzluff Max: a versatile tool for calculating and comparing synonymous et al., 2013 Rare codons regulate KRas oncogenesis. Curr. Biol. codon usage and its impact on protein folding. Protein Sci. 27: 23: 70–75. https://doi.org/10.1016/j.cub.2012.11.031 356–362. https://doi.org/10.1002/pro.3336 Lawrie, D. S., 2017 Accelerating wright-Fisher forward simula- Savisaar, R., and L. D. Hurst, 2018 Exonic splice regulation im- tions on the graphics processing unit. G3: genes, genomes. Ge- poses strong selection at synonymous sites. Genome Res. 28: netics 7: 3229–3236. 1442–1454. https://doi.org/10.1101/gr.233999.117 Lawrie, D. S., D. A. Petrov, and P. W. Messer, 2011 Faster than Sharp, P. M., and W. H. Li, 1986 Codon usage in regulatory genes neutral evolution of constrained sequences: the complex inter- in Escherichia coli does not reflect selection for ‘rare’ codons. play of mutational biases and weak selection. Genome Biol. Nucleic Acids Res. 14: 7737–7749. https://doi.org/10.1093/ Evol. 3: 383–395. https://doi.org/10.1093/gbe/evr032 nar/14.19.7737 Lawrie, D. S., P. W. Messer, R. Hershberg, and D. A. Petrov, Sharp, N. P., and A. F. Agrawal, 2016 Low genetic quality alters 2013 Strong purifying selection at synonymous sites in D. mel- key dimensions of the mutational spectrum. PLoS Biol. 14: anogaster. PLoS Genet. 9: e1003527. https://doi.org/10.1371/ e1002419. https://doi.org/10.1371/journal.pbio.1002419 journal.pgen.1003527 Sharp, P. M., L. R. Emery, and K. Zeng, 2010 Forces that influence Li, W. H., 1987 Models of nearly neutral mutations with particular the evolution of codon bias. Philos. Trans. R. Soc. Lond. B Biol. implications for nonrandom usage of synonymous codons. J. Mol. Sci. 365: 1203–1212. https://doi.org/10.1098/rstb.2009.0305 Evol. 24: 337–345. https://doi.org/10.1007/BF02134132 Sharon, E., S. A. Chen, N. M. Khosla, J. D. Smith, J. K. Pritchard Mackay,T.F.C.,S.Richards,E.A.Stone,A.Barbadilla,J.F.Ayroles et al., 2018 Functional genetic variants revealed by massively et al., 2012 The Drosophila melanogaster genetic reference panel. parallel precise genome editing. Cell 175: 544–557.e16. https:// Nature 482: 173–178. https://doi.org/10.1038/nature10811 doi.org/10.1016/j.cell.2018.08.057

Strong Selection on Codon Usage Bias 527 She, R., and D. F. Jarosz, 2018 Mapping causal variants with Yannai, A., S. Katz, and R. Hershberg, 2018 The codon usage of single-nucleotide resolution reveals biochemical drivers of phe- lowly expressed genes is subject to natural selection. Genome notypic change. Cell 172: 478–490.e15. https://doi.org/10.1016/ Biol. Evol. 10: 1237–1246. https://doi.org/10.1093/gbe/evy084 j.cell.2017.12.015 Zeng, K., 2010 A simple multiallele model and its application to Singh, N. D., J. C. Davis, and D. A. Petrov, 2005 Codon bias and identifying preferred–unpreferred codons using polymorphism noncoding GC content correlate negatively with recombination data. Mol. Biol. Evol. 27: 1327–1337. https://doi.org/10.1093/ rate on the Drosophila X chromosome. J. Mol. Evol. 61: 315– molbev/msq023 324. https://doi.org/10.1007/s00239-004-0287-1 Zeng K, Charlesworth B 2009 Estimating selection intensity on Singh, N. D., V. L. Bauer DuMont, M. J. Hubisz, R. Nielsen, and C. synonymous codon usage in a nonequilibrium population. Ge- F. Aquadro, 2007 Patterns of mutation and selection at synon- netics 183: 651–662. https://doi.org/10.1534/genetics.109. ymous sites in Drosophila. Mol. Biol. Evol. 24: 2687–2697. 101782 https://doi.org/10.1093/molbev/msm196 Zeng, K., and B. Charlesworth, 2010 Studying patterns of recent Singh, N. D., J. D. Jensen, A. G. Clark, and C. F. Aquadro, evolution at synonymous sites and intronic sites in Drosophila 2013 Inferences of demography and selection in an African melanogaster. J. Mol. Evol. 70: 116–128. https://doi.org/ population of Drosophila melanogaster. Genetics 193: 215–228. 10.1007/s00239-009-9314-6 https://doi.org/10.1534/genetics.112.145318 Zhou, J., W. J. Liu, S. W. Peng, X. Y. Sun, and I. Frazer, Stoletzki, N., and A. Eyre-Walker, 2007 Synonymous codon usage 1999 Papillomavirus capsid protein expression level depends in Escherichia coli: selection for translational accuracy. Mol. Biol. on the match between codon usage and tRNA availability. Evol. 24: 374–381. https://doi.org/10.1093/molbev/msl166 J. Virol. 73: 4972–4982. Tataru, P., M. Mollion, S. Glémin, and T. Bataillon, 2017 Inference Zhou, Z., Y. Dang, M. Zhou, L. Li, C. H. Yu et al., 2016 Codon of distribution of fitness effects and proportion of adaptive sub- usage is an important determinant of gene expression levels stitutions from polymorphism data. Genetics 207: 1103–1119. largely through its effects on transcription. Proc. Natl. Acad. https://doi.org/10.1534/genetics.117.300323 Sci. USA 113: E6117–E6125. https://doi.org/10.1073/pnas. Wright, S., 1938 The distribution of gene frequencies under irre- 1606724113 versible mutation. Proc. Natl. Acad. Sci. USA 24: 253–259. https://doi.org/10.1073/pnas.24.7.253 Communicating editor: C. Langley

528 H. E. Machado, D. S. Lawrie, and D. A. Petrov