Expression of Tandem Gene Duplicates Is Often Greater Than Twofold
David W. Loehlin(a,b) and Sean B. Carroll(a,b,1)

(a) Howard Hughes Medical Institute, University of Wisconsin-Madison, Madison, WI 53706; and (b) Laboratory of Cell & Molecular Biology, University of Wisconsin-Madison, Madison, WI 53706

Contributed by Sean B. Carroll, April 13, 2016 (sent for review March 11, 2016; reviewed by Daniel L. Hartl and Harmit S. Malik)

Tandem gene duplication is an important mutational process in evolutionary adaptation and human disease. Hypothetically, two tandem gene copies should produce twice the output of a single gene, but this expectation has not been rigorously investigated. Here, we show that tandem duplication often results in more than double the gene activity. A naturally occurring tandem duplication of the Alcohol dehydrogenase (Adh) gene exhibits 2.6-fold greater expression than the single-copy gene in transgenic Drosophila. This tandem duplication also exhibits greater activity than two copies of the gene in trans, demonstrating that it is the tandem arrangement and not copy number that is the cause of overactivity. We also show that tandem duplication of an unrelated synthetic reporter gene is overactive (2.3- to 5.1-fold) at all sites in the genome that we tested, suggesting that overactivity could be a general property of tandem gene duplicates. Overactivity occurs at the level of RNA transcription, and therefore tandem duplicate overactivity appears to be a previously unidentified form of position effect. The increment of surplus gene expression observed is comparable to many regulatory mutations fixed in nature and, if typical of other genomes, would shape the fate of tandem duplicates in evolution.

tandem duplication | gene expression | position effect | gene structure | genome evolution

Significance

Differences among individuals and species originate from changes to the genome. Yet our knowledge of the principles that might allow prediction of the effects of any particular mutation is limited. One such prediction might be that duplicating a gene would double the gene’s output. We show that this is actually not the case in Drosophila flies. Instead, in almost all of the cases we tested (using a naturally occurring and an artificially constructed tandem duplicate gene), we observed that the output of the duplicated genes was greater than double the output of single copies, as much as five times greater. This finding suggests that tandem duplicate genes could have disproportionate effects when they occur.

Author contributions: D.W.L. and S.B.C. designed research; D.W.L. performed research; D.W.L. contributed new reagents/analytic tools; D.W.L. analyzed data; and D.W.L. and S.B.C. wrote the paper.
Reviewers: D.L.H., Harvard University; and H.S.M., Fred Hutchinson Cancer Research Center.
The authors declare no conflict of interest.
Freely available online through the PNAS open access option.
Data deposition: The DNA sequences reported in this paper have been deposited in the GenBank database [accession no. KU559568 (Drosophila virilis Adh locus)].
(1) To whom correspondence should be addressed. Email: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1605886113/-/DCSupplemental.
5988–5992 | PNAS | May 24, 2016 | vol. 113 | no. 21 | www.pnas.org/cgi/doi/10.1073/pnas.1605886113

Evolutionarily and medically relevant phenotypes often derive from quantitative changes in gene expression. It is becoming increasingly appreciated that relatively modest changes in gene expression or protein activity can have meaningful effects. For example, alleles with 1.1- to 1.6-fold effects on transcription or enzyme activity (1) have been identified that show evidence for selection, including Adh in Drosophila melanogaster and Lactase and Prodynorphin in humans (1–3). Furthermore, most transcriptional variation in Drosophila species is on the order of twofold or less (4). Understanding the mutational basis of these activity changes is a necessary step to predict phenotypes based on genomic sequences.

One simple way for gene activity to double is through tandem gene duplication. Gene duplication is a common mutational process, occurring with estimated rates of 10^−9 to 10^−7 new duplicates per gene per generation in flies, worms, and yeast (5, 6). Gene duplication has been of long-standing interest in evolution because, once genes have duplicated, one copy may acquire a novel function (7, 8), and many genes involved in physiological and developmental diversification occur as tandem duplicates in gene complexes. However, relatively little is known empirically about the first step in this process: the immediate phenotypic consequences of a single gene duplication. This may be due to the difficulty of isolating the effects of increased copy number from any potential contribution of subsequent sequence divergence to gene expression of a duplicate pair. Here, we uncovered an effect of tandem duplications on gene activity in the Drosophila melanogaster genome that is greater than twofold. We suggest that this phenomenon, which we refer to as “tandem duplicate overactivity,” may be a previously unidentified type of position effect on gene expression.

Results and Discussion

Tandem Duplication of Adh Is Overactive. We encountered the possibility that tandem gene duplicates might not simply produce a twofold increase in gene output in the course of pursuing the genetic basis of the sixfold greater ADH enzyme activity in brewery-adapted Drosophila virilis relative to its sibling Drosophila americana (Fig. 1A). Two copies of the entire D. virilis gene, including all known regulatory elements, occur within a 7-kb tandem duplication, whereas the orthologous sequence in D. americana is single copy (9). We cloned the duplicated Adh region from D. virilis and found that the two duplicate copies in our laboratory strain were nearly identical, with only three distinguishing single-nucleotide changes located distal to the transcription unit (Fig. 1B). We therefore presumed that the tandem duplication would account for twofold higher activity, with the remaining threefold change in activity accounted for by subsequent changes in regulatory or coding sequences.

We tested this presumption by inserting duplicate and single-copy D. virilis Adh transgenes (Fig. 1B) into an inbred Adh-null D. melanogaster recipient line at a specific chromosomal insertion site (attP ZH-86Fb), followed by measurement of ADH activity from whole-fly homogenates. We were surprised to observe 2.6-fold higher ADH enzyme activity from the duplicates than from the single copy of D. virilis Adh (Fig. 1C). The difference between single and duplicate was significantly greater than expected (t test, P = 0.0005; see Tables S1–S14 for details of underlying mixed-effects models). In addition, we tested for any effect of the between-copy nucleotide changes by engineering a construct where the left and the right genes were identical (Fig. 1B). ADH activity from this identical-duplicate construct was indistinguishable from the original cloned duplicate (t test, P = 0.64;

[...] limited to one chromosomal location. We tested the first hypothesis by constructing duplications of an unrelated gene, the well-studied synthetic reporter gene vgQ-lacZ that consists of the Escherichia coli β-galactosidase reporter gene linked to the ∼800-bp quadrant enhancer of the D. melanogaster vestigial gene (10). We inserted single and duplicate constructs into the same insertion site used above and then measured β-galactosidase activity in third-instar wing imaginal disk cells. The activity of duplicate transgenes relative to singletons was again significantly greater than twofold (t test, P = 0.01; Fig. 3A), even though the gene, tissue, and measurement assay used were completely different.

We examined whether duplicate overactivity was dependent on chromosomal position by inserting single and duplicate vgQ-lacZ transgenes at six additional sites. We selected attP insertion sites that are commonly used by Drosophila researchers because of their faithful expression of transgenes. The vgQ-lacZ duplicates were significantly overactive at all sites (Fig. 3 and Tables S1–S14), with duplicate activity ranging from 2.3-fold (in four of seven sites) to 5.1-fold higher than singletons. Even considering the possibility that the insertion sites selected are a biased sample of the genome, this result suggests that overactivity is common and could have a typical value. It also suggests that the degree of overactivity is influenced by chromosome location.

[Fig. 1 image: (A) ADH activity (ΔAbs340 per min mg soluble protein) in D. americana and D. virilis; (B) construct schematic with 1-kb scale bar: Dup, Ident_dup, Single; (C) ADH activity at site ZH-86Fb for Dup, Ident_dup, and Single, with a dashed line at 2x Single.]

Fig. 1. Tandem duplication of Adh from D. virilis is overactive. (A) ADH enzyme activity is sixfold higher in D. virilis than D. americana. Boxplots show median and interquartile range with thin lines extending to the lesser of 1.5× the interquartile range or the data extremes. n = 15 samples. (B) Schematic of the tandem duplicated Adh locus in D. virilis (“Dup”). Vertical bars delimit the duplicated region. Ovals mark the three nucleotides that distinguish the left copy from the right copy. Also shown are engineered constructs with SNPs removed (“Ident_dup”) and the isolated single copy (“Single”). (C) ADH activity of D. melanogaster flies (ZH-86Fb attP site, Adh-null) transformed with D. virilis Single and Dup constructs. Dashed line shows predicted twofold mean activity of the Single construct. Error bars show 95% confidence interval of means (Tables S1–S14). Sample sizes for this and subsequent plots are in Tables S1–S14. We verified that assay measurements scaled one-to-one with homogenate concentration (Fig. S1).

[Figure panel labels: Dup (Adh Adh), Single (Adh), Uninserted; Site ZH-86Fb; P = 0.001]

Fig.