Adventitious Changes in Long-Range Gene Expression Caused by Polymorphic Structural Variation and Promoter Competition
Karen M. Lower^a, Jim R. Hughes^a, Marco De Gobbi^a, Shirley Henderson^b, Vip Viprakasit^c, Chris Fisher^a, Anne Goriely^a, Helena Ayyub^a, Jackie Sloane-Stanley^a, Douglas Vernimmen^a, Cordelia Langford^d, David Garrick^a, Richard J. Gibbons^a, and Douglas R. Higgs^a,1

^aMedical Research Council Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, The John Radcliffe Hospital, Headington, Oxford, OX3 9DS, United Kingdom; ^bNational Haemoglobinopathy Reference Laboratory, Oxford Radcliffe Hospitals NHS Trust, Oxford, OX3 7LJ, United Kingdom; ^cDepartment of Paediatrics, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok, 10700, Thailand; and ^dMicroarray Facility, The Wellcome Trust Sanger Institute, Cambridge, CB10 1SA, United Kingdom

Edited by Mark T. Groudine, Fred Hutchinson Cancer Research Center, Seattle, WA, and approved October 14, 2009 (received for review August 17, 2009)

It is well established that all of the cis-acting sequences required for fully regulated human α-globin expression are contained within a region of ≈120 kb of conserved synteny. Here, we show that activation of this cluster in erythroid cells dramatically affects expression of apparently unrelated and noncontiguous genes in the 500 kb surrounding this domain, including a gene (NME4) located 300 kb from the α-globin cluster. Changes in NME4 expression are mediated by physical cis-interactions between this gene and the α-globin regulatory elements. Polymorphic structural variation within the globin cluster, altering the number of α-globin genes, affects the pattern of NME4 expression by altering the competition for the shared α-globin regulatory elements. These findings challenge the concept that the genome is organized into discrete, insulated regulatory domains. In addition, this work has important implications for our understanding of genome evolution, the interpretation of genome-wide expression, expression-quantitative trait loci, and copy number variant analyses.

chromosome looping | copy number variants | globin gene expression | allele-specific expression | 4C

Recent global analyses of mammalian genomes have revised our view of the relationship between genome organization and the regulation of gene expression. It was previously thought that the genome might be arranged as a series of independently regulated chromosomal domains flanked by boundary elements (1). In contrast, it is now clear that cis-acting regulatory elements (locus control regions, enhancers, silencers, enhancer blockers, and chromatin barrier elements), controlling tissue- or developmental stage-specific genes, may be dispersed over tens to thousands of kilobases (2, 3). Furthermore, we now know that in gene-rich regions such elements are commonly interspersed with widely expressed genes (2). These observations raise important general questions, such as: How does the activation of specialized regulatory elements and their cognate genes influence the expression of other apparently unrelated genes in a shared chromosomal environment? How do common structural variants which alter genome architecture affect gene expression? What, if any, are the consequences of such apparently adventitious effects on gene expression?

To investigate these issues in detail we have examined the pattern of gene expression across a large segment of the human genome and studied how polymorphic variation in this region may influence long-range patterns of gene expression. In particular, we analyzed a well-characterized, gene-dense, telomeric region of the genome (16p13.3) containing the human α-like globin genes (ζ, α2, and α1), which are activated and transcribed at very high levels only in erythroid cells (4, 5). We have previously identified critical, remote regulatory elements controlling α-globin expression, MCS-R1 to -R4 (representing previously identified DNaseI hypersensitive sites HS-48, HS-40, HS-33, and HS-10, respectively), three of which lie within the introns of a widely expressed gene (C16orf35) lying 50 to 70 kb upstream of the α-like genes (4). In addition, we have also shown that a 120-kb region of conserved synteny containing the human α-like globin genes, together with their major upstream regulatory element (MCS-R2, also called HS-40), is sufficient to obtain optimal tissue- and developmental stage-specific expression in a mouse model (6). However, here we have asked whether globin gene activation within this region has more far-reaching consequences, by affecting the expression of apparently unrelated genes in the surrounding chromosomal neighborhood.

To investigate this hypothesis, we examined the expression of 14 genes in an extensive region (500 kb) surrounding the α-globin cluster in nonerythroid cells (when the α-globin genes are silent) and in erythroid cells (when the α-globin genes and their regulatory elements are fully active). When the α-globin genes are switched on, expression of the functionally unrelated gene (C16orf35), containing the α-globin regulatory elements, is increased by ≈30-fold. In addition, we have shown that another apparently unrelated gene (NME4), located 300 kb from the α-globin cluster (in which we have identified a potential erythroid cis-acting element), physically interacts with, and is regulated by, MCS-R2, such that its expression is also increased 10-fold in erythroid cells. All other genes lying between MCS-R2 and NME4 are unaffected. When the α-globin genes are deleted from this chromosomal region, expression of NME4 (300 kb away) is further increased by 8-fold, as a result of increased competition for the shared regulatory element (MCS-R2). Because α-globin deletions have been selected to reach high frequencies in many populations (as they cause α-thalassemia, which protects against falciparum malaria), the levels of NME4 will be expected to vary in such populations in parallel with changes in the number of cis-linked α-globin genes.

This study therefore demonstrates a common mechanism by which patterns and levels of gene expression across a large chromosomal region may radically change in an unexpected way. Common structural polymorphisms in the α-globin genes, which have been selected during evolution, have a dramatic effect on expression of an unrelated gene (NME4) lying 300 kb away in what appears to be a shared chromosomal environment. These findings have important, general implications for the evolution of the genome, and for understanding how common expression quantitative trait loci and copy number variants (CNVs) may influence gene expression across long segments of the human genome.

Author contributions: K.M.L., D.G., R.J.G., and D.R.H. designed research; K.M.L., J.R.H., M.D.G., H.A., and D.V. performed research; J.R.H., S.H., V.V., C.F., A.G., J.S.-S., and C.L. contributed new reagents/analytic tools; K.M.L. analyzed data; and K.M.L., R.J.G., and D.R.H. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
1To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/cgi/content/full/0909331106/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.0909331106

Fig. 1. Overview and expression analysis of the terminal 500 kb of human chromosome 16p, containing the α-globin cluster. (A) Representation of the genes contained within this chromosomal region. Conserved synteny with the mouse region and the MCS-R region, which is conserved and required for full expression of the α-globin genes, are shown. The minimal regions deleted, in all cases of α-thalassemia affecting the MCS-R elements (ΔMCS-R), one (-α), or two (--) α genes, are shown. ChIP for the activating chromatin mark H4ac, the erythroid-specific binding factors GATA1 and SCL, and RNA polymerase II (PolII) were carried out in erythroid cells and hybridized to a tiled microarray covering this region (ChIP-chip). Tracks are representative of a minimum of two biological replicates. (B) Expression of genes contained within this region, and an erythroid control gene EPOR, in hES and erythroid cells. Expression was normalized to 18S. Values represent an average of three biological replicates ± 1 standard deviation. The y axis is a log scale. (C) Schematic representation of the genomic structure of NME4 and eNME4. (Black boxes) Exons; (gray box) alternative erythroid-specific exon; (full line) introns; (dashed lines) splicing of mature transcript. Amplicons used for expression analysis are shown; further information can be found in Tables S1–S3. The highly polymorphic SNP used for allele-specific expression (rs14293) is shown in red.
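The fold changes quoted in the text, and the Fig. 1B values (expression normalized to 18S, averaged over three biological replicates ± 1 standard deviation), can be illustrated with a minimal sketch. It assumes relative expression is a simple per-replicate ratio of target signal to 18S signal, and that the sample standard deviation is used; this excerpt does not state the exact calculation. The function names and all numbers below are hypothetical and are not data from the paper.

```python
from statistics import mean, stdev


def normalize_to_reference(target, reference):
    """Divide each target-gene signal by the matching reference (e.g. 18S) signal."""
    if len(target) != len(reference):
        raise ValueError("need one target and one reference value per replicate")
    return [t / r for t, r in zip(target, reference)]


def summarize(values):
    """Return (mean, sample standard deviation) across biological replicates."""
    return mean(values), stdev(values)


def fold_change(condition_mean, baseline_mean):
    """Ratio of mean normalized expression between two cell types."""
    return condition_mean / baseline_mean


# Hypothetical signals for one gene in three replicates (NOT data from the paper).
hes_gene, hes_18s = [2.0, 2.2, 1.8], [1000.0, 1000.0, 1000.0]
ery_gene, ery_18s = [20.0, 22.0, 18.0], [1000.0, 1000.0, 1000.0]

hes_mean, hes_sd = summarize(normalize_to_reference(hes_gene, hes_18s))
ery_mean, ery_sd = summarize(normalize_to_reference(ery_gene, ery_18s))
change = fold_change(ery_mean, hes_mean)

print(f"hES: {hes_mean:.4g} +/- {hes_sd:.2g}")
print(f"erythroid: {ery_mean:.4g} +/- {ery_sd:.2g}")
print(f"fold change (erythroid / hES): {change:.3g}")
```

Because the normalized values span several orders of magnitude between genes, plotting them on a log axis, as in Fig. 1B, keeps weakly and strongly expressed genes visible on the same panel.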
H3K4me3) in erythroid cells