The ␤-globin locus control region Reviews Locus control regions coming of age at a decade plus

The ␤-globin locus control region (LCR) is the founding member of a novel class of cis-acting regulatory elements that confer high level, tissue-specific, site-of-integration-independent, copy number-dependent expression on linked transgenes located in ectopic chromatin sites. Knowledge from ␤-globin and other LCR studies has shed light on our understanding of the long-range interaction between enhancers and promoters, the relationship between chromatin conformation and transcriptional regulation, and the developmental regulation of multiple gene loci. After over a decade of investigation and discovery, we take a retrospective look at the ␤-globin LCR and other LCRs, summarize their properties and review models of LCR function.

peculation regarding the presence of a cis-acting regu- They confer tissue-specific and physiological levels of Slatory element upstream of the ␤-globin genes, and the expression on linked genes, activate the transcription of subsequent discovery of the locus control region (LCR), transgenes in a position-independent, copy number- was a result of inferential data obtained from several dependent manner, and determine DNA replication timing sources. First, early studies of transgenic mice carrying and the origin utilized (Box 1). LCRs are novel elements, only the human ␤- or ␥-globin gene demonstrated that distinct from promoters, classical enhancers, chromoso- gene expression was significantly lower than endogenous mal insulators and matrix-attachment regions or scaffold- ␤-globin gene expression1–3. Moreover, transgene expres- attachment regions (MARs or SARs). They are more sion was observed only in a small proportion of transgenic complex than enhancers and insulators; in fact, they animals, indicating that they were susceptible to position incorporate the functions of both these elements. effects. These studies implied that a regulatory element Enhancers activate transcription (Box 1), but gene expres- must be present in the native locus that was missing in sion might be absent or variable at non-physiological these constructs. Second, in a number of human ␤- levels. This variation has been termed position effect (PE) thalassemias, the region upstream of the locus is deleted, and is thought to result from the positive or negative but the globin genes are intact. These genes function nor- effects of the chromatin surrounding the transgene. Unlike mally in vitro, but are silent and reside in inactive enhancers, LCRs buffer against PE. Insulators function as chromatin in vivo, suggesting that a regulatory element is boundary elements of chromatin structure, preventing PEs missing in the mutant loci4,5. Third, the existence of a regu- associated with transgenes without enhancing transcrip- latory element was suggested by the presence of develop- tion. In contrast to insulators, LCRs activate gene expres- mentally stable, erythroid-specific DNAse I-hypersensitive sion and can also contain MARs as part of their collection Qiliang Li* Ј ⑀ 6 sites (HSs) located 5 of the -globin gene . The functional of elements. li111640@u. involvement of these HSs in globin gene expression was washington.edu implied from studies of somatic cell hybrids7. When the Transcription- activity + chromosome carrying the ␤-globin locus in non-erythroid In the absence of the LCR, the transcription of the gene Susanna Harju [email protected] cells, where the HSs are normally absent, was introduced encoding human ␤-globin is usually less than 1% of the into an erythroid cell environment, the HSs were formed. murine ␤-globin mRNA in transgenic mice, if it is Kenneth R. Peterson+o Finally, studies in transgenic mice provided positive evi- expressed at all1–3. Inclusion of the LCR increases the [email protected] dence that this region conferred erythroid-specific, high- expression of the human ␤-globin gene to a level compara- *Division of Medical level expression upon a linked ␤-globin transgene in a ble with endogenous ␤-globin gene expression in all trans- Genetics, Mail Box position-independent, copy number-dependent manner8. genic animals, indicating that the LCR has strong 357720, Department of The ␤-globin LCR is located upstream of the ␤-globin enhancer activity8. LCR enhancer activity is also signifi- Medicine, University of gene and consists of five DNAse I-hypersensitive sites cant in situ, as demonstrated by LCR-deletion experi- Washington, Seattle, (5ЈHS1–5) (Fig. 1). 5ЈHS 1–4 are erythroid-specific, but ments9,10. These deletions in the native chromosome of WA 98195, USA. 5ЈHS5 is not. mouse or human cell lines severely reduce the expression + of globin genes. Department of Properties of LCRs The enhancer activity of the LCR resides in 5ЈHS2, 3 Biochemistry and Ј Molecular Biology and Technically, DNA fragments that are capable of enhanc- and 4, but not in 5 HS1 or 5 (Ref. 11 and references o Ј Department of Anatomy ing the expression of a linked gene that is integrated in therein). 5 HS2 behaves as a classical enhancer, in that its and Cell Biology, ectopic chromatin sites in a copy number-dependent man- activity can be detected in transient transfection assays. University of Kansas ner in transgenic mice are grouped together as LCRs. Enhancer activity in 5ЈHS3 or 4 can be detected only when Medical Center, 3901 LCRs are powerful genetic regulatory elements that are they are integrated into chromatin. A requirement for Rainbow Blvd, Kansas characterized phenotypically by three major properties. chromosomal integration suggests that alteration of City, KS 66160, USA.

0168-9525/99/$ – see front matter © 1999 Elsevier Science Ltd. All rights reserved. PII: S0168-9525(99)01780-1 TIG October 1999, volume 15, No. 10 403 Reviews The ␤-globin locus control region

FIGURE 1. The human β-globin locus BOX 1. Cis-acting transcriptional regulatory elements

(DCR) Locus control region (LCR) (LAR) Opens locus chromatin domains LCR Insulates against effects of surrounding positive or negative chromatin Has cell lineage-specific enhancer activity 5′HS 54 32 1 3′HS1 Influences timing of replication and choice of origin utilized ε Gγ Aγ ψβ δ β Enhancer Stimulates transcription in an orientation-independent manner (V)(IV)(III)(II)(I) (1) (2)(3)(4) Classical enhancers function independently of orientation and distance Hispanic deletion 10 kb Human LCR deletion Creates independent functional domain without enhancement or Mouse LCR deletion trends in Genetics activation function by blocking the effects of surrounding positive or negative chromatin The locus consists of five functional genes, indicated as white boxes, arrayed in their order of Interrupts communication between a and another regulatory developmental expression, 5Ј-⑀-G␥-A␥-␦-␤-3Ј. There are two developmental switches in globin chain element when placed between them synthesis coincident with changes in site and type of erythropoiesis. During primitive erythropoiesis, the ⑀-globin gene is expressed in the embryonic yolk sac. The first switch occurs at approximately Promoter eight weeks gestation; the ⑀-globin gene is silenced and the G␥- and A␥-globin genes are expressed Region of DNA at which RNA polymerase binds and initiates transcription during definitive erythropoiesis in the fetal liver. The second switch occurs shortly after birth; the ␥- globin genes are silenced and the ␤-globin gene and, to a lesser extent, the ␦-globin gene, are Matrix attachment region (MAR) or scaffold attachment activated in the bone marrow. 5ЈHS1–5 are located Ϫ6, Ϫ11, Ϫ15, Ϫ18 and Ϫ22 kb relative to the region (SAR) ⑀-globin gene, respectively, and are indicated by arrows. Another HS is located 20 kb downstream of DNA segment that might bind the nuclear scaffold the ␤-globin gene (3ЈHS1). 3ЈHS1 is found only in erythroid cells; its function remains unclear. A DNA loop between two MARs might form an independent chromosomal Transcription start sites for the functional globin genes are at the left edge of the white boxes in the domain diagram. The LCR and its HSs were named differently in early literature. At the Seventh Conference on Has no enhancer activity Hemoglobin Switching held at Airlie House, Virginia in 1990, senior participants agreed upon a uniform nomenclature. The Locus Control Region (LCR) was the compromise name to be used in place of Locus Activation Region (LAR) or Dominant Control Region (DCR). A standard numbering system for Copy number-dependent expression and chromatin- the individual HSs was adapted. The obsolete names and numbering for HSs are shown in domain opening parenthesis. This should be helpful to those reading earlier LCR-related publications. Lines below the Another property of the LCR is its ability to confer diagram of the locus indicate deletions of the LCR discussed in the text. The Hispanic deletion, which position-independent, copy number-dependent expression causes (␥␦␤)0 thalassemia in humans, extends an additional 20 kb 5Ј of the LCR 5ЈHS5 (Ref. 20). on a linked gene8. Copy number-dependent expression is widely considered to be indicative of open chromatin chromatin structure is involved in propagating the structure; that is, DNA that is accessible to transcription enhancer activity of 5ЈHS3 or 4. The functions of 5ЈHS1 factors. Involvement of the LCR in creating open chro- and 5 remains to be identified. matin was suggested from the analysis of the naturally Like classical enhancers, individual HS activity is inde- occurring Hispanic ␤-thalassemia deletion. In this condi- pendent of orientation. However, unlike classical enhancers, tion, a region upstream of 5ЈHS1 of approximately 35 kb LCR activity is orientation-dependent12 and distance- is deleted, but the remainder of the globin locus is intact. dependent13–15, suggesting that the LCR does not act as a However, none of the globin genes are expressed. The simple enhancer of gene transcription. A proximal gene is deletion produces a closed chromatin conformation that more likely to be transcribed than a distal one. In addition, a spans the entire locus20. Consistent with this, only the combination of HSs can deliver enhancer activity over longer intact LCR (5ЈHS1–5) can provide position-independent distances than an individual HS (Ref. 16). However, the chromatin-opening activity in single-copy transgenic mice LCR can be forced to interact with a proximal gene by the carrying the entire ␤-globin locus21. When one of the HSs insertion of a gene in or near the LCR (Ref. 17). was deleted from the LCR, expression of the ␤-globin The enhancer activity of the LCR is tissue specific8,18; genes appeared to be sensitive to the position of inte- that is, the expression of globin genes is confined to ery- gration. In transgenic mice carrying single copies of small, throid cells when linked to the ␤-globin LCR. In addition, recombinant HS-globin gene constructs, only 5ЈHS3 is the LCR is able to enhance expression from hetero- able to confer copy-number-dependent gene expression. geneous, non-globin gene promoters in erythrocytes18. This observation led to the conclusion that 5ЈHS3 However, when a non-globin gene was coupled to the possesses the dominant chromatin-opening activity of the LCR, ectopic expression was observed in some transgenic LCR (Ref. 22). However, 5ЈHS3 chromatin-opening mice19. In these instances, although the ␤-globin LCR con- activity might not be dominant because it appears to be ferred erythroid-specific gene expression upon the hetero- dependent upon the constitution of the constructs23. geneous gene, the natural function of the linked promoter Surprisingly, recent reports demonstrate that when the allowed expression outside of the erythroid compartment. entire mouse LCR (5ЈHS1–6) or most of the human LCR Thus, tissue-specific control of basal transcription might (5ЈHS2–5) is deleted in the context of its normal chromo- reside in the promoter, as is the case for the globin genes, some location in cell lines, the formation of the remaining whereas tissue-specific enhancement of gene expression HSs of the locus and the presence of general DNAse I- might be a property of the LCR. These data suggest that sensitivity associated with the ␤-globin locus domain are tissue specificity is not really an intrinsic property of the not affected, although transcription of all ␤-like globin LCR, but depends both upon the LCR and upon the genes is reduced9,10. These results strongly suggest that the promoters that it interacts with. LCR is not required for opening and maintaining the HS

404 TIG October 1999, volume 15, No. 10 The ␤-globin locus control region Reviews

chromatin domain at the endogenous ␤-globin locus. FIGURE 2. Models of LCR function These LCR deletions are smaller than the Hispanic Ј deletion, which includes most of the LCR and 5 DNA (a) sequences and results in a severe reduction of transcription and closure of chromatin. The chromatin-opening and maintenance functions might be encoded by elements that are upstream of the LCR, in the region contained within trends in Genetics the Hispanic deletion. It should be stressed that these data are derived from cell lines; if these results are confirmed in mouse models, some interpretations of LCR function will change. Studies of the LCR in transgenic mice provide evi- dence that LCRs can alter local chromatin structure to (b) confer position-independent, copy number-dependent gene expression at an ectopic site. Together with the deletion data discussed above, these observations empha- size that genes can behave differently when located in their normal chromosomal environment versus ectopic sites fol- lowing transgenesis. Questions remain regarding the iden- tity of the elements that are involved in chromatin alter- ation, and whether they initiate chromatin opening of the ␤-globin locus domain in erythroid cells, or prevent the (c) domain from closing in erythroid lineages if it is open before differentiation.

Timing and origin of DNA replication Replication of the human ␤-globin locus occurs in early S phase in erythroid cells, but in middle-to-late S phase in non-erythroid cells24. In all instances, DNA replication ini- tiates from a region 5Ј of the ␤-globin gene. However, when the LCR is deleted in Hispanic ␤-thalassemia, repli- cation of the locus shifts to late S phase in erythroid cells20. (a) Looping model. Sequence-specific activation factors bound at the locus control region (LCR), Furthermore, the deletion inactivates the normal origin of coactivators and a globin gene promoter are brought together to form a transcription complex by replication and initiation begins at a downstream looping of the DNA between the LCR and the distal gene. (b) Tracking model. Sequence-specific 25 sequence . These results suggest that the LCR, or adjacent activators bound at the LCR track down the helix until they encounter the appropriate stage-specific sequences, could be involved in the determination of the promoter bound by other activators, and then transcription is initiated. However, transcription might timing of DNA replication and what origin sequence is initiate at sites within the LCR and intergenic sequences of the globin locus, but these transcripts utilized. Further studies are necessary to determine if the are not exported from the nucleus for translation. (c) Facilitated tracking model. Sequence-specific effects on replication are a general feature of LCRs, and activators bind the LCR and a loop is formed bringing the factors into contact with a DNA region whether this is a direct effect, or secondary to the effects distal to the LCR and proximal to the globin gene promoter. Tracking occurs until the correct on transcription. promoter and associated transcription factors are encountered, then a transcription complex is formed and transcription begins. A globin gene is represented as a rectangular box; promoter Role of individual HSs in LCR function sequences are depicted to the left of the vertical line. For the LCR holocomplex, the four erythroid- specific hypersensitive site (HS) cores are shown as small boxes brought together in close The LCR HSs contain a 200–300 bp DNAse I-hypersensi- juxtaposition to represent the active site. The core-flanking regions are displayed as looped lines tive core, consisting of evolutionarily conserved DNA that constrain the LCR holocomplex conformation. Transcription factors and co-activators are sequences that encompass clusters of binding sites for represented as different sized circles and ovals. Wavy arrows indicate transcripts. erythroid-specific and ubiquitous transcription factors. These cores are flanked by sequences that are similar, but not identical, between species. Initial experiments were yeast artificial chromosome (YAC). The results derived designed to define the minimal sequences required to pro- from these two approaches are largely in agreement. duce the phenotypic properties of the LCR and to test the The effect of large deletions of individual HSs that function of each of the HS motifs, their interaction with encompass the HS cores and regions of evolutionarily con- the globin genes and the other HSs, and their role in devel- served flanking sequences, has been examined. Deletion of opmental regulation of globin gene switching. While this 5ЈHS2 in the context of the endogenous murine ␤-globin approach defined the essential sequences of 5ЈHS2, 3 and locus or in human ␤-globin–YAC transgenic mice reduced 4, it also revealed that the LCR is more than the sum of its the expression of the globin genes only slightly, and the individual HSs and that the chromatin environment of the temporal switching pattern was unaffected26,27. These gene was an important determinant of expression. Thus, results indicate that 5ЈHS2 is not the exclusive LCR recent experiments have utilized deletions or substitutions enhancer element and that there is functional redundancy of individual HSs in the context of whole ␤-globin loci in between the individual HSs. Deletions of 5ЈHS3 demon- order to test the same questions by loss-of-function assays. strate its specificity for activation of the ⑀-globin gene in These methods involve the measurement of globin gene the embryonic yolk sac and the ␥-globin genes in the fetal expression in mice, in which the HSs have been knocked liver27–30. Deletion of 5ЈHS3 does not affect the timing of out or rearranged, either in the endogenous murine ␤- globin gene switching or disrupt the type of chain synthe- globin locus or in transgenic mice that carry the entire sis that is normally associated with a site of hematopoiesis human ␤-globin locus on a linked-cosmid construct or during a defined development stage27,30. Bungert et al.29

TIG October 1999, volume 15, No. 10 405 Reviews The ␤-globin locus control region

TABLE 1. LCR and LCR-like elements

Year of discovery Locus or gene Organism Ref. Comments

1987 ␤-globin locus ␣-fetoprotein gene Mouse 45 1989 CD2 gene Human 40 Three HS clusters 3’ of gene, HS1: enha, HS3: domb 1990 Lysozyme gene Chicken 37 Three 5’HSs, Ϫ6.1c: enh, Ϫ2.7: enh, Ϫ2.4: sild e 1992 Adenosine deaminase gene Human 36 Six HSs in intron 1, HS3: enh, HS2: PEsup Red-green visual pigment genes Human 46 S100 ␤ gene Human 47 Kallikrein genes Mouse, rat 48 Whey acidic protein gene Rat 49 ␤-lactoglobulin gene Sheep 50 1993 Keratin 18 gene Human 51 MHC class I HLA G gene Human 52 Immunoglobulin ␮ heavy chain Mouse 53 Metallothionein I, II genes Mouse 54 5ЈHSs-MT-II gene-MT-I gene-3ЈHSs MHC Class II Ea gene Mouse 55 5ЈHSs, overlap, proximal and distal to promoter 1994 CD4 gene Human 56 T-cell receptor ␣/␦ locus Human 41 Tyrosinase gene Mouse 57 Aldolase C gene Rat 58 LAP (C/EBP ␤) gene Rat 59 Ј Ј Ј 1995 Apolipoprotein B gene Human 60 5 , 3 MARs, 5 enh/PEsup region Apolipoprotein E/C-I locus Human 61 3ЈHSs 15 kb downstream ␤ myosin heavy chain gene Human 62 Complement C4A and B genes Human 63 Growth hormone gene Human 64 Four 5ЈHSs, two pituitary-specific at Ϫ15, two pituitary/placenta-specific, at Ϫ32 Ig heavy chain locus Human 65 Four HSs 3Ј of C? gene, HS1, 2: enh CD34 gene Mouse 66 HSs 5Ј and 3Ј of gene, in introns 1 and 4 Glycophorin gene Mouse 67 1997 Ig C ␣ 1 and 2 genes Human 68 MHC class I HLA-B7 gene Human 38 1998 Desmin gene Human 69 ␭ ␭ Ј ␭ 1999 5-VpreB1 locus Mouse 70 HSs between 5 and VpreB1 genes, 3 of 5

aenh, enhancer. bdom, chromatin domain-opening element. cDistances (kb) relative to mRNA start site of gene. dsil, . e PEsup, suppressor of position effect.

found that 5ЈHS3 can fully complement the loss of 5ЈHS4 ferent HSs can substitute and compensate for the lack of at function, and 5ЈHS4 can partially complement the loss least one of the HSs. of 5ЈHS3 function, which reinforces the idea of the functional redundancy of HSs. Mechanisms of transcription activation and Interestingly, deletion of only the 200–300 bp DNAse enhancement I-hypersensitive core regions of 5ЈHS3 or 5ЈHS4 has a As discussed above, there is some uncertainty about the more profound effect on the expression of globin genes direct effects of the LCR on chromatin conformation. than larger deletions that remove all 5ЈHS3 or 4 sequence Analysis of cell lines suggests that LCR function is limited similarity28,29. Deletion of the core elements might gener- to transcription enhancement at its endogenous location, ate a dominant-negative phenotype that disrupts LCR whereas transgenic experiments suggest that it is also necess- function. The LCR might function as a holocomplex, in ary for the establishment and maintenance of an open which the HS cores cluster to form an active site and the chromatin domain9,10. Nevertheless, studies that use native core-flanking regions constrain the holocomplex confor- loci and constructs in transgenics offer insights as to how mation. Deletion of an entire HS allows the remaining HSs the LCR accomplishes transcriptional activation (Fig. 2). to form an alternative holocomplex structure, albeit one Evidence suggests that the LCR interacts with only one that is less effective. However, deletion of a HS core region globin gene promoter at a time and that it ‘flip-flops’ might disrupt only the active site; the core-flanking between two or more promoters, depending upon the regions maintain the original holocomplex structure. The stage of development (Fig. 2a)21,32. Other data suggest that LCR active site is ‘poisoned’ and, as a result, the LCR is the LCR can interact simultaneously with more than one catastrophically disabled. Recent studies indicate that promoter, but these data examine multiple cells and thus deletion of 5ЈHS5 and the newly identified 5ЈHS6 in the represent the ‘average’ of interactions in the population16. murine ␤-globin locus does not have a major effect on A model that is consistent with these data is one in which globin gene expression31. the HSs of the LCR fold to form a holocomplex that physi- All of these experiments indicate that the intact LCR is cally ‘loops’ and comes in close proximity to the appropri- required for full and correct developmental expression of ate promoter. Enhanced gene expression is catalysed by the ␤-like globin genes. The HSs could have some develop- the delivery of transcription factors and other co-activators mental-stage specificity and proper function might require that form a stable transcription complex. The LCR holo- the presence both of the core and the flanking regions of complex is probably dynamic and free to move from gene the HSs. Some functional redundancy exists because dif- to gene. A parameter that is relevant to holocomplex

406 TIG October 1999, volume 15, No. 10 The ␤-globin locus control region Reviews

function is the distance between the LCR and its target commonly a copy number-dependent, site-of-integration- gene, which has been shown to affect the probability that independent expression of their cognate loci. these two elements will interact13–15,32; this probability is Some novel data regarding LCR function have come constant for a gene at a specific stage of development. As from studies of other LCRs. In one of the more-elegant development proceeds, the LCR has increasingly stable analyses, sequences of the human CD2 LCR were shown interactions with more-distant globin genes, which is to be essential for establishing an open chromatin configu- largely a function of the changing milieu of transcription ration39. This LCR consists of three HS clusters (HS1–3) factors. Thus, it is mainly the availability of specific tran- contained within 2 kb of 3Ј flanking region40. HS1 func- scription factors and the distance of a gene from the LCR tions as a classical transcription enhancer, whereas HS2 that constrains the frequency of LCR–gene interactions and 3 have no enhancer activity. In the absence of HS3, during development13–15. position effects (PEs) were exhibited, indicating that the DNA tracking or scanning is an alternative mechanism human CD2 LCR HS3 functions in the establishment for LCR–gene interaction (Fig. 2b). In this model, and/or maintenance of an open chromatin domain39. erythroid-specific and ubiquitous transcription factors Thus, LCRs appear to operate by ensuring an open bind recognition sequences in the LCR and then move or chromatin configuration, even in the absence of enhancer ‘track’ along locus DNA until they encounter the develop- activity. mental-stage-correct promoter, where transcription is ini- The T-cell-specific TCR ␣/␦ (TCR␣) LCR consists of tiated. If this model is valid, the expectation would be that eight HSs that are located downstream of the gene encod- some aberrant transcripts would arise from cryptic start ing the T-cell receptor41. It is presumed that the TCR␣ sites along the locus. In fact, transcripts have been detected LCR is a bifunctional element that regulates the T-cell across the LCR and intergenic regions in erythroid cells, receptor gene and the adjacent, ubiquitously expressed but not in non-erythroid cells33,34. However, these tran- Dad1 anti-apoptosis gene. Ortiz et al.42 identified two scripts are nuclear specific; they are not found in the cyto- sub-regions of the TCR␣ LCR: one that constituted a plasm, suggesting that they are not processed into mature novel non-tissue-restricted chromatin-opening element; messenger RNAs. The role of these transcripts could be to and another sequence, immediately upstream and com- deliver transcription-complex proteins to the globin gene prised of the four proximal HSs, that restored tissue spe- promoters, via the tracking mechanism. Alternatively, it is cificity to the downstream chromatin-opening element. possible that the function of transcription is to establish The HSs of this tissue-specificity region map to loci near and maintain an open chromatin conformation that is per- two transcriptional silencers, the TCR␣ enhancer and in a missive for gene transcription, although the persistence of region of unknown function; the region between the DNAse I-sensitivity following deletion of the LCR in cell enhancer and the unknown HS appears to be responsible lines argues against this role9,10. for the tissue specificity43. The proximal tissue-specific Facilitated tracking35 incorporates elements from the element might insulate the TCR gene from the LCR in looping and from the tracking model (Fig. 2c). An LCR- other tissues, without affecting the TCR␣ LCR–Dad1 bound and co-activator complex interaction. The occurrence of activators and insulators loops to interact with promoter-distal regions. This com- in LCRs appears to be a common theme, suggesting plex then tracks in small steps along the chromatin until it that the interaction of these elements modulates LCR encounters the correct promoter. A stable loop structure is function. established and gene expression proceeds. Although some A cassette derived from four HSs of the 3Ј murine mechanistic aspects of how the LCR regulates globin gene immunoglobulin heavy chain (IgH) locus LCR was linked expression are understood, the developmental control of to the Myc (also known as c-myc) gene44. This LCR medi- globin gene switching is still largely uncharacterized. ated a widespread increase in acetylation, not only within the promoter region of the Myc gene, but also over sub- LCRs in other systems stantial distances upstream and downstream of the tran- Thirty-two elements that appear to meet the criteria of scription site. These results suggest that LCRs can, in some LCR function have been identified in mammals, including instances, activate gene expression through a mechanism humans, mice, rats, chickens, rabbits, sheep and goats that includes increased histone acetylation. (Table 1). Structurally, these LCRs are comprised of vary- ing numbers of tissue-specific DNAse I-hypersensitive Conclusion sites. The HSs of non-globin LCRs that have been exten- LCRs control the level of transcription by ensuring that sively characterized consist of a 150–300 bp central core the locus is active in all cells at all times. They also protect that contains a high density of transcription-factor- against position effects and negative chromatin, and both binding sites36–38. The ␤-globin LCR consists of five HSs these functions are performed in a tissue-specific manner. clustered on one contiguous piece of DNA. However, the LCRs appear to be comprised of multiple elements, sequences that embody a complete LCR do not have to be including enhancers, insulators, chromatin domain- located in one position, upstream or downstream, relative opening elements and tissue-specificity elements; these can to the genes they control (Table 1). Other LCRs are a col- be clustered together or scattered throughout a locus. lection of elements, with different numbers of HSs spread Individually, each of these elements differentially affects over large distances. The relative simplicity of the ␤-globin gene expression; but it is their collective interaction that LCR with regard to its single group of HSs might have functionally defines an LCR. The discovery of the ␤- contributed to its early discovery. Identification of LCRs globin LCR opened a new avenue to study cis regulation in complex multigene loci, where the elements are inter- of gene expression in multiple gene loci. The challenge will spersed among the genes, will be a difficult task. be to discern the dynamic conformation of LCRs and Functionally, they all exhibit some or all of the properties specific interactions between LCRs and their cogent loci. associated with the ␤-globin LCR (Box 1); most- This knowledge will be key to understanding fundamental

TIG October 1999, volume 15, No. 10 407 Reviews The ␤-globin locus control region

mechanisms of gene regulation. Lessons learned in the last Acknowledgements decade will provide the framework for unravelling LCR This work was supported by National Institutes of Health function in the next. grants DK53510, HL53750 and DK45365.

References the human ␤-globin gene locus. Proc. Natl. Acad. Sci. U. S. A. 11, 345–358 1 Magram, J. et al. (1985) Developmental regulation of a cloned 85, 8081–8085 49 Dale, T.C. et al. (1992) High-level expression of the rat whey adult ␤-globin gene in transgenic mice. Nature 315, 338–340 25 Aladjem, M.I. et al. (1995) Participation of the human ␤- acidic protein gene is mediated by elements in the promoter 2 Townes, T.M. et al. (1985) Erythroid-specific expression of globin locus control region in initiation of DNA replication. and 3Ј untranslated region. Mol. Cell. Biol. 12, 905–914 human ␤-globin genes in transgenic mice. EMBO J. 4, Science 270, 815–819 50 Whitelaw, C.B. et al. (1992) Position-independent expression 1715–1723 26 Fiering, S. et al. (1995) Targeted deletion of 5ЈHS2 of the of the ovine ␤-lactoglobulin gene in transgenic mice. 3 Kollias, G. et al. (1986) Regulated expression of human A␥-, murine ␤-globin LCR reveals that it is not essential for proper Biochem. J. 286, 31–39 ␤-, and hybrid ␥␤-globin genes in transgenic mice: regulation of the ␤-globin locus. Genes Dev. 9, 2203–2213 51 Neznanov, N. et al. (1993) Transcriptional insulation of the manipulation of the developmental expression patterns. Cell 27 Peterson, K.R. et al. (1996) Effect of deletion of 5ЈHS3 or human keratin 18 gene in transgenic mice. Mol. Cell. Biol. 13, 46, 89–94 5ЈHS2 of the human ␤-globin locus control region on the 2214–2223 4 Kioussis, D. et al. (1983) ␤-globin gene inactivation by DNA developmental regulation of globin gene expression in ␤- 52 Schmidt, C.M. et al. (1993) Extraembryonic expression of the translocation in ␥␤-thalassaemia. Nature 306, 662–666 globin locus yeast artificial chromosome transgenic mice. human MHC class I gene HLA-G in transgenic mice. Evidence 5 Driscoll, M.C. et al. (1989) ␥␦␤-thalassemia due to a de novo Proc. Natl. Acad. Sci. U. S. A. 93, 6605–6609 for a positive regulatory region located 1 kilobase 5Ј to the mutation deleting the 5Ј ␤-globin gene activation-region 28 Navas, P.A. et al. (1998) Developmental specificity of the start site of transcription. J. Immunol. 151, 2633–2645 hypersensitive sites. Proc. Natl. Acad. Sci. U. S. A. 86, interaction between the locus control region and embryonic or 53 Jenuwein, T. et al. (1993) The immunoglobulin ␮ enhancer 7470–7474 fetal globin genes in transgenic mice with an HS3 core core establishes local factor access in nuclear chromatin 6 Tuan, D. et al. (1985) The Ј␤-like-globinЈ gene domain in deletion. Mol. Cell. Biol. 18, 4188–4196 independent of transcriptional stimulation. Genes Dev. 7, human erythroid cells. Proc. Natl. Acad. Sci. U. S. A. 82, 29 Bungert, J. et al. (1995) Synergistic regulation of human ␤- 2016–2032 6384–6388 globin gene switching by locus control region elements HS3 54 Palmiter, R.D. et al. (1993) Distal regulatory elements from 7 Forrester, W.C. et al. (1987) Evidence for a locus activation and HS4. Genes Dev. 9, 3083–3096 the mouse metallothionein locus stimulate gene expression in region: the formation of developmentally stable hypersensitive 30 Hug, B.A. et al. (1996) Analysis of mice containing a targeted transgenic mice. Mol. Cell. Biol. 13, 5266–5275 sites in globin-expressing hybrids. Nucleic Acids Res. 15, deletion of ␤-globin locus control region 5Ј hypersensitive site 55 Carson, S. and Wiles, M.V. (1993) Far upstream regions of 10159–10177 3. Mol. Cell. Biol. 16, 2906–2912 class II MHC Ea are necessary for position-independent, copy- 8 Grosveld, F. et al. (1987) Position-independent, high-level 31 Bender, M.A. et al. (1998) Description and targeted deletion of dependent expression of Ea transgene. Nucleic Acids Res. 21, expression of the human ␤-globin gene in transgenic mice. 5Ј hypersensitive site 5 and 6 of the mouse ␤-globin locus 2065–2072 Cell 51, 975–985 control region. Blood 92, 4394–4403 56 Wurster, A.L. et al. (1994) Elf-1 binds to a critical element in a 9 Epner, E. et al. (1998) The ␤-globin LCR is not necessary for 32 Wijgerde, M. et al. (1995) Transcription complex stability and second CD4 enhancer. Mol. Cell. Biol. 14, 6452–6463 an open chromatin structure or developmentally regulated chromatin dynamics in vivo. Nature 377, 209–213 57 Ganss, R. et al. (1994) A cell-specific enhancer far upstream transcription of the native mouse ␤-globin locus. Mol. Cell 2, 33 Tuan, D. et al. (1992) Transcription of the hypersensitive site of the mouse tyrosinase gene confers high level and copy 447–455 HS2 enhancer in erythroid cells. Proc. Natl. Acad. Sci. U. S. A. number-related expression in transgenic mice. EMBO J. 13, 10 Reik, A. et al. (1998) The locus control region is necessary for 89, 11219–11223 3083–3093 gene expression in the human ␤-globin locus but not the 34 Ashe, H.L. et al. (1997) Intergenic transcription and 58 Arai, Y. et al. (1994) Position-independent, high-level, and maintenance of an open chromatin structure in erythroid cells. transinduction of the human ␤-globin locus. Genes Dev. 11, correct regional expression of the rat aldolase C gene in the Mol. Cell. Biol. 18, 5992–6000 2494–2509 central nervous system of transgenic mice. Eur. J. Biochem. 11 Hardison, R. et al. (1997) Locus control regions of mammalian 35 Blackwood, E.M. and Kadonaga, J.T. (1998) Going the 221, 253–260 ␤-globin gene clusters: combining phylogenetic analyses and distance: a current view of enhancer action. Science 281, 59 Talbot, D. et al. (1994) The 5Ј flanking region of the rat LAP experimental results to gain functional insights. Gene 205, 61–63 (C/EBP ␤) gene can direct high-level, position-independent, 73–94 36 Aronow, B.J. et al. (1992) Functional analysis of the human copy number-dependent expression in multiple tissues in 12 Tanimoto, K. et al. (1999) Effects of altered gene order or adenosine deaminase gene thymic regulatory region and its transgenic mice. Nucleic Acids Res. 22, 756–766 orientation of the locus control region on human ␤-globin ability to generate position-independent transgene expression. 60 Kalos, M. and Fournier, R.E. (1995) Position-independent gene expression in mice. Nature 398, 344–348 Mol. Cell. Biol. 12, 4170–4185 transgene expression mediated by boundary elements from 13 Hanscombe, O. et al. (1991) Importance of globin gene order 37 Bonifer, C. et al. (1990) Tissue specific and position the apolipoprotein B chromatin domain. Mol. Cell. Biol. 15, for correct developmental expression. Genes Dev. 5, independent expression of the complete gene domain for 198–207 1387–1394 chicken lysozyme in transgenic mice. EMBO J. 9, 2843–2848 61 Dang, Q. et al. (1995) Structure of the hepatic control region 14 Peterson, K.R. and Stamatoyannopoulos, G. (1993) Role of 38 Kushida, M.M. et al. (1997) A 150-base pair 5Ј region of the of the human apolipoprotein E/C-I gene locus. J. Biol. Chem. gene order in developmental control of human ␥- and ␤- MHC class I HLA-B7 gene is sufficient to direct tissue-specific 270, 22577–22585 globin gene expression. Mol. Cell. Biol. 13, 4836–4843 expression and locus control region activity: the ␣ site 62 Knotts, S. et al. (1995) Position independent expression and 15 Dillon, N. et al. (1997) The effect of distance on long-range determines efficient expression and in vivo occupancy at developmental regulation is directed by the ␤ myosin heavy chromatin interactions. Mol. Cell 1, 131–139 multiple cis-active sites throughout this region. J. Immunol. chain geneЈs 5Ј upstream region in transgenic mice. Nucleic 16 Bresnick, E.H. and Tze, L. (1997) Synergism between 159, 4913–4929 Acids Res. 23, 3301–3309 hypersensitive sites confers long-range gene activation by the 39 Festenstein, R. et al. (1996) Locus control region function and 63 Tee, M.K. et al. (1995) A promoter within intron 35 of the ␤-globin locus control region. Proc. Natl. Acad. Sci. U. S. A. heterochromatin-induced position effect variegation. Science human C4A gene initiates abundant adrenal-specific 94, 4566–4571 271, 1123–1125 transcription of a 1 kb RNA: location of a cryptic CYP21 17 Kim, C.G. et al. (1992) Inactivation of the human ␤-globin 40 Greaves, D.R. et al. (1989) Human CD2 3Ј-flanking sequences promoter element? Hum. Mol. Genet. 4, 2109–2116 gene by targeted insertion into the ␤-globin locus control confer high-level, T cell-specific, position-independent gene 64 Jones, B.K. et al. (1995) The human growth hormone gene is region. Genes Dev. 6, 928–938 expression in transgenic mice. Cell 56, 979–986 regulated by a multicomponent locus control region. Mol. Cell. 18 Blom van Assendelft, G. et al. (1989) The ␤-globin dominant 41 Diaz, P. et al. (1994) A locus control region in the T cell Biol. 15, 7010–7021 control region activates homologous and heterologous receptor ␣/␦ locus. Immunity 1, 207–217 65 Madisen, L. and Groudine, M. (1994) Identification of a locus promoters in a tissue-specific manner. Cell 56, 969–977 42 Ortiz, B.D. et al. (1997) Adjacent DNA elements dominantly control region in the immunoglobulin heavy- chain locus that 19 Tewari, R. et al. (1996) The human ␤-globin locus control restrict the ubiquitous activity of a novel chromatin-opening deregulates c-myc expression in plasmacytoma and BurkittЈs region confers an early embryonic erythroid-specific region to specific tissues. EMBO J. 16, 5037–5045 lymphoma cells. Genes Dev. 8, 2212–2226 expression pattern to a basic promoter driving the bacterial 43 Ortiz, B.D. et al. (1999) A new element within the T-cell 66 May, G. and Enver, T. (1995) Targeting gene expression to lacZ gene. Development 122, 3991–3999 receptor ␣ locus required for tissue-specific locus control haemopoietic stem cells: a chromatin-dependent upstream 20 Forrester, W.C. et al. (1990) A deletion of the human ␤-globin region activity. Mol. Cell. Biol. 19, 1901–1909 element mediates cell type-specific expression of the stem locus activation region causes a major alteration in chromatin 44 Madisen, L. et al. (1998) The immunoglobulin heavy chain cell antigen CD34. EMBO J. 14, 564–574 structure and replication across the entire ␤-globin locus. locus control region increases histone acetylation along linked 67 Terajima, M. et al. (1995) Inducible expression of erythroid- Genes Dev. 4, 1637–1649 c-myc genes. Mol. Cell. Biol. 18, 6281–6292 specific mouse glycophorin gene is regulated by proximal 21 Milot, E. et al. (1996) Heterochromatin effects on the 45 Hammer, R.E. et al. (1987) Diversity of ␣-fetoprotein gene elements and locus control region-like sequence. J. Biochem. frequency and duration of LCR-mediated gene transcription. expression in mice is generated by a combination of separate 118, 593–600 Cell 87, 105–114 enhancer elements. Science 235, 53–58 68 Mills, F.C. et al. (1997) Enhancer complexes located 22 Ellis, J. et al. (1996) A dominant chromatin-opening activity in 46 Wang, Y. et al. (1992) A locus control region adjacent to the downstream of both human immunoglobulin C␣ genes. J. Exp. 5Ј hypersensitive site 3 of the human ␤-globin locus control human red and green visual pigment genes. Neuron 9, Med. 186, 845–858 region. EMBO J. 15, 562–568 429–440 69 Raguz, S. et al. (1998) Muscle-specific locus control region 23 Li, Q. and Stamatoyannopoulos, J.A. (1994) Position 47 Friend, W.C. et al. (1992) Cell-specific expression of high activity associated with the human desmin gene. Dev. Biol. independence and proper developmental control of ␥-globin levels of human S100 ␤ in transgenic mouse brain is 201, 26–42 gene expression require both a 5Ј locus control region and a dependent on gene dosage. J. Neurosci. 12, 4337–4346 70 Sabbattini, P. et al. (1999) Analysis of mice with single and downstream sequence element. Mol. Cell. Biol. 14, 6087–6096 48 Smith, M.S. et al. (1992) Tissue-specific expression of multiple copies of transgenes reveals a novel arrangement for ␭ 24 Epner, E. et al. (1988) Asynchronous DNA replication within kallikrein family transgenes in mice and rats. DNA Cell Biol. the 5-VpreB1 locus control region. Mol. Cell. Biol. 19, 671–679

408 TIG October 1999, volume 15, No. 10