<<

Epigenetic regulation of promiscuous expression in thymic medullary epithelial cells

Lars-Oliver Tykocinski^a,1,2, Anna Sinemus^a,1, Esmail Rezavandy^a, Yanina Weiland^b, David Baddeley^b, Christoph Cremer^b, Stephan Sonntag^c, Klaus Willecke^c, Jens Derbinski^a, and Bruno Kyewski^a,3

^aDivision of Developmental Immunology, Tumor Immunology Program, German Cancer Research Center, D-69120 Heidelberg, Germany; ^bKirchhoff Institute for Physics, University of Heidelberg, D-69120 Heidelberg, Germany; and ^cInstitute for Genetics, University of Bonn, D-53117 Bonn, Germany

Edited* by Philippa Marrack, National Jewish Health, Denver, CO, and approved September 28, 2010 (received for review July 2, 2010)

IMMUNOLOGY

Thymic central tolerance comprehensively imprints the T-cell receptor repertoire before T cells seed the periphery. Medullary thymic epithelial cells (mTECs) play a pivotal role in this process by virtue of promiscuous expression of tissue-restricted autoantigens. The molecular regulation of this unusual gene expression, in particular the involvement of epigenetic mechanisms, is only poorly understood. By studying promiscuous expression of the mouse casein locus, we report that transcription of this locus proceeds from a delimited region (“entry site”) to increasingly complex patterns along with mTEC maturation. Transcription of this region is preceded by promoter demethylation in immature mTECs followed upon mTEC maturation by acquisition of active marks and local locus decontraction. Moreover, analysis of two additional gene loci showed that promiscuous expression is transient in single mTECs. Transient gene expression could conceivably add to the local diversity of self-antigen display, thus enhancing the efficacy of central tolerance.

central tolerance | locus decontraction | tissue-restricted antigens

Author contributions: L.-O.T., A.S., J.D., and B.K. designed research; L.-O.T., A.S., E.R., and J.D. performed research; Y.W., D.B., C.C., S.S., and K.W. contributed new reagents/analytic tools; L.-O.T., A.S., and J.D. analyzed data; and L.-O.T., A.S., and B.K. wrote the paper.
The authors declare no conflict of interest.
*This Direct Submission article had a prearranged editor.
^1L.-O.T. and A.S. contributed equally to this work.
^2Present address: Department of Medicine V, Division of Rheumatology, University of Heidelberg, INF 410, D-69120 Heidelberg, Germany.
^3To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1009265107/-/DCSupplemental.

The scope of central T-cell tolerance is to a large extent dictated by ectopic expression of numerous tissue-restricted antigens (TRAs). This gene pool encompasses >10% of all known genes and represents virtually all tissues of the body. Genes in this pool show no obvious functional or structural commonalities. Whereas the cellular regulation and modes of tolerance induction operating on this gene pool become increasingly clear, our understanding of the molecular regulation of this promiscuous gene expression (pGE) has progressed slowly. To date only the autoimmune regulator (Aire) has been identified as a molecular component, which directs the expression of a large fraction of these genes in medullary thymic epithelial cells (mTECs). Consequently, the lack of a functional Aire protein leads to a severe multiorgan autoimmune disease, autoimmune polyendocrine syndrome-1 (APS-1). Only 13 y after identifying the Aire gene as being responsible for APS-1, we begin to understand the molecular workings of Aire in the context of pGE (1). However, several distinctive features of pGE still seek an explanation at the molecular level. Promiscuously expressed genes are (i) highly enriched in tissue-restricted genes (2, 3) and (ii) preferentially localize to genomic clusters in mice and man (2, 4). In particular the segregation into gene clusters may offer clues as to how genes of different ontology and without obvious functional relatedness are targeted for coexpression in a single cell type. On the basis of our previous studies on the mouse casein locus (2, 5), we proposed that pGE might target genes via epigenetic marks rather than common sequence motifs in their cis-acting regulatory elements (6). Thus, we found that epithelial cells of the lactating mammary gland selectively coexpressed milk protein genes but not other genes of the extended casein locus, whereas in mTECs all genes within this locus were expressed at similar frequencies (with one exception) in an apparently stochastic manner irrespective of their tissue affiliation (5). Such coexpression neighborhoods of functionally unrelated genes have been described for the Drosophila genome and estimated to encompass up to 20% of all genes analyzed (7). Several mechanisms have been suggested to account for this observation, one of which is the epigenetic opening of delimited regions allowing access of general and specific transcriptional factors to act on gene-specific control elements (8). This scenario is clearly different from the intricate regulation of functionally related gene families like the Hox locus or the β-globin gene locus (9). A similar phenomenon as observed in Drosophila has been reported for housekeeping genes but not for TRAs in vertebrates (10).

Here we analyzed the interrelationship between emerging gene expression patterns at the single cell level, -associated epigenetic marks, and the differentiation of mTECs in the murine casein locus. Our results argue for a role of local epigenetic control in initiating transcription of this locus. MTEC differentiation goes along with increasingly complex patterns of gene expression in single cells. However, expression of certain TRAs appears to be transient. The implications of these findings for the process of central tolerance will be discussed.

Results

PGE Correlates with Gene-Specific Permissive Histone Marks in the Casein Locus. To address local rather than global epigenetic mechanisms that regulate pGE in the thymus, we focused our analysis on the casein gene locus as a typical TRA gene cluster. Expression of the casein genes as well as the flanking sulfotransferase and the UDP glycosyltransferase family members and the family of salivary gland genes is tissue restricted. At the same time, all of the genes within the cluster are expressed by mature but not immature mTECs at the population level (2). This contiguous expression of functionally unrelated genes within a cluster is likely to be regulated at the epigenetic level. Hence, we analyzed the promoters of the casein genes as well as the promoter regions of Ugt2a3, Sult1d1, Sult1e1, Smr1, and Muc10 for histone H4 acetylation and histone H3 lysine 4 trimethylation (H3K4me3) as marks for active chromatin and histone H3 lysine 27 trimethylation (H3K27me3) as a repressive mark, which is also found in bivalent chromatin domains (11, 12). Given the limited yield of ex vivo available mTECs, we improved the sensitivity of ChIP and routinely used 10^5 mTECs per immunoprecipitation (IP) (Fig. S1A). Mammary gland epithelial cells (MECs) of lactating mice served as a positive and thymocytes as a negative control for the casein locus. As expected, the casein gene promoters in MECs and the CD45 gene promoter in thymocytes were highly H4 acetylated and H3K4 trimethylated (Fig. 1). In MECs, the promoter regions of the nonexpressed genes flanking the casein genes as well as the CD45 promoter showed no significant acetylation of histone H4. Thymocytes as well as immature mTECs showed none of the

www.pnas.org/cgi/doi/10.1073/pnas.1009265107 PNAS Early Edition

positive histone marks within the casein gene cluster. In mature mTECs, only the Csnb promoter was strongly acetylated at histone H4 and trimethylated at H3K4, whereas all other gene promoters showed only background levels for these histone marks. The repressive H3K27me3 mark could not be detected to a significant degree in any promoter within the casein cluster in mTECs, MECs, or thymocytes. In contrast, the Hoxc10 promoter, known to be targeted by polycomb group complexes (13) and used in this study as a positive control for H3K27me3, was highly trimethylated at H3K27 in all four cell types. The casein cluster is thus characterized neither by a state of facultative heterochromatin nor by bivalent chromatin (carrying both H3K4me3 and H3K27me3 marks simultaneously) in any of the four cell populations analyzed. Remarkably, Csnb is the only gene in the casein cluster carrying active histone modifications at its promoter region in mature mTECs, yet all genes in the casein cluster are expressed in mature mTECs at the population level. Gene expression analysis at the single cell level, however, showed that only 2–15% of the mature mTECs express a particular gene of the casein cluster. Csnb is an exception, being expressed in more than 80% of mature mTECs (5). It was therefore possible that other genes in the casein cluster might also have active histone modifications at their promoter regions in those cells actively expressing the particular gene, but their frequency might be under the detection threshold of the ChIP assay. Because the detection threshold of the ChIP for histone H4 acetylation was around 10–20% of expressing cells within a mixed cell population (Fig. S1B), we cannot at present exclude that histone modifications within a minor mTEC population escaped detection in our ChIP assay (see below).

PGE Correlates with Gene-Specific Promoter DNA Demethylation in the Casein Cluster. The chromatin structure that determines the accessibility of a gene promoter depends on histone modifications as well as on the DNA methylation status. Both epigenetic modifications can influence each other and the relationship can work in both directions (14). We analyzed the DNA methylation status of the 5′ regions of all casein genes and of the neighboring Sult1e1 gene in immature and mature mTECs, MECs, and thymocytes. As expected, all gene promoters were highly methylated in thymocytes, which do not express any of the analyzed genes (Fig. 2). In MECs, all casein gene promoters were highly demethylated. In immature as well as in mature mTECs, the Sult1e1 as well as the Csna and Csnk promoters were highly methylated, whereas the 5′ regions of the Csng and Csnd genes were partially demethylated. However, compared with MECs, the degree of demethylation was clearly less pronounced. In contrast, the Csnb promoter was highly demethylated in mature and, interestingly, also in immature mTECs at a level comparable to MECs (Fig. 2). Because Csnb is not expressed in immature mTECs, the Csnb gene might be already poised for transcription at an early stage of mTEC maturation. Thus, transcription of Csnb in mTECs is preceded by DNA demethylation and correlates with histone H4 acetylation and H3K4 trimethylation. Because demethylation of the Csnb promoter only becomes clearly detectable at day E16/17 (see below) when already more than 30% of the cells express Csnb (Fig. S2), promoter demethylation of other casein genes expressed at frequencies lower than 15% might be missed.
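The detection-limit argument above can be made concrete with a simple linear mixing model. This is only an illustrative sketch: it assumes that the bulk ChIP signal of a sorted population is the frequency-weighted average of cells that carry the mark and cells that do not, and the signal values (`marked_signal`, `background_signal`) are hypothetical placeholders, not measurements from this study. Only the expression frequencies (2–15% for most casein genes, more than 80% for Csnb) are taken from the text.

```python
def population_signal(expressing_fraction, marked_signal, background_signal):
    """Expected bulk ChIP signal under a linear mixing assumption.

    Only the expressing fraction of cells is assumed to carry the active
    mark; all other cells contribute the background signal.
    """
    if not 0.0 <= expressing_fraction <= 1.0:
        raise ValueError("expressing_fraction must lie between 0 and 1")
    return (expressing_fraction * marked_signal
            + (1.0 - expressing_fraction) * background_signal)


def fold_over_background(expressing_fraction, marked_signal, background_signal):
    """Bulk signal expressed as a multiple of the background signal."""
    bulk = population_signal(expressing_fraction, marked_signal, background_signal)
    return bulk / background_signal


if __name__ == "__main__":
    # Hypothetical signal levels (arbitrary units), chosen for illustration only.
    marked, background = 10.0, 1.0
    # Expression frequencies quoted in the text for mature mTECs.
    for label, fraction in [("rare casein gene", 0.02),
                            ("frequent casein gene", 0.15),
                            ("Csnb", 0.80)]:
        fold = fold_over_background(fraction, marked, background)
        print(f"{label}: {fold:.2f}-fold over background")
```

Under this assumption the bulk enrichment falls toward background as the expressing fraction shrinks, which is the reason marks restricted to a minor mTEC subset could escape detection.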

Fig. 1. Genes expressed at a high frequency in mTECs or MECs show active histone marks in their promoter regions. ChIP was performed with ex vivo purified mature (CD45negCDR1negEpCAMhiCD80hi) and immature (CD45negCDR1negEpCAMhiCD80lo) mTECs; MECs and thymocytes were used as positive and negative controls (10^5 cells per IP). Samples were analyzed for H4 acetylation and H3K4 and H3K27 trimethylation with qPCR focusing on different promoter regions of the casein gene locus and control genes. Diagrams show the frequency of a particular histone modification mark in the promoter region normalized to H3. All data are the mean of three to four independent experiments; error bars represent the SEM. Background activity of pan-acetyl H4 and H3K4me3 was defined as the mean value of all genes not expressed in a particular cell type (horizontal bar). Bars are aligned according to the gene order shown at the top.

PGE Correlates with Gene-Specific Permissive Epigenetic Marks in the Gad67 Locus. As argued above, genes expressed in mTECs in the range of 2–15% (5) might also carry active epigenetic marks but escape detection by the ChIP assay (Fig. S1B) or bisulfite sequencing. To overcome this limitation, we made use of a reporter mouse strain in which the eGFP gene was knocked into the second exon of the Gad67 gene (15) and analyzed eGFP as a surrogate TRA. Gad67 is promiscuously expressed by mature mTECs in an Aire-independent fashion. EGFP expression was confined to mature mTECs like endogenous Gad67 (2). We isolated highly pure mature eGFP+ and eGFP− as well as immature mTECs from heterozygous GAD67/eGFP mice and assessed H4 panacetylation and H3K4me3 modifications. The eGFP gene region clearly showed a higher level of H4 acetylation in mature mTECs expressing eGFP than in the eGFP− fraction of mature mTECs. EGFP+ brain cells served as a positive control showing clear H4 acetylation; immature mTECs and thymocytes served as negative controls (Fig. 3A). As for the H4 acetylation status, the eGFP allele-specific region also showed higher levels of H3K4me3 in the eGFP+ compared with the eGFP− mTEC fraction. Analysis of the DNA methylation pattern showed that the CpG-rich GAD67 promoter/exon 1 region was highly demethylated in eGFP+ as well as in eGFP− mature mTECs and also in immature mTECs, therefore resembling the DNA methylation pattern of the Csnb promoter. In contrast to Csnb, the GAD67 promoter was also highly demethylated in thymocytes (Fig. 3B). Also the region 5′ of the transcription start site of the GAD67 wild-type allele as well as the eGFP knockin allele were highly demethylated in all analyzed cell populations (Fig. S3). Thus, promiscuous expression of eGFP under the Gad67 promoter, a surrogate TRA expressed at low frequency (<5% in total mTECs), was clearly correlated with permissive epigenetic marks, comparable to the Csnb gene.

Entry Site and Local Decontraction of the Casein Gene Locus. Little is known about the temporal and spatial regulation of gene expression neighborhoods either in the context of tissue-specific (8) or promiscuous gene expression. Here we analyzed the ontogeny of pGE in the casein locus. Interestingly, initial transcription of

Fig. 2. DNA methylation of the murine casein gene locus. Genomic DNA of thymocytes, immature and mature mTECs, and MECs from lactating mice was treated with sodium bisulfite and 5′ regions of the genes in the casein locus were amplified by PCR (positions marked relative to the first codon of the corresponding gene). Individual PCR products were cloned and sequenced; each line of squares represents one allele. Each filled square marks one methylated CpG motif, each open square a nonmethylated CpG motif; ambiguous results are displayed by gray squares. For each region and cell type, 6–10 representative sequence results are shown.
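The readout described in the Fig. 2 legend (one row of squares per cloned allele, one square per CpG) can be summarized numerically. The sketch below is an illustration only: the single-letter encoding (`M` methylated, `U` unmethylated, `?` ambiguous) and the example clones are hypothetical, and ambiguous calls are simply left out of the denominator, which is one of several reasonable conventions rather than the authors' stated procedure.

```python
def methylation_per_cpg(clones):
    """Fraction of methylated calls at each CpG position across cloned alleles.

    Each clone is a string with one character per CpG: 'M' (methylated),
    'U' (unmethylated), or '?' (ambiguous). Ambiguous calls are ignored;
    a position with no unambiguous call yields None.
    """
    if not clones:
        return []
    n_sites = len(clones[0])
    if any(len(clone) != n_sites for clone in clones):
        raise ValueError("all clones must cover the same number of CpG sites")
    fractions = []
    for site in range(n_sites):
        calls = [clone[site] for clone in clones if clone[site] in ("M", "U")]
        fractions.append(calls.count("M") / len(calls) if calls else None)
    return fractions


def overall_methylation(clones):
    """Fraction of methylated calls over all unambiguous CpG calls."""
    calls = [call for clone in clones for call in clone if call in ("M", "U")]
    return calls.count("M") / len(calls) if calls else None


if __name__ == "__main__":
    # Hypothetical clones for one promoter region (three alleles, four CpGs).
    example = ["MMUM", "MUUM", "M?UU"]
    print(methylation_per_cpg(example))
    print(overall_methylation(example))
```

Comparing such per-region fractions between cell types is one way to express statements like "highly methylated" or "partially demethylated" as numbers.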

the casein locus (E14 to E15) was first confined to Csnb and Csng. The other casein genes followed only from E16 onward, with the relative frequency of Csng-positive cells dropping thereafter (Fig. 4A). Concomitant with the spreading of gene expression within this locus, coexpression patterns in single cells became more complex, proceeding from one casein gene at E14 to four to five casein genes at postnatal day 1 (PN1) (Fig. 4B). These data suggested that the region around Csnb/g might serve as an entry site from which expression spreads in either direction. Because demethylation of the Csnb gene promoter preceded gene expression in mTECs postnatally, we also assessed this epigenetic mark from the initiation of Csnb transcription at E14 throughout thymic ontogeny (Fig. 4C). DNA demethylation of the Csnb promoter correlated well with the expression levels of Csnb during the fetal period, where Csnb is clearly detectable at the population level at E17 (16) and at the single cell level at E15.5 (Fig. S2). Similarly to the ontogeny of methylation patterns, Csnb-expressing cells do not reach maximal frequencies until the adult state (Fig. S2). Due to the gradual emergence of Csnb-expressing cells, we could not resolve a stage at which promoter demethylation preceded Csnb expression during embryogenesis.

Fig. 3. Gad67/eGFP-enriched mTECs show active epigenetic marks in the Gad67 promoter. (A) ChIP for H4 acetylation and H3K4me3 was performed with freshly prepared thymocytes, total brain cells and the indicated mTEC subsets from heterozygous Gad67/eGFP mice. The 5′ end of the eGFP gene and as controls the promoters of the CD45, Csna and Csnb genes were analyzed by qPCR and normalized to H3. All data are representative of two to three independent experiments. (B) The DNA methylation pattern of the Gad67/eGFP promoter/exon1 region was analyzed by bisulfite sequencing (positions marked relative to the first codon of Gad67/eGFP) (for details see Fig. 2 legend). Shown are five representative sequence results.

Next to local epigenetic signatures such as DNA methylation or histone tail modifications, the overall compaction of chromatin around a target locus influences DNA accessibility and thus gene expression. Active transcription tends to correlate with decondensed chromatin, which is more accessible for the transcription machinery. Because the Csnb gene may serve as a potential entry site, we tested whether active transcription is associated with changes in chromatin structure in this region. Most mTEChigh express Csnb and thus these cells could be easily preenriched and FISH probes were chosen to flank this gene (Fig. 5A and Fig. S4). The distance between the probes was chosen to give small geometric distances (103 kb), which were still above the resolution of spectral precision distance microscopy (SPDM) measurements. A larger distance would have increased the chance of measuring not the true length (i.e., compaction) of the locus but twisted/coiled chromatin strands resulting in a higher apparent compaction. Distances between the two probes were measured in mTEClow, mTEChigh, and as a positive control in MEC with high-level transcription of casein genes. MEC and mTEChigh had similar distances (approximately 360 nm, corresponding to a compaction factor of 100), whereas mTEClow had a smaller distance (237 nm, corresponding to a compaction factor of 150), which differed significantly from the other samples (Fig. 5 B and C). Hence, correlating with transcription, the region around the Csnb gene becomes decondensed upon terminal mTEC differentiation.

Allelic-Specific Expression of the Gad67 Locus in mTECs. PGE has been shown to have probabilistic attributes (17). Thus, promiscuous expression of three Aire-regulated TRAs showed expression of either or both alleles, whereas expression of the corresponding genes in peripheral tissues was strictly biallelic. Single cell (SC) RT-PCR allowed us to assess the coexpression pattern of the Gad67/eGFP and Gad67 wild-type alleles in single mTECs and thus extend this type of analysis to an Aire-independent TRA. Multiplex SC PCR was performed with sorted eGFP+ and eGFP− mature mTECs and immature mTECs from heterozygous Gad67/eGFP mice. Of all mature mTECs expressing this locus, 59% and 11% expressed either allele whereas 30% coexpressed both alleles. In contrast, 75% of all eGFP+ neurons showed biallelic expression of the Gad67/eGFP locus at the mRNA level (Fig. S5B) and virtually all of them at the protein level (Fig. S6), again emphasizing the different regulation of the same gene in mTECs versus the corresponding tissue.

These data are well in accord with those reported for Aire-regulated genes (17). Given that terminally differentiated mTECs display the most complex pattern of pGE (2), we asked whether this feature also pertains to allele-specific gene expression. We therefore correlated allele-specific expression with coexpression of the transcriptional regulator Aire at the single cell level. Aire served here as a marker of short-lived terminally differentiated mTECs (18). Interestingly, nearly all mTECs showing biallelic expression of the Gad67 locus also expressed Aire (91%), whereas this was not the case for cells expressing either allele or none (Fig. S5C). Hence, also with regard to allele-specific expression terminally differentiated mTECs display the most complex pattern, i.e., biallelic expression, possibly as a result of cumulative stochastic switch-on of both alleles with increasing lifespan of these cells.

Fig. 4. Developmental dynamics of gene expression in the casein locus. (A) Frequency of mature mTECs expressing different casein genes during ontogeny as analyzed by SC PCR. Note that expression of Csnb and -g precedes that of Csna, -d, and -k. Significance of the pairwise comparison (grouped by brackets) by exact McNemar test: *P < 0.005; **P < 0.0001. (B) Expression of casein genes in mature mTECs during ontogeny [embryonic day E14 to postnatal day 1 (PN1)] was assessed by SC PCR. Shown is the frequency of single cells expressing one to five casein genes simultaneously. (C) DNA of ex vivo purified mature mTECs of different embryonic stages (E14–E17) and of newborn mice (PN1) was analyzed for the methylation pattern of the Csnb promoter (for details see Fig. 2 legend).

Fig. 5. Decontraction of the casein locus upon differentiation. (A) Probes were localized upstream and downstream of Csna/Csnb on mouse chromosome 5. Probe 1 is 30 kb long [labeled with OregonGreen (Invitrogen)] and probe 2 is 46 kb long [labeled with Alexa-Fluor 647 (Invitrogen)]; both were constructed from a 220 kb BAC clone (RP23-110B6). (B) Distances between probe 1 and probe 2 in the casein gene locus were measured in 3D in mature and immature mTECs as well as in MECs. Distributions of distances do not show any local maxima or differences in heterogeneity between the three cell populations. (C) Distances were plotted for all three cell populations and were found not to have multiple maxima indicative of different subpopulations. A significant decontraction of the casein locus was measured upon maturation from immature to mature mTECs (P < 0.05).

PGE Is Transient in Mature mTECs. Surprisingly, eGFP mRNA expression was only detected in 33% of the eGFP+ mTECs. Thus, in contrast to EpCAM and Aire, eGFP-specific mRNA and protein were largely discordantly expressed (Fig. S5B). These data show that eGFP mRNA expression is not maintained during the entire lifespan of mature mTECs, but is instead transient or intermittent. Such a discordancy will only be revealed if the protein of interest has a sufficiently long half-life, estimated to be about 1 d in the case of eGFP (19–21). Independent evidence for transient pGE was obtained by assessing the promoter activity of another tissue-specific gene in mTECs using the lacZ reporter system. The frequency of mTECs expressing lacZ was compared between two different scenarios: either lacZ was driven directly by the tissue-specific promoter of connexin 57 (Cx57) [only expressed in horizontal cells of the mouse retina (22) and in immature and mature mTECs], or alternatively Cre-recombinase was driven by the Cx57 promoter and this transgenic line was crossed with a ROSA26/lacZ reporter strain. Whereas lacZ expression in the former line reveals cells with ongoing Cx57 promoter activity, the latter strain lineage traces all cells in which the Cx57 promoter had been switched on at any time during the life span of these cells. Strikingly, lacZ-positive stromal cells in the medulla were much more numerous in the ROSA26 reporter strain than in the single transgenic mice (1.2% versus <0.1% as estimated from cytospins of purified mTECs) (Fig. 6 and Fig. S7). This result indicates transient activity of the Cx57 promoter in mTECs.

Discussion

We have addressed the involvement of epigenetic mechanisms in the regulation of pGE at two distinct gene loci. A particular feature of promiscuously expressed tissue-restricted genes is their segregation into numerous chromosomal clusters (2, 4, 23). When analyzed in more detail, all genes in such a cluster were found to be coexpressed at the population level and to variable degrees

also in single mTECs (2, 5), suggesting epigenetic regulation. Coregulation of clustered genes has been found in a number of species, whereby it has been argued that the local proximity would facilitate coregulation of genes serving a common function in a particular cell lineage, i.e., muscle or red blood cell development (24). Prominent examples in this regard are the Hox gene or the globin gene locus (9). In the case of pGE, this concept has been extended to tissue-restricted genes, which are clustered irrespective of functional or structural relatedness or tissue-specific expression patterns. Targeting gene clusters rather than individual genes could explain how mTECs can express such an array of genes without any obvious commonalities (8). The casein region exemplifies such a cluster including genes specific for mammary gland, liver, kidney, and salivary gland (2). Here we tested the proposition of whether the entire cluster is primed for promiscuous transcription by permissive epigenetic marks irrespective of the particular expression pattern at the single cell level.

We analyzed the state of histone modifications and DNA methylation in mTEC subpopulations, MECs, and thymocytes for different promoter regions in the casein cluster. In mature mTECs, we found only the Csnb gene to be epigenetically opened at the promoter region by histone modifications (H4 acetylation and H3K4me3) and DNA demethylation. In contrast, permissive epigenetic marks were observed for all casein genes in the MEC population in line with strict coexpression of these genes in single epithelial cells of the lactating mammary gland (5). Strikingly, DNA demethylation of the Csnb promoter was already detectable in immature mTECs, thus preceding gene expression. This suggests a developmental order, whereby DNA demethylation precedes the introduction of permissive histone modifications. DNA demethylation of the Csnb promoter before gene expression may mark this as an access site into the casein locus from which pGE will spread in either direction (8). Note that Csnb promoter demethylation is not constitutive to the mTEC lineage but emerges during fetal development. As argued previously, coexpression of gene neighborhoods might be based on the “tight” regulation of a few genes within such a cluster (i.e., Csnb) and the neighbored genes are “carried along for a ride” (7). The pattern of gene expression in the casein locus during early ontogeny (E14–E17) is indeed compatible with such a scenario: Incipient transcription of the casein locus centers on the Csnb and Csng genes and only later extends to Csna, -d, and -k. In line with the Csnb region representing an entry site, we observed a significant decontraction of the region encompassing the Csna and Csnb genes upon differentiation of mTECs by high-resolution fluorescence in situ hybridization analysis, indicating changes in higher order chromatin configurations that extend beyond a single gene locus. It should however be emphasized that despite similar epigenetic marks, the regulation of Csnb is different in mTECs and MECs. Whereas Csnb expression requires C/EBPβ and Stat5ab in MECs, both factors are dispensable in mTECs (Fig. S8).

For all other genes in the casein locus, we did not detect active epigenetic marks. Given the threshold of the ChIP and bisulfite sequencing method, we could not exclude that active histone marks in those cells actually expressing a particular gene may have been missed. We therefore analyzed heterozygous Gad67/eGFP knockin mice, in which case eGFP driven by the Gad67 promoter served as surrogate TRA. Endogenous Gad67 is expressed at much lower frequencies than Csnb and thus represents the majority of promiscuously expressed genes. Overall, we found the Gad67/eGFP gene to be similarly regulated as the Csnb gene. Purified eGFP+ mTECs showed higher levels of H4 acetylation and H3K4me3 than nonexpressors (mature eGFP− or immature mTECs). In addition the promoter of the Gad67 gene was demethylated in immature and mature mTECs. We conclude that the association of permissive epigenetic marks with promiscuous expression is independent of the expression frequency in mTECs. Demethylation of regulatory regions might actually be a precondition for promiscuous gene expression.

A new twist in epigenetic regulation of pGE has been the finding that the methylation status of H3K4 had been linked to the molecular action of Aire (25, 26). Actively transcribed genes such as housekeeping genes typically carry the H3K4me3 mark at their promoters. Recently, it was reported that Aire binds with its PHD1 domain only to unmethylated H3K4. Consequently it was postulated that Aire-dependent, tissue-restricted genes lack trimethylated H3K4 in mTECs. Such genes would require Aire binding to unmethylated H3K4 to allow for recruitment of the transcription machinery (25–27). Implicitly, Aire-independent genes would not require nonmethylated H3K4 promoters for promiscuous expression in mTECs. Our data differ from a recent study reporting higher levels of H3K4me3 for Aire-independent versus Aire-dependent genes in immature mTECs and the up-regulation of H3K4me3 in Aire-dependent genes upon mTEC maturation (28). In contrast we only find up-regulation of H3K4me3 in mature mTECs for two Aire-independent genes upon appropriate enrichment of antigen-expressing mTECs. Without further enrichment, the low frequency of mTECs expressing a given TRA, however, precludes the unambiguous analysis of epigenetic marks in our hands. Whether promoters of Aire-dependent versus Aire-independent TRAs are generally differentially H3K4 methylated in mTECs and whether this holds the key for the target specificity of Aire remains conjectural (29). The fact that all genes in the casein locus except for Csnb did not show permissive epigenetic marks does not lend support to the idea that locuswide epigenetic alterations upon maturation of mTECs are a precondition for implementing the various gene expression patterns observed in single mTECs (6). Likewise we did not find evidence for progressive genomewide hypomethylation to possibly account for progressive pGE during terminal mTEC differentiation (30) (Fig. S9).

Fig. 6. Expression of connexin 57 is transient in mTECs. In situ staining of thymus cryosections for expression of lacZ either driven by the Cx57 promoter (A) or by the ROSA26 promoter and revealed by Cre recombinase under the Cx57 promoter (B). Arrow indicates lacZ-positive cells. Cytospins of sorted mTECs were stained for lacZ and the frequency of positive cells was determined visually.

TRA expression in mTECs has previously been shown to entail a probabilistic component (5, 17), as reflected by a varying degree of mono- versus biallelic expression of certain Aire-regulated genes. In the case of the Aire-independent Gad67/eGFP locus, we also found both mono- and biallelic expression. It was however notable that biallelic expression of the Gad67/eGFP locus segregated with Aire expression at the single cell level, i.e., nearly all mTECs coexpressing Gad67 and eGFP also expressed Aire. Aire served here as a marker of terminally differentiated mTECs; we do not infer a deterministic role of Aire in allele-specific gene regulation. With the genealogy between cells expressing one or two alleles still unknown, we speculate that the latter will eventually derive from the former as a result of stochastic events, which will accumulate in mature mTECs during their lifespan. In line with

Tykocinski et al. PNAS Early Edition | 5of6 Downloaded by guest on September 28, 2021 such a scenario, gene coexpression patterns in mTECs at the single a single mTEC to go through consecutive rounds of gene ex- cell level become increasingly complex during ontogeny. pression within its lifetime, either cycling the same or alternating Single cell expression analysis of the GAD67/eGFP locus sets of genes. Fluctuating pGE could influence the process of revealed a striking discrepancy between protein and mRNA ex- central tolerance induction. Provided a second or third round of pression. Only 33% of mature eGFP+ mTECs expressed eGFP- protein production would also be autonomously presented by specific mRNA, whereas about 91% of eGFP+ neurons expressed mTECs and/or cross-presented by DCs (31, 32), this could sub- specific mRNA. Although we cannot formally exclude that we stantially add to the diversity of self-antigen display over time missed low level eGFP mRNA expression in mTECs by SC-PCR, within a confined microenvironment. Postselection thymocytes we have no evidence for this as far as the analysis of EpCAM, might actually restrict their scanning range to medullary sub- Aire, and Csnb are concerned (5). We rather interpret this dis- territories (33). crepancy such that the majority of mature mTECs expressing the eGFP protein already have turned off eGFP-specific mRNA. Materials and Methods Such a dichotomy will only be revealed, if the protein half-life is +/lacZ +/lacZ fi Animals and Tissue. The connexin57 (Cx57 ) mice on the C57BL/6 suf ciently long to outlive termination of mRNA transcription. background have been described previously (22), and the connexin57-Cre These data argue in favor of transient or intermittent pGE at the recombinase strain (Cx57+/Cre) has been generated according to the same single cell level. This supposition is supported by an independent strategy. Details will be published elsewhere. See SI Materials and Methods experimental approach. 
The frequency of mTECs expressing lacZ for more details. was compared between two different conditions: either lacZ was driven directly by a tissue-specific promoter (i.e., Cx57) or alter- Preparation of Cells. MTECs were isolated by enzymatic digestion as described natively Cre was driven by the Cx57 promoter and this transgenic previously (34). For details and preparation of MECs and brain cells see SI line was crossed with a ROSA26/lacZ reporter strain. Whereas Materials and Methods. the lacZ positive cells in the former line reveal those cells with ongoing Cx57 promoter activity, the latter strain lineage marks Single Cell Sorting and Single Cell PCR. Primer design, cell sorting, reverse cells permanently, in which the Cx57 promoter had switched on transcription, first PCR amplification, and real-time quantitative PCR were Cre-mediated recombination at any time during the life span of performed as described (5). EGFP and Gad67 primers were designed to be these cells. Strikingly, lacZ-positive stromal cells in the medulla allele specific. were much more numerous in the reporter strain than in the single transgenic mice (about 10-fold more), a result indicative of Chromatin Immunoprecipitation, DNA Methylation Analysis, Fluorescence in transient activity of the Cx57 promoter in mTECs. If such tran- Situ Hybridization, and β-Galactosidase Staining. All are described in detail in sient expression were a general feature of pGE, it would allow SI Materials and Methods.
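The logic of the lineage-tracing comparison in the Discussion (direct Cx57-lacZ reporter versus Cx57-Cre × ROSA26/lacZ reporter) can be illustrated with a minimal toy model. The sketch below is not part of the original analysis. It assumes, purely for illustration, that every cell under consideration activates the promoter exactly once for a fixed duration at a random time in its life, that cells are sampled at a random age, and that the direct reporter is lost as soon as promoter activity ends (i.e., β-galactosidase persistence is ignored). The function name and all parameter values are hypothetical.

```python
import random


def simulate_reporters(n_cells=100_000, lifespan=1.0, on_duration=0.05, seed=0):
    """Toy snapshot of a cell population with one transient promoter pulse per cell.

    Returns the fraction of cells scored positive by a direct reporter
    (promoter active at sampling) and by a permanent lineage reporter
    (promoter active at or before sampling). All parameters are illustrative
    assumptions, not measured values.
    """
    rng = random.Random(seed)
    direct = 0   # promoter active at the moment of sampling
    lineage = 0  # promoter switched on now or at any earlier time
    for _ in range(n_cells):
        age = rng.uniform(0.0, lifespan)                   # cell age at sampling
        onset = rng.uniform(0.0, lifespan - on_duration)   # start of the pulse
        if onset <= age:
            lineage += 1                 # recombination has occurred: marked for life
            if age < onset + on_duration:
                direct += 1              # pulse still ongoing at sampling
    return direct / n_cells, lineage / n_cells


direct_frac, lineage_frac = simulate_reporters()
fold = lineage_frac / direct_frac
print(f"direct {direct_frac:.3f}, lineage {lineage_frac:.3f}, fold {fold:.1f}")
```

Under these assumptions the expected direct fraction is d/T and the expected lineage fraction is (T + d)/(2T), where T is the lifespan and d the pulse duration, so the expected fold difference is (T + d)/(2d). A pulse lasting about 5% of the lifespan therefore gives a fold difference of roughly 10, the order of magnitude reported above. The model only shows that brief promoter activity suffices to produce such a difference; it does not estimate the actual duration of Cx57 promoter activity in mTECs.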

1. Abramson J, Giraud M, Benoist C, Mathis D (2010) Aire’s partners in the molecular control of immunological tolerance. Cell 140:123–135.
2. Derbinski J, et al. (2005) Promiscuous gene expression in thymic epithelial cells is regulated at multiple levels. J Exp Med 202:33–45.
3. Anderson MS, et al. (2002) Projection of an immunological self shadow within the thymus by the aire protein. Science 298:1395–1401.
4. Johnnidis JB, et al. (2005) Chromosomal clustering of genes controlled by the aire transcription factor. Proc Natl Acad Sci USA 102:7233–7238.
5. Derbinski J, Pinto S, Rösch S, Hexel K, Kyewski B (2008) Promiscuous gene expression patterns in single medullary thymic epithelial cells argue for a stochastic mechanism. Proc Natl Acad Sci USA 105:657–662.
6. Kyewski B, Klein L (2006) A central role for central tolerance. Annu Rev Immunol 24:571–606.
7. Spellman PT, Rubin GM (2002) Evidence for large domains of similarly expressed genes in the Drosophila genome. J Biol 1:5.
8. Oliver B, Parisi M, Clark D (2002) Gene expression neighborhoods. J Biol 1:4.
9. Sproul D, Gilbert N, Bickmore WA (2005) The role of chromatin structure in regulating the expression of clustered genes. Nat Rev Genet 6:775–781.
10. Dillon N (2006) Gene regulation and large-scale chromatin organization in the nucleus. Chromosome Res 14:117–126.
11. Azuara V, et al. (2006) Chromatin signatures of pluripotent cell lines. Nat Cell Biol 8:532–538.
12. Bernstein BE, et al. (2006) A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell 125:315–326.
13. Boyer LA, et al. (2006) Polycomb complexes repress developmental regulators in murine embryonic stem cells. Nature 441:349–353.
14. Cedar H, Bergman Y (2009) Linking DNA methylation and histone modification: Patterns and paradigms. Nat Rev Genet 10:295–304.
15. Tamamaki N, et al. (2003) Green fluorescent protein expression and colocalization with calretinin, parvalbumin, and somatostatin in the GAD67-GFP knock-in mouse. J Comp Neurol 467:60–79.
16. Gäbler J, Arnold J, Kyewski B (2007) Promiscuous gene expression and the developmental dynamics of medullary thymic epithelial cells. Eur J Immunol 37:3363–3372.
17. Villaseñor J, Besse W, Benoist C, Mathis D (2008) Ectopic expression of peripheral-tissue antigens in the thymic epithelium: Probabilistic, monoallelic, misinitiated. Proc Natl Acad Sci USA 105:15854–15859.
18. Gray D, Abramson J, Benoist C, Mathis D (2007) Proliferative arrest and rapid turnover of thymic epithelial cells expressing Aire. J Exp Med 204:2521–2528.
19. Ward CM, Stern PL (2002) The human cytomegalovirus immediate-early promoter is transcriptionally active in undifferentiated mouse embryonic stem cells. Stem Cells 20:472–475.
20. Ruan H, et al. (1999) Killing of brain tumor cells by hypoxia-responsive element mediated expression of BAX. Neoplasia 1:431–437.
21. Holtkamp S, et al. (2006) Modification of antigen-encoding RNA increases stability, translational efficacy, and T-cell stimulatory capacity of dendritic cells. Blood 108:4009–4017.
22. Hombach S, et al. (2004) Functional expression of connexin57 in horizontal cells of the mouse retina. Eur J Neurosci 19:2633–2640.
23. Gotter J, Brors B, Hergenhahn M, Kyewski B (2004) Medullary epithelial cells of the human thymus express a highly diverse selection of tissue-specific genes colocalized in chromosomal clusters. J Exp Med 199:155–166.
24. Hurst LD, Pál C, Lercher MJ (2004) The evolutionary dynamics of eukaryotic gene order. Nat Rev Genet 5:299–310.
25. Org T, et al. (2008) The autoimmune regulator PHD finger binds to non-methylated histone H3K4 to activate gene expression. EMBO Rep 9:370–376.
26. Koh AS, et al. (2008) Aire employs a histone-binding module to mediate immunological tolerance, linking chromatin regulation with organ-specific autoimmunity. Proc Natl Acad Sci USA 105:15878–15883.
27. Peterson P, Org T, Rebane A (2008) Transcriptional regulation by AIRE: Molecular mechanisms of central tolerance. Nat Rev Immunol 8:948–957.
28. Org T, et al. (2009) AIRE activated tissue specific genes have histone modifications associated with inactive chromatin. Hum Mol Genet 18:4699–4710.
29. Koh AS, Kingston RE, Benoist C, Mathis D (2010) Global relevance of Aire binding to hypomethylated lysine-4 of histone-3. Proc Natl Acad Sci USA 107:13016–13021.
30. Gotter J, Kyewski B (2004) Regulating self-tolerance by deregulating gene expression. Curr Opin Immunol 16:741–745.
31. Gallegos AM, Bevan MJ (2004) Central tolerance to tissue-specific antigens mediated by direct and indirect antigen presentation. J Exp Med 200:1039–1049.
32. Koble C, Kyewski B (2009) The thymic medulla: A unique microenvironment for intercellular self-antigen transfer. J Exp Med 206:1505–1513.
33. Le Borgne M, et al. (2009) The impact of negative selection on thymocyte migration in the medulla. Nat Immunol 10:823–830.
34. Klein L, Klugmann M, Nave KA, Tuohy VK, Kyewski B (2000) Shaping of the autoreactive T-cell repertoire by a splice variant of self protein expressed in thymic epithelial cells. Nat Med 6:56–61.
