MCF-7 As a Model for Functional Analysis of Breast Cancer Risk Variants
Total Page:16
File Type:pdf, Size:1020Kb
Published OnlineFirst July 10, 2019; DOI: 10.1158/1055-9965.EPI-19-0066 Research Article Cancer Epidemiology, Biomarkers MCF-7 as a Model for Functional Analysis of Breast & Prevention Cancer Risk Variants Alix Booms, Gerhard A. Coetzee, and Steven E. Pierce Abstract Background: Breast cancer genetic predisposition is gov- enhancers and 15 CTCF occupancy sites that, between them, erned by more than 142 loci as revealed by genome-wide overlapped 96 breast cancer credible risk variants at 42 loci. association studies (GWAS). The functional contribution of These risk enhancers likely regulate the expression of dozens these risk loci to breast cancer remains unclear, and additional of genes, which are enriched for GO categories, including post-GWAS analyses are required. estrogen and prolactin signaling. Methods: We identified active regulatory elements (enhan- Conclusions: Ten (of 142) breast cancer risk loci likely cers, promoters, and chromatin organizing elements) by his- function via enhancers that are active in MCF-7 and are well tone H3K27 acetylation and CTCF occupancy and determined suited to targeted manipulation in this system. In contrast, risk the enrichment of risk variants at these sites. We compared loci cannot be mapped to specific CTCF-binding sites, and the these results with previously published data and for other cell genes linked to risk CTCF sites did not show functional lines, including human mammary epithelial cells, and related enrichment. The identity of risk enhancers and their associated these data to gene expression. genes suggests that some risk may function during later pro- Results: In terms of mapping accuracy and resolution, our cesses in cancer progression. þ data augment previous annotations of the MCF-7 epigenome. Impact: Here, we report how the ER cell line MCF-7 can be After intersection with GWAS risk variants, we found 39 used to dissect risk mechanisms for breast cancer. Introduction genotypes are poorly understood and do not alter protein coding. Elucidating the functional basis of these common risk variants is According to the traditional model of carcinogenesis, normal therefore of great importance. tissue undergoes rounds of oncogenic mutations during prolif- A recent GWAS uncovered 65 new breast cancer risk loci eration that eventually leads to metastatic tumor growth. An contributing to a total of roughly 142 reproducible breast cancer individual's inherited genetic predisposition for cancer, which risk loci containing over 38,000 statistically significant (com- can be measured by linkage analysis or genome-wide association À bined P value < 5 Â 10 8) or near-significant (combined P value studies (GWAS), influences the rate of those oncogenic somatic À < 1 Â 10 5) risk variants (4). These variants primarily consist of mutations and the environment in which the tumors develop (1). SNPs (we use "variant" and "SNP" interchangeably in this text). In breast cancer, rare but high-penetrance inherited mutations to Due to the large number of risk-associated SNPs as well as to their genes, such as BRCA1 and BRCA2, contribute about 30% to the presence in noncoding DNA, it can be difficult to pinpoint which familial risk of developing breast cancer and, in general, have well- SNP or combination of SNPs are causal for a disease, let alone understood biological consequences (2). However, only about explain the biological mechanisms/genes involved. This will only 5% to 10% of breast cancer cases are actually associated with this become more challenging as the number of identified risk SNPs is type of germline mutation. In contrast, GWAS show that low- expected to increase as the sizes of case–control studies grow penetrance but common genetic variants explain up to 50% of larger in the near future. Our goal is therefore to prioritize SNPs disease heritability and contribute significant risk to the devel- based on their potential effects on cell-type–specific genomic opment of both familial and sporadic breast cancer (3). Unlike activity. high-penetrance rare mutations, in most cases, these other risk Risk SNPs are nonrandomly distributed throughout the genome and have been shown to be enriched in tissue-specific noncoding regulatory elements (RE), mainly in enhancers (5–7). Center for Neurodegenerative Science, Van Andel Research Institute, Grand REs are defined as regions of noncoding DNA that regulate the Rapids, Michigan. transcription of genes. Enhancers are a class of RE that influence Note: Supplementary data for this article are available at Cancer Epidemiology, cell fate and development through coordinated interaction Biomarkers & Prevention Online (http://cebp.aacrjournals.org/). between transcription factors (TF) and their target promoters to G.A. Coetzee and S.E. Pierce codirected this work. alter the transcription of genes (8). Enhancers are marked by surrounding histone modifications and nucleosome depletion. Corresponding Author: Gerhard A. Coetzee, Van Andel Research Institute, The histone modification most often used is H3K27 acetylation, 333 Bostwick Avenue N.E., Grand Rapids, MI 49503. Phone: 646-696-1560; E-mail: [email protected] because it has been shown that it marks active (engaged) enhan- cers (9). Enhancers are also relatively transient, and their activity is Cancer Epidemiol Biomarkers Prev 2019;28:1735–45 dictated by extracellular signals that activate complex enhancer doi: 10.1158/1055-9965.EPI-19-0066 networks to promote cell-type– and condition-specific gene Ó2019 American Association for Cancer Research. expression. Common breast cancer risk variants present in gene www.aacrjournals.org 1735 Downloaded from cebp.aacrjournals.org on September 26, 2021. © 2019 American Association for Cancer Research. Published OnlineFirst July 10, 2019; DOI: 10.1158/1055-9965.EPI-19-0066 Booms et al. REs are likely to subtly alter gene expression patterns, compared Chromatin immunoprecipitation sequencing with protective alleles, in ways that predispose some individuals For chromatin immunoprecipitation (ChIP), we followed pre- to cancer. For example, studies of the noncoding region 8q24 viously published protocols from Rhie and colleagues (21) with upstream of MYC have shown enrichment of variants associated slight modifications. Roughly 30 to 40 Â 106 cells were used per with different tumor types, including breast, in multiple enhan- ChIP. Upon reaching 70% to 80% confluency, cells were directly cers at this location (10–12). fixed in T75 flasks by adding 16% formaldehyde to the culture Another important feature shown to coincide with risk SNPs medium to a final concentration of 1%. The reaction was is CTCF (CCCTC binding-factor). CTCF is a highly conserved quenched for 5 minutes at room temperature with 10X (1.15 11-zinc finger protein and is the only known insulator protein mol/L) glycine. Using a Bioruptor Pico (Diagenode, Cat # in vertebrates (13). It is most often enriched at loop and B01060001), the isolated chromatin was sonicated for 30-second topologically associated domain (TAD) boundaries that sepa- on and 30-second off cycles to yield DNA fragments between 200 rate transcriptionally active and repressed genes (14). This and 500 base pairs. Note that 100 mg of sonicated chromatin was implicates its importance in the organization of chromatin used for immunoprecipitation, and 1 mg (1%) was used for the compartments to prevent aberrant gene–enhancer interactions. input control. To probe for CTCF or H3K27ac, samples were Lupianez and colleagues (15) showed that removing the CTCF- incubated at 4C overnight with a primary antibody [CTCF: Cell associated boundary elements at the Epha4 TAD causes abnor- Signaling monoclonal (D31H2), Cat# 3418; H3K27ac: Active mal interactions between adjacent TADs and expression of Motif, Cat #39133] or an IgG control (Sigma, Cat # R9133). A/G genes within those TADs. Others have shown that CTCF is magnetic beads (Pierce, Cat # 88802) were then incubated with important for spatial organization of the genome for proper the samples for 2 hours at 4C. Following this incubation, the induction and silencing of transcription (14, 16). Although the beads were washed with a series of buffers of varying salt con- relationship between SNPs and CTCF is less clear than that of centrations before overnight elution at 67C. The ChIP, IgG, and SNPs within enhancers, SNP-mediated disruptions of CTCF Input samples were all purified using a QIAprep Spin Miniprep Kit consensus sequences have been shown to change the binding (Qiagen, Cat # 27104). affinity of the CTCF protein (17). This could result in gene spatial network rewiring that has a strong influence on gene Construction and sequencing of ChIP-Seq libraries expression changes that indirectly lead to breast cancer. Such Libraries for input and IP samples were prepared by the Van rewiring was demonstrated by a CRISPR-mediated deletion of Andel Genomics Core from 10 ng of input material and all prostate cancer risk–associated CTCF sites, which identified available IP material using the KAPA Hyper Prep Kit (v5.16; Kapa repressive chromatin loops [17]. Biosystems). Prior to PCR amplification, end-repaired and poly- In the present study, we used the MCF-7 cell line as a model adenylated DNA fragments were ligated to Bio Scientific NEXTflex of breast cancer. It is one of the most utilized breast cancer cell Adapters (Bio Scientific). The quality and quantity of the finished lines in cancer research due to its long history and expression libraries were assessed using a combination of Agilent DNA High of the estrogen receptor (ER; ref. 18). The majority of patients Sensitivity chip (Agilent Technologies, Inc.), QuantiFluor dsDNA (72.7%) with a known HR/HER status (ER þ progesterone System (Promega Corp.), and Kapa Illumina Library Quantifica- þ receptor/human epidermal growth factor receptor) are HR / tion qPCR assays (Kapa Biosystems). Sequencing (75 bp, single À HER2 (19). MCF-7 cells are one of the few breast cancer cell end) was performed on an Illumina NextSeq 500 sequencer using þ À lines that is also HR /HER2 , making it an extremely relevant a 75-bp sequencing kit (v2; Illumina Inc.).