ChIP-on-Chip Significance Analysis Reveals Large-Scale Binding and Regulation by Human Transcription Factor Oncogenes

Adam A. Margolin(a,b,c,1), Teresa Palomero(d,e), Pavel Sumazin(b), Andrea Califano(a,b,d,2,3), Adolfo A. Ferrando(d,e,f,2,3), and Gustavo Stolovitzky(b,c,2,3)

(a) Department of Biomedical Informatics, (b) Joint Centers for Systems Biology, (d) Institute for Cancer Genetics, (e) Department of Pathology, and (f) Department of Pediatrics, Columbia University, New York, NY 10032; and (c) Functional Genomics and Systems Biology Group, IBM T.J. Watson Research Center, Yorktown Heights, NY 10598

Edited by Barry H. Honig, Columbia University, New York, NY, and approved November 1, 2008 (received for review July 9, 2008)

PNAS, January 6, 2009, vol. 106, no. 1, 244–249. www.pnas.org/cgi/doi/10.1073/pnas.0806445106

Author contributions: A.A.M., P.S., A.C., A.A.F., and G.S. designed research; A.A.M., T.P., and P.S. performed research; T.P. and A.A.F. contributed new reagents/analytic tools; A.A.M. and P.S. analyzed data; and A.A.M., T.P., P.S., A.C., A.A.F., and G.S. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Freely available online through the PNAS open access option.
Data deposition: The microarray data have been deposited in the Gene Expression Omnibus (GEO) Database, www.ncbi.nlm.nih.gov/geo (accession no. GSE12868). ChIP² data is at http://wiki.c2b2.columbia.edu/califanolab/PNASAM2009/.
(1) Present address: The Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142.
(2) A.C., A.A.F., and G.S. contributed equally to this work.
(3) To whom correspondence may be addressed. E-mail: [email protected], califano@c2b2.columbia.edu, or [email protected].
This article contains supporting information online at www.pnas.org/cgi/content/full/0806445106/DCSupplemental.
© 2008 by The National Academy of Sciences of the USA

ChIP-on-chip has emerged as a powerful tool to dissect the complex network of regulatory interactions between transcription factors and their targets. However, most ChIP-on-chip analysis methods use conservative approaches aimed at minimizing false-positive transcription factor targets. We present a model with improved sensitivity in detecting binding events from ChIP-on-chip data. Its application to human T cells, followed by extensive biochemical validation, reveals that 3 oncogenic transcription factors, NOTCH1, MYC, and HES1, bind to several thousand target gene promoters, up to an order of magnitude increase over conventional analysis methods. Gene expression profiling upon NOTCH1 inhibition shows broad-scale functional regulation across the entire range of predicted target genes, establishing a closer link between occupancy and regulation. Finally, the increased sensitivity reveals a combinatorial regulatory program in which MYC cobinds to virtually all NOTCH1-bound promoters. Overall, these results suggest an unappreciated complexity of transcriptional regulatory networks and highlight the fundamental importance of genome-scale analysis to represent transcriptional programs.

regulatory networks | T cell lymphoblastic leukemia | transcriptional regulation | systems biology

The dysregulated activity of oncogenic transcription factors (TFs) contributes to neoplastic transformation by promoting aberrant expression of target genes involved in regulating cell homeostasis. Therefore, characterization of the regulatory networks controlled by these TFs is a critical objective in understanding the molecular mechanisms of cell transformation. ChIP-on-chip (ChIP²) (1) has emerged as a promising technology in the dissection of transcriptional networks by providing high-resolution maps of genome-wide TF–chromatin interactions.

ChIP² uses microarray technology to measure the relative abundance of genomic fragments derived from an immunoprecipitate (IP) sample, which is enriched in fragments bound by an immunoprecipitated protein (usually a TF), and a whole-cell extract (WCE) sample, containing fragments derived from a total chromatin preparation (input control) or an immunoprecipitation with a nonspecific control antibody (2). The 2 samples may either be hybridized to different arrays or labeled with different dyes and hybridized to the same array. Correct interpretation of ChIP² data depends critically on an accurate statistical model to compute the probability that a given IP/WCE ratio is produced by a binding event rather than experimental noise.

Recently, several elegant ChIP² analysis methods have been proposed to tackle problems such as integrating measurements from adjacent probes (3–6) or inferring binding site locations at subprobe resolution (7). However, the lower-level problem of developing an accurate error model to define meaningful statistical thresholds has received comparably little attention [see SI and Fig. 1]. Thus, ChIP² data analysis methods often use highly conservative approaches aimed at minimizing the rate of false-positive predictions. Although several studies have experimentally validated novel target collections produced at a given statistical threshold (8–12), these studies likely miss a large number of true binding events, obscuring the full complexity of transcriptional processes.

Using an empirically determined model of the distribution of intensity ratios for non-IP-enriched probes in ChIP² experiments, we developed an analytical method called ChIP² Significance Analysis (CSA). When applied to ChIP² data from the NOTCH1, MYC, and HES1 protooncogenes in human T cell acute lymphoblastic leukemia (T-ALL) cells, CSA increased the number of detected binding sites by up to an order of magnitude compared with other routinely used methods. Both binding site analysis and biochemical validation demonstrate quantitative agreement with CSA-predicted false-positive rates. Analysis of gene expression signatures indicates functional regulation by NOTCH1 across the entire range of predicted targets. Finally, the increased sensitivity reveals that virtually all NOTCH1-bound promoters are also bound by MYC. Overall, these results highlight the power of the proposed analysis framework for the identification of transcriptional networks and provide an improved and fundamentally different picture of the transcriptional programs controlled by NOTCH1, HES1, and MYC in T-ALL.

Results

Probe Statistics Are Accurately Modeled by CSA. T-ALL is a malignant tumor characterized by the aberrant activation of oncogenic TFs (13). We recently demonstrated that constitutive activation of NOTCH1 signaling due to mutations in the NOTCH1 gene activates a transcriptional network that controls leukemic cell growth (11, 14–16). These studies also demonstrated a fundamental role for HES1 and MYC as transcriptional mediators of NOTCH1 signals (15, 17). To characterize the structure of the oncogenic transcriptional network driven by activated NOTCH1 in T cell transformation, we sought to identify the direct transcriptional [...] accurate description of the individual and combinatorial regulatory programs controlled by these TFs.
We first generated an empirical model of the distribution of IP/WCE intensity ratios for probes associated with unbound fragments (see Materials and Methods), and we used it to assign a P value to each probe in the analysis of ChIP² assays representing replicate experiments for NOTCH1, MYC, and HES1. ChIP² assays for these TFs were performed in HPB-ALL cells, a well-characterized T-ALL cell line with high expression levels of activated NOTCH1, MYC, and HES1. For NOTCH1, ChIP² assays were also performed in CUTLL1 cells, another NOTCH1-dependent T-ALL cell line. The magnitude versus amplitude plots (Fig. 2A) of the intensity-dependent distributions of probe-ratio values showed marked differences for the four experiments. In each case CSA accurately modeled the left tail of the probe ratio probability distribution, where the contribution from bound probes is expected to be minimal (Fig. 2 A and B). We note that if bound-probe ratios are well separated from the experimental noise, the P value distribution for all probes should be uniform between zero and one (unbound probes) with a single peak near zero (bound probes). Importantly, CSA accurately captured these statistical properties (see SI).

Fig. 1. Modeling errors of methods that use whole-dataset statistics for either normalization or significance detection. Blue bars represent a histogram of log2 IP/WCE probe ratio values from a MYC ChIP² experiment. The histogram displays distinct, overlapping distributions for bound and unbound probes. The dotted red curve shows the log2 ratio values after mean centering, a common normalization technique that, for this experiment, adjusts the mean of the null distribution to be negative to compensate for the large number of high-ratio values for the bound probes. The green curve represents a Gaussian fitted to the overall distribution, demonstrating that analysis methods that fit a global error model to these data will significantly overestimate the variance of the null distribution and will incur a high false-negative rate, as shown by the black arrow, which represents 2 standard deviations [...]

Improved ChIP² Sensitivity by CSA. CSA then incorporates the probe significance model with an analytical method that integrates the statistics for replicate experiments and probes with nearby genomic locations (to account for ChIP² fragmentation lengths, see Materials and Methods). We used CSA to compute the false [...]
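
The contrast drawn in Fig. 1 and in the Results, between fitting a null model to the left tail of the log-ratio distribution and fitting a single Gaussian to the whole dataset, can be illustrated with a small simulation. The sketch below is not the CSA implementation: CSA's empirical, intensity-dependent model is specified in Materials and Methods, which is not reproduced here. As stand-in assumptions, the null is taken to be Gaussian, its center is taken as the median of all log ratios (reasonable only if bound probes are a minority), and the data are synthetic. All function names, mixture parameters, and the threshold `alpha` are hypothetical.

```python
import math
import numpy as np


def left_tail_null(log_ratios):
    """Estimate a Gaussian null (center, sigma) from the left half of the data only.

    Assumption for this sketch: the median is an adequate center because bound
    probes are a minority and contribute little below the center.
    """
    center = float(np.median(log_ratios))
    left = log_ratios[log_ratios < center]
    # Spread of the left half measured about the center (mirror-image estimate).
    sigma = float(np.sqrt(np.mean((left - center) ** 2)))
    return center, sigma


def upper_tail_p(log_ratios, center, sigma):
    """One-sided P value: probability that a null probe has a ratio at least this high."""
    z = (log_ratios - center) / sigma
    return np.array([0.5 * math.erfc(v / math.sqrt(2.0)) for v in z])


def simulate(seed=0, n_unbound=18000, n_bound=2000):
    """Synthetic log2 IP/WCE ratios: a narrow unbound mode plus a shifted bound mode."""
    rng = np.random.default_rng(seed)
    unbound = rng.normal(0.0, 0.3, n_unbound)  # hypothetical noise distribution
    bound = rng.normal(1.0, 0.5, n_bound)      # hypothetical enriched probes
    return np.concatenate([unbound, bound])


def compare(log_ratios, alpha=1e-3):
    """Count probes called at level alpha under a left-tail null and a global Gaussian fit."""
    c_left, s_left = left_tail_null(log_ratios)
    c_glob, s_glob = float(np.mean(log_ratios)), float(np.std(log_ratios))
    calls_left = int(np.sum(upper_tail_p(log_ratios, c_left, s_left) < alpha))
    calls_glob = int(np.sum(upper_tail_p(log_ratios, c_glob, s_glob) < alpha))
    return {
        "sigma_left": s_left,
        "sigma_global": s_glob,
        "calls_left": calls_left,
        "calls_global": calls_glob,
    }


if __name__ == "__main__":
    print(compare(simulate()))
```

In this toy setting the global fit absorbs the bound probes into its variance estimate, so its threshold sits further to the right and it calls fewer probes at the same nominal P value, which is the false-negative effect the Fig. 1 caption describes. The simulation says nothing about the size of that effect on real ChIP² data.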