Integration of 198 ChIP-seq datasets reveals human cis-regulatory regions
Hamid Bolouri & Larry Ruzzo
Thanks to Stephen Tapscott, Steven Henikoff & Zizhen Yao
Slides will be available from: http://labs.fhcrc.org/bolouri/
Email [email protected] for manuscript (Bolouri & Ruzzo, submitted)

Kleinjan & van Heyningen, Am. J. Hum. Genet., 2005, (76)8–32
Epstein, Briefings in Func. Genom. & Proteomics, 2009, 8(4)310-16

Regulation of SPi1 (Sfpi1, PU.1 protein) expression – part 1
miR155*, miR569#
~750nt promoter
~250nt promoter
The antisense RNA
• causes translational stalling
• has its own promoter
• requires distal SPI1 enhancer
• is transcribed with/without SPI1.
# Hikami et al, Arthritis & Rheumatism, 2011, 63(3):755–763
* Vigorito et al, 2007, Immunity 27, 847–859
Ebralidze et al, Genes & Development, 2008, 22: 2085-2092.

Regulation of SPi1 expression – part 2 (mouse coordinates)
Bidirectional ncRNA transcription proportional to PU.1 expression
[Figure: upstream regulatory elements at -18Kb, -14Kb, -12Kb, -10Kb and -9Kb; element sizes 500b to 750b. Binding-site labels: PU.1/ELF1/FLI1/GLI1, GATA1, GATA1, Sox4/TCF/LEF, PU.1, RUNX1, SP1, RUNX1, RUNX1, SP1, ELF1, NF-kB, SATB1, IKAROS, PU.1, cJun/CEBP, OCT1, cJun/CEBP]
Chou et al, Blood, 2009, 114: 983-994
Hoogenkamp et al, Molecular & Cellular Biology, 2007, 27(21):7425-7438
Zarnegar & Rothenberg, 2010, Mol. & cell Biol. 4922-4939

An NF-kB binding-site variant in the SPI1 URE reduces PU.1 expression & is correlated with AML
GGGCCTCCCC
GGGTCTTCCC
Bonadies et al, Oncogene, 2009, 29(7):1062-72.
SATB1 binding site
A distal single nucleotide polymorphism alters long-range regulation of the PU.1 gene in acute myeloid leukemia
Steidl et al, J Clin Invest. 2007, 117(9):2611-20.

ChIP-seq of 13 sequence-specific TFs
Nanog, Oct4, STAT3, Smad1, Sox2, Zfx, c-Myc, n-Myc, Klf4, Esrrb, Tcfcp2l1, E2f1, and CTCF
[Figure axes: Number of loci; Number of TFs bound within 100bp of nearest neighbor]
Chen et al, Cell, 2008;133(6):1106-17

Nature.
2011;473(7345):43-9
Cells: H1hesc, K562, HepG2, HUVEC, HSMM, HMEC, NHLF, NHEK + 1 HapMap B-lymphoblastoid
Histone marks classified as: 1_Active_Promoter, 2_Weak_Promoter, 3_Poised_Promoter, 4_Strong_Enhancer, 5_Strong_Enhancer, 6_Weak_Enhancer, 7_Weak_Enhancer
(total footprint of all histone marked regions = 627,972,582 bps, ~20.9% of the genome)
(HeLaS3, HUVEC, K562, NHEK, H1hesc + 7 HapMap B-lymphoblastoid cell lines)
958,250 / 1,067,220 = 89.8% of DNase1 selected regions overlap histone marked regions
(total footprint of DNase1-selected-regions = 22,388,756 bps, ~0.75% of the genome)

198 ChIPseq experiments analyzed in this study
colored text indicates datasets expected to overlap
red border: immortalized B-lymphocyte from 2nd HapMap individual
A549GrPcr1xDexb Ecc1EralphaaV0416102Estradia1h Gm12878Atf3Pcr1x H1hescAtf3V0416102 Hepg2Atf3V0416101 K562Bcl3Pcr1x AR_Biaoyang A549GrPcr1xDexc Ecc1EralphaaV0416102Gen1h Gm12878BatfPcr1x H1hescBcl11aPcr1x Hepg2Bhlhe40V0416101 K562CebpbIggrab ArRD A549GrPcr1xDexd Ecc1Foxa1sc6553V0416102Dmso2 Gm12878Bcl11aPcr1x H1hescCjunIggrab Hepg2bZnf274Ucd K562CmycIfng30Std betaCatenin A549GrPcr2xDexa Ecc1GrV0416102Dexa Gm12878Cmyc H1hescCmyc Hepg2CebpbIggrab K562Cmyc CBPJurkat A549Usf1Pcr1xDex100nm Gm12878Ebfsc137065Pcr1x H1hescEgr1V0416102 Hepg2CjunIggrab K562E2f6sc22823V0416102 ELF1RD A549Usf1Pcr1xEtoh02 Gm12878Egr1V0416101 H1hescGabpPcr1x Hepg2Cmyc K562Efos ERGRD Nb4CmycStd Gm12878Elf1sc631V0416101 H1hescJundV0416102 Hepg2Elf1sc631V0416101 K562Egata2 Ets1Jurkat Htb11NrsfPcr2x Nb4MaxStd Gm12878Ets1Pcr1x H1hescMaxUcd Hepg2Fosl2V0416101 K562Egr1V0416101 EwsErgRD Gm12878GabpPcr2x H1hescNanogsc33759V0416102 Hepg2Foxa1sc101058V0416101 K562Ejunb Fli1RD Gm12878Irf3Std H1hescNrf1Iggrab Hepg2Foxa1sc6553V0416101 K562Ejund KLF4 HuvecCfosUcd SknshraP300V0416102 Gm12878Irf4sc6059Pcr1x H1hescNrsfV0416102 Hepg2Foxa2sc6554V0416101 K562Elf1sc631V0416102 Nanog HuvecCmyc SknshraUsf1sc8983V0416102 Gm12878Mef2aPcr1x H1hescP300V0416102
Hepg2GabpPcr2x K562Enr4a1 nfkbRD HuvecGata2Ucd SknshraYy1sc281V0416102 Gm12878Mef2csc13268V0416101 H1hescPou5f1sc9081V0416102 Hepg2Hey1V0416101 K562Ets1V0416101 Oct4Pou5F1 Gm12878Nrf1Iggmus H1hescRxraV0416102 Hepg2Hnf4asc8987V0416101 K562Fosl1sc183V0416101 P63goodRD Gm12878NrsfPcr2x H1hescSix5Pcr1x Hepg2Hnf4gsc6558V0416101 K562GabpV0416101 PU1RD Mcf10aesStat3Etoh01bStd T47dEralphaaPcr2xGen1h Gm12878P300Pcr1x H1hescSp1Pcr1x Hepg2Irf3Iggrab K562Gata2sc267Pcr1x Runx1Jurkat Mcf10aesStat3Etoh01cStd T47dEralphaaV0416102Estradia1h Gm12878Pax5c20Pcr1x H1hescSrfPcr1x Hepg2JundIggrab K562Hey1Pcr1x selectedStat1ChIPseqPeaks Mcf10aesStat3Etoh01Std T47dFoxa1sc6553V0416102Dmso2 Gm12878Pax5n19Pcr1x H1hescTcf12Pcr1x Hepg2JundPcr1x K562Irf1Ifna30Std Sox2 Mcf10aesStat3TamStd T47dGata3sc268V0416102Dmso2 Gm12878Pbx3Pcr1x H1hescUsf1Pcr1x Hepg2Maffm8194Iggrab K562Irf1Ifng6hStd Sox2cell2 Mcf7CmycVeh T47dP300V0416102Dmso2 Gm12878Pou2f2Pcr1x H1hescUsf2Iggrab Hepg2Mafkab50322Iggrab K562Mafkab50322Iggrab SpdefRD Gm12878Pu1Pcr1x H1hescYy1sc281V0416102 Hepg2Mafksc477Iggrab K562MaxV0416102 VdrRD Gm12878RxraPcr1x Hepg2Nrf1Iggrab K562Nrf1Iggrab ZNF263 Nt2d1Znf274Ucd Panc1NrsfPcr2x Gm12878Six5Pcr1x Hepg2NrsfPcr2x K562NrsfV0416102 Gm12878Sp1Pcr1x Hepg2P300V0416101 K562Pu1Pcr1x Gm12878SrfPcr2x Hepg2RxraPcr1x K562Six5Pcr1x not-ENCODE Trexhek293Znf263Ucd PbdeGata1Ucd Gm12878SrfV0416101 Hepg2Sp1Pcr1x K562Sp1Pcr1x Hek293bElk4Ucd Gm12878Stat1Std Hepg2SrfV0416101 K562Sp2sc643V0416102 Gm12878Stat3Iggmus Hepg2Tcf12Pcr1x K562SrfV0416101 U87NrsfPcr2x Pfsk1NrsfPcr2x Gm12878Tcf12Pcr1x Hepg2Tcf4Ucd K562Tal1sc12984Iggmus Gm12878Usf1Pcr2x Hepg2Usf1Pcr1x K562Usf1V0416101 Gm12878Usf2Iggmus Hepg2Usf2Iggrab K562Usf2Std Helas3Cebpb Shsy5yGata2Ucd Gm12878Zbtb33Pcr1x Hepg2Zbtb33Pcr1x K562Yy1V0416101 Helas3Cmyc Gm12878Zeb1sc25388V0416102 Hepg2Zbtb33V0416101 K562Yy1V0416102 Helas3Elk4 Gm12878Znf143166181apStd K562Zbtb33Pcr1x Helas3GabpPcr1x Gm12878Znf274Ucd K562Zbtb7asc34508V0416101 Helas3Irf3Iggrab Gm12891Pax5c20V0416101 
Helas3NrsfPcr1x Gm12891Pou2f2Pcr1x Helas3Stat3Iggrab Gm12891Pu1Pcr1x Helas3Usf2Iggmus Gm12891Yy1sc281V0416101 Helas3Zzz3Std Gm12892Pax5c20V0416101 Gm12892Yy1V0416101

File identifiers = <cell type><antibody><treatment>
Pre-processing:
• remove peaks wider than 4Kbp
• remove peaks with outlier score

Numbers of experiments per cell-type. [Row 11 combines related cell-lines (SK-N-SH & SK-N-MC).]

Cell type     No. Expts.  No. TFs.  Source
GMxxxxx       43          37        HapMap lymphoblastoid
k562          36          33        Chronic Myelogenous Leukemia
HEPG2         33          28        liver carcinoma
H1HESC        25          25        embryonic stem cells
HELAS3        10          10        cervical carcinoma
A549          6           2         alveolar basal epithelium
T47D          5           5         mammary ductal carcinoma
MCF10         4           1         mammary epithelium
ECC1          4           3         endometrial adenocarcinoma
SK-N-SH/MC    4           3         neuroblastoma
JURKAT        4           4         T-cell Leukemia
HUVEC         3           3         umbilical vein endothelium
VCAP          3           3         prostate cancer
HEK293        2           2         embryonic kidney
NB4           2           2         Acute Promyelocytic Leukemia
HTB11         1           1         neuroblastoma
MCF7          1           1         invasive breast ductal carcinoma
NT2D1         1           1         embryonal testicular carcinoma
U87           1           1         glioblastoma
SHSY5Y        1           1         neuroblastoma
PANC1         1           1         pancreatic carcinoma
PBDE          1           1         peripheral blood derived erythroblasts
PFSK1         1           1         cerebral brain tumor
HKC           1           1         primary keratinocytes
PC3           1           1         pancreatic carcinomas
HTC116        1           1         colorectal carcinoma
LN229         1           1         glioblastoma
HL60          1           1         promyelocytic leukemia
CADO-ES1      1           1         Ewing's Sarcoma

51 experiments in 24 cell types

Observed ChIP-seq peak overlaps are unlikely by chance.
13 experiments used to test if the observed predictability is due to datasets for TFs that bind similar motifs.

TF/antibody           cell type    lab
Cebpb                 Hepg2        Sydh
Cfos*                 Huvec        Sydh
Cmyc                  Nb4          Sydh
Foxa1sc6553V0416101   Hepg2        Haib
Gata2                 K562E        Uchicago
Junb*                 K562E        Uchicago
Maffm8194             Hepg2        Sydh
Mafkab50322           Hepg2        Sydh
P300V0416102          Sknshra      Haib
Stat3Etoh01           Mcf10aes     Sydh
Tal1sc12984           K562         Sydh
Znf143166181ap        Gm12878      Sydh
Znf263                Trexhek293   Sydh

* JunB has 10-fold lower binding affinity for cJun DNA binding sites.
“JunB differs from c-Jun in its DNA-binding and dimerization domains, and represses c-Jun by formation of inactive heterodimers.”
T Deng and M Karin, Genes & Dev. 1993. 7: 479-490.

Observed ChIP-seq peak overlaps are unlikely by chance.
30 files have >30K peaks, 97 files have <10K peaks.
[Figure: Frequency vs. Number of peaks in experiment, with a zoomed view; the >30K-peak and <10K-peak groups are marked]

Observed ChIP-seq peak overlaps are unlikely by chance.
[Figure panels: <10K-Peaks from >30K-Peaks; >30K-Peaks from <10K-Peaks]

Datasets used to test cell-type specificity of ChIP-seq peak overlaps.
(A) For within-cell-type predictability tests
(B) For across-cell-type predictability tests
Column headings: unique experiments in HEPG2; unique experiments in K562; experiments unique to HEPG2; experiments unique to K562
Atf3 -- V0416101; Bcl3 -- Pcr1x; Atf3 -- V0416101; Bcl3 -- Pcr1x; Bhlhe40 -- V0416101; E2f6 -- sc22823V0416102; Bhlhe40 -- V0416101; Cebpb -- Iggrab; Cebpb -- Iggrab; Egr1 -- V0416101; Cebpb -- Iggrab; Cmyc; Cjun -- Iggrab; Elf1 -- sc631V0416102; Cjun -- Iggrab; E2f6 -- sc22823V0416102; Cmyc; Fos; Cmyc; NR4a1; Elf1 -- sc631V0416101; Egr1 -- V0416101; Elf1 -- sc631V0416101; Ets1 -- V0416101; Fosl2 -- V0416101; Junb; Fosl2 -- V0416101; Gata2 -- sc267Pcr1x; Foxa1 -- sc101058V0416101; Elf1 -- sc631V0416102; Foxa1 -- sc101058V0416101; Irf1 -- Ifng6hStd; Gabp -- Pcr2x; NR4a1; Gabp -- Pcr2x; Mafk -- ab50322Iggrab; Hey1 -- V0416101; Ets1 -- V0416101; Hey1 -- V0416101; Max -- V0416102; Hnf4 -- asc8987V0416101; Fosl1 -- sc183V0416101; Hnf4 -- asc8987V0416101; Nrf1 -- Iggrab; Irf3 -- Iggrab; Gabp -- V0416101; Irf3 -- Iggrab; Nrsf -- V0416102; Jund -- Iggrab; Gata2 -- sc267Pcr1x; Jund -- Iggrab; Pu1 -- Pcr1x; Maff -- m8194Iggrab; Hey1 -- Pcr1x; Maff -- m8194Iggrab; Six5 -- Pcr1x; Mafk -- ab50322Iggrab; Irf1 -- Ifng6hStd; Nrf1 -- Iggrab; Mafk -- ab50322Iggrab; P300 -- V0416101
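
The pre-processing steps listed earlier (remove peaks wider than 4Kbp; remove peaks with outlier score) could be sketched roughly as below. This is a minimal illustration, not the authors' pipeline: the slides do not say how an "outlier score" was defined, so the median-absolute-deviation rule, the `n_mads` cutoff, the `Peak` record, and all function names here are assumptions made for the example. Only the 4Kbp width limit comes from the slides.

```python
from statistics import median
from typing import List, NamedTuple


class Peak(NamedTuple):
    """Hypothetical peak record: 0-based start, exclusive end, caller score."""
    chrom: str
    start: int
    end: int
    score: float


MAX_WIDTH_BP = 4000  # "remove peaks wider than 4Kbp" (from the slides)


def drop_wide_peaks(peaks: List[Peak], max_width: int = MAX_WIDTH_BP) -> List[Peak]:
    """Keep only peaks whose width does not exceed max_width."""
    return [p for p in peaks if (p.end - p.start) <= max_width]


def drop_outlier_scores(peaks: List[Peak], n_mads: float = 5.0) -> List[Peak]:
    """Drop peaks whose score lies more than n_mads MADs from the median.

    ASSUMPTION: the slides only say "outlier score"; this MAD rule and the
    default cutoff are placeholders for whatever rule was actually used.
    """
    if not peaks:
        return []
    scores = [p.score for p in peaks]
    med = median(scores)
    mad = median(abs(s - med) for s in scores)
    if mad == 0:
        # All scores are (nearly) identical; nothing can be called an outlier.
        return list(peaks)
    return [p for p in peaks if abs(p.score - med) <= n_mads * mad]


def preprocess(peaks: List[Peak],
               max_width: int = MAX_WIDTH_BP,
               n_mads: float = 5.0) -> List[Peak]:
    """Apply the width filter first, then the score-outlier filter."""
    return drop_outlier_scores(drop_wide_peaks(peaks, max_width), n_mads)


example_peaks = [
    Peak("chr1", 0, 200, 10.0),
    Peak("chr1", 1000, 6000, 12.0),   # 5000 bp wide: removed by width filter
    Peak("chr2", 0, 300, 11.0),
    Peak("chr2", 500, 800, 9.0),
    Peak("chr3", 0, 100, 1000.0),     # extreme score: removed as outlier
]
kept = preprocess(example_peaks)
```

Applying the width filter before the score filter means that very wide peaks do not influence the median and MAD used to flag outliers; whether the original analysis ordered the steps this way is not stated in the slides.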