Snyder Production Group Summary for ENCODE 3
Nathaniel Watson, Jessika Adrian, Tejaswini Mishra, Minyi Shi, Denis Salins, Trupti Kawli, Michael Snyder
Department of Genetics
September 22, 2016

About ENCODE

THE ENCYCLOPEDIA OF DNA ELEMENTS, or ENCODE, is an international project that has been funded by NHGRI since its pilot phase began in 2003, with the goal of identifying all functional elements in the human genome.

Four Phases of the ENCODE Project
[Timeline figure. Date markers: Sep. 2003, Sep. 2007, Sep. 2013, Feb. 2017. Phase labels: Pilot Phase I, Phase II, Phase III, Phase IV.]

THE CONSORTIUM is composed of multiple Data Production Centers (DPCs), a central Data Coordination Center (DCC), and a central Data Analysis Center (DAC).

Snyder DPC: Released Work

MAJOR RESEARCH GOALS: To identify transcription factor binding sites in the human genome and to functionally characterize regulatory elements.

EXPERIMENTS include ChIP-Seq, siRNA-Seq, ATAC-Seq, and ChIA-PET; see Fig. 1.

[Figure 1 chart: "Number of Released Experiments in Phase III". Categories: ChIP-Seq, ATAC-Seq, ChIA-PET, siRNA-Seq. Series: 2014, 2015, 2016.]
Figure 1. The number of experiments performed in the Snyder DPC in ENCODE 3 that have been released to the public by the DCC. Experiments are shown by type and year.

WE HAVE POSITIVELY CHARACTERIZED ANTIBODIES for 302 targets spanning 14 cell lines and 10 bodily tissues; see Fig. 2. Not shown in Fig. 2 are 731 negative antibody characterizations that have been submitted to the ENCODE Portal.

[Figure 2 chart: "Number of Targets With Positive Antibody Characterizations for each Cell or Tissue Type in ENCODE 3". Categories: K562, liver, A549, heart, HeLa, MCF-7, HepG2, Other, HeLa-S3, IMR-90, GM12878, HEK293T.]
Figure 2. The number of targets (genes) per cell/tissue type that were characterized in the Snyder Data Production Center in ENCODE 3.
Snyder DPC: Data Flow

• Experiments and Samples resources are stored in a LIMS built using Syapse.
• Samples undergo Illumina sequencing at the Stanford Sequencing Center for Genomics and Personalized Medicine.
• Sequencing results are automatically uploaded to DNAnexus, where ENCODE standard pipelines, such as the TF ChIP-Seq scoring pipeline, are run.
• DNAnexus sequencing and scoring metadata are automatically reported back into Syapse.
• The in-house Python API around our Syapse LIMS is used to submit experiments as JSON documents to the DCC for review.
• Passing experiments are released on the ENCODE Portal for free public access.

PRODUCTION CENTERS submit to the DCC. The ENCODE Portal at www.encodeproject.org provides public access to all released datasets.

Contacts

• Jessika Adrian, [email protected]; ENCODE Project Manager of the Snyder Data Production Center; Stanford School of Medicine
• Nathaniel Watson, [email protected]; Software Developer and ENCODE Data Submitter for the Snyder Data Production Center; Sequencing Center for Genomics and Personalized Medicine (SCGPM); Stanford School of Medicine

Acknowledgements

• NHGRI for funding this project
• Esther Chan, DCC Data Wrangler for the Snyder DPC
• Somalee Datta, Director of Bioinformatics at SCGPM
• Snyder Lab members Teri Slifer, Xinqiong Yang, Lixia Jiang, Jie Zhai, and Yining Li
• Kevin White and Alec Victorsen at UChicago
• Peggy Farnham and Heather Witt at USC
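
Illustrative sketch of the submission step in the Data Flow section: the poster states that experiments are submitted to the DCC as JSON documents through an in-house Python API, but it does not show that API. The minimal example below only illustrates what serializing one experiment record to a JSON document might look like. The class name, function name, field names, and values are all hypothetical; they are not the Snyder lab's API and not the ENCODE submission schema, and no network call is made.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ExperimentRecord:
    """Hypothetical LIMS record. Field names are illustrative only,
    not the ENCODE schema or the Syapse data model."""
    assay: str
    target: str
    biosample: str
    lab: str


def to_submission_json(record: ExperimentRecord) -> str:
    """Serialize one record as a JSON document with a stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


# Placeholder values, chosen only to make the sketch runnable.
example = ExperimentRecord(
    assay="ChIP-Seq",
    target="EXAMPLE_TARGET",
    biosample="K562",
    lab="example-lab",
)
payload = to_submission_json(example)
print(payload)
```

Sorting the keys keeps the serialized document deterministic, which makes it easier to compare a submitted document against what is stored in the LIMS.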