Identification of Bacillus subtilis RNA Genes Using Tiling Arrays
Cyprien Guérin

To cite this version: Cyprien Guérin, BaSysBio. Identification of Bacillus subtilis RNA genes using Tiling Arrays. Bioinformatique des ARN, Feb 2012, Toulouse, France. hal-02804688
HAL Id: hal-02804688
https://hal.inrae.fr/hal-02804688
Submitted on 5 Jun 2020

Cyprien GUÉRIN, BaSysBio Consortium

Summary
High-resolution transcriptome
Analysis of Tiling Array signals
Example of new features discovered with tiling arrays (TA)
Promoter and terminator predictions
Perspectives

High-resolution transcriptome: Systematic exploration of the B. subtilis transcriptional landscape
Discovery of new genes/features in Bacillus subtilis.
Explore most of the bacterium's lifestyles:
1 wild-type strain, perhaps better called a prototype strain.
1 array design (BaSysBio tiling array, NimbleGen technology): strand-specific expression signal with a 22-bp step.
269 hybridizations sampling a maximum variety of lifestyles: 104 different biological conditions, most with 2-3 biological replicates (experiments).
Growth on various media (rich/poor, solid/liquid, aerobic/anaerobic), sporulation, germination, competence, a variety of stresses (including ethanol, salt, temperature, oxidative), etc.
High-resolution transcriptome: Tiling array
[Figure: tiling scheme, labelled 22 bp.]
≈ 380,000 probes tiling the 4.2 Mbp Bacillus subtilis genome.
Long probes (45-65 nt), with lengths adjusted to achieve relatively homogeneous affinity.

Analysis of Tiling Array signals: Principles
Automatic detection of Transcription Units with an HMM [1], taking into account:
normalization (with genomic DNA hybridizations):
1. probes are not isothermal,
2. response is not linear,
3. outliers are discarded.
continuous variation of the signal.
[1] Nicolas P., et al. (2009). Bioinformatics.

Analysis of Tiling Array signals: Normalisation using chromosomal DNA
[Figure panels: log(genomic DNA) from ×4 pooled data; log(mRNA); log(mRNA) − log(genomic DNA).]
Probe affinity is variable, despite the adjustment of probe lengths.

Analysis of Tiling Array signals: Shift and drift
[Figure: signal level (axis 6 to 16) against position on chromosome (bp), 1,100,000 to 1,110,000; labels: CDSs, moves.]

Example of new features discovered with TA: RNA genes
[Figure: region 1,228,001 to 1,238,000; genes mecA, yjbF, yjbG, yjbL, yjbM, yjbN, yjbH, yjbI, yjbJ, yjbK, yjbO, yjbE; values shown: 2.946, 2.845.]

Example of new features discovered with TA: Coding sequences
[Figure: region 1,070,001 to 1,080,000; genes yhaI, ecsA, ecsB, ecsC, prsA, yhaK, hpr, yhaH, yhaG, serC, hit, yhaA, yhaJ. Tracks: sequence features annotation; transcriptome, forward strand (Log(2) ratio Fwd, 5.667); transcriptome, backward strand (Log(2) ratio Bwd, 2.409).]

Example of new features discovered with TA: Antisense related to stress
[Figure: region 3,567,001 to 3,577,000; genes yvcN, crh, yvcL, yvcK, yvcJ, yvcI, trxB, yvcE, yvcD; values shown: 3.008, 3.393, 7.624, 2.639, 6.296, 3.111, 2.848, 4.37.]

Example of new features discovered with TA: A few numbers
In B. subtilis annotation v3: 4,256 CDSs, 5 RNA genes, 30 rRNAs, 86 tRNAs, 57 (-1) 5' cis-acting regions.
New features discovered with TA: 44 new CDSs, 136 new RNA genes, 423 antisense signals (including 4 CDSs and 87 RNA genes), 92 5' cis-acting regions (confirmed for 56), 676 long 5'UTR regions and 125 long 3'UTR regions.
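As a rough illustration of the genomic-DNA normalisation slide above, the per-probe quantity log(mRNA) − log(genomic DNA) can be sketched in a few lines. This is only a minimal sketch: the function name `log2_ratio`, the use of base 2 and the intensity values are assumptions made for illustration, and the plain subtraction does not reproduce the HMM-based procedure of Nicolas et al. (2009), which also handles the non-linear response and outliers.

```python
import math


def log2_ratio(mrna, gdna):
    """Return log2(mRNA) - log2(gDNA) for each probe.

    Probes with a non-positive intensity in either channel cannot be
    log-transformed and are reported as None.
    """
    ratios = []
    for m, g in zip(mrna, gdna):
        if m > 0 and g > 0:
            ratios.append(math.log2(m) - math.log2(g))
        else:
            ratios.append(None)
    return ratios


# Hypothetical raw intensities for four probes (not taken from the slides).
mrna_signal = [1024.0, 4096.0, 512.0, 0.0]
gdna_signal = [256.0, 1024.0, 512.0, 300.0]

print(log2_ratio(mrna_signal, gdna_signal))
```

Dividing by the genomic DNA signal (a subtraction on the log scale) is what compensates for probe-to-probe affinity differences, since every probe sees the same copy number in a genomic DNA hybridization.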
Example of new features discovered with TA: Combining gene expression with ChIP/chip (CcpN)
[Figure: region 2,962,001 to 2,972,000; genes ytbD, ytbE, dnaI, dnaB, ytcG, speD, gapB, ytcD, ytaG, ytaF, mutM. Tracks: sequence features annotation; transcriptome, glucose to malate (forward and backward strands); transcriptome, malate to glucose (forward and backward strands); CcpN DNA binding (ChIP/chip).]

Example of new features discovered with TA: Combining RNA gene expression with ChIP/chip (CcpN)
[Figure: region 1,528,001 to 1,538,000; genes SR1, pdhA, pdhB, pdhC, pdhD, yktA, ykzI, yktC, ykzC, slp, speA, yktB.]

Promoter and terminator predictions: From upshifts to promoters
Estimation of TSS positions using TA, compared to RNA-Seq data [1].
[Figure: histogram of the distance between upshifts and TSSs (x axis −100 to 100; y axis: frequency, 0 to 150).]
[1] Irnov I., et al. (2010). Nucleic Acids Res.

Promoter and terminator predictions: From upshifts to promoters
Summarizing correlations between promoter activities.
[Figure: cluster dendrogram (height 0.0 to 0.5).]
A 'promoter tree' is built by hierarchical clustering using average linkage on the dissimilarity matrix $d_{i,j} = (1 - r_{i,j})/2 \in [0, 1]$, where $r_{i,j}$ is the correlation between the activities of promoters i and j.

Promoter and terminator predictions: From upshifts to promoters
[Figure: promoter sequence model; labels: TSS, −35 box, spacer, −10 box, background, PWM2, PWM1, l2, S, l1, D.]
Promoter prediction: unsupervised algorithm for modeling bipartite degenerate motifs [1], clustering sequences from the 3,242 transcription upshifts.
[1] Nicolas P., et al. (2012). Science.
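The 'promoter tree' dissimilarity defined above, d = (1 − r)/2 followed by average-linkage clustering, can be sketched as follows. This is a minimal pure-Python sketch under stated assumptions: Pearson correlation is assumed for r (the slides only say "correlation"), the promoter names and activity profiles are invented, and the naive agglomeration loop stands in for whatever clustering software was actually used.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def dissimilarity(x, y):
    """d = (1 - r) / 2: 0 for perfectly correlated, 1 for anti-correlated profiles."""
    return (1.0 - pearson(x, y)) / 2.0


def average_linkage(profiles):
    """Naive average-linkage agglomeration.

    Returns the merges in order as (cluster_a, cluster_b, height), where the
    height is the mean pairwise dissimilarity between the two clusters.
    """
    names = list(profiles)
    d = {(a, b): dissimilarity(profiles[a], profiles[b])
         for a in names for b in names if a != b}

    def cluster_distance(c1, c2):
        return sum(d[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        height, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical promoter activities across four conditions.
activities = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],   # co-regulated with P1
    "P3": [4.0, 3.0, 2.0, 1.0],   # anti-correlated with P1
}

for left, right, height in average_linkage(activities):
    print(left, right, round(height, 3))
```

Halving 1 − r keeps the dissimilarity in [0, 1], so the dendrogram heights can be read directly: promoters merged near 0 are tightly co-regulated, and merges near 0.5 join uncorrelated groups.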
Promoter and terminator predictions: From upshifts to promoters
Behavior of the MCMC algorithm, with K = 20 motifs.

Promoter and terminator predictions: From upshifts to promoters
Comparison with known Sigma factor binding sites.
DBTBS: a database of transcriptional regulation in Bacillus subtilis.
DBTBS M19 M14 M4 M3 M7 M5 M16 M8 M11 M13 M17 M9 M1 M15 M10 - M2 M18 M20 M6 M12 -
401 369 349 213 218 170 170 134 127 113 80 43 63 72 48 44 16 11 12 4 5
SigA 59 90 49 1 33 1 22 0 1 0 19 0 1 0 1 1 0 0 0 7 0
SigB 0000000044000000000000
SigD 0000100000100023000000
SigE 0015404010000100000000
SigF 0008000101000010000000
SigG 0000000420000000000000
SigH 0001001100011200000000
SigI 000000000000000010000
SigK 1001038000000010000000
SigL 000000000000000006000
SigM 000000000001000000000
SigW 0010000000033000000000
SigX 000000000002000000000
SigY 000000000002000000000
Sequence logos to represent motifs.

Promoter and terminator predictions: From upshifts to promoters
Predicted promoters:
758 promoters in DBTBS,
2,935 predicted promoters using the algorithm above,
580 promoters in common,
2,355 new promoters discovered.
46% of genes with multiple promoters.

Promoter and terminator predictions: Terminators and downshifts
Terminator predictions:
3,510 putative sites from a genome scan with the Petrin software [1],
identification of 2,126 high-confidence downshift sites,
1,501 putative terminators confirmed by downshifts (about 70% of downshifts).
Three types of termination: sharp, partial, missed.
[1] d'Aubenton-Carafa Y., et al. (1990). J. Mol. Biol.
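The promoter and terminator counts quoted above are related by simple arithmetic, which the short check below makes explicit. The numbers are the ones on the slides; the variable names and the derived fractions (other than the roughly 70% of downshifts) are added here for illustration only.

```python
# Promoter counts from the slides.
dbtbs_promoters = 758        # promoters listed in DBTBS
predicted_promoters = 2935   # promoters predicted from upshifts
common_promoters = 580       # predicted promoters also present in DBTBS

new_promoters = predicted_promoters - common_promoters
dbtbs_recovered = common_promoters / dbtbs_promoters

# Terminator counts from the slides.
petrin_sites = 3510          # putative terminators from the genome scan
downshift_sites = 2126       # high-confidence downshifts
confirmed_terminators = 1501 # putative terminators matching a downshift

downshifts_explained = confirmed_terminators / downshift_sites
petrin_confirmed = confirmed_terminators / petrin_sites

print("new promoters:", new_promoters)
print("fraction of DBTBS promoters recovered:", round(dbtbs_recovered, 3))
print("fraction of downshifts with a predicted terminator:", round(downshifts_explained, 3))
print("fraction of predicted terminators confirmed:", round(petrin_confirmed, 3))
```

The subtraction reproduces the 2,355 new promoters, and 1,501 out of 2,126 is consistent with the "about 70% of downshifts" stated on the terminator slide.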
Promoter and terminator predictions: A few examples
[Figure, panels a and b: regions 1,070,001 to 1,078,000 and 1,230,001 to 1,234,000; genes yhaL, coiA, yhaI, yhzF, pepF, yhzE, yhaJ, yhaH, serC, ecsA, prsA, hinT, scoC, trpP, yizD; feature labels prefixed U, D and S (e.g. U930.B, D627, S349).]

Promoter and terminator predictions: A few more examples
[Figure, panels a, b and c: regions 2,839,001 to 2,843,000, 1,297,001 to 1,301,000 and 694,001 to 698,000; genes yrzT, ndh, yebD, yrzF, yrbD, yjlB, uxaC, pbuG, yrbE, yrzH, yjlA, rex, yebC; label: Purine; feature labels prefixed U, D and S (e.g. U2147.A1, D1421, S451).]

Perspectives
Huge set of expression data (104 conditions) on the gene repertoire of B. subtilis: functional annotation (CDSs, RNA genes, etc.).
Antisense and transcription accuracy in bacteria: biological function, bias with alternative promoters, majority of signals with missed termination, promoters for antisense less conserved than promoters for CDSs.

Thank you