Modeling Genomic Regulatory Networks with Big Data

Total Page:16

File Type:pdf, Size:1020Kb

Modeling Genomic Regulatory Networks with Big Data TIGS-1108; No. of Pages 10 Review Modeling genomic regulatory networks with big data Hamid Bolouri Division of Human Biology, Fred Hutchinson Cancer Research Center (FHCRC), 1100 Fairview Avenue North, PO Box 19024, Seattle, WA 98109, USA High-throughput sequencing, large-scale data generation throughput sequencing-based technologies and to compu- projects, and web-based cloud computing are changing tational modeling and analysis. At the same time, GRNs how computational biology is performed, who performs are both complex (i.e., can exhibit hard-to-predict/nonline- it, and what biological insights it can deliver. I review here ar behaviors) and complicated (i.e., they are composed of the latest developments in available data, methods, and large numbers of component parts and interactions). For software, focusing on the modeling and analysis of the this reason, mathematical and computational approaches gene regulatory interactions in cells. Three key findings are essential in GRN research. are: (i) although sophisticated computational resources Cellular behaviors have traditionally been character- are increasingly available to bench biologists, tailored ized as being mediated through highly distinct processes ongoing education is necessary to avoid the erroneous (e.g., DNA replication) and pathways (e.g., the canonical use of these resources. (ii) Current models of the regula- WNT signaling pathway). However, because of widespread tion of gene expression are far too simplistic and need interactions among cellular processes and pathways, the updating. (iii) Integrative computational analysis of large- use of unbiased, genome-wide technologies is essential to scale datasets is becoming a fundamental component of the discovery and characterization of GRNs. molecular biology. I discuss current and near-term oppor- In addition to the bedrock of ‘classical’ cis-regulatory tunities and challenges related to these three points. analysis, GRN modeling today is buttressed by four cor- nerstones: (i) high-throughput technologies, (ii) integrative Gene regulatory networks (GRNs) analysis of complementary data types, (iii) leveraging The past few years have witnessed dramatic milestones in large-scale public datasets, (iv) computational modeling high-throughput sequencing, large-scale data generation, and analysis. This article reviews recent developments and cloud computing, and computational biology. Supra-expo- discusses their implications for future research. nential improvements in the throughput and cost of DNA To maintain coherence and brevity, this review will sequencing (http://www.genome.gov/sequencingcosts/) have focus on developments in human GRN modeling and anal- been accompanied by improvements in accuracy and reduc- ysis. Diverse new GRN modeling opportunities are also tions in the required sample size. These improvements have opening up in both well-studied and less-studied organ- in turn led to the widespread adoption of a broad range of isms. These and the complex GRNs underlying interac- sequencing-based technologies (reviewed in [1]) to charac- tions between hosts and commensal or pathogenic terize not only genomes but also the regulatory interactions organisms are beyond the scope of the present review. that allow genomes to specify cellular structure, function, and behavior. Types and uses of human GRN modeling GRNs are defined as the set of interactions among genes A model is any representation of a system that can facili- and their products (RNAs and proteins) that determine the tate its analysis, communication, or documentation [3]. isoforms, location (cell type), timing, and rate of RNA Modeling is at the heart of GRN research at multiple expression [2] (see Figure 1 for examples). With the possi- levels. At the most basic level, statistical models are at ble exception of some metabolic and physiological process- the heart of all high-throughput data analysis. For exam- es, GRNs are the primary drivers of cellular behavior and ple, statistical models are commonly used to characterize function. DNA fragment length distribution as a first step towards Because GRNs are ultimately specified by the digital the identification of transcription factor (TF) binding peaks code of DNA, they are uniquely accessible to both high- in ChIP-seq (chromatin immunoprecipitation followed by high-throughput DNA sequencing) data. Given filtered data, methods such as network inference Corresponding author: Bolouri, H. ([email protected]). Keywords: gene regulatory networks; modeling; network biology; big data; computa- [4], guilt-by-association (e.g., through network or expres- tional biology; bioinformatics; systems biology. sion clustering; see Figures 2 and 3), and enrichment/over- 0168-9525/$ – see front matter representation analysis (e.g., to identify the impacted ß 2014 Published by Elsevier Ltd. http://dx.doi.org/10.1016/j.tig.2014.02.005 pathways or processes [5]) are used to organize genes and their products into broad-brush conceptual models. These models can then be refined and extended by inte- grating multiple data types each highlighting a different Trends in Genetics xx (2014) 1–10 1 TIGS-1108; No. of Pages 10 Review Trends in Genetics xxx xxxx, Vol. xxx, No. x (A) SourceId TargetId Source Sign Target Evidence Weak early effect? Stronger indirect effect? (Franco et al. 2006 shows effect on 32 171 Notch/CSL REPRESSES C/EBPa purified Thy-1+ cells; not seen in crude fracon in Taghon et al 2007) DN2, DN3 Stage negave 265 100 Regulator REPRESSES GATA-2 Tydell et al. 2007 J. Immunol. 264 163 DN3 Specific regulator PROMOTES Eva1 Tydell et al. 2007 J. Immunol. DN2, DN3 Stage negave 265 161 Regulator REPRESSES c-Kit Taghon et al. 2006 Immunity; Yui & Rothenberg 2004 J. Immunol. 264 165DN3 Specific regulator PROMOTES Notch1 Taghon et al. 2006 Immunity Key: 192 82 Notch-mod. GATA-3 REPRESSES TCF-1 Taghon et al 2007 Nat Immunol Expression: Wang, D.et al.: Anderson, J Immunol. 2006 Jul 1;177(1):109-19; Franco et al 2006 32 117 Notch/CSL PROMOTES HEBAlt PNAS. High Medium Low 32 160 Notch/CS L PROMOTES CD25 Taghon et al 2005 ; I. Maillardet al.: Pear, J Exp Med. 2006 Oct 2;203(10):2239-45. Taghon et al 2005 Genes Dev; Franco et al. 2006 PNAS; Taghon et al 2007 Nat 32 89 Notch/CSL PROMOTES Deltex1 Immunol; many earlier references 32 116 Notch/CSL PROMOTES Ets2 Taghon et al 2007 Nat Immunol; Franco et al 2006 PNAS GATA-3 GATA-3 GATA-3 GATA-3 32 34 Notch/CSL PROMOTES HES-1 Taghon et al 2005 Genes Dev; many other references Notch-mod. Notch-mod. Notch-mod. Notch-mod. 32 149 GATA-3 Notch/CSL PROMOTEGATA-3S LEF-1 Taghon et al 2007 Nat ImmunolGATA-3 GATA-3 32 165 Notch/CSL PROMOTES Notch1 Taghon et al 2007 Nat Immunol 32 166TCF-1 Notch/CSBcl11BL PROMOTE TCF-1S Notch3Bcl11B Taghon et al 2007 Nat ImmunoTCF-1 l Bcl11B TCF-1 Bcl11B Reizis, B. & Leder Genes Dev. 2002 Feb 1;16(3):295-300; Franco et al. 2006 PNAS; 32 0 Notch/CSL PROMOTES PTa Taghon et al 2007 Nat Immunol α α HES-1 Nrarp PTα HES-1 Nrarp PTα HES-1 Nrarp PT HES-1 Nrarp PT Taghon et al 2007 Nat Immunol; Franco et al 2006 PNAS; cf. also Nakagawa, 32 153 Notch/CSL PROMOTES Runx1 M.et al.: Chiba, Blood. 2006 Nov 15;108(10):3329-34. Notch3 Notch3 Notch3 Notch3 Deltex1 Notch1 CD25 Deltex1 Notch1 CD25 Deltex1 Notch1 CD25 Deltex1 Notch1 CD25 Pou6f1 LEF-1 SATB1 Pou6f1 LEF-1 SATB1 Pou6f1 LEF-1 SATB1 Pou6f1 LEF-1 SATB1 DN1 Bcl2 Erg DN2 Bcl2 Erg DN3 Bcl2 Erg DN4 Bcl2 Erg (B) TRENDS in Genetics Figure 1. Examples of gene regulatory network analysis, documentation, and visualization. (A) Part of BioTapestry visualization of a proposed early T cell specification gene regulatory network (GRN) (adapted from: http://www.its.caltech.edu/tcellgrn/Oldnetwork.html). Each gene (symbol with a bent arrow) is represented as having a regulatory region (horizontal line) and a transcriptional output (arrow). A transcription factor (TF)–DNA binding interaction is depicted as an arrow incident on the regulatory region of a gene. Protein–protein interactions are depicted by circles with incident and output arrows. The background color of each gene indicates the fold-change in expression of the gene at a particular developmental stage. Snapshots of the network over four developmental stages are shown [double negative (DN) 1 to 4]. In the interactive viewer, clicking on a gene brings up a table showing the experimental data supporting the indicated regulatory interactions. (B) Cytoscape visualization of potential T cell specification gene regulatory interactions derived from ChIP-seq and gene expression data. Arrows represent regulatory interactions. Node colors and sizes represent gene expression levels at early and late developmental stages. The inset shows a zoomed-in view of the lower portion of the network. Using Cytoscape utilities, the user can quickly and easily identify a set of genes coregulated by Sfpi1 and Lyl1 (edge arrows highlighted in gold). This example network was derived during a 1.5 h introductory laboratory session by novice computational biology students (see http://www.bu.edu/computationalimmunology/summer-school/ for details). 2 TIGS-1108; No. of Pages 10 Review Trends in Genetics xxx xxxx, Vol. xxx, No. x (A) (B) PC2 Expression in condion 2 Expression in condion 1 PC1 (C) (D) PC2 PC2 PC1 PC1 TRENDS in Genetics Figure 2. An example of how choices in data processing impact the findings. (A) Scatter plot of example simulated data showing the expression of 200 genes in two conditions. This simulated dataset was intentionally generated to have two distinct characteristics. First, genes near the lower-right (upper-left) corner of the plot are high (low) in condition 1 and low (high) in condition 2. In addition, there are two well-separated groups of genes distinguished by having generally higher (lower) expression in both conditions. (B) The same data plotted in the space of the principal components (PC) (in this case, because there were only two conditions, there are only two PCs).
Recommended publications
  • Java Jdk1.4 Software Free Download
    Java jdk1.4 software free download click here to download Downloading and installating Java Development Kit JDK for SVD Downloading and installing JDK for SVD applet and application. The JDK includes tools useful for developing and testing programs written in the links on the top of this page to learn more about it and register for one for free. For current Java releases, please consult the Oracle Software Download page. For information on how to download the JDK for use on Solaris 9 and Solaris 10 with Sun software please see SunSolve Document - Infodoc ref: The Java Development Kit is a collection of tools that developers use to Free Sun Microsystems Windows XP//Vista/Server /7. The Java Development Kit contains the software and tools needed to Free Sun Microsystems Windows XP//Vista/Server /7 0 stars Be the first to review this product. All Versions. out of 1 votes. 5 star. 1; 4 star. Java Development Kit, free and safe download. Java Development Kit latest version: Compile and run Java apps on bit systems. You can check if you are running bit Windows before downloading bit Java for Users should download bit Java software, if they are using bit browser on Go to www.doorway.ru; Click on the Free Java Download button, and start the. Download the latest version of the top software, games, programs and apps Download Java for Windows XP - Best Software & Apps. Filter by: Free . License: buy Download; Platform: Windows. Execute your JavaScript with this free tool. 4. JavaExe, free and safe download. JavaExe latest version: Launch your Java application as a Windows file (if you know how).
    [Show full text]
  • Cytoscape: the Network Visualization Tool for Genomespace Workflows
    F1000Research 2014, 3:151 Last updated: 21 JUL 2021 SOFTWARE TOOL ARTICLE Cytoscape: the network visualization tool for GenomeSpace workflows [version 2; peer review: 3 approved] Barry Demchak1, Tim Hull1, Michael Reich2, Ted Liefeld2, Michael Smoot3, Trey Ideker1, Jill P. Mesirov2 1Department of Medicine, University of California, San Diego, La Jolla, CA, 92093-0688, USA 2Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA , 02142, USA 3Qualcomm, Inc, San Diego, CA , 92121, USA v2 First published: 01 Jul 2014, 3:151 Open Peer Review https://doi.org/10.12688/f1000research.4492.1 Latest published: 26 Aug 2014, 3:151 https://doi.org/10.12688/f1000research.4492.2 Reviewer Status Invited Reviewers Abstract Modern genomic analysis often requires workflows incorporating 1 2 3 multiple best-of-breed tools. GenomeSpace is a web-based visual workbench that combines a selection of these tools with mechanisms version 2 that create data flows between them. One such tool is Cytoscape 3, a (revision) popular application that enables analysis and visualization of graph- 26 Aug 2014 oriented genomic networks. As Cytoscape runs on the desktop, and not in a web browser, integrating it into GenomeSpace required version 1 special care in creating a seamless user experience and enabling 01 Jul 2014 report report report appropriate data flows. In this paper, we present the design and operation of the Cytoscape GenomeSpace app, which accomplishes 1. Aris Floratos, Columbia University, New this integration, thereby providing critical analysis and visualization functionality for GenomeSpace users. It has been downloaded over York, NY, USA 850 times since the release of its first version in September, 2013.
    [Show full text]
  • Cytoscape: the Network Visualization Tool for Genomespace Workflows
    UC San Diego UC San Diego Previously Published Works Title Cytoscape: the network visualization tool for GenomeSpace workflows. Permalink https://escholarship.org/uc/item/1qr8t9n8 Authors Demchak, Barry Hull, Tim Reich, Michael et al. Publication Date 2014 DOI 10.12688/f1000research.4492.2 Peer reviewed eScholarship.org Powered by the California Digital Library University of California F1000Research 2014, 3:151 Last updated: 15 SEP 2014 SOFTWARE TOOL Cytoscape: the network visualization tool for GenomeSpace workflows [v2; ref status: indexed, http://f1000r.es/47f] Barry Demchak1, Tim Hull1, Michael Reich2, Ted Liefeld2, Michael Smoot3, 1 2 Trey Ideker , Jill P. Mesirov 1Department of Medicine, University of California, San Diego, La Jolla, 92093-0688, USA 2Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, 02142, USA 3Qualcomm, Inc, San Diego, 92121, USA v2 First published: 01 Jul 2014, 3:151 (doi: 10.12688/f1000research.4492.1) Open Peer Review Latest published: 26 Aug 2014, 3:151 (doi: 10.12688/f1000research.4492.2) Invited Referee Responses Abstract Modern genomic analysis often requires workflows incorporating multiple 1 2 3 best-of-breed tools. GenomeSpace is a web-based visual workbench that combines a selection of these tools with mechanisms that create data flows version 1 between them. One such tool is Cytoscape 3, a popular application that published report report report 01 Jul 2014 enables analysis and visualization of graph-oriented genomic networks. As 1 1 1 Cytoscape runs on the desktop, and not in a web browser, integrating it into GenomeSpace required special care in creating a seamless user experience and enabling appropriate data flows.
    [Show full text]
  • Data Visualization by Nils Gehlenborg
    Data Visualization Nils Gehlenborg ([email protected]) Center for Biomedical Informatics / Harvard Medical School Cancer Program / Broad Institute of MIT and Harvard ISMB/ECCB 2011 http://www.biovis.net Flyers at ISCB booth! Data Visualization / ISMB/ECCB 2011 / Nils Gehlenborg A good sketch is better than a long speech. Napoleon Bonaparte Data Visualization / ISMB/ECCB 2011 / Nils Gehlenborg Minard 1869 Napoleon’s March on Moscow Data Visualization / ISMB/ECCB 2011 / Nils Gehlenborg 4 I believe when I see it. Unknown Data Visualization / ISMB/ECCB 2011 / Nils Gehlenborg Anscombe 1973, The American Statistician Anscombe’s Quartet mean(X) = 9, var(X) = 11, mean(Y) = 7.5, var(Y) = 4.12, cor(X,Y) = 0.816, linear regression line Y = 3 + 0.5*X Data Visualization / ISMB/ECCB 2011 / Nils Gehlenborg 6 Anscombe 1973, The American Statistician Anscombe’s Quartet Data Visualization / ISMB/ECCB 2011 / Nils Gehlenborg 7 Exploration: Hypothesis Generation trends gaps outliers clusters - A large data set is given and the goal is to learn something about it. - Visualization is employed to perform pattern detection using the human visual system. - The goal is to generate hypotheses that can be tested with statistical methods or follow-up experiments. Data Visualization / ISMB/ECCB 2011 / Nils Gehlenborg 8 Visualization Use Cases Presentation Confirmation Exploration Data Visualization / ISMB/ECCB 2011 / Nils Gehlenborg 9 Definition The use of computer-supported, interactive, visual representations of data to amplify cognition. Stu Card, Jock Mackinlay & Ben Shneiderman Computer-based visualization systems provide visual representations of datasets intended to help people carry out some task more effectively.effectively.
    [Show full text]
  • Integrative Genomic Analysis by Interoperation of Bioinformatics Tools in Genomespace
    Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation Qu, Kun et al. “Integrative Genomic Analysis by Interoperation of Bioinformatics Tools in GenomeSpace.” Nature Methods 13.3 (2016): 245–247. As Published http://dx.doi.org/10.1038/nmeth.3732 Publisher Nature Publishing Group Version Author's final manuscript Citable link http://hdl.handle.net/1721.1/105733 Terms of Use Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. HHS Public Access Author manuscript Author ManuscriptAuthor Manuscript Author Nat Methods Manuscript Author . Author manuscript; Manuscript Author available in PMC 2016 July 18. Published in final edited form as: Nat Methods. 2016 March ; 13(3): 245–247. doi:10.1038/nmeth.3732. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace Kun Qu#1, Sara Garamszegi#2, Felix Wu#2, Helga Thorvaldsdottir2, Ted Liefeld2,3, Marco Ocana2,3, Diego Borges-Rivera4, Nathalie Pochet2,5, James T. Robinson2,3, Barry Demchak3, Tim Hull3, Gil Ben-Artzi6,7, Daniel Blankenberg8, Galt P. Barber9, Brian T. Lee9, Robert M. Kuhn9, Anton Nekrutenko8, Eran Segal6, Trey Ideker3, Michael Reich2,3, Aviv Regev2,4,10, Howard Y. Chang1,11, and Jill P. Mesirov2,3 1Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA 2The
    [Show full text]
  • Cytoscape: the Network Visualization Tool for Genomespace Workflows[V2
    F1000Research 2014, 3:151 Last updated: 15 SEP 2014 SOFTWARE TOOL Cytoscape: the network visualization tool for GenomeSpace workflows [v2; ref status: indexed, http://f1000r.es/47f] Barry Demchak1, Tim Hull1, Michael Reich2, Ted Liefeld2, Michael Smoot3, 1 2 Trey Ideker , Jill P. Mesirov 1Department of Medicine, University of California, San Diego, La Jolla, 92093-0688, USA 2Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, 02142, USA 3Qualcomm, Inc, San Diego, 92121, USA v2 First published: 01 Jul 2014, 3:151 (doi: 10.12688/f1000research.4492.1) Open Peer Review Latest published: 26 Aug 2014, 3:151 (doi: 10.12688/f1000research.4492.2) Invited Referee Responses Abstract Modern genomic analysis often requires workflows incorporating multiple 1 2 3 best-of-breed tools. GenomeSpace is a web-based visual workbench that combines a selection of these tools with mechanisms that create data flows version 1 between them. One such tool is Cytoscape 3, a popular application that published report report report 01 Jul 2014 enables analysis and visualization of graph-oriented genomic networks. As 1 1 1 Cytoscape runs on the desktop, and not in a web browser, integrating it into GenomeSpace required special care in creating a seamless user experience and enabling appropriate data flows. In this paper, we present the design and version 2 operation of the Cytoscape GenomeSpace app, which accomplishes this published 26 Aug 2014 integration, thereby providing critical analysis and visualization functionality for GenomeSpace
    [Show full text]
  • Nathalie Pochet, Diego Borges-Rivera, Thorin Tabor, Helga Thorvaldsdóttir, Aviv Regev, Jill P
    : An Environment for Frictionless Bioinformatics Michael Reich, Ted Liefeld, Marco Ocana, Dongkeun Jang, Jon Bistline, James Robinson, Peter Carr, Barbara Hill www.genomespace.org Nathalie Pochet, Diego Borges-Rivera, Thorin Tabor, Helga Thorvaldsdóttir, Aviv Regev, Jill P. Mesirov Features Interface Recipes Using Galaxy with GenomeSpace GenomeSpace Enabled Tools Customize Organize and GenomeSpace makes it easy for researchers to use the Easily manage your display create groups of A collection of "recipes" provides quick guides to Galaxy users can send data easily between Galaxy and your files and One-click according to users (e.g. your Get help with Galaxy (Penn State University) tools they already know to perform analyses and to find directories and launching of the types of project team) the analysis accomplishing tasks using the GenomeSpace tools: their GenomeSpace cloud storage: Scientific workflow and data and analysis preview files analysis tools analyses you and add your tools and platform providing a large number of other tools that can help them extend their research into resident in the on your wish to local tools to GenomeSpace Importing Data from GenomeSpace to Galaxy sequence and genome analysis tools. cloud. datasets. perform. GenomeSpace. itself. new areas. Registration is free and includes 20GB of 1 2 3 4 5 cloud based storage. Preprocess and quality check RNA-Seq data (UCSD) 4 Cytoscape GenomeSpace features include: 6 Visualize and analyze molecular interaction Manage your Identify and annotate coding variants from Data is account igv networks and biological pathways whole exome sequencing (WES) data 2 Register or log in to your imported Seamless transfer of data between tools information.
    [Show full text]
  • Visualizing Genome and Systems Biology: Technologies, Tools, Implementation Techniques and Trends, Past, Present and Future Georgios A
    Pavlopoulos et al. GigaScience (2015) 4:38 DOI 10.1186/s13742-015-0077-2 REVIEW Open Access Visualizing genome and systems biology: technologies, tools, implementation techniques and trends, past, present and future Georgios A. Pavlopoulos1*, Dimitris Malliarakis2, Nikolas Papanikolaou1, Theodosis Theodosiou1, Anton J. Enright3 and Ioannis Iliopoulos1* Abstract “Α picture is worth a thousand words.” This widely used adage sums up in a few words the notion that a successful visual representation of a concept should enable easy and rapid absorption of large amounts of information. Although, in general, the notion of capturing complex ideas using images is very appealing, would 1000 words be enough to describe the unknown in a research field such as the life sciences? Life sciences is one of the biggest generators of enormous datasets, mainly as a result of recent and rapid technological advances; their complexity can make these datasets incomprehensible without effective visualization methods. Here we discuss the past, present and future of genomic and systems biology visualization. We briefly comment on many visualization and analysis tools and the purposes that they serve. We focus on the latest libraries and programming languages that enable more effective, efficient and faster approaches for visualizing biological concepts, and also comment on the future human-computer interaction trends that would enable for enhancing visualization further. Keywords: Biological data visualization, Network biology, Genomics, Systems biology, Multivariate analysis Review for de novo assembly of a new genome to acceptable Introduction coverage is still high, probably at least $40,000, we can In the current ‘big data’ era [1], the magnitude of data now resequence a human genome for $1000 and can explosion in life science research is undeniable.
    [Show full text]
  • Integrative Genomic Analysis by Interoperation Of
    BRIEF COMMUNICATIONS format converters, relieving scientists of the burden of identi- Integrative genomic fying and scripting the conversions. The GenomeSpace Recipe Resource is a growing set of high-utility use cases that demon- analysis by interoperation strate how to leverage multiple tools and serve as quick guides to analysis tasks. The website serves as a knowledge base, newsstand of bioinformatics tools and point of contact and help for the community of users and tool developers. in GenomeSpace Initially seeded by a consortium of biology research labs and development teams of six popular bioinformatics tools Kun Qu1,12, Sara Garamszegi2,12, Felix Wu2,12, (Cytoscape1,2, Galaxy3, GenePattern4, Genomica5, the Integrative Helga Thorvaldsdottir2, Ted Liefeld2,3, Marco Ocana2,3, Genomics Viewer (IGV)6 and the UCSC Table Browser7), Diego Borges-Rivera4, Nathalie Pochet2,5, GenomeSpace now connects 20 tools and data resources. Our James T Robinson2,3, Barry Demchak3, Tim Hull3, consortium labs provided biological projects and analytical needs 6,7 8 9 to drive GenomeSpace design and development. For example, we Gil Ben-Artzi , Daniel Blankenberg , Galt P Barber , 8,9 9 9 8 recapitulated the steps and results of published analyses within Brian T Lee , Robert M Kuhn , Anton Nekrutenko , GenomeSpace (Supplementary Figs. 1 and 2), dissecting and 6 3 2,3 2,4,10 Eran Segal , Trey Ideker , Michael Reich , Aviv Regev , visualizing the gene regulatory networks in human cancer stem Howard Y Chang1,11 & Jill P Mesirov2,3 cells (Supplementary Note 1, Supplementary Figs. 2–5). The study required diverse data types, analytical steps and methods Complex biomedical analyses require the use of multiple and multiple data transfers between tools.
    [Show full text]
  • Cytoscape3 6 0Manual/Pdf
    Cytoscape User Manual Release 3.6.0 The Cytoscape Consortium Nov 16, 2017 Contents 1 Introduction 3 2 Launching Cytoscape 5 2.1 System Requirements..........................................5 2.2 Getting Started..............................................5 3 Command Line Arguments 9 4 Quick Tour of Cytoscape 11 4.1 Starter Panel............................................... 11 4.2 Basic Features.............................................. 12 4.3 The Menus................................................ 14 4.4 Network Management.......................................... 21 4.5 The View Navigator........................................... 24 5 Creating Networks 25 5.1 Import Fixed-Format Network Files................................... 25 5.2 Import Networks from Unformatted Table Files............................. 26 5.3 Import Networks from Public Databases................................ 31 5.4 Create a New Network or Edit One Manually.............................. 37 6 Nested Networks 43 6.1 Creating Nested Networks........................................ 43 6.2 Visualization of Nested Networks.................................... 43 7 Supported Network File Formats 45 7.1 SIF Format................................................ 45 7.2 NNF................................................... 47 7.3 GML Format............................................... 51 7.4 XGMML Format............................................. 51 7.5 SBML (Systems Biology Markup Language) Format.......................... 52 7.6 BioPAX (Biological PAthways
    [Show full text]
  • Genomespace Workshop, FGED 15, Seattle
    Outline • Introduction to GenomeSpace • GenomeSpace Tools and Recipes • GenomeSpace User Interface • Integrative analysis exercise • Other GenomeSpace Tools • GenomeSpace development • Q and A The vision: Integrave Translaonal Genomics GenePattern Cytoscape IGV/UCSC Genomica CMAP iii 1 Differentially Expressed Genes Compendium Load compendium Arrests Show module G2/M 3 map Show i 2 GSEA test network enrichment Network 5 Show Extract 4 module Expand +1 Chromosome Idea (include ii atcgcgtttattcgataagg! neighbors) atcgcgttttttcgataagg! 6 Add ! Transcription Learn p53 Factor track site/score on Alterations Added to GenePattern from UCSC promoter Pathway iv activation v 8 7 Test for Looks close similarity of p53 and gene Expression Conclusion to p53 site location vi Online community to share diverse computational tools Seed Tools Driving Biological Projects Cytoscape Galaxy lincRNAs GenePaern Genomica Cancer stem cells IGV Paent Straficaon UCSC Browser Outreach to new DBPs Outreach to new tools www.genomespace.org GenomeSpace: a connection layer between integrative analysis tools • Support for all types of resource: Web- based, desktop, etc. • Automac conversion of data formats between tools • Easy access to data from any locaon • Ease of entry into the environment GenomeSpace Components GenePattern Galaxy Cytoscape Integrative Genomics Viewer GenomeSpace-Enabled Tools Authentication and Authorization Analysis and Data Manager Tool Manager Genome Space GenomeSpace Server Project Data GenomeSpace Server Register www.genomespace.org Register
    [Show full text]
  • 11 April 2016
    Bioinformatics discovery of new disease drivers from genomic, proteomics, and metabolomics data Andrew P. Hodges, Ph.D. [email protected] SBP 3 May 2016 Viper-KR* • Viper-KR applied to human tissue (obesity / adipose tissue) and cell culture transcriptome. • Data from the Translational Research Institute (TRI) for Metabolism and Diabetes, SBP-Florida Hospital (PI Steven Smith) • The results are promising: multiple computationally predicted gene targets were validated experimentally (7 out of 12) Viper - Virtual Inference of Protein-activity by Enriched Regulon analysis (Mariano J. Alvarez, Federico Giorgi, Andrea Califano). KR – K-nearest neighbor and Random Forest. 5/3/2016 BIOINFORMATIC CORE TOOLS 2 Overview • Pt1 - Introduction/Background • Pt2 - Resources – Public and proprietary data sources – Selected software and underlying algorithms • Pt3 – Example Service: Master regulator/driver predictions – Example models from in vitro and clinical studies – New gene/protein drivers for follow-up confirmation – Integration against knowledge repositories/other resources Pt1 - Introduction Bioinformatics • Major driver in current cancer research • Oftentimes an afterthought • *Should be the driver/guide for your studies! • Quest for drivers/controllers of disease – Not cancer-specific… conceptually any disease – Molecular signatures guiding disease processes and drug responses – Limited by type(s) of data & available samples NCI projects (2015) Major question today: What are the drivers of pancreatic cancer progression in gemcitabine-treated
    [Show full text]