GenomeSpace: An Environment for Frictionless Bioinformatics

Michael Reich, Ted Liefeld, Marco Ocana, Dongkeun Jang, Jon Bistline, James Robinson, Peter Carr, Barbara Hill, Nathalie Pochet, Diego Borges-Rivera, Thorin Tabor, Helga Thorvaldsdóttir, Aviv Regev, Jill P. Mesirov

www.genomespace.org

Features

GenomeSpace makes it easy for researchers to use the tools they already know to perform analyses and to find other tools that can help them extend their research into new areas. Registration is free and includes 20GB of cloud-based storage.

GenomeSpace features include:

- Seamless transfer of data between tools. GenomeSpace automatically converts file formats, removing the need to write scripts and "glue" code.
- Easy import of data from public repositories. Users can transfer data directly from Web-based resources to their genomics tools without the need to download first.
- Connect your own cloud storage accounts. Add your own Dropbox, Amazon or (coming soon) Google Drive accounts easily.

GenomeSpace is an NIH-funded project.

Interface

- Easily manage your files and directories and preview files resident in the cloud.
- One-click launching of analysis tools on your datasets.
- Customize your display according to the types of analyses you wish to perform.
- Organize and create groups of users (e.g. your project team) and add your local tools to GenomeSpace.
- Get help with the analysis tools and GenomeSpace itself.
- Manage your account information.
- Context menus on files and directories just like desktop applications.
- One-click launching of analysis tools on the selected datasets.
- Organize and manage your cloud-based project folders just like files on your desktop.
- Share folders or files with individuals, the public or specific groups of users.

Recipes

A collection of "recipes" provides quick guides to accomplishing tasks using the GenomeSpace tools:

- Preprocess and quality check RNA-Seq data
- Identify and annotate coding variants from whole exome sequencing (WES) data
- Identify biological functions for genes in copy number variation (CNV) regions
- Find differentially expressed subnetworks
- Find differentially expressed genes in RNA-Seq data
- Identify and visualize expressed transcripts in RNA-Seq data
- Identify an up- or down-regulated pathway from expression data

Using Galaxy with GenomeSpace

Galaxy users can send data easily between Galaxy and their GenomeSpace cloud storage.

Importing Data from GenomeSpace to Galaxy

1. Import tool
2. Register or log in to your GenomeSpace account
3. Select the file(s) you want to send to Galaxy
4. Data is imported to Galaxy

Exporting Data from Galaxy to GenomeSpace

1. Export tool
2. Select the dataset to send and choose your GenomeSpace target directory
3. Data is sent to GenomeSpace

GenomeSpace Enabled Tools

- Galaxy (Penn State University): Scientific workflow and data and analysis platform providing a large number of sequence and genome analysis tools.
- Cytoscape (UCSD): Visualize and analyze molecular interaction networks and biological pathways.
- GenePattern (Broad Institute): Analysis and workflow platform with hundreds of genomics tools for gene expression, sequence variation, proteomics, flow cytometry, etc.
- Genomica (Weizmann Institute): Analysis and visualization tool for integration of gene expression, DNA sequence, and gene and experiment annotation information.
- IGV (Broad Institute): High-performance visualization tool for interactive exploration of large, integrated genomic datasets.
- UCSC Browser (UCSC): Convenient access to the underlying UCSC browser database, including the reference sequence and working draft assemblies for a large collection of genomes.
- Cistrome (Dana-Farber): ChIP-chip and ChIP-seq tools for peak calling and correlation, genome feature association, gene expression analyses, and motif discovery.
- Gitools (PRBB): Framework for analysis and visualization of genomic data using interactive heatmaps that also allows data to be imported from various sources.
- InSilico DB (University of Brussels): Powerful and intuitive interface to a large repository of gene expression datasets, including the contents of Gene Expression Omnibus (GEO).
- ArrayExpress (EMBL-EBI): Database of functional genomics (microarray and HTS) data.
- geWorkbench (Columbia University): Integrated suite of tools for the analysis and visualization of gene expression, sequence, protein structure, and systems biology.
- Synapse (Sage Bionetworks): Platform designed to facilitate collaboration within and among scientific teams, access to large-scale genomics data sets, and integration with analysis tools and programming environments.
- ISAcreator (U. of Oxford): Metadata tracking tools to manage life science, environmental, and biomedical experiments.
- MSigDB (The Broad Institute): Online tools for a large curated collection of annotated gene sets.
- Reactome (OICR) - Available soon: Reactome is a curated database of pathways and reactions (pathway steps) in human biology.

GenomeSpace Enabled Portals

- Project Achilles is a systematic effort aimed at identifying and cataloging genetic vulnerabilities across hundreds of genomically characterized cancer cell lines.
- The Multiple Myeloma Genomics Portal provides access to a reference set of multiple myeloma data as well as selected published multiple myeloma datasets.
- The Cancer Cell Line Encyclopedia provides public access to genomic data, analysis and visualization for approximately 1000 cancer cell lines.

How You Can Participate

We are seeking genomic researchers, bioinformatics tool developers, and data repository providers who are interested in joining and expanding the GenomeSpace community. See www.genomespace.org

GenomeSpace in Action: Finding transcription factor regulators of human hematopoiesis

This example GenomeSpace scenario reproduces a large part of the Differentiation Map analysis from the Regev lab paper in Cell, Novershtern et al, 2010. At each step, GenomeSpace performs all data conversions and transfers between tools.

Data transfers through GenomeSpace:

1. User saves the expression data from the GO transcription factors to GenomeSpace.
2. User performs differential expression using the expression data loaded from GenomeSpace.
3. User loads the lineage-specific transcription factors generated in GenePattern to Genomica through GenomeSpace.
4. User uploads bed annotation tracks to Galaxy and IGV through GenomeSpace.

Analysis steps:

1. Extract transcription factors (Genomica)
   a. Load expression data containing 200 samples and 8000 genes.
   b. Load a gene set containing Gene Ontology (GO) transcription factors.
   c. Save the expression data from only the GO transcription factors to the GenomeSpace Data Manager.
2. Compute differentially expressed transcription factors (GenePattern)
   a. Perform differential expression analysis to determine genes that significantly distinguish human embryonic stem cells (hESCs) versus differentiated cells.
3. Identify module networks (Genomica)
   a. Compute module networks to determine coexpressed "modules" of genes within the original expression dataset.
   b. Load the lineage-specific transcription factors generated by GenePattern.
   c. Use these two datasets to generate a list of potential regulators.
4. Compute overlaps (Galaxy)
   a. Upload annotation tracks for the genomic locations of the regulators, a set of previously published SNPs and a set of linkage regions from a genome-wide association study.
   b. Run an overlap analysis to determine the intersection of putative regulators, SNPs, and linkage regions.
5. Visualize data (IGV)
   a. Load annotation tracks for the 3 types of data in step 4 into IGV.
   b. View the concordance between the locations of the analytically identified potential regulators and the previously published SNPs and linkage regions.
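
The overlap analysis in the scenario (intersecting putative regulators, SNPs, and linkage regions) was run in Galaxy. As a rough illustration of the underlying idea only, and not of Galaxy's implementation, the following Python sketch keeps the regulators whose genomic interval overlaps at least one SNP and at least one linkage region. The `Interval` type, the function names, and all coordinates are hypothetical toy values, and intervals are assumed to follow the BED convention (0-based start, exclusive end).

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A named genomic interval; BED-style coordinates are assumed."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if a and b are on the same chromosome and share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def regulators_supported_by(
    regulators: List[Interval],
    snps: List[Interval],
    linkage_regions: List[Interval],
) -> List[Interval]:
    """Keep regulators that overlap at least one SNP and one linkage region."""
    return [
        r for r in regulators
        if any(overlaps(r, s) for s in snps)
        and any(overlaps(r, region) for region in linkage_regions)
    ]


# Toy data, invented for illustration; not taken from the study.
regulators = [
    Interval("chr1", 100, 200, "TF_A"),
    Interval("chr1", 500, 600, "TF_B"),
    Interval("chr2", 100, 200, "TF_C"),
]
snps = [
    Interval("chr1", 150, 151, "snp_1"),
    Interval("chr2", 150, 151, "snp_2"),
    Interval("chr1", 550, 551, "snp_3"),
]
linkage_regions = [
    Interval("chr1", 0, 300, "region_1"),
    Interval("chr2", 1000, 2000, "region_2"),
]

hits = regulators_supported_by(regulators, snps, linkage_regions)
print([r.name for r in hits])
```

The pairwise comparison is quadratic in the number of intervals, which is fine for a handful of tracks; for genome-scale inputs, sorted sweeps or interval trees would be the usual choice.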

IGV multi-locus view