Informatics

Session 1 DATABASES, DATA MINING, VISUALIZATION AND CURATION WEDNESDAY 10/30/2013, 7:30 PM M. Fiume / C. Bult

# lname Title Talk Length
1 Fiume MedSavant—Graphical search engine for genetic variants 15
2 Colak ELASPIC—Combining ensemble and structure-based modeling to accurately predict effects on affinity and stability of mutations 15
3 Coraor Galaxy's long-term sustainability—Deployment to the XSEDE system 15
4 Nip jhive—Visual network comparison with differential hive plots 15
Bult No abstract 15
5 Nutter Progress towards a turnkey system for high-throughput variant discovery and interpretation 15
6 Ouellette Data availability and re-usability in the transition from microarray to next-generation sequencing—Can we do better? 15
7 Smith The Genome Modeling System—An analysis engine for next generation genome sequencing 15

KEYNOTE SPEAKER: Michael Snyder THURSDAY 10/31/2013, 9:00 AM

Session 2 TRANSCRIPTOMICS, ALTERNATIVE SPLICING AND GENE PREDICTIONS THURSDAY 10/31/2013, 9:45 AM M. Stanke / C. Burge

# lname Title Talk Length
8 Stanke Simultaneous gene finding in aligned 15
9 Ainsley Genome wide characterization of ribosome-bound mRNA from activated dendrites 15
10 Kannan Optimal and fundamental limits for de novo transcriptome assembly 15
11 Mortazavi A computational pipeline for the validation of canonical RNA editing using ICE-seq 15
Burge No abstract 15
12 Neretti Genome-wide transcriptional landscape of repetitive elements in humans 15
13 Patro Sailfish—RNA-seq expression estimates need not take longer than a cup of coffee 15
14 Sibbesen Probabilistic transcriptome assembly 15

Session 3 POSTER SESSION I THURSDAY 10/31/2013, 2:00 PM

# lname Title Talk Length
15 Abolude Human Microbiome Project data analysis and coordination center resources for user-friendly, automated metagenomic analysis
16 Adhikari Genomic analysis of atoxigenic isolates of Aspergillus flavus
17 Aganezov Varying-resolution synteny blocks construction for large-scale phylogenomics
18 Aitken Kinetic signatures—A novel approach to time series expression data analysis
19 Akatsuka Informatics for analyzing distribution of oxidative DNA damages across the entire genome in mammalian cells
20 Alameer Modeling complex genetic and environmental influences on ALS and FTD
21 Argimon In silico genome subtraction to simplify the discovery of accessory sequences in microbial genomes
22 Ballouz Characterizing RNA-seq through the meta-analysis of co-expression networks
23 Barrell Ensembl Gene Annotation 2013
24 Beier Assembling barley chromosome 3H by multiplexed Illumina sequencing
25 Benoukraf Visualizing the DNA/DNA interactome using ChromoLens
26 Berthelot Benchmarking ChIP-seq pipelines for non-model species data
27 Blankenberg Wrangling Galaxy’s reference data
28 Bouvier Improving reproducibility using automated testing frameworks
29 Boyle The real effect of SNPs on transcription factor binding
30 Canzar Cutterhead—Recovering low-expressed transcripts from RNA-seq
31 Chan RNA-seq analysis workflow comparison on Ion Torrent data
32 Coffman DGIdb—Mining the druggable genome
33 Colak Novel machine learning approach identifies 30-40% of alternative splicing isoforms as novel functional
34 Criscione RepEnrich—A new method to estimate repetitive element enrichment reveals age-associated changes in retrotransposon expression
35 Daugherty The IGS analysis engine
36 Davila Analysis and characterization of immunoglobulin light chain SMRT™ sequencing data in normal and amyloid samples
37 De Pons Search, visualization, comparison, and annotation of next-gen rat strain sequence at the Rat Genome Database
38 DeBoever Transcriptome sequencing reveals aberrant 3’ splicing and alternative 3’ UTR usage in SF3B1-mutated cancers
39 Denas Deep modeling of gene expression regulation during erythropoiesis
40 Dobin Circular RNA detection and classification using RNA-seq data
41 Erickson ENCODE Project data access via REST API and JSON
42 Ferretti The new ICGC data portal and its underlying scalable software architecture
43 Fortini Optimization of PAR-CLIP and RNA-Seq analysis to give insights into the internal organization and function of a nuclear long noncoding RNA:Protein complex
44 Fourrage Identification and visualization of alternative splicing events in Uveal Melanoma RNA-seq samples using EASANA
45 Frankish Identifying functional ‘nonsense’ across the human genome

46 Friedberg Critical assessment of function annotations 2—Lessons learned and the road ahead
47 Fu RNA-Seq based transcriptome assembly, profiling, and polymorphism identification of two alfalfa genotypes
48 Fu Conserved secondary structure prediction for RNA homologs with domain insertions—Dynalign II
49 Ghosh Evaluation of methods to analyze isoform expression from RNA-Seq data generated by the Ion Torrent *Proton platform
50 Ghosh Two-step alignment for optimal ION *Proton RNA-Seq analysis
51 Giardine Using workflows for consistent analysis of ChIP-seq and RNA-seq data
52 Goecks Understanding cancer genomes using Galaxy
53 Gonnella Harlekin—Effective and scalable homopolymer error correction for NGS data
54 Gonzalez Increasing the GENCODE mouse lncRNA gene repertoire using RNAseq data
55 Gout Transcriptome analysis reveals novel gene coding variants and fusion transcripts in infant acute lymphoblastic leukemia
56 Gupta GABox—A "white-box" genome annotation pipeline
57 Gurtowski An improved method for hybrid correction of long-read, low-identity sequencing data
58 Haberman Ziv Ileal transcriptome analysis in treatment naïve pediatric Crohn’s disease supports age-related immune maturation that is associated with induction and pathogenesis of disease in the ileum

59 Hajirasouliha A combinatorial approach for constructing ancestral history of tumors, using inferred mutational frequencies in deep sequenced genomes
60 Halpern Use of known variants within the iSAAC variant calling pipeline, with evaluation using the Platinum Genomes resource
61 Hansen BARD—Detection of copy number alterations in next-generation sequence from tumors and matched normal samples
62 Hayes Transitioning phytozome genome visualization to JBrowse
63 Herrero Genome assembly assessment based on single copy genes
64 Hong Draft genome sequence of three Pseudomonas sp. strain H1, H5-1 and H5-2, analysis revealing genes for caprolactam degradation
65 Hou Detecting pico-inversions using multi-species alignment
66 Imai New genome assembly using only PacBio continuous long reads for genomes larger than bacterial genomes
67 Isomoto Population based discovery of tobacco-smoking-related differential DNA methylation
68 Jacobsen Systematic analysis of microRNA target interactions across diverse cancer types
69 Jain SBRI—A sampling based quantitative index to evaluate ChIP-Seq reproducibility
70 Janga Dissecting the expression landscape of RNA-binding proteins implicated in human cancers
71 Janin Versatile cloud-based genomics with compressed text indexes

72 Janky Detecting master regulators and cis-regulatory interactions in human cancer related gene networks
73 Jex The draft Trichuris suis genome
74 Jiao Maize Pan-genome construction by short reads assembly
75 Jose Functional annotation of transcriptomes assembled de-novo from RNA-Seq data using random walk with return
76 Jung Bacterial community structure and yield of the red pepper, Capsicum annuum L., under different cropping systems via 454-pyrosequencing
77 Katayama Histogram Clustering approach for ChIP-seq data across multiple samples
78 Khalfan DNA subway—Genomics, DNA barcoding, and RNA-seq bringing cutting-edge biology into the classroom
79 Kim TopHat3—Faster and more sensitive spliced alignment
80 Kucukural Flexible pipeline generation platform for HPC systems
81 Kumari Emergence—Data-driven pipeline discovery interface integrating multiple platforms
82 Kyriazopoulou Integrating gene expression and sequence data with existing biological knowledge to model context-specific gene regulation
83 Lam Mapping of a non-canonical secondary structure, the G-quadruplex, in the mammalian genome
84 Layer Efficient and accurate DNA classification without sequence alignment
85 Layer LUMPY—A probabilistic framework for structural variant discovery
86 Lederman General purpose and customized random-permutations-based mappers
87 Lederman Using the long-range “independence” property of DNA for read mapping
88 Li Oshell—A comprehensive work environment for NGS analysis
89 Li A regression model for assessing factors that affect the reproducibility of high-throughput experiments
90 Liseron-Monfils The dynamic of regulatory network based on transcription factors and miRNAs during plant development and response to stresses
91 Liu Linear time de novo detection of transposable elements with sequence variation
92 Loraine Visualizing RNA-Seq data with Integrated Genome Browser 7.0
93 Marshall The future of the Samtools software package
94 Marshall Exome sequencing 1000 individuals with extreme bone density—Rare variant discovery and Validation
95 Martin Collared flycatcher genome annotation
96 Middleton NoFold—RNA structure characterization without folding or alignment
97 Minot Evaluating novel metagenomic classification algorithms for forensic microbial detection

98 Mirmomeni Increasing genome assembly quality using high performance computing
99 Monaco Gramene—A resource for comparative plant genomics
100 Moore The antecedents of higher-order chromatin—Insights from integrative modelling
101 Morrissey Improving ChIP-seq peak identification using subsampling
102 Moxon A combined computational and genetic approach to identify novel canonical and non-canonical miRNAs in zebrafish
103 Mukamel Genomic sequence determinants of cell-type specific DNA methylation

Session 4 SEQUENCING PIPELINES AND ASSEMBLY THURSDAY 10/31/2013, 7:30 PM I. Korf / A. Quinlan

# lname Title Talk Length
104 Quinlan Mining genetic variation with GEMINI 15
105 Chin String graph assembly for diploid genomes with long reads 15
106 Hara Identification and characterization of ultramicro inversions within local alignments between closely related species 15
107 Mascher Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ) 15
108 Miga Complete sequence representation across human centromeric regions 15
109 Miller Scalable assembly of native single-molecule reads 15
110 Narzisi SCALPEL—Micro-assembly approach to detect INDELs within exome-capture data 15
111 Oesper Analysis of complex genomic rearrangements using high-throughput DNA sequencing data 15

Session 5 COMPARATIVE AND EVOLUTIONARY GENOMICS FRIDAY 11/1/2013, 9:00 AM J. Ma / H. Roest Crollius

# lname Title Talk Length
112 Ma New algorithms to trace the evolution of non-coding genomic sequences 15
113 Alekseyev Reconstruction of ancestral genomes in presence of gene loss 15
114 Garber Evolutionary dynamics and tissue specificity of human long noncoding RNAs in six mammals 15
115 Herrero Ancestry of single base pair variants in the human genome 15
116 Roest Crollius Ancestral reconstructions provide insight in vertebrate molecular and structural genome evolution 15
117 Rozowsky Comparison of 3 metazoan transcriptomes 15
118 Wood Rapid phylogenetic sequence classification through repeated k-mer comparisons 15
119 Yan OrthoClust—An orthology-based network framework for expression clustering across multiple species 15

Session 6 EPIGENOMICS AND NON-CODING GENOME FRIDAY 11/1/2013, 1:30 PM R. Lister / A. Marques

# lname Title Talk Length
120 Lister Global epigenomic reconfiguration during mammalian brain development 15
121 Brody Tools for analysis and comparison of enhancer structure 15
122 Carmel Paleo-epigenetics—Reconstructing the DNA methylation maps of archaic hominins 15
123 Day Responsive histone modifications at regulatory elements activated upon VEGFA stimulation of human endothelial cells 15
124 Marques Chromatin status separates two equally populated yet distinct classes of intergenic long noncoding RNAs 15
125 Dror Widespread evidence for DNA shape dependent transcription-factor binding preferences 15
126 Hon Determinants for the localization of short antisense transcripts at 3'end of protein-coding genes in protozoan Entamoeba histolytica is encoded in the genome 15
127 Kasowski Extensive variation in chromatin states across humans 15

KEYNOTE SPEAKER: Lior Pachter FRIDAY 11/1/2013, 4:30 PM

Session 7 POSTER SESSION II FRIDAY 11/1/2013, 5:15 PM

# lname Title Talk Length
128 Naito GGRNA—A Google-like, ultrafast search engine for genes and transcripts
129 Nakaki CoLo—A novel algorithm to distinguish significant co-localizations through multiple ChIP-seq data comparison
130 Nakato Comparative ChIP-seq analysis of cohesin and CTCF for multiple cells
131 Narechania Finding clusters in the flock—A new method to explore incongruent gene partitions in phylogenomic datasets
132 Nekrutenko Finding missing analysis tools with Galaxy ToolShed—A case of mitochondrial RNA modification
133 Ning Cross_genome—Assembly Improvement using cross species synteny
134 Nissen Trio-based study of neuromuscular dysfunction using whole exome sequencing
135 Niu HOTSPOT—A novel computational tool for inferring functional importance of cancer mutations through 3D proximity analyses
136 Noutsos Atmosphere, iPlant Collaborative’s cloud computing for plant sciences
137 O'Connor SeqWare on the multicloud—Enabling distributed analysis in disparate environments
138 Olson Learning to tell the difference between genes and junk
139 Ono RefEx—Reference expression dataset for tissue transcriptome
140 Onuki Development of a miRNAs prediction pipeline for the wheat chromosome 6B genome sequence

141 O'Rawe Integrating multiple sequencing and informatics pipelines in the study of one large pedigree
142 Ouellette FGED—The Functional Genomics Data Society
143 Ouellette Streamlining development and maintenance of sequence analysis pipelines
144 Park De novo assembly and insecticidal toxin gene mining of entomopathogenic bacteria, Photorhabdus temperata M1021 genome
145 Pendleton Detection and resolution of tandem repeats from single molecule sequencing data
146 Philip Bioinformatics tutorials using Galaxy
147 Piper Wellington—A novel method for the accurate identification of digital genomic footprints from DNase-seq data
148 Prasad Genome sequence and initial analysis of malaria model Plasmodium coatneyi
149 Preece Plant Reactome—Metabolic and regulatory networks for plants
150 Price Benchmarking RNAseq algorithms to detect differential expression of splice forms
151 Raney Assembly data hubs support viewing any sequence on the UCSC Genome Browser
152 Rao Functional data-mining for protein structure families without folding or homology modeling
153 Ratsch PALMapper—Fast, accurate and variation-aware RNA-Seq alignments
154 Ream Towards a universal model of gene block evolution in bacteria and archaea—The case for proteobacteria
155 Rebolledo-Jaramillo Controlling for contamination in re-sequencing studies with a reproducible phylogenetic approach
156 Regier Automatic interpretation of non-protein coding somatic variants
157 Rensch Modelling of complex tissue ChIP-sequencing
158 Rockowitz Comparison of REST cistromes across human cell types reveals common and context-specific functions
159 Saha Composition of the maize endophytic microbiome is correlated with maize genotype
160 Sanchez-Vega Differential methylation in the ZNF154, CASP8 and VHL promoters is recurrent across a wide variety of cancer types
161 Sankoff Genome aliquoting for ancient polyploids grape, tomato and sacred lotus
162 Sauria High-resolution HiC analysis method reveals sub-TAD, functionally diverse chromatin modules

163 Sedlazeck Improving de novo genome and transcriptome assemblies using a consensus assembly and genetic linkage
164 Sese Statistical significance of combinatorial regulations
165 Shah Developing a framework for detection of low frequency somatic genetic alterations in targeted sequencing data
166 Shahid De novo annotation of Amborella small RNAs using ShortStack reveals unique prevalence of non-conserved 23-24-nt miRNAs

167 Shimamura Hierarchical Bayes model-based analysis of chromatin interaction maps
168 Shu PERTRAN—Predict transcripts from paired-end or single-end RNA-seq reads
169 Silverstein Detecting small plant peptides using SPADA (Small Peptide Alignment Discovery Application)
170 Sompallae Clinical validation of targeted cancer gene mutations using next-generation sequencing
171 Song Comparative analysis of Saccharomyces cerevisiae—Integrating functional annotation with genetic variation
172 Song CLASS—A program for accurate reconstruction of genes and alternative splicing variations from RNA-seq data
173 Stadler Systematic exploration of coding sequence determinants of heterologous protein yields in a mammalian expression system
174 Standage mRNAmarkup—Quality control and annotation of de novo transcriptome assemblies
175 Syed Developing a data management system for NGS clinical diagnostics
176 Taghavi Distilled single-cell genome sequencing and de novo assembly for sparse microbial communities—Comparison of different search techniques
177 Tang Visualizing consequences of genetic variation in biological networks
178 Thiru Mining expression compendia to identify differences in metabolism between cancer and normal tissues
179 Trinh Genome-wide linkage analysis and rare variant association methods to identify LRRK2 p.G2019S age of onset modifiers
180 Verleyen Characterizing algorithmic and data dependencies in the computational analysis of gene function

181 Wang Novel stress-induced smRNAs from Brachypodium distachyon
182 Wang The core regulons orchestrating the response of Clostridium acetobutylicum to butanol and butyrate stress
183 Wei Exploring rice phylogenomics with GRAMENE
184 Westholm Identification of thousands of circular RNAs in three Drosophila species
185 Whitley Effects of sequencing coverage bias on NIPT aneuploidy detection
186 Wong A metadata standard to improve access to biological samples and diverse experimental data sets
187 Wong Allele-specific contributions to transcription factor binding in mouse
188 Wong Whole genome miRNA-mRNA-DNA integrative analysis in family trios
189 Wu Hybrid strategies for aligning high-throughput sequencing reads
190 Xuan Reveal significant association of genomic loci from protein-protein interactions
191 Yeung Whole transcriptome alternative splicing pipeline for analysis of subtype-specific splicing signature of prostate cancer
192 Zaïd Coverage rate of ADME genes from commercial genotyping and sequencing assays

193 Zerbino Updating the Ensembl regulatory features
194 Zhang Classify breast cancer patients based on somatic mutations and gene interaction networks
195 Zhang Competition between Pol II and positioned nucleosomes for access to core promoters during Drosophila development
196 Zhao TMM_RPKM for within- and between-sample RNA-Seq data normalization
197 Zhao A hybrid method of incorporating functional consequences into set-based association test to discover rare non-synonymous variants in complex disease
198 Zhu A framework to generate ortholog annotation for cross-species comparative analysis
199 Zhu Comprehensive depiction of genomic interactions in cancer based on TCGA data
200 Zmasek Analysis of the evolution of the repertoire of RNA recognition proteins hints at ancestrally complex RNA regulation

Session 8 POPULATION AND PERSONAL GENOMICS SATURDAY 11/2/2013, 9:00 AM D. MacArthur / Y. Erlich

# lname Title Talk Length
201 MacArthur Integrated analysis of protein-coding variation in over 50,000 individuals 15
202 Gerstein Integrative annotation of variants from 1,092 humans—Application to cancer genomics 15
203 Gymrek Short tandem repeat polymorphisms create an abundant source of expression variability 15
204 Hide Community development of validated variant calling pipelines 15
205 Erlich Genetic media 15
206 Lau Discovering more variants in a human genome—New algorithms applied to longer and less biased reads 15
207 Lehmann Causes and consequences of cancer transcriptome variability 15
208 Ronen Whole genome sequencing uncovers the genetics of chronic mountain sickness in Andean highlanders 15
