The Scope of Pathway Curation Data Community
Involving the Research Community in Biocuration of Genes and Pathways
Sushma Naithani
Department of Botany and Plant Pathology
Oregon State University
[email protected]
Plant and Animal Genome XXVII, San Diego, Jan 15, 2019

Biocuration of genes and pathways
Biocuration involves the creation of a user-friendly narrative of biological information, based on review, analysis and systematic organization of data, using manual and semi-automated methods.

Plant Reactome in the grand scheme of genomics science
Data generation
• Sequence data
• Proteomes
• Genotyping
• Phenotype
Handling
• Cyberinfrastructure
• Analysis
• Storage
• Annotation
• Metadata
Knowledgebase
• Mining
• Synthesis
• Visualization
Data impacts
• Hypothesis
• Translational research

An example of a curated rice reference pathway in the Plant Reactome
[Figure: Plant's response to biotic stimuli: Fungi and Bacteria]

Plant Reactome: an open resource for the data community

The scope of pathway curation

Biocuration is one of the bottlenecks in making genomic data FAIR
[Figure: PubMed search results for anthocyanin biosynthesis; number of publications per year, 1980 to 2018; annotation: Year 2018 - 499 items (Dec)]

Involving the community in biocuration
Finding a common framework to depict pathways
• Expertise
• Time
• Training

FAIR is a Fairy tale!
There is very little scientific information that meets FAIR standards.

Design of a workshop
• Tool Centric: Teach how to use biocuration tools
  Usually workshops are OK. But hardly anyone returns with a curated pathway!
• Knowledge Centric (process of curation): gathering data, evaluating evidence, synthesizing knowledge
  (Forget about tools)

Every fairy tale needs:
‘Magic’
Secret recipes (SOPs for gene and pathway curation)
+ Alliances
You can be a part of this ‘fairy tale’

Strategy and workflow of the pathway curation
Task 1: Selection of the articles

Task 2: Critical review of the literature
A core strength of Biologists
Evaluation of the data and synthesis of knowledge
– Yeast two-hybrid assay
– Co-immunoprecipitation
– Mutant and transgenic studies
– Quantitative trait locus (QTL) mapping
– Isotope-coded affinity tagging (ICAT)
– Predicted protein-protein interactions
– Expression clustering techniques
– Literature-mining for specified interactions
– Green fluorescent protein (GFP) tagging
[Image: a caricature by Kara (sponsored by DNA Link)]

Biocuration in Reactome Curator Tool

View of a Plant Reactome pathway

We do not excel using only Excel!
Spreadsheet columns: Standard Gene IDs • UniProt IDs • Subcellular location • Membrane association (TMM) • Reaction / Pathway • Summary with citation
But, it could be a stepping stone…

Task 3: Data collection
• genes
• functions
• cellular location
• associated reactions
• associated pathway(s)

Task 4: Connecting dots
• imagining reactions
• building pathways
Reactome Data Model

Community Pathway Curation Jamboree 2018 at Oregon State University

Curation outcome from community curators
• We curated data from 7 research articles
• Extracted a list of 300 genes
• Curated two pathways, gathered material for 3 pathways
• 1 opinion article (under review in DATABASE)
Naithani et al. Involving community in genes and pathway curation. Database (2019) Vol. 2019: article ID bay146; doi:10.1093/database/bay146

DO NOT DRAW in PowerPoint!
[Figure: rice gene network relating stresses (heat, drought, salinity, cold, submergence, biotic) and hormones (ABA, ethylene, GA) to genes such as OsDREB2, OsWOX5, OsMPK5, OsABF1 and OsHsfA7; key: induced, suppressed, positively regulated by a TF, hormone]

Learning outcome for the students
• Critical review of literature
• The value of consistency in gene nomenclature
• Integration of information from various sources
• Data organization
• How to build data-driven hypotheses
• Why ontologies are useful
• Genomic resources are not perfect and are a work in progress.

Other option for drawing pathways & gene networks
http://www.wikipathways.org/

Drawing gene-gene networks using PathVisio

Online collaborators for gene and pathway curation
• Prof. Ashwani Pareek’s Group (JNU, Delhi, India)
• Dr. Snehlata Pareek’s Group (ICGEB Delhi, India)
• Dr. Bijiyalakshmi Mohanty, University of Singapore

Conclusions
If incorporated in the graduate school curriculum, biocuration training could be beneficial to students and simultaneously increase the community’s contribution to biocuration of public databases.
The investment by various stakeholders (academia, industry, educators, scientific societies, and publishers) in engaging and training the broader research community in biocuration will provide a sustainable and quality solution for keeping pace with the Big Data explosion.

Acknowledgements
Oregon State University
• Pankaj Jaiswal (Co-PI)
• Justin Preece (Software Dev)
• Sushma Naithani (Curation & outreach)
• Parul Gupta (Curation)
• Justin Elser (Software Dev)
• Priyanka Garg (Curation)
Cold Spring Harbor Laboratory
• Doreen Ware (PI)
• Andrew Olson (Search integration)
• Marcela K. Tello-Ruiz (Project Coordinator)
European Bioinformatics Institute
• Antonio Fabregat Mundo (Reactome Dev)
• Irene Papatheodorou (ATLAS)
• Alfonso Muñoz-Pomer Fuentes
• IntAct
NYU Langone Medical Center
• Peter D’Eustachio (Curation mentor)
Ontario Institute for Cancer Research
• Lincoln Stein (Reactome PI)
• Robin Haw
• Joel Weiser
• Guanming Wu
Source data providers & collaborators
• Araport
• BAR
• TreeBase
• SoyBase
• MaizeGDB
• PeanutBase
• Phytozome
• Legume Information System
• Planteome
• WikiPathways
Funding: Gramene - Exploring Function through Comparative Genomics and Network Analysis (NSF IOS 1127112)

Thank you!
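
As a supplement to Task 3, the data-collection fields named on the slides (gene ID, UniProt ID, function, cellular location, associated reaction, associated pathways, summary citation) could be captured in a small structured record instead of a free-form spreadsheet. The sketch below is only an illustration: the class name `CurationRecord`, the field names, the helper `records_to_tsv`, and the rule that every field is required are assumptions made here, not part of the Plant Reactome tooling or its SOPs. The example values are placeholders, not curated data.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List

# Column order for the tab-separated export (hypothetical names chosen here).
COLUMNS = ["gene_id", "uniprot_id", "function", "subcellular_location",
           "reaction", "pathways", "citation"]


@dataclass
class CurationRecord:
    """One row of Task 3 data collection (illustrative, not an official schema)."""
    gene_id: str
    uniprot_id: str
    function: str
    subcellular_location: str
    reaction: str
    pathways: List[str] = field(default_factory=list)
    citation: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of fields left empty (assumes all are required)."""
        missing = []
        for name in COLUMNS:
            value = getattr(self, name)
            if name == "pathways":
                if not value:
                    missing.append(name)
            elif not value.strip():
                missing.append(name)
        return missing


def records_to_tsv(records: List[CurationRecord]) -> str:
    """Serialize records as tab-separated text, joining pathways with ';'."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for rec in records:
        writer.writerow([rec.gene_id, rec.uniprot_id, rec.function,
                         rec.subcellular_location, rec.reaction,
                         ";".join(rec.pathways), rec.citation])
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder values only; replace with entries taken from the literature.
    example = CurationRecord(
        gene_id="EXAMPLE_GENE_1",
        uniprot_id="",  # left empty on purpose to show the check
        function="placeholder function",
        subcellular_location="placeholder location",
        reaction="placeholder reaction",
        pathways=["placeholder pathway"],
        citation="placeholder citation",
    )
    print(example.missing_fields())
    print(records_to_tsv([example]))
```

A record that reports no missing fields is a reasonable candidate for the next step (Task 4), where reactions and pathways are assembled; which fields are truly mandatory would depend on the curation SOP in use.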