Standards and Ontologies for Functional Genomics 2
University of Pennsylvania, Philadelphia, PA, USA, 23–26 October 2004
Comparative and Functional Genomics
Comp Funct Genom 2004; 5: 618–622.
Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cfg.448
Copyright 2005 John Wiley & Sons, Ltd.

Conference Editorial

Midori A. Harris* and Helen Parkinson
European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

*Correspondence to: Midori A. Harris, EMBL-EBI, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. E-mail: [email protected]

Received: 26 November 2004
Accepted: 26 November 2004

Like the first conference, held in 2002, on Standards and Ontologies for Functional Genomics (www.sofg.org), the second SOFG meeting brought together computer scientists, ontologists and biomedical scientists in Philadelphia to examine the state of the art in ontology technology, development and emerging standards for biomedicine.

Briefly, an ontology is a means of formalizing knowledge about a subject; at a minimum, an ontology must include terms (concepts) relevant to a domain, definitions for the terms, and defined relationships between the terms. Ontologies may be represented in any of a number of formats and systems, with varying degrees of complexity, and may range in size from hundreds to hundreds of thousands of concepts. In biomedicine, ontologies support the organization and management of large amounts of data, permitting sophisticated searches and allowing database annotations to be standardized.

The standards aspect of the conference focused on content standards, i.e. what information should be captured about a biological concept or an experiment. This issue of 'what should be said' leads naturally to the consideration of how things can be said, thus forging connections between standards and ontologies. Format was touched upon in the context of representing and exchanging biological data, experimental metadata (data about data) and ontologies themselves.

SOFG2 examined the progress of the world of ontology and standards development in the two years since SOFG1; a number of significant changes soon became evident. Notably, the ontology community has adopted the W3C standard Web Ontology Language (OWL; Horrocks et al., 2003), the successor to DAML+OIL (Stevens et al., 2002), as its lingua franca. In parallel with the move to OWL, the past two years have brought increasingly sophisticated tools for building and applying ontologies to the community. Presentations and discussions at SOFG2, summarized briefly below, identified several more common issues and themes.

In her keynote presentation, Carole Goble used Shakespeare's Romeo & Juliet as an analogy to illustrate the sociological (we currently know of no romantic) interactions between ontology developers coming from differing perspectives. As described in detail in the accompanying review (Goble and Wroe, 2004), computer scientists and philosophers (the 'Montagues') emphasize formal structures, whereas domain experts such as biologists (the 'Capulets') have preferred less formal, more pragmatic approaches, despite the limitations inherent in informal systems. The Montague/Capulet analogy struck a chord with participants, as several speakers declared their allegiance; notably, many proudly claimed connections with both formalist and pragmatist camps. Although the relationship between the formal and pragmatic camps has historically been marked by tension and conflict, an important theme emerging from SOFG2 is that of growing mutual interest and respect, as computer scientists become attracted to the complex use cases that biology provides, and biologists come to appreciate the practical benefits of formal ontological approaches. The final outcome may thus be happier for bio-ontologists than for Shakespeare's lovers, thanks to the efforts of our 'Princes of Genomics' who straddle both communities, and perhaps also to the auspicious location of SOFG2 in the city of brotherly love.

Chris Wroe began the first session, on Ontological Systems: Theory and Development, with a presentation that followed logically from Goble's, giving an overview of both theoretical and practical aspects of ontology development. For example, an ontology may be represented by a simple structure such as a hierarchy or directed acyclic graph (DAG), or in a sophisticated formal structure such as a description logic (DL) system. The most suitable representation depends on the requirements, i.e. on the intended use of the ontology. An ontology may be converted from a simpler to a more formal representation as needs evolve, as illustrated by the example of the GONG project (Wroe et al., 2003), which represented a portion of the Gene Ontology (GO) in OWL.

Two presentations then described uses of OWL to adapt ontologies for particular purposes: Chintan O. Patel discussed motivations for, and challenges presented by, representing existing biomedical domain ontologies in OWL; Karim Nashar presented a proposal to integrate several ontologies, including the MGED core ontology (Stoeckert and Parkinson, 2003) and existing and proposed extensions, to represent experiment metadata. To complete the session and link with the following session, Michael Ashburner summarized the origins and aims of the Open Biology Ontologies (OBO; formerly GOBO) initiative (obo.sourceforge.net). Intended to extend the model of community-based development of publicly available ontologies begun with GO, OBO has grown to encompass over 40 ontologies, covering diverse biological domains. The availability of certain ontologies, such as those for anatomical terms or chemical substances, facilitates the creation and maintenance of combinatorial concepts. Many types of biological knowledge can only be adequately represented by such compound concepts; an example particularly relevant to SOFG is that of modelling phenotypes, which requires concepts from anatomy, experimental procedures (assays) and results, among others. GO and OBO have also given rise to the development of OBOL, a language for formalizing combinatorial terms in OBO ontologies.

The session on Ontologies for Biological Systems began with two perspectives on biological pathways. The Reactome database (www.reactome.org), presented by Peter D'Eustachio, models the entities and events that make up pathways. Minoru Kanehisa described the graph-based design of KEGG (Kanehisa et al., 2004). In light of comments made in several talks on the need for an ontology of chemicals, Marcus Ennis gave a very timely presentation on the nascent database of Chemical Entities of Biological Interest (ChEBI) (www.ebi.ac.uk/chebi). ChEBI covers chemical nomenclature, formulae, and structures, and is built upon a chemical ontology that organizes chemical compounds by both chemical characteristics and function, essentially providing a biologist's view of the chemical world. The last talk tied the session to the preceding one: Chris Catton used a description of the BioImage database (www.bioimage.org) and its ontology-based architecture to provide context for a discussion of challenges that ontology developers face, especially in a field such as biology, where knowledge changes rapidly and sometimes substantially.

The third session, Ontology Systems Development: Electronic Demonstrations, underscored the considerable progress that has been made since SOFG1 in developing tools to construct ontologies and applying them in biological contexts. Mark Musen and Phil Lord presented different aspects of Protégé (Noy et al., 2003): Musen gave an overview of the tool, noting recent developments such as support for multiple simultaneous users, and a growing array of plugins that lend support for OWL reasoning and ontology management. Lord then focused on the OWL plugin, which allows Protégé to read and write OWL ontologies, and provides an interface for constructing description logic statements. Users who have worked with DL statements in less user-friendly editors will welcome these advances. John Day-Richter reported on recent enhancements in DAG-Edit (godatabase.org/dev/index.html), which was originally created to edit GO and other DAG-structured ontologies, and now supports many features of more sophisticated representations. DAG-Edit now offers a powerful mechanism for searching, filtering and displaying ontology terms, and a wide array of plugins to support managing categories, parents and namespaces. Another plugin allows DAG-Edit to use OBOL; still others track change history and provide a graphical

FMA includes the subcellular anatomical structures also covered by the GO cellular component ontology; some inconsistencies between the GO cellular component ontology and the FMA have come to light, and should be resolved in the near future. The Anatomical Dictionary for the Adult Mouse (www.informatics.jax.org/searches/anatdict form.shtml), presented by Terry Hayamizu, has a somewhat simpler structure than the FMA, and is used by the mouse community to annotate their data. The anatomy talks continued a theme that was among the highlights of SOFG1, where discussions began that led to the creation of the SOFG Anatomy Entry List (Parkinson et al., 2004), a sim-