Introduction to Ontologies for Environmental Biology
Barry Smith
http://ontology.buffalo.edu/smith

Finnegans Web
concept, type, class, instance, model, representation, data, process, property

Disciplines here involved
• GIS
• Ecology
• Environmental biology
• Various -omics disciplines
• Bioinformatics
• Medical Informatics
• Database science
• Semantic webists
• ...

Part 1: What is an Ontology?
what cellular component?
what molecular function?
what biological process?

natural language labels designed for use in annotations, to make the data cognitively accessible to human beings and algorithmically tractable to computers

compare: legends for maps

common legends allow (cross-border) integration

ontologies are legends for data

compare: legends for diagrams
Ramirez et al., “Linking of Digital Images to Phylogenetic Data Matrices Using a Morphological Ontology”, Syst. Biol. 56(2):283–294, 2007

computationally tractable legends
• help integrate complex representations of reality
• help human beings find things in complex representations of reality
• help computers reason with complex representations of reality

ontologies are used to annotate data
but there are two kinds of annotations:
• names of types
• names of instances
A basic distinction

type vs. instance
science text vs. diary
human being vs. Michael Ashburner

Catalog vs. inventory

A 515287 DC3300 Dust Collector Fan
B 521683 Gilmer Belt
C 521682 Motor Drive Belt

[Figure: ontology (types) vs. instances]

An ontology is a collection of standardized names for types

We learn about types in reality from looking at the results of scientific experiments captured in the form of scientific theories

Ontologies provide the terminological scaffolding of scientific theories

experiments relate to what is particular
science describes what is general

[Figure: hierarchy of types (thing, organism, animal, cat, siamese, frog) with instances beneath]
types vs. their extensions

type
{a,b,c,...} class of instances = a collection of particulars

Extension =def
The extension of a type A is the class of instances of A
(the class of all entities to which the term ‘A’ applies)
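
The definition of extension can be illustrated with a minimal Python sketch. The instance names and the `INSTANCE_OF` mapping are hypothetical toy data (only "Michael Ashburner" and "human being" come from the slides), and the sketch assumes each particular is asserted to instantiate a single type.

```python
# Hypothetical toy data: each particular (instance) is asserted to
# instantiate one type. "Tibbles" is an invented example name.
INSTANCE_OF = {
    "Michael Ashburner": "human being",
    "Barry Smith": "human being",
    "Tibbles": "cat",
}


def extension(type_name, instance_of=INSTANCE_OF):
    """Return the class (here: a set) of all instances of the given type."""
    return {inst for inst, typ in instance_of.items() if typ == type_name}


if __name__ == "__main__":
    # The type is named by a general term; its extension is a collection of particulars.
    print(sorted(extension("human being")))
    print(sorted(extension("frog")))  # a type with no recorded instances
```

The type itself ("human being") stays fixed while its extension changes whenever the set of recorded particulars changes, which is one way to see why the slides keep types and classes apart.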
types vs. classes

types
{c,d,e,...} classes

types vs. classes

types
extensions ~ defined classes

Defined class =def
• member of Abba aged > 50 years
• pizza with > 4 different toppings
• red wine to serve with fish
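
One way to read these examples is that a defined class is picked out by a condition on instances rather than by a type name. The sketch below assumes that reading; the `Person` record, its fields and the sample people are hypothetical and are not real data about any band.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical record for a particular human being."""
    name: str
    age: int
    bands: frozenset


def defined_class(instances, condition):
    """Return the class of those instances that satisfy the defining condition."""
    return {x for x in instances if condition(x)}


# Invented sample data, for illustration only.
PEOPLE = {
    Person("person A", 62, frozenset({"Abba"})),
    Person("person B", 45, frozenset({"Abba"})),
    Person("person C", 70, frozenset()),
}

if __name__ == "__main__":
    # "member of Abba aged > 50 years" expressed as a condition on instances.
    members = defined_class(PEOPLE, lambda p: "Abba" in p.bands and p.age > 50)
    print(sorted(p.name for p in members))
```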
Part 2: The OBO Foundry

what cellular component?
what molecular function?
what biological process?

The Gene Ontology

Five bangs for your GO buck
2. based in biological science
3. cross-species data comparability (human, mouse, yeast, fly ...)
4. cross-granularity data integration (molecule, cell, organ, organism)
5. cumulation of scientific knowledge in algorithmically tractable form
6. links people to software
7. part of Open Biomedical Ontologies (OBO)

Entry point for creation of web-accessible biomedical data

GO initially low-tech to encourage users
Simple (web-service-based) tools created to support the work of biologists in creating annotations (data entry)
OBO-to-OWL DL converters now making OBO Foundry annotated data immediately accessible to Semantic Web data integration projects

The OBO Foundry
A suite of high quality interoperable reference ontologies to serve the annotation of biomedical data
providing guidelines for those who need to create new ontology resources
http://obofoundry.org

RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT

GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO); Phenotypic Quality (PaTO) | Biological Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

The OBO Foundry building out from the original GO
Simple guidelines

• use singular nouns
• distinguish continuants from occurrents
• distinguish things from their qualities
• distinguish types from their instances
• do not use the weasel word ‘concept’

CRITERIA (http://obofoundry.org/)

• OPENNESS: The ontology is open and available to be used by all.
• FORMAL LANGUAGE: The ontology is in, or can be instantiated in, a common formal language.
• ORTHOGONALITY: The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.
• CONVERGENCE: The developers agree to work towards a single ontology for each domain.
• UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.
• IDENTIFIERS: The ontology possesses a unique identifier space within OBO.
• VERSIONING: The ontology provider has procedures for identifying distinct successive versions.
• DEFINITIONS: The ontology includes textual definitions for all terms.
• CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.
• DOCUMENTATION: The ontology is well-documented.
• USERS: The ontology has a plurality of independent users.
• COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.

Foundry ontologies all work in the same way

all are built to represent the types existing in a pre-existing domain and the relations between these types in a way which can support reasoning
– we have data
– we need to make this data available for semantic search and algorithmic processing
– we create a consensus-based ontology for annotating the data
– and ensure that it can interoperate with Foundry ontologies for neighboring domains
Formal-Ontological Relations

is_a
part_of
located_at
depends_on
is_boundary_of
adjacent_to

To support integration of ontologies

relational expressions such as is_a, part_of, ... should be used in the same way in all ontologies involved

to define these relations properly

we need to take account of both types and instances in reality

Kinds of relations

is_a

human is_a mammal
all instances of the type human are, as a matter of necessity, instances of the type mammal
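
The instance-level reading of is_a can be sketched as a subset check over recorded instances. This is a minimal illustration under stated assumptions: the `EXTENSIONS` table and its identifiers are hypothetical, and a finite sample can only refute an is_a claim, since it cannot establish that the inclusion holds "as a matter of necessity".

```python
# Hypothetical sample: type name -> identifiers of recorded instances.
EXTENSIONS = {
    "human": {"h1", "h2"},
    "mammal": {"h1", "h2", "m1"},
    "frog": {"f1"},
}


def consistent_with_is_a(sub, sup, extensions=EXTENSIONS):
    """True if every recorded instance of `sub` is also an instance of `sup`.

    A False result refutes `sub is_a sup`; a True result only means that
    the sample contains no counterexample.
    """
    return extensions[sub] <= extensions[sup]


if __name__ == "__main__":
    print(consistent_with_is_a("human", "mammal"))  # no counterexample recorded
    print(consistent_with_is_a("mammal", "human"))  # m1 is a counterexample
```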
Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of biomedical entities | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck
[Figure: fragment of the Foundational Model of Anatomy, running from Anatomical Space and Anatomical Structure down to Pleural Sac, Pleura (Wall of Sac), Pleural Cavity, Parietal Pleura, Visceral Pleura, Interlobar recess, Mediastinal Pleura and Mesothelium of Pleura; links are labeled is_a and part_of]

Foundational Model of Anatomy

Mature OBO Foundry ontologies now undergoing reform
• Cell Ontology (CL)
• Chemical Entities of Biological Interest (ChEBI)
• Foundational Model of Anatomy (FMA)
• Gene Ontology (GO)
• Phenotypic Quality Ontology (PaTO)
• Relation Ontology (RO)
• Sequence Ontology (SO)
Ontologies being built to satisfy Foundry principles ab initio
• Ontology for Clinical Investigations (OCI)
• Common Anatomy Reference Ontology (CARO)
• Ontology for Biomedical Investigations (OBI)
• Protein Ontology (PRO)
• RNA Ontology (RnaO)
• Subcellular Anatomy Ontology (SAO)

Ontologies in planning phase
• Biobank/Biorepository Ontology (BrO, part of OBI)
• Environment Ontology (EnvO)
• Immunology Ontology (ImmunO)
• Infectious Disease Ontology (IDO)
• Mouse Adult Neurogenesis Ontology (MANGO)

OBO Foundry Success Story

Model organism research seeks results valuable for the understanding of human disease. This requires the ability to make reliable cross-species comparisons, and for this anatomy is crucial. But different MOD communities have developed their anatomy ontologies in an uncoordinated fashion.
Ontologies facilitate grouping of annotations

Annotations per term:
brain 20
hindbrain 15
rhombomere 10

Query brain without ontology: 20
Query brain with ontology: 45
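
The two query results above can be reproduced with a short Python sketch. The counts come from the slide; the `PART_OF` links (rhombomere within hindbrain, hindbrain within brain) are an assumption about the hierarchy the slide has in mind, and summing counts assumes that no data item is annotated to more than one of these terms.

```python
# Direct annotation counts, as given on the slide.
DIRECT_COUNTS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# Assumed child -> parent links; the slide does not spell these out.
PART_OF = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def term_and_subterms(term, part_of=PART_OF):
    """Return the term together with everything below it in the hierarchy."""
    terms = {term}
    for child, parent in part_of.items():
        if parent == term:
            terms |= term_and_subterms(child, part_of)
    return terms


def query(term, use_ontology):
    """Count annotations for a term, optionally including its subterms."""
    terms = term_and_subterms(term) if use_ontology else {term}
    return sum(DIRECT_COUNTS.get(t, 0) for t in terms)


if __name__ == "__main__":
    print(query("brain", use_ontology=False))  # 20
    print(query("brain", use_ontology=True))   # 45
```

Without the hierarchy, a query for brain sees only the 20 annotations made directly to that term; with it, the annotations to hindbrain and rhombomere are grouped in as well.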
CARO – Common Anatomy Reference Ontology

• for the first time provides guidelines for model organism researchers who wish to achieve comparability of annotations
• for the first time provides guidelines for those new to ontology work

See Haendel et al., “CARO: The Common Anatomy Reference Ontology”, in Burger (ed.), Anatomy Ontologies for Bioinformatics, Springer, in press.

CARO-conformant ontologies already in development:
• Fish Multi-Species Anatomy Ontology (NSF funding received)
• Ixodidae and Argasidae (Tick) Anatomy Ontology
• Mosquito Anatomy Ontology (MAO)
• Spider Anatomy Ontology
• Xenopus Anatomy Ontology (XAO)

undergoing reform: Drosophila and Zebrafish Anatomy Ontologies
Part 3: The Hole Story

The Ontology of Environments

Initial hypothesis: Environments are holes

environment, place, site, niche, habitat, setting, hole, spatial region, interior, location

Places are holes
[Figure: the OBO Foundry coverage matrix (granularity by relation to time), shown again]
No place for environments
A Neglected Major Category in Ontologies thus far

• Things (e.g. organisms)
• Qualities / Features
• Functions
• Processes

Environments = that into which organisms (etc.) fit

[Figure: the OBO Foundry coverage matrix with the added vertical label ‘environments are here’]
Environments are holes in which organisms, cells, molecules ... can live

[Figure: the OBO Foundry coverage matrix extended with a POPULATION level of granularity]

environments for populations
Environments are holes

Double Hole Structure of the Occupied Niche

Retainer (a boundary of some surrounding structure)
Medium (filling the environing hole)
Tenant (occupying the central hole)

Tenant, medium and retainer

the medium of the bear’s niche is a circumscribed body of air
medium might be body of water, cytosol, nasal mucosa, epithelium, endocardium, synovial tissue ...

The Empty Niche

[Figure labels: Fiat boundary; Physical boundary]

Two Types of Boundary

[Figure labels: Fiat boundary; Physical boundary]

Positive and negative parts

negative part or hole (not made of matter)
positive part (made of matter)

Four Basic Niche Types (Niche as generalized hole)

[Figure: four niche types, numbered 1 to 4]

1: a womb; an egg; a house (better: the interior thereof)
2: a snail’s shell
3: the niche of a pasturing cow
4: the niche around a circling buzzard (fiat boundary)

Types of Niches

a pond, a nest, a cave, a hut, an air-conditioned apartment building

the history of evolution = history of the development of niches

Types of relations for EnvO
in
on (surface of)
surrounds
lives_in
attaches to
realizes
occupies (spatial region)
...

Lexical Semantics

the fruit is in the bowl
the bird is in the nest
the lion is in the cage
the pencil is in the cup
the fish is in the river
the river is in the valley
the water is in the lake
the car is in the garage
the fetus is in the cavity in the uterine lining
the colony of whooping cranes is in its breeding grounds

Double Hole Structure

Retainer (a boundary of some surrounding structure)
Medium (filling the environing hole)
Tenant (occupying the central hole)

when a tenant leaves its niche, the gap left by the tenant is filled immediately by the surrounding medium

A hole in the ground

Solid physical boundaries at the floor and walls
but with a fiat lid:

[Figure label: hole]

Part 4: Not every hole is an environment
An environment is a special kind of (generalized) hole
but what kind?

Elton – niche as role

the ‘niche’ of an animal means its place in the biotic environment, its relations to food and enemies. [...] When an ecologist says ‘there goes a badger’ he should include in his thoughts some definite idea of the animal’s place in the community to which it belongs, just as if he had said ‘there goes the vicar’ (Elton 1927, pp. 63f.)

G.E. Hutchinson: niche as volume in a functionally defined space

the niche = an n-dimensional hypervolume whose dimensions correspond to resource gradients over which species are distributed
G.E. Hutchinson (1957, 1965)

Hypervolume niche = a location in an attribute space defined by a specific constellation of environmental variables such as degree of slope, exposure to sunlight, soil fertility, foliage density, salinity...

Niche Construction

Lewontin: niches normally arise in symbiosis with the activities of organisms or groups of organisms (“ecosystem engineering”); they are not already there, like vacant rooms in a gigantic evolutionary hotel, awaiting organisms who would evolve into them. (The Triple Helix: Gene, Organism, and Environment)

Part Last: Bringing Together the Spatial and Functional Approaches to Environment Ontology
The environment is not a location in an attribute space, but it must have features that have such a location

Every environment must have some spatial location

The functional niche presupposes the spatial-structural niche
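
The relation between the two approaches can be sketched in Python: an environment is modeled as something with a spatial location, and it is the measured features of that environment which occupy a position in the attribute space. Everything named here is an illustrative assumption: the `Environment` record, the variable names, the coordinates and the ranges are invented, and the hypervolume is simplified to an axis-aligned box of (min, max) ranges, which is narrower than Hutchinson's general notion.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical record: a spatially located environment plus its features."""
    name: str
    location: tuple   # (latitude, longitude) of the spatial-structural niche
    features: dict    # environmental variable -> measured value


# Simplifying assumption: the niche hypervolume is a box, one range per variable.
NICHE = {"salinity_ppt": (0.0, 5.0), "temperature_c": (10.0, 25.0)}


def features_in_hypervolume(env, niche=NICHE):
    """True if every niche variable is measured and lies within its range."""
    return all(
        var in env.features and low <= env.features[var] <= high
        for var, (low, high) in niche.items()
    )


if __name__ == "__main__":
    pond = Environment("pond", (42.9, -78.8), {"salinity_ppt": 0.5, "temperature_c": 18.0})
    bay = Environment("bay", (43.0, -70.0), {"salinity_ppt": 30.0, "temperature_c": 12.0})
    for env in (pond, bay):
        print(env.name, env.location, features_in_hypervolume(env))
```

The check is applied to the features, not to the environment as such; the environment record still carries its own spatial location, in line with the claim that the functional niche presupposes the spatial-structural one.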
Ontology of environment + ontology of associated environmental features

J. J. Gibson’s Ecological Psychology

The terrestrial environment is [best] described in terms of a medium, substances, and the surfaces that separate them. (Gibson 1979, p. 16)

Gibson’s theory of surface layout

‘a sort of applied geometry that is appropriate for the study of perception and behavior’ (1979, p. 33)

ground, open environment, enclosure, detached object, attached object, hollow object, place, sheet, fissure, stick, fiber, dihedral, etc.

Gibson’s theory of surface layout as an anatomy of environments

• systems of barriers, doors, pathways to which the behavior of organisms is specifically attuned
• temperature gradients, patterns of movement of air or water molecules
• water holes, food sources (features)
• apertures (mouths, sphincters ...)

Two sets of issues
Environments, as spatial structures, and their parts

Environmental attributes (qualities, functions), determining multidimensional loci à la Hutchinson

Aim

To define structural properties such as: open, closed, connected, compact, spatial coincidence, integrity, aggregate, boundary
RCC (Region Connection Calculus) plus extensions

Ecological Niche Concepts

niche as particular place or subdivision of an environment that an organism or population occupies
vs.
niche as function of an organism or population within an ecological community

Next steps

Our data needs are to link niche features with geo-locations

Scale: From geographic to microbiological

From locations of organisms/samples, sources of museum artifacts ...
to organism interactions, e.g. on bacterial infection – how the interior of one organism or organism part serves as environment for another organism

Hosts for bacterial infection

(interior of)
• lung
• blood (bacteremia)
• erythrocyte – plasmodium inhabits red blood cells
• hepatocyte – plasmodium infects liver cells
• macrophage
• gut and oral mucosa, nasal mucosa, vaginal mucosa
• kidney
• bladder
• portion of epithelial tissue

[Figure panels]
C: bacteria (arrows) adhering to and penetrating the epithelial cells (×3,000)
D: abscess (Ab) formation in subepithelial region with a colony of bacteria (arrows) and a red blood cell (RBC) in it (×2,000)
[Figure: the OBO Foundry coverage matrix (granularity by relation to time), shown again]

Environments, environment parts (features), environment qualities

Ontologies needed

Environment – Taxonomy
place, habitat, city, farm, building (interior), oral cavity, uterine cavity, gut ...

Environment part – Anatomy of environments (Surface, conduit, entry ...)
city wall, uterine wall, water source, ...

Environment function
protection, supply of food, ...

Environment quality – (Phenotypes)
ambient temperature, salinity, ...
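
As a rough illustration of how the four proposed ontologies could be kept apart when annotating data, the Python sketch below treats each as a separate vocabulary and checks that a term is used in the right slot. The terms are the examples listed above; the slot names, the `VOCAB` structure and the `misplaced_terms` helper are hypothetical and do not describe any existing EnvO tooling.

```python
# Example terms taken from the lists above; the slot names are assumptions.
VOCAB = {
    "environment": {"place", "habitat", "city", "farm", "building (interior)",
                    "oral cavity", "uterine cavity", "gut"},
    "part": {"city wall", "uterine wall", "water source"},
    "function": {"protection", "supply of food"},
    "quality": {"ambient temperature", "salinity"},
}


def misplaced_terms(annotation, vocab=VOCAB):
    """Return (slot, term) pairs whose term is not in that slot's vocabulary."""
    problems = []
    for slot, terms in annotation.items():
        allowed = vocab.get(slot, set())
        for term in terms:
            if term not in allowed:
                problems.append((slot, term))
    return problems


if __name__ == "__main__":
    ok = {"environment": ["gut"], "part": ["uterine wall"], "quality": ["salinity"]}
    bad = {"environment": ["salinity"]}  # a quality used where an environment is expected
    print(misplaced_terms(ok))
    print(misplaced_terms(bad))
```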