Developing an Ontology for Ocean Biogeochemistry Data
Biological and Chemical Oceanography Data Management Office
Abstract ID: EGU2011-8578, EGU General Assembly, 3-8 April 2011

Cynthia L. Chandler (1), Molly D. Allison (1), Robert C. Groman (1), Patrick West (2), Stephan Zednik (2), Andrew R. Maffei (1)
(1) Woods Hole Oceanographic Institution, Woods Hole, MA
(2) Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, NY
http://bcodmo.org

Abstract

A discipline as diverse as oceanography will benefit greatly when the research community meets the challenge of effectively incorporating technologies into the existing cyberinfrastructure. Semantically-enabled data delivery systems offer great promise for enabling new and better scientific research, but significant challenges must be met before the full potential can be realized. Evolving expectations for open access to research data, combined with the complexity of global ecosystem science research themes, present a significant challenge, and one that is best met through an informatics approach, wherein research scientists, information managers and computer scientists collaborate in small teams.

The Biological and Chemical Oceanography Data Management Office (BCO-DMO) is funded by the US National Science Foundation Division of Ocean Sciences to work with ocean biogeochemistry researchers in the US to improve access to data resulting from their respective programs. In an effort to improve data access, BCO-DMO staff members are collaborating with researchers from the Tetherless World Constellation (TWC) at Rensselaer Polytechnic Institute (RPI) to develop an ontology that formally describes the concepts and relationships in the data managed by the BCO-DMO. The project required transforming a legacy system of human-readable, flat files of metadata to well-ordered controlled vocabularies, and finally to a fully developed ontology.

To improve semantic interoperability, terms from the BCO-DMO controlled vocabularies are being mapped to controlled vocabulary terms adopted by SeaDataNet, the pan-European infrastructure for ocean and marine data management. Additionally, as part of their efforts to develop generic science ontologies, the team at TWC has facilitated the adoption of key concepts from the BCO-DMO ontology into ontologies developed for other science domains, and the adoption of concepts from other domains into the BCO-DMO ontology.

From the beginning of the project, development of the ontology has been guided by a use-case-based approach. The use cases were derived from data access related requests received from members of the research community served by the BCO-DMO. The resultant ontology represents the information stored in the metadata database and satisfies the requirements of the use cases.

The BCO-DMO metadata database currently contains information that powers several different user and machine-to-machine interfaces to the BCO-DMO data repositories. One goal of the ontology development project is to enable subsequent development of semantically-enabled components (e.g. faceted search) to enhance the power of those interfaces and improve data access through enhanced data discovery.

The Challenge

Challenge: to develop semantically-enabled search capability for the BCO-DMO data system (e.g. faceted search).

Strategy: follow the Semantic Web Methodology and Technological Development Process described by Benedict et al. (2007) to develop an ontology for the BCO-DMO data.

[Figure: Semantic Web Methodology and Technology Development Process. Stages shown: Use Case; Small Team, mixed skills; Analysis; Develop model / ontology; Use Tools; Science/Expert Review & Iteration; Adopt Technology Approach; Leverage Technology Infrastructure; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy.]

Reference: James L. Benedict, Deborah L. McGuinness, Peter Fox. A Semantic Web-based Methodology for Building Conceptual Models of Scientific Information, American Geophysical Union, Fall Meeting (AGU 2007) (Eos Trans. AGU 88(52), Fall Meeting Supplement, Abstract IN53A-0950), 2007.

Ontology Concept Map

The ontology concept map illustrates the semantic relationships between classes represented in the BCO-DMO data system. The ontology is available from MMI ORR at http://mmisw.org/orr/ and RPI TWC SVN site at http://escience.rpi.edu/ontology/BCO-DMO/bcodmo/2/0/.

[Figure: ontology concept map. Annotations: "local / global: disambiguation of terms and concepts"; "see mapping table" (the Instrument Mapping Examples).]

Vocabulary Term Matching and Mapping to SeaDataNet

The BCO-DMO uses controlled vocabularies to record many of the important pieces of information that document the data sets in the BCO-DMO database. To improve semantic interoperability, terms from the local BCO-DMO controlled vocabularies are being mapped to controlled vocabulary terms adopted by other oceanographic data management organizations.

For example, we have mapped the BCO-DMO Instrument vocabulary to the 'SeaDataNet Device Categories' vocabulary (http://vocab.ndg.nerc.ac.uk/list/L05/current) served from the British Oceanographic Data Centre (BODC) Natural Environment Research Council (NERC) Data Grid vocabulary server (http://www.bodc.ac.uk/products/web_services/vocab/) [1]. The SeaDataNet device categories vocabulary and related sampling and sensor type vocabularies met our criteria for a community standard vocabulary [2]:

• availability (Web-accessible, dereferenceable HTTP URLs)
• quality (completeness, clarity and precision, relevance)
• community adoption
• effective governance structure

The process of matching and mapping supports use of: 1) local vocabulary terms, familiar to the originating investigator; 2) intermediate, consistent terms managed by repository custodians (e.g. BCO-DMO); and 3) closest match terms (e.g. from SeaDataNet), shared by the larger community. Multi-level matching and mapping enables retention of important information while improving interoperability of data systems.

Instrument Mapping Examples

original data    BCO-DMO                        SeaDataNet
SIO-CTD          CTD Sea-Bird 911               Sea-Bird SBE 911 CTD (/term/L221/current/TOOL0035)
bongo tow        bongo net                      plankton net (/term/L051/current/22)
XBT              Expendable Bathythermographs   Bathythermograph (/term/L054/current/132)

References:
1. Lowry, R. and G. Williams (in press). "Putting Meaning into SeaDataNet." Mediterranean Marine Sciences.
2. Graybeal, J. 2009. "Choosing and Implementing Established Controlled Vocabularies." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/vocabs/cvchooseimplement

Acknowledgments

This work has been funded by the National Science Foundation. We are especially grateful for the significant contributions made by Peter A. Fox of the Tetherless World Constellation at Rensselaer Polytechnic Institute and by BCO-DMO programmer Tobias Work.
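The three-level matching and mapping described under "Vocabulary Term Matching and Mapping to SeaDataNet" (local term, BCO-DMO term, closest SeaDataNet term) can be pictured as a small lookup structure. The following Python sketch is illustrative only and is not the BCO-DMO implementation: the names InstrumentMapping, build_index and resolve are hypothetical, the three rows are our reading of the Instrument Mapping Examples on this poster, and the SeaDataNet term paths are copied as printed rather than expanded into full URLs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class InstrumentMapping:
    """One mapping row; all three levels are kept so no information is lost."""
    local_term: str        # term supplied by the originating investigator
    bcodmo_term: str       # consistent term managed by the repository
    seadatanet_label: str  # closest-match term shared by the larger community
    seadatanet_path: str   # term path as printed on the poster (not expanded)


# Rows as read from the Instrument Mapping Examples (assumed column assignment).
MAPPINGS: List[InstrumentMapping] = [
    InstrumentMapping("SIO-CTD", "CTD Sea-Bird 911",
                      "Sea-Bird SBE 911 CTD", "/term/L221/current/TOOL0035"),
    InstrumentMapping("bongo tow", "bongo net",
                      "plankton net", "/term/L051/current/22"),
    InstrumentMapping("XBT", "Expendable Bathythermographs",
                      "Bathythermograph", "/term/L054/current/132"),
]


def build_index(mappings: List[InstrumentMapping]) -> Dict[str, InstrumentMapping]:
    """Index every row under each of its three terms, ignoring case."""
    index: Dict[str, InstrumentMapping] = {}
    for row in mappings:
        for term in (row.local_term, row.bcodmo_term, row.seadatanet_label):
            index[term.casefold()] = row
    return index


INDEX = build_index(MAPPINGS)


def resolve(term: str) -> Optional[InstrumentMapping]:
    """Return the full mapping row for a term at any level, or None if unmapped."""
    return INDEX.get(term.strip().casefold())


if __name__ == "__main__":
    hit = resolve("Bongo Tow")
    if hit is not None:
        print(f"{hit.local_term} -> {hit.bcodmo_term} -> "
              f"{hit.seadatanet_label} ({hit.seadatanet_path})")
```

Keeping all three terms in one record, rather than overwriting the local term with its closest match, reflects the poster's point that multi-level mapping retains the investigator's original wording while still exposing a community term for interoperability.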

photo by Molly D. Allison