Undefined 1 (2009) 1–5 1 IOS Press

Linked Data for Science and Education

Editor(s): Name Surname, University, Country Solicited review(s): Name Surname, University, Country Open review(s): Name Surname, University, Country

Carsten Keßler a, Mathieu d’Aquin b and Stefan Dietze c a Institute for Geoinformatics, University of Münster, Germany E-mail: [email protected] b The Open University, UK E-mail: [email protected] c L3S Research Center, Germany E-mail: [email protected]

Abstract. Sharing of resources and metadata is a central principle in scientific and educational contexts. With the emergence of the Linked Data approach as the most recent evolution of the Semantic Web, scientific and educational practitioners have started to adopt those principles. The communities working on Linked Data for science and education have since developed common schemas used for describing scientific or educational resources, substantial collections of structured data, bibliographic collec- tions, domain-specific vocabularies capturing vast amounts of scientific domain knowledge, as well as baseline technologies used to expose and integrate linked datasets. In this paper, we give an overview of the current landscape related to the use of Linked Data in the academic sector. We look at the common challenges, prominent datasets, tools and applications, and conclude on the major directions for research in this area.

Keywords: Survey, Linked Science, Linked Learning, Education

1. Introduction and taxonomies have been developed to describe and expose educational resources, and scientific workflows Sharing of resources, resource metadata, and data and data. Due to the prevailing heterogeneity of de- across the Web is a central principle in scientific and ployed approaches and technologies, interoperability educational contexts. Scientific collaboration has long remains an open challenge. been striving for wider reuse and sharing of knowl- At the same time, the Linked Data (LD) [10] ap- edge and data. Likewise, the Open Educational Re- proach has emerged as the most recent evolution of sources community has promoted the widespread ex- the Semantic Web [8] and has widely established it- ploitation of public and reusable educational resources self as the de-facto standard for sharing data on the throughout the last decade. Hence, technologies to en- Web. While the LD approach provides a set of well- able interoperability of shared resources and data have established principles and (W3C) standards, such as long been at the centre of scientific and educational the use of URIs as identifiers, RDF, SPARQL [91], information systems. However, due to the lack of a aiming at Web-scale data interoperability, it has pro- shared technology stack and joint principles, the land- duced an ever growing amount of data sets and scape of developed and utilised standards is very frag- schemas available on the Web. Given the proven ca- mented and covers an increasing variety of heteroge- pabilities of LD technologies towards realising Web neous technologies, such as repositories with propri- scale data sharing and reuse, scientific and educational etary interfaces and query mechanisms. Moreover, a practitioners have started to adopt those principles. Re- broad range of largely incompatible metadata schemas sults of such activities cover joint schemas used for

0000-0000/09/$00.00 c 2009 – IOS Press and the authors. All rights reserved 2 C. Keßler et al. / Linked Data for Science and Education describing scientific or educational resources, vast col- called attention metadata, capturing the perception of lections of structured data about, for instance, cultural learning resources by learners, has become increas- or historic artifacts, bibliographic collections, domain- ingly useful for tailoring learning experiences to par- specific vocabularies capturing extensive amounts of ticular user needs and requirements. To this end, al- scientific domain knowledge – where the life sciences though a vast amount of educational content and data are particularly well represented – as well as base- is shared on the Web in an open way, the integration line technologies used to expose and integrate linked process is still costly as different learning resources are datasets. isolated from each other and based on different imple- In this paper, we give an overview of the current mentation standards [24]. landscape related to the use of LD for science and ed- ucation, looking at the common challenges, prominent 2.2. Scientific data and metadata sharing datasets, tools and applications, to conclude on the ma- jor directions for research in this area. The scientific community has been developing new ways of sharing (meta-)data along with the move to digital research environments. However, similar to the 2. Knowledge and data sharing in Science & field of TEL, the developed approaches were charac- Education: challenges terized by a segregation of the research process into different aspects. Tools such as Taverna4 or MyEx- With technologies evolving, communities of prac- periment5 support researchers in managing workflows titioners in the science and education sectors have for their experiments. Electronic lab notebooks help produced a variety of standards and approaches, es- keeping track of progress. Data repositories and data pecially to facilitate information sharing. These stan- management systems, such as laboratory information dards have however shown limited success in their management systems, provide means to store and ac- concrete adoption, and generally seem not to be suit- cess research data through proprietary APIs. The way able for the Web-scale distribution of information of they handle and enable interaction with data is of- educational and scientific relevance. We discuss here ten domain-specific due to the varying requirements some of their shortcomings, identifying the challenges across disciplines. Finally, once a finding is published, where LD technologies and principles can prove of the publication and its metadata is provided through an valuable use. electronic library catalogue. Even though these tools and systems support and document parts of the re- 2.1. Educational data and metadata sharing search process, there is still a significant lack of in- tegration of different information sources. The docu- Throughout the last decade, research in the field of mentation of a research project is segregated, placing technology-enhanced learning (TEL) has focused fun- each aspect into a different silo, and the entry point to damentally on enabling interoperability and reuse of that documentation – the publication – is decontextu- learning resources and data. That has led to a frag- alized [6]. mented landscape of competing metadata schemas, i.e., general-purpose ones such as Dublin Core [31] 2.3. Challenges or schemas specific to the educational field, like IEEE Learning Object Metadata (LOM) [46] or ADL While there already is a large amount of educational SCORM1, but also interface mechanisms such as OAI- and scientific data available on the Web via proprietary PMH2 or SQI3. These technologies are exploited by and/or competing schemas and interface mechanisms, educational resource repository providers to support the main challenge is to (a) start adopting LD prin- interoperability. In addition, social data as well as so- ciples and vocabularies while (b) leveraging on exist- ing data available on the Web by non-LD compliant 1Advanced Distributed Learning (ADL) SCORM: http:// means [30]. In the following, we list the major research www.adlnet.org challenges which are currently approached by adopt- 2Open Archives Protocol for Metadata Harvest- ing LD-principles within science and education. ing http://www.openarchives.org/OAI/ openarchivesprotocol.html 3Simple Query Interface: http://www.cen-ltso.net/ 4http://www.taverna.org.uk main.aspx?put=859 5http://www.myexperiment.org C. Keßler et al. / Linked Data for Science and Education 3

Data interoperability—infrastructural: Vast amounts RDFS and SKOS facilitate the combination and align- of educational and scientific data have already been ment of different models making it no longer necessary available on the Web, however, due to heterogeneity to subscribe to just one domain model. of existing storage and interface approaches, interop- The wide range of available tools (see Section4) and erability has been limited [94]. LD offers a technology applications (see Section5) which exploit aforemen- stack composed of RDF as representation standard and tioned capabilities of LD demonstrate its applicability SPARQL as query mechanism and standardised infras- to scientific and educational processes. tructural HTTP endpoint which facilitates exposing, integrating and sharing of Web data at the infrastruc- tural level. In addition, a number of tools, for instance, 3. Datasets for storage and integration of data have been provided, often with domain-specific extensions (see Section4). This section gives an overview of existing datasets Data interoperability—semantic and syntactic: In for education (Section 3.1) and science (Section 3.2). addition to infrastructural boundaries, heterogeneity Section 3.3 lists vocabularies specifically developed with respect to data representation also hinders wide for science and education, and provides an analysis of interoperability of data. This includes, for instance, the vocabularies actually in use. The range of datasets the use of heterogeneous schemas, vocabularies and and vocabularies is steadily growing and has already representation languages. LD addresses these issues reached a size that does not permit complete listings in through a range of principles: RDF(S) provides a joint this paper. This section therefore contains a subjective data representation language, the consistent use of de- selection that is intended to cover the broad range of referencable URIs allows wide reuse of any data or thematic areas. schema item and inherent features of RDF and OWL, such as owl:sameAs statements, facilitate the in- 3.1. Educational datasets terlinking of heterogeneous datasets. These features are particularly well-suited to support scientific and Open Educational Resources (OER) are educational educational processes, fundamentally based on citing material freely available online. The wide availabil- and incrementally extending—i.e., linking—existing ity of educational resources is a common objective for work, that has led to the availability of a wide range universities, libraries, archives and other knowledge- of schemas and datasets in the area of education and intensive institutions raising a number of issues, par- science. In particular the notion of Linked Open Data ticularly with respect to Web-scale metadata interop- (LOD) emphasizes the public and open nature of data, erability or legal as well as licensing aspects. Sev- analogous to the public character of scientific knowl- eral competing standards and educational metadata edge, where raw data is made available along with pub- schemas have been proposed over time, including lications to ensure reproducibility (see Section 3.2). IEEE LTSC LOM (Learning Object Metadata), one of On the educational side, the open data movement the widest adopted, IMS6, ISO/IEC MLR - ISO 19788 found an equivalent with the Open Educational Re- Metadata for Learning Resources (MLR)7 and Dublin sources approach, but also fostered transparency in ar- Core. The adoption of a sole metadata schema is usu- eas such as enrollment statistics or energy consump- ally not sufficient to efficiently characterize learning tion in university facilities. resources. As a solution to this problem, a number Formalising domain knowledge. One particular chal- of taxonomies, vocabularies, policies, and guidelines lenge within both, scientific as well as educational con- (called application profiles) are defined [32]. Some texts, is the lack of formal and well-aligned domain popular examples are: UK LOM Core8, DC-Ed9 and models, which would allow a shared understanding ADL SCORM. of a discipline, such as the life sciences, and the de- scription of domain-specific data and knowledge re- 6http://www.imsglobal.org/metadata/ sources with consistent vocabularies. Here, the LD 7http://www.iso.org/iso/iso_catalogue/ movement was particularly successful in generating a catalogue_tc/catalogue_detail.htm?csnumber= wealth of domain-specific vocabularies, where some 50772 8http://zope.cetis.ac.uk/profiles/ disciplines, such as biomedicine, are particularly well uklomcore/ reflected, while others are still under-represented (see 9http://www.dublincore.org/documents/ Sections 3.1 and 3.2). Moreover, standards such as education-namespace/ 4 C. Keßler et al. / Linked Data for Science and Education

Due to the diversity of exploited standards, existing which includes around 5 million triples about 3,000 OER repositories offer very heterogeneous datasets, audio-video resources, 700 courses, 300 qualifications, differing with respect to schema, exploited vocabu- 100 Buildings, 13,000 people). Many other institutions laries, and interface mechanisms. For example, Open- have since then announced similar platforms, includ- Learn10 is the UK Open University’s contribution to ing in the UK the University of Southampton17 and the OER movement and it is a member of the MIT the University of Oxford.18 Outside the UK, several Open Courseware (OCW) Consortium11. Video mate- other universities and education institutions are join- rial from OpenLearn, distributed through iTunesU has ing the Web of Data, by publishing information of reached more than 40 million downloads in less than value to students, teachers and researchers with LD. 4 years. One of the largest and diverse collections of Noticeable efforts include the Linked Open Data at OER can be found in the GLOBE (Global Learning University of Münster, Germany, and the correspond- Objects Brokered Exchange)12 where jointly, nearly ing LODUM initiative,19 or the Norwegian University 1.2 million learning objects are shared. of Science and Technology exposing its library data Regarding the presence of educational information as Linked Open Data. These universities thus leverage in the LD landscape, two types of LDsets need to LD to facilitate both internal data exchange and make be considered: (1) datasets directly related to educa- the data easily and programmatically accessible for ex- tional material and institutions, including information ternal consumers. Serving a university’s data through from open educational repositories and data produced a SPARQL endpoint creates a one-stop shop for stan- by universities; (2) datasets that can be used in teach- dardized access, and, in combination, creates an in- ing and learning scenarios, while not being directly terlinked university graph. First applications with stu- published for this purpose. This second category in- dents as the main user group have been released [61, cludes for example datasets in the cultural heritage do- 96]. In addition, educational resources metadata has main, such as the ones made available by the Euro- been exposed by the mEducator project [95,94]. The 13 peana project, as well as by individual museums and problems connected to the heterogeneity of metadata libraries (such as the British Museum, which has made can be addressed by converting the data into a format 14 their collection available as LD, representing more that allows for implementing the LD principles. Most than 100 million triples, or the Bibliothèque Nationale often this means that the data which is provided as part 15 de France, which made available information about of RDBMS or in XML format – or, on occasion, in 30,000 books and 10,000 authors in RDF, representing other formats – are converted to RDF. The data model around 2 million triples). It also includes information of RDF is a natural choice as it allows for unique related to research in particular domains, and the re- identification, interlinking to related data, as well as 16 lated publications (see PubMed which covers more enrichment and contextualization. Therefore, general- than 21 million citations, in 800 million triples), as purpose tools (see Section4) are often used to convert well as general purpose information for example from proprietary datasets to RDF. It is common to use DB- Wikipedia (via DBPedia.org). pedia or other big datasets as “linking hubs”. One of Regarding category (1), initiatives have emerged re- the advantages of this approach is that such datasets cently using LD to expose, give access to and ex- are commonly used by other datasets, which auto- ploit public information for education. The Open Uni- matically leads to a plurality of indirect links. In the versity in the UK was the first education organiza- case of more specialized applications, it is beneficial tion to create an LD platform to expose informa- if domain-specific datasets or ontologies can be found tion from across its departments, and that would usu- and linked to. This has been successfully demonstrated ally sit in many different systems, behind many dif- by specialized projects such as Linked Life Data20 in http://data.open.ac.uk ferent interfaces (see the biomedical domain, Organic.Edunet21 in organic agriculture and agroecology, and mEducator in medi- 10http://openlearn.open.ac.uk/ cal education [94]. 11http://ocw.mit.edu/ 12http://globe-info.org/ 13http://www.europeana.eu/ 17See http://data.southampton.ac.uk. 14http://collection.britishmuseum.org/ 18See http://data.ox.ac.uk. 15http://data.bnf.fr/ 19See http://data.uni-muenster.de. 16http://www.ncbi.nlm.nih.gov/pubmed/ and 20http://www.linkedlifedata.com http://thedatahub.org/dataset/bio2rdf-pubmed 21http://www.organic-edunet.eu C. Keßler et al. / Linked Data for Science and Education 5

A more thorough overview of educational LD is the decades, leads to the libraries being at the forefront offered by the Linked Education platform,22 whose of the LD movement. This is reflected both in the high international team maintains a growing collection of number of libraries already publishing their catalogues datasets, models, tools, applications, as well as related as Linked Open Data (see Table2) and in a compa- events and calls. rably high degree of standardization with agreed-upon vocabularies [31,38,47,84] and the publication of key 3.2. Scientific datasets pieces of the library infrastructure, such as Digital Ob- ject Identifiers (DOIs) and thesauri already published Transparency and reproducibility are core principles as LD [9,45,64]. Beyond a mere conversion of MARC for scientific work, yet many disciplines have been 21 [65] catalogue entries, new approaches towards se- struggling to guarantee these beyond the traditional mantic publishing [1,85] are currently being discussed, publication of peer-reviewed articles. While most re- enabling detailed annotations of article contents, such search environments have been undergoing a funda- as hypotheses, and the different ways in which papers mental change moving into a digital environment, tra- 23 ditional paper publication is still prevalent. Different are being cited [16,83]. The World Wide Web Con- disciplines have tried to tackle this challenge by set- sortium addresses library data as a central pillar of the ting up repositories for their research data, with limited Web of Data with the Library Linked Data Incubator 24 success: Proprietary APIs and domain-specific data Group stressing the central role of library LD not models did not live up to the requirements of a mean- only for science and education, but for the Semantic ingful interlinking of a publication with the data, pro- Web as a whole [4]. BibBase has been introduced as cesses, and software that lead to the presented results. a service that supports authors in managing and pub- Over the past years, the potential of the LD approach lishing semantically annotated metadata for their pa- has been realised to achieve this integration. The ulti- pers [92]. mate goal is a Linked Science approach [53] that takes The actual data used and produced during research the traditional idea of Open Science [22] to a new projects can be divided into static datasets and dy- level, where all aspects of a study are available on- namic, semantically annotated data streams. Static line, referring to each other through shared vocabular- datasets are often hosted on repositories such as The ies. Ideally, this should enable the development of ex- Data Hub25 or DataCite26, with the latter offering ecutable papers [54] that automatically gather all data persistent DOIs for uploaded datasets to make them and software for a paper (e.g., packed as a Research citable [11]. These services facilitate data publishing Object [6,74,97]) and enable the reader to review the for institutions and individual researchers that lack whole process at a mouse click. Thinking beyond the the resources or expertise to publish the data them- use of vocabularies and ontologies to support scien- selves. Moreover, they also act as central hubs, there- tific workflows, the long-term vision is to use them as fore adopting a role similar to catalogue services that knowledge artifacts that support the researcher in test- users and developers are likely to visit when look- ing theories and finding new insights [15]. Having all resources relevant for a paper interlinked ing for datasets. Many self-hosted datasets can also be and available online has obvious advantages for re- found on repositories as dumps because of this exposi- producibility and secondary use of often expensive-to- tion. The actual datasets found here cover a variety of produce datasets. Nonetheless, data, software and pro- scientific disciplines, ranging from drug research [49] cesses are useless without the paper that documents over cultural heritage [44] to deforestation data from the work and serves as entry point and anchor for the Amazon rainforest [55,23]. Evidently, some fields any related information. In order to fulfill this func- have made more progress in the adoption of semantic tion in a LD driven research environment, it is vi- web technologies than others. The biomedical domain tal that bibliographic data is made available accord- is an example of a field that has a long tradition in us- ing to the LD principles [7]. This requirement, com- bined with the long-standing tradition and expertise in (meta-)data management the librarians have built over 23See the Semantic Publishing and Referencing (SPAR) ontolo- gies for details: http://purl.org/spar/. 24See http://www.w3.org/2005/Incubator/lld/ 22See http://linkededucation.org, co-maintained by 25See http://thedatahub.org. the authors of this paper, among others. 26See http://datacite.org. 6 C. Keßler et al. / Linked Data for Science and Education

Table 1 Selection of educational datasets Dataset Source & SPARQL Endpoint License mEducator Linked http://linkededucation.org/meducator c * Educational Resources http://meducator.open.ac.uk/resourcesrestapi/rest/meducator/sparql Open University http://data.open.ac.uk CC-BY Linked Data http://data.open.ac.uk/query Achiement Standards http://achievementstandards.org/ CC Network http://asn.jesandco.org:8890/sparql education.data.gov.uk http://education.data.gov.uk/ c * http://services.data.gov.uk/education/sparql Ariadne RDF http://www.ariadne-eu.org/ c * http://knowone.csc.kth.se/sparql/ariadne-big Nature Publishing http://data.nature.com/ CC-0 Group http://data.nature.com/sparql SEEK-AT WD http://www.gsic.uva.es/seek/ c * http://seek.rkbexplorer.com/sparql/ * No license info given, so we assume that these datasets are not under an open license.

Table 2 Selection of library datasets Dataset Source & SPARQL Endpoint License British National Bibliography http://bnb.data.bl.uk CC-0 http://bnb.data.bl.uk/sparql Calames http://calames.abes.fr c – Bibliothèque nationale http://data.bnf.fr CC-BY de France – NTNU special collections http://api.talis.com/stores/gunnbib-digitalmanuskripter PD http://api.talis.com/stores/gunnbib-digitalmanuskripter/sparql Leibniz Information Centre http://zbw.eu/beta/ c * for Economics http://zbw.eu/beta/sparql ECS Southampton EPrints http://eprints.ecs.soton.ac.uk CC-0 – DBLP Bibliography http://dblp.l3s.de/d2r/ c http://dblp.l3s.de/d2r/sparql Hungarian National Library http://nektar.oszk.hu/wiki/Semantic_web c http://setaria.oszk.hu/sparql HBZ NRW (college libraries http://lobid.org CC-0 Northrhine-Westphalia) http://lobid.org/sparql * No license info given, so we assume that these datasets are not under an open license. ing ontologies and has already published a significant analyses. In addition, the Semantic Sensor Web [17, number of datasets.27 82] is a source of semantically annotated data streams Table3 gives an overview of the range of such rela- that provide near-realtime sensor observations of phe- tively static scientific datasets. They manifest the out- nomena. Services that offer data on the current weather put of scientific studies and can act as input to new or water conditions, for example, enable reasoning on those data streams [26] and on-the-fly integration with 27See http://linkedlifedata.com/sources for an additional data sources [48]. In this case, Semantic overview. Web technologies are not only used to annotate and ex- C. Keßler et al. / Linked Data for Science and Education 7 change research results, but also part of the scientific SIOC [76,13]) can also be used to represent academic analysis toolchain. As the semantic annotation of such institutions and communities. services [66] has experienced significant adoption in Core vocabularies to represent elements related to the Sensor Web community, Linked Sensor Data [17] science and education have of course also emerged. have been introduced as an application of the LD prin- For example, TEACH [58] and Linked Science (LSC) [3] ciples [7] to observation data. Despite the challenges represent two core vocabularies, respectively for the that the often high frequency and large volume of ob- representation of courses and of scientific resources. servations pose when serving them as LD [59], the ap- The Ontology of units of Measure and related con- proach has recently gained traction, with first applica- cepts (OM) addresses the annotation of research data tions demonstrated [75,86,5]. with the correct units of measure, addressing a require- ment for semantic interoperability across many fields 3.3. Vocabularies and Schemas for Science and that work empirically [79]. Metadata for Learning Op- Education portunities (MLO) [33] and eXchanging Course Re- lated Information (XCRI) [80] are two attempts at pro- As discussed in previous sections, the eLearning viding standard formats for advertising and exchang- and Technology Enhanced Learning communities have ing information about the course offerings of aca- been very prolific in building metadata standards, demic institutions, which had both been adapted to the especially to describe and organise educational re- LD world (through an RDFs version, see http:// sources. Similarly, the scientific community has cre- linkeduniversities.org/lu/index.php/ ated formats and vocabularies to exchange information vocabularies/). about research. These standards have however rarely While a lot of effort was spent on metadata stan- been reused in or adapted to the LD world, for a vari- dards for learning resources, there have not been ety of reasons, and a new set of open vocabularies are significant initiatives to transfer this effort to LD, now emerging to represent different aspects of science besides the Learning Resource Metadata Initiative and education. (LRMI [88]), which attempts to provide a schema. As a shared interest between science and education org compliant vocabulary for educational resources is the representation of bibliographical references, to and the mEducator resource schema [34], which pro- support citation, reuse and discovery. Several ontolo- vides an adaptable RDF schema for describing edu- gies exist that can be used to represent references, in- cational resources. Many other vocabularies are used cluding the Bibliographic Ontology, BIBO [38], and however that are dedicated to the general representa- others like FABIO [84] based on the more sophisti- tion of (online) resources, and applied to the repre- cated FRBR model [47]. At the basis of a lot of these sentation of educational content (e.g., the W3C Ontol- efforts is the Dublin Core metadata schema [31] which ogy for Media Resources [63] for the representation of represents a common ground for the description of re- multimedia artefacts). It is important also to notice that sources and documents. Other ontologies have been significant efforts are required and being spent on mak- created that focus on more specific aspects of biblio- ing available common topic classifications to anno- graphical references, such as the CiTO ontology [83]. tate these resources, including for example the SKOS Another prominent aspect of representing informa- representations of the Mathematics Subject Classifica- tion for science and education is the representation of tion [2] or of the UK JACS codes [81]. In many cases, the communities and institutions in which scientific the LD-compliant URIs from the Library of Congress and educational activities are realised, especially aca- Subject Headings would also be used [64]. demic organisations. AIISO (Academic Institution In- ternal Structure Ontology) [87] for example is a com- As described above, vocabularies and ontologies for mon, simple ontology for the representation of the sub- LD in the area of science and education are currently organisations of an academic institution and of its of- gaining momentum and new initiatives are emerging ferings. The Bolowgna ontology [25] focuses more frequently. Specialised vocabularies that can represent specifically on the representation of notions related precisely the notions related to these areas are needed, to the Bologna process in Europe. General vocabu- that relate to more general vocabularies for online re- laries to represent organisations (such as the W3C sources, organisations, communities, etc. While these core organisation ontology [78]), as well as connec- vocabularies are being developed, additional effort is tions within communities (such as FOAF [14] and however needed in achieving a widespread adoption of 8 C. Keßler et al. / Linked Data for Science and Education

Table 3 Selection of scientific datasets Dataset Source & SPARQL Endpoint License Bio2RDF datasets http://bio2rdf.org various * http://*.bio2rdf.org/sparql Italian National Research Council http://data.cnr.it CC-BY-NC-ND http://data.cnr.it/sparql 2000 U.S. Census Data http://rdfabout.com/demo/census CC-NC http://www.rdfabout.com/sparql Chronicling America http://chroniclingamerica.loc.gov PD – Drugbank http://www4.wiwiss.fu-berlin.de/drugbank/ c http://www4.wiwiss.fu-berlin.de/drugbank/sparql Linked Life Data http://linkedlifedata.com c http://linkedlifedata.com/sparql European Nature http://semantic.eea.europa.eu CC-BY http://semantic.eea.europa.eu/sparql GeoSpecies Knowledge Base http://lod.geospecies.org/ CC-BY-SA http://lsd.taxonconcept.org/sparql Linked Sensor Data http://wiki.knoesis.org/index.php/SSW_Datasets CC-BY http://sonicbanana.cs.wright.edu:8890/sparql World Bank Linked Data http://worldbank.270a.info/ CC-0 http://worldbank.270a.info/sparql * BioRDF is a collection of datasets which have been published under separate licenses.

4. Tools and Technologies

For what concerns the publication of LD, in many cases, the same tools and technologies that are being used in other domains would apply similarly in sci- ence and education (see e.g., [18]). These include of course triple stores, SPARQL engines and URI deliv- ery mechanisms. In terms of data processing, the need for specialised tools might appear. Indeed, while generic tools such as Google Refine28 with the RDF extension29, Triplify30 or D2RQ31, can be applied on many different types of data, specialised data conversion and processing tools could be more adequate to transform specific for- mats used in science and education. These include in Fig. 1. Vocabularies used in six datasets related to education. Taken particular SIMILE RDFIzer’s converters32 for formats from the LUCERO Blog [20]. such as MARC records33 (for library catalogues), OAI-

28http://code.google.com/p/google-refine/ these common vocabularies. Indeed, as shown in Fig- 29http://refine.deri.ie/ ure1, generic vocabularies are still prominently used 30http://triplify.org/ in educational datasets, and only a few of the promi- 31http://d2rq.org/ 32 nent specialised ones (BIBO, AIISO) are really being http://simile.mit.edu/repository/RDFizers/ marcmods2rdf/ commonly employed (therefore creating entry points 33http://simile.mit.edu/repository/RDFizers/ for data exchange and interoperability). marcmods2rdf/ C. Keßler et al. / Linked Data for Science and Education 9

PMH34 (for open archive repositories) and OCW35 (for tended to handle LD, and the use of such extensions in MIT OpenCourseWare metadata). Other specialised applications for science and education are starting to RDF extractors include Bibtex2RDF36 to convert bib- emerge.40 liographical references in the Bibtex format, or the Youtube2RDF37 tool that converts Youtube playlists into RDF using media vocabularies (see [18]). Inter- 5. Applications active mapping tools such as Karma [62] support the As mentioned in [18], we can identify 3 main use user in selecting meaningful annotations for their data cases in learner centric applications. These scenarios during the conversion process to RDF. are especially relevant in open and distance learning, It is however in developing applications of LD in but can be extended to the education domain in gen- science and education that certain types of tools and eral, to research and to science. technologies become really crucial. As described in the next section, resource discovery and recommendation Resource delivery and navigation: This includes the are clearly amongst the most crucially needed features possibility to browse through resources in an or- both in teaching and research (where resources can be ganised manner and obtain structured data for learning material, articles, experts, etc.) Typical rec- these resources. In such applications, the struc- ommendation approaches can be employed to realise tures, classifications and links that are included such features, either by content or through collabora- in the underlying data are used to provide nav- tive filtering. However, on a number of examples (in- igational and browsing mechanisms improving cluding the ones given below), such approaches have the ease of access to relevant resources. Faceted been shown to improve (in performance and/or usabil- browsing could for example be used, exploiting ity) through relying on the structured representation, structured data to provide relevant filters to ap- links and semantics brought by LD (see e.g. [72]). ply on search results, as in Faceted DBLP [27]. More sophisticated techniques for resource discov- Relevant resources can also be included within ery, as well as for other scenarios in exploiting LD more general environments based on links be- in science and education are also becoming more and tween such environments and available LD. This more important in these areas. These include for ex- has been done for example in the course catalogue ample the use of Natural Language Processing (espe- application of the Open University (see descrip- cially Named Entity Recognition) to analyse and char- tions in [18,19]) and is at the basis of the myEx- periment41 collaborative platform (see [40]). The acterise textual resources in terms of LD entities. DB- 42 pedia Spotlight [68] has for example been employed in Glottolog/Langdoc project, providing a cata- several applications (including DiscOU mentioned be- logue of the languages of the world, is an im- low) in order to connect learning resources to DPpedia pressive showcase of applying SKOS to interlink entities, which in turn provide a hub to all sorts of other resources and make them navigable [73]. Sim- datasets on the LD cloud. Similarly, data mining and ilarly, data portals, such as the Linked Science portals for Brazilian Amazon Rainforest data, hu- data analytics techniques are attracting more and more manitarian response data from the 2010 earth- attention in science and education (especially through quake in Haiti, and geographic information sci- the emerging Learning Analytics domain [35]), where ence community data on researchers and publica- the traditional “data acquisition bottleneck” is being tions [60] provide exploratory user interfaces that replaced by an abundance of rich, structured and in- allow non-specialist users to explore the data.43 terconnected data from the LD cloud. To tackle this, Initial projects such as Open PHACTS [90] are common tools such as Gephi38 and R39 have been ex- now trying to go beyond resource delivery and put Semantic Web technologies to use, in this case for 34http://simile.mit.edu/repository/RDFizers/ drug discovery. oai2rdf/ 35http://simile.mit.edu/repository/RDFizers/ ocw2rdf/ 40see for example http:// 36www.l3s.de/~siberski/bibtex2rdf/ blog.ouseful.info/2011/01/30/ 37http://code.google.com/p/luceroproject/ open-university-undergraduate-module-map/ wiki/Youtube2RDF 41http://www.myexperiment.org/ 38https://gephi.org/ 42http://www.glottolog.org/ 39http://www.r-project.org/ 43http://linkedscience.org/data/ 10 C. Keßler et al. / Linked Data for Science and Education

Resource discovery and recommendation: Going a by researchers with different interests or back- step further with respect to resource delivery, one grounds. of the base scenarios in both education and sci- Personalisation and Social Recommendation: Similarly ence that can be better supported by the use of to resource discovery, the idea of social learn- LD is resource discovery. Indeed, finding rele- ing and online collaboration in science is to de- vant material for learning, teaching or research liver to users information/resources that are of in- in the flow of existing resources and publica- terest and create personalized learning environ- tions is becoming a major challenge for both stu- ments [50]. However, in this case, the idea is to dents and scientists. In particular, plenty of work rely on the profile of the user and on his/her in- has been done on the automatic generation of terests rather than on similarities with existing re- educational resources, for instance, assessment sources. This idea is becoming very popular in items [77,36]. Traditional information retrieval open and distance learning in particular, where and recommender system approaches can be em- the absence of “classroom-based interactions” ployed in education, but with the added impact of needs to be compensated, through technology, by the common/rich structure brought by LD. A typ- other ways of interacting with fellow learners, ical example of such an application is Talis As- teachers and available learning material. Early ex- pire Community Edition44 [89], which allows lec- amples of this type of applications include the turers from UK universities to create and man- Open University’s Course Profile Facebook app48 age online reading lists for the courses they teach. (see [18] for a short description), which allows With the help of a common, LD-based meta- students to share their “learning journey” and in- data representation, the tool offers the possibility teract with other students on the basis of com- to obtain similar, recommended resources to the mon interest, courses, etc. Another, particularly ones being selected. Similarly, the DiscOU ap- innovative approach [51] uses LD as a means plication developed at the Open University45 of- to directly interlink unstructured annotations of fers recommendations of (open) resources rele- eBooks in order to improve interoperability and vant to a TV programme, but in this case based correlation of related learner feedback. on semantic annotations of the resources using LD entities to measure their relevance and pro- In addition, there are many advantages in employing vide meaningful explanations of the proposed re- LD that are not necessarily directly for the purpose of sults [21]. Additionally, social educational en- building end-user (student, learner, lecturer, scientist) vironments such as Metamorphosis+ [52] have applications, but that address the general challenges adopted LD-mechanisms to integrate, cluster and mentioned in Section2. For example, the Bowlogna correlate educational resources and subjects. ontology [25] has been developed to support infor- mation exchange between academic institutions across In the scientific community, first attempts have Europe. Many applications of LD are also targeting been made to utilise LD to collect input data for support for management and organisation within edu- studies. LinkedDataLens46 is a tool that gener- cational institutions, including for example the mon- ates input data for network analysis algorithms itoring of research communities at the Open Univer- from LD, and makes the output available as LD sity [19] or at the University of Bristol.49 again [39,42]. The iExplore tool47 [71] imple- ments such a user interface with domain expert users as target group, leveraging the graph struc- 6. Conclusions ture to assist the researcher in developing new hy- potheses. Brenninkmeijer et al. [12] propose sci- The LD approach has gained significant traction entific lenses that make views on scientific linked in science and education. It facilitates the exchange datasets context aware, so that they can be used of data within and between scientific communities. Likewise, the OER community leverages LD to en- 44http://community.talisaspire.com/ hance annotation and retrievability of learning ma- 45http://discou.info 46http://linkeddatalens.isi.edu 47http://knoesis.wright.edu/iExplore/ 48http://apps.facebook.com/courseprofiles/ iExplore.html 49http://researchrevealed.ilrt.bris.ac.uk/ C. Keßler et al. / Linked Data for Science and Education 11 terials for use in classrooms as well as virtual and its website. With a growing number of datasets being distance learning environments. The number of aca- published, provenance, versioning, and archiving be- demic institutions publishing LD is constantly grow- comes more important. Approaches to handle these ex- ing, as the growing lists of partners at the LinkedUni- ist [37,43,67,70], but are hardly applied in practice so versties.org and LinkedEducation.org websites show. far. Likewise, the potential to leverage Semantic Web These hubs, together with blogs,50 mailing lists51 and technology to support the scientific process – besides regular events such as the Linked Science [56,57] and mere data management and sharing – is still largely un- Linked Learning [28,29] workshop series allow the tapped [15], for example, when it comes to formalizing communities to share and discuss best practices, next hypotheses [41] or breaking down results into nano- steps, and collaborations. Beyond the the publication publications [69]. Beyond the technology, science and of LD for science and education, standardisation ef- education will have to continue to become more open, forts such as VIVO,52 euroCRIS,53 or the Learning Re- with support for the open access movements on a po- source Metadata Initiative (LRMI) are engaging with litical level. practitioners in discussions towards a wider adoption As Bechhofer et al. [6] note, Linked Data alone is of the LD approach. not enough for scientists. It only provides the base- Concerning next steps for the LD for science and line technology, and there are still plenty of open chal- education community, further internationalisation and lenges in terms of tool support, coordination and ca- interconnection is indeed an important goal with its pacity building, as well as policy development in or- current centre in Europe, and especially the UK. As der to fully exploit the potential for the education and the overall landscape is still very fragmented, as high- science sectors. lighted by this special issue, there are significant chal- lenges to be tackled in order to enable LD to live up to its expectations in the academic sector. For exam- References ple, dereferenceable URIs, shared vocabularies, and a standardized technology stack are only the first build- [1] M.A. Angrosh, Stephen Cranefield, and Nigel Stanger. Con- ing blocks towards the vision of an executable paper textual Information Retrieval in Research Articles: Semantic Publishing Tools for the Research Community. Semantic Web that is linked to all relevant data, models, and soft- Journal, acceppted. ware. In order to achieve such ambitious aims, tools [2] Ioannis Antoniou, Charalampos Bratsas, Anastasia Dimou, are required that support the data publishing process, Patrick Ion, Christoph Lange, and Wolfram Sperber. Mathe- adapted to the specific requirements of the scientific matics Subject Classification Linked Wiki Version 0.1 Alpha http://sci-class.math.auth. and educational domains. These tools have to be inte- 1. Available from gr/MSCLW; last accessed 2012-08-26, 2012. grated with the practitioners’ environments and easy to [3] Alkyoni Baglatzi, Tomi Kauppinen, and Carsten Keßler. use, as outlined in [93], so that no specialist knowledge Linked Science Core Vocabulary Specification. Available about Semantic Web technologies is required. Com- from http://linkedscience.org/lsc/ns; last ac- parable to content management systems that allow lay cessed 2012-08-26, 2011. [4] Thomas Baker, Emmanuelle Bermès, Karen Coyle, Gordon users to administrate websites, data management sys- Dunsire, Antoine Isaac, Peter Murray, Michael Panzer, Jodi tems are required that support users in publishing their Schneider, Ross Singer, Ed Summers, William Waites, Jeff data online as well as aligning it with existing data Young, and Marcia Zeng. Library Linked Data Incubator sources. Currently, some institutions try to meet the Group Final Report. W3C Incubator Group Report, avail- need for such expertise by hiring data scientists; how- able from http://www.w3.org/2005/Incubator/ lld/XGR-lld-20111025/; last accessed 2012-08-26, ever, this model is unlikely to scale, if we think a few 2011. years ahead, when a data portal will probably be a [5] Payam Barnaghi, Mirko Presser, and Klaus Moessner. Pub- common service to offer for a university, in addition to lishing Linked Sensor Data. In Proceedings of the 3rd Inter- national Workshop on Semantic Sensor Networks, Shanghai, China, 2010. 50See http://linkedscience.org, http: [6] Sean Bechhofer, Iain Buchan, , Paolo Missier, //lucero-project.info/, and http://blogs.ecs. John Ainsworth, Jiten Bhagat, Philip Couch, Don Cruick- soton.ac.uk/data/ for examples. shank, Mark Delderfield, Ian Dunlopa, Matthew Gamble, Da- 51https://groups.google.com/forum/ nius Michaelides, Stuart Owen, David Newman, Shoaib Sufi, ?fromgroups#!forum/linked-universities and Carole Goble. Why Linked Data is Not Enough for Scien- 52http://vivoweb.org tists. In e-Science (e-Science), 2010 IEEE Sixth International 53http://eurocris.org/ Conference on, pages 300–307. IEEE, 2010. 12 C. Keßler et al. / Linked Data for Science and Education

[7] Tim Berners-Lee. Linked Data – Design Issues. Available Demo session of the International Semantic Web Conference, online from http://www.w3.org/DesignIssues/ 2012. LinkedData.html; last accessed 2012-04-04, 2009. [22] Paul A. David. Understanding the emergence of ‘open science’ [8] Tim Berners-Lee, James Hendler, and Ora Lassila. The Se- institutions: functionalist economics in historical context. In- mantic Web. Scientific American, 284(5):34–43, May 2001. dustrial and Corporate Change, 13(4):571–589, 2004. [9] Geoffrey Bilder. Content Negotiation for CrossRef [23] Giovana Mira de Espindola and Tomi Kauppinen. Semantic DOIs. Available from http://www.crossref.org/ Linking of Data for the Brazilian Amazon Rainforest Statistics. CrossTech/2011/04/content_negotiation_ In GeoChange 2010—GIScience for Environmental Change, for_crossr.html; last accessed 2012-08-26, April 2011. Campos do Jordão, São Paulo, Brazil, November 2010. [10] Christian Bizer, Tom Heath, and Tim Berners-Lee. Linked [24] Rafael de Santiago and André L.A. Raabe. Architecture Data – The Story So Far. International Journal on Semantic for Learning Objects Sharing among Learning Institutions- Web and Information Systems, 5(3):1–22, 2009. LOP2P. IEEE Transactions on Learning Technologies, April- [11] Jan Brase. DataCite—A global registration agency for research June:91–95, 2010. data. In Cooperation and Promotion of Information Resources [25] Gianluca Demartini, Iliya Enchev, Joël Gapany, and Philippe in Science and Technology, 2009. COINFO’09. Fourth Inter- Cudré-Mauroux. The Bowlogna Ontology: Fostering Open national Conference on, pages 257–261. IEEE, 2009. Curricula and Agile Knowledge Bases for Europe’s Higher Ed- [12] Christian Y. A. Brenninkmeijer, Chris Evelo, Carole Goble, ucation Landscape. Semantic Web Journal, this issue. Alasdair J.G. Gray, Paul Groth, , Robert Stevens, [26] Anusuriya Devaraju and Tomi Kauppinen. Geo-Processes Antony Williams, and Egon L. Willighagen. Scientific Lenses and Properties Observed by Sensors: Can We Relate Them? over Linked Data: An approach to support task specific views In GeoChange 2010—GIScience for Environmental Change, of the data. A vision. In Tomi Kauppinen, Line C. Pouchard, Campos do Jordão, São Paulo, Brazil, November 2010. and Carsten Keßler, editors, Proceedings of the 2nd Interna- [27] Jörg Diederich and Wolf-Tilo Balke. FacetedDBLP – Naviga- tional Workshop on Linked Science 2012 – Tackling Big Data tional Access for Digital Libraries. Bulletin of IEEE Technical (LISC2012), in conjunction with 11th International Semantic Committee on Digital Libraries, 4(1), 2008. Web Conference (ISWC2012), Boston, MA, 2012. [28] Stefan Dietze, Mathieu d’Aquin, and Dragan Gaševic,´ editors. [13] John Breslin, Andreas Harth, Uldis Bojars, and Stefan Decker. Proceedings of Linked Learning 2011: 1st International Work- Towards Semantically-Interlinked Online Communities. In shop on eLearning Approaches for the Linked Data Age, in Asunción Gómez-Pérez and Jérôme Euzenat, editors, The Se- conjunction with the 8th Extended Semantic Web Conference, mantic Web: Research and Applications, volume 3532 of ESWC2011, Heraklion, Crete, 2012. Lecture Notes in Computer Science, pages 71–83. Springer [29] Stefan Dietze, Mathieu d’Aquin, and Dragan Gaševic,´ editors. Berlin/Heidelberg, 2005. Proceedings of the 2nd International Workshop on Learning [14] Dan Brickley and Libby Miller. FOAF Vocabulary Specifi- and Education with the Web of Data, in conjunction with the cation 0.98. Available from http://xmlns.com/foaf/ World Wide Web Conference 2012 (WWW2012), Lyon, France, spec/20100809.html; last accessed 2012-08-26, 2010. 2012. [15] Boyan Brodaric and Marc Gahegan. Ontology Use for Seman- [30] Stefan Dietze, Salvador Sanchez-Alonso, Hannes Ebner, Hong tic e-Science. Semantic Web Journal, 1:149–153, April 2010. Qing, Daniela Giordano, Ivana Marenzi, and Bernardo Pereira [16] Paolo Ciccarese, David Shotton, Silvio Peroni, and Tim Clark. Nunes. Interlinking educational Resources and the Web of CiTO + SWAN: The Web Semantics of Bibliographic Records, Data – a Survey of Challenges and Approaches. Emerald Citations, Evidence and Discourse Relationships. Semantic Program: electronic Library and Information Systems, 47(1), Web Journal, acceppted. 2013. [17] Michael Compton, Payam Barnaghi, Oscar Corcho, Simon [31] Dublin Core Metadata Initiative. DCMI Metadata Terms. Cox, Manfred Hauswirth, Cory Henson, Krzysztof Janowicz, Avaliable online from http://dublincore.org/ Laurent Lefort, Amit Sheth, and Kerry Taylor. The SSN Ontol- documents/dcmi-terms/; last accessed 2012-08-02, ogy of the W3C Semantic Sensor Network Incubator Group. 2012. , forthcoming 2012. [32] Erik Duval, Wayne Hodgins, Stuart Sutton, and Stuart L. [18] Mathieu d’Aquin. Linked Data for Open and Distance Weibel. Metadata Principles and Practicalities. D-Lib Maga- Learning. Available from http://www.col.org/ zine, 8(4), 2013. resources/publications/Pages/detail.aspx? [33] European Committee for Standardization. Metadata for PID=420; last accessed 2012-08-26, 2012. Learning Opportunities (MLO) - Advertising. Available from [19] Mathieu d’Aquin. Putting Linked Data to Use in a Large ftp://ftp.cenorm.be/PUBLIC/CWAs/e-Europe/ Higher-Education Organisation. In Interacting with Linked WS-LT/CWA15903-00-2008-Dec.pdf; last accessed Data workshop (ILD) at Extended Semantic Web Conference, 2012-08-26, 2008. ESWC, 2012. [34] Mitsopoulou Evangelia, Davide Taibi, Daniela Giordano, Ste- [20] Mathieu d’Aquin. So, what’s in linked datasets for fan Dietze, Hong Qingand Yu, Panagiotis Bamidis, Charalam- education? Available from the LUCERO blog at pos Bratsas, and Luke Woodham. Connecting Medical Edu- http://lucero-project.info/lb/2012/04/ cational Resources to the Linked Data Cloud: the mEducator so-whats-in-linked-datasets-for-education/; RDF Schema, Store and API. In Proceedings of Linked Learn- last accessed 2012-08-28, 2012. ing 2011, Proceedings of the 1st International Workshop on [21] Mathieu d’Aquin, Carlo Allocca, and Trevor Collins. DiscOU: eLearning Approaches for the Linked Data Age, CEUR-WS, A Flexible Discovery Engine for Open Educational Resources Vol. 717, Heraklion, Greece, 2011. Using Semantic Indexing and Relationship Summaries. In C. Keßler et al. / Linked Data for Science and Education 13

[35] Rebecca Ferguson. The State of Learning Analytics in 2012: [49] A. Jentzsch, O. Hassanzadeh, C. Bizer, B. Andersson, and A Review and Future Challenges. Knowledge Media Institute, S. Stephens. Enabling Tailored Therapeutics with Linked Data. Technical Report KMI-2012-01, 2012. In Proceedings of the 2nd Workshop on Linked Data on the [36] Muriel Foulonneau and Valentin Grouès. Common vs. Expert Web (LDOW2009), 2009. knowledge: making the Semantic Web an educational model. [50] Zoran Jeremica,´ Jelena Jovanovic,´ and Dragan Gaševic.´ Per- In Proceedings of Linked Learning 2012 – The 2nd Interna- sonal Learning Environments on the Social Semantic Web. Se- tional Workshop on Learning and Education with the Web of mantic Web Journal, this issue. Data (LiLe-2012 at WWW-2012), Lyon, France, 2012. [51] Julien Robinson. Johann Stan, Myriam Ribiere. Using linked [37] Daniel Garijo and . Augmenting PROV with data to reduce learning latency for e-book readers. In Pro- Plans in P-PLAN: Scientific Processes as Linked Data. In ceedings of Linked Learning 2011 - the 1st International Work- Tomi Kauppinen, Line C. Pouchard, and Carsten Keßler, shop on eLearning Approaches for Linked Data Age, Herak- editors, Proceedings of the 2nd International Workshop on lion, Greece, 2011. Linked Science 2012 – Tackling Big Data (LISC2012), in con- [52] Eleni Kaldoudi, Nikolas Dovrolis, and Stefan Dietze. Informa- junction with 11th International Semantic Web Conference tion Organization on the Internet based on Heterogeneous So- (ISWC2012), Boston, MA, 2012. cial Networks. In Proceedings of 29th ACM International Con- [38] Frédérick Giasson and Bruce D’Arcus. Bibliographic On- ference on Design of Communication (ACM SIGDOCâA˘Z11)´ , tology Specification. Avaliable online from http:// Pisa, Italy, 2011. bibliontology.com/specification; last accessed [53] Tomi Kauppinen, Alkyoni Baglatzi, and Carsten Keßler. 2011-12-29, 2009. Linked Science: Interconnecting Scientific Assets. In Terence [39] Yolanda Gil and Paul Groth. Linkeddatalens: Linked data as Critchlow and Kerstin Kleese-Van Dam, editors, Data Inten- a network of networks. In Proceedings of the sixth interna- sive Science. CRC Press, USA, forthcoming 2012. tional conference on Knowledge capture, K-CAP ’11, pages [54] Tomi Kauppinen and Giovana Mira de Espindola. Linked 191–192, New York, NY, USA, 2011. ACM. Open Science—Communicating, Sharing and Evaluating Data, [40] Carole Anne Goble and David Charles De Roure. myEx- Methods and Results for Executable Papers. Proceedings of periment: social networking for workflow-using e-scientists. the International Conference on Computational Science (ICCS In Workshop on Workflows in support of large-scale science, 2011), Procedia Computer Science, 4(0):726–731, 2011. 2007. [55] Tomi Kauppinen and Giovana Mira de Espindola. Ontology- [41] Bernardo Gonçalves, Fabio Porto, and Ana Maria Moura. On Based Modeling of Land Change Trajectories in the Brazilian the Semantic Engineering of Scientific Hypotheses as Linked Amazon. In Geoinformatik 2011—GeoChange, Münster, Ger- Data. In Tomi Kauppinen, Line C. Pouchard, and Carsten many„ June 2011. Keßler, editors, Proceedings of the 2nd International Workshop [56] Tomi Kauppinen, Line C. Pouchard, and Carsten Keßler, edi- on Linked Science 2012 – Tackling Big Data (LISC2012), in tors. Proceedings of the 1st International Workshop on Linked conjunction with 11th International Semantic Web Conference Science 2011 (LISC2011), in conjunction with 10th Inter- (ISWC2012), Boston, MA, 2012. national Semantic Web Conference (ISWC2011), Bonn, Ger- [42] Paul Groth and Yolanda Gil. Linked Data for Network Science. many, 2011. In Tomi Kauppinen, Line C. Pouchard, and Carsten Keßler, [57] Tomi Kauppinen, Line C. Pouchard, and Carsten Keßler, ed- editors, Proceedings of the First International Workshop on itors. Proceedings of the 2nd International Workshop on Linked Science 2011, in conjunction with the International Se- Linked Science 2012 – Tackling Big Data (LISC2012), in con- mantic Web Conference (ISWC2011). Bonn, Germany, October junction with 11th International Semantic Web Conference 24, 2011. (ISWC2012), Boston, MA, 2012. [43] Olaf Hartig. Provenance Information in the Web of Data. In [58] Tomi Kauppinen and Johannes Trame. Teaching Core Proceedings of the 2nd Workshop on Linked Data on the Web Vocabulary Specification. Available from http:// (LDOW2009), 2009. linkedscience.org/teach/ns; last accessed 2012- [44] Bernhard Haslhofer, Elaheh Momeni, Manuel Gay, and Rainer 08-26, 2011. Simon. Augmenting Europeana Content with Linked Data Re- [59] Carsten Keßler and Krzysztof Janowicz. Linking Sensor sources. In Proceedings of the 6th International Conference on Data—Why, to What, and How? In Kerry Taylor, Arun Ayya- Semantic Systems. ACM, 2010. gari, and David De Roure, editors, Proceedings of the 3rd [45] Julia Hauser. Linked Data Service of the German Na- International workshop on Semantic Sensor Networks 2010 tional Library. Available from http://www.dnb.de/ (SSN10) in conjunction with the 9th International Seman- EN/Service/DigitaleDienste/LinkedData/ tic Web Conference (ISWC 2010), volume 668 of CEUR-WS, linkeddata_node.html, July 2012. Shanghai, China, 2010. [46] IEEE. IEEE Standard for Learning Object Metadata, IEEE Std [60] Carsten Keßler, Krzysztof Janowicz, and Tomi Kauppinen. 1484.12.1-2002. doi: 10.1109/IEEESTD.2002.94128, 2002. spatial@linkedscience—Exploring the Research Field of GI- [47] IFLA Study Group on the Functional Requirements of Bibli- Science with Linked Data. In Ningchuan Xiao, Mei-Po ographic Records. Functional Requirements of Bibliographic Kwan, Michael F. Goodchild, and Shashi Shekhar, editors, Records, 1998. Geographic Information Science. 7th International Confer- [48] Krzysztof Janowicz, Sven Schade, Arne Bröring, Carsten ence, GIScience 2012, Columbus, OH, USA, September 18–21, Keßler, Patrick Maué, and Christoph Stasch. Semantic En- 2012, Proceedings, volume 7478 of Lecture Notes in Computer ablement for Spatial Data Infrastructures. Transactions in GIS, Science, pages 102–117, 2012. 14(2):111–129, 2010. [61] Carsten Keßler and Tomi Kauppinen. Linked Open Data Uni- versity of Münster—Infrastructure and Applications. In Demos 14 C. Keßler et al. / Linked Data for Science and Education

of the Extended Semantic Web Conference 2012 (ESWC2012), port Content-based Recommender Systems. In International Heraklion, Crete, Greece, May 2012. Conference on Semantic Systems (I-Semantics), 2012. [62] Craig A. Knoblock, Pedro Szekely, Jose Luis Ambite, Shub- [73] Sebastian Nordhoff and Harald Hammarström. Glot- ham Gupta, Aman Goel, Maria Muslea, Kristina Lerman, and tolog/Langdoc: Defining Dialects, Languages, and Language Parag Mallick. Interactively Mapping Data Sources into the Families as Collections of Resources. In Tomi Kauppinen, Semantic Web. In Tomi Kauppinen, Line C. Pouchard, and Line C. Pouchard, and Carsten Keßler, editors, Proceedings of Carsten Keßler, editors, Proceedings of the First International the First International Workshop on Linked Science 2011, in Workshop on Linked Science 2011, in conjunction with the conjunction with the International Semantic Web Conference International Semantic Web Conference (ISWC2011). Bonn, (ISWC2011). Bonn, Germany, October 24, 2011. Germany, October 24, 2011. [74] Kevin Page, Raúl Palma, Piotr Ho, Graham Klyne, Stian [63] WonSuk Lee, Werner Bailer, Tobias Bürger, Pierre-Antoine Soiland-Reyes, Don Cruickshank, Rafael González Cabero, Champin, Jean-Pierre Evain, Véronique Malaisé, Thierry Esteban García Cuesta, David De Roure, Jun Zhao, and Michel, Felix Sasaki, Joakim Söderberg, Florian Stegmaier, José Manuel Gómez-Pérez. From workflows to Research and John Strassner. Ontology for media resources 1.0. Avail- Objects: an architecture for preserving the semantics of sci- able from http://www.w3.org/TR/mediaont-10/; ence. In Tomi Kauppinen, Line C. Pouchard, and Carsten last accessed 2012-08-26, 2012. Keßler, editors, Proceedings of the 2nd International Workshop [64] Library of Congress. Subject Headings. Avail- on Linked Science 2012 – Tackling Big Data (LISC2012), in able from http://id.loc.gov/authorities/ conjunction with 11th International Semantic Web Conference subjects.html; last accessed 2012-98-26. (ISWC2012), Boston, MA, 2012. [65] Library of Congress. MARC 21 Format for [75] Kevin R. Page, David C. De Roure, Kirk Martinez, Jason D. Bibliographic Data. Available online from url- Sadler, and Oles Y. Kit. Linked Sensor Data: RESTfully http://www.loc.gov/marc/bibliographic/ecbdhome.html; last serving RDF and GML. In Semantic Sensor Networks 2009 accessed 2012-08-13, 1999–2012. (SSN09), October 2009. [66] Patrick Maué, Henry Michels, and Marcell Roth. Injecting se- [76] Alexandre Passant, Uldis Bojàrs, John Breslin, and Stefan mantic annotations into (geospatial) Web service descriptions. Decker. The SIOC Project: Semantically-Interlinked On- Semantic Web Journal, acceppted. line Communities, from Humans to Machines. In Julian [67] James P. McCusker, Timothy Lebo, Li Ding, Cynthia Chang, Padget, Alexander Artikis, Wamberto Vasconcelos, Kostas Paulo Pinheiro da Silva, and Deborah L. McGuinness. Where Stathis, Viviane da Silva, Eric Matson, and Axel Polleres, ed- did you hear that? Information and the Sources They Come itors, Coordination, Organizations, Institutions and Norms in From. In Tomi Kauppinen, Line C. Pouchard, and Carsten Agent Systems V, volume 6069 of Lecture Notes in Computer Keßler, editors, Proceedings of the First International Work- Science, pages 179–194. Springer Berlin/Heidelberg, 2010. shop on Linked Science 2011, in conjunction with the Interna- 10.1007/978-3-642-14962-7_12. tional Semantic Web Conference (ISWC2011). Bonn, Germany, [77] Guillermo Alvaro Rey, Irene Celino, Mariana Damova, Dan- October 24, 2011. ica Damljanovic, Ning Li, Panos Alexopoulos, and Vladan [68] Pablo N. Mendes, Max Jakob, Andrés García-Silva, and Chris- Devedzic. Semi-Automatic Generation of Quizzes and Learn- tian Bizer. DBpedia Spotlight: Shedding Light on the Web of ing Artifacts from Linked Data. In Proceedings of Linked Documents. In International Conference on Semantic Systems Learning 2012 - The 2nd International Workshop on Learn- (I-Semantics), 2011. ing and Education with the Web of Data (LiLe-2012 at WWW- [69] , Herman van Haagen, Christine Chichester, 2012), Lyon, France, 2012. Peter-Bram ’t Hoen, Johan T den Dunnen, Gertjan van Om- [78] Dave Reynolds. An organization ontology. Available from men, Erik van Mulligen, Bharat Singh, Rob Hooft, Marco http://www.w3.org/TR/vocab-org/; last accessed Roos, Joel Hammond, Bruce Kiesel, Belinda Giardine, Jan 2012-08-26, 2012. Velterop, Paul Groth, and Schultes. The value of data. Nature [79] Hajo Rijgersberg, Mark van Assem, and Jan Top. Ontology of Genetics, 43:281–283, 2011. Units of Measure and Related Concepts. Semantic Web Jour- [70] Johnson Mwebaze, Danny Boxhoorn, and Edwin Valentijn. nal, this issue. Supporting Scientific Collaboration Through Class-Based Ob- [80] Ben Ryan, Mark Stubbs, and Scott Wilson. XCRI CAP 1.1. ject Versioning. In Tomi Kauppinen, Line C. Pouchard, and Available from http://www.xcri.org/wiki/index. Carsten Keßler, editors, Proceedings of the First International php/XCRI_CAP_1.1; last accessed 2012-08-26, 2009. Workshop on Linked Science 2011, in conjunction with the [81] Nadeem Shabir. JACS (Joint Academic Coding System) Data International Semantic Web Conference (ISWC2011). Bonn, Set. Available from http://jacs.dataincubator. Germany, October 24, 2011. org/; last accessed 2012-08-26, 2012. [71] Vinh Nguyen, Olivier Bodenreider, Todd Minning, and Amit [82] Amit Sheth, Cory Henson, and Satya S. Sahoo. Semantic Sen- Sheth. The Knowledge-driven Exploration of Integrated sor Web. Internet , IEEE, 12(4):78–83, 2008. Biomedical Knowledge Sources Facilitates the Generation of [83] David Shotton. CiTO, the Citation Typing Ontology. Journal New Hypotheses. In Tomi Kauppinen, Line C. Pouchard, and of Biomedical Semantics, 1(Suppl. 1):S6, 2010. Carsten Keßler, editors, Proceedings of the First International [84] David Shotton and Silvio Peroni. FaBiO, the FRBR-aligned Workshop on Linked Science 2011, in conjunction with the Bibliographic Ontology, 2012. International Semantic Web Conference (ISWC2011). Bonn, [85] David Shotton, Katie Portwin, Graham Klyne, and Alistair Germany, October 24, 2011. Miles. Adventures in Semantic Publishing: Exemplar Seman- [72] Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, Da- tic Enhancements of a Research Article. PLoS Comput Biol, vide Romito, and Markus Zanker. Linked Open Data to sup- 5(4), 2009. C. Keßler et al. / Linked Data for Science and Education 15

[86] Christoph Stasch, Sven Schade, Alejandro Llaves, Krzysztof In Tomi Kauppinen, Line C. Pouchard, and Carsten Keßler, Janowicz, and Arne Bröring. Aggregating Linked Sensor editors, Proceedings of the 2nd International Workshop on Data. In Kerry Taylor, Arun Ayyagari, David De Roure, editor, Linked Science 2012 – Tackling Big Data (LISC2012), in con- The 4th International Workshop on Semantic Sensor Networks junction with 11th International Semantic Web Conference 2011 (SSN11) in conjunction with the 10th International Se- (ISWC2012), Boston, MA, 2012. mantic Web Conference (ISWC 2011), Bonn, Germany, 2011. [94] Hong Qing Yu, Stefan Dietze, Ning Li, Carlos Pedrinaci, Da- [87] Rob Styles and Nadeem Shabir. Academic Institution Internal vide Taibi, Nikolas Dovrolls, Teodor Stefanut, Eleni Kaldoudi, Structure Ontology (AIISO). Available from http://purl. and John Domingue. A linked data-driven & service-oriented org/vocab/aiiso/schema#; last accessed 2012-08-26, architecture for sharing educational resources. In 1st Inter- 2008. national Workshop on eLearning Approaches for Linked Data [88] The Learning Resource Metadata Initiative (LRMI). LRMI Age (Linked Learning 2011), 8th Extended Semantic Web Con- Specification version 1.0. Available from http://www. ference (ESWC2011), 2011. Workshop proceedings published lrmi.net/the-specification; last accessed 2012- as CEUR Vol 717. 08-26, 2012. [95] Hong Qing Yu, Stefan Dietze, Ning Li, Carlos Pedrinaci, Da- [89] Nadeem Shabir Chris Clarke Justin Leavesley Tom Heath, vide Taibi, Nikolas Dovrolls, Teodor Stefanut, Eleni Kaldoudi, Ross Singer. Assembling and applying an education graph and John Domingue. A linked data-driven & service-oriented based on learning resources in universities. In Proceedings of architecture for sharing educational resources. In Proceed- Linked Learning 2012 – The 2nd International Workshop on ings of Linked Learning 2011 – the 1st International Work- Learning and Education with the Web of Data (LiLe-2012 at shop on eLearning Approaches for Linked Data Age, Herak- WWW-2012), Lyon, France, 2012. lion, Greece, 2011. [90] Antony J. Williams, Lee Harland, Paul Groth, Stephen Pettifer, [96] Fouad Zablith, Miriam Fernandez, and Matthew Rowe. The Christine Chichester, Egon L. Willighagen, Chris T. Evelo, OU Linked Open Data: Production and Consumption. In Pro- Niklas Blomberg, , Carole Goble, and Barend ceedings of Linked Learning Workshop, Extended Semantic Mons. Open PHACTS: semantic interoperability for drug dis- Web Conference 2011. Heraklion, Crete, 2011. covery. Drug Discovery Today, in press. [97] Jun Zhao, Graham Klyne, Piotr Holubowicz, Raul Palma, Stian [91] World Wide Web Consortium. SPARQL Query Language for Soiland-Reyes, Kristina Hettne, Jose Enrique Ruiz, Marco RDF. Avaliable online from http://www.w3.org/TR/ Roos, Kevin Page, Jose Manuel Gomez-Perez, David De rdf-sparql-query/; last accessed 2012-08-02, 2008. Roure, and Carole Goble. RO-Manager: A Tool for Creating [92] Reynold S. Xin, Oktie Hassanzadeh, Christian Fritz, Shirin and Manipulating Research Objects to Support Reproducibility Sohrabi, and Renée J. Miller. Publishing Bibliographic Data and Reuse in Sciences. In Tomi Kauppinen, Line C. Pouchard, on the Semantic Web using BibBase. Semantic Web Journal, and Carsten Keßler, editors, Proceedings of the 2nd Interna- this issue. tional Workshop on Linked Science 2012 – Tackling Big Data [93] Varun Ratnakar Yolanda Gil and Paul Hanson. Organic Data (LISC2012), in conjunction with 11th International Semantic Publishing: A Novel Approach to Scientific Data Sharing. Web Conference (ISWC2012), Boston, MA, 2012.