Project Review - CORE (COnnecting REpositories)


CORE was funded by JISC to improve access to collections that support research and education. This document is part of a series that describes the lessons from 8 JISC projects funded under the Discovery programme in 2011 to explore open metadata for libraries, museums and archives.

More information about the projects can be found at: http://www.jisc.ac.uk/whatwedo/programmes/inf11/infrastructureforresourcediscovery.aspx
The other documents in the series can be found at: http://discovery.ac.uk

CORE (COnnecting REpositories)
http://core-project.kmi.open.ac.uk

“The COnnecting REpositories (CORE) project aims to facilitate the access and navigation across relevant scientific papers stored in Open Access repositories. The project will make a new open metadata repository available in the Linked Data format describing the semantic relatedness between resources stored across a selection of UK repositories, including the Open University Open Research Online (ORO).”

Background

Institution: The Open University
Responsible group: Knowledge Media Institute (KMi)
Capacity: A research project in the Knowledge Media Institute, but outputs embedded into the OU’s Open Research Online service (ORO).
Data Scope: Metadata about Open Access scholarly articles, as stored in UK institutional repositories. Innovatively, CORE seeks to provide machine-generated metadata capturing assertions about the similarity of one article to another.
Data Scale: More than 3 million RDF triples, describing over 50,000 Open Access articles stored in 61 of the UK’s 168 institutional repositories.

Mechanics

Formats

CORE relies upon use of the Open Archives Initiative Protocol for Metadata Harvesting [1] (OAI-PMH) to harvest metadata from open access repositories around the UK.
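A harvest of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the CORE team's harvester: the repository base URL, the sample response and the rule that a dc:identifier ending in ".pdf" is a full-text link are all hypothetical. Only the ListRecords verb, the oai_dc metadata prefix, the resumptionToken element and the XML namespaces are taken from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # A follow-up request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (records, next_token) from a ListRecords response body."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        identifiers = [
            e.text.strip() for e in rec.iterfind(".//dc:identifier", NS) if e.text
        ]
        # Assumption for this sketch: an identifier ending in ".pdf" is a
        # candidate full-text link. Real repositories vary widely.
        pdf_links = [i for i in identifiers if i.lower().endswith(".pdf")]
        records.append({"title": title, "pdf_links": pdf_links})
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token.strip() or None)


# Hypothetical response body, standing in for a real repository's reply.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
          <dc:identifier>https://repository.example.org/1/paper.pdf</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:repository.example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A record without full text</dc:title>
          <dc:identifier>https://repository.example.org/2</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

records, next_token = parse_records(SAMPLE_RESPONSE)
for record in records:
    print(record["title"], record["pdf_links"])
if next_token:
    print(list_records_url("https://repository.example.org/oai", next_token))
```

The sample is parsed offline so that the sketch runs without network access. Against a live endpoint, the body fetched from list_records_url(...) would be passed to parse_records, repeating until no resumption token is returned.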
The team attempted to follow links in the metadata to PDF documents containing the full text of the paper, which proved difficult or impossible in the majority of cases. Although OAI-PMH is reasonably easy to work with, the project found that data quality varied quite significantly. For example, although 24% of metadata records contained a link to something described as the ‘full text’ of a paper, only half of these actually contained useful text (some were simply PDFs that, when opened, stated “No full text available”). Of those, only half (so 6% of the total set) contained text suitable for the project to analyse and compute relatedness to other documents.

Enhancement

CORE developed an RDF-based schema [2] that is used to expose the statements of relatedness upon which the project depends. This RDF is freely available for reuse, and the schema reuses concepts from existing ontologies such as MuSIM [3], BIBO [4] and FOAF [5] rather than inventing the whole structure from scratch.

______________
[1] http://www.openarchives.org/OAI/openarchivesprotocol.html
[2] http://core-project.kmi.open.ac.uk/node/13
[3] http://kakapo.dcs.qmul.ac.uk/ontology/musim/0.2/musim.html
[4] http://bibliontology.com
[5] http://www.foaf-project.org

Improving access to collections and enabling new services for UK education and research

Usability

The CORE data is available via a basic search interface [6] and a SPARQL endpoint [7]. It has also been embedded [8] within the Open University’s institutional repository, Open Research Online, and made available via a pilot application for Android devices [9].

Impact

Licensing

CORE data is made available for reuse under a Creative Commons Attribution Licence.
It is worth noting that the project team did not directly address any issues with respect to licences for the data they harvested from third parties, “assuming” that “this [harvesting and processing] behaviour is in the spirit of open access.”

Benefits – How will end users benefit?

CORE was able to demonstrate that some meaningful relationships could be computed between items in institutional repositories, and that these relationships could be displayed to the end user of a repository in ways that might facilitate further exploration on their part. Some limitations in the metadata provided by repositories, and the project’s focus solely upon open access material, limit the broader utility of this as an end-user-facing service, but there is clearly scope to explore opportunities for both data enrichment and a broadening of CORE’s scope.

Outcome

CORE is embedded within the Open University’s institutional repository, and the team have submitted a bid for further funding with which to extend their work.

Lessons Learned

• It is possible to compute a measure of similarity between documents in institutional repositories, and to report this similarity to end users.
• Data quality within institutional repositories is a concern if robust services are to be built that rely upon them.

See Also

• OpenDOAR - http://www.opendoar.org. A directory of open access repositories

______________
[6] http://core.kmi.open.ac.uk/search
[7] http://core.kmi.open.ac.uk/squery
[8] http://core-project.kmi.open.ac.uk/node/22
[9] http://core-project.kmi.open.ac.uk/node/12
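As a worked check of the data-quality figures reported under Formats (24% of metadata records link to a ‘full text’, half of those contain useful text, and half of those again are suitable for analysis), the sketch below multiplies the stages through. It uses only the percentages stated in the review; the stage labels are illustrative wording, not terms used by the project.

```python
from fractions import Fraction

# Retention rate at each stage, as stated in the review.
stages = [
    ("records linking to a 'full text'", Fraction(24, 100)),
    ("links that contain useful text", Fraction(1, 2)),
    ("useful texts suitable for analysis", Fraction(1, 2)),
]


def cumulative_shares(stages):
    """Return each stage's share of the total record set."""
    share = Fraction(1)
    result = []
    for label, rate in stages:
        share *= rate  # apply this stage's retention rate
        result.append((label, share))
    return result


shares = cumulative_shares(stages)
for label, share in shares:
    print(f"{label}: {float(share):.0%} of all harvested records")
```

Exact fractions are used so that the final share compares equal to 6/100 without floating-point rounding.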