Introduction to Wikidata for Librarians Structuring Wikipedia and Beyond

Please complete a short survey: http://bit.ly/oclc18wikidatasurvey
Slides: http://bit.ly/oclc18wikidata
One page guide: http://bit.ly/wikidata-onepage

Wikimedia District of Columbia @wikimediadc
[email protected] @fuzheado
[email protected] @wikigamaliel
OCLC, June 12, 2018 | Wikimedia DC | #Wikidata
Licensed via CC-BY-SA 4.0

About
● Andrew Lih: Author of The Wikipedia Revolution; digital sharing strategist; journalism professor
● Robert Fernandez: Assistant Professor, Resources Development/eLearning Librarian, Prince George's Community College; Wikimedia DC board member

GLAM: Galleries, Libraries, Archives and Museums - Cultural Partners
Partners: Wikimedia DC, Library of Congress, NARA, Smithsonian
● Local chapter for Wikipedia / Wikimedia community
● Edit-a-thons
● Full-time Wikipedian in Residence
● Article improvement drives
● Wikidata and modeling
● Associations for the Centers for Study of Congress
● Wikipedia Space exhibit
● Linked Open Data
● Wikiconference hosting

Wikidata: The evolution of Wikipedia into the ultimate, free linked open database

Wikidata in one page
http://bit.ly/wikidata-onepage

2017 was a turning point for Wikidata. Why?
● Google Knowledge Graph
● Digital assistants: Siri, Alexa, etc.
● Infoboxes on Wikipedia
● Structured data on Commons
● Wikicite, WikidataCon

A hub for the future of Wikimedia content

The mission
(Image: CC BY 3.0, Cavefrog)

Wikidata items have identifiers - VIAF links back

Dan Scott, Laurentian University:
"rather than focusing on directly enhancing our own local data repository silos (for example, library catalogues, digital exhibits), libraries and archives should invest their limited resources in enriching Wikidata, a centralized data repository, to maximize the visibility of those entities and the reusability of that data in the world at large… and then pull that data back into our local repositories to enrich our displays and integration with the broader world of data."
https://coffeecode.net/wikidata-canada-150-and-music-festival-data.html

Overview
● Why Wikidata?
● Design of Wikidata
● Features, RDF, triples
● Queries and tools
● Case studies
● Calls to action

Wikipedia
● More than 5 million English articles
● Top 10 most visited site today
● Reputation and cultural partnerships

Wikipedia challenges
● Knowledge scattered among 30 million articles in 200+ languages
● Inconsistency, gaps and replication
● How to consolidate knowable facts?
Lesson: Images and multimedia
● 2001: Images scattered across Wikipedia editions
● 2004: Wikimedia Commons centralized and consolidated multimedia
● commons.wikimedia.org

Wikidata as the future
● Convert encyclopedic lexical content into "structured" statements
● Turn human readable into machine understandable
● Link to stable external data of LAM institutions
● "Semantic web" realized

● Facts and figures from articles, infoboxes are only in human-readable prose
● Navigation boxes at bottom of Wikipedia articles done by hand

Wikidata
● Launched 2012
● Power of searching, sorting and querying capabilities
● Interconnected mesh
● Sum of all human knowledge

Fundamentals: Statements
Factual claims are stored as statements in Wikidata:
● subject - predicate - object, or
● item - property - value, or
● thing - relationship - thing

Wikidata item for United States Congress (Q11268)
"Triple"

Wikidata Basics: item - property - value

Q numbers - item
● Anyone can make a Q item
● Corresponds to Wikipedia article / concept
● Examples
  ○ Q1 - the Universe
  ○ Q2 - Earth
  ○ Q5 - human
  ○ Q146 - cat
  ○ Q729 - animal
  ○ Q571 - book
  ○ Q7075 - library
  ○ Q190593 - OCLC
  ○ Q877140 - cardigan

P numbers - property
● Controlled vocabulary for consistency
● Proposal, discussion and approval process
● Examples
  ○ P31 - instance of
  ○ P279 - subclass of
  ○ P214 - VIAF ID
  ○ P569 - date of birth
  ○ P625 - coordinate location
● See: Wikidata:List_of_properties

Wikidata item page
● Claims capture factual, provable information
● Any number of statements can be associated with an item

Wikidata statement triples

Item                        Property                  Value
"George Washington" Q23     "instance of" P31         "human" Q5
"George Washington" Q23     "place of burial" P119    "Mount Vernon" Q731635
"George Washington" Q23     "LCAuth ID" P244          "n86140996"

Underneath the surface... Using symbols makes them language independent (identifiers vs names)

Wikidata link on Wikipedia articles

Wikidata stores statements as explicit triples - item + property + value

Item                             Property             Value
United States Congress Q11268    "instance of" P31    "bicameral legislature" Q189445

● Claims capture factual, provable information
● Using symbols makes them language independent (identifiers vs names)
● Relationships are "first class" = very fast to search and sort
● Seconds vs minutes to search
● Ad hoc data model highly adaptive
● Well-suited to the wiki way

Traditional databases

Artist           Date of birth        Country          Medium
Henri Matisse    December 31, 1869    France           Painting
Claude Monet     November 14, 1840    France           Painting
Edward Hopper    July 22, 1882        United States    Painting

Work                   Creator          Date    Location
Les Bêtes De La Mer    Henri Matisse    1950    NGA
Cape Cod Morning       Edward Hopper    1950    SAAM
Nighthawks             Edward Hopper    1942    Art Inst Chicago

● Schemas well-defined and controlled
● Relational databases and SQL: Columns need lots of planning and forethought
● Changes can be complex, with many cascading effects
● Searches involving relationships can be slow or expensive (join operations)

Wikidata and RDF databases
● Relationships are explicit and precise
● Database can take any shape and grow according to need
● Also known as "graph databases"
(Graph figure. Nodes: Edward Hopper, United States, July 22, 1882, Nighthawks, Cape Cod Morning, painting, creative work. Edge labels: citizen of, date of birth, creator, instance of, subclass of.)

Summary

UPSIDES
● RDF triples make for a very flexible and fast system
● Suitable for the BEBOLD wiki culture
● Multiple parallel ontologies can co-exist

DOWNSIDES
● Schema-on-the-fly system can make modeling inconsistent and difficult
● Hard for newcomers to understand
● Multiple parallel ontologies can co-exist

Wikidata items
Using identifiers removes language dependence and ambiguity in:
● Writing systems (Chinese, Serbian, Kazakh, et al)
● Phonetization variations
● Spelling variations
● Maiden vs. married names
Canonical identifiers help link to external databases
Item: Muammar Gaddafi Q19878
(Name cloud figure: 53 Latinized variations! (May 2017), including Gadhafi, Kadhafi, Khadafi, Kaddafi, Jaddafi, Qaddafy, Colonel Gaddafi, El Kazzafi, Al-Qathafi, Mu‘ammar al-Qaḏḏāfī.)

Speed, consistency, automation
● Wikidata has more than 48 million items
● Simple searches take less than a second
● Complex queries supported by open standards like SPARQL

Search example - Find all bicameral legislatures
http://query.wikidata.org

Item    Property             Value
?       "instance of" P31    "bicameral legislature" Q189445

Wikidata Search - Result from Query
26 million items in 1/3 of a second

Wikidata items have identifiers - VIAF links back

Wikidata items have identifiers - links to external databases
Barack Obama (Q76) has 83 identifiers!
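
The item - property - value model and the "? - P31 - Q189445" search example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Wikidata itself is implemented: the names TRIPLES, match, sparql_instances_of and query_url are hypothetical helpers invented for this sketch, and the endpoint path https://query.wikidata.org/sparql is an assumption beyond the http://query.wikidata.org address given in the slides. The wd: and wdt: prefixes are the ones the Wikidata query service predefines.

```python
"""Sketch: Wikidata-style triples and the '? - P31 - Q189445' search."""
from urllib.parse import urlencode

# Statements taken from the slides, stored as (item, property, value) triples.
TRIPLES = [
    ("Q23", "P31", "Q5"),          # George Washington - instance of - human
    ("Q23", "P119", "Q731635"),    # George Washington - place of burial - Mount Vernon
    ("Q23", "P244", "n86140996"),  # George Washington - LCAuth ID - n86140996
    ("Q11268", "P31", "Q189445"),  # US Congress - instance of - bicameral legislature
]


def match(triples, item=None, prop=None, value=None):
    """Return the triples that fit a pattern; None plays the role of '?'."""
    return [
        t for t in triples
        if (item is None or t[0] == item)
        and (prop is None or t[1] == prop)
        and (value is None or t[2] == value)
    ]


def sparql_instances_of(class_qid):
    """Build SPARQL text for the pattern '? - instance of (P31) - class_qid'."""
    return f"SELECT ?item WHERE {{ ?item wdt:P31 wd:{class_qid} . }}"


def query_url(sparql, endpoint="https://query.wikidata.org/sparql"):
    """Build a GET URL for the query (endpoint path is an assumption)."""
    return endpoint + "?" + urlencode({"query": sparql, "format": "json"})


if __name__ == "__main__":
    # Local pattern match over the toy triples.
    print(match(TRIPLES, prop="P31", value="Q189445"))
    # The same question phrased for the live query service.
    sparql = sparql_instances_of("Q189445")
    print(sparql)
    print(query_url(sparql))
```

The local match only sees the four toy statements, while the generated query text can be pasted into the query service to ask the same question of the full dataset. Using identifiers rather than labels in both places is what keeps the pattern language independent.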
Some prominent identifiers - links to external databases
● WorldCat
● VIAF
● LC Name Authority File
● ISNI
● GND (Integrated Authority File)
● SUDOC (French universities)
● BNF (Bibliotheque France)
● MusicBrainz
● NARA
● Bio Directory of Congress
● Quora topic ID
● C-SPAN person ID
● Freebase
● NDLAuth ID (National Diet Library of Japan)
● SELIBR (National Library of Sweden Libris)
● NLA (Australia) ID
● NKCR Czech National Authority Database (National Library of Czech Republic)
● RSL ID (person) Russian State Library
● IMDB
● Dutch National Thesaurus for Author names
● Declarator.org - Russian non-governmental database with information on the income of government officials
● NUKAT - Center of Warsaw University Library catalog
● CiNii (Scholarly and Academic Information Navigator) Japan
● NNDB people ID - Notable Names Database
● Politifact
● Encyclopedia Britannica ID
● CONOR ID (Slovenia)
● NYT topic ID
● Guardian topic ID
● Parlement & Politiek ID (Dutch politics site)
● Social Networks and Archival Context ID (SNAC): California Digital Library; University of Virginia; University of California, Berkeley
● OCLC related properties

Consistency and automation
Constraint reports/violations provide warnings on logic and bounds

Common Wikidata editing tasks - Language links
Wikidata provides central hub where all inter-wiki links are found

Properties: Identifiers
● Indexes into other databases
● Authority control
● Accession numbers
● Catalog identifiers
● Stable URLs to other sites

Alternative contribution methods
● Instead of individual item pages...
● Task lists, games and other interfaces contribute to Wikidata

Wikidata Game allows for "one-click" contributions based on task lists

Notable