ARL White Paper on Wikidata: Opportunities and Recommendations 2
Recommended publications
-
International Benchmark of Good Practices on New Business Models and Initiatives from and for Cultural Institutions
International Benchmark of Good Practices on new Business Models and initiatives from and for cultural institutions. June 2020. The European Commission support for the production of this document does not constitute an endorsement of the contents which reflects the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein. Index: Introduction; European Digital Treasures project; Objectives of the document; Methodology; Category 1: Management -
Data Quality: Letting Data Speak for Itself Within the Enterprise Data Strategy
Data Quality: Letting Data Speak for Itself within the Enterprise Data Strategy. Collaboration & Transformation (C&T) Shared Interest Group (SIG), Financial Management Committee, DATA Act – Transparency in Federal Financials Project. Date Released: September 2015. SYNOPSIS: Data quality and information management across the enterprise is challenging. Today’s information systems consist of multiple, interdependent platforms that are interfaced across the enterprise. Each of these systems and the business processes they support often use and define the same piece of data differently. How the information is used across the enterprise becomes a critical part of the equation when managing data quality. Letting the data speak for itself is the process of analyzing how the data and information is used across the enterprise, providing details on how the data is defined, what systems and processes are responsible for authoring and maintaining the data, and how the information is used to support the agency, and what needs to be done to support data quality and governance. This information also plays a critical role in the agency’s capacity to meet the requirements of the DATA Act. American Council for Technology-Industry Advisory Council (ACT-IAC), 3040 Williams Drive, Suite 500, Fairfax, VA 22031, www.actiac.org ● (p) (703) 208.4800 ● (f) (703) 208.4805. Advancing Government through Collaboration, Education and Action. American Council for Technology-Industry Advisory Council (ACT-IAC): The American Council for Technology (ACT) is a non-profit educational organization established in 1979 to improve government through the efficient and innovative application of information technology. -
What Do Wikidata and Wikipedia Have in Common? an Analysis of Their Use of External References
What do Wikidata and Wikipedia Have in Common? An Analysis of their Use of External References. Alessandro Piscopo, Pavlos Vougiouklis, Lucie-Aimée Kaffee, Christopher Phethean, Jonathon Hare, Elena Simperl (all University of Southampton, United Kingdom). ABSTRACT: Wikidata is a community-driven knowledge graph, strongly linked to Wikipedia. However, the connection between the two projects has been sporadically explored. We investigated the relationship between the two projects in terms of the information they contain by looking at their external references. Our findings show that while only a small number of sources is directly reused across Wikidata and Wikipedia, references often point to the same domain. Furthermore, Wikidata appears to use less Anglo-American-centred sources. These results ... one hundred thousand. These users have gathered facts about around 24 million entities and are able, at least theoretically, to further expand the coverage of the knowledge graph and continuously keep it updated and correct. This is an advantage compared to a project like DBpedia, where data is periodically extracted from Wikipedia and must first be modified on the online encyclopedia in order to be corrected. Another strength is that all the data in Wikidata can be openly reused and shared without requiring any attribution, as it is released under a CC0 licence. -
Jose's Geweldige Presentatie
2012-06-11 Kulturrad Fagtag, Oslo. Imagine a world in which every person has free access to all knowledge. That is what we are working on. Collaboration: why • Mission • Impact • Reach • Community • Non-pro • 1 + 1 = Community • Not a crowd • Structures • Rules of the game • Opportunity Image from: Tropenmuseum, CCBYSA; CCNL/Hay Kranen, CCBY; Susana Morais, PD; Mrjohncummings, CCBYSA; Donaldytong, CCBYSA; Mike Peel, CCBYSA. Content donations • Collection available through Wikimedia Commons • Reach • Context • Multiple languages • Digital restoration ... Image from: Tropenmuseum, CCBYSA Effective donations Higher reach. Nationaal Archief: 2000+ views vs 4 views per photo Copyright... Metadata Community Focus? Wiki Loves Art • Photo contest • 2009 Netherlands • 45 museums • 5400+ images • Van Gogh, Boijmans van Beuningen Image from: CCNL/Hay Kranen, CCBY Wiki Loves … 1 month Community Easy upload Good overview Wiki Loves Monuments 2010: Netherlands 2011: Europe (18 countries) 2012: World (25+ countries) 1 month Lists of monuments Collaborations Local 2011 2012 Photos: Goodness Shamrock, Steven van der Wal Wiki Loves Monuments results Freely licensed photos: 165,000+ Familiarity: 4,000+ new uploaders, 7,500,000+ hits on project websites 18 countries 14 local events Wikipedia is editable: 85% expects to edit more Collaboration: 2000+ emails... Leiden Photo contests: lessons Easy Fun Nearby Wikipedia Results Edit-a-thon Themed meeting of Wikipedians to edit articles Something extra British Library, Teylers Museum Lock them in a -
DM2E: Digitised Manuscripts to Europeana Project ID Card
European Commission - Information Society projects. DM2E: Digitised Manuscripts to Europeana. Project ID card. Funded under: The Information and Communication Technologies Policy Support Programme. Area: CIP-ICT-PSP.2011.2.1 - Aggregating content in Europeana. Total cost: €2.62m. EU contribution: €2.10m. Project reference: 297274. The project aims to technically enable as many content providers as possible to integrate their content into Europeana. Since different providers make their data available in different formats, a tool has to be developed that converts metadata from a diverse range of source formats into the EDM (Europeana Data Model). During the project, this will exemplarily be done for the autograph database Kalliope (Staatsbibliothek zu Berlin), the German text archive Deutsches Textarchiv (Berlin-Brandenburgische Akademie der Wissenschaften), as well as content of the Austrian national library (Österreichische Nationalbibliothek) plus a number of other sources. Metadata providers are strongly encouraged to make their data available under CC0, as all other licenses hamper further use by Europeana and scholars. Finally, Europeana is going to provide the aggregated data in a form which allows Digital Humanities scholars to make best use of them. In addition, basic services / functionalities (modeled on the humanities' “scholarly primitives”) will be provided by Europeana. In this perspective, the proposed project is in line with theme 2 of the call (Digital Content) and targets its objective 2.1 (Aggregating content in Europeana). -
SUGI 28: the Value of ETL and Data Quality
SUGI 28 Data Warehousing and Enterprise Solutions Paper 161-28 The Value of ETL and Data Quality Tho Nguyen, SAS Institute Inc. Cary, North Carolina ABSTRACT Information collection is increasing more than tenfold each year, with the Internet a major driver in this trend. As more and more The adage “garbage in, garbage out” becomes an unfortunate data is collected, the reality of a multi-channel world that includes reality when data quality is not addressed. This is the information e-business, direct sales, call centers, and existing systems sets age and we base our decisions on insights gained from data. If in. Bad data (i.e. inconsistent, incomplete, duplicate, or inaccurate data is entered without subsequent data quality redundant data) is affecting companies at an alarming rate and checks, only inaccurate information will prevail. Bad data can the dilemma is to ensure that how to optimize the use of affect businesses in varying degrees, ranging from simple corporate data within every application, system and database misrepresentation of information to multimillion-dollar mistakes. throughout the enterprise. In fact, numerous research and studies have concluded that data quality is the culprit cause of many failed data warehousing or Customer Relationship Management (CRM) projects. With the Take into consideration the director of data warehousing at a large price tags on these high-profile initiatives and the major electronic component manufacturer who realized there was importance of accurate information to business intelligence, a problem linking information between an inventory database and improving data quality has become a top management priority. a customer order database. -
Wikipedia Knowledge Graph with Deepdive
The Workshops of the Tenth International AAAI Conference on Web and Social Media. Wiki: Technical Report WS-16-17. Wikipedia Knowledge Graph with DeepDive. Thomas Palomares, Youssef Ahres, Juhana Kangaspunta, Christopher Ré. Abstract: Despite the tremendous amount of information on Wikipedia, only a very small amount is structured. Most of the information is embedded in unstructured text and extracting it is a non trivial challenge. In this paper, we propose a full pipeline built on top of DeepDive to successfully extract meaningful relations from the Wikipedia text corpus. We evaluated the system by extracting company-founders and family relations from the text. As a result, we extracted more than 140,000 distinct relations with an average precision above 90%. Introduction: With the perpetual growth of web usage, the amount of unstructured data grows exponentially. Extracting facts and assertions to store them in a struc- ... This paper is organized as follows: first, we review the related work and give a general overview of DeepDive. Second, starting from the data preprocessing, we detail the general methodology used. Then, we detail two applications that follow this pipeline along with their specific challenges and solutions. Finally, we report the results of these applications and discuss the next steps to continue populating Wikidata and improve the current system to extract more relations with a high precision. Background & Related Work: Until recently, populating the large knowledge bases relied on direct contributions from human volunteers as well as integration of existing repositories such as Wikipedia info boxes. These methods are limited by the available structured data and by human power. -
Modeling Popularity and Reliability of Sources in Multilingual Wikipedia
information. Article. Modeling Popularity and Reliability of Sources in Multilingual Wikipedia. Włodzimierz Lewoniewski (corresponding author), Krzysztof Węcel and Witold Abramowicz. Department of Information Systems, Poznań University of Economics and Business, 61-875 Poznań, Poland. Received: 31 March 2020; Accepted: 7 May 2020; Published: 13 May 2020. Abstract: One of the most important factors impacting quality of content in Wikipedia is presence of reliable sources. By following references, readers can verify facts or find more details about described topic. A Wikipedia article can be edited independently in any of over 300 languages, even by anonymous users, therefore information about the same topic may be inconsistent. This also applies to use of references in different language versions of a particular article, so the same statement can have different sources. In this paper we analyzed over 40 million articles from the 55 most developed language versions of Wikipedia to extract information about over 200 million references and find the most popular and reliable sources. We presented 10 models for the assessment of the popularity and reliability of the sources based on analysis of meta information about the references in Wikipedia articles, page views and authors of the articles. Using DBpedia and Wikidata we automatically identified the alignment of the sources to a specific domain. Additionally, we analyzed the changes of popularity and reliability in time and identified growth leaders in each of the considered months. The results can be used for quality improvements of the content in different languages versions of Wikipedia. -
Omnipedia: Bridging the Wikipedia Language
Omnipedia: Bridging the Wikipedia Language Gap. Patti Bao*†, Brent Hecht†, Samuel Carton†, Mahmood Quaderi†, Michael Horn†§, Darren Gergle*†. *Communication Studies, †Electrical Engineering & Computer Science, §Learning Sciences, Northwestern University. {patti,brent,sam.carton,quaderi}@u.northwestern.edu, {michael-horn,dgergle}@northwestern.edu. ABSTRACT: We present Omnipedia, a system that allows Wikipedia readers to gain insight from up to 25 language editions of Wikipedia simultaneously. Omnipedia highlights the similarities and differences that exist among Wikipedia language editions, and makes salient information that is unique to each language as well as that which is shared more widely. We detail solutions to numerous front-end and algorithmic challenges inherent to providing users with a multilingual Wikipedia experience. These include visualizing content in a language-neutral way and aligning data in the face of diverse information organization strategies. We present a study of Omnipedia that characterizes how people interact with information using a multilingual lens. ... language edition contains its own cultural viewpoints on a large number of topics [7, 14, 15, 27]. On the other hand, the language barrier serves to silo knowledge [2, 4, 33], slowing the transfer of less culturally imbued information between language editions and preventing Wikipedia’s 422 million monthly visitors [12] from accessing most of the information on the site. In this paper, we present Omnipedia, a system that attempts to remedy this situation at a large scale. It reduces the silo effect by providing users with structured access in their native language to over 7.5 million concepts from up to 25 language editions of Wikipedia. At the same time, it highlights similarities and differences between each of the -
10Annual Report
ANNUAL REPORT 2018: A decade of democratising culture. COLOPHON: The Europeana Foundation is an independent foundation, registered under Dutch Law and has its offices in the Koninklijke Bibliotheek in The Hague, The Netherlands. Our mission: ‘We transform the world with culture’. We build on Europe’s rich cultural heritage and make it easier for people to use for work, learning or pleasure. Our work contributes to an open, knowledgeable and creative society. The Europeana Foundation is the leader of a consortium that is the operator of the Europeana Digital Service Infrastructure (DSI), funded by the Connecting Europe Facility of the European Union. In addition, the Europeana ... Europeana Foundation, Prins Willem-Alexanderhof 5, 2595 BE Den Haag, Netherlands. www.europeana.eu, pro.europeana.eu. TABLE OF CONTENTS: A decade of democratising culture; Table of contents; Foreword; 1. Participants share 600 personal stories with Europeana Migration; 2. Public transcribes 2,800 WWI handwritten documents; 3. New aggregation workflow transforms data ingestion process; 4. Europeana Network Association launches six specialist communities; 5. EuropeanaTech inspires innovators to forge the future of digital; 6. Europeana Collections updates make content more discoverable; 7. Transformative cultural heritage shapes Impact Playbook; 8. Scholars harness cultural heritage in WWI research projects; 9. 2,000 teachers upskill with ‘Europeana in your classroom’ MOOC; 10. Creatives get playful with digital culture; Key Performance Indicators (KPIs) -
How to Contribute Climate Change Information to Wikipedia : a Guide
HOW TO CONTRIBUTE CLIMATE CHANGE INFORMATION TO WIKIPEDIA. Emma Baker, Lisa McNamara, Beth Mackay, Katharine Vincent. © 2021, CDKN. This work is licensed under the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/legalcode), which permits unrestricted use, distribution, and reproduction, provided the original work is properly credited. IDRC Grant: 108754-001-CDKN knowledge accelerator for climate compatible development. How to contribute climate change information to Wikipedia: A guide for researchers, practitioners and communicators. Contents: About this guide; 1 Why Wikipedia is an important tool to communicate climate change information; 1.1 Enhancing the quality of online climate change information; 1.2 Sharing your work more widely; 1.3 Why researchers should -
The Future of Wikimedia and Why New Zealand Museums Should Pay Attention
The Future of Wikimedia and Why New Zealand Museums Should Pay Attention. Susan Tolich. On the 21st of May it was announced that Mike Dickison will be assuming the position as Aotearoa’s first Wikipedia-at-large. This new role will entail several placements at GLAM institutions around the country where Dickison will act as a ‘Wikipedian in Residence.’ This position does not involve editing Wikipedia on behalf of the organisations but focuses on training staff in how to contribute and engage with all parts of Wikimedia and its editing community. Wikipedia is just one of the projects run by the non-profit Wikimedia Foundation; others include Wikimedia Commons and Wikidata. Throughout his career Dickison has had years of experience advocating for Wikipedia to be used in the GLAM sector and has hosted various events to improve the representation of New Zealand endemic species and female scientists on the site.1 While Dickison is the first Wikipedian-at-large in New Zealand he is part of a much larger global movement which works towards creating a freely accessed ‘sum of all knowledge.’ Wikimedians have partnered with GLAM institutions around the world since 2010 with the mission of ‘connecting audiences to open knowledge, ideas and creativity on a global scale.’2 Other Wikipedian-in-residence projects have ranged from creating documentary photography of Carpathian folk lore, to upskilling librarians in the Ivory Coast to be able to promote their heritage using Wikimedia platforms. It was efforts such as these that also delivered The Metropolitan Museum 1 Mike Dickison, “New Zealand Wikimedian at large,” Giant Flightless Bird (Blog), 21 May 2018, http://www.giantflightlessbirds.com/2018/05/new-zealand-wikipedian-at-large/ 2 Katherine Maher and Loic Tallon, “Wikimedia and the Met: a shared digital vision,” Medium, 20 April 2018, https://medium.com/freely-sharing-the-sum-of-all-knowledge/wikimedia-and-the-met-a-shared-digital-vision-f91b59eab2e9.