Digital Inclusion: The Vital Role of Local Content


innovations
TECHNOLOGY | GOVERNANCE | GLOBALIZATION
A quarterly journal published by MIT Press
ENTREPRENEURIAL SOLUTIONS TO GLOBAL CHALLENGES

Special Issue
Digital Inclusion: The Vital Role of Local Content

Editors: Philip Auerswald, Iqbal Quadir
Managing Editor: Michael Youngblood
Guest Editors: Audrey Hyland, Nicholas Sullivan
Senior Associate Editor: Robin Miller
Associate Editors: Dody Riggs, Helen Snively
Strategic Advisor: Erin Krampetz
Advisory Board: Susan Davis, Bill Drayton, David Kellogg, Eric Lemelson, Granger Morgan, Jacqueline Novogratz, James Turner, Xue Lan
Editorial Board: David Audretsch, Matthew Bunn, Maryann Feldman, Richard Florida, Peter Mandaville, Julia Novy-Hildesley, Francisco Veloso, Yang Xuedong

Innovations: Technology | Governance | Globalization is co-hosted by the School of Public Policy, George Mason University (Fairfax, VA, USA); the Belfer Center for Science and International Affairs, Kennedy School of Government, Harvard University (Cambridge, MA, USA); and the Legatum Center for Development and Entrepreneurship, Massachusetts Institute of Technology (Cambridge, MA, USA). Support for the journal is provided in part by the Lemelson Foundation and the Ewing Marion Kauffman Foundation.

© 2014 Tagore LLC

Contents

Foreword
  Toward a More Inclusive Digital Economy
    Dr. Rajiv Shah and Priya Jaisinghani

Lead Essays
  Building a Foundation for Digital Inclusion: A Coordinated Local Content Ecosystem
    Christopher Burns and Jonathan Dolan
  Inequitable Distributions in Internet Geographies: The Global South Is Gaining Access, but Lags in Local Content
    Mark Graham
  To the Next Billion: Mobile Network Operators and the Content Distribution Value Chain
    Matthew Guilford

Case Narratives
  Mobile Kunji. A Mobile Guide Toward Better Health: How Mobile Kunji is Improving Birth Outcomes in Bihar, India
    Sara Chamberlain
  Digital Green. A Rural Video-Based Social Network for Farmer Training
    Kerry Harwin and Rikin Gandhi

Analysis
  Local Content, Smartphones, and Digital Inclusion: Will the Next Billion Consumers Also Be Contributors to the Mobile Web?
    Mark Surman, Corina Gardner, and David Ascher
  The Mobile Web: Amplifying, But Not Creating, Changemakers
    Emrys Schoemaker
  The 80-20 Debate: Framework or Fiction? How Much Development Work is Standardized Across Geographies, and How Much is Customized for Local Conditions?
    Lesley-Anne Long, Sara Chamberlain, and Kirsten Gagnaire
  Hyper-Local Content Is Key—Especially Social Media: A Cross-Country Comparison of Mobile Content in Brazil, China, India and Nigeria
    Marco Veremis

Perspectives on Policy
  Democratizing Legal Information Across Africa: An Inside Look at Digitizing Local Content in Africa
    Abigail Steinberg, Peres Were, and Amolo Ng’weno
  Digital Design for Emerging Markets: Beyond Textual and Technical Literacy to Cultural Fluency
    Ravi Chhatpar and Robert Fabricant
  The Internet’s Language Barrier
    Iris Orriss
  Converting Western Internet to Indigenous Internet: Lessons from Wikipedia
    Kul Wadhwa and Howie Fung

About Innovations

Innovations is about entrepreneurial solutions to global challenges. The journal features cases authored by exceptional innovators; commentary and research from leading academics; and essays from globally recognized executives and political leaders. The journal is jointly hosted at George Mason University's School of Public Policy, Harvard's Kennedy School of Government, and MIT's Legatum Center for Development and Entrepreneurship.

Dr. Rajiv Shah and Priya Jaisinghani
Toward a More Inclusive Digital Economy
Foreword to Innovations special issue on Digital Inclusion

The emerging digital economy has unprecedented potential to improve the lives of the very poor around the world. Powered by mobile and broadband technologies and energized by new business models and an abundance of data, the digital economy is transforming everything from teacher evaluation to child survival to election monitoring. By harnessing its full reach in development, we can help answer President Obama’s call to end extreme poverty in two generations.

The macroeconomic forecasts are stunningly positive. Strategy& (formerly Booz & Company) has predicted that the “digitization” of the economy could yield as much as $4.1 trillion in GDP for the world’s poorest people.
This would create 64 million new jobs and help lift 580 million people who currently live on less than US $4 per day.1 We have seen the power of this transformation among poor families across the world. We have visited marginalized and vulnerable communities across Africa, Latin America, and Asia that are now able to connect, learn, vote, and even save money safely through one simple device: the mobile phone.

The opportunity is clear, but it is not a foregone conclusion that mobile and broadband technologies will benefit those who arguably stand to gain the most. Building inclusive digital economies requires the collective action of governments, industry, financiers, and civil society. We need to build the infrastructure, align the policies, and create the tools that will enable the very poor to join the digital revolution.

We have seen the impact of a collective approach in the global partnerships that USAID has built. Through the Better Than Cash Alliance, Visa, MasterCard, USAID, the UN, and the governments of Malawi, the Philippines, Kenya, Afghanistan, and others have come together to promote a global movement toward digital payments. At the same time, USAID has launched the Alliance for Affordable Internet alongside the Department for International Development (DFID), Google, the Omidyar Network, and 60 other partners to help governments worldwide make policy reforms that are bringing down broadband prices for billions of current and future online users. More recently, USAID has teamed up with donors such as UNICEF, the Bill & Melinda Gates Foundation, the World Food Program, and others to examine shared experiences in digital development through a common set of principles and best practices.

© 2014 Dr. Rajiv Shah and Priya Jaisinghani

However, persistent barriers to accessing digital technologies remain.
Women are, on average, 21 percent less likely to own mobile phones or go online than their male counterparts.2 In some African countries, access to broadband can cost upwards of 1,000 times that of most people’s monthly income.3 As a result, fewer than 20 percent of Africans can access the Internet.4 These are but a few stark examples. Until such gaps are closed, digital technology will never reach its potential as a driver of inclusive growth; in fact, it could amplify socioeconomic divisions.

We have long known the value of digital technology. It accelerates financial inclusion, improves transparency, and unlocks new markets. But the digital revolution delivers more than just economic efficiencies. It empowers voices, advances dignity, and, perhaps most important, it builds the capacity of individuals to lift themselves, and future generations, out of poverty.

We know the digital economy will reshape institutions, systems, and requisite skills, but there is no guarantee that this transformation will drive broad-based benefit, as expanding access to information and communication technologies alone will not ensure that the digital economy develops in an inclusive way. In the U.S., for example, where 98 percent of homes have some access to high-speed Internet, the impact of the innovation-driven economy has led to a surprising economic paradox.5 Never before have productivity, generation of wealth, and profits been higher, yet the median income of the American worker has stagnated and unemployment has risen.6 The American Dream remains out of reach for far too many. The explanations for this paradox are, of course, complex. At a minimum, however, it points to the fact that policies, institutions, and individual capabilities have not kept pace with the changes driven by technological advances.

Today, the world’s online knowledge is primarily in English and is largely text based, yet information shared through mobile and broadband technologies must be accessible for people of all language and literacy skills. There are notable efforts to create information services in different languages and across delivery channels, and the world of image and video-rich smartphone environments certainly will change this dynamic, but much more needs to be done. As such, this issue of Innovations on Digital Inclusion is an important exercise