Design Challenges in Low-Resource Cross-Lingual Entity Linking

Total Page:16

File Type:pdf, Size:1020Kb

Design Challenges in Low-Resource Cross-Lingual Entity Linking Design Challenges in Low-resource Cross-lingual Entity Linking Xingyu Fu1,∗ Weijia Shi2∗, Xiaodong Yu1, Zian Zhao3, Dan Roth1 1University of Pennsylvania 2University of California Los Angeles 3Columbia University fxingyuf2, xdyu, [email protected] [email protected], [email protected] Abstract Odia: [ଚିଲିକାେର] ଅରଖକୁଦ ନିକଟେର ଏକ ମୁହାଣ େଖାଲିଥିଲା । Cross-lingual Entity Linking (XEL), the prob- English: Gloss: Flood appeared in Arakhkud near [Chilika Lake]. lem of grounding mentions of entities in a for- XEL: eign language text into an English knowledge [ଚିଲିକାେର] Chilika Lake base such as Wikipedia, has seen a lot of re- https://en.wikipedia.org/wiki/Chilika_Lake search in recent years, with a range of promis- ing techniques. However, current techniques Figure 1: The EXL task: in the given sentence we link do not rise to the challenges introduced by text “Chilika Lake” to its corresponding English Wikipedia in low-resource languages (LRL) and, surpris- ingly, fail to generalize to text not taken from task typically involves two main steps: (1) candi- Wikipedia, on which they are usually trained. This paper provides a thorough analysis of date generation, retrieving a list of candidate KB low-resource XEL techniques, focusing on entries for the mention, and (2) candidate ranking, the key step of identifying candidate English selecting the most likely entry from the candidates. Wikipedia titles that correspond to a given for- While XEL techniques have been studied heavily eign language mention. Our analysis indi- in recent years, many challenges remain in the LRL cates that current methods are limited by their setting. Specifically, existing candidate generation reliance on Wikipedia’s interlanguage links methods perform well on Wikipedia-based dataset and thus suffer when the foreign language’s but fail to generalize beyond Wikipedia, to news Wikipedia is small. We conclude that the LRL setting requires the use of outside-Wikipedia and social media text. Error analysis on existing cross-lingual resources and present a simple LRL XEL systems shows that the key obstacle is yet effective zero-shot XEL system, QuEL, candidate generation. For example, 79.3%-89.1% that utilizes search engines query logs. With of XEL errors in Odia can be attributed to the limi- experiments on 25 languages, QuEL shows an tations of candidate generation. average increase of 25% in gold candidate re- In this paper, we present a thorough analysis of call and of 13% in end-to-end linking accu- the limitations of several leading candidate genera- racy over state-of-the-art baselines.1 tion methods. Although these methods adopt differ- 1 Introduction ent techniques, we find that all of them heavily rely on Wikipedia interlanguage links2 as their cross- Cross-lingual Entity Linking (XEL) aims at ground- lingual resources. However, small SL Wikipedia ing mentions written in a foreign (source) language size limits their performance in the LRL setting. (SL) into entries in a (target) language Knowledge As shown in Figure2, while the core challenge of Base (KB), which we consider here as the English LRL XEL is to link LRL entities (A) to candidates Wikipedia following Pan et al.(2017); Upadhyay in the English Wikipedia (C), interlanguage links et al.(2018a); Zhou et al.(2020). In Figure1, only map a small subset of the LRL entities that for instance, an Odia (an Indo-Aryan language in appear in both LRL Wikipedia (B) and English India) mention (“Chilika Lake”) is linked to the Wikipedia. Therefore, methods that only leverage corresponding English Wikipedia entry. The XEL interlanguage links (B \ C) as the main source of ∗ Both authors contributed equally to this work. supervision cannot cover a wide range of entities. 1Code is available at: http://cogcomp.org/page/ publication_view/911. The LORELEI data will be 2https://en.wikipedia.org/wiki/Help: available through LDC. Interlanguage_links 6418 Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pages 6418–6432, November 16–20, 2020. c 2020 Association for Computational Linguistics For example, the Amharic Wikipedia has 14,854 2 Related Work entries, but only 8,176 of them have interlanguage links to English. 2.1 Background: Wikipedia resources Furthermore, as we show, existing candidate gen- Wikipedia title mappings: The mapping comes eration methods perform well on Wikipedia-based from Wikipedia interlanguage links4 between the datasets but fail to generalize to outside-Wikipedia SL and English, and uses Wikipedia articles titles text such as news or social media. as mappings. It directly links an SL entity to an English Wikipedia entry without ambiguity. Wikipedia anchor text mappings: A clickable A B C text mention in Wikipedia articles is annotated with LRL LRL Wiki English Wiki Entities Entities anchor text linking it to a Wikipedia entry. The Entities following retrieving order: SL anchor text ! SL Wikipedia entry ! English Wikipedia entry (where Figure 2: An illustration of Cross-lingual resources the last step is done via the Wikipedia interlanguage serving current XEL systems. links), allows one to build a bilingual title mapping from a SL mentions to Wikipedia English entries, resulting in a probabilistic mapping with scores Our observations lead to the conclusion that calculated using total counts (Tsai and Roth, 2016). the LRL setting necessitates the use of outside- Wikipedia cross-lingual resources. Specifically, we 2.2 XEL systems for Low-resource languages propose various ways to utilize the abundant query log from online search engines to compensate for We briefly survey key approaches to XEL below. the lack of supervision. Similar to Wikipedia, a Direct Mapping Based Systems, including free online encyclopedia created by Internet users, xlwikifier (Tsai and Roth, 2016) and Query logs (QL) provide a free resource, collabo- xelms (Upadhyay et al., 2018a), focus on building ratively generated by a large number of users, and a SL to English mapping to generate candidates. mildly curated. However, it is orders of magni- For candidate ranking, both xlwikifier and tude larger than Wikipedia.3 In particular, it in- xelms combine supervision from multiple lan- cludes all of Wikipedia cross-lingual resources as guages to learn a ranking model. a subset since a search of an SL mention leads to Word Translation Based Systems including Pan the English Wikipedia entity if the corresponding et al.(2017) and ELISA (Zhang et al., 2018), ex- Wikipedia entries are interlanguage-linked. tract SL – English name translation pairs and apply In this paper, the main part, Sec.3 presents an unsupervised collective inference approach to a thorough method-wise evaluation and analy- link the translated mention. sis of leading candidate generation methods, and Transliteration Based Systems include Tsai and quantifies their limitations as a function of SL Roth(2018) and translit (Upadhyay et al., Wikipedia size and the size of the interlanguage 2018b). translit uses a sequence-to-sequence cross-lingual resources. Based on the limitations, model and bootstrapping to deal with limited data. we analyze QuEL CG, an improved candidate gen- It is useful when the English and SL word pairs eration method utilizing QL in Sec.4, showing have similar pronunciation. that it exceeds the Wikipedia resource limit on Pivoting Based Systems including (Rijh- LRL. To exhibit a system-wise XEL compari- wani et al., 2019; Zhou et al., 2019) and son, in Sec.5, we suggest a simple yet efficient PBEL PLUS (Zhou et al., 2020), remove the zero-shot XEL framework QuEL, that incorpo- reliance on SL resources and use a pivot language rates QuEL CG. QuEL achieves an average of 25% for candidate generation. Specifically, they train increase in gold candidate recall, and 13% in- the XEL model on a selected high-resource crease in end-to-end linking accuracy on outside- language, and apply it to SL mentions through wikipedia text. language conversion. 3https://www.internetlivestats.com/ 4https://en.wikipedia.org/wiki/Help: google-search-statistics/ Interlanguage_links 6419 2.3 Query Logs flow: SL mention ! SL Wikipedia entity ! En- Query logs have long been used in many tasks glish Wikipedia entity.23 name trans such as across-domain generalization (Rud¨ et al., (Name Translation) as introduced in 2011) for NER, and ontological knowledge acquisi- (Pan et al., 2017; Zhang et al., 2018) performs word tion(Alfonseca et al., 2010; Pantel et al., 2012). In alignment on Wikipedia title mappings, to induce the English entity linking task, Shen et al.(2015) a fixed word-to-word translation table between SL pointed out that google query logs can be an ef- and English. For instance, to link the Suomi name ficient way to identify candidates. Dredze et al. “Pekingin tekninen instituutti” (Beijing Institute of (2010); Monahan et al.(2011) use search result as Technology), it translates each word in the mention: one of their methods for candidate generation on (“Pekingin” – Beijing, “tekninen” – Technology, high resource language entity linking task. While “instituutti” – Institute). At test time, after map- earlier work indicates that query logs provide abun- ping each word in the given mention to English, it dant information that can be used in NLP tasks, as links the translation to English Wikipedia using an far as we know, it has never been used in cross- unsupervised collective inference approach. translit lingual tasks as we do here. And, only search in- (Upadhyay et al., 2018b) trains a formation has been used, while we suggest using seq2seq model on Wikipedia title mappings, to Maps information too. generate the English entity directly. pivoting to a related high-resource language 3 Candidate Generation Analysis (HRL) (Zhou et al., 2020) attempts to general- ize to unseen LRL mentions through grapheme or In this section, we analyze four leading candidate phoneme similarity between SL and a related HRL. generation methods: p(e|m), xlwikifier, Specifically, it first finds a related HRL for the SL, name trans, pivoting, and translit(see and learns a XEL model on the related HRL, using Table1 for the systems used by each method) and Wikipedia title mappings and anchor text mappings.
Recommended publications
  • The Effectiveness of Application Permissions
    The Effectiveness of Application Permissions Adrienne Porter Felt,∗ Kate Greenwood, David Wagner University of California, Berkeley apf, kate eli, [email protected] Abstract cations’ permission requirements up-front so that users can grant them during installation. Traditional user-based permission systems assign the Traditional user-based permission systems assign the user’s full privileges to all applications. Modern plat- user’s full privileges to all of the user’s applications. In forms are transitioning to a new model, in which each the application permission model, however, each appli- application has a different set of permissions based on cation can have a customized set of permissions based its requirements. Application permissions offer several on its individual privilege requirements. If most applica- advantages over traditional user-based permissions, but tions can be satisfied with less than the user’s full priv- these benefits rely on the assumption that applications ileges, then three advantages of application permissions generally require less than full privileges. We explore over the traditional user-based model are possible: whether that assumption is realistic, which provides in- sight into the value of application permissions. • User Consent: Security-conscious users may be We perform case studies on two platforms with appli- hesitant to grant access to dangerous permissions cation permissions, the Google Chrome extension sys- without justification. For install-time systems, this tem and the Android OS. We collect the permission re- might alert some users to malware at installation; quirements of a large set of Google Chrome extensions for time-of-use systems, this can prevent an in- and Android applications.
    [Show full text]
  • Appendix 1 - 5 Appendix 1 - Questionnaire for Colleges
    Appendix 1 - 5 Appendix 1 - Questionnaire for Colleges Declaration: I, Rajeev Ghode, persuing my Ph.D. in Department of Communication Studies, Pune University. Title of Ph.D. Research is "To study potential and challenges in the use and adoption of ICT in Higher Education"For this research purpose, I want to collect quantitative data from all the professors of colleges which are affiliated to Pune University. I ensure that all the data collected will be used only for the Ph.D. research and secrecy of the data will be maintained. I appreciate you for spending your valuable time to fill this questionnaire. Thanking You <•;. Prof. Rajeev Ghode Questionnaire Name of the College Address/City 1 Arts Science and Commerce College 1 Arts Science and Commerce College with Computer Science Type of College 1 Commerce and BBA Q B.Ed. College 1 1 Law Institutional ICT Infrastructure Sr. ICT Infrastructure Yes No Provision in near No. future 1. Multimedia /Conference Hall 2. Computer Lab 3. Internet Connectivity in Campus 4. Digital Library 5. Website 6. Organization e-mail Server 7. Blog Appendix - I Sr. ICT Infrastructure Yes No Provision in near No. future 8. Presence on SNS 9. Online Admission System 10. Online/Offline Examination Application 11. Biometric Attendance 12. Student Management System 13. Professional membership for online journals Department-wise ICT Infrastructure Sr. ICT Infrastructure Yes No Provision in near No. future 1. Do you have separate Desktops for faculties in every department? 2. Do you have separate Laptops for every department? 3. Does every department have LCD Projector? 4.
    [Show full text]
  • Apigee X Migration Offering
    Apigee X Migration Offering Overview Today, enterprises on their digital transformation journeys are striving for “Digital Excellence” to meet new digital demands. To achieve this, they are looking to accelerate their journeys to the cloud and revamp their API strategies. Businesses are looking to build APIs that can operate anywhere to provide new and seamless cus- tomer experiences quickly and securely. In February 2021, Google announced the launch of the new version of the cloud API management platform Apigee called Apigee X. It will provide enterprises with a high performing, reliable, and global digital transformation platform that drives success with digital excellence. Apigee X inte- grates deeply with Google Cloud Platform offerings to provide improved performance, scalability, controls and AI powered automation & security that clients need to provide un-parallel customer experiences. Partnerships Fresh Gravity is an official partner of Google Cloud and has deep experience in implementing GCP products like Apigee/Hybrid, Anthos, GKE, Cloud Run, Cloud CDN, Appsheet, BigQuery, Cloud Armor and others. Apigee X Value Proposition Apigee X provides several benefits to clients for them to consider migrating from their existing Apigee Edge platform, whether on-premise or on the cloud, to better manage their APIs. Enhanced customer experience through global reach, better performance, scalability and predictability • Global reach for multi-region setup, distributed caching, scaling, and peak traffic support • Managed autoscaling for runtime instance ingress as well as environments independently based on API traffic • AI-powered automation and ML capabilities help to autonomously identify anomalies, predict traffic for peak seasons, and ensure APIs adhere to compliance requirements.
    [Show full text]
  • EN-Google Hacks.Pdf
    Table of Contents Credits Foreword Preface Chapter 1. Searching Google 1. Setting Preferences 2. Language Tools 3. Anatomy of a Search Result 4. Specialized Vocabularies: Slang and Terminology 5. Getting Around the 10 Word Limit 6. Word Order Matters 7. Repetition Matters 8. Mixing Syntaxes 9. Hacking Google URLs 10. Hacking Google Search Forms 11. Date-Range Searching 12. Understanding and Using Julian Dates 13. Using Full-Word Wildcards 14. inurl: Versus site: 15. Checking Spelling 16. Consulting the Dictionary 17. Consulting the Phonebook 18. Tracking Stocks 19. Google Interface for Translators 20. Searching Article Archives 21. Finding Directories of Information 22. Finding Technical Definitions 23. Finding Weblog Commentary 24. The Google Toolbar 25. The Mozilla Google Toolbar 26. The Quick Search Toolbar 27. GAPIS 28. Googling with Bookmarklets Chapter 2. Google Special Services and Collections 29. Google Directory 30. Google Groups 31. Google Images 32. Google News 33. Google Catalogs 34. Froogle 35. Google Labs Chapter 3. Third-Party Google Services 36. XooMLe: The Google API in Plain Old XML 37. Google by Email 38. Simplifying Google Groups URLs 39. What Does Google Think Of... 40. GooglePeople Chapter 4. Non-API Google Applications 41. Don't Try This at Home 42. Building a Custom Date-Range Search Form 43. Building Google Directory URLs 44. Scraping Google Results 45. Scraping Google AdWords 46. Scraping Google Groups 47. Scraping Google News 48. Scraping Google Catalogs 49. Scraping the Google Phonebook Chapter 5. Introducing the Google Web API 50. Programming the Google Web API with Perl 51. Looping Around the 10-Result Limit 52.
    [Show full text]
  • 2Nd USENIX Conference on Web Application Development (Webapps ’11)
    conference proceedings Proceedings of the 2nd USENIX Conference Application on Web Development 2nd USENIX Conference on Web Application Development (WebApps ’11) Portland, OR, USA Portland, OR, USA June 15–16, 2011 Sponsored by June 15–16, 2011 © 2011 by The USENIX Association All Rights Reserved This volume is published as a collective work. Rights to individual papers remain with the author or the author’s employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. Permission is granted to print, primarily for one person’s exclusive use, a single copy of these Proceedings. USENIX acknowledges all trademarks herein. ISBN 978-931971-86-7 USENIX Association Proceedings of the 2nd USENIX Conference on Web Application Development June 15–16, 2011 Portland, OR, USA Conference Organizers Program Chair Armando Fox, University of California, Berkeley Program Committee Adam Barth, Google Inc. Abdur Chowdhury, Twitter Jon Howell, Microsoft Research Collin Jackson, Carnegie Mellon University Bobby Johnson, Facebook Emre Kıcıman, Microsoft Research Michael E. Maximilien, IBM Research Owen O’Malley, Yahoo! Research John Ousterhout, Stanford University Swami Sivasubramanian, Amazon Web Services Geoffrey M. Voelker, University of California, San Diego Nickolai Zeldovich, Massachusetts Institute of Technology The USENIX Association Staff WebApps ’11: 2nd USENIX Conference on Web Application Development June 15–16, 2011 Portland, OR, USA Message from the Program Chair . v Wednesday, June 15 10:30–Noon GuardRails: A Data-Centric Web Application Security Framework . 1 Jonathan Burket, Patrick Mutchler, Michael Weaver, Muzzammil Zaveri, and David Evans, University of Virginia PHP Aspis: Using Partial Taint Tracking to Protect Against Injection Attacks .
    [Show full text]
  • Mobile Applications and Browser Plugins for Wordnets
    Sophisticated Lexical Databases - Simplified Usage: Mobile Applications and Browser Plugins For Wordnets Diptesh Kanojia Raj Dabre Pushpak Bhattacharyya CFILT, CSE Department, School of Informatics, CFILT, CSE Department, IIT Bombay, Kyoto University, IIT Bombay, Mumbai, India Kyoto, Japan Mumbai, India [email protected] [email protected] [email protected] Abstract identifiable mother-tongues and 122 major languages1 . Of these, 29 languages have India is a country with 22 officially rec- more than a million native speakers, 60 have ognized languages and 17 of these have more than 100,000 and 122 have more than WordNets, a crucial resource. Web 10,000 native speakers. With this in mind, browser based interfaces are available the construction of the Indian WordNets, the for these WordNets, but are not suited IndoWordNet (Bhattacharyya, 2010) project for mobile devices which deters peo- was initiated which was an effort undertaken ple from effectively using this resource. by over 12 educational and research insti- We present our initial work on devel- tutes headed by IIT Bombay. Indian Word- oping mobile applications and browser Nets were inspired by the pioneering work of extensions to access WordNets for In- Princeton WordNet(Fellbaum, 1998) and cur- dian Languages. rently, there exist WordNets for 17 Indian lan- Our contribution is two fold: (1) We guages with the smallest one having around develop mobile applications for the An- 14,900 synsets and the largest one being Hindi droid, iOS and Windows Phone OS with 39,034 synsets and 100,705 unique words. platforms for Hindi, Marathi and San- Each WordNet is accessible by web interfaces skrit WordNets which allow users to amongst which Hindi WordNet(Dipak et al., search for words and obtain more in- 2002), Marathi WordNet and Sanskrit Word- formation along with their translations Net(Kulkarni et al., 2010) were developed at in English and other Indian languages.
    [Show full text]
  • Mergers in the Digital Economy
    2020/01 DP Axel Gautier and Joe Lamesch Mergers in the digital economy CORE Voie du Roman Pays 34, L1.03.01 B-1348 Louvain-la-Neuve Tel (32 10) 47 43 04 Email: [email protected] https://uclouvain.be/en/research-institutes/ lidam/core/discussion-papers.html Mergers in the Digital Economy∗ Axel Gautier y& Joe Lamesch z January 13, 2020 Abstract Over the period 2015-2017, the five giant technologically leading firms, Google, Amazon, Facebook, Amazon and Microsoft (GAFAM) acquired 175 companies, from small start-ups to billion dollar deals. By investigating this intense M&A, this paper ambitions a better understanding of the Big Five's strategies. To do so, we identify 6 different user groups gravitating around these multi-sided companies along with each company's most important market segments. We then track their mergers and acquisitions and match them with the segments. This exercise shows that these five firms use M&A activity mostly to strengthen their core market segments but rarely to expand their activities into new ones. Furthermore, most of the acquired products are shut down post acquisition, which suggests that GAFAM mainly acquire firm’s assets (functionality, technology, talent or IP) to integrate them in their ecosystem rather than the products and users themselves. For these tech giants, therefore, acquisition appears to be a substitute for in-house R&D. Finally, from our check for possible "killer acquisitions", it appears that just a single one in our sample could potentially be qualified as such. Keywords: Mergers, GAFAM, platform, digital markets, competition policy, killer acquisition JEL Codes: D43, K21, L40, L86, G34 ∗The authors would like to thank M.
    [Show full text]
  • Devonagentpro
    DEVONagent Pro VERSION 3.11.1 DOCUMENTATION © 2001-2019 DEVONtechnologies TABLE OF CONTENTS Read Me 4 Menus 29 The DEVONagent Pro Advantage 5 The DEVONagent Pro menu 29 System Requirements 6 The File menu 30 Installing, Updating, Removing 6 The Edit menu 31 Trial Restrictions 6 The Data menu 32 Version history 7 The Sort menu 33 License Agreement 11 The View menu 33 Credits 11 The Web menu 34 The History menu 35 Getting Started 12 The Go menu 36 Why use DEVONagent Pro? 12 The Window menu 36 When to use DEVONagent Pro? 12 The Services menu 37 DEVONagent Pro at a Glance 13 The Scripts menu 37 First steps with DEVONagent Pro 13 The Help menu 37 The Dock menu 38 Common Tasks 14 Windows and panels 39 How to find on the Internet 14 How to search beyond Google 14 Search window 40 How to customize DEVONagent Pro 15 Web browser 46 How to set up a search set to crawl feeds 15 Archive window 52 How to run a search automatically 16 Search sets 54 How to archive search results 16 Plugins and scanners 54 Downloads 56 Queries 18 Preferences 57 Assistant 57 Operators 18 Designing a search query 19 Preferences 59 Search sets 21 General 59 Search 60 What are search sets? 21 Menu extra 61 Choosing a search set 21 Web 62 Creating and managing sets 22 Tabs 63 Sharing sets 23 Bookmarks 64 General tab 23 Email 64 Advanced tab 25 Update 65 Sites tab 25 Plugins tab 26 Menu extra 66 Actions tab 27 Scripts 67 Schedule tab 28 Introduction 67 DEVONagent Pro's Scripts menu 68 Automator 69 DEVONagent Pro 3.11.1 Documentation, page 2 Plugin Development 70 Creating Your
    [Show full text]
  • Cross Lingual Information Access and Retrieval CHAPTER 1
    Cross Lingual Information Access and Retrieval CHAPTER 1 INTRODUCTION 1.1 Overview With an increasingly globalized economy, the ability to find information in other languages is becoming a necessity. Though initially the World Wide Web was dominated by English, now less than half of existing web pages are in English. Thus the problem is that much Internet content is effectively inaccessible as it is stored in languages that are not searchable without specific technical know­how. To overcome the above problem and to make communication global, the access to cross lingual information is more and more widespread within this society. The demand for fully multilingual multimedia retrieval systems first formulated in 1997, is increasingly relevant today as the global networks become vital sources of information for both professional and leisure activities. Information retrieval (IR) is a process that determines the relationship between document and query. It also defines the strength between them. Cross language information retrieval (CLIR) is the subfield of information retrieval. The basic idea behind the cross language information retrieval (CLIR) system is to retrieve documents in a language different from a query language made in the user’s own language. This may be desirable even when the user is not a speaker of the language used in the retrieved documents. Once it is known that the information exists and is relevant, the retrieved documents can be translated by a translator. Research in the area of cross­language information retrieval (CLIR) has focused mainly on methods for translating queries. Full document translation for large collections is impractical, thus query translation is a viable alternative.
    [Show full text]
  • Cesifo Working Paper No. 8056
    8056 2020 January 2020 Mergers in the Digital Economy Axel Gautier, Joe Lamesch Impressum: CESifo Working Papers ISSN 2364-1428 (electronic version) Publisher and distributor: Munich Society for the Promotion of Economic Research - CESifo GmbH The international platform of Ludwigs-Maximilians University’s Center for Economic Studies and the ifo Institute Poschingerstr. 5, 81679 Munich, Germany Telephone +49 (0)89 2180-2740, Telefax +49 (0)89 2180-17845, email [email protected] Editor: Clemens Fuest www.cesifo-group.org/wp An electronic version of the paper may be downloaded · from the SSRN website: www.SSRN.com · from the RePEc website: www.RePEc.org · from the CESifo website: www.CESifo-group.org/wp CESifo Working Paper No. 8056 Mergers in the Digital Economy Abstract Over the period 2015-2017, the five giant technologically leading firms, Google, Amazon, Facebook, Amazon and Microsoft (GAFAM) acquired 175 companies, from small start-ups to billion dollar deals. By investigating this intense M&A, this paper ambitions a better understanding of the Big Five’s strategies. To do so, we identify 6 different user groups gravitating around these multi-sided companies along with each company’s most important market segments. We then track their mergers and acquisitions and match them with the segments. This exercise shows that these five firms use M&A activity mostly to strengthen their core market segments but rarely to expand their activities into new ones. Furthermore, most of the acquired products are shut down post acquisition, which suggests that GAFAM mainly acquire firm’s assets (functionality, technology, talent or IP) to integrate them in their ecosystem rather than the products and users themselves.
    [Show full text]
  • Multi-Language Online Word Processor for Learners and the Visually Impaired
    Multi-language Online Word Processor for Learners and the Visually Impaired Shiblee Imtiaz Hasan Department of Computer Science and Engineering, BRAC University, 66, Mohakhali C/A, Dhaka 1212, Bangladesh [email protected] Abstract. Most of the modern operating systems come with accessibility and screen reading features. They also have with basic word processing applications. However, many of the languages are still not supported in the screen reading application and neither do they come with features such as translation and transliteration (phonetic). Many plugins are available to overcome these barriers, but the visually impaired users and/or users without administrative privilege will not be able to install those in the local computer. This paper discusses about implementation of a rich internet application that enables users to have access to a free online word processor which can translate, transliterate and speak out words (in different languages) that have been typed, which can be very helpful for learners of foreign languages and visually impaired users. Keywords: languages, word processor, speech synthesis, transliteration, accessibility. 1 Introduction Desktop-based applications, especially word processors are usually known to be more powerful than the online ones. They also come with more features compared to the Internet applications. However, the online applications can be very portable and can run without being installed in a computer. This paper would discuss about an online word processor which can be portable solution for many. The application can be a very important document processing utility for the visually impaired too. Some of the features would also assist learners of a new language.
    [Show full text]
  • Merger Policy in Digital Markets: an Ex Post Assessment 3 Study Is to Undertake a Less Common Form of Ex Post Assessment
    Journal of Competition Law & Economics, 00(00), 1–46 doi: 10.1093/joclec/nhaa020 MERGER POLICY IN DIGITAL MARKETS: AN EX Downloaded from https://academic.oup.com/jcle/advance-article/doi/10.1093/joclec/nhaa020/5874037 by guest on 18 December 2020 POST ASSESSMENT† Elena Argentesi,∗Paolo Buccirossi,†Emilio Calvano,‡ Tomaso Duso,§,∗ & Alessia Marrazzo,¶ & Salvatore Nava† ABSTRACT This paper presents a broad retrospective evaluation of mergers and merger decisions in markets dominated by multisided digital platforms. First, we doc- ument almost 300 acquisitions carried out by three major tech companies— Amazon, Facebook, and Google—between 2008 and 2018. We cluster target companies on their area of economic activity providing suggestive evidence on the strategies behind these mergers. Second, we discuss the features of digital markets that create new challenges for competition policy. By using relevant case studies as illustrative examples, we discuss theories of harm that have been used or, alternatively, could have been formulated by authorities in these cases. Finally, we retrospectively examine two important merger cases, Facebook/Instagram and Google/Waze, providing a systematic assessment of the theories of harm considered by the UK competition authorities as well as evidence on the evolution of the market after the transactions were approved. We discuss whether the competition authority performed complete and careful analyses to foresee the competitive consequences of the investigated mergers and whether a more effective merger control regime can be achieved within the current legal framework. JEL codes: L4; K21 ∗ Department of Economics, University of Bologna † Lear, Rome ‡ Department of Economics, University of Bologna, Toulouse School of Economics and CEPR, London § Deutsches Institut fuer Wirtschaftsforschung (DIW Berlin), Department of Economics, Tech- nical University (TU) Berlin, CEPR, London and CESifo, Munich ¶ Lear, Rome and Department of Economics, University of Bologna ∗ Corresponding author.
    [Show full text]