Welsh Language Technology Action Plan Progress Report 2020 Welsh Language Technology Action Plan: Progress Report 2020
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
'Translating' Emotions: Nationalism in Contemporary Greek Cinema
‘Reading’ and ‘Translating’ Emotions: Nationalism in Contemporary Greek Cinema by Sophia Sakellis A thesis submitted in fulfilment of the requirements for the degree of Master of Arts (Research) Department of Modern Greek and Byzantine Studies The Faculty of Arts and Social Sciences University of Sydney October 2016 ABSTRACT This study explores emotions related to nationalism, and their manifestations in contemporary Greek cinema. It also investigates the reasons and mechanisms giving rise to nationalism, and how it is perceived, expressed and ‘translated’ into other cultures. A core focus within the nationalist paradigm is the theme of national identity, with social exclusion ideologies such as racism operating in the background. Two contemporary Greek films have been chosen, which deal with themes of identity, nationalism, xenophobia, anger and fear in different contexts. The study is carried out by drawing on the theories of emotion, language, translation and cinema, to analyse the visual and audio components of the two films and ascertain their translatability to an Australian audience. Both films depict a similar milieu to each other, which is plagued by the lingering nature of all the unresolved political and national issues faced by the Greek nation, in addition to the economic crisis, a severe refugee crisis, and externally imposed policy issues, as well as numerous other social problems stemming from bureaucracy, red tape and widespread state-led corruption, which have resulted in massive rates of unemployment and financial hardship that have befallen a major part of the population. In spite of their topicality, the themes are universal and prevalent in a number of countries to varying degrees, as cultural borders become increasingly integrated, both socially and economically. -
Study on the Size of the Language Industry in the EU
Studies on translation and multilingualism o The size of the language industry in the EU European Commission Directorate-General for Translation 1/2009 Manuscript completed on 17th August 2009 ISBN 978-92-79-14181-2 © European Commission, 2009 Reproduction is authorised provided the source is acknowledged. %R7`V]Q` Q .V 1`VH Q`: VVJV`:C`Q``:JC: 1QJ Q` .V%`Q]V:J QII11QJ !1J:C0V`1QJ R$R% %R7QJ .V1<VQ` .VC:J$%:$V1JR% `71J .V .%$% .V:J$%:$VVH.JQCQ$7VJ `V R R 1J$ QJ1CC 1J$ QJ%]QJ.:IV %``V7 J1 VR1J$RQI 1118C:J$ VH.8HQ8%@ % .Q`7 `8R`1:JV 1JH.V.::.#1JQI]% : 1QJ:C1J$%1 1H``QI%QJJJ10V`1 75(V`I:J78 .V `Q%JRVR .V :J$%:$V VH.JQCQ$7 VJ `V ^_ 1J 5 : C1I1 VR HQI]:J7 G:VR 1J QJRQJ :JR 1JHQ`]Q`: VR 1J :.1J$ QJ #8 .J /]`1C 5 GVH:IV ]:` Q` : $`Q%] Q` HQI]:J1V%JRV` .V%IG`VCC:Q`/12#.3( R11 .#`811JH.V:I:=Q`1 7.:`V.QCRV`8 JRV`#`811JH.V;CV:RV`.1]5HQJ 1J%V QQ]V`: V::I%C 1C1J$%:CHQJ%C :JH75V`01HV :JRQ` 1:`VR1 `1G% 1QJHQI]:J71.V`V:Q` 1:`VRV1$J5RV0VCQ]IVJ :JR%]]Q` 1: `:J`V``VR Q/$1CVVGQC% 1QJ R811 .Q``1HV1JQJRQJ:JR%QJJ5(V`I:J78 #`8 11JH.V HQRQ`R1J: V 1J V`J:C :JR 7 `%JRVR `VV:`H. :JR RV0VCQ]IVJ ]`Q=VH 5 I:`@V %R1V:JR `1:C8.V1::]]Q1J VRV0:C%: Q``Q`V0V`:C:CC`Q``Q]Q:CQ` .V 7%`Q]V:JQII11QJ5:JR`V01V1V``Q`V0V`:C7]`Q=VH V0:C%: 1QJ8 :R1:1Q` V`:R:JQ 1;]`Q`1CV1JHC%RV:%H1J.71H:JR/R0:JHVRVH.JQCQ$1V]%`%VR : .VJ10V`1 1V Q` 8`V1G%`$ ^(V`I:J7_ :JR 1VJ: ^. -
Automatic Correction of Real-Word Errors in Spanish Clinical Texts
sensors Article Automatic Correction of Real-Word Errors in Spanish Clinical Texts Daniel Bravo-Candel 1,Jésica López-Hernández 1, José Antonio García-Díaz 1 , Fernando Molina-Molina 2 and Francisco García-Sánchez 1,* 1 Department of Informatics and Systems, Faculty of Computer Science, Campus de Espinardo, University of Murcia, 30100 Murcia, Spain; [email protected] (D.B.-C.); [email protected] (J.L.-H.); [email protected] (J.A.G.-D.) 2 VÓCALI Sistemas Inteligentes S.L., 30100 Murcia, Spain; [email protected] * Correspondence: [email protected]; Tel.: +34-86888-8107 Abstract: Real-word errors are characterized by being actual terms in the dictionary. By providing context, real-word errors are detected. Traditional methods to detect and correct such errors are mostly based on counting the frequency of short word sequences in a corpus. Then, the probability of a word being a real-word error is computed. On the other hand, state-of-the-art approaches make use of deep learning models to learn context by extracting semantic features from text. In this work, a deep learning model were implemented for correcting real-word errors in clinical text. Specifically, a Seq2seq Neural Machine Translation Model mapped erroneous sentences to correct them. For that, different types of error were generated in correct sentences by using rules. Different Seq2seq models were trained and evaluated on two corpora: the Wikicorpus and a collection of three clinical datasets. The medicine corpus was much smaller than the Wikicorpus due to privacy issues when dealing Citation: Bravo-Candel, D.; López-Hernández, J.; García-Díaz, with patient information. -
Digital Workaholics:A Click Away
GSJ: Volume 9, Issue 5, May 2021 ISSN 2320-9186 1479 GSJ: Volume 9, Issue 5, May 2021, Online: ISSN 2320-9186 www.globalscientificjournal.com DIGITAL WORKAHOLICS:A CLICK AWAY Tripti Srivastava Muskan Chauhan Rohit Pandey Galgotias University Galgotias University Galgotias University [email protected] muskanchauhansmile@g [email protected] m mail.com om ABSTRACT:-In today’s world, Artificial Intelligence (AI) has become an 1. INTRODUCTION integral part of human life. There are many applications of AI such as Chatbot, AI defines as those device that understands network security, complex problem there surroundings and took actions which solving, assistants, and many such. increase there chance to accomplish its Artificial Intelligence is designed to have outcomes. Artifical Intelligence used as “a cognitive intelligence which learns from system’s ability to precisely interpret its experience to take future decisions. A external data, to learn previous such data, virtual assistant is also an example of and to use these learnings to accomplish cognitive intelligence. Virtual assistant distinct outcomes and tasks through supple implies the AI operated program which adaptation.” Artificial Intelligence is the can assist you to reply to your query or developing branch of computer science. virtually do something for you. Currently, Having much more power and ability to the virtual assistant is used for personal develop the various application. AI implies and professional use. Most of the virtual the use of a different algorithm to solve assistant is device-dependent and is bind to various problems. The major application a single user or device. It recognizes only being Optical character recognition, one user. -
Deep Almond: a Deep Learning-Based Virtual Assistant [Language-To-Code Synthesis of Trigger-Action Programs Using Seq2seq Neural Networks]
Deep Almond: A Deep Learning-based Virtual Assistant [Language-to-code synthesis of Trigger-Action programs using Seq2Seq Neural Networks] Giovanni Campagna Rakesh Ramesh Computer Science Department Stanford University Stanford, CA 94305 {gcampagn, rakeshr1}@stanford.edu Abstract Virtual assistants are the cutting edge of end user interaction, thanks to endless set of capabilities across multiple services. The natural language techniques thus need to be evolved to match the level of power and sophistication that users ex- pect from virtual assistants. In this report we investigate an existing deep learning model for semantic parsing, and we apply it to the problem of converting nat- ural language to trigger-action programs for the Almond virtual assistant. We implement a one layer seq2seq model with attention layer, and experiment with grammar constraints and different RNN cells. We take advantage of its existing dataset and we experiment with different ways to extend the training set. Our parser shows mixed results on the different Almond test sets, performing better than the state of the art on synthetic benchmarks by about 10% but poorer on real- istic user data by about 15%. Furthermore, our parser is shown to be extensible to generalization, as well as or better than the current system employed by Almond. 1 Introduction Today, we can ask virtual assistants like Amazon Alexa, Apple’s Siri, Google Now to perform simple tasks like, “What’s the weather”, “Remind me to take pills in the morning”, etc. in natural language. The next evolution of natural language interaction with virtual assistants is in the form of task automation such as “turn on the air conditioner whenever the temperature rises above 30 degrees Celsius”, or “if there is motion on the security camera after 10pm, call Bob”. -
Inside Chatbots – Answer Bot Vs. Personal Assistant Vs. Virtual Support Agent
WHITE PAPER Inside Chatbots – Answer Bot vs. Personal Assistant vs. Virtual Support Agent Artificial intelligence (AI) has made huge strides in recent years and chatbots have become all the rage as they transform customer service. Terminology, such as answer bots, intelligent personal assistants, virtual assistants for business, and virtual support agents are used interchangeably, but are they the same thing? The concepts are similar, but each serves a different purpose, has specific capabilities, varying extensibility, and provides a range of business value. 2018 © Copyright ServiceAide, Inc. 1-650-206-8988 | www.serviceaide.com | [email protected] INSIDE CHATBOTS – ANSWER BOT VS. PERSONAL ASSISTANT VS. VIRTUAL SUPPORT AGENT WHITE PAPER Before we dive into each solution, a short technical primer is in order. A chatbot is an AI-based solution that uses natural language understanding to “understand” a user’s statement or request and map that to a specific intent. The ‘intent’ is equivalent to the intention or ‘the want’ of the user, such as ordering a pizza or booking a flight. Once the chatbot understands the intent of the user, it can carry out the corresponding task(s). To create a chatbot, someone (the developer or vendor) must determine the services that the ‘bot’ will provide and then collect the information to support requests for the services. The designer must train the chatbot on numerous speech patterns (called utterances) which cover various ways a user or customer might express intent. In this development stage, the developer defines the information required for a particular service (e.g. for a pizza order the chatbot will require the size of the pizza, crust type, and toppings). -
82 Post-Editing of Machine Translation Output
Matej Kovačević, Post-editing of MT output Hieronymus 1 (2014), 82-104 POST-EDITING OF MACHINE TRANSLATION OUTPUT WITH AND WITHOUT SOURCE TEXT Matej Kovačević Abstract Post-editing of machine translation output is a practice which aims to speed up translation production. There is still no agreement on whether post-editors should have access to the source text of the translations they are post-editing. The aim of this paper is to see how access to source text influences post-editors’ quality of work and their speed. An experiment was conducted among 22 graduate students of English, who post-edited two translations produced by Google Translate. The subjects were divided into two groups, with access to the source text for only one of the translations. Task duration and the number of corrected errors in the MT output were measured. Types of errors were further analysed. Contrary to expectations, access to source text was found not to have a considerable impact on speed. As expected, it did have an impact on the quality of the final translation. 1. Introduction Machine translation (MT) systems have been available to translators for 60 years, but they still cannot produce perfect translations. This is the reason why people are still apprehensive when it comes to using such systems. The development of MT systems has introduced a practice called post-editing of MT output whereby a machine translates the text and the translator (post-editor) revises the translation. This paper will explore that practice, more specifically the issue of whether the post-editor should have access to the source text (ST) when post-editing a translation. -
Influence of Liverpool Welsh on Lenition in Liverpool English
Influence of Liverpool Welsh on Lenition in Liverpool English Hannah Paton The purpose of this research was to provide evidence for the Welsh language having an influence on the Liverpool accent with a specific focus on the lenition and aspiration of voiceless plosives. Lenition and aspiration of the speech of participants from Liverpool and North Wales were determined using an acoustic software. The data suggested lenition did occur in the speech of the Welsh participants. However, lenition seemed to be trending amongst people in Liverpool as plosives were lenited a stage further than previous research suggests. Conclusions may be drawn to highlight an influence of the Welsh language on lenition in Liverpool. Keywords: lenition, Liverpool, sociophonetics, Wales, Welsh language 1 Introduction The use of aspirated voiceless plosives is a common feature in British accents of English (Ashby & Maidment 2005). However, the aspiration of voiceless plosives in a word-final position is normally not audible and if it is medial, it takes on the characteristics of other syllables in the word (Roach, 2000). Some accents of English exaggerate the aspiration of the voiceless plosives /p, t, k/ although this is not generally found in most Northern English accents of English (Wells 1982). There are only two English accents of English that have been found to do this through previous research. These are London and Liverpool English (Trudgill, Hughes & Watt 2005). Due to the influence of London English on Estuary English, this feature can be heard in the South West of England. Most accents of English that do have this feature have some interference from another language that heavily aspirates voiceless plosives such as Irish, Welsh, Singaporean or African-American English (Wells, 1982). -
Learning to Read by Spelling Towards Unsupervised Text Recognition
Learning to Read by Spelling Towards Unsupervised Text Recognition Ankush Gupta Andrea Vedaldi Andrew Zisserman Visual Geometry Group Visual Geometry Group Visual Geometry Group University of Oxford University of Oxford University of Oxford [email protected] [email protected] [email protected] tttttttttttttttttttttttttttttttttttttttttttrssssss ttttttt ny nytt nr nrttttttt ny ntt nrttttttt zzzz iterations iterations bcorote to whol th ticunthss tidio tiostolonzzzz trougfht to ferr oy disectins it has dicomered Training brought to view by dissection it was discovered Figure 1: Text recognition from unaligned data. We present a method for recognising text in images without using any labelled data. This is achieved by learning to align the statistics of the predicted text strings, against the statistics of valid text strings sampled from a corpus. The figure above visualises the transcriptions as various characters are learnt through the training iterations. The model firstlearns the concept of {space}, and hence, learns to segment the string into words; followed by common words like {to, it}, and only later learns to correctly map the less frequent characters like {v, w}. The last transcription also corresponds to the ground-truth (punctuations are not modelled). The colour bar on the right indicates the accuracy (darker means higher accuracy). ABSTRACT 1 INTRODUCTION This work presents a method for visual text recognition without read (ri:d) verb • Look at and comprehend the meaning of using any paired supervisory data. We formulate the text recogni- (written or printed matter) by interpreting the characters or tion task as one of aligning the conditional distribution of strings symbols of which it is composed. -
Kupni Smlouva.Pdf
KUPNÍ SMLOUVA Č. UKFFS/0267/2020 uzavřená dle ustanovení § 2430 a násl. zákona č. 89/2012 Sb., občanský zákoník, ve znění pozdějších předpisů, (dále jen „OZ“) mezi smluvními stranami Univerzita Karlova, Filozofická fakulta sídlo: náměstí Jana Palacha 1/2, 116 38 Praha 1 zastoupena: doc. PhDr. Michalem Pullmannem, Ph.D., děkanem IČO: 00216208 DIČ: CZ00216208 bankovní spojení: Komerční banka, a.s., Praha 1 č. ú.: kontaktní osoba: +420 221 619 248, e‐mail: interní číslo zakázky: na straně jedné (dále jen „kupující“) a Kuba Libri s.r.o. sídlo/místo podnikání: Ruská 972/94, 100 00 Praha 10 ‐ Vršovice zápis v obchodním rejstříku: vedeném Městským soudem v Praze, odd. C, vložka 204178 zastoupena: MUDr. Janem Švecem, jednatelem společnosti IČO: 29149177 DIČ: CZ29149177 bankovní spojení: Komerční banka, a.s. č. ú.: 107‐407821021 0 kontaktní osoba: MUDr. Jan Švec, 6 959 563, e‐mail: jan.svec@kubal na straně druhé (dále jen „prodávající“) Článek I Úvodní ustanovení 1.1. Tato smlouva se uzavírá na základě výsledků zadávacího řízení na dodávky, vyhlášeného kupujícím, jako zadavatelem, pod názvem „Výzva č. 15: FF UK ‐ nákup knih pro projekt KREAS“ (dále jen „veřejná zakázka“ nebo „zakázka“) v rámci zavedeného Dynamického nákupního systému pro nákup knih pro projekt KREAS v období 2018 ‐ 2022 a v souladu se zadávací dokumentací a nabídkou prodávajícího podanou v rámci shora uvedeného zadávacího řízení. 1.2. Touto smlouvou se prodávající zavazuje na svůj náklad, na svou odpovědnost a v dohodnuté době dodat kupujícímu knihy (dále jen „zboží“), které jsou blíže specifikované v Příloze č. 1 Specifikace dodávky (dále jen „Příloha č. -
Detecting Personal Life Events from Social Media
Open Research Online The Open University’s repository of research publications and other research outputs Detecting Personal Life Events from Social Media Thesis How to cite: Dickinson, Thomas Kier (2019). Detecting Personal Life Events from Social Media. PhD thesis The Open University. For guidance on citations see FAQs. c 2018 The Author https://creativecommons.org/licenses/by-nc-nd/4.0/ Version: Version of Record Link(s) to article on publisher’s website: http://dx.doi.org/doi:10.21954/ou.ro.00010aa9 Copyright and Moral Rights for the articles on this site are retained by the individual authors and/or other copyright owners. For more information on Open Research Online’s data policy on reuse of materials please consult the policies page. oro.open.ac.uk Detecting Personal Life Events from Social Media a thesis presented by Thomas K. Dickinson to The Department of Science, Technology, Engineering and Mathematics in partial fulfilment of the requirements for the degree of Doctor of Philosophy in the subject of Computer Science The Open University Milton Keynes, England May 2019 Thesis advisor: Professor Harith Alani & Dr Paul Mulholland Thomas K. Dickinson Detecting Personal Life Events from Social Media Abstract Social media has become a dominating force over the past 15 years, with the rise of sites such as Facebook, Instagram, and Twitter. Some of us have been with these sites since the start, posting all about our personal lives and building up a digital identify of ourselves. But within this myriad of posts, what actually matters to us, and what do our digital identities tell people about ourselves? One way that we can start to filter through this data, is to build classifiers that can identify posts about our personal life events, allowing us to start to self reflect on what we share online. -
Spelling Correction: from Two-Level Morphology to Open Source
Spelling Correction: from two-level morphology to open source Iñaki Alegria, Klara Ceberio, Nerea Ezeiza, Aitor Soroa, Gregorio Hernandez Ixa group. University of the Basque Country / Eleka S.L. 649 P.K. 20080 Donostia. Basque Country. [email protected] Abstract Basque is a highly inflected and agglutinative language (Alegria et al., 1996). Two-level morphology has been applied successfully to this kind of languages and there are two-level based descriptions for very different languages. After doing the morphological description for a language, it is easy to develop a spelling checker/corrector for this language. However, what happens if we want to use the speller in the "free world" (OpenOffice, Mozilla, emacs, LaTeX, ...)? Ispell and similar tools (aspell, hunspell, myspell) are the usual mechanisms for these purposes, but they do not fit the two-level model. In the absence of two-level morphology based mechanisms, an automatic conversion from two-level description to hunspell is described in this paper. previous work and reuse the morphological information. 1. Introduction Two-level morphology (Koskenniemi, 1983; Beesley & Karttunen, 2003) has been applied successfully to the 2. Free software for spelling correction morphological description of highly inflected languages. Unfortunately there are not open source tools for spelling There are two-level based descriptions for very different correction with these features: languages (English, German, Swedish, French, Spanish, • It is standardized in the most of the applications Danish, Norwegian, Finnish, Basque, Russian, Turkish, (OpenOffice, Mozilla, emacs, LaTeX, ...). Arab, Aymara, Swahili, etc.). • It is based on the two-level morphology. After doing the morphological description, it is easy to The spell family of spell checkers (ispell, aspell, myspell) develop a spelling checker/corrector for the language fits the first condition but not the second.