When Information Retrieval Meets Machine Translation
Recommended publications
-
Modeling Communities in Information Systems: Informal Learning Communities in Social Media
"Modeling Communities in Information Systems: Informal Learning Communities in Social Media". Dissertation approved by the Faculty of Mathematics, Computer Science and Natural Sciences of RWTH Aachen University for the award of the academic degree of Doctor of Natural Sciences, submitted by Zinayida Kensche, née Petrushyna, MSc., from Odessa, Ukraine. Reviewers: Universitätsprofessor Professor Dr. Matthias Jarke, Universitätsprofessor Professor Dr. Marcus Specht, Privatdozent Dr. Ralf Klamma. Date of the oral examination: 17.11.2015. This dissertation is available online on the website of the University Library. © Copyright 2016. All Rights Reserved. Abstract: Information modeling is required for creating a successful information system, while modeling of communities is pivotal for maintaining community information systems (CIS). Online social media, a special case of CIS, are used intensively but are not usually adapted to the needs of learning communities. Community stakeholders therefore face problems when supporting learning communities in social media. Under the prism of Community of Practice theory, such communities have three dimensions that are responsible for community sustainability: mutual engagement, joint enterprises and shared repertoire. Existing modeling solutions use either the perspectives of learning theories or the analysis of learner or community data captured in social media, but rarely combine both approaches. Current solutions therefore produce community models that supply only some of the community stakeholders with information, and that information can hardly describe community success and failure. We also claim that community models must be created based on community data analysis integrated with our learning community dimensions. Moreover, the models need to be adapted according to environmental changes. This work provides a solution to continuous modeling of informal learning communities in social media. -
RANLPStud 2019
RANLPStud 2019: Proceedings of the Student Research Workshop associated with The 12th International Conference on Recent Advances in Natural Language Processing (RANLP 2019), 2–4 September 2019, Varna, Bulgaria. ISSN 1314-9156. Designed and Printed by INCOMA Ltd., Shoumen, BULGARIA. Preface: The RANLP 2019 Student Research Workshop (RANLPStud 2019) is a special track of the established international conference Recent Advances in Natural Language Processing (RANLP 2019), now in its twelfth edition. The RANLP Student Research Workshop is being organised for the sixth time and this year is running in parallel with the other tracks of the main RANLP 2019 conference. The aim of RANLPStud 2019 is to be a discussion forum and to provide an outstanding opportunity for students at all levels (Bachelor, Masters, and Ph.D.) to present their work in progress or completed projects to an international research audience and receive feedback from senior researchers. The RANLP 2019 Student Research Workshop received a large number of submissions (23), reflecting the record number of events, sponsors, submissions, and participants at the main RANLP 2019 conference. We have accepted 2 excellent student papers as oral presentations, and 12 submissions will be presented as posters. The final acceptance rate of the workshop was 60%. We did our best to run the reviewing process in the best interest of our authors, asking our reviewers to give comments and suggestions that were as exhaustive as possible and to maintain an encouraging attitude. -
The Impact of Culture On Smart Community Technology: The Case of 13 Wikipedia Instances
Interaction Design and Architecture(s) Journal - IxD&A, N.22, 2014, pp. 34-47. The Impact of Culture On Smart Community Technology: The Case of 13 Wikipedia Instances. Zinayida Petrushyna (1), Ralf Klamma (1), Matthias Jarke (1, 2). (1) Advanced Community Information Systems Group, Information Systems and Databases Chair, RWTH Aachen University, Ahornstrasse 55, 52056 Aachen, Germany. (2) Fraunhofer Institute for Applied Information Technology FIT, 53754 St. Augustin, Germany. {petrushyna, klamma}@dbis.rwth-aachen.de. Abstract: Smart communities provide technologies for monitoring social behaviors inside communities. The technologies that support knowledge building should consider the cultural background of community members. Studies of the influence of culture on knowledge building are limited. Only a few works consider the digital traces of individuals and explain them using cultural values and beliefs. In this work, we analyze 13 Wikipedia instances where users with different cultural backgrounds build knowledge in different ways. We compare the edits of users. Using social network analysis, we build and analyze co-authorship networks and observe how the networks evolve. We explain the differences we have found using Hofstede dimensions and Schwartz cultural values and discuss implications for the design of smart community technologies. Our findings provide insights into the requirements for technologies used for smart communities in different cultures. Keywords: Social network analysis, Wikipedia communities, Hofstede dimensions, Schwartz cultural values. 1 Introduction: People prefer to live in smart cities where their needs are satisfied [1]. The development of smart cities depends on the collaboration of individuals. The investigation of the flow [1] of knowledge created by the individuals allows the monitoring of city smartness. -
Mining Cross-Cultural Relations from Wikipedia - a Study of 31 European Food Cultures
Mining cross-cultural relations from Wikipedia - A study of 31 European food cultures. Paul Laufer (Graz University of Technology, Graz, Austria), Claudia Wagner (GESIS & U. of Koblenz, Cologne, Germany), Fabian Flöck (GESIS, Cologne, Germany), Markus Strohmaier (GESIS & U. of Koblenz, Cologne, Germany). ABSTRACT: For many people, Wikipedia represents one of the primary sources of knowledge about foreign cultures. Yet, different Wikipedia language editions offer different descriptions of cultural practices. Unveiling diverging representations of cultures provides an important insight, since they may foster the formation of cross-cultural stereotypes, misunderstandings and potentially even conflict. In this work, we explore to what extent the descriptions of cultural practices in various European language editions of Wikipedia differ on the example of culinary practices and propose an approach to mine cultural relations between different language communities through their description of and interest in their own and other communities' food culture. [Fragment from the introduction:] the editor community of the Romanian-language Wikipedia could either have a deviant mental picture of the French cuisine, or it might estimate the priorities of Romanian-speaking readers to rather be on meat-based French delicatessen than on wine and baking goods. Further, the general interest of the Romanian-speaking readers in the French cuisine (for example measured by the number of views of the article about French cuisine in the Romanian language edition) might serve to potentially displease any Francophile, since the Romanian speaking community might show notably less interest in the French kitchen than in the Russian or Hungarian one. This hypothetical scenario serves as an example for numerous similar real-world cases (which can- -
Digital Spaces in “Popular” Culture Ontologized
Digital Spaces in “Popular” Culture Ontologized. Mícheál Mac an Airchinnigh, School of Computer Science and Statistics, University of Dublin, Trinity College Dublin, Ireland. Abstract. Electronic publication seems to have the biased connotation of a one-way delivery system from the source to the masses. It originates within the “Push culture.” Specifically, it triggers the notion of (a small number of) producers and the (masses of) consumers. The World Wide Web (but not the Internet) tends to reinforce this intrinsic paradigm, one that is not very different from schooling. Wikipedia, in all its linguistic varieties, is a classical publication of the collective. Much enthusiasm is expressed for the English version. The other world languages do not fare so well. It is of interest to consider the nature and status of Wikipedia with respect to two specific fields, Mathematics (which is not so popular) and Soap Operas (popular for the masses, largely sponsored by the Corporations). Keywords: Digital Space, Intelligent Artificial Life, Ontology, Soap Opera. 1. The Talking Drum. “Before long, there were people for whom the path of communications technology had leapt directly from the talking drum to the mobile phone, skipping over the intermediate stages” [1][2]. It comes as a shock for most to learn that the first successful transmission of message data (i.e., a significant amount of meaningful information, in the technical scientific sense) at a distance was done on African drums. From the book cited above one may, with clear conscience, jump to Wikipedia to see if there is information on the “Talking drum” [1, 3]. -
GoURMET Project
GoURMET H2020–825299 D1.1 Survey of relevant low-resource languages. Global Under-Resourced MEdia Translation (GoURMET), H2020 Research and Innovation Action, Number: 825299. D1.1 – Survey of relevant low-resource languages.
Nature: Report
Work Package: WP1
Due Date: 30/04/2019
Submission Date: 30/04/2019
Main authors: Mikel L. Forcada
Co-authors: Miquel Esplà-Gomis, Juan Antonio Pérez-Ortiz, Víctor M. Sánchez-Cartagena, Felipe Sánchez-Martínez
Reviewers: Alexandra Birch
Keywords: survey, languages, resources, machine translation
Version Control: v0.8, Status 1st Draft, 21/04/2019; v1.0, Status Final Version, 30/04/2019
Contents
1 Introduction
1.1 Language pairs of interest for GoURMET
2 Languages
2.1 Afaan Oromoo (om, orm)
2.1.1 Factsheet
2.1.2 Contrasts with English
2.1.3 Corpora
2.1.4 Resources
2.1.5 Challenges for corpus-based MT from English
2.2 Bosnian (bs, bos)
2.2.1 Factsheet
2.2.2 Contrasts with English
2.2.3 Corpora
2.2.4 Resources
2.2.5 Challenges for corpus-based MT from English
2.3 Bulgarian (bg, bul)
2.3.1 Factsheet
2.3.2 Contrasts with English
2.3.3 Corpora
2.3.4 Resources
2.3.5 Challenges for corpus-based MT from English
2.4 Croatian (hr, hrv)
2.4.1 Factsheet
2.4.2 Contrasts with English
-
Proceedings of the Student Research Workshop (RANLPStud 2019)
RANLPStud 2019: Proceedings of the Student Research Workshop associated with The 12th International Conference on Recent Advances in Natural Language Processing (RANLP 2019), 2–4 September 2019, Varna, Bulgaria. Series Online ISSN 2603-2821. Designed and Printed by INCOMA Ltd., Shoumen, BULGARIA. Preface: The RANLP 2019 Student Research Workshop (RANLPStud 2019) is a special track of the established international conference Recent Advances in Natural Language Processing (RANLP 2019), now in its twelfth edition. The RANLP Student Research Workshop is being organised for the sixth time and this year is running in parallel with the other tracks of the main RANLP 2019 conference. The aim of RANLPStud 2019 is to be a discussion forum and to provide an outstanding opportunity for students at all levels (Bachelor, Masters, and Ph.D.) to present their work in progress or completed projects to an international research audience and receive feedback from senior researchers. The RANLP 2019 Student Research Workshop received a large number of submissions (23), reflecting the record number of events, sponsors, submissions, and participants at the main RANLP 2019 conference. We have accepted 2 excellent student papers as oral presentations, and 12 submissions will be presented as posters. The final acceptance rate of the workshop was 60%. We did our best to run the reviewing process in the best interest of our authors, asking our reviewers to give comments and suggestions that were as exhaustive as possible and to maintain an encouraging attitude.