
Kernerman DICTIONARY News
Number 22 ● July 2014 ● kdictionaries.com/kdn

The European Network of e-Lexicography (ENeL)
Tanneke Schoonheim

On October 11th 2013, the kick-off meeting of the European Network of e-Lexicography (ENeL) project took place in Brussels. This meeting was the outcome of an idea ventilated a year and a half earlier, in March 2012 in Berlin, at the European Workshop on Future Standards in Lexicography. The workshop participants then confirmed the imperative to coordinate and harmonise research in the field of (electronic) lexicography across Europe, namely to share expertise relating to standards, discuss new methodologies in lexicography that fully exploit the possibilities of the digital medium, reflect on the pan-European nature of the languages of Europe and attain a wider audience.

A proposal was written by a team of researchers from the Instituut voor Nederlandse Lexicologie, the Fryske Akademy and the Berlin-Brandenburgische Akademie für Wissenschaften, partners from all over Europe were found and in May 2013 the action was approved of by COST (European Cooperation in Science and Technology), which is an EU framework supporting cooperation among scientists and researchers across Europe. This means that up to the end of 2017 public funds are available for meetings, workshops, training schools and conferences with regard to the European Network of e-Lexicography.

This presentation of ENeL explains the background, aim, structure and impact of the network, and its meetings, workshops, training schools and scientific missions.

General background
The background for setting up the network was that computers and the internet have seriously changed the conditions for the production and reception of dictionaries. The internet offers entirely new possibilities for developing and presenting dictionary information, such as with the integration of sound, maps or video, and various novel ways of interacting with dictionary users. For editors of scholarly dictionaries the new medium is not only a source of inspiration, it also generates new and serious challenges that demand cooperation and standardization on various levels:
a. Through the internet scholarly dictionaries can potentially reach large audiences. However, at present scholarly dictionaries providing reliable information are often not easy to find and are hard to decode for a non-academic audience; ‘traditional’ dictionary users tend to use easily accessible non-academic and user-generated dictionaries. In order to let a larger audience benefit from higher quality dictionary content, we should bridge the gap between the general public and scholarly dictionaries by improving access to these dictionaries and making them more widely known.
b. Most scholarly dictionary projects make their products available on the internet or have plans for going online. All of them find themselves confronted with similar problems relating to technologies for producing lexicographical content, presentation, interaction with users, etc. Most dictionaries take different approaches to and find different solutions for these problems. So far European and international cooperation in these fields has been restricted to bilateral collaborations. There is a clear need for a broader and more systematic exchange of expertise and for the establishment of common standards and solutions.
c. In the past years, innovative forms of electronic dictionaries

In this issue
The European Network of e-Lexicography (ENeL) | Tanneke Schoonheim
Joint Symposia of JACET Society of English Lexicography and Kansai English Lexicography Circle | Shigeru Yamada
Multilingual Linked Open Data for Enterprises (MLODE 2014) | Sebastian Hellmann, Bettina Klimek
KDNews 2014
Claudia Xatara, Claudia Zavaglia, Rosa Maria da Silva (dirs.). Dicionário Multilíngue de Regência Verbal – Verbos preposicionados | Antonio Pamies Bertrán
Reverso Context: Redefining categories for dictionaries and language tools | Théo Hoffenberg
An introduction to iFinger and Clarify Language Service | Knut Haga
ASIALEX 2015, Hong Kong | Li Lan
KD website

Editor | Ilan Kernerman
© 2014 All rights reserved.
K DICTIONARIES LTD
Nahum 8, Tel Aviv 63503, Israel
Tel: 972-3-5468102 Fax: 972-3-5468103
kdl@kdictionaries.com
http://kdictionaries.com
ISSN 1565-4745

have appeared that no longer resemble traditional paper dictionaries but try to fully exploit the new possibilities of the digital medium. Though serious attempts have already been made at embedding electronic lexicography into a theoretical framework, an up-to-date research paradigm and common standards for electronic lexicography are still lacking, as are common standards and cooperation for interlinking content of digitized dictionaries and innovative e-dictionaries.
d. The digital medium offers the potential for a new type of lexicography that no longer views languages as isolated entities. Language migration has always been part of human history, yet this is often not properly reflected in dictionaries. Many dictionaries have their origins in the 19th century, when a national perspective on language prevailed. Consequently, the information in the dictionaries is mostly presented from the point of view of a single language, not sufficiently taking into account similar or related developments in other European languages. This view of languages often dominates modern dictionaries as well.

There is a clear need for a common approach to e-lexicography that forms the basis for a new type of lexicography that fully embraces the pan-European nature of much of the vocabularies of languages spoken in Europe.

Aim of the network
The aim of the European Network of e-Lexicography is to increase, co-ordinate and harmonise research in the field of e-lexicography and to make authoritative information on the languages of Europe easily accessible. The network will:
• Make lexical knowledge of small and large languages available in a European dictionary portal. This portal will serve as the central reference point for all dictionary users who look for reliable, authoritative dictionary information on the languages of Europe and their histories on the internet.
• Enable cooperation and the exchange of resources, technologies and experience in e-lexicography and provide support for dictionaries that are not yet online.
• Discuss and aim at establishing standards for innovative e-dictionaries that fully exploit the possibilities of the digital medium.
• Establish new ways of representing the common heritage of European languages by developing shared editorial practices and by interconnecting already existing information.

Working groups
Four working groups are the main vehicles for delivering the scientific program. Each consists of lexicographers and computational linguists from different countries and includes both experienced researchers and young ones. Each working group is led by a chair and vice-chair, who also provide annual reports on its activities.
All researchers affiliated to an institution in a COST country can join the network. There is also room for researchers from countries neighbouring a COST country and from Argentina, New Zealand and South Africa.

Working group 1. Integrated interface for European dictionary content
Working group 1 investigates how authoritative dictionary information on the languages of Europe can be made accessible to both the general and academic public. Chair of the working group is Anne Dykstra (Netherlands) and vice-chair is Bob Boelhouwer (Netherlands).
The working group will:
• Set up a European dictionary portal, which will give information on scholarly dictionaries of the languages of Europe and provide access to these dictionaries. Different parameters will be considered, e.g. dictionary type, language covered, digitally-born versus retro-digitized.
• Investigate the possibilities of interlinking the contents of European dictionaries.
• Investigate user requirements with respect to the presentation of dictionary content.
• Investigate the possible involvement of users in the creation of dictionary content.

Working group 2. Retro-digitized dictionaries
In working group 2 the focus is on the digitization of paper dictionaries. It intends to set up guidelines and standards for turning paper dictionaries into digital format. Chair of the working group is Vera Hildenbrandt (Germany) and vice-chair is Vladimír Benko (Slovakia).
Work will be carried out on:
• The development of standards for encoding of information and the description of relevant categories for print dictionaries.
• Presenting an overview of software for the conversion of physical layout information to logical information.
• The investigation of relevant information categories to be added to the dictionary in order to make its content more readily accessible and interoperable.
• The development of a work plan for digitisation, including parameters necessary for estimating costs.
• The investigation of possible use of dictionary content for computational linguistic applications.
• The organisation of a training school on standard tools and methods for retro-digitising dictionaries in 2015.

Working group 3. Innovative e-dictionaries
The scope of working group 3 is the development of digitally-born dictionaries, focusing in particular on the latest developments in e-lexicography and the interface between lexicography and computational linguistics. Chair of the working group is Simon Krek (Slovenia) and vice-chair is Carole Tiberius (Netherlands).
Work will be carried out on:
• The description of the workflow for corpus-based lexicography.
• Providing an overview of existing software needed to set up this workflow.
• The use of dictionary writing systems.
• The analysis of the possible impact of automatic acquisition of lexical data.
• The analysis of the interface between dictionary and computational lexica (cf. [...]) and syntactically and semantically annotated corpora (cf. FrameNet, SemCor, Senseval).
• The investigation of the possible use of dictionary content for language technology applications.
• The organisation of a training school on innovative approaches in e-lexicography in 2016.

Working group 4. Lexicography and lexicology from a pan-European perspective
Working group 4 investigates how the pan-European nature of the vocabularies of the languages of Europe can be represented in single-language dictionaries and within the European dictionary portal. This is particularly relevant to studying the multiple dimensions of borrowing, i.e. the migration and re-migration of words and meanings across the languages of Europe. Chair of the working group is Eveline Wandl-Vogt (Austria) and vice-chair is Phil Withington (UK).
This working group will:
• Develop ways in which already existing information from single language dictionaries can be displayed and interlinked to represent more adequately their common European heritage.
• Develop editorial guidelines for the integration of European information into traditional dictionaries as well as innovative ones.
• Find new applications for the very large amount of interconnected dictionary information from the European dictionary portal in the field of digital humanities.
• Organise a training school in lexicography and lexicology from a pan-European perspective in 2017.

Training schools and short-term scientific missions
One of the responsibilities of the network is to organise training schools and to provide so-called short-term scientific missions (STSMs). The three training schools mentioned above will be organised by members of the relevant working groups, in close cooperation with the training school manager of the network, Rute Costa (Portugal). Participants of the training schools will get their travel and accommodation expenses reimbursed by the network.
The first training school will be held in 2015 on the subject of retro-digitization. More information on this training school will be available at the time of the network meeting in July 2014 and will also be published on the website: http://www.elexicography.eu/events/training-schools/.
As part of an STSM researchers visit established dictionary projects and centres of excellence in a country other than their own. The visit period may take from five days to three months and can range from discussions and demonstrations to deeper involvement in the activities of the centres visited, in accordance with the needs of the researchers. Travel and accommodation expenses of approved missions are reimbursed by the network. There is room for at least four STSMs in each year, which brings the total of STSMs to take place within the duration of the project (till October 2017) to sixteen or more.
STSMs are especially meant to allow young researchers to build up their own networks, and they will facilitate and increase the capacity for research in the field of e-lexicography. Tanneke Schoonheim (Netherlands) is appointed as the manager of these missions. For more information on the missions and how to apply, see http://www.elexicography.eu/events/workshops/.

Impact of the network
The network will allow the exchange of knowledge and expertise in the field of e-lexicography. It is open to all scholarly dictionaries in Europe irrespective of their previous experience with e-lexicography. The network will facilitate the coordination, progressive expansion and further standardization of research activity through the work of the four working groups and through close cooperation with the Euralex and e-Lex conferences.
The sustainability of the network will be promoted through the European dictionary portal and through the publication of reports and roadmaps concerning future standards for e-lexicography on the network’s website. This website is intended to become the first source of consultation by lexicographers planning to create new e-dictionaries.
In the European dictionary portal reliable information on the languages of Europe will be made easily available to everyone. The portal will therefore be a trustworthy alternative to the many user-generated dictionaries on the internet. In addition, whereas traditional printed dictionaries are often inaccessible to people with a visual impairment (set in very small type, heavily abbreviated for space-saving reasons) the portal will give visually impaired people easy access to these dictionaries with screen readers.
The development of common standards will save time and money. New e-dictionaries or dictionaries that are due to go online will no longer have to develop their own standards but will be able to refer to the publications of the network. In particular dictionaries of small languages will benefit from such common and easily available standards, as they usually do not have the means to develop their own approaches. The network can therefore substantially contribute to regional or minority languages.
The overall quality of the dictionaries will be improved. The new editorial methods and practices to be developed will reflect more realistically the language situation in Europe and the historical development and interaction of European languages, in particular the migration of words and meanings across Europe. Lexicographers and lexicological researchers, especially young ones, will increase their knowledge and skills by participating in working groups, training schools and conferences. Yvonne Luther (Germany) is appointed as manager for young researchers and/or female researchers to make sure that they get the right chances in the network.
The large amount of data connected in the European dictionary portal will enable new lines of research in the field of digital humanities that could not have been carried out on the basis of isolated language resources, e.g. research into the spread of technological innovation by studying the appearance of relevant words in the vocabularies of the languages in Europe.

Structure of the network
The network is driven by a management committee, comprising an elected chair and vice-chair and up to two representatives from each COST country. The chair is Martin Everaert (Netherlands) and vice-chair is Iztok Kosem (Slovenia). The management committee meets every six months and is responsible for co-ordinating the activities of the working groups, budget planning and the allocation of funds, organising training schools and conferences. It will monitor progress in relation to the scientific focus and work plan in relation to the achievement of milestones.
The steering group is responsible for preparing annual reports on the work of the network, overseeing the development and maintenance of the network website, communicating with the COST office and monitoring its procedures. It comprises the chair and vice-chair of the management committee, the chair and vice-chair of each working group, and the managers for the training schools, STSMs and young/female researchers. The steering group meets every three months, having started in January 2014 in Leiden, followed in April in Vienna, in July in Bolzano and in October in a place yet unknown.
At present the network comprises representatives from 29 COST countries: Austria, Belgium, Bulgaria, Croatia, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Netherlands, Norway, Poland, Portugal, Romania, Serbia, Slovak Republic, Slovenia, Spain, Sweden, Switzerland and the United Kingdom.
Members of the network meet every six months, preferably in combination with other conferences regarding lexicography or computational linguistics, such as Euralex, e-Lex and LREC. On these occasions there will always be meetings of the working groups, the management committee and the steering group, followed by a plenary session for all members on the progress made. The next network meeting is on July 19-20 in Bolzano (Italy), directly following the Euralex conference. The first meeting of 2015 is due in Vienna (Austria) in January or February and the second meeting may be connected to e-Lex 2015 in July in Herstmonceux Castle (UK).
For more information on ENeL, please visit our website: www.elexicography.eu.

European Cooperation in Science and Technology
COST ACTION IS1305
Start of Action: 11 October 2013
End of Action: 10 October 2017

Joint Symposia of JACET Society of English Lexicography and Kansai English Lexicography Circle
30 May 2014
Aichi University, Nagoya

The joint symposia of the JACET Society of English Lexicography and the Kansai English Lexicography Circle (KELC) were organized by Professors Michihisa Tsukamoto and Yoshihito Kamakura from Aichi University in Nagoya. If Japan is counted among lexicographic powerhouses, these organizations are two of the driving forces behind it, contributing to the improvement and promotion of theoretical and practical lexicography.
JACET Society of English Lexicography is a group under the auspices of the Japan Association of College English Teachers. It was co-founded in 1995 by Professors Minoru Murata and Kosei Minamide, and chaired by Professor Murata from 1995 to 2006 and Professor Kaoru Akasu from 2007 to 2013. The society holds an annual workshop (including a joint one with its sister society JACET Vocabulary Acquisition Research Group once every three years), and has played a key role in organizing the two Asialex conferences held in Japan (Tokyo 2003 and Kyoto 2011).
KELC was founded in 2003 with a view to reading literature about lexicography, lexical semantics and [...], and deepening the understanding of recent research findings in these fields of study. Coordinated by Professor Kosei Minamide, the circle meets bi-monthly to discuss the issues and insights raised in selected books, and its members have translated and brought three of these to publication in Japan: Phraseology (Cowie, ed. 2001), Words and Phrases (Michael Stubbs, 2001), and Lexicography: An Introduction (Howard Jackson, 2002).
The regular meeting of the JACET group is in Tokyo and that of KELC is in Kyoto. Located between the two cities, Nagoya was a fine venue for the joint symposia. The meeting began with a special session on ‘Challenges Facing English Dictionary Editors’, with the participation of the chief editors of four major English-Japanese learners' dictionaries, and continued with sessions on ‘EFL Dictionaries: Description and Use’ and ‘Collocation and English-Japanese Dictionaries’. Altogether there were a dozen talks, attended by 80 participants. The symposium included a book display and was followed by a buffet-style party at the university cafeteria.

Shigeru Yamada
Chair, JACET Society of English Lexicography

PASSWORD APPS
Series of semi-bilingual English learners’ dictionary applications, developed by Paragon Software and launched in January 2014 for speakers of: Afrikaans, Arabic, Bulgarian, Catalan, Chinese Simplified, Chinese Traditional, Croatian, Czech, Danish, Dutch, Estonian, Farsi, Finnish, French, German, Greek, Hebrew, Hungarian, Icelandic, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Malay, Norwegian, Polish, Portuguese Brazil, Portuguese Portugal, Romanian, Russian, Serbian, Slovak, Slovene, Spanish, Swedish, Thai, Turkish, Ukrainian, Urdu, Vietnamese.
Available for Android, iOS and Mac OS.

Multilingual Linked Open Data for Enterprises MLODE 2014

The second workshop on Multilingual Linked Open Data for Enterprises (MLODE 2014) will be held as part of the SEMANTiCS conference in Leipzig on 2 September 2014. It will bring together developers, data producers, academia and enterprises from various fields of linguistics, natural language processing (NLP) and information technology to present and discuss principles, case studies and best practices for representing, publishing and linking linguistic data collections, including corpora, dictionaries, lexical networks, translation memories, thesauri, etc.

As Semantic Web research progresses, interest by practitioners, industry and infrastructure providers operating across language barriers increases. The Linked Open Data (LOD) community is enthusiastic about the possibilities offered by new, vast multilingual resources. While it is clear that the Semantic Web is not a panacea, it has matured into a technology capable of addressing specific real-world problems of globalization faced by the industry and the governmental sector.

The first MLODE workshop (2012) was successful in establishing a highly productive interdisciplinary network that enables researchers to share experiences on how to sustainably manage and interlink huge amounts of language data through Semantic Web technologies, providing scientific resources that enhance the development of more precise models and applications for research on Linked Data. A major output of MLODE 2012 was the improvement of the technical viability of the Linguistic Linked Open Data (LLOD) cloud (http://sabre2012.infai.org/mlode/).

The goal of MLODE 2014 is to compare technologies and datasets developed before and in parallel to Linked Data, RDF and OWL, and to offer industrial participants an overview of the technologies ready for exploitation. The workshop will focus on the following topics:
• Unifying the Dictionary
The continuation and centralization of various efforts to compile a multilingual dictionary integrating heterogeneous sources. Open questions on focusing the research and crowd-sourcing a multilingual core lexicon will be tackled with actual corporate use cases, combining the know-how of the assembled research communities and industry professionals associated with the SEMANTiCS conference. In this context, the DBpedia project is eager to contribute to the creation, hosting and maintenance of final resources, guaranteeing persistence, impact and visibility of the outcome.
• Building LLOD-aware NLP services
The development of resources and services, including comprehensive structured metadata, with the final goal of granting effortless programmatic access to multilingual natural language processing services based on open data.
• Generating Linked Data for Language Resources
The continuation of the initiative begun at MLODE 2012 to convert, aggregate and publish Language Resources and extend the LLOD-cloud, which has grown significantly, contributing linguistic data to the NLP service infrastructure (http://linguistics.okfn.org/resources/llod/).

MLODE 2014 also aims to present cases for, and barriers against, industry participation in Linked Data for NLP and content internationalization and localization, discuss best practices on how to channel feedback from industry to open-source and academic communities, and produce a roadmap for Linked Data & Language Technology in Europe.
http://mlode2014.blogs.aksw.org/

Sebastian Hellmann, Bettina Klimek
Universität Leipzig

MLODE organization
LIDER (http://lider-project.eu)
Linked Data for Language Technology W3C Community Group (http://www.w3.org/community/ld4lt/)
Working Group for Open Data in Linguistics (http://linguistics.okfn.org)
K Dictionaries (http://kdictionaries.com)
Ontotext (http://ontotext.com)
Clarin (http://clarin.eu)
Language Science Press (http://langsci-press.org/)
DBpedia Association (http:// .org/Association)
InfAI (http://infai.org)
NLP2RDF (http://nlp2rdf.org)

KDNews 2014

INTERNS @ KD
Zulema Badanes Canet | Eva Prats Balaguer | Miguel Angel Bordas – Universitat Jaume I, Spain
Bettina Klimek – Universität Leipzig, Germany
Dılara Özlem Güneren – Université de Lille 3, France
Kseniya Egorova | Anna Bespyatova – Saint-Petersburg State Economic University, Russia
Alex Milton de Carvalho | Bruna Rafael Neira Munoz – UNESP, Brazil
Louis Albrecht – Université de Lorraine (ATILF CNRS), France

KD in CHINA
The Commercial Press International Company (Beijing) will collaborate with K Dictionaries on dictionary publication in China. The cooperation will be launched with a new edition of Random House Kernerman Webster’s College Dictionary and include a Chinese bilingual pocket dictionary series, starting with French and Japanese.

CONFERENCE SPONSOR
LREC – Reykjavik, Iceland
EURALEX – Bolzano, Italy
MLODE – Leipzig, Germany

Claudia Xatara, Claudia Zavaglia, Rosa Maria da Silva (dirs.). Dicionário Multilíngue de Regência Verbal – Verbos preposicionados

It is a well-known fact that collocations and prepositional regency are the hardest part of learning a foreign language, since the arbitrariness of the linguistic sign also affects the combinability of the signs; besides, theoretical phraseology expanded a long time ago its object of study beyond set phrases and proverbs, extending its interest to light verb constructions, routine formulae, interjections, insults, curses, etc. The compilation of dictionaries has not always duly accompanied this progress, and, in many languages, good dictionaries of collocations are still missing, or, when available, appear much later, in comparison to other types of dictionaries. The fact that the boundaries between what belongs to phraseology and what belongs to syntax are not unanimously accepted is perhaps one of the reasons why, in many languages, not enough attention has been granted until recently to the production of dictionaries of propositional schemes or verbal valences.

Although verbal valency is a syntactic phenomenon, it depends largely on the meaning of each verb, since it cannot be predicted only from formal rules. Such rules may explain the mechanics of “arguments” like subject and object for transitive verbs, but when prepositional complements appear the casuistry explodes. A verb may have figurative meanings that alter the valency completely: e.g. Portuguese dar (to give), which is trivalent par excellence, loses its arguments in some sequences that are neither ditransitive nor idiomatic set phrases. In Spanish, la terraza da al mar (*the terrace gives to the sea, i.e. the terrace overlooks the sea) has no direct object; or dio con la cabeza contra el muro (*he gave with the head against the wall, i.e. he beat his head against the wall) has neither indirect nor direct object. These examples are not necessarily set phrases, although they may undergo a metaphor that integrates them into an idiom, as in Spanish dar en el clavo (*to give in the nail, i.e. to hit the nail). Therefore, the figurative meanings are linked to a prepositional regency as variable and as whimsical as English phrasal verbs, with a certain degree of phraseological fixation that affects not only the seemingly arbitrary union between a given verb and a given preposition, but also the verbal valences, which are no longer the same as in the “literal” verb.

The team led by Brazilian phraseologists Claudia Xatara, Claudia Zavaglia and Rosa Maria da Silva compiled this Dicionário Multilíngue de Regência Verbal – Verbos preposicionados, facing the difficult task of making an inventory of these events, ordering and describing them, comparing each construction in seven languages (Portuguese, English, French, German, Italian, Japanese, Spanish), ordering them alphabetically from the Portuguese version. Starting from 6,000 verbs taken from Borba’s grammatical dictionary (Francisco Da Silva Borba, Dicionário gramatical de verbos do português contemporâneo. São Paulo: Editora UNESP, 1991), and taking information from large general dictionaries, like Aurélio (Aurélio Buarque de Holanda Ferreira, Dicionário Aurélio da Língua Portuguesa. Curitiba: Editora Positivo, 1975, 1999, 2010) or Houaiss (Grande Dicionário Houaiss da Língua Portuguesa. Rio de Janeiro: Instituto Antônio Houaiss, 2001, 2003), data were checked and compared with the Corpus Textual Electrônico do Laboratório de Lexicografia of UNESP (São Paulo State University), gathering 1,200 Brazilian Portuguese words, clarifying and describing their construction regency. The authors’ starting assumption is that prepositional complements are also mandatory arguments of the verb in certain constructions (e.g. simpatizar com *sympathize with, rezar por *pray for), which are called in Portuguese verbos preposicionados (*prepositional verbs). An explanatory paraphrase and a Portuguese definition of each entry are provided, with examples of real usage, as well as equivalents in the six other languages (with the good idea of adding the transliteration into Latin alphabet for Japanese hiragana and katakana characters).

One of the most representative entries is dar (give), which has 35 sub-entries, where the regular valences of the literal meanings of the verb inevitably meet together with the not so literal ones, and with several more or less idiomatic combinations, such as dar com alguém (*give with someone, i.e. to meet someone), equivalent to Spanish and German reflexive constructions requiring another verb (encontrarse con alguien; sich mit jemandem treffen *to find oneself with someone), and to a completely different metaphor in Italian (imbattersi in qualcuno *bump into someone).

Since a foreign language learner cannot know beforehand the boundaries between categories which are still controversial even for specialists, it is clear that a prototypical user of a bilingual dictionary needs a reference work where all these kinds of information are available simultaneously, without knowing in advance whether the searched sequences are syntactic, lexical or phraseological. Since most of the valences involve prepositional complements, and we cannot expect the learner to distinguish a priori between verbal valency and prepositional regency, the dictionary must enable the user to access both phenomena as they occur naturally in speech: “mixed together”.

The Spanish verb consentir (*to allow) is trivalent (A consiente B a C) (*someone consents something to someone), but it is also possible to have (A consiente en B), with no addressee, a construction that is possible also in French but without changing the preposition (consentir à qqch.) while, in Italian, the ellipsis of the addressee requires changing the verb and the preposition at the same time (acconsentire a: permettere a qualcuno di).

In this sense, verbal valences also belong to the field of lexical combinations, and need a detailed lexicographical treatment, including the lexical and prepositional environment of each verb, not only for their literal meaning, but also for their figurative and idiomatic values. The cross-linguistic dimension means the predictable valency in one language may have, in another language, an equivalent whose argument is a prepositional complement instead of a direct object. Besides, each preposition can govern another morpho-syntactic case, thus the possibilities are multiplied.

For example, let’s have a look at dream + name of action: in Italian the dreamed action is represented by a direct object (sogna viaggiare *he dreams travelling), whereas in Spanish and Portuguese it is a prepositional complement of company (soñar con viajar / sonhar com viajar *to dream with travelling) and in French and German there is a genitive construction (rêver de... träumen von *to dream of). Another clear example of this apparent arbitrariness is “falling in love”, which in Spanish requires de (genitive construction: enamorarse de), while English requires with (comitative construction: to fall in love with), German requires in (locative construction: sich verlieben in), and Portuguese requires por (ablative construction: se apaixonar por). In this sense, Spanish andar detrás de N (*to walk after N, i.e. to be looking for (something)) allows us to suspect that the difference between collocations, idioms and phrasal verbs is not as clear and objective as is often thought.

The entries are presented as follows:

DAR (31) POR algo/alguém (perceber a ausência de)
→ quando derem por mim será tarde demais
A: jemandem etwas vermissen
E: echar en falta algo/alguien
F: remarquer l’absence de
In: note the absence of
It.: accorgersi di
J: ga iru (inai) no ni ki ga tsuku

DAR (34) algo POR algo (desfazer-se; vender)
→ só darei o meu Picasso por uma fortuna equivalente à sua beleza
A: etwas für etwas verkaufen
E: vender por
F: donner contre; donner pour
In: to give up for
It.: dare per; vendere per
J: wo to korihiki suru

The applications to natural language processing and/or [...] are also important. For example, the Google Translate tool (http://translate.google.com/, accessed May 2014), though based on [...] and statistics, shows a dramatically wrong result for dar por alguém (*by giving someone) or for Spanish echar en falta a (*hacks to take) instead of to notice the absence of. Results are no better with the WordReference tool (http://wordreference.com/, accessed May 2014). The same can be said about the great majority of such constructions, except when, incidentally, English coincides literally with the Portuguese (or Spanish, etc.) form.

This dictionary is, thus, an excellent tool, not only for foreign learners (and teachers) of the Portuguese language, but also for research purposes in contrastive linguistics in a field that has been a kind of ‘no man’s land’ for too many years: the borderline between syntax, idioms and lexicology.

Antonio Pamies Bertrán
Dept. Lingüística General y Teoría de la Literatura
Universidad de Granada

Dicionário Multilíngue de Regência Verbal
Verbos preposicionados
Claudia Xatara, Claudia Zavaglia, Rosa Maria da Silva (dirs.)
Portuguese - Claudia Xatara | Claudia Zavaglia | Rosa Maria da Silva
English - Peter James Harris
French - Claudia Xatara
German - João Moraes Pinto Junior
Italian - Claudia Zavaglia | Fábio Bertonha | Vivian Orsi
Japanese - Eliza Atsuko Tashito
Spanish - Rosa Maria da Silva
São Paulo: Disal Editora
2013. 351 pages
ISBN: 978-85-7844-150-0

PASSWORD JR
English Dictionary for Speakers of Portuguese at Beginner Level
Martins Editora Livraria
São Paulo, Brazil
June 2014
Translator: Lina Maria Alvarenga
Editor: Érika Nogueira de Andrade Stupiello
396 pages, 130x190x20 mm
ISBN: 978-85-8063-129-9
http://www.martinsfontes-selomartins.com.br/catalogo_det.php?id=1081
Published in collaboration with K DICTIONARIES

Webster’s English Language Dictionary
Philippine Edition
Merriam & Webster Bookstore
Manila, Philippines
June 2014
256 pages, 130x195x18 mm
ISBN: 978-971-30-1343-9
Published in collaboration with K DICTIONARIES

PASSWORD semi-bilingual English learners’ dictionaries on CDO:
French - http://dictionary.cambridge.org/dictionary/english-french/
German - http://dictionary.cambridge.org/dictionary/english-german/
Spanish - http://dictionary.cambridge.org/dictionary/english-spanish/
Published in collaboration with K DICTIONARIES

Reverso Context: Redefining dictionaries and language tools Théo Hoffenberg

The last issue of this publication had a brief article by Ilan Kernerman entitled Dictionary n. Obsolete? (KDN21, 2013). Although the title is certainly provocative, it’s quite clear that today’s dictionaries are not exactly what they used to be. In the same issue Colin McIntosh wrote about the new definition for ‘book’ in Cambridge Advanced Learner’s Dictionary, which focused more on the content than the physical form we used to associate it with (ibid.).

Combining the two approaches, we could say the dictionary as we knew it (i.e. a book consisting of a list of entries indexed in alphabetical order, containing definitions or […], examples of usage, compounds, etc) is probably already obsolete. Nevertheless, we are likely to still use the word ‘dictionary’ to refer to a very different concept, just as we still use ‘telephone’ for something that no longer resembles the large contraption with a rotary dial it represented 30 or 40 years ago.

Soon the word ‘dictionary’ will likely refer to a tool that helps us find the most appropriate choice in a certain context, offering users easy access to meanings and translations of words and phrases, along with relevant examples of usage, etc.

Within this framework, Reverso presents a new approach to translation aids that in a few years might be understood as a dictionary, but for the time being can have different names: a new type of example-based dictionary; a bilingual concordancer; a search engine for large bilingual texts (bitexts in NLP jargon) aligned at word and phrase level; a bilingual aligner providing translation for relatively frequent sequences of words; a provider of frequent wording suggestions and their translations; an analyzing tool that applies linguistic principles to big data; a terminology checker based on balanced corpora.

These descriptions may seem intimidating at first, and may even bring to mind a Rube Goldberg machine, or a white elephant. However, in practice, novices and experts alike find this approach efficient and easy to use. Therefore we call it simply Reverso Context (RC). While the text below offers some insight on how our idea of a ‘dictionary’ works, readers are invited to also experience it firsthand in order to fully appreciate the innovative features of this linguistic tool.

Changes

Dictionaries of previous generations had various limitations. Firstly, the total number of characters had to fit an acceptable volume (say below 2,000 pages for the large ones). Thus, reducing the content size and eliminating redundancy and inconsistencies soon became a huge task that required enormous work from authors and editors looking to fit comprehensive data within restricted space. Secondly, the print was in black and white, occasionally including only one additional color.

The use of such dictionaries demanded the reader’s active participation to interpret signs or symbols (e.g. ~ to replace the headword, -> for cross-reference), as well as to cross-reference the indicators suggested (subject, object type, preposition use, etc) with the actual context in order to choose the most appropriate meaning. Cross-checking in the opposite direction was also in the hands of the user. Moreover, looking up phrases such as not at all, je m’en vais or pas tout de suite often proved to be a difficult task.

At that time, there were no search engines, and intensive users of foreign languages were fewer (though perhaps more motivated). Nowadays, many people are used to search engines and machine translation, hence some laziness or higher expectations on their part. When searching for appropriate vocabulary, modern users expect answers to be instant, precise and varied.

Users are also increasingly used to asking questions in whichever way they come to mind, without rephrasing or adapting to query syntax, and still being able to obtain relevant answers.

In 2000, based on comprehensive dictionaries from Collins, we made a big step by putting computer power to use to enhance the user experience for dictionary look-up: no more ~ to replace the headword or other such abbreviations; use of color to identify components of an entry (blue for source language, black for target language, green for domain indicators, red for grammar, etc); direct access to compounds; and full-text search to find examples in both directions (for example, faire miroiter appears as the translation for dangle even though dangle is not given as a translation for faire miroiter). This feature was initially implemented in our dedicated software environment called Lexibase, which included Collins bilingual and monolingual dictionaries. The software has been updated and is still in use today, available for Internet, intranet and PC, and the same environment has been applied to many more dictionaries since then.

Despite its extensiveness, the content itself was originally designed with the intention of producing a book. This means that the variety of the examples and the coverage of derivations, among other components, were limited, and focus was put more on avoiding redundancy than on expanding coverage.

Dictionaries of this type were not only limited, but also extremely costly to develop, because of the 100% human factor and the strict editorial rules and compactness. As a result, most dictionaries for non-major language pairs, such as French-Arabic or French-Japanese, never reached the comprehensiveness of those for English-Spanish, for example.

But even the largest dictionary content originally designed for a print edition cannot provide the full coverage expected today in terms of examples, derivatives or context, let alone up-to-date vocabulary and technical terms.

Examples

Let’s take some concrete examples of the benefits of this advanced language tool that allows users to communicate in languages that are not their mother tongue.

Suppose a French-speaking person wants to translate je m’en vais into English. Wouldn’t it be nice to type this text in an entry box, and get translation suggestions including examples that use both the searched item and the suggested results in context? What if the same possibilities existed for a Spanish speaker looking to translate me voy? This is precisely what RC is about.

Users these days are accustomed to getting relevant results in the blink of an eye and effortlessly. In this sense, RC caters to “pampered” users that no longer wish to lemmatize such phrases. After all, knowing that je m’en vais stems from s’en aller isn’t obvious, and searching through the sub-entries of aller for the verb’s pronominal or reflexive forms can be tiresome. Additionally, if you happen to be a linguist, you know that important information is often lost through lemmatization, as not all verb forms relate to the original meaning of the root.

This same example can be observed from another viewpoint. Although native speakers’ intuition allows them to know that je m’en vais can have very different meanings according to context, new learners or even proficient non-native speakers may find it difficult to grasp the different nuances of this phrase. In fact, the tone of this expression can range from neutral to aggressive and threatening, and its meaning varies when it precedes a verb, in which case it expresses a will to take action.

Looking up me voy in the large Collins English-Spanish dictionary, for example, one might automatically switch to full-text search to display relevant examples containing this text, but still not find the direct translation of the phrase itself. In addition, RC offers more than 8,000 short texts containing the item, of which over 1,000 are aligned to I’m going and 400 to I’m leaving and I’m off.

RC is also particularly useful for finding examples of usage and translation of phrases that cannot be translated independently. Take for example the phrase shy of + number or quantity, which can be translated as un peu moins de, meaning that a certain amount is less than expected. The examples enable users to find the most suitable expression for each particular situation. The same applies for other words, such as sorted, get sorted, get things sorted, get myself sorted, etc.

For the linguist, RC offers more interesting features, making it possible to identify trends or validate theories and lexicons. It responds to questions such as: What are the most frequent translations for this word or phrase? Which frequently used phrases contain this word? What does this word translate into when not in this phrase?

Taking an example, a quick look-up of upside in RC will show that most examples of usage are related to upside down. A more advanced search provides translations excluding the phrase upside down. Then, if a certain phrase (e.g. upside risks) is too widely represented, it can be excluded from the search. Alternatively, simply looking up an upside will provide translations of upside as a noun.

When translating siège from French to English, words such as seat, siege and headquarters may come to mind. Although one may think that seat is the most generic translation and that headquarters is used mainly as part of the siège social, a search with RC would show that headquarters is by far the most commonly used equivalent. Moreover, the “-” option can be applied in RC (siège -{siège social}) to check if siège is translated into headquarters even when it is not part of siège social.

Non-natives who are proficient in a foreign language often need assistance to validate their choice of words. This process of producing a coherent translation is what linguists describe as encoding, for production purposes.

One way to do it is to use online dictionaries, starting with bilingual dictionaries from the source language into the target language and then searching the definition or synonyms in the target language to find the most appropriate one in context. Another option would be to look up the definition or synonym of the assumed translation and see whether it is appropriate, also using the reverse translation (translating back into the source language).

For example, to translate acharné a French-English dictionary would provide equivalents such as fierce, bitter, relentless and unremitting as first proposals, with only a few examples. To search further, one could look for synonyms for relentless and find ruthless, unrelenting and uncompromising. However, in order to find the best translation, one should have a near-native level of English, or at least read the definitions or the “back translations”. Moreover, if one were to use this adjective as part of travail acharné, results could be surprising. The RC search shows that it is widely used, and that its translation is hard work, although hard is not among the proposed translations or synonyms for acharné.

If the relentless translation is chosen, one could check sentences containing the first one in the source language, and the second one in the target language, finding more than ten relevant examples.

Conclusion

Bilingual concordancers have already proven to answer certain requirements that dictionaries could not meet. This explains why their use has spread in recent years and why RC could profit from being easy for average users to handle while also providing more advanced features.

RC will continue to innovate and push towards the dictionary of the future. First, by improving and diversifying content resources, adding new and varied corpora that encompass diverse fields and language levels, and expanding coverage to both written and spoken language. Second, by introducing new ways to customize the user’s search experience. For example, large organizations that have voluminous corpora are already able to prioritize their content with more pertinent features, and it will also be possible to filter by subject domain, regional variant and language level in the future. Last but not least, RC will strive to maintain high quality when dealing with large data volumes thanks to automatic cleaning scripts as well as processing user feedback. With this, we hope to be to the dictionary what a smartphone is to the old telephone today.

Théo Hoffenberg is CEO of Softissimo Inc. He is an engineer by training (graduate of École Polytechnique in Paris), an entrepreneur by profession (founder of Reverso among others), a designer and architect of NLP tools and content by passion, and an amateur linguist.
[email protected]

Reverso is a leading multiple-language portal offering online translation, definition, spelling, grammar, conjugation and more to over 10 million regular users worldwide. Reverso–Softissimo provides customized solutions including products and services for corporate clients.
http://reverso.net/
http://context.reverso.net/
http://reverso.softissimo.com/

Reverso is adding the following KD titles to its services in 2014:
● monolingual dictionaries for English, French | German | Spanish
● bilingual, bi-directional French dictionaries for Arabic | Dutch | Hebrew | Portuguese Brazil and Portugal | Russian
● phrasal French/Hebrew dictionaries
Published in collaboration with K DICTIONARIES
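
For readers who want to see the idea behind the exclusion search mentioned in the Examples section (siège -{siège social}), the following is a minimal sketch in Python. It is not Reverso's implementation: the sample sentences, the AlignedPair record and the translation_counts function are all hypothetical, and the word alignment (which target words correspond to siège) is assumed to have been computed beforehand. The sketch only shows the principle of counting aligned translations while skipping sentences that contain an excluded phrase.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class AlignedPair(NamedTuple):
    source: str     # sentence in the source language
    target: str     # its translation
    rendering: str  # target words aligned to the query term (assumed precomputed)


# Invented sample data for the query term "siège"; not taken from any real corpus.
CORPUS = [
    AlignedPair("Le siège social se trouve à Lyon.", "The head office is in Lyon.", "head office"),
    AlignedPair("Le siège social a déménagé.", "The headquarters moved.", "headquarters"),
    AlignedPair("Le siège de l'ONU est à New York.", "The UN headquarters is in New York.", "headquarters"),
    AlignedPair("Le siège de la banque est fermé.", "The bank's headquarters is closed.", "headquarters"),
    AlignedPair("Ce siège est très confortable.", "This seat is very comfortable.", "seat"),
    AlignedPair("Le siège de la ville dura un an.", "The siege of the city lasted a year.", "siege"),
]


def translation_counts(query: str, exclude: Iterable[str] = ()) -> Counter:
    """Count aligned renderings of `query`, skipping sentences with an excluded phrase."""
    query = query.lower()
    excluded = [phrase.lower() for phrase in exclude]
    counts = Counter()
    for pair in CORPUS:
        text = pair.source.lower()
        if query not in text:
            continue  # the sentence does not contain the searched item
        if any(phrase in text for phrase in excluded):
            continue  # plays the role of the "-" option: drop this context
        counts[pair.rendering] += 1
    return counts


all_hits = translation_counts("siège")
without_compound = translation_counts("siège", exclude=["siège social"])
print(all_hits.most_common())
print(without_compound.most_common())
```

On this toy data, headquarters remains the most frequent rendering even after sentences containing siège social are excluded, which is the kind of check the “-” option is described as supporting. A real concordancer would of course work on tokenized, aligned corpora rather than on plain substring matching.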

An introduction to iFinger and Clarify Language Service Knut Haga

iFinger is a provider of digital dictionaries integrated into the Microsoft Windows environment. The first version of the iFinger software was released in 2000, with the main goal of offering convenient look-up solutions in high-quality dictionaries. The portfolio varies from glossaries to unabridged monolingual and bilingual dictionaries from HarperCollins, Merriam-Webster, K Dictionaries, Pons and Cappelen Damm, as well as iFinger’s own terminology for the medical, technical and legal domains. Since its inception, over 3.5 million users have accessed this dictionary software from CNET’s Download.com. It enables tailored application for multiple users in the corporate, educational and governmental sectors. At present iFinger has more than 200,000 users in the educational sector in Norway and more than 50,000 users in the corporate and governmental sectors.

Overall, the demand for language services is growing constantly. iFinger aims to meet this demand by developing new cloud-based language tools that will be available as native solutions for all common operating systems. This service is branded as Clarify and is offered for free to the general public, with premium content available through annual subscription to corporate and government users. The initial launch in June 2014 offers a free dictionary service for 30 languages, including 670 dictionaries covering the languages of 3.4 billion people. There are mobile apps for iOS, to be followed soon by apps for Android and Windows Mobile, as well as by new services for Machine Translation and Text To Speech. This entire activity is being transferred from iFinger to the new Clarify service, which is designed for unlimited global growth.

http://clarifylanguage.com
http://ifinger.com

ASIALEX 2015, Hong Kong

The 9th International Conference of ASIALEX will be hosted from 25 to 27 June 2015 by the Hong Kong Polytechnic University.

ASIALEX, the Asian Association of Lexicography, is approaching its eighteenth anniversary. Founded at the Dictionaries in Asia conference at Hong Kong University of Science and Technology in 1997, the association has made great efforts to bring linguists, lexicographers and dictionary users together for the development of lexicography in Asia. It has acted as a catalyst for scholarly cooperation and information-sharing in a wide range of areas: dictionary compilation, dictionary critique, bilingual lexicography, learners’ dictionaries, user study, Asian language study and computational lexicography. The biennial conference is a major event in the development of ASIALEX. The table below records the landmarks.

In this vein, the theme of ASIALEX 2015 is Words, Corpora and Dictionaries: Innovations in reference science, with the following topics:
• dictionary and ELT
• dictionary compilation in a digital era
• the role of corpora in reference science
• dictionary, corpus and Asian languages
• multimedia and multifunction of the dictionary
• dictionary and culture
• terminology, phraseology and neologisms

The papers presented at ASIALEX 2015 will be published in the conference proceedings, and selected papers will be recommended to the association’s refereed journal Lexicography: Journal of Asialex.

Important dates:
• Abstract submission date: 15 February 2015
• Notification of paper acceptance: 30 March 2015
• Deadline for early registration: 30 April 2015
• Deadline for paper submission: 10 May 2015
• Conference dates: 25-27 June 2015

Submission guidelines:
• The medium of the conference is English
• The abstract should not exceed 300 words (excluding references)
• Abstract submission is online at http://asialex2015.engl.polyu.edu.hk/

It is our great honour and pleasure to bring ASIALEX back to its birthplace, Hong Kong, in 2015. Great changes have taken place in lexicography, in the city, in the region and around the world in the last 18 years. We welcome lexicography experts, linguistic educationists, young scholars and dictionary enthusiasts to join us in this regional and international event to witness the growth and changes in lexicography in Asia.

See you in Hong Kong in 2015!
http://asialex2015.engl.polyu.edu.hk/

Dr Lan Li, Convener
Department of English
Hong Kong Polytechnic University
[email protected]

Conference | Year | Venue | Theme | President
Inauguration | 1997 | Hong Kong | Dictionaries in Asia: Research and pedagogical implications | -
1st | 1999 | Guangzhou, China | National experience in lexicography or dictionary compilation and bilingual lexicography | Huang Jianhua (China)
2nd | 2001 | Seoul, Korea | Asian bilingualism and the dictionary | Sangsup Lee (Korea)
3rd | 2003 | Tokyo, Japan | Dictionaries and language learning: How can dictionaries help human and machine learning? | Minoru Murata (Japan)
4th | 2005 | Singapore | Words in Asian cultural contexts | Anne Pakir (Singapore)
5th | 2007 | Chennai, India | Asian lexicography: Retrospect and prospect | V. Jayadevan (India)
6th | 2009 | Bangkok, Thailand | Dictionary in education | Jirapa Vitayapirak (Thailand)
7th | 2011 | Kyoto, Japan | Lexicography: Theoretical and practical perspectives | Zafar Iqbal (Pakistan)
8th | 2013 | Bali, Indonesia | Lexicography and dictionaries in the information age | Yukio Tono (Japan)
9th | 2015 | Hong Kong | Words, corpora and dictionaries: Innovations in reference science | Chu-Ren Huang (Hong Kong)

LEXICOGRAPHY
Journal of ASIALEX
Volume 1, Number 1
June 2014
Editor-in-Chief: Yukio Tono

Opening Statement
• Lexicography in Asia: Its future and challenges | Yukio Tono

Papers
• Sketch Engine: Ten years on | Adam Kilgarriff
• Lexical Markup Framework | Gil Francopoulo and Chu-Ren Huang
• Korean and English ‘dictionary’ questions: What does the public want to know? | Susanne Bae and Hilary Nesi
• “Bottom-up” approach in making verb entries in a monolingual Indonesian learner’s dictionary | Dora Amalia
• Towards improved coverage of Southeast Asian Englishes in the Oxford English Dictionary | Danica Salazar

Research Report
• An analysis of the smartphone dictionary app market | Christopher Winstock and Young-kuk Jeong

Book Reviews
• The Chinese-English Encarta Dictionary (2011) | Lan Li
• The Bloomsbury Companion to Lexicography (2013) | Vincent Ooi

Published by Springer
Journal no. 40607
ISSN: 2197-4292 (print)
ISSN: 2197-4306 (electronic)
http://springer.com/40607

Welcome to our new website
soft launch throughout 2014

http://kdictionaries.com

K DICTIONARIES LTD
Nahum 8 Tel Aviv 63503 Israel | Tel 972-3-5468102 | Fax 972-3-5468103 | [email protected] | http://kdictionaries.com