From VLibras to OpenSigns: Towards an Open Platform for Machine Translation of Spoken Languages into Sign Languages
Tiago Araújo · Rostand Costa · Manuella Lima · Guido Lemos

June, 2018

Digital Video Applications Lab (LAVID), Informatics Center (CI) - Federal University of Paraíba (UFPB)
E-mail: fmaritan,rostand,manuella,[email protected]

1 Introduction

The World Health Organization estimates that approximately 466 million people worldwide have some level of hearing loss [31]. In Brazil, according to the 2010 census of the Brazilian Institute of Geography and Statistics (IBGE), there are approximately 9.7 million Brazilians with some type of hearing loss, representing around 5.1% of the population [13].

This significant part of the population faces several challenges in accessing information, since it is generally available in written or spoken language. The main problem is that most deaf people are not proficient in reading and writing the spoken language of their country. One possible explanation is that these languages are based on sounds [25]. A study carried out in 2005 with Dutch deaf persons aged 7 to 20 found that only 25% of them had a reading ability equal to or greater than that of a 9-year-old child without disability [30].

One of the reasons for this difficulty is that the deaf communicate naturally through sign languages (SL), and spoken languages are just a "second language". Each SL is a natural language, with its own lexicon and grammar, developed by each deaf community over time, just as each non-deaf community develops its spoken languages. Thus, there is no single SL. Although there are some similarities between all these languages, each country usually has its own, and some have more than one: by 2013, over 137 sign languages had already been cataloged around the world [4].

In order to give deaf people adequate access to information, one solution is to translate/interpret spoken content into the associated SL. However, given the volume and dynamism of information in some environments and platforms, such as the Web, where a large amount of content is published daily, performing this task with human interpreters is very difficult. In the context of Digital TV, support for sign languages is generally limited to a window with a human sign language interpreter, displayed over the video program. This solution has high operational costs for the generation and production of content (cameras, studio, staff, among others) and requires full-time human interpreters, which ends up restricting its use to a small portion of TV programming.

To address this question pragmatically, one of the most promising approaches is the use of machine translation (MT) tools from spoken languages into SLs.

In proportion to the number of SLs, there are also several parallel initiatives to develop machine translation tools for SLs, usually focused on a single language/country [11,17,24]. Most of these text-to-sign^1 machine translation tools, although developed independently in their respective countries, have similarities in approach, scope, and architecture. In general, the basic functionalities are present in some form in most of them. Some examples are the extraction of the text to be translated from audio and subtitles, the generation of a sign language video, the incorporation of sign language videos into the original videos (e.g., on Digital TV), dactylology, and the rendering of signs by plugins and mobile applications. There are also similarities in the structure and behavior of components, such as APIs and backends for communication, translation and control.

^1 In this paper, we use the term text-to-sign to refer to the translation of texts from spoken languages into sign languages.

The main points of variation are usually the specific machine translation mechanism and the sign language dictionary (the visual representation of signs). Considering the use of avatars to represent content in sign language, the process of creating the visual representation of signs is usually similar (e.g., a set of animations) and generally depends on the allocation of financial and human resources, regardless of the technology used.

To reduce this problem, the objective of this paper is to propose an open, comprehensive and extensible platform for text-to-sign translation in various usage scenarios and countries, including Digital TV. In the proposed platform, the common components share generic functionalities, including the creation and manipulation of the SL dictionaries. Only the translation mechanism and the dictionary itself are interchangeable, being specific to each SL. To accelerate development, we used the Suíte VLibras^2 tools and components as a basis [2].

^2 The Suíte VLibras is the result of a partnership between the Brazilian Ministry of Planning, Development and Management (MP), through the Information Technology Secretariat (STI), and the Federal University of Paraíba (UFPB). It consists of a set of tools (text, audio and video) for the Brazilian Sign Language (Libras), making computers, mobile devices and Web platforms accessible to the deaf. Currently, VLibras is used on several governmental and private sites, among them the main sites of the Brazilian government (brasil.gov.br), the Chamber of Deputies (camara.leg.br) and the Federal Senate (senado.leg.br). Further information can be obtained from http://www.vlibras.gov.br.

Our proposal is that concentrating efforts and resources around a single solution can provide some cutting-edge gains, such as the definition of standards for the industry and greater functional flexibility for the common components, and can also allow advances in the state of the art, such as sharing techniques and heuristics among translation mechanisms.

A single standardized platform with centralized processing of multiple sign languages can also serve as a catalyst for more advanced translation services, such as incorporating text-to-text conversion (in this paper, we use the term text-to-text to refer to the translation of texts between spoken or written languages). Thus, we can integrate available translation mechanisms between spoken languages to allow deaf people in Brazil or Spain to understand a text in English, for example.

Another contribution is to leverage the emergence of a common core machine translator that can be extended/adapted to other languages and regionalisms. Reducing the effort to make a new SL available may further enhance digital inclusion and accessibility in technologies such as Digital TV, the Web and Cinema, especially in the poorest countries.

2 Machine Translation Platforms for Sign Languages

2.1 Sign Languages

People with hearing impairment communicate through formal gestural languages, called Sign Languages (SL). SLs are languages that use gestures and facial and body expressions, instead of sounds, in communication. They have their own linguistic system that is independent of spoken languages and effectively fulfills human communication needs, because they have the same complexity and expressiveness as spoken languages [21].

SLs have properties common to other human languages [21], such as:

- Flexibility and Versatility: SLs offer several possibilities of use in different contexts;
- Arbitrariness: the word is arbitrary because it is always a convention recognized by the speakers; sign languages also have words where there is no relation between form and meaning;
- Discontinuity: minimal differences between words and their meanings are discontinued through the distribution they present at different linguistic levels;
- Creativity/Productivity: there are infinite ways of expressing the same idea with different rules;
- Dual articulation: human languages have smaller units of articulation, without meaning, that combine with others to form units of meaning;
- Standard: there is a set of rules shared by a group of people;
- Structural Dependency: elements of the language cannot be combined at random; there is a structural dependence between them.

Generally, each country has its own sign language. The Brazilian Sign Language (Libras), Portuguese Gestural Language (LGP), Angolan Sign Language and Mozambican Sign Language (LMS) are the SLs of Brazil, Portugal, Angola and Mozambique, respectively, just to name a few countries with the same oral linguistic base (i.e., Portuguese). As in spoken languages, there are also variations within a sign language itself, caused by regionalisms and/or other cultural differences.

It is relatively common to assume that sign languages are signed versions of their respective spoken languages. However, although there are similarities, SLs are autonomous languages, possessing singularities that distinguish them from spoken languages and from other SLs [21].

The relevant cultural differences that impact the modes of environmental representation are reflected in considerable differences between sign languages.

2.2 Machine Translation Platforms

Machine translation systems for sign languages are generally divided into three main classes: Rule-Based Machine Translation (RBMT), Statistical Machine Translation (SMT) and Example-Based Machine Translation (EBMT) [26].

[...] machine translation services for the other components and also hosts the repository of 3D models of the Libras signs that are used by the avatar to render the accessible content after the translation. Currently, the Signs Dictionary of the Suíte VLibras has around 13,500 modeled signs, one of the largest bases of its kind in the world.

Finally, there is WikiLibras, a Web tool for the collaborative modeling of signs in Libras, which allows volunteers to participate in building and expanding the signs dictionary by specifying the movements of each sign.

3 Open Signs: A Proposal of a Multilingual