Interactive Editing in French Sign Language Dedicated to Virtual
Total Page:16
File Type:pdf, Size:1020Kb
Interactive Editing in French Sign Language Dedicated to Virtual Signers: Requirements and Challenges Sylvie Gibet, François Lefebvre-Albaret, Ludovic Hamon, Rémi Brun, Ahmed Turki To cite this version: Sylvie Gibet, François Lefebvre-Albaret, Ludovic Hamon, Rémi Brun, Ahmed Turki. Interactive Edit- ing in French Sign Language Dedicated to Virtual Signers: Requirements and Challenges. Universal Access in the Information Society, Springer Verlag, 2016, 15 (4), pp.525-539. 10.1007/s10209-015- 0411-6. hal-01367723 HAL Id: hal-01367723 https://hal.archives-ouvertes.fr/hal-01367723 Submitted on 9 Dec 2018 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Noname manuscript No. (will be inserted by the editor) Interactive Editing in French Sign Language Dedicated to Virtual Signers: Requirements and Challenges Sylvie Gibet · Fran¸coisLefebvre-Albaret · Ludovic Hamon · R´emiBrun · Ahmed Turki Received: date / Accepted: date Abstract Signing avatars are increasingly used as an motion database, (v) the virtual avatar animation by interface for communication to the deaf community. editing and composing motion segments, and (vi) the In recent years, an emerging approach uses captured conception of a dedicated user interface according to data to edit and generate Sign Language (SL) gestures. user knowledge and abilities. Each step is illustrated by Thanks to motion editing operations (e.g. concatena- the authors' recent work and results from the project tion, mixing, etc.), this method offers the possibility Sign3D, i.e. an editing system of French Sign Language to compose new utterances, thus facilitating the en- (LSF) content (http://sign3d.websourd.org/sltat). richment of the original corpus, enhancing the natu- ral look of the animation, and promoting the avatar's Keywords Signing Avatar · Interactive Editing · acceptability. However, designing such an editing sys- Data-Driven Synthesis tem raises many questions. In particular, manipulating existing movements does not guarantee the semantic consistency of the reconstructed actions. A solution is 1 Introduction to insert the human operator in a loop for construct- ing new utterances, and to incorporate within the ut- New communication technologies, such as mobile phones terance's structure constraints that are derived from with integrated video, MMS messaging, videoconferenc- linguistic patterns. This article discusses the main re- ing devices, internet and social networks, have been de- quirements for the whole pipeline design of interactive veloped in recent years to improve Deaf accessibility to virtual signers, including: (i) the creation of corpora, various media. These new assistive technologies usually (ii) the needed ressources for motion recording, (iii) contain pre-recorded videos adapted to a visual-gesture the annotation process as the heart of the SL editing communication channel, in which a human signer per- process, (iv) the building, indexing and querying of a forms LSF messages. However, the video does not fulfill all the functions of writing languages: it does not allow Sylvie Gibet and Ludovic Hamon anonymity and sometimes imposes stringent recording IRISA, Universit´ede Bretagne Sud requirements. Furthermore, video lacks flexibility, and Campus de Tohannic, BP 573 56017, Vannes, France operations such as copy / paste fail to deal with tran- E-mail: [email protected], [email protected] sitions in the context of editing new utterances. Fran¸coisLefebvre-Albaret One of the current assistive technologies is centered Websourd on the generation of signed languages using virtual hu- 99, route d'Espagne, Bat. A mans, or avatars. The use of avatars allows more ad- 31100 Toulouse, France E-mail: [email protected] vanced manipulations of SL statements, mostly because the possibilities of content creation with avatars through R´emiBrun and Ahmed Turki Mocaplab editing operations are far more advanced and flexible. 5, cit´eRiverin, 75010 Paris, France Importantly, 3D animation presents an opportunity to E-mail: [email protected], provide on-the-fly access to otherwise inaccessible con- [email protected] tent, thus facilitating the comprehension of the signs. 2 Sylvie Gibet et al. Avatars also allow for the anonymity of the interlocu- tion 6 discusses the editing and composing operations tor, and can be personalized along with the user's will. of the animation system, while section 7 is centered on Most signing avatar projects adopt synthetic an- the requirements for a dedicated user's interface. Fi- imation techniques for their virtual agents. Some of nally a discussion and some perspectives conclude this them use keyframe techniques to produce high quality article. animations [1], or pure synthesis methods to generate virtual avatar animation thanks to inverse kinematic 2 Related Works techniques [34,31]. However, these methods have been proven to be tedious for building new signs or utter- Among recent methods and technologies for editing sign ances. In addition, the resulting movements may fail in language and animating signing avatars, two major classes convincing audiences with their credibility and expres- of approaches can be distinguished: specifying sign lan- sivity, thus leading to a poor acceptability of the avatar guage associated to pure synthesis animation techniques, by the deaf community. The authors argue that using or editing movement for data-driven animation tech- captured motion on real signers might help to under- niques. In order to produce understandable and accu- stand the main features of the sign generation mech- rate animations, both approaches require a fine com- anisms, and reinforce the virtual avatar acceptability prehension of the linguistic mechanisms underlying the for the deaf community. Furthermore, such data-driven production of signed utterances. approaches go beyond the production of realistic anima- tions: they avoid the tedious task of computing all the joint values of the skeleton for an animation sequence, 2.1 From Specifying Sign Language Utterances to and make possible the use of pre-recorded motion, thus Procedural Synthesis capturing the relevant key-postures and preserving the dynamics of the original motion. Furthermore, compos- Different linguistic stages for synthesis mediated by an ing new motions from editing captured motion can pro- avatar can be considered, including syntactic, semantic, vide new ways of manipulating, transporting, or visu- morphological, and phonetic issues. Until recently, sign- alizing sign language utterances that are not contained ing avatars have focused on the generation of signing ut- in the original database. terances given a phonetic or phonological description of This paper focuses on a fully data-driven virtual the sign sequence, with methods ranging from notation signer. Some important issues have already been stud- systems to formalized scripts or dedicated gestural lan- ied and implemented in previous works (see for ex- guages. Initial studies on sign languages formed early ample the whole pipeline production system described description / transcription systems, as for example the and evaluated in [20]). The goal here is to consider HamNoSys notation [41]. One of the first virtual sign- the knowledge and skills acquired in previous works, ers was based on a quantified description of the signing both theoretically and practically, and extract from this space, associated to a phonological description of hand- knowledge useful information for the development of arm movements [17]. Starting from HamNoSys as initial future applications; more specifically, previous works input description, a similar XML-based notation lan- are generalized and the main requirements and chal- guage called SigML has recently been developed [31, lenges for building an interactive editing system from 13]. This language has been recently extended by in- pre-recorded motion captured data are discussed. Each troducing the P DT S segmental model of Johnson and step is illustrated with current results related to the Liddell to provide greater flexibility and precision of Sign3D project [4], which aims to build a range of in- control [21]. A sign language animation tool also uses in- novative tools for storing and processing high-definition verse kinematics to synthesize an animation from a list LSF content. Section 2 describes two major approaches of hand location targets [27]. Other researches fall into to generate new sign language (SL) utterances: the textual- the category of linguistic modeling for sign language based SL specifying approach dedicated to pure syn- generation. Among these studies, higher-level language thesis techniques and the editing operations applied to processing techniques are used for the formation of signs data-driven animations. The creation of an appropriate (e.g [38,40,25,16,14]). corpus and the needed ressources to record SL gestures Several major challenges are still to be faced as re- are exposed in section 3. At the heart of the SL editing gards the synthesis of