Motivations, Challenges, and Perspectives for the Development of a Deep Learning based Automatic Speech Recognition System for the Under-resourced Ngiemboon Language Patrice A. Yemmene Laurent Besacier School of Engineering Laboratoire Informatique de Grenoble University of Saint Thomas, MN, USA University of Grenoble, France
[email protected] [email protected] Abstract systems for automatic speech recognition were mostly guided by the theory of acoustic-phonetics, which Nowadays, a broad range of speech recognition describes the phonetic elements of speech (the basic technologies (such as Apple Siri and Amazon sounds of the language) and tries to explain how they Alexa) are developed as the user interface has are acoustically realized in a spoken utterance” (Juang become ever convenient and prevalent. Machine learning algorithms are yielding better training and Rabiner, 2005). These efforts date back to the early results to support these developments in 50s. Since then, ASR has yielded incredible Automatic Speech Recognition (ASR). development in a broad range of commercial However, most of these developments have been technologies where Speech Recognition as the user in languages with worldwide, political, economic interface has become ever useful and pervasive. and/or scientific influence such as English, However, most of these developments have been in Japanese, German, French, and Spanish, just to languages with strong scientific, political, and/or name a few. On the other hand, there has been economic influences such as English, German, French, little or no development of ASR systems (or and to some extent Japanese and Spanish, just to name a language technologies) in most minority and few.