Real-Time Translator with ASL Interpretation (RTAI)

Total Pages: 16

File Type: PDF, Size: 1020 KB

Real-Time Translator with ASL Interpretation (RTAI)
University of Central Florida
Department of Electrical and Computer Engineering
Final Report, Senior Design I
Dr. Samuel Richie

Group 14 Members:
  • Gustavo Camero, Computer Engineering, [email protected]
  • Luis Hurtado, Electrical Engineering, [email protected]
  • Michael Loyd, Electrical Engineering, [email protected]
  • Jared Spinks, Computer Engineering, [email protected]

Client/Advisor: Dr. Chung Yong Chan, [email protected]

Table of Contents

1. Executive Summary
2. Project Description
  2.1 Project Background
  2.2 Motivation
    2.2.2 GPU based RTAI
    2.2.3 GPU-FPGA based RTAI
  2.3 Goals and Objectives
  2.4 Requirement Specification
  2.5 Marketing Requirements
3. Research Related to the Project
  3.1 Existing Projects and Products
    3.1.1 Google Translate
    3.1.2 Amazon Translate
    3.1.3 Stratus Video
  3.2 Relevant Technologies
    3.2.1 American Sign Language Typography
      3.2.1.1 SignWriting
      3.2.1.2 Si5s
      3.2.1.3 SignFont
      3.2.1.4 Stokoe Notation
      3.2.1.5 Gloss Notation
      3.2.1.6 Extended Linear Stokoe (ELS) Notation
    3.2.2 Machine Learning
      3.2.2.1 Speech to Text
      3.2.2.2 Text to Text Translation
      3.2.2.3 Text-to-Speech
    3.2.3 Processor Technologies
      3.2.3.1 CPU
      3.2.3.2 MCU
      3.2.3.3 FPGA
      3.2.3.4 GPU
      3.2.3.5 GPU versus FPGA
    3.2.4 3D Graphics API Technologies
      3.2.4.1 OpenGL
      3.2.4.2 Direct3D 12
      3.2.4.3 Metal
      3.2.4.4 Vulkan
    3.2.5 Internet Connectivity
    3.2.6 Voltage Regulation
      3.2.6.1 Linear Voltage Regulator
      3.2.6.2 Switching Voltage Regulator
      3.2.6.3 Considerations for Voltage Regulators
    3.2.7 3D Rendering Software Platforms
      3.2.7.1 Unreal Engine 4
      3.2.7.2 Unity
      3.2.7.3 MikuMikuDance
      3.2.7.4 Blender
      3.2.7.5 Godot
  3.3 Components and Part Selections
    3.3.1 Sound System Selection
      3.3.1.1 Amplifier
      3.3.1.2 Speaker
      3.3.1.3 Microphone
    3.3.2 FPGA Selection
      3.3.2.1 Altera Cyclone IV – EP4CE22E22xxx
      3.3.2.2 Spartan 3E – XC3S500E-xPGx208C
      3.3.2.3 Artix 7 – XC7A35T-1CPG236C
    3.3.3 Computer-on-Module (COM) Selection
      3.3.3.1 Nvidia Jetson Nano Developer Board
      3.3.3.2 Nvidia Jetson TX2 Developer Board
      3.3.3.3 ASUS Tinker Board
      3.3.3.4 Raspberry Pi 4 Model B
    3.3.4 Display Selection
      3.3.4.1 10.1" Display & Audio IPS Panel
      3.3.4.2 Sceptre E205W-16003S LED Monitor
      3.3.4.3 UPERFECT 12.3" Touch Monitor
    3.3.5 Controller Selection
      3.3.5.1 IR Remote
      3.3.5.2 RF Remote
      3.3.5.3 Push Button
    3.3.6 Wi-Fi Module Selection
Recommended publications
  • Machine Translation in Post-Contemporary Era
    Title: Machine Translation in Post-Contemporary Era
    Author: Grace Hui Chin Lin, Adjunct Professor, National Changhua University of Education
    Publication date: December 10, 2010
    Conference: 2010 International Conference on TESOL and Translation, Dept. of English Language, Da-Yeh University, Changhua, Taiwan, December 10, 2010 (Proceedings of the International Conference on TESOL and Translation 2010)

    Abstract: This article, which focuses on translation techniques on a personal computer or laptop, reports on progress in artificial intelligence up to 2010. Building on the treatment of machine translation in Yorick Wilks' book Machine Translation: Its Scope and Limits, it presents accessible theoretical frameworks from the perspectives of a practicing translator and of linguistics. Drawing on the author's practical experience as a translator, it also introduces two current and popular translation systems, SYSTRAN and Google, and examines the basic functions and characteristics of computer-assisted translation. More importantly, it investigates the advantages and weaknesses of computer-assisted translation procedures. The study aims to provide an overview of what does and does not perform well in computational linguistics compared with human translation. After reading this commentary-oriented article, the reader will have a basic grasp of the definitions and terminology of the MT field.

    Keywords: Machine Translation, Artificial Intelligence, Computational Linguistics

    Introduction: Machine Translation has been a significant issue in our post-modernized world, where almost everyone is able to use a computer and surf the internet.
  • Study on the Size of the Language Industry in the EU
    Studies on Translation and Multilingualism: The Size of the Language Industry in the EU. European Commission, Directorate-General for Translation, 1/2009. Manuscript completed on 17 August 2009. ISBN 978-92-79-14181-2. © European Commission, 2009. Reproduction is authorised provided the source is acknowledged.
  • An Animated Avatar to Interpret SignWriting Transcription
    An Animated Avatar to Interpret SignWriting Transcription
    Yosra Bouzid and Mohamed Jemni
    Research Laboratory of Technologies of Information and Communication & Electrical Engineering (LaTICE), ESSTT, University of Tunis
    [email protected], [email protected]

    Abstract—People with hearing disability often face multiple barriers when attempting to interact with hearing society. Lower proficiency in reading, writing and understanding the spoken language may be one of the most important reasons. Certainly, if there were a commonly accepted notation system for sign languages, signed information could be provided in written form, but such a system does not exist yet. SignWriting seems at present the best solution to the problem of SL representation, as it was intended as a practical writing system for everyday communication, but it requires mastery of a set of conventions different from those of other transcription systems to become a proficient reader or writer. To provide additional support for deaf signers to learn and use such notation, we […]

    To address this issue, different methods and technologies geared towards deaf communication have been suggested over the last decades. For instance, the use of sign language interpreters is one of the most common ways to ensure successful communication between deaf and hearing persons in direct interaction. But according to a WFD survey in 2009, 13 countries out of 93 do not have any sign language interpreters, and in many of those countries where there are interpreters, there are serious problems in finding qualified candidates. Digital technologies, particularly video and avatar-based systems, contribute, in turn, to bridging the communication gap.
  • Current Perspectives on Linux Accessibility Tools for Visually Impaired Users
    ЕЛЕКТРОННО СПИСАНИЕ „ИКОНОМИКА И КОМПЮТЪРНИ НАУКИ“, БРОЙ 2, 2019, ISSN 2367-7791, ВАРНА, БЪЛГАРИЯ
    ELECTRONIC JOURNAL "ECONOMICS AND COMPUTER SCIENCE", ISSUE 2, 2019, ISSN 2367-7791, VARNA, BULGARIA

    Current Perspectives on Linux Accessibility Tools for Visually Impaired Users
    Radka NACHEVA, University of Economics, Varna, Bulgaria, [email protected]

    Abstract. The development of user-oriented technologies is related not only to compliance with standards, rules and good practices for their usability but also to their accessibility. For people with special needs, assistive technologies have been developed to ensure that they can use modern information and communication technologies. The choice of a particular tool depends mostly on the user's operating system. The aim of this research paper is to study the current state of accessibility software tools designed for the Linux operating system, especially those used by visually impaired people. The specific context of the study's objective is the possibility of such technologies being used by Bulgarian users. The research approach is a content analysis of scientific publications, official documentation of Linux accessibility tools, and legal provisions and classifiers of international organizations. The results of the study are useful to other researchers who work in the area of accessibility of software technologies, including software companies that develop solutions for visually impaired people. For the purposes of the article, several tests are performed with the studied tools, on the basis of which the conclusions of the study are drawn. On the basis of this comparative study of assistive software tools, the main conclusion of the paper is that Bulgarian visually impaired users are limited in working with the Linux operating system because of the lack of Bulgarian language support.
  • Performance Analysis and Optimization Opportunities for NVIDIA Automotive GPUs
    Performance Analysis and Optimization Opportunities for NVIDIA Automotive GPUs
    Hamid Tabani∗, Fabio Mazzocchetti∗†, Pedro Benedicte∗†, Jaume Abella∗ and Francisco J. Cazorla∗
    ∗ Barcelona Supercomputing Center; † Universitat Politècnica de Catalunya

    Abstract—Advanced Driver Assistance Systems (ADAS) and Autonomous Driving (AD) bring unprecedented performance requirements for automotive systems. Graphic Processing Unit (GPU) based platforms have been deployed with the aim of meeting these requirements, being NVIDIA Jetson TX2 and its high-performance successor, NVIDIA AGX Xavier, relevant representatives. However, to what extent high-performance GPU configurations are appropriate for ADAS and AD workloads remains an open question. This paper analyzes this concern and provides valuable insights on this question by modeling two recent automotive NVIDIA GPU-based platforms, namely TX2 and AGX Xavier. In particular, our work assesses their microarchitectural parameters against relevant benchmarks, identifying GPU setups delivering increased performance within a similar cost envelope, or decreasing hardware costs while preserving original performance levels.

    Products such as Renesas R-Car H3 [1], NVIDIA Jetson TX2 [20] and NVIDIA Jetson Xavier [35], [27] have already reached the market building upon GPU technology inherited from the high-performance domain. Automotive GPUs have inherited designs devised for the high-performance domain with the aim of reducing costs in the design, verification and validation process for chip manufacturers. Unfortunately, reusability of high-performance hardware does not consider GPU efficiency in the automotive domain and, to the best of our knowledge, a thorough exploration of the design space for GPUs, where resources are sized with the aim of optimizing specific goals such as performance, has not yet been performed for the automotive domain.
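    To give a flavor of the microbenchmarking such a study involves, the sketch below (not the authors' code) times a convolution-heavy workload on whatever CUDA device is present, the kind of probe one might run on a TX2 or an AGX Xavier to compare setups. It assumes PyTorch is installed; the tensor sizes and iteration counts are arbitrary.

```python
# Hedged GPU microbenchmark sketch (not from the paper): time a
# convolution workload, e.g. on a Jetson TX2 vs. an AGX Xavier.
import time

import torch
import torch.nn.functional as F

def time_conv(batch: int, channels: int, size: int, iters: int = 50) -> float:
    """Average seconds per conv2d call on the available device."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.randn(batch, channels, size, size, device=device)
    w = torch.randn(64, channels, 3, 3, device=device)
    for _ in range(5):                 # warm-up: exclude one-time init costs
        F.conv2d(x, w, padding=1)
    if device == "cuda":
        torch.cuda.synchronize()       # GPU work is async; wait before timing
    start = time.perf_counter()
    for _ in range(iters):
        F.conv2d(x, w, padding=1)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

for size in (56, 112, 224):
    print(f"{size}x{size}: {time_conv(8, 64, size) * 1e3:.2f} ms/iter")
```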
  • Audio Description and Audio Subtitling in a Dubbing Country: Case Studies
    Audio description and audio subtitling in a dubbing country: Case studies
    Bernd Benecke, Bavarian Broadcasting, Munich, Germany

    Abstract: In many European countries foreign films are not dubbed but subtitled. An audio describer has to include all the written subtitles in his script and try to make the description fit in between. Dubbing countries like Spain, Italy and Germany are also used to combining audio description and audio subtitling – for different reasons. This presentation shows how audio subtitling affects the work of describers in a dubbing country like Germany. It will present examples from daily work to show how many different ways are used to deal with the subtitles.

    Introduction: A new focus in the research on audio description is the interaction with the field of audio subtitling. In many European countries, foreign films (mainly with English dialogues) are not dubbed, but rather subtitled. In such a case, the work of the audio describer becomes more complicated, for he has to include all the written subtitles in his script and try to make the description fit in between. In the production process, sometimes more than one narrator is needed to make a distinction between what is subtitle and what is description. From time to time, the describer has to introduce the name of a character being subtitled, in order to make clear who is speaking. However, audio subtitling is not a common practice only in subtitling countries: dubbing countries like Spain, Italy and Germany are also used to combining audio description and audio subtitling.
  • The Impact of Crowdsourcing Post-Editing with the Collaborative Translation Framework
    The Impact of Crowdsourcing Post-editing with the Collaborative Translation Framework
    Takako Aikawa¹, Kentaro Yamamoto², and Hitoshi Isahara²
    ¹ Microsoft Research, Machine Translation Team, [email protected]
    ² Toyohashi University of Technology, [email protected], [email protected]

    Abstract. This paper presents a preliminary report on the impact of crowdsourcing post-editing through the so-called "Collaborative Translation Framework" (CTF) developed by the Machine Translation team at Microsoft Research. We first provide a high-level overview of CTF and explain the basic functionalities available from CTF. Next, we provide the motivation and design of our crowdsourcing post-editing project using CTF. Last, we present the results from the project and our observations. Crowdsourcing translation is an increasingly popular trend in the MT community, and we hope that our paper can shed new light on research into crowdsourcing translation.

    Keywords: Crowdsourcing post-editing, Collaborative Translation Framework.

    1 Introduction. The output of machine translation (MT) can be used either as-is (i.e., raw MT) or for post-editing (i.e., MT for post-editing). Although the advancement of MT technology is making raw-MT use more pervasive, reservations about raw MT still persist, especially among users who need to worry about the accuracy of the translated contents (e.g., government organizations, education institutes, NPOs/NGOs, enterprises, etc.). Professional human translation from scratch, however, is just too expensive. To reduce the cost of translation while achieving high translation quality, many places use MT for post-editing; that is, they use MT output as an initial draft of the translation and let human translators post-edit it.
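    One standard way to quantify the post-editing effort such a project measures is a token-level edit distance between the raw MT output and its post-edited version, as in HTER. The sketch below is a generic illustration of that idea, not the paper's own evaluation code.

```python
# HTER-style post-editing effort: token-level Levenshtein distance
# between raw MT output and the post-edited text, per reference token.
# A generic sketch, not the paper's evaluation code.
def edit_distance(hyp: list[str], ref: list[str]) -> int:
    """Minimum insertions, deletions and substitutions to turn hyp into ref."""
    d = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
    for i in range(len(hyp) + 1):
        d[i][0] = i
    for j in range(len(ref) + 1):
        d[0][j] = j
    for i in range(1, len(hyp) + 1):
        for j in range(1, len(ref) + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # delete from hyp
                          d[i][j - 1] + 1,         # insert into hyp
                          d[i - 1][j - 1] + cost)  # substitute
    return d[len(hyp)][len(ref)]

def hter(mt_output: str, post_edited: str) -> float:
    """Edits per post-edited token; lower means less post-editing effort."""
    hyp, ref = mt_output.split(), post_edited.split()
    return edit_distance(hyp, ref) / max(len(ref), 1)

print(hter("the cat sat in mat", "the cat sat on the mat"))  # 2/6 ≈ 0.33
```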
  • How to Use Google Translate
    HOW TO USE GOOGLE TRANSLATE
    For some ASVAB CEP participants (or their parents), English is a second language. Google Translate is an easy way to instantly translate any webpage using these steps.

    Google Chrome:
    1. Open Google Chrome.
    2. Go to asvabprogram.com.
    3. Right click anywhere on the webpage.
    4. Select Translate from the menu.
    5. Select Options.
    6. On the Translate Language dropdown, select the desired language.
    7. You will see the Google Translate icon in the browser bar, which you can use to manage your translation settings.

    Internet Explorer: Google Translate is available on Internet Explorer version 6 and later. To activate it:
    1. Open Internet Explorer.
    2. Go to Google Toolbar's website (toolbar.google.com), and click the "Download Google Toolbar" button.
    3. Click on "Accept and Install" and the toolbar will be automatically installed on your Internet Explorer.
    4. Click Run or Open in the window that appears.
    5. Enable the toolbar.
    6. Go to asvabprogram.com.
    7. Select More >>
    8. Select Translate.
    9. Then, the translate button will appear at the top of your webpage.
    10. Right click to select the language option.

    Android:
    1. On your Android phone or tablet, open the Chrome app.
    2. Go to a webpage.
    3. To change the language, tap
    4. Tap Translate…

    iPhone: Microsoft Translator is a universal app for iPhone and iPad, and can be downloaded from the App Store for free. Once you've got it downloaded, you can set up the action extension for translating web pages. To activate the Microsoft Translator extension in Safari: […]
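    The steps above all drive browser UI; translation can also be scripted. A minimal sketch, assuming the third-party deep-translator Python package, which is not part of the flyer:

```python
# Programmatic translation sketch; assumes the third-party
# deep-translator package (pip install deep-translator), which is
# not mentioned in the flyer above.
from deep_translator import GoogleTranslator

# Translate a snippet of page text into Spanish.
translated = GoogleTranslator(source="auto", target="es").translate(
    "Google Translate is an easy way to instantly translate any webpage."
)
print(translated)
```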
  • Embed Text to Speech in Website
    Embed Text To Speech In Website

    Uses premium Acapela TTS voices, licensed for use on the web. 21 Best Text to Speech Software 2021 (Free & Paid Online TTS). Add the method to build the complete API endpoint for your request; the example URL represents a call to a Speech instance. Annyang is a JavaScript SpeechRecognition library that makes adding voice commands simple. The speech recognition portion of the Web Speech API allows websites to enable voice interaction. A high-quality, unlimited TTS voice app that runs in your Chrome browser; a tool for creating voice from text or a Google Drive file. Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech computer or speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech. How to Add Text to Speech in WordPress (WPBeginner). The tool accepts both typed and handwritten input. RingCentral Embeddable Voice-to-Text Widget. The Chrome extension lets you highlight text on any webpage to hear it read aloud.
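    The fragments above revolve around adding synthesized speech to web pages. As one concrete, purely illustrative server-side approach, the sketch below exposes a /tts endpoint that renders text to WAV; it assumes the Flask and pyttsx3 packages, neither of which is named in the excerpt.

```python
# Illustrative server-side TTS endpoint for a website; assumes
# Flask and pyttsx3 (pip install flask pyttsx3), which are not
# mentioned in the source excerpt.
import tempfile

import pyttsx3
from flask import Flask, request, send_file

app = Flask(__name__)

@app.route("/tts")
def tts():
    text = request.args.get("text", "")
    if not text:
        return "missing ?text= parameter", 400
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        out_path = f.name
    engine = pyttsx3.init()
    engine.save_to_file(text, out_path)  # queue synthesis to a WAV file
    engine.runAndWait()                  # block until the file is written
    return send_file(out_path, mimetype="audio/wav")

if __name__ == "__main__":
    app.run(port=5000)
```

    A page can then embed the result with an <audio src="/tts?text=Hello"> element.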
  • A Simplified Overview of Text-To-Speech Synthesis
    Proceedings of the World Congress on Engineering 2014 Vol I, WCE 2014, July 2-4, 2014, London, U.K.

    A Simplified Overview of Text-To-Speech Synthesis
    J. O. Onaolapo, F. E. Idachaba, J. Badejo, T. Odu, and O. I. Adu

    Abstract — Computer-based Text-To-Speech systems render text into an audible form, with the aim of sounding as natural as possible. This paper seeks to explain Text-To-Speech synthesis in a simplified manner. Emphasis is placed on the Natural Language Processing (NLP) and Digital Signal Processing (DSP) components of Text-To-Speech systems. Applications and limitations of speech synthesis are also explored.

    Index Terms — Speech synthesis, Natural Language Processing, Auditory, Text-To-Speech

    I. INTRODUCTION
    A Text-To-Speech System (TTS) is a computer-based system that automatically converts text into artificial human speech [1]. Text-To-Speech synthesizers do not play back recorded speech; rather, they generate sentences […] open-source software while SpeakVolumes is commercial. It must however be noted that Text-To-Speech systems do not sound perfectly natural, due to audible glitches [4]. This is because speech synthesis science is yet to capture all the complexities and intricacies of human speaking capabilities.

    II. MACHINE SPEECH
    Text-To-Speech system processes are significantly different from live human speech production (and language analysis). Live human speech production depends on complex fluid mechanics, dependent on changes in lung pressure and vocal tract constrictions. Designing systems to mimic those human constructs would result in avoidable complexity. In general terms, a Text-To-Speech synthesizer comprises two parts: the Natural Language Processing (NLP) unit and the Digital Signal Processing (DSP) unit.
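    To make the two-part NLP/DSP split concrete, here is a deliberately toy sketch (not from the paper): the "NLP" stage normalizes text into a pseudo-phone sequence via an invented lookup table, and the "DSP" stage renders each pseudo-phone as a sine tone in a WAV file. Real synthesizers replace both stages with far richer models.

```python
# Toy illustration of the two-part TTS structure (NLP unit + DSP unit).
# The phone inventory and frequencies are invented for illustration only.
import math
import struct
import wave

PHONE_FREQS = {"a": 440.0, "e": 520.0, "i": 620.0, "o": 350.0, "u": 300.0}

def text_to_phones(text: str) -> list[float]:
    """'NLP' stage: normalize case and keep symbols in the toy inventory."""
    return [PHONE_FREQS[c] for c in text.lower() if c in PHONE_FREQS]

def phones_to_wav(freqs: list[float], path: str, rate: int = 16000) -> None:
    """'DSP' stage: render each pseudo-phone as a 150 ms sine tone."""
    frames = bytearray()
    for freq in freqs:
        for n in range(int(0.15 * rate)):
            sample = int(12000 * math.sin(2 * math.pi * freq * n / rate))
            frames += struct.pack("<h", sample)   # 16-bit little-endian PCM
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(bytes(frames))

phones_to_wav(text_to_phones("audio"), "toy_tts.wav")
```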
  • A Low-Cost Deep Neural Network-Based Autonomous Car
    DeepPicar: A Low-cost Deep Neural Network-based Autonomous Car
    Michael G. Bechtel†, Elise McEllhiney†, Minje Kim⋆, Heechul Yun†
    † University of Kansas, USA. {mbechtel, elisemmc, [email protected]
    ⋆ Indiana University, USA. [email protected]

    Abstract—We present DeepPicar, a low-cost deep neural network based autonomous car platform. DeepPicar is a small scale replication of a real self-driving car called DAVE-2 by NVIDIA. DAVE-2 uses a deep convolutional neural network (CNN), which takes images from a front-facing camera as input and produces car steering angles as output. DeepPicar uses the same network architecture—9 layers, 27 million connections and 250K parameters—and can drive itself in real-time using a web camera and a Raspberry Pi 3 quad-core platform. Using DeepPicar, we analyze the Pi 3's computing capabilities to support end-to-end deep learning based real-time control of autonomous vehicles. We also systematically compare other contemporary embedded computing platforms using the DeepPicar's CNN-based real-time control workload.

    […] task may be directly linked to the safety of the vehicle. This requires a high computing capacity as well as the means to guarantee the timings. On the other hand, the computing hardware platform must also satisfy cost, size, weight, and power constraints, which require a highly efficient computing platform. These two conflicting requirements complicate the platform selection process, as observed in [25]. To understand what kind of computing hardware is needed for AI workloads, we need a testbed and realistic workloads. While using a real car-based testbed would be most ideal, it is not only highly expensive, but also poses serious safety […]
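    The "9 layers, 27 million connections and 250K parameters" above refers to NVIDIA's published DAVE-2 network (a normalization layer followed by five convolutional and several fully connected layers). The PyTorch sketch below reproduces the commonly cited layer sizes for DAVE-2's 66x200 RGB input, with the normalization layer omitted; it is an illustration, not the DeepPicar authors' code.

```python
# DAVE-2-style steering network: a sketch with the commonly cited
# layer sizes, not the DeepPicar authors' exact implementation.
import torch
import torch.nn as nn

class DaveLikeNet(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ReLU(),   # 66x200 -> 31x98
            nn.Conv2d(24, 36, 5, stride=2), nn.ReLU(),  # -> 14x47
            nn.Conv2d(36, 48, 5, stride=2), nn.ReLU(),  # -> 5x22
            nn.Conv2d(48, 64, 3), nn.ReLU(),            # -> 3x20
            nn.Conv2d(64, 64, 3), nn.ReLU(),            # -> 1x18
        )
        self.fc = nn.Sequential(
            nn.Flatten(),                    # 64 * 1 * 18 = 1152 features
            nn.Linear(1152, 100), nn.ReLU(),
            nn.Linear(100, 50), nn.ReLU(),
            nn.Linear(50, 10), nn.ReLU(),
            nn.Linear(10, 1),                # single steering-angle output
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.conv(x))

net = DaveLikeNet()
print(sum(p.numel() for p in net.parameters()))   # ~252K, matching "250K"
angle = net(torch.randn(1, 3, 66, 200))           # one camera frame in
```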
  • A Human-Editable Sign Language Representation for Software Editing—And a Writing System?
    A human-editable Sign Language representation for software editing—and a writing system?
    Michael Filhol, [email protected]
    LIMSI, CNRS, Université Paris Saclay, Orsay, France
    arXiv:1811.01786v1 [cs.CL] 5 Nov 2018

    Abstract. To equip SL with software properly, we need an input system to represent and manipulate signed contents in the same way that everyday software allows us to process written text. Refuting the claim that video is good enough a medium to serve the purpose, we propose to build a representation that is editable, queryable, synthesisable and user-friendly—we define those terms upfront. The issue being functionally and conceptually linked to that of writing, we study existing writing systems, namely those in use for vocal languages, those designed and proposed for SLs, and more spontaneous ways in which SL users put their language in writing. Observing each paradigm in turn, we move on to propose a new approach to satisfy our goals of integration in software. We finally open the prospect of our proposition being used outside of this restricted scope, as a writing system in itself, and compare its properties to the other writing systems presented.

    1 Motivation and goals. The main motivation here is to equip Sign Language (SL) with software and foster implementation, as available tools for SL are paradoxically limited in such digital times. For example, translation-assisting software would help respond to the high demand for accessible content and information. But equivalent text-to-text software relies on source and target written forms to work, whereas similar SL support seems impossible without major revision of the typical user interface.