Assistive Technologies

Revista Informatica Economică nr. 2(46)/2008 135

Ion SMEUREANU, Narcisa ISĂILĂ
Academy of Economic Studies, Bucharest

Speech recognition and speech synthesis hold a special place among assistive technologies, because they can be used by many different users, including persons with visual, language or mobility disabilities. Software developers have been concerned with speech recognition and text-to-speech for many years, as the field of informatics undergoes great changes and accessibility has become the main requirement in creating assistive software applications.
Keywords: speech recognition, text-to-speech, assistive technologies, accessibility.

Introduction
For more than a decade, the computer has asserted itself as an indispensable instrument for persons with disabilities, offering them a new perspective and a new way of living. For these users, the products on the market offer additional accessibility to the computer and are designed for each type of disability. For example, assistive technology for persons with visual disabilities includes the following products: screen enlargers, screen readers, screen review utilities, speech recognition systems, speech synthesizers, refreshable Braille displays, Braille embossers, text readers and word prediction programs. Assistive products for persons with mobility disabilities include speech recognition systems, on-screen editing programs driven by alternative input devices (sip-and-puff switches, sticks, joysticks, trackballs), alternative keyboards, keyboard filters for editing and touch screens. Persons with learning impairments can use word prediction programs, reading comprehension programs, speech synthesizers and speech recognition programs.

Many accessibility features are offered by the Windows operating system or the Office package, which ease access to different elements using the keyboard or the mouse. Within the operating system, notable accessibility is offered by the Magnifier utility (for enlarging parts of the screen), Screen Review (which reads information from the screen aloud) and Narrator (which helps the user who works with programs from Control Panel, NotePad, WordPad or the Internet Explorer browser). Control Panel also provides many settings, through the Accessibility Options application, regarding contrast, filter options, colors or navigation, all meant to increase the accessibility of each element of the application used.

The Office package includes accessibility features such as zoom, contrast between graphic elements, page setup options that make document content easier to see (using Reading Layout), and the development of accessible Web pages (Microsoft FrontPage) by adding text to images, formatting styles and creating image maps.

1. Speech Recognition Systems
Speech recognition systems, also called voice recognition programs, let users enter data by voice instead of the keyboard or mouse.
The main characteristics of a speech recognition system are:
- the vocabulary size (the number of words recognized)
- isolated or continuous speech
- the noise conditions
- the number of speakers
- the recognition rate
- the processing time (on-line, delayed, off-line)
- the area of applicability
Speech recognition is a complex process built from several demanding components. One part of the system, the physical part, converts the sound into an electrical signal and conditions it for input to the next part. The second part, the logical part, is the computer with a sound card and the software needed for all required processing.

[Figure: block diagram. Block labels recoverable from the figure: Acoustic signal, AMP, FTJ, CAN, Pre-processing, Extract the parameters, Coding, Recognize, Database, Unit recognized.]
Fig.1. General structure of a speech recognition system
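The signal chain in Fig. 1 can be illustrated with a small sketch. It is not the authors' implementation: it assumes that AMP denotes an amplifier, FTJ a low-pass filter and CAN an analog-to-digital converter, and it uses short-time energy and zero-crossing counts as stand-ins for the extracted parameters. All function names, the gain, the filter coefficient and the frame length are hypothetical choices made for illustration.

```python
import math

def amplify(signal, gain):
    """AMP stage (assumed): scale the signal by a constant gain."""
    return [gain * x for x in signal]

def low_pass(signal, alpha):
    """FTJ stage (assumed low-pass): single-pole smoothing, 0 < alpha <= 1."""
    out, y = [], 0.0
    for x in signal:
        y = y + alpha * (x - y)
        out.append(y)
    return out

def quantize(signal, bits):
    """CAN stage (assumed A/D): clip to [-1, 1], round to signed integers."""
    top = 2 ** (bits - 1) - 1
    return [round(max(-1.0, min(1.0, x)) * top) for x in signal]

def frame_parameters(samples, frame_len):
    """Parameter extraction: (energy, zero crossings) per non-overlapping frame."""
    params = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        params.append((energy, crossings))
    return params

def front_end(signal, gain=2.0, alpha=0.5, bits=8, frame_len=80):
    """Run the stages in sequence: amplify, filter, digitize, extract."""
    digital = quantize(low_pass(amplify(signal, gain), alpha), bits)
    return frame_parameters(digital, frame_len)

# Synthetic input: 160 samples of silence followed by 160 samples of a 200 Hz tone.
rate = 8000
silence = [0.0] * 160
tone = [0.4 * math.sin(2 * math.pi * 200 * n / rate) for n in range(160)]
features = front_end(silence + tone)
```

The silent frames yield zero energy, while the tone frames yield clearly higher energy, which is the kind of contrast a later recognition stage can use to locate speech in the signal.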

To understand human speech, every speech recognition system relies on four closely connected components:
1) Dividing speech into words, the process used by the speech recognition engine, which determines how accurate recognition is. The methods are:
a) discrete speech, which requires a short pause after each pronounced word so that the system can find the beginning and the end of the word. The advantage is the low computing power needed, but the system becomes unusable if the pauses between words are not respected.
b) the vocabulary word-identification method, which lets the user speak naturally, without pauses between words, but the system may misrecognize the utterance if some of the words used are not in the vocabulary.
c) the continuous speech method, which offers the best recognition accuracy by processing every pronounced word. Because the system has no markers separating the words, it needs more time to find the beginning and the end of every word.
2) The vocabulary, or word list, that the speech recognition system can recognize at a given moment. A rich vocabulary improves speech recognition, but increasing the vocabulary size does not guarantee better accuracy if many words have the same meaning.
3) Finding words, which means searching the voice database, establishing the connections between entries and transcribing the audio signal. There are two methods:
a) whole-word identification, which searches the database for the word that fits the audio signal. This requires fewer search paths, but the system must hold a template for every word, which burdens its use.
b) finding phonemes in the dictionary of the speaker's language. The advantage is the reduced storage space; the disadvantage is the greater computing power required.
4) Speaker dependence, the main element in designing and implementing a speech recognition system. The system can be:
a) speaker-independent, in which case considerable resources are consumed to handle every dialect of human speech.
b) speaker-dependent, which uses minimal resources but requires training the system for a few hours, after which accuracy is ninety percent. Users with mobility disabilities prefer these systems because they are easy to use.
c) speaker-adapted, meaning that the system is trained for one particular speaker.
The speech recognition system must strike a balance among these four elements and remain independent of the characteristics of its users' voices.

2. Text-to-speech (TTS)
Text-to-speech starts from a written text, which imposes particular constraints: the vocabulary is theoretically unlimited, yet the pronounced sentences must sound as natural as possible and follow a usual intonation. Because whole sentences cannot be stored, a limited set of linguistic units must be chosen which, through concatenation, allows speech to be synthesized from written text.
A text-to-speech system contains two units:
1. a text-processing unit, which performs all the operations of content analysis needed to convert the text into audio codes.
2. a synthesizer, which contains the hardware support for the synthesis process.
To create and apply the rules for converting text into a voice signal, the following elements are analyzed:
a) the phonemes (the sound units that make up words) used to produce the computerized voice;
b) the quality of the voice signal, which depends on the rules for finding and converting the text into an audio signal; intonation, emotion and certain pronunciation problems complicate the production of the voice signal. To remove these obstacles, control tags can be used, or the text version of a word can be associated with its phonetic transcription;
c) text-to-speech through the concatenation of diphones;
d) the use of phoneme pairs to produce each sound.
Every word is evaluated, and then the system joins phonemes to pronounce the word.
The system that concatenates diphones is language-specific, because phonemes differ from one language to another.
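The idea of producing speech by joining phoneme pairs can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in the article: the lexicon, the boundary marker "_" and the inventory of stored units are all hypothetical, and the "recordings" are placeholder lists of numbers rather than real audio.

```python
def to_diphones(phonemes, boundary="_"):
    """Turn a phoneme sequence into overlapping phoneme pairs (diphones).

    The boundary marker stands for the silence before and after the word.
    """
    padded = [boundary] + list(phonemes) + [boundary]
    return [(a, b) for a, b in zip(padded, padded[1:])]

def synthesize(words, lexicon, inventory):
    """Look up each word's phonemes, then concatenate the stored diphone units."""
    output = []
    for word in words:
        for pair in to_diphones(lexicon[word]):
            output.extend(inventory[pair])
    return output

# Hypothetical one-word lexicon (word -> phonemes).
lexicon = {"cat": ["k", "a", "t"]}

# Hypothetical diphone inventory; each value stands in for a recorded waveform.
inventory = {
    ("_", "k"): [1],
    ("k", "a"): [2, 3],
    ("a", "t"): [4],
    ("t", "_"): [5],
}

pairs = to_diphones(lexicon["cat"])
samples = synthesize(["cat"], lexicon, inventory)
```

Because both the lexicon and the inventory are keyed by the phonemes of one language, a different language needs its own tables, which matches the observation that a diphone-concatenation system is language-specific.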

Text in ASCII format