CHAPTER 3 SELECTION OF TUTORIALS AND RELATED MATERIALS FOR SPOKEN LANGUAGE ENGINEERING

Klaus Fellbaum Brandenburg Technical University of Cottbus, Germany

Marian Boldea, Universitatea Politehnica Timisoara, Romania
Andrzej Drygajlo, Ecole Polytechnique Fédérale de Lausanne, Switzerland
Mircea Giurgiu, Technical University of Cluj-Napoca, Romania
Phil Green, University of Sheffield, United Kingdom
Ruediger Hoffmann, Technische Universität Dresden, Germany
Michael McTear, University of Ulster, Northern Ireland
Bojan Petek, University of Ljubljana, Slovenia
Victoria Sanchez, University of Granada, Spain
Kyriakos Sgarbas, University of Patras, Greece

Spoken Language Engineering

1 Introduction

This chapter summarises work by the Spoken Language Engineering (SLE) Working Group of the Socrates Thematic Network in Speech Communication Sciences. The SLE Working Group, now in its third funding year, has surveyed SLE course provision in Europe (Green et al., 1997) and has made proposals for SLE curriculum development at both undergraduate and postgraduate levels (Espain et al., 1998). The thematic network has shown that computer-based teaching aids (on-line tutorials, demonstration packages and so on) are vital to the future development of SLE education. This follows from the multidisciplinary and technical nature of SLE, which requires novel ways of presenting unfamiliar material. In recent years, such software has begun to appear, partly as a result of initiatives taken within the network and associated projects, and partly independently. The increasing interest in SLE courseware was demonstrated at the recent MATISSE workshop, on which much of the following review material is based (Hazan and Holland, 1999). The chapter analyses the available software resources in relation to curricular requirements and educational criteria, and makes recommendations for modules in an SLE curriculum. In addition, we identify areas for which high-quality courseware is, to our knowledge, unavailable, and propose actions to fill these remaining gaps.

Following the structure of the second book (Bloothooft et al., 1998), we used the sections:
• Introduction to Speech Communication and Speech Technology
• Speech Analysis
• Natural Language Processing
• Speech Production and Perception
• Speech Coding
• Speech Recognition
• Spoken Dialogue Modelling
• Language Resources
Concerning the subchapters on Applications and Current Research in SLE, we did not find tutorials or other relevant teaching material. This is not very surprising, since applications in speech processing are usually a commercial matter, and a company presenting applications normally has no strong interest in a detailed, tutorial-like presentation. For current research, too little time has passed for the results to be transformed into a didactically oriented form, so the usual presentation remains the scientific article or conference proceedings.

Generally speaking, we found a very heterogeneous coverage of the speech communication area, heterogeneous in both the subjects covered (see above) and the media used (Web, CD-ROM, books, etc.). We identified an accumulation of introductory material, mainly in speech production and perception, signal processing and linguistics, while other areas (for example speech coding and synthesis) are not covered satisfactorily. As to the media, the Web presentations were very often only test versions of a rather promotional character, offering CD-ROMs, books or download of the complete material after payment. Finally, both the technical and the didactic quality of the material vary strongly.


As a general remark, it must be stated that speech input is still a difficult problem on the Web, whereas speech output is quite easy. Consequently, there are only very few tutorials using speech input.

For the moment, there are only two possibilities to perform speech input:

• The shareware SoundBite (from the SCRAWL company), which works only with Windows 95/98/NT and Netscape Navigator 4.04 or higher. For more details and download, visit http://www.scrawl.com/store.
• The Tcl/Tk plugin, which can be used to add speech input (and output) to an existing tutorial. To produce a new tutorial, the Tcl/Tk libraries are necessary. For more details see http://www.scriptics.com/plugin/ .

A third possibility, based on Java 2 tools, is in preparation, with a release announced for the end of 1999. For now, only pre-release versions are available. One module is the Java Sound API (http://java.sun.com/products/java-media/sound), which offers recording and playback but no storage features. Another Java product is the Java Media Framework, which has storage capabilities and, in addition, network transmission (RTP) features; it too is a pre-release. For more details see http://java.sun.com/products/java-media/jmf .

The next sections will present a selection of tutorials in detail.


2 Introduction to Speech Communication and Speech Technology

This section deals with introductory material. The subjects cover a wide area, ranging from speech signal representations in the time and frequency domains, through signal processing techniques (windowing, FFT, parameter extraction, etc.), to basic principles of acoustics and physiology. Although most of the tutorials are far from being complete speech courses, they are very useful appetizers and can motivate beginners to dive into the speech area.

Speech Visualisation Tutorial
http://isl.ira.uka.de/~maier/speech/vistut/
University of Karlsruhe, Interactive Systems Laboratories
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: The tutorial covers visualisation of speech waveforms and spectrograms. It presents the waveform and a spectrogram of the utterance "speech lab", with labels marking the beginning of each phoneme (or speech sound) in the utterance.
Impression: May be used as a short introductory text on spectrograms.

comp.speech Frequently Asked Questions WWW site
http://svr-www.eng.cam.ac.uk/comp.speech/
University of Cambridge, Department of Engineering
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: The site provides a range of information on speech technology, including speech synthesis, speech recognition, speech coding, and related material. The information is regularly posted to the comp.speech newsgroup as the "comp.speech FAQ". The site is mirrored at several other WWW sites around the world (Australia, UK, Japan and USA), and the information is also available in plain text format. There are 250 comp.speech WWW pages, including over 500 hyperlinks to speech technology web sites, ftp servers, mailing lists, and newsgroups.
Impression: A very useful tool for getting oriented in the world of speech technology. The pages are not suited as teaching material, but they present a collection of interesting speech themes and very many links to speech products.

Speech Analysis Tutorial
http://www.ling.lu.se/research/speechtutorial/tutorial.html
University of Cambridge, Department of Engineering
Author: Tony Robinson
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: A very brief, very introductory tutorial on speech analysis, introducing fundamental speech signal representations (waveform, F0 contour, spectrum, spectrogram, waterfall spectrogram, phonetic transcription), suitable for a first exposure to these topics.
Impression: The tutorial covers many details but gives only short explanations; thus it is suited to support lectures in speech signal analysis.

Spectrogram Reading Tutorial
http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: A more extended introduction to speech signal representations, stressing spectrograms and transcription, with many practical exercises.

Das Lesen von Sonagrammen V0.2. Begleitendes Hypertext-Dokument zur Vorlesung (in German; "Reading sonagrams", companion hypertext to a lecture)
http://www.phonetik.uni-muenchen.de/SGL/SGLHome.html
Institut für Phonetik und Sprachliche Kommunikation der Ludwig-Maximilians-Universität München
Authors: K. Machelett, H.G. Tillmann
Availability: free
Requirements: WWW browser
Description: The tutorial deals with spectrogram reading, following the chapters
• Fundamentals
• The sound classes in the sonagram
• On the differentiation of sounds within the sound classes
• Reading sonagrams in practice
Impression: Companion material to a complete lecture series in spectrogram reading, with high expertise and good pictures.

Das SPRACHLABOR - eine multimediale Einführung in die Welt des Sprechens/der Phonetik (in German; "The speech laboratory", a multimedia introduction to the world of speaking and phonetics)
http://www.media-enterprise.de/sprachla/sprachla.htm
Availability: demo version only
Requirements: WWW browser with sound replay capabilities.
Description: Physiology of the speech organs, acoustic fundamentals of the speech process, spectrogram reading, speech analysis.
Impression: Professional program.

Tutorien und Skripte der Universität Kiel (tutorials and lecture notes; in German, English and Swedish)
http://www.ipds.uni-kiel.de/links/skripte.de.html
Availability: free
Requirements: WWW browser with sound replay capabilities
Description: Course papers and audio demonstrations on acoustic phonetics, plus an interactive course on linguistics, speech synthesis and speech recognition.
Impression: A very useful selection of courses in different languages, covering many speech areas. Many links to other tutorials.

3 Speech Analysis

For the area of speech analysis, some (but not very many) useful tutorials and similar material exist. The following survey of educational resources that may be used to support speech analysis training in Spoken Language Engineering courses reflects current practice. Its objective is to give a clear idea of the material available and to provide a useful starting point for the next phase: the development of new teaching and learning reference material, preferably in electronic format.

The key to successful speech analysis training is to create tools which open up the field to interactive investigation by the student. These tools should allow the student to interact with parameters in order to acquire practical skills in listening, analysis and interpretation, as well as to create new algorithms. The aim is to provide dedicated educational software (executables and sources) instead of exercises based on commonly used, executable-only research tools such as Waves+ (Entropic) or MultiSpeech (Kay Elemetrics). Another goal is to create platform-independent, interactive, on-line laboratories accessible on the Internet, to increase the efficiency of student self-study in the speech analysis domain. Recently, the MATLAB programming environment has been employed by the speech science community to develop educational aids for teaching and learning. Examples are the MAD programme from the University of Sheffield [section 5] and the set of exercises from the Czech Technical University in Prague [J. Uhlir, in Hazan and Holland, 1999, pp. 53-56]. MATLAB provides facilities for numerical computation, user interface creation and data visualisation, and its cheap academic edition allows students to use the demonstration tools at home. Unfortunately, MATLAB is neither completely portable, nor can it run within a Web browser.
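The kind of exercise such environments support can be sketched outside MATLAB as well. The following Python/NumPy fragment, written as an illustration for this survey (it is not part of any package named above, and the frame length and hop size are arbitrary choices), computes the log-magnitude spectrogram that these tools typically display:

```python
# Minimal spectrogram sketch: frame the signal, apply a Hamming window,
# take the magnitude FFT of each frame, and convert to decibels.
import numpy as np

def spectrogram(x, frame_len=256, hop=64):
    """Return a (frames x bins) matrix of log-magnitude spectra in dB."""
    window = np.hamming(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(spectra + 1e-10)   # floor avoids log(0)

# Example: 100 ms of a 440 Hz tone sampled at 8 kHz.
fs = 8000
t = np.arange(int(0.1 * fs)) / fs
S = spectrogram(np.sin(2 * np.pi * 440 * t))
print(S.shape)   # (frames, frequency bins)
```

Displaying `S` as an image (time on one axis, frequency on the other) gives the familiar spectrogram view discussed throughout this chapter.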


The nearest contender to MATLAB appears to be Java. The ability of Java applets to be embedded in WWW pages certainly offers advantages for platform-independent distance learning. Purely Java-based speech analysis software was developed at EPFL Lausanne [below]. A second example is the Snack speech analysis module from KTH Stockholm, which uses Java applets, the Tcl/Tk language and C/C++ [below]. Unfortunately, Java is still in its infancy, and educators working with it must create many of their own classes to handle tasks for which ready-to-use libraries would be available in other languages.

The educational speech analysis software packages available today, written in MATLAB, Java or other languages (e.g. KHOROS: Z. Kacic, in Hazan and Holland, 1999, pp. 117-120), are far from complete, and their development can be seen as an ongoing effort to produce tools for teaching and learning speech analysis concepts.

As mentioned in the introduction, the problems of speech input create a bottleneck and hinder the flexible and easy construction of tutorials. In addition, the following points from the module's syllabus seem not to be (well) covered by the existing material:
• speech signal analysis in the time domain: amplitude, zero crossings, autocorrelation, statistical parameters;
• speech signal analysis in the frequency domain: general time-frequency descriptions (wavelets etc.);
• homomorphic speech processing: cepstrum and its applications;
• determination of vocal tract parameters: estimation of vocal tract geometry.
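For the first of these gaps, the basic time-domain measures are easy to demonstrate with a few lines of code. The sketch below (Python/NumPy, written for this survey; signal and frame sizes are illustrative) computes short-time energy, zero-crossing rate and the autocorrelation function, and shows the autocorrelation peak at the pitch period of a test tone:

```python
# Time-domain speech measures: short-time energy, zero-crossing rate,
# and (mean-removed) autocorrelation of a single analysis frame.
import numpy as np

def short_time_energy(frame):
    return float(np.sum(np.asarray(frame, dtype=float) ** 2))

def zero_crossing_rate(frame):
    signs = np.sign(frame)
    signs[signs == 0] = 1                      # treat exact zeros as positive
    return float(np.mean(signs[1:] != signs[:-1]))

def autocorrelation(frame, max_lag):
    frame = frame - np.mean(frame)
    return np.array([np.sum(frame[: len(frame) - k] * frame[k:])
                     for k in range(max_lag + 1)])

# A 100 Hz sine at 8 kHz has a period of 80 samples, so the
# autocorrelation should peak again at lag 80.
fs = 8000
t = np.arange(400) / fs
x = np.sin(2 * np.pi * 100 * t)
r = autocorrelation(x, 120)
lag = int(np.argmax(r[40:]) + 40)   # skip the lag-0 region
print(lag)
```

Picking the autocorrelation peak in this way is the core of the simplest pitch determination methods, one of the classic applications of these measures.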

Tony Robinson's Speech Analysis Course
http://svr-www.eng.cam.ac.uk/~ajr/SpeechAnalysis/
Cambridge University, Department of Engineering
Availability: free.
Requirements: WWW browser.
Description: Introductory course by Tony Robinson of the Cambridge University Engineering Department, Speech, Vision and Robotics group, intended for students who come to speech processing from many directions. It presents DSP fundamentals (sampling theory, linear systems/filters, convolution, z and Fourier transforms) and aspects of speech production and perception (source-filter model, non-linear frequency scales) before discussing speech analysis methods.
Impression: For a course taught according to the SLE recommendations, it should be accompanied by a more detailed text.

The SNACK Tcl/Tk sound extension
http://www.speech.kth.se/SNACK/
Technical University Stockholm, Department of Speech, Hearing and Music
Availability: free
Requirements: Tcl/Tk 8.0.3 or later; optional: WWW browser with the Tcl plug-in 2.0; NIST SPHERE library; GNU autoconf and a C compiler to build from sources.
Description: Snack is a Tcl/Tk extension for sound which allows rapid development of interactive speech processing and visualisation tools. In conjunction with the Tcl plugin, it also allows interactive experiments to be included in educational materials in HTML format. Although in its present state it provides only basic functionality for recording, playback, spectral analysis, and waveform, spectrum and spectrogram display of speech signals, it has the great advantage of being extensible.
Impression: Very interesting as an ODL authoring/experimental aid for teachers.

The SNACK example web pages
http://www.speech.kth.se/labs/analysis/
Technical University Stockholm, Department of Speech, Hearing and Music
Availability: free.
Requirements: Tcl/Tk 8.0.3 or later; WWW browser with the Tcl plug-in 2.0 (Netscape 4 / IE 4.0).
Description: A few laboratory experiments using Snack: a demonstration of the window length effect in FFT analysis, wide and narrow band spectrograms, real-time spectrum and spectrogram, formant measurements, etc.
Impression: An excellent appetizer for the educational potential of the SNACK package.

JavaSpeechLab
http://scgwww.epfl.ch/JavaSpeechLab

Availability: free.

Requirements: Java-enabled WWW browser.
Reference: Drygajlo, A. [1999]. "Speech Processing", Part I, EPFL, Lausanne.

Description: The JavaSpeechLab demonstrator has been designed to transform a WWW page into a click-and-play, interactive, platform-independent speech analysis workstation for distance learning applications. The main double window allows the user to load and edit a speech signal; a menu selection and pull-down lists allow the user to specify a speech analysis application in terms of control blocks. Once the user has chosen the parameters in a control block, the program can execute, display and play back the results. In this software laboratory, a common "frame-window" environment is provided for all types of analysis: from the simplest in the time domain, through the classical short-term Fourier transform, to the most complex in the time-spectral domain, based on multi-resolution wavelet packet transforms which use different windows and frames at different frequency subbands. This modular package can be extended by the user, who can add new speech analysis or processing modules using a plug-in technique.

Impression: JavaSpeechLab can be used for conventional classroom experiments or in the students' laboratory, and can increase the efficiency of student self-study in the speech analysis domain. It transforms a WWW page into an easy-to-use speech analysis workstation that is interactive, easy to learn, and encourages self-exploration.

The Speech Filing System (SFS)
http://www.phon.ucl.ac.uk/resource/sfs.htm
University College London, Department of Linguistics and Phonetics
Availability: free sources and binaries for DOS/Windows.
Requirements: DOS, Windows or Unix computer with sound I/O; GNU C compiler to build from sources.
Description: The Speech Filing System is one of the top resources. Besides general utilities for recording, replaying, and manipulating both acoustic and laryngographic signals, it includes many programs to process them and present the results in graphical form. It also includes processors for sml, a specialised speech measurement language, and span, a language for controlling a parallel formant synthesiser. There is also one for spc, a Pascal dialect adapted for speech processing, unfortunately not documented.

Although SFS does not cover all the topics in this module, being available in C source form, it has the advantage that it can be extended easily to illustrate most of them. The Speech Filing System includes also some programs that allow simple isolated word recognition experiments and results evaluation:

dp - simple DTW recogniser, with or without slope constraints;
dprec - the same, but with training facilities;
conmat - prints the confusion matrix and the percentage correct.
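The algorithm behind such DTW recognisers can be sketched in a few lines. The following Python fragment is an illustration of dynamic time warping on one-dimensional feature sequences written for this survey; it is not code taken from SFS, and real recognisers compare vectors of spectral features rather than scalars:

```python
# Dynamic time warping: fill a cumulative-cost matrix D where D[i][j]
# is the cheapest alignment of a[:i] with b[:j], allowing the usual
# insertion / deletion / match steps.
def dtw_distance(a, b):
    inf = float("inf")
    n, m = len(a), len(b)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])       # local distance (scalars here)
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

# A time-stretched version of a template should align at zero cost:
template = [0, 1, 2, 3, 2, 1, 0]
stretched = [0, 1, 1, 2, 3, 3, 2, 1, 0]
print(dtw_distance(template, stretched))   # → 0.0
```

In an isolated word recogniser of the `dp` kind, the unknown utterance is compared against each stored template in this way, and the template with the lowest warped distance wins.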

Impression: One of the best software resources for speech education, well documented and extensible.

The INTEL Signal Processing Library
http://www.intel.com/vtune/perflibst/spl/
Availability: free Windows DLLs.
Requirements: Intel-architecture PC (Pentium II/III recommended), min. 16 MB RAM (32 MB recommended), Windows 95/98/NT, C++ compiler.
Description: The Intel Signal Processing Library is a collection of functions useful for exploring signal and speech processing in a Windows environment. It supports vector operations (arithmetic, logical, threshold, square root, standard deviation, exponential, power spectrum), windowing (Bartlett, Blackman, Hamming, Hann, Kaiser), transforms (DFT, FFT, DCT, wavelets), filters (FIR, IIR, LMS), signal generation (pseudo-random, uniform, Gaussian), auto- and cross-correlation, and convolution.
Impression: Very good documentation, but not always in sync with the actual package contents.

The TSP signal processing library
http://tsp.ee.mcgill.ca/software/
Availability: free sources and Windows binaries.
Requirements: Unix/Windows system, C compiler.
Description: The Telecommunications and Signal Processing Laboratory at McGill University makes this signal processing library available. It includes functions for signal file manipulation, math support, filtering, autocorrelation, FFT, DCT, LPC, and spectral distance measures.
Impression: Although biased towards speech coding, it has the advantage of being very well documented.

The comp.speech ftp site
ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/
Availability: free.
Requirements: variable from package to package.
Description: An eclectic collection of data and software related to speech processing. The quality varies greatly, many packages being provided on an "as-is" basis, often with little or no documentation. A few packages have been found to work, e.g.:
• analysis/Pitch_Tracker-1.0.tar.Z - pitch tracking software with sources;
• tools/ep.1.2.tar.gz - endpointing software with sources;
• tools/gvqprog.zip - miscellaneous vector quantisation methods (LBG, LVQ, GVQ).
Impression: Don't expect too much!

RASTA speech analysis
http://www.icsi.berkeley.edu/real/rasta.html
Availability: free sources.
Requirements: C compiler.
Description: Home page of the RASTA speech signal feature extraction method, with links to articles and sources. RASTA is a term used to describe a range of speech recognition front-end algorithms developed by Nelson Morgan, Hynek Hermansky and others at ICSI. The "rasta" program, which implements some of the algorithms in addition to basic PLP processing, is available from the ICSI FTP site.
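The RASTA principle, band-pass filtering the temporal trajectories of log-spectral parameters so that slowly varying channel effects are suppressed, can be illustrated in a few lines. The sketch below was written for this survey and is not ICSI's "rasta" program; the filter coefficients follow the form commonly quoted for the RASTA filter, but treat them as an assumption rather than a definitive reference:

```python
# RASTA-style trajectory filtering: an IIR band-pass filter applied to
# the time trajectory of one log-spectral parameter.  A constant offset
# in the log domain (i.e. a fixed channel) is filtered out.
def rasta_filter(x):
    """y[n] = 0.98*y[n-1] + 0.1*(2*x[n] + x[n-1] - x[n-3] - 2*x[n-4])"""
    y = []
    xp = [0.0, 0.0, 0.0, 0.0]   # xp holds x[n-1] .. x[n-4]
    prev_y = 0.0
    for s in x:
        prev_y = 0.98 * prev_y + 0.1 * (2 * s + xp[0] - xp[2] - 2 * xp[3])
        y.append(prev_y)
        xp = [s] + xp[:3]
    return y

# A constant trajectory (a fixed channel offset) decays towards zero:
y = rasta_filter([1.0] * 200)
print(abs(y[-1]) < 0.05)   # → True
```

Because convolutional channel distortion becomes additive in the log-spectral domain, removing the near-constant component in this way is what makes RASTA features robust to channel variation.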

4 Natural Language Processing

Tutorial: Data Structures and Algorithms II
http://www.cee.hw.ac.uk/~alison/ds.html
Description: A tutorial on data structures and algorithms containing information about string processing algorithms, parsing, context-free grammars, and simple parsing methods from a software engineer's point of view.

Tutorial: Databases and Artificial Intelligence 3 - Artificial Intelligence Segment
http://www.cee.hw.ac.uk/~alison/ai3notes/all.html
Description: This tutorial contains a Natural Language Processing section covering issues of syntax, writing a grammar, grammars in Prolog, parsers, returning the parse tree, multiple parsers, semantics and pragmatics.

Course in Natural Language Processing
http://www.cs.bham.ac.uk/~pjh/sem1a5/sem1a5.html
Description: The lectures cover some introductory material, in particular the meaning of the title, some history, current applications, a review of the state-of-the-art in Natural Language Processing, and research directions. The accompanying notes take some of the topics further, suggesting directions for extended study by students. The tutorial notes are exercises to be completed in time for small-group tutorials.

Lexical Functional Grammar
http://clwww.essex.ac.uk/LFG/
Description: An informative site on LFG (Lexical Functional Grammar) covering introductory issues, books/articles, standard papers, collections, a bibliography of works, publicly available LFG systems, an archive of papers about LFG, current grammar development efforts, current research directions in LFG, an LFG mailing list, recent LFG news (The LFG Bulletin), LFG conferences, ILFGA (the International LFG Association), and slides and teaching material from various conferences and summer institutes.

The AGFL Grammar Work Lab
http://www.cs.kun.nl/agfl/
Requirements: The programs run on different platforms (SunOS, Solaris, MS-DOS and Linux).
Description: Tutorial and resources for the AGFL formalism (Affix Grammars over Finite Lattices), developed by the Department of Software Engineering, University of Nijmegen. It is a formalism in which context-free grammars can be described. AGFLs are two-level grammars: a first, context-free level is augmented with features for expressing agreement between parts of speech.

Finite-State HomePage
http://www.xrce.xerox.com/research/mltt/fst/
Description: Provides theory for regular expressions, finite-state automata and transducers. Includes a demonstrator, the Xerox Finite-State Compiler, which allows the user to enter a regular expression in a text window and compile it into a finite-state network. Depending on the expression typed, the result is either a simple automaton or a transducer.
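As a toy counterpart to such a compiler, the following Python sketch simulates a hand-built deterministic automaton for the regular expression a(b|c)*. The state numbering and transition-table encoding are illustrative choices made for this survey, not the Xerox tool's output format:

```python
# A deterministic finite automaton for the language of a(b|c)*,
# encoded as state -> {symbol -> next state}; state 1 is accepting.
def make_dfa():
    return {0: {"a": 1}, 1: {"b": 1, "c": 1}}

def accepts(dfa, accepting, s):
    """Run the DFA over string s; reject on any undefined transition."""
    state = 0
    for ch in s:
        if state not in dfa or ch not in dfa[state]:
            return False
        state = dfa[state][ch]
    return state in accepting

dfa = make_dfa()
for word in ["a", "abc", "abcb", "b", "", "aa"]:
    print(word, accepts(dfa, {1}, word))
```

A regular-expression compiler like the Xerox demonstrator automates exactly this construction, turning an arbitrary expression into such a transition network (and, for two-level rules, into a transducer).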

Courses in Computational Linguistics
http://clwww.essex.ac.uk/course/
Description: This page gives access to information, course handouts, source code for programs, etc. for a number of different courses taught at Essex: Head-Driven Phrase Structure Grammar, Lexical Functional Grammar, Prolog and Natural Language Processing, Machine Translation, and Computational Linguistics 1 and 2.

Computational Syntax and Semantics at New York University
http://www.nyu.edu/pages/linguistics/
Availability: free
Description: The course consists of 5 chapters:
1. Research opportunities in computational syntax and semantics.
2. Courses offered in computational syntax and semantics.
3. The HTML Gesellschaft mit Stammtische.
4. Beginner's Workbook in Computational Linguistics.
5. New York University Natural Language Computing Project.
Impression: Very well organised and attractive. It contains valuable introductory information, tutorials, and free NLP software.

WEB-Course: Corpus Linguistics
http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/contents.htm
Description: Web pages to supplement the book "Corpus Linguistics" (Edinburgh University Press) by Tony McEnery and Andrew Wilson. The document consists of the sections:
• Early Corpus Linguistics and the Chomskyan Revolution
• What is a Corpus and What is in it?
• Quantitative Data
• The Use of Corpora in Language Studies

Progress Report 1: Sources on Connectionist Natural Language Processing - second draft
http://www.tardis.ed.ac.uk/~james/CNLP/report1/report1.html
Description: The report provides pointers to useful sources on connectionist Natural Language Processing systems, covering parsing, NL understanding, NL generation, NL learning, script processing, text processing, machine translation and speech processing.

The Simon Laven Page
http://www.toptown.com/hp/sjlaven/
Description: A collection of chatterbots. A chatterbot is a program that attempts to simulate conversation, with the aim of at least temporarily fooling a real human into thinking they are talking to another person. The collection includes ELIZA, FRED and many more.

German Transducers Demo
http://www.wordmanager.com/
Availability: free
Description: The Multifunctional Morphological Dictionary was developed and designed in a collaboration between the University of Basel, the Vrije Universiteit Amsterdam and the institute for artificial intelligence IDSIA in Lugano. The dictionary is a demo result of a project in the field of Natural Language Processing (NLP). The major task of the project was the development of a framework of reusable finite-state automata for word form analysis and word generation. The application uses as its source a database created with the NLP program package Word Manager (WM), developed at the Department of Computer Science, University of Basel. Finite-state systems represent the state-of-the-art in morphological analysis because of their efficiency in time and space.

Johns Hopkins Demos
http://bigram.cs.jhu.edu/~demos/index.html
Description: Part-of-speech tagger; natural language interface to databases.
Impression: No tutorial, just a demo.

The Natural Language Playground
http://www.link.cs.cmu.edu/dougb/playground.html
Description: Collection of NLP demos offered by the Link Grammar group at Carnegie Mellon. Among others:
• Parse and extract lexical relationships from unrestricted English sentences.
• Insert commas at appropriate places into a sentence stripped of punctuation.
• Find relationships between words and concepts across a variety of different relation types.
• Find rhyming words that are close in meaning to a given target.
• The CMU Pronouncing Dictionary - an English pronunciation dictionary with over 100,000 words.
• QuickLM compiler - compile a back-off language model on sparse data.
• Hypertext Webster - a hypertext interface to various Webster's dictionary services; a great Web page filter with many "collaborative dictionaries".
• A number of dictionary search services - find words that match a given definition, or answer a given question.

5 Speech Production and Perception

MAD (Matlab Auditory Demonstrations)
http://www.dcs.shef.ac.uk/~martin
Sheffield University, Department of Computer Science
Availability: free
Requirements: MATLAB and audio components
Description: Sheffield's software is a growing resource for interactive student learning in speech and hearing. A number of the demos (more than 20 MATLAB applications) address topics in hearing, for instance basilar membrane motion, sine-wave speech and auditory scene analysis.
Impression: Strong features of MAD are its uniform look-and-feel and documentation style. In summary, one of the best tutorials for psycho-acoustic experiments.

Cochlear Mechanics
http://btnrh.boystown.org/cel/
Boys Town National Research Hospital, Omaha, Nebraska, Communication Engineering Laboratory
Availability: free
Requirements: WWW browser
Description: From the web page of the Communication Engineering Laboratory, the user may select the following topics:
• Introduction to Cochlear Mechanics
• Piezoelectric model of the Outer Hair Cell (PDF)
• Notes on cochlear mechanics (PDF)
• Loudness FAQ
The site also offers free software for speech analysis.
Impression: Under "Introduction to Cochlear Mechanics", the user will find hair cell animations which are very useful for supporting lectures on the functioning of the inner ear.

A Pictorial Guide to the Cochlear Fluids
http://oto.wustl.edu/cochlea/
Washington University Medical School, St. Louis, Department of Otolaryngology
Availability: free
Requirements: WWW browser
Description: The tutorial-like text, enhanced with numerous pictures, consists of the parts
• "Fluid in Your Ears"
• Anatomy of the Inner Ear
• Cochlear Anatomy
• Cochlear Fluids Composition
• Endolymphatic Hydrops
Furthermore, a free Cochlear Fluids Simulator program may be downloaded.
Impression: The material is presented from a rather medical perspective. It may be useful for explaining details within a lecture on the functioning of the inner ear.


Audite - Ein Multimediaprogramm zum Thema Gehör und Hören (in German; a multimedia program on the ear and hearing)
http://www.dasp.uni-wuppertal.de/audite
Bergische Universität Gesamthochschule Wuppertal, Fachgebiet Digitale Signalverarbeitung und Elektroakustik
Author: Martina Kremer
Requirements: Netscape Navigator 4.5 or Microsoft Internet Explorer 4.0, ActiveX components
Description: Fundamentals of acoustics, physiology of the ear, transfer functions, psychoacoustics.
Impression: An excellent tutorial, well suited to support a lecture series on fundamentals of hearing and psychoacoustics. Numerous acoustic demonstrations and experiments are included.

Sprache und Gehirn. Ein neurolinguistisches Tutorial (in German; "Language and the Brain", a neurolinguistic tutorial)
http://www.ims.uni-stuttgart.de/phonetik/joerg/sgtutorial/
Experimental Phonetics Group at the Institute of Natural Language Processing, University of Stuttgart, Germany
Authors: G. Dogil, J. Mayer
Requirements: WWW browser. The tutorial contains some audio examples which are data from patients; these data, as well as some graphics, are password protected.
Description: The tutorial offers an overview of the most important topics in neurolinguistics, clinical linguistics and phonetics, with the chapters
• The brain
• Localisation of speech in the brain
• Speech and speaking disorders
• Linguistic diagnostics
Impression: The tutorial is mainly directed towards neurolinguistics, but may also be used to illustrate some facts about higher processes in speech perception.

Akustische Phonetik (in German; "Acoustic Phonetics")
http://www.phonetik.uni-muenchen.de/AP/APHome.html
Institut für Phonetik und Sprachliche Kommunikation der Ludwig-Maximilians-Universität München
Authors: H.G. Tillmann, F. Schiel
Requirements: WWW browser with sound replay capabilities
Description: A companion hypertext document to the lecture ('Begleitendes Hypertext-Dokument zur Vorlesung'). Besides an introductory chapter, the document is subdivided into the three main chapters
• What is sound?
• What are speech sounds?
• What turns sounds into speech sounds?
Impression: Companion material to a complete lecture series in acoustic phonetics, with high expertise, good pictures and audio examples.


Auditory scales of frequency representation
http://www.ling.su.se/staff/hartmut/bark.htm
Stockholm University, Department of Linguistics
Author: Hartmut Traunmüller
Availability: free
Requirements: WWW browser and audio components
Description: Basic auditory processes, physical scales, auditory scales, conversion equations.
Impression: A comprehensive text with tables and examples which can serve as a hand-out for students. Acoustic examples are included.

En tur i fonetikens marker (A Tour in the Domains of Phonetics)
http://www.ling.su.se/staff/hartmut/tur.htm
Stockholm University, Department of Linguistics
Author: Hartmut Traunmüller
Description: In Swedish (some sections available in English)
1. Grundtonen (The fundamental)
2. Formanterna och spektrogram (Formants and spectrograms)
3. Svenska vokaler (Swedish vowels)
4. Språklig och utomspråklig variation i tal (Linguistic and extra-linguistic variation in speech)
5. Manipulation av talares ålder och kön (Speaker age and sex manipulations)
6. Grundtonens betydelse för varseblivningen av vokaler (The role of F0 in vowel perception)
7. Modulationsteorin (The Modulation Theory of Speech)
8. Nolla-halloneffekten
9. Kan vi lita på våra öron? (Can we trust our ears?)
10. Besläktade presentationer på nätet (Related presentations on the Web)
Impression: A nice tutorial on Swedish phonetics (which is why it is written mainly in Swedish).
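The conversion equations covered by the first tutorial can be illustrated with a short sketch. The Bark formula below is Traunmüller's own published (1990) approximation; the mel formula is the common 2595·log10 variant. Both are standard, but treat the exact constants as assumptions of this sketch rather than something taken from the tutorial itself:

```python
import math

def hz_to_bark(f):
    """Critical-band rate in Bark; Traunmueller's (1990) approximation."""
    return 26.81 * f / (1960.0 + f) - 0.53

def hz_to_mel(f):
    """Mel scale; the common 2595 * log10(1 + f/700) form."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

if __name__ == "__main__":
    # Both scales compress high frequencies relative to low ones.
    for f in (100.0, 500.0, 1000.0, 4000.0):
        print(f, round(hz_to_bark(f), 2), round(hz_to_mel(f), 1))
```

Around 1000 Hz both scales give values near their nominal reference points (about 8.5 Bark and 1000 mel), which is a convenient sanity check for students.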

Speech Production and Perception
http://www.sens.com/SPP1.htm
Availability: Demo version
Requirements: commercial CD-ROM
Description: Speech Production and Perception I (SPP1) [1] is interactive multimedia CD-ROM courseware that enables students to understand the correspondence between sound, spectrum and articulation. One of the main strengths of using this tool in teaching an introductory course in acoustic phonetics is its excellent and stimulating interactivity. The courseware includes hundreds of interactive models and simulations that motivate the user for self- and teacher-guided instruction through the study units on Spectrograms, Vowel Acoustics, Consonant Acoustics, Speech Perception, and Vowel Perception. Additionally, SPP1 includes a library of IPA vowel and consonant charts with spectrograms and pronunciations, an interactive glossary defining the basic terminology, and reference links to textbook material. The courseware also provides interactive student evaluations as well as separately printed student worksheets. Each worksheet contains questions that require about paragraph-length answers, enabling written assessment of study goals at the introductory as well as advanced levels of evaluation.
Impression: Information Presentation, Knowledge Space Compatibility, Navigation and Mapping (for definitions, see [2]) are the strong dimensions of SPP1's user interface. The Media Integration dimension could be improved by the judicious addition of video clips, e.g. to support the animated vocal-tract cross sections with visual feedback of the human articulatory setup during consonant or vowel pronunciations. The Cognitive Load dimension could be improved by telling the user which concepts are prerequisites and are better studied from textbooks before entering an SPP1 courseware unit. In conclusion, SPP1 is an excellent example of successful synergy between the teachers, speech scientists, artists and programmers who developed it.

References:
[1] Berkovitz, R. (1999). Design, development and evaluation of computer-assisted learning for Speech Science education. Proc. ESCA/SOCRATES Tutorial and Research Workshop on Method and Tool Innovations for Speech Science Education (MATISSE), pp. 9-16.
[2] Reeves, T.C. and Harmon, S.W. (April 1999). User interface rating tool for interactive multimedia. http://itech18.coe.uga.edu/edit8350/UIRF.html

Human Speech Production Based on a Linear Predictive Vocoder. An Interactive Tutorial
http://www.kt.tu-cottbus.de/speech-analysis
BTU Cottbus and TU Berlin

Authors: K. Fellbaum and J. Richter

Availability: free

Requirements: WWW Browser (best: NETSCAPE 4.5), sound replay capabilities

Description: The tutorial explains the principle of human speech production with the aid of a Linear Predictive vocoder (LPC vocoder) and the use of interactive learning procedures. The user can replay the signal and compare it with the reference speech signal. For visual comparison, the reference speech signal and the reconstructed speech signal are depicted in both the time and frequency domains. For the reconstructed signal, the pitch frequency contour is also presented graphically, and the user can manipulate this contour directly. The main advantages of the tutorial are its numerous interactive functions and its speech input feature. The tutorial is based on HTML pages and Java applets and can be downloaded.
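The LPC analysis underlying such a vocoder can be sketched in a few lines. This is a generic autocorrelation-method implementation with the Levinson-Durbin recursion, written for illustration; it is not the code of the Cottbus tutorial itself:

```python
import numpy as np

def lpc(frame, order):
    """LPC coefficients by the autocorrelation method with the
    Levinson-Durbin recursion.  Returns the prediction-error filter
    A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p and the residual energy."""
    n = len(frame)
    # Autocorrelation sequence r[0..order]
    r = [float(np.dot(frame[:n - k], frame[k:])) for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        a_new = a[:]
        for j in range(1, i):
            a_new[j] = a[j] + k * a[i - j]  # update previous coefficients
        a_new[i] = k
        a = a_new
        err *= (1.0 - k * k)                # shrink the prediction error
    return np.array(a), err
```

For a speech frame, an order of 10-14 at 8 kHz gives the vocal-tract filter that a vocoder of this kind transmits per frame, together with the pitch and voicing information.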

6 Speech Coding

The area of speech coding is not well covered by tutorials on the Web. Most tutorials are mainly theoretical and rather superficial, with general descriptions and some block diagrams. They can be useful for a first impression of the subject, but they are not designed to let you play and experiment with the ideas and gain more profound knowledge. Some tutorials include demo files, but even those are of limited use, as they do not usually allow you to generate speech files. One exception is a tutorial on an LPC vocoder (see http://www.kt.tu-cottbus.de/speech-analysis); although it explains the principle of LPC vocoders, it is more concerned with aspects of human speech production. For details see the description in the sub-section "Speech Production and Perception". Second, the different coding techniques are not covered uniformly: there is relatively wide coverage of waveform coding techniques, analysis-by-synthesis coding techniques and the LPC vocoder, but little material on very low bit rate speech coding (apart from the LPC vocoder) and on perceptual matters as applied to speech coding. The following list describes some interesting web pages that can usefully complement a speech coding course.
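As a concrete taste of the waveform-coding end of the spectrum, the mu-law companding characteristic that underlies G.711 telephone coding can be sketched. This uses the smooth analytical form of the curve, not the segmented table of the actual standard:

```python
import math

MU = 255.0  # mu-law parameter used by G.711

def mu_compress(x):
    """Map a sample in [-1, 1] through the mu-law characteristic,
    expanding small amplitudes before uniform quantization."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Invert the compression at the decoder."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
```

Quantizing the compressed value uniformly gives finer effective steps for small amplitudes, which is why 8-bit mu-law speech sounds comparable to roughly 12-bit linear PCM.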

Speech Coding
http://www-mobile.ecs.soton.ac.uk/jason/speech_codecs/
Availability: free
Description: A qualitative introduction to speech coding techniques from high to low bit rates. No demos.
Impression: Good for an overview of the most common speech coding techniques, with references to the standards that use them.

On-line Tutorial on Subband Coding
http://www.apocalypse.org/pub/u/howitt/sbc.tutorial.html
Otolith (Will Howitt)
Description: This is a short description, mainly of the MPEG-1 standard. Sub-Band Coding (SBC) is a powerful and general method of encoding audio signals efficiently. Unlike source-specific methods (like LPC, which works only on speech), SBC can encode any audio signal from any source, making it ideal for music recordings, movie soundtracks, and the like. MPEG Audio is the most popular example of SBC. The document describes the basic ideas behind SBC and discusses some of the issues involved in its use.
Impression: Helpful to support a teaching unit on MPEG-1.

On-line Tutorial on Linear Predictive Coding
http://www.apocalypse.org/pub/u/howitt/lpc.tutorial.html
Otolith (Will Howitt)
Availability: free
Requirements: WWW browser and audio components
Description: This is a short discussion of selected problems concerning the application of LPC. Linear Predictive Coding (LPC) is one of the most powerful speech analysis techniques, and one of the most useful methods for encoding good-quality speech at a low bit rate. It provides extremely accurate estimates of speech parameters and is relatively efficient to compute. The document describes the basic ideas behind linear prediction and discusses some of the issues involved in its use.
Impression: Helpful to support a teaching unit on application aspects of LPC.
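The sub-band principle described in the SBC tutorial above can be illustrated with a minimal two-band filter bank. This sketch uses the 2-tap Haar filter pair for simplicity; real codecs such as MPEG audio use much longer polyphase filters, but the split-quantize-reconstruct structure is the same:

```python
import numpy as np

def analysis(x):
    """Split a signal (even length) into a low band and a high band,
    each decimated by 2, using the 2-tap Haar filter pair."""
    x = np.asarray(x, dtype=float).reshape(-1, 2)
    low = (x[:, 0] + x[:, 1]) / np.sqrt(2.0)
    high = (x[:, 0] - x[:, 1]) / np.sqrt(2.0)
    return low, high

def synthesis(low, high):
    """Perfectly reconstruct the signal from the two subbands."""
    x = np.empty(2 * len(low))
    x[0::2] = (low + high) / np.sqrt(2.0)
    x[1::2] = (low - high) / np.sqrt(2.0)
    return x
```

In a real coder, the two bands would be quantized with different numbers of bits according to a perceptual model before `synthesis` is applied at the decoder; with no quantization, reconstruction is exact.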

Wideband Speech and Audio Coding
http://www.umiacs.umd.edu/users/desin/Speech/new.html
Availability: free
Description: An introduction to wideband speech and audio coding, with emphasis on perceptual matters. No demos.
Impression: Good for an introduction to the speech and audio coding techniques used for bandwidths wider than the telephone bandwidth.

Speech Coding
http://wwwdsp.ucd.ie/speech_tut.htm
Description: An introduction to speech coding with reference to speech production, perception and quantization. It treats all types of speech coding techniques. No demos.
Impression: A fairly comprehensive tutorial.

Speech-related software: LPC vocoder implemented in Simulink, MELP source code, subband coder, pitch shifting and detection
http://www.cteh.ac.il/staff/noama
Author: Noam Amir, [email protected]
Availability: free
Requirements: WWW browser and audio components; partly needs Matlab and Simulink
Description: This site contains an ongoing accumulation of speech-related projects, software and tutorial matter, spread over a number of areas of the site. The "student projects" page contains project reports in HTML on speech coding and processing, along with source code in C, MATLAB and SIMULINK and the relevant sound files. "My tutorials" contains various programs written to demonstrate speech and DSP material. One of the most interesting parts is a complete LPC vocoder implemented in SIMULINK, along with GUIs for modifying pitch, intensity and duration. This site is continuously evolving as the students and the teacher add material.

The following links are not tutorials, but they contain demos of different speech coders and can be used to complement the tutorials above that have no demos.
http://www-mobile.ecs.soton.ac.uk/speech_codecs/voicedemo (digital/analogue voice demo)
http://www-mobile.ecs.soton.ac.uk/clare/index.html (low bit-rate speech coders)
http://www.eas.asu.edu/~speech/table.html (speech coding demonstrations from many different speech coders)
http://people.qualcomm.com/karn/voicedemo/ (short online demo with speech examples)
http://rainbow.ece.ucsb.edu/demos/wideband_new/Readme.html (demo of multi-band CELP (MBC) wide-band speech coding)
http://www.ee.cityu.edu.hk/~cfchan/demo.html (very short description with online speech examples)
http://www-isl.stanford.edu/people/earl/speech_coding.html (Earl Levine's Speech Coding Research Page)

Links, also for further search:
http://cslu.cse.ogi.edu/HLTsurvey/indextop.html
http://www-mobile.ecs.soton.ac.uk/jason/speech_codecs/
http://wwwdsp.ucd.ie/speech_tut.htm
http://ourworld.compuserve.com/homepages/Peter_Meijer/link.htm
http://www-mobile.ecs.soton.ac.uk/speech_codecs/hot_links.html

7 Speech Synthesis

Speech synthesis belongs to the areas of speech processing that are only weakly covered by tutorials. Most of the Web material consists of acoustic examples of synthesised speech: the user types a text, which is transcribed into phonetic symbols; these are transformed into acoustic elements, and the elements are finally concatenated into continuous speech. Although the generation of synthetic speech is very impressive, because each synthesiser has its specific sound and pronunciation, detailed information about the transcription, segmentation and concatenation aspects is largely missing from the Web material (exception: the Interactive Course on Speech Synthesis of the TU Dresden and the BTU Cottbus, http://www.ias.et.tu-dresden.de/kom/lehre).

The user is normally referred to books and articles. A selection of useful references has already been given in book 2.
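The text-to-phonetic-symbols-to-concatenation pipeline described above can be caricatured in a few lines. Everything here is a toy assumption: a hypothetical two-word lexicon and sine-tone "units" stand in for a real pronunciation dictionary and recorded diphone waveforms:

```python
import numpy as np

# Hypothetical mini-lexicon, for illustration only.
LEXICON = {"hello": ["h", "e", "l", "o"], "world": ["w", "o", "r", "l", "d"]}

def make_unit(phone, fs=8000, dur=0.08):
    """Stand-in 'acoustic element': a short tone whose frequency is
    derived from the phone label.  A real system would look up a
    recorded diphone waveform here instead."""
    f = 200 + 40 * (ord(phone) % 10)
    t = np.arange(int(fs * dur)) / fs
    return np.sin(2 * np.pi * f * t)

def synthesise(text):
    """Transcribe words to phones, map phones to units, concatenate."""
    phones = [p for w in text.lower().split() for p in LEXICON[w]]
    return np.concatenate([make_unit(p) for p in phones])
```

Real systems differ at every stage (letter-to-sound rules instead of a closed lexicon, diphones instead of phones, smoothing at the joins), but the three-step structure is exactly what the online demos in this section implement.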

Text-to-Speech System - Bell Labs
http://www.bell-labs.com/project/tts/
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: High-quality text-to-speech system. Languages: American English, German, Mandarin Chinese, Spanish, French, Italian. The system performs synthesis via the Internet using diphone segments of natural speech.
Impression: This is not a tutorial, but very useful material.

Online Synthesis with HADIFIX
http://asl1.ikp.uni-bonn.de/~tpo/Hadiq.en.html
Institut für Kommunikationsforschung und Phonetik of the Universität Bonn.
Author: Thomas Portele
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: Synthesis online via the Internet; maximum length 512 characters. Male and female voice. Different formats (.au, .wav, .pcm) and resolutions (8 bit linear, 8 bit u-law and 16 bit linear).
Impression: This is not a tutorial, but very useful material.

Automatic Transcription of German Text
http://asl1.ikp.uni-bonn.de/~tpo/O2p.en.html
Institut für Kommunikationsforschung und Phonetik of the Universität Bonn.
Author: Thomas Portele
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: The transcription from orthographic input to SAMPA is generated using the methods of the speech synthesis system HADIFIX.

Multilingual Text-to-Speech System
http://www.fb9-ti.uni-duisburg.de/demos/speech.html
Gerhard-Mercator-Universität Duisburg

Availability: free.

Requirements: WWW browser with sound replay capabilities.

Description: Moderate speech quality. Synthesis online via the Internet.
Languages: German, English, Japanese. 8-bit u-law, 8-bit linear, 16-bit linear, different sampling frequencies. Formats: .wav, .au, .aiff, .raw

Impression: This is not a tutorial, but very useful material.

Vienna Concept-to-Speech system – VIECTOS http://www.ai.univie.ac.at/oefai/nlu/viectos/ Austrian Research Institute for Artificial Intelligence (ÖFAI) in cooperation with the Institute of Communications and Radio-Frequency Engineering at Vienna University of Technology.

Availability: free.

Requirements: WWW browser with sound replay capabilities.

Description: Language: German. Input: SAMPA format; the synthesizer output is returned as a 16-bit/16-kHz Wave file (MIME type: audio/x-wav). Synthesis online via the Internet.

Impression: This is not a tutorial, but very useful material.

Interactive Demo of SVOX Text-to-Speech System http://www.tik.ee.ethz.ch/cgi-bin/w3svox

Availability: free.

Requirements: WWW browser with sound replay capabilities.

Description: Synthesis online via the Internet, different sound formats.

Impression: This is not a tutorial, but very useful material.

Festival Speech Synthesis System
Centre for Speech Technology Research, University of Edinburgh
http://www.cstr.ed.ac.uk/projects/festival.html (description)
http://www.cstr.ed.ac.uk/projects/festival/userin.html (online demo)
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Version 1.3.1 (26 January 1999), online demos via the Internet. Very good quality. Audio files are 8-bit u-law 8 kHz (audio/basic), 16-bit Sun headered, or 16-bit Microsoft WAV format.


Description: Festival is a general multi-lingual speech synthesis system developed at CSTR. It offers a full text-to-speech system with various APIs and an environment for development and research of speech synthesis techniques. It is written in C++ with a Scheme-based command interpreter for general control.
Impression: This is not a tutorial, but very useful material.

Klatt Synthesizer
http://www.asel.udel.edu/speech/tutorials/synthesis/KlattSynth/index.htm
http://www.asel.udel.edu/speech/tutorials/synthesis/vowels.html
http://www.asel.udel.edu/speech/tutorials/synthesis/ceevees.html
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: Introduction to the design of the Klatt synthesizer. Vowels, consonants and syllables can be produced by input of parameters like pitch, formant frequencies etc.
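The resonators at the heart of a Klatt-style formant synthesizer are second-order digital filters. The sketch below follows the standard difference equation y[n] = A·x[n] + B·y[n-1] + C·y[n-2]; the formant frequencies and bandwidths are textbook approximations for an /a/-like vowel, not Klatt's exact parameter tables:

```python
import math

def resonator(x, f, bw, fs):
    """Second-order digital resonator with centre frequency f and
    bandwidth bw (both in Hz): y[n] = A*x[n] + B*y[n-1] + C*y[n-2]."""
    T = 1.0 / fs
    C = -math.exp(-2.0 * math.pi * bw * T)
    B = 2.0 * math.exp(-math.pi * bw * T) * math.cos(2.0 * math.pi * f * T)
    A = 1.0 - B - C                      # unity gain at DC
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = A * s + B * y1 + C * y2
        out.append(y)
        y1, y2 = y, y1
    return out

def vowel(f0=100, formants=((700, 130), (1220, 70), (2600, 160)),
          fs=8000, dur=0.3):
    """Crude /a/-like vowel: an impulse train (glottal source) passed
    through cascaded formant resonators."""
    n = int(fs * dur)
    period = int(fs / f0)
    x = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    for f, bw in formants:
        x = resonator(x, f, bw, fs)
    return x
```

A full Klatt synthesizer adds a shaped glottal pulse, parallel branches for frication, and per-frame parameter updates, but the cascade of resonators above is the core that the tutorial's parameter sliders control.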

Examples of Synthesised Speech
http://www.ims.uni-stuttgart.de/phonetik/gregor/synthspeech/examples.html
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: A nice and large collection of different synthesis examples. Many links to TTS systems.

A research version of Next-Generation Text-To-Speech (TTS)
http://www.research.att.com:80/~mjm/cgi-bin/ttsdemo
AT&T Labs
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: Excellent synthesis system for English.

Zur Geschichte der Sprachsynthese (History of Speech Synthesis; in German)
http://www.ling.su.se/staff/hartmut/kempln.htm
Stockholm University, Department of Linguistics
Author: Hartmut Traunmüller
Availability: free.


Requirements: WWW browser with sound replay capabilities.
Description: in German
• Wolfgang von Kempelens sprechende Maschine (Wolfgang von Kempelen's speaking machine; this part also in English)
• Homer Dudleys VODER (Homer Dudley's VODER)
• Frank Coopers Pattern Playback (Frank Cooper's Pattern Playback)
• Elektrische Modelle der Sprachproduktion (Electrical models of speech production)
• Computergesteuerte Sprachsynthese (Computer-controlled speech synthesis)
Impression: This text provides the user with historical material and demos, mainly on the Kempelen machine but also on later developments. Further benefits come from the numerous links to other sources and demos contained in the HTML material. A wonderful document with a clear description and excellent graphical material.

An Interactive Course on Speech Synthesis
http://www.ias.et.tu-dresden.de/kom/lehre
Authors: R. Hoffmann, U. Kordon et al., TU Dresden and BTU Cottbus
Availability: free.
Requirements: WWW browser with sound replay capabilities.
Description: In the first part, the components of a TTS system are explained in general. On entering this part, a block diagram of a TTS system is presented to the user, who can click on the different building blocks to obtain an explanation of the corresponding block. A special section is devoted to the crucial problem of correct segmentation of the speech elements used for the synthesis. First, the rules and the problems associated with the segmentation are explained. In a second, experimental part, the user may select his own diphone segments from a given speech database. The quality of the segments may be evaluated acoustically, and hints are given on how to avoid cutting errors. Thus, the user learns how to select segments of good quality. Another special section offers a complete TTS system to the user for experimental purposes: the user may type any text and observe how the system processes it, from the first linguistic preprocessing to the acoustic synthesis. The TTS system presented is the "Dresden Speech Synthesizer" (DreSS).
Impression: One of the very rare tutorials in speech synthesis which goes beyond pure synthesis examples.

More links:
http://www.ai.univie.ac.at/~hannes/lv_bookmarks.html
http://rice.ecs.soton.ac.uk/hth97r/links/speechlink/sp_tutorial.html
http://www.isis.ecs.soton.ac.uk/~pjbj96r/sgrs/sgrslink.html
http://wwwam.hhi.de/hotlist/speech.htm
http://www.cstr.ed.ac.uk/~awb/synthesizers.html
http://svr-www.eng.cam.ac.uk/comp.speech/Section5/Q5.4.html (online synthesis)
http://www.ipds.uni-kiel.de/links/skripte.en.html
http://asa.aip.org/links.html
http://www.tue.nl/ipo/hearing/webspeak.htm (links, online synthesis, spoken samples of synthetic speech)
http://www.is.cs.cmu.edu/
http://www.aist-nara.ac.jp/IS/Shikano-lab/database/internet-resource/e-www-site.html

8 Speech Recognition

In order to allow more flexibility in curriculum design and implementation, this topic is split into two modules.

8.1 Basic topics
The first module presents the basic topics and stops at the level of isolated word recognition based on either dynamic time warping or hidden Markov models.
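The dynamic time warping half of this module can be made concrete with a short reference implementation. The one-dimensional "templates" in the usage test are purely illustrative; in practice each frame would be a vector of cepstral features:

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two feature sequences
    (one entry per frame), using the standard local path with
    insertion, deletion and match steps."""
    na, nb = len(a), len(b)
    D = np.full((na + 1, nb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            # Local cost: Euclidean distance between the two frames.
            cost = np.linalg.norm(np.atleast_1d(a[i - 1]) -
                                  np.atleast_1d(b[j - 1]))
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[na, nb]

def recognise(obs, templates):
    """Isolated word recognition: pick the template with minimal DTW cost."""
    return min(templates, key=lambda w: dtw(obs, templates[w]))
```

This is exactly the template-matching scheme that systems such as VISPER (below) visualise step by step: one stored template per vocabulary word, and classification by minimal accumulated warping cost.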

The minimal theoretical support for this first module can be found in the following books:

Deller, J.R., Proakis, J.G., and Hansen, J.H.L. [1993]. Discrete Time Processing of Speech Signals. New York: Macmillan, chapters 10-12.
Rabiner, L.R. and Juang, B.H. [1993]. Fundamentals of Speech Recognition. Prentice Hall, chapters 1-2, 4-6.

More detailed or alternative treatments of some topics can be found in:

O'Shaughnessy, D. [1987]. Speech Communication: Human and Machine. Addison-Wesley.
Huang, X.D., Ariki, Y. and Jack, M. [1990]. Hidden Markov Models for Speech Recognition. Edinburgh University Press.
Jelinek, F. [1997]. Statistical Methods for Speech Recognition. Cambridge, MA: MIT Press.
De Mori, R. (Ed.) [1998]. Spoken Dialogues with Computers. New York: Academic Press.

The experimental support for this module includes, besides the software tools in the list of Internet resources below, isolated-word speech databases, which are easy to collect for a moderate number of speakers and a moderate vocabulary size, or can be obtained at moderate cost from specialized agencies, e.g. the Linguistic Data Consortium's TI 46-Word database (http://www.ldc.upenn.edu).


The inventory of Internet resources for the first module includes:

Joe Picone's Course Materials
http://www.isip.msstate.edu/publications/courses/ece_8993_speech/
http://www.isip.msstate.edu/publications/courses/isip_0000/
Availability: free.
Requirements: Deller, J.R., Proakis, J.G., and Hansen, J.H.L. [1993]. Discrete Time Processing of Speech Signals. New York: Macmillan, chapters 10-12. Software: Acrobat Reader, Word.
Description: Material for two courses on the fundamentals of speech recognition taught by Joe Picone at Mississippi State University, based mostly on the textbook by Deller, Proakis, and Hansen. They cover more or less the modules on both speech analysis and speech recognition, up to large vocabulary continuous speech recognition.
Impression: Not exactly what the green book recommends, but nevertheless an example of how to teach all of this in a one-semester course.

The CSLU Tutorials
http://cslu.cse.ogi.edu/tutordemos
Availability: free.
Requirements: computer with WWW browser.
Description: Collection of tutorials on HMM, NN, and hybrid automatic speech recognition, using the CSLU toolkit.
Impression: Useful material for a wide range of expertise (from beginners to advanced-level users).

The VISPER System
http://www.fm.vslib.cz/~kes/visper.html
Availability: 200 US$ for academic institutions, 500 US$ for others, for an unlimited number of Windows licenses.
Requirements: Minimum Pentium PC at 100 MHz, 16 MB RAM, SVGA 800x600, 16-bit sound card, microphone, speakers, Windows 95.
Description: Available at a moderate cost for an unlimited number of binary licenses under Windows, the VIsual SPEech pRocessing (VISPER) system from the Speech Lab, Technical University of Liberec, Czech Republic, deals with all stages of isolated word recognition using DTW or CDHMM, and has facilities to visually examine all the processing it does.
Impression: Although limited to isolated word recognition, it has good educational value.


The CSLU Toolkit
http://cslu.cse.ogi.edu/toolkit/
Availability: free Windows beta binaries.
Requirements: minimum Windows 95/98/NT/2000 PC, Pentium Pro/II/III 200 MHz, 64 MB RAM, sound I/O.
Description: The Oregon Graduate Institute Center for Spoken Language Understanding Speech Toolkit is a very good resource, possibly the best, given the documentation and the availability of some tutorials (see above).
Impression: This could become an exemplary educational tool, provided it were made available on a wider range of platforms, open-sourced, and given a more stable status (at the time this article was written - June 1999 - only Windows binaries were available, with a time-limited license).

The INTEL Recognition Primitives Library
http://www.intel.com/vtune/perflibst/rpl/
Availability: free Windows DLLs.
Requirements: Intel-architecture CPU PC (Pentium II/III recommended), min. 16 MB RAM (32 MB recommended), Windows 95/98/NT, C++ compiler.
Description: The Intel Recognition Primitives Library, useful for exploring isolated word recognition in a Windows environment, includes functions for speech feature extraction (pre-emphasis, windowing, cepstral and linear prediction analysis) and recognition (distance computation, Gaussian mixture estimation, MLP and Kohonen neural networks, vector quantization, DTW, HMMs - discrete, continuous and semicontinuous).
Impression: Very good documentation, but not always in sync with the actual package contents.

University of Maryland Discrete HMMs
http://www.cfar.umd.edu/~kanungo/software/software.html
Availability: free sources.
Requirements: C compiler.
Description: Simple discrete HMM software, with training based on a single sequence of symbols. Used to teach HMM-based POS tagging in a statistical NLP course.

Mississippi State University Discrete HMMs
http://www.isip.msstate.edu/projects/speech/software/discrete_hmm
Availability: free sources.
Requirements: C compiler.
Description: Discrete HMM demo based on the book by Deller, Proakis, and Hansen, illustrating how the theory can be implemented.
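The evaluation problem for a discrete HMM of the kind both packages implement fits in a few lines via the forward algorithm. The two-state examples in the test are made up for illustration:

```python
import numpy as np

def forward(pi, A, B, obs):
    """P(obs | model) for a discrete HMM by the forward algorithm.
    pi: (N,) initial state probabilities, A: (N, N) state transition
    matrix, B: (N, M) symbol emission matrix, obs: symbol indices."""
    pi, A, B = np.asarray(pi), np.asarray(A), np.asarray(B)
    alpha = pi * B[:, obs[0]]          # initialise with the first symbol
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate and emit
    return float(alpha.sum())
```

Training (Baum-Welch) reuses these forward variables together with the analogous backward pass; for word recognition, one such model is trained per vocabulary word and the model with the highest likelihood wins.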


8.2 Advanced topics
The second module is oriented towards advanced topics, including continuous speech recognition (realized either as connected word recognition or based on subword acoustic modeling units), speaker modeling and adaptation, robust recognition, etc.

The theoretical support for this module can be found in the following books:
Deller, J.R., Proakis, J.G., and Hansen, J.H.L. [1993]. Discrete Time Processing of Speech Signals. New York: Macmillan, chapters 13-14.
Rabiner, L.R. and Juang, B.H. [1993]. Fundamentals of Speech Recognition. Prentice Hall, chapters 1, 7-9.

More detailed or alternative treatments of some topics can be found in:
Huang, X.D., Ariki, Y. and Jack, M. [1990]. Hidden Markov Models for Speech Recognition. Edinburgh University Press.
Keller, E. (Ed.) [1994]. Fundamentals of Speech Synthesis and Speech Recognition. John Wiley & Sons.
Bourlard, H.A. and Morgan, N. [1994]. Connectionist Speech Recognition: A Hybrid Approach. Kluwer.
Junqua, J.C. and Haton, J.P. [1996]. Robustness in Automatic Speech Recognition. Kluwer.
Jelinek, F. [1997]. Statistical Methods for Speech Recognition. Cambridge, MA: MIT Press.
De Mori, R. (Ed.) [1998]. Spoken Dialogues with Computers. New York: Academic Press.

The experimental support is more complicated than for the first module: continuous speech recognition needs speech databases to build acoustic models, text corpora to train language models, and pronunciation dictionaries. Given the nontrivial effort needed to produce such resources, and the prices at which they can be obtained from specialized agencies, the most suitable for this module are speech databases like the Linguistic Data Consortium's (http://www.ldc.upenn.edu/) TIDIGITS (connected digits recognition), TIMIT, and Resource Management (continuous speech recognition), or the European Language Resources Association's (http://www.icp.inpg.fr/ELRA/) EUROM databases. The same agencies can provide larger, more complex speech databases and the associated text corpora, but the increased costs are not justified by purely educational use.

The ABBOT Hybrid Continuous Speech Recognition Demo System
ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/recognition/AbbotDemo/
Availability: free binaries.
Requirements: Unix (HP-UX/IRIX/Linux/Solaris/SunOS) machine with sound I/O.
Description: Binary demo versions of the Abbot hybrid ASR system from Cambridge University Engineering Department, with a graphical interface to all its operations: speech input calibration, acquisition, replay, recognition, display of some recognition internals.
Impression: Good (but not more than a) demo.


Language Identification Tutorial and Quiz
http://www-dse.doc.ic.ac.uk/~nd/surprise_96/journal/vol3/gac1/test.html#gq1
Availability: free.
Requirements: WWW browser.
Description: A report on automatic language identification, tutorial in nature, plus a quiz.

The ISIP Public Domain ASR System http://www.isip.msstate.edu/projects/speech/software/asr/ Availability: free sources. Requirements: Unix system; GNU C++ compiler and make; Tcl/Tk 8.0 or newer; NIST sctk package. Description: The Large Vocabulary Conversational Speech Recognition project at the Institute for Signal and Information Processing of Mississippi State University makes available a public domain ASR system using standard HMM technology. It includes programs for all the steps from feature extraction (MFCC, CMS, energy, delta, acceleration), through context independent and dependent Baum-Welch and Viterbi HMM training, up to decoding (forced alignment, word lattice generation, n-gram decoding), and results evaluation. The documentation so far consists of only a short user manual, including a tutorial on continuous speech recognition, and many aspects could be more detailed. Impression: As the project is in progress, and other developments are announced, it can be expected to become a very interesting resource.

The CMU-Cambridge Statistical Language Modeling Toolkit
http://svr-www.eng.cam.ac.uk/~prc14/toolkit.html
Availability: free sources.
Requirements: Unix system, C compiler.
Description: The CMU-Cambridge Statistical Language Modeling toolkit is a very well documented suite of UNIX software tools for constructing standard n-gram statistical language models: word frequency lists, vocabularies, general and vocabulary-specific n-gram counts, n-gram-related statistics, and backoff language models (Good-Turing, Witten-Bell, absolute, and linear discounting).
Impression: A very useful tool; a must if you work in language modeling or want to build large vocabulary continuous speech recognition systems.
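The kind of counts and smoothed estimates such a toolkit produces can be sketched minimally. This uses add-one smoothing for brevity; the toolkit itself implements the more refined Good-Turing, Witten-Bell and backoff schemes listed above:

```python
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenised sentences,
    with <s>/</s> sentence boundary markers."""
    uni, bi = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        uni.update(toks)
        bi.update(zip(toks, toks[1:]))
    return uni, bi

def prob(uni, bi, w1, w2, vocab_size):
    """Add-one smoothed bigram probability P(w2 | w1)."""
    return (bi[(w1, w2)] + 1) / (uni[w1] + vocab_size)
```

Smoothing is what makes the model usable in a recognizer: unseen bigrams receive a small non-zero probability instead of vetoing an otherwise plausible word sequence.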


The Speech Training and Recognition Unified Tool (STRUT)
http://tcts.fpms.ac.be/speech/strut.html
Availability: free Unix (HP-UX, DEC-Alpha, Linux, IRIX, SunOS) and Windows NT binaries.
Requirements: Unix or Windows computer with optional sound I/O.
Description: The Speech Training and Recognition Unified Tool (STRUT) is made available in binary form by the Circuit Theory and Signal Processing Laboratory of the Mons Polytechnic Faculty, Belgium, accompanied by a user manual. It includes programs for feature extraction (LPC, MFC, RASTA-PLP, cepstral mean subtraction), acoustic modeling using discrete, continuous, and hybrid HMMs, isolated word and continuous speech recognition experiments, and results evaluation.
Impression: Pretty good coverage of the most important ASR methods, but also quite involved configuration and command-line options.
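Cepstral mean subtraction, one of the front-end options listed above, is simple enough to show in full. A stationary convolutional channel appears as an additive constant in the cepstral domain, so subtracting the per-utterance mean of each coefficient cancels it:

```python
import numpy as np

def cms(features):
    """Cepstral mean subtraction.  features: (frames, coefficients)
    array of cepstral vectors for one utterance; returns the same
    array with the per-coefficient utterance mean removed."""
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0)
```

The key property is invariance: shifting every frame by a constant offset (the cepstral signature of a fixed microphone or telephone channel) leaves the CMS output unchanged.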

The NICO Neural Network Toolkit http://www.speech.kth.se/NICO/ Availability: free sources. Requirements: C compiler. Description: The NICO toolkit, available in source form from the speech group of the Royal Institute of Technology, Stockholm, includes tools to build (topology definition, I/O format specification), train (input normalization, back-propagation training, pruning), and evaluate artificial neural network classifiers for ASR systems. Impression: A well documented work, but additional software has to be written to build a full automatic speech recognition system.

The Stuttgart Neural Network Simulator (SNNS) http://www.informatik.uni-stuttgart.de/ipvr/bv/projekte/snns/ Availability: free sources and binaries for SunOS, Solaris, Linux, Windows95/NT. Requirements: X-windows server. Description: SNNS (Stuttgart Neural Network Simulator) is a software simulator for neural networks developed at the Institute for Parallel and Distributed High Performance Systems (IPVR) at the University of Stuttgart. It consists of a kernel simulator, operating on internal data structures of the neural networks and performing all learning and recall, and a graphical user interface under X11. The kernel can also be used without the other parts, as a C program embedded in custom applications and it is extensible. Many network architectures and learning procedures are included. Impression: A very comprehensive package, very well documented, worth studying if you're thinking about NN in speech processing.
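The recall step of such a simulator (a plain feed-forward pass) can be sketched without any toolkit. The weights below are hand-set, illustrative values chosen so that a two-hidden-unit network computes XOR; they are not anything SNNS produces, and training would normally find such weights by back-propagation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer feed-forward pass ('recall' in SNNS terms)."""
    h = sigmoid(x @ W1 + b1)
    return sigmoid(h @ W2 + b2)

# Hand-set weights: hidden unit 0 approximates OR, hidden unit 1
# approximates AND, and the output computes OR AND NOT AND, i.e. XOR.
W1 = np.array([[10.0, 10.0], [10.0, 10.0]])
b1 = np.array([-5.0, -15.0])
W2 = np.array([[10.0], [-10.0]])
b2 = np.array([-5.0])
```

In hybrid HMM/NN recognizers the same forward pass, with far larger layers, maps a window of acoustic features to per-phone posterior probabilities.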


The NIST Speech Recognition Evaluation Software
http://www.nist.gov/speech/software.htm
Availability: free sources.
Requirements: Unix machine with C compiler.
Description: The NIST speech recognition scoring packages SCORE and SCTK were designed to evaluate results of recognition experiments run on standard test data sets from a few corpora used in the DARPA evaluations (Resource Management, Air Travel Information Systems, Wall Street Journal, Switchboard). They include programs for performance evaluation and for statistical significance tests of performance differences among speech recognition systems evaluated on a common test set.
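The core quantity any such scoring package computes is the word error rate, obtained from a Levenshtein alignment of the reference and hypothesis word strings. A minimal version (without the NIST tools' alignment reports and significance tests) looks like:

```python
def wer(ref, hyp):
    """Word error rate: (substitutions + deletions + insertions)
    divided by the number of reference words, via edit distance
    over word sequences."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                       # all deletions
    for j in range(len(h) + 1):
        d[0][j] = j                       # all insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub,            # substitution / match
                          d[i - 1][j] + 1,  # deletion
                          d[i][j - 1] + 1)  # insertion
    return d[len(r)][len(h)] / len(r)
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is one reason a proper scoring package also reports the three error types separately.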

9 Spoken Dialogue Modelling

Spoken dialogue systems can be viewed as an advanced application of spoken language technology. They provide an interface between the user and a computer-based application that permits spoken interaction with the application in a relatively natural manner. In so doing, they provide a stringent test for the major fields of spoken language technology, including speech recognition and speech synthesis, language processing, and dialogue management.

Spoken dialogue modelling is concerned with the dialogue management component of a spoken dialogue system. A module for spoken dialogue modelling should cover theoretical background from linguistics and artificial intelligence on the nature of dialogic interaction; present some case studies of representative systems; examine methodologies for the development and evaluation of spoken dialogue systems; and recommend tools and development environments.

The resources presented below provide good coverage of most sections of the module, ranging from books for the more theoretical sections to Web-based materials and software.
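Before turning to the individual resources, the core job of a dialogue manager can be made concrete with a minimal system-initiative, slot-filling sketch. The states, prompts, and travel domain below are invented for illustration; a real system would sit between a recogniser and a synthesiser rather than exchanging plain text.

```python
# Minimal finite-state dialogue manager: a state machine over prompts and slots.
PROMPTS = {
    "ask_origin": "Where are you travelling from?",
    "ask_destination": "Where are you travelling to?",
    "confirm": "So you want to travel from {origin} to {destination}?",
    "done": "Thank you, looking up connections.",
}

def dialogue(turns):
    """Run the dialogue over a list of scripted user turns; return system prompts."""
    state, slots, log = "ask_origin", {}, []
    for user in turns:
        log.append(PROMPTS[state].format(**slots))
        if state == "ask_origin":
            slots["origin"] = user
            state = "ask_destination"
        elif state == "ask_destination":
            slots["destination"] = user
            state = "confirm"
        elif state == "confirm":
            # Restart on disconfirmation; a real manager would repair instead.
            state = "done" if user == "yes" else "ask_origin"
    log.append(PROMPTS[state].format(**slots))
    return log

transcript = dialogue(["London", "Paris", "yes"])
```

Finite-state control of this kind underlies many of the deployed telephone services surveyed below; the theoretical material in sections 9.2 and 9.5 covers the richer plan- and agent-based alternatives.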

9.1 Introduction to spoken dialogue systems Introductory material on spoken dialogue systems can be found in the following texts:

Bernsen, N.O., Dybkjaer, H. and Dybkjaer, L. (1998). Designing Interactive Speech Systems: From First Ideas to User Testing. New York: Springer Verlag (chapters 1 and 2).

This book provides a detailed overview of spoken dialogue design methodologies, based on the results of the Danish Dialogue Project. The book covers topics such as: dialogue design guidelines, evaluation methodologies, and speech functionality analysis. The book is a useful resource for dialogue designers as it presents a detailed methodology for dialogue design and evaluation, with a number of worked examples.

Cole, R.A., Mariani, J., Uszkoreit, H., Zaenen, A. and Zue, V. (Eds.) (1997). Survey of the State of the Art in Human Language Technology. Cambridge: Cambridge University Press. Online version: http://cslu.cse.ogi.edu/HLTsurvey

This book, which is available online and can also be downloaded as a postscript file, contains chapters and bibliographies on all aspects of human language technology. Chapter 6, Discourse and Dialogue, which is the most relevant to this module, has brief but informative sections on discourse modelling, dialogue modelling, and spoken language dialogue.

Fraser, N. (1997). Assessment of interactive systems. In D. Gibbon, R. Moore and R. Winski (Eds.) Handbook of Standards and Resources for Spoken Language Systems. New York: Mouton de Gruyter, 564-614.

The main goal of the Handbook is to collect and catalogue information on spoken language resources and standards, providing a reference work for speech technology development. This chapter covers the development and assessment of interactive systems, giving explicit recommendations on good practice, and complements the Bernsen et al. book listed above. The Handbook is available as a Library Handbook edition (including a hypertext version on CD-ROM), in four paperback parts, or with free Web access to the hypertext version upon user registration. Availability: http://www.degruyter.de/EAGLES/eaglefly.html

Giachin, E. and McGlashan, S. (1997). Spoken Language Dialogue Systems. In S. Young and G. Bloothooft (Eds.) Corpus-based methods in language and speech processing. Dordrecht: Kluwer Academic Publishers, 69-117.

This chapter covers the essential components of a spoken dialogue system and discusses in detail the approaches to dialogue modelling that informed the Esprit SUNDIAL project. The chapter is useful as a documentation of the SUNDIAL approach that is otherwise restricted to less accessible sources. This is a good overview for an advanced reader.

Smith, R.W. and Hipp, D.R. (1994). Spoken Natural Language Dialog Systems: A Practical Approach, New York: Oxford University Press (chapters 1 and 2).

This book describes the Circuit Fix-It Shop system, which assists users in fixing an electronic circuit using spoken language (see below). The first two chapters provide a good overview of the field of spoken dialogue systems.

9.2 Theories of dialogue modelling

Levinson, S. (1983). Pragmatics. Cambridge: Cambridge University Press.


This is the standard text for linguistic theories of dialogue, including comprehensive discussions of speech acts, Gricean maxims and the co-operative principle, and conversation analysis. Although the book does not discuss spoken dialogue systems, some of the chapters describe the theoretical underpinnings, for example, the chapters on speech acts and conversation analysis.

McTear, M. (1987). The Articulate Computer. Oxford: Blackwell.

This book describes in a readable fashion the work on dialogue modelling that was conducted in the late 70s and early 80s, including script-based and plan-based approaches. While somewhat outdated in the light of more recent developments in spoken language systems, the book provides an insight into the more complex dialogue modelling that will be required by more advanced dialogue agents.

Allen, J. (1995). Natural Language Processing. Redwood, Ca.: Benjamin/Cummings Publishing Company, Inc.

Part III of this book presents an AI view of discourse and dialogue, with chapters on knowledge representation and reasoning; local discourse context and reference; using world knowledge; discourse structure; defining a conversational agent. This material forms the basis for most of the more theoretically motivated dialogue systems that have been developed in the past decade.

9.3 Case studies: overview and comparison of some representative systems

The following is a selection of links to representative spoken dialogue systems. Most of the sites contain comprehensive lists of publications as well as demos.

The TRAINS Project: Natural Spoken Dialogue and Interactive Planning http://www.cs.rochester.edu/research/trains/ Description: The TRAINS project at the University of Rochester Department of Computer Science is a long-term effort to develop an intelligent planning assistant that is conversationally proficient in natural language. The goal is a fully integrated system involving online spoken and typed natural language together with graphical displays and GUI-based interaction. The primary application has been a planning and scheduling domain involving a railroad freight system, where the human manager and the system must cooperate to develop and execute plans. The TRAINS site provides links to a number of online publications, access to the TRAINS Dialogue Corpus and the Dialogue Annotation Project, and links to promotional videos. The successor to TRAINS is TRIPS, a project involving more complex planning and multi-agent interaction (see http://www.cs.rochester.edu/research/trips/trains.html). Impression: This is an excellent site that provides access to a large number of relevant publications (some of which are online), as well as short introductory movies that require a RealAudio player.


The ARISE Project http://www2.echo.lu/langeng/projects/arise/summary.html Availability: free Requirements: WWW Browser and audio components Description: ARISE is a project funded by the EU Language Engineering programme. ARISE is developing an automatic train schedule enquiry service which will be accessed via an ordinary telephone and which will handle the bulk of routine enquiries automatically. These enquiries amount to more than 200 million calls annually to European railway centres, of which 20% currently go unanswered due to the cost of a manual service. This Web page provides a detailed description of the different elements of the project, including its objectives, the technology used, and the results. An example dialogue in French can be found at: http://www.limsi.fr/Recherche/TLP/demos.html

Circuit Fix-it Shop http://www.cs.duke.edu/~msf/voicelab/circuit/ Description: The Circuit Fix-It Shop system assists users in fixing an electronic circuit. The only mode of interaction between the user and the computer is spoken natural language dialogue. The site includes a brief demo and a reference to the book Smith, R.W. & Hipp, D.R. (1994) Spoken Natural Language Dialog Systems: A Practical Approach listed in section 9.1.

The MASK Project http://www.limsi.fr/Recherche/TLP/mask.html Availability: free Requirements: WWW Browser and audio components Description: The aim of the Multimodal-Multimedia Automated Service Kiosk (MASK) project is to pave the way for more advanced public service applications through user interfaces employing multimodal, multimedia input and output. The project has analyzed the technological requirements in the context of users and the tasks they perform in carrying out travel enquiries, and has developed a prototype information kiosk to be installed in the Gare St. Lazare in Paris. The Web site contains a number of pages describing the project, including several online publications (Postscript).


Speech Applications Project (SUN) http://www.sunlabs.com/research/speech/projects/SpeechActs/index.html Availability: free Requirements: WWW Browser and audio components Description: The Speech Applications Project at Sun Microsystems Laboratories performs research in speech technology, including creating tools for building speech applications, designing techniques for managing spoken language discourse, prototyping sample applications, and studying speech user interface issues. The site includes a number of online publications, several interesting demos in au, wav, RealAudio, and text formats, and useful guidelines on speech user interface design.

Spoken Language Systems - MIT Laboratory for Computer Science http://www.sls.lcs.mit.edu/sls/ Description: Since 1994, the SLS group has been developing a conversational platform called GALAXY. Users may engage in GALAXY-based conversations about weather forecasts (JUPITER), airline scheduling (PEGASUS), Cambridge city locations (VOYAGER), Boston area restaurants (DINEX), online automobile classified ads (WHEELS), and selected Web-based information (WebGALAXY). This site provides information about the projects at SLS and access to a number of online publications.

WAXHOLM http://www.speech.kth.se/waxholm/waxholm.html Description: The demonstrator application, WAXHOLM, gives information on boat traffic in the Stockholm archipelago. Besides the speech recognition and synthesis components, the system contains modules that handle graphic information such as pictures, maps, charts, and time-tables, which can be presented to the user on request. The site includes online publications and a video playback of a real interaction between the August system and a user (mpg format, 1.44M or 12M; both are Swedish only); more videos can be found on the August homepage: http://www.speech.kth.se/august/ . There are also a number of movies in MPEG, MOV and AVI formats, as well as links to other sites that make use of animated talking heads.

9.4 Methodologies for development and evaluation of spoken dialogue systems

Some of the most comprehensive work on methodologies has been carried out by members of the Danish Dialogue Project within EU funded projects such as DISC and REWARD.

The PARADISE tool for evaluation of spoken dialogue systems http://www.research.att.com/~diane/TOOT.html Description: This page describes a framework for evaluating spoken dialogue agents, and evaluations of cooperative responses, the use of tutorial dialogues, and adaptable dialogue behaviour in TOOT and other real-time dialogue systems. TOOT is a spoken dialogue agent that allows users to access train schedules stored on the web via a telephone conversation. TOOT has served as a testbed for evaluating spoken dialogue systems using the PARADISE evaluation framework. There is a recorded dialogue from a first experimental evaluation with TOOT, along with a transcript of the dialogue. A list of online publications about the PARADISE tool can be found at: http://www.research.att.com/~diane/evaluation-pubs.html

DISC Project - Spoken Language Dialogue Systems and Components: Best practice in development and evaluation http://www.elsnet.org/disc/

Description: DISC is an Esprit Long-Term Research project that is performing an in-depth examination of a broad selection of state-of-the-art spoken dialogue systems and their components in order to identify current development and evaluation practices and pinpoint their deficiencies. This site provides some introductory information about the project. DISC documents are currently available only to members of the DISC Advisory Panel.
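In the PARADISE framework mentioned above, dialogue performance is modelled as a weighted combination of a task-success measure and dialogue costs, each normalised to zero mean and unit variance across dialogues. The sketch below illustrates that combination; the weights and per-dialogue data are invented, and in PARADISE proper the weights are estimated by regression against user satisfaction rather than chosen by hand.

```python
import statistics

def zscore(values):
    """Normalise per-dialogue measurements to zero mean, unit variance."""
    mu, sd = statistics.mean(values), statistics.pstdev(values)
    return [(v - mu) / sd for v in values]

def performance(success, costs, alpha, weights):
    """performance_d = alpha * N(success_d) - sum_i w_i * N(cost_i_d)"""
    ns = zscore(success)
    ncs = [zscore(c) for c in costs]
    return [
        alpha * ns[d] - sum(w * nc[d] for w, nc in zip(weights, ncs))
        for d in range(len(success))
    ]

# Three hypothetical dialogues: task success score, number of turns, seconds.
perf = performance(
    success=[0.9, 0.6, 0.8],
    costs=[[12, 20, 14], [90, 150, 100]],
    alpha=1.0,
    weights=[0.5, 0.5],
)
```

Here the first dialogue scores best (highest success, lowest costs) and the second worst, which is the ranking one would expect.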

REWARD Project - Real World Application of Robust Dialogues http://www.cpk.auc.dk/speech/reward.html Description: The REWARD project addresses the needs of organisations which do business over the telephone: to automate certain telephone services using spoken language dialogue technology and to automate the process of creating such services. Documents from the REWARD project are not publicly available. A brief description of the project can be found at the web site listed above. See also the conference paper: Brøndsted, T., Bai, B. and Olsen, J. "The REWARD Service Creation Environment, an overview" Proceedings of ICSLP98, Sydney, Australia, Dec. 1998, pp. 1175-1178.

Now you're talking (BT guide for the design of voice services) http://www.labs.bt.com/projects/voice/talk/index.htm Description: This guide is an introduction to designing speech-based dialogues for interactive telephone services, aiming at a friendly, logical and consistent dialogue style. The guide details factors for consideration before a voice service can be set up and outlines the underlying technology. It provides a set of useful and practical guidelines for developers, although no theoretical motivation for the guidelines is given.


9.5 Dialogue control and dialogue modelling

Dialogue management http://www.sics.se/~scott/papers/twlt96/twlt96.html Description: This online paper describes dialogue management techniques developed in a speech-only dialogue system and how they are being extended for a multimodal system which combines a direct manipulation interface with a spoken dialogue interface for a simple consumer information service. The paper is written by one of the authors of the Giachin and McGlashan chapter 'Spoken Language Dialogue Systems' described in section 9.1.

9.6 Toolkits and development environments

The CSLU Web site http://cslu.cse.ogi.edu/ Description: The CSLU Web site is designed to facilitate learning about, and obtaining, language resources and technologies. The site provides access to publications, demonstrations, courses and interactive tutorials resulting from work at CSLU. The CSLU Toolkit and speech corpora, which are free of charge for researchers and educators, can be obtained from this site. This is an excellent site that provides a vast range of materials and tools; the CSLU toolkit, together with the tutorials provided, can be used to teach a complete course on spoken dialogue systems.

IMM course in Spoken Dialogue Systems http://www.kom.auc.dk/~lbl/IMM/S9_98/SDS_course_overview.html Description: The goal of this Web-based course, which is a module in the IMM (Intelligent MultiMedia) Masters Degree at the University of Aalborg, Denmark, is to give ‘an understanding of the design and implementation of speech-based interfaces through hands-on experience in building a spoken dialogue system.’ Students taking the course are required to design, implement and test a spoken dialogue system for a telephone-based service. The dialogues, which are developed in a textual script language, are tested on the REWARD dialogue platform, which is connected to the telephone network. The course assumes that the students have taken prerequisite modules in ‘Spoken Language Processing’ and ‘Design of Multi Modal HCI Systems’. The literature supporting the course includes reports from the DISC and REWARD projects; this material, which includes details of the development platform to be used for implementation, is not publicly available. A useful site, although access to most of the course materials is restricted.

GULAN: A System for Teaching Spoken Dialogue Systems Technology http://www.speech.kth.se/~joakim_g/plan/gulan.html


Description: The aim of this work has been to put a fully functioning spoken dialogue system into the hands of the students as an instructional aid. They can test it themselves, examine the system in detail, and are shown how to extend and develop its functionality. In this way, the authors hope to increase the students' understanding of the problems and issues involved and to spur their interest in this technology and its possibilities. The TMH speech toolkit, including a broker system with distributed servers, has been used to create an integrated lab environment that can be used on Unix machines. The system has been used in the courses on spoken language technology given at Masters level at the Royal Institute of Technology (KTH), at Linköping University and at Uppsala University in Sweden. The system, which is in Swedish but due to be ported to English, is currently only runnable locally. The site includes various links to screenshots as well as online publications. There is also a link to information about the research project 'Swedish Dialogue Systems': http://www.ida.liu.se/labs/nlplab/sds/

SpeechMania™ http://www.speech.be.philips.com:100/ud/get/Pages/09A_T_Prod01.htm Description: SpeechMania™ is a natural language recognition and understanding engine developed at Philips Speech Processing to provide human-to-human-like automatic services over the telephone. The software allows people to talk with computers over the phone, so that information services or transactions such as railway and flight timetables, bank statements, stock exchange quotations or mail-order articles can be fully automated. This site provides information about SpeechMania™ and about applications that have been developed with it. SpeechMania™ is not in the public domain, but it may be possible to obtain an academic licence for research purposes by contacting Philips Speech Processing.

9.7 Additional Resources

List of Spoken Dialogue Systems in Operation http://www.elsnet.org/disc/tools/opSLDSs.html Description: This page provides a list of spoken dialogue systems from France, Germany, Switzerland, the UK and the USA. In many cases phone numbers are given so that the systems can be called and tested. These sources are useful for students who wish to test working spoken dialogue systems.

SIGdial - Special Interest Group on Dialogue http://www.iet.com/Projects/sigdial/index.html Description: SIGdial is a Special Interest Group of the Association for Computational Linguistics (ACL). SIGdial is a non-profit cooperative organization sponsored by an international community of Discourse and Dialogue researchers in government, industry, and education. SIGdial has a number of aims that are of interest to those studying spoken dialogue systems, including: the promotion, development and distribution of reusable discourse processing components; exploration of techniques for evaluation of dialogue systems; sharing resources and data among the international community; encouraging empirical methods in research; agreeing upon standards for discourse transcription, segmentation, and annotation; promoting collaboration among developers of various dialogue system components; and supporting student participation in the discourse and dialogue community.

Further information, including instructions on how to join SIGdial, can be found at the URL listed above.

9.8 Similar courses

CSE561 Dialogue Spring 1999 http://www.cse.ogi.edu/cse561/ Description: This is a course taught at CSLU by Phil Cohen and Peter Heeman. The site provides a detailed schedule for the course, which includes slides, lists of readings, and other materials. Contact: Peter Heeman email: [email protected]

10 Language Resources

Language resources (LRs) play an infrastructural role, supporting system development for speech recognition, speech synthesis, natural language processing, and dialogue modeling, with each area having its own peculiarities as to the necessary LRs. This might make questionable both the placement of this module in the green book after those treating all the areas above, and its lumping together of topics.

In any case, the modules are devices to support curriculum organization; although in a particular curriculum this one may not appear as a separate unit of teaching, it should draw attention to subjects relevant to other modules, in conjunction with which they have to be at least touched upon.

As theoretical support, the only comprehensive book so far is

Gibbon, D., Moore, R. and Winski, R. (Eds.) (1997). Handbook of Standards and Resources for Spoken Language Systems. Berlin: Mouton de Gruyter.

For dialogue systems, more information relevant to LRs is included in

Bernsen, N.O., Dybkjaer, H. and Dybkjaer, L. (1998). Designing Interactive Speech Systems: From First Ideas to User Testing. New York: Springer Verlag; chapter 5 is dedicated to Wizard of Oz simulation, and chapter 7 to corpus handling.

The Internet resources listed below do not include already available LRs, which are mentioned in conjunction with the modules they belong to, but rather tools and proposed or standard formats and annotation/description formalisms of interest for the design, collection, and annotation of new LRs.
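To make the notion of an annotation format concrete, the sketch below parses a simple time-aligned label file into segment tuples. The three-column format shown here is hypothetical, invented for illustration rather than taken from any of the standards discussed.

```python
def parse_labels(text):
    """Parse 'start end label' lines (times in seconds) into (start, end, label) tuples."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        start, end, label = line.split(None, 2)
        segments.append((float(start), float(end), label))
    return segments

example = """\
# utterance 001, hypothetical label format
0.00 0.32 sil
0.32 0.61 hh
0.61 0.95 eh
"""
segs = parse_labels(example)
```

Real annotation formalisms add speaker, channel, and hierarchy information on top of such time-aligned records, which is where the tools below come in.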

From the list it can be noticed that the design and collection stages are not covered at all by software tools, for various reasons: for example, the lack of theoretically founded methodologies for LR design most often results in rather loose specifications, not implementable as such by generic tools, and collection depends on the particular recording equipment used.

The OGI Speech Tools ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/tools/ogitools.v1.0.tar.Z Availability: free Requirements: Unix machine with C compiler and X-windows server. Description: The OGI Speech Tools, precursors of the CSLU Toolkit, include two programs, lyre and autolyre, very useful for visualizing speech signals, spectrograms, and labels, and for signal annotation. Autolyre in particular is highly recommendable as a production tool, due to its batch-processing support. There are also a number of utilities for speech signal manipulation (byte swapping, rate and format conversion, filtering) and analysis (DFT, PLP).
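The simplest of these manipulation utilities are easy to illustrate. The sketch below swaps the byte order of a buffer of 16-bit samples, the operation needed when moving raw audio between big-endian and little-endian machines; it uses only the Python standard library and is an illustration of the operation, not the OGI code.

```python
import struct

def byteswap16(raw):
    """Swap the byte order of a buffer of 16-bit samples
    (here: big-endian in, little-endian out)."""
    n = len(raw) // 2
    samples = struct.unpack(f">{n}h", raw)   # interpret as big-endian shorts
    return struct.pack(f"<{n}h", *samples)   # re-emit as little-endian shorts

big = struct.pack(">3h", 1000, -2000, 3000)  # three big-endian samples
little = byteswap16(big)
```

Rate and format conversion are analogous resampling and re-quantisation passes over the same sample buffers.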

The ISIP Switchboard Segmenter and Transcriber http://www.isip.msstate.edu/resources/software/swb_segmenter/ http://www.isip.msstate.edu/resources/software/transcriber/ Availability: free Requirements: Unix machine with X-windows server, GNU make and C++ compiler, Tcl/Tk 8.0 or newer; optionally the Network Audio Server (NAS). Description: The Institute for Signal and Information Processing, Mississippi State University, has published two related programs for speech signal segmentation, transcription, and annotation. They are oriented towards telephone recordings, either conversations as in the Switchboard corpus, for which a two-channel segmenter was specifically designed, or utterances of a single speaker (SpeechDat-like), for which a transcriber was adapted. Both support the manipulation of pronunciation dictionaries. Impression: Good production tools.


The ETCA Transcriber http://www.etca.fr/CTA/gip/Projets/Transcriber/ Availability: free sources and binaries (Windows NT, Linux, Solaris, IRIX). Requirements: Tcl/Tk 8.0 or newer, SNACK 1.5 (see above), the tcLex 1.1 Tcl/Tk lexical analyzer generator; optionally the NIST Sphere library ver. 2.6a. Description: Although oriented towards the annotation of long single-channel signals, like those collected from radio and TV broadcasts, in which multiple speakers and topics can appear, this Transcriber supports other types of recordings too. Spectrographic displays of the signal are not yet supported, but they could easily be incorporated, as SNACK is used as the display engine.

The Rochester Dialogue Segmentation and Annotation Tools ftp://ftp.cs.rochester.edu/pub/packages/dialog-tools/ ftp://ftp.cs.rochester.edu/pub/packages/dialog-annotation/ Availability: free Requirements: Tcl/Tk 8.0, C compiler, awk, perl 5.02 or newer, perlTk; the Entropic ESPS/waves package (for segmentation). Description: The segmentation tools are intended to cut dialogues into utterances, and are based on the Entropic ESPS/waves package; they consist of a few special C programs, plus shell scripts. The dialog annotation tool is written in Tcl/Tk, and supports a specific annotation formalism. Impression: Although not necessarily appropriate for every project, they can be useful at least as examples.

The Nb (Nota bene) System http://www.sls.lcs.mit.edu/flammia/Nb.html Availability: free Requirements: Tcl 7.4 and Tk 4.0 or newer. Description: Nb is a graphical user interface for annotating the discourse structure of spoken dialogue, monologue, and text. Different annotation instructions and different theories about discourse interpretation and generation can easily be incorporated in the annotation process without the need to change the graphical user interface. The instructions and the annotated text are displayed in a clear-cut way, and typing is reduced to a minimum.
