Telephone Interface for the Email Service

Telephone Interface for the Email Service Nuno Baptista, Rui Prior Manuel E. Correia Instituto de Telecomunicações — Universidade do Porto CRACS — Universidade do Porto Porto, Portugal Porto, Portugal Email: [email protected] Email: [email protected] Abstract—Several studies have demonstrated the advantages There is an ever increasing need for mobile access to data of using IVR (Interactive Voice Response) technologies[5], which services [15]. This tendency is accelerating the development allow the users to phone a computer and access services by rate of mobile devices, where the traditional forms of input pressing the appropriate touch-tone keys on their telephones. Because it offers substantial benefits in terms of cost and time and output (keyboards, mice, and high-resolution displays) efficiency, there is a continuous pursuit for increased automation are cumbersome or impossible to use. The need for mobility using IVR. The real challenge lies in providing a user-friendly, yet is also a driving force behind the decrease in size of the cost-effective interface to users in order to improve their interac- devices we use to interact with Internet services. Function- tion with existing services, making them more usable and useful, ality and ease-of-use are rather limited on small devices, in providing an experience that fits their specific objectives and utilization contexts. This paper describes a method and system which designers are trying to include more and more features for providing customizable audio access to email messages kept into rapidly shrinking packages. This further exacerbates the in several IMAP (Internet Message Access Protocol) backstorages inadequacy of current input/output devices in mobile contexts, using an IVR application that takes full advantage of TTS (Text- making the use of alternative forms of user interaction not only To-Speech) software. advisable, but required. Thus, rather than merely shrinking the Keywords-IVR; voice user interfaces; touch-tone menus size of traditional interface elements, new I/O modalities must be explored. I. INTRODUCTION In some contexts, reliance on visual interaction is disad- Common forms of verbal and written interactions are cur- vantageous, and it is a real problem for visually-challenged rently being replaced by other more efficient communication users. Under these circumstances, human/computer interac- forms in the virtual world [11]. There are many practical and tion must rely on some other sense. Hearing is an obvious economic reasons that explain this current state of affairs [13]. candidate for an text-centric applications. Voice is the most By being able to dematerialize its presence in the virtual world, immediate and natural means of communication for Humans. an individual is able to conduct more then one conversation at Computer interfaces are no exception, and the use of voice for the same time and disseminate its virtual presence simultane- human/machine interactions is nowadays widely spread [3]. ously in many different places, increasing his communication When discussing communication in the Internet, it is in- capabilities by orders or magnitude. This is a game changer for evitable to consider email as one of the most useful and successful social and economical interactions. The widespread widespread services [1]. Its ubiquity is unparalleled, being adoption of cheap broadband Internet services, based on high- deployed in virtually all popular desktop and mobile operating speed data communications technologies like cable modems, systems. Email, however, has almost exclusively been accessed DSL (Digital Subscriber Line) or Fiber to the Home, is by the means of standard GUI (Graphical User Interface) catalyzing new forms of digital services and facilitating the applications. This paper presents a prototype of a practical IVR rapid integration of the Internet in daily activities by businesses (Interactive Voice Response) interface to the email service, a and ordinary users. telephonic MUA (Mail User Agent) that takes advantage of Computer-based VoIP (Voice over IP) services provide real TTS (Text-To-Speech) technologies for relaying textual email cost savings, portability and improved network utilization, information in the form of speech in mobile contexts where with the convergence of voice and data on the Internet visual access (reading) is impractical. Our goal is to further [4]. VoIP is currently being integrated with applications in leverage the ubiquity of email by providing a more versatile widespread use within business Intranets and innovative In- access method for situations where people are fully engaged ternet services [20]. These newer forms of integration have in critical tasks requiring continuous visual attention (such as attracted more users to this technology, since services that were driving) or vision cannot otherwise be used (e.g., by visually either unavailable in the PSTN (Public Switched Telephone impaired people). Service) or were too complex or expensive to deploy are While speech offers some unique advantages and opportuni- currently being made widespread by VoIP and associated ties for computer interaction, the known cognitive limitations services. This trend is expected to increase as the technology [18] of speech-based interfaces amplify the importance of matures and is further integrated into new and innovative data usability in the development of speech applications. IVR services, creating and promoting new forms of communication. usability design and re-engineering know-how ranges from style guides for touch-tone IVRs [10] to comprehensive col- • Security: In order to store user data preferences, we lections of best practices for IVR design [17], and also cover have used a MySQL database. As the database holds state-of-the-art speech-enabled IVRs [2]. Based on sound, an users passwords, we have employed security measures IVR system necessarily involves a serial model for content and cryptographic protocols to store, use and verify user presentation. Items at the top of the list are very prominent, credentials. Since there are two access passwords in the becoming progressively less prominent as we go down the list. system, one for the voice interface and the other for the This contrasts with web design where visual techniques, like web interface, each password is used as a cipher key to bold text, can be used to make items stand out, irrespective of encrypt the other, thus safeguarding data integrity. The their order. Thus, as a complement to the email IVR interface, web access is subject to some security concerns that will we developed a web configuration service that is used for the be detailed later. user pre-configuration of the touch-tone IVR email user agent. • Automatic language detection: Since we can use TTS There were previous approaches to speech-based email systems with support for different spoken languages, we access, e.g., Emacspeak [16]. However, Emacspeak still re- have also implemented automatic email text language quires a computer with a regular keyboard. Together with the detection in order to convey the email message to the concept of virtual folders, introduced in a program called View user in the appropriate spoken language. Mail[12], it was used as inspiration for our work. • Email acronyms expansion: Email writing is an informal The rest of the paper is organized as follows. Section II means of communication that frequently makes heavy us- describes the design of the voice interface and the associated age of acronyms and abbreviations. We have implemented web-based customization interface. Section III describes some an acronym expansion based mechanism to make email implementation details of our prototype. Section IV discusses speech more fluid and understandable. The currently the different TTS modules used in our system. Section V recognizable acronym list can and will be increased in concludes the paper and provides some topics for further the future. Users can add or overload acronym expansion development of this work. entries using a personal acronym list. A. Voice Interface II. SYSTEM DESIGN The voice interface plays a central role in the system. We have used the IMAP model for email access [6] as a Whether using voice menus or graphical menus, the presenta- guiding framework for the hierarchical folder-based structure tion of the software to the user as a list of items is one of the of the developed IVR interface. Our system is based on two most popular styles of interaction. Such structure favors the complementary applications, one is based on voice and the user’s performance by helping him form a mental model that other is based on the web. is in agreement with the actual system model. The web interface is used to configure the IVR system. The The development of the voice interface was performed IVR has many parameters that need to be set and adjusted having in mind the following aspects: prior to its usage, and this can only be done effectively by the • Time: Voice applications have in time their added value. means of a more traditional web application. Parameters are Feedback should be brief but informative in order to save divided in several configuration groups, namely virtual email time and reduce the amount of information that the user folders and message filters specifications, security credentials must hold in memory. for backstorage access, email language automatic detection, • Instant feedback: In a well-designed prompt, it must be and acronyms expansions for a more fluid user hearing

Load more