Parastack: Efficient Hang Detection for MPI Programs at Large Scale

Total Page:16

File Type:pdf, Size:1020Kb

Parastack: Efficient Hang Detection for MPI Programs at Large Scale ParaStack: Eicient Hang Detection for MPI Programs at Large Scale Hongbo Li, Zizhong Chen, Rajiv Gupta University of California, Riverside CSE Department, 900 University Ave Riverside, CA 92521 fhli035,chen,[email protected] ABSTRACT It is widely accepted that some errors manifest more frequently While program hangs on large parallel systems can be detected via at large scale both in terms of the number of parallel processes and the widely used timeout mechanism, it is dicult for the users to problem size as testing is usually performed at small scale to manage set the timeout – too small a timeout leads to high false alarm rates cost and some errors are scale and input dependent [12, 29, 36, 39, and too large a timeout wastes a vast amount of valuable computing 40]. Due to communication, an error triggered in one process (faulty resources. To address the above problems with hang detection, this process) gradually spreads to others, nally leading to a global hang. paper presents ParaStack, an extremely lightweight tool to detect Although a hang may be caused by a single faulty process, this hangs in a timely manner with high accuracy, negligible overhead process is hard to locate as it is not easily distinguishable from with great scalability, and without requiring the user to select a other processes whose execution has stalled. us, the problem of timeout value. For a detected hang, it provides direction for fur- hang diagnosing, i.e. locating faulty processes, has received much ther analysis by telling users whether the hang is the result of an aention [11, 27, 28, 31]. error in the computation phase or the communication phase. For Typically hang diagnosing is preceded by the hang detection step a computation-error induced hang, our tool pinpoints the faulty and this problem has not been adequately addressed. Much work process by excluding hundreds and thousands of other processes. has been done on communication-deadlock — a special case of We have adapted ParaStack to work with the Torque and Slurm hang — using methods like time-out [17, 25, 32], communication parallel batch schedulers and validated its functionality and perfor- dependency analysis [20, 38], and formal verication [37]. ese mance on Tianhe-2 and Stampede that are respectively the world’s tools either use an imprecise timeout mechanism or precise but current 2nd and 12th fastest supercomputers. Experimental results centralized technique that limits scalability. MUST [22, 23] claims demonstrate that ParaStack detects hangs in a timely manner at to be a scalable tool for detecting MPI deadlocks at large scale, negligible overhead with over 99% accuracy. No false alarm is ob- but its overhead is still non-trivial as it ultimately checks MPI se- served in correct runs taking 66 hours at scale of 256 processes and mantics across all processes. ese non-timeout methods do not 39.7 hours at scale of 1024 processes. ParaStack accurately reports address the full scope of hang detection, as they do not consider the faulty process for computation-error induced hangs. computation-error induced hangs. In terms of hang detection, ad hoc timeout mechanism [2, 27, 28, 31] is the mainstream; however, it is dicult to set an appropriate threshold even for users that 1 INTRODUCTION have good knowledge of an application. is is because the optimal timeout not only varies across applications, but also with input Program hang , the phenomenon of unresponsiveness [34], is a characteristics and the underlying computing platform. Choosing common yet dicult type of bug in parallel programs. In large a timeout that is too small leads to high false alarm rates and too scale MPI programs, errors causing a program hang can arise in large timeouts lead to long detection delays. e user may favor either the computation phase or the MPI communication phase. selecting a very large timeout to achieve high accuracy while sac- Hang causing errors in the computation phase include innite ricing delay in detecting a hang. For example, IO-Watchdog [2] loop [31] within an MPI process, local deadlock within a process monitors writing activities and detects hangs based on a user spec- due to incorrect thread-level synchronization [28], so error in one ied timeout with 1 hour as the default. Up to 1 hour on every MPI process that causes the process to hang, and unknown errors processing core will be wasted if the user uses the default timeout in either soware or hardware that cause a single computing node seing. us, a lightweight hang detection tool with high accuracy to freeze. Errors in MPI communication phase that can give rise to is urgently needed for programs encountering non-deterministic a program hang include MPI communication deadlocks/failures. hangs or sporadically triggered hangs (e.g., hangs that manifest rarely and on certain inputs). It can be deployed to automatically Permission to make digital or hard copies of all or part of this work for personal or terminate erroneous runs to avoid wasting computing resources classroom use is granted without fee provided that copies are not made or distributed for prot or commercial advantage and that copies bear this notice and the full citation without adversely eecting the performance of correct runs. on the rst page. Copyrights for components of this work owned by others than ACM To address the above need for hang detection, this paper presents must be honored. Abstracting with credit is permied. To copy otherwise, or republish, ParaStack, an extremely lightweight tool to detect hangs in a timely to post on servers or to redistribute to lists, requires prior specic permission and/or a fee. Request permissions from [email protected]. manner with high accuracy, negligible overhead with great scala- SC17, Denver, CO, USA bility, and without requiring the user to select a timeout value. Due © 2017 ACM. 978-1-4503-5114-0/17/11...$15.00 to its lightweight nature, ParaStack can be deployed in production DOI: 10.1145/3126908.3126938 SC17, November 12–17, 2017, Denver, CO, USA Hongbo Li, Zizhong Chen, Rajiv Gupta runs without adversely aecting application performance when no hang arises. It handles communication-error-induced hangs and hangs brought about by a minority of processes encountering a computation error. For a detected hang, ParaStack provides direc- tion for further analysis by telling whether the hang is the result of an error in the computation phase or the communication phase. For a computation-error induced hang, it pinpoints faulty processes. ParaStack is a parallel tool based on stack trace that judges a hang by detecting dynamic manifestation of following paern of behavior – persistent existence of very few processes outside of MPI calls. is simple, yet novel, approach is based upon the fol- lowing observation. Since processes iterate between computation and communication phases, a persistent dynamic variation of the count of processes outside of MPI calls indicates a healthy running state while a continuous small count of processes outside MPI calls strongly indicates the onset of a hang. Based on execution history, ParaStack builds a runtime model of count that is robust even with limited history information and uses it to evaluate the likelihood Figure 1: ParaStack workow – steps with solid border are of continuously observing a small count. A hang is veried if the performed by ParaStack and those shown with dashed bor- likelihood of persistent small count is signicantly high. Upon der require a complimentary tool. detecting a hang, ParaStack reports the process in computation phase, if any, as faulty. e above execution behavior based model is capable of detecting that ParaStack detects hangs in a timely manner at negligible hangs for dierent target programs, with dierent input charac- overhead with over 99% accuracy. No false alarm was observed teristics and sizes, and running on dierent computing platforms in correct runs taking about 66 hours in total at the scale of without any assistance from the programmer alike. ParaStack re- 256 processes and 39.7 hours at the scale of 1024 processes. In ports hang very accurately and in a timely manner. By monitoring addition, ParaStack accurately identies the faulty process for only a constant number of processes, ParaStack introduces negligi- computation-error induced hangs. ble overhead and thus provides good scalability. Finally, it helps in identifying the cause of the hang. If a hang is caused by a faulty process with an error, all the other concurrent processes get stuck 2 THE CASE FOR PARASTACK inside MPI communication calls. If the error is inside communi- Hang detection is of value to application users and developers alike. cation phase, the faulty process will also stay in communication; Application users usually do not have the knowledge to debug the otherwise, it will stay in computation phase. Simply checking application. In batch mode, when a hang is encountered, the ap- whether there are processes outside of communication can tell the plication simply wastes the remainder of the allocated computing type of hang, communication-error or computation-error induced, time. is problem is further exacerbated by the fact that users as well as the faulty processes for a computation-error induced commonly request a bigger time slot than what is really needed to hang. e main contributions of ParaStack are: ensure their job can complete. If users are unaware of a hang occur- rence, they may rerun the application with even a much bigger time • ParaStack introduces highly ecient non-timeout mechanism allocation, which will lead to even more waste. By aaching a hang to detect hangs in a timely manner with high accuracy, negligi- detection capability to a batch job scheduler with negligible over- ble overhead, and great scalability. us it avoids the diculty head, ParaStack can help by terminating the jobs and reporting the of seing the timeout value.
Recommended publications
  • Hang Analysis: Fighting Responsiveness Bugs
    Hang Analysis: Fighting Responsiveness Bugs Xi Wang† Zhenyu Guo‡ Xuezheng Liu‡ Zhilei Xu† Haoxiang Lin‡ Xiaoge Wang† Zheng Zhang‡ †Tsinghua University ‡Microsoft Research Asia Abstract return to life. However, during the long wait, the user can neither Soft hang is an action that was expected to respond instantly but in- cancel the operation nor close the application. The user may have stead drives an application into a coma. While the application usu- to kill the application or even reboot the computer. ally responds eventually, users cannot issue other requests while The problem of “not responding” is widespread. In our expe- waiting. Such hang problems are widespread in productivity tools rience, hang has occurred in everyday productivity tools, such as such as desktop applications; similar issues arise in server programs office suites, web browsers and source control clients. The actions as well. Hang problems arise because the software contains block- that trigger the hang issues are often the ones that users expect to re- ing or time-consuming operations in graphical user interface (GUI) turn instantly, such as opening a file or connecting to a printer. As and other time-critical call paths that should not. shown in Figure 1(a), TortoiseSVN repository browser becomes unresponsive when a user clicks to expand a directory node of a This paper proposes HANGWIZ to find hang bugs in source code, which are difficult to eliminate before release by testing, remote source repository. The causes of “not responding” bugs may be complicated. An as they often depend on a user’s environment. HANGWIZ finds hang bugs by finding hang points: an invocation that is expected erroneous program that becomes unresponsive may contain dead- to complete quickly, such as a GUI action, but calls a blocking locks (Engler and Ashcraft 2003; Williams et al.
    [Show full text]
  • What Is an Operating System III 2.1 Compnents II an Operating System
    Page 1 of 6 What is an Operating System III 2.1 Compnents II An operating system (OS) is software that manages computer hardware and software resources and provides common services for computer programs. The operating system is an essential component of the system software in a computer system. Application programs usually require an operating system to function. Memory management Among other things, a multiprogramming operating system kernel must be responsible for managing all system memory which is currently in use by programs. This ensures that a program does not interfere with memory already in use by another program. Since programs time share, each program must have independent access to memory. Cooperative memory management, used by many early operating systems, assumes that all programs make voluntary use of the kernel's memory manager, and do not exceed their allocated memory. This system of memory management is almost never seen any more, since programs often contain bugs which can cause them to exceed their allocated memory. If a program fails, it may cause memory used by one or more other programs to be affected or overwritten. Malicious programs or viruses may purposefully alter another program's memory, or may affect the operation of the operating system itself. With cooperative memory management, it takes only one misbehaved program to crash the system. Memory protection enables the kernel to limit a process' access to the computer's memory. Various methods of memory protection exist, including memory segmentation and paging. All methods require some level of hardware support (such as the 80286 MMU), which doesn't exist in all computers.
    [Show full text]
  • Mac OS X: an Introduction for Support Providers
    Mac OS X: An Introduction for Support Providers Course Information Purpose of Course Mac OS X is the next-generation Macintosh operating system, utilizing a highly robust UNIX core with a brand new simplified user experience. It is the first successful attempt to provide a fully-functional graphical user experience in such an implementation without requiring the user to know or understand UNIX. This course is designed to provide a theoretical foundation for support providers seeking to provide user support for Mac OS X. It assumes the student has performed this role for Mac OS 9, and seeks to ground the student in Mac OS X using Mac OS 9 terms and concepts. Author: Robert Dorsett, manager, AppleCare Product Training & Readiness. Module Length: 2 hours Audience: Phone support, Apple Solutions Experts, Service Providers. Prerequisites: Experience supporting Mac OS 9 Course map: Operating Systems 101 Mac OS 9 and Cooperative Multitasking Mac OS X: Pre-emptive Multitasking and Protected Memory. Mac OS X: Symmetric Multiprocessing Components of Mac OS X The Layered Approach Darwin Core Services Graphics Services Application Environments Aqua Useful Mac OS X Jargon Bundles Frameworks Umbrella Frameworks Mac OS X Installation Initialization Options Installation Options Version 1.0 Copyright © 2001 by Apple Computer, Inc. All Rights Reserved. 1 Startup Keys Mac OS X Setup Assistant Mac OS 9 and Classic Standard Directory Names Quick Answers: Where do my __________ go? More Directory Names A Word on Paths Security UNIX and security Multiple user implementation Root Old Stuff in New Terms INITs in Mac OS X Fonts FKEYs Printing from Mac OS X Disk First Aid and Drive Setup Startup Items Mac OS 9 Control Panels and Functionality mapped to Mac OS X New Stuff to Check Out Review Questions Review Answers Further Reading Change history: 3/19/01: Removed comment about UFS volumes not being selectable by Startup Disk.
    [Show full text]
  • Recovering from Operating System Crashes
    Recovering from Operating System Crashes Francis David Daniel Chen Department of Computer Science Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign University of Illinois at Urbana-Champaign Urbana, USA Urbana, USA Email: [email protected] Email: [email protected] Abstract— When an operating system crashes and hangs, it timers and processor support techniques [12] are required to leaves the machine in an unusable state. All currently running detect such crashes. program state and data is lost. The usual solution is to reboot the It is accepted behavior that when an operating system machine and restart user programs. However, it is possible that after a crash, user program state and most operating system state crashes, all currently running user programs and data in is still in memory and hopefully, not corrupted. In this project, volatile memory is lost and unrecoverable because the proces- we use a watchdog timer to reset the processor on an operating sor halts and the system needs to be rebooted. This is inspite system crash. We have modified the Linux kernel and added of the fact that all of this information is available in volatile a recovery routine that is called instead of the normal boot up memory as long as there is no power failure. This paper function when the processor is reset by a watchdog. This resumes execution of user processes after killing the process that was explores the possibility of recovering the operating system executing when the watchdog fired. We have implemented this using the contents of memory in order to resume execution on the ARM architecture and we use a periodic watchdog kick of user programs without any data loss.
    [Show full text]
  • How to Use Microsoft Adplus to Capture Hang and Crash Logs for EFT Server
    How to use Microsoft ADPlus to capture hang and crash logs for EFT Server THE INFORMATION IN THIS ARTICLE APPLIES TO: • EFT v6 and later DISCUSSION In order for support to properly determine root cause of a crash or hang of EFT Server, the client must be able to provide a crash dump or a hang dump. This article describes the following procedures: 1. How to use Microsoft ADPlus to capture hang and crash logs. 2. Collect a hang/crash dump. 3. Create and submit a case to support. IMPORTANT: Use of ADPlus to monitor crashes requires an active login on the system. Logging off will kill the ADPlus application. This can run while the server is locked, but user must NOT log off while using this. You will also need to disable any automatic restarting of the EFT service to ensure that the dump files can be collected properly. Confirm if you are experiencing a Hang or a Crash • A Hang is when the process (cftpstes.exe) is still running in task Manager but EFT does not respond to FTP/SSH/HTTP connections. It may also not respond to Administrator requests through COM or through the Administrator Interface. • A Crash is when the process (cftpstes.exe) is no longer running in Task Manager. Download ADPlus Download the 32-bit version of ADPlus from the URL below to capture the Crash or Hang dump. https://support.microsoft.com/en-us/help/286350/how-to-use-adplus-vbs-to-troubleshoot-hangs-and-crashes **Although there is a 64-bit version of ADPLUS, you MUST use the 32-bit version regardless of OS, as EFT is a 32-bit process** How to use Microsoft ADPlus to capture hang and crash logs for EFT Server Install ADPLUS Install ADPLUS, choosing the "custom" option, in the following folder: C:\dbtools Once ADP is installed, create the following folder: C:\ADPlusOutput For an Application Hang If you are investigating a hang, then the adplus.bat file should only be run when the system is hanging.
    [Show full text]
  • 3.01.0000.00 Release Notes
    VTrak® A-Class Mac OS X or macOS SAN Client VTrak Mac OS X Client Package 1.4.2 (build 54047) Release Notes. (Mac OS X/macOS Clients only) This Mac Client Package Requires VTrak A-Class firmware SR3.3 Version 1.16.0000.00 or later R1.00 10/25/2017 This document is applicable for SAN Clients of the following PROMISE A-Class models: Model Description VTrak A3800fSL 4U/24 FC, single controller with 2 FS Support VTrak A3800fDM 4U/24 FC, dual controller with 4 FS support VTrak A3600fSL 3U/16 FC, single controller with 2 FS Support VTrak A3600fDM 3U/16 FC, dual controller with 4 FS support The Client Operating Systems supported by this release include: Vendor Platform Type macOS 10.13 x64 macOS 10.12 (10.12.4, x64 Apple 10.12.5, 10.12.6) OS X 10.11 (10.11.6) x64 VTrak Client Package version 1.4.2 (54047) This release includes the macOS (High Sierra) 10.13 Support Notes: . Includes Extended ACL support for SAN client . Includes Spotlight support - Command line command “mdutil -i on” to enable Spotlight on VTrakFS (network) volume . Supports for English language only. The option to switch language is not implemented. R1.00 10/25/2017 Important Note on upgrading the VTrak Client in macOS 10.13: Sometimes after upgrading to macOS 10.13 and installing the new VTrakFS Client, you won’t be able to mount the file system. This is likely due to complications from the new requirements for User Approved Kernel Extension loading in macOS 10.13.
    [Show full text]
  • How to Start Using Princeton's High Performance Computing Systems
    How to Start Using Princeton’s High Performance Computing Systems Dawn Koffman Office of Population Research Princeton University January 2017 1 Running Your Computation on Princeton’s High Performance Computing Systems significantly more computing resources available compared to your laptop or desktop but often, particularly at first, it is much easier to develop and debug locally, and then ‐ connect ‐ transfer files ‐ submit jobs and make use of full computing power available on these remote systems 2 Workshop Outline Part I • Princeton’s High Performance Computing Systems Part II • Linux Philosophy and User Interface • Files • Directories • Commands • Shell Programs • Stream Editor: sed 3 Princeton’s High Performance Computing Systems • Overview • Obtaining Accounts • Connecting • Transferring Files • Running R scripts and Stata .do Files • Using a Scheduler to Submit Computing Jobs 4 Part I Princeton’s High Performance Computing Systems ‐ remote computing systems managed by Princeton’s Research Computing group http://www.princeton.edu/researchcomputing/ ‐hardware location: ‐ High Performance Computing Research Center (HPCRC) ‐ 47,000‐square‐foot facility opened in 2011 ‐tours ‐ computing systems make up TIGRESS: Terascale Infrastructure for Groundbreaking Research in Engineering and Science ‐in addition to remote computing systems, Research Computing also manages: software licenses and support (Stata, SAS, Matlab, …) visualization lab (open house today, 3:30‐5:30pm, Peter B. Lewis Library, Room 347) 5 How to Get Started ‐ request
    [Show full text]
  • Some Computers May Hang Or Present a Blue Screen of Death
    FAQ Some Computers May Hang or Present a Blue Screen of Death with Stop Code: MEMORY MANAGEMENT Error You may experience a time where your Dell computer stops working or presents a Stop Code: MEMORY MANAGEMENT error Figure 1, (commonly called a Blue Screen Of Death or BSOD), this may be caused by DDR4 2666MHz memory modules installed in the computers listed in Table 1. Dell recommends that the latest BIOS be installed. For more information refer to Dell Knowledge Base article Dell BIOS Updates. Note: To determine whether or not that the latest BIOS includes the the resolution for this issue, perform the following steps. 1. Once on the product page for your computer, select the BIOS file name to expand the section. 2. Touch or click View full driver details. 3. Look under Enhancements:. 4. When Improved memory compatibility for DDR4 2666Mhz memory DIMMS is listed, the BIOS has been updated to resolve the issue. When it does not, please refer to this site later, for an updated BIOS. Dell also provides an application called SupportAssist, which provides automatic computer updates and proactive resolution features to help identify and prevent issues. To run the SupportAssist application, perform the following steps. 1. In the Search Box on your computer, type SupportAssist. 2. Select SupportAssist in the search results to open the application. Note: When SupportAssist is not listed in the search results, it means the application is not installed on your computer. For more information on how to download and install SupportAssist, see the SupportAssist for PCs and tablets page.
    [Show full text]
  • Automating Problem Analysis and Triage Sasha Goldshtein @Goldshtn Production Debugging
    Automating Problem Analysis and Triage Sasha Goldshtein @goldshtn Production Debugging Requirements Limitations • Obtain actionable • Can’t install Visual information about Studio crashes and errors • Can’t suspend • Obtain accurate production servers performance • Can’t run intrusive information tools In the DevOps Process… Automatic build (CI) Automatic Automatic deployment remediation (CD) Automatic Automatic error triage monitoring and analysis Dump Files Dump Files • A user dump is a snapshot of a running process • A kernel dump is a snapshot of the entire system • Dump files are useful for post-mortem diagnostics and for production debugging • Anytime you can’t attach and start live debugging, a dump might help Limitations of Dump Files • A dump file is a static snapshot • You can’t debug a dump, just analyze it • Sometimes a repro is required (or more than one repro) • Sometimes several dumps must be compared Taxonomy of Dumps • Crash dumps are dumps generated when an application crashes • Hang dumps are dumps generated on-demand at a specific moment • These are just names; the contents of the dump files are the same! Generating a Hang Dump • Task Manager, right- click and choose “Create Dump File” • Creates a dump in %LOCALAPPDATA%\Te mp Procdump • Sysinternals utility for creating dumps • Examples: Procdump -ma app.exe app.dmp Procdump -ma -h app.exe hang.dmp Procdump -ma -e app.exe crash.dmp Procdump -ma -c 90 app.exe cpu.dmp Procdump -m 1000 -n 5 -s 600 -ma app.exe Windows Error Reporting • WER can create dumps automatically
    [Show full text]
  • Dell EMC Powerscale Onefs: Using Macos Clients
    Best Practices Dell EMC PowerScale OneFS: Using macOS Clients Abstract This document provides best practices for using Apple® macOS® version 10.13 and above with Dell EMC™ PowerScale™ OneFS™ version 8.0 and above. June 2020 H17954 Revisions Revisions Date Description September 2014 Initial release named “Using Mac OS X Clients with PowerScale OneFS 7.x” September 2019 Renamed title and updated focus on using macOS 10.13 and above on OneFS 8.0 and above; updated with new template June 2020 PowerScale rebranding Acknowledgements This paper was produced by the following members of the Dell EMC: Author: Lieven Lin ([email protected]) Support: Gregory Shiff Dell EMC and the author of this document welcome your feedback along with any recommendations for improving this document. The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any software described in this publication requires an applicable software license. Copyright © 2014–2020 Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC, Dell EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be trademarks of their respective owners. [6/6/2020] [Best Practices] [H17954] 2 Dell EMC PowerScale OneFS: Using macOS Clients | H17954 Table of contents Table of contents Revisions............................................................................................................................................................................
    [Show full text]
  • Why Software Hangs and What Can Be Done with It ∗
    Why Software Hangs and What Can Be Done With It ∗ Xiang Song, Haibo Chen and Binyu Zang Parallel Processing Institute, Fudan University {xiangsong, hbchen, byzang}@fudan.edu.cn Abstract In this paper, we present the first comprehensive study Software hang is an annoying behavior and forms a ma- on real-world software hang bugs. We study the reports of jor threat to the dependability of many software systems. hang-related bugs from four popular open source applica- To avoid software hang at the design phase or fix it in pro- tions: MySQL, PostGre, Apache HTTPD server and Fire- duction runs, it is desirable to understand its characteris- fox, which are widely used in three-tier browser-server ar- tics. Unfortunately, to our knowledge, there is currently chitecture. In total, we collect 307 confirmed hang bugs no comprehensive study on why software hangs and how and analyze the characteristics of 233 bugs with known root to deal with it. In this paper, we study the reported hang- causes (categorized as Certain). We also examine the rest related bugs of four typical open-source software applica- 74 bugs (categorized as Uncertain) to study the obstacles to tions, aiming to gain insight into characteristics of software fix these bugs. hang and provide some guidelines to fix them at the first Our study makes the following observations: place or remedy them in production runs. 1 Introduction • Applications not only heavily suffer from well-known hang causes such as deadlock, data race and infinite Software dependability is crucial to server and user ap- loop, but also from design errors and execution envi- plications, especially mission-critical ones.
    [Show full text]
  • Mapping Guis to Auditory Interfaces
    Mapping GUIs to Auditory Interfaces Elizabeth D. Mynatt and W. Keith Edwards Graphics, Visualization, and Usability Center College of Computing Georgia Institute of Technology Atlanta, GA 3032-0280 [email protected], [email protected] ABSTRACT little trouble accessing standard ASCII terminals. The line- This paper describes work to provide mappings between X- oriented textual output displayed on the screen was stored in based graphical interfaces and auditory interfaces. In our the computer’s framebuffer. An access program could sim- system, dubbed Mercator, this mapping is transparent to ply copy the contents of the framebuffer to a speech synthe- applications. The primary motivation for this work is to sizer, a Braille terminal or a Braille printer. Conversely, the provide accessibility to graphical applications for users who contents of the framebuffer for a graphical interface are sim- are blind or visually impaired. We describe the design of an ple pixel values. To provide access to GUIs, it is necessary to auditory interface which simulates many of the features of intercept application output before it reaches the screen. This graphical interfaces. We then describe the architecture we intercepted application output becomes the basis for an off- have built to model and transform graphical interfaces. screen model of the application interface.The information in Finally, we conclude with some indications of future the off-screen model is then used to create alternative, acces- research for improving our translation mechanisms and for sible interfaces. creating an auditory “desktop” environment. The goal of this work, called the Mercator1 Project, is to pro- KEYWORDS: Auditory interfaces, GUIs, X, visual impair- vide transparent access to X Windows applications for com- ment, multimodal interfaces.
    [Show full text]