Supporting Medical Conversations between Deaf and Hearing Individuals with Tabletop Displays

Anne Marie Piper and James D. Hollan
Distributed Cognition and Human-Computer Interaction Laboratory
Department of Cognitive Science, University of California, San Diego
9500 Gilman Dr., La Jolla, CA 92093
{apiper, hollan}@ucsd.edu

ABSTRACT
This paper describes the design and evaluation of Shared Speech Interface (SSI), an application for an interactive multitouch tabletop display designed to facilitate medical conversations between a deaf patient and a hearing, non-signing physician. We employ a participatory design process involving members of the deaf community as well as medical and communication experts. We report results from an evaluation that compares conversation when facilitated by: (1) a digital table, (2) a human sign language interpreter, and (3) both a digital table and an interpreter. Our research reveals that tabletop displays have valuable properties for facilitating discussion between deaf and hearing individuals as well as enhancing privacy and independence. The contributions of this work include initial guidelines for cooperative group work technology for users with varying hearing abilities, discussion of benefits of participatory design with the deaf community, and lessons about using dictated speech on shared displays.

Categories and Subject Descriptors
H.5.3 [Information Interfaces and Presentation (e.g., HCI)]: Group and Organization Interfaces - Computer-supported cooperative work.

General Terms
Design, Experimentation, Human Factors.

Keywords
Computer-supported cooperative health care, multitouch, tabletop groupware, assistive technology, deafness, multimodal interfaces, speech recognition.

INTRODUCTION
This paper presents the design and evaluation of Shared Speech Interface (SSI), an application for an interactive tabletop display that enables and supports communication between a deaf patient and a hearing, non-signing medical doctor. Currently, medical facilities provide a sign language interpreter to facilitate communication between a deaf patient and hearing doctor. Deaf patients must plan ahead to ensure that an interpreter is available. For most patients, privacy is a central concern. Some deaf patients, depending on their comfort level with interpreters, prefer and actually use an alternate communication channel (e.g., email or instant messenger) to discuss sensitive medical issues with their physician. Increasing communication privacy is one motivation behind the work reported here.

While other viable communication tools for the deaf community exist, tabletop displays with speech recognition have particular potential to facilitate medical conversations between deaf and hearing individuals. Consultations with physicians often involve discussion of visuals such as medical records, charts, and scan images. Interactive tabletop displays are effective for presenting visual information to multiple people at once without necessarily designating one person as the owner of the visual. Taking notes while meeting with a physician is problematic for deaf individuals because it requires simultaneously attending to the doctor's facial expressions, the interpreter's visual representation of speech, and notes on paper. A tabletop display allows all active participants to maintain face-to-face contact while viewing a representation of conversation in a central location. Our implementation incorporates keyboard input by the patient and speech input by the doctor, allowing the physician to speak and gesture as they discuss medical details and visuals with the patient. SSI leverages the affordances of multimodal tabletop displays to enhance communication between a doctor and patient, potentially transforming a challenging situation into a constructive and collaborative experience.

Our work on SSI provides practical experience designing a shared communication device for users with varying hearing and speaking abilities. We examine the challenges of representing speech visually to multiple users around a tabletop display. We also provide design guidelines for using dictated speech with shared display systems and discuss the implications of our work on multimodal tabletop displays for the broader CSCW community.

BACKGROUND
Loss of hearing is a common problem that can result from noise, aging, disease, and heredity. Approximately 28 million Americans have significant hearing loss, and of that group, almost six million are profoundly deaf [17]. A primary form of communication within the United States deaf community is American Sign Language (ASL). ASL is not a visual form of English; it is a different language with its own unique grammatical and syntactical structure. Sources estimate that ASL is the fourth most commonly used language in the U.S. [17]. While ASL is widely used in the U.S., no one form of sign language is universal. Different countries and regions use different sign languages. For example, British Sign Language is different from American Sign Language, although both countries have English as their primary spoken language.

Within the deaf community, as with most communities, there is great variability among individual needs and abilities. Individuals who were born deaf may or may not have been raised in the aural tradition where they were taught to speak, read, and write in English. Others who suffer from late-onset hearing loss typically have fully developed vocal abilities but are not fluent in ASL and do not know how to read lips. This range of individual abilities and needs has led to a gamut of techniques for communicating with and adapting to the hearing world.

Adaptive Techniques
For the deaf population proficient in a spoken language such as English, writing has long been a central form of communication with the hearing world. Deaf individuals without an interpreter nearby often use handwritten notes and deictic gesturing as a means of communication. Telephone use was impossible for the deaf community until the invention of the Teletype and Text Telephone (TTY), a typing-based system that transmits individual lines of keyboard entry over phone lines. Adoption of the TTY and eventually the personal computer made typing an essential mode of communication within the deaf community. In recent years, the invention of webcams and increasing Internet bandwidth gave rise to communication through videochat with other ASL speakers and ASL interpreters.

ASL interpreters play a central role in enabling face-to-face communication between many deaf and hearing individuals. For the deaf population fluent in ASL, communicating through an interpreter is an optimal choice for many situations. Interpreters, however, are expensive and not always available. Furthermore, though interpreters are bound by a confidentiality agreement, the presence of a third person in a highly private conversation may reduce a deaf person's comfort and inhibit their willingness to speak candidly.
Related Work
Researchers have developed a variety of technologies to address communication barriers between the deaf community and the hearing world. Researchers investigating tabletop technologies, however, have traditionally explored cooperative group work only for hearing populations and have yet to examine the value of tabletop displays for deaf populations.

Technologies for the Deaf
As early as 1975, researchers began investigating how cooperative computing environments, such as early forms of instant messenger, could facilitate communication between deaf and hearing individuals [27]. More recently, HCI researchers have examined how mobile devices, tablet computers, and video conferencing technologies can augment communication for deaf individuals. Schull investigated communication via a browser-based client on multiple co-located laptops that allows a deaf and a hearing user to access a common browser window and share real-time chat information [21]. iCommunicator [7], a commercial product, enables communication in a similar way. Only a handful of initiatives, however, have attempted to facilitate shared face-to-face communication experiences. The majority of work has focused on single-user interfaces for distributed applications. For example, MobileASL enables two signing individuals to communicate with each other over cell phones with real-time video [1]. Scribe4Me, a mobile sound translation tool, enables deaf individuals to request a transcription of the past 30 seconds of audio in their environment [10]. There is also work using peripheral displays to visualize various channels of auditory information for deaf individuals [11]. The Facetop Tablet project examined visual attention problems that deaf individuals experience when trying to view a presentation, watch an ASL interpreter, and take notes [13]. This project presents a viable approach for helping deaf individuals follow a conversation with hearing individuals, but it does not explicitly help deaf individuals communicate or become active participants in the conversation. Past research has also examined enhancing communication for deaf individuals through gesture recognition with computer vision and wearable computers (e.g., [24]).

While these solutions address various communication challenges for deaf individuals, none provide a shared communication experience that bridges both speaking and listening barriers between deaf and hearing individuals. Furthermore, many of these communication technologies only provide an interface and feedback to one user. Our system, on the other hand, leverages the cooperative nature of interactive tabletop displays to enable a shared, co-constructed communication experience between users with varying hearing and speaking abilities.

Tabletop Displays
The field of CSCW has a rich history of research on tabletop displays and their utility for supporting group work. This body of research examines cooperative group work around tabletop displays with hearing populations and is the foundation for our research. We leverage techniques from research on social protocols around digital tables [16], the notion of shared and private spaces on tabletop displays [22], and how to present textual, pictorial, and auditory information to users [15][19].

Research on multimodal tabletop interaction, specifically integrating speech with touch input, also influenced our work. Work by Tse et al. involving speech commands in group design and gaming experiences is a primary example [25][26]. This work focuses on facilitating interactions between hearing individuals but does not examine the use of multimodal tabletop displays for non-hearing populations.

DESIGN PROCESS
We employed a participatory design process [20], involving members of the deaf community and domain experts in all aspects of the design and evaluation of SSI. When designing for special needs populations, it is critical to gain community support and involve domain experts and community members from the beginning [6][18]. As we prototyped SSI, we concurrently met with deaf individuals, linguists studying deaf culture and sign language, and medical professionals who communicate regularly with deaf individuals. In fact, the idea for SSI came from a medical doctor. We began by conducting interviews to investigate current technologies for communication and problems with existing technologies.

Communication Challenges
As previously mentioned, sign language interpreters are an important link enabling communication between deaf and hearing individuals. The physical presence of an interpreter, however, may inhibit a deaf patient from candidly discussing private medical issues. Furthermore, other problematic aspects of communicating through interpreters include accuracy of interpretation and challenges for the deaf individual when attending to multiple channels of visual information.

Karen¹ is a Professor of Communication who studies deaf culture and sign language. She was born deaf to two deaf parents but was raised in the aural tradition by attending hearing schools. Karen has adapted exceptionally well to the hearing world by reading lips and developing her vocal abilities. She does not need an interpreter for most situations, but in an interview mentioned her deaf husband's privacy concerns: "My husband for example, lip reads his doctor… he doesn't want an interpreter. If he needs an interpreter he wants me to come." Karen describes another situation and reiterates the need for a more private and independent communication medium:

"I know one situation where a therapist needed to see a deaf person on an emergency situation and [the patient] wasn't comfortable going through an interpreter. So they had two TTYs. They actually took the TTYs and were typing back and forth. But the problem is it only had one line at a time. It was hard to remember what the issue was if you kept losing it."

When an interpreter is provided by a medical facility, the deaf individual usually does not have a choice about who will interpret. The deaf community is well-connected in our city, and one deaf woman said that she would not feel comfortable going through an interpreter that she knew well or going through a male interpreter. She emphasized the need for a better option when a deaf patient is not comfortable with the interpreter.

Beyond issues of privacy and autonomy, ensuring accuracy of communication with an interpreter is critical. It is often the case that interpreters are not experts on the content they are interpreting. For example, one deaf individual described having an interpreter in Biology class who knew little about Biology. This made understanding the interpretation extremely difficult and impacted her ability to learn in the classroom. When it comes to health care issues, receiving accurate and full interpretation is essential. Dr. Stevens is a physician and professor at our university hospital. She established a program to teach medical students about deaf culture and how to create a deaf-friendly medical environment. Dr. Stevens comments on the challenges of relaying information through an interpreter:

"We're assuming that interpreters have a lot of medical information, and they may not. They may be miscommunicating. I always think about the doctor who told the patient, take three of these a day, and the interpreter didn't know to explain take one morning, lunch, and dinner."

Facilitating conversation with an interpreter also creates challenges for deaf individuals to share their attention between the conversation and note-taking. Dr. Stevens explains:

"With an interpreter… the patient has no ability to make a record of what's being said because the eyes are on the interpreter and they can't be on your paper, on the interpreter, on the doctor getting your emotions. What's your body language saying? You're showing me my knee but I'm looking at your face to see how bad this really is… So most of us go home with notes written down, but if you're deaf, it's hard to do both."

The goal behind our design for SSI is to explore an alternative to human interpreters as well as to augment conversation that occurs through an interpreter.

¹ All names were changed to preserve anonymity.
Technical Implementation
We prototyped SSI on a MERL DiamondTouch table [4] using the DiamondSpin toolkit [23]. The DiamondTouch table is a multiuser, multitouch, top-projected tabletop display. Users sit on conductive pads that enable the system to uniquely identify each user and where each user is touching the surface. Our system enables conversational input through standard keyboard entry and a headset microphone. The audio captured from the microphone is fed into a speech recognition engine (currently the default recognizer, but our application is easily adapted to any off-the-shelf recognizer). SSI uses the Java Speech API [8] and CloudGarden software [3] to interface with the speech recognition engine and send converted speech-to-text into the main application running on the DiamondTouch table.

Because understanding and analyzing conversation becomes increasingly complex as the number of speakers increases, we decided to implement the first version of SSI for two users only, one hearing and one hearing-impaired user. We wanted to investigate whether tabletop displays are an effective communication tool for this basic case of two users prior to expanding to larger group work scenarios. The discussion section at the end of this paper provides ideas for expanding and adapting the SSI technology for more than two users.
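To make the pipeline concrete, the sketch below shows how dictated speech can flow from the recognizer into the tabletop application. It is a minimal illustration rather than the SSI source: the javax.speech calls follow the Java Speech API [8] that CloudGarden implements, while the handoff into the tabletop interface (onPhraseRecognized) is a hypothetical stand-in for the DiamondSpin-based application.

```java
import javax.speech.Central;
import javax.speech.recognition.*;

// Minimal sketch of the dictation pipeline described above: audio from
// the doctor's headset microphone is fed to whatever JSAPI-compliant
// recognizer is installed, and each finalized phrase is handed off to
// the tabletop application.
public class DictationPipeline extends ResultAdapter {

    public static void main(String[] args) throws Exception {
        // Passing null asks the JSAPI layer (here, CloudGarden) for a
        // default recognizer rather than naming a specific engine.
        Recognizer recognizer = Central.createRecognizer(null);
        recognizer.allocate();
        recognizer.addResultListener(new DictationPipeline());

        // Dictation (free-form speech) rather than a command grammar.
        DictationGrammar dictation = recognizer.getDictationGrammar(null);
        dictation.setEnabled(true);
        recognizer.commitChanges();
        recognizer.requestFocus();
        recognizer.resume();  // microphone stays open; no on/off button
    }

    // Called when the engine finalizes a phrase. SSI would present the
    // top hypotheses as the three "best guess" buttons (see Figure 1).
    public void resultAccepted(ResultEvent e) {
        Result result = (Result) e.getSource();
        StringBuilder phrase = new StringBuilder();
        for (ResultToken token : result.getBestTokens()) {
            phrase.append(token.getSpokenText()).append(' ');
        }
        // A FinalDictationResult also exposes alternative token sequences
        // (getAlternativeTokens), from which the second and third guesses
        // can be drawn.
        onPhraseRecognized(phrase.toString().trim());
    }

    // Hypothetical handoff into the DiamondSpin-based interface.
    private void onPhraseRecognized(String phrase) {
        System.out.println("Best guess: " + phrase);
    }
}
```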
Interface Design
We chose the form factor of a tabletop display because it allows the doctor and patient to maintain face-to-face communication. For the deaf patient, reading and making facial expressions is an important communication channel. For the doctor, eye contact with the patient reveals the patient's understanding and emotional state. A wall-mounted display or even a traditional computer monitor would not enable face-to-face interaction in the same way; however, smaller horizontal devices such as a shared tablet computer may also be worth exploring.

We explored multiple tabletop interface designs before proceeding with the design tested in the evaluation section. From conversation research, we know that orientation and position of speakers is an important aspect of communication. Certain positions afford easier eye contact and information sharing, especially when speakers are positioned around a table. Sitting across from someone gives direct access to facial expressions but makes sharing textual information problematic, although some solutions for reorienting and rotating text have been explored [19]. While sitting side-by-side makes sharing textual information much easier, the critical activities of making eye contact and reading lips become challenging. We chose to seat users at an angled position (see Figures 4 and 6) to facilitate eye contact, lip reading, and information sharing.

As each person contributes to the conversation, either by speaking into a microphone or typing on the keyboard, their speech appears on the tabletop display in front of them. We refer to these fragments of conversation as "speech bubbles." Speech bubbles are color-coded by user and moveable around the display. In ASL, speech has a visual and spatial element, and signers often point back to where a previous gesture occurred. The moveable nature of speech bubbles in our design is an analog for the spatial nature of ASL signs. Seating users at angled positions makes uniformly orienting speech bubbles towards the bottom of the display a natural choice. We deferred the decision of the specific locations where speech should appear on the display until we had feedback from our initial prototype. In the first version (see Figure 1) we presented speech bubbles close to the user who entered the speech but not in an ordered fashion. We wanted to examine how people would organize speech bubbles on the display and determine whether preserving turn-taking is important.

Figure 1: (left) First design: users seated at corners, color-coded speech bubbles appear near each user. (right) The speech recognition engine gives three "best guesses" of the spoken word "one"; the user touches the box that matches their intention, and the speech bubble then appears on the display.

With our design, both users could have easily entered conversation through keyboards. We hypothesized, however, that it was important for the physician to speak naturally to the patient. The patient could then attend to the physician's body language and read their lips. The doctor's facial expressions and body language are masked when speech is entered through a keyboard. As noted by Dr. Stevens, reading the physician's body language is an essential part of communication in medical conversations.

While speech recognition engines are constantly improving, transcribing natural language into text is still problematic. We wanted the speaking user to be able to control the speech that the system displays, so SSI provides the speaking user with three "best guesses" of their speech from the recognition engine (see Figure 1). The user touches the phrase that matches their intended speech and the phrase appears on the interface as a speech bubble. While this requires an extra step for the speaking user, it allows greater fidelity in displayed speech and prevents unintended conversation from appearing on the display. Users can delete previous parts of conversation by dragging speech bubbles to the trashcans in the top corners of the display. In general, we wanted to allow users to correct miscommunications and remove unwanted speech.

Prototype Review
As part of our participatory design process, eight people reviewed our design, four of whom are deaf and the other four actively involved in the deaf community. An interpreter was present to facilitate communication between the deaf participants and the hearing participants and researchers. Our questions at this stage of design addressed user position at the table, text size and color, speech bubble behavior, and the use of keyboard entry. This preliminary feedback revealed several key issues about the design of SSI. We discuss these and explain how they influenced subsequent designs and evaluation.

Overall User Interface Design
We found that positioning users at an angled position around the table works well. One deaf person said, "It feels too much like teacher and student when people sit across from each other." Maintaining eye contact and reading shared speech bubbles was not a problem with the current configuration. Reviewers said that the text size and colors worked well, but because the speech bubble text is fairly large, the display cluttered quickly. Displaying medical images in the background was also received positively, especially for stepping through slides of an MRI or indicating fracture points in an x-ray image (see Figure 2).

Organizing Conversation, Managing Clutter
In our first design, the interface became cluttered quickly, even with a trashcan to delete unwanted conversation. Reviewers said the trashcan was useful for correcting misunderstandings and removing unwanted speech bubbles (especially large speech bubbles). However, there was still a need to organize conversation, and several people suggested being able to create a new page.

We considered zooming and scrolling interfaces to increase screen real estate, but instead we created a simple tabbed design that leveraged users' knowledge of web browsers (see Figure 2). One deaf man said, "The tabs are a great way to know what you're going to work on and how you're going to move forward. You can go back and refer to something without having to search for it…the tabs are a nice way to do that." In response to his comment, a deaf woman said, "exactly, I think it's good because it's not overwhelming…it's very deaf friendly. It's very visual." Overall, reviewers liked the tabbed design and found it easy to understand, so we proceeded with this design.

In our first design, we also presented speech bubbles at an arbitrary location in front of the user who contributed the speech. Reviewers said that it was difficult to anticipate where speech would appear and that it is important to preserve turn-taking. We modified our design to preserve turn-taking by presenting speech bubbles in a linear pattern, offset and color-coded to indicate speaker (see Figure 2).

Figure 2: Navigation tabs at top; speech bubbles are colored and offset by speaker and appear in order; conversation visuals are displayed behind all speech bubbles.

Pre- and Post-Discussion Use
Dr. Stevens explained that doctors often know several key discussion points before they enter a meeting with a patient. She said that it would be extremely helpful to have the nurse enter talking points to structure and pace the conversation. These could be available on the doctor's side of the interface. Dr. Stevens also mentioned that it would be good to add labels to the tabs to show the topics for discussion (e.g., "welcome" and "update since last visit"). Reviewers unanimously wanted the dialogue to persist after the appointment. They mentioned saving the conversation for the next visit, printing it to take home and share with family, and receiving a copy of it via email.

Diversity within the Deaf Community
One critical finding that came out of our prototype review is that there is great diversity within the deaf community and that our system would work better for certain subpopulations. Members of the deaf community thought that SSI would work well for deaf individuals who feel comfortable using English and for individuals who are Hard of Hearing. They said SSI would be problematic for deaf individuals with low English literacy or low confidence in English communication.

Design Modifications
Based on feedback received in the preliminary review, we kept the text size and font the same, added a feature to display speech in an ordered fashion that preserves turn-taking, and proceeded with the tabbed interface design. We made a slight modification to the look of the speech bubbles by adding a tail on each bubble that indicates the user who added it. After testing speech recognition with multiple people, including medical professionals, we added a button to bring up a virtual keyboard so the doctor could type certain words that are difficult for the voice recognition system to recognize.
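The ordered, turn-preserving presentation can be summarized in a few lines of code. The sketch below is our simplified model, not the DiamondSpin-based implementation: the class names and layout constants are hypothetical, but the logic mirrors the design described above, with bubbles appended in arrival order, offset and colored by speaker, moveable after placement, and removable via the trashcans.

```java
import java.awt.Color;
import java.util.ArrayList;
import java.util.List;

// Simplified model of SSI's turn-preserving speech bubble layout
// (hypothetical names and constants). Bubbles are appended in arrival
// order, color-coded by owner, and offset horizontally so that speaker
// changes are easy to scan.
class SpeechBubble {
    final int ownerId;      // DiamondTouch identifies which user touched/typed
    final String text;      // the selected "best guess" or typed phrase
    double x, y;            // bubbles remain moveable after initial placement

    SpeechBubble(int ownerId, String text) {
        this.ownerId = ownerId;
        this.text = text;
    }
}

class ConversationTab {
    private static final double TOP_MARGIN = 60;    // below the tab row
    private static final double LINE_HEIGHT = 48;   // fairly large, readable text
    private static final double SPEAKER_OFFSET = 140;

    private final List<SpeechBubble> bubbles = new ArrayList<>();
    private final Color[] ownerColors = { Color.ORANGE, Color.CYAN };

    // Append a phrase in turn order: each bubble goes one "line" lower,
    // offset toward its owner's side of the display.
    SpeechBubble addPhrase(int ownerId, String text) {
        SpeechBubble b = new SpeechBubble(ownerId, text);
        b.y = TOP_MARGIN + bubbles.size() * LINE_HEIGHT;
        b.x = (ownerId == 0) ? SPEAKER_OFFSET : 2 * SPEAKER_OFFSET;
        bubbles.add(b);
        return b;
    }

    Color colorFor(SpeechBubble b) {
        return ownerColors[b.ownerId % ownerColors.length];
    }

    // Conversational repair: dragging a bubble onto a trashcan removes it.
    void delete(SpeechBubble b) {
        bubbles.remove(b);
    }
}
```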

EVALUATION
Our evaluation examines the role of tabletop displays in facilitating medical conversations between a deaf patient and a hearing, non-signing doctor. We present a user study that compares communication through the digital table, an interpreter, and both the digital table and an interpreter.

Methodology
We conducted a laboratory experiment with eight deaf participants (mean age=33, stdev=11.4, range=[22,52]; 3 males) and one medical doctor (age=28, female). All eight deaf participants were born deaf or became deaf before the age of one. Three participants identified English as their native language and five identified ASL. All participants were fluent in ASL and proficient at reading and writing in English. Each deaf participant conversed with a medical doctor and a professionally trained ASL interpreter about a sample medical issue. We chose to have each deaf participant work with the same doctor. This resembles the real-world scenario where one doctor has similar conversations with multiple patients throughout the day. None of the participants had used a tabletop display prior to participating in this evaluation. Our evaluation was conducted in our university laboratory around a DiamondTouch table. Each session was videotaped by two cameras from different angles to capture participants' interactions with each other and the digital table. The computer unobtrusively recorded all user interactions with the tabletop surface. Three researchers were present for the testing sessions and took notes.

Procedure
The doctor trained the speech recognition engine prior to the first testing session. At the beginning of each testing session, the deaf participant and doctor performed a brief period of training together to get to know each other and adjust to the task. Then the patient and doctor discussed a medical issue using SSI (Digital Table), a human interpreter (Interpreter), and both SSI and the interpreter (Mixed). At the beginning of each conversation, the deaf participant received a discussion prompt in English text that described a medical topic (e.g., nutrition). Each discussion prompt had a corresponding medical visual. For consistency, the experimenter manually preloaded medical visuals into the system under the third tab. The other two tabs were blank conversation space. A paper version of the visual was provided for the Interpreter condition. Medical professionals worked with us to ensure that the discussion prompts reflected authentic conversations that might occur in a normal patient interaction but whose content did not require participants to discuss information that might be too personal. Both the order of conditions and discussion prompts were randomized between subjects. An experimenter ended each conversation after 8 minutes (based on the length of actual medical consultations). After completing the three conditions, the deaf participant completed a survey about their experience.

Analysis
After the experiment, two researchers reviewed videos of the conversations (24 total, approximately 200 minutes of talking) and examined interaction based on transcription techniques described by Jefferson [9] and McNeill [12]. The video data revealed extremely rich and complex multimodal interactions between the patient, doctor, interpreter, and digital table. These interactions merit a separate analysis and are topics for future work. In this paper, we characterize high-level themes relevant to the design of SSI and summarize survey results.

Findings
Overall, participants indicated that digital tables are a promising medium for facilitating medical conversations. We observed a rich use of gesture to augment communication in the Digital Table condition. Survey data indicated that our application was good for private conversations and enabled independence. However, the interaction overall was limited by the technology, and certain aspects of communication were lost in practice. Specifically, imperfections in the speech recognition engine made conversation in the Digital Table condition considerably slower than in the Interpreter condition.
Conversation and Gesture Analysis
There were several key differences in communication between the Digital Table and Interpreter conditions. The Digital Table condition allowed for asynchrony in communication, whereas the interpreter acted as a broker of conversation and thus encouraged synchronous interactions. Dialogue in the Interpreter condition was verbose and elaborated, while speech in the Digital Table condition was more concise and typographic in nature. We observed equitable participation levels in the two conditions: the doctor and patient each contributed about half of the conversation.

Slower conversation in the Digital Table condition was likely due to problems with the speech recognition process. The system took one second to determine and display the speech in textual form. Then the interface required the doctor to tap on the phrase she wanted to add. The best speech recognition result occurred when the doctor broke her natural speech into short phrases, but this also slowed communication. On some occasions the doctor made two or three attempts before the system accurately recognized her speech. While speech recognition was problematic, it provided ancillary benefits such as allowing the doctor to gesture while speaking and the patient to read the doctor's lips and facial expressions.

Gestural and Nonverbal Communication. In the Digital Table condition, we observed that non-verbal and gestural communication played an important role in augmenting and ensuring successful communication. Importantly, the co-located, face-to-face nature of the digital table allowed participants to provide feedback to their partner about their state of understanding through deictic gesture (e.g., patient pointing), gaze sharing, and head nodding. Participants strategically moved speech bubbles in front of the other user to get their attention and pointed between bubbles to make a connection to previous speech [2]. We also noticed a pattern in which one participant moved or pointed to an object on the interface and then one or both participants nodded to confirm their understanding. This pattern occurred frequently in the Digital Table condition. Figure 3 illustrates an example where participants point together. Video of this instance reveals that the gesture is accompanied by gaze sharing and head nodding.

Figure 3: Digital Table condition, doctor circles region on the map and asks "will you be here," patient responds by touching the "here" speech bubble and then Brazil.

Observations also revealed the use of iconic gestures to augment speech; in one case the doctor pantomimes hand washing (see Figure 4):

Dr: wash your hands often with purell…
P: is that a specific brand soap
Dr: its just the alcohol based that can prevent a lot of illnesses

Figure 4: Digital Table condition, doctor and patient share gaze while doctor pantomimes hand washing. Arrows show eye gaze and text emphasis shows gesture timing.

The doctor, more so than the patient, used iconic or illustrative gestures to support her speech, as in this excerpt:

Dr: cooked pasta should only be the size of your fist not the big bowls that are served at restaurants

The role of gesture is especially relevant to cooperative work because it indicates that participants were iteratively refining and confirming their conversational understanding and engaging in highly coordinated activity through verbal and nonverbal channels. Our design enabled this interaction through the face-to-face arrangement, horizontal form factor, moveable speech bubbles, and voice recognition, which freed the doctor's hands and thereby enabled co-occurring gesture and speech.

Affordances of Digital Space. The digital table transformed the ephemeral nature of speech into a tangible and persistent form, thus creating affordances that are not available in traditional conversation. We observed interesting behaviors with the speech bubbles because of their form. When a phrase was added to the display that referred to a previous utterance, the "owner" of the speech bubble often moved the new phrase close to the previous utterance. In conversation, the speaker must help listeners understand a reference to a previous utterance through context and explicit referencing. The digital table allowed users to reference previous conversation by placing new speech near an existing speech bubble. Similarly, we observed the doctor and patient using the tail of the speech bubble as a pointing mechanism. That is, participants strategically placed speech bubbles around the display so that the tail of the speech bubble pointed to part of a background visual (see Figure 5).

Figure 5: Screenshot from Digital Table condition, doctor and patient use speech bubble tails as a pointing mechanism for a conversation about gluten. Gluten-related speech bubbles are placed around the grains section of the pyramid.

The persistent nature of speech with the digital table allowed participants to review their conversation. We observed both the doctor and patients looking back over their previous conversation. The doctor said "it was good to look back at what I had covered with that particular patient," and explained that "it would be helpful because it is not uncommon in medicine to have very similar conversations with different patients throughout the day." As Figure 5 illustrates, the speech bubbles occluded other objects on the display. Addressing issues of layering and transparency is an area for future work on SSI.

Presence of an Interpreter. We noticed differences in how patients attended to the doctor when the interpreter was present. Deaf participants looked at the doctor when they signed but then shifted their gaze to the interpreter when the doctor began speaking. In the Digital Table condition, participants typically looked at the doctor when she was speaking and then looked down at the display. In the Mixed condition, we observed a pattern in which the patient watched the interpreter sign and then looked down at the display to read the English version. Several participants explained that seeing the doctor's speech in both ASL and English was helpful. The interpreter also found benefit in having the digital table present:

"What was nice for me as the interpreter was to have the printed word. When I didn't know how to spell something, especially in medical situations, if the printed word is there…I can point to it…so that about the communication board is very attractive."

The patient faced visual attention challenges when both the interpreter and doctor were gesturing at the same time. Figure 6 shows an example of this interaction.

Figure 6: Interpreter condition, patient watches simultaneous gestures by doctor and interpreter.

Communication Preferences
After experiencing the above conditions, each deaf participant completed a communication preferences survey. Figure 7 summarizes survey responses.

Figure 7: Survey of communication preferences (raw scores, n=8). For each item (best for private medical conversations; best for routine medical conversations; best for emergency medical situations; made me feel the most independent; best for remembering information; allowed me to express myself best; allowed me to understand the doctor best), participants chose among Digital Table, Digital Table + Interpreter, Interpreter, or Undecided.
Privacy. Six of eight deaf participants reported that the digital table alone was best for private medical conversations. Cathy said, "the digital table is best for very private conversations, but using an interpreter in a private conversation depends on whether or not I know the interpreter." Sharon explained, "for other meetings like a work situation or job interview, I would prefer to have an interpreter. But for personal meetings, like with a lawyer, doctor, or specialist, I prefer the digital table." Jesse also said the digital table is good "if a client feels they can't confide with an interpreter present."

Independence. Six of eight participants reported that the digital table alone made them feel the most independent. Amber explained, "I don't have to wait for an interpreter. It saves time." Cathy, on the other hand, said, "it's true, I did not need to rely on an interpreter, but sometimes independence isn't exactly what I want—I value smoother conversation." There is a tradeoff involved with using the digital table: conversation may be slower, but the patient has autonomy and privacy.

Remembering information. Although we do not have data to judge this, seven participants indicated that using the digital table would be helpful for remembering information. Participants said "it provides a record that I could go back and look at" and "it's all documented."

Understanding the doctor. Half of the participants stated that including the digital table helped them understand the doctor best. Jesse explained, "the table could be used to clarify words that the interpreter may not understand or comprehend." Similarly, Mark said, "the table showed the exact words the doctor used." Alex said her preference "depends whether the interpreter is fluent and sharp. The table is better if the interpreter is bad. In that case I would prefer to type for myself."

Speed of conversation. In follow-up discussion, several participants said they preferred the interpreter when speed of conversation was critical. Sharon said "the table will work only when both parties are patient." Amber explained, "in an emergency I won't have time or energy to type." There are cost-benefit tradeoffs between communicating through a quicker channel (the interpreter) versus a private and independent channel (the digital table).

DISCUSSION
Based on our experience with SSI, we discuss cultural factors, design lessons regarding dictated speech on tabletop displays, and plans for extending this work.

Cultural Factors
ASL is the native and preferred language for many members of the deaf community. There are cultural implications involved in designing an English-based technology for the deaf. For general conversations, several participants preferred communication in ASL with an interpreter. Alex says that communicating through the interpreter allowed her to express herself best: "My identity is Deaf. I prefer interaction in ASL." The tradeoff happens when a conversation is extremely personal or when one wants independence from an interpreter. In this case, our deaf participants indicated that communicating through the digital table promised sufficient increased benefit for them to use English instead of ASL.

Conducting research with a deaf population presents specific challenges. For non-signing researchers, an interpreter must be onsite to help facilitate interviews and usability studies. We found that traditional means of recruiting such as online postings and email were inadequate. Involving community members and domain experts early on in our process led to a partnership with our city's deaf community services center. One of their members created a video blog posting in ASL that advertised our study. This was highly effective and illustrates the importance of reaching a population of interest through their preferred language and communication medium.

Speech Recognition and Digital Tables
Speech recognition is a promising technology to support natural forms of interaction around digital tables. Compared to previous work on spoken commands [25][26], dictated speech presents new interaction challenges for both deaf and hearing populations. We identify design principles that increased interface usability and speech recognition accuracy (a sketch illustrating several of them follows the list):

(1) The system should limit the impact of ambient noise, and the speaking user should not have to turn the microphone on and off. An on/off button adds an extra layer of unnecessary complexity and work for the user.

(2) The speaking user should have control over the speech that is added to the display. Our system presents the user with three best guesses from the recognizer. This gives control to the speaker, allowing them to select only accurately detected words and phrases. While this design requires an additional step, it greatly improves fidelity.

(3) The interface should enable conversational repair. Mistakes in recognition and conversation will happen, so it is important to provide users with a mechanism for repairing their speech. SSI enables this through trashcans, intentionally placed in the corners of the display to increase situation awareness by others.

(4) The interface should provide an auxiliary way to enter speech. There will always be new words or phrases that stump the recognizer. In our design, we provided a virtual keyboard for the speaking user and found that this alternative worked well.

(5) Application designers should take into account the tolerance of their user population and the pace of conversation in the domain of interest. The doctor in our study said that speech input was useful, but that some doctors would not have the time or patience for voice recognition software and might prefer a second physical keyboard.
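The sketch below illustrates principles (2) and (4) (repair via trashcan deletion, principle (3), appeared in the earlier layout sketch). It is our illustration under hypothetical names, not the SSI source: the recognizer's top hypotheses become tappable choices, nothing reaches the shared display without the speaker's confirmation, and a virtual keyboard serves as the fallback.

```java
import java.util.List;

// Hypothetical confirmation flow illustrating design principles (2) and (4):
// dictated speech is gated by the speaker before it reaches the shared
// display, with a typed fallback for phrases the recognizer cannot handle.
class GuessConfirmationFlow {
    interface SharedDisplay {
        void showGuessButtons(List<String> guesses);  // three tappable options
        void addSpeechBubble(int ownerId, String text);
        void openVirtualKeyboard(int ownerId);        // auxiliary entry (4)
    }

    private final SharedDisplay display;
    private final int speakerId;

    GuessConfirmationFlow(SharedDisplay display, int speakerId) {
        this.display = display;
        this.speakerId = speakerId;
    }

    // Principle (2): nothing appears until the speaker picks a guess.
    void onPhraseRecognized(List<String> nBestGuesses) {
        int n = Math.min(3, nBestGuesses.size());
        display.showGuessButtons(nBestGuesses.subList(0, n));
    }

    void onGuessTapped(String guess) {
        display.addSpeechBubble(speakerId, guess);
    }

    // Principle (4): fall back to typing when every guess is wrong.
    void onAllGuessesRejected() {
        display.openVirtualKeyboard(speakerId);
    }
}
```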
Extensions of this Work
SSI currently works for only two users interacting with the table. We anticipate that other scenarios would involve additional users and are exploring designs that allow various user configurations and input modalities. Using a more capable speech recognizer is also an obvious next step (e.g., Dragon NaturallySpeaking Medical edition [5]). With improvements to the system, SSI could be useful for a variety of group work tasks of interest to the deaf community. Deaf participants mentioned wanting this application for counseling sessions, financial services, design projects, classrooms, and even in retail stores.

While SSI focuses on one subset of the deaf community, we believe it would also benefit people who are Hard of Hearing or have late-onset deafness. Interpreters are not used as widely with these populations, so the assistance of a digital table may be even more desirable. We are also considering ways of integrating ASL video interpretations into the interface so that users who are less proficient in English may watch a video interpretation of the speech.

This work also has implications for hearing populations. Medical conversations are challenging for a number of reasons, including forgetting questions and instructions. The affordances of SSI such as preloading questions and referencing past conversation stand to benefit hearing populations as well.

CONCLUSION
This paper presented the design and analysis of SSI. Our work demonstrates that tabletop displays can reduce communication barriers for deaf users in a way that maintains face-to-face interaction and enables privacy and independence. Research on SSI contributes to the growing interest in multimodal multitouch collaborative systems and complements previous work in the field. Finally, this work promises to extend to cooperative computing scenarios for hearing populations to enhance communication through multimodal input.

ACKNOWLEDGMENTS
Research is supported by an NSF Graduate Research Fellowship, NSF Grant 0729013, and a Chancellor's Interdisciplinary Grant. We thank our study participants, faculty and staff from UCSD Medical School, and MERL for donating a DiamondTouch table.

REFERENCES
[1] Cavender, A., Ladner, R., and Riskin, E. MobileASL: intelligibility of sign language video as constrained by mobile phone technology. Proceedings of ASSETS 2006, 71-78.
[2] Clark, H. (2003). Pointing and Placing. In "Pointing: Where Language, Culture, and Cognition Meet." Edited by Sotaro Kita. Mahwah, NJ: Lawrence Erlbaum Associates, pp. 243-268.
[3] CloudGarden. 2008. http://www.cloudgarden.com/
[4] Dietz, P. and Leigh, D. DiamondTouch: A Multi-User Touch Technology. Proceedings of UIST 2001, 219-226.
[5] Dragon NaturallySpeaking Medical. 2008. www.nuance.com/dictaphone/emr
[6] Fischer, G. and Sullivan, J. Human-centered public transportation systems for persons with cognitive disabilities – Challenges and insights for participatory design. Proceedings of the Participatory Design Conference 2002, 194-198.
[7] iCommunicator. 2008. www.myicommunicator.com
[8] Java Speech API. 2008. java.sun.com/products/java-media/speech
[9] Jefferson, G. (2004). Glossary of Transcript Symbols with an Introduction. In "Conversation Analysis: Studies from the First Generation." John Benjamins Publishing Company, pp. 13-32.
[10] Matthews, T., Carter, S., Pai, C., Fong, J., and Mankoff, J. Scribe4Me: Evaluating a mobile sound transcription tool for the deaf. Proceedings of UbiComp 2006, 159-176.
[11] Matthews, T., Fong, J., and Mankoff, J. Visualizing non-speech sounds for the deaf. Proceedings of ASSETS 2005, 52-59.
[12] McNeill, D. (1992). Hand & Mind: What Gestures Reveal about Thought. Chicago: University of Chicago Press.
[13] Miller, D., Gyllstrom, K., Stotts, D., and Culp, J. Semi-transparent video interfaces to assist deaf persons in meetings. Proceedings of the ACM Southeast Regional Conference 2007, 501-506.
[14] Moffatt, K., McGrenere, J., Purves, B., and Klawe, M. The participatory design of a sound and image enhanced daily planner for people with aphasia. Proceedings of CHI 2004, 407-414.
[15] Morris, M.R., Morris, D., and Winograd, T. Individual Audio Channels with Single Display Groupware: Effects on Communication and Task Strategy. Proceedings of CSCW 2004, 242-251.
[16] Morris, M.R., Ryall, K., Shen, C., Forlines, C., and Vernier, F. Beyond "Social Protocols": Multi-User Coordination Policies for Co-located Groupware. Proceedings of CSCW 2004, 262-265.
[17] NIDCD. 2008. www.nidcd.nih.gov/health/hearing
[18] Piper, A.M., O'Brien, E., Morris, M.R., and Winograd, T. SIDES: A Cooperative Tabletop Computer Game for Social Skills Development. Proceedings of CSCW 2006, 1-10.
[19] Ringel, M., Ryall, K., Shen, C., Forlines, C., and Vernier, F. Release, Relocate, Reorient, Resize: Fluid Techniques for Document Sharing on Multi-User Interactive Tables. Proceedings of CHI 2004, 1441-1444.
[20] Schuler, D., and Namioka, A. (Eds.) (1993). Participatory Design: Principles and Practices. Hillsdale, NJ: Lawrence Erlbaum Associates.
[21] Schull, J. An extensible, scalable browser-based architecture for synchronous and asynchronous communication and collaboration systems for deaf and hearing individuals. Proceedings of ASSETS 2006, 285-286.
[22] Scott, S.D., Carpendale, M.S.T., and Inkpen, K. Territoriality in Collaborative Tabletop Workspaces. Proceedings of CSCW 2004, 294-303.
[23] Shen, C., Vernier, F., Forlines, C., and Ringel, M. DiamondSpin: An Extensible Toolkit for Around-the-Table Interaction. Proceedings of CHI 2004, 167-174.
[24] Starner, T., Weaver, J., and Pentland, A. A Wearable Computing Based American Sign Language Recognizer. Personal and Ubiquitous Computing, Volume 1(4), 1997.
[25] Tse, E., Greenberg, S., Shen, C., and Forlines, C. Multimodal Multiplayer Tabletop Gaming. Proceedings of PerGames 2006, 139-148.
[26] Tse, E., Shen, C., Greenberg, S., and Forlines, C. Enabling Interaction with Single User Applications through Speech and Gestures on a Multi-User Tabletop. Proceedings of AVI 2006, 336-343.
[27] Turoff, M. Computerized Conferencing for the Deaf and Handicapped. ACM SIGCAPH Newsletter, Number 16, 1975, pp. 4-11.