Examining Visual Semantic Understanding in Blind and Low-Vision Technology Users
Venkatesh Potluri, Tadashi E Grindeland, Jon E. Froehlich, Jennifer Mankoff
<vpotluri,tadg99,jonf,jmankof>@cs.uw.edu
Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA

ABSTRACT
Visual semantics provide spatial information like size, shape, and position, which are necessary to understand and efficiently use interfaces and documents. Yet little is known about whether blind and low-vision (BLV) technology users want to interact with visual affordances, and, if so, for which task scenarios. In this work, through semi-structured and task-based interviews, we explore preferences, interest levels, and use of visual semantics among BLV technology users across two device platforms (smartphones and laptops), and information seeking and interactions common in apps and web browsing. Findings show that participants could benefit from access to visual semantics for collaboration, navigation, and design. To learn this information, our participants used trial and error, sighted assistance, and features in existing screen reading technology like touch exploration. Finally, we found that missing information and inconsistent screen reader representations of user interfaces hinder learning. We discuss potential applications and future work to equip BLV users with necessary information to engage with visual semantics.

CCS CONCEPTS
• Human-centered computing → Accessibility theory, concepts and paradigms; Empirical studies in interaction design.

KEYWORDS
Accessibility, visual design, blind and low-vision creators

ACM Reference Format:
Venkatesh Potluri, Tadashi E Grindeland, Jon E. Froehlich, Jennifer Mankoff. 2021. Examining Visual Semantic Understanding in Blind and Low-Vision Technology Users. In CHI Conference on Human Factors in Computing Systems (CHI ’21), May 8–13, 2021, Yokohama, Japan. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3411764.3445040

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CHI ’21, May 8–13, 2021, Yokohama, Japan
© 2021 Association for Computing Machinery. ACM ISBN 978-1-4503-8096-6/21/05...$15.00
https://doi.org/10.1145/3411764.3445040

Figure 1: The image above shows a participant using Wikki Stix (green) and Play-Doh (orange) to reconstruct the smartphone web Google homepage.

1 INTRODUCTION
User interfaces (UIs) intermix interactive widgets, text, and blank space, all carefully designed and positioned to convey meaning and afford use. For screen reader users, the size, shape, spatial layout, and perceptual affordance of a UI fundamentally differ from what sighted users experience. Though powerful, screen readers linearize access to interactive widgets and content, making it difficult for blind and low vision (BLV) users to interact with the visual semantics of UIs.

In this paper, we explore whether access to these visual semantics is important to the BLV community and, if so, why. Though a rich literature examines how to enhance BLV access to and understanding of spatial information, it focuses on specialized interfaces or tasks, such as maps and navigation [2], access to diagrams and graphs [19, 32, 42], and touchscreens [36, 38]. Our focus is complementary: investigating how BLV users identify and interact with spatial characteristics of modern UIs, such as the size, shape, and relative and absolute position of widgets and content. Because screen readers often function differently across device platforms–from conventional computers like desktops and laptops (linear access) to touchscreens (direct + linear access), and across native applications vs. websites–we examine how these differences change BLV users’ perception and understanding of UIs.

Our research builds on recent work by Li et al. [44], who conducted a formative study of seven BLV users to explore how they engage in layout design (e.g., to make slides in PowerPoint and create websites with WordPress). They incorporated their findings into an accessible layout design tool that uses auto-generated tactile sheets overlaid on a tablet to enable the editing of HTML templates. Our focus is broader: we are interested not only in how BLV users engage in creative design tasks that require spatial understanding, but also in how they perceive and interact with spatial information in modern interfaces in their everyday use. We address two primary research questions:

• When, how, and why do BLV technology users want to interact with the visual semantics of UIs? How do they interpret visual semantics of UIs? Relatedly, do these users even want to learn about these semantics?
• What does prototyping an interface reveal about their understanding of the visual semantics of UIs? How do lo-fi prototypes created by BLV users, and the thought process of creating these prototypes (e.g., think-aloud comments), inform us about their understanding of visual semantics?

To address these questions, we conducted a three-part qualitative study with ten BLV participants. First, we asked participants about their knowledge and use of visual semantics in UIs. We then probed participants’ understanding when using smartphone apps with screen readers. Finally, we explored participants’ comprehension and perception of visual semantics in desktop and mobile web interfaces through interview questions and a “think aloud” lo-fi prototyping exercise. Here, participants reconstructed a UI of their choice using lo-fi craft materials and described the underlying visual semantics (Figure 1).

We collected and analyzed questionnaire data, observational notes, participant-generated artifacts, and study session audio and video recordings. Our study is the first to capture nuanced qualitative information about how BLV users understand visual semantics across device platforms and common usage scenarios, and it probes BLV users to express their understanding of UIs through prototyping.

Our research findings suggest that most BLV users want to understand the visual semantics of interfaces in mobile app contexts. Participants, while generally disinterested in the visual semantics of websites, still emphasized the importance of such semantics. Informed by these findings and the literature, we enumerate design recommendations for technology designers that would enable BLV technology users to better understand and engage more effectively with the visual semantics of web layouts. For example, we discuss integration of visual descriptions of interfaces into existing screen readers through a verbosity mode similar to settings currently available for punctuation. In summary, we

phone screen readers are particularly effective at supporting this exploration.

(4) We make recommendations for technology designers to support BLV access to visual semantic information.

Our findings will help screen reader developers improve the capabilities of both desktop/laptop and touchscreen screen readers. To support the more efficient learning of visual information, we must address gaps in current tools (such as missing information or screen reader deficiencies) and pay attention to design elements (such as reference points) that helped our BLV participants create lo-fi prototypes. Our findings can inform the way that screen readers describe interfaces and the interaction options that they support, as well as how we support tasks that require spatial information.

2 BACKGROUND AND RELATED WORK
To better contextualize visual semantics and non-visual computer access, we discuss the interaction paradigms that are standard in desktop and smartphone (touchscreen) interaction and then enumerate prior work on screen reader interactions, access to visual information, and the creation of accessible visual layouts. As the interaction paradigms supported by screen readers vary significantly with platform and application, we encourage unfamiliar readers to try them. Screen readers are freely available for Windows [49], OSX [5], and both Android [29] and iPhone [4] smartphones.

2.1 Standard Screen Reading Access to Desktops
Screen readers convert visual semantics and content into a linear, ephemeral stream of output (typically, audio or Braille). They can also convey hierarchical information (such as the interactor hierarchy of an interface), tabular information (such as the title and contents of a table), and semantics (such as non-speech audio representations of content type, e.g., beeps and tones, or spoken identification of a heading level). VoiceOver [7] conveys interactor hierarchy by presenting the UI as a list of interactable elements, while NVDA [53] does so through a separate review cursor that uses the underlying accessibility tree exposed by Windows APIs. Likewise, several approaches have been explored to convey notions like tables. Screen readers allow non-linear access to tables by providing navigation capabilities horizontally (along the row) or vertically (along the column). Furthermore, they provide announcements that help users identify reference points in a
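To make the linearization and table-navigation behaviors described above concrete, here is a minimal illustrative sketch. The node structure, role names, and announcement strings are hypothetical simplifications for exposition, not the API or output of any real screen reader:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical accessibility-tree node; real trees (e.g., the one exposed
# by Windows accessibility APIs) carry many more properties.
@dataclass
class Node:
    role: str                      # e.g., "heading", "button", "text"
    name: str                      # accessible name / text content
    level: int = 0                 # heading level, if any
    children: List["Node"] = field(default_factory=list)

def linearize(node: Node) -> List[str]:
    """Depth-first flattening of the interactor hierarchy into the ordered,
    one-dimensional stream of announcements a screen reader steps through."""
    out: List[str] = []
    if node.role == "heading":
        out.append(f"heading level {node.level}, {node.name}")
    elif node.role != "container":
        out.append(f"{node.role}, {node.name}")
    for child in node.children:
        out.extend(linearize(child))
    return out

ui = Node("container", "page", children=[
    Node("heading", "Search", level=1),
    Node("text", "Enter a query"),
    Node("button", "Go"),
])
print(linearize(ui))
# ['heading level 1, Search', 'text, Enter a query', 'button, Go']

# A table, by contrast, supports non-linear movement: from cell (r, c)
# the user can step along the row (c +/- 1) or along the column (r +/- 1).
table = [["Name", "Age"], ["Ada", "36"], ["Alan", "41"]]
r, c = 1, 0                        # cursor on "Ada"
right = table[r][c + 1]            # row navigation    -> "36"
down = table[r + 1][c]             # column navigation -> "Alan"
```

The sketch shows why spatial semantics are lost in the linear stream: the flattened list preserves reading order and hierarchy cues (heading levels), but discards the size, shape, and position information the paper is concerned with, while two-axis table navigation retains one limited form of spatial structure.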