Using Visual Analytics to Support Artificial Intelligence Development
© Copyright 2019 Nan-Chen Chen

Using Visual Analytics to Support Artificial Intelligence Development

Nan-Chen Chen

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

University of Washington
2019

Reading Committee:
Cecilia Aragon, Chair
Been Kim
Gary Hsieh
Amy Ko

Program Authorized to Offer Degree: Human Centered Design & Engineering

Abstract

Chair of the Supervisory Committee: Cecilia Aragon, Human Centered Design & Engineering

In the artificial intelligence (AI) era, complex AI is becoming more common in daily life. Various reports indicate that AI plays an increasingly influential role in human society (e.g., [Russell et al. 2015, Stone et al. 2016]). However, developing a sophisticated AI system is a challenging process. AI researchers must experiment with different AI system architectures and evaluate their performance on the target tasks. Because the components in a system interact with each other, it can be difficult to make sense of evaluations and find insights that improve the system. Traditional metrics for system performance (e.g., task accuracy) often fail to provide useful and actionable insights for AI developers. The burden falls on AI researchers to analyze evaluations, understand how changes impact system behavior, and then find ways to enhance the system.

Visualization is a powerful tool for presenting information in a way that is easier for humans to absorb than text [Little 2015]. Given that AI development involves a vast amount of information, visualization can facilitate insight discovery for AI developers. Among visualization approaches, visual analytics focuses specifically on using visualization to support analytical tasks, which makes it a powerful method for insight discovery. However, designing visual analytics tools is not a trivial task.
In particular, since AI development is highly specialized and requires a significant amount of experience, designing visual analytics tools to satisfy AI developers’ needs is challenging. Human-centered design is a powerful technique to address such needs. Therefore, in this thesis, we focus on using a human-centered design approach to study, design, and engineer visual analytics tools to support AI system development.

The thesis includes the design of two visualization tools: QSAnglyzer, a visual analytics tool for evaluation analysis; and AnchorViz, an interactive visualization for discovering errors in interactive machine learning classifiers. In addition, based on an interview study, we propose a framework to describe the current practice of AI development workflows, highlighting issues and suggesting design implications for designers and researchers.

By using a human-centered design approach, this thesis aims to contribute knowledge to the fields of human-computer interaction (HCI), visualization (VIS), and visual analytics (VA) about AI developers and AI development processes, as well as about how to design visual analytics for this domain. Our ultimate goal is to pave a road for creating better tools for AI developers, lowering the barriers to AI development, and making AI more accessible for a wider range of people to build and use.

TABLE OF CONTENTS

Chapter 1. Introduction
  1.1 Research Questions
  1.2 Thesis Statement
Chapter 2. Background
  2.1 Enabling Interactive Machine Learning
    2.1.1 More diverse settings for interactive classification
    2.1.2 Human-intuitive ways of input
    2.1.3 Fitting into end users’ natural workflows
  2.2 Supporting Machine Learning Model Development
    2.2.1 Visualization in iML GUIs
    2.2.2 GUIs to support data labeling, featuring, and evaluation analysis
    2.2.3 Opening up the black box to enable analysis
  2.3 Summary
Chapter 3. QSAnglyzer: Visual Analytics for Evaluation Analysis
  3.1 Introduction
  3.2 Related Work
    3.2.1 Visual analytics (VA) for multi-experiment result analysis
    3.2.2 VA for inspecting computational models
    3.2.3 VA for inspecting computational models, multi-view learning, and multiple clustering solutions
  3.3 User Requirement Analysis
    3.3.1 Identifying user requirements
    3.3.2 Preliminary investigation: Sunburst visualization
    3.3.3 Design rationales
  3.4 Prismatic Analysis
  3.5 QSAnglyzer
    3.5.1 Angles
    3.5.2 Interface design
  3.6 Evaluation
    3.6.1 Lab study with non-experts
    3.6.2 Use cases from expert reviews
  3.7 Discussion and Future Work
    3.7.1 Scaffolding evaluation analysis workflows
    3.7.2 Adding evaluation-based or automatic angles
    3.7.3 Supporting complex AI system development
    3.7.4 Applying prismatic analysis to other AI domains
  3.8 Conclusion
  3.9 Acknowledgments
Chapter 4. AnchorViz: Interactive Visualization for Error Discovery in Interactive Machine Learning
  4.1 Introduction
    4.1.1 Motivating scenario
  4.2 Background and Related Work
    4.2.1 Unknown unknowns and prediction errors
    4.2.2 Searching for items to label
    4.2.3 Visualization for ML
    4.2.4 Semantic memory and concept decomposition
  4.3 AnchorViz
    4.3.1 Interface and system design
  4.4 Evaluation
    4.4.1 User study
    4.4.2 Quantitative and qualitative data analysis
  4.5 Results
    4.5.1 Discovered items