AutoAIViz: Opening the Blackbox of Automated Artificial Intelligence with Conditional Parallel Coordinates

Daniel Karl I. Weidele, Justin D. Weisz, Erick Oduor, Michael Muller, Josh Andres, Alexander Gray and Dakuo Wang
IBM Research - [email protected]

arXiv:1912.06723v3 [cs.LG] 17 Jan 2020

ABSTRACT
Artificial Intelligence (AI) can now automate the algorithm selection, feature engineering, and hyperparameter tuning steps in a machine learning workflow. Commonly known as AutoML or AutoAI, these technologies aim to relieve data scientists from the tedious manual work. However, today's AutoAI systems often present limited to no information about the process by which they select and generate model results. Thus, users often neither understand the process nor trust the outputs. In this short paper, we provide a first user evaluation by 10 data scientists of an experimental system, AutoAIViz, that aims to visualize AutoAI's model generation process. We find that the proposed system helps users to complete the data science tasks and increases their understanding, toward the goal of increasing trust in the AutoAI system.

CCS CONCEPTS
• Human-centered computing → Empirical studies in visualization; • Computing methodologies → Continuous space search; Machine learning approaches.

KEYWORDS
AutoAI, AutoML, visualization, parallel coordinates, human-AI collaboration, democratizing AI

ACM Reference Format:
Daniel Karl I. Weidele, Justin D. Weisz, Erick Oduor, Michael Muller, Josh Andres, Alexander Gray and Dakuo Wang. 2020. AutoAIViz: Opening the Blackbox of Automated Artificial Intelligence with Conditional Parallel Coordinates. In 25th International Conference on Intelligent User Interfaces (IUI '20), March 17–20, 2020, Cagliari, Italy. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3377325.3377538

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
IUI '20, March 17–20, 2020, Cagliari, Italy
© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-7118-6/20/03...$15.00
https://doi.org/10.1145/3377325.3377538

1 INTRODUCTION
Data scientists apply machine learning techniques to gain insights from data in support of making decisions [26]. The challenge is that this process is labor intensive, requiring input from multiple specialists with different skill sets [1, 31, 34, 47, 48]. As a result, AI and Human-Computer Interaction (HCI) researchers have investigated how to design systems with features that support data scientists in creating machine learning models [24, 28, 34, 41, 42, 44]. This work is often referred to as human-centered machine learning or human-in-the-loop machine learning [9, 28, 44].

Recently, techniques have been developed to automate various steps of the data science process, such as preparing data, selecting models, engineering features, and optimizing hyperparameters [22, 25, 27, 30, 50]. These techniques – known as AutoAI or AutoML – are being incorporated into commercial products (e.g. [11, 12, 16, 32, 39]) as well as open source packages [4, 7, 36]. Although the ability of AutoAI to speed up the model-building process is very promising, there are challenges due to the opaque nature of how AutoAI actually creates models [28, 44, 45]. Users do not necessarily understand how and why AutoAI systems make the choices they make when selecting algorithms or tuning hyperparameter values. This "blackbox" nature of AutoAI operation hinders the collaboration between data scientists and AutoAI systems [45].

Several solutions have been proposed to make AutoAI more transparent in its operation in order to increase trust. These proposals include documenting the process by which machine learning models were created [13, 33] and using visualizations to show the search space of candidate algorithms or hyperparameter values [10, 37, 45]. However, these solutions either document the state of an AI model after it has been created, or focus on only one step of the model generation workflow (e.g., searching algorithms or selecting hyperparameters). There remains a gap in providing users with an overview of how an AutoAI system works in operation, from the moment data are read to the moment candidate models are produced.

In this paper, we provide a first user evaluation of a new kind of visualization – Conditional Parallel Coordinates (CPC) [46] – to open up the "black box" of AutoAI operation. CPC is a variation of the parallel coordinates visualization [18, 19] that allows the user to interactively expose increasing amounts of detail on top of a conventional parallel coordinates display. We integrated CPC into an existing AutoAI platform [44] and conducted a usability study with 10 professional data scientists to understand its usefulness. Our results suggest that data scientists are able to successfully glean information from the visualization in order to gain a better understanding of how AutoAI makes decisions while models are being built. Our work shows how increased process transparency leads to improved collaboration between data scientists and AutoAI tools, as a step toward democratizing AI for non-technical users.

Figure 1: Screenshot showing the CPC visualization (top) and the leaderboard (bottom, partially shown). In CPC, each colored line running from left to right represents a machine learning pipeline and corresponds to a row in the leaderboard. In this example, all pipelines consist of three steps: transformation step 1, transformation step 2, and estimator. Pipelines are evaluated on four metrics: group disparity, prediction time, and ROC AUC scores on the training data and the holdout data.

2 RELATED WORK

2.1 Human-in-the-loop Data Science and Automated Machine Learning
Many studies have focused on understanding the work practices of data scientists. Muller et al. and Hou & Wang both reported that data science workflows operate as an iterative conversation between data scientists and other stakeholders, focused around data and model artifacts [15, 34]. As such, data science work is a confluence of people with many different roles, skills, and needs [48]. Amershi et al. highlighted the importance of adopting a human-centered approach to building data science tools [2, 3]. Kross & Guo found that users expressed a strong desire for an integrated user interface with both code and narrative. These needs are perhaps best captured in the narrative uses [24] of the Jupyter Notebook environment [20, 21], and researchers have conducted numerous studies of how data scientists incorporate notebooks into their workflows [38, 41], how they conduct version control for notebooks [23], and how they enable simultaneous multi-user editing in notebooks [42].

What remains less well understood is how data scientists will incorporate automated machine learning technologies into their workflows. While initial work suggests that data scientists see AutoAI tools as collaborative [44], much about the underlying process by which AutoAI operates remains obscure to data scientists [10, 28, 37, 45]. Our efforts in this work focus on increasing trust in AutoAI by increasing the transparency of its operation.

2.2 Trust via Transparency in AI Systems
Recent criticisms of "blackbox" machine learning models (e.g. [29, 30, 40, 49]) begin with the notion that if one does not understand how a model produces a recommendation, then that recommendation may not be trustworthy. The issue of trust may be crucial for AutoAI systems [6]. Not only must a model's recommendations be […]

[…] tuning, enabling users to monitor the process and adjust the search space in real time. While these visualizations improve transparency of single components in the AutoAI process, no visualization has yet tackled a representation of the entire process, from data ingestion to model evaluation. To visualize the entire process, we use Conditional Parallel Coordinates (CPC), which offers data scientists (a) a real-time overview of the machine learning pipelines as they are being created, and (b) the ability to view detailed information at each step in the AutoAI process.

3 CONDITIONAL PARALLEL COORDINATES FOR AUTOAI
We provide a first user evaluation of a recently proposed Conditional Parallel Coordinates (CPC) component [46]. Compared to classic Parallel Coordinates (PC) visualizations [8, 17], CPC introduces new layers of detail recursively.

Pipeline configuration data generated by an AutoAI search process is conditional data: hyperparameters are conditioned on the choice made at each pipeline step. This conditionality enables CPC, which allows the user to inspect the conditional layers. AutoAI first chooses a sequence of transformation steps before training an estimator; then, the hyperparameters of these choices are fine-tuned. Further, the user can bound the AutoAI search space by setting constraints, such as group disparity against model bias, or prediction runtime for inference speed.
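To make the conditionality concrete, the structure can be sketched in a few lines of Python. This is an illustrative model of the data only, not the authors' implementation: the step names, option names, and hyperparameter ranges are all hypothetical.

```python
import random

# Hypothetical AutoAI search space: hyperparameters exist only
# conditioned on the option chosen at each pipeline step.
SEARCH_SPACE = {
    "transformation_1": {
        "PCA": {"n_components": [2, 5, 10]},
        "StandardScaler": {},  # no tunable hyperparameters
    },
    "transformation_2": {
        "SelectKBest": {"k": [5, 10, 20]},
    },
    "estimator": {
        "LogisticRegression": {"C": [0.01, 0.1, 1.0]},
        "RandomForest": {"n_estimators": [50, 100, 200]},
    },
}

def sample_pipeline(rng):
    """Pick an option per step, then hyperparameters conditioned on it."""
    pipeline = {}
    for step, options in SEARCH_SPACE.items():
        choice = rng.choice(sorted(options))
        params = {name: rng.choice(values)
                  for name, values in options[choice].items()}
        pipeline[step] = {"choice": choice, "params": params}
    return pipeline

rng = random.Random(0)
p = sample_pipeline(rng)
# Each sampled pipeline is one polyline in CPC: a top-level coordinate
# per step, expandable into the hyperparameters conditioned on it.
```

In this reading, each top-level CPC axis corresponds to a pipeline step, and expanding a step reveals a nested set of axes that exists only for the option chosen there.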
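Rendering pipelines as parallel coordinates ultimately reduces to normalizing each metric onto its axis, so that every pipeline becomes one polyline across the metric axes. A minimal, library-free sketch of that mapping follows; the metric names echo those in Figure 1, but the leaderboard values are invented for illustration.

```python
def normalize_axes(rows, metrics):
    """Map each pipeline's metric values to [0, 1] per axis; each inner
    list gives the vertical positions of one polyline in a parallel
    coordinates display."""
    positions = []
    for row in rows:
        line = []
        for m in metrics:
            values = [r[m] for r in rows]
            lo, hi = min(values), max(values)
            line.append(0.0 if hi == lo else (row[m] - lo) / (hi - lo))
        positions.append(line)
    return positions

# Invented leaderboard rows for three candidate pipelines.
rows = [
    {"roc_auc_train": 0.91, "roc_auc_holdout": 0.88, "prediction_time": 0.12},
    {"roc_auc_train": 0.85, "roc_auc_holdout": 0.84, "prediction_time": 0.05},
    {"roc_auc_train": 0.97, "roc_auc_holdout": 0.86, "prediction_time": 0.40},
]
metrics = ["roc_auc_train", "roc_auc_holdout", "prediction_time"]
lines = normalize_axes(rows, metrics)
```

Min-max scaling per axis is the standard choice for parallel coordinates; a real implementation would also handle axis inversion (e.g., lower prediction time is better) and categorical axes for the pipeline steps themselves.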