I Can Do Better Than Your AI: Expertise and Explanations
James Schaffer (US Army Research Laboratory, Playa Vista, CA) [email protected]
John O'Donovan (University of California Santa Barbara, Santa Barbara, CA) [email protected]
James Michaelis (US Army Research Laboratory, Adelphi, MD) [email protected]
Adrienne Raglin (US Army Research Laboratory, Adelphi, MD) [email protected]
Tobias Höllerer (University of California Santa Barbara, Santa Barbara, CA) [email protected]
ABSTRACT
Intelligent assistants, such as navigation, recommender, and expert systems, are most helpful in situations where users lack domain knowledge. Despite this, recent research in cognitive psychology has revealed that lower-skilled individuals may maintain a sense of illusory superiority, which might suggest that users with the highest need for advice may be the least likely to defer judgment. Explanation interfaces – a method for persuading users to take a system's advice – are thought by many to be the solution for instilling trust, but do their effects hold for self-assured users? To address this knowledge gap, we conducted a quantitative study (N=529) wherein participants played a binary decision-making game with help from an intelligent assistant. Participants were profiled in terms of both actual (measured) expertise and reported familiarity with the task concept. The presence of explanations, level of automation, and number of errors made by the intelligent assistant were manipulated while observing changes in user acceptance of advice. An analysis of cognitive metrics led to three findings for research in intelligent assistants: 1) higher reported familiarity with the task simultaneously predicted more reported trust but less adherence, 2) explanations only swayed people who reported very low task familiarity, and 3) showing explanations to people who reported more task familiarity led to automation bias.

CCS CONCEPTS
• Information systems → Expert systems; Personalization; • Human-centered computing → User models; HCI design and evaluation methods.

KEYWORDS
User Interfaces; Human-Computer Interaction; Intelligent Assistants; Information Systems; Decision Support Systems; Cognitive Modeling

ACM Reference Format:
James Schaffer, John O'Donovan, James Michaelis, Adrienne Raglin, and Tobias Höllerer. 2019. I Can Do Better Than Your AI: Expertise and Explanations. In Proceedings of the 24th International Conference on Intelligent User Interfaces (IUI '19). ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3301275.3302308

ACM acknowledges that this contribution was authored or co-authored by an employee, contractor, or affiliate of the United States government. As such, the United States government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for government purposes only.
IUI '19, March 17–20, 2019, Marina del Ray, CA, USA
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6272-6/19/03...$15.00
https://doi.org/10.1145/3301275.3302308

1 INTRODUCTION
Many people express distrust or negative sentiment towards intelligent assistants such as recommender systems, GPS navigation aids, and general-purpose assistants (e.g., Amazon's Alexa). Simultaneously, the amount of information available to an individual decision maker is exploding, creating an increased need for automation of the information filtering process. Distrust may be due to intelligent assistants being overly complex or opaque, which generates uncertainty about their accuracy or the relevance of their results. This creates an enormous challenge for designers: even if intelligent assistants can deliver the right information at the right time, it can still be a challenge to persuade people to trust, understand, and adhere to their recommendations.

While trust is thought to be a primary issue in system acceptance, a user's self-assessment of his or her own knowledge may also play a significant role. Personality traits related to self-reported ability have been studied recently in cognitive psychology [33, 41].
A well-known result from this research is the "Dunning-Kruger" effect, where low-ability individuals maintain an overestimated belief in their own ability. This effect states that a lack of self-awareness, which is correlated with lower cognitive ability, may lead some to inflate their own skills while ignoring the skills of others. Moreover, the broader study of "illusory superiority," which tells us that it takes a highly intelligent person to gauge the intelligence of others, may play a role in the adoption of intelligent assistants. This begs the question: how can information systems sway over-confident users who believe they are smarter than the machine?

Explanations are becoming the commonly accepted answer to making intelligent assistants more usable (e.g., [29]). Explanations inform users about the details of automated information filtering (such as in recommender systems) and help to quantify uncertainty. Research into explanations started with expert systems studies [1, 28], and more recently their trust and persuasion effects have been studied in recommender systems [36, 50]. Beyond this, explanations have been shown to fundamentally alter user beliefs about the system, including competence, benevolence, and integrity [56]. On the negative side, theories of human awareness [17] and multitasking [23] predict that users who spend time perceiving and understanding explanations from intelligent assistants may have lower awareness of other objects in their environment (or must necessarily take more time to complete a given task), which could negatively affect performance [9]. This might have implications for "automation bias," or the over-trusting of technology [13], which occurs when users become complacent with delegating to the system. Whether explanations alleviate or exacerbate automation bias is still an open research question. There are still more intricacies to explanations; for instance, recent research in intelligent user interfaces [4] has found that explanations interact with user characteristics [25] to affect trusting beliefs about a recommender system, implying that not all explanations are equal for all users. Moreover, since intelligent assistants are often presented as autonomous, cognitive psychology might tell us that explanations that sway a cooperative and trusting user [7, 35] might be completely ignored by an overconfident individual. This problem could be amplified when imperfections in the systems are perceived.

The above concerns illustrate the idea that the use of explanations might need to be tailored to the user's self-confidence and the potential error of the information system or intelligent assistant. This work investigates how a person's prior reported familiarity with a task vs. their actual measured expertise affects their adherence to recommendations. A study of N=529 participants was conducted on a binary choice task (the n-player Iterated Prisoner's Dilemma [2], here introduced in the form of the Diner's Dilemma [24]) with assistance from an experimentally manipulated intelligent assistant, dubbed the "Dining Guru." With this constrained setup, we were able to make theoretically ideal recommendations to the participants, or precisely control the level of error. We present an analysis of participant data that explains the consequences of self-assessed familiarity (our measure of confidence) for intelligent assistants. Changes in adherence to recommendations, trust, and situation awareness are observed. Our analysis allows us to address the following research questions:

(1) How does a person's self-assessed task knowledge predict adherence to advice from intelligent assistants?
(2) Are rational explanations from an intelligent assistant effective on over-confident people?
(3) What is the relationship between over-confidence, automation bias, system error, and explanations?

2 BACKGROUND
This section reviews the work that serves as the foundation of the experiment design and interpretation of participant data.

Explanations and Intelligent Assistants
The majority of intelligent assistant explanation research has been conducted in recommender and expert systems [49], and older forms of explanations present multi-point arguments (e.g., Toulmin-style argumentation [32]) about why a particular course of action should be chosen. These explanations could be considered "rational" in that they explain the mathematical or algorithmic properties of the systems they explain, which contrasts with more emotional explanations, such as those that are employed by virtual agents [31, 38]. The differential impact of explanation on novices and experts has also been studied [1]: novices are much more likely to adhere to recommendations due to a lack of domain knowledge, while expert users require a strong "domain-oriented" argument before adhering to advice. In this context, "expert" refers to a true expert (indicated by past experience) rather than a self-reported expert. Most of these studies focus on decision-making domains (financial analysis, auditing problems), where success can be determined objectively. The work presented here builds on the research by Arnold et al. [1] by quantifying objectively determined expertise and assessing its relationship with self-reported familiarity and interaction effects with system explanations.

Errors by Intelligent Assistants
Intelligent assistants often vary in their degree of automation and effectiveness. The pros and cons of varying levels of automation have been studied in human-agent teaming [8]. Less prompting of the user for intervention may reduce cognitive load, but might also reduce awareness of system operations. Although system effectiveness continues to improve (e.g., [39]), it is not conclusive that improved algorithms result in improved adherence to system recommendations; for instance, no effect of system error was found in Salem et al. [52]. In contrast, Yu et al. found that system errors do have a significant effect on trust [57], and Harman et al. found that users trust inaccurate recommendations more than they should [30]. These research studies also call the relationship between trust and adherence into question. In this study, we attempt to clarify this relationship through simultaneous measurement of trust and adherence while controlling for system error and automation.

Self-reported Ability
Consequences of self-reported ability have been recently discovered in studies of cognitive psychology [33, 41]. As mentioned, the "Dunning-Kruger" effect predicts that low-ability individuals maintain an overestimated belief in their own ability. This work also illustrates how quantitative metrics collected through questionnaires do not always measure their face value. For instance, the Dunning-Kruger effect shows us that asking a user how much he knows about a particular topic will only quantify the number of "unknown unknowns" relative to the user, rather than their actual ability in that area [48]. Deficits in knowledge are a double burden for these users, not only causing them to make mistakes but also preventing them from realizing they are making mistakes [16, 42].

The Dunning-Kruger effect is part of a larger group of cognitive effects sometimes referred to as illusory superiority. Other effects in this category create an additional concern for the success of intelligent assistants. For instance, it is known that estimating the intelligence of others is a difficult task that itself requires high intelligence (the "Downing effect") [14]. This explains the tendency of people to rate themselves as "above average," even though not everyone can be so. We might expect that lower-intelligence users would fail to accurately gauge the intelligence of information systems, leading to disuse.

The research on self-reported ability leads us to hypothesize that overconfident individuals are less likely to interact with or adhere to intelligent assistants, due to the overestimation of their own ability and their inability to assess the accuracy of the system. A main challenge in this work is quantifying illusory superiority for abstract trust games. Herein, we solicited users to self-report their familiarity with games such as the Diner's Dilemma, Iterated Prisoner's Dilemma, or the Public Goods game beforehand. Anyone with a passing knowledge of the rules of these games should have an obvious advantage over their peers; for example, a commonly known fact among those familiar with trust games is that "tit-for-tat," which rewards cooperation with cooperation while consistently but not vengefully penalizing defection, is one of the best strategies to employ in the Iterated Prisoner's Dilemma.

Awareness and Automation Bias
Human awareness is a focus of many US Department of Defense agencies [19], wherein it is referred to as "situation awareness." Situation awareness is defined as "the perception of elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future." Endsley's theory models human decision-making in dynamic environments and has become well accepted. There are three states of situation awareness: perception, comprehension, and projection, all of which contribute to an individual's ability to make good decisions. Perhaps the biggest threat to situation awareness is automation bias [13], which predicts that the more of the information-gathering process is automated, the more users will "check out." This is sometimes called "overtrusting" [34] and can, in some instances, lead to catastrophic failure¹. Work on the Dunning-Kruger effect and automation bias led us to hypothesize that overconfident users would be less susceptible to automation bias, since it is likely they would prefer to perform the task on their own.

¹https://www.theguardian.com/technology/2016/jun/30/tesla-autopilot-death-self-driving-car-elon-musk

Trust
Despite the research on automation bias, research in intelligent assistants still points towards trust as an important factor to indicate usability. In the research literature, the term "trust" is used with many different semantic meanings, and the way it is measured has implications for what it can predict. For instance, previous research in recommender systems has shown that trust and system perceptions are highly correlated [10, 36, 37, 40]. More generally, research by McKnight ties the concepts of trust in technology and trust in other people [46], showing that people can discriminate between the two. In this work, we use a set of question items adapted from [47] for trust in a specific technology (in this case, the Dining Guru). This way, we can quantitatively assess the relationship between reported trust and observed adherence – a relationship that is often assumed.

Iterated Prisoner's Dilemma (IPD) and Diner's Dilemma
The Diner's Dilemma was chosen for the research platform due to its wide applicability, limited complexity, and research base. The Diner's Dilemma is an n-player, iterated version of the basic Prisoner's Dilemma. In the basic Prisoner's Dilemma, two players decide to take an action (cooperate or defect) without communication beforehand, where defection leads to a higher outcome for an individual regardless of the other player's actions, but with mutual cooperation leading to a higher outcome than mutual defection. The iterated form of this game, the Iterated Prisoner's Dilemma, is useful for studying many real-world situations such as military arms races, cooperation in business settings, economics, animal behavior, and doping in sports. Although motives to cooperate in the Iterated Prisoner's Dilemma can be dependent on many other factors such as altruistic tendencies, emotion, or the five-factor model [11, 15, 51], the game still represents a fairly simple binary choice task: in each round there are two options, with each option leading to a different objectively-measured outcome based on the current state of the game. Multi-player versions of this game, such as the Diner's Dilemma, are more complex, which has made them suitable for studying the effects of increased information available to players through a user interface [27, 45]. In this experiment, the 3-player version was used, which makes it sufficiently easy to understand but still sufficiently complex as to warrant a computational aid. Moreover, this version of the game has been previously validated in terms of understandability and information requirements [53].

3 GAME DESIGN
In the Diner's Dilemma, several diners eat out at a restaurant, repeatedly over an unspecified number of days, with the agreement to split the bill equally each time. Each diner has the choice to order the inexpensive dish (hotdog) or the expensive dish (lobster). Diners receive a better dining experience (here, quantified as dining points) when everyone chooses the inexpensive dish compared to when everyone chooses the expensive dish. To be a valid dilemma, the quality-cost ratio of the two items available in a Diner's Dilemma game must meet a few conditions. First, if the player were dining alone, ordering hotdog should maximize dining points. Second, players must earn more points when they are the sole defector than when all players cooperate. Finally, the player should earn more points when the player and the two co-diners all defect than when the player is the only one to cooperate. This "game payoff matrix" means that in one round of the game, individual diners are better off choosing the expensive dish regardless of what the others choose to do. However, over repeated rounds, a diner's choice can affect the perceptions of other co-diners and cooperation may develop, which affects the long-term prosperity of the group. Hotdog/lobster cost and quality values for the game are shown in Figure 1, under each respective item, resulting in the payoff matrix that is shown in Table 1.

Table 1: Diner's Dilemma choice payoff matrix.

    Player chooses:           Hotdog   Lobster
    2 co-diners cooperate      20.00     24.00
    1 co-diner cooperates      12.00     17.14
    Neither cooperates          8.57     13.33

Participants played the Diner's Dilemma with two simulated co-diners. The co-diners were not visually manifested so as to avoid any confounding emotional responses from participants (see [11]). The co-diners played variants of Tit-for-Tat (TFT), a proven strategy for success in the Diner's Dilemma, wherein the co-diner makes the same choice that the participant did in the previous round. To make the game more comprehensible for participants, simulated co-diners reacted only to the human decision and not to each other. In order to increase the information requirements of the game, some noise was added to the TFT strategy in the form of an increased propensity to betray (respond to a hotdog order with a lobster order) or forgive (respond to a lobster order with a hotdog order). Participants played three games with an undisclosed number of rounds (approximately 50 per game) and co-diner strategies switched between games. This means that the primary task for the user was to figure out what strategies the co-diners were employing and adjust accordingly. In the first game, co-diners betrayed often and the best strategy was to order lobster. In the second game, co-diners betrayed at a reduced rate and also forgave to some degree, which made hotdog the best choice. In the final game, co-diners were very forgiving and rarely ordered lobster even when betrayed, which again made lobster the best choice. The mean performance of participants in each game is shown in Table 4.

4 USER INTERFACE DESIGN
Participants played the game through the interface shown in Figure 1. This interface contains four components: the last round panel (left), the control panel (center), the history panel (bottom), and the virtual agent, the Dining Guru (right). Across treatments, all panels remained the same except for the Dining Guru, which varied (see Figure 2).

Figure 1: The user interface for the game.

Participants were provided with a basic interface containing all of the information that the Dining Guru used to generate its advice. The last round panel was shown on the left side of the interface and the control panel was shown in the middle. Together, these panels displayed the current dining points, the food quality and cost of each menu item, the current round, and the results from the previous round in terms of dining points. These panels allowed the participant to make a choice in each round. On the lower portion of the screen a history panel was provided. This panel contained information about who chose what in previous rounds and reciprocity rates.

Dining Guru
The Dining Guru was shown on the right side of the screen. In each round, the Dining Guru could be examined by a participant to receive a recommendation about which item (hotdog or lobster) would maximize their dining points. As with the simulated co-diners, the Dining Guru was not given any dynamic personality beyond being presented as an agent; a static drawing was used to communicate this. Users were required to mouse over the Dining Guru to invoke a recommendation, which made it possible to measure adherence. Recommendations were generated by calculating the expected value of ordering hotdog or lobster in the future, based on maximum likelihood estimates of the rates of forgiveness and betrayal from co-diners. Due to the fixed strategy of the simulated co-diners, the Dining Guru made the "best possible" choice in each round, with most of the errors occurring in earlier rounds when information was incomplete. A "manual" version of the Dining Guru was given in some treatments, which required participants to supply the Dining Guru with estimates of hotdog and lobster reciprocity rates ("Automation=0" cases in Figure 2) before receiving a recommendation. The rationale for this version is that by allowing the participant to become more engaged with the system, higher SA may result.

5 EXPERIMENT DESIGN
This section describes the experiment procedure, variables, and tested hypotheses.

Procedure
Reported familiarity was measured during the pre-study and trust was measured during the post-study (question items are given in Table 3). Before playing the game, participants were introduced to game concepts related to the user interface, reciprocity, and the Dining Guru by playing practice rounds.
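To make the game mechanics concrete, the following sketch implements the equal-bill-split payoff rule and a noisy tit-for-tat co-diner. The dish costs and qualities used here (hotdog: cost 1, quality 20; lobster: cost 3, quality 40) are our own assumptions chosen to reproduce Table 1, since the exact values appear only in Figure 1; the betray/forgive parameters are likewise illustrative.

```python
import random

# Assumed dish parameters (consistent with Table 1; the study's actual values
# are shown in Figure 1).
COST = {"H": 1.0, "L": 3.0}       # hotdog, lobster cost
QUALITY = {"H": 20.0, "L": 40.0}  # hotdog, lobster quality

def dining_points(player, codiner1, codiner2):
    """Dining points = own dish quality divided by the per-diner share of the bill."""
    bill_share = (COST[player] + COST[codiner1] + COST[codiner2]) / 3.0
    return QUALITY[player] / bill_share

def noisy_tft(last_player_choice, p_betray, p_forgive):
    """Tit-for-tat over the participant's previous order, with added noise:
    betray = answer a hotdog order with lobster; forgive = answer lobster with hotdog."""
    if last_player_choice == "H":
        return "L" if random.random() < p_betray else "H"
    return "H" if random.random() < p_forgive else "L"

# Reproduce Table 1: rows = co-diner orders, columns = player's own order.
for codiners in (("H", "H"), ("H", "L"), ("L", "L")):
    print(codiners, [round(dining_points(p, *codiners), 2) for p in ("H", "L")])
```

With these assumed values the loop prints exactly the six payoffs of Table 1, and `noisy_tft` degenerates to plain tit-for-tat when both noise rates are zero.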
Table 2: A list of variables used in the analysis. Mean (µ) and standard deviation (σ) are given for dependent variables. Self-reported familiarity and reported trust were strictly latent factors, thus having a mean of 0 and standard deviation of 1. Awareness and measured expertise were parcelled (averaged) into latent factors before analysis.

    Variable Name                              µ      σ
    Explanation (0 or 1)                       -      -
    Automation (0 or 1)                        -      -
    Error (0, 0.5, or 1)                       -      -
    Self-reported Familiarity                  0      1
    Reported trust in the Dining Guru          0      1
    Measured Expertise                         8.08   1.64
    Awareness of game elements                 0.99   0.556
    Adherence to system advice                 0.33   0.25
    Optimal Moves (as ratio, 0.0 to 1.0)       0.59   0.12

Table 3: Factors determined by participant responses to subjective questions. R² reports the fit of the item to the factor. Est. is the estimated loading of the item to the factor. α is Cronbach's alpha (a measurement of reliability, see [54]). AVE is average variance extracted.

    Self-rep. Familiarity (α = 0.80, AVE = 0.59)            R²     Est.
    I am familiar with abstract trust games.                0.52   1.27
    I am familiar with the Diner's Dilemma.                 0.47   1.14
    I am familiar with the public goods game.               0.76   1.58

    Trust (α = 0.9, AVE = 0.76)                             R²     Est.
    I trusted the Dining Guru.                              0.79   1.36
    I could rely on the Dining Guru.                        0.74   1.29
    I would advise a friend to take advice from             0.75   1.37
    the Dining Guru if they played the game.
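For reference, the Cronbach's α values reported in Table 3 are computed from item variances and the variance of the summed scores. A minimal sketch (the Likert responses below are made up for illustration, not the study data):

```python
def cronbach_alpha(items):
    """items: one list of responses per question (same respondents in each).
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    k = len(items)
    totals = [sum(resp) for resp in zip(*items)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Hypothetical 1-7 Likert agreement ratings from five respondents
# to the three trust items of Table 3.
trust_items = [
    [7, 5, 6, 2, 4],  # "I trusted the Dining Guru."
    [6, 5, 7, 1, 4],  # "I could rely on the Dining Guru."
    [7, 4, 6, 2, 5],  # "I would advise a friend to take advice from it."
]
print(round(cronbach_alpha(trust_items), 2))
```

Because the three fabricated item vectors move together, the reliability comes out high, mirroring the α = 0.9 reported for the trust factor.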
Several training questionnaires, which could be resubmitted as many times as needed, were used to help participants learn the game. The Dining Guru was introduced as an "AI adviser" and participants learned how to access it and what its intentions were. Participants were told that the Dining Guru was not guaranteed to make optimal decisions and that choosing to take its advice was their choice. Expertise was measured just after training by assessing retention of game and user interface concepts via an eleven-item questionnaire. During the primary game phase, participants played three games of Diner's Dilemma against three configurations of simulated co-diners with varying behavior characteristics.

Independent Variables
Two levels of automation (Automation = 0, Automation = 1), two levels of explanation (Explanation = 0, Explanation = 1), and three levels of recommendation error (Error = 0, Error = 0.5, Error = 1) were manipulated between-subjects (see Figure 2 for a visual). All manipulations (three parameters, 3 × 2² = 12 manipulations) were used as between-subjects treatments in this experiment.

The explanation for the Dining Guru was designed to accurately reflect the way that it was calculating recommendations. This was to align it with other forms of "rational" explanation [32], such as collaborative filtering explanations [49]. Since the Dining Guru calculates maximum-likelihood estimates of co-diner behavior and cross-references these with the payoff matrix to produce recommendations, the explanation thus needed to contain estimates for the expected points per round of each choice. Additionally, in the manual version, in which the human player would provide reciprocation rate estimates to the Dining Guru, a text blurb appeared explaining the connection between co-diner reciprocity rates and the expected per-round average. The explanatory version relayed the expected per-round averages graphically, in a bar chart, right above the recommendation, so that participant attention would be drawn to the explanation. The explanation variable thus represents whether the system was a "black-box" (Explanation = 0) or a "white-box" (Explanation = 1) for each participant.

In the "manual" treatment (Automation = 0), the Dining Guru did not update until prompted by the user, who was required to provide estimates of co-diners' reciprocity rates. The estimates were provided by moving two sliders, which displayed and took no value until users first interacted with them. Users could freely experiment with the sliders, which means that they could be used to understand the relationship between the payoff matrix and co-diner reciprocity rates.

Three levels of error were manipulated: no-error, weak-error, and full-error. In the no-error treatment (Error = 0.0), the Dining Guru produced recommendations that could be considered flawless, which, if followed, would result in optimal moves. The weak-error (Error = 0.5) version would randomly adjust the reciprocity estimates up or down by up to 25%. For instance, if the true hotdog reciprocity rate was 65%, the Dining Guru would use a value anywhere between 40 and 90%. This could cause the Dining Guru to "flip-flop" between recommendations. Finally, the full-error (Error = 1.0) condition adjusted reciprocity estimates by up to 50% in either direction. A practical consequence of this was that the Dining Guru would flip its recommendation almost every round. The error in the recommendations was reasonably hidden from participants and indeed was only noticeable when either explanation was present or the Dining Guru was not automated.
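A recommendation-with-error pipeline of this kind can be sketched as follows. The noise ranges follow the description above, but the function names and the payoff constants (chosen to be consistent with Table 1) are our own illustration, not the study's implementation.

```python
import random

COST = {"H": 1.0, "L": 3.0}       # assumed dish costs, consistent with Table 1
QUALITY = {"H": 20.0, "L": 40.0}  # assumed dish qualities

def points(player, c1, c2):
    """Dining points: own quality over the per-diner share of the bill."""
    return QUALITY[player] / ((COST[player] + COST[c1] + COST[c2]) / 3.0)

def perturb(rate, error_level):
    """Error = 0.5 shifts a reciprocity estimate by up to +/-25%,
    Error = 1.0 by up to +/-50%; estimates are clamped to [0, 1]."""
    shift = random.uniform(-0.5, 0.5) * error_level
    return min(1.0, max(0.0, rate + shift))

def recommend(p_h, p_l, error_level=0.0):
    """Expected points per round for ordering hotdog vs. lobster, given the
    (possibly perturbed) probabilities that a co-diner reciprocates each order."""
    p_h, p_l = perturb(p_h, error_level), perturb(p_l, error_level)
    ev = {}
    for choice, p in (("H", p_h), ("L", p_l)):
        other = "L" if choice == "H" else "H"
        ev[choice] = sum(
            prob1 * prob2 * points(choice, c1, c2)
            for c1, prob1 in ((choice, p), (other, 1 - p))
            for c2, prob2 in ((choice, p), (other, 1 - p))
        )
    return max(ev, key=ev.get), ev
```

With perfect estimates and no error this reproduces the intuition behind quiz items 9 and 10: `recommend(1.0, 1.0)` favors hotdog, while `recommend(0.0, 1.0)` favors lobster; raising `error_level` makes the returned recommendation unstable from call to call.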
Figure 2: Variations of the Dining Guru, based on automation and explanation treatments.
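As a quick check of the design size behind the Figure 2 variants, the between-subjects cells can be enumerated directly (the variable names here are our own):

```python
from itertools import product

# Between-subjects factors: three error levels crossed with two automation
# and two explanation levels -> 3 * 2 * 2 = 12 treatment cells.
ERROR_LEVELS = (0.0, 0.5, 1.0)
AUTOMATION_LEVELS = (0, 1)
EXPLANATION_LEVELS = (0, 1)

cells = list(product(ERROR_LEVELS, AUTOMATION_LEVELS, EXPLANATION_LEVELS))
print(len(cells))  # 12
```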
Dependent Variables
Adherence was said to occur for each round where the user choice matched the last recommendation given by the Dining Guru. The measurement of true adherence is a difficult research challenge, since the reason for a participant's decision following a recommendation must be the recommendation itself, and not because the participant would have chosen that regardless. The mouse-over design helped to reduce some of the uncertainty about the reason for the participant's choice. The adherence measurement was scaled between 0 and 1.

Performance, measured as the ratio of moves that were considered optimal, was analyzed on a per-participant basis for the purpose of validating the Dining Guru's design. While learning effects were found in the data, they are not the focus of this paper and were not involved in our statistical tests (discussion of learning effects in the Diner's Dilemma game can be found in [53]).

Reported familiarity was chosen to quantify illusory superiority; this was done for three reasons. First, agreement questions such as "I am an expert Iterated Prisoner's Dilemma player" would not have measured the desired factor: outside of tournaments where researchers submit algorithms to play the IPD², there is, to our knowledge, no competitive community of Diner's Dilemma or IPD players. Second, the "Dunning-Kruger" effect is demonstrated by measuring "unknown-unknowns," and Likert-scale questions about familiarity measure everything from "I don't even know what this is" to "thorough mastery," as the common dictionary definition suggests. Third, the Diner's Dilemma, similar to chess, is a game about information and decision-making. For instance, you would expect someone who is intimately familiar with the rules of chess to perform better than someone who does not know how the pieces move, but this is not true for people who may report being very familiar with baseball.

A factor analysis of trust and self-reported familiarity is given in Table 3. All of these items were taken on a Likert scale (1-7), which rated agreement. Furthermore, we demonstrate the external validity of these factors by examining correlations with both measured expertise and performance – if the self-reported familiarity metric is valid, negative correlations should be found for both. All factors achieved discriminant validity using the Campbell & Fiske test [6].

Situation awareness and (true) expertise were measured using testing methodologies. All-item parcels were used for each and the residual variance of the indicator variables was freed. We used a situation-awareness global assessment test [18] during game 2 to assess situation awareness. The situation awareness questionnaire contained 5 questions related to the current game state. The situation awareness question items each contained a slider (min: 0%, max: 100%) and asked participants to estimate: their current cooperation rate (1) and the hotdog (2,3) and lobster (4,5) reciprocity rates for each co-diner. The game interface was not available at this time. The situation awareness score was calculated by first summing up the errors from each of the 5 estimation questions and then inverting the scores based on the participant with the highest error, such that higher situation awareness scores are better. Finally, true expertise was measured just after the training with the training materials removed. This test was a set of eleven questions that relate to knowledge of the game and whether the participant possessed the ability to map from the current game state to the optimal decision:
2http://lesswrong.com/lw/7f2/prisoners_dilemma_tournament_results/ (1) How much does a hotdog cost? (slider response) IUI ’19, March 17–20, 2019, Marina del Ray, CA, USA Schaffer et al.
(2) How much does a lobster cost? (slider response)
(3) What is the quality of a hotdog? (slider response)
(4) What is the quality of a lobster? (slider response)
(5) In a one-round Diner's Dilemma game (only one restaurant visit), you get the least amount of dining points when... (four options)
(6) In a one-round Diner's Dilemma game (only one restaurant visit), you get the most amount of dining points when... (four options)
(7) Which situation gets you more points? (two options)
(8) Which situation gets you more points? (two options)
(9) Suppose you know for sure that your co-diners reciprocate your Hotdog order 100% of the time and reciprocate your Lobster order 100% of the time. Which should you order for the rest of the game? (H/L)
(10) Suppose you know for sure that your co-diners reciprocate your Hotdog order 0% of the time and reciprocate your Lobster order 100% of the time. Which should you order for the rest of the game? (H/L)
(11) Suppose you know for sure that your co-diners reciprocate your Hotdog order 50% of the time and reciprocate your Lobster order 50% of the time. Which should you order for the rest of the game? (H/L)

Hypotheses

Our analysis first considered the validity of the task and study design. Most importantly, the Dining Guru without errors should perform much better than the participants.

• H1: The Dining Guru in the no-error condition outperforms the participants on average.

Next, measured expertise and reported familiarity should have consequences for performance. To validate that self-reported familiarity quantifies the illusory superiority effect, reported familiarity should also negatively correlate with measured expertise, which in turn should positively correlate with performance. Thus:

• H2: Self-reported familiarity predicts decreased performance.
• H3: Measured expertise predicts increased performance.
• H4: Self-reported familiarity predicts decreased expertise.

To answer the research questions, we used a combination of ANOVA and structural equation modeling (SEM) [55]. To address the first research question, we test the relationships among measured expertise, reported familiarity, adherence, and trust via SEM:

• H5: Reported familiarity predicts decreased adherence and trust.
• H6: Measured expertise predicts increased adherence and trust.

To address the second research question, we test for a general effect of explanation on adherence. To generate a more fine-grained picture of this effect, we break self-reported familiarity into quartiles and test where differences exist.

• H7: Explanations increase adherence regardless of self-reported familiarity.

Third, we study the relationships between system variables and their effects on automation bias. We hypothesized that the occurrence of automation bias may depend on the particular system design; for instance, making the system manual or presenting explanations may reduce automation bias. For this, automation, explanation, and error (as well as their interaction effects) are regressed onto situation awareness, adherence, and trust in the SEM for a total of 18 hypotheses.

To further investigate automation bias, we test the relationships between reported familiarity and measured expertise on situation awareness. Theories of illusory superiority and automation bias might lead us to hypothesize that over-confident users (who may be less likely to use the adviser) may have better situation awareness. Additionally, we hypothesized that explanations might have different effects on awareness at different levels of measured expertise and self-reported familiarity. To test this, we build an indicator product of the interaction effects between reported familiarity/explanation (which was 0 or 1) and measured expertise/explanation [43]. This produced the variables Familiarity · Explanation and Expertise · Explanation. Then, we test the following hypotheses:

• H8: Self-reported familiarity predicts increased situation awareness.
• H9: Situation awareness predicts decreased adherence.
• H10: Expertise · Explanation predicts increased situation awareness.
• H11: Familiarity · Explanation predicts increased situation awareness.

For these final hypotheses, we also break expertise and familiarity up into quartiles and run an ANOVA to find the particular condition where situation awareness is being impacted (see Figure 5).

6 RESULTS

This section reports the results from the online experiment. Two SEMs are used to answer the research questions: the first to validate the task (Figure 3) and the second to build a model of participant data that can provide evidence for or against each individual hypothesis (Figure 4). For the uninitiated, an SEM is essentially a model of simultaneous multiple regressions that also allows for mediation analysis and latent factor modeling (these latter two make it the best choice for testing our hypotheses). Each circle in the reported SEMs indicates a latent factor (a factor built on multiple question items to better account for variance), each square indicates an observed variable, and each arrow indicates a regression that
was fit with either a β (normalized effect size in standard deviations) or a B value for treatment variables (the effect size observed when the treatment is switched on). Co-variances are also reported (shown as bi-directional arrows), and the choice between regression and co-variance modeling was based on the relationship between the variables (effect sizes and significance values for these two choices are typically similar, and the modeling choice between the two is a question of interpretation). Our SEM models were fit using lavaan³ in R. A complete overview of SEM is given in [55]. Multiplicity control on our 29 hypotheses (11 explicit, 18 exploratory) was enforced using the Benjamini-Hochberg procedure with Q = 0.05 [3], which is recommended for SEM analysis [12]. Significance levels in this section are reported as follows: *** = p < .001, ** = p < .01, and * = p < .05.

³ http://lavaan.ugent.be/

In total, 867 participants attempted the study on Amazon Mechanical Turk (AMT) and eventually 551 completions were obtained (64% completion). Upon inspection of the pre-study data, we suspect that many of the dropouts were likely bots attempting to game our Qualtrics system, but failed when reaching the more complex game portion of the task. Participants who completed the study were paid $3.00 and spent an average of 11 minutes 24 seconds to complete the Diner's Dilemma training and 15 minutes 30 seconds to complete all three games. Participants were between 18 and 70 years of age and 54% were male. Recruitment was opened up internationally to improve the generalizability of the results, but we required workers to have 50 prior completed AMT tasks and to be fluent in English. The data was checked for satisficing by examining input patterns (e.g., marking the same value for every question item) and those users were removed, resulting in a cleaned dataset of N = 529 participants. Attention was also controlled for by our CRT metric (it is not unreasonable to assume many users of such systems will be inattentive "in the wild," thus we did not remove these users).

Figure 3: Correlations tested for validation of the task.

Task and Metric Validation

Simulations for the Dining Guru were run offline to establish its performance, which was compared with the participant data. Table 4 shows that, on average, participants performed slightly worse than the most error-prone version of the Dining Guru. This means that participants were on average better off when adhering to the Dining Guru's advice, regardless of error. This is also evidenced by the significance of the correlation between adherence and optimal moves shown in Figure 3. Thus we accept H1.

Figure 3 shows correlations in the participant data that explain the relationships between measured expertise, reported familiarity, and the number of optimal moves. A significant negative correlation was found between reported familiarity and optimal moves (β = −0.103*) and a significant positive correlation was found for measured expertise (β = 0.163***). A large negative correlation was found between measured expertise and self-reported familiarity (β = −0.304***). Thus we accept H2, H3, and H4.

Illusory Superiority

Next, we test the hypotheses related to illusory superiority. An SEM was built using the lavaan statistical package in R (Figure 4). Reported familiarity and measured expertise were used as regressors to explain adherence and trust. Significant effects were found for reported familiarity on trust (β = 0.241***) and adherence (β = −0.289). No direct effects were found for measured expertise; however, the model predicts that measured expertise does slightly increase adherence through a full mediation effect by situation awareness. This effect is also demonstrated by the weak correlation shown in Figure 3. Thus, we reject H5 and H6.

No general effect of explanation was found on adherence. However, Figure 5 demonstrates that explanations did cause a significant increase in adherence for the participants who reported being very unfamiliar with the task (from about 30% adherence to about 40%). No effect is found in the upper quartiles, but a decreasing trend in adherence can be seen as self-reported familiarity increases. Thus we reject H7.

Automation Bias

Next, we tested the effects and interactions of automation, error, and explanation to predict trust, adherence, and situation awareness. An effect of Automation · Explanation was found at the p = 0.029 level, but it did not pass our multiplicity control. Error was found to cause reduced adherence (B = −0.151**), but this effect does not hold for the automated system (B = 0.173***). Automation was also found to cause decreased situation awareness (B = −0.100*). This exploratory analysis confirmed automation bias, but also indicates that explanations may not be an effective remedy.
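The Benjamini-Hochberg multiplicity control applied to these tests can be sketched as follows. This is a minimal illustration of the standard procedure, not the study's analysis code, and the p-values in the example are placeholders:

```python
# Sketch of Benjamini-Hochberg multiplicity control (Q = 0.05).
# Placeholder p-values for illustration; not values from the study.

def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where a test survives BH control."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_passing_rank = rank
    # All tests at or below that rank are declared significant.
    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        survives[idx] = rank <= largest_passing_rank
    return survives

print(benjamini_hochberg([0.001, 0.029, 0.04, 0.8]))
# [True, False, False, False]
```

Note how even in this small example an effect at p = 0.029 fails to survive; with 29 simultaneous tests, as in the SEM analysis, the per-test threshold is stricter still.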
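The situation-awareness scoring described earlier (sum each participant's five estimation errors, then invert against the highest-error participant so that higher scores are better) can be sketched as below. The data and the exact form of the inversion step are illustrative assumptions, not taken from the study:

```python
# Sketch of the situation-awareness scoring: sum each participant's
# five estimation errors, then invert against the worst (highest-error)
# participant so that higher scores mean better awareness.
# Hypothetical data; the study's exact inversion may differ.

def situation_awareness_scores(errors_per_participant):
    totals = [sum(errors) for errors in errors_per_participant]
    worst = max(totals)
    # The worst participant scores 0; a perfect participant scores `worst`.
    return [worst - total for total in totals]

# Three hypothetical participants, five estimation errors each (in % points).
print(situation_awareness_scores([
    [5, 0, 10, 5, 0],      # total error 20
    [40, 30, 20, 10, 50],  # total error 150 (worst, scores 0)
    [0, 0, 5, 0, 0],       # total error 5 (best, highest score)
]))
# [130, 0, 145]
```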