SIGCHI Conference Paper Format

View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Lancaster E-Prints Evaluation Strategies for HCI Toolkit Research David Ledo1*, Steven Houben2*, Jo Vermeulen3*, Nicolai Marquardt4, Lora Oehlberg1 and Saul Greenberg1 1 Department of Computer Science, University of Calgary, Canada · {first.last}@ucalgary.ca 2 School of Computing and Communications, Lancaster University, UK · [email protected] 3 Department of Computer Science, Aarhus University, Denmark · [email protected] 4 UCL Interaction Centre, University College London, UK · [email protected] *Authors contributed equally to the work. ABSTRACT a medium to explore research contributions, whereas others Toolkit research plays an important role in the field of HCI, embody their contributions in the code itself [27]. Some- as it can heavily influence both the design and implementa- times, toolkit researchers are asked for a particular evaluation of interactive systems. For publication, the HCI commu- tion method without consideration of whether such an eval- nity typically expects toolkit research to include an evalua- uation is necessary or appropriate to the particular toolkit tion component. The problem is that toolkit evaluation is contribution. Consequently, acceptance of toolkits as a re- challenging, as it is often unclear what ‘evaluating’ a toolkit search contribution remains a challenge and a topic of much means and what methods are appropriate. To address this recurrent discussion [8,27,30,66,73,82]. In line with other ar- problem, we analyzed 68 published toolkit papers. From our eas of HCI [30,82], we should expect HCI toolkit research to analysis, we provide an overview of, reflection on, and dis- use appropriate evaluation methods to best match the partic- cussion of evaluation methods for toolkit contributions. We ular research problem under consideration [31,45,86]. How- identify and discuss the value of four toolkit evaluation strat- ever, while research to date has used evaluation methods, egies, including the associated techniques that each employs. there is little overall reflection on what methods are used to We offer a categorization of evaluation strategies for toolkit evaluate toolkits, when these are appropriate, and how the researchers, along with a discussion of the value, potential methods achieve this through different techniques. limitations, and trade-offs associated with each strategy. The last two decades have seen an increase in HCI toolkit Author Keywords papers [66]. These papers typically employ a range of evalu- Toolkits; user interfaces; prototyping; design; evaluation. ation methods, often borrowing and combining techniques ACM Classification Keywords from software engineering, design, and usability evaluation. H.5.2. Information interfaces and presentation (e.g. HCI): From this corpus, we can consider how toolkit researchers User Interfaces – Evaluation/methodology. collectively derive what evaluation methods are useful, when they are appropriate and how they are performed. INTRODUCTION Within HCI, Greenberg [30] defined toolkits as a way to en- Based on an analysis of 68 representative toolkit papers, this capsulate interface design concepts for programmers, includ- paper contributes an overview and in-depth discussion of ing widget sets, interface builders, and development environ- evaluation methods for toolkits in HCI research. We identify ments. Such toolkits are used by designers and developers to four types of evaluation strategies: (1) demonstration, (2) us- create interactive applications. Thus, they are generative age, (3) technical benchmarks, and (4) heuristics. We pre- platforms designed to create new artifacts, while simplifying sent these four evaluation types, and opine on the value and the authoring process and enabling creative exploration. limitations associated with each strategy. Our synthesis is based on the sample of representative toolkit papers. We link While toolkits in HCI research are widespread, researchers interpretations to both our own experiences and earlier work experience toolkit papers as being hard to publish [77] for by other toolkit researchers. Researchers can use this synthe- various reasons. For example, toolkits are sometimes consid- sis of methods to consider and select appropriate evaluation ered as merely engineering, as opposed to research, when in techniques for their toolkit research. reality some interactive systems are ‘sketches’ using code as WHAT IS A TOOLKIT? Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are Within HCI literature, the term ‘toolkit’ is widely used to de- not made or distributed for profit or commercial advantage and that copies scribe various types of software, hardware, design and con- bear this notice and the full citation on the first page. Copyrights for com- ceptual frameworks. Toolkit research falls into a category of ponents of this work owned by others than the author(s) must be honored. constructive research, which Oulasvirta and Hornbæk define Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission as “producing understanding about the construction of an in- and/or a fee. Request permissions from [email protected] teractive artefact for some purpose in human use of computing” [83]. They specify that constructive research is driven CHI 2018, April 21–26, 2018, Montreal, QC, Canada © 2018 Copyright is by the absence of a (full) known solution or resources to im- held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-5620-6/18/04 $15.00 plement and deploy that solution. As constructive research, https://doi.org/10.1145/3173574.3173610 . Table 1. Overview of all toolkits in the sample. Types: (1) Demonstration, (2) Usage, (3) Technical Performance and (4) Heuristics. toolkits examine new conceptual, design or technical solu- [82] discusses how interface builders opened interface de- tions to unsolved problems. To clarify our review’s scope, sign to artists and designers. we define and summarize what is meant by “toolkit” and G4. Integrating with Current Practices and Infrastruc- “toolkit evaluation”, and why HCI researchers build toolkits. tures. Toolkits can align their ideas to existing infrastructure Defining a Toolkit and standards, enabling power in combination [82] and high- We extend Greenberg’s original definition [30] to define lighting the value of infrastructure research for HCI [25]. For toolkits as generative platforms designed to create new in- example, D3 [14] integrated with popular existing standards, teractive artifacts, provide easy access to complex algo- which arguably contributed significantly to its uptake. rithms, enable fast prototyping of software and hardware in- G5. Enabling Replication and Creative Exploration. terfaces, and/or enable creative exploration of design Toolkits allow for replication of ideas that explore a concept spaces. Hence, toolkits present users with a programming or [30], which collectively can create a new suite of tools that configuration environment consisting of many defined per- work together to enable scale and create “larger and more mutable building blocks, structures, or primitives, with a se- powerful solutions than ever before” [82]. quencing of logical or design flow affording a path of least resistance. Toolkits may include automation (e.g. recogniz- Evaluating Toolkits A common concern among HCI toolkit and system research- ing and saving gestures [67]) or monitoring real-time data ers is the difficulty in publishing [77]. This might be due to (e.g. visualization tools [52,64]) to provide developers with the expectations and prevalence of evaluation methods (e.g. information about their own process and results. user studies), regardless of whether the methods are neces- Why Do HCI Researchers Build Toolkits? sary or appropriate to the toolkit’s contribution. Part of the Before discussing toolkit evaluation, we elaborate on what problem is a lack of clear methods [77] or a clear definition they contribute to HCI research. Wobbrock and Kientz posi- of ‘evaluation’ within a toolkit context. As toolkit designers, tion toolkits as artifact contributions, where “new our stance is that the evaluation of a toolkit must stem from knowledge is embedded in and manifested by artifacts and the claims of the paper. Evaluation is a means to follow the supporting materials that describe them” [111]. We sum- through with the proposed claims of the innovation. We marize discussions by Myers et. al [73], Olsen [82] and should ask ourselves: what do we get out of the evaluation? Greenberg [30] on the value of HCI toolkits into five goals: Toolkits are typically different from systems that perform G1. Reducing Authoring Time and Complexity. Toolkits one task (e.g. a system, algorithm, or an interaction tech- make it easier for users to author new interactive systems by nique) as they provide generative, open-ended authoring encapsulating concepts to simplify expertise [30,82]. within a solution space. Toolkit users can create different so- G2. Creating Paths of Least Resistance. Toolkits define lutions by reusing, combining and adapting the building rules or pathways for users to create new solutions, leading blocks provided by the toolkit. Consequently, the trade-off them to right solutions and away from wrong ones [73]. to such generative

Load more