Visualizing Uncertainty in Drug Checking Test Result Reports During the Opioid Crisis: A Design Study

by

Jorin Diening Weatherston B.Seng, University of Victoria, 2017

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Computer Science

in the Department of Computer Science

© Jorin Diening Weatherston, 2020, University of Victoria

All rights reserved. This thesis may not be reproduced in whole or in part, by photocopying or other means, without the permission of the author.

Visualizing Uncertainty in Drug Checking Test Result Reports During the Opioid Crisis: A Design Study

by

Jorin Diening Weatherston B.Seng, University of Victoria, 2017

Supervisory Committee

Dr. Margaret-Anne Storey, Supervisor (Department of Computer Science)

Dr. Charles Perin, Committee Member (Department of Computer Science)

Dr. Dennis Hore, Committee Member (Department of Computer Science)

ABSTRACT

Potent opioids (such as fentanyl) are entering recreational drug manufacturing processes, sometimes without the knowledge of people who use drugs. This is contributing to tens of thousands of accidental overdose deaths each year. Recreational drug checking services during the opioid crisis face unique challenges in delivering test results to people who use drugs. These challenges are caused by uncertainties in drug composition, the chemical analysis processes used, and the complex contextual considerations of drug checking services themselves. In this thesis I describe a design study, conducted in collaboration with a local drug checking service, that explores visualizing uncertainty in drug checking test result reports. This research generates a number of research contributions. I identify the new and impactful application domain of visualizing uncertain drug checking test results. Within this application domain I conducted a design study to generate a test result report that suits the problem context and accomplishes the design goals described by the drug checking service stakeholders. This design study generates reflective considerations on conducting design studies in this context, intermediate design artifacts, and finally a test result report software application. The design study also led to the identification of a new uncertainty visualization design space for proportional charts, which I apply in generating some of the intermediate design artifacts. I position these research contributions within the drug checking and uncertainty visualization research fields, and describe planned future work in the hope that future research will positively impact the application domain.

Contents

Supervisory Committee

Abstract

Table of Contents

List of Tables

List of Figures

Acknowledgements

Dedication

1 Introduction
   1.1 Motivation
   1.2 Contributions
   1.3 Thesis Layout
      1.3.1 Part One
      1.3.2 Part Two
      1.3.3 Part Three

2 Methodology
   2.1 The Design Science Paradigm
   2.2 Information Location and Task Clarity in Design Studies
   2.3 The Relevance, Rigour, and Visual Cycles
   2.4 Presenting Contributions using Technological Rules, Novelty and a Visual Abstract
   2.5 Design Study Structure
   2.6 Methodologies used Within Design Iterations
      2.6.1 Five Design Sheet Methodology
      2.6.2 Design Space Exploration Process

3 Drug Checking Service Context
   3.1 Stakeholder Types
      3.1.1 Clients
      3.1.2 Workers
      3.1.3 Chemical Analysts
   3.2 Drug Testing Systems and Test Result Formats
      3.2.1 Component and Percent Composition Test Results
   3.3 Uncertainty in the Drug Checking Service
   3.4 Existing Test Result Delivery Methods
      3.4.1 Literature Concerning the Communication of Drug Checking Test Results

4 Requirements Analysis
   4.1 Requirements and Acceptance Criteria Gathering Processes
      4.1.1 Semi-Structured Interviews
      4.1.2 Interview Protocol
      4.1.3 Design Feedback Meetings
      4.1.4 Design Feedback Survey
   4.2 Requirements and Acceptance Criteria

5 Design Goal One: Visualizing Percent Composition and Component Composition
   5.1 Improving Drug Checking Test Results Delivery Using Charts
   5.2 Selecting Appropriate Charts
      5.2.1 Percent Composition Chart
      5.2.2 Component Composition Chart

6 Design Goal Two: Visualizing Uncertainty in Percent Composition
   6.1 Empowering Clients with Uncertainty and Confidence Data in Test Results
   6.2 Uncertainty in Percent Composition
      6.2.1 Characterizing Uncertainty in Percent Composition Test Results
      6.2.2 Uncertainty Visualizations for the Public
      6.2.3 Design Guidance for Visualizing Uncertainty
   6.3 Unquantified Uncertainty Design Space
      6.3.1 Preliminaries
      6.3.2 Step 1 - Breakdown
      6.3.3 Step 2 - Dimensions
      6.3.4 Step 3 - Systematic Exploration
      6.3.5 Step 4 - Application

7 Design Goal Three: Visualizing Confidence in Component Composition
   7.1 Confidence in Component Composition
      7.1.1 Characterizing Confidence in Component Composition Test Results
      7.1.2 Design Guidance for Visualizing Confidence in Component Composition
      7.1.3 Generating Confidence Indicator Design Alternatives

8 Additional Report Design Goals
   8.1 Design Goal Four: Digital and Handout Reports Must be the Same
   8.2 Design Goal Five: The Visual Report Must Present Basic Drug Checking Service Information
   8.3 Design Goal Six: The Visual Report Must Present Descriptors of the Drug Sample
   8.4 Design Goal Seven: The Visual Report Must Highlight Fentanyl In the Test Results
   8.5 Design Goal Eight: Chemical Analysts Must be Able to Interpret the Test Results
   8.6 Design Goal Nine: The Visual Report Must Explicitly Disclaim Itself
   8.7 Final Report Design

9 Implementation
   9.1 Artifact Deployment Iteration: Make Stage
   9.2 Artifact Deployment Iteration: Deploy Stage
   9.3 Using the Application

10 Discussion
   10.1 Contribution 1: Design Study
   10.2 Contribution 2: Design Space
      10.2.1 Evaluating the Unquantified Uncertainty Design Space
      10.2.2 Design Space Consistency
      10.2.3 Design Space Completeness
      10.2.4 Using the Unquantified Uncertainty Design Space as a Visualization Researcher
      10.2.5 Understanding Designs Produced by the Unquantified Uncertainty Design Space as an End User
   10.3 Contribution 3: Visual Report
      10.3.1 Dominant Design Trade-offs
      10.3.2 Transferability
      10.3.3 Limitations

11 Future Work & Conclusions
   11.1 Future Work
      11.1.1 Evaluation of the Drug Checking Test Result Digital Report and Handout Report
      11.1.2 Unquantified Uncertainty in Proportional Charts Design Space
   11.2 Conclusions

Bibliography

A Drug Checking Service Flow

B Design Feedback Survey

List of Tables

Table 2.1 Design iterations and deployment iteration with primary activities.

Table 3.1 Chemical analysis methods and type of drug checking test results produced.
Table 3.2 A depiction of components and their representative hit-scores. Hit-scores go from low to high confidence. Ratios of 1 represent a complete identification. The SERS scale is qualitative and based on an unexposed internal set of thresholds; chemical analysts only get to see the colours.
Table 3.3 An example of an FTIR test output: a list of components and the percent of the sample's composition each comprises.
Table 3.4 In-person communication of results directly to clients across 31 services. Note that some services deliver results in multiple ways. [4]

Table 4.1 This table presents the connections between my stakeholders, information gathering processes, design constraints, and design goals. The Requirements and Acceptance Criteria column includes short descriptions of design guidance gathered at a high, medium, and low level of abstraction. The Design Goals column indicates which design goals satisfy the requirements and acceptance criteria. The Source column indicates where requirements and acceptance criteria were collected from. The Process column indicates which technique was used to collect the requirements and acceptance criteria. I describe the codes below the data.

Table 5.1 A prioritization of common visualization tasks to select charts for each data type [45]. The original set of tasks is in the left column, rank numbers for my orderings are in the middle-left column, task ordering for percent composition is in the middle-right column, and task ordering for component composition is in the right column.

List of Figures

Figure 2.1 Sedlmair et al.'s [47] diagram depicting information location and task clarity. Design projects can be placed within these axes to understand whether or not a project could be conducted using a design study.
Figure 2.2 The complete research timeline composed of design and deployment iterations, and five design activity stage types.

Figure 3.1 A diagram of service flow within Substance with stakeholders and stages.
Figure 3.2 A row from the EcstasyData.org database. Accessed: 01/06/2019.
Figure 3.3 A sample test result report from the EcstasyData.org database. Accessed: 01/06/2019.

Figure 4.1 The complete research timeline with stakeholder feedback and requirements analysis processes highlighted.

Figure 5.1 The percent composition pie vs cake charts (left), and the component composition table chart (right).

Figure 6.1 The percent composition pie and cake charts with 100% axis.
Figure 6.2 Decomposition of the pie and cake chart into visual marks.
Figure 6.3 Examples of low, medium and high manipulations to individual visual variables of individual visual marks.
Figure 6.4 An example of combining manipulations to individual visual variables to create new, compound manipulations.
Figure 6.5 Applying zig-zag line and dotted line modifications to the pie and cake charts to generate design alternatives.

Figure 7.1 The component composition chart, with confidence data cells indicated in red and legend indicated in green.
Figure 7.2 Component composition confidence indication using a linear scale which balances between green confidence and red uncertainty.
Figure 7.3 Component composition confidence indication using a multi-state iconographic black and white icon scale.

Figure 8.1 An early visual report design with colour that eventually became the black and white final report design.
Figure 8.2 Design elements that display the service information in the report.
Figure 8.3 Photographic, iconographic, and textual design iterations for displaying sample identification.
Figure 8.4 Highlighting fentanyl results throughout the visual report. Red highlights the locations in the report where the indicator was placed.
Figure 8.5 Designs for presenting qualitative interpretations of the test results as a whole.
Figure 8.6 An example of the report disclaimer. The content will surely change as the service evolves.
Figure 8.7 Final report design, page 1.
Figure 8.8 Final report design, page 2.

Figure 9.1 The complete research timeline with the deployment iteration highlighted.
Figure 9.2 The software application visual report with sections containing drug identifiers, fentanyl test results, component composition and percent composition charts. This is a work in progress.
Figure 9.3 The software application visual report with sections containing qualitative interpretations and service information. Hours and locations are still to be added. This is a work in progress.
Figure 9.4 The visual report software when the chemical analyst is searching for a clientID within the database.
Figure 9.5 The visual report software with preloaded data; the areas the chemical analyst must enter data into in order to finalize the report are highlighted in red.
Figure 9.6 The visual report with the three fields that must be manually filled out partly complete. Icons indicate how these fields are to be filled out.

Figure 9.7 The visual report software with a finalized report ready to be saved to the database and printed onto paper for use in the harm reduction conversation.

Figure 10.1 This visual representation of the design study captures the relevance, rigour, and design cycles from Hevner [18], and presents the outcome as a technological rule from van Aken [52]. This presentation style was inspired by Storey et al.'s visual abstract [51].
Figure 10.2 An example of the desired overlap between design space dimensions. Note how changes to the nature of the boundary edge mark's width cause changes in the area mark's area. This overlap is positive in this context, as the boundary edge angle and the segment area both become ambiguous. The more the boundary edge mark's width increases, the more the area is obscured.

Figure A.1 A diagram depicting the stakeholders and processes within the Substance drug checking service.

Figures B.1–B.39 The design feedback survey.

ACKNOWLEDGEMENTS

I would like to thank:

All of the people who have supported this research and this writing process. Friends, family, and colleagues all helped me navigate this transformative learning process successfully, and you have each had an immeasurably positive impact on me and this work. Of particular importance are my supervisor Dr. Margaret-Anne Storey, committee members Dr. Charles Perin and Dr. Dennis Hore, and Cassandra Petrachenko. You led me, personally, professionally, and scientifically throughout this academic chapter, and helped bring this work to fruition. Thank you for your incredibly important contributions to all aspects of this work and this chapter in my life. To my friend Dr. Eirini Kalliamvakou, I would like to say thank you for reviewing and assisting me in all aspects of creating this thesis, but also for being so generous with your time and thoughtfulness in our conversations. To the members of the Computer Human Interaction and Software Engineering (CHISEL) Lab, you are an amazing team and a wonderful group of people to spend three years with. I care about each of you, and look forward to connecting wherever we are around the world.

Of critical importance to my success in this endeavour were the Bell ladies. Pamela, Jessica, and Cheryl, and especially my beautiful girlfriend Vanessa, you all supported me as I climbed the heights in the immediate ways that someone who is working hard cannot seem to remember to do. Thank you for caring for me so well and for being the buoy to my anchor; you are the best kind of people and deserve all the love in the world.

To Cam, Bet and Connor (collectively the Weatherstons) I give you credit for raising me to push myself, having numerous characteristics I aspire to embody, and for loving and supporting me at every step throughout my entire life. You are my counsel, my admired friends, and my cherished family. I selfishly wish for your long and happy lives, and for more amazing adventures together from here on out.

This was impossible to do alone, and amazing to do together. To all my friends and family, my colleagues and collaborators, thank you deeply and sincerely.

Jorin Diening Weatherston

DEDICATION

For everyone impacted by the opioid crisis.

Chapter 1

Introduction

Recreational drugs (drugs) are not subject to manufacturing monitoring or regulated labelling. People who use drugs depend on the honesty and knowledge of drug dealers for access to a safe drug supply. Drug overdoses causing injury and death can and do occur. For example, opioids, and particularly fentanyl and its potent chemical analogues, were responsible for more than 10,300 deaths in Canada between January 2016 and September 2018¹. The same data set indicates that between January 2018 and September 2018 there were 3,286 opioid overdose deaths, of which 93% were accidental¹. Fentanyl is 100 times stronger than morphine², and carfentanil is 10,000 times stronger than morphine³. Drug checking services are finding these opioids and their chemical analogues as adulterants throughout the illicit drug market. For example, marijuana, meth, and even counterfeit prescription drugs have been found to contain fentanyl [14]. The unmonitored spread of powerful opioids throughout the drug supply is known in 2020 as the "global opioid crisis".

1.1 Motivation

Drug checking services are one harm reduction response to the global opioid crisis. I collaborated with one such drug checking service, called Substance⁴, to conduct the research within this thesis. Substance is a drug checking service run by a team of

¹ https://infobase.phac-aspc.gc.ca/datalab/national-surveillance-opioid-mortality.html
² https://www.cdc.gov/drugoverdose/opioids/fentanyl.html
³ https://pubchem.ncbi.nlm.nih.gov/compound/carfentanil
⁴ substance.uvic.ca

chemists and social workers in collaboration with partner sites offering social services in Victoria, BC. They offer a free, onsite drug checking service to any member of the public who wishes to know more about their drug's contents.

Substance faces unique challenges and conflicting constraints in delivering its services. For example, the service is delivered onsite at multiple sites to serve the communities struck hardest by the opioid crisis. However, the service's movement between partner sites means drug checking systems must be mobile, which limits the error reporting features the mobile chemical analysis machines possess and the error reducing processes service staff can conduct. Larger machines intended for stationary use can employ hardware-dependent measurement checks and even conduct cross-analyses to corroborate findings within a single system. Additionally, the service must return test results within 30 minutes. This short timeframe further reduces the possibility of performing accurate error measurements on each drug sample. This example is just one trade-off faced by this mobile onsite drug checking service.

Substance utilizes five chemical analysis technologies to test drug samples. Each technology introduces uncertainty into test results in the form of measurement and procedural error. Each technology has different levels of sensitivity and accuracy. Each technology has a different chemical analysis process. Each technology has fallible human beings operating it. As a result, instruments do not always agree with one another on the components they see.

Disagreements between the outputs of different technologies are frequent. Some machines are entirely blind to components that others can see easily. Quantifying this complex error to the point of complete safety within short timeframes is not something that an onsite drug checking service can do.
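As an illustration of why multi-instrument results are hard to summarize, the following sketch merges component detections from several instruments and flags the components the instruments disagree on. The instrument names, component names, and agreement rule here are hypothetical, invented for illustration; they are not Substance's actual pipeline.

```python
# Illustrative sketch: merging component detections from multiple
# hypothetical instruments. Instruments, components, and the
# disagreement rule are invented, not the actual Substance pipeline.

def merge_detections(runs: dict) -> dict:
    """For each component seen by any instrument, record which
    instruments detected it and whether the instruments disagree."""
    all_components = set().union(*runs.values())
    merged = {}
    for component in sorted(all_components):
        seen_by = sorted(name for name, found in runs.items()
                         if component in found)
        merged[component] = {
            "seen_by": seen_by,
            # Disputed: at least one instrument saw it, at least one did not.
            "disputed": 0 < len(seen_by) < len(runs),
        }
    return merged

runs = {
    "FTIR": {"caffeine", "fentanyl"},
    "SERS": {"fentanyl"},
    "test-strip": {"fentanyl"},
}
merged = merge_detections(runs)
print(merged["caffeine"]["disputed"])   # only FTIR saw caffeine → True
print(merged["fentanyl"]["disputed"])   # all three agree → False
```

Even this toy rule shows that "disagreement" is a property of the whole set of runs, not of any single instrument's output, which is part of what makes a faithful report design non-trivial.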
As a result, Substance can never guarantee the safety of a drug sample or provide a safe dosage for a drug sample. The decision to consume a drug must always lie with the people who use drugs, also called clients of the service in this thesis. Therefore, for clients to make informed drug use decisions, it is necessary to provide them with actionable test results. Providing actionable test results involves describing the chemical analysis processes that generate test results. It also involves describing the shortcomings of the drug checking process and relevant forms of uncertainty in test results. A tool that captures all the necessary test result data and standardizes the delivery of drug checking test results is therefore necessary. At Substance, harm reduction workers present test results during a harm reduction conversation. During this conversation, clients ask questions about drug use, drug

content, and receive harm reduction resources. This harm reduction conversation is also where clients learn about uncertainty in the test results.

Ensuring clients understand test results can be challenging. Clients sometimes come into the service in altered mental states [31], sometimes carrying multiple samples, and some on behalf of others to whom they deliver results later. Substance staff have indicated that, due to these complicated circumstances, clients struggle to keep harm reduction conversation outcomes matched with test results, and test results matched with physical drug samples.

Substance described situations where misunderstandings of test results could cause harm. A client who receives a correct test result of 91% confidence that their drug sample contains caffeine could go on to confront and change their drug dealer. The client could do this because they misread the result as the drug sample consisting of 91% caffeine, indicating a low-quality purchase. A misunderstanding like this could have a negative impact on the client.

Thus, the Substance team has called for the creation of a visual drug checking report, which aims to solve some of these problems. This report would ideally present test results visually, highlight uncertainty and dangerous components, and facilitate safer drug use decisions and better harm reduction conversations. Including uncertainty in test results is likely to make the data and visualizations more complex, and visualization researchers face non-trivial barriers to including uncertainty within representations [20]. However, Roth et al. [44] indicate that it is the visualization researcher's responsibility to reveal uncertainty to end users of visualizations so they can make informed decisions, even when the end users are the general public.
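The caffeine misunderstanding described above is, at root, a data-modelling issue: identification confidence and percent composition are different quantities and must be kept in separate fields. A minimal sketch of this distinction follows; the class and field names are my own invention, not Substance's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ComponentResult:
    """One row of a hypothetical test result report (names invented).

    `confidence` is how sure the analysis is that the component is
    present at all; `percent_composition` is how much of the sample
    the component makes up. Conflating the two is exactly the
    misunderstanding in the 91% caffeine example.
    """
    component: str
    confidence: float                     # 0.0-1.0 identification certainty
    percent_composition: Optional[float]  # % of sample, None if not measured

    def summary(self) -> str:
        pct = ("unknown share of the sample"
               if self.percent_composition is None
               else f"{self.percent_composition:.0f}% of the sample")
        return (f"{self.component}: {self.confidence:.0%} confident it is "
                f"present, {pct}")

# The caffeine scenario: high confidence of presence, amount unknown.
row = ComponentResult("caffeine", confidence=0.91, percent_composition=None)
print(row.summary())
# → caffeine: 91% confident it is present, unknown share of the sample
```

Keeping the two quantities in distinct fields lets a report render them with distinct visual encodings, which is one way a visual report could head off the misreading.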
Recent work by Correll et al. [11] indicates that, as visualization researchers, we have an ethical responsibility to reveal hidden uncertainty in our visualizations. I joined Substance as a visualization researcher to create a visual test result report (visual report) that captures test result data, its uncertainty, and qualitative interpretations. The rest of this thesis describes the design study I conducted to create such a visual report for the onsite Substance drug checking service during the opioid crisis.

1.2 Contributions

The following are the primary contributions made by this research.

Contribution 1: Design Study. My primary contribution is the first design study in the unexplored and important drug checking visualization research domain. This design study allows me to characterize the problem space, identify critical design goals, and gather and analyze stakeholder feedback. Through the design study, I generate a final design, and I implement and deploy a solution to the problem context. I situate the design study as interdisciplinary research relying on and contributing to both the drug checking literature and the visualization design literature.

Contribution 2: Design Space. A new design space comprises my second contribution. It is a design space for visualizing uncertainty in percent composition test results. I apply this design space to generate design alternatives within the drug checking visualization problem domain. I analyze the design space in terms of consistency and completeness and discuss its effectiveness.

Contribution 3: Visual Report. My third contribution is the visual report itself. This contribution primarily benefits the drug checking research domain. The visual report design and the visual report artifact are both included in this contribution. I discuss the transferability and generalizability of the visual report within the drug checking domain, and also discuss the effectiveness of the visual report once deployed within a real drug checking context.

1.3 Thesis Layout

This thesis is structured as follows.

1.3.1 Part One

Chapter 2 Methodology; introduces the design science paradigm, outlines the design study structure I adopted, and the additional methodologies I used within design processes. This chapter also introduces the design science concepts necessary to contextualize the rest of the thesis.

Chapter 3 Drug Checking Service Context; describes the contextual information needed to perform a design study in this visualization application domain. This chapter includes descriptions of stakeholders, the application domain, and the problem characterization. It also describes drug checking test results and forms of uncertainty and confidence.

Chapter 4 Requirements Analysis; describes stakeholder input and the feedback gathering processes I used. I present nine design goals distilled from the requirements and acceptance criteria gathered from stakeholder feedback.

1.3.2 Part Two

Chapter 5 Design Goal One: Visualizing Percent Composition and Component Composition; describes how I satisfied design goal one: visualizing percent composition and component composition with charts. I select baseline charts and design them to present the percent composition and component composition data formats.

Chapter 6 Design Goal Two: Visualizing Uncertainty in Percent Composition; describes how I satisfied design goal two: visually representing uncertainty in percent composition test results. This includes the identification of a new design space I used to generate design alternatives which present uncertainty in percent compositions.

Chapter 7 Design Goal Three: Visualizing Confidence in Component Composition; describes how I satisfied design goal three: representing confidence in component composition test results. This includes the creation of an iconographic scale to indicate confidence in component composition.

Chapter 8 Additional Report Design Goals; describes how the visual report satisfied the remaining six design goals. I describe how these design goals are satisfied, and I explain the transitions between design alternatives using stakeholder feedback and requirements.

Chapter 9 Implementation; lists the technologies used to implement the report and describes the choice of each technology. I describe how the report visualization system fits into the drug checking context.

1.3.3 Part Three

Chapter 10 Discussion; discusses the resulting report in terms of meeting the design goals, and the primary trade-offs present in the final design. Also included is a qualitative analysis I conducted of the newly identified design space, described in terms of consistency and completeness. I also discuss the generalizability and transferability of the contributions within and outside the drug checking research field.

Chapter 11 Future Work and Conclusion; describes the plans for conducting a user study given the unique restrictions on stakeholder access. I also describe areas of research that may benefit from this work, and conclude the thesis.

Chapter 2

Methodology

In this chapter I present the overarching design science research paradigm this research falls within. I also describe in detail the design study methodology that I used to structure the research as well as additional methodologies that I used within the design study.

2.1 The Design Science Paradigm

Design science is a paradigm for generating knowledge about designing "a better world" [52, p.4]. Additions to the body of design science knowledge come in the form of design study contributions. Each design study's goal is to solve a real design problem with one or more specific design solutions and to generate design knowledge. These design problems and solutions must be characterized and validated using accepted design science techniques. With the quality of the pairing of design solutions with design problems verified, it is then possible to gain insight into solving a class of design problems. According to van Aken, the general design insights in combination with the problem-solution pairing form a design science knowledge contribution [52].

The quality of design science contributions is dependent on the quality of the resulting solution design, the design methodologies used, and the evaluations performed. van Aken [52] notes that the relevance of a solution to its problem is directly related to the quality of the designer and of the design process inputs. The rigour of design science research is similarly proportional to the quality of the evaluation of the solution in its context. Design science researchers can enhance the rigour and relevance of their research through triangulation between information sources, controlled observations,

and cross-case analyses [52].

Design science research can be complicated because it typically solves design problems within socio-technical contexts. Socio-technical contexts are arrangements of information technology, people, and related processes [36]. van Aken [52] describes how solving design problems within socio-technical contexts is challenging, partly due to the uniqueness and unpredictability of human agency. Additionally, designing within socio-technical systems inherently involves political and ethical design considerations. Human agency, politics, and ethics are all factors that impact design outcomes during a design study.

2.2 Information Location and Task Clarity in Design Studies

This research is a design study focused on designing a visual report for the Substance drug checking service during the opioid crisis. In Sedlmair et al.’s[47] touchstone research on conducting design science within the domain of information visualization, the authors indicate that the primary objective of design studies is to solve open-ended visual design problems. Sedlmair et al., state that design studies are problem-driven research projects wherein design researchers create solutions for real people and real problems with creatively derived design solutions. According to Sedlmair et al., the design study process is flexible and iterative to identify design objectives, generate design alternatives, and improve problem-solution fit. This flexibility enables design studies to collect emergent requirements and generate designs that suit contextual nuances. Sedlmair et al.[47] describe in their work the type of visualization research which design studies are best suited for by defining two descriptive axes as shown in Figure 2.1. The first axis is an information location axis with a range from inside the expert’s head to inside the computer. The second axis is a task clarity axis with a range from wholly described to completely undefined. Research projects fall into this two- dimensional problem space based on the location of information and whether critical tasks are fuzzy or crisp in their definition. In this design study I have access to test results within a database, however, my stakeholders within the socio-technical context posses critical knowledge about drug checking test results. This locates information within the desireable middle area of 8

Figure 2.1: Sedlmair et al.'s [47] diagram depicting information location and task clarity. Design projects can be placed within these axes to understand whether or not a project could be conducted using a design study.

the information location axis. As for task clarity, I have relatively clear descriptions of the tasks that solutions generated by the design study are expected to support. However, as in most socio-technical settings, stakeholders must adapt to emergent behaviour and situations to ensure client safety while clients make drug use decisions. This research is thus within the desirable middle area of the task clarity axis.

2.3 The Relevance, Rigour, and Visual Cycles

I also rely upon Hevner's [18] three design science cycles to describe how the actions taken during the design science research relate to accepted processes within the design science literature. Hevner describes three cycles that connect three contexts to describe the design science paradigm. The three cycles are relevance, design, and rigour, and the three contexts are the problem environment, the design science research process, and the design science knowledge base. The relevance cycle connects the environment containing the problem to be solved with the design science research process. The relevance cycle does this in two ways: first, through the elicitation of requirements to guide design and acceptance criteria to guide evaluation; and second, through field testing of design alternatives in the problem context using acceptance criteria. In turn, this cycle generates more requirements and acceptance criteria, and the cycle repeats. The design cycle connects the problem context requirements and grounding design science literature as the necessary inputs to creatively generate designs. It also connects the problem context acceptance criteria and evaluation processes from the literature as the inputs to evaluate the generated designs. The rigour cycle connects the design science research process with the existing design science knowledge base in the literature by grounding design and evaluation in literature, and by contributing new knowledge back to the design science knowledge base. This thesis refers to these three design cycles to connect research actions to design science processes. Hevner's rigour, relevance, and design cycles capture the connections between knowledge, design, and context as processes performed during design science. This literature contextualizes the design science research presented within this thesis and also helped structure my iterative research process.

2.4 Presenting Contributions using Technological Rules, Novelty and a Visual Abstract

In addition to Hevner's three cycle design science paradigm, van Aken's concept of technological rules [52] and Storey et al.'s design science visual abstract [51] help guide the presentation of the contributions of this design study.

Van Aken's concept of technological rules captures general knowledge about the mappings between design problems and proposed design solutions. Technological rules capture how proposed interventions may create desired effects in a given context. Technological rules describe a specific solution for a specific problem, but they also expose more general classes of problem-solution pairings. These general classes of pairings enable the transfer of design knowledge from one specific context into another to solve a similar problem. The visual report contribution is a potential example of an intervention and is discussed later within the thesis. Storey et al. [51] bring Hevner's and van Aken's work together through the creation of a visual abstract for presenting design science. Storey et al. combine Hevner's three cycles and van Aken's technological rules concepts to create a visual abstract which assists design scientists in presenting and reflecting on design science contributions. The authors additionally signal that novelty is a critical aspect of contributions to design science. Storey et al.'s visual abstract concept has places to capture all of these aspects of design science contributions, and I rely on the visual abstract later in this thesis to present the contributions of this research effectively. In this section I integrate design efforts with the rigour, relevance, and design cycles described by Hevner [18]. I explain the design study processes used to create design proposals for sections of the visual report. Additional methodologies which I used as part of my design study include the Five Design Sheet methodology described by Roberts et al. [43] in combination with a design space exploration process outlined by Schulz et al. [46]. These were useful when generating designs and employing user feedback gathering processes to evaluate the effectiveness of design candidates with stakeholders.
By adopting formal procedures into an iterative design process, I strengthen the connection to the rigour, relevance and design cycles.

2.5 Design Study Structure

It was vital to follow a structured process while exploring design ideas during design cycles, and I describe these processes here. The first process I used is requirements analysis, where I take the knowledge I gain about my problem space and consolidate it with the existing requirements and problem knowledge to generate design goals. The second process is visual design, where I use design goals and bodies of

related literature as inputs to a visual design process. The third process is stakeholder feedback, where I take newly generated design artifacts and assess their effectiveness using design feedback processes. Together, these three processes form a design iteration. Emergent requirements gathering and visual-design-fit-to-context are strong motivations to choose the design study methodology as described by design study researchers [47, 55]. By using this methodology, I bring in stakeholder perspectives and responses to design alternatives early and throughout the design process. I performed multiple design iterations to generate and refine designs without putting vulnerable stakeholder groups at risk. This risk to stakeholders originates from the combination of two design considerations: the inherently unrefined concepts that arise early in design study research, and the safety-critical nature of the information and decision-making surrounding drug checking test results. I performed three design iterations before satisfying the acceptance criteria of my stakeholder groups. The stakeholder groups include people who use the service (clients), chemists performing drug checking (chemical analysts), and social workers providing harm reduction resources (harm-reduction workers). After the three design iterations were complete, I began implementing and deploying the design as a usable artifact during a deployment iteration. During the deployment iteration I first polished the design after gathering final design feedback. I then moved into a making stage and a deployment stage. The making stage involved choosing technologies and implementing the drug checking report as a software artifact. The deployment stage involved packaging the report and installing it onto the laptops within the service for use.
Following this process led to significant changes to the report's features in intermediate design iterations, to the acceptance criteria and requirements, and to the methodologies used within design iterations. It has not yet been possible for me to perform a direct user study with people who use drugs due to the anonymity and privacy considerations of the partner sites. Instead, I have planned a user study for future work, which collects feedback indirectly from other groups of stakeholders. A full timeline of the research process is shown in Figure 2.2. This graphical research timeline presents the temporal nature of the research while avoiding an exhaustive literal description. To complement the written description, the reader can get a sense of the design activities executed within the design iteration

Figure 2.2: The complete research timeline composed of design and deployment iterations, and five design activity stage types.

stages in Table 2.1. Table 2.1 shows the three design iterations broken down into the three stages within design iterations, as well as the deployment iteration and its make and deploy stages. Inside each cell I briefly describe the activities that I performed during that stage. Having described the design study methodology that forms the overarching structure of the research within this thesis, I next describe two additional methodologies used within the three design iterations to generate design alternatives.

2.6 Methodologies used Within Design Iterations

I used the Five Design Sheet methodology described by Roberts et al. [43] and a design space exploration process described by Schulz et al. [46] within my design iterations. These formalized design processes help bring knowledge from the design science literature into the process of designing the report intervention.

2.6.1 Five Design Sheet Methodology

Roberts et al. [43] conceptualize a hand-drawn process of divergent exploration and convergent solution-finding for visual designs using a set of five sheets. They note that unstructured hand-drawn brainstorming efforts help generate early prototype designs. They rethink that messy process as a structured, prescriptive process encapsulated and supported by five specially designed sheets. Roberts et al.'s premise is that by constraining the early exploration of hand-drawn visual alternatives within a well thought out process, there should be fewer forgotten good ideas, better feedback gathering, and arrival at better solutions sooner.

Design Iteration #1 (January - February 2019)
  Requirements Analysis Stage: Introductory meetings and gathering high-level goals. Semi-structured interviews with stakeholders to collect initial stakeholder requirements.
  Visual Design Stage: Visualization literature search for existing off-the-shelf report builder solutions. Report builder visual design project.
  Stakeholder Feedback Stage: Stakeholder feedback meeting on report builder first concept.

Design Iteration #2 (March - May 2019)
  Requirements Analysis Stage: Analysis of feedback from stakeholder feedback meeting. Consolidated feedback with existing requirements to generate an updated list of requirements. Confirmatory meetings with stakeholders to verify requirements and design direction. Drug checking literature search to find examples of test results data-oriented reports.
  Visual Design Stage: Visualization design literature searches to support design choices. Data-oriented report visual design process and design space identification.
  Stakeholder Feedback Stage: Design feedback survey design and creation. Deployment of design feedback survey to stakeholders. Please reference Appendix B for the complete survey tool.

Design Iteration #3 (June - July 2019)
  Requirements Analysis Stage: Analysis of design feedback survey results. Consolidated feedback with existing requirements to generate an updated list of requirements. Identification of unresolved and/or remaining design decisions.
  Visual Design Stage: Visualization design literature searches to assist in final design decisions. A final set of alternative report designs generated.
  Stakeholder Feedback Stage: Stakeholder feedback meetings to resolve the final design choices. Final design complete.

Deployment Iteration (August - December 2019)
  Make Stage: Selection of technologies to implement a drug checking report. Implementation of design in those technologies. Testing data formatting and data access.
  Deploy Stage: Installation of application onto drug checking service machines completed.

Table 2.1: Design iterations and deployment iteration with primary activities.

The five sheets they propose are a task definition sheet, three principle design sheets, and a final solution sheet to support the visualization researcher in formalizing their design exploration process. This methodology suited my specific problem context due to the complexity and sensitivity of the research domain. The five design sheet methodology was applied primarily during the second and third design iterations for creating visual designs in the visual design stages. These five sheets are available on Roberts et al.'s website1, and I follow the processes they describe. First, I characterized the tasks the visualizations needed to satisfy on the first sheet. Then, I explored and generated design ideas on the second to fourth sheets. Finally, I filtered these designs down to a subset of solutions on the fifth sheet. Using this methodology, I generated visualization ideas during the visual design stages of design iterations. Design ideas were digitized and presented to stakeholders during stakeholder feedback stages. The five design sheet methodology helped my stakeholders understand that design concepts were not finalized implementations, even when digitized in presentations to the team.

2.6.2 Design Space Exploration Process

Schulz et al. [46] describe how visual design spaces rely upon the conceptual decomposition of a design whole into its essential parts. These parts and their modifiable attributes form the n dimensions of a design space. The researcher can then explore this n-dimensional design space by modifying positions along the dimensions of the design space. Thus, every location in a design space represents an n-tuple of design choices for a given design concept. With a design space outlined, it is possible to select and modify sub-portions of the design to explore the effects of dimensional changes on the design's properties. With modifications to some dimensions made, all of the individual parts are reconstituted into a new whole to generate a complete design alternative. By systematically exploring design spaces, Schulz et al. describe how it is possible to discover non-intuitive design solutions that may never have occurred to visualization researchers. In this thesis, I formalize a unique design space based on carefully selected charts and the design goals to be accomplished. I decompose the selected charts into their parts, and systematically explore the design space to generate alternative visual presentations of test result data in alignment with design goals. I then use these alternatives in subsequent stakeholder feedback stages within the design iterations to generate new requirements and acceptance criteria. In summary, by following a methodological design cycle, the rigour and relevance of the design study are enhanced.

1http://fds.design/index.php/resources-and-publications/
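Schulz et al.'s notion of a design space as an n-tuple of choices can be made concrete with a short sketch. The following Python snippet enumerates every location in a small three-dimensional design space using a Cartesian product; the dimension names and values are invented for this illustration and are not the proportional-chart design space developed later in this thesis.

```python
from itertools import product

# Hypothetical design space: each key is a dimension, each value list
# holds the modifiable attribute settings along that dimension.
design_space = {
    "chart_type": ["stacked bar", "pie", "waffle"],
    "uncertainty_encoding": ["error band", "blur", "hatching"],
    "labeling": ["inline", "legend"],
}

# Every location in the space is an n-tuple of design choices;
# the Cartesian product enumerates all of them systematically.
dimensions = list(design_space)
alternatives = [dict(zip(dimensions, choice))
                for choice in product(*design_space.values())]

print(len(alternatives))  # 3 * 3 * 2 = 18 candidate designs
```

Systematic enumeration like this is how non-intuitive combinations can surface, rather than relying only on designs the researcher happens to imagine.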

Chapter 3

Drug Checking Service Context

Many contextual factors impacted the design, process, and outcomes of my research. I present an overview of the most relevant contextual factors in this chapter as part of my efforts to ensure the relevance of the solution to the problem. Much of this context was gathered during relevance cycles and helped define requirements and acceptance criteria for design proposals. I verified this contextual information against the drug checking literature to support the rigour cycle. These efforts are in alignment with Hevner's [18] relevance and rigour cycles and help generate the 'thick' descriptions that van Aken [52] requires from high-quality design science.

3.1 Stakeholder Types

The context of my study has three primary stakeholders: clients, harm reduction workers, and chemical analysts. Each group has individual and shared requirements and goals, and each is affected by systemic forces and possesses different perspectives. The visual report design must find a balance among the requirements and goals of these stakeholders. I include a service flow diagram in Figure 3.1 showing the stakeholders, their roles, and their primary concerns. I collected this stakeholder information through extensive interactions with service staff over an extended period, and in the case of clients, also from descriptions of client demographics in the related literature.

Figure 3.1: A diagram of service flow within Substance with stakeholders and stages.

3.1.1 Clients

Clients are members of the public who wish to have drug samples tested by the drug checking service. They can bring in multiple samples and bring in samples on behalf of others. Due to privacy and anonymity concerns I was never able to interview or survey client stakeholders (clients) directly. This lack of direct access was an essential factor in subsequent research decisions, results, limitations, and future work. I collected these client stakeholder descriptors from the other two stakeholder types and the literature. Clients can belong to any demographic. However, some demographics outweigh others in the population of people who utilize the service. These include people who are homeless, people who are regular users of drugs, and people who use consumption sites. As the name suggests, consumption sites are locations where drug users can consume drugs in supervised settings. These facilities were created to reduce harm within the drug user community. These harm reduction goals include reducing disease transmission, reducing injuries and deaths due to overdoses, and providing access to related harm reduction resources. Substance is a drug checking service offered as part of a community harm reduction resource. Substance has partnered with sites in Victoria specifically to target underserved demographics who may not have access to any other drug checking services. These partners include Aids Vancouver Island1 and Solid Outreach2. From the drug checking literature, Liang et al. [31] note that for the two decades that drug checking services have existed, users have typically been considered 'party drug' users. The authors note that in recent years, drug checking has extended into populations with a higher ratio of females, higher socio-economic status, and also marginalized populations. As drug checking services begin to serve marginalized populations, such as injection drug users, new challenges in offering effective services are likely to emerge.
An example is described in Karamouzian et al.'s work [22]: after receiving positive fentanyl test results at safe injection facilities in Vancouver, BC, clients did not discard their drugs but reduced dosages instead. Van Aken notes that dynamic human behaviour like this is a typical challenge of socio-technical design science problem contexts, and this is a clear example where human agency must be accounted for. Though the service is open to any members of the public, many clients come from

1http://avi.org/our-services/victoria 2https://solidvictoria.org/outreach/

the homeless demographic and do not have regular exposure to visualizations or chart reading situations. Clients also may not have access to the internet, may not have followed normative education paths, or may not have much interest in learning how the service generates drug checking data. The clients who access Substance are concerned with maintaining their anonymity and privacy. Stigma and criminalization motivate client concerns surrounding possession and use. Non-client stakeholders indicated that clients must be protected from further stigmatization and criminalization when they access Substance. Therefore, the service must remain neutral in the value judgments that it is perceived to make through its messaging. For example, Substance should never directly or indirectly say that using drugs is wrong or judge clients for the drug use decisions they make. Another client concern is the speed of service. According to the research partners, a large portion of clients will leave the service if the results take longer than 30 minutes. This time limit is borne out in the drug checking literature as well. In a global review of drug checking services, Barrat et al. [3] collected data on service turnaround times and drug checking service format. Formats include mail-in, fixed-site, and on-site drug checking services. Turnaround times are typically less than 30 minutes, with only a few taking more than an hour for the on-site drug checking services surveyed.

3.1.2 Harm Reduction Workers

Harm reduction workers are immersed in the client context and are the first service staff clients encounter within the drug checking service. Harm reduction workers are service staff and researchers who have backgrounds in social work with vulnerable populations. They may have been formally educated in social work, or have gained their experience within social work settings. These experiences give them a unique and intimate perspective on the challenges that clients face, and they are primarily concerned with the welfare of clients. Harm reduction workers are responsible for running demographic surveys, collecting samples from clients, and engaging the client with harm reduction resources during the chemical analysis. Harm reduction workers also participate in helping disseminate drug checking test results to the client. Dissemination occurs during the harm reduction conversation concerning the drug checking test results. In this conversation, harm reduction workers help translate the results into useful harm reduction actions the client could take. The harm reduction conversation or intervention is a

common feature of drug checking services that return results to clients [4]. Within Substance, harm reduction workers face the challenge of delivering a critical service to a disenfranchised and extremely vulnerable population. Stigmatization, criminalization, and general societal segregation put the drug-using demographic at a disadvantage when it comes to accessing critical services. For people who use drugs, drug checking services can be their only way of gaining safety-critical information about the drugs they consume. Harm reduction workers are also very interested in highlighting uncertainty within the drug checking test results for clients. According to the harm reduction workers, clients interpret test results originating from an official organization like Substance as being factual and accurate. Harm reduction workers want clients to understand that the results contain uncertainty and want the report to represent that uncertainty. Beyond this, harm reduction workers indicate that the test results report must present data intuitively and ethically. Additionally, having a test result report is useful in helping harm reduction workers prevent accidental overdose. A report would provide drug content information and qualitative interpretations and depictions of uncertainty in the test results, and do so in a consistent and reproducible format. Additionally, a report could provide links to drug safety resources and service information. Having a report also enables tracking improvements in messaging outcomes, as measured by changes in drug use behaviour in response to adulterant-positive test results. Changing drug use decisions in response to adulterant-positive test results is discussed in Kennedy et al.'s [24] work on the willingness to use drug checking services in supervised injection sites.
The authors note the mixed results in changes to behaviour in response to test results, even when test results indicate potentially harmful drug contents. Being able to track behaviour changes against a consistent test result artifact could improve the outcomes of drug checking services and the drug use decision making of clients. As shown in Figure 3.1, once the harm reduction worker has collected and documented each drug sample from the client, the samples are passed on for chemical analysis.

3.1.3 Chemical Analysts

Chemical analyst stakeholders possess the training to operate and understand the specific chemical analysis systems used within the service. As shown in Figure 3.1, chemical analyst stakeholders are responsible for running chemical analyses on the drug samples to produce test results. Beyond understanding how to perform each chemical analysis, chemical analysts must have an understanding of the theoretical effectiveness of each chemical analysis process and be able to interpret the results for client stakeholders. Chemical analysts are primarily concerned with operating their systems effectively and producing actionable and accurate test results. The chemical analysts at Substance have years of drug checking experience between them and have delivered hundreds to thousands of test results. Chemical analysts face the challenge of interpreting test results produced by imperfect chemical analysis systems on behalf of clients. The chemical analysis processes used do not always find the same drug components or produce consistent results. Factors that impact test results include the accuracy of system measurements, the sensitivity of analysis processes to procedural mistakes, and the pairing of analysis techniques to drug sample types. Chemical analysts also interpret and resolve discrepancies in test results. Once test results are summarized, chemical analysts present test results in the harm reduction conversation with the harm reduction worker and client. They also describe their qualitative interpretations and answer any questions that arise as a result of the test result data. A specific request from the chemical analyst stakeholders to reduce these challenges was a visual test result report, as that would allow them to present complex test result information in repeatable and straightforward ways. There are also actions that service stakeholders perform outside their chemical analyst and harm reduction worker roles.
Service stakeholders may use their experience in drug checking to help clients understand the harms that different drugs may have or describe harm-reducing actions that could be taken even when those actions are outside their primary role. The above stakeholder data describe dominant perspectives, qualitative acceptance criteria, and design requirements in the Substance context. These stakeholder descriptions are a critical part of Hevner's relevance and rigour cycles. I increase the relevance of designed solutions to the problem context and increase the rigour of this

research by collecting these descriptors over time and from distinct sources.

3.2 Drug Testing Systems and Test Result Formats

In this section, I present the drug testing technologies and test result formats and describe sources of uncertainty within the test results. Drug checking test results fall into three data categories: component composition, percent composition, and qualitative interpretation. A chemical composition breakdown of a drug sample generates a component composition, which is similar to the ingredients in a recipe without the quantities of each ingredient included. Mobile drug checking services have been providing crude component composition test results for decades through reagent testing. Reagent testing mixes reagents with samples to produce indicator colours and is something a drug checking service can do anywhere on a small budget. Component composition test results provide critical information that people who use drugs need to make decisions. For example, if cocaine or MDMA (up, uppers) tests positive for unwanted components, a client can discard them and buy from someone different without much cost to themselves. However, for those who use heroin or other opioids (down, downers), the stakes are higher. A dominant portion of opioid samples tests positive for the presence of fentanyl. Fentanyl is added to, or fully replaces, down in samples because fentanyl greatly enhances the desired effects of those drugs. Down enhanced with fentanyl is becoming cheaper and commonplace, but fentanyl has also begun to spread throughout drug supplies. According to my stakeholders, simply identifying component composition is now insufficient for safe drug use decisions. Instead, identifying and quantifying the amount of fentanyl and other chemical components as percents of composition in drug samples has become critical in making safe drug use decisions for opioids. Most on-site drug checking services do not offer percent composition test results [3]. Until recently, the technology necessary to produce percent composition test results has been physically large, expensive, and complex to operate.
Miniaturized technology capable of producing percent composition results for on-site drug checking services has only become available in the past few years. This technological

advancement represents a considerable step forward for drug checking operations. However, even with percent compositions generated, it is crucial for chemical analysts to qualitatively interpret test results and present a summary for clients of the service.

3.2.1 Component and Percent Composition Test Results

Substance utilizes five drug checking systems. Each system utilizes a different method for determining drug composition, with test results produced by these methods falling into two categories, as seen in Table 3.1. The general test result data formats abstracted out of the system outputs are component composition results and percent composition results.

Chemical analysis method                            Test result category
Fentanyl test strips                                Component Composition
Gas Chromatography-Mass Spectrometry (GC-MS)        Component Composition
Raman spectroscopy (Raman)                          Component Composition
Surface-Enhanced Raman Spectroscopy (SERS)          Component Composition
Fourier-Transform Infrared Spectroscopy (FTIR)      Percent Composition

Table 3.1: Chemical analysis methods and type of drug checking test results produced.

Component Composition

Machines that output component composition lists indicate the "what" of composition. Each machine outputs a list of ingredients, and also includes a hit-score for each ingredient. A hit-score is a numerical or symbolic representation of the confidence of identification for each component. Fentanyl test strips test only for a single component and are either positive or negative, with almost no limit of sensitivity. Table 3.2 shows examples of the output data for each system.
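Because each system reports its hit-score on a different scale, one way to reason about them side by side is to map each onto a common 0-1 confidence value. The sketch below is a hypothetical normalization, not a procedure used by Substance; in particular, the numeric values assigned to the SERS colours are assumptions, since the underlying thresholds are not exposed to analysts.

```python
# Hypothetical normalization of per-system hit-scores (see Table 3.2)
# onto a common 0-1 confidence scale.
def normalize_hit_score(system: str, score) -> float:
    if system in ("GC-MS", "FTIR"):        # reported on a 0-1000 scale
        return score / 1000
    if system == "Raman":                  # reported on a 0-100 scale
        return score / 100
    if system == "SERS":
        # Qualitative colours; these numeric mappings are assumptions
        # made for illustration only.
        return {"red": 0.25, "orange": 0.5, "green": 0.9}[score]
    if system == "Fentanyl test strip":    # binary result
        return 1.0 if score == "positive" else 0.0
    raise ValueError(f"unknown system: {system}")

print(normalize_hit_score("GC-MS", 850))   # 0.85
print(normalize_hit_score("Raman", 87))    # 0.87
```

A shared scale like this would only aid comparison; it would not remove the need for a chemical analyst's qualitative interpretation.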

Percent Composition

Percent composition machines decompose the ratios of ingredients within a drug sample. They answer the "how much" for each component they identify in drug samples. Percent composition results are lists of components with a percentage of the sample's composition attributed to each component in the list. Only FTIR is capable of producing these results at this time. Table 3.3 shows an example FTIR output for a drug sample.

Machine                Single Component Representation    Hit-Score
Fentanyl Test Strip    Fentanyl                           positive/negative
GC-MS                  Component name                     0-1000/1000
Raman                  Component name                     0-100/100
SERS                   Component name                     green/orange/red
FTIR                   Component name                     0-1000/1000

Table 3.2: A depiction of components and their representative hit-scores. Hit-scores go from low to high confidence. Ratios of 1 represent a complete identification. The SERS scale is qualitative, and based on an unexposed internal set of thresholds; chemical analysts only get to see the colours.

FTIR Component List    Percentage
Heroin                 45%
Caffeine               27%
Sugar (mannitol)       20%
Fentanyl                8%

Table 3.3: This is an example of an FTIR test output. It shows a list of components and the percent of the sample’s composition they each comprise.

If clients know the ratios of components that make up a drug sample, then clients can estimate the strength of the sample and its potential effect on them, and make harm-reducing choices that are not possible with only component composition information. Because FTIR results also include a version of component composition information, percent compositions are the most valuable single test result within the service.
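As an illustration of why percent composition supports this kind of dose-level reasoning, the sketch below converts the example FTIR result from Table 3.3 into estimated milligrams of each component for a hypothetical dose mass. The 100 mg dose is an invented figure for the example; nothing here reflects an actual service calculation.

```python
# Example FTIR percent composition result (Table 3.3).
ftir_result = {"Heroin": 45, "Caffeine": 27, "Sugar (mannitol)": 20, "Fentanyl": 8}

def component_mass_mg(percent_composition: dict, dose_mg: float) -> dict:
    """Estimate milligrams of each component in a dose of the given mass."""
    return {name: dose_mg * pct / 100
            for name, pct in percent_composition.items()}

# Hypothetical 100 mg dose: percentages map directly to milligrams.
masses = component_mass_mg(ftir_result, dose_mg=100)
print(masses["Fentanyl"])  # 8.0 (mg of fentanyl in the hypothetical dose)
```

This is exactly the estimate that component composition alone cannot support: a positive fentanyl result says nothing about whether a dose contains trace amounts or is mostly fentanyl.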

3.3 Uncertainty in the Drug Checking Service

Substance faces several uncertainties in delivering its services. Some uncertainties pertain to the systems used to generate test results, and other uncertainties arise due to the nature of offering a drug checking service to members of the public. Test result data contains some of the following uncertainties, the definitions of which I collected and verified with chemical analyst stakeholders during early design meetings. Drug checking system uncertainties are:

• Error: Every system has a degree of error in its results. This type of error is usually described mathematically for each system and presented as error bands in test results. None of the drug checking systems have error bands, but all systems provide an indication of confidence of identification, as shown in Table 3.2.

• Identifiable components: The systems differ in the sets of components that they can identify. For example, GC-MS can never detect sugars because they combust during the analysis process.

• Limit of sensitivity: All systems are limited to identifying components above a minimum concentration threshold. For example, FTIR cannot accurately produce results for components below 2% concentration.

• Drug homogeneity: Drug samples can be poorly mixed, resulting in mixture inhomogeneity. Inhomogeneity means that different parts of a drug sample will produce different analysis results. Additionally, the drug checking systems used by Substance are intentionally non-destructive, so results depend on which portion of the sample is analyzed.

• Drug analysis processes: Each chemical analysis system has a standard operating procedure that is followed to generate test results. These standard operating procedures vary in how much the chemical analyst is required to make decisions that determine the chemical profile produced.

• Inter-system discrepancies: Because each system possesses a different uncertainty profile, there are always inter-machine discrepancies. One technique may produce a component composition list that is missing components from another technique’s result.

The FTIR technique, which produces percent composition test results, requires the most decisions from chemical analysts, as it uses a manual subtractive signal analysis process. In this process, the machine scans the sample, producing a total sample signal. The system then tries to match pure component signals from a library of pure component signals to portions of the sample signal, based on fitting the signal lines together. Once the chemical analyst finds the best fit for a component in the library, they subtract that signal out of the sample signal, leaving behind a residual signal. Chemical analysts continue this process of fitting and subtracting pure components out of the sample signal until the residual signal is just noise. The order of subtraction of signals is critical and can produce different sets of components being matched and reported as drug sample contents. This non-determinism, introduced by the drug analysis process, can have a dramatic effect on the components identified within a drug sample.
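The subtractive fitting loop described above can be sketched in code. This is a minimal illustration of the idea only: the function name, the least-squares fitting, and the stopping threshold are my assumptions, not Substance's actual FTIR procedure, and in practice a chemical analyst makes the fitting and ordering decisions by judgment.

```python
def subtractive_analysis(sample, library, noise_threshold=0.05):
    """Sketch of a subtractive signal analysis loop.

    `sample` is the measured spectrum (a list of intensities) and
    `library` maps component names to pure reference spectra of the
    same length. Names and thresholds are illustrative.
    """
    def mean_abs(xs):
        return sum(abs(x) for x in xs) / len(xs)

    residual = list(sample)
    identified = []
    while mean_abs(residual) > noise_threshold:
        best = None  # (name, scale, remaining error)
        for name, ref in library.items():
            # Least-squares scale factor for fitting ref to the residual.
            scale = max(sum(r * p for r, p in zip(residual, ref))
                        / sum(p * p for p in ref), 0.0)
            err = mean_abs([r - scale * p for r, p in zip(residual, ref)])
            if best is None or err < best[2]:
                best = (name, scale, err)
        name, scale, err = best
        if scale == 0.0 or err >= mean_abs(residual):
            break  # no library component improves the fit; stop
        identified.append((name, scale))
        # Subtract the fitted component out of the sample signal.
        residual = [r - scale * p for r, p in zip(residual, library[name])]
    return identified
```

Because the loop greedily subtracts whichever component currently fits best, changing the library contents or breaking ties differently can change which components are reported, which mirrors the order-of-subtraction non-determinism described above.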

There are uncertainties related to the nature of offering any drug checking service. These uncertainties are due to service performance factors or practical reasons. Service uncertainties are:

• Multi-sample clients: Clients regularly bring multiple samples into the service. With the current handwritten sticky-note and verbal method of results delivery, the pairing between physical samples and test results can be mixed up. This process also reduces Substance’s ability to track its impact.

• For-a-friend clients: The service also serves clients who bring drug samples into the service on behalf of others and convey the test results to these third parties outside the service. Any confusion in pairing physical samples with test results is compounded by this delayed, out-of-service recounting of harm reduction advice to third parties.

• Client psychological state: Clients may enter the service with impaired faculties. Imparting harm-reduction information effectively to clients can be difficult in these situations.

• Service staff knowledge and experience: Service staff in both the chemical analysis and harm reduction roles have a wide variety of experience. Therefore, clients may have a considerably different harm reduction conversation depending on the service staff present.

• Client stay duration: Clients would like to receive their drug checking results as quickly as possible. A turnaround time of 30 minutes from a client entering the service to exiting with results for all drug samples is the maximum acceptable time frame. After this amount of time, clients may leave the service with a subset of test results or no results at all.

• Public facing service: The drug checking service aims to serve any member of the public. The service must be designed to accommodate the full spectrum of experience with visualizations and chemical analysis data. Keeping report designs intuitive and informative is critical.

The needs of the stakeholders, the degree of uncertainty in test results, and the challenges of offering an on-site drug checking service all help motivate this design study to generate an effective visual report. The desired outcome of a visual report would be to mitigate the negative effects of some of these uncertainties and facilitate some of the positive outcomes the service hopes to create. I searched the literature for an existing report solution suitable for Substance’s context, and I describe the results of that search next.

3.4 Existing Test Result Delivery Methods

The drug-checking body of literature is extensive and detailed. Nevertheless, I could find only one report that mentions results delivery mechanisms directly: a comprehensive global review of drug checking services conducted in 2017 by the National Drug and Alcohol Research Centre in Sydney, Australia [3,4]. In this report and its accompanying bulletin, the authors summarize the service architectures of drug checking services around the world, including their results delivery methods. Results are communicated in different ways to different stakeholders across the 31 services surveyed in Barratt’s article [4]. Stakeholders in this report include law-enforcement bodies, medical organizations, NGOs, and clients. Of particular interest in this survey are descriptions of how drug checking services deliver drug checking results to clients directly. Table 3.4 presents the methods from services that do so.

Communication Method   Number of services
App                    1
Text message           2
Aggregate report       4
Website (with code)    4
Website (public)       6
Email                  10
Phone call             11

Table 3.4: Methods used to communicate results directly to clients across the 31 services. Note that some services deliver results in multiple ways. [4]

I visited each website in the global review survey. Public websites with open access databases had reports accessible without requesting examples from the services. I reviewed these websites to see how reports presented results.

Of these public databases, EcstasyData.org has relevant test reports available. EcstasyData.org runs a drug checking service but also sources additional data from other drug-checking operations. EcstasyData.org’s result reports are particularly useful as examples because the underlying data comes from multiple drug checking operations, meaning the report designs must transfer somewhat between services. Figure 3.2 shows an example row from the database, which contains basic drug sample identifiers, drug content information, and links to a more comprehensive sample report. An example of a sample report is presented in Figure 3.3.

Figure 3.2: A row from the EcstasyData.org database. Accessed: 01/06/2019

This information includes drug sample identifiers such as photographs, date tested, originating location, description, and originating drug checking service. Also included are the expected drug and the test results. The test result data include reagent test results (Marquis, Mecke, and Mandelin are well-known reagent tests) and a by-mass breakdown of the drug sample. These test result reports are effective at presenting basic component composition test results for party-drug users. However, they do not present forms of uncertainty or confidence in their test results, which are important for Substance’s needs according to the stakeholders. They also do not attempt to resolve discrepancies between individual tests, nor do they have any presentations of test results specific to opioids. These reports present the masses of the components, but understanding how these masses relate to the total drug sample mass is difficult because the total mass is not listed. A different visual presentation of drug checking results and information may therefore be beneficial to increase the usability of these reports. The remaining available drug checking reports I observed lacked the same information as those on EcstasyData.org, or included less data. With this search for existing solutions concluded, I began a search for literature concerning the communication of drug checking test results more generally.

Figure 3.3: A sample test result report from the EcstasyData.org database. Accessed: 01/06/2019

3.4.1 Literature Concerning the Communication of Drug Checking Test Results

Measham [37] describes the creation, implementation, and results of the UK’s first in-person festival drug checking service. In earlier drug checking blog posts (see footnotes 3 and 4) and a Master’s thesis [12], she describes the importance of communicating drug checking test results effectively. Measham describes that the service providers chose to deliver test results verbally during in-person consultations at the festival. However, Measham does not discuss why verbal results delivery was the best results delivery solution. Measham’s blog posts indicate that test results delivery mechanisms must communicate the limitations of drug checking to service users, but they do not describe how those limitations are conveyed, even verbally. Measham cites work by Brunt [9], which describes the challenges of making “trade-offs between speed, accuracy, reliability and portability of (drug checking) equipment.” These challenges reflect the concerns of Substance’s stakeholders. Also, Measham’s reference Winstock et al. [54] directly highlights that the accuracy of test results tends to diminish with increasing service mobility; however, the perceived accuracy of the results does not necessarily diminish with it. This false sense of security is what my stakeholders are concerned about, and why they wish to address it directly within a report that replaces their current verbal results delivery approach.

It is noteworthy that, although Measham’s work is recent research concerning the creation of a new drug checking service and draws on the full body of drug checking and harm reduction literature, it never discusses results delivery mechanisms. There are also no mentions of research related to the artifacts in use during results delivery. For other drug checking services responding to the opioid crisis, such as those studied by Karamouzian [22], clients are said to be “notified” of results. While this may be appropriate in their case, where a single highly accurate fentanyl test strip result represents all the tests performed, Substance must communicate five completely distinct test results in a summarized format.

Work by Glick et al. [13] explored different stakeholder perspectives on drug checking services and found that test result accuracy is among the most critical concerns. They also report that stakeholders are concerned that if they are presented only with fentanyl presence, as the strips indicate, and not potency information, their ability to make drug usage decisions is severely limited because most injection drug supplies contain fentanyl.

3. https://volteface.me/feature/the-festival-drug-report-part-ii/
4. https://volteface.me/feature/pentylone-care-can-multi-agency-safety-testing-help/
This limitation motivates the inclusion of multi-format drug testing results that clients can use to triangulate drug usage decisions. The drug-checking service analyzed in Glick et al.’s work does just this by using two additional tests that complement the fentanyl test strip: one can identify components from lists of drug components, and the other can indicate how much of some components is in the sample. However, even in this context, which is very similar to Substance’s in that different systems are used and their results may not agree, the authors do not indicate how the results are communicated beyond stating that clients receive them verbally.

Laing et al. [31] report that making comparisons between drug checking services can be difficult when there is no common basis for reporting test results. They recommend the development of protocols for reporting data from drug checking services to facilitate shared learning across services. The messaging and test results delivery processes and artifacts influence the impact that drug checking services have, and so they too should be standardized where possible. The literature I reviewed does not describe in detail the test result artifacts delivered to clients. Papers concerning drug checking messaging focus on who delivers it and the content delivered, but never discuss the delivery artifacts themselves. In short, I did not find any research describing an existing drug checking test results delivery solution that meets my context’s requirements, but there is certainly the motivation to create one.

Chapter 4

Requirements Analysis

In this chapter I present the requirements and acceptance criteria gathering processes undertaken during this design study. The purpose of gathering this information is to produce a set of design requirements and acceptance criteria, grounded in the problem context, to guide the design processes. Requirements and acceptance criteria are the artifacts that move between the problem and design spaces in Hevner’s relevance cycle [18]. Gathering high-quality requirements and acceptance criteria brings the solution as close to the optimal form for the context as possible. Beyond solving a specific design problem, a critical outcome of design studies is to generate more general design knowledge. Creating qualitative connections between solutions and problem contexts is what enables the transfer of generated general design knowledge to related problem contexts [47]. This general knowledge could be codified in the form of a technological rule, wherein an intervention is applied to a problem in a context [52]. Technological rules are considerably more valuable contributions to design science when they are well triangulated between stakeholder feedback, problem contexts, and related literature. Therefore, gathering high-quality design guidance in the form of requirements and acceptance criteria from coordinated techniques is a critical factor in conducting impactful design studies.

4.1 Requirements and Acceptance Criteria Gathering Processes

During this design study, I gathered requirements and acceptance criteria four times, shown in green in Figure 4.1. I first conducted semi-structured interviews before beginning the design iterations to gather contextual and design problem information. I then conducted a design feedback meeting, a design feedback survey, and a final design feedback meeting at the ends of the three design iterations of the design study. This multi-modal gathering process allowed me to triangulate and cross-reference feedback on design ideas during the requirements analysis stages.

Figure 4.1: The complete research timeline with stakeholder feedback and require- ments analysis processes highlighted.

Each requirements analysis stage involved taking context and problem descriptions, requirements, and acceptance criteria, and consolidating them into design guidance for the following design stage. During requirements analysis stages, new information can cause the abandonment of requirements or the modification of acceptance criteria. This distillation process resulted in a body of unused requirements and outdated acceptance criteria. It also means that the final visual report design satisfies only a subset of the design guidance produced. Instead of taking the reader through all of the abandoned requirements and outdated acceptance criteria, from now on I present only the final set of satisfied design goals. The Appendix contains the full set of unrefined requirements and acceptance criteria. Note that I collected requirements and acceptance criteria only from the chemical analysts and harm reduction workers; any design guidance about clients came indirectly from those stakeholder groups or from the literature.

4.1.1 Semi-Structured Interviews

Hove and Anda [19] indicate that a combination of open-ended and specific questions helps gather both foreseen and unforeseen contextual information in semi-structured interviews. According to Barnum et al. [2], it is appropriate to perform relatively few (five) interviews while conducting qualitative research. The authors state that interviewing five participants can generally produce 80% of the necessary design guidance. I performed five one-on-one, in-person, one-hour, semi-structured interviews to collect initial contextual information and characterize the stakeholder groups. During these interviews, I asked standard questions and some stakeholder-group-specific questions, and opened a dialogue about potential test report solutions. I used standard questions to facilitate comparisons between the stakeholder groups’ needs and priorities. Stakeholder-specific questions helped characterize each group generally, but also helped generate a sense of the lived experiences of individuals within stakeholder groups.

4.1.2 Interview Protocol

The line of questions started with informal questions to put the interviewees at ease. I then asked the interviewees this set of specific questions:

• What is your role in the drug checking service?

• How do you interact with drug checking test results in the service?

• What do you think about a visual report for delivering test results?

• What can you tell me about the clients of the service?

Depending on the answers to the above questions, I then asked more general follow-up questions to collect more detail. This protocol enabled comparisons between stakeholder groups and an initial characterization of the client stakeholder group. From these interviews, I also gained a sense of how a visual report might fit into the context. However, the design guidance collected during this initial phase was fairly general; further feedback gathering and refinement of requirements happened in the following design iterations.

4.1.3 Design Feedback Meetings

I conducted two one-hour design feedback meetings during the design study to present design alternatives to the entire drug checking team, which consists of ten people across two of the three stakeholder groups. During these meetings, I projected design options to team members and described how design features attempt to satisfy design goals. Open discussion on the implications of designs, design improvements, and requirements surrounding the visual report generated design feedback, which I tracked with meeting notes. The demographics of the team during these meetings were four harm reduction stakeholders (two male, two female) and six chemical analysis stakeholders (three male, three female). Together these groups provided feedback on designs from their stakeholder perspectives. The needs of clients were a crucial talking point during these meetings; this proxy-based client information was critically important in ensuring the relevance of designs to the client stakeholder group.

4.1.4 Design Feedback Survey

Much of the feedback gathered to this point within the design study was subjective, and making design decisions and resolving conflicts between requirements was challenging to accomplish in a team meeting setting. I therefore used a semi-formal design feedback survey in place of a design feedback meeting for the second design iteration’s stakeholder feedback stage. I include the entire survey in Appendix B. A survey was useful because of the high number of design decisions to be made at this point in the study, and it provided independent subjective opinions to help guide my design decisions. The survey was anonymous so that stakeholders could feel free to present their honest preferences and opinions about design ideas; it was also a way to hear from quieter participants. The survey presented both multiple-choice questions, where those surveyed could choose the ‘best’ designs for a section of the report, and open feedback areas for each question to capture emergent concepts. I described the purpose of each report section. These descriptions included the type of data to be presented and the desired outcomes. For example, I show a survey question in Figure B.15 concerning drug sample identification.

Survey Creation

I generated the survey content from the set of design alternatives for each report section. The survey consists of three sections. The first section introduced the survey and described the overarching goals of the visual report; it starts at the beginning of Appendix B in Figure B.1. Each section provided instructions on question structure and how to answer its questions.

The second section presented questions concerning the report section layout and provided design proposals for each report section; it starts in Appendix B at Figure B.9. Each question indicated the section of the report the designs were for, the data within and intended purpose of that section, and the design options for that report section. The third section consisted of a single question presenting a report randomly assembled from the design options; it starts in Appendix B at Figure B.39. This filled-out report helped collect feedback concerning stakeholder impressions of a complete report, and a final open textual feedback area collected whole-report design feedback. I collected feedback on the survey itself from fellow visualization researchers in a pilot of the survey. These researchers had not previously seen the designs, and the primary goal was to verify that the design ideas were intuitive. Appendix B contains the survey instrument itself. The survey link was submitted to the drug checking team in a general chat message. The link submission included a survey description, instructions for taking the survey, and the time frame in which the design survey would be available. I did not provide remuneration to those surveyed, and in consultation with my collaborators and supervisor I established that ethics forms were not required, as this informal survey was sent exclusively within my team to gather feedback and brainstorm ideas as part of “regular duties” expectations.

Survey Results

The survey, which was open for seven days, received ten responses with a 100% completion rate. Though the drug checking team had 23 members in the communication channel used to disseminate the survey, only ten members are active internal members who attend meetings. That there are ten active members and ten responses were submitted suggests that every active member responded. Getting all active team members to participate in the design feedback survey was valuable for collecting the impressions of both non-client stakeholder groups. I was also able to collect indirect feedback about the client stakeholders via stakeholder proxy. The survey results generated new design requirements and guided difficult decisions between design options. For each question, survey-takers indicated design preferences, design choice suggestions (the “other” selection option), and open textual feedback. By counting indications of preference, I was able to decide between design options when a clear winning design was apparent. However, there were times when design options tied, and times when only a few people had answered the question by choosing design options they liked. In these cases, I used the open textual feedback and “other” multiple-choice responses for qualitative guidance. This design guidance was consolidated with the previous design guidance to generate inputs for the next visual design stage.
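The decision procedure described here (count preferences, accept a clear winner, and otherwise fall back to qualitative feedback) can be sketched as follows. The function name and the response threshold are illustrative assumptions, not the exact rule used in the study.

```python
from collections import Counter

def choose_design(votes, min_votes=5):
    """Pick a winning design option from survey preference counts.

    `votes` is a list of option labels chosen by respondents. Returns
    the winner, or None when there is a tie or too few responses, in
    which case open textual feedback would guide the decision instead.
    """
    if len(votes) < min_votes:
        return None  # too few answers to call a winner
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie between the top options
    return counts[0][0]
```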

4.2 Requirements and Acceptance Criteria

Hevner’s relevance cycle describes both requirements and acceptance criteria [18]. He states that requirements are essential application domain information connecting the problem and potential solution, and that acceptance criteria are measures of effectiveness in the problem context. Both impact the design process. However, requirements are design cycle inputs used to generate design alternatives, whereas acceptance criteria are design cycle inputs used to measure the effectiveness of the design within the application domain. Both are necessary to create compelling designs, as both inform design choices. The requirements analysis process I used in this design study involved iteratively collecting requirements and acceptance criteria, generating designs that hypothetically suit the problem context, and then gathering feedback on those designs from stakeholders. The feedback I gathered generated new requirements and acceptance criteria, which I consolidated in the subsequent design iteration with the already existing requirements, as shown in Table 4.1. When I combine requirements and acceptance criteria into holistic design objectives, I refer to them as design goals, as is done in other design studies such as Wunderlich et al. [55] and Lebeuf et al. [33]. These two articles present sets of design goals that capture the essence of their final design efforts. These distilled objectives represent the overarching intentions of the final design. The design goals I distilled out of my design study are as follows:

Design Goals

• G1: Visualize percent composition and component composition using charts.

Table 4.1: This table presents the connections between my stakeholders, information gathering processes, design constraints, and design goals. The Requirements and Acceptance Criteria column includes short descriptions of design guidance gathered at high, medium, and low levels of abstraction. The Design Goals column indicates which design goals satisfy the requirements and acceptance criteria. The Source column indicates where the requirements and acceptance criteria were collected from. The Process column indicates which technique was used to collect them. I describe the codes below the data.

• G2: Visually represent uncertainty in percent composition test results.

• G3: Visually represent confidence in component composition test results.

• G4: The digital and paper handout versions of the visual report must be visually similar.

• G5: The visual report must present basic drug checking service information.

• G6: The visual report must present descriptors of the drug sample.

• G7: The visual report must highlight fentanyl test results wherever they are throughout the visual report.

• G8: Chemical analysts must be able to provide qualitative interpretations of test results within the visual report.

• G9: The visual report must provide service disclaimers.

Design goals one to three are satisfied using an uncertainty in percent composition design space and a confidence in component composition table chart, designed as solutions for those design goals. Design goals four to nine are concerned with the sections of the report surrounding the charts: describing test results, providing disclaimers to help clients understand limitations, and providing access to service information for clients. In the next chapter I discuss how I addressed design goal one, which is to visualize percent composition and component composition using charts.

Chapter 5

Design Goal One: Visualizing Percent Composition and Component Composition

In Chapter 3, I presented the percent composition and component composition data formats produced by the chemical analysis systems. In this chapter I describe how that data is presented using specially selected and designed visual charts.

5.1 Improving Drug Checking Test Results Delivery Using Charts

Using charts to represent test result data is an approach to improve client understanding of test results. Charts are helpful because they use visual symbolic representations to convey data values and expose relationships within the data. My stakeholders hoped that a visual report design that presents test results data intuitively and exposes the most critical relationships within that data would help clients make better harm reduction decisions. The service stakeholders also indicated that there should be a clear difference between the percent composition and component composition test results in the visual report. As mentioned, the current practice for presenting test results uses handwritten descriptions, which combine percent composition and component composition results in summary. In Chapter 1, I described an anecdote about the potential adverse effects of confusing percent composition and component composition test results as one motivation to create the visual report. Depicting these data in very different charts helps avoid confusion between percentage composition and confidence in composition, and also facilitates more productive harm-reduction conversations based on the data.

From both the harm-reduction and chemical analyst stakeholder groups, I gathered requirements that motivated providing clients with data as close to raw as possible. Non-client stakeholders suggested bringing clients closer to the raw data to empower the client stakeholders within the service. Chemical analysts wished to surface as much data as was reasonable to facilitate a deeper understanding of the test results by clients. Harm-reduction stakeholders considered it a moral responsibility for the service to give clients access to decision-making information, but without unfamiliar presentations or confusing amounts of detail. In all, my stakeholders provided strong motivations to present clients with simple visual representations of the raw data that do not overstate the accuracy of that data.

5.2 Selecting Appropriate Charts

To select charts that would suit the data formats and design requirements of the problem context I reviewed computer-human interaction literature and visualization literature. I relied upon John et al.[21] for their exploration of boundary objects, which the authors define as intermediary artifacts which are created to facilitate collaboration and communication between disparate fields or groups. It was helpful to understand the visual report as a form of boundary object which sits between harm-reduction workers, chemical analysts and clients during the harm reduction conversation. In this role, the visual report helps facilitate the dissemination of decision-making information from service staff to members of the public concerning their drug samples and drug use. Placing the visual report into a boundary object role in the context helped me un- derstand priorities in the subsequent step of determining which charts should present the percent composition and component composition data. For selecting which charts would best present the data, I relied upon Saket et al.[45] for their exploration into the task based effectiveness of basic charts which helped me make empirically informed chart choices in alignment with my desired outcomes. From my review I collected a set of candidate charts which were related to my problem space, and then filtered them according to their visual simplicity, fit-to-data, fit-to-requirements, and how commonplace the charts were. A visually simple chart is one which presents the un- 42

derlying data using as little chart structure as possible. In order for me to consider a chart to have a close fit-to-data means that the chart presents all of the informa- tion in the data, and nothing more. In order for me to consider a chart to have a close fit-to-requirements means that the chart does not rely on colour, animation, stigmatizing elements, or depend on complex and unfamiliar patterns or interpreta- tions. Additionally, for a chart to be commonplace it must be a graphic which a magazine or newspaper might reasonably use, as the visual report is intended for use with the general public. These criterion are codified within Table 4.1 as accessibility, usability, and readability non-functional requirements. The full set of charts I quickly collected was extensive, however, I wanted to corroborate my decisions with literature on selecting visual charts. Saket et al.’s research on the task-based effectiveness of

Saket et al. Task List         #   Percent Composition         Component Composition
Find Anomalies                 1   Find Anomalies              Find Anomalies
Find Clusters                  2   Find Extremum               Retrieve Value
Find Correlation               3   Retrieve Value              Determine Range
Compute Derived Value          4   Order                       Compute Derived Value
Characterize Distribution      5   Characterize Distribution   Find Extremum
Find Extremum                  6   Compute Derived Value       Order
Filter                         7   Filter                      Filter
Order                          8   Determine Range             Find Clusters
Determine Range                9   Find Clusters               Find Correlation
Retrieve Value                10   Find Correlation            Characterize Distribution
Selected Chart                     Pie Chart                   Table Chart

Table 5.1: A prioritization of common visualization tasks to select charts for each data type [45]. The original set of tasks is in the left column, rank numbers for my orderings are in the middle-left column, the task ordering for percent composition is in the middle-right column, and the task ordering for component composition is in the right column.

basic visualizations [45] was particularly helpful to solidify the choice of charts for percent composition and component composition test results. In their paper, they empirically evaluate a tabular chart, bar chart, scatterplot chart, line chart, and pie chart in relation to performance on the common visualization tasks these charts are used to support. The measures Saket et al. use to assess chart performance on tasks are speed, accuracy, and user preference. In my safety-critical decision-making application domain, it is more important that clients understand the raw data accurately (accuracy) and are interested in and willing to read the charts presented (user preference) than how quickly they can read the charts (speed). I selected potential chart candidates by rank-ordering Saket et al.'s tasks based on the requirements and acceptance criteria gathered from stakeholders. I present the prioritization of these tasks in Table 5.1, according to the requirements and acceptance criteria, as well as the problem description. This prioritization of tasks results in the pie and table charts standing out from the rest in their performance in accuracy and user preference.
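To make the selection mechanics concrete, a rank-weighted scoring scheme like the one below could corroborate such a choice. This is purely an illustrative sketch: the function name and all effectiveness numbers are invented placeholders, not Saket et al.'s empirical measurements.

```python
# Hypothetical rank-weighted chart scoring. Tasks earlier in the priority
# ranking contribute more (weight 1/rank); the effectiveness values are
# made-up placeholders for illustration only.

def rank_weighted_score(task_ranking, chart_effectiveness):
    """Sum a chart's per-task effectiveness, weighted by inverse task rank."""
    return sum(chart_effectiveness.get(task, 0.0) / rank
               for rank, task in enumerate(task_ranking, start=1))

# Two tasks prioritized for percent composition, two candidate charts with
# invented accuracy scores per task.
ranking = ["Find Anomalies", "Find Extremum"]
charts = {
    "pie": {"Find Anomalies": 0.9, "Find Extremum": 0.8},
    "bar": {"Find Anomalies": 0.6, "Find Extremum": 0.9},
}
best = max(charts, key=lambda c: rank_weighted_score(ranking, charts[c]))
```

With these invented numbers, the pie chart scores 0.9/1 + 0.8/2 = 1.3 against the bar chart's 1.05, so `best` is `"pie"`.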

5.2.1 Percent Composition Chart

Figure 5.1: The percent composition pie vs cake charts (left), and the component composition table chart (right).

For visualizing percent composition, I selected a pair of closely related chart candidates to choose between: the simple pie chart, and the similar cake chart [6], which is a stacked proportional form of a bar chart. I present these charts in Figure 5.1. The pie chart and cake chart present proportional data through the relative size of segments (each data point) in relation to the size of all segments together (representing the whole of which the segments are part). Percentage marks spaced at regular intervals along a percent axis help end-users understand the relative sizes of segments. Though less necessary in the case of the pie chart due to its naturally closed design, the cake chart benefits more clearly from denoting that segments add to 100% of the total. Additionally, each segment of the pie chart contains a segment label in the form of a drug name, accompanied by the percent of the drug sample it comprises. In the case that the percentages of a drug sample do not add to 100%, an additional black-filled segment is added with an "Unknown" label and the percentage of unaccounted-for composition. The results of the design feedback survey gathered later in the research process indicated that the cake chart design was preferable over the pie chart for the same tasks. Five out of seven stakeholders specifically stated their preference for the cake chart over the pie chart, with no stakeholders stating they preferred the pie chart over the cake chart.
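The "Unknown" segment behaviour described above can be sketched in a few lines, assuming components arrive as (name, percent) pairs; the function name is hypothetical:

```python
def with_unknown_segment(components):
    """Append a black-filled 'Unknown' segment when reported percentages
    do not sum to 100%, as the pie and cake chart designs described above
    do. `components` is a list of (drug_name, percent) pairs."""
    total = sum(pct for _, pct in components)
    segments = list(components)
    if total < 100:
        segments.append(("Unknown", 100 - total))
    return segments
```

For example, a sample reported as 60% caffeine and 10% fentanyl would gain a 30% "Unknown" segment, while a sample that already sums to 100% is left unchanged.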

5.2.2 Component Composition Chart

For visualizing component composition, I selected a basic tabular chart, and I show the final chart layout in Figure 5.1. Conversations with the stakeholders motivated the choice and design of the tabular chart. Stakeholders considered the tabular chart to be the best method for presenting the component composition data because of its ease of understanding, its customizability, and the public's familiarity with tabular charts. The stakeholders thought that the tabular chart would be effective at presenting all component composition results from each of the five tests, as it enables clients to see the diversity of results and ask questions about discrepancies they might notice. It was also a natural choice for depicting the multiple sources of component composition data within the service and surfacing the discrepancies between their results, which was a critical requirement of the stakeholders. Two header labels were requested by my stakeholders as additions to the table design to facilitate client understanding of the chart. These colloquial headers were added to ease the readability of the chart for readers who are not familiar with the chemical analysis systems. In Figure 5.1, I show how I added a top row to the component composition table design to provide simple descriptions for the categories of columns in the table chart. The first reads "What did we find?", which lists the full set of compounds found by all the systems, and the second reads "Which test found it?", which connects the compounds to the systems which identified them. Underneath the colloquial headers is a second row which starts on the left with an "Ingredients" header that sits over the list of compounds in the drug. The word ingredients was used instead of compounds because it is a less technical term, as the stakeholders requested. The "Overall" header sits over a summary of the combination of the confidence indications found to the right of this column.
This overall summary is what the chemical analyst presents to the client as their interpretation of the results, while also providing the client with empowering insight into the raw data should they be interested in exploring it further. The subsequent headers to the right each name a chemical analysis system, and each column presents the confidence data from that system. It is worth noting that the "Fentanyl Test Strip" will only ever indicate fentanyl-related test results, but it is included so that, should a fentanyl analog be found, it can present positive and negative test strip results. Now that I have presented the selection process of the charts and described the design elements of the selected charts, I move on to describing how I visually represent confidence in component composition and uncertainty in percent composition, as required by design goal two: visually represent uncertainty in percent composition test results.

Chapter 6

Design Goal Two: Visualizing Uncertainty in Percent Composition

Design goal one led to the selection of pie and cake charts to represent percent composition data, and the selection of a customized tabular chart to represent component composition data. The purpose of design goal two is to integrate visual representations of uncertainty in percent composition data into the pie and cake charts. I integrate confidence in component composition data into the tabular chart in Chapter 7. Design goal two reads "visually represent uncertainty and confidence in the test results," and involves taking the pie and cake charts and introducing uncertainty within them. I begin by explaining how both uncertainty in percent composition and confidence in component composition may help empower clients.

6.1 Empowering Clients with Uncertainty and Confidence Data in Test Results

For a detailed literature exploration of why presenting uncertainty and confidence in the test results is essential, please see Chapter 3; here I present some simple examples of the usefulness of these data types, and I show the pie and cake charts in Figure 6.1. In the case of presenting uncertainty in percent composition, stakeholders care that the report visualizes uncertainty in the percent composition chart because the proportions of components influence the effects of a drug sample. For example, a client can reference previous experience with similarly proportioned drug samples to estimate the strength of a tested drug sample. Understanding drug strength informs harm-reducing decisions for the client; however, it is critical that clients not be permitted to place blind faith in what are actually uncertain percent composition test results.

Figure 6.1: The percent composition pie and cake charts with 100% axis.

In the case of presenting confidence in component composition, the identification of particular components, like fentanyl, is critical because some components can be very harmful at low concentrations. Presenting confidence in the component composition chart helps clients understand how certain a method is in having identified a given component. Each system that produces component composition data accompanies each identified component with a confidence indication. These indications are derived from specific metrics within each system and are presented using different representations, as seen in Table 3.2. Each system presents confidence in different ways, using colour scales, numeric scales, and icon scales. Beyond the data directly generated by the drug checking systems are the inter-system relationships between test results. These relationships are used by harm reduction and chemical analyst stakeholders to understand whether components are more or less likely to be present. There are drug components which one test can identify but another cannot, or has difficulty identifying, and so another technique is relied upon to find those components. By combining and interpreting the results of all five systems, chemical analysts can create an overall qualitative picture of sample contents to present to clients, and the visual charts need to facilitate discussions about this overall interpretation. In Chapter 3 I described my ultimately unsuccessful search for examples of presenting uncertainty and confidence in drug checking test results within the drug checking literature. With the need to present uncertainty and confidence data well established both by the literature and my stakeholders, the question then becomes how to visually represent uncertainty and confidence in the charts I have selected. I answer this question in two distinct parts: part one addresses uncertainty in percent composition in this chapter, and part two addresses confidence in component composition in the following chapter.

6.2 Uncertainty in Percent Composition

In Beard and MacKaness’s [5] uncertainty in geo-spatial information (GIS) visual- ization article they describe three levels of data quality assessments for decision- makers. According to the authors, visualization researchers can aim to facilitate an understanding of uncertainty or confidence data qualities using visalizations through “notification, identification, and quantification” [5, p.40]. The authors say that noti- fication indicates the potential of data problems, identification categorizes the nature of the data quality issue, and quantification actually shows both the nature and ex- tent of the data problem. In this chapter concerning percent composition data, I am attempting merely to notify clients that there is the presence of a data problem, whereas, in the following chapter, I am attempting to quantify the data problem for clients. Beard and MacKaness also categorize users within three groups based on their data use needs. These three data use case categories are a production of in- formation goods, decision-making using information, and an exploration and research use case. The production and exploration cases are for information experts, and the decision-making use case can be for any member of the public who wishes to understand data qualities and make decisions. For both the uncertainty in percent composition and confidence in component composition design challenges, I classify clients to be in the decision-making use category. Beard and MacKaness suggest that data quality indication designs must consider the specific requirements of effectively presenting decision-making information to the group of end-users: I consider this to be a high-priority consideration within this and the next chapter. 49

I now describe the sources of uncertainty to be visualized in the percent composition chart, with more in-depth descriptions of the uncertainty in percent composition test results.

• Measurement Error: The FTIR system produces an amount of error in its test results. The webpage1 for the Agilent 4500 FTIR system that is being used by Substance has white papers concerning applications of the tool in testing drugs. A paper on cannabinoid strength estimation2 describes how the system is used in testing various types of cannabinoid drug samples. The percent composition estimations in these applications are for THC and THCA, the psychoactive substances in marijuana. The measurement error in the results for different tests ranges from 0.8% to 6.0% for THC and THCA percentage estimates. These estimates of error were calculated manually by chemists in comparison to reference tests done using stationary and more robust chemical analysis systems. Performing these error calculations for clients and their unique drugs is not possible for the drug checking service in short time frames.

• Manual Signal Subtraction Process: Substance depends on a manual signal subtraction process to determine percent compositions instead of relying on the automated analysis produced by the FTIR. The automated analysis relies upon libraries of pure component signals to perform its analysis; however, the interactions and combinations of chemical compounds in real drug samples reduce the reliability of the automated process. Upon scanning the sample with the FTIR technique, the chemical analysts see a total-sample signal, which includes all of the compound signals combined. Chemical analysts then select compounds from a list of suggested compounds produced by the FTIR system. Each compound has a hit-score associated with it which indicates how well the software has identified that component using a score between 1-1000. The signals of selected compounds are then subtracted from the averaged signal, one after another, until the residual signal is reduced to noise. This process produces a list of compounds and percentages for those compounds. However, the order in which signal subtraction occurs can change which signals are available for selection in the following steps, which means there is non-determinism in the

1 https://www.agilent.com/en/products/ftir/ftir-compact-portable-systems/4500-series-portable-ftir
2 https://www.agilent.com/cs/library/applications/5991-8810EN_cary630_cannabis_application.pdf

list of compounds in the percent composition test results. Therefore, the experience of chemical analysts determines the quality of selections made, meaning that incorrect selections are possible.

• Sensitivity: A subtractive signal analysis process means that errors compound for subsequent selections. The compounded error means that although the limit of sensitivity of the FTIR may be quite low (good) in ideal conditions, practically speaking, anything that makes up less than 3-4% of the sample is difficult to estimate confidently. Compounds exist that are dangerous below this threshold, such as carfentanil. It is, therefore, difficult to assign percentages to compounds making up less than 3-4% of a drug sample.

• Missing Library Entries: The library of signals to which the software compares sample signals contains thousands of compounds. However, experimental, uncommon, or undiscovered compounds could be missing from the library of signals. These compounds would either be effectively invisible to the FTIR or labelled as unknowns.

• Combinations of Compounds: It is commonly the case that single compounds are dangerous at specific concentrations. However, two normally harmless compounds may be dangerous when combined. Successfully identifying these combinations depends on the FTIR system seeing both compounds, the chemical analyst producing a percent composition that contains both, and then recognizing the combination as potentially harmful.

The above list contains a diverse set of uncertainties to visualize in charts. I turn to the uncertainty visualization literature to understand how this uncertainty might be better characterized.
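As one illustration of the list above, the manual signal subtraction step can be sketched as a greedy loop. This is a toy model under strong assumptions (signals as plain vectors, least-squares scaling, a hypothetical noise floor), not the analysts' actual workflow or the FTIR vendor's algorithm; note how the greedy selection order is what introduces the non-determinism discussed above.

```python
def best_match(residual, library):
    """Return the (name, signal) library entry that correlates best with
    the residual signal. A stand-in for the FTIR's hit-score suggestions."""
    def score(entry):
        _, signal = entry
        return sum(r * s for r, s in zip(residual, signal))
    return max(library.items(), key=score)

def subtractive_analysis(total_signal, library, noise_floor=0.05):
    """Greedily subtract scaled reference signals until only noise remains.

    Returns (compound, estimated_fraction) pairs. The subtraction order
    changes which signals remain visible in later steps, and components
    below a few percent of the sample (cf. the 3-4% sensitivity note)
    would fall under the noise floor and go unreported.
    """
    residual = list(total_signal)
    total_energy = sum(abs(x) for x in total_signal)
    library = dict(library)  # local copy; entries are removed once used
    found = []
    while library and max(abs(x) for x in residual) > noise_floor:
        name, signal = best_match(residual, library)
        # Least-squares estimate of how much of this reference is present.
        num = sum(r * s for r, s in zip(residual, signal))
        den = sum(s * s for s in signal) or 1.0
        scale = max(num / den, 0.0)
        if scale == 0.0:
            break
        residual = [r - scale * s for r, s in zip(residual, signal)]
        found.append((name, scale * sum(abs(s) for s in signal) / total_energy))
        del library[name]
    return found
```

For a synthetic sample whose signal is 70% reference "A" and 30% reference "B", the loop recovers both compounds with those fractions; with overlapping real-world spectra, the first (possibly wrong) pick would distort every later estimate.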

6.2.1 Characterizing Uncertainty in Percent Composition Test Results

Uncertainty can be described in terms of type and extent. Kwakkel et al. describe four levels of uncertainty within decision support systems [30], which makes their work helpful in characterizing the uncertainty in percent composition test results. Level 1 is shallow uncertainty, wherein probabilities can be used to produce alternative

outcomes. Level 2 is medium uncertainty, wherein uncertain alternatives can be rank-ordered, but their relative likelihood is left unspecified. Level 3 is deep uncertainty, wherein ordering the likelihood of alternative outcomes is impossible. Level 4 is recognized ignorance, wherein alternatives cannot even be enumerated and being surprised is possible. The level of uncertainty present in Substance's drug checking data depends on the nature of the uncertainty within the data. Drug checking results generally contain both what Potter et al. [41] call aleatoric and epistemic uncertainty. Aleatoric uncertainty, or statistical uncertainty, represents unknowns that arise from chance variations in measurements. Epistemic uncertainty represents unknowns that arise from practical knowledge limitations or unmitigated deficiencies in the measurement process. In the case of percent composition test results, the FTIR system provides multiple non-deterministic outputs, changing percent-composition orderings of components, and identifies different components at the limit of sensitivity based on which part of the drug sample is in the scanner. These aleatoric uncertainties result in Level 3 deep uncertainty, where the ordering of alternative outcomes is impossible. Epistemic uncertainty also contributes to the uncertainty in percent composition test results. Substance could theoretically verify all of its findings through more intensive, and slower, chemical analysis processes. However, as mentioned before, service turnaround time requirements prohibit this comparative analysis, and the determination of precise error is therefore impossible. Thus, the service relies upon experiential estimation in the absence of other options.
The requirement that test results present forms of uncertainty to assist client understanding remains despite this fact, and so I combine the aleatoric and epistemic uncertainty estimations to arrive at recognized ignorance, or Level 4 uncertainty. According to my stakeholders, conveying aleatoric and epistemic Level 4 uncertainty to clients has been very challenging. Without a tool to assist communication, the service stakeholders could instead make changes to the nature of the service. To reduce uncertainty in test results, chemical analysts could potentially characterize and systematically account for some sources of error within test results. To more clearly describe this uncertainty, harm reduction workers could establish protocols for handling different harm reduction delivery scenarios. Additionally, to ensure clients have previous experience with uncertainty in data, the service could limit access to the service, or change the manner of service delivery in significant ways. However, to my stakeholders, none of the above approaches would be acceptable compromises

to make within a drug checking service which is responding to the opioid crisis, and must serve members of the general public. Instead, a method for representing an unquantified amount of uncertainty within the percent composition visual charts is needed: an unquantified uncertainty visualization.

6.2.2 Uncertainty Visualizations for the Public

A question to consider at this point is: can the public make use of uncertainty information in visualization-supported decision-making? In fact, research in uncertainty visualization has highlighted the risk of not including uncertainty representations in data-driven decision-making activities with the public [11]. In a qualitative study of interactions between uncertainty representations and decision making, Roth found that "decision-makers purposefully treat 'best available' information as 'best possible' and try to reduce a decision to a simple yes/no" [44, p. 326]. This simplification is relevant because drug usage decisions are complicated, and there is a chance that people who use drugs could treat our drug checking test results as the best possible information rather than the best available information. According to the stakeholders, whether or not to use a drug sample should never be simplified to a yes/no decision. Empirical evidence from Greis et al. [15] and Kay et al. [23] shows that the public does benefit from well-designed uncertainty visualizations. Correll declares that, as visualization design researchers, "We ought to visualize hidden uncertainty" [11], as visualizations are socio-political-ethical actions and visualization researchers must be cognizant of the effects of visualizations on their end-users. Based on the perspectives of the stakeholders, the visual report design will have socio-political-ethical impacts on the population of people who use drugs, and so this visualization research must account for these effects as much as possible.

6.2.3 Design Guidance for Visualizing Uncertainty

Olston and Mackinlay [39] outline the differences between bounded uncertainty and statistical uncertainty and describe a technique called ambiguation for displaying bounded uncertainty within proportional charts. Ambiguation, according to the authors, is useful for delimiting a space in a proportional chart in which the boundaries of segments fall. These ambiguation bands between proportional chart segments help the end-user understand that uncertainty exists within the data being presented, but

does not characterize the distribution of the error within the band. In some sense, ambiguation is an indication of uncertainty that does not display the degree of uncertainty. In their article, Olston and Mackinlay display ambiguation in pie charts and stacked bar charts with multiple bars in examples of bounded uncertainty, showing the appropriateness of these visualizations for displaying percent-of-whole data and an unusual form of uncertainty. This design idea supports the conceptual feasibility of including bounded uncertainty in pie and cake-style charts; the challenge is now how to introduce unquantified uncertainty in pie and cake charts. In their systematic literature review of assessments of uncertainty visualizations, Kinkeldey et al. [25] introduce their concept of the Uncertainty Visualization cube (UViz3). This cube has three dichotomies as its dimensions: intrinsic/extrinsic, coincident/adjacent, and static/dynamic. These dimensions were selected based upon their review of uncertainty visualization literature and practical considerations within visualization design. The first dimension, intrinsic/extrinsic, describes whether designs rely on existing symbology to represent uncertainty or add additional symbols. The second dimension, coincident/adjacent, describes whether designs represent data and uncertainty within the same view or in separate views. The third dimension, static/dynamic, describes whether designs use animation or not. These dimensions describe a design space in which existing uncertainty visualizations can be spatially categorized and evaluated. Additionally, areas in the design space can be identified as desirable for uncertainty visualizations to fall within, and then uncertainty visualizations can be designed to align with the characteristics of those areas. UViz3 helped me choose which parts of the pie and cake charts to modify.
The requirements of the project indicated that intrinsic, coincident, and static uncertainty visualizations would be most suitable. Intrinsic designs were desirable because they indicate uncertainty within the existing chart components, and because adding additional symbology would complicate the charts. Using intrinsic symbology helps deepen the visual connection between the uncertainty and the proportions which contain it. Coincident designs were desirable because there is only one representation within the visual report for displaying percent composition results: the proportional chart. Keeping visuals simple and intuitive was also a factor.

Static designs were desirable because, although the digital version of the visual report could support animations, the paper version could not. The requirement that the digital and paper versions of the visual report be the same meant that animations could not carry important data. Skau and Kosara's research [48] on which parts of pie charts are important in conveying proportional information indicates that, compared with arc length and area, angle is the least important characteristic of pie charts. Therefore, the line representing angle is a candidate for presenting uncertainty. This line is also present in the conceptually symmetrical cake chart as the horizontal line between segments, along with nearly all other parts of the pie and cake charts. Design ideas translate easily between the two charts, as a cake chart is essentially a linearized form of the pie chart. This means that a similar line exists within the cake chart that could be re-utilized for the same purpose as within the pie chart. Olston and Mackinlay's ambiguation concept [39] also inspired the idea that an indication of unquantified uncertainty could be integrated into both the pie and cake chart designs directly. In all, the above literature indicates that:

1. forms of uncertainty beyond statistical uncertainty can be visually represented within proportional charts,

2. there is a target area within UViz3 within which uncertainty visualization designs should fall if they are to suit my problem space, and,

3. there are some aspects of proportional charts which affect the baseline functionality of the chart more than others, and which are therefore better candidates for manipulation to represent uncertainty.
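Olston and Mackinlay's ambiguation idea can be sketched for a cake chart as bands around segment boundaries. The function and the fixed band width are hypothetical; real ambiguation band sizes would come from bounded-error data rather than a constant:

```python
def cake_segments_with_ambiguation(percentages, band=3.0):
    """Given segment percentages (summing to 100) for a cake chart, return
    the (start, end) extent of each segment along the percent axis, plus
    ambiguation bands of +/- band/2 around each internal boundary.
    Illustrative sketch of Olston and Mackinlay's ambiguation concept."""
    bounds = []
    cum = 0.0
    for pct in percentages:
        bounds.append((cum, cum + pct))
        cum += pct
    # Internal boundaries only: the 0% and 100% ends of the chart are fixed.
    bands = [(end - band / 2, end + band / 2) for _, end in bounds[:-1]]
    return bounds, bands
```

A renderer could then draw each band as a blurred or hatched strip, signalling that the true boundary between adjacent segments lies somewhere inside it without stating a distribution.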

This leaves the problem of how to generate candidate designs which solve our target problem using these charts. To do so, I used an application of Bertin's visual variables [7] (and extensions) to uncertainty visualization, as Kunz [29] and MacEachren [35] have done previously, to identify which visual variables to manipulate to convey unquantified uncertainty. Conversations with my visualization researcher colleagues about the challenges I was facing, and Bertin's concept of breaking charts down into visual marks and visual

variables, led to the creation of a formal design space for introducing unquantified uncertainty to percent composition charts.

6.3 Unquantified Uncertainty Design Space

From Schulz et al. [46], we get a clear definition of what a design space is. Design spaces are hypothetical constructs based on the idea that the whole is the sum of its parts. By deconstructing an artifact into its parts and reconstructing it with modifications to those parts, it is possible to create what was not created before and to better understand the original artifact. Design spaces are useful when a visualization researcher wants to formalize their implicit design decisions and improve the communication of sets of design decisions to others. Design spaces are described by design dimensions, with nearly infinite design options available for each dimension. Points in the design space are tuples of design choices along each of the dimensions, and a researcher explores a design space by making sets of design choices. I developed the following process to describe the unquantified uncertainty visualization design space, based on the abstract process followed by Schulz et al. [46].

1. Breakdown: I choose a breakdown of the charts into parts.

2. Dimensions: I describe the dimensions of the design space based on those parts.

3. Systematic Exploration: I describe the design space as a whole by using tuples of design decisions in each dimension.

4. Application: I apply the design space to generate candidate designs specific to the application domain. Then, later in this thesis, I:

5. Assess: Assess the completeness and consistency of the design space (Chapter 10), and,

6. Evaluate: Evaluate the candidate designs within the context (Chapter 11).

6.3.1 Preliminaries

Although well-constructed design spaces are not limited to producing designs for a single design problem, contextual requirements must play a role in how the design space is utilized to solve a specific problem. The requirements gathering processes I followed in this design study led to numerous requirements and acceptance criteria; the detailed list is in Chapter 4. I outline the general design principles, derived from the requirements, that guide the creation of this design space as follows:

1. Introduce unquantified uncertainty into the pie and cake charts without disrupting the usefulness of the charts as compared to their unmodified states.

2. Help clients understand that the percent composition test results contain unquantified uncertainty and that they should think critically about the implications of the data.

3. Intuitively indicate uncertainty within the existing visual marks of the charts without additional symbology.

6.3.2 Step 1 - Breakdown

I broke down the pie and cake charts in similar ways. It was possible to break down both charts together because of the structural symmetry between the pie and cake charts. I decomposed the proportional charts using a logical spatial hierarchy of abstraction. At the highest level of abstraction, two elements compose the proportional charts: a segmented chart and a percent axis. The segmented chart represents the proportions within the data using its spatial characteristics. The percent axis assists in the estimation of proportions by indicating a scale from zero to one hundred along one side of the segmented chart. It is possible to further decompose both the segmented chart and the percent axis into their spatial components. The resulting spatial components can be described as visual marks, following Bertin's seminal research [7]. Bertin's work also describes how each visual mark possesses visual variables [7]. Visual variables are the modifiable characteristics of visual marks; examples are line colour, line width, and line style. From these charts, I produced six visual marks from the segmented chart and two from the percent axis, and I show this decomposition in

Figure 6.2. Further decompositions of the charts result in segmenting visual elements in illogical ways that break apart intuitively atomic portions of the charts, and so I stopped here.

Figure 6.2: Decomposition of the pie and cake chart into visual marks.

6.3.3 Step 2 - Dimensions

Together, visual marks and their visual variables make up the dimensions of the design space. These visual marks each possess a set of unique and shared visual variables. The visual variables control the visual appearance of the visual marks. Visual variables also possess meta-variables. An example of a meta-variable is the degree of manipulation a variable is experiencing; for example, line width can be doubled, tripled, or more, and this represents the extent of the manipulation of the line's width visual variable. Bertin refers to the range of possible manipulation for a visual variable as its resolution [7]. How modifiable a visual variable is (its resolution) is a relevant design decision because the degree of manipulation impacts other visualization characteristics. Therefore, the number of individual dimensions in this design space is the number of visual variables from each visual mark added together, multiplied by the extent of modifications to each visual variable. This multiplicity creates a huge design space, as combinations of manipulations are also possible.

One navigates the design space by manipulating the visual variables of each of the visual marks by a chosen degree. The tuple of design dimensions, and the possible design choices for each of those dimensions, describe the design space. I generated six shared visual marks from decomposing the pie and cake charts.

• Boundary Edge Marks: The edge which indicates the separation of chart segments perpendicular to the percent axis.

• Magnitude Edge Marks: The edge which indicates the size of the segment parallel to the percent axis.

• Label Marks: The textual labels which contain segment information.

• Area Marks: The spaces contained within the shape formed by the boundary and magnitude edge marks.

• Axis Marks: The regularly spaced markings and text which run along the magnitude edge marks to assist in segment comparisons.

• Chart Legend: The legend which commonly contains combinations of area mark colour and label mark information.

Each of the visual marks possesses visual variables that can be manipulated to create changes in the visual chart. Examples of visual variables that could be modified include colour, width, and style.

6.3.4 Step 3 - Systematic Exploration

Once I had determined the dimensions of the design space, I was able to explore it systematically. During this exploration I chose a dimension and tried out as many variations of the visual marks and visual variables as I could in order to flesh out my understanding of that dimension. I repeated this process for each dimension in the design space, as well as for combinations of changes to multiple dimensions and different extents of modification, in order to understand the capabilities of the design space. I present some of this exploration process in Figure 6.3. This design space exploration began as an emergent and generative process, but as my understanding of the design space increased I was able to begin identifying better and worse design concepts. For example, better design concepts would avoid

Figure 6.3: Examples of low, medium and high manipulations to individual visual variables of individual visual marks.

disturbing the baseline functionality of the charts, whereas worse design concepts would add unnecessary visual clutter. The baseline references that I measured design concepts against were the unmodified pie and cake charts. These charts have well-established levels of usability, readability, and preferability, and maintaining those characteristics as much as possible is a requirement for the resulting designs to be effective in the problem application.
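The dimension-by-dimension sweep described above can be sketched as enumerating every (mark, variable, degree) tuple, then pairing tuples to form compound manipulations. The mark and variable names here are illustrative assumptions, not the actual design-space inventory.

```python
# Sketch: a systematic walk through a (deliberately small) design space.
# Every (mark, variable, degree) tuple is one candidate manipulation;
# pairs of tuples give compound manipulations.
from itertools import combinations, product

marks = {
    "boundary_edge": ["colour", "width", "style"],
    "label": ["font", "size"],
}
degrees = ["low", "medium", "high"]

# Single manipulations: one visual variable of one mark, at one degree.
singles = [
    (mark, var, degree)
    for mark, variables in marks.items()
    for var, degree in product(variables, degrees)
]

# Compound manipulations: combine two single manipulations that target
# different (mark, variable) pairs.
compounds = [
    (a, b)
    for a, b in combinations(singles, 2)
    if (a[0], a[1]) != (b[0], b[1])
]

print(len(singles), len(compounds))
```

Even this toy space of two marks and five variables yields dozens of compound candidates, which illustrates why the exploration had to be pruned against the baseline charts.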

6.3.5 Step 4 - Application

My exploration of the design space highlighted dimensions of the charts that would be useful in meeting the requirements of my design goals. The design decisions described here were filtered by considering how the designs might be understood by an end user without guidance, and through consultation with my visualization colleagues about their impressions of the design ideas. Changes to the label mark's visual variables were closely related to the proportion data, as labels literally describe it. However, the most direct manipulation of the label, that of presenting ranges of percentage values, would suggest a specific amount of uncertainty when in reality there is an unknown amount of uncertainty. Additionally,

adding a tilde, italicized text, or underlining could be misunderstood as simple font style decisions rather than indications of uncertainty. Changes to the magnitude edge mark's visual variables, examples of which I show in Figure 6.3, could perhaps indicate that something is strange about the entirety of the data, as the effect is seen parallel to the 100% axis. However, the changes I explored did not hint at uncertainty specifically within the proportions of segments. Changes to the area mark's visual variables were not rated as successful either. Fill textures composed of question marks could indicate uncertainty in what a component is, as opposed to uncertainty in the amount of that component, and colour fills similar to Olston and Mackinlay's [39] ambiguation suggest specific amounts of uncertainty. Ambiguation is for presenting bounded, or known, amounts of uncertainty, which it indicates using the width of the band of ambiguation at the segment boundaries. These, and other, area mark design ideas did not present the intended concept effectively. The most promising design concept I generated used the boundary edge marks, which delineate segments of the proportional charts. Kosara's work [26] on whether arcs, angles, or areas are most important for understanding pie and doughnut charts indicates that angles are the least important aspect of those charts for understanding segment size. Since the boundary edges are what make up the angles of pie charts, they are good candidates for modification to convey unquantified uncertainty without negatively impacting chart readability. Since boundary edges also exist in the cake charts, the same modifications have the same implications within that chart. Modifying the boundary edges in ways that suggest uncertainty without suggesting a scale of modification was ultimately my best option, and I show some examples in Figure 6.3.
With desirable manipulations identified in the design space, it is possible to begin combining modifications to generate new ones. Combinations of manipulations can be created within a single visual variable, or across visual marks. Figure 6.4 shows an example of two modifications within the same visual mark combined to generate a third modification style.

Figure 6.4: An example of combining manipulations to individual visual variables to create new, compound manipulations.

With a sense of which visual mark modifications might be useful in the context, I wanted to explore visual variable modifications using some test data. I used an arbitrary test result of 45% heroin, 27% caffeine, 20% sugar (mannitol), and 8% fentanyl as my test data. I then generated design ideas that could capture the symbolic nature of the uncertainty and draw client attention to the fact that it is unquantified. Figure 6.5 shows two examples of manipulations of the boundary edge in both the pie and cake charts. The first, on the left, uses a zig-zag line in place of the regular straight line, and the second uses a dotted straight line in place of the regular straight line.

Figure 6.5: Applying a zig-zag line and dotted line modifications to the pie and cake charts to generate design alternatives.
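As a rough sketch of how such design alternatives could be generated programmatically, the snippet below computes the boundary positions of a cake chart from the arbitrary test composition above, and produces a zig-zag polyline to stand in for each straight boundary edge. The chart geometry (height, amplitude, tooth count) is an assumption for illustration.

```python
# Sketch: cake-chart segment boundaries and zig-zag boundary polylines,
# using the arbitrary test composition from the text. Geometry values
# (height, amplitude, teeth) are assumed, not from the thesis.
composition = [("heroin", 45), ("caffeine", 27), ("mannitol", 20), ("fentanyl", 8)]

# Cumulative positions of the inner boundary edges along the 0-100
# percent axis (the edge at 100 is the chart border, not a boundary).
boundaries = []
total = 0
for _, pct in composition[:-1]:
    total += pct
    boundaries.append(total)

def zigzag(x, height=30.0, amplitude=1.5, teeth=6):
    """Polyline points replacing the straight boundary at position x."""
    points = []
    for i in range(teeth + 1):
        y = height * i / teeth
        offset = amplitude if i % 2 else -amplitude
        points.append((x + offset, y))
    return points

print(boundaries)           # positions of the three inner boundaries
print(zigzag(boundaries[0])[:2])
```

Swapping `zigzag` for a dashed-line generator would produce the dotted-boundary alternative the same way, so each boundary style is just a different polyline function over the same boundary positions.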

By hierarchically combining modifications I generated hundreds of design alternatives for the problem application. This design space was used to generate design candidates that were presented to stakeholders in the design feedback survey. The survey presented the choice between the proportional charts and the manner of displaying unquantified uncertainty within that chart as two separate decisions. I include screenshots of the designs used in the design feedback survey in the Appendix, as well as in Figure 6.5. Stakeholders strongly preferred the zig-zag line unquantified uncertainty design concept applied to the boundary edge mark of the cake chart, which is the second chart from the left in Figure 6.5. Therefore, the cake chart with zig-zag segment boundaries is the chart design used to represent percent composition within the report. With the design challenge of visualizing unquantified uncertainty in percent composition complete, I now describe design goal three: presenting confidence in component composition.

Chapter 7

Design Goal Three: Visualizing Confidence in Component Composition

The previous chapter covered design goal two, where I visualized uncertainty in percent composition. The purpose of design goal three is to integrate visuals of confidence in component composition data into the tabular chart.

7.1 Confidence in Component Composition

In the previous chapter I described how Beard and MacKaness [5] delineate three levels of indications for data quality assessments within decision-making contexts. I categorized presenting uncertainty in percent composition as an indication-level measure, which merely suggests to end users that a data quality problem exists. Confidence in component composition, however, I classify as Beard and MacKaness' quantification level of data quality indication. This means that I must indicate both the nature and the extent of the data quality problem for clients to assess. In the case of confidence data, the nature of the data is the hit-scores that each machine produces. Each machine's hit-score uses a different scale and is generated in a different way, as described in Chapter 3; however, they all produce the same abstract component identification quality information. The extent of the data is the hit-score value, or quality of the identification, for each component by each system. For presenting both uncertainty and confidence in drug checking test results, I

classify end users, the clients, as falling within Beard and MacKaness' decision-making context. This, in turn, means I must ensure my designs reflect the specific needs of my end users in their particular use case: I must bring together the requirements, context, and literature I have gathered with the customized chart design and confidence data to generate a table design that satisfies all the constraints. The primary design outcomes my stakeholders are interested in from this design are threefold: enable chemical analysts to present an overall interpretation of the component composition; bring clients closer to the data to empower their decision-making; and enable a conversation about discrepancies between the test results from different chemical analysis systems. The first design constraint is the tabular chart I selected and customized in collaboration with my stakeholders, which I show in Figure 7.1 and describe in detail in Chapter 5.

Figure 7.1: The component composition chart, with confidence data cells indicated in red and legend indicated in green.

The second design constraint is the actual confidence data with which I populate the chart. The confidence data scales, or hit-scores as they are also called, for each chemical analysis system are presented in Chapter 3, with each hit-score scale serving a similar purpose: indicating the system's confidence in identifying a particular component. However, the actual hit-score data values are not stored within the test results database at this time in Substance's research project. This means that any confidence visualization cannot rely on the data to automatically generate the confidence indicators inside the chart; instead, they would need to be set by the

chemical analysts. Beyond the values not being in the database, manually setting indications is required because chemical analysts must interpret hit-scores from each scale differently, summarize across the hit-scores to populate the "Overall" column in the chart, and account for states outside the hit-score scales. During my requirements gathering processes, additional states beyond confidence were identified which are important for chemical analysts to be able to show to clients. For example, hit-scores cannot capture when a machine is incapable of seeing a drug component, as hit-scores are only generated when systems do see a component. Previous requirements, such as having a black and white design and avoiding stigmatization of clients, still apply, which means that this design challenge must satisfy relatively tight design constraints.

7.1.1 Characterizing Confidence in Component Composition Test Results

I start the process of characterizing confidence data by describing the differences I see between confidence and uncertainty. I describe the information contained within hit-scores as confidence instead of uncertainty because the ratios approach 1.0 as uncertainty decreases. For example, a hit-score of 150/1000 from one of the chemical analysis systems is a value of confidence of identification; if the ratio were instead one of uncertainty, the same machine would state 850/1000 as its score for the same test run, with a ratio of one indicating complete uncertainty. Thus, I consider hit-scores to be approximations of confidence and describable as the mathematical complement of uncertainty for this study. Because of this complementary nature of confidence and uncertainty, I reapply Kwakkel et al.'s four levels of uncertainty [30] and Potter's descriptions of aleatoric (statistical) and epistemic (methodological) uncertainty [41] for the purpose of better understanding confidence in this design study. Confidence, just like uncertainty, can be described in terms of type and extent. The chemical analysis systems provide hit-scores which estimate the confidence of identification. It is the practical requirements of the service which prevent the hit-scores from being entered into the database at this time for direct utilization in visuals, and which prevent detailed cross-analyses of the confidence of each component identification. Therefore, the challenges of confidence

data are more easily attributed to epistemic shortcomings than to aleatoric origins. As for the extent of confidence, the hit-scores generated by each system are a direct measure of the extent of the confidence of identification of a component, which means that for each test result chemical analysts can establish confidence quite clearly. However, the table chart as a whole includes all five chemical analysis systems' results, with each system having a different hit-score scale. Having hit-scores helps chemical analysts understand the test results, but perfectly accounting for differences between the scales would be difficult. Therefore, I classify the confidence data as a whole as Level 2, or medium, uncertainty. With a clearer understanding of the confidence data being presented in the table chart, I then explored confidence visualization literature for design guidance.
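The complement relationship between confidence and uncertainty described above can be sketched in a few lines. The per-system scale values below are hypothetical; the thesis notes only that each system's scale differs (Chapter 3).

```python
# Sketch: hit-scores as confidence, with uncertainty as the mathematical
# complement. System names and scale maxima are assumed for illustration.
scales = {"system_a": 1000, "system_b": 100, "system_c": 1.0}

def confidence(system, hit_score):
    """Normalize a hit-score to a 0-1 confidence value."""
    return hit_score / scales[system]

def uncertainty(system, hit_score):
    """Uncertainty as the complement of confidence."""
    return 1.0 - confidence(system, hit_score)

# A hit-score of 150/1000 is low confidence, high uncertainty.
c = confidence("system_a", 150)
u = uncertainty("system_a", 150)
print(c, u)
```

Note that this normalization glosses over exactly the difficulty raised above: each system's hidden heuristic and thresholds mean equal normalized values are not truly comparable across systems, which is why analysts interpret the scores rather than the report computing them.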

7.1.2 Design Guidance for Visualizing Confidence in Component Composition

Visualization literature most commonly refers to confidence in terms of confidence intervals as a form of uncertainty [1, 10, 34, 40], as a measure of visualization performance with end users [8, 25], or both [35]. Potter et al. [40] describe how uncertainty information is composed of error, accuracy, and confidence, and explore how to better depict various forms of uncertainty data using their so-called summary plots. In Potter et al.'s work they are specifically referring to confidence as confidence intervals and statistical calculations of probability distributions, which have a purely mathematical description. Confidence intervals are different from the confidence in component composition data that I am presenting: each hit-score has a hidden underlying heuristic which the chemical analysis system uses to generate the hit-score, as well as system-specific thresholds which qualitatively indicate how hit-scores should be interpreted. Further differentiating confidence intervals from confidence in component composition is the previously mentioned state of a machine not being able to identify some components, which must be captured within the same table cells of the component composition chart. One example of visualizing confidence other than confidence intervals is Kumpf et al.'s [28] work exploring clustering weather forecast projection data to depict the confidence meteorologists have in weather outcomes. In these confidence visualizations, designs overlay different potential forecasts onto a single map, with areas of high

predictability and low predictability becoming more evident than in single-forecast visualizations. Unfortunately, this confidence visualization research requires an expert-level understanding of weather data, and only general takeaways from research like this translate to the non-expert level of confidence visualization required in my context. In the other category of confidence and visualization is an example of using confidence as a measure of visualization performance in Riveiro et al.'s [42] study, where they use measures of this type of confidence when designing a decision-making tool that visually presents uncertainty in data while identifying targets. In their paper, the authors present study subjects with uncertainty visualizations and ask them to gauge their confidence in the decisions they are making using the visualization. The outcomes of this application of confidence in visualization research are helpful when trying to determine what visuals help users feel confident in their choices, but do not apply to visualizing confidence directly. Directly useful literature on presenting Level 2 epistemic confidence seems limited based on my exploration of the literature. Instead, I was able to use design guidance from uncertainty visualization literature due to the similarities between confidence and uncertainty. For example, in MacEachren et al.'s [35] seminal empirical work on visual semiotics used to depict uncertainty in visualizations, they explore fundamental techniques to introduce uncertainty into the existing dimensions and symbology of visualizations. MacEachren et al. empirically explore how well representative instantiations of fundamental techniques perform in accuracy, precision, and trustworthiness in the spatial, temporal, and attribute dimensions. Again, I reapply this uncertainty visualization literature as design guidance for depicting confidence in the table chart.

7.1.3 Generating Confidence Indicator Design Alternatives

For visual consistency, it is desirable for each machine's results to be presented using the same design language. As described in Chapter 3, Table 3.2, each system's component composition test results consist of lists of ingredients with hit-scores associated with each component found. Each drug checking system uses a different hit-score metric, but the visual report must abstract these into a common indication of confidence. The design guidance and design constraints I gathered suggested keeping confidence indications simple and the chart as unmodified as possible. The first

attempt to generate an indication of confidence used a sliding bar, which showed the level of confidence versus the level of uncertainty, as seen in Figure 7.2. This design concept directly depicts the complementary nature of confidence and uncertainty: as the bar moves further towards one end of the slider or the other, it indicates more or less confidence. However, emergent design constraints invalidated this design direction. I learned that I would only be able to use black and white in the designs, and this design concept is difficult to understand without colour. Additionally, the numerical values which would automatically populate a chart like this are not available within the database, so they would have to be set manually, which is not ideal. As well, design feedback that I gathered from the stakeholders indicated that green and red could be misunderstood to imply positive and negative, and therefore be potentially stigmatizing, and so this design idea was considered invalid.

Figure 7.2: Component composition confidence indication using a linear scale which balances between green confidence and red uncertainty.

Instead, the chart must be black and white, use manually set indications, and not stigmatize the clients with its visual language. A discontinuous scale of values was what my stakeholders indicated they could use instead of the linear scale of hit-scores; a stateful scale simplifies inputting and describing the data for my stakeholders. I created designs around three and five levels of confidence, and also added a detection-not-possible state. Figure 7.3 shows how the combination of a set of icons and a legend below the table chart can be used to present stateful confidence in component

composition data. This stateful icon scale uses intuitive icons to visually indicate the nature of the component composition confidence data.
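The discrete, manually set scale described above could be modelled as an enumeration of states per table cell. The state names and count below are assumptions based on the three/five-level scales and the detection-not-possible state discussed in the text.

```python
# Sketch: discrete confidence states set manually by a chemical analyst
# for each (component, system) cell of the table chart. State names are
# illustrative assumptions.
from enum import Enum

class Confidence(Enum):
    DETECTION_NOT_POSSIBLE = "detection not possible"
    LOW = "low confidence"
    MEDIUM = "medium confidence"
    HIGH = "high confidence"

# Cells are keyed by (component, system); values are analyst-set states,
# since hit-scores are not in the database and each system's scale
# requires expert interpretation.
cells = {
    ("fentanyl", "system_a"): Confidence.HIGH,
    ("fentanyl", "system_b"): Confidence.DETECTION_NOT_POSSIBLE,
}

print(cells[("fentanyl", "system_a")].value)
```

A stateful model like this maps directly onto the icon legend: each enum member corresponds to one icon, so rendering a cell is a lookup rather than a threshold computation.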

Figure 7.3: Component composition confidence indication using a multi-state iconographic black and white icon scale.

As I show in Figure 7.3, I explored a number of icon scales, receiving both positive and negative feedback from my stakeholders during the design feedback survey and design feedback meetings. The final selection was the icon scale indicated in red in Figure 7.3, and I show this scale in use with the table chart design and artificial test data below the icon scales for reference. Both uncertainty in percent composition and confidence in component composition are now represented in visually simple ways within the selected charts, as required by design goals one, two, and three. Next I describe how the remaining six design goals were satisfied.

Chapter 8

Additional Report Design Goals

In the previous three chapters I described the design of charts for presenting uncertainty in percent composition and confidence in component composition. Those designs satisfied design goals one, two, and three. In this chapter I describe how the remaining six design goals for the rest of the visual report were satisfied through design iterations. For each design goal description, I provide sets of transitional descriptive design images. These images include artifacts used in gathering feedback from stakeholders and help show how design ideas evolved between design iterations. I include a description of how requirements and feedback impacted the design with each image set. The rightmost image within each figure is the design concept that ultimately satisfied the design goal.

8.1 Design Goal Four: Digital and Handout Reports Must be the Same

The service staff stakeholders can have a hard time connecting the outcomes of the service with events which occur within the harm-reduction conversation. This challenge was a significant motivator in their requests for a visual drug checking report: it would allow them to anchor their conversations using a test results artifact. However, clients take their drugs and test results outside of the drug checking service and may use their drugs at a later time. Clients using reports outside of the service means that the harm-reduction conversation has to be memorable in order to be effective. Having a copy of the visual

report go with clients outside of the drug checking service is one way to enhance the effects of the harm-reduction conversation. In this way, when clients are making their drug use decisions, they have the same artifact that they used during their visit to the drug checking service. However, this means that both the digital and paper versions of the report must be visually similar and present the same data the same way. Printing off a report would be an effective method of delivering copies of the visual report to clients. However, Substance is an onsite and mobile service with multiple locations, so the printers used are black and white laser printers for reliability and low cost. Using simpler printers means that if the printed report can only show black and white, then so too should the digital report if the visual reports are to be visually similar and, therefore, memorable. Making the digital reports black and white means that the service stakeholders can print the paper reports on any printer from a PDF or screenshot of the digital report, and no unique conversion is necessary to make it readable. Creating a black and white digital report also precludes issues with stigmatizing colour usage (green has positive connotations and red has negative connotations, according to my stakeholders). It also dramatically reduces the design decisions that I can and need to make while satisfying the rest of the design goals. As described in the previous three chapters, the percent composition and component composition visual charts cannot use colour channels to convey uncertainty and confidence information. Alerts about fentanyl content within the report cannot be coloured red if tests are positive for fentanyl, or green if not. This requirement means that designs must rely upon visual marks and visual variables that are usable in black and white.
Grey-scale was not considered a meaningful design option to convey actual data properties because of the potential unreliability of producing the tones with the printers. Thus the digital and resulting printed reports only convey data using black and white, with grey-scale used only in section layout and absolutely no colour in the designs, as seen in Figure 8.1.

Figure 8.1: Early visual report design with colour, which eventually became the black and white final report design.

8.2 Design Goal Five: The Visual Report Must Present Basic Drug Checking Service Information

Drug checking meetings saw many design discussions about increasing the positive impact of the drug checking service. Because visual reports are to be handed out

to clients, stakeholders believed that including service information could improve the name recognition of the service and encourage more clients to use the service. Clients who had received test results might have conversations with other people who use drugs and talk about the service using the printed report. Having printed reports which include this information also helps people who are living on the streets and do not have internet access to find service hours or locations. The information that the stakeholders believed would be useful to include was the name of the service, the website of the service, and the locations and hours of the service. As seen in Figure 8.2, there were no previous iterations, as stakeholders only requested this in the last feedback session.

Figure 8.2: Design elements that display the service information in the report.

8.3 Design Goal Six: The Visual Report Must Present Descriptors of the Drug Sample

Drug samples vary in appearance, consistency, and delivery method. Powders, pills, paper strips, pebbles, and rocks are all form factors that drug samples come in. Drug sample colours can vary widely, and some samples include differentiating manufacturing markings when in pill form. However, it is often the case that multiple nondescript white

powders in plastic bags are brought in by a single client. Service staff stakeholders frequently serve clients who bring in multiple bags of white powders on behalf of multiple other people, as well as themselves. The challenge here is how to keep the connection between the visual report and the drug sample strong, as mix-ups could cause overdoses. It is, therefore, vital to all of the stakeholder groups that test results and drug samples are closely connected. Physically stapling samples to printed paper reports is certainly one way to keep drug samples and test results connected. Additionally, including a section in the visual report dedicated to identifying the drug sample that produced the test results is a way to reinforce this connection further should the printed report and drug sample become separated. However, there are some design challenges in presenting this information within this design context. The natural element to introduce is a picture of the drug sample. A photo would help clients differentiate between the grain sizes of powders or colours, or match bag shapes or types. The leftmost iteration within Figure 8.3 does this; however, it was problematic precisely because it included a photo of the sample. Including a photo of drug samples on the report was discussed at length by my stakeholders because, while it enables clients to match bags to their reports, it also potentially facilitates criminalization of people who use drugs. As the report was eventually made black and white, these images also had to be black and white, further reducing their utility. A subsequent design iteration replaced the image with a categorical image depicting the type of drug sample tested. I produced powder, pill, and paper tab iconographic placeholder designs in black and white. However, these were not descriptive enough to make matches between the drug samples and the test results on their own.
Thus, the final design option was to describe the sample textually. It is possible to place a meaningful description there, one that makes it more difficult to positively identify the sample should the client be arrested. Thus, drug sample identifiers were added, as shown in Figure 8.3.

Figure 8.3: Photographic, iconographic, and textual design iterations for displaying sample identification.

8.4 Design Goal Seven: The Visual Report Must Highlight Fentanyl in the Test Results

Producing useful fentanyl test results for clients is a primary objective for Substance during the opioid crisis. How to present these test results was a deeply discussed and at times divisive design decision. During design feedback meetings, stakeholder opinions spanned the full spectrum from “we should not even especially indicate fentanyl” to “we should put a skull and crossbones next to it.” A dedicated area in the visual report was set aside for the fentanyl test strip results because of how important they are for drug use decisions. This area would present a “Positive” or “Negative” test result for fentanyl's presence in the drug sample. Additionally, fentanyl results were to be indicated in the component composition and percent composition charts. In Hall et al. [16] the authors formalize methods of emphasizing data in data visualizations. The authors survey and present emphasis techniques that have been used within visualization literature, classify these techniques according to their temporality and whether they are implicit or explicit, and provide guidance for generating new emphasis techniques. The following requirements, as well as the global requirements found in Table 4.1, dictate the characteristics of a valid design solution for presenting fentanyl test results.

• Fentanyl test results are to be presented as a prominent standalone result, within the component composition table and within the percent composition chart.

• The same emphasis technique must work across all three locations.

• The emphasis technique can neither rely upon colour nor animation.

• The emphasis technique must be effective when printed in black and white on paper and be visually similar in digital and printed formats.

According to Hall et al.’s classification of emphasis techniques this limits design op- tions to non-temporal extrinsic emphasis techniques. Additionally, the emphasis tech- nique must work within two different data charts and a textual data field. Of the 16 intrinsic visual variables Hall et al. identified during their survey of emphasis techniques, only static size, position, texture, orientation, and shape varia- tions can be considered. Of these five remaining emphasis options, changing the size, orientation or shape of fentanyl segments would likely break the baseline function- ality of the cake chart, and changing the position of the textual data field relative to nothing does not emphasize it. Adding a texture to the fentanyl segment in the cake chart could have been helpful if these segments were going to be larger than the common maximum of 5% of composition, and if adding a texture to the data field was possible. Instead, an exploration of extrinsic emphasis was conducted wherein additional symbology was added to the fentanyl data presentations. In an example of providing extrinsic indications to emphasize important data, Hellier et al. [17] describe successfully using shapes and colours on drug labels to indicate drug strength. Their work was inspiring; however, I needed to reapply their concepts to work in black and white. I also reviewed work by Lamy et al. [32] showing that iconographic languages can be used to convey medical conditions effectively. Their work supported the idea that icons could be used to describe concepts related to medical concerns using extrinsic symbols. I explored existing icon sets such as FontAwesome1, Material Design2 and Haw- cons3 for icon ideas. I targeted the resulting set of icon ideas at alerting clients to fentanyl test results locations throughout the report. The fentanyl data field was modified to includes a description of an icon that tracks other fentanyl test results throughout the visual report. 
Should the sample not contain any fentanyl, the icon is not present anywhere in the visual report except in the description of what the icon means. I initially chose the alert symbol shown in Figure 8.4; however, during stakeholder feedback, I learned that the alert symbol could be too stigmatizing, as a portion of the population actively seeks fentanyl as their preferred drug. Instead, the report required an icon that would draw attention without adding stigma to fentanyl use. The design needed to invoke a ‘seen’, ‘identified’, or ‘look’ response, as opposed to an ‘alert’, ‘alarm’, or ‘yield’ response. It was more important for the icon to be highly noticeable throughout the report than to convey meaning. I proposed the eye icon in a subsequent design meeting, and the stakeholders preferred this design. Figure 8.4 shows the design’s progression.

1 https://fontawesome.com/
2 https://material.io/design/
3 http://hawcons.com/

Figure 8.4: Highlighting fentanyl results throughout the visual report. Red highlights the locations in the report where the indicator was placed.

8.5 Design Goal Eight: Chemical Analysts Must be Able to Interpret the Test Results

There are multiple drug checking systems and, therefore, multiple sources of error and uncertainty, as discussed in previous chapters. It is crucial, therefore, to present clients with a summary or overall interpretation of the test results. Part of this overall interpretation is present within the component composition’s Overall column. The chemical analyst is responsible for filling that out with their overall estimation of component composition results. The stakeholders indicated that there is also frequently the need to present a qualitative interpretation of the entire drug checking test result.

I explored including purely qualitative scales that the chemical analyst could use to indicate their impressions of the drug sample. Scales that I considered including were “sample strength” and “overall confidence.” Early designs presented these as iconographic scales along which sliders could point, following straight or curved paths. This design language was used to help contextualize the test results within the broader drug checking results seen by the chemical analysts, and to help clients remember the chemical analyst’s impressions of the results once they left the service.

In the design feedback survey, the qualitative sample strength scale was considered useful but the overall confidence scale was not. The stakeholders appreciated the sample strength scale’s ability to help clients make drug use decisions, but were concerned that people with different drug use histories might misinterpret the implications. For example, a sample indicated as weak could still be harmful to inexperienced drug users. The possibility of someone assuming that a drug that is relatively weak compared to the average is therefore safe to consume would need to be explicitly disclaimed if this qualitative scale were going to be included in the report. The final design of the sample strength scale retained only the slider and a “higher” and “lower” label on each end of the scale, which I show in Figure 8.5. Chemical analysts reported that they would also appreciate an area to describe their interpretations of test results using plain text, and this was added.

Figure 8.5: Designs for presenting qualitative interpretations of the test results as a whole.

8.6 Design Goal Nine: The Visual Report Must Explicitly Disclaim Itself

Substance provides a service to a chronically under-served and stigmatized population during the opioid crisis. All drug checking services inherently produce results that

contain error and uncertainty. It is only fair to the clients of the service that the report clearly explains the shortcomings of the test results so clients know how to safely utilize the information they receive. A section of the visual report must present a clear description of what the service can and cannot be expected to do for the client. Figure 8.6 shows an example of the kind of content the disclaimer will include. Aside from disclaimers about the data, the section also includes reminders about the larger societal context of drug checking. These include the hazards of carrying a drug checking test result report on one’s person, and a section concerning limitations on the traceability between test results and drug samples. This second concept is included to help ensure people do not attach a “clean” test result to untested drugs in a bid to increase sales.

Figure 8.6: An example of the report disclaimer. The content will surely change as the service evolves.

With these design goals accomplished, the final report design was ready.

8.7 Final Report Design

The full design is shown in Figures 8.7 and 8.8. During the final design feedback session, stakeholders requested a change to a two-page report instead of the original single-page configuration. The change to two pages increased readability and enabled double-sided printing on a single sheet.

Figure 8.7: Final Report Design Page 1

Figure 8.8: Final Report Design Page 2

I describe the implementation and deployment of the visual report in the next chapter.

Chapter 9

Implementation

With the design iterations in hand, and a visual report design finalized, I was then able to begin implementing a software application version of the report which I will deploy into the context. Figure 9.1 shows how the deployment iteration contains a final consolidation of feedback and requirements, integrating final feedback on the visual report design prior to creating the software application. Please note that the software application is still under development, and that the screenshots within this section represent the partly finalized visual report software, not its final form.

Figure 9.1: The complete research timeline with the deployment iteration highlighted.

9.1 Artifact Deployment Iteration: Make Stage

At this stage I was able to implement the final report design within a software application. To accomplish this I chose to use Lodestone1 as the software platform to

1 https://thechiselgroup.org/projects/

implement the final solution. Lodestone is a visualization research platform implemented using TypeScript2, NodeJS3, D34, PixiJS5, and ElectronJS6 by the CHISEL research group7 at the University of Victoria. It facilitates rapid research visualization development by employing a plugin-oriented architecture that supports common visualization libraries in TypeScript.

I chose this tool for this research project partly because I work as a research programmer developing Lodestone and am familiar with the platform. However, I primarily chose Lodestone because the underlying technology enables the easy creation and deployment of novel research visualizations for production environments in the form of Electron applications. This drug checking visual report is a prime example of the target use case of Lodestone, where a research visualization is going to be used practically within a research context.

In order to populate the application with data, it was necessary to connect to the actual test result database using its data schema and API. With the data coming into the application, I was able to create an interface that would present the drug checking test results. Because the report contained unusual uncertainty visualizations for the percent composition, it was necessary to create a new visualization algorithm for rendering the percent composition charts. I also discovered during the implementation of the visualization that the design iterations had not fully captured all of the edge cases in the design space. This was a reminder to me to bring real data into the design process as early as possible in order to account for these types of edge cases. For example, what does the percent composition chart design look like when the substances accounted for in the percent composition test results do not add up to 100%? To account for this, I added a fill texture to the design, and an “unknown” component label for the segment.
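This edge-case handling can be sketched as a small pre-processing step over the chart data. The code below is a hypothetical illustration, not the actual Lodestone implementation; the `Segment` type, the `padComposition` function, and the 0.5% threshold are my own assumptions.

```typescript
// Hypothetical pre-processing for the percent composition chart:
// when test results leave part of the sample unaccounted for, the
// gap is rendered as a textured "unknown" segment.
interface Segment {
  component: string; // e.g. "heroin", "caffeine"
  percent: number;   // share of the sample, 0–100
}

function padComposition(segments: Segment[]): Segment[] {
  const accounted = segments.reduce((sum, s) => sum + s.percent, 0);
  const remainder = 100 - accounted;
  // Only add the "unknown" segment when a meaningful gap exists
  // (threshold chosen here for illustration only).
  if (remainder > 0.5) {
    return [...segments, { component: "unknown", percent: remainder }];
  }
  return segments;
}
```

Under this sketch, a result reporting 60% heroin and 25% caffeine would gain a 15% “unknown” segment, which the renderer can then draw with the fill texture described above.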
Figures 9.2 and 9.3 show actual screenshots of the visual report in its current form with all the content present, but the design still needs to be polished until it matches the quality shown in Figures 8.7 and 8.8 in the previous chapter. Later in this chapter I show how the final version will appear, including interactions to enter data and save reports.

2 https://www.typescriptlang.org/
3 https://nodejs.org/
4 https://d3js.org/
5 https://www.pixijs.com/
6 https://electronjs.org/
7 https://thechiselgroup.org/

Figure 9.2: The software application visual report with sections containing drug identifiers, fentanyl test results, component composition and percent composition charts. This is a work in progress.

Figure 9.3: The software application visual report with sections containing qualitative interpretations, and service information. Hours and locations are still to be added. This is a work in progress.

9.2 Artifact Deployment Iteration: Deploy Stage

Once I have fully implemented the visual report within Lodestone, I will deploy it directly onto the computers at each of Substance’s drug checking locations. The deployment should be easy because Electron enables the creation of applications containing the report interface for any operating system. Once I install the software application onto the service laptops, the software can connect to the database over the internet and begin rendering test results for specific clients and their samples.

9.3 Using the Application

Three use cases rely upon the digital and paper visual report artifacts within the drug checking service. The first is the data entry use case, which relies on the software application.

1. The chemical analyst uses drug checking systems to generate drug sample data. The chemical analyst then enters the data into a central database that contains all the drug checking test results.

2. The chemical analyst then uses the visual report application to populate a visual report with the data from the database for a specific clientID. The chemical analyst must first search for the clientID, as shown in Figure 9.4. The data automatically loaded from the database into the visual report comprises the percent composition, component composition, sample description, fentanyl test result, expected substances, and the date of testing. An example of a preloaded report, ready for the final data to be entered, is shown in Figure 9.5. The data that still needs to be entered into the report is highlighted in red: the confidence in component composition indicator icons, the sample strength slider, and any additional notes in the qualitative notes area. The data entry actions are shown in progress on the three fields in Figure 9.6: a dropdown menu for selecting icons in the table, a pencil in the analyst notes section, and a hand sliding the sample strength slider.

Figure 9.4: The visual report software when the chemical analyst is searching for a clientID within the database.

Figure 9.5: The visual report software with preloaded data, and the areas the chemical analyst must enter data into to finalize the report highlighted in red.

Figure 9.6: The visual report with the three manually entered fields partly complete. Icons indicate how these fields are to be filled out.

3. A screenshot of the report is then saved to the database with an associated clientID and drugSampleID for future reference. A fully populated report, which can be saved to the database or printed for use in the following use case, is shown in Figure 9.7.
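The split between automatically loaded fields and manually entered fields described in this use case could be modeled roughly as follows. This is a hypothetical sketch: the `ReportRecord` field names and the `isReadyToSave` helper are my own illustrations, not the service's actual database schema or API.

```typescript
// Hypothetical report record; field names are illustrative assumptions.
interface ReportRecord {
  clientID: string;
  drugSampleID: string;
  // Loaded automatically from the central database:
  percentComposition: { component: string; percent: number }[];
  sampleDescription: string;
  fentanylDetected: boolean;
  expectedSubstances: string[];
  dateOfTesting: string;
  // Entered manually by the chemical analyst before saving:
  confidenceIcons?: Record<string, string>; // component -> icon name
  sampleStrength?: number;                  // slider position, 0 (lower) to 1 (higher)
  analystNotes?: string;
}

// A report can be saved and printed once every manual field is filled in.
function isReadyToSave(report: ReportRecord): boolean {
  return report.confidenceIcons !== undefined
    && report.sampleStrength !== undefined
    && report.analystNotes !== undefined;
}
```

Modeling the manual fields as optional properties mirrors the workflow above: a preloaded report is a valid record, but it only becomes saveable once the analyst supplies the icons, slider position, and notes.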

The service then moves to the harm-reduction conversation use case, which also relies on the software application.

1. With a version of the visual report saved to the database for each clientID-drugSampleID pair, users can enter a clientID and drugSampleID into a search field to see a visual report whose test results are ready for presentation in the harm-reduction conversation. It is also possible that the chemical analyst will have already printed a paper copy, or will directly reference the software application used to generate the report during the conversation.

2. The client, chemical analyst, and harm-reduction worker can talk with one another about the drug checking test results, and their implications.

3. With the harm-reduction conversation concluded, the client may request a printed report to go with their drug sample if that has not already been provided. End-users can print reports using the print button and a connected laser printer.

Finally, the printed report leaves the drug checking service in the hands of the client, entering its hypothetical third use case: the visual report handout.

1. The client takes the printed report out of the service, perhaps stapled to a drug sample bag.

2. The client then goes on to use the drugs as originally intended, use the drugs in a modified manner, or discard the drugs, based on the information they received within the drug checking service, and from the printed report.

3. Alternatively, the client may bring samples and printed reports to third-party individuals. The client can then use the printed report to describe the test results to the third party.

4. Because the visual report contains disclaimers, all parties outside of the drug checking service have the opportunity to understand the limitations of the test

Figure 9.7: The visual report software with a finalized report ready to be saved to the database and printed onto paper for use in the harm reduction conversation.

results. Additionally, the printed report presents the service’s website, locations and hours so that new customers can access the service and its resources themselves.

When the make and deploy stages are fully completed, the software application will be used by the chemical analyst and harm reduction stakeholders to present uncertain drug checking test results to clients every week, and clients will then be able to take away printed reports for future reference. Once this software artifact is being used in context with client stakeholders, it will enable feedback on the visual report design from clients in future work. In the next chapter I discuss the three contributions arising from this body of research.

Chapter 10

Discussion

In this chapter I discuss the three primary contributions I describe within this thesis. I discuss how the design study measures against Hevner’s rigour, relevance, and design cycle definitions, describe a potential technological rule derived from this research, and use a visual abstract to coordinate a presentation of the design study as a whole. I also discuss the design space contribution in terms of consistency and completeness, and the visual report design and software artifacts in terms of design trade-offs, transferability, and limitations.

10.1 Contribution 1: Design Study

Hevner defines the relevance cycle as initializing the design process with an application domain [18]. This application domain contains a meaningful problem to be solved and generates both requirements that designs must satisfy and acceptance criteria for evaluating those designs. This design study has captured some fundamental aspects of the drug checking visualization application domain during this first foray into the field. There certainly remain many important challenges in this space with a strong need for visualization research. That research in this area is likely to have a meaningful and immediate impact in the real world should encourage other visualization researchers to contribute. This thesis captures high-level aspects of the application domain for visualization researchers to benefit from in the future, but more design studies in the area are worth conducting to further refine our understanding of the application domain.

In this design study, I have ensured that the characterization of the problem space was thorough and that I collected requirements and acceptance criteria during multiple iterations of the design. I utilized requirements and acceptance criteria as design guidance inputs for creating designs, and then iteratively evaluated those designs with stakeholders in relation to acceptance criteria. This work towards improving the relevance of the resulting visual report is designed to enhance the abilities of the stakeholders within the drug checking service, and produced a solution to the problems stakeholders initially described.

Hevner defines the rigour cycle as the method in which prior design and application-specific knowledge are brought into the design process to help guide design efforts [18]. Rigour in research ensures that the contributions of design studies are well grounded in the prior art and make novel contributions to the existing knowledge base. In this design study, I have brought drug checking, uncertainty and confidence visualization, and design science literature into my research for critical guidance in all of my design processes. I identified gaps within the drug checking literature where report artifacts and uncertainty descriptions have not been described, and begin to open that discussion to researchers. I generated a new design space that fills a gap in the uncertainty visualization literature for presenting unquantified uncertainty. I present a new and impactful application domain for visualization research, with rich ethical and societal potential. The contributions in this thesis are rooted in previous knowledge and extend to fill significant gaps in the literature. However, in a 2019 paper by Meyer and Dykes [38], rigour in design study has been highlighted as a recurring perceived shortcoming of design studies and thus deserves further elaboration.

To distance design study from criticisms of insufficient rigour brought forward by positivist perspectives, Meyer and Dykes describe design study as “wicked subjective and diverse” [38, p.89], thus positioning design study closer to interpretivism. In short, interpretivism is the epistemological perspective that truth is constructed through the subjective experiences of each individual. This is in contrast to positivism, which places value in the discovery of objective truths. I agree with Meyer and Dykes’ interpretivist classification of design study because of my experiences of design study as a subjectively emergent and organic process. To assist visualization researchers in improving and presenting claims of rigour in their design studies, Meyer and Dykes propose six criteria by which design study rigour can be explored from an interpretivist, as opposed to the more traditional positivist, visualization research perspective. The criteria that Meyer

and Dykes suggest rigorous design studies embody are reflexive, informed, abundant, plausible, resonant, and transparent, each of which I discuss here.

Reflexive: The research and the researcher impact one another. I had a limited understanding of the sensitivity of stigmatized populations in the initial stages of this research. Through conversations with my stakeholders I began to comprehend the challenges that the end users face on a daily basis, and began to understand the limits of my knowledge. To improve my understanding I chose to explore the visualization ethics literature, and the more I learned, the more I came to the research as a different visualization researcher. This changed how I prioritized design requirements within the study, but also expanded my appreciation of visualizations as ethically and politically charged artifacts. Thus, the research impacted me, and because I was the one doing the research, the research was also impacted. The designs I produced prioritized my end users’ needs more than previously; I decided to conduct additional design feedback sessions; and I further clarified requirements. The impact of researcher and research on one another went through iterations, bringing each closer to the other, just as design study has design iterations to bring problem and solution together.

Informed: Design efforts draw upon design knowledge, design examples, and specific applications of abstract design guidelines. This design study was conducted in the new visualization application domain of drug checking. To my knowledge, no one else has done this type of research in this area, and so it was critical to support my design efforts with literature, existing drug checking reports, and design principles. I did not want to design within a vacuum, and so during the design study, and within this thesis, I have supported my decisions and designs with information from outside the immediate application domain.
However, being the first to attempt research like this means that I am pulling together concepts from interdisciplinary sources. In so doing I am making informed, but primarily intuitive as opposed to empirical, connections.

Abundant: Having more data, designs, consultations, stakeholders, and collaborators, among other things, is better than having less of those same aspects. In conducting this research I have drawn upon as many appropriate people, articles, and design concepts as has been reasonably possible. I have collected subjective and qualitative data four times using three methodologies to identify design goals and collect concept impressions with stakeholders. I have distilled more than 40 design requirements and identified nine design goals as a result of this effort, and

then used that design guidance to generate hundreds of chart concepts, of which I include only a small subset in Appendix B. To increase the abundance of supporting resources and to improve designs I could have collected more data, reached out to other drug checking services, and evaluated the final software implementation with clients. These were not achievable within the resource constraints of this project, but would have provided valuable insights.

Plausible: Claims are backed up by relevant and convincing evidence. I have tried to support each claim with relevant evidence. The extent of my success in this attempt has been largely determined by the lack of previous research within the specific area. Instead, I have used content from specific domains to back up the matching aspects of my research. For example, part of this research is concerned with uncertainty visualization in drug checking. I chose to rely upon the uncertainty visualization literature and the drug checking literature separately to lend plausibility to the combination presented within this thesis. To complement the application of single-domain knowledge to this multi-domain research, the design study includes qualitative assessment of solutions within context. This direct evidence provides contextual support for the claims I make about domain-specific aspects of the research.

Resonant: The research facilitates understanding and inspires researchers to participate. To facilitate understanding I have included extensive contextual information about the application domain and design challenges. I have also made specific efforts to encourage interest and further action throughout this thesis by illuminating both the opportunities and responsibilities I see in doing research in this domain. I specifically describe my work as a first foray into the field and thus highlight the prospect of expanding the work further.
Transparent: All of the above criteria, and the research itself, are conducted and presented such that discussion and investigation are invited. It has been easy to slip into a defensive posture while writing this thesis. The challenges I faced felt large and supporting work was hard to come by. I have striven to present my research as both meaningful and worth improving on, and present the above criteria with honesty. I have attempted to facilitate discussion as much as I have attempted to prove points or make claims throughout this thesis. I do this because I personally believe that asking the right questions is more valuable than finding the right answers.

Hevner defines the design cycle as the place where the inputs of relevance and rigour are used to generate designed solutions to real problems [18]. In the design

cycle, literature and problem space contribute to the creation of designs that are then evaluated in the context using acceptance criteria, which in turn act as inputs to the next design cycle. Hevner states that design studies must balance constructing and evaluating the ever-changing design artifact and that substantial lab testing is critical before sending artifacts out into the world. This design study consists of three design iterations wherein design proposals were vetted by stakeholders who represent the end-users of the design artifact. Each design iteration includes a requirements analysis stage, design stages, and a feedback stage, with each stage serving a distinct purpose within the design cycle. As described within Chapter 3, our most sensitive stakeholder group has been entirely shielded from this design process for their protection from unfinished intermediate designs. Now that the visual report design is considered to be safe enough for use in context, future work can establish even more thoroughly the efficacy of this visual report solution. This research project has worked within the limitations imposed by the problem context to create a solution that solves a real problem by using a rigorous design process, which is what Hevner intends with his relevance, rigour, and design cycles. As a drug checking service is a complex arrangement of information technology systems, processes and people, it is fitting to describe our design solution as closely aligned with van Aken’s concept of technological rules [52]. A technological rule can be understood as describing “the desired effect in a given situation along with a proposed intervention to achieve that effect” [51, p. 183]. 
A definition for this technological rule could be the following: In this design study we empower people who use drugs with actionable drug checking test results data (effect) in drug checking services during the opioid crisis (situation) using an uncertainty and confidence enhanced visual drug checking test results report (intervention). The technological rule needs to be refined and verified through evaluation of the impact of the visual report in its context in future work.

Van Aken also describes design science as an effort to create a better world. His article explores how socio-technical systems are ethically and politically opinionated artifacts that can enable, disable, harm or help groups of stakeholders, and that no design science contribution is complete without a discussion of such matters. It is clear that the visual report, and the process that produced it, were ethically and politically charged. Discussions between chemical analyst and harm reduction worker stakeholders frequently centred on the rights, perspectives and needs of a stigmatized and criminalized population. Many design decisions were made to protect,

enable and empower client stakeholders, despite them not being at the design table. This drug checking visual report is designed to support safety-critical decision-making concerning drug use in 2020, when opioid overdoses are rampant. By collecting design guidance multiple times before ever delivering this report to clients, I hope this process generated a solution with minimal negative impacts.

According to Correll [11], as visualization researchers, we have the moral responsibility to present the hidden uncertainty in data. Correll argues that neither data nor visualizations are neutral and objective; instead, they are subjective and opinionated. These experiences in designing and creating this drug checking report have shown me just how opinionated the designs can be. Both groups of stakeholders that I had direct access to saw the design artifacts as conveying far more than their literal content. They saw the artifacts as reflecting value positions on drug use and on the people who use drugs. This process helped expose more about the stakeholders, but also corroborates Correll’s argument about the pitfalls present in conducting visualization research in highly sensitive research contexts.

Storey et al. present a visual abstract for describing design science contributions based on van Aken’s and Hevner’s concepts [51]. This visual abstract helps researchers understand and present their design science contributions for effective communication back into the knowledge base. I draw upon Hevner’s and Storey et al.’s representations of design studies to generate Figure 10.1 to present this design study. Storey et al. additionally present the concept of novelty as a critical measure of the value of contributions made by design studies. They present the concept of novelty to defend against researchers who would suggest that having clear problem and solution descriptions reduces designing solutions to ‘mere “routine design”’ [51, p. 184].
The authors argue that reviewers assess novelty using the generalized technological rule as opposed to the detailed specifics of solution and problem. This design study generated a technological rule which is, in the literature searches I have done to date, novel. In their global review of drug checking services, Barratt et al. [3] enumerated all drug checking services they could find in 2018. I visited the websites of all the drug checking services in the global review but did not find solutions to the problems faced in this context. I also reviewed the latest research in the leading drug checking journals and likewise did not find discussions of presenting uncertainty in drug checking test result artifacts that served my purposes in this research.

Figure 10.1: This visual representation of the design study captures the relevance, rigour, and design cycles from Hevner [18], and presents the outcome as a technolog- ical rule from van Aken [52]. This presentation style was inspired by Storey et al.’s visual abstract [51].

10.2 Contribution 2: Design Space

This thesis describes a design space for visually presenting unquantified uncertainty in proportional data using pie and cake charts. This design space appears to be novel, as I was unable to find previous examples of unquantified uncertainty visualizations or formally defined design spaces for introducing uncertainty into pie and cake charts.

I hope that by keeping this design space unconstrained to presenting drug checking data, there may be opportunities to reapply the design space for other applications in need of presenting unquantified uncertainty. One such area could be visualizations of the stock market, where, for the lay person, it may be helpful to know that the stock market behaves unpredictably, and they may wish to make carefully considered decisions with stock market data. More generally, this design space may inspire adaptations of the concept of visualizing unquantified uncertainty, where the end goal is to generate critical thinking in end users using symbolic representations of unknown unknowns.

Design spaces are abstract constructs and their usefulness and qualities need to be discussed in relation to accepted criteria. I now discuss the design space using established criteria [46] to ensure it is well conceptualized and useful.

10.2.1 Evaluating the Unquantified Uncertainty Design Space

Schulz et al. [46] discuss their own design space in terms of completeness and consistency. According to the authors, design space consistency is determined by the frequency of combinations of design decisions that result in invalid designs. An invalid design is a combination of design decisions that produces a design instance violating a basic principle the design space is intended to adhere to. An example might be my design space only producing proportional chart designs which are entirely ineffective at visually showing proportions. These would be invalid designs, and if a design space primarily produces invalid designs then it is an inconsistent design space.

The completeness of a design space, on the other hand, is determined by the ability of the design space to fully populate its problem space with designs. If the problem space remains underpopulated with design solutions in some areas, the design space has failed to capture that portion of the problem space. For example, the problem space in my research requires proportional chart designs which present unquantified uncertainty in those proportions. It is possible that the design space could be constructed such that only introducing numerical quantities of uncertainty is possible. The design space being incapable of populating an essential portion of its problem space is an indication of incompleteness.

To overcome this, a visualization researcher could add more dimensions to the design space, which may enable the design of solutions within more parts of the problem space (increasing completeness), but these additional dimensions may also enable the design of invalid designs (decreasing consistency). Likewise, reducing the dimensions of the design space may make all the designs produced by the design space more valid (increasing consistency), but this reduction may prevent the population of parts of the problem space (decreasing completeness). Design spaces must moderate consistency and completeness against one another in relation to the problem space in order to be balanced. To summarize, the goal of creating high-quality design spaces is to balance the flexibility of the design space with the focus of the design space. I now discuss the consistency and completeness of the unquantified uncertainty in proportional charts design space.
Design spaces must moderate consistency and completeness against one another in relation to the problem space in order to be balanced. To summarize, the goal of creating high-quality design spaces is to balance the flexibility of the design space against its focus. I now discuss the consistency and completeness of the unquantified uncertainty in proportional charts design space.

10.2.2 Design Space Consistency

Consistency is the ability of the design space to generate valid design candidates using its constructs. By performing a systematic exploration of the design space, I was able to see how different dimensions interact and to understand some of the invalid designs it produces. I define an invalid design from this design space as a chart which does not effectively convey unquantified uncertainty in proportional data for my context. One way in which design spaces can become inconsistent is if dimensions overlap. Dimensions overlap directly when their definitions overlap, and indirectly when changes within one dimension impact another dimension. The design space proposed in this thesis has two levels of dimensions: the visual marks are the primary dimensions, and the visual variables are secondary dimensions. These design space dimensions are well isolated from one another because their definitions come from sub-portions and characteristics of the two charts, and thus do not overlap very much. For example, the boundary edges and segment edges are entirely distinct dimensions of the design space because they are unique elements of the charts and their definitions do not overlap. However, due to the spatial nature of the design space, dimensions situated in proximity to one another have more potential to impact one another. There will be dimensional collisions in almost all visualization design spaces because their dimensions share a physical/spatial medium. However, having dimensions impact one another is not necessarily indicative of negative or destructive inconsistency in a design space. The intersections between dimensions can inspire creative design ideas or expose unintuitive but valid points in the design space. These surprising outcomes imply that there exists both desirable and undesirable overlap in design space dimensions.
Desirable overlap between dimensions occurs when changes in one dimension cause changes in an adjacent dimension, and those other changes enhance a desirable characteristic of the resulting design. Conversely, changes to one dimension that enhance undesirable characteristics of resulting designs are undesirable overlap between dimensions and create invalid designs. See Figure 10.2 for an example of a desirable overlap in design choices within the proposed design space.

In most design spaces it is possible to make changes to visual marks using visual variables to create designs that are entirely unrelated to solving the intended problem. For example, an enthusiastic but unfortunately misguided researcher might change the area fill of segments to a picture of cats in order to indicate highly reliable proportional data; this is an obviously ridiculous concept, as dogs would be far more appropriate. I believe that each of the dimensions within this design space can be used to generate more potentially valid designs than potentially invalid designs when the dimensions are used appropriately. This conclusion suggests that the design space possesses an acceptable amount of consistency.

Figure 10.2: An example of desirable overlap between design space dimensions. Note how changes to the boundary edge mark's width cause changes in the area mark's area. This overlap is positive in this context, as the boundary edge angle and the segment area both become ambiguous. The more the boundary edge mark's width increases, the more the area is obscured.
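The systematic exploration mentioned above can also be sketched programmatically. The following Python fragment is purely illustrative: the mark names, variable names, and validity rule are hypothetical simplifications of my own choosing, not the actual contents of the design space. It enumerates mark-variable combinations and filters out those that violate a basic principle, which is one rough way to estimate how consistent a design space is.

```python
from itertools import product

# Hypothetical, simplified encoding of the design space: visual marks
# as primary dimensions and visual variables as secondary dimensions.
MARKS = ["area", "boundary edge", "segment edge", "label"]
VARIABLES = ["fuzziness", "width", "texture", "transparency"]

def is_valid(mark, variable):
    """Toy validity rule: a design is invalid when the manipulation
    would destroy the chart's ability to show proportions, e.g.
    making every segment area fully transparent."""
    return not (mark == "area" and variable == "transparency")

candidates = list(product(MARKS, VARIABLES))
valid = [c for c in candidates if is_valid(*c)]
print(f"{len(valid)} of {len(candidates)} combinations are valid")
```

If most combinations survive a reasonable validity rule, the space leans consistent; if most are filtered out, it leans inconsistent.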

10.2.3 Design Space Completeness

Completeness is the ability of the design space to generate designs that fully populate the problem space. Design space completeness can be assessed by describing the designs which fall within the existing design space, within a modified design space, and outside the design space [46].

Within the Existing Design Space

The problem space which the design space needs to populate consists of designs which effectively convey the concept of unquantified uncertainty in proportional data using pie and cake charts. The problem space therefore has areas concerned with conveying unquantified uncertainty, preserving proportional data, and preserving the effectiveness of the two proportional charts. Valid designs fall within an intersection of these required characteristics, and it is this intersection that the design space must populate.

The design space captures solutions which concern the combination of pie charts, cake charts, unquantified uncertainty, required task support, and proportional data. In using my design space to populate a target problem space, a visualization researcher is able to generate designs which include one of the base charts, proportional data, and a representation for uncertainty. The dimensions within the selected chart can be manipulated to introduce unquantified uncertainty within the proportions of the chart in numerous ways. The design space is able to generate similar uncertainty visualizations using the pie and cake charts because all of the dimensions of the design space are shared between the two charts. This means that the design space can be used to generate solutions where either a pie or cake chart is preferred, using similar design ideas. The design space is also designed to introduce unquantified uncertainty within the existing visual and spatial marks and variables of the pair of charts. This unique approach contrasts with other uncertainty visualizations that would introduce error bars or other additional symbology to the charts to convey quantified or approximated uncertainty; however, those would fall outside the parameters of the intended problem space.
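To make the idea of manipulating a single dimension concrete, the sketch below (my own simplified illustration, not code from the deployed report) widens a pie chart's boundary edges into ambiguous bands. The band width is a purely visual parameter carrying no numeric uncertainty claim, matching the unquantified nature of the uncertainty:

```python
def ambiguous_boundaries(proportions, band_deg=12.0):
    """For proportions summing to 1, return (lo, hi) angle bands in
    degrees for each internal pie chart boundary. Widening a boundary
    edge obscures the exact segment sizes without asserting any
    quantified error magnitude."""
    bands, angle = [], 0.0
    for p in proportions[:-1]:  # the final boundary closes the circle
        angle += p * 360.0
        bands.append((angle - band_deg / 2.0, angle + band_deg / 2.0))
    return bands

# Example: three components at 50%, 30%, and 20%
print(ambiguous_boundaries([0.5, 0.3, 0.2]))
```

A renderer could fill each band with a gradient or blur, so viewers see roughly where one segment ends and the next begins without being able to read off exact proportions.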

Within a Modified Design Space

New areas in problem spaces can be populated with solutions by modifying a design space. By minimally modifying this design space, it is possible to create uncertainty visualizations that present different types of uncertainty. For example, designs using Olston and Mackinlay's [39] concept of ambiguation for bounded uncertainty could be generated by adding another dimension, namely an ambiguation segment between proportions. One could also present proportional data using different underlying charts than the pie and cake charts. There are other proportional charts, such as the bar chart or doughnut chart, which share structure with the cake and pie charts and to which visualization researchers could transfer unquantified uncertainty concepts. Since the service generates a handout report and wants the user experience to be similar across the digital and handout reports, the design space does not allow for animation-style manipulations of its dimensions. However, one could create non-static, animated uncertainty visualizations relying on technologies not described here.

Outside the Design Space

Outside this design space are uncertainty visualizations that use additional iconography and symbols to portray uncertainty in the data. The essence of the design space is that it symbolically represents the unquantified uncertainty in the data visualized. The inclusion of external symbols designed to accomplish the same purpose could negate the intended benefits of the design space. The design space also does not contain non-chart methods of conveying unquantified uncertainty (via text, for example) or other types of proportional charts.

The final test of a design space comes during evaluation with the real end users whose design problem the designs aim to solve. I have evaluated design concepts generated from this design space with stakeholders, and the designs were deemed acceptable by those stakeholders. It therefore appears that this design space makes acceptable trade-offs between completeness and consistency and can generate valid designs. However, it would be appropriate to perform further explorations of the completeness and consistency of this design space with client stakeholders as well.

10.2.4 Using the Unquantified Uncertainty Design Space as a Visualization Researcher

This design space is general enough to be used outside of the drug checking problem context, and so other researchers may wish to use it in their own problem domains. In order to adopt the design space for another problem space, visualization researchers must satisfy some criteria for its use. First, researchers must familiarize themselves with the dimensions and charts within the design space, and must have proportional data with an unknown amount of uncertainty which they wish to present visually. Second, the researchers must be able to describe the priorities of the target audience in viewing this uncertain data, as well as articulate the desired outcome of including the unquantified uncertainty in the charts. With these criteria met, it is then possible for the researchers to select which of the two charts is more appropriate for the problem at hand, and to begin generating and iterating on design concepts using stakeholder feedback. As researchers further comprehend the nature of the problem space by evaluating design validity, they will also begin to understand which dimensions of the design space are more or less flexible when manipulated to generate candidate designs. They can then produce better designs, and hopefully find a design solution which satisfies their most important criteria while also maintaining the base functionality of the underlying charts.

As mentioned previously, the visualization designs that researchers create are neither amoral nor apolitical statements. It is my hope that by creating and presenting one of the first unquantified uncertainty design spaces, other visualization researchers will be encouraged to consider including unknown unknowns within their charts to empower end users with more truthful data representations.

10.2.5 Understanding Designs Produced by the Unquantified Uncertainty Design Space as an End User

I intend for the inclusion of unquantified uncertainty indications within proportional charts to help end users recognize that something unusual and noteworthy exists within the data. It is possible that an unaided or inexperienced person may consider unquantified uncertainty design elements to be merely aesthetic choices. However, for the vast majority of end users, and especially for clients of the drug checking service who are presented with these charts during the harm reduction conversation, the inclusion of unfamiliar and visually compelling design changes should capture the attention of anyone with any visualization experience with cake and pie charts. In applying the design space to my problem domain, I attempted to use visually striking changes to the charts. I also used dimensions of the charts directly related to the comprehension of segment size, a design decision which strikes at the core use case end users associate with these charts: presenting proportions. Inspiring my end users to question the nature of the data presented by the charts was a core goal embodied by my designs, as those questions represent the opportunity for harm reduction workers and chemical analysts to discuss shortcomings and relationships within the data with the clients. The information clients receive as a result of their questioning will hopefully help them make safer drug use decisions.

10.3 Contribution 3: Visual Report

The visual report design and software application are contributions to the visualization and drug checking communities.

10.3.1 Dominant Design Trade-offs

There were two dominant design trade-offs at play throughout the design of the visual report. First, there is a delicate balance between the stigmatization and the empowerment of clients in the drug checking context. During the design study, harm reduction workers voiced concern about the stigmatization of people who frequently use drugs, whereas the chemical analysts voiced concern about the empowerment of people who infrequently use drugs. While offering a stigmatizing drug checking service may not directly lead to deaths, it could deter critically under-served populations from accessing the drug checking service. However, not presenting clear enough warnings about the content of drugs tested could result in the overdose, injury, and death of people for whom specific drug samples are too potent. I address this requirement conflict by providing clients with as much unopinionated information as possible in the visual report design. The visual report presents clients with percent and component composition test results, but also includes uncertainty and disclaimers so that clients know the report is never entirely accurate and their drugs never genuinely safe. Additionally, the visual report indicates the presence of fentanyl throughout, and indicates confidence in component composition using carefully selected icons that do not suggest value-based judgments that might deter clients from the service. The textual content of the deployed report was adjusted from the designs to provide useful information without using language that stigmatizes drug use.

The second trade-off, which stakeholders extensively discussed during feedback sessions, is that of data abstraction. Questions such as “How much data is too much data?”, “Should uncertainty be shown at all?”, and “What is safe to present to clients who will be leaving with printed reports?” were challenging to answer. The resulting design decisions are essential for the clients, as the visual report impacts their drug use decisions. Any future visualization designer researching the presentation of drug checking test results will face similar challenges.

10.3.2 Transferability

There are two artifacts from this research that could transfer to other drug checking services: the visual report design, and the visual report software application.

The visual report software application should be transferable to any drug checking service in the world. The software application was developed using Lodestone, which is data source agnostic and operating system agnostic. Deployment of the application to any computer running Windows, macOS, or Linux is possible through Electron app deployments.

The design process of the visual report has kept the visual report free of potentially harmful value-based judgments while remaining descriptive and addressing issues of presenting uncertainty, a challenge which all drug checking services face to varying degrees. Additionally, the visual report relies upon the component composition and percent composition data types. These two abstract data types capture all the data that a drug checking service can produce from drug checking systems. If a drug checking service cannot quantify the uncertainty in its test results for any reason, the visual report design should still be useful, as it indicates unquantified uncertainty. Therefore, the visual report design and software application are not tightly bound to Substance's service architecture and drug checking machines.
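As a sketch of how these two abstract data types might be modelled, the fragment below uses hypothetical field names of my own choosing; it is not Substance's actual schema. The optional percent composition reflects that a service may be unable to quantify proportions at all:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Component:
    name: str        # e.g. "fentanyl"
    confidence: str  # qualitative confidence level, e.g. "high" / "low"

@dataclass
class TestResult:
    components: List[Component]           # component composition
    percents: Optional[Dict[str, float]]  # percent composition, or None
                                          # when proportions cannot be
                                          # quantified

result = TestResult(
    components=[Component("fentanyl", "high")],
    percents=None,  # unquantified: the report still flags uncertainty
)
```

Because any service's results can be mapped into two such types, a report renderer built against them need not know anything about the specific drug checking machines involved.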

10.3.3 Limitations

The primary limitation of this research is that the visual report design has not yet been evaluated directly in context with client stakeholders. I mitigated this challenge by collecting feedback as frequently as possible during the design study from the available client proxies, who are familiar with the client stakeholder group. Gathering client stakeholder feedback is crucial for improving the design beyond the feedback gathered from the chemical analysts and harm reduction workers as proxies.

I designed this visual report without any supporting drug checking visualization literature. Literature searches found no prior work directly relating to the challenges of visualizing uncertainty in drug checking test results. This lack of direct literature support means that I referenced and collected literature from outside this specific domain. Fortunately, there are strong, distinct bodies of literature concerning drug checking, uncertainty visualization, and design studies. I relied on this literature to make informed design decisions as I creatively generated solutions.

There were multiple occasions where the requirements that I gathered were contradictory. These contradictory requirements both over-constrained the problem space and inspired creativity in deriving solutions. To tackle this challenge, I collected feedback multiple times and in multiple ways in order to resolve these contradictions and to identify where resolutions may need to be generated by the client stakeholder group itself.

It is also a limitation of this work that I worked with a single drug checking service, located in Victoria, British Columbia, Canada. Much insight would be gained from repeating this design study with other drug checking services in other countries.

I have discussed the three contributions of the thesis within this chapter. However, there remains work which could be done to follow up on this research effort. In the following chapter I present recommended future work and conclude the thesis.

Chapter 11

Future Work & Conclusions

In the final chapter of this thesis I present future research which could extend the value of the work I have already done, and provide concluding remarks.

11.1 Future Work

There are two primary areas of future work stemming from the research in this thesis. The first relates to the evaluation of the digital and handout drug check reports, and the second relates to exploring further applications of the design space.

11.1.1 Evaluation of the Drug Checking Test Result Digital Report and Handout Report

I have completed the designs for both the digital and handout reports. In order to further refine them, it is necessary to evaluate their performance in harm reduction conversations and outside of the drug checking service. The challenge now, as before, is how to collect feedback on these designs from clients. Well-established metrics could be used to gather feedback about the effectiveness of the visual report's design. However, it is unlikely that the client stakeholders could ever be directly asked to evaluate designs in a formal process. Both chemical analyst and harm reduction stakeholders have stated that they are willing to record their impressions of the effectiveness of the reports during harm reduction conversations. They acted as proxies during the initial design phase, and could ask clients about their understanding of critical design elements. The design of the report can then improve in response to this indirect design feedback.

The service staff stakeholders regard handing out reports as having major ramifications for the drug community and the service's outcomes. Impacts on drug supply quality, accountability of drug dealers, preventing overdoses of at-home drug users, and bringing more clients to the service were common conversation topics during design feedback meetings about the handout report. However, assessing the impact of report handouts is challenging when it is not simply a matter of asking clients. Service stakeholders could collect anecdotal feedback from clients who took away handout reports, and from service staff within the partner sites. This third-party anecdotal evidence could be connected to design choices based on which parts of the handout reports contributed to outcomes in the community. This feedback might help answer design questions raised during the design study, such as whether sample photos unreasonably increase criminalization, or whether qualitative scales help clients make harm-reducing decisions. Even though these design evaluations would be based upon indirect feedback data, they are likely worth undertaking due to the potential harms and benefits handout reports could bring.

11.1.2 Unquantified Uncertainty in Proportional Charts Design Space

I identified and used a design space that generates unquantified uncertainty visualizations for percent composition data. This design space emerged from the design study and helped generate design candidates for the design feedback survey, which in turn supported critical design decisions in the report design. However, there are many avenues of future research waiting for this design space. It would be useful to perform task-based evaluations of usability and intuitiveness on designs generated by the design space, similar to what Saket et al. did in their paper [45]. The design space could also benefit from empirical studies concerning which of its dimensions are best suited to conveying unquantified uncertainty. It should also be examined whether design concepts depicting unquantified uncertainty can translate to other types of uncertainty inside the same charts. It would also be valuable to further quantify the relationships between pie charts and cake charts, as there may be conclusive evidence that helps resolve the ongoing pie chart controversies described by Spence et al. and Kosara et al. [26, 27, 48–50, 53]. These research ideas could help us understand how beneficial this design space might be within the uncertainty visualization community.

11.2 Conclusions

Drug checking services are facing increasing challenges during the opioid crisis. Drug use decisions based on drug checking test results can now significantly impact client health outcomes, and serve as more than a check against poor recreational drug quality. As a result, new design problems are emerging within the drug checking community concerning the communication of drug checking test results.

My first contribution is a design study. I have situated this design study within the design science and design study literature to ensure that the design solutions it generates are both relevant and rigorous. In this thesis I describe the effects of, and workarounds for, an interesting contextual challenge in the form of limited access to client stakeholders. I addressed this challenge in the design study through an extensive iterative design feedback process using proxy feedback sources. The design study includes a detailed problem and application domain characterization, stakeholder and data format descriptions, and extensive requirements and acceptance criteria descriptions. I employed a multi-stage iterative design study methodology to gather as much design guidance as possible before making the visual report and deploying it into the context. This collaborative research process exposed emergent and unexpected contextual and stakeholder requirements, and helped contextualize the solution within the socio-technical drug checking domain. In future research, the deployed versions of the visual report solution will be evaluated using third-party feedback methods.

My second contribution is a formally structured design space. This design space was created to facilitate the generation of design candidates for visualizing unquantified uncertainty in percent composition test results. It uses proportional charts that share basic underlying structures, which enables design concepts to be translated between them.
This new design space enables uncertainty visualization researchers to explore depicting unquantified uncertainty in proportional charts. The design space is a novel contribution to the uncertainty visualization community, which has seemingly not yet explored revealing this form of uncertainty in contexts like those described in this thesis.

The third and final contribution that I make in this thesis is the visual report itself. The test result report comes in digital and paper handout versions and attempts to balance requirements from the problem context while accomplishing nine context-specific design goals. To evaluate designs, I gathered stakeholder feedback and input in an iterative design process. After overcoming complex and competing contextual design challenges, I have developed, and will continue to polish and deploy, the digital and handout reports in the drug checking context. This novel contribution is best situated within the drug checking and visualization literature.

Drug checking services represent a first-line emergency response to the opioid crisis in 2020, and must adapt their test result delivery systems to a complex problem context in order to be effective. There is a delicate balance to strike between abstracting test result data to simplify results delivery, and empowering people who use drugs with as much decision-making information about their drug samples as possible. As visualization researchers, we have an ethical responsibility to present uncertainty to end users, even when they are members of the public, and especially when their safety is affected. I was very honoured to participate in such a sensitive, challenging, and interesting design project. I hope to continue to do impactful research and support the creation of a better world.

Bibliography

[1] A. Abbasloo, V. Wiens, M. Hermann, and T. Schultz. Visualizing Tensor Normal Distributions at Multiple Levels of Detail. IEEE Transactions on Visualization and Computer Graphics, 22:975–984, 2016.

[2] C. Barnum. The ‘magic number 5’: Is it enough for web-testing? Information Design Journal, 11:160–170, 2002.

[3] M. J. Barratt, M. Kowalski, L. J. Maier, and A. Ritter. Global review of drug checking services operating in 2017. Modelling Program Bulletin, (24), 2018.

[4] M. J. Barratt, M. Kowalski, L. J. Maier, and A. Ritter. Global review of drug checking services operating in 2017 (Drug Policy Modelling Program, Bulletin No. 24). Sydney, Australia: National Drug and Alcohol Research Centre., 2018.

[5] K. Beard and W. Mackaness. Visual access to data quality in geographic infor- mation systems. Cartographica, 30:37–45, 1993.

[6] B. G. Becker. Visualizing decision table classifiers. Proc. IEEE Symp. on Information Visualization, pages 102–105, 1998.

[7] J. Bertin. Semiology of graphics. University of Wisconsin Press, 1983.

[8] G-P. Bonneau, H-C. Hege, C. R. Johnson, M. M. Oliveira, K. Potter, P. Rheingans, and T. Schultz. Overview and state-of-the-art of uncertainty visualization. In Scientific Visualization, pages 3–27. Springer, 2014.

[9] T. Brunt. Drug checking as a harm reduction tool for recreational drug users: opportunities and challenges. Lisbon: European Monitoring Centre for Drugs and Drug Addiction, 2017.

[10] Y. Cao, W. Chen, S. Cheng, Y. Sun, Q. Liu, Y. Li, and Y. Shi. A Simple Brain Storm Optimization Algorithm via Visualizing Confidence Intervals. In Yuhui Shi, Kay Chen Tan, Mengjie Zhang, Ke Tang, Xiaodong Li, Qingfu Zhang, Ying Tan, Martin Middendorf, and Yaochu Jin, editors, Simulated Evolution and Learning, pages 27–38, Cham, 2017. Springer International Publishing.

[11] M. Correll. Ethical Dimensions of Visualization Research. In Proc. 2019 ACM Conf. on Human Factors in Comput. Syst., pages 1–13, New York, New York, USA, 2019. ACM Press.

[12] H. Fisher and F. Measham. Night lives: Reducing drug-related harm in the night time economy. 2018.

[13] J. L. Glick, T. Christensen, N. P. Ju, M. McKenzie, T. C. Green, and S. G. Sherman. Stakeholder perspectives on implementing fentanyl drug checking: Results from a multi-site study. Drug and Alcohol Dependence, 194:527–532, 2019.

[14] T. C. Green and M. Gilbert. Counterfeit medications and fentanyl, Oct 2016.

[15] M. Greis, J. Hullman, M. Correll, M. Kay, and O. Shaer. Designing for uncertainty in HCI: When does uncertainty help? In Proc. 2017 CHI Conf. Ext. Abstracts on Human Factors in Comput. Syst., CHI EA '17, pages 593–600, New York, NY, USA, 2017. ACM.

[16] K. W. Hall, C. Perin, P. G. Kusalik, C. Gutwin, and S. Carpendale. Formalizing Emphasis in Information Visualization. Computer Graphics Forum, 35:717–737, June 2016.

[17] E. Hellier, M. Tucker, N. Kenny, A. Rowntree, and J. Edworthy. Merits of Using Color and Shape Differentiation to Improve the Speed and Accuracy of Drug Strength Identification on Over-the-Counter Medicines by Laypeople. Journal of Patient Safety, 6:158–164, Sep 2010.

[18] A. R. Hevner. A Three Cycle View of Design Science Research. Technical report, 2007.

[19] S. E. Hove and B. Anda. Experiences from conducting semi-structured interviews in empirical software engineering research. In 11th IEEE International Software Metrics Symposium (METRICS’05), pages 10 pp.–23, Sep. 2005.

[20] J. Hullman. Why Authors Don’t Visualize Uncertainty. IEEE Transactions on Visualization and Computer Graphics, pages 1–1, Aug 2019.

[21] B. E. John, L. Bass, R. Kazman, and E. Chen. Identifying gaps between HCI, software engineering, and design, and boundary objects to bridge them. In Extended abstracts of the 2004 conference on Human factors and computing systems - CHI '04, page 1723, New York, New York, USA, 2004. ACM Press.

[22] M. Karamouzian, C. Dohoo, S. Forsting, R. McNeil, T. Kerr, and M. Lysyshyn. Evaluation of a fentanyl drug checking service for clients of a supervised injection facility, Vancouver, Canada. Harm Reduction Journal, 15:46, 2018.

[23] M. Kay, T. Kola, J. R. Hullman, and S. A. Munson. When (ish) is My Bus? In Proc. 2016 ACM Conf. on Human Factors in Comput. Syst., pages 5092–5103, New York, New York, USA, 2016. ACM Press.

[24] M. C. Kennedy, A. Scheim, B. Rachlis, S. Mitra, G. Bardwell, S. Rourke, and T. Kerr. Willingness to use drug checking within future supervised injection services among people who inject drugs in a mid-sized Canadian city. Drug and Alcohol Dependence, 185:248–252, Apr 2018.

[25] C. Kinkeldey, A. M. MacEachren, and J. Schiewe. How to Assess Visual Communication of Uncertainty? A Systematic Review of Geospatial Uncertainty Visualisation User Studies. The Cartographic Journal, 51:372–386, 2014.

[26] R. Kosara. An Empire Built On Sand. In Proceedings of the Beyond Time and Errors on Novel Evaluation Methods for Visualization - BELIV ’16, pages 162–168, New York, New York, USA, 2016. ACM Press.

[27] R. Kosara and D. Skau. Judgment Error in Pie Chart Variations. Eurographics Conf. on Visualization, 2016.

[28] A. Kumpf, B. Tost, M. Baumgart, M. Riemer, R. Westermann, and M. Rautenhaus. Visualizing Confidence in Cluster-Based Ensemble Weather Forecast Analyses. IEEE Transactions on Visualization and Computer Graphics, 24:109–119, 2018.

[29] M. Kunz, A. Grêt-Regamey, and L. Hurni. Visualization of uncertainty in natural hazards assessments using an interactive cartographic information system. Nat. Hazards, 59:1735–1751, Dec 2011.

[30] J. H. Kwakkel, W. E. Walker, and V. A. W. J. Marchau. Classifying and communicating uncertainties in model-based policy analysis. International Journal of Technology, Policy and Management, 10:299–315, Nov 2010.

[31] M. K. Laing, K. W. Tupper, and N. Fairbairn. Drug checking as a potential strategic overdose response in the fentanyl era. International Journal of Drug Policy, 62:59–66, Dec 2018.

[32] J-B. Lamy, C. Duclos, A. Bar-Hen, P. Ouvrard, and A. Venot. An iconic language for the graphical representation of medical concepts. BMC Medical Informatics and Decision Making, 8:16, Dec 2008.

[33] C. Lebeuf, E. Voyloshnikova, K. Herzig, and M.-A. Storey. Understanding, debugging, and optimizing distributed software builds: a design study. Proceedings - 2018 IEEE International Conference on Software Maintenance and Evolution, ICSME 2018, pages 496–507, 2018.

[34] Q. Liu, W. N. Chen, J. D. Deng, T. Gu, H. Zhang, Z. Yu, and J. Zhang. Benchmarking Stochastic Algorithms for Global Optimization Problems by Visualizing Confidence Intervals. IEEE Transactions on Cybernetics, 47:2924–2937, 2017.

[35] A. M. MacEachren, R. E. Roth, J. O’Brien, B. Li, D. Swingley, and M. Gahegan. Visual Semiotics & Uncertainty Visualization: An Empirical Study. IEEE Trans. Vis. Comput. Graphics, 18:2496–2505, Dec 2012.

[36] S. T. March and G. F. Smith. Design and natural science research on information technology. Decision Support Systems, 15:251–266, 1995.

[37] F. C. Measham. Drug safety testing, disposals and dealing in an English field: Exploring the operational and behavioural outcomes of the UK’s first onsite ‘drug checking’ service. International Journal of Drug Policy, 67:102–107, May 2019.

[38] M. Meyer and J. Dykes. Criteria for Rigor in Visualization Design Study. IEEE Transactions on Visualization and Computer Graphics, pages 1–1, Aug 2019.

[39] C. Olston and J. D. Mackinlay. Visualizing data with bounded uncertainty. In IEEE 2002 Symp. on Inf. Vis., pages 37–40. IEEE Comput. Soc, 2002.

[40] K. Potter, J. Kniss, R. Riesenfeld, and C. R. Johnson. Visualizing summary statistics and uncertainty. Computer Graphics Forum, 29:823–832, 2010.

[41] K. Potter, P. Rosen, and C. R. Johnson. From Quantification to Visualization: A Taxonomy of Uncertainty Visualization Approaches. In IFIP Working Conference on Uncertainty Quantification, pages 226–249. Springer, Berlin, Heidelberg, 2012.

[42] M. Riveiro, T. Helldin, G. Falkman, and M. Lebram. Effects of visualizing uncertainty on decision-making in a target identification scenario. Computers & Graphics, 41:84–98, June 2014.

[43] J. C. Roberts, C. Headleand, and P. D. Ritsos. Sketching Designs Using the Five Design-Sheet Methodology. IEEE Trans. Vis. Comput. Graphics, 22:419–428, 2016.

[44] R. E. Roth. A Qualitative Approach to Understanding the Role of Geographic Information Uncertainty during Decision Making. Cartogr. Geogr. Inf. Sci., 36:315–330, 2009.

[45] B. Saket, A. Endert, and C. Demiralp. Task-Based Effectiveness of Basic Visualizations. IEEE Trans. Vis. Comput. Graphics, 14, 2018.

[46] H. J. Schulz, T. Nocke, M. Heitzler, and H. Schumann. A design space of visualization tasks. IEEE Transactions on Visualization and Computer Graphics, 19:2366–2375, 2013.

[47] M. Sedlmair, M. Meyer, and T. Munzner. Design Study Methodology: Reflections from the Trenches and the Stacks. IEEE Trans. Vis. Comput. Graphics, 18:2431–2440, Dec 2012.

[48] D. Skau and R. Kosara. Arcs, Angles, or Areas: Individual Data Encodings in Pie and Donut Charts. Comput Graph Forum, 35:121–130, 2016.

[49] I. Spence. No Humble Pie: The Origins and Usage of a Statistical Chart. Journal of Educational and Behavioral Statistics, 30:353–368, Dec 2005.

[50] I. Spence and S. Lewandowsky. Displaying proportions and percentages. Appl. Cogn. Psychol., 5:61–77, 1991.

[51] M.-A. Storey, E. Engstrom, M. Host, P. Runeson, and E. Bjarnason. Using a Visual Abstract as a Lens for Communicating and Promoting Design Science Research in Software Engineering. In 2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pages 181–186. IEEE, Nov 2017.

[52] J. E. van Aken. Design Science: Valid Knowledge for Socio-technical System Design. In Communications in Computer and Information Science, volume 388 CCIS, pages 1–13. Springer Verlag, 2013.

[53] W. C. Eells. The Relative Merits of Circles and Bars for Representing Component Parts. Journal of the American Statistical Association, 21:119–132, 1926.

[54] A. R. Winstock, K. Wolff, and J. Ramsey. Ecstasy pill testing: harm minimization gone too far? Addiction, 96:1139–1148, 2001.

[55] M. Wunderlich, K. Ballweg, G. Fuchs, and T. von Landesberger. Visualization of Delay Uncertainty and its Impact on Train Trip Planning: A Design Study. Computer Graphics Forum, 36:317–328, Jun 2017.

Appendix A

Drug Checking Service Flow

In these appendices I provide:

• Appendix A: A service flow diagram depicting, in abstract form, how actors, artifacts, and processes interoperate within the drug checking service.

• Appendix B: Screenshots of the complete design feedback survey instrument used to gather feedback during design iteration two of the design study.

Figure A.1: A diagram depicting the stakeholders and processes within the Substance drug checking service.

Appendix B

Design Feedback Survey

Figures B.1–B.39: Screenshots of the design feedback survey, shown in sequence.