
Embedding Comparator: Visualizing Differences in Global Structure and Local Neighborhoods via Small Multiples

Angie Boggust†1, Brandon Carter†1, and Arvind Satyanarayan1
1CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA

Figure 1: The Embedding Comparator (left) facilitates comparisons of embedding spaces via local neighborhood dominoes: small multiple visualizations depicting local substructures (right).

Abstract

Embeddings mapping high-dimensional discrete input to lower-dimensional continuous vector spaces have been widely adopted in applications as a way to capture domain semantics. Interviewing 13 embedding users across disciplines, we find comparing embeddings is a key task for deployment or downstream analysis but unfolds in a tedious fashion that poorly supports systematic exploration. In response, we present the Embedding Comparator, an interactive system that presents a global comparison of embedding spaces alongside fine-grained inspection of local neighborhoods. It systematically surfaces points of comparison by computing the similarity of the k-nearest neighbors of every embedded object between a pair of spaces. Through case studies, we demonstrate our system rapidly reveals insights, such as semantic changes following fine-tuning, language changes over time, and differences between seemingly similar models. In evaluations with 15 participants, we find our system accelerates comparisons by shifting from laborious manual specification to browsing and manipulating visualizations.

arXiv:1912.04853v2 [cs.HC] 6 Mar 2021

† Both authors contributed equally to this research.

1. Introduction

Embedding models map high-dimensional discrete objects into lower-dimensional continuous vector spaces such that the vectors of related objects are located close together. Although the individual dimensions and structure of embedding spaces can be difficult to interpret, embeddings have become widely used in machine learning (ML) applications because their structure usefully captures domain-specific semantics. For example, in natural language processing (NLP), embeddings map words into real-valued vectors in a way that co-locates semantically similar words [MSC∗13].

A key task when working with embedding models is evaluating the representations they learn. For instance, users may wish to determine whether embeddings can be transferred between tasks in a domain with limited training data (e.g., applying an embedding of general English to legal or medical text [HR18]). In speech recognition [BH14, BZK∗17], recommendation systems [KBV09], computational biology [YWBA18, BB19, RBT∗19, BBB∗19], and computational art [ERR∗17, HE18], evaluating embeddings has helped inform future training procedures and reveal the impact of different training datasets, model architectures, hyperparameters, or even random weight initializations.

To understand how users evaluate and compare embeddings, we conducted a series of semi-structured interviews with users across disciplines who frequently use embedding models as part of their research or in application domains. Users balance between examining global semantic structure via dimensionality reduction plots and inspecting local neighborhoods of specific embedded objects. Our conversations reveal shortcomings of these approaches, including unprincipled object selection strategies that rely heavily on domain knowledge or repetitive ad hoc analysis, and siloed tools that focus on either one model at a time or depict only the global structure of the embedding space. As a result, users feel concerned they may miss unexpected insights or lack a comprehensive understanding of the embedding space. Moreover, users are unable to develop tight feedback loops or rapidly iterate between generating and answering hypotheses, as their current processes include limited interactive capabilities and, thus, require tedious manual specification.

In response, we present the Embedding Comparator, an interactive system for analyzing a pair of embedding models. Drawing on the insights from our formative interviews, the Embedding Comparator balances visualizing the models' global structures with comparing their local neighborhoods. To simplify identifying the similarities and differences between the two models, the system calculates a similarity score for every embedded object based on its reciprocal local neighborhood (i.e., how many of an object's nearest neighbors are shared between the two models, and how many are unique to each model). These scores are visualized in several ways, including through a histogram of scores, through color-encoding the global geometry plots, and, critically, through local neighborhood dominoes: small multiple visualizations that facilitate rapid comparisons of local substructures. A variety of interactive mechanics help facilitate a tight iterative loop between analyzing these global and local views — for instance, by interactively selecting points in the global plots, or by searching for specific objects, users can filter dominoes, and hovering over dominoes highlights their points in the global views to provide additional context.

Through case studies and first-use studies, we demonstrate how the Embedding Comparator helps scaffold and accelerate real-world exploration and analysis of embedding spaces. Using tasks based on our formative interviews, we show how our system supports use cases such as understanding the effects of fine-tuning and conducting linguistic analysis. Our system design enables replication of previously published results using only a handful of interactions, without the need for task-specific metrics. As we demonstrate in case studies (Section 5) and validate in first-use studies (Section 6), the Embedding Comparator shifts the process of analyzing embeddings from tedious and error-prone manual specification to browsing and manipulating a series of visualizations.

The Embedding Comparator is freely available as open-source software, with source code at https://github.com/mitvis/embedding-comparator and a live demo at http://vis.mit.edu/embedding-comparator.

2. Related work

2.1. ML model interpretability

ML models are widely regarded as being "black boxes" as it is difficult for humans to reason about how models arrive at their decisions [Lip18]. Numerous tools help users understand model behavior [HKPC18], including visualizations for specific architectures [SGPR18, SGB∗19, BZS∗19, LLL∗18]. More general techniques involve evaluating input feature importance [RSG16, STY17, SGK17, CMJG19], saliency [OSJ∗18], or neuron activations [KAKC17]. Boxer [GBYH20] compares discrete-choice classifiers, but treats them as black boxes without considering their internals. In contrast to these methods, our focus is on comparing the representations learned by different models, as internal representations may differ even while input saliency or input-output behavior remains the same. In our formative interviews (Section 3), we found users often compare these internal representations (e.g., to compare semantic differences between hidden layers of a particular model).

2.2. Visual embedding techniques and tools

Interpreting the representations learned at the embedding layers of ML models is challenging as embedding spaces are generally high-dimensional and latent. To reason about these spaces, researchers project the high-dimensional vectors down to two or three dimensions using techniques such as principal component analysis (PCA) [Jol86], t-SNE [MH08], and UMAP [MHM18]. Visualizing these projections reveals the global geometry of these spaces as well as potential substructures such as clusters, but effectively doing so may require careful tuning of hyperparameters [WVJ16] — a process that can require non-trivial ML expertise. The Embedding Comparator provides a modular system design such that users can use a dimensionality reduction technique of their choice. By default, however, we use PCA as it highlights, rather than distorts, the global structure of the embedding space [WVJ16], and is deterministic, a desire of our formative interviewees (Section 3).

By default, many projection packages generate visualizations that are static and thus do not facilitate a tight question-answering feedback loop: users need to repeatedly regenerate visualizations, slowing down the exploration process. Recently, researchers have begun to explore interactive systems for exploring embeddings, including via direct manipulation of the projection [STN∗16, PLvdM∗16, HPvU∗17], interactively filtering and reconfiguring visual forms [HG18, TKBH16], and defining attribute vectors and analogies [LJLH19]. While our approach draws inspiration from these prior systems, and similarly provides facilities for exploring local neighborhoods, the Embedding Comparator primarily focuses on identifying and highlighting the similarities and differences between different representations of embedded objects. To do so, we compute a similarity metric for every embedded object and use this metric to drive several interactive visualizations (Section 4).

2.3. Techniques for comparing embedding spaces

To compare spaces, some techniques align embeddings through linear transformation [TZCS15, HLJ16b, HLJ16a, MLS13, CTL18] or alignment of neurons or the subspaces they span [LYC∗16, WHG∗18]. In contrast, the Embedding Comparator does not align the embeddings and can be used in cases where a linear mapping between the spaces does not exist, which may occur if they have different structures [MLS13]. Our system exposes the objects that are most and least similar between two vector spaces via a reciprocal local neighborhood similarity metric, and local neighborhood based metrics have been shown to usefully capture differences in embedding spaces [HLJ16a]. As desired by our interviewees, our system is deterministic and faithful to the reciprocal local neighborhoods of the models (unlike transformations that rely on linear maps or weights learned by stochastic gradient optimization).

Other techniques require users to first identify and query particular objects of interest or are restricted to particular types of data or models. Unlike parallel embeddings [ANH∗20], our system does not rely on clustering or assume embeddings form clusters corresponding to semantically meaningful concepts. EmbeddingVis [LNH∗18] compares network (graph) embeddings through relationships between node metrics and embeddings. Liu et al. [LBT∗17] identify differences in how word2vec and GloVe embeddings capture syntactic relationships, but do so by visualizing semantic and syntactic analogies specifically in the context of neural word embeddings. Similarly, Liu et al. [LJLH19] evaluate consistency of attribute and analogy vectors across latent embedding spaces. In contrast, our system does not rely on analogies, which may only exist for certain classes of embedding models [EDH19, AH19]. Heimerl et al. [HG18] present visualizations of nearest neighbors and co-occurrences for word embeddings to show how the meaning of a word changes over time, but do not automatically identify the objects whose representations differ most between models. Wang et al. [WLA∗18] use cosine distance to find nearest neighbors between embeddings and then arbitrarily sample words to determine if the semantic meanings differ between the models. Other work has demonstrated differences in representations learned by convolutional neural networks (CNNs) by deconvolution of representations of sampled images [YYB∗14], alignment of hidden units with interpretable concepts [BZK∗17], or projections of representations at different layers or epochs [RFFT16]. The Python software package repcomp [Shi18] quantifies the difference between two embedding spaces through local neighborhood similarity, but only outputs a single global similarity value between the spaces and does not permit fine-grained inspection of object-level differences. In contrast, the Embedding Comparator computes a similarity metric for every embedded object and initializes its view to begin with the objects that are most and least similar between the two models. As we find in our formative interviews (Section 3), users would benefit from tools that systematically identify objects of interest, alleviating the need to sample them in a random or biased way. Moreover, we present these objects in an interactive graphical system that facilitates rapid exploration of differences between the embedding spaces.

Figure 2: A matrix summary of our formative interviews grouped by user class. Users compare embedding spaces for a variety of tasks and goals, but find current processes unstructured and inconsistent due to their reliance on ad hoc exploration. [The matrix characterizes users U1–U13 by domain (machine learning, computational biology, digital humanities), affiliation (academia or industry), and title (graduate student, engineer, researcher, professor); by their project types, use cases, and comparison tasks; and by the tools, comparison strategies, and challenges they reported.]

3. Formative interviews

To understand how embedding spaces are analyzed and compared, and to identify process pitfalls and limitations of existing tools, we conducted a series of semi-structured interviews with 13 embedding users from both academia and industry. As listed in Fig. 2, our users work in a diverse range of domains, hold a variety of titles, and use embeddings in a wide range of projects. We recruited embedding users from within our organization and through an open call on Twitter. Since our goal was to learn how users leveraged embedding spaces, we filtered out users who reported having no experience with embeddings.

We conducted semi-structured interviews with one interviewee at a time. Each interview lasted 30–60 minutes. We initially conducted interviews with users in person but switched to video chat due to the COVID-19 pandemic. In each interview, we began by describing our study objective: to understand the landscape of embedding use. We then asked users open-ended questions and encouraged users to describe their specific experience with embeddings, such as "describe how you used embedding spaces in a recent project." Our questions aimed to answer the following:
• What types of projects are embeddings used in?
• How do embeddings help users achieve their goals?
• When do users compare embedding spaces?
• What tools and strategies are used to compare embeddings?
• What challenges do users face when comparing embeddings?

3.1. User tasks and goals

Through our interviews, we find users compare embeddings in consistent ways, but their tasks and goals for comparing embeddings differ significantly (see Fig. 2). Specifically, we find users fall into two categories: model-driven users, who compare embedding spaces as a way to understand model performance and behavior; and data-driven users, who study embedded representations to uncover properties of the underlying data.

Model-driven users. Model-driven users analyze and compare embeddings as a way to develop a deep knowledge of how their models work and why. These users are interested in questions such as: what is responsible for the performance improvement between two models? (U1–U5), what embeddings should I use for initialization? (U1, U4, U6), what model will perform the best on my downstream task? (U1, U2, U4, U5, U8), and how do differences in training data affect model behavior? (U3, U7). To answer these questions, model-driven users perform embedding comparison tasks such as: comparing learned embeddings from different layers of the same model, comparing embeddings from different models trained on the same data, comparing learned embeddings to the ground truth, and comparing generic embeddings to those that have been fine-tuned for a particular task.

Data-driven users. Data-driven users utilize embeddings as a way to represent and understand relationships in the data they study. These users typically work on projects in applied machine learning domains like computational biology or historical linguistics and use embedding spaces "to investigate questions that have been made through non-computational methods" (U13). Example questions from data-driven users we interviewed include: what are the structural relationships between protein sequences? (U12), what is the relationship between x-ray images and corresponding radiology reports? (U9), and how has the plot speed in fiction novels shifted over time? (U10). Use cases include selecting embeddings for downstream tasks (e.g., comparing pre-trained word embeddings to embeddings fine-tuned on task-specific literature for document classification [U11]) and using embeddings to analyze differences in the data (e.g., comparing word embeddings trained on literature from different centuries [U13] or comparing embedded protein relationships to ground truth data [U12]).

3.2. Current processes and challenges

Through our interviews, we find model-driven and data-driven users use similar tools, apply similar strategies, and run into common problems when comparing embedding spaces. A common workflow, referred to as "global check and local check" by U5 and their colleagues, involves comparing the global structures of embedding spaces via dimensionality reduction and then selecting objects within the space and comparing their k-nearest neighbors. Analyzing the global structure of embedding spaces enables users to gain insight into the semantic concepts represented by each embedding space, while comparing local neighborhoods of select objects can unearth unexpected relationships between objects or confirm user hypotheses. For example, when studying how literary texts personify non-human characters, U11 analyzed the global structure of their embedding spaces using dimensionality reduction techniques, finding the spaces had separated into discrete clusters representing personhood identifiers: professions, education types, etc. This finding propelled U11 to probe the local neighborhoods of known character types within each cluster, like "pilot", to discover new associated personifying words, like "train conductor".

Comparing local neighborhoods of specific objects was a critical task for the majority of our users (11/13), especially in cases when global projections were unable to meaningfully segment the space; however, many users expressed concerns that the way they selected objects to analyze was unprincipled and reduced their confidence in their results. Data-driven users selected objects using their domain expertise (e.g., words known to change over time [U11, U13] or proteins known to interact with SARS-CoV-2 [U12]), but expressed concern that this approach would not uncover unexpected results. Model-driven users selected objects they expected would be challenging for the model (U6), objects they hypothesized would display insightful differences (U1, U4, U5), or objects selected at random (U6); however, they were often concerned that this unsystematic approach prevented them from understanding the entire space and could be viewed as cherry-picking by the research community. To mitigate these concerns, many users additionally compared embeddings on quantitative downstream tasks (e.g., comparing word embeddings on genre classification [U10]). While doing so added some sense of rigor to their process, users found this procedure to be "embarrassingly empirical" (U10) because it did not provide the same rich insights users got from local neighborhood exploration. As U1 emphasized, "Qualitative [analysis] is often more powerful than just a single quantitative number."

Throughout our interviews, users expressed concerns about the unreliability of common analysis techniques such as t-SNE and embedding alignment algorithms (U4, U7, U9, U13). We found users distrust t-SNE due to its sensitivity to hyperparameters and stochasticity, which can lead to wildly different projections and oftentimes misleading results [WVJ16]. Comparing embeddings using t-SNE caused users to distrust their conclusions as they were unable to draw a distinction between "real" findings and t-SNE artifacts. This led U4 to stop using t-SNE altogether, noting, "You can tease apart whatever you want from t-SNE. Sometimes it shows you what you want and if it doesn't then you can spend a bit of time until you see what you want to see." Users expressed similar concerns about embedding space alignment algorithms, such as orthogonal Procrustes, because the algorithms can produce unreliable and meaningless alignments (see Section 2.3). While U13 previously applied Procrustes alignment, they are hesitant to use it again in the future because they could "only find the story [they] wanted to tell sometimes" and struggled to determine whether their results were representative of the embedding spaces or resulted from the unpredictability of the method.

3.3. Embedding Comparator design goals

To inform the design of the Embedding Comparator, we distill our formative interviews into the following design goals:

1. Surface similarities and differences for systematic comparison. Across all users, the central goal of comparing embedding models is to understand the degree to which they are similar, and where the differences lie. A recurrent breakdown is how unprincipled and time-consuming identifying this information is with current approaches (e.g., performing a dimensionality reduction that does not result in informative separation, or manually generating objects to analyze via k-nearest neighbors). Moreover, our interviews suggest that users are most interested in seeing the objects that are most similar or different — as U2 noted, "It is most interesting what happens at the extremes."

2. Display projection plots to surface global semantic differences between embedding spaces. The majority of our users (9/13) use dimensionality-reduced projection plots as a primary mechanism for evaluating embedding spaces. Besides users' familiarity with these views, our interviews highlighted that visualizing global structure provides necessary context to meet our first design goal, as the overall shape, density, and clustering of a projection can reveal similarities and differences at a glance. For example, U1 described how viewing the global projection of a particular embedding model caused them to stumble upon an unexpected structure they later wrote a paper about.

3. Compare local neighborhoods to identify object-level similarities and differences. While projection plots provide useful global context, the substance of our users' analysis occurs by drilling down into and comparing the local neighborhoods of embedded objects. For example, U2 noted that even in their papers they "often report a few nearest neighbors of their models to show that they capture [meaningful] properties". The Embedding Comparator must provide interface elements that facilitate rapid comparisons of reciprocal local neighborhoods for each embedded object.

4. Interactively link global and local views for rapid analysis. Users typically explained their global and local analysis as a single technique; however, current tools force users to complete each task independently by first projecting the high-dimensional space and independently querying for objects of interest. The lack of interactivity slows down their analysis processes and makes it difficult to deeply explore and identify differences that are not obvious. Thus, the Embedding Comparator should use interactive techniques to link global and local views together, including allowing users to select local views from global views, highlight local neighborhoods within the global projections, and perform targeted searches to surface the local neighborhoods of specific embedded objects.

5. Yield deterministic and reproducible results that inspire user confidence. Our formative interviews revealed many users avoided unreliable analysis techniques such as t-SNE and Procrustes alignment. Users want to be confident the patterns they uncover are representative of true patterns in the embedding space and are not artifacts of non-deterministic tools. As such, the Embedding Comparator should leverage techniques that produce reproducible results users can trust.

4. System design

Informed by our formative interviews, the Embedding Comparator computes a similarity score for every embedded object based on its reciprocal local neighborhoods. Critically, this similarity metric does not require the two models to have the same dimensionality, nor do they need to be aligned in any way. As a result, the Embedding Comparator is capable of supporting a wide range of embedding models. Furthermore, the simplicity of computing overlap between reciprocal local neighborhoods makes this metric intuitive to a broad range of users, which we validate in our first-use studies (Section 6). To allow users to rapidly identify similarities and differences between the models, this similarity metric is visualized via a number of global views (including projection plots and a histogram distribution) as well as through local neighborhood dominoes: small multiple views of local substructures.

4.1. Computing local neighborhood similarity

An embedding space is a function E : V → R^d that maps objects in vocabulary V into a d-dimensional real-valued vector space. For example, in NLP, V may be a vocabulary of English words, and E maps each word into a 200-dimensional vector.

Here, we consider two embedding spaces E_1 and E_2 over the same set of objects V. The embedding spaces may have different bases and may even have a different number of dimensions d. For this reason, we compute the similarity for each object in the embedding space through similarity of the reciprocal local neighborhood around the object in each of the embedding spaces. Precisely, for each object w ∈ V, we compute the local neighborhood similarity (LNS) of w between the two embedding spaces as:

LNS(w) = S(k-NN_1(w), k-NN_2(w))    (1)

where k-NN_i(w) returns the k-nearest neighbors of w in embedding space E_i and S is a similarity metric between the two lists of nearest neighbors. Here, we compute k-nearest neighbors with cosine or Euclidean distance between the embeddings [TP10] and take S as the Jaccard (intersection over union) similarity between the sets of neighbors. The Jaccard similarity between two sets C_1 and C_2 is defined as J(C_1, C_2) = |C_1 ∩ C_2| / |C_1 ∪ C_2| and scales between 0 (the sets are disjoint) and 1 (the sets are identical). Users can choose between these different metrics in the interface, or introduce additional functions via a JavaScript API call. This deterministic metric enables the Embedding Comparator to systematically sort and identify objects of interest, alleviating the need for users to sample them in a random or biased way or to formulate hypotheses a priori about which objects to investigate (Design Goal 1, Design Goal 5).

4.2. Global views

The Embedding Comparator's left-hand sidebar (shown in Fig. 3a–e) provides configuration options and interactive global views of the embedding spaces. A user begins by selecting a dataset, which specifies the embedding vocabulary, followed by two models, each of which defines the embedding space for that vocabulary (Fig. 3a). Users can load arbitrarily many models, and by decoupling datasets from models, the Embedding Comparator makes it easy to compare several different models with the same vocabulary.

Beneath each model, the Embedding Comparator shows a global projection (Fig. 3b): a scatter plot that depicts the geometric structure of the model's embedding space, with object names shown in a tooltip on hover. As per Design Goal 2, these projections provide valuable context during exploration and can help reveal interesting global properties such as the presence of distinct clusters. We use PCA to perform dimensionality reduction because, as compared to alternatives like t-SNE, it is deterministic (Design Goal 5) and highlights rather than distorts the global structure of the embedding space [Jol86, WVJ16]. However, we provide a modular system design, and users can choose an alternate technique.

The similarity distribution (Fig. 3c) displays the distribution of LNS similarity values (Eq. 1) over all objects in the embedding space. Bars are colored using a diverging red-yellow-blue color scheme to draw attention to the most extreme values (objects that are the most and least similar between the two selected models) in accordance with Design Goal 1. We reapply this color encoding in the global projections to help users draw connections between the two visualizations and to help reveal global patterns or clusters of objects with related similarity values (such as those in the case studies of Fig. 4 and Fig. 5). Both the similarity distribution and the global projections can be used to interactively filter the dominoes (Design Goal 4; Section 4.3), and the search bar (Fig. 3e) can be used to populate the dominoes of specific object(s) of interest.
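Concretely, Equation 1 amounts to a k-NN query in each space followed by a Jaccard overlap. The following pure-Python sketch illustrates the metric end to end; the toy vocabulary, helper names, and the choice of k = 2 are our own illustration, not the system's actual (web-based) implementation:

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norms

def k_nearest(word, embedding, k):
    """The set of the k nearest neighbors of `word` (excluding itself)."""
    dists = sorted((cosine_distance(embedding[word], vec), other)
                   for other, vec in embedding.items() if other != word)
    return {other for _, other in dists[:k]}

def lns(word, emb1, emb2, k=50):
    """Local neighborhood similarity (Eq. 1): Jaccard overlap of k-NN sets."""
    n1, n2 = k_nearest(word, emb1, k), k_nearest(word, emb2, k)
    return len(n1 & n2) / len(n1 | n2)

# Toy vocabulary embedded in two spaces of different dimensionality:
# the metric never compares vectors across spaces, so d may differ.
emb_a = {"mean": [1.0, 0.0], "average": [0.9, 0.1],
         "convey": [0.8, 0.3], "median": [0.0, 1.0]}
emb_b = {"mean": [1.0, 0.0, 0.0], "average": [0.9, 0.1, 0.0],
         "convey": [0.0, 1.0, 0.0], "median": [0.8, 0.2, 0.0]}

scores = {w: lns(w, emb_a, emb_b, k=2) for w in emb_a}
# Sorting by score surfaces the least and most similar objects first,
# mirroring the default two-column domino view described in Section 4.3.
ordered = sorted(scores, key=scores.get)
```

Because both the k-NN queries and the Jaccard overlap are deterministic, rerunning this computation always yields the same scores, which is the property the system relies on for Design Goal 5.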

Figure 3: Users populate the Embedding Comparator by selecting a pair of embedding models (a). Global projections (b) visualize the geometry of the embeddings, and local neighborhood dominoes (f) visualize an object’s local neighborhood. Users explore by brushing over the similarity distribution (c), tuning the neighborhood parameters (d), and searching for objects (e). Hovering on a domino highlights its neighborhood in the global projection (h), and hovering on an object in a domino highlights that object in the local neighborhood plots (g).
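The global projections in panel (b) use PCA by default. A real deployment would call a standard library implementation, but the computation can be sketched deterministically in pure Python via power iteration on the covariance matrix; the function names and toy data below are our own illustration, not the system's code:

```python
import math
import random

def top_pcs(vectors, n_components=2, iters=200, seed=0):
    """Top principal components of a list of d-dimensional vectors,
    via power iteration with deflation on the covariance matrix."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    cov = [[sum(row[i] * row[j] for row in centered) / n
            for j in range(d)] for i in range(d)]
    rng = random.Random(seed)  # fixed seed keeps the projection reproducible
    components = []
    for _ in range(n_components):
        w = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w)) or 1.0
            w = [x / norm for x in w]
        components.append(w)
        # Deflate: subtract the found component's variance so the next
        # power iteration converges to the following eigenvector.
        lam = sum(w[i] * sum(cov[i][j] * w[j] for j in range(d)) for i in range(d))
        cov = [[cov[i][j] - lam * w[i] * w[j] for j in range(d)] for i in range(d)]
    return mean, components

def project(vec, mean, components):
    """Coordinates of a vector in the plane spanned by the components."""
    c = [v - m for v, m in zip(vec, mean)]
    return [sum(ci * wi for ci, wi in zip(c, w)) for w in components]

# Toy data: variance is concentrated along the first axis, then the second,
# so the recovered axes approximate the coordinate axes.
vectors = [[3, 0, 0], [-3, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 0.1], [0, 0, -0.1]]
mean, comps = top_pcs(vectors)
points_2d = [project(v, mean, comps) for v in vectors]
```

Unlike t-SNE, the same input always produces the same 2-D scatter plot (up to the sign of each axis), which is why the paper prefers PCA for its default global views.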

The parameter controls (Fig. 3d) enable the user to interactively change the value of k used to define the size of local neighborhoods for computing similarity, and select the distance metric used for computing distance between vectors. The default value of k is 50, which we found to provide insightful results across our various experiments and case studies. Changes in either of these controls immediately update the rest of the interface (Design Goal 4).

4.3. Local neighborhood dominoes

To meet Design Goal 3, the Embedding Comparator introduces local neighborhood dominoes: a small multiples visualization to surface local substructures and facilitate rapid comparisons. Each domino consists of a set of interactive neighborhood plots and common and unique neighbor lists.

Neighborhood plots — side-by-side scatter plots that show the k nearest neighbors of the domino object in each model — graphically display the relationships between the object and its neighbors. These plots use the same projections as the global projection views (Section 4.2) to ensure all geometries are visualized consistently. Color encodes whether each neighbor is common to both models or unique to a single model. These neighbors are also displayed as separate scrollable lists above and below the plots and are sorted by the distance to the main object. For example, Fig. 3f shows the domino for the word "mean" from an embedding trained on text from the early 1800s (model A, left) to the 1990s (model B, right). To facilitate cross-model comparisons, dominoes are interactive: hovering over a neighbor in the plots or lists highlights it across the entire domino (Fig. 3g). Motivated by Design Goal 1 and our users' feedback that the most interesting insights often lie at the extremes, the default view of the Embedding Comparator lists two columns of dominoes (shown in Fig. 4): the first displays dominoes of the least similar objects (in increasing order) and the second displays those of the most similar objects (in decreasing order). To increase information scent [Pir03], we adapt the Scented Widgets technique [WHA07] and augment the scroll bars with a list of domino objects and sparklines of their similarity scores (Fig. 4).

The dominoes' information-dense display is designed to facilitate rapid acquisition of neighborhood-level insights. By scanning down the dominoes, users see not only the geometries involved but also specific common and unique neighbors to trigger hypothesis generation. Previous iterations of the Embedding Comparator used separate tabular lists to display the most and least similar words across models (and common and unique neighbors for individual objects) but, in early user tests, we found that such a presentation produced a high cognitive load as users tried to map back and forth between the various lists. Thus, with the domino design, we encapsulate all local neighborhood information associated with a given embedded object into a single interface element while still supporting cross-model comparisons. For instance, in the "mean" domino in Fig. 3f, the neighbor plots reveal substructures within the local neighborhoods — in model B, the bottom appears to relate to the mathematical notion of "mean", while the top is more synonymous with "convey" and shares neighbors with model A. We confirm this hypothesis by scanning the common and unique lists, where we observe more mathematical words under model B than A.

4.4. Linking global and local views

Linking the global and local views (Design Goal 4) enables users to rapidly iterate between comparing the overall embedding spaces and inspecting individual objects. When hovering over a domino, the object and its local neighborhoods are highlighted in each of the respective global projections with the purple/green color encoding preserved (Fig. 3h). This interaction allows users to contextualize local neighborhoods within the overall embedding space. Similarly, interactive selections (e.g., lassoing or brushing) in the global projections or similarity distribution filter the dominoes, allowing users to drill down and investigate objects of interest.

5. Evaluation: case studies

We illustrate how the Embedding Comparator accelerates real-world workflows through case studies drawn from our formative interviews (see supplement for additional applications).
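The per-object similarity score that drives these comparisons can be sketched compactly. The following is an illustrative reconstruction, not the authors' implementation: the paper computes the similarity of the k-nearest neighbors of every embedded object between the two spaces, and here we assume cosine similarity and measure neighborhood overlap as the fraction of shared neighbors (both assumptions):

```python
import numpy as np

def knn_indices(emb, k):
    """Row-wise k nearest neighbors (excluding self) under cosine similarity."""
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = normed @ normed.T
    np.fill_diagonal(sims, -np.inf)  # an object is never its own neighbor
    return np.argsort(-sims, axis=1)[:, :k]

def neighborhood_similarity(emb_a, emb_b, k=50):
    """Per-object similarity between two embedding spaces: the fraction of an
    object's k nearest neighbors that are shared across models A and B.
    Rows of emb_a and emb_b must index the same vocabulary."""
    nn_a, nn_b = knn_indices(emb_a, k), knn_indices(emb_b, k)
    return np.array([len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)])
```

Objects scoring near 0 would populate the least-similar domino column and objects near 1 the most-similar column; swapping in a different distance metric or value of k mirrors the parameter controls described above.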

Figure 4: The Embedding Comparator applied to case study Transfer learning for fine-tuned word embeddings compares a model before and after fine-tuning for sentiment analysis. The Embedding Comparator reveals the effect of transfer learning on the embedding space via the global projections, as well as neutral sentiment words that have taken on sentiment meanings (e.g., "avoid") via the dominoes.

5.1. Transfer learning for fine-tuned word embeddings

Our formative interviews reveal that users employ transfer learning to improve performance on downstream tasks [LSD15, HR18], but existing tools make it difficult to compare the trade-off between pre-trained embeddings' generalizability and fine-tuned embeddings' domain-specific expressiveness. To evaluate the Embedding Comparator on this task, we develop a transfer learning case study in collaboration with an ML researcher (a model-driven user). Here, the researcher trains an LSTM network [HS97] to predict whether movie reviews express positive or negative sentiment. The researcher initializes the network using pre-trained fastText English word embeddings [MGB∗18] and refines them using limited movie review data [MDP∗11]. Once complete, the researcher wishes to investigate the effect of the refinement process: for example, identifying words that are ordinarily synonyms but not in the context of sentiment prediction, or words that are not ordinarily synonyms but become interchangeable in the context of sentiment prediction.

We use the Embedding Comparator to compare the pre-trained fastText word embeddings with those fine-tuned for the sentiment analysis task. Fig. 4 shows the initial view in the Embedding Comparator after loading both models. Our system immediately surfaces a number of insights about how the embedding space has changed as a result of fine-tuning. The color encoding shared between the similarity distribution and global projections reveals that the words that changed the most as a result of fine-tuning have moved toward the outer regions of the embedding space. By hovering over the fine-tuned embedding space's global projection, the researcher finds positive sentiment words moved toward the left side of the projection, while negative sentiment words moved toward the right. These interactive global views enable the researcher to identify how fine-tuning the embeddings for this classification task has affected the global arrangement of objects in the vector space.

By inspecting local neighborhood dominoes at the extremes of the similarity distribution, the researcher can understand how fine-tuning has affected the semantic meaning of specific objects. For example, in the original space "bore" is most closely related to "carrying" or "containing", but after fine-tuning for sentiment analysis it is most closely related to "boredom", "boring", or "dull" — hence, fine-tuning has had the intended effect, where "bore" now takes on its sentiment-related meaning. Similarly, "redeeming" has changed in sentiment from the positive sentiment definition "compensate for faults" to a negative sentiment definition likely related to the reviewer idiom "no redeeming qualities". Besides words, the dominoes reveal that numbers less than 10 have also been affected by the sentiment analysis task — for example, "7" has become more closely related to positive adjectives as a result of the numeric scales used to rank movies within the reviews. Words that remained unchanged seem uncorrelated with movie sentiment, such as proper nouns like "vietnam" and "europe" as well as "cinematographer".

The researcher we worked with on this case study was excited about the results the Embedding Comparator helped reveal. Generating these types of insights with prior tools would require significant tedious effort — besides manually constructing the necessary views (e.g., within a Jupyter notebook), the researcher would have needed to formulate hypotheses a priori about which specific words to investigate. In contrast, by calculating a similarity score for every word, and by visualizing local neighborhoods as dominoes, the Embedding Comparator surfaces this information more directly. As a result, the system transforms the process of comparing embedding models from requiring explicit and manual steering by the researcher, towards more of a browsing experience. This shift frees the researcher to focus on generating and answering hypotheses about their models, and allows for more serendipitous discovery. For instance, while using the Embedding Comparator, the researcher discovered an unexpected result: the word "generous" adopted a more negative meaning after fine-tuning. Explaining this finding required digging into the training data to uncover phrases such as "I'm being generous giving this movie 2 stars" and "2 stars out of a possible 10 and that is being overly generous."

Figure 5: The Embedding Comparator applied to case study Language evolution via diachronic word embeddings compares word embedding models trained on literature from different decades. The Embedding Comparator demonstrates how the English language has changed over time, e.g., the word "gay" has changed in meaning from "happy" to "homosexual" while the meanings of numbers have not changed.

5.2. Language evolution via diachronic word embeddings

Our second case study follows how a linguist (a data-driven user) employs embedding models to study the evolution of languages over time. Word embeddings have been shown to capture diachronic changes in language (i.e., changes over time) [HLJ16b]. Here, we use diachronic word embeddings from HistWords [HLJ16b], trained on English books written from 1800–2000 grouped by decade. We select embeddings from five different decades spanning this time period (1800–1810, 1850–1860, 1900–1910, 1950–1960, and 1990–2000) and evaluate how the Embedding Comparator surfaces words whose semantics have evolved.

Fig. 5 shows the Embedding Comparator comparing embeddings of text written in 1900–1910 to text written in 1990–2000. The linguist can immediately replicate known insights [HLJ16b], such as the change in meaning of "gay" (from "happy" to "homosexual") and "major" (from "military" to "important") over time. It also reveals words such as "aids", whose meaning changed from "assists" to the disease HIV/AIDS, which was not named until the early 1980s [fDCC∗82], along with many other words that are similarly ripe for further linguistic analysis. In contrast to the original analysis [HLJ16b], with the Embedding Comparator, there is no need to manually align the various embedding spaces, nor do users need to define and compute a task-specific semantic displacement metric to uncover these findings. Our method for comparing embedded objects through local neighborhoods is agnostic to application domains and tasks. As a result, the Embedding Comparator scaffolds and accelerates the analysis process for users regardless of their ML expertise — novice users need minimal technical knowledge to replicate established linguistic analysis, while experts can devote their effort to designing task-specific metrics only when necessary.

To further compare the two models, the linguist can explore the effect of varying the number of neighbors k using the nearest neighbors slider. Smaller values of k enable comparisons of how immediate neighbors of a word have changed over time. By dragging the slider to decrease k, the linguist discovers words that have high similarity at small values of k, such as "les", which has the same small set of French words as its nearest neighbors in both models. This finding may be surprising to the linguist given the models were trained on English text. Larger values of k enable comparisons of larger neighborhoods. Words with high similarity at larger k have large neighborhoods that have remained consistent over time. Increasing k reveals words such as days and months whose meanings have remained consistent between the time periods.

Using this case study, we can also see the benefits of being able to easily switch between alternative models trained on the same dataset. For example, using the corresponding drop-down menus, the linguist could switch to comparing 1800–1810 vs. 1900–1910 to find that "nice" moves away from meaning "refined" and "subtle" and moves toward "pleasant", in line with previous findings [HLJ16b]. Similarly, by fixing one model to 1990–2000 and varying the other, the similarity distribution reveals the pairwise diachronic changes: bars would shift rightward as models trained on more recent text have more similar words.
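The common and unique neighbor lists at the heart of these analyses, and the effect of re-running a comparison at several values of k to mimic the nearest neighbors slider, can be sketched as follows. This is an illustrative reconstruction rather than the system's actual code, and it assumes cosine similarity:

```python
import numpy as np

def nearest(vocab, emb, word, k):
    """The k nearest neighbors of `word` by cosine similarity, nearest first."""
    idx = vocab.index(word)
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = normed @ normed[idx]
    sims[idx] = -np.inf  # exclude the word itself
    return [vocab[i] for i in np.argsort(-sims)[:k]]

def domino_lists(vocab, emb_a, emb_b, word, k):
    """Common and unique neighbor lists for one word across models A and B,
    mirroring the lists displayed on a domino."""
    na = set(nearest(vocab, emb_a, word, k))
    nb = set(nearest(vocab, emb_b, word, k))
    return sorted(na & nb), sorted(na - nb), sorted(nb - na)
```

Calling `domino_lists` with increasing k corresponds to dragging the slider: words whose unique lists stay empty even at large k have neighborhoods that remained consistent between the two models.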

6. Evaluation: first-use studies

The Embedding Comparator is designed to help users efficiently analyze the similarities and differences between embedding spaces. To evaluate its effectiveness, we performed first-use studies in which participants compared two embedding spaces using their normal process in a Jupyter Notebook [KRKP∗16] and with the Embedding Comparator. In particular, users compared HistWords [HLJ16b] embeddings from 1800–1810 to embeddings from 1990–2000 (Section 5.2). We chose HistWords embeddings because they model English language over time, so no task-specific knowledge was required. We recruited participants via an open call within our organization. We performed first-use studies via video chat with 15 representative users: six computer science graduate students, one psychology graduate student, one computational biology graduate student, four ML engineers, and three post-docs, who all self-reported having researched or worked with embeddings.

Each first-use study lasted 60–90 minutes, and participants were compensated with $20 Amazon gift cards. Users spent half of the study using the Embedding Comparator and half using the Jupyter Notebook. We randomized the starting interface: seven users started with the Embedding Comparator and eight users started with the Jupyter Notebook. Each condition began with an introduction to how the embeddings were loaded into the interface and a prompt to determine the similarities and differences of the two embedding spaces. In the Jupyter Notebook, we provided users with code to plot dimensionality reduction projections of the embeddings and print the nearest neighbors of an object, and we encouraged users to use any tools — including their own code — that would help them analyze the embeddings. Participants were asked to think aloud throughout the study and, at the end of each condition, were asked to enumerate the similarities and differences between the two embedding spaces and describe what they would do if they had access to the original models and data.

Figure 6: Using the Embedding Comparator, participants gained more insights and theories and reduced their time to first insight. [Three scatter panels (Number of Insights, Number of Theories, Seconds to First Insight) plot each participant's Embedding Comparator result (y-axis) against their Jupyter Notebook result (x-axis).]

6.1. Quantitative results

We reviewed recordings of each study session and measured the insights and theories our users identified using each interface. We define an insight as the user indicating that an embedded object, or type of embedded object, is represented similarly or differently in the two embedding spaces in a meaningful way (e.g., "gay" moving in meaning from "happy" to "homosexual" or an embedding space projection showing meaningful clusters). We say the user had a theory if they generated a reason for an insight or expressed a desire to look at the original data or models to better understand an insight.

We measure the number of insights, number of theories, and time to first insight (measured from the end of our introduction) for both conditions and show participant-level results in Fig. 6. We compute statistical significance using a single-tailed paired sample t-test. Users developed their first insight faster (p = 0.003) using the Embedding Comparator (µ = 1:16 min, σ = 1:20 min) as compared to the Jupyter Notebook (µ = 8:13 min, σ = 7:54 min). Users also developed more insights (p = 0.004) using the Embedding Comparator (µ = 10.7, σ = 6.6) as compared to the Jupyter Notebook (µ = 4.1, σ = 3.3), and users generated more theories (p = 0.001) using the Embedding Comparator (µ = 1.7, σ = 1.6) as compared to the Jupyter Notebook (µ = 0.3, σ = 0.5).

6.2. Qualitative results

Despite having access to starter code in the Jupyter Notebook and being encouraged to use their own code or outside resources, participants experienced a barrier to entry in coding and found it challenging to write code that aided them in their analysis of the embeddings. Three participants did not write any of their own code during the 45-minute Jupyter Notebook section, and only four participants felt the code they had written allowed them to understand the similarities and differences between the two embedding spaces. A common workflow used by nine participants involved generating words they expected to differ between the two spaces and comparing their nearest neighbors. This process caused a user's biases to drive their exploration (e.g., "I am looking for a word with two meanings where the meanings are very different" [P6]; or, "because I am Arab, I want to look at [the word] Arab" [P1]). As a result, users rarely uncovered unexpected results, and they were often frustrated because they had to rely on their intuition — as P1 said, "I don't have a sophisticated way of looking through this. Some 1800s and 1900s history knowledge would be useful." Interestingly, after using the Jupyter Notebook, several participants did not feel that they had developed sufficient insights to understand how the spaces differed. For instance, when asked to enumerate the similarities and differences they had uncovered, P9 said, "I don't have a good understanding of these embeddings," and P15 answered, "I don't think I gained enough insights." Further, P14, who had been exposed to the Embedding Comparator first, felt the Embedding Comparator's "quick and helpful analysis" had enabled them to "know right away what embeddings to use [in a downstream task]", so they "would not have written any code."

In comparison, when using the Embedding Comparator, users generated insights quickly by looking at the most and least similar lists. Common insights included the meaning of numbers and religious words (e.g., "catholic" and "god") staying constant over time (P1, P2, P5, P7–P13); "aids" changing in meaning from "help" to "HIV" (P1, P3, P6, P8, P11–P13); and "logic", "calculated", and "volume" shifting towards mathematical denotations (P1, P3, P5, P9, P10, P12, P13). Using the Embedding Comparator, users were able to gain insights into global semantic differences between the embedding spaces. For instance, P9 found "words that are commonly used without significant changes to their meaning over time are similar between the two embeddings, whereas other words that have changed their usage over time or correspond to a new invention, like the car, are of course dissimilar in the two embeddings. I did not find this in my previous analysis."

Users found the dominoes invaluable in their exploration, saying

"it's hard to compress all this information in a Jupyter Notebook, so it's nice to have a tool to be able to browse a lot of words at once" (P5), and "I am able to see [objects] in a graphical view which helps me see [similarities and differences]" (P3). Using the dominoes, users were able to analyze many objects at a time and speed up their analysis. For instance, P13 found dominoes allowed them to "get a quick sense of what words distinguish one [embedding space] from the other" and P8 articulated, "The differences of the words across time immediately popped out here. This was precisely what I was trying to articulate with the [Jupyter Notebook]."

Our first-use studies also validate our local neighborhood similarity metric and the use of this metric to drive our system by sorting dominoes. Interestingly, without prior knowledge of the Embedding Comparator, one participant implemented a metric similar to ours for computing the local neighborhood intersection during their Jupyter exploration, suggesting local neighborhood similarity is intuitive to users. However, their implementation was not optimized and did not scale to compute similarity for a sample of more than 40 words, leading to time spent waiting for code to run and unease while debugging. Since the Embedding Comparator calculates similarity for all embedded objects automatically when embeddings are loaded, it shifts users to immediately focus on meaningful parts of the analysis process as opposed to tedious implementation.

Laying out the dominoes ordered by similarity aided users in understanding the similarities and differences between the embedding spaces. P1 attributed seeing "a lot more words and what is most similar and least similar" as the reason why they generated insights faster using the Embedding Comparator and compared the two interfaces by saying "in the Notebook words were randomly chosen, but here they are sorted, which lets you see things in a more organized way and lets words stick out." P1 found it to be invaluable when generating insights, saying the "main thing that was useful was having [the Embedding Comparator] order everything, being able to look at the different parts of the distribution, and seeing [objects] in order - the least similar and the most similar." Similarly, P5 reported "seeing the differences in sets in the nearest neighborhoods is a much easier way to compare the sets of embeddings" and P2 summarized their experience by saying, "[the Embedding Comparator] presented the information in a better way and just by doing that we have some sort of a breakthrough."

After analyzing the dominoes in the default view of the Embedding Comparator, users varied the parameter controls to further compare models and test hypotheses. Users often brushed over the similarity distribution to analyze words at a specific similarity value (P1, P5, P7–P15), used the search bar to analyze words that had come to mind during their analysis (P2, P3, P5–P15), and varied k to explore the hierarchy of a word's local neighborhood (P1, P2, P6, P7, P9–P15). For example, after discovering the neighbors of "content" were related to "satisfied" in one model and "substances" in the other, P10 wanted to understand "how these models represent [words] that have two meanings." They hypothesized that both models may have learned both definitions and k was simply too small to see the full neighborhood. After increasing k, P10 found the spaces were still very dissimilar, leading them to theorize one dataset may have contained more scientific text than the other.

When using the Embedding Comparator, users were also able to generate numerous theories including ideas for improving the data representation (e.g., preprocessing numbers into a learned token [P5]); new experimental procedures (e.g., training embeddings on native English books vs. books translated from Chinese as a way to compare culture [P7]); reasons for representation failures (e.g., word frequency affects semantic stability over time [P1, P9, P10, P14], or words used in many contexts are susceptible to being represented differently between the two embedding spaces [P2, P10, P12, P13]); and questions about the original data (what is the distribution of genres included in the datasets? [P5–P7, P9, P10, P12] and what was the data preprocessing pipeline? [P3, P12]). After using both interfaces, P5 noted the Embedding Comparator "made it very clear that there were differences in the embedding spaces" as compared to the Jupyter Notebook where "it is a lot harder to find the differences." As P4 said, using the Embedding Comparator "would be very helpful to my research."

7. Discussion and future work

We present the Embedding Comparator, a novel interactive system for comparing embedding spaces. Informed by formative interviews conducted with embedding users across disciplines, our design balances between visualizing information about the overall embedding spaces and enabling exploration of local neighborhoods. To directly surface similarities and differences, a similarity score is computed for every embedded object and encoded across global and local views. To facilitate rapid comparisons, we introduce local neighborhood dominoes: small multiple visualizations of local geometries and lists of common and unique objects. Through case studies of tasks described by our formative interviewees, we demonstrate how the Embedding Comparator transforms the analysis process from requiring tedious and error-prone manual specification to browsing and interacting with graphical displays. Moreover, by computing a similarity score for each object, and using it to drive the various views, the Embedding Comparator immediately surfaces interesting insights, and published domain-specific results can be replicated with only a handful of interactions.

Further study of our design goals suggests several avenues for future work. One extension, to provide richer linking between global and local views, is to consider how metadata tags (e.g., part-of-speech or sentiment) can be visualized and used to filter dominoes. Dominoes were designed around concise visual representations of objects that correspond to semantically meaningful differences between the objects, and one direction for future work is how to visualize objects when such representations do not implicitly exist. For instance, for sentence or document embeddings [LM14], it would be interesting to explore techniques for identifying specific components of each object (e.g., a subset of words in a document) responsible for the differences in the embeddings and visually displaying these via dominoes. Our similarity metric usefully surfaces differences between pairs of models, and one next step would generalize this metric to consider n-wise similarities. Finally, how might interface elements in the Embedding Comparator allow users to link to specific examples in the training data? Such interaction would help further tighten the question-answering loop by providing fine-grained context for understanding why particular similarities or differences exist between the same object across multiple models.
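The n-wise generalization mentioned above could take several forms; one simple possibility (our assumption, not a design the paper commits to) scores each object by the fraction of its k nearest neighbors that are shared by all n models:

```python
def nwise_similarity(neighbor_sets, k):
    """One candidate n-wise metric: the fraction of an object's k nearest
    neighbors that appear in its neighborhood under every one of n models.
    `neighbor_sets` holds the object's k-NN collection from each model."""
    shared = set.intersection(*[set(s) for s in neighbor_sets])
    return len(shared) / k
```

For n = 2 this reduces to the pairwise neighborhood overlap used throughout the system; softer variants, such as averaging the pairwise overlaps across all model pairs, are equally plausible.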

Acknowledgments

This work is supported by a grant from the MIT-IBM Watson AI Lab. We also thank Jonas Mueller for helpful feedback.

References

[AH19] ALLEN C., HOSPEDALES T.: Analogies explained: Towards understanding word embeddings. In International Conference on Machine Learning (2019), pp. 223–231.

[ANH∗20] ARENDT D. L., NUR N., HUANG Z., FAIR G., DOU W.: Parallel embeddings: a visualization technique for contrasting learned representations. In Proceedings of the 25th International Conference on Intelligent User Interfaces (2020), pp. 259–274.

[BB19] BEPLER T., BERGER B.: Learning protein sequence embeddings using information from structure. In International Conference on Learning Representations (2019).

[BBB∗19] BILESCHI M. L., BELANGER D., BRYANT D. H., SANDERSON T., CARTER B., SCULLEY D., DEPRISTO M. A., COLWELL L. J.: Using deep learning to annotate the protein universe. bioRxiv (2019), 626507.

[BH14] BENGIO S., HEIGOLD G.: Word embeddings for speech recognition. In Fifteenth Annual Conference of the International Speech Communication Association (2014).

[BZK∗17] BAU D., ZHOU B., KHOSLA A., OLIVA A., TORRALBA A.: Network Dissection: Quantifying interpretability of deep visual representations. In Computer Vision and Pattern Recognition (2017).

[BZS∗19] BAU D., ZHU J.-Y., STROBELT H., BOLEI Z., TENENBAUM J. B., FREEMAN W. T., TORRALBA A.: GAN Dissection: Visualizing and understanding generative adversarial networks. In Proceedings of the International Conference on Learning Representations (ICLR) (2019).

[C∗15] CHOLLET F., ET AL.: Keras. https://keras.io, 2015.

[CMJG19] CARTER B., MUELLER J., JAIN S., GIFFORD D.: What made you do this? Understanding black-box decisions with sufficient input subsets. In Artificial Intelligence and Statistics (2019).

[CTL18] CHEN J., TAO Y., LIN H.: Visual exploration and comparison of word embeddings. Journal of Visual Languages & Computing 48 (2018), 178–186.

[EDH19] ETHAYARAJH K., DUVENAUD D., HIRST G.: Towards understanding linear word analogies. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (2019), pp. 3253–3262.

[ERA∗16] EISNER B., ROCKTÄSCHEL T., AUGENSTEIN I., BOSNJAK M., RIEDEL S.: emoji2vec: Learning emoji representations from their description. In Proceedings of The Fourth International Workshop on Natural Language Processing for Social Media (2016), pp. 48–54.

[ERR∗17] ENGEL J., RESNICK C., ROBERTS A., DIELEMAN S., NOROUZI M., ECK D., SIMONYAN K.: Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70 (2017), JMLR.org, pp. 1068–1077.

[fDCC∗82] CENTERS FOR DISEASE CONTROL (CDC), ET AL.: Update on acquired immune deficiency syndrome (AIDS) - United States. MMWR. Morbidity and Mortality Weekly Report 31, 37 (1982), 507.

[GBYH20] GLEICHER M., BARVE A., YU X., HEIMERL F.: Boxer: Interactive comparison of classifier results. In Computer Graphics Forum (2020), vol. 39.

[HE18] HA D., ECK D.: A neural representation of sketch drawings. In International Conference on Learning Representations (2018).

[HG18] HEIMERL F., GLEICHER M.: Interactive analysis of word vector embeddings. Computer Graphics Forum 37, 3 (2018), 253–265.

[HKPC18] HOHMAN F., KAHNG M., PIENTA R., CHAU D. H.: Visual analytics in deep learning: An interrogative survey for the next frontiers. IEEE Transactions on Visualization and Computer Graphics 25, 8 (2018), 2674–2693.

[HLJ16a] HAMILTON W. L., LESKOVEC J., JURAFSKY D.: Cultural shift or linguistic drift? Comparing two computational measures of semantic change. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (2016), vol. 2016, NIH Public Access, p. 2116.

[HLJ16b] HAMILTON W. L., LESKOVEC J., JURAFSKY D.: Diachronic word embeddings reveal statistical laws of semantic change. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2016), pp. 1489–1501.

[HPvU∗17] HÖLLT T., PEZZOTTI N., VAN UNEN V., KONING F., LELIEVELDT B. P., VILANOVA A.: CyteGuide: Visual guidance for hierarchical single-cell analysis. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2017), 739–748.

[HR18] HOWARD J., RUDER S.: Universal language model fine-tuning for text classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2018), pp. 328–339.

[HS97] HOCHREITER S., SCHMIDHUBER J.: Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.

[JBJ18] JIN W., BARZILAY R., JAAKKOLA T.: Junction tree variational autoencoder for molecular graph generation. In International Conference on Machine Learning (2018), pp. 2323–2332.

[Jol86] JOLLIFFE I. T.: Principal components in regression analysis. In Principal Component Analysis. Springer, 1986, pp. 129–155.

[KAKC17] KAHNG M., ANDREWS P. Y., KALRO A., CHAU D. H. P.: ActiVis: Visual exploration of industry-scale deep neural network models. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2017), 88–97.

[KB15] KINGMA D. P., BA J.: Adam: A method for stochastic optimization. In International Conference on Learning Representations (2015).

[KBV09] KOREN Y., BELL R., VOLINSKY C.: Matrix factorization techniques for recommender systems. Computer 42, 8 (2009), 30–37.

[KPHL17] KUSNER M. J., PAIGE B., HERNÁNDEZ-LOBATO J. M.: Grammar variational autoencoder. In Proceedings of the 34th International Conference on Machine Learning - Volume 70 (2017), JMLR.org, pp. 1945–1954.

[KRKP∗16] KLUYVER T., RAGAN-KELLEY B., PÉREZ F., GRANGER B. E., BUSSONNIER M., FREDERIC J., KELLEY K., HAMRICK J. B., GROUT J., CORLAY S., ET AL.: Jupyter Notebooks - a publishing format for reproducible computational workflows. In ELPUB (2016), pp. 87–90.

[LBT∗17] LIU S., BREMER P.-T., THIAGARAJAN J. J., SRIKUMAR V., WANG B., LIVNAT Y., PASCUCCI V.: Visual exploration of semantic relationships in neural word embeddings. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2017), 553–562.

[Lip18] LIPTON Z. C.: The mythos of model interpretability. Queue 16, 3 (2018), 31–57.

[LJLH19] LIU Y., JUN E., LI Q., HEER J.: Latent space cartography: Visual analysis of vector space embeddings. In Computer Graphics Forum (2019), vol. 38, Wiley Online Library, pp. 67–78.

[LLL∗18] LIU S., LI Z., LI T., SRIKUMAR V., PASCUCCI V., BREMER P.-T.: NLIZE: A perturbation-driven visual interrogation tool for analyzing and interpreting natural language inference models. IEEE Transactions on Visualization and Computer Graphics 25, 1 (2018), 651–660.

[LM14]L E Q.,MIKOLOV T.: Distributed representations of sentences [SGB∗19]S TROBELT H.,GEHRMANN S.,BEHRISCH M.,PERER A., and documents. In International conference on machine learning (2014), PFISTER H.,RUSH A. M.: Seq2seq-vis: A visual debugging tool for pp. 1188–1196. 10 sequence-to-sequence models. IEEE transactions on visualization and computer graphics 25, 1 (2019), 353–363.2 [LNH∗18]L I Q.,NJOTOPRAWIRO K.S.,HALEEM H.,CHEN Q.,YI C.,MA X.: EmbeddingVis: A visual analytics approach to compara- [SGK17]S HRIKUMAR A.,GREENSIDE P., KUNDAJE A.: Learning im- tive network embedding inspection. In 2018 IEEE Conference on Visual portant features through propagating activation differences. In Proceed- Analytics Science and Technology (VAST) (2018), IEEE, pp. 48–59.3 ings of the 34th International Conference on Machine Learning-Volume 70 (2017), JMLR.org, pp. 3145–3153.2 [LSD15]L ONG J.,SHELHAMER E.,DARRELL T.: Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE confer- [SGPR18]S TROBELT H.,GEHRMANN S.,PFISTER H.,RUSH A.M.: ence on computer vision and pattern recognition (2015), pp. 3431–3440. LSTMVis: A tool for visual analysis of hidden state dynamics in recur- 7 rent neural networks. IEEE transactions on visualization and computer graphics 24 [LYC∗16]L I Y., YOSINSKI J.,CLUNE J.,LIPSON H.,HOPCROFT J.E.: , 1 (2018), 667–676.2 Convergent learning: Do different neural networks learn the same repre- [Shi18]S HIEBLER D.: repcomp, 2018. [Online; accessed ]. sentations? In Proceedings of the International Conference on Learning URL: https://pypi.org/project/repcomp/.3 Representations (ICLR) (2016).2 [SI15]S TERLING T., IRWIN J. J.: Zinc 15–ligand discovery for every- [MDP∗11]M AAS A.L.,DALY R.E.,PHAM P. T., HUANG D.,NG one. Journal of chemical information and modeling 55, 11 (2015), 2324– A. Y., POTTS C.: Learning word vectors for sentiment analysis. 
In 2337.3 Proceedings of the 49th Annual Meeting of the Association for Compu- [STN∗16]S MILKOV D.,THORAT N.,NICHOLSON C.,REIF E., tational Linguistics: Human Language Technologies (Portland, Oregon, USA, June 2011), Association for Computational Linguistics, pp. 142– VIÉGAS F. B., WATTENBERG M.: Embedding Projector: Interac- arXiv preprint 150. URL: http://www.aclweb.org/anthology/P11-1015. tive visualization and interpretation of embeddings. arXiv:1611.05469 7,4 (2016).2 [STY17]S UNDARARAJAN M.,TALY A.,YAN Q.: Axiomatic attribution [MGB∗18]M IKOLOV T., GRAVE E.,BOJANOWSKI P., PUHRSCH C., for deep networks. In Proceedings of the 34th International Conference JOULIN A.: Advances in pre-training distributed word representations. In Proceedings of the International Conference on Language Resources on Machine Learning-Volume 70 (2017), JMLR. org, pp. 3319–3328.2 and Evaluation (LREC 2018) (2018).7,4 [TKBH16]T URKAY C.,KAYA E.,BALCISOY S.,HAUSER H.: Design- ing progressive and interactive analytics processes for high-dimensional [MH08]M AATEN L. V. D.,HINTON G.: Visualizing data using t-SNE. Journal of machine learning research 9, Nov (2008), 2579–2605.2 data analysis. IEEE transactions on visualization and computer graphics 23, 1 (2016), 131–140.2 [MHM18]M CINNES L.,HEALY J.,MELVILLE J.: UMAP: Uniform manifold approximation and projection for reduction. arXiv [TP10]T URNEY P. D., PANTEL P.: From frequency to meaning: Vector preprint arXiv:1802.03426 (2018).2 space models of semantics. Journal of artificial intelligence research 37 (2010), 141–188.5 [MLS13]M IKOLOV T., LE Q. V., SUTSKEVER I.: Exploiting sim- ilarities among languages for machine translation. arXiv preprint [TZCS15]T AN L.,ZHANG H.,CLARKE C.,SMUCKER M.: Lexical arXiv:1309.4168 (2013).2 comparison between wikipedia and twitter corpora by using word em- beddings. 
In Proceedings of the 53rd Annual Meeting of the Association ∗ [MSC 13]M IKOLOV T., SUTSKEVER I.,CHEN K.,CORRADO G.S., for Computational Linguistics and the 7th International Joint Confer- DEAN J.: Distributed representations of words and phrases and their ence on Natural Language Processing (Volume 2: Short Papers) (2015), compositionality. In Advances in neural information processing systems pp. 657–661.2,1 (2013), pp. 3111–3119.1,4 [Wei88]W EININGER D.: SMILES, a chemical language and information ∗ [OSJ 18]O LAH C.,SATYANARAYAN A.,JOHNSON I.,CARTER S., system. 1. introduction to methodology and encoding rules. Journal of SCHUBERT L.,YE K.,MORDVINTSEV A.: The building blocks of in- chemical information and computer sciences 28, 1 (1988), 31–36.3 terpretability. Distill 3, 3 (2018), e10.2 [WHA07]W ILLETT W., HEER J.,AGRAWALA M.: Scented widgets: [Pir03]P IROLLI P.: Exploring and finding information. HCI models, the- Improving navigation cues with embedded visualizations. IEEE Trans- ories and frameworks: Toward a multidisciplinary science (2003), 157– actions on Visualization and Computer Graphics 13, 6 (2007), 1129– 191.6 1136.6 ∗ [PLvdM 16]P EZZOTTI N.,LELIEVELDT B. P., VAN DER MAATEN L., [WHG∗18]W ANG L.,HU L.,GU J.,HU Z.,WU Y., HE K.,HOPCROFT HÖLLT T., EISEMANN E.,VILANOVA A.: Approximated and user steer- J.: Towards understanding learning representations: To what extent do able tSNE for progressive visual analytics. IEEE transactions on visual- different neural networks learn the same representation. In Advances in ization and computer graphics 23, 7 (2016), 1739–1752.2 Neural Information Processing Systems (2018), pp. 9584–9593.2 ENNINGTON OCHER ANNING [PSM14]P J.,S R.,M C.: GloVe: Global [WLA∗18]W ANG Y., LIU S.,AFZAL N.,RASTEGAR-MOJARAD M., Proceedings of the 2014 conference vectors for word representation. 
In WANG L.,SHEN F., KINGSBURY P., LIU H.: A comparison of word on empirical methods in natural language processing (EMNLP) (2014), embeddings for the biomedical natural language processing. Journal of pp. 1532–1543.1,4 biomedical informatics 87 (2018), 12–20.3 [RBT∗19]R AO R.,BHATTACHARYA N.,THOMAS N.,DUAN Y., CHEN [WVJ16]W ATTENBERG M.,VIÉGAS F., JOHNSON I.: How to use t- X.,CANNY J.,ABBEEL P., SONG Y. S.: Evaluating protein transfer SNE effectively. Distill 1, 10 (2016), e2.2,4,5 learning with tape. In Advances in Neural Information Processing Sys- tems (2019).1 [YWBA18]Y ANG K.K.,WU Z.,BEDBROOK C.N.,ARNOLD F. H.: Learned protein embeddings for machine learning. 34, [RFFT16]R AUBER P. E., FADEL S.G.,FALCAO A.X.,TELEA A.C.: 15 (2018), 2642–2648.1 Visualizing the hidden activity of artificial neural networks. IEEE trans- actions on visualization and computer graphics 23, 1 (2016), 101–110. [YYB∗14]Y U W., YANG K.,BAI Y., YAO H.,RUI Y.: Visual- 3 izing and comparing convolutional neural networks. arXiv preprint arXiv:1412.6631 (2014).3 [RSG16]R IBEIRO M. T., SINGH S.,GUESTRIN C.: “why should I trust you?” explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and (2016), pp. 1135–1144.2 Boggust et al. / Embedding Comparator: Visualizing Differences in Global Structure and Local Neighborhoods via Small Multiples 1 Supplementary Information

S1. Additional case studies

S1.1. Word embeddings pre-trained on different corpora

Our third case study reflects a common use case raised by users in our formative interviews: choosing between models that appear equally viable for use in a downstream application (e.g., predicting topics based on customer review text). GloVe [PSM14] is a popular model that offers several variants of embeddings trained on different datasets. Here, we demonstrate how the Embedding Comparator can be used to understand the impact of training GloVe with data from either Wikipedia & Newswire or Twitter, and we replicate known lexical differences between the models.

As Fig. S1 shows, the Embedding Comparator immediately displays a number of differences between these two pre-trained embedding models that arise from the datasets on which they were trained. Among the words that differ most are shorthand or slang expressions such as “bc”, “bout”, and “def”. The local neighborhood dominoes for these words show the ways in which their semantic meanings differ between the two models. For example, in the Wikipedia & Newswire model, “def” is associated with the notion of being defeated and with “beats” (as in sporting results), and hence with country names (e.g., “canada” and “usa”), while in the Twitter model it is most similar to conversational words such as “definitely” and “probably”. Another insight revealed by our system is the difference in languages present in the training corpora. Words such as “era”, “dale”, and “solo” take on their English meanings in the Wikipedia & Newswire model but are related to Spanish words in the Twitter variant. This finding suggests that the Twitter model was trained on multi-lingual data, while the Wikipedia & Newswire model may be limited to English. Finally, the Embedding Comparator reveals how words such as “swift” and “galaxy” may be used very differently in different media. In the Wikipedia & Newswire model, “swift” refers to the adjective swift (i.e., quick), whereas in the Twitter model, “swift” is related to the musical artist Taylor Swift. Likewise, “galaxy” refers either to space or to Samsung Galaxy electronics depending on whether the embeddings were trained on Wikipedia & Newswire or on Twitter, respectively.

Lexical comparison between Wikipedia and Twitter via embeddings has been explored in previous work [TZCS15]. We find that the Embedding Comparator surfaces characteristic words previously reported to differ most between the two corpora, including “bc” and “ill” (Fig. S1). Moreover, our system directly identifies such words via the similarity of reciprocal local neighborhoods, without requiring alignment of the two embedding spaces or introducing variance by learning a linear transformation via a stochastic procedure.

Using these insights from the Embedding Comparator, a user can make a more informed decision about which set of embeddings may be more appropriate to adopt for their system. For example, if classifying longer or more formal customer reviews, the model trained on Wikipedia would likely perform better, but if classifying casually written reviews that contain slang or multi-lingual text, the Twitter model may generalize better to real-world data.
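The reciprocal-neighborhood comparison can be sketched as follows. This is a minimal illustration of the idea, not the system's exact implementation: we assume cosine similarity for the k-NN computation and Jaccard overlap as the set-similarity measure, and the function names are our own.

```python
import numpy as np

def knn_indices(emb, k):
    """Indices of the k nearest neighbors (by cosine similarity) of every
    row of the embedding matrix, excluding the row itself."""
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # never count an object as its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]

def neighborhood_similarity(emb_a, emb_b, k=5):
    """Jaccard overlap of each object's k-NN sets across two spaces.
    Rows of emb_a and emb_b must correspond to the same objects
    (i.e., a shared vocabulary)."""
    nn_a, nn_b = knn_indices(emb_a, k), knn_indices(emb_b, k)
    scores = []
    for a, b in zip(nn_a, nn_b):
        sa, sb = set(a.tolist()), set(b.tolist())
        scores.append(len(sa & sb) / len(sa | sb))
    return np.array(scores)
```

Objects with the lowest scores are those whose local neighborhoods differ most between the two spaces (e.g., “bc” or “def” above); note that no alignment or learned transformation between the spaces is needed, since only neighbor identities are compared.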

Figure S1: The Embedding Comparator, applied to case study: Word embeddings pre-trained on different corpora, compares a word embedding model trained on Wikipedia and Newswire text to a model trained on Twitter text. Despite using the same architecture, the similarity distribution shows the models represent words differently. Looking at the dominoes suggests the Wikipedia/Newswire model was trained on standard English text, while the Twitter model contains Spanish words (e.g., “solo”) and emphasizes popular culture references (e.g., “swift”).

S1.2. Emoji representations

Our final case study demonstrates how the Embedding Comparator can be used in a non-NLP setting. We compare two representations of Unicode emoji images: emoji embeddings learned from their textual descriptions in the Unicode emoji standard (emoji2vec) [ERA∗16] and raw RGBA pixel values of the emoji images. Fig. S2 shows the Embedding Comparator loaded with these two models. In general, we find that the nearest neighbors of emojis in the emoji2vec model are emojis whose semantic meaning is similar, while the nearest neighbors in the pixel model are emojis whose images are similar. For instance, the domino for the baseball emoji (Fig. S2) shows that the nearest neighbors in the emoji2vec model are other sports-related emojis (e.g., soccer ball, football), while those in the pixel model are emojis with similar color/shape (e.g., white clocks). Likewise, the running shoe and balloon emojis in the pixel model share no apparent semantic relationship with their nearest neighbors, but do share object shape and color, respectively. These insights enable us to understand how the two models differ: the emoji2vec embeddings capture semantics learned from the emoji textual descriptions, while the pixel representations only reflect similarity of the emoji images, as one would expect. Such insights are also reflected in the emoji whose local neighborhoods are most similar across the two models (“Grinning Squinting Face”), where the unique objects lists reveal that nearest neighbors in emoji2vec are generally positive-sentiment emojis, while those in the pixel model are circular faces spanning more diverse (positive and negative) sentiment.

Figure S2: The Embedding Comparator, applied to case study: Emoji representations, compares embeddings learned from textual descriptions (emoji2vec) of emojis to embeddings learned from the emojis’ raw pixel values. The emoji2vec model captures semantic similarity of emojis while the pixel model captures image-level similarities. For example, in the emoji2vec embeddings the balloon emoji is surrounded by other party supplies, whereas in the pixel model it is surrounded by other red objects.

S1.3. Additional example on chemical molecules data

We applied the Embedding Comparator to compare latent embeddings of chemical molecules [SI15] learned by two different variational autoencoder (VAE) architectures, Grammar VAE [KPHL17] and junction tree VAE [JBJ18] (Fig. S3). This application demonstrates a potential limitation of our system with long object labels. In Fig. S3A, the molecules are represented as textual SMILES strings [Wei88], which are often used to describe chemical structures. We find that these string representations are long, hindering the ability to rapidly scroll through dominoes. Fig. S3B shows the same models, but with the 2-D structure of each molecule shown instead, rendering the dominoes more readable.

Figure S3: View of the Embedding Comparator applied to two embeddings of chemical molecules learned by different variational autoencoder architectures. (A) Molecules are represented as SMILES strings. (B) Molecules are represented as 2-D structures.

S2. Case study details Here, we detail the datasets and preprocessing steps used in our case studies.

S2.1. Transfer learning for fine-tuned word embeddings

We downloaded pre-trained fastText [MGB∗18] word embeddings, which are available online at https://fasttext.cc/docs/en/english-vectors.html. We use the 300-dimensional wiki-news-300d-1M embeddings consisting of 1 million word vectors trained on the Wikipedia 2017, UMBC webbase corpus, and statmt.org news datasets.

We train an LSTM [HS97] to classify binary sentiment in movie reviews from the Large Movie Review dataset [MDP∗11], containing 25000 training reviews and 25000 test reviews from the Internet Movie Database (IMDb). We use default tokenization settings for this dataset as provided in Keras [C∗15]. We define our vocabulary as the top 5000 most frequent words in the movie review dataset and truncate reviews to a maximum length of 500 words (with pre-padding). Our architecture is defined as follows:

1. Input/Embeddings: Sequence of 500 words. The word at each timestep is represented by a 300-dimensional embedding.
2. LSTM: Recurrent layer with a 100-unit LSTM (forward direction only, dropout = 0.2, recurrent dropout = 0.2).
3. Dense: 1 neuron (sentiment output), sigmoid activation.

Prior to training, the embeddings are initialized using the pre-trained fastText embeddings. Of our vocabulary of size 5000, 4891 tokens were present in the fastText embeddings. For tokens not present in fastText, we initialize embeddings as all-zero vectors. We train our model for 3 epochs (batch size = 64) with the Adam optimizer [KB15] using default parameters in Keras [C∗15] to minimize binary cross-entropy on the training set. The final model achieves 84.7% test set accuracy (85.4% training set accuracy). We did not further tune the architecture or hyperparameters. For analysis in the Embedding Comparator, we output the initial fastText embeddings and the fine-tuned embeddings for the 4891 words whose embeddings were initialized from fastText.
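The embedding-initialization step described above can be sketched as follows. This is a minimal illustration under stated assumptions: `build_embedding_matrix` is our own name, and `pretrained` stands in for a token-to-vector lookup loaded from the fastText file.

```python
import numpy as np

def build_embedding_matrix(vocab, pretrained, dim=300):
    """Initialize a |vocab| x dim embedding matrix from pretrained vectors,
    falling back to an all-zero vector for out-of-vocabulary tokens."""
    matrix = np.zeros((len(vocab), dim), dtype=np.float32)  # zeros = OOV default
    hits = 0
    for i, token in enumerate(vocab):
        vec = pretrained.get(token)
        if vec is not None:
            matrix[i] = vec
            hits += 1
    return matrix, hits
```

With the 5000-word IMDb vocabulary, `hits` would come out to 4891 per the counts reported above; the resulting matrix would then be passed as the initial weights of the Keras Embedding layer before fine-tuning.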

S2.2. Language evolution via diachronic word embeddings

We use the HistWords dataset [HLJ16b] in our case study of diachronic word embeddings, which have been shown to exhibit changes in the semantic meaning of words over time. We use pre-trained word embeddings from [HLJ16b], accessed at https://nlp.stanford.edu/projects/histwords/. We use the All English (1800s–1990s) set of embeddings, which are 300-dimensional word2vec embeddings [MSC∗13]. This dataset provides word embeddings trained on English books from each decade from 1800 to 2000. For exploration in the Embedding Comparator, we select embeddings taken from five different decades spanning this time period: 1800–1810, 1850–1860, 1900–1910, 1950–1960, and 1990–2000. We filter each embedding space to the top 10000 most frequent words from its decade and compute the intersection of these sets over the five decades we selected, producing a vocabulary containing 6121 words from each model for comparison in the Embedding Comparator.
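The vocabulary-intersection step can be sketched as follows; this is a minimal illustration with names of our own choosing, where `rankings_by_decade` stands in for each decade's frequency-sorted word list.

```python
def shared_vocabulary(rankings_by_decade, top_n=10000):
    """Intersect the top_n most frequent words from each decade's corpus.
    rankings_by_decade: dict mapping decade label -> list of words sorted
    by descending frequency."""
    vocabs = [set(words[:top_n]) for words in rankings_by_decade.values()]
    return set.intersection(*vocabs)
```

Applied to the five selected decades with `top_n=10000`, this procedure yields the 6121-word shared vocabulary reported above; the same pattern applies to the GloVe corpus comparison in the next case study.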

S2.3. Word embeddings pre-trained on different corpora

In this case study, we use pre-trained embeddings from GloVe [PSM14], available online at https://nlp.stanford.edu/projects/glove/. The Wikipedia/newswire embeddings were trained on the Wikipedia 2014 and Gigaword 5 (newswire text) datasets containing 6 billion tokens (GloVe 6B), while the Twitter word embeddings were trained on text from Twitter tweets containing 27 billion tokens (GloVe 27B). We use the 100-dimensional embeddings trained on each of these corpora. We filter each of the embedding models to the top 10K most frequent words from its respective corpus and then intersect the resulting vocabularies, giving a shared vocabulary containing 3303 words. We use the Embedding Comparator to compare embeddings from each model for words in this shared vocabulary.

S2.4. Emoji representations

We use pre-trained 300-dimensional emoji embeddings from emoji2vec [ERA∗16], accessed from https://github.com/uclnlp/emoji2vec. Our image (pixel) embedding model uses emoji images obtained from https://github.com/iamcal/emoji-data/blob/master/sheet_apple_16.png. The raw 18 × 18 × 4 RGBA images are flattened into a 1296-dimensional vector, which is the pixel model embedding. We use 670 total emojis that are common across these datasets.
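The flattening step can be sketched as follows; the function name is our own, and the batch is assumed to be a stack of same-sized RGBA images.

```python
import numpy as np

def pixel_embeddings(images):
    """Flatten a batch of H x W x 4 RGBA images into per-image vectors:
    an 18 x 18 x 4 emoji image becomes a 1296-dimensional embedding."""
    images = np.asarray(images, dtype=np.float32)
    return images.reshape(images.shape[0], -1)
```

Unlike emoji2vec, this representation encodes no semantics; two emojis are close only if their rendered pixels are similar, which is exactly the contrast the case study exploits.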

S3. Additional dominoes

Figure S4: Additional dominoes from case study: Transfer learning for fine-tuned word embeddings. The word “bore” has changed in meaning from a general definition: “carried”, to a more sentiment-rich definition: “dull”. “redeeming” has changed from the positive sentiment definition: “compensate for faults”, to a negative sentiment definition likely related to the reviewer idiom “no redeeming qualities”. The number “7” has changed from its definition as a numeric symbol to a number indicative of score (e.g., 7 out of 10).

Figure S5: Additional dominoes from case study: Language evolution via diachronic word embeddings. The domino for “nice” shows its definition change from “fine” in 1800–1810 to “pleasant” in 1900–1910. The word “aids” was synonymous with “assists” in 1900–1910 but later became associated with HIV/AIDS in 1990–2000. Over the course of the 20th century, “score” changed from a measure of time (e.g., four score) to a measure of rank.

Figure S6: Additional dominoes from case study: Word embeddings pre-trained on different corpora. Using the model trained on news text, “def” is short for “defeated”, whereas using the model trained on Twitter data, “def” is slang for “definitely”. The word “dale” is an English name in the news model, but is represented by its Spanish meaning in the Twitter model. “galaxy” in the news model is related to space, but in the Twitter model is related to the Samsung Galaxy line of phones.

Figure S7: Additional dominoes where the neighborhood of the word has not changed. From the Transfer learning for fine-tuned word embeddings case study, “cinematographer” does not change when fine-tuning a general English model on a movie review dataset because “cinematographer” already refers to the film industry. From the Language evolution via diachronic word embeddings case study, “25” has not changed from 1800 to 2000, suggesting that numbers are less susceptible to chronological changes in meaning. “fernando”, from the Word embeddings pre-trained on different corpora case study, does not change in meaning whether the model was trained on news data or Twitter data, likely because it is a proper noun with no other meanings.