DEPTH-BASED VISUALIZATIONS FOR ENSEMBLE DATA AND GRAPHS

by Mukund Raj

A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Computing

School of Computing
The University of Utah
August 2018

Copyright © Mukund Raj 2018
All Rights Reserved

The University of Utah Graduate School

STATEMENT OF DISSERTATION APPROVAL

The dissertation of Mukund Raj has been approved by the following supervisory committee members:

Ross T. Whitaker, Chair(s), Date Approved: 03 Apr 2018
Robert M. Kirby, Member, Date Approved: 03 Apr 2018
Suresh Venkatasubramanian, Member, Date Approved: 03 Apr 2018
P. Thomas Fletcher, Member, Date Approved: 03 Apr 2018
Alon Efrat, Member, Date Approved: 11 Apr 2018

and by Ross T. Whitaker, Chair/Dean of the Department/College/School of Computing, and by David B. Kieda, Dean of The Graduate School.

ABSTRACT

Ensemble data sets appear in many domains, often as a result of a collection of solutions arising from the use of different parameters or initial conditions in simulations, measurement uncertainty associated with repeated measurements of a natural phenomenon, and inherent variability in natural or human events. Studying ensembles in terms of the variability between ensemble members can provide valuable insight into the generating process, particularly when mathematically modeling the process is complex or infeasible. The objective of ensemble visualization is often to convey characteristics of the typical or central members, outliers, and variability among ensemble members. In the absence of any information about the generative model, a family of nonparametric methods, known as data depth, provides a quantitative notion of centrality for ensemble members. Data-depth methods also form the basis of several ensemble visualization techniques, including the popular Tukey boxplot.

This dissertation explores data depth as a basis for visualizing various types of data for which existing visualization methods are either not directly applicable or present significant limitations. Such data include ensembles of three-dimensional (3D) isocontours, ensembles of paths on a graph, ensemble data in high-dimensional and inner-product spaces, and graphs. The contributions of this dissertation span the following three aspects of data-depth-based visualizations: first, development of new data-depth methods that address the limitations of existing methods for computing center-outward order statistics for various types of ensemble data; second, development of novel visualization strategies that use existing and proposed data-depth methods; and third, demonstration of the effectiveness of the proposed methods in real motivating applications.

CONTENTS

ABSTRACT ...... iii

LIST OF FIGURES ...... vii

NOTATION AND SYMBOLS ...... xii

CHAPTERS

1. INTRODUCTION ...... 1
   1.1 Contributions ...... 7
   1.2 Overview ...... 8

2. TECHNICAL BACKGROUND ...... 10
   2.1 Order and Rank Statistics ...... 10
       2.1.1 Order Statistics ...... 10
       2.1.2 Rank Statistics ...... 11
       2.1.3 Order and Rank Statistics in Visualization ...... 11
   2.2 Data Depth ...... 12
       2.2.1 Data-Depth Approaches ...... 14
           2.2.1.1 Distance metric ...... 14
           2.2.1.2 Weighted mean ...... 14
           2.2.1.3 Space partition ...... 16
           2.2.1.4 Extensions to graphs ...... 17

3. EVALUATING SHAPE ALIGNMENT VIA ENSEMBLE VISUALIZATION ...... 19
   3.1 Introduction ...... 19
       3.1.1 Brain Atlas Construction ...... 21
       3.1.2 Data Preprocessing for Atlases ...... 24
       3.1.3 Expert Evaluation Study Details ...... 24
   3.2 Our Visualization Pipeline ...... 26
       3.2.1 Ensemble Visualization Overview ...... 26
       3.2.2 Ensemble Visualization Prototype System ...... 29
   3.3 Evaluation ...... 31
   3.4 Conclusions ...... 37

4. PATH BOXPLOTS FOR CHARACTERIZING UNCERTAINTY IN PATH ENSEMBLES ON A GRAPH ...... 40
   4.1 Introduction ...... 40
   4.2 Background and Related Work ...... 43
   4.3 Band Depth for Paths on Graphs ...... 46
   4.4 Path Boxplot Visualization ...... 51
   4.5 Results ...... 52
       4.5.1 Transportation Networks ...... 54
       4.5.2 Computer Networks (Autonomous Systems) ...... 55
   4.6 Conclusion and Future Work ...... 57

5. ANISOTROPIC RADIAL LAYOUT FOR VISUALIZING CENTRALITY AND STRUCTURE IN GRAPHS ...... 61
   5.1 Introduction ...... 61
   5.2 Background ...... 64
       5.2.1 Centrality and Depth ...... 64
       5.2.2 Stress and Multidimensional Scaling (MDS) ...... 65
       5.2.3 Strictly Monotone and Smooth Regression ...... 66
   5.3 Method ...... 67
       5.3.1 Anisotropic Radial Layout ...... 68
       5.3.2 Visualization ...... 70
   5.4 Results ...... 71
       5.4.1 Zachary's Karate Club ...... 71
       5.4.2 Terrorist Network From 2004 Madrid Train Bombing ...... 72
       5.4.3 Coappearance Network for Characters in Les Misérables ...... 72
   5.5 Discussion ...... 73

6. VISUALIZING HIGH-DIMENSIONAL DATA USING ORDER STATISTICS ...... 78
   6.1 Introduction ...... 78
   6.2 Background ...... 81
       6.2.1 Order Statistics and Data Depth ...... 81
       6.2.2 Data-Depth-Based Visualizations ...... 82
       6.2.3 Multidimensional Scaling (MDS) ...... 83
       6.2.4 Monotone Regression Along One Variable for Multivariate Data ...... 84
   6.3 Method ...... 86
       6.3.1 Projecting Multidimensional Data Using Order Statistics (Order Aware Projection) ...... 86
       6.3.2 Field Overlay and Projection Bagplot Visualizations ...... 89
   6.4 Results ...... 90
       6.4.1 MNIST Data ...... 90
       6.4.2 Iris Flower Data ...... 92
       6.4.3 Unidentified Flying Object (UFO) Encounters Data ...... 94
       6.4.4 Breast Cancer Data ...... 95
   6.5 Discussion ...... 96
   6.6 Future Work ...... 99

7. ELLIPSE BAND DEPTH ...... 100
   7.1 Ellipse Band Depth ...... 102
   7.2 Results ...... 104
       7.2.1 Synthetic 2D Data ...... 104
       7.2.2 Synthetic 3D Data ...... 105
       7.2.3 Chemicals in Kernel Space ...... 105
   7.3 Discussion ...... 106

8. DISCUSSION AND FUTURE WORK ...... 111
   8.1 Discussion ...... 112
   8.2 Future Work ...... 114

APPENDIX: NAME ASSOCIATIONS FOR NODES IN VISUALIZATIONS IN CHAPTER 5 ...... 116

REFERENCES ...... 119

LIST OF FIGURES

1.1 The Anscombe quartet. The four data sets in (a), (b), (c), and (d) have the same summary statistics: mean, standard deviation, and correlation. The red line shows the linear regression line for each data set...... 2

1.2 A Tukey boxplot summarizes 1D data by showing order-based statistics, such as the median, 50% band, and outliers...... 3

1.3 Summary statistics in univariate data. (a) A random sample of points from a univariate normal distribution and their probability density function (PDF). The median and mean are indicated using blue and red markers, respectively, along the x axis. (b) A cumulative density function (CDF) and (c) sum of L1 distances and sum of squared L1 distances for each point in the sample. We see that the minima are located at the median and mean, respectively. ...... 4

2.1 Examples of existing depth-based visualizations. (a) Bagplot for bivariate points (principal components of sea surface temperature data) [48] © 2010 Taylor and Francis Group, (b) functional boxplot for an ensemble of functions (sea surface temperatures for 12 months) [48] © 2010 Taylor and Francis Group, (c) surface boxplot for ensembles of 2D fields [40] © 2014 John Wiley and Sons, (d) contour boxplot for curves (synthetically generated) [115] © 2013 IEEE, and (e) curve boxplot for an ensemble of curves (hurricane paths) [73] © 2014 IEEE. ...... 12

2.2 Distance-based depth for an anisotropic distribution. (a) A synthetic ensemble of points on a grid that comprises two crossing ellipses with horizontal and vertical major axes. (b) Normalized L2 depth values for the point ensemble shown using a heatmap. We notice that points at the extremities of the vertical ellipse have depth values similar to the inner points in the horizontal ellipse. ...... 14

2.3 Normalized zonoid depth values for point ensemble in Figure 2.2a shown using a heatmap. Points at the extremities of both ellipses have been assigned the lowest depth values, unlike in the case of L2 depth (see Figure 2.2b)...... 15

2.4 Normalized (a) half-space depth and (b) simplicial depth for point ensemble in Figure 2.2a shown using heatmaps. Simplicial depth shows a larger spread of depth values for points that lie along the vertical ellipse...... 16

2.5 A node-link diagram of a synthetic graph where node sizes encode (a) degree, (b) closeness, and (c) betweenness centrality. Degree centrality is higher for nodes with more neighbors, closeness centrality is higher for nodes that have a lower sum of distances to other nodes, and betweenness centrality is higher for nodes that lie on more of the shortest paths between other nodes. ...... 18

3.1 An atlas construction scheme involves deformation and registration of all ensemble members to the atlas. The process of deformation and registration of ensemble members is called transformation to the atlas coordinate system or the atlas space. ...... 22

3.2 Illustration of the cortex (green) and the ventricle (red). This image shows the segmentation provided by the label volume for a typical ensemble member. The coarseness of the segmentation seen in this label map is mitigated by smoothing for the final visualization. ...... 25

3.3 Illustration of the atlas image slice constructed using AtlasWerks [97]. The anatomical structures in the atlas image usually have lower contrast and fuzzier edges as compared to an original MRI image. This fuzziness results from performing averaging while constructing the atlas. ...... 26

3.4 Three visualizations of ventricles from an ensemble containing 34 images from the ADNI data set transformed to a common atlas space. Left: the contour boxplot visualization in 3D, with the 50% volumetric band in dark purple, the 100% band volume in light purple, the median in yellow, and outliers in red (on the cutting plane). Middle: direct visualization of the ensemble members (spaghetti plot). Right: 3D average intensity image. ...... 28

3.5 Overview of the prototype system designed for shape alignment evaluation using ensemble visualization. ...... 30

3.6 Brain atlases with different parameters and subject groups. Top: slices of average intensity atlases for ensembles of 30 brain images. Bottom: associated contour boxplot visualizations for cortical surfaces. Left: atlas constructed with high regularization of deformation. Middle: atlas constructed with low regularization. Right: atlas with low regularization using a different ensemble than in the other columns. ...... 33

3.7 Visualizations of left ventricles. Crosses mark the correspondence between the images. (a) Left ventricle slice from an intensity image of the atlas. (b) Left ventricle slice of an ensemble member identified as an outlier by data depth analysis. (c) Contour boxplot visualization of an ensemble of 34 ventricles in atlas space. ...... 35

3.8 Contour boxplot visualizations for simulated ensemble data sets. (a) Contour boxplot visualization for an ensemble of size 100 of a simulated HIV protein. Here, we see the median contour in yellow and the outlier contours in red. (b) Contour boxplot visualization of the isosurface of the pressure field of a fluid flow. The pressure is considered as a function of depth to generate a 3D pressure volume. The median contour is drawn in yellow and the outlier contours are drawn in red. ...... 39

4.1 Band and boxplot for univariate points and functions. (a) A classic boxplot for univariate data. (b) A functional boxplot for an ensemble of functions. The median function is drawn in yellow, outlier functions in red. The 50% and the 100% data envelope are shown in dark and light purple, respectively. (c) An ensemble of five functions and a sample band formed by three member functions (f2, f3, and f4) from the ensemble. ...... 44

4.2 Band formed by three dashed paths on a complete graph whose edge weights are equal to the Euclidean distance between vertices (only selected edges are drawn). The green path is completely contained within the band according to the definition in Equation 4.7, whereas the red path falls completely outside the band. Solid blue edges constitute the geodesics connecting vertices within graph simplices. ...... 50

4.3 Synthetic example 1. (a) A path ensemble with each path rendered with a random color. (b) Path boxplot using rank statistics based on the sum of Fréchet distances. (c) Path boxplot based on path band depth (visualization without vertex encoding). (d) Path boxplot based on path band depth (visualization with vertex encoding). ...... 51

4.4 Synthetic example 2. (a) A path ensemble with each path rendered with a random color. (b) Path boxplot using order statistics based on the sum of Fréchet distances. (c) Path boxplot based on path band depth. ...... 53

4.5 Road network: (a) A section of the road graph overlaid on a map representing the actual spatial embedding of vertices and edges. (b) Path boxplot for an ensemble of paths on a road network. ...... 55

4.6 Outlier paths on AS graph: A class 1 outlier with no unique vertices/edges in the outlier path. ...... 56

4.7 Outlier paths on AS graph: (a) Class 2 outlier: one unique edge, no unique vertices in the outlier path. (b) Class 3 outlier: outlier appears to be one hop, bypassing a more normal route. ...... 59

4.8 Outlier paths on AS graph: (a) Class 4 outlier: outlier is two hops around a more normal route; and (b) Class 5 outlier: outlier takes several hops around the usual path. ...... 60

5.1 Visualization of Zachary's karate club social network using (a) MDS, (b) radial layout, and (c) anisotropic radial layout. Node sizes encode betweenness centrality. ...... 62

5.2 Interpolation and monotonic fields for a sample graph. An (a) interpolation field for node centrality values and (b) the associated (radially) monotonic field for a 30-node random graph generated using the Barabási-Albert model. Node positions are determined using MDS and node sizes encode betweenness centrality. ...... 65

5.3 Sensitivity of anisotropic radial layout to penalty weights for the graph in Figure 5.2: (a) wρ = 0.1, (b) wρ = 1, (c) wρ = 10; centrality contours with isovalues 0.1, 0.2, and 0.3 as well as nodes X (red) and Y (green) with centrality values 0.2 and 0.1 are identified, and (d) a typical plot of objective energy during the optimization process (wρ = 1). ...... 69

5.4 Network of terrorists and affiliates connected to the 2004 Madrid train bombing using (a) MDS and (b) radial layout. ...... 74

5.5 Network of terrorists and affiliates connected to the 2004 Madrid train bombing using anisotropic radial layout. ...... 75

5.6 Coappearance network for characters in the novel Les Misérables using (a) MDS and (b) radial layout. ...... 76

5.7 Coappearance network for characters in the novel Les Misérables using anisotropic radial layout. ...... 77

6.1 Existing visualizations for multivariate data. (a) Bivariate bagplot and (b) high-density region (HDR) boxplot visualizations of the El Niño data set (12-dimensional temperature data for each year from 1951 to 2007) generated using the R Rainbow package [99]. ...... 83

6.2 Various stages during the proposed methods. (a) Points from an anisotropic, 3D normal distribution projected on a 2D plane using MDS. Circle sizes indicate half-space depth of points in the original 3D space. (b) The initial interpolated field in the background of the MDS projection. (c) The initial monotonic field in the background obtained from the initial interpolated field. (d) Field overlay plot using order aware projection (OAP) after optimization is complete. The final monotonic field is shown in the background. (e) Projection bagplot visualization. The median is shown in yellow. Deep blue indicates the 50% band and light blue indicates the 100% band. ...... 85

6.3 The typical profile for MDS stress and depth penalty during the optimization process. MDS stress increases slightly. The depth penalty undergoes sharp drops periodically at iterations with monotonic field updates...... 89

6.4 MNIST data sample visualizations: (a) MDS and (b) field overlay plot using order aware projection (OAP). Outliers and cliques appear more prominent in the field overlay plot...... 91

6.5 MNIST data sample visualization for multiple digits (0, 1, and 7) with field overlay plot using OAP. Monotonic fields corresponding to 0, 1, and 7 are shown using heatmaps and isocontour lines drawn in green, blue, and red, respectively. Higher saturation of colors in the heatmaps indicates a higher value of the monotonic field. Unusual members are apparent on tracing outermost isocontours...... 92

6.6 Iris flower data visualizations: (a) bivariate bagplot using MDS and (b) projection bagplot using OAP. There are three species of flowers, each represented by a color, and each circle represents an individual flower. For the blue and green classes, the 50% and 100% bands overlap due to a large proportion of members with an identical, lowest value of depth. For the red class, only the projection bagplot conveys band associations of the flowers correctly. ...... 93

6.7 Unidentified flying object (UFO) encounters data visualizations: (a) bivariate bagplot using MDS and (b) projection bagplot using OAP. Each circle represents an encounter. Deep blue, light blue, and red circle colors indicate association to the 50% band, 100% band, and outliers. The projection bagplot is able to show correct band associations for members as opposed to the bivariate bagplot, which misplaces some encounter instances with respect to bands. ...... 94

6.8 Breast cancer data visualizations: (a) bivariate bagplot using MDS and (b) projection bagplot using OAP. Each circle represents a set of patient attributes. The data contain two classes based on patient outcomes: recurrence (red) and nonrecurrence (blue). The recurrence class is seen to deviate from normal, whereas the nonrecurrence class presents a more coherent distribution based on recorded attributes. ...... 95

6.9 Field overlay plots using OAP for a synthetically generated 3D multimodal data set with unknown class membership. Order statistics are computed using half-space depth (a) for all points in the data set together and (b) for each cluster separately after a clustering step using the k-means method. ...... 98

7.1 Bands used by three different notions of data depth. (a) Simplicial, (b) functional, and (c) ellipse bands are formed by a set of three points in R2 that are marked in red. Note that three ellipse bands are shown, one for each pair of points in the set. ...... 103

7.2 Comparison of simplicial depth and ellipse band depth with synthetic 2D data. (a) An angularly symmetric point distribution. (b) Simplicial depth heatmap. (c) Ellipse band depth heatmap. ...... 104

7.3 Visualizations of a 3D anisotropic multivariate normal distribution. (a) MDS projection with circle size indicating simplicial depth, (b) MDS projection with circle size indicating ellipse band depth, (c) field overlay plot from Chapter 6 using simplicial depth, and field overlay plots using ellipse band depth with (d) e = 1.1, (e) e = 1.01, and (f) e = 1.5. ...... 108

7.4 Similarity of chemical molecules in the MUTAG dataset. (a) MDS visualization with circle size encoding distance depth, (b) MDS visualization with circle size encoding ellipse band depth, (c) field overlay plot using distance depth, and (d) field overlay plot using ellipse band depth. ...... 109

7.5 Simplicial and ellipse band depth along center-outward rays. Empirical results for both methods indicate a monotonic drop in depth value, barring some sample noise in the case of simplicial depth. (a) Direction of rays traveling away from the center, (b) simplicial depth, and (c) ellipse band depth along the rays. ...... 110

8.1 Existing methods to visualize aligned graph ensembles. (a) Adjacency matrix heatmaps and (b) cell histogram. In both visualizations, each cell summarizes the weights on an edge (between a specific pair of nodes) across an entire ensemble of graphs. Furthermore, the encodings in each cell are determined independently of edges corresponding to other cells. ...... 115

NOTATION AND SYMBOLS

X            upper case script letters denote sets
A            upper case bold letters denote matrices
x            lower case bold letters denote column vectors
xi           lower case letter with subscript denotes a member of a set/vector
xij          dual subscript indicates an element of a matrix
X            upper case letters denote a random variable
(x1, x2, x3) parentheses denote an ordered relationship

CHAPTER 1

INTRODUCTION

Recently, a combination of sensors, personal devices, and computational and communication infrastructure has resulted in large amounts of complex data. Analyzing such data to gain useful insights is important in a range of applications. Often the data are in the form of a group or collection of related entities called an ensemble. These entities, or ensemble members, typically share an underlying generating process while also displaying variability among members. This variability can originate from various causes, such as variability in parameters or initial conditions in simulations, measurement uncertainty associated with repeated measurements of a natural phenomenon, and inherent variability in natural or human events. Ensembles that provide information about variability between members can therefore be valuable in understanding the generating process, particularly when mathematically modeling the process is complex or infeasible.

Data in applications spanning several domains originate from complex underlying processes. For example, meteorologists use complex models that are highly sensitive to parameters and initial conditions to generate ensembles of prediction data to understand the probabilities of specific weather and climate-related events [42, 101]. In the area of hurricane prediction, meteorologists analyze ensembles of potential hurricane trajectories obtained by a Monte Carlo simulation [60]. Researchers in the medical community use collections of magnetic resonance imaging (MRI) and functional MRI (fMRI) images of the brain to understand changes that occur in brain structure during neurological disorders, such as Alzheimer's disease [4]. Scientists study the process of protein folding by running multiple simulation runs of the folding process [8]. The proteins transition to a final stable state after passing through a set of stochastically determined intermediate states. In such applications, we can gain insights about the generating process by understanding the ensemble data.

Visualization can be a useful asset for understanding data. Scholars have been using visualization as a tool to understand and convey data for hundreds of years [72, 83]. Visualization brings humans into the process of data analysis and allows them to take advantage of the high bandwidth of the human visual system to comprehend data [76]. The advantage of a suitably chosen visualization is clearly seen in scatter plots of the well-known Anscombe quartet [7], which consists of four multivariate data sets with identical summary statistics: mean, standard deviation, and correlation (see Figure 1.1). The difference between the data sets is immediately apparent in scatter plot visualizations. Visualizations help humans use the data to quickly gain insights and form hypotheses. The benefits of visualization are particularly relevant when dealing with complex data such as ensemble data sets. Consequently, a significant amount of research has been carried out to develop effective ensemble visualization techniques.

Figure 1.1. The Anscombe quartet. The four data sets in (a), (b), (c), and (d) have the same summary statistics: mean, standard deviation, and correlation. The red line shows the linear regression line for each data set.

Ensemble visualization techniques can be broadly classified into the following three classes: parametric methods that convey probabilities for features of interest, enumeration methods that directly display ensemble members, and nonparametric methods that convey summary statistics. Figure 1.1 shows the discrepancies that can occur between parametric models and enumeration. Whereas parametric methods require some knowledge or assumptions regarding the generating process, enumeration can occlude details in the case of large ensembles and complex objects. In contrast, nonparametric summaries have proven to be effective for characterizing large and complex ensemble data without any assumption about the underlying generating processes [86]. This class of ensemble visualization techniques typically involves two steps: 1) identification of key features of interest in ensembles; and 2) effective visualization of those features.

In ensemble data, key features of interest often include summary statistics, such as the most representative members (e.g., mean or median); the least representative members, also known as outliers; and the variability in the ensemble. For an ensemble of univariate points, a Tukey boxplot highlights key features such as the median, outliers, 50% band, and 100% band [108] (see Figure 1.2). Given a univariate random variable X, the median is defined as the point in the distribution with the lowest expected sum of distances from all points in the distribution. For a set of points, S, randomly sampled from X, a sample median is defined as the point in S with the lowest sum of distances to all other points in S. For univariate points, the median is situated at the midpoint of the data set or the distribution, such that a random sample has equal probability of falling above or below it (see Figures 1.3a and 1.3b). Another important summary statistic is the mean, which is defined, with respect to a distribution, as the point that minimizes the expected sum of squared distances from all points in the distribution.
The sample mean, also known as the average, minimizes the sum of squared distances from all points in a sample set (see Figure 1.3c). For univariate data, both the L1 and L2 distance metrics lead to identical results.

Figure 1.2. A Tukey boxplot summarizes 1D data by showing order-based statistics, such as the median, 50% band, and outliers.

Figure 1.3. Summary statistics in univariate data. (a) A random sample of points from a univariate normal distribution and their probability density function (PDF). The median and mean are indicated using blue and red markers, respectively, along the x axis. (b) A cumulative density function (CDF) plot and (c) the sum of L1 distances and the sum of squared L1 distances for each point in the sample. We see that the minima are located at the median and the mean, respectively.

Given two points x, y ∈ R^d, where d ≥ 1, the L1 and L2 distances can be stated as

    d_{L_1}(x, y) = \sum_i |x_i - y_i|,                       (1.1)

    d_{L_2}(x, y) = \Big( \sum_i (x_i - y_i)^2 \Big)^{1/2},   (1.2)

where | · | denotes the absolute value.

The sum of distances from all points in the data set also provides a way to order or sort the points in the data set. This approach provides a center-outward ordering of points starting with the median (lowest sum value). The Tukey boxplot uses such a center-outward ordering of points to determine other summary statistics, such as the 50% and 100% bands. The points whose sums of L1 distances lie in the lower 50th percentile are considered to be in the 50% band (see Figure 1.3c). The span of the 50% band is expanded about the median by a constant factor to obtain the 100% band.

A center-outward order using the sum of distances (L1 or L2), despite being useful for visualizing univariate data, is not as effective for visualizing more complex data types such as multivariate points. This lack of effectiveness arises because distance-based approaches for determining a center-outward order do not fully capture the structure of multivariate data, since they fail to consider correlations in the data (see Section 2.2.1). There exists a more general measure, however, that is effective in determining center-outward order statistics for ensembles of complex, multivariate data. Data depth is a measure of how central or outlying a given point is with respect to a multivariate data cloud or its underlying distribution [63]. The introduction of several notions of data depth has led to the development of a family of data-depth methods for quantifying the centrality of points in multivariate ensemble data. These methods provide a numeric, center-outward order statistic for multivariate data. In this dissertation, we use the term "order structure" to refer to center-outward order statistics, which describe the inward-outward relationship among data members with respect to some property, such as data depth.
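The univariate definitions above, the median as the minimizer of the sum of L1 distances and the mean as the minimizer of the sum of squared distances, together with the resulting center-outward order, can be checked numerically. A minimal sketch in Python; the sample values are an arbitrary illustration, not data from this dissertation:

```python
def sum_abs(p, xs):
    """Sum of L1 distances from candidate p to all sample points."""
    return sum(abs(p - x) for x in xs)

def sum_sq(p, xs):
    """Sum of squared distances from candidate p to all sample points."""
    return sum((p - x) ** 2 for x in xs)

# Arbitrary univariate sample with one outlying value.
xs = [2.0, 3.0, 3.5, 4.0, 10.0]

# The sample point minimizing the sum of L1 distances is the sample median.
median = min(xs, key=lambda p: sum_abs(p, xs))

# The average minimizes the sum of squared distances over the real line,
# so in particular no sample point can beat it.
mean = sum(xs) / len(xs)

# Center-outward order: sort members by increasing sum of L1 distances,
# so the median comes first and the outlier comes last.
order = sorted(xs, key=lambda p: sum_abs(p, xs))

print(median)              # 3.5
print(order[0], order[-1]) # 3.5 10.0
```

Note how the outlying value 10.0 pulls the mean (4.5) away from the median (3.5), while the center-outward order places it last, which is exactly the behavior a Tukey boxplot exploits.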
Data-depth methods have also been proposed for a variety of data types, such as multivariate points [93], curves [73], functions [66], and isocontours [115]. For each type of data, researchers have developed visualization strategies that use the order statistics provided by data depth to convey key features in the data. Visualization practitioners have, over the past several years, been using data-depth-based visualizations to gain insights into and understand the general structure of ensemble data [61, 88]. However, there are several situations where existing data-depth-based visualizations are either not applicable or could potentially fail to capture important features.

One shortcoming of state-of-the-art data-depth methods is their limited ability to tackle data types that are not in R^d, where d ∈ N+; for example, path ensembles on graphs. A graph is a collection of entities (nodes) and relationships (edges). A path on a graph is a sequence of nodes in which each consecutive pair of nodes shares an edge. Path ensembles on graphs are a type of ensemble data that appear in several domains. For example, packets traveling from a source node to a destination on the Internet often take different paths as determined by routing algorithms [19]. Similarly, a variety of paths are typically possible for traveling to a destination on road networks, which are often modeled as paths on graphs [46]. Another domain where path ensembles appear as a mode of representing information is molecular dynamics simulations of protein folding [8]. The energy landscape of protein structural configurations is often modeled as a graph. In such models, the path taken by a folding protein to reach a stable state involves stochastic transition probabilities along edges, leading to an ensemble of paths over multiple simulation runs. In such applications, there is a need to understand the structure of path ensembles in terms of typical and outlier members.
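To make the question concrete, one naive heuristic scores each path by how frequently its edges are used across the ensemble, so that a path sharing no edges with the rest stands out as atypical. This is only a toy sketch for intuition, not the path band depth method developed in Chapter 4 (which accounts for the global structure of the ensemble); the ensemble below is hypothetical:

```python
from collections import Counter

def edge_set(path):
    """Undirected edges traversed by a path given as a node sequence."""
    return [tuple(sorted(e)) for e in zip(path, path[1:])]

def typicality(paths):
    """Score each path by the mean ensemble frequency of its edges
    (a crude stand-in for a centrality measure on paths)."""
    counts = Counter(e for p in paths for e in edge_set(p))
    n = len(paths)
    return [sum(counts[e] for e in edge_set(p)) / (len(edge_set(p)) * n)
            for p in paths]

# Hypothetical ensemble of routes from node 0 to node 3.
paths = [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 3], [0, 4, 5, 3]]
scores = typicality(paths)
# The detour through nodes 4 and 5 shares no edges with the other
# paths, so it receives the lowest score.
assert min(scores) == scores[-1]
```

A per-edge frequency count like this ignores how edges combine into whole routes, which is precisely the limitation that motivates a depth notion defined on entire paths.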
In addition to paths, visualizing the importance of nodes in the context of relationships on a graph is also important in many applications, particularly in social networks [17]. A common approach to determine the importance of nodes is to consider more central nodes as more important [15]. The graph theory literature offers several methods to define centrality using the topology (and edge weights) of a graph, including quantifying the centrality of a node based on the number of incident edges, the sum of distances to other nodes, and the number of shortest paths passing through the node [37]. Researchers have proposed various methods to visually convey the centrality of nodes using node-link drawings [10, 16]. However, those methods often do not accurately represent internode distances, which are an important aspect of graphs.

Another type of data that is challenging to handle with existing visualization methods is data in high-dimensional spaces (d > 2). Popular state-of-the-art methods for visualizing high-dimensional data determine a low-dimensional, typically two-dimensional (2D), embedding of the data by attempting to preserve distances between points/members as much as possible. Such methods often do not preserve the order structure of the data; essentially, they fail to ensure that more central members in the data appear more central in the embedding. The inaccurate representation of order structure can be attributed to the exclusive focus of current dimensionality reduction methods on preserving the distances between ensemble members while ignoring aggregate statistical properties of the data, such as the order structure.

Dimensionality reduction methods are also used to visualize ensembles of data in implicit feature spaces that are defined using inner products. In such cases, the dimensionality as well as the coordinate axes of the feature space are unknown, making it infeasible for existing data-depth methods to effectively determine the order structure of the data. Other challenging data types include ensembles of three-dimensional (3D) shapes, which are often seen in the domain of medical imaging, and remain difficult to visualize due to the structural complexity of such data and the problem of occlusion.

1.1 Contributions This dissertation introduces new visualization techniques for several kinds of ensemble data that address various shortcomings of state-of-the-art data-depth- or metric-based visualizations, including novel techniques to compute center-outward order statistics and novel visualization strategies based on such order statistics. Specifically, the contributions of this dissertation are as follows:

• Evaluation of 3D shape alignment using ensemble visualization. We show that ensemble visualization techniques can be effective for evaluating 3D shape alignment. We also extend the contour boxplot visualization technique for 3D shape ensembles.

• A method for computing data depth and a visualization strategy for path ensembles on a graph. Given an ensemble of paths on a graph, we introduce the path band depth method for determining data depth for paths by considering the global structure of the path ensemble. We also introduce path boxplot visualization for visualizing path ensembles using path band depth.

• A method for visualizing node centrality while also preserving internode distances in node- link of graphs. We determine node positions in a graph drawing by mini- mizing an energy function that contains two terms: the first term penalizes deviation between internode distances along edges and the embedding, and the second term penalizes drawings that inaccurately represent node centrality with regard to the graph structure.

• A method for visualizing high-dimensional data using order statistics. We determine a lower dimensional embedding for high-dimensional data by minimizing an energy function that contains two terms: the first term penalizes deviation between dis- tances in the original or intrinsic space, and the second term penalizes embeddings 8

where points are projected to positions that inaccurately represent their order struc- ture in the intrinsic space. We also introduce two visualization strategies, field over- lay plot and projection boxplot, to visualize the lower dimensional embedding.

• A method to compute data depth in high-dimensional inner product spaces. Given an ensemble of points in any inner product space, the ellipse band depth method com- putes data depth by estimating the probability of points to be contained in a set of ellipses described by a randomly chosen pair of points.

1.2 Overview Chapter 2 provides an overview of the technical details that are important to under- stand the work in this dissertation. The technical details include a discussion on order and rank statistics, the notion of data depth, useful properties of data-depth measures, and various classes of data-depth techniques. Chapter 3 demonstrates the utility of using various ensemble visualization techniques for the task of evaluating 3D shape alignment in the context of a specific medical imaging application: constructing brain atlases, which are a representative image for an ensemble of brain MRI images. The visualization techniques include averaging, enumeration, and a 3D extension of the contour boxplot visualization [115]. The work described in this chapter is published in IEEE and Applications [89]. Chapter 4 introduces a novel method to compute data depth for paths on a graph called path band depth, which considers the global structure of the paths. This chapter also provides a theoretical proof to show that the proposed method exhibits a desirable property with regard to data depth (Section 2.2). Next, this chapter introduces a visu- alization strategy for path ensembles based on path band depth called path boxplot and demonstrates its utility with synthetic data and real data sets. The work described in this chapter is published in the Journal of Computational and Graphical Statistics [90]. Chapter 5 introduces a novel technique to determine positions of nodes in a node- link diagram of a graph such that the internode distances as well as the importance or centrality of the nodes are conveyed correctly, as much as possible. This chapter also describes an algorithm for determining node positions, and a novel visualization strategy called anisotropic radial layout based on the resultant node positions. Finally, the chapter 9 includes results using both synthetic and real data sets. 
The work described in this chapter is published in the Proceedings of the 25th International Symposium on Graph Drawing and Network Visualization (2017) [91]. Chapter 6 introduces a technique for embedding high-dimensional data in lower di- mensional spaces such that order structure as well as distances between points in the high-dimensional space is preserved, as much as possible. This chapter also describes a modification to the technique that focuses specifically on dealing with multimodal data. Finally, the chapter presents two visualization strategies that highlight the order structure and related features such as the median and outliers and demonstrates the utility of the proposed method using high-dimensional data from various application domains. The work described in this chapter is to appear in Computer Graphics Forum (EuroVis) 2018. Chapter 7 describes a novel method for computing data depth for high - data, called ellipse band depth, that overcomes several shortcomings of existing methods for computing depth such as half-space depth and functional depth. This method is able to efficiently compute depth in very high-dimensional spaces using only inner product information while also capturing the correlations in the data. Data depth computed using this method can also be used in conjunction with the technique introduced in Chapter 6 for visualizing high-dimensional data. Chapter 8 provides a concluding discussion for this dissertation. This chapter also in- cludes a discussion of shortcomings of the proposed methods as well as several interesting avenues for further work. CHAPTER 2

TECHNICAL BACKGROUND

This chapter provides a brief description of the technical background relevant to the work in this dissertation. First, the chapter discusses order and rank statistics in ensemble data. Next, it discusses the notion of data depth, various useful properties of data depth, and three general approaches for computing data depth.

2.1 Order and Rank Statistics This dissertation deals with understanding the statistical structure of ensemble data. An important aspect of understanding the structure of ensembles is determining how central or typical various members are with respect to the ensemble. Order and rank statistics are descriptive statistics, which when selected appropriately, can help in deter- mining the centrality of ensemble members. Methods proposed in this dissertation rely on center-outward order and rank statistics to identify and highlight key members of interest in ensemble data, such as median (most central) and outliers (least central).

2.1.1 Order Statistics In the context of statistics, the kth order statistic in a sample set is equal to the value of

th its k smallest value. For example, if X = {x1, x2,..., xn} denotes a set of values that th are sampled from a random variable X, and X(i) is the i smallest value in X , where th {1 ≤ i ≤ n}, then X(i) is called the i order statistic of X . Determining order statistics requires a way to sort the members in the sample set based on a specified criterion. If X is a numeric random variable, ways to sort sample members include using the member value, member absolute value, or even the sum of distances from all other members in the sample set. 11

2.1.2 Rank Statistics

Given a set of members X = {x1, x2,..., xn} sampled from a random variable X, the rank vector, r = (r1, r2,..., rn), is defined such that n ri = ∑ δ(xi − xj), (2.1) j=1 where δ(·) is the following indicator function: ( 1 if x ≥ 0 δ(x) = . (2.2) 0 if x < 0 A rank statistic is any value that is a function of the rank vector. For example, the median of a set of numbers is the highest ranking member when the numbers are sorted based on a center-outward order statistic such as sum of distances from all numbers in the set.

2.1.3 Order and Rank Statistics in Visualization Key features of interest in ensemble data are often central/outlier members and vari- ability, which can be identified using center-outward order and rank statistics. Conse- quently, such statistics play an important role in visualizing many kinds of ensemble data. Generating ensemble visualizations using order and rank statistics typically involves two main steps: 1) analysis step: using an appropriate order/rank statistics to sort the ensemble members and identify key members and features; and 2) visualization step: highlighting the relevant information using visualization strategies specifically designed for the particular type of ensemble data. The Tukey boxplot, discussed in Chapter 1, is an example of ensemble visualization involving these two steps—analysis and visualization. The Tukey boxplot conveys several interesting quantities derived from order and rank statistics, such as the median (rank statistic); 50% band, which comprises the innermost half of the members (rank statistic); and the 100% band and outliers (determined using or- der and rank statistics). The design of the Tukey boxplot has inspired several other ensem- ble visualization techniques for a range of different data types. These techniques include the bagplot for bivariate data [93], functional boxplot for ensembles of functions [102], curve boxplot for ensembles of multivariate curves [73], surface boxplot for ensembles of 2D functions [40], and contour boxplot for ensembles of sets/isocontours [115] as in Fig- ure 2.1. These visualizations highlight key members such as the median and outliers as is 12

(a) (b) (c)

(d) (e)

Figure 2.1. Examples of existing depth-based visualizations. (a) Bagplot for bivariate points (principal components of sea surface temperature data) [48] c 2010 Taylor and Francis Group, (b) fuctional boxplot for ensemble of functions (sea surface temperatures for 12 months) [48] c 2010 Taylor and Francis Group, (c) surface boxplot for ensembles of 2D fields [40] c 2014 John Wiley and Sons, (d) contour boxplot for curves (synthetically generated) [115] c 2013 IEEE, and (e) curve boxplot for ensemble for curves (hurricane paths) [73] c 2014 IEEE.

the case in the Tukey boxplot. The analysis step for these visualizations uses specialized methods designed for particular data types, such as functions, curves, or sets, to compute order and rank statistics.

2.2 Data Depth Data-depth methods provide a quantitative notion of centrality for multivariate data. A depth function that evaluates the depth of a point, x ∈ Rd, with respect to a random

d d variable X ∈ R has the general form: D(x; FX) : R × F → [0, 1], where FX is a probability d distribution over X, F is the class of probability distributions on Borel sets of R , and FX ∈ F. The notion of centrality provided by data-depth methods has been useful in several 13 applications, such as visualization [93, 102], classification [111], and outlier detection [21]. Researchers have proposed several data-depth methods for analyzing multivariate data, such as L2 depth, zonoid depth [30], half-space depth [107], etc. Although the specific characteristics of the various data-depth methods may vary, they rely on a set of common desirable properties [124]: affine invariance, maximality at center, monotonicity relative to deepest point, and vanishing at infinity.

• P1: Affine Invariance

d Assumption: Let FX be a probability distribution over a Borel set of R .

Property: The depth of any point x ∈ Rd should be independent of the underlying coordinate system, particularly the scales of the measurement along each axis.

D(Ax + b; FAX+b) = D(x; FX). (2.3)

• P2: Maximality at Center.

d Assumption: Let FX be an angularly symmetric distribution over a Borel set of R with a uniquely defined center, c.

Property: The depth at the center should be highest with respect to all other points in the distribution.

D(c; FX) = sup D(x; FX). (2.4) x∈Rd

• P3: Monotonicity Relative to Deepest Point. Assumption: Let FX be an angularly symmetric distribution over a Borel set of Rd with a uniquely defined center, c.

Property: As a point x ∈ Rd moves away from center c in the direction of any fixed ray, the depth at x should decrease monotonically.

D(x; FX) ≤ D(c + α(x − c); FX) , ∀α ∈ [0, 1]. (2.5)

• P4: Vanishing at Infinity.

d Assumption: Let FX be a probability distribution over a Borel set of R .

Property: The data depth of a point x should approach zero as ||x|| approaches infin- ity.

lim D(x; FX) = 0. (2.6) ||x||→∞ 14

2.2.1 Data-Depth Approaches Several notions of data depth exhibit the above properties. These notions can be broadly classified into three classes according to the approach of determining depth: distance metric, weighted mean, and space partition [77].

2.2.1.1 Distance metric In depth functions based on a distance metric, depth of a point x is commonly defined as the inverse of the expected distance from other points in a distribution. Various dis- tances metrics, such as L2 or Mahalanobis distances, can be used to obtain depth functions with different properties. For example, the L2 depth can be stated as:

−1 D(x; X) = (1 + E[||x − X||2]) (2.7)

Although distance-based depth measures are simple and efficient to compute, they are often unable to properly account for the shape of anisotropic distributions. For exam- ple, in anisotropic distributions, points at extremes along a minor axis would tend to be assigned depth values similar to points along a major axis that are closer to the center (see Figure 2.2).

2.2.1.2 Weighted mean Weighted-mean regions are nested convex regions that are centered around the geo- metric center of a distribution. These convex regions are composed of weighted means

(a) (b)

Figure 2.2. Distance-based depth for an anisotropic distribution. (a) A synthetic ensemble of points on a grid that comprises two crossing ellipses with horizontal and vertical major axes. (b) Normalized L2 depth values for the point ensemble shown using a heatmap. We notice that points at the extremities of the vertical ellipse have depth values similar to the inner points in the horizontal ellipse. 15 of the data members, with a general set of restrictions on the weights that ensure their nested arrangement. This arrangement of nested convex weighted-mean regions is then used to determine the data-depth value of each data member. Various strategies of the assigning weights lead to different notions of weighted-mean depths [77]. An example of a weighted-mean depth is the Zonoid depth [30].

d Let x, x1,..., xn ∈ R . Then the zonoid depth of x with respect to x1,..., xn is

Dzonoid(x; X) = sup{α : x ∈ Dα(x1,..., xn)}, where

 n n 1  Dα(x1,..., xn) = ∑ λixi : ∑ λi = 1, 0 ≤ λi, αλi ≤ for all i . i=1 i=1 n

Here Dα(·) denotes the weighted-mean region that indicates the region with a depth greater than α and is also known as the α-trimmed region. Note that when α = 1, the weighted- 1 mean region collapses to the mean of the data, whereas 0 ≤ α ≤ n leads to a weighted- mean region that is the convex hull of data. Weighted-mean-based formulations of depth, in comparison to the distance-based for- mulations, are more effective in capturing the shape of the distribution (see Figure 2.3). However, weighed-mean formulations are more susceptible to outliers in data as the shape of the weighted-mean regions, and consequently data depth, can be strongly influenced by pathological outliers. They are also more computationally expensive and often involve solving an optimization problem.

Figure 2.3. Normalized zonoid depth values for point ensemble in Figure 2.2a shown using a heatmap. Points at the extremities of both ellipses have been assigned the lowest depth values, unlike in the case of L2 depth (see Figure 2.2b). 16

2.2.1.3 Space partition Another class of data-depth techniques relies on partitions of the data space. These methods often involve two general steps: 1) a method to partition the space based on various combinatorial subsets of the ensemble data; and 2) determining whether the en- semble members are located within the partitioned subspace. Methods based on space partitioning can be further classified into the following two categories based on the kind of partitions: half-space and band partitions. The half-space depth, introduced by Tukey, relies on half-space partitions and is a popular way to compute data depth for multivariate data in Euclidean space. Half-space depth of any point x ∈ Rd with respect to X ⊂ Rd is determined as the smallest number of data points from X that can be contained in a closed half space also containing x. Unlike weighted-mean-based depths, half-space depth is robust to pathological outliers. However, the computing half-space depth can quickly get intractable as the dimension of the data spaces increases. Several methods to compute data depth rely on the definition of a band, which is a par- tition of data space that is determined by a subset of data. For example, in simplicial depth, the band is defined as the convex hull, and the simplicial depth of any point x ∈ Rd with respect to X ⊂ Rd is determined as the probability of it being contained in a convex hull determined by d + 1 points randomly chosen from X . Band-based methods or band-depth methods are also able to capture the shape of a distribution better than half-space depth, particularly, in the case of nonconvex-shaped multivariate distributions (see Figure 2.4).

(a) (b)

Figure 2.4. Normalized (a) half-space depth and (b) simplicial depth for point ensemble in Figure 2.2a shown using heatmaps. Simplicial depth shows a larger spread of depth values for points that lie along the vertical ellipse. 17

Recently introduced band-depth methods focus on computing data depth for ensembles of several non-Euclidean data types such as functions [66] and sets [115].

2.2.1.4 Extensions to graphs The notion of centrality also plays an important role in the domain of graph analytics, particularly for social network analysis, where more central or deeper nodes are considered more important [37]. Researchers have proposed several methods to quantify the centrality of nodes on a graph, such as degree [37], closeness [94], and betweenness centrality [37] (see Figure 2.5). These methods determine node centrality by relying only on the graph structure, i.e., edge connectivity and weights, and can be considered extensions of data- depth methods to the graph domain. For example, the closeness centrality of a node is the reciprocal of the sum of its graph theoretical distance to all other nodes, and is analogous to the data-depth approach based on distance. Betweenness centrality counts the number of shortest paths that pass through a node, which in essence is similar to the data-depth approach based on band partition. In the case of betweenness centrality, the shortest paths can be considered to be bands described by pairs of nodes on the graph. A significant part this dissertation deals with the visualization of graph-based data, such as nodes and paths on a graph. In Chapter 4, we introduce graph-geodedic-hull-band depth, which is an extension of simplicial depth to graphs. In Chapter 5, we use graph centrality to determine an order structure for nodes on the graphs, which is then used for drawing visualizations that correctly convey the order structure. 18

(a) (b) (c)

Figure 2.5. A node-link diagram of a synthetic graph where node sizes encode (a) degree, (b) closeness, and (c) betweenness centrality. Degree centrality is higher for nodes with more neighbors, closeness centrality is higher for nodes that have a lower sum of distances to other nodes, and betweenness centrality is higher for nodes that lie on more of the shortest paths between other nodes. CHAPTER 3

EVALUATING SHAPE ALIGNMENT VIA ENSEMBLE VISUALIZATION

Portions of this chapter have been reproduced with permission from IEEE and is based on material published in CG&A, Evaluating Shape Alignment via Ensemble Visualization, M. Raj, M. Mirzargar, J.S. Preston, R.M. Kirby and R.T. Whitaker, vol. 36, 2016, pp. 60- 71 [89].

3.1 Introduction As computational tools for simulation and data analysis have matured, researchers, scientists, and analysts have become interested in understanding not only the deterministic output of these tools, but also the uncertainty associated with their computations and/or data collection. Consequently, there is an increasing interest in uncertainty quantification (UQ) as an integrated part of simulation and data science in a wide variety of science and engineering disciplines. UQ views the simulation and data science pipelines as a random process containing possibly both epistemic (i.e., reducible) and aleatoric (i.e., by chance) uncertainty. Quantification efforts in this random process are divided into roughly two categories: 1) efforts to understand the uncertainty and/or variability of the process through an examination of instances (samples) of the process; and 2) efforts to determine models (e.g., probability theory) that capture the nature of the process. The first of these categories, and the focus of this study, utilizes an ensemble of solutions meant to capture the inherent variability or uncertainty in a computational or data science pipeline. Although we assume that the variability seen in the ensemble can be attributed to some condition or property of the generating process, we do not assume that articulation of the process via a mathematical model is straightforward, and hence we have only the ensemble members themselves to gain insight into the originating process. Studying an ensemble in terms of the variability or dispersion between ensemble mem- 20 bers can provide useful information and insight about the underlying distribution of pos- sible outcomes. Correspondingly, ensemble visualization can be a powerful way to study this variability; however, a key challenge here is to be able to convey the variability among ensemble members while preserving the main features they share. 
Preservation of these features is particularly challenging in cases where the ensemble members are not fields over which statistical operations such as mean and variance are well defined, but instead are derived or extracted features such as isosurfaces. In this chapter, we examine the effectiveness of the contour boxplot technique [115], a descriptive summary analysis and visualization methodology, in the context of a particular medical data science application: brain atlas construction and analysis. We conducted an expert-based evaluation of the visualization of ensembles generated through the alignment of shapes using the deformation of images in the construction of atlases (or templates) for brain image analysis. To accomplish this evaluation, we constructed a prototype system for visualizing and interacting with ensembles of 3D isosurfaces through a combination of 3D rendering (isocontouring) and cut-planes (slices through 3D volumetric fields). In addi- tion, we generalized the algorithm in [115] to three dimensions as a direct extension of their analysis of isocontours to isosurfaces – that is, from co-dimension one objects embedded in 2D to co-dimension one objects embedded in 3D. This generalization allows us to compare contour boxplot summaries of an ensemble to both full enumeration of the ensemble as well as other traditional means of atlas evaluation (e.g., qualitative visual inspection of slices of the atlas image or individual volumetric images used for construction of the atlas). We employ this system to explore, in collaboration with domain experts, the efficacy of using ensemble visualization techniques for evaluating 3D shape alignment of brain MRI images. The purpose of this chapter is to study and evaluate the use of contour boxplots in a real-world data science application, the alignment of 3D shapes or surfaces in a population- based ensemble. 
Our hypothesis is that the contour boxplot will allow users to summarize their data in a meaningful way that allows either better or more efficient (faster) assessment of the atlas construction as compared to explicit enumeration of the ensemble (i.e., looking at each member image individually) or through more coarse-grained characterizations such as examination of the average intensity image or label (segmentation) probability 21 . As our evaluation results will show, the contour boxplot methodology has the potential to significantly benefit the application under study by providing a visualiza- tion of the quantitative summaries of the ensemble. Although we have formulated our hypothesis in the context of a particular application, we believe that our evaluation may provide insight into other arenas where visualization and analysis of ensembles of shape are desired. Examples of such applications will be discussed in the conclusion section. To begin, we give a brief introduction to the process of brain atlas construction and the evaluation process used by domain experts.

3.1.1 Brain Atlas Construction Construction of an anatomical atlas for a collection of brain images is an important problem in medical image analysis. The goal of various atlas construction schemes is to construct a statistical representative image and associated set of coordinate transforma- tions (i.e., deformations) from an ensemble of images [50]. Anatomical atlases provide a common coordinate system (atlas space) in which to define reference locations of brain structures. As part of the atlas construction process, nonlinear registration techniques generate deformations that can map the anatomies in an individual image to the atlas space (see Figure 3.1). The atlas construction process jointly estimates a representative image defining the atlas space (the atlas image) and the deformations aligning individual images to this atlas image (i.e., mapping the image individually to the atlas space). The atlas image generated by these techniques then represents the average (or normal) anatomy of this population. Such atlases help domain experts characterize the expected anatomical structure and variability of a population and compare different populations in terms of their group atlases (for example, healthy and unhealthy groups). Differences in the at- las anatomy can be identified both qualitatively by inspecting unaligned structures (when mapped to the atlas space) and quantitatively by analyzing the deformations, quantifying the amount of change necessary to bring individual ensemble members into alignment. Atlas generation is an automated process, but it is not parameter free, and the choice of parameters can greatly influence the quality of the result. In particular, nonlinear deforma- tions computed for medical image registration are a tradeoff between image matching and plausible deformations. For example, the deformation should not result in the elimination 22

Figure 3.1. An atlas construction scheme involves deformation and registration of all ensemble members to the atlas. The process of deformation and registration of ensemble members is called transformation to the atlas coordinate system or the atlas space.

of anatomical features or noninvertible transformations. Hence, the deformation is often controlled by tuning parameters to find a compromise between the mismatch between im- ages and the regularity (e.g., smoothness) of the transformation. Due to the regularization of the deformations and the inherent anatomical differences between ensemble members, not all features will be perfectly aligned. This imperfect alignment is manifested as blurring in the atlas image where there is disagreement regarding voxel intensity among ensemble members when mapped to the atlas space. Correct tuning of the regularization parameters allows the deformations to account for as much anatomical variability as possible by correctly aligning the corresponding anatomy, and not simply matching similar intensities. This alignment of corresponding anatomy is essential for an atlas to be effective in later statistical analysis of the population. Convergence of the optimization can be easily checked, but the degree of alignment of particular structures is analyzed qualitatively by observing the amount of blurring in the atlas image and by checking the alignment of each ensemble member (deformed to atlas space) to the atlas image. The initial alignment is often unsatisfactory, which results in an iterative process of parameter tuning and rerunning the atlas generation process. In addition, due to problems with image scans, extreme variability among the ensemble 23 members, or incorrect preprocessing, it may not be possible to achieve reasonable align- ment to the atlas image for some set of outlier images. Identification and removal of such images is often another part of the atlas generation procedure. Automated measures of global image alignment are available, but they do not give insight into why or in which spatial regions particular ensemble members have poor alignment. 
Depending on the proposed application of the atlas, these insights may be pertinent to the decision to prune or keep particular images (ensemble members). This manual iteration of parameter tuning/pruning and atlas generation eventually yields the final atlas to be used in further analysis. There are two important points to be noted about the final atlas image. The first is this representative image/segmentation is not a member of the ensemble itself, but rather an image/segmentation generated through statistical operations on the deformation fields. That is to say, it is not a member of the population that best represents the population, but rather an attempt at statistically characterizing a representative image. Second, as noted above, the iterative process does not guarantee that the resulting atlas image is crisp – that is, that there are no blurry regions in the image. The ensemble of images compared to the atlas image scenario is similar in spirit to the feature-space averaging issue highlighted in [115]; the analogy is that the isosurface (e.g., segmentation) of the average field is oftentimes not equivalent to a representative of a set chosen from isosurfaces of the individual fields. As per the rationale given in [115], the avoidance of feature-space averaging is why we believe the contour boxplot methodology provides a useful way to summarize the type of ensemble data where analyzing feature sets and their representatives is important. Since the manual, qualitative evaluation of shape alignment (as a result of image registration) is a challenging task, quantifying the variability of the shape alignment and visualizing this variability can facilitate the domain experts’ ability to effectively validate the atlas construction scheme. In Section 3.2, we introduce a prototype system that uses various uncertainty visu- alization schemes to enhance the study of variability in an ensemble of shapes. 
Before introducing our prototype system, we first provide an overview of the data used as well as a high-level description of our expert evaluation study. 24

3.1.2 Data Preprocessing for Atlases The images analyzed in this chapter are 3D MRI images obtained from the Alzheimer’s Disease Initiative (ADNI) database [49]. Each brain image in our ensemble was also provided with a corresponding label map volume with various anatomical struc- tures segmented and marked, with each brain region having a unique integer value. In order to analyze a specific structure within the brain anatomy, we used the label assigned to that structure to select it and mask out the remaining region in all members of the ensemble. The atlas construction scheme we used is the unbiased diffeomorphic atlas proposed by Joshi et al. [50], implemented as part of an open-source medical image atlas construction package called AtlasWerks [97]. We constructed atlases from ensembles of MRI images using different choices of parameters and/or different ensembles (i.e., subject groups). In each case, after constructing the atlas using the MRI images, the corresponding label map images were transformed to the common (atlas) coordinate space using defor- mation fields calculated during the atlas construction process as described in Section 3.1.1. These transformed label maps were then passed as input to the preprocessing pipeline (described in Section 3.2) for visualization. For a well-constructed atlas, we can expect the anatomical structures in the brain to have a relatively small amount of variability after being transformed to the atlas space. We selected two anatomical structures in the brain expected to pose different levels of difficulty during atlas construction, namely the left ventricle and the cortex. The ventricle is often considered as a very distinct structure (i.e., high contrast) in the brain image and, therefore, can be expected to exhibit good alignment among ensemble members in the atlas space (if all goes well). 
The cortex was selected as an example of an anatomical structure with a complex shape (see Figure 3.2), a significant challenge for registration/alignment.

3.1.3 Expert Evaluation Study Details

Domain experts use various open-source or commercial packages to visualize slices from individual volumetric images or simply from the average of the aligned images, but to the best of our knowledge, ours is the first attempt to study the alignment of shapes in atlas construction using ensemble visualization techniques. For our evaluation study, we had access to a group of five domain experts who work with atlases on a regular basis and who volunteered to participate in our expert evaluation study. This group included graduate students, staff researchers, and faculty who use atlases and medical image ensembles in their research projects.

Figure 3.2. Illustration of the cortex (green) and the ventricle (red). This image shows the segmentation provided by the label map volume for a typical ensemble member. The coarseness of the segmentation seen in this label map is mitigated by smoothing for the final visualization.

We asked the participants to explain their current methodology for evaluating the atlas construction scheme as well as the quality of the atlas in terms of being a representative of the ensemble. As mentioned earlier, we learned that this process is often performed qualitatively. A visual inspection is carried out to ascertain whether the shapes of the anatomical structures in the atlas space are realistic. Experts also mentioned that in order for an atlas to be helpful for different medical imaging applications, such as segmentation of a specific structure in the brain, they need the atlas image and the anatomical structures therein to have sufficient contrast. For example, they expect to see a crisp boundary (in terms of the average combined image intensities) between gray and white matter in the brain. Therefore, the sharpness of the boundaries of the anatomical structures in the atlas image is another criterion examined qualitatively to evaluate the alignment of the ensemble. These qualitative evaluations are often performed on a subset of the ensemble of images (in the atlas coordinate system), because visualizing the entire ensemble results in too much clutter and blurriness. Figure 3.3 shows a snapshot of a slice of the brain atlas image used as a common (atlas) coordinate system to register individual label maps from the ensemble.

(a) Atlas image slice (b) An MRI image slice

Figure 3.3. Illustration of the atlas image slice constructed using AtlasWerks [97]. The anatomical structures in the atlas image usually have lower contrast and fuzzier edges as compared to an original MRI image. This fuzziness results from performing averaging while constructing the atlas.

3.2 Our Visualization Pipeline

In this section, we discuss the visualization pipeline of our prototype system. We start with a brief summary of the various ensemble visualization strategies that we have considered and incorporated into our prototype system. We then provide an overview of the pipeline and our design choices to mitigate the challenge of visualizing and rendering an ensemble of 3D isosurfaces.

3.2.1 Ensemble Visualization Overview

Visualization is often data-driven, and therefore uncertainty visualization schemes are typically designed to deal with the type of data being visualized. For scientific data, users are often interested in visualizing derived features of their data, such as transition regions (or edges), critical points, or isosurfaces (of volumetric data) and the uncertainty associated with such feature sets. A thorough review of the rich literature on uncertainty visualization is beyond the scope of the current chapter. However, interested readers can consult [18, 85] for further details on recent advancements on this topic. The focus here is the visualization of isosurfaces in the context of uncertain scalar fields, which has been studied somewhat extensively. Most relevant to the application under study (i.e., atlas construction) is the visualization of uncertain isosurfaces extracted from an ensemble of scalar fields.

Here we provide a brief summary of three classes of popular techniques for visualization of uncertain isosurfaces that are extracted from ensembles of scalar fields. These techniques were chosen to represent the range of strategies for representing an ensemble (as discussed in Chapter 1), namely, 1) enumeration of all ensemble members; 2) visualization of the statistical summaries induced from parametric uncertainty modeling; and 3) descriptive nonparametric summaries:

1. Enumeration. A widely used approach for ensemble visualization is the direct visualization of all ensemble members. Direct visualization of ensembles has gained significant interest in applications such as hurricane prediction [23]. Ensemble-vis [86] is an example of the data analysis tools designed to visualize ensemble data. Ensemble-vis uses multiple views of fields of interest to enhance the visual analysis of ensembles. We incorporate direct visualization of 3D ensemble members (see the second column in Figure 3.4) by rendering the curves formed by the region of intersection of the co-dimension one isosurface of each ensemble member with a cut plane. Note that as long as the isosurface embedded in 3D is closed, closed curves will be generated when the isosurface is sliced for visualization purposes. We refer to this visualization as a spaghetti plot. In order to facilitate the interpretation of the individual ensemble members, each of these curves is rendered with a distinct, randomly chosen color. There are a variety of options for rendering the enumeration of all 3D surfaces, including transparency, but clutter is a significant challenge [23]. For this work, we present the surfaces of the inner- and outermost volumetric bands formed by all ensemble members. User studies have suggested the effectiveness of direct ensemble visualization techniques [23]. However, direct visualization of the ensemble does not provide any quantitative information about the data uncertainty, and relies solely on the user to interpret the data.

2. Parametric probabilistic summaries. Many uncertainty visualization schemes use probabilistic modeling to convey quantitative information regarding data uncertainty. Such techniques often rely on a certain kind of statistical model, such as multivariate normal distributions. As a representative of such techniques, we have chosen to consider

(a) 3D Contour Boxplot (b) Spaghetti Plot

(c) 3D Average Intensity Image

Figure 3.4. Three visualizations of ventricles from an ensemble containing 34 images from the ADNI data set transformed to a common atlas space. Left: the contour boxplot visualization in 3D, with the 50% volumetric band in dark purple, the 100% band in light purple, the median in yellow, and outliers in red (on the cutting plane). Middle: direct visualization of the ensemble members (spaghetti plot). Right: 3D average intensity image.

the concept of level-crossing probabilities (LCP) [82]. For visualization, we implemented the 3D probabilistic marching cubes algorithm (proposed based on LCP) [84] as part of our initial visualization system. Probabilistic marching cubes relies on approximating and visualizing the probability map of the presence of the isosurface at each voxel location. However, the use of parametric modeling can limit the capability of these techniques. Approximating the underlying distribution giving rise to the ensemble and presenting the user with only aggregated quantities of the inferred distribution can be misleading in some applications. For instance, this approach can often hide or distort structures that are readily apparent in the ensemble.

3. Nonparametric descriptive summaries. An alternative strategy that relies on neither enumeration nor parametric modeling of the underlying distribution is to form descriptive statistics of an ensemble. Descriptive statistics offer an ensemble visualization paradigm for understanding or interpreting uncertainty from the structure of an ensemble. The notion of centrality is a natural approach to understanding the structure of an ensemble. Because an ensemble is an empirical description of its distribution, some instances from an ensemble are more central to the distribution, and therefore more typical within the distribution. The notion of data depth provides a formalism for characterizing how central a sample is within an ensemble. Data depth provides a natural generalization of rank statistics to multivariate data [66]. The univariate boxplot (or whisker plot) is a conventional approach to visualizing order statistics. Boxplot visualizations provide a visual representation of the main features of an ensemble, such as the most representative member (i.e., the median), quartile intervals, and potential outliers. The notion of data depth has been generalized to ensembles of isocontours [115]: the authors propose the contour boxplot as a visualization technique to summarize robust and descriptive statistics of ensembles of 2D isocontours. In our system, we algorithmically extend and implement the contour boxplot analysis for isosurfaces embedded in 3D (see Figure 3.4, first column) as an example of visualization techniques based on nonparametric descriptive statistical summaries of an ensemble.
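The band-depth idea behind the contour boxplot can be sketched for binary inside/outside masks: the depth of a member is the fraction of member pairs whose "band" (between the pair's intersection and union) contains that member. The following is a simplified NumPy illustration on synthetic nested disks; it omits the epsilon relaxation used in practice and is not the dissertation's actual implementation.

```python
import numpy as np
from itertools import combinations

def contour_band_depth(masks):
    """Simplified contour band depth for an ensemble of binary masks.

    The depth of member k is the fraction of pairs (i, j) such that
    intersection(i, j) <= mask_k <= union(i, j) holds pointwise
    (boolean <= is set inclusion per voxel). Works for 2D or 3D arrays.
    """
    n = len(masks)
    pairs = list(combinations(range(n), 2))
    depths = np.zeros(n)
    for k in range(n):
        count = 0
        for i, j in pairs:
            inter = masks[i] & masks[j]
            union = masks[i] | masks[j]
            if np.all(inter <= masks[k]) and np.all(masks[k] <= union):
                count += 1
        depths[k] = count / len(pairs)
    return depths

# Synthetic ensemble: nested disks. The middle-sized disk is the most
# central member, so it should receive the highest depth (the "median").
yy, xx = np.mgrid[-1:1:64j, -1:1:64j]
radii = [0.3, 0.4, 0.5, 0.6, 0.7]
masks = [(xx**2 + yy**2) < r**2 for r in radii]
depths = contour_band_depth(masks)
```

On this nested family, the member with radius 0.5 is deepest, matching the intuition that the median is the most representative member.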

In order to analyze the alignment, or lack thereof, of shapes in an ensemble, we incorporated representative members of the aforementioned ensemble visualization technique categories as part of our prototype system.
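As a concrete illustration of the level-crossing idea from the second category, a crossing probability can also be estimated nonparametrically, per cell, as the fraction of ensemble members whose values straddle the isovalue across that cell. This plain-NumPy sketch on synthetic 2D data stands in for, and is not, the Gaussian model of probabilistic marching cubes [84].

```python
import numpy as np

def level_crossing_probability(ensemble, isovalue):
    """Per-cell empirical probability that the isovalue crosses the cell.

    `ensemble` has shape (n_members, ny, nx). A cell here is a pair of
    horizontally adjacent samples; a member "crosses" the cell when the
    isovalue lies strictly between the two sample values.
    """
    left = ensemble[:, :, :-1] - isovalue
    right = ensemble[:, :, 1:] - isovalue
    crossed = (left * right) < 0      # opposite signs => level crossing
    return crossed.mean(axis=0)       # fraction of members, per cell

# Demo: noisy horizontal ramps from 0 to 1. The 0.45 level falls between
# the samples at 0.4 and 0.5, so that cell is crossed by nearly every
# member, while cells far from the level almost never cross.
rng = np.random.default_rng(1)
base = np.tile(np.linspace(0.0, 1.0, 11), (5, 1))[None, :, :]
members = base + 0.02 * rng.standard_normal((20, 5, 11))
lcp = level_crossing_probability(members, isovalue=0.45)
```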

3.2.2 Ensemble Visualization Prototype System

At a high level, our prototype system consists of two stages (see Figure 3.5):

1. Data Preprocessing. When visualizing isosurfaces of a binary 3D segmented image, it is often necessary to perform smoothing to reduce aliasing artifacts and facilitate 3D rendering/shading. We perform this smoothing in a two-step preprocessing stage. In the first step, the binary partitioned image is antialiased using an iterative relaxation process described in [114]. Next, a very small amount of mesh smoothing

Figure 3.5. Overview of prototype system designed for shape alignment evaluation using ensemble visualization.

is performed on the isosurface mesh generated from the antialiased binary image. All visualization preprocessing operations occur on the 3D volume (and corresponding co-dimension one isosurfaces) prior to cut-plane extraction.

2. Visualization. This stage includes several visualization strategies to facilitate the perception and navigation of the rendered 3D objects. In order to improve the perception of shape in our application, we include interactivity with renderings of 3D objects as part of the visualization system. In our setting, the user is able to rotate the object displayed on the screen using the standard trackball interaction mechanism. The system allows the user to select cutting planes, which clip a portion of the volume displayed on the screen, to render cross-section views of surfaces embedded in 3D. The user can also interactively orient and translate the cutting plane. Additionally, the system provides the flexibility of having one or multiple cutting planes and interactively adjusting their position and orientation. The interface allows the user to interactively select various features of interest for rendering in order to focus on any particular feature. For example, the user can select specific ensemble members to be rendered individually.
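The antialiasing step of the preprocessing stage can be sketched as an iterative relaxation that smooths a signed embedding of the binary volume while preserving each voxel's inside/outside label. This is a simplified stand-in for the method of [114], not the actual implementation; the function name, iteration count, and filter choice are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def antialias_binary(volume, iterations=20):
    """Relax a signed embedding of a binary volume, clamping voxel signs.

    Inside voxels start at +0.5 and are kept >= 0; outside voxels start
    at -0.5 and are kept <= 0. The 0-level set of the returned field is
    the antialiased surface.
    """
    phi = np.where(volume, 0.5, -0.5).astype(float)
    inside = volume.astype(bool)
    for _ in range(iterations):
        phi = ndimage.uniform_filter(phi, size=3)      # local averaging
        phi[inside] = np.maximum(phi[inside], 0.0)     # keep inside >= 0
        phi[~inside] = np.minimum(phi[~inside], 0.0)   # keep outside <= 0
    return phi

# Demo: a blocky binary ball becomes a smooth signed field whose
# zero level set approximates a sphere.
zz, yy, xx = np.mgrid[-1:1:32j, -1:1:32j, -1:1:32j]
ball = (xx**2 + yy**2 + zz**2) < 0.5**2
phi = antialias_binary(ball)
```

An isosurface mesh would then be extracted from the zero level of `phi` and lightly smoothed, per the two-step pipeline described above.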

In the case of 3D contour boxplots, the analysis is performed on the 3D binary segmented volumetric data (in the preprocessing stage), and the results are rendered interactively. While the analysis is performed on the volumetric data, leading to volumetric 50% and 100% bands, we render the visualization of the statistical summaries only on chosen cut planes to deal with the issue of occlusion. For instance, in the absence of a cut plane, the 100% band entirely occludes the median shape as well as the 50% band.
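Given depth values for the ensemble members, the volumetric bands rendered by the 3D contour boxplot can be sketched as the set difference between the union and the intersection of the deepest members. A minimal NumPy illustration with synthetic nested masks and hand-assigned depth values (both are assumptions for the demo):

```python
import numpy as np

def depth_bands(masks, depths, fraction=0.5):
    """Band for the `fraction` deepest members: voxels in their union
    but not in their intersection. fraction=0.5 gives the 50% band,
    fraction=1.0 the 100% band. Returns a boolean volume."""
    order = np.argsort(depths)[::-1]              # deepest members first
    k = max(2, int(round(fraction * len(masks))))
    chosen = [masks[i] for i in order[:k]]
    union = np.logical_or.reduce(chosen)
    inter = np.logical_and.reduce(chosen)
    return union & ~inter

# Nested disks whose depths peak at the middle radius (the median).
yy, xx = np.mgrid[-1:1:64j, -1:1:64j]
masks = [(xx**2 + yy**2) < r**2 for r in (0.3, 0.4, 0.5, 0.6, 0.7)]
depths = np.array([0.4, 0.7, 0.8, 0.7, 0.4])
band50 = depth_bands(masks, depths, fraction=0.5)
band100 = depth_bands(masks, depths, fraction=1.0)
```

By construction the 50% band is contained in the 100% band, which is why, without a cut plane, the outer band occludes everything inside it.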

3.3 Evaluation

In this section, we demonstrate the efficacy of using ensemble visualization techniques to study the alignment of MRI brain images during brain atlas construction by gathering feedback as part of an expert evaluation study of the proposed prototype system. We refer to our expert evaluators as participants. All the visualizations presented were part of the prototype system introduced in Section 3.2. We described the prototype system to the participants after a walk-through presentation of the different ensemble visualization techniques. The participants were able to interact with the system and switch through the various visualization methods as explained in Section 3.2. For our study, we solicited their feedback on the visualization of the two anatomical structures presented below: the left ventricle and the cortical surface. We paid particular attention to the participants' comments concerning the suitability of ensemble visualization for this application. A summary of our interactions with the participants follows. We start by describing three examples where the participants gained useful insights into the atlas data when interacting with the system. In our first example, we focus on analyzing the variability within an ensemble of different regions of brain ventricles transformed to a common atlas space using the unbiased, diffeomorphic approach in [50]. Ensemble visualization not only helps general users identify regions that are either well or poorly aligned, but also provides insight regarding whether the variability is due to differences in shape, position, or both. Figure 3.4 shows the three approaches to visualizing the aligned ventricles for an ensemble of 34 brains. From the contour boxplot in Figure 3.4a, one can immediately identify regions of high variability such as Region A, which is highlighted in the figure.
In this specific region, most of the variability is outside the 50% band, which means that less than half the ensemble members contributed to this variability. Looking at the spaghetti plot in Figure 3.4b, we see there are, in fact, only two ensemble members that significantly differ from the other members in Region A. These results show that the variability is due to overall position as well as shape in this region. In Region B (Figure 3.4a), we notice that the variability can be attributed to significantly different shapes of the isocontours, and that these shapes would not easily be aligned through the smooth transformations in this atlas, and may require parameter tuning to achieve alignment. By observing Region C (Figures 3.4a–b), we see that the variability comes mostly from the positions of the isocontours. Results in Region C also show that no particular ensemble member is disproportionately responsible for the variability—the width of the 50% band is nearly that of the 100% band in this region, and outliers align well with the median contour. Finally, Region D (Figure 3.4a) demonstrates an area of very low variability across the ensemble and provides an example of good alignment of all the ventricles, which is confirmed by the spaghetti plot in Figure 3.4b. Figure 3.4c shows a volume-rendered 3D version of the average intensity image for comparison. The average intensity image is an essential part of the atlas, but it does not provide the same insights for debugging the atlas in a detailed way. We also showed the participants volume renderings of level-crossing probability values, as suggested in [84]. The participants noted that the level-crossing probability visualization shows almost the same information as the average intensity image (Figure 3.4c), which is already used extensively during atlas construction.
They did not feel that further exploration of this form of ensemble uncertainty visualization for evaluating atlases would be useful, and therefore we did not include comprehensive results from level-crossing probability renderings in this study. The second example was chosen to evaluate whether ensemble visualization can also provide insight into the overall variability between the members of an ensemble of aligned shapes. An understanding of the overall variability (as opposed to local variability) is useful not only to understand how well a particular atlas was constructed, but also to compare different atlases. For this example, we have constructed three atlases, each with an ensemble of size 30. The first atlas was constructed with a high value of regularization (transformation smoothing), λ = 1.0; a second atlas was constructed for the same ensemble while using a low regularization value, λ = 1/9; and a third atlas was constructed from a different ensemble (i.e., subject group) with the regularization/smoothing at λ = 1/9. Figure 3.6 shows slices of intensity atlases and contour boxplot visualizations for each

(a) (b) (c)

(d) (e) (f)

Figure 3.6. Brain atlases with different parameters and subject groups. Top: slices of average intensity atlases for ensembles of 30 brain images. Bottom: associated contour boxplot visualizations for cortical surfaces. Left: atlas constructed with high regularization of deformation. Middle: atlas constructed with low regularization. Right: atlas with low regularization using a different ensemble than in the other columns. of the three cases (columns from left to right). The first row presents a slice of the intensity image for each atlas, and the second row shows the 3D contour boxplot visualization of the cortical surfaces for the atlas corresponding to the intensity image above. Using a high value for the regularization parameter enforces high smoothness of the deformation fields, which in turn makes it harder to arrive at a set of deformations that would perfectly align all the individual images. The lack of alignment leads to high variability between isosurfaces in the ensemble. Such high variability is easily visible by looking at Region E in Figure 3.6d, where the 50% and 100% bands are wider than in the corresponding region of the atlas with low regularization (Figure 3.6, middle column). Better image alignment when the atlas is constructed with low regularization is also evident in Region E by comparing contours of the median and outlier shapes rendered on the cut plane in Figures 3.6c and d. We see that the median and the outlier shapes are poorly aligned for images aligned with an atlas constructed with high regularization (Figure 3.6, first column), but the alignment is much better when the atlas is constructed with low regularization. Finally, the third atlas (right column of Figure 3.6) in this example demonstrates the effect of inherent variability between the ensemble members (i.e., brain images) on the atlas construction process.
We see that in many regions of Figure 3.6f, for instance in Region F, the 100% band is significantly wider than the 50% band, indicating a significant spread in the distribution of surfaces, which is different from the variability seen in the corresponding region in Figure 3.6e, where both bands nearly overlap. Furthermore, in the third atlas we see that the outlier is well aligned with the median in some regions (see Region G), but poorly aligned in others (see Region H). This example demonstrates that shape/surface variability in atlases depends not only on the parameters of construction but also on the inherent variability of shapes in the ensemble. Thus, the contour boxplot, as part of the atlas construction process, can help users tease apart these different aspects of variability. In addition to aiding in the understanding of the general alignment of shapes in an ensemble, the contour boxplot is also useful in conveying to the general user how well a particular shape is aligned with respect to the rest of the ensemble. Such knowledge is particularly useful in the case of outlier shapes. Atlas construction is often an iterative process, and identification of outlier images that do not align sufficiently with the atlas is an important intermediate step in the process. In the contour boxplot shown in Figure 3.7c, we see a single outlier shape and its alignment relative to the ensemble. In comparing this visualization with an average intensity image of the left ventricle region (Figure 3.7a), we see that an anomaly in Region I (Figure 3.7c) appears as a barely perceptible increase in intensity in Figure 3.7a. A similar observation can be made from the intensity image slice of the outlier member shown in Figure 3.7b. However, the anomaly shows up clearly in the contour boxplot, and because it is outside the 100% band, we know that the degree of misalignment of this shape is rare within the ensemble of ventricles.
Region I also demonstrates the challenges of assessing geometry in 3D, because distances between surfaces can be exaggerated when viewing them on a single cut. However, interacting with the visualization by moving and rotating the cut plane can help verify the 3D shapes of rank statistics and the surface geometries and separation distances.

(a) (b)

(c)

Figure 3.7. Visualizations of left ventricles. Crosses mark the correspondence between the images. (a) Left ventricle slice from an intensity image of the atlas. (b) Left ventricle slice of an ensemble member identified as an outlier by data depth analysis. (c) Contour boxplot visualization of an ensemble of 34 ventricles in atlas space.

In some cases, aligned shapes can differ in size from the rest of the ensemble. For instance, Figure 3.7c shows that the outlier ventricle is noticeably smaller than the median ventricle in Regions J and K, which is not the case in Region L. This observation is not possible in the corresponding intensity images. These size differences occur for several reasons. In this example, for instance, the outlier ventricle may have been different and irregular to begin with. Another reason could be mislabeling of the ventricular region during the segmentation process to generate the labels for that image. Finally, the process of generating deformations during the atlas construction might fail, leading to irregularities for an ensemble member when mapped onto the atlas space. The contour boxplot can provide information that can help the user decide whether or not any particular outliers need to be removed from the ensemble or if further investigation is necessary to identify causes of possible misalignment.

At the conclusion of our study, we asked the participants to comment about their experience with the system, including the applicability of such a system if integrated into atlas construction software. They were also asked to compare the ensemble visualizations to the evaluation techniques they currently use. The two main techniques currently used for atlas evaluation are inspecting unaligned structures (when mapped to atlas space) and analyzing the deformations, quantifying the amount of change necessary to bring individual ensemble members into alignment.
Here we summarize the observations of the participants in this study:

• The participants pointed out that being able to visualize the extent of the variation among the ensemble of aligned shapes in terms of quantitative percentile information using the contour boxplot visualization was helpful for comparing various atlas construction schemes (or comparing atlases that were constructed from different ensembles or parameter settings). They also mentioned that the contour boxplot has the potential to help reduce the time needed for the user of the atlas construction software to gain insights regarding the quality of the atlas.

• The participants noted that state-of-the-art techniques for evaluation/visualization of atlases provided limited information about the variability that remained within an ensemble after transforming it to atlas space. Deformation and image match energies (quantities that are optimized during registration of images in atlas construction) are not able to provide insight into the geometric discrepancies that are crucial to understanding atlas quality.

• The participants noted that the capability of the contour boxplot to effectively locate and characterize different types of variability was valuable in atlas construction.

• The participants pointed out that an automated and statistically robust way of identifying and visualizing outliers in an ensemble can play a major role in construction of an atlas.

• The spaghetti plot was found to be helpful for viewing the contours of specific ensemble members other than the median or outliers.

• The participants noted that both the contour boxplot and the spaghetti plot were able to convey important details pertaining to the variability in an ensemble, whereas the average intensities had limited utility because of their general fuzziness.

We conclude this section by summarizing our findings from this study and the interview process. The goal of the application described in this chapter is to evaluate the alignment of 3D shapes, in particular the alignment of 3D MRI images that have been transformed to a common atlas space, using various ensemble visualization methods. We observed that the ensemble visualization methods are helpful in characterizing the alignment of shapes and, furthermore, provide insights that are useful in understanding the variability in alignment. An understanding of the type or location of the variability can be helpful in tuning parameters used in atlas construction and/or removal of outliers to achieve better alignment. The contour boxplot emerged as a clear favorite of our participants. One of the salient features of the contour boxplot that makes it distinct from the other ensemble visualization approaches is its ability to convey an aggregated result from the analysis of all regions of shapes in the ensemble on any arbitrary cut plane. For example, visualizing a slice of the intensity image, or contours on a cut plane using the spaghetti plot, conveys the variability for only the region intersecting the cut plane, whereas a contour boxplot visualization using the same cut plane also provides information about the median and outlier contours that are calculated from a global analysis of contours. The contour boxplot, however, has a drawback in that it does not give the user much information about specific ensemble members other than the median or the outliers. For such cases, the spaghetti plot with interactivity that allows highlighting of specific ensemble members can augment the contour boxplot by providing more detail if the general user wishes to focus on very specific anatomical areas or members of the ensemble.

3.4 Conclusions

In this chapter, we introduce a new approach to study the alignment of shapes. We demonstrate the efficacy of using the 3D contour boxplot ensemble visualization technique to analyze shape alignment and variability in atlas construction and analysis as a real-world application using a prototype system. The system was evaluated by medical imaging experts and researchers working with medical image atlases in an expert evaluation study that was conducted to examine the applicability of ensemble visualization for studying shape alignment and variability. We find that providing the user with both quantitative and qualitative visualization of variability can yield a better understanding of the main features of the ensemble and the atlas construction quality. Future work for our system in the context of the current application includes refining the system in order to address the suggestions provided by the participants, such as viewing the specific structures in the context of the whole brain and providing more interaction options. Furthermore, the ensemble visualization approaches discussed in this chapter can be integrated into an atlas construction package in order to provide users the capability of interactively inspecting the shape alignments and the variability among ensemble members after atlas construction. Motivated by the feedback from the participants, a more comprehensive study is required to examine the applicability of ensemble visualization to compare different atlas construction schemes. In addition, studying shape variability has applications in various branches of science. In molecular dynamics, researchers study different types of molecular structures and the shapes of their potential fields in solutions (which vary stochastically) in order to understand, for instance, their biochemical properties [123]. Scientists are also interested in the evolution of the shape of molecules.
For example, the surfaces of 3D molecular chains are of significant interest for the comparison of various types of protein structures [123]. Figure 3.8a shows the contour boxplot visualization of the surface of an ensemble of simulated HIV molecules. The ensemble members underwent a Procrustes alignment (translation, rotation, scale) using the positions of the underlying molecules. The potential fields that form these contours are inherently smooth, and thus there was no need to preprocess these volume data. Another application where the study of shape variability and alignment is of significant interest is fluid mechanics. In fluid mechanics, when developing models of vortex behavior, scientists oftentimes study the variability of the shape of vortex structures among different simulations (e.g., using slightly different parameter settings or boundary conditions) to confirm that their observations are repeatable [117] rather than a numerical artifact of a particular simulation. The center of an eddy corresponds to low pressure values in the flow, and hence studying the pressure field of a fluid flow can help detect the position of eddies and regions of high vorticity. We have used the 2D incompressible Navier-Stokes solver from the open-source package Nektar++ [1] to generate an ensemble of 28 fluid

(a) (b)

Figure 3.8. Contour boxplot visualizations for simulated ensemble data sets. (a) Contour boxplot visualization for an ensemble of 100 simulated HIV protein surfaces. Here, we see the median contour in yellow and the outlier contours in red. (b) Contour boxplot visualization of the isosurface of the pressure field of a fluid flow. The pressure is considered as a function of depth to generate a 3D pressure volume. The median contour is drawn in yellow and the outlier contours are drawn in red.

flow simulation runs. These simulations were designed for a steady fluid flowing past a cylindrical obstacle. For each of the ensemble members, we randomly perturbed the initial conditions, such as inlet velocity and Reynolds number. For this example, the pressure dependence in the third dimension was computed analytically. The contour boxplot visualization of the isosurfaces of the pressure volume is shown in Figure 3.8b. Many possible applications beyond the ones showcased could benefit from the contour boxplot summary and visualization technique.

CHAPTER 4

PATH BOXPLOTS FOR CHARACTERIZING UNCERTAINTY IN PATH ENSEMBLES ON A GRAPH

Portions of this chapter have been reproduced with permission from the Taylor and Francis Group and are based on material published in JCGS, Path Boxplots for Characterizing Uncertainty in Path Ensembles on a Graph, M. Raj, M. Mirzargar, R. Ricci, R.M. Kirby, R. Whitaker, vol. 9, 2017, pp. 243-252 [90].

4.1 Introduction

Making sense of sets of information defined over graphs can often be challenging because graphs are typically used to represent abstract data that may not be easily representable in a flat, or Euclidean, space. Here, we define a graph G(V, E, W) as a set of vertices (or nodes) V, a set of edges E ⊆ V × V, and a set of edge weights, W : E → R+, assigned to each edge. In this chapter, we describe a method to gain insight into a particular type of data represented on graphs, namely collections or ensembles of paths on graphs, henceforth referred to as path ensembles. We define a path (a special type of subgraph) as a sequence of vertices p = (vi : 1 ≤ i ≤ m), where vi ∈ V and each consecutive pair of vertices in the sequence has an associated edge, (vi, vi+1) ∈ E for all i ∈ {1, . . . , m − 1}. We define a path ensemble as a collection of paths on a particular graph. Paths on a graph are natural structures used to describe and analyze data in a range of applications. For instance, in transportation, urban planners study ensembles of paths of commuters (e.g., from recorded GPS data) to identify important travel corridors and to plan new routes [34]. Analysis is performed on a graph whose vertices are usually transition points (road intersections, airports). These vertices have a geographical location and an abstract, logical meaning. The edges in the graph represent direct transportation connections between vertices (segments of roads, routes of airplanes), and they often encode, as weights, information about transit time or cost. A path on this graph is an abstraction of a commuter's path. In computer networks, system administrators try to detect anomalies or attacks by keeping track of the paths taken by the network traffic over a period of time [19].
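The definitions above translate directly into code. In this minimal sketch, an edge-weight dictionary plays the role of W, and the helper names (`is_path`, `path_weight`) are illustrative, not part of any cited system:

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]

def is_path(graph: Dict[Edge, float], p: List[str]) -> bool:
    """True when every consecutive vertex pair (v_i, v_{i+1}) of p is an
    edge of the (undirected) graph, matching the definition in the text."""
    return all((p[i], p[i + 1]) in graph or (p[i + 1], p[i]) in graph
               for i in range(len(p) - 1))

def path_weight(graph: Dict[Edge, float], p: List[str]) -> float:
    """Total weight of a path: the sum of its edge weights."""
    total = 0.0
    for i in range(len(p) - 1):
        e = (p[i], p[i + 1])
        total += graph.get(e, graph.get((e[1], e[0]), float("inf")))
    return total

# A toy road network: weights could be transit times, as in the text.
G = {("a", "b"): 1.0, ("b", "c"): 2.0, ("c", "d"): 1.5, ("a", "d"): 5.0}
```

For example, `["a", "b", "c", "d"]` is a valid path of weight 4.5 on G, while `["a", "c"]` is not a path because ("a", "c") is not an edge.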
Analysis is performed on a graph whose vertices are Internet subdomains known as autonomous systems (ASes), and edges represent a direct data link between ASes, which can encode, as weights, transfer capacity. A path on this AS graph represents the path of a packet on the Internet. In molecular dynamics, where scientists are interested in studying the protein folding process, various possible configurations (also known as states) of a specific protein structure are known, but the sequence of discrete intermediate states in the process of protein folding is not. Analysis is performed on a configuration graph whose vertices represent the possible protein configurations, and weights on edges denote the respective transition probabilities between the associated pair of configurations. In this case, a path is a sequence of potential discrete intermediate states and may be identified by carrying out simulations that incorporate stochastic transitions. These simulations result in an ensemble of possible paths for a folding process on the graph associated with a molecule [8]. In path analysis [119], graphs are used to model dependencies (encoded as edges) among a set of variables (encoded as vertices). Direct and indirect dependencies between variables can be represented as edges and paths, respectively, in a model (graph).

Recently, researchers have begun considering the problem of systematically analyzing and visualizing path ensembles. One of the first challenges is how to summarize or aggregate the information in path ensembles. One approach to aggregation relies on specialized heuristics that often incorporate statistics of low-dimensional descriptors of paths. In road networks, the average travel time between two nodes becomes a salient feature [46]. In the analysis of computer networks, one might quantify the amount of traffic passing through a node in a computer network [19].
In molecular dynamics, the product of transition probabilities along folding paths is considered [8]. Another aggregation approach proposed by researchers is to compare paths directly, rather than using low-dimensional descriptors. Aggregate operations on path ensembles often rely on a definition of the distance between two paths, such as the Hausdorff [113] or Fréchet [32] metrics, which are, in turn, based on distances between individual vertices. From these distances one can generalize the classical notions of statistical summaries, such as median and mean [2, 33], as well as clustering [54]. Such aggregate characteristics for a path ensemble can help in understanding the structure of the ensemble.

To fully understand the structure of path ensembles by evaluating relationships between paths, applications need to consider not only distances between vertices in a path, but also patterns or differences in the (global) structure or shapes of paths. For instance, some paths may deviate from a central or most representative path in either typical or atypical ways. State-of-the-art aggregation techniques for path ensembles typically ignore the relationships that may exist between patterns of vertices in a path.

A growing body of research in analysis methods based on the notion of data depth robustly accounts for nonlocal relationships (correlations) among variables in multidimensional data, in essence capturing their global structure faithfully. Data depth is a method from descriptive statistics that provides a way to quantify the centrality of multivariate points in an ensemble and derive a center-outward ordering, with few assumptions about the underlying distribution. Data depth has been shown to generalize to multidimensional data, and data-depth formulations, which account for relationships among variables, have been developed for specialized data types such as functions [66, 103], isocontours [115], and curves [67, 73]. Motivated by formulations of data depth for ensembles of multidimensional data, we propose a generalization of data depth for path ensembles on graphs, which we call path band depth.
At a high level, our generalization comprises two parts, which it shares with earlier formulations for functions and curves: 1) a definition of a band formed by a set of ensemble members; and 2) a definition of path band depth. We also propose a visualization strategy for path ensembles, which we call the path boxplot, based on the order statistics induced by the depth assigned to the paths.

This chapter is organized as follows. In Section 4.2, we briefly discuss distance metrics that are currently used to analyze path ensembles, followed by the notion of data depth and band depth, a type of data depth, along with its existing formulations for specialized data types such as functions and curves. In Section 4.3, we develop our generalization of band depth for paths. In Section 4.4, we develop our proposed path boxplot visualization strategy. In Section 4.5, we compare our generalization to distance-metric-based alternatives using synthetic data and present two real applications: transportation and computer networks.

4.2 Background and Related Work

We begin with a brief discussion of current methods for the analysis of path ensembles. In order to select a representative path, Evans et al. [33, 34] have proposed a generalization of the Hausdorff distance for sets of vertices on graphs, which they call the network Hausdorff distance (NHD). The classical Hausdorff distance is a measure of dissimilarity between sets and is defined as the maximum of distances from points in one set to their respective nearest neighbors in another set. For paths, we let pa and pb denote the sets of vertices for two paths within a weighted graph, and then the network Hausdorff distance is defined as [34]

d_H(p_a, p_b) = max_{v_a ∈ p_a} min_{v_b ∈ p_b} d_g(v_a, v_b),   (4.1)

where d_g(v_a, v_b) is the geodesic (or shortest path) distance between vertices v_a and v_b. The path minimizing the sum of distances from all other paths in an ensemble is the most representative path, a natural generalization of the median.

Alternatively, Eiter and Mannila [32] use the discrete Fréchet distance (DFD) between paths as an approximation of the classical Fréchet distance. It relies on the set of monotonic orderings of the vertices (correspondences, or parameterizations, between paths). The length associated with a correspondence between two paths is defined as the maximum geodesic distance between corresponding vertices, and the DFD is defined as the minimum length over all possible correspondences. As with functions, point-based metrics of geometric distances, such as NHD and DFD, generally do not account for the overall, global structure of objects (paths in this case). Therefore, although such metrics account for worst-case vertex distances, they do not capture what is generally referred to as shape differences in the geometric setting.

This chapter introduces a method for exploratory analysis and visualization of path ensembles on a graph, with consideration of their global structure. The proposed approach is motivated by the univariate boxplot (see Figure 4.1a), introduced by Tukey [108] as an exploratory data analysis tool. The boxplot uses data depth to summarize the descriptive statistics of an ensemble based on rank statistics: the median, first, and third quartiles; nonoutlying minimum and maximum values; and identified outliers.
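As a concrete illustration, the network Hausdorff distance of Equation 4.1 can be sketched in a few lines of Python. This is our own illustrative sketch (the adjacency-dictionary representation and function names are not from the original implementation); geodesic distances are computed with Dijkstra's algorithm:

```python
import heapq

def geodesic_distances(adj, source):
    """Dijkstra: geodesic (shortest-path) distances from `source` to all vertices.
    `adj` maps each vertex to a dict of {neighbor: positive edge weight}."""
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def network_hausdorff(adj, path_a, path_b):
    """d_H(p_a, p_b): max over v_a in p_a of min over v_b in p_b of d_g(v_a, v_b)."""
    worst = 0.0
    for va in path_a:
        dist = geodesic_distances(adj, va)
        worst = max(worst, min(dist[vb] for vb in path_b))
    return worst
```

A median path can then be chosen as the ensemble member minimizing the sum of NHDs to all other members, as described above.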


Figure 4.1. Band and boxplot for univariate points and functions. (a) A classic boxplot for univariate data. (b) A functional boxplot for an ensemble of functions. The median function is drawn in yellow, outlier functions in red. The 50% and the 100% data envelopes are shown in dark and light purple, respectively. (c) An ensemble of five functions and a sample band formed by three member functions (f2, f3, and f4) from the ensemble.

A widely adopted strategy for evaluating the depth of a data sample with respect to a data ensemble is band depth. Band depth is a formulation of data depth that relies on the probability that a data point lies between a random selection of other points from the distribution. For multivariate data, the simplicial depth of an n-dimensional point is the probability of the point lying in the simplex formed by n + 1 (distinct) randomly chosen points from the distribution [62]. López-Pintado and Romo propose a concept of band depth for functions [66] that goes beyond pointwise analysis of functions and accounts for nonlocal correlations spanning the function domain. Sun and Genton [103] use this data ordering to construct functional boxplots, a generalization of the conventional whisker plot for visualization of ensembles of functions (see Figure 4.1b). Several authors have proposed extensions of functional band depth to curves in n dimensions, with associated boxplots [67, 73]. The proposed method generalizes function/curve band depth to paths, and therefore we give a brief overview of the methodology for band depth on functions and curves [66, 67, 73]. First, we consider an ensemble of n functions:

E = {f_1(t), f_2(t), …, f_n(t)} ⊂ F,  f_i ∈ F,   (4.2)

where F = {f | f : R → R} denotes the space of continuous functions on a compact interval. A function g falls within the band B[·] formed by a set of j functions if it lies within their min/max envelope (see Figure 4.1c). That is,

g ⊂ B[{f_{i_1}, …, f_{i_j}}]  iff  min{f_{i_1}(t), …, f_{i_j}(t)} ≤ g(t) ≤ max{f_{i_1}(t), …, f_{i_j}(t)}  ∀t.   (4.3)

Note that the band associated with a random set of functions is their min/max envelope, and inclusion in the band forms a binary test that provides evidence of centrality; it is not to be confused with other statistical summaries, such as a confidence interval or the variance of functions. The band depth of each ensemble member, g, is defined as the probability of its inclusion within the band formed by a random selection of j other functions from the ensemble:

BD_j(g) = Prob(g ⊂ B[{f_{i_1}, …, f_{i_j}}]).   (4.4)

For computation, the probability in (4.4) is expressed as the expectation of the characteristic function of g ⊂ B[{f_{i_1}, …, f_{i_j}}] and approximated by a sample mean using all choices of j samples from the ensemble (or a random subset, if the ensemble is large):

Prob(g ⊂ B[{f_{i_1}, …, f_{i_j}}]) = E[χ(g ⊂ B[{f_{i_1}, …, f_{i_j}}])]
                                   ≈ (1 / C(n, j)) ∑_{{f_{i_1}, …, f_{i_j}} ⊂ E} χ(g ⊂ B[{f_{i_1}, …, f_{i_j}}]),   (4.5)

in which C(n, j) denotes the binomial coefficient counting the j-sized subsets of the ensemble, and

where χ(·) denotes the characteristic function.
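The sample-mean approximation in Equation 4.5 is straightforward to implement for functions sampled on a common grid. The sketch below is our own illustration (the names are ours, not from [66]); for simplicity, subsets that contain g itself are included in the count:

```python
from itertools import combinations

def band_depth(ensemble, j=2):
    """Band depth BD_j for each member of `ensemble`, a list of equal-length
    sequences of function samples, via the sample mean of Eq. (4.5)."""
    n = len(ensemble)
    subsets = list(combinations(range(n), j))
    depths = []
    for g in ensemble:
        inside = 0
        for idx in subsets:
            band = [ensemble[i] for i in idx]
            # g lies in the band iff it stays within the min/max envelope everywhere
            if all(min(f[t] for f in band) <= g[t] <= max(f[t] for f in band)
                   for t in range(len(g))):
                inside += 1
        depths.append(inside / len(subsets))
    return depths
```

Sorting the ensemble by these values yields the center-outward ordering used by the functional boxplot.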

Several practical issues are worth noting. The choice of the number of samples j used to form the band is not specified by the formulation and may depend on the nature of the data (e.g., variability, number of samples). For larger ensembles, the total number of j-sized subsets may be too large, in which case random subsets may be chosen. Alternatively, the number of j-sized subsets of E may not be large enough to produce reliable probability estimates and properly order the samples. To address this issue, [66] proposed the modified functional band depth, which replaces the characteristic function χ in (4.5) with the measure over the domain of f ∈ F for which the pointwise inclusion within the band holds. This relaxation can undermine the shape-discrimination properties of the depth formulation. Alternatively, Whitaker et al. [115] propose an ε-modified band depth (for sets and contours) that relaxes χ to allow a certain amount (e.g., a percentage) of the domain to fall outside of the band.

4.3 Band Depth for Paths on Graphs

In this chapter we propose a formulation of band depth for vertices of a graph and extend that formulation to band depth for paths on graphs. The strategy for building a band for paths mirrors the development of band depth for curves (i.e., functions c : R → R^n) [67, 73], which is to establish a definition of a band for points in the range of the function in R^n and then apply that band definition for all points in the domain.

In R^n, the band formed by a set of j points has been formulated as the convex hull of X = {x_1, …, x_j}, where x_i ∈ R^n for all i ∈ {1, …, j} [62]. The convex hull of X, H[X], is the smallest convex region that contains X. H[X] is a simplex for j = n + 1 (and points in general position), and H[X] has measure zero for j ≤ n. For n = 1, the convex hull is the subset of the real numbers bounded by the minimum and maximum of the points in X. López-Pintado et al. [66], as well as Mirzargar et al. [73], generalize the function-band-depth formulation to curves, C_j(t) = {c_1(t), …, c_j(t)}, where c_i : R → R^n, using the parameterized set of convex hulls of points in R^n. That is, B[C_j](t) = H[{c_1(t), …, c_j(t)}]. Here we use a similar generalization strategy for paths on graphs, namely a parameterized convex hull on the vertices.

We define the length of a path p as the sum of the weights along its edges, denoted ||p||, and its cardinality |p| as the number of constituent vertices. A geodesic between two vertices (u, v) is the path between them with the shortest length, and we denote this geodesic distance as d_g(u, v). Geodesic (shortest) paths are not necessarily unique in a graph. In this chapter, to clarify the discussion, we will generally assume there exists some consistent way to decide among multiple geodesics (in our implementation we use the first geodesic found by Dijkstra's algorithm); the theory and formulation can be extended to the possibility of multiple geodesics.

We begin with a definition of the band formed by vertices on a graph. Let us define the subsets of vertices of size j as S_j = {V' ∈ P(V) : |V'| = j}, where P(V) is the power set of V. A vertex v is said to lie in the band formed by V_j ∈ S_j if and only if it lies in the convex hull [81] of V_j on G. There are several formulations of convex hulls of a subset of vertices V_j on G; here we propose to use the geodesic-convex hull on G, because of its natural relationship to the simplex and convex-hull band depth in R^n. A geodesic-convex set of vertices on a graph is a set of vertices that is closed under geodesic paths (all geodesic paths between all vertices in the set are contained in the set). The convex hull of a set V_j, referred to as a j-simplex, is the smallest geodesic-convex set that contains V_j (and hence can be thought of as the geodesic closure of V_j). We denote the convex hull of V_j by H[V_j].
In order to define band depth, we consider selecting j vertices independently from a probability distribution over the vertex set V, given by Prob_V(v) for v ∈ V. From these vertices we form V_j ∈ S_j. We can now ask whether a vertex v falls inside the convex hull formed by our random selection of vertices, where the probability of this event is the product of the aforementioned vertex probabilities (by the independence assumption). With this in place, we can define the graph-geodesic-hull band depth of a vertex with respect to the j-simplex to be vBD(v) = Prob(v ∈ H[V_j]), where V_j is a set of j independent samples taken from the probability distribution we have defined for vertices. If the graph is finite, the depth of a vertex can be computed in closed form. The band depth of v can be expressed as the expected value of the characteristic function χ for v falling within (or belonging to) a random j-simplex. That is,

vBD(v) = E_{V_j ∈ S_j}[χ(v ∈ H[V_j])] = ∑_{V_j ∈ S_j} χ(v ∈ H[V_j]) ∏_{v_m ∈ V_j} Prob_V(v_m).   (4.6)

This form also reveals that the proposed graph-geodesic-hull band depth is a generalization of graph centrality from graph theory [36]. That is, the centrality of a vertex in a graph has been quantified as the number of geodesic paths that pass through that vertex

[36], which corresponds to j = 2 and Prob_V(v) = 1/|V| in (4.6). Thus, graph-geodesic-hull band depth characterizes both the structure of the graph itself (and the centrality of points) and the probability distribution on the vertices.

The extension from vertices to paths proceeds as in the case of curves, with some additional technicalities. For this, we formulate a path on a graph as a mapping p : I → V over an index set I = [1, 2, …, m] onto the vertex set V, and we use the notation p(l) to denote the vertex of path p that is mapped from index l ∈ I. The band formed by j paths sharing a common index set is the parameterized set of j-simplex bands formed by their corresponding vertices. Thus, we can index a set of j paths, P_j, such that P_j(l) ∈ S_j for all l ∈ I.
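Returning briefly to the vertex case, the geodesic hull H[V_j] can be computed by iterating to a fixed point (adding every vertex that lies on a geodesic between two current members), after which Equation 4.6 can be evaluated. The sketch below is our own illustration: it assumes an unweighted graph, a uniform vertex distribution, and, for simplicity, samples the j vertices without repetition, a slight deviation from the independent-sampling definition:

```python
from collections import deque
from itertools import combinations

def bfs_dist(adj, s):
    """Hop distances from s on an unweighted graph (adjacency dict: vertex -> set)."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def geodesic_hull(adj, seeds):
    """Smallest geodesic-convex vertex set containing `seeds`: repeatedly add every
    vertex lying on some geodesic between two current members, until a fixed point."""
    dist = {v: bfs_dist(adj, v) for v in adj}      # all-pairs hop distances
    hull = set(seeds)
    changed = True
    while changed:
        changed = False
        for u, v in combinations(sorted(hull), 2):
            for w in adj:
                # w is on a geodesic between u and v iff the triangle is tight
                if w not in hull and dist[u][w] + dist[w][v] == dist[u][v]:
                    hull.add(w)
                    changed = True
    return hull

def vertex_band_depth(adj, v, j=2):
    """vBD(v) under a uniform vertex distribution: the fraction of j-sized vertex
    subsets whose geodesic hull contains v (cf. Eq. 4.6, sampled without repetition)."""
    groups = list(combinations(sorted(adj), j))
    return sum(v in geodesic_hull(adj, g) for g in groups) / len(groups)
```

On a three-vertex path graph, for example, the middle vertex attains the maximal depth, consistent with the centrality interpretation above.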

The formulation for testing a path p against the band formed by a set of paths P_j that are parameterized over I is

p ∈ B[P_j]  iff  p(l) ∈ H[{p_1(l), …, p_j(l)}]  ∀ l ∈ I.   (4.7)

The band depth of a path p is Prob(p ∈ B[P_j]), where P_j is a set of j independently drawn paths from the distribution of paths. As with other notions of band depth, the path band depth can be computed as the expectation of the characteristic function of p being in the band of a randomly chosen set from the distribution of paths:

pBD(p) = E[χ(p ∈ B[P_j])],   (4.8)

where P_j again represents a set of j independently drawn paths from the distribution Prob_P(p) over all possible paths P.

The expectation over the bands is approximated as a sample mean from a random collection of j-sized subsets of an ensemble. In some cases, small sample sizes may interfere with the ability to estimate this expectation with sufficient accuracy to resolve differences among samples with low band depth. Thus, modified versions can either use a measure over the index set rather than a binary characteristic function [66] or relax the "for all" condition in Equation 4.7 to allow a certain number of vertices to fall outside the simplex band, as proposed in [115].

The proposed formulation for band depth on paths requires P_j and p to share a common index set I, which is effectively a discrete parameterization. However, in most applications, paths are specified as sequences of vertices, without a corresponding index set. Thus, one of the contributions of this work is a strategy for forming these common index sets as part of the construction of bands for paths. A common index set for a collection of paths establishes a correspondence between vertices such that for each vertex on each path there is a (nonempty) set of corresponding vertices on every other path. Because the paths may be of different lengths, the correspondences are not unique. However, we propose that the mapping from the index set to a path should be monotonic with respect to the sequence of the vertices on the path (the order of the vertices in paths is respected), and thus the correspondences are monotonic between every pair of paths.

The correspondence between a collection of paths is computed using an optimal matching strategy, similar to what is used for string matching in computer science and sequence alignment in biological protein analysis [79]. The intuition behind this method is to assign correspondences such that they are monotonic and the overall sum of geodesic lengths between corresponding vertices along the paths is minimized. We first describe the method for finding correspondences between two paths. Given two paths p_l and p_m, an optimal correspondence is established by a pair of monotonic mappings from a common index set I to the paths, such that the distances between vertices are minimized. Thus, we are trying to find two mappings that minimize

∑_{k ∈ I} d_g(p_l(k), p_m(k)),   (4.9)

where p_l(k) is the vertex on path p_l that is mapped from the index k ∈ I, and d_g(·, ·) denotes the geodesic distance between two vertices. This formulation generalizes to collections of more than two paths by minimizing the sums of all pairwise distances among corresponding vertices in the collection of paths.

To find the correspondences among a set of paths, we use the classical method of dynamic programming (DP) on the matrix/tensor consisting of all possible correspondences, e.g., the Needleman-Wunsch algorithm [20]. All pairwise distances are organized in a tensor whose order is the number of paths to be aligned. Thus, the number of distances considered in the optimization is ∏_{l=1}^{j+1} |p_l|, which grows exponentially with the number of paths forming the band (in general, the problem is NP-hard [51]). Efficient, approximate algorithms exist for large numbers of paths [20], but that issue is beyond the scope of this chapter. For the results presented here, we use j ≤ 3 and rely on the basic (full enumeration of the tensor) approach for optimization.
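For two paths, the optimization of Equation 4.9 reduces to a dynamic program over a |p_l| × |p_m| cost matrix, in the spirit of Needleman-Wunsch alignment (or dynamic time warping). The sketch below is our own illustration, with `dg` standing in for a precomputed geodesic distance function:

```python
def align_paths(path_a, path_b, dg):
    """Monotonic correspondence between two vertex sequences minimizing the sum of
    geodesic distances between matched vertices (cf. Eq. 4.9), via dynamic
    programming. `dg(u, v)` returns the geodesic distance between two vertices."""
    na, nb = len(path_a), len(path_b)
    INF = float("inf")
    cost = [[INF] * nb for _ in range(na)]
    for i in range(na):
        for k in range(nb):
            c = dg(path_a[i], path_b[k])
            if i == 0 and k == 0:
                cost[i][k] = c
            else:
                prev = min(cost[i - 1][k] if i else INF,           # advance on path_a
                           cost[i][k - 1] if k else INF,           # advance on path_b
                           cost[i - 1][k - 1] if i and k else INF) # advance on both
                cost[i][k] = c + prev
    # Backtrack to recover the matched index pairs (the common index set).
    pairs, i, k = [], na - 1, nb - 1
    while True:
        pairs.append((i, k))
        if i == 0 and k == 0:
            break
        cands = []
        if i and k:
            cands.append((cost[i - 1][k - 1], i - 1, k - 1))
        if i:
            cands.append((cost[i - 1][k], i - 1, k))
        if k:
            cands.append((cost[i][k - 1], i, k - 1))
        _, i, k = min(cands)  # ties broken by index order; acceptable for a sketch
    return cost[na - 1][nb - 1], list(reversed(pairs))
```

Note that a shorter path can receive several indices mapped to one vertex, exactly the situation depicted for path p_c in Figure 4.2.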

In Figure 4.2 we see a band formed by three paths, p_a, p_b, and p_c. Here, the elements of the common index set I = [1, 2, 3] are mapped to vertices on the graph for each path.

Path p_x is completely contained within the band, as each of its vertices lies in a j-simplex formed by corresponding vertices that are mapped from the same element of I to p_a, p_b, and p_c. Similarly, we observe that no vertex of p_y is contained in any j-simplex. Also, two elements of I are mapped to a single vertex in p_c, as it is shorter than the other paths. Once we are able to describe a band formed by a set of paths, we can generate order statistics on an ensemble of paths by calculating the path band depth of each member within the ensemble. An ordering of the data based on path band depth readily yields a set of rank statistics.

Figure 4.2. Band formed by three dashed paths on a complete graph whose edge weights are equal to the Euclidean distance between vertices (only selected edges are drawn). The green path is completely contained within the band according to the definition in Equation 4.7, whereas the red path falls completely outside the band. Solid blue edges constitute the geodesics connecting vertices within graph simplices.

The median is the path with the highest probability of falling within a random band (i.e., the deepest ensemble member). The 50% band consists of the paths whose depth values lie in the upper half of all depth values. The 100% envelope is formed by excluding the outliers. We define outliers (as in [103]) as the paths for which pBD(p) < pBD(p_median) − α (pBD(p_median) − pBD(p_50%)), where pBD(p_50%) is the band depth value that splits the ensemble into two equal parts, and α = 1.5 is a typical value found in the literature [103]. For the results shown in this chapter, we used values of α in the range 2.4 to 3.7 in order to flag only the most nonrepresentative paths as outliers. Furthermore, we used the modified formulation of band depth [66] in order to resolve depth with sufficient accuracy to avoid ties.

By convention, data-depth formulations in flat spaces (e.g., simplex depth in R^n) are considered desirable if they demonstrate a set of properties that are consistent with classical methods on certain classes of distributions. For instance, [124] have proposed affine invariance, maximal depth around a point of symmetry, monotonic falloff with distance from a central point, and zero depth for points at infinity. Although some of these properties have yet to be developed for general graph structures, in the appendix we prove the asymptotic depth property for points at infinity for vertices and paths.

4.4 Path Boxplot Visualization

Here we develop a visualization for the proposed analysis in a manner similar to what has been proposed for functions, contours, and curves [40, 73, 103, 115]. The proposed visualization approach is motivated by the classical whisker plot or boxplot and displays the median, 50% band, 100% band, and outliers for graph-based path ensembles. Figure 4.3a shows a synthetically generated path ensemble with each path drawn using a random color. Figures 4.3c and 4.3d show two variations of our proposed visualization, described next.
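Before turning to rendering, note that the components the boxplot displays follow directly from the depth values. The sketch below is our own illustration (the names are ours) of the outlier rule and rank-statistics partition described in Section 4.3:

```python
def boxplot_components(depths, alpha=1.5):
    """Partition ensemble indices into median, 50% band, 100% band, and outliers
    from their path-band-depth values. `alpha` = 1.5 is the typical literature
    value; the chapter's results use larger values (2.4 to 3.7)."""
    order = sorted(range(len(depths)), key=lambda i: depths[i], reverse=True)
    d_median = depths[order[0]]
    d_50 = depths[order[len(order) // 2]]      # depth splitting the ensemble in half
    threshold = d_median - alpha * (d_median - d_50)
    outliers = [i for i in order if depths[i] < threshold]
    non_out = [i for i in order if depths[i] >= threshold]
    return {"median": order[0],
            "band50": non_out[:len(order) // 2],   # the deepest half
            "band100": non_out,                    # everything except outliers
            "outliers": outliers}
```

Each returned group maps directly onto one of the visual encodings described below (yellow median stroke, dark blue 50% band, light blue 100% band, red outliers).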
We render the visualizations in a way that conveys the rank statistics of the distribution or ensemble. We first establish the placement of vertices and edges, either intrinsically or via a layout algorithm [41]. Next, we use color and width/thickness on edges and vertices to represent their rank. The paths in the 100% band are drawn thickest, in light blue.

Figure 4.3. Synthetic example 1. (a) A path ensemble with each path rendered with a random color. (b) Path boxplot using rank statistics based on the sum of Fréchet distances. (c) Path boxplot based on path band depth (visualization without vertex encoding). (d) Path boxplot based on path band depth (visualization with vertex encoding).

The paths in the 50% band are drawn using a thinner dark blue stroke on top of the thicker light blue band. Drawing the thinner dark blue stroke over the thicker light blue stroke indicates that a path in the 50% band is contained within the 100% band as well. Continuing this strategy, the median path is drawn using a thin yellow stroke drawn over a thicker dark blue stroke, which in turn is drawn over the thickest light blue stroke. To signify that the outlier paths lie outside even the 100% envelope, they are drawn using only a thin red stroke. Figure 4.3c shows a version of the path boxplot that uses the described encoding for paths. A variation of this approach is seen in Figure 4.3d, where the vertices are also encoded, based on their position, in addition to the edges in the graph. Vertices that are not part of the convex hull formed by any set of corresponding vertices between paths are drawn as small gray circles. Vertices that are in the convex hulls formed by paths in the 100% band are drawn using a light blue circle. Analogous to the encoding for the paths, vertices in the convex hulls formed by paths in the 50% band are drawn using a deep blue circle contained within a larger light blue circle, whereas vertices lying on the median path are marked with an additional yellow circle drawn within the deep blue circle, which is itself contained within a light blue circle. The sections that follow demonstrate applications of the proposed method on synthetic examples and on data sets from applications in transportation and computer networks.
We use the visualization approach with vertex encoding (as seen in Figures 4.3d and 4.4c) for all further path boxplot visualizations based on path band depth in this chapter, except when the vertices of the graph are not rendered.

4.5 Results

We begin by showing results for two synthetically generated path ensembles on a graph. For these ensembles, we show path boxplot visualizations generated using rank statistics obtained from the path band depth analysis, as well as from the Fréchet distance metric analysis [32]. For these path ensembles, the results were identical upon replacing the Fréchet metric with the Hausdorff metric, and therefore we show only one of these methods. When using a distance metric, we rank each path by the sum of its distances from all the other paths in the ensemble. Hence, the path that minimizes this sum is identified as the median. Note that this differs from path band depth, where the median path has maximum depth.

Figure 4.4. Synthetic example 2. (a) A path ensemble with each path rendered with a random color. (b) Path boxplot using order statistics based on the sum of Fréchet distances. (c) Path boxplot based on path band depth.

The underlying graph in both our examples is a regular, diagonal grid (constructed by including diagonals in a conventional, structured quadrilateral grid). For the first of these examples (see Figure 4.3), we generate an ensemble of 20 paths by sampling with replacement from a set of straight paths (all vertices in the path have the same ordinate) spanning the horizontal extent of the grid. The ordinate of each path comes from a random variable with a normal distribution centered at the central ordinate of the grid. We complete the ensemble by adding a simulated outlier in the form of a zigzag path (see Figure 4.3a). In Figure 4.3b we see the path boxplot visualization based on the Fréchet distance. Figures 4.3c and 4.3d show two versions of the path boxplot from the path band depth analysis. In this simple example, the result of the path band depth analysis is very similar to that of the distance-metric-based analysis, with both approaches identifying the zigzag and peripheral paths as outliers.

We now present an example where distance-metric-based methods fail to detect the general structure (median) and the anomalous path (outlier) in an ensemble, whereas path band depth analysis makes this determination correctly by capturing the nonlocal correlations in the path ensemble. Here, we produce an ensemble of 20 straight paths spanning the grid's horizontal extent, starting and ending at vertices with the same ordinate (see Figure 4.4). In this case, however, each path is required to undergo flips when traversing the flip regions, as seen in Figure 4.4a. The vertex within each zone where the flip occurs is chosen uniformly from among the vertices in the zone. We add a simulated outlier to this ensemble in the form of a path with no flips (Figure 4.4a). In this case, the distance-based metrics (Figure 4.4b) identify the simulated outlier as the median (most representative path), whereas the path band depth method (Figure 4.4c) selects one of the randomly sampled paths as the median. The simulated outlier is closest to the other paths with regard to the distance metrics, but it is identified as an outlier by the path band depth analysis.

4.5.1 Transportation Networks

We used publicly available road data from OpenStreetMap (OSM) [43] for a randomly chosen region in Los Angeles, California. Figure 4.5a shows a part of the road graph overlaid on a map. We used the expected travel time between two adjacent vertices, obtained by querying the open-source routing engine Gosmore, as the weight of each edge. Travel time along a short road segment can be modeled using a normal distribution [45]. We obtained an ensemble of 20 paths between two random vertices by repeatedly finding the lowest cost path on the graph whose edge weights were picked, after each iteration, from a normal distribution centered at the expected travel time for that edge.
For visualizing the paths, we used the geographical coordinates of the vertices on the road graph for the layout. A map, also based on OSM data, is provided in the background for context, in accordance with the common practice for viewing geographical routes (see Figures 4.5a and 4.5b). In order for the overlaid paths to align with the underlying roads on the map and also be feasible with regard to traffic restrictions, we used the Gosmore routing service to obtain the geographic coordinates of the spatial path drawn between every pair of adjacent vertices along each path in our ensemble. Such alignment is necessary for drawing road segments that are curved or where the direct connection between two vertices is illegal according to local traffic rules.

Figure 4.5. Road network: (a) A section of the road graph overlaid on a map representing the actual spatial embedding of vertices and edges. (b) Path boxplot for an ensemble of paths on a road network.

A path boxplot of a path ensemble on a road graph is shown in Figure 4.5b. The most representative, or median, path seen here can be useful when the requirement is to select a particular path from a collection of paths on a road graph. For instance, a median path would be a good choice of a path that affords quick access to a number of alternate paths, which would be useful in situations involving high traffic or blockages. The path boxplot would also find utility in planning bicycle corridors [33, 34].

4.5.2 Computer Networks (Autonomous Systems) We used a subset of the AS graph as well as path ensembles of packets traveling between ASes on that graph from a set of path snapshots seen from the Oregon Routeviews 56 server. For clarity, we filtered out vertices in the graph that did not lie on a geodesic between any pair of vertices in the path ensemble. Additionally, in the visualization we included only a single geodesic between all pairs of vertices in the ensemble. For the graph layout in 2D, we modified the force directed model in [38] by including an extra repulsion between the vertices at the two endpoints, so that they were placed at nearly opposite ends of the layout. Also, the charge/repulsion on each vertex was made proportional to its degree for avoiding congestion near high-degree vertices. We looked at several destinations that had significant variations in their paths through- out the year. Visualizations of a few of these ensembles can be seen in Figure 4.6, Figure 4.7, and Figure 4.8. Looking at a selection of these ensembles, there are some special cases identified as outliers. Figure 4.6 shows an outlier where the outlying path is of the same cardinality as the median path and does not contain any unique vertices or edges causing it to be undetectable by common heuristic methods used to analyze network traffic. Other cases include Figure 4.7 and Figure 4.8 , where the outlier path bypasses other paths

Figure 4.6. Outlier paths on AS graph: A class 1 outlier with no unique vertices/edges in the outlier path.


Figure 4.7. Outlier paths on AS graph: (a) Class 2 outlier: One unique edge, no unique vertices in the outlier path. (b) Class 3 outlier: Outlier appears to be one hop, bypassing a more normal route.


Figure 4.8. Outlier paths on AS graph: (a) Class 4 outlier: Outlier is two hops around a more normal route; and (b) Class 5 outlier: Outlier takes several hops around the usual path.

through unique edges or vertices. Cases where they depart near one of the endpoints, such as Figure 4.8b, may be relatively straightforward for the operators of those edge networks to detect, as their own routers will directly see the change in where traffic enters or exits their networks. Cases such as Figure 4.7b, however, exhibit changes in networks that may not be directly visible from the endpoints, and yet affect the overall behavior of traffic to/from these endpoints. These are cases that can be particularly difficult to discover and diagnose; a path boxplot can aid operators in assessing such cases.

4.6 Conclusion and Future Work

Assigning a centrality-based ordering to an ensemble of paths is useful in many applications. Although robust band-depth-based methods for calculating order statistics have recently been introduced for various kinds of ensembles on continuous domains, they cannot be employed in cases where the ensemble members are described on a graph. We identify the challenges in extending this approach to paths on a graph and present a solution in the form of a novel notion of depth, called path band depth. A visualization scheme based on this new notion of depth, called the path boxplot, is also introduced. This chapter demonstrates the utility of the path boxplot for helping users understand the overall structure of an ensemble, using synthetic data as well as data from two real application areas: path ensembles on autonomous system (AS) graphs and on road graphs.

Although it provides a robust method for generating order statistics for path ensembles, the proposed analysis is computationally intensive due to its combinatorial nature. The topology of the underlying graph as well as the density of its edges also affects the computation time by a constant factor. A practical approach for dealing with larger ensembles (with a large number of paths) is to trade accuracy for running time by randomly selecting a subset from the set of all possible bands, as suggested in [66]. In the case of ensembles with long paths, skipping vertices in the description of the paths may also provide an acceptable compromise between accuracy and performance. Developing a heuristic for skipping vertices in large ensembles of long paths to achieve an optimal trade-off between the running time of the analysis and the quality of the solution would be an interesting avenue for future work.
It would also be interesting to explore the application of the path boxplot in other areas, such as mobile ad hoc networks, which can be modeled as a graph with dynamic topology [75], and molecular dynamics, to identify a most representative path as an alternative to computing mean statistics for the ensemble.

CHAPTER 5

ANISOTROPIC RADIAL LAYOUT FOR VISUALIZING CENTRALITY AND STRUCTURE IN GRAPHS

Portions of this chapter have been reproduced with permission from Springer and are based on material published in Proc. GDNV, Anisotropic Radial Layout for Visualizing Centrality and Structure in Graphs, M. Raj and R. T. Whitaker, 2018, pp. 351-364 [91].

5.1 Introduction

Graphs are an important data structure used to represent relationships between entities in a wide range of domains. An interesting aspect of graph analysis is the notion of (structural) centrality, which pertains to quantifying the importance of entities (or vertices, nodes) within the context of the graph structure as defined by its relationships (or edges). The need to compute centrality and convey it through visualization arises in many areas, for example, in biology [96], transportation [16], and the social sciences [15]. In this work, we propose a method to visualize node centrality information in the context of the overall graph structure, which we capture through intervertex (graph-theoretical) distances. The proposed method determines a layout (positions of nodes in a 2D drawing) that meets the following two, often competing, criteria:

• Preservation of distances: The Euclidean (geometrical) distances in the layout should approximate, to the extent possible, the graph theoretical distances between the re- spective nodes.

• Anisotropic radial monotonicity: Along any ray traveling away from the position of the most central node, nodes with a lower centrality should be placed geometrically further along the ray.

We also introduce a visualization strategy for the proposed layout that further highlights the centrality and structure in the graph by using additional encoding channels, and demonstrate the benefits of our approach with real data sets (see Figure 5.1 for an example).

Visualization methods for gaining insights from graph-structured data are an important and active area of research. Significant efforts in this area are targeted toward developing effective layouts. Layout methods can have various goals, ranging from reducing clutter and edge crossings [17] to faithfully representing structure by preserving the distances between nodes and topological features [39]. As positions are the best way to graphically convey numbers [22], layouts are also used to convey numerically encoded measures of hierarchy or the importance associated with nodes [15, 28].

Radial layouts have been shown to be an effective method to visually convey the relative importance of nodes, where importance may be defined, for instance, by a node’s centrality [15]. The centrality of a node is a quantification of its importance in a graph based on its various structural properties, such as connectedness, closeness to others,


Figure 5.1. Visualization of Zachary’s karate club social network using (a) MDS, (b) radial layout, and (c) anisotropic radial layout. Node sizes encode betweenness centrality.

and role as an intermediary [37, 125]. In conventional radial layouts, the distance of a node from the geometric center (origin) of the layout depends only on the node’s centrality, and nodes with higher centrality values are placed closer to the origin, oftentimes forming rings or concentric circles.

Given a graph and centrality values associated with its nodes, several approaches have been proposed to determine a radial layout. One line of work, which deals with discrete centrality values, attempts to minimize edge crossings [9]. Another approach, which also handles continuous centrality values, involves optimizing a stress energy (Section 5.2.2) by including a penalty for representation error (of graph distances) as well as deviation from radial constraints [15, 16]. The penalty acts as a soft constraint wherein the solution is allowed to deviate from the constraint at the expense of increased local stress. The literature shows that radial constraints may also be imposed as hard constraints by allowing only those solutions that satisfy them [10, 27, 29].

Although state-of-the-art methods for radial graph layout do effectively convey node centrality, the associated circular centrality constraints make it difficult to preserve other important structural graph characteristics such as distances, which, in turn, makes it difficult to preserve the holistic structure of the graph. On the other hand, despite being effective at preserving overall structure, general layout methods such as multidimensional scaling often fail to readily convey centrality (e.g., by failing to ensure that structurally central nodes in the graph-theoretical sense appear near the center of the layout, and vice versa).
In this chapter, we propose a method that simultaneously tackles both of the above issues. The underlying idea of the proposed layout algorithm is that we can relax the constraint that requires nodes with similar centrality to lie on a circle, and instead allow such nodes to be constrained by a more general shape: a simple closed curve, or centrality contour. Centrality contours are nested isolevel curves on a smooth, radially decreasing estimate of node centrality values over a 2D field. We demonstrate that the additional flexibility in placing nodes afforded by centrality contours over circles, in conjunction with some additional visual cues in the background, lets us achieve a better trade-off than existing methods in conveying centrality and general structure together.

5.2 Background

In this section, we describe the various underlying technicalities that are relevant to the proposed method, beginning with some notation and definitions. We define a weighted, undirected graph G(V, E, W) as a set of vertices (or nodes) V, a set of edges E ⊆ V × V, and a set of edge weights, W : E → R+, assigned to the edges. We define n to be the cardinality of the node set, i.e., n = |V|. The graph-theoretical distance

(shortest path along edges) between two nodes u and v is denoted by d_uv. We denote a general position in a 2D layout as x = (x, y) and the Euclidean distance between the positions of two nodes u and v as δ(x_u, x_v) = ||x_u − x_v||_2.
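As an illustration, the graph-theoretical distances d_uv can be computed with Dijkstra's algorithm. The sketch below uses only the standard library; the function name and the adjacency-map representation (node → list of (neighbor, weight) pairs, weights positive as in W : E → R+) are our own illustrative choices, not from the dissertation.

```python
import heapq

def shortest_path_distances(adj, source):
    """Dijkstra's algorithm: graph-theoretical distances d_uv from `source`.
    `adj` maps each node to a list of (neighbor, weight) pairs, weight > 0."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Running this from every node yields the full matrix of distances d_uv used by the stress objective below.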

5.2.1 Centrality and Depth

The need to measure and quantify the importance of individual entities within the context of a group occurs in many domains. In graph analytics, this need is addressed by centrality indices, which are typically real-valued functions over the nodes of a graph [125]. The specific properties that qualify the importance of nodes may depend on the application or data type, and several methods to compute centrality have been proposed, such as degree centrality [37], closeness centrality [94], and betweenness centrality [37]. Although the emphasis of the various centrality definitions can differ, they share a common characteristic of depending only on the structure of the graph rather than on parameters associated with the nodes [125].

For the examples in this chapter, we use betweenness centrality due to its relevance to the data sets (Section 5.4). The betweenness centrality of a node, v ∈ G, is defined as the percentage (or number) of shortest paths in the entire graph G that pass through the node v. As shown in the work of Raj et al. [90], barring instances of multiple geodesics, betweenness centrality is a special case of a more general notion of vertex depth on graphs—a generalization of data depth to vertices on graphs. Data depth is a family of methods from descriptive statistics that attempts to quantify the idea of centrality for ensemble data without any assumption about the underlying distribution. Data-depth methods often rely on the formation of bands from convex sets and the probability of a point lying within a randomly chosen band. The extension of band depth to graphs [90] relies on the convex closure of a set of points (via shortest paths), and thereby generalizes betweenness centrality by considering bands formed by sets of nodes, rather than only the shortest paths between pairs of nodes, and allows for a nonuniform probability distribution over the nodes of the graph.
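To make the betweenness definition concrete, the following sketch counts shortest paths with BFS and accumulates, for every node v, the number of shortest s-t paths through v. This is a brute-force illustration for small unweighted graphs (the helper names are our own); production code would use an optimized routine such as Brandes' algorithm.

```python
from collections import deque

def bfs_path_counts(adj, s):
    """BFS from s: hop distances and counts sigma of shortest paths to each node."""
    dist, sigma = {s: 0}, {s: 1}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                q.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """Betweenness centrality: for each v, the number of shortest s-t paths
    through v (fractional when multiple geodesics exist), over unordered pairs."""
    info = {s: bfs_path_counts(adj, s) for s in adj}
    bc = {v: 0.0 for v in adj}
    for s in adj:
        ds, sig_s = info[s]
        for t in adj:
            if t == s or t not in ds:
                continue
            dt, sig_t = info[t]
            for v in adj:
                if v in (s, t) or v not in ds or v not in dt:
                    continue
                if ds[v] + dt[v] == ds[t]:  # v lies on some shortest s-t path
                    bc[v] += sig_s[v] * sig_t[v] / sig_s[t]
    return {v: b / 2.0 for v, b in bc.items()}  # each unordered pair counted twice
```

On the path graph a-b-c-d, for example, the two interior nodes each lie on two of the shortest paths between node pairs.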
In addition to graphs, data-depth methods have been proposed for several other data types, such as points in Euclidean space [107], functions [66], and curves [67, 73]. Despite their distinct formulations, data-depth methods are expected to share a few common desirable properties [124], such as: 1) a maximum at the geometric center; 2) a value of zero at infinity; and 3) radial monotonicity. These properties make data depth an attractive basis for ensemble visualization methods [73, 93, 102]. Graph centrality is a type of data depth on the nodes of a graph, and here we pursue layout methods that convey these depth properties.

5.2.2 Stress and Multidimensional Scaling (MDS)

Our proposed method is based on a modification to the MDS objective function, so we give a brief summary of MDS here. MDS is a family of methods that help visualize the similarity (or dissimilarity) between members of a data set [13]. Over the years, MDS has been the foundation for a range of graph drawing algorithms that aim to achieve an isometry between graph-theoretical and Euclidean distances between nodes [16, 52]. Of the various types of MDS methods, here we consider metric MDS with distance scaling, which is popular in the graph drawing literature [39] (see Figure 5.2 for an example). In the context of graph drawing, given a distance matrix based on the graph-theoretical


Figure 5.2. Interpolation and monotonic fields for a sample graph: (a) interpolation field for node centrality values and (b) the associated (radially) monotonic field for a 30-node random graph generated using the Barabasi-Albert model. Node positions are determined using MDS, and node sizes encode betweenness centrality.

distance, the goal is to find node positions X = {x_i : 1 ≤ i ≤ n} that minimize the following sum of squared residuals—also known as stress:

$$\sigma(X) = \sum_{u,v} w_{uv} \big( d_{uv} - \lVert x_u - x_v \rVert_2 \big)^2, \tag{5.1}$$

where w_uv ≥ 0 is the weighting term for the residual associated with the pair u, v. In the proposed work, we employ a standard weighting scheme for graphs, known as elastic scaling [70], by setting w_uv = d_uv^{-2}. Elastic scaling gives preference to local distances by minimizing relative rather than absolute error during the optimization. Node positions that minimize the objective (5.1) have been shown to be visually pleasing and to convey the general structure of the graph [52]. Although the state-of-the-art approach for optimizing the objective function is stress majorization [39], we employ standard gradient descent because of its compatibility with the proposed modification to the objective (Section 5.3). The gradient of the standard MDS objective is as follows [13]:

$$\nabla\sigma(X) = 2\big(VX - B(X)X\big), \tag{5.2}$$

where the matrices V = (v_{ij}) and B = (b_{ij}), with 1 ≤ i, j ≤ n, can be compactly represented as

$$v_{ij} = \begin{cases} -w_{ij} & \text{for } i \neq j \\ \sum_{k=1, k \neq i}^{n} w_{ik} & \text{for } i = j \end{cases} \qquad b_{ij} = \begin{cases} -\dfrac{w_{ij} d_{ij}}{\delta(x_i, x_j)} & \text{for } i \neq j \text{ and } \delta(x_i, x_j) \neq 0 \\ 0 & \text{for } i \neq j \text{ and } \delta(x_i, x_j) = 0 \end{cases}$$

$$b_{ii} = -\sum_{j=1, j \neq i}^{n} b_{ij}.$$

5.2.3 Strictly Monotone and Smooth Regression

The proposed method also relies on the construction of a smooth and radially decreasing approximation of centrality values over a 2D field, which we call the monotonic field (Figure 5.2). The first part of this construction is an interpolation of the centrality values of sparsely located nodes on the layout to obtain a dense 2D field, which we call the interpolation field (Figure 5.2a). We use thin plate spline [12] interpolation, a standard technique for interpolating unstructured data that produces optimally smooth fields. The next part is to construct a radially monotonic approximation of the interpolation field. We devote the rest of this section to a brief description of the method that we use for constructing this approximation (the monotonic field), which is adapted from Dette et al. [25, 26].

For a 1D function [25], m(t) : [0, 1] → R, an elegant algorithm for computing its monotonic approximation m̂_A(t) proceeds in the following two steps [25]:

• Step 1 (Monotonization): Construct a density estimate from sampled values of the input function m and use it to compute an estimate of the inverse of the regression function, m̂_A^{-1}:

$$\hat{m}_A^{-1}(t) = \frac{1}{Q\omega} \sum_{i=1}^{Q} \int_{-\infty}^{t} K\!\left(\frac{m(i/Q) - u}{\omega}\right) du, \tag{5.3}$$

where Q is the parameter controlling the sampling density, K is a continuously differentiable and symmetric kernel, and ω is the bandwidth. Here, m̂_A^{-1} is a strictly increasing estimate of m^{-1}; however, we can easily obtain a strictly decreasing estimate by reversing the limits on the integral in (5.3).

• Step 2 (Inversion): Obtain the final estimate m̂_A by numerically inverting m̂_A^{-1}.

In order to obtain an approximation to a 2D function that is monotonic along radial lines emanating from the deepest or most central node, we use a polar-coordinate representation of the field. We build the polar representation by sampling the interpolation field along 360 evenly spaced, center-outward rays. The idea is to repeatedly monotonize the interpolation field with respect to a single variable, i.e., for a fixed value of the angular coordinate, obtain a (1D) estimate that is strictly decreasing along the radial coordinate. We then repeat this process, successively monotonizing the 1D functions that correspond to each value of the angular coordinate in its (discrete) domain; see Figure 5.2b for an example of the resulting monotonic field. The spline interpolation is smooth, and by the properties of the monotonic approximation (see [26]), the resulting monotonic field is smooth (except at the origin, where the polar coordinates may be nonsmooth).
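A minimal 1D version of the two-step procedure can be sketched as follows, for the strictly increasing case (the decreasing case used along each ray follows by negating the function). We assume a Gaussian kernel, for which the integral in (5.3) has the closed form Φ((t − m(i/Q))/ω) with Φ the standard normal CDF; the function name and parameter defaults are illustrative, not values from the dissertation.

```python
import numpy as np
from math import erf, sqrt

def monotone_increasing_estimate(m, Q=200, omega=0.05, grid_size=400):
    """Two-step strictly monotone approximation of m: [0, 1] -> R.
    Step 1: estimate the inverse regression function,
        m_inv(t) = (1/Q) * sum_i Phi((t - m(i/Q)) / omega),
    which is strictly increasing in t (the kernel integral in closed form).
    Step 2: numerically invert m_inv on a grid."""
    samples = np.array([m(i / Q) for i in range(1, Q + 1)])
    t_grid = np.linspace(samples.min() - 3 * omega,
                         samples.max() + 3 * omega, grid_size)
    phi = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))  # Gaussian CDF
    m_inv = np.array([phi((t - samples) / omega).mean() for t in t_grid])
    # Step 2 (inversion): for s in [0, 1], find t with m_inv(t) = s.
    return lambda s: float(np.interp(s, m_inv, t_grid))

# Example: a locally non-monotone input is smoothed into a monotone estimate.
wiggly = lambda x: x + 0.08 * np.sin(20 * x)
m_hat = monotone_increasing_estimate(wiggly)
```

The returned estimate is strictly increasing by construction, regardless of local dips in the input function.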

5.3 Method

Here we describe our method in two parts. First is the layout algorithm (Section 5.3.1), and second is a visualization strategy (Section 5.3.2) that complements the layout to simultaneously convey graph structure and node centrality.

5.3.1 Anisotropic Radial Layout

In addition to preserving the graph-theoretical distances, we also aim to place every node on a radially monotonic approximation of a centrality field—called the monotonic field (Section 5.2.3)—such that the value of the field at the location of the node equals the centrality value of the node. We accomplish this by modifying the (distance-preserving) MDS objective or stress (Section 5.2.2) to incorporate the following penalty term, which penalizes the deviation of the monotonic field values from the node centrality values:

$$\rho(X) = \big\lVert M_{X,c}(X) - c \big\rVert^2, \tag{5.4}$$

where c ∈ R^n is a vector of node centrality values and X ∈ R^{n×2} = {x_i : 1 ≤ i ≤ n} denotes the associated node positions. M_{X,c}(X) ∈ R^n denotes the vector of values of the 2D monotonic field at the locations X. The symbols in the subscript (X and c) denote the use of the node positions and centrality values in the construction of the monotonic field. In the limiting case where the interpolation field (Section 5.2.3) is itself monotonic, the value of this penalty term drops to zero. Our final objective is a sum of the MDS stress and the above penalty term, and can be stated as follows:

$$\gamma(X) = \underbrace{\sigma(X)}_{\text{MDS stress}} + w_\rho\, \rho(X), \tag{5.5}$$

where w_ρ is a weighting factor that controls the influence of the penalty with respect to the MDS stress. The gradient of the modified objective is obtained as

$$\nabla\gamma(X) = \nabla\sigma(X) + \underbrace{w_\rho \cdot 2\big(M_{X,c}(X) - c\big) \odot \nabla M_{X,c}(X)}_{\nabla\rho(X)}, \tag{5.6}$$

where ⊙ denotes the element-wise product. It is difficult to compute the gradient of M_{X,c}(X) because of the dependence of M on X and the associated process for monotonic approximation. Therefore, we let the field lag, and treat X (in the subscript) as a constant when numerically approximating the gradient of M. We deal with the resulting accumulation of error by recomputing the depth field after a fixed number of iterations, or lag, denoted by ℓ.

The parameters w_ρ and ℓ need to be chosen carefully. w_ρ needs to be set to balance preserving the intrinsic graph structure against ensuring that the centrality of each node matches the field value at its position. Figures 5.3a-c show, respectively, the results of a


Figure 5.3. Sensitivity of anisotropic radial layout to penalty weights for the graph in Figure 5.2: (a) w_ρ = 0.1, (b) w_ρ = 1, (c) w_ρ = 10; centrality contours with isovalues 0.1, 0.2, and 0.3, as well as nodes X (red) and Y (green) with centrality values 0.2 and 0.1, are identified; and (d) a typical plot of the objective energy during the optimization process (w_ρ = 1).

small w_ρ that is unable to move nodes to appropriate positions with regard to the field (observe nodes X and Y), an intermediate w_ρ, and a large w_ρ that results in unnecessary structural distortion with regard to the initial positions (observe node Y). The parameter ℓ controls the lag of the monotonic field; if ℓ is too small, the frequent updates can lead to instabilities, whereas values that are too large can cause slow convergence. A typical energy profile during optimization is shown in Figure 5.3d, where the sharp changes in the total energy correspond to updates of the monotonic field. We encourage the layout to be as similar as possible to

Algorithm 1: Layout with anisotropic radial constraints

Input: Graph G = {V, E, W}, maximum number of iterations k ∈ N, depth-field lag ℓ, step size α, weighting factor w_ρ
Output: Positions X = {x_i : 1 ≤ i ≤ n} for all v_i ∈ V

n ← |V|
X_0 ← initialize node positions using MDS ; /* (Section 5.2.2) */
c ∈ R^n ← compute graph centrality values for v_i ∈ V
j ← −1 ; /* index to keep track of field updates */
for t = 1, …, k do
    if t mod ℓ = 0 then
        j ← j + 1
        X_j ← X_t
        M_{X_j,c} ← compute monotonic field ; /* (Section 5.2.3) */
    end
    X_{t+1} ← X_t − α [ ∇σ(X_t) + w_ρ · 2 (M_{X_j,c}(X_t) − c) ⊙ ∇M_{X_j,c}(X_t) ] ; /* gradient update step (Section 5.3.1) */
end

the MDS layout by initializing the node positions as determined by the unmodified MDS objective [39]. The entire process, as summarized in Algorithm 1, iterates until updates no longer result in significant changes to node positions. The computational complexity of a single iteration is O(n³) due to the step of computing the monotonic field, which involves interpolation using thin plate splines. However, we update the field only once every ℓ iterations, which leads to a complexity of O(n²) (the same as MDS) for the large majority of iterations.
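The gradient update step of Algorithm 1 can be sketched as follows. Here `field` is a stand-in for the lagged monotonic field (any callable from a 2D position to a scalar, held fixed between field updates), and ∇M is approximated by central differences, since the field is not differentiated analytically; the function name and defaults are illustrative, not from the dissertation.

```python
import numpy as np

def penalized_step(X, grad_stress, field, c, w_rho, alpha, eps=1e-4):
    """One update X <- X - alpha * [grad sigma + w_rho * grad rho], where
    grad rho = 2 (M(X) - c) (element-wise) grad M(X), per Eqs. (5.4)-(5.6).
    `field` maps a 2D position to a field value and lags behind X."""
    M = np.array([field(x) for x in X])              # field sampled at each node
    gradM = np.zeros_like(X)
    for k in range(X.shape[1]):                      # central differences per axis
        e = np.zeros(X.shape[1])
        e[k] = eps
        gradM[:, k] = [(field(x + e) - field(x - e)) / (2 * eps) for x in X]
    grad_pen = 2.0 * (M - c)[:, None] * gradM        # element-wise product
    return X - alpha * (grad_stress + w_rho * grad_pen)
```

For a radially increasing test field f(x) = ||x||² with target centralities of zero and no stress gradient, a step moves every node toward the origin, as expected from the penalty alone.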

5.3.2 Visualization

In this layout, nodes are constrained to lie on level sets of centrality, which are general closed curves rather than circles, and the shapes of these curves depend on the structure of the graph. Therefore, we can improve the interpretability of the layout and reduce the cognitive load for the user by providing additional cues for the shapes of these curves. We provide cues in the form of faded renderings of the centrality contours (isolines on the monotonic field) and a monotonic-field colormap in the background. The radial monotonicity described in Section 5.3.1 ensures that the contours are nested curves that enclose a common maximum (at the origin), leading to a bijective mapping between contours and centrality values, and pushing nodes to lie on the unique contour that corresponds to their centrality. In this chapter, we normalize node centrality to fall between 0 and 1, and show 10 contour curves that evenly span this range. We also use node size as an extra encoding channel for centrality—in addition to location—to further highlight the order structure. We can, of course, use the size channel to encode centrality even with the standard MDS layout; however, that approach can lead to conflicting centrality cues from the size and location channels (see Figure 5.1a).

5.4 Results

In this section, we demonstrate anisotropic radial layout using three real-world data sets.

5.4.1 Zachary’s Karate Club

Zachary’s karate club graph is a well-known data set: a social network of friendships in a karate club at a US university, as recorded during a study [122]. This graph contains 34 nodes, each representing an individual, and 78 unweighted edges, each representing a friendship between the associated individuals (Figure 5.1). During the period of observation, a conflict between two key members, identified as the “administrator” and “instructor,” led to a split in the club, giving it an interesting two-cluster structure. In Figure 5.1, nodes representing members of the instructor’s and administrator’s groups are drawn in green and blue, respectively.

Figure 5.1 shows three different visualizations of the karate club network: MDS, radial layout (from [16]), and anisotropic radial layout (ARL). We can make a few observations from the visualizations. Although MDS does a good job of preserving the two clusters, it does not unambiguously convey centrality. On the other hand, the radial layout clearly showcases the centrality at the expense of dispersing the clusters by distorting the distances among their nodes, thereby obscuring their internal structure. We see that ARL is able to largely preserve the structure seen in MDS, with clearly distinguishable clusters, while also clearly conveying the centrality information. Although the radial layout pushes the instructor’s group far away due to its low betweenness centrality, ARL lets the group remain close by instead bringing the outermost contour in toward it. Similarly, the administrator is allowed to remain closer to their group by the protrusion toward the administrator of the inner contours, which enclose the most central nodes.

5.4.2 Terrorist Network From 2004 Madrid Train Bombing

Figure 5.4 and Figure 5.5 show visualizations of a network of individuals connected to the bombing of trains in Madrid on March 11, 2004. These data were originally compiled by Rodriguez [92] from newspaper articles that reported on the subsequent police investigation. Sixty-four nodes represent suspects and their relatives, and 243 edges have weights ranging from 1 to 4, which represent an aggregated strength of connection based on various parameters such as contact, kinship, ties to Al Qaeda, etc. [44]. In Figure 5.4 (as well as Figures 5.5-5.7), distances between nodes are related inversely to edge weights. In the visualizations, we identify nodes using numbers to avoid text clutter; however, we include a mapping to the names of the individuals represented by the nodes in the Appendix.

Rodriguez [92] identifies several key suspects: ring leaders (marked in blue in Figure 5.4); members of a field operating group, who were closely involved in actually carrying out the attack (green); intermediaries (red); and suspects with local roots, ties to foreign Al Qaeda members, and those who supplied explosives. We see that ARL (Figure 5.5) is able to better preserve the structure and cohesiveness of the core members of the field operating group in comparison to the radial layout (Figure 5.4b). Critically, a key mastermind in this event, despite having a low centrality (due to communicating often through an intermediary), is allowed to be close to the center in the ARL. This arrangement, which is possible due to the ability of the centrality contours to adapt to the circumstance, preserves the close association between the masterminds that is lost in the radial layout. We also see that the flexibility of the contours in ARL preserves the locality of the various groups, which allows us to see the role of intermediaries with high centrality in acting as a bridge between various groups.

5.4.3 Coappearance Network for Characters in Les Miserables

The third data set is a graph of character associations in the famous French novel Les Miserables [55]. This graph consists of 77 nodes, each representing a character in the novel, and 254 weighted edges, where the weights represent the number of chapters that feature both characters associated with an edge.


Figure 5.4. Network of terrorists and affiliates connected to the 2004 Madrid train bombing using (a) MDS and (b) radial layout.

Figure 5.5. Network of terrorists and affiliates connected to the 2004 Madrid train bombing using anisotropic radial layout.


Figure 5.6. Coappearance network for characters in the novel Les Miserables using (a) MDS and (b) radial layout.

Figure 5.7. Coappearance network for characters in the novel Les Miserables using anisotropic radial layout.

We see that the main protagonist, Valjean (marked in red), is placed prominently in all three visualizations (Figure 5.6 and Figure 5.7). However, other key characters in the plot, such as Inspector Javert (blue) and Cosette (orange), who do not appear often with characters other than the protagonist (and thus have low betweenness centrality), are treated differently. Although the radial layout relegates them to the periphery, far from Valjean (Figure 5.6b), MDS (Figure 5.6a) paints a conflicting picture with regard to their centrality; e.g., Cosette’s node almost overlaps with Valjean’s despite its low centrality. In contrast, the proposed ARL (Figure 5.7) is able to coherently convey the low centrality of Inspector Javert and Cosette, as well as their closeness to Valjean. This issue of distance distortion appears to be a frequent occurrence in the radial layout because the many characters who have a low centrality value end up packed in the outer periphery. A case of contrast is the character Bishop Myriel (green), who, despite being associated with several characters, is seen with Valjean only once.

5.5 Discussion

This chapter describes an energy-based layout algorithm for graphs, called anisotropic radial layout, which conveys structural centrality using anisotropic radial constraints while also preserving approximate distances (or structure) in the graph. In contrast to existing methods for conveying node centrality that employ an isotropic centrality field [10, 16], the proposed method determines an anisotropic centrality field onto which to project nodes. Although the energy minimization strategy described in this chapter allows the solution to deviate from the constraints, one can enforce hard constraints by adding a postprocessing step that projects nodes onto the closest positions on their associated isocontours. The key implication of the anisotropic centrality field in our method is that more central nodes are allowed to be placed further from the origin than less central nodes—without an energy penalty—provided they do not lie on a common ray, which aids our objective of achieving a better balance between the visual representations of centrality and structure than is possible with existing methods. Our objective differs from other prior works that use centrality or continuous fields to visualize the structure of dense graphs [109, 110].

CHAPTER 6

VISUALIZING HIGH-DIMENSIONAL DATA USING ORDER STATISTICS

Portions of this chapter have been reproduced with permission from John Wiley and Sons, and are based on material that is to appear in an article titled “Visualizing High-Dimensional Data using Order Statistics” by Raj, M. and Whitaker, R. in Computer Graphics Forum, Volume 37 (2018), Number 3.

6.1 Introduction

Multidimensional data appear frequently in a wide range of domains and applications. For example, data from domains such as healthcare, engineering, and the social sciences often contain a large number of dimensions [59]. The various dimensions in such data can contain either numerical or categorical values. Multidimensional data can also have complex structures; for example, the data can be multimodal with several clusters, or can lie on a lower dimensional manifold in a high-dimensional space. A wide range of visualization methods has been developed to help visualize and understand such complex, high-dimensional data sets [64].

Among the various methods for analyzing high-dimensional data, dimensionality reduction methods that project data onto lower dimensional spaces are often useful for getting a quick and general overview of the data. These methods include various linear and nonlinear methods, such as principal component analysis (PCA), multidimensional scaling (MDS), and t-distributed stochastic neighbor embedding (t-SNE). The objective of these methods is often to convey the structure of the data by preserving approximate pairwise distances from the original or intrinsic space in the lower dimensional embedding space. Methods such as PCA and MDS can be formulated to work with only inner product information, which is useful for visualizing data in kernel spaces, which may lack an explicit vector representation [95]. Dimensionality reduction techniques are also used in conjunction with other visualization methods [69, 93].

Despite the usefulness of dimensionality reduction methods for visualizing multidimensional data, there are a few critical limitations associated with them. PCA and related subspace-based approaches may not be suitable if the data are not well approximated by a linear subspace.
Although MDS and nonlinear methods such as t-SNE are able to highlight geometric relationships, even in the presence of nonlinear structure, they are susceptible to misrepresenting the statistical structure of the data. For example, points that are rare and on the outer periphery of a distribution in a high-dimensional space may be projected close to a more typical point near the center of the distribution. Such instances are common, and unsurprising if we consider that the objective of those methods is typically to preserve the relative distances between points, with no mechanism to correctly convey how central or typical points are in a data set or distribution. Although the focus on preserving relative distances to reveal high-level structure can be useful, doing so at the expense of centrality information can hinder a true understanding of the data set as a whole, and can be particularly detrimental for the purpose of analyzing outliers [116].

In this chapter, we propose a novel method to project multidimensional data onto a lower dimensional space while preserving order structure as well as relative distances in the data. The focus of this work is different from prior work in robust multidimensional scaling, which aims to mitigate the undesirable effects resulting from inconsistencies in the data (pairwise distances in the original space) [35, 100]. In contrast, the proposed method is relevant even when there are no inconsistencies in the data. The proposed method does share its ideology with a family of methods from the domain of graph drawing, where the goal is to determine node positions in a drawing that simultaneously conveys graph-theoretic internode distances (distances along edges) as well as node centrality or importance based on complementary graph-theoretical measures [10, 15, 16, 91].
The internode distances in the drawing approximate the graph-theoretical distances under the constraint that the distance of each node from the drawing's geometric center be proportional to its graph centrality value. In this work, we aim to preserve order statistics of the data in the original space by ensuring that less central or outlying points do not end up appearing more central in low-dimensional embeddings, or vice versa. We also want to preserve relative pairwise distances in the data as much as possible.

An overview of the proposed projection method for satisfying the above objectives is as follows. We first quantify the centrality of each member in the original space by employing data-depth methods (see Section 6.2.1). Next, we design a penalty term, added to the MDS optimization objective, that penalizes low-dimensional embeddings in which, along any ray traveling away from the position of the most central member, less central points are situated closer to the center than more central points. Although we demonstrate the proposed method with the help of the MDS objective, the general approach can be used to similar effect with any other dimensionality reduction method that involves iterative optimization.

The goal of visualizations, in general, is to highlight features of interest in the data. These features often include summary statistics such as the most central or typical member (also known as the median) and the least central or outlier members, as well as the shape and spread of the bulk of the data. In the case of one-dimensional (1D) and two-dimensional (2D) data, visualizations such as the Tukey boxplot [107] and the bivariate bagplot [93] convey a visual summary of the data by displaying summary statistics.
In this chapter, we exploit the coherent order structure in the embedding space afforded by the proposed projection method to develop visualization strategies, along the lines of the bivariate bagplot, for multidimensional data (i.e., where d > 2). The main contributions of this chapter are:

• A novel method for projecting multidimensional data using order statistics called order aware projection (OAP).

• Two visualization strategies based on the proposed projection method, namely, field overlay plot and projection bagplot.

• An interactive prototype tool to explore data.

• A demonstration of the effectiveness of the method with four real data sets.

The rest of the chapter is organized as follows. Section 6.2 provides an overview of the technical background related to the proposed methods. Section 6.3 presents a description of the proposed dimensionality reduction method and visualization strategy. We demonstrate the proposed methods using real data in Section 6.4, which is followed by a general discussion in Section 6.5.

6.2 Background

Here we provide an overview of the necessary technical background and related work.

6.2.1 Order Statistics and Data Depth

Order statistics for a data set are members from the data set placed in ascending order based on some criterion. For our purpose, we are interested in center-outward order statistics that help quantify how central or outlying a member is with respect to a data set. In the case of 1D numeric data, sorting numbers based on distance from the median provides an easy way to obtain order statistics. When the data are multidimensional, a family of methods from descriptive statistics known as data depth can be used to quantify center-outwardness. Data-depth methods exhibit several useful properties, which make them an attractive basis for analyzing data. These properties include robustness, maximum at center, monotonicity, and zero at infinity [124].

Data-depth methods have been proposed for tackling several types of multidimensional and multivariate data, for example, high-dimensional points [107], functions [66], sets [115], multivariate curves [73], and paths on a graph [90]. In this chapter, we use different formulations of data depth based on the type of data. We use half-space depth for numerical multidimensional data with relatively few dimensions (Section 6.4.2). We use functional depth for dealing with higher dimensional data because it can be efficiently computed for such data (Section 6.4.1). Finally, we use set depth for categorical data sets (Sections 6.4.3 and 6.4.4). A brief overview of half-space depth, functional depth, and set depth follows.

Half-space depth of any point x ∈ ℝ^d with respect to a set of points X ⊂ ℝ^d is defined as the smallest number of data points from X that can be contained in a closed half-space also containing x [31, 107], which can be stated as

\[
d_{\mathrm{halfspace}}(x \mid X) = \min_{a \in \mathbb{R}^d \setminus \{0\}} \bigl| \{\, p \in X : \langle a, p \rangle \ge \langle a, x \rangle \,\} \bigr|. \tag{6.1}
\]
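To make Eq. 6.1 concrete, the sketch below approximates half-space depth by sampling random directions a and counting, for each direction, the points whose projection is at least that of x; the minimum over sampled directions upper-bounds the true depth and tightens as the number of directions grows. The function name and parameters are illustrative and not from the chapter's implementation.

```python
import numpy as np

def halfspace_depth(x, X, n_dirs=2000, rng=None):
    """Approximate half-space depth (Eq. 6.1) of point x w.r.t. rows of X.

    For each sampled direction a, count the points p with <a, p> >= <a, x>;
    the minimum count over directions upper-bounds the true depth.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    A = rng.normal(size=(n_dirs, X.shape[1]))   # random directions a
    proj_X = X @ A.T                            # <a, p> for every p and a
    proj_x = np.asarray(x, dtype=float) @ A.T   # <a, x> for every a
    counts = (proj_X >= proj_x).sum(axis=0)     # |{p : <a, p> >= <a, x>}|
    return int(counts.min())
```

On a symmetric cross of five points, for example, the center attains depth 3 while an extreme point attains depth 1, reflecting the center-outward ordering.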

Functional depth of any function g(t) with respect to a set of functions F = { f_i(t) : 1 ≤ i ≤ n }, f_i : D → R, where D and R are intervals in ℝ, is given by the probability of g(t) being contained in a functional band, where a functional band is the region between the min/max envelope formed by a set of j randomly chosen functions { f_1(t), …, f_j(t) } ⊂ F [66], which can be stated as

\[
d_{\mathrm{functional}}\bigl(g(t) \mid F\bigr) = \mathrm{Prob}\Bigl( g(t) \subset \mathrm{fB}\bigl[ \{ f_1(t), \cdots, f_j(t) \} \bigr] \Bigr), \tag{6.2}
\]

where fB[·] denotes the functional band. A function g(t) is contained in the functional band formed by { f_1(t), …, f_j(t) } if it satisfies the following:

\[
g(t) \subset \mathrm{fB}\bigl[ \{ f_1(t), \cdots, f_j(t) \} \bigr] \iff \min\bigl( f_1(t), \cdots, f_j(t) \bigr) \le g(t) \le \max\bigl( f_1(t), \cdots, f_j(t) \bigr) \;\; \forall t. \tag{6.3}
\]
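The band depth of Eqs. 6.2 and 6.3 can be computed exactly by enumerating all size-j subsets rather than sampling them at random; a small illustrative sketch (the names are ours, not from the text):

```python
import numpy as np
from itertools import combinations

def functional_band_depth(g, F, j=2):
    """Band depth (Eq. 6.2) of sampled function g w.r.t. the rows of F,
    enumerating all size-j subsets instead of random draws."""
    F = np.asarray(F, dtype=float)
    g = np.asarray(g, dtype=float)
    inside = total = 0
    for idx in combinations(range(F.shape[0]), j):
        band = F[list(idx)]
        lo, hi = band.min(axis=0), band.max(axis=0)    # min/max envelope
        inside += bool(np.all((lo <= g) & (g <= hi)))  # Eq. 6.3 containment
        total += 1
    return inside / total
```

With j = 2, a function lying between most pairs of ensemble members receives a high depth value, while a function outside every envelope receives depth zero.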

Set depth of any set s with respect to a set of sets S = { s_i : 1 ≤ i ≤ n } is given by the probability of s being contained in a set band, where a set band is bounded by the union and intersection of j randomly chosen sets { s_1, …, s_j } ⊂ S [115], which can be stated as

\[
d_{\mathrm{set}}(s \mid S) = \mathrm{Prob}\Bigl( s \subset \mathrm{sB}\bigl[ \{ s_1, \ldots, s_j \} \bigr] \Bigr), \tag{6.4}
\]

where sB[·] denotes the set band. A set s is contained in the set band formed by { s_1, …, s_j } if it satisfies the following:

\[
s \subset \mathrm{sB}\bigl[ \{ s_1, \ldots, s_j \} \bigr] \iff \bigcap_{k=1}^{j} s_k \subset s \subset \bigcup_{k=1}^{j} s_k.
\]

Functional depth and set depth are stable with respect to the choice of j, where 2 ≤ j ≤ n [66, 115].
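Set band depth admits the same exhaustive enumeration as functional band depth; a minimal sketch using Python sets (names are illustrative):

```python
from itertools import combinations

def set_band_depth(s, S, j=2):
    """Set band depth (Eq. 6.4) of set s w.r.t. the list of sets S,
    enumerating all size-j subsets: s lies in the band of {s1,...,sj}
    iff intersection(s1..sj) <= s <= union(s1..sj)."""
    s = set(s)
    inside = total = 0
    for subset in combinations(S, j):
        sets = [set(t) for t in subset]
        inter = set.intersection(*sets)
        union = set.union(*sets)
        inside += (inter <= s <= union)   # subset-chain containment test
        total += 1
    return inside / total
```

Treating each categorical record as the set of its attribute values, as done later in Sections 6.4.3 and 6.4.4, lets this same routine rank categorical data members.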

6.2.2 Data-Depth-Based Visualizations

A common area of application for data-depth methods is ensemble visualization, where the order statistics obtained using data depth are used to design summary visualizations for ensembles of various kinds of data. Perhaps the most well-known example is the Tukey boxplot [107]. Other depth-based visualizations have been proposed for bivariate data [93], high-dimensional data [47], ensembles of functions [102], surfaces [40], sets or isocontours [115], curves [67, 73], and paths on graphs [90]. Our work relates closely to the visualizations for multidimensional data, particularly the bivariate bagplot [93] and the high-density region (HDR) boxplot [48] (see Figure 6.1).


Figure 6.1. Existing visualizations for multivariate data. (a) Bivariate bagplot and (b) high-density region (HDR) boxplot visualizations of the El Niño data set (12-dimensional temperature data for each year from 1951 to 2007), generated using the R Rainbow package [99].

For 2D data, the bivariate bagplot (Figure 6.1a) is a visualization technique that highlights the median, spread, skewness, and outliers in the data. The first step in drawing a bagplot is to determine order statistics using half-space depth. This step is followed by drawing the inner and outer convex polygons, or bands. The inner band highlights the most central half of the data as determined by the order statistics, and the outer band is constructed by inflating the inner band by a constant factor α. Points outside the outer band are considered to be outliers. The HDR boxplot uses bivariate kernel density estimation to identify regions of interest. Both the bivariate bagplot and the HDR boxplot use dimensionality reduction methods, typically PCA, for dealing with higher dimensional data (d > 2) by projecting the data to 2D as a preprocessing step [47, 48].

6.2.3 Multidimensional Scaling (MDS)

Since the proposed projection method uses the MDS objective function, here we give a brief overview of MDS and its usage in dimensionality reduction. MDS refers to a popular class of techniques for visualizing similarities between members of a data set. Given a collection of high-dimensional points Y = [y_1, …, y_n]^T ∈ ℝ^{n×d}, the goal of MDS is to find a low-dimensional embedding X = [x_1, …, x_n]^T ∈ ℝ^{n×k}, where k < d, such that the discrepancy between the pairwise distances in the original space ℝ^d and the corresponding distances in the embedding space ℝ^k is minimal. Although there are several variants of MDS, in this chapter we use, without loss of generality, a variant known as metric MDS with distance scaling. Distance scaling makes this variant of MDS nonlinear, with more emphasis on conveying smaller distances. The objective function of metric MDS is also known as stress, and after incorporating distance scaling, it can be written as follows [13, 70]:

\[
\sigma(X) = \sum_{i<j} w_{ij} \bigl( \delta_{ij} - d(x_i, x_j) \bigr)^2, \tag{6.5}
\]

where δ_ij = ‖y_i − y_j‖_2, d(x_i, x_j) = ‖x_i − x_j‖_2, and w_ij = δ_ij^{−2}. The gradient of the above MDS objective can be written as follows [13]:

\[
\nabla \sigma(X) = 2\bigl( VX - B(X)X \bigr), \tag{6.6}
\]

where the matrices V = (v_ij) and B = (b_ij), with 1 ≤ i, j ≤ n, can be represented as

\[
v_{ij} = \begin{cases} -w_{ij} & \text{for } i \ne j \\ \sum_{j=1, j \ne i}^{n} w_{ij} & \text{for } i = j, \end{cases}
\qquad
b_{ii} = - \sum_{j=1, j \ne i}^{n} b_{ij},
\]

\[
b_{ij} = \begin{cases} - \dfrac{w_{ij}\, \delta_{ij}}{d(x_i, x_j)} & \text{for } i \ne j \text{ and } d(x_i, x_j) \ne 0 \\ 0 & \text{for } i \ne j \text{ and } d(x_i, x_j) = 0. \end{cases}
\]
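A minimal numeric sketch of the distance-scaled stress (Eq. 6.5) and its gradient follows, with the gradient computed directly from the residuals (which should agree, up to arrangement, with the V and B(X) form of Eq. 6.6); the function name and the eps guard are ours:

```python
import numpy as np

def mds_stress_grad(X, delta, eps=1e-9):
    """Distance-scaled stress (Eq. 6.5) and its gradient, with weights
    w_ij = delta_ij**-2, computed directly from the residuals."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    d = np.linalg.norm(diff, axis=2) + np.eye(n)   # dummy 1s avoid /0 on diagonal
    w = 1.0 / np.maximum(delta, eps) ** 2
    np.fill_diagonal(w, 0.0)
    resid = delta - d
    np.fill_diagonal(resid, 0.0)
    stress = 0.5 * np.sum(w * resid ** 2)          # each pair counted once
    coef = 2.0 * w * (d - delta) / d               # per-pair gradient scale
    np.fill_diagonal(coef, 0.0)
    grad = (coef[:, :, None] * diff).sum(axis=1)   # chain rule through d_ij
    return stress, grad
```

Iterating `X -= tau * grad` from a random start gives a simple gradient-descent alternative to majorization-style updates built on Eq. 6.6.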

6.2.4 Monotone Regression Along One Variable for Multivariate Data

The proposed projection method also involves computing a continuous, smooth, radially decreasing approximation of the depth of members in a 2D embedding. We call this approximation the monotonic field (Figure 6.2c). Note that data-depth values are computed for points in the original space and not after they are projected onto the embedding space. To construct a monotonic depth field from a sparse set of depth values arranged in a 2D embedding plane, we start by computing a smooth interpolated field (Figure 6.2b) of depth values using the thin plate spline technique [12]. In what follows, we briefly describe our approach for computing the monotonic field by radially monotonizing the interpolation



Figure 6.2. Various stages of the proposed methods. (a) Points from an anisotropic, 3D normal distribution projected onto a 2D plane using MDS. Circle sizes indicate the half-space depth of points in the original 3D space. (b) The initial interpolated field in the background of the MDS projection. (c) The initial monotonic field in the background, obtained from the initial interpolated field. (d) Field overlay plot using order aware projection (OAP) after optimization is complete, with the final monotonic field shown in the background. (e) Projection bagplot visualization. The median is shown in yellow. Deep blue indicates the 50% band and light blue indicates the 100% band.

field. This approach is adapted from a technique for performing monotone regression for multivariate data [26].

The process of computing radially monotonic approximations of a smooth 2D field depends on a method for finding monotonic approximations of univariate data. Given a smooth 1D function m(t) : [0, 1] → ℝ, the following two steps provide a monotonic approximation, m̂_A(t), that is smooth and first-order asymptotically equivalent to m(t) [25]:

• Step 1 (monotonization): Sample the input function at regular intervals, compute a density estimate of the samples, and then compute a cdf of the density estimate to arrive

at the inverse of the monotonic approximation:

\[
\hat{m}_A^{-1}(z) = \frac{1}{N\omega} \sum_{i=1}^{N} \int_{z}^{\infty} K\!\left( \frac{m\!\left(\frac{i}{N}\right) - u}{\omega} \right) du, \tag{6.7}
\]

where N controls the sampling resolution, and K is a smooth, symmetric kernel with bandwidth ω.

• Step 2 (inversion): Calculate the inverse of m̂_A^{−1}, which is the desired monotonically decreasing approximation of the 1D function m(t).

To compute a radially monotonic approximation of the interpolated field, we proceed by resampling the field onto a polar grid centered at the median (the deepest member as per data depth computed in the original space). We then treat the values of the field along each of the evenly spaced angular coordinates as 1D functions, which can be monotonized using the procedure described above. On monotonizing those 1D functions along each direction, we arrive at the monotonic field, which we then resample back to Cartesian coordinates. The resolution of the polar grid, both radial and angular, determines the quality (smoothness) of the monotonic field (we use 360 radial divisions for the results in this chapter). The smoothness of the interpolated field is preserved through this process, meaning that field values along adjacent directions vary smoothly and remain coherent, due to the properties of the monotonization process (except at the origin, due to the intermediate polar coordinate representation) [26].
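The two steps above can be sketched as follows. With a Gaussian kernel, the inner integral in Eq. 6.7 has a closed form (a Gaussian cdf), so the inverse approximation is an average of smoothed step functions; since m̂_A^{−1} is monotone, the inversion in Step 2 can be done by bisection. This is an illustrative sketch, not the dissertation's implementation; the names and the bisection bounds are ours:

```python
import numpy as np
from math import erf, sqrt

def monotone_decreasing_approx(m_samples, omega=0.05):
    """Monotone approximation of a 1D function from samples m(i/N),
    following Eq. 6.7 with a Gaussian kernel K."""
    m = np.asarray(m_samples, dtype=float)

    def m_inv(z):
        # \hat{m}_A^{-1}(z): decreasing in z, with values in (0, 1)
        return np.mean([0.5 * (1.0 + erf((mi - z) / (omega * sqrt(2.0))))
                        for mi in m])

    def m_hat(t):
        # Step 2: invert by bisection, finding z such that m_inv(z) = t
        lo, hi = m.min() - 5 * omega, m.max() + 5 * omega
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if m_inv(mid) > t:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    return m_hat
```

Applied to each angular direction of the polar-resampled field, such a routine yields the radially decreasing profiles that together form the monotonic field.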

6.3 Method

Here we describe the proposed projection method, which preserves centrality structures using order statistics (Section 6.3.1), and the visualization strategies that use the resultant embedding (Section 6.3.2).

6.3.1 Projecting Multidimensional Data Using Order Statistics (Order Aware Projection)

The high-level goal of our projection method is to preserve both the relative distances between individual members and the order statistics from the original multidimensional space when computing a lower dimensional embedding. To achieve this, we design an objective function that comprises two terms. The first term is identical to the MDS stress (Section 6.2.3), which penalizes discord in pairwise distances between the intrinsic space and the embedding. The second term levies an energy penalty for discord in the center-outward order statistics. Since the order statistics are determined using data depth, we call this term the depth penalty. The data-depth values computed in the original space (Section 6.2.1) and the monotonic field computed in the embedding space (Section 6.2.4) are used to quantify the discord in order structure between the original and embedding spaces. The isocontours of the monotonic field mimic the monotonic, center-outward decrease of depth values in the original space. The depth penalty at each point is proportional to the difference between its depth value in the original space and the value of the monotonic field sampled at the location of its projection in the embedding space, and can be expressed as follows:

\[
p(X) \propto \bigl\| M_{X,h}(X) - h \bigr\|^2, \tag{6.8}
\]

where X = [x_1, …, x_n]^T ∈ ℝ^{n×k}, h ∈ ℝ^{n×1} contains the depth values associated with X computed in the original space ℝ^d, where k < d, and M_{X,h}(X) ∈ ℝ^{n×1} denotes the values of the 2D monotonic field at the positions in X. The mention of X and h in the subscript indicates their use in the construction of the monotonic field, and the X in parentheses indicates the positions where field values are sampled. If the interpolated field (Section 6.2.4) is itself already radially monotonic, the value of the depth penalty term approaches zero. The complete objective function, which includes both the MDS stress and the depth penalty, can be stated as follows:

\[
\gamma(X) = \underbrace{\sigma(X)}_{\text{MDS stress}} + \; w_p \, p(X), \tag{6.9}
\]

where w_p is a constant of proportionality controlling the relative importance of the depth penalty with respect to the MDS stress. The gradient of the above objective can be derived as

\[
\nabla \gamma(X) = \nabla \sigma(X) + \underbrace{w_p \times 2 \bigl( M_{X,h}(X) - h \bigr) \circ \nabla M_{X,h}(X)}_{\nabla p(X)}, \tag{6.10}
\]

where ∘ denotes the element-wise product. We perform an optimization of the above objective using gradient descent until X converges or the maximum number of allowed iterations is reached. The proposed projection method is summarized in Algorithm 2.

Optimization of the above objective requires a few considerations in practice. First, computing the gradient of the monotonic field M at the positions X is nontrivial due to the

Algorithm 2: Order Aware Projection (OAP)

Input: Y = [y_1, …, y_n]^T ∈ ℝ^{n×d}, maximum number of iterations i_max ∈ ℕ, depth field lag ℓ, step size τ, depth weight w_p
Output: positions X = [x_1, …, x_n]^T ∈ ℝ^{n×k}, where k < d

    X_0 ← compute initial embedding using MDS                /* (Section 6.2.3) */
    h ∈ ℝ^{n×1} ← compute order statistics for {y_1, …, y_n} ⊂ ℝ^d
    j ← −1                                                   /* counter for field updates */
    for i = 1, …, i_max do
        if i mod ℓ = 0 then
            j ← j + 1
            X_j ← X_i
            M_{X_j,h} ← compute monotonic field              /* (Section 6.2.4) */
        end
        X_{i+1} ← X_i − τ [ ∇σ(X_i) + w_p × 2 ( M_{X_j,h}(X_i) − h ) ∘ ∇M_{X_j,h}(X_i) ]   /* gradient update (Section 6.3.1) */
    end

dependence of M itself on X. We deal with this issue by letting the field lag, which means recomputing M only after a fixed number of iterations, ℓ, have passed since the previous update, and treating it as a constant during all intervening iterations (see Figure 6.3). If the field M is held constant (ℓ = ∞), convergence to a local minimum can be guaranteed due to the properties of gradient descent. On allowing the field to lag suitably (1 ≤ ℓ < ∞, see Section 6.5), in practice we observe convergence to a lower energy state, although a theoretical guarantee remains a topic for future work. This approach has also been used for minimizing similar energies when computing graph layouts [91].

Second, for stability with regard to the median, the proposed method relies on the known robustness of data-depth methods in situations of data contamination [102, 107, 115]. In cases where multiple members are identified as a median in the original space, we choose the member with the highest depth value among those in the embedding space. This approach helps reduce the overall energy when different medians are projected far apart, as is often observed with categorical data (Sections 6.4.3 and 6.4.4).

In the case of multimodal data where class membership information is known in advance, we construct a separate monotonic field for each class, centered at the median of that class, which leads to separate, exclusive depth penalty terms that apply only to the members of the associated class (Sections 6.4.2 and 6.4.4). The MDS term is the same as in

Figure 6.3. The typical profile of the MDS stress and the depth penalty during the optimization process. The MDS stress increases slightly. The depth penalty undergoes sharp drops periodically, at iterations with monotonic field updates.

the general case (which assumes a unimodal distribution) and considers pairwise distance relationships across the entire data set. This approach preserves the order structure within each class while allowing MDS forces to determine the relative placement of different classes.
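The inner update of Algorithm 2 can be sketched as a single function. As a stand-in for the thin-plate-spline-based monotonic field, the sketch uses a simple analytic radially decreasing field M(x) = exp(−‖x − c‖) centered on the deepest member c, so that the field gradient is available in closed form; everything here (names, step size, field choice) is illustrative rather than the dissertation's implementation:

```python
import numpy as np

def oap_step(X, delta, h, tau=0.05, wp=1.0):
    """One gradient update of the OAP objective (Eq. 6.10), with an
    analytic radially decreasing stand-in for the monotonic field."""
    c = X[np.argmax(h)]                            # median = deepest member
    r = np.linalg.norm(X - c, axis=1) + 1e-9
    M = np.exp(-r)                                 # field sampled at X
    dM = (-M / r)[:, None] * (X - c)               # gradient of M at each x_i
    # MDS stress gradient, computed directly from the residuals:
    diff = X[:, None, :] - X[None, :, :]
    d = np.linalg.norm(diff, axis=2) + np.eye(len(X))
    w = 1.0 / np.maximum(delta, 1e-9) ** 2
    np.fill_diagonal(w, 0.0)
    coef = 2.0 * w * (d - delta) / d
    np.fill_diagonal(coef, 0.0)
    grad_mds = (coef[:, :, None] * diff).sum(axis=1)
    # depth-penalty gradient: 2 (M - h) o grad M (element-wise product)
    grad_depth = 2.0 * (M - h)[:, None] * dM
    return X - tau * (grad_mds + wp * grad_depth)
```

At a configuration whose embedding distances already match delta and whose field values already match h, both gradients vanish and the update is a no-op, mirroring the fixed points of Eq. 6.10.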

6.3.2 Field Overlay and Projection Bagplot Visualizations

At the end of the optimization process, all members in the data set are aligned with their corresponding isocontours (whose isovalue matches the member's depth) on the underlying monotonic field. The shape of the isocontours depends on the data and can often provide useful insights into the structure of the data in the original space. Our first visualization strategy, called the field overlay plot, is to present the OAP embedding overlaid on the associated monotonic field (Figure 6.2d). We show the monotonic field as a color heatmap with isocontour lines for 10 equidistant values spanning the range of depth values. This approach helps with the interpretability of the OAP embedding by highlighting the depth associated with each member as well as the regions/directions of fast and slow depth change.

Due to the radially, monotonically decreasing property of the monotonic field, every isocontour divides the embedding space into inner and outer regions, which exclusively contain members with higher and lower depth in the original space, respectively. We propose the projection bagplot visualization (Figure 6.2e), which uses this arrangement of members in the embedding space to convey the median, inner, and outer bands analogous to those seen in the Tukey boxplot [107]. The depth value of the isocontour corresponding to the

50% band, h50%, is chosen to be the depth value of the member at the 50th percentile when ranking the members' depth values. The 100% band is formed by inflating the 50% band by a constant factor α. We therefore have h100% = hmedian − α × (hmedian − h50%), where hmedian is the depth value of the median (the highest depth value, by definition) and α = 1.5 typically [102, 107]. We use higher and lower color saturation to indicate bands/members in the 50% and 100% bands, respectively. For multimodal data with known class membership information, a separate set of 50% and 100% bands is computed and displayed for each class, as demonstrated in later figures.

The projection bagplot visualization shares some similarities with both the bagplot and the HDR boxplot. The interpretation of the bands in the proposed method is similar to that for the bagplot, whereas the shape of the bands is smooth and star shaped, as in the HDR boxplot. Despite the similarities, the proposed method is notably different in its handling of multidimensional data projected to lower dimensions, due to its emphasis on maintaining the center-outward order of members during the order-preserving projection process. Although glyph sizes or color can be modulated to convey order statistics, the reliance of the bagplot and HDR boxplot visualizations on existing dimensionality reduction techniques [48] can lead to a conflict between glyph size/color and location cues (e.g., members appearing as outliers due to smaller glyphs while also appearing closer to the center). Furthermore, when displaying large data sets, the available display space may place an upper bound on glyph sizes, thereby restricting the usable range of glyph sizes.
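The band thresholds described above reduce to a few lines; the function name is ours:

```python
import numpy as np

def band_thresholds(depths, alpha=1.5):
    """Depth thresholds for the 50% and 100% bands of a projection
    bagplot: h50 is the depth of the member at the 50th percentile of
    the ranked depth values, and h100 = h_median - alpha*(h_median - h50)."""
    depths = np.sort(np.asarray(list(depths), dtype=float))
    h_median = depths[-1]                  # deepest member
    h50 = depths[len(depths) // 2]         # member at the 50th percentile
    h100 = h_median - alpha * (h_median - h50)
    return h50, h100
```

Members whose depth falls below h100 are drawn as outliers, mirroring the role of the 1.5-IQR fences in the Tukey boxplot.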

6.4 Results

We now present some example visualizations of real data sets with existing and proposed methods.

6.4.1 MNIST Data

The MNIST data set is popular in the machine learning community and comprises thousands of samples of handwritten digits [56]. The samples are formatted as 28 × 28 pixel gray-scale images, resulting in each sample comprising 784 dimensions. Figure 6.4 shows two visualizations of a random subset of 100 samples of digit 0, whereas Figure 6.5 shows digits 0, 1, and 7 with 100 samples each. We consider each sample to be an instance of a 784-dimensional function and use functional depth to compute order statistics. In Figure 6.5, we use order statistics computed for each digit separately. These order statistics are used to obtain an OAP embedding, which is used to draw a field overlay plot (Section 6.3.2). In Figure 6.5, we use the proposed visualization strategy for multimodal data with a separate monotonic field for each digit (Section 6.3).

On comparing the MDS embedding in Figure 6.4a and the proposed field overlay plot in Figure 6.4b, we can make a few interesting observations. First, we notice that the underlying depth contours make it easy to spot outliers in the field overlay plot. Since the contours adapt to the data, we also notice different outlier characteristics, such as sharing some similarity with other members (see member A) or being more peculiar (see member B). We also notice that the proposed projection (OAP) presents more clique-like structures, often with similar members in a tighter cluster than in the MDS (see the cliques around region C). The formation of cliques can be understood by considering that members in a relatively local region of the original space would tend to have similar depth values and low pairwise distances, and would be encouraged to be placed similarly in the embedding by both the depth and MDS energies.

Figure 6.4. MNIST data sample visualizations: (a) MDS and (b) field overlay plot using order aware projection (OAP). Outliers and cliques appear more prominent in the field overlay plot.

Figure 6.5. MNIST data sample visualization for multiple digits (0, 1, and 7) with the field overlay plot using OAP. Monotonic fields corresponding to 0, 1, and 7 are shown using heatmaps and isocontour lines drawn in green, blue, and red, respectively. Higher saturation of colors in the heatmaps indicates a higher value of the monotonic field. Unusual members become apparent on tracing the outermost isocontours.

In Figure 6.5, we can observe the different monotonic fields for digits 0, 1, and 7. The higher overlap of digit 7 with other digits, particularly with digit 1, is immediately clear. Furthermore, on tracing the outermost isocontours of the fields, we are immediately drawn toward outlying members that have been placed far from other members, or that share similarities with other digits. For example, see the instances marked by K and L, respectively, in Figure 6.5.

6.4.2 Iris Flower Data

We obtained the well-known Iris data set from the UCI machine learning repository [59]. The data set contains flower sepal and petal measurements from three related species of Iris flowers, and includes 50 instances of each species with four numeric measurements per instance. In Figure 6.6, we use the proposed visualization strategy for multimodal data with a separate monotonic field for each of the three species classes (Section 6.3). The order statistics are computed using half-space depth for each class separately. The median of each class is colored dark gray, and the size of the circular glyphs encodes the depth of the members with respect to their respective classes.

Figure 6.6. Iris flower data visualization: (a) bivariate bagplot using MDS and (b) projection bagplot using OAP. There are three species of flowers, each represented by a color, and each circle represents an individual flower. For the blue and green classes, the 50% and 100% bands overlap due to a large proportion of members with the identical, lowest value of depth. For the red class, only the projection bagplot conveys the band associations of the flowers correctly.

Figure 6.6a shows a bivariate bagplot [48] and Figure 6.6b shows the proposed projection bagplot. In both figures, we immediately notice a difference in the structure of the classes based on the overlap of the 50% and 100% bands. In the red (Setosa) class, we observe a partial overlap, as opposed to a full overlap in the blue (Versicolor) and green (Virginica) classes. This observation indicates a more even spread of members in the red class and more members at the class boundaries for the blue and green classes. We also see that the order structures within classes are preserved in the projection bagplot; along all outward directions from the median, the depth of the members falls monotonically. The preservation of centrality structures prevents cases such as region D, where members in the 100% band are projected to fall inside the 50% band of the embedding in the bivariate bagplot. Another interesting area is region E, where two members are pushed out of the 100% band despite being of similar depth to other nearby points. Such a layout of members is due to the distance-preserving aspect of the proposed objective (Equation 6.9) trying to convey differences among members that are all on the boundary of the green class. Such cases, as highlighted by the projection bagplot, are good instances for further exploration.

6.4.3 Unidentified Flying Object (UFO) Encounters Data

We now look at a data set related to UFO encounters that was compiled by Winner from information available in the public domain [118]. The data set contains the following six attributes (one numeric and five categorical): year of sighting, location, presence/absence of physical effects, multimedia, extraterrestrial contact, and involvement of abduction. To understand the typical/atypical characteristics of recorded UFO encounters across the years, we exclude the year information and include only the categorical dimensions in our analysis. The distances between members needed for the MDS term are obtained through the inner products computed using the "k0" kernel for categorical data [11]. We compute order statistics for categorical data by using set band depth (Section 6.2.1) and treating each member as a set of its attribute values from all dimensions [74].

Figure 6.7 shows two visualizations for the UFO data set. An interesting feature of this data set is the presence of several members with the highest depth value that are placed relatively far from each other. On inspection, we find that they are all sightings in the USA, which leads to the conclusion that a large number and variety of UFO sightings are recorded in the USA. Some sightings at other locations share many attributes of US sightings, but still cannot be representative of the data, as indicated by their low depth values, because they occurred at a different location (see region F). The projection bagplot (Figure 6.7b) is able to convey this well by adjusting the shape of the 50% band to exclude those points without a significant change in their positions, whereas the bivariate bagplot (Figure 6.7a) shows a contradiction where members supposed to be in the 100% band are seen within the 50% band. Another such contradiction is seen in region G, where outliers (shown in red) appear to be inside the 100% band.

Figure 6.7. Unidentified flying object (UFO) encounters data visualizations: (a) bivariate bagplot using MDS and (b) projection bagplot using OAP. Each circle represents an encounter. Deep blue, light blue, and red circle colors indicate association with the 50% band, the 100% band, and the outliers, respectively. The projection bagplot is able to show correct band associations for members, as opposed to the bivariate bagplot, which misplaces some encounter instances with respect to the bands.
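The chapter derives distances from the inner products of the "k0" kernel [11]; as an illustrative stand-in (not the k0 kernel itself), the sketch below uses a simple overlap kernel, the fraction of matching attributes, together with the standard kernel-induced distance d(x, y)² = k(x, x) + k(y, y) − 2k(x, y):

```python
import numpy as np

def overlap_kernel(a, b):
    """Fraction of matching attributes -- an illustrative stand-in for
    the k0 categorical kernel referenced in the text."""
    return sum(ai == bi for ai, bi in zip(a, b)) / len(a)

def kernel_distance_matrix(records):
    """Pairwise distances induced by an inner product (kernel):
    d(x, y)^2 = k(x, x) + k(y, y) - 2 k(x, y)."""
    n = len(records)
    K = np.array([[overlap_kernel(records[i], records[j]) for j in range(n)]
                  for i in range(n)])
    sq = np.maximum(K.diagonal()[:, None] + K.diagonal()[None, :] - 2 * K, 0.0)
    return np.sqrt(sq)
```

Any positive-definite categorical kernel can be dropped into the same identity, which is what allows the MDS term to operate on purely categorical records.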

6.4.4 Breast Cancer Data

Figure 6.8 displays our final data set, which consists of a collection of breast cancer patient attributes compiled at the University Medical Center at Ljubljana and made available by the UCI machine learning repository [59, 126]. This data set contains two patient classes, recurrence and nonrecurrence, with 85 and 201 instances per class, respectively. There are nine attributes per instance, such as age range, tumor size, and degree of malignancy. Analogous to the approach in Section 6.4.3, we use a categorical kernel to compute distances in the original space [11]. Since the data are bimodal with known class membership information, we use the proposed projection and visualization strategy for multimodal data with a separate monotonic field for each class (Section 6.3). The two medians are drawn as larger circles in the color of their respective class.

Figure 6.8. Breast cancer data visualizations: (a) bivariate bagplot using MDS and (b) projection bagplot using OAP. Each circle represents a set of patient attributes. The data contain two classes based on patient outcomes: recurrence (red) and nonrecurrence (blue). The recurrence class is seen to deviate from normal, whereas the nonrecurrence class presents a more coherent distribution based on the recorded attributes.

This is a case of bimodal data where the classes are not clearly separated. We notice from Figure 6.8 that the nonrecurrence class is somewhat coherent, while the recurrence class is more spread out in a ring-like distribution. Such a distribution is a case that highlights the distinction between data depth and data density. Although depth would be high at the geometric center of such a distribution, density would be low due to the absence of members near the center. From this distribution, we can infer that there must be a large variation among the member attributes of the recurrence class, with no good options among the members to be considered typical or most representative. As expected, the projection bagplot visualization has members of both classes arranged radially, in order of decreasing depth from their respective class medians.
The resulting structure makes it easier to spot several interesting outlying cliques. For example, instances near region H correspond to relatively younger individuals with moderate to high tumor size and malignancy and appear to be outliers with respect to both the recurrence and nonrecurrence classes. Another region of interest is J, where instances of older individuals with large tumor size and varying malignancy also appear as outliers with regard to both classes.
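The chapter computes distances between patient records with a categorical kernel [11]. As an illustration only, the sketch below assumes a simple overlap kernel (the fraction of attributes on which two instances agree); the kernel actually cited may differ. The distance in the implicit kernel space is then recovered from inner products alone via the standard identity d(p, q) = (⟨p, p⟩ + ⟨q, q⟩ − 2⟨p, q⟩)^(1/2); the function names here are hypothetical.

```python
def overlap_kernel(u, v):
    """Toy categorical kernel (an assumption, not the kernel of [11]):
    the fraction of attributes on which the two instances agree."""
    return sum(a == b for a, b in zip(u, v)) / len(u)

def kernel_space_distance(u, v, k=overlap_kernel):
    """Distance between instances in the implicit kernel space,
    recovered from inner products (kernel evaluations) alone."""
    return (k(u, u) + k(v, v) - 2.0 * k(u, v)) ** 0.5
```

With such a distance in hand, any distance-based depth or embedding step can operate on categorical records without ever constructing explicit kernel-space coordinates.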

6.5 Discussion

This chapter provides a solution to visualize high-dimensional data (d ≥ 3) with order statistics in a manner that is popular for visualizing lower dimensional data sets. Similar members are positioned close in the embedding, and central or typical members appear more toward the center than outlying or atypical members. To achieve such an embedding, one might consider simply augmenting the data with an additional dimension containing data-depth values. However, such an approach fails to take into account the anisotropic structure of the data, and pushes members with similar depth to be placed at a similar distance from the center regardless of direction, with members having higher depth values placed closer to the center. On the other hand, OAP allows more flexibility by allowing the rate of fall of depth values to vary smoothly across adjacent directions, as long as depth values drop monotonically along each direction. This flexibility helps better preserve pairwise distances, particularly in the frequently seen cases where points near the boundary along the minor axis are projected close to the median. For example, the proposed methods allow the point X in Figure 6.2 to remain close to the median while also indicating that it is more outlying than it appears in the MDS projection (compare Figure 6.2a to Figures 6.2d and 6.2e). Oftentimes, multidimensional data to be analyzed are heterogeneous, which means that they include both numerical and categorical dimensions. Our approach is able to handle such data since the method's only requirement is a way to compute distances and center-outward order statistics, which can be computed for such data [74]. Such data are often seen as input for machine learning tasks (e.g., classification or clustering), and the proposed methods can be valuable for understanding the structure of the kernel spaces where those tasks are performed.
A key feature of the proposed projection method (OAP) is its ability to integrate distances and order statistics from different spaces, as shown in Sections 6.4.3 and 6.4.4. We may also use order statistics from any other (possibly non-data-depth) method that may be appropriate for the application at hand [116]. In the case of multimodal data without known class membership information, OAP can lead to significant misrepresentation of distances if we compute data depth with respect to all points. This misrepresentation arises because standard data-depth methods, which measure geometric centrality, could assign high depth values to points in the region between the clusters, even if that region is sparsely populated; low density does not imply low centrality (see point M in Figure 6.9a). Furthermore, the geometric centers of individual clusters would be assigned low depth values if they do not lie near the geometric center of the entire distribution (see points N in Figure 6.9a). Such an assignment of depth, although technically correct, can lead to an embedding where cluster centers seem less prominent than surrounding points. One way to make cluster centers more prominent, which may

(a) (b)

Figure 6.9. Field overlay plots using OAP for a synthetically generated 3D multimodal data set with unknown class membership. Order statistics are computed using half-space depth (a) for all points in the data set together and (b) for each cluster separately after a clustering step using the k-means method.

be important in multimodal data, is to first cluster the data and then compute data depth, and monotonic fields, for each cluster separately (see Figure 6.9b).

The proposed projection method (OAP) requires manual adjustment of two parameters: ω_p, which controls the relative emphasis on the order structure with respect to preserving pairwise distances, and ℓ, which controls the lag between updates of the monotonic field. Values of ω_p that are too small cause convergence to the MDS layout, whereas values that are too large can cause unnecessary distortion. We find that values between 1 and 3 for ω_p provide a good balance. In the case of ℓ, values that are too small can cause an instability that prevents convergence. The instability arises due to the possibility of a (typically small) increase in overall energy accompanying the computation of the monotonic field. With a sufficiently large ℓ, this increase is more than compensated for after points adjust to the new field. On the other hand, too large a value of ℓ can delay convergence due to delayed spline updates. We use ℓ = 25 for all examples in this chapter. During the iterative optimization process, the computational cost of iterations involving an update of the monotonic field is O(n³), arising from the computation of the thin plate spline (Section 6.2.4). However, the majority of iterations do not involve field updates and incur a lower cost of O(n²) operations.

6.6 Future Work

An important area of application for our method is the visualization of data in kernel spaces (Sections 6.4.3 and 6.4.4). Although we use set band depth for kernel-based examples in this chapter to obtain order statistics, often the only option is to compute depth directly in the kernel space, for example, in the case of ensembles of structured data such as chemical compound graphs [24].
Since existing methods for computing depth are not suitable for high-dimensional kernel spaces, which is often the case with graph kernels [112], a method to compute depth in such spaces would expand the scope of data that could be visualized using the proposed method. Such a method would need to address the limitations of existing methods by being efficiently computable in high-dimensional spaces as well as having an inner-product-based formulation for operating in kernel spaces. Another exciting avenue for future work would be to extend the proposed approach to work with manifold-based dimensionality reduction techniques such as Isomap and t-SNE, which motivates the need to develop data-depth methods that are also able to operate with respect to manifolds. Automatic estimation of parameter values based on the data, to achieve an optimal balance between conveying distances and centrality, would also be useful. Finally, the projection bagplot visualization could complement other methods for set visualization, such as tabplot [106] and parallel coordinates [120], as part of an integrated, interactive system with linked views.

CHAPTER 7

ELLIPSE BAND DEPTH

Visualizing high-dimensional data is important in many domains. Consequently, high-dimensional data visualization continues to be an active area of research. Chapter 6 introduced a technique to visualize high-dimensional data in a way that preserves the order structure and distances in the data. This technique relies on quantifying the centrality of data members using data depth (see Section 2.2). Although several data-depth methods have been introduced for computing depth in high-dimensional spaces, current methods face challenges when dealing with data in high-dimensional spaces that are implicitly defined using inner product functions (also known as kernel functions). This chapter introduces a novel method, called ellipse band depth, to compute data depth in such high-dimensional kernel spaces. A kernel is a function that corresponds to an inner product in a kernel space (also referred to as a feature space). Kernel spaces are often high-dimensional and are related to the original data space by a nonlinear transformation, which does not necessarily have an explicit form. Kernel spaces are a popular concept in machine learning, where they are used for data classification tasks that would be difficult to perform in the original data space [71]. Kernel functions, being inner products, can be used to determine similarity or distances between data members in kernel spaces. This property is particularly important in the analysis of both structured and unstructured data, such as graphs and text, and has resulted in the development of a variety of specialized kernel functions for such data [65, 112]. Kernel functions are also used for visualizing data in kernel spaces using methods such as multidimensional scaling (MDS) and kernel principal component analysis (KPCA) [3, 57, 78].
These visualization approaches also suffer from the issue of inaccurate representation of ordering in low-dimensional embeddings of data, which was discussed in Chapter 6. Although the technique introduced in Chapter 6 is able to correctly convey the order structure of data in high-dimensional spaces using data depth, correctly visualizing data in kernel spaces remains challenging due to the lack of an effective method to compute data depth in such spaces. Kernel spaces are often very high-dimensional and may not necessarily specify information regarding coordinate axes; data in kernel spaces are typically expressed implicitly using kernel functions. These properties make it infeasible to compute depth for data in such spaces using state-of-the-art methods such as spatial depth [98], half-space depth [108], simplicial depth [62], and functional band depth [66] (see Chapter 6). Half-space depth can be formulated in terms of inner products. However, half-space depth is expensive to compute in high dimensions; its computational complexity is O(n^(d−1) log n), where n is the number of points in the ensemble and d is the dimension of the space of points. In very high-dimensional spaces, half-space depth is also challenged by the high separability of data, leading to the lowest possible depth value for most, if not all, data members. Although simplicial depth can also be formulated in terms of inner products, it too is challenged in high dimensions due to the large number of points (d + 1) needed to form simplices. The large number of data points required for computing a sufficient number of simplices, and the associated computations for effectively determining depth, make simplicial depth impractical for data in very high-dimensional spaces. Additionally, simplicial depth in high-dimensional spaces also suffers from the curse of dimensionality [53].
As the dimension of the space increases, the proportion of the volume of space contained within a simplex, relative to its bounding rectangle, drops rapidly. This reduction in volume proportion leads to an increased probability that a randomly chosen simplex band does not contain any other point, resulting in a loss of discriminative ability of simplicial depth in such high-dimensional spaces. This issue can be mitigated if the data are intrinsically limited to a lower dimensional subspace; however, that may not always be the case. On the other hand, although functional depth can handle data in very high-dimensional spaces, its reliance on axis-aligned partitions of space (axis-aligned rectangles) reduces its effectiveness in capturing the anisotropic structure of data. Furthermore, it can be challenging to compute functional depth in kernel spaces with no explicit information regarding data coordinates and coordinate axes. Although we can obtain coordinate information with respect to a subspace using kernel principal component analysis [95], such coordinates are highly sensitive to outliers in the data, in turn affecting the robustness of functional depth with regard to outliers. In the case of distance-based methods such as

L2 depth, although they can be computed efficiently in kernel spaces, they are limited in their ability to capture the structure of data (see Section 2.2.1). This chapter presents a novel method to compute center-outward order statistics in high-dimensional kernel spaces. The method overcomes the limitations of existing data-depth methods in kernel spaces: in particular, it effectively captures the structure of the data and can be computed using only inner products for data in kernel spaces. The proposed method, called ellipse band depth, is a band-based method to compute data depth and relies on ellipse-shaped bands, called ellipse bands, to capture the structure of data (see Section 2.2.1). The rest of this chapter includes a description of the method, results using synthetic and real data sets, and a discussion of the properties and limitations of the proposed method.

7.1 Ellipse Band Depth

This section introduces the notions of ellipse band and ellipse band depth, and describes a method to compute ellipse band depth for points in high-dimensional spaces using only inner product information. An ellipse in R² can be defined as a curve surrounding two focal points such that the sum of distances to the two focal points is constant for every point on the curve. To define the ellipse band in R², we consider two points, {x_a, x_b} ∈ R², and a scalar parameter e such that e > 1. We determine a point, x ∈ R², to be within an ellipse band, eB[·], formed by {x_a, x_b} if the following condition holds:

x ∈ eB[x_a, x_b] ⇐⇒ d(x, x_a) + d(x, x_b) ≤ e · d(x_a, x_b), (7.1)

where d(·, ·) denotes the Euclidean distance between two points. Here {x_a, x_b} are the foci of an ellipse and e determines its eccentricity. For higher dimensions (d > 2), this definition of the ellipse band remains the same. In such cases, a point x is considered to be inside a band if it falls within an ellipse described by (7.1) on any 2D subspace containing {x_a, x_b}. In order to compute ellipse band depth using only inner product information, we simply need to compute the distances in (7.1) from inner products using the following equation:
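As a concrete check of this definition, the membership test in (7.1) can be sketched for points given as coordinate tuples (the function name is illustrative, not from the dissertation):

```python
import math

def in_ellipse_band(x, xa, xb, e, d=math.dist):
    """Ellipse-band condition (7.1): x lies in eB[xa, xb] iff the sum of
    its distances to the two foci is at most e times the focal distance."""
    return d(x, xa) + d(x, xb) <= e * d(xa, xb)
```

For example, the midpoint of the two foci always satisfies the condition for any e > 1, while a point far from both foci does not.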

d(p, q) = (⟨p, p⟩ + ⟨q, q⟩ − 2⟨p, q⟩)^(1/2), (7.2)

where ⟨·, ·⟩ denotes the inner product. The formulation of ellipse band depth from the ellipse band proceeds in a manner similar to the formulation of simplicial depth and functional band depth from simplicial and functional bands, respectively (see Figure 7.1). Let X be a probability distribution over
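Equation (7.2) needs only the three inner products ⟨p, p⟩, ⟨q, q⟩, and ⟨p, q⟩, so it applies unchanged when points are known only through kernel evaluations. A minimal sketch (the function name is my own):

```python
def kernel_distance(kpp, kqq, kpq):
    """Distance (7.2) between p and q recovered from inner products:
    d(p, q) = sqrt(<p,p> + <q,q> - 2<p,q>)."""
    return (kpp + kqq - 2.0 * kpq) ** 0.5
```

With the linear kernel ⟨p, q⟩ = pᵀq this reduces to the ordinary Euclidean distance; with a graph or categorical kernel it yields distances in the implicit feature space.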

points in R². Let {x_a, x_b} be two data points chosen independently from X, and e any value such that e > 1. Then, the ellipse band depth of any point x ∈ R^d with respect to X is the probability of x falling inside the ellipse band described by {x_a, x_b} and e, which can be stated as

eBD(x; X) = Prob(x ∈ eB[x_a, x_b]). (7.3)

Given a set of points X = {x_1, ..., x_n} ∈ R^d that are sampled from a random variable X ∈ R^d, ellipse band depth can be empirically computed as follows:

eBD(x; X) = (1/n) Σ_{x_a, x_b ∈ X} Λ(x, x_a, x_b), (7.4)

where Λ is an indicator variable such that

(a) (b) (c)

Figure 7.1. Bands used by three different notions of data depth. (a) Simplicial, (b) functional, and (c) ellipse bands are formed by a set of three points in R² that are marked in red. Note that three ellipse bands are shown, one for each pair of points in the set.

Λ(x, x_a, x_b) = 1 if x ∈ eB[x_a, x_b], and 0 if x ∉ eB[x_a, x_b]. (7.5)
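Putting (7.4) and (7.5) together, the empirical depth is the fraction of point pairs whose ellipse band contains x. The sketch below normalizes by the number of pairs so that values lie in [0, 1], matching the normalization used in Section 7.2; the function name is illustrative:

```python
from itertools import combinations
from math import dist

def ellipse_band_depth(x, X, e=1.1):
    """Empirical ellipse band depth per (7.4)-(7.5): the fraction of point
    pairs {xa, xb} drawn from X whose ellipse band eB[xa, xb] contains x.
    Normalized by the number of pairs so the result lies in [0, 1]."""
    pairs = list(combinations(X, 2))
    inside = sum(
        1
        for xa, xb in pairs
        if dist(x, xa) + dist(x, xb) <= e * dist(xa, xb)
    )
    return inside / len(pairs)
```

As expected of a center-outward statistic, a point near the middle of a sample receives a higher value than a distant outlier, which gets depth 0.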

7.2 Results

This section presents results of ellipse band depth for synthetic and real data sets, along with results using simplicial depth and L2 depth for comparison. The L2 distance in kernel space is computed using (7.2). All data-depth values are normalized to lie in the range [0, 1].

7.2.1 Synthetic 2D Data

The first result uses an angularly symmetric bimodal distribution, consisting of two anisotropic normal distributions whose means are separated vertically (see Figure 7.2a). Figure 7.2b and Figure 7.2c show heatmaps of simplicial and ellipse band depth, respectively, for points on a grid with respect to points sampled from the bimodal distribution. We notice that both notions of depth provide visually similar results for this data set, with the highest depth values evaluated between the two modes.

(a) (b)

(c)

Figure 7.2. Comparison of simplicial depth and ellipse band depth with synthetic 2D data. (a) An angularly symmetric point distribution. (b) Simplicial depth heatmap. (c) Ellipse band depth heatmap.

7.2.2 Synthetic 3D Data

The second result comes from projecting points sampled from a 3D, anisotropic normal distribution onto a 2D plane. Figure 7.3a and Figure 7.3b show the 2D projections of the data using MDS. Circle sizes indicate the simplicial depth and ellipse band depth, respectively, of points in the original 3D space. We can make a few observations. First, we see that the MDS layout does not preserve the order structure with respect to either simplicial depth or ellipse band depth. For example, in Figure 7.3a and Figure 7.3b, point 18 (marked by a red arrow) seems to be surrounded by points that are more central in the original space. Second, we observe a similarity between the data-depth values assigned to points by simplicial depth and ellipse band depth with respect to the relative magnitude of centrality values. Figure 7.3c and Figure 7.3d show the field overlay plots (described in Chapter 6) for the data using simplicial and ellipse band depth, respectively. Here we notice a similarity in the monotonic fields arising from the similarity in the corresponding data-depth values. This similarity indicates that ellipse band depth could be used in place of simplicial depth for depth-based visualizations such as field overlay plots, which makes ellipse band depth particularly attractive for visualizing data in kernel spaces, where computing simplicial depth could be problematic for reasons discussed earlier in this chapter. Figure 7.3d, Figure 7.3e, and Figure 7.3f show field overlay plots for the data using ellipse band depth with different values of the parameter e. The lower e value in Figure 7.3e causes a loss in discriminative ability, which can be inferred from the similar depth values of the most central members. On the other hand, the higher e value in Figure 7.3f also causes a loss in discriminative ability, evident from the generally more similar depth values and the smoother monotonic field in the background.

7.2.3 Chemicals in Kernel Space

The final result displays the similarity between a set of chemical compounds from the MUTAG data set [24]. This set contains 63 compounds that have been identified as nonmutagenic. We use a graph kernel, called a propagation kernel, to determine the similarity between the molecules based on their chemical structure [80]. The similarity

(a) (b)

(c) (d)

(e) (f)

Figure 7.3. Visualizations of a 3D anisotropic multivariate normal distribution. (a) MDS projection with circle size indicating simplicial depth, (b) MDS projection with circle size indicating ellipse band depth, (c) field overlay plot from Chapter 6 using simplicial depth, and field overlay plots using ellipse band depth with (d) e = 1.1, (e) e = 1.01, and (f) e = 1.5.

values between chemicals are used to obtain a 2D MDS embedding of the data set, as seen in

Figure 7.4a and Figure 7.4b. Circle sizes indicate L2 distance depth and ellipse band depth, respectively, in the kernel space. Figure 7.4c and Figure 7.4d show the field overlay plots for the chemical data set using L2 distance depth and ellipse band depth values, respectively. We notice that the contours of the monotonic field with ellipse band depth have a more irregular structure when compared to the contours with L2 distance depth, which are smooth and coherent with the MDS embedding. Such a difference in the regularity of the monotonic fields indicates that ellipse band depth is able to capture information about the structure of

(a) (b)

(c) (d)

Figure 7.4. Similarity of chemical molecules in the MUTAG data set. (a) MDS visualization with circle size encoding distance depth, (b) MDS visualization with circle size encoding ellipse band depth, (c) field overlay plot using distance depth, and (d) field overlay plot using ellipse band depth.

the data in the kernel space that is missed by L2 distance depth.

7.3 Discussion

The proposed ellipse-band-depth method has several properties that make it an attractive addition to the repertoire of data-depth methods. First, it is easily computed using inner products. Second, it is effective in very high-dimensional spaces since the ellipse band always spans volume in all dimensions. Third, the ellipse band can better capture the correlation in the data than the rectangular band used for computing functional depth. This enhanced ability to capture shape arises because the ellipse band can tightly fit the data regardless of the positions of the points with regard to the coordinate axes. Finally, ellipse band depth is simple and fast to compute. Despite the advantages of the proposed method, areas remain that require further investigation. The desirable properties of data depth have not yet been established for ellipse band depth and remain an active area of research (see Section 2.2). Although a rigorous proof of the monotonicity property has not been established yet, empirical results indicate that the property holds for ellipse band depth (see Figure 7.5).

(a) (b)

(c)

Figure 7.5. Simplicial and ellipse band depth along center-outward rays. Empirical results for both methods indicate a monotonic drop in depth value, barring some sample noise in the case of simplicial depth. (a) Direction of rays traveling away from the center, (b) simplicial depth, and (c) ellipse band depth along the rays.

Another issue that is still unresolved is the estimation of the parameter e, which determines the eccentricity or shape of the ellipse band. Too small a value would lead to thin ellipses that could fail to contain any points in high dimensions, whereas too large a value would lead to a loss of structure of the ellipse band; both cases lead to a loss in the ability of ellipse band depth to discriminate between points with regard to their centrality in the data set. There are a variety of possible approaches for determining an appropriate e for a particular data set. For example, one approach would be to use statistics such as the average nearest neighbor distance in the data set to ensure that the ellipse bands are wide enough to have a high probability of containing nearby points. Another approach involves sampling an additional point for each ellipse band and adjusting e such that this point touches the boundary of the ellipse band. Although these methods to determine e could affect the behavior of the proposed ellipse band depth method, analysis of such effects is a topic for future work and out of the scope of the present chapter.

CHAPTER 8

DISCUSSION AND FUTURE WORK

This dissertation tackles the problem of visualizing several different kinds of data types that can be broadly classified into either ensemble data or graphs. The common theme underlying the methods introduced in this dissertation is the quantification and visualization of order structure in data. In both kinds of data, ensembles and graphs, we consider data to be composed of a collection of individual members, which can be complex entities such as 3D shapes in the case of ensemble data, or nodes in the case of a graph. The methods in this dissertation rely on a family of descriptive statistical methods, known as data depth, to determine the centrality of members in the data. Based on data depth, the most central members are considered to be most representative of the data, whereas the least central members are considered to be most atypical or outlying with respect to the data set. Typical and outlying members, as well as the variability among all members, are considered important features in many applications. The visualization methods introduced in this dissertation highlight such features for different types of data. These data types include ensembles of 3D shapes and paths on a graph, which are relevant for applications such as understanding the typical structure of brain anatomy across populations in medical imaging (Chapter 3) and identifying anomalous packet paths in Internet routing (Chapter 4). The use of data depth in visualizations to summarize data is not new; in fact, the well-known Tukey boxplot is an example of such a visualization that was introduced decades ago and continues to be widely used. The novelty of the work in this dissertation lies in introducing more effective methods to determine data depth, and in expanding the scope of data types and applications that can be tackled using depth-based visualizations.
A key contribution of this dissertation with regard to novel data-depth methods is the development of the path-band-depth method to determine center-outward order statistics for path ensembles on a graph (Chapter 4). The path-band-depth method is able to capture correlation patterns that are missed by existing methods for determining center-outwardness for paths. Another novel data-depth method introduced is ellipse band depth, which has several advantages over existing methods for determining the centrality of points in high-dimensional spaces (Chapter 7). Apart from novel data-depth methods, another key contribution of this dissertation is the development of visualization techniques for various data types that convey key features of data by taking advantage of existing and proposed data-depth methods. Chapter 4, Chapter 5, and Chapter 6 introduce such visualization techniques for ensembles of paths on graphs, nodes on graphs, and point ensembles in high-dimensional spaces, respectively. These visualizations highlight the key members in the data while also providing additional context in terms of variability (paths and high-dimensional points) or relationships between members (nodes on a graph). Finally, this dissertation also demonstrates the utility of data-depth-based visualizations in various real applications. Chapter 3 shows the advantages of depth-based visualization, specifically a 3D extension of the contour boxplot, for evaluating the alignment of 3D brain images in the context of medical imaging. Chapter 4, Chapter 5, and Chapter 6 include examples from various real application domains such as transport planning, social networks, and health care. Although the methods proposed in this dissertation have several advantages, they also have certain limitations. The rest of this chapter discusses those limitations, possible options to overcome them, and a few additional directions for future work.

8.1 Discussion

We now look at a few limitations that we need to be aware of while using the proposed methods and planning future directions of work. Although 3D contour boxplots are effective in summarizing ensembles of 3D shapes, they suffer from the known problem of occlusion in 3D, which may lead to misunderstanding about the global nature of the analysis (determination of key members is done using 3D volume analysis, although the results are visible for only a single cut plane at any instance). One possible solution to mitigate this issue is to make use of transparency/alpha channels to render the parts of contour boxplots that do not intersect the cut plane, including the clipped portion, so that the entire structure of the inner members and bands is always visible. This additional context could be a cue that the key features are precomputed and not dependent on the orientation of the cut planes. Occlusion is also an issue for the 2D embedding methods introduced in this dissertation when dealing with large numbers of data members (Chapter 5 and Chapter 6). Furthermore, the computational cost of computing the thin plate spline for determining the embedding is O(n³), where n is the number of data members. One option to address this issue is to compute the spline using a randomly sampled subset of data members. In the case of path boxplots, the computational cost of dealing with ensembles with a large number of paths is high. Specifically, the cost of the path alignment step for determining path band depth increases exponentially with the number of paths (see Chapter 4). In order to address this problem, we use an approximation that aligns subsets of paths rather than aligning all paths at once. A limitation of this method is that a theoretical error bound on the quality of the approximate path band depth values is still unknown. In the case of ellipse band depth, more work is needed in order to demonstrate the utility of the method in real-world applications.
In particular, it remains to be shown whether ellipse band depth exhibits desirable properties of data-depth methods such as monotonicity, maximum at center, and zero at infinity (see Section 2.2). Also, the effectiveness of ellipse band depth depends on careful selection of the parameter e, which determines the eccentricity of the ellipse bands. Further work is required to understand the impact of various methods of automatically determining e based on the data. Additionally, in the current formulation of ellipse band depth, a large value of e would lead to ellipse bands that protrude far beyond the region between the points forming the band, particularly along the direction of the major axis. This protrusion of bands can lead to nonzero depth values for points outside the convex hull of the distribution, which is a characteristic seen in less discriminative measures of data depth such as distance-based and functional depth. A possible way to improve the discriminative ability of ellipse band depth would be to scale the ellipse band so that the points determining it fall on the boundary of the ellipse (where the boundary intersects the major axis) rather than inside the ellipse (at the foci).

8.2 Future Work

There are many interesting avenues for further work that would complement the work in this dissertation. In Chapter 3, we saw that the 3D contour boxplot is useful for evaluating the alignment of brain MRI images in brain atlases. Integration of the 3D contour boxplot pipeline with existing atlas construction software could be valuable for researchers who work with atlases. In the case of anisotropic radial layout (Chapter 5) and order-aware projection (OAP) (Chapter 6), theoretical guarantees with regard to the convergence of the optimization process, as well as more computationally efficient approaches for arriving at a solution, along the lines of the SMACOF algorithm [58], would be useful when handling real data sets.
Even for path band depth (Chapter 4) and ellipse band depth (Chapter 7), more results on theoretical guarantees would be useful. For example, an interesting direction for theoretical work includes determining error bounds on approximate solutions and establishing desirable properties of data depth in the case of path band depth and ellipse band depth, respectively. In addition to extensions of the proposed methods, a few new research directions also seem interesting in the context of this dissertation. One particularly important type of data that is not tackled in this dissertation is ensembles of aligned graphs. In such ensembles, all graphs share a common vertex set, and members of the ensemble are independent samples from some stochastic, generative process (e.g., probability distribution) on the edges and the edge weights. Although graph ensembles can be visualized in terms of similarity between graphs by using methods introduced in Chapter 6 and Chapter 7, oftentimes it is necessary to understand graph ensembles in the context of the connectivity structure of graphs. Such an understanding of graph ensembles is particularly important in the domain of neuroscience. Neuroscientists use resting-state functional MRI (fMRI) data to infer correlations in blood flow between brain regions that are represented as functional graphs. In functional graphs of the brain, vertices represent individual brain regions and edges connect regions with a high correlation in blood flow patterns [87, 104]. Understanding ensembles of such graphs is important for comparing general brain structures across different population groups, such as healthy individuals and Alzheimer's patients [5]. Existing methods to visualize aligned graph ensembles include the heatmap [14] and the cell histogram [121], which fail to capture correlations across edges (see Figure 8.1).


Figure 8.1. Existing methods to visualize aligned graph ensembles. (a) Adjacency matrix heatmap and (b) cell histogram. In both visualizations, each cell summarizes the weights on an edge (between a specific pair of nodes) across an entire ensemble of graphs. Furthermore, the encodings in each cell are determined independently of the edges corresponding to other cells.
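The limitation of such cell-wise encodings can be made concrete with a small synthetic sketch (hypothetical data, not taken from the figure): two ensembles of two-edge graphs whose per-edge weight distributions are identical, so their heatmaps are indistinguishable, even though the edge weights are perfectly correlated in one ensemble and independent in the other:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000  # graphs per ensemble; each graph has two 0/1 edge weights

# Ensemble A: the two edge weights are always equal (perfectly correlated).
z = rng.integers(0, 2, size=n)
ens_a = np.stack([z, z], axis=1)

# Ensemble B: the two edge weights are drawn independently, same 0.5 marginals.
ens_b = rng.integers(0, 2, size=(n, 2))

# A cell-wise summary (e.g., the per-edge mean shown in a heatmap) is nearly
# identical for the two ensembles ...
print(ens_a.mean(axis=0), ens_b.mean(axis=0))

# ... but the cross-edge correlation, which the cell-wise view discards,
# separates them completely.
print(np.corrcoef(ens_a.T)[0, 1], np.corrcoef(ens_b.T)[0, 1])
```

This is precisely the kind of cross-edge structure that a summary operating on whole graphs, rather than on individual cells, could retain.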

Some other data types that are of interest with regard to future work on data-depth-based visualization include ensembles of trees [6] and 3D scalar fields. Another important direction for future work is to determine data depth on manifolds. The existing geometric data-depth measures that are used in Chapter 6 for preserving centrality in lower-dimensional embeddings do not consider any manifold structure and could cause significant distortion of distances in the embedding if combined with manifold-based dimensionality reduction methods. Manifold-based measures of data depth could complement manifold-based dimensionality reduction methods such as Isomap [105] and t-SNE [68] to preserve order structure with regard to the manifold.

APPENDIX

NAME ASSOCIATIONS FOR NODES IN VISUALIZATIONS IN CHAPTER 5

A.1 Name Associations for Nodes in the Terrorist Network (Figure 5.4)

Node ID  Name | Node ID  Name
0  Jamal Zougam | 35  Mohamed Oulad Akcha
1  Mohamed Bekkali | 36  Rachid Oulad Akcha
2  Mohamed Chaoui | 37  Mamoun Darkaanli
3  Vinay Kholy | 38  Fouad El Morabit Anghar
4  Suresh Kumar | 39  Abdeluahid Berrak
5  Mohamed Chedadi | 40  Said Berrak
6  Imad Eddin Barakat | 41  Waanid Altaraki Almasri
7  Abdelai Benyaich | 42  Abddenabi Koujma
8  Abu Abderrahame | 43  Otman El Gnaut
9  Omar Dhegayes | 44  Abdelilah el Fouad
10  Amer Aii | 45  Mohamad Bard Ddin Akkab
11  Abu Musad Alsakaoui | 46  Abu Zubaidah
12  Mohamed Atta | 47  Sanel Sjekirika
13  Rami Binalshibh | 48  Parlindumgan Siregar
14  Mohamed Belfatmi | 49  El Hemir
15  Said Bahaji | 50  Anuar Asri Rifaat
16  Al Amrous | 51  Rachid Adli
17  Galeb Kalaje | 52  Ghasoub Al Albrash
18  Abderrahim Zbakh | 53  Said Chedadi
19  Farid Oulad Ali | 54  Mohamed Bahaiah
20  Jos Emilio Sure | 55  Taysir Alouny
21  Khalid Ouled Akcha | 56  OM Othman Abu Qutada
22  Rafa Zuher | 57  Shakur
23  Naima Oulad Akcha | 58  Driss Chebli
24  Abdelkarim el Mejjati | 59  Abdul Fatal
25  Abdelhalak Bentasser | 60  Mohamed El Egipcio
26  Anwar Adnan Ahmad | 61  Nasredine Boushoa
27  Basel Ghayoun | 62  Semaan Gaby Eid
28  Faisal Alluch | 63  Emilio Llamo
29  S B Abdelmajid Fakhet | 64  Ivan Granados
30  Jamal Ahmidan | 65  Raul Gonales Pere
31  Said Ahmidan | 66  El Gitanillo
32  Hamid Ahmidan | 67  Mouta Almallah
33  Mustafa Ahmidan | 68  Mohamed Almallah
34  Antonio Toro | 69  Yousef Hichman

A.2 Name Associations for Nodes in Les Miserables Network (Figure 5.5)

Node ID  Name | Node ID  Name | Node ID  Name
0  Myriel | 26  Cosette | 52  MmePontmercy
1  Napoleon | 27  Javert | 53  MlleVaubois
2  MlleBaptistine | 28  Fauchelevent | 54  LtGillenormand
3  MmeMagloire | 29  Bamatabois | 55  Marius
4  CountessDeLo | 30  Perpetue | 56  BaronessT
5  Geborand | 31  Simplice | 57  Mabeuf
6  Champtercier | 32  Scaufflaire | 58  Enjolras
7  Cravatte | 33  Woman | 59  Combeferre
8  Count | 34  Judge | 60  Prouvaire
9  OldMan | 35  Champmathieu | 61  Feuilly
10  Labarre | 36  Brevet | 62  Courfeyrac
11  Valjean | 37  Chenildieu | 63  Bahorel
12  Marguerite | 38  Cochepaille | 64  Bossuet
13  MmeDeR | 39  Pontmercy | 65  Joly
14  Isabeau | 40  Boulatruelle | 66  Grantaire
15  Gervais | 41  Eponine | 67  MotherPlutarch
16  Tholomyes | 42  Anelma | 68  Gueulemer
17  Listolier | 43  Woman | 69  Babet
18  Fameuil | 44  MotherInnocent | 70  Claquesous
19  Blacheville | 45  Gribier | 71  Montparnasse
20  Favourite | 46  Jondrette | 72  Toussaint
21  Dahlia | 47  MmeBurgon | 73  Child
22  Zephine | 48  Gavroche | 74  Child
23  Fantine | 49  Gillenormand | 75  Brujon
24  MmeThenardier | 50  Magnon | 76  MmeHucheloup
25  Thenardier | 51  MlleGillenormand

REFERENCES

[1] Nektar++, 2015. http://www.nektar.info.

[2] P. K. Agarwal, R. B. Avraham, H. Kaplan, and M. Sharir, Computing the discrete Fréchet distance in subquadratic time, SIAM J. Comput., 2 (2014), pp. 156–167.

[3] D. K. Agrafiotis, D. N. Rassokhin, and V. S. Lobanov, Multidimensional scaling and visualization of large molecular similarity tables, J. Comput. Chem., 22 (2001), pp. 488– 500.

[4] A. Alexander-Bloch, R. Lambiotte, B. Roberts, J. Giedd, N. Gogtay, and E. Bullmore, The discovery of population differences in network community structure: New methods and applications to brain functional networks in schizophrenia, Neuroimage, 59 (2012), pp. 3889–3900.

[5] B. Alper, B. Bach, N. Henry Riche, T. Isenberg, and J.-D. Fekete, Weighted graph comparison techniques for brain connectivity analysis, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2013, pp. 483–492.

[6] N. Amenta, M. Datar, A. Dirksen, M. de Bruijne, A. Feragen, X. Ge, J. H. Pedersen, M. Howard, M. Owen, J. Petersen, et al., Quantification and visualization of variation in anatomical trees, in Research in Shape Modeling, K. Leonard and S. Tari, eds., Springer, Cham, 2015, pp. 57–79.

[7] F. Anscombe, Graphs in statistical analysis, Amer. Statist., 27 (1973), pp. 17–21.

[8] M. S. Apaydin, D. L. Brutlag, C. Guestrin, D. Hsu, J.-C. Latombe, and C. Varma, Stochastic roadmap simulation: An efficient representation and algorithm for analyzing molecular motion, J. Comput. Biol., 10 (2003), pp. 257–281.

[9] C. Bachmaier, A radial adaptation of the sugiyama framework for visualizing hierarchical information, IEEE Trans. Vis. Comput. Graphics, 13 (2007), pp. 583–594.

[10] B. Baingana and G. B. Giannakis, Embedding graphs under centrality constraints for network visualization, arXiv preprint arXiv:1401.4408, (2014).

[11] L. A. Belanche Muñoz and M. Villegas, Kernel functions for categorical variables with application to problems in the life sciences, in Artificial Intelligence Research and Development: Proceedings of the 16th International Conference of the Catalan Association of Artificial Intelligence, 2013, pp. 171–180.

[12] F. L. Bookstein, Principal warps: Thin-plate splines and the decomposition of deformations, IEEE Trans. Pattern Anal. Mach. Intell., 11 (1989), pp. 567–585.

[13] I. Borg and P. J. Groenen, Modern multidimensional scaling: Theory and applications, Springer-Verlag, New York, 2005.

[14] G. Bounova and O. de Weck, Overview of metrics and their correlation patterns for multiple-metric topology analysis on heterogeneous graph ensembles, Phys. Rev. E, 85 (2012), p. 016117.

[15] U. Brandes, P. Kenis, and D. Wagner, Communicating centrality in policy network drawings, IEEE Trans. Vis. Comput. Graphics, 9 (2003), pp. 241–253.

[16] U. Brandes and C. Pich, More flexible radial layout, in International Symposium on Graph Drawing, Springer, 2009, pp. 107–118.

[17] U. Brandes and D. Wagner, Analysis and visualization of social networks, Graph drawing software, (2004), pp. 321–340.

[18] K. Brodlie, R. Allendes Osorio, and A. J. Lopes, A review of uncertainty in data visualization, in Expanding the Frontiers of Visual Analytics and Visualization, R. Earnshaw, D. Kasik, J. Vince, and P. C. Wong, eds., Springer, London, 2012, pp. 81–109.

[19] K. Butler, T. Farley, P. Mcdaniel, and J. Rexford, A survey of BGP security issues and solutions, Proc. IEEE, 98 (2010), pp. 100–122.

[20] H. Carrillo and D. Lipman, The multiple sequence alignment problem in biology, SIAM J. Appl. Math., 48 (1988), pp. 1073–1082.

[21] Y. Chen, X. Dang, H. Peng, and H. L. Bart, Outlier detection with the kernelized spatial depth function, IEEE Trans. Pattern Anal. Mach. Intell., 31 (2009), pp. 288–305.

[22] W. S. Cleveland and R. McGill, Graphical perception: Theory, experimentation, and application to the development of graphical methods, J. Amer. Statist. Assoc., 79 (1984), pp. 531–554.

[23] J. Cox, D. House, and M. Lindell, Visualizing uncertainty in predicted hurricane tracks, Int. J. Uncertain. Quantif., 3 (2013).

[24] A. K. Debnath, R. L. Lopez de Compadre, G. Debnath, A. J. Shusterman, and C. Hansch, Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity, J. Med. Chem., 34 (1991), pp. 786–797.

[25] H. Dette, N. Neumeyer, K. F. Pilz, et al., A simple nonparametric estimator of a strictly monotone regression function, Bernoulli, 12 (2006), pp. 469–490.

[26] H. Dette and R. Scheder, Strictly monotone and smooth nonparametric regression for two or more variables, Canad. J. Statist., 34 (2006), pp. 535–561.

[27] T. Dwyer, Scalable, versatile and simple constrained graph layout, Comput. Graphics Forum, 28 (2009), pp. 991–998.

[28] T. Dwyer, Y. Koren, and K. Marriott, Ipsep-cola: An incremental procedure for separation constraint layout of graphs, IEEE Trans. Vis. Comput. Graphics, 12 (2006), pp. 821–828.

[29] T. Dwyer, Y. Koren, and K. Marriott, Constrained graph layout by stress majorization and gradient projection, Discrete Math., 309 (2009), pp. 1895–1908.

[30] R. Dyckerhoff, K. Mosler, and G. Koshevoy, Zonoid data depth: Theory and computation, Physica-Verlag, Heidelberg, 1996, pp. 235–240.

[31] R. Dyckerhoff and P. Mozharovskyi, Exact computation of the halfspace depth, Comput. Statist. Data Anal., 98 (2016), pp. 19–30.

[32] T. Eiter and H. Mannila, Computing discrete Fréchet distance, Tech. Report CD-TR 94/64, Technical University of Vienna, 1994.

[33] M. R. Evans, D. Oliver, S. Shekhar, and F. Harvey, Summarizing trajectories into k-primary corridors: A summary of results, in Proceedings of the 20th International Conference on Advances in Geographic Information Systems, 2012, pp. 454–457.

[34] M. R. Evans, D. Oliver, S. Shekhar, and F. Harvey, Fast and exact network trajectory similarity computation: A case-study on bicycle corridor planning, in Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing, ACM, 2013, p. 9.

[35] P. A. Forero and G. B. Giannakis, Sparsity-exploiting robust multidimensional scaling, IEEE Trans. Signal Process., 60 (2012), pp. 4118–4134.

[36] L. C. Freeman, A set of measures of centrality based on betweenness, Sociometry, 40 (1977), pp. 35–41.

[37] L. C. Freeman, Centrality in social networks conceptual clarification, Soc. Networks, 1 (1978), pp. 215–239.

[38] T. M. J. Fruchterman and E. M. Reingold, Graph drawing by force-directed placement, Softw. Pract. Exper., 21 (1991), pp. 1129–1164.

[39] E. R. Gansner, Y. Koren, and S. North, Graph drawing by stress majorization, in Graph Drawing, J. Pach, ed., Springer, Berlin Heidelberg, 2005, pp. 239–250.

[40] M. G. Genton, C. Johnson, K. Potter, G. Stenchikov, and Y. Sun, Surface boxplots, Stat., 3 (2014), pp. 1–11.

[41] H. Gibson, J. Faith, and P. Vickers, A survey of two-dimensional graph layout techniques for information visualisation, Inform. Vis., 12 (2013), pp. 324–357.

[42] T. Gneiting and A. E. Raftery, Weather forecasting with ensemble methods, Science, 310 (2005), pp. 248–249.

[43] M. Haklay and P. Weber, OpenStreetMap: User-generated street maps, IEEE Pervasive Comput., 7 (2008), pp. 12–18.

[44] B. Hayes, Connecting the dots: Can the tools of graph theory and social-network studies unravel the next big plot?, Amer. Scientist, 94 (2006), pp. 400–404.

[45] R. R. He, H. X. Liu, and A. L. Kornhauser, Temporal and spatial variability of travel time, Center for Traffic Simulation Studies. Paper UCI-ITS-TS-02, 14 (2002).

[46] M. Hua and J. Pei, Probabilistic path queries in road networks: Traffic uncertainty aware path selection, in Proceedings of the 13th International Conference on Extending Database Technology, 2010, pp. 347–358.

[47] R. J. Hyndman, Computing and graphing highest density regions, Amer. Statist., 50 (1996), pp. 120–126.

[48] R. J. Hyndman and H. L. Shang, Rainbow plots, bagplots, and boxplots for functional data, J. Comput. Graph. Statist., 19 (2010), pp. 29–45.

[49] C. R. Jack, M. A. Bernstein, N. C. Fox, P. Thompson, G. Alexander, D. Harvey, B. Borowski, P. J. Britson, J. L. Whitwell, C. Ward, A. M. Dale, J. P. Felmlee, J. L. Gunter, D. L. Hill, R. Killiany, N. Schuff, S. Fox-Bosetti, C. Lin, C. Studholme, C. S. DeCarli, G. Krueger, H. A. Ward, G. J. Metzger, K. T. Scott, R. Mallozzi, D. Blezek, J. Levy, J. P. Debbins, A. S. Fleisher, M. Albert, R. Green, G. Bartzokis, G. Glover, J. Mugler, and M. W. Weiner, The Alzheimer’s Disease Neuroimaging Initiative (ADNI): MRI methods, JMRI, 27 (2008), pp. 685–691.

[50] S. Joshi, B. Davis, M. Jomier, and G. Gerig, Unbiased diffeomorphic atlas construction for computational anatomy, Neuroimage, 23 (2004), pp. 151–160.

[51] W. Just, Computational complexity of multiple sequence alignment with SP-score, J. Comput. Biol., 8 (2001), pp. 615–623.

[52] T. Kamada and S. Kawai, An algorithm for drawing general undirected graphs, Inform. Process. Lett., 31 (1989), pp. 7–15.

[53] E. Keogh and A. Mueen, Curse of Dimensionality, Springer US, Boston, MA, 2017, pp. 314–315.

[54] A. Kharrat, I. S. Popa, K. Zeitouni, and S. Faiz, Clustering algorithm for network constraint trajectories, in Headway in Spatial Data Handling, A. Ruas and C. Gold, eds., Springer, Berlin Heidelberg, 2008, pp. 631–647.

[55] D. E. Knuth, The Stanford GraphBase: A Platform for Combinatorial Computing, ACM, New York, NY, USA, 1993.

[56] Y. LeCun and C. Cortes, MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/, 2010.

[57] Y.-J. Lee and S.-Y. Huang, Reduced support vector machines: A statistical theory, IEEE Trans. Neural Netw., 18 (2007), pp. 1–13.

[58] J. de Leeuw, Applications of convex analysis to multidimensional scaling, in Recent Developments in Statistics, J. R. Barra, F. Brodeau, G. Romier, and B. van Cutsem, eds., North Holland Publishing Company, 1977, pp. 133–146.

[59] M. Lichman, UCI machine learning repository. http://archive.ics.uci.edu/ml, 2013.

[60] L. Liu, A. P. Boone, I. T. Ruginski, L. Padilla, M. Hegarty, S. H. Creem-Regehr, W. B. Thompson, C. Yuksel, and D. H. House, Uncertainty visualization by representative sampling from prediction ensembles, IEEE Trans. Vis. Comput. Graphics, 23 (2017), pp. 2165–2178.

[61] L. Liu, M. Mirzargar, R. M. Kirby, R. Whitaker, and D. H. House, Visualizing time-specific hurricane predictions, with uncertainty, from storm path ensembles, Comput. Graphics Forum, 34 (2015), pp. 371–380.

[62] R. Y. Liu, On a notion of data depth based on random simplices, Ann. Statist., 18 (1990), pp. 405–414.

[63] R. Y. Liu, J. M. Parelius, K. Singh, et al., Multivariate analysis by data depth: Descriptive statistics, graphics and inference (with discussion and a rejoinder by Liu and Singh), Ann. Statist., 27 (1999), pp. 783–858.

[64] S. Liu, D. Maljovec, B. Wang, P.-T. Bremer, and V. Pascucci, Visualizing high-dimensional data: Advances in the past decade, IEEE Trans. Vis. Comput. Graphics, 23 (2017), pp. 1249–1268.

[65] H. Lodhi, C. Saunders, J. Shawe-Taylor, N. Cristianini, and C. Watkins, Text classification using string kernels, J. Mach. Learn. Res., 2 (2002), pp. 419–444.

[66] S. López-Pintado and J. Romo, On the concept of depth for functional data, J. Amer. Statist. Assoc., 104 (2009), pp. 718–734.

[67] S. López-Pintado, Y. Sun, J. Lin, and M. Genton, Simplicial band depth for multivariate functional data, Adv. Data Anal. Classif., 8 (2014), pp. 1–18.

[68] L. van der Maaten and G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res., 9 (2008), pp. 2579–2605.

[69] S. M. Mahmoud, A. Lotfi, and C. Langensiepen, User activities outlier detection system using principal component analysis and fuzzy rule-based system, in Proceedings of the Fifth International Conference on Pervasive Technologies Related to Assistive Environments, ACM, 2012, p. 26.

[70] V. E. McGee, The multidimensional analysis of elastic distances, British J. Math. Statist. Psychol., 19 (1966), pp. 181–196.

[71] P. Milanfar, A tour of modern image filtering: New insights and methods, both practical and theoretical, IEEE Signal Process. Mag., 30 (2013), pp. 106–128.

[72] C.-J. Minard, Des Tableaux graphiques et des cartes figuratives, par M. Minard, Thunot, Paris, 1862.

[73] M. Mirzargar, R. Whitaker, and R. Kirby, Curve boxplot: Generalization of boxplot for ensembles of curves, IEEE Trans. Vis. Comput. Graphics, 20 (2014), pp. 2654–2663.

[74] M. Mirzargar, R. T. Whitaker, and R. M. Kirby, Exploration of heterogeneous data using robust similarity, arXiv preprint arXiv:1710.02862, (2017).

[75] M. Molnár and R. Marie, Stability oriented routing in mobile ad hoc networks based on simple automatons, in Mobile Ad-Hoc Networks: Protocol Design, X. Wang, ed., InTech, Jan. 2011, pp. 363–390.

[76] R. Moorhead, C. Johnson, T. Munzner, H. Pfister, P. Rheingans, and T. S. Yoo, Visualization research challenges: A report summary, Comput. Sci. Eng., 8 (2006), pp. 66–73.

[77] K. Mosler, Depth statistics, in Robustness and Complex Data Structures, C. Becker, R. Fried, and S. Kuhnt, eds., Springer, Berlin, Heidelberg, 2013, pp. 17–34.

[78] A. Nasser, D. Hamad, and C. Nasr, Kernel PCA as a visualization tools for clusters identifications, in International Conference on Artificial Neural Networks, Springer, 2006, pp. 321–329.

[79] S. B. Needleman and C. D. Wunsch, A general method applicable to the search for similarities in the amino acid sequence of two proteins, J. Molecular Biol., 48 (1970), pp. 443–453.

[80] M. Neumann, R. Garnett, C. Bauckhage, and K. Kersting, Propagation kernels: Efficient graph kernels from propagated information, Machine Learning, 102 (2016), p. 209.

[81] I. M. Pelayo, Geodesic Convexity in Graphs (Springerbriefs in Mathematics), Springer- Verlag, New York, 2014.

[82] T. Pfaffelmoser, M. Reitinger, and R. Westermann, Visualizing the positional and geometrical variability of isosurfaces in uncertain scalar fields, in Comput. Graphics Forum, vol. 30, Wiley Online Library, 2011, pp. 951–960.

[83] W. Playfair, The Statistical Breviary; Shewing the Resources of Every State and Kingdom in Europe, J. Wallis, London, 1801.

[84] K. Pöthkow, B. Weber, and H.-C. Hege, Probabilistic marching cubes, Comput. Graphics Forum, 30 (2011), pp. 931–940.

[85] K. Potter, P. Rosen, and C. R. Johnson, From quantification to visualization: A taxonomy of uncertainty visualization approaches, in Uncertainty Quantification in Scientific Computing, A. M. Dienstfrey and R. F. Boisvert, eds., Springer, Berlin Heidelberg, 2012, pp. 226–249.

[86] K. Potter, A. Wilson, V. Pascucci, D. Williams, C. Doutriaux, P.-T. Bremer, and C. Johnson, Ensemble-vis: A framework for the statistical visualization of ensemble data, in Data Mining Workshops, 2009. ICDMW ’09. IEEE International Conference on, 2009, pp. 233–240.

[87] J. D. Power, A. L. Cohen, S. M. Nelson, G. S. Wig, K. A. Barnes, J. A. Church, A. C. Vogel, T. O. Laumann, F. M. Miezin, B. L. Schlaggar, et al., Functional network organization of the human brain, Neuron, 72 (2011), pp. 665–678.

[88] P. S. Quinan and M. Meyer, Visually comparing weather features in forecasts, IEEE Trans. Vis. Comput. Graphics, 22 (2016), pp. 389–398.

[89] M. Raj, M. Mirzargar, J. S. Preston, R. M. Kirby, and R. T. Whitaker, Evalu- ating shape alignment via ensemble visualization, IEEE Comput. Graphics Applications, 36 (2016), pp. 60–71.

[90] M. Raj, M. Mirzargar, R. Ricci, R. M. Kirby, and R. T. Whitaker, Path boxplots: A method for characterizing uncertainty in path ensembles on a graph, J. Comput. Graph. Statist., 26 (2017), pp. 243–252.

[91] M. Raj and R. T. Whitaker, Anisotropic radial layout for visualizing centrality and structure in graphs, in Graph Drawing and Network Visualization, F. Frati and K.-L. Ma, eds., Springer International Publishing, Cham, 2018, pp. 351–364.

[92] J. A. Rodríguez, The March 11th terrorist network: In its weakness lies its strength, in 25th International Sunbelt Social Network Conference, Los Angeles, 2005.

[93] P. J. Rousseeuw, I. Ruts, and J. W. Tukey, The bagplot: A bivariate boxplot, Amer. Statist., 53 (1999), pp. 382–387.

[94] G. Sabidussi, The centrality index of a graph, Psychometrika, 31 (1966), pp. 581–603.

[95] B. Schölkopf, A. Smola, and K.-R. Müller, Kernel principal component analysis, in International Conference on Artificial Neural Networks, Springer, 1997, pp. 583–588.

[96] F. Schreiber, T. Dwyer, K. Marriott, and M. Wybrow, A generic algorithm for layout of biological networks, BMC Bioinform., 10 (2009), p. 375.

[97] SCI Institute, 2015. AtlasWerks: An open-source (BSD license) software package for medical image atlas generation. Scientific Computing and Imaging Institute (SCI), Download from: http://www.sci.utah.edu/software/atlaswerks.html.

[98] R. Serfling, A depth function and a scale curve based on spatial quantiles, in Statistical Data Analysis Based on the L1-Norm and Related Methods, Y. Dodge, ed., Birkhäuser, Basel, 2002, pp. 25–38.

[99] H. L. Shang, R. J. Hyndman, and M. H. L. Shang, Package ’rainbow’. https://cran.r-project.org/web/packages/rainbow/index.html, 2016.

[100] I. Spence and S. Lewandowsky, Robust multidimensional scaling, Psychometrika, 54 (1989), pp. 501–513.

[101] P. A. Stott and C. E. Forest, Ensemble climate predictions using climate models and observational constraints, Philos. Trans. Roy. Soc. A, 365 (2007), pp. 2029–2052.

[102] Y. Sun and M. G. Genton, Functional boxplots, J. Comput. Graph. Statist., 20 (2011), pp. 316–334.

[103] Y. Sun and M. G. Genton, Adjusted functional boxplots for spatio-temporal data visual- ization and outlier detection, Environmetrics, 23 (2012), pp. 54–64.

[104] S. J. Teipel, A. L. Bokde, T. Meindl, E. Amaro, J. Soldner, M. F. Reiser, S. C. Herpertz, H.-J. Moller,¨ and H. Hampel, White matter microstructure underlying default mode network connectivity in the human brain, Neuroimage, 49 (2010), pp. 2021– 2032.

[105] J. B. Tenenbaum, V. De Silva, and J. C. Langford, A global geometric framework for nonlinear dimensionality reduction, Science, 290 (2000), pp. 2319–2323.

[106] M. Tennekes, E. de Jonge, P. J. Daas, et al., Visualizing and inspecting large datasets with tableplots, J. Data Sci., 11 (2013), pp. 43–58.

[107] J. W. Tukey, Mathematics and the picturing of data, in Proceedings of the International Congress of Mathematicians, R. James, ed., Vancouver, 1975.

[108] J. W. Tukey, Exploratory data analysis, Behavioral Science: Quantitative Methods, Addison-Wesley, Reading, Massachusetts, 1977.

[109] F. Van Ham and M. Wattenberg, Centrality based visualization of small world graphs, Comput. Graphics Forum, 27 (2008), pp. 975–982.

[110] R. Van Liere and W. De Leeuw, Graphsplatting: Visualizing graphs as continuous fields, IEEE Trans. Vis. Comput. Graphics, 9 (2003), pp. 206–212.

[111] O. Vencálek, Depth-based classification for multivariate data, Austrian J. Statist., 46 (2017), pp. 117–128.

[112] S. V. N. Vishwanathan, N. N. Schraudolph, R. Kondor, and K. M. Borgwardt, Graph kernels, J. Mach. Learn. Res., 11 (2010), pp. 1201–1242.

[113] T. von Landesberger, A. Kuijper, T. Schreck, J. Kohlhammer, J. van Wijk, J.-D. Fekete, and D. W. Fellner, Visual analysis of large graphs: State-of-the-art and future research challenges, Comput. Graphics Forum, 30 (2011), pp. 1719–1749.

[114] R. T. Whitaker, Reducing aliasing artifacts in iso-surfaces of binary volumes, in Proceed- ings of the 2000 IEEE Symposium on Volume Visualization, ACM, 2000, pp. 23–32.

[115] R. T. Whitaker, M. Mirzargar, and R. M. Kirby, Contour boxplots: A method for characterizing uncertainty in feature sets from simulation ensembles, IEEE Trans. Vis. Comput. Graphics, 19 (2013), pp. 2713–2722.

[116] L. Wilkinson, Visualizing big data outliers through distributed aggregation, IEEE Trans. Vis. Comput. Graphics, 24 (2017), pp. 256–266.

[117] C. H. K. Williamson, Vortex dynamics in the cylinder wake, Ann. Rev. Fluid Mech., 28 (1996), pp. 477–539.

[118] L. Winner, UFO encounters. http://www.stat.ufl.edu/~winner/datasets.html, 2004.

[119] S. Wright, The method of path coefficients, Ann. Math. Statist., 5 (1934), pp. 161–215.

[120] V. Yang, H. Nguyen, N. Matloff, and Y. Xie, Top-frequency parallel coordinates plots, CoRR, abs/1709.00665 (2017).

[121] J. S. Yi, N. Elmqvist, and S. Lee, Timematrix: Analyzing temporal social networks using interactive matrix-based visualizations, Int. J. Human-Comput. Interaction, 26 (2010), pp. 1031–1051.

[122] W. W. Zachary, An information flow model for conflict and fission in small groups, J. Anthro. Res., 33 (1977), pp. 452–473.

[123] X. Zhang, C. L. Bajaj, B. Kwon, T. J. Dolinsky, J. E. Nielsen, and N. A. Baker, Application of new multiresolution methods for the comparison of biomolecular electrostatic properties in the absence of global structural similarity, Multiscale Model. Simul., 5 (2006), pp. 1196–1213.

[124] Y. Zuo and R. Serfling, General notions of statistical depth function, Ann. Statist., 28 (2000), pp. 461–482.

[125] K. A. Zweig et al., Network analysis literacy, MMB & DFT 2014, (2014), p. 3.

[126] M. Zwitter and M. Soklic, Breast cancer data. https://archive.ics.uci.edu/ml/datasets/Breast+Cancer, 1988.