Pedvis: a Structured, Space-Efficient Technique for Pedigree Visualization

PEDVIS: A STRUCTURED, SPACE- EFFICIENT TECHNIQUE FOR PEDIGREE VISUALIZATION by Claurissa Tuttle A thesis submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Master of Science i n Graphics and Visualization School of Computing The University of Utah August 2011 Copyright © Claurissa Tuttle 2011 All Rights Reserved The University of Utah Graduate School STATEMENT OF THESIS APPROVAL The thesis of Claurissa Tuttle has been approved by the following supervisory committee members: Cláudio T. Silva , Chair 02/07/11 Date Approved Juliana Freire , Member 02/07/11 Date Approved Peter Shirley , Member 02/07/11 Date Approved and by Al Davis , Chair of the Department of School of Computing and by Charles A. Wight, Dean of The Graduate School. ABSTRACT Public genealogical databases are becoming increasingly populated with historical data and records of the current population's ancestors. As this increasing amount of available information is used to link individuals to their ancestors, the resulting trees become deeper and denser, which justifies the need for using organized, space-efficient layouts to display the data. Existing layouts are often only able to show a small subset of the data at a time. As a result, it is easy to become lost when navigating through the data or to lose sight of the overall tree structure. On the contrary, leaving space for unknown ancestors allows one to better understand the tree's structure, but leaving this space becomes expensive and allows fewer generations to be displayed at a time. In this work, we propose that the H-tree based layout be used in genealogical software to display ancestral trees. We will show that this layout presents an increase in the number of displayable generations, provides a nicely arranged, symmetrical, intuitive and organized fractal structure, increases the user's ability to understand and navigate through the data, and accounts for the visualization requirements necessary for displaying such trees. Finally, user-study results indicate potential for user acceptance of the new layout. CONTENTS ABSTRACT .............................................................................................. iii LIST OF FIGURES ...................................................................................vi LIST OF TABLES.................................................................................. viii ACKNOWLEDGEMENTS.......................................................................ix CHAPTERS 1. INTRODUCTION ............................................................................1 2. BACKGROUND ...............................................................................4 2.1 Overview of Tree Layouts ........................................................4 2.2 Graph Drawing Approaches......................................................7 2.3 Methods of Displaying Genealogical Data................................8 3. THE H-TREE AND DISPLAY SPACE ANALYSIS....................11 3.1 The Layout..............................................................................11 3.2 Display Space Analysis...........................................................14 4. SOFTWARE PROTOTYPE ..........................................................18 4.1 Data Set Organization and H-tree Functionality......................18 4.2 User Interaction ......................................................................20 5. CASE STUDIES..............................................................................24 5.1 Duplications, Patterns, and Anomalies....................................24 5.2 Layout Comparisons ...............................................................27 v 6. USER EVALUATION....................................................................31 6.1 Comparative Tasks......................................................................31 6.2 Survey Analysis and Results .......................................................32 7. DISCUSSION AND LIMITATIONS.............................................36 8. CONCLUSION ...............................................................................38 APPENDIX ...............................................................................................41 REFERENCES .........................................................................................42 LIST OF FIGURES 1. An illustration of the quadrant-based organization of the H-tree layout, which enables an efficient display space usage, making visualization of a considerable number of generations possible while still allowing precise identification of the family branches (bottom left frame shows 18 generations related to the maternal grandfather of the central individual). Unknown ancestor regions (void spaces in the top left frame) and duplicate nodes (blue nodes in the top right frame) can also be identified clearly......................................................................................... 2 2. The parents are alternately either horizontally or vertically aligned with the child. .................................................................................................................... 12 3. The H-tree layout for 1 generation (left) to 5 generations (right). Generations 1 through 3 are labeled to show how the nodes are moved out as another generation is added............................................................................................... 13 4. Left: Each of the 4 grandparents are placed in the center of their respective areas and their ancestors are placed around them, but still within that area. Middle: An H-tree overlay shows how the individuals are organized within each section. Each of the 16 great-grandparents of the root is placed in the center of each of the 16 sections shown. This shows the organization of the data into natural family clusters. Right: A traditional binary tree. The 16 colored sections in the middle figure correspond to the 16 colored sections of this figure......................... 14 5. An 11-generation view. Empty space is an indication of unknown information. The nodes that are colored in cyan are duplicates. Blue and pink nodes represent unique individuals. ................................................................................ 19 6. The root is changed from Daniel to Hannah, adding the individuals along the path between them to the roadmap........................................................................ 22 7. The quadrant-based organization in connection with node highlighting allows users to easily identify common ancestors. A path between the two occurrences of the individual has been drawn in....................................................................... 26 vii 8. A typical tree visualization. Some areas of the graph are more populated than others, which areas reflect the importance placed on record keeping by royalty, religious leaders, political leaders, and others. ...................................................... 26 9. A representation of genealogy that is contained in the Bible. The vast amount of duplicate individuals is a result of the intermarriage that exists in Adam and Eve’s time. ........................................................................................................... 27 10. Comparing node-link diagram (left), fan chart (center), and H-tree (right) when displaying eight (top) and eleven generations (bottom)......................................... 29 11. This figure illustrates the H-tree’s inherited self-similarity property, as the magnified subgraph demonstrates the same shape and structure as the graph. ....... 29 LIST OF TABLES 1. Space Requirements (For k generations of a complete binary tree and an assumed 1 × 1 unit square required for each individual)........................................ 16 2. The number of displayable generations k of a complete tree given the length l available for the most constrained dimension........................................................ 17 3. Data Set Statistics................................................................................................. 25 4. Preferences for comparison task 1. Most and least preferred are assigned when one or two layouts are preferred over the other(s) and neutral is assigned only when there is a first, second, and third preference................................................. 33 5. Preferences for comparison task 2. ....................................................................... 33 ACKNOWLEDGMENTS This material is reprinted with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of the University of Utah’s products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from IEEE by writing to [email protected]. By choosing to view this material, you agree to all provisions of the copyright laws protecting it. I would like to thank Luis Gustavo Nonato and Cláudio T. Sliva for the work their efforts in helping this work be successful. I would also like to thank those who participated in the user evaluation of the PedVis as well as the members of the Institutional Review Board for reviewing and permitting the evaluation. This work was funded by the National Science Foundation (grants 04-615, IIS-0905385, IIS-0844546, CNS-0855167,

Load more