Using Sankey Diagram to Analyze Drug Pipeline Tanmay Khole, Bristol-Myers Squibb, Berkeley Heights NJ, USA


Paper DV03

ABSTRACT

Sankey diagrams are a specific type of flow diagram in which the width of each arrow is proportional to the flow quantity. They put visual emphasis on the major transfers or flows within a system and help locate the dominant contributions to an overall flow. This paper focuses on the drug pipeline of a sponsor and uses data from clinicaltrials.gov to analyze the number of clinical trials a sponsor has with respect to conditions, interventions, and phases. The results are visualized as Sankey diagrams that show the weight a sponsor has given to a drug or a condition, based on the phases of its clinical trials. A drug pipeline gives an idea of the future of a company, and this paper examines some aspects of it in depth using Sankey diagrams.

INTRODUCTION

This paper analyzes data from clinicaltrials.gov for a few selected clinical trial sponsors and uses that information to create Sankey diagrams. A Sankey diagram is a visualization that depicts a flow from one set of values to another. The things being connected are called nodes and the connections are called links. Sankey diagrams are best used to show a many-to-many mapping between two domains or multiple paths through a set of stages. Data from clinicaltrials.gov suit this well: a sponsor’s drug pipeline can be analyzed to see which clinical conditions or interventions the sponsor focuses on at each stage of clinical trials. Data mapping, data analysis, and data visualization techniques are used to create the Sankey diagrams displayed in this paper. Phase 1 clinical trials are excluded from data analysis and data visualization so that the flow of clinical trials in Phases 2 through 4 is easier to follow.
Data were obtained in CSV format from clinicaltrials.gov using the advanced search option, searching on the sponsor field only. Analysis is performed on trials with status "Active, not recruiting", "Available", "Enrolling by invitation", "Not yet recruiting", or "Recruiting".

SANKEY DIAGRAM FOR CLINICALTRIALS.GOV DATA

Data obtained from clinicaltrials.gov in CSV format contain one record per trial (see figure 1). To be used for a Sankey diagram, the data need to be processed in the following steps:

• Data Mapping
• Data Analysis
• Data Visualization

Figure 1: Data obtained from clinicaltrials.gov and imported into SAS® dataset.

The sponsors listed in table 1 are considered in this paper for data analysis, and a Sankey diagram is created for the on-going clinical trials of each sponsor. Clinical trials with status "Active, not recruiting", "Available", "Enrolling by invitation", "Not yet recruiting", or "Recruiting" are considered on-going. Only clinical trials in which the sponsor is the lead sponsor are selected.

            Sponsor                 Distinct On-going        Data Extraction
                                    Clinical Trials Count    Date
Sponsor 1   Bristol-Myers Squibb    250                      22NOV2019
Sponsor 2   Janssen                 126
Sponsor 3   Merck & Co.             173                      22JAN2020
Sponsor 4   Amgen                   56
Sponsor 5   Bayer                   56

Table 1: List of Sponsors

DATA MAPPING

Data mapping is essential for connecting links and nodes in Sankey diagrams. Clinical trials data obtained from clinicaltrials.gov contain multiple names for the same condition (e.g., “NSCLC”, “Non-Small Cell Lung Cancer”, or “Carcinoma, Non-Small-Cell Lung”; figure 2) and multiple names for the same drug/biologic compound (e.g., "Nivolumab", "Opdivo", "BMS-936558", "ONO-4538"; figure 3). Hence it is important to assign each condition and intervention to the correct category. As there are numerous conditions, they are mapped into high-level categories such as Solid Tumors, Cardiovascular, and Leukemia & Lymphoma.
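The name mapping described above can be illustrated with a small lookup sketch. This is not the paper's SAS implementation; it is a minimal Python illustration in which the table layout and the function names `normalize` and `condition_category` are assumptions, and the single category assignment shown is only an example. The synonym lists are the ones quoted in the text.

```python
# Hypothetical lookup tables: the synonyms come from the examples in the text,
# but the dictionary layout and all function names are assumptions.
CONDITION_SYNONYMS = {
    "nsclc": "Non-Small Cell Lung Cancer",
    "non-small cell lung cancer": "Non-Small Cell Lung Cancer",
    "carcinoma, non-small-cell lung": "Non-Small Cell Lung Cancer",
}

INTERVENTION_SYNONYMS = {
    "nivolumab": "Nivolumab",
    "opdivo": "Nivolumab",
    "bms-936558": "Nivolumab",
    "ono-4538": "Nivolumab",
}

# Illustrative assignment of a canonical condition to a high-level category.
CONDITION_CATEGORY = {
    "Non-Small Cell Lung Cancer": "Solid Tumors",
}


def normalize(raw, synonyms):
    """Return the canonical name for a raw value, or None if it is unmapped."""
    key = " ".join(raw.split()).lower()  # collapse whitespace, ignore case
    return synonyms.get(key)


def condition_category(raw):
    """Map a raw condition string to its high-level category, if known."""
    canonical = normalize(raw, CONDITION_SYNONYMS)
    if canonical is None:
        return None
    return CONDITION_CATEGORY.get(canonical)


print(normalize("Opdivo", INTERVENTION_SYNONYMS))
print(condition_category("Carcinoma, Non-Small-Cell Lung"))
```

Returning None for unmapped values, instead of guessing, leaves those records visible for the manual review that the mapping step relies on.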
See figure 4 for an example of mapping different conditions to a high-level category.

Figure 2: Mapping different names of same condition into single category.

Figure 3: Mapping different names of same compound/intervention into single category.

Figure 4: Mapping different conditions to high-level category.

The mapping rules below are applied before the data analysis step. They are designed to identify the sponsor's focus with regard to clinical conditions and interventions.

• Clinical trials with multiple phases are mapped to the higher phase
• Clinical trials with multiple clinical conditions are mapped to each condition
• Clinical trials with multiple interventions are mapped to each intervention of the respective sponsor

Example 1: Clinical trial NCT03331198, titled “Study Evaluating Safety and Efficacy of JCAR017 in Subjects With Relapsed or Refractory Chronic Lymphocytic Leukemia (CLL) or Small Lymphocytic Lymphoma (SLL)”, has a trial design covering phase 1 and phase 2. As per the mapping rules, it is mapped to Phase 2 only. This trial also lists multiple clinical conditions (Chronic Lymphocytic Leukemia and Small Lymphocytic Lymphoma) and is mapped to each of them as per the mapping rules.

Example 2: Clinical trial NCT04088500, titled “A Study of Combination Nivolumab and Ipilimumab Retreatment in Patients With Advanced Renal Cell Carcinoma”, has multiple interventions: Nivolumab and Ipilimumab. As per the mapping rules, this trial is mapped to each intervention listed.
Example 3: Clinical trial NCT03036098, titled “Study of Nivolumab in Combination With Ipilimumab or Standard of Care Chemotherapy Compared to the Standard of Care Chemotherapy Alone in Treatment of Patients With Untreated Inoperable or Metastatic Urothelial Cancer”, has multiple interventions: nivolumab, ipilimumab, gemcitabine, cisplatin, and carboplatin. Only the first two are the sponsor’s compounds, hence this trial is mapped to two interventions: nivolumab and ipilimumab.

Data mapping for this paper is performed by creating flags/identifiers for each condition and intervention listed in the respective sponsor’s clinical trials data. Each sponsor listed in table 1 has unique compounds, and mapping each compound/intervention requires close inspection of the data. Data obtained from clinicaltrials.gov are one record per trial (horizontal data format) and need to be transformed into a vertical data format, as shown in figure 5, using the flags created for each condition category and intervention.

Figure 5: Horizontal data mapped and transformed into vertical data format

DATA ANALYSIS

Data analysis is performed by counting the objects in each of the categories to be displayed in the Sankey diagram. The categories are used as nodes, and the counts of objects determine the width of the links between the selected categories. In this paper, data analysis is performed by counting clinical trials with respect to sponsor, conditions, interventions, and phases. This step is performed after data mapping to ensure that links and nodes are connected correctly. The SAS® macro %sankey_nodes is used for data analysis; reference code can be found in the appendix.

    %sankey_nodes(inds  = ct_gov
                 ,outds = sankey_out
                 ,nodes = %str(sponsor|conditions|interventions|phases)
                 ,cond  = );

%sankey_nodes calculates the number of objects, in this case the number of clinical trials. The mapped data are passed in the “inds” macro parameter.
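To make the shape of the mapped input more concrete, the mapping rules and the horizontal-to-vertical transformation can be sketched as follows. This is a hedged Python illustration, not the paper's SAS code: the record layout, the function names `highest_phase` and `to_vertical`, and the sample trial (including its identifier) are all hypothetical, and producing one row per condition and intervention combination is an assumption about what the vertical format in figure 5 contains.

```python
from itertools import product


def highest_phase(phases):
    """Rule 1: a trial spanning several phases is mapped to the highest one."""
    return max(phases)


def to_vertical(trial, sponsor_compounds):
    """Turn one horizontal trial record into vertical rows.

    Rule 2: one row per listed condition.
    Rule 3: one row per intervention that belongs to the sponsor.
    """
    phase = highest_phase(trial["phases"])
    interventions = [
        name for name in trial["interventions"] if name in sponsor_compounds
    ]
    return [
        {
            "nct_id": trial["nct_id"],
            "sponsor": trial["sponsor"],
            "condition": condition,
            "intervention": intervention,
            "phase": f"Phase {phase}",
        }
        for condition, intervention in product(trial["conditions"], interventions)
    ]


# Hypothetical trial record, not taken from clinicaltrials.gov.
trial = {
    "nct_id": "NCT00000000",
    "sponsor": "Sponsor X",
    "phases": [1, 2],
    "conditions": ["Condition A", "Condition B"],
    "interventions": ["Drug A", "Drug B", "Comparator C"],
}

rows = to_vertical(trial, sponsor_compounds={"Drug A", "Drug B"})
for row in rows:
    print(row)
```

In this sketch the comparator is dropped because it is not one of the sponsor's compounds, which mirrors Example 3, and every row carries the trial identifier so that later counts can be made on distinct trials.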
The nodes (categories) to be displayed in the Sankey diagram are listed in the “nodes” macro parameter, and any condition that needs to be applied can be given in the “cond” macro parameter. The macro creates the macro variable &sankeydata. and an output dataset containing the data for the Sankey diagram. The macro variable is used in the data visualization step to create the Sankey diagram.

DATA VISUALIZATION

The data visualization step is performed using the SAS® macro %sankey2html and D3.js, a JavaScript library. The output is created in HTML format.

    %sankey2html(indata   = %nrbquote(&sankeydata.)
                ,outfl    = %sysfunc(pathname(outg,f))/sankey.html
                ,width    = 2100
                ,height   = 700
                ,flow_num = );

The %sankey2html macro reads the macro variable &sankeydata. created by %sankey_nodes and embeds it in an HTML file. The output file location and HTML filename are specified in the “outfl” macro parameter. The “width” and “height” parameters set the width and height of the Sankey diagram. The “flow_num” parameter is used to display labels only for links above a specified number.

The Sankey diagrams display the flow of the number of clinical trials from SPONSOR → CONDITIONS → INTERVENTIONS → PHASES, which are also used as the nodes of the Sankey diagrams displayed in this paper. The thickness of a link signifies the number of clinical trials connecting the two nodes.

SANKEY DIAGRAM 1

Sponsor: Bristol-Myers Squibb
Node 1: Sponsor; Node 2: Clinical Conditions; Node 3: Interventions; Node 4: Clinical Trial Phases

The number of on-going clinical trials for each node is displayed in parentheses. Clinical trials with multiple phases are counted toward the higher phase; clinical trials with multiple clinical conditions are counted toward each condition; clinical trials with multiple interventions are counted toward each intervention.

Note: The data analysis and data visualization performed in this paper are not an official representation of any sponsor’s pipeline; they are based on the data acquired from clinicaltrials.gov.
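The way link thickness follows from the counting rules can be sketched in a few lines. The Python below is an illustration under stated assumptions, not the logic of %sankey_nodes, whose code is in the paper's appendix: the function name `count_links`, the row layout, and the sample rows are hypothetical, and node names are assumed to be unique across stages. Each link between adjacent stages is weighted by the number of distinct trials that connect its two nodes.

```python
from collections import defaultdict

# Order of the stages, following SPONSOR -> CONDITIONS -> INTERVENTIONS -> PHASES.
STAGES = ["sponsor", "condition", "intervention", "phase"]


def count_links(rows, stages=STAGES):
    """Count distinct trials on every link between adjacent stages."""
    trials_per_link = defaultdict(set)
    for row in rows:
        for source, target in zip(stages, stages[1:]):
            trials_per_link[(row[source], row[target])].add(row["nct_id"])
    return {link: len(ids) for link, ids in trials_per_link.items()}


# Hypothetical vertical rows: trial T1 has two sponsor interventions, T2 has one.
rows = [
    {"nct_id": "T1", "sponsor": "Sponsor X", "condition": "Solid Tumors",
     "intervention": "Drug A", "phase": "Phase 3"},
    {"nct_id": "T1", "sponsor": "Sponsor X", "condition": "Solid Tumors",
     "intervention": "Drug B", "phase": "Phase 3"},
    {"nct_id": "T2", "sponsor": "Sponsor X", "condition": "Solid Tumors",
     "intervention": "Drug A", "phase": "Phase 2"},
]

links = count_links(rows)
for (source, target), value in sorted(links.items()):
    print(f"{source} -> {target}: {value}")
```

Collecting trial identifiers in sets keeps a trial that appears in several vertical rows from being counted more than once on the same link, while still letting it contribute to each condition and each intervention it is mapped to.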
SANKEY DIAGRAM 2

Sponsor: Janssen
Node 1: Sponsor; Node 2: Clinical Conditions; Node 3: Interventions; Node 4: Clinical Trial Phases

The number of on-going clinical trials for each node is displayed in parentheses. Clinical trials with multiple phases are counted toward the higher