Learn to Create a Heat Map in Python with Data from NCHS (2018)

© 2021 SAGE Publications, Ltd. All Rights Reserved. This PDF has been generated from SAGE Research Methods Datasets.

SAGE Research Methods: Data Visualization

How-to Guide for Python

Introduction

In this guide, you will learn how to create a heat map using the Python programming language. Readers are provided links to the example dataset and encouraged to replicate this example. An additional practice example is suggested at the end of this guide.

The example assumes you have downloaded the relevant data files to a folder on your computer and that you are using the JupyterLab environment. The relevant code should, however, work in other environments too.

Contents

1. Heat Map
2. An Example in Python: Monthly Deaths by Influenza and Pneumonia in California, 2014–2018
  2.1. The Python Procedure
    2.1.1. JupyterLab Notebooks
    2.1.2. Testing Out the Programming Environment
    2.1.3. Creating Our Notebook, Importing Necessary Modules
    2.1.4. Reading In and Formatting Our Data
    2.1.5. Plotting the Chart
    2.1.6. Saving the Plot
    2.1.7. Alternative Version With Classified Color Scale
  2.2. Exploring the Output
3. Your Turn

1. Heat Map

Heat maps are visualizations that use a two-dimensional arrangement of colored rectangles, where the row corresponds to one data dimension and the column to another. Each rectangular block represents an intersection between the two plotted series and is colored by the values for observations in that particular intersection. Heat maps use position and color hue to encode values.
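To make the color encoding concrete, the following minimal sketch maps each value in a small grid to an RGB color by linear interpolation between a light and a dark endpoint. It uses only the Python standard library. The function name value_to_color, the toy grid values, and the two endpoint colors are illustrative assumptions, not part of the guide; plotting libraries perform this step internally.

```python
def value_to_color(value, vmin, vmax, low=(255, 255, 178), high=(189, 0, 38)):
    """Linearly interpolate an RGB color for `value` between `low` and `high`."""
    if vmax == vmin:
        # A constant grid has no range to spread colors over.
        return low
    # Position of the value within the data range, clamped to [0, 1].
    t = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low, high))


# Toy grid (made-up numbers): rows are one data dimension, columns another.
grid = [
    [3, 8, 15],
    [1, 12, 20],
]

vmin = min(min(row) for row in grid)
vmax = max(max(row) for row in grid)

# One color per cell: position encodes the two dimensions, color the value.
colors = [[value_to_color(v, vmin, vmax) for v in row] for row in grid]

for row in colors:
    print(row)
```

The smallest value receives the light endpoint color and the largest the dark one; every other cell falls proportionally in between.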
One or both axes of the heat map can be quantitative, in which case the blocks represent values falling into different bins or value ranges. The axes can also be qualitative, thus representing different categories. Heat maps are also often used to show time series. A similar tabular visualization that shows intersections of data series with qualitative values (e.g., true/false; low, medium, high) is usually called a matrix.

2. An Example in Python: Monthly Deaths by Influenza and Pneumonia in California, 2014–2018

Figure 1 shows a heat map of monthly influenza and pneumonia deaths in California. The heat map gives a good impression of the seasonal variation in deaths attributed to these causes and highlights the months in which particularly many lives were lost. A continuous linear yellow–orange–red color scheme was used, with darker hues corresponding to larger death counts. Data values are displayed for each individual cell.

The horizontal axis is labeled month and lists months from January to December. The vertical axis ranges from 2014 to 2018, in increments of 1. The data are tabulated below.

Year  January  February  March  April    May   June   July  August  September  October  November  December
2014      986       693    521    475    446    382    392     364        361      373       447       503
2015      918       730    577    521    452    420    379     358        385      417       455       573
2016      679       680    749    521    420    360    421     330        376      403       417       635
2017      985       722    617    511    466    403    360     372        368      399       396       762
2018    1,674       817    771    554    427    389    359     324        319      358       441       485

As the count increases, the intensity of the color shade increases. Text under the map reads, “Source: NCHS, 2018.”

Figure 1.
Heat Map of Monthly Influenza and Pneumonia Deaths in California, 2014–2018

2.1 The Python Procedure

Python is a general-purpose programming language that supports several programming paradigms and has a very clear syntax. It is a versatile tool, particularly for data manipulation and visualization. As Python was originally created as a learning tool, it is also reasonably easy to read for beginners. For more information, visit https://www.python.org/.

You can write Python code with any plain text editor, such as Sublime Text or Visual Studio Code. For the purposes of this tutorial, you do not need to install anything additional, as we will be using a web-based programming environment.

Note: This tutorial uses Python 3. Many online articles about Python programming and other sources discuss Python 2, which differs from Python 3 slightly but in important ways. Although code written in Python 2 often works in Python 3 and vice versa, this is not always the case, and mixing the two Python versions can lead to errors or unexpected results.

2.1.1 JupyterLab Notebooks

The traditional way of programming would be to write some code in a text file, then build and run it to generate an output. In a notebook, on the other hand, the code is broken down into cells, which can be run one at a time, displaying results right in the editor. This makes working with code and experimenting with changing parameters much more flexible and is particularly suitable for interactive data exploration, where the Python programming language shines. Sharing small code projects (such as visualizations!) generally becomes much simpler with the notebook approach, since you can save the entire notebook and send it to others.
We will be using JupyterLab, a modern web-based notebook interface for Python that requires no installation on the user’s part. To try a notebook online, just open https://jupyter.org/try and click Try JupyterLab. A cloud-hosted, ready-to-use online JupyterLab environment will be activated after a short wait. Try refreshing the window if loading stalls.

Take note that this JupyterLab session is hosted on https://mybinder.org/, and it will time out after ~15 minutes if inactive. Make sure to download and save your notebooks locally before leaving the computer. If your session has expired, start a new one from https://jupyter.org/try and use the interface to upload your saved notebook to continue where you left off.

Note: The online trial of JupyterLab is a good place to start if you want to experiment with programming in Python, but for continued use in the future, it is recommended to install the Conda package and environment management system and JupyterLab locally on your system. To obtain Conda, it is easiest to install one of two distributions: Anaconda, a powerful Python and R distribution that includes over 250 packages for various uses, or Miniconda, a minimal version of Anaconda that includes only conda, Python, package dependencies, and a few other useful packages (JupyterLab not included). For more information on obtaining Conda, you can visit https://docs.conda.io/projects/conda/en/latest/user-guide/install/. If you choose the Miniconda distribution, you will need to install JupyterLab locally from your terminal (Mac) or Command Prompt/PowerShell (Win) with conda install -c conda-forge jupyterlab. For more information on installing JupyterLab, visit https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html.
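For a local installation, a quick check from Python can confirm that the interpreter is Python 3 and that the packages you need can be imported. The sketch below uses only the standard library. The helper name missing_modules is hypothetical, and the package list is an assumption based on the packages named in this guide; adjust it to whatever your notebook imports.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The guide targets Python 3; stop early on an older interpreter.
if sys.version_info.major < 3:
    raise SystemExit("This tutorial requires Python 3.")

# Assumed package list: matplotlib for plotting, jupyterlab for the notebook.
missing = missing_modules(["matplotlib", "jupyterlab"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any package reported as not installed can then be added with conda install, as described above.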
If you already have an installation of Anaconda or similar and a preferred plain text editor at your disposal, the relevant code covered in this tutorial should work in other environments as well.

2.1.2 Testing Out the Programming Environment

If you open https://jupyter.org/try and click Try JupyterLab, you will be welcomed by a rather complex demo example. We will ignore this for now and create our own new notebook instead. Under File, choose New > New Notebook. In the Select Kernel prompt, choose Python 3.

You now see an empty notebook called Untitled.ipynb with one empty cell: a text box where you type code for execution. The cells of a notebook are convenient for structuring code in small chunks that can be run one at a time, as opposed to the more common way of building and running a whole script at once. A single cell can contain as many or as few lines of code as you want. You could also change the cell to hold markdown-formatted text instead of code to write longer comments or add illustrations.

You can test what JupyterLab does by writing some code in the empty cell. Click inside the cell and type in the following:

    print("Hello, world!")

Press Shift + Enter or click the small play arrow ▸ in the toolbar above to run the cell.

2.1.3 Creating Our Notebook, Importing Necessary Modules

Create a new notebook and save it with a name, for example, polarchart.ipynb. You will refer back to this should your JupyterLab session time out.

If we were running this project locally, we would first need to install all the modules necessary for generating the visualization. However, the trial environment launched from https://jupyter.org/try conveniently comes equipped with everything for our purposes.

• Matplotlib (MPL for short) is one of the most popular visualization libraries for Python.
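As a preview of where the procedure leads, the data tabulated in Section 2 can be drawn as a heat map with Matplotlib alone. This is a minimal sketch, not the guide's own code: the values are typed in by hand rather than read from the data file, and the choice of imshow, the built-in "YlOrRd" colormap, the figure size, and the text-color threshold are all assumptions made for illustration.

```python
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
years = [2014, 2015, 2016, 2017, 2018]

# Monthly death counts copied from the table in Section 2 (one row per year).
deaths = [
    [986, 693, 521, 475, 446, 382, 392, 364, 361, 373, 447, 503],
    [918, 730, 577, 521, 452, 420, 379, 358, 385, 417, 455, 573],
    [679, 680, 749, 521, 420, 360, 421, 330, 376, 403, 417, 635],
    [985, 722, 617, 511, 466, 403, 360, 372, 368, 399, 396, 762],
    [1674, 817, 771, 554, 427, 389, 359, 324, 319, 358, 441, 485],
]

fig, ax = plt.subplots(figsize=(10, 4))

# Draw one colored cell per value; "YlOrRd" runs from yellow to dark red.
image = ax.imshow(deaths, cmap="YlOrRd", aspect="auto")

# Label the columns with months and the rows with years.
ax.set_xticks(range(len(months)))
ax.set_xticklabels(months)
ax.set_yticks(range(len(years)))
ax.set_yticklabels([str(year) for year in years])
ax.set_xlabel("Month")
ax.set_ylabel("Year")

# Print each count in its cell, switching to white text on the darkest cells.
lowest = min(min(row) for row in deaths)
highest = max(max(row) for row in deaths)
threshold = (lowest + highest) / 2
for row_index, row in enumerate(deaths):
    for col_index, count in enumerate(row):
        ax.text(col_index, row_index, f"{count:,}",
                ha="center", va="center",
                color="white" if count > threshold else "black")

fig.colorbar(image, ax=ax, label="Deaths")
ax.set_title("Monthly influenza and pneumonia deaths in California, 2014–2018")
fig.tight_layout()
```

In JupyterLab the figure should appear below the cell once it finishes running. Outside a notebook, a call such as fig.savefig("heatmap.png", dpi=150) writes it to a file (the file name here is only an example).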