Sec 2.4 Statistical Graphics

Key Concepts:

Graphs are excellent tools for describing, exploring and comparing data.

Describing data: In a histogram, consider distribution, center, variation, and outliers.
Exploring data: Look for features that reveal some useful and/or interesting characteristic of the data set.
Comparing data: Construct similar graphs to compare data sets.

Definitions:

1. A frequency polygon uses line segments connected to points located directly above class midpoint values.
2. A relative frequency polygon uses relative frequencies (proportions or percentages) for the vertical scale.
3. An ogive is a line graph that depicts cumulative frequencies.
4. A dotplot is a graph in which each data value is plotted as a point (or dot) along a scale of values.
5. A stemplot (or stem-and-leaf plot) represents quantitative data by separating each value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit). Example figure: Pulse Rates of Females.
6. A bar graph uses bars of equal width to show frequencies of categories of qualitative data.
7. A multiple bar graph has two or more sets of bars, and is used to compare two or more data sets. Example figure: Median Income of Males and Females.
8. A pie chart is a circle that is divided into sectors that represent categories.
9. A Pareto chart is a bar graph for qualitative data, with the added stipulation that the bars are arranged in descending order according to frequencies.
10. A scatterplot is a plot of paired (x, y) quantitative data with a horizontal x-axis and a vertical y-axis.
11. A time-series graph is a graph of time-series data, which are quantitative data that have been collected at different points in time.

1. Stemplot

How to make a stem-and-leaf display:

1. Divide the digits of each data value into two parts. The leftmost part is called the stem and the rightmost part is called the leaf (ones digit or decimal place).
2. Align all the stems in a vertical column from smallest to largest. Draw a vertical line to the right of all the stems.
3. Place all the leaves with the same stem in the same row as the stem, and arrange the leaves in increasing order.
4. Use a label to indicate the magnitude of the numbers in the display. We include the decimal position in the label rather than with the stem or leaves.

Example 1: Use a stem-and-leaf plot to display the data. The data represent the ages of the 25 wealthiest people in the world. Be sure to indicate the scale.

51 76 67 80 56 73 58 71 78 49 62 84 50
49 87 40 59 47 54 84 61 79 59 52 63

Answer:

Interpretation:

2. Dotplot

Example 2: Use the data from Example 1 to construct a dotplot and identify unusual data values.

51 76 67 80 56 73 58 71 78 49 62 84 50
49 87 40 59 47 54 84 61 79 59 52 63

Answer: The horizontal axis should cover the values from 40 to 87 (the smallest and largest data values).

Interpretation:

3. Pie Chart

Example 3: Use a pie chart to display the data. The data represent the number of countries in the United Nations by continent. (Source: United Nations)

Continent        Number of countries, f    Relative frequency
North America    23
South America    12
Europe           43
Oceania          14
Africa           53
Asia             47
Total

Interpretation:

4. Pareto Chart

Example 4: Use the data from Example 3 to construct a Pareto chart (frequency).
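The four stem-and-leaf steps from the Stemplot section can be sketched in a few lines of Python, applied to the ages from Example 1. This is only an illustrative sketch: the function names stem_and_leaf and format_stemplot are hypothetical, and it assumes two-digit whole-number data, so the tens digit serves as the stem and the ones digit as the leaf.

```python
from collections import defaultdict

# Ages from Example 1 (25 values).
ages = [51, 76, 67, 80, 56, 73, 58, 71, 78, 49, 62, 84, 50,
        49, 87, 40, 59, 47, 54, 84, 61, 79, 59, 52, 63]


def stem_and_leaf(values):
    """Group values by stem (tens digit) with sorted leaves (ones digit)."""
    rows = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)  # step 1: split each value in two
        rows[stem].append(leaf)
    # Steps 2 and 3: stems from smallest to largest, leaves in increasing
    # order. Stems with no data are kept so gaps remain visible.
    return {stem: sorted(rows.get(stem, []))
            for stem in range(min(rows), max(rows) + 1)}


def format_stemplot(rows):
    """Render the display as text, with a key line for the scale (step 4)."""
    lines = [f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
             for stem, leaves in rows.items()]
    lines.append("Key: 4 | 0 means 40")
    return "\n".join(lines)


rows = stem_and_leaf(ages)
print(format_stemplot(rows))
```

Reading the counts per stem off this display is also a quick way to sketch the dotplot in Example 2, since both graphs keep every individual data value.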
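The arithmetic behind Examples 3 and 4 can be checked with a short Python sketch. It computes each relative frequency as f divided by the total, converts it to a pie-chart sector angle (relative frequency times 360 degrees), and sorts the categories in descending order of frequency as a Pareto chart requires. The variable names are hypothetical, and the sketch stops short of drawing the charts.

```python
# Number of UN member countries by continent, from Example 3.
countries = {
    "North America": 23,
    "South America": 12,
    "Europe": 43,
    "Oceania": 14,
    "Africa": 53,
    "Asia": 47,
}

total = sum(countries.values())

# Relative frequency = frequency / total; the values sum to 1.
relative_frequency = {name: f / total for name, f in countries.items()}

# Each pie sector's central angle is its share of the full 360 degrees.
sector_angle = {name: 360 * rf for name, rf in relative_frequency.items()}

# A Pareto chart orders the bars from highest to lowest frequency.
pareto_order = sorted(countries.items(), key=lambda item: item[1], reverse=True)

print(f"Total: {total}")
for name, f in pareto_order:
    print(f"{name:<14} f={f:>3}  rel. freq.={relative_frequency[name]:.3f}  "
          f"angle={sector_angle[name]:.1f} deg")
```

Printing the rows in Pareto order gives the left-to-right bar order for Example 4, and the angle column gives the sector sizes needed to draw the pie chart by hand.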