Statistics I Chapter 3: Bivariate Data Analysis

Total Page:16

File Type:pdf, Size:1020Kb

Statistics I Chapter 3: Bivariate Data Analysis Statistics I Chapter 3: Bivariate data analysis Chapter 3: Bivariate data analysis Contents 3.1 Two-way tables I Bivariate data I Definition of a two-way table I Joint absolute/relative frequency distribution I Marginal frequency distribution I Conditional frequency distribution 3.2 Correlation I Graphical display: scatterplot I Types of relationships between numerical variables I Measures of linear association Chapter 2: Bivariate data analysis Recommended reading I Pe~na,D., Romo, J., 'Introducci´ona la Estad´ısticapara las Ciencias Sociales' I Chapters 7, 8, 9 I Newbold, P. 'Estad´ısticapara los Negocios y la Econom´ıa’(2009) I Chapter 12 Bivariate data I Bivariate data is obtained when we observe two characteristics (numerical or categorical) of one individual/object. I Notation: A sample of n bivariate observations will be written as n pairs (x1; y1); (x2; y2);:::; (xn; yn): Summarizing bivariate data: two-way table I In order to summarize the data, we create two-dimensional table ... I with classes/class intervals of one variable in the rows ... I and classes/class intervals of the other in the columns I The corresponding frequencies (absolute or relative) appear in the cells on the cross-section of the given row and the given column I When one variable is qualitative, the two-way table is called a contingency table Two-way table: joint absolute frequency distribution I Two-way table with k rows and m columns Y y1 ··· yj ··· ym Total x1 n11 ··· n1j ··· n1m n1• . X xi ni1 ··· nij ··· nim ni• . xk nk1 ··· nkj ··· nkm nk• Total n•1 ··· n•j ··· n•m n•• I Notation: nij absolute frequency of cell (i; j) Row i total: ni• = ni1 + ni2 + ··· + nim Column j total: n•j = n1j + n2j + ··· + nkj n•• sample size Joint absolute frequency distribution Example Sixty four men were selected and the following two variables were considered Y = Color of man's mother's eyes fBright; Darkg X = Color of man's eyes fBright; Darkg (x1; y1) = (B; B); (x2; y2) = (B; D);:::; (x64; y64) = (D; D) Y Bright Dark Total X Bright 23 12 35 Dark 17 12 29 Total 40 24 64 I (x2; y2) = (B; D) means that the 2nd man had Bright eyes and his mother had Dark eyes I 23 of men had B eyes and their mothers also had B eyes I 29 of men had Dark eyes Two-way table: relative joint frequency distribution I fij = nij =n•• relative frequency of cell (i; j) Y y1 ··· yj ··· ym Total x1 f11 ··· f1j ··· f1m f1• . Xx i fi1 ··· fij ··· fim fi• . xk fk1 ··· fkj ··· fkm fk• Total f•1 ··· f•j ··· f•m 1 I Notation: fij relative frequency of cell (i; j) Row i total: fi• = fi1 + fi2 + ··· + fim Column j total: f•j = f1j + f2j + ··· + fkj n•• sample size Joint frequency distribution: absolute and relative Example Y = Weekly number of visits to the cinema X = Weekly number of visits to the theater (absolute frequencies in black, relative in blue) Y 0 1 2 3 4 0 12 0.2795 0.1164 0.0932 0.0471 0.023 1 4 0.0933 0.0702 0.0471 0.0230 0.000 X 2 3 0.0703 0.0702 0.0470 0.0000 0.000 3 1 0.0230 0.0000 0.0000 0.0000 0.000 Marginal frequency distribution: absolute and relative I The marginal frequencies take us back to the univariate case (we ignore the other variable) I We obtain them by looking at the margins of the joint frequency table (absolute or relative) I Specifically, the relative marginal frequencies of X are defined as fi• = fi1 + ··· + fij + ··· + fim and those of Y are f•j = f1j + ··· + fij + ··· + fkj I Absolute marginal frequencies are obtained analogously Marginal frequency distribution Example Y = Number of workers X = Number of sales Y 1-24 25-49 50-74 75-99 Total 1-100 0.293 0.122 0.098 0.049 0.561 X 101-200 0.098 0.073 0.049 0.024 0.244 201-300 0.073 0.073 0.049 0.000 0.195 Total 0.463 0.268 0.195 0.073 1 Marginal (relative) frequencies of Y Workers 1-24 25-49 50-74 75-99 f•j 0.463 0.268 0.195 0.073 Marginal (relative) frequencies of X Sales 1-100 101-200 201-300 fi• 0.561 0.244 0.195 Conditional frequency distribution I The conditional distribution of Y , given that X = xi (notation f (Y jX = xi )) is obtained by calculating fij fi• for j-th class/class interval of Y (we narrow our data set to the observations satisfying the condition) I The conditional distribution of X , given that Y = yj is obtained by calculating for i-th class/class interval of X fij f•j I Exactly the same conditional distribution is obtained if we replace the relative frequencies by the absolute frequencies in the above definitions I Conditional frequencies always add up to 1! Conditional frequency distribution Example Y = Number of workers X = Number of sales Find the conditional frequency distribution for sales given that the number of workers is between 50 and 74. Y 1-24 25-49 50-74 75-99 Total 1-100 0.293 0.122 0.098 0.049 0.561 X 101-200 0.098 0.073 0.049 0.024 0.244 201-300 0.073 0.073 0.049 0.000 0.195 Total 0.463 0.268 0.195 0.073 1 The conditional frequency distribution f (X j50 ≤ Y ≤ 74) Sales 1-100 101-200 201-300 0:098 0:049 0:049 f (xj50 ≤ y ≤ 74) 0.500(= 0:195 ) 0.250(= 0:195 ) 0.250(= 0:195 ) Conditional frequency distribution Example Y = Number of workers X = Number of sales Find the conditional frequency distribution for workers when the number of sales is between 101 and 200. Y 1-24 25-49 50-74 75-99 Total 1-100 0.293 0.122 0.098 0.049 0.561 X 101-200 0.098 0.073 0.049 0.024 0.244 201-300 0.073 0.073 0.049 0.000 0.195 Total 0.463 0.268 0.195 0.073 1 The conditional frequency distribution f (Y j101 ≤ X ≤ 200) Workers 1-24 25-49 50-74 75-99 f (yj101 ≤ x ≤ 200) 0.402 0.299 0.201 0.098 Graphics for bivariate numerical data: scatterplot size price 107 162657 ● ● 114 165554 ● ● ● 91 154506 ● 165000 100 162103 ● ● ● ● 96 158271 ● 107 166925 104 161917 160000 100 161149 ● 80 152263 81 151878 Price of a house (euro) 105 165678 ● 155000 111 166696 ● 108 165387 ● 97 161806 80 85 90 95 100 105 110 115 106 163824 Size of a house (m^2) Types of relationships There are different ways in which two variables may be related: (i) absence of relationship (ii) negative linear relationship (iii) positive linear relationship and (iv) nonlinear relationship NO RELATIONSHIP NEGATIVE LINEAR 3 ● ● 3 ● ● ● ● ● ● 2 ●● ● 2 ● ● ● ● ● ● ●● 1 ● ● ● ●● 1 ● ● ● ●● ● ● ● ● ● ●● ● ●● 0 ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● 0 ● ● ●● ● ● ● ● ● ●●● ● ● ● −1 ● ● ● ● ● ● ● ● −1 ● ● ● ● ● ● ● ● ● ● ● −2 ● ● ● −2 ● ● ● ● ● −3 −2 −1 0 1 2 −2 −1 0 1 2 POSITIVE LINEAR NONLINEAR ● ● ●● ● ● ●●● 2 ● ● ● ● 0 ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● 1 ●● ● ● ● ● ●● ● ● ● ● ● ● ● −2 ● ●● 0 ●● ● ● ● ● ● ● ● ● ● ●● ● −4 ● −1 ● ● ● ●● ● ● ● ● ● −2 ● −6 −3 −8 ● ● −4 −3 −2 −1 0 1 2 −3 −2 −1 0 1 2 Measures of linear association: covariance I We look for a descriptive measure that would indicate whether there is a linear relationship between x and y I Sample covariance is defined as Pn i=1 xi yi − nx¯y¯ zn }| { ! 1 X s = (x − x¯)(y − y¯) xy n − 1 i i i=1 I Covariance and linear transformation: if we transform Y variable according to Z = a + bY , then the covariance between X and the new variable Z is sxz = bsxy Properties of the covariance I If the covariance is 'much larger than 0', it's because there exists a positive linear relationship between the variables I If the covariance is 'much smaller than 0', it's because there exists a negative linear relationship I If the covariance is 'small', it's because: (i) the linear relationship does not exists; or (ii) the relationship is nonlinear. I Drawbacks: I What does it mean large/small? I What are the units of covariance? Measures of linear association: correlation I Correlation overcomes the issues with covariance I Sample correlation is defined as sxy r(x;y) = sx sy I Properties: I The correlation is bounded: −1 ≤ r(x;y) ≤ 1 , so the terms large and small now make sense I It is unitless Measures of linear association: correlation Example Three variables are measured over 91 countries: female life expectancy, male life expectancy and gross national product. I The covariances between the three pairs of two variables are 105.15, 50066.04 and 57917.93, respectively. I On the other hand, the correlations between the three pairs of two variables are 0.98, 0.64 y 0.65, respectively. I Therefore, even if the covariances between male and female live expectancy and gross national product are larger than the covariance between male and female live expectancies, the correlation is larger for these last two variables..
Recommended publications
  • Chapter 9 Descriptive Statistics for Bivariate Data
    9.1 Introduction 215 Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction We discussed univariate data description (methods used to explore the distribution of the values of a single variable) in Chapters 2 and 3. In this chapter we will consider bivariate data description. That is, we will discuss descriptive methods used to explore the joint distribution of the pairs of values of a pair of variables. The joint distribution of a pair of variables is the way in which the pairs of possible values of these variables are distributed among the units in the group of interest. When we measure or observe pairs of values for a pair of variables, we want to know how the two variables behave together (the joint distribution of the two variables), as well as how each variable behaves individually (the marginal distributions of each variable). In this chapter we will restrict our attention to bivariate data description for two quan- titative variables. We will make a distinction between two types of variables. A response variable is a variable that measures the response of a unit to natural or experimental stimuli. A response variable provides us with the measurement or observation that quan- ti¯es a relevant characteristic of a unit. An explanatory variable is a variable that can be used to explain, in whole or in part, how a unit responds to natural or experimental stimuli. This terminology is clearest in the context of an experimental study. Consider an experiment where a unit is subjected to a treatment (a speci¯c combination of conditions) and the response of the unit to the treatment is recorded.
    [Show full text]
  • Visualizing Data Descriptive Statistics I
    Introductory Statistics Lectures Visualizing Data Descriptive Statistics I Anthony Tanbakuchi Department of Mathematics Pima Community College Redistribution of this material is prohibited without written permission of the author © 2009 (Compile date: Tue May 19 14:48:11 2009) Contents 1 Visualizing Data1 Dot plots ........ 7 1.1 Frequency distribu- Stem and leaf plots... 7 tions: univariate data.2 1.3 Visualizing univariate Standard frequency dis- qualitative data..... 8 tributions.... 3 Bar plot......... 8 Relative frequency dis- Pareto chart....... 9 tributions.... 4 Cumulative frequency 1.4 Visualizing bivariate distributions.. 5 quantitative data . 11 1.2 Visualizing univariate Scatter plots . 11 quantitative data.... 5 Time series plots . 12 Histograms....... 5 1.5 Summary . 13 1 Visualizing Data The Six Steps in Statistics 1. Information is needed about a parameter, a relationship, or a hypothesis needs to be tested. 2. A study or experiment is designed. 3. Data is collected. 4. The data is described and explored. 5. The parameter is estimated, the relationship is modeled, the hypothesis is tested. 6. Conclusions are made. 1 2 of 14 1.1 Frequency distributions: univariate data Descriptive statistics Descriptive statistics. Definition 1.1 summarize and describe collected data (sample). Contrast with inferential statistics which makes inferences and generaliza- tions about populations. Useful characteristics for summarizing data Example 1 (Heights of students in class). If we are interested in heights of students and we measure the heights of students in this class, how can we summarize the data in useful ways? What to look for in data shape are most of the data all around the same value, or are they spread out evenly between a range of values? How is the data distributed? center what is the average value of the data? variation how much do the values vary from the center? outliers are any values significantly different from the rest? Class height data Load the class data, then look at what variables are defined.
    [Show full text]
  • Rainbow Plots, Bagplots and Boxplots for Functional Data
    Rainbow plots, bagplots and boxplots for functional data Rob J Hyndman and Han Lin Shang Department of Econometrics and Business Statistics, Monash University, Clayton, Australia June 5, 2009 Abstract We propose new tools for visualizing large amounts of functional data in the form of smooth curves. The proposed tools include functional versions of the bagplot and boxplot, and make use of the first two robust principal component scores, Tukey's data depth and highest density regions. By-products of our graphical displays are outlier detection methods for functional data. We compare these new outlier detection methods with existing methods for detecting outliers in functional data, and show that our methods are better able to identify the outliers. An R-package containing computer code and data sets is available in the online supple- ments. Keywords: Highest density regions, Robust principal component analysis, Kernel density estima- tion, Outlier detection, Tukey's halfspace depth. 1 Introduction Functional data are becoming increasingly common in a wide range of fields, and there is a need to develop new statistical tools for analyzing such data. In this paper, we are interested in visualizing data comprising smooth curves (e.g., Ramsay & Silverman, 2005; Locantore et al., 1999). Such functional data may be age-specific mortality or fertility rates (Hyndman & Ullah, 2007), term- structured yield curves (Kargin & Onatski, 2008), spectrometry data (Reiss & Ogden, 2007), or one of the many applications described by Ramsay & Silverman (2002). Ramsay & Silverman (2005) and Ferraty & Vieu (2006) provide detailed surveys of the many parametric and nonparametric techniques for analyzing functional data. 1 Visualization methods help in the discovery of characteristics that might not have been ap- parent using mathematical models and summary statistics; and yet this area of research has not received much attention in the functional data analysis literature to date.
    [Show full text]
  • A Universal Global Measure of Univariate and Bivariate Data Utility for Anonymised Microdata S Kocar
    A universal global measure of univariate and bivariate data utility for anonymised microdata S Kocar CSRM & SRC METHODS PAPER NO. 4/2018 Series note The ANU Centre for Social Research & Methods the Social Research Centre and the ANU. Its (CSRM) was established in 2015 to provide expertise includes quantitative, qualitative and national leadership in the study of Australian experimental research methodologies; public society. CSRM has a strategic focus on: opinion and behaviour measurement; survey • development of social research methods design; data collection and analysis; data archiving and management; and professional • analysis of social issues and policy education in social research methods. • training in social science methods • providing access to social scientific data. As with all CSRM publications, the views expressed in this Methods Paper are those of CSRM publications are produced to enable the authors and do not reflect any official CSRM widespread discussion and comment, and are position. available for free download from the CSRM website (http://csrm.cass.anu.edu.au/research/ Professor Matthew Gray publications). Director, ANU Centre for Social Research & Methods Research School of Social Sciences CSRM is located within the Research School of College of Arts & Social Sciences Social Sciences in the College of Arts & Social Sciences at the Australian National University The Australian National University (ANU). The centre is a joint initiative between August 2018 Methods Paper No. 4/2018 ISSN 2209-184X ISBN 978-1-925715-09-05 An electronic publication downloaded from http://csrm.cass.anu.edu.au/research/publications. For a complete list of CSRM working papers, see http://csrm.cass.anu.edu.au/research/publications/ working-papers.
    [Show full text]
  • The Aplpack Package
    The aplpack Package September 3, 2007 Type Package Title Another Plot PACKage: stem.leaf, bagplot, faces, spin3R, ... Version 1.1.1 Date 2007-09-03 Author Peter Wolf, Uni Bielefeld Maintainer Peter Wolf <[email protected]> Depends R (>= 2.4.0), tcltk Description set of functions for drawing some special plots: stem.leaf plots a stem and leaf plot bagplot plots a bagplot faces plots chernoff faces spin3R for an inspection of a 3-dim point cloud License This package was written by Hans Peter Wolf. This software is licensed under the GPL (version 2 or newer) terms. It is free software and comes with ABSOLUTELY NO WARRANTY. URL http://www.wiwi.uni-bielefeld.de/~wolf/software/aplpack R topics documented: aplpack-package . 1 bagplot . 2 boxplot2D . 5 faces............................................. 6 spin3R . 8 stem.leaf . 9 Index 11 1 2 bagplot aplpack-package Another PLotting Package Description this package contains functions for plotting: chernoff faces, stem and leaf plots, bagplot etc. Details Package: aplpack Type: Package Version: 1.0 Date: 2006-04-04 License: GPL bagplot plots a bagplot faces plots chernoff faces scboxplot plot a boxplot into a scatterplot spin3R enables the inspection of a 3-dim point cloud stem.leaf plots a stem and leaf display Author(s) Peter Wolf Maintainer: Peter Wolf <[email protected]> References Examples bagplot bagplot, a bivariate boxplot Description compute.bagplot() computes an object describing a bagplot of a bivariate data set. plot.bagplot() plots a bagplot object. bagplot() computes and plots a bagplot. Usage bagplot(x, y, factor = 3, na.rm = FALSE, approx.limit = 300, show.outlier = TRUE, show.whiskers = TRUE, show.looppoints = TRUE, show.bagpoints = TRUE, show.loophull = TRUE, show.baghull = TRUE, create.plot = TRUE, add = FALSE, pch = 16, cex = 0.4, bagplot 3 dkmethod = 2, precision = 1, verbose = FALSE, debug.plots = "no", col.loophull="#aaccff", col.looppoints="#3355ff", col.baghull="#7799ff", col.bagpoints="#000088", transparency=FALSE, ..
    [Show full text]
  • Bivariate Descriptive Statistics: Introduction*
    OpenStax-CNX module: m46177 1 Bivariate Descriptive Statistics: Introduction* Irene Mary Duranczyk Suzanne Loch Janet Stottlemyer This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Abstract This module included categorical-categorical, cateogorical-measurement, and measurement-measurement descriptive statistics It also contains information from the descriptive statistics chapter part of Collabo- rative Statistics collection (col10522) by Barbara Illowsky and Susan Dean. 1 Descriptive Statistics for Bivariate Data: Introduction 1.1 Student Learning Outcomes By the end of this chapter, the student should be able to: • Identify dierent types of relationships between variables: categorical-categorical, categorical-numerical, and numerical-numerical Categorical - Categorical • Know how to make and summarize a contingency table between two categorical variables • Discuss and describe patterns in contingency tables and supporting graphic summaries Categorical - Numerical • Be able to build side-by-side histograms, back-to-back stemplots, and multiple box-plots between a categorical and a numeric variable. • Compare and contrast multiple groups based on their shape, center, and spread. Numerical - Numerical • Discuss basic ideas of the relationship between two numeric variables including scatter plots, linear regression and correlation • Create and interpret a line of best t • Calculate and interpret the correlation coecient *Version 1.2: May 2, 2013 3:47 pm -0500 http://creativecommons.org/licenses/by/3.0/ http://cnx.org/content/m46177/1.2/ OpenStax-CNX module: m46177 2 1.2 Introduction In the previous chapters you have explored how to organize, summarize, and discuss both univariate cat- egorical and numerical variables. Usually more interesting questions are to examine relationships between sets of variables.
    [Show full text]
  • Package 'Aplpack'
    Package ‘aplpack’ July 29, 2019 Title Another Plot Package: 'Bagplots', 'Iconplots', 'Summaryplots', Slider Functions and Others Version 1.3.3 Date 2019-07-26 Author Hans Peter Wolf [aut, cre] Maintainer Hans Peter Wolf <[email protected]> Depends R (>= 3.0.0) Suggests tkrplot, jpeg, png, splines, utils, tcltk Description Some functions for drawing some special plots: The function 'bagplot' plots a bagplot, 'faces' plots chernoff faces, 'iconplot' plots a representation of a frequency table or a data matrix, 'plothulls' plots hulls of a bivariate data set, 'plotsummary' plots a graphical summary of a data set, 'puticon' adds icons to a plot, 'skyline.hist' combines several histograms of a one dimensional data set in one plot, 'slider' functions supports some interactive graphics, 'spin3R' helps an inspection of a 3-dim point cloud, 'stem.leaf' plots a stem and leaf plot, 'stem.leaf.backback' plots back-to-back versions of stem and leaf plot. License GPL (>= 2) URL http://www.wiwi.uni-bielefeld.de/lehrbereiche/statoekoinf/comet/ wolf/wolf\_aplpack NeedsCompilation no Repository CRAN Date/Publication 2019-07-29 07:50:12 UTC R topics documented: bagplot . 2 bagplot.pairs . 5 1 2 bagplot boxplot2D . 6 faces............................................. 8 hdepth . 10 iconplot . 11 plothulls . 26 plotsummary . 28 puticon . 29 skyline.hist . 34 slider . 37 slider.bootstrap.lm.plot . 41 slider.brush . 42 slider.hist . 44 slider.lowess.plot . 45 slider.smooth.plot.ts . 46 slider.split.plot.ts . 47 slider.stem.leaf . 48 slider.zoom.plot.ts . 49 spin3R . 50 stem.leaf . 51 bagplot bagplot, a bivariate boxplot Description compute.bagplot() computes an object describing a bagplot of a bivariate data set.
    [Show full text]
  • Further Maths Bivariate Data Summary
    Further Maths Bivariate Data Summary Further Mathematics Bivariate Summary Representing Bivariate Data Back to Back Stem Plot A back to back stem plot is used to display bivariate data, involving a numerical variable and a categorical variable with two categories. Example The data can be displayed as a back to back stem plot. From the distribution we see that the Labor distribution is symmetric and therefore the mean and the median are very close, whereas the Liberal distribution is negatively skewed. Since the Liberal distribution is skewed the median is a better indicator of the centre of the distribution than the mean. It can be seen that the Liberal party volunteers handed out many more “how to vote” cards than the Labor party volunteers. Parallel Boxplots When we want to display a relationship between a numerical variable and a categorical variable with two or more categories, parallel boxplots can be used. Example The 5-number summary of each class is determined. Page 1 of 19 Further Maths Bivariate Data Summary Four boxplots are drawn. Notice that one scale is used. Based on the medians, 7B did best (median 77.5), followed by 7C (median 69.5), then 7D (median 65) and finally 7A (median 61.5). Two-way Frequency Tables and Segmented Bar Charts When we are examining the relationship between two categorical variables, the two-way frequency table can be used. Example 67 primary and 47 secondary school students were asked about their attitude to the number of school holidays which should be given. They were asked whether there should be fewer, the same number or more school holidays.
    [Show full text]
  • BIVARIATE DATA 7.1 Scatter Plots
    Page 1 of 7 BIVARIATE DATA 7.1 Scatter plots: A scatter plot, scatterplot, or scattergraph is a type of mathematical diagram using Cartesian coordinates to display values for two variables for a set of data. The data is displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis. This kind of plot is also called a scatter chart, scatter gram, scatter diagram, or scatter graph. Overview[ A scatter plot is used when a variable exists that is below the control of the experimenter. If a parameter exists that is systematically incremented and/or decremented by the other, it is called the control parameter or independent variable and is customarily plotted along the horizontal axis. The measured or dependent variable is customarily plotted along the vertical axis. If no dependent variable exists, either type of variable can be plotted on either axis and a scatter plot will illustrate only the degree of correlation (not causation) between two variables. A scatter plot can suggest various kinds of correlations between variables with a certain confidence interval. For example, weight and height, weight would be on x axis and height would be on the y axis. Correlations may be positive (rising), negative (falling), or null (uncorrelated). If the pattern of dots slopes from lower left to upper right, it suggests a positive correlation between the variables being studied. If the pattern of dots slopes from upper left to lower right, it suggests a negative correlation.
    [Show full text]
  • Package 'Aplpack'
    Package ‘aplpack’ February 15, 2013 Type Package Title Another Plot PACKage: stem.leaf, bagplot, faces, spin3R, and some slider functions Version 1.2.7 Date 2012-10-03 Author Peter Wolf, Uni Bielefeld Maintainer Peter Wolf <[email protected]> Depends R (>= 2.8.0), tcltk Suggests tkrplot Description set of functions for drawing some special plots: stem.leaf plots a stem and leaf plot stem.leaf.backback plots back-to-back versions of stem and leafs bagplot plots a bagplot faces plots chernoff faces spin3R for an inspection of a 3-dim point cloud slider functions for interactive graphics License GPL (>= 2) URL http://www.wiwi.uni-bielefeld.de/com/wolf/software/aplpack.html Repository CRAN Date/Publication 2012-12-11 07:55:36 NeedsCompilation no R topics documented: bagplot . .2 bagplot.pairs . .5 boxplot2D . .6 faces.............................................8 hdepth . 10 plotsummary . 11 1 2 bagplot skyline.hist . 13 slider . 15 slider.bootstrap.lm.plot . 20 slider.brush . 21 slider.hist . 22 slider.lowess.plot . 24 slider.smooth.plot.ts . 25 slider.split.plot.ts . 26 slider.stem.leaf . 27 slider.zoom.plot.ts . 28 spin3R . 29 stem.leaf . 30 Index 32 bagplot bagplot, a bivariate boxplot Description compute.bagplot() computes an object describing a bagplot of a bivariate data set. plot.bagplot() plots a bagplot object. bagplot() computes and plots a bagplot. Usage bagplot(x, y, factor = 3, na.rm = FALSE, approx.limit = 300, show.outlier = TRUE, show.whiskers = TRUE, show.looppoints = TRUE, show.bagpoints = TRUE, show.loophull = TRUE, show.baghull = TRUE, create.plot = TRUE, add = FALSE, pch = 16, cex = 0.4, dkmethod = 2, precision = 1, verbose = FALSE, debug.plots = "no", col.loophull="#aaccff", col.looppoints="#3355ff", col.baghull="#7799ff", col.bagpoints="#000088", transparency=FALSE, ..
    [Show full text]
  • Investigate Bivariate Measurement Data Written by Jake Wills – Mathsnz – [email protected]
    NCEA Level 3 - 3.9 Bivariate Data students.mathsnz.com Level 3 – AS91581 – 4 Credits – Internal Investigate Bivariate Measurement Data Written by Jake Wills – MathsNZ – [email protected] Achievement Achievement with Merit Achievement with Excellence Investigate bivariate Investigate bivariate Investigate bivariate measurement data. measurement data, with measurement data, with justification. statistical insight. Part 1: Problem ................................................................................................................................................. 2 Part 2: Plan ........................................................................................................................................................ 4 Part 2.1: Identifying the Variables .............................................................................................................. 4 Part 2.2: Naming the Source ...................................................................................................................... 6 Part 3: Data – Using NZGrapher ..................................................................................................................... 8 Part 4: Analysis .................................................................................................................................................. 9 Part 4.1: Trend ............................................................................................................................................... 9 Part 4.2: Association..................................................................................................................................
    [Show full text]
  • Statistics Statistics- the Branch of Mathematics That Deals with The
    Statistics Statistics- the branch of mathematics that deals with the collection, organization, analysis, and interpretation of numerical data. Dot Plot- a statistical chart consisting of data points plotted on a fairly simple scale. Histogram- a diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval. Box Plot- a graphic way to display the median, quartiles, maximum and minimum values of a data set on a number line. Interquartile Range- the range of values of a frequency distribution between the first and third quartiles. Standard Deviation- a measure of how spread out numbers are. Outlier- A value that "lies outside" (is much smaller or larger than) most of the other values in a set of data. y < Q1 − 1.5 × IQR or y > Q3 + 1.5 × IQR Frequency- the rate at which something occurs or is repeated over a particular period of time or in a given sample. Frequency Table- a table that shows the total for each category or group of data. Relative Frequency- how often something happens divided by all outcomes. Joint Frequency- It is called joint frequency because you are joining one variable from the row and one variable from the column. Joint Relative Frequency- the ratio of the number of observations of a joint frequency to the total number of observations in a frequency table. Marginal Frequency- the total in a row or column in a two way table. Marginal Relative Frequency- the ratio of the sum of the joint relative frequency in a row or column and the total number of data values.
    [Show full text]