Central Tendency, Variance, and Variability

© Jones & Bartlett Learning, LLC. NOT FOR SALE OR DISTRIBUTION.
© Mad Dog/Shutterstock

CHAPTER 2

Facts are stubborn, but statistics are more pliable.
- Mark Twain

CHAPTER OUTLINE

Introduction
Measures of Central Tendency
Mean
Median
Mode
Frequency Distribution
Variance and Measures of Dispersion or Variability
Min and Max
Range
Outlier Data
Interquartile Range
Standard Deviation
Variance
Data Harvesting
Global Perspective
Chapter Summary
Apply Your Knowledge
References
Web Links

LEARNING OUTCOMES

After completing this chapter, you should be able to do the following:
1. Discuss the uses of mean, median, and mode in a hospital setting.
2. Discuss the uses of measures of central tendency.
3. Demonstrate how to find mean, median, mode, and standard deviation.
4. Describe the difference between continuous data and discrete data.
5. Describe the term skewed.
KEY TERMS

Continuous data
Descriptive statistics
Discrete data
Frequency distribution
Interquartile range
Max
Mean
Median
Min
Mode
Outlier data
Range
Skewed
Standard deviation
Variance

All Microsoft screenshots used with permission from Microsoft. © Jones & Bartlett Learning LLC, an Ascend Learning Company. NOT FOR SALE OR DISTRIBUTION.

How Does Your Hospital Rate?

Dr. Barker, chief of the medical staff at his hospital, is reviewing cases of heart failure and the length of stay, readmission rates, and average age of patients with this condition. The purpose of this review is to improve patient care, reduce readmission rates, and potentially reduce length of stay in the hospital. Heart failure is the heart's inability to pump enough blood and oxygen to support the other organs in the body. Heart failure expenditures average about $32 billion each year in the United States. This cost includes healthcare services, medication, and missed days of work. Dr. Barker went to the website of the Centers for Disease Control and Prevention (CDC) to find data on his county (CDC Interactive Heart Disease site: https://nccd.cdc.gov/DHDSPAtlas/).

Consider the following:
1. If you were in Dr. Barker's position, what specific goals might you have to improve patient care in this situation?
2. What might a high readmission rate indicate, and what are measures that could be taken to reduce this number?
3. Why would Dr. Barker want to reduce length of stay among heart failure patients? How does length of stay relate to the quality of patient care?

Introduction

The Mark Twain quote that introduces this chapter is quite telling: although the data you have may be reliable and accurate, the interpretation of the data can be questionable, as this and other chapters will reveal. In this chapter, you will learn more about key statistical methods and work through simple, real-world examples of each. These examples will involve computing your results with a Microsoft Excel spreadsheet or other software tools.

First you will learn about measures of central tendency, including mode, mean, median, variables, and frequency distribution. These measures are part of descriptive statistics: they summarize and interpret some of the properties of a set of data, but they do not necessarily describe the properties of the entire population from which the sample was taken (which would require determining whether your collected data were in fact representative of the entire population).

Following the discussion of measures of central tendency, you will learn about measures of difference among the numbers in a dataset. Standard deviation, standard error, range, and variance are covered in this section and are known as measures of dispersion or variability.

Measures of Central Tendency

The mean, median, and mode are statistical methods used to describe the center of a distribution of data. These methods are known as measures of central tendency, also called measures of central location or summary statistics. Each is a valid measure of central tendency but is used under different conditions. You will recall from the first chapter how mode was used to estimate the height of a brick wall, which is only one example of this measure.

If the data are perfectly normally distributed, the mean, median, and mode will be identical and will represent summary values accurately in a given dataset. Follow along and try the examples to see how these methods work.

Mean

Mean is a measure of central tendency that is determined by calculating the average of the observations (e.g., data elements) in a frequency distribution. Mean is the most common measure of central tendency. The mean can be used with discrete data (e.g., "choose 1, 2, or 3") but is most commonly used with continuous data (e.g., weight of a patient). In this case, the mean is just a model of the dataset.

One of the most important properties of the mean is that it minimizes error in predicting any one value in the dataset. Measured as the sum of squared differences from the data values, its total error is never larger than that of the median or the mode. Another important characteristic of this measure is that it includes all values in a dataset. Remember that a population is all of your data, whereas a sample is just a subset, preferably a random but representative one.

However, using the mean has one disadvantage: it is susceptible to the influence of outlier data, or data elements that are very different or far away from most of the other data elements. Outlier data, discussed later in the chapter, may also be referred to as flier, maverick, aberrant, or straggler, although outlier is most commonly used. Outlier data may be detected by statistical tests (such as a Dixon or Grubbs test) or by a graphical display of the data.

Moreover, the more skewed the data become, the less effective the mean is in locating a central tendency and typical value. Skewed means that the values are not spread symmetrically around the center: a few very small or very large values stretch the distribution out on one side. Mean is best used with data that are not skewed, such as in the set 1, 3, 2, 1, 3.

There are actually several types of calculation for mean. We have already covered the standard mean, also known as the arithmetic mean or average. Another is the sample mean, which is the arithmetic mean calculated from a sample rather than from the entire population. The sample mean is an estimate of the population mean, and it is calculated the same way even when the dataset is quite varied, such as 8, 34, 56, 25, 41, 2, 17, 25; outliers and skew affect it just as they affect any arithmetic mean. Other means are the harmonic mean, for rate or speed measures, and the geometric mean, for when the ranges you are comparing are different and you need to equalize them. Next we will compute a sample mean.

Hands-on Statistics 2.1: Use Excel to Find Mean

To compute the mean, add all items together and divide by the number of elements in the dataset.

To perform this calculation in Excel, examine Figure 2.1 and follow these directions:
1. Open a new, blank spreadsheet in Excel.
2. Key in the following data about inpatient hospital days, entering one numeral per cell in consecutive cells of the same row, beginning in row 2, column C (C2), and ending in cell G2:
   1, 3, 2, 1, 3
3.
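
The same calculation can be tried in one of the "other software tools" mentioned in the Introduction. The following is a minimal Python sketch, not part of the Excel exercise. It applies the rule "add all items together and divide by the number of elements" to the two datasets given in the text, then illustrates how a single outlier shifts the mean more than the median. The helper name arithmetic_mean, the added outlier value of 30, and the speed and growth values passed to the harmonic and geometric means are assumptions chosen only for illustration.

```python
from statistics import median, harmonic_mean, geometric_mean

# Inpatient hospital days from Hands-on Statistics 2.1 (cells C2 through G2).
hospital_days = [1, 3, 2, 1, 3]

# The more varied dataset quoted in the Mean section.
varied_sample = [8, 34, 56, 25, 41, 2, 17, 25]


def arithmetic_mean(values):
    """Add all items together and divide by the number of elements."""
    return sum(values) / len(values)


print("Mean hospital days:", arithmetic_mean(hospital_days))
print("Mean of varied sample:", arithmetic_mean(varied_sample))

# Hypothetical outlier: one patient with a 30-day stay (not from the text).
with_outlier = hospital_days + [30]
print("Mean with outlier:", arithmetic_mean(with_outlier))
print("Median without outlier:", median(hospital_days))
print("Median with outlier:", median(with_outlier))

# Other means named in the text, shown on made-up values.
# Harmonic mean suits rates, e.g. two equal distances driven at 30 and 60 mph.
print("Harmonic mean of speeds:", harmonic_mean([30, 60]))
# Geometric mean suits multiplicative factors, e.g. growth factors of 2 and 8.
print("Geometric mean of factors:", geometric_mean([2, 8]))
```

With the hypothetical 30-day stay added, the mean moves from 2.0 to about 6.7, while the median moves only from 2 to 2.5, which is the susceptibility to outlier data described above.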