Simple Mean Weighted Mean Or Harmonic Mean

Multiply or Divide? A Best Practice for Factor Analysis
7–10 June 2011
Dr. Shu-Ping Hu, Alfred Smith, CCEA

Los Angeles; Washington, D.C.; Boston; Chantilly; Huntsville; Dayton; Santa Barbara; Albuquerque; Colorado Springs; Ft. Meade; Ft. Monmouth; Goddard Space Flight Center; Ogden; Patuxent River; Silver Spring; Washington Navy Yard; Cleveland; Dahlgren; Denver; Johnson Space Center; Montgomery; New Orleans; Oklahoma City; Tampa; Tacoma; Vandenberg AFB; Warner Robins ALC

Presented at the 2011 ISPA/SCEA Joint Annual Conference and Training Workshop - www.iceaaonline.com
PRT-70, 01 Apr 2011

Objectives

It is common to estimate hours as a simple factor of a technical parameter such as weight, aperture, power, or source lines of code (SLOC), i.e., hours = a * TechParameter
• "Software development hours = a * SLOC" is used as an example
• The concept is applicable to any factor cost estimating relationship (CER)

Our objective is to address how best to estimate "a"
• Multiply SLOC by Hour/SLOC or divide SLOC by SLOC/Hour?
• Simple, weighted, or harmonic mean?
• Role of regression analysis
• Base uncertainty on the prediction interval rather than just the range

Our goal is to provide analysts a better understanding of the choices available and how to select the right approach

Approved for Public Release

Outline

Definitions and Examples:
• Means (Simple, Weighted, and Harmonic Means)
• Regression Methods (MUPE, ZMPE, and Log-Error)
• Harmonic Mean Examples

Estimating SW Development Hours by a Factor of
• SLOC per Hour
• Hour per SLOC

Analyzing Factor CERs Using Different Methods
Analyzing Uncertainties
Realistic Example
Conclusions

Definitions of Means

Given a set of n observations (Z_1, Z_2, …, Z_n), the simple mean (SM), weighted mean (WM), geometric mean (GM), and harmonic mean (HM) are given by

$$\mathrm{SM} = \frac{1}{n}\sum_{i=1}^{n} Z_i, \qquad \mathrm{WM} = \frac{\sum_{i=1}^{n} w_i Z_i}{\sum_{i=1}^{n} w_i}$$

$$\mathrm{GM} = \left(\prod_{i=1}^{n} Z_i\right)^{1/n}, \qquad \mathrm{HM} = \left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{Z_i}\right)^{-1} = \frac{n}{\sum_{i=1}^{n} 1/Z_i}$$

where w_i is the weighting factor of the ith observation (i = 1, …, n).

Note: HM(Z) = (SM(1/Z))^{-1}

SM ≥ GM ≥ HM if all observations (Z_i's) are positive
• These means are the same if all observations are the same

Whether SM is greater or less than WM depends upon the relative magnitudes of the weighting factors

Weighted Harmonic Mean

The weighted harmonic mean (WHM) is defined by

$$\mathrm{WHM} = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} w_i / Z_i} = \left(\frac{1}{\sum_{i=1}^{n} w_i}\sum_{i=1}^{n}\frac{w_i}{Z_i}\right)^{-1}$$

where w_i is the weighting factor of the ith observation (i = 1, …, n).

WHM reduces to HM when all w_i's are the same:

$$\mathrm{WHM} \rightarrow \mathrm{HM} = \left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{Z_i}\right)^{-1}$$

HM = GM^2/SM for two observations
• HM(a,b) = 2/(1/a + 1/b) = 2ab/(a+b) = ab/[(a+b)/2]
• If a = 1, b = 2, then SM = 1.5, GM = 1.414, HM = 1.333

Harmonic Mean Examples

Productivity:
• John can finish a data entry job in 60 hours
• Mary can do it in 30 hours
• The harmonic mean is 40 hours per person
• It will take them 40/2 = 20 hours to finish the job if they work together

Speed:
• A car travels 120 miles at a speed of 60 miles/hour
• It travels the next 120 miles at a speed of 30 miles/hour
• Its average speed is the harmonic mean of 60 and 30, i.e., 40 miles/hour
  ◦ This means that if you would like to cover the 240 miles in the same time at a single speed, you would travel at 40 miles/hour

SM is not applicable here, yet it is often mistakenly used in places where HM should be used

Contrasting Simple and Harmonic Means

Every statistic that measures central tendency has its own pros and cons

Simple Mean Properties:
• It can be highly influenced by extreme data points
• It does not give greater weights to large data points
• It treats all data points equally, as the weights are the same in the computation formula

HM strongly favors the smaller numbers in the data set; it does not give equal weight to each data point
• If x_1 = 1, x_2 = 2 → SM = 1.5, GM = 1.4, HM = 1.3
• If x_1 = 0.1 and x_2 = … = x_10 = 10 → SM = 9.0, GM = 6.3, HM = 0.9

Developing the Factors by Regression Analysis

Simple and Weighted Means are also the solutions derived by weighted least squares (WLS) regression analysis for factor equations

Multiplicative regression methods will be discussed next

Multiplicative Error Models – Log-Error, MUPE, & ZMPE

Definition of error term for Y = f(x)*ε (cost variation is proportional to the level of the function)

Log-Error: ε ~ LN(0, σ^2) → Least squares in log space
• Error = Log(Y) - Log f(X)
• Minimize the sum of squared errors; the process is done in log space

MUPE: E(ε) = 1, V(ε) = σ^2 → Least squares in weighted space
• Error = (Y - f(X))/f(X)
• Note: E((Y - f(X))/f(X)) = 0 and V((Y - f(X))/f(X)) = σ^2
• Minimize the sum of squared % errors iteratively, i.e., minimize

$$\sum_{i}\left\{\frac{y_i - f(x_i)}{f_{k-1}(x_i)}\right\}^{2}$$

  where k is the iteration number

ZMPE: E(ε) = 1, V(ε) = σ^2 → Least squares in weighted space
• Error = (Y - f(X))/f(X)
• Minimize the sum of squared (percentage) errors with a constraint: Σ_i (y_i - f(x_i))/f(x_i) = 0

We will focus on MUPE/ZMPE factor equations in this paper

SW Dev Hours As a Factor of SLOC/Hour & Hour/SLOC (1/4)

SW Dev hours for a program are commonly estimated using the source lines of code (SLOC) and the productivity rate (SLOC per hour)

Given n paired data points (SLOC_1, hour_1), (SLOC_2, hour_2), …, (SLOC_n, hour_n), the SM, WM, and HM of the productivity rate are given by

$$\mathrm{SM}(\mathrm{SLOC/hour}) = \mathrm{SM\_Rate} = \frac{1}{n}\sum_{i=1}^{n}\frac{\mathrm{SLOC}_i}{\mathrm{hour}_i} = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{\mathrm{hour}_i/\mathrm{SLOC}_i}$$

$$\mathrm{WM}(\mathrm{SLOC/hour}) = \mathrm{WM\_Rate} = \frac{\sum_{i=1}^{n} w_i\,(\mathrm{SLOC}_i/\mathrm{hour}_i)}{\sum_{i=1}^{n} w_i} = \frac{\sum_{i=1}^{n} \mathrm{hour}_i\,(\mathrm{SLOC}_i/\mathrm{hour}_i)}{\sum_{i=1}^{n} \mathrm{hour}_i} = \frac{\sum_{i=1}^{n} \mathrm{SLOC}_i}{\sum_{i=1}^{n} \mathrm{hour}_i}$$

$$\mathrm{HM}(\mathrm{SLOC/hour}) = \mathrm{HM\_Rate} = \left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{\mathrm{SLOC}_i/\mathrm{hour}_i}\right)^{-1} = \left(\frac{1}{n}\sum_{i=1}^{n}\frac{\mathrm{hour}_i}{\mathrm{SLOC}_i}\right)^{-1}$$

(HM(SLOC/hour))^{-1} = SM(hour/SLOC); (SM_Rate)^{-1} = HM(hour/SLOC)

"Flip" twice to relate HM to SM, and vice versa: HM(X) = (SM(1/X))^{-1}, SM(X) = (HM(1/X))^{-1}

SW Dev Hours As a Factor of SLOC/Hour & Hour/SLOC (2/4)

SW Dev hours for a program are commonly estimated based on an estimate of SLOC divided by an average factor of SLOC per Hour (an estimate of the productivity rate)
• Divide (average productivity rate):

$$\begin{aligned}
\text{Dev-Hours} &= \mathrm{Est(SLOC)} \,/\, \mathrm{SM}(\text{SLOC per hour}) \\
&= \mathrm{Est(SLOC)} \,/\, \Big\{\Big(\textstyle\sum_i \mathrm{SLOC}_i/\mathrm{Hour}_i\Big)/n\Big\} \\
&= \mathrm{Est(SLOC)} \times \Big\{\Big(\textstyle\sum_i 1/(\mathrm{Hour}_i/\mathrm{SLOC}_i)\Big)/n\Big\}^{-1} \\
&= \mathrm{Est(SLOC)} \times \mathrm{HM}(\mathrm{Hour/SLOC})
\end{aligned}$$

SW Dev hours can also be predicted by an estimate of SLOC times an average factor of Hour per SLOC
• Multiply: Dev-Hours = Est(SLOC) * SM(Hours per SLOC) = Est(SLOC) * SM(Hour/SLOC)

The "Divide" method generates a point estimate (PE) smaller than the "Multiply" method unless all ratios are the same

SW Dev Hours As a Factor of SLOC/Hour & Hour/SLOC (3/4)

Another method to estimate SW development hours is to divide an estimate of SLOC by a weighted productivity rate
• Weighted productivity rate (weighted by hours):

$$\begin{aligned}
\text{Dev-Hours} &= \mathrm{Est(SLOC)} \,/\, (\text{Wtd SLOC per hour}) \\
&= \mathrm{Est(SLOC)} \,/\, \Big\{\Big(\textstyle\sum_i w_i\,(\mathrm{SLOC}_i/\mathrm{Hour}_i)\Big)\Big/\Big(\textstyle\sum_i w_i\Big)\Big\} \\
&= \mathrm{Est(SLOC)} \,/\, \Big\{\Big(\textstyle\sum_i \mathrm{Hour}_i\,(\mathrm{SLOC}_i/\mathrm{Hour}_i)\Big)\Big/\Big(\textstyle\sum_i \mathrm{Hour}_i\Big)\Big\} \\
&= \mathrm{Est(SLOC)} \,/\, \Big\{\textstyle\sum_i \mathrm{SLOC}_i \Big/ \textstyle\sum_i \mathrm{Hour}_i\Big\} \\
&= \mathrm{Est(SLOC)} \times \Big(\textstyle\sum_i \mathrm{Hour}_i\Big)\Big/\Big(\textstyle\sum_i \mathrm{SLOC}_i\Big) \\
&= \mathrm{Est(SLOC)} \,/\, \mathrm{WM}(\mathrm{SLOC}_i/\mathrm{Hour}_i) \\
&= \mathrm{Est(SLOC)} \times \mathrm{WM}(\mathrm{Hour}_i/\mathrm{SLOC}_i)
\end{aligned}$$

Using WM appears to be a better approach because it provides the same result regardless of the method

Be wary of the "common" uncertainty approach for WM
• Three distributions are often applied to model the uncertainty for Dev-hours

SW Dev Hours As a Factor of SLOC/Hour & Hour/SLOC (4/4)

Question #1: Which method is more appropriate to estimate SW development hours, "Divide" or "Multiply"?

Divide: DevHours = Est(SLOC)/SM(SLOC/Hour) = Est(SLOC)*HM(Hour/SLOC)   (1)

Multiply: DevHours = Est(SLOC)*SM(Hour/SLOC)
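
As a rough illustration of the "Divide", "Multiply", and weighted approaches described above, the following Python sketch computes all three point estimates from a small data set. The function names and the SLOC/hour figures are hypothetical and chosen only for illustration; they do not come from the presentation. The sketch assumes Python 3.8 or later for `statistics.fmean` and `statistics.geometric_mean`.

```python
from statistics import fmean, geometric_mean, harmonic_mean


def divide_estimate(est_sloc, sloc, hours):
    """'Divide': Est(SLOC) / SM(SLOC/hour)."""
    rates = [s / h for s, h in zip(sloc, hours)]
    return est_sloc / fmean(rates)


def multiply_estimate(est_sloc, sloc, hours):
    """'Multiply': Est(SLOC) * SM(hour/SLOC)."""
    factors = [h / s for s, h in zip(sloc, hours)]
    return est_sloc * fmean(factors)


def weighted_estimate(est_sloc, sloc, hours):
    """Weighted by hours: Est(SLOC) * sum(hours) / sum(SLOC)."""
    return est_sloc * sum(hours) / sum(sloc)


# Hypothetical historical programs (not taken from the slides).
sloc = [10_000, 20_000, 40_000]
hours = [2_000, 5_000, 6_000]
est_sloc = 30_000  # hypothetical size estimate for the new program

divide = divide_estimate(est_sloc, sloc, hours)
multiply = multiply_estimate(est_sloc, sloc, hours)
weighted = weighted_estimate(est_sloc, sloc, hours)

# 'Divide' is the same as multiplying by the harmonic mean of hour/SLOC.
hm_factor = harmonic_mean([h / s for s, h in zip(sloc, hours)])

print(f"Divide:   {divide:.1f} hours")
print(f"Multiply: {multiply:.1f} hours")
print(f"Weighted: {weighted:.1f} hours")
print(f"Est(SLOC) * HM(hour/SLOC): {est_sloc * hm_factor:.1f} hours")

# Means quoted in the slides for the two-observation example a = 1, b = 2.
print(fmean([1, 2]), geometric_mean([1, 2]), harmonic_mean([1, 2]))
```

Because the harmonic mean never exceeds the simple mean for positive data, the "Divide" estimate here comes out below the "Multiply" estimate, while the hours-weighted estimate does not depend on which ratio is averaged.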