Weighted Mean Mathematics

Total Page:16

File Type:pdf, Size:1020Kb

Weighted Mean Mathematics underground Weighted mean mathematics A weighted mean or weighted average of a set of data (or numbers) is a mean in which the different pieces of data are given different weights. “Weight” here has the sense of “importance”, but can also be thought of as literal “weight”. The simplest case is where each piece of data 푥푖 occurs with a corresponding frequency 푓푖, then we can think of the data as 푥1, … , 푥1, 푥2, … , 푥2, … , 푥푛, … , 푥푛 with 푓1 copies of 푥1, and so on. Thus the arithmetic mean of these would be 푥 + ⋯ + 푥 + 푥 + ⋯ + 푥 + ⋯ + 푥 + ⋯ + 푥 ∑ 푓 푥 ̄푥= 1 1 2 2 푛 푛 = 푖 푖 . 푓1 + 푓2 + ⋯ + 푓푛 ∑ 푓푖 More generally, if we weight the data 푥푖 with weight 푤푖, then the weighted arithmetic mean ∑ 푤푖푥푖 is . ∑ 푤푖 This is the same formula as that for the centre of mass of a set of objects (where weight 푤푖 ∑ 푚푖푥푖 is replaced by mass 푚푖): ̄푥= , that is, the 푥-coordinate of the centre of mass is the ∑ 푚푖 mean of the 푥-coordinates of the objects, where the objects are weighted by their mass. It is also the same formula as that for the expectation of a discrete random variable 푋. If 푋 can take the values 푥1, …, 푥푛 with probabilities 푝1, …, 푝푛 respectively, where 푝1 + 푝2 + ⋯ + 푝푛 = 1, then the expectation (mean) of 푋 is given by 퐸(푋) = 푝1푥1 + 푝2푥2 + ⋯ + 푝푛푥푛 = ∑ 푝푖푥푖. (We don’t need to explicitly divide by 푝1 + 푝2 + ⋯ + 푝푛 as this equals 1.) Similarly, the weighted geometric mean of a set of data would be 푤1 푤2 푤푛 1/(푤1+푤2+⋯+푤푛) (푥1 푥2 … 푥푛 ) . Weighted means are used when calculating inflation: in the UK, the Retail Price Index (RPI) and Consumer Price Index (CPI) are calculated by first finding a weighted mean cost ofaset of items in a typical shopping basket, weighted by the quantity in which they might typically be purchased. More can be found on these at the Office for National Statistics. Weighted means can also be used for continuous mass distributions. For example, if a uniform lamina has the shape shown in this diagram: Page 1 of 2 Copyright © University of Cambridge. All rights reserved. Last updated 31-Oct-16 https://undergroundmathematics.org/glossary/weighted-mean then we can estimate the 푥-coordinate of its centre of mass by summing over the strips, one of which is highlighted in red: ∑(푓(푥)훿푥)푥 ̄푥≈ ∑(푓(푥)훿푥) since each strip has mass proportional to its area 푓(푥)훿푥. In the limit as the strips become thinner, the sums become integrals, giving ∫푏 푥푓(푥) 푑푥 ̄푥= 푎 . 푏 ∫푎 푓(푥) 푑푥 The same formula applies if we are finding the expectation (mean) of a continuous random variable, where this time 푓(푥) is the probability density function, so 푏 퐸(푋) = 푥푓(푥) 푑푥. ∫푎 푏 (Again, we do not need to explicitly divide by the integral , as this equals .) ∫푎 푓(푥) 푑푥 1 Page 2 of 2 Copyright © University of Cambridge. All rights reserved. Last updated 31-Oct-16 https://undergroundmathematics.org/glossary/weighted-mean.
Recommended publications
  • Data Analysis in Astronomy and Physics
    Data Analysis in Astronomy and Physics Lecture 3: Statistics 2 Lecture_Note_3_Statistics_new.nb M. Röllig - SS19 Lecture_Note_3_Statistics_new.nb 3 Statistics Statistics are designed to summarize, reduce or describe data. A statistic is a function of the data alone! Example statistics for a set of data X1,X 2, … are: average, the maximum value, average of the squares,... Statistics are combinations of finite amounts of data. Example summarizing examples of statistics: location and scatter. 4 Lecture_Note_3_Statistics_new.nb Location Average (Arithmetic Mean) Out[ ]//TraditionalForm= N ∑Xi i=1 X N Example: Mean[{1, 2, 3, 4, 5, 6, 7}] Mean[{1, 2, 3, 4, 5, 6, 7, 8}] Mean[{1, 2, 3, 4, 5, 6, 7, 100}] 4 9 2 16 Lecture_Note_3_Statistics_new.nb 5 Location Weighted Average (Weighted Arithmetic Mean) 1 "N" In[36]:= equationXw ⩵ wi Xi "N" = ∑ wi i 1 i=1 Out[36]//TraditionalForm= N ∑wi Xi i=1 Xw N ∑wi i=1 1 1 1 1 In case of equal weights wi = w : Xw = ∑wi Xi = ∑wXi = w ∑Xi = ∑Xi = X ∑wi ∑w w N 1 N Example: data: xi, weights :w i = xi In[46]:= x1={1., 2, 3, 4, 5, 6, 7}; w1=1/ x1; x2={1., 2, 3, 4, 5, 6, 7, 8}; w2=1 x2; x3={1., 2, 3, 4, 5, 6, 7, 100}; w3=1 x3; In[49]:= x1.w1 Total[w1] x2.w2 Total[w2] x3.w3 Total[w3] Out[49]= 2.69972 Out[50]= 2.9435 Out[51]= 3.07355 6 Lecture_Note_3_Statistics_new.nb Location Median Arrange Xi according to size and renumber.
    [Show full text]
  • The Generalized Likelihood Uncertainty Estimation Methodology
    CHAPTER 4 The Generalized Likelihood Uncertainty Estimation methodology Calibration and uncertainty estimation based upon a statistical framework is aimed at finding an optimal set of models, parameters and variables capable of simulating a given system. There are many possible sources of mismatch between observed and simulated state variables (see section 3.2). Some of the sources of uncertainty originate from physical randomness, and others from uncertain knowledge put into the system. The uncertainties originating from physical randomness may be treated within a statistical framework, whereas alternative methods may be needed to account for uncertainties originating from the interpretation of incomplete and perhaps ambiguous data sets. The GLUE methodology (Beven and Binley 1992) rejects the idea of one single optimal solution and adopts the concept of equifinality of models, parameters and variables (Beven and Binley 1992; Beven 1993). Equifinality originates from the imperfect knowledge of the system under consideration, and many sets of models, parameters and variables may therefore be considered equal or almost equal simulators of the system. Using the GLUE analysis, the prior set of mod- els, parameters and variables is divided into a set of non-acceptable solutions and a set of acceptable solutions. The GLUE methodology deals with the variable degree of membership of the sets. The degree of membership is determined by assessing the extent to which solutions fit the model, which in turn is determined 53 CHAPTER 4. THE GENERALIZED LIKELIHOOD UNCERTAINTY ESTIMATION METHODOLOGY by subjective likelihood functions. By abandoning the statistical framework we also abandon the traditional definition of uncertainty and in general will have to accept that to some extent uncertainty is a matter of subjective and individual interpretation by the hydrologist.
    [Show full text]
  • Lecture 3: Measure of Central Tendency
    Lecture 3: Measure of Central Tendency Donglei Du ([email protected]) Faculty of Business Administration, University of New Brunswick, NB Canada Fredericton E3B 9Y2 Donglei Du (UNB) ADM 2623: Business Statistics 1 / 53 Table of contents 1 Measure of central tendency: location parameter Introduction Arithmetic Mean Weighted Mean (WM) Median Mode Geometric Mean Mean for grouped data The Median for Grouped Data The Mode for Grouped Data 2 Dicussion: How to lie with averges? Or how to defend yourselves from those lying with averages? Donglei Du (UNB) ADM 2623: Business Statistics 2 / 53 Section 1 Measure of central tendency: location parameter Donglei Du (UNB) ADM 2623: Business Statistics 3 / 53 Subsection 1 Introduction Donglei Du (UNB) ADM 2623: Business Statistics 4 / 53 Introduction Characterize the average or typical behavior of the data. There are many types of central tendency measures: Arithmetic mean Weighted arithmetic mean Geometric mean Median Mode Donglei Du (UNB) ADM 2623: Business Statistics 5 / 53 Subsection 2 Arithmetic Mean Donglei Du (UNB) ADM 2623: Business Statistics 6 / 53 Arithmetic Mean The Arithmetic Mean of a set of n numbers x + ::: + x AM = 1 n n Arithmetic Mean for population and sample N P xi µ = i=1 N n P xi x¯ = i=1 n Donglei Du (UNB) ADM 2623: Business Statistics 7 / 53 Example Example: A sample of five executives received the following bonuses last year ($000): 14.0 15.0 17.0 16.0 15.0 Problem: Determine the average bonus given last year. Solution: 14 + 15 + 17 + 16 + 15 77 x¯ = = = 15:4: 5 5 Donglei Du (UNB) ADM 2623: Business Statistics 8 / 53 Example Example: the weight example (weight.csv) The R code: weight <- read.csv("weight.csv") sec_01A<-weight$Weight.01A.2013Fall # Mean mean(sec_01A) ## [1] 155.8548 Donglei Du (UNB) ADM 2623: Business Statistics 9 / 53 Will Rogers phenomenon Consider two sets of IQ scores of famous people.
    [Show full text]
  • Experimental and Simulation Studies of the Shape and Motion of an Air Bubble Contained in a Highly Viscous Liquid Flowing Through an Orifice Constriction
    Accepted Manuscript Experimental and simulation studies of the shape and motion of an air bubble contained in a highly viscous liquid flowing through an orifice constriction B. Hallmark, C.-H. Chen, J.F. Davidson PII: S0009-2509(19)30414-2 DOI: https://doi.org/10.1016/j.ces.2019.04.043 Reference: CES 14954 To appear in: Chemical Engineering Science Received Date: 13 February 2019 Revised Date: 25 April 2019 Accepted Date: 27 April 2019 Please cite this article as: B. Hallmark, C.-H. Chen, J.F. Davidson, Experimental and simulation studies of the shape and motion of an air bubble contained in a highly viscous liquid flowing through an orifice constriction, Chemical Engineering Science (2019), doi: https://doi.org/10.1016/j.ces.2019.04.043 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. Experimental and simulation studies of the shape and motion of an air bubble contained in a highly viscous liquid flowing through an orifice constriction. B. Hallmark, C.-H. Chen, J.F. Davidson Department of Chemical Engineering and Biotechnology, Philippa Fawcett Drive, Cambridge. CB3 0AS. UK Abstract This paper reports an experimental and computational study on the shape and motion of an air bubble, contained in a highly viscous Newtonian liquid, as it passes through a rectangular channel having a constriction orifice.
    [Show full text]
  • Prevalence of Elevated Blood Pressure Levels In
    PREVALENCE OF ELEVATED BLOOD PRESSURE LEVELS IN OVERWEIGHT AND OBESE YOUNG POPULATION (2 – 19 YEARS) IN THE UNITED STATES BETWEEN 2011 AND 2012 By: Lawrence O. Agyekum A Dissertation Submitted to Faculty of the School of Health Related Professions, Rutgers, The State University of New Jersey in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Biomedical Informatics Department of Health Informatics Spring 2015 © 2015 Lawrence Agyekum All Rights Reserved Final Dissertation Approval Form PREVALENCE OF ELEVATED BLOOD PRESSURE LEVELS IN OVERWEIGHT AND OBESE YOUNG POPULATION (2 – 19 YEARS) IN THE UNITED STATES BETWEEN 2011 AND 2012 BY Lawrence O. Agyekum Dissertation Committee: Syed Haque, Ph.D., Committee Chair Frederick Coffman Ph.D., Committee Member Shankar Srinivasan, Ph.D., Member Approved by the Dissertation Committee: _____________________________________ Date: _______________ _____________________________________ Date: _______________ ____________________________________ Date: _______________ ii ABSTRACT Prevalence of Elevated Blood Pressure Levels in Overweight and Obese Young Population (2 -19 Years) in the United States between 2011-2012 By Lawrence Ofori Agyekum Several studies have reported hypertension prevalence in children and adolescents in the United States (US) using regional or local population-based samples but few have reported national prevalence. The present study estimates national hypertension prevalence in US children and adolescents for 2011-2012. A convenient sample size of 4,196 (population aged ≤ 19) representing 43% of 9,756 (total survey respondents) was selected and stratified by age groups; “Children” and “Adolescents” using the 2007 Joint National Committee recommended definitions. Next, hypertension distribution was explained by gender, race, age, body weight, standing height and blood serum total cholesterol.
    [Show full text]
  • Arxiv:1111.2491V1 [Physics.Data-An] 10 Nov 2011 .Simulation 2
    Optimized differential energy loss estimation for tracker detectors Ferenc Sikl´er KFKI Research Institute for Particle and Nuclear Physics, Budapest, Hungary CERN, Geneva, Switzerland S´andor Szeles E¨otv¨os University, Budapest, Hungary Abstract The estimation of differential energy loss for charged particles in tracker detectors is studied. The robust truncated mean method can be generalized to the linear combination of the energy deposit measurements. The optimized weights in case of arithmetic and geometric means are obtained using a detailed simulation. The results show better particle separation power for both semiconductor and gaseous detectors. Key words: Energy loss, Silicon, TPC PACS: 29.40.Gx, 29.85.-c, 34.50.Bw 1. Introduction The identification of charged particles is crucial in several fields of particle and nuclear physics: particle spectra, correlations, selection of daughters of resonance decays and for reducing the background of rare physics processes [1, 2]. Tracker detectors, both semiconductor and gaseous, can be employed for particle identification, or yield extraction in the statistical sense, by proper use of energy deposit measurements along the trajectory of the particle. While for gaseous detectors a wide momentum range is available, in semiconductors there is practically no logarithmic rise of differential energy loss (dE/dx) at high momentum, thus only momenta below the the minimum ionization region are accessible. In this work two representative materials, silicon and neon are studied. Energy loss of charged particles inside matter is a complicated process. For detailed theoretical model and several comparisons to measured data see Refs. [3, 4]. While the energy lost and deposited differ, they will be used interchangeably in the discussion.
    [Show full text]
  • Estimation of a Proportion in Survey Sampling Using the Logratio Approach
    Noname manuscript No. (will be inserted by the editor) Estimation of a proportion in survey sampling using the logratio approach Karel Hron · Matthias Templ · Peter Filzmoser Received: date / Accepted: date Abstract The estimation of a mean of a proportion is a frequent task in statistical survey analysis, and often such ratios are estimated from compositions such as income components, wage components, tax components, etc. In practice, the weighted arithmetic mean is reg- ularly used to estimate the center of the data. However, this estimator is not appropriate if the ratios are estimated from compositions, because the sample space of compositional data is the simplex and not the usual Euclidean space. We demonstrate that the weighted geometric mean is useful for this purpose. Even for different sampling designs, the weighted geometric mean shows excellent behavior. Keywords Survey sampling, Proportions, Compositional data, Logratio analysis K. Hron Department of Mathematical Analysis and Applications of Mathematics, Faculty of Science, Palack´yUni- versity, 17. listopadu 12, 771 46 Olomouc, Czech Republic Department of Geoinformatics, Faculty of Science, Palack´yUniversity, tˇr.Svobody 26, 771 46 Olomouc, Czech Republic Tel.: +420585634605 Fax: +420585634002 E-mail: [email protected] M. Templ Department of Statistics and Probability Theory, Vienna University of Technology, Wiedner Hauptstr. 7, A-1040 Vienna, Austria Tel.: +4315880110715 E-mail: [email protected] P. Filzmoser Department of Statistics and Probability Theory, Vienna University of Technology, Wiedner Hauptstr. 8-10, A-1040 Vienna, Austria Tel.: +4315880110733 E-mail: p.fi[email protected] 2 Karel Hron et al. 1 Introduction Many surveys are concerned with the problem of estimating a mean of proportions.
    [Show full text]
  • Simultaneous Penalized M-Estimation of Covariance Matrices Using
    Simultaneous penalized M-estimation of covariance matrices using geodesically convex optimization Esa Ollilaa,∗, Ilya Soloveychikb, David E. Tylerc,1, Ami Wieselb,2 aAalto University, Finland bThe Hebrew University of Jerusalem, Israel cRutgers – The State University of New Jersey, USA Abstract A common assumption when sampling p-dimensional observations from K dis- tinct group is the equality of the covariance matrices. In this paper, we propose two penalized M-estimation approaches for the estimation of the covariance or scatter matrices under the broader assumption that they may simply be close to each other, and hence roughly deviate from some positive definite “center”. The first approach begins by generating a pooled M-estimator of scatter based on all the data, followed by a penalised M-estimator of scatter for each group, with the penalty term chosen so that the individual scatter matrices are shrunk towards the pooled scatter matrix. In the second approach, we minimize the sum of the individual group M-estimation cost functions together with an additive joint penalty term which enforces some similarity between the individual scatter es- timators, i.e. shrinkage towards a mutual center. In both approaches, we utilize the concept of geodesic convexity to prove the existence and uniqueness of the penalized solution under general conditions. We consider three specific penalty functions based on the Euclidean, the Riemannian, and the Kullback-Leibler distances. In the second approach, the distance based penalties are shown to lead to estimators of the mutual center that are related to the arithmetic, the Riemannian and the harmonic means of positive definite matrices, respectively.
    [Show full text]
  • Weighted Power Mean Copulas: Theory and Application
    A Service of Leibniz-Informationszentrum econstor Wirtschaft Leibniz Information Centre Make Your Publications Visible. zbw for Economics Klein, Ingo; Fischer, Matthias J.; Pleier, Thomas Working Paper Weighted power mean copulas: Theory and application IWQW Discussion Papers, No. 01/2011 Provided in Cooperation with: Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics Suggested Citation: Klein, Ingo; Fischer, Matthias J.; Pleier, Thomas (2011) : Weighted power mean copulas: Theory and application, IWQW Discussion Papers, No. 01/2011, Friedrich- Alexander-Universität Erlangen-Nürnberg, Institut für Wirtschaftspolitik und Quantitative Wirtschaftsforschung (IWQW), Nürnberg This Version is available at: http://hdl.handle.net/10419/50916 Standard-Nutzungsbedingungen: Terms of use: Die Dokumente auf EconStor dürfen zu eigenen wissenschaftlichen Documents in EconStor may be saved and copied for your Zwecken und zum Privatgebrauch gespeichert und kopiert werden. personal and scholarly purposes. Sie dürfen die Dokumente nicht für öffentliche oder kommerzielle You are not to copy documents for public or commercial Zwecke vervielfältigen, öffentlich ausstellen, öffentlich zugänglich purposes, to exhibit the documents publicly, to make them machen, vertreiben oder anderweitig nutzen. publicly available on the internet, or to distribute or otherwise use the documents in public. Sofern die Verfasser die Dokumente unter Open-Content-Lizenzen (insbesondere CC-Lizenzen) zur Verfügung gestellt haben sollten, If the documents have
    [Show full text]
  • 1 Social Studies 201 September 27, 2004 Mean for Grouped Data
    1 Social Studies 201 September 27, 2004 Mean for grouped data { text, section 5.4.2, pp. 188-200. For grouped data, where data are grouped into categories or intervals and presented as diagrams or tables, the de¯nition of the mean is unchanged, but the method of obtaining it di®ers from that used for ungrouped data. The mean is the sum of values of the variable divided by the number of cases, but it is necessary to make sure that the sum is properly obtained. In order to obtain the sum or total value, each individual value must be multiplied by the number, or percentage, of cases that take on that value, and these products are added together. The mean is then obtained by dividing this sum by the number, or percentage, of cases. A formal de¯nition and examples follow { again, if you are unfamiliar with the notation below, please read the text, pp. 97-100 and pp. 190-92 on notation. De¯nition of the mean for grouped data. For a variable X, taking on k di®erent values X1;X2;X3; :::Xk, with respective frequencies f1; f2; f3; :::fk, the mean of X is f X + f X + f X + ::: + f X X¹ = 1 1 2 2 3 3 k k n where n = f1 + f2 + f3 + ::: + fk: Using summation notation, §(fX) X¹ = n where n = §f. Notes on calculating the mean with grouped data 1. Products. The ¯rst step in calculating the mean is to multiply each value of the variable X by the frequency of occurrence of that value.
    [Show full text]
  • Means and Covariance Functions for Geostatistical Compositional Data: an Axiomatic Approach
    Means and covariance functions for geostatistical compositional data: an axiomatic approach Denis Allarda, Thierry Marchantb aBiostatistics and Spatial Processes, BioSP, INRA, 84914, Avignon, France. bGhent University, Ghent, Belgium. Corresponding author: D. Allard, BioSP, 84914 Avignon, FRANCE Tel: +33 432722171; Fax: +33 432722182 [email protected] Abstract This work focuses on the characterization of the central tendency of a sample of compositional data. It provides new results about theoretical properties of means and co- variance functions for compositional data, with an axiomatic perspective. Original results that shed new light on the geostatistical modeling of compositional data are presented. As a first result, it is shown that the weighted arithmetic mean is the only central ten- dency characteristic satisfying a small set of axioms, namely continuity, reflexivity and marginal stability. Moreover, this set of axioms also implies that the weights must be identical for all parts of the composition. This result has deep consequences on the spa- tial multivariate covariance modeling of compositional data. In a geostatistical setting, it is shown as a second result that the proportional model of covariance functions (i.e., the product of a covariance matrix and a single correlation function) is the only model that provides identical kriging weights for all components of the compositional data. As a consequence of these two results, the proportional model of covariance function is the only covariance model compatible with reflexivity and marginal stability. Keywords Aitchison geometry; central tendency; functional equation; geostatistics; multi- variate covariance function model Authors are listed alphabetically. They contributed equally. 1 1 Introduction This work focuses on the characterization of the central tendency of a sample of composi- tional data and on consequences regarding its geostatistical modeling.
    [Show full text]
  • Exact Logit-Based Product Design
    Exact Logit-Based Product Design Irem_ Ak¸caku¸s UCLA Anderson School of Management, University of California, Los Angeles, California 90095, United States, [email protected] Velibor V. Miˇsi´c UCLA Anderson School of Management, University of California, Los Angeles, California 90095, United States, [email protected] The share-of-choice product design (SOCPD) problem is to find the product, as defined by its attributes, that maximizes market share arising from a collection of customer types or segments. When customers follow a logit model of choice, the market share is given by a weighted sum of logistic probabilities, leading to the logit-based share-of-choice product design problem. At first glance, this problem appears hopeless: one must optimize an objective function that is neither convex nor concave, over an exponentially-sized discrete set of attribute combinations. In this paper, we develop an exact methodology for solving this problem based on modern integer, convex and conic optimization. At the heart of our methodology is the surprising result that the logit-based SOCPD problem can be exactly reformulated as a mixed-integer convex program. Using both synthetic problem instances and instances derived from real conjoint data sets, we show that our methodology can solve large instances to provable optimality or near optimality in operationally feasible time frames. Key words : new product development; choice modeling; conjoint analysis; integer programming; convex optimization. 1. Introduction Consider the following canonical marketing problem. A firm has to design a product, which has a collection of attributes, and each attribute can be set to one of a finite set of levels.
    [Show full text]