Lecture 12 Robust Estimation

Prof. Dr. Svetlozar Rachev
Institute for Statistics and Mathematical Economics, University of Karlsruhe
Financial Econometrics, Summer Semester 2007

Copyright

These lecture notes cannot be copied and/or distributed without permission. The material is based on the textbook Financial Econometrics: From Basics to Advanced Modeling Techniques (Wiley-Finance, Frank J. Fabozzi Series) by Svetlozar T. Rachev, Stefan Mittnik, Frank Fabozzi, Sergio M. Focardi, and Teo Jašić.

Outline

- Robust statistics.
- Robust estimators of regressions.
- Illustration: robustness of the corporate bond yield spread model.

Robust Statistics

- Robust statistics addresses the problem of making estimates that are insensitive to small changes in the basic assumptions of the statistical models employed.
- The formal concepts and methods of robust statistics originated in the 1950s, although the underlying ideas had been used much earlier.
- Robust statistics:
  1. assesses the changes in estimates due to small changes in the basic assumptions;
  2. creates new estimates that are insensitive to small changes in some of the assumptions.
- Robust statistics is also useful for separating the contribution of the tails from the contribution of the body of the data.
- Peter Huber observed that robust, distribution-free, and nonparametric are actually not closely related properties.
- Example: The sample mean and the sample median are nonparametric estimates of the mean and the median, but the mean is not robust to outliers. In fact, a change in one single observation can have an unbounded effect on the mean, while the median is insensitive to changes of up to half the sample.
- Robust methods assume that there are indeed parameters in the distributions under study and attempt to minimize the effects of outliers as well as of erroneous assumptions on the shape of the distribution.

Robust Statistics: Qualitative and Quantitative Robustness

- Estimators are functions of the sample data.
- Given an N-sample of data $X = (x_1, \ldots, x_N)'$ from a population with a cdf $F(x)$ depending on a parameter $\vartheta_\infty$, an estimator for $\vartheta_\infty$ is a function $\hat{\vartheta} = \vartheta_N(x_1, \ldots, x_N)$.
- Consider those estimators that can be written as functions of the cumulative empirical distribution function:

  $$F_N(x) = N^{-1} \sum_{i=1}^{N} I(x_i \le x)$$

  where $I$ is the indicator function. For these estimators we can write $\hat{\vartheta} = \vartheta_N(F_N)$.
- Most estimators, in particular the ML estimators, can be written in this way with probability 1.
- In general, as $N \to \infty$, $F_N(x) \to F(x)$ and $\hat{\vartheta}_N \to \vartheta_\infty$ in probability. The estimator $\hat{\vartheta}_N$ is a random variable that depends on the sample.
- Under the distribution $F$, it will have a probability distribution $L_F(\vartheta_N)$.
- Statistics defined as functionals of a distribution are robust if they are continuous with respect to the distribution.
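The idea of writing an estimator as a functional of the empirical distribution function can be sketched in a few lines of Python. This is only an illustration: the function names (`ecdf`, `mean_functional`, `median_functional`) and the toy data are assumptions made here, not part of the lecture, and the median is taken as the smallest x with F_N(x) >= 1/2, which can differ from the usual averaged median for even N.

```python
import numpy as np

def ecdf(sample, x):
    """Empirical CDF: F_N(x) = (1/N) * sum_i I(x_i <= x)."""
    sample = np.asarray(sample, dtype=float)
    return float(np.mean(sample <= x))

def mean_functional(sample):
    """Mean as a functional of F_N: the integral of x dF_N(x) is the sample average."""
    return float(np.mean(np.asarray(sample, dtype=float)))

def median_functional(sample):
    """Median as a functional of F_N: the smallest sample point x with F_N(x) >= 1/2."""
    xs = np.sort(np.asarray(sample, dtype=float))
    for x in xs:
        if ecdf(xs, x) >= 0.5:
            return float(x)

# Hypothetical toy data: the second sample differs in one observation only.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
corrupted = [1.0, 2.0, 3.0, 4.0, 5000.0]

print("mean:  ", mean_functional(data), "->", mean_functional(corrupted))
print("median:", median_functional(data), "->", median_functional(corrupted))
```

Changing a single observation moves the mean arbitrarily far, while the median of this sample does not move at all, which matches the example given above.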
Robust Statistics: Qualitative and Quantitative Robustness

- In 1968, Hampel introduced a technical definition of qualitative robustness based on metrics of the functional space of distributions.
- It states that an estimator is robust for a given distribution $F$ if small deviations from $F$ in the given metric result in small deviations from $L_F(\vartheta_N)$ in the same metric, or possibly in some other metric, for any sequence of samples of increasing size.
- The definition of robustness can be made quantitative by assessing quantitatively how changes in the distribution $F$ affect the distribution $L_F(\vartheta_N)$.

Robust Statistics: Resistant Estimators

- An estimator is called resistant if it is insensitive to changes in one single observation.
- Given an estimator $\hat{\vartheta} = \vartheta_N(F_N)$, we want to understand what happens if we add a new observation of value $x$ to a large sample. To this end we define the influence curve (IC), also called the influence function.
- The IC is a function of $x$, given $\vartheta$ and $F$, and is defined as follows:

  $$IC_{\vartheta,F}(x) = \lim_{s \to 0} \frac{\vartheta((1 - s)F + s\delta_x) - \vartheta(F)}{s}$$

  where $\delta_x$ denotes a point mass 1 at $x$.
- As this definition shows, the IC is a function of the size of the single observation that is added. In other words, the IC measures the influence of a single observation $x$ on a statistic $\vartheta$ for a given distribution $F$.
- In practice, the influence curve is generated by adding a single point X to a sample Y, recomputing the statistic, and plotting the computed value against X.
- Example: The IC of the mean is a straight line.
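The practical recipe above (add one point to a fixed sample and track the statistic) can be sketched as a finite-sample "sensitivity curve". This is a minimal sketch under assumptions: the function name `sensitivity_curve`, the simulated normal sample, and the grid are all hypothetical choices made here. It takes s = 1/(N + 1) in the IC definition instead of the limit s → 0, because the empirical distribution of the augmented sample is exactly (1 − s)F_N + sδ_x for that s.

```python
import numpy as np

def sensitivity_curve(statistic, sample, grid):
    """Finite-sample analogue of the IC with s = 1/(N + 1):
    (N + 1) * (T(sample with x added) - T(sample)) for each x in grid."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    base = statistic(sample)
    return np.array([(n + 1) * (statistic(np.append(sample, x)) - base)
                     for x in grid])

# Hypothetical sample Y and grid of added values X.
rng = np.random.default_rng(0)
y = rng.normal(size=99)
grid = np.linspace(-50.0, 50.0, 11)

sc_mean = sensitivity_curve(np.mean, y, grid)      # equals x - mean(y): a straight line
sc_median = sensitivity_curve(np.median, y, grid)  # flat once x leaves the data range

for x, m, med in zip(grid, sc_mean, sc_median):
    print(f"x={x:6.1f}  mean curve={m:8.3f}  median curve={med:8.3f}")
```

For the mean the curve grows linearly and without bound in the added value, whereas for the median it stays bounded and becomes constant once the added point lies outside the range of the sample.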
Robust Statistics: Resistant Estimators

Several aspects of the influence curve are of particular interest:

- Is the curve "bounded" as the X values become extreme? Robust statistics should be bounded. That is, a robust statistic should not be unduly influenced by a single extreme point.
- What is the general behavior as the X observation becomes extreme? For example, does it become smoothly down-weighted as the values become extreme?
- What is the influence if the X point is in the "center" of the Y points?

Robust Statistics: Breakdown Bound

The breakdown (BD) bound, or breakdown point, is the largest possible fraction of observations for which there is a bound on the change of the estimate when that fraction of the sample is altered without restrictions.

Example: We can change up to 50% of the sample points without provoking unbounded changes of the median. By contrast, changing one single observation can have an unbounded effect on the mean.

Robust Statistics: Rejection Point

- The rejection point is defined as the point beyond which the IC becomes zero. Note: The observations beyond the rejection point make no contribution to the final estimate except, possibly, through the auxiliary scale estimate.
- Estimators that have a finite rejection point are said to be redescending and are well protected against very large outliers. However, a finite rejection point usually results in the underestimation of scale.
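The breakdown example for the mean and the median can be checked numerically. The following is a small sketch, not a general procedure: the helper `contaminate`, the simulated sample of 100 points, and the contaminating value 1e6 are assumptions chosen here for illustration.

```python
import numpy as np

def contaminate(sample, k, value):
    """Return a copy of `sample` in which the first k observations are replaced by `value`."""
    out = np.array(sample, dtype=float)
    out[:k] = value
    return out

# Hypothetical clean sample of N = 100 observations.
rng = np.random.default_rng(1)
clean = rng.normal(size=100)

# Replace k of the 100 points by an arbitrarily large value and recompute both estimates.
for k in (0, 1, 25, 49, 51):
    bad = contaminate(clean, k, 1e6)
    print(f"altered={k:3d}  mean={np.mean(bad):12.2f}  median={np.median(bad):12.2f}")
```

With a single altered point the mean is already far from the bulk of the data, while the median stays inside the range of the clean observations as long as fewer than half of the points are altered.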
Robust Statistics: Main Concepts

- The gross error sensitivity expresses asymptotically the maximum effect that a contaminated observation can have on the estimator. It is the maximum absolute value of the IC.
- The local shift sensitivity measures the effect of the removal of a mass at $y$ and its reintroduction at $x$. For a continuous and differentiable IC, the local shift sensitivity is given by the maximum absolute value of the slope of the IC at any point.
- Winsor's principle states that all distributions are normal in the middle.

Robust Statistics: M-Estimators

- M-estimators are those estimators that are obtained by minimizing a function of the sample data.
- Suppose that we are given an N-sample of data $X = (x_1, \ldots, x_N)'$. The estimator $T(x_1, \ldots, x_N)$ is called an M-estimator if it is obtained by solving the following minimum problem:

  $$T = \arg\min_t \left\{ J = \sum_{i=1}^{N} \rho(x_i, t) \right\}$$

  where $\rho(x_i, t)$ is an arbitrary function.
- Alternatively, if $\rho(x_i, t)$ is a smooth function, we can say that $T$ is an M-estimator if it is determined by solving the equations:

  $$\sum_{i=1}^{N} \psi(x_i, t) = 0$$

  where

  $$\psi(x_i, t) = \frac{\partial \rho(x_i, t)}{\partial t}$$
- When the M-estimator is equivariant, that is, $T(x_1 + a, \ldots, x_N + a) = T(x_1, \ldots, x_N) + a$ for all $a \in \mathbb{R}$, we can write $\psi$ and $\rho$ in terms of the residuals $x - t$.
- Also, in general, an auxiliary scale estimate, $S$, is used to obtain the scaled residuals $r = (x - t)/S$.
  If the estimator is also equivariant to changes of scale, we can write

  $$\psi(x, t) = \psi\left(\frac{x - t}{S}\right) = \psi(r)$$

  $$\rho(x, t) = \rho\left(\frac{x - t}{S}\right) = \rho(r)$$
- ML estimators are M-estimators with $\rho = -\log f$, where $f$ is the probability density.
- The name M-estimators means maximum likelihood-type estimators. LS estimators are also M-estimators.
- The IC of M-estimators has a particularly simple form.
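To make the estimating equation with scaled residuals concrete, here is a minimal sketch of a location M-estimator. Several pieces are assumptions not taken from the lecture: the choice of Huber's ψ (identity in the middle, clipped at ±c), the tuning constant c = 1.345, the normalized median absolute deviation as the auxiliary scale S, the iteratively reweighted averaging scheme, and the toy data.

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Huber's psi (assumed here): psi(r) = r for |r| <= c, and +/- c beyond."""
    return np.clip(r, -c, c)

def huber_location(x, c=1.345, tol=1e-8, max_iter=100):
    """Solve sum_i psi((x_i - t) / S) = 0 for t by iteratively reweighted averaging."""
    x = np.asarray(x, dtype=float)
    t = float(np.median(x))
    # Auxiliary scale S: normalized median absolute deviation, held fixed.
    s = 1.4826 * float(np.median(np.abs(x - t)))
    if s == 0.0:
        return t
    for _ in range(max_iter):
        r = (x - t) / s
        # Weights w(r) = psi(r) / r: 1 for |r| <= c, c / |r| otherwise.
        w = c / np.maximum(np.abs(r), c)
        t_new = float(np.sum(w * x) / np.sum(w))
        if abs(t_new - t) < tol * s:
            return t_new
        t = t_new
    return t

# Hypothetical sample with one gross outlier.
sample = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0])
estimate = huber_location(sample)
print("mean:", sample.mean(), " Huber M-estimate:", estimate)
```

Replacing `huber_psi` by the identity (ρ(r) = r²/2, ψ(r) = r) makes every weight equal to 1 and returns the sample mean, which is the sense in which LS estimators are also M-estimators.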