Probability Distributions


Part 1: Probability Distributions (ESS210B, Prof. Jin-Yi Yu)

Topics: probability distributions, the normal distribution, the Student-t distribution, the chi-square distribution, the F distribution, and significance tests.

Purposes of Data Analysis
Data analysis serves to organize data, find relationships among the data, and test the significance of results, so that samples can tell us about the true distributions or relationships in the Earth system. Applications include weather forecasts, physical understanding, and so on.

Parameters and Statistics
Parameters are numbers that describe a population, for example the population mean (µ) and standard deviation (σ). Statistics are numbers calculated from a sample. A given population has only one value of a particular parameter, but a statistic calculated from different samples of the population generally takes values that differ both from each other and from the parameter that the statistic is designed to estimate. The science of statistics is concerned with how to draw reliable, quantifiable inferences from statistics about parameters. (from Middleton 2000)

Variables and Samples
Random variable: a variable whose values occur at random, following a probability distribution. Observation: when the random variable actually attains a value, that value is called an observation (of the variable). Sample: a collection of several observations. If the observations are generated in a random fashion with no bias, the sample is known as a random sample. By observing the distribution of values in a random sample, we can draw conclusions about the underlying probability distribution.

Probability Distribution
The pattern of probabilities for a set of events is called a probability distribution; it may be continuous or discrete. (1) The probability of each event, or combination of events, must range from 0 to 1. (2) The sum of the probabilities of all possible events must equal 1.

Example
If you throw a die, there are six possible outcomes: the numbers 1, 2, 3, 4, 5, or 6. The die value is an example of a random variable, a variable whose possible values occur at random. When the random variable actually attains a value (such as when the die is actually thrown), that value is called an observation. If you throw the die 10 times, you have a random sample consisting of 10 observations.

Probability Density Function
For a continuous variable X with probability density function f(x), the probability P that a randomly selected value of X falls between a and b is obtained by integrating f(x) from x = a to x = b. The density must be integrated over distinct limits to obtain a probability; the probability that X takes any one particular value is zero. Two important properties of the probability density function: (1) f(x) ≥ 0 for all x within the domain of f; (2) the integral of f(x) over the entire domain of f equals 1.

Cumulative Distribution Function
The cumulative distribution function F(x) is defined as the probability that the variable assumes a value less than x. The cumulative distribution function is often used to assist in calculating probabilities (as shown later), through the essential relation between F and P: P(a ≤ X ≤ b) = F(b) - F(a).
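To make these rules concrete, the following short Python sketch (assuming NumPy and SciPy are available; the fair-die probabilities, the standard normal density, and the limits a and b are illustrative choices, not part of the original notes) checks the two defining properties of a probability distribution and the relation P(a ≤ X ≤ b) = F(b) - F(a).

```python
# Sketch (assumes numpy and scipy are installed): the defining rules of a
# probability distribution and the relation P(a <= X <= b) = F(b) - F(a).
import numpy as np
from scipy import stats, integrate

# Discrete case: a fair die. Each probability lies in [0, 1] and they sum to 1.
die_probs = np.full(6, 1.0 / 6.0)
assert np.all((die_probs >= 0.0) & (die_probs <= 1.0))
assert np.isclose(die_probs.sum(), 1.0)

# Continuous case: probabilities come from integrating the density f(x);
# any single value has probability zero.
f = stats.norm(loc=0.0, scale=1.0).pdf        # example density
F = stats.norm(loc=0.0, scale=1.0).cdf        # its cumulative distribution function
a, b = -0.60, 0.75

p_integral, _ = integrate.quad(f, a, b)       # integral of f from a to b
p_cdf = F(b) - F(a)                           # the same probability via the CDF

print(round(p_integral, 4), round(p_cdf, 4))  # both approximately 0.4991
```

The two numbers agree, which is exactly the relation between F and P used in the worked examples that follow.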
Normal Distribution
The normal distribution has the probability density function f(x) = 1/(σ√(2π)) exp[-(x - µ)²/(2σ²)], where µ is the mean of the population and σ is the standard deviation of the population. The normal distribution is one of the most important distributions in geophysics. Most geophysical variables (such as wind, temperature, pressure, etc.) are distributed normally about their means.

Standard Normal Distribution
The standard normal distribution has a mean of 0 and a standard deviation of 1. This probability distribution is particularly useful because it can represent any normal distribution, whatever its mean and standard deviation. Using the following transformation, a normal distribution of the variable X can be converted to the standard normal distribution of the variable Z: Z = (X - µ) / σ.

Transformations
It can be shown that any frequency function can be transformed into a frequency function of a given form by a suitable transformation or functional relationship. For example, if the original data follow some complicated skewed distribution, we may want to transform that distribution into a known distribution (such as the normal distribution) whose theory and properties are well known. Most geoscience variables are distributed normally about their means or can be transformed in such a way that they become normally distributed. The normal distribution is therefore one of the most important distributions in geoscience data analysis.

How to Use the Standard Normal Distribution
Example 1: What is the probability that a value of Z is greater than 0.75?
Answer: P(Z ≥ 0.75) = 1 - P(Z ≤ 0.75) = 1 - 0.7734 = 0.2266.

Another Example
Example 2: What is the probability that Z lies between the limits Z1 = -0.60 and Z2 = 0.75?
Answer: By symmetry, P(Z ≤ -0.60) = 1 - P(Z ≤ 0.60) = 1 - 0.7257 = 0.2743.
P(-0.60 ≤ Z ≤ 0.75) = P(Z ≤ 0.75) - P(Z ≤ -0.60) = 0.7734 - 0.2743 = 0.4991.

Example 1
Given a normal distribution with µ = 50 and σ = 10, find the probability that X assumes a value between 45 and 62. The Z values corresponding to x1 = 45 and x2 = 62 are
z1 = (45 - 50) / 10 = -0.5 and z2 = (62 - 50) / 10 = 1.2.
Therefore, P(45 < X < 62) = P(-0.5 < Z < 1.2) = P(Z < 1.2) - P(Z < -0.5) = 0.8849 - 0.3085 = 0.5764.

Example 2
An electrical firm manufactures light bulbs whose length of life is normally distributed with a mean of 800 hours and a standard deviation of 40 hours. Find the probability that a bulb burns between 778 and 834 hours. The Z values corresponding to x1 = 778 and x2 = 834 are
z1 = (778 - 800) / 40 = -0.55 and z2 = (834 - 800) / 40 = 0.85.
Therefore, P(778 < X < 834) = P(-0.55 < Z < 0.85) = P(Z < 0.85) - P(Z < -0.55) = 0.8023 - 0.2912 = 0.5111.

Probability of Normal Distribution
There is only a 4.55% probability that a normally distributed variable will fall more than 2 standard deviations away from its mean; this is the two-tailed probability. The probability that a normal variable will exceed its mean by more than 2σ is only half of that, 2.275%.

Application in Operational Climatology
[Figure: average temperatures for January 1989 over the US, expressed as quantiles of the local normal distributions; contour labels at z = -1.28, -0.53, 0.53, and 1.28, with the z = 1.28 contour marking regions warmer than 9 out of 10 years.]
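The worked examples above can be checked with a few lines of Python; the sketch below (assuming SciPy is available; the variable names are illustrative) reproduces the light-bulb calculation using the normal cumulative distribution function instead of a printed Z table.

```python
# Sketch (assumes scipy is installed): reproduce the light-bulb example
# P(778 < X < 834) for a normal distribution with mean 800 and sd 40.
from scipy.stats import norm

mu, sigma = 800.0, 40.0          # bulb lifetime: population mean and standard deviation (hours)

# Standardize the limits: Z = (X - mu) / sigma
z1 = (778 - mu) / sigma          # -0.55
z2 = (834 - mu) / sigma          #  0.85

# P(778 < X < 834) = P(z1 < Z < z2) = F(z2) - F(z1)
p_std = norm.cdf(z2) - norm.cdf(z1)

# Equivalently, work with X directly by passing loc (mean) and scale (sd).
p_direct = norm.cdf(834, loc=mu, scale=sigma) - norm.cdf(778, loc=mu, scale=sigma)

print(round(p_std, 4), round(p_direct, 4))   # both about 0.5112; the table lookup gives 0.5111
```

The small difference from the hand calculation comes only from rounding the tabulated values to four decimal places.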
Different values of µ and σ have been estimated for each location.

How to Estimate µ and σ
In order to use the normal distribution, we need to know the mean and standard deviation of the population. These are impossible to know in most geoscience applications, because most geoscience populations are infinite. We therefore have to estimate µ and σ from samples, using the sample mean and the sample standard deviation.

Sampling Distribution
Is the sample mean close to the population mean? To answer this question, we need to know the probability distribution of the sample mean. We can obtain this distribution by repeatedly drawing samples from a population and finding the resulting frequency distribution.

Example
If we repeatedly draw N observations from a standard normal distribution (µ = 0 and σ = 1) and calculate the sample mean each time, what is the distribution of the sample mean? (from Berk & Carey's book)

Distribution of Sample Average
What did we learn from this example? If a sample is composed of N random observations from a normal distribution with mean µ and standard deviation σ, then the distribution of the sample average is a normal distribution with mean µ but standard deviation σ/√N.

Standard Error
The standard deviation of the probability distribution of the sample mean is also referred to as the "standard error" of the sample mean.

Distribution of Sample Means
One example of how the normal distribution can be used for data that are not normally distributed is to determine the distribution of sample means. As before, to decide whether the sample mean is close to the population mean we need the probability distribution of the sample mean, which we can obtain by repeatedly drawing samples from the population and finding the frequency distribution of their means.

An Example of Normal Distribution
[Figure: 1000 random numbers drawn from a non-normal distribution, together with the distributions of sample means for N = 2 and N = 10; from Panofsky and Brier 1968.] If we repeatedly draw a sample of N values from a population and calculate the mean of each sample, the distribution of the sample means tends toward a normal distribution.

Central Limit Theorem
In the limit, as the sample size becomes large, the sum (or the mean) of a set of independent measurements has a normal distribution, irrespective of the distribution of the raw data. If a population has a mean of µ and a standard deviation of σ, then the distribution of its sample means has (1) a mean µ_x̄ = µ, and (2) a standard deviation σ_x̄ = σ/√N, the standard error of the mean.

Sample Size
[Figure: percentage error of the estimated standard deviation, (S - σ)/σ × 100, as a function of sample size N, with N = 30 marked.] The larger the sample size N, the more accurate the estimated mean and standard deviation are as estimates of the true values.
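The sampling-distribution results above are easy to verify numerically. The sketch below (Python, assuming NumPy is available; the skewed exponential population, the sample sizes, and the number of repetitions are illustrative choices, not part of the original notes) repeatedly draws samples from a non-normal population and compares the spread of the sample means with the predicted standard error σ/√N.

```python
# Sketch (assumes numpy is installed): empirical check of the Central Limit
# Theorem and of the standard error formula sigma_xbar = sigma / sqrt(N).
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 1.0              # an exponential(1) population: skewed, not normal
n_samples = 10_000                # number of repeated samples to draw

for N in (2, 10, 30):
    # Draw n_samples samples of size N and compute each sample's mean.
    samples = rng.exponential(scale=1.0, size=(n_samples, N))
    sample_means = samples.mean(axis=1)

    # The sample means cluster around mu, their spread is close to sigma/sqrt(N),
    # and their histogram looks increasingly normal as N grows.
    print(N,
          round(sample_means.mean(), 3),        # approximately mu
          round(sample_means.std(ddof=1), 3),   # approximately sigma / sqrt(N)
          round(sigma / np.sqrt(N), 3))         # the predicted standard error
```

Even for this strongly skewed population, by N = 30 the distribution of the sample means is close to normal, which is why the normal distribution can be applied to means of non-normally distributed data.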