ST 380 Probability and Statistics for the Physical Sciences
Parameter Estimation

Probability theory tells us what to expect when we carry out some experiment with random outcomes, in terms of the parameters of the problem.

Statistical theory tells us what we can learn about those parameters when we have seen the outcome of the experiment.

We speak of making statistical inferences about the parameters.


Point Estimation

A point estimate of a parameter is a single value that represents a best guess as to the value of the parameter.

For example, Rasmussen Reports surveyed 1,500 likely voters over a 3-day period, and 690 agreed that they approve of the President’s performance in office.

We assume that each voter was randomly selected from a population in which a fraction p of voters would agree.

Here p is the parameter of interest, and the natural point estimate of it is p̂ = 690/1500 = 0.46, or 46%.
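In R, this point estimate is just the observed sample proportion; a minimal sketch using the survey counts above:

approve <- 690              # respondents who approve
n <- 1500                   # sample size
phat <- approve / n         # point estimate of p
phat                        # 0.46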


Sample

In any situation where we observe a simple random sample X1, X2, ..., Xn from some population with mean µ, we know that the sample mean X̄ = (X1 + X2 + ··· + Xn)/n satisfies

E(X̄) = µ, so it is natural to estimate µ by X̄.

We treat the Rasmussen survey as a binomial experiment with E(Xi) = p, so using p̂ (= X̄) to estimate p is a special case of using X̄ to estimate µ.
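A quick simulation sketch (with hypothetical values of µ, the sample size, and the number of replications) illustrates why X̄ is a natural estimator: averaged over many samples, the sample mean lands on µ.

set.seed(1)
mu <- 70                                            # hypothetical population mean
xbars <- replicate(10000, mean(rnorm(25, mu, 10)))  # 10,000 sample means, each with n = 25
mean(xbars)                                         # close to mu = 70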


Estimator and Estimate

It is important to distinguish between the rule that we follow to estimate a parameter and the value that we find for a particular sample.

We call the rule an estimator and the value an estimate.

For example, in the survey data, the rule is “estimate p by the sample fraction p̂”, and the value is 0.46.

So the estimator is p̂, and the estimate is 0.46.

One week ago, the same estimator p̂ with a different sample gave a different estimate, 0.49.
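The distinction is easy to see in R: the estimator is a rule (a function), and each sample fed to it produces its own estimate. A small sketch with hypothetical simulated 0/1 responses, not the actual survey data:

phat_rule <- function(x) mean(x)      # the estimator: a rule applied to any sample
set.seed(2)
sample_now  <- rbinom(1500, 1, 0.46)  # hypothetical responses this week
sample_last <- rbinom(1500, 1, 0.46)  # hypothetical responses last week
phat_rule(sample_now)                 # the estimate from this sample
phat_rule(sample_last)                # the estimate from the other sample (generally different)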


Sampling Distribution

Clearly a point estimator is a statistic, and therefore has a sampling distribution.

Suppose that X1, X2, ..., Xn is a random sample from some population with a parameter θ, and that θ̂ = θ̂(X1, X2, ..., Xn) is a statistic that we want to use as an estimator of θ.
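For instance, the sampling distribution of the survey estimator p̂ can be approximated by simulation (hypothetical true p and replication count):

set.seed(3)
phat_reps <- replicate(5000, mean(rbinom(1500, 1, 0.46)))  # 5,000 simulated estimates
hist(phat_reps, main = "Approximate sampling distribution of p-hat")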


Bias

If E(θ̂) = θ for all possible values of θ, θ̂ is an unbiased estimator of θ.

In general, the bias of θ̂ as an estimator of θ is

E(θ̂ − θ) = E(θ̂) − θ.

A biased estimator in a sense systematically over-estimates or under-estimates θ, so we try to avoid estimators with large bias.

An unbiased estimator is desirable, but not always available, and not always sensible.


For example, suppose that n = 1, and X = X1 has the Poisson distribution with parameter µ:

\[
P(X = x) = p(x; \mu) = \frac{e^{-\mu} \mu^x}{x!}, \qquad x = 0, 1, \ldots
\]

E(X) = µ, so X is an unbiased estimator of µ, but suppose that the parameter of interest is θ = e^{−µ}.

The only unbiased estimator of θ is

\[
\hat{\theta} = \begin{cases} 1 & \text{if } X = 0, \\ 0 & \text{if } X > 0. \end{cases}
\]
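A simulation sketch (with a hypothetical value of µ) confirms that this indicator estimator is unbiased for θ = e^{−µ}, even though it can only ever take the values 0 and 1:

set.seed(4)
mu <- 1.5                        # hypothetical Poisson mean
x <- rpois(100000, mu)           # many replications of the n = 1 experiment
mean(x == 0)                     # simulated E(theta-hat)
exp(-mu)                         # theta = exp(-mu): the two agree closely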


Mean Squared Error

We measure how far an estimator θ̂ is from the true value θ using the mean squared error:

MSE(θ̂; θ) = E[(θ̂ − θ)²].

We can show that

MSE(θ̂; θ) = (bias)² + V(θ̂).

For an unbiased estimator, bias = 0, so

MSE(θ̂; θ) = V(θ̂).
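A numerical check of the decomposition, using simulated replicates of the survey estimator p̂ (with a hypothetical true p):

set.seed(5)
p <- 0.46; n <- 1500
phat_reps <- replicate(10000, mean(rbinom(n, 1, p)))
mean((phat_reps - p)^2)                   # simulated MSE
(mean(phat_reps) - p)^2 + var(phat_reps)  # (bias)^2 + variance: nearly identical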


Many biased estimators are approximately unbiased, in the sense that

(bias)² ≪ V(θ̂), so MSE(θ̂; θ) ≈ V(θ̂).

Standard error

So if an estimator is unbiased, either exactly or approximately, its performance is measured by V(θ̂), or by its standard deviation σ_θ̂ = √V(θ̂), also known as its standard error.


Often an estimator’s standard error is a function of θ or other parameters; these must be replaced by estimates before we can actually calculate a value.

Estimated standard error

The resulting statistic is called the estimated standard error, and is denoted σ̂_θ̂.

Example: binomial distribution; V(p̂) = p(1 − p)/n, so

\[
\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}, \qquad \text{and} \qquad \hat{\sigma}_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}.
\]
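For the survey example above, the estimated standard error works out as follows in R:

phat <- 690 / 1500                     # 0.46
n <- 1500
se_hat <- sqrt(phat * (1 - phat) / n)  # estimated standard error of phat
se_hat                                 # about 0.013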

Methods of Point Estimation

In some situations we have an obvious estimator θ̂, such as the binomial p̂ = X/n.

In other cases we may not.

Example: Ozone pollution

Suppose that X1, X2,..., X28 are daily maximum ozone levels on 28 consecutive days. Suppose further that we want to model these as independent variables with the Weibull distribution

\[
f(x; \alpha, \beta) = \frac{\alpha}{\beta} \left( \frac{x}{\beta} \right)^{\alpha - 1} e^{-(x/\beta)^{\alpha}}, \qquad 0 < x < \infty.
\]
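This is the same parameterization that R uses, with α as the shape and β as the scale; a quick check at hypothetical values of x, α, and β:

x <- 50; alpha <- 2; beta <- 70
dweibull(x, shape = alpha, scale = beta)                          # R's built-in density
(alpha / beta) * (x / beta)^(alpha - 1) * exp(-(x / beta)^alpha)  # formula above: same value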


It is not obvious how to estimate either α or β.

Suppose that we know from other data that α is well approximated by the value 2. It is still not obvious how to estimate β.

Before we observed the data, the joint pdf

\[
\prod_{i=1}^{n} f(x_i; \alpha, \beta)
\]

measures the relative probability of observing specific values x1, x2, ..., xn.


Likelihood function

After observing x1, x2,..., xn, we can use the same function to measure the relative likelihood of different values of α and β (or just β if we believe we know the value of α = α0).

When used this way, we call it the likelihood function,

\[
L(\beta) = \prod_{i=1}^{n} f(x_i; \alpha_0, \beta).
\]


Example, with simulated ozone levels:

n <- 28                          # number of daily maxima
alpha0 <- 2                      # shape, assumed known
beta <- 70                       # true scale, used only to simulate the data
x <- rweibull(n, alpha0, beta)   # simulated ozone levels
L <- function(beta) {            # likelihood as a function of the scale beta
  lik <- rep(NA, length(beta))
  for (i in 1:length(beta))
    lik[i] <- prod(dweibull(x, alpha0, beta[i]))
  lik
}
plot(L, from = 50, to = 100)     # plot the likelihood over a range of beta values


Maximum Likelihood

The most likely value of β, the value that maximizes the likelihood, is the maximum likelihood estimate.

Maximum likelihood estimators are generally approximately unbiased, and have close to the smallest possible mean squared error.

Most of the estimators that we cover later will be maximum likelihood estimators, or sometimes unbiased modifications of them.


In the example, we can use the optimize() function to find the maximum likelihood estimate of β:

o <- optimize(L, c(50, 100), maximum = TRUE)       # maximize L over (50, 100)
abline(v = o$maximum, col = "blue")                # mark the MLE on the likelihood plot
title(paste("MLE of beta:", round(o$maximum, 1)))

Alternatively, by setting the derivative of the log-likelihood with respect to β to zero, we can show analytically that

\[
\hat{\beta}_{\mathrm{ML}} = \left( \frac{1}{n} \sum_{i=1}^{n} x_i^{\alpha_0} \right)^{1/\alpha_0},
\]

which in R is mean(x^alpha0)^(1/alpha0).
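Using the simulated data x, the assumed shape alpha0, and the optimize() result o from the code above, the two answers can be compared directly:

beta_ml <- mean(x^alpha0)^(1 / alpha0)   # closed-form MLE of beta
beta_ml
o$maximum                                # numerical MLE: essentially the same value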
