Math 541 - Numerical Analysis: Approximation Theory, Discrete Least Squares Approximation
Joseph M. Mahaffy, [email protected]
Department of Mathematics and Statistics, Dynamical Systems Group, Computational Sciences Research Center
San Diego State University, San Diego, CA 92182-7720
http://jmahaffy.sdsu.edu
Spring 2018

Outline
1 Approximation Theory: Discrete Least Squares
   Introduction
   Discrete Least Squares
   A Simple, Powerful Approach
2 Application: Cricket Thermometer
   Polynomial fits
   Model Selection - BIC and AIC
   Different Norms
   Weighted Least Squares
3 Application: U.S. Population
   Polynomial fits
   Exponential Models
4 Application: Pharmacokinetics
   Compartment Models
   Dog Study with Opioid
   Exponential Peeling
   Nonlinear least squares fits

Introduction: Matching a Few Parameters to a Lot of Data
Sometimes we get a lot of data, many observations, and want to fit it to a simple model.
[Figure: measured data on [0, 5] scattered around the underlying function f(x) = 1 + x + x^2/25, shown together with the average, the linear best fit, and the quadratic best fit.]
PDF-link: code.

Why a Low Dimensional Model?
Low dimensional models (e.g. low-degree polynomials) are easy to work with and are quite well behaved (high-degree polynomials can be quite oscillatory). All measurements are noisy to some degree. Often we want to use a large number of measurements in order to "average out" random noise.
Approximation Theory looks at two problems:
[1] Given a data set, find the best fit for a model (i.e. in a class of functions, find the one that best represents the data).
[2] Find a simpler model approximating a given function.

Interpolation: A Bad Idea?
We can probably agree that trying to interpolate this data set:
[Figure: the measured data on [0, 5].]
with a 50th-degree polynomial is not the best idea in the world... Even fitting a cubic spline to this data gives wild oscillations! [I tried, and it was not pretty!]

Defining "Best Fit": the Residual
We are going to relax the requirement that the approximating function must pass through all the data points. Now we need a measurement of how well our approximation fits the data, that is, a definition of "best fit."
If f(x_i) are the measured function values and a(x_i) are the values of our approximating function, we can define a function
    r(x_i) = f(x_i) - a(x_i),
which measures the deviation (residual) at x_i. Notice that \tilde{r} = (r(x_0), r(x_1), \dots, r(x_n))^T is a vector.
Notation: from now on, f_i = f(x_i), a_i = a(x_i), and r_i = r(x_i). Further, \tilde{f} = (f_0, f_1, \dots, f_n)^T, \tilde{a} = (a_0, a_1, \dots, a_n)^T, and \tilde{r} = (r_0, r_1, \dots, r_n)^T.

What is the Size of the Residual?
There are many possible choices, e.g.
- The abs-sum of the deviations: E_1 = \sum_{i=0}^{n} |r_i|, i.e. E_1 = \|\tilde{r}\|_1.
- The sum-of-the-squares of the deviations: E_2 = \sqrt{\sum_{i=0}^{n} |r_i|^2}, i.e. E_2 = \|\tilde{r}\|_2.
- The largest of the deviations: E_\infty = \max_{0 \le i \le n} |r_i|, i.e. E_\infty = \|\tilde{r}\|_\infty.
In most cases, the sum-of-the-squares version is the easiest to work with.
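The three residual norms above can be computed directly. A minimal Python sketch; the data values are made up for illustration and are not taken from the slides:

```python
# Residual norms E1, E2, E_inf for a toy data set (values are hypothetical).
f = [1.0, 2.3, 2.9, 4.2]   # measured values f_i
a = [1.1, 2.0, 3.0, 4.0]   # approximating values a_i

# Residuals r_i = f_i - a_i
r = [fi - ai for fi, ai in zip(f, a)]

E1 = sum(abs(ri) for ri in r)           # ||r||_1: abs-sum of deviations
E2 = sum(ri ** 2 for ri in r) ** 0.5    # ||r||_2: root of sum-of-squares
Einf = max(abs(ri) for ri in r)         # ||r||_inf: largest deviation
```

The E_2 choice is preferred in what follows because minimizing a sum of squares leads to smooth (differentiable) calculus in the parameters.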
(From now on we will focus on this choice...)

Discrete Least Squares Approximation
We have chosen the sum-of-squares measurement for errors. Let's find the constant that best fits the data: minimize
    E(C) = \sum_{i=0}^{n} (f_i - C)^2.
If C^* is a minimizer, then E'(C^*) = 0 [the derivative at a max/min is zero]:
    E'(C) = -\sum_{i=0}^{n} 2(f_i - C) = -2 \sum_{i=0}^{n} f_i + 2(n+1)C, \qquad E''(C) = 2(n+1).
Setting E'(C) = 0 and solving for C gives
    C^* = \frac{1}{n+1} \sum_{i=0}^{n} f_i,
which is a minimum since E''(C^*) = 2(n+1) > 0. This is the constant that best fits the data. (Note: C^* is the average.)

Discrete Least Squares: Linear Approximation
The form of Least Squares you are most likely to see: find the linear function, p_1(x) = a_0 + a_1 x, that best fits the data. The error E(a_0, a_1) we need to minimize is
    E(a_0, a_1) = \sum_{i=0}^{n} [(a_0 + a_1 x_i) - f_i]^2.
The first partial derivatives with respect to a_0 and a_1 must be zero at the minimum:
    \frac{\partial}{\partial a_0} E(a_0, a_1) = 2 \sum_{i=0}^{n} [(a_0 + a_1 x_i) - f_i] = 0,
    \frac{\partial}{\partial a_1} E(a_0, a_1) = 2 \sum_{i=0}^{n} x_i [(a_0 + a_1 x_i) - f_i] = 0.
We "massage" these expressions to get the Normal Equations...

Linear Approximation: The Normal Equations for p_1(x)
    a_0 (n+1) + a_1 \sum_{i=0}^{n} x_i = \sum_{i=0}^{n} f_i,
    a_0 \sum_{i=0}^{n} x_i + a_1 \sum_{i=0}^{n} x_i^2 = \sum_{i=0}^{n} x_i f_i.
Since everything except a_0 and a_1 is known, this is a 2-by-2 system of equations:
    \begin{pmatrix} n+1 & \sum_{i=0}^{n} x_i \\ \sum_{i=0}^{n} x_i & \sum_{i=0}^{n} x_i^2 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} \sum_{i=0}^{n} f_i \\ \sum_{i=0}^{n} x_i f_i \end{pmatrix}.
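The claim that the best-fitting constant is the average is easy to check numerically. A small sketch with made-up data, comparing the closed form C^* against a brute-force scan of E(C):

```python
# Best constant fit: minimizing E(C) = sum (f_i - C)^2.
# Hypothetical data values, chosen only for illustration.
f = [2.0, 3.5, 1.5, 4.0, 3.0]

# Closed form from the derivation: C* = (1/(n+1)) * sum f_i, the average.
C_star = sum(f) / len(f)

# Brute-force check: scan candidate constants on a fine grid, keep the best.
E = lambda C: sum((fi - C) ** 2 for fi in f)
C_grid = min((k * 0.001 for k in range(0, 5001)), key=E)
```

Because E(C) is a strictly convex parabola in C, the grid search lands on the grid point nearest the true minimizer, confirming the calculus result.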
Alternate Linear Least Squares
For the linear least squares model p_1(x) = a_0 + a_1 x, where the error
    E(a_0, a_1) = \sum_{i=0}^{n} [(a_0 + a_1 x_i) - f_i]^2
is minimized for the data set (x_i, f_i), i = 0, \dots, n, there is an easy formulation without matrices, often stated in statistics texts. Define the averages
    \bar{x} = \frac{1}{n+1} \sum_{i=0}^{n} x_i \quad \text{and} \quad \bar{f} = \frac{1}{n+1} \sum_{i=0}^{n} f_i.
The best fitting slope and intercept are
    a_1 = \frac{\sum_{i=0}^{n} (x_i - \bar{x}) f_i}{\sum_{i=0}^{n} (x_i - \bar{x})^2} \quad \text{and} \quad a_0 = \bar{f} - a_1 \bar{x}.

Quadratic Model, p_2(x)
For the quadratic polynomial p_2(x) = a_0 + a_1 x + a_2 x^2, the error is given by
    E(a_0, a_1, a_2) = \sum_{i=0}^{n} [(a_0 + a_1 x_i + a_2 x_i^2) - f_i]^2.
At the minimum (best model) we must have
    \frac{\partial}{\partial a_0} E(a_0, a_1, a_2) = 2 \sum_{i=0}^{n} [(a_0 + a_1 x_i + a_2 x_i^2) - f_i] = 0,
    \frac{\partial}{\partial a_1} E(a_0, a_1, a_2) = 2 \sum_{i=0}^{n} x_i [(a_0 + a_1 x_i + a_2 x_i^2) - f_i] = 0,
    \frac{\partial}{\partial a_2} E(a_0, a_1, a_2) = 2 \sum_{i=0}^{n} x_i^2 [(a_0 + a_1 x_i + a_2 x_i^2) - f_i] = 0.

Quadratic Model: The Normal Equations for p_2(x)
Similarly, for the quadratic polynomial p_2(x) = a_0 + a_1 x + a_2 x^2, the normal equations are:
    a_0 (n+1) + a_1 \sum_{i=0}^{n} x_i + a_2 \sum_{i=0}^{n} x_i^2 = \sum_{i=0}^{n} f_i,
    a_0 \sum_{i=0}^{n} x_i + a_1 \sum_{i=0}^{n} x_i^2 + a_2 \sum_{i=0}^{n} x_i^3 = \sum_{i=0}^{n} x_i f_i,
    a_0 \sum_{i=0}^{n} x_i^2 + a_1 \sum_{i=0}^{n} x_i^3 + a_2 \sum_{i=0}^{n} x_i^4 = \sum_{i=0}^{n} x_i^2 f_i.
Note: even though the model is quadratic, the resulting (normal) equations are linear. The model is linear in its parameters a_0, a_1, and a_2.
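Both routes to the linear fit, the 2-by-2 normal equations and the averaged statistics-text formulas, can be compared on the same data. A sketch with made-up values, solving the 2-by-2 system by Cramer's rule:

```python
# Linear least squares two ways. Data values are hypothetical.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
f = [1.1, 2.9, 5.2, 6.8, 9.1]
n1 = len(x)  # this is n + 1, the number of data points

# Route 1: the 2x2 normal equations, solved by Cramer's rule.
Sx, Sxx = sum(x), sum(xi * xi for xi in x)
Sf, Sxf = sum(f), sum(xi * fi for xi, fi in zip(x, f))
det = n1 * Sxx - Sx * Sx
a0 = (Sxx * Sf - Sx * Sxf) / det
a1 = (n1 * Sxf - Sx * Sf) / det

# Route 2: the averaged formulas a1 = sum (x_i - xbar) f_i / sum (x_i - xbar)^2,
# a0 = fbar - a1 * xbar.
xbar, fbar = Sx / n1, Sf / n1
b1 = sum((xi - xbar) * fi for xi, fi in zip(x, f)) / sum((xi - xbar) ** 2 for xi in x)
b0 = fbar - b1 * xbar
```

The two routes agree (up to rounding): expanding the centered sums in Route 2 reproduces exactly the Cramer's-rule quotients of Route 1.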
The Normal Equations as Matrix Equations
We rewrite the Normal Equations as:
    \begin{pmatrix} n+1 & \sum_{i=0}^{n} x_i & \sum_{i=0}^{n} x_i^2 \\ \sum_{i=0}^{n} x_i & \sum_{i=0}^{n} x_i^2 & \sum_{i=0}^{n} x_i^3 \\ \sum_{i=0}^{n} x_i^2 & \sum_{i=0}^{n} x_i^3 & \sum_{i=0}^{n} x_i^4 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} \sum_{i=0}^{n} f_i \\ \sum_{i=0}^{n} x_i f_i \\ \sum_{i=0}^{n} x_i^2 f_i \end{pmatrix}.
It is not immediately obvious, but this expression can be written in the form A^T A \tilde{a} = A^T \tilde{f}, where the matrix A is very easy to write in terms of the x_i. [Jump Forward]

The Polynomial Equations in Matrix Form, p_m(x)
We can express the mth-degree polynomial, p_m(x), evaluated at the points x_i,
    a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_m x_i^m = f_i, \quad i = 0, \dots, n,
as the product of an (n+1)-by-(m+1) matrix A and the (m+1)-by-1 vector \tilde{a}; the result is the (n+1)-by-1 vector \tilde{f}, usually n \gg m:
    \begin{pmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^m \\ 1 & x_1 & x_1^2 & \cdots & x_1^m \\ 1 & x_2 & x_2^2 & \cdots & x_2^m \\ 1 & x_3 & x_3^2 & \cdots & x_3^m \\ \vdots & \vdots & \vdots & & \vdots \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{pmatrix} = \begin{pmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \\ \vdots \end{pmatrix}
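The A^T A \tilde{a} = A^T \tilde{f} formulation can be sketched in pure Python: build the matrix A with rows (1, x_i, ..., x_i^m), form the normal equations, and solve the small (m+1)-by-(m+1) system by Gaussian elimination. The data values below are made up for illustration:

```python
# Quadratic fit (m = 2) via the normal equations A^T A a = A^T f.
# Hypothetical data, loosely shaped like 1 + x + x^2/25.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
f = [1.0, 2.1, 3.5, 5.4, 7.8, 11.0]
m = 2

# A is (n+1)-by-(m+1) with rows (1, x_i, x_i^2, ..., x_i^m).
A = [[xi ** j for j in range(m + 1)] for xi in x]
AtA = [[sum(A[i][p] * A[i][q] for i in range(len(x)))
        for q in range(m + 1)] for p in range(m + 1)]
Atf = [sum(A[i][p] * f[i] for i in range(len(x))) for p in range(m + 1)]

# Solve AtA a = Atf by Gauss-Jordan elimination with partial pivoting.
M = [row[:] + [b] for row, b in zip(AtA, Atf)]  # augmented matrix
for c in range(m + 1):
    piv = max(range(c, m + 1), key=lambda r: abs(M[r][c]))
    M[c], M[piv] = M[piv], M[c]
    for r in range(m + 1):
        if r != c:
            t = M[r][c] / M[c][c]
            M[r] = [mr - t * mc for mr, mc in zip(M[r], M[c])]
a = [M[r][m + 1] / M[r][r] for r in range(m + 1)]  # a = [a0, a1, a2]
```

In practice one would solve the least squares problem with a library routine (e.g. a QR-based solver) rather than forming A^T A explicitly, since A^T A squares the conditioning of the problem; the sketch mirrors the slides' derivation.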