Curve fitting – Least squares
Curve fitting
[Figure: Michaelis-Menten curve as fitting example, with K_M = 100 µM and v_max = 1 ATP s⁻¹]
Excursus: error of multiple measurements
Starting point: measure the same parameter N times. The obtained values are Gaussian distributed around the mean with standard deviation σ.
What is the error of the mean of all measurements?
Sum: $S = x_1 + x_2 + \dots + x_N$, with $\mathrm{Var}(S) = N\sigma^2$ (central limit theorem), so the standard deviation of the sum is $\sqrt{N}\,\sigma$.

Mean: $\bar{x} = (x_1 + x_2 + \dots + x_N)/N$, so the standard deviation of the mean is
$$\sigma_{\bar{x}} = \frac{\sqrt{N}\,\sigma}{N} = \frac{\sigma}{\sqrt{N}}$$
(called the standard error of the mean).
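A minimal Python sketch of this result (the true value, σ and N are made-up illustration values, not lecture data):

```python
import numpy as np

# Simulate N repeated measurements of the same quantity,
# Gaussian distributed around the true value with spread sigma.
rng = np.random.default_rng(0)
true_value, sigma, N = 10.0, 2.0, 100
x = rng.normal(true_value, sigma, size=N)

mean = x.mean()
# Standard error of the mean: sigma / sqrt(N); sigma is estimated
# from the sample itself (ddof=1 -> unbiased variance estimator).
sem = x.std(ddof=1) / np.sqrt(N)
print(f"mean = {mean:.2f} +/- {sem:.2f}")
```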
Excursus: error propagation
What is the error of f(x, y, z, …) if we know the errors of x, y, z, … (σx, σy, σz, …), for purely statistical errors?

-> The individual variances add, scaled by the squared partial derivatives (if the parameters are uncorrelated):
$$\sigma_f^2 = \left(\frac{\partial f}{\partial x}\right)^2 \sigma_x^2 + \left(\frac{\partial f}{\partial y}\right)^2 \sigma_y^2 + \left(\frac{\partial f}{\partial z}\right)^2 \sigma_z^2 + \dots$$
Examples (see the sketch after this list):
- Addition/subtraction ($f = x \pm y$): squared errors add up: $\sigma_f^2 = \sigma_x^2 + \sigma_y^2$
- Product ($f = x \cdot y$): squared relative errors add up: $(\sigma_f/f)^2 = (\sigma_x/x)^2 + (\sigma_y/y)^2$
- Ratio ($f = x/y$): squared relative errors add up as well: $(\sigma_f/f)^2 = (\sigma_x/x)^2 + (\sigma_y/y)^2$
- Power ($f = x^n$): relative error times the power: $\sigma_f/|f| = |n|\,\sigma_x/|x|$
- Logarithm ($f = \ln x$): the error is the relative error of the argument: $\sigma_f = \sigma_x/|x|$
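A minimal Python sketch applying these rules (the measured values and their errors are made-up illustration numbers):

```python
import numpy as np

# Hypothetical measurements with purely statistical errors.
x, sx = 4.0, 0.2
y, sy = 2.0, 0.1

# Product f = x*y: squared relative errors add up.
f = x * y
sf = abs(f) * np.sqrt((sx / x)**2 + (sy / y)**2)

# Power g = x**3: relative error is |n| times the relative error of x.
g = x**3
sg = abs(g) * 3 * (sx / x)

# Logarithm h = ln(x): absolute error equals the relative error of x.
h = np.log(x)
sh = sx / x

print(f"x*y  = {f:.2f} +/- {sf:.2f}")
print(f"x^3  = {g:.1f} +/- {sg:.1f}")
print(f"ln x = {h:.3f} +/- {sh:.3f}")
```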
Curve fitting – Least squares
Starting point:
- data set with N pairs (xi, yi)
- xi known exactly
- yi Gaussian distributed around the true value with error σi
- errors uncorrelated
- function f(x) which shall describe the values y (y = f(x))
- f(x) depends on one or more parameters a
Example fit function (Michaelis-Menten kinetics):
$$v = \frac{v_{\max}\,[S]}{K_M + [S]}$$
Probability to get $y_i$ for given $x_i$:
$$P(y_i; a) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\; e^{-[y_i - f(x_i;a)]^2 / 2\sigma_i^2}$$
Probability to get the whole set $y_1,\dots,y_N$ for the set of $x$ values:
$$P(y_1,\dots,y_N; a) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma_i}\; e^{-[y_i - f(x_i;a)]^2 / 2\sigma_i^2}$$
For the best-fitting theory curve, P(y1,..,yN;a) becomes maximal!
Use the logarithm of the product to get a sum, and maximize the sum:
$$\ln P(y_1,\dots,y_N; a) = \sum_{i=1}^{N} \ln\frac{1}{\sqrt{2\pi}\,\sigma_i} \;-\; \frac{1}{2}\sum_{i=1}^{N}\left(\frac{y_i - f(x_i;a)}{\sigma_i}\right)^2$$
OR minimize χ² with:
$$\chi^2 = \sum_{i=1}^{N}\left(\frac{y_i - f(x_i;a)}{\sigma_i}\right)^2$$
Principle of least squares!!! (χ² minimization)
Solve:
$$\frac{\partial \chi^2}{\partial a_j} = 0 \quad \text{for every fit parameter } a_j$$
Solve the equation(s) either analytically (only for simple functions) or numerically (specialized software, different algorithms). The χ² value indicates the goodness of fit.
Errors available: use them! -> so-called weighted fit
Errors not available: the σi are set constant -> conventional (unweighted) fit
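A minimal Python sketch of a weighted fit with scipy.optimize.curve_fit, using the Michaelis-Menten model from the figure (the data points, errors and starting values are made-up illustration numbers):

```python
import numpy as np
from scipy.optimize import curve_fit

# Michaelis-Menten model: v = vmax*[S] / (KM + [S]).
def mm(S, vmax, KM):
    return vmax * S / (KM + S)

# Illustrative data: substrate concentrations in µM, rates with errors.
S = np.array([10, 25, 50, 100, 200, 400, 800], dtype=float)
v = np.array([0.10, 0.21, 0.33, 0.49, 0.66, 0.79, 0.90])
sigma_v = np.full_like(v, 0.03)

# Weighted fit: passing sigma (and absolute_sigma=True) makes curve_fit
# minimize chi^2 = sum(((v - mm(S, *a)) / sigma_v)**2).
popt, pcov = curve_fit(mm, S, v, p0=[1.0, 100.0],
                       sigma=sigma_v, absolute_sigma=True)
perr = np.sqrt(np.diag(pcov))  # standard errors of vmax and KM
print(f"vmax = {popt[0]:.3f} +/- {perr[0]:.3f}")
print(f"KM   = {popt[1]:.1f} +/- {perr[1]:.1f} µM")
```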
Reduced χ²
Expectation value of χ² for a weighted fit:
$$\langle\chi^2\rangle = \sum_{i=1}^{N}\frac{\langle [y_i - f(x_i;a)]^2\rangle}{\sigma_i^2} \approx \sum_{i=1}^{N}\frac{\sigma_i^2}{\sigma_i^2} = N$$
Define the reduced χ²:
$$\chi^2_{\mathrm{red}} = \frac{\chi^2}{N - M} \qquad \text{with} \qquad \langle\chi^2_{\mathrm{red}}\rangle \approx 1$$
where M is the number of fit parameters.
For a weighted fit the reduced χ² should become ≈ 1 if the errors are properly chosen.
Expectation value of χ² for an unweighted fit:
$$\frac{1}{N-M}\sum_{i=1}^{N}\left[y_i - f(x_i;a)\right]^2 \approx \sigma^2$$
This should approach the variance of a single data point.
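A minimal Python sketch of the reduced χ² (the function name and the data values are illustrative assumptions, not from the lecture):

```python
import numpy as np

def reduced_chi2(y, y_fit, sigma, n_params):
    """chi^2 per degree of freedom; N - M with M fit parameters."""
    chi2 = np.sum(((np.asarray(y) - np.asarray(y_fit)) / np.asarray(sigma))**2)
    return chi2 / (len(y) - n_params)

# Illustrative check with made-up residuals; for a weighted fit with
# correctly estimated errors this should be ~1, values >> 1 hint at
# underestimated errors or a wrong model.
y     = np.array([1.02, 1.98, 3.05, 3.96])
y_fit = np.array([1.00, 2.00, 3.00, 4.00])
sigma = np.array([0.03, 0.03, 0.05, 0.05])
print(reduced_chi2(y, y_fit, sigma, n_params=2))
```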
Simple proportion
Fit function: $f(x; m) = m x$, i.e. $\chi^2 = \sum_i \frac{(y_i - m x_i)^2}{\sigma_i^2}$.

Differentiation:
$$\frac{d\chi^2}{dm} = -2\sum_i \frac{x_i\,(y_i - m x_i)}{\sigma_i^2} = 0$$
For $\sigma_i = const = \sigma$, solve:
$$\sum_i x_i y_i - m \sum_i x_i^2 = 0$$
Get:
$$m = \frac{\sum_i x_i y_i}{\sum_i x_i^2}$$
Simple proportion – Errors
Rewrite:
$$m = \sum_i \frac{x_i}{\sum_j x_j^2}\, y_i$$
-> The error of m is given by the errors of the yi -> use the rules for error propagation:
$$V(m) = \sum_i \left(\frac{x_i}{\sum_j x_j^2}\right)^2 \sigma^2 = \frac{\sigma^2}{\sum_i x_i^2}$$
-> The standard error of the determination of m (single-parameter confidence interval) is then the square root of V(m).
-> If no errors are available, σ² is estimated as the mean square deviation of the fitted function from the y values:
$$\sigma^2 \approx \frac{1}{N-1}\sum_{i=1}^{N}\left[y_i - f(x_i;a)\right]^2$$
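A minimal Python sketch of the analytic simple-proportion fit and its error (the data values are made-up illustration numbers):

```python
import numpy as np

# Analytic least-squares solution for f(x) = m*x with constant sigma.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

m = np.sum(x * y) / np.sum(x**2)

# No measurement errors available: estimate sigma^2 from the residuals
# (N - 1 degrees of freedom, since one parameter was fitted).
resid = y - m * x
sigma2 = np.sum(resid**2) / (len(x) - 1)
var_m = sigma2 / np.sum(x**2)

print(f"m = {m:.3f} +/- {np.sqrt(var_m):.3f}")
```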
Straight line fit (2 parameters)
For $f(x) = m x + c$:
$$\chi^2 = \sum_{i=1}^{N}\frac{(y_i - m x_i - c)^2}{\sigma_i^2} \qquad (\text{or } \sigma_i = const = \sigma)$$
Differentiation w.r.t. c:
$$\frac{\partial\chi^2}{\partial c} = -\frac{2}{\sigma^2}\sum_i (y_i - m x_i - c) = 0 \quad\text{or}\quad \sum_i y_i = m\sum_i x_i + N c$$
Differentiation w.r.t. m:
$$\frac{\partial\chi^2}{\partial m} = -\frac{2}{\sigma^2}\sum_i x_i\,(y_i - m x_i - c) = 0 \quad\text{or}\quad \sum_i x_i y_i = m\sum_i x_i^2 + c\sum_i x_i$$
Get (solve the equation array), with $D = N\sum_i x_i^2 - \left(\sum_i x_i\right)^2$:
$$m = \frac{N\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{D}, \qquad c = \frac{\sum_i x_i^2 \sum_i y_i - \sum_i x_i \sum_i x_i y_i}{D}$$
Errors:
$$V(m) = \frac{N\sigma^2}{D}, \qquad V(c) = \frac{\sigma^2 \sum_i x_i^2}{D}$$
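A minimal Python sketch implementing these normal equations directly (the data values are made-up illustration numbers):

```python
import numpy as np

# Analytic straight-line fit y = m*x + c for constant sigma.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

N = len(x)
D = N * np.sum(x**2) - np.sum(x)**2
m = (N * np.sum(x * y) - np.sum(x) * np.sum(y)) / D
c = (np.sum(x**2) * np.sum(y) - np.sum(x) * np.sum(x * y)) / D

# Estimate sigma^2 from the residuals (N - 2: two fitted parameters).
sigma2 = np.sum((y - m * x - c)**2) / (N - 2)
var_m = N * sigma2 / D
var_c = sigma2 * np.sum(x**2) / D

print(f"m = {m:.3f} +/- {np.sqrt(var_m):.3f}")
print(f"c = {c:.3f} +/- {np.sqrt(var_c):.3f}")
```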
For all relations which are linear in the fit parameters, analytical solutions are possible!
Additional quantities for multiple parameters: covariances -> they describe the interdependency/correlation between the obtained parameters.
Covariance matrix: $V_{ii} = \mathrm{Var}(p_i)$ and $V_{p_i p_j} = \mathrm{cov}(p_i, p_j)$, e.g. for three parameters:
$$V = \begin{pmatrix} \mathrm{Var}(p_1) & V_{p_1 p_2} & V_{p_1 p_3} \\ V_{p_1 p_2} & \mathrm{Var}(p_2) & V_{p_2 p_3} \\ V_{p_1 p_3} & V_{p_2 p_3} & \mathrm{Var}(p_3) \end{pmatrix}$$
The diagonal elements are the squared errors of the fit parameters!
Straight line fit:
$$\mathrm{cov}(m, c) = -\frac{\sigma^2 \sum_i x_i}{D}$$
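A minimal Python sketch: numpy.polyfit can return this covariance matrix for a straight-line fit (the data values are made-up illustration numbers):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# deg=1 fits y = m*x + c; cov=True also returns the covariance matrix.
p, V = np.polyfit(x, y, deg=1, cov=True)
m, c = p
# Diagonal: squared parameter errors; off-diagonal: cov(m, c).
print("Var(m)   =", V[0, 0])
print("Var(c)   =", V[1, 1])
print("cov(m,c) =", V[0, 1])
```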
More in depth
Online lecture: "Statistical Methods of Data Analysis" by Ian C. Brock, http://www-zeus.physik.uni-bonn.de/~brock/teaching/stat_ws0001/
Numerical Recipes in C++, Cambridge University Press