A Some Mathematical Background Topics

A.1 Series Identities

A.1.1 Geometric Series

Direct multiplication confirms the algebraic identity
\[ (x - y)(x^n + x^{n-1}y + \cdots + xy^{n-1} + y^n) = x^{n+1} - y^{n+1} \]
for any x and y and n a positive integer. If $x \ne y$, then
\[ x^n + x^{n-1}y + \cdots + xy^{n-1} + y^n = \frac{x^{n+1} - y^{n+1}}{x - y}. \]
In particular
\[ 1 + r + r^2 + \cdots + r^{n-1} = \frac{1 - r^n}{1 - r}. \tag{A.1} \]
If $|r| < 1$ the right hand side converges as $n \to \infty$ and this becomes
\[ 1 + r + r^2 + \cdots = \frac{1}{1 - r}. \tag{A.2} \]

A.1.2 Arithmetic Series

By adding $1 + 2 + 3 + \cdots + n$ to itself backwards one gets n terms each equal to n + 1. Hence
\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}. \]

A.1.3 Taylor’s Series

An infinitely differentiable function y = f(x) can be expanded in a power series about a given point x = a according to

\[ f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f^{(3)}(a)}{3!}(x-a)^3 + \cdots. \]
In particular, the exponential function can be expanded about 0 to give

\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots. \tag{A.3} \]
Likewise the log function can be expanded about 1. Using the change of variable u = t − 1, from the definition of the log function
\[ \log(1+x) = \int_1^{1+x} \frac{dt}{t} = \int_0^x \frac{du}{1+u} = \int_0^x (1 - u + u^2 - u^3 + \cdots)\,du = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots \tag{A.4} \]
valid for $|x| < 1$.

A.2 Histograms

A histogram is a special kind of bar chart for depicting a set of values, $v_1, v_2, \ldots, v_N$, numbering, say, N in total. A convenient subdivision of the x-axis is created containing the values, for example by means of the points $x_0, x_1, x_2, \ldots, x_K$, with $x_0 \le v_i \le x_K$, $i = 1, 2, \ldots, N$. They establish intervals, or bins, $[x_0, x_1), [x_1, x_2), \ldots, [x_{K-1}, x_K)$. A count is made of how many of the values lie in each bin, for example $n_1$ in $[x_0, x_1)$, $n_2$ in $[x_1, x_2)$ and so on. Finally a rectangle of height $n_k$ is drawn standing on bin $[x_{k-1}, x_k)$, $k = 1, \ldots, K$. This is called a frequency histogram. Altogether the area under the bars is $A = \sum_{k=1}^K n_k (x_k - x_{k-1})$. Redrawing the figure and making the height on the kth subinterval equal to $n_k/A$ produces a density histogram, or just histogram for short. A density histogram is an approximation of the probability density of the process generating the original values.
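
As an illustration, the following Java sketch (ours, not from the text; names are illustrative) computes the density histogram heights $n_k/A$ from the values and the bin boundaries; it assumes every value lies in $[x_0, x_K]$.

    // Density histogram: count per bin, normalized so the total bar area is 1.
    static double[] densityHistogram(double[] v, double[] x) {
      int K = x.length - 1;                 // number of bins
      int[] n = new int[K];
      for( double vi : v ) {
        for( int k = 0; k < K; ++k ) {
          // the last bin is closed on the right so that vi = x[K] is counted
          if( x[k] <= vi && (vi < x[k+1] || k == K-1) ) { ++n[k]; break; }
        }
      }
      double A = 0.0;                       // area under the frequency bars
      for( int k = 0; k < K; ++k ) A += n[k] * (x[k+1] - x[k]);
      double[] h = new double[K];
      for( int k = 0; k < K; ++k ) h[k] = n[k] / A;
      return h;
    }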

A.3 Probability Distributions and Densities

A random variable X is the specific (real-valued) outcome of a trial of a process whose outcomes are unpredictable (in exact value). By means of a histogram it is possible to see with what frequency the various outcomes occur. For a discrete random variable, one with only finitely many outcomes, the frequency $p_i$ of each outcome $x_i$ is its probability, $\Pr(X = x_i) = p_i$, and the function of these probabilities, $f(x_i) = p_i$, is its probability density function or pdf. For every real number x, the sum of the probabilities of outcomes less than or equal to x is called the cumulative distribution function or cdf, $F(x) = \sum \{ f(x_i) : x_i \le x \}$.

The cumulative distribution function is 0 for x less than the smallest outcome of the process, is 1 for x larger than the largest outcome, and is otherwise constant except for jumps at each outcome $x_i$ by the amount $p_i$. It follows that for each x, F(x) is the probability that a trial of the process will be less than or equal to x. Likewise for a continuous random variable, its cumulative distribution function F(x) is the probability that a trial of the process will be less than or equal to x. However for a continuous random variable the cdf is continuous, that is, has no jumps. But nevertheless it is monotone increasing (if y > x, then $F(y) \ge F(x)$) and tends to 0 as $x \to -\infty$ and 1 as $x \to \infty$. The probability density function $f(\cdot)$ for a continuous random variable is the derivative of its cdf. Therefore the probability a trial of the process will lie between two real values is given by the integral of its pdf,
\[ \Pr(a < X \le b) = \int_a^b f(x)\,dx. \]

A.4 Expectations

The expectation of a function g(X) of a random variable is, in the discrete case,
\[ E(g(X)) = \sum_i g(x_i) f(x_i) \]
and in the continuous case
\[ E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx. \]
The mean is the expectation of X itself,

\[ \mu = E(X), \]
and the variance is the expectation of the squared differences from the mean,
\[ \operatorname{var}(X) = E\big((X - \mu)^2\big) = E(X^2) - \mu^2. \]

The third member is an equivalent expression for the second. If a distribution is tightly clustered about its mean, its variance is small. By the Law of Large Numbers, expectations can be approximated empirically. Let X1, X2, ..., Xn be the outcomes of n trials of the process. The estimate of the expectation E(g(X)) is

\[ E(g(X)) \approx \frac{1}{n} \sum_{i=1}^n g(X_i). \]
This tends to the exact value as $n \to \infty$.

Let X and Y be two random variables defined over the same probability space; their covariance is defined as

\[ \operatorname{covar}(X, Y) = E\big((X - \mu_X)(Y - \mu_Y)\big). \tag{A.5} \]

The correlation between X and Y is defined as
\[ \rho_{XY} = \frac{\operatorname{covar}(X, Y)}{\sigma_X \sigma_Y}. \tag{A.6} \]
If X and Y are independent then $E(f(X)g(Y)) = E(f(X))\,E(g(Y))$ for functions of X and Y respectively, and so $\operatorname{covar}(X, Y) = \rho_{XY} = 0$. Further, if X and Y are independent, then
\[ \operatorname{var}(f(X) + g(Y)) = \operatorname{var}(f(X)) + \operatorname{var}(g(Y)). \]
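
The empirical estimates of this section are simple to code. The following Java sketch (ours; names are illustrative) returns the sample means, variances, covariance and correlation of paired data.

    // Empirical mean, variance, covariance and correlation of paired samples,
    // per the formulas of this section.
    static double[] stats(double[] X, double[] Y) {
      int n = X.length;
      double mx = 0, my = 0;
      for( int i = 0; i < n; ++i ) { mx += X[i]; my += Y[i]; }
      mx /= n;  my /= n;
      double vx = 0, vy = 0, cxy = 0;
      for( int i = 0; i < n; ++i ) {
        vx  += (X[i] - mx) * (X[i] - mx);
        vy  += (Y[i] - my) * (Y[i] - my);
        cxy += (X[i] - mx) * (Y[i] - my);
      }
      vx /= n;  vy /= n;  cxy /= n;
      double rho = cxy / Math.sqrt(vx * vy);   // correlation (A.6)
      return new double[]{ mx, my, vx, vy, cxy, rho };
    }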

A.5 The Normal Distribution

Among probabilistic processes, one of the most important is the normal distribution. It is a continuous process with density given by

\[ \phi(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty, \]
where μ is the mean and σ the standard deviation; the standard normal, N(0, 1), has μ = 0 and σ = 1. The cdf $\Phi(x)$ has no closed form, but it admits accurate rational approximations. For $x \ge 0$ let

\[ t = 1/(1 + a|x|), \]

where $|x|$ indicates the absolute value of x. Then

\[ \Phi(x) \approx 1 - \phi(x)\big(b_1 t + b_2 t^2 + b_3 t^3 + b_4 t^4 + b_5 t^5\big). \tag{A.7} \]
The constants are:

\[ a = 0.2316419, \]
\[ b_1 = 0.319381530, \quad b_2 = -0.356563782, \quad b_3 = 1.781477937, \]
\[ b_4 = -1.821255978, \quad b_5 = 1.330274429. \tag{A.8} \]
(For $x < 0$ use the symmetry $\Phi(x) = 1 - \Phi(-x)$.) There is also a rational approximation for the inverse of the cumulative distribution function. The following is from [BFS83]. Let $x = \Phi^{-1}(u)$ for $0.5 \le u < 1$ and put $y = \sqrt{-\log\big((1-u)^2\big)}$. Then
\[ x = y + \frac{p_0 + p_1 y + p_2 y^2 + p_3 y^3 + p_4 y^4}{q_0 + q_1 y + q_2 y^2 + q_3 y^3 + q_4 y^4}, \tag{A.9} \]
where the constants $p_i$ and $q_i$ are tabulated in [BFS83]. If $0 < u < 0.5$, use symmetry: $\Phi^{-1}(u) = -\Phi^{-1}(1-u)$.
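
The approximation (A.7)–(A.8) is easy to program. The following Java sketch (ours; the method name is illustrative) evaluates Φ(x), using the symmetry $\Phi(x) = 1 - \Phi(-x)$ for negative arguments.

    // Standard normal cdf via the rational approximation (A.7)-(A.8).
    static double Phi(double x) {
      if( x < 0 ) return 1.0 - Phi(-x);          // symmetry of the normal
      double a = 0.2316419;
      double[] b = {  0.319381530, -0.356563782, 1.781477937,
                     -1.821255978,  1.330274429 };
      double t = 1.0 / (1.0 + a * x);            // here |x| = x
      double poly = 0.0, tp = t;
      for( int i = 0; i < 5; ++i ) { poly += b[i] * tp; tp *= t; }
      double phi = Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI);
      return 1.0 - phi * poly;
    }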

A.6 The Central Limit Theorem

Theorem A.1. (Central limit theorem) Let $X_1, X_2, \ldots, X_n$ be independent random samples from a distribution with mean μ and finite variance $\sigma^2$. Then
\[ Y = \frac{\sum_{i=1}^n X_i - n\mu}{\sqrt{n\sigma^2}} \]
has a limiting distribution as $n \to \infty$ and it is N(0, 1), normal with mean 0 and variance 1.

A.7 Least Squares

Assume variables y and x are linearly related with slope m and intercept b. And assume we have n empirical data points testing that relationship, $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Let $e_i$ be the difference between the empirical value $y_i$ and the predicted value $m x_i + b$. The sum of the squares of these differences is
\[ \sum_{i=1}^n e_i^2 = \sum_{i=1}^n \big(y_i - (m x_i + b)\big)^2. \tag{A.10} \]
We seek the values of m and b which minimize this sum. Start by differentiating (A.10) with respect to m and b and set the derivatives to zero. First with respect to m,
\[ 0 = 2 \sum_{i=1}^n \big(y_i - (m x_i + b)\big)(-x_i), \]
\[ 0 = \sum_{i=1}^n x_i y_i - m \sum_{i=1}^n x_i^2 - b \sum_{i=1}^n x_i; \tag{A.11} \]
a −2 has been divided out since what's left will still be zero. Then with respect to b,
\[ 0 = 2 \sum_{i=1}^n \big(y_i - (m x_i + b)\big)(1), \]
\[ 0 = \sum_{i=1}^n y_i - m \sum_{i=1}^n x_i - nb. \tag{A.12} \]
The resulting system of two linear equations in two unknowns is, from (A.11) and (A.12),
\[ m \sum_{i=1}^n x_i^2 + b \sum_{i=1}^n x_i = \sum_{i=1}^n x_i y_i \]
\[ m \sum_{i=1}^n x_i + nb = \sum_{i=1}^n y_i. \]
The solution by Cramer's Rule, see Section A.12 below, is
\[ m = \frac{n \sum_{i=1}^n x_i y_i - \sum_{i=1}^n x_i \sum_{i=1}^n y_i}{n \sum_{i=1}^n x_i^2 - \big(\sum_{i=1}^n x_i\big)^2} \]
\[ b = \frac{\sum_{i=1}^n x_i^2 \sum_{i=1}^n y_i - \sum_{i=1}^n x_i \sum_{i=1}^n x_i y_i}{n \sum_{i=1}^n x_i^2 - \big(\sum_{i=1}^n x_i\big)^2}. \tag{A.13} \]
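
In code, (A.13) becomes a single pass over the data; a minimal Java sketch (ours; names are illustrative):

    // Least-squares slope m and intercept b via the closed form (A.13).
    static double[] leastSquares(double[] x, double[] y) {
      int n = x.length;
      double sx = 0, sy = 0, sxx = 0, sxy = 0;
      for( int i = 0; i < n; ++i ) {
        sx += x[i];  sy += y[i];
        sxx += x[i] * x[i];  sxy += x[i] * y[i];
      }
      double det = n * sxx - sx * sx;          // the discriminant
      double m = (n * sxy - sx * sy) / det;
      double b = (sxx * sy - sx * sxy) / det;
      return new double[]{ m, b };
    }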

A.8 Error Estimates for Monte Carlo Simulations

Suppose we are trying to use Monte Carlo to estimate some value, call it θ. Let $X_1, X_2, \ldots, X_n$ be n estimates of θ as derived from the outcome of the simulation. If the $X_i$ are independent and identically distributed with mean θ, then by the central limit theorem their sample average $\bar{X}$ is approximately normally distributed with mean equal to θ and variance equal to $\sigma_X^2/n$, where $\sigma_X^2$ is the (unknown) variance of the $X_i$. In this case
\[ Y = \frac{\bar{X} - \theta}{\sqrt{\sigma_X^2/n}} \]
is approximately N(0, 1) distributed. From a N(0, 1) table, e.g. Equation (A.7), notice that with probability 0.954 a normal sample lies within two standard deviations of the mean (above and below),

2(Φ(2) − 0.5) = 0.954.

Hence
\[ \Pr\!\left( -2 < \frac{\bar{X} - \theta}{\sqrt{\sigma_X^2/n}} < 2 \right) = 0.954. \]
In other words, with probability 0.954, θ lies in the interval
\[ \bar{X} - 2\sqrt{\sigma_X^2/n} \;<\; \theta \;<\; \bar{X} + 2\sqrt{\sigma_X^2/n}. \]

Now, given a value for $\sigma_X^2$, we may calculate probabilistic error bounds for θ, or confidence intervals as they are called. In practice there are three problems with this program. First, usually $\sigma_X^2$ must itself be estimated from the data. Second, the $X_i$ may not be identically distributed; the simulation may suffer start-up effects, for example. And third, the $X_i$ may be correlated. The second and third issues may be dealt with by batching. Divide the n trials into m batches each of size J:

\[ X_1 \ldots X_J \mid X_{J+1} \ldots X_{2J} \mid \cdots \mid X_{(m-1)J+1} \ldots X_{mJ}. \]
Thus there are m = n/J batches. By the CLT the batch random variables

\[ B_i = \frac{1}{J} \sum_{j=(i-1)J+1}^{iJ} X_j, \quad i = 1, \ldots, m, \]
tend to be independent and identically distributed. Now we may apply the development above to the $B_i$ in place of the $X_i$. Thus
\[ \hat{\theta} = \bar{B} = \frac{1}{m} \sum_{i=1}^m B_i \]
is an estimator $\hat{\theta}$ for θ. And the random variable
\[ Y = \frac{\hat{\theta} - \theta}{\sqrt{\sigma_B^2/m}} \]
is approximately N(0, 1). If we knew the variance $\sigma_B^2$ of the batch random variables, then we could use the normal distribution itself to make error bounds as was done above. However, $\sigma_B^2$ is generally not known and must itself be estimated from the $B_i$ data. The sample variance for $\sigma_B^2$ is given by
\[ s_B^2 = \frac{1}{m-1} \sum_{i=1}^m (B_i - \hat{\theta})^2 \tag{A.14} \]
and is itself a random variable (gamma distributed). Thus in place of Y we have

\[ t = \frac{\hat{\theta} - \theta}{s_B/\sqrt{m}}, \tag{A.15} \]
the quotient of a normal random variable by the square root of a gamma random variable. Such a combination is a Student-t random variable. The Student-t is well known and has one parameter, its degrees-of-freedom or dof, see Section 6.8.1. Tables giving ordinates of the Student-t are included in the appendix of most statistics books. We provide an abbreviated one as well, Table A.2. So we must use the t-statistic to obtain the confidence intervals we seek. For example, given α, $t_\alpha$ is defined by

\[ \Pr(-t_\alpha < t < t_\alpha) = \alpha. \]

These values $t_\alpha$ can be looked up in or derived from a t-table. If the table gives cumulative values instead of two-sided values, as does the table below, a conversion is required. Since the t distribution is symmetric, $\Pr(-t_\alpha < t < t_\alpha) = 2\Pr(t < t_\alpha) - 1$; hence to achieve a two-sided probability of α, look up the cumulative value $\beta = (1 + \alpha)/2$.

The t distribution. The tables give $t_\beta$ vs. dof, where $t_\beta$ is defined by $\Pr(t < t_\beta) = \beta$.

Table A.1. Cumulative t values for β = 0.95

dof  1     2     3     4     5     6     8     10    15    20    30    40    60    120   ∞
tβ   6.31  2.92  2.35  2.13  2.02  1.94  1.86  1.81  1.75  1.73  1.70  1.68  1.67  1.66  1.65

Table A.2. Cumulative t values for β = 0.975

dof  1     2     3     4     5     6     8     10    15    20    30    40    60    120   ∞
tβ   12.71 4.30  3.18  2.78  2.57  2.45  2.31  2.23  2.13  2.09  2.04  2.02  2.00  1.98  1.96
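
Putting the pieces of this section together, the following Java sketch (ours; names are illustrative) computes a batch-means confidence interval: it forms the batch averages $B_i$, the estimator $\hat{\theta}$, the sample variance (A.14), and the interval $\hat{\theta} \pm t_\beta s_B/\sqrt{m}$, with $t_\beta$ taken from Table A.2 (β = 0.975 gives a 95% two-sided interval, e.g. 2.09 for dof = 20).

    // Batch-means confidence interval per Section A.8: m batches of size J.
    // Returns { lower, estimate, upper }.
    static double[] batchCI(double[] X, int J, double tBeta) {
      int m = X.length / J;
      double[] B = new double[m];
      for( int i = 0; i < m; ++i ) {
        for( int j = 0; j < J; ++j ) B[i] += X[i * J + j];
        B[i] /= J;                            // batch average B_i
      }
      double theta = 0;                       // estimator = average of batches
      for( double b : B ) theta += b;
      theta /= m;
      double s2 = 0;                          // sample variance (A.14)
      for( double b : B ) s2 += (b - theta) * (b - theta);
      s2 /= (m - 1);
      double half = tBeta * Math.sqrt(s2 / m);  // half-width, per (A.15)
      return new double[]{ theta - half, theta, theta + half };
    }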

A.9 Drawing Normal Samples

The Box–Muller Algorithm generates exact N(0, 1) samples.

Algorithm 32. Box–Muller Algorithm

$U_1, U_2 \sim U(0,1)$ — draw two independent uniform samples
$Z_1 = \cos(2\pi U_1)\sqrt{-2 \ln U_2}$
$Z_2 = \sin(2\pi U_1)\sqrt{-2 \ln U_2}$
$Z_1, Z_2$ are two independent N(0, 1) samples

The Marsaglia-Bray algorithm is an alternative that also produces exact samples.

Algorithm 33. Marsaglia-Bray Algorithm

repeat
  $U \sim U(0,1)$; U = 2U − 1 — uniform on −1 to 1
  $V \sim U(0,1)$; V = 2V − 1 — together, a point in the square
until $W = U^2 + V^2 \le 1$
$Z_1 = U\sqrt{(-2/W) \ln W}$
$Z_2 = V\sqrt{(-2/W) \ln W}$
$Z_1, Z_2$ are two independent N(0, 1) samples
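
In Java the two algorithms might be coded as follows (a sketch; java.util.Random supplies the uniforms). The Box-Muller version uses $1 - U_2$, also uniform, to guard against log(0).

    // Box-Muller (Algorithm 32): two independent N(0,1) samples.
    static double[] boxMuller(java.util.Random rng) {
      double U1 = rng.nextDouble(), U2 = rng.nextDouble();
      double R = Math.sqrt(-2.0 * Math.log(1.0 - U2));
      return new double[]{ R * Math.cos(2 * Math.PI * U1),
                           R * Math.sin(2 * Math.PI * U1) };
    }

    // Marsaglia-Bray (Algorithm 33): rejection from the unit disk, no trig calls.
    static double[] marsagliaBray(java.util.Random rng) {
      double U, V, W;
      do {
        U = 2 * rng.nextDouble() - 1;      // uniform on (-1,1)
        V = 2 * rng.nextDouble() - 1;      // together: a point in the square
        W = U * U + V * V;
      } while( W > 1 || W == 0 );          // keep only points in the unit disk
      double R = Math.sqrt(-2.0 * Math.log(W) / W);
      return new double[]{ U * R, V * R };
    }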

A.10 Drawing Inverse Gaussian Samples

The inverse Gaussian density is, from (6.10),

\[ f_{IG}(x; a, b) = \frac{a\, e^{ab}}{\sqrt{2\pi x^3}}\; e^{-\frac{1}{2}\left(\frac{a^2}{x} + b^2 x\right)}, \quad x > 0. \]
Samples may be drawn as follows:

Algorithm 34. Samples from IG(a, b)

$Z \sim N(0,1)$ — standard normal sample, see Section A.9
$y = Z^2$
$x = (a/b) + \big(y - \sqrt{4aby + y^2}\,\big)/(2b^2)$
$U \sim U(0,1)$
if $U < a/(a + bx)$ then return X = x, otherwise return $X = a^2/(b^2 x)$
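
A Java sketch of Algorithm 34 as reconstructed above (ours; names are illustrative); for brevity it draws the normal sample with java.util.Random.nextGaussian() rather than the methods of Section A.9.

    // One inverse Gaussian IG(a,b) sample per Algorithm 34.
    static double sampleIG(double a, double b, java.util.Random rng) {
      double Z = rng.nextGaussian();                 // N(0,1)
      double y = Z * Z;
      double x = a / b + (y - Math.sqrt(4 * a * b * y + y * y)) / (2 * b * b);
      double U = rng.nextDouble();
      return (U < a / (a + b * x)) ? x : (a * a) / (b * b * x);
    }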

A.11 Inverting the CDF

A sample X from a distribution having cdf $F(\cdot)$ is obtained by solving $U = F(X)$ for X, where $U \sim U(0,1)$. This derives from the following argument: $U \sim U(0,1)$ iff $\Pr(U \le u) = u$ for $0 \le u \le 1$. Then, with $X = F^{-1}(U)$,
\[ \Pr(X \le x) = \Pr\big(F^{-1}(U) \le x\big) = \Pr\big(U \le F(x)\big) = F(x), \]
so X has cdf F as required.
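
For example, the exponential distribution with rate λ has cdf $F(x) = 1 - e^{-\lambda x}$, which inverts in closed form; a minimal Java sketch (ours):

    // Inversion sampling: X = F^{-1}(U) = -log(1 - U)/lambda is exponential.
    static double sampleExponential(double lambda, java.util.Random rng) {
      double U = rng.nextDouble();
      return -Math.log(1.0 - U) / lambda;
    }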

A.12 Cramer's Rule

A linear system with n equations in n unknowns is a system as follows:

\[ \begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= r_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= r_2 \\ &\;\;\vdots \\ a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= r_n \end{aligned} \]
The discriminant is the determinant of the matrix of coefficients,
\[ \Delta = \det A = \det \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}. \]
The solution for the ith unknown is the ratio of determinants
\[ x_i = \frac{\det A_i}{\Delta} \]
where $A_i$ is A with the ith column replaced by the right hand side,
\[ A_i = \begin{bmatrix} a_{11} & \cdots & a_{1,i-1} & r_1 & a_{1,i+1} & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2,i-1} & r_2 & a_{2,i+1} & \cdots & a_{2n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{n,i-1} & r_n & a_{n,i+1} & \cdots & a_{nn} \end{bmatrix}. \]
For a 2 by 2 system this gives
\[ x_1 = \frac{\det \begin{bmatrix} r_1 & a_{12} \\ r_2 & a_{22} \end{bmatrix}}{\det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}} \qquad x_2 = \frac{\det \begin{bmatrix} a_{11} & r_1 \\ a_{21} & r_2 \end{bmatrix}}{\det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}} \]
as can be verified by direct substitution.
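
As a check, here is the 2 by 2 case in code; a minimal Java sketch (ours; names are illustrative).

    // Cramer's rule for a 2-by-2 system, directly from the determinant ratios.
    static double[] cramer2x2(double a11, double a12, double a21, double a22,
                              double r1, double r2) {
      double Delta = a11 * a22 - a12 * a21;    // the discriminant
      double x1 = (r1 * a22 - a12 * r2) / Delta;
      double x2 = (a11 * r2 - r1 * a21) / Delta;
      return new double[]{ x1, x2 };
    }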

A.13 LaGrange Optimization

Let y = f(x) be a real-valued function of a vector variable x. We want to maximize (or minimize) f subject to the condition that g(x) = 0, where g is a second real-valued function of x called the constraint. LaGrange noticed that f continues to increase (or decrease) as x varies along the path enforced by g until the tangents of the two curves are parallel. But then their gradients are parallel too (see (B.5)). Thus the optimum occurs when

\[ \operatorname{grad} f = \lambda \operatorname{grad} g \]
where λ is a real number called the LaGrange multiplier. This equation along with the constraint constitute the system to be solved for the optimizers of the system. The LaGrangian is the difference

\[ \Lambda = f(x) - \lambda g(x). \]

The system of equations above can also be generated by setting to 0 the partial derivative of the LaGrangian with respect to each component of x. Setting to 0 the partial derivative of the LaGrangian with respect to λ gives back the constraint equation.
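
As a quick illustration (our example, not the text's): to maximize f(x, y) = xy subject to the constraint g(x, y) = x + y − 1 = 0, the gradient condition gives (y, x) = λ(1, 1), so x = y = λ; the constraint then forces x = y = 1/2 (and λ = 1/2), which is indeed the maximizer of xy on the line x + y = 1.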

A.14 Bisection Method for Roots

A simple and adequate method for numerical root solving is the bisection method. Suppose one wants the value of x which solves y = f(x) for a given function f and value of y. This is the same problem as finding the root of g(x) = 0 where g(x) = f(x) − y. The method starts with an estimate of x that is too low, $x_1$, and an estimate that is too high, $x_2$. For example $g(x_1) < 0$ and $g(x_2) > 0$ (or conversely); thus $g(x_1)\,g(x_2) < 0$. In this way the root we are seeking, $x_0$, is bracketed. Next the value of g is calculated for the mid-point
\[ x_m = \frac{1}{2}(x_1 + x_2). \]

Then either $x_1$ or $x_2$ is set equal to $x_m$, depending on whichever maintains a bracket on the root. The process is now iterated until the desired accuracy is attained.
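
A minimal Java sketch of the method (ours; names are illustrative), iterating until the bracket is shorter than a tolerance:

    // Bisection for g(x) = 0 given a bracket with g(x1) g(x2) < 0.
    static double bisect(java.util.function.DoubleUnaryOperator g,
                         double x1, double x2, double tol) {
      while( x2 - x1 > tol ) {
        double xm = 0.5 * (x1 + x2);
        if( g.applyAsDouble(x1) * g.applyAsDouble(xm) <= 0 )
          x2 = xm;                      // root lies in [x1, xm]
        else
          x1 = xm;                      // root lies in [xm, x2]
      }
      return 0.5 * (x1 + x2);
    }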

B Stochastic Calculus

In earlier chapters we have derived equations and algorithms based on the binomial tree and other discrete models. And we have alluded to the notion that these models converge to the correct prices as they become more refined, which is to say, in the limit as $n \to \infty$, or equivalently, as $\Delta t \to 0$. Just as in ordinary calculus, such a limit leads to a calculus of stochastic processes, the Itô calculus. In this chapter we introduce and analyze the continuous counterparts of the random walk models with which we have become familiar. In the next chapter we take up limits in the binomial model. Our objective lies in validating the limiting processes of the discrete models we have been using and not in analyzing the resulting continuous models per se. We also hope to acquaint the reader with the continuous approach.

B.1 The Itô Integral

The stochastic calculus we require is based on the Wiener process $W_t$ and hence on Brownian motion and arithmetical random walks. From Algorithm 1 on page 10, increments ΔX of the walk are given by $\Delta X = \mu \Delta t + \sigma \sqrt{\Delta t}\, Z_t$ where $Z_t \sim N(0,1)$ is a standard normal sample. Summing these increments we obtain the end point random variable $X_T$,
\[ X_T = X_0 + \sum_{i=1}^n \left( \mu \Delta t + \sigma \sqrt{\Delta t}\, Z_i \right). \tag{B.1} \]
By letting $\Delta t \to 0$ we would expect to get in the limit the integral
\[ X_T = X_0 + \int_0^T \mu\, dt + \int_0^T \sigma\, dW_t. \tag{B.2} \]
But the third term of this stochastic integral is quite unlike ordinary Riemann or Lebesgue integrals. How does one integrate with respect to the stochastic process $W_t$? In the following we hope to make this precise within the constraints of the scope and level of this text.

The infinitesimal counterpart to (B.2) is the stochastic differential equation¹

\[ dX_t = \mu\, dt + \sigma\, dW_t. \tag{B.3} \]

We start by defining stochastic differentials $dX_t$. Since all our modeling is referenced back to the Wiener process, we are concerned with a calculus based on differentiating functions of $W_t$, for example $df(W_t)$ or possibly $df(t, W_t)$. To define such differentials we start where it all began, with the derivative of a real valued function of a real variable,
\[ f'(t) = \lim_{h \to 0} \frac{f(t+h) - f(t)}{h}. \]
To extend its application, it is helpful to rewrite the definition as

\[ df = f(t+h) - f(t) = f'(t)\,h + o(h). \tag{B.4} \]

In this, little oh of h stands for terms that converge to zero as $h \to 0$ when divided by h. In this form the derivative, $f'(t)$, is seen as the best linear approximation to f at t. For example, if f is a real valued function of a vector variable, $t = [t_1\ t_2]^T$, then $f'(t)$ is the gradient,² the $1 \times 2$ matrix
\[ \left[ \frac{\partial f}{\partial t_1}\ \ \frac{\partial f}{\partial t_2} \right] \tag{B.5} \]
since
\[ f(t_1 + h_1, t_2 + h_2) - f(t_1, t_2) \approx \left[ \frac{\partial f}{\partial t_1}\ \ \frac{\partial f}{\partial t_2} \right] \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} = \frac{\partial f}{\partial t_1} h_1 + \frac{\partial f}{\partial t_2} h_2 \]
and the difference is $o(h_1, h_2)$.

In the case at hand, defining the differential $df(W_t)$, the difference $f(W_{t+h}) - f(W_t)$ is a random variable and has a probability distribution. We will require that, in the limit, the mean be proportional to h, thus μh where μ is not a function of h, and the variance be proportional to increments of $W_t$. Thus the differential $df(W_t)$ is defined by

\[ df(W_t) = \mu h + \sigma (W_{t+h} - W_t) + o(h) \tag{B.6} \]
which means that the difference between $df(W_t)$ and the first two terms on the right hand side is a random variable whose mean and variance are both o(h).

Example. Calculate the differential $d(W_t^2)$.
\[ \begin{aligned} W_{t+h}^2 - W_t^2 &= (W_{t+h} - W_t)(W_{t+h} + W_t) \\ &= (W_{t+h} - W_t)(2W_t + W_{t+h} - W_t) \\ &= 2W_t (W_{t+h} - W_t) + (W_{t+h} - W_t)^2. \end{aligned} \]

¹ In some texts (B.2) is taken as the definition of (B.3).
² More exactly, the gradient is a vector, in this case a $2 \times 1$ matrix, while the derivative is a linear functional (a $1 \times 2$ matrix). The gradient acts on its argument via dot product while the derivative acts via matrix multiplication; both yield the same result.

By the definition of a Wiener process, see Section 1.3, the term $W_{t+h} - W_t$ is a normally distributed random variable with mean 0 and variance h, that is $W_{t+h} - W_t = \sqrt{h}\, Z_h$ where $Z_h \sim N(0,1)$ (and may depend on h). Therefore the mean of this random variable squared is h,

\[ E\big((W_{t+h} - W_t)^2\big) = \operatorname{var}(W_{t+h} - W_t) = h. \tag{B.7} \]

We may now calculate the variance of $(W_{t+h} - W_t)^2 - h$ as
\[ E\big((W_{t+h} - W_t)^2 - h\big)^2 = E\big(h Z_h^2 - h\big)^2 = h^2\, E\big(Z_h^2 - 1\big)^2. \tag{B.8} \]
Since the expectation in the last member is finite (see Section 1.5.2), when divided by h this term goes to 0. Therefore we have calculated
\[ d(W_t^2) = h + 2W_t (W_{t+h} - W_t), \]
that is to say, μ = 1 and $\sigma = 2W_t$ in (B.6). In terms of infinitesimals,
\[ d(W_t^2) = dt + 2W_t\, dW_t. \tag{B.9} \]
The surprise is the unexpected term dt. This is called the Itô term. It will appear again as we take up Itô's Lemma below. Another surprise in this example is that the Itô term is deterministic. As seen in (B.8), this is because the variance of each term goes to zero on the square of h, much faster than the mean, compare (B.7). As infinitesimals, these terms have no variance. Before moving on we may exploit what we have derived in the Example by now integrating this infinitesimal equation to get
\[ 2 \int_0^T W_t\, dW_t = \int_0^T d(W_t^2) - \int_0^T dt = W_T^2 - T \]
since $W_0 = 0$.

B.2 Itô's Lemma

Itô's Lemma can be thought of as the chain rule for stochastic calculus. In the sequel we assume that $X_t$ is a stochastic process satisfying the stochastic differential equation (SDE)

\[ dX_t = \mu(t, X_t)\, dt + \sigma(t, X_t)\, dW_t \tag{B.10} \]
where $W_t$ is a Wiener process. These are often referred to as Itô processes. As shown in (B.10), the coefficients μ and σ are allowed to be functions of both t and the process $X_t$ itself. In the sequel we will notationally suppress this dependence and just write μ and σ. Let f(t, x) be a twice continuously differentiable function. Then Itô's Lemma asserts that
\[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x}\, dW_t. \tag{B.11} \]
Why this should be so can be seen from a second order Taylor expansion of f (higher order terms are all o(h) and need not be considered). The first variable of f is an ordinary real valued variable and gives rise to the term $\frac{\partial f}{\partial t} dt$ in (B.11). We omit it from the discussion below and focus on the second variable in f. By Taylor's expansion
\[ f(X_{t+h}) - f(X_t) = \frac{\partial f}{\partial x}\, dX + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} (dX)^2 + \cdots = \frac{\partial f}{\partial x} (\mu\, dt + \sigma\, dW_t) + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} (dX)^2. \tag{B.12} \]
We have dropped the higher order terms here and substituted for dX from (B.10). We consider $(dX)^2$ separately next. Again from (B.10),

\[ \begin{aligned} (dX)^2 &= \mu^2\, dt^2 + 2\mu\sigma\, dt\, dW + \sigma^2 (dW)^2 \\ &= \mu^2\, dt^2 + 2\mu\sigma\, dt\, dW + \sigma^2 \big((dW)^2 - dt\big) + \sigma^2\, dt. \end{aligned} \]

The first term is clearly o(dt) and has no variance. The second term has 0 mean. From (B.7), the variance of the second term is $4\mu^2\sigma^2 (dt)^3$ and therefore this term has o(dt) variance. Hence the first two terms can be ignored. Again from (B.7) the third term has mean 0 and from (B.8) it has variance equal to $\sigma^4 (dt)^2\, E(Z^2 - 1)^2$ where $Z \sim N(0,1)$. So the variance of this term is also o(dt). Hence the only term that must be kept is the last one. Returning to (B.12) with these calculations, we obtain (B.11). Note that the Itô term is $\frac{1}{2}\sigma^2 \frac{\partial^2 f}{\partial x^2}$.

For a quick example use of (B.11) take dX = dW (so μ = 0 and σ = 1 in (B.10)) and $f(t, x) = x^2$; verify (B.9).

An important application of (B.11) occurs for lognormal stock prices. Let $f(t, x) = \log(x)$ and $dS_t = \mu S\, dt + \sigma S\, dW_t$; the coefficients in (B.10) in this application are μS in place of μ and σS in place of σ. From (B.11),
\[ d(\log S) = \left( 0 + \mu S \frac{1}{S} + \frac{1}{2} \sigma^2 S^2 \frac{-1}{S^2} \right) dt + \sigma S \frac{1}{S}\, dW_t = \left( \mu - \frac{1}{2}\sigma^2 \right) dt + \sigma\, dW_t. \tag{B.13} \]
We may solve (B.13) by integrating,
\[ \log(S_T) - \log(S_0) = \int_0^T \left( \mu - \frac{1}{2}\sigma^2 \right) dt + \int_0^T \sigma\, dW_t = \left( \mu - \frac{1}{2}\sigma^2 \right) T + \sigma W_T = \left( \mu - \frac{1}{2}\sigma^2 \right) T + \sigma \sqrt{T}\, Z \tag{B.14} \]
for $Z \sim N(0,1)$. In this way we verify (1.22) in an entirely different way.
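
Equation (B.14) gives a direct way to sample the terminal price of geometric Brownian motion; a minimal Java sketch (ours; names are illustrative):

    // Sample S_T of geometric Brownian motion directly from (B.14).
    static double sampleST(double S0, double mu, double sigma, double T,
                           java.util.Random rng) {
      double Z = rng.nextGaussian();   // N(0,1); see Section A.9
      double logReturn = (mu - 0.5 * sigma * sigma) * T
                         + sigma * Math.sqrt(T) * Z;
      return S0 * Math.exp(logReturn);
    }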

B.3 Black-Scholes Differential Equation

In this section we derive the differential equation for a call option. This can be made to be an ordinary differential equation by constructing a portfolio consisting of both the option and some of the underlying stock, because both are affected by the same stochastic process. As we have previously seen, the right amount of stock is that given by the delta of the option; we denote that amount by α. The derivation here is incomplete in that within the scope of this text we cannot explain why it is that we are allowed to treat α as a constant in the derivation. For an explanation of this matter see for example [Jos03]. Let C be the price of a call option and let S be the price of the underlying stock. Consider a portfolio A consisting of one option and α shares of stock, A = C + αS. The portfolio is a function of time and stock price, A = A(t, S), and S follows geometric Brownian motion

\[ dS = \mu S\, dt + \sigma S\, dW_t. \]
From Itô's Lemma (B.11) we have
\[ dA = \left( \frac{\partial C}{\partial t} + \mu S \left( \frac{\partial C}{\partial S} + \alpha \right) + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} \right) dt + \sigma S \left( \frac{\partial C}{\partial S} + \alpha \right) dW_t. \]
The objective at this point is to eliminate the stochastic component of the portfolio; this can be achieved by choice of α:
\[ \alpha = -\frac{\partial C}{\partial S}. \tag{B.15} \]
Then dA reduces to the ordinary differential
\[ dA = \left( \frac{\partial C}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} \right) dt. \tag{B.16} \]
Invoking arbitrage pricing theory, the portfolio must grow at the risk free rate r, so
\[ dA = rA\, dt = r(C + \alpha S)\, dt = r\left( C - S \frac{\partial C}{\partial S} \right) dt. \]
Substituting for dA in (B.16) and transposing the partial derivative to the right hand side gives the Black-Scholes differential equation
\[ rC = \frac{\partial C}{\partial t} + rS \frac{\partial C}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 C}{\partial S^2}. \tag{B.17} \]
We must leave the development at this point except to say that, through a series of substitutions, (B.17) reveals itself to be a form of the well-known Heat Equation and can be solved analytically. The solution is (3.29).

Exercise: Check that the expression for C given in (3.29) is a solution of (B.17).

C Convergence of the Binomial Method

Consider now the binomial model with parameters p, u, and d. To adjust the model to the given application we match means and variances, see (1.34) and (1.35). One solution to these equations can be found by setting p = 1/2. With this choice, we show that the solutions for u and d are¹
\[ \begin{aligned} u &= 1 + \mu \Delta t + \sigma \sqrt{\Delta t} + o(\Delta t) \\ d &= 1 + \mu \Delta t - \sigma \sqrt{\Delta t} + o(\Delta t). \end{aligned} \tag{C.1} \]

From (1.34) and the choice of p above,
\[ \frac{1}{2} u + \frac{1}{2} d = \frac{1}{2}\big(1 + \mu \Delta t + \sigma \sqrt{\Delta t}\big) + \frac{1}{2}\big(1 + \mu \Delta t - \sigma \sqrt{\Delta t}\big) + o(\Delta t) = 1 + \mu \Delta t + \frac{(\mu \Delta t)^2}{2!} + \frac{(\mu \Delta t)^3}{3!} + \cdots = e^{\mu \Delta t}. \]
Then for (1.35) we have
\[ \frac{1}{2} u^2 + \frac{1}{2} d^2 = \frac{1}{2}\big(1 + \mu \Delta t + \sigma \sqrt{\Delta t}\big)^2 + \frac{1}{2}\big(1 + \mu \Delta t - \sigma \sqrt{\Delta t}\big)^2 + o(\Delta t) = (1 + \mu \Delta t)^2 + \sigma^2 \Delta t = 1 + (2\mu + \sigma^2)\Delta t + \frac{(2\mu + \sigma^2)^2 (\Delta t)^2}{2!} + \cdots = e^{(2\mu + \sigma^2)\Delta t}, \]
the equalities holding up to terms of order o(Δt).

Let B be a Bernoulli random variable taking the value 1 with probability 1/2 and −1 with probability 1/2; the mean of B is E(B) = 0 and its variance is var(B) = 1. Then, up to terms of order o(Δt), the binomial method advances one step with $S_{i+1} = S_i u$ if an up-step or $S_{i+1} = S_i d$ if a down-step; thus

¹ Recall that o(Δt) refers to terms which, when divided by Δt, go to 0 as Δt → 0.

\[ S_{i+1} = S_i \big(1 + \mu \Delta t + \sigma \sqrt{\Delta t}\, B_i\big). \]
And hence
\[ S_n = S_0 \prod_{i=1}^n \big(1 + \mu \Delta t + \sigma \sqrt{\Delta t}\, B_i\big). \tag{C.2} \]

We may now apply the development of Section 1.5.2 with the $B_i$ in place of the $Z_i$. Just as in that case, the limiting end-point distribution for $S_T$ is given by (1.21) and repeated here:
\[ \log \frac{S_T}{S_0} \sim \mu T + \sigma \sqrt{T}\, Z - \frac{1}{2} \sigma^2 T. \]
Note that in the derivation on page 15, since $B^2 = 1$ with probability 1, the sum

\[ -\frac{1}{2} \sigma^2 \Delta t \sum_{i=1}^n B_i^2 = -\frac{1}{2} \sigma^2 n \Delta t = -\frac{1}{2} \sigma^2 T \]
directly, without the need to invoke the Central Limit Theorem.
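
A minimal Java sketch (ours; names are illustrative) of one binomial path per (C.1)–(C.2), stepping with Bernoulli ±1 samples:

    // One end price S_n of the binomial model, stepping per (C.2).
    static double binomialEndPrice(double S0, double mu, double sigma,
                                   double T, int n, java.util.Random rng) {
      double dt = T / n, S = S0;
      for( int i = 0; i < n; ++i ) {
        double B = rng.nextDouble() < 0.5 ? 1.0 : -1.0;   // Bernoulli +-1
        S *= 1.0 + mu * dt + sigma * Math.sqrt(dt) * B;
      }
      return S;
    }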

D Variance Reduction Techniques

Three applications provided inspiration for the invention of the digital computer: the code breaking work at Bletchley Park, the ballistic artillery calculations at the Aberdeen Proving Grounds in Maryland, and the neutron flux calculations at Los Alamos. The Los Alamos challenge required no less than a unique and ingenious solution, and thus the Monte Carlo method was born, and named, by Stan Ulam and John von Neumann. Initially a great deal of expectation was held out for the method. But soon it became clear that Monte Carlo was just too slow. Despite that, the method continued on, solving those problems that could only be solved by Monte Carlo. Meanwhile research went into the problem of speeding up Monte Carlo calculations. Being stochastic by its very nature, at the heart of the problem is that answers generated by Monte Carlo vary from run to run. So how can the variance be reduced? One way is to perform more iterations; but since the standard error shrinks only by the factor $1/\sqrt{n}$, halving the error requires four times the iterations. Fortunately for Monte Carlo, computer speed has revived the method. For most of the algorithms given in this text, a few hundred thousand iterations can be conducted in under one second. Then too, often the accuracy required is to the nearest penny, three decimal places. However a handful of the algorithms we encountered do indeed require a great deal of time. And so we seek methods for reducing the variance other than more iterations. In the following we give a brief introduction to the main techniques used to reduce variance in Monte Carlo applications. For a more comprehensive treatment, with financial applications, see [Gla03].

D.1 Antithetic Sampling

Virtually all of our simulations use either uniform random samples $U \sim U(0,1)$ or standard normal random samples $Z \sim N(0,1)$. If U is a uniform sample, then so is 1 − U; and if Z is a standard normal sample, then so is −Z. The pairs U and 1 − U, or Z and −Z, are called

antithetic variates. The idea behind antithetic variates is that if C = G(S) is a calculation based on a random variable S, for example C is the value of a call option based on the price S, then $C' = G(S')$ is likely to be negatively correlated with C if the simulation of $S'$ is based on antithetic variates to those generating S. This can be carried out quite easily. In the loop over $i = 1, 2, \ldots, n$ where each step $S_i$ is generated, simultaneously generate $S_i'$ by using Z for the former and −Z for the latter. The option value estimate is based on all 2n observations,
\[ \bar{C} = \frac{1}{2n} \left( \sum_{k=1}^n C_k + \sum_{k=1}^n C_k' \right) = \frac{1}{n} \sum_{k=1}^n \frac{C_k + C_k'}{2}. \tag{D.1} \]
Although the 2n observations $C_k$ and $C_k'$, $k = 1, \ldots, n$, are not independent, the n observations $(C_k + C_k')/2$ are. Therefore the error estimates of Section A.8 apply to them.
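
As an illustration, the following Java sketch (ours; parameters and names are illustrative) prices a European call by antithetic sampling, pairing Z with −Z and averaging as in (D.1):

    // Antithetic estimate of a European call value under lognormal prices.
    static double antitheticCall(double S0, double r, double sigma,
                                 double T, double K, int n,
                                 java.util.Random rng) {
      double drift = (r - 0.5 * sigma * sigma) * T, vol = sigma * Math.sqrt(T);
      double sum = 0;
      for( int k = 0; k < n; ++k ) {
        double Z = rng.nextGaussian();
        double ST1 = S0 * Math.exp(drift + vol * Z);   // uses Z
        double ST2 = S0 * Math.exp(drift - vol * Z);   // uses -Z, antithetic
        double C1 = Math.exp(-r * T) * Math.max(ST1 - K, 0);
        double C2 = Math.exp(-r * T) * Math.max(ST2 - K, 0);
        sum += 0.5 * (C1 + C2);                        // (C_k + C'_k)/2
      }
      return sum / n;
    }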

D.2 Control Variates

Let $C_k$, $k = 1, 2, \ldots, n$, be the results of n replications of a simulation from which we want to estimate E(C). Suppose on each replication we calculate another output $R_k$ and that the expectation E(R) is known. Then for any constant β,

\[ C_k(\beta) = C_k - \beta\big(R_k - E(R)\big) \tag{D.2} \]
is a control variate estimator. The expectation is given by
\[ \bar{C}(\beta) = \bar{C} - \beta\big(\bar{R} - E(R)\big) = \frac{1}{n} \sum_{k=1}^n \Big( C_k - \beta\big(R_k - E(R)\big) \Big). \tag{D.3} \]
The variance is given by

\[ \operatorname{var} C(\beta) = \operatorname{var}\big(C - \beta(R - E(R))\big) = \operatorname{var}(C) - 2\beta \operatorname{covar}(C, R) + \beta^2 \operatorname{var}(R). \tag{D.4} \]

Thus, depending on the choice of β, the control variate estimator can have very much smaller variance than var(C). The optimal value is
\[ \beta = \frac{\operatorname{covar}(C, R)}{\operatorname{var}(R)}. \tag{D.5} \]
Although knowing this value is equivalent to knowing E(C), β can be statistically estimated from the runs themselves. As an example consider estimating the cost of a path dependent option, E(C). The discounted stock prices $S_i$ can be used as a control variate since, by the martingale property, $E(e^{-rT} S_T) = S_0$.
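
The following Java sketch (ours; names are illustrative) carries this out for a European call with control variate $R = e^{-rT} S_T$, estimating β per (D.5) from the runs themselves:

    // Control variate estimate (D.2)-(D.3); E(R) = S0 by the martingale property.
    static double controlVariateCall(double S0, double r, double sigma,
                                     double T, double K, int n,
                                     java.util.Random rng) {
      double drift = (r - 0.5 * sigma * sigma) * T, vol = sigma * Math.sqrt(T);
      double[] C = new double[n], R = new double[n];
      for( int k = 0; k < n; ++k ) {
        double ST = S0 * Math.exp(drift + vol * rng.nextGaussian());
        C[k] = Math.exp(-r * T) * Math.max(ST - K, 0);
        R[k] = Math.exp(-r * T) * ST;              // control variate
      }
      double mc = 0, mr = 0;
      for( int k = 0; k < n; ++k ) { mc += C[k]; mr += R[k]; }
      mc /= n;  mr /= n;
      double cov = 0, var = 0;                     // estimate beta per (D.5)
      for( int k = 0; k < n; ++k ) {
        cov += (C[k] - mc) * (R[k] - mr);
        var += (R[k] - mr) * (R[k] - mr);
      }
      double beta = cov / var;
      return mc - beta * (mr - S0);                // C-bar(beta) of (D.3)
    }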

D.3 Importance Sampling

Suppose we want to determine the expectation of a function of a random variable X and the function is only non-zero for rare values of X. Without compensating in some way, most of the values contributing to the expectation will be zero. This is as it should be of course, but at the same time it is inefficient. The idea in importance sampling is to change the probability density so that the rare events occur more often. We encountered an example of this problem when calculating the boundary of a shout option; only those paths that venture near the boundary area are effective at determining the boundary. To treat the problem in general terms, assume we want to estimate the expectation
\[ \theta = E(h(X)) = \int h(x) f(x)\, dx \tag{D.6} \]
where f(x) is the density of X. Let g(x) be another density function which is positive wherever f is positive. The integral (D.6) can be written
\[ \theta = \int h(x) \frac{f(x)}{g(x)}\, g(x)\, dx = E_g\!\left( h(X) \frac{f(X)}{g(X)} \right). \tag{D.7} \]
The expectation here is with respect to the density g. In order to gain insight as to the choice of g, suppose h is a positive function; then for some constant c, $\frac{1}{c} h(x) f(x)$ is a density. If now g were taken to be that density, then hf/g = c and the variance of the estimator hf/g is zero. (And from (D.7), c = θ.) This shows that g should be chosen as close to the product hf as possible. For example suppose that h is the indicator function of some set A, $h(x) = 1_A(x)$. Then $\theta = \Pr(X \in A)$. In this case the optimal choice $g = hf/\Pr(X \in A)$ is the conditional density of X given $X \in A$. This means choosing g to make the event $X \in A$ more likely.
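
As a concrete instance (our example), to estimate $\theta = \Pr(Z > c)$ for $Z \sim N(0,1)$ and large c, sample instead from g = N(c, 1) so the rare event is common; the likelihood ratio is $f(x)/g(x) = e^{-cx + c^2/2}$:

    // Importance sampling per (D.7): tail probability Pr(Z > c) for N(0,1).
    static double tailProbIS(double c, int n, java.util.Random rng) {
      double sum = 0;
      for( int k = 0; k < n; ++k ) {
        double X = c + rng.nextGaussian();           // X ~ N(c,1), density g
        if( X > c )                                   // h(X) = indicator
          sum += Math.exp(-c * X + 0.5 * c * c);      // weight f(X)/g(X)
      }
      return sum / n;
    }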

E Shell Sort

The following program creates a permutation array $r_i$ so that the doubly subscripted array $x_{r_i}$ is sorted low to high,
\[ x_{r_0} \le x_{r_1} \le \cdots \le x_{r_{n-1}}. \]

    void sort(double[] x, int[] rankPerm) {
      boolean done;
      int swap, gap, len = x.length;
      for( int i = 0; i < len; ++i ) rankPerm[i] = i;  // identity permutation
      gap = len / 2;
      while( gap >= 1 ) {
        do {
          done = true;
          for( int i = 0; i < len - gap; ++i ) {
            if( x[rankPerm[i]] > x[rankPerm[i + gap]] ) {
              swap = rankPerm[i];                      // exchange the two ranks
              rankPerm[i] = rankPerm[i + gap];
              rankPerm[i + gap] = swap;
              done = false;
            } //end if
          } //end for
        } while( !done ); //end of do..while
        gap = gap / 2;
      } //end while
    }

F NextDayPrices Program

    import java.io.*;
    import java.util.*;
    import static java.lang.System.err;

    public class nextDayPrices
    {
      static int histoTrials = 200;
      static int trialsPerHisto = 50;
      static double binWidth = 1.0;
      static double binStart = 30.0;
      static double binEnd = 71.0;
      static int[] histo;

      static ArrayList rDate = new ArrayList();
      static ArrayList rOpen = new ArrayList();
      static ArrayList rHigh = new ArrayList();
      static ArrayList rLow = new ArrayList();
      static ArrayList rClose = new ArrayList();
      static ArrayList rVolume = new ArrayList();
      static ArrayList rAdjClose = new ArrayList();

      //===
      static int
      getPrices(String tkr)
      {
        FileInputStream fis = null;
        String line;
        StringBuffer ipBuff = new StringBuffer();
        try {
          //Open a reader on the file
          fis = new FileInputStream(tkr + ".csv");

          BufferedReader reader =
            new BufferedReader(new InputStreamReader(fis));

          //Read the file and store in buffer
          while( (line = reader.readLine()) != null ) {
            ipBuff.append(line + "\n");
          }
          //Close the file
          reader.close();
          fis.close();

        } catch( IOException ioe ) {
          err.println("Exc. reading prices, error= " + ioe);
          err.println("Quitting...");
          System.exit(2);
        }
        return parseDataTable(ipBuff);
      }
      //===
      /**
       * return size of the lists or an error code
       * -1 no such element, -2 array index OOB,
       * -3 list sizes mis-match
       */
      static int
      parseDataTable(StringBuffer ipBuff)
      {
        String text = ipBuff.toString();
        String strTokens = ",\n";
        StringTokenizer st = new StringTokenizer(text, strTokens, false);
        String tokA;

        //first 7 are the table headers
        try {
          for( int i = 0; i < 7; ++i ) tokA = st.nextToken();
        } catch( NoSuchElementException nsee ) {
          err.println(" parseDataTable: (parsing table header) " + nsee);
          return -1;
        }

        //reset the arraylists
        rDate.clear(); rOpen.clear(); rHigh.clear();
        rLow.clear(); rClose.clear(); rVolume.clear();
        rAdjClose.clear();

        int j = 0, k = st.countTokens();
        try {
          //pick the values from the web page (rows of 7)
          for( ; j < k; j += 7 ) {
            // ... (the body of this loop and the remainder of the listing
            // did not survive extraction; a surviving fragment shows the
            // program tallying up-days against down-days:)
            //   if( p1 > p0 ) { ++upCnt; ++oaUp; }
            //   else if( p1 < p0 ) ...

References

[Ach00] Achelis, S.B.: Technical Analysis from A to Z. McGraw-Hill, New York (2000)
[Bai94] Bailey, R.W.: Polar generation of random variates with the t-distribution. Math. Comput. 62(206), 779–781 (1994)
[Ber96] Bernstein, P.L.: Against the Gods. Wiley, New York (1996)
[Bjo04] Björk, T.: Arbitrage Theory in Continuous Time. Oxford University Press, New York (2004)
[BS02] Borodin, A., Salminen, P.: Handbook of Brownian Motion – Facts and Formulae. Birkhäuser Verlag, Basel (2002)
[BFS83] Bratley, P., Fox, B.L., Schrage, L.E.: A Guide to Simulation. Springer, New York (1983)
[CZ03] Capiński, M., Zastawniak, T.: Mathematics for Finance: An Introduction to Financial Engineering. Springer, New York (2003)
[CD03] Carmona, R., Durrleman, V.: Pricing and hedging spread options. SIAM Rev. 45(4), 627–685 (2003)
[CT04] Cont, R., Tankov, P.: Financial Modeling with Jump Processes. Chapman & Hall, New York (2004)
[Der96] Derman, E.: Outperformance options. In: Nelken, I. (ed.) The Handbook of Exotic Options, Chapter Nine. Irwin, Chicago (1996)
[Fri07] Fries, C.: Mathematical Finance. Wiley-Interscience, New York (2007)
[Gla03] Glasserman, P.: Monte Carlo Methods in Financial Engineering. Springer, New York (2003)
[Gra76] Graybill, F.: Theory and Application of the Linear Model. Wadsworth Publishing, Belmont (1976)
[Hau99] Haugen, R.A.: The New Finance: The Case Against Efficient Markets. Prentice Hall, Upper Saddle River (1999)
[Hull11] Hull, J.C.: Options, Futures, and Other Derivatives. Prentice Hall, Upper Saddle River (2011)
[Jos03] Joshi, M.: The Concepts and Practice of Mathematical Finance. Cambridge University Press, New York (2003)

[KT75] Karlin, S., Taylor, H.M.: A First Course in Stochastic Processes. Academic Press, New York (1975)
[Kel56] Kelly, J.L., Jr.: A new interpretation of information rate. Bell Syst. Tech. J. 35, 917–926 (1956)
[KV90] Kemna, A.G.Z., Vorst, A.C.F.: A Pricing Method for Options Based on Average Asset Values. Econometrisch Instituut, Erasmus University, Rotterdam (1990). http://ideas.repec.org/a/eee/jbfina/v14y1990i1p113-129.html
[Lon90] Longstaff, F.A.: Pricing options with extendible maturities: analysis and applications. J. Financ. XLV(3), 935–957 (1990)
[LS01] Longstaff, F.A., Schwartz, E.S.: Valuing American options by simulation: a simple least squares approach. Rev. Financ. Stud. 14(4), 557–583 (2001)
[Lun98] Luenberger, D.G.: Investment Science. Oxford University Press, New York (1998)
[Mal03] Malkiel, B.G.: A Random Walk Down Wall Street. W.W. Norton, New York (2003)
[Mar78] Margrabe, W.: The value of an option to exchange one asset for another. J. Financ. 33, 177–186 (1978)
[Mer76] Merton, R.: Option pricing when the underlying stock returns are discontinuous. J. Financ. Econ. 3, 124–144 (1976)
[Mey98] Meyer, G.H.: Pricing options with transaction costs with the method of lines. In: Otani, M. (ed.) Nonlinear Evolution Equations and Applications. Kokyuroku, vol. 1061. RIMS Kyoto University (1998)
[Nel96] Nelken, I. (ed.): The Handbook of Exotic Options. McGraw-Hill, New York (1996)
[Pap05] Papapantoleon, A.: An introduction to Lévy processes with applications in finance. Web document, http://page.math.tu-berlin.de/~papapan/papers/introduction.pdf (2005)
[PI07] Park, C.-H., Irwin, S.H.: What do we know about the profitability of technical analysis? J. Econ. Surv. 21(4), 786–826 (2007)
[Pou05] Poundstone, W.: Fortune's Formula. Hill and Wang, New York (2005)
[Rip87] Ripley, B.D.: Stochastic Simulation. Wiley, New York (1987)
[Rom12] Roman, S.: Introduction to the Mathematics of Finance. Springer, New York (2012)
[Sch03] Schoutens, W.: Lévy Processes in Finance: Pricing Financial Derivatives. Wiley, New York (2003)
[SM09] Shonkwiler, R., Mendivil, F.: Explorations in Monte Carlo Methods. Springer, New York (2009)
[Yud10] Yudaken, L.: Numerical pricing of shout options. MSc thesis, Trinity College, University of Oxford, Oxford, UK (2010)

