A Math Tool Kit

Total Pages: 16

File Type: PDF, Size: 1020 KB

Introduction to Quantitative Finance: A Math Tool Kit
Robert R. Reitano
The MIT Press, Cambridge, Massachusetts; London, England

© 2010 Massachusetts Institute of Technology. All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For information, please email [email protected] or write to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA 02142.

This book was set in Times New Roman on 3B2 by Asco Typesetters, Hong Kong, and was printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data: Reitano, Robert R., 1950–. Introduction to quantitative finance : a math tool kit / Robert R. Reitano. p. cm. Includes index. ISBN 978-0-262-01369-7 (hardcover : alk. paper). 1. Finance—Mathematical models. I. Title. HG106.R45 2010. 332.0105195—dc22. 2009022214.

10 9 8 7 6 5 4 3 2 1

Index (excerpt, A–C)

Abelian group, 38
Absolute convergence. See under Convergence
Absolute moments of discrete distribution, 274–75
Absolute value, 45
Accounting regimes, 515
Accumulated value functions, 55–56
Accumulation point, 130: and convergence, 148, 151, 164; of bounded numerical sequence, 152, 154–55; and ratio test, 195
Advanced Quantitative Finance: A Math Tool Kit, xxiii, 405–406
Affine, 524
Aggregate loss model, 310–13
Aleph (ℵ), 143
Algebra: associative normed division, 73; fundamental theorem of, 49; linear, 323; sigma, 235, 238, 614–15 (Borel, 618)
ALM (asset–liability management, asset–liability risk management), 101, 514–21
Almost everywhere (a.e.), 574
Alternating geometric series, 185
Alternating harmonic series, 183, 184–85, 478
Alternating series convergence test, 193–94
American option, 329
Analytic function, 470–73, 477, 482, 486
Annualized basis, of interest-rate calculation, 56
Annual rate basis, 53
Annuity, 55, 221, 645: financial application of, 223–24
Antiderivative, 583, 592, 595, 596
Approximation: with binomial lattice, of equity prices, 326, 406; of derivatives, 504–505; of duration and convexity, 509–14, 516–17, 651–54; of functions, 417, 440 (error in, 468; improvement of, 450–52, 465–66; and Taylor polynomials, 470); of integral of normal density, 654–60; and numerical integration, 609 (and Simpson's rule, 612–13; and trapezoidal rule, 609–12); and Stirling's formula for n!, 371–74
Arbitrage, risk-free, 26, 320, 331, 517–18
Aristotle of Stagira, and wheel of Aristotle, 9–10
Arithmetic, fundamental theorem of, 35, 38
Arithmetic mean–geometric mean inequality (AGM), 79, 502
Arrow, Kenneth J., 530
Arrow–Pratt measure of absolute risk aversion, 530
Ask price, 137
Asset(s): consumption, 330; investment, 330; risky, 391–92
Asset allocation, 320: and Euclidean space, 93–95; framework, 319–24; minimal risk, 508–509; optimal risk, 528–31; price function approximation in, 222–23
Asset hedging, 516
Asset–liability management (asset–liability risk management, ALM), 101, 514–21
Asset portfolio, risk-free, 320, 387–88
Associative normed division algebra, 73
Associativity, of point addition in vector spaces, 72
Assumptions, need to examine, 26, 27
Autocorrelation, 325
Axiomatic set theory, 4–6, 117–21 (see also Set theory): basic operations of, 121–22
Axiom of choice, 118, 120
Axioms, 4, 24, 31: for natural numbers, 32; for set theory, 118–20
Banach, Stefan, 202
Banach space, 199–202
Barber's paradox, 10, 139–40
Base-b expansion, 43–44
Basis point, 100
Bell-shaped curve, 378–79, 634
Benchmark bonds, 99
Bernoulli, Jakob, 290
Bernoulli distribution, 290
Bernoulli trial, 290
Beta distribution, 628–30
Beta function, 628
Beta of a stock portfolio, 104
Bézout, Étienne, 35
Bézout's identity, 35–36
Biconditional, 12
Bid–ask spread (bid–offer spread), 103, 137
Bid price, 137
Big O convergence, 440–41, 442
Binary call option, European, 662–63
Binary connectives, 11
Binomial coefficients, 249, 467
Binomial distribution, 250, 290–92: and De Moivre–Laplace theorem, 368; and European call option, 402–403; and geometric distribution, 292; negative, 296–99, 314, 350; and Poisson distribution, 299 (see also Poisson distribution)
Binomial lattice equity price models as Δt → 0: real world, 392–400; risk-neutral, 532–43; special risk-averter, 543
Binomial lattice model, 326–28, 399, 400, 404, 407: Cox–Ross–Rubinstein, 406
Binomial probabilities, approximation of, 376–77, 379–80
Binomial probability density function: continuitization of, 666–68 (piecewise, 664–66); and De Moivre–Laplace theorem, 381; limiting density of, 370
Binomial random variables, 290, 291, 377, 403
Binomial scenario model, 328–29
Binomial stock price model, 395
Binomial theorem, 220, 250
Black, Fischer, 405
Black–Scholes–Merton option-pricing formulas, 404–406, 521, 538, 547–49: generalized, 660–75 (and continuitization of binomial distribution, 666–68; and limiting distribution of continuitization, 668–71; and piecewise continuitization of binomial distribution, 664–66); limiting distributions for, 532, 543
Bond-pricing functions, 57–59
Bond reconstitution, 98
Bonds: classification of, 95; present value of, 56–57; price versus par, 58
Bond yield to maturity, and interval bisection, 167–70
Bond yields, 95, 96, 644–45: conversion to spot rates (bootstrapping), 99; parameters for, 96
Bond yield vector risk analysis, 99–100
Bootstrapping, 99
Borel, Émile, 134, 618
Borel sets, 618
Borel sigma algebra, 618
Bounded derivative, 465
Bounded interval, 122
Boundedness: and continuous functions, 434–35, 438; and convergence, 150, 151; and integrability, 567–68; of sequence, 158
Bounded numerical sequence, 145: accumulation point, 152, 154–55
Bounded subset, 131
Bound variable, 11
Burden of proof, 1, 2
Business school finance students, xxvii
C (field of complex numbers), 45, 48: as metric space, 162, 165; numerical series defined on, 177
Cⁿ (n-dimensional complex space), 72: as metric space, 160, 162; standard norm and inner product for, 74–75
Calculus, xxxiv, 417, 559–60: financial applications of (differentiation): asset–liability management, 514–21; Black–Scholes–Merton option-pricing formulas, 547–49; constrained optimization, 507; continuity of price functions, 505–506; duration and convexity approximation, 509–14, 516–17; the "Greeks," 521–22; interval bisection, 507–508; minimal risk asset allocation, 508–509; optimal risky asset allocation, 528–31; risk-neutral binomial distribution, 532–43; special risk-averter binomial distribution, 543–47; utility theory, 522–28. Financial applications of (integration): approximating integral of normal density, 654–60; continuous discounting, 641–44; continuous stock dividends and reinvestment, 649–51; continuous term structure, 644–49; duration and convexity approximation, 651–54; generalized Black–Scholes–Merton formula, 660–75. And functions, 417–20: continuity of, 420–33 (see also Continuous functions). Fundamental theorem of: version I, 581–84, 586, 592, 609, 643; version II, 585–87, 598, 647, 648, 652. And integration, 559 (see also Integration). Multivariate, 91, 323, 515, 522, 625, 635
Calderón, Alberto P., xxxvi, 25
Call option, price of, 521. See also European call option
Cantor, Georg, 38, 42, 125
Cantor's diagonalization argument, 42, 361, 370
Cantor's theory of infinite cardinal numbers, 143
Cantor ternary set, 125–27: generalized, 143
Capital letters, in probability theory notation, 280
Capital risk management, 514
Cardinal numbers, Cantor's theory on, 143
Cartesian plane, 45
Cartesian product, 71
Cash flow vectors, 100–101
Cauchy, Augustin Louis, 42, 75, 162, 475, 599, 632
Cauchy criterion, 162, 163, 445
Cauchy distribution, 632–34
Cauchy form of Taylor series remainder, 476, 599, 600–601
Cauchy–Schwarz inequality, 75, 76, 78, 80: and correlation, 272–74
Cauchy sequences, 42, 162–65, 201–202, 445: and complete metric space, 165–67; and convergence, 178–79; and lp-norms, 197; and lp-space, 201
Cayley, Arthur, 73
Cayley–Dickson construction, 73
c.d.f. See Cumulative distribution function
Central difference approximation, 504, 505
Central limit theorem, xxxiii, 381–86: De Moivre–Laplace theorem as, 368, 381
Central moments: of discrete distribution, 274; of continuous distribution, 624
Certain life annuity, m-year, 318
Characteristic function (c.f.), 277–78, 625: and complex-valued functions, 559; of discrete random variable, 484–85; uniqueness of, 347–48
Closed sets: in R, 122–27; in Rⁿ, 127–28
Cluster point, 130
Collateralized mortgage obligation (CMO), 513
Collective loss model, 310–13
Combinatorics (combinatorial analysis), 247: general orderings, 248–52; simple ordered samples, 247–48; and variance of sample variance, 285
Common expectation formulas, 624–26
Common stock-pricing functions, 60–61, 217–18, 506. See also at Stock
Commutative group, 37, 38
Commutativity, of point addition in vector spaces, 72
Compact metric space, 160–61
Compactness, 131, 136: and continuous functions, 436, 438; and Heine–Borel theorem, 131, 132–33, 134; and general metric space, 158
Comparative ratio test, for series convergence, 194
Comparison test, for series convergence, 183, 191–92, 208
Complement, set as, 121
Complete collection of events, 234–35, 239, 614. See also Sigma algebra
Complete metric space, 164: and Cauchy sequences, 165–67; under metric d, 165–66
Completeness, of a mathematical theory, 4, 24
Complete normed linear space, 201, 202 (see also Banach space): with compatible inner product, 206 (see also Hilbert space)
Complete probability space, 237, 615
Complex analysis, 278, 347, 417, 559
Complex conjugate, 45
Complex linear metric space, 160
Recommended publications
  • Laws of Total Expectation and Total Variance
Laws of Total Expectation and Total Variance

Definition of conditional density. Let X be an arbitrary random variable with density f_X, and take an event A with P(A) > 0. The conditional density f_{X|A} is defined as

    f_{X|A}(x) = f_X(x) / P(A) for x ∈ A, and f_{X|A}(x) = 0 for x ∉ A.

Note that f_{X|A} is supported only on A.

Definition of conditional expectation conditioned on an event.

    E(h(X) | A) = ∫_A h(x) f_{X|A}(x) dx = (1 / P(A)) ∫_A h(x) f_X(x) dx

Example. For the random variable X with density function f(x) = x³/64 for 0 < x < 4 (and 0 otherwise), calculate E(X² | X ≥ 1).

Solution. Step 1:

    P(X ≥ 1) = ∫₁⁴ (1/64) x³ dx = [x⁴/256] from x = 1 to x = 4 = 255/256

Step 2:

    E(X² | X ≥ 1) = (1 / P(X ≥ 1)) ∫_{x ≥ 1} x² f(x) dx = (256/255) ∫₁⁴ x² · (x³/64) dx
                  = (256/255)(1/64) [x⁶/6] from x = 1 to x = 4 = (256/255)(1/64)(4095/6) = 182/17 ≈ 10.706

Definition of conditional variance conditioned on an event. Var(X | A) = E(X² | A) − E(X | A)².

Example. For the previous example, calculate the conditional variance Var(X | X ≥ 1).

Solution. We already calculated E(X² | X ≥ 1); we only need E(X | X ≥ 1):

    E(X | X ≥ 1) = (256/255) ∫₁⁴ x · (x³/64) dx = (256/255)(1/64) [x⁵/5] from x = 1 to x = 4
                 = (256/255)(1/64)(1023/5) = 1364/425 ≈ 3.209

Finally:

    Var(X | X ≥ 1) = E(X² | X ≥ 1) − E(X | X ≥ 1)² = 182/17 − (1364/425)² = 73254/180625 ≈ 0.406

Definition of conditional expectation conditioned on a random variable.
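
A quick numerical check of the example above (an editorial sketch, not part of the excerpted handout): sample X by inverting its CDF F(x) = x⁴/256 and average over the event {X ≥ 1}.

    import numpy as np

    rng = np.random.default_rng(0)

    # Inverse-CDF sampling: F(x) = x^4/256 on (0, 4), so F^{-1}(u) = (256*u)**0.25.
    x = (256 * rng.uniform(size=1_000_000)) ** 0.25

    tail = x[x >= 1.0]             # restrict to the event {X >= 1}
    e2 = np.mean(tail ** 2)        # estimates E(X^2 | X >= 1) = 182/17   ≈ 10.706
    e1 = np.mean(tail)             # estimates E(X  | X >= 1) = 1364/425 ≈ 3.209
    print(e2, e1, e2 - e1 ** 2)    # conditional variance ≈ 0.406
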
• Probability Cheatsheet v2.0
Probability Cheatsheet v2.0

Compiled by William Chen (http://wzchen.com) and Joe Blitzstein, with contributions from Sebastian Chiu, Yuan Jiang, Yuqi Hou, and Jessy Hwang. Material based on Joe Blitzstein's (@stat110) lectures (http://stat110.net) and Blitzstein/Hwang's Introduction to Probability textbook (http://bit.ly/introprobability). Licensed under CC BY-NC-SA 4.0. Please share comments, suggestions, and errors at http://github.com/wzchen/probability_cheatsheet. Last updated September 4, 2015.

Thinking Conditionally

Independence. Independent events: A and B are independent if knowing whether A occurred gives no information about whether B occurred. More formally, A and B (which have nonzero probability) are independent if and only if one of the following equivalent statements holds:

    P(A ∩ B) = P(A)P(B)
    P(A | B) = P(A)
    P(B | A) = P(B)

Conditional independence: A and B are conditionally independent given C if P(A ∩ B | C) = P(A | C)P(B | C).

Law of Total Probability (LOTP). Let B₁, B₂, B₃, ..., Bₙ be a partition of the sample space (i.e., they are disjoint and their union is the entire sample space). Then

    P(A) = P(A | B₁)P(B₁) + P(A | B₂)P(B₂) + ··· + P(A | Bₙ)P(Bₙ)
    P(A) = P(A ∩ B₁) + P(A ∩ B₂) + ··· + P(A ∩ Bₙ)

For LOTP with extra conditioning, just add in another event C!

    P(A | C) = P(A | B₁, C)P(B₁ | C) + ··· + P(A | Bₙ, C)P(Bₙ | C)
    P(A | C) = P(A ∩ B₁ | C) + P(A ∩ B₂ | C) + ··· + P(A ∩ Bₙ | C)

Special case of LOTP with B and Bᶜ as the partition:

    P(A) = P(A | B)P(B) + P(A | Bᶜ)P(Bᶜ)
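
As an illustration (an editorial sketch, not from the cheatsheet; the partition and probabilities below are made up): LOTP can be checked by simulation with any partition, here a three-valued B.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000

    # B partitions the sample space into {0, 1, 2}; P(A | B = k) depends on k.
    b = rng.choice(3, size=n, p=[0.5, 0.3, 0.2])
    p_a_given_b = np.array([0.1, 0.4, 0.8])
    a = rng.uniform(size=n) < p_a_given_b[b]

    direct = a.mean()  # P(A) estimated directly
    lotp = sum(a[b == k].mean() * (b == k).mean() for k in range(3))
    print(direct, lotp)  # both ≈ 0.1*0.5 + 0.4*0.3 + 0.8*0.2 = 0.33
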
  • Probability Cheatsheet
Probability Cheatsheet v1.1.1

Compiled by William Chen (http://wzchen.com) with contributions from Sebastian Chiu, Yuan Jiang, Yuqi Hou, and Jessy Hwang. Material based off of Joe Blitzstein's (@stat110) lectures (http://stat110.net) and Blitzstein/Hwang's Intro to Probability textbook (http://bit.ly/introprobability). Licensed under CC BY-NC-SA 4.0. Please share comments, suggestions, and errors at http://github.com/wzchen/probability_cheatsheet. Last updated March 20, 2015.

Simpson's Paradox. It is possible that

    P(A | B, C) < P(A | Bᶜ, C) and P(A | B, Cᶜ) < P(A | Bᶜ, Cᶜ), yet still P(A | B) > P(A | Bᶜ).

Bayes' Rule and Law of Total Probability. Law of total probability with partitioning set B₁, B₂, B₃, ..., Bₙ, and with extra conditioning (just add C!):

    P(A) = P(A | B₁)P(B₁) + P(A | B₂)P(B₂) + ··· + P(A | Bₙ)P(Bₙ)
    P(A) = P(A ∩ B₁) + P(A ∩ B₂) + ··· + P(A ∩ Bₙ)
    P(A | C) = P(A | B₁, C)P(B₁ | C) + ··· + P(A | Bₙ, C)P(Bₙ | C)
    P(A | C) = P(A ∩ B₁ | C) + P(A ∩ B₂ | C) + ··· + P(A ∩ Bₙ | C)

Expected Value, Linearity, and Symmetry. Expected value (aka mean, expectation, or average) can be thought of as the "weighted average" of the possible outcomes of our random variable. Mathematically, if x₁, x₂, x₃, ... are all of the possible values that X can take, the expected value of X can be calculated as follows:

    E(X) = Σᵢ xᵢ P(X = xᵢ)

Note that for any X and Y, with a and b scaling coefficients and c a constant, the following property of linearity of expectation holds:

    E(aX + bY + c) = aE(X) + bE(Y) + c

If two random variables have the same distribution, then even when they are dependent, by the property of symmetry their expected values are equal.

Counting
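
A small simulation sketch (an editorial addition, not from the cheatsheet) showing that linearity needs no independence: here Y is a deterministic function of X, i.e. maximally dependent, and the identity still holds.

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.exponential(scale=2.0, size=1_000_000)
    y = x ** 2                      # Y depends completely on X
    a, b, c = 3.0, -1.0, 5.0

    lhs = np.mean(a * x + b * y + c)
    rhs = a * x.mean() + b * y.mean() + c
    print(lhs, rhs)                 # equal up to floating-point error
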
• Conditional Expectation and Prediction
Conditional expectation and prediction

Conditional frequency functions and pdfs have the properties of ordinary frequency and density functions. Hence, associated with a conditional distribution is a conditional mean. If Y and X are discrete random variables, the conditional frequency function of Y given x is p_{Y|X}(y|x), and the conditional expectation of Y given X = x is

    E(Y | X = x) = Σ_y y p_{Y|X}(y|x)

Continuous case:

    E(Y | X = x) = ∫ y f_{Y|X}(y|x) dy

Conditional expectation of a function h:

    E(h(Y) | X = x) = Σ_y h(y) p_{Y|X}(y|x), with the analogous integral in the continuous case.

Example. Consider a Poisson process on [0, 1] with mean λ, and let N be the number of points in [0, 1]. For p < 1, let X be the number of points in [0, p]. Find the conditional distribution and conditional mean of X given N = n. (Make a guess!)

We first find the joint distribution P(X = x, N = n), which is the probability of x events in [0, p] and n − x events in [p, 1]. From the assumption of a Poisson process, the counts in the two intervals are independent Poisson random variables with parameters pλ and (1 − p)λ (why?), so N has a Poisson marginal distribution, and the conditional frequency function of X is

    P(X = x | N = n) = P(X = x, N = n) / P(N = n) = (n choose x) pˣ (1 − p)ⁿ⁻ˣ,

a binomial distribution, so the conditional expectation is np.

The conditional expectation of Y given X = x is a function of x, and hence E(Y|X) is also a random variable. In the last example, E(X | N = n) = np, and E(X|N) = Np is a function of N, a random variable that generally has an expectation taken w.r.t. ...
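
The binomial answer can be checked by simulating the process directly (an editorial sketch; the parameter values λ = 10, p = 0.3 are illustrative):

    import numpy as np

    rng = np.random.default_rng(3)
    lam, p, trials = 10.0, 0.3, 200_000

    N = rng.poisson(lam, size=trials)          # total points on [0, 1]
    # Given N = n, the point locations are iid Uniform(0, 1);
    # X counts the points falling in [0, p].
    X = np.array([(rng.uniform(size=n) < p).sum() for n in N])

    print(X.mean(), X.var())     # X is Poisson(p*lam): mean ≈ variance ≈ 3.0
    sel = N == 12
    print(X[sel].mean(), 12 * p) # E(X | N = 12) = 12 * 0.3 = 3.6
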
  • Chapter 3: Expectation and Variance
Chapter 3: Expectation and Variance

In the previous chapter we looked at probability, with three major themes:

1. Conditional probability: P(A | B).
2. First-step analysis for calculating eventual probabilities in a stochastic process.
3. Calculating probabilities for continuous and discrete random variables.

In this chapter, we look at the same themes for expectation and variance. The expectation of a random variable is the long-term average of the random variable. Imagine observing many thousands of independent random values from the random variable of interest. Take the average of these random values. The expectation is the value of this average as the sample size tends to infinity. We will repeat the three themes of the previous chapter, but in a different order.

1. Calculating expectations for continuous and discrete random variables.
2. Conditional expectation: the expectation of a random variable X, conditional on the value taken by another random variable Y. If the value of Y affects the value of X (i.e., X and Y are dependent), the conditional expectation of X given the value of Y will be different from the overall expectation of X.
3. First-step analysis for calculating the expected amount of time needed to reach a particular state in a process (e.g., the expected number of shots before we win a game of tennis).

We will also study similar themes for variance.

3.1 Expectation

The mean, expected value, or expectation of a random variable X is written as E(X) or μ_X. If we observe N random values of X, then the mean of the N values will be approximately equal to E(X) for large N.
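
The "long-term average" description can be made concrete with a short simulation (an editorial sketch, not part of the course notes): the running mean of iid draws settles near E(X) as the sample grows.

    import numpy as np

    rng = np.random.default_rng(4)

    # X = value of a fair six-sided die, so E(X) = 3.5.
    rolls = rng.integers(1, 7, size=100_000)
    running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)
    print(running_mean[[9, 99, 9999, -1]])  # drifts toward 3.5 as N grows
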
• Lecture Notes 4: Expectation
Lecture Notes 4: Expectation

• Definition and Properties
• Covariance and Correlation
• Linear MSE Estimation
• Sum of RVs
• Conditional Expectation
• Iterated Expectation
• Nonlinear MSE Estimation
• Sum of Random Number of RVs

Corresponding pages from B&T: 81–92, 94–98, 104–115, 160–163, 171–174, 179, 225–233, 236–247. (EE 178/278A: Expectation)

Definition

• We already introduced the notion of expectation (mean) of a r.v. We generalize this definition and discuss it in more depth.
• Let X ∈ 𝒳 be a discrete r.v. with pmf p_X(x), and let g(x) be a function of x. The expectation or expected value of g(X) is defined as

    E(g(X)) = Σ_{x∈𝒳} g(x) p_X(x)

• For a continuous r.v. X ~ f_X(x), the expected value of g(X) is defined as

    E(g(X)) = ∫_{−∞}^{∞} g(x) f_X(x) dx

• Examples:
  - g(X) = c, a constant: then E(g(X)) = c.
  - g(X) = X: E(X) = Σ_x x p_X(x) is the mean of X.
  - g(X) = Xᵏ: E(Xᵏ) is the kth moment of X.
  - g(X) = (X − E(X))²: E((X − E(X))²) is the variance of X.

• Expectation is linear, i.e., for any constants a and b,

    E[a g₁(X) + b g₂(X)] = a E(g₁(X)) + b E(g₂(X))

  Examples:
  - E(aX + b) = a E(X) + b
  - Var(aX + b) = a² Var(X)

  Proof: From the definition,

    Var(aX + b) = E(((aX + b) − E(aX + b))²) = E((aX + b − a E(X) − b)²)
                = E(a²(X − E(X))²) = a² E((X − E(X))²) = a² Var(X)

Fundamental Theorem of Expectation

• Theorem: Let X ~ p_X(x) and Y = g(X) ~ p_Y(y); then

    E(Y) = Σ_{y∈𝒴} y p_Y(y) = Σ_{x∈𝒳} g(x) p_X(x) = E(g(X))

• The same formula holds for f_Y(y) using integrals instead of sums.
• Conclusion: E(Y) can be found using either f_X(x) or f_Y(y).
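
A quick sketch of the fundamental theorem of expectation for a discrete r.v. (an editorial example with a made-up pmf): computing E(g(X)) from the pmf of X agrees with computing E(Y) from the pmf of Y = g(X).

    from collections import defaultdict

    # pmf of X on a small support (illustrative values)
    p_X = {-2: 0.2, -1: 0.1, 0: 0.3, 1: 0.1, 2: 0.3}

    def g(x):
        return x * x

    # E(g(X)) directly from p_X
    e_gx = sum(g(x) * p for x, p in p_X.items())

    # pmf of Y = g(X), then E(Y)
    p_Y = defaultdict(float)
    for x, p in p_X.items():
        p_Y[g(x)] += p
    e_y = sum(y * p for y, p in p_Y.items())

    print(e_gx, e_y)  # identical: 2.2 both ways
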
• Probabilities, Random Variables and Distributions (Appendix A)
Probabilities, Random Variables and Distributions (Appendix A)

Contents
A.1 Events and Probabilities ... 318
  A.1.1 Conditional Probabilities and Independence ... 318
  A.1.2 Bayes' Theorem ... 319
A.2 Random Variables ... 319
  A.2.1 Discrete Random Variables ... 319
  A.2.2 Continuous Random Variables ... 320
  A.2.3 The Change of Variables Formula ... 321
  A.2.4 Multivariate Normal Distributions ... 323
A.3 Expectation, Variance and Covariance ... 324
  A.3.1 Expectation ... 324
  A.3.2 Variance ... 325
  A.3.3 Moments ... 325
  A.3.4 Conditional Expectation and Variance ... 325
  A.3.5 Covariance ... 326
  A.3.6 Correlation ... 327
  A.3.7 Jensen's Inequality ... 328
  A.3.8 Kullback–Leibler Discrepancy and Information Inequality ... 329
A.4 Convergence of Random Variables ... 329
  A.4.1 Modes of Convergence ... 329
  A.4.2 Continuous Mapping and Slutsky's Theorem ... 330
  A.4.3 Law of Large Numbers ... 330
  A.4.4 Central Limit Theorem ... 331
  A.4.5 Delta Method ... 331
A.5 Probability Distributions ... 332
  A.5.1 Univariate Discrete Distributions ... 333
  A.5.2 Univariate Continuous Distributions ... 335
• Statistics 201A (Introduction to Probability at an Advanced Level) - All Lecture Notes
Fall 2018 Statistics 201A (Introduction to Probability at an Advanced Level) - All Lecture Notes. Aditya Guntuboyina, August 15, 2020.

Contents
0.1 Sample spaces, Events, Probability ... 5
0.2 Conditional Probability and Independence ... 6
0.3 Random Variables ... 7
1 Random Variables, Expectation and Variance ... 8
  1.1 Expectations of Random Variables ... 9
  1.2 Variance ... 10
2 Independence of Random Variables ... 11
3 Common Distributions ... 11
  3.1 Ber(p) Distribution ... 11
  3.2 Bin(n, p) Distribution ... 11
  3.3 Poisson Distribution ... 12
4 Covariance, Correlation and Regression ... 14
5 Correlation and Regression ... 16
6 Back to Common Distributions ... 16
  6.1 Geometric Distribution ... 16
  6.2 Negative Binomial Distribution ... 17
7 Continuous Distributions ... 17
  7.1 Normal or Gaussian Distribution ... 17
  7.2 Uniform Distribution ... 18
  7.3 The Exponential Density ... 18
  7.4 The Gamma Density ... 18
8 Variable Transformations ... 19
9 Distribution Functions and the Quantile Transform ... 20
10 Joint Densities ... 22
11 Joint Densities under Transformations ... 23
  11.1 Detour to Convolutions ...
  • Lecture 15: Expected Running Time
Lecture 15: Expected running time

Plan/outline

Last class, we saw algorithms that used randomization in order to achieve interesting speed-ups. Today, we consider algorithms in which the running time is a random variable, and study the notion of the "expected" running time.

Las Vegas algorithms

Most of the algorithms we saw in the last class had the following behavior: they run in a fixed amount of time (depending on the input size and some parameter), and they have a small failure probability (i.e., a probability of returning an incorrect answer). These algorithms are often called Monte Carlo algorithms (although we refrain from doing this, as it invokes different meanings in applied domains). Another kind of algorithm, known as a Las Vegas algorithm, has the following behavior: its output is always correct, but the running time is a random variable (and in principle, the run time can be unbounded).

Sometimes, we can convert a Monte Carlo algorithm into a Las Vegas one. For instance, consider the toy problem we considered last time (where an array is promised to have indices i for which A[i] = 0, and the goal was to find one such index). Now consider the following modification of the algorithm:

    procedure find_Vegas(A):
        pick random index i in [0, ..., N-1]
        while (A[i] != 0):
            pick another random index i in [0, ..., N-1]
        end while
        return i

Whenever the procedure terminates, we get an i that satisfies A[i] = 0. But in principle, the algorithm can go on forever. Another example of a Las Vegas algorithm is quicksort, which is a simple procedure for sorting an array.
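
Here is a runnable transcription of find_Vegas (an editorial sketch; the test array with a quarter of its entries zero is an assumption for illustration). If a fraction q of the entries are zero, each probe succeeds independently with probability q, so the number of probes is geometric with expectation 1/q.

    import random

    def find_vegas(a):
        # Las Vegas search: the returned index is always correct,
        # but the number of loop iterations is a random variable.
        n = len(a)
        i = random.randrange(n)
        while a[i] != 0:             # keep probing until a zero entry is hit
            i = random.randrange(n)
        return i

    # A quarter of the entries are zero, so the expected number of probes is 4.
    a = [0, 1, 2, 3] * 250
    random.shuffle(a)
    idx = find_vegas(a)
    assert a[idx] == 0
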
• Missed Expectations?
Massachusetts Institute of Technology Course Notes, Week 14. 6.042J/18.062J, Fall '05: Mathematics for Computer Science. December 2. Prof. Albert R. Meyer and Prof. Ronitt Rubinfeld. Revised December 8, 2005, 61 minutes.

Missed Expectations?

In the previous notes, we saw that the average value of a random quantity is captured by the mathematical concept of the expectation of a random variable, and we calculated expectations for several kinds of random variables. Now we will see two things that make expectations so useful. First, they are often very easy to calculate due to the fact that they obey linearity. Second, once you know what the expectation is, you can also get some type of bound on the probability that you are far from the expectation; that is, you can show that really weird things are not that likely to happen. How good a bound you can get depends on what you know about your distribution, but don't worry: even if you know next to nothing, you can still say something relatively interesting.

1 Linearity of Expectation

1.1 Expectation of a Sum

Expected values obey a simple, very helpful rule called Linearity of Expectation. Its simplest form says that the expected value of a sum of random variables is the sum of the expected values of the variables.

Theorem 1.1. For any random variables R₁ and R₂,

    E[R₁ + R₂] = E[R₁] + E[R₂].

Proof. Let T ::= R₁ + R₂. The proof follows straightforwardly by rearranging terms from the definition of E[T]:

    E[R₁ + R₂] ::= E[T] ::= Σ_{s∈S} T(s) · Pr{s} = Σ_{s∈S} (R₁(s) + R₂(s)) · Pr{s} (Def. ...
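
The distribution-free tail bound the introduction alludes to is Markov's inequality: for a nonnegative random variable R, Pr{R ≥ a} ≤ E[R]/a. A small editorial sketch (the exponential distribution here is an arbitrary choice):

    import numpy as np

    rng = np.random.default_rng(5)
    r = rng.exponential(scale=2.0, size=1_000_000)  # nonnegative, E[R] = 2

    for a in (4.0, 8.0, 16.0):
        empirical = (r >= a).mean()
        markov = r.mean() / a
        print(a, empirical, markov)  # the empirical tail always sits below the bound
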
• On the Method of Moments for Estimation in Latent Variable Models
On the Method of Moments for Estimation in Latent Variable Models
Anastasia Podosinnikova

To cite this version: Anastasia Podosinnikova. On the Method of Moments for Estimation in Latent Variable Models. Machine Learning [cs.LG]. École Normale Supérieure de Paris - ENS Paris, 2016. English. tel-01489260v1.

HAL Id: tel-01489260 (https://tel.archives-ouvertes.fr/tel-01489260v1). Submitted on 14 Mar 2017 (v1), last revised 5 Apr 2018 (v2). HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Doctoral thesis of Université de recherche Paris Sciences et Lettres (PSL Research University), prepared at the École normale supérieure; doctoral school no. 386, specialty: Computer Science. French title: "Sur la méthode des moments pour l'estimation des modèles à variables latentes." Defended on 01.12.2016.

Jury: Animashree Anandkumar (University of California Irvine, reviewer); Samuel Kaski (Aalto University, reviewer); Francis Bach (École normale supérieure, thesis advisor); Simon Lacoste-Julien (École normale supérieure, thesis advisor); Alexandre d'Aspremont (École normale supérieure, jury member); Pierre Comon (CNRS, jury member); Rémi Gribonval (INRIA, jury member).

Abstract. Latent variable models are powerful probabilistic tools for extracting useful latent structure from otherwise unstructured data and have proved useful in numerous applications such as natural language processing and computer vision.
  • Advanced Probability
Advanced Probability. Alan Sola, Department of Pure Mathematics and Mathematical Statistics, University of Cambridge ([email protected]). Michaelmas 2014.

Contents
1 Conditional expectation ... 4
  1.1 Basic objects: probability measures, σ-algebras, and random variables ... 4
  1.2 Conditional expectation ... 8
  1.3 Integration with respect to product measures and Fubini's theorem ... 12
2 Martingales in discrete time ... 15
  2.1 Basic definitions ... 15
  2.2 Stopping times and the optional stopping theorem ... 16
  2.3 The martingale convergence theorem ... 20
  2.4 Doob's inequalities ... 22
  2.5 Lᵖ-convergence for martingales ... 24
  2.6 Some applications of martingale techniques ... 26
3 Stochastic processes in continuous time ... 31
  3.1 Basic definitions ... 31
  3.2 Canonical processes and sample paths ... 36
  3.3 Martingale regularization theorem ... 38
  3.4 Convergence and Doob's inequalities in continuous time ... 40
  3.5 Kolmogorov's continuity criterion ... 42
  3.6 The Poisson process ... 44
4 Weak convergence ... 46
  4.1 Basic definitions ... 46
  4.2 Characterizations of weak convergence ... 49
  4.3 Tightness ... 51
  4.4 Characteristic functions and Lévy's theorem ... 53
5 Brownian motion ... 56
  5.1 Basic definitions ... 56
  5.2 Construction of Brownian motion ... 59
  5.3 Regularity and roughness of Brownian paths ... 61
  5.4 Blumenthal's 0-1 law and martingales for Brownian motion ... 62
  5.5 Strong Markov property and reflection principle ... 67
  5.6 Recurrence, transience, and connections with partial differential equations ... 70
  5.7 Donsker's invariance principle ... 76
6 Poisson random measures and Lévy processes ... 81
  6.1 Basic definitions ...