Transformations and Expectations of Random Variables

$X \sim F_X(x)$: a random variable $X$ distributed with CDF $F_X$. Any function $Y = g(X)$ is also a random variable. If both $X$ and $Y$ are continuous random variables, can we find a simple way to characterize $F_Y$ and $f_Y$ (the CDF and PDF of $Y$) based on the CDF and PDF of $X$?

For the CDF:
$$F_Y(y) = P_Y(Y \le y) = P_Y(g(X) \le y) = P_X(\{x \in \mathcal{X} : g(x) \le y\}) \quad (\mathcal{X} \text{ is the sample space for } X)$$
$$= \int_{\{x \in \mathcal{X} :\, g(x) \le y\}} f_X(s)\, ds.$$

For the PDF: $f_Y(y) = F_Y'(y)$. Caution: we need to keep track of the support of $Y$.

Consider several examples:

1. $X \sim U[-1,1]$ and $Y = \exp(X)$. That is,
$$f_X(x) = \begin{cases} \frac{1}{2} & \text{if } x \in [-1,1] \\ 0 & \text{otherwise,} \end{cases} \qquad F_X(x) = \frac{1}{2} + \frac{1}{2}x \quad \text{for } x \in [-1,1].$$
Then
$$F_Y(y) = \operatorname{Prob}(\exp(X) \le y) = \operatorname{Prob}(X \le \log y) = F_X(\log y) = \frac{1}{2} + \frac{1}{2}\log y, \quad \text{for } y \in [\tfrac{1}{e}, e].$$
Be careful about the bounds of the support!
$$f_Y(y) = \frac{\partial}{\partial y} F_Y(y) = f_X(\log y)\,\frac{1}{y} = \frac{1}{2y}, \quad \text{for } y \in [\tfrac{1}{e}, e].$$

2. $X \sim U[-1,1]$ and $Y = X^2$. Then
$$F_Y(y) = \operatorname{Prob}(X^2 \le y) = \operatorname{Prob}(-\sqrt{y} \le X \le \sqrt{y}) = F_X(\sqrt{y}) - F_X(-\sqrt{y}) = 2 F_X(\sqrt{y}) - 1,$$
where the last step uses symmetry: $F_X(-\sqrt{y}) = 1 - F_X(\sqrt{y})$. Hence
$$f_Y(y) = \frac{\partial}{\partial y} F_Y(y) = 2 f_X(\sqrt{y})\, \frac{1}{2\sqrt{y}} = \frac{1}{2\sqrt{y}}, \quad \text{for } y \in [0,1].$$

As the first example showed, it is easy to derive the CDF and PDF of $Y$ when $g(\cdot)$ is a strictly monotonic function.

Theorems 2.1.3, 2.1.5: When $g(\cdot)$ is a strictly increasing function,
$$F_Y(y) = \int_{-\infty}^{g^{-1}(y)} f_X(x)\, dx = F_X(g^{-1}(y)), \qquad f_Y(y) = f_X(g^{-1}(y))\, \frac{\partial}{\partial y} g^{-1}(y) \quad \text{(using the chain rule)}.$$
Note: by the inverse function theorem,
$$\frac{\partial}{\partial y} g^{-1}(y) = 1 \Big/ \left[g'(x)\right]\Big|_{x = g^{-1}(y)}.$$
When $g(\cdot)$ is a strictly decreasing function,
$$F_Y(y) = \int_{g^{-1}(y)}^{\infty} f_X(x)\, dx = 1 - F_X(g^{-1}(y)), \qquad f_Y(y) = -f_X(g^{-1}(y))\, \frac{\partial}{\partial y} g^{-1}(y) \quad \text{(using the chain rule)}.$$
These are the change-of-variables formulas for transformations of univariate random variables.

Here is a special case of a transformation. Thm 2.1.10: Let $X$ have a continuous CDF $F_X(\cdot)$ and define the random variable $Y = F_X(X)$. Then $Y \sim U[0,1]$, i.e., $F_Y(y) = y$ for $y \in [0,1]$.
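A quick simulation makes these formulas concrete. The following sketch is not part of the original notes; it assumes NumPy is available and the variable names are illustrative. It draws $X \sim U[-1,1]$, checks the derived density $f_Y(y) = 1/(2y)$ of $Y = \exp(X)$ against a histogram, and verifies the probability integral transform of Thm 2.1.10.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(-1.0, 1.0, size=n)   # X ~ U[-1, 1]
y = np.exp(x)                        # Y = exp(X), supported on [1/e, e]

# Compare an empirical histogram of Y with the derived density f_Y(y) = 1/(2y).
edges = np.linspace(np.exp(-1.0), np.exp(1.0), 21)
hist, _ = np.histogram(y, bins=edges, density=True)
midpoints = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - 1.0 / (2.0 * midpoints))))   # small for large n

# Probability integral transform (Thm 2.1.10): F_X(X) = 1/2 + X/2 should be U[0, 1].
u = 0.5 + 0.5 * x
print(u.mean(), u.var())   # approximately 1/2 and 1/12
```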
Expected value (Definition 2.2.1): The expected value, or mean, of a random variable $g(X)$ is
$$Eg(X) = \begin{cases} \int_{-\infty}^{\infty} g(x) f_X(x)\, dx & \text{if } X \text{ is continuous} \\ \sum_{x \in \mathcal{X}} g(x) P(X = x) & \text{if } X \text{ is discrete,} \end{cases}$$
provided that the integral or the sum exists.

The expectation is a linear operator (just like integration), so that
$$E\left[\sum_{i=1}^{n} \alpha\, g_i(X) + b\right] = \sum_{i=1}^{n} \alpha\, E g_i(X) + b.$$

Note: The expectation is a population average, i.e., you average values of the random variable $g(X)$, weighting by the population density $f_X(x)$. A statistical experiment yields sample observations $X_1, X_2, \ldots, X_n \sim F_X$. From these sample observations we can calculate the sample average $\bar{X}_n \equiv \frac{1}{n} \sum_i X_i$. In general $\bar{X}_n \ne EX$, but under some conditions, as $n \to \infty$, $\bar{X}_n \to EX$ in some sense (which we discuss later).

The expected value is a commonly used measure of "central tendency" of a random variable $X$. But the mean may not exist. Example: the Cauchy random variable with density $f(x) = \frac{1}{\pi(1+x^2)}$ for $x \in (-\infty, \infty)$. Note that
$$\int_{-\infty}^{\infty} \frac{x}{\pi(1+x^2)}\, dx = \int_{-\infty}^{0} \frac{x}{\pi(1+x^2)}\, dx + \int_{0}^{\infty} \frac{x}{\pi(1+x^2)}\, dx$$
$$= \lim_{a \to -\infty} \int_{a}^{0} \frac{x}{\pi(1+x^2)}\, dx + \lim_{b \to \infty} \int_{0}^{b} \frac{x}{\pi(1+x^2)}\, dx
= \lim_{a \to -\infty} \frac{1}{2\pi}\left[\log(1+x^2)\right]_a^0 + \lim_{b \to \infty} \frac{1}{2\pi}\left[\log(1+x^2)\right]_0^b$$
$$= -\infty + \infty, \quad \text{which is undefined.}$$

Other measures:

1. Median: $\operatorname{med}(X) = m$ such that $F_X(m) = 0.5$. It is robust to outliers and has a nice invariance property: for $Y = g(X)$ with $g(\cdot)$ monotonic increasing, $\operatorname{med}(Y) = g(\operatorname{med}(X))$.

2. Mode: $\operatorname{Mode}(X) = \arg\max_x f_X(x)$.

Moments: an important class of expectations. For each integer $n$, the $n$-th (uncentred) moment of $X \sim F_X(\cdot)$ is $\mu_n' \equiv EX^n$. The $n$-th centred moment is $\mu_n \equiv E(X - \mu)^n = E(X - EX)^n$ (it is centred around the mean $EX$). For $n = 2$, $\mu_2 = E(X - EX)^2$ is the variance of $X$, and $\sqrt{\mu_2}$ is the standard deviation.

Important formulas:

• $\operatorname{Var}(aX + b) = a^2 \operatorname{Var} X$ (the variance is not a linear operator)

• $\operatorname{Var} X = E(X^2) - (EX)^2$: an alternative formula for the variance

The moments of a random variable are summarized in the moment generating function.

Definition: the moment-generating function (MGF) of $X$ is $M_X(t) \equiv E \exp(tX)$, provided that the expectation exists for $t$ in some neighborhood $[-h, h]$ of zero. Specifically,
$$M_X(t) = \begin{cases} \int_{-\infty}^{\infty} e^{tx} f_X(x)\, dx & \text{for } X \text{ continuous} \\ \sum_{x \in \mathcal{X}} e^{tx} P(X = x) & \text{for } X \text{ discrete.} \end{cases}$$
The uncentred moments of $X$ are generated from this function by
$$EX^n = M_X^{(n)}(0) \equiv \left.\frac{d^n}{dt^n} M_X(t)\right|_{t=0},$$
which is the $n$-th derivative of the MGF, evaluated at $t = 0$.

When it exists (see below), the MGF provides an alternative description of a probability distribution. Mathematically, it is a Laplace transform, which can be convenient for certain mathematical calculations.

Example: the standard normal distribution.
$$M_X(t) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(tx - \frac{x^2}{2}\right) dx
= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left((x - t)^2 - t^2\right)\right) dx$$
$$= \exp\left(\tfrac{1}{2} t^2\right) \cdot \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}(x - t)^2\right) dx = \exp\left(\tfrac{1}{2} t^2\right) \cdot 1,$$
where the last term on the RHS is the integral over the density function of $N(t, 1)$, which integrates to one.

First moment: $EX = M_X'(0) = \left. t \exp(\tfrac{1}{2} t^2) \right|_{t=0} = 0$.
Second moment: $EX^2 = M_X''(0) = \left. \left[\exp(\tfrac{1}{2} t^2) + t^2 \exp(\tfrac{1}{2} t^2)\right] \right|_{t=0} = 1$.

In many cases the MGF can characterize a distribution, but the problem is that it may not exist (e.g., the Cauchy distribution). For a random variable $X$, is its distribution uniquely determined by its moment generating function? Thm 2.3.11: For $X \sim F_X$ and $Y \sim F_Y$, if $M_X$ and $M_Y$ exist, and $M_X(t) = M_Y(t)$ for all $t$ in some neighborhood of zero, then $F_X(u) = F_Y(u)$ for all $u$.

Note that if the MGF exists, then it characterizes a random variable with an infinite number of moments (because the MGF is infinitely differentiable). The converse is not necessarily true (e.g., the log-normal random variable: $X \sim N(0,1)$, $Y = \exp(X)$).
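To see how the MGF generates moments in practice, here is a minimal numerical sketch (not from the original notes; it assumes NumPy and uses illustrative names). It approximates $M_X'(0)$ and $M_X''(0)$ for the standard normal MGF $M_X(t) = \exp(t^2/2)$ with central finite differences, and cross-checks $M_X(t)$ itself by Monte Carlo as a sample average of $e^{tX}$.

```python
import numpy as np

def mgf_normal(t):
    """MGF of the standard normal: M_X(t) = exp(t^2 / 2)."""
    return np.exp(0.5 * t**2)

h = 1e-4
# Central finite differences for the first two derivatives at t = 0.
m1 = (mgf_normal(h) - mgf_normal(-h)) / (2 * h)                      # ~ EX   = 0
m2 = (mgf_normal(h) - 2 * mgf_normal(0.0) + mgf_normal(-h)) / h**2   # ~ EX^2 = 1
print(f"EX   ~ {m1:.6f}  (exact 0)")
print(f"EX^2 ~ {m2:.6f}  (exact 1)")

# Monte Carlo cross-check of the MGF itself: M_X(t) = E exp(tX).
rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
for t in (0.5, 1.0):
    print(t, np.mean(np.exp(t * x)), mgf_normal(t))
```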
Characteristic function: The characteristic function of a random variable $g(X)$ is defined as
$$\phi_{g(X)}(t) = E_X \exp(it\, g(X)) = \int_{-\infty}^{+\infty} \exp(it\, g(x)) f(x)\, dx,$$
where $f(x)$ is the density of $X$. This is also called the "Fourier transform".

Features of the characteristic function:

• The CF always exists. This follows from the equality $e^{itx} = \cos(tx) + i\sin(tx)$: both the real and imaginary parts of the integrand are bounded functions.

• Consider a symmetric density function, with $f(-x) = f(x)$ (symmetric around zero). Then the resulting $\phi(t)$ is real-valued and symmetric around zero.

• The CF completely determines the distribution of $X$ (every CDF has a unique characteristic function).

• Let $X$ have characteristic function $\phi_X(t)$. Then $Y = aX + b$ has characteristic function $\phi_Y(t) = e^{ibt}\phi_X(at)$.

• For $X$ and $Y$ independent, with characteristic functions $\phi_X(t)$ and $\phi_Y(t)$: $\phi_{X+Y}(t) = \phi_X(t)\,\phi_Y(t)$.

• $\phi(0) = 1$.

• For a given characteristic function $\phi_X(t)$ such that $\int_{-\infty}^{+\infty} |\phi_X(t)|\, dt < \infty$ (here $|\cdot|$ denotes the modulus of a complex number: for $x + iy$, $|x + iy| = \sqrt{x^2 + y^2}$), the corresponding density $f_X(x)$ is given by the inverse Fourier transform,
$$f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \phi_X(t) \exp(-itx)\, dt.$$

Example: the $N(0,1)$ distribution, with density $f(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$. Take as given that the characteristic function of $N(0,1)$ is
$$\phi_{N(0,1)}(t) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(itx - x^2/2\right) dx = \exp(-t^2/2). \qquad (1)$$
Hence the inversion formula yields
$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \exp(-t^2/2)\exp(-itx)\, dt.$$
Now making the substitution $z = -t$, we get
$$\frac{1}{2\pi} \int_{-\infty}^{+\infty} \exp\left(izx - z^2/2\right) dz = \frac{1}{\sqrt{2\pi}}\, \phi_{N(0,1)}(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) = f_{N(0,1)}(x) \qquad \text{(using Eq. (1))}.$$

• The characteristic function also summarizes the moments of a random variable. Specifically, note that the $h$-th derivative of $\phi(t)$ is
$$\phi^{(h)}(t) = \int_{-\infty}^{+\infty} i^h g(x)^h \exp(it\, g(x)) f(x)\, dx. \qquad (2)$$
Hence, assuming the $h$-th moment, denoted $\mu_{g(x)}^h \equiv E[g(x)^h]$, exists, it is equal to
$$\mu_{g(x)}^h = \phi^{(h)}(0) / i^h.$$
Hence, assuming that the required moments exist, we can use Taylor's theorem to expand the characteristic function around $t = 0$ to get
$$\phi(t) = 1 + \frac{it}{1!}\mu_{g(x)}^1 + \frac{(it)^2}{2!}\mu_{g(x)}^2 + \cdots + \frac{(it)^k}{k!}\mu_{g(x)}^k + o(t^k).$$

• Cauchy distribution, cont'd: The characteristic function for the Cauchy distribution is
$$\phi(t) = \exp(-|t|).$$
This is not differentiable at $t = 0$, which by Eq. (2) is consistent with the fact that the moments of the Cauchy distribution (starting with the mean) do not exist.
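As an illustrative sketch (not part of the original notes; it assumes NumPy and uses illustrative names), the empirical characteristic function $\frac{1}{n}\sum_j e^{itX_j}$ of simulated Cauchy draws can be compared against $\exp(-|t|)$; the kink at $t = 0$ goes hand in hand with the non-existence of the mean, which is also visible in the wandering sample average.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_cauchy(size=200_000)   # Cauchy(0, 1) draws

# Empirical characteristic function vs. the exact phi(t) = exp(-|t|).
for t in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
    ecf = np.mean(np.exp(1j * t * x))   # (1/n) sum_j exp(i t X_j); imaginary part ~ 0 by symmetry
    print(f"t = {t:+.1f}: empirical CF = {ecf.real:+.4f}, exp(-|t|) = {np.exp(-abs(t)):+.4f}")

# The sample mean does not settle down as n grows (the mean does not exist),
# while the sample median stays near the center of symmetry, 0.
for n in (10**3, 10**4, 10**5):
    print(f"n = {n}: sample mean = {np.mean(x[:n]):+.3f}, sample median = {np.median(x[:n]):+.3f}")
```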