Political Science 100a/200a Fall 2001
Probability, part II¹

1 Random variables

• Recall idea of a variable and types of variables: ...

• Will now redescribe in probability theory terms as a random variable. Here is a technical/mathematical definition:

Defn: A random variable is a function that assigns a number to each point in a sample space S.

For social science purposes, a more intuitive definition is this: A random variable is a process or mechanism that assigns values of a variable to different cases.

• e.g.: Let S = {a list of countries}, with an arbitrary country in the sample space denoted s ∈ S.

  1. Then X(s) = 1990 per capita GDP of country s is a random variable. For instance,
     X(Ghana) = $902
     X(France) = $13,904
     etc...

  2. Likewise, Y(s) = 1 if country s had a civil war in the 1990s, and Y(s) = 0 if not, is a random variable.

  3. S = {list of survey respondents}, X(s) = Party ID of respondent s.

• Typically, we will drop the notation X(s) and just write X and Y, leaving it implicit that this is a random variable that associates a number to each point in a sample space.

• Why "random"?

  – Because we will be thinking about "experiments" where we draw a point or collection of points from the sample space at random or according to some probability distribution.

  – Because we may believe that the process that assigns a number like GDP (the value of the variable) to a case has a random or stochastic element.

  – Some statisticians, like Friedman, actually define a random variable as "a chance procedure for generating a number," by which he means some repeatable process that has a well-understood source of randomness and for which "objective" probabilities can in principle be computed.

¹Notes by James D. Fearon, Dept. of Political Science, Stanford University, October 2001.
  – For others (and most social scientists, I think) there is no necessary implication that the value of the variable (e.g., a country's per capita GDP) is actually produced by a stochastic process, although in fact this is often how we will think about it (treat it as if it were the result of a repeatable random process).

    ∗ e.g.: an individual's propensity for suicide is assumed to be based on characteristics of the person (such as degree of social integration) plus "random noise"

    ∗ e.g.: a particular measurement of the weight of an object is understood to be produced as equal to true weight + bias + noise.

• Observation: Things that are true of functions are also true, generally, of random variables.

  – Thus, if X(s) and Y(s) are random variables (defined on the same sample space), then so are X(s) + Y(s), X(s) − Y(s), X(s)Y(s), and X(s)/Y(s) (provided Y(s) ≠ 0 for all s ∈ S).

  – In general: Thm: If φ(z) is a function and X(s) is a random variable, then φ(X(s)) is a random variable as well.

  – How are such combinations and compositions of two random variables formed? Case by case. For example, we form Z(s) = X(s) + Y(s) by adding the values of X(s) and Y(s) for each point s in the sample space.

• Important: Instead of working with a probability distribution or measure defined on the sample space S, it is often more convenient to work with the implied probability distribution on the different possible values of the random variable X(s).

• Let

  f(x) = P({s | X(s) = x}).

  Thus, f(x) is the probability that the random variable will take the value x. The set {s | X(s) = x} is the set of all points in the sample space (possible outcomes of the experiment) such that X(s) = x. (Note: We need to do it like this because there might be many such s ∈ S.)

• So f(x) is a probability distribution on the possible outcomes of X(s), which need not be the same as the probability distribution on the state space S.
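The construction of f(x) from a measure on S can be sketched in a few lines of code. This is an illustrative sketch only: the three respondents and their party IDs are made up, but the logic (sum P(s) over all s with X(s) = x) is exactly the definition above.

```python
from fractions import Fraction

# Toy sample space: three equally likely survey respondents,
# with X(s) = party ID of respondent s (codings are made up).
P = {"s1": Fraction(1, 3), "s2": Fraction(1, 3), "s3": Fraction(1, 3)}
party_id = {"s1": 1, "s2": 2, "s3": 1}

# Induced distribution: f(x) = P({s : X(s) = x})
f = {}
for s, p in P.items():
    x = party_id[s]
    f[x] = f.get(x, 0) + p   # sum P(s) over every s mapped to x

print(f)  # {1: Fraction(2, 3), 2: Fraction(1, 3)}
```

Note that two different points of S ("s1" and "s3") are collapsed into the single value x = 1, which is exactly why f is defined on sets {s | X(s) = x} rather than on individual points.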
  – e.g.: Let X(s) be the number of heads that appear in two tosses of a fair coin, with S taken as {hh, ht, th, tt}. Then

    f(0) = P({tt}) = 1/4
    f(1) = P({ht, th}) = 1/2
    f(2) = P({hh}) = 1/4
    f(x) = 0 for all other x (x ≠ 0, 1, 2)

  – e.g.: Let X(s) be 1 for dyad years with a war and 0 for dyad years with no war, with S taken as a sample of dyad years. Then

    f(0) = P({s : s is a peaceful dyad year}) = .999
    f(1) = P({s : s is a dyad year at war}) = .001

• From now on, we will often work directly with random variables and the associated probability distributions on them as defined here, bypassing the underlying sample space.

2 Discrete random variables

Defn: A random variable X has a discrete probability distribution f(x) if X can take only a finite number of values (or a countably infinite number of values).

• e.g.: Let X be the number of heads in n = 10 tosses of a fair coin (thus X is a random variable). Then for x ∈ {0, 1, 2, ..., 10},

  f(x) = (10 choose x) · (1/2)^10,

  and for all other x, f(x) = 0. Plot with Stata ...

• Note that this is the same idea as for a histogram, which may take a more continuous variable, group it into categories, and display the probability associated with each of these categories.

• e.g.: Let X be the number of bills sent by Congress to the president in a given year. f(x) would then refer to the probability that x bills are sent.

Another way to describe the probability distribution of a random variable ...

Defn: If a random variable X has a discrete probability distribution given by f(x), then the cumulative distribution function (c.d.f.) of X is

  F(x) = P(X ≤ x).

• In words, F(x) is the probability that the random variable produces a value less than or equal to x.

• e.g.: Graph the c.d.f. of a random variable distributed by a binomial distribution for n = 2 and p = 1/2. ... e.g.: for n = 3 ...
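The p.m.f. and c.d.f. just defined can be computed directly. A minimal sketch for the fair-coin binomial (Python's `math.comb` is the binomial coefficient):

```python
from math import comb

def f(x, n):
    """p.m.f.: probability of exactly x heads in n fair tosses."""
    return comb(n, x) / 2**n if 0 <= x <= n else 0.0

def F(x, n):
    """c.d.f.: F(x) = P(X <= x), the running sum of the p.m.f."""
    return sum(f(k, n) for k in range(0, min(x, n) + 1))

print(f(5, 10))   # 252/1024 = 0.24609375
print(F(1, 2))    # f(0) + f(1) = 1/4 + 1/2 = 0.75 (the n = 2 c.d.f. at x = 1)
print(F(10, 10))  # 1.0 -- the ten-toss probabilities sum to one
```

Evaluating F at x = 0, 1, 2 for n = 2 gives the step heights 1/4, 3/4, 1 of the staircase c.d.f. the notes ask you to graph.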
• Defn: If an experiment or process produces two random variables X and Y, then the joint probability distribution of X and Y may be written as

  f(x, y) = P(X = x & Y = y).

• e.g.: You roll two dice, letting X be the random variable referring to the result on the first die, and Y the value that shows on the second. What is f(x, y)?

• Recall defn of independence, example where it does not hold (e.g., income and life expectancy) ...

• Draw 3D picture of a discrete joint density ...

3 Continuous random variables

• Whenever you work with a sample space S that is finite, you get a discrete distribution f(x). The samples we observe in practice are always finite and thus have discrete distributions.

• For theoretical reasons, though, it is extremely useful to be able to represent the idea of a sample space that is (uncountably) infinite.

• e.g.: Take as the sample space all the points in the unit interval, so that S = [0, 1], with typical element x ∈ [0, 1].

• Suppose further that we consider a probability measure such that every point in this interval is equally likely. This is called a uniform distribution on [0, 1], and is often denoted U[0, 1].

• How do we characterize this in terms of a probability distribution function like f(x) for a discrete distribution? A paradox: If there are an infinite number of possible outcomes in the sample space, and each is equally likely, then what is the probability of any particular point, e.g., s = .378? ... if 0, won't work; if positive, won't work ...

• Instead, we define a probability measure that assigns positive probability numbers only to sets of points in [0, 1], and in particular, only to what are called measurable sets of points in the sample space.

• e.g.: Intuitively, what would be the probability of drawing a point from the subinterval [0, 1/2] if all points are equally likely? What about from [1/4, 1/2]?

• In general, we will characterize the uniform probability distribution on [0, 1] in terms of a density function f(x).
• This is related to but NOT the same thing as f(x) = P(X(s) = x) in the discrete case.

Defn: A density function f(x) for a continuous random variable X has the property that the area under f(x) over any interval [a, b] is equal to the probability of drawing an x ∈ [a, b].

• In the case of U[0, 1], f(x) = 1 for all x ∈ [0, 1].

• Show graphically ...

• Demonstrate that P([a, b]) equals the area under f(x) for [a, b].

• What about the probability of drawing x = .358? Or what about the probability of drawing x ∈ {.358, .989}? Defined as having "measure zero" (i.e., zero probability). True for any collection of particular points, even if infinite.

• In calculus terms, a random variable X has a continuous probability distribution or continuous density if there exists a nonnegative function f(x) defined on the real line such that for any interval A ⊂ R,

  P(X ∈ A) = ∫_A f(x) dx.

  (For those who haven't had calculus, think of the S-like thing as another form of summation operator.)
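The "integral as a limit of sums" point can be made concrete without calculus. Below is a sketch that approximates P(X ∈ [a, b]) for the U[0, 1] density by a simple midpoint Riemann sum: chop [a, b] into thin slices, take f at each slice's midpoint, and add up slice areas. For f(x) = 1 the answer is just the length b − a, matching the intuitive probabilities above (1/2 for [0, 1/2], 1/4 for [1/4, 1/2]).

```python
def f(x):
    """Density of U[0, 1]: flat at height 1 on the unit interval."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def prob(a, b, steps=100_000):
    """Approximate P(X in [a, b]) as the area under f via a midpoint sum."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

print(round(prob(0.0, 0.5), 6))   # 0.5  -- area of a 1-by-1/2 rectangle
print(round(prob(0.25, 0.5), 6))  # 0.25
print(f(1.5))                     # 0.0 -- zero density outside [0, 1]
```

Note also why single points get probability zero here: a "slice" of width zero contributes zero area no matter how tall f is, which is the measure-zero idea stated above.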