
Part V - Chance Variability

Dr. Joseph Brennan

Math 148, BU

Law of Averages

In Chapter 13 we discussed the Kerrich coin-tossing experiment. Kerrich was a South African who spent World War II as a Nazi prisoner. He spent his time flipping a coin 10,000 times, faithfully recording the results.


Law of Averages: If an experiment is independently repeated a large number of times, the percentage of occurrences of a specific event E will be close to the theoretical probability of the event occurring, but off by some amount, the chance error.


As the coin toss was repeated, the percentage of heads approaches its theoretical expectation: 50%.


Caution

The Law of Averages is commonly misunderstood as the Gambler's Fallacy:

"By some magic everything will balance out. With a run of 10 heads, a tail is becoming more likely."

This is very false. After a run of 10 heads the probability of tossing a tail is still 50%!


In fact, the number of heads above half quickly increases as the experiment proceeds. A gambler betting on tails and hoping for balance would be devastated, as tails appeared about 134 fewer times than heads after 10,000 tosses.


In our coin-flipping experiment, the number of heads will be around half the number of tosses, plus or minus the chance error.

As the number of tosses goes up, the chance error gets larger in absolute terms. However, when viewed relatively, the chance error as a percentage decreases.
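To see both effects at once, here is a minimal Python simulation (a sketch, not from the slides; the seed and trial counts are arbitrary choices):

```python
import random

random.seed(148)  # arbitrary seed, for reproducibility

for n in [100, 1_000, 10_000, 100_000]:
    heads = sum(random.random() < 0.5 for _ in range(n))
    chance_error = heads - n / 2              # absolute chance error
    pct_error = 100 * abs(chance_error) / n   # chance error as a percentage
    print(f"n={n:>7}: heads={heads:>6}, error={chance_error:+.0f}, "
          f"pct={pct_error:.2f}%")
```

In a typical run the absolute error grows into the dozens while the percentage error falls well below 1%.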

Sample Spaces

Recall that a sample space S lists all the possible outcomes of a study.

Example (3 coins): We can record an outcome as a string of heads and tails, such as HHT. The corresponding sample space is

S = {HHH, HHT, HTH, THH, TTH, THT, HTT, TTT}.

It is often more convenient to deal with outcomes as numbers, rather than as verbal statements. Suppose we are interested in the number of heads. Let X denote the number of heads in 3 tosses. For instance, if the outcome is HHT, then X = 2. The possible values of X are 0, 1, 2, and 3. For every outcome from S, X will take a particular value:

Outcome  HHH  HHT  HTH  THH  TTH  THT  HTT  TTT
X         3    2    2    2    1    1    1    0

Random Variable

Random Variable: An unknown quantity subject to random change. Often a random variable will be an unknown numerical result of a study. A random variable has a numerical sample space where each outcome has an assigned probability. The assigned probabilities are not necessarily equal. The quantity X in the previous example is a random variable because its value is unknown unless the tossing experiment is performed.

Definition: A random variable is an unknown numerical result of a study.

Mathematically, a random variable is a function which assigns a numerical value to each outcome in a sample space S.

Example (3 coins)

We have two different sample spaces for our 3 coin experiment:

S = {HHH, HHT, HTH, THH, TTH, THT, HTT, TTT}.
S∗ = {0, 1, 2, 3}

The sample space S describes 8 equally likely outcomes for our coin flips, while the sample space S∗ describes 4 outcomes which are not equally likely. Recall that S∗ represents the values of the random variable X, the number of heads resulting from three coin flips.

P(X = 0) = P(TTT) = (1/2) · (1/2) · (1/2) = 1/8
P(X = 1) = P(HTT or TTH or THT) = 3/8
P(X = 2) = 3/8,  P(X = 3) = 1/8

S∗ does not contain information about the order of heads and tails.

Discrete and Continuous Random Variables

Discrete Random Variables: A discrete random variable has a number of possible values which can be listed. Mathematically we say the number of possible values is countable. The variable X in Example (3 coins) is discrete. Simple actions are discrete: rolling dice, flipping coins, dealing cards, drawing names from a hat, spinning a wheel, . . .

Continuous Random Variables: A continuous random variable takes values in an interval of numbers. It is impossible to list or count all the possible values of a continuous random variable. Mathematically we say the number of possible values is uncountable. For the data on heights of people, the average height x̄ is a continuous random variable which takes on values from some interval, say, [0, 200] (in inches).
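The 3-coin distribution above can be checked by brute-force enumeration; the following sketch (not part of the original example) lists all 8 outcomes of S and tallies X:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# The 8 equally likely outcomes of S, e.g. 'HHT'
outcomes = ["".join(s) for s in product("HT", repeat=3)]
# X = number of heads; count how many outcomes give each value
counts = Counter(o.count("H") for o in outcomes)

for x in sorted(counts):
    print(f"P(X = {x}) = {Fraction(counts[x], len(outcomes))}")
# P(X = 0) = 1/8, P(X = 1) = 3/8, P(X = 2) = 3/8, P(X = 3) = 1/8
```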

Probability Distributions

Any random variable X, discrete or continuous, can be described with:

- A probability distribution.
- A mean and standard deviation.

The probability distribution of a random variable X is defined by specifying the possible values of X and their probabilities. For discrete random variables the probability distribution is given by the probability table and is represented graphically as the probability histogram. For continuous random variables the probability distribution is given by the probability density function and is represented graphically by the density curve. Recall that we discussed density curves in Part II.

The Mean of a Random Variable X

In Part II (Descriptive Statistics) we discussed the mean and standard deviation, x̄ and s, of data sets to measure the center and spread of the observations. Similar definitions exist for random variables: The mean of the random variable X, denoted µ, measures the centrality of the probability distribution.

The mean µ is computed from the probability distribution of X as a weighted average of the possible values of X with weights being the probabilities of these values.

The Expected Value

The mean µ of a random variable X is often called the expected value of X . It means that the observed value of a random variable is expected to be around its expected value; the difference is the chance error. In other words,

observed value of X = µ + chance error

We never expect a random variable X to be exactly equal to its expected value µ. The likely size of the chance error can be determined by the standard deviation, denoted σ. The standard deviation σ measures the distribution’s spread and is a quantity which is computed from the probability distribution of X .

Random Variable X and Population

A population of interest is often characterized by the random variable X . Example: Suppose we are interested in the distribution of American heights. The random variable X (height) describes the population (US people).

The distribution of X is called the population distribution, and the distribution parameters, µ and σ, are the population parameters.

Population parameters are fixed constants which are usually unknown and need to be estimated. A sample (data set) should be viewed as values (realizations) of the random variable X drawn from the probability distribution. The sample mean x̄ and standard deviation s estimate the unknown population mean µ and standard deviation σ.

Discrete Random Variables

The distribution of a discrete random variable X is summarized in the distribution table:

Value of X   x1  x2  x3  ...  xk
Probability  p1  p2  p3  ...  pk

The symbols xi represent the distinct possible values of X and pi is the probability associated with xi.

p1 + p2 + ... + pk = 1 (or 100%)

This is due to all possible values of X being listed in the sample space S = {x1, x2,..., xk }.

The events X = xi and X = xj, i ≠ j, are disjoint since the random variable X cannot take two distinct values at the same time.

Example (Fish)

A resort on a lake claims that the distribution of the number of fish X in the daily catch of an experienced fisherman is given below.

x          0     1     2     3     4     5     6     7
P(X = x)  0.02  0.08  0.10  0.18  0.25  0.20  0.15  0.02

Find the following:
(a) P(X ≥ 5): 0.20 + 0.15 + 0.02 = 0.37
(b) P(2 < X < 5): 0.18 + 0.25 = 0.43
(c) y if P(X ≤ y) = 0.2: y = 2
(d) y if P(X > y) = 0.37: y = 4
(e) P(X ≠ 5): 1 − 0.20 = 0.80
(f) P(X < 2 or X = 6): 0.25
(g) P(X < 2 and X > 4): 0
(h) P(X = 9): 0
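As a check, the answers above can be reproduced by summing entries of the table. A sketch (the prob helper is hypothetical, and results are subject to small floating-point rounding):

```python
# The distribution table as a dict; the probabilities sum to 1.
dist = {0: 0.02, 1: 0.08, 2: 0.10, 3: 0.18,
        4: 0.25, 5: 0.20, 6: 0.15, 7: 0.02}

def prob(event):
    """Add up P(X = x) over the values x where the event holds."""
    return sum(p for x, p in dist.items() if event(x))

print(prob(lambda x: x >= 5))           # (a) 0.37
print(prob(lambda x: 2 < x < 5))        # (b) 0.43
print(prob(lambda x: x != 5))           # (e) 0.80
print(prob(lambda x: x < 2 or x == 6))  # (f) 0.25
print(prob(lambda x: x == 9))           # (h) 0.0
```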

Probability Histograms

The graphical display of the probability distribution of a discrete random variable X is called the probability histogram. There are k bars, where k is the number of possible values of X.

The i-th bar is centered at xi, has unit width and height pi. The areas of the bars display the assignment of probabilities to possible values of X.

Example (3 coins): The distribution table for X, the number of heads after 3 coin flips, is given below:

X      0    1    2    3
P(X)  1/8  3/8  3/8  1/8

Example (3 coins): The Probability Histogram

The probability histogram for the 3 coins example is shown below.

Probability Histograms and Data Histograms

Do not confuse the probability histogram and the data (empirical) histogram!

The probability histogram is a theoretical histogram which shows the probabilities of possible outcomes. Each bar on the probability histogram shows the probability of a certain outcome.

The data histogram is an empirical histogram which shows the distribution of observed outcomes. Each bar on the data histogram represents the observed frequency of that outcome.

As the probability is a long-run frequency, we should think of probability histograms as idealized pictures of the results of very many trials.

Example (Two Dice)

Two dice are rolled. Find the distribution of the total and plot its probability histogram.

Solution: Let X denote the sum on the two dice. There are 11 possible values of X.

Value of X    2     3     4     5     6     7     8     9     10    11    12
Probability  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
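This table can be generated by enumerating the 36 equally likely pairs; a sketch (note that Fraction prints in lowest terms, e.g. 2/36 as 1/18):

```python
from collections import Counter
from fractions import Fraction

# Count each total over the 36 equally likely (die1, die2) pairs
sums = Counter(a + b for a in range(1, 7) for b in range(1, 7))

for total in sorted(sums):
    print(f"P(X = {total:>2}) = {Fraction(sums[total], 36)}")
# e.g. P(X = 2) = 1/36, rising to P(X = 7) = 1/6, falling back to 1/36
```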


A computer simulated throwing a pair of dice, and the experiment was repeated 100 times, 1000 times and then 10,000 times. The empirical histograms for the sums are plotted below:

We can see that the empirical histogram converges (gets closer and closer) to the probability histogram as the number of repetitions increases.

Discrete Random Variable: µ and σ

Mean: The mean µ of a discrete random variable is found by multiplying each possible value by its probability and adding together all the products:

µ = x1p1 + x2p2 + ... + xkpk  =  Σ xipi, summing over i = 1, ..., k.

Standard Deviation: The standard deviation σ of a discrete random variable is found with the aid of µ:

σ = √[ (x1 − µ)²p1 + (x2 − µ)²p2 + ... + (xk − µ)²pk ]  =  √( Σ (xi − µ)²pi )
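The two formulas translate directly into code. A sketch (mean_sd is a hypothetical helper), checked here against the two-dice example that follows:

```python
from math import sqrt

def mean_sd(dist):
    """dist is a list of (value, probability) pairs."""
    mu = sum(x * p for x, p in dist)
    sigma = sqrt(sum((x - mu) ** 2 * p for x, p in dist))
    return mu, sigma

# The two-dice distribution from the next example: P(X = k) = (6 - |k - 7|)/36
two_dice = [(k, (6 - abs(k - 7)) / 36) for k in range(2, 13)]
print(mean_sd(two_dice))  # (7.0, 2.415...)
```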

Example (Two dice): µ and σ

Two dice are rolled. Using the distribution table above, the mean is

µ = 2 · (1/36) + 3 · (2/36) + 4 · (3/36) + ... + 12 · (1/36) = 7.

This shouldn't be too much of a surprise, as we've seen in class that the mean for rolling one die is 3.5. The standard deviation:

σ = √[ (2 − 7)² · (1/36) + (3 − 7)² · (2/36) + ... + (12 − 7)² · (1/36) ] ≈ 2.415.

Interpretation: If an experiment is repeated many, many times, and the average of outcomes, x̄, is computed, it is expected to be close to 7. An interpretation of the standard deviation is not so clear.

Box Models

Many statistical questions can be framed in terms of drawing tickets from a box.

Box Model: A model framing a statistical question as drawing tickets (with or without replacement) from a box. The tickets are labeled with numerical values linked to a random variable.

Example: Suppose we are to flip one fair coin. We can model the possible outcomes in terms of drawing from a box: there are two tickets in the box, the first labeled 1 and the second labeled 0. Flipping a head is equivalent to drawing a 1 from the box, and a tail is equivalent to drawing a 0. If we were to flip the coin multiple times, we would draw multiple tickets from the box. As coin flips are independent, we draw from the box with replacement.


A Box Model is a version of a distribution table for a random variable. It allows one to simplify a question to an easily visualized experiment (a common theme in mathematics... a large number of questions can be framed as manipulating objects found in a box). A box model should be used when a question requires classifying and counting. If you are interested in a certain subset of values for a random variable, label the tickets corresponding to your interests as 1 and the remaining tickets as 0.

Example: If you are interested in the occurrence of 3 or 4 when rolling a die, your box model contains six tickets: two labeled 1 (for the faces 3 and 4) and four labeled 0.


When we wish to describe the expected value and standard deviation for a box model, we use the formulas for discrete random variables, but we have a simpler way to visualize these ideas. The expected value of a random variable is the average of the tickets occupying the box model. The standard deviation of a random variable is the standard deviation of the tickets.

The Sum of n Independent Outcomes

Since individual outcomes of an experiment are values of a random variable X , the sum of multiple outcomes will also be a random variable. What are the mean and standard deviation of this variable? We will use the following notation: n is the number of repetitions of an experiment. µ and σ are the mean and standard deviation of the random variable X which describes a single outcome of an experiment.

In terms of a box model, n is the number of draws we make from our box. Because we want independent events, drawing from the box is with replacement.


When the same experiment is repeated independently n times, the following is true for the sum of outcomes:

The expected value of the sum of n independent outcomes of an experiment: nµ.

The standard error of the sum of n independent outcomes of an experiment: √n · σ.

The second part of the above rule is called the Square Root Law. Note that the above rule is true for any sequence of independent random variables, discrete or continuous!

Example (Test)

A test has 20 multiple choice questions. Each question has 5 possible answers, one of which is correct. A correct answer is worth 5 points, so the total possible score is 100. A student answers all questions by guessing at random. What is the expected value and standard deviation of their total score?

Solution: Let X be the number of points earned on one question. Then X is a random variable which has the following distribution:

Value of X    0    5
Probability  4/5  1/5


The mean of X is

µ = 0 · (4/5) + 5 · (1/5) = 1.

The standard deviation of X is

σ = √[ (0 − 1)² · (4/5) + (5 − 1)² · (1/5) ] = 2.

We are interested in the mean and standard error of the sum of scores from 20 questions. The questions are independent. We have:

The mean of the sum = 20 · 1 = 20 points,
The standard error of the sum = √20 · 2 ≈ 8.94 points.
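A simulation can corroborate these figures (a sketch; the seed and the 100,000 repetitions are arbitrary choices):

```python
import random
from math import sqrt

random.seed(20)  # arbitrary seed
# Score one 20-question test: each question pays 5 points with chance 1/5
scores = [sum(5 * (random.random() < 0.2) for _ in range(20))
          for _ in range(100_000)]

m = sum(scores) / len(scores)
sd = sqrt(sum((s - m) ** 2 for s in scores) / len(scores))
print(round(m, 2), round(sd, 2))  # close to 20 and 8.94
```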

A Computational Trick!

When there are just two numbers, x1 and x2, in the distribution of X, the distribution's standard deviation, σ, can be computed by using the following short-cut formula:

σ = |x1 − x2| · √(p1p2)

where pi is the probability of xi.

Example (Test): The standard deviation for the distribution of points earned by guessing on one question can be easily found as

σ = |0 − 5| · √( (4/5) · (1/5) ) = 5 · (2/5) = 2,

which coincides with what we found before.


This trick will be used often when we are interested in classifying and counting. These problems are framed as a box model with tickets being either 0 or 1. In that case

σ = √( (fraction of 1's) × (fraction of 0's) )

For the die example above (two 1's and four 0's):

σ = √( (2/6) × (4/6) ) ≈ 0.47

Standard Error

An observed value differs from the expected value by the chance error. The likely size of the chance error is given by the standard error.

The sum of the points earned from randomly selecting answers on our 20-question test is expected to be 20, give or take the standard error of 8.94 points.

The Binomial Setting

1. There are a fixed number n of repeated trials.
2. The trials are independent. In other words, the outcome of any particular trial is not influenced by previous outcomes.
3. The outcome of every trial falls into one of just two categories, which for convenience we call success and failure.
4. The probability of a success, call it p, is the same for each trial.
5. It is the total number of successes that is of interest, not their order of occurrence.

NOTE: The Binomial Setting can be framed as a box model with only 1’s and 0’s where draws are performed with replacement.


The binomial setting is appropriate under the sampling WITH replacement scheme. When sampling WITHOUT replacement, removing objects from the population changes the probability of success for the next trial and introduces dependence between the trials.

However, when the population is large enough:

- Removing a few items from it doesn't change the proportion of successes and failures significantly.
- Successive trials are nearly independent.

Conclusion: We can apply the binomial setting to sampling-without-replacement problems when the population is large.

Binomial Coefficients

The number of ways in which exactly k successes can occur in the n trials of a binomial experiment can be found as

C(n, k) = n! / (k!(n − k)!),

where n! = 1 · 2 · 3 · ... · n. The exclamation mark is read "factorial" and C(n, k) is read "n choose k".

Example: Students are given a list of nine books and told that they will be examined on the contents of five of them. How many combinations of five books are possible?

C(9, 5) = 9! / (5!4!) = 126.

There are 126 possible combinations of 5 books out of 9 books.
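Both routes to this answer are available in Python's standard library (a sketch):

```python
from math import comb, factorial

# Via the factorial formula...
print(factorial(9) // (factorial(5) * factorial(4)))  # 126
# ...or directly, using math.comb (Python 3.8+)
print(comb(9, 5))                                     # 126
```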

The Binomial Distribution

Let X denote the number of successes under the binomial setting. Then X is a random variable which may take values 0, 1, 2, 3, ..., n. In particular:

X = 0 means no successes in n trials; only failures were observed.
X = n means the outcomes of all n trials are successes.
X = 5 means 5 successes in n trials.

It turns out that X has a special discrete distribution which is called the binomial distribution. The probabilities of values of X are computed as

P(X = k) = C(n, k) · p^k · (1 − p)^(n−k),   k = 0, 1, 2, ..., n.

So the binomial distribution is a probability distribution of a random variable X which has 2 parameters: p (probability of success) and n (the number of trials).
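The formula is a one-liner in code. A sketch (binom_pmf is a hypothetical helper), sanity-checked against the 3-coin example:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n, p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example (3 coins): number of heads is binomial with n = 3, p = 1/2
print([binom_pmf(k, 3, 0.5) for k in range(4)])  # [0.125, 0.375, 0.375, 0.125]
```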

Binomial Mean and Standard Deviation

Let X be a binomial random variable with parameters n (number of trials) and p (probability of success in each trial). Then the mean and standard deviation of X are

µ = np,

σ = √( np(1 − p) ).

Example (Heart Attack)

The Helsinki Heart Study asked whether the anticholesterol drug gemfibrozil reduces heart attacks. The Helsinki study planned to give gemfibrozil to about 2000 men aged 40 to 55 and a placebo to another 2000. The probability of a heart attack during the five-year period of the study for men this age is about 0.04. What are the mean and standard deviation of the number of heart attacks that will be observed in one group if the treatment does not change this probability?

Solution:

There are 2000 independent observations, each having probability p = 0.04 of a heart attack. The count X of heart attacks has a binomial distribution.

µ = np = 2000 · 0.04 = 80,

σ = √( np(1 − p) ) = √( 2000 · 0.04 · (1 − 0.04) ) ≈ 8.76.

In fact, there were 84 heart attacks among the 2035 men actually assigned to the placebo, quite close to the mean. The gemfibrozil group of 2046 men suffered only 56 heart attacks. This is evidence that the drug does reduce the chance of a heart attack.

Example (Light Bulbs)

For a lot of 1,000,000 light bulbs the probability of a defective bulb is 0.01. What is the probability that there are 20,000 defective bulbs in a lot?

Solution: There are n = 1,000,000 bulbs (trials). The probability of a defect (success) for each bulb is p = 0.01. Let X be the number of defective bulbs out of n = 1,000,000. Then X has a binomial distribution. The expected value of X is

µ = np = 1,000,000 · 0.01 = 10,000.

The standard deviation of X is

σ = √( np(1 − p) ) = √( 1,000,000 · 0.01 · 0.99 ) = √9900 ≈ 99.5.

We want to compute the probability that X = 20,000.


We have

P(X = 20,000) = C(1,000,000, 20,000) · 0.01^20,000 · (1 − 0.01)^(1,000,000 − 20,000)
              = [ 1,000,000! / (20,000! · 980,000!) ] · 0.01^20,000 · 0.99^980,000.

If you try to compute 1,000,000! directly, it may crash your computer! How can we compute the desired probability then? We may approximate it using the normal approximation to the binomial distribution.

Normal Approximation to the Binomial Distribution

Consider the probability histograms for binomial distributions with different values of n and p.



We can see that some of the probability histograms are bell-shaped. This suggests that the binomial distribution may be approximated by the normal distribution for certain combinations of n and p.


In particular, observe the following: For a fixed p, the larger the sample size n, the better the normal approximation to the binomial distribution. For a fixed n, the closer p is to 0.5, the better the normal approximation to the binomial distribution.

NORMAL APPROXIMATION for BINOMIAL COUNTS

Let X be a random variable which has a binomial distribution with parameters n and p. When n is large, the distribution of X is approximately normal: X is approximately normal with mean np and standard deviation √( np(1 − p) ). As a rule, we will use this approximation for values of n and p that satisfy np ≥ 10 and n(1 − p) ≥ 10.

A few remarks are in order:

- The above normal approximation is easy to remember because it says that X is approximately normal with its usual mean and standard deviation.
- The true distribution of X is binomial, not normal. The normal distribution is just a good approximation of the binomial probability histogram when the conditions in the rule are satisfied.
- The normal approximation to the binomial distribution consists in replacing the actual probability histogram with the normal curve before computing areas.

Example (College Commute)

A recent survey on a college campus revealed that 40% of the students live at home and commute to college. If a random sample of 320 students is questioned, what is the probability of finding at least 130 students who live at home?

Solution: Let X be the count of students who live at home in a sample of size n = 320. Then X has a binomial distribution with parameters n = 320 and p = 0.4. We need to compute

P(X ≥ 130) = P(X = 130) + P(X = 131) + ... + P(X = 320)
           = C(320, 130) · 0.4^130 · 0.6^190 + C(320, 131) · 0.4^131 · 0.6^189 + ... + C(320, 320) · 0.4^320 · 0.6^0.

The above computation is cumbersome. Can we use the normal approximation to the binomial distribution to compute P(X ≥ 130)?


Check the conditions of the rule:

np = 320 · 0.4 = 128, n(1 − p) = 320(1 − 0.4) = 192

Since both np and n(1 − p) are greater than 10, we can use the normal approximation to the binomial distribution.

√( np(1 − p) ) = √( 320 · 0.4 · 0.6 ) = √76.8 ≈ 8.76.

Then X is approximately normal with mean np = 128 and SD √( np(1 − p) ) ≈ 8.76. How good is the normal approximation to the binomial distribution?


The figure below displays the probability histogram of the binomial distribution (bar graph) with the density curve of the approximating normal distribution superimposed.


Both distributions have the same mean and standard deviation, and both the area under the histogram and the area under the curve are 1. The normal curve fits the histogram very well.

The normal approximation to the probability of at least 130 students is the area under the normal curve to the right of

z = (130 − np) / √( np(1 − p) ) = (130 − 128) / 8.76 ≈ 0.228.

Rounding z to 0.25 and using the normal table,

P(X ≥ 130) ≈ P(Z ≥ 0.25) = (100% − 19.74%) / 2 = 40.13% (or 0.4013).


The actual binomial probability that there are at least 130 students who live at home can be computed:

P(X ≥ 130) = P(X = 130) + P(X = 131) + ... + P(X = 320) = 0.4306.

The above probability is the area under the binomial probability histogram to the right of the value x = 130. Note that the actual and approximate probabilities are quite close. The normal approximation works well in this case.
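Both numbers can be reproduced with the standard library alone (a sketch; the exact sum is feasible here because n = 320 is small):

```python
from math import comb, erf, sqrt

n, p, cutoff = 320, 0.4, 130

# Exact binomial tail: sum of P(X = k) for k = 130, ..., 320
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(cutoff, n + 1))

# Normal approximation: P(Z >= z) via the standard normal CDF
mu, sd = n * p, sqrt(n * p * (1 - p))
z = (cutoff - mu) / sd
approx = 1 - 0.5 * (1 + erf(z / sqrt(2)))

print(round(exact, 4), round(approx, 4))  # 0.4306 and roughly 0.41
```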

Example (Light Bulbs)

Recall that for a lot of 1,000,000 bulbs the probability of a defective bulb is 0.01. We want to find the probability that there are 20,000 defective bulbs in the lot. We justified that X, a count of defective bulbs in a lot, is a binomial random variable with parameters p = 0.01 and n = 1,000,000. We also concluded that we cannot handle the direct binomial computation

P(X = 20,000) = C(1,000,000, 20,000) · 0.01^20,000 · (1 − 0.01)^(1,000,000 − 20,000),

since computing factorials of large numbers is simply not feasible! The probability histogram for the binomial distribution with parameters p = 0.01 and n = 1,000,000 has 1,000,001 bars centered at the values 0, 1, 2, ..., 1,000,000.


The chance that X = 20,000 is the area of the bar over 20,000. We want to use the normal distribution to approximate the area of this rectangle. The base of this rectangle goes from 19,999.5 to 20,000.5 on the count-of-defective-bulbs scale. In standard units the base of the rectangle goes from

z1 = (19,999.5 − 10,000) / 99.5 ≈ 100.497  to  z2 = (20,000.5 − 10,000) / 99.5 ≈ 100.508.

Then P(X = 20,000) ≈ P(100.497 ≤ Z ≤ 100.508) ≈ 0.

There is almost no chance that the lot will contain EXACTLY 20,000 defective light bulbs. We can expect the normal approximation to be quite accurate in this example since

np = 1,000,000 · 0.01 = 10,000 > 10  and  n(1 − p) = 1,000,000 · 0.99 = 990,000 > 10.

Continuous Random Variables

Continuous random variables take values in intervals on the number line.

Examples of continuous random variables: Weight; Height; Volume.

The probability distributions of continuous random variables are given by probability density functions p(x) which are displayed graphically as density curves. Most density curves are smooth curves without sharp edges.

Probabilities of Events for Continuous Distributions

(Figure: probabilities of events shown as areas under the density curve.)

More about Continuous Random Variables

For any continuous distribution:

- The total area under the density curve is 1.
- The probability density is a non-negative function: p(x) ≥ 0.
- The probability of any event is the area under the density curve and above the values of X that make up the event.
- The probability that X, having a continuous distribution, takes any particular value x is zero:

P(X = x) = 0.

Explanation: X takes infinitely many values, so the probability that X takes any particular value is 0. A continuous random variable assigns probabilities to intervals of outcomes rather than to individual outcomes.

Important Continuous Distributions

There are many important continuous distributions, including: the Normal distribution, the Chi-square distribution (often written χ² distribution), the t-distribution, the F-distribution, the Exponential distribution, the Uniform distribution, and the Weibull distribution.

In this unit we will discuss just the normal distribution. This course will also discuss the Chi-square distribution and the t-distribution.

The Normal Distribution

We have used the standard normal curve for computations of chances or percents of observations many times in the past.

Figure: The standard normal curve.

Parameters of the Normal Distribution

Many random variables, such as height, weight, reaction time to a medication, and scores on the IQ test, have distributions of the bell-shaped type which can be reasonably approximated by a normal curve. The normal distribution should be viewed as a convenient model for many random variables.

There is a whole family of normal distributions, not just the standard normal distribution with µ = 0 and σ = 1 that appears in the normal table. Different members of the normal family have different values of the parameters, µ and σ. All normal distributions have the same overall bell shape. The parameters µ and σ transform the bell as follows:

- The value of µ determines the centering of the distribution. Changing µ merely translates the curve to the right or the left.
- The value of σ determines the spread of the bell. Larger values of σ correspond to greater spread of the curve.

The Normal Density

We express the fact that the random variable X has the normal distribution with parameters µ and σ in the following way:

X ∼ N(µ, σ²).

The parameters of the normal distribution play the following role: µ is the mean of the distribution (of X ), σ is the standard deviation of the distribution (of X ).

Special case: The normal table gives the probabilities P(−∞ < Z < z), where Z ∼ N(0, 1).

The Role of σ

(Figure: normal curves with different values of σ.)

Several Normal Distributions

Several normal distributions with different means and standard deviations are shown on the plot below.

Normal Distribution: Range of Possible Values

Even though the normal density curve is defined on the whole number line, the probability that X will fall outside of the interval µ ± 3σ is very small (0.27%).

The Standard Normal Distribution

Let X ∼ N(µ, σ²).

The new random variable Z = (X − µ)/σ has the standard normal distribution: its mean is 0 and its standard deviation is 1.

In practice, this means that all probability computations for normal distributions may be performed using just the standard normal distribution.

P(x1 < X < x2) = P(z1 < Z < z2)

where z1 and z2 are computed by

z1 = (x1 − µ)/σ  and  z2 = (x2 − µ)/σ.
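A sketch of this standardization in code, using the error function in place of the normal table (phi and normal_prob are hypothetical helpers):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, P(Z < z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_prob(x1, x2, mu, sigma):
    z1, z2 = (x1 - mu) / sigma, (x2 - mu) / sigma
    return phi(z2) - phi(z1)

# About 68% of the probability lies within one SD of the mean
print(round(normal_prob(-1, 1, 0, 1), 4))  # 0.6827
```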

Universal Use of the Standard Normal Distribution

The relationship between the areas involved is shown below:

Thus, we only need the standard normal table to compute probabilities of events for ALL normal distributions.

Percentages of Observations and Probabilities

In Chapter 5 we used the normal curve to approximate the data’s histograms. We computed the area under the normal curve to approximate percentage of observations which fall into the corresponding interval on the data’s histogram.

Now we are using the normal curve to compute the probability that a normally distributed random variable X will take values from a particular interval.

Caution

When we are in the Binomial Setting we can use our rule to know if the distribution approximates a normal curve. This holds specifically because we are summing the draws from our box. It is not true in general that all operations associated with drawing from a box will eventually normalize.

Product Histograms


If we look at the probability histograms for the product of several single die rolls, we do not tend towards a nice bell shape:
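A quick simulation illustrates the claim (a sketch; the seed, the 5 rolls, and the sample size are arbitrary choices):

```python
import random
from math import prod

random.seed(7)  # arbitrary seed
# Product of 5 die rolls, repeated many times
products = [prod(random.randint(1, 6) for _ in range(5))
            for _ in range(100_000)]

mean = sum(products) / len(products)
median = sorted(products)[len(products) // 2]
print(round(mean), median)  # mean far above the median
```

In a typical run the mean of the products sits far above the median, the signature of a right-skewed, non-bell distribution.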

The Central Limit Theorem (CLT)

The Central Limit Theorem: When drawing at random with replacement from a box, the probability histogram for the sum will approximately follow the normal curve, even if the contents of the box do not. The larger the number of draws, the better the normal approximation.



The sample size n should be at least 30 (n ≥ 30) before the normal approximation can be used.

For symmetric population distributions, the distribution of x̄ is usually normal-like even at n = 10 or more. For very skewed population distributions, larger values of n may be needed to overcome the skewness.

The Central Limit Theorem (CLT) at Work

Distribution of a Sum

The following result is a consequence of the CLT. Suppose that:

- We repeat the same experiment n times,
- The outcomes of repeated experiments are independent,
- Every outcome (described with the random variable X) has mean µ and standard deviation σ.

When n is large enough (n ≥ 30), the distribution of the SUM OF OUTCOMES x1 + x2 + ... + xn is approximately

N(nµ, nσ²),

which means a normal distribution with mean nµ and SD √n · σ. If the distribution of X is normal, the above result holds exactly.
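The rule can be checked by simulation on a lopsided 0-1 box (a sketch; the box contents, seed, and repetition count are arbitrary choices):

```python
import random
from math import sqrt

random.seed(42)  # arbitrary seed
box = [0, 0, 0, 1]  # mu = 0.25, sigma = sqrt(0.25 * 0.75), about 0.433
n, reps = 100, 20_000

sums = [sum(random.choice(box) for _ in range(n)) for _ in range(reps)]
m = sum(sums) / reps
sd = sqrt(sum((s - m) ** 2 for s in sums) / reps)
print(round(m, 2), round(sd, 2))  # near n*mu = 25 and sqrt(n)*sigma, about 4.33
```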

Example (Airline Passengers)

In response to the increasing weight of airline passengers, the Federal Aviation Administration in 2003 told airlines to assume that passengers average 190 pounds in summer, including clothing and carry-on baggage. But passengers vary! A reasonable standard deviation is 35 pounds. Assume that the weights of airline passengers are normally distributed. Question: A commuter plane carries 19 passengers. What is the probability that the total weight of the passengers exceeds 4000 pounds? Solution: We have n = 19 passengers. The mean weight of a passenger is µ = 190, and the standard deviation is σ = 35. Let x1, x2,..., x19 denote the passengers’ weights.


We want to find the probability that the sum of weights,

x1 + x2 + ... + x19, exceeds 4000 pounds. Since the distribution of X (weight of an airline passenger) is normal, the distribution of the sum is normal with

mean = 19 · 190 = 3610  and  SD = √19 · 35 ≈ 152.56.

Computing the z-score for 4000:

z = (4000 − 3610) / 152.56 ≈ 2.56.

We have:

P(x1 + x2 + ... + x19 > 4000) = P(Z > 2.56) = 100% − 99.48% = 0.52%.
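The whole computation fits in a few lines (a sketch; the tiny difference from 0.52% comes from rounding z to two decimals above):

```python
from math import erf, sqrt

n, mu, sigma, limit = 19, 190, 35, 4000

total_mean = n * mu          # 3610
total_sd = sqrt(n) * sigma   # about 152.56

z = (limit - total_mean) / total_sd      # about 2.56
prob = 1 - 0.5 * (1 + erf(z / sqrt(2)))  # P(Z > z)
print(f"z = {z:.2f}, P = {prob:.4f}")    # about 0.0053, i.e. roughly 0.5%
```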
