Categorical and Quantitative Variables

<p>Math 120 Section 1.1 Categorical and Quantitative Variables</p>
<p>Individuals are the objects that are described by data. Individuals can be people, animals, plants, cities, units of time, cars, etc. A variable is any characteristic of an individual. Typically, variables take on different values for different individuals in the population.</p>
<p>Example 1 In this example the individuals under consideration are the students in our class today. The variable under consideration is the height of the students. On the piece of paper provided, write your height in cm (or in feet and inches, to be converted to cm) and pass the paper in. Record the heights of all the class members in the first table below. In the second table, list the heights again, sorted in ascending order.</p>
<table>
<tr><th>Unsorted Heights</th><th>Sorted Heights</th></tr>
</table>
<p>We can create a graph of the data by dividing the range of the data into intervals of equal width and counting the number of observations that fall into each interval.</p>
<table>
<tr><th>Interval (cm)</th><th>Count</th></tr>
<tr><td>Up to 150</td><td></td></tr>
<tr><td>151 – 155</td><td></td></tr>
<tr><td>156 – 160</td><td></td></tr>
<tr><td>161 – 165</td><td></td></tr>
<tr><td>166 – 170</td><td></td></tr>
<tr><td>171 – 175</td><td></td></tr>
<tr><td>176 – 180</td><td></td></tr>
<tr><td>181 – 185</td><td></td></tr>
<tr><td>186 and above</td><td></td></tr>
</table>
<p>Draw a histogram to display the data and comment on the distribution, noting:</p>
<p>• Shape (symmetric or skewed)</p>
<p>• Center</p>
<p>• Spread</p>
<p>Math 120 Lecture Notes 1</p>
<p>Quantitative vs Categorical Variables</p>
<p>Variables can be either quantitative (such as height, weight, etc.) or categorical (such as gender, eye colour, political affiliation, etc.). 
Quantitative variables can always be described with a number; categorical variables instead place each individual into a group or category.</p>
<p>Example 2 For each of the following data sets, identify the individuals, the variable, and whether the variable is quantitative or categorical.</p>
<p>a) The speeds of 10 cars were measured by a radar device on a city street: 55 48 65 35.5 56 52 59 59.5 48 70</p>
<p>The individuals described by this data are ______</p>
<p>The characteristic measured by this variable is ______</p>
<p>The variable is a) quantitative b) categorical</p>
<p>b) The number of students registered in a selection of UCC math classes:</p>
<p>Math 120-03: 13; Math 110-01: 33; Math 316-01: 12; Math 110-02: 36; Math 322-01: 26; Math 120-01: 38; Math 211-01: 27; Math 120-02: 25</p>
<p>The individuals described by this data are ______</p>
<p>The characteristic measured by this variable is ______</p>
<p>The variable is a) quantitative b) categorical</p>
<p>c) The total annual rainfall (in inches) that fell in Gnomesville in given years: 1990: 33 inches; 1991: 45 inches; 1992: 36 inches; 1993: 50 inches; 1994: 50 inches; 1995: 42 inches</p>
<p>The individuals described by this data are ______</p>
<p>The characteristic measured by this variable is ______</p>
<p>The variable is a) quantitative b) categorical</p>
<p>d) The program that students in a Math 120 class are enrolled in: John J.: CSOM; Jane D.: Tourism; . . . Wendy P.: Accounting; Surinder P.: Arts; . . .</p>
<p>The individuals described by this data are ______</p>
<p>The characteristic measured by this variable is ______</p>
<p>The variable is a) quantitative b) categorical</p>
<p>e) The colour of all the cats in the SPCA was recorded.</p>
<p>The individuals described by this data are ______</p>
<p>The characteristic measured by this variable is ______</p>
<p>The variable is a) quantitative b) categorical</p>
<p>Displaying Data with Graphs</p>
<p>In Example 1 we saw that a histogram can be used to display quantitative data. 
(A bar graph or pie chart is more useful for displaying categorical data.) Another way to display quantitative data is with a stem-and-leaf plot, as in the following exercise:</p>
<p>Example 3: Complete the stem-and-leaf plot for the heights of students collected in the first lecture (a split stem is used). The data is given below.</p>
<p>152 155 157 160 160 163 163 163 164 164 165 165 168 168 169 169 153 170 170 170 170 172 173 173 175 175 178 180 182 183 193 193</p>
<p>15 |</p>
<p>15 |</p>
<p>16 |</p>
<p>16 |</p>
<p>17 |</p>
<p>17 |</p>
<p>18 |</p>
<p>18 |</p>
<p>19 |</p>
<p>19 |</p>
<p>Exercise 4: Find out what portion of the students in class today are male and what portion are female.</p>
<p>a) Make a bar graph to display the data.</p>
<p>b) Make a pie chart to display the data.</p>
<p>Math 120 Section 1.2 Displaying Distributions with Numbers</p>
<p>Measuring the Centre</p>
<p>We can measure the center of quantitative data in two ways: with the mean and with the median.</p>
<p>The Mean, \(\bar{x}\)</p>
<p>To find the mean of a set of observations, add their values and divide by the number of observations. If the n observations are x₁, x₂, …, xₙ, their mean is:</p>
<p>\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]</p>
<p>In more compact notation this may be written:</p>
<p>\[ \bar{x} = \frac{1}{n} \sum x_i \]</p>
<p>Example 1 Calculate the mean of the numbers given. Plot the numbers and their mean on the number line, and note that the mean is the “balance point”.</p>
<p>a) 1 4 5 6 13</p>
<p>[Number line]</p>
<p>b) 1 4 5 5 9 9 10 13</p>
<p>[Number line]</p>
<p>c) 1 1 2 2 2 3 3 10</p>
<p>[Number line]</p>
<p>Example 1 c) shows that the mean is sensitive to outliers, that is, observations that fall far outside of the usual pattern.</p>
<p>The median M is another way of measuring the center of quantitative data. The median is the midpoint of a distribution, the number such that half the observations are smaller and the other half are larger. To find the median:</p>
<p>1. Arrange all the numbers in order of size, from smallest to largest.</p>
<p>2. 
If the number of observations is odd, the median is the center observation.</p>
<p>3. If the number of observations is even, the median is the mean of the two center observations.</p>
<p>Example 2 Determine the median of the numbers given. Plot the numbers and their median on the number line, and note that the median, unlike the mean, is not always the “balance point”.</p>
<p>a) 1 4 5 6 13</p>
<p>[Number line]</p>
<p>b) 1 4 5 5 9 9 10 13</p>
<p>[Number line]</p>
<p>c) 1 1 2 2 2 3 3 10</p>
<p>[Number line]</p>
<p>d) 1 1 2 2 2 3 3 4</p>
<p>[Number line]</p>
<p>Measuring the Spread</p>
<p>How spread out a data set is can be measured in several different ways:</p>
<p>• Range: the simplest measure of variability. The range is the difference between the highest and lowest values.</p>
<p>• Quartiles: the 1st quartile, Q1, lies 1/4 of the way up the sorted list of data. It is the median of the observations that fall below the median. The 3rd quartile, Q3, lies 3/4 of the way up the sorted list of data. It is the median of the observations that fall above the median.</p>
<p>• The five-number summary consists of the minimum, Q1, the median, Q3, and the maximum. These can be shown graphically in a box plot.</p>
<p>• The standard deviation: this is used to measure the spread of the data when the mean is used to measure the centre. It is a measure of how far the observations deviate from their mean.</p>
<p>Example 3</p>
<p>The following data represent the lengths of long-distance phone calls (in minutes) made by a business office in 1 day:</p>
<p>3 7 2 14 4 29 3 9 1 20 10 7 2 42 3 5</p>
<p>a) Find the five-number summary (first sort the data).</p>
<p>Min______ Q1______ Median______ Q3______ Max______</p>
<p>b) Draw a box plot to represent the data.</p>
<p>Standard deviation: The standard deviation of a set of quantitative data is a measure of how spread out the data is around the mean.</p>
<p>The formula for the standard deviation of n observations, x₁, x₂, …, 
xₙ, is:</p>
<p>\[ s = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}} \]</p>
<p>This formula is tedious to evaluate by hand. Fortunately, your calculator can calculate the standard deviation automatically.</p>
<p>Example 4: Complete the table and calculate the standard deviation of the following sets of data. Compare the results with those obtained directly from a calculator.</p>
<p>a) 0 5 7</p>
<table>
<tr><th>Observation \(x\)</th><th>Deviation \((x - \bar{x})\)</th><th>Squared deviation \((x - \bar{x})^2\)</th></tr>
<tr><td>Mean = </td><td>Sum (always) = </td><td>Sum = </td></tr>
</table>
<p>b) 2 3 0 2 1 0 3 0 1 4</p>
<table>
<tr><th>Observation \(x\)</th><th>Deviation \((x - \bar{x})\)</th><th>Squared deviation \((x - \bar{x})^2\)</th></tr>
<tr><td>Mean = </td><td>Sum (always) = </td><td>Sum = </td></tr>
</table>
<p>Example 5</p>
<p>The following data represent the minimum daily temperature (°C) in two different cities on 4 consecutive days. For each city:</p>
<p>• Calculate the mean and standard deviation (use your calculator).</p>
<p>• Plot the data as dots (City A) and x's (City B) on the number line.</p>
<p>• Explain why the standard deviation is larger for City B than for City A.</p>
<p>City A: 4 6 8 10</p>
<p>City B: 0 6 8 14</p>
<p>City A: Mean______ Standard Deviation______</p>
<p>[Number line]</p>
<p>City B: Mean______ Standard Deviation______</p>
<p>[Number line]</p>
<p>Math 120 Section 1.3 The Normal Distribution</p>
<p>Suppose 32 specimens of a species of insect are captured and measured. Their lengths in mm are given below.</p>
<p>0.5 1.20 2.21 2.75 3.19 3.45 4.12 4.39 4.78 5.01 5.35 5.98 6.21 6.35 6.50 6.78 7.13 7.45 7.59 7.63 7.86 8.03 8.50 8.83 9.50 9.95 10.39 10.76 11.59 11.76 12.59 13.25</p>
<p>Make a histogram to show the distribution of the data, using bins of width 1 mm (0 – 1 mm, 1 – 2 mm, etc.).</p>
<p>1. Shade the portion of the histogram that represents the insects that are between 8 mm and 11 mm in length.</p>
<p>2. What portion of the specimens are between 8 mm and 11 mm in length?</p>
<p>3. What is the total area of the histogram?</p>
<p>4. What is the area of the histogram that lies between x = 8 and x = 11?</p>
<p>5. What portion of the area of the histogram lies between x = 8 and x = 11?</p>
<p>6. 
If one of the 32 specimens is chosen at random, what is the probability that it is between 8 mm and 11 mm in length?</p>
<p>Suppose we adjusted the vertical scale on the histogram in the previous example so that the total area of the histogram is 1. Then the area of the histogram between 8 and 11 would be equal to the portion of observations that fall between 8 mm and 11 mm. When a histogram has a total area of 1, and a smooth curve is drawn through the histogram, a density curve results.</p>
<p>A density curve is a curve that:</p>
<p>• Is always on or above the horizontal axis</p>
<p>• Has a total area of exactly 1 underneath it</p>
<p>One very important class of density curves is the normal curves. Many real-world data sets are approximately normally distributed.</p>
<p>A standard normal distribution is a normal distribution with mean = 0 and standard deviation = 1. Tables exist that tell us the area under a standard normal curve to the left of particular z-values. Any normal distribution can be standardized to convert it to a standard normal distribution.</p>
<p>Exercise: In each case, use the table to find the area under a standard normal distribution for the given z-values. Shade in the appropriate portion of the graph.</p>
<p>a) z < -1.22</p>
<p>b) z > 1.22</p>
<p>c) 0 < z < 1.2</p>
<p>Recall that the standard normal distribution has a mean (μ) of 0 and a standard deviation (σ) of 1. We abbreviate any normal distribution with mean μ and standard deviation σ as N(μ, σ). So the standard normal distribution is abbreviated as N(0, 1).</p>
<p>Exercise 1: Suppose z is a variable that follows the standard normal distribution. In each case, use tables to find the portion of the z's that fall within the indicated interval. 
Shade in the appropriate portion of the graph:</p>
<p>a) within one standard deviation of the mean.</p>
<p>b) within two standard deviations of the mean.</p>
<p>c) within three standard deviations of the mean.</p>
<p>The above results are often referred to as the 68-95-99.7 rule:</p>
<p>_____% of observations fall within one standard deviation of the mean</p>
<p>_____% of observations fall within two standard deviations of the mean</p>
<p>_____% of observations fall within three standard deviations of the mean</p>
<p>Any normal distribution can be standardized, that is, changed into a standard normal distribution by the following transformation:</p>
<p>If x is an observation from a normal distribution that has mean μ and standard deviation σ, the standardized value of x is:</p>
<p>\[ z = \frac{x - \mu}{\sigma} \]</p>
<p>Exercise 2: Let the variable x represent the actual amount of coffee in a serving dispensed by a machine that is supposed to dispense 16 ounces per serving. Suppose that the distribution of x is normal with a mean of 16 and a standard deviation of 3; that is, x is N(16, 3). In each case, state what portion of the servings dispensed by the machine have an x value in the indicated interval. Shade the appropriate area under the N(16, 3) normal curve and the corresponding area under the N(0, 1) normal curve.</p>
<p>a) x < 12 ounces</p>
<p>b) x < 18 ounces</p>
<p>c) What portion of the servings are between 14 ounces and 20 ounces?</p>
<p>d) What is the probability that the machine will give you a serving of more than 20 ounces?</p>
<p>e) What portion of the servings fall within 1 standard deviation of the mean?</p>
<p>Exercise 3: Of the servings dispensed by the coffee machine described in Exercise 2, how small are the smallest 15% of servings? Shade the appropriate areas on the density curves.</p>
<p>Exercise 4: How large must a serving be to be amongst the largest 25% of servings? 
Shade the appropriate areas on the density curves.</p>
<p>Exercise 5: Assume that legal Canadian quarters (25 cent coins) have weights (x values) that are normally distributed with a mean of μ = 5.67 grams and a standard deviation of σ = 0.07 grams.</p>
<p>a) What portion of Canadian quarters weigh less than 5.6 grams? Shade in the appropriate areas on the graphs.</p>
<p>b) In hopes of detecting counterfeit quarters, a vending machine has been adjusted so that it will only accept quarters that weigh between 5.50 and 5.80 grams. What percentage of legal quarters are accepted by the machine? Shade in the appropriate areas on the graphs.</p>
<p>c) How much must a quarter weigh to be amongst the lightest 8% of quarters? Shade in the appropriate areas.</p>
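<p>The hand and calculator work in Sections 1.2 and 1.3 can also be checked with software. The following is a minimal Python sketch, not part of the original notes, that assumes Python 3.8 or later and uses only the standard library. The helper names sample_sd, standardize and proportion_within are hypothetical, chosen for illustration; the data are taken from Example 1 c) and Example 5 of Section 1.2 and Exercise 2 a) of Section 1.3.</p>

```python
from statistics import NormalDist, mean, median, stdev


def sample_sd(values):
    """Sample standard deviation s, using the n - 1 divisor from Section 1.2."""
    x_bar = sum(values) / len(values)
    squared_deviations = [(x - x_bar) ** 2 for x in values]
    return (sum(squared_deviations) / (len(values) - 1)) ** 0.5


def standardize(x, mu, sigma):
    """Convert an observation x from N(mu, sigma) into a z-value."""
    return (x - mu) / sigma


def proportion_within(k):
    """Area under the standard normal curve between -k and k."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Section 1.2, Example 1 c): the outlier 10 pulls the mean above the median.
outlier_data = [1, 1, 2, 2, 2, 3, 3, 10]
print("mean:", mean(outlier_data), "median:", median(outlier_data))

# Section 1.2, Example 5: two cities with the same mean but different spread.
city_a = [4, 6, 8, 10]
city_b = [0, 6, 8, 14]
for name, temps in (("City A", city_a), ("City B", city_b)):
    # stdev() is the library version of the same n - 1 formula.
    print(name, "mean:", mean(temps),
          "s by formula:", round(sample_sd(temps), 3),
          "s by library:", round(stdev(temps), 3))

# Section 1.3, Exercise 1: the three areas behind the 68-95-99.7 rule.
for k in (1, 2, 3):
    print("within", k, "standard deviation(s):", round(proportion_within(k), 4))

# Section 1.3, Exercise 2 a): standardize x = 12 for N(16, 3), then take
# the area to its left under the standard normal curve.
z = standardize(12, mu=16, sigma=3)
print("z:", round(z, 2), "area to the left:", round(NormalDist().cdf(z), 4))
```

<p>Here NormalDist().cdf(z) plays the role of the printed table: it returns the area under the standard normal curve to the left of z. Small differences from table answers are expected, since tables round z to two decimal places.</p>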
