Central Tendency, Dispersion, Correlation and Regression Analysis Case Study for MBA Program
Business Statistics
Central Tendency, Dispersion, Correlation and Regression Analysis
Case study for MBA program
Prepared by: CHUOP Theot Therith
12/29/2010

SOLUTION

1. CENTRAL TENDENCY

- The different measures of central tendency are:
  (1) Arithmetic Mean (AM)
  (2) Median
  (3) Mode
  (4) Geometric Mean (GM)
  (5) Harmonic Mean (HM)

- The choice among the different measures of central tendency depends upon three considerations:
  1. The concept of a typical value required by the problem.
  2. The type of data available.
  3. The special characteristics of the averages under consideration.

• If an average based on all values is required, the arithmetic mean, geometric mean, or harmonic mean is preferable to the median or mode.
• If the middle value is wanted, the median is the only choice.
• To determine the most common value, the mode is the appropriate measure.
• If the data contain extreme values, the arithmetic mean should be avoided.
• For averaging ratios and percentages, the geometric mean should be preferred; for averaging rates, the harmonic mean.
• Frequency distributions with open-end classes prohibit the use of the arithmetic mean, geometric mean, or harmonic mean.
• If the distribution is bell-shaped and symmetrical, or flat-topped with few extreme values, the arithmetic mean is the best choice because it is less affected by sampling fluctuations.
• If the distribution is sharply peaked (the data cluster markedly at the middle), or if there are abnormally small or large values, the median has smaller sampling fluctuations than the arithmetic mean.
• The arithmetic mean should ordinarily be used because it is simple, rigidly defined, based on all observations, and amenable to further statistical treatment, unless the nature of the data strongly prohibits its use.
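The guidelines above can be illustrated with a minimal Python sketch. All three data sets (incomes, growth factors, speeds) are hypothetical and chosen only to show which average suits which kind of data; the functions come from Python's standard `statistics` module (Python 3.8 or later for `geometric_mean`).

```python
from statistics import mean, median, mode, geometric_mean, harmonic_mean

# Hypothetical monthly incomes with one extreme value (5000).
incomes = [300, 320, 350, 350, 400, 5000]
am_income = mean(incomes)      # arithmetic mean, pulled upward by the extreme value
med_income = median(incomes)   # middle value, barely affected by the extreme value
mode_income = mode(incomes)    # most common value

# Hypothetical yearly growth factors (ratios): average them with the geometric mean.
growth = [1.10, 1.20, 0.95]
avg_growth = geometric_mean(growth)

# Hypothetical speeds (km/h) over two legs of equal distance:
# average them with the harmonic mean.
speeds = [40, 60]
avg_speed = harmonic_mean(speeds)

print("AM:", am_income, "Median:", med_income, "Mode:", mode_income)
print("GM of growth factors:", round(avg_growth, 4))
print("HM of speeds:", avg_speed)
```

With these made-up incomes the arithmetic mean is 1120 while the median and mode are both 350, which is why the median is preferred when extreme values are present.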
Conclusion on the choice of an average:
- Arithmetic Mean (AM): used in general.
- Median: used when the data contain extreme values.
- Mode: used to find the most common use, need, demand, etc.
- Geometric Mean (GM): used to find the average of ratios or percentages.
- Harmonic Mean (HM): used to find the average of rates, such as speed.

2. EXPLAIN THE DIFFERENCE BETWEEN ABSOLUTE AND RELATIVE MEASURES OF DISPERSION

Absolute and Relative Measures of Variation:
• Absolute measures of dispersion are expressed in the same statistical unit as the original data, such as riels, kilograms, or tonnes. These values may be used to compare the variation in two distributions provided the variables are expressed in the same units and are of the same average size.
• If the two sets of data are expressed in different units, such as quintals of sugar versus tonnes of sugarcane, or if the average sizes are very different, such as managers’ salaries versus workers’ salaries, the relative measures of dispersion should be used.

3. COMPUTE THE SAMPLE ARITHMETIC MEAN:

Time (seconds)   Lower limit   Upper limit   Number of customers (f)   Mid-point (m)      fm
20-29                     20            29                        60            24.5    1470
30-39                     30            39                       160            34.5    5520
40-49                     40            49                       210            44.5    9345
50-59                     50            59                       290            54.5   15805
60-69                     60            69                       250            64.5   16125
70-79                     70            79                       220            74.5   16390
80-89                     80            89                       110            84.5    9295
90-99                     90            99                        70            94.5    6615
100-109                  100           109                        40           104.5    4180
110-119                  110           119                        10           114.5    1145
120-129                  120           129                        20           124.5    2490
Total                                                   N = ∑f = 1440              ∑fm = 88380

By formula:
$$\bar{X} = \frac{\sum fm}{N} = \frac{88380}{1440} = 61.375$$

Interpretation: on average, cashiers need 61.375 seconds (about 61 seconds) to serve each customer.

4. THE MEDIAN AND MODAL INCOMES

Income (in $)       Number of households   c.f.
Less than 2000                       151    151
2000 up to 3000                      183    334
3000 up to 4000                      212    546
4000 up to 5000                      184    730
5000 up to 6000                      157    887
6000 and greater                     113   1000
Total                           N = 1000

a.
Find the median incomes

- Position of the median = size of the (N/2)th item = 1000/2 = 500
- The median lies in the class 3000 up to 4000, the first class whose cumulative frequency (546) reaches 500.

By formula:
$$\text{Median} = L + \frac{\frac{N}{2} - \text{c.f.}}{f} \times i$$

Where
L = 3000, the lower limit of the median class
N = 1000, the total number of households (total frequency)
f = 212, the number of households (frequency) in the median class
c.f. = 334, the cumulative frequency of the class preceding the median class
i = 1000, the class interval of the median class (4000 - 3000)

Hence,
$$\text{Median} = 3000 + \frac{500 - 334}{212} \times 1000 = 3783.0189 \approx \$3783$$

Therefore, the median household income is 3783 dollars.

b. Find the modal incomes

The highest frequency (number of households) is 212, so the modal class is 3000 up to 4000.

By formula:
$$Mo = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$

Where
L = 3000, the lower limit of the modal class
Δ1 = 212 - 183 = 29, the modal class frequency minus the frequency of the preceding class
Δ2 = 212 - 184 = 28, the modal class frequency minus the frequency of the following class
i = 1000, the class interval of the modal class

Hence,
$$Mo = 3000 + \frac{29}{29 + 28} \times 1000 = \$3508.77$$

Therefore, the modal household income is 3508.77 dollars.

5. THE FOLLOWING DATA ARE THE ESTIMATED MARKET VALUES (IN $ MILLIONS) OF 50 COMPANIES IN THE AUTO PARTS BUSINESS.
Nº     x_i    x_i - x̄    (x_i - x̄)²
 1    26.8      9.642     92.968164
 2    28.3     11.142    124.144164
 3    11.7     -5.458     29.789764
 4     6.7    -10.458    109.369764
 5     6.1    -11.058    122.279364
 6     8.6     -8.558     73.239364
 7    15.5     -1.658      2.748964
 8    18.5      1.342      1.800964
 9    31.4     14.242    202.834564
10     0.9    -16.258    264.322564
11     6.5    -10.658    113.592964
12    31.4     14.242    202.834564
13     6.8    -10.358    107.288164
14    30.4     13.242    175.350564
15     9.6     -7.558     57.123364
16    30.6     13.442    180.687364
17    23.4      6.242     38.962564
18    22.3      5.142     26.440164
19    20.6      3.442     11.847364
20    35.0     17.842    318.336964
21    15.4     -1.758      3.090564
22     4.3    -12.858    165.328164
23    12.9     -4.258     18.130564
24     5.2    -11.958    142.993764
25    17.1     -0.058      0.003364
26    18.0      0.842      0.708964
27    20.2      3.042      9.253764
28    29.8     12.642    159.820164
29    37.8     20.642    426.092164
30     1.9    -15.258    232.806564
31     7.6     -9.558     91.355364
32    33.5     16.342    267.060964
33     1.3    -15.858    251.476164
34    13.4     -3.758     14.122564
35     1.2    -15.958    254.657764
36    21.5      4.342     18.852964
37     7.9     -9.258     85.710564
38    14.1     -3.058      9.351364
39    18.3      1.142      1.304164
40    16.6     -0.558      0.311364
41    11.0     -6.158     37.920964
42    11.2     -5.958     35.497764
43    29.7     12.542    157.301764
44    27.1      9.942     98.843364
45    31.1     13.942    194.379364
46    10.2     -6.958     48.413764
47     1.0    -16.158    261.080964
48    18.7      1.542      2.377764
49    32.7     15.542    241.553764
50    16.1     -1.058      1.119364

∑(x_i - x̄)² = 5486.8818

a. Determine the standard deviation of the market values.

By formula:
$$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$$

Where
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_{50}}{N} = \frac{857.9}{50} = 17.158$$

According to the table above, the standard deviation is
$$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}} = \sqrt{\frac{5486.8818}{50}} = 10.475573 \text{ (million dollars)}$$

Therefore, the standard deviation of the market values is about 10.48 million dollars.

b. Determine the coefficient of variation.

$$C.V. = \frac{\sigma}{\bar{x}} \times 100 = \frac{10.475573}{17.158} \times 100 = 61.05536\%$$

Therefore, the coefficient of variation is C.V. = 61.05536%.

6.
DETERMINE KARL PEARSON’S COEFFICIENT OF CORRELATION

Year    R&D spent (x)   Annual profit (y)   x - x̄    y - ȳ   (x - x̄)²   (y - ȳ)²   (x - x̄)(y - ȳ)
2000                2                  20    -4.1    -12.8      16.81     163.84            52.48
2001                3                  25    -3.1     -7.8       9.61      60.84            24.18
2002                5                  34    -1.1      1.2       1.21       1.44            -1.32
2003                4                  30    -2.1     -2.8       4.41       7.84             5.88
2004               11                  40     4.9      7.2      24.01      51.84            35.28
2005                5                  31    -1.1     -1.8       1.21       3.24             1.98
2006                6                  35    -0.1      2.2       0.01       4.84            -0.22
2007                8                  36     1.9      3.2       3.61      10.24             6.08
2008                7                  38     0.9      5.2       0.81      27.04             4.68
2009               10                  39     3.9      6.2      15.21      38.44            24.18
Total         ∑x = 61            ∑y = 328                       76.90     369.60           153.20

By formula:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$

Where
$$\bar{x} = \frac{\sum x}{N} = \frac{61}{10} = 6.1, \qquad \bar{y} = \frac{\sum y}{N} = \frac{328}{10} = 32.8$$
$$\sum (x - \bar{x})^2 = 76.90, \qquad \sum (y - \bar{y})^2 = 369.60, \qquad \sum (x - \bar{x})(y - \bar{y}) = 153.20$$

Hence,
$$r = \frac{153.20}{\sqrt{76.90 \times 369.60}} = 0.9087$$

Therefore, the coefficient of correlation is r = 0.9087.

- Explain the relationship between the amount spent on R&D and profit of the company.

The correlation coefficient r = 0.9087 indicates a high degree of positive correlation between the amount spent on R&D and the profit of the company: years with higher R&D spending tend to be years with higher profit.
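The calculation of r can be cross-checked with a short Python sketch. It uses the ten yearly pairs from the table and the deviation-from-mean formula; the function name `pearson_r` is an illustrative choice, and x is taken to be R&D spending and y annual profit (the coefficient is symmetric, so swapping the two labels does not change r).

```python
from math import sqrt

# Yearly values for 2000-2009, copied from the table.
# Assumption: x is R&D spending and y is annual profit.
x = [2, 3, 5, 4, 11, 5, 6, 8, 7, 10]
y = [20, 25, 34, 30, 40, 31, 35, 36, 38, 39]


def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squared deviations and of cross-products of deviations.
    sxx = sum((a - x_bar) ** 2 for a in xs)
    syy = sum((b - y_bar) ** 2 for b in ys)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


r = pearson_r(x, y)
print(round(r, 4))
```

Rounded to four decimal places, the result agrees with the hand calculation of 0.9087.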