
Business Central Tendency, Dispersion, Correlation and Case study for MBA program

CHUOP Theot Therith 12/29/2010

Business Statistic

SOLUTION

1. CENTRAL TENDENCY

- The different measures of Central Tendency are:

(1). Arithmetic Mean (AM)

(2). Median

(3). Mode

(4). Geometric Mean (GM)

(5). Harmonic Mean (HM)

- The uses of the different measures of Central Tendency are as follows:

The choice of an average depends upon three considerations:

1. The concept of a typical value required by the problem.

2. The type of data available.

3. The special characteristics of the average under consideration.

• If it is required to get an average based on all values, the arithmetic mean, geometric mean

or harmonic mean should be preferred over the median or mode.

• In case middle value is wanted, the median is the only choice.

• To determine the most common value, mode is the appropriate one.

• If the data contain extreme values, the use of arithmetic mean should be avoided.

• In case of averaging ratios and percentages, the geometric mean and in case of averaging the

rates, the harmonic mean should be preferred.

• Frequency distributions with open-end classes prohibit the use of the arithmetic

mean, geometric mean or harmonic mean.

Prepared by: CHUOP Theot Therith 1 Business Statistic

• If the distribution is bell-shaped and symmetrical, or flat-topped with few extreme

values, the arithmetic mean is the best choice, because it is less affected by sampling

fluctuations.

• If the distribution is sharply peaked, i.e., the data cluster markedly at the middle or if there

are abnormally small or large values, the median has smaller sampling fluctuations than the

arithmetic mean.

• The arithmetic mean should ordinarily be used because it is simple, rigidly defined, based

on all observations and amenable to further statistical treatment, unless the nature of the data

strongly prohibits its use.

Conclusion on the choice of an average:

- Arithmetic Mean (AM): used in general

- Median: used when the data contain extreme values

- Mode: used to find the most common use/need/demand…

- Geometric Mean (GM): used to find the average of ratios/percentages

- Harmonic Mean (HM): used to find the average of rates, such as average speed.
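As a rough illustration of these rules, the following Python sketch computes the five averages with the standard `statistics` module. The data sets (`sales`, `growth`, `speeds`) are hypothetical and are not taken from the assignment.

```python
import statistics

# Hypothetical data: sales of ten shops, with one extreme value (95).
sales = [12, 15, 15, 16, 18, 20, 21, 22, 24, 95]

mean = statistics.mean(sales)      # based on all values, pulled up by the extreme value
median = statistics.median(sales)  # middle value, not affected by the extreme value
mode = statistics.mode(sales)      # most common value

# Geometric mean for ratios: hypothetical growth factors of +10%, +20% and -5%.
growth = [1.10, 1.20, 0.95]
gm = statistics.geometric_mean(growth)

# Harmonic mean for rates: equal distances driven at 40 km/h and 60 km/h.
speeds = [40, 60]
hm = statistics.harmonic_mean(speeds)

print("AM:", mean, "Median:", median, "Mode:", mode)
print("GM:", round(gm, 4), "HM:", hm)
```

In this sketch the single extreme value lifts the arithmetic mean (25.8) well above the median (19), which is why the median is preferred when the data contain extreme values.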

2. EXPLAIN THE DIFFERENCE BETWEEN ABSOLUTE AND RELATIVE MEASURES OF DISPERSION

Absolute and Relative Measures of Variation:

• Absolute measures of dispersion are expressed in the same units in which the

original data are given, such as riels, kilograms, tonnes, etc. These values may be used to

compare the variation in two distributions, provided the variables are expressed in the same

units and are of the same average size.


• In case the two sets of data are expressed in different units, such as quintals of sugar versus

tonnes of sugarcane, or if the average size is very different, such as managers’ salary versus

workers’ salary, the relative measures of dispersion should be used. Relative measures are unit-free ratios or percentages, such as the coefficient of variation.
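The contrast can be sketched in Python as follows. The salary figures and the helper `coefficient_of_variation` are hypothetical, chosen only to show that the absolute measure (standard deviation) depends on the size of the values while the relative measure (coefficient of variation) does not.

```python
import statistics

# Hypothetical monthly salaries in dollars; the averages differ by a factor of ten.
managers = [1800, 2000, 2200, 2400, 2600]
workers = [180, 200, 220, 240, 260]

def coefficient_of_variation(values):
    """Relative dispersion: population standard deviation as a percentage of the mean."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

# Absolute measure: expressed in dollars, so it scales with the data.
sd_managers = statistics.pstdev(managers)
sd_workers = statistics.pstdev(workers)

# Relative measure: a unit-free percentage that can be compared directly.
cv_managers = coefficient_of_variation(managers)
cv_workers = coefficient_of_variation(workers)

print("SD:", round(sd_managers, 2), "vs", round(sd_workers, 2))
print("CV:", round(cv_managers, 2), "% vs", round(cv_workers, 2), "%")
```

Here the standard deviations differ tenfold, yet both groups have the same coefficient of variation, so their salaries are equally variable in relative terms.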

3. COMPUTE THE SAMPLE ARITHMETIC MEAN:

Time in seconds   Lower limit   Upper limit   Customers (f)   Mid-point (m)        fm
20 - 29                    20            29              60            24.5      1470
30 - 39                    30            39             160            34.5      5520
40 - 49                    40            49             210            44.5      9345
50 - 59                    50            59             290            54.5     15805
60 - 69                    60            69             250            64.5     16125
70 - 79                    70            79             220            74.5     16390
80 - 89                    80            89             110            84.5      9295
90 - 99                    90            99              70            94.5      6615
100 - 109                 100           109              40           104.5      4180
110 - 119                 110           119              10           114.5      1145
120 - 129                 120           129              20           124.5      2490
Total                                           N = ∑f = 1440                 ∑fm = 88380

By formula:

$$\bar{X} = \frac{\sum fm}{N} = \frac{88380}{1440} = 61.375$$

Interpretation of the result: on average, cashiers need 61.375 seconds (about 61 seconds) to

serve each customer.
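The calculation can be checked with a short Python sketch. The variable names are illustrative; the class limits and frequencies are those of the table, and each mid-point is taken as the average of the lower and upper limit, as in the table.

```python
# (lower limit, upper limit, number of customers f) for each class in the table
classes = [
    (20, 29, 60), (30, 39, 160), (40, 49, 210), (50, 59, 290),
    (60, 69, 250), (70, 79, 220), (80, 89, 110), (90, 99, 70),
    (100, 109, 40), (110, 119, 10), (120, 129, 20),
]

total_f = sum(f for _, _, f in classes)                     # N = sum of f
total_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)  # sum of f times mid-point
mean_time = total_fm / total_f

print(total_f, total_fm, mean_time)  # 1440 88380.0 61.375
```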

4. THE MEDIAN AND MODAL INCOMES

Income (in $)        Number of households     c.f.
Less than 2000                        151      151
2000 up to 3000                       183      334
3000 up to 4000                       212      546
4000 up to 5000                       184      730
5000 up to 6000                       157      887
6000 and greater                      113     1000
Total                            N = 1000

a. Find the median incomes

- Median item = size of the (N/2)th item = 1000/2 = 500th item

- The median lies in the class 3000 up to 4000

$$\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times i$$

Where L = 3000, the lower limit of the median class

N = 1000, total number of households (total frequency)

f = 212, households’ number (frequency) of median class

c.f. = 334, cumulative frequency of the class preceding the median class

i = 1000, the class interval of the median class (4000-3000)

Hence,

$$\text{Median} = 3000 + \frac{500 - 334}{212} \times 1000 = 3783.0189 \approx \$3783$$

Therefore, the median household income is about 3783 dollars.

b. Find the modal incomes

The highest frequency (number of households) is 212, so the modal class is 3000-4000.

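By formula:

$$Mo = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$

Where L = 3000, the lower limit of the modal class

Δ1 = 212 – 183 = 29, the frequency of the modal class minus that of the preceding class

Δ2 = 212 – 184 = 28, the frequency of the modal class minus that of the following class

i = 1000, the class interval of the modal class

Hence,

$$Mo = 3000 + \frac{29}{29 + 28} \times 1000 = \$3508.77$$

Therefore, the modal household income is 3508.77 dollars.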
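Both grouped-data formulas can be sketched in Python as below. The function names `grouped_median` and `grouped_mode` and their parameter names are illustrative assumptions; the inputs are the values used in the calculations above.

```python
def grouped_median(L, N, cf, f, i):
    """Median = L + ((N/2 - c.f.) / f) * i for the median class."""
    return L + (N / 2 - cf) / f * i

def grouped_mode(L, f_modal, f_before, f_after, i):
    """Mode = L + (d1 / (d1 + d2)) * i for the modal class."""
    d1 = f_modal - f_before  # modal frequency minus preceding class frequency
    d2 = f_modal - f_after   # modal frequency minus following class frequency
    return L + d1 / (d1 + d2) * i

median_income = grouped_median(L=3000, N=1000, cf=334, f=212, i=1000)
modal_income = grouped_mode(L=3000, f_modal=212, f_before=183, f_after=184, i=1000)

print(round(median_income, 4))  # 3783.0189
print(round(modal_income, 2))   # 3508.77
```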


5. THE FOLLOWING DATA ARE THE ESTIMATED MARKET VALUES (IN $ MILLIONS) OF 50 COMPANIES IN THE AUTO PARTS BUSINESS.

Nº        x      x - x̄      (x - x̄)²
1      26.8      9.642     92.968164
2      28.3     11.142    124.144164
3      11.7     -5.458     29.789764
4       6.7    -10.458    109.369764
5       6.1    -11.058    122.279364
6       8.6     -8.558     73.239364
7      15.5     -1.658      2.748964
8      18.5      1.342      1.800964
9      31.4     14.242    202.834564
10      0.9    -16.258    264.322564
11      6.5    -10.658    113.592964
12     31.4     14.242    202.834564
13      6.8    -10.358    107.288164
14     30.4     13.242    175.350564
15      9.6     -7.558     57.123364
16     30.6     13.442    180.687364
17     23.4      6.242     38.962564
18     22.3      5.142     26.440164
19     20.6      3.442     11.847364
20     35       17.842    318.336964
21     15.4     -1.758      3.090564
22      4.3    -12.858    165.328164
23     12.9     -4.258     18.130564
24      5.2    -11.958    142.993764
25     17.1     -0.058      0.003364
26     18        0.842      0.708964
27     20.2      3.042      9.253764
28     29.8     12.642    159.820164
29     37.8     20.642    426.092164
30      1.9    -15.258    232.806564
31      7.6     -9.558     91.355364
32     33.5     16.342    267.060964
33      1.3    -15.858    251.476164
34     13.4     -3.758     14.122564
35      1.2    -15.958    254.657764
36     21.5      4.342     18.852964
37      7.9     -9.258     85.710564
38     14.1     -3.058      9.351364
39     18.3      1.142      1.304164
40     16.6     -0.558      0.311364
41     11       -6.158     37.920964
42     11.2     -5.958     35.497764
43     29.7     12.542    157.301764
44     27.1      9.942     98.843364
45     31.1     13.942    194.379364
46     10.2     -6.958     48.413764
47      1      -16.158    261.080964
48     18.7      1.542      2.377764
49     32.7     15.542    241.553764
50     16.1     -1.058      1.119364
Total            ∑(x - x̄)² = 5486.8818

a. Determine the standard deviation of the market values.

By formula:

$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$$

Where

$$\bar{x} = \frac{\sum x}{N} = \frac{x_1 + x_2 + \dots + x_{50}}{50} = \frac{857.9}{50} = 17.158$$

And according to the table above,

$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}} = \sqrt{\frac{5486.8818}{50}} = 10.475573 \text{ (in million dollars)}$$

Therefore the standard deviation of the market values is about 10.48 (million dollars).

b. Determine the coefficient of variation.

$$C.V. = \frac{\sigma}{\bar{x}} \times 100 = \frac{10.475573}{17.158} \times 100 = 61.0536\%$$

Therefore the coefficient of variation is C.V. = 61.0536%
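The standard deviation and coefficient of variation can be recomputed from the 50 market values with the Python sketch below. The variable names are illustrative, and the divisor N (population form) is used to match the formula in this solution.

```python
import math

# The 50 estimated market values (in $ millions) from the table
values = [
    26.8, 28.3, 11.7, 6.7, 6.1, 8.6, 15.5, 18.5, 31.4, 0.9,
    6.5, 31.4, 6.8, 30.4, 9.6, 30.6, 23.4, 22.3, 20.6, 35.0,
    15.4, 4.3, 12.9, 5.2, 17.1, 18.0, 20.2, 29.8, 37.8, 1.9,
    7.6, 33.5, 1.3, 13.4, 1.2, 21.5, 7.9, 14.1, 18.3, 16.6,
    11.0, 11.2, 29.7, 27.1, 31.1, 10.2, 1.0, 18.7, 32.7, 16.1,
]

n = len(values)
mean = sum(values) / n
sum_sq_dev = sum((x - mean) ** 2 for x in values)  # sum of squared deviations
sigma = math.sqrt(sum_sq_dev / n)                  # standard deviation, divisor N
cv = sigma / mean * 100                            # coefficient of variation in %

print(round(mean, 3), round(sum_sq_dev, 4), round(sigma, 4), round(cv, 2))
```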


6. DETERMINE KARL PEARSON’S COEFFICIENT OF CORRELATION

Year    R&D spent (x)   Annual profit (y)   x - x̄    y - ȳ   (x - x̄)²   (y - ȳ)²   (x - x̄)(y - ȳ)
2000          2                20            -4.1    -12.8     16.81     163.84        52.48
2001          3                25            -3.1     -7.8      9.61      60.84        24.18
2002          5                34            -1.1      1.2      1.21       1.44        -1.32
2003          4                30            -2.1     -2.8      4.41       7.84         5.88
2004         11                40             4.9      7.2     24.01      51.84        35.28
2005          5                31            -1.1     -1.8      1.21       3.24         1.98
2006          6                35            -0.1      2.2      0.01       4.84        -0.22
2007          8                36             1.9      3.2      3.61      10.24         6.08
2008          7                38             0.9      5.2      0.81      27.04         4.68
2009         10                39             3.9      6.2     15.21      38.44        24.18
Total     ∑x = 61          ∑y = 328                            76.90     369.60       153.20

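By formula

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$

Where

$$\bar{x} = \frac{\sum x}{N} = \frac{61}{10} = 6.1, \qquad \bar{y} = \frac{\sum y}{N} = \frac{328}{10} = 32.8$$

$$\sum (x - \bar{x})^2 = 76.90, \qquad \sum (y - \bar{y})^2 = 369.60, \qquad \sum (x - \bar{x})(y - \bar{y}) = 153.20$$

Hence,

$$r = \frac{153.20}{\sqrt{76.90 \times 369.60}} = 0.9087$$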

Therefore coefficient of correlation is r = 0.9087
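As a check, Karl Pearson's coefficient can be computed from the ten pairs of observations with the Python sketch below. The function name `pearson_r` and the list names are illustrative.

```python
import math

rd_spent = [2, 3, 5, 4, 11, 5, 6, 8, 7, 10]         # x, years 2000 to 2009
profit = [20, 25, 34, 30, 40, 31, 35, 36, 38, 39]   # y, years 2000 to 2009

def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(rd_spent, profit)
print(round(r, 4))  # 0.9087
```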


- Explain the relationship between the amount spent on R&D and profit of the company.

The value r = 0.9087 indicates a high degree of positive correlation between the amount spent on R&D and the profit of the company. Therefore, the company should spend more on R&D to increase its annual profit.

7. CORRELATION AND REGRESSION, SCATTER DIAGRAM.

- The difference between correlation and regression

Correlation: a statistical tool that studies or measures the relationship between two variables. It gives an idea of the degree and direction of the relationship between the two variables under study. Examples: the relationship between advertisement expense and sales, or between the amount spent on R&D and annual profit.

Regression: another important statistical tool, which studies or measures the impact of one variable on another. It means the estimation or prediction of the unknown value of one variable from the known value of the other variable. Examples: the impact of advertisement expense on sales, or the estimation of earnings from sales.

- The scatter diagram: the scatter diagram is the diagrammatic representation of bivariate data. It only tells us about the nature of the relationship, that is, whether it is positive or negative and whether it is high or low. It does not provide an exact measure of the extent of the relationship between the two variables.

Below are the explanations through the scatter diagrams “Graphic”:

[Figure: scatter diagrams (a) and (b), y plotted against x]


1. Picture (a) indicates that the correlation is perfect and positive because all the points lie on a straight line starting from the bottom left and rising towards the top right. Perfect positive correlation means that every increase/decrease in x is matched by a proportional increase/decrease in y (in this case the coefficient of correlation is r = +1). Picture (b) indicates that the correlation is perfect and negative because all the points lie on a straight line starting from the top left and coming down to the bottom right. Perfect negative correlation means that every increase/decrease in x is matched by a proportional decrease/increase in y (in this case the coefficient of correlation is r = -1).

[Figure: scatter diagrams (c) and (d), y plotted against x]

2. Picture (c) shows the correlation is positive since this reveals that the values of the two

variables move in the same direction because the plotted points reveal an upward trend

rising from lower left hand corner and going upward to the upper right hand corner. If x

increases/decreases ==> y increases/decreases. Picture (d) shows the correlation is

negative since in this case the values of the two variables move in the opposite direction

because the points depict a downward trend from the upper left hand corner to the lower

right hand corner. If x increases/decreases ==> y decreases/increases.

3. If the points are very dense, i.e., very close to each other, a fairly good amount of

correlation may be expected between the two variables. If the points are widely scattered,

a poor correlation may be expected between them.
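These cases can be illustrated numerically with a small Python sketch. All points below are hypothetical and are not taken from the diagrams; the helper `corr` is an illustrative implementation of Karl Pearson's coefficient.

```python
import math

def corr(xs, ys):
    """Karl Pearson's coefficient of correlation for paired values."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_rising = [3, 5, 7, 9, 11]    # all points on a rising straight line, as in (a)
y_falling = [10, 8, 6, 4, 2]   # all points on a falling straight line, as in (b)
y_scattered = [5, 1, 6, 2, 4]  # widely scattered points with no clear trend

r_rising = corr(x, y_rising)
r_falling = corr(x, y_falling)
r_scattered = corr(x, y_scattered)

print(r_rising, r_falling, round(r_scattered, 3))
```

The two straight-line cases give r = +1 and r = -1, while the scattered points give a value close to zero.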


8. REGRESSION ANALYSIS

a. Determine the regression equation

Company          Sales x        Earnings y      x - x̄     y - ȳ    (x - x̄)²   (y - ȳ)²   (x - x̄)(y - ȳ)
                 ($ millions)   ($ millions)
Lucky               89.200         4.900       47.442    -0.442   2250.712      0.195       -20.953
KFC                 28.600         6.000      -13.158     0.658    173.142      0.433        -8.663
Mekong Bus          18.200         1.300      -23.558    -4.042    554.995     16.335        95.215
Sorya Bus           69.200        12.800       27.442     7.458    753.045     55.627       204.669
Bayon Bakery        17.500         2.600      -24.258    -2.742    588.467      7.517        66.508
Apsara Bakery       11.900         1.700      -29.858    -3.642    891.520     13.262       108.734
Tiger Beer          71.700         8.000       29.942     2.658    896.503      7.067        79.595
Angkor Beer         58.600         6.600       16.842     1.258    283.642      1.583        21.192
Pizza World         19.600         3.500      -22.158    -1.842    490.992      3.392        40.808
Master Roll         18.600         4.400      -23.158    -0.942    536.308      0.887        21.807
Akira               51.200         8.200        9.442     2.858     89.145      8.170        26.987
Nokia               46.800         4.100        5.042    -1.242     25.418      1.542        -6.260
Total          ∑x = 501.100   ∑y = 64.100                         7533.889    116.009       629.641

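According to the table above,

$$\bar{x} = \frac{\sum x}{N} = \frac{501.1}{12} = 41.758, \qquad \bar{y} = \frac{\sum y}{N} = \frac{64.1}{12} = 5.342$$

$$\sum (x - \bar{x})^2 = 7533.889, \qquad \sum (y - \bar{y})^2 = 116.009, \qquad \sum (x - \bar{x})(y - \bar{y}) = 629.641$$

Find the coefficients:

$$b_{yx} = \frac{\sum dx\,dy}{\sum dx^2} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{629.641}{7533.889} = 0.08357$$

$$b_{xy} = \frac{\sum dx\,dy}{\sum dy^2} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{629.641}{116.009} = 5.42751$$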


Hence,

Equation of line of regression of earnings on sales (y on x):

$$y - \bar{y} = b_{yx}(x - \bar{x})$$

$$y - 5.341 = 0.08357(x - 41.758)$$

$$y = 0.08357x + 5.341 - 3.4897$$

$$y = 0.08357x + 1.8513$$

Therefore Equation of line of regression of earnings on sales (y on x): y = 0.08357x + 1.8513

Equation of line of regression of sales on earnings (x on y):

$$x - \bar{x} = b_{xy}(y - \bar{y})$$

$$x - 41.758 = 5.42751(y - 5.342)$$

$$x = 5.42751y + 41.758 - 28.99195$$

$$x = 5.42751y + 12.766$$

Therefore Equation of line of regression of sales on earnings (x on y): x = 5.42751y + 12.766

b. Estimate the earnings for a small company with $50.0 million in sales

As the equation of the line of regression of earnings on sales (y on x) is y = 0.08357x + 1.8513, the estimated earnings are y = 0.08357 × 50 + 1.8513 = 6.0298 (in million dollars).

Therefore the estimated earnings of a company with $50.0 million in sales are about $6.0298 million.
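The regression of earnings on sales can be reproduced with the Python sketch below. The names `b_yx`, `b_xy`, `intercept` and `predict_earnings` are illustrative. Because the code carries full precision rather than the rounded means used in the hand calculation, the intercept and the estimate may differ from the figures above in the third or fourth decimal place.

```python
# Sales (x) and earnings (y) in $ millions for the 12 companies in the table
sales = [89.2, 28.6, 18.2, 69.2, 17.5, 11.9, 71.7, 58.6, 19.6, 18.6, 51.2, 46.8]
earnings = [4.9, 6.0, 1.3, 12.8, 2.6, 1.7, 8.0, 6.6, 3.5, 4.4, 8.2, 4.1]

n = len(sales)
mean_x = sum(sales) / n
mean_y = sum(earnings) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sales, earnings))
sxx = sum((x - mean_x) ** 2 for x in sales)
syy = sum((y - mean_y) ** 2 for y in earnings)

b_yx = sxy / sxx                    # slope of the regression of y on x
b_xy = sxy / syy                    # slope of the regression of x on y
intercept = mean_y - b_yx * mean_x  # constant term of the y on x line

def predict_earnings(sales_value):
    """Estimated earnings ($ millions) for a given sales value ($ millions)."""
    return intercept + b_yx * sales_value

print(round(b_yx, 5), round(b_xy, 5), round(intercept, 3))
print(round(predict_earnings(50.0), 2))  # 6.03
```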
