Log Normal Distribution: How to Find Shares of a Group


Suppose our variable of interest is X (annual sales of a firm, for example).

For the time being, let us suppose that we have reasons to believe that X has a Log Normal distribution with parameters m and s².

This means that ln(X) has a Normal distribution with m as the mean and s² as the variance.

In our example: m = -0.2037 and s = 1.3321

[Note that X is measured in units of Rs. 1,00,000, i.e. Lakhs in the Indian system, so the m and s above refer to the log of sales expressed in Lakhs]

There can be two types of queries that you may be interested in:

(a) What is the share in total annual sales of those firms that each have annual sales above Rs. 5,00,000?

(b) What is the share in total annual sales of, say, the top 3% of the firms?

Here is the recipe to find the answers:

Query (a)

Step 1: Compute

$$ z = \frac{\ln(5) - m}{s} - s = \frac{\ln(5) + 0.2037}{1.3321} - 1.3321 = 0.029012 $$

Step 2: Find the area above z (i.e. 0.029012 in this case) in standard normal distribution i.e. N(0,1) [go to StatCalc or PHStat2 for this].

The result is 0.4884. Thus, the share in total annual sales of the firms with annual sales above Rs. 5 Lakhs each is 48.84%!
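The two steps of Query (a) can be sketched in a few lines of Python, as an alternative to StatCalc or PHStat2. This is only a sketch: the function name `sales_share_above` is my own, and the standard normal area is taken from `statistics.NormalDist` in the Python standard library.

```python
from math import log
from statistics import NormalDist


def sales_share_above(threshold, m, s):
    """Share in total sales of firms whose sales exceed `threshold`.

    Assumes X is Log Normal, with ln(X) ~ Normal(m, s^2), and that
    `threshold` is in the same units as X (Lakhs in this example).
    """
    # Step 1: standardise ln(threshold), then shift by s
    z = (log(threshold) - m) / s - s
    # Step 2: area above z in N(0,1)
    return 1.0 - NormalDist().cdf(z)


share_a = sales_share_above(5.0, m=-0.2037, s=1.3321)
print(round(share_a, 4))
```

With the m and s of this example, the printed value should agree with the 0.4884 above, up to rounding.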

Query (b)

Step 1: Find z such that P(Z ≤ z) = 1 – 0.03 = 0.97, where Z follows the standard normal distribution i.e. N(0,1) [go to StatCalc or PHStat2]

The result is z = 1.88079

Step 2: Compute z1 = z – s = 1.88079 - 1.3321 = 0.54869

Step 3: Find from standard normal distribution i.e. N(0,1) the area above z1 i.e. 0.54869. The result is 0.2916.

Thus, the share of the top 3% of firms in total annual sales is 29.16%
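The three steps of Query (b) can be sketched in Python in the same way. The function name `top_fraction_sales_share` is my own; `inv_cdf` and `cdf` are methods of `statistics.NormalDist` in the standard library. Note that m does not enter the recipe at all, only s does.

```python
from statistics import NormalDist


def top_fraction_sales_share(top_fraction, s):
    """Share in total sales of the top `top_fraction` of firms.

    Assumes sales are Log Normal with log-scale standard deviation s.
    """
    std_normal = NormalDist()
    # Step 1: z with area (1 - top_fraction) below it in N(0,1)
    z = std_normal.inv_cdf(1.0 - top_fraction)
    # Step 2: shift by s
    z1 = z - s
    # Step 3: area above z1 in N(0,1)
    return 1.0 - std_normal.cdf(z1)


share_b = top_fraction_sales_share(0.03, s=1.3321)
print(round(share_b, 4))
```

With s = 1.3321 the printed value should agree with the 0.2916 above, up to rounding.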

Log Normal Distribution: Lorenz Ratio

While discussing Log Normal distribution, let me also show you another useful result.

The Lorenz Ratio of a Log Normal distribution with parameter s² is given by:

2 × [area below s/√2 in N(0,1)] – 1

Thus, in the present data on annual sales of firms, the Lorenz Ratio is 0.65.
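If you want to check the Lorenz Ratio without a statistical table, here is a minimal Python sketch of the formula. The function name `lognormal_lorenz_ratio` is my own; the area below s/√2 comes from `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def lognormal_lorenz_ratio(s):
    """Lorenz Ratio of a Log Normal distribution: 2 * Phi(s / sqrt(2)) - 1."""
    area_below = NormalDist().cdf(s / sqrt(2.0))
    return 2.0 * area_below - 1.0


lorenz = lognormal_lorenz_ratio(1.3321)
print(round(lorenz, 2))
```

For s = 1.3321 this should come out close to the 0.65 quoted above.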

Please check if you can get it. I have used rounding off to four places of decimals in the intermediate calculation (so your results and mine may vary a little depending on the extent of rounding off you do).

All this must be raising the following Questions / Doubts

How do we know that we can reasonably assume a Log Normal Distribution?

I have to answer this by assuming two different situations:

In the first situation – you may have data as I have used here on annual sales (a sample of 50 firms, refer to the Excel Sheet ‘lognormal-distn.xls’ from my website). In this case, just fit the log normal distribution following the procedure I have already demonstrated in the classroom, and check if it would be a reasonable assumption.

In the second situation, you do not have any such data available at hand (I think that in your case you can most often get them from one database or another, such as the CMIE database in India, but suppose you don’t).

Well, this is a bit of a blind situation! We need to fall back upon what we know from experience. Almost all distributions of economic variables are skewed, but not all are well described by the Log Normal distribution model. This model works fairly well in the case of size distributions of firms, provided the units are individual firms (not conglomerations of units) and the incidence of agglomeration or mergers is rare. I am sticking my neck out when I say this, because there can be situations where even the absence of agglomeration may not be enough.

There are other economic variables, such as consumption expenditure, where this distribution is often an adequate description.

Log Normal distribution does not work well in the case of income or asset distributions.

The above are only some clues. You have to take your chances. After all, no action because of lack of information is worse than some approximate action!

How do we know m and s²?

If you have data such as I have in this example, then just take the natural log of all the figures and compute the mean and the variance to get m and s². Incidentally, use the following formula for computing s² if you have sample data in hand (here each x_i is the natural log of an observation and x̄ is the mean of these logs):

$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2 \right] $$

Most software packages, like Excel (STDEV and STDEVP), give you the option to compute the variance or standard deviation for the population or for the sample. You should use the sample version (STDEV in Excel) if you are dealing with a sample.
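As a rough illustration of this step outside Excel, the sketch below takes logs and uses the sample (n – 1) standard deviation. The function name `fit_lognormal` and the five sales figures are made up for illustration; they are not the 50-firm sample. `statistics.stdev` is the sample version, `statistics.pstdev` the population one.

```python
from math import log
from statistics import mean, stdev


def fit_lognormal(values):
    """Return (m, s): mean and sample standard deviation of ln(values)."""
    logs = [log(v) for v in values]
    return mean(logs), stdev(logs)  # stdev divides by n - 1


# Made-up sales figures in Lakhs, for illustration only
hypothetical_sales = [0.4, 0.9, 1.5, 2.7, 6.1]
m_hat, s_hat = fit_lognormal(hypothetical_sales)
print(m_hat, s_hat)
```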

Suppose you don’t have the raw data but you have (from past data, maybe) the mean and variance of the original variable (not the mean and variance of the log data). Then get m and s² using the following formulae, where

a = Mean of X

b² = Variance of X

Then,

$$ m = \ln\left( \frac{a^2}{\sqrt{a^2 + b^2}} \right) $$

$$ s^2 = \ln\left( 1 + \frac{b^2}{a^2} \right) $$
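A small Python sketch of this conversion follows. The function name `lognormal_params_from_moments` is my own. As a check, it starts from the m and s of the example, computes the implied mean and variance of X using the standard Log Normal moment formulas (mean = exp(m + s²/2), variance = (exp(s²) – 1)·exp(2m + s²)), and converts back.

```python
from math import exp, log, sqrt


def lognormal_params_from_moments(a, b2):
    """Return (m, s2) given a = mean of X and b2 = variance of X."""
    m = log(a ** 2 / sqrt(a ** 2 + b2))
    s2 = log(1.0 + b2 / a ** 2)
    return m, s2


# Round trip using the example's parameters
m0, s0 = -0.2037, 1.3321
a = exp(m0 + s0 ** 2 / 2.0)                       # mean of X
b2 = (exp(s0 ** 2) - 1.0) * exp(2.0 * m0 + s0 ** 2)  # variance of X
m_back, s2_back = lognormal_params_from_moments(a, b2)
print(m_back, sqrt(s2_back))
```

The printed values should reproduce m = -0.2037 and s = 1.3321 up to floating-point error.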

HAPPY HUNTING “ABNORMAL DISTRIBUTIONS”!!!!
