
Mashadi, Syamsudhuha, MDH Gamal and M. Imran (Eds.), Proceedings of the International Seminar on Mathematics and Its Usage in Other Areas, November 11-12, 2010. ISBN 978-979-1222-95-2

RARE EVENT MODEL SIMULATION FOR HEAVY TAILED DISTRIBUTION

Dodi Devianto

Department of Mathematics, Andalas University Limau Manis Campus, Padang-city 25163, West Sumatera, INDONESIA

Email: ddevianto@fmipa.unand.ac.id

Abstract.

It is shown that the sum of rare events $X_{nj}$, generated from the generalization of the negative binomial distribution in the scheme of the infinitesimal system $X = \{\{X_{nj}\}_{j=1,2,\ldots,n}\}$, converges to a kind of stable distribution that is skewed and heavy tailed. The sum of the random variables $X_{nj}$ has a good shape for describing a rare event phenomenon in which most "fail" events have high probability concentrated around the value zero, while the remainder (success events) have very low probability of appearing.

Keywords: rare event, skewed distribution, heavy tailed distribution, stable distribution, generalization of negative binomial distribution, infinitesimal system schemes.

1. Introduction

It is known that the strongest statistical argument is based on the central limit theorem, which states that the sum of a large number of independent identically distributed random variables with finite variance tends to be normally distributed. However, empirical research shows that an infinitesimal system of random variables (the triangular array problem) usually has heavier tails and, in special cases, the distribution of the row sums of this system does not fit a normal distribution well because of its heavy tails and skewness. This problem can only be solved by a generalization of the central limit theorem in the sense of stable distributions.

The study of subclasses of skewed distributions, especially heavy tailed ones, has recently been booming because of their ability to describe heavy tailed empirical data, and they have become the most popular alternative to the Gaussian distribution, which has been rejected by numerous empirical studies. The strong empirical evidence for these features, combined with the generalization of the central limit theorem, is used in many papers to justify stable distribution models; examples in economics and finance are given by Mandelbrot (1963), Fama (1965), Fama and Roll (1971), Embrechts et al. (1997), Rachev and Mittnik (2000) and McCulloch (1996).

These facts give strong evidence for the importance of stable distributions in handling the heavy tailed and skewed distributions found in empirical data, especially for rare event phenomena that are poorly described by the Gaussian distribution. It is worthwhile to explain this with a mathematical theorem and to confirm the results by simulation. Therefore, this paper is devoted to simulating a rare event model for a heavy tailed distribution in the scheme of an infinitesimal system.

2. Low Probability and Rare Event Model

The phenomenon of low probability (rare) events relies on the description of possible outcomes of something very rare, which usually have a huge impact and are nearly impossible to predict from the past history of a data set. A rare event is sometimes regarded as an outlier, as it lies outside the realm of regular expectation because nothing in the past can convincingly point to its possibility. For instance, the sales of a product that breaks into the market with outstandingly high popularity can be recognized as a rare event, since this phenomenon occurs with very low probability among many products.

The formal mathematical setup for this phenomenon is to let $X(t)$ be a random variable for the rare event phenomenon, such as the outstandingly high popularity of a product above, and to let the events $\{R(t)\}$ be defined on a probability space $(\Omega, \mathcal{A}, P)$ and be rare in the sense that $x(t) = \Pr(R(t)) \to 0$ as $t \to \infty$. An estimator for $x(t)$ is a random variable $X(t)$ such that $x(t) = E X(t)$. The difficulty in rare event simulation is to produce estimators that have not only a small variance $\mathrm{Var}(X(t))$ but also a small relative error $\sqrt{\mathrm{Var}(X(t))}/x(t)$. Asymptotically, the best performance that has been observed in realistic situations is a bounded relative error in the limit $t \to \infty$.

The best-known way to carry out rare event simulation is the Monte Carlo method: produce $n$ independent and identically distributed replications $X_1, X_2, \ldots, X_n$ of $X(t)$, estimate $x = x(t)$ by their empirical average, and form a confidence interval based on the empirical variance of $X_j$ for $j = 1, 2, \ldots, n$. This kind of event can be approached, as an example, by taking the probability $p$ of "failure" close to one in the generalization of the negative binomial distribution, which has the following distribution

$$\Pr(X = r) = \frac{v(v+1)\cdots(v+r-1)}{r!}\, p^{r} q^{v},$$

where $v = 1/n$ for large enough $n$, $r$ takes integer values starting from zero, and $p$ is the probability of failure with $0 < p < 1$, $q = 1 - p$. This probability distribution has mean $\mu = E(X) = vp/(1-p)$ and variance

$$\sigma^{2} = E(X - \mu)^{2} = \frac{vp}{(1-p)^{2}},$$

which is relatively small when $v$ becomes small for large enough $n$.
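The mean and variance stated above can be checked numerically. The following is only a minimal sketch: the function name `gnb_pmf_table`, the truncation point `r_max`, and the parameter values ($p = 0.75$, $n = 5$, as in Figure 1) are assumptions made for illustration, and the pmf is the form $\frac{v(v+1)\cdots(v+r-1)}{r!} p^{r}(1-p)^{v}$.

```python
def gnb_pmf_table(v, p, r_max):
    """Return [Pr(X = 0), ..., Pr(X = r_max)] for the generalized negative
    binomial pmf v(v+1)...(v+r-1)/r! * p**r * (1-p)**v.

    Uses the ratio Pr(X = r+1) / Pr(X = r) = (v + r) / (r + 1) * p.
    """
    probs = [(1.0 - p) ** v]
    for r in range(r_max):
        probs.append(probs[-1] * (v + r) / (r + 1) * p)
    return probs


# Illustrative parameters (assumed): failure probability p and v = 1/n.
n, p = 5, 0.75
v = 1.0 / n

probs = gnb_pmf_table(v, p, r_max=2000)  # truncation point is an assumption
total = sum(probs)
mean = sum(r * pr for r, pr in enumerate(probs))
var = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))

print("total mass      :", total)
print("mean, vp/(1-p)  :", mean, v * p / (1.0 - p))
print("var, vp/(1-p)^2 :", var, v * p / (1.0 - p) ** 2)
print("Pr(X = 0)       :", probs[0])
```

Most of the probability mass sits at zero, which is the rare event shape described in the text.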

Figure 1. Generalized negative binomial distribution (skewed graph, red plot) and classical negative binomial distribution (waved graph, blue plot) with probability of failure p = 0.75 and n = 5; horizontal axis: random variable X, vertical axis: probability distribution. The generalized negative binomial distribution has low probabilities except at x = 0, which depicts the rare event phenomenon with a very heavy tail.

Besides the characterization of this kind of distribution with low probabilities, it is fascinating to look at the limit distribution of sums of independent random variables. There are many established theorems; one of them is the central limit theorem, which gives necessary and sufficient conditions for sums of independent random variables with fixed mean and finite variance to converge to the normal distribution. In very special cases, e.g. random variables from a rare event phenomenon with a skewed distribution, the sums of these independent random variables converge to some special distribution. In many cases a large number of random variables are arranged so that they become small, or most of them are close to zero; it is then interesting to find the distributional tendency of the sums of these independent random variables, and we call such a system an infinitesimal system of random variables.

To cover this problem in a special case, Devianto and Takano (2007) derived necessary and sufficient conditions for the convergence of row sums of an infinitesimal triangular array of random variables to the geometric distribution, where the proof is based on the Lévy representation of the infinitely divisible characteristic function of the geometric distribution. Next, let us treat random variables from the generalization of the negative binomial distribution in the infinitesimal system scheme, by setting $\{\{X_{nj}\}_{j=1,2,\ldots,n}\}$ as a sequence of row-wise independent identically distributed random variables, where $X_{nj}$ has the generalization of the negative binomial distribution with the following probabilities,

$$\Pr(X_{nj} = r) = \frac{v(v+1)\cdots(v+r-1)}{r!}\, p^{r} (1-p)^{v},$$

where $0 < p < 1$ and $v = 1/n$. The system is infinitesimal because, for $0 < \varepsilon < 1$,

$$\max_{j=1,2,\ldots,n} \Pr\{|X_{nj}| > \varepsilon\} = 1 - (1-p)^{v} \to 0$$

as $n \to \infty$, and the probability is no larger for $\varepsilon \ge 1$. The parameter $v$ in this expression is often called the over (under) dispersion parameter; over (under) dispersion occurs when the observed variance is higher (lower) than the variance of the theoretical model. In this scheme of an infinitesimal system of random variables we fix the over (under) dispersion parameter $v = 1/n$. Based on the infinitesimal system $\{\{X_{nj}\}_{j=1,2,\ldots,n}\}_{n=1,2,\ldots}$, where the random variable $X_{nj}$ has the generalization of the negative binomial distribution, we have an important example of convergence to the geometric distribution in the following theorem.

Theorem (Devianto and Takano, 2007). The system of independent identically distributed random variables $\{\{X_{nj}\}_{j=1,2,\ldots,n}\}_{n=1,2,\ldots}$ is an infinitesimal system of random variables, and the sequence of distribution functions of the sums of independent random variables $Z_n = X_{n1} + X_{n2} + \ldots + X_{nn}$ converges completely to the geometric distribution.

We have seen that a random sample from the generalization of the negative binomial distribution can be recognized as a rare event case. If we set an infinitesimal system of random variables $\{\{X_{nj}\}_{j=1,2,\ldots,n}\}_{n=1,2,\ldots}$, where $X_{nj}$ has the generalization of the negative binomial distribution, then by the Theorem above the random variable $Z_n = X_{n1} + X_{n2} + \ldots + X_{nn}$ converges completely to the geometric distribution. This fact gives new evidence that sums of independent identically distributed random variables from a rare event phenomenon tend not only to be heavy tailed but also to converge to a kind of memoryless distribution, namely the geometric distribution.
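The convergence to the geometric distribution can be illustrated numerically by convolving the pmf of $X_{nj}$ with itself $n$ times and comparing the result with the geometric pmf $(1-p)p^{r}$. This is a sketch rather than part of the cited proof: the helper names, the truncation point `r_max`, and the values $n = 8$, $p = 0.75$ are assumptions, and the pmf used is $\frac{v(v+1)\cdots(v+r-1)}{r!} p^{r}(1-p)^{v}$ with $v = 1/n$.

```python
def gnb_pmf_table(v, p, r_max):
    """[Pr(X = 0), ..., Pr(X = r_max)] for the generalized negative binomial pmf."""
    probs = [(1.0 - p) ** v]
    for r in range(r_max):
        probs.append(probs[-1] * (v + r) / (r + 1) * p)
    return probs


def convolve_truncated(a, b):
    """Convolution of two pmfs on 0, 1, 2, ..., keeping only the first len(a) terms.

    Because both supports are non-negative, the kept terms are exact.
    """
    size = len(a)
    out = [0.0] * size
    for i, ai in enumerate(a):
        for j in range(size - i):
            out[i + j] += ai * b[j]
    return out


def row_sum_pmf(n, p, r_max=40):
    """pmf of Z_n = X_n1 + ... + X_nn on 0..r_max, with v = 1/n."""
    single = gnb_pmf_table(1.0 / n, p, r_max)
    total = single
    for _ in range(n - 1):
        total = convolve_truncated(total, single)
    return total


n, p = 8, 0.75  # illustrative values (assumed)
z_pmf = row_sum_pmf(n, p)
geometric = [(1.0 - p) * p**r for r in range(len(z_pmf))]
max_diff = max(abs(a - b) for a, b in zip(z_pmf, geometric))
print("largest pointwise difference from the geometric pmf:", max_diff)
```

With this pmf the row sum agrees with the geometric pmf up to floating-point error for each fixed $n$, since the dispersion parameters of the $n$ summands add up to $n \cdot (1/n) = 1$.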

3. The Model and Simulation Results

We have explained that the generalization of the negative binomial distribution has a good shape for describing a rare event phenomenon in which most "fail" events have high probability concentrated on the value zero, while the remainder (success events) have very low probability of appearing. Using this fact, we generate random samples $X_{nj}$ from this rare event phenomenon with an acceptance-rejection algorithm and then turn them into new random variables $Y_{nj}$ defined at time $t$ as follows


$$Y_{nj} = \begin{cases} +X_{nj} & \text{with probability } 1/2, \\ -X_{nj} & \text{with probability } 1/2. \end{cases}$$

We are interested in the probability distribution of the annual growth of the sums of independent random variables $Y_{nj}$, that is $Z_n = Y_{n1} + Y_{n2} + \ldots + Y_{nn}$. We expect the statistical properties of the growth distribution of $Z_n$ to still depend on the properties of the random samples $X_{nj}$, since it is natural that the magnitude of the fluctuations of $Z_n$ increases while the shape of its probability distribution becomes more skewed than that of the half-geometric distribution, where the half-geometric distribution is defined as follows

$$\Pr(X = n) = \begin{cases} (1/2)(1-p)\,p^{n} & \text{for } n = 0, 1, 2, \ldots, \\ (1/2)(1-p)\,p^{-n} & \text{for } n = -1, -2, \ldots, \end{cases}$$

where $p$ is a fixed probability of "failure" on any single attempt.
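The construction of $Y_{nj}$ and $Z_n$ described above can be sketched in a few lines. Several things here are assumptions made for illustration: the sketch uses inverse-transform sampling from a truncated cumulative table instead of the acceptance-rejection algorithm mentioned in the text, it takes $Y_{nj} = \pm X_{nj}$ with probability 1/2 each, and the function names, the seed, the sample size (20000), the row length ($n = 10$) and the tail threshold ($c = 30$) are hypothetical choices. Only $p = 0.9$ is taken from the figure captions.

```python
import bisect
import random


def gnb_cdf_table(v, p, r_max=600):
    """Cumulative probabilities Pr(X <= r), r = 0..r_max, for the pmf
    v(v+1)...(v+r-1)/r! * p**r * (1-p)**v (truncated at r_max)."""
    pr = (1.0 - p) ** v
    cdf, acc = [], 0.0
    for r in range(r_max + 1):
        acc += pr
        cdf.append(acc)
        pr *= (v + r) / (r + 1) * p
    return cdf


def sample_z(n, cdf, rng):
    """Draw one Z_n = Y_n1 + ... + Y_nn with Y_nj = +X_nj or -X_nj (prob. 1/2 each)."""
    z = 0
    for _ in range(n):
        # Inverse transform: smallest r with Pr(X <= r) >= u, clamped to the table.
        x = min(bisect.bisect_left(cdf, rng.random()), len(cdf) - 1)
        z += x if rng.random() < 0.5 else -x
    return z


def simulate(n=10, p=0.9, num_samples=20000, seed=0):
    rng = random.Random(seed)
    cdf = gnb_cdf_table(1.0 / n, p)
    return [sample_z(n, cdf, rng) for _ in range(num_samples)]


def tail_estimate(zs, c):
    """Crude Monte Carlo estimate of Pr(|Z_n| > c) and its estimated relative error."""
    hits = sum(1 for z in zs if abs(z) > c)
    x_hat = hits / len(zs)
    rel_err = ((1.0 - x_hat) / (len(zs) * x_hat)) ** 0.5 if hits else float("inf")
    return x_hat, rel_err


zs = simulate()
mean_z = sum(zs) / len(zs)
var_z = sum((z - mean_z) ** 2 for z in zs) / (len(zs) - 1)
x_hat, rel_err = tail_estimate(zs, c=30)
print("sample mean and variance of Z_n:", mean_z, var_z)
print("estimate of Pr(|Z_n| > 30) and its relative error:", x_hat, rel_err)
```

The relative error reported by `tail_estimate` is the quantity $\sqrt{\mathrm{Var}}/x$ discussed in Section 2, applied to the empirical average of an indicator variable; it grows as the threshold is pushed further into the tail.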

Figure 2. PDF plot of $Z_n$ (red plot) and the half-geometric distribution (blue plot) with N = 10'' random samples and p = 0.9. The difference in shape between the distribution of $Z_n$ and the half-geometric distribution arises from summing random samples with a plus or minus sign when generating the random variables $Y_{nj}$.

Figure 3. CDF plot of $Z_n$ (red plot) and the half-geometric distribution (blue plot) with N = 10'' random samples and p = 0.9. The difference in shape between the distribution of $Z_n$ and the half-geometric distribution arises from summing random samples with a plus or minus sign when generating the random variables $Y_{nj}$.


Figure 4. The log-plot of the PDF of $Z_n$ (red plot) is very close to a straight line, while the half-geometric distribution (blue plot) collapses onto two straight lines.

Figure 5. The log-log plot of 1-CDF of $Z_n$ (blue plot); the tail of the log-log plot is expected to be a straight line.

The rare event model above shows growth dynamics through the random samples $Z_n$, which strongly exhibit a kind of stable distribution. This stable distribution implies a heavy tailed distribution whose tail shows power law behavior. The log-log plot of the distributions (Figure 4) collapses onto a straight line. This result suggests universality: the equivalence of power laws with a particular scaling exponent can have a deeper origin in the dynamical processes that generate the relation in the tail behavior. This is a new alternative point of view in the study of skewed distributions related to the firm size distribution for rare event phenomena, and it is a worthwhile result for studying the size distribution of business firms in growth dynamics.

4. Conclusion

The row sum of independent random variables $X_{nj}$ generated from the generalization of the negative binomial distribution in the scheme of the infinitesimal system $X = \{\{X_{nj}\}_{j=1,2,\ldots,n}\}$ converges to a kind of stable distribution that is skewed and heavy tailed, and the tails show power law behavior as an implication of the properties of stable distributions. The sum of the random variables $X_{nj}$ has a good shape for describing a rare event phenomenon in which most "fail" events have high probability concentrated around the value zero, while the remainder (success events) have very low probability of appearing. This is a new alternative point of view in the study of skewed distributions related to the firm size distribution for growth dynamics.

REFERENCES

[1] Amaral, L. A. N., Buldyrev, S. V., Havlin, S., Leschhorn, P., Salinger, M. A., Stanley, H. E. and Stanley, M. H. R. (1997). Scaling Behavior in Economics: II. Modeling of Company Growth. Journal de Physique I France 7, pp 635-650.
[2] Amaral, L. A. N., Gopikrishnan, P., Plerou, P. and Stanley, H. E. (2001). A Model for the Growth Dynamics of Economic Organizations. Physica A 299, pp 127-136.


[3] Devianto, D. and Takano, K. (2007). On Necessary and Sufficient Conditions for Convergence to the Geometric Distribution. Int. J. Pure and Appl. Math., No. 39 Vol. 2, pp 249-264.
[4] Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997). Modelling Extremal Events for Insurance and Finance. Springer, Berlin.
[5] Fama, E. F. (1965). The Behavior of Stock Market Prices. Journal of Business 38, pp 34-105.
[6] Fama, E. and Roll, R. (1971). Parameter Estimates for Symmetric Stable Distributions. Journal of the American Statistical Association 66, pp 331-338.
[7] Fujiwara, Y., Souma, W., Aoyama, H., Kaizoji, T. and Aoki, M. (2003). Growth and Fluctuations of Personal Income. Physica A 321, pp 598-604.
[8] Fujiwara, Y., Aoyama, H. and Souma, W. (2006). Growth and Fluctuations for Small-Business Firms. Springer, London, pp 295-291.
[9] Hall, B. H. (1987). The Relationship Between Firm Size and Firm Growth in the U.S. Manufacturing Sector. The Journal of Industrial Economics 35, pp 583-606.
[10] Ijiri, Y. and Simon, H. A. (1977). Skew Distributions and the Size of Business Firms. North-Holland Publishing, Amsterdam.
[11] Ishikawa, A. (2006). Annual Change of Pareto Index Dynamically Deduced from the Law of Detailed Quasi-balance. Physica A 371, pp 525-535.
[12] Ishikawa, A. (2007). The Uniqueness of Firm Size Distribution Function from Tent-shaped Growth Rate Distribution. Physica A 383/1, pp 79-84.
[13] Mandelbrot, B. B. (1963). The Variation of Certain Speculative Prices. Journal of Business 36, pp 394-419.
[14] Matia, K., Fu, D., Buldyrev, S. V., Pammolli, F., Riccaboni, M. and Stanley, H. E. (2004). Statistical Properties of Business Firms Structure and Growth. Europhys. Lett. 67, pp 498-503.
[15] McCulloch, J. H. (1996). Financial Applications of Stable Distributions, in G. S. Maddala, C. R. Rao. Handbook of Statistics, Vol. 14, Elsevier, pp 393-425.
[16] Sutton, J. (1997). Gibrat's legacy. Journal of Economic Literature 35 (1), pp 40-59.
[17] Thorin, O. (1977). On the of the . Scand. J., pp 31-40.
