
3rd International e-Conference on Optimization, Education and Data Mining in Science, Engineering and Risk Management 2013/2014 (OEDM SERM 2013/2014)

Testing Gaussian and Non-Gaussian Break Point Models: V4 Stock Markets

Michael Princ
Charles University
[email protected]

Abstract

In this study we analyze Gaussian and Non-Gaussian break point models on the stock markets of the V4 countries. We recommend Non-Gaussian models as more suitable for the detection of structural breaks. The best performing methods were the Mann-Whitney, Kolmogorov-Smirnov and Cramer-von-Mises models. In terms of stability, we identify the Polish stock market as the most stable, while the Slovak market can be regarded as the least stable. This suggests that a higher volume of trading is connected with higher market stability, so that markets with lower trading volumes are more prone to structural changes. While higher trading activity cannot fully protect against structural changes, our results indicate that it helps against general market instability even in Central European countries.

Keywords: Gaussian, non-Gaussian, V4 Stock Markets, Break point models.

I. INTRODUCTION

When we talk about structural breaks, we can distinguish an economic point of view, described by shifts in the economy that are usually connected with an economic crisis or economic integration. On the other hand, there is an econometric point of view, in which a break is identified by statistically significantly different coefficients or by changing volatility levels. The distinction can be understood as one between qualitative and quantitative changes. In this study we focus mostly on the econometric framework, applied to data sets of the V4 countries; the results can be used in further analyses in specific economic applications.

We focus on two major classes of structural break point models, Gaussian and Non-Gaussian. That is, we study Gaussian models, which are based on the assumption of normality, and Non-Gaussian models, which are more complex and do not need this assumption. The more general framework for testing Non-Gaussian data sequences can assume that the distribution of the data series is unknown. We test whether and how this condition influences the performance of the proposed tests. We also give an overview of test statistics and models that can determine break points in data samples. A further goal is to characterize the analyzed V4 stock markets.

II. LITERATURE REVIEW

In the analysis of time series, the detection of structural changes in regression relationships has been a focus of various econometric studies. We can distinguish many types of tests for structural changes, e.g. the Student test (e.g. in Hawkins et al. (2003)), the Mood test (e.g. in Mood (1954)), the Mann-Whitney test (e.g. in Mann-Whitney (1947)), the Lepage test (e.g. in Lepage (1971)) and many other tests devoted to the testing of fluctuations. Numerous studies have confirmed that a break point can be determined by “endogenous” data analysis, see Zivot-Andrews (1992), Perron (1997) or Harvey et al. (2001). This supports our idea of identifying structural break points in real data sets. Good studies that combine several structural tests are Ross et al. (2011) and Ross-Adams (2012).

© Publishing House Curriculum. ISBN 978-80-87894-01-9 379

III. STRUCTURAL CHANGE MODELS

These models can be very helpful in times of sudden structural change. The best example is the onset of a crisis, when the market switches from one regime to another. The opposite event is another example: a crisis phases out and the situation stabilizes. More importantly, we would like to identify the most suitable econometric tests, which can reliably identify structural changes. When we talk about structural breaks, the basic alternative can be defined as the existence of different parameters in different parts of the data series, in the following way:

$$\beta_i = \begin{cases} \beta_0, & 1 \le i \le i_0 \\ \beta_A, & i_0 < i \le n \end{cases} \qquad \beta_A \ne \beta_0$$

We can distinguish break-point models by whether they identify a single change-point or multiple change-points in the process. Simple tests identify only a single structural point, so that the data sample is divided into just two subsamples, while more sophisticated test statistics can identify multiple structural break points and divide the initial data sample into a varying number of subsamples. We devote our analysis solely to the more complex models that detect multiple points, where the number of subsamples is unknown prior to computation.
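The difference between single and multiple change-point detection can be illustrated with a short sketch. It assumes binary segmentation, a generic recursive scheme chosen here only for illustration; it is not necessarily the procedure used in this study. The function names, the toy series and the threshold are hypothetical.

```python
from statistics import mean


def best_split(xs, stat, min_size=2):
    """Return (k, value) maximising stat(xs[:k], xs[k:]) over admissible k."""
    best_k, best_val = None, float("-inf")
    for k in range(min_size, len(xs) - min_size + 1):
        val = stat(xs[:k], xs[k:])
        if val > best_val:
            best_k, best_val = k, val
    return best_k, best_val


def find_breaks(xs, stat, threshold, min_size=2, offset=0):
    """Binary segmentation: split recursively while the statistic exceeds threshold."""
    if len(xs) < 2 * min_size:
        return []
    k, val = best_split(xs, stat, min_size)
    if k is None or val <= threshold:
        return []
    left = find_breaks(xs[:k], stat, threshold, min_size, offset)
    right = find_breaks(xs[k:], stat, threshold, min_size, offset + k)
    return left + [offset + k] + right


def abs_mean_diff(a, b):
    """Toy two-sample statistic: absolute difference of subsample means."""
    return abs(mean(a) - mean(b))


# Hypothetical series with three regimes (levels near 0, 5 and 10).
series = [0.0, 0.1, -0.1, 0.0, 5.0, 5.1, 4.9, 5.0, 10.0, 10.1, 9.9, 10.0]
breaks = find_breaks(series, abs_mean_diff, threshold=2.0)
print(breaks)
```

A single-break test corresponds to one call of `best_split`; the recursion is what allows the number of subsamples to remain unknown in advance. Any of the two-sample statistics discussed in the paper could in principle be passed as `stat`, with a threshold suited to that statistic.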

3.1 Gaussian Sequence Models

In this part we introduce the change-point models that are based on the assumption that the examined data series is a Gaussian sequence. In the following chapters we test how the power of the tests is affected when this assumption is not fulfilled.

3.1.1 Student t-test

We use a more complex variant of the well-known Student t-test, which was extended by Hawkins et al. (2003) in order to detect multiple change-points. The test is constructed to detect changes in the mean.
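As a rough illustration of how a t-statistic can be used to locate a break, the sketch below evaluates the pooled two-sample Student t statistic at every admissible split point and keeps the split with the largest absolute value. This is a simplified batch version written under our own assumptions; the control limits and the sequential updating of Hawkins et al. (2003) are not reproduced, and the function names and the toy series are hypothetical.

```python
from math import sqrt
from statistics import mean


def pooled_t(a, b):
    """Two-sample Student t statistic with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled_var = ss / (na + nb - 2)
    return (ma - mb) / sqrt(pooled_var * (1 / na + 1 / nb))


def max_abs_t(xs, min_size=2):
    """Return (k, |t|) for the split xs[:k] | xs[k:] with the largest |t|."""
    candidates = (
        (k, abs(pooled_t(xs[:k], xs[k:])))
        for k in range(min_size, len(xs) - min_size + 1)
    )
    return max(candidates, key=lambda pair: pair[1])


# Hypothetical series whose mean shifts from about 1 to about 3 after 5 points.
series = [1.0, 1.2, 0.8, 1.1, 0.9, 3.0, 3.2, 2.8, 3.1, 2.9]
k_hat, t_max = max_abs_t(series)
print(k_hat, round(t_max, 2))
```

In a real application the maximised statistic would be compared with a critical value that accounts for the search over all split points, not with the usual t quantile.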

3.1.2 Bartlett test

We use the Bartlett test as defined in Snedecor-Cochran (1989) and Hawkins-Zamba (2005), which is mostly used for the detection of changes in the variance of a Gaussian sequence. The test statistic is:

$$X^2 = \frac{(N-k)\ln(S_p^2) - \sum_{i=1}^{k}(n_i-1)\ln(S_i^2)}{1 + \frac{1}{3(k-1)}\left(\sum_{i=1}^{k}\frac{1}{n_i-1} - \frac{1}{N-k}\right)}$$

where N is the total number of observations, k is the number of groups, n_i and S_i^2 are the size and the sample variance of group i, and S_p^2 is the pooled variance.

3.1.3 GLR test

The Generalized Likelihood Ratio test statistic can be defined as:

$$\mathrm{GLR}_{k,n} = k \log\frac{S_{0,n}}{S_{0,k}} + (n-k)\log\frac{S_{0,n}}{S_{k,n}}$$

where $S_{i,j} = V_{i,j}/(j-i)$ is defined as the maximum likelihood estimator of the variance of the sequence of data. For further details we refer to Hawkins-Zamba (2005b). The test is used to detect changes in both the mean and the variance of a Gaussian sequence of data.
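The two Gaussian statistics can be sketched in a few lines. The code below assumes the textbook form of Bartlett's statistic and a GLR statistic built from maximum likelihood variance estimates of the segments before and after a candidate split k; in particular, taking the within-segment sum of squared deviations divided by the segment length as that estimate is our assumption. The function names and the toy data are hypothetical.

```python
from math import log
from statistics import mean, variance


def bartlett_statistic(groups):
    """Bartlett's statistic for equality of variances across k groups."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    variances = [variance(g) for g in groups]  # sample variances S_i^2
    pooled = sum((n - 1) * s for n, s in zip(sizes, variances)) / (total - k)
    numerator = (total - k) * log(pooled) - sum(
        (n - 1) * log(s) for n, s in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (total - k)) / (3 * (k - 1))
    return numerator / correction


def glr_statistic(xs, k):
    """GLR statistic for a split after the first k observations."""

    def ml_variance(i, j):
        # Assumed ML variance estimate of xs[i:j]: squared deviations / length.
        segment = xs[i:j]
        m = mean(segment)
        return sum((x - m) ** 2 for x in segment) / (j - i)

    n = len(xs)
    s_all = ml_variance(0, n)
    return k * log(s_all / ml_variance(0, k)) + (n - k) * log(s_all / ml_variance(k, n))


equal_var = bartlett_statistic([[1, 2, 3], [11, 12, 13]])
unequal_var = bartlett_statistic([[1, 2, 3], [10, 20, 30]])
glr = glr_statistic([0, 1, 0, 1, 10, 11, 10, 11], k=4)
print(equal_var, unequal_var, glr)
```

Bartlett's statistic is zero when all group variances coincide and grows as they diverge, while the GLR statistic also reacts to a shift in level, as in the toy series used here.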

3.2 Non-Gaussian Sequence Models

None of the following tests requires the assumption of normality of the analyzed time series. In the following chapters we test whether this more general framework has a positive effect on the reliability of the proposed tests.

3.2.1 Mood test

The Mood test statistic measures the extent to which the rank of each point deviates from its expected value; for further details we refer to Mood (1954). The test statistic can be defined as follows:

$$M = \sum_{x_i \in S} \left( r(x_i) - \frac{n+1}{2} \right)^2$$

where r(x_i) is the rank of observation x_i in the pooled sample of n observations and S is the examined subsample.

3.2.2 Mann-Whitney test

The Mann-Whitney U test was proposed in Mann-Whitney (1947). We compute the U statistic as a sum of the test statistics for the two individual samples:

$$U = U_1 + U_2 = R_1 - \frac{n_1(n_1+1)}{2} + R_2 - \frac{n_2(n_2+1)}{2}$$

where R_1 and R_2 are the rank sums and n_1 and n_2 the sizes of the two samples.

3.2.3 Lepage test

The Lepage test, proposed in Lepage (1971), can be defined as a sum of the squared Mann-Whitney and Mood statistics. It is intended as a synergic approach that combines the benefits of both preceding test statistics. The general form of the test is:

$$L = U^2 + M^2$$

3.2.4 Cramer-von-Mises

The Cramer-von-Mises test statistic is inspired by Anderson (1962) and has the following form:

$$W_t = \max_k \frac{W_{k,t} - \mu_{W_{k,t}}}{\sigma_{W_{k,t}}}, \qquad 1 < k < t$$

The form of the test is the same as in Ross-Adams (2012).

3.2.5 Kolmogorov-Smirnov

The test has the form

$$p_{k,t} = Q\left( D_{k,t} \sqrt{\frac{n_1 n_2}{n_1 + n_2}} \right)$$

as in Feller (1948) and Ross-Adams (2012).

For further information about the test statistics above and their exact application in parametric and non-parametric forms we refer to Ross et al. (2011) and Ross-Adams (2012).

IV. DATA SAMPLE


We analyze four stock indexes: the BUX index traded in Hungary, the PX index traded in the Czech Republic, the SAX index traded in Slovakia and the WIG20 index traded in Poland. Our data sample starts on 7th September 1993¹ and ends on 28th February 2014. We analyze the properties of the subsamples defined by the identified structural breaks.

V. PERFORMANCE ANALYSIS

In this part we determine how the performance of structural break models can be compared. We address the following questions. Can we confirm that the more general non-Gaussian tests lead to improved identification of structural changes in data samples? Is there a relationship between a higher number of identified structural break points and improved consistency of the identified subsamples? Can we identify strengths or weaknesses of the proposed test statistics in real data samples?

5.1 Break-points

In this part we describe the main features of the identified break points and the resulting subsamples. Table 1 shows how many break points each test identified in each market. It reveals which markets are more or less stable, in the sense of having fewer or more structural breaks, and which structural break models tend to identify more or fewer break points in general.

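Table 1 - Number of identified break-points

| Index | Student | Bartlett | GLR | Mann-Whitney | Mood | Lepage | Kolmogorov-Smirnov | Cramer-von-Mises | Total |
|---|---|---|---|---|---|---|---|---|---|
| BUX | 60 | 95 | 72 | 33 | 79 | 72 | 40 | 45 | 496 |
| PX | 73 | 123 | 103 | 48 | 77 | 77 | 59 | 44 | 604 |
| SAX | 58 | 193 | 163 | 20 | 67 | 66 | 46 | 35 | 648 |
| WIG20 | 47 | 71 | 55 | 20 | 54 | 48 | 32 | 21 | 348 |
| Average | 59.5 | 120.5 | 98.25 | 30.25 | 69.25 | 65.75 | 44.25 | 36.25 | 524 |

Average of the group: 92.75 for the Gaussian models (Student, Bartlett, GLR) and 49.15 for the Non-Gaussian models (Mann-Whitney, Mood, Lepage, Kolmogorov-Smirnov, Cramer-von-Mises).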

Table 1 offers some interesting findings. The lowest number of break points was identified for the Polish market, which indicates that this market is the most stable. On the other hand, the most change-points were identified for the Slovak market; given that the Slovak data sample was also the shortest, we can regard the Slovak market as the least stable. Based on Table 1 we can also conclude that Gaussian sequence models usually identify more structural break points (on average almost twice as many). Whether this is a good sign or an adverse effect is examined in the further analysis; ex ante it is not possible to judge whether the data samples include more or fewer break points. Since trading volume is much higher on the Polish market than in Slovakia, the results suggest that higher trading activity improves the general stability of stock markets.

¹ With the exception of the SAX data sample, which starts on 3rd July 1995.
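The summary figures quoted from Table 1 can be reproduced with a few lines of arithmetic. The sketch below copies the break-point counts from Table 1; the variable names are ours, and the grouping of the first three tests as Gaussian follows Section 3.1.

```python
# Break-point counts from Table 1, in the column order of the table.
tests = ["Student", "Bartlett", "GLR", "Mann-Whitney", "Mood",
         "Lepage", "Kolmogorov-Smirnov", "Cramer-von-Mises"]
table1 = {
    "BUX":   [60, 95, 72, 33, 79, 72, 40, 45],
    "PX":    [73, 123, 103, 48, 77, 77, 59, 44],
    "SAX":   [58, 193, 163, 20, 67, 66, 46, 35],
    "WIG20": [47, 71, 55, 20, 54, 48, 32, 21],
}

# Total number of break points per market.
totals = {market: sum(counts) for market, counts in table1.items()}

# Average number of break points per test across the four markets.
per_test_avg = [
    sum(table1[market][j] for market in table1) / len(table1)
    for j in range(len(tests))
]

# The first three tests are the Gaussian models, the remaining five are not.
gaussian_avg = sum(per_test_avg[:3]) / 3
non_gaussian_avg = sum(per_test_avg[3:]) / 5
ratio = gaussian_avg / non_gaussian_avg

print(totals)
print(round(gaussian_avg, 2), round(non_gaussian_avg, 2), round(ratio, 2))
```

The ratio of the two group averages is about 1.89, which is the basis for the statement that Gaussian models find almost twice as many break points.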

5.2 Consistency of subsamples

We use three basic properties to determine whether the subsamples are consistent, that is, precisely divided by important structural breaks. The first property is normality: we check whether the identified subsamples are normally distributed. We then determine whether neighboring subsamples differ statistically significantly in the level of the mean or the variance. For normality we use the Shapiro-Wilk test statistic defined in Shapiro-Wilk (1965), which tests the hypothesis that the data sample is normally distributed. Differences in the mean are assessed with the Welch t-statistic as in Welch (1947), which tests the hypothesis that the data samples have the same mean. Finally, the F-test proposed in Fisher (1936) tests the hypothesis that the ratio of the variances of the examined data samples is equal to 1.

We show how successful the proposed tests are in determining break points such that the new subsample has the desirable properties: it is normally distributed² and it marks a structural change. If neighboring subsamples have different levels of mean or variance, the break point was identified successfully, because every break point should mark a statistically significant change in structure. It is therefore desirable that the Shapiro-Wilk hypothesis is not rejected, so that the subsample is identified as normally distributed, and that the Welch and Fisher hypotheses are rejected in the comparison of neighboring subsamples, so that the break point marks a statistically significant change.
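The two comparisons of neighboring subsamples can be sketched with the standard library alone. The code below computes the Welch t statistic with its Welch-Satterthwaite degrees of freedom and the variance ratio used by the F-test; it stops short of p-values, which require the t and F distributions. The function names and the toy subsamples are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the first mean
    vb = variance(b) / len(b)  # squared standard error of the second mean
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def variance_ratio(a, b):
    """F statistic: ratio of the two sample variances (equal to 1 under H0)."""
    return variance(a) / variance(b)


# Two hypothetical neighboring subsamples.
left = [1, 2, 3, 4, 5]
right = [2, 4, 6, 8, 10]

t_stat, dof = welch_t(left, right)
f_stat = variance_ratio(left, right)
print(round(t_stat, 3), round(dof, 3), f_stat)
```

The Shapiro-Wilk statistic is omitted because it relies on tabulated coefficients. In practice a statistics library would supply all three p-values, for instance SciPy's `scipy.stats.shapiro` and `scipy.stats.ttest_ind` with `equal_var=False`.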

Table 2 - Rate of success – BUX index

| Test and level | Student | Bartlett | GLR | Mann-Whitney | Mood | Lepage | Kolmogorov-Smirnov | Cramer-von-Mises |
|---|---|---|---|---|---|---|---|---|
| S-W 10% | 40.00% | 76.84% | 66.67% | 45.45% | 68.35% | 51.39% | 42.50% | 40.00% |
| S-W 5% | 40.00% | 82.11% | 76.39% | 48.48% | 72.15% | 56.94% | 47.50% | 44.44% |
| S-W 1% | 46.67% | 92.63% | 83.33% | 57.58% | 84.81% | 73.61% | 57.50% | 57.78% |
| T-test 10% | 35.00% | 11.58% | 31.94% | 84.85% | 6.33% | 37.50% | 82.50% | 86.67% |
| T-test 5% | 33.33% | 10.53% | 25.00% | 84.85% | 2.53% | 36.11% | 75.00% | 80.00% |
| T-test 1% | 23.33% | 2.11% | 15.28% | 51.52% | 1.27% | 25.00% | 55.00% | 44.44% |
| F-test 10% | 21.67% | 89.47% | 72.22% | 57.58% | 92.41% | 59.72% | 55.00% | 60.00% |
| F-test 5% | 20.00% | 83.16% | 72.22% | 54.55% | 89.87% | 56.94% | 52.50% | 51.11% |
| F-test 1% | 15.00% | 81.05% | 70.83% | 39.39% | 84.81% | 51.39% | 42.50% | 44.44% |

² We tested all data series and concluded that none of the market indexes is normally distributed; the Shapiro-Wilk hypothesis was rejected in all cases.

Table 3 - Rate of success – PX index

| Test and level | Student | Bartlett | GLR | Mann-Whitney | Mood | Lepage | Kolmogorov-Smirnov | Cramer-von-Mises |
|---|---|---|---|---|---|---|---|---|
| S-W 10% | 31.51% | 73.17% | 62.14% | 54.17% | 67.53% | 59.74% | 59.32% | 50.00% |
| S-W 5% | 38.36% | 78.05% | 70.87% | 60.42% | 71.43% | 64.94% | 66.10% | 56.82% |
| S-W 1% | 52.05% | 83.74% | 81.55% | 64.58% | 81.82% | 77.92% | 74.58% | 63.64% |
| T-test 10% | 34.25% | 15.45% | 27.18% | 89.58% | 7.79% | 41.56% | 74.58% | 84.09% |
| T-test 5% | 31.51% | 7.32% | 19.42% | 89.58% | 5.19% | 36.36% | 72.88% | 84.09% |
| T-test 1% | 20.55% | 3.25% | 10.68% | 62.50% | 2.60% | 22.08% | 52.54% | 52.27% |
| F-test 10% | 26.03% | 75.61% | 68.93% | 56.25% | 85.71% | 68.83% | 57.63% | 54.55% |
| F-test 5% | 23.29% | 73.17% | 67.96% | 45.83% | 85.71% | 67.53% | 50.85% | 45.45% |
| F-test 1% | 16.44% | 65.04% | 58.25% | 37.50% | 80.52% | 53.25% | 44.07% | 40.91% |

Table 4 - Rate of success – SAX index

| Test and level | Student | Bartlett | GLR | Mann-Whitney | Mood | Lepage | Kolmogorov-Smirnov | Cramer-von-Mises |
|---|---|---|---|---|---|---|---|---|
| S-W 10% | 12.07% | 29.02% | 23.31% | 30.00% | 20.90% | 24.24% | 19.57% | 20.00% |
| S-W 5% | 15.52% | 34.20% | 29.45% | 35.00% | 26.87% | 24.24% | 21.74% | 25.71% |
| S-W 1% | 15.52% | 47.67% | 37.42% | 45.00% | 35.82% | 34.85% | 30.43% | 34.29% |
| T-test 10% | 24.14% | 3.63% | 6.13% | 90.00% | 14.93% | 31.82% | 43.48% | 51.43% |
| T-test 5% | 22.41% | 3.11% | 3.68% | 90.00% | 7.46% | 25.76% | 30.43% | 51.43% |
| T-test 1% | 13.79% | 1.55% | 1.84% | 60.00% | 1.49% | 15.15% | 17.39% | 31.43% |
| F-test 10% | 15.52% | 50.78% | 44.17% | 40.00% | 68.66% | 46.97% | 56.52% | 45.71% |
| F-test 5% | 15.52% | 49.22% | 42.94% | 30.00% | 59.70% | 45.45% | 52.17% | 42.86% |
| F-test 1% | 12.07% | 44.56% | 39.88% | 30.00% | 55.22% | 36.36% | 45.65% | 34.29% |


Table 5 - Rate of success – WIG20 index

| Test and level | Student | Bartlett | GLR | Mann-Whitney | Mood | Lepage | Kolmogorov-Smirnov | Cramer-von-Mises |
|---|---|---|---|---|---|---|---|---|
| S-W 10% | 40.43% | 69.01% | 60.00% | 60.00% | 62.96% | 58.33% | 43.75% | 42.86% |
| S-W 5% | 46.81% | 83.10% | 61.82% | 60.00% | 72.22% | 62.50% | 43.75% | 47.62% |
| S-W 1% | 55.32% | 88.73% | 80.00% | 75.00% | 85.19% | 77.08% | 62.50% | 61.90% |
| T-test 10% | 40.43% | 11.27% | 14.55% | 95.00% | 9.26% | 29.17% | 84.38% | 76.19% |
| T-test 5% | 38.30% | 5.63% | 9.09% | 95.00% | 5.56% | 20.83% | 68.75% | 76.19% |
| T-test 1% | 25.53% | 2.82% | 3.64% | 65.00% | 0.00% | 16.67% | 40.63% | 71.43% |
| F-test 10% | 19.15% | 83.10% | 74.55% | 30.00% | 92.59% | 70.83% | 59.38% | 61.90% |
| F-test 5% | 17.02% | 81.69% | 72.73% | 25.00% | 85.19% | 64.58% | 50.00% | 42.86% |
| F-test 1% | 17.02% | 77.46% | 70.91% | 20.00% | 81.48% | 58.33% | 34.38% | 33.33% |

We determine the degree of success by rejecting or not rejecting the stated hypotheses at fixed levels of significance (1%, 5%, 10%). Tables 2 to 5 show the rate of success for the individual data samples. Table 6 presents an overview based on all analyzed data samples.
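The rate of success can be read as a simple share of favorable test outcomes. The sketch below is our interpretation of that description: for the Shapiro-Wilk test a success is a hypothesis that is not rejected, while for the Welch and F tests a success is a rejection. The function name and the p-values are hypothetical and serve only as an illustration.

```python
def success_rate(p_values, alpha, want_rejection):
    """Share of tests whose outcome at level alpha matches the desired one.

    want_rejection=True  counts rejections (Welch t-test, F-test);
    want_rejection=False counts non-rejections (Shapiro-Wilk).
    """
    hits = sum((p < alpha) == want_rejection for p in p_values)
    return hits / len(p_values)


# Hypothetical p-values, one per break point or subsample.
p_values = [0.001, 0.03, 0.07, 0.5]

for alpha in (0.10, 0.05, 0.01):
    rejected = success_rate(p_values, alpha, want_rejection=True)
    kept = success_rate(p_values, alpha, want_rejection=False)
    print(f"alpha={alpha:.2f}  rejection rate={rejected:.2%}  non-rejection rate={kept:.2%}")
```

With this reading, a stricter significance level lowers the success rate of the mean and variance tests and raises that of the normality test, which matches the pattern visible in Tables 2 to 5.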

Table 6 – Average rate of success – V4 indexes

| Test | Student | Bartlett | GLR | Mann-Whitney | Mood | Lepage | Kolmogorov-Smirnov | Cramer-von-Mises |
|---|---|---|---|---|---|---|---|---|
| S-W | 36.19% | 69.86% | 61.08% | 52.97% | 62.50% | 55.48% | 47.44% | 45.42% |
| T-test | 28.55% | 6.52% | 14.04% | 79.82% | 5.37% | 28.17% | 58.13% | 65.81% |
| F-test | 12.07% | 44.56% | 39.88% | 30.00% | 55.22% | 36.36% | 45.65% | 34.29% |
| Average | 25.60% | 40.31% | 38.33% | 54.27% | 41.03% | 40.00% | 50.41% | 48.50% |

We find that no model outperforms all the others; a precise result in one respect always comes with drawbacks in another. On average, the best performing models are Mann-Whitney, Kolmogorov-Smirnov and Cramer-von-Mises. On the other hand, the structural test based on the Student t statistic appears outdated and achieved only mediocre results at best.
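The ranking of the models follows from the average of the three criteria reported in Table 6. The sketch below takes the S-W, T-test and F-test rows exactly as printed in Table 6, averages them per model and sorts the result; the variable names are ours, and small differences in the second decimal place against the printed Average row are due to rounding of the inputs.

```python
# (S-W, T-test, F-test) average success rates in percent, as printed in Table 6.
table6 = {
    "Student":            (36.19, 28.55, 12.07),
    "Bartlett":           (69.86, 6.52, 44.56),
    "GLR":                (61.08, 14.04, 39.88),
    "Mann-Whitney":       (52.97, 79.82, 30.00),
    "Mood":               (62.50, 5.37, 55.22),
    "Lepage":             (55.48, 28.17, 36.36),
    "Kolmogorov-Smirnov": (47.44, 58.13, 45.65),
    "Cramer-von-Mises":   (45.42, 65.81, 34.29),
}

# Unweighted mean of the three criteria for each model.
averages = {name: sum(rates) / len(rates) for name, rates in table6.items()}

# Models ordered from the highest to the lowest average rate of success.
ranking = sorted(averages, key=averages.get, reverse=True)
top_three = ranking[:3]

for name in ranking:
    print(f"{name:20s} {averages[name]:6.2f}%")
```

An unweighted mean treats normality, mean shift and variance shift as equally important; a different weighting of the three criteria could change the order of the models in the middle of the ranking.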

VI. CONCLUSION

We conclude that none of the proposed test statistics was able to outperform all the others; most models excelled in one respect but performed worse in the remaining ones. While the Gaussian sequence models were not precise in detecting structural break points that separate different levels of the mean, they performed better in detecting normally distributed subsamples, which is a rather surprising result. Overall, we recommend the Non-Gaussian test statistics as the most versatile tools for structural break modelling. They identified a lower number of structural break points with sufficient precision. Non-Gaussian sequence models thus outperformed the simpler Gaussian models. In some respects Gaussian models performed on par with the more complex models, but at the cost of a substantially higher number of break points. We recommend the Mann-Whitney, Kolmogorov-Smirnov and Cramer-von-Mises models as the most suitable structural tests for the analyzed data series.

REFERENCES

[1] ANDERSON, T. W., (1962), On the distribution of the two sample Cramer-von Mises Criterion, Annals of Mathematical Statistics, vol. 33, pp. 1148-1159
[2] FELLER, W. E., (1948), On the Kolmogorov-Smirnov Limit Theorems for Empirical Distributions, The Annals of Mathematical Statistics, vol. 21 (2), pp. 301-302
[3] FISHER, R. A., (1936), The use of multiple measurements in taxonomic problems, Annals of Eugenics, vol. 7, pp. 179-188
[4] HARVEY, D. I., LEYBOURNE, S. J., NEWBOLD, P., (2001), Innovational outlier tests with an endogenously determined break in level, Oxford Bulletin of Economics and Statistics, vol. 63, pp. 559-575
[5] HAWKINS, D., QIU, P., KANG, C., (2003), The Changepoint Model for Statistical Process Control, Journal of Quality Technology, vol. 35, pp. 355-366
[6] HAWKINS, D., ZAMBA, K., (2005), A Change-Point Model for a Shift in Variance, Journal of Quality Technology, vol. 37, pp. 21-31
[7] HAWKINS, D., ZAMBA, K., (2005b), Statistical Process Control for Shifts in Mean or Variance Using a Changepoint Formulation, Technometrics, vol. 47 (2), pp. 164-173
[8] LEPAGE, Y., (1971), Combination of Wilcoxon's and Ansari-Bradley Statistics, Biometrika, vol. 58, pp. 213-217
[9] MANN, H. B., WHITNEY, D. R., (1947), On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other, Annals of Mathematical Statistics, vol. 18 (1), pp. 50-60
[10] PERRON, P., (1997), Further Evidence on Breaking Trend Functions in Macroeconomic Variables, Journal of Econometrics, vol. 80 (2), pp. 355-385
[11] ROSS, G. J., TASOULIS, D. K., ADAMS, N. M., (2011), A Nonparametric Change-Point Model for Streaming Data, Technometrics, vol. 53 (4)
[12] ROSS, G. J., ADAMS, N. M., (2012), Two Nonparametric Control Charts for Detecting Arbitrary Distribution Changes, Journal of Quality Technology
[13] SHAPIRO, S. S., WILK, M. B., (1965), An Analysis of Variance Test for Normality (Complete Samples), Biometrika, vol. 52 (3/4), pp. 591-611
[14] SNEDECOR, G. W., COCHRAN, W. G., (1989), Statistical Methods, Eighth Edition, Iowa State University Press
[15] WELCH, B. L., (1947), The generalization of "Student's" problem when several different population variances are involved, Biometrika, vol. 34 (1-2), pp. 28-35
[16] ZIVOT, E., ANDREWS, D. W. K., (1992), Further Evidence on the Great Crash, the Oil-Price Shock, and the Unit-Root Hypothesis, Journal of Business & Economic Statistics, vol. 10 (3), pp. 251-270
