
Al-Azhar University - Gaza
Deanship of Postgraduate Studies
Faculty of Economics and Administrative Sciences
Department of Applied Statistics

On Discordance Tests for the Wrapped Cauchy Distribution

By
Moneb Mostafa Kulab

Supervised by
Dr. Ali H. Abu Zaid, Assistant Professor of Statistics, Al-Azhar University - Gaza
Dr. Mo'omen M. R. El-hanjouri, Assistant Professor of Statistics, Al-Azhar University - Gaza

A Thesis Submitted in Partial Fulfillment of Requirements for the Degree of M.Sc. of Applied Statistics

December 2014

Merit belongs only to the people of knowledge, for they are guides to right guidance for those who seek it.
A person's worth lies in what he does well, and the ignorant are enemies of the people of knowledge.
So pursue knowledge and seek no substitute for it, for people are dead while the people of knowledge are alive.
Ali ibn Abi Talib, may Allah be pleased with him

DECLARATION

I certify that this thesis is submitted for the Master degree as the result of my own research, except where otherwise acknowledged, and that this thesis (or any part of the same) has not been submitted for a higher degree to any university or institution.

Signed ............
Moneb Mostafa Kulab
Date: -----------------

ABSTRACT

Circular data arise frequently in many natural and physical sciences. Standard statistical techniques cannot be used to analyze circular data because of the circular geometry of the sample space. Like any other type of data, circular data are subject to contamination by unexpected observations, known as outliers. This study focuses on detecting outliers in circular data that follow the wrapped Cauchy distribution, which results from wrapping the Cauchy distribution. Four tests of discordancy for circular data are reviewed and extended to the wrapped Cauchy distribution.
The cut-off points of the four tests are obtained via a simulation study at three percentile levels. The power of performance of the discordancy tests for the wrapped Cauchy distribution is examined using three performance measures. The results show that the power of performance is an increasing function of the level of contamination and of the concentration parameter. An inverse relationship is observed between the power of performance and the sample size for the considered statistics, except the C statistic. In general, both the C and A statistics perform better than the other two statistics. For illustration purposes, we consider two real circular data sets, namely, the ants' direction data set and the wind direction data set.

ABSTRACT (translated from the Arabic)

Circular data appear frequently in many natural and physical sciences. Traditional statistical methods cannot be used to analyze circular data because of their circular sample space. Like any other type of data, circular data are subject to contamination by unexpected observations, known as outliers. This study focuses on detecting outliers in circular data that follow the wrapped Cauchy distribution, which results from wrapping the Cauchy probability distribution. Four discordancy tests for circular data were reviewed and then extended to the wrapped Cauchy distribution. Using a simulation study, the tabulated values of the four tests were found at three percentile levels, and their power of performance was examined using three measures of the power of discordancy tests. The results showed that the power of performance is an increasing function of the level of data contamination and of the level of concentration. An inverse relationship was also observed between the power of performance and the sample size for the four discordancy statistics, except the C statistic. In general, it was also found that the C and A tests performed better than the other two tests. The tests were applied to two real circular data sets, the ants' direction data set and the wind direction data set, for illustration.
DEDICATION

To my parents
To my brothers and sisters
To my friends
I dedicate this work

ACKNOWLEDGMENTS

Thanks to Allah, the Compassionate, the Merciful, for giving me the patience and strength to accomplish this research.

I would like to express my sincere gratitude to my advisory committee, Assistant Professor Dr. Ali Abu Zaid and Assistant Professor Dr. Mo'omen El-hanjouri, for their guidance and constructive advice, for the many hours they spent in discussion during the process, and for their support not only as mentors but also as good friends.

My gratitude also goes to my graduate committee members: Associate Professor Dr. Mahmoud Okasha, Associate Professor of Statistics Dr. Abdalla El-Habeel, Assistant Professor Dr. Mo'omen El-hanjouri and Assistant Professor Dr. Shadi Al-Tilbany. I would not have been able to achieve my learning in the same manner without their immense knowledge.

My deeply felt thanks go to my parents, brothers and sisters for their encouragement during the study. My full respect, love and appreciation go to all of you.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABBREVIATIONS AND SYMBOLS

CHAPTER ONE: INTRODUCTION
1.1 Background and study
1.2 Summary Statistics of Circular Data
1.2.1 Measures of location
1.2.2 Measures of concentration and dispersion
1.3 Circular Probability Distribution
1.4 The Problem of Outliers
1.5 Problem statement
1.6 Objectives
1.7 Methodology
1.8 Thesis Outline

CHAPTER TWO: LITERATURE REVIEW
2.1 Introduction
2.2 Outliers in Linear Data
2.2.1 Outliers in Univariate Linear Data
2.2.2 Outliers in Multivariate Linear Data
2.3 Outliers in Circular Data
2.3.1 Tests of Discordancy in Univariate Circular Data
2.3.2 Outliers in multivariate circular data
2.4 Wrapped Cauchy Distribution
2.4.1 Parameters Estimates
2.4.2 Characteristics of the Wrapped Cauchy Distribution
2.4.3 Applications of the Wrapped Cauchy Distribution
2.5 Summary

CHAPTER THREE: NUMERICAL STUDY
3.1 Introduction
3.2 The cut-off points for discordancy tests
3.3 Power of performance
3.4 Summary

CHAPTER FOUR: APPLICATIONS TO REAL DATA SETS
4.1 Introduction
4.2 Ants' Direction Data
4.2.1 Data description
4.2.2 Identification of outliers
4.3 Wind Data
4.3.1 Data description
4.3.2 Detection of outliers
4.4 Discussion

CHAPTER FIVE: CONCLUSIONS
5.1 Summary
5.2 Main conclusions
5.3 Further research

REFERENCES

APPENDICES
APPENDIX A.1: The Cut-off points for the tests of discordancy
APPENDIX A.2: R subroutine tests of discordancy in circular data
APPENDIX A.3: R subroutine for obtaining the Cut-off points for the tests of discordancy
APPENDIX A.4: R subroutine for power of performance in circular data
APPENDIX A.5: Power of Performance of Discordancy Tests

LIST OF TABLES

Table 4.1: Descriptive statistics of the ants' direction data
Table 4.2: Results of discordancy tests on ants' direction data
Table 4.3: Descriptive statistics of circular error of the wind data
Table 4.4: Results of discordancy tests on wind data

LIST OF FIGURES

Figure 1.1: Arithmetic and circular mean
Figure 2.1: A replot for the star data
Figure 2.2: Graphical presentation of data in (2.1)
Figure 2.3: Circular boxplot of the frogs directions
Figure 2.4: Circular plot for WC distribution with different concentration parameters
Figure 3.1: The cut-off 150 points for C statistic for different values of concentration parameter
Figure 3.2: The cut-off points for D statistic for different values of sample size
Figure 3.3: The cut-off points for M statistic for different values of sample size
Figure 3.4: The cut-off points for A statistic for different cases
Figure 3.5: Power of performance for all statistics when n = 50
Figure 3.6: Power of performance for all statistics when the concentration parameter equals 0.9
Figure 3.7: Relative performance of discordancy tests
Figure 3.8: The difference between P1 and P3 in some cases
Figure 4.1: Circular plot of the ants' direction data
Figure 4.2: Circular distance between the observed and predicted values
Figure 4.3: Circular plot of circular error of the wind data

ABBREVIATIONS AND SYMBOLS

Abbreviation    Full Word
R    Mean resultant length.
Sample mean direction.
Sample median direction.
V    Sample circular variance.
v    Sample circular standard deviation.
Concentration parameter.
WC(,) The wrapped Cauchy