Thesis Reference
Robust methods for personal income distribution models
VICTORIA-FESER, Maria-Pia

Reference
VICTORIA-FESER, Maria-Pia. Robust methods for personal income distribution models. Thèse de doctorat : Univ. Genève, 1993, no. SES 384
URN : urn:nbn:ch:unige-64509
DOI : 10.13097/archive-ouverte/unige:6450
Available at: http://archive-ouverte.unige.ch/unige:6450

Robust Methods for Personal Income Distribution Models
Maria-Pia Victoria Feser
Submitted for the degree of Ph.D in Econometrics and Statistics
Faculty of Economic and Social Sciences
University of Geneva, Switzerland
Accepted on the recommendation of Dr. A.C. Atkinson, professor, London, Dr. P. Balestra, professor, Geneva, Dr. U. Kohli, professor, Geneva, Dr. E. Ronchetti, professor, Geneva, supervisor, Dr. P. Rousseeuw, professor, Brussels.
Thesis No. 384
May 1993

To Johannes, with love.

Abstract

In the present thesis, robust statistical techniques are applied and developed for the economic problem of the analysis of personal income distributions and inequality measures. We follow the approach based on influence functions in order to develop robust estimators for the parametric models describing personal income distributions when the data are censored and when they are grouped. We also build a robust procedure for a test of choice between two models and analyse the robustness properties of goodness-of-fit tests. The link between economic and robustness properties is studied through the analysis of inequality measures.

We begin our discussion by presenting the economic framework from which the statistical developments are made, namely the study of the personal income distribution and inequality measures. We then discuss the robust concepts that serve as a basis for the following steps and compute optimal bounded-influence estimators for different personal income distribution models when the data are continuous and complete. In a third step, we study the case of censored data and propose a generalization of the EM algorithm with robust estimators. For grouped data, Hampel's theorem is extended in order to build optimally bounded-influence estimators. We then focus on tests for model choice and develop a robust generalized Cox-type statistic. We also analyse the robustness properties of a wide class of goodness-of-fit statistics by computing their level influence functions. Finally, we study the robustness properties of inequality measures and relate our findings to some economic properties these measures should fulfil.

Our motivation for the development of these new robust procedures comes from our interest in the field of income distribution and inequality measurement.
However, it should be stressed that the new estimators and test procedures we propose do not apply only in this particular field: they can be used in or extended to any parametric problem in which density estimation, incomplete information, grouped or discrete data, model choice, goodness-of-fit, or concentration index is one of the key words.

Résumé

In this thesis, we develop and apply techniques from robust statistics to the economic problem of analysing the personal income distribution and inequality measures. We use the approach based on influence functions to develop robust estimators for the parametric models describing the personal income distribution when the data are censored and when they are grouped. We also build robust procedures for testing the choice between two models and analyse the robustness properties of goodness-of-fit tests. The link between certain economic properties and robustness properties is studied by means of inequality measures.

We begin our discussion by presenting the economic framework in which we work, namely the study of the personal income distribution and the associated inequality measures. We then set out the concepts of robust statistics that are used later and compute optimal bounded-influence estimators for different personal income distribution models when the data are continuous and complete, simulated or real. In a third step, we study the case of censored data and propose a generalization of the EM algorithm with robust estimators. Hampel's theorem is then extended to the case of grouped data, and optimally bounded-influence robust estimators are proposed. Next, we focus on model choice procedures and develop a robust Cox-type test statistic. We also analyse the robustness properties of a wide class of goodness-of-fit test statistics by computing the corresponding level influence functions. Finally, we study the robustness properties of inequality measures in relation to the economic properties that these measures should satisfy.

The development of new estimators and new test procedures was motivated by our interest in the problem of studying personal income distributions and inequality measures. However, it is worth emphasizing that the new estimators and test procedures we propose are not applicable only in this particular field. Indeed, they can be applied or extended to parametric problems in which terms such as density estimation, incomplete information, grouped or discrete data, model choice, goodness-of-fit tests, and concentration indices are key words.

Acknowledgement

I would like to express my gratitude to Prof. E. Ronchetti for his valuable suggestions and his generous guidance throughout the course of this research. His encouragement as an expert and as a friend has made this work possible. I am also grateful to Prof. A. C. Atkinson and Dr. F. Cowell for their support during my research at the London School of Economics and to Prof. P. Balestra, Prof. U. Kohli and Prof. P. Rousseeuw for their comments during the defense.

My thanks also go to my friends and colleagues of the Faculty of Economic and Social Sciences of the University of Geneva for their stimulating discussions and their moral support, especially to S. Héritier for his helpful comments during the preparation of the defense.
Finally, I would like to express my grateful thanks to my parents, for their love, encouragement and support during most of my student life.

Contents

1 Introduction
2 Income Distribution and Inequality
  2.1 Introduction
  2.2 The generation and distribution of income
  2.3 The Lorenz curve and analysis of inequality
    2.3.1 Definition and construction
    2.3.2 An ordering tool
  2.4 Income inequality measures
  2.5 Parametric models for income distributions
    2.5.1 Generating systems
    2.5.2 Properties of income distribution models
    2.5.3 Most commonly used models
  2.6 Statistical aspects of the analysis of income
3 OBRE with Complete Information
  3.1 Introduction
  3.2 Robustness concepts
    3.2.1 Definitions
    3.2.2 Continuity and qualitative robustness
    3.2.3 The influence function
    3.2.4 The influence function and robustness measures
    3.2.5 The breakdown point
  3.3 Optimal robust estimators
    3.3.1 Optimality results
    3.3.2 Computational aspects
    3.3.3 How to choose the bound c
  3.4 Application to two income distribution models
    3.4.1 Simulation results
    3.4.2 Application to real data
  3.5 Properties of robust estimators
    3.5.1 Efficiency
    3.5.2 Sensitivity
    3.5.3 Breakdown point
4 OBRE with Incomplete Information
  4.1 The EM algorithm
    4.1.1 Introduction
    4.1.2 Definition of the EM algorithm
    4.1.3