Modeling Covariance in Multi-Path Changepoint Problems

Masoud Asgharian Dastenaei
Department of Mathematics and Statistics
McGill University, Montreal

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements of the degree of Doctor of Philosophy

© Masoud Asgharian Dastenaei 1998

National Library of Canada, Acquisitions and Bibliographic Services, 395 Wellington Street, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats. The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

To the memory of my mother, Hagar, who constantly supported me up to the last day of her life and loved to see this moment, but it didn't come true.
And to my wife and dear friend Mojgan

RÉSUMÉ

Although it has been studied intensively in the single-path case, the changepoint problem has been largely ignored in the case of multiple paths. In the multi-path setting, it is often useful to determine the impact of covariates on the changepoint itself as well as on the parameters before and after it. This thesis addresses the inclusion of covariates in the distribution of the changepoint, an aspect that has never been studied before. The model we introduce is based on the hazard function of the change. It has features that make it possible to establish the asymptotic results needed for estimation and testing. Indeed, we establish the consistency of the maximum likelihood estimators of the parameters of our model. Since the proposed model is a mixture, two difficulties associated with such models must be overcome, namely identifiability and positive definiteness of the information matrix. It is established, under appropriate conditions, that the set of zeros of the determinant of the information matrix is nowhere dense, thereby compensating for the impossibility of a direct proof of positive definiteness. Using simulated annealing, we carried out some simulations to assess how workable our estimation procedure is. In the example treated, our estimator appears to follow an approximately normal distribution, even for samples of moderate size. The maximum likelihood estimators also appear to approximate their parameters well.

ABSTRACT

Although the single-path changepoint problem has been extensively treated in the statistical literature, the multi-path changepoint problem has been largely ignored. In the multi-path changepoint setting it is often of interest to assess the impact of covariates on the changepoint itself as well as on the parameters before and after the changepoint. This thesis is concerned with including covariates in the changepoint distribution, a topic never before addressed in the literature. The model we introduce, based on the hazard of change, enjoys features which allow one to establish asymptotic results needed for estimation and testing. Indeed, we establish consistency of the maximum likelihood estimators of the parameters of our model.
As the proposed model is a mixture model, two of the difficulties associated with such models are addressed: identifiability, and positive definiteness of the information matrix. It is shown that under suitable conditions the set of zeros of the determinant of the information matrix is a nowhere dense set, thus partially compensating for the impossibility of directly establishing positive definiteness. A limited simulation, using simulated annealing, is carried out to assess how the estimation procedure works in practice. In the example presented, the estimators appear to follow an approximately normal distribution even for moderate sample sizes. The maximum likelihood estimators appear to approximate their parameter counterparts well.

Chapter 4:
§3: Lemma 1, Lemma 2, Lemma 3 and Theorem 5, Lemma 5, Lemma 6, Proposition 4, Theorem 7, Lemma 7, and Theorem 8
§4: Lemma 8, Theorem 9 and Theorem 10
§3: Establishing asymptotic normality of the maximum likelihood estimators of the unknown parameters in the model introduced in Chapter 1

ACKNOWLEDGEMENT

David Wolfson has been much more than a supervisor for me. He has been a wise friend. His fastidiousness enormously improved the exposition of this thesis and his unusual patience gave me the chance to work on a variety of problems and enjoy learning new things. He and his wife, Tina Wolfson of the Division of Clinical Epidemiology at the Jewish General Hospital (JGH), provided me the chance to learn about aspects of statistics not covered in the classroom. Indeed, working at the Jewish General Hospital forced me to understand many things which I had never questioned before. It was at the JGH that I was given the chance to work on survival analysis, my favourite topic in statistics, and where I was introduced to the notion of length-biased sampling. For all this I would like to express my sincere gratitude.
I thank Sanjo Zlobec for interesting lectures on parametric programming and inspiration for working on an ongoing problem. Jal Choksi always gave me the most relevant references to my questions.

Among my friends I should start with Enrique Reyes, whom I tortured pitilessly with my questions on differential geometry. He introduced me to the book by Abraham, Marsden and Ratiu (1988), which turned out to be my main reference in Chapter 3 of this thesis. Luc Lalond was a source of computer skills from which I personally benefitted very much. When I was stuck for a long time with an error in my program, he devoted a considerable amount of time to finding the error, although he was very busy himself. Statistical discussion with my old friend Khalil Shafie has always been beneficial for me. I also benefitted very much from his computer skills. I would also like to thank Lassina Dembele, who helped me with the translation of the abstract of the thesis.

I am very grateful to the Ministry of Higher Education of Iran for supporting me through my education. I would also like to express my acknowledgment to the McGill Major Fellowship Foundation, who awarded me the "175th Anniversary of McGill University" fellowship.

When, for some baseless reason, I was labelled as somebody who had the right to continue his education neither abroad nor even in Iran, it was my Masters supervisor, Siamak Noorbaloochi, who helped me overcome this obstacle. Without him there would not be any thesis nor even an education towards a PhD. I am most indebted to him for all he did for me.

My siblings have always been a great source of encouragement and inspiration. I don't think there are any words that can express my real gratitude and acknowledgment to them. I am also very grateful to my father and mother-in-law, who helped me and my wife very much.

Contents

Chapter I. INTRODUCTION
Chapter II. HAZARD APPROACH IN THE MULTI-PATH CHANGE POINT
  1. Introduction
  2. Markovian Structure Of The Changepoint Problem
  3. Principle of Maximum Entropy and Modeling
    3.1. Synopsis Of History And Etymology Of The Word "Entropy"
    3.2. Entropy And Modeling
  4. Introducing Covariates into the Model
  5. Mixture Distributions
Chapter III. CONSISTENCY OF THE MLE
  1. Introduction
  2. Identifiability Of The Model
  3. Consistency In The Single Parameter Case
  4. Consistency In The Multiparameter Case For Identifiable Models
  5.