Projecting Pakistan Population with a Bayesian Hierarchical Approach
Total Page:16
File Type:pdf, Size:1020Kb
Sede Amministrativa: Universit`adegli Studi di Padova Dipartimento di Scienze Statistiche Corso di Dottorato di Ricerca in Scienze Statistiche Ciclo XXX Projecting Pakistan Population with a Bayesian Hierarchical approach Coordinatore del Corso: Prof. Nicola Sartori Supervisore: Prof. Stefano Mazzuco Dottorando: Muhammad Adil 31st October, 2017 Abstract UN has generated projections for demographic components for all countries of the world deterministically before revising the methodology and recently started to generate probabilistic projections at national level , but regional/sub national level trajectories are yet to be generated. In this thesis, probabilistic projections for total fertility rate (TFR) , life expectancy at birth for males and females and population totals at national as well as regional level have been generated based on Bayesian Hierarchical modeling approach. The trajectories were also generated for variable number of countries and it was observed that continent based trajectories were enough to model the future pattern instead of going for entire globe data. This in result saves time and decreases the influence of developed world demographic pattern on developing world and vise versa. TFR results were compared based on different values of µ, (i.e., values less than and more than 2.1, which is the ultimate level of replacement) and with the trajectories generated based on Bayesian Hierarchical modeling approach for phase III. For values greater than 2.0, the trajectories were taking long time to converge to replacement level than for smaller values. On the other hand trajectories of Life Expectancy have revealed significant gap between male and female projections at national as well as regional level. It was also observed that Balochistan region has lower life expectancy at birth than the rest of the regions of Pakistan. The trajectories for Population totals were generated based on the probabilistic TFR and life expectancy trajectories. The results revealed a total population of over 207 million at national level which nearly coincides with the recently released provisional figures of census. Furthermore, the results are significantly different from previously adopted deterministic results and could be a good substitute over classical deterministic approaches. Sommario Le proiezioni demografiche delle Nazioni Unite per tutti i paesi del mondo sono sempre state ottenute attraverso un approccio puramente deterministico, e solo recentemente approcci probabilistici sono stati considerati per i livelli nazionali, mentre le traiettorie regionali o sub nazionali non sono ancora state considerate. In questa tesi, le proiezioni probabilistiche del tasso di fecondit`a,dell'aspettativa di vita alla nascita per maschi e per femmine e il totale delle popolazioni sia a livello nazionale che a livello regionale sono state generate attraverso l'uso di modelli Bayesiani gerarchici. Le traiettorie sono state sviluppate per diverse variabili numeriche specifiche di ogni paese dalle quali `eemerso che le informazioni relative al continente risultano pi`uefficienti rispetto a quelle globali per modellare l'andamento futuro dei fenomeni di interesse. Questo approccio permette sia di accelerare la procedura di stima del modello, sia di ridurre l'impatto legato alla variabilit`adegli eventi osservati su tutto il globo. I risultati per il tasso di fecondit`a globale sono stati confrontati rispetto a differenti livelli di µ (e quindi minori o maggiori di 2.1, che `el'ultimo valore considerato) e rispetto a diverse traiettorie generate dal modello gerarchico in fase III. Per valori maggiori di 2.1, le traiettorie hanno richiesto un tempo computazionale molto lungo per convergere al livello prestabilito rispetto a valori pi`upiccoli. D'altro canto, le traiettorie per l'aspettativa di vita hanno rivelato un divario significativo tra i maschi e le femmine sia a livello nazionale che a livello regionale. E' stato osservato inoltre che nella regione del Balochistan l'aspettativa di vita alla nascita `epari almeno a quella del resto del paese. Le traiettorie per il totale delle popolazioni sono state generate basandosi sul tasso di fecondit`aglobale e sulle aspettative di vita generate dal modello Bayesiano. I risultati hanno mostrato una popolazione totale di 207 milioni di persone per il livello nazionale, livello che coincide approssimativamente con i recenti e provvisori dati del censimento. I risultati ottenuti sono inoltre significativamente differenti da quelli fino ad ora ottenuti, ed evidenziano la qualit`adel nuovo approccio proposto come alternativa ai metodi deterministici fino ad ora utilizzati. Dedicated to my family & my supervisor Acknowledgements First of all I would like to express my greatest thanks to almighty Allah, the one and only supreme power, the most merciful, worthy of all praise, the most beneficent who always helps me in difficulties, guides me to the best possible solutions and gives me strength to complete every task. I would like to express my greatest gratitude to my supervisor Professor Stefano Mazzuco from the core of my heart for his support, wisdom, encouragement and precious advice and guidance throughout this journey of research. I would specially mention his always welcoming behaviour even without taking prior appointments. He was always there to support me in tough and rough times. I would extend my thanks to Professor Monica Chiogna who as a course coordinator was a big support during first year of PhD. I would like to thank Professor Nicola Sartori who during the last few months of our research journey helped in resolving administrative issues. Special thanks go to all teachers who provided us with enough knowledge to successfully accomplish this task. I would like to specifically acknowledge Patrizia Piacentini who tried her best to solve any problem which we faced from the start till the end of PhD period. I must thank Research Scientist Hana Sevcikova , University of Washington, who always responded to my emails and guided me in BayesPop R package. I would like to thank my friends in Pakistan Bureau of Statistics specifically Mr. Syed Jawad Ali Shah, who guided me on dealing with glitches relevant to data and always encourages and motivates. I would like to thank Mr. Ahtasham Gul who at all times provided me assistance on technical and bureaucratic issues. I thank my friend S.O. Najeeb Ullah who was always there to help me in getting data from Pakistan Bureau of Statistics. I also thank Professor Qamruzaman, University of Peshawar, for his support and guidance on numerous occasions. I particularly thank my friend Ismail Shah, who guided, motivated and encouraged me to apply for this opportunity and introduced me to the department of Statistics, University of Padova. I thank my friends Danish Wasim and Kaleem Ullah Shahzad for their motivation and encouragement. I thank my colleagues from the 30th PhD cycle and feel very lucky to have shared these three years of ups and downs with such an incredible group of people. I feel Padova as a home due to a bunch of best people i met during my this journey, they include but not limited to Dr.Ali Raza, Mubashar, Dr.Jalal Uddin, Dr.Saeed Khan, Dr.Jagjit Singh, Dr.Ishtiaq, Dr.Saima Imran, Imran bhai and Zaheer bhai. I especially express my thanks to my wife whose love and support always provided me with strength and feeling of belief in myself towards finishing this task successfully. I must acknowledge my lovely daughter Rahemeen whose pure love and affection always motivated me to keep moving forward. I thank my sister and brother for their uncondi- tional love. I wish to express my love and great respect for the support and unlimited love of my parents who are always a great source of motivation. Contents List of Figures xv List of Tables xix Introduction1 Overview......................................1 Main contributions of the thesis..........................2 1 Methodology5 1.1 Fertility....................................5 1.1.1 Age Specific Fertility Rate......................5 1.1.2 Total Fertility Rate.........................6 1.2 Fertility Transition..............................6 1.2.1 Phases of Fertility Transition....................7 1.2.2 UN Methodology on TFR Projection................9 1.3 Fertility Transition models..........................9 1.3.1 Fertility Transition Model for Phase 2................9 1.4 Estimation of Phase II parameters..................... 13 1.4.1 Estimation............................... 13 1.4.1.1 Rejecting Sampling approach............... 14 1.4.1.2 Markov Chain Monte Carlo (MCMC) Approach..... 14 1.4.1.3 The Gibbs sampler..................... 15 1.4.1.4 Metropolis-Hostings algorithm............... 15 1.4.1.5 Slice Sampling....................... 16 1.4.2 Bayesian Hierarchical Model for Phase II.............. 17 1.4.3 Post Transition Model for Phase 3.................. 20 1.4.3.1 Estimation of Post transition model........... 20 1.4.3.2 Estimation of Post transition model in the Bayesian Hi- erarchical set up...................... 21 1.4.4 Projection............................... 22 1.5 Mortality................................... 23 1.6 Life Expectation............................... 23 1.6.1 UN Methodology on Life Expectancy................ 24 1.6.2 Probabilistic model.......................... 25 1.6.3 Estimation of Parameters using Bayesian Hierarchical model... 26 xi xii Contents 1.6.4 Joint Projection of male and female life expectancy based on mod- eling the Gap in life expectancy................... 27 1.7 Reconstruction