Dissertation Submitted to the University of Auckland in Fulfillment of the Requirements of the Degree of Doctor of Philosophy (Ph.D.) in Statistics
Total Page:16
File Type:pdf, Size:1020Kb
Thesis Consent Form This thesis may be consulted for the purposes of research or private study provided that due acknowledgement is made where appropriate and that permission is obtained before any material from the thesis is published. Students who do not wish their work to be available for reasons such as pending patents, copyright agreements, or future publication should seek advice from the Graduate Centre as to restricted use or embargo. Author of thesis Jared Tobin Title of thesis Embedded Domain-Specific Languages for Bayesian Modelling and Inference Name of degree Doctor of Philosophy (Statistics) Date Submitted 2017/05/01 Print Format (Tick the boxes that apply) I agree that the University of Auckland Library may make a copy of this thesis available for the ✔ collection of another library on request from that library. ✔ I agree to this thesis being copied for supply to any person in accordance with the provisions of Section 56 of the Copyright Act 1994. Digital Format - PhD theses I certify that a digital copy of my thesis deposited with the University is the same as the final print version of my thesis. Except in the circumstances set out below, no emendation of content has occurred and I recognise that minor variations in formatting may occur as a result of the conversion to digital format. Access to my thesis may be limited for a period of time specified by me at the time of deposit. I understand that if my thesis is available online for public access it can be used for criticism, review, news reporting, research and private study. Digital Format - Masters theses I certify that a digital copy of my thesis deposited with the University is the same as the final print version of my thesis. Except in the circumstances set out below, no emendation of content has occurred and I recognise that minor variations in formatting may occur as a result of the conversion to digital format. Access will normally only be available to authenticated members of the University of Auckland, but I may choose to allow public access under special circumstances. I understand that if my thesis is available online for public access it can be used for criticism, review, news reporting, research and private study. Copyright (Digital Format Theses) (Tick ONE box only) I confirm that my thesis does not contain material for which the copyright belongs to a third ✔ party, (or) that the amounts copied fall within the limits permitted under the Copyright Act 1994. I confirm that for all third party copyright material in my thesis, I have obtained written permission to use the material and attach copies of each permission, (or) I have removed the material from the digital copy of the thesis, fully referenced the deleted materials and, where possible, provided links to electronic sources of the material. Signature Date 2018/01/10 Comments on access conditions Faculty Student Centre / Graduate Centre only: Digital copy deposited Signature Date December 2015 Research Support Services Libraries and Learning Services Embedded Domain-Specific Languages for Bayesian Modelling and Inference A dissertation submitted to the University of Auckland in fulfillment of the requirements of the degree of Doctor of Philosophy (Ph.D.) in Statistics Jared Tobin January 11, 2018 Abstract This dissertation defends the thesis that novel and useful domain-specific lan- guages for solving statistical problems can be embedded in statically-typed, purely- functional programming languages. It presents techniques for representing probability distributions in embedded languages, deeply-embedding a type-safe probabilistic programming language in a way that is amenable to inference, and embedding a language for building composite Markov transition operators that can be used in MCMC. Acknowledgements I owe a debt of thanks to everyone and everything that encouraged me to enrol in a PhD programme and helped me see it through to the end. Thanks to Josh Stella & the crew at Fugue, Tom Nielsen at Openbrain, Richard Dunne at Bdellium, and my old friends at T&W for being patient with me while I slogged away at this PhD. And a special thanks to Noel Cadigan and Gary Sneddon — my former supervisors — as well as Leslie Kennedy — my former high-school English teacher — each for having been a rather direct impetus for my success in some way. Thanks go to Matt Might and Dan Roy — two people I’ve never met,but whose research and writing compelled me to learn more about computer science, probabilistic programming, and Haskell. And whoever it was who posted that notification on the Well-Typed blog about the 2012 Haskell summer schoolin France — apparently it was Eric Kow — do I ever owe you a drink or something. Thanks to all the authors of the high-quality open-source software andopen- access research that I use every day. You’re an inspiration and I want to be just like you. Thanks to my fantastic supervisor Russell, not only for your patience and ever-constructive advice, but also the freedom you gave me to explore literally life-changing avenues of statistics and computer science. But most importantly: without my wonderful family and friends, this doc- ument would surely not exist. My partner Nadine, my parents Bob & Marilyn, my siblings Shawn & Rachel, and my friends around the world: the notion that I could express my gratitude through some mere poetry is comedy, so I will con- tinue to demonstrate my appreciation to you for as long as I live. Contents 1 Thesis 11 1.1 Representing Probability Distributions .............. 14 1.2 Representing Structured Probabilistic Models ........... 14 1.3 Declarative Markov Chains ..................... 15 1.4 Scope of Formality .......................... 16 1.5 Wrapping up ............................. 17 2 Language Engineering in Haskell 19 2.1 Abstract and Contributions ..................... 19 2.2 Motivation .............................. 20 2.2.1 Domain Specific Languages and Haskell ......... 22 2.3 Algebraic Data Types ........................ 25 2.3.1 Abstract Terms and Types ................. 25 2.3.2 Terms and Types in Haskell ................ 28 2.3.3 A Proto-Representation For Probability Distributions .. 29 2.4 Parametric Polymorphism and Typeclasses ............ 31 2.4.1 The Functor Typeclass ................... 34 2.4.2 Another Proto-Representation ............... 37 2.4.3 The Applicative Typeclass ................. 39 2.4.4 The Monad Typeclass .................... 43 2.5 Conclusion .............................. 50 2.5.1 An Embedded Language for Probability Distributions .. 50 1 2.5.2 Notes ............................. 54 3 Representing Probability Distributions 55 3.1 Abstract and Contributions ..................... 55 3.2 Motivation .............................. 57 3.3 Theoretical Background ....................... 58 3.3.1 Categorical Foundations .................. 58 3.3.2 Probabilistic Foundations ................. 60 3.3.3 P is a Functor ........................ 63 3.3.4 P is a Monad ........................ 65 3.3.5 P is an Applicative Functor ................ 69 3.4 Measure, Integral, and Continuation ................ 69 3.4.1 Typeclass Instances ..................... 73 3.5 Conceptual Example ......................... 78 3.6 Using The Measure Representation ................ 80 3.6.1 Constructing Measures ................... 80 3.6.2 Querying Measures ..................... 83 3.6.3 Operations on Product Measures ............. 89 3.7 Example: Chinese Restaurant Process Measure .......... 93 3.8 Summary ............................... 99 3.8.1 Computational and Feature Limitations ......... 101 3.8.2 Comparison with Other Work ............... 103 3.9 Conclusion .............................. 106 4 Representing Structured Probabilistic Models 108 4.1 Abstract and Contributions ..................... 108 4.2 A Sampling Function-Based Representation ............ 110 4.2.1 Implementation and Computational Complexity ..... 111 4.2.2 A Sampling-Based Embedded Language ......... 118 4.3 Preserving Model Structure ..................... 122 4.3.1 Motivation .......................... 122 2 4.3.2 Towards A Deep Embedding ................ 125 4.3.3 Algebraic Freeness and the Free Monad ......... 127 4.4 A Concrete Deep Embedding .................... 132 4.4.1 Forward-Mode Interpretation ............... 138 4.4.2 Backward-Mode Inference ................. 140 4.5 Working with Structure ....................... 147 4.5.1 Algebraic Cofreeness and the Cofree Comonad ..... 150 4.5.2 Representing Programs That Terminate .......... 153 4.5.3 Running Markov Chains over Execution Traces ..... 156 4.5.4 Working With Execution Traces ............. 163 4.6 Encoding Structural Independence ................. 168 4.7 Summary and Comparison to Other Work ............ 172 4.7.1 Free Monad Encoding ................... 172 4.7.2 Inference .......................... 175 4.8 Conclusion .............................. 177 5 Declarative Markov Chains 179 5.1 Abstract and Contributions ..................... 179 5.2 Motivation .............................. 180 5.3 The Structure of Markov Transitions ................ 185 5.3.1 Markov Chains ....................... 185 5.3.2 The State Monad ...................... 188 5.4 A Shallowly Embedded Language For Transitions ........ 196 5.5 Primitive Transition Operators ................... 199 5.5.1 A Concrete State Type ................... 200 5.5.2 Metropolis-Hastings (MH) ................. 203 5.5.3 Slice Sampling ....................... 204 5.5.4 Hamiltonian Monte Carlo (HMC) ............. 205 5.5.5 Metropolis-adjusted