
arXiv:1906.11199v1 [cs.PL] 20 Jun 2019

Deployable Probabilistic Programming

David Tolpin
PUB+, Israel
[email protected]

Abstract

We propose design guidelines for a probabilistic programming facility suitable for deployment as a part of a production software system. As a reference implementation, we introduce Infergo, a probabilistic programming facility for Go, a modern programming language of choice for server-side software development. We argue that a similar probabilistic programming facility can be added to most modern general-purpose programming languages.

Probabilistic programming enables automatic tuning of program parameters and algorithmic decision making through probabilistic inference based on the data. To facilitate addition of probabilistic programming capabilities to other programming languages, we share implementation choices and techniques employed in the development of Infergo. We illustrate the applicability of Infergo to various use cases on several case studies, and evaluate Infergo's performance on benchmarks, comparing Infergo to dedicated inference-centric probabilistic programming frameworks.

ACM Reference Format:
David Tolpin. 2019. Deployable Probabilistic Programming. In Proceedings of SPLASH Onward! 2019 (Onward! 2019), October 20-25, 2019, Athens, Greece. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 Introduction

Probabilistic programming [13, 14, 26, 62] represents statistical models as programs written in an otherwise general programming language that provides syntax for the definition and conditioning of random variables. Inference can be performed on probabilistic programs to obtain the posterior distribution or point estimates of the variables. The inference algorithms are usually provided by the probabilistic programming framework, and each algorithm is applicable to a wide class of probabilistic programs in a black-box manner. The algorithms include Metropolis-Hastings [26, 59, 63], Hamiltonian Monte Carlo [10], expectation propagation [29], extensions of Sequential Monte Carlo [33, 38, 60], variational inference [23, 62], gradient-based optimization [7, 55], and others.

There are two polar views on the role of probabilistic programming. One view is that probabilistic programming is a flexible framework for specification and analysis of statistical models [11, 51, 59]. Proponents of this view perceive probabilistic programming as a tool for data science practitioners. A typical workflow consists of data acquisition and pre-processing, followed by several iterations of exploratory model design and testing of inference algorithms. Once a sufficiently robust statistical model is obtained, analysis results are post-processed, visualized, and occasionally integrated into algorithms of a production software system.

The other, emerging, view is that probabilistic programming is an extension of the regular programming toolbox, allowing algorithms implemented in a general-purpose programming language to have learnable parameters. In this view, statistical models are integrated into the software system, and inference and optimization take place during data processing, prediction, and algorithmic decision making in production. Probabilistic programming code runs unattended, and inference results are an integral part of the algorithms. Natural applications arise whenever the algorithm is a generative model of a mental or physical process, in particular when latent variables of the model are used in decision making [15, 61]; for example, for inference about user behavior in social networks, ongoing health monitoring, or online motion planning of autonomous robotic devices.

In this work we are concerned with the latter view on probabilistic programming. Our objective is to pave a path to deployment of probabilistic programs in production systems, promoting a wider adoption of probabilistic programming as a flexible tool for ubiquitous statistical inference. Most probabilistic programming frameworks are suited, to a certain extent, to be used either in the exploratory or in the production scenario. However, the proliferation of probabilistic programming languages and frameworks¹ on one hand, and the scarcity of success stories about production use of probabilistic programming on the other hand, suggest that there is a need for new ideas and approaches to make probabilistic programming models available for production use.

¹ There are 45 probabilistic programming languages listed on Wikipedia [58]. 18 probabilistic programming languages were presented during the developer meetup at PROBPROG'2018, the inaugural conference on probabilistic programming, http://probprog.cc/, half of which are not on the Wikipedia list.

Our attitude differs from the established approach in that instead of proposing yet another probabilistic programming language, either genuinely different [10, 28] or derived from an existing general-purpose programming language by extending the syntax or changing the semantics [11, 13, 51], we advocate enabling probabilistic inference on models written in a general-purpose programming language, the same one as the language used to program the bulk of the system. We formulate guidelines for such an implementation and, based on the guidelines, introduce a probabilistic programming facility for the Go programming language [50]. By explaining our implementation choices and techniques, we argue that a similar facility can be added to any modern general-purpose programming language, giving a production system modeling and inference capabilities which do not fall short of, and sometimes exceed, those of probabilistic programming-centric frameworks with custom languages and runtimes. Availability of such facilities in major general-purpose programming languages would free probabilistic programming researchers and practitioners alike from language wars [48] and foster practical applications of probabilistic programming.

Contributions

This work brings the following major contributions:

• Guidelines for design and implementation of a probabilistic programming facility deployable as a part of a production software system.
• A reference implementation of a deployable probabilistic programming facility for the Go programming language.
• Implementation highlights outlining important choices we made in the implementation of the probabilistic programming facility, and providing a basis for implementation of a similar facility for other general-purpose programming languages.

2 Related Work

Our work is related to three interconnected areas of research:

1. Design and implementation of probabilistic programming languages.
2. Integration and interoperability between probabilistic programming models and the surrounding software systems.
3. Differentiable programming for machine learning.

Probabilistic programming is traditionally associated with the design and implementation of probabilistic programming languages. An apparent justification for a new programming language is that a probabilistic program has different semantics [16, 46] than a 'regular' program of similar structure. Goodman and Stuhlmüller [14] offer a systematic approach to the design and implementation of probabilistic programming languages based on extension of the syntax of a (subset of a) general-purpose programming language and transformational compilation to continuation-passing style [2, 3]. van de Meent et al. [54] give a comprehensive overview of available choices of design and implementation of first-order and higher-order probabilistic programming languages. Stan [10] is built around an imperative probabilistic programming language with program structure tuned for most efficient execution of inference on the model. SlicStan [17] is built on top of Stan as a source-to-source compiler, providing the model developer with a more intuitive language while retaining the performance benefits of Stan. Our work differs from existing approaches in that we advocate enabling probabilistic programming in a general-purpose programming language by leveraging existing capabilities of the language, rather than building a new language on top of, or beside, an existing one.

The need for integration between probabilistic programs and the surrounding software environment is well understood in the literature [7, 51]. Probabilistic programming languages are often implemented as embedded domain-specific languages [11, 42, 51] and include a mechanism for calling functions from libraries of the host language. Our work goes a step further in this direction: since the language of implementation of probabilistic models coincides with the host language, any library or user-defined function of the host language can be directly called from the probabilistic model, and model methods can be directly called from the host language.

Automatic differentiation is widely employed in machine learning [5, 56], where it is also known as 'differentiable programming', and is responsible for enabling efficient inference in many probabilistic programming frameworks [7, 10, 11, 21, 52]. Different automatic differentiation techniques [18] allow different compromises between flexibility, efficiency, and feature-richness [21, 34, 44]. Automatic differentiation is usually implemented through either operator overloading [10, 11, 34] or source code transformation [21, 57]. Our work, too, relies on automatic differentiation, implemented as source code transformation. However, a novelty of our approach is that instead of using explicit calls or directives to denote parts of code which need to be differentiated, we rely on the type system of Go to selectively differentiate the code relevant for inference, thus combining advantages of both operator overloading and source code transformation.

3 Challenges

Incorporating a probabilistic program, or rather a probabilistic procedure, within a larger code body appears to be rather straightforward: one implements the model in the probabilistic programming language, fetches and preprocesses the data in the host programming language, passes the data and the model to an inference algorithm, and post-processes the results in the host programming language again to make algorithmic decisions based on inference outcomes. To choose the best option for each particular application, probabilistic programming languages and frameworks are customarily compared by expressive power (the classes of probabilistic models that can be represented), conciseness (e.g. the number of lines of code needed to describe a particular model), and time and space efficiency of inference algorithms [11, 35, 37, 62]. However, complex software systems make integration of probabilistic inference challenging, and considerations beyond expressiveness and performance come into play.

Simulation vs. inference. Probabilistic models often follow the design pattern of simulation-inference: a significant part of the model is a simulator running an algorithm with given parameters; the optimal parameters, or their distribution, are to be inferred. The inferred parameters are then used by the software system to execute the simulation independently of inference, for prediction and decision making. This pattern suggests re-use of the simulator: instead of implementing the simulator twice, in the probabilistic model and in the host environment, one can invoke the same code for both purposes. However, to achieve this, the host language must coincide with the implementation language of the probabilistic model, on one hand, and allow a computationally efficient implementation of the simulation, on the other hand. Some probabilistic programming frameworks (Figaro [36], Anglican [51], Turing [11]) are built with tight integration with the host environment in mind; more often than not, though, the probabilistic code is not trivial to re-use.

Data interface. The data for inference may come from a variety of sources — networks, databases, distributed file systems — and in many different formats. Efficient inference depends on fast data access and updating. Libraries for data access and manipulation are available in the host environment. While the host environment can be used as a proxy retrieving and transforming the data, such as in the case of Stan [10] integrations, sometimes direct access from the probabilistic code is the preferred option, for example when the data is streamed or retrieved conditionally, as in active learning [49].

A flexible data interface is also beneficial for support of mini-batch optimization [24]. Mini-batch optimization is used when the data set is too big to fit in memory, and the objective function gradient is estimated based on mini-batches — small portions of data. Different models and inference objectives may require different mini-batch loading schemes. For best performance of mini-batch optimization, it should be possible to program loading of mini-batches on a case-by-case basis.
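To make the last point concrete, below is a hypothetical sketch of a model whose log-likelihood method conditions on one mini-batch per call; the NextBatch loader, and the Observe method and Normal distribution helpers it anticipates, follow the Infergo interface introduced in Section 5 — the names are illustrative assumptions, not code from this paper.

    // MiniBatchModel draws the next mini-batch from an application-supplied
    // loader (a database cursor, a file reader, ...) on every Observe call,
    // so the optimizer sees a mini-batch estimate of the gradient.
    type MiniBatchModel struct {
        NextBatch func(n int) []float64 // returns the next n observations
        N         int                   // mini-batch size
    }

    func (m *MiniBatchModel) Observe(x []float64) float64 {
        mu, sigma := x[0], math.Exp(x[1])
        ll := Normal.Logps(0, 1, x...) // prior on the parameters
        for _, y := range m.NextBatch(m.N) {
            ll += Normal.Logp(mu, sigma, y)
        }
        return ll
    }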

Integration and deployment. Deployment of software systems is a delicate process involving automatic builds and maintenance of dependencies. Adding a component which possibly introduces additional software dependencies, or even a separate runtime, complicates deployment. Minimizing the burden of probabilistic programming on the integration and deployment process should be a major consideration in the design or selection of probabilistic programming tools. Probabilistic programming frameworks that are implemented in, or provide an interface in, a popular programming language, e.g. Python (Edward [52], Pyro [7]), are easier to integrate and deploy; however, the smaller the footprint of the probabilistic programming framework, the easier the adoption. An obstacle to integration of probabilistic programming also lies in the adoption of the probabilistic programming language [27], if the latter differs from the implementation language of the rest of the system.

4 Guidelines

Based on the experience of developing and deploying solutions using different probabilistic programming environments, we came up with guidelines for the implementation of a probabilistic programming facility for server-side applications. We believe that these guidelines, when followed, ease integration of probabilistic programming inference into large-scale software systems.

1. A probabilistic model should be programmed in the host programming language. The facility may impose a discipline on model implementation, such as through interface constraints, but should otherwise support unrestricted use of the host language for implementation of the model.
2. Built-in and user-defined data structures and libraries should be accessible in the probabilistic programming model. Inference techniques relying on the code structure, such as those based on automatic differentiation, should support the use of common data structures of the host language.
3. The model code should be reusable between inference and simulation. The code which is not required solely for inference should be written once for both inference of the parameters and use of the parameters in the host environment. It should be possible to run simulation outside the probabilistic model without runtime or memory overhead imposed by inference needs.

5 Probabilistic Programming in Go

In line with the guidelines, we have implemented a probabilistic programming facility for the Go programming language, Infergo. We have chosen Go because Go is a small but expressive programming language with an efficient implementation that has recently become quite popular for computation-intensive server-side programming. Infergo is used in a production environment for inference of mission-critical algorithm parameters.

5.1 An Overview of Infergo

Infergo comprises a command-line tool that augments the source code of a model to facilitate inference, a collection of basic models (distributions) serving as building blocks for other models, and a library of inference algorithms.

A probabilistic model in Infergo is an implementation of an interface requiring a single method Observe², which accepts a vector (a Go slice) of floats, the parameters to infer, and returns a single float, interpreted as the unnormalized log-likelihood of the posterior distribution. The model object usually encapsulates the observations, either as a data field or as a reference to a data source. Implementation of model methods can be written in virtually unrestricted Go and use any Go libraries.

² The method name follows Venture [26], Anglican [51], and other probabilistic programming languages in which observe is the language construct for conditioning of the model on observations.

For inference, Infergo relies on automatic differentiation. The source code of the model is translated by the command-line tool into an equivalent model with reverse-mode automatic differentiation of the log-likelihood with respect to the parameters applied. The differentiation operates on the built-in floating-point type and incurs only a small computational overhead. However, even this overhead is avoided when the model code is executed outside inference algorithms: both the original and the differentiated model are simultaneously available to the rest of the program code, so the methods can be called on the differentiated model for inference, and on the original model for the most efficient execution with the inferred parameters.

5.2 A Basic Example

Let us illustrate the use of Infergo on a basic example serving as a probabilistic programming equivalent of the "Hello, World!" program — inferring the parameters of a unidimensional Normal distribution. In this example, the model object holds the set of observations from which the distribution parameters are to be inferred:

    type ExampleModel struct {
        Data []float64
    }

The Observe method computes the log-likelihood of the distribution parameters. The Normal distribution has two parameters: mean µ and standard deviation σ. To ease the inference, positive σ is transformed to unrestricted log σ, and the parameter vector x is [µ, log σ]. Observe first imposes a prior on the parameters, and then conditions the posterior distribution on the observations:

    // x[0] is the mean,
    // x[1] is the log stddev of the distribution
    func (m *ExampleModel) Observe(x []float64) float64 {
        // Our prior is a unit normal ...
        ll := Normal.Logps(0, 1, x...)
        // ... but the posterior is based
        // on data observations.
        ll += Normal.Logps(x[0], math.Exp(x[1]), m.Data...)
        return ll
    }

For inference, we supply the data and optionally initialize the parameter vector. Here the data is hard-coded, but in general it can be read from a file, a database, a network connection, or dynamically generated by another part of the program:

    // Data
    m := &ExampleModel{[]float64{
        -0.854, 1.067, -1.220, 0.818, -0.749,
        0.805, 1.443, 1.069, 1.426, 0.308}}

    // Parameters
    x := []float64{0, 0}

The fastest and most straightforward inference is maximum a posteriori (MAP) estimation, which can be done using a gradient descent algorithm (Adam [22] in this case):

    opt := &infer.Adam{
        Rate: 0.01,
    }
    for iter := 0; iter != 1000; iter++ {
        opt.Step(m, x)
    }

To recover the full posterior distribution of the parameters we use Hamiltonian Monte Carlo [30]:

    hmc := &infer.HMC{
        Eps: 0.1,
    }
    samples := make(chan []float64)
    hmc.Sample(m, x, samples)
    for i := 0; i != 5000; i++ {
        x = <-samples
    }
    hmc.Stop()
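Instead of merely keeping the last sample, the loop above can accumulate any summary the application needs. For example, a minimal sketch (not part of the original example) that estimates the posterior means of µ and σ from the drawn samples:

    // Estimate posterior means of the mean and the standard deviation
    // from n HMC samples; x[1] is the log stddev, as in Observe above.
    n := 5000
    var mu, sigma float64
    for i := 0; i != n; i++ {
        s := <-samples
        mu += s[0]
        sigma += math.Exp(s[1])
    }
    mu /= float64(n)
    sigma /= float64(n)
    hmc.Stop()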

Models are of course not restricted to linear code, and may comprise multiple methods containing loops, conditional statements, function and method calls, and recursion. The case studies (Section 7) provide more model examples.

5.3 Model Syntax and Rules of Differentiation

In Infergo, inference is performed on a model. Just like in Stan [10] models, latent random variables in Infergo models are continuous. Additionally, Infergo can express and perform inference on non-deterministic models by including stochastic choices in the model code.

An Infergo model is a data type and methods defined on the data type. Inference algorithms accept a model object as an argument. A model object usually encapsulates observations — the data to which the parameter values are adapted. A model is implemented in Go, virtually unrestricted³. Because of this, the rules of model specification and differentiation are quite straightforward. Nonetheless, the differentiation rules ensure that, provided the model's Observe method (including calls to other model methods) is differentiable, the gradient is properly computed [18].

³ There are insignificant implementation limitations which are gradually lifted as Infergo is being developed.

A model is defined in its own package. The model must implement the interface Model, containing a single method Observe. Observe accepts a slice of float64 — the model parameters — and returns a float64 scalar. For probabilistic inference, the returned value is interpreted as the log-likelihood of the model parameters. Inference relies on computing the gradient of the returned value with respect to the model parameters through automatic differentiation. In the model's source code:

1. Methods on the type implementing Model returning a single float64 or nothing are differentiated.
2. Within the methods, the following is differentiated:
   • assignments to float64 (including parallel assignments if all values are of type float64);
   • returns of float64;
   • standalone calls to methods on the type implementing Model (apparently called for side effects on the model).

Functions for which the gradient is provided rather than derived through automatic differentiation are called elementals [18]. Derivatives do not propagate through a function that is not an elemental or a call to a model method. If a derivative is not registered for an elemental, calling the elemental in a differentiated context will cause a run-time error.

Functions are considered elementals (and must have a registered derivative) if their signature is of the kind func (float64, float64*) float64; that is, one or more non-variadic float64 arguments and a float64 return value. For example, the function func (float64, float64, float64) float64 is considered elemental, while the functions

• func (...float64) float64
• func ([]float64) float64
• func (int, float64) float64

are not. Gradients for selected functions from the math package are pre-defined (Sqrt, Exp, Log, Pow, Sin, Cos, Tan). Auxiliary elemental functions with pre-defined gradients are provided in a separate package.

Any user-defined function of the appropriate kind can be used as an elemental inside model methods. The gradient must be registered for the function using a call to

    func RegisterElemental(
        f interface{},
        g func(value float64, params ...float64) []float64)

For example, the call

    RegisterElemental(Sigm,
        func(value float64,
            _ ...float64) []float64 {
            return []float64{value*(1-value)}
        })

registers the gradient for the function Sigm(x) ≔ 1/(1 + e⁻ˣ).

5.4 Inference

Out of the box, Infergo offers

• maximum a posteriori (MAP) estimation by stochastic gradient descent [40];
• full posterior inference by Hamiltonian Monte Carlo [19, 30].

When only a point estimate of the model parameters is required, as, for example, in the simulation-inference case, stochastic gradient descent is a fast and robust inference family. Infergo offers stochastic gradient descent with momentum and Adam [22]. The inference is performed by calling the Step method of the optimizer until a termination condition is met:

    for !<termination condition> {
        optimizer.Step(model, parameters)
    }

When the full posterior must be recovered for integration, Hamiltonian Monte Carlo (HMC) can be used instead.

Vanilla HMC as well as the No-U-Turn sampler (NUTS) [19] are provided. To support lazy any-time computation, the inference runs in a separate goroutine, connected by a channel to the caller, and asynchronously writes samples to the channel. The caller reads as many samples from the channel as needed, and further postprocesses them:

    samples := make(chan []float64)
    hmc.Sample(model, parameters, samples)
    for !<enough samples> {
        sample = <-samples
        // postprocess the sample
    }
    hmc.Stop()

Other inference schemes, most notably automatic differentiation variational inference [23], are planned for addition in future versions of Infergo. However, an inference algorithm does not have to be a part of Infergo to work with Infergo models. Since Infergo models operate on a built-in Go floating point type, float64, any third-party inference or optimization library can be used as well.

As an example, the Gonum [4] library for numerical computations contains an optimization package offering several algorithms for gradient-based optimization. A Gonum optimization algorithm requires the function to optimize, as well as the function's gradient, as parameters. An Infergo model can be trivially wrapped for use with Gonum, and Infergo provides a convenience function

    func FuncGrad(m Model) (
        Func func(x []float64) float64,
        Grad func(grad []float64, x []float64))

which accepts a model and returns the model's Observe method and its gradient, suitable for passing to Gonum optimization algorithms. Among the available algorithms are advanced versions of stochastic gradient descent, as well as algorithms of the BFGS family, in particular L-BFGS [25], providing for a more efficient MAP estimation than stochastic gradient descent at the cost of higher memory consumption:

    function, gradient := FuncGrad(model)
    problem := gonum.Problem{Func: function,
        Grad: gradient}
    result, err := gonum.Minimize(problem, parameters)

Other third-party implementations of optimization algorithms can be used just as easily, thanks to Infergo models being pure Go code operating on built-in Go data types.

6 Implementation Highlights

In our journey towards implementation and deployment of Infergo, we considered options and made implementation decisions which we believe to have been crucial for addressing the challenges and following the guidelines we formulated for implementation of a deployable probabilistic programming facility. While different choices may be as well suited for the purpose as the ones we made, we feel it is important to share our contemplations and justify decisions, to which the rest of this section is dedicated, from choosing the implementation language, through details of automatic differentiation, to support for inference on streaming and stochastic observations.

6.1 Choice of Programming Language

Python [39], Scala [31], and Julia [6] have been popular choices for development of probabilistic programming languages and frameworks integrated with general-purpose programming languages [7, 9, 11, 21, 36, 41, 57]. Each of these languages has advantages for implementing a probabilistic programming facility. However, for a reference implementation of a facility best suited for probabilistic inference in a production software system, we were looking for a language that

• compiles into fast and compact code;
• does not incur dependency on a heavy runtime; and
• is a popular choice for building server-side software.

In addition, since we were going to automatically differentiate the language's source code, we were looking for a relatively small language, preferably one for which parsing and program analysis tools are readily available.

We chose Go [50]. Go is a small but expressive programming language widely used for implementing computationally intensive server-side software systems. The standard libraries and development environment offer capabilities which made implementation of Infergo affordable and streamlined:

• The Go parser and abstract syntax tree serializer are a part of the standard library. Parsing, transforming, and generating Go source code is straightforward and effortless.
• Type inference (or type checking, as it is called in the Go ecosystem), also provided in the standard library, augments parsing and allows transformation-based automatic differentiation to be applied selectively, based on static expression types.
• Go provides reflection capabilities for almost any aspect of the running program, without compromising efficiency. Reflection gives the flexibility highly desirable to implement features such as calling Go library functions from the model code.
• Go compiles and runs fast. Fast compilation and execution speeds allow the same facility to be used both for exploratory design of probabilistic models and for inference in the production environment.
• Go is available on many platforms and for a wide range of operating systems and environments, from portable devices to supercomputers. Go programs can be cross-compiled and reliably developed and deployed in heterogeneous environments.
• Go offers efficient parallel execution as a first-class feature, via so-called goroutines. Goroutines streamline implementation of sampling-based inference algorithms. Sample generators and consumers are run in parallel, communicating through channels. Inference is easy to parallelize in order to exploit hardware multi-processing, and samples are retrieved lazily for postprocessing.

Considering and choosing Go for development of what later became Infergo also affected the shape of the software system in which Infergo was deployed first. Formerly implemented mostly in Python — both enjoying a wide choice of libraries for machine learning and data processing and suffering from the limitations of Python as a language for implementation of computation-intensive algorithms — the system has been gradually drifting towards adopting Go, with performance-critical parts re-implemented and running faster due to better hardware and network utilization, and with simpler and more maintainable code for data processing, analysis, and decision making.

6.2 Automatic Differentiation

Infergo employs reverse-mode automatic differentiation via source code transformation [18]. The reverse mode is the default choice in machine learning applications because it works best when differentiating a scalar return value with respect to multiple parameters. Automatic differentiation is usually implemented using either operator overloading or source code transformation. Operator overloading is often the method of choice [10, 11, 34] due to its relative simplicity, but the differentiated code must use a special type (on which the operators are overloaded) for representing floating point numbers. Regardless of the issue of computational overhead, this restricts the language used to specify the model to operations on that special type and impairs interoperability: library functions cannot be called directly but have to go through an adaptor, or a 'lifted' version of the standard library must be supplied. Similarly, the data must be cast into the special type before it is passed to the model for inference.

Reverse-mode source code transformation. Source code transformation allows program code operating on the built-in floating point type to be differentiated automatically. This method involves generation of new code, for the function gradient or for the function itself, ahead of time or on the fly. To achieve the best performance, some implementations generate static code for computing the function gradient [20, 57]. This requires rather elaborate source code analysis, and although significant progress has been made in this direction, not all language constructs are supported. Alternatively, the code of the function itself can be transformed by augmenting arithmetic operations and function calls on the floating point type with function calls that record the computations on a tape, just as in the case of operator overloading. Then, the backward pass is identical to that of operator overloading. This method allows the use of the built-in floating point type along with a virtually unrestricted language for model specification, at the cost of a slight performance loss.

This latter method is used in Infergo: the model is written in Go and is placed in a separate Go package. A command-line tool is run on the package to generate a sub-package containing a differentiated version of the model. Both the original and the differentiated model are available simultaneously in the calling code, so that model methods that do not require computation of gradients can be invoked without the overhead of automatic differentiation. However, the performance loss is rarely a major issue — differentiated versions of model methods are only 2–4 times slower on models from the case studies (Section 7), and the backward pass, which computes the gradient, is roughly as fast as the original, undifferentiated version. At least partially, the fast backward pass is due to careful implementation and thorough profiling of the tape.

Automatic differentiation of models. It may seem desirable to implement or use an automatic differentiation library that is more general than needed for probabilistic programming. However, the use of such a library, in particular with source code transformation, may make the implementation restricted and less efficient: first, if a model comprises several functions, all functions must have a signature suitable for differentiation (that is, accept and return floating point scalars or vectors); second, the gradient of every nested call must be fully computed first and then used to compute the gradient of the caller. In Infergo, only the gradient of Observe is needed, and any other differentiated model method can only be called from Observe or another differentiated method. Consequently, only the gradient of Observe is computed explicitly, and other differentiated methods may have arbitrary arguments. In particular, derivatives may propagate through fields of the model object and structured arguments.
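To make the tape idea described above concrete, here is a small self-contained sketch of reverse-mode accumulation on a tape. It is a generic illustration only, using closures to record propagation steps; it is not Infergo's generated code or API.

    package main

    import (
        "fmt"
        "math"
    )

    // node holds one intermediate value and its accumulated adjoint.
    type node struct {
        value, adjoint float64
    }

    // tape records, for each operation, a closure that propagates
    // adjoints backwards during the backward pass.
    type tape struct {
        steps []func()
    }

    func lift(v float64) *node { return &node{value: v} }

    // mul records c = a*b together with the local adjoint propagation.
    func (t *tape) mul(a, b *node) *node {
        c := lift(a.value * b.value)
        t.steps = append(t.steps, func() {
            a.adjoint += b.value * c.adjoint
            b.adjoint += a.value * c.adjoint
        })
        return c
    }

    // log records c = log(a).
    func (t *tape) log(a *node) *node {
        c := lift(math.Log(a.value))
        t.steps = append(t.steps, func() {
            a.adjoint += c.adjoint / a.value
        })
        return c
    }

    // backward seeds the output adjoint and replays the tape in reverse.
    func (t *tape) backward(out *node) {
        out.adjoint = 1
        for i := len(t.steps) - 1; i >= 0; i-- {
            t.steps[i]()
        }
    }

    func main() {
        t := &tape{}
        x, y := lift(2), lift(3)
        z := t.log(t.mul(x, y)) // z = log(x*y)
        t.backward(z)
        fmt.Println(x.adjoint, y.adjoint) // dz/dx = 1/x = 0.5, dz/dy = 1/y ≈ 0.333
    }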

6.3 Model Composition

Just as it is possible to compose library and user-defined functions into more complex functions, it should be possible to compose models to build other models [43, 52]. Infergo supports model composition organically: a differentiated model method can be called from another differentiated method, not necessarily a method of the same model object or type. Derivatives can propagate through methods of different models without restriction. Each model is a building block for another model. For example, if there are two independent single-parameter models A and B describing the data, then the product model AB can be obtained by composition of A and B (Figure 1). Combinators such as product, mixture, or hierarchy can be realized as functions operating on Infergo models.

    type A struct {Data []float64}
    func (model A) Observe(x []float64) float64 {
        ...
    }

    type B struct {Data []float64}
    func (model B) Observe(x []float64) float64 {
        ...
    }

    type AB struct {Data []float64}
    func (model AB) Observe(x []float64) float64 {
        return A{model.Data}.Observe(x[:1]) +
            B{model.Data}.Observe(x[1:])
    }

Figure 1. Model composition. Model AB is the product of A and B.
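Figure 1 composes two concrete models; the same idea extends to a generic combinator. The sketch below is a hypothetical helper, not part of Infergo: it relies only on the Model interface described in Section 5.3 and on the fact that the joint log-likelihood of independent models is the sum of their log-likelihoods.

    // Product is a hypothetical product combinator over two models.
    // NA is the number of parameters consumed by the first model;
    // the remaining parameters are passed to the second one.
    type Product struct {
        A, B Model
        NA   int
    }

    // Observe adds up the log-likelihoods of the two sub-models.
    func (m Product) Observe(x []float64) float64 {
        return m.A.Observe(x[:m.NA]) + m.B.Observe(x[m.NA:])
    }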

Distributions. A probabilistic programming facility should provide a library of distributions for definition and conditioning of random variables. To use a distribution in a differentiable model, the distribution's log probability density function must be differentiable both by the distribution's parameters and by the value of the random variable. Distributions are usually supplied as a part of the probabilistic programming framework [10, 11, 51], and, if at all possible, adding a user-defined distribution requires programming at a lower level of abstraction than that used for specifying a probabilistic model [1, 41, 51].

In Infergo, distributions are just models, and adding and using a custom distribution is indistinguishable from adding a model and calling the model's method from another model. By convention, library distributions, in addition to Observe, also have methods Logp and Logps for the scalar and vector versions of the log probability density function, for convenience of use in models. The distribution library in Infergo is just a model package which is automatically differentiated like any other model. New distributions can be added in any package. Figure 2 shows the Infergo definition of the exponential distribution.

    // Exponential distribution.
    type expon struct{}

    // Exponential distribution, singleton instance.
    var Expon expon

    // Observe implements the Model interface. The
    // parameter vector is lambda, observations.
    func (dist expon) Observe(x []float64) float64 {
        lambda, y := x[0], x[1:]
        if len(y) == 1 {
            return dist.Logp(lambda, y[0])
        } else {
            return dist.Logps(lambda, y...)
        }
    }

    // Logp computes the log pdf of a single
    // observation.
    func (_ expon) Logp(lambda float64, y float64) float64 {
        logl := math.Log(lambda)
        return logl - lambda*y
    }

    // Logps computes the log pdf of a vector
    // of observations.
    func (_ expon) Logps(lambda float64, y ...float64) float64 {
        ll := 0.
        logl := math.Log(lambda)
        for i := range y {
            ll += logl - lambda*y[i]
        }
        return ll
    }

Figure 2. The exponential distribution in Infergo.

6.4 Stochasticity and Streaming

Perhaps somewhat surprisingly, most probabilistic programs are semantically deterministic [47] — probabilistic programs define distributions, and although they manipulate random variables, there is no randomization or non-determinism per se. Randomization arises in certain inference algorithms (notably of the Monte Carlo family) for tractable approximate inference, but this does not introduce non-determinism into the programs themselves. Non-deterministic probabilistic programs arise in applications. For example, van de Meent et al. [53] explored using probabilistic programming for policy search in non-deterministic domains. To represent non-determinism, appropriate constructs had to be added to the probabilistic programming language used in the study.

In Infergo, non-determinism can be easily introduced through data. In the basic example of Section 5.2, the data is encapsulated in the model as a vector, populated before inference. However, this is not at all required by Infergo. Instead of being stored in a Go slice, the data can be read from a Go channel or a network connection and generated by another goroutine or process. Figure 3 shows a modified model from the basic example, with the data gradually read from a channel rather than passed as a whole in a field of the model object.

    type StreamingModel struct {
        Data chan float64 // data is a channel
        N    int          // batch size
    }

    func (m *StreamingModel) Observe(x []float64) float64 {
        ll := Normal.Logps(0, 1, x...)
        // observe a batch of data from the channel
        for i := 0; i != m.N; i++ {
            ll += Normal.Logp(x[0], math.Exp(x[1]),
                <-m.Data)
        }
        return ll
    }

Figure 3. The basic example with the data read from a channel.

A similar approach can be applied to streaming data: the model may read the data gradually as it arrives, and the inference algorithm will emit samples from the time process inferred from the data. Work is still underway on algorithms ensuring robust and sound inference on streams, but Infergo models connected to data sources via channels are already used in practice.

7 Case Studies

In this section, we present three case studies of probabilistic programs, each addressing a different aspect of Infergo. Eight schools (Section 7.1) is a popular example we use to compare model specification in Stan and Infergo. Linear regression (Section 7.2), formulated as a probabilistic program, lets us explore a model comprising multiple methods and illustrates the simulation-inference pattern. The Gaussian mixture model (Section 7.3) is the most complex model of the three, featuring nested loops and conditionals.

By no means do the models presented in this section reflect the complexity or the size of real-world Infergo applications, the latter reaching hundreds of lines of code in length and performing inference on tens of thousands of data entries. Rather, their purpose is to provide a general picture of probabilistic programming with Infergo.

7.1 Eight Schools

The eight schools problem [12] is an application of the hierarchical normal model that often serves as an introductory example of Bayesian statistical modeling. Without going into details, in this problem the effect of coaching programs on test outcomes is compared among eight schools. Figure 4 shows specifications of the same model in Stan and Infergo, side by side.

Stan models are written in a probabilistic programming language tailored to statistical modeling. Infergo models are implemented in Go. One can see that the models have roughly the same length and structure. The definition of the model type in Go (lines 1–5) corresponds to the data section in Stan (lines 1–5). Destructuring of the parameters of Observe in Go (lines 9–11) names the parameters just like the parameters section in Stan (lines 8–10). Introduction of theta in Go (line 16) corresponds to the transformed parameters section in Stan (lines 13–16). The only essential difference is the explicit loop over the data in Go (lines 15–18) versus vectorized computations in Stan (lines 15 and 20). However, vectorized computations are used in Stan to speed up execution more than to improve readability. If repeated calls to Normal.Logp (line 17) become a bottleneck, an optimized vectorized version can be implemented in Go in the same package. While subjective, our feeling is that the Go version is as readable as the Stan one. In addition, a Go programmer is less likely to resist adoption of statistical modeling if the models are implemented in Go.

a. Stan

    1  data {
    2    int J;
    3    vector[J] y;
    4    vector[J] sigma;
    5  }
    6
    7  parameters {
    8    real mu;
    9    real tau;
    10   vector[J] eta;
    11 }
    12
    13 transformed parameters {
    14   vector[J] theta;
    15   theta = mu + tau * eta;
    16 }
    17
    18 model {
    19   eta ~ normal(0, 1);
    20   y ~ normal(theta, sigma);
    21 }

b. Infergo

    1  type SchoolsModel struct {
    2    J     int
    3    Y     []float64
    4    Sigma []float64
    5  }
    6
    7
    8  func (m *SchoolsModel) Observe(x []float64) float64 {
    9    mu := x[0]
    10   tau := math.Exp(x[1])
    11   eta := x[2:]
    12
    13   ll := Normal.Logps(0, 1, eta...)
    14
    15   for i, y := range m.Y {
    16     theta := mu + tau*eta[i]
    17     ll += Normal.Logp(theta, m.Sigma[i], y)
    18   }
    19
    20   return ll
    21 }

Figure 4. Eight schools: Stan vs. Infergo. The Go implementation has a similar length and structure to the Stan model.

7.2 Linear Regression

The model in Figure 5 specifies linear regression on unidimensional data: an improper uniform prior is placed on the parameters α, β, and log σ (lines 9–10), which are then conditioned on observations x, y such that

    y ∼ Normal(α + βx, σ)    (1)

(lines 12–17). If only the maximum a posteriori is estimated, the parameter values maximize their likelihood given the observations:

    (α, β, σ)_MAP = argmax_{α,β,σ} ∏_i pdf_Normal(y_i; α + βx_i, σ)    (2)

    1  type LRModel struct {
    2      Data [][]float64
    3  }
    4
    5  func (m *LRModel) Observe(x []float64)
    6      float64 {
    7      ll := 0.
    8
    9      alpha, beta := x[0], x[1]
    10     sigma := math.Exp(x[2])
    11
    12     for i := range m.Data {
    13         ll += Normal.Logp(
    14             m.Simulate(m.Data[i][0], alpha, beta),
    15             sigma,
    16             m.Data[i][1])
    17     }
    18     return ll
    19 }
    20
    21 // Simulate simulates y based on x and
    22 // regression parameters.
    23 func (m *LRModel) Simulate(x, alpha, beta float64)
    24     float64 {
    25     y := alpha + beta*x
    26     return y
    27 }

Figure 5. Linear regression: the simulator is re-used in both inference and prediction.

Once the values of α and β are estimated, they are used to predict y for unseen values of x. Hence, the linear transformation α + βx is computed in two different contexts: first, to compute the likelihood of the parameter values; then, to predict y based on a given x. This is an example of the simulation-inference pattern (Section 3): the method Simulate simulates y based on x and is invoked both from Observe and then directly for prediction. Since both the original and the differentiated model are simultaneously available in the source code, Simulate can be called for prediction on the original model, without any differentiation overhead.

In this greatly simplified case study, the simulator is a one-line function. In a real-world scenario, though, the simulator can be laborious to re-implement correctly and efficiently; for example, a simulator that executes a parameterized policy in an uncertain or dynamic domain and depends on both parameter values and observations [53]. The ability to re-use a single implementation both as a part of the probabilistic program, where the simulator must be augmented for inference (e.g. by automatic differentiation), and for prediction or decision making, where the simulator may be executed repeatedly and under time pressure, saves the effort otherwise wasted on rewriting the simulator and removes the need to maintain consistency between different implementations.
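As a usage note, prediction on new inputs simply calls Simulate on the original, undifferentiated model. A short sketch, assuming x holds the MAP estimate obtained as in Section 5.2 and xNew is a new input (both names are illustrative):

    // x = [alpha, beta, log sigma] after MAP estimation.
    alpha, beta := x[0], x[1]
    yPredicted := m.Simulate(xNew, alpha, beta) // no differentiation overhead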

7.3 Gaussian Mixture Model

The Gaussian mixture model is a probabilistic model of a mixture of components, where each component comes from a Gaussian distribution. The model variant in Figure 6 accepts a vector of single-dimensional observations and the number of components, and infers the distribution of the parameters of each component in the mixture. Following the Stan User's Guide [45], component memberships are summed out to make the model differentiable. The model contains most of the steps commonly found in Infergo models. First, a prior on the model parameters is imposed (lines 8–9). Then, the parameter vector is destructured into a form convenient for formulation of conditioning (lines 11–17). Finally, the parameters are conditioned on the observations (lines 19–34).

The conditioning of parameters involves a nested loop over all observations and, for each observation, over all mixture components. Computing the log-likelihood of the model parameters implies summing up likelihoods over components for each observation. In the log domain, a trick called 'log-sum-exp' is required to avoid loss of precision. The trick is implemented as the function logSumExp and called for each observation and component (line 28).

logSumExp could have been implemented as a model method and differentiated automatically. However, this function is called on each iteration of the inner loop. Hence, efficiency of inference in general is affected by an efficient implementation of logSumExp (this can also be confirmed by profiling). To save on the cost of a differentiated call and speed up gradient computation, logSumExp is implemented as an elemental (lines 36–43). The hand-coded gradient is supplied for logSumExp (lines 45–54) to make automatic differentiation work.

    1  type GMModel struct {
    2      Data  []float64 // observations
    3      NComp int       // number of components
    4  }
    5
    6  func (m *GMModel) Observe(x []float64) float64 {
    7      ll := 0.0
    8      // Impose a prior on component parameters
    9      ll += Normal.Logps(0, 1, x...)
    10
    11     // Fetch the parameters
    12     mu := make([]float64, m.NComp)
    13     sigma := make([]float64, m.NComp)
    14     for j := 0; j != m.NComp; j++ {
    15         mu[j] = x[2*j]
    16         sigma[j] = math.Exp(x[2*j+1])
    17     }
    18
    19     // Compute log likelihood of the mixture
    20     for i := 0; i != len(m.Data); i++ {
    21         var l float64
    22         for j := 0; j != m.NComp; j++ {
    23             lj := Normal.Logp(mu[j], sigma[j],
    24                 m.Data[i])
    25             if j == 0 {
    26                 l = lj
    27             } else {
    28                 l = logSumExp(l, lj)
    29             }
    30         }
    31         ll += l
    32     }
    33     return ll
    34 }
    35
    36 // logSumExp computes log(exp(x)+exp(y)) robustly.
    37 func logSumExp(x, y float64) float64 {
    38     z := x
    39     if y > z {
    40         z = y
    41     }
    42     return z + math.Log(math.Exp(x-z)+math.Exp(y-z))
    43 }
    44
    45 // logSumExp gradient must be supplied.
    46 func init() {
    47     ad.RegisterElemental(logSumExp,
    48         func(_ float64, params ...float64)
    49             []float64 {
    50             z := math.Exp(params[1] - params[0])
    51             t := 1 / (1 + z)
    52             return []float64{t, t * z}
    53         })
    54 }

Figure 6. Gaussian mixture model: an Infergo model with dynamic control structure and a user-defined elemental.

While still a very simple model compared to real-world Infergo models, this case study illustrates common parts and coding patterns of an elaborated model. The model code may be quite complicated algorithmically and may manipulate Go data structures freely. When efficiency becomes an issue, parts of the model can be implemented at a lower level of abstraction, but nonetheless in the same programming language as the rest of the model.

8 Performance

It is customary to include performance evaluation on benchmark problems in publications on new probabilistic programming frameworks [11, 32, 51, 62]. We provide here a comparison of running times of Infergo and two other probabilistic programming frameworks: Turing [11] and Stan [10]. Turing is a Julia [6] library and DSL offering exploratory statistical modeling and inference composition. Stan is a compiled language and a library optimized for high performance.

We use three models of different size and structure for the comparison: two of the models, Eight schools and the Gaussian mixture model, were introduced as a part of the case studies (Section 7); the code for the third model, Latent Dirichlet allocation [8], is included in Appendix A. The code and data for all models were originally included in Stan examples and translated for use with Turing and Infergo.

                                   compilation time, s      execution time, s
    model                         Infergo  Turing  Stan    Infergo  Turing  Stan
    Eight schools                    0.50       -    50       0.60     2.8  0.12
    Gaussian mixture model           0.50       -    54       32       14   4.9
    Latent Dirichlet allocation      0.50       -    54       8.9      12   3.7

Table 1. Compilation and execution times for 1000 iterations of HMC with 10 leapfrog steps.

Eight schools is the smallest model; one may expect that most of the execution time is spent executing the boilerplate code rather than the model. The Gaussian mixture model is a more elaborated model which benefits greatly from vectorization. The greater the number of observations (1000 data points were used in the evaluation), the more significant is the benefit of vectorized computations. Latent Dirichlet allocation is, again, a relatively elaborated model, which is, however, not as easy to vectorize.

We report both compilation time (for Stan and Infergo) and execution time. Compilation time is an important characteristic of a probabilistic programming framework — statistical modeling often involves many cycles of model modifications and numerical experiments on subsampled data. The ability to modify and instantly re-run a model is crucial for exploratory data analysis and model design. Turing is based on Julia, which is a dynamic language with a JIT compiler, so there is no separate compilation time. For Stan, the compilation time includes compiling the Stan model to C++, and then compiling the C++ source into executable code for inference. For Infergo, the compilation time consists of automatic differentiation of the model, and then building a command-line executable that calls an inference algorithm on the model.

Table 1 shows running time measurements on the models. The measurements were obtained on a 1.25GHz Intel(R) Core(TM) i5 CPU with 8GB of memory, for 1000 iterations of Hamiltonian Monte Carlo with 10 leapfrog steps, and averaged over 10 runs. In execution, Infergo is slower than Stan, which comes as little surprise — Stan is a mature, highly optimized inference and optimization library. However, Infergo and Turing show similar execution times, despite Turing relying on Julia's numerical libraries. The slowest relative execution time is on the Gaussian mixture model, where Stan is 6–7 times faster, apparently because the Infergo model is not vectorized; the model could be made more efficient at the cost of verbosity and poorer readability, but we opted to use idiomatic Infergo models for the comparison. On the Latent Dirichlet allocation model, run on Stan example data with 25 documents and 262 word instances, Infergo is only 2.5 times slower than Stan, and faster than Turing — an illustration of the benefits of transformation-based differentiation and of the efficiency of Go.

Infergo programs compile very fast, in particular in comparison with Stan models. Automatic differentiation and code generation combined take a fraction of a second. Compilation time of Stan models is dominated by compilation of the generated C++ code, which takes close to a minute on our hardware. The ability to modify a model and then re-run inference almost instantly, and with reasonable performance, helps in exploratory model development with Infergo, which was conceived with server-side, unattended execution in mind, but fits the exploration and development stage just as well.

9 Conclusion

Building a production software system which employs probabilistic inference as an integral part of prediction and decision making algorithms is challenging. To address the challenges, we proposed guidelines for implementation of a deployable probabilistic programming facility. Based on the guidelines, we introduced Infergo, a probabilistic programming facility in Go, as a reference implementation in accordance with the guidelines. Infergo allows inference models to be programmed in virtually unrestricted Go, and inference to be performed on the models using either supplied or third-party algorithms. We demonstrated on case studies how Infergo overcomes the challenges by following the guidelines.

A probabilistic programming facility similar to Infergo can be added to most modern general-purpose programming languages, in particular those used for implementation of large-scale software systems, making probabilistic programming inference more accessible in large-scale server-side applications. We described Infergo implementation choices and techniques with the intent of encouraging broader adoption of probabilistic programming in production software systems and easing implementation of similar facilities in other languages and software environments.

Infergo is an ongoing project. Development of Infergo is influenced both by feedback from the open-source Go community and by the needs of the software system in which Infergo was initially deployed, and which still supplies the most demanding use cases in terms of flexibility, scalability, and ease of integration and maintenance. Deploying probabilistic programming as a part of a production software system has the mutual benefit of improving the core algorithms of the system and of helping to shape and improve probabilistic modeling and inference, thus further advancing the field of probabilistic programming.

References

[1] Jeffrey Annis, Brent J. Miller, and Thomas J. Palmeri. 2017. Bayesian inference with Stan: A tutorial on adding custom distributions. Behavior Research Methods 49, 3 (01 Jun 2017), 863–886.
[2] Andrew W. Appel. 2007. Compiling with Continuations. Cambridge University Press, New York, NY, USA.
[3] A. W. Appel and T. Jim. 1989. Continuation-passing, Closure-passing Style. In Proceedings of the 16th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '89). ACM, New York, NY, USA, 293–302.
[4] Gonum authors. 2017. Gonum numerical packages. http://gonum.org/.
[5] Atılım Günes Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. 2017. Automatic Differentiation in Machine Learning: A Survey. J. Mach. Learn. Res. 18, 1 (Jan. 2017), 5595–5637.
[6] Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B. Shah. 2014. Julia: A Fresh Approach to Numerical Computing. CoRR abs/1411.1607 (2014). arXiv:1411.1607
[7] Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, and Noah D. Goodman. 2019. Pyro: Deep Universal Probabilistic Programming. Journal of Machine Learning Research 20, 28 (2019), 1–6.
[8] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2002. Latent Dirichlet Allocation. Journal of Machine Learning Research 3 (2002), 2003.
[9] Avi Bryant. 2018. Rainier. https://github.com/stripe/rainier.
[10] Bob Carpenter, Andrew Gelman, Matthew Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. Stan: A Probabilistic Programming Language. Journal of Statistical Software, Articles 76, 1 (2017), 1–32.
[11] Hong Ge, Kai Xu, and Zoubin Ghahramani. 2018. Turing: Composable inference for probabilistic programming. In International Conference on Artificial Intelligence and Statistics, AISTATS 2018, 9-11 April 2018, Playa Blanca, Lanzarote, Canary Islands, Spain. 1682–1690.
[12] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin. 2013. Bayesian Data Analysis, Third Edition. Taylor & Francis.
[13] Noah D. Goodman, Vikash K. Mansinghka, Daniel M. Roy, Keith Bonawitz, and Joshua B. Tenenbaum. 2008. Church: a language for generative models. In Proceedings of Uncertainty in Artificial Intelligence.
[14] N. D. Goodman and A. Stuhlmüller. 2014. The Design and Implementation of Probabilistic Programming Languages. http://dippl.org/ electronic; retrieved 2019/3/29.
[15] Noah D. Goodman, Joshua B. Tenenbaum, and The ProbMods Contributors. 2016. Probabilistic Models of Cognition. http://probmods.org/v2. Accessed: 2019-4-29.
[16] Andrew D. Gordon, Thomas A. Henzinger, Aditya V. Nori, and Sriram K. Rajamani. 2014. Probabilistic Programming. In International Conference on Software Engineering (ICSE, FOSE track).
[17] Maria I. Gorinova, Andrew D. Gordon, and Charles Sutton. 2019. Probabilistic Programming with Densities in SlicStan: Efficient, Flexible, and Deterministic. Proceedings of the ACM on Programming Languages 3, POPL, Article 35 (Jan. 2019), 30 pages.
[18] Andreas Griewank and Andrea Walther. 2008. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation (second ed.). Society for Industrial and Applied Mathematics, Philadelphia, PA, USA.
[19] Matthew D. Hoffman and Andrew Gelman. 2011. The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. arXiv:1111.4246
[20] Michael Innes. 2018. Don't Unroll Adjoint: Differentiating SSA-Form Programs. arXiv:1810.07951
[21] Michael Innes, Elliot Saba, Keno Fischer, Dhairya Gandhi, Marco Concetto Rudilosso, Neethu Mariya Joy, Tejan Karmali, Avik Pal, and Viral Shah. 2018. Fashionable Modelling with Flux. arXiv:1811.01457
[22] Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In 3rd International Conference for Learning Representations, San Diego. arXiv:1412.6980
[23] Alp Kucukelbir, Dustin Tran, Rajesh Ranganath, Andrew Gelman, and David M. Blei. 2017. Automatic Differentiation Variational Inference. J. Mach. Learn. Res. 18, 1 (Jan. 2017), 430–474.
[24] Mu Li, Tong Zhang, Yuqiang Chen, and Alexander J. Smola. 2014. Efficient Mini-batch Training for Stochastic Optimization. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14). ACM, New York, NY, USA, 661–670.
[25] Dong C. Liu and Jorge Nocedal. 1989. On the limited memory BFGS method for large scale optimization. Mathematical Programming 45, 1 (01 Aug 1989), 503–528.
[26] Vikash K. Mansinghka, Daniel Selsam, and Yura N. Perov. 2014. Venture: a higher-order probabilistic programming platform with programmable inference. arXiv:1404.0099
[27] Leo A. Meyerovich and Ariel S. Rabkin. 2012. Socio-PLT: Principles for Programming Language Adoption. In Proceedings of the ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward! 2012). ACM, New York, NY, USA, 39–54.
[28] Brian Milch, Bhaskara Marthi, Stuart Russell, David Sontag, Daniel L. Ong, and Andrey Kolobov. 2007. BLOG: Probabilistic Models with Unknown Objects. In Statistical Relational Learning, Lise Getoor and Ben Taskar (Eds.). MIT Press.
[29] T. Minka, J. Winn, J. Guiver, and D. Knowles. 2010. Infer.NET 2.4, Microsoft Research Cambridge.
[30] Radford M. Neal. 2012. MCMC using Hamiltonian dynamics. Published as Chapter 5 of the Handbook of Markov Chain Monte Carlo, 2011. arXiv:1206.1901
[31] Martin Odersky. 2006. The Scala Experiment: Can We Provide Better Language Support for Component Systems?. In Conference Record of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '06). ACM, New York, NY, USA, 166–167.
[32] Brooks Paige and Frank Wood. 2014. A compilation target for probabilistic programming languages. In Proceedings of The 31st International Conference on Machine Learning. 1935–1943.
[33] B. Paige, F. Wood, A. Doucet, and Y. W. Teh. 2014. Asynchronous Anytime Sequential Monte Carlo. In Advances in Neural Information Processing Systems.
[34] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. In NIPS Autodiff Workshop.
[35] Yura N. Perov. 2016. Applications of Probabilistic Programming (Master's thesis, 2015). CoRR abs/1606.00075 (2016).
[36] Avi Pfeffer. 2009. Figaro: An object-oriented probabilistic programming language. Charles River Analytics Technical Report 137 (2009), 96.
[37] Tom Rainforth. 2017. Automating Inference, Learning, and Design using Probabilistic Programming. Ph.D. Dissertation.
[38] Tom Rainforth, Christian A. Naesseth, Fredrik Lindsten, Brooks Paige, Jan-Willem van de Meent, Arnaud Doucet, and Frank Wood. 2016. Interacting Particle Markov Chain Monte Carlo. In Proceedings of the 33rd International Conference on Machine Learning (JMLR: W&CP), Vol. 48.

33rd International Conference on Machine Learning (JMLR: W&CP), [56] Bart van Merrienboer, Olivier Breuleux, Arnaud Bergeron, and Pas- Vol. 48. cal Lamblin. 2018. Automatic differentiation in ML: Where we are [39] Guido Rossum. 1995. Python Reference Manual. Technical Report. and where we should be going. In Advances in Neural Information Amsterdam, The Netherlands, The Netherlands. Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grau- [40] Sebastian Ruder. 2016. An overview of gradient descent optimization man, N. Cesa-Bianchi, and R. Garnett (Eds.). Curran Associates, Inc., algorithms. arXiv:1609.04747 8757–8767. [41] John Salvatier, Thomas V. Wiecki, and Christopher Fonnesbeck. 2016. [57] Bart van Merrienboer, Dan Moldovan, and Alexander Wiltschko. 2018. Probabilistic programming in Python using PyMC3. PeerJ Computer Tangent: Automatic differentiation using source-code transformation Science 2 (apr 2016), e55. for dynamically typed array programming. In Advances in Neural In- [42] Adam Scibior, Zoubin Ghahramani, and Andrew D. Gordon. 2015. formation Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, Practical probabilistic programming with monads. In Proceedings of K. Grauman, N. Cesa-Bianchi, and R. Garnett (Eds.). Curran Asso- the 8th ACM SIGPLAN Symposium on Haskell, Haskell 2015, Vancou- ciates, Inc., 6256–6265. ver, BC, Canada, September 3-4, 2015. 165–176. [58] Wikipedia. 2019. Probabilistic programming language — Wikipedia, [43] Eli Sennesh, Adam Scibior, Hao Wu, and Jan-Willem van de The Free Encyclopedia. [Online; accessed 16-April-2019]. Meent. 2018. Composing Modeling and Inference Operations with [59] David Wingate, Andreas Stuhlmüller, and Noah D. Goodman. 2011. Probabilistic Program Combinators. CoRR abs/1811.05965 (2018). Lightweight Implementations of Probabilistic Programming Lan- arXiv:1811.05965 hp://arxiv.org/abs/1811.05965 guages Via Transformational Compilation. In Proceedings of the 14th [44] Stan Development Team. 2014. Stan: A C++ Library for Probability Artificial Intelligence and Statistics. and Sampling, Version 2.4. (2014). [60] David Wingate and Theophane Weber. 2013. Automated variational [45] Stan Development Team. 2018. Stan Modeling Language User’s Guide inference in probabilistic programming. arXiv:1301.1299 and Reference Manual, Version 2.18.0. hp://mc-stan.org/ [61] John Winn, Christopher M. Bishop, Thomas Diethe, and Yordan Za- [46] S. Staton, H. Yang, C. Heunen, O. Kammar, and F. Wood. 2016. Se- ykov. 2019. Model Based Machine Learning. hp://mbmlbook.com/ mantics for probabilistic programming: higher-order functions, con- electronic; retrieved 2019/4/21. tinuous distributions, and soft constraints. In Thirty-First Annual [62] Frank Wood, Jan-Willem van de Meent, and Vikash Mansinghka. 2014. ACM/IEEE Symposium on Logic In Computer Science. A New Approach to Probabilistic Programming Inference. In Artificial [47] Sam Staton, Hongseok Yang, Frank Wood, Chris Heunen, and Ohad Intelligence and Statistics. Kammar. 2016. Semantics for Probabilistic Programming: Higher- [63] Lingfeng Yang, Pat Hanrahan, and Noah D Goodman. 2014. Gener- order Functions, Continuous Distributions, and Soft Constraints. In ating Efficient MCMC Kernels from Probabilistic Programs. In Pro- Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in ceedings of the Seventeenth International Conference on Artificial Intel- Computer Science (LICS ’16). ACM, New York, NY, USA, 525–534. ligence and Statistics. 1068–1076. 
hps://doi.org/10.1145/2933575.2935313 [48] Andreas Stefik and Stefan Hanenberg. 2014. The Programming Lan- guage Wars: Questions and Responsibilities for the Programming Lan- guage Community. In Proceedings of the 2014 ACM International Sym- posium on New Ideas, New Paradigms, and Reflections on Programming & Software (Onward! 2014). ACM, New York, NY, USA, 283–299. [49] Q. Sun, A. Laddha, and D. Batra. 2015. Active learning for structured probabilistic models with histogram approximation. In IEEE Confer- ence on Computer Vision and Pattern Recognition (CVPR). 3612–3621. [50] The Go team. 2009. The Go programming language. hp://golang.org/. [51] David Tolpin, Jan-Willem van de Meent, Hongseok Yang, and Frank Wood. 2016. Design and Implementation of Probabilistic Program- ming Language Anglican. In Proceedings of the 28th Symposium on the Implementation and Application of Functional Programming Lan- guages (IFL 2016). ACM, New York, NY, USA, Article 6, 12 pages. [52] Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, and David M. Blei. 2017. Deep probabilistic program- ming. In International Conference on Learning Representations. [53] Jan-Willem van de Meent, Brooks Paige, David Tolpin, and Frank Wood. 2016. Black-Box Policy Search with Probabilistic Programs. In Proceedings of the 19th International Conference on Artificial Intelli- gence and Statistics, AISTATS 2016, Cadiz, Spain, May 9-11, 2016. 1195– 1204. [54] Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, and Frank Wood. 2018. An Introduction to Probabilistic Programming. arXiv:1809.10756 [55] Jan-Willem van de Meent, Hongseok Yang, Vikash Mansinghka, and Frank Wood. 2015. Particle Gibbs with Ancestor Sampling for Probabilistic Programs. In Artificial Intelligence and Statistics. arXiv:1501.06769 Onward! 2019, October 20-25, 2019, Athens, Greece David Tolpin

A Latent Dirichlet Allocation Model

Infergo:

type LDAModel struct {
    K     int       // num topics
    V     int       // num words
    M     int       // num docs
    N     int       // total word instances
    Word  []int     // word n
    Doc   []int     // doc for word n
    Alpha []float64 // topic prior
    Beta  []float64 // word prior
}

func (m *LDAModel) Observe(x []float64) float64 {
    ll := 0.0

    // Regularize the parameter vector
    ll += Normal.Logps(0, 1, x...)

    // Destructure parameters
    theta := make([][]float64, m.M)
    m.FetchSimplices(&x, m.K, theta)
    phi := make([][]float64, m.K)
    m.FetchSimplices(&x, m.V, phi)

    // Impose priors
    ll += Dirichlet{m.K}.Logps(m.Alpha, theta...)
    ll += Dirichlet{m.V}.Logps(m.Beta, phi...)

    // Condition on observations
    gamma := make([]float64, m.K)
    for in := 0; in != m.N; in++ {
        for ik := 0; ik != m.K; ik++ {
            gamma[ik] = math.Log(theta[m.Doc[in]-1][ik]) +
                math.Log(phi[ik][m.Word[in]-1])
        }
        ll += D.LogSumExp(gamma)
    }
    return ll
}

func (m *LDAModel) FetchSimplices(
    px *[]float64,
    k int,
    simplices [][]float64,
) {
    for i := range simplices {
        simplices[i] = make([]float64, k)
        D.SoftMax(model.Shift(px, k), simplices[i])
    }
}

Stan:

data {
    int K;           // num topics
    int V;           // num words
    int M;           // num docs
    int N;           // total word instances
    int w[N];        // word n
    int doc[N];      // doc for word n
    vector[K] alpha; // topic prior
    vector[V] beta;  // word prior
}

parameters {
    simplex[K] theta[M]; // topic dist for doc m
    simplex[V] phi[K];   // word dist for topic k
}

model {
    for (m in 1:M)
        theta[m] ~ dirichlet(alpha); // prior
    for (k in 1:K)
        phi[k] ~ dirichlet(beta);    // prior

    for (n in 1:N) {
        real gamma[K];
        for (k in 1:K)
            gamma[k] <- log(theta[doc[n],k])
                + log(phi[k,w[n]]);
        increment_log_prob(log_sum_exp(gamma));
    }
}
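The listings above define the models only. As a hypothetical usage sketch, not part of the original appendix, the Infergo model can be instantiated on a small made-up corpus and its log posterior evaluated directly through Observe; the corpus, hyperparameters, and main function below are illustrative assumptions, and in practice the model instance and parameter vector would be handed to one of Infergo's inference algorithms instead.

package main

import "fmt"

// Assumes the LDAModel type and its helpers from the listing above
// are available in the same package, with the imports they require.
func main() {
    m := &LDAModel{
        K:     2,                  // topics
        V:     3,                  // vocabulary size
        M:     2,                  // documents
        N:     4,                  // total word instances
        Word:  []int{1, 3, 2, 1},  // 1-based word ids
        Doc:   []int{1, 1, 2, 2},  // 1-based document ids
        Alpha: []float64{1, 1},    // symmetric topic prior
        Beta:  []float64{1, 1, 1}, // symmetric word prior
    }

    // Observe expects M unconstrained K-vectors (document-topic
    // distributions) followed by K unconstrained V-vectors
    // (topic-word distributions); FetchSimplices maps each slice
    // to a simplex through SoftMax.
    x := make([]float64, m.M*m.K+m.K*m.V)

    // Log posterior at the all-zero point, which corresponds to
    // uniform topic and word distributions.
    fmt.Println(m.Observe(x))
}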