Tutorial on Probabilistic Programming with Pymc3

Tutorial on Probabilistic Programming with Pymc3

185.A83 Machine Learning for Health Informatics 2017S, VU, 2.0 h, 3.0 ECTS Tutorial 02 - 04.04.2017 Tutorial on Probabilistic Programming with PyMC3 [email protected] http://hci-kdd.org/machine-learning-for-health-informatics-course Holzinger Group hci-kdd.org 1 MAKE Health T01 Schedule ▪ 01. Introduction to Probabilistic Programming ▪ 02. PyMC3 ▪ 03. linear regression – the Bayesian way ▪ 04. generalized linear models with PyMC3 Holzinger Group hci-kdd.org 2 MAKE Health T01 01. Introduction and Overview ▪ Probabilistic Programming (PP) ▪ allows automatic Bayesian inference ▪ on complex, user-defined probabilistic models ▪ utilizing “Markov chain Monte Carlo” (MCMC) sampling ▪ PyMC3 ▪ a PP framework ▪ compiles probabilistic programs on-the-fly to C ▪ allows model specification in Python code Salvatier J, Wiecki TV, Fonnesbeck C. (2016) Probabilistic programming in Python using PyMC3. PeerJ Computer Science 2:e55 https://doi.org/10.7717/peerj-cs.55 Holzinger Group hci-kdd.org 3 MAKE Health T01 Properties of Probabilistic Programs ▪ IS NOT ▪ Software that behaves probabilistically ▪ General programming language ▪ IS ▪ Toolset for statistical / Bayesian modeling ▪ Framework to describe probabilistic models ▪ Tool to perform (automatic) inference ▪ Closely related to graphical models and Bayesian networks ▪ Extension to basic language (e.g. PyMC3 for Python) “does in 50 lines of code what used to take thousands” Kulkarni, T. D., Kohli, P., Tenenbaum, J. B. & Mansinghka, V. Picture: A probabilistic programming language for scene perception. in Proceedings of the ieee conference on computer vision and pattern recognition 4390–4399 (2015). Holzinger Group hci-kdd.org 4 MAKE Health T01 Probabilistic Programs ▪ Machine learning algorithms / models often a black box PP “open box” ▪ Simple approach 1. Define and build model 2. Automatic inference 3. Interpretation of results not much equations anymore! ▪ “inference”: guess latent variables based on observations, using e.g. MCMC Holzinger Group hci-kdd.org 5 MAKE Health T01 Markov chain Monte Carlo (MCMC) ▪ Markov chain ▪ Stochastic process ▪ “memoryless” (Markov property) ▪ Conditional probability distribution of future states depends only upon the present state ▪ Sampling from probability distributions ▪ State of chain sample of distribution ▪ Quality improves with number of steps ▪ Class of algorithms / methods ▪ Numerical approximation of complex integrals Holzinger Group hci-kdd.org 6 MAKE Health T01 Markov chain Monte Carlo (MCMC) (animated) Holzinger Group hci-kdd.org 7 MAKE Health T01 Markov chain Monte Carlo (MCMC) ▪ Metropolis-Hastings: random walk ▪ Gibbs-sampling: popular, complex, no tuning ▪ PyMC3 ▪ No-U-Turn Sampler (NUTS) ▪ Hamiltonian Monte Carlo (HMC) ▪ Metropolis ▪ Slice ▪ BinaryMetropolis Holzinger Group hci-kdd.org 8 MAKE Health T01 Prior & posterior distributions ▪ Quantity of interest: 휃 (theta) ▪ Prior = probability distribution ▪ Uncertainty before observation: p(휃) ▪ Belief in absence of data ▪ Posterior = probability distribution ▪ Uncertainty after observation X: p(휃|X) ▪ Likelihood: p(푋|휃) 푃표푠푡푒푟표푟 ∝ 퐿푘푒푙ℎ표표푑 × 푃푟표푟 푝 푥 휃 푝(휃) 푝 휃 푥 = 푝(푥) ▪ Calculating posterior from observation and prior = updating beliefs Coin toss example Holzinger Group hci-kdd.org 9 MAKE Health T01 PyMC3 ▪ Python 3 package / framework ▪ Probabilistic machine learning ▪ specification and fitting of Bayesian models ▪ Inference by MCMC & variational fitting algorithms ▪ Performance enhancements: cross-compilation to C (Python numerical computation package “Theano”) ▪ Accessible, natural syntax ▪ Various capabilities: GPU computing, sampling backends, object-oriented, extendable design Holzinger Group hci-kdd.org 10 MAKE Health T01 Live presentation ▪ PyMC3 syntax introduction ▪ linear regression – the Bayesian way ▪ generalized linear models with PyMC3 Holzinger Group hci-kdd.org 11 MAKE Health T01 Thank you! Holzinger Group hci-kdd.org 12 MAKE Health T01 Sample Questions ▪ What is probabilistic programming? What type of problems can be solved? ▪ What are the typical steps in a probabilistic program? ▪ What is inference? ▪ What is PyMC3? ▪ What is the posterior distribution? Holzinger Group hci-kdd.org 13 MAKE Health T01 Appendix ▪ Main sources ▪ Salvatier J, Wiecki TV, Fonnesbeck C. (2016) Probabilistic programming in Python using PyMC3. PeerJ Computer Science 2:e55 https://doi.org/10.7717/peerj-cs.55 ▪ Davidson-Pilon, C. Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference. (Addison-Wesley Professional, 2015). ▪ PyMC3’s documentation http://pymc-devs.github.io/pymc3/index.html Holzinger Group hci-kdd.org 14 MAKE Health T01.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    14 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us