Bayesian Network Structure Learning for the Uncertain Experimentalist with Applications to Network Biology
16 pages, PDF, 1020 KB
Recommended publications
Variational Message Passing
Journal of Machine Learning Research 6 (2005) 661–694. Submitted 2/04; Revised 11/04; Published 4/05.
John Winn and Christopher M. Bishop, Microsoft Research Cambridge, Roger Needham Building, 7 J. J. Thomson Avenue, Cambridge CB3 0FB, U.K. Editor: Tommi Jaakkola.

Abstract: Bayesian inference is now widely established as one of the principal foundations for machine learning. In practice, exact inference is rarely possible, and so a variety of approximation techniques have been developed, one of the most widely used being a deterministic framework called variational inference. In this paper we introduce Variational Message Passing (VMP), a general purpose algorithm for applying variational inference to Bayesian networks. Like belief propagation, VMP proceeds by sending messages between nodes in the network and updating posterior beliefs using local operations at each node. Each such update increases a lower bound on the log evidence (unless already at a local maximum). In contrast to belief propagation, VMP can be applied to a very general class of conjugate-exponential models because it uses a factorised variational approximation. Furthermore, by introducing additional variational parameters, VMP can be applied to models containing non-conjugate distributions. The VMP framework also allows the lower bound to be evaluated, and this can be used both for model comparison and for detection of convergence. Variational message passing has been implemented in the form of a general purpose inference engine called VIBES ("Variational Inference for BayEsian networkS") which allows models to be specified graphically and then solved variationally without recourse to coding.
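The coordinate-wise updates that VMP automates can be illustrated on the textbook conjugate example of a Gaussian with unknown mean and precision. The following is a minimal mean-field sketch under made-up toy data and priors, not the VIBES engine itself; each sweep increases the evidence lower bound mentioned in the abstract.

```python
import random

# Mean-field variational inference for x_i ~ N(mu, 1/tau) with the conjugate
# Normal-Gamma prior: mu | tau ~ N(mu0, 1/(lam0*tau)), tau ~ Gamma(a0, b0).
# Hypothetical toy example in the spirit of VMP's local updates.
random.seed(0)
data = [random.gauss(5.0, 2.0) for _ in range(2000)]  # true mean 5, variance 4
n = len(data)
s1 = sum(data)
s2 = sum(x * x for x in data)

mu0, lam0, a0, b0 = 0.0, 1e-3, 1e-3, 1e-3  # broad priors (assumed values)

e_tau = 1.0  # initial guess for E[tau]
for _ in range(50):  # each sweep increases the lower bound on log evidence
    # Update q(mu) = N(mu_n, 1/lam_n)
    mu_n = (lam0 * mu0 + s1) / (lam0 + n)
    lam_n = (lam0 + n) * e_tau
    # Update q(tau) = Gamma(a_n, b_n) using expectations under q(mu)
    a_n = a0 + (n + 1) / 2.0
    e_sq = (s2 - 2 * mu_n * s1 + n * (mu_n ** 2 + 1 / lam_n)
            + lam0 * ((mu_n - mu0) ** 2 + 1 / lam_n))
    b_n = b0 + 0.5 * e_sq
    e_tau = a_n / b_n

# Posterior mean of mu and implied variance 1/E[tau] should be near 5 and 4.
print(mu_n, 1.0 / e_tau)
```

The two updates are exactly the "messages" a VMP implementation would pass between the mu and tau nodes of this two-node network.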
Latent Dirichlet Allocation, Explained and Improved Upon for Applications in Marketing Intelligence
Iris Koks, Delft University of Technology. Thesis to obtain the degree of Master of Science, defended publicly on Friday, March 22, 2019. Student number: 4299981. Project duration: August 2018 – March 2019. Thesis committee: Prof. dr. ir. Geurt Jongbloed (TU Delft, supervisor), Dr. Dorota Kurowicka (TU Delft), Drs. Jan Willem Bikker PDEng (CQM). An electronic version of this thesis is available at http://repository.tudelft.nl/.

"Science, my boy, is made of errors, but they are errors worth making, because little by little they lead to the truth." - Jules Verne, "Voyage au centre de la Terre"

Abstract: In today's digital world, customers give their opinions on products they have purchased online in the form of reviews. The industry is interested in these reviews and wants to know which topics their clients write about, so that producers can improve products on specific aspects. Topic models can extract the main topics from large data sets such as review data. One of these is Latent Dirichlet Allocation (LDA). LDA is a hierarchical Bayesian topic model that retrieves topics from text data sets in an unsupervised manner. The method assumes that a topic is assigned to each word in a document (review), and aims to retrieve the topic distribution for each document and a word distribution for each topic.
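The generative assumption described here, one topic assigned to each word token, can be sketched with a tiny collapsed Gibbs sampler. This is an illustrative toy, not the inference derived in the thesis; the review-like documents and hyperparameters are made up.

```python
import random
from collections import defaultdict

# Collapsed Gibbs sampling for LDA on a toy "review" corpus with two themes.
random.seed(1)
docs = [
    "battery screen battery charger screen".split(),
    "battery charger battery battery screen".split(),
    "delivery package delivery courier package".split(),
    "package courier delivery delivery package".split(),
]
K, alpha, beta = 2, 0.1, 0.01          # topics and Dirichlet hyperparameters
vocab = sorted({w for d in docs for w in d})
V = len(vocab)

ndk = [[0] * K for _ in docs]               # doc-topic counts
nkw = [defaultdict(int) for _ in range(K)]  # topic-word counts
nk = [0] * K                                # topic totals
z = []                                      # z[d][i]: topic of token i in doc d
for d, doc in enumerate(docs):
    zs = []
    for w in doc:
        t = random.randrange(K)
        zs.append(t)
        ndk[d][t] += 1
        nkw[t][w] += 1
        nk[t] += 1
    z.append(zs)

for _ in range(200):  # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]  # remove token's current assignment from the counts
            ndk[d][t] -= 1
            nkw[t][w] -= 1
            nk[t] -= 1
            # Conditional p(z=k | rest) ∝ (n_dk + alpha)(n_kw + beta)/(n_k + V*beta)
            probs = [(ndk[d][k] + alpha) * (nkw[k][w] + beta) / (nk[k] + V * beta)
                     for k in range(K)]
            r = random.random() * sum(probs)
            t = K - 1
            cum = 0.0
            for k in range(K):
                cum += probs[k]
                if r <= cum:
                    t = k
                    break
            z[d][i] = t
            ndk[d][t] += 1
            nkw[t][w] += 1
            nk[t] += 1

# Most probable word per topic: the two review themes should separate.
tops = [max(vocab, key=lambda w: nkw[k][w]) for k in range(K)]
print(tops)
```

After the sweeps, one topic concentrates on the battery/screen vocabulary and the other on the delivery/package vocabulary, which is the per-topic word distribution the abstract describes.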
Bayesian Unsupervised Labeling of Web Document Clusters
By Ting Liu. A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Computer Science. Waterloo, Ontario, Canada, 2011.

Abstract: Information technologies have recently led to a surge of electronic documents in the form of emails, webpages, blogs, news articles, etc. To help users decide which documents may be interesting to read, it is common practice to organize documents by categories/topics. A wide range of supervised and unsupervised learning techniques already exist for automated text classification and text clustering. However, supervised learning requires a training set of documents already labeled with topics/categories, which is not always readily available. In contrast, unsupervised learning techniques do not require labeled documents, but assigning a suitable category to each resulting cluster remains a difficult problem. The state of the art consists of extracting keywords based on word frequency (or related heuristics). In this thesis, we improve the extraction of keywords for unsupervised labeling of document clusters by designing a Bayesian approach based on topic modeling. More precisely, we describe an approach that uses a large side corpus to infer a language model that implicitly encodes the semantic relatedness of different words.
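The word-frequency baseline the thesis improves on can be sketched in a few lines: label each cluster with the words that are frequent inside it but rare in the other clusters. The cluster data below are hypothetical, and the scoring is a crude tf-idf heuristic, not the thesis's Bayesian approach.

```python
import math
from collections import Counter

# Frequency-based cluster labeling: score each word by term frequency in the
# cluster times inverse cluster frequency, then keep the top-k as the label.
clusters = {
    "c1": "stock market shares trading market stock price",
    "c2": "match goal team league goal match player",
}
tokenized = {c: text.split() for c, text in clusters.items()}

docfreq = Counter()  # in how many clusters does each word occur?
for words in tokenized.values():
    docfreq.update(set(words))

def label(cluster, k=2):
    tf = Counter(tokenized[cluster])
    score = {w: n * math.log(len(clusters) / docfreq[w]) for w, n in tf.items()}
    return [w for w, _ in sorted(score.items(), key=lambda x: -x[1])[:k]]

print(label("c1"), label("c2"))
```

Such labels are only as good as raw counts: synonyms and semantically related words get no credit, which is exactly the gap the side-corpus language model is meant to close.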
A Latent Dirichlet Allocation Method for Selectional Preferences
Latent Variable Models of Lexical Semantics. Alan Ritter.

Agenda:
- Topic modeling tutorial
- Quick overview of lexical semantics
- Examples and practical tips:
  - Selectional preferences: argument types for relations extracted from the web
  - Event type induction: induce types for events extracted from social media
  - Weakly supervised named entity classification: classify named entities extracted from social media into types such as PERSON, LOCATION, PRODUCT, etc.

Topic Modeling. Useful references:
- "Parameter Estimation for Text Analysis", Gregor Heinrich, http://www.arbylon.net/publications/text-est.pdf
- "Gibbs Sampling for the Uninitiated", Philip Resnik and Eric Hardisty, http://www.cs.umd.edu/~hardisty/papers/gsfu.pdf
- "Bayesian Inference with Tears", Kevin Knight, http://www.isi.edu/natural-language/people/bayes-with-tears.pdf

Topic modeling, motivation: uncover latent structure and obtain a lower-dimensional representation. It is probably a good fit if you have a large amount of grouped data that is unlabeled, and you are not even sure what the labels should be.

Terminology: the slides use the terms documents / words / topics, but these models apply to any kind of grouped data: images / pixels / objects, users / friends / groups, music / notes / genres, archaeological sites / artifacts / building types.

Estimating parameters from text: consider a document D as a bag of N independent word draws, so the likelihood is

P(D | θ) = ∏_{j=1}^{N} P(w_j | θ) = ∏_j θ_{w_j} = ∏_w θ_w^{N_w},

where N_w is the number of occurrences of word w. How to estimate θ? By Bayes' rule,

P(θ | D) = P(D | θ) P(θ) / P(D),

i.e. posterior ∝ likelihood × prior.
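A minimal sketch of the two estimators this suggests: the maximum-likelihood θ̂_w = N_w / N, and the posterior mean under a symmetric Dirichlet prior, which simply adds a pseudo-count α to every word. The toy document and the value of α are made up.

```python
from collections import Counter

# MLE vs. Dirichlet-smoothed posterior mean for the unigram parameters theta,
# matching the bag-of-words likelihood P(D|theta) = prod_w theta_w^{N_w}.
doc = "the cat sat on the mat the cat".split()
counts = Counter(doc)
n = len(doc)
vocab = sorted(counts)
alpha = 1.0  # symmetric Dirichlet pseudo-count (assumed)

# theta_hat_w = N_w / n  (maximizes the likelihood above)
theta_mle = {w: counts[w] / n for w in vocab}
# E[theta_w | D] = (N_w + alpha) / (n + alpha * V)  (posterior mean)
theta_post = {w: (counts[w] + alpha) / (n + alpha * len(vocab)) for w in vocab}

print(theta_mle["the"], theta_post["the"])
```

The prior pulls every estimate toward the uniform distribution, so unseen or rare words no longer get probability zero, which is the practical reason the Bayesian posterior is preferred over the raw MLE in topic models.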
Lecture Notes: Mathematics for Inference and Machine Learning
Imperial College London, Department of Computing. Instructors: Marc Deisenroth, Stefanos Zafeiriou. Version: January 13, 2017.

Contents:
1 Linear Regression
  1.1 Problem Formulation
  1.2 Probabilities
    1.2.1 Means and Covariances (sum of random variables; affine transformation)
    1.2.2 Statistical Independence
    1.2.3 Basic Probability Distributions (uniform, Bernoulli, binomial, beta, Gaussian, gamma, Wishart)
    1.2.4 Conjugacy
  1.3 Probabilistic Graphical Models (from joint distributions to graphs; from graphs to joint distributions; further reading)
  1.4 Vector Calculus (partial differentiation and gradients; Jacobian; linearization and Taylor series; basic rules of partial differentiation; useful identities for computing gradients; chain rule)
  1.5 Parameter Estimation
    1.5.1 Maximum Likelihood Estimation (closed-form solution; iterative solution; maximum likelihood estimation with features; properties)
    1.5.2 Overfitting
    1.5.3 Regularization
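As a taste of the parameter-estimation material (the closed-form maximum likelihood solution of Section 1.5.1.1), here is a minimal sketch of the normal-equations MLE for 1-D linear regression with a bias term. The simulated data and the true parameters (slope 2, intercept -1) are assumptions for illustration.

```python
import random

# Closed-form MLE for y = w*x + b + noise: solve the 2x2 normal equations
#   [[sum x^2, sum x], [sum x, n]] @ [w, b]^T = [sum x*y, sum y]^T
random.seed(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x - 1.0 + random.gauss(0, 0.1) for x in xs]

n = len(xs)
sx, sy = sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

det = sxx * n - sx * sx  # determinant of the normal-equations matrix
w = (n * sxy - sx * sy) / det
b = (sxx * sy - sx * sxy) / det
print(w, b)  # close to the true slope and intercept
```

With Gaussian noise, this least-squares solution coincides with the maximum likelihood estimate, which is exactly the connection the lecture notes' Section 1.5.1 develops.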