Interconnection of Restricted Boltzmann Machine Method

Total Pages: 16

File Type: pdf, Size: 1020 KB

Interconnection of Restricted Boltzmann Machine Method

BRNO UNIVERSITY OF TECHNOLOGY
FACULTY OF MECHANICAL ENGINEERING
INSTITUTE OF PHYSICAL ENGINEERING

INTERCONNECTION OF RESTRICTED BOLTZMANN MACHINE METHOD WITH STATISTICAL PHYSICS AND ITS IMPLEMENTATION IN THE PROCESSING OF SPECTROSCOPIC DATA

MASTER'S THESIS
AUTHOR: Bc. Jakub Vrábel
SUPERVISOR: Ing. Pavel Pořízka, Ph.D.
BRNO 2019

Master's Thesis Assignment
Institute: Institute of Physical Engineering
Student: Bc. Jakub Vrábel
Degree programme: Applied Sciences in Engineering
Field of study: Physical Engineering and Nanotechnology
Supervisor: Ing. Pavel Pořízka, Ph.D.
Academic year: 2018/19

Pursuant to Act No. 111/1998 on higher education institutions and the BUT Study and Examination Rules, the director of the Institute assigns you the following master's thesis topic: Description of the Restricted Boltzmann Machine method in relation to statistical physics and its subsequent use in the processing of spectroscopic data.

Brief description of the problem: Machine learning algorithms (above all so-called deep learning) are thoroughly permeated by physical principles. Physics serves here not only as an inspiration for their construction but also as a tool for understanding how these learning models work. The Restricted Boltzmann Machine (RBM) is one of the unsupervised artificial neural network algorithms. Historically, Hinton drew major attention to neural networks in 2006 by introducing deep learning (Hinton 2006). Applications of RBMs range from dimensionality reduction through classification to feature learning and beyond. The processing of spectroscopic data is a research subject on the boundary of physics, mathematics, and statistics. Multidimensional data, measured for example by Laser-Induced Breakdown Spectroscopy (LIBS), pose a demanding task for classical linear algorithms in terms of both volume and complex structure. For this reason, the use of neural networks is widely promoted in spectroscopy as well. Optimizing an RBM-based learning algorithm for laser spectroscopy applications is thus a unique and so far unexplored approach.

Thesis objectives:
1. A survey of methods originating in statistical physics and their similarity to deep learning, especially RBM.
2. Direct application of RBM to dimensionality reduction of spectroscopic data (emission spectra from LIBS).
3. Comparison of the capabilities of RBM and commonly used methods (PCA, ...).

Recommended literature:
HINTON, G. E. Reducing the Dimensionality of Data with Neural Networks. Science 313(5786), 504-507 (2006).
MIZIOLEK, A. W., PALLESCHI, V., and SCHECHTER, I. Laser Induced Breakdown Spectroscopy. Cambridge: Cambridge University Press (2006).

The deadline for submission of the master's thesis is set by the schedule of the academic year 2018/19. In Brno. prof. RNDr. Tomáš Šikola, CSc., director of the institute; doc. Ing. Jaroslav Katolický, Ph.D., dean of the faculty.

ABSTRACT
In this work, connections between statistical physics and machine learning are studied, with emphasis on the most basic principles and their implications.
Also, the general properties of spectroscopic data are revealed and used to improve the automated processing of the data. In the beginning, the partition function of the Boltzmann distribution is derived and used to study the Ising model with the mean field theory approach. Later, the equivalence between the Ising model and the Hopfield network (a machine learning model) is shown, along with a general introduction to machine learning. At the end of the theoretical part, the Restricted Boltzmann Machine (RBM) is obtained from the Hopfield network. The suitability of applying the RBM to the processing of spectroscopic data is discussed and demonstrated by using the RBM for dimensionality reduction of the data. The results are compared to a standard tool (Principal Component Analysis), and possible further improvements are discussed.

KEYWORDS
Machine Learning, LIBS, Spectroscopic Data, Artificial Neural Networks, Deep Learning, Restricted Boltzmann Machine, RBM, Dimension Reduction, Statistical Physics.

VRÁBEL, Jakub. Interconnection of Restricted Boltzmann machine method with statistical physics and its implementation in the processing of spectroscopic data: master's thesis. Brno: Brno University of Technology, Faculty of Mechanical Engineering, Institute of Physical Engineering, 2019. 85 p. Supervised by Ing. Pavel Pořízka, Ph.D.

DECLARATION
I declare that I have elaborated my master's thesis on the theme of "Interconnection of Restricted Boltzmann machine method with statistical physics and its implementation in the processing of spectroscopic data" independently, under the supervision of the master's thesis supervisor and with the use of technical literature and other sources of information, all of which are quoted in the thesis and detailed in the list of literature at the end of the thesis. As the author of the master's thesis I furthermore declare that, concerning the creation of this master's thesis, I have not infringed any copyright. In particular, I have not unlawfully encroached on anyone's personal copyright, and I am fully aware of the consequences of breaking § 11 and following of the Copyright Act No. 121/2000 Coll., including the possible criminal-law consequences arising from § 152 of the Criminal Act No. 140/1961 Coll.

Brno .................. (author's signature)

Acknowledgement
In the first place, I would like to thank my parents and family for their support during my studies.
Also, I want to thank my supervisor, Dr. Pavel Pořízka, for his suggestions on the thesis and for introducing me to scientific work. Furthermore, I am thankful to colleagues (DP, EK, JK) from the group for fruitful discussions and cooperation on various projects. Bc. Jakub Vrábel

CONTENTS
Introduction
1 Statistical physics
  1.1 Liouville's theorem
  1.2 Energy
    1.2.1 Density of states
    1.2.2 Gibbs canonical ensemble
    1.2.3 Partition function
    1.2.4 Boltzmann distribution
    1.2.5 Grand canonical ensemble
    1.2.6 Statistical description of identical particles
2 Statistical physics of fields
  2.1 Lattice models
    2.1.1 Ising model
  2.2 Mean Field Theory (MFT)
    2.2.1 Variational Free Energy Minimization
3 Spectroscopic Data
  3.1 Laser-Induced Breakdown Spectroscopy (LIBS)
4 Machine Learning
  4.1 Model and Cost Function
    4.1.1 Gradient Descent
    4.1.2 Maximum Likelihood Estimation (MLE)
    4.1.3 Logistic Regression
  4.2 Artificial Neural Networks (ANN)
    4.2.1 Deep Learning
  4.3 Regularization techniques
  4.4 Supervised Learning
  4.5 Unsupervised Learning
    4.5.1 Principal Component Analysis (PCA)
    4.5.2 Dimension and Clustering
5 Interconnection of ML with Statistical Physics
  5.1 Hopfield Networks
  5.2 Boltzmann Machines
  5.3 Restricted Boltzmann Machine (RBM)
6 Dimensionality reduction of the spectroscopic data
  6.1 Samples, experiment, data
    6.1.1 Measurement
    6.1.2 Application of the RBM
    6.1.3 Further plans
7 Conclusion
List of appendices
A Backpropagation method
B LIBS: physics, instrumentation and applications
  B.1 Laser-Matter Interaction
  B.2 Laser Induced Plasma (LIP)
  B.3 LIP Emission
    B.3.1 Spectral Lines
    B.3.2 Plasma temperature estimation
  B.4 Instrumentation and Experiments
    B.4.1 Basic data processing
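The abstract above describes reducing the dimensionality of LIBS spectra with an RBM and benchmarking the result against PCA. A minimal sketch of such a pipeline, using scikit-learn's BernoulliRBM as the reduction model; the thesis's own implementation and data are not shown here, so the random matrix below is a hypothetical stand-in for real spectra:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for a LIBS data set: rows are spectra,
# columns are wavelength channels (intensities).
rng = np.random.default_rng(0)
spectra = rng.random((500, 2048))

# BernoulliRBM expects inputs in [0, 1], so rescale the channels first
X = MinMaxScaler().fit_transform(spectra)

# RBM: the hidden-unit activations serve as the reduced representation
rbm = BernoulliRBM(n_components=10, learning_rate=0.05, n_iter=20, random_state=0)
Z_rbm = rbm.fit_transform(X)

# PCA baseline with the same target dimension
Z_pca = PCA(n_components=10).fit_transform(X)
print(Z_rbm.shape, Z_pca.shape)   # (500, 10) (500, 10)
```

The design mirrors the thesis objectives: the RBM's hidden activations play the role that principal-component scores play in the PCA baseline, so the two ten-dimensional representations can be compared directly.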
Recommended publications
  • Neural Networks for Machine Learning, Lecture 12a: The Boltzmann Machine Learning Algorithm
    Neural Networks for Machine Learning, Lecture 12a: The Boltzmann Machine learning algorithm. Geoffrey Hinton, with Nitish Srivastava, Kevin Swersky, Tijmen Tieleman, and Abdel-rahman Mohamed. The goal of learning: we want to maximize the product of the probabilities that the Boltzmann machine assigns to the binary vectors in the training set. This is equivalent to maximizing the sum of the log probabilities that the Boltzmann machine assigns to the training vectors. It is also equivalent to maximizing the probability that we would obtain exactly the N training cases if we did the following: let the network settle to its stationary distribution N different times with no external input, and sample the visible vector once each time. Why the learning could be difficult: consider a chain of units with visible units at the two ends, connected through hidden units by weights w1, w2, w3, w4, w5. If the training set consists of (1,0) and (0,1), we want the product of all the weights to be negative. So to know how to change w1 or w5, we must know w3. A very surprising fact: everything that one weight needs to know about the other weights and the data is contained in the difference of two correlations:
    $$\frac{\partial \log p(v)}{\partial w_{ij}} = \langle s_i s_j \rangle_v - \langle s_i s_j \rangle_{\text{model}}$$
    The left side is the derivative of the log probability of one training vector $v$ under the model; the first term on the right is the expected value of the product of states at thermal equilibrium when $v$ is clamped on the visible units, and the second term is the same expectation with no clamping. This gives the learning rule
    $$\Delta w_{ij} \propto \langle s_i s_j \rangle_{\text{data}} - \langle s_i s_j \rangle_{\text{model}}.$$
    Why is the derivative so simple? The energy is a linear function of the weights and states, and the probability of a global configuration at thermal equilibrium is an exponential function of its energy.
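In practice the equilibrium statistics in this rule are expensive to obtain, and for the restricted (bipartite) case they are commonly approximated by contrastive divergence. A minimal numpy sketch of a CD-1 weight update, with biases omitted for brevity; the one-step Gibbs estimate stands in for the true model correlation, which is an approximation, not the full equilibrium average the lecture describes:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, v0, lr=0.01, rng=np.random.default_rng(0)):
    """One CD-1 update for an RBM with weights W (visible x hidden), no biases."""
    # Positive phase: correlations with the data clamped on the visible units
    ph0 = sigmoid(v0 @ W)                     # p(h=1 | v0)
    pos = v0.T @ ph0                          # estimate of <v_i h_j>_data
    # Negative phase: a single Gibbs step approximates <v_i h_j>_model
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W)
    neg = v1.T @ ph1                          # estimate of <v_i h_j>_model
    return W + lr * (pos - neg) / v0.shape[0]
```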
  • A Boltzmann Machine Implementation for the D-Wave
    A Boltzmann Machine Implementation for the D-Wave. John E. Dorband, Ph.D., Dept. of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, Maryland 21250, USA. [email protected]
    Abstract: The D-Wave is an adiabatic quantum computer. It is an understatement to say that it is not a traditional computer. It can be viewed as a computational accelerator or, more precisely, a computational oracle, where one asks it a relevant question and it returns a useful answer. The question is how do you ask a relevant question and how do you use the answer it returns. This paper addresses these issues in a way that is pertinent to machine learning. A Boltzmann machine is implemented with the D-Wave, since the D-Wave is merely a hardware instantiation of a partially connected Boltzmann machine. This paper presents a prototype implementation of a 3-layered neural network where the D-Wave is used as the middle (hidden) layer of the neural network. This paper also explains how the D-Wave can be utilized in a multi-layer neural network (more than 3 layers) and one in which each layer may be multiple times the size of the D-Wave being used.
    Keywords: quantum computing; D-Wave; neural network; Boltzmann machine; chimera graph; MNIST.
    I. INTRODUCTION. The D-Wave is an adiabatic quantum computer [1][2]. Its function is to determine a set of ones and zeros (qubits) that minimize the objective function, equation (1).
    [Figure 1: Nine cells of a chimera graph; each cell contains 8 qubits.]
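Equation (1) itself is not reproduced in the excerpt; it is the usual Ising/QUBO-style objective that the D-Wave minimizes. A toy sketch of that kind of objective, with illustrative coefficients (not values from the paper), solved by brute force for a 3-qubit instance:

```python
import itertools
import numpy as np

# Toy QUBO objective of the kind an annealer minimizes:
#   E(q) = sum_i h_i q_i + sum_{i<j} J_ij q_i q_j,  q_i in {0, 1}
# Coefficients below are illustrative only.
h = np.array([0.5, -1.0, 0.2])
J = {(0, 1): -0.8, (1, 2): 0.6}

def energy(q):
    return float(h @ q) + sum(Jij * q[i] * q[j] for (i, j), Jij in J.items())

# Brute-force the minimum, feasible for this toy size
best = min(itertools.product([0, 1], repeat=3), key=lambda q: energy(np.array(q)))
print(best, energy(np.array(best)))
```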
  • Boltzmann Machine Learning with the Latent Maximum Entropy Principle
    Boltzmann Machine Learning with the Latent Maximum Entropy Principle. Shaojun Wang (University of Toronto), Dale Schuurmans (University of Waterloo), Fuchun Peng (University of Waterloo), Yunxin Zhao (University of Missouri).
    Abstract: We present a new statistical learning paradigm for Boltzmann machines based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes' maximum entropy principle and from standard maximum likelihood estimation. We demonstrate the LME principle by deriving new algorithms for Boltzmann machine parameter estimation, and show how a robust and rapidly convergent new variant of the EM algorithm can be developed. Our experiments show that estimation based on LME generally yields better results than maximum likelihood estimation when inferring models from small amounts of data.
    ... configurations of the hidden variables (Ackley et al. 1985, Welling and Teh 2003). In this paper we will focus on the key problem of estimating the parameters of a Boltzmann machine from data. There is a surprisingly simple algorithm (Ackley et al. 1985) for performing maximum likelihood estimation of the weights of a Boltzmann machine, or equivalently, for minimizing the KL divergence between the empirical data distribution and the implied Boltzmann distribution. This classical algorithm is based on a direct gradient ascent approach where one calculates (or estimates) the derivative of the likelihood function with respect to the model parameters. The simplicity and locality of this gradient ascent algorithm has attracted a great deal of interest. However, there are two significant problems with the standard Boltzmann machine training algorithm that ...
  • Stochastic Classical Molecular Dynamics Coupled to Density Functional Theory
    Brazilian Journal of Physics, vol. 29, no. 1, March 1999, p. 199. Stochastic Classical Molecular Dynamics Coupled to Density Functional Theory: Applications to Large Molecular Systems. K. C. Mundim, Institute of Physics, Federal University of Bahia, Salvador, Ba, Brazil, and D. E. Ellis, Dept. of Chemistry and Materials Research Center, Northwestern University, Evanston IL 60208, USA. Received 07 December 1998.
    A hybrid approach is described, which combines stochastic classical molecular dynamics and first-principles Density Functional theory to model the atomic and electronic structure of large molecular and solid-state systems. The stochastic molecular dynamics using Generalized Simulated Annealing (GSA) is based on nonextensive statistical mechanics and thermodynamics. Examples are given of applications in linear-chain polymers, structural ceramics, impurities in metals, and pharmacological molecule-protein interactions.
    I. Introduction. In complex materials and biophysics problems the number of degrees of freedom in nuclear and electronic coordinates is currently too large for effective treatment by purely first-principles computation. Alternative techniques which involve a hybrid mix of classical and quantum methodologies can provide a powerful tool for analysis of structure, bonding, mechanical, electrical, and spectroscopic properties. In this report we describe an implementation which has been evolved to deal specifically with protein folding, pharmacological molecule docking, impurities, defects, interfaces, grain boundaries in crystals and related problems of 'real' solids. As in any evolving scheme, there is still much room for improvement; however the guiding principles are simple: to obtain a local, chemically intuitive description of complex systems, which can be extended in a systematic way to the nanometer size scale. ... describe each of the component procedures.
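For a sense of how GSA-style optimization looks in code: SciPy's dual_annealing implements a generalized simulated annealing scheme in the same Tsallis-statistics family the paper builds on. A toy sketch on a stand-in objective (the Rastrigin function, chosen here as a hypothetical multi-minimum surface, not a quantity from the paper):

```python
import numpy as np
from scipy.optimize import dual_annealing

def energy(x):
    # Rastrigin function: many local minima, global minimum at the origin
    return float(np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x)) + 10.0 * len(x))

# Generalized simulated annealing over a 3-dimensional box
result = dual_annealing(energy, bounds=[(-5.12, 5.12)] * 3, seed=0)
print(result.x, result.fun)
```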
  • Learning and Evaluating Boltzmann Machines
    Department of Computer Science, University of Toronto, 6 King's College Rd, Toronto M5S 3G4, Canada. http://learning.cs.toronto.edu, fax: +1 416 978 1455. Copyright © Ruslan Salakhutdinov 2008. June 26, 2008. UTML TR 2008-002.
    Learning and Evaluating Boltzmann Machines. Ruslan Salakhutdinov, Department of Computer Science, University of Toronto.
    Abstract: We provide a brief overview of the variational framework for obtaining deterministic approximations or upper bounds for the log-partition function. We also review some of the Monte Carlo based methods for estimating partition functions of arbitrary Markov Random Fields. We then develop an annealed importance sampling (AIS) procedure for estimating partition functions of restricted Boltzmann machines (RBMs), semi-restricted Boltzmann machines (SRBMs), and Boltzmann machines (BMs). Our empirical results indicate that the AIS procedure provides much better estimates of the partition function than some of the popular variational-based methods. Finally, we develop a new learning algorithm for training general Boltzmann machines and show that it can be successfully applied to learning good generative models.
    1 Introduction. Undirected graphical models, also known as Markov random fields (MRFs), or general Boltzmann machines, provide a powerful tool for representing dependency structure between random variables. They have successfully been used in various application domains, including machine learning, computer vision, and statistical physics. The major limitation of undirected graphical models is the need to compute the partition function, whose role is to normalize the joint probability distribution over the set of random variables.
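A simplified sketch of AIS for an RBM partition function, assuming a uniform base distribution obtained by scaling all parameters with the inverse temperature beta (the report itself uses a fitted base-rate model, which converges faster; this stripped-down variant only illustrates the mechanics):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_pstar(v, W, b, c, beta):
    # Unnormalized log p*(v) with hidden units summed out, parameters scaled by beta
    return beta * (v @ b) + np.logaddexp(0, beta * (c + v @ W)).sum(axis=1)

def ais_log_z(W, b, c, n_runs=100, n_steps=1000, seed=0):
    rng = np.random.default_rng(seed)
    n_vis, n_hid = W.shape
    betas = np.linspace(0.0, 1.0, n_steps)
    v = (rng.random((n_runs, n_vis)) < 0.5).astype(float)   # beta=0: uniform base
    logw = np.zeros(n_runs)
    for b0, b1 in zip(betas[:-1], betas[1:]):
        # Accumulate importance weights between neighboring distributions
        logw += log_pstar(v, W, b, c, b1) - log_pstar(v, W, b, c, b0)
        # One Gibbs sweep at inverse temperature b1
        ph = sigmoid(b1 * (c + v @ W))
        h = (rng.random(ph.shape) < ph).astype(float)
        pv = sigmoid(b1 * (b + h @ W.T))
        v = (rng.random(pv.shape) < pv).astype(float)
    log_za = (n_vis + n_hid) * np.log(2.0)   # exact Z of the beta=0 base model
    m = logw.max()
    return log_za + m + np.log(np.mean(np.exp(logw - m)))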
  • Learning Algorithms for the Classification Restricted Boltzmann Machine
    Journal of Machine Learning Research 13 (2012) 643-669. Submitted 6/11; Revised 11/11; Published 3/12. Learning Algorithms for the Classification Restricted Boltzmann Machine. Hugo Larochelle [email protected], Université de Sherbrooke, 2500, boul. de l'Université, Sherbrooke, Québec, Canada, J1K 2R1. Michael Mandel [email protected], Razvan Pascanu [email protected], Yoshua Bengio [email protected], Département d'informatique et de recherche opérationnelle, Université de Montréal, 2920, chemin de la Tour, Montréal, Québec, Canada, H3T 1J8. Editor: Daniel Lee.
    Abstract: Recent developments have demonstrated the capacity of restricted Boltzmann machines (RBM) to be powerful generative models, able to extract useful features from input data or construct deep artificial neural networks. In such settings, the RBM only yields a preprocessing or an initialization for some other model, instead of acting as a complete supervised model in its own right. In this paper, we argue that RBMs can provide a self-contained framework for developing competitive classifiers. We study the Classification RBM (ClassRBM), a variant on the RBM adapted to the classification setting. We study different strategies for training the ClassRBM and show that competitive classification performances can be reached when appropriately combining discriminative and generative training objectives. Since training according to the generative objective requires the computation of a generally intractable gradient, we also compare different approaches to estimating this gradient and address the issue of obtaining such a gradient for problems with very high dimensional inputs. Finally, we describe how to adapt the ClassRBM to two special cases of classification problems, namely semi-supervised and multitask learning.
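One reason the ClassRBM can act as a self-contained classifier is that its conditional p(y|x) is tractable in closed form. A sketch of that computation, assuming the usual ClassRBM parameterization (weights W from hidden units to the input, U from hidden units to the one-hot label, hidden biases c, label biases d); the names and shapes here are illustrative, not the paper's code:

```python
import numpy as np

def softplus(x):
    return np.logaddexp(0, x)

def class_rbm_predict(x, W, U, c, d):
    """p(y | x) for a ClassRBM: log p(y|x) ~ d_y + sum_j softplus(c_j + U_jy + (W x)_j).
    Shapes: W (n_hidden, n_visible), U (n_hidden, n_classes), c (n_hidden,), d (n_classes,)."""
    act = c + W @ x                                        # hidden pre-activations
    logits = d + softplus(act[:, None] + U).sum(axis=0)    # one logit per class
    logits -= logits.max()                                 # numerical stability
    p = np.exp(logits)
    return p / p.sum()
```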
  • An Efficient Learning Procedure for Deep Boltzmann Machines
    ARTICLE. Communicated by Yoshua Bengio. An Efficient Learning Procedure for Deep Boltzmann Machines. Ruslan Salakhutdinov [email protected], Department of Statistics, University of Toronto, Toronto, Ontario M5S 3G3, Canada. Geoffrey Hinton [email protected], Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3G3, Canada.
    We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent statistics are estimated using a variational approximation that tends to focus on a single mode, and data-independent statistics are estimated using persistent Markov chains. The use of two quite different techniques for estimating the two types of statistic that enter into the gradient of the log likelihood makes it practical to learn Boltzmann machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer pretraining phase that initializes the weights sensibly. The pretraining also allows the variational inference to be initialized sensibly with a single bottom-up pass. We present results on the MNIST and NORB data sets showing that deep Boltzmann machines learn very good generative models of handwritten digits and 3D objects. We also show that the features discovered by deep Boltzmann machines are a very effective way to initialize the hidden layers of feedforward neural nets, which are then discriminatively fine-tuned.
    1 A Brief History of Boltzmann Machine Learning. The original learning procedure for Boltzmann machines (see section 2) makes use of the fact that the gradient of the log likelihood with respect to a connection weight has a very simple form: it is the difference of two pair-wise statistics (Hinton & Sejnowski, 1983).
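The "variational approximation" for the data-dependent statistics is a mean-field fixed-point iteration over the hidden layers. A minimal sketch for a two-hidden-layer DBM; shapes, bias handling, and the iteration count are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field(v, W1, W2, b1, b2, n_iter=10):
    """Mean-field posterior means for a 2-hidden-layer DBM given a visible vector v.
    Fixed-point equations:
        mu1 = sigmoid(W1^T v + W2 mu2 + b1)
        mu2 = sigmoid(W2^T mu1 + b2)
    Shapes: W1 (n_vis, n_h1), W2 (n_h1, n_h2)."""
    mu1 = sigmoid(W1.T @ v + b1)          # bottom-up initialization
    mu2 = sigmoid(W2.T @ mu1 + b2)
    for _ in range(n_iter):
        mu1 = sigmoid(W1.T @ v + W2 @ mu2 + b1)
        mu2 = sigmoid(W2.T @ mu1 + b2)
    return mu1, mu2
```

The returned means are exactly the data-dependent statistics the gradient needs; the data-independent side would come from separate persistent Gibbs chains.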
  • Machine Learning Algorithms Based on Generalized Gibbs Ensembles
    Machine learning algorithms based on generalized Gibbs ensembles. Tatjana Puškarov and Axel Cortés Cubero, Institute for Theoretical Physics, Center for Extreme Matter and Emergent Phenomena, Utrecht University, Princetonplein 5, 3584 CC Utrecht, the Netherlands.
    Abstract: Machine learning algorithms often take inspiration from established results and knowledge from statistical physics. A prototypical example is the Boltzmann machine algorithm for supervised learning, which utilizes knowledge of classical thermal partition functions and the Boltzmann distribution. Recently, a quantum version of the Boltzmann machine was introduced by Amin et al.; however, non-commutativity of quantum operators renders the training process by minimizing a cost function inefficient. Recent advances in the study of non-equilibrium quantum integrable systems, which never thermalize, have led to the exploration of a wider class of statistical ensembles. These systems may be described by the so-called generalized Gibbs ensemble (GGE), which incorporates a number of "effective temperatures". We propose that these GGEs can be successfully applied as the basis of a Boltzmann-machine-like learning algorithm, which operates by learning the optimal values of effective temperatures. We show that the GGE algorithm is an optimal quantum Boltzmann machine: it is the only quantum machine that circumvents the quantum training-process problem. We apply a simplified version of the GGE algorithm, where quantum effects are suppressed, to the classification of handwritten digits in the MNIST database. While lower error rates can be found with other state-of-the-art algorithms, we find that our algorithm reaches relatively low error rates while learning a much smaller number of parameters than would be needed in a traditional Boltzmann machine, thereby reducing computational cost.
  • Quantum Restricted Boltzmann Machine Is Universal for Quantum Computation
    Quantum restricted Boltzmann machine is universal for quantum computation. Research Article. Posted: September 15th, 2020. DOI: https://doi.org/10.21203/rs.3.rs-69480/v1. License: This work is licensed under a Creative Commons Attribution 4.0 International License.
    Yusen Wu (1,2,4), Chunyan Wei (1,3), Sujuan Qin (1), Qiaoyan Wen (1), and *Fei Gao (1,4). 1: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, 100876, China. 2: State Key Laboratory of Cryptology, P.O. Box 5159, Beijing, 100878, China. 3: School of Mathematical Science, Luoyang Normal University, Luoyang 471934, China. 4: Center for Quantum Computing, Peng Cheng Laboratory, Shenzhen 518055, China. *[email protected]
    The challenge posed by the many-body problem in quantum physics originates from the difficulty of describing the nontrivial correlations encoded in the many-body wave functions with high complexity. The quantum neural network provides a powerful tool to represent the large-scale wave function, which has aroused widespread concern in the quantum superiority era. A significant open problem is what exactly the representational power boundary of the single-layer quantum neural network is.
  • Machine Learning and Quantum Phases of Matter Hugo Théveniaut
    Machine learning and quantum phases of matter. Hugo Théveniaut. To cite this version: Hugo Théveniaut. Machine learning and quantum phases of matter. Quantum Physics [quant-ph]. Université Paul Sabatier - Toulouse III, 2020. English. NNT: 2020TOU30228. tel-03157002v2. HAL Id: tel-03157002, https://tel.archives-ouvertes.fr/tel-03157002v2, submitted on 22 Apr 2021.
    HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
    THESIS, in view of obtaining the Doctorate of the Université de Toulouse, delivered by the Université Toulouse 3 - Paul Sabatier, presented and defended on 24 November 2020 by Hugo THÉVENIAUT. "Méthodes d'apprentissage automatique et phases quantiques de la matière" (Machine learning methods and quantum phases of matter). Doctoral school: SDM - Sciences de la Matière - Toulouse. Speciality: Physics. Research unit: LPT-IRSAMC - Laboratoire de Physique Théorique. Thesis supervised by Fabien ALET. Jury: M. Nicolas REGNAULT, Reviewer; M. Simon TREBST, Reviewer; M. David GUÉRY-ODELIN, Examiner; M. Evert VAN NIEUWENBURG, Examiner; Mme Eliska GREPLOVA, Examiner; M. Fabien ALET, Thesis supervisor.
    Acknowledgements: One undoubtedly expresses one's gratitude too rarely, so it is with pleasure and some emotion that I turn to the exercise that follows, the thesis acknowledgements. I have had many benefactors during this thesis, and I owe the success of these last three years entirely to them.
  • Feature Learning and Graphical Models for Protein Sequences
    Feature Learning and Graphical Models for Protein Sequences. Subhodeep Moitra. CMU-LTI-15-003, May 6, 2015. Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213. www.lti.cs.cmu.edu. Thesis Committee: Dr Christopher James Langmead (Chair), Dr Jaime Carbonell, Dr Bhiksha Raj, Dr Hetunandan Kamisetty. Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy. Copyright © 2015 Subhodeep Moitra.
    The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government or any other entity.
    Keywords: Large Scale Feature Selection, Protein Families, Markov Random Fields, Boltzmann Machines, G Protein Coupled Receptors.
    To my parents. For always being there for me. For being my harshest critics. For their unconditional love. For raising me well. For being my parents.
    Abstract: Evolutionarily related proteins often share similar sequences and structures and are grouped together into entities called protein families. The sequences in a protein family can have complex amino acid distributions encoding evolutionary relationships, physical constraints and functional attributes. Additionally, protein families can contain large numbers of sequences (deep) as well as large numbers of positions (wide). Existing models of protein sequence families make strong assumptions, require prior knowledge or severely limit the representational power of the models. In this thesis, we study computational methods for the task of learning rich predictive and generative models of protein families. First, we consider the problem of large scale feature selection for predictive models.
  • Introduction to Neural Networks: Boltzmann Machine Learning
    Introduction to Neural Networks: Probability and neural networks.
    Initialize standard library files: Off[General::spell1]; SetOptions[ContourPlot, ImageSize → Small]; SetOptions[Plot, ImageSize → Small]; SetOptions[ListPlot, ImageSize → Small];
    Introduction. Last time: from energy to probability; from Hopfield net to Boltzmann machine. We showed how the Hopfield network, which minimized an energy function, could be viewed as finding states that increase probability.
    Neural network computations from the point of view of statistical inference. By treating neural network learning and dynamics in terms of probability computations, we can begin to see how a common set of tools and concepts can be applied to: 1. Inference: a process which makes the best guesses given data, e.g. find H such that p(H | data) is biggest. 2. Learning: a process that discovers the parameters of a probability distribution, e.g. find weights such that p(weights | data) is biggest. 3. Generative modeling: a process that generates data from a probability distribution, e.g. draw samples from p(data | H).
    Today: Boltzmann machine: learning the weights; probability and statistics overview; introduction to drawing samples.
    Boltzmann Machine: Learning. We've seen how a stochastic update rule improves the chances of a network evolving to a global minimum. Now let's see how learning weights can be formulated as a statistical problem.
    The Gibbs distribution again. Suppose T is fixed at some value, say T=1. Then we could update the network and let it settle to thermal equilibrium, a state characterized by some statistical stability, but with occasional jiggles. Let Vα represent the vector of neural activities.
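A minimal sketch of the stochastic update rule the notes refer to, written in Python rather than the notes' Mathematica. The sigmoid acceptance probability at temperature T is the standard Glauber dynamics for a Boltzmann machine; the random weight matrix is an illustrative placeholder, not data from the notes:

```python
import numpy as np

def settle(W, s, T=1.0, sweeps=200, rng=np.random.default_rng(0)):
    """Stochastic (Glauber) updates of binary units s in {0,1} with symmetric
    weights W (zero diagonal). At thermal equilibrium the states follow the
    Gibbs distribution p(s) ~ exp(-E(s)/T) with E(s) = -1/2 s^T W s."""
    n = len(s)
    for _ in range(sweeps):
        for i in rng.permutation(n):
            gap = W[i] @ s   # energy gap for turning unit i on
            s[i] = 1.0 if rng.random() < 1.0 / (1.0 + np.exp(-gap / T)) else 0.0
    return s

# Toy network: symmetric random weights, zero diagonal
rng = np.random.default_rng(1)
W = rng.normal(size=(8, 8)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
state = settle(W, rng.integers(0, 2, size=8).astype(float))
print(state)
```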