DIFFERENTIABLE NEURAL LOGIC NETWORKS AND THEIR APPLICATION ONTO INDUCTIVE LOGIC PROGRAMMING

A Dissertation Presented to The Academic Faculty

By

Ali Payani

In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the School of Electrical and Computer Engineering

Georgia Institute of Technology

May 2020

Copyright © Ali Payani 2020

Approved by:

Professor Faramarz Fekri, Advisor School of Electrical and Computer Professor Ghassan AlRegib Engineering School of Electrical and Computer Georgia Institute of Technology Engineering Georgia Institute of Technology Professor Matthieu Bloch School of Electrical and Computer Professor Siva Theja Maguluri Engineering School of Industrial and Systems Georgia Institute of Technology Engineering Georgia Institute of Technology Professor Mark Davenport School of Electrical and Computer Date Approved: April 20, 2020 Engineering Georgia Institute of Technology to my family ACKNOWLEDGEMENTS

I would like to dedicate this dissertation to my family, especially my mom whom I love dearly, my late father whom I miss every day, and my dearest and kindest sister. I would like to express my gratitude to my advisor, Dr. Faramarz Fekri, for his kindness, continuous support and valuable mentorship throughout these years. I am grateful to him for believing in my abilities and for giving me the chance to pursue my goals. I would also like to specially thank my thesis committee members: Dr. Matthieu Bloch, Dr. Mark Davenport, Dr. Ghassan AlRegib and Dr. Siva Maguluri for their valuable feedback and their support. Many thanks to my close friend and collaborator Afshin Abdi for his support in my early days of the PhD, and to Yashas Malur Saidutta for offering his kind support and help in many situations, including my PhD proposal presentation. I would also like to thank my other friends and collaborators at Georgia Tech: Dr. Entao Liu, Dr. Jinwen Tian, Dr. Ahmad Beirami, Dr. Mohsen Sardari, Dr. Nima Torabkhani and Dr. Masoud Gheisari for making life on campus fun and entertaining. Finally, I would like to thank my friends in Atlanta, especially Maryam Najiarani, for making me feel at home and supporting me during all these challenging years.

TABLE OF CONTENTS

Acknowledgments ...... iv

List of Tables ...... ix

List of Figures ...... xi

Chapter 1: Introduction and Literature Survey ...... 1

1.1 Differentiable Neural Logic Networks ...... 3

1.2 Inductive Logic Programming ...... 5

1.2.1 Introduction to ILP ...... 6

1.2.2 Previous Works ...... 8

1.3 Learning from Uncertain Data via dNL-ILP ...... 11

1.4 Relational Reinforcement Learning ...... 13

1.5 Decoding LDPC codes over Binary Erasure Channels ...... 15

1.6 Independent Component Analysis ...... 15

Chapter 2: Differentiable Neural Logic Networks ...... 17

2.1 Introduction ...... 17

2.2 Neural Conjunction and Disjunction Layers ...... 18

2.3 Neural XOR Layer ...... 21

2.4 dNL vs MLP ...... 22

2.5 Binary Arithmetic ...... 24

2.6 Grammar Verification ...... 27

Chapter 3: Inductive Logic Programming via dNL ...... 29

3.1 Introduction ...... 29

3.2 Logic Programming ...... 30

3.3 Formulating the dNL-ILP ...... 34

3.3.1 Forward Chaining ...... 35

3.3.2 Training ...... 40

3.3.3 Pruning ...... 40

3.3.4 Lack of Uniqueness ...... 41

3.3.5 Predicate Rules (F^i_p) ...... 41

3.4 ILP as a Satisfiability Problem ...... 42

3.5 Interpretability ...... 43

3.6 Implementation Details ...... 43

3.7 Experiments ...... 45

3.7.1 Benchmark ILP Tasks ...... 46

3.7.2 Learning Decimal Multiplication ...... 48

3.7.3 Sorting ...... 51

Chapter 4: Learning from uncertain data via dNL-ILP ...... 53

4.1 Introduction ...... 53

4.2 Classification for Relational Data ...... 53

4.2.1 Implementation Details ...... 57

4.3 Handling Continuous Data ...... 59

4.4 Dream4 Challenge Experiment (Handling Uncertain Data) ...... 61

4.5 Comparing dNL-ILP to the Past Works ...... 62

Chapter 5: Relational Reinforcement Learning via dNL-ILP ...... 65

5.1 Introduction ...... 65

5.2 Relational Reinforcement Learning via dNL-ILP ...... 67

5.2.1 State Representation ...... 68

5.2.2 Action Representation ...... 72

5.3 Experiments ...... 73

5.3.1 BoxWorld Experiment ...... 73

5.3.2 GridWorld Experiment ...... 77

5.3.3 Relational Reasoning ...... 80

5.3.4 Asterix Experiment ...... 81

Chapter 6: Decoding LDPC codes over Binary Erasure Channels via dNL . . . 84

6.1 Introduction ...... 84

6.2 Message Passing Decoding ...... 85

6.3 Experiments ...... 89

6.3.1 Performance ...... 89

6.3.2 Generalizations ...... 90

6.3.3 dNL vs MLP ...... 92

6.4 ML decoding of LDPC over BEC via dNL ...... 93

Chapter 7: Independent Component Analysis using Variational Autoencoder Framework ...... 97

7.1 Introduction ...... 97

7.2 Proposed Method ...... 98

7.2.1 Encoder ...... 100

7.2.2 Decoder ...... 101

7.2.3 Learning Algorithm ...... 102

7.3 Experiments ...... 104

7.3.1 VAE-nICA vs VAE ...... 106

7.3.2 Performance ...... 106

7.3.3 Robustness to Noise ...... 106

Chapter 8: Conclusion ...... 109

8.1 Summary of Achievements ...... 109

8.2 Future Research Directions ...... 112

8.2.1 Incorporating Human Preference and Safe AI ...... 112

Appendix A: Proof of Theorem 1 ...... 116

Appendix B: BoxWorld Experiment Details ...... 117

Appendix C: GridWorld Experiment Details ...... 120

Appendix D: Relational Reasoning Experiment Details ...... 123

Appendix E: Asterix Experiment Details ...... 126

References ...... 136

LIST OF TABLES

1.1 Program definition of learning daughter relation [11] ...... 7

2.1 Size of required training samples to reach certain levels of accuracy . . . . 27

3.1 Some of the notations used in this chapter ...... 38

3.2 dNL-ILP vs dILP and Metagol in benchmark tasks ...... 50

4.1 Dataset Features ...... 55

4.2 Some of the learned rules for classification tasks ...... 55

4.3 AUPR score for the 5 relational classification tasks ...... 56

4.4 Classification accuracy ...... 61

4.5 DREAM4 challenge scores ...... 62

5.1 Number of training episodes required for convergence ...... 79

6.1 Evaluating the generalization performance of the model ...... 92

6.2 Design parameters for functions F and G ...... 93

7.1 Maximum correlation results for the 3 different types of mixings; Linear, PNL and MLP...... 107

7.2 Maximum correlation results for the linear mixing problem in the presence of white Gaussian noise with three different σe...... 107

B.1 ILP definition of the BoxWorld ...... 119

C.1 ILP definition of the GridWorld ...... 122

D.1 Flat index for a grid of 4 by 4 used in relational learning task ...... 123

D.2 ILP definition of the relational reasoning task ...... 125

E.1 ILP definition of the Asterix experiment ...... 127

LIST OF FIGURES

1.1 Interpretability vs performance trade-off ...... 3

1.2 Learning connectedness in directed graphs ...... 9

1.3 An example of relational database ...... 12

2.1 Truth table of Fc(·) and Fd(·) functions ...... 19

2.2 Comparing MLP vs dNL for learning Boolean functions ...... 25

2.3 Feed-Forward and recurrent models used in our experiments ...... 26

2.4 Accuracy results for Palindrome grammar check ...... 28

3.1 Internal representation of F^i_lt via a dNL-DNF function with a hidden layer of size 4 ...... 36

3.2 The diagram for one step forward chaining for predicate lt where Flt is implemented using a dNL-DNF network...... 37

3.3 Background graph used in learning circular predicate ...... 49

4.1 Internal structure of MovieLens dataset ...... 54

4.2 An example of dNL-ILP program definition in python ...... 60

5.1 States representation in the form of predicates in BoxWorld game, before and after an action ...... 68

5.2 Learning explicit relational information from images in our proposed RRL; Images are processed to obtain explicit representation and dNL-ILP engine learns and expresses the desired policy (actions) ...... 69

5.3 Transforming low-level state representation to high-level form via auxiliary predicates ...... 69

5.4 Extracting relational information from visual scene [41] ...... 71

5.5 Comparing deep A2C and the proposed model on BoxWorld task ...... 76

5.6 Effect of background knowledge on learning BoxWorld ...... 76

5.7 GridWorld environment [77] ...... 78

5.8 Effect of background knowledge on learning GridWorld ...... 79

5.9 Asterix game environment example ...... 82

5.10 Score during training in Asterix experiment ...... 83

6.1 Logical functions used in forward/backward operations ...... 87

6.2 Performance comparison between the neural logic decoder and BP . . . . . 91

6.3 Performance comparison between the dNL and MLP for iteratively decoding LDPC codes over BEC channel ...... 94

6.4 Performance comparison between the dNL-XOR decoder and BP ...... 96

7.1 Graphical Models used in VAE vs Noisy ICA ...... 102

7.2 Block diagram of the noisy ICA model ...... 104

7.3 Six synthetic sources used in our experiments ...... 105

7.4 Comparing the performance of standard VAE vs VAE-nICA ...... 107

7.5 Comparing the performance of VAE-nICA vs Anica in the presence of noise 108

8.1 Grid World environments for safe AI ...... 114

C.1 An example GridWorld scene ...... 121

SUMMARY

Despite the impressive performance of Deep Neural Networks (DNNs), they usually lack the explanatory power of disciplines such as logic programming. Even though they can learn to solve very difficult problems, the learning is usually implicit and it is very difficult, if not impossible, to interpret and decipher the underlying explanations that are implicitly stored in the weights of the neural network models. On the other hand, standard logic programming is usually limited in scope and application compared to DNNs. The objective of this dissertation is to bridge the gap between these two disciplines by presenting a novel paradigm for learning algorithmic and discrete tasks via neural networks. This novel approach uses differentiable neural networks to design interpretable and explanatory models that can learn and represent Boolean functions efficiently. We will investigate the application of these differentiable Neural Logic (dNL) networks in disciplines such as Inductive Logic Programming and Relational Reinforcement Learning, as well as in discrete algorithmic tasks such as decoding LDPC codes over Binary Erasure Channels. Inductive Logic Programming (ILP) is an important branch of machine learning that uses formal logic to define and solve problems. Compared to DNNs, ILP provides a high degree of interpretability, can learn from a small number of examples, and has superior generalization ability. However, standard ILP solvers are not differentiable and cannot benefit from the impressive power of DNNs. In this dissertation, we reformulate ILP as a differentiable neural network by exploiting the explanatory power of dNL networks. This novel neural-based ILP solver (dNL-ILP) is capable of learning auxiliary and invented predicates as well as learning complex recursive predicates. We show that dNL-ILP outperforms the current state-of-the-art ILP solvers in a variety of benchmark algorithmic tasks as well as larger-scale relational classification tasks. In particular, the application of dNL-ILP to the classification of relational datasets such as MovieLens, IMDB, UW-CSE, CORA and Mutagenesis is investigated.

Relational Reinforcement Learning (RRL) focuses on the application of logic programming in reinforcement learning. In particular, RRL describes the environment in terms of objects and relations and uses predicate language to describe the state of the environment as well as the agent's policies. Despite their great potential and the promise of learning generalized and human-readable policies, traditional RRL systems are not applicable to a variety of the problems that the modern RL framework addresses. This is mainly because typical RRL systems require an explicit representation of the state and cannot directly learn from complex visual scenes. In this dissertation, we show how the proposed differentiable ILP solver can effectively combine standard deep learning techniques such as convolutional networks with ILP and can learn the intermediate explicit representations that ILP systems require. Through various experiments, we show how the proposed deep relational policy learning framework can combine human expertise to learn efficient policies directly from images and outperform both the traditional RRL systems and the current deep RL by incorporating background knowledge effectively. Despite the impressive performance of DNNs in a variety of tasks, there has been limited success in using DNNs for learning discrete algorithmic problems. Via various experiments, we show that even in discrete algorithmic tasks that do not require the explanatory power of dNL networks, they are more efficient and can learn faster and generalize better than standard DNNs. In particular, we investigate their application in designing an iterative decoding approach for LDPC codes over binary erasure channels and compare their performance to the state of the art.

CHAPTER 1
INTRODUCTION AND LITERATURE SURVEY

Deep Neural Networks (DNNs) based on Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have improved the state of the art in various areas such as natural language processing [1], image and video processing [2], and speech recognition [3], just to name a few. In recent years, successful reports such as the Neural Turing Machine [4] have inspired the machine learning community to apply neural networks to learning algorithmic tasks. The development of several principles such as memory, memory controllers, and recurrent sequence-to-sequence networks in [4] and many other follow-up papers [5, 6] led to the increased capacity of regular RNNs for learning tasks. Further, [7] proposed to employ several layers of GRU units and process the data as a whole vector, unlike the typical sequence-to-sequence models. This new design produced excellent performance on binary addition and multiplication tasks. In most typical networks (e.g., MLPs, RNNs), the model is designed using successive applications of a basic atomic layer which computes the weighted sum of the inputs followed by a non-linear function. More precisely, if the vector

n th x ∈ R is the n dimensional vector, the i element of output, i.e., yi, can be expressed as

Pn (i) (i) yi = η( j=1 xjwj + bias ), where η(.) is a scalar nonlinear mapping such as sigmoid, tanh or relu function and w(i) is the weight vector for the ith neuron. While in theory, using sufficient number of weights and (hidden) layers, we could approximate any function with this design, in practice this is not always possible. Additionally, MLP based models come with some limitations. These models, in general, do not construct any explicit and symbolic representation of the algorithm they learned and the algorithm is implicitly stored in thousands or even millions of weights, which is typically impossible to be deciphered and verified by human agents. Further, MLP networks are usually suitable for cases where there are many training examples, and usually do not generalize well where there are only limited

training examples. One of the alternative machine learning approaches that addresses the shortcomings of DNNs for learning discrete algorithmic tasks is Inductive Logic Programming (ILP). Logic programming uses the language of formal logic to express the relations between objects in some problem domain. These relations are stated using Boolean functions called predicates. For example, consider a family tree example, where tom is the parent of john and jane. In logic programming, the facts about this system can be expressed using the predicate language. For example, using predicates such as female(X), male(X), father(X,Y), we may express our knowledge regarding this family in terms of some Boolean clauses such as: father(tom, john), father(tom, jane), female(jane), male(john). Inductive logic programming explores inductive learning in first-order logic. An inductive system tries to learn a general description of a concept from a set of instances. The main goal of ILP is then the development of theory and hypotheses for inductive reasoning in first-order (predicate) logic. In other words, ILP uses logic programming to uniformly express the examples, the background knowledge and the target hypothesis for a machine learning task. In ILP, explicit rules and symbolic logical representations can be learned using only a few training examples and these models are usually able to generalize well. Further, the explicit symbolic representation that is obtained via ILP can be understood and verified by humans, and can also be used to write programs in any conventional programming language. Over the last two decades, machine learning systems based on ILP have been widely used in various domains including knowledge discovery from relational databases as well as robotics. However, in recent years and with the impressive progress of deep learning techniques, the ILP discipline has been largely overlooked. While the applications of ILP and DNN-based systems do not always overlap, we can crudely compare these two disciplines from the perspective of the trade-off between interpretability and performance as shown in Fig. 1.1. Generally speaking, DNNs are able to tackle much harder learning tasks

involving difficult reasoning and learning from multidimensional data. Comparatively, ILP systems are usually limited in scope and complexity. On the other hand, the learning in DNNs is usually implicit and hard (if not impossible) to interpret, while ILP models offer a high degree of interpretability.

Figure 1.1: Interpretability vs performance trade-off

In this dissertation, we seek new methods and techniques that could bridge the gap between these two disciplines. Our aim is to design a differentiable approach to learning Boolean logic and to use these differentiable Neural Logic (dNL) networks in designing a differentiable neural-based ILP solver. We will then study the application of such an approach in learning tasks involving relational databases. We further extend this framework to design a relational reinforcement learning framework. The outline of the contributions of this dissertation is presented below.

1.1 Differentiable Neural Logic Networks

The ability to learn Boolean logic in an explicit and interpretable manner is at the heart of our framework. Our aim is to design differentiable neural networks dedicated to learning and representing Boolean functions in an explicit and efficient way. The general idea

of representing and learning Boolean functions using neural networks is not new. There is a significant body of research from the early days of machine learning using neural networks that is focused on the theoretical aspects of this problem. Some special Boolean functions such as parity-N and XOR have been the subject of special interest, as benchmark tasks for theoretical analysis. Minsky and Papert [8, 9], for example, showed the impossibility of representing all functional dependence and proved this for the XOR logic function, while other works demonstrate the possibility of representing XOR by adding hidden layers. From the practical standpoint, as was suggested by many works (for example [10]), any Boolean function can be learned by a multi-layer neural network equipped with proper activation functions. However, as we will demonstrate in the following chapters, this approach is not always applicable for two main reasons:

• Since these networks are not specifically designed for learning Boolean functions, the rate of convergence is not always acceptable. Moreover, in various problems they do not fully converge. Especially in cases with small batch sizes, the weights of the network tend to fluctuate and do not fully converge as training continues. As we will show later, this behavior can result in degraded performance for some applications including the learning of discrete algorithmic tasks.

• Even if the MLP based approaches fully converge and could perfectly learn a Boolean function, their internal representation is not usually interpretable by humans. The actual learned Boolean function is implicitly stored in the weights and biases of additive neurons in a complex manner. As such, they are not suitable for applications where interpretability is desirable.

As an alternative to typical additive neurons, we propose a set of differentiable and multiplicative neurons which are designed with the specific function of learning Boolean functions efficiently and in an explicit manner which can be read and verified by humans. Even though these models, referred to as differentiable Neural Logic (dNL) networks hereafter,

are designed for Boolean functions, their domain of applicability is not limited to the learning of Boolean logic. Many algorithmic learning tasks that were addressed in the recent literature, e.g., addition, copying, reversing a sequence, grammar checking, can indeed be expressed by Boolean functions of the binary inputs with arbitrary complexity. As an example, consider a very simplistic case, e.g., the binary addition of the two binary digits x and y, which can be expressed as:

  V r0 = x y r = x + y ⇒ , (1.1)  V W V r1 = (x y¯) (¯x y)

where r1 and r0 are the two Boolean elements of the vector r. Here, x̄ is the complement of the Boolean variable x, and ∧ and ∨ represent the conjunction (AND) and disjunction (OR) operators, respectively. Consequently, a network which is designed with dNL layers is capable of efficiently learning various discrete algorithmic tasks, as we will show in this work. In chapter 2, we present an alternative approach to the traditional MLP design for learning Boolean functions (dNL) that aims to address some of the shortcomings of the MLP for learning discrete algorithmic tasks. We demonstrate the effectiveness of the proposed general purpose dNL layers by various empirical experiments in sections 2.4, 2.5 and 2.6.
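To make the decomposition in (1.1) concrete, the following snippet (a stand-alone check, not part of any dNL model) enumerates the four input combinations and confirms that the two clauses reproduce the two output bits of x + y, reading r0 as the high (carry) bit, which follows from the clauses themselves:

```python
# Check the Boolean decomposition in (1.1): r0 (carry) and r1 (sum) encode x + y.
for x in (0, 1):
    for y in (0, 1):
        r0 = x and y                            # r0 = x AND y
        r1 = (x and not y) or (not x and y)     # r1 = (x AND NOT y) OR (NOT x AND y)
        assert 2 * int(r0) + int(r1) == x + y
        print(f"x={x} y={y} -> r0={int(r0)} r1={int(r1)}")
```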

1.2 Inductive Logic Programming

Inductive Logic Programming (ILP) is one of the most successful machine learning approaches that addresses the shortcomings of MLPs for learning discrete algorithmic tasks. In the following, we give a short introduction to ILP before reviewing the literature on ILP and discussing our contributions.

1.2.1 Introduction to ILP

ILP is a machine learning discipline that uses logic programming both for the internal representation of the system as well as for representing and learning the target hypothesis. As such, before discussing various ILP systems we need to introduce the basic notations used in logic programming. A logic program consists of a set of clauses. We may think of a clause as an if-then rule. These if-then rules consist of a conclusion which is called the head and a condition which is called the body of the rule. For example, the rule q ← p reads as: if p is True, then q is True, where q is the head and p is the body of the rule. Logic programming is also referred to as predicate language. In logic programs, the relations between objects are expressed using Boolean functions called predicates. For example, the fatherhood relation between two persons may be expressed by the predicate father(X,Y). For instance, the term father(Tom,John) is evaluated to True if Tom is the father of John. Similar to other programming paradigms, in logic programming we can talk about general rules using variables (i.e., X, Y) or about specifics (i.e., instances such as tom, john). The atomic Boolean expressions in logic programming are created by applying a predicate to some arguments. For example, both expressions father(X,Y) and father(Tom,John) are called atoms. An atom that contains no variables (e.g., father(Tom,John)) is called a ground atom. Most ILP systems use a restrictive form of rules which are called Horn clauses. In this format, an if-then rule is a clause of this form:

H ← B1, B2, . . . , Bm   (1.2)

where the head of the rule, i.e., H, contains one atom and the body of the rule is made of a conjunction

of atoms B1 to Bm. As an example, consider the following rule that defines the daughter relation using the parent and female relations:

daughter(X,Y) ← parent(Y,X), female(X) (1.3)

It is read as "if Y is the parent of X and X is female, then X is the daughter of Y". Here, the body consists of the two atoms parent(Y,X) and female(X), and the conclusion, i.e., the head of the rule, consists of the atom daughter(X,Y). The goal of ILP is to use the language of logic programming to describe the known facts about the system and then to find a generalized hypothesis that could explain those facts. In order to illustrate this, consider the program consisting of the background facts and some training examples for learning the daughter relation as shown in Table 1.1. Here, ⊕ and ⊖ represent the positive and negative training examples, respectively. In ILP, we are interested in techniques

Table 1.1: Program definition of learning daughter relation [11]

Training Examples            Background Knowledge
daughter(mary, ann).  ⊕      parent(ann, mary).    female(ann).
daughter(eve, tom).   ⊕      parent(ann, tom).     female(mary).
daughter(tom, ann).   ⊖      parent(tom, eve).     female(eve).
daughter(eve, ann).   ⊖      parent(tom, ian).

that, given a program such as the one in Table 1.1, can generate a valid hypothesis such as the one in (1.3). Generally speaking, there are two approaches for solving ILP problems. In the bottom-up approach, we start from the positive examples, and by starting from the most specific examples we try to find generalizations. For example, consider the positive example daughter(mary,ann). We can find facts that are relevant to this example by looking at the partial matching of arguments. For example, we can pick the two relevant facts parent(ann, mary) and female(mary) and generate a hypothesis such as:

daughter(mary,ann) ← parent(ann, mary), female(mary) (1.4)

We can then generalize the above hypothesis by replacing the specifics with variables to arrive at the desired hypothesis in (1.3). In most practical problems we usually cannot immediately find the general hypothesis as easily as in this example. In practice, the bottom-up approaches iteratively generate and test candidate hypotheses to find the desired target

hypothesis. To reduce the possible clauses to consider, they reduce the space of possible clauses by introducing declarative biases. For example, such a bias declaration may state that the target hypothesis obeys the transitive form P(X,Y) ← Q(X,Z), R(Z,Y). The other technique for solving ILP is the top-down approach. In this method we start from an empty set of clauses, construct a clause explaining some of the positive examples, add this clause to the hypothesis, and remove the positive examples explained. These steps are repeated until all positive examples have been explained. Variations of this top-down approach to ILP are used by most neural ILP solvers as well as other ILP systems dealing with classification tasks. Most traditional approaches to ILP search the space of possible clauses iteratively to find a hypothesis that satisfies the examples. On the other hand, the neural approaches to ILP usually need to enumerate all the possible clauses based on some templates and score them using some technique to find the hypothesis. In these systems we need to find the consequences of applying those hypotheses and compare them with the provided examples. These consequences are found by the forward-chaining algorithm. In this iterative approach, we start from the background facts and iteratively generate new facts by applying the rules to the current set of facts, until no new consequences are found.
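As a minimal illustration of this forward-chaining loop (a plain Python sketch, not the dNL-ILP implementation), the fragment below applies the daughter rule of (1.3) to the background facts of Table 1.1 until no new ground atoms are produced:

```python
# Background facts from Table 1.1, stored as ground atoms (predicate, args).
facts = {("parent", "ann", "mary"), ("parent", "ann", "tom"),
         ("parent", "tom", "eve"), ("parent", "tom", "ian"),
         ("female", "ann"), ("female", "mary"), ("female", "eve")}
constants = {"ann", "mary", "tom", "eve", "ian"}

# Rule (1.3): daughter(X, Y) <- parent(Y, X), female(X)
def apply_rules(known):
    derived = set(known)
    for x in constants:
        for y in constants:
            if ("parent", y, x) in known and ("female", x) in known:
                derived.add(("daughter", x, y))
    return derived

# Forward chaining: iterate until no new consequences are found.
while True:
    new_facts = apply_rules(facts)
    if new_facts == facts:
        break
    facts = new_facts

print(sorted(f for f in facts if f[0] == "daughter"))
# -> daughter(eve, tom) and daughter(mary, ann) are derived, matching the positive examples.
```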

1.2.2 Previous Works

Since the early 2000’s, a variety of algorithms for solving ILP have been proposed. Most notably, the introduction of Metagol [12] based on meta interpretive learning [13] was one of the recent breakthroughs in solving ILP problems. As shown by a series of publications (e.g. [14, 15]), the impressive performance of Metagol is due to its ability to learn recursive predicates (which is vital for learning many algorithmic tasks) as well as its ability to invent/learn auxiliary predicates. To see the importance of recursion in learning algorithms we can consider a simple program for learning the concept of connectedness in directed graphs. Consider a program that describes a connected graph using the relation (i.e.,

predicate) edge. For example, the graph in Fig. 1.2 can be described as:

B = {edge(a,b), edge(b,d), edge(d,a), edge(b,c), edge(b,e), edge(e,f), edge(e,h), edge(f,g)}

In a directed graph, two nodes are considered connected if there is a directed path from one to

Figure 1.2: Learning connectedness in directed graphs

another. For example, in the graph of Fig. 1.2, while a is connected to e (i.e., connected(a,e) is True), e is not connected to a since there is no path from e to the node a. It is easy to verify that a simple recursive definition for the target predicate connected via the following two rules can satisfy all the examples:

connected(X,Y) ← edge(X,Y)

connected(X,Y) ← edge(X,Z), connected(Z,Y)

On the other hand, it is not possible to find such a hypothesis without allowing recursive (or mutually recursive) predicates. Without recursion, we can find hypotheses for one or two or any other particular level of connected nodes, but it is impossible to cover the most general form:

connected(X,Y) ← edge(X,Y)

connected(X,Y) ← edge(X,Z), edge(Z,Y)

connected(X,Y) ← edge(X,Z), edge(Z,T), edge(T,Y)

...
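To make the role of recursion concrete, here is a small sketch (illustrative only) that forward-chains the two recursive rules over the edge facts of Fig. 1.2 until a fixed point; it derives connected(a,e) but, correctly, not connected(e,a):

```python
edges = {("a", "b"), ("b", "d"), ("d", "a"), ("b", "c"),
         ("b", "e"), ("e", "f"), ("e", "h"), ("f", "g")}
nodes = {n for e in edges for n in e}

# Rules: connected(X,Y) <- edge(X,Y)
#        connected(X,Y) <- edge(X,Z), connected(Z,Y)
connected = set(edges)
changed = True
while changed:                      # forward chaining until no new facts are derived
    changed = False
    for (x, z) in edges:
        for y in nodes:
            if (z, y) in connected and (x, y) not in connected:
                connected.add((x, y))
                changed = True

print(("a", "e") in connected)      # True: a -> b -> e
print(("e", "a") in connected)      # False: no directed path from e back to a
```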

In recent years, there have been some attempts to use deep learning methods in solving ILP problems [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]. The connectionist model [20] uses a recurrent neural network to implement a differentiable forward chaining approach. However, the connectionist model only considers acyclic logic programs. Moreover, once the program is learned, it is hidden in the weights of the network and is not explicitly available. Logic Tensor Network [26] is another recently proposed ILP solver that uses the extension of Boolean values to real values in the range of [0, 1] to learn an approximation to the ILP's satisfiability problem. However, their approach is limited in scope and does not support predicate invention or self/mutual recursion among predicates. Another important contribution is the CLIP++ algorithm proposed in [22]. This is a powerful approach for

learning first-order logic programs efficiently which uses the bottom clause ⊥e introduced by

Muggleton [28]. For each positive example, the bottom clause ⊥e is the most specific clause whose head unifies with e. Since the clauses form a lattice, there is always a most specific clause for a particular language bias. [22] uses the set of bottom clauses generated from the examples as a set of features: each clause is applied to every tuple of individuals, generating a list of features for each tuple. Even though CLIP++ is fast and scales well to larger problems, similar to the previously mentioned methods it does not support recursion and hence its application to learning algorithmic tasks is limited. While these past neural ILP solvers are shown to learn some specific tasks, until the introduction of differentiable ILP (dILP) by [27], they were not capable of predicate invention and learning recursive predicates (in an explicit manner). The main limitation of dILP is that it cannot scale well to more complex problems. In fact, the explosion of parameters in their model led to a very restrictive approach for choosing rule templates. As such, dILP limits the model to learning clauses with at most two terms (atoms) and to predicates with at most two arguments, which is a very restrictive approach. In fact, as stated in [27], the need for using program templates to generate a limited set of viable candidate clauses in forming the predicates is the key weakness in all existing (past) ILP systems (neural or non-neural),

severely limiting the solution space of a problem. Metagol [12] mitigates this problem by introducing meta-rule templates to define a family of possible clauses. While this approach allows the system to learn more complex solutions compared to dILP, the design of these rules for any particular problem is itself a complicated task and requires expert knowledge and possibly many trials. Moreover, although many of the neural ILP solvers provide a way to learn the inductive rules, the learned knowledge is not necessarily explicit and human-readable (e.g., [29]) or requires arbitrary cutoff values for choosing the candidate rules (e.g., dILP). In Chapter 3 we demonstrate that the explicit representational power of the proposed dNL allows us to transform inductive logic programming into a differentiable problem and solve it using gradient optimizers more efficiently than the existing neural ILP solvers. We evaluate the performance of our proposed ILP solver on some benchmark ILP tasks and compare the performance of dNL-ILP to the state-of-the-art ILP systems in learning algorithmic tasks such as learning arithmetic and sorting.

1.3 Learning from Uncertain Data via dNL-ILP

In Chapter 4 we study the application of dNL-ILP to learning from uncertain data. In particular, we will study the application of dNL-ILP in tasks involving reasoning and extracting knowledge from relational databases. ILP is concerned with the development of techniques and tools for relational data mining. In relational databases, data typically resides in multiple tables. Because of the relational nature of ILP, it can directly find patterns from multi-table databases and other structured data. In contrast, most other machine learning approaches can only deal with data that resides in a single table. As such, they require preprocessing using various types of join and aggregation operations before they can be applied. However, as shown by [30, 31, 11], integrating data through joining and aggregating tables can lead to the loss of meaning or information. We consider an example similar to the one presented in [11].

Suppose we are given a database with two tables as shown in Fig. 1.3, and the task is to characterize customers that spend a lot. Please note that each customer can have multiple purchases. Integrating the two relations (each table defines one relation) using a join will create a new relation where each row corresponds to a purchase and not to a customer. Alternatively, the two tables can be integrated into an aggregate relation customer1(customer id, age, number of purchases, total value). In this case, however, some information is lost. For example, the following hypothetical hypothesis that an ILP system may learn cannot be directly obtained using either of those integration techniques:

big_spender(CID) ← customer(CID, PID, date, value, payment_mode) ∧ age > 30 ∧ payment_mode = credit_card ∧ value > 100,

where CID and PID are short for customer id and purchase id, respectively.

Figure 1.3: An example of relational database

Standard ILP systems such as Metagol are usually not able to directly tackle problems involving ambiguous and missing data. In classification problems involving relational datasets, no single hypothesis is usually able to classify all the examples correctly. As such, for tackling these tasks, probabilistic extensions of ILP (e.g., probabilistic ILP (PILP) [19] and Markov Logic Networks (MLN) [32]) are usually employed. These models are then usually used to create an ensemble learning algorithm (e.g., via boosting techniques). There

are two downsides to this approach. First, by using ensemble techniques and employing many logical search trees, the interpretability decreases significantly. Moreover, the same probabilistic ILP solvers cannot be employed for other algorithmic tasks involving recursion and predicate invention, which limits the applicability of such ILP solvers. In Chapter 4, we use real-world relational datasets including Mutagenesis [33], UW-CSE [32] as well as the IMDB, MovieLens (1M) and Cora datasets, and we demonstrate that the unified design of the dNL-ILP solver makes it possible to seamlessly extend its application to ambiguous and uncertain tasks. Moreover, via experiments we show how the differentiable design of dNL-ILP makes it possible to combine it with standard neural network models to handle continuous data, either by learning boundary decisions or by adopting a probabilistic approach to model the continuous signals.
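As a preview of the boundary-decision idea for continuous attributes (Chapter 4 gives the actual construction; the function name, threshold and temperature below are purely illustrative assumptions), a learnable soft comparison can be sketched as:

```python
import numpy as np

# A "soft" boundary predicate gt_t(x): a fuzzy truth value for the statement x > t,
# where the threshold t would be a trainable parameter; the temperature controls sharpness.
def soft_greater_than(x, t, temperature=0.1):
    return 1.0 / (1.0 + np.exp(-temperature * (x - t)))

value = np.array([20.0, 95.0, 180.0])
print(soft_greater_than(value, t=100.0))   # ~ 0.00, 0.38, 1.00: fuzzy truth degree of value > 100
```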

1.4 Relational Reinforcement Learning

Relational Reinforcement Learning (RRL) was formed in the early 2000s as the intersection of Reinforcement Learning (RL) and Inductive Logic Programming (ILP), by works such as [34, 35, 36] among others. The main idea behind RRL is to upgrade the typical propositional representation used in reinforcement learning to the first-order relational form. As such, the environment can be described more naturally in terms of objects and relations. As an example, consider an agent playing the simple BoxWorld game. In this game, the three boxes {a, b, c} can be on top of each other or on the floor. Assume the final goal of the agent is to stack all of the boxes on top of each other. The state of this environment can be completely explained in a relational form via groundings of the predicate On(X,Y), implying that the box labeled by X is on top of box Y. For example, if

St = {on(a, b), on(b, floor), on(c, floor)}, the agent can reach the final goal by performing

the action move(c,a), i.e., placing box c over box a. Here, St represents the relational state of the system at time t. The first practical implementation of this idea was proposed by [35] and later was improved in [36], which was based on a modification to the famous

Q-Learning algorithm [37] via the standard relational tree-learning algorithm [38]. The traditional RRL model learns through exploration of the state-space in a way similar to normal Q-learning algorithms. It starts with running a normal episode but uses the encountered states, chosen actions and the received rewards to generate a set of examples that can then be used to build a Q-function generalization. As shown in [36], the RRL system allows for very natural and human-readable decision making and policy evaluation. More importantly, the use of variables in ILP systems makes it possible to learn generally formed policies and strategies. Since these policies and actions are not directly associated with any particular instance and entity, this approach leads to a generalization capability beyond what is possible in most typical RL systems. In recent years and with the advent of new deep learning techniques, significant progress has been made over the classical Q-learning RL framework, and via algorithms such as deep Q-learning and its variants [39, 40], more complex problems are tackled. The classical RRL framework cannot be easily employed to tackle the more complex scenes available in recent RL problems. While it is still possible to use the representational power of RRL in the deep RL framework, the explicit relational representation of states, actions and policies, and especially the use of variables, are lacking in these systems. Since the proposed dNL-ILP solver is a differentiable neural network, it can be combined with standard deep learning techniques. In chapter 5, we formulate a deep relational policy learning algorithm by employing the differentiable dNL-ILP. The various features of this novel RRL system will be studied. To evaluate the performance of this new approach to reinforcement learning, we will use it in RL environments such as the BoxWorld, GridWorld and Asterix games. We will also show how this general framework can be used to learn and extract relational information in other benchmark tasks including the visual reasoning tasks for the Sort-of-CLEVR [41] dataset.
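Returning to the BoxWorld example above, the relational state and the effect of a move action can be sketched in a few lines (a toy illustration of the representation only, not of the learning framework):

```python
# Relational state: a set of ground atoms on(X, Y).
state = {("on", "a", "b"), ("on", "b", "floor"), ("on", "c", "floor")}

def move(state, box, destination):
    """Apply move(box, destination): the box is lifted from wherever it is and
    placed on top of the destination (no legality checks in this sketch)."""
    new_state = {atom for atom in state if not (atom[0] == "on" and atom[1] == box)}
    new_state.add(("on", box, destination))
    return new_state

next_state = move(state, "c", "a")
print(sorted(next_state))
# -> on(a, b), on(b, floor), on(c, a): all three boxes are now stacked.
```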

1.5 Decoding LDPC codes over Binary Erasure Channels

Unlike the standard DNNs, the dNL networks are specifically designed for learning Boolean functions and are naturally better suited for learning discrete algorithmic tasks. In Chapter 6, we study the generalization performance of dNL by considering its application to learning a practical algorithmic task. Over the last few years, the application of deep recurrent neural networks to decoding various communication codes, such as LDPC over Gaussian channels, has been extensively studied. For example, in [42] the authors successfully employed RNNs to design a very efficient decoder for convolutional and turbo codes over additive white Gaussian noise (AWGN) channels, and in [43] the authors proposed a framework for soft decoding of linear codes of arbitrary length. However, as shown in chapter 6, typical neural network layers often fail to learn the exact logical flow in discrete (binary) arithmetic. This is especially true for decoding LDPC codes over the BEC. In Chapter 6, we will demonstrate how the proposed dNL framework can be successfully used to design an LDPC decoder over BEC channels [44]. More specifically, we compare their performance and their generalization ability to the standard neural networks.
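For context, the binary erasure channel only erases transmitted bits and never flips them; a minimal simulation (the erasure marker and erasure probability below are arbitrary choices) looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Binary Erasure Channel: each transmitted bit is erased with probability eps.
def bec(bits, eps=0.3):
    erased = rng.random(bits.shape) < eps
    out = bits.copy()
    out[erased] = -1          # -1 marks an erasure; the decoder must recover these positions
    return out

codeword = rng.integers(0, 2, size=20)
print(bec(codeword))
```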

1.6 Independent Component Analysis

One of the biggest challenges in unsupervised machine learning is finding nonlinear features from unlabeled data. In the general nonlinear case, there have been some successful applications of deep neural networks in recent years. Multi-layer deep belief networks, Restricted Boltzmann Machines (RBM) and, most recently, variational autoencoders are some of the most promising new developments. Independent component analysis (ICA) is one of the most basic and successful classical tools in unsupervised learning which has been studied extensively during the early 2000s. For the simplest case of linear ICA without the presence of noise, this problem has been solved and the conditions for the identifiability of independent components have been well established [45]. However, for the

more general form of the problem, where the independent features (sources) are mixed using nonlinear functions and specifically in the presence of noise, there has been little success. In recent years, the problem of learning ICA from nonlinear mixtures (and similar related problems) has been addressed in works such as [46, 47]. In particular, in [47] the Adversarial Nonlinear Independent Component Analysis (Anica) algorithm was proposed, which is an unsupervised learning method based on Generative Adversarial Networks (GAN) [48]. The problem of linear ICA in the presence of noise has been studied in [49, 50, 51, 52, 53] among others. Some of the most successful approaches (e.g., [52, 53]) use the Mean-Field variational Bayesian framework to solve linear ICA in a Bayesian setting. However, these approaches are usually very slow and have very limited applicability even for the linear case. In Chapter 7, we formulate the general nonlinear ICA with additive Gaussian noise by employing a variational autoencoder framework [54]. The properties of the newly proposed ICA framework will be analyzed and its robustness to noise compared to the previous methods will be examined.
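For reference, the generative model behind noisy linear ICA can be written in a few lines (the nonlinear case replaces the matrix product with a nonlinear mixing function; the dimensions and noise level below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy linear ICA generative model: x = A s + e, with independent non-Gaussian sources s
# and additive Gaussian noise e.
n_sources, n_samples, sigma_e = 3, 1000, 0.1
s = rng.laplace(size=(n_sources, n_samples))      # independent, non-Gaussian sources
A = rng.normal(size=(n_sources, n_sources))       # unknown mixing matrix
e = sigma_e * rng.normal(size=(n_sources, n_samples))
x = A @ s + e                                     # observed mixtures; ICA tries to recover s from x
print(x.shape)
```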

CHAPTER 2
DIFFERENTIABLE NEURAL LOGIC NETWORKS

2.1 Introduction

Although a simple single perceptron layer equipped with a threshold (e.g., sign) function as the nonlinearity can mimic the behavior of the OR and AND operators, these models are very difficult to train in general. Consider the conjunction operator as an example. Let us assume we have an input vector x ∈ {0, 1}^n and a fully connected layer, consisting of n weights and a bias term, is used to learn the logical function s = x1 ∧ x2. It is clear that by

setting w1 = w2 = 1, wi>2 = 0, and using a suitable bias term (i.e., bias = −1) followed by the sigmoid(cz) activation function, where c > 1, the output resembles the desired Boolean function s = x1 ∧ x2. Consider now the case of s = x1 ∧ x2 ∧ x3. In this case, the bias should be set to −2. A good example of these types of differentiable Boolean functions can be found in [10]. We found, however, that in general these models are very difficult to train in more difficult problems, especially since they depend on a bias term. Further, their flexible weights make it difficult to extract the underlying explicit Boolean representation in a simple and intuitive way. This is especially true when combining multiple layers in order to learn more complex Boolean functions in a general form such as the Disjunctive Normal Form (DNF). In this chapter, we introduce a new design for the logical operators by using membership weights without the use of adjustable bias terms [55]. First, we design general purpose conjunction and disjunction layers. Then, we introduce the XOR layer, which enhances the capacity of the network significantly. We then demonstrate the properties and characteristics of the proposed model in three areas:

• Learning Boolean functions efficiently: In Section 2.4, we demonstrate how the

proposed method compares to the MLP design in learning Boolean functions using two simplistic and synthetic examples.

• Generalization: In Chapter 6, we show how well the proposed dNL based models can generalize in discrete iterative algorithms compared to the traditional MLP by comparing their performance in learning a message passing decoder for Low Density Parity Check Codes (LDPC) over erasure channels.

• Explicit symbolic representation: Unlike MLP, the learned Boolean functions in dNL can be expressed explicitly. In Chapter 3 (Inductive Logic Programming via dNL), we propose a new paradigm for solving ILP problems efficiently by exploiting the explicit representational power of dNL networks. The proposed method is able to use features such as predicate invention, recursion and functions while using a remarkably small number of training weights.

2.2 Neural Conjunction and Disjunction Layers

Let {x1, . . . , xn} be the input vector consisting of n Boolean variables and consider the case of the conjunction function. To each variable xi we associate a Boolean flag mi that controls the inclusion or exclusion of the term xi in the resulting conjunction function. To this end, we define the following Boolean function:

fconj = (m̄1 ∨ x1) ∧ · · · ∧ (m̄n ∨ xn)   (2.1)

It is easy to see that when mi is True, the corresponding term simplifies to xi, and when mi is False, the xi term is excluded from the expression. We can similarly design a parameterized disjunction function:

fdisj = (m1 ∧ x1) ∨ · · · ∨ (mn ∧ xn) (2.2)

(a) Fc                    (b) Fd
xi  mi  Fc(xi, mi)        xi  mi  Fd(xi, mi)
0   0   1                 0   0   0
0   1   0                 0   1   0
1   0   1                 1   0   0
1   1   1                 1   1   1

Figure 2.1: Truth table of Fc(·) and Fd(·) functions

As an example, if m1 and m2 are True and mi = False for i > 2, the corresponding functions reduce to fconj = x1 ∧ x2 and fdisj = x1 ∨ x2, respectively. In order to design a differentiable network, we use the continuous relaxation of Boolean values, i.e., we assume fuzzy Boolean values are real values in the range [0, 1] and we use 1 (True) and 0 (False) representations for the two states of a binary variable. We also define the fuzzy unary and dual Boolean functions of two Boolean variables x and y as:

x̄ = 1 − x ,  ȳ = 1 − y   (2.3a)

x ∧ y = x y   (2.3b)

x ∨ y = ¬(x̄ ∧ ȳ) = 1 − (1 − x)(1 − y)   (2.3c)

This algebraic representation of the Boolean logic allows us to manipulate the logical expressions via algebra. Let x^n ∈ {0, 1}^n be the input vector of a typical logical neuron. To implement the fuzzy conjunction function, we would like to select a subset of the elements in x^n and apply the fuzzy conjunction (i.e., multiplication) to the selected elements. One way to accomplish this is to use a softmax function and select the elements that belong to the conjunction function, similar to the concept of pointer networks [56]. This requires knowing the number of items in the subset (i.e., the number of terms in the conjunction function) in advance. Moreover, in our experiments we found that the convergence of the model using this approach is very slow for larger input vectors. Alternatively, inspired by equations (2.2) and (2.1), we associate a trainable Boolean membership weight mi to each input element xi from the vector x^n. Further, we define a Boolean function Fc(xi, mi) with the truth table as in Fig. 2.1a which is able to include (exclude) each element in (out of) the conjunction function.

This design ensures the incorporation of each element xi in the conjunction function only when the corresponding membership weight is 1. Consequently, the neural conjunction function can be defined as:

Oconj(x) = ∏_{i=1}^{n} Fc(xi, mi) ,  where  Fc(xi, mi) = xi ∨ m̄i = 1 − mi(1 − xi) ,   (2.4)

where Oconj is the output of the conjunction neuron. To ensure that the trainable membership weights remain in the range [0, 1] we use a sigmoid function, i.e., mi = sigmoid(c wi) where c ≥ 1 is a constant. Similar to perceptron layers, we can stack m neural conjunction neurons to create a conjunction layer of size m. This layer has the same complexity as a typical perceptron layer without incorporating any bias term. More importantly, this way of implementing the conjunction layer makes it possible to interpret the learned Boolean function directly from the values of the membership weights. The disjunctive neuron can be defined similarly by introducing membership weights but

using the function Fd with the truth table as depicted in Fig. 2.1b. This function ensures an output of 0 from each element when the membership is zero, which corresponds to excluding

the xi element from the neuron outcome. Therefore, the neural disjunction function can be expressed as:

Odisj(x) = ⋁_{i=1}^{n} Fd(xi, mi) = 1 − ∏_{i=1}^{n} (1 − Fd(xi, mi)) ,  where  Fd(xi, mi) = xi mi   (2.5)

By cascading a conjunction layer with a disjunctive layer, we can create a multi-layer structure which is able to learn and represent Boolean functions using the Disjunctive

Normal Form (DNF). Similarly, we can construct the Conjunctive Normal Form (CNF) by cascading the two layers in the reverse order. The total number of possible logical functions over a Boolean input vector x ∈ {0, 1}^n is very large (i.e., 2^{2^n}). Further, in some cases, a simple clause in one of those standard forms can lead to an exponential number of clauses when expressed in the other form. For example, it is easy to verify that converting (x1 ∨ x2) ∧ (x3 ∨ x4) ∧ · · · ∧ (x_{n−1} ∨ x_n) to DNF leads to 2^{n/2} clauses. As such, using only one single form of Boolean network for learning all possible Boolean functions is not always the best approach. The general purpose design of the proposed conjunction and disjunction layers allows us to define the appropriate Boolean function suitable for the problem.
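A minimal numpy sketch of the two neuron types and of a dNL-DNF block built by cascading them (forward pass only, with hand-set weights and an assumed value of the constant c; an actual implementation would train the membership weights with a gradient-based framework):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conjunction_layer(x, W, c=6.0):
    """x: (n,), W: (h, n). Each of the h rows defines one conjunction neuron, eq. (2.4)."""
    m = sigmoid(c * W)                       # membership weights in [0, 1]
    return np.prod(1.0 - m * (1.0 - x), axis=1)

def disjunction_neuron(z, w, c=6.0):
    """z: (h,) outputs of the conjunction layer; w: (h,) membership weights, eq. (2.5)."""
    m = sigmoid(c * w)
    return 1.0 - np.prod(1.0 - m * z)

# A dNL-DNF function over 3 inputs with a conjunction (hidden) layer of size 2,
# hand-set here so that it represents (x1 AND x2) OR x3.
W = np.array([[ 2.0,  2.0, -2.0],            # first conjunction: x1 AND x2
              [-2.0, -2.0,  2.0]])           # second conjunction: x3
w_out = np.array([2.0, 2.0])                 # both conjunctions included in the disjunction

for x in ([1, 1, 0], [0, 0, 1], [1, 0, 0]):
    z = conjunction_layer(np.array(x, dtype=float), W)
    print(x, round(float(disjunction_neuron(z, w_out)), 3))
# -> close to 1, 1, 0 respectively; the learned formula can be read off the memberships.
```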

2.3 Neural XOR Layer

Exclusive OR (XOR) is another important Boolean function which has been the subject of much research over the years, especially in the context of the parity-N learning problem. It is easy to verify that expressing the XOR of an n-dimensional input vector in DNF form requires 2^{n−1} clauses. Although it is known that it cannot be implemented using a single perceptron layer [8, 57], it can be implemented, for example, using a multi-layer perceptron or multiplicative networks combined with small threshold values and a sign activation function [58]. However, none of these approaches allows for explicit representation of the learned XOR functions directly. Here we propose a new algorithm for learning the XOR function (or equivalently the parity-N problem). To form the logical XOR neuron, we first define k functions of the form:

f1(x) = x1 + x2 + · · · + xk − xk+1 − · · · − xn
f2(x) = x1 + x2 + · · · − xk − · · · − xn−1 + xn          (2.6)
  ⋮
fk(x) = x1 − x2 − · · · − xk − xk+1 + · · · + xn

where k = n/2 (assuming n is even). Then, we define the XOR function as in Theorem 1.

Theorem 1. Given the set of k functions as defined in (2.6) we have:

XOR(x) = g1(x) ∧ g2(x) ∧ · · · ∧ gk(x) ,   (2.7a)

where  gi(x) = 0 if fi(x) = 0, and gi(x) = 1 otherwise.   (2.7b)

Proof. See Appendix A.

Inspired by Theorem 1, we design our XOR neuron as:

OXOR(x) = ∏_{i=1}^{k} hs( x × (Mi ⊙ w)^T )   (2.8)

Here, hs(·) is the hard-sigmoid function, and × and ⊙ denote matrix and element-wise multiplication, respectively. Further, the vector Mi ∈ {−1, 1}^n is the set of coefficients used

in fi(x) and w is the vector of membership weights. The resulting XOR logical neuron uses only one weight variable per input element for learning. However, its complexity is k times higher than the conjunction and disjunction neurons for an input vector of length n = 2k.

2.4 dNL vs MLP

To evaluate the proposed dNL model in learning Boolean functions, we compare its performance to that of standard MLP networks on the task of learning Boolean functions using two synthetic experiments. It is worth noting that in these experiments we are not concerned with interpretability. Needless to say, while the dNL network generates easily interpretable Boolean functions, the MLP counterpart does not provide an explicit representation of the Boolean formulas.

22 Learning DNF form

For this experiment, we randomly generate some Boolean functions over 10-bit input vectors and randomly generated batches of 50 samples as training data. We train two models; one designed via our proposed DNF network (with 200 disjunction functions) and another designed as a two-layer MLP network with a hidden layer of size 1000 and relu activation, using a sigmoid activation function for the output layer. We use the ADAM optimizer [59] with a learning rate of 0.001 for both models and count the number of errors in 1000 randomly generated test samples. When we used a Bernoulli distribution with parameter p = 0.5 (i.e., a fair coin toss) for generating the bits of each training sample, both models quickly converged and the number of test errors dropped to zero for both. However, in many realistic discrete problems, the 0's and 1's are not usually equiprobable. As such, we next use a Bernoulli distribution with parameter p = 0.75. Fig. 2.2a depicts the comparative performance of the two models. The proposed DNF model converges fast and remains at 0 error. On the contrary, the MLP model continues to generate errors. In our experiments, while the number of errors decreases as training continues for an hour, the MLP model never fully converges to the true logical function and occasionally generates some errors. While for some tasks this may be a negligible error, in some logical applications such as the ILP learning task, this behavior prevents the model from learning.
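The data generation for this experiment can be sketched as follows (the clause sizes and helper names are assumptions made only to make the setup concrete):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random target Boolean function in DNF form over 10 bits:
# a handful of random conjunctive clauses, each over a few (possibly negated) variables.
n_bits, n_clauses = 10, 8
clauses = []
for _ in range(n_clauses):
    variables = rng.choice(n_bits, size=3, replace=False)
    negated = rng.integers(0, 2, size=3).astype(bool)
    clauses.append((variables, negated))

def target(x):
    """Evaluate the random DNF: OR over clauses, AND over (possibly negated) literals."""
    return any(all(x[v] != neg for v, neg in zip(vs, ns)) for vs, ns in clauses)

def sample_batch(batch_size=50, p=0.75):
    """Training batch: input bits drawn i.i.d. Bernoulli(p), labels from the target DNF."""
    X = (rng.random((batch_size, n_bits)) < p).astype(int)
    y = np.array([target(x) for x in X], dtype=int)
    return X, y

X, y = sample_batch()
print(X.shape, y[:10])
```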

Learning XOR function

Next, we compare the two models on the much more complex task of learning the XOR logic. We use a multi-layer MLP with relu as the activation function in the hidden layers and a sigmoid function in the output layer as usual. As for dNL, we use a single XOR neuron as described in (2.8). For small input sizes both models quickly converge. However, for larger input vectors (n > 30) the MLP model fails to converge at all. Fig. 2.2b shows the average bit error over the number of training samples. We tried various setups for the MLP, including different numbers of hidden layers (from 1 to 3) and different sizes of the hidden

layers. For both models we evaluated the performance over 100 randomly seeded runs and we report the best results for each model. The error rate for the MLP was around 0.5, which indicates it failed to learn the XOR function. On the contrary, the XOR logic layer was able to converge and learn the objective in most of the runs. This is significant considering the fact that the number of parameters in our proposed XOR layer is equal to the input length, i.e., one membership weight per input variable. In conclusion, although a typical MLP structure is very powerful and can encode many complex functions using its adjustable weights, it lacks the rigidness and specificity of our proposed logical layers. As such, it is often difficult to combine MLP layers effectively, and simply adding more layers will not increase the network performance in many problems. On the contrary, the logical layers, by design, are forced to learn describable logical relationships between their inputs. As such, combining them in various arrangements enables the network to learn very complex logical flows.

2.5 Binary Arithmetic

Binary arithmetic tasks are among the widely used experiments in the topic of algorithm and pattern learning via DNNs. While binary operations are quite easy to learn for small binary numbers, learning becomes very difficult as the size increases. This problem has been tackled in two different ways. In the feed-forward approach, the models are designed such that the complete input stream of symbols is processed sequentially through multiple layers, and the output of the last layer generates the stream of symbols corresponding to the desired result of the arithmetic all at once (e.g., [7]). Alternatively, this problem can be viewed as a case of the sequence-to-sequence learning problem [4], where we first process the input symbols one by one using a recurrent cell (e.g., an LSTM cell) and then another recurrent cell generates the output sequence of symbols sequentially while accessing all the outputs of the first recurrent layer generated at each time stamp. Various attention mechanisms and memory access models have also been proposed to improve the performance of these recurrent networks [60]. To evaluate our proposed logical layers in such a standard task, we first consider the addition of two long binary numbers. We represent each digit in a one-hot vector space using n = 10 symbols, where S0 is used for padding the sequences to identical lengths. Among the existing models in the literature, the Neural GPU model [7] proved to be the most successful in solving such problems using a feed-forward network. In [7], the model consists of a multi-layer feed-forward network in which each layer is a recurrent convolutional cell designed using a GRU cell [61]. This model (combined with curriculum learning) is reported to attain 100% accuracy for the addition of binary numbers of up to 2000 digits. In our experiments, we were also able to achieve 100% accuracy in binary addition for any of the lengths we tested (we did not test above 200-digit sequences, though). We trained the Neural GPU using its default set of parameters, but for our model we used two layers of XOR logic of length 100, each followed by a set of DNF-CNF layers, each with a hidden layer of size 50, in our feed-forward model (as shown in Fig. 2.3a). Table 2.1 compares the speed of convergence of the two models by displaying the number of training samples required to reach a certain level of accuracy. As Table 2.1 suggests, while both models are capable of reaching 100% accuracy, our proposed model converges much faster. Our experiments show that our model reaches 99% accuracy for various input lengths in less than 200k training samples; however, it usually requires many more training samples to reach the 100% accuracy level.

Figure 2.2: Comparing MLP vs dNL for learning Boolean functions ((a) DNF task, (b) XOR-50 task; number of errors vs. number of training samples)

Figure 2.3: Feed-Forward and recurrent models used in our experiments ((a) Feed-Forward Model, (b) Recurrent Model)

Table 2.1: Size of required training samples to reach certain levels of accuracy

Model                 | 75%  | 90%  | 100%
Neural GPU (n=10)     | 77k  | 400k | 490k
Logic CNF-DNF (n=10)  | 4k   | 25k  | 260k
Neural GPU (n=20)     | 400k | 448k | 830k
Logic CNF-DNF (n=20)  | 8k   | 45k  | 610k

2.6 Grammar Verification

In many of the works related to algorithm learning tasks, the models are claimed to reach 100% accuracy whenever they attain perfect accuracy on the validation set during the training. However, we confirmed via many experiments that even though such models reduce the error rate significantly on average, they rarely learn the underlying logical flow of the problem fully. In other words, even after achieving perfect accuracy, these models usually continue to fluctuate. In contrast, the proposed logical layer model is usually able to learn the underlying logical flow. Here, we compare the efficiency of the Long Short-Term Memory (LSTM) [62] recurrent cell with our proposed LVRNN recurrent cell for the problem of understanding the Palindrome grammar. The language of the Palindrome grammar contains all strings of the form WcW′, where W is an arbitrary string of symbols from some alphabet, W′ is the reverse of the string W, and c is a divider symbol. For example, in the binary alphabet system, "11001c10011" is a legal Palindrome string but "11000c11000" is not. Because the number of illegal phrases is much larger than the number of legal phrases, we ensure that around 5% of the training samples in each training batch are legal phrases. We use the model depicted in Fig. 2.3b and use only the last output of the recurrent network to create a discriminator network with the help of a dense layer. We train the network on phrases of length 41 binary symbols using the ADAM optimizer [63] with a learning rate of 0.001. Fig. 2.4 compares the performance of the two models, in terms of the number of mismatches per 5000 samples processed during training of each epoch, versus the epoch number. We observed that although the LSTM-based network approaches perfect accuracy and can produce perfect results in a few epochs, it fails to fully learn the underlying pattern and continues to fluctuate and generate a few errors. On the contrary, our proposed logical-layer cell (LVRNN) converges faster and reaches the perfect accuracy level without any further fluctuations afterwards.

Figure 2.4: Accuracy results for the Palindrome grammar check (misclassified cases per epoch vs. epoch number, for LSTM and LVRNN)
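A batch of such training phrases can be produced along the following lines. This is only an illustrative generator consistent with the description above (strings of the form WcW′ with roughly 5% legal phrases); the helper names and the single-bit corruption used for illegal strings are our own choices.

    import random

    def palindrome_sample(n_half=20, legal=True):
        # Build a string of the form W c W'; corrupt one mirrored bit if illegal.
        w = [random.choice("01") for _ in range(n_half)]
        rev = list(reversed(w))
        if not legal:
            i = random.randrange(n_half)
            rev[i] = "1" if rev[i] == "0" else "0"
        return "".join(w) + "c" + "".join(rev)        # total length 2 * n_half + 1 = 41

    def make_batch(batch_size=5000, legal_ratio=0.05):
        phrases, labels = [], []
        for _ in range(batch_size):
            legal = random.random() < legal_ratio
            phrases.append(palindrome_sample(legal=legal))
            labels.append(int(legal))
        return phrases, labels

    phrases, labels = make_batch()
    print(phrases[0], labels[0])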

CHAPTER 3 INDUCTIVE LOGIC PROGRAMMING VIA DNL

3.1 Introduction

In the previous chapter the performance of dNL networks in learning binary arithmetic tasks was evaluated using a variety of algorithmic tasks involving discrete and binary data. In those problems, instead of focusing on interpretability of the learned relations, we were mainly concerned with the learning performance. One of the main characteristics of dNL networks is that while they can efficiently learn discrete and binary tasks, they also provide an intuitive and easy way of interpreting the underlying Boolean function. This becomes clear by observing the design of any logical neuron from dNL. For instance, consider the

conjunction function defined as f_conj = (¬m_1 ∨ x_1) ∧ · · · ∧ (¬m_n ∨ x_n). Since in the fuzzy implementation of this function we are associating a (fuzzy) Boolean membership variable m_i with each element of the input vector, i.e., x_i, we can directly read off a valid fuzzy Boolean formula from the weights of the dNL networks. As an example, it is easy to see that if m_1 = 1 and m_3 = 1 and the other membership weights of the conjunctive neuron are equal to 0, the resulting Boolean function can be interpreted as f_conj = x_1 ∧ x_3. This is also the case for the multi-layer design. Since the representation in each layer is explicit, if we cascade conjunction and disjunction layers to form a dNL-DNF structure, the resulting multi-layer network still represents the Boolean logic in an explicit manner which is easily interpretable and human-readable. Inductive Logic Programming (ILP) is an important branch of machine learning that uses formal logic to define and solve problems. Since in this framework the ability to express and learn symbolic Boolean functions is usually required, standard DNN models are not easy to use. Indeed, most of the neural implementations of ILP provide no explicit representation of

the learned logic. In this chapter we demonstrate how to exploit the explicit representational power of dNL to design a neural-based differentiable ILP solver which, unlike most neural-based approaches, provides an interpretable hypothesis.
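As a small, hypothetical illustration of this readability, the near-binary membership weights of a trained conjunction neuron can be thresholded and printed as a formula; the 0.5 threshold and the helper below are our own, not part of the solver.

    def conjunction_to_formula(memberships, atom_names, threshold=0.5):
        # Atoms whose membership weight is (close to) 1 appear in the conjunction.
        picked = [a for m, a in zip(memberships, atom_names) if m > threshold]
        return " AND ".join(picked) if picked else "True"

    # m1 = 1 and m3 = 1 with all other weights 0 reads back as "x1 AND x3".
    print(conjunction_to_formula([1, 0, 1, 0], ["x1", "x2", "x3", "x4"]))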

3.2 Logic Programming

Logic programming is a programming paradigm in which we use formal logic (and usually first-order-logic) to describe relations between facts and rules of a program domain. In this framework, a definite clause is a rule of the form:

H ← B1,B2,...,Bm (3.1)

where H is called head of the clause and B1,B2,...,Bm is called body of the clause. This rule expresses that if all the atoms in the body are true, the head is necessarily true. We assume each of the terms H and B are made of atoms. Each atom is created by applying an n-ary Boolean function called predicate to some constants or variables. A predicate states the relation between some variables or constants in the logic program. Throughout this chapter we will use small letters for constants and capital letters (A, B, C, . . . ) for variables. In ILP, a problem can be defined as a tuple (B, P, N ) where B is the set of background assumptions and P and N are the set of positive and negative examples, respectively. Given this setting, the goal of the ILP is to construct a logic program (usually expressed as a set of definite clauses, R) such that it explains all the examples. More precisely,

B, R ⊨ e, ∀e ∈ P;   B, R ⊭ e, ∀e ∈ N,   (3.2)

where ⊨ and ⊭ are the logical entailment operator and its negation, respectively. Let's consider the logic program for learning the lessThan predicate over natural numbers and assume that our set of constants is C = {0, 1, 2, 3, 4} and the ordering of the natural numbers is defined using the predicate inc (which defines increments of 1). The set of background atoms which describes the known facts about this problem is:

B = {inc(0, 1), inc(1, 2), inc(2, 3), inc(3, 4)} (3.3)

Further, we can easily provide the positive and negative examples as:

P = {lt(a, b) | a, b ∈ C, a < b}

N = {lt(a, b) | a, b ∈ C, a ≥ b} (3.4)

It is easy to verify that the program with rules defined in the following entails all the positive examples and rejects all the negative ones:

lessThan(A, B) ← inc(A, B)

lessThan(A, B) ← inc(A, C), lessThan(C,B) (3.5)

We can view the ILP as a supervised learning problem that generates a valid set of rules such as (3.5), given the input facts in (3.3) and the training labels provided as in (3.4). The consequences of applying the rules of the program are obtained via forward

chaining. Following the notation used in [27], define cnR(X) as immediate consequence of applying rules R on the set of ground atoms in X, i.e.:

cn_R(X) = X ∪ {γ | γ ← γ_1, . . . , γ_m ∈ Ground(R), γ_i ∈ X},   (3.6)

where Ground(R) are the ground rules of R. If we start from the set of background facts, we can iteratively obtain all the consequences of the rules. As an example, for the logic

31 program lessThan we can obtain the consequences of the two defined rules in (3.5) as:

Cn_R^(0) = B = {inc(0, 1), inc(1, 2), inc(2, 3), inc(3, 4)}
Cn_R^(1) = Cn_R^(0) ∪ {lt(0, 1), lt(1, 2), lt(2, 3), lt(3, 4)}
Cn_R^(2) = Cn_R^(1) ∪ {lt(0, 2), lt(1, 3), lt(2, 4)}
Cn_R^(3) = Cn_R^(2) ∪ {lt(0, 3), lt(1, 4)}
Cn_R^(4) = Cn_R^(3) ∪ {lt(0, 4)}   (3.7)

where we use lt as shorthand for lessThan and Cn_R^(t) denotes the consequences at time stamp t. Notice that applying the rules after the time stamp t = 4 does not yield any new consequences. In general, there are two main approaches used by various ILP solver systems to generate a valid hypothesis:

• In the bottom-up family of approaches (e.g., Progol), the solver starts by examining the provided examples and extracts specific clauses from those and tries to generalize from those specific clauses.

• In the top-down approaches (e.g., most neural implementations such as dILP [27] as well as Metagol), the possible clauses are generated via a template and the generated clauses are tested against positive and negative examples.

Since the space of possible clauses are vast, in most of these systems, very restrictive template rules are employed to reduce the size of the search space. For example, dILP [27] allows for rules with at most two atoms in the body and only two rules per each predicate. Metagol [12] employs a more flexible approach by allowing the programmer to define the rule templates via some meta-rules. However, in practice, this approach does not resolve the issue completely. Even though it allows for more flexibility, defining those templates is itself a complicated task which requires expert knowledge and possible trials and it can still

32 lead to exponentially large space of possible solutions. Alternatively, we propose a novel approach which allows for learning an arbitrary complex Boolean functions as rules. Here, we describe the intuition behind our proposed approach. In most ILP systems the rules are restricted to Horn clauses. A Horn clause is a clause with at most one positive literal (atom). For example, (3.1) defines a generic horn clause in the implication form (if-then form). In our system, instead of restricting the rules to the Horn clauses, we can consider a more general form of clause where the body could be defined as a DNF (disjunctive normal form) Boolean function. In this view, multiple Horn clause rules involving each predicate can be viewed as disjunctive terms in a single DNF rule. For example, the two rules in (3.5) can be combined into a single rule described as:

lessThan(A, B) ← inc(A, B) ∨ (lessThan(A, C) ∧ inc(C,B)) (3.8) where ∨ and ∧ represent Boolean disjunction and conjunction functions, respectively. It is easy to see that the consequences of these two forms (i.e., (3.5) and (3.8)) are equivalent. We recall that the differentiable dNL-DNF functions introduced in Chapter 2 are able to learn and represent any arbitrary DNF functions. We associate a dNL-DNF function to each DNF rule in our logic programs. We can view any dNL-DNF function, as a differentiable symbolic Boolean function, parameterized by φ. Here, φ represents the set of all membership weights for the conjunctive and disjunctive neurons used in designing a dNL-DNF function. We notice that each instance of φ corresponds to a distinct Boolean function. At each point in the learning, given the current parameters φ, we can apply the forward chaining and obtain all the consequences of the rules iteratively. By comparing the obtained consequences against the provided examples (i.e., P and N ), we can adjust φ. Moreover, since this parameterization is differentiable, this approach leads us to reformulating the ILP as a differentiable problem which can be learned by a gradient based optimization technique.
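Before moving to the fuzzy formulation, the effect of forward chaining with the rule in (3.8) can be checked with a few lines of ordinary (crisp) Python; this toy sketch, with names of our own choosing, reproduces the consequences listed in (3.7).

    def less_than_consequences(inc, t_max=5):
        # lessThan(A,B) <- inc(A,B) OR (lessThan(A,C) AND inc(C,B)), applied until a fixed point.
        lt = set()
        for _ in range(t_max):
            new = set(inc)                                                    # first disjunct
            new |= {(a, b) for (a, c) in lt for (c2, b) in inc if c == c2}    # recursive disjunct
            if new <= lt:
                break                                                         # no new consequences (t = 4 in (3.7))
            lt |= new
        return lt

    inc = {(0, 1), (1, 2), (2, 3), (3, 4)}
    print(sorted(less_than_consequences(inc)))   # every pair (a, b) with a < b, as in (3.7)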

3.3 Formulating the dNL-ILP

We assume that for each predicate we can define Np rules. We do not restrict the rules to Horn clauses. We assume the body of rules can be represented by a (fuzzy) Boolean function

from dNL, i.e., F_p^i, and the head of each rule contains only one atom. If we allow for num_var^i(p) variables in the body of the i-th rule for the predicate p, the set of possible (symbolic) atoms in the body of that rule is given by:

I_p^i = ∪_{p* ∈ P} T(p*, V_p^i),  where   (3.9)

T(p, V) = {p(arg) | arg ∈ Perm(V, arity(p))}   (3.10)

where V_p^i is the set of variables (|V_p^i| = num_var^i(p)) for the i-th rule and P is the set of all the predicates in the program. Further, the function Perm(S, n) generates the set of all the permutations of tuples of length n from the elements of a set S, and the function arity(p) returns the number of arguments in predicate p. As an example, assume that in the lessThan program we allow for one rule for learning the predicate lessThan, and we set num_var^1(lt) = 3, i.e., V_lt^1 = {A, B, C}. Consequently, the set of possible atoms that can be used in the definition of the predicate lessThan can be enumerated as:

I_lt^1 = {inc(A, A), inc(A, B), inc(A, C), . . . , inc(C, C)} ∪ {lt(A, A), lt(A, B), . . . , lt(C, C)}

Please note that the neural dNL functions are parameterized by membership weights but they are not symbolic functions themselves. For each substitution θ (of constants into variables), the actual fuzzy Boolean vector that is passed to a dNL network during the program runtime is the Boolean vector I_p^i|θ generated by the substitution θ of constants into the variables in I_p^i. In other words, I_p^i|θ is the vector corresponding to the groundings of the atoms in set I_p^i.

3.3.1 Forward Chaining

We are now able to formulate the ILP problem as an end-to-end differentiable neural network.

Let G be the set of all ground atoms and G_p be the subset of G associated with predicate p. We associate a (fuzzy) value vector X_p^(t) with each predicate p at time stamp t, which holds the (fuzzy) Boolean values corresponding to the groundings of p. We set the initial values of the valuation vectors via the background atoms, i.e.,

∀p, ∀e ∈ G_p:   X_p^(0)[e] = 1 if e ∈ B, and 0 otherwise.   (3.11)

For the example in consideration (i.e., lessThan), we will have two valuation vectors corresponding to the two predicates in the program; the vector X_inc^(t) includes the Boolean values for the atoms in {inc(0, 0), inc(0, 1), . . . , inc(4, 4)} and, similarly, the vector X_lt^(t) includes the Boolean values for the atoms in {lt(0, 0), lt(0, 1), . . . , lt(4, 4)}.

For every ground atom e ∈ G_p, let Θ_p^i(e) be the set of all the substitutions which would result in the atom e. For example, in the lessThan program, for the ground atom lt(0, 2) we have Θ_lt^1(lt(0, 2)) = {{A ↦ 0, B ↦ 2, C ↦ 0}, . . . , {A ↦ 0, B ↦ 2, C ↦ 4}}, where X ↦ x denotes the substitution of constant x into variable X. For every predicate p, we can now define the one step forward inference at time stamp t as:

∀e ∈ G_p:   X_p^(t+1)[e] = X_p^(t)[e] ∨ ⋁_{i=1}^{N_p} ⋁_{θ ∈ Θ_p^i(e)} F_p^i|θ   (3.12)

To define forward chaining, we must distinguish among predicates, as explained in the following. In general, there are two types of predicates: extensional predicates are those predicates that are solely defined by the background facts; intensional predicates, on the other hand, use other predicates as well as variables in their definition. In the lessThan example, the predicate inc is extensional and the target predicate lessThan is intensional. Please note that the forward chaining is only applied to the intensional predicates. For the extensional predicates, the groundings of the predicate do not change during the forward chaining and are given by the background knowledge.

Figure 3.1: Internal representation of F_lt^1 via a dNL-DNF function with a hidden layer of size 4

Figure 3.2: The diagram for one step forward chaining for predicate lt, where F_lt is implemented using a dNL-DNF network

Table 3.1: Some of the notations used in this chapter

Notation      | Explanation
p/n           | a predicate p of arity n
C             | the set of constants in the program
B             | the set of background atoms
P             | the set of positive examples
N             | the set of negative examples
G_p           | the set of ground atoms for predicate p
G             | the set of all ground atoms
P             | the set of all predicates in the program
X_p^(t)       | the fuzzy values of all the ground atoms for predicate p at time t
A ↦ a         | substitution of constant a into variable A
R             | the set of all the rules in the program
perm(S, n)    | the set of all the permutations of tuples of length n from the set S
T(p, V)       | the set of atoms involving predicate p and using the variables in the set V
Cn_R^(t)      | the consequences of applying rules in R at timestamp t
I_p^i         | the set of all atoms that can be used in the body of the i-th rule for predicate p

Further,

for brevity, we did not introduce the indexing notations in (3.12): by X_p^(t)[e] we actually mean the entry in X_p^(t) corresponding to the grounding e. Algorithm 1 shows the outline of the t_max steps of forward chaining in the proposed dNL-ILP via the application of equation (3.12). To understand equation (3.12) and Algorithm 1, we use our lessThan example once again. Assume that we use a dNL-DNF function with a hidden layer of size 4 (i.e., applying a disjunction neuron to the output of 4 conjunction neurons). Fig. 3.1 shows the internal representation of the symbolic predicate function F_lt^1 = inc(A, B) ∨ (lessThan(A, C) ∧ inc(C, B)) in the corresponding dNL-DNF function. Here, only the membership weights corresponding to the atoms used in F_lt^1 (i.e., the conjunction membership weights m^c_{1,2}, m^c_{4,3} and m^c_{4,17}, as well as the two disjunction membership weights m^d_{1,1} and m^d_{1,4}) are equal to 1 and the rest of the membership weights are zero. Fig. 3.2 shows one step of forward chaining via (3.12) for the predicate lt. Steps 4 through 8 of Algorithm 1 show how the values in X_lt corresponding to each grounding e are updated at each time stamp t. For example, consider the case of e = lt(1, 2) at the first time stamp t = 1. Initially, according to (3.11), all entries in X_lt will be zero. To find all the consequences, we need to apply the rules (i.e., predicate functions) for each possible substitution (all possible groundings) and aggregate all the resulting consequences. Steps 4 to 7 of Algorithm 1 simply aggregate (via disjunction) those consequences that have the same grounding. For instance, when e = lt(1, 2), this includes all the substitutions θ = {A ↦ 1, B ↦ 2, C ↦ 0} through θ = {A ↦ 1, B ↦ 2, C ↦ 4} (see Fig. 3.2). It is easy to verify that F_lt^1|θ returns 1 (True) for each of these substitutions. As such, the newly calculated consequence containing the grounding e = lt(1, 2) will be evaluated as 1 ∨ 1 ∨ 1 ∨ 1 ∨ 1 = 1. After obtaining the new value, we have to aggregate it with the previous consequences (similar to the role of ∪ in (3.6) and (3.7)) according to step 8 of the algorithm. By following the same process for all the groundings, we can obtain all the consequences, similar to (3.7).

Algorithm 1: Outline of the forward chaining in the proposed dNL-ILP solver

Data: N_p : number of rules used for defining predicate p
Data: X_p^(0) for p ∈ P
Result: X_p^(t_max) for p ∈ P
1 for t ∈ {1, . . . , t_max} do
2   for p ∈ P do
3     for e ∈ G_p do
4       new_value = 0
5       for i ∈ {1, 2, . . . , N_p} do
6         for θ ∈ Θ_p^i(e) do
7           new_value = new_value ∨ F_p^i|θ   /* updating the groundings */
8       X_p^(t)[e] = X_p^(t-1)[e] ∨ new_value
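The inner loop of Algorithm 1 can be sketched in Python as follows. This is a rough, unbatched illustration (the real implementation gathers and evaluates all substitutions in parallel, as noted in Section 3.6), and the probabilistic-sum fuzzy OR is one common choice, assumed here for concreteness.

    def fuzzy_or(a, b):
        # Probabilistic sum: a OR b = 1 - (1 - a)(1 - b).
        return 1.0 - (1.0 - a) * (1.0 - b)

    def update_grounding(x_prev, rules):
        """Steps 4-8 of Algorithm 1 for a single ground atom e.

        x_prev : fuzzy value X_p^(t-1)[e]
        rules  : list of (F, thetas) pairs, one per rule i, where F is the neural
                 predicate function F_p^i and thetas holds the input vectors I_p^i|theta
                 for every substitution theta in Theta_p^i(e).
        """
        new_value = 0.0
        for F, thetas in rules:
            for x_theta in thetas:
                new_value = fuzzy_or(new_value, F(x_theta))   # step 7
        return fuzzy_or(x_prev, new_value)                    # step 8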

3.3.2 Training

We interpret the final values X_p^(t_max)[e] (after t_max steps of forward chaining) as the conditional probability of the value of a ground atom given the model's parameters. We define the loss as the average cross-entropy between the ground truth (provided by the positive and negative examples for the corresponding predicate p) and X_p^(t_max):

L = − E_{p ∈ P} E_{(e, λ) ∈ Λ_p} [ λ log X_p^(t_max)[e] + (1 − λ) log(1 − X_p^(t_max)[e]) ],   (3.13)

where Λ_p = {(γ, 1) | γ ∈ P ∩ G_p} ∪ {(γ, 0) | γ ∈ N ∩ G_p}. We train the model using the ADAM optimizer [59] to minimize the loss function with a learning rate of 0.001 (in some cases we may increase the rate for faster convergence). After the training is completed, a zero cross-entropy loss indicates that the model has been able to satisfy all the examples in the positive and negative sets.
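In code, the loss in (3.13) amounts to an ordinary cross-entropy over the labelled ground atoms; a minimal NumPy sketch (the names and the clipping constant are ours) is:

    import numpy as np

    def ilp_loss(x_final, labels, eps=1e-7):
        # x_final: fuzzy values X_p^(t_max)[e] for the labelled atoms; labels: 1 (positive) or 0 (negative).
        x = np.clip(np.asarray(x_final, dtype=float), eps, 1.0 - eps)
        y = np.asarray(labels, dtype=float)
        return float(-np.mean(y * np.log(x) + (1.0 - y) * np.log(1.0 - x)))

    print(ilp_loss([0.99, 0.02, 0.97], [1, 0, 1]))   # close to zero when the examples are satisfied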

3.3.3 Pruning

There might exist a few atoms with membership weights of ’1’ in the corresponding dNL network for a predicate which are not necessary for the satisfiability of the solution. For example, consider the predicate function with two rules in (3.5). It is easy to see that if we replace the first rule with (lessThan(A, B) ← inc(A, B), inc(A, C)), the resulting rules still satisfy the examples. This is because in this example we can always find a substitution that generates the same consequences (e.g., substitutions with B = C) for the first rule. Since such a solution also leads to a zero loss function, we cannot remove the redundant atom (inc(A, C)) via the training objective function (3.13). We consider two approaches to deal with these situations. In the first approach, we consider adding a penalty term

corresponding to the number of terms in each clause. For each predicate function F_p^i we add a corresponding penalty term as:

L_agg = L + λ_prn L_prn,    L_prn = Σ_{p ∈ P} relu( Σ_{m_i ∈ F_p^i} m_i − N_max ),   (3.14)

where the subscript prn is short for pruning and N_max is the number of allowed terms in F_p^i. We usually use a very small λ_prn to make sure it only affects the learning at the later stages of convergence, where the loss is very small. Alternatively, we can use a simpler approach to remove the unnecessary atoms from solutions: in the final stage of the algorithm, we remove each remaining atom from a rule if switching its membership variable from 1 to 0 does not change the loss function.

3.3.4 Lack of Uniqueness

In general, the solutions that the dNL-ILP finds are not unique. In addition to the cases that we discussed in pruning section, there can be various correct answer for a problem. For example, by replacing the body of second rule in (3.5) with inc(C,B), lessThan(A, C), we can obtain another valid solution to the ILP problem. Since the membership weights are randomly initialized, in different runs of the algorithm we may obtain different, yet still valid solutions. Hence, in the rest of this chapter, the provided solutions are only one of the possible obtained solutions during a run of the algorithm.

3.3.5 Predicate Rules (F_p^i)

In the majority of the ILP systems, the rules are restricted to Horn clauses, i.e., the body of each rule is made of the conjunction of some atoms. Even though we can still use the same convention, in our model it is sometimes easier and more efficient to combine all the rules involving a predicate by learning the body of the rule as a dNL-DNF function. Alternatively, we can also restrict the program to Horn clauses by using a set of dNL-CONJ functions as the bodies of the predicate rules instead. Moreover, compared to most ILP systems, our approach allows for a more flexible way of learning predicates. Let L = |I_p^i| be the number of all distinct atoms that can be used in the body of a rule. The neural ILP solver dILP [27] limits the search space for the choices in the body of each rule to the (L choose 2) combinations of two atoms. This is indeed a very small fraction of the 2^L total possibilities that our model considers. While the ability to restrict the space of possible clauses is sometimes desirable and may speed up the learning, in most systems this approach significantly reduces the possible formulas. In our proposed approach, the space of possible clauses can still be restricted to speed up the learning; for example, we can exclude certain predicates from a clause. However, we do not need to explicitly specify the form of the possible formulas, unlike Metagol [12] and dILP. Finally, we can easily allow for including the negation of each atom in the formula by concatenating the vector I_p^i|θ and its fuzzy negation, i.e., (1.0 − I_p^i|θ), as the input to the F_p^i function. This would only double the number of parameters of the model, assuming Horn clause rules. In contrast, in most other implementations of ILP, this would increase the number of parameters and the problem complexity at much higher rates.

3.4 ILP as a Satisfiability Problem

In our model, we associate a dNL function F_p^i with the i-th rule of every intensional predicate p. We can view the membership weights in the dNL function corresponding to each F_p^i as Boolean flags that indicate whether each atom in a rule is off or on. In this view, the problem of ILP can be seen as finding an assignment to these membership Boolean flags such that the resulting rules applied to the background facts entail all positive examples and reject all negative examples. Moreover, by allowing these membership weights to be learnable weights, we are formulating a continuous relaxation of the satisfiability problem. This approach is in some ways similar to the approach in dILP [27], but differs in how we define the Boolean flags. In dILP, a Boolean flag is assigned to each of the possible combinations of two atoms from the set I_p^i. In contrast, in our approach the membership weights of the conjunction (or any other logical function from dNL) can be directly interpreted as the flags in the satisfiability interpretation. As such, we can learn any arbitrary Boolean function of the atoms in the vector I_p^i instead of a limited set of possibilities defined by restrictive templates.

3.5 Interpretability

Since our formulation is based on the fuzzy notion of Boolean logic, in general the membership weights of the dNL networks do not necessarily converge to 0 or 1 and hence the learned formula may be difficult to interpret. Fortunately, in problems where a definite solution can be found, as we will show via experiments (e.g., the experiments in Sections 3.7.1 and 3.7.3), the membership weights converge to 0 and 1 and, as such, the solution is a valid Boolean formula. On the other hand, in cases where there is ambiguity in the data and no exact Boolean function can describe all the background facts (e.g., the task of classification of relational datasets such as Mutagenesis (Section 4.2)), there is a trade-off between the interpretability and the performance of the learning task. To be able to control this trade-off we have added an optional penalty term to the loss function, i.e.,

L_agg = L + λ_prn L_prn + λ_int L_int,   where   L_int = E_{p ∈ P} E_{m ∈ F_p} [ m (1 − m) ],   (3.15)

where the subscript int stands for interpretability and m represents each of the membership weights used in the predicate functions. It is easy to see that m(1 − m) is at its minimum (0) when the membership weights are at 0 or 1, and the constant λ_int controls the trade-off.
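The penalty in (3.15) is straightforward to compute from the membership weights; a short sketch (ours) is:

    import numpy as np

    def interpretability_penalty(memberships):
        # E[m(1 - m)]: zero exactly when every membership weight is 0 or 1.
        m = np.asarray(memberships, dtype=float)
        return float(np.mean(m * (1.0 - m)))

    print(interpretability_penalty([0.0, 1.0, 0.5]))   # 0.0833..., driven down when weighted by lambda_int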

3.6 Implementation Details

For the dNL-DNF functions associated with each learnable predicate we need to initialize the membership weights at the start of the training. During the experiments, we realized that, in general, the models are not sensitive to the initial values of the membership weights w_i. While the speed of convergence somewhat depends on the initial values, in the presented experiments the network is able to find the optimal settings and converges to the desired output when the weights are initialized using random Gaussian functions. In cases where the dimension of the input vector is very large, extra care may be needed. Due to the multiplicative design of the neurons in dNL functions, when many of the membership variables have values between zero and one, the gradient can become very small. To avoid this situation, we must ensure that most of the membership variables are almost zero at the beginning of training. As such, we usually initialize the weights with a normal distribution with negative mean (e.g., a mean of -1 or -2, depending on the size of the input vector). We have implemented the dNL-ILP solver model using Python and TensorFlow [64].¹ In the previous sections, we have outlined the process in a sequential manner. However, in the actual implementation we first create index matrices using all the background facts before starting the optimization task (the Gather operation in Fig. 3.2). Further, at each time stamp and for each intensional predicate, all instances of applying (executing) the neural function F_p^i are carried out in a batch operation and in parallel. In our approach, usually there is no need for any tuning or parameter specification other than the signature of each predicate rule (arity and the number of additional variables). On the other hand, since we use a propositionalization step (typical of almost all neural ILP solvers), special care may be required when the number of constants in the program is very large. In general, we have found that the program is not sensitive to the learning rate and we have set the default learning rate to 0.001. In some of the problems we may increase the learning rate to speed up the convergence. For example, for the classification of the IMDB dataset, we needed more than 200 epochs for convergence when the learning rate was set to 0.001, compared to only 5 epochs for a learning rate of 0.1. Further, in all the experiments presented in this chapter, we assume the interpretability penalty term

1Available at https://github.com/DemoRep79/dNLILP

(i.e., λ_int) is zero unless stated otherwise. For all the intensional predicates we need to specify the number of variables as well as the type of dNL function that is used to represent F_p^1. Since in most cases we use only one rule, with the body of the rule defined as a dNL-DNF function, the size of the hidden layer for the DNF must be specified. This corresponds to the maximum number of Horn clauses that are allowed for a learnable predicate. Since we may not know this parameter in advance, we usually start by choosing a number based on the complexity of the problem. For example, in the case of the lessThan example in Fig. 3.1, we started with a hidden layer of size 4, but the program needed only two conjunction neurons and the membership weights in the disjunction function corresponding to the two other rules were found to be zero. In the provided experiments we have used t_max from 1 to 8, depending on the task. Increasing the maximum number of forward chaining steps linearly increases the runtime of the program. This is due to the fact that the forward chaining steps cannot be parallelized and should be calculated sequentially. As such, we do not increase this number beyond what is necessary, in order to keep the programs fast.

3.7 Experiments

Different ILP systems can be compared in various ways. ILP systems such as Metagol and dILP are able to learn the logical relationships in an explicit manner and as such are good candidates for learning recursive and algorithmic tasks. Many other ILP systems are mainly focused on the larger-scale task of learning from relational datasets and are usually evaluated by their performance in relational classification tasks. It suffices to say that no single ILP system can outperform all others in every single task. The proposed dNL-ILP system offers an explicit representation of the learned logic and is capable of learning recursive predicates of arbitrarily complex form without the need to specify meta-rules, unlike Metagol [12] and dILP [27]. Further, unlike Metagol, because of the fuzzy nature of the membership weights, it can also be used for classification tasks, similar to approaches based on probabilistic ILP frameworks. Additionally, the differentiable aspect of this solver makes it possible to combine it with deep learning models such as convolutional networks to directly learn relational information from images. In Chapter 5 we exploit this characteristic of the dNL-ILP to formulate a relational reinforcement learning framework for policy learning which can directly learn from images. On the other hand, since our proposed model works by propositionalization, it cannot naturally handle partially defined clauses and is not suitable for such applications. In the following, we first examine the performance of dNL-ILP in learning algorithmic tasks, especially tasks involving recursion, and compare its performance to Metagol and dILP. Next, we consider classification tasks involving real-life and large-scale datasets and compare its performance to other ILP systems that are mainly based on the probabilistic approach to ILP [19].

3.7.1 Benchmark ILP Tasks

We tested the proposed algorithm on the 20 symbolic tasks described in [27] which are taken from 4 domains: arithmetic, lists, graphs and family tree. The details of these experiments can be found in Appendices G.1 to G.20 of [27]. Even though some of these tasks require predicate invention and recursion (e.g., the task of learning the circular predicate in graph domain), all of these tasks are still considered simple algorithmic tasks as they would only involve predicates with arity of at most 2 and small number of constants. However, as reported in [27], dILP cannot always find the right solutions (e.g. only 49% of time it can solve even/odd problem). Further, Metagol, if not provided with the exact form of meta rules, is not able to solve some of the tasks involving recursion (e.g. Learning the connectedness and also circular predicates in graph domain). In contrast, as shown in Table 3.2, our proposed algorithm can always find a solution to these problems. The implementation of many of these tasks as well as some additional experiments are provided in the accompanying source code. Here we briefly describe a few tasks and the results of our algorithm.

• Circular predicate. In this experiment, the goal is to learn the circular property of a node in a directed graph. A node is called circular if there exists a connected path from that node to itself. In this and other graph-related tasks, the background knowledge consists of the groundings of the predicate edge(A, B) which defines a directed graph between nodes. The constants in this example are the set of nodes labeled as C = {a, b, c, d, e, f, g, h, i, u, v}. In addition to the predicate edge, we define two intensional predicates: a learnable auxiliary predicate of arity 2 (i.e., aux/2) and the target predicate circular/1. We allow for one additional variable in the definition of each of these predicates. The sets of positive and negative examples corresponding to the graph depicted in Fig. 3.3 are P = {b, c, d, e, u, v} and N = {a, f, g, h, i}. For F_aux^1 we allow 3 variables and we use a dNL-DNF function with a hidden layer of size 4. For F_circular^1 we allow for 3 variables and we use a dNL-DNF function with a hidden layer of size 2. One of the obtained solutions is:

aux(A, B) ← edge(A, B)

aux(A, B) ← edge(A, C), aux(C,B)

circular(A) ← aux(A, A)

This simple example uses predicate invention and involves recursive predicates. Even though we did not specify any example for the predicate aux, the learned formula for this auxiliary predicate is meaningful and represents the connectedness relation. It is worth mentioning that these solutions are not unique. In a different run, we obtained the below solution for this problem:

aux(A, B) ← edge(A, B)

aux(A, B) ← edge(A, C), aux(C,B)

circular(A) ← edge(A, B), aux(B,A)

Even though these two sets of rules appear to be different, it is easy to verify that the two solutions are logically equivalent.

Figure 3.3: Background graph used in learning the circular predicate

• Family Tree: Grandparent. In this task, we use a subset of the English royals family tree consisting of background relations such as son, father, . . . , and including 25 constants; the goal is to learn the grandparent relation. One of the obtained solutions is:

grandparent(A, B) ← son(C,A), uncle(C,B)

grandparent(A, B) ← son(C,A), father(C,B)

Obviously, a different background knowledge would have resulted in a different definition for the target predicate. For instance, in another problem formulation, we add an auxiliary predicate of arity 2 with no additional variables. One of the solutions found by dNL-ILP is:

aux(A, B) ← mother(A, B)

aux(A, B) ← father(A, B)

grandparent(A, B) ← aux(A, C), aux(C,B)

Even though no definition was provided for the auxiliary predicate, it is noteworthy that just by introducing the auxiliary predicate into the learning, the model learns higher-level concepts and uses them to define the target predicate. In the above example, the auxiliary predicate clearly resembles the concept of parenthood and was used as the building block to define the target predicate grandparent.

3.7.2 Learning Decimal Multiplication

As discussed in Chapter 2, neural approaches based on sequence-to-sequence recurrent networks and GRU networks such as [7] are able to learn arithmetic to some degree; for example, the Neural GPU can learn binary multiplication of two 20-bit numbers

after training over millions of samples and in general those methods are not able to generalize well above the length of training samples. In contrast, here we demonstrate how the proposed dNL-ILP can efficiently learn difficult algorithmic tasks such as multiplications of decimal numbers by using only a few examples. Moreover, the learned recursive formula can be applied to any number and can perfectly generalize to all natural numbers since it captures the logical operations required to perform decimal multiplication. We use dNL-ILP solver for learning the predicates mul/3 for decimal multiplication using only the positive and negative examples. We use C = {0, 1, 2, 3, 4, 5, 6} as constants and our background knowledge consists of the extensional predicates {zero/1, inc/2, add/3}, where zero/1 define the decimal number 0, inc/2 defines increment of one and add/3 defines the addition. It is worth noting that the dNL-ILP solver does not consider constants in learning predicates and the predicate formula can only contain other predicates and variables. Thus, when there is a need for referring to any specific constant in the formula, we should define a corresponding predicate as part of background knowledge. Here for example, since we need to distinguish the special number, zero, we need to define an extensional predicate zero/1. In this case, the background fact corresponding to this predicate is the ground atom zero(0) which states that the predicate zero/1 is evaluated as True when its argument is the decimal number, 0. The target predicate is mul/3 and we allow for using 5 variables (i.e., num vari(mul) = 5) in each rule. We use a dNL-DNF network with 4 disjunction terms (4 conjunctive rules)

49 Table 3.2: dNL-ILP vs dILP and Metagol in benchmark tasks

Domain       | Task             | dILP | Metagol | dNL-ILP
Arithmetic   | Predecessor      | 100  | 100     | 100
Arithmetic   | Even             | 100  | 100     | 100
Arithmetic   | Even-Odd         | 49   | 100     | 100
Arithmetic   | Less than        | 100  | 100     | 100
Arithmetic   | Fizz             | 10   | 100     | 100
Arithmetic   | Buzz             | 35   | 100     | 100
List         | Member           | 100  | 100     | 100
List         | Length           | 93   | 100     | 100
Family Tree  | Son              | 100  | 100     | 100
Family Tree  | GrandParent      | 97   | 100     | 100
Family Tree  | Husband          | 100  | 100     | 100
Family Tree  | Uncle            | 70   | 100     | 100
Family Tree  | Relatedness      | 100  | 0       | 100
Family Tree  | Father           | 100  | 100     | 100
Graph        | Undirected Edge  | 100  | 100     | 100
Graph        | Adjacent to Red  | 51   | 100     | 100
Graph        | Two Children     | 95   | 100     | 100
Graph        | Graph Colouring  | 95   | 100     | 100
Graph        | Connectedness    | 100  | 0       | 100
Graph        | Cyclic           | 100  | 0       | 100

for learning Fmul. It is worth noting that since we do not know in advance how many rules would be needed, we should pick an arbitrary number and increase in case the ILP program

cannot explain all the examples. Further, we set the tmax = 8. One of the solutions that our model finds is:

mul(A, B, C) ← zero(B), zero(C)

mul(A, B, C) ← mul(B, A, C)

mul(A, B, C) ← mul(A, D, E), inc(D,B), add(E, A, C)

Please note that the two last rules are learned as recursive rules. Furthermore, the first rule has learned that if one of the operands is zero the answer would be zero.
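Reading the learned clauses as a function that computes C from A and B, one can hand-check that they indeed implement multiplication; the short Python rendition below is our own sanity check, not an output of the solver (the commutativity clause mul(A, B, C) ← mul(B, A, C) is not needed for this functional reading).

    def mul(a, b):
        # mul(A,B,C) <- zero(B), zero(C)                 : base case
        # mul(A,B,C) <- mul(A,D,E), inc(D,B), add(E,A,C) : D = B-1, E = A*D, C = E + A
        if b == 0:
            return 0
        return mul(a, b - 1) + a

    assert all(mul(a, b) == a * b for a in range(7) for b in range(7))
    print(mul(6, 5))   # 30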

50 3.7.3 Sorting

In this experiment we consider the task of sorting an ordered list via ILP. Our goal is to find a recursive formula for the sort predicate such that sort(A, B) is true only if list B is the sorted version of list A. The sorting task is more complex than the previous tasks since it requires not only the list semantics, but also many more constants compared to the arithmetic problem. We implement the list semantics by allowing the use of functions in defining predicates. For data of type list, we define two functions, head (H) and tail (t), which allow for decomposing a list into its head and tail elements, i.e., A = [A_H | A_t]. We use the elements of {a, b, c, d} and all the ordered lists made from permutations of up to three elements as the constants in the program (i.e., |C| = 40). We use extensional predicates such as gt (greater than), eq (equals) and lte (less than or equal) to define the ordering between the elements of lists as part of the background knowledge. We use a dNL-DNF with a hidden layer of size 4 for F_sort^1 and we allow for using 4 variables (and their head and tail functions). One of the solutions that our model finds is:

sort(A, B) ← sort(A_H, C), lte(C_t, A_t), eq(B_H, C), eq(A_t, B_t)
sort(A, B) ← sort(A_H, C), gt(C_t, A_t), sort(D, B_H), eq(B_t, C_t), eq(D_H, C_H), eq(A_t, D_t)

We may observe that the first two atoms in the body of each learned rules, act like an if/then

instruction: if the tail element of the sorted part of A_H is less than the tail of A, the first recursive rule applies; otherwise the second rule applies. Even though the above examples involve learning tasks that may not seem very difficult on the surface, and deal with a relatively small number of constants, they are far from trivial. To the best of our knowledge, learning a recursive predicate for a complex algorithmic task such as sort, which involves multiple recursive rules with 6 atoms and includes 12 variables (counting the two functions head and tail per variable), is beyond the power of any existing ILP solver. Here, for example, the total number of possible atoms to choose from is |I_sort| = 176. For comparison, consider template-based approaches such as dILP. In such an approach, all the possible combinations with 6 atoms in the body, i.e., (176 choose 6) > 3 × 10^10, should be considered. This shows that algorithms that require enumerating the possible candidates via templates (such as dILP and Metagol) are not able to directly tackle problems with this level of complexity.

CHAPTER 4 LEARNING FROM UNCERTAIN DATA VIA DNL-ILP

4.1 Introduction

The experiments in previous chapter showcased the performance of the proposed ILP solver model in learning algorithmic tasks such as learning arithmetic and sorting. Since the dNL- ILP is a differentiable model, implemented via neural networks and uses fuzzy membership weights to form Boolean functions, it can also be used in ambiguous problems where data is either noisy or in cases where no single hypothesis may satisfy all the examples. Classification of relational datasets is one of these tasks which has many applications in a wide variety of fields. Standard ILP systems, e.g. Metagol, are not usually suited for handling these kind of tasks. On the other hand, most neural ILP solvers are either incapable of learning complex and recursive predicates or in the case of dILP [27], cannot handle large scale problems. Alternatively, a probabilistic extension of ILP ([19]) and its variants are usually employed in these tasks. In this chapter we study the application of dNL-ILP in scenarios where we are dealing with uncertain data. First, in Section 4.2 we explore the application of dNL-ILP in the classification of benchmark relational databases and we compare its performance to the state of the art algorithms based on probabilistic notion of ILP. Next, we will consider scenarios where by combining differentiable dNL-ILP with standard neural networks we can learn from continuous data; either by learning decision boundaries (Section 4.3) or by learning the distribution of data via fitting probabilistic models (Section 4.4).

4.2 Classification for Relational Data

Figure 4.1: Internal structure of the MovieLens dataset

Standard classification algorithms, such as regression techniques as well as methods based on decision trees, usually require one feature vector for each data point and are not able to

exploit the intricate relations within various tables in a relational dataset. Fig. 4.1 depicts the internal table structure for one of the datasets used in our experiments. Here for example, one of the benchmark classification tasks is defined as predicating the sex of a movie reviewer (male or female) based on the reviewer’s data. This includes all the information available for each movie the reviewer has reviewed, such as director’s and actor’s information as shown in Fig. 4.1. Classical approaches usually require joining all the tables to create a feature vector which may lead to the explosion of the records and the need for handling missing values. In practice, techniques based on ILP are usually employed for extracting knowledge and reasoning tasks involving relational datasets. In this experiment, we evaluate the performance of our proposed ILP solver in some benchmark ILP classification tasks. We use real-world relational datasets Mutagenesis [33], UW-CSE [32] as well as IMDB, MovieLens (1M) and Cora datasets1. Table 4.1 summarizes the features of these datasets. In all the cases we used the original full-size dataset instead of that are sometime used in various reports (e.g. Movielens 60K, Mutagenesis1, Mutagenesis2, . . . ). For the cases involving unary target predicates (e.g Mutagenesis and Cora) we create a separate background knowledge for each possible groundings of the

1Publicly-available at https://relational.fit.cvut.cz/

54 target predicate. For other cases we create only one background knowledge for each fold in cross validation scheme that consists of all the groundings in that portion of training data. As baseline, we are comparing dNL-ILP with the state of the art algorithm based on

Table 4.1: Dataset Features

Dataset      | Constants | Target Predicate
Mutagenesis  | 7045      | active(A)
UW-CSE       | 1158      | advisedBy(A, B)
Cora         | 3079      | sameBib(A, B)
IMDB         | 316       | workingUnder(A, B)
MovieLens    | 2627      | female(A)

Table 4.2: Some of the learned rules for classification tasks

IMDB task:
workedUnder(A, B) ← director(B), actor(A), movie(C, A), movie(C, B), ¬actor(B)
workedUnder(A, B) ← isFemale(B), movie(C, A), ¬actor(B), ¬movie(C, A)

UW-CSE task:
advisedBy(A, B) ← publication(A, C), publication(B, C), hasanyPosition(B), inanyPhase(A)
inanyPhase(A) ← inPhase(A, B)
hasanyPosition(A) ← hasPosition(A, B)

Markov Logic Networks such as GSLP [65], Problog [66], MLN-B (Boosted MLN)[67]. Ideally, we would liked to include all the recent published works, however for many cases the source codes are not publicly available. Further, since in most of these datasets, the number of negative examples are significantly greater than the positive examples, we report the Area Under Precision Recall (AUPR) curve as a more reliable measure of the classification performance. We use 5-fold cross validations except for the Mutagenesis dataset which we have used 10-fold and we report the average AUPR over all the folds. Table 4.3 summarizes the classification performance for the 4 relational datasets. As the results show, dNL-ILP outperforms the previous algorithms in most cases. The end-to-end design of our differentiable ILP solver makes it possible to combine some other forms of

55 Table 4.3: AUPR score for the 5 relational classification tasks

Dataset     | GSLP | MLN-B | problog | dNL-ILP
Mutagenesis | 0.77 | NA    | 0.99    | 0.99
UW-CSE      | 0.46 | 0.22  | 0.28    | 0.51
Cora        | 0.89 | 0.96  | 0.90    | 0.96
IMDB        | 0.79 | 0.90  | 0.92    | 1.00
MovieLens   | 0.70 | 0.80  | 0.82    | 0.78

learnable functions with the dNL networks. For example, while handling continuous data is usually difficult in most ILP solvers, we can directly learn some threshold values to create binary predicates from the continuous data (see Section 4.3). We have used this method in the Mutagenesis task to handle the continuous data in this dataset. In the previous experiments in Sections 3.7.1 to 3.7.3, when learning is complete (i.e., zero cross-entropy loss), all the membership weights are either zero or one and, as such, we did not incorporate any interpretability loss (i.e., λ_int = 0). However, in complex and ambiguous datasets this is not the case and we usually never obtain zero cross-entropy loss during training. Hence, we need to incorporate the λ_int term if we want to interpret the learned hypothesis as a valid Boolean function. The results presented in Table 4.3 are obtained with

λint = 0. We can trade some performance and achieve better interpretability by adjusting the λint parameter. In IMDB task, adding penalty term did not decrease the classification accuracy, while in case of the UW-CSE task, setting the λint = 0.1 decreased the AUPR from 0.51 to 0.49 but resulted in a valid Boolean function (i.e., all the membership weights converged to 0 and 1). Table 4.2 displays the learned rules for these two tasks after setting

λint = 0.1. Because of the difference in hardware, it is difficult to directly compare the speed of algorithms. In our case, we have evaluated the models using a 3.70GHz CPU, 16GB RAM and GeForce GTX 1080TI graphic card. Using this setup, the problems such as IMDB, Mutagenesis are learned in just a few seconds. For Cora, the model creation takes about one minute and the whole simulation for any fold takes less than 3 minutes.

56 4.2.1 Implementation Details:

In our implementation, an ILP program consists of these elements:

1. the list of all the variables from each data type

2. the definition of the auxiliary predicates

3. the background facts

4. the signature for learning the target predicate

5. the positive and negative examples

Fig. 4.2 shows a screen shot of a python definition of the graph/connected problem using dNL-ILP. In the remaining of this section we summarize the details of each experiments and we explain the choice of parameters used in each dataset:

• Cora: In this task our background knowledge contains facts involving predicates: Has- WordAuthor/2, HasWordTitle/2, HasWordVenue/2, SameAuthor/2, SameTitle/2, Author/2 , Title/2. We also define an auxiliary predicate sameWA as:

sameWA(A,B) ← HasWordAuthor(A,C), HasWordAuthor(B,C)

For the target predicate SameBib/2 we allow for two rules, each consisting of a DNF with a hidden layer of size 1 (i.e., just one conjunction function). For each of these two learnable rules we allow for three variables. For one of them, the extra variable is of type 'Author' and for the other one it is of data type 'Title'. In learning, we use 5-fold cross validation. Given 5 different sets of background facts, at each training step we use data from one of the background fact sets randomly and we test the performance on the held-out portion of data.

57 • MovieLens: In this task our background contains facts involving predicates: age/1, occu- pation/1, rate/1, genre/2, d quality/2, avg revenue/2, isEnglish/2 and movie year/2. For efficiency reasons, we define auxiliary predicates corresponding to each of the possibilities for some of the multi valued data types. For example, if there are three ratings of 1, 2 and 3, we can define three auxiliary predicates such as:

rate1(A) ← rate(A,B), is 1(B)

rate2(A) ← rate(A,B), is 2(B)

rate3(A) ← rate(A,B), is 3(B)

For the target predicate isFemale/0, we define 10 DNF rules (each with hidden layer of 4), and we allow for one extra variable in defining each of those rules. Please note that if we had not defined auxiliary predicates such as rate1/1, rate2/1, rate3/1, we would needed many more variables in the body of the target predicates which would be more difficult for our models to handle. Further, unlike the previous dataset, we create separate background facts corresponding to each movie rater. In our training, for every training step, each training batch is consisted of randomly selected background facts of 500 users from all the total user in the training set. Again, we are using 5-fold cross validation and report the result on the held-out group of users during each fold.

• IMDB: Here, our background predicates are director/1, actor/1, genre/2, isFemale/1 and movie/2, and our target predicate is workedUnder/2. We allow for 4 variables and we use one rule of type DNF with conjunction neurons.

• UW-CSE: Here, our background predicates are student/1, professor/1, publication/2, tought by/2, ta/2, courseLevel/2, has position/2, projectMember/2. Similar to the Cora dataset case, we create multiple predicates for each distinct value of multi-valued quantities such as position (i.e., position faculty, position professor, . . . ). We also define these two

58 auxiliary predicates using transitive relation:

sameProj(A,B) ← projectMember(A,C), projectMember(B,C)

samePublications(A,B) ← publication(A,C), publication(B,C)

Please note that by introducing these two auxiliary predicates we may eliminate the need for adding a variable of type ProjectID for learning the target predicate, which reduces the complexity of the problem. We allow for 2 variables and we use one rule of type DNF with a hidden layer of size 6 to learn the target predicate advisedBy/2.

4.3 Handling Continuous Data

Reasoning using continuous data has been an ongoing challenge for ILP. Most of the current approaches either model continuous data as random variables and use probabilistic ILP framework [19], or use some forms of discretization via iterative approaches. The former approach cannot be applied to the cases where we do not have reasonable assumptions for the probability distributions. Further, the latter approach is usually limited to small scale problems (e.g. [68]) since the search space grows exponentially as the number of continuous variables and the boundary decisions increases. Alternatively, the end-to-end design of dNL-ILP makes it rather easy to handle continuous data. Recall that even though we usually use dNL based functions, the predicate functions F0s can be defined as any arbitrary Boolean function in our model. Thus, for each continuous variable x we define k

lower-boundary predicates gt_{x_i}(x, l_{x_i}) as well as k upper-boundary predicates lt_{x_i}(x, u_{x_i}), where i ∈ {1, . . . , k}. We let the boundary values l_{x_i} and u_{x_i} be trainable weights and we define the lower-boundary and upper-boundary predicate functions as:

F_{gt_{x_i}} = σ( c (x − l_{x_i}) ),    F_{lt_{x_i}} = σ( −c (x − u_{x_i}) ),

where σ is the sigmoid function and c ≫ 1 is a constant.

Figure 4.2: An example of dNL-ILP program definition in python

To evaluate this approach we use it in a classification task for two datasets containing continuous data, Wine and Sonar, from the UCI machine learning repository [69], and compare its performance to ALEPH [70], a state-of-the-art ILP system, as well as the recently proposed FOLD+LIME algorithm [71]. The Wine classification task involves 13 continuous features and three classes, and the Sonar task is a binary classification task involving 60 features. For each class we define a corresponding intensional predicate via dNL-DNF and learn that predicate from a set of lt_{x_i} and gt_{x_i} predicates corresponding to each continuous feature. We set k = 6. Table 4.4 summarizes the results.

Table 4.4: Classification accuracy

Task      ALEPH+LIME   FOLD+LIME   dNL-ILP
Wine      0.92         0.93        0.98
Sonar     0.74         0.78        0.85
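To make the boundary-predicate construction above concrete, the following is a minimal sketch of trainable boundary predicates, assuming TensorFlow 2; the class name, the initialization of the boundaries, and the constant c = 20 are illustrative choices and are not taken from the released implementation.

import tensorflow as tf

class BoundaryPredicates(tf.keras.layers.Layer):
    """k trainable lower/upper boundary predicates for one continuous feature.

    F_gt_i(x) = sigmoid(c * (x - l_i))   (x is "greater than" boundary l_i)
    F_lt_i(x) = sigmoid(-c * (x - u_i))  (x is "less than" boundary u_i)
    """
    def __init__(self, k=6, c=20.0):
        super().__init__()
        self.c = c
        # boundary values are trainable weights, initialized uniformly in [0, 1]
        self.l = tf.Variable(tf.random.uniform([k]), name="lower_boundaries")
        self.u = tf.Variable(tf.random.uniform([k]), name="upper_boundaries")

    def call(self, x):
        # x: [batch, 1] continuous feature; output: [batch, 2k] fuzzy truth values
        gt = tf.sigmoid(self.c * (x - self.l))   # lower-boundary predicates
        lt = tf.sigmoid(-self.c * (x - self.u))  # upper-boundary predicates
        return tf.concat([gt, lt], axis=-1)

feature = tf.constant([[0.3], [0.8]])
print(BoundaryPredicates(k=6)(feature).shape)  # (2, 12)

In a full pipeline, the resulting fuzzy truth values would be fed, together with any discrete background predicates, into the dNL-DNF function defined for each class.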

4.4 Dream4 Challenge Experiment (Handling Uncertain Data)

Inferring the causal relationships among different genes is one of the important problems in biology. In this experiment, we study the application of dNL-ILP to inferring the structure of gene regulatory networks using the 10-gene time-series dataset from the DREAM4 challenge tasks [72]. In the 10-gene challenge, the data consists of 5 different biological systems, each composed of 10 genes. For each system, a time-series containing 105 noisy readings of the gene expressions (in the range [0, 1]) is provided. The time-series is obtained via 5 different experiments, each created using a simulated perturbation in a subset of genes and recording the gene expressions over time. To tackle this problem using the dNL-ILP framework, we simply assume that each gene can be in one of two states: on (excited or perturbed) or off. The key idea here is to model each gene's state using two different approaches and then aim to get a consensus on the gene's state using these two different probes. To accomplish this, for each gene G_i we define the predicate off_i, which evaluates the state of G_i using the corresponding continuous values. We also use the predicate inf_off_i, which takes the states of all the predicates off_j (j ≠ i) to infer the state of G_i. To ensure that for each background knowledge instance inf_off_i stays close to off_i, we define another auxiliary predicate aux_i with predicate function F_aux_i = 1 − |inf_off_i − off_i|.

Since the states of the genes are uncertain, we use a probabilistic approach and assume that each gene state is conditionally distributed according to a Gaussian mixture (with 4 components), i.e., x|off ∼ GMM_off and x|on ∼ GMM_on. As such, we design F_inf_off_i such that it returns the probability of the gene G_i being off, and let all the parameters of the mixture models be trainable weights. For F_inf_off_i we use a dNL-CNF network with only one term. Each data point in the time series corresponds to one background knowledge instance and consists of the continuous value of each gene expression. For each gene, we assign the 5 percent of data points with the lowest and highest absolute distance from the mean as positive and negative examples for the predicate off_i, respectively. We interpret the values of the membership weights in the trained dNL-CNF networks used in F_inf_off_i as the degree of connection between two genes. Table 4.5 compares the performance of dNL-ILP to two state-of-the-art algorithms, NARROMI [73] and MICRAT [74], for the 10-gene classification tasks of the DREAM4 dataset.

Table 4.5: DREAM4 challenge scores

Metric      NARROMI   MICRAT   dNL-ILP
Accuracy    0.82      0.87     0.86
F-Score     0.35      0.32     0.36
MCC         0.24      0.33     0.35
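As a rough illustration of the probabilistic probe described above, the sketch below computes a soft "off" probability for a single gene expression under two trainable 4-component Gaussian mixtures (TensorFlow 2 assumed; the class name, the equal class priors, and the exact parameterization are our own simplifications, not the dissertation's exact design).

import tensorflow as tf

class GeneStateGMM(tf.keras.layers.Layer):
    """Soft 'off' probability of a gene expression x under two trainable
    4-component Gaussian mixtures, x|off ~ GMM_off and x|on ~ GMM_on."""
    def __init__(self, num_components=4):
        super().__init__()
        shape = [2, num_components]                 # row 0: off, row 1: on
        self.logits = tf.Variable(tf.zeros(shape))  # mixing proportions (via softmax)
        self.mu = tf.Variable(tf.random.uniform(shape))
        self.log_sigma = tf.Variable(tf.zeros(shape) - 1.0)

    def call(self, x):
        # x: [batch, 1] gene expression in [0, 1]
        x = tf.expand_dims(x, axis=1)                          # [batch, 1, 1]
        pi = tf.nn.softmax(self.logits, axis=-1)               # [2, K]
        sigma = tf.exp(self.log_sigma)
        comp = tf.exp(-0.5 * ((x - self.mu) / sigma) ** 2) / (sigma * 2.5066)
        lik = tf.reduce_sum(pi * comp, axis=-1)                # [batch, 2]
        # assuming equal priors on off/on, Bayes' rule gives P(off | x)
        return lik[:, 0] / (lik[:, 0] + lik[:, 1] + 1e-8)

print(GeneStateGMM()(tf.constant([[0.1], [0.9]])))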

4.5 Comparing dNL-ILP to the Past Works

Addressing all important past contributions in ILP is a tall order, and given the limited space we will only focus on a few recent approaches that are in some way relevant to our work. Among the ILP solvers that are capable of learning recursive predicates (in an explicit and symbolic manner), the most notable examples are Metagol [12] and dILP [27]. Metagol is a powerful method that is capable of learning very complex tasks via user-provided meta-rules. The main issue with Metagol is that while it allows for some flexibility in terms of providing the meta-rules, it is not always clear how to define those meta formulas. In practice, unless the expert already has some knowledge regarding the form of the possible solution, it would be very difficult to use this method. dILP, on the other hand, is a neural ILP solver that formulates a differentiable version of ILP. However, because of the way it defines templates, dILP is limited to learning simple predicates with arity of at most two and with at most two atoms in each rule, and it cannot handle large-scale relational datasets [27]. CILP++ [22] is another notable neural ILP solver which also uses propositionalization. CILP++ is a very efficient algorithm and is capable of learning large-scale relational datasets. However, since this algorithm uses bottom clause propositionalization, it is not able to learn recursive predicates.

In dealing with uncertain data, and especially in tasks involving classification of relational datasets, the most notable frameworks are those based on probabilistic ILP (PILP) [19] and Markov Logic Networks (MLN) [32]. These types of algorithms extend the framework of ILP to handle uncertain data by introducing a probabilistic framework. Our proposed approach is related to PILP in that we also associate a real number to each atom and each rule in the formula. In practice, as we have shown in Section 4.2, our approach outperforms those ensemble techniques based on probabilistic ILP.

We have shown that for simple algorithmic tasks such as those in Section 3.7.1, dNL-ILP is able to learn the rules. However, compared to efficient and fast implementations such as Metagol, our approach is still relatively slow for these kinds of problems, and dNL-ILP usually needs a few seconds to solve most of them. The real advantage of our proposed approach is apparent when dealing with ambiguous data, as in the case of relational datasets, or when we need to learn a difficult recursive algorithmic task as in Section 3.7.3. More notably, the differentiable implementation of forward chaining makes it possible to combine dNL-ILP with deep learning models such as Convolutional Neural Networks (CNNs) to learn relational knowledge from complex visual scenes. In the next chapter we explore this important application of dNL-ILP by introducing a relational reinforcement learning framework which can directly learn the relational policy from images.

CHAPTER 5
RELATIONAL REINFORCEMENT LEARNING VIA DNL-ILP

5.1 Introduction

Relational Reinforcement Learning (RRL) was investigated in the early 2000s by works such as [34, 35, 36], among others. The main idea behind RRL is to describe the environment in terms of objects and relations. One of the first practical implementations of this idea was proposed in [35] and later improved in [36] based on a modification of the Q-learning algorithm [37] via the standard relational tree-learning algorithm TILDE [38]. As shown in [36], the RRL system allows for very natural and human-readable decision making and policy evaluation. More importantly, the use of variables in an ILP system makes it possible to learn generally formed policies and strategies. Since these policies and actions are not directly associated with any particular instance or entity, this approach leads to a generalization capability beyond what is possible in most typical RL systems. Generally speaking, the RRL framework offers several benefits over traditional RL:

• It allows for the incorporation of higher level concepts and prior background knowledge.

• The learned policy is usually human interpretable, and hence can be viewed, verified and even tweaked by an expert observer.

• The learned program can generalize better than the classical RL counterpart.

• Since the language for the state representation is chosen by the expert, it is possible to incorporate inductive biases into learning. This can be a significant improvement in complex problems, as it might be used to steer the agent toward certain actions without accessing the reward function. One such area is designing safer AI, where certain actions and states can be prohibited by the choice of state representation and the allowed actions.

In recent years, and with the advent of new deep learning techniques, significant progress has been made over the classical Q-learning RL framework. By using algorithms such as deep Q-learning and its variants [39, 40], as well as deep policy learning algorithms such as A2C and A3C [75], more complex problems are now being tackled. Recently, some new methods for extracting relational information from the output of Convolutional Neural Networks (CNNs) have been proposed [76, 77]. However, despite their attempts at learning relational information, technically they do not belong to the RRL framework since they do not employ any explicit relational language. While these techniques offer new methods for learning non-local information from the local features obtained by the CNNs, their internal representation is purely implicit (similar to other typical deep neural networks) and they cannot incorporate explicit relational biases. On the other hand, the classical RRL framework cannot be easily employed to handle the complex visual scenes that exist in recent RL problems. Since the standard RRL framework is usually not able to learn from complex visual scenes and cannot be easily combined with differentiable deep neural networks, none of the inherent benefits of RRL have been materialized in deep learning frameworks thus far. In the previous chapter, a novel differentiable ILP solver, dNL-ILP, was introduced. The key aspect of the dNL-ILP solver is a differentiable deduction engine that was designed based on a differentiable forward chaining algorithm. In this chapter, we employ this engine to design a new differentiable RRL framework. We show that this differentiable RRL framework can be used similarly to deep RL in an end-to-end learning paradigm, trainable via typical gradient optimizers. Further, in contrast to the early RRL frameworks, this framework is flexible and can learn from ambiguous and fuzzy information. Finally, it can be combined with deep learning techniques such as CNNs to extract relational information from visual scenes.

5.2 Relational Reinforcement Learning via dNL-ILP

Early works on RRL [36, 78] mainly relied on access to an explicit representation of states and actions in terms of a relational predicate language. In the most successful instances of these approaches, a regression tree algorithm is usually used in combination with a modified version of the Q-learning algorithm. The fundamental limitation of the traditional RRL approaches is that the employed ILP solvers are not differentiable. Therefore, those approaches are typically only applicable to problems for which the explicit relational representation of states and actions is provided. In this chapter, we establish that the differentiable dNL-ILP provides a platform to combine RRL with deep learning methods, constructing a new RRL framework with the best of both worlds. Although dNL-ILP can also be used to formulate RL algorithms such as deep Q-learning, we focus only on the deep policy gradient learning algorithm. This formulation is very desirable because it makes the learned policy interpretable by humans. One of the other advantages of using policy gradient in our RRL framework is that it enables us to restrict actions according to some rules obtained either from human preferences or from problem requirements. This in turn makes it possible to account for human preferences or to avoid certain pitfalls, e.g., as in safe AI. In our RRL framework, although we use the generic formulation of the policy gradient with the ability to learn a stochastic policy, certain key aspects are different from the traditional deep policy gradient methods, namely state representation, language bias and action representation. In the following, we will explain these concepts in the context of the BoxWorld game. In this game, the agent's task is to learn how to stack the boxes on top of each other (in a certain order). For illustration, consider the simplified version of the game as in Fig. 5.1, where there are only three boxes labeled a, b, and c. A box can be on top of another or on the floor. A box can be moved if it is not covered by another box, and it can be placed either on the floor or on top of another uncovered box. For this game, the environment state can be fully explained via the predicate on(X,Y). Fig. 5.1 shows the state representation of the scene before and after an action (indicated by the predicate move(c,b)). In the following we discuss each distinct element of the proposed framework using the BoxWorld environment. Fig. 5.2 displays the overall design of our proposed RRL framework.

Figure 5.1: State representation in the form of predicates in the BoxWorld game, before and after an action

5.2.1 State Representation

In the previous approaches to RRL [35, 36, 79], the state of the environment is expressed in an explicit relational format in the form of predicate logic. This significantly limits the applicability of RRL in complex environments where such representations are not available. Our goal in this section is to develop a method in which the explicit representation of states can be learned via typical deep learning techniques in a form that will support policy learning via our differentiable dNL-ILP. As a result, we can utilize the various benefits of the RRL discipline without being restricted only to environments with explicitly represented states. For example, consider the BoxWorld environment explained earlier, where the predicate on(X,Y) is used to represent the state explicitly in relational form (as shown in Fig. 5.1). Past works in RRL relied on access to an explicit relational representation of states, i.e., all the groundings of the state representation predicates. Since this example has 4 constants, i.e., C = {a,b,c,floor}, these groundings would be the binary values ('true' or 'false') for the atoms on(a,a), on(a,b), on(a,c), on(a,floor), ..., on(floor,floor). In recent years, extracting relational information from visual scenes has been investigated. Fig. 5.4 shows two types of relational representation extracted from images in [41].

Figure 5.2: Learning explicit relational information from images in our proposed RRL; images are processed to obtain an explicit representation and the dNL-ILP engine learns and expresses the desired policy (actions)

Figure 5.3: Transforming low-level state representation to high-level form via auxiliary predicates

The idea is to first process the images through multiple CNN networks. The last layer of the convolutional network chain is treated as the feature vector and is usually augmented with some non-local information such as the absolute position of each point in the final layer of the CNN network. This feature map is then fed into a relational learning unit which is tasked with extracting non-local features. Various techniques have recently been introduced for learning this non-local information from the local feature maps, namely self-attention network models [76, 41] as well as graph networks [80, 81]. Unfortunately, none of the resulting representations from past works is in the form of predicates needed in ILP. In our approach, we use similar networks to those discussed earlier to extract non-local information. However, given the relational nature of state representation in our RRL model, we consider three strategies in order to facilitate learning the desired relational state from images, namely:

1. Finding a suitable state representation: In our BoxWorld example, we used the predicate on(X,Y) to represent the state of the environment. However, learning this predicate requires inferring relations among various objects in the scene. As shown by previous works (e.g., [41]), this is a difficult task even in the context of a fully supervised setting (i.e., all the labels are provided), which is not applicable here. Alternatively, we propose to use lower-level relations for state representation and build higher-level representations via the predicate language. In the game of BoxWorld, as an example, we can describe states by the respective position of each box instead of the relative location of boxes via the predicate on. In particular, we define two predicates posH(X,Y) and posV(X,Y) such that variable X is associated with the individual box, whereas Y indicates the horizontal or vertical coordinate of the box, respectively. Fig. 5.3 shows how these new lower-level representations can be transformed into the higher-level description by the appropriate predicate language:

on(X, Y) ← posH(X, Z), posH(Y, T), inc(T, Z), sameH(X, Y)

sameH(X, Y) ← posH(X, Z), posH(Y, Z)    (5.1)

Figure 5.4: Extracting relational information from visual scenes [41]; (a) a sample from the CLEVR dataset, (b) a sample from the Sort-of-CLEVR dataset

2. State constraints: When applicable, we may incorporate relational constraints in the form of a penalty term in the loss function. For example, in our BoxWorld example we know that posV(floor, 0) should always be true. In general, the choice of relational language makes it possible to pose constraints based on our knowledge regarding the scene. Enforcing these constraints does not necessarily speed up the learning, as we will show in the BoxWorld experiment in Section 5.3.1. However, it will ensure that the (learned) state representation, and consequently the learned relational policy, resemble our desired structure of the problem.

3. Semi-supervised setting: While it is not desirable to label every single scene that may occur during learning, in most cases it is possible to provide a few labeled scenes to help the model learn the desired state representation faster. These reference points can then be incorporated into the loss function to encourage the network to learn a representation that matches the provided labeled scenes. We have used a similar approach in the Asterix experiment (see Appendix E) to significantly increase the speed of learning. A minimal sketch of the penalty terms described in items 2 and 3 is given after this list.
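The following is a minimal sketch of the two auxiliary loss terms mentioned in items 2 and 3, assuming the groundings of posV(X,Y) are available as a matrix of fuzzy values (TensorFlow 2; the function name, the weights, and the squared-error form of the constraint are illustrative assumptions).

import tensorflow as tf

def state_representation_penalties(pos_v, labeled_idx=None, labeled_pos_v=None,
                                   floor_idx=0, w_constraint=1.0, w_label=1.0):
    """Auxiliary loss terms added to the policy objective.

    pos_v:        [num_constants, num_grid_rows] predicted groundings of posV(X,Y)
                  (softmax over Y for every constant X).
    labeled_idx / labeled_pos_v: optional hand-labeled scenes (semi-supervision).
    """
    # 1) state constraint: posV(floor, 0) should always be true
    constraint = w_constraint * (1.0 - pos_v[floor_idx, 0]) ** 2

    # 2) semi-supervised term: match the few provided reference labels
    label_loss = 0.0
    if labeled_idx is not None:
        pred = tf.gather(pos_v, labeled_idx)
        label_loss = w_label * tf.reduce_mean(
            tf.keras.losses.categorical_crossentropy(labeled_pos_v, pred))
    return constraint + label_loss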

5.2.2 Action Representation

We formulate the policy gradient in a form that allows the learning of the actions via one (or multiple) target predicates. These predicates exploit the background facts, the state representation predicates, as well as auxiliary predicates, to incorporate higher-level concepts. In typical Deep Policy Gradient (DPG) learning, the probability distribution of actions is usually learned by applying a multi-layer perceptron with a softmax activation function in the last layer. In our proposed RRL, the action probability distributions can usually be directly associated with the groundings of an appropriate predicate. For example, in the BoxWorld example in Fig. 5.1, we define a predicate move(A,B) and associate the actions of the agent with the groundings of this predicate. In an ideal case, where there is a deterministic solution to the RRL problem, the predicate move(A,B) may be learned in such a way that, at each state, only the grounding corresponding to the correct action evaluates to 1 ('true') and all the other groundings of this predicate become 0. In such a scenario, the agent will follow the learned logic deterministically. Alternatively, we may get more than one grounding with value equal to 1, or we may get fuzzy values in the range [0, 1]. In those cases, we estimate the probability distribution of actions, similar to standard deep policy learning, by applying a softmax function to the valuation vector of the learned predicate move (i.e., the values of move(X,Y) for X, Y ∈ {a,b,c,floor}).
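As a small illustration (plain NumPy; the function name and the scaling constant are assumptions consistent with the description above, not the released code), the groundings of move(X,Y) can be flattened and converted into an action distribution as follows.

import numpy as np

def action_distribution(move_valuations, c=10.0):
    """Turn the fuzzy groundings of move(X,Y) into an action distribution.

    move_valuations: [num_constants, num_constants] values in [0, 1], the
    valuation of move(X,Y) produced by the dNL-ILP forward chaining step.
    """
    logits = c * move_valuations.reshape(-1)      # scale to sharpen the softmax
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# e.g. 4 constants {a, b, c, floor}: 16 candidate actions
vals = np.random.uniform(size=(4, 4))
print(action_distribution(vals).shape)  # (16,)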

5.3 Experiments

In this section we explore the features of the proposed RRL framework via several experiments. We have implemented¹ the models using TensorFlow [64].

5.3.1 BoxWorld Experiment

The BoxWorld environment has been widely used as a benchmark in past RRL systems [36, 78, 79]. In these systems the state of the environment is usually given as explicit relational data via groundings of the predicate on(X,Y). While ILP-based systems are usually able to solve variations of this environment, they rely on the explicit representation of states and cannot infer the state from an image. Here, we consider the task of stacking boxes on top of each other. We increase the difficulty of the problem compared to the previous examples [36, 78, 79] by considering the order of boxes and requiring that the stack is formed on top of the blue box (the blue box should be on the floor). To make sure the models learn to generalize, we randomly place the boxes on the floor at the beginning of each episode. We consider up to 5 boxes. Hence, the scene constants in our ILP setup are the set {a,b,c,d,e,floor}. The dimension of the observation images is 64 × 64 × 3 and no explicit relational information is available to the agents. The action space for the problem involving n boxes is (n + 1) × (n + 1), corresponding to all possibilities of moving a box (or the floor) on top of another box or the floor. Obviously some of the actions are not permitted, e.g., placing the floor on top of a box or moving a box that is already covered by another box.

¹The Python implementation of the algorithms in this chapter is available at https://github.com/dnlRRL2020/RRL

Comparing to Baseline: In the first experiment, we compare the performance of the proposed RRL technique to a baseline. For the baseline we consider standard deep A2C (with up to 10 agents) and we use the implementation in the stable-baselines library [82]. We considered both MLP and CNN policies for the deep RL baseline, but we report the results for the CNN policy because of its superior performance. For the proposed RRL system, we use two convolutional layers with a kernel size of 3 and strides of 2 with tanh activation functions. We apply two layers of MLP with softmax activation functions to learn the groundings of the predicates posH(X,Y) and posV(X,Y). Our presumed grid is (n + 1) × (n + 1) and we allow for positional constants {0,1,...,n} to represent the locations in the grid in our ILP setting. As a constraint, we add penalty terms to make sure posV(floor,0) is true. We use vanilla policy gradient learning and, to generate actions, we define a learnable hypothesis predicate move(X,Y). Since we have n + 1 box constants (including floor), the groundings of this hypothesis correspond to the (n + 1) × (n + 1) possible actions. Since the values of these groundings in dNL-ILP are between 0 and 1, we generate softmax logits by multiplying these outputs by a large constant c (e.g., c = 10). For the target predicate move(X,Y), we allow for 6 rules in learning (corresponding to a dNL-DNF function with 6 disjunctions). The complete list of auxiliary predicates and the parameters and weights used in the two models is given in Appendix B. As indicated in Fig. 5.3 and defined in (5.1), we introduce the predicate on(X,Y) as a function of the low-level state representation predicates posV(X,Y) and posH(X,Y). We also introduce higher-level concepts using these predicates to define aboveness (i.e., above(X,Y)) as well as isCovered(X,Y). Fig. 5.5 compares the average success per episode for the two models for the two cases of n = 4 and n = 5. The results show that for the case of n = 4, both models are able to learn a successful policy after around 7000 episodes. For the more difficult case of n = 5, our proposed approach converges after around 20K episodes, whereas it takes more than 130K episodes for the A2C approach to converge, and even then it fluctuates and does not always succeed.

Effect of background knowledge: Contrary to the standard deep RL, in an RRL approach, we can introduce our prior knowledge into the problem via the powerful predicate language. By defining the structure of the problem via ILP, we can explicitly introduce inductive biases [83] which would restrict the possible form of the solution. We can speed up the learning process or shape the possible learnable actions even further by incorporating background knowledge. To examine the impact of the background knowledge on the speed of learning, we consider three cases for the BoxWorld problem involving n = 4 boxes. The baseline model (RRL1) is as described before. In RRL2, we add another auxiliary predicate which defines the movable states as:

movable(X, Y) ← ¬isCovered(X), ¬isCovered(Y), ¬same(X, Y), ¬isfloor(X), ¬on(X, Y)

where ¬ indicates the negation of a term. In the third model (RRL3), we go one step further and force the target predicate move(X,Y) to incorporate the predicate movable(X,Y) in each of the conjunction terms. Fig. 5.6 compares the learning performance of these models in terms of the average success rate (in [0,1]) versus the number of episodes.

Figure 5.5: Comparing deep A2C and the proposed model on the BoxWorld task

Figure 5.6: Effect of background knowledge on learning BoxWorld

Interpretability: In the previous experiments, we did not consider the interpretability of the learned hypothesis. Since all the weights are fuzzy values, even though the learned hypothesis is still a parameterized symbolic function, it does not necessarily represent a valid Boolean formula. To achieve an interpretable result we add a small penalty as described in (3.15). We also add a few more state constraints to make sure the learned representation follows our presumed grid notation. The learned action predicate is found as:

move(X, Y) ← movable(X, Y), ¬lower(X, Y)

move(X, Y) ← movable(X, Y), isBlue(Y)

lower(X, Y) ← posV(X, Z), posV(Y, T), lessthan(Z, T)

Details of the experiment are given in Appendix B.

5.3.2 GridWorld Experiment

We use the GridWorld environment introduced in [77] for this experiment. This environment consists of a 12 × 12 grid with keys and boxes randomly scattered. It also has an agent, represented by a single dark gray square. The boxes are represented by two adjacent colored squares. The square on the right represents the box's lock, whose color indicates which key can be used to open that lock. The square on the left indicates the content of the box, which is inaccessible while the box is locked. The agent must collect the key before accessing the box. When the agent has a key and walks over a lock with the same color as its key, it can open the lock, and it must then enter the left box to acquire the new key inside. The agent cannot get the new key prior to successfully opening the lock on the right side of the key box. The goal is for the agent to open the gem box, colored white. We consider two difficulty levels. In the simple scenario, there is no (dead-end) branch. In the more difficult version, there can be one dead-end branch. An example of the environment and the branching scenarios is depicted in Fig. 5.7.

Figure 5.7: GridWorld environment [77]

This is a very difficult task involving complex reasoning. Indeed, in the original work it was shown that a multi-agent A3C combined with a non-local learning attention model could only start to learn after processing 5 × 10^8 episodes. To make this problem easier to tackle, we modify the action space to be the location of any point in the grid instead of directional actions. Given this definition of the problem, the agent's task is to give the location of the next move inside the rectangular grid. Hence, the dimension of the action space is 144 = 12 × 12. For this environment, we define the predicates color(X,Y,C), where X, Y ∈ {1,..., 12}, C ∈ {1,..., 10}, and hasKey(C) to represent the state. Here, variables X, Y denote the coordinates, and the variable C is for the color. Similar to the BoxWorld game, we included a few auxiliary predicates such as isBackground(X,Y), isAgent(X,Y) and isGem(X,Y) as part of the background knowledge. The representational power of ILP allows us to incorporate our prior knowledge about the problem into the model. As such, we can include some higher-level auxiliary helper predicates such as:

isItem(X, Y) ← ¬isBackground(X, Y), ¬isAgent(X, Y)

locked(X, Y) ← isItem(X, Y), isItem(X, Z), inc(Y, Z)

where the predicate inc(X,Y) defines increments for integers (i.e., inc(n, n+1) is true for every integer n). The list of all auxiliary predicates used in this experiment, as well as the parameters of the neural networks, is given in Appendix C. Similar to previous experiments, we consider two models: an A2C agent as the baseline and our proposed RRL model using the ILP language described in Appendix C. We list the number

Table 5.1: Number of training episodes required for convergence

Model          Without Branch   With Branch
Proposed RRL   700              4500
A2C            > 10^8           > 10^8

of episodes it takes to converge in each setting in Table 5.1. As the results suggest, the proposed approach can learn the solution in both settings very fast. On the contrary, the standard deep A2C was not able to converge after 10^8 episodes. This example restates the fact that incorporating our prior knowledge regarding the problem can significantly speed up the learning process. Further, similar to the BoxWorld experiment, we study the importance of our background knowledge in the learning. In the first task (RRL1), we evaluate our model on the non-branching task by enforcing the action to include the isItem(X,Y) predicate. In RRL2, we do not enforce this. As shown in Fig. 5.8, the RRL1 model learns 4 times faster than RRL2. Arguably, this is because enforcing the inclusion of isItem(X,Y) in the action hypothesis reduces the possibility of exploring irrelevant moves (i.e., moving to a location without any item). Details of the experiment are given in Appendix C.

Figure 5.8: Effect of background knowledge on learning GridWorld

5.3.3 Relational Reasoning

Combining dNL-ILP with standard deep learning techniques is not limited to the RRL setting. In fact, the same approach can be used in other areas in which we wish to reason about the relations of objects. To showcase this, we consider the relational reasoning task involving the Sort-of-CLEVR [41] dataset. This dataset (see Fig. 5.4b) consists of 2D images of colored objects. The shape of each object is either a rectangle or a circle, and each image contains up to 6 objects. The questions are hard-coded as fixed-length binary strings. Questions are either non-relational (e.g., "what is the color of the green object?") or relational (e.g., "what is the shape of the nearest object to the green object?"). In [41], the authors combined a CNN-generated feature map with a special type of attention-based non-local network in order to solve the problem. We use the same CNN network and, similar to the GridWorld experiment, we learn the state representation using the predicate color(X,Y,C) (the color of each cell in the grid) as well as isCircle(X,Y), which learns whether the shape of an object is a circle or not. Our proposed approach reaches an accuracy of 99% on this dataset, compared to 94% for the non-local approach presented in [41]. The details of the model and the list of predicates in our ILP implementation are given in Appendix D.

5.3.4 Asterix Experiment

In the Asterix game (an Atari 2600 game), the player controls a unit called Asterix to collect as many objects as possible while avoiding deadly predators (see Fig. 5.9). Even though it is a rather simple game, many standard RL algorithms score significantly below the human level. For example, as reported in [84], DQN and A3C achieve scores of 3170 and 22140 on average, respectively, even after processing hundreds of millions of frames. Here, our goal is not to outperform the current approaches. Instead, we demonstrate that by using a very simple language for describing the problem, an agent can achieve scores in the range of 30K-40K with only a few thousand training scenes. For this game we consider the active part of the scene as an 8 × 12 grid. For simplicity, we consider only 4 types of objects: the agent, left-to-right moving predators, right-to-left moving predators, and food objects. The dimension of the input image is 128 × 144. We use 4 convolutional layers with strides of [(2,3),(2,1),(2,2),(2,2)] and a kernel size of 5 to generate a feature map of size 8 × 12 × 48. By applying 4 fully connected layers with softmax activation functions we learn the groundings of the predicates corresponding to the four types of objects, i.e., O1(X,Y), ..., O4(X,Y). The complete list of predicates for this experiment is given in Appendix E. We learn 5 nullary predicates corresponding to the 5 move directions (i.e., no move, left, right, up, down) and use the same policy gradient learning approach as before. The notable auxiliary predicates in the chosen language are the 4 helper predicates that define bad moves. For example, badMoveUp() states that an upward move is bad when there is a predator in the close neighborhood of the agent and in the top row. Similarly, badMoveLeft() states that when a predator is approaching from the left side of the agent, it is a bad idea to move left.

Figure 5.9: Asterix game environment example
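The frame-to-grid state extraction described above can be sketched as follows (TensorFlow 2). The strides, kernel size, and the 8 × 12 × 48 feature map follow the text; the extra "empty cell" class and the single per-cell softmax head are our own simplifying assumptions rather than the exact architecture used in the experiments.

import tensorflow as tf

def build_state_extractor(num_object_types=4):
    """CNN mapping a 128x144x3 frame to per-cell object-type groundings O1(X,Y)..O4(X,Y)."""
    inp = tf.keras.Input(shape=(128, 144, 3))
    x = inp
    for strides in [(2, 3), (2, 1), (2, 2), (2, 2)]:
        x = tf.keras.layers.Conv2D(48, kernel_size=5, strides=strides,
                                   padding="same", activation="relu")(x)
    # x now has spatial size 8x12 with 48 channels, matching the text
    num_classes = num_object_types + 1          # extra "empty cell" class (assumption)
    logits = tf.keras.layers.Dense(num_classes)(x)
    probs = tf.nn.softmax(logits, axis=-1)      # per-cell distribution over cell contents
    groundings = probs[..., :num_object_types]  # fuzzy groundings of O1(X,Y), ..., O4(X,Y)
    return tf.keras.Model(inp, groundings)

model = build_state_extractor()
print(model(tf.zeros([1, 128, 144, 3])).shape)  # (1, 8, 12, 4)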

However, given the complexity of the scene and the existence of some overlap between objects, learning the representation of the state is not as easy as in the previously explored experiments. To help the agent learn the presumed grid representation, we provide a set of labeled scenes (a semi-supervised approach) and we penalize the objective function using the loss calculated between these labels and the learned representation. Fig. 5.10 shows the learning curve for the two cases of using 20 and 50 randomly generated labeled scenes. In the case of 50 provided labels, the agent eventually learns to score around 30-40K per episode. Please note that we did not include all the possible objects that are encountered during the later stages of the game, and we use a simplistic representation and language just to demonstrate the application of the RRL framework in more complex scenarios.

Figure 5.10: Score during training in the Asterix experiment

CHAPTER 6
DECODING LDPC CODES OVER BINARY ERASURE CHANNELS VIA DNL

6.1 Introduction

It is known that Maximum Likelihood (ML) decoding over a BEC channel can be reduced to solving a set of linear equations, which in the general case can be done with complexity O(ν³) for ν erased bits. However, in most practical applications this approach is infeasible even for moderate-length codes. As a result, some form of iterative algorithm (based on a message passing algorithm) is often used to obtain a sub-optimal solution for decoding Low Density Parity Check (LDPC) codes. These iterative algorithms can perform very close to the optimal ML decoding bounds for sufficiently long LDPC codes. However, for short-length codes, they do not perform as well. Moreover, in real-time applications, it is desirable to decode the messages in batches to increase the speed. As such, Deep Neural Networks (DNNs), if successful in learning how to decode, can prove to be a viable alternative thanks to their structures suitable for parallel batch processing. In decoding, we have virtually unlimited training examples. Since learning a decoder is basically a supervised learning task, the application of DNNs in designing an efficient decoder may seem trivial at first. However, there exist several challenges. Even though we have as many training examples as we need, there are 2^m different possible codewords for a message of length m, which is prohibitive even for a rather short message (e.g., ∼ 100 bits). Thus, the deep network should be able to learn the decoding algorithm instead of learning just a non-linear approximation as in Multi-Layer Perceptron (MLP) layers. In most deep algorithm learning tasks, auto-regressive models such as Recurrent Neural Networks (RNNs) are deployed, which allow for recursion through intermediate results and consequently for learning some kind of algorithm that solves the problem in an iterative fashion. For example, in [42] the authors successfully employed RNNs to design a very efficient decoder for convolutional and turbo codes over additive white Gaussian noise (AWGN) channels, and in [43] the authors proposed a framework for soft decoding of linear codes of arbitrary length. However, as was shown in [85], typical neural network layers often fail to learn the exact logical flow in discrete (binary) arithmetic tasks. This is especially true for decoding LDPC codes over BEC. In this chapter, we study the application of dNL networks in learning a decoder for LDPC codes over BEC channels. The main idea is to formulate an iterative message passing scheme for decoding LDPC codes in which standard RNN cells are replaced by a multi-layer design built from dNL networks. At the end of this chapter we also discuss another possibility of using the dNL framework for decoding LDPC codes: we show how the maximum likelihood approach of solving the parity check equations can be directly formulated via the use of dNL, and in particular the differentiable XOR neurons.

6.2 Message Passing Decoding

In most practical LDPC decoders, a variant of an iterative algorithm is often used. These iterative algorithms are mainly based on the Message Passing (MP) algorithm. In MP, a unique message from each variable node is sent to every connected check node in the forward path. Likewise, in the backward path, every check node computes a proper response message which is sent back to each variable node it is connected to, hence the name message passing. Under certain simplifying assumptions, one can view the messages in the MP algorithm as the beliefs (i.e., estimated probabilities) of the model regarding the values of each particular variable node in the graph. Our objective is to employ deep neural networks to learn an iterative (recurrent) algorithm by exploiting the structure of the parity check matrix. While the model learns some form of message passing scheme by iterating between variable nodes and check nodes, we make no probabilistic assumptions and we let the network learn an efficient iterative algorithm. This allows (at least in theory) for learning a more complicated model with possible performance improvements over the belief propagation scheme.

Let H ∈ {0,1}^{r×n} be the parity check matrix of a length-n regular LDPC code with parameters (d_v, d_c), i.e., H has d_v and d_c non-zero elements in each column and row, respectively. We assign a vector V_i ∈ R^{dim_v}, i ∈ {1,...,n}, to each variable node and, similarly, a vector C_j ∈ R^{dim_c}, j ∈ {1,...,r}, to each check node. Corresponding to the i-th variable node at time stamp t, where t ∈ {1,...,t_max}, we define two sets of functions F_i^{(t)} : R^{d_c × dim_v} ↦ R^{dim_c} (from the variable node to the check node) and G_i^{(t)} : R^{d_v × dim_c} ↦ R^{dim_v} (from the check node to the variable node). Here, dim_v and dim_c denote the dimensions of the representations used in variable nodes and check nodes, respectively. In soft decoding using belief propagation, the erasure bits are usually represented by a real-valued scalar (0.5). However, we choose a one-hot representation of depth 3 for each variable node, which is more suitable for our model. In our representation, the three positions in the one-hot vector correspond to the values '0', '1', and 'e' (erasure), respectively. For example, if the bit value is '0', the vector is [1, 0, 0]. The dimension dim_c of the check-node vectors is a design parameter of the model which controls the amount of information that can be sent from each check node to the variable nodes. We formulate the recursive formula for each variable node v_i at time stamp t + 1 as:

v_i^{(t+1)} = G_i^{(t)}( Vec( C_i^{(t+1)} ) ),  where    (6.1)

C_i^{(t+1)} = { C_j^{(t+1)} | H(j, i) = 1, j ∈ {1, ..., r} }    (6.2)

also,

C_j^{(t+1)} = F_i^{(t)}( Vec( V_{i,j}^{(t)} ) ),  where    (6.3)

V_{i,j}^{(t)} = { V_{i'}^{(t)} | H(j, i') = 1, i' ∈ {1, ..., n} }    (6.4)

where Vec(S) generates a vector by concatenating all the vectors in the set S. Further, we can exclude the intrinsic information (the information that each node already possesses about

itself), as in the BP algorithm, i.e.,

V_{i,j}^{(t)} = { V_{i'}^{(t)} | H(j, i') = 1, i' ∈ {1, ..., n}, i' ≠ i }    (6.5)

However, our experiments later showed that although using extrinsic information slightly increases the speed of convergence, it has no significant effect on the model performance and the network can learn the desired algorithm regardless. We design the functions F_i^{(t)}(·)'s and G_i^{(t)}(·)'s as depicted in Fig. 6.1.

Figure 6.1: Logical functions used in forward/backward operations; (a) the function F from a variable node to a check node, (b) the function G from a check node to a variable node

For training, we define the loss function at time t as the average softmax cross-entropy between the estimated values of the variable nodes and the correct codeword vector, both represented as one-hot vectors of length n and depth 3, i.e.,

L^{(t)} = − (1/n) Σ_{i=1}^{n} Σ_{b=1}^{3} x_{ib} log(v'_{ib}),   where v'_i = softmax(v_i^{(t)})    (6.6)

We may consider the following options for designing the forward and backward function sets in (6.1)–(6.4):

• Option 1: Use the same two functions F (·) and G (·) for all the time stamps and variable nodes, i.e. sharing the weights across both dimensions.

• Option 2: Share functions across the time dimension but use different functions F_i(·)'s and G_i(·)'s for each variable node.

• Option 3: Do not share at all, i.e., choose different functions F_i^{(t)}(·)'s and G_i^{(t)}(·)'s for different time stamps and different variable nodes.

Although Option 1 may seem to be the most rigid and inflexible of all, our experimental results surprisingly show that its convergence speed is the fastest among the three options. While it requires a significantly smaller number of trainable parameters, it offers the best overall performance, especially for the case of regular LDPC codes. For irregular LDPC codes, Option 2 provides more flexibility over the first option and performs better. Sharing weights across the time dimension facilitates learning a repetitive algorithmic task. In particular, instead of training the model using a large number of iterations (time stamps), we can train using only a few iterations and let the model learn a recursive algorithm. Consequently, both Options 1 and 2 provide good generalization. Moreover, this allows us to limit the number of iterations (to increase the training speed and reduce complexity) in the training phase while running the model for many more iterations at test time to obtain much improved results. On the other hand, although Option 3 performs well when tested for the exact number of training iterations, its accuracy cannot be improved by iterating beyond the number of training iterations because the weights are not shared across time. As such, Option 3 performs relatively poorly.

When the number of iterations in the training phase is small (i.e., t_max < 3), we can train the model by minimizing L^{(t_max)}. However, when the number of iterations is large, this approach usually leads to no convergence or very slow convergence, especially at the beginning of the training. To cope with this problem, in the beginning of training we include the loss functions from all the time stamps but then gradually remove the effect of the earlier time stamps in the loss function:

L = Σ_{t=1}^{t_max} α^{(t_max − t)} L^{(t)}    (6.7)

where the value of α is reduced gradually from 1 to 0 as we progress in the training.
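A compact sketch of the unrolled decoder and the annealed loss of (6.7) is given below (TensorFlow 2, with the parity check matrix H as a binary NumPy array). Generic callables F and G stand in for the dNL-based functions of Fig. 6.1 and Option 1 weight sharing is assumed; the exact dNL layer construction of the released implementation is not reproduced here.

import tensorflow as tf

def unrolled_decoder(v, H, F, G, t_max):
    """One forward pass of the iterative decoder (Option 1: F and G shared).

    v: [batch, n, 3] one-hot variable-node states ('0', '1', 'e').
    H: [r, n] binary parity-check matrix (NumPy array).
    F: callable mapping concatenated neighbor vectors to a dim_c vector.
    G: callable mapping concatenated neighbor vectors back to a depth-3 vector,
       so the same F can be reused at every iteration.
    """
    r, n = H.shape
    outputs = []
    V = v
    for _ in range(t_max):
        # check-node update: each check node sees its d_c neighboring variable nodes
        C = tf.stack([F(tf.concat([V[:, i] for i in range(n) if H[j, i]], -1))
                      for j in range(r)], axis=1)
        # variable-node update: each variable node sees its d_v neighboring check nodes
        V = tf.stack([G(tf.concat([C[:, j] for j in range(r) if H[j, i]], -1))
                      for i in range(n)], axis=1)
        outputs.append(V)
    return outputs

def annealed_loss(outputs, x_onehot, alpha):
    """Loss of (6.7): weighted sum of per-iteration cross-entropies; alpha is
    annealed from 1 to 0 during training so that eventually only the final
    iteration contributes."""
    t_max = len(outputs)
    losses = [tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=x_onehot, logits=v))
        for v in outputs]
    return tf.add_n([alpha ** (t_max - 1 - t) * losses[t] for t in range(t_max)])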

6.3 Experiments

We use Gallager's algorithm [86] for creating the random parity check matrices used in our experiments. In order to avoid over-fitting, instead of repeating the training samples, for each training batch we create a new set of randomly generated codewords in which each bit is erased independently with probability ε. In all the experiments we use a batch size of 100 codewords and we train the model using the ADAM [63] optimizer with a learning rate of 0.003. The dimensions of the XOR and AND (conjunction) layers in designing the F_i^{(t)}'s and G_i^{(t)}'s are set to 100 and we use dim_c = 30.
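A minimal sketch of the batch generation used for training is shown below (NumPy). For brevity it samples from a pre-generated pool of valid codewords rather than drawing fresh random codewords from the code at every step, and the depth-3 one-hot encoding follows the representation described in Section 6.2.

import numpy as np

def make_batch(codewords, eps, batch_size=100, rng=np.random.default_rng()):
    """Pick random codewords and erase each bit independently with probability eps.

    codewords: [num_codewords, n] integer array with entries in {0, 1}.
    Returns inputs (with erasures) and targets, both one-hot of depth 3 ('0','1','e').
    """
    idx = rng.integers(0, len(codewords), size=batch_size)
    x = codewords[idx]                       # [batch, n] in {0, 1}
    erased = rng.random(x.shape) < eps
    sym = np.where(erased, 2, x)             # 2 encodes the erasure symbol 'e'
    inputs = np.eye(3)[sym]                  # [batch, n, 3]
    targets = np.eye(3)[x]                   # [batch, n, 3]
    return inputs.astype(np.float32), targets.astype(np.float32)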

6.3.1 Performance

Fig. 6.2 depicts the decoding performance in terms of the average number of decoding errors per bit (BER) and the average number of block (frame) errors (FER) for a half-rate regular (3, 6) LDPC code of length 48, and compares it to the results of the belief propagation (BP) algorithm. Here, we train the model (sharing all functions according to Option 1) using only 3 iterations (i.e., t_max = 4). After the training finished (i.e., the model converged), we test the model using 10 iterations for a total of 100k randomly selected test codewords. For the BP algorithm, we evaluate the performance using 50 iterations. As is apparent from Fig. 6.2, our proposed model outperforms the BP algorithm for the (3,6) regular LDPC code. While the results are shown for a typical randomly generated code, we obtained the same level of performance improvement for all the test cases. As we increased the code length from 48 to values above 132, we noticed that while our model achieves its optimal results using significantly fewer iterations than BP, the performance gap between the two methods shrinks for longer codes.

6.3.2 Generalizations

In most approaches based on deep neural networks, while the models are usually able to outperform the classical approaches, the generalization is limited and the models must be re-trained for each particular setting. In contrast, our proposed decoder model, using a recurrent neural logic layer scheme, is able to generalize very well under various settings and situations for which it has not been trained. To demonstrate this, we randomly generate three half-rate codes with parity check matrices H1 and H2 of code length 48 and H3 of code length 132. We trained the model using only H1 for erasure probability ε = 0.2. Then, we tested the model's generalization. Specifically, our model can generalize well along 3 dimensions:

1. Erasure Probability: We trained the model using only one specific erasure probability and evaluated its performance under different erasure probability settings. As is apparent from the first row of Table 6.1, the decoder performance when tested for ε = 0.3 does not depend on whether the model has been trained on ε = 0.3 or ε = 0.2.

2. Parity Check Matrix: More importantly, our model is not dependent on the specific code that is used in the training, as long as we are using codes with similar (d_v, d_c) parameters. As depicted in the second row of Table 6.1, the model that has been trained for H1 still provides very good performance when it is used for evaluating the code associated with H2.

Figure 6.2: Performance comparison between the neural logic decoder and BP; (a) convergence, (b) bit error, (c) block error

3. Code Length: Our model is also not tied to a particular code length; we can train the model using a smaller code and it generalizes well to larger code lengths if the other parameters of the code remain the same. Since the dimensions of the learnable functions only depend on the node degrees (i.e., (d_v, d_c)) of the LDPC code, the same model can be used for different code lengths. In particular, as the third row of Table 6.1 shows, the performance of the model trained on a (48, 24) LDPC code when tested on a (132, 66) code is only slightly worse than the optimal performance of a model trained specifically for the latter code.

Table 6.1: Evaluating the generalization performance of the model

Train Setting   Test Setting   Test Generalization    Test Optimal
                               Performance (BER)      Performance (BER)
H1, ε = 0.2     H1, ε = 0.3    0.01480                0.01476
H1, ε = 0.2     H2, ε = 0.3    0.00891                0.00832
H1, ε = 0.2     H3, ε = 0.3    0.0014                 0.0013

6.3.3 dNL vs MLP

To compare the performance of MLP and dNL, we design the feed-forward functions F and G first using a three-layer MLP architecture (LDPC-MLP) and, for the second model, using the dNL architecture (LDPC-dNL). Table 6.2 summarizes the layer dimensions and the type/activation functions used in each model. We tried various settings, dimensions and numbers of layers, and picked comparable settings in terms of the number of weights used in each model (see Table 6.2). In both models, we generated random codewords for a regular LDPC(3,6) code of length 48 for each training epoch and set the number of message passing iterations to 2 (t_max = 2). For the test data, we ran the trained model for many more iterations to see how well each model has generalized and learned the iterative algorithm. We used an erasure probability of ε = 0.2 in training and we tested the models for different erasure probabilities. Fig. 6.3 depicts the performance of the two models in terms of bit error probability (BER) for ε = 0.4. As one may expect, the model based on MLP performs better and generates a lower BER for t = 2, since the models were trained using the same setup and MLP is in general more flexible and can reduce the cost function more efficiently. However, increasing the number of iterations at test time not only does not improve the accuracy of the MLP-based model, it even degrades the performance for t > 3. On the other hand, the model based on dNL performs only moderately for small t; however, as the number of iterations increases, the performance improves significantly, as one might expect from a message passing algorithm. In other words, the rigid structure of the dNL architecture, while it may seem limiting on the surface, allows for learning discrete algorithms much more efficiently and generalizes better.

Table 6.2: Design parameters for functions F and G

Function   Dimensions      Type/Activation
MLP: F     100, 30         relu, sigmoid
MLP: G     100, 100, 60    relu, relu, sigmoid
dNL: F     100, 30         XOR, CONJ
dNL: G     200, 100, 6     XOR, CONJ, DISJ

6.4 ML Decoding of LDPC Codes over BEC via dNL

It can be shown that decoding LDPC codes over erasure channels such as the BEC can be reduced to solving a set of linear equations in GF(2). For a received codeword of length n with ν erased bits, this problem can be solved by methods such as Gauss-Jordan elimination using GF(2) arithmetic. Let us consider a regular (n, k) LDPC code with parameters (d_v, d_c). The set of k parity check equations for a received codeword x can be expressed as:

x_{i_1} ⊕ x_{i_2} ⊕ ··· ⊕ x_{i_{d_c}} = 0,  for i = 1, ..., k,  where x_{i_j} ∈ {x_1, ..., x_n}    (6.8)

Figure 6.3: Performance comparison between the dNL and MLP decoders for iteratively decoding LDPC codes over the BEC channel

Authors in [87], among others, have shown that a set of nonlinear equations can be effectively solved using neural networks by the method of gradient descent. However, we cannot directly use this method for solving equation (6.8). In Section 1.1, we showed that by introducing a set of mask vectors comprised of {−1, 1} elements, we can transform the parity check equation from GF(2) into a real domain equation of degree d_c. More precisely, we showed that for a vector y of length l = 2^p, we can define p mask vectors {M_1, ..., M_p} such that:

M_1 = {M_{11}, ..., M_{1l}},  where M_{1i} = 1 if i ≤ l/2 and M_{1i} = −1 otherwise    (6.9)

M_j = M_{j−1} ≪ 1,  for j ∈ {2, 3, ..., p}    (6.10)

where ≪ 1 denotes a one-element left circular shift operation. As such, the GF(2) equation y_1 ⊕ y_2 ⊕ ··· ⊕ y_l = 0 is translated to the real domain as:

∏_{i=1}^{p} (M_i · y) = 0    (6.11)

By combining the results of (6.11) and (6.8), we have now shown that the parity check equations can be transformed into k equations of degree d_v in the real domain. For any received codeword c of length n with ν erased symbols, we also have a set of n − ν equations of the form x_i − c_i = 0. Let V_j be any of the k + n − ν described equations; similar to [87], we can define a cost function of the form:

L = Σ_{j=1}^{k+n−ν} (1/2) V_j^2    (6.12)

and solve this set of erasure equations iteratively by methods such as gradient descent. Further, because we know that the solution for any of the erased bits must be either 0 or 1, we can add a penalty term that encourages the weight vectors to be close to 0 or 1 in order to avoid unwanted solutions to the nonlinear set of equations. To compare the performance of the proposed decoder (dNL-XOR decoder) to the standard BP algorithm, once again we use a (3,6) regular LDPC code of length 48. As shown in Fig. 6.4, the proposed dNL-XOR decoder outperforms the BP algorithm, achieving error rates roughly two to three times lower than those of the BP decoder.

(a) Bit Error

(b) Block Error

Figure 6.4: Performance comparison between the dNL-XOR decoder and BP
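The sketch below illustrates the mask construction of (6.9)-(6.10) and the cost of (6.12) in plain NumPy, assuming for simplicity that each parity check involves l = 2^p bits. The orientation of the first mask (+1 on the first half, −1 on the second half) is our reading of (6.9); a gradient-based solver (hand-written or via automatic differentiation) would then minimize this cost over the erased positions.

import numpy as np

def parity_masks(l):
    """Mask vectors of (6.9)-(6.10) for a parity check over l = 2**p bits."""
    p = int(np.log2(l))
    m = np.ones(l)
    m[l // 2:] = -1                          # first mask: +1 on first half, -1 on second half
    masks = [m]
    for _ in range(p - 1):
        masks.append(np.roll(masks[-1], -1)) # one-element left circular shift
    return np.stack(masks)                   # shape [p, l]

def erasure_cost(x, checks, known_idx, known_val):
    """Cost of (6.12): squared parity residuals of (6.11) plus the constraints
    x_i = c_i on the non-erased positions.

    x:       current real-valued estimate of all n bits (erased entries are the unknowns).
    checks:  list of index arrays, each listing the d_c bit positions of one parity check.
    """
    cost = 0.0
    for bits in checks:
        y = x[bits]
        cost += 0.5 * np.prod(parity_masks(len(bits)) @ y) ** 2
    cost += 0.5 * np.sum((x[known_idx] - known_val) ** 2)
    return cost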

CHAPTER 7
INDEPENDENT COMPONENT ANALYSIS USING VARIATIONAL AUTOENCODER FRAMEWORK

7.1 Introduction

One of the biggest challenges in unsupervised machine learning is finding nonlinear features from unlabeled data. In the general nonlinear case, there have been some successful applications of deep neural networks in recent years. Multi-layer deep belief networks, Restricted Boltzmann Machines (RBMs) and, most recently, variational autoencoders are some of the most promising new developments. Independent component analysis (ICA) is one of the most basic and successful classical tools in unsupervised learning, and it was studied extensively during the early 2000s. For the simplest case of linear ICA without the presence of noise, this problem has been solved and the conditions for the identifiability of independent components have been well established [45]. However, for the more general form of the problem, where the independent features (sources) are mixed using nonlinear transforms and specifically in the presence of noise, there has been little success. In recent years, the problem of learning ICA from nonlinear mixtures (and similar related problems) has been addressed in works such as [46, 47]. In particular, in [47] the Adversarial Nonlinear Independent Component Analysis (Anica) algorithm was proposed, which is an unsupervised learning method based on Generative Adversarial Networks (GANs) [48]. This method has been shown to outperform the state of the art for linear and nonlinear source separation tasks. Anica uses a simple non-probabilistic encoder/decoder network to obtain the components, and the independence among components is encouraged via a GAN objective. This adversarial network compares the samples from the encoder to the output of a trainable generative model or, alternatively, to the samples obtained from the approximate product of marginals. Despite presenting good results, Anica has some

shortcomings. The main problem is the one common among most GAN-based approaches: it is usually very tricky to find a balance between the two objectives (corresponding to the discriminator and generator networks) of a GAN network. As such, this method requires many trials and an extensive parameter search to find the right balance of weights. Further, the loss function does not always correspond to independence and hence the accuracy of finding independent sources fluctuates almost constantly. As such, there is no obvious method for picking the right answer in an unsupervised manner from the values of the loss function. Thus, the reported maximum correlation found during training is not a good indicator of the performance of the unsupervised method. Furthermore, the ICA model used in Anica does not consider an additive noise. Thus, as we will also show by experiments, Anica is not robust to noise and its performance drops significantly for noisy mixtures. The problem of linear ICA in the presence of noise has been studied in [49, 50, 51, 52, 53], among others. Some of the most successful approaches (e.g., [52, 53]) use the mean-field variational Bayesian framework to solve linear ICA in a Bayesian setting. However, these approaches are usually very slow and have very limited applicability even for the linear case. In this chapter, we adopt the well-established Variational AutoEncoder (VAE) [88] framework to formulate the problem of nonlinear ICA in the presence of additive independent Gaussian noise. We use a technique similar to the one proposed in [47] to obtain approximate samples from the product of marginals. Further, we introduce an auxiliary objective which encourages the independence of the components without the need to train a separate GAN network. In the next section we will formulate the problem and propose a new algorithm for solving nonlinear noisy ICA.

7.2 Proposed Method

We consider the nonlinear ICA in the presence of Gaussian noise. We formulate the problem as follows: The observed measurements are M−dimensional signals x(t), t = 1,...N. Similarly, let s(t), t = 1,...N be N data points from M mutually independent sources

98 (t) sj, j = 1,...,M. We assume the measurement signals x be the instantaneous nonlinear

(t) mixing of the source components sj corrupted by an independently distributed white Gaussian noise e,

x(t) = F(s(t)) + e(t), (7.1) where F :

M Ki Y X i i i p(s) = p(si) where, p(si) ∼ πjp(si|µj, σj), i=1 j=1

(s −µi )2 1 i j − 2 1 2 σi i i j p(si|µj, σj) = q e , i 2 2πσj

where K_i is the number of components of the i-th mixture and the π^i_j are the mixing proportions. The linear case of the noisy ICA problem has been investigated in [53], among others. Because of the intractability of the posterior, in these approaches the approximate inference is usually performed via variational mean-field techniques. However, such methods are very slow and inefficient since they require an iterative inference scheme, such as Markov Chain Monte Carlo (MCMC), per data point. Ideally, we would like to estimate the source components s given the observed measurements x via the maximum marginal likelihood method, i.e.:

\hat{\Theta} = \arg\max_{\Theta} \log p(X) = \arg\max_{\Theta} \sum_{t=1}^{N} \log p(x^{(t)})    (7.3)

Assuming a latent variable model, the variational lower bound for p(x) can be written as:

\log p(x) \ge \mathbb{E}_{q(s \mid x)}\big[\log p(x, s) - \log q(s \mid x)\big] = \mathcal{L}(x; \Theta),    (7.4)

where Θ indicates the parameters of the model. In [88], it was shown that by using the Stochastic Gradient Variational Bayes (SGVB) algorithm, the approximate inference can be performed efficiently by maximizing the variational lower bound L via the reparameterization trick. In this framework, the approximate posterior q(s|x) (which is usually parameterized via neural networks) has to meet two requirements: 1) it should be easy (efficient) to obtain samples from; 2) it should allow for the reparameterization trick [88]. In practice, usually a distribution from the 'location-scale' family, and particularly the Gaussian distribution, is used to model the approximate posterior. The SGVB algorithm enables us to tackle the noisy ICA problem as stated in (7.1) much more efficiently than the classical variational mean-field techniques. In the following we show how to adopt the VAE framework for the task of identifying independent components from a noisy nonlinear mixture.
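As an illustration of the second requirement, the following is a minimal NumPy sketch of the reparameterization trick for a diagonal Gaussian posterior; the function name and shapes are illustrative and are not taken from the actual implementation.

import numpy as np

def reparameterize(mu, log_var, rng=np.random.default_rng(0)):
    """Draw s ~ N(mu, diag(exp(log_var))) as a deterministic transform of noise.

    The sample is written as mu + sigma * eps with eps ~ N(0, I), so gradients
    with respect to (mu, log_var) can flow through the sampling step.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Toy usage: a batch of 4 two-dimensional posterior means/variances.
mu = np.zeros((4, 2))
log_var = np.full((4, 2), -1.0)
s = reparameterize(mu, log_var)
print(s.shape)  # (4, 2)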

7.2.1 Encoder

In the VAE framework, the prior distribution is usually assumed to be a parameter-less standard Normal distribution and the posterior is approximated with a multivariate Gaussian distribution. However, since in our model the prior can be an arbitrary distribution (approximated with mixtures of Gaussians), we need the posterior approximation q(s|x) to be very close to the actual posterior p(s|x). One way to accomplish this is to use the framework of normalizing flows (e.g., Normalizing Flows [89], Inverse Autoregressive Flows [90]), which provides a general strategy for flexible variational inference of posteriors over latent variables. In these approaches, a simple base distribution is modified using an easily invertible transformation. In our experiments, though, this method did not yield any noticeable improvement and in many cases degraded the performance. Alternatively, we use a mixture of Gaussians latent variable scheme which proved to be a viable solution for the

proposed ICA framework. In particular, by introducing the latent variable y, we define the variational model such that:

q(y, s \mid x) = q(y \mid x)\, q(s \mid y, x)    (7.5a)

q_\phi(y \mid x) = \mathrm{Cat}\big(y \mid \pi_\phi(x)\big)    (7.5b)

q_\phi(s \mid x, y) = \mathcal{N}\big(s \mid \mu_\phi(y, x), \mathrm{diag}(\Sigma_\phi(y, x))\big),    (7.5c)

where φ indicates the set of all the variational parameters. Similarly, θ indicates the set of all generative parameters. Further, Cat is short for the categorical distribution. It is worth noting that we incorporate the latent variable y only in the variational approximation model; in the generative model we assume x depends only on s and e, as we will see next.
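A minimal sketch of this mixture-of-Gaussians posterior is given below; simple random linear maps stand in for the neural networks π_φ, μ_φ and Σ_φ, and the weight matrices and dimensions are purely illustrative.

import numpy as np

rng = np.random.default_rng(0)
M, K = 6, 6           # number of sources and number of posterior mixture components

# Illustrative encoder parameters (in the model these are neural networks).
W_pi = rng.standard_normal((M, K)) * 0.1
W_mu = rng.standard_normal((M + K, M)) * 0.1
W_var = rng.standard_normal((M + K, M)) * 0.1

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def encode(x):
    """q(y|x) = Cat(pi(x)) and q(s|x,y) = N(mu(x,y), diag(var(x,y))), as in (7.5)."""
    pi = softmax(x @ W_pi)                                 # categorical posterior over y
    mus, log_vars = [], []
    for y in range(K):
        y_onehot = np.zeros((x.shape[0], K))
        y_onehot[:, y] = 1.0
        h = np.concatenate([x, y_onehot], axis=1)
        mus.append(h @ W_mu)
        log_vars.append(h @ W_var)
    return pi, np.stack(mus, axis=1), np.stack(log_vars, axis=1)

x = rng.standard_normal((5, M))
pi, mu, log_var = encode(x)        # shapes: (5, K), (5, K, M), (5, K, M)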

7.2.2 Decoder

In a typical VAE framework, the generative model is usually defined as p(x, s) = p_θ(s) p_θ(x|s)

(see Fig. 7.1a). In this framework, pθ(x|s) (usually referred to as the decoder) is assumed to be a function of the latent variable. For example, in cases where x is continuous data,

p_θ(x|s) is modeled via a multivariate Gaussian distribution whose covariance matrix and mean vector are neural functions of the latent variable. However, the generative model used in our proposed noisy ICA (see Fig. 7.1b) is different and we assume the observation vector x is generated by two independent random processes s and e. Further, we have:

\Pr\big(x = x^{(i)} \mid s = s^{(i)}\big) = \Pr\big(x = \mathcal{F}(s^{(i)}) + e^{(i)} \mid s = s^{(i)}\big)    (7.6a)

= \Pr\big(e = x^{(i)} - \mathcal{F}(s^{(i)})\big),    (7.6b)

where F is the nonlinear mapping in (7.1). In the VAE framework, this can be interpreted as learning a posterior distribution whose mean is a function of the latent variable.

Figure 7.1: Graphical models used in VAE vs. Noisy ICA. (a) VAE model; (b) Noisy ICA model.

However, the other parameters (for example, the covariance matrix in the case of a multivariate Gaussian distribution) are trainable but are independent of the latent variables, i.e.,

p_\theta(x \mid s) = \mathcal{N}\big(x \mid \mu = \mathcal{F}(s),\ \Sigma = \Sigma^e_\theta\big),

where Σ^e_θ is the covariance of the additive Gaussian noise. The above feature is a very important distinction of our model compared to the traditional VAE. Indeed, as we will show in the

experiments, if we do not restrict the covariance Σ^e_θ to be fixed for all the data points and instead let it be a function of the latent variables (as in the VAE), the model completely fails to learn independent components.
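A minimal sketch of such a decoder is given below, with an illustrative stand-in for the nonlinear mapping F and a trainable but input-independent noise covariance; all names and dimensions are assumptions used only for illustration.

import numpy as np

rng = np.random.default_rng(0)
M = 6

# Trainable (but input-independent) log-variance of the additive noise e.
log_var_e = np.full(M, -3.0)

# Illustrative stand-in for the nonlinear decoder F(s); in the model this is a neural network.
W1 = rng.standard_normal((M, 16)) * 0.3
W2 = rng.standard_normal((16, M)) * 0.3

def F(s):
    return np.tanh(s @ W1) @ W2

def log_p_x_given_s(x, s):
    """log N(x | mu=F(s), Sigma=diag(exp(log_var_e))); the covariance does not
    depend on s, which is the key restriction of the noisy-ICA decoder."""
    var = np.exp(log_var_e)
    diff = x - F(s)
    return -0.5 * np.sum(np.log(2 * np.pi * var) + diff ** 2 / var, axis=-1)

x = rng.standard_normal((4, M))
s = rng.standard_normal((4, M))
print(log_p_x_given_s(x, s).shape)   # (4,)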

7.2.3 Learning Algorithm

Given the proposed probabilistic framework, the variational lower-bound of the model can be expressed as:

\mathcal{L}(x, y) = -\mathbb{E}_{q_\phi(s \mid x, y)}\big[\log p_\theta(x \mid s) + \log p_\theta(s) - \log q_\phi(y, s \mid x)\big]    (7.7a)

\mathcal{L}(x) = \sum_{y} q_\phi(y \mid x)\, \mathcal{L}(x, y)    (7.7b)

Even though we have thus far not explicitly enforced the independence among the latent components, our experiments show that simply by minimizing the variational objective L(x), in most cases our model is able to find the independent components in a variety of tasks. However, in some cases, especially in the presence of added noise, the model may fail to converge. Further, in some cases the model converges to a good result but at some point it diverges from the desired objective even though the loss function is still decreasing. As suggested by [47], we obtain approximate samples from the product of marginals, i.e., \prod_{i=1}^{M} p(s_i), by uniformly resampling from the model's estimates of the sources (the maximum likelihood estimates). In this method, assuming we obtain a sample (s_1^{(i)}, ..., s_M^{(i)})

from the joint distribution of the posterior, i.e., p(s_1, ..., s_M | x), an approximate sample

from the marginal distribution p(s_i) can be obtained by discarding all the other variables from the joint sample. In summary, this resampling process involves the following steps (a small illustrative sketch is given after the list):

• Let S ∈ \mathbb{R}^{K \times M} be a matrix whose rows are K samples drawn from the (estimated) joint posterior.

• Let u be a vector of M indices uniformly sampled from {1,...,K}.

• Construct the resampled \hat{s} such that \hat{s}_i = S_{u_i, i}.
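The following NumPy sketch implements this resampling; the matrix layout of S, with one joint sample per row, is an assumption made for illustration.

import numpy as np

def resample_marginals(S, rng=np.random.default_rng(0)):
    """Approximate sample from the product of marginals prod_i p(s_i).

    S is a (K, M) matrix whose rows are samples of the estimated joint posterior;
    each coordinate of the resampled vector is taken from an independently and
    uniformly chosen row, i.e. s_hat[i] = S[u_i, i].
    """
    K, M = S.shape
    u = rng.integers(0, K, size=M)       # one random row index per coordinate
    return S[u, np.arange(M)]

S = np.random.default_rng(1).standard_normal((500, 6))
s_hat = resample_marginals(S)            # shape (6,)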

To ensure independence, we need to have \prod_{i=1}^{M} p(s_i) = p(s_1, ..., s_M). Although all the samples and their resampled versions are obtained from a conditional distribution, if we select a very large set of samples (in our experiments we use all the available samples from the training set), the samples can approximately be considered to be generated from the

prior distribution p_θ(s). As such, we define an independence objective L_ind as:

\mathcal{L}_{ind} = \frac{1}{N}\left(\sum_{i=1}^{N} \log p_\theta\big(s = s^{(i)}\big) - \sum_{i=1}^{N} \log p_\theta\big(s = \hat{s}^{(i)}\big)\right)    (7.8)
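A sketch of evaluating (7.8) is shown below, assuming the factorized mixture-of-Gaussians prior of (7.2); the parameter arrays (pi, mu, log_var) and the toy resampling used in the example are illustrative.

import numpy as np
from scipy.special import logsumexp

def log_prior(S, pi, mu, log_var):
    """log p(s) under the factorized mixture-of-Gaussians prior of (7.2): each
    source dimension i has its own 1-D mixture (pi[i], mu[i], exp(log_var[i]))."""
    # S: (N, M); pi, mu, log_var: (M, K)
    var = np.exp(log_var)
    comp = (-0.5 * (np.log(2 * np.pi * var)[None]
                    + (S[:, :, None] - mu[None]) ** 2 / var[None])
            + np.log(pi)[None])                           # (N, M, K)
    return logsumexp(comp, axis=2).sum(axis=1)            # (N,)

def independence_loss(S, S_hat, prior_params):
    """L_ind of (7.8): mean log-prior of joint samples minus that of resampled ones."""
    return log_prior(S, *prior_params).mean() - log_prior(S_hat, *prior_params).mean()

rng = np.random.default_rng(0)
pi = np.full((6, 10), 0.1)                                # 10 prior components per source
mu = rng.standard_normal((6, 10))
lv = np.zeros((6, 10))
S = rng.standard_normal((2000, 6))
S_hat = np.stack([rng.permutation(S[:, i]) for i in range(6)], axis=1)  # toy resampled set
print(independence_loss(S, S_hat, (pi, mu, lv)))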

We cannot combine this objective with the variational objective, since the independence criterion requires a much larger batch size. Instead, we optimize this objective separately and less frequently than the variational objective, and we exclude the parameters of the prior distribution from its gradient.

Figure 7.2: Block diagram of the noisy ICA model

Algorithm 2 summarizes the proposed method for learning the nonlinear noisy ICA. Alternatively, we can use an adversarial objective similar to the Anica algorithm; in our experiments both approaches resulted in similar performance. Fig. 7.2 shows the overall diagram of the proposed algorithm.
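The optimization schedule can be summarized schematically as follows. The loss functions here are toy stand-ins (the actual objectives are the TensorFlow implementations of (7.7) and (7.8)), while the update frequencies and learning rates mirror the settings reported in the experiments below.

import numpy as np

# Hypothetical stand-ins for the actual model objectives.
def elbo_loss(theta, x_batch):
    return float(np.sum(theta ** 2) + 0.0 * x_batch.sum())

def ind_loss(theta, x_all):
    return float(np.sum(np.abs(theta)) + 0.0 * x_all.sum())

def num_grad(f, theta, x, eps=1e-4):
    """Toy finite-difference gradient (the real model uses backpropagation)."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d.flat[i] = eps
        g.flat[i] = (f(theta + d, x) - f(theta - d, x)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 6))          # observed mixtures
theta = rng.standard_normal(4) * 0.1        # stand-in for all trainable parameters

for step in range(200):
    batch = X[rng.integers(0, len(X), 500)]
    theta -= 0.001 * num_grad(elbo_loss, theta, batch)        # frequent variational updates
    if step % 10 == 0:                                         # occasional, large-batch
        theta -= 0.0001 * num_grad(ind_loss, theta, X)         # independence updates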

7.3 Experiments

We have implemented¹ the proposed noisy ICA model using Python and the TensorFlow [91] library. In all the experiments, we set the number of components for the source mixture priors to 10, and for the posterior mixture we use 6 components.

¹The source code for the experiments is available at https://github.com/apayani/NonLinearNoisyICA

Algorithm 2: Noisy nonlinear ICA (VAE-nICA)

Figure 7.3: Six synthetic sources used in our experiments

In most of the experiments, we found that changing the number of components has little effect on the overall score, but using more components usually increases the speed of convergence. We trained the model using the ADAM [59] optimizer, and we chose different learning rates for the two objectives:

0.001 for L and 0.0001 for L_ind. In addition to the independence objective (7.8), we also implemented a few similar objectives as well as the GAN objective used in [47]. In most cases, the proposed variational noisy ICA approach (VAE-nICA) could reach an optimal solution even without the use of any independence objective. However, in cases where we were dealing with noisy measurements, the independence objectives were beneficial. We use a batch size of 500 samples and we perform the experiments using the same synthetic data that was used in [47]. This dataset contains 6 independent sources, as shown in Fig. 7.3. We use 3 different models for creating the mixtures (a sketch of these mixing functions follows the list):

• Linear Mixture: we form a linear mixing matrix by randomly drawing its elements from [−.5, .5] and we apply the resulting linear transformation.

• Post Non-Linear (PNL) Mixtures: this mixture is formed by applying a hyperbolic tangent activation function to the output of a linear mixture.

• Multi-Layer Nonlinear (MLP) Mixtures: formed by cascading multiple (two) PNL mixtures.
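A minimal sketch of these three mixing procedures is given below; the exact sampling range of the linear mixing matrix and the source placeholder are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(0)
M, N = 6, 4000
S = rng.standard_normal((N, M))                  # placeholder for the 6 synthetic sources

def linear_mixture(S, rng):
    A = rng.uniform(-0.5, 0.5, size=(M, M))      # assumed sampling range of the mixing matrix
    return S @ A.T

def pnl_mixture(S, rng):
    return np.tanh(linear_mixture(S, rng))       # post non-linear: tanh of a linear mixture

def mlp_mixture(S, rng):
    return pnl_mixture(pnl_mixture(S, rng), rng) # two cascaded PNL stages

X_lin, X_pnl, X_mlp = linear_mixture(S, rng), pnl_mixture(S, rng), mlp_mixture(S, rng)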

Finally, similar to [47], we report the maximum correlation score to evaluate the performance of each ICA algorithm.
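One common way to compute such a score is to match each estimated component to a source so as to maximize the mean absolute correlation; the sketch below follows this idea (the exact evaluation protocol of [47] may differ).

import numpy as np
from scipy.optimize import linear_sum_assignment

def max_correlation_score(S_true, S_est):
    """Mean absolute Pearson correlation under the best one-to-one matching of
    estimated components to true sources."""
    M = S_true.shape[1]
    C = np.abs(np.corrcoef(S_true.T, S_est.T)[:M, M:])   # (M, M) cross-correlations
    rows, cols = linear_sum_assignment(-C)                # maximize total correlation
    return C[rows, cols].mean()

rng = np.random.default_rng(0)
S = rng.standard_normal((1000, 6))
S_est = S[:, ::-1] + 0.1 * rng.standard_normal((1000, 6))   # permuted, slightly noisy estimate
print(max_correlation_score(S, S_est))                       # close to 1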

7.3.1 VAE-nICA vs VAE

As we stated in the previous section, imposing the noisy ICA model leads to a restriction on the form of the decoder. In particular, for the conditional Gaussian decoder, unlike the VAE, we have to learn the parameters of the covariance matrix independently of the latent variables. To demonstrate the importance of this simple distinction, we modify our model to allow for learning the covariance matrix of p(x|s) as a function of s, similar to the VAE model. In Fig. 7.4, we compare the performance of VAE-nICA to this VAE model. The VAE-nICA easily converges to the correct solution. In contrast, the VAE model fails to converge for this simple linear mixing problem.

7.3.2 Performance

Table 7.1 compares the performance of our proposed method to Anica as well as FastICA for the three different mixture tasks. In all cases, the proposed VAE-nICA outperforms the other two.

7.3.3 Robustness to Noise

Fig. 7.5 shows the performance scores of Anica and VAE-nICA over the training data. In this experiment, we evaluated the performance of each model for the linear separation problem in two cases: 1) without noise, and 2) with white Gaussian noise (σ = 0.05). Although we have not shown all the epochs, for the noiseless case Anica reaches the optimal solution after many more epochs. However, for the noisy case, it starts to learn but at some point

Figure 7.4: Comparing the performance of standard VAE vs VAE-nICA

Table 7.1: Maximum correlation results for the 3 different types of mixing: Linear, PNL and MLP.

Method     Linear   PNL      MLP
Anica      0.9990   0.9793   0.9673
FastICA    0.9998   0.8327   0.9173
VAE-nICA   0.9999   0.9869   0.9834

the model diverges. In contrast, VAE-nICA reaches the optimal solution much faster in the noiseless case. Further, for the noisy case, VAE-nICA outperforms Anica. Table 7.2 compares the overall best scores obtained by the two methods for three different levels of noise.

Table 7.2: Maximum correlation results for the linear mixing problem in the presence of white Gaussian noise with three different noise variances σ_e^2.

Method     σ_e^2 = 0.0   σ_e^2 = 0.03   σ_e^2 = 0.05
Anica      0.99          0.82           0.78
VAE-nICA   0.99          0.91           0.85

Figure 7.5: Comparing the performance of VAE-nICA vs Anica in the presence of noise. (a) Anica; (b) VAE-nICA.

CHAPTER 8
CONCLUSION

8.1 Summary of Achievements

In this dissertation, a novel paradigm for learning algorithmic and discrete tasks via Deep Neural Networks (DNN) was introduced. The aim of this paradigm is to bridge the gap between standard DNNs and explanatory AI disciplines such as logic programming. While DNNs offer impressive performance, their learning is usually implicit and hard to decipher. On the other hand, logic programming provides easy-to-interpret hypotheses but has a limited scope and cannot be easily applied to more complex domains such as images and other multidimensional data. At the heart of the proposed paradigm is a new neural network model for learning and representing Boolean functions in an explicit manner. Unlike traditional multi-layer neural networks, the differentiable Neural Logic network (dNL) was specifically designed for learning Boolean functions. In this dissertation, the applications of dNL in various domains, including inductive logic programming and relational reinforcement learning, as well as discrete algorithmic tasks such as decoding LDPC codes over binary erasure channels, were studied. In the following, the contributions of the thesis are summarized. In Chapter 2, we introduced the basic design of the differentiable Neural Logic (dNL) networks. In particular, we demonstrated how, by introducing the concept of membership weights and by using a minimal number of training weights, the logical conjunction and disjunction functions can be implemented as multiplicative neurons. These novel neurons can be stacked and arranged in a multi-layer design, similar to perceptron neurons, to learn and represent Boolean DNF (Disjunctive Normal Form) and CNF (Conjunctive Normal Form) functions. We further introduced a novel multiplicative neuron for learning and representing the exclusive-OR logical function and we proved the correctness of its

mathematical model. Further, via experiments such as learning arbitrary Boolean DNF functions, XOR logic, binary arithmetic and grammar verification, we showed its superior performance over standard neural networks in certain Boolean and discrete algorithmic tasks. In Chapter 3, we focused on the explanatory aspect of the proposed dNL networks. In particular, we demonstrated that by exploiting the explicit representation of Boolean functions in dNL, we can formulate ILP as a satisfiability problem. We showed that by using a fuzzy relaxation of this satisfiability problem, we can translate ILP into a differentiable neural network which can be optimized via standard gradient-based learning techniques. It is well known that the need for using program templates to generate a limited set of viable candidate clauses in forming the predicates is the key weakness in all the past ILP systems, severely limiting the solution space of a problem. In this chapter, we showed that unlike past solvers, the proposed differentiable neural ILP solver (dNL-ILP) can learn arbitrarily complex predicates without the need for specifying any rule template. Using various experiments we showed that the proposed dNL-ILP outperforms past algorithms for learning algorithmic and recursive predicates. In particular, we demonstrated that the flexibility of the dNL-ILP solver makes it possible to learn very complex recursive formulas for algorithmic tasks such as sorting. In Chapter 4, we examined the application of dNL-ILP in learning from uncertain and ambiguous data. We demonstrated that unlike past neural solvers such as dILP [27], dNL-ILP is scalable to large-scale problems and is capable of learning from uncertain and ambiguous relational data. In particular, we showed that dNL-ILP outperforms state-of-the-art ILP solvers in classification tasks for relational datasets, including benchmark tasks such as Mutagenesis, Cora and IMDB. Moreover, via experiments we showed that the differentiable aspect of dNL-ILP allows for combining standard neural networks with dNL-ILP to handle continuous data. We provided two approaches to handle continuous data. In particular, we showed that we can learn the decision boundaries using an end-to-end

framework to translate continuous data into a series of binary predicates. Alternatively, we showed that using the dNL-ILP framework we can learn the distribution of continuous data, and we successfully applied this concept in the DREAM4 gene regulatory network experiment. Unlike past ILP solvers, dNL-ILP is a differentiable neural network that can be combined with other neural network layers and can be used in an end-to-end neural network design. In Chapter 5, we proposed a novel deep Relational Reinforcement Learning (RRL) model based on the differentiable dNL-ILP that can effectively learn relational information from images. We showed how this model can take expert background knowledge and incorporate it into the learning problem using appropriate predicates. The differentiable ILP allows for an end-to-end optimization of the entire framework for learning the policy in RRL. We examined the performance of the proposed RRL framework using environments such as BoxWorld and GridWorld. In particular, we demonstrated that by employing this novel approach, we can solve some problems involving very complex reasoning orders of magnitude faster than state-of-the-art reinforcement learning techniques such as A2C. In Chapter 6, we studied the generalization performance of dNL networks by considering their application to decoding LDPC codes over binary erasure channels. In this chapter, we formulated a general message passing algorithm and we employed functions consisting of dNL elements as building blocks for transmitting messages between variable nodes and check nodes. We demonstrated that the design via dNL leads to superior generalization compared to its DNN counterpart. We further exploited the differentiable exclusive-OR function of dNL (i.e., dNL-XOR) to directly solve the parity check equations by transforming them into a differentiable set of equations. The resulting equations are then solved via standard gradient techniques. Finally, in Chapter 7, we studied the application of variational autoencoders to learning Independent Component Analysis (ICA). ICA is one of the most basic and successful classical tools in unsupervised learning and has many applications in various fields.

While for the linear case there is an extensive body of research and a variety of successful algorithms exist, there are very few effective algorithms that can tackle the problem for the general nonlinear case in the presence of noise. In this chapter, we formulated the general nonlinear ICA with additive Gaussian noise by employing a variational autoencoder framework. The properties of the newly proposed ICA framework were analyzed. In particular, we demonstrated that it has superior performance and is more robust to noise compared to state-of-the-art algorithms such as Anica [47].

8.2 Future Research Directions

8.2.1 Incorporating Human Preference and Safe AI

Traditional RL depends on a well-defined reward function that fully describes the objective of a problem. This approach has been successfully applied in various game scenarios (e.g., Atari games) where the reward is simply defined by the success at the end of an episode (game) or by the scores gained during the game. Unfortunately, many tasks involve complex goals for which it is difficult to define a reward function. This difficulty has raised concerns about misalignment between our values (preferences) and the objectives of RL systems [92, 93, 94]. A viable but costly approach has been proposed in [95, 96, 97, 98, 99]: using feedback from human overseers to define or modify the reward function. Apart from the cost, these approaches have various limitations. For example, they do not necessarily cover all preference cases, since rare phenomena may not be encountered during most of the training episodes. Moreover, in the context of safe AI, the above approaches do not prevent the agent from doing something harmful during or after training. In other words, an intermediate learned policy executed by a robot (or a self-driving car) may break the system or harm the environment. Another example is Microsoft's chatbot, which reproduced thousands of offensive tweets before being taken down. In general, the goal of safe reinforcement learning is to maximize system performance while minimizing the number of safety constraint violations during learning and deployment. Safe RL has been studied

in the past [100, 101, 102, 103, 104, 105, 106], primarily using two types of approaches:

• Modifying the optimization criterion with a safety constraint.

• Incorporating external safety knowledge to avoid dangerous states in exploration strategies.

However, these solutions are inadequate and are applicable only to systems with very simple dynamics. Given the expressiveness and the explicit form of the relational language, our ILP-based RRL provides a natural platform for learning from human preferences. To clarify the above concept, consider an abstract example where the human preference is defined as follows: the agent must take action 1 if the current state is 1; further, if the current state is 2, the agent must take action 2. For all other states, the agent needs to learn the correct actions according to the desired goal. To solve the above learning problem, traditional approaches would need supplementary overseer feedback to enforce the human preferences in states 1 and 2, which is not only costly to implement (in general learning problems) but may also miss some of the human-preferred cases (e.g., the less frequent ones). In contrast, in our ILP-based RRL framework, we can use a relational program such as 'Action1 ← (state=1) or (learnedAction=1)' and 'Action2 ← (state=2) or (learnedAction=2)', where learnedAction is the target predicate that our ILP program learns. As such, the above clauses will force the agent to take action 1 or action 2 when the current detected state is 1 or 2, respectively. Otherwise (i.e., in other states), the agent will act according to the learned predicate learnedAction. This abstract example demonstrates how, by using the explicit representation of a state, we can coerce the agent into learning specific types of actions instead of indirectly learning such preferences from ambiguous reward functions. In Chapter 5, we explained that the proposed RRL framework using the dNL-ILP solver is capable of incorporating biases through the use of language. As such, we will investigate

Figure 8.1: Grid World environments for safe AI. (a) Distribution change [107]; (b) Avoiding side effects [107].

the application of the proposed framework in designing safe AI and also in incorporating human preferences through the use of predicate logic. We will explore various ways of avoiding certain states, or taking particular actions based on the current state of the system, by designing an appropriate predicate language to account for such circumstances. For the evaluation, we may use the basic framework of the Grid World environment. As in [107], we can evaluate several problems related to safe AI and human preferences by tweaking this simple environment and defining different objectives. Fig. 8.1 shows two environments designed for evaluating the impact of distribution change and of avoiding side effects. In Fig. 8.1a, the agent (denoted by A) is tasked to reach the goal (G) while avoiding obstacles (red lava). In this task, the distribution of the lava on the grid in testing is different from that in training. The agent will be successful in testing only if it can learn the concept of hazardous lava as the red-colored blocks, instead of just relying on avoiding certain locations based on the distribution of lava in the training data. Fig. 8.1b shows an environment that evaluates the agent's ability to avoid side effects. Here, the side effect is quantified by the relocation of the box X while agent A is approaching its goal G. The agent may choose the shortest path by pushing the box X down. However, by choosing this path, the box X can no longer be pushed back to its original location. The agent instead needs to push the box X to the right, so that it can push it back to its original position before heading to the goal. We will study these and other similarly defined tasks to evaluate our solutions for safe AI and for incorporating human preferences via the proposed RRL framework.

Appendices

APPENDIX A
PROOF OF THEOREM 1

Proof. First consider the case where the number of '1's in x is odd (i.e., XOR(x) = 1). In this case, none of the functions f_i(x) can be equal to zero, since the sum of an odd number of elements from the set {−1, 1} cannot be zero. Therefore, the claim in (2.7a) holds due to (2.7b). Now consider the case where XOR(x) is zero. We must show that at least one of the k functions f_i(x) is equal to zero in this case. Let M_i ∈ {−1, 1}^n be the vector of coefficients for f_i(x) and let s be the number of ones in the input vector x. Further, for any f_i, let n_i^{(1)} and n_i^{(−1)} be the number of its +1 and −1 coefficients, respectively, that match the positions of the '1' elements of the vector x. We notice that the sign of exactly two elements changes when we go from M_i to M_{i+1} (i.e., from f_i(x) to f_{i+1}(x)), and those signs remain unchanged in the subsequent functions. As we have k functions and s ≤ 2k, this guarantees that the sign of each coefficient corresponding to a '1' element changes exactly once within the set of k functions. Thus, for one of the functions, say the one corresponding to the j-th coefficient vector, we have n_j^{(1)} = n_1^{(−1)} and n_j^{(−1)} = n_1^{(1)}, which means f_j(x) = −f_1(x). Since the difference between consecutive f_i's can only be zero or ±2, this guarantees that at some point one of the f_i's (1 ≤ i ≤ j) must be equal to zero. In the above arguments we assumed that n is an even number. However, if n is an odd number, we can convert the problem to the n + 1 case by appending an extra '0' entry to the input vector x. Since a '0' has no effect on the result of XOR, the above arguments still hold.

APPENDIX B
BOXWORLD EXPERIMENT DETAILS

For a problem consisting of n boxes, we need n+1 constants of type box (note that we consider the floor as one of the boxes in our problem definition). Additionally, we define numerical constants {0, . . . , n} to represent the box coordinates used in the posH and posV predicates. For the numerical constants, we define orderings via the extensional predicates lt/2 and inc/2. For the box constants, we define two extensional predicates, same/2 and isBlue/1. Here, by p/N we mean a predicate p of arity N (i.e., p has N arguments). Since these are extensional predicates, their truth values are fixed at the beginning of the program via the background facts. For example, for the predicate inc/2, which defines the increment by one over the natural numbers, we need to set these background facts at the beginning: {inc(0,1), inc(1,2), . . . , inc(n-1,n)}. Similarly, for the predicate lt (short for lessThan) this set includes items such as {lt(0,1), lt(0,2), . . . , lt(n-1,n)}. It is worth noting that introducing predicates such as isBlue for boxes does not mean that we already know which box is the blue one (the target box that needs to be the first box in the stack). This predicate merely provides a way for the system to distinguish between boxes. Since in our learned predicates we can only use symbolic atoms (predicates with variables), and in the dNL-ILP implementation no constant is allowed in forming the hypothesis, this method allows for referring to a specific constant in the learned action predicates. Here, for example, we make the assumption that box a in our list of constants corresponds to the blue box via the background fact isBlue(a). Table B.1 explains all the extensional predicates as well as the helper auxiliary predicates that were used in our BoxWorld program. So far in our discussions, we have not distinguished between the types of variables. However, in the dNL-ILP implementation, the type of each variable should be specified in the signature of each defined predicate. This allows the dNL-ILP to use only valid combinations

117 of variables to generate the symbolic formulas. In the BoxWorld experiment, we have two

types of constants, Tb and Tp, referring to the box and numeric constants, respectively. In Table B.1, for each defined predicate, the list of variables and their corresponding types are

given, where for example X(Tp) states that the type of variable X is Tp. For clarity, Table B.1 is divided into two sections. In the top section, the constants are defined. In the other section, four groups of predicates are presented: (i) state representation predicates whose groundings are learned from the image, (ii) extensional predicates which are solely defined by background facts, (iii) auxiliary predicates, and (iv) the signature of the target predicate that is used to represent the actions in the policy gradient scheme. To learn the policy, we used a discount factor of 0.7 and the ADAM optimizer with a learning rate of 0.002. We set the maximum number of steps for each episode to 20. To learn the feature map, we used two CNN layers with a kernel size of 5 and a stride of 3, and we applied fully connected layers with softmax activation functions to learn the groundings of the state representation predicates posH and posV.
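As an illustration, the extensional background facts of Table B.1 could be enumerated as in the following sketch; the tuple-based fact representation is purely illustrative and is not the actual dNL-ILP input format.

# Enumerate the extensional background facts for a BoxWorld problem with n boxes.
def boxworld_background_facts(n):
    boxes = [chr(ord('a') + i) for i in range(n)] + ['floor']   # n box constants plus the floor
    facts = [('isFloor', 'floor'),
             ('isBlue', 'a')]                                    # box 'a' is assumed to be the blue box
    for i in range(n):                                           # inc/2 over {0, ..., n}
        facts.append(('inc', i, i + 1))
    for i in range(n + 1):                                       # lt/2 (lessThan) over {0, ..., n}
        for j in range(i + 1, n + 1):
            facts.append(('lt', i, j))
    for b in boxes:                                              # same/2 is the identity relation
        facts.append(('same', b, b))
    return facts

print(boxworld_background_facts(4)[:6])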

Table B.1: ILP definition of the BoxWorld

Constants   Description                     Values
Tb          box constants                   {a, b, c, d, floor}
Tp          coordinate position constants   {0, 1, 2, . . . , n}

Predicate       Variables                     Definition
posH(X,Y)       X(Tb), Y(Tp)                  learned from Image
posV(X,Y)       X(Tb), Y(Tp)                  learned from Image
isFloor(X)      X(Tb)                         isFloor(floor)
isBlue(X)       X(Tb)                         isBlue(a)
isV1(X)         X(Tp)                         isV1(1)
inc(X,Y)        X(Tp), Y(Tp)                  inc(0,1), inc(1,2), . . . , inc(n-1,n)
lt(X,Y)         X(Tp), Y(Tp)                  lt(0,1), lt(0,2), . . . , lt(n-1,n)
same(X,Y)       X(Tb), Y(Tb)                  same(a,a), same(b,b), . . . , same(floor,floor)
sameH(X,Y)      X(Tb), Y(Tb), Z(Tp)           posH(X,Z), posH(Y,Z)
sameV(X,Y)      X(Tb), Y(Tb), Z(Tp)           posV(X,Z), posV(Y,Z)
above(X,Y)      X(Tb), Y(Tb), Z(Tp), T(Tp)    sameH(X,Y), posV(X,Z), posV(Y,T), lt(T,Z)
below(X,Y)      X(Tb), Y(Tb), Z(Tp), T(Tp)    sameH(X,Y), posV(X,Z), posV(Y,T), lt(Z,T)
on(X,Y)         X(Tb), Y(Tb), Z(Tp), T(Tp)    sameH(X,Y), posV(X,Z), posV(Y,T), inc(T,Z)
isCovered(X)    X(Tb), Y(Tb)                  on(Y,X), ¬isFloor(X)
moveable(X,Y)   X(Tb), Y(Tb)                  ¬isCovered(X), ¬isCovered(Y), ¬same(X,Y), ¬isFloor(X), ¬on(X,Y)
move(X,Y)       X(Tb), Y(Tb), Z(Tp), T(Tp)    Action predicate that is learned via policy gradient

APPENDIX C
GRIDWORLD EXPERIMENT DETAILS

In the GridWorld experiment, we distinguish between the constants used for vertical and horizontal coordinates, as shown in Table C.1. This reduces the number of possible symbolic

atoms and makes the convergence faster. We also define 10 constants of type Tc to represent each of the 10 possible colors for the grid cells in the scene. The complete list of all the extensional and auxiliary predicates for this program is given in Table C.1. In this task, to incorporate our knowledge about the problem, we define the concept of an item via the predicate isItem(X,Y). Using this definition, an item represents any cell that is neither background nor agent. By incorporating this concept, we can further define higher-level concepts via the locked and isLock predicates, which designate two adjacent items as a locked item and its corresponding key, respectively. The dimension of the observation image is 112x112x3. We consider two types of networks to extract the grid colors from the image. In the first method, we apply three convolutional layers with kernel sizes of [5, 3, 3] and strides of 2 for all layers. We use the relu activation function and apply batch normalization after each layer. The number of features in each layer is set to 24. We take the feature map of size 14 × 14 × 24 and apply a two-layer MLP with dimensions of 64 and 10 (10 is the number of color constants) and activation functions of relu and softmax, to generate the grounding matrix G (the dimension of G is 14x14x10). As can be seen in Fig. C.1, the current key is located at the top-left border. We extract the grounding for the predicate hasKey from the value of G[0, 0, :] and the groundings of the predicate color by discarding the border elements of G (i.e., G[1..12, 1..12, :]). Alternatively, and because of the simplicity of the environment, instead of using CNNs we may take the center of each cell to directly create a feature map of size 14 × 14 × 3 and then apply the MLP to generate the groundings. In our experiments we tested both

Figure C.1: An example GridWorld scene

scenarios. Using the CNN approach, the speed of convergence of our method was around 2 times slower. For the A2C algorithm, we could not solve the problem using either of these approaches. We set the maximum number of steps in an episode to 50 and we use a learning rate of 0.001. For our models, we use a discount factor of 0.9, and for A2C we tested various values in the range 0.9 to 0.99 to find the best solution.
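A minimal sketch of the simpler (non-CNN) grounding extraction described above is given below; the MLP weights are illustrative stand-ins for the trained network.

import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative MLP weights (64 hidden units, 10 color constants), as described above.
W1 = rng.standard_normal((3, 64)) * 0.1
W2 = rng.standard_normal((64, 10)) * 0.1

def groundings_from_image(img):
    """img: (112, 112, 3). Take the center pixel of each of the 14x14 cells and map its
    RGB value to a distribution over the 10 color constants; the result G is the grounding
    tensor for color(X,Y,Z). hasKey is read from the top-left border cell."""
    centers = img[4::8, 4::8, :].astype(float) / 255.0     # (14, 14, 3)
    h = np.maximum(centers @ W1, 0.0)                       # relu
    G = softmax(h @ W2)                                     # (14, 14, 10)
    has_key = G[0, 0, :]                                     # grounding of hasKey(X)
    color = G[1:13, 1:13, :]                                 # drop the border cells
    return has_key, color

img = rng.integers(0, 256, size=(112, 112, 3))
has_key, color = groundings_from_image(img)
print(has_key.shape, color.shape)   # (10,) (12, 12, 10)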

Table C.1: ILP definition of the GridWorld

Constants   Description              Values
Tv          vertical coordinates     {0, 1, 2, . . . , 11}
Th          horizontal coordinates   {0, 1, 2, . . . , 11}
Tc          cell color code          {0, 1, 2, . . . , 9}

Predicate      Variables               Definition
color(X,Y,Z)   X(Tv), Y(Th), Z(Tc)     learned from Image
hasKey(X)      X(Tc)                   learned from Image
incH(X,Y)      X(Th), Y(Th)            incH(0,1), . . . , incH(10,11)
isC0(X)        X(Tc)                   isC0(0)
isC1(X)        X(Tc)                   isC1(1)
isC2(X)        X(Tc)                   isC2(2)
isBK(X,Y)      X(Tv), Y(Th), Z(Tc)     color(X,Y,Z), isC0(Z)
isAgent(X,Y)   X(Tv), Y(Th), Z(Tc)     color(X,Y,Z), isC1(Z)
isGem(X,Y)     X(Tv), Y(Th), Z(Tc)     color(X,Y,Z), isC2(Z)
isItem(X,Y)    X(Tv), Y(Th)            ¬isBK(X,Y), ¬isAgent(X,Y)
locked(X,Y)    X(Tv), Y(Th), Z(Th)     isItem(X,Y), isItem(X,Z), incH(Y,Z)
isLock(X,Y)    X(Tv), Y(Th), Z(Th)     isItem(X,Y), isItem(X,Z), incH(Z,Y)
move(X,Y)      X(Tv), Y(Th), Z(Tc)     Action predicate that is learned via policy gradient

APPENDIX D
RELATIONAL REASONING EXPERIMENT DETAILS

This task was introduced as a benchmark for learning relational information from images in [41]. In this task, objects of two types (circles or rectangles) in 6 different colors are randomly placed in a 4x4 grid, and questions such as 'is the blue object a circle?' or 'what is the color of the object nearest to the blue one?' should be answered. To formulate this problem via ILP, instead of using grid coordinates (vertical and horizontal positions as in the past experiments) we consider a flat index, as can be seen in Table D.1. As such, we will have 16 positional

constants of type Tp to represent grid positions, as well as 6 constants of type Tc to represent the color of each object. In the original problem in [41], a question is represented as a binary vector of length 11. The first 6 bits are the one-hot representation of the color of the object in question. The next 5 bits represent one of the five possible types of questions, e.g., 'is it a circle or a rectangle?' or 'what is the color of the nearest object?'.

Table D.1: Flat index for a grid of 4 by 4 used in the relational learning task

 0   1   2   3
 4   5   6   7
 8   9  10  11
12  13  14  15

We define a nullary predicate for each of the 11 possible bits in the question via the predicates isQ0(), . . . , isQ10(). Similar to the original paper, we use 4 CNN layers with relu activation functions and batch normalization to obtain a feature map of size 4×4×24. We use a kernel size of 5 and strides of [3,3,2,2] for the layers. By applying fully connected layers to the feature map, we learn the groundings of the predicates color(X,Y), isObject(X) and isCircle(X). We define some auxiliary predicates as shown in Table D.2. For each of the 10 possible answers, we create one nullary predicate. The vector of size 10 that is created by the groundings of these 10 predicates (i.e., isAnswer0(), . . . , isAnswer9())

are then used to calculate the cross-entropy loss between the network output and the provided ground truth. We should mention that in the definition of the auxiliary predicate qa(X), we exploit our prior knowledge of the fact that the first 6 bits of the question vector correspond to the colors of the object. Without introducing this predicate, the convergence of our model is significantly slower. For example, while by incorporating this predicate we need around 30000 training samples for convergence, it would take more than 200000 training samples without this predicate.
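As an illustration, the question bits can be mapped directly to the groundings of the nullary predicates, and the groundings of the answer predicates can be compared to the ground truth with a cross-entropy loss; the softmax normalization below is an illustrative choice, not taken from the original implementation.

import numpy as np

def question_groundings(q_bits):
    """Map the 11-bit question vector to the groundings of isQ0(), ..., isQ10();
    each bit is used directly as the truth value of the corresponding predicate."""
    assert len(q_bits) == 11
    return {f'isQ{i}': float(b) for i, b in enumerate(q_bits)}

def answer_loss(answer_groundings, target):
    """Cross-entropy between the (normalized) groundings of isAnswer0()..isAnswer9()
    and the one-hot ground-truth answer index."""
    p = np.exp(answer_groundings - answer_groundings.max())
    p = p / p.sum()
    return -np.log(p[target] + 1e-12)

q = np.zeros(11)
q[2] = 1          # color bit (illustrative)
q[6 + 1] = 1      # question-type bit (illustrative)
print(question_groundings(q)['isQ2'])                                    # 1.0
print(answer_loss(np.random.default_rng(0).standard_normal(10), target=3))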

Table D.2: ILP definition of the relational reasoning task

Constants   Description                                Values
Tp          Flat position of items in a 4 by 4 grid    {0, 1, 2, . . . , 15}
Tc          color of an item                           {0, 1, 2, . . . , 5}

Predicate          Variables               Definition
isQ0()                                     Given as a binary value
. . .
isQ10()                                    Given as a binary value
color(X,Y)         X(Tp), Y(Tc)            Learned from Image
isCircle(X)        X(Tp)                   Learned from Image
isObject(X)        X(Tp)                   Learned from Image
equal(X,Y)         X(Tp), Y(Tp)            equal(0,0), . . . , equal(15,15)
lt(X,Y,Z)          X(Tp), Y(Tp), Z(Tp)     true if the distance between the grid cells corresponding to X and Y is less than the distance between X and Z (see Table D.1)
left(X)            X(Tp)                   left(0), left(1), . . . , left(12), left(13)
right(X)           X(Tp)                   right(2), right(3), . . . , right(14), right(15)
top(X)             X(Tp)                   top(0), top(1), . . . , top(6), top(7)
bottom(X)          X(Tp)                   bottom(8), bottom(9), . . . , bottom(14), bottom(15)
closer(X,Y,Z)      X(Tp), Y(Tp), Z(Tp)     isObject(X), isObject(Y), isObject(Z), lt(X,Y,Z)
farther(X,Y,Z)     X(Tp), Y(Tp), Z(Tp)     isObject(X), isObject(Y), isObject(Z), gt(X,Y,Z)
notClosest(X,Y)    X(Tp), Y(Tp), Z(Tp)     closer(X,Z,Y)
notFarthest(X,Y)   X(Tp), Y(Tp), Z(Tp)     farther(X,Z,Y)
qa(X)              X(Tp), Y(Tc)            isQ0(), color(X,Y), isC0(Y); isQ1(), color(X,Y), isC1(Y); . . . ; isQ5(), color(X,Y), isC5(Y)
isAnswer0()        X(Tp), Y(Tp)            The learned hypothesis: is the answer 0
. . .
isAnswer9()        X(Tp), Y(Tp)            The learned hypothesis: is the answer 9

APPENDIX E
ASTERIX EXPERIMENT DETAILS

Table E.1: ILP definition of the Asterix experiment

Constants   Description              Values
Tv          vertical coordinates     {0, 1, 2, . . . , 8}
Th          horizontal coordinates   {0, 1, 2, . . . , 12}

Predicate        Variables                       Definition

O1(X,Y)          X(Tv), Y(Th)                    learned from Image: objects of type agent
O2(X,Y)          X(Tv), Y(Th)                    learned from Image: objects of type L2R predator
O3(X,Y)          X(Tv), Y(Th)                    learned from Image: objects of type R2L predator
O4(X,Y)          X(Tv), Y(Th)                    learned from Image: objects of type food
isV0(X)          X(Tv)                           isV0(0)
isV11(X)         X(Tv)                           isV11(11)
isH0(X)          X(Th)                           isH0(0)
isH7(X)          X(Th)                           isH7(7)
incV(X,Y)        X(Tv), Y(Tv)                    incV(0,1), . . . , incV(6,7)
ltH(X,Y)         X(Th), Y(Th)                    ltH(0,1), ltH(0,2), . . . , ltH(11,12)
closeH(X,Y)      X(Th), Y(Th)                    true if |X − Y| ≤ 2
agentH(X)        X(Tv), Y(Th)                    O1(X,X)
agentV(X)        X(Th), Y(Tv)                    O1(Y,X)
predator(X,Y)    X(Tv), Y(Th)                    O2(X,Y); O3(X,Y)
agent()          X(Tv)                           agentV(X)
badMoveUp()      X(Tv), Y(Th), Z(Tv), T(Th)      O1(X,Y), predator(Z,T), incV(Z,X), closeH(Y,T)
badMoveDown()    X(Tv), Y(Th), Z(Tv), T(Th)      O1(X,Y), predator(Z,T), incV(X,Z), closeH(Y,T)
badMoveLeft()    X(Tv), Y(Th), Z(Th)             O1(X,Y), O2(X,Z), ltH(Z,Y), closeH(Z,Y)
badMoveRight()   X(Tv), Y(Th), Z(Th)             O1(X,Y), O3(X,Z), ltH(Y,Z), closeH(Z,Y)
moveUp()         X(Tv), Y(Th), Z(Tv), T(Th)      Action predicate that is learned via policy gradient
moveDown()       X(Tv), Y(Th), Z(Tv), T(Th)      Action predicate that is learned via policy gradient
moveLeft()       X(Tv), Y(Th), Z(Tv), T(Th)      Action predicate that is learned via policy gradient
moveRight()      X(Tv), Y(Th), Z(Tv), T(Th)      Action predicate that is learned via policy gradient
moveNOOP()       X(Tv), Y(Th), Z(Tv), T(Th)      Action predicate that is learned via policy gradient

REFERENCES

[1] R. Collobert and J. Weston, “A unified architecture for natural language processing: Deep neural networks with multitask learning,” in Proceedings of the 25th international conference on Machine learning, ACM, 2008, pp. 160–167.

[2] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105.

[3] G. E. Dahl, D. Yu, L. Deng, and A. Acero, “Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition,” IEEE Transactions on audio, speech, and language processing, vol. 20, no. 1, pp. 30–42, 2012.

[4] A. Graves, G. Wayne, and I. Danihelka, “Neural turing machines,” arXiv preprint arXiv:1410.5401, 2014.

[5] M. Andrychowicz and K. Kurach, “Learning efficient algorithms with hierarchical attentive memory,” arXiv preprint arXiv:1602.03218, 2016.

[6] A. Joulin and T. Mikolov, “Inferring algorithmic patterns with stack-augmented recurrent nets,” in Advances in neural information processing systems, 2015, pp. 190– 198.

[7] Ł. Kaiser and I. Sutskever, “Neural gpus learn algorithms,” arXiv preprint arXiv:1511.08228, 2015.

[8] M. Minsky and S. A. Papert, Perceptrons: An introduction to computational geometry. MIT press, 2017.

[9] P. D. Wasserman, Neural computing: theory and practice. Van Nostrand Reinhold Co., 1989.

[10] B. Steinbach and R. Kohut, “Neural networks–a model of boolean functions,” in Boolean Problems, Proceedings of the 5th International Workshop on Boolean Problems, 2002, pp. 223–240.

[11] S. Dzeroski, “Inductive logic programming in a nutshell,” Introduction to Statistical Relational Learning [16], 2007.

[12] A. Cropper and S. H. Muggleton, Metagol system, https://github.com/metagol/metagol, 2016. [Online]. Available: https://github.com/metagol/metagol.

[13] A. Cropper and S. H. Muggleton, “Logical minimisation of meta-rules within meta-interpretive learning,” in Inductive Logic Programming, Springer, 2015, pp. 62–75.

[14] A. Tamaddoni-Nezhad, D. Bohan, A. Raybould, and S. Muggleton, “Towards machine learning of predictive models from ecological data,” in Inductive Logic Programming, Springer, 2015, pp. 154–167.

[15] A. Cropper and S. H. Muggleton, “Learning efficient logical robot strategies involving composable objects,” in Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.

[16] S. Hölldobler, Y. Kalinke, and H.-P. Störr, “Approximating the semantics of logic programs by recurrent neural networks,” Applied Intelligence, vol. 11, no. 1, pp. 45–58, 1999.

[17] S. Bader, A. S. d. Garcez, and P. Hitzler, “Computing first-order logic programs by fibring artificial neural networks.,” in FLAIRS Conference, 2005, pp. 314–319.

[18] K. Kersting, L. De Raedt, and T. Raiko, “Logical hidden markov models,” Journal of Artificial Intelligence Research, vol. 25, pp. 425–456, 2006.

[19] L. De Raedt and K. Kersting, “Probabilistic inductive logic programming,” in Probabilistic Inductive Logic Programming, Springer, 2008, pp. 1–27.

[20] S. Bader, P. Hitzler, and S. Hölldobler, “Connectionist model generation: A first-order approach,” Neurocomputing, vol. 71, no. 13-15, pp. 2420–2432, 2008.

[21] M. Guillame-Bert, K. Broda, and A. d. Garcez, “First-order logic learning in artificial neural networks,” in The 2010 International Joint Conference on Neural Networks (IJCNN), IEEE, 2010, pp. 1–8.

[22] M. V. França, G. Zaverucha, and A. S. d. Garcez, “Fast relational learning using bottom clause propositionalization with artificial neural networks,” Machine learning, vol. 94, no. 1, pp. 81–104, 2014.

[23] T. Rocktäschel and S. Riedel, “Learning knowledge base inference with neural theorem provers,” in Proceedings of the 5th Workshop on Automated Knowledge Base Construction, 2016, pp. 45–50.

[24] F. Yang, Z. Yang, and W. W. Cohen, “A differentiable approach to inductive logic programming,” 2016.

[25] J. Eisner, E. Goldlust, and N. A. Smith, “Dyna: A declarative language for implementing dynamic programs,” in Proceedings of the ACL 2004 on Interactive poster and demonstration sessions, Association for Computational Linguistics, 2004, p. 32.

[26] L. Serafini and A. d. Garcez, “Logic tensor networks: Deep learning and logical reasoning from data and knowledge,” arXiv preprint arXiv:1606.04422, 2016.

[27] R. Evans and E. Grefenstette, “Learning explanatory rules from noisy data,” Journal of Artificial Intelligence Research, vol. 61, pp. 1–64, 2018.

[28] S. Muggleton, “Inverse entailment and progol,” New generation computing, vol. 13, no. 3-4, pp. 245–286, 1995.

[29] H. Dong, J. Mao, T. Lin, C. Wang, L. Li, and D. Zhou, “Neural logic machines,” 2018.

[30] R. Srikant and R. Agrawal, Mining generalized association rules, 1995.

[31] S. Džeroski, “Relational data mining,” in Data Mining and Knowledge Discovery Handbook, Springer, 2009, pp. 887–911.

[32] M. Richardson and P. Domingos, “Markov logic networks,” Machine learning, vol. 62, no. 1-2, pp. 107–136, 2006.

[33] A. K. Debnath, R. L. Lopez de Compadre, G. Debnath, A. J. Shusterman, and C. Hansch, “Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity.,” Journal of medicinal chemistry, vol. 34, no. 2, pp. 786–797, 1991, ISSN: 0022-2623. DOI: 10.1021/jm00106a046.

[34] C. Bryant, S. Muggleton, C. Page, M. Sternberg, et al., “Combining active learning with inductive logic programming to close the loop in machine learning,” in AISB’99 Symposium on AI and Scientific Creativity, Citeseer, 1999, pp. 59–64.

[35] S. Džeroski, L. De Raedt, and H. Blockeel, “Relational reinforcement learning,” in International Conference on Inductive Logic Programming, Springer, 1998, pp. 11–22.

[36] S. Džeroski, L. De Raedt, and K. Driessens, “Relational reinforcement learning,” Machine learning, vol. 43, no. 1-2, pp. 7–52, 2001.

[37] C. J. Watkins and P. Dayan, “Q-learning,” Machine learning, vol. 8, no. 3-4, pp. 279– 292, 1992.

[38] H. Blockeel and L. De Raedt, “Top-down induction of first-order logical decision trees,” Artificial intelligence, vol. 101, no. 1-2, pp. 285–297, 1998.

[39] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing atari with deep reinforcement learning,” arXiv preprint arXiv:1312.5602, 2013.

[40] H. Van Hasselt, A. Guez, and D. Silver, “Deep reinforcement learning with double q-learning,” in Thirtieth AAAI Conference on Artificial Intelligence, 2016.

[41] A. Santoro, D. Raposo, D. G. Barrett, M. Malinowski, R. Pascanu, P. Battaglia, and T. Lillicrap, “A simple neural network module for relational reasoning,” in Advances in neural information processing systems, 2017, pp. 4967–4976.

[42] H. Kim, Y. Jiang, R. Rana, S. Kannan, S. Oh, and P. Viswanath, “Communication algorithms via deep learning,” arXiv preprint arXiv:1805.09317, 2018.

[43] E. Nachmani, Y. Be’ery, and D. Burshtein, “Learning to decode linear codes using deep learning,” in Communication, Control, and Computing (Allerton), 2016 54th Annual Allerton Conference on, IEEE, 2016, pp. 341–346.

[44] A. Payani and F. Fekri, “Decoding ldpc codes on binary erasure channels using deep recurrent neural-logic layers,” in Turbo Codes and Iterative Information Processing (ISTC), 2018 International Symposium On, IEEE, 2018.

[45] A. Hyvärinen and E. Oja, “Independent component analysis: Algorithms and applications,” Neural networks, vol. 13, no. 4-5, pp. 411–430, 2000.

[46] J.-i. Hirayama, A. Hyvärinen, and M. Kawanabe, “Splice: Fully tractable hierarchical extension of ica with pooling,” in Proceedings of the 34th International Conference on Machine Learning-Volume 70, JMLR.org, 2017, pp. 1491–1500.

[47] P. Brakel and Y. Bengio, “Learning independent features with adversarial nets for non-linear ica,” arXiv preprint arXiv:1710.05050, 2017.

[48] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” arXiv preprint arXiv:1511.06434, 2015.

[49] T.-W. Lee, “Independent component analysis,” in Independent component analysis, Springer, 1998, pp. 27–66.

[50] M. Davies, “Identifiability issues in noisy ica,” IEEE Signal processing letters, vol. 11, no. 5, pp. 470–473, 2004.

[51] G. R. Naik and D. K. Kumar, “An overview of independent component analysis and its applications,” Informatica, vol. 35, no. 1, 2011.

[52] M. Vosough, “Using mean field approach independent component analysis to fatty acid characterization with overlapped gc–ms signals,” Analytica chimica acta, vol. 598, no. 2, pp. 219–226, 2007.

[53] P. A. Højen-Sørensen, O. Winther, and L. K. Hansen, “Mean-field approaches to independent component analysis,” Neural Computation, vol. 14, no. 4, pp. 889–918, 2002.

[54] A. Payani and F. Fekri, “Unsupervised learning of independent components from a noisy and non-linear mixture via variational autoencoders,” in 2019 IEEE 20th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), 2019, pp. 1–5.

[55] A. Payani and F. Fekri, “Learning algorithms via neural logic networks,” submitted to ICML 2019, 2019.

[56] O. Vinyals, M. Fortunato, and N. Jaitly, “Pointer networks,” in Advances in Neural Information Processing Systems, 2015, pp. 2692–2700.

[57] W. Duch, “K-separability,” in International Conference on Artificial Neural Networks, Springer, 2006, pp. 188–197.

[58] E. M. Iyoda, H. Nobuhara, and K. Hirota, “A solution for the n-bit parity problem using a single translated multiplicative neuron,” Neural Processing Letters, vol. 18, no. 3, pp. 233–238, 2003.

[59] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” CoRR, vol. abs/1412.6980, 2014.

[60] M.-T. Luong, H. Pham, and C. D. Manning, “Effective approaches to attention-based neural machine translation,” arXiv preprint arXiv:1508.04025, 2015.

[61] K. Cho, B. Van Merriënboer, D. Bahdanau, and Y. Bengio, “On the properties of neural machine translation: Encoder-decoder approaches,” arXiv preprint arXiv:1409.1259, 2014.

[62] M. Sundermeyer, R. Schlüter, and H. Ney, “Lstm neural networks for language modeling,” in Thirteenth annual conference of the international speech communication association, 2012.

[63] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.

[64] M. Abadi, P. Barham, Chen, et al., “Tensorflow: A system for large-scale machine learning,” in 12th {USENIX} Symposium on Operating Systems Design and Implementation, 2016, pp. 265–283.

[65] Q.-T. Dinh, M. Exbrayat, and C. Vrain, “Generative structure learning for markov logic networks based on graph of predicates,” in Twenty-Second International Joint Conference on Artificial Intelligence, 2011.

[66] L. De Raedt, A. Kimmig, and H. Toivonen, “Problog: A probabilistic Prolog and its application in link discovery,” in IJCAI, Hyderabad, vol. 7, 2007, pp. 2462–2467.

[67] T. Khot, S. Natarajan, K. Kersting, and J. Shavlik, “Learning markov logic networks via functional gradient boosting,” in 2011 IEEE 11th International Conference on Data Mining, IEEE, 2011, pp. 320–329.

[68] T. Ribeiro, S. Tourret, M. Folschette, M. Magnin, D. Borzacchiello, F. Chinesta, O. Roux, and K. Inoue, “Inductive learning from state transitions over continuous domains,” in International Conference on Inductive Logic Programming, Springer, 2017, pp. 124–139.

[69] D. Dua and E. Karra Taniskidou, UCI machine learning repository, 2017. [Online]. Available: http://archive.ics.uci.edu/ml.

[70] A. Srinivasan, The aleph manual, 2001.

[71] F. Shakerin and G. Gupta, “Induction of non-monotonic logic programs to explain boosted tree models using lime,” arXiv preprint arXiv:1808.00629, 2018.

[72] D. Marbach, T. Schaffter, D. Floreano, R. J. Prill, and G. Stolovitzky, “The dream4 in-silico network challenge,” Draft, version 0.3, 2009.

[73] X. Zhang, K. Liu, Z.-P. Liu, B. Duval, J.-M. Richer, X.-M. Zhao, J.-K. Hao, and L. Chen, “Narromi: A noise and redundancy reduction technique improves accuracy of gene regulatory network inference,” Bioinformatics, vol. 29, no. 1, pp. 106–113, 2012.

[74] B. Yang, Y. Xu, A. Maxwell, W. Koh, P. Gong, and C. Zhang, “Micrat: A novel algorithm for inferring gene regulatory networks using time series gene expression data,” BMC systems biology, vol. 12, no. 7, p. 115, 2018.

[75] V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu, “Asynchronous methods for deep reinforcement learning,” in International conference on machine learning, 2016, pp. 1928–1937.

[76] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in neural information processing systems, 2017, pp. 5998–6008.

[77] V. Zambaldi, D. Raposo, A. Santoro, V. Bapst, Y. Li, I. Babuschkin, K. Tuyls, D. Reichert, T. Lillicrap, E. Lockhart, et al., “Relational deep reinforcement learning,” arXiv preprint arXiv:1806.01830, 2018.

[78] M. Van Otterlo, “A survey of reinforcement learning in relational domains,” Centre for Telematics and Information Technology (CTIT) University of Twente, Tech. Rep, 2005.

[79] Z. Jiang and S. Luo, Neural logic reinforcement learning, 2019. arXiv: 1904.10729 [cs.LG].

[80] A. Narayanan, M. Chandramohan, R. Venkatesan, L. Chen, Y. Liu, and S. Jaiswal, “Graph2vec: Learning distributed representations of graphs,” arXiv preprint arXiv:1707.05005, 2017.

[81] M. Allamanis, M. Brockschmidt, and M. Khademi, “Learning to represent programs with graphs,” arXiv preprint arXiv:1711.00740, 2017.

[82] A. Hill, A. Raffin, M. Ernestus, A. Gleave, A. Kanervisto, R. Traore, P. Dhariwal, C. Hesse, O. Klimov, A. Nichol, M. Plappert, A. Radford, J. Schulman, S. Sidor, and Y. Wu, Stable baselines, https://github.com/hill-a/stable-baselines, 2018.

[83] P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, et al., “Relational inductive biases, deep learning, and graph networks,” arXiv preprint arXiv:1806.01261, 2018.

[84] M. Hessel, J. Modayil, H. van Hasselt, T. Schaul, G. Ostrovski, W. Dabney, D. Horgan, B. Piot, M. Azar, and D. Silver, Rainbow: Combining improvements in deep reinforcement learning, 2017. arXiv: 1710.02298 [cs.AI].

[85] A. Payani and F. Fekri, “Learning algorithms via neural logic networks,” in submitted to Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019.

[86] R. Gallager, “Low-density parity-check codes,” IRE Transactions on information theory, vol. 8, no. 1, pp. 21–28, 1962.

[87] G. Li and Z. Zeng, “A neural-network algorithm for solving nonlinear equation systems,” in Computational Intelligence and Security, 2008. CIS’08. International Conference on, IEEE, vol. 1, 2008, pp. 20–23.

[88] D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” arXiv preprint arXiv:1312.6114, 2013.

[89] D. J. Rezende and S. Mohamed, “Variational inference with normalizing flows,” arXiv preprint arXiv:1505.05770, 2015.

[90] D. P. Kingma, T. Salimans, R. Jozefowicz, X. Chen, I. Sutskever, and M. Welling, “Improved variational inference with inverse autoregressive flow,” in Advances in neural information processing systems, 2016, pp. 4743–4751.

[91] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., “Tensorflow: A system for large-scale machine learning,” in 12th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 16), 2016, pp. 265–283.

[92] N. Bostrom, Superintelligence: Paths, dangers, strategies, reprint ed, 2016.

[93] S. Russell, “Should we fear supersmart robots?” Scientific American, vol. 314, no. 6, pp. 58–59, 2016.

[94] D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, and D. Mané, “Concrete problems in ai safety,” arXiv preprint arXiv:1606.06565, 2016.

[95] R. Akrour, M. Schoenauer, M. Sebag, and J.-C. Souplet, “Programming by feedback,” in International Conference on Machine Learning, vol. 32, 2014, pp. 1503–1511.

[96] P. M. Pilarski, M. R. Dawson, T. Degris, F. Fahimi, J. P. Carey, and R. S. Sutton, “Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning,” in 2011 IEEE International Conference on Rehabilitation Robotics, IEEE, 2011, pp. 1–7.

[97] L. El Asri, B. Piot, M. Geist, R. Laroche, and O. Pietquin, “Score-based inverse reinforcement learning,” in Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, International Foundation for Autonomous Agents and Multiagent Systems, 2016, pp. 457–465.

[98] S. I. Wang, P. Liang, and C. D. Manning, “Learning language games through interaction,” arXiv preprint arXiv:1606.02447, 2016.

[99] P. F. Christiano, J. Leike, T. Brown, M. Martic, S. Legg, and D. Amodei, “Deep reinforcement learning from human preferences,” in Advances in Neural Information Processing Systems, 2017, pp. 4299–4307.

[100] J. García and F. Fernández, “A comprehensive survey on safe reinforcement learning,” Journal of Machine Learning Research, vol. 16, no. 1, pp. 1437–1480, 2015.

[101] M. Sato, H. Kimura, and S. Kobayashi, “Td algorithm for the variance of return and mean-variance reinforcement learning,” Transactions of the Japanese Society for Artificial Intelligence, vol. 16, no. 3, pp. 353–362, 2001.

[102] P. Geibel and F. Wysotzki, “Risk-sensitive reinforcement learning applied to control under constraints,” Journal of Artificial Intelligence Research, vol. 24, pp. 81–108, 2005.

[103] T. M. Moldovan and P. Abbeel, “Safe exploration in markov decision processes,” arXiv preprint arXiv:1205.4810, 2012.

[104] F. Berkenkamp, M. Turchetta, A. Schoellig, and A. Krause, “Safe model-based reinforcement learning with stability guarantees,” in Advances in neural information processing systems, 2017, pp. 908–918.

[105] B. Eysenbach, S. Gu, J. Ibarz, and S. Levine, “Leave no trace: Learning to reset for safe and autonomous reinforcement learning,” arXiv preprint arXiv:1711.06782, 2017.

[106] G. Dalal, K. Dvijotham, M. Vecerik, T. Hester, C. Paduraru, and Y. Tassa, “Safe exploration in continuous action spaces,” arXiv preprint arXiv:1801.08757, 2018.

[107] J. Leike, M. Martic, V. Krakovna, P. A. Ortega, T. Everitt, A. Lefrancq, L. Orseau, and S. Legg, “Ai safety gridworlds,” arXiv preprint arXiv:1711.09883, 2017.
