LEARNING TO REPRESENT AND REASON UNDER LIMITED SUPERVISION

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Aditya Grover
August 2020

© 2020 by Aditya Grover. All Rights Reserved.
Re-distributed by Stanford University under license with the author.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.
http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/jv138zg1058

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Stefano Ermon, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Moses Charikar

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Jure Leskovec

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Eric Horvitz

Approved for the Stanford University Committee on Graduate Studies.
Stacey F. Bent, Vice Provost for Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.

Abstract

Natural agents, such as humans, excel at building representations of the world and using them to effectively draw inferences and make decisions. Critically, the development of such advanced reasoning capabilities can occur even with limited supervision.
In stark contrast, the major successes of machine learning (ML)-based artificial agents are primarily in tasks that have access to large labelled datasets or simulators, such as object recognition and game playing. This dissertation focuses on probabilistic modeling frameworks that shrink this gap between natural and artificial agents and thus enable effective reasoning in supervision-constrained scenarios.

This dissertation comprises three parts. First, we formally lay the foundations for learning probabilistic generative models. The goal here is to simulate any available data, thus providing a natural learning objective even in settings with limited supervision. We discuss various trade-offs involved in learning and inference in high dimensions using these models, including the specific choice of learning objective, optimization procedure, and parametric model family. Building on these insights, we develop new algorithms to boost the performance of state-of-the-art models and mitigate their biases when trained on large uncurated and unlabelled datasets. Second, we extend these models to learn feature representations for relational data. Learning these representations is unsupervised, and we demonstrate their utility for classification and sequential decision making. Third and last, we present two real-world applications of these models for accelerating scientific discovery: (a) learning data-dependent priors for compressed sensing, and (b) designing experiments for optimizing charging in electric batteries.

Together, these contributions enable ML systems to overcome the critical supervision bottleneck for high-dimensional inference and decision making problems in the real world.

Acknowledgments

First and foremost, I would like to thank my PhD advisor, Stefano Ermon. As one of Stefano’s first students, I have been fortunate to cherish a long list of experiences with him.
My rollercoaster PhD, involving diverse research projects and teaching a new course to hundreds of students among many other endeavors, would not have been possible without Stefano’s inspiring work ethic, patient advising, and trademark clarity in thought. In all these years, it amazes me how every research meeting I have with Stefano ends on a personal note of optimism and intellectual satisfaction. Thank you for sharing your magic with me; I will treasure it forever.

My experience at Stanford would be incomplete without the company I enjoyed in the Ermon group. Neal instantaneously became a lifelong trusted friend (and personal gym trainer); Jonathan and Tri exemplify humility (and solving convex optimization problems); Shengjia, Tony, and Yang never fail to bring a smile (and write great papers for every deadline); Kristy and Rui double up as my creative alter egos (and Mixer buddies). Some who left early on continue to be great friends: Jon, Michael, Russell, Tudor, Vlad, and the newer ones continue to make the group as vibrant as ever: Andy, Burak, Chris, Kuno, Lantao, Ria. Outside of the Ermon group, I will also dearly miss hanging out with the many friends at StatsML Group, InfoLab, Stanford Data Science Institute, and Stanford AI Lab.

I have also had the privilege of standing on the shoulders of amazing mentors and collaborators over the years: Ankit Anand, Mausam, and Parag Singla tricked me into the joys of research when I was an undergraduate at IIT Delhi; Christopher Ré, Greg Valiant, Jure Leskovec, Moses Charikar, Noah Goodman, Percy Liang, and Stephen Boyd were generous with their insightful feedback and time during rotations and oral exams at Stanford; Alekh Agarwal, Ashish Kapoor, Ben Poole, Dustin Tran, Eric Horvitz, Harri Edwards, Ken Tran, Kevin Murphy, Maruan Al-Shedivat, and Yura Burda broadened my research perspectives in fun summer internships at Google, Microsoft, and OpenAI.
I have also enjoyed mentoring junior students who helped me pursue new directions in research: Aaron Zweig, Chris Chute, Eric Wang, Manik Dhar, and Todor Markov. A shoutout to Will Chueh and Peter Attia, my co-leads on the four-year-long battery project. Will’s investment in my research and career success coupled with Peter’s positivity, perseverance, and hard work makes for a dream collaboration. I am also deeply thankful to all my other collaborators, administrators, and funding agencies for keeping the show running.

It is rightly said that friends are the family you choose. Thanks to my friends from India who have stayed in touch even after I moved to the US. And I would be remiss in not acknowledging friends from Stanford whom I have interacted with regularly in the last five years: Jayesh for his excellent taste in spices and Netflix shows and being the most helpful roommate I could have hoped for, Hima & Pracheer for being my no-filter friends on any topic of conversation from academia to Bollywood, Daniel for all the laughs, vents, and immigrant walks at Bytes and Coupa, and Aditi, Anmol & Vivek for allowing me to be myself without judgement. Thanks to many other friends and extended family who have been an integral part of this journey; I could not overstate their importance in lifting me during the lows, cheering for my highs, and being a part of memories that will last a lifetime.

Finally, I would like to thank my brother, Abhinav, for being my best friend since childhood and my sister-in-law, Vaishali, for being a pillar of support to our family. Last and definitely the most, I will forever be indebted to my parents, Manila & Vimal, for their unconditional love and unfaltering care for my happiness and well-being. This one is for you.

To my parents, Manila and Vimal Grover.

Contents

Abstract
Acknowledgments
1 Introduction
    1.1 Approach
        1.1.1 Foundations of Probabilistic Generative Modeling
        1.1.2 Relational Representation and Reasoning
        1.1.3 Applications in Science and Society
    1.2 Related Research
    1.3 Dissertation Overview
I Foundations of Probabilistic Generative Modeling
2 Background
    2.1 Problem Setup
    2.2 Learning & Inference
    2.3 Deep Generative Models
        2.3.1 Energy-based Models
        2.3.2 Autoregressive Models
        2.3.3 Variational Autoencoders
        2.3.4 Normalizing Flows
        2.3.5 Generative Adversarial Networks
3 Trade-offs in Learning & Inference
    3.1 Introduction
    3.2 Flow Generative Adversarial Networks
    3.3 Empirical Evaluation
        3.3.1 Log-likelihoods vs. sample quality
        3.3.2 Comparison with Gaussian mixture models
        3.3.3 Hybrid learning of Flow-GANs
    3.4 Explaining log-likelihood trends
    3.5 Discussion & Related Work
    3.6 Conclusion
4 Model Bias in Generative Models
    4.1 Introduction
    4.2 Preliminaries
    4.3 Likelihood-Free Importance Weighting
    4.4 Boosted Energy-Based Model
    4.5 Empirical Evaluation
        4.5.1 Goodness-of-fit testing
        4.5.2 Data augmentation for multi-class classification
        4.5.3 Model-based off-policy policy evaluation