Learning Algorithms with Applications to Robot Navigation and Protein Folding

by

Mona Singh

S.M., Computer Science, Harvard University (1989)
A.B., Computer Science, Harvard University (1989)

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology, September 1995.

© Massachusetts Institute of Technology 1995

Signature of Author: Department of Electrical Engineering and Computer Science, September 1995

Certified by: Ronald L. Rivest, Professor of Computer Science, Thesis Supervisor

Certified by: Bonnie A. Berger, Assistant Professor of Mathematics, Thesis Supervisor

Accepted by: Frederic R. Morgenthaler, Committee on Graduate Students

Learning Algorithms with Applications to Robot Navigation and Protein Folding

by Mona Singh

Submitted to the Department of Electrical Engineering and Computer Science in September 1995, in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Abstract

We consider three problems in machine learning:

* concept learning in the PAC model
* mobile robot environment learning
* learning-based approaches to protein folding prediction

In the PAC framework, we give an efficient algorithm for learning any function on k terms by general DNF. On the other hand, we show that in a well-studied restriction of the PAC model where the learner is not allowed to use a more expressive hypothesis (such as general DNF), learning most symmetric functions on k terms is NP-hard.

In the area of mobile robot environment learning, we introduce the problem of piecemeal learning of an unknown environment. The robot must learn a complete map of its environment while satisfying the constraint that it must periodically return to its starting position (for refueling, say). For environments that can be modeled as grid graphs with rectangular obstacles, we give two piecemeal learning algorithms in which the robot traverses a linear number of edges. For more general environments that can be modeled as arbitrary undirected graphs, we give a nearly linear algorithm.

The final part of the thesis applies machine learning to the problem of protein structure prediction. Most approaches to predicting local 3D structures, or motifs, are tailored towards motifs that are already well studied by biologists. We give a learning algorithm that is particularly effective in situations where large numbers of examples of the motif are not known. These are precisely the situations that pose significant difficulties for previously known methods. We have implemented our algorithm, and we demonstrate its performance on the coiled coil motif.

Thesis Supervisor: Ronald L. Rivest, Professor of Computer Science
Thesis Supervisor: Bonnie A. Berger, Assistant Professor of Mathematics
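To make the piecemeal constraint concrete, the following minimal Python sketch shows the kind of naive strategy the thesis improves upon: every newly discovered vertex gets its own out-and-back excursion from the starting position. This illustrates only the problem setup, not the wavefront, ray, or strip algorithms of Chapter 3; the names piecemeal_explore, neighbors, and budget are invented for the sketch.

    from collections import deque

    def piecemeal_explore(start, neighbors, budget):
        """Naive piecemeal exploration of an unknown undirected graph.

        neighbors(v) is the robot's only access to the environment:
        it returns the vertices adjacent to v, revealed only when the
        robot stands at v.  budget caps the number of edge traversals
        in a single excursion, after which the robot must be back at
        start (the refueling constraint).  Each new vertex at BFS
        depth d is visited by its own out-and-back trip costing 2*d
        traversals.  Returns (explored vertices, total traversals).
        """
        dist = {start: 0}          # BFS distance from the start
        queue = deque([start])
        explored = set()
        total = 0
        while queue:
            v = queue.popleft()
            if 2 * dist[v] > budget:
                raise ValueError("vertex out of excursion range: %r" % (v,))
            total += 2 * dist[v]   # walk out to v and back to start
            explored.add(v)
            for w in neighbors(v): # adjacency revealed on arrival at v
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        return explored, total

    # Example: a 4x4 grid with no obstacles, seen only through the oracle.
    def grid_neighbors(v):
        x, y = v
        steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
        return [(x + dx, y + dy) for dx, dy in steps
                if 0 <= x + dx < 4 and 0 <= y + dy < 4]

    visited, cost = piecemeal_explore((0, 0), grid_neighbors, budget=16)
    print(len(visited), cost)      # -> 16 96: far more than the grid's 24 edges

Because every excursion retraces a path from the start, this baseline traverses many more edges than the graph contains; the algorithms of Chapter 3 bring the total down to linear (for grid graphs with rectangular obstacles) or nearly linear (for general undirected graphs) in the number of edges.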
Acknowledgements

I thank my advisors Ron Rivest and Bonnie Berger for their ideas, help, and time. I have enjoyed working with each of them, and am grateful to them for all that they have taught me. Portions of this thesis are joint work with Ron and Bonnie, as well as with Margrit Betke, Avrim Blum, and Baruch Awerbuch. I am grateful to all of them for allowing me to include our joint work in this thesis.

Thanks to David Johnson for encouraging me to go to graduate school, Shafi Goldwasser for being on my thesis committee, Albert Meyer for much good advice, and the teachers at Indian Springs School for having a continuing impact on my life. Thanks to Peter Kim, David Wilson, and Ethan Wolf for help on the biology project. Thanks to the attendees of Ron's machine learning reading group for many helpful discussions. I am grateful to the National Science Foundation, Ron Rivest, Bonnie Berger, Baruch Awerbuch, Arvind, and CBCL for funding me at various points in my graduate career. Thanks to Scott Bloomquist, David Jones, and Bruce Dale for their help around MIT. Special thanks to Be Hubbard for interesting conversation, laughter, and much help.

Thanks to all the members of the MIT theory group for providing such a wonderful environment. Special thanks to Margrit Betke for being my officemate and good friend for all these years. Thanks to Javed Aslam, Ruth Bergman, Avrim Blum, Lenore Cowen, Scott Decatur, Aditi Dhagat, Bronwyn Eisenberg, Rainer Gawlick, Rosario Genarro, Lalita Jagdeesan, Joe Kilian, Dina Kravets, John Leo, Roberto Segala, Donna Slonim, Mark Smith, David Williamson, Ethan Wolf, and Yiqun Yin.

Thanks to my friends from the "outside" who have kept me sane. I especially thank Cindy Argo, Lisa Barnard, Kelly Bodnar, Anupam Chander, Lara Embry, Kyung In Han, Mike Hase, Pam Lipson, Lucy Song, Barbara Van Gorder, Robert Weaver, Tracey Drake Weber, and Irene Yu. Special thanks to Alisa Dougless, who has provided hours of entertainment while I've been at graduate school. I am grateful to Trevor Jim for being the best friend I could ever imagine.

Most of all, I am grateful to my family. I thank my uncles, aunts, and cousins for much help during difficult times. I especially thank my parents and my brothers Rajiv and Jimmy for their unbelievable courage, and for their continuing love and support. This thesis is dedicated to them.

Publication Notes

Chapter 2 is joint work with Avrim Blum. An extended abstract describing this work appeared in Proceedings of the Third Annual Workshop on Computational Learning Theory [24].

The first part of Chapter 3 is joint work with Margrit Betke and Ron Rivest. This research is described in the journal Machine Learning [20]. An extended abstract of this work also appeared in Proceedings of the Sixth Conference on Computational Learning Theory [19]. The second part of this chapter is joint work with Baruch Awerbuch, Margrit Betke, and Ron Rivest. An extended abstract describing this work appeared in Proceedings of the Eighth Conference on Computational Learning Theory [6].

Chapter 4 is joint work with Bonnie Berger, and has been submitted for publication.

Table of Contents

1 Introduction
2 Learning functions on k terms
  2.1 Introduction
  2.2 Notation and definitions
  2.3 The learning algorithm
    2.3.1 Decision lists
  2.4 Hardness results
  2.5 Conclusion
3 Piecemeal learning of unknown environments
  3.1 Introduction
  3.2 Related work
  3.3 Formal model
  3.4 Initial approaches to piecemeal learning
  3.5 Our approaches to piecemeal learning
    3.5.1 Off-line piecemeal learning
    3.5.2 On-line piecemeal learning
  3.6 Linear time algorithms for city-block graphs
    3.6.1 City-block graphs
    3.6.2 The wavefront algorithm
    3.6.3 The ray algorithm
  3.7 Piecemeal learning of undirected graphs
    3.7.1 Algorithm STRIP-EXPLORE
    3.7.2 Iterative strip algorithm
    3.7.3 A nearly linear time algorithm for undirected graphs
  3.8 Conclusions
4 Learning-based algorithms for protein motif recognition
  4.1 Introduction
  4.2 Further background
    4.2.1 Related work on protein structure prediction
    4.2.2 Previous approaches to predicting coiled coils
  4.3 The algorithm
    4.3.1 Scoring
    4.3.2 Computing likelihoods
    4.3.3 Randomized selection of the new database
    4.3.4 Updating parameters
    4.3.5 Algorithm termination
  4.4 Results
    4.4.1 The databases and test sequences
    4.4.2 Learning 3-stranded coiled coils
    4.4.3 Learning subclasses of 2-stranded coiled coils
    4.4.4 New coiled-coil-like candidates
  4.5 Conclusions
5 Concluding remarks
Bibliography

CHAPTER 1

Introduction

There are many reasons we want machines, or computers, to learn. A machine that can learn is able to use its experience to help itself in the future. Such a machine can improve its performance on some task after performing the task several times. This is useful for computer scientists, since it means we do not have to anticipate every possible scenario a machine might encounter. Such a machine is able to adapt to various conditions or environments, or even to changing environments.

A machine that is able to learn can also help push science forward. It may be able to speed up the learning process for humans, or it may be able to discern patterns or do things that humans are incapable of doing. For example, we may want to build a machine that can learn patterns that aid in medical diagnosis, or that may be able to learn how to understand and process speech. Or we might want to build an autonomous robot