Spare a Little Change? Towards a 5-Nines Internet in 250 Lines of Code
Total Page:16
File Type:pdf, Size:1020Kb
Spare a Little Change? Towards a -Nines Internet in Lines of Code Mukesh Agrawal CMU-CS-- May School of Computer Science Carnegie Mellon University Pittsburgh, PA Thesis Committee: David G. Andersen Bruce M. Maggs, Duke University Srinivasan Seshan, Chair Hui Zhang Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy. Copyright © Mukesh Agrawal This research was sponsored by the National Science Foundation under grant numbers CNS-, ANI-, and ANI-; the Air Force Research Laboratory under grant number F---; the Pittsburgh Digital Greenhouse; and AT&T. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government or any other entity. Keywords: Internet reliability, BGP performance, Quagga This document includes excerpts of the source code for the Linux operating system kernel, and the Quagga routing software suite. These programs are copyrighted by their authors; excerpts are reproduced here solely for scholarly purposes. To my parents मेरे सब उपलिध आप का योछावर पर हआु and नानाजी और नानीजी को दनयाु म कतना दूर लेकन दल म हमेशा पास iii iv Abstract From its beginnings as a single link between two research institutions in , the Internet has grown in size and scope, to become a global internetwork connecting over million computers, and . billion users. No longer a niche facility for scientific col- laboration, the Internet now touches the lives of the world’s population, irrespective of their occupation or geography. It is used by people the world over, to pay bills, read the news, listen to music, watch videos, telephone or video-conference friends and family, and much more. The Internet is the premier communications network of our age. Unfortunately, however, there are some respects in which the Internet lags the net- works it replaces. In particular, with respect to reliability, the Internet falls far short of the Public Switched Telephone Network which proceeded it. Whereas the PSTN sought, and often delivered the vaunted “five nines” of reliability, the Internet strug- gles to compete. As for the cause of this reliability shortfall, available evidence indicates that much of the shortfall is due to the unreliability of IP routers themselves. Given the importance of a reliable Internet to contemporary society, vendors and researchers have proposed a number of solutions to either improve the reliability of individual IP routers, or to make networks more resilient to the unavailability of a single router. While having some promise, these existing solutions face significant obstacles to widespread deployment. Thus, in this dissertation, we endeavor to find or construct a practical, readily deployable, method for mitigating the outages caused by IP routers. To achieve our goal, we take inspiration from previous proposals, which advocated the use of link migration. These proposals improve network resilience, by moving links away from a failed (or failing) router, to an in-service router. To understand the con- straints of a practical solution, and resolve the limitations of previous proposals, we conduct extensive experimentation, and study source code and protocol specifications. Using the insights produced by these studies, we construct a practical, readily deploy- able migration solution with sub-second outage times. v vi Acknowledgments This dissertation could not, and would not, have come to pass, but for for the abundant help, guid- ance, and support I have been so fortunate to receive over the past several years. And so, it brings me much joy to finally have the ideal occasion to give thanks, to the many who helped me along the way. I offer my sincere thanks to all those that follow, and my humble apologies to those who I might, quite regrettably, neglect to mention. Apart from myself, I suspect no one has invested more time in the work leading to this disser- tation than my thesis advisor, Srini Seshan. He has shared his time generously for the past nine years. For that alone, I am deeply indebted. In addition, Srini has made the effort, time and again, to find problems that matched my interests and skills. For that, as well, I am truly grateful. Srini has always set high expectations. Indeed, some goals seemed simply unattainable at the outset. But such goals were set with the understanding that, if they were truly impossible, an explanation of why they were so would suffice. It is an approach that is simultaneously idealistic, and pragmatic. And it is has served me well — both in research, and beyond. Perhaps, then, the best way to understand limits is to push them… There are many factors that contribute to research success, some more obvious than others. Perhaps one of the most under-appreciated is a solid methodology. Thus, for the successes I have had, I must thank those who helped develop my approach to research. In addition to my research advisors, these include teachers and colleagues, from primary school through grad school. For contributing to my research methods, I thank specifically: Juanita Mitchell, who empha- sized the importance of meticulousness in her science class; Doug Collar, whose insisted on good note-taking in his literature class; Bianca Schroeder, who taught me how to tease insights from in- timidatingly large log files; David Nagle, who advised me to work every day, and to work neither too little, nor too much; and Sean Rhea, who taught me that when stuck, it is better to take a walk than to bang one’s head against the proverbial wall. As many before me, I had no small amount of doubt during graduate school. For helping me through those doubting moments, I thank Michael Abesamis, Luis von Ahn, Aditya Akella, Rajesh Balan, Nikhil Bansal, Ashwin Bharambe, Sharon Burks, Shuchi Chawla, Beatrice Chen, Urs Hengartner, John Langford, Kate Larson, Bruce Maggs, Amit Manjhi, Mahim Mishra, David Nagle, Suman Nath, Yamuna Raju, Bianca Schroeder, Srini Seshan, Maverick Woo, and Jeannette Wing. I thank especially Rajesh Balan, Amit Manjhi, Suman Nath, and Srini Seshan for their kind and sustained encouragement to complete my dissertation. The Computer Science Department at Carnegie Mellon is an amazing place to study, and con- duct research. Beyond the caliber of research talent, and the first-rate facilities, lurk two less well- known, but no less important, treasures. These are the administrative staff, and, the not entirely unrelated department culture. Sharon Burks, Deborah Cavlovich, Catherine Copetas, Tracy Far- bacher, Barbara Grandillo, Karen Lindenfelser, Angela Miller, and others took care of administra- tive matters, so that I never had to. vii When I returned to Pittsburgh in the Spring of , Deb even had the foresight to find me an office literally two doors down from my advisor, so that I could easily drop in with quick ques- tions. That Deb foresaw what I would need, when I myself had not, is truly remarkable, and quite appreciated. I also thank Catherine and Karen taking the time to welcome me back during that Spring, despite my having been out of touch for so long. The administrative staff takes the lead in creating a culture that values students not just as stu- dents, but as people with interests and needs beyond those of classwork or research. The students themselves build on this foundation, to provide a welcoming, supportive, and fun-loving social en- vironment. In particular, the members of D/5 volunteer their time to organize non-work events, to help balance the lives of their fellow students. Thank you to Chris Colohan, Francisco Pereira, Patrick Riley, Ted Wong, Martin Zinkevich, and many others. While this work owes much to my time, and my mentors, at Carnegie Mellon, I would be remiss not to mention my mentors from before Carnegie Mellon, and from summers away from Carnegie Mellon. From my time as a graduate student at the University of Michigan, I thank Farnam Jaha- nian, Rob Malan, and Brian Noble. Farnam gave me my start in research, and he, along with Rob, and Brian, encouraged me to continue towards the doctoral degree. From my summers at IBM Research, I thank Dilip Kandlur, Anees Shaikh, and Dinesh Verma. And from my time at AT&T Labs, I thank Bobbi Bailey, Albert Greenberg, Charles Kalmanek, and Jennifer Yates. My work has, I believe, an unusually large empirical component. While that is my natural tendency, it took others to help me appreciate its value. Farnam and Anees appreciated my exper- imental skills long before I could do so myself. Max Poletto, my first manager at Meraki, allowed me to pursue a similar approach in my development work there. It was my first opportunity to ex- periment with, and then apply experimental insights to improve, a production. Successes in those endeavors led me to trully believe in, and fully embrace, the value of experimental work. Thank you, Farnam, Anees, and Max. The results presented herein come from no fewer than one hundred experiments, of at least ten trials each, conducted between March and Novemeber of . For the laboratory that made these experiments possible, I thank the late Jay Lepreau, and the dedicated, talented, and visionary team at the Utah Network Testbed. Their work, and their vision, has dramatically expanded the limits of what is possible in empirical Computer Science. I hope that my work here has, in its own small way, demonstrated the power and value of their vision. Good work cannot be appreciated without good presentation. So I thank those who have im- proved my presentation skills immeasurably over the past many years. I thank Mor Harchol-Balter, my first advisor at Carnegie Mellon, for emphasizing the importance of easy to read, visually ap- pealing graphics.