Unstructured! Overlay!Networks!
Total Page:16
File Type:pdf, Size:1020Kb
! UNIVERSIDADE!TÉCNICA!DE!LISBOA! INSTITUTO!SUPERIOR!TÉCNICO! ! ! ! ! ! ! ! ! ! ! Topology!Management!for!Unstructured! Overlay!Networks! ! ! ! João$Carlos$Antunes$Leitão$ ! ! Supervisor:!Doctor!Luís!Eduardo!Teixeira!Rodrigues! ! Thesis!approved!in!public!session!to!obtain!the!PhD!Degree!in! Information!Systems!and!Computer!Engineering! ! Jury!final!classification:!Pass!with!Distinction! ! Jury:! ! Chairperson:!Chairman!of!the!IST!Scientific!Board! ! Members!of!the!Committee:! Doctor!Luís!Eduardo!Teixeira!Rodrigues! Doctor!AnneMMarie!Kermarrec! Doctor!Teresa!Maria!Sá!Ferreira!Vazão!Vasques! Doctor!Henrique!João!Lopes!Domingos! ! ! ! ! ! ! ! !! 2012! ! ! UNIVERSIDADE!TÉCNICA!DE!LISBOA! INSTITUTO!SUPERIOR!TÉCNICO! ! ! ! ! ! Topology!Management!for!Unstructured! Overlay!Networks! ! ! João$Carlos$Antunes$Leitão$ ! Supervisor:!Doctor!Luís!Eduardo!Teixeira!Rodrigues! ! Thesis!approved!in!public!session!to!obtain!the!PhD!Degree!in! Information!Systems!and!Computer!Engineering! ! Jury$final$classification:!Pass!with!Distinction! ! Jury:$ ! Chairperson:!Chairman!of!the!IST!Scientific!Board! ! Members$of$the$Committee:$ ! Doctor!Luís$Eduardo$Teixeira$Rodrigues,!Professor'Catedrático'do'Instituto' Superior'Técnico,'da'Universiade'Técnica'de'Lisboa! Doctor!AnneCMarie$Kermarrec,!Directrice'de'recherce,'do'Institut'National'de' Recherche'en'Informatique'et'en'Automatique,'France! Doctor!Teresa$Maria$Sá$Ferreira$Vazão$Vasques,!Professora'Associada'do' Instituto'Superior'Técnico,'da'Universidade'Técnica'de'Lisboa' Doctor!Henrique$João$Lopes$Domingos,!Professor'Auxiliar'da'Faculdade'de' Ciências'e'Tecnologia,'da'Universidade'Nova'de'Lisboa' ' Funding$Institutions:$ FCT:!Fundação!para!a!Ciência!e!Tecnologia! INESCQID:!Instituto!de!Engenharia!de!Sistemas!e!Computadores!Investigação!e! Desenvolvimento!em!Lisboa! ! ! ! ! !! ! 2012$ Acknowledgements The work described in this pages would never be possible to be conducted without the support of many people, to whom I am in debt. First of all, I have to thank my advisor, Professor Lu´ıs Rodrigues, for providing me all the conditions and the environment to explore new ideas, and to look for excellence in my research. His motivation, his good nature, and his principles remain to this day an inspiration, and the opportunities that he offered me to real understand what is to be a researcher were essential to make me the person I am today. Evidently I have to thank all my co-authors, Robbert van Renesse, Jose´ Orlando Pereira, Benoit Garbinato, Mouna Allani, Rui Oliveira, Nuno A. Carvalho, Liliana Rosa, Joao˜ Alveir- inho, Mario´ Ferreira, Joao˜ Marques, and Joao˜ Paiva, for the opportunity to collaborate and learn from them. Working with them has provided me the opportunity to broaden the spec- trum of my research and assisted me in growing as a scientist. The Distributed Systems Group (GSD) of INESC-ID was, and still is, a great environment to work. Great environments are a result of the people that make it. Therefore, I wish to thank the present and past elements of the GSD group with whom I have shared ideas and experiences for more than four years. With no particular order (and hopping not to forget anyone) my sincere thanks to (in no particular order): Andre´ Pessoa, Carlos Ribeiro, Carlos Torrao,˜ Cristina Fonseca, Diogo Monica,´ Diogo Paulo, Jose´ Mocito, Joao˜ Barreto, Joao˜ Ferreira, Joao˜ Garcia, Joao˜ Matos, Joao˜ Nuno Silva, Lu´ıs Veiga, Maria Couceiro, Mauro Silva, Miguel Branco, Miguel Correia, Nuno Carvalho, Nuno Machado, Oksana Denysyuk, Paolo Romano, Paulo Ferreira, Pedro Pedrosa, Rodrigo Rodrigues, Rui Joaquim, Sergio´ Garrau, and Xavier Vilac¸a. The INESC-ID laboratory, where I have conducted the work presented in the thesis, has a great work environment. It would never be possible to list everyone in INESC-ID to whom I feel grateful. However, a special word of appreciation to David Matos, Lu´ısa Coheur, and Rito Silva, for their good spirits, help, discussions, and friendship. Also, a big thanks for the administrative people of INESC-ID which were always helpful and many times have brighten my days, in particular and with no particular order: Vanda Fidalgo, Teresa Mimoso, Manuela Sado, Paula Monteiro, Elisabete Rodrigues, and Aurelia´ Constantino. A special word of appreciation to all my friends (listing them all would be impossible), that supported me, helped me, and really were patient enough to stay by my side when I complained about the hardships of life. A special word of appreciation to Alexandre Ch´ıcharo that also revised the text presented here. Also, a big thanks to my family, in particular to my mother Natalina Leitao,˜ and my brother Paulo Leitao˜ that always without any doubt supported me and my decisions. A special word for my late grandfather Manuel Claudino, which every other week would ask me “Have you finished school yet?”. I never really explained to him that I was not attending school as he thought in his mind, but I know that if I could say to him now “I’m done!” he would have given me his best smile. His intelligence and his capacity to continually work and make efforts will never stop being an inspiration. Evidently, the work here was also affected by so many other people in my life, even if in different measures. Because of that I have to thank my ex-colleagues in CIISEG, the LaSIGE laboratory at FCUL (in particular my old room companions, in particular Hugo Bastos, and the elements of the now extinct DiALNP research group) for everything, and my undergraduate teachers at FCUL. Also, my thanks to the member of the Zen Shin Kan Iaido Clube de Lisboa (and my Sensei Joaquim Mendes in particular), for the excellent environment they provided for me to let go some steam during the course of my PhD. In particular, to Pedro Gomes and Maria Lu´ıs for everything inside and outside the dojo. A final word of appreciation to the Fundac¸ao˜ para a Cienciaˆ e Tecnologia (FCT), for their financial support. The work presented in the thesis was supported by the Fundac¸ao˜ para a Cienciaˆ e Tecnologia (FCT) through: The PhD grant with reference “SFRH/BD/35913/2007”; • The FCT founded project “Redico” (PTDC/EIA/71752/2006); • The FCT founded project “HPCI” (PTDC/EIA-EIA/102212/2008); • The FCT founded project “ADAAS” (CMU-PT/ELE/0030/2009); • The INESC-ID multi-annual funding through the PIDDAC Program fund grant; • To all those that contributed to the work described in these pages, and to all those that will read this in the future and find it useful. Abstract The peer-to-peer (P2P) paradigm has emerged as a viable alternative to overcome limitations of the client-server model namely, in terms of scalability, fault-tolerance, and even operational costs. This paradigm has gained significant popularity with its successful application in the context of file sharing applications. The success of these applications is illustrated by systems such as Napster, Emule, Gnutella, and recently, BitTorrent. In order to ensure the scalability of these solutions many P2P services operate on top of unstructured overlay networks, which are logical networks deployed at the application level. Unstructured overlay networks establish random neighboring associations among participants of the system. Although the random nature of these overlays is desirable by many P2P services, the resulting topology may present sub-optimal characteristics, for instance from the point of view of link latency. This may have a significative impact of the performance of P2P services executed over these overlays. This thesis focuses on the design and evaluation of techniques that manage the topology of unstructured overlay networks, to understand if and how it is possible to manage the topology of unstructured overlays in such a way that the random nature and low overhead that charac- terizes these overlays is not lost, while being able to impose some relaxed constraints over the topology to benefit the operation of specific P2P services. To answer this question, four different approaches to manage the topology of unstructured overlays are proposed and evaluated in the thesis. Each approach is evaluated in the context of a distinct P2P service that serves as a case study for measuring its benefits. In more detail, this thesis presents: i) CellFarm, a new overlay that combines properties of unstructured and structured overlays, to achieve a highly resilient topology composed of cliques of nodes, highly connected among themselves, that supports efficient replication for P2P systems; ii) X-BOT, a protocol to bias the topology of unstructured overlay networks given any criteria X; iii) Thicket, a protocol that efficiently embeds multiple interior-node-disjoint trees over a single unstructured overlay; and finally, iv) OpenFire, a protocol for balancing rumor mongering exchanges in networks populated by Firewalls and NAT-boxes. Resumo O paradigma entre-pares (P2P, do Inglesˆ peer-to-peer) surge como uma alternativa viavel´ para ultrapassar as limitac¸oes˜ do modelo cliente-servidor nomeadamente, no que diz respeito a` capacidade de escala, toleranciaˆ a falhas, e ate´ mesmo em termos de custos monetarios´ op- eracionais. Este paradigma ficou popularizado com o sucesso de aplicac¸oes˜ de partilha de ficheiros. O sucesso e relevanciaˆ desta classe de aplicac¸oes˜ e´ ilustrado por exemplos como os sistemas Napster, Emule, Gnutella, e mais recentemente o protocolo de distribuic¸ao˜ entre-pares BitTorrent. De forma a garantir a capacidade de escala dos sistemas baseados no paradigma entre-pares, e´ frequente estes sistemas operarem sobre uma rede sobreposta nao˜ estruturada: uma rede logica´ estabelecida ao n´ıvel da aplicac¸ao.˜ As redes sobrepostas nao˜ estruturadas esta- belecem relac¸oes˜ de vizinhanc¸a aleatorias´ entre os participantes do sistema. Apesar da natureza aleatoria´ destas redes ser util´ a` operac¸ao˜ de varios´ servic¸os entre-pares, a topologia resultante pode apresentar um conjunto de caracter´ısticas sub optimas´ por exemplo, a topologia da rede sobreposta pode ser definida, maioritariamente, por ligac¸oes˜ que apresentam elevada latencia.ˆ Consequentemente, a topologia das redes sobrepostas pode ter um impacto significativo no desempenho de servic¸os entre-pares executados sobre estas.