Overlaymeter: Robust System-Wide Monitoring and Capacity-Based Search in Peer-To-Peer Networks
Total Page:16
File Type:pdf, Size:1020Kb
OverlayMeter: Robust System-wide Monitoring and Capacity-based Search in Peer-to-Peer Networks Inaugural-Dissertation zur Erlangung des Doktorgrades der Mathematisch-Naturwissenschaftlichen Fakultät der Heinrich-Heine-Universität Düsseldorf vorgelegt von Andreas Disterhöft aus Duschanbe in Tadschikistan Düsseldorf, März 2018 aus dem Institut für Informatik der Heinrich-Heine-Universität Düsseldorf Gedruckt mit der Genehmigung der Mathematisch-Naturwissenschaftlichen Fakultät der Heinrich-Heine-Universität Düsseldorf Berichterstatter: 1. Jun.-Prof. Dr.-Ing. Kalman Graffi 2. Prof. Dr. Michael Schöttner Tag der mündlichen Prüfung: 24.09.2018 Abstract In the last decade many peer-to-peer research activities have taken place. Applications using the peer-to-peer paradigm are present and their traffic, depending on the region, accounts fora significant proportion of the total traffic on the Internet. The defined goal of the systemsisto deliver a certain quality of service, which is a challenge in decentralized systems. This is due to the fact that participants have to make decisions based on their locally available information. In order to make the best decisions, a solid and extensive data basis is indispensable. For this purpose the literature on the field of p2p networks proposes monitoring the system, an approach that we follow in this work. Monitoring refers to the gathering and dissemination of system- and peer-specific data. This dissertation deals with open research questions for the improvement and extension of monitoring approaches. Furthermore, issues to simplify procedures for putting such peer-to-peer systems into operation are addressed in this work. In the first part we deal with monitoring procedures in the system-specific context. Here, we focus on tree-based procedures, as we have seen the biggest potential for this class, and tackle existing problems in their robustness. State-of-the-art tree-based approaches proposed in the literature are said to be highly precise but not robust, which is due to the dynamics of the users. As a consequence communication paths in the monitoring tree are disturbed, interrupted and in the worst case they have to be reconstructed. This leads to data loss which has a negative influence on the precision of the monitoring. Our answers to open research questions include the proposal of redundancy in a smart distribution function and the proposal of additional mechanisms that improve the perception of each participant in the monitoring structure. Thus, we present two methods that significantly increase the robustness, keep the precision on a consistently high level and have a good cost-benefit ratio. In addition to the improvement of existing monitoring systems for system-specific data, we treat phenomena that lead to an impairment of the monitoring result. The influence of malicious participants in monitoring procedures is not adequately analyzed in the literature and existing solutions for the general handling of such participants are complex and could open doors for further attack vectors. We analyze the influence of malicious participants who manipulate the result or do not adhere to protocol specifications, highlight the serious vulnerabilities, and present mechanisms for mitigation. In a comprehensive evaluation, we show that manipulation attacks can be limited to a convex hull based on outlier detection and attacks on the monitoring structure can be reduced by verifying their origin. Overall, the proposed extension for tree- based monitoring systems operates passively and is therefore versatile. On the other hand, we are pursuing a undisclosed field in monitoring; the incomplete participation of users in monitoring solutions. We identify the impact that such a behavior has on the accuracy of the monitoring and present a solution solving this problem. We focus on collecting the global state of the system and are therefore interested in the monitoring information of all participants. As a solution, we propose a generic middleware that reuses existing monitoring procedures. Here, we rely on an organization of the active participants in the middleware to measure the passive participants by probing them and feed captured information into the monitoring solution. The evaluation shows a high degree of precision, which among others depends on the precision of the probing methods. Moreover, we address the peer-specific data gathering and the capacity-based peer search. The goal is to create an efficient indexing structure that efficiently stores highly dynamic peers’ heterogeneous capacities in a distributed manner. We aim on an accurate peer search mechanism that obtains peers fulfilling set requirements. Such systems are motivated with the delegation of tasks and the realization of distributed computing in a decentralized context. The literature presents techniques that do not take all peers’ capacities into account during the search process, do not scale with the number of capacities or do not comply with the requirement of storing highly dynamic data. We propose two solutions that efficiently store highly dynamic peer capacities and distribute the load fairly among all participants. While our first solution globally sets the number of capacities to a constant, our second solution usesa dynamic approach. The proposed search processes perform quickly and precisely, so that they meet the demands. In the final part of this dissertation, we focus on a barrier which arises when using peer-to- peer software. It is the mandatory installation and set up of third-party software prior to running it. We suggest using the well-known web environment: open your browser and head to the requested service in the web. The new WebRTC standard offers the possibility to establish direct connections between browser instances, thus offering the opportunity of using peer-to-peer techniques in the context of the web. We create a chatting platform based on a peer-to-peer overlay, evaluate its performance and give an overview of possible problems of this new opportunity. In another work, we present an efficient distribution strategy of data that gives guarantees on the distribution time by estimating an upper bound. This method is used in a start-up to realize bandwidth savings in live streaming scenarios using the new standard and peer-to-peer techniques. We combine the presented methods to the so-called OverlayMeter, which provides a basis for extensive monitoring. This basis includes accurate and cost-effective system- and peer-specific data monitoring along with capacity-based peer search. The methods of the OverlayMeter can finally be utilized in various peer-to-peer applications to reliably and precisely monitor the network. iv Zusammenfassung Im letzten Jahrzehnt fanden viele Forschungsaktivitäten im Bereich von Peer-to-Peer-Systemen statt. Applikationen, die das Peer-to-Peer-Paradigma anwenden, sind gegenwärtig und deren Verkehrsaufkommen macht, abhängig von der Region, einen signifikanten Anteil am Gesamtver- kehrsaufkommen im Internet aus. Das definierte Ziel der Systeme ist die Schaffung einer gewis- sen Servicequalität, was in dezentralen Systemen eine Herausforderung darstellt. Diese beruht auf der Tatsache, dass Teilnehmer aufgrund ihrer vorliegenden Informationen Entscheidungen treffen müssen. Um optimale Entscheidungen treffen zu können, ist eine solide und umfangre- iche Datenbasis unabdingbar. Die Literatur hält hierfür Monitoringverfahren bereit, worauf in dieser Arbeit aufgebaut wird. Unter Monitoring wird die Datenerfassung und Verteilung von System- und Teilnehmer-spezifischen Daten verstanden. Diese Dissertation beschäftigt sich mit offenen Forschungsfragen zur Verbesserung und Erweiterung von Monitoringverfahren. Ferner werden Fragestellungen zur Vereinfachung von Abläufen zur Inbetriebnahme solcher Peer-to- Peer-Systeme angegangen. Im ersten Teil beschäftigen wir uns mit Monitoringverfahren im System-spezifischen Kontext. Hier setzen wir auf Baum-basierte Verfahren, denen wir das größte Potenzial zusprechen, und gehen vorhandene Probleme in deren Robustheit an. Modernste Baum-basierte Verfahren in der Literatur werden als hoch präzise aber nicht robust eingestuft, welche der Dynamik der Teilnehmer geschuldet ist. Als Folge werden Kommunikationspfade entlang des Baumes gestört, unterbrochen und im schlimmsten Fall müssen diese rekonstruiert werden. Dies führt zu Datenverlust, was sich negativ auf die Präzision auswirkt. Unsere Antworten auf offene Forschungsfragen umfassen die Einführung von Redundanz in einer smarten Verteilungsfunk- tion und die Vorstellung zusätzlicher Mechanismen, die für eine verbesserte Wahrnehmung der Monitoringstruktur sorgen. Zwei Verfahren werden im Rahmen dieser Dissertation vorgestellt. Diese erhöhen die Robustheit signifikant, halten die Präzision auf einem gleichbleibend hohem Niveau und weisen ein gutes Kosten-Nutzen-Verhältnis auf. Neben der Verbesserung von vorhandenen Monitoringsystemen, behandeln wir Phänomene, die zu einer Beeinträchtigung des Monitoringergebnisses führen. Der Einfluss von bösartigen Teilnehmern in Monitoringverfahren wird in der Literatur nicht hinreichend analysiert und vorhandene Lösungen zum allgemeinen Umgang solcher Teilnehmer sind komplex und könnten weitere Angriffsvektoren schaffen. Wir analysieren den Einfluss von bösartigen Teilnehmern, die Ergebnisse manipulieren oder sich nicht an die Protokollspezifikationen halten, betonen die gravierenden Schwachstellen und stellen Mechanismen zur Abschwächung vor. Wir zeigen in einer