Peer-to-Peer Distribution of Web Content Using WebRTC Within a Web Browser
TVE 15 036
Degree project (Examensarbete), 15 credits (hp)
June 2015

Peer-to-peer distribution of web content using WebRTC within a web browser

Kerstin Ersson
Siri Persson

Teknisk-naturvetenskaplig fakultet, UTH-enheten
Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0
Postal address: Box 536, 751 21 Uppsala
Telephone: 018-471 30 03
Fax: 018-471 30 00
Website: http://www.teknat.uu.se/student

Supervisor: Magnus Lundstedt
Subject reader: Rikard Emanuelsson
Examiner: Martin Sjödin
ISSN: 1401-5757, TVE 15 036

Abstract

The aim of this project was to investigate whether it is possible to host websites using the BitTorrent protocol, a protocol for distribution of data on the web. This was done using several Node.js modules, small clusters of code written in JavaScript, such as Browserify and a modified version of WebTorrent. In these modules, technologies like websockets and WebRTC are implemented. The project resulted in a working WebTorrent module, implemented on the website www.peerweb.io. However, the module still needs optimization concerning the time it takes to set up a WebRTC peer connection. With these modifications, we believe that hosting websites via peer-to-peer networks will be the future of the web.

Popular science summary

A question of great importance in today's information technology society is how data should be distributed efficiently on the web. Our time-pressed schedules require quick access to all forms of data. The traditional solution, a server that provides information to clients that must ask the server before they can access and download data, is both inefficient and time-consuming. Fortunately, there are new solutions to this problem, based on new technologies such as websockets and WebRTC, two tools for real-time communication, which enable more efficient methods of distributing data.

BitTorrent is a protocol for so-called peer-to-peer networks, used by file sharing services among others. Although BitTorrent has mostly been used to illegally distribute movies, music and games on the web, it has also opened up legal possibilities for companies and individual users; for example, these technologies can help reduce bandwidth costs. This project investigated whether it is possible to implement a peer-to-peer network that uses the BitTorrent protocol, websockets and WebRTC to share data such as websites. The goal was to create a working, modified version of the existing module WebTorrent, written for Node.js in the programming language JavaScript, in order to transfer these files. The goal was achieved, and the only remaining problem was that the process sometimes took longer than desired. For file transfers this is acceptable, but transferring websites requires a higher transfer speed. This could, however, be addressed in a future optimization process.

Contents

1 Introduction
  1.1 Precisit
  1.2 Project description
2 Theory
  2.1 Programming languages
    2.1.1 Python
    2.1.2 HTML
    2.1.3 CSS
    2.1.4 JSON
    2.1.5 JavaScript and Node.js
  2.2 Developing environment and services
    2.2.1 GitHub
    2.2.2 Terminal
    2.2.3 Sublime Text
  2.3 Server-client model and peer-to-peer networks
  2.4 Websockets
  2.5 WebRTC
    2.5.1 Signaling
  2.6 The BitTorrent protocol
    2.6.1 Uploading and downloading files
  2.7 WebTorrent
  2.8 Similar projects
3 Method
  3.1 Introduction projects
    3.1.1 Chat application using websockets
    3.1.2 Video chat using WebRTC
    3.1.3 Chat application using a WebRTC data channel
  3.2 Analyzing and developing the WebTorrent module
    3.2.1 Modifying the modules
    3.2.2 Testing during development
4 Result
  4.1 Subprojects
  4.2 WebTorrent
  4.3 www.peerweb.io
5 Discussion
  5.1 Methods and the development phase
    5.1.1 The subprojects
    5.1.2 Developing the WebTorrent module
  5.2 Benefits and disadvantages when hosting websites with WebTorrent
6 Conclusions

Abbreviations

API   Application Programming Interface
CDN   Content Delivery Network
CSS   Cascading Style Sheets
DNS   Domain Name System
HTML  HyperText Markup Language
HTTP  HyperText Transfer Protocol
ICE   Interactive Connectivity Establishment
IM    Instant Messaging
IP    Internet Protocol
JSON  JavaScript Object Notation
REST  Representational State Transfer
RTC   Real Time Communication
SDP   Session Description Protocol
SHA   Secure Hash Algorithm
STUN  Session Traversal Utilities for NAT
TCP   Transmission Control Protocol
UDP   User Datagram Protocol
URI   Uniform Resource Identifier
URL   Uniform Resource Locator

1 Introduction

Recent technologies within modern web browsers enable new solutions to old problems, such as the distribution of content (HTML code, CSS style sheets, JavaScript, movies, pictures, etc.). The traditional client-server model is being challenged by new solutions for real-time communication, such as websockets (Theory 2.4) and WebRTC (Theory 2.5) peer connections, the latter of which connects peers directly. These new solutions have opened a world of new possibilities for distribution of content in browsers, making it faster and easier than ever to exchange information on the web.

The first service for file sharing through peer-to-peer networks is considered to be Napster, which was released in 1999 [1]. Since then, several other clients have been released, and in the early 2000s the BitTorrent protocol was presented. The BitTorrent protocol has revolutionized the way people share files on the internet. Even though its main use has been unauthorized distribution of movies, games, music, and other copyrighted material, it is also used legally by companies and individuals to cut bandwidth costs.
1.1 Precisit

This project was done at Precisit, a startup company in IT and management consulting, at their office in Uppsala. Precisit's vision is to work with cutting-edge technology that can simplify our everyday life, and possibly change the world.

1.2 Project description

This project will attempt to implement a distribution network for web content using WebTorrent, a web implementation of the BitTorrent protocol. We will, however, make a few changes to the WebTorrent source code in order to obtain the desired functionality. Currently, the host company has a solution for real-time signaling between a large number of peers, called jsFlow [2]. This would be extended with support for torrent trackers written in JavaScript to enable distributing content over the WebRTC data channels used by WebTorrent.

The aim of this project is to investigate the feasibility of using these technologies to share data such as web pages, in terms of, among other things, ease of use in applications and latency.

2 Theory

2.1 Programming languages

Several different programming languages were used in this project. They are listed and briefly explained in this section, since some background knowledge about them is needed to understand concepts introduced later on.

2.1.1 Python

Python is a programming language that supports object-oriented and functional programming. It has extensive libraries for implementing efficient REST services, such as Tornado, a library open-sourced by Facebook that addresses the C10k problem (handling more than 10,000 concurrent connections per server) and as such is highly scalable and efficient [3].

2.1.2 HTML

HTML is the standard markup language for creating web pages. The language mostly uses tag pairs to create elements: one opening tag and one closing tag, each enclosed by angle brackets. Web browsers use the HTML code to interpret the content of the web page.
The HTML objects are the building blocks of web pages and allow for the embedding of, for example, videos and images [4].

2.1.3 CSS

CSS is a style sheet language used to describe the layout and presentation of interfaces written in a markup language. Together with HTML and JavaScript, CSS forms the cornerstone technology for building engaging web pages and mobile applications. For each HTML object used, a number of formatting instructions can be provided [5].

2.1.4 JSON

JSON is an open standard data format, originally derived from JavaScript, that enables transmission of data as attribute-value pairs. Most programming languages have code for handling JSON objects [6].

2.1.5 JavaScript and Node.js

JavaScript is a dynamic scripting language which is mostly used on the client side in web applications. Many JavaScript developers use the module pattern when coding, which is a way of dividing bigger programs into smaller clusters of code. Node.js builds on this idea: it allows JavaScript to be used on the server side as well, for tasks such as connecting to databases and sending emails. The idea behind Node.js modules is that each should do one thing, and do that well. Bigger and more complex programs can then be built using these modules as building blocks.

npm is a package manager for Node.js, where Node.js developers can upload their modules for other people to use in their programs. npm has over 70 000 modules that handle anything from video streaming to advanced mathematics and encoding. Modules are imported in Node.js by calling require('module') [7]. In the browser, however, the require function is not defined. Software written for Node.js can still be made to run in the browser environment by using Browserify, a tool that bundles Node.js-style modules for use in the browser.
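The module idea described above can be illustrated with a minimal sketch. It is not taken from the project's code: the names `contentKind` and `kindOf` and the extension table are hypothetical, chosen only for illustration. The sketch loads the built-in Node.js module `path` with require(), wraps one small task in the module pattern, and serializes the result as JSON attribute-value pairs.

```javascript
// Load a built-in Node.js module with require().
const path = require('path');

// A small "module" written with the module pattern: the lookup table is
// private to the function scope, and only kindOf() is exposed.
// `contentKind` and `kindOf` are hypothetical names used for illustration.
const contentKind = (function () {
  const kinds = {
    '.html': 'markup',
    '.css': 'style sheet',
    '.js': 'script',
    '.json': 'data'
  };

  // Classify a file by its extension; unknown extensions map to 'other'.
  function kindOf(fileName) {
    const ext = path.extname(fileName).toLowerCase();
    return kinds[ext] || 'other';
  }

  return { kindOf: kindOf };
})();

// In a separate file this object would typically be exported with
// `module.exports = contentKind;` and imported elsewhere with require().

// Serialize attribute-value pairs to a JSON string and parse them back.
const description = JSON.stringify({
  file: 'index.html',
  kind: contentKind.kindOf('index.html')
});
const parsed = JSON.parse(description);

console.log(description);
console.log(parsed.kind);
```

Assuming the code is saved in a file named main.js (a hypothetical name), it can be run directly with Node.js, and a browser bundle could be produced with a command along the lines of `browserify main.js -o bundle.js`.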
JavaScript is also used in Precisit's real-time messaging service jsFlow, which uses websockets to enable users to send JSON objects [8].

2.2 Developing environment and services

To be able to develop our own code in an efficient and practical way, we used GitHub, the Terminal and Sublime Text.