DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2018

A study of limitations and performance in scalable hosting using mobile devices

NIKLAS RÖNNHOLM

KTH ROYAL INSTITUTE OF TECHNOLOGY, SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

DA222X, Master's Thesis in Computer Science

En studie i begränsningar och prestanda för skalbar hosting med hjälp av mobila enheter

DA222X, Exjobbsrapport i Datalogi

Niklas Rönnholm [email protected]

Supervisor: Erik Isaksson
Examiner: Johan Håstad

School of Electrical Engineering and Computer Science, KTH 2018

March 2018

Abstract

At present day, distributed computing is a widely used technique, where volunteers support different computing power needs organizations might have. This thesis sought to compare the performance of distributed computing when limited to mobile device support only, since this type of support is seldom done with mobile devices. The thesis proposes two approaches to harnessing the computational power and infrastructure of a group of mobile devices. The problems used for benchmarking are small instances of deep learning training. One requirement posed by the mobile devices' non-static nature was that this should be possible without any significant prior configuration. The protocol used for communication was HTTP. The reason deep learning was chosen as the benchmarking problem is its versatility and variability. The results showed that this technique can be applied successfully to some types of problem instances, and that the two proposed approaches also favour different problem instances. The highest request rate found for the prototype with a 99% response rate was a 2100% increase in efficiency compared to a regular server. This was under the premise that it was provided just below 2,000 mobile devices, and only for particular problem instances.

Sammanfattning

För närvarande är distribuerad databehandling en utbredd teknik, där frivilliga individer stödjer olika organisationers behov av datorkraft. Denna rapport försökte jämföra prestandan för distribuerad databehandling begränsad till enbart stöd av mobila enheter då denna typ av stöd sällan görs med mobila enheter. Rapporten föreslår två sätt att utnyttja beräkningskraft och infrastruktur för en grupp mobila enheter. De problem som används för benchmarking är små exempel på deep-learning. Ett krav som ställdes av mobilenheternas icke-statiska natur var att detta skulle vara möjligt utan några betydande konfigureringar. Protokollet som användes för kommunikation var HTTP. Anledningen till att deep-learning valdes som referensproblem beror på dess mångsidighet och variation. Resultaten visade att denna teknik kan tillämpas framgångsrikt på vissa typer av probleminstanser, och att de två föreslagna tillvägagångssätten också gynnar olika probleminstanser. Den högsta requesthastigheten hittad för prototypen med 99% svarsfrekvens var en 2100% ökning av effektiviteten jämfört med en vanlig server. Detta givet strax under 2000 mobila enheter för vissa speciella probleminstanser.

Contents

1 Introduction
  1.1 Thesis subject
  1.2 Goal & problem formulation
  1.3 Limitations
  1.4 Delimitations

2 Background
  2.1 Related works
    2.1.1 Folding@home
    2.1.2 Internet of things
    2.1.3 Peer-to-peer technology
    2.1.4 Mobile service platform: A middleware for nomadic mobile service provisioning
  2.2 Deep learning
    2.2.1 Multilayered feedforward neural networks
  2.3 Parallel computing
    2.3.1 Distributed computing
    2.3.2 Amdahl's law & Gustafson's law
  2.4 Scalable hosting & Load balancing
    2.4.1 Client-based load balancing
    2.4.2 DNS-based load balancing
    2.4.3 Dispatcher-based load balancing
    2.4.4 Server-based load balancing
  2.5 Network address translator
    2.5.1 NAT variants
    2.5.2 NAT traversal
  2.6 Load generation – httperf
  2.7 GPU vs CPU for computing deep learning
  2.8 Security

3 Method
  3.1 The choice of server problem
  3.2 Design of prototypes
    3.2.1 Prototype A: Reverse proxy solution
    3.2.2 Prototype B: Redirect/dispatcher solution
    3.2.3 Prototype optimization
  3.3 Performance test of prototypes
    3.3.1 Test procedure
    3.3.2 Implementations & Technology choice
  3.4 Test suite
    3.4.1 Test 1 – Single device & Return time
    3.4.2 Test 2 – 35 emulated proxy nodes & Dispatcher performance extrapolation
    3.4.3 Test 3 – Dispatcher maximum capacity
  3.5 Test characteristics
    3.5.1 Connection loss
    3.5.2 Test input
    3.5.3 Input & Output data format
  3.6 Effectiveness formula
    3.6.1 How to calculate transfer time of a given number of bytes
    3.6.2 How to calculate input time
    3.6.3 How to calculate processing time
    3.6.4 How to calculate output time
    3.6.5 Comparing performance in different platforms

4 Results
  4.1 Test 1 – Dispatcher/Proxy single device capacity comparison
  4.2 Test 2 – 35 device capacity comparison with Normal Server
  4.3 Test 3 – Dispatcher & Normal maximum capacity comparison
  4.4 Problem specific data
  4.5 Connection loss probabilities
  4.6 Efficiency comparison

5 Discussion
  5.1 Dispatcher vs. Normal server
  5.2 Proxy vs. Dispatcher
  5.3 Connection overhead
  5.4 Efficiency model evaluation
  5.5 Report weaknesses
  5.6 Possible improvements & Future work
  5.7 Deep learning & other problem classes
  5.8 Using mobile devices vs Regular computers
  5.9 Commercialization

6 Conclusion

Terminology

Mobile device A mobile device, as referred to in this report, is a modern smart phone or tablet equipped with Wi-Fi that has an operating system which supports 3rd party applications.

NAT Acronym for Network Address Translation, which is used when forwarding traffic from a local host using a local address to the Internet. This is done by creating a temporary translation between the local address space and the global address space (the Internet). See section 2.5 for more information.

DNS The Domain Name System, abbreviated "DNS", is a system made to give a particular IP address a name (RFC1035 [1]).

Scalable hosting A technique used to connect multiple servers into a more powerful cluster of servers, hosting the same material. See section 2.4 for more information.

Dispatcher A dispatcher is a server used to dispatch clients from itself to the correct node, so as to avoid handling processing or responses itself. See section 2.4.3 for more information.

Reverse proxy A reverse proxy (Apache website [2]) is an intermediary server used by clients to access a server behind a firewall. The difference from a 'regular' proxy (Structure and encapsulation in distributed systems: the proxy principle, Marc Shapiro, 1986 [3]) is that it does not forward traffic from clients on a network outwards towards the Internet, but forwards traffic from outside clients inwards towards a server on the network.

Web Service According to W3C [4], a web service is a software system that supports machine-to-machine communication, usually over HTTP. Such services most often distribute prepared files (such as image files or text files), but can also perform calculations (services such as "WolframAlpha" [5]). For the purpose of this report, these are referred to as file distributing services and processing services, respectively.

Deep Learning Deep learning is a term used for describing learning systems using layered computational models capable of abstracted learning of different types. See section 2.2 for more information.

FLOPS FLOPS (TechTarget, 2011 [6]) is an abbreviation of "floating-point operations per second", which is a common measure of computing capacity when comparing different hardware.

1 Introduction

This section describes the project in terms of subject, purpose, scientific question, limitations, delimitations, scientific contribution and its relevance to society.

1.1 Thesis subject

An approach emerging as an increasingly prevalent concept at the present day is referred to as volunteer computing and is mentioned in a report on a project called Folding@home [7], which was used to simulate biological phenomena by utilizing hundreds of thousands of personal devices (described further in section 2.1.1). Volunteer computing can be described as a form of distributed computing, with the distinction of having computers connected by volunteering owners. Volunteers, in the context of volunteer computing and this report, are defined by their ambition to support a certain requirement of computing power to reach a common goal.

Today, the quantity of mobile devices on the planet is significantly greater compared to previous decades. This category (mobile devices) is largely constituted by smartphones in some form (nearly every individual in developed countries has access to one or several devices) [8]. These devices are equipped with considerable computing power which is difficult to concentrate. The principal reason for this is their non-stationary positioning and irregular use of multiple gateway networks when swapping Wi-Fi networks. This limits their usefulness and excludes all solutions which require a permanent Internet connection, as well as those where one would be required to configure one's gateway router/firewall/NAT. Should volunteers be required to configure their device network every time they relocate, the inconvenience for each volunteer would be too frequent, resulting in a significant increase in the effort required to volunteer and a subsequent decrease in the number of potential volunteers.

Currently, many NATs prevent clients outside the network from connecting to the devices within the network (RFC5345 [9]). NAT traversal, which is currently the biggest problem posed by NATs, could be avoided if there was a general update to the IP protocol. One of the main benefits of IPv6 [10] (which was described as early as 1995 by the IETF) is its significantly larger address space. Since one of the main reasons for NATs is to compensate for the lack of addresses, extending the address space with NATs would become partly redundant. However, since NATs provide network security they cannot easily be phased out in favour of just using IPv6 (as described by an article on an IPv6 website [11]). Also, since many older systems use IPv4, the transition to IPv6 is very slow. Eliminating the data traffic middle-men and unnecessary subnets would enable more peer-to-peer technologies to emerge, and thereby also significantly increase the availability of systems used by volunteers in volunteer computing.

One way to increase utilization in volunteer computing using mobile devices is to overcome their inherent limitations, and thereby harvest the huge amounts of computing power. If connectivity could be improved either through IPv6, NAT traversal (see section 2.5 for more information on NAT) or simply using direct Internet connections, one could utilize mobile devices even more in peer-to-peer technologies or more advanced volunteer computing. One example would be that web services could be distributed on the clients' devices, increasing the collective capacity and making them more resilient to DDoS attacks and more scalable due to the increased infrastructure used.

Should this solution prove to work, one obvious use case of the proposed subject would be something similar to Folding@home, except where the devices used are different. In the case of the given application, one could construct something similar to the mathematical engine "WolframAlpha" [5], which is used for small calculations through a web interface. The difference for the solution proposed in this report would be that the back-end uses many small computers (provided by volunteers) instead of a single large computer that is hosted by the company owning the service. Volunteers could be almost anyone, but the most obvious group would be the users themselves. One special problem which is applicable to volunteer computing is deep learning. Deep learning is applied in varying types of fields, from image compression [12] to speech recognition [13].

The sustainability aspects of volunteer computing using mobile devices are significant, since one could support a less wasteful trend regarding unused computing power potential. As the majority of mobile devices are personal and only used by the owner occasionally, there will be moments when a device is free to use. Consider for example nighttime, when a given user is asleep with the mobile device connected to an electrical outlet and a Wi-Fi access point. In any situation where a volunteer would like to contribute, he or she could simply start an app and instantly commit their device. If one could implement this solution within all time zones, one could have a considerable amount of (almost) free computing power on demand during any time of the day.

1.2 Goal & problem formulation

The scientific question of this thesis could be formulated as follows:

Can mobile devices be used in a seamless manner to assist regular HTTP servers tasked with solving deep learning problems and in so doing increase the collective serving capacity?

1.3 Limitations

Due to the limited time for research of the tests, the number of devices was limited to thirty-five emulated devices and one real device. This choice was based on the number of functional computers within two computer labs at KTH. Emulated devices work well in all regards except that they might be slightly faster (mostly because the more powerful host processors have a higher frequency than a real device). This is compensated for through the use of real device benchmarks.

According to Gartner, a world leading information technology research and advisory company, Android was the most common operating system on smart phones in the world (84.1%) in 2016 [14]. Because of this, the choice of mobile devices in this report was limited to Android devices (smart phones running on Android), meaning no laptops or other mobile devices were used. The reason for this is that laptops are similar enough to produce comparable results, but require resources that were not available.

1.4 Delimitations

In order to produce measurable results, the problem was applied to the task of training neural networks (supervised learning). Read more on this choice in section 3.1. Availability of mobile devices (methods of aggregating device volunteers), the synchronization and initialization of a given solution, and security in any extensive form (encryption or data integrity) were not given special focus in this report. The client interface was limited to the HTTP protocol (as it was the most conventional means of Internet communication). This thesis did not consider mobile Internet, as it would have created expenses and thereby added complications for volunteers.

Problem instance sizes investigated in this report are limited to small instances with computation times up to approximately one second. Larger problem instances were not investigated. Parallelization is briefly discussed in the report, but was not implemented in any way. The main reason for this is that methods of parallelization for any problem class would enlarge the scope of this report too much.

2 Background

This section explains concepts, technologies, standards, and other material examined in this report. Much of what can be found here relates to distributed systems of computation (both concepts and examples) and distributed devices in general. Facts describing the related subjects, concepts fundamental for the problem statement, as well as techniques used in the thesis method are explained in this section.

2.1 Related works

This section presents works related to this thesis in order to educate the reader on relevant concepts, which serve both as sources of inspiration and terminology and as boundaries for the subject.

2.1.1 Folding@home

Folding@home ("Folding at home") is a computer system described in a paper published in 2009 (Folding@home: Lessons from eight years of volunteer distributed computing, Adam L. Beberg, Daniel L. Ensign, Guha Jayachandran, Siraj Khaliq and Vijay S. Pande [7]) designed for statistical calculation of particle trajectories (for particles such as atoms and molecules) in biological systems. The distribution of workload served more to run several simulations concurrently rather than to run one simulation faster. It was built to be executed distributed on about 400,000 devices with a total of about 4.8 PetaFLOPS as of 2009. Folding@home is executed with an array of different servers used to define, distribute and coordinate the jobs to volunteers' devices and return the computational results to be stored in a database. The distribution eliminates difficulties with temperature, electricity, physical space and, above all, cost. Back in 2009, this system was the fastest in terms of FLOPS, and the bottleneck for scaling up the number of users and achieving an even higher FLOPS number was back-end server limitations.

2.1.2 Internet of things

The term Internet of Things is attributed to have been coined by Kevin Ashton in 1999 [15] and was later described in more detail in a paper written in 2010 (The Internet of Things: A survey, Luigi Atzori, Antonio Iera and Giacomo Morabito [16]) as the term became more used. Ashton argues that the future will be characterized by letting machines and devices communicate more and interact autonomously with their networked environment. This will force technology into becoming more efficient. Furthermore, there have since been papers discussing this new paradigm, which has been more and more highlighted as it comes into existence. One paper discusses how different fields of knowledge will be synergetically linked and how their development should be approached [16]. Fields that are theorized to be improved upon range from healthcare to social networking. Aspects such as security are also discussed, since many independent devices are left vulnerable to attacks.

2.1.3 Peer-to-peer technology

Peer-to-peer is the concept of sharing resources within a network built entirely out of peers (described in A Survey of Peer-to-Peer Content Distribution Technologies, Stephanos Androutsellis-Theotokis and Diomidis Spinellis, 2004 [17]). It is adaptive when it comes to operation, and can serve great numbers of clients and maintain processing power and connectivity. A survey done in 2003 (A Survey of Peer-to-Peer Security Issues, Dan S. Wallach [18]) examined security aspects of the technology. Peer-to-peer networks are vulnerable to bad peers in cases where peers are not vetted or controlled. This can lead to attacks primarily on the application level, where the peer sends out bad data, hides data, or similar. Attacks on the network level can also occur, which means that connections between peers can be manipulated, most often so that a peer gains the ability to let attackers into the network. The risk of an attacker penetrating the network increases for each node that is connected, since each new node is a potential point of entry into the network. Peer-to-peer systems can either be regular distributed systems that coordinate and function with just the peers, or systems described as hybrid peer-to-peer systems (Comparing Hybrid Peer-to-Peer Systems, Beverly Yang and Hector Garcia-Molina, 2001 [19]). Hybrid peer-to-peer systems are still distributed in some sense, but also have a centralized component or functionality.

2.1.4 Mobile service platform: A middleware for nomadic mobile service provisioning

One research paper that ties closely to this report describes case studies done on so-called nomadic mobile services (Mobile Service Platform: A middleware for nomadic mobile service provisioning, Aart van Halteren and Pravin Pawar, 2006 [20]). The paper defines a nomadic device as a hand-held or otherwise mobile device which has an Internet connection through a wireless network. This device is also expected to roam between wireless networks, hence its name. Such a device must have a seamless connection between service and client and must consider limitations set by its environment. Such a system must be scalable (meaning that devices must be able to be added without effort). The paper also proposes a "Mobile Service Platform" to support this type of service.

2.2 Deep learning

The subject for computations in this report was deep learning, a subject belonging to AI. Deep learning (Neural Networks: A Comprehensive Foundation, Simon Haykin, 1994 [21]) is known for requiring huge amounts of computing power. Deep learning is only used as the workload for the system proposed in this project, and is thereby not critical for its success. As mentioned earlier (in section 1.1 Thesis subject) the applications are wide, which should be reason enough for choosing deep learning as a computational problem.

2.2.1 Multilayered feedforward neural networks

Multilayered feedforward neural networks [21] consist of three types of layers. First, one input layer receives its input from outside the model. Following it are one or several hidden layers, which receive their input from the input layer and forward it through the subsequent hidden layers on to the final layer. The last layer (called the output layer) delivers the final output for the network. Between these layers are weighted connections, which help to transform the input into some other output. Such a model is called a multilayered perceptron (see figure 1). A perceptron with no hidden layers is simply referred to as a perceptron, or single-layer perceptron.

Figure 1: 1: Input layer, 2: Weighted connections, 3: Hidden layer 1, 4: Hidden layer n, 5: Output layer

The learning aspect of these structures originates from the back-propagation algorithm (or error-correcting rule). One can refer to its different phases as the forward pass and the backward pass. The forward pass is when the network calculates the output from a given input, and the backward pass is when the network adjusts. Adjustments are obtained from the difference between the produced output and the expected output.

An important aspect of each layer's neurons is their production of output based on what their nonlinear activation function tells them how to behave. This is done by combining the previous layer's outputs with their respective weight vector taken from the weight matrix. One common nonlinear function is:

    y_j = (1 + e^{-v_j})^{-1}    (1)

where v_j is the weighted sum of the previous layer's outputs for the j:th node and y_j is the j:th neuron's output. v_j is defined as:

    v_j = \sum_{i=0}^{m} w_{ij}(n) y_i(n)    (2)

Using the final layer's output combined with the expected output (for the given input), one can define the numerical error used in the backward pass.

The change in the weight between nodes i and j is defined by:

    \Delta W_{ij}(n) = -\eta \frac{\partial \varepsilon(n)}{\partial v_j(n)} y_i(n)    (3)

    \varepsilon(n) = \frac{1}{2} \sum_j e_j^2(n)    (4)

    e_j(n) = d_j(n) - y_j(n)    (5)

The learning rate is denoted by η, a given data point is denoted by n and the total error energy is denoted by ε. e_j is the error in node j, which is calculated by taking the expected output d_j and subtracting the observed output y_j. The resulting weight changes are then applied to the network to achieve "learning".
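To make equations (1)-(5) concrete, the following is a minimal, self-contained Java sketch of one training step for a single output neuron. It is illustrative only: the layer size, learning rate and data values are made up and do not correspond to any model used in the thesis.

import java.lang.Math;

public class PerceptronStep {
    // Equation (1): logistic activation y_j = 1 / (1 + e^(-v_j))
    static double activate(double v) {
        return 1.0 / (1.0 + Math.exp(-v));
    }

    // Equation (2): weighted sum v_j of the previous layer's outputs
    static double weightedSum(double[] weights, double[] inputs) {
        double v = 0.0;
        for (int i = 0; i < weights.length; i++) {
            v += weights[i] * inputs[i];
        }
        return v;
    }

    public static void main(String[] args) {
        double eta = 0.1;                  // learning rate (made-up value)
        double[] y = {1.0, 0.5, -0.3};     // previous layer's outputs (made up)
        double[] w = {0.2, -0.4, 0.1};     // weights w_ij into neuron j (made up)
        double d = 1.0;                    // expected output d_j

        double v = weightedSum(w, y);      // equation (2)
        double out = activate(v);          // equation (1)
        double e = d - out;                // equation (5): e_j = d_j - y_j
        double error = 0.5 * e * e;        // equation (4), single output neuron

        // Equation (3): delta rule. For the logistic function,
        // d(epsilon)/d(v_j) = -e_j * out * (1 - out), so the update becomes:
        for (int i = 0; i < w.length; i++) {
            double deltaW = eta * e * out * (1.0 - out) * y[i];
            w[i] += deltaW;
        }

        System.out.printf("output=%.4f error=%.4f updated w0=%.4f%n", out, error, w[0]);
    }
}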

2.3 Parallel computing

Parallel computing is the notion of computing a problem on different processors at the same time. Some computational problems are embarrassingly parallel, which implies that they easily can be divided into components that can be executed concurrently (The Art of Multiprocessor Programming, Maurice Herlihy and Nir Shavit, 2011 [22]).

2.3.1 Distributed computing

Distributed computing is a form of parallel computing (Distributed Systems: Concepts and Design, George F. Coulouris, Tim Kindberg and Jean Dollimore [23]) where the parallel elements use a network to communicate. Three common characteristics are the absence of an exact common clock, independence of failures, and difficulties with timing in the task it is set to solve. Distributed computing over a network requires each node to feature independent memory and hardware resources.

2.3.2 Amdahl's law & Gustafson's law

Amdahl's law and Gustafson's law describe the expected speedup of parallelization when using different numbers of processors on problems that are parallelizable to different degrees. Amdahl's law:

    S = \frac{1}{(1 - p) + \frac{p}{n}}    (6)

The variable p is the fraction of the total job that can be concurrently executed. The variable n is the number of processors executing the calculations in parallel. How many times faster the calculation becomes, compared to if one processor would do all work sequentially, is determined by the variable S. The higher the speedup S, the better. Amdahl's law was proposed in 1967 [22], and in 1988 a new law was proposed (Reevaluating Amdahl's law, John L. Gustafson [24]) called Gustafson's law, which instead says:

    S = n + (1 - n) \times s

where s is the serial fraction of the work (s = 1 - p). The reasoning behind this is that n (the processor count) and p (the fraction of the total job that can be concurrently executed) are correlated: in practice, the parallel part of the problem tends to grow with the number of processors.
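As a small illustration of how differently the two laws behave, the Java snippet below evaluates both formulas for an assumed parallel fraction p = 0.95 (serial fraction s = 0.05) and n = 100 processors. The numbers are examples only and are not taken from the thesis.

public class SpeedupLaws {
    // Amdahl's law, equation (6): S = 1 / ((1 - p) + p / n)
    static double amdahl(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    // Gustafson's law: S = n + (1 - n) * s, where s is the serial fraction
    static double gustafson(double s, int n) {
        return n + (1 - n) * s;
    }

    public static void main(String[] args) {
        double p = 0.95;      // parallelizable fraction (assumed)
        double s = 1.0 - p;   // serial fraction
        int n = 100;          // number of processors (assumed)

        System.out.printf("Amdahl:    S = %.2f%n", amdahl(p, n));    // about 16.8
        System.out.printf("Gustafson: S = %.2f%n", gustafson(s, n)); // about 95.05
    }
}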

2.4 Scalable hosting & Load balancing

The concept of running multiple servers that serve the same content was proposed more than 20 years ago. An article published in 1994 (A Scalable HTTP Server: The NCSA Prototype, Eric Dean Katz, Michelle Butler and Robert McGrath [25]) describes a method used at the National Center for Supercomputing Applications for scalable web servers. This implementation is based on a DNS that uses a round-robin scheduling algorithm to distribute requests among the server cluster, and provides scalability and increased load capacity for the system. Load balancing is the distribution of workload over multiple servers or agents and is used to optimize the utilization of server clusters (used in scalable hosting). Below are common techniques used for web servers.

Cloudflare [26] is one of the most modern load balancing services and its main purpose is to distribute content over geographically distant infrastructure with high performance data centers. It also provides DNS services, reduces latency, and load balances to increase the performance of another web service.

2.4.1 Client-based load balancing

Client-based load balancing (Dynamic load balancing on web-server systems, Valeria Cardellini, Michele Colajanni and Philip S. Yu [27]) is defined by placing the logic for load balancing within the client's requesting entity, such as a web browser or similar. Its advantages are that there is no server overhead, eliminating load balancing-related bottlenecks for the host and making the cost fall upon the client instead of the host. Its disadvantage is that it is not as dynamically applicable as some other techniques.

2.4.2 DNS-based load balancing

DNS-based load balancing [27] is defined by placing the logic for load balancing in the domain name look-up server all clients use for translating the domain name into an IP address. Advantages of this technique are transparency and the elimination of load balancing-related bottlenecks for the host. Its disadvantage is that it only provides primitive control over the load balancing.

2.4.3 Dispatcher-based load balancing

Dispatcher-based load balancing [27] is defined by placing the logic for load balancing in a server placed in front of the entities that the load is to be balanced upon. This server then redirects all the data traffic. The advantage is that the host gains complete control over the load balancing. Its disadvantage is that the dispatcher might become a bottleneck when load increases.

2.4.4 Server-based load balancing

Server-based load balancing [27] is defined by placing the logic for load balancing in the server cluster itself, giving all host server entities the capability of load balancing. The advantage of this technique is the distribution of control to the host servers. Its disadvantages are that both server overhead and latency are increased.

2.5 Network address translator

The network address translator (abbreviated NAT) (described by RFC2663 [28]) is responsible for translating between local IP addresses with ports and external ports on the gateway's external IP address. This enables clients within a local network connected to the Internet via this gateway to access the Internet. The NAT has great significance for communication between local area networks, since it might obstruct communication between peers, mainly if the NAT is incorrectly configured. This puts different requirements on different possible solutions to the thesis problem.

2.5.1 NAT variants

RFC3489 [29] defined four NAT variations in the context of STUN, which is defined in the same standard, although they apply to all communication through the NAT. These variations define what rules the NAT applies to its mappings when traffic is sent through it. RFC3489 has been obsoleted by RFC5389 [30], but the NAT variations still apply since they are not defined but only described by RFC3489.

Full-cone NAT Full-cone NAT [29] operates by mapping an internal host's IP address and port to the same external IP address and port over time. All external hosts can send packets through this mapping at any time.

Restricted-cone NAT Restricted-cone NAT [29] operates by mapping an internal host's IP address and port to the same external IP address and port temporarily. An external host can send packets to an internal host on the condition that the internal host has sent a packet to the external host's IP address before.

Port-restricted cone NAT Port-restricted cone NAT [29] operates by mapping an internal host's IP address and port to the same external IP address and port. An external host can send packets to an internal host on the condition that the internal host has sent a packet to the external host's IP address and port before.

Symmetric NAT A symmetric NAT [29] operates by mapping an internal host's IP address and port, together with the destination IP address and port the internal host sends packets to, to the same external IP address and port; packets sent to a different destination use a different mapping. An external host can only send packets to an internal host on the condition that the internal host has sent a packet to the external host's IP address and port before.

2.5.2 NAT traversal

The NAT can sometimes prevent direct connections over the Internet, depending on the type of NAT and connection. Below are some relevant chosen techniques to penetrate the NAT.

Static address assignment Static address assignment [28] is the basic method, which is done manually: the network administrator creates static NAT mappings. This is independent of NAT type.

Universal Plug and Play (UPnP) UPnP (defined by the Internet Gateway Device standard 2 [31]) was defined by the Open Connectivity Foundation [32] and is a protocol specification for Internet Gateway Devices made to accommodate UPnP. It allows clients to dynamically request ports to be opened. One prerequisite for this traversal method is that the gateway supports UPnP.

Traversal Using Relays around NAT (TURN) TURN (RFC5766 [33]) uses a relay to enable communication between two peers behind symmetric NATs. It supports both UDP and TCP. The relay can be seen as a temporary proxy server. Unlike STUN, TURN works even for symmetric NATs.

Session Traversal Utilities for NAT (STUN) STUN [30] is a protocol which is used as a tool to gain connectivity information for NAT traversal, as well as for creating and maintaining connectivity through the NAT. STUN works with full-cone NAT, restricted-cone NAT, and port-restricted cone NAT.

Interactive Connectivity Establishment (ICE) ICE [9] is a peer connection technique suite that is designed to find (if possible) a way to connect two peers behind NATs. It uses both STUN and TURN to achieve this.

2.6 Load generation – httperf

Httperf [34] is an HTTP load generation tool that is founded on ideas proposed in a report written by Gaurav Banga and Peter Druschel [35]. The report states that ordinary techniques only achieve server loads that are insufficient. More specifically, they do not have request spikes and peaks that overload the server in any way similar to a massive client base. It suggests that one should implement HTTP test mechanisms so as to avoid pitfalls in TCP, such as the exponential back-off mechanism. This can be done using timeouts, closing sockets that did not connect and thereby freeing up resources for the emulated client. It also emphasizes packet loss and congestion, and proposes techniques to counter them as well.

2.7 GPU vs CPU for computing deep learning

If one were to utilize mobile device GPUs instead of mobile device CPUs, the computations would be much faster for this particular problem (Large-scale deep unsupervised learning using graphics processors, Rajat Raina, Anand Madhavan and Andrew Y. Ng, 2009 [36]). This is because the matrix computations involved in deep learning are better suited for GPUs (GPUs are traditionally preferred over CPUs for deep learning). Possibly, this could further boost the performance of the system.

2.8 Security

Security is an aspect that is relevant for almost all computer systems. This thesis investigates solutions where a lot of distributed devices would be connected together, as well as trusted to deliver correct responses. In a real-world application that relates to the question posed by this thesis, identity verification of volunteers and continuously updated and signed software should be used, since an attacker could attempt to fake volunteers, or penetrate the server in order to achieve some malicious goal. However, since this is an entire topic in itself and would have consumed too much time if it had been included, it is not part of the method or any results produced in this thesis.

3 Method

The goal of this report was to answer the question: "Can mobile devices be used in a seamless manner to assist regular HTTP servers tasked with solving deep learning problems and in so doing increase the collective serving capacity?". In order to answer this, one or more prototypes utilizing this technique had to be constructed and then tested. The method describes two prototypes capable of doing this, and a series of performance tests for both prototypes. The choice that led up to selecting deep learning training as the benchmark problem is also described. The method of the thesis (which was designed to be as pragmatic and empirical as possible) was divided into four main steps:

1. Choice of server problem (section 3.1) – The distributed computing technique investigated in this thesis was comprised of data serving and computational problems. This defined boundaries for the scope, the goal and the method altogether. Metrics were clearly defined so as to enable a benchmark.

2. Design of prototypes (section 3.2) – The most practical and direct approaches to volunteer computing using mobile devices were formulated and theoretically evaluated.

3. Performance test of prototypes (section 3.3) – The chosen prototypes were exposed to a series of tests designed to measure their serving capacity and performance. The purpose of this was to determine the efficiency of different setups of the prototype solutions. All test sessions tested one prototype setup with one problem instance type at a time. The tests' goals, parameters and procedures are formulated in more detail below.

4. Effectiveness formula (section 3.6) – After results were recorded and compiled, an effectiveness formula was defined. This formula was then used to predict the effectiveness of other problem instances in order to more clearly see efficiency patterns.

After step 4 had been performed, sufficient results were expected to have been produced in order to answer the scientific question.

3.1 The choice of server problem

Optimally, the problem was required to be easy to quantify in terms of the number of operations, given a certain formula or similar. It was also required to have little or no variation regarding input format. The initial problem was limited to data delivery (such as image hosting), but was then refocused to matrix arithmetic, which also included processing. Matrix arithmetic, however, did not appear proportional processing-wise in comparison to its amount of data transfer. An optimal fit for this problem profile was supervised deep learning training. Deep learning training problem instances can be both big and small, although only smaller instances were examined in this report, due to the limited capacity of the node hardware.

3.2 Design of prototypes

This section describes the properties and components of two different prototype designs used for deep learning training through the use of volunteer computing, and explains the design choices behind them.

Main problem: Router & NAT penetration One of the limitations posed by using mobile devices for accepting requests in this thesis was the absence of mobile Internet, which limited the devices to Wi-Fi based Internet. Since a Wi-Fi access point in general connects to a local network, which in turn connects to the Internet through a router, this posed a problem for the mobile devices in terms of connectivity. In order to serve data through the router, one was required to penetrate it to allow data to pass through. The NAT in the router does in most cases allow outgoing connections, but does not always allow arbitrary incoming connections (see section 2.5.1 NAT variants). This problem can be solved using two different approaches: one where mobile devices are restricted by the NAT from accepting connections from outside the NAT, and one where mobile devices can accept incoming connections through NAT traversal. This is what prototype A and prototype B (described below) are founded upon. This in turn creates characteristics which make them suitable for different environments.

3.2.1 Prototype A: Reverse proxy solution

Assuming that the mobile device resided behind a router/NAT which prohibited all incoming connection attempts directed towards it, some form of intermediary was required to connect a client with a mobile device. This intermediary can be regarded as a reverse proxy server (see the word list in the Terminology section) and as a dispatcher-based load balancing server (see section 2.4.3 Dispatcher-based load balancing, although not to be confused with prototype B).

Differently from prototype B, prototype A is capable of managing the responses of all device nodes (since all requests and responses pass through it), which enables benefits such as response validation, connection loss recovery and coordination of request parallelization. Generally, this solution is more advanced and is preferable where stability and quality are required.

This solution (henceforth referred to as the proxy solution) works by connecting mobile devices to the reverse Proxy Server, which uses the mobile devices as an extension of its back-end. Clients connect to the proxy as to a regular HTTP server through a domain name. This enables the server to appear as a traditional server.

The benefit from utilizing a reverse Proxy Server is its ability to resend any failed request sent to a mobile device, as well as its ability to utilize mobile devices residing behind restrictive NATs. This is exclusive to this prototype.

Figure 2: 0: Router/NAT protected LAN, 1: Proxy Server, 2: Mobile devices, 3: Persistent connection, 4: Request from client, 5: Communication over persistent connection, 6: Return response to client

Proxy server design The Proxy Server (described in figure 2) allocates a network server socket on a public port for all mobile devices. Mobile devices establish a dedicated TCP connection through their respective NATs using this socket. The connection is kept alive so that the Proxy Server can query the mobile device for processing requests indefinitely as it receives requests from clients. When a client HTTP request is received, the Proxy Server selects a mobile device which can serve the request. Preferably, the fastest possible device with an intact connection should be selected. The Proxy Server relays the request to the mobile device, which replies with a response back to the Proxy Server. The response data is then sent to the client as soon as the Proxy Server receives it. Since the proxy communicates through only one socket per mobile device, simultaneous requests to the same device must be sent in order. In some cases, this can cause congestion since they have to be queued.
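A heavily simplified Java sketch of the relaying step described above is given below. It is not the implementation used in the thesis (which was built on Jetty); it only illustrates the idea of keeping one persistent device socket per mobile device and forwarding one client request at a time over it. The class name and the length-prefixed framing are assumptions made for illustration.

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;

// Illustrative only: one persistent connection to a single mobile device.
public class DeviceRelay {
    private final Socket deviceSocket;           // kept open for the device's lifetime
    private final DataOutputStream toDevice;
    private final DataInputStream fromDevice;

    public DeviceRelay(Socket deviceSocket) throws IOException {
        this.deviceSocket = deviceSocket;
        this.toDevice = new DataOutputStream(deviceSocket.getOutputStream());
        this.fromDevice = new DataInputStream(deviceSocket.getInputStream());
    }

    // Requests share one socket, so they must be relayed one at a time per device.
    public synchronized byte[] relay(byte[] clientRequest) throws IOException {
        toDevice.writeInt(clientRequest.length);   // simple length-prefixed framing (assumption)
        toDevice.write(clientRequest);
        toDevice.flush();

        int responseLength = fromDevice.readInt();
        byte[] response = new byte[responseLength];
        fromDevice.readFully(response);
        return response;                           // returned to the waiting HTTP client
    }

    public void close() throws IOException {
        deviceSocket.close();
    }
}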

Mobile device design Any request should be answered in the shortest period of time possible. The mobile device connects to the Proxy Server on its public network socket and awaits requests made by the Proxy Server. Due to TCP characteristics which benefit from keeping a connection open (slow start, additive increase/multiplicative decrease; RFC5681 [37]), as well as the proxy's inability to initiate connections to the devices since they are protected by NATs, the connection is kept open and reused instead of closed.

Computing in parallel The Proxy Server design also allows for parallel computing, since it handles all requests and replies of all mobile devices. This would take the form of dividing the work and compiling the responses, which would make requests finish faster. The Proxy Server would be under increased stress due to the added work of coordinating the parallelization, which would decrease its capacity (number of requests per second), but it would be able to handle bigger problem instances requested by clients. Read more on parallelization in section 2.3 Parallel computing.
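For completeness, a matching device-side sketch for prototype A is shown below: the mobile device opens one outgoing TCP connection to the proxy (which the NAT allows), then loops, answering length-prefixed requests over the same socket. The host name, port, framing and the processRequest step are assumptions made for illustration and mirror the relay sketch above.

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.Socket;

// Illustrative only: device-side loop for the reverse proxy prototype.
public class ProxyWorker {
    public static void main(String[] args) throws Exception {
        // The proxy host and port are placeholders.
        try (Socket socket = new Socket("proxy.example.org", 8090)) {
            DataInputStream in = new DataInputStream(socket.getInputStream());
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());

            while (true) {                          // the connection is kept open and reused
                int length = in.readInt();          // length-prefixed request (assumption)
                byte[] request = new byte[length];
                in.readFully(request);

                byte[] response = processRequest(request);  // e.g. train a small model

                out.writeInt(response.length);
                out.write(response);
                out.flush();
            }
        }
    }

    // Placeholder for the deep learning work done on the device.
    static byte[] processRequest(byte[] request) {
        return request;                             // echo, for illustration only
    }
}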

3.2.2 Prototype B: Redirect/dispatcher solution

Because this solution requires the mobile devices' networks to have NAT traversal enabled (see section 2.5.2 NAT traversal) and therefore only handles redirects, it is much simpler but also much faster. This server solution can only use a subset of the mobile devices the proxy solution can use (the mobile devices currently capable of traversing their respective NATs). When an HTTP request is received by a Dispatcher Server (described in figure 3), it is redirected to an available mobile device chosen by a load balancing technique. The redirect method used in this solution is a temporary HTTP redirect (302). The mobile device then serves the client on its own, instead of using a proxy to handle any response data. The use of redirects eliminates the server's ability to recover from connection loss (since the connection to the server is already terminated) in favour of eliminating the bottleneck of relaying all response data. The server in turn can use this freed-up capacity to redirect a higher number of requests instead, which makes it faster than the proxy solution. Since it shares no resources with device nodes, except for synchronization (which is insignificant in comparison, see section 5.3 Connection overhead), the boost in raw request capacity is massive. Provided that the connection does not experience connection problems, this solution outperforms the proxy solution.
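The central step of prototype B is the temporary HTTP redirect. The fragment below sketches it with the javax.servlet API (which Jetty can host); the device-selection call and the device address are placeholders, not the thesis implementation.

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Illustrative only: redirect each incoming request to an available device node.
public class DispatcherServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // pickAvailableDevice() is a placeholder for the load balancing choice
        // (see the "first available" algorithm in section 3.2.3).
        String deviceAddress = pickAvailableDevice();

        // sendRedirect issues a temporary (302) redirect; the device then
        // serves the client directly, without relaying data through this server.
        response.sendRedirect(deviceAddress + request.getRequestURI());
    }

    private String pickAvailableDevice() {
        return "http://203.0.113.7:8080";   // placeholder device address
    }
}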

Figure 3: 0: Router/NAT protected LAN, 1: Dispatcher Server, 2: Mobile devices, 3: Ready-announcement is sent to server, 4: Request from client, 5: Request is redirected through to mobile devices and answered directly

Mobile device servers The mobile devices utilized by the dispatcher solution are similar to the mobile devices utilized by the proxy implementation in terms of connection initialization. They differ in their methods of receiving and responding to requests. Since they need to be able to communicate using the HTTP protocol, received requests are formatted as HTTP requests instead of plainly being sent over a TCP connection. This requires the mobile devices to implement their own HTTP servers locally, so that they can accept redirected connections from clients, as sketched below. This relieves the Dispatcher Server but puts greater demands on the mobile devices due to the requirement of public accessibility when being relayed a request.
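A minimal sketch of such a device-local HTTP server, using only java.net.ServerSocket (available on Android), is shown below. It is an assumption-laden illustration: real request parsing, threading and the actual training work are omitted, and the port and response body are placeholders.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative only: a tiny single-threaded HTTP responder on the device.
public class DeviceHttpServer {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(8080)) {   // port is a placeholder
            while (true) {
                try (Socket client = server.accept()) {
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
                    String requestLine = in.readLine();        // e.g. "GET /train HTTP/1.1"
                    System.out.println("Received: " + requestLine);

                    String body = "result-placeholder";        // would be the trained model
                    byte[] payload = body.getBytes(StandardCharsets.UTF_8);

                    OutputStream out = client.getOutputStream();
                    out.write(("HTTP/1.1 200 OK\r\n"
                            + "Content-Type: text/plain\r\n"
                            + "Content-Length: " + payload.length + "\r\n"
                            + "\r\n").getBytes(StandardCharsets.UTF_8));
                    out.write(payload);
                    out.flush();
                }
            }
        }
    }
}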

Mobile device exposure In order to make the mobile devices publicly accessible on their Wi-Fi networks, a NAT traversal technique mentioned in section 2.5.2 NAT traversal (STUN, UPnP, etc.) is required. A real implementation of this was not included in the scope of this thesis project, due to the limited impact it would have on the results.

3.2.3 Prototype optimization

This section proposes technical optimizations which most likely would be present in a real implementation of either prototype A or prototype B. For the purpose of limiting the number of results, and because of their near insignificant impact in a test environment, they were omitted from the implementations. The following optimizations are intended for the proxy solution, but could be used as well for dispatcher solutions modified to retain a connection between mobile devices and Dispatcher Servers.

Polling algorithm One method that can be used to mitigate connectivity issues is a polling algorithm that sporadically tests a device for connection problems. Since these problems are temporary, the server can be expected to increase its knowledge of which devices are connected and which are not. If a device is found to be unconnected, it is not used until it has been confirmed to be working. Should any device surpass the poll failure limit, it is rendered unusable and disconnected permanently. An example of a polling algorithm is provided below:

DISCONNECT_LIMIT = poll_failure_limit
failed_connections[ num_of_nodes ]

for every device in devices
    response = poll device
    if not( response ){
        failed_connections[ device ]++
        if( failed_connections[ device ] > DISCONNECT_LIMIT ){
            disconnect( device )
        }
    }else if( failed_connections[ device ] > 0 ){
        failed_connections[ device ] = 0
    }

Figure 4: Example polling algorithm

Load balancing (First available) The following algorithm makes sure that the first available mobile device connected to the proxy is chosen. If it cannot find a device that is available, it sleeps, performs a back-off (increases the sleep time) and subsequently tries again. If the timeout reaches its maximum value, it discards the request and terminates the connection.

The method of determining whether a device is busy is different for each prototype solution. For the reverse proxy solution, one can easily see that the device is free to be used whenever it has returned its solution. Regarding the dispatcher solution, one has to rely on statistical guesswork based on average return times (or on a retained connection to the dispatcher server, if present).

Following is the load balancing algorithm for the proxy, written in pseudocode (timeouts are measured in milliseconds):

MAX_TIMEOUT = 1000
timeout = 100

while( true )
    for every device in devices
        if( device.busy ){
            skip
        }else{
            device.busy = true
            process_request
            device.busy = false
            return
        }

    sleep( timeout )
    timeout += 100

    if( timeout > MAX_TIMEOUT ){
        discard_request
        close connection
        return
    }

Figure 5: Example load balancing algorithm

3.3 Performance test of prototypes

This section describes all the different tests, what they measured, and what variables and parameters are included in them, as well as some test setup details. All tests were performed in a group of computer labs at KTH's (the Royal Institute of Technology) main campus at Valhallavägen, at nighttime when there was no other activity on any computer in the labs, in order to minimize sources of interference.

Three metrics that measure how well clients can be served were used. The results were then used to determine the utility and quality of the different server solutions:

Request rate This is the primary metric, which describes how many requests the system can serve during a certain period of time. A high request rate is beneficial.

Response time Response time is the time during which a given request is served. Response time is constituted by transmission times and computation times (the time it takes to receive and solve the given problem instance and return the response). The balance of these two depends on the size of the request/response and the amount of computation a request requires. A high response time is bad if the client expects responses within a certain amount of time.

Request loss Request loss is the rate at which requests are unexpectedly lost, unanswered or have their response interrupted. Request loss is a big factor in a system's stability. If the request loss is too high, the system could be considered unusable. Thus, a low request loss rate is beneficial.

The goal of this thesis was to improve at least one of these metrics with any of the given prototypes.

3.3.1 Test procedure

There are three tests, described below (section 3.4). The purpose of each test is to measure the serving performance, on different scales, of the Proxy solution, the Dispatcher solution and a normal server. The normal server acts as a baseline for evaluating solution performance; it uses the same hardware and request software as the Proxy and Dispatcher servers and the same processing software as the device nodes. Depending on limitations and result relevance, each test involves the relevant server solutions (proxy solution, dispatcher solution and/or normal solution).

Each of the three tests in the test suite is designed to measure the performance of different setups based on their respective server solution (different numbers of nodes). Each server solution is tried with eight input types in order to see how the system performs under differently sized loads, which have different requirements on data transfer and processing. Each input type is sent to the server at varying rates, and each rate is sent continuously for 15 seconds so as to get a mean response rate (during which no other requests with other input types are made). This way, the performance of each server is measured for different types of input sent at varying rates. From this, performance limits can be found for each server setup. The request rates are roughly chosen at first, and when response drops are detected, more granular rates are chosen. This means that rates below a given request rate which was fully responded to can be assumed to be fully responded to as well. The reader should not assume anything regarding parts showing response rate drops, since these could be due to temporary problems. Every unique test (server setup, input type and request rate) is only performed once, due to the large amount of time required. The tool used for sending requests is httperf, version 0.9.0.

Memory:    32GB
Processor: i7-6700 3.4GHz 8-core

Table 1: Client host hardware specification
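For reference, a single 15-second measurement at a fixed rate can be generated from the client host with an httperf command of the following form. The server name, port, URI and rate are placeholders; only the relationship num-conns = rate x 15 s reflects the procedure described above.

httperf --server proxy.example.org --port 8080 --uri /train --rate 50 --num-conns 750 --timeout 5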

The tests yield results showing each server type's capacity for responding to requests (model training) for different setups of the systems.

3.3.2 Implementations & Technology choice

The following libraries, frameworks and implementations were used in the performance tests (see section 3.3 Performance test of prototypes).

Proxy & Dispatcher Server framework The Proxy Server implementation and the Dispatcher Server implementation used an application built with the Jetty library to connect to the mobile devices. Jetty was chosen because it is lightweight.

Deep learning library (Neuroph) The library which was used for training deep learning models in the tests is called Neuroph.1 It is a small and lightweight library that is fast and properly scaled for the magnitude of problems reasonable for HTTP servers and for the scope of the thesis. Neuroph is also Java-based, which enables Android devices to run it.
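The snippet below is a minimal sketch of how a small supervised training task can be expressed with Neuroph's standard MultiLayerPerceptron and DataSet classes. It reflects typical Neuroph usage rather than code from the thesis, and the layer sizes and training data are invented (XOR) for illustration.

import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.MultiLayerPerceptron;

// Illustrative only: train a tiny multilayer perceptron on made-up XOR data.
public class NeurophExample {
    public static void main(String[] args) {
        // 2 inputs, one hidden layer with 3 neurons, 1 output (sizes are made up).
        MultiLayerPerceptron network = new MultiLayerPerceptron(2, 3, 1);

        DataSet trainingSet = new DataSet(2, 1);
        trainingSet.addRow(new double[]{0, 0}, new double[]{0});
        trainingSet.addRow(new double[]{0, 1}, new double[]{1});
        trainingSet.addRow(new double[]{1, 0}, new double[]{1});
        trainingSet.addRow(new double[]{1, 1}, new double[]{0});

        // Runs back-propagation until the learning rule's stop condition is met.
        network.learn(trainingSet);
    }
}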

Platform & Programming language The implementation of the prototypes in this project utilized groups of Android devices as mobile device nodes, due to Android's prevalent usage as a smart phone OS in the world (see section 1.3 Limitations). The programming language for writing Android software was Java. Java was also used for the Proxy Server and Dispatcher Server as well as the Normal Server.

Android emulator The Android emulator was used to emulate the behavior of Android devices without requiring physical devices. This made large scale testing much easier, financially and practically. The devices were emulated on desktop computers running on the operating system (which is more easily customized than other operating systems), in close proximity within the network (so as to minimize problems with the network connection that could affect the results).

The exact hardware of each emulator host machine and the server host was:

1 https://en.wikipedia.org/wiki/Neuroph.

Memory:           32GB
Processor:        i7-6700 3.4GHz 8-core
Android Version:  7.0
Android API:      Galaxy Nexus API 24
Emulator version: 26.0.3.0 (build id 3965150)

Table 2: Emulator host hardware specification

Real Android device The real Android device was utilized as a baseline for the emulated devices in determining performance differences. The exact hardware of the real device used was:

Model name:           SHIELD Tablet K1
Build number:         MRA58K.49349 766.5399
Android version:      6.0.1
Wi-Fi frequency bands: Capable of both 5GHz and 2.4GHz
Wi-Fi version:        B0 6.10.10.4
Wi-Fi reception:      Full or Very Good

Table 3: Real device hardware specification

3.4 Test suite

Below, the three different tests are described. In addition to the given server prototypes (the Dispatcher and Proxy servers) that are tested, the "Normal" server (previously described in section 3.3.1) is also included.

3.4.1 Test 1 – Single device & Return time

The purpose of this test was to determine how well a single real device performs (regarding the metrics described in section 3.3), so as to calibrate the request rates of tests 2 and 3 for any offset posed by using emulated devices instead of real devices. In addition to response rate, processing times and return times during normal loads were also recorded.

The emulator software used on regular computers emulates real devices closely regarding functionality, memory and process restrictions, and less so regarding processing speed. The most central effect of this is that computation speed and data transfer rates are faster. Results from this test were used to calibrate the other tests to a degree where these differences can be disregarded.

Server Test Setups   Nodes
Dispatcher Server    1 Real Device
Proxy Server         1 Real Device
Normal Server        n/a

Table 4: A proxy server with one connected device compared to one dispatcher device

A control solution (a Normal Server with similar software) was tested for comparison. The return time of the dispatcher solution was measured by adding the duration for the dispatcher solution to serve a redirection response to the duration for the mobile device node to serve the response.

3.4.2 Test 2 – 35 emulated proxy nodes & Dispatcher performance extrapolation

The purpose of this test was to measure and compare the serving capacity of the prototypes with access to a maximum of 35 mobile devices, which is a moderately big but yet manageable number of devices. The number of emulated nodes was limited by the number of computers in the computer labs used (Sport & Spel in the D-building at KTH). The Dispatcher server is tested only on its capacity to redirect requests, since it is the bottleneck rather than its connected mobile devices. This is because the devices use independent connections and hardware and are independent of the dispatcher regarding performance (which is not the case for the Proxy server). By extrapolating the successful request rates measured in test 1 to proportional rates for 35 devices, one gets the maximum performance proportional to the combined emulated proxy nodes. The Proxy server, which uses 35 emulated devices, is also tested with request rates limited by the same extrapolation (except that here the successful request rates for the proxy device are used instead). This mitigates any performance increase in the results which comes from using emulated devices.

Server Test Setups   Nodes
Dispatcher Server    0 (x35 extrapolation of test 1)
Proxy Server         35 Emulated Devices
Normal Server        n/a

Table 5: 35 emulated proxy devices (and proxy server) compared to extrapolated data from test 1 for the dispatcher server and a normal server

3.4.3 Test 3 – Dispatcher maximum capacity The purpose of this test was to determine any upper limits to Dispatcher server efficiency (when the server has access to an infinite number of devices) compared to a normal server. This test was performed without any connected nodes, and

because of this, only the Dispatcher server could be tested, because it does not require any connected devices (and is tested through extrapolation as in test 2). The Proxy server is not independent of its devices and therefore could not be included in this particular test, since a large enough number of devices could not be acquired or emulated for a proportional test.

Server Test Setups   Nodes
Dispatcher Server    -
Normal Server        n/a

Table 6: One dispatcher server with an undefined number of devices compared to a normal server

3.5 Test characteristics Below, some deviations and characteristics of the tests are explained.

3.5.1 Connection loss Connection loss occurs when a mobile device (for any reason) is suddenly disconnected while still serving clients, which in turn causes a lower response rate for the solution. It cannot be emulated in any practical or reliable way (since its occurrence can vary depending on multiple factors), although it can be expressed mathematically so as to enable usage recommendations that mitigate it.

Consider the situation where one mobile device continuously connects to the system and is repeatedly sent a continuous stream of requests. Upon connection failure, a new node will connect. This process is similar to a Poisson point process: all disconnect events occur at a constant average rate and are independent of each other. With these assumptions, we can use the exponential distribution to model the time before connection failure in the system for any independent request.

The given distribution is then used to calculate the risk of connection loss for sessions of varying length. Using the cumulative distribution function of the exponential distribution we can calculate the probability of connection loss, given the average disconnect rate λ and the request length x. Having a low rate parameter λ is optimal; in this case the rate parameter is the reciprocal of the average time to failure.

f(λ, x) = 1 − e^(−λx)    (7)

The function f (equation 7) gives the probability of connection loss, i.e. that a request is made and then disrupted because the device goes offline.
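As an illustration, equation 7 can be evaluated with a few lines of Python; the function name and the example rate below are illustrative choices only, not part of the test setup.

import math

def connection_loss_probability(rate, duration):
    """Equation 7: probability that a request of the given duration (seconds)
    is interrupted, for an average disconnect rate (disconnects per second)."""
    return 1.0 - math.exp(-rate * duration)

# A device that disconnects on average once every 600 seconds, serving a
# 1-second request: roughly a 0.17% risk of interruption.
print(connection_loss_probability(1 / 600, 1.0))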

3.5.2 Test input This section describes the input data format for training deep learning models.

Input size (Training data) The first parameter used to vary the data transfer amount was the training data (number of training cases), which produces different amounts of data depending on the size of the model. This was used to train the deep learning models and induce different transfer times. Both small and large input data were used in order to scale upstream data transfer amounts and transmission time. It was denoted by either 1 for a greater amount of input data (100 training sets) or 2 for a smaller amount of input data (10 training sets).

Processing (epoch count) The processing amount was decided by a multiplier variable for the training data. By training the model several times with the same data, processing time is increased by a factor given by the epoch count for a given input data amount. It was varied in two quantities, small and large. This scales processing time without affecting data transfer amounts. It is denoted by either A for a greater number of epochs (20 epochs) or B for a smaller number of epochs (1 epoch).

Output size (model size) The output size was determined by the size and structure of the model used to solve the deep learning problem instance, which was returned once the problem was solved. It was varied in two quantities, small and large, in order to scale downstream bandwidth and transmission time. The notation was either X for a larger model size (12) or Y for a smaller model size (5). This parameter indirectly affects the impact of the other two parameters, by increasing the amount of computation needed for every epoch and the amount of output data required to describe the model. A larger model size thus significantly increases the input size (assuming input and output layers scale with the model) and the processing time, while a smaller model size correspondingly decreases them.

Irregularities Some alterations were made to the permutations produced by the variations of the parameters in order to diversify the results with some interesting combinations. BY1 was given an increased number of training cases (from 100 to 230), a number arbitrarily chosen above twice the original value. BX2 was changed to have a very large model size (from 12 to 30), also arbitrarily chosen above twice its original value. Preliminary tests were also made to make sure the return time was around 1000 milliseconds. See table 7 below for the exact constants used for the problems.

Problem type   Iterations   Model size   Training cases
AX1            20           12           100
AX2            20           12           10
AY1            20           5            100
AY2            20           5            10
BX1            1            12           100
BX2            1            30           10
BY1            1            5            230
BY2            1            5            10

Table 7: This table shows the exact constants used for the problems

3.5.3 Input & Output data format Each problem instance (input data) was formatted as text (since the HTTP protocol is used) when sent to a given server, as was the output data. Figure 6 shows the input data for the problem instance BY2, which is one of the eight problems. As can be seen there, the data is divided into rows delimited by a hash sign (#). The first row contains the number of nodes for each layer (so that an empty neural net can be initialized). The second row contains the number of epochs for the training. All subsequent rows are training cases. Because this is supervised learning, each training case contains a number of input values (determined by the node count of the first layer in the model) and output values (determined by the node count of the last layer). All training case numbers are randomized and range between 100 and 999 (so that they span the maximum number of different values and have a consistent textual length). The precision of the values used for input and expected output within each training case is, however, arbitrary in comparison with the output data.

5 5 5 5 5#
1#
111 206 322 891 605 368 387 510 780 189#
284 510 197 116 314 589 421 523 579 233#
843 848 378 925 353 439 633 280 430 433#
682 420 763 759 750 993 127 542 734 697#
304 424 733 919 971 403 444 600 644 152#
260 555 865 847 178 579 426 701 663 640#
471 858 415 207 975 759 757 400 335 803#
590 501 165 509 492 249 322 232 571 325#
281 520 454 588 857 978 334 970 898 252#
787 948 658 167 755 895 292 410 983 809#

Figure 6: This is an example of input data, which trains a model with an input layer, 3 hidden layers, and an output layer (all with 5 nodes each). It contains 10 training sets.
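As an illustration, the sketch below generates a problem instance in this text format; the layer sizes, epoch count and the 100-999 value range follow the description above, while the function name itself is only illustrative.

import random

def make_problem_instance(layer_sizes, epochs, training_cases):
    """Build the '#'-delimited text format of figure 6: the first row holds the
    node count per layer, the second row the epoch count, and every remaining
    row is one training case (input values followed by expected output values)."""
    rows = [" ".join(str(n) for n in layer_sizes), str(epochs)]
    values_per_case = layer_sizes[0] + layer_sizes[-1]
    for _ in range(training_cases):
        # 3-digit values between 100 and 999, as in the test input
        rows.append(" ".join(str(random.randint(100, 999))
                             for _ in range(values_per_case)))
    return "\n".join(row + "#" for row in rows)

# A BY2-like instance: five layers of five nodes, 1 epoch, 10 training cases.
print(make_problem_instance([5, 5, 5, 5, 5], 1, 10))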

Figure 7 shows the output data format. The output data is constituted by the weights of the deep learning model. The higher the precision of the double values used to represent the weights, the smaller the risk of errors in the model. Since full precision could be achieved, all weights retained their full values in binary form encoded with Base64 instead of being sent as human-readable text. This leaves it to future experiments or implementations to lower this precision to gain efficiency. Each weight is represented by its double value, given by 8 bytes encoded in Base64, which converts it to 12 characters of text.

Model:

I: 0.5963269668489829 -0.5773563988328152 -0.36854180790775104 0.34076640289355725 -0.06029018744481197 I: -0.41335387271920293 0.4355530805367237 -0.5653864643987341 -0.011648047498119974 0.3860283272559248 I: 0.32519723336863415 -0.6037853527811352 0.3008145106619692 0.19882692307133204 0.41618746690795366 I: 0.0030771644720085124 0.5432799562655166 -0.018140242210702512 0.41903688954802976 -0.16000102256623708 I: 0.5436789494563375 -0.31556026909084617 0.03279567193220105 0.5168244604224853 -0.0993124070993491

N: 0.27126912905722395 -0.3780250415976713 0.3288427895481574 -0.3264512762445262 -0.4111175849923274 N: -0.4153863288201617 -0.39224388529818893 0.08700066356444638 0.2532161131391141 -0.47765369211130554 N: 0.4536792362261517 0.19644794440801666 -0.048932617174689844 0.3672967481908683 0.21773578532244048 N: 0.006070568284929401 0.4227931611867903 0.718307574651647 -0.6182244313756685 0.0029909881392378504 N: 0.5450915522634828 4.797029322664015E-4 0.22389711364766834 0.25560318785529806 0.2501130077792194

N: 0.4609784750179625 -0.4418497650364393 0.4682043122519749 0.33198173644543205 -0.335491241232203 N: 0.7142879144165846 -0.02121460865621393 0.6242051475401544 0.12280807579200052 0.5240144841874077 N: 0.43831654413380017 0.7501785830379304 0.43654128592924374 0.5683728207769788 0.22804623066459503 N: -0.045177748740757626 0.018426511988495653 -0.5281904879018504 -0.04328569358034681 0.6982116672390705 N: -0.4656791922075333 -0.5987905425941205 -0.4947181365823776 0.6120817728533469 -0.2403439390950228

N: -0.02754580130965872 0.5635167442084779 -0.09614625757959498 0.23609546477320845 0.5464424129089455 N: 0.39406503493594075 -0.03451804431010153 0.6824704847208903 -0.45060198708925264 -0.1569098018913967 N: -0.011743086808803167 -0.22106818661230976 0.49908474950637227 -0.01058463784764623 0.09483905437019124 N: 0.5384521078144339 0.250894015123432 -0.2240363799682084 0.10479783899529263 -0.1328714674996119 N: 0.07715566810780654 -0.6761660959400827 0.3253698821452199 -0.5651429505917629 -0.6072201475994962

Figure 7: This is an example of decoded and slightly formatted output data, which contains weights for 5 layers (each row starting with “I:” represents an input node and its weights, each row starting with “N:” represents a hidden node and its weights). In a real output, only the weight data, in binary form converted to Base64, would be included.
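The weight encoding described above (an 8-byte IEEE 754 double becoming 12 Base64 characters) can be reproduced as in the sketch below; the byte order is not stated in the report, so big-endian is assumed here.

import base64
import struct

def encode_weight(weight):
    """Pack one double into 8 bytes and encode it as Base64 (12 characters)."""
    raw = struct.pack(">d", weight)   # big-endian is an assumption
    return base64.b64encode(raw).decode("ascii")

def decode_weight(text):
    """Inverse of encode_weight: recover the full-precision double."""
    return struct.unpack(">d", base64.b64decode(text))[0]

encoded = encode_weight(0.5963269668489829)
print(encoded, len(encoded))   # 12 characters of text
print(decode_weight(encoded))  # the full precision is preserved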

Data transfer sizes Figure 8 displays the amounts of data that are sent upstream from the client (httperf) to the different solutions and the amounts of data sent back downstream. Given the “model size” value (which is the number of input nodes, layers in the model, and output nodes) together with the number of training cases, the input data amount can easily be reproduced. The output data amount scales with the number of weights. A more exact description can be found in section 3.6.4 How to calculate output time.

3.6 Effectiveness formula When the effectiveness for each input type has been determined, one can construct a formula for effectiveness prediction, which can be used to extrapolate and make estimates about efficiency for untested input types.

When a request containing any input type is sent to a server of any type, one can expect it to need more or less time to complete, depending on how fast the server processes training and how fast it reads and writes data to and from the client. The formula is an approximation of how the input parameters make up the time demands. It is constructed by using the parameters from the different input types together with the measured time needed per request (which is the inverse of the serving rate for each particular input type).

The three primary factors of time consumption are input time, processing time and output time. The input time represents the time to receive and parse the HTTP request from the network socket. The processing time represents the time required to set up the problem, allocate memory for the data structures and perform the computations themselves. The output time represents the time to write the HTTP response to the network socket.

The serving rate can be expressed as a linear regression model, where i is proportional to the time needed for input, p is proportional to the processing time, o is proportional to the output time, and a, b and c are weights. In order to normalize the result, we utilize a multiplier m and an overhead h to form the rate r:

r = m / (ai + bp + co + h)    (8)

Two parameters used for modelling the characteristics of the particular TCP connections (which are used to calculate the input time i and the output time o before the linear regression is applied) are not known, namely the additive increase increment and the maximum transfer rate; however, all other parameters required to calculate the formula are known.

The first step is that random values for the additive increase increment and the maximum transfer rate are chosen. This enables the calculation of input time, processing time and output time. When these are calculated, their weights can be calculated using the least-squares method in order to estimate their relative size coefficients.

The slightly less impactful values for overhead and multiplier are then randomly chosen, as with the additive increase increment and the maximum transfer rate. The last step is calculating a serving rate. This serving rate is then compared to the real serving rate (given the same input parameters), which is used to calculate an error.

By repeating this process, and randomly choosing different values for all the unknown variables (except for the weights), we can minimize this error. The assumption used here is that the better the fit produced, the better the variables and the formula. Any produced variables can be inspected for greater abnormalities, which ultimately gives us a crude but good-enough tool for viewing patterns in serving rates based on different parameters (considering the lack of time and data). A sketch of this fitting procedure is given below.
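The following is a minimal sketch of this search-and-fit loop. It assumes a list of measured (input bytes, processing amount, output bytes, serving rate) tuples and a transfer_time helper like the one sketched in section 3.6.1; the search ranges, the function name and the iteration count are arbitrary assumptions, not the exact procedure used in the thesis.

import random
import numpy as np

def fit_serving_rate_model(measurements, transfer_time, iterations=10000):
    """Random-search fitting of the unknown parameters in equation 8.
    `measurements` holds (input_bytes, processing_ops, output_bytes, rate) tuples."""
    best_error, best_params = None, None
    for _ in range(iterations):
        # Randomly guess the unknown TCP and normalization parameters
        # (the ranges below are arbitrary assumptions).
        k = random.uniform(0.01, 100.0)   # additive increase increment
        d = random.uniform(0.1, 1000.0)   # maximum transfer rate
        h = random.uniform(0.0, 1e5)      # overhead
        m = random.uniform(1.0, 1e7)      # multiplier

        rows, targets = [], []
        for in_bytes, ops, out_bytes, rate in measurements:
            i = transfer_time(in_bytes, k, d)   # input time, section 3.6.2
            p = ops                             # processing, section 3.6.3
            o = transfer_time(out_bytes, k, d)  # output time, section 3.6.4
            rows.append([i, p, o])
            # r = m / (a*i + b*p + c*o + h)  <=>  a*i + b*p + c*o = m/r - h
            targets.append(m / rate - h)

        # Least-squares fit of the weights a, b, c for this parameter guess.
        (a, b, c), *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)

        predicted = [m / (a * i + b * p + c * o + h) for i, p, o in rows]
        error = sum((pr - meas[3]) ** 2 for pr, meas in zip(predicted, measurements))
        if best_error is None or error < best_error:
            best_error, best_params = error, dict(a=a, b=b, c=c, k=k, d=d, h=h, m=m)
    return best_params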

3.6.1 How to calculate transfer time of a given number of bytes Since transfer is done through TCP connections, they incorporate mechanisms to optimize transfer speeds by using slow start, additive increase and multiplicative decrease (as described in the background). For both input and output we use a formula that takes slow start and additive increase into account (though not multiplicative decrease, which is instead replaced with a maximum data transfer speed constant). It is a function of the number of bytes, the additive increase parameter and the maximum transfer rate parameter, so that data transfers become more effective as time progresses.

The reason for this is that requests with very large amounts of data have a different average transfer speed, which is hard to reconcile with smaller average transfer speeds, so the model must take this into account when calculating time in order to minimize its error.

The model assumes a linear increase in the transfer rate up to a constant maximum transfer rate, instead of a continuously peaking and decreasing transfer rate (the assumed transfer rate would on average match the continually increasing and decreasing transfer rate peaks).

When using additive increase to calculate the transfer speed, we construct a linear function of transfer speed:

f(t) = k ∗ t (9)

where k is the additive increase factor and t is the time. We also define a maximum transfer speed d. The time needed to reach this maximum transfer speed is d/k. The number of transferred bytes b is given by the integral F of the function f:

F(t) = k ∗ t² / 2    (10)

under the assumption that the number of previously transferred bytes is 0. The amount of time needed to transfer a certain number of bytes b using additive increase is therefore:

t(b) = √(2b/k)    (11)

The number of bytes needed to reach the maximum transfer rate is calculated:

√(2b/k) = d/k  =>  b = d²/(2k)    (12)

When calculating the time needed to transfer all bytes, the maximum transfer rate has either been reached or not. If it has not, t(b) is used. If it has, the time to reach the transfer speed limit is added to the time it takes to transfer the remaining bytes at constant speed:

d/k + (b − d²/(2k))/d = b/d + d/(2k)    (13)

t = b/d + d/(2k)   if b > d²/(2k)
t = √(2b/k)        if b <= d²/(2k)    (14)
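Equation 14 translates directly into a small function; the parameter names follow the derivation above (k for the additive increase factor, d for the maximum transfer rate) and the example values are arbitrary.

import math

def transfer_time(b, k, d):
    """Time to transfer b bytes with a linearly increasing (additive increase)
    transfer rate that is capped at a maximum rate d, as in equation 14."""
    ramp_up_bytes = d ** 2 / (2 * k)   # bytes sent before the maximum rate is reached
    if b > ramp_up_bytes:
        return b / d + d / (2 * k)     # equation 13: ramp-up plus constant-rate phase
    return math.sqrt(2 * b / k)        # equation 11: still ramping up

# Example with arbitrary parameter values: 10 000 bytes, k = 0.7, d = 9.8.
print(transfer_time(10_000, 0.7, 9.8))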

3.6.2 How to calculate input time The input time is based solely on the number of bytes transferred upstream to the server and is produced by the method for calculating the transfer time of a given number of bytes. The number of bytes is made up of a description of the model and of the training cases. The description starts with a number of integers describing each layer's node count (note the digit count function, which is used to count the number of digit bytes), then one integer describing

the epoch count. The training cases in turn consist of a number of 3-digit integers; this number of integers is the number of input nodes added to the number of output nodes. Since each integer is 3 digits, 3 bytes are needed to represent it. For every number there is also a delimiter (either '#' or a blank space). In the following, the number of input bytes is given as a function b taking a vector ⃗v containing each layer's node count, an integer e containing the epoch count, and a matrix M containing r rows of training case integers. The function d(x) gives the number of bytes required to represent an integer x.

b(⃗v, e, M) = d(e) + Σ_{i=1}^{n} d(v_i) + Σ_{j=1}^{r} Σ_{i=1}^{v_1+v_n} d(m_{ij})    (15)
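Equation 15 can be sketched as follows; note that, like the equation, it counts only the digit bytes and not the delimiters mentioned above.

def d(x):
    """Number of bytes (characters) needed to represent the integer x."""
    return len(str(x))

def input_bytes(v, e, M):
    """Equation 15: v is the list of layer node counts, e the epoch count and
    M the training cases (one row of v[0] + v[-1] integers per case)."""
    return d(e) + sum(d(vi) for vi in v) + sum(d(m) for row in M for m in row)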

3.6.3 How to calculate processing time The amount of processing is determined by the following formula, based on the definition of the deep learning forward and backward pass (see section 2.2.1 Multilayered feedforward neural networks). Here, the exact number of CPU operations is irrelevant, since we only want a relative measurement unit of processing. The node error e_j(n) can be considered constant for each node j, since it requires the same number of computations every time. This makes ϵ(n) require j operations, since it scales with the number of nodes in adjacent layers. The weighted sum for the j:th node, v_j, scales with the number of nodes i. The output y_i for node i is determined by a previously calculated value, which makes the weight change ∆w a constant number of operations c. This means that between each layer n (containing j nodes) and layer n + 1 (containing i nodes), the number of operations can be expressed as an equation when disregarding the small and constant terms:

(n − 1) ∗ j ∗ i ∗ c (16)

3.6.4 How to calculate output time The output time is produced by the method for calculating the transfer time of a given number of bytes. The number of output bytes is calculated from the number of weights between the n layers (each containing j nodes) and their following layer (with i nodes). Each weight uses one double value, which is represented by 8 bytes (which in Base64 representation needs about 33% extra bytes). The formula for the total number of output bytes (when also including

the node counter integers for each layer and a layer count) is:

(1 + 1/3) ∗ (n − 1) ∗ j ∗ i ∗ 8    (17)
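Equation 17 in code form; the one-third term accounts for the Base64 expansion of the 8-byte doubles.

def output_bytes(n, j, i):
    """Equation 17: (n - 1) * j * i weights, 8 bytes each, plus roughly 33%
    Base64 overhead."""
    return (1 + 1 / 3) * (n - 1) * j * i * 8

# Example: a BY2-sized model (5 layers of 5 nodes) gives about 1067 bytes.
print(output_bytes(5, 5, 5))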

3.6.5 Comparing performance in different platforms Given the fitting procedure and the formula, we can then calculate the efficiency of both the prototypes and the normal server. Since the dispatcher prototype is expected to be significantly more efficient, it is used as a benchmark against the normal server. The relative performance is produced by dividing the performance of the normal server by the performance of the dispatcher node.

4 Results

This section contains graphical data on the performance of the tests performed and efficiency comparisons based on the tests. All graphs which compare request rate and response rate are measured per second.

Note that some values are interpolated between data points, mostly where the response rate is 100% (under the assumption that any request rate lower than one which is served with 100% efficiency will also be served with 100% efficiency). A request rate must have a 99% response rate in order to be considered a valid capacity for a given test.

4.1 Test 1 - Dispatcher/Proxy single device capacity comparison This section includes small-scale test results which demonstrate the effectiveness of one mobile node. This is useful for creating projections on larger scales, for unit baseline measurement and for future development. The eight problem types described in table 7 are used as load.

Figure 9: AX1 input sent to a Dispatcher Server and a Proxy Server, each with only one node connected.

As can be observed in figure 9, both solutions handled one request successfully, and then became overloaded.

Figure 10: AX2 input sent to a Dispatcher Server and a Proxy Server, each with only one node connected.

Figure 10 reveals slightly better Dispatcher performance, which peaked at 10 handled requests, while the Proxy Server handled 8 requests.

Figure 11: AY1 input sent to a Dispatcher Server and a Proxy Server, each with only one node connected.

Figure 11 shows that the Dispatcher Server handled up to 13 requests and the proxy handled 8 requests.

Figure 12: AY2 input sent to a Dispatcher Server and a Proxy Server, each with only one node connected.

Figure 12 reveals that the Dispatcher Server handled almost 5 times more requests than the Proxy Server (up to 53 requests compared to 11 requests).

Figure 13: BX1 input sent to a Dispatcher Server and a Proxy Server, each with only one node connected.

According to figure 13, the Dispatcher handled up to 13 requests when the proxy handled up to 10 requests.

Figure 14: BX2 input sent to a Dispatcher Server and a Proxy Server, each with only one node connected.

Figure 14 reveals that the Dispatcher served up to twice as many requests as the Proxy Server (1 and 2 requests for the Proxy and Dispatcher, respectively).

Figure 15: BY1 input sent to a Dispatcher Server and a Proxy Server, each with only one node connected.

Figure 15 shows that the Dispatcher was almost twice as effective as the Proxy solution (17 requests compared to 9, respectively).

Figure 16: BY2 input sent to a Dispatcher Server and a Proxy Server, each with only one node connected.

Figure 16 shows that the Dispatcher Server served 79 requests (more than 4 times as many) whilst the Proxy Server successfully served nearly 18 requests.

4.2 Test 2 - 35 device capacity comparison with Normal Server This section demonstrates the effectiveness of the server types with 35 nodes, using the same input as in test 1. Note that the following section shows results for the Proxy Server type, which used emulated devices. It has only been tested with the capacity the real device could effectively manage (extrapolated to 35 devices). A few results showed no definite maximum, but these are represented at the end of the Proxy Server graph lines. Also included for comparison are the extrapolated dispatcher node performances.

Figure 17: AX1 input sent to a Dispatcher Server, a Proxy Server and a Normal Server to measure the maximum capacity. Proxy and Dispatcher are computed with 35 nodes connected.

Figure 17 shows that the Normal Server greatly outperformed the other servers with nearly 195 successfully served requests, compared to 35 requests for both the Proxy and the Dispatcher. It should be noted that only values up to 35 are valid, since higher request rates than that have not been shown to perform proportionally well when testing with a real device.

Figure 18: AX2 input sent to a Dispatcher Server, a Proxy Server and a Normal Server to measure the maximum capacity. Proxy and Dispatcher are computed with 35 nodes connected.

Figure 18 shows that the Normal Server outperformed the other servers greatly with nearly 1350 successfully served requests, compared to around 280 requests and 350 requests for the Proxy and Dispatcher respectively.

Figure 19: AY1 input sent to a Dispatcher Server, a Proxy Server and a Normal Server to measure the maximum capacity. Proxy and Dispatcher are computed with 35 nodes connected.

According to figure 19, the Normal Server outperformed the other servers greatly with nearly 1800 successfully served requests, compared to around 280 requests and 455 requests for the Proxy and Dispatcher respectively.

Figure 20: AY2 input sent to a Dispatcher Server, a Proxy Server and a Normal Server to measure the maximum capacity. Proxy and Dispatcher are computed with 35 nodes connected.

Figure 20 reveals the Normal Server outperformed the other servers greatly with nearly 5000 successfully served requests, compared to around 385 requests and 1855 requests for the Proxy and Dispatcher respectively.

Figure 21: BX1 input sent to a Dispatcher Server, a Proxy Server and a Normal Server to measure the maximum capacity. Proxy and Dispatcher are computed with 35 nodes connected.

Figure 21 shows that the Normal Server outperformed the other servers greatly with nearly 2100 successfully served requests, compared to around 350 requests and 455 requests for the Proxy and Dispatcher respectively.

Figure 22: BX2 input sent to a Dispatcher Server, a Proxy Server and a Normal Server to measure the maximum capacity. Proxy and Dispatcher are computed with 35 nodes connected.

According to figure 22, the Normal Server outperformed the other servers greatly with nearly 180 successfully served requests, compared to around 8 requests and 70 requests for the Proxy and Dispatcher respectively.

Figure 23: BY1 input sent to a Dispatcher Server, a Proxy Server and a Normal Server to measure the maximum capacity. Proxy and Dispatcher are computed with 35 nodes connected.

Figure 23 shows that the Normal Server greatly outperformed the other servers with around 4500 successfully served requests, compared to nearly 350 and 595 requests for the Proxy and Dispatcher respectively.

Figure 24: BY2 input sent to a Dispatcher Server, a Proxy Server and a Normal Server to measure the maximum capacity. Proxy and Dispatcher are computed with 35 nodes connected.

Figure 24 shows that the Normal Server greatly outperformed the other servers with nearly 5000 successfully served requests, compared to around 420 requests and 2450 requests for the Proxy and Dispatcher respectively.

4.3 Test 3 - Dispatcher & Normal maximum capacity comparison This section displays the results of the tests performed on the Normal and Dispatcher Server. The request rate was maximized, in order to try to show the maximum capacity of each respective solution. Please note that many results have a gradual efficiency decline, which makes their response rate curves appear more efficient than they are. All mentioned response rates have at least 99% efficiency.

Figure 25: AX1 input sent to a Dispatcher Server and a Normal Server with a maximized request rate.

According to figure 25, the Normal Server successfully handled up to 195 requests. The Dispatcher handled up to about 3000 requests.

Figure 26: AX2 input sent to a Dispatcher Server and a Normal Server with a maximized request rate.

Figure 26 shows that the Normal Server successfully handled up to about 1350 requests, while the Dispatcher handled up to about 5000 requests.

Figure 27: AY1 input sent to a Dispatcher Server and a Normal Server with a maximized request rate.

Figure 27 reveals that the Normal Server successfully handled up to about 1800 requests, while the Dispatcher handled up to 3000 requests.

Figure 28: AY2 input sent to a Dispatcher Server and a Normal Server with a maximized request rate.

According to figure 28, the Normal Server successfully handled up to about 5000 requests, while the Dispatcher handled up to 4000 requests.

Figure 29: BX1 input sent to a Dispatcher Server and a Normal Server with a maximized request rate.

Figure 29 shows that the Normal Server successfully handled up to about 2100 requests, while the Dispatcher handled up to 2000 requests.

Figure 30: BX2 input sent to a Dispatcher Server and a Normal Server with a maximized request rate.

Figure 30 reveals that the Normal Server successfully handled up to about 180 requests, while the Dispatcher handled up to 4000 requests.

Figure 31: BY1 input sent to a Dispatcher Server and a Normal Server with a maximized request rate.

According to figure 31, the Normal Server successfully handled up to about 4700 requests, while the Dispatcher handled up to 4000 requests.

Figure 32: BY2 input sent to a Dispatcher Server and a Normal Server with a maximized request rate.

Figure 32 shows that the Normal Server successfully handled up to about 5000 requests, while the Dispatcher also handled up to 5000 requests.

4.4 Problem specific data In this section, results regarding the execution of the problems are presented. As can be seen in figure 33, AX1 had a calculation time of around 1000 ms for Dispatcher nodes. BX2 took slightly more than half of that, whilst the others took nearly 10% of AX1. The least demanding problem instance is BY2, at around 2% of AX1. Figure 34 reveals that AX1 had the highest return time, with nearly 1500 ms for the Proxy solution and above 1100 ms for the Dispatcher solution. BX2 had nearly 1000 ms and 700 ms for the Proxy and Dispatcher solutions, respectively. All other problem instances had less than 200 ms for the Dispatcher solution, and between around 100 ms and 400 ms for the Proxy solution.

Figure 33: Processing time for each problem type when using a dispatcher node. Note that this is fairly representative also for the proxy node since they utilize the same processing hardware and software.

Figure 34: The average return time for each problem type and each server type when there is no other load acting on the server.

4.5 Connection loss probabilities Below, the calculated probabilities for interruption (using the formula described in equation 7) can be seen, i.e. the probability of being spontaneously disconnected from a device that is being queried.

Each row (except the top-most one, and except the left-most value) contains failure probabilities for a given request duration, where each value corresponds to a specific average online time (the inverse of the interruption rate λ).

          λ=1.0    λ=0.1   λ=0.03   λ=0.017  λ=0.0083  λ=0.003  λ=0.0017
x=0.001   0.1%     0.01%   0.0033%  0.0017%  0.0008%   0.0003%  0.0002%
x=0.01    0.995%   0.1%    0.033%   0.017%   0.0083%   0.0033%  0.0017%
x=0.1     9.5%     1.0%    0.33%    0.17%    0.083%    0.033%   0.017%
x=1.0     63%      9.5%    3.3%     1.7%     0.83%     0.33%    0.17%
x=2.0     86%      18%     6.4%     3.3%     1.7%      0.66%    0.33%
x=3.0     95%      26%     9.5%     4.9%     2.4%      1.0%     0.5%
x=5.0     99%      39%     15%      8.0%     4.1%      1.7%     0.83%
x=10.0    100%     63%     28%      15%      8.0%      3.3%     1.7%

Table 8: Total probability of interruption for a given online duration and re- quest duration. The leftmost column describes the average request period x in seconds, the topmost row describes the average interruption rate λ (in interrup- tions per second) for a mobile device. The values are described with up to two significant figures.

4.6 Efficiency comparison This comparison uses data from all the Normal server results and the Dispatcher node results (the request rates used are those which can be served with at least 99% efficiency). Both are modelled within 30% of their measured performances.

The Dispatcher node was fitted with an overhead value (h = 17358), a multiplier (m = 1197890), a max transfer speed value (d = 9.8) and an additive increase value (k = 0.70). Input, output and processing were balanced with the following weights (a = 27.0, b = 14.4 and c = 0.273).

Figure 35: A comparison of the real performance and the performance predicted by the model for the Dispatcher node

The Normal server was fitted with an overhead value (h = 139.76), a multiplier (m = 1266391), a max transfer speed value (d = 824) and an additive increase value (k = 74.7). Input, output and processing were balanced with the following weights (a = 0.575, b = 14.8 and c = 0.00136).

Figure 36: A comparison of the real performance and the performance predicted by the model for the Normal server

When using the respective models to compare the efficiency of the dispatcher and the normal server, the following heatmaps were created (with 1, 10 and 20 epochs, respectively):

Figure 37: Heatmaps with the number of Dispatcher nodes needed to match the performance of the Normal server. The gradient bar to the right of each figure displays the exact number of devices needed.
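As a usage example, the fitted values above can be plugged back into equation 8 and the ratio from section 3.6.5 computed. The input, processing and output values below are placeholders; in the real procedure, each model's i and o are derived from its own fitted k and d via the transfer-time formula in section 3.6.1.

def serving_rate(i, p, o, a, b, c, h, m):
    """Equation 8: predicted serving rate r = m / (a*i + b*p + c*o + h)."""
    return m / (a * i + b * p + c * o + h)

# Fitted parameters as reported above (decimal points assumed).
dispatcher = dict(a=27.0, b=14.4, c=0.273, h=17358, m=1197890)
normal = dict(a=0.575, b=14.8, c=0.00136, h=139.76, m=1266391)

# Placeholder i, p, o values for one hypothetical problem instance.
i, p, o = 120.0, 3000.0, 900.0
r_normal = serving_rate(i, p, o, **normal)
r_dispatcher = serving_rate(i, p, o, **dispatcher)

# Section 3.6.5: relative performance = normal-server rate / dispatcher rate,
# i.e. roughly how many dispatcher nodes are needed to match the normal server.
print(r_normal / r_dispatcher)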

5 Discussion

This section analyzes and discusses the results in order to explain and simplify them.

5.1 Dispatcher vs. Normal server When first comparing the capacity of the normal server and the dispatcher server, the biggest difference in relative performance is for BX2 and AX1, which are both big input types (AX1 has 10 times the processing of any other type and BX2 has about 10 times more data transfer than any other type). The Dispatcher server handles requests 15 times faster for AX1 input and 22 times faster for BX2 input compared to the normal server. Other input types are just 3 times faster or less for the Dispatcher server. AX1 and BX2 are thus especially suitable input types when comparing the server elements. One especially notable detail about the three most effective input types (in terms of redirectability) is that they also have the biggest return times (as a consequence of being bigger requests).

The potential speedup is calculated from the number of redirects possible divided by the number of responses made by the normal server during a given amount of time. Multiplying by the number of dispatcher nodes needed to match the normal server's capacity for BX2 input (90) yields just under 2000 devices. As the request is served faster and the time to complete the response converges towards the time required to redirect a client, the potential speedup decreases.

When observing the heatmap figures, one can see that requests with smaller model sizes (the model size parameter as defined in section 3.5.2 Test input) generally are more effective than the other input types (when sent to a dispatcher node), and bigger model sizes are effective only with smaller numbers of training cases. If one looks carefully, there are diagonal lines in the heatmaps which suggest that bigger model sizes might be more effective. Problem instances with a higher number of epochs also seem to be relatively less effective when using the dispatcher.

5.2 Proxy vs. Dispatcher The effectiveness of the proxy solution compared to the dispatcher at a 99% response rate was the same for AX1 and only 20% for AY2. This is the performance of the proxy solution with only one device, but since there is no clear reason the server would be able to utilize its devices more efficiently when it has a higher number of devices connected, it can be considered inferior. With 35 emulated devices for the proxy solution, a request rate 35 times the highest successful request rate for one device was successfully handled for all input types except BX2, where the performance dramatically drops.

This is not surprising, since the amount of data transferred is enormous and likely congests the proxy. This is not evidence either that it can replace a single server or that it is too weak, which means that only larger-scale projects will be able to properly answer this (since a device ratio of at least around 90 devices per server is needed to match the normal server's performance). The Proxy has greater advantages when it comes to computing in parallel, since it can coordinate the division of work as well as compile answers. Applications for this could be tasks which require interruption recovery and reliability where computational power is the limiting factor (not data transfer). If one cared only about return time and also implemented parallel computing, the proxy would be superior.

5.3 Connection overhead The process of connecting devices could be a cause of performance issues if the average online time is low. Connecting a device is a matter of performing a TCP connection and transferring data on the order of 100 bytes. This is less than the smallest input data, and should logically therefore only take milliseconds. Compared to online times of more than a couple of seconds, this makes the effort of synchronization less than 1 part per thousand of the capacity offered, and it can thus be considered a comparatively negligible overhead.

Should the connection be interrupted, the request needs to be resent by the client (or restarted by the proxy). As can be seen in table 8, one should make sure to keep devices online for a minimum of 10 minutes (600 seconds) when requests have a duration of 1 second (which was the maximum request duration used in the tests) to decrease the risk of spontaneous connection loss to below 0.17%. The proxy solution can recover from this, although it would sometimes increase response time.

5.4 Efficiency model evaluation The efficiency model is able to predict the serving rate within 30% of its correct value. Though there are many unknown parameters and variables, it is still a useful tool for making rough estimates, but it requires empirical data as well as time for fitting possible values to the unknown variables before every use. The noise that could have disturbed the results is limited to Wi-Fi disturbances only, as no other resource-demanding background process was executed in either part of the test setup. A 30% performance difference caused by a Wi-Fi connection could be reasonable, but for the normal server (which uses only cabled connections) this is not relevant. Since the models only use results which are limited to 99% effectiveness, every setup's performance is skewed by its respective efficiency decline. Consider for example a case where one setup serves 10 requests per second with 99% efficiency, and 11 requests per second with 10% efficiency,

while another setup serves 10 requests per second with 99% efficiency, and 20 requests per second with 98% efficiency. Even though the capacity is the same under the efficiency condition, they differ hugely in the number of requests they are able to serve. Any future model should take this into account.

One should also observe that the number of mobile devices needed to match the performance of a normal server increases with a higher number of epochs. This is most likely due to request timeouts, since a mobile device takes longer to complete requests when tasked with multiple simultaneous requests.

5.5 Report weaknesses One weakness of this thesis is the stable, high-performance network connectivity in the tests (due to the tests being performed in a local area network). Considering that all solutions are intended to be used over the Internet, the test results could be considered optimistic, since all requests should be faster in the tests compared to a real-world application.

This thesis utilized only one real device for a test whose results were extrapolated. In a real-world application, the device diversity found among the volunteers should reflect consumer popularity more. It is unreasonable that one single device type should closely represent all possible devices (in terms of average processing capacity, communication speed, etc.). To acquire the best possible result, one should have used a set of real devices instead of emulators altogether. This would have eliminated any discrepancies that may be found in this report.

The version of httperf used in this thesis had a maximum request size of around 10 kB, which in turn limits the size of the responses that can be produced. Had another program been used for load generation, perhaps other relevant results would have appeared.

5.6 Possible improvements & Future work Future work should focus on increasing the practical instance size limit for prototypes A and B. This could be done using, for instance, peer-to-peer communication, which could alleviate the work of dividing the problems and compiling the results. Future work should also consider additional problem instances for deep learning as well as other problem classes than deep learning.

The scale of the tests performed within this thesis could have been more substantial, which implies having the proxy solution tested with more than 35 devices. The results suggest that there could be more relevant results with up to 10 times more devices.

Should the problems be scaled up, they could be computed using a longer expected response time. Due to the various timeout limits for different implementations of HTTP clients, which range between 30 and 300 seconds, the requests could be between 10 and 100 times bigger. This would probably not be enough for any heavy deep learning applications such as image recognition, but might be applicable to problems similar in size to the modelling of complex mathematical functions.

With the performance shown in the results, about 80-90 mobile devices (with performance similar to the Nvidia SHIELD K1 tablet) are required to match a server with 32 GB memory and an i7 processor. Such a server can be purchased for approximately 9 000 SEK. This would mean that every mobile device would be worth approximately 100 SEK over the lifespan of a server machine. Of course, this disregards infrastructure costs such as internet bandwidth and electricity.

The hardware used for the Proxy and Dispatcher Server was relatively powerful in comparison with other hardware. The value of the assisting devices varies under the assumption that low-performance hardware has a better performance-to-cost ratio than high-performance hardware. If the price of servers, for example, were to exhibit quadratic growth in relation to performance, the value of each assisting device would be higher when targeting higher performance. Should a cheaper server prove to have a better response rate in comparison to the cost of the hardware (when utilizing assisting mobile devices), the comparable value of each mobile device would be higher.

One method of improving this system would be to let the device nodes utilize their GPU hardware to solve the problem instances. As mentioned in section 2.7 GPU vs CPU for computing deep learning, there would likely be a clear improvement, since GPUs naturally are more adapted to the given kind of computation.

Another possible improvement could be a more efficient encoding for delivering double values over HTTP. Since this is a text-based protocol, no easy method to compress this exists. Although this would probably incur a small time penalty for encoding/decoding, it might improve performance.

Security is also a point that should be regarded in any practical implementation of this system. Since the system deals in connecting vast numbers of mobile devices to a single server, one should remember that there are ample opportunities for attackers. To mitigate the risks, one could utilize encryption for in-transit data, which would prevent any attacker from easily eavesdropping on sessions. This is given that the data sent is worth protecting; otherwise one should remember that adding encryption to server nodes that are taxed enough already will affect performance. The system is also partly open for review and modification, since the app is distributed. To solve this, one would need to validate the identities of the nodes before using them to compute

important answers. The risk of having a malicious attacker in the system could be mitigated by using some kind of identity verification before admitting a node to the system. Overall, one should consider these systems to be an extension of any general HTTP server, which should make the reader consider solutions common to other types of HTTP servers.

5.7 Deep learning & other problem classes There are no obvious reasons why other problem classes would not work with this solution. One qualifying criterion for computing a problem using these solutions is ensuring that the bottleneck is constituted by the number of computations and the size of the downstream data, rather than the size of the upstream data. As long as those conditions are met, any problem class would probably qualify, perhaps even better than deep learning. This thesis has not considered how much deep learning can be parallelized, as in other volunteer computing systems (e.g. Folding@home). One should take this into consideration, as it would have the potential to also reduce response time.

The system was only tested using smaller problem instances, which is fitting considering the hardware limitations. This means that it is not likely that it can be a substitute for a conventional server system for solving bigger problem instances.

5.8 Using mobile devices vs Regular computers This report focuses exclusively on mobile devices. Should regular computing hardware be used in addition, under the same circumstances, one could easily observe that computing power, memory, and bandwidth/transfer speed would all be greatly increased. The difference otherwise (in terms of NAT issues and connection to the servers) would probably not be that substantial, as both are computers connected to the internet through the same gateway.

5.9 Commercialization In a real-world application, hardware would probably be more diverse, and the performance of other devices can only be speculated upon. Devices with other operating systems might be more or less efficient depending on how well the software is adapted to the operating system. This project used Android, but if it had for example been done with iOS, performance would probably have differed, since they have different OS kernels.

A commercial implementation could upset the balance among other solutions offering server processing power and keep prices down, as well as offer cheaper solutions for organizations requiring hosting capacity with use cases similar to those described in this thesis.

6 Conclusion

The scientific question of this thesis was formulated as follows: “Can mobile devices be used in a seamless manner to assist regular HTTP servers tasked with solving deep learning training problems and in so doing increase the collective serving capacity?”.

From what has been observed, some positive results were obtained when calculating deep learning problem instances, although negative results were observed as well. The highest increase in efficiency when using the dispatcher solution was measured at up to 2100% (given a 99% response rate for the given request rate), provided just below 2000 volunteer mobile devices. The produced results and subsequent analysis suggest positive results under certain conditions, such as scaling more in model size than in input amount or epoch count. If the mobile devices are to assume the computational burdens of the server, at least 80-90 devices (for the problem instances and hardware used in this thesis) should be used in order to gain capacity compared to using a regular server only. Other capacity proportions may exist for other server types.

A model was constructed which could be used to determine the response rate, although it requires calibration with its target device beforehand. The model was able to predict real maximum response rates within 30% of their true values, given that the response rate is at least 99%. Since different problems and servers have different response rate drop-offs, the absolute maximum response rates do not always match the rates at 99%, which could be a cause of the 30% discrepancy.

Response time is most often increased (although the increase could be limited or even turned into a decrease using parallel computing). The proxy solution should be used for problem classes not encumbered by significant amounts of upstream or downstream data in order to be efficient. The Proxy solution is also recommended for services which require failure recovery. Should the mobile devices be restricted by NATs which do not allow NAT traversal, one cannot use the Dispatcher solution and must therefore use the Proxy solution. If users require only increased serving capacity, the Dispatcher solution can be used, given that upstream data is not a limiting factor. This report does not provide as good guarantees for the efficiency of the proxy as for the dispatcher, because exhaustive benchmarks could not be performed; such benchmarks should be performed if any further implementation of this technique is to be used.

Any future attempts relating to this research should use real devices, parallel computing and GPUs when dealing with deep learning problems, as well as focus primarily on improving processing, since it seems to be the greatest factor in boosting performance.
