A Study of Limitations and Performance in Scalable Hosting Using Mobile Devices

DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2018

NIKLAS RÖNNHOLM
KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

A study of limitations and performance in scalable hosting using mobile devices
DA222X, Masters Thesis in Computer Science

En studie i begränsningar och prestanda för skalbar hosting med hjälp av mobila enheter
DA222X, Exjobbsrapport i Datalogi

Niklas Rönnholm
[email protected]
Supervisor: Erik Isaksson
Examiner: Johan Håstad
School of Electrical Engineering and Computer Science, KTH
March 2018

Abstract

Distributed computing is today a widely used technique in which volunteers support the various computing power needs organizations might have. This thesis sought to benchmark the performance of distributed computing when the support is limited to mobile devices, since this type of support is seldom provided by mobile devices. The thesis proposes two approaches to harnessing the computational power and infrastructure of a group of mobile devices. The problems used for benchmarking are small instances of deep learning training. Deep learning was chosen as the benchmarking problem because of its versatility and variability. One requirement posed by the non-static nature of mobile devices was that this should be possible without any significant prior configuration. The protocol used for communication was HTTP.

The results showed that this technique can be applied successfully to some types of problem instances, and that the two proposed approaches favour different problem instances. At a 99% response rate, the highest request rate found for the prototype corresponded to a 2100% increase in efficiency compared to a regular server. This result assumes that just under 2000 mobile devices are available, and it holds only for particular problem instances.
Sammanfattning (Summary)

At present, distributed computing is a widespread technique in which volunteers support the computing power needs of various organizations. This report sought to compare the performance of distributed computing when limited to support from mobile devices only, since this type of support is seldom provided with mobile devices. The report proposes two ways of utilizing the computational power and infrastructure of a group of mobile devices. The problems used for benchmarking are small instances of deep learning. One requirement imposed by the non-static nature of the mobile devices was that this should be possible without any significant configuration. The protocol used for communication was HTTP. Deep learning was chosen as the reference problem because of its versatility and variation.

The results showed that this technique can be applied successfully to certain types of problem instances, and that the two proposed approaches also favour different problem instances. The highest request rate found for the prototype with a 99% response rate was a 2100% increase in efficiency compared to a regular server. This was given just under 2000 mobile devices, for certain particular problem instances.

Contents

1 Introduction
  1.1 Thesis subject
  1.2 Goal & problem formulation
  1.3 Limitations
  1.4 Delimitations
2 Background
  2.1 Related works
    2.1.1 Folding@home
    2.1.2 Internet of things
    2.1.3 Peer-to-peer technology
    2.1.4 Mobile service platform: A middleware for nomadic mobile service provisioning
  2.2 Deep learning
    2.2.1 Multilayered feedforward neural networks
  2.3 Parallel computing
    2.3.1 Distributed computing
    2.3.2 Amdahl's law & Gustafson's law
  2.4 Scalable hosting & Load balancing
    2.4.1 Client-based load balancing
    2.4.2 DNS-based load balancing
    2.4.3 Dispatcher-based load balancing
    2.4.4 Server-based load balancing
  2.5 Network address translator
    2.5.1 NAT variants
    2.5.2 NAT traversal
  2.6 Load generation - httperf
  2.7 GPU vs CPU for computing deep learning
  2.8 Security
3 Method
  3.1 The choice of server problem
  3.2 Design of prototypes
    3.2.1 Prototype A: Reverse proxy solution
    3.2.2 Prototype B: Redirect/dispatcher solution
    3.2.3 Prototype optimization
  3.3 Performance test of prototypes
    3.3.1 Test procedure
    3.3.2 Implementations & Technology choice
  3.4 Test suite
    3.4.1 Test 1 - Single device & Return time
    3.4.2 Test 2 - 35 emulated proxy nodes & Dispatcher performance extrapolation
    3.4.3 Test 3 - Dispatcher maximum capacity
  3.5 Test characteristics
    3.5.1 Connection loss
    3.5.2 Test input
    3.5.3 Input & Output data format
  3.6 Effectiveness formula
    3.6.1 How to calculate transfer time of a given number of bytes
    3.6.2 How to calculate input time
    3.6.3 How to calculate processing time
    3.6.4 How to calculate output time
    3.6.5 Comparing performance in different platforms
4 Results
  4.1 Test 1 - Dispatcher/Proxy single device capacity comparison
  4.2 Test 2 - 35 device capacity comparison with Normal Server
  4.3 Test 3 - Dispatcher & Normal maximum capacity comparison
  4.4 Problem specific data
  4.5 Connection loss probabilities
  4.6 Efficiency comparison
5 Discussion
  5.1 Dispatcher vs. Normal server
  5.2 Proxy vs. Dispatcher
  5.3 Connection overhead
  5.4 Efficiency model evaluation
  5.5 Report weaknesses
  5.6 Possible improvements & Future work
  5.7 Deep learning & other problems classes
  5.8 Using mobile devices vs Regular computers
  5.9 Commercialization
6 Conclusion

Terminology

Mobile device
A mobile device, as referred to in this report, is a modern smartphone or tablet equipped with Wi-Fi whose operating system supports third-party applications.

NAT
Acronym for Network Address Translation, which is used when forwarding traffic from a local host with a local address to the Internet. This is done by creating a temporary translation between the local address space and the global address space (the Internet). See section 2.5 for more information.

DNS
The Domain Name System, abbreviated "DNS", is a system that associates names with IP addresses (RFC 1035 [1]).

Scalable hosting
A technique used to connect multiple servers into a more powerful cluster of servers hosting the same material. See section 2.4 for more information.

Dispatcher
A dispatcher is a server that sends clients on from itself to the correct node, so that it avoids handling the processing or the responses itself. See section 2.4.3 for more information.
Reverse proxy
A reverse proxy (Apache website [2]) is an intermediary server used by clients to access a server behind a firewall. The difference from a 'regular' proxy (Structure and encapsulation in distributed systems: the proxy principle, Marc Shapiro, 1986 [3]) is that it does not forward traffic from clients outwards towards a network, but forwards traffic from clients on a network inwards towards a server.

Web Service
According to W3C [4], a web service is software that supports machine-to-machine communication, usually over HTTP. Such services most often distribute prepared files (such as image files or text files), but may also perform calculations (services such as "WolframAlpha" [5]). In this report, these two kinds are referred to as file distributing services and processing services.

Deep Learning
Deep learning is a term used for describing learning systems using layered computational models capable of abstracted learning of different types. See section 2.2 for more information.

FLOPs
FLOPs (TechTarget, 2011 [6]) is an abbreviation of "floating-point operations per second", which is a common measure of computing capacity when comparing different hardware.

1 Introduction

This section describes the project in terms of subject, purpose, scientific question, limitations, delimitations, scientific contribution and its relevance to society.

1.1 Thesis subject

An approach that is becoming increasingly prevalent is volunteer computing. It is mentioned in a report on a project called Folding@home [7], which was used to simulate biological phenomena by utilizing hundreds of thousands of personal devices (described further in section 2.1.1). Volunteer computing can be described as a form of distributed computing, with the distinction of having computers connected
