ink - An HTTP Benchmarking Tool

Andrew J. Phelps

Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of Master of Science in Computer Science & Application

Godmar V. Back, Chair
Ali R. Butt
Denis Gracanin

May 11, 2020
Blacksburg, Virginia

Keywords: Networking, Benchmarking, HTTP, Distributed Systems
Copyright 2020, Andrew J. Phelps

ink - An HTTP Benchmarking Tool

Andrew J. Phelps

(ABSTRACT)

The Hypertext Transfer Protocol (HTTP) is one of the foundations of the modern Internet. Because HTTP servers may be subject to unexpected periods of high load, developers use HTTP benchmarking utilities to simulate the load generated by users. However, many of these tools do not report performance details at a per-client level, which deprives developers of crucial insights into a server’s performance capabilities. In this work, we present ink, an HTTP benchmarking tool that enables developers to better understand server performance. ink provides developers with a way of visualizing the level of service that each individual client receives. It does this by recording a trace of events for each individual simulated client. We also present a GUI that enables users to explore and visualize the data generated by an HTTP benchmark. Lastly, we present a method for running HTTP benchmarks that uses a set of distributed machines to scale up the achievable load on the benchmarked server.

We evaluate ink by performing a series of case studies to show that ink is both performant and useful. We validate ink’s load generation abilities both on a single machine and when using a set of distributed machines. ink is shown to be capable of simulating hundreds of thousands of HTTP clients and presenting per-client results through the ink GUI. We also perform a set of HTTP benchmarks in which ink is able to highlight performance issues and differences between server implementations. We compare servers such as NGINX and Apache and highlight their differences using ink.

ink - An HTTP Benchmarking Tool

Andrew J. Phelps

(GENERAL AUDIENCE ABSTRACT)

The World Wide Web (WWW) uses the Hypertext Transfer Protocol (HTTP) to send web content such as HTML pages or video to users. The servers providing this content are called HTTP servers. Sometimes, the performance of these HTTP servers is compromised because a large number of users request documents at the same time. To prepare for this, server maintainers test how many simultaneous users a server can handle by using benchmarking utilities. These benchmarking utilities work by simulating a set of clients. Currently, these tools focus only on the number of requests that a server can process per second. Unfortunately, this coarse-grained metric can hide important information, such as the level of service that individual clients received.

In this work, we present ink, an HTTP benchmarking utility we developed that focuses on reporting information for each simulated client. Reporting data in this way allows the developer to see how well each client was served during the benchmark. We achieve this by constructing data visualizations that include a set of client timelines, each of which represents the service that one client received. We evaluated ink through a series of case studies that focus on the performance of the utility and the usefulness of the visualizations it produces. Additionally, we deployed ink in Virginia Tech’s Computer Systems course.
The students were able to use the tool and took a survey pertaining to their experience with it.

Acknowledgments

First, I would like to thank my advisor, Dr. Back, for helping me perform this research. He has been of immense help over the last year, and he has been a resource of knowledge for me. He has spent a great amount of his own time assisting me, and his guidance has helped me complete this research and this document. I would also like to thank the rest of my committee, Dr. Butt and Dr. Gracanin, for the insights that they provided on this work. They both made suggestions that improved the quality of this research. Specifically, I’d like to thank Dr. Gracanin for taking the time to go through this entire thesis with me.

Contents

List of Figures
List of Tables

1 Introduction
  1.1 Testing for Performance
  1.2 Proposed Solution
  1.3 Contributions
  1.4 Roadmap

2 Background Information
  2.1 Transmission Control Protocol
  2.2 Hypertext Transfer Protocol
  2.3 HTTP Server Concurrency Models
  2.4 Linux Connection Management
  2.5 HTTP Benchmarking

3 Design and Implementation
  3.1 Load Generation
  3.2 Distributed Benchmarking Manager
  3.3 ink API
  3.4 ink GUI

4 Evaluation
  4.1 Goals and Methodology
  4.2 Assessing ink’s Load Generation Ability
  4.3 Evaluating Server Performance with ink
  4.4 Survey Results

5 Related Work
  5.1 Assessing Server Quality and Performance
  5.2 Visualization Techniques

6 Future Work
  6.1 Load Generation Improvements
  6.2 GUI Improvements

7 Conclusion

Bibliography

List of Figures

2.1 Example of HTTP clients experiencing different levels of service, leading to misleading data points
3.1 High-level overview of ink’s architecture
3.2 Brewer’s red-to-green divergent color scheme [1]
3.3 100 client moving averages rendered using colored line segments
3.4 1,000 concurrent clients rendered as 200 timelines
3.5 1,000 concurrent clients rendered as 50 timelines
3.6 Dashboard of alternative data visualizations, based on the same dataset as Figure 3.3
3.7 2,000 clients grouped by physical machine of origin
4.1 Benchmarking with 43,520 simulated clients and 17 physical clients
4.2 Benchmarking with 87,040 simulated clients and 17 physical clients
4.3 Benchmarking with 174,080 simulated clients and 17 physical clients
4.4 Benchmarking with 348,160 simulated clients and 17 physical clients
4.5 Requests per second generated by NGINX under varying levels of load
4.6 Average request latency observed by clients benchmarking NGINX
4.7 Histogram of latencies from benchmark with 348,160 connections and 17 physical clients
4.8 File descriptors allocated by the HTTP server during a 60-second benchmark
4.9 An ink report indicating that only half of the clients were served; observed TCP connections are marked by black dots
4.10 Memory usage of the HTTP server during a 60-second benchmark
4.11 Benchmark with 15,000 clients released in waves of 5,000 clients every 30 seconds
4.12 CPU usage of the pre-fork Apache HTTP server with 1,000 concurrent connections
4.13 Network usage of the pre-fork Apache HTTP server with 1,000 concurrent connections
4.14 CPU usage of the NGINX HTTP server with 1,000 concurrent connections
4.15 Network usage of the NGINX HTTP server with 1,000 concurrent connections
4.16 Client timelines of the pre-fork Apache HTTP server with 1,000 concurrent connections
4.17 Client timelines of the NGINX HTTP server with 1,000 concurrent connections

List of Tables

2.1 Comparison of load generated on NGINX by popular HTTP benchmarking tools
4.1 Comparison of RPS generated by the original wrk and the modified wrk

Chapter 1

Introduction

Society is becoming more and more reliant on web-based services. Websites, mobile applications, and headless systems are all connected to the Internet. A large number of these services are built upon the Hypertext Transfer Protocol (HTTP), the most popular application-layer protocol in use on the Internet today. HTTP is far from its simple origins of transferring markup files; now, complex, critical systems use HTTP to transfer important data. For this reason, it is important that the performance of these services can be tested adequately. If critical services, such as government health care services [2], run on top of HTTP, developers need to understand the impact that a sudden surge of users can have on their application. Crisis situations can lead to unforeseen spikes in the load placed on web servers [3]. Even applications that were developed with a smaller target audience in mind can be subject to heavy load if a “Slashdot Effect” [4] occurs, in which an unprepared HTTP server is accessed by thousands of users in an instant.

1.1 Testing for Performance

To help prepare for situations that result in an unexpected influx of traffic, developers have created specialized software that can simulate