Performance and Availability Characterization for Linux Servers
Total Page:16
File Type:pdf, Size:1020Kb
Performance and Availability Characterization for Linux Servers Vasily Linkov Oleg Koryakovskiy Motorola Software Group Motorola Software Group [email protected] [email protected] Abstract software solutions that were very expensive and in many cases posed a lock-in with specific vendors. In the cur- The performance of Linux servers running in mission- rent business environment, many players have come to critical environments such as telecommunication net- the market with variety of cost-effective telecommuni- works is a critical attribute. Its importance is growing cation technologies including packed data technologies due to incorporated high availability approaches, espe- such as VoIP, creating server-competitive conditions for cially for servers requiring five and six nines availability. traditional providers of wireless types of voice commu- With the growing number of requirements that Linux nications. To be effective in this new business environ- servers must meet in areas of performance, security, re- ment, the vendors and carriers are looking for ways to liability, and serviceability, it is becoming a difficult task decrease development and maintenance costs, and de- to optimize all the architecture layers and parameters to crease time to market for their solutions. meet the user needs. Since 2000, we have witnessed the creation of several industry bodies and forums such as the Service Avail- Other Linux servers, those not operating in a mission- ability Forum, Communications Platforms Trade As- critical environment, also require different approaches sociation, Linux Foundation Carrier Grade Linux Ini- to optimization to meet specific constraints of their op- tiative, PCI Industrial Computer Manufacturers Group, erating environment, such as traffic type and intensity, SCOPE Alliance, and many others. Those industry types of calculations, memory, and CPU and IO use. forums are working on defining common approaches This paper proposes and discusses the design and imple- and standards that are intended to address fundamental mentation of a tool called the Performance and Avail- problems and make available a modular approach for ability Characterization tool, PAC for short, which op- telecommunication solutions, where systems are built erates with over 150 system parameters to optimize over using well defined hardware specifications, standards, 50 performance characteristics. The paper discusses the and Open Source APIs and libraries for their middle- PAC tool’s architecture, multi-parametric analysis algo- ware and applications [11] (“Technology Trends” and rithms, and application areas. Furthermore, the paper “The .org player” chapters). presents possible future work to improve the tool and ex- The Linux operating system has become the de facto tend it to cover additional system parameters and char- standard operating system for the majority of telecom- acteristics. munication systems. The Carrier Grade Linux initia- tive at the Linux Foundation addresses telecommunica- 1 Introduction tion system requirements, which include availability and performance [16]. The telecommunications market is one of the fastest Furthermore, companies as Alcatel, Cisco, Ericsson, growing industries where performance and availabil- Hewlett-Packard, IBM, Intel, Motorola, and Nokia use ity demands are critical due to the nature of real-time Linux solutions from MontaVista, Red Hat, SuSE, and communications tasks with requirement of serving thou- WindRiver for such their products as softswitches, tele- sands of subscribers simultaneously with defined quality com management systems, packet data gateways, and of service. Before Y2000, telecommunications infras- routers [13]. Examples of existing products include Al- tructure providers were solving performance and avail- catel Evolium BSC 9130 on the base of the Advanced ability problems by providing proprietary hardware and TCA platform, and Nortel MSC Server [20], and many • 263 • 264 • Performance and Availability Characterization for Linux Servers more of them are announced by such companies as Mo- not about performance problems, but about performance torola, Nokia, Siemens, and Ericsson. characterization of the system. In many cases, people working on the new system development and fortunately As we can see, there are many examples of Linux-based, having performance requirements agreed up front use carrier-grade platforms used for a variety of telecommu- prototyping techniques. That is a straightforward but nication server nodes. Depending on the place and func- still difficult way, especially for telecommunication sys- tionality of the particular server node in the telecommu- tems where the load varies by types, geographic loca- nication network infrastructure, there can be different tion, time of the day, etc. Prototyping requires creation types of loads and different types of performance bot- of an adequate but inexpensive model which is problem- tlenecks. atic in described conditions. Many articles and other materials are devoted to ques- The authors of this paper are working in telecommu- tions like “how will Linux-based systems handle per- nication software development area and hence tend to formance critical tasks?” In spite of the availability mostly consider problems that they face and solve in of carrier-class solutions, the question is still important their day-to-day work. It was already said that perfor- for systems serving a large amount of simultaneous re- mance issues and characterization are within the area of quests, e.g. WEB Servers [10] as Telecommunication- interest for a Linux-based system developer. Character- specific systems. ization of performance is about inexpensive modeling Telecommunication systems such as wireless/mobile of the specific solution with the purpose of predicting networks have complicated infrastructures imple- future system performance. mented, where each particular subsystem solves its spe- cific problem. Depending on the problem, the critical What are the other performance-related questions that systems’ resource could be different. For example, Dy- may be interesting when working in telecommunica- namic Memory Allocation could become a bottleneck tions? It isn’t just by chance we placed the word Per- for Billing Gateway, Fraud Control Center (FCC), and formance close to Availability; both are essential char- Data Monitoring (DMO) [12] even in SMP architecture acteristics of a modern telecommunication system. If environment. Another example is WLAN-to-WLAN we think for a moment about the methods of achiev- handover in UMTS networks where TCP connection re- ing of some standard level of availability (let’s say the establishment involves multiple boxes including HLR, five- or six- nines that are currently common industry DHCP servers and Gateways, and takes significant time standards), we will see that it is all about redundancy, (10–20 sec.) which is absolutely unacceptable for VoIP reservation, and recovery. Besides specific requirements applications [15]. A similar story occurred with TCP to the hardware, those methods require significant soft- over CDMA2000 Networks, where a bottleneck was ware overhead functionality. That means that in addi- found in the buffer and queue sizes of a BSC box [17]. tion to system primary functions, it should provide al- The list of the examples can be endless. gorithms for monitoring failure events and providing ap- propriate recovery actions. These algorithms are obvi- If we consider how the above examples differ, we would ously resource-consuming and therefore impact overall find out that in most cases performance issues appear to system performance, so another problem to consider is be quite difficult to deal with, and usually require rework a reasonable tradeoff between availability and produc- and redesign of the whole system, which may obviously tivity [8], [18]. be very expensive. Let’s consider some more problems related to telecom- The performance improvement by itself is quite a well- munication systems performance and availability char- known task that is being solved by the different ap- acterization that are not as fundamental as those de- proaches including the Clustering and the Distributed scribed above, but which are still important (Figure 1). Dynamic Load Balancing (DDLB) methods [19]; this can take into account load of each particular node (CPU) Performance profiling. The goal of performance pro- and links throughput. However, a new question may filing is to verify that performance requirements have arise: “Well. We know the load will be even and dy- been achieved. Response times, throughput, and other namically re-distributed, but what is the maximum sys- time-sensitive system characteristics should be mea- tem performance we can expect?” Here we are talking sured and evaluated. The performance profiling is ap- 2007 Linux Symposium, Volume One • 265 Performance Load one more approach that was successfully used by the profiling testing authors in their work. 2 Overview of the existing methods for Perfor- Performance TARGET Stress tuning PROBLEMS testing mance and Availability Characterization A number of tools and different approaches for perfor- mance characterization exist and are available for Linux Performance issues investigation systems. These tools and approaches target different problems and use different techniques for extracting system data to be analyzed as well, and support differ- Figure 1: Performance and Availability Characterization ent ways to represent the results of