Appears in the proceedings of the 9th Annual International Symposium on High-Performance Computer Architecture (HPCA-9), Anaheim, CA, February 8-12, 2003.

Memory System Behavior of Java-Based Middleware

Martin Karlsson, Kevin E. Moore, Erik Hagersten, and David A. Wood

Uppsala University, Information Technology, Department of Computer Systems, P.O. Box 325, SE-751 05 Uppsala, Sweden. Email: {martink, eh}@it.uu.se

University of Wisconsin, Department of Computer Sciences, 1210 W. Dayton St., Madison, WI 53706. Email: {kmoore, david}@cs.wisc.edu

Abstract

Java-based middleware, and application servers in particular, are rapidly gaining importance as a new class of workload for commercial multiprocessor servers. SPEC has recognized this trend with its adoption of SPECjbb2000 and the new SPECjAppServer2001 (ECperf) as standard benchmarks. Middleware, by definition, connects other tiers of server software. SPECjbb is a simple benchmark that combines middleware services, a simple database server, and client drivers into a single Java program. ECperf more closely models commercial middleware by using a commercial application server and separate machines for the different tiers. Because it is a distributed benchmark, ECperf provides an opportunity for architects to isolate the behavior of middleware.

In this paper, we present a detailed characterization of the memory system behavior of ECperf and SPECjbb using both commercial server hardware and Simics full-system simulation. We find that the memory footprint and primary working sets of these workloads are small compared to other commercial workloads (e.g., on-line transaction processing), and that a large fraction of the working sets are shared between processors. We observed two key differences between ECperf and SPECjbb that highlight the importance of isolating the behavior of the middle tier. First, ECperf has a larger instruction footprint, resulting in much higher miss rates for intermediate-size instruction caches. Second, SPECjbb's data set size increases linearly as the benchmark scales up, while ECperf's remains roughly constant. This difference can lead to opposite conclusions on the design of multiprocessor memory systems, such as the utility of moderate sized (i.e., 1 MB) shared caches in a chip multiprocessor.

1. Introduction

Architects have long considered On-Line Transaction Processing (OLTP) and Decision Support Systems (DSS) as important workloads for multiprocessor servers. The recent shift toward 3-tier and N-tier computing models has created a large and rapidly-growing market for Java-based middleware, especially application servers. Still, middleware workloads are not yet well understood, and there are few accepted benchmarks that measure the performance of middle-tier applications. This is due both to the recent emergence of middleware as a mainstream workload and to the fact that 3-tier workloads are by nature difficult to install, tune and run.

We present a detailed characterization of two Java-based middleware benchmarks, SPECjbb and ECperf (now SPECjAppServer2001 [17]), running on shared-memory multiprocessors. ECperf more closely resembles commercial middleware applications because it runs on top of a commercial application server and is deployed on a 3-tiered system. The distributed nature of ECperf also facilitates monitoring the behavior of each tier independently. ECperf, however, is difficult to install and run. It requires the coordination of several machines and several pieces of software. SPECjbb is also a Java middleware benchmark. It is an attractive alternative to ECperf because although it models a 3-tiered system, it is a single Java program that can be run on any Java Virtual Machine (JVM). SPECjbb includes many common features of 3-tiered systems in a single program running on a single machine.

The goal of this paper is to understand the memory system behavior of these middleware benchmarks, to gain insight into the behavior of Java-based middleware, and to provide useful data and analysis to memory systems designers targeting middle-tier servers.
We focus on mid-range (up to 16 processor) shared-memory multiprocessors because many application servers target these systems. We also investigate whether or not the simple SPECjbb benchmark behaves similarly enough to the more complex ECperf to be considered representative of commercial middleware applications.

This work is supported in part by the National Science Foundation, with grants EIA-9971256, EIA-0205286, and CDA-9623632, the PAMP research program supported by the Swedish Foundation for Strategic Research, a Wisconsin Romnes Fellowship (Wood), and donations from Intel Corporation, IBM, and Sun Microsystems.

Figure 1: 3-Tiered Systems. (Tier 1: browsers and thin clients; Tier 2: middleware running the business logic, reached through the Internet and a firewall; Tier 3: databases, reached over a LAN/WAN.)

Figure 2: SPECjbb Overview. (A single benchmark process containing the client threads, the business logic engine, and the object trees that emulate the database.)
We find that these Java-based middleware applications have moderate CPIs compared to previously-published commercial workloads (between 2.0 and 2.8 for ECperf). In particular, memory related stalls are low, with misses to main memory accounting for as little as 15% of the data stall time and 5% of total execution time. Conversely, sharing misses occur frequently in both workloads, accounting for over 60% of second-level cache misses on larger systems. SPECjbb is similar to ECperf in many ways, but there are important differences between the two benchmarks. ECperf has a larger instruction working set, but a lower data cache miss rate. Furthermore, the memory footprint of ECperf remains nearly constant as the benchmark scales up, whereas the memory use of SPECjbb grows linearly with database size. We show that this difference can lead to opposite conclusions on some design decisions, like the utility of shared level-two caches in a chip multiprocessor.

2. Background

The emergence of the Internet and World Wide Web has triggered a shift in enterprise computing from a two-tiered, client-server architecture to a 3-tiered architecture (see Figure 1), where a Web browser is now used universally as a database client. For databases, connection to the Web allows users to access data without installing a client program. For Web pages, databases provide dynamic content and permanent storage. Software that connects databases to Web pages is known as "middleware." Much of the middleware used today is written in Java. Two of the most popular Java middleware architectures are Java Servlets and Enterprise Java Beans (EJB). The two are often used together, with Servlets implementing the presentation logic and EJB providing the business rules. Application servers host both Servlets and EJB and provide them with communication with both back-end databases and front-end web clients.

Recently, Web-connected database applications have also been deployed in an "N-Tier" architecture in which the presentation logic is separated from the business rules. The presentation logic can be implemented by stateless servers and is sometimes considered to be a first-tier application. N-Tiered architectures allow the application server to focus entirely on the business logic.

2.1. SPECjbb Overview

SPECjbb is a software benchmark designed to measure a system's ability to run Java server applications. Inspired by the On-Line Transaction Processing benchmark TPC-C, SPECjbb models a wholesale company with a variable number of warehouses. Beyond the nomenclature and business model, however, there are few similarities between TPC-C and SPECjbb. TPC-C is intended to measure the performance of large-scale transaction processing systems, particularly databases. In contrast, SPECjbb was written to test the scalability and performance of JVMs and multiprocessor servers that run Java-based middleware. It emphasizes the middle-tier business logic that connects a back-end data store to a set of thin clients, and is implemented entirely in Java. SPECjbb models a 3-tiered system, but to make the benchmark portable and easy to run, it combines the behavior of all 3 tiers into a single application (see Figure 2). Instead of using a commercial database engine like most real 3-tiered systems, SPECjbb stores its data in memory as trees of Java objects [18].

The SPECjbb specification calls for running the benchmark with a range of warehouse values. In an official SPECjbb run, the benchmark is run repeatedly with an increasing number of warehouses until a maximum throughput is reached. The benchmark is then run the same number of times with warehouse values starting at the maximum and increasing to twice that value. Therefore, if the best throughput for a system comes with n warehouses, 2n runs are made. The benchmark score is the average of the runs from n to 2n warehouses. This large number of separate benchmark runs would take prohibitively long in simulation. Therefore, in our simulation experiments, we selected 3 values for the number of warehouses to represent the range of values that would be included in a publishable SPECjbb result for our hardware configuration. To simplify our monitoring simulations, we report results from the steady-state interval of SPECjbb running with the optimal number of warehouses at each system size.
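Stated as a formula (a hedged restatement of the scoring rule above, assuming one run per warehouse count and writing T(w) for the throughput measured with w warehouses):

    Score ≈ mean of { T(n), T(n+1), ..., T(2n) }

where n is the warehouse count that produced the peak throughput.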

Figure 3: ECperf Setup. (The System Under Test comprises an application server, with a Servlet host for the presentation logic and an EJB container holding the Orders, Mfg, Supplier, and Corp beans; a database holding the Corp, Orders, Mfg, and Supplier data; and a supplier emulator Servlet. A separate driver machine runs the order and manufacturing agents.)
2.2. ECperf Overview

ECperf is a middle-tier benchmark designed to test the performance and scalability of a real 3-tier system. ECperf models an on-line business using a "Just-In-Time" manufacturing process (products are made only after orders are placed and supplies are ordered only when needed). It incorporates e-commerce, business-to-business, and supply chain management transactions. The presentation layer is implemented with Java Servlets, and the business rules are built with EJB. The application is divided into the following four domains, which manage separate data and employ different business rules. The Customer Domain models the actions of customers who create, change and inquire about the status of orders. The customer interactions are similar to On-Line Transaction Processing (OLTP) transactions. The Manufacturing Domain implements the "Just-In-Time" manufacturing process. As orders are filled, the status of customer orders and the supply of each part used to fill the order are updated. The Supplier Domain models interactions with external suppliers. The parts inventory is updated as purchase orders are filled. Finally, the Corporate Domain tracks customer, supplier and parts information.

The ECperf specification supplies the EJB components that form the core of the application. These components implement the application logic that controls the interaction between orders, manufacturing and suppliers. In particular, that interaction includes submitting various queries and transactions to the database, and exchanging XML documents with the Supplier Emulator.

Four separate agents participate in the ECperf benchmark, each of which is run on a separate machine or group of machines. Each of these parts is represented by a box in Figure 3.

Application Server. The application server, shown in the center of Figure 3, hosts the ECperf Java Beans. Together, they form the middle tier of the system, which is the most important component to performance on ECperf.

Database. The next most important part of the system, in terms of performance, is the database. Though ECperf does not overly stress the database, it does require the database to keep up with the application server and provide atomic transactions.

Supplier Emulator. Suppliers are emulated by a collection of Java Servlets hosted in a separate web container.

Driver. The driver is a Java program that spawns several threads that model customers and manufacturers.

Each high-level action in ECperf, such as a customer making a new order, or a manufacturer updating the status of an existing order, is called a "Benchmark Business Operation," or "BBop." Performance on ECperf is measured in terms of BBops/minute. Although performance on ECperf is measured in terms of throughput, the benchmark specification requires that 90% of all transactions are completed within a fixed time [9, 15]. In our experiments, however, we relaxed the response time requirement of ECperf and tuned our system to provide the maximum throughput regardless of response time.

2.3. Enterprise Java Beans

ECperf is implemented using Enterprise Java Beans (EJB), a part of the Java 2 Enterprise Edition (J2EE) standard. EJB are reusable Java components for server-side applications. In other words, they are building blocks for web-service applications. They are not useful until they are deployed on an application server. Inside the server, an EJB "container" hosts the beans and provides important services. In particular, EJB rely on their containers to manage connections to the database, control access to system resources, and manage transactions between components. Often the container is also responsible for maintaining the persistent state of the beans it hosts. The application server controls the number of containers and coordinates the distribution of client requests to the various instances of each bean.
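For illustration only (this is not code from ECperf; the class name and business method are hypothetical), a minimal stateless session bean in the EJB 1.1/2.0 style looks roughly like the sketch below. The bean itself contains only lifecycle callbacks and business logic; transactions, pooling, persistence and database connections are supplied by the container.

    import javax.ejb.SessionBean;
    import javax.ejb.SessionContext;

    // Minimal stateless session bean sketch (EJB 1.1/2.0 style).
    // The remote and home interfaces and the deployment descriptor are omitted.
    public class OrderQuoteBean implements SessionBean {
        private SessionContext context;

        // Lifecycle callbacks invoked by the EJB container.
        public void setSessionContext(SessionContext ctx) { this.context = ctx; }
        public void ejbCreate() {}
        public void ejbRemove() {}
        public void ejbActivate() {}
        public void ejbPassivate() {}

        // A business method exposed through the bean's remote interface.
        public double quoteOrder(int quantity, double unitPrice) {
            return quantity * unitPrice; // placeholder business logic
        }
    }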

2.4. Java Servlets

Servlets are Java classes that run inside a dynamic web server. Servlets can communicate with a back-end database through the Java DataBase Connectivity (JDBC) API. Session information can be passed to Servlets either through browser cookies or URL rewriting.
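A minimal sketch of such a Servlet is shown below (the class name, JNDI resource name and query are hypothetical, not taken from the benchmarks). It borrows a pooled JDBC connection from a container-managed DataSource, anticipating the connection pooling discussed in the next section, and writes a dynamic response.

    import java.io.IOException;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import javax.naming.InitialContext;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import javax.sql.DataSource;

    // Hypothetical presentation-logic Servlet; it borrows a connection from
    // the application server's pool rather than opening one itself.
    public class OrderStatusServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            String orderId = req.getParameter("orderId");
            try {
                DataSource ds = (DataSource)
                    new InitialContext().lookup("java:comp/env/jdbc/OrdersDB");
                Connection con = ds.getConnection(); // borrowed from the pool
                try {
                    PreparedStatement ps =
                        con.prepareStatement("SELECT status FROM orders WHERE id = ?");
                    ps.setString(1, orderId);
                    ResultSet rs = ps.executeQuery();
                    resp.setContentType("text/html");
                    resp.getWriter().println(rs.next() ? rs.getString(1) : "unknown");
                } finally {
                    con.close(); // returns the connection to the pool
                }
            } catch (Exception e) {
                throw new ServletException(e);
            }
        }
    }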
2.5. Java Application Servers

To host ECperf, we used a leading commercial Java-based application server. That server can function both as a framework for business rules (implemented in EJB) and as a host for presentation logic, including Java Servlets. As an EJB container, it provides required services such as database connections and persistence management. It also provides better performance and scalability than a naïve implementation of the J2EE standard.

Three important performance features of our particular server are thread pooling, database connection pooling, and object-level caching. The application server creates a fixed number of threads and database connections, which are maintained as long as the server is running. The application server allocates idle threads or connections out of these pools, rather than creating new ones and later destroying them when they are no longer needed. Database connections require a great deal of effort to establish and are a limited resource on many database systems. Connection pooling increases efficiency, because many fewer connections are created and opened. In addition, connection pooling allows the application server to potentially handle more simultaneous client sessions than the maximum number of open connections allowed by the database at any time. Thread pooling accomplishes the same conservation of resources in the operating system that database connection pooling does in the database. Our experience tuning the application server showed that configurations with too many threads spend much more time in the kernel than those that are well tuned. Object-level caching increases performance in the application server because instances of components (beans) are cached in memory, thereby reducing database queries and memory allocations.

The application server used in this study is one of the market leaders (we are not able to release the name due to licensing restrictions). In all of our experiments, a single instance of the application server hosted the entire middle tier. Many commercial application servers, including ours, provide a clustering mechanism that links multiple server instances running on the same or different machines. The scaling data presented in Section 4 does not include this feature and only represents the scaling of a single application server instance, running in a single JVM.

3. Methodology

We used a combination of monitoring experiments on real hardware and detailed full-system simulation to measure the memory system behavior of our middleware workloads. The native hardware enabled us to perform our measurements on a complete run of the benchmarks, while our simulation study offered us the opportunity to change the memory system parameters. On the native hardware, we used the Solaris tool psrset to restrict the application threads to only run on a subset of the processors available on the machine. The psrset mechanism also prevents other processes from running on processors within the processor set. This technique enabled us to measure the scalability of the applications and to isolate them from interference by other applications running on the host machine.

3.1. Hardware Setup

We ran both SPECjbb and the application server of ECperf on a Sun E6000 server. The E6000 is a bus-based snooping multiprocessor with 16 248-MHz UltraSPARC II processors with 1 MB L2 caches and 2 GB of main memory. The UltraSPARC II processors are 4-wide and in-order issue. For ECperf, we ran the database on an identical Sun E6000, and the supplier emulator and driver were each run on a 500-MHz UltraSPARC IIe Sun Netra. All the machines were connected by a 100-Mbit Ethernet link.

3.2. Benchmark Tuning

Tuning Java server workloads is a complicated process because there are several layers of software to configure, including the operating system, the JVM, and the application itself. Tuning 3-tier Java applications is more complicated still, because the application server and database must be properly configured as well.

Operating System (Solaris 8). We optimized Solaris for running large server programs by enabling Intimate Shared Memory (ISM), which increases the page size from 8 KB to 4 MB and allows sharing of page table entries between threads. This optimization greatly increases the TLB reach, which would otherwise be much smaller than the application server's large heap.
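To put this in perspective with a rough, hedged calculation (the TLB size below is an assumption for illustration, not a figure from our measurements): a 64-entry data TLB covers only 64 × 8 KB = 512 KB with the default page size, far less than the application server's heap, whereas the same TLB covers 64 × 4 MB = 256 MB with ISM enabled.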

JVM (HotSpot 1.3.1). We configured the JVM by testing various thread synchronization and garbage collection settings. We found that the default thread synchronization method gave us the best throughput on ECperf and SPECjbb. In all cases, the heap size was set to the largest value that our system could support, 1424 MB. We tuned the garbage collection mechanism in the virtual machine by increasing the size of the new generation to 400 MB. A large new generation leads to fewer, but longer, partial collections and better total throughput. Our multiprocessor simulations of SPECjbb were run with HotSpot 1.4.0. In order to be as consistent as possible with both our uniprocessor simulations and the multiprocessor simulations of ECperf, we used the same heap and new generation sizes in all of our experiments.

Application Server. For ECperf, we tuned the application server for each processor set size by running the benchmark repeatedly with a wide range of values for the size of the execution queue thread pool and the database connection pool. For each processor count, the configuration settings used were those that produced the best throughput.

Database. ECperf uses a small database, which fit entirely in the buffer pool of our database server. We found that the performance of ECperf was unaffected by other database settings.

3.3. Simulation Environment

We used the Simics full-system simulator [11] to simulate ECperf and SPECjbb running on several different system configurations. Simics is an execution-driven simulator that models a SPARC V9 system accurately enough to run unmodified Solaris 8. To determine the cache behavior of the applications without communication, we configured Simics to model a 1-processor E6000-like SPARC V9 system with 2 GB of main memory running Solaris 8. To run ECperf, we simulated four such machines connected by a simulated 100-Mbit Ethernet link. The reported cache statistics for ECperf were taken from the simulated machine that ran the application server.

For these experiments we extended Simics with a detailed memory system simulator [13]. The memory system simulator allowed us to measure several cache performance statistics on a variety of caches with different sizes, associativities and block sizes. In order to evaluate the communication behavior of these workloads and their suitability to a shared-cache memory system, we also simulated multiprocessor configurations of each workload. We were not able to simulate a multi-tiered configuration of ECperf running on a multiprocessor. Instead, we simulated a single 16-processor machine where the application server was bound to 8 processors. We then filtered out the memory requests from the other 8 processors, and fed only the requests from the application server processors to our memory system simulator.

We use the methodology proposed by Alameldeen, et al. [2] to account for the inherent variability of multi-threaded commercial workloads. We present the means and standard deviations (shown as error bars) for all measured and most simulated results.
Figure 4: Throughput Scaling on a Sun E6000. (Speedup vs. number of processors for ECperf and SPECjbb, against a linear reference.)

4. Scaling Results

Java-based middleware applications, like most commercial workloads, are throughput-oriented. Understanding how these applications scale up to both larger multiprocessors and larger data sets is important for both hardware and software developers. In this section, we analyze how ECperf and SPECjbb scale on a Sun E6000 system.

Despite our best efforts to tune these workloads, we were unable to even come close to achieving linear speedup. Figure 4 shows that ECperf scales super-linearly from 1 to 8 processors, but scales poorly beyond 12 processors. ECperf achieves a peak speedup of approximately 10 on 12 processors, then performance degrades for larger systems. SPECjbb scales up more gradually, leveling off after achieving a speedup of 7 on 10 processors.

In the remainder of this section we present an analysis of the factors that contribute to the limitations on scaling. We find that both benchmarks experience significant idle time (approximately 25%) for systems with 10 or more processors, apparently due to contention for shared software resources. Memory system stalls are the second major factor, causing the average cycles per instruction to increase by as much as 40%. Finally, although garbage collection does impact performance, on larger systems it accounts for only a fraction of the difference between measured and linear speedup.

4.1. Resource Contention

We used a variety of Solaris measurement tools to identify the bottlenecks in ECperf and SPECjbb. Figure 5 shows a breakdown of the time spent in various execution modes as measured by the Solaris tool mpstat. The four modes are running the operating system (system), running the benchmark (user), stalled for I/O (I/O), and stalled for other reasons (idle).

Figure 5: Execution Mode Breakdown vs. Number of Processors. (Stacked bars for ECperf and SPECjbb at 1 to 15 processors; execution time (%) is split into User, System, I/O, Idle, and GC Idle.)

Figure 6: CPI Breakdown vs. Number of Processors. (Cycles per instruction for ECperf and SPECjbb at 1 to 15 processors, split into Instruction Stall, Data Stall, and Other.)

Figure 5 illustrates one important difference between ECperf and SPECjbb. ECperf spends significant time in the operating system, while SPECjbb spends essentially none. This is not surprising, since SPECjbb emulates all three tiers on a single machine, using memory-based communication within a single JVM. Conversely, ECperf uses separate machines for each tier, requiring communication via operating system-based networking code. For ECperf, the system time increases from less than 5% for a single-processor run to nearly 30% for a 15-processor system. We hypothesize, but have been unable to confirm, that the increase in system time arises from contention in the networking code.

Both workloads incur significant idle time for larger system sizes, reaching 25% for 15 processors. Some of this idle time is due to garbage collection. Like most currently available systems, the JVM we ran uses a single-threaded garbage collector. That is, during collection only 1 processor is active and all others wait idle. We estimated the fraction of idle time due to garbage collection by multiplying the fraction of processors that are idle during collection by the fraction of time spent performing garbage collection. Figure 5 shows that the bulk of the idle time is due to factors other than garbage collection.
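Written out, that estimate is simply (a restatement of the calculation just described, with P processors and f_GC the fraction of total execution time spent in the single-threaded collector):

    Idle_GC ≈ f_GC × (P - 1) / P

since all but one processor wait during each collection.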

The increase in idle time with system size suggests that there is contention for shared resources in these benchmarks. The application server in ECperf shares its database connection pool between its many threads, and the object trees in SPECjbb are protected by locks, both of which could lead to contention in larger systems. However, the fact that the idle time increases similarly for both benchmarks indicates that the contention could be within the JVM.

4.2. Execution Time Breakdown

Idle time alone explains at most half the degradation in speedup (75% non-idle time times 15 processors is approximately 11, not the 8 we observe). To identify other limits to scalability, we used the integrated counters on the UltraSPARC II processors to measure and break down the average cycles per instruction (CPI) across a range of system sizes. While CPI is not a good indicator of overall performance on multiprocessors (e.g., because of the effect of the idle loop), it gives a useful indication of where the time goes.
Figure 6 presents the CPI, broken down into instruction stalls, data stalls, and other (which includes instruction execution and all non-memory-system stalls). The overall CPI ranges from 1.8 to 2.4 for SPECjbb and 2.0 to 2.8 for ECperf. These are moderate CPIs for commercial workloads running on in-order processors. Barroso, et al. report CPIs for Alpha 4100 systems of 1.3 to 1.9 for decision support database workloads and as high as 7 for a TPC-B on-line transaction processing workload. The CPI increases by roughly 40% and 33% for ECperf and SPECjbb, respectively, as the number of processors increases from 1 to 15. Assuming instruction path lengths remain constant (see Section 4.4), the increase in CPI would account for most of the remaining performance degradation.

Figure 6 also shows that data stall time is the main contributor to the increase in CPI. On a single-processor run, data stall time accounts for only 15% and 12% for ECperf and SPECjbb, respectively. However, for a 15-processor system, this increases to 35% and 25% for ECperf and SPECjbb, respectively.

Figure 7: Data Stall Time Breakdown vs. Number of Processors. (Fraction of data stall time for ECperf and SPECjbb, split into Store Buf, RAW, Other, L2 Hit, C2C, and Mem components.)

Figure 7 presents an approximate decomposition of the data stall time. Because some factors are estimated using frequency counts multiplied by published access times, the total does not always exactly sum to one. Approximately 60% of the data stall time is due to misses in the L2 cache, with the bulk of the remainder being L2 hits. Conversely, store buffer stalls, the cycles spent waiting for a full store buffer to be flushed, account for only 1% to 2% of the total execution time. Similarly, read-after-write hazard stalls, which occur if a load is not separated enough from a store, account for only 1% of the time.

4.3. Cache-to-Cache Transfer Ratio

Figure 7 also illustrates that cache-to-cache transfers represent a significant fraction of the data stall time for multiprocessor systems. For larger multiprocessors, cache-to-cache transfers account for nearly 50% of the total data stall time. Cache-to-cache transfers are an important factor because many multiprocessor systems take longer to satisfy a miss from a processor's cache than from main memory. On the E6000, the latency of a cache-to-cache transfer is approximately 40% longer than the latency of an access to main memory [8]. For NUMA memory systems, this penalty is typically much higher (200-300% is not uncommon [7]) because of the indirection required by directory-based protocols.

To dig deeper, we measured the cache-to-cache transfer ratio for SPECjbb and ECperf by counting the "snoop copyback" events reported in cpustat. In the UltraSPARC II processor, a snoop copyback event signifies that a processor has copied a cache line back to the memory bus in response to a request by another processor.

Figure 8: Cache-to-Cache Transfer Ratio. (Percentage of L2 misses that hit in another processor's cache vs. number of processors, for ECperf and SPECjbb.)

Figure 8 shows that the fraction of L2 cache misses that hit in another cache starts at 25% for two processors and increases rapidly to over 60% for fourteen processors. This is comparable to the highest ratios previously published for other commercial workloads [3].
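The ratio plotted in Figure 8 is, in effect (a hedged restatement of the measurement described above, assuming both counts are taken over the same interval):

    cache-to-cache transfer ratio = snoop copyback events / total L2 cache misses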

Figure 9: Effect of Garbage Collection on Throughput Scaling. (Speedup vs. number of processors for ECperf and SPECjbb, as measured and with garbage collection time factored out, against a linear reference.)

Figure 10: Cache-to-Cache Transfers Per Processor Per Second Over Time (Normalized). (Normalized cache-to-cache transfer rate over time, in seconds, during a run of SPECjbb.)
Figure 8 also shows that cache-to-cache transfers occur even for 1 processor. These transfers are possible because the operating system runs on all 16 processors, even when the application is restricted to a single processor. Snoop copybacks occur when the processor running the benchmark responds to a request from another processor running in the operating system.

4.4. Path Length

Comparing Figure 4 to Figure 6 reveals an apparent contradiction. ECperf scales super-linearly as the system size increases from 1 to 8 processors, even though the average CPI increases over the same range. This surprising result occurs because the number of instructions executed per BBop decreases even more dramatically over the same range (not shown). The decrease in instruction count more than compensates for the longer average execution time per instruction. We hypothesize that this drop is due to object-level caching in the application server. Constructive interference in the object cache allows one thread to re-use objects fetched by another thread.
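The resolution of the contradiction follows from the usual throughput decomposition (a hedged sketch, with f the processor clock frequency):

    BBops per second = f / (instructions per BBop × CPI)

so a sufficiently large drop in path length (instructions per BBop) outweighs a moderate rise in CPI.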
4.5. Garbage Collection Effects

Both workloads spend a considerable amount of time doing garbage collection. To determine the impact of the collection time on scalability, we compared the measured speedup to the speedup with the garbage collection time factored out. That is, we subtracted the garbage collection time from the runtime of the benchmark and calculated speedup in the usual way. The solid lines in Figure 9 represent the speedup of ECperf and SPECjbb as measured. The dotted lines display the speedup of the benchmarks with the garbage collection time factored out. The difference in throughput with and without the garbage collection is small, but statistically significant for ECperf up to 6 processors. For SPECjbb and ECperf on larger systems, the difference is not statistically significant.

We originally hypothesized that the high percentage of cache-to-cache transfers we observed in both SPECjbb and ECperf was due to garbage collection. Our JVM (HotSpot 1.3.1) uses a generational copying collector and is single-threaded. Therefore, during collection, all live new generation objects are copied by the collection thread regardless of which thread had created them and regardless of their location in the cache of a particular processor. For example, in a system that uses a simple MSI invalidation protocol, any new generation data in the M state cached at a processor that is not performing the collection will be read by the collector thread through a costly cache-to-cache transfer. This will result in the original copy of the data being invalidated. After the garbage collection is performed, the previous owner of the block will have to reacquire the block to access it. If the data is still residing in the garbage collector's cache, that access will result in another costly cache-to-cache transfer.

Contrary to our hypothesis, the benchmark generates almost no cache-to-cache transfers during garbage collection. We counted the number of snoop copyback events every 100 ms during a run of SPECjbb. Figure 10 illustrates this dramatic drop in the cache-to-cache transfer rate during the 3 garbage collections that occurred in our measurement interval. The HotSpot 1.3.1 JVM has an option to trace the garbage collection in a program. We used that output to verify that the decreases in the snoop copyback rate occurred during garbage collection periods. Since our JVM uses a single-threaded garbage collector, only one processor is active during the collection. That by itself would explain a drop, but Figure 10 shows that the cache-to-cache transfer rate drops to almost zero during the garbage collection periods. Even the single processor which is performing the collection causes fewer cache-to-cache transfers.

Figure 11: Memory Use vs. Scale Factor. (Live memory in MB after garbage collection for ECperf and SPECjbb, for scale factors up to 40.)

Figure 12: Instruction Cache Miss Rate. (Misses per 1000 instructions vs. cache size, 64 KB to 16 MB, for ECperf and for SPECjbb with 1, 10, and 25 warehouses.)

Figure 13: Data Cache Miss Rate. (Misses per 1000 instructions vs. cache size, 64 KB to 16 MB, for ECperf and for SPECjbb with 1, 10, and 25 warehouses.)
4.6. Benchmark Scaling Differences

One of the most striking differences between SPECjbb and ECperf is the effect that scaling the benchmark size has on memory behavior. Like most commercial workload benchmarks, official measurements of SPECjbb and ECperf require that the benchmark size increase with input rate. In other words, faster systems must access larger databases. In SPECjbb, the input rate is set by the number of warehouses, which determines the number of threads in the program in addition to the size of the emulated database. ECperf has a similar scaling factor, the Orders Injection Rate. However, because the database and client drivers run on different machines, increasing the Orders Injection Rate has much less impact on the middle-tier memory behavior.

Figure 11 shows the average heap size immediately after garbage collection in SPECjbb and ECperf. The size of the heap after collection is an approximation of the amount of live data. As the scale factor (i.e., warehouses) increases, SPECjbb's memory use increases linearly through approximately 30. Beyond 30 warehouses, the average live memory decreases because the generational garbage collector begins compacting the older generations. This slower collection process results in dramatic performance degradation (not shown). By contrast, the memory use of ECperf increases up to an Orders Injection Rate of approximately 6, then remains roughly constant through 40. This result suggests that using SPECjbb could lead memory system designers to overestimate the memory footprints of middleware applications on larger systems.

5. Cache Performance

The previous section showed that memory system stalls were a significant detriment to scalability on the Sun E6000. To understand this behavior more deeply, we used full-system simulation to evaluate a variety of memory system configurations. Our simulation results show that whereas the scaling properties of these workloads are similar, the cache behavior of ECperf is quite different from that of SPECjbb. ECperf has a small data set and a low data-cache miss rate. SPECjbb puts significantly more pressure on the data cache, particularly when it is configured with a large number of warehouses. Although ECperf has a smaller data cache miss rate than even the smallest configuration of SPECjbb, a higher fraction of its total memory is shared between threads. Wider sharing and a smaller data working set make shared caches a more effective design for that workload.

5.1. Cache Miss Rates

Figure 12 and Figure 13 present the instruction and data cache miss rates, respectively, for a uniprocessor system with a range of cache sizes. All configurations assume split instruction and data caches, 4-way set associativity and 64-byte blocks. We simulated SPECjbb with three different scaling factors (1, 10, and 25 warehouses) to examine the impact of the larger memory sizes discussed in Section 4.6.
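(For reference, under these assumptions each simulated cache has size / (4 × 64 B) sets; the 1 MB configuration, for example, has 4096 sets, and the smallest 64 KB configuration has 256 sets.)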

Figure 14: Distribution of Cache-to-Cache Transfers (64-Byte Cache Lines). (Cumulative percentage of cache-to-cache transfers vs. the percentage of cache lines touched, for ECperf and SPECjbb.)

Figure 15: Distribution of Cache-to-Cache Transfers. (Cumulative percentage of cache-to-cache transfers vs. the amount of memory touched, in 64-byte lines, on a semi-log scale, for ECperf and SPECjbb-25.)

These graphs demonstrate that both benchmarks place at most moderate demand on typical level-one (L1) and level-two (L2) caches. Typical L1 caches, falling in the 16 KB–64 KB range, exhibit miss rates of 10–40 misses per 1000 instructions. For typical L2 cache sizes of 1 MB and larger, the data miss rate falls to less than two misses per 1000 instructions. Instruction misses are even lower, falling well below one miss per 1000 instructions. The two benchmarks behave similarly, but do have two notable differences. First, ECperf has a much higher instruction cache miss rate for intermediate size caches (e.g., 256 KB). Second, the data miss rate for SPECjbb with one warehouse is roughly comparable to that for ECperf, but it increases by as much as 30% as the data set scales to 25 warehouses. This result is not surprising, given that SPECjbb's live data increases linearly with the number of warehouses (see Figure 11).

5.2. Communication Footprint

To provide insight into the communication behavior of the workloads, we measured the footprint of the data causing cache-to-cache transfers. As shown in Figure 14, all of the cache-to-cache transfers observed in SPECjbb came from 12% of the cache lines touched during the measurement period, and over 70% came from the most active 0.1% of cache lines. For both benchmarks, a significant fraction of the communication is likely due to a few highly contended locks. The single cache line with the highest fraction of the cache-to-cache transfers accounted for 20% of the total for SPECjbb and 14% for ECperf. This resembles earlier findings for databases and OLTP workloads [7]. In contrast to SPECjbb, however, the most active 0.1% of cache lines in ECperf account for only 56% of the cache-to-cache transfers. Furthermore, the cache-to-cache transfers are spread over half of the touched cache lines. A major contributor to this difference between ECperf and SPECjbb is SPECjbb's emulated database. The object trees that represent the database are updated sparsely enough that they rarely result in cache-to-cache transfers. Figure 15 shows the cumulative distribution of cache-to-cache transfers versus the amount of data transferred (on a semi-log plot). This graph shows that even though SPECjbb has a larger total data set, ECperf has a larger communication footprint on an absolute, not just a percentage, basis.

5.3. Shared Caches

The high cache-to-cache transfer rates of these workloads suggest that they might benefit from shared-cache memory systems, which have become increasingly common with the emergence of chip multiprocessors (CMPs) [14]. Shared caches have two benefits. First, they eliminate coherence misses between the processors sharing the same cache (private L1 caches will still cause coherence misses, but these can be satisfied on-chip much faster than conventional off-chip coherence misses). Second, compulsory misses may be reduced through inter-processor prefetching (i.e., constructive interference). The obvious disadvantage of shared caches is the potential increase in conflict and capacity misses.

To evaluate shared caches, we used Simics to model an 8-processor SPARC V9 system with four different memory hierarchies. In the base case, each processor has a private 1 MB L2 cache, for a total of 8 caches. In the other three cases, the eight processors share one, two, and four 1 MB caches. The total size of all caches is the product of the cache size (i.e., 1 MB) and the number of caches.

Figure 16: Cache Miss Rate on Shared Caches (Processors Per Shared 1 MB Cache). (Data misses per 1000 instructions for ECperf and SPECjbb-25 as 1, 2, 4, or 8 processors share each 1 MB cache.)

Figure 16 shows the data miss rates for ECperf and SPECjbb as the number of processors per cache increases. For ECperf, the benefit of reducing coherence misses more than makes up for the additional capacity and conflict misses. ECperf has the lowest data miss rate when all eight processors share a single cache, even though the aggregate cache size is 1/8 the size in the base case (i.e., private caches). Sharing had the opposite effect on SPECjbb. Even though SPECjbb had a significant fraction of cache-to-cache transfers, the larger data set size (due to the emulated database) results in an increase in overall miss rate for 1 MB shared L2 caches.

6. Related Work

This paper extends previous work by examining examples of an important emerging class of commercial applications, Java-based middleware. Cain, et al. describe the behavior of a Java Servlet implementation of TPC-W, which models an online bookstore [5]. Though the Servlets in their implementation are also Java-based middleware, that workload is quite different than ECperf, since it does not maintain session information for client connections in the middle tier. The Servlets share a pool of database connections in that implementation like the application server in ECperf. However, no application data is exchanged between the Servlets.

Previous papers have presented the behavior of commercial applications. Among the most notable are those that describe the behavior of Database Management Systems (DBMS) running the TPC benchmarks, TPC-C and TPC-H [1][3]. Ailamaki, et al. report that DBMSs spend much of their time handling level-1 instruction and level-2 data misses [1]. Barroso, et al. report that the memory system is a major factor in the performance of DBMS workloads, and that OLTP workloads are particularly sensitive to cache-to-cache transfer latency, especially in the presence of large second-level caches [3]. These studies demonstrate that the execution time of DBMS is closely tied to the performance of the memory system.

Other studies have also examined Java workloads. Luo and John present a characterization of VolanoMark and SPECjbb2000 [10]. VolanoMark behaves quite differently than ECperf or SPECjbb because of the high number of threads it creates. In VolanoMark, the server creates a new thread for each client connection. The application server that we have used, in contrast, shares threads between client connections. As a result, the middle tier of the ECperf benchmark spends much less time in the kernel than VolanoMark. SPECjbb also has a much lower kernel component than VolanoMark. Marden, et al. compare the memory system behavior of a PERL CGI script and a Java Servlet [12]. Chow, et al. measure uniprocessor performance characteristics on transactions from the ECperf benchmark [6]. They present correlations relating both the mix of transaction types and the system configuration to processor performance. Shuf, et al. measure the cache performance of Java benchmarks, including pBOB (now SPECjbb). They find that even fairly large L2 caches do not significantly improve memory system performance [16]. Their measurements, however, are limited to direct-mapped caches and uniprocessors, while we consider multiprocessors with 4-way set-associative caches. They also find that TLB misses are a major performance issue. Although we did not specifically measure TLB miss rates, we found that using the Intimate Shared Memory (ISM) feature of Solaris, which increases the page size from 8 KB to 4 MB, increased performance of ECperf by more than 10%.

Barroso, et al. [4] and Olukotun, et al. [14] discuss the performance benefits of chip multiprocessors using shared caches. We extend their work by evaluating the impact of shared caches on SPECjbb and ECperf.
7. Conclusions

In this paper, we have presented a detailed characterization of two popular Java-based middleware benchmarks. ECperf is a complex, multi-tier benchmark that requires multiple machines, a commercial database system, and a commercial application server. In contrast, SPECjbb is a single application that is trivial to install and run. The distributed nature of ECperf makes the installation and management of that benchmark more difficult, but it also provides an opportunity to isolate the behavior of each tier individually.

We find that both workloads have low CPIs and low memory stall times compared to other important commercial server applications (e.g., OLTP). Running on the moderate size multiprocessors in our study, both workloads maintained small data working sets that fit well in the 1 MB second-level caches of our UltraSPARC II processors. More than half of all second-level cache misses on our larger systems hit in the cache of another processor.

SPECjbb closely approximates the memory behavior of ECperf except for two important differences. First, the instruction working set of SPECjbb is much smaller than that of ECperf. Second, the data memory footprint of SPECjbb is larger than that of ECperf, especially as the benchmark scales up for larger system sizes.

The difference in behavior could lead memory system designers toward different conclusions. For example, our simulation results demonstrate that ECperf is particularly well suited to a shared-cache memory system even when the total cache size is limited to 1 MB. In contrast, the reduction in total cache capacity causes SPECjbb's performance to degrade.

This study compares two middleware benchmarks, running on a specific combination of hardware, operating system, Java virtual machine, application server, and database system. Further study is needed to determine how well these results apply to other Java middleware and different versions of the underlying hardware and software. However, we believe that as middleware becomes better understood, it will prove increasingly important to isolate its behavior from the effects of other software layers.

8. Acknowledgments

We would like to thank Paul Caprioli and Michael Koster for introducing us to the ECperf benchmark. We also sincerely thank Jim Nilsson for providing us with the Sumo cache simulator, Akara Sucharitakul for his help configuring and tuning ECperf, as well as Alaa Alameldeen, Dan Sorin and Alvy Lebeck for their insightful comments.

9. References
[1] Anastassia G. Ailamaki, David J. DeWitt, Mark D. Hill, and David A. Wood. DBMSs on a Modern Processor: Where Does Time Go? In Proceedings of the 25th International Conference on Very Large Data Bases, pages 266–277, September 1999.
[2] Alaa R. Alameldeen and David A. Wood. Variability in Architectural Simulations of Multi-threaded Workloads. In Proceedings of the Ninth IEEE Symposium on High-Performance Computer Architecture, February 2003.
[3] Luiz A. Barroso, Kourosh Gharachorloo, and Edouard Bugnion. Memory System Characterization of Commercial Workloads. In Proceedings of the 25th Annual International Symposium on Computer Architecture, pages 3–14, June 1998.
[4] Luiz Andre Barroso, Kourosh Gharachorloo, Robert McNamara, Andreas Nowatzyk, Shaz Qadeer, Barton Sano, Scott Smith, Robert Stets, and Ben Verghese. Piranha: A Scalable Architecture Based on Single-Chip Multiprocessing. In Proceedings of the 27th Annual International Symposium on Computer Architecture, pages 282–293, June 2000.
[5] Harold W. Cain, Ravi Rajwar, Morris Marden, and Mikko H. Lipasti. An Architectural Evaluation of Java TPC-W. In Proceedings of the Seventh IEEE Symposium on High-Performance Computer Architecture, pages 229–240, January 2001.
[6] Kingsum Chow, Manesh Bhat, Ashish Jha, and Colin Cummingham. Characterization of Java Application Server Workloads. In IEEE 4th Annual Workshop on Workload Characterization, in conjunction with MICRO-34, pages 175–181, 2002.
[7] Kourosh Gharachorloo, Madhu Sharma, Simon Steely, and Stephen Von Doren. Architecture and Design of AlphaServer GS320. In Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 13–24, November 2000.
[8] Erik Hagersten and Michael Koster. WildFire: A Scalable Path for SMPs. In Proceedings of the Fifth IEEE Symposium on High-Performance Computer Architecture, pages 172–181, January 1999.
[9] Martin Karlsson, Kevin E. Moore, Erik Hagersten, and David A. Wood. Memory Characterization of the ECperf Benchmark. In Second Annual Workshop on Memory Performance Issues (WMPI), in conjunction with ISCA-29, 2002.
[10] Yue Luo and Lizy Kurian John. Workload Characterization of Multithreaded Java Servers. In IEEE International Symposium on Performance Analysis of Systems and Software, 2001.
[11] Peter S. Magnusson et al. Simics: A Full System Simulation Platform. IEEE Computer, 35(2):50–58, February 2002.
[12] Morris Marden, Shih-Lien Lu, Konrad Lai, and Mikko Lipasti. Memory System Behavior in Java and Non-Java Commercial Workloads. In Proceedings of the Fifth Workshop on Computer Architecture Evaluation Using Commercial Workloads, February 2002.
[13] Milo M. K. Martin, Daniel J. Sorin, Anastassia Ailamaki, Alaa R. Alameldeen, Ross M. Dickson, Carl J. Mauer, Kevin E. Moore, Manoj Plakal, Mark D. Hill, and David A. Wood. Timestamp Snooping: An Approach for Extending SMPs. In Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 25–36, November 2000.
[14] Kunle Olukotun, Basem A. Nayfeh, L. Hammond, K. Wilson, and K.-Y. Chang. The Case for a Single-Chip Multiprocessor. In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, October 1996.
[15] ECperf Home Page. ECperf. http://java.sun.com/j2ee/ecperf/.
[16] Yefim Shuf, Mauricio J. Serrano, Manish Gupta, and Jaswinder Pal Singh. Characterizing the Memory Behavior of Java Workloads: A Structured View and Opportunities for Optimizations. In Proceedings of the 2001 ACM Sigmetrics Conference on Measurement and Modeling of Computer Systems, June 2001.
[17] SPEC. SPEC jAppServer Development Page. http://www.spec.org/osg/jAppServer/.
[18] SPEC. SPEC JBB2000. http://www.spec.org/osg/jbb2000.
