An Evaluation of XML-RPC?

Mark Allman NASA Glenn Research Center/BBN Technologies [email protected]

Abstract traditional RPC systems in several ways:

This paper explores the complexity and performance of the ¯ In some RPC systems, such as Sun RPC, the RPC sys- XML-RPC system for remote method invocation. We de- tem (with input from the programmer) generates stubs veloped a program that can use either XML-RPC-based net- for the programmer to call. For instance, if the program- work communication or a hand-rolled version of networking mer wanted to call a remote method foo() they would code based on the java.net package. We first compare our two call a local method foo() which would be a stub gen- implementations using traditional object-oriented metrics. In erated by the RPC system. XML-RPC differs from such addition, we conduct tests over a local network and the Inter- systems, as it does not generate stubs for the program- net to assess the performance of the two versions of the net- mer. Rather, XML-RPC provides several primitives for working code using traditional internetworking metrics. We programmers to use to construct method requests and find that XML-RPC reduces the programming complexity of obtain the corresponding response. The XML-RPC sys- the software by roughly 50% (across various metrics). On the tem reduces the work required by programmers because other hand, the hand-rolled java.net-based implementationof- the programmer does not have to tightly specify their re- fers up to an order of magnitude better network performance mote procedures for a stub generator. On the other hand, in some of our tests. XML-RPC requires developers to know more about the underlying system than an RPC system that provides a 1 Introduction stub generator.

The notion of remote procedure calls (RPC) was first outlined ¯ Since XML-RPC uses a standard XML encoding strat- in [4]. The idea behind RPC is that a software developer is al- egy the system is highly interoperable. A system like ready familiar with the idea of making a procedure call (or Sun RPC can be used across architectures, operating sys- a method call in object-oriented programming vernacular). tems and languages. However, the programmer must Therefore, to make the development of distributed systems have the same RPC system on all platforms. On the other more accessible to all programmers RPCs provide the devel- hand, XML-RPC represents a loose coupling between oper an interface to communications code that is as close as hosts. As long as the hosts both work to the specification possible to simply making a procedure call. Using this no- they can communicate trivially.

tion the developer does not have to write networking code, ¯ Argument marshaling is a non-issue in XML-RPC since thus allowing programmers who do not happen to be experts all data is encoded as text before transmission. in developing network code to write large complex systems ¯ XML-RPC is easily layered on top of existing appli- distributed across any number of hosts on a network. cation protocols (e.g., the HyperText Transfer Protocol Many systems have tried to implement the notion of RPC in (HTTP) [3, 8]). Therefore, integrating XML-RPC with various ways. Sun RPC was an early, widely deployed and other applications is straightforward. used variant of RPC, with CORBA [15], DCOM, JavaRMI, SOAP [6] and many others following. While these sys- While the motivation behind RPC (and in particular XML- tems are all meant for slightly different environments (e.g., RPC) is compelling we wondered about the benefits versus CORBA is an object-oriented version of RPC) they all share the costs. The XML-RPC system is quite flexible and generic the same major goal of making distributed applications easier and therefore likely not optimized for any particular situation. to write and maintain. In the study presented in this paper we While this is a feature it may also be a disadvantage if an focus on a remote procedure call system called XML-RPC1. implementer is trying to obtain good network performance. At its core, XML-RPC defines a framework for transmitting The goal of this paper is to assess XML-RPC by comparing method calls and the resulting responses between processes a set of simple remote procedure calls implemented in XML- across hosts. The transactions are encoded in a standard way RPC, as well as using hand-rolled networking code based on using XML [7]. The XML-RPC approach differs from more the java.net package. While we use the Java version of the XML-RPC system, we note that XML-RPC is implemented ? This paper appears in ACM Performance Evaluation Review, March for many other languages. 2003. 1http://www.xmlrpc.org Our comparison of manually written networking code with

p. 1 XML-RPC-based code has two major thrusts. First, we com- server can handle vectors of arbitrary size as input. In pare the two systems in terms of object-oriented code metrics. our experiments, the client requests the log transforma- That is, we assess the difficulty of writing and maintaining the tion of 50,000 randomly chosen values. This method two variants, as well as the complexity of the resulting code. represents a big request/big response transaction. The second area for comparison is in terms of the network performance attained by the two variants. Most of the code in the testing programs is the same regard-

The remainder of this paper is organized as follows. Ü 2 out- less of which mechanism is being used to communicate over lines our testing program and the environment in which our the network. For instance, the code that actually performs the

tests were conducted. Ü 3 outlines the results of applying above actions is implemented in a PerfActions class and object-oriented metrics to both versions of the communica- is shared.

tions subsystem that we implemented. Ü 4 outlines the results of applying traditional network performance metrics to our Client Server two versions of networking code. Ü 5 discusses using com- pression techniques to reduce the size of XML-RPC transac- Network tions – which is found to be a major cause of performance Comm Comm problems for XML-RPC. Finally, Ü 6 gives our conclusions and outlines future work in this area. Figure 1: Layout of testing programs. 2 Test Environment

To examine the differences between our manually written The general program flow is shown in figure 1. As shown, code based on the java.net package and XML-RPC code we both the client and server applications communicate through wrote two modest testing programs (a client and a server) some communications system which, in turn, exchanges mes- that perform a number of operations. The client can use the sages over some network. The communication system used java.net or XML-RPC codeto call on the server to performthe in our test program depends on command-line input from the requested operations based on command-line options given user. The three currently implemented subsystems are a hand- by the user. The server program receives requests from the rolled application layer protocol based on java.net, an XML- client, performs the requested operation and returns the re- RPC-based system for remote method invocation and a mech- sults. Like the client, the server can use either the java.net- anism that simply invokes the methods locally rather than run- based or XML-RPC code for communication. ning the method on another host (for debugging purposes). The four operations the program performs were chosen to The first two subsystems are explained in the following sub- use different kinds of requests and responses. While we only sections. scratch the surface of all possible method calls and responses 2.1 java.net Code we believe the following methods offer a useful exploration The first communicationsubsystem we discuss is based on the of the space. The operations we use are: java.net package. We use the Socket and ServerSocket classes to setup a separate connection for each transaction.

¯ boolean IsPrime (int n) In principle, we can leave the connection open to serve more This function determines whether the given integer is than one transaction. However, as an initial study, we did prime and returns this determination. This method rep- not want to call multiple methods through a single connection resents a small request/small response transaction. (likewise, we did not use XML-RPC’s boxcar or multi-call

¯ double Average (Vector numbers) feature). We used the Transmission Control Protocol (TCP) This function returns the average of the given vector of [14] to ensure reliable data delivery across the network. In

double-precision numbers. The server can handle a vec- addition, in some of our tests (as outlined in Ü 4.3) we increase tor of arbitrary size. In the tests reported in this paper the size of the send and receive socket buffers to 60 KB. we use a list of 50,000 randomly chosen numbers. This We implement the client networking code in one class and method represents a big request/small response transac- the server code in another. In addition, we use a third class tion. for generic networking routines to implement three methods

¯ Vector GetRandNums (int n) needed by both the client and server. A longer discussion of Ü This function returns a vector of Ò double-precision ran- the complexity of the implementation is given in 3. dom numbers. In the tests reported in this paper the We implement our own simple application layer protocol to client requests 50,000 random numbers. This method conduct the transactions. The first item sent from the client represents a small request/big response transaction. is a 2 byte short integer identifying the remote method we

¯ Vector LogTransform (Vector numbers) wish to invoke. Everything sent after this identifier in both the This function takes a vector of double-precision num- request and the response is specific to the particular method bers, performs a log transformation on each value and being invoked. For example, the IsPrime() request sends returns a vector containing the transformed values. The a 4 byte integer after the 2 byte method identifier and receives

p. 2 a 1 byte response from the server (whether the given integer into two parts. The first subsection presents a brief side-by- is prime). side comparison of one of the stubs in the communications subsystem to give the reader a feel for the code. The second 2.2 XML-RPC Code subsection uses object-oriented metrics to quantify the com- Next we implement a communication subsystem for our test- plexity of the code for each communications subsystem. ing program based on XML-RPC. TCP is used by XML-RPC for reliable data delivery. In addition, the HyperText Trans- 3.1 Qualitative Comparison fer Protocol (HTTP) [3, 8] is used as the application protocol In this subsection we examine the client-side stubs for both due to its vast deployed base. The data portion of the payload the java.net and XML-RPC based implementations. Figure 2 is an XML request to run a particular method with the given shows the java.net-based stub for the LogTransform() parameters or an XML response that encodes the results of method and figure 3 shows the XML-RPC version of the same the remote method call. As with the implementation based stub. The first thing we note is that the java.net code repre- on the java.net package, the XML-RPC code uses one TCP sents more work for the application designer than the XML- connection per transaction. Future work in this area should RPC code because the programmer has to implement all the include reusing TCP connections and assessing the impact of details of the transactions. In the java.net code we explic- conducting multiple transactions in short amounts of time. itly open and close the TCP connection (via alternate meth- Unlike our hand-rolled java.net-based implementation we do ods that we also must implement). In addition, as discussed not have to specify the application layer protocol used by in Ü 2 we implement our own application protocol, sending

XML-RPC. As discussed in Ü 4 we captured the packets the identification number for the remote method we want ex- involved in the XML-RPC transactions to measure perfor- ecuted followed by the parameters for the method. While mance. A side-effect of this is that we are able to show a we slightly generalized the parameter passing for lists of sample of an XML-RPC transaction from our tests, which is double-precision numbers by using the two generic methods given below2: WriteDoubleVector() and ReadDoubleVector() the code cannot be made arbitrarily reusable since each re- POST /RPC2/ HTTP/1.1 mote method will have its own set of arguments and will re- Content-Length: 164 turn its own set of results. In addition, note that if we wanted Content-Type: text/ to pass a vector of integers we would have to add new meth- User-Agent: Java1.3.0 ods to the java.net code to accomplish this task (leading to Host: mercedes:8081 more complexity). Accept: text/html, image/gif, [...] Connection: keep-alive On the other hand, the XML-RPC code is short and easy to understand, even without studying XML-RPC. All parame- RPC system with the name of the remote method to be in- voked (“act.LogTransform” in the case shown). The results act.IsPrime are passed back in a table that provideseasy access for the de- veloper. The client and server must agree on a naming scheme for the remote method and the results. However, this is true 82 no matter what type of system is used for communication. Also note that in both versions of the networking code presented we have shown a stub for LogTransform() method. However, unlike systems like Sun RPC we are not required to have a stub for the remote methods in XML-RPC. 3 Object Oriented Metrics In Sun RPC the program calls a stub rather than the desired method. However, in XML-RPC (and our hand-rolled ver- In this section we compare the complexity of the java.net- sion) the programmer can use the primitives to directly call a based and XML-RPC-based communications code. Note that remote method without writing a stub (or having a tool write we only consider the networking code developed by the appli- a stub). cation designer in this section since the non-communications code is the same regardless of communication subsystem. 3.2 Quantitative Comparison Also, we do not consider the underlying complexity of the We now focus on using coding metrics to assess the two ver- XML-RPC implementation itself, since that is not of concern 3 sions of the communications code. The first metric we use to the application programmer . This section is divided of the to compare our java.net and XML-RPC code is the number 2Note that the XML has been reformatted for presentation in this paper. of lines of code required to implement the required function- The line breaks are not present in the code that is transmitted across the net- ality. The number of lines of code required is a first-order work. examination of the complexity of the code and of how diffi- 3However, the complexity of the underlying code will impact the perfor- mance of the XML-RPC-based system and therefore of interest to the appli- cult to maintain that code may be. In this paper, we report the

cation programmer. This is discussed in more detail in Ü 4. number of lines of code containing actual program statements

p. 3 public Vector LogTransform (Vector nums) { Vector newnums = null;

try { open_conn (); out.writeShort (PerfActions.LogTrans); out.writeInt (nums.size ()); NMisc.WriteDoubleVector (out,nums); out.flush (); newnums = NMisc.ReadDoubleVector (in,nums.size ()); close_conn (); } catch (Exception e) { System.err.println ("LogTransform: " + e.toString()); } return (newnums); } Figure 2: The java.net-based version of the client stub for the LogTransform() method.

public Vector LogTransform (Vector nums) { Vector params = new Vector (); Vector ln_nums = null; Hashtable result;

params.addElement (nums); try { result = (Hashtable)server.execute ("act.LogTransform", params); ln_nums = (Vector)result.get ("logtrans"); } catch (Exception e) { System.err.println ("LogTransform: " + e.toString()); } return (ln_nums); } Figure 3: The XML-RPC-based version of the client stub for the LogTransform() method.

Program java.net XML-RPC Program java.net XML-RPC Client 110 93 Client 1,8,5 1,6,2 Server 136 76 Server 1,7,6 1,6,2 Generic 39 0 Generic 1,3,1 0,0,0 Total 285 169 Total 3, 17, 12 2, 11, 4

Table 1: Lines of code in each communication subsystem (exclud- Table 2: Number of classes, methods and data members needed to ing blank lines and lines containing only comments). each communication subsystem.

(i.e., not counting blank lines and lines that contain nothing than the java.net code. At a minimum the client and server but comments). class must each have five methods (according to the interface they implement). Of these, four methods act as stubs for the Table 1 shows the lines of code required to implement the two forms of communication. As show in the table, the java.net four operations outlined in Ü 2. Further, the client is required to implement a method that returns a string identifying the code is a factor of nearly 1.7 larger in total lines of code when compared to XML-RPC. Most of the savings realized type of networking being used (for logging purposes) and the Start() by XML-RPC comes from the server code and the lack of re- server is required to implement a method that is used to begin a transaction (but, is different from the con- quiring the generic routines used in the java.net version of the code. structor which is used to setup the parameters of the trans- action, but not any particular transaction). Given these re- Next, table 2 takes a deeper look into the complexity and quirements XML-RPC uses one method more than the mini- manageability of the code. This table shows the number of mum in both the client and the server (the constructor in both classes, methods and data members required to implement cases). Meanwhile, the java.net implementation requires sev- each version of the communication subsystem. As with the eral additional methods (e.g., open a TCP connection). Also, above discussion of the number of lines of code, in all metrics as noted above, the java.net code requires 3 generic methods presented in table 2 XML-RPC appears to be less complex used by both the client and server code and statically defined

p. 4 Program java.net XML-RPC plementations access. As shown, the total number of classes Client 76 42 that the java.net-based implementation communicates with is Server 77 33 higher than that of the XML-RPC version of the code by a Generic 15 0 factor of 2. This indicates that the java.net code is more de- Total 168 75 pendent on other portions of the system and therefore may Mean/Class 56.0 37.5 be more difficult to maintain due to changes in the outside Mean/Method 9.5 7.0 classes.

Table 3: Cyclomatic Complexity of both versions of the networking 4 Network Performance code. 4.1 Methodology Program java.net XML-RPC We performed two sets of experiments to assess the network Client 13 8 performance of the java.net and XML-RPC based communi- Server 13 8 cation systems. The first set of experiments involves a local Generic 6 0 10 Mbps Ethernet network with only a simple hub between Total 32 16 the client and server. The second set of measurements are from transactions over the Internet between NASA’s Glenn Table 4: Amount of coupling between the networking code and Research Center (GRC) and Ohio University (OU). The path outside classes. between the client and server encompassed roughly 15 router hops at the time of our experiments (although, as shown in in a non-instantiated generic class. The XML-RPC code has [13] routes can change arbitrarily and so our measure of the no such dependency. Finally, note that the number of data hop-count may not be accurate for the entire test period). The members used by the java.net code is three times higher than hosts used in the local network tests were Pentium III 400 those kept by the XML-RPC code. This is caused by the Mhz FreeBSD 4.4 machines. In the Internet tests, a Pentium need for the java.net code to handle all the networking ob- III 400 Mhz FreeBSD 4.4 machine at NASA GRC was used jects, while XML-RPC abstracts these details from the pro- as the client, while the server at OU was a dual-processor Sun grammer. Enterprise 250 running Solaris 8. In summary, table 2 shows that the java.net code is more com- We invoke all four methods using both the java.net and XML- plex code that will be more difficult to implement, test and RPC based communication framework at roughly 30 second maintain when compared to the XML-RPC implementation intervals (the exact interval is determined using a Poisson – which nicely abstracts the networking code away from the process with a mean of 30 seconds). The testing program application developer. wrapped around the communication systems takes times- We next we use the Cyclomatic Complexity (CC) [9] to gauge tamps before and after the transaction to measure the length of the complexity and amount of testing required for the meth- the remote method call from the application’s vantage point. ods in each implementation of the communication subsystem. In addition, we captured all packets transmitted by the clients 4 Given a method flow graph, the Cyclomatic Complexity is de- using tcpdump . The packet traces show the length of time fined as: each network transaction takes (e.g., without accounting for

the time needed for pre- or post-processing as the application-

´Gµ = e Ò · Ô · ½ Î (1)

level measurement includes). We also use the packet traces to Ò where e is the number of edges in the graph, is the number determine the number of data bytes sent by each version of

of nodes in the graph and Ô is the number of connected com- the networking code. Tcpdump indicated that no packet filter ponents found in the graph. We calculate CC for each method drops occurred during our experiments. in our code. 4.2 Local Network Tests Table 3 shows the CC for the java.net and XML-RPC versions The distributions of the transaction times for both the java.net- of the code. As shown, the java.net-based code has higher to- based and XML-RPC-based implementations for the tests tal complexity(by more than a factor of 2). But, this is largely over the local network are shown in figure 4. From the figure because the java.net implementation has more methods than we observe that the performance does not vary much between the XML-RPC version of the code as indicated by the mean runs. This is expected due to relatively static condition of the CC per method – 9.5 for the java.net code versus 7.0 for the local network. In addition, we see that the java.net implemen- XML-RPC code. So, on a per-method basis the java.net code tation of the networking code always outperforms the XML- involves a more complicated flow graph than the XML-RPC RPC code. With the exception of calling the IsPrime() code (more conditionals, more loops and more reliance on method, the java.net transactions takes roughly an order of other methods). Using XML-RPC abstracts the details of the magnitude less time than the XML-RPC transactions. conditionals, loops, etc. from the developer. For the smallest transaction, IsPrime(), the XML-RPC Finally, we examine the coupling required between our com- transactions take roughly 70 ms. While this is slightly more munication subsystem and outside classes. Table 4 shows the number of outside classes that each of our networking im- 4http://www-nrg.ee.lbl.gov or http://www.tcpdump.org

p. 5 1 1 0.9 0.9 0.8 0.8 0.7 0.7 0.6 0.6 0.5 0.5 CDF CDF 0.4 0.4 0.3 0.3 0.2 0.2 0.1 0.1 0 0 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 0.22 0 5 10 15 20 25 30 35 40 45 50 Transaction Time (sec) Transaction Time (sec)

java.net XML-RPC java.net XML-RPC (a) IsPrime () (b) Average ()

1 1 0.9 0.9 0.8 0.8 0.7 0.7 0.6 0.6 0.5 0.5 CDF CDF 0.4 0.4 0.3 0.3 0.2 0.2 0.1 0.1 0 0 0 5 10 15 20 25 30 35 40 45 50 0 20 40 60 80 100 120 Transaction Time (sec) Transaction Time (sec)

java.net XML-RPC java.net XML-RPC (c) GetRandNums () (d) LogTransform ()

Figure 4: Distribution of the transaction times for the local network experiments. than than the time required by the java.net version of the code the java.net version, which explains some of the overall dis- (which takes roughly 60 ms) the difference is likely not sig- parity between the two communication systems. nificant in systems involving human interaction. For instance, A large disparity (a factor of roughly 50 to over 300) in trans- when conducting a single small transaction a user will not action size between the two IsPrime() versions is shown perceive the difference between the two versions of the code. in table 5. However, the difference between the median trans- However, if many small transactions are performed back-to- action times is only about 10 ms because both transactions back (e.g., by an automated system) the increased amount are small enough to fit in a single packet. So, even though of time required by the XML-RPC version of the code may the XML-RPC transaction is larger none of TCP’s congestion add up to an interval that is perceptible and meaningful to control algorithms [2] come into play in either case. the user. Furthermore, the extra time required for the XML- RPC transactions may hinder the performance of automated Next, we examine the overhead incurred by the two versions systems that utilize numerous RPC calls in completing their of the code that happens before or after the transaction is sent task. across the network. Table 6 shows the median transaction time for the XML-RPC code measured by the application, as We now turn our attention to explaining the discrepancy in well as the median difference between the transaction time performance between the java.net and XML-RPC communi- measured by the application process and the length of the TCP cation systems. Table 5 shows the number of unique bytes connection measured from the packet traces. From this table transmitted for each transaction5. With the exception of the we make several observations: IsPrime() method call the results show that the XML-RPC versionof the programsends roughlysix times more data than

¯ There is little pre- or post-processing overhead involved 5 The java.net version transfers numbers as binary data and therefore the in the IsPrime() method call. transaction is always the same size. However, XML-RPC transmits a tex- tual representation of the data and therefore the size of the transaction can ¯ The Average() and GetRandNums() methods take vary from run to run (e.g., one run might send “3.1” while the next sends roughly the same amount of time to execute when mea- “541.78234”, yielding a difference of six bytes between the two numbers. Therefore, the XML-RPC results presented represent the median size of the sured by the application. This is expected as roughly the all the runs and therefore are denoted as approximate in the table. same amount of data is exchanged in both cases – just in

p. 6 Transaction Request Size (bytes) Response Size (bytes)

IsPrime – java.net 6 1  IsPrime – XML-RPC  339 332

Average – java.net 400,006 8  Average – XML-RPC  2,513,756 334

GenRandNums – java.net 6 400,000  GenRandNums – XML-RPC  346 2,513,812

LogTransform – java.net 400,006 400,000  LogTransform – XML-RPC  2,459,521 2,466,477

Table 5: Transaction sizes in bytes. The numbers designated as approximate represent the median number of bytes transferred. Exact numbers of bytes are given where the number of bytes does not vary across transfers.

Method Transaction Time Delta ply multiply the transaction time experienced by the java.net Time (sec) (sec) by 6.15 we expect an XML-RPC transaction time of just over

IsPrime () 0.069 0.003 24 seconds. Combining the expected transfer time ( 24 sec-

Average () 46.787 16.559 onds) with the encoding/decodingtime ( 66 seconds) we get GetRandNums () 47.242 0.065 an expected transaction time of 90 seconds. Even if this anal- LogTransform () 95.303 16.706 ysis is a bit off we believe that the XML-RPC processing and the added bytes that XML-RPC sends across the network ex- Table 6: Comparison of median time required for XML-RPC Log- plain the majority of the performance difference between the Transform() transaction and the difference between two communication systems. the total elapsed time of the transaction and the time re- quired by the underlying TCP connection. 4.3 Internet Tests We now describe measurements taken on the Internet path be- different directions. tween NASA GRC and Ohio University. We tested both the java.net and XML-RPC communication subsystems over the ¯ We observe a difference of over 16.5 seconds be-

tween the total transaction time and the network trans- path as described above (see Ü 4.1). Preliminary measure- fer time in the Average() and LogTransform() ments illustrated a performance problem across the Internet calls. In these two calls the client encodes a vector of caused by FreeBSD’s default TCP advertised window size, 50,000 numbers before starting the TCP connection and which in some cases was too small to fully utilize the avail- 6 pushing the data over the network. able bandwidth . Therefore, for our Internet tests we added a variant of the hand-rolled java.net code, denoted “java.net+” ¯ We note little difference (roughly 65 ms) between that increases the socket buffer sizes (and, therefore, TCP’s the total transaction length and the duration of the advertised window). TCP connection for the GetRandNums() method

call. As noted above, the total transaction time for TCP throughput, Ì , is ultimately limited based on the size of ÊÌÌ the GetRandNums() call is similar to that of the the advertised window, Ï , and the round-trip time, , of Average() call. However, the GetRandNums() the network path, as follows [14]: function cannot encode data before the start of the TCP

connection because the server creates the vector of data Ï = Ì (2)

based on input from the client. ÊÌÌ Throughput can be further limited by TCP’s congestion win- From the above table we can sketch a back-of-the-envelope dow [10, 2] which is based on the measured load on the net- analysis to measure the difference between the java.net work. When a TCP connection fills the entire advertised win- and XML-RPC versions in terms of network usage for the dow with data the sender may be able to achieve higher per- LogTransform() method. We know that encoding a formance if the advertised window were increased. In the vector of 50,000 values takes roughly 16.5 seconds from java.net+ variant in our measurements the advertised window the above table. So, approximately 33 seconds of the is increased from the default (32 KB) to 60 KB. LogTransform() transaction are spent encoding data to Figure 5 shows the distributions of transaction times for the be transmitted. If we assume that parsing the incoming XML four method calls across the Internet path. When compared to and adding the values to the vector takes the same amount of the local network tests, the Internet measurements show more time as encoding the data for transit we need another 33 sec- variability, as expected. Again the plot shows that (with the onds for decoding for the LogTransform() routine. So, of exception of the IsPrime() method) the XML-RPC trans- the roughly 95 seconds that the transaction takes (on median) actions take roughly an order of magnitude longer than the 66 seconds are spent encoding/decoding the data. We also java.net (and java.net+) transactions. Given the analysis of the know (from table 5) that the XML-RPC code sends roughly 6.15 times as much data as the java.net code. So, if we sim- 6Note that [1] shows that this situation is not rare on the Internet.

p. 7 1 1 0.9 0.9 0.8 0.8 0.7 0.7 0.6 0.6 0.5 0.5 CDF CDF 0.4 0.4 0.3 0.3 0.2 0.2 java.net java.net 0.1 java.net+ 0.1 java.net+ XML-RPC XML-RPC 0 0 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 0 10 20 30 40 50 60 70 80 90 Transaction Time (sec) Transaction Time (sec) (a) IsPrime () (b) Average ()

1 1 0.9 0.9 0.8 0.8 0.7 0.7 0.6 0.6 0.5 0.5 CDF CDF 0.4 0.4 0.3 0.3 0.2 0.2 java.net java.net 0.1 java.net+ 0.1 java.net+ XML-RPC XML-RPC 0 0 0 20 40 60 80 100 120 0 50 100 150 200 250 Transaction Time (sec) Transaction Time (sec) (c) GetRandNums () (d) LogTransform ()

Figure 5: Distribution of the transaction times for the Internet experiments. transaction sizes and the processing overhead presented in the packet trace. While we are only analyzing a single transac- last subsection these results are not surprising. In addition, we tion we note that the request/response size in this transaction see that using larger TCP window sizes increases the perfor- is less than 0.04% different from the median transaction sizes

mance over the stock TCP configuration. This shows that a reported in table 5 in Ü 4.2. programmer who knows how to tune the sockets7 can induce Table 7 shows the sizes of the XML-RPC request and re- better performance, which is a downside of using something sponse for the chosen transaction, as well as the sizes of like XML-RPC that hides these details from the programmer. the java.net requests and responses for comparison. The Finally, we note that the performance of the small first compression technique we use is to simply gzip the re- IsPrime() transaction is generally invariant of the un- quest/response. This reduces the size of the request/response derlying communications scheme used. Since this transac- to less than a quarter of the size of the original XML-RPC tion involves only one-segment in each direction for both the transaction. However, the transaction size of the gzipped ver- java.net and XML-RPC code the performance is dominated sion is still on the order of 25% larger than the java.net trans- by the latency between NASA GRC and Ohio University. action. On the Pentium III FreeBSD machines we ran our tests on the gzip operation took a little over 1 second, with 5 Compression the de-compression taking just under 0.25 seconds. There- As discussed in the last section, one of the largest perfor- fore, we believe that using gzip to compress the transactions mance problems with XML-RPC lies with the number of would be a net win for large transactions. For comparison we bytes transmitted into the network. Therefore, we now briefly gzipped the java.net transaction byte streams. The gain for the analyze the usefulness of abbreviations and compression in java.net-based transaction is not as substantial as for the text- mitigating this performance barrier. For our analysis we took based XML-RPC-based transactions, only yielding a savings one of our XML-RPC LogTransform() transactions and of roughly 6% over the uncompressed version. extracted the data bytes (excluding protocol headers) from the Our results show benefits to using gzip for XML-RPC trans- 7There are additional socket changes that could be made besides adjusting actions. However, there are also disadvantages, such as ob- the advertised window size based on the application at hand (e.g., disabling scuring the payload such that debugging XML-RPC is more the Nagle algorithm [12]).

p. 8 Technique Request Size (bytes) Response Size (bytes) XML-RPC 2,459,472 2,466,521 java.net 400,006 400,000 GZipped XML-RPC 527,105 495,890 GZipped java.net 376,967 375,040 Abbreviate 1,559,465 1,566,506 Abbreviate/Combine 1,309,472 1,316,521 New XML Tag 1,159,472 1,166,521 New XML Tag/Gzip 478,021 451,236

Table 7: Impact of compression on the LogTransform() transaction size. difficult and added reliance by XML-RPC on another . be replaced, just augmented to support any of the techniques So, we next look at three techniques that can be completely outlined in this section. A method for the client to ask the implemented in XML-RPC. server if the new technique is supported would have to be added to XML-RPC. One migration path may be to query

First, as shown in Ü 2.2 XML-RPC is verbose with respect to the server only for large transactions that would benefit from the parameters being passed to a method. In the case of an vector of double precision numbers every number is transmit- sending significantly smaller transactions. For instance, pay- ing the additional RTT cost to query the server for its sup- ted as a string of the form: ported techniques would not seem to be worth it when exe- NN.NN cuting the IsPrime() method, but may well allow a perfor- mance improvement for a method like LogTransform() One possible extension to XML-RPC is to use abbreviations. when working on a large list of numbers. As shown in table 7, abbreviating “value” with “v” and “dou- Also, note that the size of transactions impacts the end host’s ble” with “d” results in transmitting over 900,000 fewer bytes RAM usage. As shown in the last section, requests are en- (or roughly 36% of the transfer size) in each direction when coded before being transferred over the network. There- compared to using the current scheme. fore, these transactions require a user-space memory “staging The next row of the table shows the size of the transactions if area”. In addition, since XML-RPC connections last longer XML-RPC were to abbreviate and combine the “value” and than hand-rolled java.net-based transactions kernel memory “double” tags to be transmitted like: for TCP’s retransmission buffer is also required for a longer amount of time (potentially impacting other network applica- NN.NN tions). When compared to abbreviations alone this reduces the Finally, we note that TCP/IP header compression [11, 5] can amount of transmitted data by roughly 200,000 bytes. When also be employed at various links across a network path to re- compared to standard XML-RPC using this technique reduces duce the number of bytes that must be transmitted. However, the transfer size by more than 1.1 MB. since TCP/IP headers generally represent a small fraction of Next we introduce a new (abbreviated) tag with an argument the bytes transmitted in the course of a long transfer (3-8%) instead of using begin and end tags to get encoding like: compressing only the headers leads to only modest decreases in transfer time. Also note that the W3C’s Web Services Ini- tiative8 is also investigating ways to compress “on the wire” XML. This new tag saves 3 bytes per double-precisionnumber trans- mitted when compared to the abbreviate and combine tech- 6 Conclusions and Future Work nique. Further, the new-style tag represents less than 20% of the overhead imposed by the current XML-RPC technique The following are the major conclusions from our study of (6 bytes as opposed to 32 bytes). remote method invocation using the java.net package and the As a last step we examine the efficacy of gzipping transac- XML-RPC system: tions that use the new-style tag introduced above. As the ta- ble shows, using the new tag with gzip reduces the transaction ¯ The java.net implementation is roughly 50% more com- sizes by roughly 9% over using gzip on the standard encod- plex to code and maintain when compared the the XML- ing. However, the resulting transaction is still larger than the RPC code across a variety of coding metrics. java.net encoding.

¯ The time required for small transactions is roughly the Since all the abbreviations and tag combinations introduced same regardless of underlying communication system above are new (i.e., not understood by XML-RPC) they can across both the local network and the Internet. be added without breaking any of the current functionality. That is, the current tags used by XML-RPC do not need to 8http://www.w3c.org/2002/ws/

p. 9 ¯ The java.net code performs up to an order of magnitude References better than the XML-RPC code when transfering a sig- nificant amount of data. The performance problems of [1] M. Allman. A Web Server’s View of the Transport Layer. Computer Communications Review, 30(5):10–20, Oct. XML-RPC are largely caused by (i) the increased size of XML-RPC transactions (an increase of over a factor 2000.

of 6 when compared to the java.net code), and (ii) the [2] M. Allman, V. Paxson, and W. R. Stevens. TCP Con- overhead of encoding and decoding XML. gestion Control, Apr. 1999. RFC 2581.

¯ We show several compression and abbreviation schemes [3] T. Berners-Lee, R. Fielding, and H. Nielsen. Hypertext to reduce the size of XML-RPC transactions. Even with Transfer Protocol – HTTP/1.0, May 1996. RFC 1945. the compression/abbreviation strategies the hand-rolled [4] A. Birrelland B. J. Nelson. ImplementingRemote Pro- java.net transactions are always smaller than XML-RPC cedure Calls. ACM TOCS, 2(1):39–59, Feb. 1984. transactions. However, using some form of compression within XML-RPC will likely have a positive impact on [5] C. Bormann, C. Burmeister, M. Degermark, performance. H. Fukushima, H. Hannu, L.-E. Jonsson, R. Hakenberg, T. Koren, K. Le, Z. Liu, A. Martensson, A. Miyazaki, ¯ Tweaking socket options can yield positive performance K. Svanbro, T. Wiebke, T. Yoshimura, and H. Zheng. RObust gains. We show performanceimprovementby increasing Header Compression (ROHC): Framework and Four Profiles: the socket buffer sizes in our Internet tests (but, note that RTP, UDP, ESP, and Uncompressed, July 2001. RFC 3095. tweaking the advertised window size had no effect on the tests across the local network). [6] D. Box, D. Ehnebuske, G. Kakivaya, A. Layman, N. Mendelsohn, H. F. Nielson, S. Thatte, and D. Winer. Sim- ple Object Access Protocol (SOAP) 1.1. Technical report, In addition, we have identified a number areas for future work World Wide Web Consortium, May 2000. in this area: [7] T. Bray, J. Paoli, C. M. Sperberg-McQueen, and E. Maler. Extensible Markup Language (XML) 1.0 (Sec-

¯ The encoding/decoding of transactions is a barrier to ond Edition). Technical report, World Wide Web Consortium, good performance in the Java version of XML-RPC. Fu- Oct. 200. ture work should quantify this further, keeping the fol- lowing questions in mind: Is this purely a Java Virtual [8] R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, and Machine issue? Does the encoding/decoding perform T. Berners-Lee. Hypertext Transfer Protocol – HTTP/1.1, Jan. differently in other languages? Are there more efficient 1997. RFC 2068. techniques to encode/decode that will aid XML-RPC [9] B. Henderson-Sellars. Modularization and McCabe’s transaction times? Cyclomatic Complexity. Communications of the ACM,

¯ The XML-RPC system could be extended to give the 37(12):17–19, Dec. 1992. programmer optional access to the underlying sockets. [10] V. Jacobson. Congestion Avoidance and Control. In In doing this, XML-RPC would allow savvy program- ACM SIGCOMM, 1988. mers to tweak socket options to obtain better perfor- [11] V. Jacobson. Compressing TCP/IP Headers For Low- mance for their particular application. Speed Serial Links, Feb. 1990. RFC 1144.

Finally, this paper only scratches the surface of evaluating [12] J. Nagle. Congestion Control in IP/TCP Internetworks, RPC mechanisms in general (and XML-RPC in particular). Jan. 1984. RFC 896. Analyzing additional RPC systems from a numberof different [13] V. Paxson. End-to-End Routing Behavior in the Inter- vantage points (network performance, coding complexity, en- net. In ACM SIGCOMM, Aug. 1996. coding/decoding algorithm complexity) is useful and should [14] J. Postel. Transmission Control Protocol, Sept. 1981. be undertaken more often as we attempt to write and refine RFC 793. complex systems. [15] S. Vinoski. CORBA: Integrating Diverse Applications Acknowledgments Within Distributed Heterogeneous Environments. IEEE Com- munications, 14(2), Feb. 1997. Thanks to Ethan Blanton, Shawn Ostermann and Vern Paxson for arranging the remote hosts used to test against. Thanks to Doug Dechow, John Lambert, Rich Slywczak, Lee White and the anonymous reviewers for comments on an earlier version of this paper. I engaged in spirited discussions with Joseph Ishac, Eric Megla and Vern Paxson that improved this work. Finally, Doug Dechow helped a great deal with the analysis

presented in Ü 3. My thanks to all!

p. 10