Streaming-Based XML Encryption and Decryption
Chair for Network- and Data Security
Horst Görtz Institute for IT Security
Ruhr-University Bochum
Benjamin Sanno Matriculation Number: 108006248774
Supervisors: Prof. Dr. Jörg Schwenk, Juraj Somorovský
October 14, 2010

Declaration of Authorship
I hereby declare
• that I have written this thesis without any help from others and without the use of documents and aids other than those stated below,
• that I have mentioned all used sources and that I have cited them correctly according to established academic citation rules,
• that I have produced this thesis without the prohibited assistance of third parties and without making use of aids other than those specified,
• that this thesis has not previously been presented in identical or similar form to any other German or foreign examination board.
Bochum, October 14, 2010
Benjamin Sanno
Abstract
XML Encryption is a W3C recommendation that specifies how XML elements should be encrypted. Thereby, message confidentiality can be achieved. However, conventional frameworks applying XML Encryption use DOM-based XML processing. The DOM API is tree-based and therefore the whole document must be parsed before data can be encrypted or decrypted. In contrast, SAX and StAX perform streaming-based XML processing and their output is a stream of events. So far there are no efficient and fast frameworks that apply XML Encryption to a stream of XML events. In this thesis, an event pipeline concept is used to further process the output of streaming-based XML APIs. Efficient and fast event pipeline modules are proposed that facilitate encryption and decryption. Each module was implemented and the decryption modules were analyzed to determine which is most efficient. Measurements reveal that an event pipeline which uses streaming XML parsers has advantages over the DOM API with regard to memory requirements and execution time for the parsing and decryption process.
Contents
1 Introduction
2 Related Work
  2.1 XML
  2.2 XML APIs
    2.2.1 SAX - Simple API for XML
    2.2.2 DOM API
    2.2.3 StAX - Streaming API for XML
  2.3 Event Pipeline
3 Design Concepts
  3.1 Nested XML Encryption and Decryption
  3.2 Stream Encryption
  3.3 Stream Decryption
    3.3.1 Push-Pull Problem on Event Streams
    3.3.2 Event-Stream Decryption
    3.3.3 Byte-Stream Decryption
4 Implementation
  4.1 Event Pipeline
  4.2 Encryption Module
  4.3 Decryption Modules
  4.4 Modification of the Javolution Parser Source Code
5 Performance Analysis
  5.1 Experimental Setup
    5.1.1 Measuring Execution Times
    5.1.2 Measuring Memory Usage
  5.2 Excursion: Parser Analysis
  5.3 Excursion: Base64 Decoding Performance
  5.4 Parsing Analysis
    5.4.1 Memory Usage
    5.4.2 Execution Time
  5.5 Decryption Analysis
    5.5.1 Memory Usage
    5.5.2 Execution Time
6 Conclusion
List of Figures
2.1 Simple API for XML [17]
2.2 Transformation of a document to its DOM representation [7]
2.3 Sun's Project X reference implementation of a DOM API [17][28]
2.4 Streaming API for XML
2.5 Basic event pipeline design
3.1 Event pipeline configuration for nested encryption or decryption
3.2 Internal design of the encryption module
3.3 Core design problem: interface between character events and the parser
3.4 Push-pull problem on streams
3.5 Decryption module that implements simple buffering
3.6 Decryption module that implements thread-based decryption
3.7 Illustration of the CharactersInputStream component
4.1 Prototype suite package overview
4.2 Inner body of the pipeline package
4.3 Strongly simplified call graph of the pipeline implementation
4.4 Call graph of the encryption module
4.5 Internal dependencies of the decryption package
4.6 Internal dependencies of the ESBufferDecrypterModule
4.7 Internal dependencies of the ESThreadDecrypterModule
5.1 Parser: heap size over time during parsing of 200.000 XML elements that have either the same tag name or all different tag names
5.2 Xerces Parser: MAT analysis results
5.3 Parser: heap size over time during parsing a completely encrypted XML document
5.4 Base64 performance analysis
5.5 Parser modules: heap size over time during parsing of 200.000 XML elements (150 different names)
5.6 Parser modules: performance analysis
5.7 Decryption modules: heap size over time during parsing and decryption of 200.000 XML elements
5.8 Decryption modules: heap size over elements during parsing and decryption of 100.000 XML elements
5.9 Decryption modules: heap size over decryption progress in percent (200k elements, 8MB file)
5.10 Decryption modules: performance analysis
1 Introduction
The Extensible Markup Language (XML) [1] is used in many modern applications, e.g. web services, cloud computing, database management, Service Oriented Architectures (SOA), et cetera.
XML Encryption XML files can contain sensitive and confidential data. Therefore, security and especially privacy are important. To achieve these goals, the computationally intensive encryption of XML data based on the XML Encryption recommendation [5] is necessary.
XML APIs A common Application Programming Interface (API) to access XML files is the DOM API [6]. The Document Object Model (DOM) is an in-memory representation of the XML file. This API is usually used to read an XML file. However, a disadvantage is that much system memory is occupied when large files are parsed. Furthermore, multiple DOM objects must be created and stored in memory if XML Encryption is used. Another concept is that of streaming-based XML APIs like SAX (Simple API for XML) [14] or StAX (Streaming API for XML) [19]. These do not create an in-memory representation of the XML file. Their output is a stream of XML events, not a DOM. Streaming-based concepts have great advantages over tree-based concepts. XML element data becomes available to the subsequent application as soon as the parser has read it. This results in low latency, efficient use of CPU cycles and more reactive client applications. Another advantage of the streaming concept is that XML source files can be larger than the system memory. For instance, XML database files of insurance companies, in cloud computing environments, and others can have sizes up to 10GB or more [29, 30]. Some modern XML applications are executed on mobile devices with limited computing and memory resources, so that a DOM object would occupy a large proportion of those limited resources [31].
Streaming-Based Processing "It can be argued that the majority of XML business logic can benefit from stream processing, and does not require the in-memory maintenance of entire DOM trees." [18]. An event pipeline pattern can be used to apply complex functionality to streaming-based XML APIs. To address confidentiality of information, an XML event pipeline should realize the XML Encryption recommendation to enable cryptographic functionality for those XML APIs. Until now, there has been little scientific work in this field of IT security [2, 3, 4].
Prototype Implementation The main goal of this thesis is to design efficient and fast event-stream encryption and, above all, decryption concepts for event pipeline modules. These concepts are implemented to demonstrate their functionality. The final decryption component should be able to process nested encrypted data as efficiently as possible. Although the source XML file contains nested and encrypted XML elements, the final implementation should be able to process the elements strictly in sequential order. Otherwise, the advantages of the event pipeline pattern may be lost.
Optimization Usually, network bandwidth as well as execution time, latency and efficient memory usage are crucial factors in computing environments. As a consequence of this, it is very important to address performance and processing efficiency. Therefore, the performance of the prototype implementation was extensively analyzed for this thesis.
The next chapter explains the related work. Basic concepts like XML, XML Encryption, the event pipeline pattern, and XML APIs are described. Chapter three is about design concepts for streaming-based encryption and decryption. The fourth chapter deals with the prototype implementation details. Finally, in chapter five the performance analysis results are illustrated and evaluated.
2 Related Work
The following sections introduce the basic concepts that are necessary to understand the subsequent chapters. Terms and acronyms are defined and specified, e.g. XML, API, SAX, DOM, tree-based, streaming-based, base64, event pipeline, et cetera.
2.1 XML
The acronym XML stands for Extensible Markup Language. It has been a W3C recommendation since 1998 [1]. The W3C is a community of information technology experts. This organization publishes technical specifications and recommendations to ensure the long-term growth of the World Wide Web. XML was designed to transport and store data. Originally, it was intended to be human readable to some extent for debugging and other administrative work. An XML document consists of markup and character data. Markup is defined as all tags, references, declarations, sections, and comments. Character data is all text that is not markup ([1] Section 2.4). There are rules governing how markup elements can be used, which results in strict, tree-structured documents. A simple XML example is shown in Listing 2.1. The core concept is the XML element, which consists of a start-tag, an end-tag and content. Tags must be well-formed, i.e. for each start-tag an end-tag with the same name must exist in the document. Otherwise, a parser would throw an exception. All tags must be nested correctly. That means all elements must be closed in the reverse order of their start-tags. Every XML document has a header and a body. The header specifies that the text file is an XML document. Listing 2.1 is a simple XML document and its header is the first line, the XML declaration. Beside the version attribute, it can also contain attributes like encoding or standalone to signal to the XML parser how to interpret the body's content correctly.
[Listing 2.1: a simple two-line XML document — an XML declaration header followed by a single element]
XML Encryption The W3C recommendation document for the standardization of XML data encryption is called "XML Encryption Syntax and Processing" and was published in 2002 [5]. It specifies how encrypted data must be embedded in an XML structure. The ciphertext must be base64 encoded and surrounded by a special set of XML elements: the <EncryptedData> element and its children, shown in Listing 2.2.

[Listing 2.2: the <EncryptedData> structure — an <EncryptedData> root element with an <EncryptionMethod> element naming the algorithm and a <CipherData> element whose <CipherValue> child contains the base64 encoded ciphertext]

The <EncryptedData> element replaces the encrypted element or content in the document, so that only this wrapping structure and the base64 encoded ciphertext remain visible to a parser.
2.2 XML APIs
An application programming interface (API) is a software component that resides between different software applications. It facilitates standardized interaction and communication between those components. There are a few XML APIs that follow various objectives. The following subsections introduce some important XML APIs and explain their specifics in detail. First, SAX is described. Then, the document object model is explained, along with how it is derived from an XML source. Finally, the StAX interface is explained in detail.
2.2.1 SAX - Simple API for XML

SAX is a modern XML API although it has not changed significantly over the last years. The concept was originally implemented by David Megginson in 1997 [13]. The current final version is SAX 2.0.2, released on April 28th, 2004. It is open source and can be downloaded from SourceForge (http://sourceforge.net/). The Simple API for XML (SAX) is only able to read XML data. Writing is not supported [18]. A SAX parser reads the XML source and for every detected XML item (e.g. tags, attributes, namespaces, etc.) it creates an event, so that a stream of events is generated while it is parsing. This results in low and consistent memory usage during processing (in theory). SAX creates a series of events and pushes these to the client application. For this reason, this API is called event-based (or streaming-based). The client application has to be reactive to the parser because the parser controls when new events are created and pushed. For this reason, this XML API is also called an active API [10].
[Figure 2.1 shows the SAX components: a SAXParserFactory creates the parser, which reads the XML source from an InputStream and pushes callbacks — startDocument(), startElement(), characters(), endElement() — to the client application via the registered ContentHandler, ErrorHandler, DTDHandler, and EntityHandler.]

Figure 2.1: Simple API for XML [17]
Figure 2.1 shows which components work together. The SAXParserFactory class (the current JRE provides the XMLReaderFactory) creates an instance of a SAX parser. It is system dependent which parser variant is the system default. There are SAX parsers from different companies and developers like Oracle, Sun, and Apache, or even specific solutions such as the Javolution SAX2 parser. Once a SAX parser is created, it is connected to an XML source and to handlers. The ContentHandler is important because the parser invokes ContentHandler methods whenever a new XML item is read from the source. The client application reacts to those method calls, i.e. for each of those calls it can execute some code, but it is not able to force the ContentHandler or the parser object to read further and call the next method. This is why the parser of SAX is called
a push parser.
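The push model described above can be sketched with the JDK's SAX API. This is a minimal illustration, not code from the thesis prototype; the counting handler exists only to show that the parser, not the client, drives the callbacks:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSketch {
    // Counts start-tags: the parser pushes one startElement() call per tag.
    public static int countElements(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++; // invoked by the parser, not by the client
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }
}
```

The client code can only react inside the handler methods; it cannot ask the parser for the next event, which is exactly the "active API" property described above.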
2.2.2 DOM API

DOM API is an acronym for Document Object Model Application Programming Interface. It has been a W3C recommendation since 1998 [6]. This API concept is a platform and language independent interface that specifies how to represent and interact with documents like XML, HTML and similar languages. One of its basic features is the ability to provide dynamic access to the document's content. Furthermore, the DOM API is able to dynamically update the style and content of the document, i.e. get, add, modify or delete XML elements at arbitrary locations within the document (random-access manipulation). Hence, DOM should be used if the application accesses the content of an XML file multiple times at different positions during its execution. If the content of an XML file is not accessed in the same sequential order as it is stored in the file, DOM is a very good solution. To provide this functionality, the tree-structured source, i.e. the HTML or XML file, has to be transferred completely into system memory. All source bytes must be read and stored in main memory.
[Figure 2.2 shows an HTML document containing a table (with the rows "Shady Grove / Aeolian" and "Over the River, Charlie / Dorian") next to its DOM representation: each element and text node of the source becomes a node in the DOM tree.]

Figure 2.2: Transformation of a document to its DOM representation [7]
Figure 2.2 illustrates the W3C recommendation (see [7]) how a document should be transformed. An XML file consists of XML elements, and every XML element is represented by a DOM node. The document’s structure is represented in a DOM node-tree, so the branches in the DOM tree represent the object relations and hierarchy. For this reason, DOM is called tree-based. The DOM representation of an XML structure is also called XML DOM object.
[Figure 2.3 shows the DOM construction chain: a DocumentBuilderFactory creates a DocumentBuilder, which uses a SAX parser together with a DocumentHandler, ErrorHandler, DTDHandler, and EntityHandler to read the XML source from an InputStream and build the DOM object in memory.]

Figure 2.3: Sun's Project X reference implementation of a DOM API [17][28]
Figure 2.3 illustrates which components work together to transform an XML source into a document object that is stored in the main memory. In the example, Sun Microsystems' implementation of the DOM API is shown, but there may of course be different DOM API implementations. DOM is just a model. Therefore, any DOM API must use a parser to read an XML file. Because they build on different SAX parser implementations, the following DOM APIs can be distinguished (see also http://xerces.apache.org/xerces-j/faq-migrate.html):
• Apache Xerces project: org.apache.xerces.parsers.DOMParser() (uses SAX parser in package org.xml.sax.*)
• Oracle DOM parser: oracle.xml.parser.v2.DOMParser() (uses SAX parser in class oracle.xml.parser.v2.SAXParser())
• Sun DOM parser: com.sun.xml.tree.XmlDocument.createXmlDocument(uri) (uses SAX parser in package com.sun.xml.parser.Parser())
A Factory class (e.g. javax.xml.parsers.DocumentBuilderFactory) configures and obtains the parser variant that is used. In contrast to SAX, a DocumentBuilder object is instantiated by an instance of the DocumentBuilderFactory class. A DocumentHandler is created instead of a ContentHandler. This special handler is able to create a Document object (DOM object) while the parsing process is executing. Once parsing is done, the DocumentBuilder instance returns a Document object. Then, a client application can use the DOM object interface to access the DOM object and traverse the XML structure.
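The factory/builder chain described above can be sketched as follows. This is a minimal illustration with JDK classes only, not thesis code; note that nothing is accessible until parse() has read the entire source into memory:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DomSketch {
    // Builds the complete in-memory Document, then accesses it randomly.
    public static String rootName(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // Only after the whole file is parsed can the tree be traversed:
        return doc.getDocumentElement().getTagName();
    }
}
```

The full tree is held in memory for the lifetime of the Document object, which is the memory cost discussed in the introduction.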
2.2.3 StAX - Streaming API for XML

StAX is another conventional XML API. SAX as well as StAX is open source. The StAX 1.2 final release is the last officially released source code, published on June 19th, 2006. Binaries, source code and documentation can be downloaded from the official homepage [19], but the sources included in the Java Development Kit (JDK) or similar should be newer (e.g. XMLInputFactory is version 1.2 in JDK 1.6.0_20 and version 1.0 in the stax-src-1.2.0 package from the official homepage). StAX is part of the Java Platform SE since version 6 and is therefore further developed in this context. The StAX concept is formally specified in JSR-173 [20]. These are the StAX design goals:
• API for reading and writing XML documents (symmetrical bi-directional API)
• efficient, extensible, simple and modular
• J2ME compatible
StAX provides an interface to read XML data similar to SAX and furthermore to write XML documents. In contrast to SAX, it uses a pull parser design to read XML sources. Hence, such a parser is reactive to method calls from subsequent program components. Therefore, the client application is in control of the parser. This is the exact opposite of application designs that use SAX. Both SAX and StAX use event-based parsers to read the source, and the advantages of event-based processing are the same: low memory usage, because no in-memory document representation is created, and low latency for the client application, because output is created from the very first event. A restriction is that all event-based parsers are strictly sequential. They can only go forward in the event sequence. If a component of the client application requires access to multiple events in the XML structure, it has to store the events that it needs during the parsing process.
[Figure 2.4 shows the StAX design: an XMLInputFactory creates an XMLStreamReader, the pull parser reads the XML source from an InputStream, and the client application drives it with while(hasNext()) and next() calls.]

Figure 2.4: Streaming API for XML
Figure 2.4 demonstrates the simple design of this streaming API. A new instance of the parser is created by the XMLInputFactory class, and the XML source is passed to the parser object. When this is done, the client application can iterate over all XML events that the parser is able to read. There are two types of StAX interfaces: iterator-based (XMLEventReader) and pointer-based (also cursor-based) (XMLStreamReader). The Java EE Tutorial 5 explains which class should be used: "If performance is your highest priority (for example, when creating low-level libraries or infrastructure), the cursor API is more efficient. If you want to create XML processing pipelines, use the iterator API." ([18] on page 546). The XMLEventReader class works on top of the XMLStreamReader, and hence the performance of the latter should be slightly better (see also [21]). Therefore, when using Sun's StAX implementation (SJSXP) the "cursor API (...) is the most efficient way to read XML data" [22].
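The pull model with the cursor API can be sketched as follows; a minimal illustration (not thesis code) in which the client, not the parser, decides when the next event is read:

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxSketch {
    // The while(hasNext())/next() loop from Figure 2.4: the client pulls.
    public static int countStartElements(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int count = 0;
        while (reader.hasNext()) {                 // client is in control
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                count++;
            }
        }
        reader.close();
        return count;
    }
}
```

Compare this with the SAX sketch in the previous subsection: the same counting logic, but inverted control flow.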
2.3 Event Pipeline
Figure 2.5 shows an illustration of an XML event pipeline. Detailed explanations can be found in David Brownell's book SAX2 [10, Chapter 4.5] or in the paper by Grushka, Jensen, et al. on design patterns for event-based processing [3]. This abstract design consists of consecutive software components called modules. New modules can be added or inserted statically at start-up or dynamically at runtime.
[Figure 2.5 shows the basic event pipeline: a parser module reads the XML source from an InputStream and creates an event sequence; subsequent modules read, transform, absorb, or create events; the last module produces the output.]

Figure 2.5: Basic event pipeline design
The arrows in the figure illustrate how data is pushed through the modules. Data is transmitted from module to module via XML events. In general, these events consist of the payload and metadata. All data references that are transmitted between modules are appended to XML event objects. It is important that XML events do not carry complete data objects, only references. Then, the pipeline can be memory efficient and fast.
Event objects need not be processed by every module. A module can pass an event through or decide to process it, depending on its functionality. Events can also be created by a parser module or any other module of the pipeline. A parser module creates a new event for each XML item that it reads from the source. Therefore, XML events are created for each start tag, end tag, namespace declaration, start of document, end of document, or character block between tags. Events can also be absorbed or generated within the pipeline, as illustrated in Figure 2.5. In more complex applications it could be necessary to send information backwards through the pipeline, i.e. a module should be in a position to push events to the previous module. For example, a web service client application can cancel the transmission and stop the decryption of an encrypted SOAP message ([34]) exactly when an erroneous or invalid XML element has been detected. This message "could be dropped without wasting valuable processor capacity for cryptographic operations" [4] to avoid, for example, an Oversize Payload attack. This attack is similar to conventional buffer overflow attacks: "an attacker may exploit a vulnerability in a Web Service by sending overtly large XML files" [37, page 157]. Modules can be classified as active or reactive. Typically, each module receives XML events from its previous module. Modules that react to another module's XML event stream are called reactive. Contrary to other modules, parser modules do not receive XML events from the pipeline. Instead, they obtain a byte input stream (typically an extension of java.io.InputStream) from a specified source. Consequently, a parser module is in control of its data source, hence it is classified as active.
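The pass-through/transform/absorb behavior of pipeline modules can be sketched as follows. The interface and classes here are purely illustrative and not the thesis's actual module API; events are simplified to plain strings, whereas the prototype transports richer XML event objects:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical module contract: receive an event, decide what to do with it.
interface PipelineModule {
    void processEvent(String event, PipelineModule next);
}

// A transforming module: rewrites the event, then pushes it onward.
class UppercaseModule implements PipelineModule {
    public void processEvent(String event, PipelineModule next) {
        next.processEvent(event.toUpperCase(), null);
    }
}

// A terminal module: collects whatever reaches the end of the pipeline.
class SinkModule implements PipelineModule {
    final List<String> received = new ArrayList<>();
    public void processEvent(String event, PipelineModule next) {
        received.add(event);
    }
}

public class PipelineDemo {
    // Push one event through a two-module chain and return the result.
    public static String runThrough(String event) {
        SinkModule sink = new SinkModule();
        new UppercaseModule().processEvent(event, sink);
        return sink.received.get(0);
    }
}
```

A module that simply forwards the unchanged event would realize the pass-through case, and a module that never calls the next module would absorb the event.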
3 Design Concepts
This chapter deals primarily with module designs. First, the pipeline's ability to encrypt or decrypt multiple times is illustrated. Then, two solutions for encryption modules are introduced. Finally, three design solutions for decryption modules are explained in detail.
3.1 Nested XML Encryption and Decryption
In some cases, it may be necessary to encrypt or decrypt XML elements multiple times. For example, a document can be encrypted completely so that the XML file only consists of one <EncryptedData> element, whose content can in turn contain further encrypted elements (nested encryption).
[Figure 3.1 shows a pipeline configuration in which a parser module is followed by several decrypter or encrypter modules, so that nested encryption layers can be processed in sequence.]

Figure 3.1: Event pipeline configuration for nested encryption or decryption
The figure illustrates an example configuration for the pipeline. In this case, it is able to decrypt a completely encrypted XML file and furthermore to decrypt an embedded <EncryptedData> element that becomes accessible after the first decryption step.
3.2 Stream Encryption
[Figure 3.2 shows the internal design of the encryption module: incoming events are collected in a buffer, the buffered characters are transformed through a cascade of a CipherInputStream and a Base64InputStream, and the result is emitted as CHARACTERS events.]

Figure 3.2: Internal design of the encryption module
Design Solution 1 - Simple Buffering In general, the encryption module receives events from the pipeline as illustrated in Figure 3.2. If the module is not encrypting, its task is to compare the name of all incoming XML events with the name of the element that is meant to be encrypted. The element name that should be encrypted is specified when the encryption module is created. Once a match is found, the encryption module enters the encryption mode. That means it buffers and encrypts the element's content and creates a new series of XML events to push a complete <EncryptedData> structure, carrying the base64 encoded ciphertext in its <CipherValue> element, to the next module.
Design Solution 2 - OutputStreams In this thesis, the focus is on decryption, so this design is only explained in brief and was not implemented. Because output streams can handle fragmented input data, they can be used to encrypt character chunks. Streams are memory efficient, hence a memory efficient solution should use a cascade of output streams to transform the unencrypted data (XMLEvents) into a sequence of encrypted and base64 encoded characters. Finally, a file writer module can write the new encrypted XML file block by block. This solution avoids the main disadvantage of the Simple Buffering design solution explained above, because it does not matter how large the source XML file to be encrypted is. Moreover, the memory usage should remain nearly constant during encryption.
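The output-stream cascade can be sketched with JDK classes. This is a hedged illustration, not the thesis design: AES-CBC with PKCS5 padding stands in for the block cipher, and a ByteArrayOutputStream stands in for the file writer module:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class StreamEncryptSketch {
    // Plaintext chunks -> CipherOutputStream -> base64 encoder -> sink.
    // Each chunk is processed as it arrives, so memory usage stays flat.
    public static String encryptChunks(String[] chunks, byte[] key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(iv));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream base64 = Base64.getEncoder().wrap(sink);      // base64 stage
        try (CipherOutputStream enc = new CipherOutputStream(base64, cipher)) {
            for (String chunk : chunks) {
                enc.write(chunk.getBytes(StandardCharsets.UTF_8)); // fragmented input
            }
        }                                          // close() flushes cipher and base64 padding
        return sink.toString("US-ASCII");
    }
}
```

Because both the cipher stream and the base64 stage buffer at most one block internally, the source size does not influence the memory footprint.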
3.3 Stream Decryption
There is a difference in the meaning of the word "stream" depending on whether it is used in conjunction with decryption or with parsing. The following two paragraphs explain how the word is used in this thesis. As depicted in Chapter 2, an XML API can be streaming-based (event-based), for example SAX or StAX, because of its output. The output of a streaming XML API is a stream of XML events. The XML parser creates XML events while it is sequentially reading the XML source file or any other input stream. For this reason, subsequent software components can continue processing those events although the complete parsing process has not finished. Decryption can be processed with streams, i.e. a CipherInputStream or CipherOutputStream can be used. These streams extend the standard Java classes java.io.InputStream or java.io.OutputStream, respectively. Such a stream is a sequence of data that can be accessed sequentially by read() or write() methods. In general, streams are able to process just one single data unit of the stream (e.g. one byte, char, int, etc.) or complete blocks of data (e.g. byte[], char[], int[], etc.). The ability to handle blocks of data is very useful in the context of event pipeline concepts. Even so, there is a problem if event-based parsing and stream decryption come together:
[Figure 3.3 shows the core design problem: a parser module creates CHARACTERS events carrying char[] payloads, the decrypter module consumes them, and an internal parser behind the decrypter must read the decrypted output as if it were an ordinary input stream.]

Figure 3.3: Core design problem: interface between character events and the parser
Figure 3.3 illustrates an event pipeline that consists of a parser module and a decryption module with a subsequent temporary parser. This adjacent parser is created by the decryption module; for this reason, it is called the internal parser module. It resides virtually within an event pipeline module. Its task is to parse the decrypted output that it retrieves from the decryption module (decrypter).
3.3.1 Push-Pull Problem on Event Streams

Imagine two software components, one active and one reactive, as shown in Figure 3.4. The active component is in control of what it sends to the reactive component and when; the reactive component receives this data. In the context of the event pipeline, the active component is the previous module and the reactive component is the next module of the pipeline. A parser (or parser module) is always an active component, which means such a software component is in control of its source.
[Figure 3.4 shows three pairings: a push relation from an active to a reactive component, a pull relation between a reactive and an active component, and the problematic case of two active components pushing against each other.]

Figure 3.4: Push-pull problem on streams
The characteristics of a parser and of a stream of character events that are pushed from one module to the next module lead to a logical problem that is illustrated in Figure 3.4. Usually, the payload of a <CipherValue> element is pushed to the decryption module as a series of character events, while the internal parser that must process the decrypted data is itself an active component that pulls from its source. Two active components meet, and neither is designed to react to the other.
3.3.2 Event-Stream Decryption

The decryption process that decrypts string chunks in serial order is called event-stream decryption. Listing 3.1 shows a <CipherValue> element whose base64 encoded content is fragmented into several character chunks by the parser.

[Listing 3.1: a <CipherValue> element containing a long base64 encoded string]
Common streaming-based parsers do not create just one character event for arbitrarily long strings. In fact, they create multiple events (the Javolution parser is an exception). A decryption module that can decode and decrypt a stream of character events must be able to process this fragmented string. That means the decryption module must consider all characteristics of this string that are elucidated in Section 2.1: base64 encoding with padding, block cipher encryption with padding, and string fragmentation with unknown and varying chunk lengths. Another problem is that the string length is not known before all string chunks are read. For this reason, a subsequent decryption module cannot allocate the correct memory space for any buffering in advance.
Design Solution 1 - Simple Buffering A simple design for the decryption module uses a buffer that collects all incoming string chunks. This solution is also described in "A Stream-based Implementation of XML Encryption" by Takeshi Imamura, Andy Clark, and Hiroshi Maruyama [2]. Due to string splitting, the decryption module has to collect all character events that occur between the start tag and the end tag of the <CipherValue> element.
[Figure 3.5 shows the simple buffering design: all CHARACTERS events with their char[] payloads are collected in a single buffer, which introduces latency, before an internal parser processes the decrypted result.]

Figure 3.5: Decryption module that implements simple buffering
All incoming character events are added to a single buffer as shown in Figure 3.5. Concatenation of strings can influence the performance of the program ([8, 9]). It is important to use the most efficient implementation to create and fill a buffer that is meant to store strings. When the decryption module detects the end tag, it can be sure that the complete string is stored in the buffer. Then, the base64 encoded string is decoded into a byte array. Finally, the decryption module decrypts this binary data. This simple solution has one disadvantage: it uses much system memory. The complete character string, the base64 decoded binary data and the decrypted binary data all allocate space in memory. Despite that, the memory usage should be lower in comparison with the DOM API. Simple buffering is an easy to understand concept and its implementation is less complex than that of the other designs.
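The buffering steps above can be sketched as follows. This is a minimal illustration, not the thesis's ESBufferDecrypterModule: AES-CBC is assumed for the block cipher, and the onCharacters()/onEndTag() callbacks are hypothetical stand-ins for the module's event handling:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class BufferDecryptSketch {
    private final StringBuilder buffer = new StringBuilder();

    // Called once per incoming CHARACTERS event: collect the fragment.
    public void onCharacters(char[] chunk, int offset, int length) {
        buffer.append(chunk, offset, length);
    }

    // Called when the end tag is detected: decode and decrypt in one pass.
    public String onEndTag(byte[] key, byte[] iv) throws Exception {
        byte[] cipherText = Base64.getMimeDecoder().decode(buffer.toString());
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(iv));
        return new String(cipher.doFinal(cipherText), StandardCharsets.UTF_8);
    }

    // Demo: encrypt, replay the base64 text as two character chunks, decrypt.
    public static String roundTrip(String plain, byte[] key, byte[] iv) throws Exception {
        Cipher enc = Cipher.getInstance("AES/CBC/PKCS5Padding");
        enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(iv));
        String b64 = Base64.getEncoder()
                .encodeToString(enc.doFinal(plain.getBytes(StandardCharsets.UTF_8)));
        BufferDecryptSketch module = new BufferDecryptSketch();
        char[] chars = b64.toCharArray();
        int half = chars.length / 2;
        module.onCharacters(chars, 0, half);                   // first fragment
        module.onCharacters(chars, half, chars.length - half); // second fragment
        return module.onEndTag(key, iv);
    }
}
```

A StringBuilder avoids the quadratic cost of repeated String concatenation, which is the efficiency point made above.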
Design Solution 2 - Thread-based Another, more sophisticated design solution to circumvent the push-pull problem on event streams uses system threads. A thread is a part of a process that is executed within the operating system, whereas a system process comprises all data that is created and used while a computer program is being executed. A thread, running within a process, is executed independently of other threads, but it can access resources within the same process. Resources can be shared among threads, but only if the shared resources are synchronized. That means if one thread is writing to a shared resource, all other access is blocked to prevent conflicts. If another thread requests access to the blocked resource, it is suspended until the resource is released by the occupying thread.
[Figure 3.6 shows the thread-based design: CHARACTERS events are written through a base64 output stream writer into a PipedOutputStream; a second thread reads the connected PipedInputStream, decrypts, and parses the plaintext, which introduces some latency.]

Figure 3.6: Decryption module that implements thread-based decryption
This can solve the push-pull problem. Figure 3.6 demonstrates the complex design. Each parser that is created by a module is executed within its own thread. The orange colored components in the figure are executed in a new thread that is created every time the decryption module has to parse the character events encapsulated by a <CipherValue> element.
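The thread decoupling via piped streams can be sketched as follows. This is a simplified illustration, not the thesis code: the string chunks stand in for character-event payloads, and the consumer thread stands in for the internal parser thread:

```java
import java.io.ByteArrayOutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

public class PipeSketch {
    // The pushing side writes into a PipedOutputStream while a second
    // thread pulls from the connected PipedInputStream: both sides stay
    // active without either one buffering the complete data.
    public static String pushThroughPipe(String[] chunks) throws Exception {
        PipedOutputStream push = new PipedOutputStream();
        PipedInputStream pull = new PipedInputStream(push);
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        Thread consumer = new Thread(() -> {  // stands in for the internal parser
            try {
                int b;
                while ((b = pull.read()) != -1) {
                    collected.write(b);
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        consumer.start();
        for (String chunk : chunks) {         // one write per character event
            push.write(chunk.getBytes(StandardCharsets.UTF_8));
        }
        push.close();                         // signals end of stream to the consumer
        consumer.join();
        return collected.toString("UTF-8");
    }
}
```

If the pipe's internal buffer fills up, the writing thread blocks until the consumer has read some bytes, which is exactly the synchronization behavior described in the paragraph above.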
3.3.3 Byte-Stream Decryption This decryption design is special because it avoids the concept boundaries induced by the event pipeline pattern: the pipeline is bypassed by a direct InputStream connection to the StAX parser module to solve the push-pull problem on event streams. Strictly speaking, this is a violation of the event pipeline concept. An advantage is that only as many bytes as required have to be read from the source. This could save network bandwidth if the source file is located on a distant storage and the accessing service does not require all XML data that is stored in the file. A prototype based on this design should allocate little system memory because it does not use extensive buffering or create any in-memory XML tree structure. The complete processing chain uses streams (Input- and OutputStreams) and no buffering except the byte chunks that the streams use internally. Streams are used for all transcoding and decryption to implement a continuous streaming functionality.
Design Solution 3 - StAXCharactersInputStream The character input stream approach circumvents the event pipeline concept. A hierarchy of input streams is used instead of the character events that are provided by the previous pipeline module. Figure 3.7 shows where the class StAXCharactersInputStream is located in the concept design; in the following, its mode of operation is described.
[Figure: the StAX parser module hands a CharactersInputStream (CIS) to a chain of Base64InputStream and CipherInputStream objects; the decrypter module's StAX parser reads from this chain and pushes events to the next module until close() is called.]
Figure 3.7: Illustration of the CharactersInputStream component
This special stream is created by the StAXParserModule whenever a CipherValue element has to be decrypted.
4 Implementation
A prototype suite was implemented in Java to test the design concepts for stream encryption and decryption that are explained in Chapter 3. In this chapter, the prototype's source code is introduced in some detail. First, an overview of the source code structure is given. Then, a detailed explanation follows about the event pipeline's module manager as well as the parser modules, the encryption module, and the stream decryption modules.
Figure 4.1: Prototype suite package overview
Figure 4.1 shows a call graph of the test suite that was generated by the useful Eclipse plugin ispace (http://ispace.stribor.de/). The test suite consists of three main parts: the event
pipeline implementation, a conventional DOM API implementation, and the XMLStreamProcessing main class. This class implements the testing sequences that produce the values illustrated in the next chapter. There are two packages on the left side of the figure that represent the core functionality of the event pipeline: one is called pipeline and the other one modules. The next section is about the event pipeline source code that is contained in the pipeline package.
4.1 Event Pipeline
The XMLEventPipeline is an important class in the pipeline package. The next call graph (Figure 4.2) shows the inner body of that package and the methods of this class.
Figure 4.2: Inner body of the pipeline package
Because a pipeline can consist of many modules and its configuration is complex, a module management class is useful. The XMLEventPipeline class implements a simple module manager. There are methods to insert, add, or get modules. Furthermore, the module manager class is able to print the current pipeline configuration. Another task of the manager is to assure that the first module is a parser module. A process method initiates the pipeline, i.e. the process() method of the first module of the pipeline is executed. As mentioned above, this is always a parser module, and hence the parsing process is started.
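A minimal sketch of such a module manager might look like this; the interface and method names are illustrative and not the prototype's real API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative module interface: only parser modules drive the pipeline.
interface PipelineModule {
    void process();
    default boolean isParser() { return false; }
}

// Sketch of the module-manager idea behind XMLEventPipeline: modules are kept
// in order, the first one must be a parser module, and process() starts it.
public class EventPipeline {
    private final List<PipelineModule> modules = new ArrayList<>();

    public void add(PipelineModule m) { modules.add(m); }
    public void insert(int index, PipelineModule m) { modules.add(index, m); }
    public PipelineModule get(int index) { return modules.get(index); }
    public int size() { return modules.size(); }

    // initiate the pipeline: the first module must be a parser module
    public void process() {
        if (modules.isEmpty() || !modules.get(0).isParser())
            throw new IllegalStateException("first module must be a parser module");
        modules.get(0).process();
    }

    // small demonstration: a parser module sets a flag when processed
    public static boolean demo() {
        EventPipeline p = new EventPipeline();
        final boolean[] started = {false};
        p.add(new PipelineModule() {
            public void process() { started[0] = true; }
            public boolean isParser() { return true; }
        });
        p.process();
        return started[0];
    }
}
```

In the real prototype each module would push its output events to the next module in the list; the sketch only shows the management and start-up logic.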
Figure 4.3: Strongly simplified call graph of the pipeline implementation
Dependencies among the modules are illustrated in the call graph shown in Figure 4.3. The abstract class AbstractEventHandler is in the center of the graph and surrounded by its inheriting classes that all represent pipeline modules. On the left side, there are SAX and StAX parser modules. The SAX2 module consists of two classes (SAX2ParserModule and SAX2EventHandler) that are shown in the group that is surrounded by a rectangle. This module can be configured so that the Xerces, Crimson or Sun parser is used. There are three decryption modules:
• ESThreadDecrypterModule,
• ESBufferDecrypterModule, and
• StAXISDecrypterModule.
The first module implements the decryption module using threading; it is based on the design proposed in Section 3.3.2. The second module implements the decryption module that uses simple buffering. Finally, the third module implements the special design introduced in Section 3.3.3.
STAXISDecrypterModule The third module is the STAXISDecrypterModule and it uses the StAXCharactersInputStream to access the character string in between the CipherValue tags; this class extends java.io.InputStream. As you would expect, StAXCharactersInputStream returns -1 if the StAX parser has reached the end of the character events, i.e. points to an XML end tag (usually the closing CipherValue tag). Such an InputStream is not able to provide a correct return value for the available() method, because it is not known in advance how many character chunks can be pulled from the parser. Every InputStream is in control of its source, hence this solution works with pull parsers only. SAX, however, is a push parser and therefore it is impossible to implement a SAXCharactersInputStream class: such an implementation would have to pull the next character events from a push parser (the push-pull problem).
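The pull-based stream can be sketched as follows. The class name follows the text above, but the implementation details are assumptions; it returns -1 as soon as the surrounding end tag is reached:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Sketch of the StAXCharactersInputStream idea: an InputStream that pulls
// CHARACTERS events from an XMLStreamReader on demand and signals end-of-
// stream once the surrounding end tag is reached.
public class CharactersInputStream extends InputStream {
    private final XMLStreamReader reader;
    private byte[] current = new byte[0];
    private int pos = 0;
    private boolean done = false;

    public CharactersInputStream(XMLStreamReader reader) { this.reader = reader; }

    @Override
    public int read() throws IOException {
        while (pos >= current.length) {
            if (done) return -1;
            try {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS) {
                    current = reader.getText().getBytes();
                    pos = 0;
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    done = true; // e.g. the closing CipherValue tag
                    return -1;
                }
            } catch (XMLStreamException e) {
                throw new IOException(e);
            }
        }
        return current[pos++] & 0xff;
    }

    // demonstration helper: read all character content of the first element
    public static String readAll(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            r.next(); // advance to the start element
            CharactersInputStream cis = new CharactersInputStream(r);
            StringBuilder sb = new StringBuilder();
            int b;
            while ((b = cis.read()) != -1) sb.append((char) b);
            return sb.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Because each read() call drives the parser forward, this only works with a pull API such as StAX, exactly as argued above.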
4.2 Encryption Module
In this thesis, the focus is on decryption, and for this reason, only design solution one for the encryption module was implemented in the prototype (see Section 3.2). The next call graph (Figure 4.4) illustrates the internal body of the encryption module.
Figure 4.4: Call graph of the encryption module
The module goes through a certain sequence while encrypting. The figure shows some methods, for example EVENT_START_ELEMENT or EVENT_END_ELEMENT, placed in the order in which they are called to illustrate the sequence of the process. The sequence is explained in the following.
1. EVENT_START_ELEMENT - analyze incoming events and search for the start tag of the XML node that should be encrypted; if it is found do the following
2. initEncryptedDataElement - push the opening XML elements between EncryptedData and CipherValue to the next module
3. initWriterModule - serialize the XML node to be encrypted into a byte array buffer, i.e. all incoming XML events are redirected and serialized into a byte array buffer
4. EVENT_END_ELEMENT - analyze incoming events and search for the end tag of the XML node that should be encrypted; if it is found do the following
5. generateCipherValue - finalize encryption and encoding process: call CharactersEncrypter to encipher and encode the byte array buffer
6. pushCipherValue - create the CipherValue element containing the encoded cipher data and push it to the next module
7. closeEncryptedDataElement - push the closing XML elements between CipherValue and EncryptedData to the next module
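The serialization and enciphering steps of this sequence can be condensed into a short sketch; the class name is hypothetical and plain AES/ECB with the JDK base64 encoder stands in for the actual XML Encryption processing:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Sketch of the encryption sequence: the events of the target element are
// serialized into a byte buffer; on the end tag the buffer is enciphered and
// base64-encoded into the cipher-value text. Names are illustrative.
public class EncryptionSequence {
    public static String encryptElement(String serializedElement, byte[] rawKey) {
        try {
            // steps 2-3: redirect/serialize the XML node into a byte buffer
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            buf.write(serializedElement.getBytes(StandardCharsets.UTF_8));
            // steps 5-6: encipher the buffer and base64-encode the result
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(rawKey, "AES"));
            return Base64.getEncoder().encodeToString(cipher.doFinal(buf.toByteArray()));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The returned string corresponds to the text content that pushCipherValue would emit between the opening and closing cipher-value tags.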
4.3 Decryption Modules
In this section, the source code of the decryption modules is explained. As described in Section 3.3, there are three different design concepts contained in the decrypt package. However, the StAXISDecrypterModule implementation is not explained in detail because it violates the event pipeline concept. The internal body of the decrypt package and the internal class dependencies are shown in the next figure (4.5). In the following paragraphs, the source code of the other decryption modules is explained in detail.
Figure 4.5: Internal dependencies of the decryption package
ESBufferDecrypterModule This module implements the Simple Buffering concept (Figure 4.6).
Figure 4.6: Internal dependencies of the ESBufferDecrypterModule
Typically, when the closing CipherValue tag is detected, the following sequence is executed:
1. doFinal() function of the CharacterEventsDecrypter object is called
2. Base64 decoding of buffered data
3. decryption
4. creation of an internal parser
5. parsing the decrypted XML and push events to the next module of the pipeline
ESThreadDecrypterModule This module implements the second design solution that uses a threading approach to decrypt the event stream (Section 3.3.2, second paragraph). The next call graph shows the internal body (Figure 4.7).
Figure 4.7: Internal dependencies of the ESThreadDecrypterModule
As already described in the paragraph above, the character events encapsulated by the CipherValue element are decoded and decrypted through a chain of output streams into a piped stream, and the decrypted XML is parsed by a parser running in a separate thread.
4.4 Modification of the Javolution Parser Source Code
In this section, the highly efficient Javolution parser is introduced [27]. During testing of the prototype, it was found that the common streaming parser implementations are not as fast and efficient as expected (see the next chapter for results): the memory usage rises over time. This is not the expected behavior of a streaming parser and seems to be a common problem (see https://issues.apache.org/jira for Xerces OutOfMemory bugs, e.g. Bug ID 6536111). In theory, such a parser should allocate a small and quite constant heap space. None of the prototype's parser modules showed the expected behavior in all test cases. Therefore, a separate test suite to test different parser solutions was implemented. The original implementation of the highly efficient Javolution parser does not split long character strings between two tags into chunks but reads them into its internal buffer at once; the parser was therefore modified to emit long character content in smaller chunks.
5 Performance Analysis
The prototype consists of multiple components that work together, and it is important to examine these components separately: the parser itself, the base64 decoding, the parser modules, and the decryption modules, for example. Therefore, this chapter is divided into sections that treat those components.
5.1 Experimental Setup
The following two tables list the hardware and software that were used for testing.

Test Hardware
• Processor: Intel Core 2 CPU T5500 @ 1.66 GHz
• Chipset: Intel i945GM Rev. 3
• Memory: MDT DDR2-800 2x2GB
• Hard Drive: Corsair SSD CMFSSD-128GBG2D

Test Software
• Operating System: Microsoft Windows 7 Professional x64
• Java VM: Java Software Development Kit version 1.6.0_21
5.1.1 Measuring Execution Times It is difficult to measure the performance of Java programs because of several factors of influence [35]. Moreover, the influence of these factors can vary, which can lead to measurement results that are difficult to reproduce. The Java Virtual Machine is not in an optimal state when it is instantiated: usually, programs are executed slowly at this time. This is the result of the dynamic mechanisms the JVM Just-In-Time compiler (JIT) uses to improve performance: it switches dynamically between already compiled code and interpreted Java bytecode [36]. Therefore, all performance tests are executed once before the actual measurements take place so that the JIT code generator has time to convert the Java bytecode into native machine code. There may be side effects due to caching and CPUs with many cores, especially if multiple threads are used by the program. For example, if the same encrypted XML file is decrypted multiple times, it is likely that the second run is faster than the decryption of a new XML file with other cipher data. For each run a new XML file is created beforehand to reduce the influence of hardware caching. These are the general conditions for the performance tests: • JVM Java bytecode compilation phase at the beginning, i.e. measurements in this warm-up round are not counted into the results
• Separated runs, i.e. for each run a new XML file with random character data is created
• Median of multiple runs (15 times) if possible, i.e. times are measured in milliseconds for each run and stored in a list, then the median value is computed and returned.
• One CPU core is used, i.e. all cores but one are disabled by the operating system (the boot options can be configured with msconfig in Windows 7TM)
• Influence of anti-virus or other background software is minimized.
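The warm-up and median procedure described by these conditions can be sketched as follows (the helper class is hypothetical):

```java
import java.util.Arrays;

// Sketch of the measurement procedure listed above: one unmeasured warm-up
// run for the JIT compiler, then several measured runs whose median is taken.
public class MedianTimer {
    public static long median(long[] values) {
        long[] copy = values.clone();
        Arrays.sort(copy);
        return copy[copy.length / 2]; // middle value for odd run counts (e.g. 15)
    }

    public static long medianMillis(Runnable task, int runs) {
        task.run(); // warm-up round: not counted into the results
        long[] times = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.currentTimeMillis();
            task.run();
            times[i] = System.currentTimeMillis() - start;
        }
        return median(times);
    }
}
```

Using the median instead of the mean reduces the influence of single runs that were disturbed by background activity.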
There are multiple ways to measure the execution time of Java programs ([23], Chapter 3). The prototype was programmed with Eclipse, so it is natural to use Eclipse plugins to examine the source code. This is a selection of Java performance analysis tools and plugins that were used during the development and implementation of the prototype:
• Eclipse Test & Performance Tools Platform (TPTP) http://www.eclipse.org/tptp/
• JProbe http://www.quest.com/jprobe/
• PerfAnal: A Performance Analysis Tool http://java.sun.com/developer/technicalArticles/Programming/perfanal/
• System time difference measurement (Runtime class)
Solution - System Method Finally, the System.currentTimeMillis() method was used to get the execution times illustrated in the following figures. Listing 5.1 shows the code that was inserted in the test suite to measure execution times.
//set start timestamp
long startTime = System.currentTimeMillis();

//some source code

//calculate span
long span = System.currentTimeMillis() - startTime;

Listing 5.1: Source code to measure execution times
5.1.2 Measuring Memory Usage To illustrate the advantages of event-based streaming and of the decryption designs proposed in this thesis, it is necessary to measure the memory allocated by the program during the parsing process. The analyzing program has to show the heap size over time. Furthermore, it must be possible to set the times of measurement precisely. In general, it is not easy to measure the memory usage of a Java program. A common tool is the Eclipse plugin TPTP, which can measure the allocated size and the number of calls of all classes that are instantiated during execution. Nevertheless, it turned out that its results were not meaningful enough. The Eclipse Memory Analyzer Tool (MAT) http://www.eclipse.org/mat/ was used to detect memory leaks. Eclipse can be configured so that it generates memory dumps when an
OutOfMemoryException occurs (set the VM argument -XX:+HeapDumpOnOutOfMemoryError). The dump is stored in an hprof file that can then be opened in MAT. Another tool is jConsole (jconsole.exe in the JAVA_HOME/bin directory). It can be connected to any Java process and depicts heap memory usage, CPU usage, and the number of threads and classes. Although jConsole shows the memory size over time, its update interval is at least one second and the absolute values cannot be processed further.
Solution - Runtime Method Despite all that, there is a solution that finally fulfilled the requirements. The Runtime class can be used to poll the number of bytes that are used by the running Java process (Chapter 5 in [23]). The next listing (5.2) shows the source code of the method that was used to calculate the current heap usage.
public static String getHeapSize() {
    //get runtime object
    Runtime rt = Runtime.getRuntime();
    //ask JVM to execute garbage collection
    rt.gc();
    //calculate allocated heap size
    long heapSize = rt.totalMemory() - rt.freeMemory();
    return Long.toString(heapSize / 1000);
}

Listing 5.2: Method to poll the actual heap usage
Whenever getHeapSize() is called, the JVM is asked to execute the garbage collection to release any objects that are no longer in use. After that, the heap size is calculated by subtracting the current free memory from the total memory that is allotted to the process. This method is placed in the source code wherever significant results are expected. It is controversial whether the gc() method should be called this way or not. However, the heap usage test results are much more significant when the garbage collection is invoked. To be sure, the source code that is embedded in the prototype to measure the heap usage is not executed when execution time is measured. A special HeapAnalyser class was developed to monitor the bytes that the prototype reads sequentially from the source. It invokes the getHeapSize() method whenever the next 200KB have been read. The HeapAnalyser class extends the FilterInputStream class.
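A sketch of this monitoring stream, with assumed names and the 200KB granularity from the text (a complete version would also override the single-byte read()):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Sketch of the HeapAnalyser idea: a FilterInputStream that counts the bytes
// flowing through it and takes a heap measurement every 200KB.
public class HeapAnalyser extends FilterInputStream {
    private static final int INTERVAL = 200_000;
    private long count = 0;
    private long nextMark = INTERVAL;
    final List<Long> samples = new ArrayList<>();

    public HeapAnalyser(InputStream in) { super(in); }

    private static long getHeapSize() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / 1000; // KB, as in Listing 5.2
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        if (n > 0) {
            count += n;
            while (count >= nextMark) { // every next 200KB read
                samples.add(getHeapSize());
                nextMark += INTERVAL;
            }
        }
        return n;
    }

    // demonstration helper: how many samples are taken for a given input size?
    public static int sampleCount(int totalBytes) {
        try (HeapAnalyser ha = new HeapAnalyser(new ByteArrayInputStream(new byte[totalBytes]))) {
            byte[] buf = new byte[8192];
            while (ha.read(buf, 0, buf.length) != -1) { /* drain */ }
            return ha.samples.size();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Wrapping the source stream this way ties the measurement points to the parsing progress instead of to wall-clock time.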
5.2 Excursion: Parser Analysis
This section deals with the memory usage of some SAX and StAX parser implementations. These parsers were analyzed, and the results are presented and explained in detail:
• StAX: Sun’s StAX implementation (JRE 1.6.0_21, XMLStreamReader)
• Javolution: high efficiency parser and the modified variant ([27], javolution.xml.sax.XMLReaderImpl)
• SAX2 Crimson (version 1.1.3)
• SAX2 Xerces (version 2.10.0)
Figure 5.1 and Figure 5.3 illustrate the heap usage over time during the parsing process. Two test cases were analyzed: the parsing of an unencrypted XML file with 200.000 elements and a completely encrypted XML document consisting of a long character string. The unencrypted file was encrypted by the prototype implementation using the encryption module. A completely encrypted XML file consists of one EncryptedData element whose CipherValue contains this long base64 encoded string.
[Chart: heap size in KB over kilobytes read while parsing an unencrypted file (200k elements, 5.8MB file size); left: same tag name, right: different tag names; curves: SAX2XERCES, SAX2JAVOLUTION, SAX2CRIMSON, StAXSUN, SAX2JAVOLUTIONMOD]
Figure 5.1: Parser: heap size over time during parsing of 200.000 XML elements that have either the same tag name or all different tag names
Parser Test: Unencrypted Files The figure above has two charts. On the left, the test file that was used contains 200.000 XML elements that all have the same tag name (file type A). On the right, the chart shows the measured heap size values when all 200.000 XML elements have different tag names (file type B). The x-axis shows time in both charts, i.e. the HeapAnalyser class executes the getHeapSize() method whenever the next 200KB of source bytes have been read. The allocated heap size in KB is shown on the y-axis. While parsing an XML file of type B the heap usage rises as shown in the right chart of Figure 5.1. The widespread parser Xerces, Sun's StAX parser, and Crimson show this behavior. In this test case, the original Javolution parser and especially the modified variant show their advantage over those conventional parsers. Based on the right chart, one may conclude that there is a memory leak in the tested parsers. The Eclipse Memory Analyzer Tool can discover such a memory leak and therefore it was applied to the Xerces parser.
Figure 5.2: Xerces Parser: MAT analysis results
Figure 5.2 shows the MAT analysis results. Most memory is accumulated in one instance of the org.apache.xerces.util.SymbolTable class. This class implements the string interning feature of Xerces (URI: http://xml.org/sax/features/string-interning). "All element names, prefixes, attribute names, namespace URIs, and local names are internalized using the java.lang.String#intern(String):String method" [16]. This feature can only be set to true. On the left of Figure 5.1, the results are illustrated when file type A is parsed. All curves are quite constant over the complete parsing process. Sun's StAX implementation allocates 20% more heap space than the other parsers, which all require about 400KB of system memory. In conclusion, the memory usage of conventional XML parsers (except the Javolution parser) is proportional to the number of different element names, prefixes, attribute names, etc. in the file. In contrast, the Javolution parser is very memory efficient, especially if the XML file contains many different element names.
Parser Test: Encrypted Files Figure 5.3 shows the test results for the case that long character strings must be read. Heap space is increasingly allocated by the original Javolution parser (SAX2JAVOLUTION) during the parsing progress. This is because the Javolution SAX parser in version 5.5.1 reads character sequences at once into its internal buffer array; it does not split the input stream into character chunks internally. For this reason, only the modified Javolution SAX2 parser is analyzed further, because it shows no rise in heap size during the parsing process in either test case. The other, more conventional parsers like Xerces or Sun's StAX parser behave as expected: long character strings are split into small character chunks that can be deallocated during parsing. This results in low memory usage over time as shown in Figure 5.3.
[Chart: heap size in KB over kilobytes read while parsing a completely encrypted file (8MB character string); curves: SAX2XERCES, SAX2JAVOLUTION, SAX2CRIMSON, StAXSUN, SAX2JAVOLUTIONMOD]
Figure 5.3: Parser: heap size over time during parsing a completely encrypted XML document
5.3 Excursion: Base64 Decoding Performance
There are different types of base64 decoding implementations, and it is not obvious which one to choose for the decryption modules. Therefore, a rough performance analysis was done to ascertain the performance of different base64 decoding implementations. Both Apache (org.apache.commons.codec.binary.Base64) and iharder (iharder.sourceforge.net) provide their own library.
[Bar chart: base64 decoding times in milliseconds for the Apache and iharder implementations]
Figure 5.4: Base64 performance analysis
The base64 encoded source was about 2MB in size, and all tests were run 40 times. The times in Figure 5.4 are the median values of the forty runs each. On the y-axis the execution times in milliseconds are depicted, and on the x-axis the test cases InputStreams, OutputStreams, and Block Decoding are compared. On the one hand, a 2MB byte array is passed to the decoding method and decoded at once. This is called Block Decoding in the vertical bar graph. These are the methods:
• Apache: org.apache.commons.codec.binary.Base64.decodeBase64(byte[] b)
• iharder: Base64.decode(byte[] b)
On the other hand, a ByteArrayInputStream(byte[] b) object (or the output stream counterpart) is created and the pointer to the 2MB in-memory byte array is passed. Then, the byte array stream is wrapped by the base64 input stream (or output stream) that is analyzed. The measurements reveal that the Apache implementation is the fastest in each test case. Large differences appear if the distinct test cases are compared. As is evident in the graph,
• InputStreams are not fast Apache: org.apache.commons.codec.binary.Base64InputStream, iharder: Base64.InputStream
• OutputStreams are much faster Apache: org.apache.commons.codec.binary.Base64OutputStream, iharder: Base64.OutputStream
• and Block Decoding is the fastest Apache: org.apache.commons.codec.binary.Base64.decodeBase64, iharder: Base64.decode
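For illustration, the block and stream decoding variants can be reproduced with the JDK's java.util.Base64 (the measurements above used the Apache Commons Codec and iharder libraries instead):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Base64;

// Re-creation of the two decoding styles compared above, using the JDK's
// java.util.Base64: one-shot block decoding versus decoding through a
// wrapping InputStream.
public class Base64Comparison {
    // "Block Decoding": the whole byte array is decoded at once
    public static byte[] decodeBlock(byte[] encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    // "InputStreams": the encoded bytes are pulled through a decoding stream
    public static byte[] decodeViaStream(byte[] encoded) {
        try (InputStream in = Base64.getDecoder()
                .wrap(new ByteArrayInputStream(encoded))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Both variants produce identical output; only the chunk-wise stream processing adds per-read overhead, which is what the measurements above quantify for the Apache and iharder libraries.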
Therefore, InputStreams were avoided as often as possible in the design concepts (Chapter 3). In case of Simple Buffering as a decryption module concept, Block Decoding is used. The event-stream decryption that uses a new thread for the parser uses the OutputStream class. Only the byte-stream decryption concept uses the InputStream class, because its design demands it.
5.4 Parsing Analysis
This section has two subsections: one is about memory usage and the other deals with execution times of the pipeline when an XML file is parsed. In the following, a simple pipeline configuration is considered. It consists of a parser module and an output module to simulate a simple pipeline without a decryption module. For the DOM API tests, the Apache example source code written by Vishal Mahajan (from the Apache SVN: org/apache/xml/security/samples/encryption/Decrypter.java) was modified slightly and used.
5.4.1 Memory Usage Figure 5.5 illustrates the heap size over time during the parsing process of 200.000 XML elements. Each element name is selected at random out of 150 different names. Test file size was about 6MB. The source file was not encrypted and hence the decryption modules (pipeline implementation) and the decryption method (DOM API) were disabled.
[Chart: heap size in KB over kilobytes read while parsing 200.000 elements (150 different names); curves: StAXSUN_Module, SAX2XERCES_Module, SAX2JAVOLUTIONMOD_Module, DOMAPI; DOM-only measurement points: Doc parsed, DecClass, Search done]
Figure 5.5: Parser modules: heap size over time during parsing of 200.000 XML elements (150 different names)
On the x-axis the progress of the parsing process is depicted, i.e. the labels on the x-axis are the times of measurement during the parsing process. The process starts on the left. Then, the program instantiates the pipeline and module objects if StAX or SAX is tested. If the DOM API implementation is tested, the times of measurement are right after the creation of the DocumentBuilderFactory and DocumentBuilder classes (DocFab and DocBuilder in the figure). Next, the heap size is periodically measured while the parser reads the source file. Finally, and only in the DOM API tests, the last three measurements are taken. The DocParsed heap size value is polled just after the DOM API parser (the SAX2 parser) has finished reading the file. DecClass is measured right before the decryption methods are invoked. The last value is polled after all XML elements have been processed by the DOM search method (getElementsByTagName). There are four curves in the chart. The memory usage of all streaming APIs is very similar, low, and constant over time. This is because the number of different element names is small. However, if all XML element names were different, the heap usage would rise over time for the parser modules that use Xerces or Sun's StAX parser. Only the Javolution parser module does not allocate more memory if all element names are different (see Figure 5.1). As expected from theory, the DOM API requires more heap space to store the document object model: the DOMAPI curve has a positive slope while parsing. This additional memory usage should be caused by the in-memory XML object generation. The Apache DOM API implementation uses the standard SAX2 parser, which is Xerces. The additional memory that is allocated by the pipeline implementation is very low in comparison to the pure parser test results in Figure 5.1.
5.4.2 Execution Time In this subsection, the performance of the parser modules is analyzed and compared to ascertain the efficiency and quality of the prototype's source code. This test does not measure the parser performance directly, as extensively done in related work [24, 25, 26]. Instead, the execution time of the parser working in a simple configuration of the event pipeline is measured, which is more realistic and meaningful in this context.
[Two line charts: execution time in milliseconds over the number of XML elements in the test file (left: 1k-10k, right: 10k-100k); curves: SAX2JAVOLUTIONMOD_Module, SAX2XERCES_Module, DOMAPI, StAXSUN_Module]
Figure 5.6: Parser modules: performance analysis
Figure 5.6 shows four curves in two line graphs: one for StAX, two for SAX, and one for the DOM API. In both charts the x-axis represents the number of XML elements in the XML test file. On the y-axis, the median execution time in milliseconds is illustrated. The left chart shows the execution times for XML files with 1.000 to 10.000 elements and the right one for 10.000 to 100.000 elements. The DOM API takes the most time to parse a file. In contrast, the SAX2XERCES_Module curve is predominantly below all other curves. Sun's StAX parser and the modified Javolution parser modules have similar execution times over the entire measuring range; these modules are just a bit slower than the fast Xerces parser module in this test case.
5.5 Decryption Analysis
To perform the following analysis, the pipeline configuration is extended by one more module: it now consists of a parser module, a decryption module, and the console output module with disabled output.
5.5.1 Memory Usage In this subsection, the influence of decryption to the memory usage is explained and illustrated.
Heap Usage Over Time In Chapter 3, several decryption designs were proposed; the next figure (5.7) illustrates the heap usage that they require.
[Chart: heap size in KB over kilobytes read while parsing and decrypting 200k elements (150 different names); measurement points include Start, End body, Process done/Doc parsed, and for DOM only Decrypt1, Decrypt2, Search done; curves: DOMAPI, StAXSUN_ESbuf, SAX2XERCES_ESbuf, SAX2JAVOLUTIONMOD_ESbuf, StAXSUN_ESThread, SAX2XERCES_ESThread, SAX2JAVOLUTIONMOD_ESThread]
Figure 5.7: Decryption modules: heap size over time during parsing and decryption of 200.000 XML elements
Figure 5.7 shows that the memory usage of the DOM API is always above all event pipeline solutions during the parsing process. After the source is parsed and the DOM is created, the decryption is done (Decrypt1 and Decrypt2 in the figure). When the first decryption is done (Decrypt1), about 50MB of additional heap space is allocated. This large increase is a clear disadvantage of the DOM API. There are three other curves in the chart that show the characteristics of the heap usage if simple buffering is used in the decryption module: StAXSUN_ESbuf, SAX2XERCES_ESbuf, and SAX2JAVOLUTIONMOD_ESbuf. These curves are superimposed and have steps. As mentioned above, this should be caused by the buffer array expansion. Once all encrypted bytes are stored in the buffer, the base64 decoding and decryption process is executed. For this reason, new byte arrays are created and filled with data, which should be the cause of the peak value at End body on the right side. The curves StAXSUN_ESThread, SAX2XERCES_ESThread, and SAX2JAVOLUTIONMOD_ESThread illustrate the heap usage if the pipeline uses thread-based decryption modules. These curves have a similar memory usage footprint: it is low and constant. However, if most of the XML
elements in the parsed XML file have different names, all streaming parsers except the modified Javolution parser cause increasing heap usage during the parsing progress of the pipeline (see Figure 5.1). Because the modified Javolution parser is very memory efficient while parsing an encrypted or unencrypted XML source, the SAX2JAVOLUTIONMOD_ESThread curve should never show any slope. As a result, the event pipeline design that uses thread-based decryption modules and the modified Javolution SAX2 parser is the most memory efficient implementation (SAX2JAVOLUTIONMOD_ESThread). It is able to release processed events and objects completely. In this chart (Figure 5.7) all streaming APIs allocate little memory because the number of different XML element names is small. Xerces, Sun's StAX, and Crimson cannot release all parsed objects and are therefore not suitable for XML files with many different XML elements. An advantage over the DOM API is that there are no memory usage peaks if the event pipeline design with thread-based decryption is used. Furthermore, the latency should be very low in contrast to the DOM API because data is already base64 decoded and decrypted while the parser is reading the source. Only if the parser module and the decryption module allocate heap economically is the pipeline concept superior to the conventional DOM API.
Heap Usage Over Elements The second paragraph in this subsection is about the scalability of proposed decryption designs. In the following line graph, the heap usage of different decryption solutions over 10.000 to 100.000 XML elements is compared. The time of measurement for each value is at the end of the decryption process, i.e. when the last element (