
Published in the proceedings of the IEEE International Conference on Web Services (ICWS) 2006, pages 249–256.

Exploring Remote Object Coherence in XML Web Services

Robert van Engelen¹∗  Madhusudhan Govindaraju²†  Wei Zhang¹

¹ Department of Computer Science and School of Computational Science, Florida State University
² Department of Computer Science, State University of New York (SUNY) at Binghamton

∗ Supported in part by NSF grant BDI-0446224 and DOE Early Career Principal Investigator grant DEFG02-02ER25543.
† Supported in part by NSF grants IIS-0414981 and CNS-0454298.

Abstract

Object-level coherence in distributed applications and systems has been studied extensively. Object coherence in platform-specific and tightly-coupled systems is achieved with binary serialization protocols to ensure data structures and object graphs are safely transmitted, manipulated, and stored. On the opposite side of the spectrum are platform-neutral Web services that embrace XML as a serialization protocol for building loosely coupled systems. The advantages of XML to connect heterogeneous systems are plenty, but rendering programming-language specific data structures and object graphs in text form incurs a performance hit and presents challenges for systems that require object coherence. Achieving the latter goal poses difficulties by a phenomenon that is sometimes referred to as the “impedance mismatch” between programming language data types and XML schema types. This paper examines the problem, debunks the O/X-mismatch controversy, and presents a mix of static/dynamic algorithms for accurate XML serialization. Experimental results show that the implementation in C/C++ is efficient and competitive to binary protocols. Application of the approach to other programming languages, such as Java, is also discussed.

1. Introduction

XML Web services technologies have proven to be excellent vehicles for bridging the platform and programming language gap in heterogeneous distributed systems. The expressiveness and simplicity of XML paired with SOAP Web services interoperability characteristics are highly appealing compared to binary protocols. However, object coherence in platform-specific and tightly-coupled systems has been achieved for years with binary serialization protocols. By contrast, Web services are loosely coupled and are not specifically designed to meet these strong requirements.

Yet, SOAP RPC [20] provides an object and data serialization format that clearly suggests a tight coupling between programming language types and XML constructs such as primitive types, arrays, structs, and multi-references. The SOAP RPC standard appears to be particularly well-suited for RPC-based messaging by mapping application data onto SOAP RPC-encoded XML. With the move to document/literal Web services and XML schema as a popular serialization meta-format, accurate programming language type mappings have become even more important, yet more difficult to achieve because there is no standard mapping between XML schema and program language types.

Several authors [8, 10] refer to the mapping problem as the Object/XML (O/X) impedance mismatch, a term that bears some resemblance with the heavily-studied Object/Relational (O/R) mapping problem, thereby relating the SOAP/XML interoperability issues to the imperfection of the O/X mapping. However, a major part of the O/X-mismatch can be attributed to the limitations of the widely-used JAX-RPC implementation of SOAP in Java, see Loughran et al. [8]. They also argue that the O/X-mismatch problem is compounded by the misconception among many developers that JAX-RPC is SOAP-RPC.

This paper debunks the O/X-mismatch controversy, examines the coherence problem, and presents a mix of static/dynamic algorithms for accurate XML serialization within the limits of XML schema constraints. In Section 2 we argue that trying to compare XML schema to programming language type systems is counter productive. Section 3 states the requirements for object-level coherence and presents a brief overview of systems that achieve coherence using binary protocols. We discuss the design and solution space for object coherence in XML services using specific examples from Apache Axis (Java) [1] and gSOAP (C/C++) [17]. The mapping problem of programming language types to XML schema is discussed in Section 4 and specifically applied to C/C++ and Java. Our XML serialization approach for object-level coherence is introduced in Section 5 with a presentation of the algorithms. Section 6 gives performance results to verify the efficiency of the implementation on various platforms. Finally, Section 7 summarizes our findings.

2. The O/X Impedance Mismatch Revisited

Any context-free language has an infinite number of grammars to describe the syntactically valid strings. Likewise, XML content can be described and constrained in many different ways [12], e.g. using XML schema, Relax NG, DTD, and LL(1) grammars [9] to name just a few. In fact, many different XML schemas can be constructed to describe the structure of a set of XML documents. In other words, we believe it is counter productive to compare a specific schema to a programming language type system.

The primary purpose of XML schema is to provide a meta language for (manually) defining valid XML content, where certain schema components are clearly intended to make this process simpler with convenient constructs to relieve the schema author from heavy text editing and by providing modular components for elements, types, and groupings of these. Any XML schema with element references, groups, and substitutions can be translated into an equivalent schema (or schema-like language) without these, while maintaining the validity constraints. Once we obtain a de-sugared schema by translating the unnecessary extras to simpler constructs, the mapping is much clearer. Therefore, we reject the notion that the O/X mapping is intractable. However, we agree that the mapping is a challenge for each unique programming language, especially for serializing object graphs in SOAP/XML.

We summarize and comment on the arguments commonly stated in the context of the alleged O/X-mismatch:

• The inability to support derivation by restriction, such as restricted value ranges and patterns. However, restricted types basically stand on their own, so the derivation hierarchy is not relevant at runtime when serializing objects. Indeed, support for restriction is not fundamentally more difficult than defining restricted types in languages that support subtyping, e.g. subrange types in Pascal. Note that languages that do not support subtyping require other mechanisms, such as the program annotations with gSOAP [17] for C/C++.
• Not being able to map XML names to identifiers. Punctuation and Unicode characters in XML names can be encoded with simple conventions such as with hex _xABCD_ codes in identifier names, as suggested in the SOAP specifications. Many modern programming languages also support Unicode identifier names.
• No mechanism for implementing XML namespaces. We agree that Java packages and C++ namespaces cannot adequately simulate the XML namespace concept, because XML namespace resolution is not limited to types, but also determines the namespaces of local elements and attributes. Schema import, include, and redefine constructs also do not translate to package imports. We believe that the incorporation of canonical namespace prefixes in identifier names, such as struct/class members with gSOAP [17], is practical and effective to resolve namespaces. Annotations suggested in [8] are not sufficient, because name clashes cannot be resolved.
• Unsupported XML schema components. As we mentioned earlier, many schema components that are foreign to programming language types can be eliminated and translated to simpler structures. De-sugaring a schema yields an equivalent but simpler schema.
• Unsupported types. All built-in XSD types can be mapped to basic programming language types or to specialized types introduced to represent XSD types.
• Serializing a graph of objects. SOAP-RPC encoding provides multi-referencing to serialize (cyclic) object/data graphs. Most SOAP toolkits support this feature, but not all of the toolkits necessarily follow the SOAP specification that limits the use of multi-ref for data with multiple references only. Document/literal poses some further problems, see Section 3.

This section presented mostly a schema-centric view of Web services layered on top of a programming language and its type system, where we reflected on the primary concerns for finding suitable programming language types for XML schema components. However, the latter issue also presents a point of view from a language perspective, where the problem at hand has close similarities to the object-coherence problem.

3. Object-Level Coherence Requirements

Large-scale distributed systems require strong object coherence guarantees [2] to ensure that objects moved, cached, and copied across a set of nodes in a distributed system preserve their structure and state. Coherence is a basic requirement in tightly-coupled distributed systems, such as Java RMI [15], CORBA [13], and XDR-based RPC [6]. In these systems the consistency of the distributed objects is critical and leads to fragility of the system when kept unchecked, e.g. Java class loaders dynamically verify imported classes.

Similarly, SOAP/XML processors share schemas to verify XML content, but must also be aware of the object referencing mechanism used when (de)serializing object graphs. Accurately representing object graphs in XML is critical to achieve structural coherence. We consider resolutions to the following issues critical for achieving object-level coherence in XML Web services:

• SOAP 1.1 RPC encoded multi-ref accessors are placed at the end of a message, so that all references are forward pointing. Object copying or pointer back-patching must be used by the deserializer for each for-

[Figure: object graph shapes (panels Tree, DAG, DAG) over nodes X, Y, Z, with corresponding <complexType name="X"> schema fragments.]
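The name-mapping convention mentioned in Section 2 (hex _xABCD_ codes for punctuation and Unicode characters in XML names) can be illustrated with a small sketch. The function name `xml_name_to_identifier` is hypothetical, and the choices to escape a leading digit and an underscore that is followed by `x` are assumptions made here so that the result is a legal and unambiguous identifier; the paper does not spell out these details.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the paper or from gSOAP): maps an XML name,
// given as Unicode code points, to an ASCII identifier using _xHHHH_ escapes.
std::string xml_name_to_identifier(const std::u32string& name) {
    std::string out;
    for (std::size_t i = 0; i < name.size(); ++i) {
        const char32_t c = name[i];
        const bool is_alpha = (c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z');
        const bool is_digit = (c >= U'0' && c <= U'9');
        // Assumption: an underscore is kept literally unless it is followed
        // by 'x', in which case it could be mistaken for an escape sequence.
        const bool plain_underscore =
            c == U'_' && !(i + 1 < name.size() && name[i + 1] == U'x');
        // Assumption: a leading digit is escaped, since identifiers in most
        // languages cannot start with a digit.
        if (is_alpha || (is_digit && i > 0) || plain_underscore) {
            out += static_cast<char>(c);
        } else {
            char buf[16];
            // At least four uppercase hex digits, more for larger code points.
            std::snprintf(buf, sizeof buf, "_x%04X_", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

Under these assumptions, the XML name `first-name` would map to `first_x002D_name`, because the hyphen is code point U+002D. A decoder would reverse the mapping by scanning for `_x`, hex digits, and a closing underscore.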
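Pointer back-patching for forward-pointing multi-ref accessors, as named in the bullet above, might be sketched as follows. This is an illustrative assumption about how a deserializer could do it, not the algorithm of the paper or of any specific toolkit; `Node`, `RefTable`, `reference`, and `define` are hypothetical names. The idea is to remember every pointer slot that refers to an id not yet seen, and to fill all of those slots once the object with that id is deserialized.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical object type with one pointer-valued member.
struct Node {
    int value = 0;
    Node* next = nullptr;
};

// Hypothetical table that resolves href/id pairs during deserialization.
class RefTable {
    struct Entry {
        Node* target = nullptr;        // set once the id has been deserialized
        std::vector<Node**> pending;   // slots waiting for this id
    };
    std::unordered_map<std::string, Entry> entries_;

public:
    // Called when a reference to `id` is parsed. If the target is already
    // known the slot is filled at once, otherwise it is queued for patching.
    // The slot must stay at a stable address until it has been patched.
    void reference(const std::string& id, Node** slot) {
        Entry& e = entries_[id];
        if (e.target) {
            *slot = e.target;
        } else {
            e.pending.push_back(slot);
        }
    }

    // Called when the object carrying `id` has been deserialized:
    // back-patches every slot that referred to it earlier in the message.
    void define(const std::string& id, Node* target) {
        Entry& e = entries_[id];
        e.target = target;
        for (Node** slot : e.pending) {
            *slot = target;
        }
        e.pending.clear();
    }

    // Number of ids that were referenced but never defined; a nonzero value
    // at the end of a message indicates a dangling reference.
    std::size_t unresolved_count() const {
        std::size_t n = 0;
        for (const auto& kv : entries_) {
            if (!kv.second.target) ++n;
        }
        return n;
    }
};
```

With this design, two objects that both reference the same id end up pointing at one shared instance, which preserves the shape of the object graph instead of duplicating the shared node. The alternative the text mentions, object copying, would trade that shared identity for a simpler single-pass deserializer.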