BRICS: Basic Research in Computer Science
Giuseppe Castagna & Mukund Raghavachari (eds.): PLAN-X 2006 Informal Proceedings
BRICS Notes Series NS-05-6

PLAN-X 2006 Informal Proceedings

Charleston, South Carolina, January 14, 2006

Giuseppe Castagna Mukund Raghavachari (editors)

BRICS Notes Series NS-05-6. ISSN 0909-3206. December 2005. Copyright © 2005, Giuseppe Castagna & Mukund Raghavachari (editors). BRICS, Department of Computer Science, University of Aarhus. All rights reserved.

Reproduction of all or part of this work is permitted for educational or research use on condition that this copyright notice is included in any copy.

See back inner page for a list of recent BRICS Notes Series publications. Copies may be obtained by contacting:

BRICS
Department of Computer Science
University of Aarhus
Ny Munkegade, building 540
DK–8000 Aarhus C
Denmark
Telephone: +45 8942 3360
Telefax: +45 8942 3255
Internet: [email protected]

BRICS publications are in general accessible through the World Wide Web and anonymous FTP through these URLs:

http://www.brics.dk
ftp://ftp.brics.dk
This document in subdirectory NS/05/6/

PLAN-X 2006 Informal Proceedings
Charleston, South Carolina, 14 January 2006

Invited Talk

• Service Interaction Patterns 1 John Evdemon

Papers

• Statically Typed Document Transformation: An Xtatic Experience 2 Vladimir Gapeyev, François Garillot and Benjamin Pierce

• Type Checking with XML Schema in XACT 14 Christian Kirkegaard and Anders Møller

• PADX: Querying Large-scale Ad Hoc Data with XQuery 24 Mary Fernandez, Kathleen Fisher, Robert Gruber and Yitzhak Mandelbaum

• OCaml + XDuce 36 Alain Frisch

• Polymorphism and XDuce-style patterns 49 Jérôme Vouillon

• Composing Monadic Queries in Trees 61 Emmanuel Filiot, Joachim Niehren, Jean-Marc Talbot and Sophie Tison

• Type Checking For Functional XML Programming Without Type Annotation 71 Akihiko Tozawa

Demos

• Accelerating XPath Evaluation against XML Streams 82 Dan Olteanu

• Imperative Programming Languages with Database Optimizers 83 Daniela Florescu and Anguel Novoselsky

• Xcerpt and visXcerpt: Integrating Web Querying 84 Sacha Berger, François Bry and Tim Furche

• XJ: Integration of XML Processing into Java 85 Rajesh Bordawekar, Michael Burke, Igor Peshansky and Mukund Raghavachari

• XML Support in Visual Basic 9 86 Erik Meijer and Brian Beckman

• XACT – XML Transformations in Java 87 Christian Kirkegaard and Anders Møller

• XTATIC 88 Vladimir Gapeyev, Michael Levin, Benjamin Pierce and Alan Schmitt

• OCamlDuce 89 Alain Frisch

• LAUNCHPADS: A System for Processing Ad Hoc Data 90 Mark Daly, Mary Fernández, Kathleen Fisher, Yitzhak Mandelbaum and David Walker

• XHaskell 92 Martin Sulzmann and Kenny Zhuo Ming Lu

Program Committee

Gavin Bierman (Microsoft Research)
Giuseppe Castagna (CNRS, École Normale Supérieure de Paris), chair
Alain Frisch (INRIA Rocquencourt)
Giorgio Ghelli (University of Pisa)
Tova Milo (Tel Aviv University)
Makoto Murata (IBM)
Dan Olteanu (Saarland University)
Benjamin Pierce (University of Pennsylvania)
Mukund Raghavachari (IBM T.J. Watson Research Center), demo chair
Helmut Seidl (Technische Universität München)

General Chair
Anders Møller (BRICS, University of Aarhus)

Service Interaction Patterns (Invited Talk)

John Evdemon [email protected]

Abstract

The traditional method for building a service requires a developer to ensure that business logic is not hosted directly within the service itself. While this approach helps make the service more flexible, it does not address the biggest architectural gap facing web services today: service interaction patterns (SIPs). A SIP occurs when services engage in concurrent and interrelated interactions with other services. Traditional web service architectures are designed to accommodate simple point-to-point interactions: there is no concept of a logical flow or series of steps from one service to another. Standards such as WS-BPEL are being developed to address this gap. In this session we will discuss a "manifesto" for workflow-enabled solutions, review emerging standards (BPEL, others) and address possible misconceptions regarding these standards.

Statically Typed Document Transformation: An XTATIC Experience

Vladimir Gapeyev (University of Pennsylvania), François Garillot (École Normale Supérieure), Benjamin C. Pierce (University of Pennsylvania)

Abstract

XTATIC is a lightweight extension of C# with native support for statically typed XML processing. It features XML trees as built-in values, a refined type system based on regular types à la XDUCE, and regular patterns for investigating and manipulating XML. We describe our experiences using XTATIC in a real-world application: a program for transforming XMLSPEC, a format used for authoring W3C technical reports, into HTML. Our implementation closely follows an existing one written in XSLT, facilitating comparison of the two languages and analysis of the costs and benefits—both significant—of rich static typing for XML-intensive code.

1 Introduction

A profusion of recent language designs, including XDUCE [17, 18, 19], CDUCE [11, 2], XACT [25, 8], XQUERY [4, 10], XJ [15], XOBE [23], and XTATIC [14, 26, 12, 13, 27], are founded on the belief that rich static type systems based on regular tree languages can offer significant benefits for XML-intensive programming. Though attractive, this belief can be questioned on a number of counts. Are familiar XML processing idioms from untyped settings easy to enrich with types, or are there important idioms for which static typing is awkward or unworkable? Is it feasible to reimplement untyped applications in a statically typed language in a "bug-for-bug compatible" fashion? Does the need to please the typechecker lead to too much repetitive boilerplate or too many type annotations? Our aim is to put these questions to the test by a detailed comparison of a non-trivial application originally written in XSLT 1.0 [9] and a faithful reimplementation of the same application in XTATIC.

For this experiment, we chose a task that has also been used as a case study in the standard XSLT reference [20, 21]: translation of structured documents from a high-level document description language, XMLSPEC, into XHTML. XMLSPEC is the format used for authoring official W3C recommendations and drafts. This example is non-trivial but of manageable size: the DTD for XMLSPEC defines 102 elements and 57 type-like entities, while the XHTML DTD defines 89 elements and 65 entities; the XSLT stylesheet implementing the transformation is 770 lines long. Besides styling XMLSPEC elements as HTML, its functions include formatting BNF grammars, section numbering, setting up cross-references, and generating the table of contents.

A useful effect of emulating a finished untyped application is that both costs and benefits are visible all at once, rather than arising and being dealt with incrementally, throughout the design and development process. To maximize the opportunities for comparison, our XTATIC implementation closely follows not only the behavior, but also, as far as possible, the structure of the original XSLT implementation.

The contributions of the paper are as follows. First, we draw attention to the XMLSPEC problem itself. This problem offers a good balance of size, complexity, and familiarity, and we hope that it can be re-used by others as a common benchmark for XML processing languages. Second, we present a detailed analysis of the costs and benefits of expressive static types for XML manipulation, both of which were substantial in this application. The main cost is the difficulty of inferring appropriate types for multiple, mutually recursive transformations. The main benefit is the expected one: design flaws in the XMLSPEC DTD—which show up in the XSLT stylesheet as behavioral bugs—are instead exposed as type inconsistencies. Third, we demonstrate that the type system and processing primitives of XTATIC are sufficiently powerful and flexible to fix (or gracefully work around) these bugs without modifying the XMLSPEC DTD. Fixing some of them in the XSLT stylesheet appears more difficult. Finally, reimplementing an existing stylesheet gives us many opportunities for head-to-head comparisons of XSLT and XTATIC, highlighting areas where each shines. In particular, we observe that XTATIC-style regular pattern matching is more natural than XSLT's style—structural recursion augmented with "context probing"—when processing structures, such as the BNF grammar descriptions found in XMLSPEC, where ordering is important. Conversely, XSLT is very convenient for straightforward structural traversals with local transformations, where XTATIC requires a heavier explicit-dispatch control flow. Also, XSLT's data model, which treats the original document as a resource for the computation, is more natural for certain tasks, though we can mimic some of its uses with generic libraries in XTATIC.

Section 2 summarizes XMLSPEC and gives a high-level explanation of the transformation task. Section 3 describes the main challenges of expressing the core XSLT processing model in XTATIC. Section 4 compares the processing of structured data such as BNF grammars in XSLT

2 and XTATIC. Section 5 describes the auxiliary data struc- resented explicitly in XMLSPEC by nested sectional ele- tures that our application uses in place of the global doc- ments div1,...,div4, while in HTML it is implied by ument access primitives offered by XSLT. We close with heading elements h1,...,h6 that interrupt the flow of an overview of other evaluations of XML processing lan- paragraph-level markup. Both formats include generic guages in Section 6 and some concluding thoughts in Sec- paragraph-level elements—for example, enumerated and tion 7. The paper is intended to be self-contained, but it bulleted lists (ol and ul in HTML vs olist and ulist in does not present the motivations or technical details of XMLSPEC)andparagraphs(p in both). But XMLSPEC the XTATIC design in depth; for these, the reader is re- also defines special-purpose variants like blist,whichis ferred to our earlier papers, especially [14, 13], and to a list containing only bibliographic entries, and vcnote— Gapeyev’s forthcoming PhD dissertation. a special kind of paragraph for technical snippets called validity constraint notes. Finally, at the phrase level, 2 The Problem where HTML elements like em, i, b decorate the flow of character text with visual emphasis and the anchor ele- a The history of both XSLT and XMLSPEC goes back to ment provides simple linking points and links for exter- 1998, when the standards for XML and XSLT themselves nal resources or locations in the document, their XML- were still under development. Newer versions of the DTD SPEC counterparts play more semantically-loaded roles. termdef and the stylesheet (available from the XMLSPEC web For example, a element encloses a phrase that page, http://www.w3.org/2002/xmlspec/)continueto defines the meaning of a term (whose occurence in the be used for developing W3C specifications. definition is marked by a term element) and can be linked to from other parts of the document by element termref. Our development is based on the 1998 version of There are many more elements for specific roles, such as kw specref XMLSPEC—the one used for the original XML Recom- language keywords ( ), references ( )toother mendation. Our primary reason for using this some- parts of the specification, etc. This semantic specializa- what dated version was the public availability of the tion of elements allows one to vary independently not XML sources for the Recommendation, which we used only their visual representation, but also additional pro- as testing data; more recent W3C specifications, devel- cessing such as creation of indexes and glossaries. oped with newer versions of XMLSPEC,wereonlyavail- able only in formatted (HTML, PS, PDF) form when we Another category is XMLSPEC elements containing began the project. (Starting in September 2005, XML “structured data” of various kinds. The most interest- sources of some newer W3C specification drafts—e.g., ing example is the scrap element, which encapsulates XPATH,XSLT,andXQUERY—have again become avail- BNF rules for grammar productions; its formatting is dis- able.) The XMLSPEC Guide [29] is a useful resource for cussedindetailinSection4. understanding the XMLSPEC DTD (although it describes a later, slightly more feature-rich version). The original Most of the task of an XMLSPEC to HTML transformer is 1998 XSLT stylesheet is described in detail in the 2nd edi- thus a straightforward (often literally one-to-one) map- tion of Michael Kay’s XSLT reference [20]. Both the DTD ping from XMLSPEC to HTML element tags. 
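For instance, a list markup fragment (hypothetical, not taken from the Recommendation sources) such as

    <olist>
      <item><p>XML shall be usable over the Internet.</p></item>
    </olist>

maps directly to the HTML

    <ol>
      <li><p>XML shall be usable over the Internet.</p></li>
    </ol>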
But there and the stylesheet are available from the book’s web page are several aspects that are more interesting, including (http://www.wrox.com). displaying structured data in a readable form, computing section numbers based on the hierarchical positioning of div XMLSPEC is similar to more elaborate XML-based docu- elements, creating a table of contents, with entries ment schemas, such as DOCBOOK (http://www.docbook. hyperlinked to the corresponding sections, and format- org/) and the Text Encoding Initiative (http://www. ting the cross-references occurring in the document so tei-c.org/), in that it encodes the “logical” structure that they properly mention features of the referent, such of a document so that the same information can be pre- as title or its computed section number. sented in different styles and media. Here, we consider only the task of transforming an XMLSPEC document into 3 Structural Recursion a single HTML page, as shown in Figure 1. The processing model of XSLT is rather different from Since both XMLSPEC and XHTML are used for document the explicit control flow of traditional programming lan- markup, there are many similarities between their DTDs. guages, including XTATIC,beingbasedonanimplicit re- In both, a valid document file has distinct sections for cursive traversal of the input document. After introduc- meta-data and for the content proper. The content has ing the XSLT processing model and sketching how an three kinds of markup: top-level, or sectional, for hier- XTATIC application can simulate it explicitly, this section archical document organization; medium-, or paragraph- discusses the main challenges of making this implemen- level, for chunks of actual content; and low-, or phrase- tation strictly typed: (1) structuring its code to accomo- level, for the content flow itself. More interesting for date the constraints of typing and (2) fixing the typing our task, though, are the differences, which stem from bugs inherited from the stylesheet. differences in purpose between logical and presentation markup: XMLSPEC uses markup to indicate the role of a piece of text in the discussion of a subject matter, while 3.1 Implicit Structural Recursion in HTML uses markup to instruct a browser how a piece of XSLT text should be visually presented to the reader. An XSLT stylesheet is a collection of templates,eachspeci- For example, the hierarchical document structure is rep- fying a computation to be performed on document nodes

that satisfy a specified test condition. The execution of a stylesheet proceeds in a single recursive pass over the input document, in document order. For each node encountered during the traversal, the run-time system selects the most specific template whose test is satisfied by the node and executes it. Consider, for example, the following template:¹ The test (match="olist") says that the template is applicable to XMLSPEC olist elements; for each such element, the template produces an HTML ol element. The contents of the ol are the result of a further recursive traversal of the input: the instruction <xsl:apply-templates/> designates the location receiving the result of applying the same procedure of selecting and executing an appropriate template, to each child node of the olist, in order. Since, according to the XMLSPEC DTD, the only possible children of olist are item elements, the template that gets invoked on them is

Figure 1. A sample XMLSPEC document fragment and its rendering via HTML. (The sample shows an "Introduction" section—"XML is an application profile or restricted form of SGML..."—with design goals, subsections "Origin and Goals" and "Terminology", and a "References" entry for ISO 8879:1986, the SGML standard.)

In general, the test condition in a template's match attribute is specified by an XSLT pattern, which is written in the downward subset of XPATH. A template is applicable to an element when its pattern matches it, i.e., there is an ancestor node, starting from which the pattern (as a path) would select the element. More than one template can be applicable to a document node, but there is always at least one, since XSLT predefines a default template applicable to any element, whose action is to proceed with the traversal without producing any output. In the case of multiple applicable templates, only one of them gets selected for execution according to a set of priority rules whose particulars are not important for this discussion.

The bulk of the XMLSPEC stylesheet consists of templates similar to these, performing simple tag-to-tag transformations—sometimes augmented with other output whose generation depends only on the current element. This processing style, known as structural recursion [1, 5, 6], is the backbone of the XSLT processing model. However, since a simple one-pass structural recursion alone would not be sufficient for many applications, XSLT augments it with more features, some of which we will see later.
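In standard XSLT 1.0 syntax, the two templates discussed in this subsection presumably read roughly as follows (a sketch; the stylesheet's actual markup may differ in detail):

    <!-- XMLSPEC olist becomes an HTML ol; children are processed recursively -->
    <xsl:template match="olist">
      <ol>
        <xsl:apply-templates/>
      </ol>
    </xsl:template>

    <!-- XMLSPEC item becomes an HTML li -->
    <xsl:template match="item">
      <li>
        <xsl:apply-templates/>
      </li>
    </xsl:template>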
  • 3.2 Types and Patterns in XTATIC which similarly constructs an HTML li element from each XMLSPEC item element. The recursive descent of Before describing our implementation of the formatter, the traversal terminates either on the document’s text let us pause, briefly, to review the XML types and patterns nodes, which get copied into the output, or on templates found in XTATIC. that do not call others via or a similar instruction. XTATIC’s types are composed from XML element tags using the familiar regular expression operators of con- catenation (“,”), alternation (“|”), repetition (“*”), and 1The XML-based syntax is a controversial aspect of non-empty repetition (“+”). They can also contain type XSLT. Readers unfamiliar with the language only need names, which are bound to their definitions by top-level to know that elements starting with the xsl: prefix are regtype declarations. For example, here is a fragment of XSLT instructions, while others are literal elements con- the XTATIC type declarations corresponding to the XML- structing the output. SPEC DTD:

    4 regtype s_olist [[ s_item+ ]] The Dispatch method uses a combination of C while regtype s_item [[ s_obj_mix+ ]] and XTATIC match statements to consume the input se- regtype s_obj_mix [[ s_p | s_olist | s_ulist | quence from the variable seq and produce the output se- ... ]] quence in the variable res.XTATIC’s match statement is similar to C ’s switch,butitscase tests are patterns and s_ We use the prefix for type names coming from XML- therefore can assign fragments of the input to variables h_ SPEC,and,later, for names coming from XHTML. for use in the clause’s body. The full code of Dispatch The double square brackets are used to separate regu- contains a case for each XMLSPEC element, except for lar types, patterns, and XML values from surrounding C elements involved in presenting structured data, which code. (The ellipsis ... is not part of XTATIC syntax; it just are not covered by the dispatching framework (see Sec- indicates that the definition of s_obj_mix is larger than tion 4). shown.)

    The semantics of XML types is similar to that of regu- 3.4 Typing the Recursion lar expressions on strings: a type is the set of values described by the type’s definition, except that the val- Our goal, however, is to implement a well-typed ues are XML document fragments—i.e. sequences of formatter—i.e., one whose output is, by construction, trees built from XML element tags and characters. For valid HTML for any valid XMLSPEC input. Therefore, we example, the values of type s_item are single XML el- need to give more precise output types to our methods. ements of the form ... whose contents are non-empty sequences of elements described by the Almost every template method returns a sequence of one union type s_obj_mix. The predefined type describes or more HTML elements that it creates itself; in these all well-formed XML values. The brackets with no con- cases, the precise output type for the method can be in- tent, [[]], denote the type containing only the empty ferred from its code alone. For example, TemplateItem is sequence (when used where a type is expected), as well intended to return values of type h_li. Precise template as the empty sequence value itself (when used where a methodtypesinduceapreciseresulttypeforDispatch, value is expected). which, instead of xml, now yields the union of the result types of all the templates it invokes. A regular pattern is a type annotated with variables. For example, This type, however, is too large. For example, in order for TemplateOlist to return a valid ol element, the static re- [[ s_item first, s_item+ rest ]] sult type of the recursive call to Dispatch at this point must contain only li elements. Thus, instead of a single is a pattern with variables first and rest that will be Dispatch method, we need to define several dispatchers, bound to values of types s_item and s_item+ after a suc- each invoking only the subset of template methods suit- cessful match. An XML value matches a pattern when able for a particular context and therefore ensuring ap- the value belongs to the type obtained by erasing the propriate input and output types. For example, the typed bound variables. These patterns are the main construct version of TemplateItem becomes: that XTATIC programs use to analyze XML. static [[h_li]] TemplateItem ([[s_item]] elt){ [[s_obj_mix+ cont]] = elt; return [[

  • DispatchInItem(cont)]]; 3.3 Explicit Structural Recursion in } XTATIC Besides the precise return type and the call to the cus- Implementing an untyped equivalent of the XMLSPEC tom dispatcher DispatchInItem, it also analyzes input stylesheet’s behavior in XTATIC is straightforward: it can elt by a pattern that strictly follows the definition of type be written as a collection of mutually recursive static class s_item and therefore gives the variable cont amorepre- methods, one per template, plus a dispatcher method that cise type, on which DispatchInItem can rely. simulates the role of the XSLT run-time system. Figure 2 shows the fragments of this implementation correspond- In general, the dispatcher used by a template must be ing to the two XSLT templates discussed above. prepared to handle any input that the template can pass to it, and its output must be acceptable for the use the The two template methods have a similar structure. template has for it. Any collection of dispatchers that TemplateItem, for example, declares s_item,thetypeof satisfy these constraints for all templates would give a XMLSPEC elements on which it can operate, as its in- type-correct formatter. For a few of the templates, how- put type; then, relying on the fact that the argument ever, it is not possible to compose a well-typed dispatcher elt can only contain an item element, it uses a pat- from the template methods that would faithfully repro- tern assignment to extract the element’s content into the duce the operation of the stylesheet’s templates. These variable cont; finally, it builds and returns the result- are instances of genuine processing bugs in the original ing HTML li element. The contents of li come from a XSLT application, which can only be fixed by modifying call to the method Dispatch, which plays the role of the existing or writing additional template code. instruction. Note that the pat- tern in the assignment follows the definition of the type In a few cases, the bugs are caused by subtle incompat- s_item. ibilities between XMLSPEC and HTML that are possible

    5 static [[xml]] TemplateOlist ([[s_olist]] elt){ static [[xml]] Dispatch ([[xml]] seq) { [[xml cont]] = elt; [[xml]] res = [[]]; return [[

      Dispatch(cont)]]; while (!seq.Equals([[]])) { } match (seq) { case [[s_olist elt, xml rest]]: static [[xml]] TemplateItem ([[s_item]] elt){ res = [[res, TemplateOlist(elt)]]; [[xml cont]] = elt; seq = rest; return [[
    1. Dispatch(cont)]]; case [[s_item elt, xml rest]]: } res = [[res, TemplateItem(elt)]]; seq = rest; //...... }} return res; }

      Figure 2. A fragment of the untyped structural recursion code in XTATIC.
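Laid out one method per block, the fragment in Figure 2 reads roughly as follows. The explicit element tags in the pattern assignments and return expressions (the <olist>, <item>, <ol>, and <li> parts) are assumptions filled in from the surrounding prose, so treat the concrete syntax as a sketch rather than verbatim XTATIC code.

    // Untyped template method for XMLSPEC olist: wrap the recursively
    // processed contents in an HTML ol element.
    static [[xml]] TemplateOlist ([[s_olist]] elt) {
      [[ <olist>xml cont</olist> ]] = elt;   // pattern assignment: bind the element's contents
      return [[ <ol>Dispatch(cont)</ol> ]];
    }

    // Untyped template method for XMLSPEC item.
    static [[xml]] TemplateItem ([[s_item]] elt) {
      [[ <item>xml cont</item> ]] = elt;
      return [[ <li>Dispatch(cont)</li> ]];
    }

    // Dispatcher playing the role of xsl:apply-templates: consume the input
    // sequence and invoke the template method matching each element.
    static [[xml]] Dispatch ([[xml]] seq) {
      [[xml]] res = [[]];
      while (!seq.Equals([[]])) {
        match (seq) {
          case [[ s_olist elt, xml rest ]]:
            res = [[ res, TemplateOlist(elt) ]]; seq = rest;
          case [[ s_item elt, xml rest ]]:
            res = [[ res, TemplateItem(elt) ]]; seq = rest;
          // ... one case per remaining XMLSPEC element ...
        }
      }
      return res;
    }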

      (though a bit tricky) to smooth out in XTATIC, but appar- voke would be more difficult. (None of the several pos- ently not in XSLT, so it is instructive to discuss them and sibilities we can see is completely satisfactory. Using our solutions in some detail. more complex path patterns in match attributes, such as div1/ednote and head/ednote, which test for the par- ent element, would require writing as many templates 3.5 Bugs and Fixes for ednote as there are possible parents—each such tem- plate’s body duplicating one of the only two handlers. We XMLSPEC defines an element ednote for recording ed- can write a single template for ednote that accesses the itorial remarks. The DTD allows ednote to appear in parent node and determines its type via a both paragraph- and phrase-level contexts, but the XSLT or a chain of instructions, which again have to stylesheet contains only one template for ednote,which list all the possibilities. Other options include use of tem- formats it as blockquote, a paragraph-level HTML ele- plate modes and template parameters, but these are also ment presented in browsers as an indented paragraph. quite heavy.) Clearly, appearances of ednote in phrase-level contexts (e.g., inside head elements of section titles) should be The typing bug that required the most sophisticated fix formatted differently. To handle this, we implement a in our reimplementation is caused by one of the most second template method for ednote, with a phrase-level- straightforward-looking templates in the stylesheet: friendly return type. A dispatcher that has the ednote element in its input type processes it with whichever of the two template methods that is compatible with the dis-
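Judging from the discussion that follows, the template in question is essentially (XSLT 1.0 syntax; a sketch):

    <xsl:template match="p">
      <p>
        <xsl:apply-templates/>
      </p>
    </xsl:template>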

      patcher’s return type.

      A similar, but trickier, problem arises in the formatting This template transforms the XMLSPEC paragraph ele- p of another phrase-level XMLSPEC element, quote.This ment into an HTML element of the same name. The element is different from most others: rather than cre- trouble is, an HTML p can contain only character data ating a new HTML element or two, the corresponding and phrase-level elements, while an XMLSPEC p can also template just surrounds the result of recursively format- contain select paragraph-level elements. Consequently, ting the quote’s contents with quotation mark characters. this template can produce an HTML p with paragraph- The content type of quote is such that it gets transformed level elements, such as lists (ol,etc.),aschildren. into output belonging to the most general HTML phrase- level type, h_Inline. One of the elements that can occur The sources of the XML Recommendation actually con- inside h_Inline is the anchor element a, and the content tain quite a few instances of p elements that tickle this of the latter is described by the subtype h_a_content of bug. Since it affects validity of the generated HTML, the h_Inline, which disallows a elements, prohibiting nested bug was addressed in the later versions of the stylesheet anchors. The quote element itself, however, can occur in by a hack: when an element like ol appears inside a an XMLSPEC context that ends up formatted inside an a paragraph, the stylesheet adds to the output tree a text element, possibly producing a nested anchor. The resolu- node whose content is “

      ”, then formats the ol,and

      tion in XTATIC is similar to the one for ednote:wewrite then generates another text node whose content is “ ”. two template methods for quote, both just adding quota- This does not restore the validity of the in-memory tree tion marks, but to the results coming from two different produced by the stylesheet, but only of its textual seri- dispatchers. alization, implying that the stylesheet cannot be used in pipelining scenarios without re-parsing and re-validation The solutions for these two problems work because, by of its output. We do not see any natural way to fix this explicitly implementing the recursive traversal as a com- bug in XSLT without changing the XMLSPEC DTD. bination of calls to several distinct dispatcher methods, our algorithm tracks (static) information about its cur- Our method TemplateP implements the above fix in a rent context in the input document. In principle, an fully typed way. It uses a dispatcher that transforms XSLT stylesheet could also implement processing alter- the contents of XMLSPEC p into a sequence of text and natives for ednote and quote elements, but making the phrase- and paragraph-level HTML elements, and then context-dependent decision of which one of them to in- processes it to find (with the use of XTATIC patterns)

      6 static [[h_block*]] FlowIntoBlocks ([[h_Flow]] flow) { [[h_block*]] res = [[]]; while (!flow.Equals([[]])) { match (flow) { case [[(pcchar | h_inline | h_misc_inline)+ inl, h_Flow rest]]: res = [[res,

      inl]]; flow = rest; case [[h_block+ blocks, h_Flow rest]]: res = [[res, blocks]]; flow = rest; case [[(h_form | h_noscript) unexpected, h_Flow rest ]]: Error("Unexpected input in FlowIntoBlocks"); flow = rest; case [[]]:Error("empty case"); }} return res; } Figure 3. The method performing an HTML processing pass to detect implicit paragraphs. longest subsequences of text and phrase-level elements and wrap them as HTML p elements. Figure 3 shows the method that performs the HTML processing pass. The final result of TemplateP is paragraph-level content.
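Re-laid out, the method of Figure 3 reads roughly as follows. The <p> constructor in the first case is an assumption based on the accompanying description (runs of phrase-level content are wrapped as HTML p elements); the rest follows the figure.

    // Split a flow of HTML content into block-level pieces, wrapping
    // maximal runs of character data and inline content in p elements.
    static [[h_block*]] FlowIntoBlocks ([[h_Flow]] flow) {
      [[h_block*]] res = [[]];
      while (!flow.Equals([[]])) {
        match (flow) {
          case [[ (pcchar | h_inline | h_misc_inline)+ inl, h_Flow rest ]]:
            res = [[ res, <p>inl</p> ]];    // new implicit paragraph
            flow = rest;
          case [[ h_block+ blocks, h_Flow rest ]]:
            res = [[ res, blocks ]];        // block-level content passes through
            flow = rest;
          case [[ (h_form | h_noscript) unexpected, h_Flow rest ]]:
            Error("Unexpected input in FlowIntoBlocks");
            flow = rest;
          case [[]]:
            Error("empty case");
        }
      }
      return res;
    }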

      From what we have said so far, it might appear that there is another way to implement TemplateP,notin- volving HTML post-processing: we could use patterns to find longest subsequences of XMLSPEC elements and text to be transformed into phrase-level HTML, apply an ap- propriate dispatcher to them, and wrap the results as p Figure 4. An HTML table generated from an XMLSPEC elements. In fact, the approach we sketched above is grammar the only one that works, because of another problem— this one caused by XMLSPEC termdef elements occur- ring in the content of p. These elements are used to XSLT handle the challenges of rendering structured data designate boundaries of formal definitions in a specifica- for visual presentation. tion. As with quote, the processing of a termdef does not create an HTML element—it just returns an anchor ele- ment a followed by the sequence resulting from process- 4.1 BNF Productions ing the contents. This sequence can contain both phrase- and paragraph-level elements. If termdef elements only A grammar fragment is represented in XMLSPEC as a occurred surrounded by paragraph-level elements, we sequence of production elements prod, each having the could implement TemplateTermdef like TemplateP.How- structure described by the following DTD declaration: ever, when an occurrence of termdef in p is directly pre- ceded by phrase-producing content and the result pro- duced by the termdef also starts with phrase-level con- tent, the two must be joined into a single HTML para- That is, a production consists of a left-hand side con- lhs graph. Therefore, to avoid creating spurious paragraph taining exactly one element, which introduces the breaks, we define TemplateTermdef to just return the non-terminal defined by the production, and a right-hand result of recursive processing of its contents. The lat- side, which defines the unfolding of the non-terminal and ter joins the surrounding HTML and gets processed in consists of a sequence of one or more element groups. rhs TemplateP to detect the paragraphs. Each group contains exactly one element, which rep- resents a fragment of the unfolding (usually, an alter- native BNF clause), possibly accompanied by side con- Along with these significant typing difficulties, XTATIC’s com typechecker uncovered several more minor bugs in the ditions in the form of a comment ( ), or a reference wfc vc stylesheet that also affected validity, but that were easy to a well-formedness ( )orvalidity( )constraint.It to fix by small changes to the output. is not important to know about the internals of the ele- ments inside prod. Each of them gets formatted in the usual way as an HTML fragment to be placed inside a 4 Structured Data table cell; the layout of this table is our present concern.
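The content model being described corresponds, approximately, to a DTD declaration of the following shape (a paraphrase, not a verbatim quote of the XMLSPEC DTD):

    <!ELEMENT prod (lhs, (rhs, (com | wfc | vc)*)+)>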

      XMLSPEC defines several collections of elements for Figure 4 shows an example. The generated table has five structured data. This section employs the most so- columns containing, respectively, an automatically gen- phisticated of these—elements for representing BNF erated sequence number for the production, the name of grammars—as an example showing how XTATIC and the non-terminal being defined, the symbol ::=, the frag-

      7 ments of the non-terminal’s definition, and the comments Now, the order of template execution on an instance of and constraints. prod is as follows. First, the template for prod detects all the starter elements among the prod’s children and The challenge here is assigning appropriate contents to invokes a type-appropriate starter template on each. The the table’s cells based on the relative positioning of var- starter template pads the row with empty cells (or, in case ious elements in the flat sequence of prod’s children, of lhs, starts a new row, and makes cells with a running rather than by simply reflecting a nested structure that sequence number and the ::= symbol), calls an appro- is already present in the input. priate cell template on the current element to format its own cell, and finally formats any remaining cells in the 4.2 XTATIC Solution row by applying cell templates to the appropriate follow- ing siblings of the current element. XTATIC’s patterns address this challenge naturally. Note that, in each production, the element lhs contributes This algorithm requires features of XSLT that go beyond only to the starting of the first table row corresponding structural recursion—the ability to control selection of to the production, while the rest of the first row, as well both templates and nodes during traversal (to invoke ei- as each of the remaining rows, is generated from a small ther starter or cell templates as appropriate) and to ob- “chunk” of prod’s children containing at most one rhs ele- tain information about the surroundings of the current mentandatmostonecom, wfc or vc element. This chunk node. The next few paragraphs review these XSLT fea- can be described by the type tures. regtype xs_rhschunk The selection of templates to be considered for applica- [[(s_rhs, xs_constr_mix?) | xs_constr_mix]] tion when executing xsl:apply-templates can be con- regtype xs_constr_mix trolled in XSLT by template modes. A template’s definition [[s_com | s_wfc | s_vc]] cancontain(inthestarttagofxsl:template element) an attribute mode specifying the mode of this template. E.g., and, using this type, we can easily write patterns that split the “cell” templates in our stylesheet are headed by tags the sequence of prod children into the chunks necessary like for creating the table row-by-row; the full code appears in Figure 5. The method TemplateProd starts by extracting from the production the name (lhs) of its non-terminal xsl:apply-templates and the first chunk of the definition. It uses these to con- Then, an instruction that also mode struct the first table row corresponding to the production mentions the attribute, e.g. in the newly created variable res. The number placed in the first table cell is extracted, based on the production’s identifier (prodid), from an index data structure cre- considers only the templates marked by the same mode. ated before processing the document (this process is de- scribed in Section 5). The contents of chunk is processed To control the selection of nodes to be processed by fur- by a separate method, MkRhsChunk, which performs a ther traversal, the XSLT xsl:apply-templates instruc- straightforward match on the two alternatives in the defi- tion can be augmented with the attribute select speci- nition of xs_rhschunk type and invokes TemplateRhs and fying the sequence of nodes to be processed next, instead DispatchFlow toprocessthechunk’selements.Thesec- of the default children sequence of the current element. 
ond part of TemplateProd is a foreach statement that For example, the prod template restricts further process- iterates over the rest of the production by cutting con- ing to starter elements only by executing the instruction secutive chunks off it with the [[xs_rhschunk chunk]] pattern, while adding to res a new table row for each A child element of an instance of prod can be classi- fied as a “starter” element if it provides data for the first The contents of select is an XPATH path expression that, non-empty cell in the HTML table’s row; otherwise as a when applied to a node, produces a sequence (possibly “follow-up” element. Accordingly, the stylesheet defines empty) of nodes from the document that are related to two templates for each child element type of prod:a the original node as specified by the path. “cell” template that just performs formatting inside the HTML table’s cell (in other words, a cell template is an or- For our current purposes, we can think of an XPATH path dinary structural recursion template in the sense of Sec- as an expression of the form2 a::n[q ]...[q ] where a tion 3), and a “starter” template that is supposed to be 1 k executed only on starter elements, performing, among other things, row padding with empty cells. 2More precisely, the construction described here is a

      8 static [[h_tr+]] TemplateProd ([[s_prod]] markup) { [[ pcdata lhs, (s_rhs, xs_constr_mix?) chunk, xs_rhschunk* rest ]] = markup; [[h_tr+]] res = [[ , ‘[‘,prodindex.Number(prodid),‘]‘, lhs, ‘::=‘, MkRhsChunk(chunk) ]]; foreach ([[xs_rhschunk chunk]] in rest) { res = [[ res, , , , MkRhsChunk(chunk) ]]; } return res; } static [[h_td, h_td]] MkRhsChunk ([[xs_rhschunk]] chunk) { match (chunk) { case [[ s_rhs rhs, xs_constr_mix? constrOPT ]]: return [[ TemplateRhs(rhs), DispatchFlow(constrOPT)]]; case [[ xs_constr_mix constr ]]: return [[ , DispatchFlow(constr)]]; } } Figure 5. BNF production formatting in XTATIC.
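In concrete XSLT 1.0 terms, a moded cell template and the starter-only selection described in this subsection look roughly like this (a sketch; the mode name and the exact expressions in the stylesheet may differ):

    <!-- A "cell" template for rhs, distinguished by its mode -->
    <xsl:template match="rhs" mode="cell">
      <td>
        <xsl:apply-templates/>
      </td>
    </xsl:template>

    <!-- Inside the prod template: process only the "starter" children, i.e.
         lhs, an rhs not immediately preceded by lhs, or a side-condition
         element not immediately preceded by an rhs -->
    <xsl:apply-templates select="lhs
        | rhs[not(preceding-sibling::*[1][self::lhs])]
        | *[(self::com or self::wfc or self::vc)
            and not(preceding-sibling::*[1][self::rhs])]"/>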

      is an axis, n is a node test,andqi are predicates.The responding to the same elements, the cell templates are execution of a path consists of taking the sequence of only invoked by instructions at the end of starter tem- nodes specified by the axis a and successively pruning plates, like this one in the starter template for rhs: it to contain only the nodes satisfying both the node sisting of the current node, child, that gives the chil- dren of the current node, and preceding-sibling and More detailed explanation of BNF formatting in the following-sibling that give the corresponding sibling stylesheet can be found in [20, 21]. elements of the current node. The preceding-sibling axis produces the nodes in reverse document order, i.e. the closest sibling comes first. A node test n is either an 4.4 Observations element name (as in, e.g., self::lhs), which leaves the node in the result only if the node’s name is the same The path expressions from the BNF formatting task * child::* as the test’s, or a wildcard (as in ), which is shown above are quite complicated—expressions of q satisfied by any node. A predicate can be numeric or such complexity rarely appear in document-oriented 1 boolean. A numeric predicate specifies a -based index stylesheets and their occurrences seem to indicate pro- of the node to be selected from the current sequence. cessing of the islands of structured data embedded in- E.g., the path preceding-sibling::*[1] selects the clos- side documents. The XPATH fragment needed for han- est sibling preceding the current node in the document dling structured data is more complicated and difficult (or the empty sequence if the current node is the first to master, we believe, than regular patterns, but it can child of its parent). A boolean predicate is built, using be learned. But even knowing this fragment, the major traditional boolean connectives and, or and not,fromel- difficulty for someone trying to understand how BNF for- ementary predicates, which coincide with path expres- matting works in the XSLT stylesheet comes from the fact sions. When interpreted as a predicate, a path expression that processing of a contiguous piece of data has to be is false when it returns the empty sequence, and is true distributed across several non-contiguous pieces of code, otherwise. connections between which are only loosely indicated. By contrast, the ability of XTATIC’s match statement to Taking these explanations into account, one can see keep together inspection and transformation of a piece why the above select expression restricts operation of of data constituting a logical unit allowed us to write xsl:apply-templates to elements that would start a new processing methods (Figure 5) whose responsibilities can row in the HTML table. Technically, the path selects (by be clearly specified in terms of their input-output behav- child::*)allchildrenofprod that are (according to the ior and whose code explicitly indicates dependencies as following predicate) either the lhs element, or an rhs method calls. element not immediately preceded by the lhs,oraside condition element not immediately preceded by an rhs. Another small convenience available with regular pat- The templates that get invoked on the elements so se- terns but not with XPATH pathsistheabilitytonametype lected are starter templates, since they, as well as the fragments and later reference them in patterns. 
For ex- xsl:apply-templates instruction, do not specify a mode ample, our definition of the type xs_constr_mix could attribute. Since mode is specified by cell templates cor- have improved clarity of the later patterns, where it is used multiple times, while no similar XPATH shortcut is step expression s, and a general path expression p is ei- available for [self::vc or self::wfc or self::com], ther a step s, or an expression of the form p/s. which is also used several times in the stylesheet.

5 Gathering Global Information

      The data model of XSLT is more complex than the one Here, @def is the value of the wfc’s def attribute, and of XTATIC, supporting the notion of a document as a id() is a built-in XSLT function that, given a token, re- container of interconnected nodes and a correspond- turns the node of the current element that carries a so- ing assortment of basic operations that take advantage called ID attribute with the token as its value. In this of the richer data model. Several parts of the XML- example, id() returns the above wfcnote element, and SPEC stylesheet rely on these additional XSLT features. the following XPATH expression extracts the contents of This section explains how we handled these tasks in its head child. XTATIC, sometimes finding a generic reusable solution, other times relying on properties specific to XMLSPEC. To replicate the functionality of id() in XTATIC,ourfor- matter explicitly builds an index datastructure that maps In XTATIC, XML values are lightweight, immutable, share- IDs to elements. Fortunately, this indexing procedure is able trees, which must be inspected in a top-down fash- completely generic: our implementation is encapsulated ion. By contrase, given a node in XSLT, one can re- in an IdIndex class that can be reused in other appli- trieve the root of the document it belongs to, explore the cations requiring similar ID support. Its usage consists of document in any direction—including towards ancestors creating an IdIndex object, say idindex, at the beginning and siblings—and randomly access nodes that have been of processing by passing the document’s root element to marked by special ID attributes, which are specified to the IdIndex constructor and then using method calls like be globally unique within a valid document. Supporting idindex.Id(x) to retrieve elements from the internally all this structure makes run-time representations of XSLT maintained index. values more heavyweight, but it also provides behind- the-scenes infrastructure for several common document- A synthesized reference is yet more complicated: its processing tasks that require information about the doc- HTML formatting contains computed data not directly ument as a whole. These include generation of section present in the source document. For example, the el- numbers, creating the table of contents, and formatting ement uses the cross-references. An XTATIC version of the XMLSPEC for- ID mechanism discussed above to point to the sectional matter has to handle these tasks by explicitly computing element that starts as follows: a good deal of information that is automatically provided to a stylesheet by the XSLT run-time system. Predefined Entities

      The XMLSPEC cross-referencing elements can be classi- The HTML formatting of this reference, fied into three groups, depending on the computational needs of their formatting: “hard-wired,” “fetched,” and “synthesized.” [4.6 Predefined Entities]

      The XMLSPEC element for a hard-wired reference includes a computed section number. like XML documents contains all the data that needs to ap- The XSLT stylesheet computes the section number by pear in its HTML representation, which is XML documents . Such references andtheninvokingonittheinstruction are straightforward to process both in XSLT and XTATIC. put document to which the reference points. The (which appears to have been specially designed for this wfc vc elements and (which appeared in Section 4) purpose!). This instruction uses the specifications in its points with its def attribute to the element To approximate the behavior of this instruction, we create another index, encapsulated by the class NumberIndex, No External Entity References that, for each sectional element in the document, maps

      Attribute values cannot contain the element’s ID to the section’s number. Again, our im- external entity references.

      plementation is re-usable: the index’s creation is param- eterized by a boolean function that recognizes section- formingelements,andbyanobjectoftheCountkeeper andshouldbeformattedasanHTMLanchor class that provides an ADT for keeping track of hierar- chical section numbers and formatting them as strings. No External These parameters roughly mimic the above three param- Entity References eter attributes of xsl:number. We use another instance of NumberIndex to keep track of the sequence numbers of whose contents are a heading fetched from the wfcnote BNF grammar productions. element. The stylesheet obtains the heading with the XSLT instruction

      10 The correct operation of NumberIndex depends on the ex- parisons between XTATIC and a number of related lan- istence of a unique identity for each sectional element— guage designs; we refer the interested reader to these something that comes for free from the data model in the existing discussions, rather than repeating them here. XSLT version. We use the id attribute, when one is pro- vided, for this purpose. Otherwise (in XMLSPEC, id is Most of the recent crop of statically typed XML processing optional on div elements), we create the identity by con- languages have been tested on non-trivial applications, catenating the words from the (mandatory) head element but only rarely have these experiences been recorded in located under the div. In general, this does not guaran- print. A notable exception is the XQUERY Use Cases[7]— tee uniqueness. However, the stylesheet uses the same a collection of small examples specifically designed to il- trick to generate hyperlinks from the table of contents to lustrate typical tasks for which XQUERY is expected to be titles in the main body, so we assume it is sufficient for the used. Although they were created to illustrate the ca- present application. With more effort, it should be pos- pabilities of particular features of XQUERY rather than sible to implement the interface of NumberIndex so that to address a particular large application, they reflect the it mimics the xsl:number instruction with better fidelity, practical experience of the XQUERY editors and cover a but we did not yet pursue this direction. usefully diverse set of small transformation and extrac- tion tasks. In the absence of more practical benchmarks, Besides reference formatting, computed sequence num- the XQUERY use cases have been used to demonstrate ca- bers stored in NumberIndex objects are used when creat- pabilities of competing technologies, such as XSLT and ing section titles and grammar production entries while CQL [3]. formatting the body of the document, as well as dur- ing creation of the table of contents. This differs from Kay’s book [20, 21], which suggested the XMLSPEC ap- the stylesheet, which invokes the xsl:number instruc- plication for our project, contains two more case studies tion anew whenever a number needs to be generated— of substantial XSLT applications: HTML-based browsing indeed, a naive XSLT implementation could end up re- of structured genealogical data and an XSLT solution to peating the same computation many times. the classic problem of a knight’s tour of the chessboard. A brief overview of typing errors from more than a dozen The table of contents itself is created by a separate doc- real-life XSLT stylesheets appears in [30]. ument traversal, after the creation of the indexes but before formatting the document. It is implemented by 7Conclusions three nested foreach loops, one for each of the three sec- div1 div2 div3 tional levels ( , , ) that need to be reflected XSLT and XTATIC are quite different animals. XSLT is in the table of contents. This almost literally repeats the a high-level language with processing model founded code in the stylesheet, which also uses explicit traversal in structural recursion, path-based XML manipulation xsl:for-each ( instructions) for this task. primitives, and no static type system. 
XTATIC extends a general-purpose programming language with a more fa- Each of the tasks discussed in this section (creating the miliar imperative processing model and processes XML table of contents and the indexes for IDs, section num- using regular patterns, which are tightly coupled to its bers, and production numbers) requires a pass over the static XML type system. The experience described in this document in addition to the main formatting pass. We paper shows that, like XSLT, XTATIC is well suited for im- experimented with combining some of these traversals plementing at least some document-processing applica- (e.g., formatting the table of contents concurrently with tions, and that, unlike XSLT, its flexible static XML type the body of the specification, or creating all the indexes system is capable of exposing a range of design and im- together), but concluded that increased complexity of the plementation errors and facilitating fixes. We have also application code did not justify the minor efficiency gains. observed that, even in this single application, there are some practical programming tasks that are much better Overall, we have been satisfied, in this application, with served by XSLT than by XTATIC and some where XTATIC is the ability of the general-purpose facilities of XTATIC significantly better than XSLT. (those it inherits from C )tosimulatethewhole- document features of XSLT, even without direct support Designing a strictly typed structural recursion to match from XTATIC’s XML data model. the behavior of an existing untyped implementation turned out to be a surprisingly labor-intensive process. The discussion in Section 3 demonstrates that mimicking 6 Related Work the implicit structural recursion of XSLT by explicit re- cursive code is possible, even while ensuring strict static Further details about XTATIC can be found in several typing, smoothing architectural incompatibilities of the earlier papers. The core language design is presented input and output DTDs, and maintaining a clean pro- in [14], which shows how to integrate the object and tree gram organization that bears a close resemblance to the data models and establishes basic soundness results. A original XSLT code. Unfortunately, the details of the so- technique for compiling regular patterns based on match- lution described there required significant effort to dis- ing automata is described in [26] and extended to include cover, over multiple cycles of trial and error. A major type-based optimization in [28]. The run-time system difficulty that one faces during type debugging is finding of XTATIC is described in [12]. A critical evaluation of answers to lots of questions about relationships between the main language design choices can be found in [13]. types from a large collection that a DTD like XMLSPEC These papers, particularly [13], also offer detailed com- or XHTML constitutes. We hope that reading about our

We hope that reading about our experience could help programmers facing similar typing tasks to find their solutions faster. It is likely, however, that the amount and difficulty of work needed to figure out a correct solution can be intimidating and prohibitive for a typical XML-literate C# programmer in a typical project. If worst comes to worst, Xtatic lets one escape typing quandaries (and postpone discovery of typing problems till run time) by using the generic xml type and unsafe casts. Taking these difficulties into account, the refined typing of Xtatic might be considered overkill for one-time-use scripts, where it may be easier to just fix the bugs upon running into their manifestations. On the other hand, the benefits of early error discovery and safety guarantees of well-typed code—compared to the current mainstream technologies where only testing is available—can outweigh the development costs in projects aiming to create reusable document processing tools.

Another conclusion from our experience is that XSLT templates—especially in their simplest form, unburdened by other XSLT features like non-downward paths—are a very convenient approach to programming structural recursion. A template-like construct for implementing local structural recursion (i.e., traversals that can be explicitly applied to chosen document fragments as opposed to being a carrier of the whole program's computation) would be a very useful addition to XML processing languages with explicit control flow. It would be necessary, however, for this construct to be accompanied by expressive and flexible typing rules that minimally burden the programmer.

However, having discussed these difficulties with Xtatic, we should also emphasize that XSLT, for its part, turned out to be convoluted (or worse) when faced with the need to deviate from straightforward structural recursion. For the most significant typing bugs discussed in Section 3, we do not see how they could be eliminated from the XSLT stylesheet in a natural and type-safe way without revising the XMLSPEC DTD; also, processing of structured data (Section 4) is much trickier in XSLT.

It would also be interesting to see how the original XMLSPEC stylesheet might be adapted to a statically typed variant of XSLT. Even though XSLT 1.0 [9] is officially untyped, there is now a proposal [30] and implementation of data-flow based typechecking of XSLT stylesheets. The draft of XSLT 2.0 [22] describes only a dynamic type system, leaving possible static variants to the discretion of implementations.

Some of the observations made in this paper in the context of Xtatic and XSLT may be applicable to other XML programming languages. The advantages of regular patterns over paths for processing structured data (Section 4) would also hold in XDuce and CDuce, which also use patterns as the primary data inspection mechanism, compared to XQuery, XJ, XACT, or Cω, which use paths. Any language in the XDuce family will have to mimic the implicit structural recursion of XSLT by explicit traversal via mutually recursive functions, as we did in Section 3. And since obtaining statically typed transformations in these languages requires specifying types of functions, they are likely to experience similar difficulties statically typing this recursion. It is possible, however, that other features found in current descendants of the original XDuce, such as Hosoya's regular pattern filters [16] and CDuce's overloaded functions with dynamic dispatch, could mitigate some of the difficulties. On the other hand, we suspect that the nominal character of the type systems of XQuery, XJ, and Cω might complicate the task. XACT, by contrast, might have an easier time with typing structural recursion, thanks to its typechecking via data flow analysis, which requires typing specifications only for inputs and outputs of whole programs. (A new paper on XACT, presented at this workshop, introduces optional type annotations [24] with the goal to improve modularity of typechecking; these might interfere with the ease of the task in question.) Programs in all these languages, except XJ, XQuery, and, possibly, Cω, would need to maintain auxiliary data structures for global document information similar to ours in Section 5, since all of them have chosen light-weight shared tree representations for XML data. These are, of course, only speculations on our part—real observations can only come from implementations of similar applications in these languages.

Acknowledgments

We are grateful to Michael Levin and Alan Schmitt, our collaborators on the Xtatic design and implementation, for numerous discussions about practical programming in Xtatic, and in particular to Alan Schmitt for comments on an earlier draft of the paper. Remarks from the PLAN-X reviewers were very helpful in revising the paper.

Work on Xtatic has been supported by the National Science Foundation under Career grant CCR-9701826 and ITR CCR-0219945, and by gifts from Microsoft.

8 References

[1] S. Abiteboul, P. Buneman, and D. Suciu. Data on the Web: From Relations to Semistructured Data and XML. Morgan Kaufmann, 2000.

[2] V. Benzaken, G. Castagna, and A. Frisch. CDuce: An XML-centric general-purpose language. In ACM SIGPLAN International Conference on Functional Programming (ICFP), Uppsala, Sweden, pages 51–63, 2003.

[3] V. Benzaken, G. Castagna, and C. Miachon. A full pattern-based paradigm for XML query processing. In Practical Aspects of Declarative Languages (PADL), Long Beach, CA, volume 3350 of LNCS, pages 235–252. Springer, Jan. 2005.

[4] S. Boag, D. Chamberlin, M. F. Fernández, D. Florescu, J. Robie, and J. Siméon. XQuery 1.0: An XML Query Language. Working draft, W3C, Sept. 2005. http://www.w3.org/TR/xquery/.

[5] P. Buneman, M. Fernandez, and D. Suciu. UnQL: A Query Language and Algebra for Semistructured Data Based on Structural Recursion. VLDB Journal, 9(1):76–110, 2000.

[6] P. Buneman, S. Naqvi, V. Tannen, and L. Wong. Principles of programming with complex objects and collection types. Theoretical Computer Science, 149(1):3–48, September 1995.

[7] D. Chamberlin, P. Fankhauser, D. Florescu, M. Marchiori, and J. Robie. XML Query use cases. Working draft, W3C, Sept. 2005. http://www.w3.org/TR/xquery-use-cases/.

[8] A. S. Christensen, C. Kirkegaard, and A. Møller. A runtime system for XML transformations in Java. In Z. Bellahsène, T. Milo, Michael Rys, et al., editors, Database and XML Technologies: International XML Database Symposium (XSym), volume 3186 of Lecture Notes in Computer Science, pages 143–157. Springer, Aug. 2004.

[9] J. Clark. XSL Transformations (XSLT) Version 1.0. Recommendation, W3C, Nov. 1999. http://www.w3.org/TR/xslt.

[10] D. Draper, P. Fankhauser, M. F. Fernández, A. Malhotra, K. Rose, M. Rys, J. Siméon, and P. Wadler. XQuery 1.0 and XPath 2.0 Formal Semantics. Working draft, W3C, Sept. 2005. http://www.w3.org/TR/xquery-semantics/.

[11] A. Frisch, G. Castagna, and V. Benzaken. Semantic subtyping. In IEEE Symposium on Logic in Computer Science (LICS), 2002.

[12] V. Gapeyev, M. Y. Levin, B. C. Pierce, and A. Schmitt. XML goes native: Run-time representations for Xtatic. In 14th International Conference on Compiler Construction, Apr. 2005.

[13] V. Gapeyev, M. Y. Levin, B. C. Pierce, and A. Schmitt. The Xtatic experience. In Workshop on Programming Language Technologies for XML (PLAN-X), Jan. 2005. University of Pennsylvania Technical Report MS-CIS-04-24, Oct. 2004.

[14] V. Gapeyev and B. C. Pierce. Regular object types. In European Conference on Object-Oriented Programming (ECOOP), Darmstadt, Germany, 2003. A preliminary version was presented at FOOL '03.

[15] M. Harren, M. Raghavachari, O. Shmueli, M. G. Burke, R. Bordawekar, I. Pechtchanski, and V. Sarkar. XJ: facilitating XML processing in Java. In International World Wide Web Conference, pages 278–287, 2005.

[16] H. Hosoya. Regular expression filters for XML. In Workshop on Programming Language Technologies for XML (PLAN-X), 2004.

[17] H. Hosoya and B. C. Pierce. Regular expression pattern matching. In ACM SIGPLAN–SIGACT Symposium on Principles of Programming Languages (POPL), London, England, 2001. Full version in Journal of Functional Programming, 13(6), Nov. 2003, pp. 961–1004.

[18] H. Hosoya and B. C. Pierce. XDuce: A statically typed XML processing language. ACM Transactions on Internet Technology, 3(2):117–148, May 2003.

[19] H. Hosoya, J. Vouillon, and B. C. Pierce. Regular expression types for XML. ACM Transactions on Programming Languages and Systems (TOPLAS), 27(1):46–90, Jan. 2005. Preliminary version in ICFP 2000.

[20] M. Kay. XSLT Programmer's Reference. Wrox, 2nd edition, 2003.

[21] M. Kay. XSLT 2.0 Programmer's Reference. Wrox, 3rd edition, 2004.

[22] M. Kay. XSL Transformations (XSLT) Version 2.0. Working draft, W3C, Sept. 2005. http://www.w3.org/TR/xslt20.

[23] M. Kempa and V. Linnemann. On XML objects. In Workshop on Programming Language Technologies for XML (PLAN-X), 2003.

[24] C. Kirkegaard and A. Møller. Type checking with XML Schema in Xact. Technical Report RS-05-31, BRICS, Sept. 2005. Presented at PLAN-X 2006.

[25] C. Kirkegaard, A. Møller, and M. I. Schwartzbach. Static analysis of XML transformations in Java. IEEE Transactions on Software Engineering, 30(3):181–192, Mar. 2004.

[26] M. Y. Levin. Compiling regular patterns. In ACM SIGPLAN International Conference on Functional Programming (ICFP), Uppsala, Sweden, 2003.

[27] M. Y. Levin. Run, Xtatic, Run: Efficient Implementation of an Object-Oriented Language with Regular Pattern Matching. PhD thesis, University of Pennsylvania, 2005.

[28] M. Y. Levin and B. C. Pierce. Type-based optimization for regular patterns. In First International Workshop on High Performance XML Processing, 2004.

[29] E. Maler. Guide to the W3C XML specification ("XMLspec") DTD, version 2.1. Technical report, W3C Consortium, 1998. http://www.w3.org/XML/1998/06/xmlspec-report-v21.htm.

[30] A. Møller, M. O. Olesen, and M. I. Schwartzbach. Static validation of XSL Transformations. Technical Report RS-05-32, BRICS, Oct. 2005.

Type Checking with XML Schema in XACT

Christian Kirkegaard∗ and Anders Møller
BRICS†, Department of Computer Science
University of Aarhus, Denmark
{ck,amoeller}@brics.dk

∗ Supported by the Carlsberg Foundation contract number 04-0080.
† Basic Research in Computer Science (www.brics.dk), funded by the Danish National Research Foundation.

Abstract

XACT is an extension of Java for making type-safe XML transformations. Unlike other approaches, XACT provides a programming model based on XML templates and XPath together with a type checker based on data-flow analysis.

We show how to extend the data-flow analysis technique used in the XACT system to support XML Schema as type formalism. The technique is able to model advanced features, such as type derivations and overloaded local element declarations, and also datatypes of attribute values and character data. Moreover, we introduce optional type annotations to improve modularity of the type checking. The resulting system supports a flexible style of programming XML transformations and provides static guarantees of validity of the generated XML data.

Categories and Subject Descriptors

D.3.3 [Programming Languages]: Language Constructs and Features; D.2.4 [Software Verification]: Validation; I.7.2 [Document and Text Processing]: Computing Methodologies

General Terms

Languages, Design, Verification

Keywords

XML, XML Schema, Java, language design, static analysis

1 Introduction

The overall goal of the XACT project is to integrate XML into general-purpose programming languages, in particular Java, such that programming of XML transformations can become easier and safer than with the existing approaches. Specifically, we aim for a system that supports a high-level and flexible programming style, permits an efficient runtime model, and has the ability to statically guarantee validity of generated XML data.

In previous papers, see [15, 14], we have presented the first steps of our proposal for a system that fulfills these requirements. Our language, XACT, is an extension of Java where XML fragments can be manipulated through a notion of XML templates using XPath for navigation. Static guarantees of validity are provided by a special data-flow analysis that builds on a lattice structure of summary graphs.

The existing XACT system has two significant weaknesses: first, it only supports DTD as schema language, and it is generally agreed that DTD has insufficient expressiveness for modern XML applications; second, the data-flow analysis is a whole-program analysis that has poor modularity properties and hence does not scale well to larger programs. In this paper, we present an approach for attacking these issues.

Contributions

We have previously shown a connection between summary graphs and regular expression types [4, 10]. Also, it is known how regular expression types are related to RELAX NG schemas [7] and how schemas written in XML Schema [22, 2] can be translated into equivalent RELAX NG schemas [12]. We exploit these connections in this paper. Our main contributions are the following:

• We present a translation from XML Schema to summary graphs and an algorithm for validating summary graphs relative to schemas written in XML Schema, all via RELAX NG. This provides the foundation for using XML Schema as type formalism in XACT.

• We introduce optional typing in XACT so that XML template variables can be optionally typed with schema constructs (element names and simple or complex types). We show how this can lead to a validity analysis which is more modular, in the sense that it avoids iterating over the whole program.

Together, these improvements effectively remedy the weaknesses mentioned earlier. Furthermore, the results can be seen as indications of the strength of summary graphs and the use of data-flow analysis for validating XML transformations.

As an additional contribution, we identify a subset of RELAX NG that is sufficient for translation from XML Schema and where language inclusion checking is tractable.

Example

The resulting XACT language can be illustrated by a small toy program that uses the new features. This program converts a list of business cards represented in a special XML language into XHTML, considering only the cards where a phone number is present:

      14 import dk.brics.xact.*; through a list of card elements that have phone children and builds import java.io.*; an XHTML list. The method main loads in an XML document containing a list of business card, invokes the setWrapper method, public class PhoneList { then constructs a complete XHTML document by plugging values static { into the TITLE and MAIN gaps using the makeList method, and fi- String[] ns = {"b", "http://businesscard.org", nally outputs this document. "h", "http://www.w3.org/1999/xhtml", As an example, the program transforms the input "s", "http://www.w3.org/2001/XMLSchema"}; XML.setNamespaceMap(ns); } John Doe XML wrapper; [email protected] (202) 555-1414 void setWrapper(String color) { wrapper = [[ Zacharias Doe [email protected] <[s:string TITLE]> Jack Doe <[s:string TITLE]> [email protected] <[h:Flow MAIN]> [email protected] (202) 456-1414 ]]; }

      XML makeList(XML x) { into an XHTML document that looks as follows: XML r = [[<[CARDS]>]]; XMLIterator i = x.select("//b:card[b:phone]").iterator(); while (i.hasNext()) { XML c = i.next(); r = r.plug("CARDS", [[ <{c.select("b:name/text()")}>, Note that some XML variables in the program are declared by phone: <{c.select("b:phone/text()")}> the type XML, which represents all possible XML templates, and oth- <[CARDS]>]]); ers use a more constrained type, such as, the declaration of wrapper } or the signature of makeList. XACT now allows the programmer to return r; combine these two approaches. The static type checker uses data- } flow analysis to reason about variables that are declared using the former approach, and it conservatively checks that the annotated XML transform(String url) { types are preserved by execution of the program. For this program, XML cardlist = XML.get(url, "b:cardlist"); one consequence is that the makeList method, whose signature is setWrapper("white"); fully annotated, can be type checked separately, and invocations of return wrapper.plug("TITLE", "My Phone List") this method can be type checked without considering its body. (We .plug("MAIN", makeList(cardlist)); } discuss fields and side-effects in Section 7.) Also note that the type checker can now reason about XML Schema types rather that being public static void main(String[] args) { limited to DTD. XML x = new PhoneList().transform(args[0]); System.out.println(x); } Related Work } There are numerous other projects having similar goals as XACT; the paper [19] contains a general survey of different approaches. The general syntax for XML template constants and the meaning of The ones that are most closely related to ours are XJ [9], Cω [1], the methods select, plug, get, and various others are described and XDuce and its descendants [10]. XACT is notably differ- further in Section 2. ent in two ways: first, although variants of XML templates are In the first part of the program, some global namespace decla- widely used in Web application development frameworks, this rations are made. Schemas for these namespaces are supplied ex- paradigm is not supported by other type-safe XML transformation ternally (the schema for the business card XML language is shown languages, which typically allow only bottom-up XML tree con- in Section 3). Then a field wrapper is defined, holding an XML struction; second, the annotation overhead is minimal since schema template that must be an html tree, potentially with TITLE gaps types are only required at input and output, whereas the others re- and MAIN gaps, which may occur in place of fragments of type quire schema type annotations at all XML variable declarations. We string and Flow, respectively (all of appropriate namespaces). believe that both aspects in many cases makes the XACT program- The method setWrapper assigns such an XML template to the ming style more flexible. Furthermore, our data-flow analysis also wrapper field. This template has two gaps named TITLE and one tracks all Java string operations via a separate analysis [6], which named MAIN. Additionally, it has one code gap where the value of enables XACT to reason about validity of attribute values and char- the color parameter is inserted. The method makeList iterates acter data. (In fact, an additional consequence of the extensions

      15 described here is that our static analyzer can also model computed NG, called Restricted RELAX NG, that we will use as an intermedi- names of elements and attributes.) ate language in the program analysis. Then, in Section 4 we intro- With the extensions proposed in this paper, XACT becomes duce a variant of summary graphs. In Sections 5 and 6 we explain closer to XJ [9], which also uses XML Schema as type formalism how schemas written in XML Schema can be converted into sum- and XPath for navigation. Still, our use of optional type annotations mary graphs via Restricted RELAX NG, how to check validity of avoids a problem that can make the XJ type checker too rigid: with summary graphs relative to Restricted RELAX NG schemas, and mandatory type annotations at all variable declarations in XJ it is how these results can be used in XACT to provide static guarantees impossible to type check a sequence of operations that temporarily of XML transformations. In Section 7 we introduce optional typing invalidates data. The types that are involved in XML transforma- using XML Schema constructs and discuss the resulting language tions are often exceedingly complicated and difficult to write down, design. Finally, we present our conclusions in Section 8. and types for intermediate results often do not correspond to named Note that we here report on work in progress, and not all of what constructs in preexisting schemas. The benefits of type annotations we present has yet been implemented and tested in practice so we are that they can serve as documentation in the programs and they cannot at this stage present experimental results. Also, the limited can lead to faster type checking. By now supporting optional anno- space prevents us from going into details of our algorithms and of tations, XACT gets the best from the two worlds. the systems we build upon—instead, this paper aims to present an Moreover, XJ represents XML data as mutable trees, which in- informal overview of our ideas. curs a need for expensive runtime checks to preserve data validity. In XJ, subtyping is nominal, whereas our approach gives semantic (or structural) subtyping. A discussion of subtyping can be found 2 The XACT Programming Language in [8]. Note that although XML Schema does contain mechanisms We begin with a brief overview of the XACT language as it looks be- for declaring subtyping relationships nominally, the choice of sup- fore adding our new extensions. In XACT, XML data is represented porting XML Schema as type formalism in XACT does not force as templates, which are well-formed XML fragments that may con- us to use nominal subtyping. We use schemas only as notation for tain gaps in place of elements or attribute values. A gap is either a defining sets of XML values—the internal structure of the notation name or a piece of code that evaluates to a string or an XML tem- being used is irrelevant. plate. As an example, the following XML template contains four The XDuce language family is based on the notion of regular ex- gaps: two named TITLE, one named MAIN, and one containing the pression types. As mentioned earlier, a connection between regular expression color: expression types and a variant of the summary graphs used in our program analysis is shown in [4]. Also, the formal expressiveness of regular expression types and RELAX NG both correspond to that of regular tree languages. We return to these relations in Sections 5 <[TITLE]> and 6. 
As XACT, the XTATIC language [8], which is one of the descendants of XDuce, incorporates XML into an object-oriented language in an immutable style. <[TITLE]> The Cω language adds XML support to C] by combining struc- <[MAIN]> tural sequences, unions, and products with objects and simple val- ues. The basic features of XML Schema may be encoded in the type system, however little documentation of this is available. Rather The special immutable class XML corresponds to the set of all pos- than use full XPath for navigation in XML trees as in XACT, Cω sible XML templates. The central operations on this class are the uses a reminiscent notion of generalized member access that is following: closer to ordinary programming notation. The paper [18] describes a validity analysis for XSLT transfor- constant: a static method that creates a template from mations, which is also based on summary graphs. The techniques a constant string (the syntax [[foo]] is sugar for we present here for handling XML Schema as type formalism can XML.constant("foo") where quotes, whitespace, and gaps be transferred seamlessly to that analysis. have been transformed); The type annotations we introduce are reminiscent of the notion of programmer–designer contracts proposed in [3]. In both cases, plug: inserts a given string or template into all gaps of a given static declarations constrain how XML templates may be combined name in this template; in the programs. select: returns the sub-templates of this template that are se- The paper [20] contains a useful classification of schema lan- lected by a given XPath expression; guages in terms of categories of tree grammars: DTD corresponds get: a static method that creates a template from a non-constant to local tree grammars where the content model of an element can string and checks (at runtime) that it is valid relative to a given only depend on the name of the element; XML Schema corresponds constant schema type; to the larger category of single-type tree grammars where elements that are siblings and have the same name must have identical con- cast: performs a runtime check of validity of this template rela- tent models; and RELAX NG corresponds to the even more gen- tive to a given constant schema type; eral category of regular tree grammars, which is equivalent to tree analyze: instructs the static type checker to verify that this tem- automata. With our new results, XACT supports single-type tree plate will always be valid relative to a given schema type when grammars as type formalism. the program runs; and toString: converts this template to its textual representation. Overview A schema type is the name of an element (or, with our extension In Sections 2 and 3 we begin by briefly recapitulating the design of from DTD to XML Schema, a simple type or a complex type) that XACT and RELAX NG, and we characterize a subset of RELAX is declared in a schema. The language of a schema type is defined

      16 as the set of XML documents or document fragments that are valid A Restricted RELAX NG schema satisfies the following syntac- relative to the schema type. Note that in this version of XACT, be- tic requirements: fore incorporating the extensions suggested in this paper, schema types appear only at get, cast, and analyze operations. In partic- [single-type grammar] For every element pattern p, any two ular, declarations use the general type XML. element patterns that are top-level-contained by the child of p The primary job of the static type checker is to verify that only and have non-disjoint name classes must have the same (iden- valid XML data can occur at program locations marked by analyze tical) content. (This requirement limits the notation to single- operations, under the assumption that get and cast operations al- type grammars.) ways succeed. (It also checks properties of plug and select oper- ations, which is less relevant here.) [attribute context insensitivity] No attribute list pattern can be a choice pattern. Also, every optional attribute list pattern must have an attribute pattern as child. (This requirement 3 Defining a Subset of RELAX NG prohibits context sensitive attribute patterns.) [interleaved content] Every pattern that has a child that top-level A RELAX NG schema [7] is essentially a top-down tree automa- contains an interleave content pattern must be a group or ton that accepts a set of valid XML trees. It is described by element pattern. Also, a group pattern that top-level con- a grammar consisting of recursively defined patterns of various tains an interleave content pattern must have only one con- kinds, including the following: element matches one element tent pattern child. (This requirement makes it easier to check with a given name and with contents and attributes described by inclusion of interleave patterns, as explained in Section 6.) a sub-pattern; attribute similarly matches an attribute; text matches any character data or attribute value; group, optional, We here consider ref patterns as abbreviations of the patterns be- zeroOrMore, oneOrMore, and choice correspond to concatena- ing referred to. For every element and optional pattern that has tion, zero or one occurrence, zero or more occurrences, one or more more than one child pattern, we treat the children as implicitly en- occurrences, and union, respectively; empty matches the empty se- closed by a group pattern. (Also, all mixed patterns are implicitly quence of nodes; and notAllowed corresponds to the empty lan- desugared to interleave patterns in the usual way.) interleave guage. In addition, the pattern matches all possible Restricted RELAX NG has two important properties: first, it mergings of the sequences that match its sub-patterns. is sufficient for making an exact and simple embedding of XML Note that attributes are described in the same expressions as the Schema; second, it makes the summary graph validation in Sec- content models. Still, attributes are considered unordered, as al- tion 6 more tractable than using XML Schema directly or support- ways in XML, and syntactic restrictions prevent an attribute name ing full RELAX NG. from occurring more than once in any element. Mixing attributes The following schema written in XML Schema may be used to and contents in this way is useful for describing attribute–element describe the input to the example program shown in Section 1: constraints. 
To ensure regularity, there is an important restriction on recursive which can consist of lists of possible names and wildcards that match all names, potentially restricted to a certain namespace or excluding specific names. To describe datatypes more precisely than with the text pattern, part of XML Schema. Using the data pattern, such datatypes can be referred to, and datatype facets can be constrained by a parameter mechanism. Furthermore, RELAX NG contains various modularization mechanisms, which we can ignore here. As all other type-safe XML transformation languages, we also ignore ID and IDREF at- tributes from DTD and the equivalent compatibility features in RE- LAX NG. As mentioned in the introduction, we handle XML Schema via a intermediate language that avoids the many complicated technical NG, which we call Restricted RELAX NG, being characterized as follows. First, we define some terminology that we need. We say that a pattern p top-level-contains a pattern q if p and q are identical or p contains q (as a child or further descendant) where contents of Assuming cardlist as root element name, this can be translated element and attribute patterns are ignored. A content pattern into the following Restricted RELAX NG schema (here using the is a pattern that top-level contains one or more element, data, or compact RELAX NG syntax): text patterns (or list or value patterns, which we otherwise ig- nore here for simplicity). An attribute list pattern is a pattern that default namespace = "http://businesscard.org" top-level contains one or more attribute patterns.

      17 start = element cardlist { card* } XML templates, much like a schema but tailor-made for use in the card = element card { card_type } program analysis [15]. Second, when the fixed point has been com- card_type = element name { xsd:string }, puted, we check that the sets of templates represented by the result- element email { xsd:string }+, ing summary graphs are valid relative to the respective schemas. element phone { xsd:string }? To allow a smooth integration of XML Schema as a replacement for DTD, we slightly modify the definition of summary graphs as The translation from XML Schema to Restricted RELAX NG is explained below and change the summary graph validation algo- exact and the size of the output schema is proportional to the size rithm accordingly and to work with Restricted RELAX NG (the of the input schema. Most XML Schema constructs map directly to old algorithm supported DTD via an embedding into DSD2 [17]). RELAX NG, and we will not here explain the details of the trans- A summary graph, as it is defined in [15], has two parts: one that lation. However, a few points are worth mentioning. is set up for the given program and remains fixed during the iterative First, the all construct maps to the interleave pattern. Be- data-flow analysis, and one that changes monotonically during the cause of the limitations on the use of all in XML Schema, this analysis. does not violate the [interleaved content] requirement. The fixed part contains finite sets of nodes of various kinds: el- Second, we can ignore default declarations since we only care ement nodes (N ), attribute nodes (N ), chardata nodes (N ), and about validation and not of normalization of the input—except that E A C template nodes (N ). These node sets are determined by the use we treat an attribute or content model as optionally absent if a de- T of schemas, template constants, and XML operations in the program. fault is declared. The former three sets represent the possible elements, attributes, Third, wildcards can be converted into name classes. If and chardata sequences that may arise when running the program. processContents of an element wildcard is set to skip, then we The template nodes represent sequences of template gaps, which make a recursive pattern that matches any XML tree. either occur explicitly in template constants or implicitly due to Fourth, the most tricky parts of the translation involve type XML operations or schemas. Additionally, the fixed part speci- derivations and substitution groups. Assume that an element e has fies a number of maps: name assigns a name to each element node type t and there exists a type t0 that is derived by extension from t. and attribute node; attr : N → 2NA associates attribute nodes with In this case, an occurrence of e must match either t or t0, and in the E element nodes; contents : N → N connects element nodes with latter case e must have a special attribute xsi:type with the value E T descriptions of their contents; and gaps : N → G∗ associates a se- t0 (in the former case, the attribute is permitted but not required). T quence of gap names from a finite set G with each template node. We handle this situation by encoding the xsi:type information in The changing part of a summary graph consist of: the element name. More precisely, we create a new element pattern 0 whose name is the name of e followed by the string %t and whose • a set of root nodes R ⊆ N ∪ N ; content corresponds to the definition of t0. 
Each reference to e is E T then replaced by a choice between e and the variants with extended • template edges T ⊆ N × G × (N ∪ NE ∪ NC ); types. The xsi:nil feature is handled similarly. Now assume that T T 0 another element f has type t and is declared as in the substitution • string edges S : NC ∪ NA → REG where REG are all regular group of e. This means that f elements are permitted in place of string languages over the Unicode alphabet; and e elements. In Restricted RELAX NG, this is expressed simply by replacing all references to e elements by choices of e and f ele- • a gap presence map P : G → 2NA ∪NT ×2NA ∪NT ×Γ×Γ where ments. Again, because of limitations on the all construct and the Γ = 2{OPEN,CLOSED}. substitution group mechanism in XML Schema, this cannot lead to violations of the [single-type grammar] requirement, nor of the The language of a summary graph is intuitively the set of XML tem- general RELAX NG requirement that interleave branches must plates that can be obtained by unfolding it, starting from a root node be disjoint. and plugging elements, templates, and strings into gaps according By the translation to Restricted RELAX NG, a schema type cor- to the edges. A template edge (n1,g,n2) ∈ T informally means that responds to a pattern definition: n2 may be plugged into the g gaps in n1, and a string edge S(n) = L means that every string in L may be plugged into the gap in n. The • a simple type corresponds to a pattern, which we call a gap presence map, which we will not explain in further detail here, simple-type pattern, that can only contain the constructs data, is needed during the data-flow analysis to determine where template choice, list, and value; gaps and attribute gaps occur. (For the curious reader, this is all for- malized in [15].) We also define the language of an individual node • a complex type corresponds to a pattern, which we call a n in a summary graph: this is simply the language of the modified complex-type pattern, that consists of a group of two sub- summary graph where R is set to {n}. patterns—one describing a content model and one describing As an example (borrowed from [15]), we can define a sum- attributes; and mary graph whose language is the set of ul lists with zero or more • an element declaration corresponds to an element pattern that li items that each contain a string from some language L. As- , contains a simple-type pattern or a complex-type pattern. sume that the fixed structure is given by NE = {1 4}, NA = 0/, NT = {2,3,5} (where all three are sequence nodes), NC = {6}, We use these observations in Section 6. contents(1) = 2, contents(4) = 5, attr(1) = attr(4) = 0/, name(1) = {ul}, name(4) = {li}, gaps(2) = items, gaps(3) = g · items, and gaps(5) = text. The remaining components are as follows: 4 Summary Graphs in Validity Analysis R = {1} The static type checker in XACT works in two steps. First, a data- T = {(2,items,3),(3,items,3),(3,g,4),(5,text,6)} flow analysis of the whole program is performed, using the standard S(6) = L data-flow analysis framework [11] but with a highly specialized lat- tice structure where abstract values are summary graphs. A sum- (For simplicity, we ignore the gap presence map.) This can be illus- mary graph is a finite representation of a potentially infinite set of trated as follows:

      18 items summary graph that has the same language. In [15], it is shown 1 2 3 4 5 6 items g text how this can be done for DTD schemas; we now present a modified ul items g items li text L algorithm that supports Restricted RELAX NG and then rely on the items items translation from XML Schema to Restricted RELAX NG to map from schema types to patterns. The boxes represent element nodes, rounded boxes are template Intuitively, this translation is straightforward: we may simply nodes, the circle is a chardata node, and the dots represent poten- view summary graphs as a graphical representation of Restricted tially open template gaps. RELAX NG patterns, provided that we ignore the gap presence For a given program, the family of summary graphs forms a component of the summary graphs and the regularity requirement finite-height lattice, which is used in the data-flow analysis. To de- in Restricted RELAX NG. Due to the connection between RELAX termine the regular string languages used in the string edges, we NG and regular expression types, this translation can also be seen use a separate program analysis that provides conservative approx- as a variant of the translation between regular expression types and imations of the possible values of all string expression in the given summary graphs shown in [4]. program [6]. Given a Restricted RELAX NG pattern, we construct a summary We now introduce two small modifications to the definition of graph fragment as follows: summary graphs: • First, we observe that name classes and simple-type patterns 1. We let the name function return a regular set of names, rather all define regular string languages2. Namespaces are han- than a single name. This will be used to more easily model dled by expanding qualified names according to the applicable name classes in Restricted RELAX NG. The definition of un- namespace declarations. folding is generalized accordingly: unfolding an element node n yields an element whose name can be any string in name(n), • For an element pattern, we exploit the syntactic restrictions and similarly for attribute nodes. In case an unfolding leads to described in Section 4. An element pattern generally consists an element with two attributes of the same name, one of them of a name class, a content model, and a collection of attribute is chosen arbitrarily and overrides the other. declarations. Thus, we convert it to an element node e and a To accommodate attribute declarations that have infinite name template node t with contents(e) = t. We define name(e) as classes and are repeated using zeroOrMore or oneOrMore, the regular string language corresponding to the name class. we define the unfolding of an attribute node n where name(n) The attribute declarations are converted recursively into at- is infinite such that it may produce more than one attribute. tribute nodes (as explained below), and attr(e) is set accord- ingly. The content models is converted recursively into a sum- 2. We distinguish between two kinds of template nodes: se- mary graph fragment rooted by t. quence nodes and interleave nodes. The former have the meaning of the old template nodes; the latter will be used to • An attribute pattern is converted into an attribute node a. model interleave patterns. We define the unfolding of an We define name(a) in the same way as for element patterns, interleave node as all possible interleavings of the unfoldings and S(a) is set to the regular string language corresponding to of its gaps. 
the sub-pattern describing the attribute values. If the attribute is declared as optional using the optional pattern, the gap The data-flow transfer functions for operations remain as ex- presence map is set to record this (as in [15]). plained in [15] with only negligible changes as consequence of the modifications of the summary graph definition, the only exceptions • For patterns describing content models of elements, the pat- being the ones we address in the following section. terns text, group, optional, zeroOrMore, oneOrMore, Reflecting the [interleaved content] requirement in Restricted choice, and empty are handled exactly as the equivalent con- RELAX NG, interleave nodes never appear nested within content structs in DTD content model definitions in the way explained 1 model descriptions . The translation from Restricted RELAX NG in [15]. Intuitively, each pattern corresponds to a tiny sum- to summary graphs presented in the next section and the transfer mary graph fragment that unfolds to the same language. A functions maintain this property of interleave nodes as an invariant. data pattern becomes a chardata node s where S(s) is the cor- With the generalization of the name function, we can in fact now responding regular string language. The interleave pattern easily model computed names of elements and attributes—provided is translated in the same way as group, except that an inter- that we add operations for this in the XML class, of course, and we leave node is used instead of a sequence node. leave that to future work. • Finally, the notAllowed pattern can be modeled as a template node t where gaps(t) = g for some gap name g and t has no 5 A Translation from Restricted RELAX NG outgoing template edges. to Summary Graphs The set of root nodes R contains the single node that corresponds to To define the transfer functions for the operations get and cast, the whole pattern being translated. Recursion in pattern definitions we need an algorithm for translating the given schema type into a simply results in loops in the summary graph. The constructs from RELAX NG that we have omitted in the description in Section 3 1 To state this more precisely, we first define that a node A top- can be handled in a similar way as those mentioned here. Note that level-contains a node B if A and B are identical or B is reachable the translation is exact: the language of the pattern is the same as from A where contents of element nodes and attribute nodes are ig- the language of the resulting summary graph. nored, and a content node is a node that top-level-contains at least As an example, translating the pattern one element node or chardata node. We now require the following: every node that has a child that top-level contains an interleave con- 2We here ignore a few constraining facets that may be used on tent node must be a sequence or element node, and a sequence node the datatypes float and double. These are uncommon cases that that top-level contains an interleave content node must have only can be accommodated for without losing precision by slightly aug- one content node child. menting the definition of string edges.

      19 element ul { element li { xsd:integer }* } in Ln also occur in one of the sub-patterns. If n is an interleave node, we use a generalized product construction to check inclusion results in the summary graph shown in Section 4, assuming that L (specifically, the shuffleSubsetOf operation in [21]). is the language of strings that match xsd:integer. To avoid redundant computations (and to ensure termination, in case of loops in the summary graph or recursive definitions in the schema) we apply memoization such that a given pair (n, p) is only 6 Validating Summary Graphs processed once. If a loop is detected, we can coinductively assume that the inclusion holds. When the data-flow analysis has computed a summary graph for With this algorithm, we check for each root node n ∈ R that its each XML expression in the XACT program, we check for each language is included in the language of the pattern corresponding analyze operation that the language of its summary graph is in- to the given schema type. cluded in the language of the specified schema type. If the check As an example of the case with an element node and an element fails, appropriate validity warnings are emitted. The entire analysis pattern, let n be element node 1 in the summary graph from Sec- is sound: if no validity warnings show up, the programmer can be tion 4 and let p be the pattern shown in Section 5: sure that, at runtime, the XML values that appear at the program points marked by analyze operations will be valid relative to the p = element ul { element li { xsd:integer }* } given schema types. The old summary graph analyzer used in XACT is described in The context-free grammar for the contents of Ln has the following [5]. That algorithm, which supports DTD through an embedding productions (where N2 is the start nonterminal and N4 is the only into DSD2, as mentioned earlier, has proven successful in practice. terminal): We here describe a variant that works with Restricted RELAX NG items instead of DSD2. N2 → N2 items Given a summary graph node n ∈ N ∪ N and a Restricted RE- N2 → N3 | ε E T g items LAX NG pattern p where p is an element pattern, a simple-type N3 → N N g 3 3 pattern, or a complex-type pattern (as defined in Section 3), we wish N3 → N4 to determine whether the language of n is included in the language items N3 → N3 | ε of p. We begin by considering the case where n is not an interleave This grammar is linear, so the regular approximation is not applied. node and p is not an interleave pattern. First, a context-free The pattern p contains a single sub-pattern grammar C is constructed from the part of the summary graph that is top-level contained by n, considering element and chardata nodes p0 = element li { xsd:integer } as terminals, template nodes as nonterminals, and ignoring attribute nodes. Each chardata node terminal c is then replaced by a regular and by recursively comparing node 4 and p0 we find out that the grammar equivalent to S(c). If C is not linear, we apply a regu- language of node 4 is included in the language of p0. We now see lar over-approximation [16] (which we also use in [6]). Thus, we that Ln ⊆ Lp, so we conclude that the language of element node 1 have a regular string language Ln over element nodes and Unicode is in fact included in the language of the pattern. characters that describes the possible unfoldings of n (ignoring at- With the exception of the regular approximation of the context- tributes). 
Similarly, p defines a regular string language Lp over free grammars mentioned above, the inclusion check is exact. Also, element patterns and Unicode characters. To obtain a common vo- since the schemas already define only regular languages, the ap- 0 0 cabulary, we now replace each element node n in Ln by hname(n )i proximation can only cause a loss of precision if the XML transfor- (where h and i are some otherwise unused characters), and similarly mation defined by the XACT program introduces non-regularity in for the element patterns in Lp. Then, we check that Ln is included the summary graphs, and our experience from [15] and [5] indicate in Lp with standard techniques for regular string languages. (This that this rarely results in false errors. In particular, the trivial iden- works because of the restriction to single-type grammars.) If this tity function, which inputs XML data using get with some schema check fails, a suitable validity error message is generated. Other- type and immediately after applies analyze with the same schema 0 0 wise, for each pair (n , p ) of an element node in Ln and an element type, is guaranteed to type check without warnings for any schema 0 0 pattern in Lp where name(n ) and name(p ) are non-disjoint, we type. Moreover, we could replace the approximation by an algo- perform two checks. First, we check recursively that the language rithm that checks inclusion of a context-free language in a regular of contents(n0) is included in the language of the content model of language, if full precision is considered more important than per- p0. Second, we check that the attributes of n0 match those of p0: formance. for each attribute node a ∈ attr(n0), each name x ∈ name(a), and An obvious alternative approach to the algorithm explained each value y ∈ S(a), a corresponding attribute pattern must oc- above would be to exploit the connection with regular expression cur in p0—that is, one where x is in the language of its name class types and apply the results from the XDuce project for checking and y is in the language of its sub-pattern; also, attribute pat- subtyping between general regular expression types [10] or to build terns occurring in p0 that are not enclosed by optional patterns on Antimirov’s algorithm as in the XOBE project [13]. Our main must correspond to one of the non-optional attribute nodes. Again, argument for choosing the algorithm explained above is that it has a suitable validity error message is generated if the check fails. been shown earlier that this approach is efficient for XACT pro- For interleave nodes and interleave patterns, we exploit the grams. Also, unlike [23], our algorithm behaves much like existing restriction on these constructs: they cannot appear nested within XML Schema validators, but validating summary graphs instead of content model descriptions. Additionally, in RELAX NG, the sub- individual XML documents. Still, the relation between these differ- patterns of an interleave pattern must be disjoint (that is, no el- ent inclusion checking algorithms is worth a further investigation. ement name or text pattern occurs in more than one sub-pattern). 
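Continuing in the same illustrative style, and reusing the node classes from the sketch above, the following skeleton shows the shape of the memoized, coinductive inclusion check described in this section; RngPattern and the helper methods marked as assumed stand in for the Restricted RELAX NG representation and the automata-based language checks, and none of this is real XACT API.

import java.util.*;

// Skeleton of the inclusion check: is the language of summary-graph node n
// included in the language of Restricted RELAX NG pattern p? The regular
// approximation, name-class tests, and attribute checks are hidden behind
// assumed helper methods with trivial placeholder bodies.
class RngPattern {
    int id;
    RngPattern contentModel() { return this; }  // placeholder accessor
}

class InclusionChecker {
    private final Set<String> inProgress = new HashSet<>();
    private final Map<String, Boolean> memo = new HashMap<>();

    boolean included(SGNode n, RngPattern p) {
        String key = n.id + "#" + p.id;
        Boolean cached = memo.get(key);
        if (cached != null) return cached;
        if (!inProgress.add(key)) return true;         // loop detected: coinductively assume inclusion
        boolean ok = contentLanguageSubset(n, p);       // Ln included in Lp over element-name symbols (assumed)
        if (ok) {
            for (ElementNode child : elementNodesIn(n))       // element nodes top-level-contained by n (assumed)
                for (RngPattern q : elementPatternsIn(p))     // element patterns occurring in Lp (assumed)
                    if (namesNonDisjoint(child, q))           // name-class overlap test (assumed)
                        ok = ok && included(child.contents, q.contentModel())
                                && attributesMatch(child, q); // attribute check (assumed)
        }
        inProgress.remove(key);
        memo.put(key, ok);
        return ok;
    }

    // Placeholders for the automata-based machinery described in the text.
    boolean contentLanguageSubset(SGNode n, RngPattern p) { return true; }
    boolean namesNonDisjoint(ElementNode e, RngPattern q) { return false; }
    boolean attributesMatch(ElementNode e, RngPattern q)  { return true; }
    List<ElementNode> elementNodesIn(SGNode n)            { return Collections.emptyList(); }
    List<RngPattern> elementPatternsIn(RngPattern p)      { return Collections.emptyList(); }
}

With the check over the root nodes in R added on top, this is the step that produces the validity warnings at analyze operations.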
As an interesting side-effect of our approach, we get an inclu- Thus, if p is an interleave pattern, we simply test each sub- sion checker for Restricted RELAX NG and hence also for XML pattern in turn, projecting Ln onto the element names occurring in Schema and DTD: given two schemas, S1 and S2, convert S1 to a the sub-pattern, and then check that all element names occurring summary graph SG using the algorithm described in Section 5 and

      20 then apply the algorithm presented above on SG and S2. (Alter- with the declared return type, and for method invocations every ac- natively, the algorithm presented above could be modified to work tual parameter value is compared with the corresponding declared directly with Restricted RELAX NG schemas instead of summary parameter type. Moreover, every plug operation must respect gap graphs.) Preliminary results indicate that our approach is efficient: tags, that is, the value being plugged in to a gap g must belong to on a standard PC, our implementation finds in a few seconds the the language of the tag of g. elements in XHTML 1.0 Transitional that are invalid according to The following describes a modification of our existing static pro- XHTML 1.0 Strict (and conversely, it reveals that Strict does not gram analysis to support checking of the extra constraints intro- imply Transitional, to our surprise). For schemas that go beyond lo- duced by annotations. cal tree grammars and use type derivations and all model groups, First, the abstract representation of sets of XML templates is we observe a similarly acceptable performance. Moreover, the val- extended to also keep track of the declared schema types of gap idator provides precise error messages in case validation fails. names. For a given XACT program, we let T denote the finite set of As an interesting bonus feature, our validator can trivially be ex- all types mentioned by gap annotations in template constants, and tended to precisely check element prohibitions (for example, that we introduce a new summary graph component D : G → T map- form elements must not contain form elements in XHTML): in ping gap names to their declared type. The language of a summary XACT, we already have a technique for evaluating XPath loca- graph is not affected by this change. tion paths on summary graphs, and element prohibitions can be This leads to extending the data-flow transfer function for the expressed as (simple) XPath location paths. constant operation to generate a summary graph with mappings D(g) = T for every gap occurring in the given XML tem- plate constant. (A simple syntactical check ensures that in each 7 Optional Type Annotations template constant all gaps of the same name are declared with iden- tical schema types.) The transfer function for the plug operation We will now extend XACT with optional type annotations such simply unions the D mappings of its arguments. (Conflicts are that programmers may declare the intended schema types for XML avoided by a check mentioned below.) All other transfer functions template variables, method parameters, and return values. Besides act as the identity on the new D component. being useful as in-lined documentation of programmer intentions, To ensure type consistency of variables declared with annotated type annotations can lead to better modularity properties of the va- XML types, we must validate all assignments to such variables. We lidity analysis. check, using the validation algorithm described in Section 6, that Every XML type may now optionally be annotated in the follow- the language of the inferred summary graph for the right-hand ,..., ,..., ing way where S and T1 Tn are schema types and g1 gn are side of an assignment is a subset of the language permitted by the gap names: schema type annotation. However, this inclusion check is modi- fied to treat gaps as if they were plugged with values correspond- XML ing to their declared types. 
More precisely, for every gap g in the inferred summary graph we apply the algorithm described in Sec- The semantics of an annotated type is the language described by tion 5 to construct a summary graph fragment SGg corresponding S under the assumption that every occurrence of gap gi has been to the schema type D(g) and then add template edges from all oc- plugged with a value in the language of schema type T . i currences of g to the roots of SGg. In XML template constants, every template gap must now have To ensure type consistency of template gaps, we perform an ad- the form <[T g]>, where T is a schema type and g is the gap name. ditional check of every x.plug(g,y) operation using the summary This allows us to, at runtime, tag each gap g in an XML template graphs SGx and SGy inferred by the data-flow analysis for x and y, with a schema type. respectively. First, we check that the language of SGy is a subset of In gap annotations in XML declarations and template constants, the language of Dx(g) declared for g in SGx using the inclusion al- we permit Kleene star of a schema type, T*, meaning that the gap gorithm presented in Section 6. Then, we check that all gap names can be filled with a sequence of values from the language of T. h occurring in both SGx and SGy are declared with identical types, Kleene star annotations are occasionally needed because we cannot that is, Dx(h) = Dy(h). always find existing schema types for sequences of values. As an As a product of the guaranteed type consistency of variables de- example, the XML Schema description of XHTML has no named clared with annotated XML types, reading from a variable can now content type describing a sequence of li elements. Theoretically, use the declared type instead of the inferred one. More precisely, we could permit type annotations to be arbitrary regular expressions for every read from an XML typed variable x we normally use an over schema types or even small inlined XML Schema fragments, inferred summary graph to describe the set of possible template but we have not yet observed the need for this. values at that program point, but now, since all assignments to x Every assignment of an XML template v to a variable x whose have already been checked for validity with respect to the declared type annotation is t = S[T1 g1,...,Tn gn] must, at runtime, satisfy schema type for x, we can instead apply the algorithm from Sec- three constraints: tion 5 to obtain the summary graph corresponding to the declared schema type. • All gaps occurring in v must be declared in t. Note that the support for type annotations leads to a program- • For every gap g occurring in v, the language of its type tag ming style where the explicit analyze operation is rarely needed— must be included in the language of the schema type for g as instead, one may request a static type check by assigning to an an- declared in t. notated variable. This is the style required in other XML transfor- mation languages. • The value v must, under the assumption that all gaps were It is well-known that type annotations in programming languages plugged according to their type tags, belong to the language enable more modular type checking. A component, whose interface of S. is fully annotated, can be type checked independently of its context, and type checking the context can be performed without consider- We put similar constraints on return statements and method invoca- ing the body of the component. 
In our setting, this, for example, tions, except that for return statements the return value is compared corresponds to methods where all XML typed parameters and return

types are annotated, and further, every non-local assignment and read within the method body involves fields declared with annotated types (the latter to constrain side-effects through field variables). As discussed in Section 1, annotations also have drawbacks; in XACT, however, type annotations are optional. This allows the programmer to mix annotated and unannotated XML types to get the best from both worlds.

8 Conclusion

We have presented an approach for generalizing the XACT system to support XML Schema as type formalism and permit optional type annotations. Compared with other programming languages for type-safe XML transformations, type annotations are permitted but not mandatory, which allows the programmer to balance between the pros and cons of type annotations.

The extension to XML Schema takes advantage of connections between XML Schema, RELAX NG, and summary graphs. In particular, it involves a tractable subset of RELAX NG that we use as an intermediate language in the static analysis.

The ideas presented in this paper will become available in the next version of the XACT implementation.

References

[1] Gavin Bierman, Erik Meijer, and Wolfram Schulte. The essence of data access in Cω. In Proc. 19th European Conference on Object-Oriented Programming, ECOOP '05, volume 3586 of LNCS. Springer-Verlag, July 2005.

[2] Paul V. Biron and Ashok Malhotra. XML Schema part 2: Datatypes second edition, October 2004. W3C Recommendation. http://www.w3.org/TR/xmlschema-2/.

[3] Henning Böttger, Anders Møller, and Michael I. Schwartzbach. Contracts for cooperation between Web service programmers and HTML designers. Journal of Web Engineering, 5(1), 2006.

[4] Aske Simon Christensen, Anders Møller, and Michael I. Schwartzbach. Static analysis for dynamic XML. Technical Report RS-02-24, BRICS, May 2002. Presented at Programming Language Technologies for XML, PLAN-X '02.

[5] Aske Simon Christensen, Anders Møller, and Michael I. Schwartzbach. Extending Java for high-level Web service construction. ACM Transactions on Programming Languages and Systems, 25(6):814–875, 2003.

[6] Aske Simon Christensen, Anders Møller, and Michael I. Schwartzbach. Precise analysis of string expressions. In Proc. 10th International Static Analysis Symposium, SAS '03, volume 2694 of LNCS, pages 1–18. Springer-Verlag, June 2003.

[7] James Clark and Makoto Murata. RELAX NG specification, December 2001. OASIS. http://www.oasis-open.org/committees/relax-ng/.

[8] Vladimir Gapeyev, Michael Y. Levin, Benjamin C. Pierce, and Alan Schmitt. The Xtatic experience. Technical Report MS-CIS-04-24, University of Pennsylvania, October 2004. Presented at Programming Language Technologies for XML, PLAN-X '05.

[9] Matthew Harren, Mukund Raghavachari, Oded Shmueli, Michael G. Burke, Rajesh Bordawekar, Igor Pechtchanski, and Vivek Sarkar. XJ: Facilitating XML processing in Java. In Proc. 14th International Conference on World Wide Web, WWW '05, pages 278–287. ACM, May 2005.

[10] Haruo Hosoya and Benjamin C. Pierce. XDuce: A statically typed XML processing language. ACM Transactions on Internet Technology, 3(2):117–148, 2003.

[11] John B. Kam and Jeffrey D. Ullman. Monotone data flow analysis frameworks. Acta Informatica, 7:305–317, 1977. Springer-Verlag.

[12] Kohsuke Kawaguchi. Sun RELAX NG Converter, April 2003. http://www.sun.com/software/xml/developers/relaxngconverter/.

[13] Martin Kempa and Volker Linnemann. Type checking in XOBE. In Proc. Datenbanksysteme für Business, Technologie und Web, BTW '03, volume 26 of LNI, February 2003.

[14] Christian Kirkegaard, Aske Simon Christensen, and Anders Møller. A runtime system for XML transformations in Java. In Proc. Second International XML Database Symposium, XSym '04, volume 3186 of LNCS. Springer-Verlag, August 2004.

[15] Christian Kirkegaard, Anders Møller, and Michael I. Schwartzbach. Static analysis of XML transformations in Java. IEEE Transactions on Software Engineering, 30(3):181–192, March 2004.

[16] Mehryar Mohri and Mark-Jan Nederhof. Robustness in Language and Speech Technology, chapter 9: Regular Approximation of Context-Free Grammars through Transformation. Kluwer Academic Publishers, 2001.

[17] Anders Møller. Document Structure Description 2.0, December 2002. BRICS, Department of Computer Science, University of Aarhus, Notes Series NS-02-7. Available from http://www.brics.dk/DSD/.

[18] Anders Møller, Mads Østerby Olesen, and Michael I. Schwartzbach. Static validation of XSL Transformations. Technical Report RS-05-32, BRICS, 2005.

[19] Anders Møller and Michael I. Schwartzbach. The design space of type checkers for XML transformation languages. In Proc. Tenth International Conference on Database Theory, ICDT '05, volume 3363 of LNCS, pages 17–36. Springer-Verlag, January 2005.

[20] Makoto Murata, Dongwon Lee, and Murali Mani. Taxonomy of XML schema languages using formal language theory. In Proc. Extreme Markup Languages, August 2001.

[21] Anders Møller. dk.brics.automaton – finite-state automata and regular expressions for Java, 2005. http://www.brics.dk/automaton/.

[22] Henry S. Thompson, David Beech, Murray Maloney, and Noah Mendelsohn. XML Schema part 1: Structures second edition, October 2004. W3C Recommendation. http://www.w3.org/TR/xmlschema-1/.

      22 [23] Akihiko Tozawa and Masami Hagiya. XML Schema contain- ment checking based on semi-implicit techniques. In Proc. 8th International Conference on Implementation and Appli- cation of Automata, CIAA ’03, volume 2759 of LNCS, July 2003.

      23 PADX : Querying Large-scale Ad Hoc Data with XQuery

      ∗ Mary Fernandez´ Robert Gruber Yitzhak Mandelbaum Kathleen Fisher Google Princeton University AT&T Labs Research [email protected] [email protected] {mff,kfisher}@research.att.com

      Name :Use Representation to roll their own tools, leading to scenarios such as the following. Web server logs (CLF): Fixed-column ASCII records An analyst receives a new ad hoc data source containing poten- Measure web workloads tially interesting information and a list of pressing questions about AT&T provisioning data: Variable-width ASCII records that data. Could she please provide the answers to the questions Monitor service activation as quickly as possible, preferably last week? The accompanying Call detail: Fraud detection Fixed-width binary records documentation is outdated and missing important information, so AT&T billing data: Various Cobol data formats she first has to experiment with the data to discover its structure. Monitor billing process Eventually, she understands the data well enough to hand-code a Netflow: Data-dependent number of parser, usually in C or PERL. Pressed for time, she interleaves code Monitor network performance fixed-width binary records to compute the answers to the supplied questions with the parser. Newick: Immune Fixed-width ASCII records As soon as the answers are computed, she gets a new data source system response simulation in tree-shaped hierarchy and a new set of questions to answer. Gene Ontology: Variable-width ASCII records Gene-gene correlations in DAG-shaped hierarchy Through her heroic efforts, the data analyst answered the neces- CPT codes: Medical diagnoses Floating point numbers sary questions, but the approach is deficient in many respects. The SnowMed: Medical clinic notes Keyword tags analyst’s hard-won understanding of the data ended up embedded Figure 1. Selected ad hoc data sources. in a hand-written parser, where it is difficult for others to benefit from her understanding. The parser is likely to be brittle with re- spect to changes in the input sources. Consider, for example, how Abstract tricky it is to figure out which $3’s should be $4’s in a PERL parser when a new column appears in the data. Errors in the data also pose This paper describes our experience designing and implementing a significant challenge in hand-coded parsers. If the data analyst thoroughly checks for errors, then the error checking code dom- PADX, a system for querying large-scale ad hoc data sources with inates the parser, making it even more difficult to understand the XQuery. PADX is the synthesis and extension of two existing sys- semantics of the data format. If she is not thorough, then erroneous tems: PADS and Galax. With PADX, an analyst writes a declarative data description of the physical layout of her ad hoc data, and the data can escape undetected, potentially (silently!) corrupting down- stream processing. Finally, during the initial data exploration and PADS compiler produces customizable libraries for parsing the data and for viewing it as XML. The resulting library is linked with an in answering the specified questions, the analyst had to code how to XQuery engine, permitting the analyst to view and query her ad hoc compute the questions rather than being able to express the queries data sources using XQuery. in a declarative fashion. Of course, many of these pitfalls can be avoided with careful design and sufficient time, but such luxuries are not available to the analyst. However, with the appropriate tool 1 Introduction support, many aspects of this process can be greatly simplified.

      Although enormous amounts of data exist in “well-behaved” for- We have two tools, PADS [2, 8] and Galax [1, 7], each of which mats such as XML and relational databases, massive amounts also addresses aspects of the analyst’s problem in isolation. The PADS exist in non-standard or ad hoc data formats. Figure 1 gives some system allows analysts to describe ad hoc data sources declara- sense of the range and pervasiveness of such data. Ad hoc data tively and then generates error-aware parsers and tools for ma- comes in many forms: ASCII, binary, EBCDIC, and mixed for- nipulating the sources, including statistical profiling tools. Such mats. It can be fixed-width, fixed-column, variable-width, or even support allows the analyst to produce a robust, error-aware parser tree-structured. It is often quite large, including some data sources quickly. The Galax system supports declarative querying of XML that generate over a gigabit per second [6]. It frequently comes with via XQuery. If Galax could be applied to ad hoc data, it would al- incomplete and/or out-of-date documentation, and there are almost low the analyst first to explore the data and then to produce answers always errors in the data. Sometimes these errors are the most in- to her questions. teresting aspect of the data, e.g., in log files where errors indicate that something is going wrong in the associated system. In this work, we strive to integrate PADS and Galax to solve the an- alyst’s data-management problems for the large ad hoc data sources The lack of standard tools for processing ad hoc data forces analysts that we have seen in practice. One approach would be to have PADS ∗ produce a tool for converting ad hoc data to XML and then ap- Work carried out while at AT&T Labs Research.

      24 Ad Hoc Data Data Description can contain more than 2.2GB of data per week. Description Compiler The summaries store the processing date and one record per order. Each order record contains a header followed by a nested sequence Generated of events. The header has 13 pipe separated fields: the order num- Data Parsing & ber, AT&T’s internal order number, the order version, four different XML Viewing telephone numbers associated with the order, the zip code, a billing Ad Hoc Data Libraries identifier, the order type, a measure of the complexity of the order, Queries PADX Query Corral an unused field, and the source of the order data. Many of these Query Results fields are optional, in which case nothing appears between the pipe Figure 2. Data analyst’s view of PADX characters. The billing identifier may not be available at the time of processing, in which case the system generates a unique identi- fier, and prefixes this value with the string “no ii” to indicate the number was generated. The event sequence represents the various ply Galax to the resulting document. (In fact, PADS provides this states a service order goes through; it is represented as a new-line ability.) However, the typical factor of eight space blow up in this terminated, pipe separated list of state, timestamp pairs. There are conversion yields an unacceptable slowdown in performance. Con- over 400 distinct states that an order may go through during provi- sequently, we chose to design and implement PADX1, a synthesis sioning. It may be apparent from this description that English is a and extension of PADS and Galax. Figure 2 depicts PADX from the poor language for describing data formats! analyst’s perspective. The analyst provides a PADS description of her ad hoc source, which is compiled into a library of components The analyst’s first task is to write a parser for the Sirius data format. for parsing her data and for viewing and querying it as XML. The Like many ad hoc data sources, Sirius data can contain unexpected resulting libraries are linked together with the PADS and Galax run- or corrupted values, so the parser must handle errors robustly to time systems into one PADX query executable, called a “query cor- ral.2” At query time, the analyst provides her ad hoc data sources avoid corrupting the results of analyses. With PADS, the analyst writes a declarative data description of the physical layout of her and her query written in XQuery, and PADX produces the query’s results. data. The language also permits the analyst to describe expected semantic properties of her data so that deviations can be flagged as errors. The intent is to allow an analyst to capture in a PADS Building PADX presented several problems. The first was semantic: We had to decide how to view ad hoc data as XML and how to description all that she knows about a given data source. express this view as a mapping from the PADS type system to XML Schema, the basis of XQuery’s type system. A second problem Figure 4 gives the PADS description for the Sirius data format. In PADS descriptions, types are declared before they are used, so the involved systems design and engineering. Building PADX required type that describes the entire data source, summary_t, appears at evolving PADS and Galax in parallel, modifying the implementation of Galax to support an abstract data model so that Galax could view the bottom of the description (Line 42). In the next section, we use this example to give an overview of the PADS language. 
Here, non-XML sources as XML, and augmenting PADS with the ability to generate concrete instances of this data model. Our solutions to we simply note that the data analyst writes this description, and these problems, which were necessary to build a working system, the PADS compiler produces customizable C libraries and tools for are described in Sections 3 and 4. A third problem involves the scale parsing, manipulating, and summarizing the data. The fact that use- of data and efficiency of queries, in particular, how to efficiently ful software artifacts are generated from PADS descriptions provides strong incentive for keeping the descriptions current, allowing them evaluate complex queries over large sources. Section 5 describes to serve as living documentation. how PADX currently handles large sources and the problems that we face with respect to data scale and query performance. Analysts working with ad hoc data often want to query their data. Questions posed by the Sirius analyst include “Select all orders We begin with a more detailed account of a scenario that illustrates starting within a certain time window,” “Count the number of orders the data management tasks faced by AT&T data analysts and how going through a particular state,” and “What is the average time re- PADX simplifies these tasks. We then crack open the PADX architec- quired to go from a particular event state to another particular event ture, first describing PADS and Galax in isolation, and then describ- state”. Such queries are useful for rapid information discovery and ing our solutions to the problems described above. We conclude for vetting errors and anomalies in data before that data proceeds to with related work and a discussion of open problems. a down-stream process or is loaded into a database.

      1.1 Data-management scenario With PADX, the analyst writes declarative XQuery expressions to query her ad hoc data source. Because XQuery is designed to ma- In the telecommunications industry, the term provisioning refers to nipulate semi-structured data, its expressiveness matches ad hoc the process of converting an order for phone service into the ac- data sources well. As a Turing-complete language, XQuery is pow- tual service. This process is complex, involving many interactions erful enough to express all the questions above. For example, Fig- with other companies. To discover potential problems proactively, ure 5 contains an XQuery expression that produces all orders that the Sirius project tracks AT&T’s provisioning process by compil- started in October, 2004. In Section 4, we discuss in more detail ing weekly summaries of the state of certain types of phone service why XQuery is an appropriate query language for ad hoc data. One orders. These summaries, which are stored in flat ASCII text files, benefit is that XQuery queries may be statically typed, which helps detect common errors at compile time. For example, static typing 1Pronounced “paddocks”, an enclosed area for exercising race would raise an error if the path expression in Figure 5 referred to horses. ordesr instead of orders, or if the analyst erroneously compared 2The equestrian metaphor is intentional: Getting these systems the timestamp value in tstamp to a string. to work together is like corralling race horses!

      25 0|15/Oct/2004:18:46:51 9152|9152|1|9735551212|0||9085551212|07988|no_ii152272|EDTF_6|0|APRL1|DUO|10|16/Oct/2004:10:02:10 9153|9153|1|0|0|0|0||152268|LOC_6|0|FRDW1|DUO|LOC_CRTE|1001476800|LOC_OS_10|17/Oct/2004:08:14:21 Figure 3. Tiny example of Sirius provisioning data.

      1. Precord Pstruct summary_header_t { 2UsingPADS to Access Ad Hoc Data 2. "0|"; 3. Punixtime tstamp; In this section, we give a brief overview of PADS, focusing on its 4. }; data description language and the portions of the libraries it gen- erates that are relevant to PADX. More information about PADS is 5. Pstruct no_ramp_t { available [2, 8]. 6. "no_ii"; 7. Puint64 id; 8. }; 2.1 PADS: The language

      9. Punion dib_ramp_t { A PADS specification describes the physical layout and semantic 10. Pint64 ramp; properties of an ad hoc data source. The language provides a 11. no_ramp_t genRamp; type-based model: basic types specify atomic data such as inte- 12. }; gers, strings, dates, etc., while structured types describe compound data built from simpler pieces. The PADS library provides a collec- 13. Pstruct order_header_t { 14. Puint32 order_num; tion of useful base types. Examples include 8-bit signed integers 15. ’|’; Puint32 att_order_num; (Pint8), 32-bit unsigned integers (Puint32), IP addresses (Pip), 16. ’|’; Puint32 ord_version; dates (Pdate), and strings (Pstring). By themselves, these base 17. ’|’; Popt pn_t service_tn; types do not provide sufficient information for parsing because they 18. ’|’; Popt pn_t billing_tn; do not indicate how the data is coded, i.e., in ASCII, EBCDIC, or 19. ’|’; Popt pn_t nlp_service_tn; binary. To resolve this ambiguity, PADS uses the ambient coding. 20. ’|’; Popt pn_t nlp_billing_tn; By default, the ambient coding is ASCII, but programmers can cus- 21. ’|’; Popt Pzip zip_code; tomize it as appropriate. 22. ’|’; dib_ramp_t ramp; 23. ’|’; Pstring(:’|’:) order_type; 24. ’|’; Puint32 order_details; To describe more complex data, PADS provides a collection of struc- 25. ’|’; Pstring(:’|’:) unused; tured types loosely based on C’s type structure. In particular, PADS 26. ’|’; Pstring(:’|’:) stream; has Pstructs, Punions, and Parrays to describe record-like 27. }; structures, alternatives, and sequences, respectively. Penumsde- scribe a fixed collection of literals, while Popts provide convenient 28. Pstruct event_t { syntax for optional data. A type may have an associated predicate 29. Pstring(:’|’:) state; that determines whether a parsed value is indeed a legal value for 30. ’|’; Punixtime tstamp; the type. For example, a predicate might require that one field of 31. }; a Pstruct is bigger than another or that the elements of a se- 32. Parray event_seq_t { quence are sorted. Programmers can specify such predicates using 33. event_t[] : Psep(’|’) && Pterm(Peor); PADS expressions and functions, written in a C-like syntax. Finally, 34. }; PADS Ptypedefs allow programmers to define new types that add further constraints to existing types. 35. Precord Pstruct order_t { 36. order_header_t order_header; PADS types can be parameterized by values. This mechanism re- 37. ’|’; event_seq_t events; duces the number of base types and permits the format and proper- 38. }; ties of later portions of the data to depend upon earlier portions. For example, the base type Puint16_FW(:3:) specifies an unsigned 39. Parray orders_t { 40. order_t[]; two byte integer physically represented by exactly three characters, 41. }; while the type Pstring(:’|’:) (e.g., Line 29) describes a string terminated by a vertical bar. Parameters can be used with compound 42. Psource Pstruct summary_t{ types to specify the size of an array or the appropriate branch of a 43. summary_header_t summary_header; union. 44. orders_t orders; 45. }; Pstructs describe ordered sequences of data with unrelated types. In Figure 4, the type declaration for the Pstruct order_t Figure 4. PADS description for Sirius provisioning data. (Lines 35Ð38) contains an order header (order_header_t)fol- lowed by the literal character ’|’, followed by an event sequence (event_seq_t). PADS supports character, string, and regular ex- (: Return orders started in October 2004 :) pression literals. 
$pads/Psource/orders/elt[events/elt[1] [tstamp/rep >= xs:dateTime("2004-10-01:00:00:00") Punions describe alternatives in the data format. For example, and tstamp/rep < xs:dateTime("2004-11-01:00:00:00")]] the dib_ramp_t type (Lines 9Ð12) indicates that the ramp field in a Sirius record can be either a Puint_64 or a string "no_ii" followed Figure 5. Query applied to Sirius provisioning data. by a Puint_64. During parsing, the branches of a Punion are tried in order; the first branch that parses without error is taken.

      26 The order_header_t type (Lines 13Ð27) contains several anony- resentation it should fill in. This control allows the description- mous uses of the Popt type. This type is syntactic sugar for a writer to specify all known constraints about the data without wor- stylized use of a Punion with two branches: the first with the in- rying about the run-time cost of verifying potentially expensive dicated type, and the second with the “void” type, which always constraints for time-critical applications. matches but never consumes any input. Appropriate error-handling is as important as processing error-free PADS provides Parrays to describe varying-length sequences of data. The parse descriptor marks which portions of the data con- data all with the same type. The event_seq_t type (Lines 32Ð tain errors and specifies the detected errors. Depending upon the 34) uses a Parray to characterize the sequence of events an or- nature of the errors and the desired application, programmers can der goes through during processing. This declaration indicates that take the appropriate action: halt the program, discard parts of the each element in the sequence has type event_t. It also specifies data, or repair the errors. If the mask requests that a data item be that the elements will be separated by vertical bars, and that the verified and set, and if the parse descriptor indicates no error, then sequence will be terminated by an end-of-record marker (Peor). the in-memory representation satisfies the semantic constraints on In general, PADS provides a rich collection of array-termination the data. conditions: reaching a maximum size, finding a terminating literal (including end-of-record and end-of-source), or satisfying a user- Because we generate a parsing function for each type in a PADS supplied predicate over the already-parsed portion of the Parray. description, we support multiple-entry point parsing, which accom- modates larger-scale data. For a small file, a programmer can call Finally, the Precord (Line 35) and Psource (Line 42) annota- the parsing function for the PADS type that describes the entire file tions deserve comment. The first indicates that the annotated type (e.g. summary_t_read) to read the whole file with one call. For constitutes a record, while the second means that the type consti- larger-scale data, programmers can sequence calls to parsing func- tutes the totality of a data source. The notion of a record varies tions that read manageable portions of the file, e.g., reading one depending upon the data encoding. ASCII data typically uses new- record at a time in a loop. The parsing code generated for Parrays line characters to delimit records, binary sources tend to have fixed- allows users to choose between reading the entire array at once or width records, while COBOL sources usually store the length of reading it one element at a time, again to support parsing and pro- each record before the actual data. PADS supports each of these en- cessing very large data sources. We return to the use of multiple- codings of records and allows users to define their own encodings. entry point parsing functions in Section 5.

      2.2 PADS: The generated library 3 Using XQuery and Galax

      From a description, the PADS compiler generates a C library for In this section, we give a brief overview of XML, XQuery, and parsing and manipulating the associated data source. From each Galax, focusing on Galax’s data-model support for viewing non- type in a PADS description, the compiler generates XML data as XML. Given the subject of this workshop, we as- • sume the reader is already familiar with XML, XQuery, and XML an in-memory representation, Schema. • parsing and printing functions, XML [18] is a flexible format that can represent many classes of • a mask, which allows customization of generated functions, data: structured documents with large fragments of marked-up text; and homogeneous records such as those in relational databases; and het- • a parse descriptor, which describes syntactic and semantic er- erogeneous records with varied structure and content such as those rors detected during parsing. in ad hoc data sources. XML makes it possible for applications to handle all these classes of data simultaneously and to exchange To give a feeling for the library that PADS generates, Figure 6 in- such data in a standard format. This flexibility has made XML the cludes a fragment of the generated library for the Sirius event_t “lingua franca” of data integration and exchange. declaration. XQuery [20] is a typed, functional query language for XML that The C declarations for the in-memory representation (Line 1Ð4), supports user-defined functions and modules for structuring large the mask (Line 5Ð9), and the parse descriptor (Line 10Ð17) all share queries. Its type system is based on XML Schema [21]. XQuery the structure of the PADS type declaration. The mapping to C for contains XPath 2.0 [19] as a proper sub-language, which supports each is straightforward: Pstructs map to C structs with appro- navigation, selection, and extraction of fragments of XML docu- priately mapped fields, Punions map to tagged unions coded as ments. XQuery also includes expressions to construct new XML C structs with a tag field and an embedded union, Parraysmap values and to integrate or join values from multiple documents. to a C struct with a length field and a dynamically allocated se- quence, Penums map to C enumerations, Popts to tagged unions, XQuery is a natural choice for querying ad hoc data. Like XML and Ptypedefs to C typedefs. Masks include auxiliary fields to data, ad hoc data is semi-structured, and XQuery is tailored to such control behavior at the level of a structured type, and parse descrip- data. XQuery’s static type system detects type errors at com- tors include fields to record the state of the parse, the number of pile time, which is valuable when querying ad hoc sources: Long- detected errors, the error code of the first detected error, and the running queries on large ad hoc sources do not raise dynamic type location of that error. errors, and queries made obsolete by schema evolution are identi- fied at compile time. XQuery is also ideal for specifying integrated The parsing functions, e.g. event_t_read on Line 19, take a mask views of multiple sources. Although here we focus on querying as an argument and returns an in-memory representation and a parse one ad hoc source at a time, XQuery supports simultaneous query- descriptor. The mask allows the user to specify which constraints ing of multiple sources. 
Lastly, XQuery is practical: It will soon the parser should check and which portions of the in-memory rep- be a standard; numerous manuals already exist [5]; and it is widely

      27 1. typedef struct { // In-memory representation 2. order_header_t order_header; 3. event_seq_t events; 4. } event_t;

      5. typedef struct { // Mask 6. Pbase_m compoundLevel; // Struct-level controls 7. order_header_t_m order_header; 8. event_seq_t_m events; 9. } event_t_m;

      10. typedef struct { // Parse descriptor 11. Pflags_t pstate; // Normal, partial, or panicking 12. Puint32 nerr; // Number of detected errors 13. PerrCode_t errCode; // Error code of first detected error 14. Ploc_t loc; // Location of first error 15. order_header_t_pd order_header; // Nested header information 16. event_seq_t_pd events; // Nested event sequence information 17. } event_t_pd;

      18. /* Parsing and printing functions */ 19. Perror_t event_t_read (P_t *pads, event_t_m *m, event_t_pd *pd, event_t *rep); 20. ssize_t event_t_write2io (P_t *pads, Sfio_t *io, event_t_pd *pd, event_t *rep); Figure 6. Fragment of the library generated for the event t declaration from Sirius data description. implemented in commercial databases. representation of a document. Therefore, a concrete instance of the data model must provide their implementations. Galax provides Galax is a complete, extensible, and efficient implementation of default implementations for the four descendant and ancestor axes XQuery 1.0 that supports XML 1.0 and XML Schema 1.0 and that (Lines 13Ð16), which are defined recursively in terms of the child was designed with database systems research in mind. Its archi- and parent methods. These defaults may be overridden in concrete tecture is modular and documented [15], which makes it possible data models that can provide more efficient implementations than for other researchers to experiment with a complete XQuery imple- the defaults. For example, some representations permit axes to be mentation. Its compiler produces evaluation plans in the first com- implemented by range queries over relational tables [11]. plete algebra for XQuery [13], which permits experimental compar- ison of query-compilation techniques. Lastly, its query optimizer All the axis methods take an optional node-test argument, which is produces efficient physical plans that employ traditional and novel a boolean predicate on the names or types of nodes in the given axis. join algorithms [13], which makes it possible to apply non-trivial For example, the XQuery expression descendant::order returns queries to large XML sources. Lastly, its abstract data model per- nodes in the descendant axis with name order. Galax compiles mits experimenting with various physical representations of XML this expression into a single axis/node-test operator that invokes the and non-XML data sources. Galax’s abstract data model is the fo- corresponding methods in the abstract data model, delegating eval- cus of the the rest of this section. uation of node tests to the concrete data model. Some implemen- tations, like PADX, can provide fast access to nodes by their name. 3.1 Galax’s Abstract Data Model We describe PADX’s concrete data model in Section 4. One other important feature of Galax’s abstract data model is that Galax’s abstract data model is an object-oriented realization of the sequences are represented by cursors (also known as streams), non- XQuery Data Model. The XQuery Data Model [17] contains tree functional lists that yield items lazily. Accessing the first item in nodes, atomic values, and sequences of nodes and atomic values. a sequence does not require that the entire sequence be material- A tree node corresponds to an entire XML document or to an indi- ized, i.e., evaluated eagerly. Galax’s algebraic operators produce vidual element, attribute, comment, or processing-instruction. Al- and consume cursors of values, which permits pipelined and short- gebraic operators in a query-evaluation plan produced by Galax’s circuited evaluation of query plans. query compiler access documents by applying methods in the data model’s object-oriented interface. 
In addition to the concrete data model for PADX,which we describe in the next section, Galax has three other concrete data models: Figure 7 contains part of Galax’s data model interface3 for a node a DOM-like representation in main memory and two “shredded” in the XQuery Data Model. Node accessors return information representations, one in main memory and one in secondary storage such as a node’s name (node_name), the XML Schema type against for very large documents (e.g. > 100MB). The shredded data model which the node was validated (type), and the node’s atomic- partitions a document into tables of elements, attributes, and values valued data if it was validated against an XML Schema simple that can be indexed on node names and values [16]. type (typed_value). The parent, child,andattribute meth- ods navigate the document and return a node sequence containing the respective parent, child, or attribute nodes of the given node. 4UsingPADX to Query Ad Hoc Data

      The first six methods in Figure 7 (Lines 5Ð11) access the physical Figure 8 depicts an internal view of the PADX architecture first shown in Figure 2. Pre-existing components (in grey boxes) include 3Galax is implemented in O’Caml, so these signatures are in the PADS compiler, the Galax query engine, and the PADS runtime O’Caml. system. In this section, we focus on the new components (in white

      28 1. type sequence = cursor 2. class virtual node : 3. object 4. (* Selected XQuery Data Model accessors *) 5. method virtual node_name : unit -> atomicQName option 6. method virtual type : unit -> (schema * atomicQName) 7. method virtual typed_value : unit -> atomicValue sequence

      8. (* Required axes *) 9. method virtual parent : node_test option -> node option 10. method virtual child : node_test option -> node sequence 11. method virtual attribute : node_test option -> node sequence

      12. (* Other axes *) 13. method descendant_or_self : node_test option -> node sequence 14. method descendant : node_test option -> node sequence 15. method ancestor_or_self : node_test option -> node sequence 16. method ancestor : node_test option -> node sequence

      ... Other accessors in XQuery Data Model ...

      Figure 7. Signatures for methods in Galax’s abstract node interface

      1. XML Document 2. PADX Query Corral 3. PADS Data 4. XQuery Program 5. Galax Query Engine XML Schema 6. Query Prolog 7. Galax Abstract Data Model 8. 9. PADX Concrete Data Model 10. PADS Data PADS Compiler PADX Node 11. Description Representation 12. PADS Runtime System 13. 14. 15. Figure 8. Internal view of PADX Architecture 16.

      Figure 9. Fragment of XML Schema for PADS base types. boxes) and describe the compiler and run-time support for viewing PADS data as XML. From a PADS description, the compiler gen- erates an XML Schema description that specifies the virtual XML PADS values that are error free and those containing errors are ac- view of the corresponding PADS data, an XQuery prolog that im- cessible in the XML view. We begin with the mapping of PADS ports the generated schema and that associates the input data with base types. the correct schema type, and a type-specific library that provides the virtual XML view of PADS values necessary to implement PADX’s A default XML Schema, pads.xsd, contains the schema types that concrete data model. represent the PADS base types shared by all PADS descriptions. Fig- ure 9 contains a fragment of this schema. Every PADS base type is Note that a query corral is customized for a particular PADS descrip- mapped to the schema simple type that most closely subsumes the tion, in particular, its concrete data model only supports views of value space of the given PADS base type. For example, the Puint32 data sources that match the PADS description. To maintain the cor- base type maps to the schema type xs:unsignedInt (Lines 1Ð3). rect correspondence between a description, XML Schema, queries, Recall that all parsed PADS values have an in-memory representa- and data, the query corral explicitly contains the generated query tion and a parse descriptor, which records the state of the parse, the prolog, which imports the XML Schema that corresponds to the error code for detected errors, and the location of those errors. The underlying type-specific library. This guarantees that the user’s XML view of a parsed value is a choice of the in-memory represen- XQuery program is statically typed, compiled, and optimized with tation (rep), if no error occurred, or of the parse descriptor (pd), if respect to the correct XML Schema and that the underlying data an error occurred (Lines 4Ð8). This light-weight view exposes the model is an instance of this XML Schema. At runtime, the query parse descriptor only when an error occurs. The parse-descriptor corral takes an XQuery program and a PADS data source and pro- type for all base types is represented by the schema type Pbase_pd duces the query result in XML. We discuss the problem of produc- (Line 10Ð14). ing native PADS values in Section 6. The fragment of the XML Schema in Figure 10 corresponds to the 4.1 Viewing PADS data as XML description in Figure 4. Note that the schema imports the schema for PADS base types (Line 5). Each compound type is mapped to a The mapping from a PADS description to an XML Schema is complex schema type with a particular content model. A Pstruct straight-forward. The interesting aspect of this mapping is that both is mapped to a complex type that contains a sequence of local el-

      29 1. 5. 6. ... 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. ... 24. 25.

      Figure 10. Fragment of XML Schema for Sirius PADS description. ements, each of which corresponds to one field in the Pstruct. At run time, the user can specify the input data as a command- For example, the Pstruct order_header_t is mapped to the line argument or by calling the XQuery fn:doc functiononaPADS complex type order_header_t (Lines 7Ð15), which contains an source, e.g. pads:/example/sirius.data. element declaration for the field order_num, among others. A Punion is mapped to a complex type that contains a choice of elements, each of which corresponds to one field in the Punion. 4.2 PADX Concrete Data Model

      Each complex type also includes an optional pd element that corre- In Figure 8, the interface between Galax and PADS consists of two sponds to the type’s parse descriptor (Lines 13 and 21). All parse- modules: the generic PADX concrete data model, which implements descriptor types contain the parse state, error code, and location. the Galax abstract data model, and a compiler-generated module, in The parse-descriptor for compound types contain additional infor- which each PADS type has a corresponding, type-specific node rep- mation, e.g., Pstruct_pd contains the number of nested errors and resentation providing the XML view of values of that type. We note Parray_pd contains the index of the array item in which the first that the generic concrete data model is implemented in O’Caml and error occurred. The pd element is absent if no errors occurred dur- the compiler-generated module is implemented in C, but to simplify ing parsing, but if present, permits an analyst to easily identify the exposition, we present the compiler-generated module in O’Caml kind and location of errors in the source data. For example, the fol- syntax. lowing XQuery expression returns the locations of all orders that contain at least one error: $pads/Psource/orders/elt/pd/loc. Figure 13 contains a fragment of the PADX concrete data model for a node. This object provides a thin wrapper around the type- The schema types for some compound types contain additional specific node representation, padx_node_rep, whose interface is fields from the PADS in-memory representation, e.g., arrays have in Figure 12. A node representation contains references to a PADS a length (Line 20). Note that Parray types do not associate a name value’s in-memory representation and parse descriptor. The node with each individual array item, so in the corresponding schema representation interface returns the XML view of the PADS value, type, the default element elt encapsulates each array item. including the value’s element name, its typed value, and parent. The kth_child and kth_child_by_name methods return all of the The PADS compiler generates a query prolog that specifies the en- PADS value’s children in order and those with a given name in order, vironment in which all XQuery programs are typed and evaluated. respectively. Figure 11 contains the query prolog for the schema in Figure 10. The import schema declaration on Line 1 imports the schema in For some methods in Figure 13 (Lines 4Ð5), the concrete data model Figure 10. This declaration puts all global element and type decla- simply invokes the corresponding type-specific methods. One ex- rations in scope for the query. The variable declaration on Line 2 ception is the child axis method (Lines 7Ð17), which we describe specifies that the value of the variable $pads is provided externally in detail as it illustrates how the XML view of a PADS source is and that its type is a document whose top-level element is of type materialized lazily. The child method takes an optional name- Psource, defined on Line 24 in Figure 10. This declaration guar- test argument. We describe the case when the name-test is ab- antees that the query is statically typed with respect to the correct sent, which corresponds to the common expression child::*.The input type. child method creates a mutable counter k (Line 8), which contains the index of the last child accessed, and a continuation function lazy_child (Lines 11Ð16), which is invoked each time the child

      30 1. import schema default element namespace "file:/example/sirius.p"; 2. declare variable $pads as document-node(Psource) external;

      Figure 11. PADX generated query prolog

      class virtual padx_node_rep : object (* Private data includes parsed value’s rep & pd *) method node_name : unit -> string method typed_value : unit -> item method parent : unit -> padx_node_rep option method kth_child : int -> padx_node_rep option method kth_child_by_name : int -> string -> padx_node_rep option end

      Figure 12. The PADX node representation

      1. class pads_node (nr : padx_node_rep)= 2. object 3. inherit Galax.node 4. method node_name () = nr#node_name() 5. method typed_value () = nr#typed_value() 6. (* ... Other data model accessors ... *) 7. method child name_test = 8. let k = ref 0 in 9. match name_test with 10. | None -> 11. let lazy_child () = 12. (incr k; 13. match nr#kth_child !k with 14. | Some cnr -> Some(new pads_node(cnr)) 15. | None -> None) 16. in Cursor.cursor_of_function lazy_child 17. | Some (NameTest name) -> (* Same as above, but call nr#kth_child_named *) 18. (* ... Other axes ... *) 19. end

      Figure 13. Fragment of the PADX concrete data model

      31 cursor is poked. On each invocation, lazy_child increments the record array(s) and then loads each array item on demand, expelling counter and delegates to the kth_child method of the type-specific old records when the buffer is filled. A small amount of meta-data node representation. For some PADS types, accessing the virtual kth is preserved for each expelled record, so that the virtual node con- child does not require reading or parsing data, e.g., if the virtual taining that data can be reconstructed on subsequent accesses. child is part of a complete PADS record. For other PADS types, e.g., Parrays that contain file records, accessing the virtual kth child The on-demand, sequential strategy is a restriction of the on- may require reading and parsing data. The kth_child method pro- demand, random-access strategy. It loads data on demand, but its vides a uniform interface to all types and delegates the problem of fixed-size buffer stores only one record at a time, and it supports when to read and parse data to the underlying type-specific node strictly sequential access to records, i.e., accessing records out of representation. order is prohibited. Given that the Galax abstract data model re- quires random access, it is not obvious when this strategy can be To illustrate type-specific compilation, we give the compiler- used, even though it has the smallest memory footprint of all three generated node representation of an order_header_t value in and therefore could scale to very large sources. It turns out that Figure 14. The object takes the name of the field that contains many common XQuery queries can be evaluated in one sequential the order_header_t value, which corresponds to the XML node scan over the input document, and in these cases, the sequential name, and the in-memory representation and parse descriptor of the strategy is both semantically correct and time and space efficient. value. The kth_child method (Lines 9Ð15) takes an index and re- We give examples of “one-scan” queries and their performance in turns the node representation of the field at that index. For example, Section 5. the first child (Line 11) corresponds to the field order_num,which contains a Puint32 value. The kth_child_by_name method PADX (Lines 16Ð21) provides constant-time lookup of a child with a par- 4.4 Ways to use ticular name: It looks up the index of the name in the associative Our focus so far has been on describing PADX’s internal architecture map name_map and then delegates to kth_child. Note that this to demonstrate the feasibility of viewing and querying ad hoc data XML view of an order_header_t value corresponds to the schema sources as though they were XML. We expect this use of PADX to be type order_header_t in Figure 10. convenient, because it supports rapid querying of transient data and does not require an analyst to convert the data into another format To summarize, the PADX concrete data model completely imple- or load it into a database before being able to ask simple queries. ments the Galax data model, making it possible to evaluate any PADX can be used in other ways. For example, an analyst might XQuery program over a PADS data source. Due to limited space, prefer to materialize a PADS source in XML and query her data we have omitted some details, such as how PADX guarantees that using a high-performance, commercial XML query engine. 
To do each virtual node has a unique, immutable identity, as is required this, the analyst simply runs the query “$pads”, which returns the by the Galax abstract data model. The data model’s most important entire source materialized in XML, and then provides the result- features are that it provides lazy access to virtual XML nodes in ing XML document to the query engine. Another use is to trans- the PADS source, it delegates navigation to type-specific node rep- form the PADX view of a PADS source into the XML view required resentations, and it separates navigation of the virtual nodes from by a database by some down-stream application. Such transforma- data loading, which is discussed next. tions can be easily expressed in XQuery and can be statically type checked against the PADX and target XML schemata. PADS 4.3 Loading data We note, however, that the size of an ad hoc data source is signifi- cantly smaller than its representation in XML. For our two example The PADX abstract data model provides Galax with a random- PADS sources, the ratio of the size of the original PADS data to its access view of a PADS data source. In particular, any virtual node size in XML using the mapping described in Section 4.1 ranges may be accessed in any order at any time during query evaluation from 1:7 to 1:8. Of course, this size increase depends on the PADS regardless of its physical location in the PADS data. This abstraction types and field names in the PADS description, but even a reasonable permits the PADX concrete data model to decide when and how to choice of names like those in Figure 4 results in a significant size in- read and parse, or load, a data source. crease. We mention this size increase to give the reader some sense of the relative scale of data sources that PADX can query compared PADX has three strategies for loading data, each of which use the to those supported by native XML query engines. multiple-entry parsing functions generated by the PADS compiler. The bulk strategy loads a complete PADS source before query eval- uation begins, populating all the in-memory representations and 5 Performance parse descriptors. With all data pre-fetched, bulk loading is the sim- plest strategy to implement random access. However, because each Query performance in PADX depends on the efficiency of the under- PADS value has a lot of associated meta-data, bulk loading incurs a lying concrete data model; therefore its performance must be well high memory cost and is only feasible for smaller data sources. understood before we can understand the performance of particu- lar query plans. We focus on the performance of the concrete data The on-demand, random-access strategy loads PADS data when model and measure the cost of accessing data via the PADS type- Galax accesses virtual nodes via the abstract data model. The strat- specific parsing functions, the PADX type-specific node representa- egy maintains a fixed size buffer for loaded values and when the tions, and the generic PADX concrete data model. At the end of this buffer is filled, expels values in LIFO order. The default units section, we give preliminary measurements on query performance. loaded are any PADS types annotated with Precord,whichin- dicates that the type denotes an atomic physical unit in the am- We measured data model and query performance for two PADS bient coding. 
This default works well in practice, because many sources, Sirius and the Web server logs in Figure 1, on data sources PADS sources contain a header, one (or more) very large array(s) of 1 to 50MB in size. Our measurements were taken on an 1.67GHz of records, and a trailer. This strategy loads all the data before the Intel Pentium M with 500MB real memory running Linux Redhat

      32 1. class order_header_t_node_rep 2. (field_name : string) 3. (rep : order_header_t) 4. (pd : order_header_t_pd) = 5. object 6. inherit padx_node_rep 7. method name() = field_name 8. ... 9. method kth_child idx = 10. match idx with 11. | 1 -> Some(new val_Puint32_node_rep("order_num", rep.order_num, pd.order_num_pd)) 12. | 2 -> Some(new val_Puint32_node_rep("att_order_num", rep.att_order_num, pd.att_order_num_pd)) 13. | ... 14. | 14 -> Some(new Pstruct_pd_node_rep("pd", pd)) 15. | _ -> None

      16. (* Chidren’s name map *) 17. let name_map = Associative_array.create [("order_num", 1); ("att_order_num", 2); ...; ("pd", 14)] 18. method kth_child_by_name child_name = 19. match Associative_array.lookup name_map child_name with 20. | None -> Cursor.empty_cursor() 21. | Some idx -> kth_child idx 22. end

      Figure 14. Fragment of compiler-generated node representation for order header t

      Data size (MB) PADX PADX Source 1 5 10 20 50 Data PADS node generic Sirius 0.25 0.23 0.23 0.22 10.64 Source size read rep DM Web server 0.70 0.67 0.67 1.18 6.14 5MB 0.07 0.27 0.61 Table 1. Bulk strategy: load time per byte in µs Sirius 10MB 0.06 0.26 0.56 50MB 0.06 0.25 0.56 5MB 0.54 0.78 1.63 9.0. Each test was run five times, the high and low times were Web server 10MB 0.53 0.74 1.61 dropped, and the mean of the remaining three times is the reported 50MB 0.53 0.74 1.58 time. Table 2. Sequential strategy: load time per byte in µs

      5.1 Concrete Data Model

      We first measured the time to bulk load data sources of 5, 10, 20, and 50MB by calling the PADS parsing functions, i.e.,thelowest level in the PADX data model. Table 1 gives the load time per byte in microseconds. For smaller sources, load time is constant, but cost compared to the lower levels. For the Sirius source, the PADX eventually increases. For Sirius, the increasing load time is ob- node-representation is four times slower than the native PADS pars- served at 50MB and for the Web server data at 20MB. We note that ing functions, but for the Web-server source, the PADX node rep- for a PADS source, the memory overhead of a PADS parsed value resentation is only 44% slower. Understanding the source of this can be four to sixteen times the size of the raw data, depending on difference requires further experiments with other sources. the value’s type. In the cases where non-linear load time occurs, the processes’ physical memory usage is close to or exceeds real mem- For both sources, the generic concrete data model (in O’Caml) ory, CPU utilization plummets, and the process begins to thrash. is twice as slow as the node representation (in C). The interface These measurements indicate that the bulk strategy is only feasible from the generic data model to the node representation crosses the for smaller data sources. O’Caml-C boundary and uses data marshalling functions gener- ated by the O’Caml IDL tool. We have noticed similar per-byte Next, we measured load time using the on-demand sequential stat- read costs in the Galax secondary storage system [16], whose data- egy on sources of 5, 10, and 50 MB. We were particularly interested model architecture is similar to that of PADX. in the overhead introduced at each level in the concrete data model. Table 2 gives the load time per byte in microseconds (µs) for three We also measured the time to load using the on-demand, random- levels: reading the source by calling the PADS parsing functions di- access strategy. In general, it was 10Ð15% slower than the on- rectly, a depth-first walk of the virtual XML document by calling demand, sequential strategy. the PADX node-representation functions, and a depth-first walk of the virtual XML document by calling the PADX generic data model. These measurements indicate that the on-demand, sequential strat- Recall that the node-rep functions are in C and the generic data egy scales with increasing data size, and that there is a constant model is in O’Caml. overhead incurred at each level in the data model. Ideally, we would like the cost of accessing data via the generic concrete data model We observe that the load time per byte at each level is near constant to be close to the PADS read cost, but this will require more engi- for increasing source size, but that each level incurs a substantial neering effort.

5.2 Querying

Ultimately, PADX's query performance depends on Galax, because the Galax compiler produces and executes the query plans. Currently, Galax's query compiler includes a variety of logical optimizations for detecting joins and re-grouping constructs in XQuery expressions. Another important optimization is detecting when a query can be evaluated in one scan over the input document. Path expressions that contain only descendant axes and no branches are one example of the kind of queries that can be evaluated in one scan. For example, the following query, which returns the locations of all records containing some error in a Sirius source, can be evaluated in one scan:

$pads/Psource/orders/elt/pd/loc

Detecting and evaluating one-scan queries (also known as streamable queries) is necessary in XML environments in which the XML data is an infinite or bursty stream. Several query processors already exist in which streamable queries are evaluated directly over a stream of tokens produced by SAX-style parsers [9, 14].

Streamable queries are important for PADX, because the resulting plans can be evaluated on large PADS sources that are loaded on demand and sequentially. Table 3 contains the time in seconds to evaluate the query above when applied to PADS data sources into which we injected errors randomly (12 errors per 1MB). The query plan produced by Galax is not perfectly pipelined, so the execution time is super-linear.

Data size (MB)    1      5      10     20     50
Time (seconds)    1.0    4.8    10.7   24.0   90.0

Table 3. PADX query evaluation time in seconds

To understand the costs and benefits of other evaluation strategies, we materialized the 1MB PADS source in Table 3, which yielded a 7.4MB XML document. We then used Galax to execute the above query, using the same query execution plan, applied to the 7.4MB XML document loaded into the main-memory data model. The execution time was 13.1s, of which 12.9s was spent in document parsing. To amortize the cost of document parsing, we often store documents in Galax's secondary storage system. To compare with this strategy, we stored the 7.4MB XML document in Galax's secondary storage system, which required 166MB of disk space. We then ran the above query on the stored document. The execution time was 2.9s, almost three times slower than PADX applied to the PADS data directly. For comparison with an independent query processor, we evaluated the above query using Saxon [12], a popular XSLT and XQuery engine, applied to the 7.4MB document; it executed in 6.3s.

In summary, our initial impressions are that evaluating streamable XQuery expressions directly on a PADS source is feasible, efficient, and convenient.

6 Discussion

The PADX system solves important data-management tasks: it supports declarative description of ad hoc data formats, its descriptions serve as living documentation, and it permits exploration of ad hoc data and vetting of erroneous data using a standard query language. The resulting PADS descriptions and queries are robust to changes that may occur in the data format, making it possible for more than one person to profitably use and understand a PADX description and related queries.

A PADX query corral is an example of a partially compiled query engine, because its concrete data model is customized for a particular data format, but its queries are interpreted over an abstract data model that delegates to the concrete model. This architecture places PADX on the continuum between query architectures that provide fully interpreted query plans applied to generic data models and architectures that provide fully compiled query plans applied to customized data model instances [10]. The latter architectures provide very high performance on large-scale data. PADX has some of the benefits of such architectures but does not have the overhead of a complete database system.

Others share our interest in declarative descriptions of ad hoc data formats. Currently, the Global Grid Forum is working on a standard data-format description language for describing ad hoc data formats, called DFDL [3, 4]. Like PADS, DFDL has a rich collection of base types and supports a variety of ambient codings. Unlike PADS, DFDL does not support semantic constraints on types or dependent types, e.g., it is not possible to specify that the length of an array is determined by some field in the data. DFDL is an annotated subset of XML Schema, which means that the XML view of the ad hoc data is implicit in a DFDL description. DFDL is still being specified, so no DFDL-aware parsers or data analyzers exist yet. We expect bi-directional translation between PADS and DFDL to be straightforward. Such a translation would make it possible for DFDL users to use PADX to query their ad hoc data sources.

The steps in a data-management workflow that PADX addresses typically precede the steps that require a high-performance database system, e.g., asking complex OLAP queries applied to long-lived, archived data. Commercial database products do provide support for parsing data in external formats so the data can be imported into their database systems, but they typically support a limited number of formats, e.g., COBOL copybooks; no declarative description of the original format is exposed to the user for their own use; and they have fixed methods for coping with erroneous data. For these reasons, PADX is complementary to database systems.

We continue to focus on improving the usability and scalability of PADX. Currently, PADX is not compositional, because the result of evaluating a query is in native XML, not in a PADS format. Given an arbitrary XQuery expression over a PADX source, an open problem is being able to infer a reasonable PADS format for the result and produce the results in this format. We have already mentioned the important problem of detecting when a query can be evaluated in a single scan over an input document and of producing a fully pipelined execution plan. Interestingly, this problem is important in XML environments in which the XML data is an infinite or bursty stream. We are working on improving Galax's ability to detect one-scan queries and to produce query plans that are indeed fully pipelined and that use limited memory.

7 References

[1] Galax user manual. http://www.galaxquery.org.
[2] PADS user manual. http://www.padsproj.org/.
[3] Data Format Description Language (DFDL): a proposal. Working Draft, Global Grid Forum, Aug. 2005. https://forge.gridforum.org/projects/dfdl-wg/document/DFDL_Proposal/en/%2.
[4] M. Beckerle and M. Westhead. GGF DFDL primer. http://www.ggf.org/Meetings/GGF11/Documents/DFDL_Primer_v2.pdf, May 2004. Global Grid Forum.
[5] M. Brundage. XQuery: The XML Query Language. Addison-Wesley, 2004.
[6] C. Cranor, Y. Gao, T. Johnson, V. Shkapenyuk, and O. Spatscheck. Gigascope: High performance network monitoring with an SQL interface. In SIGMOD. ACM, 2002.
[7] M. Fernández, J. Siméon, B. Choi, A. Marian, and G. Sur. Implementing XQuery 1.0: The Galax experience. In Proceedings of the International Conference on Very Large Databases (VLDB), pages 1077–1080, Berlin, Germany, Sept. 2003.
[8] K. Fisher and R. Gruber. PADS: A domain-specific language for processing ad hoc data. In Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, June 2005.
[9] D. Florescu, C. Hillery, D. Kossmann, P. Lucas, F. Riccardi, T. Westmann, M. J. Carey, and A. Sundararajan. The BEA streaming XQuery processor. VLDB Journal, 13(3):294–315, 2004.
[10] R. Greer. Daytona and the fourth-generation language Cymbal. In Proceedings of the ACM Conference on Management of Data (SIGMOD), 1999.
[11] T. Grust, M. van Keulen, and J. Teubner. Staircase join: Teach a relational DBMS to watch its axis steps. In Proceedings of the International Conference on Very Large Databases (VLDB), pages 524–535, Berlin, Germany, Sept. 2003.
[12] M. Kay. SAXON 8.0. Saxonica. http://www.saxonica.com/.
[13] C. Ré, J. Siméon, and M. Fernández. A complete and efficient algebraic compiler for XQuery. In Proceedings of the IEEE International Conference on Data Engineering (ICDE), April 2006.
[14] K. Rose and L. Villard. Phantom XML. In XML Conference and Exhibition, 2005.
[15] J. Siméon and M. F. Fernández. Build your own XQuery processor. EDBT Summer School, tutorial on Galax architecture, Sept. 2004. http://www.galaxquery.org/slides/edbt-summer-school2004.pdf.
[16] A. Vyas, M. F. Fernández, and J. Siméon. The simplest XML storage manager ever. In XIME-P 2004, pages 37–42, Paris, France, June 2004.
[17] W3C. XQuery 1.0 and XPath 2.0 data model, Oct. 2005. http://www.w3.org/TR/query-datamodel/.
[18] Extensible Markup Language (XML) 1.0. W3C Recommendation, Feb. 2004. http://www.w3.org/TR/2004/REC-xml-20040204/.
[19] XPath 2.0. W3C Working Draft, Oct. 2005. http://www.w3.org/TR/xpath20.
[20] XQuery 1.0: An XML query language. W3C Working Draft, Oct. 2005. http://www.w3.org/TR/xquery/.
[21] XML Schema Part 1: Structures. W3C Recommendation, Oct. 2004. http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/.

OCaml + XDuce

      Alain Frisch INRIA Rocquencourt [email protected]

      January 2006

Abstract

This paper presents the core type system and type inference algorithm of OCamlDuce, a merger between OCaml and XDuce. The challenge was to combine two type checkers of very different natures while preserving the best properties of both (principality and automatic type reconstruction on one side; very precise types and implicit subtyping on the other side). Type inference can be described by two successive passes: the first one is an ML-like unification-based algorithm which also extracts data flow constraints about XML values; the second one is an XDuce-like algorithm which computes XML types in a direct way. An optional preprocessing pass, called strengthening, can be added to allow more implicit use of XML subtyping. This pass is also very similar to an ML type checker.

1 Introduction

This paper presents the core type system of OCamlDuce, a merger between OCaml [L+01] and XDuce [Hos00, HP00, HP03, HVP00]. OCamlDuce source code, documentation and sample applications are available at http://www.cduce.org/ocaml.

OCaml is a widely-used general-purpose multi-paradigm programming language with automatic type reconstruction based on unification techniques. XDuce is a domain-specific and type-safe functional language adapted to writing transformations of XML documents. It comes with a very precise and flexible type system based on regular expression types and a natural notion of subtyping. The basic type-checking primitives for XDuce constructions are rather involved, but the structure of the type checker is simple: types are computed in a bottom-up way along the abstract syntax tree; the input and output types of functions are explicitly provided by the programmer. The high-level objective of the OCamlDuce project is to enrich OCaml with XDuce features in order to provide a robust development platform for applications that need to deal with XML but which are not necessarily focused on XML.

The challenge was to combine two type checkers of very different natures while preserving as much as possible the best properties of both (principality and automatic type reconstruction on one side; very precise types and implicit subtyping on the other side).

Our main guideline was to design a type system which can be implemented by reusing existing implementations of OCaml and CDuce [BCF03, Fri04]. (CDuce can be seen as a dialect of XDuce with first-class and overloaded functions; for the merger with OCaml, we don't consider these extra features.) Because of the complexity of OCaml's type system, it was out of the question to reimplement it. The typing algorithm we describe in this paper has been successfully implemented simply by combining a slightly modified OCaml type checker with the CDuce type checker, and by adding some glue code. As a result, OCamlDuce is a strict extension of OCaml: programs which don't use the new features will be treated exactly the same by OCaml and OCamlDuce. It is thus possible to compile any existing OCaml library with OCamlDuce. Also, we believe our modifications to the OCaml compiler are small enough to make it easy to maintain OCamlDuce in sync with future evolutions of OCaml. Our experience so far confirms that.

Another guideline in the design of OCamlDuce was that XDuce programs should be easily translatable to OCamlDuce in a mechanical way. In XDuce, all the functions are defined at the toplevel and come with an explicit signature. We can obtain an OCamlDuce program by some minor syntactic modifications (the new constructions in the language are delimited to avoid grammatical overloading of notations). Explicit function signatures are simply translated to type annotations.

The design goals pushed us in the direction of simplicity. We chose to segregate XDuce values from regular ML values. Of course, a constructed ML value can contain nested XDuce values, but from the point of view of ML, XDuce values are black boxes, and similarly for types.
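To fix intuitions, here is a minimal plain-OCaml sketch of this black-box discipline. The module Xml below, its dummy string representation, and the helper functions are hypothetical stand-ins invented only for illustration; in OCamlDuce the XML side consists of real XDuce values with their own syntax and types.

(* Hypothetical stand-in for the XML side: an opaque type of XML
   values and a couple of foreign operations on them.  The string
   representation is a dummy chosen only to make the sketch run. *)
module Xml : sig
  type t                      (* opaque XML value *)
  val empty : t               (* empty sequence *)
  val concat : t -> t -> t    (* sequence concatenation *)
end = struct
  type t = string
  let empty = ""
  let concat = ( ^ )
end

(* ML values freely contain XML values (here, inside a list and pairs),
   but ML code cannot inspect them except through Xml's operations. *)
let tag_with_index (xs : Xml.t list) : (int * Xml.t) list =
  List.mapi (fun i x -> (i, x)) xs

let join (xs : Xml.t list) : Xml.t =
  List.fold_left Xml.concat Xml.empty xs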

Also, we decided not to have parametric polymorphism on XDuce types. A type variable can of course be instantiated to an XDuce type (or to a type which contains a nested XDuce type), but it is not possible to force a generalized variable to be instantiated only to XDuce types or to use a type variable within an XDuce type. The technical presentation introduces a notion of foreign type variables, but they are nothing more than a technical device for inferring ground XDuce types.

Overview In Section 2, we give some intuitions about the behavior of OCamlDuce's type-checker. The formalization of the type system is developed by abstracting away from details about XDuce. In Section 3, we introduce an abstract notion of extension (foreign types and foreign operators) and show how XDuce can be seen as an extension. In Section 4, we present the type system and type inference algorithm for a calculus made of ML [Mil78, Dam85] plus an arbitrary extension. The basic idea is to rely on standard techniques for ML type inference. Indeed, we start from a type system which is an instance of ML where foreign types are considered as atomic types and foreign operators are explicitly annotated with their input and output types. Then we present an algorithm to infer these annotations. This algorithm is described as two successive passes: the first one is a slightly modified version of an ML type-checker, and the second one is a simple forward computation on foreign types. In Section 5, we present a preprocessing pass, called strengthening, whose purpose is to make more programs accepted by the type system by allowing implicit use of subtyping. In Section 6, we present other details of the concrete integration in OCaml. In Section 7, we compare our approach to related works.

2 An example

In this section, we illustrate the behavior of OCamlDuce's type-checker on the following code snippet:

let f x = match x with
  {{ [ (y::_ | _)* ] }} -> {{y@y}}
let z1 =
  f {{ [ [] [] [[]] ] }}
let z2 =
  List.map f
    [ {{ [ [[]] ] }};
      {{ [ [[]] ] }} ]

The example is intended to illustrate the use of the OCaml type checker to perform a data-flow analysis of XML values, and also how OCaml features (here, higher-order functions and data structures) interact with XDuce features.

Double curly braces {{...}} are used in OCamlDuce only to avoid ambiguities in the grammar; they carry no typing information. For instance, the symbol @ used for list concatenation in OCaml is re-used to denote XML sequence concatenation. Similarly, the square brackets [...] are used both to denote OCaml list literals (whose elements are separated by semi-colons) and XML sequence literals when used within double curly braces (their elements are separated by whitespace). XML element literals are written in the form <tag>[ content ].

The first line of the program above declares a function f which consists of an XML pattern matching on its argument, with a single branch. The XML pattern p = [ (y::_ | _)* ] extracts from an XML sequence all the elements with a given tag and puts them (in order) in the capture variable y. The function is then used twice, including once indirectly through a call to the function List.map (from the OCaml standard library) of type ∀α, β.(α → β) → α list → β list. For the purpose of explaining type-checking, we will rewrite the body of the function f as:

let f x =
  let y = match[y;p](x) in {{y@y}}

The y and p parameters of the match operator represent the capture variable under consideration and the pattern itself.

In OCamlDuce, XML values (elements, sequences, ...) and regular OCaml values are kept apart. An XML value can of course appear as part of an OCaml value (e.g. the XML elements which are put into an OCaml list), but an OCaml value cannot appear within an XML value. The same applies to types: an XML type can appear as part of a complex OCaml type expression, but the converse is impossible. XML operators can be applied to XML values and return new XML values. In the example, we can see three kinds of XML operators: XML literals (no argument), XML concatenation (two arguments), and XML pattern matching (one argument).

The basic idea of the OCamlDuce type system is to infer XML types for the inputs and outputs of XML operators. This is done by introducing internally a new kind of type variables, called XML type variables. Before proper type-checking starts, each XML operator used in the program is annotated with fresh XML type variables (in subscript position for the inputs, and in superscript position for the outputs):

      37 let f x = The set of constraints generates dependencies between vari- ι2 let y = match[y;p]ι1 (x) in ables. We say that a variable on a left-hand side of a con- ι5 {{y@ι3,ι4 y}} straint depends on variables of the right-hand side. In our let z1 = example, the graph of dependencies between variables is f {{ [ [] [] [[]] ]ι6 }} acyclic. In this case, we can topologically order the vari- let z2 = ables and find the least possible ground XML type for each List.map f of them: we assign to a variable the union of all its lower [ {{ [ [[]] ]ι7 }}; bounds. In the example, we will thus compute the following {{ [ [[]] ]ι8 }} ] instantiation:

      The regular OCaml type-checker is then applied. It gives to each XML operator an arrow type following the annotations and then proceeds as usual (generalizes types of let-bound ι1 = [R1] identifiers, instantiates ML type-schemes when an identifier ι2 = match[y;p]([ R1 ]) = [R2] is used, and performs unifications to make type compatible). ι5 = ι2@ι2 = [R2R2]

      For instance, the concatenation operator in our example is where R1 is the regular epxression given the type ι3 → ι4 → ι5, and the type-checker performs ([][][[]])|[[]|[]]] the following unifications: ι2 = ι3 = ι4 (the type for y), and R2 is the regular expression ι1 = ι6 = ι7 = ι8 (the type for the argument of f). It also ([][[]])|[[]|[]]. produces the following types for the top-level identifiers: Type-checking is over: we have found an instantiation for val f : ι1 → ι5 XML type variables which satisfies all the constraints. In val z1 : ι5 essence, the type-checker has collected all the XML types val z2 : ι5 list that can flow to the input of the function, and then type- checked the body of the function with the union of all these Of course, we must still instantiate the XML type variables types. In general, the OCaml type-checker is used to infer the with ground XML types. Each occurence of an XML op- data flow of XML values in the programs. The way to solve erator in the program gives one constraint on the instantia- the resulting set of constraints by forward computation cor- tion. Indeed, we can interpret each n-ary operator as as n- responds roughly to the structure of the XDuce type-checker. ary function from XML types to XML types. If we choose ι1 and ι2 as representatives for their classes of equivalence modulo unification, the program is: Implicit subtyping Let’s see what happens if we add an z1 let f x = explicit type constraint for : ι2 let y = match[y;p]ι1 (x) in let z1 : {{[ _* ]}} = ι5 {{y@ι2,ι2 y}} f {{ [ [] [] [[]] ] }} let z1 = f {{ [ [] [] [[]] ]ι1 }} The algorithm described above will infer a much less precise let z2 = type for z2 as well, which is unfortunate. The reason is that List.map f the OCaml type-checker unifies ι5 with [ _* ].Ba- [ {{ [ [[]] ]ι1 }}; sically, the unification-based type system forgets about the {{ [ [[]] ]ι1 }} ] direction of the data flow. There is some dose of implicit subtyping in the algorithm, but only for the result of XML from which we read the following constraints: operators (because of the way we interpet them as subtyping

      ι2 ≥ match[y;p](ι1) - not equality - constraints). ι5 ≥ ι2@ι2 In order to address this lack of implicit subtyping, we use ι1 ≥ [ [] [] [[]] ] a preprocessing pass whose purpose is to detect which sub- ι1 ≥ [ [[]] ] expressions are of kind XML and to introduce around them a ι1 ≥ [ [[]] ] special unary XML operator id which behaves semantically In this system, we consider match[y;p] as a function as the identity, but allows subtyping. This preprocessing pass from XML types to XML types, given by XDuce’s type in- would rewrite the definition for z1 as: ference algorithm for pattern matching. Similarly, the oper- ator @ is now intrepreted as a function from pair of types to let z1 : {{[ _* ]}} = ι10 ι1 types. idι9 (f {{[[] [] [[]]] }})

      38 The variable ι9 will then be unified with ι5 and ι10 with 3 Abstract extension of ML [ _* ]. The additional constraint corresponding to the id operator is thus simply: The previous section explained the behavior of OCaml- Duce’s type checker on a example. It should be clear from [ _* ] ≥ ι5 this example that the type system is largely independent of the actual definitions of values, types, patterns and opera- tors from XDuce and could be applied to other extensions which is satisfied by the same instantiation for ι5 as in the original example. As a consequence, the type for z2 is not of OCaml as well. In this section, we will thus introduce an changed. abstract notion of extension and show how XDuce fits into this notion. This more abstract presentation should help the The preprocessing pass is quite simple. It consists of an- reader to understand the structure of the type checker, with- other run of the OCaml type-checker, where all the XML out having to care about the details of XDuce type system. types are considered equal. This allows to identify which Definition 1. An extension X is defined by: sub-expressions are of kind XML. Section 5 describes for- mally this pass. • a set of ground foreign types T ; • a subtyping relation ≤ on T , which is a partial order with a finite least-upper bound operator ; Breaking cycles The key condition which allowed us to • a set of foreign operators O; compute an instantiation for XML type variables in the ex- ample was the acyclicity of the constraints. This property • for each operator o ∈O: an arity n ≥ 0 and an ab- does not always hold. For instance, let’s extend the original stract semantics oˆ : T n →T which is monotone with example with the following definition: respect to ≤ on each of its argument. letz3=fz1 We use the meta-variable τ to range over ground foreign types. The foreign operators are used to model both foreign Without the preprocessing pass mentionned above, this line value constructors and operations on these foreign values. would force the OCaml type-checker to unify ι1 and ι5. The Since we are not going to formalize dynamic semantics, we preprocessing pass actually replaces this definition by: don’t need to distinguish between these two kinds of opera- tors. ι12 let z3 = f idι11 (z1) The monotonicity requirement on the abstract semantics en- The type-checker then unifies ι11 with ι5 and ι12 with ι1; the sures that our resolution strategy (taking the union of lower resulting constraint for id is thus: bounds for each variables) for constraints is complete.

      ι1 ≥ ι5 We don’t formalize in this paper the operational semantics of operators. Instead, we assume informally that it is given and compatible with the abstract semantics. which corresponds to the fact that the output of f can flow back to its input. We observe that the set of constraints has now a cycle between variabls ι1,ι5 and ι2. XDuce as an extension We now show how XDuce fea- tures can be seen as an extension. We consider here a simple Our type-system cannot deal with such a situation. It would version of XDuce, with the following kind of expressions: issue an error explaining that the inferred data flow on XML element constructor a[e] (seen as a unary operator), empty values has a cycle. The programmer is then required to break sequence (), concatenation e1, e2, and pattern matching explicitly this cycle by providing more type annotations. For match e with p → e | ... | p → e. OCamlDuce is actually instance, the programmer could use the same annotation as build on CDuce, which considers for instance XML element above on z1: constructors as ternary operators (the tag and a specification let z1 : {{[ _* ]}} = for XML attributes are also considered as arguments). f {{ [ [] [] [[]] ] }} The meta-variable p ranges over XDuce patterns. We don’t or maybe he will prefer to annotate the input or output type need to recall here what they are. We just need to know of f. that for any pattern p we can define an accepted type p,

      39 a finite set of capture variable Var(p), and for any type τ built-in ML constant dispatch[τ1,...,τn] of type scheme: and any variable x in Var(p), a type match[x; p](τ) (which ∀α.(τ1  ... τn) → α → (α → β) → ...→ (α → β) → represents the set of values possibly bound to x when the β, which we assume to be present in the initial typing envi- input value is in τ and the pattern succeed) ronment. Its intuitive semantics is to drop the first argument (it is used only to force the type-checker to verify that x has Here is the formal definition of an extension X for XDuce. type τ1  ... τn, which corresponds to the XDuce’s pattern We take for foreign types the XDuce types quotiented by matching exhaustivity condition), and to call the kth func- the equivalence induced by the subtyping relation (that is: tional argument (1 ≤ k ≤ n) on the second argument when types with the same set-theoretic interpretation are consid- k is the smallest integer such that this argument has type τk. ered equal). The least-upper bound operator  corresponds to XDuce’s union type constructor (usually written |). We In principle, the technique described in this paper could use the following families of foreign operators: be used to integrate many of the existing extensions to the original XDuce design (such as attribute-element con- • a unary operator for each XML label a, a unary opera- straints [HM03] or XML filters [Hos04]) without any addi- tor; tional theoretical complexity. In its current form, however, OCamlDuce integrates all the CDuce extensions except over- • a binary operator corresponding to the concatenation; loaded functions: XML attributes as extensible records, se- quence and tree pattern-based iterators, strings as sequences • a constant operator corresponding to the empty se- of characters (hence string regular expression types and pat- quence; terns), etc. • for any pattern p and variable x in Var(p), a unary op- erator written match[x; p] (its semantics is to return the value bound to x when matching its argument against the pattern p). 4 Type system

      The abstract semantics for all these operators follows directly In this section, we present a type system and a type inference from XDuce’s theory. algorithm for a fixed extension X. Our language will be the kernel of an ML-like type system, enriched with types and Element constructor, concatenation and the empty sequence operators from the extension X. expressions can directly be seen as foreign operators. This is not the case for a pattern matching match e with p1 → | | → e1 ... pn en. We are going to present an encoding Types and expressions The syntax of types and expres- of pattern-matching in terms of operators and normal ML sions is given in Figure 1. We use a vector notation to repre- expressions. This encoding is rather heavy; in practice, the sent tuples. E.g. t stands for an n-tuple (t1,...,tn). implementation deals with pattern matching directly. We assume a set of ML type constructors, ranged over by → First, we define the translation p e of a single branch the meta-variable P. Each ML type constructor comes with { } where Var(p)= x1,...,xn as the expression: a fixed arity and we assume all the types to be well-formed with respect to these arities. The arrow → is considered as λx. a distinguished binary type constructor for which we use an let x1 = match[x1; p]xin infix and right-associative syntax. ... let xn = match[xn; p]xin We assume given two infinite families of type variables and e foreign type variables, respectively ranged over by the meta- variables α and ι. In an expression ∃α.e, the type variable Then, the translation of match e with p1 → e1 | ... | pn → α is bound in e. Expressions are considered modulo α- ∃ en is defined as: conversion of bound type variables. The construction α.e thus serves to introduce a fresh type variable α to be used in let x = ein a type annotation somewhere in e. dispatch[τ1,...,τn] xx(p1 → e1) ... (pn → en) Foreign operators are annotated with the type of their argu- where τi = pi and pi = pi\(τ1  ... τi−1) (the restric- ments (in subscript position) and of their result (in super- tion of pi to values which do not match any pattern form an script); the number of type arguments is assumed to be co- preceding branch). We have used in this translation a new herent with the arity of the foreign operator. However, in

      40 practice, the source language does not include the annota- A typing problem is a tuple (Γ, e, t). (Usually, t is a fresh tions: they are automatically filled with fresh foreign type type variable.) A solution to this problem is a substitution φ variables by the compiler (we also use this convention in this such that Γφ  eφ : tφ is a valid judgment in ML(X).We paper for some examples). Putting the annotations in the syn- will now rephrase this definition in terms of a typing judg- tax is just a way of simplifying the presentation. The main ment on the full calculus. This judgment Γ X e : t is technical contribution of the paper is an algorithm to infer defined by the same rules as in Figure 2, except for foreign ground foreign types for the foreign type variables. operators, for which we take:

      ε Γ X oε : ε1 → ...→ εn → ε The ML(X) fragment We call ML(X) the fragment of our calculus where all the foreign types are restricted to be Typing environment and type schemes that are used in the ground. Figure 2 defines a typing judgment Γ  e : t judgment X are allowed to contain foreign type variables. for ML(X). It is exactly an instance of the ML type sys- We say that φ is a pre-solution to the typing problem tem [Mil78, Dam85] if we see ground foreign types as (Γ, e, t) if the assertion Γφ X eφ : tφ holds. Of course, atomic ML types and ground-annotated foreign type oper- the new rule for foreign operators forgets the constraints that τ ators oτ as built-in ML constants or constructors (we also relates the input and output types of foreign operators. In introduce explicit type annotation and type variable intro- order to ensure type soundness, we must also enforce these duction). We recall classical notions of type scheme, typing constraints. environment and generalization. A type scheme is a pair of a finite set α¯ of type variables and of a type t, written ∀α.¯ t. Formally, we define a constraint C as a finite set of annotated ε  The variables in α¯ are considered bound in this scheme. We foreign operators oε. We write C if all the elements of C τ ≤ write σ t if the type t is an instance of the type scheme are of the form oτ with oˆ(τ) τ. For an expression e, we collect in a constraint C(e) all the instances of foreign σ.Atyping environment is a finite mapping from program ε variables to type schemes. The generalization of a type t operators oε that appear in e. Note that for any substitution φ,wehaveC(e)φ = C(eφ). with respect to a typing environment Γ, written GenΓ(t) is the type scheme ∀α.¯ t where α¯ is the set of variables which We are ready to rephrase the notion of solution. are free in t, but not in Γ. Lemma 1. A substitution φ is a solution to the typing prob- lem (Γ, e, t) if and only if the following three assertions Type-soundness of the ML(X) fragment We assume that hold: a sound operational semantics is given for the ML(X) cal- τ culus. This amounts to defining δ-reduction rules for the oτ • Γφ, eφ and tφ do not contain foreign type variables; operators which are coherent with the abstract semantics for the foreign operators. Well-typed expressions in ML(X) (in • φ is a pre-solution to the typing problem; an empty typing environment, or an environment which con- •  tains built-in ML operators) cannot go wrong. We also as- C(eφ). τ sume that the operational semantics for an oτ operator de- pends only on o, not on the annotations τ, τ. This allows us to lift the semantics of ML(X) to the full calculus. Type soundness Type soundness for our calculus is a triv- ial consequence of the type soundness assumption for the ML(X) fragment. Indeed, we can see a solution φ to a typ- ing problem (Γ, e, t) as an elaboration into a well-typed pro- Typing problems A substitution φ is an idempotent func- gram in this fragment. tion from types to types that maps type variables to types, foreign type variables to foreign types, ground foreign types to themselves, and that commutes with ML type construc- Type inference Let us consider a fixed typing problem tors. We use a post-fix notation to denote a capture-avoiding (Γ, e, t). We want to find solutions to this problem. Thanks application of this substitution to typing environments, ex- to Lemma 1, we will split this task into two different steps: pressions, types or constraints.

      A substitution φ1 is more general than a substitution φ2 if • find a most-general pre-solution φ0; φ2 = φ2 ◦ φ1. (Or equivalently, because substitutions are idempotent: there exists a substitution φ such that φ2 = φ ◦ • instantiate the remaining foreign type variables so as to φ1.) satisfy the resulting constraint.6

ε ::=                       Foreign types:
    τ                         ground foreign type
    ι                         foreign type variable

t ::=                       Types:
    P t̄                       constructed
    α                         type variable
    ε                         foreign type

e ::=                       Expressions:
    x                         program variable
    λx.e                      abstraction
    e e                       application
    let x = e in e            local definition
    (e : t)                   annotation
    ∃α.e                      existential variable
    o_ε̄^ε                     foreign operator

      Figure 1: Types and expressions
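As a companion to Figure 1, the following OCaml sketch gives one possible rendering of the same grammar as datatypes, together with the collection of the constraint C(e) described earlier (all annotated foreign operators occurring in an expression). The constructor names, the use of strings for variables, and the string stand-in for ground foreign types are assumptions made only for this sketch.

(* Foreign types ε: a ground foreign type τ (from the extension, kept
   abstract here as a string) or a foreign type variable ι. *)
type foreign_ty =
  | Ground of string
  | FVar of string

(* Types t: an ML type constructor applied to types, a type variable α,
   or a foreign type ε. *)
type ty =
  | Constr of string * ty list
  | TVar of string
  | Foreign of foreign_ty

(* Expressions e, with foreign operators carried as constants annotated
   with their input foreign types (subscripts) and output foreign type
   (superscript). *)
type expr =
  | Var of string
  | Lam of string * expr
  | App of expr * expr
  | Let of string * expr * expr
  | Annot of expr * ty
  | Exists of string * expr
  | ForeignOp of string * foreign_ty list * foreign_ty

(* C(e): collect every annotated foreign operator occurring in e. *)
let rec constraints (e : expr) : (string * foreign_ty list * foreign_ty) list =
  match e with
  | Var _ -> []
  | Lam (_, e) | Exists (_, e) | Annot (e, _) -> constraints e
  | App (e1, e2) | Let (_, e1, e2) -> constraints e1 @ constraints e2
  | ForeignOp (name, inputs, output) -> [ (name, inputs, output) ]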

Γ(x) ⪰ t
──────────
Γ ⊢ x : t

Γ, x : t1 ⊢ e : t2
────────────────────
Γ ⊢ λx.e : t1 → t2

Γ ⊢ e1 : t1 → t2    Γ ⊢ e2 : t1
─────────────────────────────────
Γ ⊢ e1 e2 : t2

Γ ⊢ e1 : t1    Γ, x : GenΓ(t1) ⊢ e2 : t2
──────────────────────────────────────────
Γ ⊢ let x = e1 in e2 : t2

Γ ⊢ e : t
───────────────
Γ ⊢ (e : t) : t

Γ ⊢ e[t0/α] : t
────────────────
Γ ⊢ ∃α.e : t

ô(τ̄) ≤ τ
────────────────────────────────
Γ ⊢ o_τ̄^τ : τ1 → ... → τn → τ

      Figure 2: Type system for the ML(X) fragment

      It is almost straightforward to adapt unification-based exist- so we prefer to stick to our design guideline that type infer- ing algorithms for ML type inference (and their implementa- ence shouldn’t be significantly more complicated than both tions) to compute a most general pre-solution if there exists ML type inference and XDuce-like type inference. XDuce a pre-solution, or to report a type error otherwise. Indeed, computes in a bottom-up way, for each sub-expression, a the typing judgment X is very close to a normal ML type type which over approximates all the possible outcomes of system. In particular, it satisfies a substitution lemma: if this sub-expression. The basic operations and their typing Γ X e : t, then Γφ X eφ : tφ for any substitution φ. discipline corresponds respectively to our foreign operators and their static semantics. XDuce’s type system uses sub- Of course, if the typing problem has no pre-solution, it has sumption only when necessary (e.g. to join the types of no solution as well. For the remaining of the discussion, the branches of a pattern matching, or when calling a func- we assume given a most general pre-solution φ0. Let us tion). So we can say that XDuce tries to compute a min- write V for the set of foreign type variables that appear in imal type for each sub-expression, by applying basic type- (Γφ0, eφ0, tφ0) and C0 for the constraint C(eφ0). checking primitives. We will do the same, and to make it work, we need some acyclicity property, which corresponds A solution to the typing problem is in particular a pre- to the bottom-up structure of XDuce’s type checker. solution. As a consequence, a substitution φ is a solution ◦ C if and only if φ = φ φ0 and if it maps foreign type vari- Definition 2. Let C be a constraint. We write ι1  ι2 if C  0 ε ables in V to ground foreign types in such a way that C φ. contains an element oε such that ι2 = ε and ι1 appears in ε. The “minimal” modification we need to bring to φ0 to get a We say that C is acyclic if the directed graph defined by this solution is to instantiate variables in V so as to validate C0. relation is acyclic. Formally, we define a valuation as a function ρ : V →T such that  C0ρ. To any valuation ρ, we can associate a solu- Our type inference algorithm only deals with the case of an tion φ defined by tφ = tφ0ρ and any solution is less general acyclic constraint C0 (this condition does not depend on the than the solution obtained this way from some valuation. In particular choice of the most general pre-solution). If the particular, a solution exists if and only if a valuation exists. condition is not met, we issue an error message. It is not So we are now looking for a valuation. a type error with respect to the type system, but a situation where the algorithm is incomplete. We won’t give a complete algorithm to check for the exis- tence of a valuation. This would lead to difficult constraint Remark. The acyclicity criterion is of course syntactical (it solving problems which might be undecidable (this of course does not depend on the semantics of constraints but on their depends on the extension X). Even if they are decidable for syntax), but it is not defined in terms of a specific inference a given extension, they might be intractable in practice and algorithm. Instead, it is defined in terms of the most-general pre-solution of an ML-like type system. In particular, it does

      42 not depend on implementation details such as the order in We can simulate partial operators by adding a new top ele- which sub-expression are type-checked. ment to the set of ground foreign types T , and by requiring the abstract semantics of operators to be such that whenever Below we furthermore assume that C0 is acyclic. We define an argument is , the result is also . Since the typing algo- →T the function ρ0 : V by the following equation: rithm infers the smallest valuation for foreign type variables,  ι we can simply look at it and check that no foreign type vari- ∀ι ∈ V. ιρ0 = {oˆ(ερ0) | o ∈ C0} ε able is mapped to . The acyclicity condition ensures that this definition is well- founded and yields a unique function ρ. Furthermore, this function is a valuation if and only if the typing problem has 5 Strengthening a solution. To check this property, only constraints whose right-hand side is a ground foreign type need to be consid- ered: As we mentioned above, we can see the type system for the τ (1) ∀oε ∈ C0. oˆ(ερ0) ≤ τ calculus as an elaboration into its ML(X) fragment, which Also, any other valuation ρ is such that: immediatly gives type soundness.
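As a concrete illustration of the forward computation of the least valuation described above, here is an OCaml sketch parameterized by a signature approximating the abstract notion of extension from Definition 1. The signature itself, the explicit bottom element, and the string-named variables are simplifying assumptions of the sketch; the real implementation works on CDuce types and on the constraint set extracted by the ML pass.

(* Simplified interface for an extension X (Definition 1): ground
   foreign types with subtyping and least upper bounds, and foreign
   operators with a monotone abstract semantics. *)
module type EXTENSION = sig
  type t
  val leq : t -> t -> bool       (* subtyping on ground foreign types *)
  val lub : t -> t -> t          (* least upper bound *)
  val bottom : t                 (* assumed least element, e.g. the empty type *)
  type op
  val apply : op -> t list -> t  (* abstract semantics ô, monotone in each argument *)
end

module ForwardComputation (X : EXTENSION) = struct
  module M = Map.Make (String)

  type arg = Ground of X.t | Var of string
  type constr = { lhs : string; op : X.op; args : arg list }
    (* represents the constraint  lhs >= op(args) *)

  let eval env = function Ground t -> t | Var v -> M.find v env

  (* Least valuation: assumes [vars] lists the foreign type variables in
     a topological order of the dependency graph, i.e. the constraint
     set is acyclic.  Each variable gets the lub of its lower bounds. *)
  let least_valuation (vars : string list) (cs : constr list) : X.t M.t =
    List.fold_left
      (fun env v ->
         let lower_bounds =
           List.filter_map
             (fun c ->
                if c.lhs = v then Some (X.apply c.op (List.map (eval env) c.args))
                else None)
             cs
         in
         M.add v (List.fold_left X.lub X.bottom lower_bounds) env)
      M.empty vars

  (* Check the remaining constraints, whose left-hand sides are ground:
     ô(ε ρ0) ≤ τ for every annotated operator with a ground output. *)
  let check (rho : X.t M.t) (ground_cs : (X.t * X.op * arg list) list) : bool =
    List.for_all
      (fun (tau, op, args) -> X.leq (X.apply op (List.map (eval rho) args)) tau)
      ground_cs
end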

      ∀ι ∈ V. ιρ0 ≤ ιρ In this section, we consider another elaboration from the cal- culus into itself. Namely, this elaboration is intended to be In other words, under the acyclicity condition, we can check used as a preprocessing pass (rewriting expressions into ex- in a very simple way whether a given typing problem has a pressions) in order to make the type system accept more pro- solution, and if this is the case, we can compute the smallest grams. We call this elaboration procedure strengthening. valuation (for the point-wise extension of the subtyping rela- tion). This computation only involves one call to the abstract The issue addressed by strengthening is a lack of implicit semantics for each application of a foreign operator in the subsumption in our calculus. We already hinted at this issue expression to be type-checked. in Section 2. We will now give more examples. Remark. In some cases, it is possible to find manifest type errors even when the constraint is not acyclic. In practice, Subsumption missing in action We consider the typing the computation of ρ0, the verification of (1), and the check problem (Γ1, e1,β) where Γ1 = {x : τ1, y : τ2, f : ∀α. α → for acyclicity can be done in parallel, e.g. with a deep-first α → α} and e1 = fxy. It admits a solution if and only graph traversal algorithm. It can detect some violation of (1) if τ1 = τ2. In a system with implicit subtyping, we might before a cycle. In this case, we know that the typing problem expect to give type τ = τ1  τ2 to both x and y, so that the has no solution, and thus a proper type error can be issued. application succeeds and the result type is τ.

      Similarly, the expression (λx.x : τ1 → τ2) is not well-typed Manually working around the incompleteness When the even if τ1 ≤ τ2 (unless τ1 = τ2). algorithm described above infers a cyclic constraint, it can- not detect whether the typing problem (Γ, e, t) has a solution or not. However, we have the following property. If a solu- A naive solution Let us see how to implement the amount tion φ exists, then we can always produce an expression e of implicit subtyping we need to make these examples type- by adding annotations to e such that the algorithm succeeds check. The following rule could be a reasonable candidate for the typing problem (Γ, e, t) and that φ is equivalent (for as an addition to the type system (we write ≤ for the new the equivalence induced by the more-general ordering) to the typing judgment): solution φ0 computed by the algorithm. Γ ≤ e : ττ≤ τ In other words, even if the algorithm is not complete (be- Γ ≤ e : τ cause of the acyclicity condition) and makes a choice be- tween most-general solutions (the smallest one for the sub- A concrete way to see this rule is that any subexpres- typing relation), for any solution to a typing problem, the sion e can be magically transformed to the application ι2 programmer can always add annotations so that the algo- idι1 e , where id is a distinguished foreign operator such rithm infers this very solution (or an equivalent one). that idˆ (τ)=τ and ι1,ι2 are fresh foreign type variables.

      The type system extended with this rule would accept the Partial operators The foreign operators were assumed to examples given above to illustrate the lack of implicit sub- be total. This means they should apply to any foreign value. sumption. However, this rule as it stands would add a lot

      43 of complexity to the type inference algorithm. As a matter (τ1  τ3) → t) → t) where e3 is from the example above. of fact, the type system would not admit most-general pre- Here, the preprocessing pass succeeds but does not change solutions anymore. We can see this on a very simple exam- the expression because no sub-expression has a foreign type ple with the typing problem ({x : τ}, x,α). We could argue in the principal type derivation. The type-scheme inferred for that a more liberal definition of being more-general should h is a pure ML type-schema, which makes the type-system allow some dose of subtyping. So let us consider the more subsequently fail on the expression. complex example Γ3 = {f : ∀α. α → α → α} and e3 = λx.λy.λz.λg.g (fxy)(fxz). In ML, the inferred type We believe that this restriction of the ≤ system is rea- scheme would be ∀α, β.α → α → α → (α → α → β) → sonnable. It can be implemented very simply by reusing β which forces the first three arguments to have the same the same type-checker as in Section 4 in a different mode type. But if the arguments turn out to be of a foreign-type, (where all the foreign types can be unified). The simple ex- another family of types for the function is possible, namely amples at the beginning of this section are now accepted. ∀β.τ1 → τ2 → τ3 → ((τ1  τ2) → (τ1  τ3) → β) → β, Indeed, the preprocessing pass transforms the expressions to and these types cannot be obtained as instances of the ML f (id x)(id y) and ((λx.id x):τ1 → τ2) respectively. This  type scheme above (we can obtain ∀β.τ1 → τ2 → τ3 → allows the type system to use subtyping where needed. ((τ1  τ2  τ3) → (τ1  τ2  τ3) → β) → β but this is less precise). Properties The strenghtening pass cannot transform a well-typed program into an ill-typed one. Note, however, that it might break the acyclicity condition if it was already A practical solution We will now describe a practical so- met. See below for a way to relax the acyclicity condition. lution. Instead of modifying the type system by adding a new subsumption rule, we will formulate the extension as a Also, if strenghtening fails, the typing problem has no pre- rewriting preprocessing pass. The rewriting consists in in- solution (for the typing judgment ), and thus no solution. serting applications of the identity foreign operator id. The However, it is not true that if it succeeds, a pre-solution nec- challenge is then to choose which sub-expressions e should essarily exists (for the new program where applications of be rewritten to id e . If we had an oracle to tell us so, the the id operators have been added). As an example, let us composition of the rewriting pass and the type system of Sec- consider the situation where Γ={x : τ1 → τ1, y : τ2 → tion 4 would be equivalent to the type system ≤. Unfortu- τ2, f : ∀α. α → α → α} and e = fxy. The preprocessing nately, we don’t have such an oracle. We could try all the succeeds, because all the foreign types are considered equal possible choices of sub-expressions, and this would give a but does not touch the expression (because no sub-expression  complete type-checking algorithm for the type system ≤. has a foreign type in a principal typing derivation). Still, the next pass of the type inference algorithm attempts to unifiy We prefer to use a computationally simpler solution. We the types τ1 and τ2 and thus fails. also expect it to be simpler to understand by the program- mer. The idea is to use an incomplete oracle. 
The oracle first runs a special version of an ML type-checker on the ex- pression to be type-checked. This type-checker identifies all Relaxing the acyclicity condition Inserting applications the foreign types together. The effect is to find out which of the id operator can break the acyclicity condition. We sub-expressions have a foreign type in a principal derivation, can actually relax this condition to deal with the id oper- that is, which sub-expression have necessarily a foreign type ator more carefully. Let us consider a constraint C with a C C in all the possible derivations. The preprocessing pass con- cycle ι1  ...  ι1, such that all the edges in this cycle ι sists in adding an application of the identity operator above come from elements of the form idι . Clearly, any valu- all these sub-expressions and only those. ation ρ such that  Cρ will map all the ιi in the cycle to the same ground foreign type. So instead of considering the The important point here is that the oracle may be overly most-general pre-solution and then face a cyclic constraint, conservative. Let us consider a type variable which has been we may as well unify all these ιi first: all the solutions can generalized in the principal derivation. In a non-principal still be obtained from this less-general pre-solution. derivation, it could have been instead instantiated to a for- eign type. If this derivation had been considered instead of The relaxed condition is: There must be no cycle in the con- the principal one, the preprocessing pass would have added straint except maybe cycles whose edges are all produced by more applications of the identity operator. Maybe this would the id operator. have been necessary in order to make the resulting expres- sion type-check. An example is given by the expression To illustrate the usefulness of the relaxed condition, let us let h = e3 in (h : τ1 → τ2 → τ3 → ((τ1  τ2) → consider the expression e = fix(λg.λx.fc(gx)) with Γ =

      44 {fix : ∀α.(α → α) → α, f : ∀α.α → α → α, c : τ}. The from the compilation unit interface (the .cmi file) is in- strengthening pass builds a principal typing derivation for e tegrated before checking the acyclicity condition. Indeed, in a type algebra where all the foreign types are identified. this information acts as additional type annotations on the Here is such a derivation, where we write for foreign types values exported by the compilation unit and can thus con- and t = α → , Γ =Γ, g : t, x : α (we collapse rules for tribute to obtaining this condition. Also, in addition to type multiple abstraction and application): annotations on expressions, OCaml provides several ways to introduce explicit type informations (and thus obtain the acyclicity condition): datatype definitions (explicit types for constructor and exception arguments, record fields), module Γ  fix :(t → t) → t signatures, type annotations on ML pattern variables. Γ  f : → → Γ  g : t Γ  x : α OCaml subtyping OCaml comes with a structural subtyp-   Γ c : Γ gx: ing relation (generated by object types and polymorphic vari- Γ  fc(gx): ants subtyping and extended structurally by considering the Γ  λg.λx.fc(gx)) : t → t variance of ML type constructors). The use of this subtyping relation in programs is explicit. The syntax is (e : t1 :> t2) Γ  e : α → (sometimes, the type t1 can be inferred) and it simply checks that t1 is a subtype of t2. Of course, the OCaml subtyping On this principal derivation, we observe three sub- relation has been extended in OCamlDuce to take XDuce expressions of a foreign type. Accordingly, strengthening in- subtyping into account. For instance, if τ1 is a XDuce sub- troduces three instances of the id operator and thus rewrites type of τ2 and e has type τ1list, then it is possible to coerce the expression to: it to type τ2 list: (e :>τ2 list). ι ι ι = (λ .λ . 2 ( ( 4 )( 6 ( )))) e fix g x idι1 f idι3 c idι5 gx The type-checker which is then applied performs some uni- Crossing the boundary In our system, XDuce values fications: ι1 = ι4 = ι6, ι2 = ι5, ι3 = τ. We can for instance are opaque from the point of view of ML and XDuce assume that the computed most-general pre-solution maps ι4 types cannot be identified with other ML type construc- and ι6 to ι1 and ι5 to ι2. The first and third instances of the tors. Sometimes, we need to convert values between the C0 two worlds. For instance, we have a foreign type String id operator in e thus generate the dependencies ι1  ι2 and C0 which is different from OCaml string. This foreign ι2  ι1. Strictly speaking, the constraint is cyclic, but we type conceptually represents immutable sequences of arbi- can break the cycle simply by unifying ι1 and ι2. The small- trary Unicode characters, whereas the OCaml type should est valuation is then given by ι1ρ = τ. We would have ob- be thought as representing mutable buffers of bytes. As tained the same solution if we had applied the type-checker a consequence, we don’t even try to collapse these two directly on e without the strengthening pass. In this exam- types into a single one. Instead, OCamlDuce comes with ple, strengthening is useless and the relaxed acyclicity condi- a runtime library which exports conversion functions such tion is just a way to break a cycle introduced by strenghten- as Utf8.make: string -> String, Utf8.get: ing. 
We can easily imagine more complex examples where String -> string, Latin1.make: string -> strenghtening is really necessary but introduces cycles that Latin1, Utf8.get: Latin1 -> string. The can be broken by the relaxed condition. type Latin1 is a subtype of String: it represents all the strings which are only made of latin-1 characters (latin- 1 is a subset of the Unicode character set). The function 6 Integration in OCaml Utf8.make checks at runtime that the OCaml string is a valid representation of a Unicode string encoded in utf-8.

      We have described a type system for basic ML expressions. Similarly, we often need to translate between XDuce’s se- Of course, OCaml is much more than an ML kernel. We quences and OCaml’s lists. For any XDuce type τ, we can found no problem to extend it to deal with the whole OCaml easily write two functions of types [τ∗] → τ list and type system, including recursive types, modules, classes, and τ list → [τ∗] (the star between square brackets denotes other fancy features. The two ML-like typing passes (the Kleene-star). Similarly, we can imagine a natural XDuce one used during strengthening and the one using for the real counterpart of an OCaml product type τ1 × τ2, namely type-checking) are done on whole compilation units (in the [τ1 τ2], and coercion functions. However, writing this kind toplevel, they are done on each phrase). The information of coercions by hand is tedious. OCamlDuce comes with

      45 built-in support to generate them automatically. This au- specification of a type system and do not study the interac- tomatic system relies on a structural translation of some tion between XDuce type-checking and ML type inference OCaml types into XDuce types: lists and array are translated (XDuce code can call ML functions but their type must be to Kleene-star types, tuples are translated to finite-length fully known). These last points are precisely the issues tack- XDuce sequences, variant types are translated to union types, led by our contribution. For instance, our system makes etc. Some OCaml types such as polymorphic or functional it possible to avoid some type annotation on non-recursive types cannot be translated. OCamlDuce comes with two XDuce functions. Another difference is that in our approach, magic unary operators to_ml, from_ml (both written {: the XDuce/CDuce type checker and back-end (compilation ...:} in the concrete syntax). The first one takes an XDuce of pattern matching) can be re-used without any modifica- value and applies a structural coercion to it in order to obtain tion whereas their approach requires a complete reengineer- an OCaml value; this coercion is thus driven by the output ing of the XDuce part (because subtyping and pattern match- type of the operator. The type-checker requires this type to ing relations must be enriched to produce ML code) and it is be fully known (polymorphism is not allowed). Similarly, not clear how some XDuce features such as the Any type the operator from_ml takes an OCaml value and apply a can be supported in a scenario of modular compilation. We structural coercion in order to obtain an XDuce value. Since believe our approach is more robust with respect to exten- the type of its input drives its behavior, the type-checker re- sions of XDuce and that the XDuce-to-ML translation can be quires this type to be fully known. seen as an alternative implementation technique for XDuce which allows some interaction between XDuce and ML (the This system can be used to obtain coercions from complex same kind of interaction as what can be achieved with the OCaml types (e.g. obtained from big mutually recursive def- CDuce/OCaml interface described above). initions of concrete types) to XDuce types, whose values can be seen as XML documents. This gives parsing from XML The Xtatic project [GP03] is another example of the inte- and pretty-printing to XML for free. gration of XDuce types into a general purpose language, namely C#. Since both C#’s and XDuce’s type checkers op- erate with bottom-up propagation (explicit types for func- tions/methods, no type inference), the structure of Xtatic 7 Related work type-checker is quite simple. The real theoretical contribu- tion is in the definition of a subtyping relation which com- bines C# named subtyping (inheritance) and XDuce set- The CDuce language itself comes with a typed interface with theoretic subtyping. Since the resulting type algebra does OCaml. The interface was designed to: (i) let the CDuce pro- not have least-upper bounds, the nice locally-complete type grammers use existing OCaml libraries; (ii) develop hybrid inference algorithm for XDuce patterns [HP02] cannot be projects where some modules are implemented in OCaml transferred to Xtatic. In Xtatic, XDuce types and C# types and other in CDuce. 
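As a toy illustration of the structural translation of OCaml types into XDuce types sketched above, the following OCaml fragment maps a small, invented fragment of type expressions: lists become Kleene-star types, tuples become finite sequences, and variants become unions of tagged elements. Both miniature ASTs and the set of handled cases are assumptions of the sketch; the real to_ml/from_ml machinery is driven by the compiler's own type representation and covers many more cases.

(* Invented miniature ASTs for the two type worlds. *)
type ml_ty =
  | MLString
  | MLList of ml_ty
  | MLTuple of ml_ty list
  | MLVariant of (string * ml_ty option) list  (* constructor name, optional argument *)

type xduce_ty =
  | XString
  | XStar of xduce_ty              (* Kleene star, [ t* ] *)
  | XSeq of xduce_ty list          (* finite sequence, [ t1 t2 ... ] *)
  | XElem of string * xduce_ty     (* element, tag[ t ] *)
  | XUnion of xduce_ty list        (* union type, t1 | t2 | ... *)
  | XEmpty                         (* empty sequence *)

(* Structural translation: lists to Kleene-star types, tuples to
   finite-length sequences, variants to unions of tagged elements. *)
let rec translate : ml_ty -> xduce_ty = function
  | MLString -> XString
  | MLList t -> XStar (translate t)
  | MLTuple ts -> XSeq (List.map translate ts)
  | MLVariant cases ->
      XUnion
        (List.map
           (fun (tag, arg) ->
              XElem (tag, match arg with None -> XEmpty | Some t -> translate t))
           cases)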
The interface is actually quite simple: are stratified, but the two algebras are mutually recursive: each monomorphic OCaml type t is mapped in a structural XDuce types can appear in class definitions and C# classes ˆ way to a CDuce type t. A value defined in an OCaml mod- can be used as basic items in XDuce regular expression ule can be used from CDuce (the compiler introduces a nat- types. This does not really introduce any difficulty because → ˆ ural translation t t). Similarly, it is possible to provide C# types are not structural. The equivalent in OCamlDuce an ML interface for a CDuce module: the CDuce compiler would be to allow OCaml abstract types as part of XDuce checks that the values exported by the module are compati- types, which would not be difficult, except for scoping rea- ble with the ML-to-CDuce translation of these types and pro- sons (abstract types are scoped by the module system). duces stub code to apply a natural translation tˆ → t to these values. This CDuce/OCaml interface is used by many CDuce In the last ten years, a lot of research effort has been put into users and served as a basis to the to_ml and from_ml oper- developping type inference techniques for extensions of ML ators described in Section 6. with subtyping and other kinds of constraints. For instance, the HM(X) framework [MOW99] could serve as a basis to Sulzmann and Zhuo Ming Lu [SL05] pursue the same ob- express the type system presented here. The main modifi- jective of combining XDuce and ML. However, their contri- cation to bring to HM(X) would be to make foreign-type bution is orthogonal to ours. Indeed, they propose a compi- variables global. Another way to express it is to disallow lation scheme from XDuce into ML such that the ML rep- constraints in type-schemes (which is what we do in the cur- resentation of XDuce values is driven by their static XDuce rent presentation). We have chosen to present our system type (implicit use of subtyping are translated to explicit coer- in a setting closer to ML so as to make our message more cions). Their type system supports in addition used-defined explicit: our system can be easily implemented on top of ex- coercions from XDuce types to ML types. However, they isting ML implementations. do not describe a type inference algorithm for their abstract

8 Conclusion and future work

We have presented a simple way to integrate XDuce into OCaml. The modification to the ML type-system is small enough to make it possible to easily extend existing ML type-checkers.

Realistic-sized examples of code have been written in OCamlDuce, such as an application that parses XML Schema documents into an internal OCaml form and produces an XHTML summary of their content. Compared to a pure OCaml solution, this OCamlDuce application was easier to write and to get right: XDuce's type system ensures that all possible cases in XML Schema are treated by pattern-matching and that no invalid XHTML output can be produced. We refer the reader to OCamlDuce's website for the source code of this application.

The main limitation of our approach is that it doesn't allow parametric polymorphism on XDuce types. Adding polymorphism to XDuce is an active research area. In previous work with Hosoya and Castagna [HFC05], we presented a solution where polymorphic functions must be explicitly instantiated. Integrating this kind of polymorphism into the same mechanism as ML polymorphism is challenging and left for future work. The theory recently developed by Vouillon [Vou06] could be a relevant starting point for such a task.

Another direction for improvement is to further relax the acyclicity conditions, that is, to accept more programs without requiring extra type annotations. Once the set of constraints representing XML data flow and operations has been extracted by the ML type-checker, we could use techniques which are more involved than simple forward computation over types. The static analysis algorithm used in Xact [KMS04] could serve as a starting point in this direction.

Acknowledgments. The author would like to thank Didier Rémy and François Pottier for fruitful discussions about the design and formalization of type systems.

References

[BCF03] V. Benzaken, G. Castagna, and A. Frisch. CDuce: an XML-friendly general purpose language. In ICFP '03, 8th ACM International Conference on Functional Programming, pages 51–63, Uppsala, Sweden, 2003. ACM Press.
[Dam85] Luis Manuel Martins Damas. Type assignment in programming languages. PhD thesis, University of Edinburgh, Scotland, April 1985.
[Fri04] Alain Frisch. Théorie, conception et réalisation d'un langage de programmation fonctionnel adapté à XML. PhD thesis, Université Paris 7, December 2004.
[GP03] Vladimir Gapeyev and Benjamin C. Pierce. Regular object types. In European Conference on Object-Oriented Programming (ECOOP), Darmstadt, Germany, 2003.
[HFC05] Haruo Hosoya, Alain Frisch, and Giuseppe Castagna. Parametric polymorphism for XML. In POPL, 2005.
[HM03] Haruo Hosoya and Makoto Murata. Boolean operations and inclusion test for attribute-element constraints. In Eighth International Conference on Implementation and Application of Automata, 2003.
[Hos00] Haruo Hosoya. Regular Expression Types for XML. PhD thesis, The University of Tokyo, Japan, December 2000.
[Hos04] Haruo Hosoya. Regular expression filters for XML. In Programming Languages Technologies for XML (PLAN-X), 2004.
[HP00] Haruo Hosoya and Benjamin C. Pierce. XDuce: A typed XML processing language. In Proceedings of the Third International Workshop on the Web and Databases (WebDB 2000), 2000.
[HP02] Haruo Hosoya and Benjamin C. Pierce. Regular expression pattern matching for XML. Journal of Functional Programming, 13(4), 2002.
[HP03] Haruo Hosoya and Benjamin C. Pierce. A typed XML processing language. ACM Transactions on Internet Technology, 3(2):117–148, 2003.
[HVP00] Haruo Hosoya, Jérôme Vouillon, and Benjamin C. Pierce. Regular expression types for XML. In ICFP '00, volume 35(9) of SIGPLAN Notices, 2000.
[KMS04] Christian Kirkegaard, Anders Møller, and Michael I. Schwartzbach. Static analysis of XML transformations in Java. IEEE Transactions on Software Engineering, 30(3):181–192, March 2004.
[L+01] Xavier Leroy et al. The Objective Caml system release 3.08: Documentation and user's manual, 2001.
[Mil78] Robin Milner. A theory of type polymorphism in programming. Journal of Computer and System Sciences, 1978.
[MOW99] Martin Odersky, Martin Sulzmann, and Martin Wehr. Type inference with constrained types. TAPOS, 5(1), 1999.
[SL05] Martin Sulzmann and Kenny Zhuo Ming Lu. A type-safe embedding of XDuce into ML. In The 2005 ACM SIGPLAN Workshop on ML, 2005.
[Vou06] Jérôme Vouillon. Polymorphic regular tree types and patterns. In POPL, 2006. To appear.

Polymorphism and XDuce-style patterns

Jérôme Vouillon, CNRS and Université Paris 7, [email protected]

Abstract

We present an extension of XDuce, a programming language dedicated to the processing of XML documents, with polymorphism and abstract types, two crucial features for programming in the large. We show that this extension makes it possible to deal with first class functions and eases the interoperability with other languages. A key mechanism of XDuce is its powerful pattern matching construction, and we mainly focus on this construction and its interaction with abstract types. Additionally, we present a novel type inference algorithm for XDuce patterns, which works directly on the syntax of patterns.

1. Introduction

XDuce [14] is a programming language dedicated to the processing of XML documents. It features a very powerful type system: types are regular tree expressions [15] which correspond closely to the schema languages used to specify the structure of XML documents. The subtyping relation is extremely flexible as it corresponds to the inclusion of tree automata. Another key feature is a pattern matching construction which extends the algebraic patterns popularized by functional languages by using regular tree expressions as patterns [13].

In this paper, we aim at extending in a seamless way the XDuce type system and pattern construction with ML-style prenex polymorphism and abstract types. These are indeed crucial features for programming in the large in a strongly typed programming language. In our extension, patterns are not allowed to break abstraction. This crucial property makes it possible to embed first class functions and foreign values in a natural way into XDuce values.

In another paper [21], we present a whole calculus dealing with polymorphism for regular tree types. Though most of the results in that paper (in particular, the results related to subtyping) can be fairly easily adapted for an extension of XDuce, a better treatment of patterns is necessary. Indeed, a straightforward application of the results would impose severe restrictions on patterns. For instance, binders and wildcards would be required to occur only in tail position. The present paper is therefore mostly focused on patterns and overcomes these limitations.

Additionally, we present a novel type inference algorithm for XDuce patterns, which works directly on the syntax of patterns, rather than relying on a prior translation to tree automata. This way, better type error messages can be provided, as the reported types are closer to the types written by the programmer. In particular, type abbreviations can be preserved, while they would be expanded by the translation into tree automata.

The paper is organized as follows. We introduce the XDuce type system (section 2) and present the extension (section 3). Then, we formalize patterns (section 4) and provide algorithms for checking patterns and performing type inference (section 5). Related works are presented in section 6.
2. A Taste of XDuce

XDuce values are sequences of elements, where an element is characterized by a name and a contents. (Elements may also contain attributes, both in XDuce and XML. We omit attributes here for the sake of simplicity.) This contents is itself a sequence of elements. These values correspond closely to XML documents, such as this address book example.

<addrbook>
  <person><name>Haruo Hosoya</name><email>hosoya</email></person>
  <person><name>Jerome Vouillon</name><tel>123</tel></person>
</addrbook>

XDuce actually uses a more compact syntax, which we also adopt in this paper:

addrbook[
  person[name["Haruo Hosoya"], email["hosoya"]],
  person[name["Jerome Vouillon"], tel["123"]]]

The shape of values can be specified using regular expression types. A sequence of elements is described using a regular expression. Mutually recursive type definitions make it possible to deal with the nested nature of values. Here are the type definitions for address books.

type Addrbook = addrbook[Person*]
type Person = person[Name,Email*,Tel?]
type Name = name[String]
type Email = email[String]
type Tel = tel[String]

These type definitions can be read as follows. An Addrbook value is an element with name addrbook containing a sequence of any number of Person values. A Person value is an element with name person containing a Name value followed by a sequence of Email values and optionally a Tel value. Values of type Name, Email, and Tel are all composed of a single element containing a string of characters.

There is a close correspondence between regular expression types and tree automata [5]. As the inclusion problem between tree automata is decidable, the subtyping relation can be simply defined as language inclusion [15]. This subtyping relation is extremely powerful. It includes associativity of concatenation (type A,(B,C) is equivalent to type (A,B),C) and distributivity rules (type A,(B|C) is equivalent to type (A,B)|(A,C)).

In order to present the next examples, we find it convenient to use the following parametric type definition for lists:

type List{X} = element[X]*

Parametric definitions are not currently implemented in XDuce, but are a natural extension and can be viewed as just syntactic sugar: all occurrences of List{T} (for any type T) can simply be replaced by the type element[T]* everywhere in the source code.

Another key feature of XDuce is regular expression patterns, a generalization of the algebraic patterns popularized by functional languages such as ML. These patterns are simply types annotated with binders. Consider for instance this function which extracts the names of a list of persons.

fun names (lst : Person*) : List{String} =
  match lst with
    () --->
      ()
  | person [name [nm : String], Email*, Tel?],
    rem : Person* --->
      element [nm], names (rem)

The function names takes an argument lst of type Person* and returns a value of type List{String}. The body of the function is a pattern matching construction. The value of the argument lst is matched against two patterns. If it is the empty sequence, then it will match the first pattern () (the type () is the type of the empty sequence ()), and the function returns the empty sequence. Otherwise, the value must be a non-empty sequence of type Person*. Thus, it is an element of name person followed by a sequence of type Person*, and matches the second pattern. This second pattern contains two binders nm and rem which are bound to the corresponding parts of the value.

Some type inference is performed on patterns: the type of the expression being matched is used to infer the type of the values that may be bound to a binder. By taking advantage of this, the function names can be rewritten more concisely using wildcard patterns (XDuce actually uses the pattern Any as a wildcard pattern) as follows. The types of the binders nm and rem are inferred to be respectively String and Person* by the compiler.

fun names (l : Person*) : List{String} =
  match l with
    () --->
      ()
  | person [name [nm : _], _], rem : _ --->
      element [nm], names (rem)

3. Basic Ideas

We want to extend regular expression types and patterns with ML-style polymorphism (with explicit type instantiation) and abstract types. Such an extension is interesting for numerous reasons. First, it makes it possible to describe XML documents in which arbitrary subdocuments can be plugged. A typical example is the SOAP envelope. Here is the type of SOAP messages and of a function that extracts the body of a SOAP message.

type Soap_message{X} =
  envelope[header[...], body[X]]
fun extract_body :
  forall{X}. Soap_message{X} ---> X

A more important reason is that polymorphism is crucial for programming in the large. It is intensively used for collection data structures. As an example, we present a generic map function over lists. This function has two type parameters X and Y.

fun map{X}{Y}
    (f : X ---> Y)(l : List{X}) : List{Y} =
  match l with
    () --->
      ()
  | element[x : _], rem : _ --->
      element[f(x)], map{X}{Y}(f)(rem)

When using a polymorphic function, type arguments may have to be explicitly given, as shown in the following expression where the function map is applied to the identity function on integers and to the empty list:

map{Int}{Int} (fun (x : Int) ---> x)().

Indeed, it is possible to infer type arguments in simple cases, using an algorithm proposed by Hosoya, Frisch and Castagna [12], but not in general, as a best type argument does not necessarily exist: the problem is harder in our case due to function types which are contravariant on the left.
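For comparison, here is the ordinary ML-style prenex-polymorphic map over OCaml lists that the XDuce map{X}{Y} above mirrors; this sketch is ours and is only meant to highlight the correspondence with ML polymorphism.

    (* ML-style prenex polymorphism: the type parameters 'a and 'b play the role
       of the explicit XDuce type parameters {X}{Y}, but are inferred by ML. *)
    let rec map (f : 'a -> 'b) (l : 'a list) : 'b list =
      match l with
      | [] -> []
      | x :: rest -> f x :: map f rest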
Abstract types facilitate interoperability with other languages. Indeed, we can consider any type from the foreign language as an abstract type as far as XDuce is concerned. (We consider here ML as the foreign language, as XDuce is currently implemented in OCaml; but this would apply equally well to other languages.) For instance, the ML type int can correspond to some XDuce type Int. This generalizes to parametric abstract types: to the ML type int array would correspond the polymorphic XDuce type Array{Int}. Furthermore, if the two languages share the same representation of functions, ML function types can be mapped to XDuce function types (and conversely). Thus, for instance, a function of type int--->int can be written in either language and used directly in the other language without cumbersome conversion.

In order to preserve abstraction and to deal with foreign values that may not support any introspection, some patterns should be disallowed. For instance, this function should be rejected by the type checker as it tries to test whether a value x of some abstract type Blob is the empty sequence.

fun f (x : Blob) : Bool =
  match x with
    () ---> true
  | _ ---> false

Another restriction is that abstract types cannot be put directly in sequences. Indeed, it does not make sense to concatenate two values of the foreign language (two ML functions, for instance). In order to be put into a sequence, they must be wrapped in an element. As a type variable may be instantiated to an abstract type, and as we want to preserve abstraction for type variables too, the same restrictions apply to them: a pattern a[],X,b[] implicitly asserts that the variable X stands for a sequence, and thus would limit its polymorphism.

There are different ways to deal with type variables and abstract types occurring syntactically in patterns. The simplest possibility is not to allow them. Instead, one can use wildcards and rely on type inference to assign polymorphic types to binders. This approach is taken in the related work by Hosoya, Frisch and Castagna [12]. Another possibility is to consider that type variables should behave as the actual types they are instantiated to at runtime. This is a natural approach, but it implies that patterns do not preserve abstraction. It is also challenging to implement this efficiently, though it may be possible to get good results by performing pattern compilation (and optimization) at run-time. Finally, it is not clear in this case how abstract types should behave in patterns. We propose a middle ground, by restricting patterns so that their behaviors do not depend on what type variables are instantiated to, and on what abstract

      50 types stand for. In other words, patterns are not allowed to break () ---> ... abstraction. As a consequence, type variables can be compiled as | element[x : _], rem : List{X} ---> ... wildcards. In other words, type variables and abstract types occur- { } ring in patterns can be considered as annotations which are checked where l has type List X . at compile time but have no effect at run-time. We indeed feel it is Such a grammar-based syntax of patterns is convenient for writ- interesting to allow type variables and abstract types in patterns. A ing patterns but typically does not reflect their internal represen- first reason is that it is natural to use patterns to specify the parame- tation in a compiler. For instance, it assumes a notion of pattern { } ters of a functions. And we definitively want to put full types there. names (such as List X or Name) which may be expanded away For instance, we should be able to write such a function: at an early stage by the compiler. Binders may also be represented in a different way. Finally, this notation is not precise about sub- fun apply{X}{Y} pattern identity: for instance, in the pattern a[ ]|b[ ], it in not (funct[f : X ---> Y], arg[x : X]) : Y=f(x) clear whether one should consider the two occurrences of the wild- Another reason is that one may want to reuse a large type definition card pattern as two different subpatterns, or as a single subpattern. containing abstract types in a pattern, and it would be inconvenient The distinction matters as a compiler usually does not identify ex- to have to duplicate this definition, replacing abstract types with pressions which are structurally equal. In particular, one should be wildcards. Finally, the check can be implemented easily: the type careful not to use any termination argument that relies on struc- inference algorithm can be used to find the type of the values that tural equality. Another reason is that we need to be able to specify may be matched against any of the type variables occurring in the precisely how a value is matched by a pattern. This is especially pattern, so one just has to check that this type is a subtype of the important for type inference (section 4.9), where we get a different type variable (this usually means that the type is either empty or result depending on whether we infer a single type for both occur- equal to the type variable, but some more complex relations are rences of the wildcard pattern or a distinct type for each occurrence. possible, as we will see in section 4.3). Thus, we define a more abstract representation of patterns which provides more latitude for actual implementations. A pattern is a rooted labeled directed graph. Intuitively, this graph can be under- 4. Specifications stood as an in-memory representation of a pattern: nodes stands for We now specify our pattern matching construction, starting from memory locations and edges specify the contents of each memory the data model, continuing with types and patterns, before finally location. To be more accurate, a pattern is actually a hypergraph, dealing with the whole construction. 
as edges may connect a node to zero, one or several nodes: for in- stance, for a pattern (), there is an (hyper)edge with source the 4.1 Values location of the whole pattern and with no target, while for a pattern We assume given a set of names l and a set of foreign values e.A P,Q, there is an (hyper)edge connecting the location of the whole value v is either a foreign value or a sequence f of elements l[v] pattern to the location of subpatterns P and Q. (with name l and contents v). We assume given a family of name sets L,asetoftype vari- ables X and a set X of binders x. Formally, a pattern is a quadruple v ::= e foreign value (Π,φ,π0, B) of f sequence f ::= l[v],...,l[v] • afinitesetΠ of pattern locations π; • φ :Π→C(Π) We write for the empty sequence,andf,f for the concatenation a mapping from pattern locations to pattern p ∈C(Π) of two sequences f and f . components , defined below; Note that strings of characters can be embedded in this syntax • a root pattern location π0 ∈ Π. by representing each character c as an element whose name is this • a relation B⊆X×Π between binders and pattern locations. very character and whose contents is empty: c[]. This encoding was introduced by Gapeyev and Pierce [9]. Pattern components C(Π) aredefinedbythefollowinggrammar, parameterized over the set Π of pattern locations. 4.2 Patterns p ::= L[π] element pattern We start by two comments clarifying the specification of patterns. empty sequence pattern First, in all the examples given up to now, in a pattern element π, π pattern concatenation L[T], the construction L stands for a single name. It actually corre- π ∪ ...∪ π pattern union sponds in general to a set of names. This turns out to be extremely π∗ pattern repetition convenient in practice. For instance, this can be used to define char-  wildcard acter sets (remember that characters are encoded as names). Sec- X type variable ond, abstract types and type variables are very close notions. Es- sentially, the distinction is a difference of scope: an abstract type Binders do not appear directly in patterns. Instead they are specified stands for a type which is unknown to the whole piece of code by a relation between binder names and pattern locations. This considered, while a type variable has a local scope (typically, the allows us to simplify significantly the presentation of the different scope of a function). Thus, for patterns, we can unify both no- algorithms on patterns. Indeed, most of them simply ignore binders. tions. Parametric abstract types can be handled by considering each As an example, the two patterns: of their instances as a distinct type variable. Thus, the two types { } Array{Int} and Array{Bool} correspond each to a distinct type () and element[x:_],rem:List X variable in our formalization of patterns. Similarly, each function can be formally specified respectively as: type T 2 --->T1 corresponds to a distinct type variable. We explain in section 4.3 how subtyping can be expressed for these types. (Π,φ,1, ∅) and (Π,φ,2, {(x, 4), (rem, 5)}) As a running example, we consider the pattern matching code in function map: where the set of pattern locations is: match l with Π={1, 2, 3, 4, 5, 6, 7}
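As a concrete reading of this definition, here is a minimal OCaml sketch (our own type names, not the paper's) of the quadruple (Π, φ, π0, B) and of the pattern components C(Π), instantiated on the running example above.

    (* A pattern is a quadruple: a finite set of locations, a mapping from
       locations to components, a root location, and a binder relation.
       We simply use integers for locations and an association list for phi. *)
    type loc = int
    type name_set = string list              (* a name set L, kept naive here *)
    type component =
      | Element of name_set * loc            (* L[pi]            *)
      | Eps                                  (* empty sequence   *)
      | Concat of loc * loc                  (* pi, pi'          *)
      | Union of loc list                    (* pi1 | ... | pin  *)
      | Star of loc                          (* pi*              *)
      | Wildcard                             (* _                *)
      | Var of string                        (* type variable X  *)
    type pattern = {
      phi : (loc * component) list;          (* phi : Pi -> C(Pi) *)
      root : loc;                            (* pi0               *)
      binders : (string * loc) list;         (* the relation B    *)
    }

    (* The pattern  element[x : _], rem : List{X}  of the running example,
       with locations 2..7 as in the paper. *)
    let example : pattern = {
      phi = [ (2, Concat (3, 5)); (3, Element (["element"], 4)); (4, Wildcard);
              (5, Star 6); (6, Element (["element"], 7)); (7, Var "X") ];
      root = 2;
      binders = [ ("x", 4); ("rem", 5) ];
    }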

      51 the patterns a[],X and X*,whereX is a pattern variable, are re- 1 jected. Indeed, the semantics of a pattern variable may contain for- eign values, which cannot be concatenated. These two restrictions , element[ ]  2 3 4 are formally specified using a well-formedness condition. First, we define when a pattern location is in a sequence (figure 2). Then, ∗ element[ ] we define the well-formedness condition for pattern locations (fig- 5 6 7 X ure 3). There is one rule per pattern component. For all rules but one, in order to deduce that a pattern location is well-formed, one Figure 1. Graphical Depiction of Two Patterns must first show that its subpatterns are themselves well-formed. This ensures that there is no cycle. The exception is the rule for element patterns L[π], hence cycles going through elements are allowed. The rule for type variables X additionally requires that φ(π)=π,π φ(π)=π,π φ(π)=π∗ the pattern location is not in a sequence. Finally, a pattern is well- π seq π seq π seq formed if all its locations are. These restrictions could also have been enforced syntactically [11, 17], but we prefer to keep the syn- π seq φ(π)=π1 ∪ ...∪ πn tax as simple and uniform as possible. In the remainder of this pa- per, all patterns are implicitly assumed to be well-formed. πi seq 4.3 Typing Environments Figure 2. Locations in a Sequence π seq In order to provide a semantics to patterns, we assume given a class of binary relations between values and types variables, which we call typing environments. Equivalently, we can consider a typing π wf π wf environment as a function from type variables to their semantics φ(π)=L[π ] φ(π)= φ(π)=π ,π which is a set of values). We have two motivations for restricting π wf π wf π wf ourselves to a class of such relations rather than allowing all rela- tions. First, some type variables may have a fixed semantics, iden- π1 wf ... πn wf tical in all typing environments. This makes it possible to define φ(π)=π1 ∪ ...∪ πn φ(π)= the type of a function T 2 --->T1 (assuming that T1 and T2 are pure π wf π wf regular expression types, without type variables). The semantics of some type variables may also be correlated to the semantics of other type variables. For instance, the semantics of the type Array{X} ¬(π seq) φ(π)=X π wf φ(π)=π ∗ depends on the semantics of the type variable X. Second, for se- π wf π wf mantic reasons, the semantics of any type, and thus the semantics of type variables, may be required to satisfy some closure proper- Figure 3. Location Well-Formedness π wf ties. This is the case for instance in the ideal model [16].

      4.4 Pattern Matching and the mapping from pattern locations to pattern components is In order for the algorithms presented in this paper to be imple- the function φ defined by: mentable, the family of name sets L should be chosen so that the φ(1) = following predicates are decidable: φ(2) = 3, 5 φ(3) = [4] φ(4) =  element • lL φ(5) = 6∗ φ(6) = element[7] φ(7) = X the inclusion of a name in a name set: ; • the non-emptiness of a name set: L(that is, there exists a name (We write element for the name set containing only the name l such that lL); element.) A graphical depiction of the formal representation of the • the non-disjointness of two name sets: L1 1 L2 (that is, there two patterns is given in figure 1. The two root locations 1 and 2 are l lL lL circled. Edges are labeled with the corresponding component. One exists a name such that 1 and 2). can see three kind of edges on this picture: the edges with labels , Furthermore, for technical reasons (see section 4.7), there must be  and X have no target; one edge with label , has two targets 3 anameset containing all names. and 5 and corresponds to the component 3, 5; some edges with The semantics of a pattern (Π,φ,π0, B) is given in figure 4 label *orelement[ ] has a single target. Note that the locations 5 using inductive rules. It it parameterized over a typing environment, to 7 correspond to the expansion of type List{X}. that is a relation vXwhich provides a semantics to each Not all patterns are correct. The most important restriction is type variable. We define simultaneously the relation vπ(the that cycles are not allowed except when going through an ele- value v matches the pattern location π) and a relation f∗ π (the ment L[π ]: for instance, the pattern sequence f matches a repetition of the pattern location π). Then, a v Balanced = a[], Balanced, b[] value matches a whole pattern if it matches its root location, that is, vπ0. should be rejected, while the pattern A match of a value v against a location π is a derivation of vπ Tree = leaf[] | node[Tree, Tree] . Given such a match, we define the submatches as the set of assertions v π which occur in the derivation. These submatches is accepted. This restriction ensures that the set of values matching indicate precisely which parts of the value is associated to each a given pattern is a regular tree language3. The other restriction is that pattern variables should not occur in sequences. For instance, in two steps. The first step would be a semantics in which values contain variables matching the variables in the pattern. With this initial semantics, 3 Actually this is not quite accurate due to type variables. In order to state the denotation of a pattern would indeed be a regular tree language. The the regularity property precisely, the semantics of patterns should be defined second step would correspond in substituting values for the type variables.

[Figure 4: Matching a Value against a Pattern — rules MATCH-ELEMENT, MATCH-EPS, MATCH-CONCAT, MATCH-UNION and MATCH-WILD. Figure 5: Value Decomposition. Figure 6: Automaton Semantics — rule LABEL-TRANS.]
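To make the declarative rules of Figure 4 concrete, here is a naive OCaml sketch (ours, not the paper's implementation) that decides whether a value matches a pattern location by trying every decomposition. It restates the representation sketched earlier in compact form, takes the semantics of type variables as a predicate env, and assumes the well-formedness condition of Section 4.2 (no cycles except through elements), which guarantees termination.

    (* Values: either a foreign value or a sequence of named elements. *)
    type value = Foreign of string | Seq of (string * value) list
    type loc = int
    type component =
      | Element of string list * loc | Eps | Concat of loc * loc
      | Union of loc list | Star of loc | Wildcard | Var of string
    type pattern = { phi : (loc * component) list; root : loc }

    (* All ways to split a sequence into two consecutive parts (MATCH-CONCAT
       is non-deterministic, so we simply try them all). *)
    let splits f =
      let rec go acc = function
        | [] -> [ (List.rev acc, []) ]
        | x :: r -> (List.rev acc, x :: r) :: go (x :: acc) r
      in
      go [] f

    let rec matches p env v pi =
      match List.assoc pi p.phi, v with
      | Element (names, pi'), Seq [ (l, v') ] ->
          List.mem l names && matches p env v' pi'             (* MATCH-ELEMENT *)
      | Eps, Seq [] -> true                                    (* MATCH-EPS *)
      | Concat (p1, p2), Seq f ->                              (* MATCH-CONCAT *)
          List.exists (fun (f1, f2) ->
              matches p env (Seq f1) p1 && matches p env (Seq f2) p2)
            (splits f)
      | Union pis, _ -> List.exists (matches p env v) pis      (* MATCH-UNION *)
      | Star pi', Seq f -> matches_rep p env f pi'             (* MATCH-REP *)
      | Wildcard, _ -> true                                    (* MATCH-WILD *)
      | Var x, _ -> env x v                                    (* MATCH-ABSTRACT *)
      | _ -> false
    and matches_rep p env f pi' =
      match f with
      | [] -> true                                             (* MATCH-REP-EPS *)
      | _ ->
          (* peel off a non-empty prefix matching pi', then repeat on the rest *)
          List.exists (fun (f1, f2) ->
              f1 <> [] && matches p env (Seq f1) pi' && matches_rep p env f2 pi')
            (splits f)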

      EPS-TRANS CCEPT A MATCH-ABSTRACT MATCH-REP σ ; σ v ∈ σ σ ↓ v vX φ(π)=X f π φ(π)=π∗ ∗ v ∈ σ v ∈ σ vπ fπ Figure 6. Automaton Semantics v ∈ σ ATCH EP NCE MATCH-REP-CONCAT M -R -O MATCH-REP-EPS fπ f∗ πf∗ π ∗ π f π f,f  π ∗ ∗ of patterns, we now define a notion of tree automata, which we call bidirectional automata. These automata are used in particular Figure 4. Matching a Value against a Pattern vπ to specify which patterns should be rejected. They capture the idea that a value is matched from its root and that a sequence is matched one element at a time from its extremities. Still, some location in the pattern. They can thus be used to define which value freedom is left over the implementation. In particular, the automata to associate to each binder during pattern matching. do not mandate any specific strategy (such as left to right) for We choose to use a non-deterministic semantics: there may the traversal of sequences. This is achieved thanks to an original be several ways to match a value against a given pattern. The feature of the automata: at each step of their execution, the matched reasons are twofold. First, this yields much simpler specifications sequence may be consumed from either side. This symmetry in the and algorithms. Second, we don’t want to commit ourselves to a definition of automata results in symmetric restrictions on patterns: particular semantics. Indeed, we may imagine that the programmer if a pattern is disallowed, then the pattern obtained by reversing is allowed to choose between different semantics, such as a first- the order of all elements in all sequences is also disallowed. We match policy (Perl style) or a longest match policy (Posix style). believe this is easier to understand for a programmer. Additionally, Our algorithms will be sound in both cases, without any adaptation this feature is a key ingredient for our type inference algorithm. needed. Formally, a bidirectional automaton is composed of 4.5 Types • a finite set Σ of states σ; A pattern specifies a set of values: the set of values which matches • an initial state σ0 ∈ Σ; this pattern. So, patterns can be used as types. More precisely, we • asetoflabeled transitions σ −→ δ L,σ,σ; define a type as a pattern (Π,φ,π0, B) with no wildcard  (that is, • asetofepsilon transitions σ ; σ; φ(π) is different from the wildcard  for all locations π ∈ Π)and • σ ↓ v no binder (the relation B is empty). an immediate acceptance relation . The wildcard has a somewhat ambiguous status: it stands for The transitions are annotated by a tag δ ∈{l, r} which indicates any value when not in a sequence, but only stands for sequence on which side of the matched sequence they take place: either on values when it occurs inside a sequence. For instance, the values the left (tag l) or on the right (tag r). The semantics of automata accepted by a pattern ,P are not the concatenations of the values is given in figure 6: the relation v ∈ σ specifies when a value v accepted by pattern and pattern P, as some values in pattern can- is accepted by a state σ of the automaton. A value is accepted not be concatenated. Due to this ambiguous status, type inference by a whole automaton if it is accepted by its initial state σ0.The would be more complicated if the wildcard pattern was allowed in rule LABEL-TRANS states that, starting from a goal v ∈ σ,a δ types. 
labeled transition σ −→ L,σ1,σ2 may be performed provided that v δ l 4.6 Subtyping the value decomposes itself on side into a element with name and contents v1 followedbyavaluev2 (value decomposition is We define a subtyping relation <: in a semantic way on the loca- l 0 specified in figure 5). The name must furthermore be included tions π1 ∈ Π1 and π2 ∈ Π2 of two patterns P1 =(Π1,φ1,π1, B1) L v1 ∈ σ1 0 in the name set . One then gets two subgoals and and P2 =(Π2,φ2,π2, B2) by π1 <: π2 if and only if, for all typ- v2 ∈ σ2.TheruleEPS-TRANS moves to another state of the ing environments and for all values v, the assertion vπ1 implies automaton while remaining on the same part of the value. Usually, the assertion vπ2. Two patterns are in a subtype relation, writ- automata have a set of accepting states, which all accept the empty ten P1 <: P2, if their root locations are. The actual algorithmic sequence . Here, we use an accepting relation, so that a state subtyping relation used for type checking does not have to be as may accept whole values at once (rule ACCEPT). This is necessary precise as this semantics subtyping relation. This will simply result to deal with type variables X that match a possibly non-regular in a loss of precision. set of values and with foreign values e which are not sequences. The use of an epsilon transition relation simplify the translation 4.7 Bidirectional Automata and Disallowed Matchings from patterns to automata. It also keeps the automata smaller. In the previous section, the semantics of patterns is specified in Indeed, eliminating epsilon transitions may make an automaton a declarative way. In order to clarify the operational semantics quadratically larger. Note that our automata are non-deterministic.
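A possible OCaml rendering of these automaton states (our own names; the paper leaves the representation abstract) is sketched below: a state is a finite sequence of pattern locations, possibly wrapped as non-binding repetitions or non-binding wildcards, and every transition carries the side tag δ ∈ {l, r}.

    type loc = int
    type side = L | R                        (* the tag delta: left or right end *)

    (* One item s of a state: a pattern location pi, a non-binding repetition
       *pi, or the non-binding wildcard. *)
    type item = Single of loc | Rep of loc | Wild

    (* A state sigma is a finite sequence [s; ...; s]; the initial state is
       [Single root]. *)
    type state = item list

    (* Labeled transition: consume, on side delta, an element whose name lies in
       the name set; match its contents against one state and the remaining
       sequence against the other. *)
    type labeled_transition = {
      source : state;
      tag : side;
      names : string list;
      contents : state;
      rest : state;
    }

    (* Epsilon transitions rewrite a state on one side without consuming input;
       the immediate-acceptance relation (a state accepts a whole value at once)
       is left abstract here, since it depends on the typing environment. *)
    type epsilon_transition = { eps_source : state; eps_tag : side; eps_target : state }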

[Figure 7 (excerpt): example of bidirectional automaton. Figure 8 (excerpt): epsilon-transition rules DEC-EPS, DEC-CONCAT, DEC-UNION, DEC-REP and DEC-WILDCARD.]

[Figure 7: Example of Bidirectional Automaton. Figure 8 (excerpt): rules DEC-REP-LEFT, DEC-REP-RIGHT and DEC-REP-EPS.]

      Not all patterns could be translated into deterministic automata as DEC-WILDCARD-EPS top down deterministic tree automata are strictly less powerful than [3] ;δ [] non-deterministic ones [5]. An example of bidirectional automaton is depicted in figure 7. DEC-LEFT DEC-RIGHT This automaton recognizes the sequence a[],b[]. It has four states σ ;l σ σ ;r σ Σ={ab,a,b,} ab . The initial state is circled. The labeled l r δ σ ; σ ; σ ; σ transitions are all of the form σ −→ L,,σ and are represented σ; σ ; σ ; σ by an arrow from state σ to state σ with label the pair δ, L.There is no epsilon transition. The acceptance relation, not represented, is Figure 8. Epsilon Transitions σ ;δ σ reduced to ↓ . In order to recognize the sequence a[],b[], one can first consume a[] from the left and then the remaining part b[] from either side, or consume the part b[] before the part a[]. φ(π)=L[π] The automata we build below satisfy some commutation prop- [3] −→δ  ,[3],[3] erties, which ensure that the strategy used to match a value is not [π] −→δ L,[π],[] important. For instance, one can choose to consume values only from the left, or only from the right, or any combination of these l r two strategies. In all cases, the set of accepted values remain the σ −→ L,σ1,σ2 σ −→ L,σ1,σ2 l r same. We do not state these properties. σ; σ −→ L,σ1,(σ2; σ ) σ ; σ −→ L,σ1,(σ ; σ2) We now specify the translation of a pattern (Π,φ,π0, B) into an automaton. This translation is inspired by some algorithms by δ Hopkins [10] and Antimirov [2] for building a non-deterministic Figure 9. Labeled Transitions σ −→ L,σ,σ automaton using partial derivatives of a regular expression. The way we apply the same operations symmetrically on both sides of a pattern is inspired by Conway’s factors [6, 18]. At the root vX φ(π)=X []↓ [3] ↓ e of all these works is Brzozowski’s notion of regular expression [π] ↓ v derivatives [3]. The key idea for the translation is that each state corresponds to a regular expression that exactly matches what is accepted by the Figure 10. Immediate Acceptance Relation σ ↓ v state, and a transition corresponds to a syntactic transformation of a regular expression into the regular expression of the next state. defined in figure 8, 9, and 10. The assertion σ ; σ holds when In our case, one may have expected pattern locations to take the l r role of regular expression. As they are not flexible enough, we either assertion σ ; σ or σ ; σ holds. Note that the definition actually use finite sequences of pattern locations (plus some non- of the immediate acceptance relation depends on the typing envi- binding variants). Thus, a state σ of the automaton is defined by the ronment. following grammar. As there is an infinite number of sequences σ,wedefinethe finite set of states Σ of the automaton as the set of sequences s ::= π single pattern reachable from the initial state [π0] of the automaton through the ∗π non-binding pattern repetition 3 transitions. The following lemma states that we define this way a non-binding wildcard finite automaton. σ ::= [s; ...; s] pattern sequence LEMMA 1 (Finite Automaton). The number of states σ reachable We write []for the empty pattern sequence, and σ; σ for the con- from the initial state [π0] through the transition relations is finite. catenation of the pattern sequences σ and σ. The intuition behind non-binding variants is the following. Suppose we match a value The number of states can however be exponential in the size of the a[],a[],a[] against a pattern A*. 
As we will see, this pattern re- pattern due to sharing. A typical example is the type definitions duces to something akin to A,A* by epsilon transitions. According below. to the semantics of patterns, the beginning of the value a[] is in- type T=a[],a[] and U=T,T and V=U,U deed bound to the location of subpattern A, but the remaining part a[],a[] is not bound to any location. Thus, the subpattern A* does We expect the state of the automata to be reasonable in practice. not correspond to a pattern location, but rather to a repetition of the Indeed, for patterns without sharing of locations, the bound is much location of the subpattern A. better: it is quadratic in the size of the pattern. The initial state of the automaton is the sequence [π0] containing An example of translation is given in figure 11. The pattern only the root π0 of the pattern. The epsilon transitions, the labeled a[],b[] is represented using the same notation as in figure 1. For transitions and the immediate acceptance relation are respectively the sake of simplicity, we do not represent the part of the automa-

[Figure 11 (excerpt): the pattern a[],b[] and its translation. Figure 13 (excerpt): rules MATCH-SEQ-SINGLE, MATCH-SEQ-STAR, MATCH-SEQ-EPS, MATCH-SEQ-CONCAT and MATCH-SEQ-WILDCARD for matching a value against a pattern sequence.]

[Figure 13: Matching a Value against a Pattern Sequence. Automaton states [1], [2; 4] and [] from the translation of Figure 11.]

      r,b r,a gorithmic. Still, we are confident it can be understood intuitively [2] by a programmer. We now relate the semantics of a pattern to the semantics of its translation into an automaton. It is convenient to first extend the Figure 11. Pattern and its Translation (Simplified) semantics of pattern locations to pattern sequences (figure 13). We then have the following result.

      LEMMA 2. A value is matched by a pattern if and only if it is φ(π) is a test σ ; σ v  σ matched by the corresponding automaton, as long as the matching is allowed: if v  [π0] does not hold, then v ∈ [π0] if and only if e  [π] v  σ vπ0. More generally, for any value v and any state σ such that v  σ does not hold, we have v ∈ σ if and only if vσ. δ δ σ −→ L,σ1,σ2 v −→ l,v1,v2 The restriction to allowed matchings is important. Indeed, consider lL v1  σ1 the pattern (),_. It matches only sequences but it is translated into v  σ an automaton that matches everything, as the empty sequence is eliminated by epsilon transitions (rule DEC-CONCAT followed by δ δ σ −→ L,σ1,σ2 v −→ l,v1,v2 rule DEC-EPS together with rule DEC-LEFT). lL v2  σ2 Our automata are actually designed for analyzing patterns rather v  σ than for being executed. They make it possible to focus on a partic- ular part of a pattern by consuming subpatterns from both sides. For instance, if we have a pattern A,B,C, we can focus on B by consum- v  σ Figure 12. Disallowed Matching ing A on the left and C on the right. Thus, type inference can be per- formed by consuming a type and a pattern in a synchronized way in order to find out which parts of the type corresponds to which parts ton corresponding to the element contents. The initial state of the of the pattern. For instance, if we have a type a[],T,b[] and a pat- automaton is the sequence [1]. An epsilon transition yields from tern a[],(x : ),b[], we can compute that the type of the vari- this state to the state [2; 4]. Then, values can either be consumed able x is T by simultaneously consuming the elements a[] and b[] from the left or from the right. The first case correspond to a tran- of the type and the pattern. For this to work, it must be possible to sition [2; 4] −→ l a,[3],[4], depicted as an arrow from state [2; 4] to associate a state to each part of a value matched by a pattern. As state [4] with label l,a. a consequence, there is a slight mismatch between our definition Some matchings of a value against a pattern should not be and what should be an actual implementation of patterns. First, the allowed, either because they are not implementable, or because rule for type variables in the definition of the acceptance relation is they would break abstraction. As the automata describe the opera- important for analyzing patterns but would not be used in an actual tional semantics of patterns, they are the right tool to specify which implementation, where matching against a type variable should al- [3] matchings should be rejected. This disallowed matching relation ways succeed. Second, when in state , only foreign values are v  σ is defined in figure 12. Automaton matching can be viewed immediately accepted while sequence values are progressively de- as a dynamic process: for matching a value v against a pattern se- composed. Thus is crucial for type inference but cannot be imple- quence σ, we start from the assertion v ∈ σ and try to consume mented: foreign values cannot be tested and thus an implementation the whole value by applying repeatedly the rules in figure 6. We cannot adopt a different behavior depending on whether a value is a should never arrive in a position where a test needs to be performed sequence or a foreign value. A simple change is sufficient to adapt [3] on an external value. 
Therefore, in the definition of the disallowed the automaton: make the state accept any value and remove any matching relation, there is one rule corresponding to epsilon transi- transition from this state. Note that this change does not affect the tions and two rules corresponding to labeled transitions, depending disallowed matching relation. on whether the failure occurs in the element contents or in the se- quence but outside this element. The last rule corresponds to an 4.8 Pattern Matching Construction immediate failure, where a test is performed on an external value. We can now complete our specification of pattern matching. We The following pattern components are tests: L[π], , (π, π),andπ∗. are only interested in how a value is matched in a pattern matching Basically, a test is a pattern component that only accepts sequences. construction: which branch is selected and which values are associ- For this last rule, we only need to consider the case when the pat- ated to the binders in this branch. We do not consider what happens tern sequence contains a single pattern location. Indeed, one can afterwards. Thus, we can ignore the body of each branch of the easily show that the only way to arrive to a sequence which is not construction and can formalize a pattern matching construction as of this form is through epsilon transitions, starting from a sequence a list of patterns. It turns out to be convenient to share between all of this form. This specification of disallowed matchings is quite al- patterns a set of pattern locations and a mapping from pattern lo-

      55 cation to pattern components. Therefore, a pattern construction is • preservation of the semantics: for all typing environments and characterized by: for all values v in T ,thevaluev is matched the same way by each pattern Pi and its erasure; • a set of pattern locations Π; • a mapping φ :Π→C(Π); By “matched the same way”, we mean that, if there is a derivation of vπi in one of the patterns Pi, then there must be an identical • afamily(πi) of root pattern locations (πi ∈ Π); derivation in the erasure of pattern Pi (except for applications of • (B ) B ⊆X×Π a family i of binder relations ( i ) rule MATCH-ABSTRACT which should be replaced by applications of rule MATCH-WILD), and conversely. Algorithms for perform- The i-th pattern is defined as Pi =(Π,φ,πi, Bi). For instance, the pattern construction in the body of the function map presented ing all these checks are presented section 5. The linearity check in section 4.2 can be specified by reusing the corresponding def- algorithm is actually omitted as it is standard and its presentation is long. initions of the set Π and mapping φ and defining (πi) and (Bi) by 4.9 Type Inference π1 =1 B1 = ∅ π =2 B = {( , 4), ( , 5)} An additional operation we are interested in is type inference: 2 2 x rem we want to compute for each binder a type which approximates In order to type-check a pattern construction, the type T of the the set of values that may be bound to it. From the semantics values that may be matched by the pattern must be known. In of the pattern matching construction above, we can derive the our example, this type is List{X}, which can be represented as following characterization of this set of values. Consider an input 0 0 apattern(R, ψ, 1, ∅) with type (Π1,φ1,π1, ∅) and a pattern (Π2,φ2,π2, B2). Then, a value v x v R = {1, 2, 3} may be bound to a binder if there exists a value 0 and a location π ∈ Π2 such that: 0 and • v0 π1 (the value v0 belongs to the input type); • 0 ψ(1) = 2∗ ψ(2) = element[3] ψ(3) = X. v0 π2 (the value is matched by the pattern); • (x, π) ∈B2 (the binder x is at location π in the pattern); The semantics of pattern matching is as follows. Given a 0 • there exists a derivation of v0 π2 containing an occurrence of value v0 belonging to the input type T ,apatternPi is chosen such the assertion vπ(the assertion vπis a submatch). that the value v0 matches the root location πi of the pattern, that is, so that there exists a derivation of v0 πi. We then consider all Several algorithms for precise type inference have been pro- submatches, that is, all assertions vπwhich occur in this deriva- posed [7, 13, 20]. These algorithms are tuned to a particular match- tion. This defines a relation M between locations and values. The ing policy (such as the first-match policy). With these algorithms, composition M◦B= {(x, v) |∃π.(x, π) ∈B∧(π, v) ∈M}of the semantics of the type computed for a binder is exactly the set of this relation with the binder relation B is then expected to be a total values that may be bound to it. (As binders are considered indepen- function from the set of binders of the pattern to parts of value v. dently, any correlation between them is lost, though.) For instance, This function indicates which part of the value v is bound to each let us consider the following function. binder x. In order to ensure that this matching process succeeds for any fun f (x : (a[] | b[] | c[])) = value of the input type T , the following checks must be performed: match x with b[] ---> ... 
• exhaustiveness: for all typing environments and for all values v | y : (a[] | b[] | d[]) ---> ... in the input type T , there must exists a pattern Pi such that the |_ --->... value v matches the root location πi of this pattern; A precise type algorithm infers the type a[] for the binder y.In- • v linearity: for all typing environments, for all values in the deed, values of type b[] are matched by the first line of the pat- T vπ π input type and for all derivations i where i is the tern. Therefore, only values of type the difference between type P M◦X root location of one of the patterns i, the composition a[]|b[]|c[] and type b[], that is, type a[]|c[] may be matched defined above must be a function. by the second pattern. Finally, the values matching the second pat- These two checks are standard [13]. In our case, two additional tern must also have type a[]|b[]|d[], hence their type is the in- checks must be performed. Indeed, some matchings are not allowed tersection of a[]|c[] and a[]|b[]|d[],thatis,a[]. Such a type in order to preserve abstraction and for the patterns to be imple- algorithm is implemented in CDuce and was initially implemented mentable. Furthermore, patterns are not implemented directly but in XDuce. Difference is costly to implement. Besides, though this is not only after erasure. We define the erasure of a pattern (Π,φ,π0, B) apparent in the example above, difference operations may need as the pattern (Π,φ ,π0, B) where: ( to be performed at many places in the pattern, especially when  if φ(π) is a type variable X binders are deeply nested. Hosoya proposed a simpler design [11], φ (π)= remarking that with a non-deterministic semantics (in other words, φ(π) otherwise. when the matching policy is left unspecified) no difference oper- An erased pattern is a pattern containing no pattern variable (that ation needs to be performed. An intersection operation still needs is, φ(π) is different from any variable X for all locations π in the to be performed, but only once per occurrence of a binder. So, in pattern). The semantics remain the one given above, but applied to our example, the second line of the pattern still matches values of the erased patterns. We thus have these two additional checks: type b[]. Therefore, the type of y is the intersection of the initial type a[]|b[]|c[] and the type a[]|b[],thatis,a[]|b[]. • allowed patterns: the erasure of each pattern Pi should be In our case, even the intersection operations must be avoided. allowed with respect to the input type T , that is, we must not Indeed, our types are not closed under intersection: for instance, have v  πi for any value v in the input type T , any erasure of there is no type that corresponds to the intersection of two type pattern Pi, and any typing environment. variables. Xtatic has the same issue [8, section 5.3]. The current

      56 implementation of Xtatic thus computes an approximation of the ROOT intersection. Another reason to avoid intersection is that it is not a 0 0 [π1]  [π2] syntactic operation on types in XDuce. Thus, in order to compute an intersection, types must first be translated to automata and the DEC-LEFT DEC-RIGHT intersection must be translated back from an automaton to a type. σ1 σ2 σ1 ; σ1 σ1 σ2 σ2 ; σ2 In the process, the type may become more complex. In the worst σ σ σ σ case, the size of the intersection of two automata is quadratic in the 1 2 1 2 size of these automata. Also, some type abbreviations may be lost ENTER during the successive translations. σ1 σ2 What we propose is to infer types not for binders but for wild- δ L 1 L σ −→ L ,σ ,σ cards _ and compute the type of binders by substitutions. The key 1 2 1 1 1 1 δ idea is that the intersection of a type with a wildcard is the type it- σ1 ↔ σ2 σ2 −→ L2,σ2,σ2 self. Thus, no intersection is actually needed. Consider for instance σ1 σ2 the function below. fun g (x : (a[],b[])) = SHIFT σ σ match x with 1 2 δ y:(_,(b[]|c[])) ---> ... L1 1 L2 σ1 −→ L1,σ1,σ1 δ The type inferred for the wildcard is a[]. Thus, by substitution, the σ1 ↔ σ2 σ2 −→ L2,σ2,σ2 type inferred for the binder y is a[],(b[]|c[]). We deliberately σ1 σ2 gave an example for which the inferred type is not precise, in order to emphasize the difference with other specifications of type infer- Figure 14. Type Propagation σσ ence. We expect this weaker form of type inference to perform well in practice. In particular, type inference is still precise for wild- cards (assuming a non-deterministic semantics). When needed, the programmer can provide explicitly a more precise type. We exper- imented with the examples provided with the XDuce distribution. 5.1 Exhaustiveness Only some small changes were necessary to get them to compile. The input of the algorithm is the input type T =(R, ψ, π, ∅) What we actually had to do was to replace by wildcards some ex- and the different patterns Pi =(Π,φ,πi, Bi) of the pattern plicit types which were not precise enough. construction. We define the union of the patterns Pi by P = More formally, we define the semantics of a location π of the (Π ∪{ },φ, ,∅) where the location is assumed not to be in Π pattern as the set of values v such that there exists a value v0 such 0 0 and the mapping φ is such that φ ( )=π1 ∪ ...∪ πn (the union that the assertions v0 π1 and v0 π2 holds and the assertion 0 of all root locations) and φ (π)=φ(π) for π ∈ Π One can easily vπis a submatch of a derivation of v0 π2. The type inference prove that the semantics of the pattern P is the union of the seman- algorithm then consists in computing for each location correspond- tics of the patterns Pi. Then, the pattern is exhaustive if and only if ing to a wildcard a type whose semantics is the semantics of this T<: P . location and substituting this type in place of the wildcard. The sub- Note that the union construction above can be applied to any fi- stitution may not preserve pattern well-formedness. In this case, the nite set of patterns sharing a common mapping φ. This construction type checking fails. But we believe this is unlikely to occur in prac- is also used for type inference (section 5.6). tice, as this can only happen when a wildcard location is shared in two different contexts. For instance, consider the type a[X] and the pattern a[Q],Q where Q = . 
The type inferred for the wildcard is 5.2 Type Propagation X|() and substituting this type does not preserve well-formedness. If the resulting pattern is well-formed, then it is a type: it does not This algorithm propagates type information in a pattern. It is used contain any wildcard. The type of a binder is the type correspond- both for checking whether a pattern is allowed (section 5.4) and for type inference (section 5.6). The input of the algorithm is composed ing to the union of the locations the binder is associated to. 0 0 of two patterns P1 =(Π1,φ1,π1, B1) and P2 =(Π2,φ2,π2, B2) σ ↔ σ σ σ 5. Algorithms on Patterns and a relation 1 2 (where the sequences 1 and 2 range over the states of the automata associated respectively to P1 and P2). We define a number of algorithms for type checking and type The relation controls when the type information is propagated inference for patterns. Each of these algorithms is specified in an across an element. The algorithm is defined in figure 14 as a relation abstract way, by defining a relation over a finite domain using σ1 σ2. The roots of the two patterns are related (rule ROOT). inductive definitions. Actually implementing them is a constraint The relation is preserved by epsilon transition (rules DEC-LEFT solving issue. Standard techniques can be used, such as search and DEC-RIGHT). The rules ENTER and SHIFT specify how the pruning (when an assertion is either obviously true or obviously relation is propagated to the contents of an element and aside an false), memoization (so as not to perform the same computation element. several times), and lazy computation (in order not to compute Though the rules are symmetric, the algorithm is used in an unused parts of the relation). asymmetric way. One of the pattern is actually always a type and The size of the finite domain provides a bound on the complex- the algorithm can be read as propagating type information derived ity of the algorithm. We don’t study the precise complexity of these from this type into the other pattern. Besides, we are not interested algorithms, as we believe this would not be really meaningful. In in computing the whole relation σ1 σ2. Rather, for some given particular, the complexity of all these algorithms is polynomial in sequences σ1, the set of pattern sequences σ2 such that σ1 σ2 the sizes of the automata associated to the patterns it operates on, must be computed. but these sizes can be exponential in the size of the patterns. Our As the algorithm is defined as a binary relation σ1 σ2 over experience on the subject leads us to believe that the algorithms the states of the automata associated to the patterns P1 and P2,itis should perform well in practice. quadratic in the size of these automata.
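The union construction used by the exhaustiveness check can be pictured with a small OCaml sketch (our own helper names, with the other pattern components elided): a fresh location is added and mapped to the union of all the root locations, while the shared mapping φ is kept unchanged.

    type loc = int
    type component = Union of loc list | Other   (* other components elided *)
    type pattern = { phi : (loc * component) list; root : loc }

    (* Build the union P of patterns P1 ... Pn sharing the same mapping phi:
       one fresh location, mapped to the union of their root locations. *)
    let union_pattern (ps : pattern list) : pattern =
      let shared_phi = match ps with p :: _ -> p.phi | [] -> [] in
      let fresh = 1 + List.fold_left (fun m (l, _) -> max m l) 0 shared_phi in
      { phi = (fresh, Union (List.map (fun p -> p.root) ps)) :: shared_phi;
        root = fresh }

    (* The pattern construction is then exhaustive iff  T <: union_pattern ps,
       for the subtyping relation of Section 4.6 (not implemented here). *)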

[Figures 15 and 16 (excerpts): inference rules for the non-emptiness of pattern locations and pattern sequences, and for the non-disjointness relation between automaton states.]

      l φ(π)=π1,π2 π1 π2 σ1 1 σ2 σ1 −→ L1,σ1,σ1 l π σ1 1 σ2 σ2 −→ L2,σ2,σ2 L1 1 L2 φ(π)=π1 ∪ ...∪ πn πi φ(π)=π ∗ σ1 1 σ2 π π φ(π)=X []1 [] φ(π)= φ(π)=X [π] 1 [3] π π σ 1 σ Pattern sequence Figure 16. Non-Disjointness π σ σ  []  [∗π]  [3] 1 2  [π] σ1; σ2 THEOREM 4. If the pattern P is disallowed with respect to the type T , then there exists two locations π1 ∈ Π1 and π2 ∈ Π2 φ(π ) X π Figure 15. Non-Emptiness πand σ such that 1 is a type variable , the location 2 is a test (as defined in section 4.7), and [π1]  [π2]. The converse holds in any typing environment such that for all type variables X there exists a foreign value e such that eX. 5.3 Type Non-Emptiness The restriction in the converse case ensures that the relation σ In order to check whether a pattern is allowed (section 5.4), it turns really coincide with the non-emptiness of the sequence σ.Italso out that we need an algorithm to decide whether, given a pattern ensures that if φ(π1)=X and π2 is a test, then there exists a P =(Π,φ,π , B) π 0 , the semantics of a pattern location or a foreign value e such that e  π2. From this and [π1]  [π2], one pattern sequence σ (belonging to the set of states of the automaton can then show that the whole pattern is disallowed. associated to pattern P ) is empty, that is, whether there exists a Ana¬õve implementation of the algorithm would compute all v vπ vσ value such that or . These algorithms are defined pairs of locations π1 and π2 such that [π1]  [π2] and then π σ in figure 15 as two relations and . Their properties can be checks whether there exists a pair for which both φ(π1) is a type stated as follows. variable X and the location π2 is a test. This implementation would have the same complexity as the type propagation algorithm. An LEMMA 3. Let π be a location in pattern P and σ be a state of the immediate optimization is to stop propagating type information P v automaton associated to pattern . If there exists a value such whenever a type location π1 with no type variable below it is that vπ,thenπ. Likewise, if there exists a value v such that reached. vσ,thenσ. The converse holds in any typing environment such that for all type variables X there exists at least a value v such that 5.5 Non-Disjointness vX. The type inference algorithm relies on an algorithm that, given a type The proof of the lemma is straightforward. The reason for the 0 restriction in the converse case can be seen on the last rule in T =(Π1,φ1,π1, B1) figure 15: if φ(π)=X,thenwehaveπ. We thus need to ensure andanerasedpattern v vX that there exists a value such that . 0 The inference rules define a relation π over the finite set of P =(Π2,φ2,π2, B2), Π P pattern locations in pattern . Each rule can be implemented in decides the non-disjointness of the semantics of two pattern se- constant time. Hence, computing the relation for all locations in quences σ1 and σ2 belonging to the states of the automata asso- P a pattern can be done in linear time in the size of the pattern . ciated respectively to the type T and the pattern P . The pattern P σ Likewise, the relation can be computed in linear time in the is assumed to be allowed with respect to type T . The algorithm is P size of the automaton associated to the pattern . defined in figure 16 as a relation σ1 1 σ2. It is based on a standard algorithm for checking non-disjointness of tree automata, with a 5.4 Disallowed Pattern special case for type variables X. Note that only transitions with 0 l The algorithm checking whether a pattern P =(Π2,φ2,π2, B2) tag are used. 
The relation would remain unchanged if this restric- 0 is allowed with respect to an input type T =(Π1,φ1,π1, B1) is tion was removed. σ 1 σ based on an instance of the type propagation algorithm (section 5.2) The intended semantics of the algorithm is that 1 2 if v vσ vσ applied to the type T and the pattern P . For this instance, we take and only if there exists a value such that 1 and 2. σ σ σ1 ↔ σ2 iff σ1. Intuitively, if we have a type L[T1],T2 and However this does not hold for arbitrary sequences 1 and 2.The apatternL’[P1],P2 such that the sets L and L’ are not disjoint, completeness of the algorithm can be stated as follows. the type information T1 should be propagated in the pattern P1,but LEMMA 5 (Completeness). Let σ1 and σ2 be two states of the only if there is indeed a value of type L[T1],T2, thus in particular automata associated respectively to the type T and the pattern P . only if the semantics of type T2 is not empty. On the other hand, We assume that: as the implementation of the automaton may try to match a value • against the subpattern P1 before considering the subpattern P2, the pattern P is allowed with respect to type T; nothing should be assumed about P2. The algorithm relies on the • σ1 σ2 (where this relation is the instance defined in sec- following theorem. tion 5.4 for checking for disallowed patterns);

      58 • the typing environment is such that for all type variables X σ1 σ2 can be improved by stopping the propagation of type there exists at least a value v such that vX. information whenever one reach a pattern sequence σ2 with no location whose type needs to be inferred below it. Then, if σ1 1 σ2, there exists a value v such that vσ1 and vσ 2. 5.7 Preservation of the Semantics The condition σ1 σ2 together with the allowed pattern condition The input of the type inference algorithm is a type [π] ensures that, for instance, one never considers the type sequence 0 with φ(π)=X against the pattern sequence [3; 3] for which the T =(Π1,φ1,π1, B1) algorithm may give a wrong answer: one has [π] 1 [3; 3] as the [3; 3] [3] and a pattern sequence reduces by epsilon transition to the sequence , 0 but there is not reason for the sequences [π] and [3; 3] to share P =(Π2,φ2,π2, B2). a common value. The constraint on typing environments can be The algorithm is as follows. For each location π in the pattern P easily understood by looking at the rule concerning type variables. corresponding to a type variable X (that is, φ1(π)=X), a type The soundness property is hard to state. Rather than defining T is computed using the type inference algorithm on the era- precisely when it holds, which would involve defining an additional sure of pattern P . We then compare this type with the part of the complex relation, we use the following somewhat imprecise state- pattern P corresponding to location π, that is, the pattern P = ment. (Π2,φ2,π,B2). The semantics is preserved if for all such loca- tions π we have T <: P . The soundness of the algorithm relies LEMMA 6 (Soundness). Let σ1 and σ2 be two sequences of the automaton respectively associated to the input type and the input on the following theorem. σ 1 σ pattern. In any position where the relation 1 2 is used in THEOREM 8. The semantics of the pattern is preserved by erasure the type inference algorithm below (section 5.6), if there exists a if and only if for any pattern location π2 such that φ(π2)=X (for v vσ vσ σ 1 σ value such that 1 and 2,then 1 2. any type variable X) and any type location σ1 such that σ1 π2 P σ <:[π ] The algorithm is quadratic in the size of the automata associated to (by type inference on the erasure of pattern ), we have 1 2 . the type T and the pattern P . 6. Related Works 5.6 Type Inference As mentioned in the introduction, we presented in a previous pa- The input of the type inference algorithm is a type per [21] a calculus dealing with polymorphism for regular tree 0 T =(Π1,φ1,π1, B1) types. Values are binary trees rather than sequences of elements. It is straightforward to translate sequences into binary trees by rep- andanerasedpattern resenting an element contents as a node whose first child is its con- 0 tents and second child is its right sibling. A similar translation can P =(Π2,φ2,π2, B2). be defined to some extent for patterns. But there is a number of The pattern P is assumed to be allowed with respect to the type T . restrictions. In particular, wildcards and binders should only occur The algorithm is based on an instance of the type propagation in tail position. The present paper deals directly with sequences, algorithm (section 5.2) applied to the type T and the pattern P .For which makes it possible to avoid these restrictions. Additionally, this instance, we take σ1 ↔ σ2 iff σ1 1 σ2. 
Intuitively, if we have we specify patterns in a more precise way: we believe very few ex- a type L[T1],T2 and a pattern L’[P1],P2, the type information T1 tensions to patterns, besides support for XML attributes, would be should be propagated in the pattern P1, but only if there is indeed a necessary for a realistic implementation. value of the whole type matched by the whole pattern. In particular, Hosoya, Frisch and Castagna have also proposed an extension there should be a value shared by the type T2 and the pattern P2. of XDuce with polymorphism [12], now implemented in the latest We rely on the following result. release. In their work, type variables range over sets of basic XDuce values rather than over sets of arbitrary values. This results in THEOREM 7. If a value v is included in the semantics of a loca- design decisions which are drastically different from ours. For tion π2 of the pattern, then there exists a sequence σ1 such that instance, they consider that pattern matching on values whose type σ1  [π2]. The converse holds in any typing environment such is a type variable is possible (as the structure of all such values can that for all type variables X there exists at least a value v such that be explored by pattern matching), while we consider that this would vX. break abstraction. They can deal with bounded quantification. On The type inferred for a subpattern π2 should thus corresponds to the other hand, it is not clear how to extend their work to deal with the union of the sequences σ1 such that σ1  [π2]. One can show foreign types and higher-order functions. that for any such sequence σ1, as it belongs to the set of states of Sulzmann and Lu propose to use a structured representation of an automaton associated to a type, one can build a type T1 with XDuce values [19] and interpret subtyping as a runtime coercion. the same semantics. Then, the type inferred for the subpattern π2 As types reflect the structure of values, they do not have the issue can be build by taking the union of these types, as defined in of concatenating foreign values: the values of type A,B are pairs, section 5.1. rather than concatenations of values of type A and type B.However, The algorithm is complete only when all type variables have a they may need to use algorithms similar to ours in order to ensure non-empty semantics. It may be possible to get a stronger result that pattern matching interact well with polymorphism. by extending our type system with conditional types [1]. But we Several type inference algorithms have been proposed for reg- believe this would unnecessarily complicate the type system. ular expression types. The first one [13], by Hosoya and Pierce, is An interesting feature of this algorithm is that it works directly precise (assuming a first-match policy) but can infer a type only for on the syntax of patterns and types. In particular, the inferred binders in tail position in the pattern. Hosoya later proposed a sim- type is build from the input type using only simple operations pler design [11], corresponding to a non-deterministic semantics (concatenation and union). for patterns, where this restriction was removed. Both algorithms As in the case of the “disallowed pattern” check (section 5.4), use a translation of types and patterns into tree automata. Several ana¬õve implementation which would compute the whole relation algorithms for precise type inference for different match policies

      59 have also been presented by Vansummeren [20]. They work di- [21] J. Vouillon. Polymorphic regular tree types and patterns. In Pro- rectly on the syntax of patterns but require complex operations on ceedings of the 33th ACM Conference on Principles of Programming types such as intersection and difference. Languages, Charleston, USA, Jan. 2006. To appear. Available from CDuce [4] has some extensive support for importing functions http://www.pps.jussieu.fr/~vouillon/publi/. from OCaml. Contrary to what we propose in section 3, their extension relies on a runtime translation of ML values into CDuce values according to their types.
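The relation of Figure 16 is, as stated in Section 5.5, based on the standard non-disjointness test for tree automata. The following OCaml sketch illustrates only that standard test, over an assumed and deliberately simplified bottom-up automaton representation (leaf rules a → q and node rules a(q1, q2) → q); it is not the algorithm on pattern and type locations developed in the paper, and every name in it is hypothetical.

    (* Assumed, simplified bottom-up tree automata over binary trees. *)
    type 'q automaton = {
      leaf_rules : (string * 'q) list;             (* label, target state *)
      node_rules : (string * 'q * 'q * 'q) list;   (* label, left, right, target *)
    }

    (* Non-disjointness of state q1 of a1 and state q2 of a2: is there a tree
       accepted from both?  Least fixpoint over pairs of states, in the style
       of the product construction. *)
    let non_disjoint a1 a2 q1 q2 =
      let inhabited = Hashtbl.create 64 in
      let changed = ref true in
      while !changed do
        changed := false;
        (* leaf case: both automata accept the same one-node tree *)
        List.iter (fun (l1, p1) ->
            List.iter (fun (l2, p2) ->
                if l1 = l2 && not (Hashtbl.mem inhabited (p1, p2)) then begin
                  Hashtbl.add inhabited (p1, p2) (); changed := true
                end)
              a2.leaf_rules)
          a1.leaf_rules;
        (* node case: combine already inhabited pairs under a common label *)
        List.iter (fun (l1, a, b, p1) ->
            List.iter (fun (l2, c, d, p2) ->
                if l1 = l2
                   && Hashtbl.mem inhabited (a, c)
                   && Hashtbl.mem inhabited (b, d)
                   && not (Hashtbl.mem inhabited (p1, p2)) then begin
                  Hashtbl.add inhabited (p1, p2) (); changed := true
                end)
              a2.node_rules)
          a1.node_rules
      done;
      Hashtbl.mem inhabited (q1, q2)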

      References [1] A. Aiken, E. L. Wimmers, and T. K. Lakshman. Soft typing with conditional types. In POPL ’94: Proceedings of the 21st ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 163Ð173, New York, NY, USA, 1994. ACM Press. [2] V. Antimirov. Partial derivatives of regular expressions and finite automaton constructions. Theor. Comput. Sci., 155(2):291Ð319, 1996. [3] J. A. Brzozowski. Derivatives of regular expressions. J. ACM, 11(4):481Ð494, 1964. [4] CDuce Development Team. CDuce Programming Language User’s Manual. Available from http://www.cduce.org/ documentation.html. [5] H. Comon, M. Dauchet, R. Gilleron, F. Jacquemard, D. Lugiez, S. Ti- son, and M. Tommasi. Tree automata techniques and applications. Available on: http://www.grappa.univ-lille3.fr/tata, 1997. release October, 1rst 2002. [6]J.H.Conway.Regular Algebra and Finite Machines. William Clowes and Sons, 1971. [7] A. Frisch, G. Castagna, and V. Benzaken. Semantic subtyping. In 17th IEEE Symposium on Logic in Computer Science, pages 137Ð146. IEEE Computer Society Press, 2002. [8] V. Gapeyev, M. Y. Levin, B. C. Pierce, and A. Schmitt. The Xtatic experience. In Workshop on Programming Language Technologies for XML (PLAN-X), Jan. 2005. University of Pennsylvania Technical Report MS-CIS-04-24, Oct 2004. [9] V. Gapeyev and B. C. Pierce. Regular object types. In European Conference on Object-Oriented Programming (ECOOP), Darmstadt, Germany, 2003. A preliminary version was presented at FOOL ’03. [10] M. W. Hopkins. Converting regular expressions to non-deterministic finite automata, May 1992. Newsgroup message on comp.theory. [11] H. Hosoya. Regular expression pattern matching — a simpler design. Technical Report 1397, RIMS, Kyoto University, 2003. [12] H. Hosoya, A. Frisch, and G. Castagna. Parametric polymorphism for XML. In POPL ’05: Proceedings of the 32nd ACM SIGPLAN- SIGACT sysposium on Principles of programming languages, pages 50Ð62. ACM Press, 2005. [13] H. Hosoya and B. C. Pierce. Regular expression pattern matching. In ACM SIGPLANÐSIGACT Symposium on Principles of Programming Languages (POPL), London, England, 2001. Full version in Journal of Functional Programming, 13(6), Nov. 2003, pp. 961Ð1004. [14] H. Hosoya and B. C. Pierce. XDuce: A statically typed XML processing language. ACM Transactions on Internet Technology, 3(2):117Ð148, May 2003. [15] H. Hosoya, J. Vouillon, and B. C. Pierce. Regular expression types for XML. In Proceedings of the International Conference on Functional Programming (ICFP), 2000. [16] D. MacQueen, G. Plotkin, and R. Sethi. An ideal model for recursive polymorphic types. Information and Control, 71(1-2):95Ð130, 1986. [17] M. Murata. Hedge automata: a formal model for XML schemata. http://www.xml.gr.jp/relax/hedge_nice.html, 2000. [18] G. Sittampalam, O. de Moor, and K. F. Larsen. Incremental execution of transformation specifications. SIGPLAN Not., 39(1):26Ð38, 2004. [19] M. Sulzmann and K. Z. M. Lu. A type-safe embedding of xduce into ml. In ACM SIGPLAN Workshop on ML, informal proceedings, Sept. 2005. [20] S. Vansummeren. Type inference for unique pattern matching. ACM Transactions on Programming Languages and Systems (TOPLAS), 2003. To appear.

Composing Monadic Queries in Trees

Emmanuel Filiot (INRIA Futurs, Lille), Joachim Niehren (INRIA Futurs, Lille), Jean-Marc Talbot (LIFL, Lille), Sophie Tison (LIFL, Lille)

Abstract

Node selection in trees is a fundamental operation to XML databases, programming languages, and information extraction. We propose a new class of querying languages to define n-ary node selection queries as compositions of monadic queries. The choice of the underlying monadic querying language is parametric. We show that compositions of monadic MSO-definable queries capture n-ary MSO-definable queries, and distinguish an MSO-complete n-ary query language that enjoys an efficient query answering algorithm. Moreover, our language allows composing different monadic query languages, for instance a monadic query defined by a Datalog program with another monadic query defined by an XPath node formula.

Compositions of monadic MSO-definable queries are relevant to information extraction. They might be useful to approach an open question in the context of the Lixto system [6], of how to enhance such a system by machine learning techniques. Given a composition formula, and examples of n-tuples that are to be selected, one can use existing learning algorithms for monadic MSO-definable queries [3] in order to infer n-ary MSO-definable queries.

1 Introduction

Node selection in trees [12] is a fundamental operation to XML databases, programming languages, and information extraction. Node selection captures the matching aspect of tree transformations. Iterated node selection can be used to navigate through input trees while producing output data structures.

From the database perspective, node selection is usually viewed as a querying problem [13, 15, 10]. The W3C standard querying language XPath provides descriptions of monadic queries, i.e. queries that define sets of nodes in trees. XPath queries are used by the W3C standard languages XQuery and XSLT for defining XML document transformations.

Modern programming languages support node selection in trees via pattern matching, for instance all functional programming languages of the ML family (Caml, SML, Haskell). Tree patterns with n capture variables define n-ary queries, i.e. queries that select sets of n-tuples of nodes. The XML programming languages XDuce [11] and CDuce [4, 2] support more expressive recursive tree patterns.

Information extraction tasks for the Web can frequently be reduced to defining n-ary queries in HTML or XML trees. Gottlob et al. [9] proposed monadic Datalog as a querying language for this purpose, and showed that it captures monadic MSO-definable queries [8]. Their Lixto system [6, 1] provides a visual interface by which to specify monadic queries in monadic Datalog, and to compose them into n-ary queries. Composed monadic queries are defined in Elog, a binary Datalog language.

The paper is organized as follows. We recall the definitions of n-ary MSO-definable queries in trees in Section 2, introduce languages of compositions of monadic queries in Section 3, and discuss some instances in Section 4. We study their expressiveness in Section 5 and their algorithmic complexity in Section 6, including algorithms for the model-checking and the query answering problems, and prove the satisfiability problem to be NP-hard. Finally, in Section 7 we propose a fragment of the language, study its expressiveness, and give an efficient algorithm for the query answering problem.

2 Node Selection Queries in Trees

We recall the definition of n-ary MSO-definable queries in trees. We develop our theory for binary trees. This will be sufficient to deal with unranked trees, since unranked trees can be viewed as binary trees via a firstchild-nextsibling encoding [12].

We consider binary trees as acyclic digraphs with labeled nodes and ordered children. We start with a finite set Σ of node labels. A binary Σ-tree t ∈ TΣ is a finite rooted acyclic directed graph, whose nodes are labeled in Σ. Every node is connected to the root by a

      61 unique path. All nodes of binary trees either have 0 or 2 Tarskian manner, and write t,α |= φ in this case. The children. Nodes without children are called leaves.All first-order logic FO is obtained from MSO by omitting other nodes have a distinguished first and second child. the set quantification. Actually the notations FO and MSO stands for FO[T]andMSO[T] respectively, i.e. We write root(t) for the root of tree t and nodes(t) formulae over the vocabulary T. for the set of nodes of tree t and edges(t) ⊆ nodes(t)2 for the set of edges of t. For all labels a ∈ Σ, we write We view MSO as a query language. The names of n- φ ,..., laba(t) ⊆ nodes(t) for the subset of nodes of t labeled ary queries are MSO formulas (x1 xn) with n free ,..., by a. Given two nodes v1,v2 ∈ nodes(t) we call v2 a first-order variables x1 xn. These define the follow- child of v1 and write v1 v2 iff there exists an edge from ing queries: v1 to v2, i.e., if (v1,v2) ∈ edges(t). φ(x1,...,xn)(t)={(α(x1),...,α(xn)) | t,α |= φ} The descendant relation ∗ on nodes is the reflexive Definition 3. An n-ary query is MSO definable if it is φ ,...,  transitive closure of the child relation . equal to some query (x1 xn) . Unfortunately [5] shows that the satisfiability problem The subtree of tree t rooted by node v ∈ nodes(t) is the is not fixed-parameter tractable, i.e. there exists no tree denoted by t| that satisfies: v polynomial p and elementary function f such that we  ∗  nodes(t|v)={v ∈ nodes(t) | v  v } can decide in time O( f (|φ|) p(|t|)) whether an monadic 2 φ edges(t|v)=edges(t) ∩ nodes(t|v) MSO-formula is satisfiable in the tree t.However, root(t|v)=v there exists query languages that can express all MSO- laba(t|v)=laba(t) ∩ nodes(t|v) ∀a ∈ Σ definable queries, which have polynomial-time com- bined complexity: e.g. monadic queries defined by suc- Definition 1. Let n ∈ N.Ann-ary query in binary cessful runs of tree automata have exactly the power of trees over Σ is a function q that maps trees t ∈ TΣ to MSO in defining monadic queries but deciding the non- set of n-tuples of nodes, such that ∀t ∈ TΣ : q(t) ⊆ emptyness of a monadic query is in polynomial-time nodes(t)n. Moreover, we require q to be closed under w.r.t. combined complexity [14]. tree-isomorphism, i.e. h(q(t)) = q(h(t)) for a tree iso- morphism h. Let us define some algorithmic tasks for query lan- ,. Simple examples for monadic queries in binary trees guages (N ) that are common to database theory: Σ over are the functions laba that map trees t to the sets • ∈ Σ model-checking: given a query name c,atree of nodes of t that are labeled by a for a . The binary k t and an n-tuple (v1,...,vk) ∈ nodes(t) , does query descendant relates nodes v to their descendants, ,..., ∈   i.e. descendant(t)={(v,v) ∈ nodes(t)2 | v ∗ v}. (v1 vk) c (t) hold ? • query answering: given a query name c and a tree Σ Definition 2. A query language L over alphabet is a t,returnc(t). An expected complexity might be ,. . pair L =(N ) where N is a set of names and an polynomial in the number of solutions. interpretation function mapping names c ∈ N to queries • c in Σ-trees. satisfiability (over a fixed tree): given a query name c and a tree t, does c(t) = 0/ hold ? The monadic second-order logic (MSO) in trees is a query language that is widely accepted as the yardstick Unranked trees are like binary trees, except that all for comparing the expressiveness of XML-query lan- nodes may have arbitrarily many ordered children. The guages [9, 15]. 
This is because of the close correspon- next-sibling of a node is the successor of the same parent dence between MSO, tree automata, and regular tree in the sibling ordering. languages [17]. Every MSO formula with n free node variables defines an n-ary query. Unranked trees can be encoded as binary trees by only using edges for the first-child and next-sibling relations. In MSO, binary trees t ∈ TΣ are seen as logical struc- Fig. 1 gives a DTD, an unranked tree matching this tures with domain nodes(t). The signature T of DTD and its first-child next-sibling encoding t.Asim- this structure contains symbols for the binary relations ple binary query on that tree is to select all pairs of name and title of the same book. It can be expressed with re- child1 and child2 and the unary relations laba for all a ∈ Σ. spect to the binary encoding by the following MSO for- mula with two free variables y,z: Let x,y,z range over a countable set of first-order vari- ∃x (labauthor(x) ∧ child1(x,y) ∧ child2(x,z)) ables and X over a countable set of monadic second- order variables. Formulas φ of MSO have the following abstract syntax, where a ∈ Σ: 3 Composing Monadic Queries φ ::= p(x) | child1(x,y) | child2(x,y) | lab (x) |¬φ | φ ∧ φ |∀xφ |∀Xφ a 1 2 Query languages for monadic queries in trees have been A variable assignment α into a tree t maps first-order widely studied by the database community in the last variables nodes(t) and second-order variables to sub- few years. See [12] for a comprehensive overview. sets of nodes(t). We define the validity of formulas Languages for n-ary queries are less frequent but have φ in trees t under variable assignments α in the usual started to arise with the XML programming languages

[Figure 1 comprises three panels: a DTD; an unranked tree matching the DTD, with nodes labelled bib, book, author, name, title; and its firstchild-nextsibling encoding t, in which missing children are marked ⊥.]

Figure 1. A DTD, an unranked tree matching the DTD and its firstchild-nextsibling encoding t.
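A minimal OCaml sketch of the firstchild-nextsibling encoding may help to read Figure 1; the constructor-based representation, the function names, and the bib value below (with text contents omitted) are illustrative assumptions, not the paper's implementation.

    (* Binary Σ-trees in the sense of Section 2; Bot plays the role of the ⊥
       leaves produced by the firstchild-nextsibling encoding of Figure 1.   *)
    type binary_tree =
      | Bot
      | Node of string * binary_tree * binary_tree   (* label, 1st and 2nd child *)

    (* Unranked trees: a label and an ordered list of children. *)
    type unranked_tree = U of string * unranked_tree list

    (* Encoding of a forest: the first child of a node becomes its first
       (left) child, the next sibling becomes its second (right) child.      *)
    let rec fcns_forest = function
      | [] -> Bot
      | U (lab, children) :: siblings ->
          Node (lab, fcns_forest children, fcns_forest siblings)

    let fcns t = fcns_forest [ t ]

    (* A document shaped like the one in Figure 1: a book with one author
       (carrying a name) followed by a title.                                *)
    let bib =
      U ("bib", [ U ("book", [ U ("author", [ U ("name", []) ]);
                               U ("title", []) ]) ])

    let bib_encoded = fcns bib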

      XDuce and CDuce [11, 2, 4] as well as with information the set of free variables of φ. We will often write c(x) extraction tools such as Lixto [8, 6]. instead of c(x). . The set of subformulas of φ is de- noted by Sub(φ).Thecomposition size |φ| of a formula In this paper, we propose a new class of languages φ is inductively as follows: for defining n-ary queries by composition of monadic | | = 0, |φ ∧ φ| = |φ| + |φ| + 1 queries. We leave the choice of the underlying monadic |c(x).φ| = + |φ|, |φ ∨ φ| = |φ| + |φ| + querying language parametric, so the reader may choose 1 1 |∃x φ| = + |φ| his prefered monadic querying language, and extend it 1 to an n-ary query language by query composition. The composition operator is motivated by Lixto’s way of Note that this definition implies that query names are of defining n-ary queries [6]. size one.

      The principle of composition is quite simple: a com- For all trees t, all valuations ν : Var → nodes(t) rang- position of two monadic queries first selects a node an- ing over the nodes of t, and all composition formula swering the first sub-query and then launches the second φ ∈ C (L) we define the satisfaction relation t,ν |= φ as sub-query at that node. All nodes seen meanwhile can follows: be memoized and returned in an output tuple. (i) range(ν) ⊆ nodes(t) We start from a language L of monadic queries c and , , ∈ Var an infinite set x y z of variables. We then define (ii) compositions of monadic queries on basis of the com- t,ν |=  position operator that we write as the dot ’.’ . Informally ∈   . ,ν / | .φ u c (t)(1) a composition query c1(x1) c2(x2) on a tree t will first t [x u] = c(x) iff | ,ν | φ ∈   t u = (2) bind x1 nondeterministically to some node v1 c1 (t), ,ν | φ ∧ φ ,ν | φ ,ν | φ and then launch query c in the subtree t| rooted at v t = 1 2 iff t = 1 and t = 2 2 v1 1 ,ν | φ ∨ φ ,ν | φ ,ν | φ in order to bind x to some node v ∈ c (t| ). t = 1 2 iff t = 1 or t = 2 2 2 2 v1 t,ν |= ∃x φ iff there exists u ∈ nodes(t), ,ν / | φ For expressivity reasons Ð that is to capture MSO as s.t. t [x u] = soon as the monadic query language capture MSO Ð we Let us consider the satisfiability of c(x).φ. Condition add conjunction, disjunction and projection to our com- (1) implies that c selects the node u in t, condition (2) 1 position language . implies that the interpretation of φ is relativized to the subtree of t rooted at u. Given a monadic query language L =(N,.), compo- sition formulae φ ∈ C (L) are defined by the following Valuations define possible values for free variables in abstract syntax: composition formulae. A formula can define an n-ary φ query by sorting its free variables. Formally, a formula ::= composition formula φ ∈ { ,..., } φ C (L) with free variables x1 xn = FV( ) de- fines the n-ary query φ(x ,...,x ) such that for all | c(x).φ composition, c ∈ N,x∈ Var 1 n trees t: | φ ∧ φ conjunction | φ ∨ φ disjunction φ(x1,...,xn)(t)={(ν(x1),...,ν(xn)) | t,ν |= φ} |∃x φ projection
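The abstract syntax above transcribes directly into an algebraic datatype. The following OCaml sketch is illustrative only (the paper's prototype is written in OCaml, but its actual data structures are not shown here); it keeps the monadic query language parametric in the type of query names and implements FV(φ) and the composition size |φ| as defined in this section.

    type var = string

    (* Composition formulae of C(L), parametric in the type 'n of query names. *)
    type 'n formula =
      | True                              (* ⊤ *)
      | Comp of 'n * var * 'n formula     (* c(x).φ : composition *)
      | And of 'n formula * 'n formula    (* conjunction *)
      | Or of 'n formula * 'n formula     (* disjunction *)
      | Exists of var * 'n formula        (* projection ∃x φ *)

    module VarSet = Set.Make (String)

    (* Free variables FV(φ): c(x).φ leaves x free, ∃x removes it. *)
    let rec fv = function
      | True -> VarSet.empty
      | Comp (_, x, phi) -> VarSet.add x (fv phi)
      | And (p, q) | Or (p, q) -> VarSet.union (fv p) (fv q)
      | Exists (x, phi) -> VarSet.remove x (fv phi)

    (* Composition size |φ|: query names count one. *)
    let rec size = function
      | True -> 0
      | Comp (_, _, phi) | Exists (_, phi) -> 1 + size phi
      | And (p, q) | Or (p, q) -> size p + size q + 1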

      Given a composition formula φ, we denotes by FV(φ) 4 Examples of Composition Lan- guages 1We proved that conjunctions, disjunctions and pro- jections are required to express all MSO-queries by We now discuss some instances of query languages composition of monadic MSO-definable queries C (L) by instantiating the parameter L to some concrete

      63 monadic query language. algorithm for answering compositions of monadic queries, defined either in MSO, XPath, or by tree au- As first instance, we let L be the monadic query lan- tomata. Further monadic query languages are be eas- guage containing all monadic MSO formulas. For illus- ily added by new modules called query machines. Each tration, we consider XML documents defining collec- monadic query can be expressed by different formalism tions of books, which satisfy the DTD in Fig. 1. Our within the same composition formula. target is to select all pairs of author names and titles of the same book by composition. Our concrete syntax for expressing composition queries is given in Fig. 3. A typical input consists of an XML We define the binary query on firstchild- document and a composition query. The output is an nextsibling encodings. Names and titles of a XML document representing the set of all answers. The book are contained in siblings of author-labeled nodes. implementation is done in OCaml. To select the pairs, we first select all author nodes by   the monadic query c1 defined by the monadic MSO 5 MSO Completeness formula c1 = labauthor(x). We then compose it with     two independant monadic queries c2 and c3 ,for We call an n-ary query language MSO-complete if it ∃ ∧ , selecting name by c2 = y root(y) child1(y x) and can express all MSO-definable n-ary queries. For in- ∃ ∧ , title by c3 = y root(y) child2(y x). The modeling stance, monadic Datalog is known to be a MSO com- composition formula is: plete monadic query language. In this paragraph, we φ = ∃zc1(z).(c2(x) ∧ c3(y)) study the expressiveness of composition languages over MSO-complete monadic query languages.
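Using the datatype sketched after the grammar of Section 3, the composition formula above can be built as an ordinary value, with the monadic MSO queries ⟦c1⟧, ⟦c2⟧, ⟦c3⟧ supplied as node-set functions; tree and node representations are left abstract, and all names here are illustrative.

    (* φ = ∃z c1(z).(c2(x) ∧ c3(y)); its free variables are x and y, so it
       defines the binary query φ(x, y) of author names and titles.
       c1 selects author-labelled nodes; c2 and c3 select the first and
       second child of the root of the current subtree, as in the text.    *)
    let phi (c1, c2, c3) =
      Exists ("z", Comp (c1, "z", And (Comp (c2, "x", True),
                                       Comp (c3, "y", True))))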

      Note that according to the DTD and the semantic of We show that the composition operator can be ex- composition, the first query can select nodes labeled pressed in first-order logic, so that n-ary-compositions by author, and then, in each subtrees induced by the of monadic MSO definable queries are MSO-definable previous selected nodes, one can select the node la- too. beled by name, and the node labeled by title,by  two independant monadic queries c2 = labname(x) and Let L =(N,.) be a monadic query language. For every c = lab (x) respectively. The modeling composi- 3 title name c ∈ N we introduce a binary predicate symbol Bc tion formula is then: t ⊆ 2 that we interpret as a binary relation on Bc nodes(t)   φ ∃ . ∧   = zc1(z) (c2(x) c3(y)) t { , | ∈   | } Bc = (v v ) v c (t v) We now consider the first-order logic over the signature A second instance is obtained by composing monadic (B ) ∈ ∪ T. Datalog queries [8] which are well known to capture all c c N monadic MSO. Indeed, our idea of compositions is very Proposition 1. Every composition formula φ(x) ∈ C (L) much inspired by the way in which n-ary queries are de- is equivalent to some first-order formula γ(x) over the ∗ fined from monadic Datalog queries by the Lixto system signature (Bc)c∈N ∪ T ∪{ }. for visual Web information extraction [1, 6].
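The binary predicates B_c used in Proposition 1 can be computed naively by re-running the monadic query on every subtree. The snippet below is an OCaml illustration of the definition only (quadratic, with assumed helper functions and node identities shared between a tree and its subtrees), not of how such predicates would be evaluated in practice.

    (* B_c^t = { (v, v') | v' ∈ ⟦c⟧(t|v) }.  Assumed helpers:
       nodes t     : all nodes of t
       subtree t v : the subtree t|v rooted at v
       eval_c tr   : the node set ⟦c⟧(tr)                                    *)
    let bc ~nodes ~subtree ~eval_c t =
      List.concat_map
        (fun v -> List.map (fun v' -> (v, v')) (eval_c (subtree t v)))
        (nodes t)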

      We illustrate the correspondence at the example of se- Proof. We define a function .x encoding composi- lecting pairs of author names and titles of the same tion formulas into first-order formulas over the signature ∗ books. Such a query is expressed in Lixto by a Monadic (Bc)c∈N ∪ T ∪{ } inductively: Datalog program P and an additional information about  x = the predicate hierarchy, which we model by a tree. We  .φ , ∧φ firstchild−nextsibling c(y) x = Bc(x y) y express this query in the φ ∧ φ  φ  ∧φ  encodings. The monadic Datalog program P and the 1 2 x = 1 x 2 x φ1 ∨ φ2x = φ1x ∨φ2x predicate hierarchy are given on Figure 4. ∗ ∃y φx = ∃yx y ∧φx We can express the same query by composing the Let γ(x1,...,xn) ≡∃FV(φ)\{x1,...,xn)(∃y root(y) ∧ following three monadic Datalog queries by φy),wherey ∈ FV(φ). Finally note that root(x) is FO[T]-definable. φ(y,z)=∃xP1(x).( P2(y) ∧ P3(z)) . If the monadic query language captures MSO then the binary predicates Bc are MSO-definable. The first im- P1 : Pauthor(x) :- labauthor(x) portant technical contribution of this paper is that the with the goal Pauthor converse holds too. P2 : Pname(x) :- root(y), child1(y,x) with the goal Pname Theorem 1. The class of n-ary queries defined by com- P3 : Ptitle(x) :- root(y), child2(y,x) position of MSO-definable monadic queries is exactly with the goal Ptitle. the class of n-ary MSO definable queries. To prove the first direction it suffices to show that each Implementation We have implemented a rather naive predicate Bc is MSO-definable whenever c is. The

(a) program P
    Pauthor(x) :- labauthor(x)
    Pname(x)  :- Pauthor(y), child1(y,x)
    Ptitle(x) :- Pauthor(y), child2(y,x)

(b) hierarchy: Pauthor above Pname and Ptitle

Figure 2. A set of monadic Datalog rules and its predicate hierarchy

query    ::= SELECT vars FROM formula
formula  ::= atom | formula AND formula | formula OR formula
atom     ::= machine(var)
vars     ::= var | var, vars
var      ::= identifier
machine  ::= XPATH[xpath specif] | AUTOMATON[automaton specif] | MSO[mso specif]

Figure 3. Concrete syntax for composition queries
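For illustration, a query of the shape admitted by this grammar is shown below; the machine-specific specifications inside the brackets are left elided since their format is not given in the figure, and the query itself is a hypothetical example rather than one taken from the implementation.

    SELECT x, y
    FROM MSO[...](x) AND XPATH[...](y)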

      γ , Π binary MSO formula Bc (x y) defining Bc is exactly the jection of r w.r.t. J,definedby J(r)=(ri)i∈J.Inpar- formula γc(y) defining c where each quantification is ticular, Π0/ (r)=().Givenatreet,ann-tuple of nodes n relativized to x. v =(v1,...,vn) ∈ nodes(t) ,aNSA(A,q) with a se- n lection tuple q =(q1,...,qn) ∈ Q , and a state q ∈ Q,a The rest of this section prove the other direction, i.e. q-run of (A,q) over t selecting v is a run of A over t such the composition of monadic MSO-definable queries is that the root is labeled by q,andvi is labeled by qi for complete for n-ary MSO-definable queries. The proof each i ∈{1,...,n}. In particular, when n = 0, a q-run of is based on the equivalence between MSO-definable (A,()) over t selecting the empty sequence is a run of A queries and node selection automata as defined in [16], over t labeling the root by q. which is a consequence of the seminal theorem of Thatcher and Wright [17]. Lemma 1. Let n ≥ 2 be a natural. Let t ∈ TΣ be a binary tree. Let v =(v1,...,vn) be a tuple of length n of nodes from t, such that there exists at least two different nodes. 1 We recall that a node selection automaton (NSA) is a Let va be the least common ancestor of v. Let va be the 2 , , pair (A,S) where A =(Σ,Q,F,∆) is a tree automata and first child of va, and va its second child. Define I J Kas S is a set of selection tuples q. We write (A,q) instead follows: of (A,{q}).Arun of a tree automata A over a tree t is I = {i | v = v } atreer isomorphic to t via an isomorphism Φ,where a i = { | 1 ∗ } each node is labeled in Q, and such that the following J j va v j { | 2 ∗ } holds: K = k va vk • if v ∈ nodes(t) is a leaf labeled by a ∈ Σ,then ( , ) a → lab (Φ(v)) is in ∆, Let A q beaNSA,andqastate,thenthereexistsa r q-run r of A over t selecting viff • if v ∈ nodes(t) is an inner node labeled by ∃ ,  ∈ f ∈ Σ,andv ,v ∈ nodes(t) are its first child q q Qs.t. 1 2 ,Π and its second child respectively, then the rule - there exists a q-run of (A I(q)) over t Π 1, 2 f (labr(Φ(v1)), labr(Φ(v2))) → labr(Φ(v)) is in selecting I(v) and labeling va va ∆ by q,q respectively .  - there exists a q -run of (A,Π (q)) over t| 1 J va Arunr of A over t is successful iff its root is labeled selecting ΠJ(v) by an accepting state from F.ANSA(A,S) selects a  ,Π | - there exists a q -run of (A K(q)) over t v2 tuple of nodes (v ,...,v ) of a tree t iff there exists a 1 n selecting ΠK(v) a successful run r over t (isomorphic to t via Φ), and a selection tuple (q1,...,qn) ∈ S, such that for each i ∈{1,...,n}, the node Φ(vi) is labeled by qi in r. Proof. The proof is not difficult and left to the reader. When it is clear from the context we will omit the isomorphism Φ. Finally, the class of MSO-definable n-ary queries is exactly the class of n-ary queries Lemma 2. Let n be a natural. Given a node selec- , defined by node selection automata over binary trees tion automaton (A q) where q is an n-tuple of states, ∈ [17, 16]. given a state q Q, there exists a composition formula φA,q,q(x1,...,xn) over MSO-definable monadic queries such that for all Σ-tree t, for all v ∈ nodes(t)n, the fol- In order to prove Theorem 1 we introduce some nota- lowing are equivalent: ,..., ∈ n tions. Given a set R,ann-tuple r =(r1 rn) R , (i) there exists a q-run of A over t selecting v and a set J ⊆{1,...,n},wedenotebyΠJ(r) the pro- (ii) v ∈ φA,q,q(x1,...,xn)(t)

      65 Proof. We construct the formula inductively on n.The Corollary 1. For each MSO formula γ(x),thereex- construction mimics the decomposition given by lemma ists an equivalent composition formula φ(x) over MSO- 1. definable monadic queries. If n = 0wetakeφA,q,q = ∃xc0(x) where c0(t) is equal to nodes(t) if and only if there exists a q-run of A over t. By Thatcher and Wright’s theorem, this monadic query Proof. By [17, 16], there existsW a NSAW (A,S) equivalent γ φ φ φ is MSO-definable. to , and we define by: = q∈S q∈F A,q,q,where If n = 1, then q =(p) for some p ∈ Q,andwetake φA,q,q has been defined in the previous lemma. φ   A,(p),q(x)=c1(x) where c1 is defined by the NSA (A,(p)). Again by Thatcher and Wright’s theorem, this query is MSO-definable. 6 Algorithmic Complexity If n > 1, we consider two cases depending on whether In this paragraph L =(N,.) is a monadic query the variables x1,...,xn will be instantiated by the same language, and we suppose that there exists an algorithm node or not. So φA,q,q(x1,...,xn) will be written as a eq neq for the model-checking problem in time-complexity disjunction φ , , ∨ φ , , : A q q A q q mc(c,t),wherec ∈ N and t ∈ TΣ, and an algorithm for the query answering problem in time-complexity , • case 1 (variables will be instantiated by the same qa(c t). γ node). Let (A,q)(x) be an MSO formula such that for a tree t and an n-tuple v of nodes Fig. 4 represents a simple algorithm for the model- of t, it holds that v ∈ γ , (t) iff there ex- (A q) checking problem of a formula φ,atreet and a ists a q-run of (A,q) over t selecting v.It valuation ν. It is written in a pseudo ML-like code. It is easy to show that this formula exists, by | φ | |φ| | |maxφ∈Sub(φ)( FV( ) ) |φ|2 Thatcher and Wright’s theorem. Then we take runs in time O( M t + ) where V , φeq ,..., ∃ . M = maxc(x)∈Sub(φ)mc(c t). A,q,q(x1 xn)= xc1(x) ( i cr(xi)),where c1 is the query definedV by the monadic MSO for- mula ∃y ,...,y − (y = y ) ∧ γ (y ,...,y ) 1 n 1 i n i (A,q) 1 n This gives a naive algorithm for the query answering   and cr (t) selects the root of t,foranytreet. problem: generate all the valuations of free variables of a formula φ in a tree t, and apply the model-checking • case 2 (variables will be instantiated by at least algorithm on them. This leads to an exponential ,..., two different nodes). Let x denotes (x1 xn) grow up, but it is not clear how to avoid it since the and let P n be the sets of partitions (with possibly satisfiability problem of monadic query composition is { ,..., } empty parts) of 1 n such that for each NP-hard. partition, there exists at most one empty part. We φneq define A,q,q(x) by: Proposition 2. Let Σ = {0,1,◦} be an alphabet and W W { , },. Σ   L =( c0 c1 ) a monadic query language over {I,J,K}∈P n q ,q ∈Q q where cb selects all the nodes labeled by b ∈{0,1} ∃x∃y∃zc   (x). V q ,q in Σ-trees. Let t the binary whose roots is labeled by ◦, ( i∈I cr(xi) its first child by 0, and its second child by 1. Given a ∧c (y).φ ,Π ,  (Π (x)) 1 A J (q) q J composition formula φ over L, the satisfiability problem ∧c (z).φ  (Π (x))) 2 A,ΠK (q),q K of φ over t is NP-hard.

      where c1(t) selects the first child of the root of t,andc2(t) its second child. For any tree t,the q Proof. To prove that it is NP-hard we give a polyno-     ∈ nodes query cq ,q (t) selects a node v (t) iff mial reduction of CNF satisfiability into our problem. ,Π there exists a q-run of (A I(q)) over t selecting The ideaV is to associate with a given CNFV formula (v,v,...,v) (of length |I|), such that its first child Ψ = φ = φ   1≤i≤p Ci a composition formula 1≤i≤p i is labeled by q , and its second child by q .This over L. Each φi is a composition formula associated query is MSO-definable, again by Thatcher and to the i-th clause Ci. Itisdefinedbyassociatingto Wright’s theorem. Remark that subformulae each litteral x j the atomic formula c1(x j) and to ¬x j the φ  (Π ( )) φ  (Π ( )) A,ΠJ (q),q J x and A,ΠK (q),q K x are formula c0(x j), and to a disjunction of litterals a dis- recursively well defined, since |K|,|J| < n. junction of atomic formulae. For example, if we con- sider Ψ =(x1 ∨¬x2) ∧ (x2 ∨¬x3),thenφ =(c1(x1) ∨ c0(x2)) ∧ (c1(x2) ∨ c0(x3)). The rest of the proof is a direct application of Lemma 1. Composition and conjunctive queries Conjunctive queries over finite relational structures have been widely studied by the database community since it is the most common database query in practice. The particular case of conjunctive queries over unranked To conclude the proof of Theorem 1, we state the fol- trees have been studied in [7] over particular binary lowing corollary: XPath axis A = { Child, Child+, Child∗, NextSibling,

let check(φ, t, ν) = match φ with
  | ⊤          → true
  | c(x).φ'    → ν(x) ∈ ⟦c⟧(t) ∧ check(φ', t|ν(x), ν) ∧ (∀y ∈ FV(φ'): ν(y) is a descendant of ν(x))
  | φ1 ∨ φ2    → check(φ1, t, ν) ∨ check(φ2, t, ν)
  | φ1 ∧ φ2    → check(φ1, t, ν) ∧ check(φ2, t, ν)
  | ∃x φ'      → ⋁_{u ∈ nodes(t)} check(φ', t, ν[x/u])

Figure 4. Model-checking algorithm for a formula φ ∈ C(L), a tree t and a valuation ν
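The pseudo-code of Figure 4 translates almost directly into OCaml. The sketch below is a hypothetical rendering, not the authors' implementation: it reuses the formula datatype and fv function sketched in Section 3, represents valuations as association lists, and assumes a small record of tree operations whose names are mine.

    (* Assumed operations on trees and monadic queries. *)
    type ('tree, 'node, 'name) ops = {
      nodes         : 'tree -> 'node list;           (* nodes(t)                 *)
      subtree       : 'tree -> 'node -> 'tree;       (* t|v                      *)
      is_descendant : 'node -> 'node -> bool;        (* second node below first  *)
      eval          : 'name -> 'tree -> 'node list;  (* ⟦c⟧(t)                   *)
    }

    (* check ops φ t ν, following Figure 4; ν is expected to bind FV(φ). *)
    let rec check ops phi t nu =
      match phi with
      | True -> true
      | Comp (c, x, phi') ->
          let vx = List.assoc x nu in
          List.mem vx (ops.eval c t)
          && check ops phi' (ops.subtree t vx) nu
          && VarSet.for_all (fun y -> ops.is_descendant vx (List.assoc y nu)) (fv phi')
      | And (p, q) -> check ops p t nu && check ops q t nu
      | Or  (p, q) -> check ops p t nu || check ops q t nu
      | Exists (x, phi') ->
          List.exists (fun u -> check ops phi' t ((x, u) :: nu)) (ops.nodes t)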

      NextSibling+, NextSibling∗, Following }. Surprisingly monadic queries. The class of n-ary queries defined the complexity of these queries quickly fall into by E (L)-formulae is exactly the class of n-ary MSO- NP-hardness. Since each conjunctive queries over these definable queries. axis in an unranked tree is expressible by a composition query over a particular monadic query language in binary trees, all the complexity lower bounds from [7] apply to our formalism. For example, the satisfiability Proof. The proof is the same than those of Theorem 1. problem of a composition query over the monadic query It suffices to remark that the constrution of an equivalent composition formula given in Theorem 1 respects the languge ({c1∗ ,c2},.) is NP-hard w.r.t. combined ∗  ∗  { | , } required restrictions on variable sharing. complexity, where c1 (t)= v Child1(root(t) v) and c2(t)={v | Child2(root(t),v)}. 7.2 Answering algorithm In the next section we propose a composition fragment for which the satisfiability problem is in PTIME when- In this section we give an algorithm for answering a ever this holds for the underlying monadic query lan- composition query q on a tree t, so that the complexity guage, and give an efficient algorithm for query answer- may depend on the size of the output. Since the answer- ing. In addition we prove that this fragment can express ing complexity depends on the maximal number of free all MSO-definable n-ary queries whenever the underly- variables of the subformulae of the formula defining the ing monadic query language captures MSO. query, we first show that each composition formula φ is equivalent to a composition formula where there is a 7 An MSO-Complete and Tractable most 1 free variable different from the free variables of Fragment φ in its subformulae (wlog we assume that the quantified variables of φ are different from the free variables of φ). In this section, we introduce a “tractable” syntactic frag- Moreover, in order to avoid the problem of non-valued ∨ ment of composition formulae E (L), that leads to an variables Ð for example in the formula c(x) c(y) Ð, we n-ary MSO-complete query language (as soon as the complete each formula so that each part of disjunctions monadic query language L is), while enjoying efficient has the same free variable sets. For instance the for- ∨ query answering algorithms. mula c(x) c(y) is rewriting into the equivalent formula (c(x)∧true(y))∨(true(x)∧c(y)). The size of the output Let L be a language of MSO-definable monadic queries. formula can be at most quadratic in the size of the input In this fragment, variable sharing between conjunctions formula. and composition are not permitted, more precisely, if ,. φ ∧ φ and c(x).φ are (L)-formula, then FV(φ) ∩ Let L =(N ) be a monadic query language. Let E ∈ φ ∈ FV(φ)=0/,andx ∈ FV(φ). CDuce patterns for in- t TΣ beatreeandlet E (L) a composition formula. stance are built under this restriction for conjunctions We suppose to have an algorithm to answer monadic [4]. queries. The query answering algorithm processes in four steps: If the satisfiability problem for the underlying query lan- 1. rewrite φ into an equivalent formula φ in which guage is PTIME, then it holds for the composition frag- there is at most one free variable different from the ment too. The algorithm is based on dynamic program- free variables of φ, in its subformulae, and such ming Ð a satisfiability table defined inductively is com- that for each γ ∨ γ ∈ Sub(φ),FV(γ)=FV(γ); puted with memoization Ð. 
Then the query answering algorithm processes the formula inductively under the 2. compute two data structures Qa : N × nodes(t) → assumption that it is satisfied in the current tree. nodes(t) and Qc : N × nodes(t) × nodes(t) → {0,1} such that given a query name c ∈ N appear-   ing in φ , and two nodes v,v ∈ nodes(t), Qa(c,v) 7.1 MSO-completeness   returns the set {v : v ∈ c(t|v)} in linear time  in the size of the output, and Qc(c,v,v ) checks in We start by a theorem on expressiveness of the fragment  constant time whether v ∈ c(t|v); E (L), over MSO-definable monadic queries. 3. compute a data structure Sat : Sub(φ) × Theorem 2. Let L be a language of MSO-definable nodes(t) →{0,1} such that Sat(φ,v) checks in

      67 constant time whether a formula φ ∈ Sub(φ) is Step 4 The last phase is given on Fig. 5. Moreover, we satisfied in t|v; use memoization to avoid exponential grow-up. Valu- 4. answer the query by processing the formula φ re- ations are represented by sequences of pairs (variable, cursively with satisfiability tests, doubles elimina- node). We assume union and projection operations to tion, and memoization. eliminate doubles, so that their time complexities are linear in the input sets. This can be done by storing Step 1 Let φ be a composition formula. Wlog assume tuples in hash tables. that quantified variables of φ are different from its free variables. We define the width w(φ) of φ as the maxi- 7.3 Answering Complexity mal number, over the subformulae of φ, of free variables different from the free variables of φ. More formally In this section we study the complexity of the previous  w(φ)=maxφ∈Sub(φ)|FV(φ )\FV(φ)|.Aswesaidwe algorithm. Let L =(N,.) be a monadic query lan- transform φ into an equivalent formula φ with w(φ) ≤ 1. guage. Inputs of the algorithm are a tree t and a com- The transformation is simple by pushing down the quan- position formula φ ∈ E (L). Moreover, we suppose to tifiers. We sum up it in the following lemma: have of an algorithm to answer c on a tree t, for each c ∈ N, in time complexity qa(n,t). We write M(φ,t) for , | Lemma 3. Each query q defined by a composition for- maxv∈nodes(t),c(x)∈Sub(φ)qa(c t v). We sum-up the com- mula φ ∈ E (L) is equal to some query defined by a com- plexity by the following proposition: position formula φ ∈ E (L) such that w(φ) ≤ 1. Proposition 3. Answering a query q defined by a com- position formula φ ∈ E (L) is in time O(M(φ,t)|t||φ| + Proof. We define the translation of φ into φ by the fol- |φ|2|t|2|φ(t)|),where|φ(t)| is the output size. lowing rewriting rules:  ∃x (γ ∨ γ) → (∃x γ) ∨ (∃x γ) Proof. The first step produces a formula φ such  ∃x (γ ∧ γ) → (∃x γ) ∧ (∃x γ) that |φ | = O(|φ|2). The second step is in time ∃xc(y).φ → c(y).(∃x φ) with y = x O(M(φ,t)|t||φ|), and the computation of the satisfiabil- ∃x γ → γ if x ∈ FV(γ) ity table is in time O(|φ||t|2)=O(|φ|2|t|2). It remains to show the time complexity of algorithm de- We can show this rewriting system to terminate, and to picted in figure 5 to be O(|φ||t|2nK),whereK is the be confluent. The normal form is a formula where each number of solutions and n the arity of the query Ð we occurence of a quantified variable in an atomic formula consider that |φ(t)| = Kn Ð. We are going to show that ∃ c(x) is preceded by an existential quantification xc(x). each recursive call returns at most |t|K valuations, and Hence, normal forms are of width at most 1. Now we performs at most O(|t| + nK|t|) operations. Each call φ φ show that the normal form of a formula is equiva- to ans begins by a satisfiability test, so that the follow- φ ∃ γ ∧ γ → lent to . The only difficulties come from x ( ) ing property holds: if ans(γ,t,v) is a recursive call oc- ∃ γ ∧ ∃ γ ∃ .φ → . ∃ φ  ( x ) ( x ) and ( xc(y) ) (c(y) ( y ).The curing during the processing of φ , then the projection γ ∩ γ 0/ first case holds since FV( ) FV( )= , and the fol- of each valuation returned by ans(γ,t,v) on the vari- lowing proves the second case: ables from FV(φ) can be extended to a valuation ν such ,ν /  | ∃ .φ  t [y v ] =( xc(y) ) that t,ν, root(v) |= φ . Hence, the number of valua- iff there exists v ∈ nodes(t) s.t. t,ν[y/v ][x/v] |= c(y).φ |FV(γ)\FV(φ)|  tions returned by ans(γ,t,v) is at most |t| K. 
iff there exists v ∈ nodes(t|  ), v ∈ c(t) and   v Moreover, since w(φ )=1, we get |FV(γ)\FV(φ )|≤1. t|  ,ν[x/ ] |= φ v v It is clear that for conjunctions, disjunctions, and pro- iff t,ν[y/  ] |= c(y).(∃x φ). v jections, each recursive call performs at most O(|t|nK) We conclude by induction on the reduction length.   operations. If γ is of the form c(x).γ ,thenFV(γ )= FV(γ) ∩ FV(φ),sincew(γ)=1. Hence, any recursive call to ans(γ,t,v) for v ∈ c(t) returns at most K val- Remark that the size of the resulting formula is linear uations. Moreover, there are at most |t| nodes satisfying Ð multiply by two Ð in the size of the input formula, c(t), so that the recursive call ans(γ,t,v) performs at since each occurence of free variable is preceded by its most |t| + nK|t| operations. φ quantification. Then we transform so that each part Finally, since we use memoization, there are at most of a disjunction shares the same free variable sets, and |t||φ| recursive call to ans, so that the whole complexity such that each quantified variable is different from each   of ans on input, φ , t and root(t) is O(|φ ||t|2nK)). free variable of φ.
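Steps 2 to 4 of the answering procedure (the table Qa of Step 2, the table Sat of Step 3 below, and the ans function of Figure 5) can be combined into a short memoized sketch. The OCaml below is an illustration under assumed interfaces, not the authors' implementation: Qa is taken as a given function, query names are assumed to be plain first-order data so that memoizing on (subformula, node) pairs with structural equality is safe, and duplicate elimination is omitted.

    (* Step 2 is assumed given as  qa : name -> node -> node list,
       returning Qa(c, u), the nodes selected by query c in t|u.             *)

    (* Step 3: a memoized satisfiability table following equations (1)-(5).  *)
    let make_sat qa =
      let memo = Hashtbl.create 97 in
      let rec sat phi u =
        match Hashtbl.find_opt memo (phi, u) with
        | Some b -> b
        | None ->
            let b = match phi with
              | True -> true
              | Comp (c, _, phi') -> List.exists (sat phi') (qa c u)
              | And (p, q) -> sat p u && sat q u
              | Or  (p, q) -> sat p u || sat q u
              | Exists (_, phi') -> sat phi' u
            in
            Hashtbl.add memo (phi, u) b; b
      in
      sat

    (* Step 4, in the style of Figure 5: answers are valuations represented
       as association lists.  The conjunction case merges answer sets by
       concatenation, which is valid in the fragment E(L) because the two
       conjuncts share no free variables.                                    *)
    let make_ans qa sat =
      let rec ans phi u =
        if not (sat phi u) then []
        else match phi with
          | True -> [ [] ]
          | Comp (c, x, phi') ->
              List.concat_map
                (fun u' -> List.map (fun nu -> (x, u') :: nu) (ans phi' u'))
                (qa c u)
          | And (p, q) ->
              List.concat_map
                (fun nu1 -> List.map (fun nu2 -> nu1 @ nu2) (ans q u))
                (ans p u)
          | Or (p, q) -> ans p u @ ans q u
          | Exists (x, phi') ->
              List.map (List.filter (fun (y, _) -> y <> x)) (ans phi' u)
      in
      ans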

Step 2 It is quite obvious, by using hash tables.

Step 3 We compute, using memoization, a table Sat[.,.] defined inductively by:

    Sat[⊤, u]        = 1                                   (1)
    Sat[c(x).φ, u]   = ⋁_{u' ∈ Qa(c,u)} Sat[φ, u']         (2)
    Sat[φ ∧ φ', u]   = Sat[φ, u] ∧ Sat[φ', u]              (3)
    Sat[φ ∨ φ', u]   = Sat[φ, u] ∨ Sat[φ', u]              (4)
    Sat[∃x φ, u]     = Sat[φ, u]                           (5)

8 Conclusion

8.1 Summary.

We proposed and investigated an n-ary query language C(L) in which queries are specified as compositions of monadic queries. The choice of the underlying monadic query language L is parametric, so that we can express a wide variety of n-ary query specification languages, for

      68 1 let ans(φ,t,u)=if Sat[φ,u] then 2 match φ with 3 | → {ε}  S    | ( ).φ →  {( , ) · ν | ν ∈ ans(φ , , )} 4 c x u ∈Qa(c,u) x u t u 5 | φ ∧ φ → ans(φ,t,u) × ans(φ,t,u) 6 | φ ∨ φ → ans(φ,t,u) ∪ ans(φ,t,u) |∃ φ →{ν ν ν \ ,ν ν| ,ν ∈ φ, , } 7 x : dom( )=dom( ) x = dom(ν)\x ans( t u) 8 else 0/ 9 in 10 ans(φ,t, root(t)) Figure 5. Answering algorithm with implicit memoization instance composition of XPath formula, Monadic Dat- purpose language. ACM SIGPLAN Notices, alog programs or node selection automata. We proved 38(9):51Ð63, 2003. our language to capture MSO as soon as the underlying [3] Julien Carme, Aurlien Lemay, and Joachim monadic query language capture MSO too. We proved Niehren. Learning node selecting tree transducer the satisfiability problem to be NP-hard and proposed from completely annotated examples. In 7th Inter- (L) an efficient fragment E of the composition language national Colloquium on Grammatical Inference, L which remains MSO-complete as soon as captures volume 3264 of Lecture Notes in Artificial Intel- MSO. We gave an algorithm for the query answering ligence, pages 91Ð102. Springer Verlag, 2004. problem in time O(M(φ,t)|t||φ| + |φ|2|t|2|φ(t)|),where |φ(t)| is the output size and M(φ,t) is the maximal com- [4] Giuseppe Castagna. Patterns and types for query- plexity of the query answering problem over subtrees of ing XML. In 10th International Symposium t, of the monadic queries appearing in φ. on Database Programming Languages, Lecture Notes in Computer Science. Springer Verlag, Au- 8.2 Future Work. gust 2005. [5] Markus Frick and Martin Grohe. The complex- A more practical aspect is the extension of the exist- ity of first-order and monadic second-order logic ing implementation of query composition to the algo- revisited. In Proc. LICS ’02: Proceedings of the rithms in Section 7 and the comparison of their query 17th Annual IEEE Symposium on Logic in Com- answering efficiencies with other querying languages, puter Science, pages 215Ð224, Washington, DC, such as implementations of XQuery, and programming USA, 2002. IEEE Computer Society. C languages such as Duce . [6] G. Gottlob, C. Koch, R. Baumgartner, M. Her- zog, and S. Flesca. The Lixto data extraction We would like to investigate the correspondence Ð men- project - back and forth between theory and prac- tioned in Section 4 between the underlying query for- tice. In 23rd ACM SIGPLAN-SIGACT Symposium malism of Lixto and our query composition language on Principles of Database Systems, pages 1Ð12. over Monadic Datalog programs. In particular, we think ACM-Press, 2004. that there exists a systematic translation between the two formalisms. [7] G. Gottlob, C. Koch, and K. Schulz. Conjunctive queries over trees, 2004. Finally, in some cases it seems to be more efficient to [8] Georg Gottlob and Christoph Koch. Monadic have the possibility to navigate everywhere in the tree, datalog and the expressive power of languages without restriction on subtrees. The binary query exam- for web information extraction. In 21rd ACM ple given in Section 3, on the tree of figure 1 seems to SIGMOD-SIGACT-SIGART Symposium on Prin- be more natural when one first selects a node labeled by ciples of Database Systems, pages 17Ð28. ACM- name, and then its sibling. In this way it is interesting Press, 2002. to investigate the more general problem of binary query composition. [9] Georg Gottlob and Christoph Koch. Monadic queries over tree-structured data. 
In 17th Annual We would like to thank Manuel Loth who worked on the IEEE Symposium on Logic in Computer Science, implementation of monadic query composition. pages 189Ð202, Copenhagen, 2002. [10] Georg Gottlob, Christoph Koch, and Reinhard 9 References Pichler. Efficient algorithms for processing xpath queries. ACM Transactions on Database Systems, [1] Robert Baumgartner, Sergio Flesca, and Georg 30(2):444Ð491, 2005. Gottlob. Visual web information extraction with [11] Haruo Hosoya and Benjamin Pierce. Regular ex- lixto. In 28th International Conference on Very pression pattern matching for XML. Journal of Large Data Bases, pages 119Ð128, 2001. Functional Programming, 6(13):961Ð1004, 2003. [2] Veronique« Benzaken, Giuseppe Castagna, and [12] Leonid Libkin. Logics over unranked trees: an Alain Frisch. Cduce: an XML-centric general-

      69 overview. In Automata, Languages and Program- ming: 32nd International Colloquium, number 3580 in Lecture Notes in Computer Science, pages 35Ð50. Springer Verlag, 2005. [13] Frank Neven and Jan Van Den Bussche. Expres- siveness of query languages based on attribute grammars. Journal of the ACM, 49(1):56Ð100, 2002. [14] Frank Neven and Thomas Schwentick. Query au- tomata. In Proceedings of the Eighteenth ACM Symposium on Principles of Database Systems, pages 205Ð214, 1999. [15] Frank Neven and Thomas Schwentick. Query au- tomata over finite trees. Theoretical Computer Sci- ence, 275(1-2):633Ð674, 2002. [16] Joachim Niehren, Laurent Planque, Jean-Marc Talbot, and Sophie Tison. N-ary queries by tree automata. In 10th International Symposium on Database Programming Languages, volume 3774 of Lecture Notes in Computer Science, pages 217Ð 231. Springer Verlag, September 2005. [17] J. W. Thatcher and J. B. Wright. Generalized finite automata with an application to a decision prob- lem of second-order logic. Mathematical System Theory, 2:57Ð82, 1968.

Type Checking For Functional XML Programming Without Type Annotation (Extended Abstract)

Akihiko Tozawa
IBM Tokyo Research Lab
1623-14, Shimotsuruma, Yamato-shi, Kanagawa-ken 242-8502, Japan
[email protected]

      1 Introduction 2 The Language and Problem

      We discuss the type checking for XML programming with higher- An ML-like Language for XML Programming We first intro- order functions. Our type checking does not require type an- duce an ML-like yet simply-typed functional language with higher notations on programs. This is beneficial for programmers. In order functions. This language supports XML programming. In XDuce [HP03] and CDuce [FCB02], programmers always need to particular, this language manipulates two XML values, input XML figure out, for all functions in the program, what type annotations values and output XML values. Input XML values are only pro- are necessary. This task sometimes becomes very tedious, in par- cessed. We however cannot create input XML values in the lan- ticular, when structures of target XML documents are complex. guage, so that such values are always supplied from the outer world. On the other hand, output XML values, or we can say, non- To achieve the type checking without type annotation, we use observable XML values, are only constructed. We do not have any the tree transducer type-checking technique. In particular, we method to inspect their structures. employ the high-level tree transducer, first introduced by Engel- friet [EV88]. We can enjoy much benefits of functional program- Let us explain the language step by step. As an example, we use the ming with this transducer, because we can use higher order func- following program representing the identity tree transformation. tions. Given input trees, the high-level tree transducer emits func- (i→o) = tional values. letrec id x : ∗x[if x |= 1 then id x·1 else ()], |= · Our method has two steps. The first step is a conversion from func- (if x 2 then id x 2 else ()) tional programs to high-level tree transducers. The second step is in the inverse type inference, which receives an output XML type and id a high-level tree transducer and creates an input XML type. First, we have sorts. In the program, we see a superscript i → o appearing on id. This superscript indicates a sort, i.e., simple type, • The conversion in the first step is made possible by imposing of the function variable id. Let B = {i,o,B,L}. This B is the set restrictions on functional programs. These restrictions ensure of base sorts. Sort i corresponds to input XML values. Input XML that (1) a program is not allowed to examine what it creates, values indicate some nodes in the input XML document. Sort o cor- (2) a program does not receive more than one input tree, (3) responds to output XML values. Output XML values are sequences the number of internal states a program can reach is finite. of XML trees. Sorts B and L are sorts for boolean values and la- These restrictions are obviously necessary, and are even suffi- bels, respectively. We use b to range over base sorts. We extend cient for the conversion to tree transducers. We impose these base sorts B to sorts S(B) for functional values as restrictions by using simple types called sorts. S B  = | → • The key idea for the second step is the abstract interpretation ( ) s :: b s s of values emitted by transducers. For this interpretation, we where b ∈B. We use s to range over sorts. Here → associates to the start from the finite algebras called binoids. Any XML type right as usual. Note that in sorts, their constructors i,o,B,L and →, can be captured by some binoid and homomorphism from are just syntactical objects. XML values to this binoid. Such homomorphism can be ex- tended to functional values. 
As far as type-checking is con- In the rest of the paper, we often make sorts of function variables cerned, we always consider functional values under abstract explicit for readability. In practice, sorts of variables as well as interpretation by this homomorphism. Our inverse type in- those of expressions can be, though not uniquely, inferred by known ference is done by combining Maneth’s algorithm and such unification-based algorithms such as the one in a textbook [Mit96]. abstract interpretation. Note that sorts are not types in this paper Ð we will introduce types themselves later. Let us outline the rest of the paper. Section 2 discusses the problem we deal with in a ML-style functional language. Section 3 gives a Next, this language allows the set of specific constant primitives. formal discussion and k-level tree transducers. Section 4 gives the They operate on input and output XML values of sorts i and o. type checking algorithm. Section 5 summarizes the related work. Section 6 discusses the future work. XML instances given as inputs, are seen as binary trees which are navigated by using the set of primitives. For input XML values of

      71 sort i , we define operators ·0, ·1, and ·2 as follows. The function dept in Figure 2 implements the transformation in Fig- ure 1. The function recursively applies itself to a set of nodes se- ·1a ·2 lected by children, root, etc. Readers familiar with XSLT should b ·0 c be aware that the program is written in a style similar to XSLT pro- grams.

      This figure illustrates an XML instance a[b[]] , c[] seen as a binary Not only map functions, but we can also provide library functions tree. Assume that a node x(i) is the root of the above XML instance. for testing the document structure. For example, a function which From x, we reach a node labelled by b using x·1, and a node la- tests the existence of a child node of x(i) with a certain label l(L) belled by c using x·2, and from these nodes we can move back to i.e., corresponding to the XPath predicate [x/l], can be written in a the root node by x·1·0orx·2·0. Predicates x |= 0, x |= 1 and x |= 2 manner similar to the function children. represent tests whether it is allowed to move to that direction. If there are no nodes in that direction the test fails. We can also obtain More generally, we can even implement a deterministic (binary) the label of the node by ∗x. For instance, we have ∗x = a on the root tree automaton which tests the substructure of x through the fol- node x. lowing trick. This function autom takes a transition function trans and initial state ini as arguments. For construction of output XML values, we have a constant ()(o) (i→L→(L→L→L→L)→L) which represents a null sequence, and two operators; [ ](L→o→o) autom x ini trans := and ( , )(o→o→o). The operator [ ](L→o→o) creates a node [t] from trans the label and an output XML value t. The operator ( , )(o→o→o) (∗x) concatenates two output XML values. (if x |= 1 then autom x·1 ini trans else ini) (if x |= 2 then autom x·2 ini trans else ini) if Furthermore, the language has the -construct and equality test on The transition function trans l q q takes a label l, two successor the finite set of labels. We also have letrec for defining mutually 1 2 states q1 and q2, and returns the result of the transition. Here, we recursive functions. assume that we encode the finite set of states of the automaton as a subset of the finite label set of sort L. We do not show the de- Finally, we emphasize what this language does not have. Although tail but this technique can be also extended to implement pattern we can convert an XML value x of sort i into the same value of (i→o) match constructs in the style of XDuce based on regular expression sort o using id x, the conversion in the reverse direction is not patterns [HP03]. expressed in the language. Namely, the language does not have a → primitive constant of sort o i representing this reverse conversion. Another interesting application of higher order functions is to use Neither we can define a program performing such a conversion. In them for representing XML values containing holes. Such holes general, our language can neither create input XML values, e.g., (i) are also called gap in the language JWIG [CMS02]. For instance, x := a[], nor inspect the information of some output XML values, (o→o) ∗ a value p represents a (first-order) gapped value whose gaps e.g., t for an output XML value t. can be filled at once by a value v(o) by the application (pv)(o). E.g., p(o→o) v := dept[v] is a gapped value dept[] where  is the position of a gap. This gap can be filled by emp[] as p emp[] = dept[emp[]]. Using Higher-order Functions In XML programming, the use of higher order functions have a number of advantages. Here we We can implement a set of gap operators using higher order func- look through several use cases of higher order functions through ex- tions. amples. 
Note that we later translate functional programs into trans- (o→o) = ducers, and we here only discuss functions to which such translation (i) gap v : v (o→o→o) can be applied. (ii) nogap vw := v (iii) concgap((o→o)→(o→o)→o→o) pqv := pv, qv L→ → → → A typical higher order function is the map function, which applies (iv) nodegap( (o o) o o) lpv := l[pv] a function given as an argument to a set of elements at once. In (v) pluggap((o→o)→(o→o)→o→o) pqv := p (qv) XML programming, it is particularly useful to have map functions which apply argument functions to nodes selected by a certain crite- The gap operators implement the following operations. ria, e.g., children, following siblings, etc. The following functions Example 2. Some examples on the use of the gap operators. (i) chilren and siblings take argument functions of sort i → o, and re- gap = . (ii) nogap dept[] is a gapped value dept[] with no turn the concatenation of the results of applications. gaps. (iii) concgap dept[]  = dept[] , . (iv) nodegap comp dept[] = comp[dept[]]. (v) pluggap dept[] dept[] = dept[ (i→(i→o)→o) = |= · children xf: if x 1 then siblings x 1 f else () dept[]]. siblings(i→(i→o)→o) xf:= fx, if x |= 2 then siblings x·2 f else () The function dept in Figure 2 traverses the input tree many times Example 1. For example, when applied to the root node x of an due to the call to the root function. Interestingly, a similar function input tree a[b[],c[],d[]], children x f returns f (x·1), f (x·1·2), f can be written by the function using gapped values, shown in Fig- (x·1·2·2). ure 3.1 This function computes an answer by the single traversal.

      The earlier examples of higher order functions are just useful in Note that functions such as chilren and siblings are usually supplied as library functions, rather than being a part of the user program. 1In this program we use sort B → s to implement pairs. It is not With such library functions, programmers do not have to deal with difficult to extend the language with pairs and projections. We do primitive navigation operators such as x·m (x·0, x·1 and x·2). not do this here for simplicity.

      72 writing concise programs. Their use is however not essential. The with sort (i → s) → i → s and g with sort i → s. In our language, last example using gapped values, essentially uses higher order we can create fg, f ( fg) = f 2g, f ( f ( fg)) = f 3g, and so on. In par- functions. As Engelfriet observed [EV88], by raising the order of ticular, we can enumerate such functional values up to f ng, where, sorts for output values, i.e., o, o → o,(o → o) → o → o and so on, for example, n is the size of the input to the program. This makes we can arbitrarily increase the expressive power of the language. the translation into tree transducers impossible, since the number of states should not be related to the size of any input. The use of nested let and letrec for functions of sort i → s also causes the Type-Checking Problem Types for XML values, i.e., instances same problem. of sorts i or o, are described by tree regular expressions, such as τ = (a[b[]∗] ∪ c[])∗. For example, id transforms any XML value Let us give another explanation from a different point of view. As into itself, hence a value of type (a[b[]∗]∪c[])∗ into the value of the we discussed in the introduction, the decidability results for the tree same type. This observation is denoted by id :(a[b[]∗] ∪ c[])∗ → transducer type checking come from the fact that the inverse image ∗ ∗ −1 τ τ (a[b[] ] ∪ c[]) . For a function with sort i → o, the type checking fI ( I) of a transformation with respect to a regular language I, (i→o) υ → τ is always regular. Since the subsumption for regular languages is problem fI : I I can be stated as follows. decidable, the type checking amounts to check whether υI is con- tained by f −1(τ ). roblem (i→o) I I P 1. (Type checking) Given a program fI , an input type υ τ (i→o) υ → τ Here, consider the following program which has a sort i → i → o. I and output type I, the type checking problem fI : I I is to test whether or not the transformation of any XML value of type (i→i→o) = υ produces an XML value of type τ . letrec cmp xy: I I if not(x |= 2) && not(y |= 2) then ok[] else cmp x·2 y·2 In understanding Problem 1, we need to clarify the case when the in cmp transformation does not terminate. For example, the program given This program checks whether two sequences starting from nodes x in Figure 2 does not terminate if there are occurrences of dept- and y have the same length (not and && here can be defined using emp nodes inside -nodes in the input tree. if). Clearly the inverse image of such a program for τI = ok[] does comp[ not have a regular property. For example, we cannot test by means of tree automata, if a tree l[t1] , t2 has the width of t1 is equal to the comp[ dept[ ffi , , width of t2. This is the source of di culty with programs having dept[] emp[akihiko[]] → → emp[akihiko[]], ⇒ emp[ sorts such as i i o. emp[dept[]] dept[ ] emp[akihiko[]], 3 Values, Tree Automata and High-level tree ··· transducers We can use the input type υ to guarantee that dept-nodes never I We introduce XML values, and then tree automata which are the occurs inside emp-nodes, so that the function terminates for any model of XML types. The high-level tree transducer is the model input of type υ . There is a choice whether or not we include the I of XML transformations as given, using a functional language, in non-termination in type errors. In this paper, we chose to include it. examples in the last section. We discuss its syntax and semantics in the latter half of this section.

      Restrictions on the Functional Language We can solve the type Here are some notations used throughout. We consistently use bold checking problem for the language introduced so far, when the pro- font, e.g., a, to emphasize meta-variables denoting words or tuples. gram of interest can be translated into a high-level tree transducer. We use  ∈ A∗ for an empty word, and an associative operator (·) for Unfortunately, not all programs can be translated into such tree word concatenation. We let B = {true,false} be the set of boolean transducers. We here explain sufficient restrictions on programs values. This B appears in the text, so that there should be no confu- which make this translation possible. sion with the symbol B appearing in sorts. • Any functions or variable f (s) declared in let or letrec as (s) f x := e, either has their sort s = b or s = s1 →···→ sn−1 → ,..., , XML Values An XML value is a sequence of unranked ordered b, such that none of s2 sn−1 b are i. Namely, only the first L argument can be of sort i. Note that we do not restrict sorts trees over the finite set of labels. The set of XML values is defined in the form i → s to appear other argument positions, e.g., as follows. → → → children(i (i o) o). V  t ::= () | [t] | t , t • → Any function of sort i s must be declared in the top-level where ∈ L. We omit () if it is directly enclosed in [ ]. We assume letrec of the program. In other words, they must not be that , is associative and () is an identity. As explained earlier, defined in a letrec within another letrec. an XML value can also be seen as a binary tree, since each t is represented either as [t1] , t2 or (). For each t, its domain dom(t) ⊆ These two restrictions correspond to the fact that the tree transducer ∗ {1,2} is the set of locations, when seen as a binary tree, of that tree. only have a single input parameter (= first restriction) and a finite = We define the set of tree nodes U by set of states ( first and second restrictions). Obviously, we cannot have a function definition of sort i → i → s, because it means that U = {t}×dom(t) . this function has multiple input parameters (we underline the erro- ∈ neous part). In the translation, functions of sort i → s are seen as the t V finite set of states. This is guaranteed only if there are finitely many That is, U is the set of all nodes in all trees. The label of a node possibilities for such functions. Assume that there is a function f u ∈ U is denoted by ∗u ∈ L. We can move inside XML trees by the

      73 comp[ comp[ dept[ dept[], emp[akihiko[]], emp[akihiko[]], ⇒ emp[yoshinori[]] emp[yoshinori[]] ] ] ] Figure 1. The dept-transformation. Namely, we collect all nodes labelled emp, as well as subtrees of such emp-nodes, and put them into all dept-nodes in the document.

      letrec (∗ libraries ∗) children(i→(i→o)→o) xf:= if x |= 1 then siblings x·1 f else () siblings(i→(i→o)→o) xf:= fx, if x |= 2 then siblings x·2 f else () root(i→(i→o)→o) xf:= if x |= 0 then root x·0 f else fx (∗ user program ∗) dept(i→o) x := if ∗x = dept then dept[root x emp] else if ∗x = emp then () else ∗x[children x dept] emp(i→o) x := if ∗x = emp then ∗x[children x dept] else children x emp in dept Figure 2. Function dept x first looks at the label of the node x.Ifitisdept, the function creates a copy of this dept-node, in which it puts the result of call to the function emp at the root. If the label is emp the function dept skips this node, and otherwise it creates the copy of the given node. The function emp collects and copies all emp nodes inside the tree. This function emp has some problem, because it again calls dept to copy its substructure. See the Type-Checking Problem paragraph.

      letrec ... (∗ user program ∗) deptgap(i→B→o→o) x := let n(B→o→o) b := nogap () in (B→o→o) = |= · let p1 : if x 1 then deptgap x 1 else n in (B→o→o) = |= · let p2 : if x 2 then deptgap x 2 else n in let l(L) := ∗x in (B→o→o) = let p0 b : if b then if l = dept then concgap (nodegap dept gap)(p2 true) else if l = emp then p2 true else concgap (nodegap l (p1 true)) (p2 true) else if l = emp then pluggap (concgap (nogap (p1 true ())) gap)(p2 false) else pluggap (p1 false)(p2 false) in p0 dept(i→o) x := let p := deptgap x in p true (p false ()) in dept

      Figure 3. A similar function as the one in Figure 2, using first order gapped values. The function deptgap returns p(B→o→o) which represents two gap values, namely p true and p false. For example, for the left hand side document of Figure 1, deptgap returns a pair of gapped values p true = comp [dept[]] and p false = emp[akihiko[]] , emp[yoshinori[]] , . They are finally plugged by p true (p false ()) and create the resulting document on the right hand side of Figure 1.

      74 ◦ operator (·m)(m = 0,1,2). For k ∈ dom(t) known. For binoids V(τI) with homomorphism , in what follows, we often use ()◦ and τ ◦ instead of • and F above, respectively. (t, k)·m = (t, k·m)ifm = 1,2 and k·m ∈ dom(t) I , · · = , = = , (t k k) m (t k)ifm 0 and k 1 2 Example 4. Consider the XML type comp[ds] in Example 3. Here, · = ⊥ u m otherwise we give one binoid corresponding to this XML type. We here take The value ⊥ here represents a non-existing node such that ⊥  U. a certain set of tree regular expressions as the domain of the binoid V τ Finally, the set of root nodes Λ(U) ⊆ U is a set {(t,) | t ∈ V}. (comp[ds]). In the following, assume that E represents a type for values that do not belong to other elements in the domain of V(comp[ds]).

      XML Types and Tree Automata We introduce XML types. + + Each XML type represents a certain set of XML values. We use V = {(),dept[es] ,emp[] ,comp[ds],τE} τ, υ ()◦ = () metavariables for ranging over XML types throughout the pa- ◦ per. As a candidate of models of XML types, we have tree regular comp[ds] = {comp[ds]} ∈ L α ⎧ ⎫ expressions as defined by Hosoya et al. For , and ranging ⎪ , + → + ⎪ over a set of type variables, tree regular expressions use the syntax ⎪ dept emp[] dept[es] ⎪ ⎨⎪ dept,() → dept[es]+ ⎬⎪ such as [ ] = ⎪ + ⎪ ⎪ comp,emp[] → τE ⎪ ∗ ⎩⎪ ⎭⎪ T XML  τ, υ ::= () | [τ] | τ , τ | τ ∪ τ | τ | letrec α := τ;... in τ | α ⎧ ... ⎫ ⎪ , + → + ⎪ ⎨⎪ () dept[es] dept[es] ⎬⎪ Example 3. This is an example of XML type. By constraining input , = +, + → + ⎪ dept[es] dept[es] dept[es] ⎪ XML values, this type guarantees the termination of the program ⎩ ... ⎭ dept in Figure 2. ∗ ∗ We can confirm that this V(comp[ds]) satisfies Definition 2 by using letrec ds := dept[es] ; es := emp[] in comp[ds] ◦ ◦ ( ) ∈ V →V(comp[ds]) such that t ∈ t . For example, take t = ◦ + + comp dept[] , dept[]. We have (t , t) = dept[es] = dept[es] , dept[ It says that the root node is always and in which we have a + ◦ ◦ sequence of dept nodes. Inside depts, we have emp nodes, and so es] = t , t . on. We lastly give yet another form of tree automaton, which can be In this paper, we do not directly discuss the semantics τ ⊆ V of the efficiently converted into a non-deterministic tree automaton. This above syntax. Instead, we introduce tree automata which is well- automaton provides a trick which will be used at the last step of the known as a canonical model of XML types τ. Here we actually type inference algorithm as the model of inferred input XML types. introduce three forms of them. The first one is the most standard. A short explanation of the automaton is that (1) it is a variant of deterministic 2-way tree automaton; (2) it allows cyclic runs; and Definition 1. A (total) non-deterministic tree automaton M = (3) the transition function can look at a set of locations in the tree (Q,L,∆, F,•) is a tuple where ∆ ⊆ L ×Q×Q×Q is a set of bounded by the finite set Mov. transitions, F ⊆Qis a set of final states and •∈Qis an ini- tial state. A mapping µ ∈ U →Q is called a run of M if We first explain the transition function δ of the look-around tree (∗u,µ(u),µ(u·1),µ(u·2)) ∈ ∆ for any u ∈ U, where we define µ(⊥) = automaton. This δ takes as an argument, a set of information (the • ∈ Λ . An XML value with root node u (U) is accepted if there is a state-label pair) for each node u· mi at the relative position mi in run µ such that µ(u) ∈ F. Mov = {m1, m2,...mn}, and returns the state for the node u. τ We can assume for each XML type that we have a tree Mov = {m , m , ... , m } automaton M(τ) which defines the semantics τ = {t ∈ V | 1 2 n t accepted by M(τ)}. This is a standard assumption in the study of typed XML programming. See Hosoya et al. [HVP00], for this u δ detail. m i state : µ(u) The second model of XML types has a form of algebra whose do- label : main is finite. This algebra is called binoid [PQ68] in the literature, *uámi state : µ and is similar to syntactic monoid [Per90] for word languages. We (uámi) employ this representation as a canonical model of output XML types in the type inference algorithm in Section 4. 
As we can see For each node u, this set of information is given as a look-around from the definition, this algebra classifies a set V of XML values function, say h ∈ Mov → (L ×Q)⊥ (= (L ×Q) {⊥}). This h takes into a certain set of finite equivalence classes. The equivalence an argument m representing a relative position, and returns the pair classes are still diverse enough to check whether or not, an arbitrary · τ  of the label of, and the state assigned to, the node u m. If there is XML construction creates the result inside I . In other words, bi- no node at m i.e., u· m = ⊥, this h returns ⊥. noids provide the means of abstract interpretation of XML values. efinition M = Q,L, , Definition 2. A binoid for τ is an algebra V(τ ) = (V,•, F, D 3. A look-around tree automaton is ( Mov I I δ, F) such that Mov ⊂{0,1,2}∗ is a finite set of moves, δ ∈ (Mov → ( [ ]),( , )) such that (1) V is a finite set, •∈Vand F ⊆V, and ⊥ , ,τI, , , V τ (L ×Q) ) →Qis a transition function. A mapping µ ∈ U →Qis (2) (V () ( [ ]) ( )) is homomorphic to ( I). That is, we M µ = δ ∈ have a mapping ( ◦) ∈ V →Vsatisfying (i) ()◦ = •, (ii) v ∈ τI iff called a run of ,if (u) (h) for all u U, where the look- ◦ ∈ → L ×Q ⊥ µ v◦ ∈ F, (iii) ([t])◦ = [t◦], and (iv) (t , t )◦ = t◦ , t . around function h Mov ( ) for this u, is defined from as An algorithm, given a non-deterministic tree automaton represent- h(m) = (∗u· m,µ(u· m)) if u· m ∈ U τ ing I, that constructs one binoid satisfying the above definition is h(m) = ⊥ otherwise

      75 Term(C, X)  e ::= exactly when q is assigned to u. On the other hand, tree transducers | c (c ∈ C, constants) associate each such pair with an output value. For example, the | x (x ∈ X, variables) identity function id given earlier, can be seen as a very simple tree | ee (application) transducer. This tree transducer has one state, say id, and for each | if e then e else e (conditional) node u in the input tree, id is associated with an output tree identical | letrec f x := e;··· in e (recursive def.) to the subtree of u. High-level tree transducers provide an extension of tree transducers. L  = Con( ) c :: The distinction is that each evaluation step of the transducer creates (B) (B) | true ,false (boolean constants) a functional value rather than a tree value. In this sense, high-level L | ( ) ( ∈ L, label constants) tree transducers are closer to functional programs. | ( = )(L→L→B) (label equality) | ()(o) (empty tree) A rule of high-level tree transducer is of the form f : y  e where | ( [ ])(L→o→o) (node constructor) f is a state, y is a sequence of parameter variables, and e is called | ( , )(o→o→o) (tree concatenation) a term. Here is an example of the rule, which corresponds to a function given in Section 2. Figure 4. Definition of Term(C, X) and Con(L). autom :(i→L→(L→L→L→L)→L) ini trans  trans (∗) The automaton accepts u iff µ(u) ∈ F for some µ. (if (|= 1) then autom1 ini trans else ini) (if (|= 2) then autom2 ini trans else ini) Example 5. A deterministic tree automaton uses a transition func- δ ∈ L ×Q×Q→ Q ∆ ∈ tion instead of the transition relation In this example, autom is a state, ini and trans are parameter vari- L ×Q×Q×Q of non-deterministic tree automaton. A deterministic ables, and trans( ... ini) is a term. As we can see from above, a Q,L,δ,•, tree automaton ( F) is an instance of look-around automa- term is almost an expression of the functional language. Terms also δ ton. For this, define the transition function of the look-around should be well-sorted, cf., the definition of sorts S(B) in Section 2. Q,{, , },L,δ , automaton ( 1 2 F),as The only difference is that sorts for terms do not have any occur-

      δ (h) = δ(lab(h()),st(h(1)),st(h(2))) rences of i. Parameter variables y are also the same as those in let and letrec. They just abbreviate λ-abstractions, i.e., f : y  e is where lab(,q) = ,st(,q) = q and st(⊥) = •. Note here that we equivalent to f : λy : e or f : let g y := e in g. always have h()  ⊥ because h is computed for each u ∈ U, The meaning of autom1,(∗),(|= 2), etc. in terms are sup- Look-around tree automata can be efficiently converted into non- plied by looking into neighbor nodes. For example, the meaning of deterministic tree automata. This is formalized by the following automm is supplied by evaluating the state autom at relative posi- proposition. tion m. Similarly, the meaning of (∗)m is the label of the node at relative position m. And, the meaning of (|= 2)m is whether or not Proposition 1. Look-around tree automata accept exactly regular |= 2 holds at relative position m. Recall that the meaning of the tree tree languages. In particular, they can be efficiently converted into transducer is given at each node u ∈ U. Therefore, when this node non-deterministic tree automata. u is supplied, such relative positions 1 and  are interpreted by u·1 and u· = u, respectively. Proof. Given M = (Q,L,Mov,δ,F). Without loss of generality, we assume that Mov is prefix-closed, i.e., m·m ∈ Mov ⇒ m ∈ Mov.We We call an arbitrary set X whose each element is associated with a create a non-deterministic tree automaton M = ((Mov → (L×Q) sort, as sorted set, ⊥) {•}(= Q ),L,∆, F ,•) which accepts the same language as M. We define ∆ so that (,h0,h1,h2) ∈ ∆ iff (i) h0() = (,δ(h0)), (ii) • Figure 4 defines the sorted set Term(C, X) of terms over sorted h0(k) = ⊥ iff hk = •, (iii) hk(0)  ⊥ if hk  •, and (iv) sets C and X of constants and variables, respectively. We re- quire that each term to be well-sorted in the usual sense for h (m·k) = h (m)(m·k ∈ Mov) 0 k simple types. hk(m·0) = h0(m)(m·0 ∈ Mov) • Figure 4 also defines a sorted set of basic constants Con(L) = , • = ⊥ where k 1 2 and (m) for all conditions (i-iv). We define over a set of labels L. F = {h | st(h()) ∈ F} where st(,q) = q (always h()  ⊥). Note here that if µ ∈ U →Q is a run of M then µ ∈ U →Qdefined by Let N be a set of states, C be a set of constants which may include

      µ(u) = snd(µ (u)()) is a run of M.Ifµ ∈ U →Qis a run of M then Con(L), and Mov ⊆{0,1,2}∗ be a set of moves. We call (|= m) and define µ by µ (u)() = (∗u,µ(u)) and µ (u)(m) = µ (u· m)(). (∗) predicates, whose set is denoted by P. We define (N P)Mov to be a set of pairs in the form nm such that n ∈ N P and m ∈ Mov. Each term e appearing in the rule f : y  e, is an element of High-level Tree Transducer We introduce the high-level tree Term(C, y (N P)Mov). transducer as a model of XML transformation. Type checking for XML transformations in high-level tree transducers is decidable. Let us define the high-level tree transducer. Note that our high-level tree transducers are not exactly equivalent to transducers by Engel- Tree transducers are tree automata with outputs. Recall that tree friet [EV88]. An essential difference is that our tree transducer is a automata assign states to nodes. Another way to look at this is tree-walking transducer with upward moves inside the input tree us- that tree automata associates state-node pairs with boolean values. ing ( ·0)-operator. Also our transducer allows the recursive inspec- That is, a state-node pair (q,u) is associated with the truth value tion of the input tree. For example, the function autom in Section 2

      76 cannot be captured by the Engelfriet’s definition of the deterministic B = ⎧{o,L,B,N} ⎫ high-level tree transducer which is a top-down tree transducer. This ⎪ N→ N→ ⎪ ⎨⎪ children( o),siblings( o), ⎬⎪ is comparable to the transducer with regular look-ahead [Eng77], = , N ⎪ N→o (o) o ⎪ P which in our case, is regular look-around. ⎩ root( ),dept ,emp( ) ⎭ P = {(|= 0),(|= 1),(|= 2),(∗)}, In the following, for each sorted set X, we denote by X(s), a subpart C = Con(L) N {( = )(N→N→B)} of X whose elements are associated with sort s. Var = { f (N)}, = {, , , }, efinition Mov 0 1 2 D 4. A (look-around deterministic) high-level tree-trans- f = dept ducer H over a finite set of labels L is a tuple H = (B, N,C, P,Var, I ⎧ ⎫ ⎪ (N→o)  ⎪ Mov, f ,R) where ⎪ children : f ⎪ I ⎪ |=    ⎪ ⎪ if ( 1) then siblings 1 f else () ⎪ ⎪ N→ ⎪ • B is a set of base sorts. We have o,L,B ∈B, but not i ∈B.In ⎪ siblings :( o) f  ⎪ ⎪ ⎪ the following, all elements of sorted sets have sorts in S(B). ⎪ (if f = dept then dept else emp), ⎪ ⎪ ⎪ ⎪ if (|= 2) then siblings2 f else () ⎪ • ⎪ ⎪ N is a sorted set of states. ⎪ (N→o)  ⎪ ⎪ root : f ⎪ ⎪ if (|= 0) then root0 f else ⎪ • C is a sorted set of constants. We have Con(L) ⊆ C. ⎨⎪ ⎬⎪ R = ⎪ if f = dept then dept else emp ⎪ ⎪ ⎪ • ⊆{|= | ∈{ , , }} { ∗ } ⎪ dept :(o)  ⎪ P ( m) m 0 1 2 ( ) is a set of predicates. Predi- ⎪ ⎪ |= ∗ B L ⎪ if ∗ = emp then () ⎪ cates ( m) and ( ) are associated with sort and respec- ⎪ ⎪ ⎪ else if ∗ = dept then dept[root emp] ⎪ tively. ⎪ ⎪ ⎪ else ∗[children dept] ⎪ ⎪ ⎪ • Var is a sorted set of variables. ⎪ emp :(o)  ⎪ ⎪ ⎪ ⎪ if ∗ = emp then ∗[children dept] ⎪ • ⊆{ , , }∗ ⎪ ⎪ Mov 0 1 2 is a finite set of moves. ⎩ else children emp ⎭ • f ∈ N is an initial state. Figure 5. I A high-level tree transducer corresponding to the func- tion dept in Figure 2 • R is a finite set of rules in the form f : y  e. Ð For each f ∈ N, we have exactly one rule in R. ÐIff∈ N(s1 →···→ sn) and y = y1,...,yn−1, we have (i) y j ∈ ∈ .. − ∈ ,   is given by means of the rewrite system, which corresponds to the Var(s j) for j 1 n 1, (ii) e Term(C y (N P) Mov )(sn). operational semantics. In this paper, we give a denotational seman- We do not fix the set of base sorts B, so that we can add a new sort. tics. This gives a clear meaning to functional values emitted by the However we always require that such sorts are associated with finite high-level tree transducer. domains. See Figure 6. In the denotational semantics, a transducer H has a meaning on ρ ∈ · →D· Functional programs introduced and satisfying the restriction in each node u in U, which is an assignment (N P)( ) such that each element in subset N(s) of states, as well as P(s)of Section 2, can be translated into high-level tree transducers. Recall D  D  that those programs are already in the similar shape to the trans- predicates, is associated with an element in s . Here s is → the cpo-based semantic domain given in Figure 6. In other words, ducer, i.e., functions of sort i s only occur at top-level letrec. D  Therefore, the translation is straightforward. We here just give s is the set of functional values of sort s. See the end of this ideas. See [Toz05] for the detailed steps. paragraph. The above meaning to each node is given by the follow- ing function semH : U → (N P)(·) →D·. Essentially, what we need is to remove the occurrence of expres- Definition 5. Given H = (B, N,C, P,Var,Mov, f ,R). The mean- sions of sort i, i → i, and i → s. 
Functional variables of sorts i → s I defined in the top-level letrec correspond to the finite set of states ing function semH : U → (N P)(·) →D· is defined as the least N. Their definitions are easily translated into rules of the tree trans- solution satisfying the following equations. For any f ∈ N such that ducer. However, variables of sort i → s may also occur as parameter ( f : y  e) ∈ R, and (∗),(|= m) ∈ P, → → → variables, e.g., an argument of children(i (i o) o). In this case, we interpret such variables as variables of a new base sort N. We then semH (u)(∗) = ∗u prepare a finite set of constants of sort N, which has one-to-one semH (u)(|= m) = (u·m) ∈ U = Dλ    → · correspondence to the state set N. We also prepare the equality semH (u)( f ) y : e [n m semH (u m)(n)]n∈N P,m∈Mov operator ( = )(N→N→B) over N. where Deρ is a semantics of term e under ρ given in Figure 7, λ = As a result, we translate programs into transducers with base sorts in which y : e abbreviates letrec g y : e in g. In particular, ∈ Λ B = {o,L,B,N} and constants C = Con(L) N {( = )(N→N→B)}. for the root node u (U), semH (u)( fI) defines the output of the A program given in Figure 2 is translated into the high-level tree transducer. transducer (B, N,C, P,Var,Mov, fI,R) in Figure 5. The definition of Deρ is the standard cpo semantics of simply- typed call-by-value languages [Mit96].

      Semantics of High-Level Tree Transducers In the original def- Let us briefly recall this semantics. A cpo (X,) is a poset whose inition by Engelfriet, the semantics of high-level tree transducers any directed subset has the lub. Starting from flat cpos Db for

      77 τ  Do = V⊥ tomaton, representing I, on the term e of the rule f : y e. In his Db = b⊥ where b  o case, this term defines a tree value, while in our case, it defines a D →  = D  → D  ⊥ functional value. To interpret e in our case, we extend the homo- s s ( s ⊥ s ) ◦ morphism between the set of XML values V and the binoid V(τI). Figure 6. Semantic domains

      D  ∈ Term(C, X)(·) → (X(·) →D·) →D· Extending Homomorphism ◦ to Functional Space Given a type τI, we can obtain a finite binoid V(τI) with the homomorphism D ρ = ρ ◦ x (x) from V to V(τI). This homomorphism is seen as an abstraction Dcρ = c function from infinite values to finite elements. Dee ρ = ⎧Deρ(De ρ) ⎪D ρ D ρ = ◦ D  ⎪ e if e true Here let us extend this definition of to domains s where s is ⎨ ◦ Dif e then e else e ρ = ⎪De ρ if Deρ = false other than o. We define the domain of images of as in Figure 8, ⎩⎪ ∈D  ◦ ∈A  ⊥ otherwise so that for v s we have v s . Then, the idea is to define ◦ ∈D· →A· so that it further satisfies Dletrec θ in eρ = Delfp(ζθ,ρ) ◦ ◦ ◦ where v (v ) = (v(v )) ζθ,ρ (∈ (X(·) →D·) → (X(·) →D·)) = λρ : v ∈Ds → s v ∈Ds  ρ → λ ∈ D ,...,  D ρ → →···→ for and . [ f v s1 sn−1 : e [x v]]( f (s1 sn) x:=e(sn))∈θ xample V Figure 7. Semantics of terms in Term(C, X) E 6. We assume the binoid (comp[ds]) in Example 4. Let us consider gapped values of sort o → o. For example, p = comp[dept[]] (∈Do → o). What we need here is to define ◦ ⊥ ⊥ ⊥ p ∈ (V(comp[ds]) →⊥ V(comp[ds]) ) (= Ao → o) as a func- Ao = V(τ )⊥ tion. I ⎧ Ab = Db where b  o ⎪ = + = ⊥ ⎨⎪ comp[ds] if a emp[] or a () As → s = (As  →⊥ As) ◦ = τ  ⊥ p (a) ⎪ E otherwise, a ⎩ ⊥ = ⊥ Figure 8. Abstract semantic domain a This function indeed satisfies p◦(v ◦) = (p(v ))◦ for v ∈ V. For ex- ◦ ◦ ◦ + ample, assume v = dept[]. We have p (v ) = p (dept[es] ) = τE, ◦ ◦ and (p(v )) = (comp[dept[dept[]]]) = τE. base sorts, we can obtain cpos for function spaces Ds → s. Par- tial orders for functions are defined as f  g iff ∀x : f (x)  g(x). In Note that the homomorphic images As of Ds are finite sets. ⊥ ⊥ ◦ the case of call-by-value, we use a strict function space A →⊥ B The function gives a way to interpret each values as abstract val- ⊥ ⊥ ⊥ ( A → B ) such that f ∈ A →⊥ B satisfies f (⊥) = ⊥, i.e., the ap- ues. Such abstract values are suitable for analysis, because they are plication of a function to an error value results in an error value, i.e., finitely enumerable. non-termination. In the cpo-based semantics, the meaning of recur- ◦ sive functions is the least fixpoint of some equations. The above Definition 6. The abstraction function ∈D· →A· is a par- semH is indeed such a least fixpoint. tial function defined as follows. We use the induction of the size of s in extending ◦ to Ds →As. 4 Type Checking • v◦ = v(∈Ab ), if v ∈Db for b ∈B\{o}. ◦ So far, we have introduced three tools, namely • A value v ∈Ds → s is in the domain of written v ∈ Dom(◦), if for any v and v (∈ Dom(◦) ∩Ds ) such that ◦ • Binoids with homomorphism , v ◦ = v ◦, this v satisfies (v(v ))◦ = (v(v ))◦. • Look-around tree automata, and • For v ∈ Dom(◦), we define v◦ (∈As → s) to be the function • Tree-transducer and its semantics. defined as v◦(v ◦) = (v(v ))◦. Here we connect these tools and derive our type inference algo- The above definition, however, seems incomplete, since it just says rithm. In particular, the key idea is the extension of the homomor-  ◦ ◦ that we ignore values v Dom( ). That is, values do not satisfy the phism for binoids to functional spaces Ds → s. desired property. What is more interesting is the following result. As we discussed, a common technique to the tree transducer type Lemma 1. [Toz05] For any ◦, all outputs v (∈Ds) of transducers checking is based on the inverse type inference. In the case of are in Dom(◦). tree transducers or macro tree transducers (mtts), the inverse image −1 f (τI) is regular. 
The expressiveness of high-level tree transducers From this lemma, we can show that any output of transducers of sort I o → B is a constant function. We take a singleton set V = {•} as the is the same as k-composition of mtts [EV88], where k is the height ◦ ◦ −1 τ homomorphic image of V. Then for any t, t ,wehavet = t = •, of sorts. Therefore the inverse image fI ( I) should be a regular ◦ ◦ language also for high-level tree transducers. However, as far as we so that f (t) = ( f (t)) = ( f (t )) = f (t ). We belive that this is the know, there is no direct construction algorithm of the inverse image meaning of non-observability of output values of sort o. of high-level tree transducers. We give one such construction here.

      Maneth [Man04] gave a simple algorithm for inferring regular in- Negative Inverse Type Inference The remaining steps of the verse images for deterministic mtts. His idea was to run the au- type inference algorithm are as follows.

      78 ◦ ◦ Definition 7. The abstract meaning function sem : U → (N • Interpreting the meaning function semH by , and obtain H ◦ P)(·) →A· is the least solution of the following equations. For semH . any f ∈ N such that ( f : y  e) ∈ R, and (∗),(|= m) ∈ P, • M ◦ Defining the look-around tree automaton capturing semH . ◦ ∗ = ∗ This M gives the result of type inference. semH (u)( ) u ◦ |= = · ∈ semH (u)( m) (u m) U M ◦ ◦ Accurately speaking, what the above represents, is a negative sem (u)( f ) = Aλy : e[nm → sem (u· m)(n)] ∈ , ∈ −1 \τ H H n N P m Mov inverse image fI (V I). This is fortunate. After we inferred such an image, what we need is the emptiness check, known to be effi- where Aeρ is an abstract semantics of term e under ρ given simi- cient. larly to Figure 7, except that it uses operators on abstract semantic domains. υ  ∩  −1 \ τ  = ∅ I fI (V I) ◦ Figure 9. The abstract meaning function semH M −1 τ If this holds, the type checking succeeds. If was fI ( I), the above emptiness check turns to the containment test, which is not Definition 8. Given a transducer H, and a binoid V(τI), we de- always efficient. We later explain why our construction creates an fine a look-around automaton M = (Q,L,Mov,δ,F) as follows. automaton for such a negative image. • Mov,L are the same as H, •Q= (N P)(·) →A·, ◦ ⊥ First, we interpret the semantic function semH by means of just • δ ∈ (Mov → (L ×Q) ) →Qis defined as ◦ δ |= =  ⊥ introduced. Indeed, this can be done. This gives a function semH (h)( m) (h(m) ) (∈ U → (N P)(·) →A·) which satisfies, for all u ∈ U and n ∈ δ(h)(∗) = lab(h()) N P δ(h)( f ) = Aλy : e[nm → st(h(m))(n)]n∈N P,m∈Mov for all f : y  e ∈ R ◦ = ◦ (semH (u)(n)) semH (u)(n) where lab(,q) = ,st(,q) = q and lab(⊥) = st(⊥) = ⊥. ◦ ◦ • F = {ρ ∈Q|ρ( fI) τI }. cf., Lemma 2(a). The definition of semH in Figure 9 is exactly the M copy of Definition 5 while it uses operators on V(τI). Figure 10. The definition of the inferred automaton

      What remains is to define the look-around automaton that captures ◦ this semH . The resulting automaton is given in Figure 10. This automaton has its state set Q = (N P)(·) →A·. This set Q clas- Running Example We apply the algorithm explained so far to sifies the nodes of the input XML tree according to the (abstract) a small example. We use the following type checking problem id : output value of the transducer at its each state. A run of M gives υI → τI. Let L = {a,b}. one such classification of nodes in the input XML tree. Now, recall ◦ • The function id is translated into the following transducer H = that the same information was given by semH , which defines the ab- (B, N,C, P,Var,Mov,f ,R) with one rule stract semantics of the transducer for each state-node pair. Indeed, I ◦ ◦ this automaton captures semH in the sense that semH is always a M µ ∈ →Q ◦ (N→o)  run of . Confirm that the run of automaton U and semH id : (∈ U → (N P)(·) →A·) has the same signature. Also notice the ((∗)[if (|= 1) then id1 else ()), similarity between the transition function δ of M and the definition if (|= 2) then id2 else () ◦ δ ◦ of semH . This is defined so that it simulates semH . We have N = {id}, P = {(∗),(|= 1),(|= 2)}, Mov = {,1,2} and M f = id. As readers may expect, this exactly defines the negative inverse I image we want. • υI = b[a[]], and Lemma 2. [Toz05] (a) For all u ∈ U and n ∈ N P, we have • τ = letrec α := a[α]∗ in α. ◦ = ◦ ◦ M I (semH (u)(n)) semH (u)(n). (b) semH is the least run of . (c) V τ = { , α +,τ } The automaton M in Definition 8 accepts u iff semH (u)( fI) τI. We have ( I) () a[ ] E . The detailed proof is omitted here. Here we just note why we need For this problem, we compute the look-around automaton in Defi- to infer the negative inverse image. This is related to our treatment nition 8 whose transition function δ is shown below. The state set Q of non-termination as error, cf., Section 2. of M is a set of mappings (N P)(·) →A·, so that the transition function δ ∈ ({,1,2}→(L×Q)⊥) →Q,givenh ∈{,1,2}→(L×Q)⊥ In Definition 8, we define the final states F of M negatively, i.e., again returns functions. acceptance means type error. Note that if the program is correct, ∈ τ  ◦ i.e., semH (u)( fI) I , then semH should also give the correct re- δ(h)(|= 1) = (h(1)  ⊥) τ ◦ ◦ δ |= =  ⊥ sult in I . From the above lemma (b), if semH gives the correct (h)( 2) (h(2) ) result, i.e., is a non-accepting run of M, then “any run” is also non δ(h)(∗) = lab(h()) accepting. This shows the only-if direction of Lemma (c) (the other δ(h)(id) = + + direction is easy). a[α] st(h())(∗) = a and st(h(k))(id) ∈{⊥,(),a[α] } (k = 1,2) τE otherwise Now, assume that we include ⊥ (non-termination) to the correct result. In this case, we cannot say more than “some run is correct” M ◦ We can convert this automaton to a non-deterministic tree au- from the fact that semH gives the correct result. In fact, in this case, tomaton using the construction in Proposition 1. We here instead we must have defined the set of final states of M positively. just test the input type υI using M.

      79 In this case, since υI just defines a single tree with two nodes, the programming, it is not so usual to use functions whose order type checking problem amounts to check whether or not M accepts is more than second. So it is too early to conclude that the ap- the tree b[a[]] with two nodes, u0 = (b[a[]],) and u1 = (b[a[]],1). proach is infeasible. Our implementation naively implements In this example, we only have one run µ shown below. the enumeration of states of automata M in Section 4. We are currently seeking a different algorithm for the practical use, µ |= = µ |= = (u0)( 1) true (u1)( 1) false i.e., in XML programming languages. µ(u )(|= 2) = false µ(u )(|= 2) = false 0 1 • µ(u0)(∗) = b µ(u1)(∗) = a Connection to the type theory µ = τ µ = α + (u0)(id) E (u1)(id) a[ ] Type-checking is the central issue of functional programming. ◦ There are many approaches to type-check programs based on Now we can see that µ(u )(id) τ  . So this run is an accepting 0 I type systems. However, as far as we know, there are no such run of M. Thus the type checking id : υ → τ in this case is not I I type systems which capture the tree transducer type checking successful. as shown here. In particular, the restrictions as we gave in Section 2, do not seem to be natural assumptions in the study 5 Related Work of type systems. We are seeking their meaning.

      Milo et al. [MSV00] first propose a solution, based on inverse type inference, to the type checking for XML programming mod- 7 Acknowledgment eled by tree transducers. Milo et al. solve this problem for k- pebble transducers. The k-pebble transducers are in theory k + 1- I thank to anonymous referees for detailed reading and suggestive fold composition of mtts [Man04], and it is comparable to high- comments to the earlier draft of this paper. I also thank to Makoto level tree transducers, which is also represented by k-composition Murata for proof-reading this version of the paper. of mtts where k is the height of sorts [EV88]. Similar ap- proaches have been studied for different kinds of tree transducers [Toz01][AMN+01][MN02][MBPS05]. 8 References

      XDuce is a pioneering work [HVP00, HP03] on typed functional [AMN+01] N. Alon, T. Milo, F. Neven, D. Suciu, and V. Vianu. XML programming, which employs type checking with type- XML with data values: Typechecking revisited. In annotation. XDuce is a first order language. Its approach has Proceedings of the 20th ACM SIGACT-SIGMOD- also been employed in a number of typed XML processing lan- SIGART Symposium on Principles of Database Sys- guages, including an industrial language such as XQuery. Frisch tems, pages 138Ð149, 2001. et al. [FCB02] extended tree regular expression types in XDuce to higher order functional types. Their language is called CDuce. [CMS02] Aske Simon Christensen, Anders Moller,¬ and Michael I. Schwartzbach. Static analysis for dynamic XDuce and CDuce require type annotations. In general, they cannot XML. In Proceedings of 1st Workshop on Program- solve the type checking problem such as id : a[b[]] → a[b[]] as it is, ming Languages Technology for XML (PLAN-X 2002), by the following reasons. 2002. [Eng77] Joost Engelfriet. Top-down tree transducer with • When using XDuce, we can annotate id only by trivial types, regular look-ahead. Mathematical Systems Theory, → e.g., Any Any. For example, when we type-check id against 9(3):289Ð303, 1977. a[b[]] → a[b[]], we have to check id also against b[] → b[]. This is not possible in XDuce which associates a single arrow [EV88] Joost Engelfriet and Heiko Vogler. High level tree type with each recursive function. transducers and iterated pushdown tree transducers. Acta Informatica, 26(2):131Ð192, 1988. • CDuce has intersection types. By giving a type annotation a[b[]] → a[b[]]∩b[] → b[], the function id passes the type [FCB02] Alain Frisch, Giuseppe Castagna, and Veronique Ben- check. It is even possible to prove that id : a[b[]] → a[b[]] zaken. Semantic Subtyping. In Proceedings, Sev- holds. This is based on their subtyping algorithm. enteenth Annual IEEE Symposium on Logic in Com- puter Science, pages 137Ð146. IEEE Computer Soci- a[b[]] → a[b[]] ∩ b[] → b[] <: a[b[]] → a[b[]] ety Press, 2002. However this process is still not automatic. Users need to [HP03] Haruo Hosoya and Benjamin C. Pierce. Regular ex- figure out what type annotation is necessary in beforehand. pression pattern matching for XML. J. Funct. Pro- gram., 13(6):961Ð1004, 2003. 6 Future Work [HVP00] Haruo Hosoya, Jer« omeˆ Vouillon, and Benjamin C. Pierce. Regular expression types for XML. In Pro- As a concluding remark, we note several future directions of this ceedings of the International Conference on Func- work. tional Programming (ICFP), pages 11Ð22, Sep., 2000. [Man04] Sebastian Maneth. Models of Tree Translation. PhD • Practical use with XML programming thesis, Proefschrift Universiteit Leiden, 2004. We implemented a prototype type-checker, and tried several [MBPS05] Sebastian Maneth, Alexandru Berlea, Thomas Perst, experiments. Our implementation works well for simple pro- and Helmut Seidl. Xml type checking with macro tree grams using small sorts, such as i → o. Unfortunately, for transducers. In PODS 2005, to appear, 2005. programs with larger sorts, the initial result was not promis- ing. This reflects the time complexity of the algorithm, which [Mit96] John C. Mitchell. Foundations of programming lan- is k-exponential to the height of sorts. However, in practical guages. MIT Press, 1996.

      80 [MN02] Wim Martens and Frank Neven. Typechecking top- down uniform unranked tree transducers. In ICDT 2002, pages 64Ð78, 2002. [MSV00] Tova Milo, Dan Suciu, and Victor Vianu. Type- checking for XML transformers. In Proceedings of the 19th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 11Ð22, 2000. [Per90] Dominique Perrin. Finite automata. In Handbook of Theoretical Computer Science, volume B, pages 1Ð57. 1990. [PQ68] C. Pair and A. Quere. Definition et etude des bilan- gages reguliers. Information and Control, (6):565Ð 593, Dec 1968. [Toz01] Akihiko Tozawa. Towards static type checking for XSLT. In Proceedings of the 1st ACM Symposium on Document Engineering. ACM Press, 2001. [Toz05] Akihiko Tozawa. Type checking for functional XML programming using high-level tree transducer, 2005. full paper, in prepation, http://www.trl.ibm.com/ people/akihiko/pub/curry-full.pdf.

      81 Accelerating XPathEvaluationagainst XML Streams

      Dan Olteanu Database Group, Saarland University, Germany [email protected]

      Streamsare anemerging technology for data dissemination in cases StructuralFilters. We exemplify structural filters on a (DBLP- wherethedata throughput orsize make it unfeasible to rely on the like) stream containing information about articles possibly followed conventional approach based on storing the data beforeprocessing only atthevery end of the streambyinformation about books. Con- it. Querying XML streams without storing and without decreasing sider the SPEX network foraquery asking forauthors of books considerably the data throughput is especially challenging because with given pricesand publishers. Incasethetransducer instructed XML streams canconvey tree structured data with unbounded size to find books-nodes, saythebooks-transducer, encounters such a and depth. We demonstrate a novel compile-time optimization of node, then it sends it further to itssuccessors, with an additional SPEX [1], anXMLstream queryprocessor with polynomialcom- non-empty annotation signaling a match. Incase it encounters other bined complexity. This optimization isachieved by stream filters nodes, e.g., article-nodes,thenitstill sends it further, but with an that exploit the structural relationships between XML fragments en- empty annotation, signaling a non-match. Either way, all nodes countered along the stream at various processing states in order to from the stream reach all transducers from the network, although skip large streamfragments irrelevant to the query answer.Theeffi- this is not necessary. We can reduce the streamtraffic between ciency of optimized SPEX is positively confirmed by experiments. transducers in (atleast) two ways.

      Querying XMLStreams with SPEX. SPEX compiles XPath 1. Because all transducers following the books-transducer in the queries into networks of deterministic transducers, afterrewriting network lookalways for nodes in the stream following the books- them to forward equivalents.Anetwork foragiven forward query nodes, the queryevaluation is not altered, if the books-transducer consists of two connected parts. The upper parthas the shape of the sends further only the nodesstarting with the first books-node and query, i.e., it isasequence if the queryisasimple path, a tree if ending together with the stream, and the other transducers do the the queryhas predicates, and a directed acyclic graph, if the query same for the nodes they areinstructed to find relative to nodes found has set operators.Each step in the query inducesacorresponding by their previous transducers. transducer, and each predicate inducesabegin-scope transducer in the network. The upper partis extended with astream-delivering 2. Assume the transducers receiving (directly or indirectly) nodes in transduceratits beginning, and with an answer transduceraf- from the books-transducer look for nodes to be found only in- ter the transducer corresponding to the last step outside the query side the fragments corresponding to books-nodes (liketheir descen- predicates.Thelower partisan answer-collecting funnel, i.e., a dants,orsiblings of their descendants). Then, the books-transducer subnetwork of auxiliarytransducers serving to collect the potential can safely send further only such streamfragments corresponding answers.This funnel mirrors in in out transducers, and begin-scope to books-nodes. in end-scope transducers while preserving their nesting. Both aforementioned approaches to streamtraffic minimization can Processing anXMLstreamcorresponds to a depth-first, left-to- be easily supported by SPEX extended with special-purposepush- right, preorder traversalofits (implicit) tree. Exploiting the affinity down transducers,called structural filters.For example, in the sec- between preorder traversal and stack management, the transducers ond case above, a vertical filter,placed immediately after the books- usetheirstacks forremembering the depth of the nodes in the tree. transducer, sends further only streamfragments corresponding to This way, binary relations expressed as axes, e.g., child and descen- books-nodes.Also, in the firstcase above, this filter canbea di- dant,can be computed in asingle pass.Thetransducer network pro- agonal filterand send further only streamfragmentsstarting with cesses the stream annotatedbyits firsttransducer in, and generates an opening tag books.Diagonal filters are not alwayssuperseded progressively the output streamconveying the answers to the orig- by vertical filters, as for the aboveexamples. Itis enough to con- inal query. The other transducers in the network process stepwise sideraslightly modified example, where the query asks for Web the received annotated stream and send it with changed annotations links following books,thusafollowing-transducer gets the stream to theirsuccessor transducers. E.g., a transducer child moves the processed by the books-transducer.Furthermore, if the query con- annotation of each node to all children of that node. 
The answers strains the Web links to appear atthesame depth with books in the computed by a transducer network are among the nodesannotated tree conveyed by the stream, then the filter,here a horizontal one, by the answer transducer.These nodesare potential answers, as would send further only the streamfragments corresponding to the they may depend on a downstream satisfaction of predicates.The following siblings of the books-nodes. information on predicate satisfaction is conveyed by annotations to 1 References the stream. Until the predicate satisfaction is decided, the potential [1] F. Bry,F.Coskun,S.Durmaz, T.Furche,D.Olteanu, and M.Span- answers arebuffered by the out transducer. nagel. The XML stream query processor SPEX. In Proc. of Int. Conf. on Data Engineering (ICDE), 2005.

      82 declarative XQuery style of An XQuery-based imperative XML programming. The demo will show that despite of the fact that the expression programming language with a part of the language is less rich then database optimizer XQuery the language has the same opportunities for intelligent Anguel Novoselsky optimizations, provided that one admit to Zhen Liu pay the price of a more complex Daniela Florescu optimizer. Oracle Corporation The demo will show the compiler of such a language, and its virtual machine, and will exemplify the optimization XML data programming will become an opportunities on a couple of application increasing important problem of the programs. The virtual machine is years to come. The currently proposed common to all three languages: XSLT solutions fall in one of the two major 2.0, XQuery and XScript and uses categories: extensions of existing major extensively Oracle’s XML infrastructure programming languages with native (e.g. parsers, type system, runtime). The XML type and native processing compiler uses extensive data flow capabilities (e.g. Xlinq) or extensions of analysis to recover from the imperative existing XML processing languages like style of programming the opportunities XQuery (e.g. XL). The language we for rewriting and optimization traditional propose (temporarily called XScript) is in declarative languages. Due to this another variant of an XML scripting type of optimization Xscript programs language based on XQuery. can effectively scale to manipulate large volumes of data, similar to the way Two major avenues were investigated in databases scale to process large amounts the past to extend XQuery to a full of data. programming language. The first approach added the notion of statements We will show four different execution to XQuery, and duplicating the iterators scenarios for XScript programs: (FLWOR expressions) and conditionals (a) standalone execution in the both as expression constructors as well middle-tier as statement constructors. The second (b) standalone execution in the approach is a pure compositional database server approach: the side-effect operators (c) execution in the middle tier become normal expressions, and are with query shipping to the composed with the rest of the language. database server (d) execution in the database We investigate here a third stylistic server exploiting the exiting approach. We added the notion of relational optimizer and statements to XQuery and kept the runtime expressions side-effect free. Statements include update operations, variable The goal of this work is to create a assignments and error handling. They bridge between the imperative style of also include the iteration and programming, natural to programmers, conditionals. However, the iteration and and the performance advantages of a conditions are eliminated the expression declarative compiler and optimizer, part of the language. The main goal in hence obtaining the best of both worlds. the design of this particular approach is simplicity and ease of use for a large number of developers who might not be familiar or comfortable with a

      83 Xcerpt and visXcerpt: Integrating Web Querying

      Sacha Berger Franc¸ois Bry Tim Furche University of Munich, Institute for Informatics, http://www.ifi.lmu.de/

Xcerpt [2] and visXcerpt [1], cf. http://xcerpt.org/, are Web query languages related to each other in an unusual way: Xcerpt is a textual query language, visXcerpt is a visual query language obtained by rendering Xcerpt query programs. Furthermore, Xcerpt and visXcerpt, (vis)Xcerpt for short, have been conceived for querying both standard Web data such as XML and HTML and Semantic Web data such as RDF and Topic Maps.

This paper describes a demonstration focusing on three aspects of (vis)Xcerpt. First, its core features, especially the pattern-oriented queries and answer-constructors, its rules or views, and its specific language constructs for incomplete specifications; incomplete specifications are essential for retrieving semi-structured data. Second, the integrated querying of standard Web and Semantic Web data, which eases access to the two kinds of data in the same query program. Third, the complementary and integrated nature of the two languages.

Setting of the Demonstration. In the demonstration, prototypes of both the textual query language Xcerpt and its visual rendering visXcerpt are demonstrated in parallel on the same examples. Both prototypes rely on the same runtime system for evaluating queries, but differ in rendering: visXcerpt provides a two-dimensional rendering of textual Xcerpt programs implemented using mostly HTML and CSS. Additionally, the visual prototype provides an interactive environment for editing visXcerpt queries, as well as for data, query, and answer browsing.

Excerpts from DBLP (http://www.informatik.uni-trier.de/~ley/db/) and from a computer science taxonomy form the base for the scenario considered in the demonstration. DBLP is a collection of bibliographic entries for articles, books, etc. in the field of Computer Science. DBLP data are representative of standard Web data, using a mixture of rather regular XML content combined with free-form, HTML-like information. A small Computer Science taxonomy has been built for the purpose of this demonstration. Very much in the spirit of SKOS [3], this is a lightweight ontology based on RDF and RDFS. Combining such an ontology as metadata with the XML data of DBLP is a foundation for applications such as community-based classification and analysis of bibliographic information using interrelations between researchers and research fields. Realizing such applications is eased by using the integrated Web and Semantic Web query language (vis)Xcerpt, which also allows reasoning using rules.

Technical Content of the Demonstration. The use of query and construction patterns in (vis)Xcerpt is presented, both for binding variables in query terms and for reassembling the variables in so-called construct terms. The variable binding paradigm is that of Datalog, i.e. the programmer specifies patterns (or terms) including variables. Special interactive behavior of variables in visXcerpt highlights the relation between variables in query and construct terms. Arguably, pattern-based querying and constructing, together with the variable binding paradigm, make complex queries easier to specify and read. This is demonstrated by online query authoring and refactoring.

To cope with the semistructured nature of Web data, (vis)Xcerpt query patterns use a notion of incomplete term specifications with optional or unordered content specification. This feature distinguishes (vis)Xcerpt from query languages like Datalog and query interfaces like "Query By Example" [4]. Simple, yet powerful textual and visual constructs of incompleteness are presented in the demonstration.

An important characteristic of (vis)Xcerpt is its rule-based nature: (vis)Xcerpt provides rules very similar to SQL views. Arguably, rules or views are convenient for a logical structuring of complex queries. Thus, in specifying a complex query, it might ease the programming and improve the program readability to specify (abstract) rules as intermediate steps, very much like procedures in conventional programming. Another aspect of rules is the ability to solve simple reasoning tasks. Both aspects of rules are needed for the demonstration scenario.

Referential transparency and answer-closedness are essential properties of Xcerpt and visXcerpt, surfacing in various parts of the demonstration. They are two precisely defined traits of the rather vague notion of "declarativity". Referential transparency means that within a definition scope, all occurrences of an expression have the same value, i.e., denote the same data. Answer-closedness means that replacing a sub-query in a compound query by a possible single answer always yields a syntactically valid query. Referentially transparent and answer-closed programs are easy to understand (and therefore easy to develop and to maintain), as the unavoidable shift in syntax from the data sought for to the query specifying this data is minimized.

A novelty of the visual language visXcerpt is how it has been derived from the textual language: as a rendering, without changing the language constructs and the runtime system for query evaluation. This rendering is mainly achieved via CSS styling of the constructs of the textual language Xcerpt. The authors believe that this approach to twin textual and visual languages is promising, as it makes those languages easy to learn and easy to develop. The first advantage is highlighted in the demonstration by presenting both languages side by side.

References

[1] S. Berger, F. Bry, S. Schaffert, and C. Wieser. Xcerpt and visXcerpt: From Pattern-Based to Visual Querying of XML and Semistructured Data. In 29th Intl. Conf. on Very Large Data Bases, 2003.
[2] S. Schaffert and F. Bry. Querying the Web Reconsidered: A Practical Introduction to Xcerpt. In Extreme Markup Languages, 2004.
[3] W3C. Simple Knowledge Organisation System (SKOS), 2004.
[4] Moshé M. Zloof. Query-by-Example: A Data Base Language. IBM Systems Journal, 16(4):324–343, 1977.

This research has been funded by the European Commission and by the Swiss Federal Office for Education and Science within the 6th Framework Programme project REWERSE, number 506779 (cf. http://www.rewerse.net/).

XJ: Integration of XML Processing into Java

      Rajesh Bordawekar Michael Burke Igor Peshansky Mukund Raghavachari

      IBM T.J. Watson Research Center {bordaw, mgburke, igorp, raghavac}@us.ibm.com

1 Introduction

XML has emerged as the de facto standard for data interchange. One reason for its popularity is that it defines a standard mechanism for structuring data as ordered, labeled trees. The utility of XML as an application integration mechanism is enhanced when interacting applications agree on the structure and vocabulary of labels of the XML data interchanged. This requirement has led to the development of the XML Schema standard: an XML Schema specifies a set of XML documents whose vocabulary and structure satisfy the constraints in the XML Schema.

Despite the increased importance of XML, the available facilities for processing XML in current programming languages are primitive. Programmers often use runtime APIs such as DOM [6], which builds an in-memory tree from an XML document, or SAX [5], where an XML document parser raises events that are handled by an application. None of the benefits associated with high-level programming languages, such as static type checking of operations on XML data, are available. The responsibility of ensuring that operations on XML data respect the XML Schema associated with it falls entirely on the programmer.

The alternative approach to using standard interfaces to process XML data is to embed support for XML within the programming language. Support for query languages such as XPath in the programming language provides a natural, succinct and flexible construct for accessing XML data. Extending current programming languages with awareness of XML, XML Schema, and XPath through a careful integration of the XML Schema type system and XPath expression syntax can simplify programming and enable useful services such as static type checking and compiler optimizations.

The subject of this demonstration is XJ, a research language that integrates XML as a first-class construct into Java. The design goals of XJ distinguish it from other projects that integrate XML into programming languages. The goal of introducing XML as a type into an object-oriented imperative language is not new: Cω [1], Xtatic [3], and Xact [4] have studied the integration of XML into C# and Java. What sets XJ apart from these and other languages is its consistency with XML standards such as XML Schema and XPath, and its support for in-place updates of XML data, thereby keeping with the imperative nature of general-purpose languages like Java.

2 Demonstration Overview

This demonstration will introduce XJ and its language features using an Eclipse-integrated development environment [2], and demonstrate how the XJ compiler converts XJ code into Java code that uses DOM [6] to perform accesses to XML data. We will also discuss optimizations, such as common sub-expression elimination, which are applicable broadly to any XML processing language, including XQuery.

References

[1] G. Bierman, E. Meijer, and W. Schulte. The essence of data access in Cω. In Proceedings of the European Conference on Object-Oriented Programming, 2005.
[2] Eclipse project. XML Schema infoset model. http://www.eclipse.org/xsd/.
[3] V. Gapeyev and B. C. Pierce. Regular object types. In Proceedings of the European Conference on Object-Oriented Programming, pages 151–175, 2003.
[4] C. Kirkegaard, A. Møller, and M. I. Schwartzbach. Static analysis of XML transformations in Java. IEEE Transactions on Software Engineering, 30(3):181–192, March 2004.
[5] Simple API for XML. http://www.saxproject.org.
[6] World Wide Web Consortium. Document Object Model Level 2 Core, November 2000.
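To make the contrast drawn in Section 1 concrete, the following is a minimal sketch in plain Java (not XJ, and not taken from the demonstration): XML is accessed through the standard DOM and javax.xml.xpath APIs, the query is an untyped string, and the result must be cast by hand. The input file and the element and attribute names are invented for illustration.

  import javax.xml.parsers.DocumentBuilderFactory;
  import javax.xml.xpath.XPath;
  import javax.xml.xpath.XPathConstants;
  import javax.xml.xpath.XPathFactory;
  import org.w3c.dom.Document;
  import org.w3c.dom.Element;
  import org.w3c.dom.NodeList;

  public class UntypedXPathAccess {
      public static void main(String[] args) throws Exception {
          // Build an in-memory DOM tree; nothing ties it to an XML Schema.
          Document doc = DocumentBuilderFactory.newInstance()
                  .newDocumentBuilder().parse("orders.xml");   // hypothetical input
          XPath xpath = XPathFactory.newInstance().newXPath();
          // The query is a plain string: a typo or a schema violation is
          // discovered only at run time, and the result type is asserted by a cast.
          NodeList items = (NodeList) xpath.evaluate(
                  "/order/item[@quantity > 1]", doc, XPathConstants.NODESET);
          for (int i = 0; i < items.getLength(); i++) {
              Element item = (Element) items.item(i);
              System.out.println(item.getAttribute("name"));
          }
      }
  }

Nothing in this fragment is checked against a schema at compile time; it is exactly the kind of access that XJ's integration of XML Schema types and XPath syntax is meant to check statically.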

XML Support in Visual Basic 9

Erik Meijer ([email protected])   Brian Beckman ([email protected])

XML Programming Using DOM

Programming against XML using the DOM API today is a bitch. The accidental complexity of working with the DOM is so high that many programmers are giving up on using XML altogether, cursing the hype that XML makes dealing with data simple, which no one who has actually written DOM code could claim. The W3C DOM was not designed with ease of programming in mind, but rather evolved as a design by committee from the existing DHTML object model originally created by Netscape.

The DOM implementation as surfaced in the .NET frameworks as the System.Xml.XmlDocument API is extremely imperative, irregular, and complex. Nodes are not first-class citizens and have to be created and exist in the context of a given document. The access patterns for attributes and elements are gratuitously different, and the handling of namespaces is confusing at best. Finally, even pretty-printing an XML document takes several lines of arcane and complex code, since the .ToString() method is not properly overridden.

XML Programming Using XLinq

To address the complexity of working with XML, we designed XLinq, a new, modern, lightweight XML API that is designed from the ground up with simplicity and ease of programming in mind. Moreover, XLinq integrates smoothly with the language-integrated queries of the LINQ framework. The XLinq object model contains a handful of types. The abstract class XNode is the base for element nodes; the abstract class XContainer is the base for element nodes that have children. The XElement class represents proper XML elements, and the XAttribute class represents attributes and is stand-alone; it does not derive from XNode. The XName class represents fully expanded XML names.

In XLinq, nodes are truly first-class citizens that can be passed around freely, independent of an enclosing document context. Nested elements are constructed in an expression-oriented fashion, but XLinq also supports imperative updates in case programmers need them. Elements and attributes are accessed uniformly using familiar XPath axis-style methods, while namespace handling is simplified using the notion of universal names throughout the API. Last but not least, .ToString() actually works, so it is trivial to pretty-print XML documents using a single method call.

XML Programming Using VB

On top of the base XLinq API, Visual Basic adds XML literals with full namespace support, and late-bound axis members for attribute, child, and descendant access. Programming against XML now actually is easy, as it was originally intended.

With XML literals, we can directly embed XML fragments in a Visual Basic program. Inside XML literals we can leave holes for attributes, attribute names, or attribute values, for element names by using (expression), or for child elements using the ASP.Net style syntax <%= expression %>, or <% statement %> for blocks. The Visual Basic compiler takes XML literals and translates them into constructor calls to the underlying XLinq API. As a result, XML produced by Visual Basic can be freely passed to any other component that accepts XLinq values, and similarly, Visual Basic code can accept XLinq XML produced by external components.

Visual Basic's XML literals also simplify handling of namespaces. We support normal namespace declarations, default namespace declarations, and no namespace declarations, as well as qualified names for elements and attributes. The compiler generates the correct XLinq calls to ensure that prefixes are preserved when the XML is serialized.

Whereas XML literals make constructing XML easy in Visual Basic, the concept of axis members makes accessing XML easy. The essence of the idea is to delay the binding of identifiers to actual XML attributes and elements until run time. When the compiler cannot find a binding for a variable, it emits code to call a helper function at run time. This tactic will be familiar to many under the rubric "late binding", and, indeed, it is a form of ordinary Visual Basic late binding. But it has the advantage that the names of element tags and attributes can be used directly in Visual Basic code without quoting. As such, it relieves the programmer of the significant cognitive burden of switching between object space and XML-data space. The programmer can treat the spaces the same: as hierarchies accessed through ".".

More Information

More information on LINQ, XLinq and Visual Basic 9 can be found at http://msdn.microsoft.com/netframework/future/linq/
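The complaints about the W3C DOM above are easy to reproduce in code. The following hedged sketch uses Java's org.w3c.dom binding of the same W3C DOM (rather than System.Xml.XmlDocument); the element names are invented. Nodes can only be created through a document, attributes and child elements are handled through different patterns, and indented printing needs a separate Transformer incantation.

  import javax.xml.parsers.DocumentBuilderFactory;
  import javax.xml.transform.OutputKeys;
  import javax.xml.transform.Transformer;
  import javax.xml.transform.TransformerFactory;
  import javax.xml.transform.dom.DOMSource;
  import javax.xml.transform.stream.StreamResult;
  import org.w3c.dom.Document;
  import org.w3c.dom.Element;

  public class DomAccidentalComplexity {
      public static void main(String[] args) throws Exception {
          // Nodes are not free-standing values: every node is created via a document.
          Document doc = DocumentBuilderFactory.newInstance()
                  .newDocumentBuilder().newDocument();
          Element book = doc.createElement("book");        // invented vocabulary
          book.setAttribute("year", "2006");               // attributes: one access pattern
          Element title = doc.createElement("title");      // child elements: another pattern
          title.setTextContent("PLAN-X 2006 Informal Proceedings");
          book.appendChild(title);
          doc.appendChild(book);

          // Pretty-printing requires several lines of unrelated machinery.
          Transformer t = TransformerFactory.newInstance().newTransformer();
          t.setOutputProperty(OutputKeys.INDENT, "yes");
          t.transform(new DOMSource(doc), new StreamResult(System.out));
      }
  }

XLinq's expression-oriented construction and uniform axis-style access, as described above, are designed to remove precisely this kind of ceremony.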

XACT – XML Transformations in Java

      Christian Kirkegaard and Anders Møller BRICS Department of Computer Science University of Aarhus, Denmark {ck,amoeller}@brics.dk

Introduction

XACT is a framework for programming XML transformations in Java. Among the key features of this approach are

• a notion of immutable XML templates for manipulating XML fragments, using XPath for navigation; and

• static guarantees of validity of the generated XML data, based on data-flow analysis of XACT programs using a lattice structure of summary graphs.

An early version of the language design and the program analysis is described in [3]. In [1], we present an efficient runtime representation. The paper [2] shows how the analysis technique can be extended to support XML Schema as type formalism and permit optional type annotations for improving modularity of the type checking.

Demonstration

We demonstrate the capabilities of XACT by stepping through an example, showing how the program analyzer works "under the hood". This involves

      1. desugaring special syntactic constructs to Java code;

      2. construction of summary graphs from XML templates and schemas;

3. data-flow analysis (based on Soot), including transfer functions for XML operations (a generic sketch follows this list); and

      4. validation of summary graphs.
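Step 3 deserves a word of explanation. The sketch below is a generic worklist solver in plain Java, not XACT's actual Soot-based analysis, and the Lattice and Transfer interfaces are invented for illustration (XACT's abstract values are summary graphs). It only shows the shape of the computation: transfer functions are applied to each node of a control-flow graph and results are joined in a lattice until nothing changes, i.e., until a fixed point is reached.

  import java.util.ArrayDeque;
  import java.util.Deque;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  // A minimal forward data-flow skeleton over a control-flow graph.
  final class Worklist {
      interface Lattice<V> {
          V bottom();
          V join(V a, V b);
      }
      interface Transfer<V> {
          V apply(int node, V in);   // abstract effect of one statement
      }

      static <V> Map<Integer, V> solve(List<int[]> successors,
                                       Lattice<V> lattice, Transfer<V> transfer) {
          Map<Integer, V> out = new HashMap<>();
          Deque<Integer> work = new ArrayDeque<>();
          for (int n = 0; n < successors.size(); n++) {
              out.put(n, lattice.bottom());
              work.add(n);
          }
          while (!work.isEmpty()) {                   // iterate until nothing changes
              int n = work.poll();
              V in = lattice.bottom();                // join the results of all predecessors
              for (int m = 0; m < successors.size(); m++)
                  for (int s : successors.get(m))
                      if (s == n) in = lattice.join(in, out.get(m));
              V result = transfer.apply(n, in);
              if (!result.equals(out.get(n))) {       // the value grew: propagate further
                  out.put(n, result);
                  for (int s : successors.get(n)) work.add(s);
              }
          }
          return out;                                 // the fixed point
      }
  }

In XACT the corresponding fixed point is a conservative approximation of the XML values that may occur at runtime, which is what the validation in step 4 then checks against the schema annotations.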

Specifically, we focus on the novel features: the support for XML Schema and optional type annotations. Schemas are converted, without loss of precision (ignoring keys and references), to a convenient subset of RELAX NG, and then further to summary graphs, which are then used in the data-flow analysis. When this analysis reaches a fixed point (which represents a conservative approximation of the XML values that may appear at runtime), the resulting summary graphs are validated relative to the schema annotations. By allowing type annotations, XACT permits a modular validity analysis where components can be analyzed individually. At the same time, type annotations are optional – they can be omitted for intermediate results that do not conform to named schema constructs, thereby supporting a flexible style of programming.

Implementation

Our implementation of the XACT analyzer and runtime system is available at http://www.brics.dk/Xact/

References

[1] Christian Kirkegaard, Aske Simon Christensen, and Anders Møller. A runtime system for XML transformations in Java. In Proc. Second International XML Database Symposium, XSym '04, volume 3186 of LNCS. Springer-Verlag, August 2004.
[2] Christian Kirkegaard and Anders Møller. Type checking with XML Schema in Xact. Technical Report RS-05-31, BRICS, 2005. Presented at Programming Language Technologies for XML, PLAN-X '06.
[3] Christian Kirkegaard, Anders Møller, and Michael I. Schwartzbach. Static analysis of XML transformations in Java. IEEE Transactions on Software Engineering, 30(3):181–192, March 2004.

XTATIC

      PLAN-X 2006 Demo

      Vladimir Gapeyev Michael Levin∗ Benjamin Pierce Alan Schmitt†

University of Pennsylvania
∗ Currently at Microsoft.   † Currently at INRIA Rhône-Alpes.

XTATIC integrates with a mainstream object-oriented language, C#, the key features of statically typed XML processing previously developed in XDUCE, a domain-specific XML processing language. These features include XML trees as built-in values, a type system based on regular types (closely related to schema languages such as DTD and their successors) for static typechecking of computations involving XML, and a powerful form of pattern matching called regular patterns.

By being an extension of C#, XTATIC receives, for free, abstraction, modularization, and control flow mechanisms of an established programming language, as well as access to its extensive libraries. The extension made by XTATIC to the core of C# is minimal: it consists of enriching the universe of C# values and types by constructs for trees and sequences that generalize those of XDUCE, and adding the pattern matching primitive for their processing. The key observation for the integration is that the semantics of trees in XDUCE easily generalizes to permit using, in place of XML tags, other kinds of values and types as tree labels – for example, objects and classes of C#. Then the integration of trees with the object-oriented data model of C# is accomplished by grafting the subtyping relation of the so generalized XDUCE regular types into the C# class hierarchy under a special class Xtatic.Seq, therefore making all regular types be subtypes of Seq. This allows trees and sequences to be passed to generic library facilities such as collection classes, stored in fields of objects, etc. Finally, this general extension encodes XML by trees that use objects from a special class Xtatic.Tag as tree labels. This approach is similar to the way arrays – which, like trees, are a form of structural types – are integrated in C# as subtypes of the special class System.Array.

Subtyping in XTATIC subsumes both the declarative object-oriented subclass relation and the richer, extensionally defined subtyping relation of regular types: it turns out that the traditional definition of subclassing can be reformulated, without changing the relation itself, to mimic XDUCE's definition of subtyping as inclusion between sets of values inhabiting the types. Likewise, XTATIC's pattern matching incorporates a natural form of type-based pattern matching on objects. This provides a safe alternative to casts as a mechanism for determination of an object's run-time type.

XTATIC does not support any form of destructive update of the sequence and tree structure of existing values. Instead, the language promotes a declarative style of processing, in which values and subtrees are extracted from existing trees and used to construct entirely new trees. This approach agrees with the treatment of trees in XSLT and XQUERY, and has a precedent in C# provided by strings, which are also decomposable, but immutable, values.

Due to the lightweight extension approach to the design of XTATIC, the feel of XML programming in the resulting language fits between programming with XML APIs and programming in high-level XML-specific languages. On one hand, XTATIC offers – as the high-level languages do – native and concise XML processing primitives and types instead of untyped low-level API manipulations. On the other hand, these primitives are used within the control flow and abstraction framework of an object-oriented language, which is more familiar to the majority of programmers than the more esoteric frameworks of XSLT and XQUERY. Psychological and educational considerations aside, this poses XTATIC as an attractive alternative to API-based programming in applications where efficiency is of immediate concern. Currently, such projects tend to avoid using XSLT, XQUERY, or even XPATH, due to uncertainty over the presence of optimization for high-level control flows in a given implementation of these languages, as well as lack of control over decisions of the optimizer. This control (indeed, full responsibility for implementing the high-level control flow) is in the hands of an XTATIC programmer to the same degree as for an API programmer.

These benefits are shared by XTATIC with other current proposals for integrating XML processing into object-oriented languages, e.g., XOBE, XJ, XACT, and Cω. XTATIC differs from these in other respects: more flexible integration of trees into the object-oriented data model, and use of regular patterns, rather than paths, as the main XML inspection mechanism. Used in conjunction with regular types, patterns support the full spectrum of processing styles, from dynamic investigation of documents of unknown or partially known types to fully checked processing of documents for which complete type information is known – all without changing the underlying data representation.

XTATIC is implemented as a translator into pure C# code, which can be compiled to the .NET CLR and executed in conjunction with a small library that implements tree sequences and elementary operations on them.
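To give a rough picture of the data model described above, the following is a hedged Java sketch with invented names; it is not XTATIC's Xtatic.Seq or Xtatic.Tag classes, it is not C#, and it omits the regular types and regular patterns that carry the real weight in XTATIC. It only illustrates the idea of trees whose labels are ordinary objects and whose processing builds new trees instead of mutating existing ones.

  import java.util.List;

  // Invented illustration: an immutable tree whose label is an arbitrary object.
  record Tree(Object label, List<Tree> children) {

      // "Rename" every node labeled oldTag by constructing a brand-new tree;
      // the input tree is never modified (no destructive update).
      static Tree rename(Tree t, Object oldTag, Object newTag) {
          Object label = t.label().equals(oldTag) ? newTag : t.label();
          List<Tree> kids = t.children().stream()
                  .map(c -> rename(c, oldTag, newTag))
                  .toList();
          return new Tree(label, kids);
      }

      public static void main(String[] args) {
          Tree doc = new Tree("addrbook",
                  List.of(new Tree("person", List.of()),
                          new Tree("person", List.of())));
          System.out.println(rename(doc, "person", "entry"));
      }
  }

What the sketch cannot show is the interesting part: in XTATIC the type of such a sequence is a regular type, subtyping is inclusion of value sets, and regular patterns replace the instanceof-and-cast style of inspection.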

OCamlDuce

      Alain Frisch INRIA Rocquencourt [email protected]

Context. Over the last few years, the programming language research community has identified issues raised by the support of XML documents in applications and has proposed new linguistic features to deal with them. The work by Hosoya, Pierce and Vouillon on the XDuce project has had a big influence. Amongst its main contributions are the design of regular expression types (to express structural constraints on documents) and regular expression patterns (to express complex information extraction from documents), which together contribute to a sound and expressive language for developing XML-oriented applications such as transformations.

XDuce encouraged the vision of XML manipulation as a value-based process in the spirit of functional languages. As a matter of fact, XDuce has striking similarities with the family of ML languages. Since XDuce and ML languages are good for different but related kinds of problems, and because of their apparent similarity, it is natural to try to combine them.

However, despite the similarity, XDuce is missing important features from ML languages such as first-class functions, polymorphism, automatic type reconstruction, and support for programming in the large. There are two natural responses to address this lack of features: either extend the XDuce underlying theory to deal with them, or integrate XDuce features in an existing full-blown ML language. Examples of the former include existing extensions of XDuce with first-class functions or with parametric polymorphism. However, it is not clear how these extensions could be combined, and a lot of work is still necessary to integrate other missing features. Also, it seems pointless to design and implement a full-blown language only to add support for XML. The idea of integrating XDuce features into an existing full-blown general-purpose language has been explored for instance in the Xtatic project, which adds XDuce types and patterns into the C# programming language. The part of Xtatic programs that deals with XML inherits the functional flavor from XDuce. This might indicate that a functional language could be a good target for integrating XDuce.

OCamlDuce. OCamlDuce is an experimental merger between the Objective Caml (OCaml) and CDuce languages. The language was designed so as to make it easier to develop possibly large applications which need to deal with XML documents without necessarily being focused primarily on XML (unlike, say, pure XML-to-XML transformation). Typical use cases would be to add support for custom XML configuration files, for XHTML report generators, or for web-service interfaces, ... to an existing OCaml application.

OCaml is a powerful general-purpose multi-paradigm/functional-oriented programming language from the ML family with a robust, efficient and popular implementation. CDuce is a small programming language adapted to the development of safe and efficient XML-oriented applications. CDuce supports XML literals, Unicode, XML Namespaces, XML types, XML pattern matching together with a precise type inference and an efficient automata-based and type-driven compilation strategy, and XML iterators. Part of the theory behind CDuce relies on the one developed in the XDuce project.

From the programmer's point of view, OCamlDuce comes as drop-in replacements for the OCaml tools: bytecode and native compilers, and toplevel. All OCaml features are available, and it is possible to reuse standard and third-party OCaml libraries without even recompiling them. OCamlDuce also integrates all of the features from CDuce except overloaded functions. It is thus mostly straightforward to translate CDuce programs to OCamlDuce.

Integrating OCaml and CDuce. OCamlDuce has been implemented by merging together the OCaml and CDuce source trees and adding a relatively small piece of glue code. The only technically challenging part of the OCaml / CDuce integration was the combination of two radically different type systems: OCaml relies on Hindley-Milner-like type inference, and CDuce relies on forward propagation and on tree-automata techniques. The theoretical foundation of the type system is described in a paper to be presented at PLAN-X 2006. The key idea to obtain a clean and simple type system was to keep the XML values and types self-contained: they can appear within regular OCaml values and types, but the converse is not possible. However, bridges between the worlds of XML and ML values are provided in OCamlDuce. They rely on an automatic structural translation of ML types into XML types, which allows values to be moved between the two worlds.

Because of the way CDuce types and values are dealt with in OCamlDuce, it is not possible, e.g., to have first-class functions or arbitrary OCaml values within XML values. We don't see it as a problem, because CDuce values are intended to represent XML fragments in OCamlDuce, not arbitrary data containers, which OCaml supports already pretty well. More problematic is the lack of interaction between OCaml parametric polymorphism and XML types (OCaml type variables cannot appear within CDuce types). We leave this challenging point for future work.

The demonstration. The demonstration will illustrate how the features added to OCaml can be used to write idiomatic, expressive and safe code that manipulates complex XML structures. The examples will be taken from a medium-sized application developed in OCamlDuce, which parses an XML Schema definition into an OCaml graph-like data structure, extracts some information from it, and produces an XHTML report.

LAUNCHPADS: A System for Processing Ad Hoc Data

Mark Daly (Princeton University, [email protected])
Mary Fernández and Kathleen Fisher (AT&T Labs Research, {mff,kfisher}@research.att.com)
Yitzhak Mandelbaum and David Walker (Princeton University, {yitzhakm,dpw}@cs.princeton.edu)

An Introduction to PADS. Ideally, any data we ever encounter will be presented to us in standardized formats, such as XML. Why? Because for formats like XML, there are a whole host of software libraries, query engines, visualization tools and even programming languages specially designed to help users process their data. However, we do not live in an ideal world, and in reality, vast amounts of data are produced and communicated in ad hoc formats, those formats for which no data processing tools are readily available. Figure 1 presents a small selection of ad hoc data sources. As one can see, ad hoc data exists in a very wide variety of fields, and the users range from network administrators to computational biologists and genomics researchers to physicists, financial analysts and everyday programmers.

Figure 1. Selected ad hoc data sources.

  Name : Use                                      Representation
  Web server logs (CLF): Measure web workloads    Fixed-column ASCII records
  CoMon data: Monitor PlanetLab Machines          ASCII records
  Call detail: Fraud detection                    Fixed-width binary records
  AT&T billing data: Monitor billing process      Various Cobol data formats
  Netflow: Monitor network performance            Data-dependent number of fixed-width binary records
  Newick: Immune system response simulation       Fixed-width ASCII records in tree-shaped hierarchy
  Gene Ontology: Gene-gene correlations           Variable-width ASCII records in DAG-shaped hierarchy
  CPT codes: Medical diagnoses                    Floating point numbers

Programmers often deal with this data by whipping up one-time Perl scripts or C programs to parse and analyze their data. Unfortunately, this strategy is slow and tedious, and often produces code that is difficult to understand, lacks adequate error checking, and is brittle to format change over time. To expedite and improve this process, we developed the PADS data description language and system [2, 3]. Using the PADS language, one may write a declarative description of the structure of almost any ad hoc data source. The descriptions take the form of types, drawn from a dependent type theory. For instance, PADS base types describe simple objects including strings, integers, floating-point numbers, dates, times, and IP addresses. Records and arrays specify sequences of elements in a data source, and unions, switched unions and enums specify alternatives. Any of these structured types may be parameterized, and users may write arbitrary semantic constraints over their data as well.

Once a programmer has written a description in the PADS language, the PADS compiler can generate a collection of format-specific libraries in C, including a parser, printer, and verifier. In addition, the compiler can compose these libraries with generic templates to create value-added tools such as an ad hoc-to-XML format conversion tool, a histogram generator, and a statistical analysis and error summary tool. Finally, PADS has been composed with the GALAX query engine [6, 4, 5] for XQuery to create PADX [1], a new system that allows users to query and transform any ad hoc data source as if it were XML, without incurring the performance penalty that usually results when one converts ad hoc data into a much more verbose XML representation.

While the PADS language provides an extremely versatile means of creating tools for processing ad hoc data, it is nevertheless a new language, and learning a new language is time-consuming for anyone, especially for computational biologists or other scientists for whom programming is not their primary area of expertise. To ease the way for novice PADS users, we developed LAUNCHPADS, a new tool that provides access to the PADS system without requiring foreknowledge of the PADS language itself. Hence, the LAUNCHPADS graphic interface will also help more experienced PADS users to shorten their development cycle, and it provides a convenient way for experts to quickly create any of the data processing tools they need.
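To make the hand-written baseline mentioned above concrete, here is a hedged sketch of a one-off parser for a single Common Log Format line, written in Java rather than the Perl or C the text mentions (and rather than the C libraries that the PADS compiler generates). The regular expression, the sample line apart from the IP address shown in Figure 2, and the error handling are all invented, and they exhibit exactly the brittleness and weak error checking that a declarative PADS description is meant to avoid.

  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  public class AdHocClfParser {
      // One-off pattern for a Common Log Format record; any drift in the format
      // makes lines silently unparseable.
      private static final Pattern CLF = Pattern.compile(
          "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)$");

      public static void main(String[] args) {
          String line = "207.136.97.49 - - [15/Oct/2005:18:46:51 -0700] "
                  + "\"GET /index.html HTTP/1.0\" 200 1043";   // invented sample record
          Matcher m = CLF.matcher(line);
          if (!m.matches()) {
              System.err.println("unparsed line: " + line);    // minimal error handling
              return;
          }
          String host = m.group(1);
          int status = Integer.parseInt(m.group(6));
          String bytes = m.group(7);                           // may be "-" rather than a number
          System.out.println(host + " -> " + status + " (" + bytes + " bytes)");
      }
  }

A PADS description of the same format would instead state the record structure as types with semantic constraints and let the compiler generate the parser, printer, and verifier.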
LaunchPads. LAUNCHPADS combines mechanisms for graphically defining structure and semantic properties of ad hoc data, for translation of this definition into PADS code, and for compilation/execution of the generic tools that operate over ad hoc data. More specifically, LAUNCHPADS breaks definition of an ad hoc data format and generation of data processing tools into the following steps. Figure 2 presents a screenshot of LAUNCHPADS being used to construct a data description for a web-server log format.

1. Selection of sample data from which to build the description. Creation of a definition within LAUNCHPADS begins when a user loads sample data into the graphical interface. In Figure 2, web log data (beginning with the IP address 207.136.97.49 ...) appears in the top right-hand corner of the picture. A user then selects a row of data to work on in the LAUNCHPADS gridview immediately below.

2. Iterative refinement in the gridview. Once in the gridview, users may specify descriptions for regions of text using a highlighting scheme. The color assigned to a region represents the description class (base or composite) and region boundaries. Structure within a definition is represented through a series of refinement steps: composite regions are broken down level after level, thereby allowing for nested elements (Figure 2 shows four nesting levels). The refinement process bottoms out when one reaches an atomic description such as a character string, IP address or date. Once all regions have been given a base type in the gridview, LAUNCHPADS will generate a treeview of the definition for further processing.

Figure 2. LaunchPads Interface.

3. Customization in the treeview. The treeview is a graphical representation of the abstract syntax of a PADS description. In this view, programmers can manipulate definitions with a high degree of precision: definition elements may be created, destroyed, and renamed; type associations for existing elements may be changed (within limitations); element ordering may be altered; user-defined types may be added to the definition and applied to elements; content-aware error constraints may be imposed. Indeed, from within the treeview it is possible to access the "expert" functions of PADS directly if one so chooses, or to completely avoid them in lieu of a simpler definition and/or faster development time.

4. PADS code generation, tool compilation and use. When the user is satisfied with their PADS definition in the treeview, they may generate PADS code. Any such generated code is guaranteed to be syntactically correct, so the user need not worry about fussing with concrete PADS syntax if they do not want to. Figure 2 shows the generated code in the window at the bottom of the interface. By using the pulldown menus at the top and a set of "wizards," the user may now issue commands to compile the generated code and create data processing tools, including the XML converter and statistical analyzer. As development of LAUNCHPADS continues, we will add further tools and corresponding wizards to the interface.

Conclusions. In summary, in this demonstration, we will explain the many challenges that ad hoc data pose and how the PADS language is structured to meet these challenges. In addition, we will explain how LAUNCHPADS provides further support for processing ad hoc data by demonstrating both features for helping users construct data descriptions and features for creating and invoking tools that operate over data. We believe that both expert programmers and novices alike can benefit from this simple system for manipulating ad hoc data.

References

[1] M. Fernández, K. Fisher, and Y. Mandelbaum. PADX: Querying large-scale ad hoc data with XQuery. Submitted to PLAN-X 2006.
[2] K. Fisher and R. Gruber. PADS: A domain-specific language for processing ad hoc data. In Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, June 2005.
[3] K. Fisher, Y. Mandelbaum, and D. Walker. The next 700 data description languages. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Jan. 2006. To appear.
[4] Galax user manual. http://www.galaxquery.org/doc.html#manual.
[5] C. Ré, J. Siméon, and M. Fernández. A complete and efficient algebraic compiler for XQuery. In Proceedings of the IEEE International Conference on Data Engineering (ICDE), April 2006.
[6] J. Siméon and M. F. Fernández. Build your own XQuery processor. EDBT Summer School, Tutorial on Galax architecture, Sept. 2004. http://www.galaxquery.org/slides/edbt-summer-school2004.pdf.

XHaskell

      Martin Sulzmann and Kenny Zhuo Ming Lu School of Computing, National University of Singapore S16 Level 5, 3 Science Drive 2, Singapore 117543 {sulzmann,luzm}@comp.nus.edu.sg

We demonstrate the current programming capabilities of XHaskell – an extension of Haskell with XDuce style regular expression types and regular expression pattern matching. For example, the following classic XDuce program to extract telephone entries out of an address book

  regtype P  = P[N,T?,E*]   -- Person
  regtype N  = N[String]    -- Name
  regtype T  = T[String]    -- Tel
  regtype E  = E[String]    -- Email
  regtype En = En[N,T]      -- Phonebook Entry

  addrbook :: P* -> En*
  addrbook (P[n as N, t as T, E*], xs as P*)
      = (En[n,t], (addrbook xs))
  addrbook (P[N,E*], xs as P*) = addrbook xs
  addrbook () = ()

can be rewritten in XHaskell as follows.

  module Addrbook where

  data P  = P N (T?) ((E)*)   -- Person
  data N  = N [Char]          -- Name
  data T  = T [Char]          -- Tel
  data E  = E [Char]          -- Email
  data En = En N T            -- Entry

  addrbook :: ((P)*) -> ((En)*)
  addrbook (x :: ((P)*)) = (map for_each_p) x   -- (1)

  for_each_p :: P -> (En?)
  for_each_p (P (n :: N) (t :: (T?)) (es :: ((E)*)))
      = for_each_p2 (n,(t,es))

  for_each_p2 :: (N,((T?),((E)*))) -> (En?)
  for_each_p2 ((n :: N),((t :: T),(es :: ((E)*))))
      = En n t
  for_each_p2 ((n :: N),(es :: ((E)*)))
      = ()

The interesting point to note is that in XHaskell we can call Haskell Prelude functions such as map :: (a->b) -> [a] -> [b] (see location (1)). Thus, we only need to define the transformation from Person to Entry. Our current implementation does not support regular hedges. Therefore, we need the auxiliary function for_each_p2.

XHaskell is compiled to Haskell. Hence, we can easily take advantage of existing XML tools written in Haskell. E.g., we can use the DtdToHaskell command provided by the HaXML tool to generate the AddrbookDTD module which describes the DTD structure of the address book example in terms of some Haskell data types. Currently, the XML document representation provided by HaXML is slightly different from XHaskell's. Hence, the programmer must provide an extra interface module HaXMLInterface for marshalling values between the two representations. This intermediate step could, though, be easily automated.

Here is the code integrating XHaskell with HaXML.

  module App where
  import Addrbook ( addrbook )
  import AddrbookDTD
  import HaXMLInterface ( haxml2xhaskell, xhaskell2haxml )

  main =
      fix2Args >>= \(infile,outfile) ->
      do value <- fReadXml infile
         let result = xhaskell2haxml (addrbook (haxml2xhaskell value))
         fWriteXml result outfile

The main function parses an XML document specified by argument infile, and applies function addrbook to the parsed value. Note that addrbook has type [P] -> [En] in the translation to Haskell. Finally, it prints the result into the output file specified by argument outfile.

The implementation and further background material can be found here: http://www.comp.nus.edu.sg/~luzm/xhaskell/

Recent BRICS Notes Series Publications

NS-05-6 Giuseppe Castagna and Mukund Raghavachari, editors. PLAN-X 2006 Informal Proceedings, (Charleston, South Carolina, January 14, 2006), December 2005. ii+92 pp.

NS-05-5 Patrick Cousot, Lisbeth Fajstrup, Eric Goubault, Maurice Herlihy, Kim G. Larsen, and Martin Raußen, editors. Preliminary Proceedings of the Workshop on Geometry and Topology in Concurrency, GETCO '05, (San Francisco, California, USA, August 21, 2005), August 2005. vi+44 pp.

NS-05-4 Scott A. Smolka and Jiří Srba, editors. Preliminary Proceedings of the 7th International Workshop on Verification of Infinite-State Systems, INFINITY '05, (San Francisco, USA, August 27, 2005), June 2005. vi+64 pp.

NS-05-3 Luca Aceto and Andrew D. Gordon, editors. Short Contributions from the Workshop on Algebraic Process Calculi: The First Twenty Five Years and Beyond, PA '05, (Bertinoro, Forlì, Italy, August 1–5, 2005), June 2005. vi+239 pp.

NS-05-2 Luca Aceto and Willem Jan Fokkink. The Quest for Equational Axiomatizations of Parallel Composition: Status and Open Problems. May 2005. 7 pp. To appear in a volume of the BRICS Notes Series devoted to the workshop "Algebraic Process Calculi: The First Twenty Five Years and Beyond", August 1–5, 2005, University of Bologna Residential Center Bertinoro (Forlì), Italy.

NS-05-1 Luca Aceto, Magnús Már Halldórsson, and Anna Ingólfsdóttir. What is Theoretical Computer Science? April 2005. 13 pp.

NS-04-2 Patrick Cousot, Lisbeth Fajstrup, Eric Goubault, Maurice Herlihy, Martin Raußen, and Vladimiro Sassone, editors. Preliminary Proceedings of the Workshop on Geometry and Topology in Concurrency and Distributed Computing, GETCO '04, (Amsterdam, The Netherlands, October 4, 2004), September 2004. vi+80 pp.

NS-04-1 Luca Aceto, Willem Jan Fokkink, and Irek Ulidowski, editors. Preliminary Proceedings of the Workshop on Structural Operational Semantics, SOS '04, (London, United Kingdom, August 30, 2004), August 2004. vi+56 pp.