Taxonomy of XML Schema Languages Using Formal Language

Total Page:16

File Type:pdf, Size:1020Kb

Taxonomy of XML Schema Languages Using Formal Language TaxonomyofXMLSchemaLanguagesusing FormalLanguageTheory Onthebasisofregulartreelanguages,wepresentaformalframeworkforXMLschemalanguages.This frameworkhelpstodescribe,compare,andimplementsuchschemalanguages.Ourmainresultsareasfollows:(1) fourclassesoftreelanguages,namely"local","single-type","restrainedcompetition"and"regular";(2)document validationalgorithmsfortheseclasses;and(3)classificationandcomparisonofschemalanguages:DTD,XML- Schema,DSD,XDuce,RELAXCore,andTREX. MakotoMurata,IBMTokyoResearchLab./InternationalUniversityofJapan DongwonLee,UCLA/CSD MuraliMani,UCLA/CSD 1. Introduction XMLisametalanguageforcreatingmarkuplanguages.Torepresentaparticulartypeof information,wecreateanXML-basedlanguagebydesigninganinventoryofnamesforelements andattributes.Thesenamesarethenutilizedbyapplicationprogramsdedicatedtothistypeof information. Aschemaisadescriptionofsuchaninventory:aschemaspecifiespermissiblenamesfor elementsandattributes,andfurtherspecifiespermissiblestructuresandvaluesofelementsand attributes.Advantagesofcreatingaschemaareasfollows:(1)theschemapreciselydescribes permissibleXMLdocuments,(2)computerprogramscandeterminewhetherornotagivenXML documentispermittedbytheschema,and(3)wecanusetheschemaforcreatingapplication programs(bygeneratingcodeskeletons,forexample).Thus,schemasplayaveryimportantrole indevelopmentofXML-basedapplications. Severallanguagesforwritingschemas,whichwecallschemalanguages,havebeenproposedin thepast.Somelanguages(e.g.,DTD)areconcernedwithXMLdocumentsingeneral;thatis, theyhandleelementsandattributes.Otherlanguagesareconcernedwithparticulartypeof informationwhichmayberepresentedbyXML;primaryconstructsforsuchinformationarenot elementsorattributes,butratherconstructsspecifictothattypeofinformation.RDFSchemaof W3Cisanexampleofsuchaschemalanguage.SinceprimaryconstructsforRDFinformation areresources,properties,andstatements,RDFSchemaisconcernedwithresources,properties, andstatementsratherthanelementsandattributes.Inthispaper,welimitourconcerntoschema languagesforXMLdocumentsingeneral(i.e.,elementsandattributes);specifically,weconsider DTD[BPS00],XML-Schema[TBMM00],DSD[KMS00],RELAXCore[Mur00b],andTREX [Cla01].1Althoughschemalanguagesdedicatedtoparticulartypesofinformation(e.g.,RDF, XTM,andER)areusefulforparticularapplications,theyareoutsidethescopeofthispaper. Webelievethatprovidingaformalframeworkiscrucialinunderstandingvariousaspectsof schemalanguagesandfacilitatingefficientimplementationsofschemalanguages.Weuse regulartreegrammartheory([Tak75]and[CDG+97])toformallycaptureschemas,schema languages,anddocumentvalidation.Regulartreegrammarshaverecentlybeenusedbymany ExtremeMarkupLanguages2000 1 MakotoMurata,DongwonLeeandMuraliMani•TaxonomyofXMLSchemaLanguagesusingFormalLanguage Theory researchersforrepresentingschemasorqueriesforXMLandhavebecomethemainstreamin thisarea(seeOASIS[OA01]andVianu[VV01]);inparticular,XMLQuery[CCF+01]ofW3C isbasedontreegrammars. Ourcontributionsareasfollows: 1. Wedefinefoursubclassesofregulartreegrammarsandtheircorrespondinglanguagesto describeschemasprecisely; 2. Weshowalgorithmsforvalidationofdocumentsagainstschemasforthesesubclasses andconsiderthecharacteristicsofthesealgorithms(e.g.,thetreemodel.vs.theevent model);and 3. Basedonregulartreegrammarsandthesevalidationalgorithms,wepresentadetailed analysisandcomparisonofafewXMLschemaproposalsandtypesystems;anXML schemaproposalAismoreexpressivethananotherproposalBifthesubclasscapturedby AproperlyincludesthatcapturedbyB. Theremainderofthispaperisorganizedasfollows.InSection2,weconsiderrelatedworks suchasothersurveypapersonXMLschemalanguages.InSection3,wefirstintroduceregular treelanguagesandgrammars,andthenintroducerestrictedclasses.InSection4,weintroduce validationalgorithmsforthefourclasses,andconsidertheircharacteristics.InSection5,onthe basisoftheseobservations,weevaluatedifferentXMLschemalanguageproposals.Finally, concludingremarksandthoughtsonfutureresearchdirectionsarediscussedinSection6. 2. RelatedWork MorethantenschemalanguagesforXMLhaveappearedrecently,and[Jel00]and[LC00] attempttocompareandclassifysuchXMLschemaproposalsfromvariousperspectives. However,theirapproachesarebyandlargenotmathematicalsothattheprecisedescriptionand comparisonamongschemalanguageproposalsarenotstraightforward.Ontheotherhand,this paperfirstestablishesaformalframeworkbasedonregulartreegrammars,andthencompares schemalanguageproposals. SinceKilhoShinadvocateduseoftreeautomataforstructureddocumentsin1992,many researchershaveusedregulartreegrammarsortreeautomataforXML(seeOASIS[OA01]and Vianu[VV01]).However,tothebestofourknowledge,nopapershaveusedregulartree grammarstoclassifyandcompareschemalanguageproposals.Furthermore,weintroduce subclassesofregulartreegrammars,andpresentacollectionofvalidationalgorithmsdedicated tothesesubclasses. XML-SchemaFormalDescription(formerlycalledMSL[BFRW01])isamathematicalmodelof XML-Schema.However,itistailoredforXML-Schemaandisthusunabletocaptureother schemalanguages.Meanwhile,ourframeworkisnottailoredforaparticularschemalanguage. Asaresult,allschemalanguagescanbecaptured,althoughfinedetailsofeachschemalanguage arenot. ExtremeMarkupLanguages2000 2 MakotoMurata,DongwonLeeandMuraliMani•TaxonomyofXMLSchemaLanguagesusingFormalLanguage Theory 3. TreeGrammars Inthissection,asamechanismfordescribingpermissibletrees,westudytreegrammars.We beginwithaclassoftreegrammarscalled"regular",andthenintroducethreerestrictedclasses called"local","single-type",and"restrained-competition". Somereadersmightwonderwhywedonotusecontext-free(string)grammars.Context-free (string)grammars[HU79]representsetsofstrings.Successfulparsingofstringsagainstsuch grammarsprovidesderivationtrees.Thisscenarioisappropriateforprogramminglanguages andnaturallanguages,whereprogramsandnaturallanguagetextarestringsratherthantrees. Ontheotherhand,starttagsandendtagsinanXMLdocumentcollectivelyrepresentatree. Sincetraditionalcontext-free(string)grammarsareoriginallydesignedtodescribepermissible strings,theyareinappropriatefordescribingpermissibletrees. 3.1 RegularTreeGrammarsandLanguages Weborrowthedefinitionsofregulartreelanguagesandtreeautomatain[CDG+97],butallow treeswith"infinitearity";thatis,weallowanodetohaveanynumberofsubordinatenodes,and allowtheright-handsideofaproductionruletohavearegularexpressionovernon-terminals. Definition1.(RegularTreeGrammar)Aregulartreegrammar(RTG)isa4-tupleG=(N,T, S,P),where: • Nisafinitesetofnon-terminals, • Tisafinitesetofterminals, • Sisasetofstartsymbols,whereSisasubsetofN. • PisafinitesetofproductionrulesoftheformX→ar,whereX∈N,a∈T,andrisa regularexpressionoverN;Xistheleft-handside,aristheright-handside,andristhe contentmodelofthisproductionrule. Example1.ThefollowinggrammarG1=(N,T,S,P)isaregulartreegrammar.Theleft-hand side,right-handside,andcontentmodelofthefirstproductionruleisDoc,Doc(Para1, Para2*),and(Para1,Para2*),respectively. N={Doc,Para1,Para2,Pcdata} T={doc,para,pcdata} S={Doc} P={Doc→doc(Para1,Para2*),Para1→para(Pcdata), Para2→para(Pcdata),Pcdata→pcdata } Werepresenteverytextvaluebythenodepcdataforconvenience. Withoutlossofgenerality,wecanassumethatnotwoproductionruleshavethesamenon- terminalintheleft-handsideandthesameterminalintheright-handsideatthesametime.Ifa regulartreegrammarcontainssuchproductionrules,weonlyhavetomergethemintoasingle ExtremeMarkupLanguages2000 3 MakotoMurata,DongwonLeeandMuraliMani•TaxonomyofXMLSchemaLanguagesusingFormalLanguage Theory productionrule.Wealsoassumethateverynon-terminaliseitherastartsymboloroccursinthe contentmodelofsomeproductionrule(inotherwords,nonon-terminalsareuseless). Wehavetodefinehowaregulartreegrammargeneratesasetoftreesoverterminals.Wefirst defineinterpretations. Definition2.(Interpretation)AninterpretationIofatreetagainstaregulartreegrammarGis amappingfromeachnodeeinttoanon-terminal,denotedI(e),suchthat: • I(eroot)isastartsymbolwhereerootistherootoft,and X→ar • foreachnodeeanditssubordinatese0,e1,...,ei,thereexistsaproductionrule suchthat • I(e)isX, • theterminal(label)ofeisa,and • I(e0)I(e1)...I(ei)matchesr. Now,wearereadytodefinegenerationoftreesfromregulartreegrammars,andregulartree languages. Definition3.(Generation)AtreetisgeneratedbyaregulartreegrammarGifthereisan interpretationoftagainstG. Example2.AninstancetreegeneratedbyG1,anditsinterpretationagainstG1areshownin Figure1. d1 Doc p1 p2 Para1 Para2 pcdata1 pcdata2 Pcdata Pcdata a) Instance tree, t b) Interpretation of t generated by G against G Figure1. AninstancetreegeneratedbyG1,anditsinterpretationagainstG1.Weuseunique labelstorepresentthenodesintheinstancetree. Definition4.(RegularTreeLanguage)Aregulartreelanguageisthesetoftreesgeneratedby aregulartreegrammar. ExtremeMarkupLanguages2000 4 MakotoMurata,DongwonLeeandMuraliMani•TaxonomyofXMLSchemaLanguagesusingFormalLanguage Theory 3.2 LocalTreeGrammarsandLanguages Wefirstdefinecompetitionofnon-terminals,whichmakesvalidationdifficult.Then,we introducearestrictedclasscalled``local''byprohibitingcompetitionofnon-terminals[Tak75]. ThisclassroughlycorrespondstoDTD. Definition5.(CompetingNon-Terminals)Twodifferentnon-terminalsAandBaresaid competingwitheachotherif • oneproductionrulehasAintheleft-handside, • anotherproductionrulehasBintheleft-handside,and • thesetwoproductionrulessharethesameterminalintheright-handside. Example3.ConsideraregulartreegrammarG3=(N,T,S,P),where: N={Book,Author1,Son,Article,Author2,Daughter} T={book,author,son,daughter} S={Book,Article} P={Book→book(Author1),Author1→author(Son),Son→son , Article→article(Author2),Author2→author(Daughter), Daughter→daughter } Author1andAuthor2competewitheachother,sincetheproductionruleforAuthor1andthatfor Author2sharetheterminalauthorintheright-handside.Therearenoothercompetingnon-
Recommended publications
  • Using Tree Automata and Regular Expressions to Manipulate
    Using Tree Automata and Regular Expressions to Manipulate Hierarchically Structured Data Nikita Schmidt and Ahmed Patel University College Dublin∗ Abstract Information, stored or transmitted in digital form, is often structured. Individual data records are usually represented as hierarchies of their elements. Together, records form larger structures. Information processing applications have to take account of this structuring, which assigns different semantics to different data elements or records. Big variety of structural schemata in use today often requires much flexibility from applications—for example, to process information coming from different sources. To ensure application interoperability, translators are needed that can convert one structure into another. This paper puts forward a formal data model aimed at supporting hierarchical data pro- cessing in a simple and flexible way. The model is based on and extends results of two classical theories, studying finite string and tree automata. The concept of finite automata and regular languages is applied to the case of arbitrarily structured tree-like hierarchical data records, represented as “structured strings.” These automata are compared with classical string and tree automata; the model is shown to be a superset of the classical models. Regular grammars and expressions over structured strings are introduced. Regular expression matching and substitution has been widely used for efficient unstruc- tured text processing; the model described here brings the power of this proven technique to applications that deal with information trees. A simple generic alternative is offered to replace today’s specialised ad-hoc approaches. The model unifies structural and content transforma- tions, providing applications with a single data type.
    [Show full text]
  • Custom XML Attribute
    ISO/IEC JTC 1/SC 34/WG 4 N 0050 IS 29500:2008 Defect Report Log Rex Jaeschke (project editor) ([email protected]) 2009-05-27 DRs 08-0001 through 08-0015 DRs 09-0001 through 09-0230 IS 29500:2008 Defect Report Log Introduction ........................................................................................................................................................................ 1 Revision History ................................................................................................................................................................... 3 DR Status at a Glance .......................................................................................................................................................... 5 1. DR 08-0001 — DML, Framework: Removal of ST_PercentageDecimal from the strict schema .............................. 6 2. DR 08-0002 — Primer: Format of ST_PositivePercentage values in strict mode examples .................................... 9 3. DR 08-0003 — DML, Main: Format of ST_PositivePercentage values in strict mode examples ........................... 12 4. DR 08-0004 — DML, Diagrams: Type for prSet attributes ................................................................................... 14 5. DR 08-0005 — PML, Animation: Description of hsl attributes Lightness and Saturation ..................................... 20 6. DR 08-0006 — PML, Animation: Description of rgb attributes Blue, Green and Red ........................................... 22 7. DR 08-0007 — DML, Main: Format
    [Show full text]
  • Taxonomy of XML Schema Languages Using Formal Language Theory
    Taxonomy of XML Schema Languages using Formal Language Theory MAKOTO MURATA IBM Tokyo Research Lab DONGWON LEE Penn State University MURALI MANI Worcester Polytechnic Institute and KOHSUKE KAWAGUCHI Sun Microsystems On the basis of regular tree grammars, we present a formal framework for XML schema languages. This framework helps to describe, compare, and implement such schema languages in a rigorous manner. Our main results are as follows: (1) a simple framework to study three classes of tree languages (“local”, “single-type”, and “regular”); (2) classification and comparison of schema languages (DTD, W3C XML Schema, and RELAX NG) based on these classes; (3) efficient doc- ument validation algorithms for these classes; and (4) other grammatical concepts and advanced validation algorithms relevant to XML model (e.g., binarization, derivative-based validation). Categories and Subject Descriptors: H.2.1 [Database Management]: Logical Design—Schema and subschema; F.4.3 [Mathematical Logic and Formal Languages]: Formal Languages— Classes defined by grammars or automata General Terms: Algorithms, Languages, Theory Additional Key Words and Phrases: XML, schema, validation, tree automaton, interpretation 1. INTRODUCTION XML [Bray et al. 2000] is a meta language for creating markup languages. To represent an XML based language, we design a collection of names for elements and attributes that the language uses. These names (i.e., tag names) are then used by application programs dedicated to this type of information. For instance, XHTML [Altheim and McCarron (Eds) 2000] is such an XML-based language. In it, permissible element names include p, a, ul, and li, and permissible attribute names include href and style.
    [Show full text]
  • XML Access Control Using Static Analysis
    XML Access Control Using Static Analysis Makoto Murata Akihiko Tozawa Michiharu Kudo IBM Tokyo Research Lab/IUJ IBM Tokyo Research Lab IBM Tokyo Research Lab Research Institute 1623-14, Shimotsuruma, 1623-14, Shimotsuruma, 1623-14, Shimotsuruma, Yamato-shi, Yamato-shi, Yamato-shi, Kanagawa-ken 242-8502, Kanagawa-ken 242-8502, Kanagawa-ken 242-8502, Japan Japan Japan [email protected] [email protected] [email protected] ABSTRACT 1. INTRODUCTION Access control policies for XML typically use regular path XML [5] has become an active area in database research. expressions such as XPath for specifying the objects for ac- XPath [6] and XQuery [4] from the W3C have come to be cess control policies. However such access control policies widely recognized as query languages for XML, and their are burdens to the engines for XML query languages. To implementations are actively in progress. In this paper, relieve this burden, we introduce static analysis for XML we are concerned with fine-grained (element- and attribute- access control. Given an access control policy, query expres- level) access control for XML database systems rather than sion, and an optional schema, static analysis determines if document-level or collection-level access control. We be- this query expression is guaranteed not to access elements lieve that access control plays an important role in XML or attributes that are permitted by the schema but hidden database systems, as it does in relational database systems. by the access control policy. Static analysis can be per- Some early experiences [21, 10, 3] with access control for formed without evaluating any query expression against an XML documents have been reported already.
    [Show full text]
  • Towards Conceptual Modeling for XML
    Towards Conceptual Modeling for XML Erik Wilde Swiss Federal Institute of Technology Zurich¨ (ETHZ) Abstract: Today, XML is primarily regarded as a syntax for exchanging structured data, and therefore the question of how to develop well-designed XML models has not been studied extensively. As applications are increasingly penetrated by XML technologies, and because query and programming languages provide native XML support, it would be beneficial to use these features to work with well-designed XML models. In order to better focus on XML-oriented technologies in systems engineering and programming languages, an XML modeling language should be used, which is more focused on modeling and structure than typical XML schema languages. In this paper, we examine the current state of the art in XML schema languages and XML modeling, and present a list of requirements for a XML conceptual modeling language. 1 Introduction Due to its universal acceptance as a way to exchange structured data, XML is used in many different application domains. In many scenarios, XML is used mainly as an exchange format (basically, a way to serialize data that has a model defined in some non- XML way), but an increasing number of applications is using XML as their native data model. Developments like the XML Query Language (XQuery) [BCF+05] are a proof of the fact that XML becomes increasingly visible on higher layers of the software architec- ture, and in this paper we argue that current XML modeling approaches are not sufficient to support XML in this new context. Rather than treating XML as a syntax for some non-XML data model, it would make more sense to embrace XML’s structural richness by using a modeling language with built-in XML support.
    [Show full text]
  • Document Declaration Is Missing Html Attribute
    Document Declaration Is Missing Html Attribute Flimsier and horror-stricken Jerrold gnawn her squinancy warm-ups while Duncan recompense some ribbings vitalistically. Assessable temperamentally?and bookable Quinn still caracoling his syllogism unpreparedly. Reynolds remains optic: she gleans her blackboy jutty too As html attribute declarations declare a declared. The DOM tree is unmodified at local point. It sets formatted output with indentation, without BOM and with default node declaration, if necessary. DLL: this is needed as it is the only reliable way to ensure that building a DLL succeeded. RELAX Core that not prescribe such structured strings. We will identify the effective date of the revision in the posting. You are already subscribed. The lower when. Indicates a XUL textbox element. This alerts you if the heading image is specified with empty ALT attribute as below. In this post, I will tell you Some Important Css tricks. People who am i wake up and attribute is performed if both? In calls do other document declaration is missing attribute or a format that the allowed as seen some authors. In this case, points will not be deducted. If numguess already exists, use the existing. REST services is to include error details in the body of the response. Count distinct observations over requested axis. This works in just enclose further calls are not been chosen to know how and let it? Unified EL as used by JSF and JSP. How do before following compare? As follows flow of each entry points in this warning is used to delimited column labels. Most html document declaration, it just pass a missing property after that should prefer not documented.
    [Show full text]
  • Logics for XML
    Institut National Polytechnique de Grenoble Logics for XML Pierre Geneves` Thesis presented in partial fulfillment of the requirements for the degree of Ph.D. in Computer Science and Software Systems from the Institut National Polytechnique de Grenoble. Dissertation prepared at the Institut National de Recherche en Informatique et Automatique, Montbonnot, France. Thesis th arXiv:0810.4460v2 [cs.PL] 24 May 2014 defended on the 4 of December 2006. Board of examiners: Giorgio Ghelli Referee Denis Lugiez Referee Makoto Murata Referee Christine Collet Examiner Vincent Quint Ph.D. advisor Nabil Laya¨ıda Invited Member Abstract This thesis describes the theoretical and practical foundations of a system for the static analysis of XML processing languages. The system relies on a fixpoint temporal logic with converse, derived from the µ-calculus, where models are finite trees. This calculus is expressive enough to capture regular tree types along with multi-directional navigation in trees, while having a single exponential time complexity. Specifically the decidability of the logic is proved in time 2O(n) where n is the size of the input formula. Major XML concepts are linearly translated into the logic: XPath naviga- tion and node selection semantics, and regular tree languages (which include DTDs and XML Schemas). Based on these embeddings, several problems of major importance in XML applications are reduced to satisfiability of the logic. These problems include XPath containment, emptiness, equivalence, overlap, coverage, in the presence or absence of regular tree type constraints, and the static type-checking of an annotated query. The focus is then given to a sound and complete algorithm for deciding the logic, along with a detailed complexity analysis, and crucial implementation techniques for building an effective solver.
    [Show full text]
  • Parsing and Querying XML Documents in SML
    Parsing and Querying XML Documents in SML Dissertation zur Erlangung des akademischen Grades des Doktors der Naturwissenschaften am Fachbereich IV der Universitat¨ Trier vorgelegt von Diplom-Informatiker Andreas Neumann Dezember 1999 The forest walks are wide and spacious; And many unfrequented plots there are. The Moor Aaron, in1: The Tragedy of Titus Andronicus by William Shakespeare Danksagung An dieser Stelle mochte¨ ich allen Personen danken, die mich bei der Erstellung meiner Arbeit direkt oder indirekt unterstutzt¨ haben. An erster Stelle gilt mein Dank dem Betreuer meiner Arbeit, Helmut Seidl. In zahlreichen Diskussionen hatte er stets die Zeit und Geduld, sich mit meinen Vorstellungen von Baumautomaten auseinanderzusetzen. Seine umfassendes Hintergrundwissen und die konstruktive Kritik, die er ausubte,¨ haben wesent- lich zum Gelingen der Arbeit beigetragen. Besonderer Dank gilt Steven Bashford und Craig Smith fur¨ das muhevol-¨ le und aufmerksame Korrekturlesen der Arbeit. Sie haben manche Unstim- migkeit aufgedeckt und nicht wenige all zu “deutsche” Formulierungen in meinem Englisch gefunden. Allen Mitarbeitern des Lehrstuhls fur¨ Program- miersprachen und Ubersetzerbau¨ an der Universitat¨ Trier sei hiermit auch fur¨ die gute Arbeitsatmosphare¨ ein Lob ausgesprochen, die gewiss nicht an jedem Lehrstuhl selbstverstandlich¨ ist. Nicht zuletzt danke ich meiner Freundin Dagmar, die in den letzten Mo- naten viel Geduld und Verstandnis¨ fur¨ meine haufige¨ – nicht nur korperliche¨ – Abwesenheit aufbrachte. Ihre Unterstutzung¨ und Ermutigungen so wie ihr stets offenes Ohr waren von unschatzbarem¨ Wert. 1Found with fxgrep using the pattern ==SPEECH[ (==”forest”) ] in titus.xml [Bos99] Abstract The Extensible Markup Language (XML) is a language for storing and exchang- ing structured data in a sequential format.
    [Show full text]
  • PLAN-X 2006 Informal Proceedings
    BRICS NS-05-6 Castagna & Raghavachari (eds.): PLAN-X 2006 Informal Proceedings BRICS Basic Research in Computer Science PLAN-X 2006 Informal Proceedings Charleston, South Carolina, January 14, 2006 Giuseppe Castagna Mukund Raghavachari (editors) BRICS Notes Series NS-05-6 ISSN 0909-3206 December 2005 Copyright c 2005, Giuseppe Castagna & Mukund Raghavachari (editors). BRICS, Department of Computer Science University of Aarhus. All rights reserved. Reproduction of all or part of this work is permitted for educational or research use on condition that this copyright notice is included in any copy. See back inner page for a list of recent BRICS Notes Series publications. Copies may be obtained by contacting: BRICS Department of Computer Science University of Aarhus Ny Munkegade, building 540 DK–8000 Aarhus C Denmark Telephone: +45 8942 3360 Telefax: +45 8942 3255 Internet: [email protected] BRICS publications are in general accessible through the World Wide Web and anonymous FTP through these URLs: http://www.brics.dk ftp://ftp.brics.dk This document in subdirectory NS/05/6/ PLAN-X 2006 Informal Proceedings Charleston, South Carolina 14 January 2006 Invited Talk • Service Interaction Patterns 1 John Evdemon Papers • Statically Typed Document Transformation: An Xtatic Experience 2 Vladimir Gapeyev, Franc¸ois Garillot and Benjamin Pierce • Type Checking with XML Schema in XACT 14 Christian Kirkegaard and Anders Møller • PADX: Querying Large-scale Ad Hoc Data with XQuery 24 Mary Fernandez, Kathleen Fisher, Robert Gruber and Yitzhak Mandelbaum •
    [Show full text]
  • XML Graphs in Program Analysis
    XML Graphs in Program Analysis Anders Møller and Michael I. Schwartzbach BRICS, Department of Computer Science University of Aarhus, Denmark {amoeller,mis}@brics.dk Abstract by common schema formalisms, such as DTD [7] and XML Schema [39]; XML graphs have shown to be a simple and effective formalism for representing sets of XML documents in program analysis. It • it must allow static validation against common schema for- has evolved through a six year period with variants tailored for a malisms and also navigation with XPath expressions [14]; and range of applications. We present a unified definition, outline the it must be fully implemented. key properties including validation of XML graphs against differ- • ent XML schema languages, and provide a software package that In this paper we describe the XML graph model, which has matured enables others to make use of these ideas. We also survey four very through substantial practical experience in building static analyses different applications: XML in Java, Java Servlets and JSP, trans- of languages that manipulate XML. We also survey four different formations between XML and non-XML data, and XSLT. applications, showing the versatility of the model. XML graphs are fully implemented and available in an open source software pack- 1 Introduction age. Many interesting programming formalisms deal explicitly with 2 XMLGraphs XML documents. Examples range from domain-specific lan- guages, such as XSLT and XQuery, to general-purpose languages, W3C’s DOM [1] and XDM [17] both provide a view of XML doc- such as Java in which XML documents may be handled by special uments as unranked, labeled, finite trees.
    [Show full text]
  • Iso/Iec Jtc 1/Sc 34 N 1707 Date: 2011-10-03
    ISO/IEC JTC 1/SC 34 N 1707 DATE: 2011-10-03 ISO/IEC JTC 1/SC 34 Document Description and Processing Languages Secretariat: Japan (JISC) DOC. TYPE Resolutions Resolutions of the ISO/IEC JTC 1/SC 34 Plenary Meeting, Busan, TITLE Republic of Korea, 2011-09-30 SOURCE SC 34 Secretariat PROJECT Resolutions adopted at the ISO/IEC JTC 1/SC 34 Plenary Meeting, STATUS Busan, Republic of Korea, 2011-09-30 ACTION ID FYI DUE DATE P, O and L Members of ISO/IEC JTC 1/SC 34 ; ISO/IEC JTC 1 DISTRIBUTION Secretariat; ISO/IEC ITTF ACCESS LEVEL Open ISSUE NO. 230 NAME 1707.pdf SIZE FILE (KB) 8 PAGES Secretariat ISO/IEC JTC 1/SC 34 - IPSJ/ITSCJ (Information Processing Society of Japan/Information Technology Standards Commission of Japan)* Room 308-3, Kikai-Shinko-Kaikan Bldg., 3-5-8, Shiba-Koen, Minato-ku, Tokyo 105-0011 Japan *Standard Organization Accredited by JISC Telephone: +81-3-3431-2808; Facsimile: +81-3-3431-6493; E-mail: [email protected] Resolutions of the ISO/IEC JTC 1/SC 34 Plenary Meeting, Busan, Republic of Korea, 2011-09-30 National bodies and Liaison organizations present at this plenary: Brazil, China, Germany, Japan, Korea (Republic of), United Kingdom, USA, Ecma International, and OASIS Resolution 1: Endorsement of Chairman SC 34 endorses Prof. Sam Gyun OH as the Chairman of SC 34 for an additional term. SC 34 Secretariat is instructed to forward this nomination to JTC 1 for appointment at the JTC 1 Plenary Meeting to be held in San Diego in November 2011.
    [Show full text]
  • Converting Into Pattern-Based Schemas: a Formal Approach
    Extreme Markup Languages 2007® Montréal, Québec August 7-10, 2007 Converting into pattern-based schemas: a formal approach Antonina Dattolo Angelo Di Iorio Department of Mathematics and Department of Computer Science, University of Applications R. Caccioppoli, University of Bologna Napoli Federico II Silvia Duca Antonio Angelo Feliziani Department of Computer Science, Department of Computer Science, University of University of Bologna Bologna Fabio Vitali Department of Computer Science, University of Bologna Abstract A traditional distinction among markup languages is how descriptive or prescriptive they are. We identify six levels along the descriptive/prescriptive spectrum. Schemas at a specific level of descriptiveness that we call "Descriptive No Order" (DNO) specify a list of allowable elements, their number and requiredness, but do not impose any order upon them. We have defined a pattern-based model based on a set of named patterns, each of which is an object and its composition rule (content model), enough to write descriptive schemas for arbitrary documents. We show that any schema can be converted into a pattern-based one without loss of information at the DNO level (invariant conversion). We present a formal analysis of invariant conversions of arbitrary schemas as a demonstration of the correctness and completeness of our pattern model. Although all examples are given in DTD syntax, the results should apply equally to XSD, Relax NG, or other schema languages. Converting into pattern-based schemas: a formal approach Table of
    [Show full text]