COM-18-7987 Research Note R. Knox 30 December 2002

Commentary

Here's What's Wrong With XML-Defined Standards

It is relatively easy to construct XML standards for vertical domains. This has led to thousands of redundant and conflicting specifications. This chaos is the direct result of how these standards are developed.

Extensible Markup Language (XML)-defined standards began with a number of vertically focused standards in 1999 (see Figure 1) but grew to an overwhelming number by 2001 (see Figure 2).

Figure 1. All Tied Up With XML: 1999

[Figure: a cloud of standards names, in no particular order: Schema for OO XML (SOX); Chemical Markup Language (CML); Financial Information eXchange Markup Language (FIXML); Weather Observation Markup Format (OMF); XML Linking Language (XLink); XML Forms Architecture (XFA); Open MLS (Real Estate); Health Level 7 (HL7); Resource Description Framework (RDF); Vector Markup Language (VML); Signed Document Markup Language (SDML); Open Trading Protocol (OTP); Universal Commerce Language and Protocol (UCLP); Synchronized Multimedia Integration Language (SMIL); Commerce XML (cXML); Meta Content Framework (MCF); Document Object Model (DOM); Commerce Business Library (CBL); XML Metadata Interchange (XMI); Extensible Forms Description Language (XFDL); Bank Internet Payment System; Open Software Distribution (OSD); SmartX (Smart Card) Markup Language (SML); XML Data; Web Interface Definition Language (WIDL); Scalable Vector Graphics (SVG); XML Query Language (XML-QL); Document Type Definition (DTD); Channel Definition Format (CDF); Extensible Style Language (XSL); XML-Electronic Data Interchange (XML-EDI); Precision Graphics Markup Language (PGML); Web Collections; XML Namespaces; Extensible Markup Language (XML); Document Content Description (DCD); XSL Transformations (XSLT); Information and Content Exchange (ICE).]

Source: Gartner Research


Entire contents © 2002 Gartner, Inc. All rights reserved. Reproduction of this publication in any form without prior written permission is forbidden. The information contained herein has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Gartner shall have no liability for errors, omissions or inadequacies in the information contained herein or for interpretations thereof. The reader assumes sole responsibility for the selection of these materials to achieve its intended results. The opinions expressed herein are subject to change without notice.

Figure 2. All Tied Up With XML: 2001

[Figure: a far denser cloud of standards names than Figure 1. Entries that remain legible include Geography Markup Language (GML), BIOpolymer Markup Language (BIOML), MusicXML, Security Assertion Markup Language (SAML), Open Financial Exchange (OFX), DARPA Agent Markup Language (DAML), Extensible Business Reporting Language (XBRL), Electronic Business XML (ebXML), Web Services Description Language (WSDL), Business Rules for E-Commerce (BREC), Corpus Encoding Standard for XML (XCES), Virtual Instruments Meta Language (VIML), Health Level 7 (HL7), Scalable Vector Graphics (SVG), Document Object Model (DOM) and Open Software Distribution (OSD); the rest of the graphic is not recoverable.]

Source: Gartner Research

The number of XML-defined standards is still growing; today there are thousands of them, often closely related, redundant and conflicting. This problematic situation is the direct result of how these standards are developed and of at least four distinct problems.

There Are No Rules. There are no rules that say what the scope of an XML standard must be, how big or small it can be, or how it should be developed. A single company may develop and publish a standard for public development and use. This was done originally, for example, with TaXML (www.taxml.org) and Value Chain Markup Language (VCML; see www.vcml.net). On the other hand, a group with global companies and organizations as members may be involved, as is the case with Electronic Business XML (ebXML; www.ebXML.org), sponsored by the Organization for the Advancement of Structured Information Standards (OASIS) and the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT). Certainly the scope of a global tax initiative (for example, TaXML) is extremely complex, and such an initiative must ultimately involve many players if it is to be successful. However, as originally initiated, it was relatively easy for a single company to provide the impetus and direction for the work. Well, why not? There are no requirements that dictate which groups must be involved based on the scope and potential impact of the standard that is ultimately developed.

It's Too Easy to Get Into the Standards Game. Because of the interest in the domain a standard covers, the group convened to create it can be very large (in contrast to the original work on TaXML). For example, Extensible Business Reporting Language (XBRL; www.xbrl.org) is a standard for exchanging and analyzing financial information. As such, it concerns the accounting industry, investors, public and private companies, and regulators. More than 170 companies, associations and government organizations participate. HR-XML (www.hr-.org) is developing XML specifications to automate human-resources-related data exchange; it has approximately 120 participating companies and organizations. Although not all member organizations participate in all aspects of the standards development work, many have some say in, or contribute to, much of it, and they often have voting and approval rights. The level of agreement required complicates and slows down the development process, and often results in standards that meet only lowest-common-denominator requirements.

Some wonder why the World Wide Web Consortium (W3C) or OASIS doesn't supervise the standards work. The W3C's mission is to "lead the World Wide Web to its full potential by developing common protocols that promote its evolution and ensure its interoperability." Hence, its focus is on core standards like XML itself. OASIS is a global consortium that "drives the development, convergence and adoption of e-business standards." It hosts work on Universal Business Language and jointly works with UN/CEFACT on ebXML; it has tended to focus on specifications relevant to many vertical domains rather than to any particular one. It is impossible for one organization to orchestrate or supervise standards development in all vertical domains, and we don't expect any organization to announce such plans.

Most of the Time, It Takes Too Long. As the size of a working group grows, so does the time to develop a standard. Discussions start about what things should be called. Should a shorthand notation be used for element names to conserve storage space and reduce bandwidth requirements? Or should long names be used so the standards are self-documenting? How all-inclusive should the definition of an element be? For example, should "person" include the person's address and phone numbers? Or just fixed information such as name, social security number and date of birth? If an enterprise attempts to standardize common elements wherever they are used in its systems, it can take a long time to reach agreement. In an investment firm, for example, this would mean the marketing department, account maintenance, personal investors, fund managers and the legal department would have to reach agreement on a common model (and XML representation) for their clients. This is a notoriously difficult, territorial and time-consuming task. This task would be repeated for every shared element, such as a product description or an earnings report.
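The naming debate can be made concrete with a small sketch. The Python below serializes the same "person" record twice, once with long self-documenting element names and once with shorthand names. Both vocabularies are hypothetical and invented for illustration; neither comes from any published standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; neither vocabulary comes from a published standard.
VERBOSE = {"person": "Person", "name": "FullName", "dob": "DateOfBirth"}
TERSE = {"person": "p", "name": "n", "dob": "d"}


def build_person(names, full_name, date_of_birth):
    """Serialize one person record using the supplied element names."""
    person = ET.Element(names["person"])
    ET.SubElement(person, names["name"]).text = full_name
    ET.SubElement(person, names["dob"]).text = date_of_birth
    return ET.tostring(person, encoding="unicode")


verbose_xml = build_person(VERBOSE, "Ada Example", "1970-01-01")
terse_xml = build_person(TERSE, "Ada Example", "1970-01-01")

print(verbose_xml)  # readable without a data dictionary
print(terse_xml)    # shorter on the wire, opaque without documentation
print(len(verbose_xml), len(terse_xml))
```

The data is identical in both messages; only the tags differ. That is why the argument is so hard to settle on technical grounds alone: the shorthand form saves bytes, the long form saves explanation, and a working group has to pick one.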

Standards Grow to Be Too Big. Historically, XML standards have been monolithic. The approach of creating all-encompassing standards follows the precedent set with document type definitions (DTDs) developed for the Standard Generalized Markup Language, XML's parent language. This means that everything in the domain intended to be covered by the standard is captured in the one standard, even though any particular transaction or data-sharing instance may not require the complete standard.

This makes maintenance of most XML standards extremely difficult. New requirements call for modifications to the published standard (committee work, approval and publication), and applications must be modified to handle the new additions (that is, either to process the additional data or to ignore it as an optional component). Final approval of standards is delayed to ensure that all requirements have been addressed, yet, as soon as agreement is reached, new requirements are identified. Domains are constantly evolving, so no standard, no matter how big or apparently comprehensive, can be "right" for long.
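The "ignore it as an optional component" option can be sketched in a few lines. The Python below is one possible approach, not a prescribed one: an application written against an earlier revision of a standard reads the elements it knows and skips, rather than rejects, anything added later. The element names (Client, Name, DateOfBirth, RiskProfile) are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical "version 1" vocabulary that the application was written against.
KNOWN_FIELDS = {"Name", "DateOfBirth"}


def read_client(xml_text):
    """Return the known fields of a Client record and the names of skipped elements."""
    root = ET.fromstring(xml_text)
    known, ignored = {}, []
    for child in root:
        if child.tag in KNOWN_FIELDS:
            known[child.tag] = child.text
        else:
            # Element added by a later revision: skipped, not treated as an error.
            ignored.append(child.tag)
    return known, ignored


# A message produced under a hypothetical later revision that added RiskProfile.
v2_message = (
    "<Client>"
    "<Name>Ada Example</Name>"
    "<DateOfBirth>1970-01-01</DateOfBirth>"
    "<RiskProfile>moderate</RiskProfile>"
    "</Client>"
)

known, ignored = read_client(v2_message)
print(known)    # {'Name': 'Ada Example', 'DateOfBirth': '1970-01-01'}
print(ignored)  # ['RiskProfile']
```

Even this tolerant style has to be designed in from the start. An application that validates strictly against the old definition would reject the new message outright, which is the maintenance burden the paragraph above describes.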

Conclusions: Wasn't XML supposed to make data shareable? No. XML provides the tools to define shareable data models, but it does not make them shareable any more than the alphabet makes every word in the English language understood by anyone who speaks English.
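A minimal sketch of the alphabet analogy, under assumed names: two departments describe the same client in XML, each with its own hypothetical vocabulary. Both messages are well-formed and parse without error, yet a reader written for one vocabulary finds nothing in the other.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies describing the same client.
marketing_msg = "<Client><FullName>Ada Example</FullName></Client>"
legal_msg = (
    "<Customer><Name><Given>Ada</Given><Family>Example</Family></Name></Customer>"
)


def marketing_reader(xml_text):
    """Extract the client name the way the marketing vocabulary defines it."""
    return ET.fromstring(xml_text).findtext("FullName")


print(marketing_reader(marketing_msg))  # Ada Example
print(marketing_reader(legal_msg))      # None
```

The parser accepts both messages because XML guarantees only shared syntax. Agreement on what the elements are called and what they mean still has to come from the people defining the model.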

Aren't the problems with XML standards development true of all standards efforts? Yes, but no other standards effort has generated, or is generating, specifications at the rate XML is (in the hundreds). XML is a metalanguage for creating standards specifications. So, it's the efforts XML supports in so many vertical domains that are both its strength and the source of the problem. The simultaneous efforts that create redundant models are the problem.

Bottom Line: Without some revolutionary change to the way in which XML-defined standards are developed, the maze of standards will continue to proliferate, and there will be no way to discover redundancies or identify conflicts and reconcile them. At the least, the proliferation of standards will result in millions of dollars of lost effort. At worst, it will corrupt data and compromise business-critical transactions and operations because different parts of the same company will process conflicting XML messages without knowing it. From 2001 through 2004, enterprises worldwide will spend more than $3 billion on XML modeling activities with no return on investment on $2 billion of it (0.8 probability). Enterprises must take seriously the need to look for better approaches to XML standards development. Gartner recommends that enterprises look at the framework from Accredited Standards Committee (ASC) X12 for a reference model for XML design (www.x12.org/x12org/xmldesign/index.cfm) as the best approach available today.
