STG's XML 1.0 Reference Validator


Abstract

This report examines why validation, and readily available validation facilities, are critical to the rapid dissemination and success of XML; it also introduces a new, public reference validator intended to help fill this niche.

Table of Contents

A Surprising Fact
What's Wrong With Invalid XML?
Valid XML and the DTD
DTDs and STG's Validator
Using the Validator (aka "Quick Start")
Inside the Validator
Availability

Note: This report was written originally in October of 1998 - at a time when there were no complete, working, web-available XML validators. Since then, in addition to STG's validator, one other web-available validator has appeared (author: Richard Tobin). Others will doubtless follow.

A Surprising Fact

With all the hubbub surrounding XML lately - all the conferences, debates, books, papers, and articles - it is a surprising fact that only a small fraction of the XML available on the net is actually valid; i.e., only a small fraction of it follows the full February 1998 W3C XML 1.0 spec. The reason for this is simple: There isn't much XML software (as yet) to adequately generate and check it. Nor are there any full, working, web-based XML validation services analogous to what we see in the HTML world.

Access to validation services, however, is critical to the success of XML because without it we end up back where we started, i.e., back to the very same chaos that prompted the development of XML in the first place.

In efforts to help reduce the chaos, and make validation facilities more broadly available, Brown University's Scholarly Technology Group (STG) has placed on its website a public reference XML 1.0 validator. This report examines the rationale behind that validator, and offers a brief semi-technical overview of its design.

What's Wrong With Invalid XML?

http://cds.library.brown.edu/service/xmlvalid/Xml.tr98.2.shtml[6/20/14 12:22:39 AM] STG's XML 1.0 Reference Validator

The ubiquity of invalid XML documents (or, more broadly, our inability to detect them easily as such) presents a serious obstacle to the rapid dissemination and success of XML because it perpetuates the same interoperability problems that have hampered the development of XML's cousin, HTML.

As most Web designers and programmers are well aware, nonconformant HTML (i.e., HTML that fails to validate against an IETF or W3C standard) is, in many quarters, more the rule than the exception. Nonconformant HTML, though, often works out in practice because browser manufacturers, in addition to creating their own HTML extensions, have managed to work around most of the mistakes that programmers and authors typically make. But the manufacturers can't anticipate every possible mistake; and neither can every piece of software we use with our HTML. As a result HTML software is something of a free-for-all. Some software works fine with some HTML. Other software breaks on the same material.

The fundamental reason why HTML software has become such a free-for-all is that HTML began its life with no formal specification. Worse yet, when formal specifications finally did begin to appear, they came too slowly to be of much use to Web designers and programmers. As a result, every browser manufacturer felt obligated to define its own version of HTML. Microsoft and Netscape also felt it necessary to hire armies of programmers to figure out what their competitors were doing.

The result has been a dramatic increase in the cost and complexity of HTML processors - and an interoperability nightmare.

Valid XML and the DTD

With XML ("Extensible Markup Language"), the situation is potentially quite different from what we have seen with HTML. With XML we don't have to worry as much about browser manufacturers arbitrarily redefining the specs. Nor do we have to wait for standards bodies to reach consensus. With XML, each of us has the power to take matters into our own hands; to define our own markup language, or to extend an existing one - and to decide what is, and isn't, a valid construct in that language. What is more, we can do all this in a way that conforming XML processors will understand. In other words, we can do it without creating the same interoperability problems that have dogged HTML.


The mechanism through which XML grants us these powers is the document type definition (DTD) - a document that specifies what elements, attributes, and entities an XML document instance may consist of, and in what order and combination. With a DTD (and a stylesheet) users have close to total control over the language and presentation of their documents.

(Although HTML has official DTDs, they are controlled by standards organizations, are rarely used, and often do not reflect actual practice.)

DTDs and STG's Validator

Despite the freedom that XML DTDs can give us, there is, as yet, little software that allows anyone to take advantage of them. Most XML processors available now essentially ignore the DTD. And of those that do full (DTD-aware) validation, only one, as of this writing (Oct 98), is available freely over the Internet (I have not yet managed to get that validator, based in Korea, to work). See Robin Cover's definitive XML testing and validation resource list.

The absence of a full, working, publicly available XML reference validator creates a critical gap, especially now that consortiums have begun popping up everywhere, defining their own XML-based formats, and laying claim to its platform independence and interoperability. Without widely available validation facilities these claims are null because there is no way to verify, or enforce, actual conformance.

Perhaps not surprisingly, even an informal check of actual and proposed XML interchange formats reveals that most do not reflect valid XML 1.0 constructions. Some are so far from the spec that one wonders how anyone could call them XML. Until there is a publicly available reference XML validator people can point to, it will be difficult to stem the tide of this faux XML, and to get down to the business of creating genuinely interoperable formats, and field testing the XML processors that are to operate on them.

It is in efforts to fill this need for an XML reference validator that the Brown University Scholarly Technology Group (STG) has placed on its website a simple form-based XML 1.0 validation system.


Using the Validator

Using STG's XML validator is easy. Just go to the Web form, and either type in a local filename, or paste some actual XML into its text field; then click on the validate button. The validator will then either respond with a "validates OK" message, or else output a list of error and warning messages.

Inside the Validator

The overall design of STG's system is tripartite. It is a familiar design common to many "traditional" web-based interfaces. It consists of:

1. a static HTML form
2. a short (500 line) PERL script
3. a back-end written with stock programming utilities (e.g., YACC and Lex)

The back end (component 3 above) is written specifically for legacy computer systems that lack intrinsic library support for Unicode, and that may even have old-style SGML catalogs around. It validates at a rate of about ten seconds per megabyte on an old dual 125 MHz HyperSparc 20 server, and about four seconds per megabyte on a Pentium Pro 200 desktop. For more information on the back end, see its Unix man page.

The PERL script (component 2 above) is something of a bottleneck, but it uses the now nearly universal CGI interface, and has the advantage of being portable and easy to maintain. The same might be said of the static HTML form (1 above), which provides a simple, effective, maintainable entry point into the system. Obviously it would be nice to have an XML-based entry point, but the software is not yet available to support this.

Availability

The reference validator's back end has just finished a brief round of in-house alpha testing, and the system as a whole is now ready for public access on STG's main website:

http://cds.library.brown.edu/service/xmlvalid/xmlvalid.var

We consider the system to be in beta testing now, and we invite bug reports. (Doubtless there will be more than a few of these.)

The source code for the parser is available at STG's website, as are binaries for a few platforms.

Please direct questions or comments on the system, or on any of the issues surrounding its release, to the STG staff (address below).

STG: [email protected]


XML Validation Form

To validate a small XML document, just paste it into the text field below and hit the validate button. If the document is too large to be conveniently pasted into the text field, enter its filename into the local file field. You may also validate an arbitrary XML document on the Web by typing its URI into the URI field.

For more instructions, see below. See also the FAQ.

The form provides three input fields (Local file, URI, and Text), each accompanied by two checkboxes: Suppress warning messages and Relax namespace checks.

Instructions

http://cds.library.brown.edu/service/xmlvalid/[6/20/14 12:22:33 AM] STG XML Validation Form

This interface offers full XML 1.0 validation facilities. Its only notable deviation from the 1.0 spec comes in its handling of whitespace, which it ignores inside of markup where syntactically irrelevant. Note, though, that this deviation from the spec has nothing to do with the hotly debated issue of whitespace in actual character data (in which respect this validator follows the spec).

To use this interface to validate a small XML document, paste the document in question into the text field above and click on the lower validate button.

To validate a file (e.g., something on your local hard drive), type that file's name into the local file field above. Then hit the upper validate button. If your file isn't encoded as UTF-8, make sure that it has an XML declaration (if it is encoded as ISO-8859-x, make sure it has an encoding declaration as well).
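Before uploading a file that is not UTF-8, it can help to confirm locally that it really carries an XML declaration with an encoding declaration. The following is a minimal sketch, not part of STG's system; the function name `declared_encoding` is made up for illustration, and the check only works for ASCII-compatible encodings such as ISO-8859-x (it will not see a declaration in a UTF-16 file).

```python
import re
from typing import Optional

# Matches an XML declaration at the very start of the file and, if present,
# captures the encoding name. Only meaningful for ASCII-compatible encodings.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*(["\'])[^"\']*\1'
    rb'(?:\s+encoding\s*=\s*(["\'])([A-Za-z][A-Za-z0-9._-]*)\2)?'
)


def declared_encoding(data: bytes) -> Optional[str]:
    """Return the declared encoding name, or None if none is declared."""
    match = _XML_DECL.match(data)
    if match is None or match.group(3) is None:
        return None
    return match.group(3).decode("ascii")


if __name__ == "__main__":
    latin1_doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><memo/>'
    bare_doc = b'<memo/>'
    print(declared_encoding(latin1_doc))  # ISO-8859-1
    print(declared_encoding(bare_doc))    # None
```

A result of None for a file that is known to be ISO-8859-x suggests that the declaration should be added before validating.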

To validate an arbitrary document on the Web, type its URI into the URI field above. Then click on the validate button.

Note that, in order to validate OK, a document's system identifiers must all be resolvable URIs. Dummy URIs and local paths will not resolve. Naturally, all documents must have a DOCTYPE declaration. (On the reasons why, see STG TR 1998:2.) Elements and attributes in namespaces must be declared, unless the relax namespace checks box is checked (which turns off strict validation for undeclared elements and attributes in namespaces).
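The requirement that system identifiers be resolvable URIs can be checked roughly before submitting a document. The sketch below is an illustration only: the helper names are hypothetical, the regular expression handles just the common form of the DOCTYPE declaration, and treating only http: and ftp: URIs with a host name as "resolvable" is an assumption, not a statement of what the validator accepts.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Finds the system identifier in <!DOCTYPE name SYSTEM "..."> or
# <!DOCTYPE name PUBLIC "..." "...">. A simplification of the real grammar.
_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+[^\s>\[]+\s+'
    r'(?:SYSTEM|PUBLIC\s+(?:"[^"]*"|\'[^\']*\'))\s+'
    r'(?:"([^"]*)"|\'([^\']*)\')'
)


def doctype_system_id(text: str) -> Optional[str]:
    """Return the DOCTYPE system identifier, or None if there is none."""
    match = _DOCTYPE.search(text)
    if match is None:
        return None
    return match.group(1) if match.group(1) is not None else match.group(2)


def looks_remotely_resolvable(system_id: str) -> bool:
    """Assumption: only http: and ftp: URIs with a host can be fetched."""
    parts = urlsplit(system_id)
    return parts.scheme in {"http", "ftp"} and bool(parts.netloc)


if __name__ == "__main__":
    doc = ('<?xml version="1.0"?>\n'
           '<!DOCTYPE memo SYSTEM "http://www.example.org/dtds/memo.dtd">\n'
           '<memo/>')
    system_id = doctype_system_id(doc)
    print(system_id, looks_remotely_resolvable(system_id))
    print(looks_remotely_resolvable("memo.dtd"))  # a local path
```

A document whose system identifier is a bare filename such as memo.dtd would need that identifier changed to a public URI before the web form could validate it.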

Validation results are displayed as follows: If errors are found, a list of them is printed out. If any of these errors occurs in the document itself (as opposed to an external file), the document is appended, with links to the relevant error messages. If no errors are found, a "document validates OK" message is displayed, possibly accompanied by a list of warnings. All results are encoded as UTF-8.

For security reasons, if you are validating an arbitrary document on the Web (i.e., if you are validating by URI, and not by direct upload from your local machine), the original document is never fully displayed, even if errors are found.

Notes:

1) This system is only useful for documents that have a line structure conducive to human reading.
2) Some of the error links will not work properly with Internet Explorer version 4.0 and earlier.
3) Browsers may vary widely in their ability to display UTF-encoded results.
4) It can take a while to validate documents containing lots of external entities that must be resolved and fetched over the network.

Richard Goerwitz, STG [email protected]


XML Validator Frequently Asked Questions List

This page lists frequently asked questions regarding STG's XML Validator. If you have a question, check to see if it is answered here before firing off e-mail to STG.

Your Validator is Broken, Why?
How Do I Report a Possible Bug?
Why am I Getting So Many Error Messages?
Why am I Getting So Many Warning Messages?
Why Do I Get Error Messages about Entities I'm Not Even Using?
Can I Run the Validator Locally?
Why Am I Getting Ambiguous Content Model Errors?
Why Can't I Validate XML Files with Local DTDs?

Your Validator is Broken, Why?

Because it's a beta test version. If you run into a problem, we'd actually be very grateful if you'd send us a bug report.

How Do I Report a Possible Bug?

First, check to be sure the "bug" isn't discussed below. If it isn't, create a short XML file (preferably standalone) that illustrates the problem; then mail it to us at STG. We'll get back to you. If you can't illustrate the problem with a single XML file, feel free to send us a .zip or .tar archive.

Why am I Getting So Many Error Messages?

Although it's difficult to answer this question without seeing the actual XML document that is being validated, experience has shown that this question most often arises when someone attempts to validate a document that lacks a document type definition (DTD). Any XML document that lacks a DTD is, by definition, invalid, and may trigger a cascade of error messages.

(Of course, the other typical reason that people get a lot of error messages is that the document being validated has a lot of errors.)

See also the next two FAQs.

Why am I Getting So Many Warning Messages?

http://cds.library.brown.edu/service/xmlvalid/FAQ.shtml[6/20/14 12:22:36 AM] XML Validator FAQ

STG's validator follows the XML 1.0 specification pretty closely, providing a wide assortment of warnings about problems, both potential and actual, that most other validators ignore. Most of these messages have to do with XML - SGML compatibility and interoperability issues.

Here are some sample warning messages, with explanations of what they mean, and why you may (or may not) want to pay attention to them:

built-in entity not redeclared according to the spec
    The entities &lt;, &gt;, &quot;, &apos;, and &amp; are built-in. They are predefined, that is, by the XML parser. If you declare them yourself, you need to be very careful (see the 1.0 spec, section 4.6). Typically it's better not to bother with them, unless you are using a lot of legacy SGML software.

discarding apparent old-style SGML comment
    You're forgetting that this is XML. Comments can't be stuck inside just any markup. You must place them inside special comment delimiters, <!-- and -->.

element has more than one attlist declaration
    For interoperability with SGML software, an XML processor may issue a warning when more than one attlist declaration is provided for a single element type, or more than one attribute definition is provided for a given attribute. Ignore this warning if interoperability with SGML is not a concern. Otherwise, if possible, try to gather your ATTLIST declarations together into a single declaration.

empty-tag syntax used for element not declared with EMPTY content model
    To facilitate interoperability with SGML software, the XML 1.0 specification says that elements using the special XML empty-element syntax (e.g., <br/>) should be declared explicitly as EMPTY in the DTD. If you're not using SGML software, ignore this warning.

value appears in multiple enumerations for attributes of one element
    The same token should not occur more than once in the enumerated attribute types of a single element. In SGML, this was not allowed. So if you want to interoperate with SGML software, make sure you don't do it.

Why Do I Get Error Messages about Entities I'm Not Even Using?

The short answer here is that STG's validator goes a bit above and beyond what the specification actually calls for in the way of validation.

The longer answer follows.

Most validating XML parsers validate XML documents as part of a more general process (e.g., readying them for manipulation and/or display). That is, they aren't there simply to flag errors. STG's parser/validator on the other hand, does little else. Our validator, in other words, has as its primary purpose to flag errors, and to help you locate potential problems in your XML.

As a result, our validator can be far more aggressive than it strictly needs to be. In particular, it can resolve and/or process all entities declared in your DTD. If it finds errors that may pose problems down the road, it will flag them - even if you don't happen to use the entity in question in the document you are validating.

The idea here is to help designers avoid half-baked DTDs that seem to work fine with some documents, but suddenly start producing unexpected errors when used on documents that happen to make use of invalid entities that were lurking unused in the DTD.

Can I Run the Validator Locally?

Yes. But to do so you may need to compile the back-end parser from source yourself and install it. The source code for the parser is available at STG's website.

Note that this software is still in beta testing, and will doubtless contain many bugs. Please let us know if you find one, preferably giving us enough information to reproduce it (e.g., your OS version, parser version, and sample XML input).

Why Am I Getting Ambiguous Content Model Errors?

You are getting ambiguous content model errors because at least one of your content models is nondeterministic (in SGML terms, "ambiguous"). In essence what this means is that the content model(s) in question can match identical XML element sequences in more than one way.

STG's XML validation system aggressively reports such ambiguities not only because the specification says it should (appendix E), but also because XML software strives for simplicity and consistency. If you give XML software an element stream that can be processed in several different ways, it will normally select just one of those ways (probably not even telling you what it's done), and then continue processing. This situation can lead to confusion, especially when you aren't aware that there were any ambiguities in the first place.

The most frequent cause of ambiguous content models is the use of patterns like ((a, b?) | a). Take, for example, the following DTD fragment:

(In the United States, a postal code consists of five digits plus an optional extension.) Although old SGML hands rarely make such mistakes, one often sees XML DTDs containing expressions that, in this instance, would reduce to:

((postalcode, postalcode_extension?) | postalcode)

When an XML processor, having internalized the above content-model fragment, sees an actual document instance containing a basic five-digit postal code

02912

it has no idea whether to process this postal code as an instance of a postal code plus a null extension, or as a complete postal code in and of itself. That is, it has no idea whether to treat it as an instance of (postalcode, postalcode_extension?) or of (postalcode).
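The two readings of the postal code can be made concrete with a small sketch. This is not how STG's validator works internally; it is a toy matcher, with a made-up tuple representation for content models, that simply counts the distinct ways an element sequence can be matched. Counting matches is a simplification of the determinism rule in the specification, but it is enough to show the problem.

```python
from typing import Iterator, List

# A content model is a nested tuple (a representation chosen for this sketch):
#   ("elem", name) | ("opt", model) | ("seq", [models]) | ("alt", [models])


def match_ends(model: tuple, seq: List[str], start: int) -> Iterator[int]:
    """Yield every position at which `model` can stop matching `seq`,
    once for each distinct way of getting there."""
    kind = model[0]
    if kind == "elem":
        if start < len(seq) and seq[start] == model[1]:
            yield start + 1
    elif kind == "opt":
        yield start                      # the optional part is absent
        yield from match_ends(model[1], seq, start)
    elif kind == "seq":
        positions = [start]
        for part in model[1]:
            positions = [end for pos in positions
                         for end in match_ends(part, seq, pos)]
        yield from positions
    elif kind == "alt":
        for branch in model[1]:
            yield from match_ends(branch, seq, start)


def count_matches(model: tuple, seq: List[str]) -> int:
    """Number of distinct ways `model` matches the whole of `seq`."""
    return sum(1 for end in match_ends(model, seq, 0) if end == len(seq))


POSTALCODE = ("elem", "postalcode")
EXTENSION = ("elem", "postalcode_extension")

# ((postalcode, postalcode_extension?) | postalcode)
AMBIGUOUS = ("alt", [("seq", [POSTALCODE, ("opt", EXTENSION)]), POSTALCODE])
# (postalcode, postalcode_extension?)
UNAMBIGUOUS = ("seq", [POSTALCODE, ("opt", EXTENSION)])

if __name__ == "__main__":
    print(count_matches(AMBIGUOUS, ["postalcode"]))    # 2
    print(count_matches(UNAMBIGUOUS, ["postalcode"]))  # 1
```

The second model accepts exactly the same documents as the first, which is why dropping the redundant alternative is usually the right repair.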

If you find you are getting ambiguous content model errors, check for situations like the above, where the same XML text could match your content model in multiple ways.

If you aren't concerned about such problems, feel free to turn off warning messages altogether using the checkbox provided on the main validation form.

Why Can't I Validate XML Files with Local DTDs?

The reason why STG's validator cannot validate XML document instances against local DTDs (e.g., DTDs on your local hard drive) is that it must be able to resolve and fetch over the network any external entities it needs in order to process your document. For it to resolve and fetch arbitrary local files on people's hard drives, everyone would need to offer our validation system access to their local filesystems.

Needless to say, this sort of access (if a reasonable way could be found to offer it) would present an unacceptable security risk.

If you want our validator to be able to find your DTDs, therefore, you must place them in a public directory on a webserver you have access to, and change your system identifiers to point to the relevant URIs.

If you have no access to a webserver, or if you are working on private DTDs and files, see above on compiling the parser locally.

Xmlparse Unix Manual Page

NAME

xmlparse - a validating XML parser

SYNOPSIS

xmlparse [-c filename] [-C filenames] [-d directory] [-E max errors] [-f] [-h] [-l level] [-m filename] [-n] [-s] [-v]
xmlparse [-h]

DESCRIPTION

Xmlparse is a full validating XML parser for use as a back-end to Web-based XML validation systems, or as a general-purpose XML validation tool. It is particularly well-suited to legacy SGML documents that are in the process of being converted, along with their associated DTDs, to XML. Xmlparse knows the difference between SGML and XML, and can often elucidate mistakes that stem from SGML/XML incompatibilities (e.g., it reminds users that SDATA entities don't exist in XML; it warns users about nondeterministic content models, which are illegal in SGML; it also flags general problems like declared but not used, and used but not declared, elements in DTDs).

OPTIONS

Xmlparse may be invoked with several command-line options that tell it where to send error output, and where to look for catalog, message, and other auxiliary files. Normally environment variables and/or compile-time defaults should provide reasonable fallbacks for all of these command-line run-time options.

-c filename
    Use filename as the configuration file. See also option -d below. Do not leave this file world-writable.

-C filenames
    Use filenames as the SGML catalog files (if more than one is given, separate them with a colon). Note that if -C is not supplied on the command line, the value of the SGML_CATALOG_FILES environment variable is used instead.

-d directory
    Use directory as the default location for data, library, and configuration files.

http://cds.library.brown.edu/service/xmlvalid/xmlparse.man.html[6/20/14 12:28:40 AM] Xmlparse Unix Manual Page

-E max errors
    Print no more than max errors errors and/or warnings for every file parsed.

-f
    Force undefined attributes and element names in namespaces to validate OK.

-h
    Print a brief help message, then exit. See also -v below.

-l level
    Set debugging level to level (must be an integer from 0 to 7; higher = more information). Debugging messages go to syslog(3) (facility DAEMON, priority DEBUG). Cf. system messages, which go to syslog only when specified (see -s below). This switch only works if the system administrator left debugging enabled at compile time.

-m filename
    Use filename as the message file name. This file contains all error, warning, and parsing messages emitted by xmlparse at run-time. Do not leave this file world-writable.

-n
    Resolve only remote http:, urn:, and ftp: system ids (be certain to use this option if you are running xmlparse as a back end to a web-based validator). Note that, even with the -n option, xmlparse will still resolve local files if supplied on the command-line. It will not, however, resolve URIs given on the command-line unless they begin with http:, urn:, or ftp:.

-s
    Output system error and warning messages to syslog(3) (facility DAEMON, priority ERR or WARNING). These error messages cover things like malformed SGML catalog files, missing system files, and so on. Debugging messages (see -l above) always go to syslog (DAEMON, DEBUG). Parsing errors always go to stderr.

-v
    Print version number, then exit. See also -h above.

CONFIGURATION FILE

Run-time settings may be supplied, not only through command-line options, but also through a system-wide configuration file (usually installed as /usr/local/lib/xmlparse/xmlparse.cfg). Where they coincide, directives supplied in the configuration file override command-line options and compile-time defaults.

Normally the configuration file is used only to set the external FPI and/or URI resolution commands (used by xmlparse to resolve PUBLIC and SYSTEM identifiers). It may also be used, however, to override the command-line options -C, -E, -l, -m, -n, and -s. All configuration file directives are fully documented in the sample configuration file, xmlparse.cfg, included with the base xmlparse source distribution.
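The precedence described here can be pictured with a short sketch for the catalog-file setting. This is an illustration of the stated ordering, not xmlparse's code: the function and parameter names are hypothetical, the example paths are invented, and ranking the SGML_CATALOG_FILES environment variable above the compile-time default is an assumption based on the OPTIONS section.

```python
from typing import List, Mapping, Optional


def resolve_catalog_files(config_value: Optional[str],
                          cli_value: Optional[str],
                          environ: Mapping[str, str],
                          compiled_default: Optional[str]) -> List[str]:
    """Pick the catalog list from the highest-priority source that is set.

    Order used: configuration file, then the -C option, then the
    SGML_CATALOG_FILES environment variable, then the compile-time default.
    """
    candidates = (
        config_value,
        cli_value,
        environ.get("SGML_CATALOG_FILES"),
        compiled_default,
    )
    chosen = next((c for c in candidates if c is not None), "")
    # Multiple catalog files are separated with a colon.
    return [path for path in chosen.split(":") if path]


if __name__ == "__main__":
    env = {"SGML_CATALOG_FILES": "/a/catalog:/b/catalog"}
    print(resolve_catalog_files(None, None, env, "/default/catalog"))
    print(resolve_catalog_files(None, "/cli/catalog", env, "/default/catalog"))
    print(resolve_catalog_files("/cfg/catalog", "/cli/catalog", env, None))
```

The last call shows the point made above: a value in the configuration file wins even when -C is given on the command line.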

DIAGNOSTICS

If no validation errors are detected, xmlparse exits with status 0. Warnings may be issued to stderr. If actual errors are detected, xmlparse exits with status 4, and emits a list of parsing errors/warnings to stderr. Fatal system errors resulting in early program termination produce other non-zero terminations.

Xmlparse may emit various diagnostic messages at run-time about missing files or arguments. By default, these go to stderr. They may, however, be redirected to syslog(3) through the -s command-line switch (on which, see above).

Xmlparse is aggressive in reporting ambiguous content models, elements that are declared but not used in any content model, unresolvable public and system identifiers, and so on.

Xmlparse also issues warning messages that encourage DTD writers to declare things before using them. For example, it reports cases where ATTLIST declarations name as-yet undeclared elements; it also flags unparsed entity declarations that point to as-yet undeclared NOTATIONs.
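A calling script can rely on these exit statuses. The sketch below shows one way to wrap xmlparse from Python; the helper names are hypothetical, and passing the XML file as a trailing positional argument is an assumption drawn from the remark under -n that local files may be supplied on the command line. Only the documented -E and -n options are used.

```python
import subprocess
from typing import List, Tuple


def describe_exit_status(status: int) -> str:
    """Map an xmlparse exit status to a short label."""
    if status == 0:
        return "valid"          # no validation errors (warnings possible)
    if status == 4:
        return "invalid"        # validation errors were reported
    return "system error"       # any other non-zero status


def build_command(xml_path: str, max_errors: int = 50,
                  remote_only: bool = True) -> List[str]:
    """Assemble the argument list. Assumes the file goes last."""
    command = ["xmlparse", "-E", str(max_errors)]
    if remote_only:
        command.append("-n")    # resolve only http:, urn:, and ftp: ids
    command.append(xml_path)
    return command


def validate(xml_path: str) -> Tuple[str, str]:
    """Run xmlparse and return (label, messages from stderr)."""
    completed = subprocess.run(build_command(xml_path),
                               capture_output=True, encoding="utf-8",
                               check=False)
    return describe_exit_status(completed.returncode), completed.stderr


if __name__ == "__main__":
    print(build_command("memo.xml", max_errors=10))
    print(describe_exit_status(4))
```

Reading stderr as UTF-8 matches the statement elsewhere on this page that all messages xmlparse emits are converted to UTF-8.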

CONFORMANCE

Xmlparse implements the published (February 1998) XML 1.0 standard. It will also check namespaces (see, however, the -f option above).

Xmlparse deviates from the 1.0 spec in one notable way: That it ignores syntactically meaningless whitespace inside of declarations and markup. The rationale here is that this practice not only follows SGML (e.g., Handbook, 65 [371:16]), but also simplifies processing - and renders XML more easily manageable using programming tools like flex(1). Note that this deviation from the spec has nothing to do with the hotly debated issue of whitespace in actual character data (which the validator maintains internally, as per the spec).

Xmlparse also deviates from the strict 1.0 standard in its early reporting of malformed entity replacement text (if an entity's replacement text would be malformed, xmlparse flags it, whether or not you actually use the entity). The rationale here is that early reporting of malformed entity replacement text prevents users from declaring entities that are at best useless, and at worst harmful in that they trigger DTD-based errors in documents whose DTDs were thought to be correct.

Xmlparse does not prohibit '<' in attribute values. The rationale in this instance is that excluding '<' actually complicates processing for validating parsers. Also, with all its intricate entity replacement rules and constraints, XML is already such a pain to process that this so-called DPH restriction is just plain silly.

A final area in which xmlparse deviates from the XML 1.0 spec is that it ignores the encoding types specified by external transfer protocols, such as HTTP. Experience reveals that these protocols very often provide incorrect encoding information (e.g., UTF-8 usually gets sent as ISO-8859-1 or plain-text ASCII). As a practical necessity, therefore, xmlparse relies for encoding information on its own internal charset detection facilities and on the encoding declaration, if the text provides one.
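To give a feel for what charset detection from the document itself can look like, here is a minimal sketch in the spirit of appendix F of the XML 1.0 specification, which describes autodetection from the first few bytes. It is not xmlparse's actual detection code; the function name and the returned labels are made up for illustration.

```python
def sniff_encoding_family(data: bytes) -> str:
    """Guess the encoding family of an XML document from its first bytes."""
    head = data[:4]
    # Byte order marks. Four-byte marks are tested before two-byte ones
    # because the UCS-4 little-endian mark starts with the UTF-16 one.
    if head == b"\x00\x00\xfe\xff":
        return "UCS-4 big-endian"
    if head == b"\xff\xfe\x00\x00":
        return "UCS-4 little-endian"
    if head[:3] == b"\xef\xbb\xbf":
        return "UTF-8"
    if head[:2] == b"\xfe\xff":
        return "UTF-16 big-endian"
    if head[:2] == b"\xff\xfe":
        return "UTF-16 little-endian"
    # No byte order mark: look at how the opening "<" or "<?" is laid out.
    if head == b"\x00\x00\x00<":
        return "UCS-4 big-endian"
    if head == b"<\x00\x00\x00":
        return "UCS-4 little-endian"
    if head == b"\x00<\x00?":
        return "UTF-16 big-endian"
    if head == b"<\x00?\x00":
        return "UTF-16 little-endian"
    if head == b"<?xm":
        # ASCII-compatible: the encoding declaration decides (e.g. ISO-8859-x).
        return "ASCII-compatible, see encoding declaration"
    return "UTF-8 by default"


if __name__ == "__main__":
    print(sniff_encoding_family(b'<?xml version="1.0" encoding="ISO-8859-1"?>'))
    print(sniff_encoding_family('<?xml version="1.0"?>'.encode("utf-16-be")))
    print(sniff_encoding_family(b"<memo/>"))
```

Note that nothing in this approach consults an HTTP header, which is the deviation described above.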

INSTALLATION

To set up Xmlparse, follow the instructions in the INSTALL file that came with the source distribution. These instructions cover source code configuration and building, as well as the actual installing.

Xmlparse has been coded specifically for platforms that still lack support for UCS-2/4, UTF-16, and Unicode (i.e., nearly all stock Unix systems). It can also make limited use of legacy SGML catalog files (basically it ignores comments and lines that don't start with PUBLIC).


Xmlparse compiles using stock GNU tools available for nearly all POSIX systems (e.g., (G)CC, Bison, and Flex [patched for Unicode support]).

SEE ALSO

nsgmls(1)

LIMITATIONS, BUGS

Xmlparse is an ugly, inelegant piece of software built to run on legacy POSIX systems with C libraries and compilers that don't understand Unicode (i.e., nearly all Unix systems out there today).

Xmlparse assumes that all auxiliary files, other than the XML source files and DTDs, are encoded using straight ASCII or UTF-8. This includes the message catalog, the system-wide configuration file, and any SGML catalogs used.

Xmlparse will parse XML source files and DTDs that use UTF-8, UTF-16, UCS-2/4 (big or little-endian), or any of the ISO-8859 standards, although all messages it emits are converted to UTF-8. Naturally, documents that don't use UTF-8 should provide an encoding declaration, since xmlparse will otherwise assume the default, UTF-8 (as per the spec). Documents using ISO 8859-x should include an encoding declaration as well.

Xmlparse handles memory inefficiently. This inefficiency is compounded by its internal use of the wchar_t data type (if available) for character and string operations.

Xmlparse also emits geekly line-numbered error messages that XML/SGML neophytes may find inscrutable. These messages are kept in a simple sprintf catalog that hard codes argument orderings, and will therefore be a pain to port to some language environments.
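The porting problem with hard-coded argument orderings can be seen in a few lines. The message texts below are invented for illustration and are not taken from xmlparse's catalog; the sketch only contrasts positional sprintf-style templates with named placeholders.

```python
# Hypothetical catalog entries, not xmlparse's real messages.
POSITIONAL = "line %d: element %s not declared"
NAMED = "line {line}: element {name} not declared"
# A translation that needs the element name first:
NAMED_REORDERED = "element {name} not declared (line {line})"
POSITIONAL_REORDERED = "element %s not declared (line %d)"


def render_positional(template: str, line: int, name: str) -> str:
    """The caller fixes the argument order: line first, then name."""
    return template % (line, name)


def render_named(template: str, line: int, name: str) -> str:
    """Named placeholders let each catalog entry choose its own order."""
    return template.format(line=line, name=name)


if __name__ == "__main__":
    print(render_positional(POSITIONAL, 12, "memo"))
    print(render_named(NAMED_REORDERED, 12, "memo"))
    try:
        # The reordered positional template receives (12, "memo") and
        # tries to format "memo" with %d, which fails.
        render_positional(POSITIONAL_REORDERED, 12, "memo")
    except TypeError as error:
        print("cannot reorder positional arguments:", error)
```

With positional templates, a language whose word order differs forces a change in the calling code rather than in the catalog alone.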

AUTHOR

Xmlparse was written by Richard Goerwitz for the Brown University Scholarly Technology Group. Send bug reports to .

COPYRIGHT

Copyright 1998 by Richard Goerwitz and Brown University

Xmlparse is free software. Use it if you like (with appropriate acknowledgments) and modify it to suit your needs. But don't blame us if it doesn't do what you want or expect. Make sure to check the COPYRIGHT file that came with the xmlparse source distribution for a full statement of copyright and usage conditions.


XML Validation Form

To validate a short XML document, simply paste it into the text field and click on the lower validate button.

If the document is too large to be conveniently pasted into the text field, enter its filename into the "Local File" field. You can also validate an arbitrary XML document on the Web by entering its URI into the "URI" field below.

For further guidance, see the instructions below.

The form provides three input fields (Local File, URI, and Text), each accompanied by two checkboxes: Suppress warning messages and Relax namespace checks.

http://cds.library.brown.edu/service/xmlvalid/xmlvalid.iso.de.shtml[6/20/14 12:43:03 AM] STG XML Validationsformular

Instructions

This interface offers complete XML validation according to XML 1.0. The only significant deviation from the 1.0 standard is in the handling of whitespace, which is ignored inside markup wherever it carries no syntactic significance. Note, however, that this deviation has nothing to do with the much-debated question of whitespace in character data proper (element content). In that respect the validator follows the standard.

To validate a short XML document, simply paste it into the text field above and click the lower validate button.

To validate a file (i.e., one on your local hard disk), enter the name of the file in the "Local File" field above. Then click the upper validate button. If your file is not encoded as UTF-8, make sure that it has an XML declaration. (If the file is encoded as ISO-8859-x, it must also have an encoding declaration.)
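The encoding requirement above can be checked before a file is submitted. The following is a minimal sketch, not part of the validator: the function name `declared_encoding` and the simplified pattern are assumptions made for illustration, and the pattern only handles ASCII-compatible encodings such as UTF-8 and ISO-8859-x.

```python
import re

# Simplified pattern for an XML declaration at the very start of a document.
# It captures the optional encoding name. Assumption: the file uses an
# ASCII-compatible encoding, so the declaration itself is readable as bytes.
XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
)

def declared_encoding(data: bytes):
    """Return (has_xml_declaration, encoding_name_or_None) for raw file bytes."""
    # Skip a UTF-8 byte order mark if one is present.
    if data.startswith(b'\xef\xbb\xbf'):
        data = data[3:]
    match = XML_DECL.match(data)
    if match is None:
        return (False, None)
    name = match.group(1)
    return (True, name.decode('ascii') if name else None)
```

A file encoded as ISO-8859-1 would be expected to yield `(True, "ISO-8859-1")`; a result of `(False, None)` or `(True, None)` for such a file suggests that the declaration should be added before validating.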

To validate any document on the Web, enter its URI in the URI field. Then click the validate button.

Note that all of a document's system identifiers must be resolvable URIs for the document to be validated. Placeholder URIs and local paths will not be resolved. Naturally, every document must have a DOCTYPE declaration. (For the reason why, see STG TR 1998:2.) Elements and attributes in namespaces must be declared, unless the "Relax namespace checks" box is checked (this turns off the strict validation that looks for undeclared elements and attributes in namespaces).
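The two preconditions above (a DOCTYPE declaration is present, and its system identifier is a resolvable URI) can be sketched as a local pre-check. This is an illustration only: `doctype_info` and `is_resolvable_uri` are hypothetical helpers, and the list of accepted URI schemes is an assumption, since the validator's actual resolution rules are not stated here. The check uses Python's standard `xml.parsers.expat` module and does not fetch the DTD.

```python
import xml.parsers.expat
from urllib.parse import urlparse

def doctype_info(xml_text: str):
    """Return (name, system_id) from the DOCTYPE declaration, or None if absent."""
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called by expat when it reaches the DOCTYPE declaration.
        found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)  # raises ExpatError if not well-formed
    return found[0] if found else None

def is_resolvable_uri(system_id):
    """Heuristic (assumed): accept only absolute http, https, or ftp URIs."""
    if not system_id:
        return False
    parts = urlparse(system_id)
    return parts.scheme in ("http", "https", "ftp") and bool(parts.netloc)
```

For example, a document beginning `<!DOCTYPE note SYSTEM "note.dtd">` has a DOCTYPE declaration, but its system identifier is a local path and would fail the second check.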

Validation results are displayed as follows: If errors are found, a list of errors is printed. If those errors occur in the document itself (as opposed to an external file), the document is appended, with links to the relevant error messages. If no errors are found, the message "document validates OK" is displayed, possibly with a list of warnings. All results are encoded as UTF-8 and appear in English.

If you validate an arbitrary document on the Web (i.e., by giving its URI rather than uploading it directly from your local machine), the original document is, for security reasons, never displayed in full, even if errors are found.

Notes:

1. This system is useful only for documents whose line structure is suited to human reading.
2. Some error links will not work correctly with Internet Explorer (version 4.0 and earlier).
3. Not all browsers will display UTF-8 encoded results consistently.
4. Validating documents that contain many external entities can take a very long time, since these must be fetched over the network and resolved.

STG: [email protected]

STG Form for XML Validation

XML Validation Form

To check the validity of a short XML document, copy and paste it into the Text field below and select the 'validate' button. If the document is too large to be pasted into the Text field, enter the name of your file in the Local file field. You can also validate any XML document on the Web by typing its URI (Uniform Resource Identifier) into the field labeled URI.

For more information, read the instructions below. See also the FAQ section.

Local file: (options: Suppress warnings, Relax namespace checks)

URI: (options: Suppress warnings, Relax namespace checks)

Text: (options: Suppress warnings, Relax namespace checks)

http://cds.library.brown.edu/service/xmlvalid/xmlvalid.iso.it.shtml[6/20/14 12:43:34 AM]

Instructions

This interface offers a complete validity-checking service for XML 1.0. The only difference from the 1.0 specification is that this system ignores whitespace inside markup where it is syntactically insignificant. This has nothing to do with the hotly debated issue of whitespace that forms an integral part of character data; in that case this validating processor follows the specification.

To check the validity of a small XML document, paste the document into the Text field above and then press the 'validate' button.

To check the validity of an entire file (for example, one residing on your hard disk), type the name of the file into the Local file field. Then press the 'validate' button. If your file is not encoded as UTF-8, make sure that it has an XML declaration (if it is encoded as ISO-8859-X, make sure that it also has an encoding declaration).

To check the validity of an arbitrary document on the Web, type its URI into the URI field. Then press the 'validate' button. Note that for validation to succeed, all of the document's system identifiers must be resolvable URIs. Local paths and placeholder URIs will not work. Naturally, every document must have a DOCTYPE declaration. (On this point, see STG TR 1998:2.) Elements and attributes in namespaces must be declared, unless the relax namespace checks option has been selected (this option turns off validation of undeclared elements and attributes in namespaces).

Validation results are reported as follows: If errors are found, a list of errors is shown. If those errors occur in the document itself (and not in an external file), the document, with links to the errors, is appended to the validation message. If the document has no errors, the message 'document validates ok' appears, possibly accompanied by a list of warnings. All results are presented in UTF-8 encoding and in English.

For security reasons, if you are checking the validity of a document taken from the Web (i.e., validating by URI rather than by a direct upload from your own computer), the original document is never shown in full, even when there are errors.

Notes:

1. The system is useful only for documents whose line structure is readable by the human eye.
2. Some of the links in the error messages will not work well with Internet Explorer 4.0 and earlier.
3. Depending on the browser you use, results in UTF-8 encoding may be displayed differently.
4. Validation of documents with many external entities can take a long time, since those entities must be fetched over the network and resolved.


STG: [email protected]
