Sun Multi-Schema XML Validator

Preview Version 1
July, 2001

Kohsuke KAWAGUCHI

Copyright © 2001
Sun Microsystems, Inc.
All Rights Reserved

Command Line Usage Guide

The Sun Multi-Schema Validator (MSV) is a Java™ technology tool that validates XML documents against a variety of XML schemata. It supports RELAX Namespace, RELAX Core, RELAX NG, TREX, XML DTDs, and a subset of W3C XML Schema Part 1.

Usage

To validate XML documents with a RELAX grammar/module, enter a command of the following form:


$ java -jar msv.jar MySchema.rxg MyDocument1.xml MyDocument2.xml ...

If you'd like to use TREX, enter a command of the form:


$ java -jar msv.jar MySchema.trex MyDocument1.xml MyDocument2.xml ...

Or if you'd like to use W3C XML Schema, enter the command:


$ java -jar msv.jar MySchema.xsd MyDocument1.xml MyDocument2.xml ...

MSV will detect the schema language regardless of its file extension.

However, you have to add the -dtd switch to use a DTD.


$ java -jar msv.jar -dtd my.dtd MyDocument1.xml MyDocument2.xml ...

Some environments support wild cards for filenames, while others (e.g., "jview" VM from Microsoft) don't. You can also use a URL instead of a filename.
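Where the environment does not expand wild cards, a small wrapper script can do the expansion before calling MSV. The following is a minimal sketch, not part of MSV itself: the function name build_msv_command and its parameters are hypothetical, and the example.org URL in the usage note is a placeholder. It only uses the command forms shown above (the schema argument, the -dtd switch, and the document list). Arguments that match no local file, such as URLs, are passed through unchanged.

```python
import glob


def build_msv_command(schema, patterns, jar="msv.jar", use_dtd=False):
    """Build the argument list for one MSV run (hypothetical helper)."""
    documents = []
    for pattern in patterns:
        # Expand wild cards ourselves; sort so the order is repeatable.
        matches = sorted(glob.glob(pattern))
        # Nothing matched: keep the argument as written (it may be a URL).
        documents.extend(matches or [pattern])

    command = ["java", "-jar", jar]
    if use_dtd:
        # A DTD has to be announced with the -dtd switch.
        command.append("-dtd")
    command.append(schema)
    return command + documents
```

The resulting list can be handed to subprocess.run, for example subprocess.run(build_msv_command("my.dtd", ["http://example.org/doc.xml"], use_dtd=True)). Passing a list rather than a single string avoids any further shell expansion.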

If the schema contains errors, you will see messages like this:


start parsing a grammar.
the following hedgeRules form an infinite recursion ( y > z > y )
  18:23@file:///c:/work/relax/relax024.e.rlx
  14:23@file:///c:/work/relax/relax024.e.rlx
failed to load a grammar.

18:23@XYZ indicates that the error is located at line 18, column 23 of file XYZ.
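If you post-process these messages in a script, the location notation can be split mechanically. The sketch below assumes only the LINE:COLUMN@FILE shape described above; the names Location and parse_location are illustrative and not part of MSV.

```python
from typing import NamedTuple


class Location(NamedTuple):
    line: int
    column: int
    source: str


def parse_location(text: str) -> Location:
    """Split a 'LINE:COLUMN@SOURCE' string into its three parts."""
    # Split on the first '@' only; the source part may itself be a URL.
    position, source = text.strip().split("@", 1)
    line, column = position.split(":")
    return Location(int(line), int(column), source)
```

For the first location in the sample above, parse_location returns line 18, column 23, and the file URL as the source.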

If the document and the schema are both valid, you will see messages like this:


start parsing a grammar.
validating c:\work\relax\relax001.v00.xml
the document is valid.

If the schema is valid but the document has errors, you will see messages like this:


start parsing a grammar.
validating c:\work\relax\relax001.n02.xml
Error at line:5, column:5 of file:///c:/work/relax/relax001.n02.xml
  tag name "q" is not allowed. possible tag names are:

the document is NOT valid.
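The three kinds of output shown above can be told apart by their closing line. The following sketch classifies captured output and collects error positions. The message strings and the "Error at line:N, column:N of FILE" pattern are inferred from the samples in this guide, not from a documented format, so they may differ in other versions. The function name summarize and the status labels are hypothetical.

```python
import re

# Pattern inferred from the sample output above; not a documented format.
ERROR_HEADER = re.compile(r"^Error at line:(\d+), column:(\d+) of (.+)$")


def summarize(output: str):
    """Return (status, errors) for the captured output of one run."""
    errors = []
    status = "unknown"
    for raw in output.splitlines():
        line = raw.strip()
        match = ERROR_HEADER.match(line)
        if match:
            # Record (line, column, file) for each reported error.
            errors.append((int(match.group(1)), int(match.group(2)), match.group(3)))
        elif line == "failed to load a grammar.":
            status = "schema-error"
        elif line == "the document is NOT valid.":
            status = "invalid"
        elif line == "the document is valid." and status == "unknown":
            # A later "NOT valid" verdict still overrides this one.
            status = "valid"
    return status, errors
```

When several documents are validated in one run, this sketch reports "invalid" if any of them is rejected.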

Please note that line/column information is sometimes inaccurate, and that a tab character is counted as one character, not 4 or 8. You may need to look around the reported position to find the actual error.
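Because a tab counts as a single character, the reported column can differ from the column an editor displays. The sketch below converts one into the other. It assumes that reported columns are 1-based, which this guide does not state, and that the editor aligns tabs to fixed stops. The function name visual_column is illustrative.

```python
def visual_column(line_text: str, reported_column: int, tab_width: int = 8) -> int:
    """Convert a 1-based character column into a 1-based on-screen column."""
    # Characters that precede the reported position (assumes 1-based columns).
    prefix = line_text[: reported_column - 1]
    # expandtabs() pads each tab up to the next multiple of tab_width.
    return len(prefix.expandtabs(tab_width)) + 1
```

For a line indented with two tabs, a reported column of 3 corresponds to on-screen column 17 with 8-column tabs, or column 9 with 4-column tabs.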

Options

The command line tool has several other options. You can see the list of available options by invoking MSV without any arguments.


$ java -jar msv.jar

-warning

Display all warning messages. Warnings are useful to catch potential errors.

-loose

This switch prevents MSV from resolving external resources (such as an external DTD referenced from documents, or external entities).


Known Limitations

DTD

  1. Attribute declarations of the form xmlns:*** are ignored. All namespace declarations within the instance document are also ignored.

  2. In general, MSV handles XML namespaces differently from most DTD validators. As a result, MSV may validate documents that other DTD validators reject. When this is the case, MSV always issues a warning.

  3. Strictly speaking, MSV cannot replace an XML parser's DTD validation, because DTD validation affects a lot of other things. For example, DTD validation in an XML parser can expand entity references and provide default attribute values; DTD validation by MSV does neither.

TREX

This implementation fully conforms to the current specification (2001-02-13 version).

RELAX NG

This implementation attempts to fully implement the spec (August 11, 2001 version). W3C XML Schema Part 2 is the only datatype vocabulary supported. The ID, IDREF, and IDREFS types implement the cross-reference semantics.

RELAX Core

This implementation fully conforms to the current JIS TR specification (English, Japanese).

RELAX Namespace

<anyOtherAttribute> is not implemented.

W3C XML Schema Part 1

  1. "Schema component constraints" and "Schema representation constraints" are not fully enforced. In other words, MSV accepts schemata that are rejected by other conforming processors.

    Unimplemented checks include (but are not limited to) the "UPA constraint" and "Particle Valid Restriction".

  2. "Missing sub-components" are treated as immediate errors. This behavior better serves the majority of users.

  3. Default values are ignored. In fact, no infoset contribution is supported: MSV validates documents, but it doesn't augment them. Post Schema-Validation Infoset (PSVI) is also not supported.

Further Reading

To learn RELAX, "How to RELAX" is a good place to start. To learn TREX, there is the "TREX tutorial". RELAX NG also has its own tutorial. For W3C XML Schema Part 2, see section 2.3 of XML Schema Part 0: Primer.