The World-Wide Web Consortium (W3C) promotes XML and related standards, including XML Schema, XQuery, and XPath. This paper describes a formalization of XML Schema. A formal semantics based on these ideas is part of the official XQuery and XPath specification, one of the first uses of formal methods by a standards body. XML Schema features both named and structural types, with structure based on tree grammars. While structural types and matching have been studied in other work (notably XDuce, Relax NG, and previous formalizations of XML Schema), this is the first work to study the relation between named types and structural types, and the relation between matching and validation.
Three decades past, the relational empire conquered the hierarchical hegemony. Today, an upstart challenges the relational empire's dominance, threatening the return of hierarchy. XML is Lisp's bastard nephew, with uglier syntax and no semantics. Yet XML is poised to enable the creation of a Web of data that dwarfs anything since the Library at Alexandria. This talk examines the design of XQuery, the W3C standard query language for XML, and related standards such as XML Schema.
MSL (Model Schema Language) is an attempt to formalize some of the core ideas in XML Schema. The benefit of a formal description is that it is both concise and precise. MSL has already proved helpful in work on the design of XML Query. We expect that similar techniques can be used to extend MSL to include most or all of XML Schema.
The slides from the presentation in Hong Kong.
An updated version of the paper An Algebra for XML Query.
An updated version of the manuscript An Algebra for XML Query.
This note presents a possible algebra for an XML query language, submitted as an input to the XML Query working group.
Please try out our implementation of the algebra!
There is also a November 1999 draft of the algebra, which is rather different.
XML (eXtensible Markup Language) is a magnet for hype: the successor to HTML for Web publishing, electronic data interchange, and e-commerce. In fact, XML is little more than a notation for trees and for tree grammars, a verbose variant of Lisp S-expressions coupled with a poor man's BNF (Backus-Naur form). Yet this simple basis has spawned scores of specialized sublanguages: for airlines, banks, and cell phones; for astronomy, biology, and chemistry; for the DOD and the IRS. Domain-specific languages indeed! There is much for the language designer to contribute here. In particular, as all this is based on a sort of S-expression, is there a role for a sort of Lisp?
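To make the comparison with S-expressions concrete, here is a minimal sketch, not taken from the talk, that renders an XML document as a Lisp-style S-expression using Python's standard `xml.etree.ElementTree` module. The function name `to_sexpr`, the `(@name "value")` convention for attributes, and the sample flight document are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def to_sexpr(elem):
    """Render an element tree as a Lisp-style S-expression string.

    Assumed conventions (illustrative only): the tag name heads the list,
    attributes become (@name "value") pairs, and non-blank text becomes a
    quoted string. Quotes inside text are not escaped in this sketch.
    """
    parts = [elem.tag]
    for name, value in elem.attrib.items():
        parts.append('(@%s "%s")' % (name, value))
    if elem.text and elem.text.strip():
        parts.append('"%s"' % elem.text.strip())
    for child in elem:
        parts.append(to_sexpr(child))
        # Text that follows a child element belongs to the parent's content.
        if child.tail and child.tail.strip():
            parts.append('"%s"' % child.tail.strip())
    return "(" + " ".join(parts) + ")"


# Hypothetical sample document, invented for this example.
doc = ET.fromstring(
    '<flight carrier="XY"><from>HKG</from><to>EDI</to></flight>'
)
print(to_sexpr(doc))
# prints: (flight (@carrier "XY") (from "HKG") (to "EDI"))
```

The S-expression carries the same tree with fewer delimiters, since closing tags collapse into parentheses. The sketch ignores namespaces, comments, and processing instructions.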
A second version, with more on XSLT and XDuce, but less on syntax. Invited talk, DIMACS workshop on Data Processing on the Web: A Look into the Future, 6-7 March 2000.
This paper identifies essential features of an XML query language by examining four existing query languages: XML-QL, YATL, Lorel, and XQL. The first three languages come from the database community and possess striking similarities. The fourth comes from the document community and lacks some key functionality of the other three.
This note presents a formal semantics of the pattern language from the 16 December 1998 draft of XSLT. The semantics is clear and concise, summarizing in one page of formulas what required about ten pages of prose to describe. With the aid of the semantics one can rigorously state and prove properties of the language; these properties helped to guide future development of the XSLT design. The semantics was developed using standard techniques from the programming language community, and this article provides a tutorial introduction to these techniques. While little here will be new to the language theorist, some of what is here may be of use to the markup technologist.