XML Validation


Date:          25 Jun 1999 16:19:44 -0700
Subject:       XML Validation
From:          jenglish@flightlab.com (Joe English)           
Organization:  Advanced Rotorcraft Technology, Inc.        
Newsgroups:    comp.lang.functional, comp.text.sgml, comp.text.xml

On comp.lang.functional, Ketil Z Malde <ketil@ii.uib.no> wrote:

>
> I think a validating [XML parser] shouldn't be too
> hard either [in Haskell]

To which I replied:
>
>This is true. I'd say that a validator is
>far easier to write in a functional language
>than in C++ or Java.

And just to put my money where my mouth is, I went and wrote one :-)

Well, not exactly. Just the part that matches an element's children against a content model (regular expression). Parsing the DTD, the instance, entity resolution, et cetera, et cetera, is left as an exercise for the reader (but see again HaXml [1], which does all of this and more). Still, not bad for half an hour's work and 80 lines of Haskell. More details at:

URL: http://www.flightlab.com/~joe/sgml/validate.html

This is mostly of interest because it demonstrates an algorithm that doesn't seem to be as well-known as it should be. Every XML validator and RE-matcher I've seen uses finite automata, but there is another (IMO much simpler) technique based on derivatives of regular expressions; the web page above contains a (very brief) description plus a Haskell program fragment demonstrating the algorithm.
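
For readers who have not met the technique, here is a minimal sketch of matching by derivatives. It is not the program from the web page above: the type and function names (Model, nullable, deriv, matches, example) are illustrative assumptions, and the content model covers only sequence, choice, and repetition over element names.

```haskell
-- Content models as regular expressions over element names.
-- All names in this sketch are assumptions made for illustration.
data Model
  = Fail                -- matches no sequence at all
  | Empty               -- matches only the empty sequence
  | Elem String         -- exactly one child element with this name
  | Seq Model Model     -- a , b
  | Alt Model Model     -- a | b
  | Star Model          -- a*
  deriving (Eq, Show)

-- Does the model accept the empty sequence of children?
nullable :: Model -> Bool
nullable Fail      = False
nullable Empty     = True
nullable (Elem _)  = False
nullable (Seq a b) = nullable a && nullable b
nullable (Alt a b) = nullable a || nullable b
nullable (Star _)  = True

-- The derivative of a model with respect to an element name x:
-- the model describing what may still follow once x has been seen.
deriv :: String -> Model -> Model
deriv _ Fail      = Fail
deriv _ Empty     = Fail
deriv x (Elem y)
  | x == y        = Empty
  | otherwise     = Fail
deriv x (Seq a b)
  | nullable a    = Alt (Seq (deriv x a) b) (deriv x b)
  | otherwise     = Seq (deriv x a) b
deriv x (Alt a b) = Alt (deriv x a) (deriv x b)
deriv x (Star a)  = Seq (deriv x a) (Star a)

-- A child sequence is valid if, after taking the derivative for each
-- child in turn, the remaining model accepts the empty sequence.
matches :: Model -> [String] -> Bool
matches m = nullable . foldl (flip deriv) m

-- Hypothetical content model: (title, para*)
example :: Model
example = Seq (Elem "title") (Star (Elem "para"))
```

With these definitions, matches example ["title", "para", "para"] holds, while matches example ["para"] does not. No automaton is built; the model itself is rewritten one child at a time. This sketch does not simplify intermediate models (for instance, rewriting Seq Fail b to Fail), so they can grow on long inputs; a practical version would probably normalise as it goes.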

Followups to comp.text.sgml.

[1] HaXml: URL: http://www.cs.york.ac.uk/fp/HaXml/.


[. . . I've written a (*very* brief) note describing a regular expression matching algorithm which may be of interest to XML parser writers. . .

Also: Mark Hopkins' regex package at the comp.compilers archive includes a much more complete description of a souped-up version of this technique: ftp://iecc.com/pub/file/regex.tar.gz]


--Joe English

jenglish@flightlab.com


Prepared by Robin Cover for The SGML/XML Web Page archive.



Document URL: http://xml.coverpages.org/englishXMLValidation.html