[April 13, 2000] Don Park recently announced the availability of a Minimal XML specification and a Java Minimal XML parser, together with performance test utilities. In short, Minimal XML is XML without attributes, CDATA sections, comments, document type declarations, empty-element tags, entity references, mixed content, predefined entities, processing instructions, a prolog, or an XML declaration. This work is part of the SML-DEV project.
SML-DEV is "a group of over 75 XML experts working to create simple XML standards and to simplify existing XML standards. Currently active SML-DEV projects are: Minimal XML (aka SML), 'a minimal subset of XML for data-centric XML applications' and Common XML, 'common usage guidelines for XML'."
Minimal XML is "a subset of XML 1.0, including features essential for data interchange applications, and excluding non-essential features that are arcane, legacy-related, problematic for data interchange applications, or redundant. [The goals are to provide:] (1) A subset that allows easily implemented parsers that are much faster and smaller than full XML parsers. (2) A subset with simpler information model that can easily be mapped to other information models. (3) A subset that is much easier to learn, teach, and use. Minimal XML documents must be encoded in either UTF-8 or UTF-16. Minimal XML parsers must support both UTF-8 and UTF-16 character encoding formats."
Don writes: "Min, Java Minimal XML parser, is released. This version will parse Minimal XML (aka SML) as specified in the preliminary spec at http://www.docuverse.com/smldev/minxmlspec.html. It supports SAX 1.0 and JAXP (Java API for XML Parsers). You can download the distribution ZIP file at http://www.docuverse.com/min/. It includes binary and source code JAR files and JavaDoc API documentation. There is also a utility class for converting XML files into MinXML files. You can just use the minimize.bat batch file like this once you have the binary JAR in your CLASSPATH...
On my Celeron laptop, it zips through Minimal XML files at around 6 megabytes of UTF-8 data per second and over 10 megabytes of UTF-16 data per second. Sun's XML parser parses at about 3 megabytes/sec on UTF-8 and 4 megabytes per second on UTF-16. Min used to report events via a custom API called MAX (Minimal API for XML :) but I switched to SAX despite about 10% performance penalty because people would have to rewrite code to use Min. I might release a MAX version later... Because performance testing seems to be what most of you are doing after downloading Min, I have written a Java class for testing parsing performance of Min and other XML parsers. It is included in the version 1.0A3 of Min which is available for downloading now."
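For readers who want to try this kind of throughput measurement themselves, the following is a minimal sketch in Java. It is not the test class shipped with Min 1.0A3, and it does not use Min at all: it assumes whatever SAX parser the JDK exposes through JAXP, and it uses the SAX2 DefaultHandler rather than the SAX 1.0 interfaces mentioned in the announcement. The class name ParseThroughput, its method names, and the sample "orders" vocabulary are all hypothetical. The generated test document sticks to the constructs Minimal XML keeps (start tags, end tags, and text, with no prolog, attributes, comments, or empty-element tags), and it is parsed once as UTF-8 and once as UTF-16.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical benchmark sketch; not the performance class distributed with Min.
final class ParseThroughput {

    // Handler that only counts start-element events, so parser cost dominates.
    static final class ElementCounter extends DefaultHandler {
        long elements = 0;

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) {
            elements++;
        }
    }

    // Builds a document from start tags, end tags and text only:
    // no prolog, attributes, comments, empty-element tags or mixed content.
    static String buildDocument(int records) {
        StringBuilder sb = new StringBuilder("<orders>");
        for (int i = 0; i < records; i++) {
            sb.append("<order><id>").append(i)
              .append("</id><item>widget</item></order>");
        }
        return sb.append("</orders>").toString();
    }

    // Parses the bytes once and returns the number of elements seen.
    static long countElements(byte[] data) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCounter counter = new ElementCounter();
            parser.parse(new ByteArrayInputStream(data), counter);
            return counter.elements;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Parses the same bytes 'passes' times and returns megabytes per second.
    static double megabytesPerSecond(byte[] data, int passes) {
        long start = System.nanoTime();
        for (int i = 0; i < passes; i++) {
            countElements(data);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return (double) data.length * passes / (1024.0 * 1024.0) / seconds;
    }

    public static void main(String[] args) {
        String doc = buildDocument(20_000);
        Charset[] encodings = {StandardCharsets.UTF_8, StandardCharsets.UTF_16};
        for (Charset cs : encodings) {
            // Java's UTF-16 encoder writes a byte order mark, which lets the
            // parser detect the encoding without an XML declaration.
            byte[] data = doc.getBytes(cs);
            megabytesPerSecond(data, 3); // warm-up pass, result discarded
            System.out.printf("%s: %.1f MB/s%n", cs.name(),
                    megabytesPerSecond(data, 10));
        }
    }
}
```

Because this sketch creates a new parser for every pass and times a general-purpose XML parser, its figures are not comparable with the numbers quoted above; it only illustrates the shape of such a test. Swapping in another JAXP-compliant parser would normally be done through the standard javax.xml.parsers.SAXParserFactory system property rather than by changing the code.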
References:
Announcement for Min and Minimal XML
Min - Min is a Minimal XML document parser, available for commercial and non-commercial use without licensing fee.
Minimal XML Parser demonstration. 'This is a demonstration of a tiny MinXML parser written in Javascript.' From Sjoerd Visscher. See the description.
XTech 2000 Presentation slides - "Simplifying XML: New Developments from SML-DEV"