Cover Pages: Tiny API for Markup (TAM) and Parser for Lightweight XML Processing.

A posting from Simon St.Laurent announces the initial release of a Tiny API for Markup (TAM) and supporting Java 2 Micro Edition parser. "The Tiny API for Markup (TAM) provides a very small interface for parsing XML and similar documents, targeted at Java 2 Micro Edition (J2ME). TAM is designed to report pretty much everything the parser encounters, leaving it to applications to do some work (notably DTD interpretation) if they need it. TAM is based on a subset of SAX2, which is then slightly expanded. TAM is not a proper drop-in replacement for SAX; it uses similar method calls, and a similar approach, but it's been reduced to meet the needs of even smaller projects, and expanded slightly to reflect that TAM parsers are not required to process the DTD... This parser does support namespaces, and namespace declarations are reported as attributes. The startPrefix/endPrefix methods of SAX2 are not supported by TAM. The current version also does very little character checking of markup, and while it normalizes line-ends, it doesn't do attribute white-space normalization. These features (and DOCTYPE processing) will appear in a later version of the parser which supports more of XML 1.0 and also Markup Object Events (MOE)."
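Because namespace declarations arrive as ordinary attributes, an application that needs prefix-to-URI bindings has to pick them out itself. The following is a minimal sketch of that step, not TAM code: the class name `NamespaceDeclarations`, the method `collect`, and the use of parallel name/value arrays are all assumptions made for illustration, and the sketch uses J2SE collections for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of TAM): extracts namespace bindings
// from attributes that a parser reported as plain name/value pairs.
final class NamespaceDeclarations {
    private NamespaceDeclarations() {}

    /** Returns prefix -> URI bindings; the default namespace uses the key "". */
    static Map<String, String> collect(String[] attrNames, String[] attrValues) {
        Map<String, String> bindings = new LinkedHashMap<String, String>();
        for (int i = 0; i < attrNames.length; i++) {
            String name = attrNames[i];
            if (name.equals("xmlns")) {
                // Default namespace declaration: xmlns="..."
                bindings.put("", attrValues[i]);
            } else if (name.startsWith("xmlns:")) {
                // Prefixed declaration: xmlns:prefix="..."
                bindings.put(name.substring("xmlns:".length()), attrValues[i]);
            }
            // Any other attribute is ordinary content and is ignored here.
        }
        return bindings;
    }
}
```

An application would typically call such a helper once per start-element event and keep its own stack of bindings to handle scoping.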

Also from the announcement:

The API is based loosely on SAX2, though the looseness has increased over time. TAM both subsets and supersets SAX2's core ContentHandler interface, and changes from (uri, localName, qName) to (uri, localName, prefix). SAX2 developers should find it familiar, but direct compatibility with SAX2 was definitely not a goal.
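To make the change of name triple concrete, the sketch below converts between a SAX2-style qualified name and a separate prefix. It is only an illustration of the relationship between the two styles; `NameStyles`, `prefixOf`, and `qNameOf` are hypothetical names and do not come from TAM or SAX2.

```java
// Hypothetical helper (not part of TAM or SAX2): maps between the
// (uri, localName, qName) and (uri, localName, prefix) naming styles.
final class NameStyles {
    private NameStyles() {}

    /** Extracts the prefix from a qualified name; returns "" when there is none. */
    static String prefixOf(String qName) {
        int colon = qName.indexOf(':');
        return colon < 0 ? "" : qName.substring(0, colon);
    }

    /** Rebuilds a qualified name from a prefix and a local name. */
    static String qNameOf(String prefix, String localName) {
        if (prefix == null || prefix.length() == 0) {
            return localName; // unprefixed names are just the local name
        }
        return prefix + ":" + localName;
    }
}
```

Under this reading, code written against SAX2 would call something like `prefixOf(qName)` when moving to the prefix-based style, and `qNameOf(prefix, localName)` when it needs the original lexical name back.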

This parser is not XML 1.0-compliant, primarily because it doesn't support DOCTYPE processing at all. J2ME is a very constrictive environment, so I've made some reductions and passed responsibility from the parser to the application. The TAM API provides an event that reports the entire DOCTYPE declaration to the application, which is then responsible for any DOCTYPE processing that may be necessary. TAM will resolve simple entities if the application registers them with the parser, but does not handle entities which contain markup -- that's a job for the application.
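The division of labor described above, in which the application registers simple entities and anything containing markup stays the application's problem, might look roughly like the following. This is a sketch under stated assumptions rather than the TAM parser's actual mechanism: the class `SimpleEntityTable`, its methods, and the rule of rejecting any value that contains `<` are all invented for illustration, and J2SE collections are used for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration (not the TAM API): a table of application-registered
// entities that only accepts plain-text replacement values.
final class SimpleEntityTable {
    private final Map<String, String> entities = new HashMap<String, String>();

    /** Registers an entity; values containing '<' are rejected as markup (an assumed rule). */
    void register(String name, String value) {
        if (value.indexOf('<') >= 0) {
            throw new IllegalArgumentException("entity contains markup: " + name);
        }
        entities.put(name, value);
    }

    /** Replaces each registered &name; reference once; unknown references are left as-is. */
    String expand(String text) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < text.length()) {
            int amp = text.indexOf('&', pos);
            int semi = amp < 0 ? -1 : text.indexOf(';', amp);
            if (amp < 0 || semi < 0) {
                out.append(text.substring(pos)); // no further complete references
                break;
            }
            out.append(text.substring(pos, amp));
            String value = entities.get(text.substring(amp + 1, semi));
            // Unregistered references pass through untouched for the application to handle.
            out.append(value != null ? value : text.substring(amp, semi + 1));
            pos = semi + 1;
        }
        return out.toString();
    }
}
```

For example, after `register("co", "Example Corp")`, expanding `"(c) &co; &unknown; end"` yields `"(c) Example Corp &unknown; end"`.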

The TAM API interface and TAMException class are in the public domain, while the parser is licensed under the Mozilla Public License, version 1.1. While this code is written for J2ME, it also works in J2SE and should be fine in any Java 2 environment.

About Markup Object Events (MOE):

Markup Object Events (MOE) is a Java API which combines the tree-based nature of DOM with the event-based nature of SAX. MOE makes it easy for developers to create trees representing partial documents. MOE object events can listen to other object events, 'filling up' until they reach completion and are ready for processing.

MOE is built around a core set of interfaces which define all nodes as having a similar structure. Every node, whatever its type, has at least the possibility of a (namespace-aware three-part) name, an unordered set of contents, an ordered set of contents, and a map for annotation. The classes which implement those interfaces can use them to represent XML at arbitrary levels of lexical preservation from a pure (or even refined) Infoset view to the preservation of "useless" things like spacing between attributes, single-quotes or double-quotes around attributes.
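The uniform node shape described above can be pictured with a small class. This is a hypothetical sketch only: `SketchNode` and its field names are invented here and are not MOE's interfaces, and reading the unordered set as attributes and the ordered list as children is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical node shape (not MOE's actual interfaces): every node may carry
// a three-part name, unordered contents, ordered contents, and annotations.
final class SketchNode {
    final String uri;        // namespace URI, or null if the node is unnamed
    final String localName;  // local part of the name, or null
    final String prefix;     // lexical prefix, or null

    // Assumed reading: unordered contents hold things like attributes.
    final Set<SketchNode> unordered = new HashSet<SketchNode>();
    // Assumed reading: ordered contents hold things like child nodes.
    final List<SketchNode> ordered = new ArrayList<SketchNode>();
    // Free-form annotations, e.g. lexical details such as quote style.
    final Map<String, Object> annotations = new HashMap<String, Object>();

    SketchNode(String uri, String localName, String prefix) {
        this.uri = uri;
        this.localName = localName;
        this.prefix = prefix;
    }

    /** True when the node has a name at all; text-like nodes may not. */
    boolean hasName() {
        return localName != null;
    }
}
```

In this sketch, an attribute node could record `annotations.put("quote", "single")` to preserve the kind of lexical detail the passage mentions, while an Infoset-style consumer would simply ignore the annotation map.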

While MOE is currently oriented toward XML - and the particular notion of namespaces it uses definitely comes from XML - it's been designed to support a much wider set of information. MOE's origins lie in an effort to break down subcomponents of XML documents with lexical tools (regular expressions). While XML is a critical, and perhaps even canonical, form of markup, markup has many faces which are not XML. Developers will have to write parsers to support those forms of markup, but MOE should be capable of representing them. The CoreComponent abstract class and Mutable class are designed explicitly to be extensible in this way.

Monastic XML: An ascetic view of XML best practices. "MonasticXML.org is a look at XML from a different angle, focusing on what markup is best at rather than what markup can do to solve a particular problem or set of problems. While XML is powerful, developers seem insistent on using XML in ways which seem convenient for a moment but which cause much greater trouble down the line to both their projects and to markup itself... MonasticXML.org presents a deeply conservative view of markup in general and XML in particular, though it often turns out that discipline brings its own rewards. Paying attention to the details of how a technology works may not be as interesting as building large projects, but it may save time and energy over the long term."

Principal references:

Announcement 2002-08-20: "Tiny API for Markup, Parser"
TAM - The Tiny API for Markup
TAM javadoc documentation
TAM download
Markup Object Events (MOE)
Monastic XML

