AElfred XML Parser
From david@freenet.carleton.ca Tue Dec 9 18:19:34 1997
Date: Tue, 9 Dec 1997 19:19:18 -0500
Message-Id: <199712100019.TAA00266@unready.microstar.com>
From: David Megginson <ak117@freenet.carleton.ca>
Subject: AElfred XML Parser
------------------------------------------------
Microstar Software Ltd. is happy to announce Ælfred (AElfred), a
small, fast, DTD-aware Java-based XML parser, especially suitable for
use in Java applets.
We've designed Ælfred for Java programmers who want to add XML support
to their applets and applications without doubling their size: Ælfred
consists of only two class files, with a total size of approximately
24K, and requires very little memory to run. Ælfred also implements
Java's java.lang.Runnable interface and a zero-argument constructor,
so it's easy to start Ælfred as a separate thread or to adapt it for
use as a JavaBean.
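To illustrate why a Runnable parser with a zero-argument constructor is convenient, here is a minimal sketch of the threading pattern. It does not use Ælfred's real API: SketchParser, setInput, getElementCount, and parseInBackground are hypothetical names, and the "parsing" is only a toy start-tag counter.

```java
public class ThreadSketch {
    /** Hypothetical stand-in for a parser class; NOT Ælfred's real API. */
    static class SketchParser implements Runnable {
        private String input = "";          // configured through a setter, bean-style
        private volatile int elementCount;  // result published by run()

        public SketchParser() { }           // zero-argument constructor

        public void setInput(String input) { this.input = input; }
        public int getElementCount() { return elementCount; }

        @Override
        public void run() {
            int count = 0;
            // Toy logic: count every '<' that is immediately followed by a letter.
            for (int i = 0; i + 1 < input.length(); i++) {
                if (input.charAt(i) == '<' && Character.isLetter(input.charAt(i + 1))) {
                    count++;
                }
            }
            elementCount = count;
        }
    }

    /** Runs the stand-in parser on its own thread and waits for the result. */
    public static int parseInBackground(String xml) {
        SketchParser parser = new SketchParser();
        parser.setInput(xml);
        Thread worker = new Thread(parser);  // any Runnable can be handed to a Thread
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return parser.getElementCount();
    }

    public static void main(String[] args) {
        System.out.println(parseInBackground("<doc><p>Hello</p><p>World</p></doc>"));
    }
}
```

Because the object is configured through setters after construction and does its work in run(), the same shape suits both a worker thread and a bean container.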
Ælfred is free for both commercial and non-commercial use, and COMES
WITH NO WARRANTY. You can download a copy of version 1.0 (with
source code) from the following URL:
http://www.microstar.com/XML/index.htm
[December 11, 1997 Update:
Date: Thu, 11 Dec 1997 17:22:43 -0500
From: David Megginson <ak117@freenet.carleton.ca>
To: xml-dev Mailing List <xml-dev@ic.ac.uk>
Subject: AElfred 1.0beta3 release
There is a new release of Microstar's Ælfred XML parser at
http://www.microstar.com/XML/
The new version is still interface-compatible with the first two
public betas, but it adds the ability to query for content models and
enumerated attribute types (both returned as normalised strings, with
whitespace removed and parameter entities resolved).
With the new query routines, Ælfred is now capable of producing a
normalised version of an XML document's DTD; in fact, the distribution
now includes a new demonstration class, DtdDemo.java, that does
exactly that. ]
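The normalisation described in the update (whitespace removed, parameter entities resolved) can be sketched as follows. This is only an illustration under stated assumptions: the normalise method and the entity map are hypothetical, the code is not taken from Ælfred or DtdDemo.java, and it makes a single substitution pass, so entities nested inside other entities are not expanded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ContentModelSketch {
    /**
     * Hypothetical helper: replaces each %name; reference with its text,
     * then strips all whitespace from the result.
     */
    public static String normalise(String declared, Map<String, String> parameterEntities) {
        String result = declared;
        for (Map.Entry<String, String> entity : parameterEntities.entrySet()) {
            result = result.replace("%" + entity.getKey() + ";", entity.getValue());
        }
        return result.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        Map<String, String> entities = new LinkedHashMap<>();
        entities.put("block", "para | list");
        // Prints: (title,(para|list)*)
        System.out.println(normalise("( title , ( %block; )* )", entities));
    }
}
```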
*****************
DESIGN PRINCIPLES
*****************
1. Ælfred must be as small as possible, so that it doesn't add too
much to your applet's download time.
STATUS: Ælfred is currently about 24K in total, and we're still
looking for ways to shrink it further.
2. Ælfred must use as few class files as possible, to minimize the number
of HTTP connections necessary for applets.
STATUS: Ælfred consists of only two class files, the main parser
class (XmlParser.class) and a small interface for your own program
to implement (XmlProcessor.class). All other classes in the
distribution are just demonstrations.
3. Ælfred must be compatible with most or all Java implementations
and platforms.
STATUS: Ælfred uses only JDK 1.0.2 features, and we have tested it
successfully with the following Java implementations: JDK 1.1.1
(Linux), jview (Windows NT), Netscape 4 (Linux and Windows NT),
Internet Explorer 3 (Windows NT), and Internet Explorer 4 (Windows
NT).
4. Ælfred must use as little memory as possible, so that it does not take
away resources from the rest of your program.
STATUS: On a P75 Linux system, using JDK 1.1.1, running Ælfred
(with a 4MB XML document) takes only 2MB more memory than running
a simple "Hello world" Java application. Because Ælfred does not
build an in-memory parse tree, you can run it on very large input
files using little or no extra memory.
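The memory behaviour follows from the event-driven design: the parser reports each construct to the caller as it is read and then forgets it. A minimal sketch of that pattern is shown below. The EventSink interface and its method names are hypothetical stand-ins, not Ælfred's XmlProcessor, and the scanner is a toy that understands only bare <name> and </name> tags (no attributes, comments, or entities).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class StreamingSketch {
    /** Hypothetical callback interface; the names are illustrative only. */
    interface EventSink {
        void startElement(String name);
        void endElement(String name);
    }

    /** Toy scanner: reads one character at a time and keeps only the current tag name. */
    static void scan(Reader in, EventSink sink) throws IOException {
        int c;
        while ((c = in.read()) != -1) {
            if (c != '<') {
                continue;  // character data is skipped in this sketch
            }
            StringBuilder name = new StringBuilder();
            boolean closing = false;
            while ((c = in.read()) != -1 && c != '>') {
                if (c == '/' && name.length() == 0) {
                    closing = true;
                } else {
                    name.append((char) c);
                }
            }
            if (closing) {
                sink.endElement(name.toString());
            } else {
                sink.startElement(name.toString());
            }
        }
    }

    /** A handler that keeps three integers no matter how large the input is. */
    static class DepthCounter implements EventSink {
        int elements;
        int depth;
        int maxDepth;

        public void startElement(String name) {
            elements++;
            depth++;
            maxDepth = Math.max(maxDepth, depth);
        }

        public void endElement(String name) {
            depth--;
        }
    }

    /** Returns {element count, maximum nesting depth} for the given text. */
    public static int[] summarise(String xml) {
        DepthCounter counter = new DepthCounter();
        try {
            scan(new StringReader(xml), counter);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return new int[] { counter.elements, counter.maxDepth };
    }

    public static void main(String[] args) {
        int[] summary = summarise("<doc><sec><p>text</p></sec><sec></sec></doc>");
        System.out.println(summary[0] + " elements, max depth " + summary[1]);
    }
}
```

Since nothing is retained between callbacks, memory use depends on what the handler chooses to keep rather than on the size of the document.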
5. Ælfred must run as fast as possible, so that it does not slow down
the rest of your program.
STATUS: On a P75 Linux system, using JDK 1.1.1 (without a JIT
compiler), Ælfred parses XML test files at about 50K/second. On a
P166 NT workstation, using jview, Ælfred parses XML test files at
about 1MB/second.
6. Ælfred must produce correct output for well-formed and valid
documents, but need not reject every document that is not valid or
not well-formed.
STATUS: Ælfred is DTD-aware, and handles all current XML features,
including CDATA and INCLUDE/IGNORE marked sections, internal and
external entities, proper whitespace treatment in element content,
and default attribute values. It will sometimes accept input that
is technically incorrect, however, without reporting an error (see
README), since full error reporting would make the parser much
larger.
7. Ælfred must provide full internationalisation from the first release.
STATUS: Ælfred supports Unicode to the fullest extent possible in
Java. It correctly handles XML documents encoded using UTF-8,
UTF-16, ISO-10646-UCS-2, ISO-10646-UCS-4 (as far as surrogates
allow), and ISO-8859-1 (ISO Latin 1/Windows). With these
character sets, Ælfred can handle all of the world's major (and
most of its minor) languages.
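Supporting several encodings means the parser has to guess the encoding before it can read the document's own encoding declaration. The sketch below shows the byte-pattern approach suggested in the XML specification's non-normative appendix on autodetection. It is an assumption that Ælfred works along these lines; guessEncoding and startsWith are hypothetical names, the returned strings are descriptive labels rather than Java charset names, and only a subset of the patterns is covered (for example, a UCS-4 byte-order mark is not distinguished from a UTF-16 one).

```java
public class EncodingSniffSketch {
    /** Guesses an encoding family from the first bytes of a document. */
    public static String guessEncoding(byte[] head) {
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";             // byte-order mark
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";             // byte-order mark
        if (startsWith(head, 0x00, 0x00, 0x00, 0x3C)) return "UCS-4BE";  // '<' in 4 bytes
        if (startsWith(head, 0x3C, 0x00, 0x00, 0x00)) return "UCS-4LE";
        if (startsWith(head, 0x00, 0x3C, 0x00, 0x3F)) return "UTF-16BE"; // "<?" without a mark
        if (startsWith(head, 0x3C, 0x00, 0x3F, 0x00)) return "UTF-16LE";
        // Anything else: assume UTF-8 until an encoding declaration says otherwise.
        return "UTF-8";
    }

    /** True if data begins with the given unsigned byte values. */
    static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf16le = { 0x3C, 0x00, 0x3F, 0x00, 0x78, 0x00 };  // "<?x" in UTF-16LE
        byte[] ascii = { 0x3C, 0x3F, 0x78, 0x6D, 0x6C };          // "<?xml" in ASCII/UTF-8
        System.out.println(guessEncoding(utf16le));
        System.out.println(guessEncoding(ascii));
    }
}
```

A single-byte encoding such as ISO-8859-1 cannot be told apart from UTF-8 by these patterns alone; that distinction needs the encoding declaration inside the document.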
***********************
ABOUT THE NAME "Ælfred"
***********************
Ælfred the Great (AElfred in ASCII) was king of Wessex, and at least
nominally of all England, at the time of his death in AD 899. Ælfred
introduced a widespread literacy program in the hope that his people
would learn to read English, at least, if Latin was too difficult for
them. This Ælfred hopes to bring another sort of literacy to Java,
using XML, at least, if full SGML is too difficult.
The initial "Æ" (AE ligature) is also a reminder that XML is not
limited to ASCII.
Enjoy!
David
---
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/