Nascent XML::Parser in 100% pure Perl


From     perl-xml-admin@lyris.activestate.com Mon Aug  3 12:49:50 1998
Date:    Mon, 03 Aug 1998 10:39:56 -0700
From:    Mark Kvale <kvale@phy.ucsf.EDU>
Subject: Nascent XML::Parser in 100% pure Perl
To:      Perl-XML Mailing List <perl-xml@lyris.activestate.com>

> : 2. 'twould seem to me that if it is possible to write HTML parsers using
> : PERL, pure PERL, then XML could also be done. This may not mean fully
> : validated SGML-DTD stuff but enough to sit on tags and stuff and do
> : useful work. It becomes a bit much if dll/pll based stuff fails on
> : differing ports or releases.

> Sure.  The more, the merrier.  At the module level my predilections are
> definitely more darwinian than theocratic.  May the best module win (in
> its chosen ecological niche).

> Larry

Ah, good. I agree that an all-Perl approach is useful, and it might be
nice to have both versions available. I wrote an XML parser last December
that is analogous in structure and function to HTML::Parser. It is a
basic recursive descent parser which mostly follows the EBNF grammar in the
1.0 spec. I haven't built any translation modules yet, but this should
be easy to implement by inserting hooks at various stages of the parsing.
I've tested it on simple XML documents, but haven't done extensive testing.
The lexing is done with 5.004 regexps, so although I did try to handle
all the 8-bit characters in the XML spec, it will fail miserably
on multibyte character sets.
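
As a rough illustration of that structure, here is a minimal sketch of a
regexp-lexed recursive descent parser with hooks. It is not the code in the
tarball; the names parse_xml, parse_element, parse_content and the
start/end/text hook keys are hypothetical. It assumes a tiny subset of XML
(elements, quoted attributes, character data; no comments, entities, CDATA
or DTDs) and an ASCII-only name pattern, and it uses qr//, which needs a
Perl newer than 5.004.

```perl
use strict;
use warnings;

# Assumption: ASCII-only names, far narrower than the XML 1.0 Name production.
my $NAME = qr/[A-Za-z_:][A-Za-z0-9_:.\-]*/;

# Hypothetical entry point: parse one document, firing the supplied hooks.
sub parse_xml {
    my ($text, %hooks) = @_;
    my $src = \$text;
    pos($$src) = 0;
    $$src =~ /\G\s+/gc;                      # skip leading whitespace, if any
    parse_element($src, \%hooks);
    $$src =~ /\G\s+/gc;                      # skip trailing whitespace, if any
    die "trailing content at offset " . pos($$src) . "\n"
        if pos($$src) < length $$src;
    return 1;
}

# element ::= EmptyElemTag | STag content ETag
sub parse_element {
    my ($src, $hooks) = @_;
    $$src =~ /\G<($NAME)/gc
        or die "expected start tag at offset " . pos($$src) . "\n";
    my $name = $1;
    my %attr;
    while ($$src =~ /\G\s+($NAME)\s*=\s*(?:"([^<"]*)"|'([^<']*)')/gc) {
        $attr{$1} = defined $2 ? $2 : $3;
    }
    $$src =~ /\G\s+/gc;
    $hooks->{start}->($name, \%attr) if $hooks->{start};
    if ($$src =~ /\G\/>/gc) {                # empty-element tag
        $hooks->{end}->($name) if $hooks->{end};
        return;
    }
    $$src =~ /\G>/gc
        or die "expected '>' at offset " . pos($$src) . "\n";
    parse_content($src, $hooks);
    $$src =~ /\G<\/($NAME)\s*>/gc
        or die "expected end tag at offset " . pos($$src) . "\n";
    die "mismatched end tag </$1>, expected </$name>\n" unless $1 eq $name;
    $hooks->{end}->($name) if $hooks->{end};
    return;
}

# content ::= (CharData | element)*
sub parse_content {
    my ($src, $hooks) = @_;
    while (1) {
        my $p = pos($$src);
        if ($$src =~ /\G([^<]+)/gc) {        # character data
            my $chars = $1;
            $hooks->{text}->($chars) if $hooks->{text};
        }
        elsif (substr($$src, $p, 2) eq '</') {
            return;                          # the caller consumes the end tag
        }
        elsif (substr($$src, $p, 1) eq '<') {
            parse_element($src, $hooks);     # recurse into the child element
        }
        else {
            die "unexpected end of input\n";
        }
    }
}

# Usage: record the events fired by the hooks.
my @events;
parse_xml(
    '<doc id="1"><p>Hello</p><br/></doc>',
    start => sub { my ($name, $attrs) = @_; push @events, "start:$name" },
    end   => sub { push @events, "end:$_[0]" },
    text  => sub { push @events, "text:$_[0]" },
);
print join(" ", @events), "\n";
# prints: start:doc start:p text:Hello end:p start:br end:br end:doc
```

Each grammar production gets its own subroutine, and the \G anchor with
the /gc flags lets every regexp pick up exactly where the previous one
stopped, so no separate tokenizer pass is needed. A translation module
would then amount to a different set of hooks.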

I am not currently developing it, but it might make a good starting point
for a pure Perl XML package, especially when utf-perl comes into production. 
People are welcome to use it, under the same terms as Perl itself. 
The parser is available at my FTP site:

ftp://phy.ucsf.edu/pub/kvale/XML.tar.gz

				-Mark

[Note: See the dedicated database entry for more on XML and Perl. -rcc]