Getting DTD info: Peter Newcomb's parseDtd


See the description of parseDtd, code written by Peter Newcomb of TechnoTeacher. It parses an SGML declaration set without requiring a document, and is based on version 1.2.1 of the SP SGML parser written by James Clark. Some correspondence from the SP list is appended below.




From owner-sp-prog@cygnus.uwa.edu.au Wed Jan 14 16:46:19 1998
Date: Wed, 14 Jan 1998 16:24:43 -0600
From: Peter Newcomb <peter@techno.com>
Subject: Re: Getting DTD info
In-reply-to: <Pine.HPP.3.96.980114154941.7849A-100000@brm.bireme.br>
 (message from Alberto Pedroso on Wed, 14 Jan 1998 15:55:36 -0300 (SAT))
Sender: owner-sp-prog@cygnus.uwa.edu.au
To: alberto@bireme.br, sp-prog@cygnus.uwa.edu.au


[Alberto Pedroso <alberto@bireme.br> on Wed, 14 Jan 1998 15:55:36 -0300 (SAT)]
> Hi, remember my question a few days ago about getting info from the DTD
> that nobody answered? Well, I think I discovered a way of achieving this
> using SP's native interface. I did the following routine to parse the DTD,
> but when I call the function parser.init(params) I get an unhandled
> exception error. What other parameters do I need to pass? What could be
> happening there? Does anyone have any idea?

Most glaringly, you're missing an entity manager.

I recently put together a small SP-based package that parses
declaration sets irrespective of particular documents, returning the
result as an SP Dtd object.  I didn't think it was appropriate when I
read your first post, since I thought you were trying to access DTD
information _while_ parsing a document.  Now I think it may be exactly
what you're looking for.

It'll be put up on the ISOGEN and TechnoTeacher web sites soon, but
for now, you (and anybody else that wants to) can pick it up at:

  ftp://ftp.techno.com/TechnoTeacher/parseDtd/

-peter

--
Peter Newcomb                          at TTI: +1 972 231 4098
TechnoTeacher, Inc.                 at ISOGEN: +1 214 953 0004 x141
http://www.techno.com/                  email: peter@techno.com
--
To unsubscribe from this list, send mail to majordomo@cygnus.uwa.edu.au
with the phrase 'unsubscribe sp-prog' in the message body.  If
you have any difficulty with this, please send mail to
owner-sp-prog@cygnus.uwa.edu.au and ask for help.


=========================================================================


Date: Fri, 16 Jan 1998 11:53:45 +0000
From: Alfie Kirkpatrick <akirkpatrick@ims-global.com>
Subject: RE: Getting DTD info
Sender: owner-sp-prog@cygnus.uwa.edu.au
To: alberto@bireme.br, sp-prog@cygnus.uwa.edu.au, peter@techno.com
Reply-to: sp-prog@cygnus.uwa.edu.au


First of all, thanks to Peter for making parseDtd available.
Personally, I've been waiting for a pointer into this area for
some time.

For people using MSVC, I thought I'd share the process I
went through to compile the program as a subproject to SP
(it's clear Peter is in a UNIX environment).

1. Extract parseDtd into a subdirectory of Jade10.
2. Open the SP project and add a new Win32 console
application as a subproject in the parseDtd directory.
3. Add all the appropriate files to the subproject.
4. Make a dependency link from parseDtd to lib.
5. Copy ALL the settings from spam to the parseDtd
project, including SP_NAMESPACE and all the more
subtle options (I had some unresolved externals before
I did this).
6. Build.

Hope this saves someone some time...

Alfie.



=========================================================================


Date: Fri, 16 Jan 1998 10:12:21 -0600
Message-Id: <199801161612.KAA10755@exocomp.techno.com>
From: Peter Newcomb <peter@techno.com>
To: sp-prog@cygnus.uwa.edu.au, akirkpatrick@ims-global.com
In-reply-to: <TFSJMUTH@ims-global.com> (message from Alfie Kirkpatrick on
	Fri, 16 Jan 1998 11:53:45 +0000)
Subject: Re: Getting DTD info


[Alfie Kirkpatrick <akirkpatrick@ims-global.com> on Fri, 16 Jan 1998 11:53:45 +0000]
> For people using MSVC, I thought I'd share the process I
> went through to compile the program as a subproject to SP
> (it's clear Peter is in a UNIX environment).

True, I usually develop on Linux or Solaris, with GCC.  However, I was
careful to mimic James' style of portability so that it would compile
cleanly under Windows/MSVC, including the SP_MULTI_BYTE support.
(Basically, all I had to do was steal a bunch of code from CmdLineApp,
EntityApp, and ParserApp.)  Also, although GCC does not support
namespaces (@#&#*!!), I included the appropriate namespace
declarations, #ifdef'd just as James always does.  I even tested it
under MSVC, since I originally developed DtdParser as a module for
someone else's MSVC+MFC project.
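
The #ifdef'd namespace style mentioned above can be sketched roughly as
follows.  This is only an illustration of the pattern, not code from
parseDtd or SP: SP_NAMESPACE is the macro named in this thread, while
the value demo_sp and the functions greeting() and callGreeting() are
made up for the example.  Normally the macro would come from the
compiler settings rather than from a #define in the source file.

```cpp
#include <string>

// Stand-in for a compiler setting (e.g. -DSP_NAMESPACE=... or the MSVC
// project options).  Comment this line out to simulate a compiler that
// has no namespace support; the rest of the file still compiles.
#define SP_NAMESPACE demo_sp

#ifdef SP_NAMESPACE
namespace SP_NAMESPACE {
#endif

// Hypothetical library function, for illustration only.
inline std::string greeting() { return "parsed"; }

#ifdef SP_NAMESPACE
}
#endif

// Client code pulls the namespace in only when one is actually in use.
#ifdef SP_NAMESPACE
using namespace SP_NAMESPACE;
#endif

// Hypothetical caller; it names greeting() without any qualification,
// so it works with or without the namespace.
inline std::string callGreeting() { return greeting(); }
```

With the macro defined, greeting() lives in demo_sp; with it undefined,
the same source puts everything in the global namespace, which is why
the client code never has to spell the namespace name out.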

> 1. Extract parseDtd into a subdirectory of Jade10.
> 2. Open the SP project and add a new Win32 console
> application as a subproject in the parseDtd directory.
> 3. Add all the appropriate files to the subproject.
> 4. Make a dependency link from parseDtd to lib.
> 5. Copy ALL the settings from spam to the parseDtd
> project, including SP_NAMESPACE and all the more
> subtle options (I had some unresolved externals before
> I did this).
> 6. Build.

Thank you for providing these directions, Alfie; do you mind if I
include them in the README?  I should have come up with some myself,
but I'm not very comfortable with MSVC, and so wasn't sure exactly
what to say (i.e., what would be obvious to MSVC users, and what would
not.)

I should also have said something about how to use the UNIX-style
makefile fragment (Makefile.sub):

1. Extract parseDtd as a subdirectory of the top-level directory of
   SP 1.2.1 (or Jade10).

2. In the top-level directory, type "make XPROGDIRS=parseDtd", or
   "make PROGDIRS=parseDtd" to skip compiling nsgmls, sgmlnorm, spam,
   and spent.

BTW, is there no way to create and use a makefile fragment in and/or
for MSVC?  I didn't think there was, and thus didn't include one to
begin with, but if anyone knows of a way, please tell me!

-peter

--
Peter Newcomb                          at TTI: +1 972 231 4098
TechnoTeacher, Inc.                 at ISOGEN: +1 214 953 0004 x141
http://www.techno.com/                  email: peter@techno.com




=========================================================================

Date: Fri, 06 Feb 1998 11:20:08 -0600
From: Peter Newcomb <peter@techno.com>
Subject: Re: Getting DTD info
In-reply-to: <00011BFD.1271@derwent.co.uk> (mbayly@derwent.co.uk)
Sender: owner-sp-prog@cygnus.uwa.edu.au
To: sp-prog@cygnus.uwa.edu.au


[mbayly <mbayly@derwent.co.uk> on Fri, 6 Feb 1998 09:50:03 +0000]
> One question, I was experiencing memory leaks and it appears they
> are due to the line in dtdparse.cxx 
> 
> dtd->removeElementType(params.doctypeName);
> 
> Sorry, I'm still pretty green when it comes to the Native API.  I was
> wondering if you could briefly explain what that line does and why
> it's there.

[Martin, I hope you don't mind me posting my answer to your question
(which you sent as private email) to the sp-prog list, as I believe it
to be of general interest to those using my DtdParser module, and I
have no other way of contacting those who are using it.]

This call is made in order to remove the bogus element type created by
SP as a result of my setting the document type name (i.e.,
params.doctypeName) to the empty string (which I do explicitly, though
it's probably not necessary to do so as it should default to being
empty following construction).

The memory leak is my fault, as I failed to realize that
Dtd::removeElementType() only removes, and does not delete, the named
element type.  In retrospect, it was silly of me to miss this, since
the return type of the function is ElementType *.

I suggest changing the offending line to:

    delete dtd->removeElementType(params.doctypeName);

In other words, just delete the returned ElementType object.
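
The difference between removing and deleting can be shown with a small
stand-alone sketch.  The Element and Registry classes below are
hypothetical stand-ins written for this example (in C++11); they are
not SP's ElementType or Dtd, and only mimic the behaviour described
above, where remove() unlinks an entry and hands the pointer back to
the caller.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an element type; counts live objects so a
// leak becomes visible.
struct Element {
    static int live;
    std::string name;
    explicit Element(const std::string &n) : name(n) { ++live; }
    ~Element() { --live; }
};
int Element::live = 0;

// Hypothetical stand-in for a container that owns its entries.
class Registry {
public:
    Registry() = default;
    Registry(const Registry &) = delete;
    Registry &operator=(const Registry &) = delete;
    ~Registry() {
        for (auto &entry : table_) delete entry.second;
    }
    // Takes ownership of e, replacing any entry with the same name.
    void insert(Element *e) {
        delete remove(e->name);   // deleting a null pointer is harmless
        table_[e->name] = e;
    }
    // Unlinks the entry WITHOUT deleting it; the caller now owns the
    // returned object.  Returns a null pointer if there is no such entry.
    Element *remove(const std::string &name) {
        auto it = table_.find(name);
        if (it == table_.end()) return nullptr;
        Element *e = it->second;
        table_.erase(it);
        return e;
    }
private:
    std::map<std::string, Element *> table_;
};

// Removes the entry named "" and reports how many Elements are still
// alive once the registry itself has been destroyed.
int liveAfterRemove(bool deleteResult) {
    int before = Element::live;
    Element *orphan = nullptr;
    {
        Registry r;
        r.insert(new Element(""));       // the "bogus" entry
        r.insert(new Element("para"));
        Element *removed = r.remove("");
        if (deleteResult) delete removed;
        else orphan = removed;           // result ignored: nobody owns it
    }                                    // registry deletes "para" only
    int stillAlive = Element::live - before;
    delete orphan;                       // tidy up so the demo does not leak
    return stillAlive;
}
```

Here liveAfterRemove(false) reports one surviving object, which is the
leak, and liveAfterRemove(true) reports none.  In present-day C++ a
function like remove() would more likely return a std::unique_ptr, so
that ignoring the result destroys the object instead of leaking it.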

I've updated the package at:

    ftp://ftp.techno.com/TechnoTeacher/parseDtd/parseDtd.zip

Thanks to Martin for tracking down this bug.

-peter

--
Peter Newcomb                          at TTI: +1 972 231 4098
TechnoTeacher, Inc.                 at ISOGEN: +1 214 953 0004 x141
http://www.techno.com/                  email: peter@techno.com

=========================================================================