XML and DTD Parsing - EntityResolver


From: <david@megginson.com>
Date: Mon, 19 Oct 1998 11:37:45 -0400 (EDT)
To: "XML Developers' List" <xml-dev@ic.ac.uk>
Subject: XML and DTD parsing

Sung Nguyen writes:

 
 > With the IBM alpha parser, it works fine with
 > both the DTD and the XML in the same file.  But when I have
 > the DTD in a separate file stored in a buffer (the
 > XML file is stored in a buffer too), the parser doesn't
 > provide an API to do that job.

I haven't tested the SAX support in XML4J recently, but if it supports EntityResolvers correctly, you can simply make up a URI for the external DTD, and then supply your own Reader for the buffer when you see the URI.
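
As a rough sketch of that approach (not tested against XML4J): the example below uses the SAX2/JAXP classes bundled with current JDKs, which postdate this message, and every name in it (BufferDtdExample, DTD_URI, the two buffers) is made up for illustration. The document's DOCTYPE points at an invented URI, and the resolver answers that URI with a Reader over the in-memory DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class BufferDtdExample {
    // Made-up system identifier: it only labels the buffer and is never fetched.
    static final String DTD_URI = "http://example.invalid/buffers/note.dtd";

    // Hypothetical in-memory buffers standing in for the poster's DTD and XML.
    static final String DTD_BUFFER =
        "<!ELEMENT note (#PCDATA)>\n"
      + "<!ATTLIST note lang CDATA \"en\">\n";
    static final String XML_BUFFER =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE note SYSTEM \"" + DTD_URI + "\">\n"
      + "<note>hello</note>\n";

    // Parses XML_BUFFER and returns the "lang" attribute of <note>, which can
    // only come from the attribute default declared in DTD_BUFFER.
    static String parseLang() {
        final String[] lang = new String[1];
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();

            // Hand the parser a Reader over the buffer when it asks for DTD_URI.
            reader.setEntityResolver((publicId, systemId) -> {
                if (DTD_URI.equals(systemId)) {
                    InputSource source =
                        new InputSource(new StringReader(DTD_BUFFER));
                    source.setSystemId(systemId);
                    return source;
                }
                return null; // anything else: default behaviour
            });

            // Record the attribute value reported for the <note> element.
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if ("note".equals(qName)) {
                        lang[0] = atts.getValue("lang");
                    }
                }
            });

            reader.parse(new InputSource(new StringReader(XML_BUFFER)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return lang[0];
    }
}
```

If the parser consults the resolver as expected, BufferDtdExample.parseLang() should return the default value declared in the buffered DTD, which shows that the DTD was read from memory and not from the network.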

For more information, see:

http://www.megginson.com/SAX/javadoc/org.xml.sax.EntityResolver.html

It is possible to use entity resolvers in quite clever ways -- you can also pull information out of a database, request user input in a dialog box, etc. I wish that I could take credit for the idea, but I think that the original suggestion came from James Clark.

David Megginson                 david@megginson.com
           http://www.megginson.com/

Viz., 1998-10-20



Interface org.xml.sax.EntityResolver 

public interface EntityResolver 

Basic interface for resolving entities. 

If a SAX application needs to implement customized handling for external entities,
it must implement this interface and register an instance with the SAX parser using
the parser's setEntityResolver method.

The parser will then allow the application to intercept any external entities
(including the external DTD subset and external parameter entities, if any) before
including them.

Many SAX applications will not need to implement this interface, but it will be
especially useful for applications that build XML documents from databases or other
specialised input sources, or for applications that use URI types other than URLs.

The following resolver would provide the application with a special character
stream for the entity with the system identifier "http://www.myhost.com/today":

 import org.xml.sax.EntityResolver;
 import org.xml.sax.InputSource;

 public class MyResolver implements EntityResolver {
   public InputSource resolveEntity (String publicId, String systemId)
   {
     if ("http://www.myhost.com/today".equals(systemId)) {
       // return a special input source; MyReader is an
       // application-defined subclass of java.io.Reader
       MyReader reader = new MyReader();
       return new InputSource(reader);
     } else {
       // use the default behaviour
       return null;
     }
   }
 }
 

The application can also use this interface to redirect system identifiers to local
URIs or to look up replacements in a catalog (possibly by using the public identifier).
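
A minimal sketch of such a catalog lookup, assuming a plain in-memory map instead of any real catalog file format; the class name CatalogResolver and its mapPublic/mapSystem methods are hypothetical. The public identifier is tried first, then the system identifier, and anything unknown is left to the parser.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class CatalogResolver implements EntityResolver {
    // Hypothetical catalog: identifier -> local replacement URI.
    private final Map<String, String> byPublicId = new HashMap<>();
    private final Map<String, String> bySystemId = new HashMap<>();

    CatalogResolver mapPublic(String publicId, String localUri) {
        byPublicId.put(publicId, localUri);
        return this;
    }

    CatalogResolver mapSystem(String systemId, String localUri) {
        bySystemId.put(systemId, localUri);
        return this;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = null;
        if (publicId != null) {
            local = byPublicId.get(publicId);   // prefer the public identifier
        }
        if (local == null && systemId != null) {
            local = bySystemId.get(systemId);   // fall back to the system identifier
        }
        if (local == null) {
            return null;                        // not in the catalog: default behaviour
        }
        // Redirect: the parser opens the local URI in place of the original one.
        InputSource source = new InputSource(local);
        source.setPublicId(publicId);
        return source;
    }
}
```

An application might build it with something like new CatalogResolver().mapPublic("-//EXAMPLE//DTD Note//EN", "file:///tmp/note.dtd") and register the result through setEntityResolver; the identifier and path here are placeholders.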

The HandlerBase class implements the default behaviour for this interface, which is
simply always to return null (to request that the parser use the default system
identifier).

Version: 
      1.0 
Author: 
      David Megginson (ak117@freenet.carleton.ca) 
See Also: 
      setEntityResolver, InputSource, HandlerBase




                       

   resolveEntity 

 public abstract InputSource resolveEntity(String publicId,
                                           String systemId) throws SAXException, IOException

      Allow the application to resolve external entities. 

      The Parser will call this method before opening any external entity except
      the top-level document entity. This includes the external DTD subset, external
      entities referenced within the DTD, and external entities referenced within
      the document element. The application may request that the parser resolve
      the entity itself, that it use an alternative URI, or that it use an entirely
      different input source.

      Application writers can use this method to redirect external system
      identifiers to secure and/or local URIs, to look up public identifiers in a
      catalogue, or to read an entity from a database or other input source
      (including, for example, a dialog box).

      If the system identifier is a URL, the SAX parser must resolve it fully before
      reporting it to the application.

      Parameters: 
            publicId - The public identifier of the external entity being referenced,
            or null if none was supplied. 
            systemId - The system identifier of the external entity being
            referenced. 
      Returns: 
            An InputSource object describing the new input source, or null to
            request that the parser open a regular URI connection to the system
            identifier. 
      Throws: SAXException 
            Any SAX exception, possibly wrapping another exception. 
      Throws: IOException 
            A Java-specific IO exception, possibly the result of creating a new
            InputStream or Reader for the InputSource. 
      See Also: 
            InputSource 
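
      To illustrate the return value and the SAXException clause together, here
      is a small sketch of one possible policy for keeping entities local. The
      class name LocalOnlyResolver and the "file:" prefix test are assumptions
      made for the example, not part of SAX.

```java
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class LocalOnlyResolver implements EntityResolver {
    // An implementation may declare fewer exceptions than the interface does.
    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException {
        if (systemId != null && systemId.startsWith("file:")) {
            // Local entity: returning null asks the parser to open it itself.
            return null;
        }
        // Anything else is rejected; throwing here reports the problem to the
        // code that started the parse.
        throw new SAXException(
            "Refusing to load non-local external entity: " + systemId);
    }
}
```

      A stricter or looser test (an allow-list of hosts, a catalogue lookup
      before rejecting) would slot into the same method.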

