Issues of Document Description in HTML

[Mirrored from: http://www.oclc.org:5046/~emiller/publications/metadata/issues.html]

Eric J. Miller
Office of Research, OCLC Online Computer Library Center, Inc., Dublin, Ohio

1.0 Introduction

The rapid development of the World Wide Web and the electronic dissemination of information offer both opportunities and burdens. The opportunity is to provide unprecedented access to information with the flexibility and convenience that networked data offers. The burden, however, is to integrate these services with organizational frameworks that will allow these opportunities to become a reality.

HTML (HyperText Markup Language), a simple application of SGML-like markup, is a method for expressing document structure in the WWW [Berners-Lee]. Its simplicity has contributed to its popularity and made Web publishing more accessible, but that same simplicity makes it difficult to richly describe documents. This description is increasingly important for effective resource discovery and retrieval as more and more networked data becomes available.

Object characterization ranges from automated full-text indexing to rich description based on AACR2 [AACR2], the cataloging rules used by most library catalogers. Effective description, validation, maintenance and duplication are all issues that will affect the growth of the information landscape. This report discusses some of the issues of document description in HTML.

2.0 The HTML 2.0 META Element

Embedded description in HTML documents is currently facilitated through the use of the META element tag. This element was originally designed to provide a means to "discover that the data set exists and how it might be obtained or accessed" and to "document the content, quality, and features of a data set, indicating its fitness for use" [Berners-Lee].

HTML 2.0 defines the META element with the attributes HTTP-EQUIV, NAME and CONTENT. The corresponding structure of the META element is defined by the following DTD (Document Type Definition):

<!ELEMENT META - O EMPTY>
<!ATTLIST META
        HTTP-EQUIV  NAME    #IMPLIED
        NAME        NAME    #IMPLIED
        CONTENT     CDATA   #REQUIRED    >

The HTTP-EQUIV attribute binds the element to an HTTP header field. The NAME attribute names the property being described, and the CONTENT attribute contains the value associated with that name. These attributes give the user a limited ability to describe a particular document.
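
The attributes described above can be read mechanically. The following is a minimal sketch, in Python, of how an indexing tool might collect META descriptions from a document. It assumes Python's standard html.parser module; the names MetaCollector and collect_meta are hypothetical and do not belong to any of the systems discussed in this report.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (kind, key, content) triples from META elements."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before this call.
        if tag != "meta":
            return
        attributes = dict(attrs)
        content = attributes.get("content")
        if content is None:
            return  # CONTENT is #REQUIRED in the DTD; skip malformed elements
        if attributes.get("http-equiv"):
            self.records.append(("http-equiv", attributes["http-equiv"], content))
        if attributes.get("name"):
            self.records.append(("name", attributes["name"], content))


def collect_meta(html_text):
    """Return every META description found in html_text, in document order."""
    parser = MetaCollector()
    parser.feed(html_text)
    parser.close()
    return parser.records
```

Each record keeps track of whether it came from NAME or HTTP-EQUIV, since the two attributes carry different kinds of information and a consumer will usually want to treat them separately.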

3.0 Current Uses of the META Element

Currently, no widely supported guidelines exist that describe the use of the META element for the embedded description of HTML documents. The lack of these guidelines has created a wide range of localized use. One use is, as intended, to facilitate discovery and retrieval of document collections. Examples of localized implementations that utilize the META element for this purpose include the MOMspider project, the WN server, and ALIWEB to name but a few. The lack of a common descriptive element set, however, potentially compromises the effectiveness of interoperating between these systems.

A different use of the META element is found in Netscape's push-pull implementation. The META element, in this case, is used to define a refresh duration and possible redirection of an HTML document. FirstFloor additionally uses the META element for refreshing "bulletins" or messages that automatically notify users of new information on a Web site.

Additionally, Microsoft's Internet Assistant creates and embeds META values in HTML documents based on the software version and the user's configuration of Microsoft Word.

4.0 Current Document Description: The META Issue(s)

Given the variety of current implementations and the potential future uses, it is important to define a framework that allows for effective object description as well as for localized and distributed extensions. It is difficult, however, to anticipate all the possible descriptive elements or formats that will facilitate discovery and retrieval of localized and global networked information. The limitations of the basic attribute-value pairing allowed by the META element make this general descriptive framework extremely difficult to achieve. The following examples illustrate some of the problems of describing an object within the confines of the current META design.

<META NAME = "Title" CONTENT = "On the Pulse of Morning">
<META NAME = "Author" CONTENT = "Maya Angelou">
<META NAME = "Publisher" CONTENT = 
"University of Virginia Library Electronic Text Center">
<META NAME = "OtherAgent" CONTENT = 
"University of Virginia Electronic Text Center">
<META NAME = "Date" CONTENT = "1993">
<META NAME = "Object" CONTENT = "Poem">
<META NAME = "Form" CONTENT = "1 ASCII file">
<META NAME = "Source" CONTENT = "Newspaper stories and oral performance 
of text at the presidential inauguration of Bill Clinton">
<META NAME = "Language" CONTENT = "English">

This example represents an embedded HTML description of Maya Angelou's transcribed speech at the 1993 presidential inauguration. The descriptive elements are based on the Dublin Core element set. The Dublin Core is a set of thirteen metadata elements that originated from discussions at the OCLC/NCSA Metadata Workshop [Weibel]. Among the elements are: author, title, publisher, subject, unique identifier or electronic location, language, date, relation to similar objects, physical form, spatial coverage and temporal duration. The meaning of these elements can be understood by users with no training in formal cataloging or established record formats, and the elements can be used to create descriptions of Internet resources that are more detailed than automatically generated indexes.

The first problem that becomes apparent is identifying the element set used for the description. This example is based on the Dublin Core element set, which focuses on the bibliographic characterization of the object. A variety of other descriptive schemes will emerge that focus on other types of metadata, including transactional information, terms and conditions, and ratings. It is therefore necessary to distinguish between these types of classification schemes. The syntax and structure needed to do so would require agreement in the HTML standards community. For illustrative purposes, however, the previous example could simply be prefixed with the following identifier to signify the Dublin Core descriptive scheme.

 
<META NAME = "Citation" CONTENT = "Dublin Core">

For effective searching of large collections of objects, controlled vocabularies and classification schemes may become increasingly important. Controlled vocabularies for subjects are particularly useful when classifying knowledge domains. The subject element, for example, could be qualified by a scheme that specifies adherence to a known classification system such as the Library of Congress Subject Headings (LCSH), the Dewey Decimal Classification (DDC), or the Art and Architecture Thesaurus. The following examples illustrate this.

<META NAME = "Subject" SCHEME = "LCSH" CONTENT = "UNIX (computer system)">
<META NAME = "Subject" SCHEME = "DDC" CONTENT = "004.251">

Descriptive element schemes will evolve as networked information becomes more prevalent and the functional requirements of interoperability and access to distributed information become more of a reality. Classification schemes, classification types, and versioning information become increasingly important as a basis for effective description and to facilitate the transition among evolving descriptive schemes. Additional information about metadata element schemes, types, and versions is addressed in the OCLC/NCSA Metadata Workshop position paper [Weibel]. The following HTML DTD fragment representing the META element provides a redefinition of the structure and corresponding attribute information as a result of these metadata discussions.

<!ELEMENT META - O EMPTY>
<!ATTLIST META
        HTTP-EQUIV  NAME    #IMPLIED
        NAME        NAME    #IMPLIED
        VERSION     NAME    #IMPLIED
        TYPE        NAME    #IMPLIED
        SCHEME      NAME    #IMPLIED
        CONTENT     CDATA   #REQUIRED    >

For additional examples and discussions about the Dublin Core element set see appendix 1.
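
To illustrate how a retrieval tool might use the proposed SCHEME and TYPE attributes, the following Python sketch selects META values by element name and, optionally, by scheme. It is an assumption-laden illustration rather than part of any proposal: Descriptor, QualifiedMetaParser and select are hypothetical names, and the attributes are treated as plain strings with no validation against a DTD.

```python
from html.parser import HTMLParser
from typing import NamedTuple, Optional


class Descriptor(NamedTuple):
    name: str
    content: str
    scheme: Optional[str]   # external authority such as "LCSH"; None if absent
    subtype: Optional[str]  # value of the TYPE attribute; None if absent


class QualifiedMetaParser(HTMLParser):
    """Collect META elements together with their SCHEME and TYPE qualifiers."""

    def __init__(self):
        super().__init__()
        self.descriptors = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        if a.get("name") and a.get("content") is not None:
            self.descriptors.append(
                Descriptor(a["name"], a["content"], a.get("scheme"), a.get("type"))
            )


def select(html_text, name, scheme=None):
    """Return CONTENT values for the element `name`, optionally limited to one scheme."""
    parser = QualifiedMetaParser()
    parser.feed(html_text)
    parser.close()
    return [
        d.content
        for d in parser.descriptors
        if d.name.lower() == name.lower() and (scheme is None or d.scheme == scheme)
    ]
```

A search service that understands only one classification system could ask for, say, the DDC subjects and ignore the rest, which is the kind of interoperability the scheme qualifier is intended to support.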

5.0 Possible Directions of Document Description

When the META element is coupled with the VERSION, SCHEME and TYPE attributes, the potential for effective bibliographic description is increased. However, it still provides only a limited framework for describing HTML documents. For detailed cataloging, a more flexible framework is needed.

Descriptive information may be found either internal or external to the actual HTML document. A combination of internal and external description is also possible, based on proposed HTML extensions that are currently being reviewed by the working group. The following sections discuss some of the framework issues associated with both internally and externally referenced descriptive information.

5.1 External Description: The LINK Element

A more flexible framework could be developed by externally referencing the metadata associated with the object. An example of this type of referencing, based on the HyperText LINKs proposal [Maloney], is shown below:

 
<LINK REL=META HREF="http://foo.bar/paper.marc">

A potential drawback of externally referencing descriptive information is the management of document modifications and versions. Because documents are independent of their descriptive information, modifications can be made to a document that are not reflected in its description. Additional frameworks may be required to handle versioning of documents.

Another issue associated with externally referencing descriptive information is the additional protocol access required to identify the bibliographic characteristics of an object. Additionally, the potential problem of document migration exists. The Uniform Resource Locator, or URL, can change due to hardware reconfiguration, file system reorganization, or changes in organizational structure. The unpredictable mobility of Internet resources is inconvenient. For librarians, it is a serious problem which compromises their service to patrons and imposes an unacceptably large burden on catalog maintenance. Continuing work in the Uniform Resource Name (URN) communities, and the recent introduction of the PURL [PURL] services are designed to solve these problems. However, much work still is needed.

A potential benefit of external referencing is that the descriptive encoding standard is independent of the HTML document. Current markup schemes, including MARC [MARC], TEI [TEI], and FGDC [FGDC], could therefore be readily used. Additionally, formatting and rendering of the bibliographic information are no longer a concern.

5.2 Internal and External Description

External referencing also provides an attractive basis for a modular, inheritance-based framework for cataloging. This type of cataloging has been discussed by several authors, including Heaney [Heaney] and Tillett [Tillett]. For example, a collection of similar objects might share a particular cataloging "class". This "class" could then be referenced by all of the objects in the collection, with only the intrinsic cataloging information, local to each object, documented in the object itself. Each local object would thus inherit the bibliographic information identified in the shared bibliographic record.

The following is a hypothetical example of bibliographic inheritance based on records created by the University of Virginia Library's Electronic Text Center. (For a description of that project, see Gaynor [Gaynor].)

Object 1 with localized bibliographic information:

<CITATION SCHEME="TEI-HEADER">
<HEADER TYPE=AACR2>
<FILEDESC>
<TITLSTMT>
<TITLE>On the pulse of morning [a machine-readable transcription]</TITLE>
<AUTHOR>Angelou, Maya</AUTHOR>
<RESP><NAME>Unknown</NAME><ROLE>creation of 
machine-readable edition</ROLE></RESP>
<RESP><ROLE>Conversion to TEI-conformant markup</ROLE><NAME>University
of Virginia Library Electronic Text Center</NAME></RESP>
</TITLSTMT>

<PUBSTMT>
<IDNO TYPE="ETC">Modern English, AngPuls</IDNO>
<DATE>1993</DATE>
<LINK REL=META HREF="http://foo.bar/etext/pubstmt">
</PUBSTMT>
...
</HEADER>
</CITATION>

Object 2 with localized bibliographic information:

<CITATION SCHEME="TEI-HEADER">
<HEADER TYPE=AACR2>
<FILEDESC>
<TITLSTMT>
<TITLE>The history of Lady Julia Mandeville [a machine-readable 
transcription]</TITLE>
<AUTHOR>Brooke, Frances, 1724?-1789</AUTHOR>
<RESP><NAME>Barbara Smith</NAME><ROLE>creation of 
machine-readable edition</ROLE></RESP>
<RESP><ROLE>Conversion to TEI-conformant markup</ROLE>
<NAME>University of Virginia Library Electronic Text Center</NAME></RESP>
</TITLSTMT>

<PUBSTMT>
<IDNO TYPE="ETC">Modern English, BroLady</IDNO>
<DATE>1993</DATE>
<LINK REL=META HREF="http://foo.bar/etext/pubstmt">
</PUBSTMT>
...
</HEADER>
</CITATION>

Inherited bibliographic information:

<PUBSTMT>
<RESP><NAME>University of Virginia Library</NAME>
<ROLE>publisher</ROLE></RESP>
<ADDRESS>Charlottesville, Va.</ADDRESS>
<AVAIL><P>Available for anonymous ftp at etext.lib.virginia.edu</P>
<P>Copies of this file are also available to University of
Virginia faculty, staff, and students; please contact the Electronic
Text Center</P>
<P>Available commercially from: Project Gutenberg</P>
</AVAIL>
</PUBSTMT>

A graphical representation of this relationship between local metadata information and its shared metadata record class is shown in Figure 1.

Figure 1: Relationship between local metadata information and its shared metadata record class. The metadata associated with any HTML object includes both the locally defined bibliographic information and the shared bibliographic information.
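
The inheritance relationship described in this section can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: records are modeled as flat dictionaries rather than TEI headers, the "link" key stands in for the LINK REL=META reference, and the SHARED table is a hypothetical in-memory substitute for records that would really be retrieved over the network. The function name resolve_record is likewise invented for this example.

```python
def resolve_record(local_record, shared_records):
    """Combine a local record with the shared record it links to.

    Fields in the local record take precedence over inherited ones.
    """
    merged = {}
    shared_url = local_record.get("link")
    if shared_url is not None:
        # A shared record that cannot be found simply contributes nothing.
        merged.update(shared_records.get(shared_url, {}))
    for field, value in local_record.items():
        if field != "link":
            merged[field] = value
    return merged


# Hypothetical stand-in for shared records fetched from their URLs.
SHARED = {
    "http://foo.bar/etext/pubstmt": {
        "publisher": "University of Virginia Library",
        "address": "Charlottesville, Va.",
    }
}

angelou = {
    "title": "On the pulse of morning",
    "author": "Angelou, Maya",
    "date": "1993",
    "link": "http://foo.bar/etext/pubstmt",
}

full_record = resolve_record(angelou, SHARED)
```

Letting local fields override inherited ones is a design choice made for this sketch; a cataloging community might instead decide that the shared record is authoritative, or that conflicts should be reported.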

5.3 Internal Description: Place-holders and Content-Negotiation

Storing descriptive information internal to the document solves the problem of resource migration that is associated with external linking. This type of storage also provides a much simpler framework for handling document versions and updates. Storing descriptive information internal to an HTML document, however, requires a "place-holder" framework for this information. Issues regarding implementation, acceptance and backward compatibility of this internal descriptive framework are discussed in the following sections.

5.3.1 Place-holders

One possibility for providing flexibility is to define a "place-holder" for this descriptive component. An additional element (e.g., CITATION) in the HEAD of the document could define an area for arbitrarily rich embedded description of the document, as in the following example. Maya Angelou's transcribed speech is described using a DTD that resulted from the OCLC/NCSA Metadata Workshop. The DTD from this workshop is found in Appendix 2.

<!DOCTYPE HTML PUBLIC "-//IETF//NEW DTD HTML//EN">
<HTML>
<HEAD>
<TITLE> Transcription of Maya Angelou's "On the Pulse of Morning" </TITLE>
<CITATION SCHEME = "Dublin Core" VERSION = "0.1">
<TITLE> On the Pulse of Morning</>
<AUTHOR SCHEME = "AACR2"> Angelou, Maya </>
<PUBLISHER>University of Virginia Library Electronic Text Center</>
<OTHERAGENT TYPE="TRANSCRIBER"> University of Virginia Electronic 
Text Center</>
<DATE> 1993 </>
<OBJECT> Poem </>
<FORM> 1 ASCII file </>
<SOURCE> Newspaper stories and oral performance of text at the 
presidential inauguration of Bill Clinton </>
<LANGUAGE> English </>
<RELATION TYPE = "CHILD"> 
http://foo.bar/1993/presidential-address/collection.html </RELATION>
<RELATION TYPE = "ALTERNATE-FORM"> 
http://foo.bar/1993/presidential-address/audio/06983921.au </RELATION>
</CITATION>

An obvious problem associated with this modification is that it requires both changes to the HTML DTD and agreement in principle on the "non-rendering" of the CITATION content in WWW clients.

With respect to the non-rendering of data, consider the hypothetical DATA element:

<DATA NAME="Keywords" CONTENT="Information Retrieval">

This statement might encode that the phrase "Information Retrieval" is associated with the DATA element named "Keywords". Note that the data is included as the value of the second attribute defined for the DATA element. Another possible HTML convention is to enclose the data in tags:

<DATA NAME="Keywords">Information Retrieval</DATA>

If the browser encountered the second DATA statement, the phrase "Information Retrieval" would display, but the display would be suppressed if an HTML document contained the first DATA statement. The situation for the CITATION element would be similar. While the bibliographic information in CITATION is potentially useful for discovery and retrieval, the rendering or display in browsers is somewhat problematic.

5.3.2 Content-Negotiation

Another possibility is to incorporate this descriptive component using SGML's "marked sections" and HTTP's "content negotiation" capabilities. Marked sections in SGML identify parts of a document that are not processed or that should be processed only under certain conditions [Herwijnen]. For example, parts of a document destined for different groups of readers can be kept together in a single document. A parser can be instructed to process the document for a particular group and deliver only the appropriate parts.

HTTP's content negotiation is a way for browsers and servers to "negotiate" for content [Behlendorf]. The following is a simple example of this type of negotiation:

Client: I'd like a copy of Maya Angelou's poem "On the Pulse of 
Morning" and any bibliographic information you may have in 
MARC or TEI formats.

Server: I do have this poem, however the only bibliographic 
information I have on this is encoded in the Dublin Core format.

to which the client could respond, either automatically or interactively, with one of the following:

Client: OK, give me both the poem and the bibliographic information
in Dublin Core.

Client: I can't handle that format of bibliographic information...
Only give me the poem.

Content negotiation allows the browser to tell the server what types of information it prefers or can easily handle, and the server automatically provides it, in one step. SGML marked sections and DTD declarations could encapsulate this bibliographic information and provide a way to include or exclude this information. The following is an example of this encapsulation of the CITATION in SGML:

<!DOCTYPE HTML PUBLIC "HTTP://PURL.ORG/NET/DTD/HTML-DC.DTD">
<HTML>
<HEAD>
<TITLE>Transcription of Maya Angelou's "On the Pulse of Morning"</TITLE>

<![ %if-citation; [

<CITATION>
<TITLE>On the Pulse of Morning</TITLE>
<AUTHOR SCHEME = "AACR2">Angelou, Maya</AUTHOR>
<PUBLISHER>
University of Virginia Library Electronic Text Center</PUBLISHER>
<OTHERAGENT TYPE="TRANSCRIBER">
University of Virginia Electronic Text Center</OTHERAGENT>
...

]]>

The corresponding DTD for this citation, located at "HTTP://PURL.ORG/NET/DTD/HTML-DC.DTD", would then specify either

<!ENTITY % if-citation "INCLUDE">

or

<!ENTITY % if-citation "IGNORE">

to include or omit the citation information, respectively.
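
The effect of the INCLUDE and IGNORE keywords can be imitated with a small Python sketch. This is not an SGML parser: it assumes the standard marked-section delimiters "<![" and "]]>" with a parameter entity reference as the keyword, it does not handle nested marked sections, and the function name apply_marked_sections and the dictionary of entity values are hypothetical.

```python
import re

# Matches <![ %name; [ ... ]]> and captures the entity name and the body.
# Nested marked sections are not handled by this simple pattern.
MARKED_SECTION = re.compile(r"<!\[\s*%([\w.-]+);\s*\[(.*?)\]\]>", re.DOTALL)


def apply_marked_sections(text, entities):
    """Keep or drop each marked section according to its entity's value."""

    def replace(match):
        keyword = entities.get(match.group(1))
        if keyword == "INCLUDE":
            return match.group(2)  # keep the body, drop the delimiters
        if keyword == "IGNORE":
            return ""              # drop the whole section
        return match.group(0)      # unknown entity: leave the section untouched

    return MARKED_SECTION.sub(replace, text)
```

Calling the function with {"if-citation": "IGNORE"} yields the document a client would see if it asked for the poem alone, while "INCLUDE" yields the document with its embedded citation.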

Content negotiation has been part of HTTP since the early days of the Web, but it requires that both the server and the client support it. Unfortunately, only a few HTTP servers, such as Apache and the W3O server, currently support it, and few WWW clients use it correctly and fully [Behlendorf].
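
The dialogue earlier in this section can be reduced to a simple selection rule, sketched here in Python. The sketch is a deliberately simplified model and not an implementation of HTTP's negotiation headers: formats are plain strings, and the function name choose_metadata_format and its parameters are assumptions made for illustration.

```python
def choose_metadata_format(preferred, offered, acceptable=()):
    """Pick the bibliographic format to deliver alongside the document.

    preferred:  formats the client asked for, in order of preference
    offered:    formats the server actually holds
    acceptable: other formats the client is willing to accept
    Returns the chosen format, or None if only the document should be sent.
    """
    for fmt in preferred:
        if fmt in offered:
            return fmt
    for fmt in offered:
        if fmt in acceptable:
            return fmt
    return None


# The client asks for MARC or TEI; the server holds only Dublin Core.
with_fallback = choose_metadata_format(
    ["MARC", "TEI"], ["Dublin Core"], acceptable=["Dublin Core"]
)
poem_only = choose_metadata_format(["MARC", "TEI"], ["Dublin Core"])
```

The two calls correspond to the two client responses in the dialogue: in the first the client accepts the Dublin Core record, and in the second it receives the poem without bibliographic information.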

6.0 Conclusion

Access to distributed networked information requires effective mechanisms for resource discovery and retrieval. Given the rapid development of the World Wide Web and the electronic dissemination of information, effective descriptive techniques become increasingly important. The META element provides a basis for document description, but it does not provide the richness necessary for complex descriptive schemes. Identifying the metadata scheme in use and adding attribute information for more precise element-level description are steps towards more accurate document description and more effective discovery and retrieval. A much more flexible environment is necessary to fully allow for richer document description. Future research in external references to bibliographic information and the growing interest in HTTP content negotiation look promising for this descriptive environment.

References

[AACR2]
Anglo-American Cataloguing Rules. Second Edition, 1988 Revision

[Behlendorf]
B. Behlendorf, "What Is Content Negotiation? Towards an Extensible Framework for an Ecology of Data Types", <http://www.organic.com/Staff/brian/cn/>

[Berners-Lee]
T. Berners-Lee, D. Connolly, "Hypertext Markup Language (HTML)", 1995, <URL:http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html-spec_toc.html>

[FGDC]
Federal Geographic Data Committee. 1994. Content standards for digital geospatial metadata (June 8). Federal Geographic Data Committee. Washington, D.C.

[Gaynor]
Gaynor, Edward. 1994. "Cataloging Electronic Texts: The University of Virginia Library Experience." Library Resources and Technical Services 38(4): 403-413 (October 1994).

[Heaney]
M. Heaney, "Object-Oriented Cataloging", Information Technology, September 1995

[Herwijnen]
Herwijnen, Eric van, Practical SGML, Kluwer Academic Publishers, 1994

[Maloney]
M. Maloney, L. Quin, "Hypertext links in HTML, draft-ietf-html-relrev-00.txt", <ftp://ietf.cnri.reston.va.us/internet-drafts/draft-ietf-html-relrev-00.txt>

[MARC]
Network Development and MARC Standards Office, ed. 1994. USMARC Format for Bibliographic Data. Washington, DC: Cataloging Distribution Service, Library of Congress.

[PURL]
PURL: Persistent Uniform Resource Locators, <http://purl.oclc.org>

[TEI]
Sperberg-McQueen, C. M., and Lou Burnard, eds. 1994. Guidelines for Electronic Text Encoding and Interchange. Chicago and Oxford: Text Encoding Initiative.

[Tillett]
B. Tillett, "Cataloging Rules and Conceptual Models", OCLC Distinguished Seminar Series, January 9, 1996, <http://purl.oclc.org/net/papers/tillett>

[Weibel]
S. Weibel, J. Godby, E. Miller, and R. Daniel "Elements of Network Object Description, OCLC/NCSA Metadata Workshop: The Essential Workshop Position Paper Workshop", March 1995, <URL:http://www.oclc.org:5046/conferences/metadata/metadata.html>

Appendix 1.0: A Sample Dublin Core Record Encoded in HTML 2.0

This appendix contains a simple Dublin Core record that conforms to the syntax of the META element in the current draft of the HTML 2.0 specification. This example describes an Internet Request for Comments (RFC) found on a Web page containing similar RFCs.

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
<HEAD>
<TITLE>Sample Dublin Core URC for a RFC</TITLE>
<META NAME = "CITATION" CONTENT = "Dublin Core">
<META NAME = "SUBJECT" CONTENT = "IETF RFC">
<META NAME = "SUBJECT" CONTENT = "URI">
<META NAME = "SUBJECT" CONTENT = "Uniform Resource Identifiers">
<META NAME = "TITLE" CONTENT = "A Unifying Syntax for the 
Expression of Names and Addresses of Objects on the Network as 
used in the World-Wide Web">
<META NAME = "TITLE" CONTENT = "Universal Resource Identifiers in WWW">
<META NAME = "AUTHOR" CONTENT = "T.Berners-Lee">
<META NAME = "PUBLISHER" CONTENT = "CERN">
<META NAME = "DATE" CONTENT = "1994">
<META NAME = "OBJECTTYPE" CONTENT = "monograph">
<META NAME = "FORM" CONTENT = "text/plain">
<META NAME = "IDENTIFIER" 
CONTENT = "gopher://gopher.es.net:70/0R0-57601-/pub/rfcs/rfc1630.txt">
<META NAME = "RELATION" 
CONTENT = "http://ds.internic.net/ds/dspg1intdoc.html">
<META NAME = "RELATION"  
CONTENT = "http://ds.internic.net/rfc/rfc1738.txt">
</HEAD>
</HTML>

A problem with this type of example is that the Dublin Core elements are ambiguous due to the lack of element modifiers. A more useful record, for example, might identify the first RELATION element as a "parent" and the second as a "sibling" of the object being described. Additionally, it would identify the type of access methods for these objects, in this case in the form of URLs. The TITLE elements might be clarified: the first is a main title and the second is a subtitle. Since the HTML META element currently defined for HTML 2.0 severely constrains the richness of the description that can be created, members of the HTML Working Group are discussing several proposals.

In the following, SCHEMEs and TYPEs are used to modify the META element for a richer descriptive framework corresponding to the Dublin Core element set.

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
<HEAD>
<TITLE>Sample Dublin Core URC for a RFC</TITLE>
<META NAME = "CITATION" SCHEME="Dublin Core" CONTENT = "0.1">
<META NAME = "SUBJECT" CONTENT = "IETF RFC">
<META NAME = "SUBJECT" CONTENT = "URI">
<META NAME = "SUBJECT" CONTENT = "Uniform Resource Identifiers">
<META NAME = "TITLE" TYPE="MAIN" CONTENT = "A Unifying Syntax for 
the Expression of Names and Addresses of Objects on the Network as 
used in the World-Wide Web">
<META NAME = "TITLE" TYPE="ALT" CONTENT = "Universal Resource 
Identifiers in WWW">
<META NAME = "AUTHOR" SCHEME="AACR2" CONTENT = "Berners-Lee, Tim">
<META NAME = "PUBLISHER" CONTENT = "CERN">
<META NAME = "OTHERAGENT" TYPE="CATALOGER" SCHEME="AACR2" 
CONTENT = "Miller, Eric J.">
<META NAME = "DATE" CONTENT = "1994">
<META NAME = "OBJECTTYPE" CONTENT = "monograph">
<META NAME = "FORM" SCHEME="IMT" CONTENT = "text/plain">
<META NAME = "IDENTIFIER" SCHEME="URL" 
CONTENT = "gopher://gopher.es.net:70/0R0-57601-/pub/rfcs/rfc1630.txt">
<META NAME = "RELATION" TYPE = "CHILD" SCHEME="URL"  
CONTENT = "http://ds.internic.net/ds/dspg1intdoc.html">
<META NAME = "RELATION" TYPE = "SIBLING" SCHEME="URL" 
CONTENT = "http://ds.internic.net/rfc/rfc1738.txt">
</HEAD>
</HTML>

Appendix 2.0: A Proposed Document Type Definition for the Dublin Metadata Core

The following sample DTD is included to make the proposals in this report more precise and to promote discussion among those wishing to make improvements. Since the SCHEME qualifier, as defined by the participants in the Dublin Metadata Workshop, is potentially ambiguous, the TYPE qualifier has also been defined. SCHEME is used when reference is made to an external authority for notation or vocabulary. TYPE is used to describe a subclass of a generic element. For example, the TITLE element has subclasses such as Subtitle, Main Title, and Alternate Title. The proposed distinction between SCHEME and TYPE is a suggestion, and illustrates a problem that will arise when the current version of the Dublin Core is implemented and describes real data. Much more discussion is required before the problem of element modification can be adequately resolved.

The DTD in this appendix specifies one possible syntax for an SGML version of the Dublin Core. It is for illustrative purposes only.


<!--
This is the ISO8879:1986 document type definition for the default
Dublin Core element set.  This DTD is subject to discussion and change
by the members of the library community, IETF's URI/URC working group,
or by anyone who cares to discuss this information.

This DTD results from the OCLC/NCSA Metadata Workshop,
Dublin, Ohio, March 3, 1995.  Additional information regarding this
workshop, attendees and a working group position paper are available
at <URL:http://www.oclc.org:5046/conferences/metadata/metadata.html>
-->

<!-- ============ Parameter Entities =============== -->

<!-- Parameter entities for Elements
    These define the content models for the elements in the default
    set. The convention for those names is n.xxx, where xxx is the
    generic identifier of the element. -->

<!ENTITY   %  n.Instance "(Author | Title | Subject | Identifier |
                    URN | Location | URL | Form | Publisher | Date |
                    ObjectType | OtherAgent | Relation | Source |
                    Language | Coverage)">
<!ENTITY   % n.CITATION "(%n.Instance; | Instance)">

<!-- CITATION is the element that contains all the others. Instance is an
  element that can group together any of the other elements, but not
  itself. -->

<!ENTITY   %  n.Subject     "(#PCDATA)" >
<!ENTITY   %  n.Title       "(#PCDATA)" >
<!ENTITY   %  n.Author      "(#PCDATA)" >
<!ENTITY   %  n.OtherAgent  "(#PCDATA)" >
<!ENTITY   %  n.Publisher   "(#PCDATA)" >
<!ENTITY   %  n.Date        "(#PCDATA)" >
<!ENTITY   %  n.Relation    "(#PCDATA)" >
<!ENTITY   %  n.ObjectType  "(#PCDATA)" >
<!ENTITY   %  n.Form        "(#PCDATA)" >
<!ENTITY   %  n.Identifier  "(#PCDATA)" >
<!ENTITY   %  n.Source      "(#PCDATA)" >
<!ENTITY   %  n.Language    "(#PCDATA)" >
<!ENTITY   %  n.Coverage    "(Spatial | Temporal | #PCDATA)*" >
<!ENTITY   %  n.Spatial     "(#PCDATA)" >
<!ENTITY   %  n.Temporal    "(#PCDATA)" >

<!-- Parameter entities for Attributes
    Almost all of the elements can have a "scheme" attribute that
    can be used to more precisely indicate their semantics.
    Many of the elements can have a "type" attribute as well in order
    to specify additional structure.
 -->

<!ENTITY   % Author.Type
             "Name | Email | OtherType" >
<!ENTITY   % Author.Scheme
             "AACR2 | DUNS | OtherScheme" >

<!ENTITY   % Title.Type
             "Main | SubTitle | PartTitle | Alternate |
              Abbrev | OtherType" >

<!ENTITY   % Title.Scheme
             "AACR2 | OtherScheme" >

<!ENTITY   % Subject.Scheme
             "LCSH | MeSH | Sears | Abstract | OtherScheme" >

<!ENTITY   % Identifier.Scheme
             "URN | URL | LCCN | ISBN | ISSN | SICI | MessageID |
              FPI | UPC | OtherScheme" >

<!ENTITY   % Form.Scheme
             "IMT | X.400 | OtherScheme">

<!ENTITY   % Publisher.Scheme
             "AACR2 | DUNS | OtherScheme" >

<!ENTITY   % Date.Scheme
             "RFC822 | YYYY | YYYY-MM-DD | OtherScheme" >

<!ENTITY   % OtherAgent.Type
             "Editor | Sponsor | Principal | Compiler | Funder |
              Composer | Cataloger | Illustrator | Translator |
              OtherType"  >
<!ENTITY   % OtherAgent.Scheme
             "AACR2 | OtherScheme" >

<!ENTITY   % Relationship.Scheme
             "URN | URL | LCCN | ISBN | ISSN | SICI | MessageID |
              FPI | OtherScheme" >
<!ENTITY   % Relationship.Type
             "Supersedes | Continues | Continued.From |
              Contained.In | Superseded.By | Cites | Extracted.From |
              Is.Part.Of | Contains | IsIndexOf | IsIndexedBy |
              GlossaryOf | Predecessor | Successor | IsDerivativeOf |
              Child | Parent | Sibling | OtherType " >

<!-- Element list: Subject to change as this thing gets refined. Some
     elements for meta-metadata (version info on the URC itself)
     are the most likely candidates for addition. -->

<!ELEMENT   CITATION       - -  (%n.CITATION;)* >

<!ELEMENT   Author      - -     (%n.Author;)
 -- Name of the persons and organizations primarily responsible for
    the intellectual content of the resource. Encode one name per
    element. For personal names use Last, First (or whatever
    the cultural norm is for sorted lists of names). -- >
<!ATTLIST Author     Type     (%Author.Type;)         #IMPLIED
                        Scheme   (%Author.Scheme;)       #IMPLIED >

<!ELEMENT   Title       - -     (%n.Title;)
 -- The name of the object, if it has one. -- >
<!ATTLIST Title      Type     (%Title.Type;)          #IMPLIED
                        Scheme   (%Title.Scheme;)        #IMPLIED >

<!ELEMENT   Subject     - -     (%n.Subject;)
 -- The field of knowledge to which the resource belongs. The default
    content of the subject element is simple keywords. The scheme
    attribute can be used to indicate the use of a controlled
    vocabulary. -- >
<!ATTLIST Subject    Scheme   (%Subject.Scheme;)      #IMPLIED >

<!ELEMENT   Identifier  - -     (%n.Identifier;)
 -- String or number used to uniquely identify this resource.
    Typically the URN element will be used in favor of the URN
    attribute on this element. Other identification schemes may
    also be used. -- >
<!ATTLIST Identifier Scheme   (%Identifier.Scheme;)   #IMPLIED >

<!ELEMENT   Form        - -     (%n.Form;)
 -- The particular data representation of the resource. Typically
    this will be an Internet Media Type (formerly known as MIME
    content type). In such a case the SCHEME attribute should be used
    to identify it. -- >

<!ATTLIST Form       Scheme   (%Form.Scheme;)         #IMPLIED >

<!ELEMENT   Publisher   - -     (%n.Publisher;)
 -- The agent or agency responsible for making the resource
    available. The value of this element should follow the
    guidelines for the AUTHOR element. -- >
<!ATTLIST Publisher  Scheme   (%Publisher.Scheme;)    #IMPLIED >

<!ELEMENT   Date        - -     (%n.Date;)
 -- The date of publication. The scheme attribute can be used to
    indicate the particular format of the date string. -- >
<!ATTLIST Date       Scheme   (%Date.Scheme;)         #IMPLIED >

<!ELEMENT   ObjectType  - -     (%n.ObjectType;)
 -- The abstract category of the resource, such as article, image,
    dictionary, etc. -- >

<!ELEMENT   OtherAgent  - -     (%n.OtherAgent;)
 -- Other person(s) and/or organization(s) who have made a
    significant contribution to the resource. The value of this
    element should follow the guidelines for the AUTHOR element.
    The Author and Publisher elements are shorthand for using the
    Author and Publisher attributes of this element. -- >
<!ATTLIST OtherAgent Type     (%OtherAgent.Type;)     #IMPLIED
                        Scheme   (%OtherAgent.Scheme;)   #IMPLIED >

<!ELEMENT   Relation    - -     (%n.Relation;)
 -- Relationship of this resource to another resource. This
    element should specify what the relationship is, as well as
    the target of the relationship. The TYPE attribute is used for
    this purpose, the SCHEME attribute indicates how the
    destination is encoded.  -- >
<!ATTLIST Relation   Type     (%Relationship.Type;)   #IMPLIED
                        Scheme   (%Relationship.Scheme;) #IMPLIED >

<!ELEMENT   Source      - -     (%n.Source;)
 -- Objects, either electronic or printed, from which this
    resource was derived. This is a special case of the RELATION
    element. -- >

<!ELEMENT   Language    - -     (%n.Language;)
 -- The natural language(s) of the resource. When more than
    one Language element is specified, it indicates that more than
    one language is used to a significant degree in the work. No
    inference should be made about the relative proportions of the
    language content based on the order of appearance of the
    Language element. -- >

<!ELEMENT   Coverage    - -     (%n.Coverage;)
 -- The spatial extent and/or temporal duration characteristic of
    the resource, e.g. "19th Century France". -- >

<!ELEMENT   Spatial     - -     (%n.Spatial;)
 -- For more precise indication of the spatial extent characteristic
    of the resource. -- >

<!ELEMENT   Temporal    - -     (%n.Temporal;)
 -- For more precise indication of the temporal duration
    characteristic of the resource. -- >