[Cache from http://www.tei-c.org.uk/Vault/ED/edw55.htm; please use the canonical source if possible.]

Form for Draft Chapters of the TEI Guidelines


C. M. Sperberg-McQueen

Lou Burnard

TEI ED W55

5 June 1996

Status: This document was prepared by the editors of the Text Encoding Initiative; it will be submitted to the TEI Technical Review Committee at its meeting of 27 June 1996 in Bergen, Norway, for discussion and action. In its current state, it is publicly accessible and may be quoted from and publicized, but it does not represent official policy of the TEI until approved by the Technical Review Committee. Readers wishing to comment on the draft should send comments to the authors or to the head of the Technical Review Committee.


This document describes the format to be used in preparation of draft sections for incorporation into the TEI Guidelines. Such drafts must use the SGML tagging described here; drafters are encouraged to follow the stylistic and editorial conventions listed here as well. The recommendations of this document may also be applied in other documents prepared by TEI work groups.

This document is itself in the recommended format.[1]

1 The ODD and TEI Lite Formats

All official working papers of TEI work groups should be prepared using SGML. Work group members who do not know how to use SGML should plan to learn, if they wish to contribute directly to the drafting process.

Drafts of new or revised sections of the Guidelines must be prepared and submitted using the current version of the ODD (One Document Does it all) DTD, which is discussed below.

For other documents, including working papers, minutes, etc., work groups are strongly encouraged to use TEI Lite, or any other appropriate TEI-conformant DTD, if only because this will simplify the task of publishing them for public comment and of presenting them to the Technical Review Committee.

[Alternate wording:

Other documents, including working papers, minutes, etc., may be prepared using the ODD DTD, the TEI Lite DTD, or any other TEI-conformant DTD. This simplifies the maintenance of the TEI document archive and ensures consistency in TEI materials made available for public comment and action by the Technical Review Committee.

End alternate wording.]

TEI Lite is a view of the main TEI DTD, with minor extensions. It is defined in document TEI U5, which is available in both TEI-conformant and HTML-encoded forms. A single-file non-modifiable version of the DTD may be retrieved from the TEI's file servers. TEI Lite contains a small number of elements specifically intended for technical documentation: these are described in section 17 of TEI U5. Though adequate for most general purpose documentation, TEI Lite does not contain everything needed for material to be included within the Guidelines.

Version 2 of the ODD DTD is a view of the TEI main DTD, very similar to TEI Lite, but with additional extensions specific to the production of the Guidelines. These include

These extensions are derived from the following sources, which should be consulted for more detailed descriptions than are now provided by the present document:[2]

The DTDs for ODD and TEI Lite both have formal public identifiers; they may thus be used as document types in their own right, as well as following the normal TEI conventions for extensions. A TEI Lite document might begin thus, using the single-file version of the DTD:

 
<!DOCTYPE TEI.2 PUBLIC '-//TEI//DTD TEI Lite 1.0//EN' >
or thus, using the normal TEI methods for extending the DTD:
 
<!DOCTYPE TEI.2 PUBLIC "-//TEI P3//DTD Main Document Type 1995-09//EN" [
<!ENTITY % TEI.prose    'INCLUDE' >
<!ENTITY % TEI.linking  'INCLUDE' >
<!ENTITY % TEI.analysis 'INCLUDE' >
<!ENTITY % TEI.figures  'INCLUDE' >
<!ENTITY % TEI.extensions.ent SYSTEM 'teilite.ent' >
<!ENTITY % TEI.extensions.dtd SYSTEM 'teilitex.dtd' >
]>

An ODD document can similarly begin with a formal public identifier for ODD itself:

 
<!DOCTYPE TEI.2 PUBLIC '-//TEI//DTD ODD System ver. 2//EN' >
or thus, using the normal TEI methods for extending the DTD:
 
<!DOCTYPE TEI.2 PUBLIC "-//TEI P3//DTD Main Document Type 1995-09//EN" [
<!ENTITY % TEI.prose    'INCLUDE' >
<!ENTITY % TEI.linking  'INCLUDE' >
<!ENTITY % TEI.analysis 'INCLUDE' >
<!ENTITY % TEI.figures  'INCLUDE' >
<!ENTITY % TEI.extensions.ent SYSTEM 'oddx.ent' >
<!ENTITY % TEI.extensions.dtd SYSTEM 'oddx.dtd' >
]>
The files oddx.ent and oddx.dtd are defined fully by the DTD fragments embedded in this document.

2 Components of Drafts

Drafts for sections of the Guidelines are not complete if they do not have:

The purpose of the ODD system is to simplify the task of preparing all of these in a consistent and complete manner. This is done partly by following a fairly rigid set of style guidelines and conventions, and partly by the use of special purpose programs which can, for example, automatically generate parts of the reference documentation, enforce consistency of description, etc.

2.1 Canonical Form for Sections of the Guidelines

The TEI prescribes no required format for the overall organization of a chapter; the only requirement is that the organization be clear and easy to follow. Some work groups have preferred to work bottom up, starting at the lowest level (smallest) elements, and working up to group them in larger and larger units. Others have preferred to work top down. One or two tried to work out from the middle, but those chapters were heavily revised before publication.

When a chapter defines a base or additional tag set, the introductory material at the beginning of the chapter should include a brief example demonstrating how the tag set described is selected.
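As a minimal sketch of such an introductory example, a chapter defining an additional tag set might show the parameter entity that selects it. The entity name TEI.blort below is hypothetical (it follows the pattern of TEI.linking and TEI.figures in the document type declarations shown earlier); a real chapter would give the name actually assigned to its tag set.

```sgml
<!DOCTYPE TEI.2 PUBLIC "-//TEI P3//DTD Main Document Type 1995-09//EN" [
<!-- base tag set, as in the earlier examples -->
<!ENTITY % TEI.prose 'INCLUDE' >
<!-- hypothetical additional tag set defined by the chapter -->
<!ENTITY % TEI.blort 'INCLUDE' >
]>
```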

When a chapter of the Guidelines defines a particular tag set, that tag set is itself formally defined using DTD fragments. It is generally helpful to include a large DTD fragment with references to all the other DTD fragments in the chapter, to show the reader how things fit together.[3] This may be near the front of the chapter, as a sort of overview, or at the end, as a sort of wrapup. It is seldom a good idea for it to be in the middle.

Within a chapter documenting a particular tag set, some sections will be just normal prose paragraphs discussing issues or problems addressed by the tag set or its application. The majority will however introduce some of the elements of the tag set and their attributes, and the DTD fragment which defines them. A canonical form exists for such sections, which drafters should deviate from only in exceptional circumstances. Sections in canonical form contain:

This `canonical form' should be followed wherever practicable, in the interests of maintaining consistency with the rest of the Guidelines, and thus reducing confusion (if only of the editors).

2.2 Things to Watch For

Each section of the canonical form should be checked for the following common vices.[4]

Check each introductory paragraph to ensure that it is clear and short. A sentence or two often suffices --- don't let it swell to more than half a page in any case.

Check each element list for:

In hard-copy output or soft-copy display, lists of attribute values should be appropriately introduced: closed sets with "Legal values are:", open sets with "Sample values include:" and semi-closed sets with "Suggested values include:" --- in case of doubt, check the declaration at the end of the section to see whether an attribute has a closed, open, or semi-closed set of values.
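The distinction can often be read off the attribute declaration itself. The sketch below uses a hypothetical element blort with hypothetical attributes; it assumes the usual practice that a closed set is declared as an enumerated group of name tokens, while open and semi-closed sets are both declared as CDATA and can be told apart only from the accompanying prose.

```sgml
<!-- status: closed set (enumerated); only the listed
     values are legal.
     type: open or semi-closed set (CDATA); any string
     is accepted by the parser. -->
<!ELEMENT blort         - -  (#PCDATA)                          >
<!ATTLIST blort              %a.global;
          status             (draft | final)     'draft'
          type               CDATA               #IMPLIED       >
```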

Check each commentary section to see that:

Check each example for:

Check each cross reference for:

Check each DTD fragment for:

3 House Style

Section 1.1.2 of P3 lists the `notational conventions' followed in that document: most of these relate to the "canonical form" discussed above, but they also address a number of stylistic issues which should be followed in new drafts.

We list here, in no particular order, a number of other conventions which should be followed in preparing TEI drafts.

3.1 Element Usage

In marking various special types of term or phrase, be consistent:

elements
For element names (generic identifiers) in continuous prose, in general use the <gi> element. If there is no reference to the abstract element type, but just an example of a tag marking an element instance, use <tag> instead. The presence of attribute value specifications is prima facie evidence that the phrase in question is a <tag> not a <gi>, as is presence of a slash before the name to simulate an end-tag. For tags in examples, use the normal literal string enclosed within an <eg> element and a CDATA marked section.
attributes
In continuous prose, tag attribute names using the <att> element. Attribute-value specifications of the form name - value indicator (i.e. equals sign) - value should be tagged using <avs>. Values should be tagged using <val>. In examples none of these elements should be necessary.
entity names
In continuous prose, tag names of entities using <ent>.
SGML examples
Each distinct example must be enclosed within a CDATA marked section inside an <eg> element. Spacing and lineation should follow the practice of the published Guidelines.
markup declarations
These should be treated in the same way as SGML examples. They may however be run on when appearing within running text.
tables of contents
Tag any sample tables of contents as simple lists, not numbered (ordered) or unordered (bulleted) lists. Give section numbers explicitly. In most documents, tables of contents should not be given explicitly; the <divGen> element should be used instead, with a type value of toc; this indicates where the table of contents produced by the formatting engine should go.
lists of terms
Most lists of terms and headed lists should be tagged as glossary lists using the <list type=gloss> element.
Consistent use of the <gi>, <att>, and similar elements will allow better automatic generation of indices.
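A short sketch may make these distinctions concrete. The sentences and list entries below are invented for illustration and are not taken from the Guidelines; the entity name mdash is used only as a convenient example.

```sgml
<p>The <gi>div1</gi> element may bear the
<att>n</att> attribute.  A start-tag such as
<tag>div1 n='23'</tag> marks one instance of the
element, and <tag>/div1</tag> closes it.  Here
<avs>n='23'</avs> gives the section number, with
<val>23</val> as the value.  Use the entity
<ent>mdash</ent> for a dash.</p>

<!-- table of contents produced by the formatter -->
<divGen type=toc>

<!-- a headed list of terms as a glossary list -->
<list type=gloss>
<label>GI</label><item>generic identifier</item>
<label>FPI</label><item>formal public identifier</item>
</list>
```

The first <tag> carries an attribute value specification and the second a leading slash, which is why both are tagged as <tag> rather than <gi>.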

3.2 Terminology

Drafters should attempt to be consistent in using SGML and TEI terminology. Some have suggested that the TEI should develop and maintain a database of such specialized terminology; until that happens, here is a short list of some points which arise frequently:

3.3 Naming Conventions

Names of elements should be natural-language words in English. Names from other languages are appropriate if the name used is a technical term standard among Anglophone scholars, e.g. enjambement, ablaut, sandhi. Elements should normally be given names which are full natural language words or phrases rather than abbreviations, unless the abbreviation is very common or it is expected that the element will appear frequently when used -- more than once per page. Consistent rules should be used in forming abbreviations.

Other things being equal, names formed from single words should normally be preferred to names formed from phrases. Nouns and adjectives should be preferred to verbs or adverbs.

Consistency with existing TEI naming methods should be maintained wherever possible, especially for compound element names. For example, use <listBlort> for an element which is a list of blorts (rather than <blortList>); use <blortStruct> for a structured blort (by analogy with <biblStruct>), etc.

Consistency with existing names is also important when naming attributes: for example, an attribute with the name target will normally be expected to have a declared value of IDREF and to behave like the target attribute on <ptr>, <note>, etc.

Names of elements should always be given in lower case, except where they consist of more than one word, in which case the second (and subsequent) words should be capitalized: p or list but partOfSpeech.
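To illustrate, declarations for the hypothetical blort elements mentioned above might be sketched as follows. The element names, the content models, and the blortName and blortPtr elements are all invented for this example; only the naming pattern and the use of IDREF for target are the point.

```sgml
<!ELEMENT listBlort     - -  (blort | blortStruct)+             >
<!ATTLIST listBlort          %a.global;                         >
<!ELEMENT blort         - -  (#PCDATA)                          >
<!ATTLIST blort              %a.global;                         >
<!ELEMENT blortStruct   - -  (blortName, blort*)                >
<!ATTLIST blortStruct        %a.global;                         >
<!ELEMENT blortName     - -  (#PCDATA)                          >
<!ATTLIST blortName          %a.global;                         >
<!-- target is declared as on ptr and note: an IDREF -->
<!ELEMENT blortPtr      - O  EMPTY                              >
<!ATTLIST blortPtr           %a.global;
          target             IDREF               #REQUIRED      >
```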

The recommendations summarized here are given in more detail in document TEI ML W26.

3.4 Positioning

In general: use "tight" positioning. Do not introduce additional whitespace after any SGML tag, other than a newline. Tags which delimit structural units (<p>, etc.) may be on lines of their own. For example:

 
<p>This paragraph is correctly <hi>tagged</hi>.
</p>
<p>  This paragraph is incorrectly
<hi>tagged   </hi>
     </p>
<!-- and is missing its closing punctuation
     to boot -->
This rule makes it significantly easier to produce acceptable typeset output in TeX, Waterloo Script, or other programs.

Every text-division element (<div>, <div1>, etc.) should start a new line. If its n attribute is used to contain a section number, numbering must be consistently and correctly applied. When, as is usually the case, it is followed by a <head> element giving a title for the section, this must follow directly on with no intervening spaces, or be at the beginning of the next line. Do not include section numbering within the <head> element. For example:

 
<!-- this is correct -->
<div1 n='23'><head>Concerning Identifiers</head>

<!-- and this is correct -->
<div1 n='23'>
<head>Concerning Identifiers</head>

<!-- but this is incorrect -->
<div1 n='23'>
   <head>Section 23:  Concerning Identifiers</head>  

3.5 Section IDs

It is convenient, when making and checking cross references, if sections are given ID values which exhibit the same structure as the sections they identify. In preparing TEI P3, the editors used the following system, which we recommend to work groups:

For example, here are some of the identifiers assigned to the various sections of chapter 6 of TEI P3:

As can be seen, each second-level identifier adds two characters to the identifier stem; the third-level identifiers add one, two, or three characters more. In general, the characters are chosen to suggest the name of the section; occasionally, the subsections are simply numbered through. Either technique can lose its semi-mnemonic character when sections are given new titles, rearranged, or merged with other sections. In the authors' experience, sections are rearranged more often than they are radically renamed; mnemonic letters thus remain mnemonic longer than numbers.

It is not essential to follow this scheme; any method of assigning IDs can be used as long as the result is reasonably easy to follow.
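The general pattern described above can be sketched as follows. The identifiers, section numbers, and headings are hypothetical, chosen only to show a chapter-level stem (here BL), two further characters at the second level, and one more at the third; they are not identifiers actually used in TEI P3.

```sgml
<div1 id=BL n='40'><head>Blorts</head>
<!-- ... -->
<div2 id=BLNA n='40.1'><head>Naming Blorts</head>
<!-- ... -->
<div3 id=BLNAX n='40.1.1'><head>Exceptions</head>
<!-- ... -->
</div3>
</div2>
</div1>
<!-- elsewhere, a cross reference to the subsection -->
<ptr target=BLNAX>
```

Because each identifier begins with that of the enclosing section, a reader checking the cross reference can see at once which chapter and section it points into.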

3.6 Spelling and Punctuation

After much debate, the TEI editors have managed to agree on the `mid-Atlantic rule' for normalizing spelling. This rule states that, wherever possible, spellings which are distinctively British or distinctively American should not be used. This means that some words such as favor, labor, color or favour, labour, colour cannot, in effect, be used at all. In practice an unmarked synonym can usually be found.

In cases of doubt, follow Webster's Third International to see if a spelling seems distinctively British to American eyes; follow the Oxford English Dictionary (or Hart's Rules) to see if a spelling seems distinctively American to British eyes. Some specific issues which have made one or other editor see red are listed below in no particular order:

4 Phrase-Level Elements Provided by the ODD DTD

This and the following sections list in summary form the elements provided by the ODD DTD which are additional to those defined by the TEI P3 main DTD.[8]

Within any prose material in an ODD document, the following elements may be used:

<gi>
contains the generic identifier of some element type
<tag>
contains an SGML start- or end-tag.
<att>
contains the name of some SGML attribute.
<val>
contains a value to be given for some attribute.
<avs>
contains a full attribute-value specification (attribute name, equal-sign, attribute value), the components of which may but need not be tagged.
<ent>
contains an entity name.
<file>
contains a file name.
<fpi>
contains a formal public identifier.
<code>
contains a fragment of source code in some formal language.
<ident>
contains an identifier in some formal language (e.g. a variable name); also used for `syntactic variables' in syntax diagrams and the like.
<kw>
contains a keyword or reserved word in some formal language.
<lit>
contains a literal string in some formal language (with or without enclosing quotation marks).
<comment>
contains a comment in some formal language (with or without the enclosing delimiters).
<delim>
contains a delimiter in some formal language.
The elements <gi>, <att>, <ent>, and <file> are all syntactic sugar for <ident> with appropriate type values. The elements <tag> and <avs> are syntactic sugar for <code>.

In general, all occurrences of these elements within ODD documents should be tagged, though the individual components of a <code>, <tag>, or <avs> element need not be.

In running prose, these elements are used to pick out names of elements, attributes, files, etc., in much the same way that phrase-level elements of the TEI core are used to pick out emphatic phrases, quotations, or proper names. For example, two paragraphs from section 6.3.3 of TEI P3 are tagged this way in the SGML form of TEI P3:

 
<p>Quotation may be rendered by changes in type 
face, by special punctuation marks.... If
these characteristics are of interest, an 
appropriate value for the <att>rend</att> attribute 
should be given, to record how the <gi>q</gi> 
or <gi>quote</gi> element is
rendered.  For discussion of suggested 
values for this attribute, see
below.</p>
<!-- ... -->
<p>Alternatively, the encoder may suppress all 
quotation marks, possibly recording their form 
using the <att>rend</att> attribute.  Where this 
is done, the following list of entity names 
(taken from the public entity 
sets <ident type=entset>ISOpub</ident> and <ident
type=entset>ISOnum</ident>) may be found useful 
to describe quotation-mark styles common in 
European and American typesetting:
... </p>

The names ISOpub and ISOnum are here tagged as <ident> (identifier) elements, since the actual entity name used for the entity set is allowed, in principle, to vary. Since in practice the entity name almost never does vary, however, they could plausibly be tagged as <ent> instead.
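Under that alternative reading, the relevant lines of the passage might be sketched as below. The value pe for the type attribute is an assumption made here, on the grounds that entity sets are normally invoked through parameter entities; the declaration of <ent> in this section makes ge the default.

```sgml
(taken from the public entity
sets <ent type=pe>ISOpub</ent> and <ent
type=pe>ISOnum</ent>) may be found useful
```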

A paragraph in section 6.6 of TEI P3 illustrates how attribute values and attribute-value-specifications should be tagged.[9]

 
Here <avs>type=bibliog</avs> signals for 
the processing appropriate to a bibliographic 
reference, while <avs>targType='bibl
bibl.struct bibl.full'</avs> restricts the 
legal targets to bibliographic
elements, and the value <val>Chom59</val> on the
<att>target</att> attribute indicates which
bibliographic element actually is being referred to.  
For further discussion of bibliographic references, 
see section <ptr target=CObixr>.

The elements <file>, <code>, <ident>, <kw>, <lit>, <comment>, and <delim> are not specific to SGML, but can be used to discuss SGML as well as any other system or environment (e.g. a programming language) which has identifiers, keywords, literals, comments, and delimiters. As illustrated above, they may also be needed occasionally for SGML constructs. The <kw> element, in particular, is required for tagging SGML keywords or reserved names like PCDATA or ATTLIST.
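A brief sketch, with sentences invented for this illustration, shows how a few of these elements might be applied in a discussion of SGML declarations:

```sgml
<p>The reserved name <kw>ATTLIST</kw> introduces
an attribute definition list.  In a content model,
the keyword <kw>PCDATA</kw> is preceded by the
delimiter <delim>#</delim>.  The declarations are
stored in the file <file>oddx.dtd</file>, which is
referred to by the parameter entity
<ent type=pe>TEI.extensions.dtd</ent>.  The ODD DTD
has the public identifier
<fpi>-//TEI//DTD ODD System ver. 2//EN</fpi>.</p>
```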

These elements are defined formally as follows.

First, we declare all of these elements. Most can only contain literal characters, but some can contain phrase-level elements, or at least other types of tokens.

< 1 Define phrase-level elements > =

 
<!ELEMENT gi            - O  (#PCDATA)                          >
<!ATTLIST gi                 %a.global;                         >
<!ELEMENT tag           - O  (#PCDATA | %m.tokenTypes)*         >
<!ATTLIST tag                %a.global;
          type               (stag | etag)       'stag'         >
<!ELEMENT att           - O  (#PCDATA)                          >
<!ATTLIST att                %a.global;                         >
<!ELEMENT val           - O  (#PCDATA)                          >
<!ATTLIST val                %a.global;                         >
<!ELEMENT avs           - O  (#PCDATA | %m.tokenTypes)*         >
<!ATTLIST avs                %a.global;                         >
<!ELEMENT ent           - O  (#PCDATA)                          >
<!ATTLIST ent                %a.global;
          type               (ge | pe)           'ge'           >
<!ELEMENT file          - O  (#PCDATA)                          >
<!ATTLIST file               %a.global;                         >
<!ELEMENT fpi           - O  (#PCDATA)                          >
<!ATTLIST fpi                %a.global;                         >
<!ELEMENT code          - O  (%paraContent;)                    >
<!ATTLIST code               %a.global;                         >
<!ELEMENT ident         - O  (#PCDATA)                          >
<!ATTLIST ident              %a.global;                         >
<!ELEMENT kw            - O  (#PCDATA)                          >
<!ATTLIST kw                 %a.global;                         >
<!ELEMENT lit           - O  (#PCDATA)                          >
<!ATTLIST lit                %a.global;                         >
<!ELEMENT comment       - O  (#PCDATA | %m.tokenTypes)*         >
<!ATTLIST comment            %a.global;                         >
<!ELEMENT delim         - O  (#PCDATA)                          >
<!ATTLIST delim              %a.global;                         >

The tokenTypes class is defined to include all of these elements. The sgmlKeywords class of TEI P3 is extended to include the new SGML-related elements (i.e. all but <gi>, <att>, <tag>, and <val>, which are already members of the class), and the miscTokens class is defined to include the other elements not already part of the TEI P3 phrase class.

< 2 Define class of special token types > =

 
<!ENTITY % x.miscTokens ''                                      >
<!ENTITY % m.miscTokens '%x.miscTokens  
file | code | ident | kw | lit | comment | delim'               >

<!ENTITY % x.tokenTypes ''                                      >
<!ENTITY % m.tokenTypes '%x.tokenTypes  
att | avs | code | comment | delim | ent | file | fpi | gi 
| ident | kw | lit | tag | val'                                 >

Extensions to existing classes will be included in the DTD files at a different location.

< 3 Modify SGML Keywords class > =

 
<!ENTITY % x.sgmlKeywords 'avs | ent | fpi |'                   >
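The effect of this declaration can be sketched as follows, on the assumption (not shown in this document) that the main DTD declares the class in the same pattern as the classes above, with att, gi, tag, and val as its existing members. Because the first declaration of an entity takes precedence in SGML, the extension must be read before the default.

```sgml
<!-- read first, from the extensions file -->
<!ENTITY % x.sgmlKeywords 'avs | ent | fpi |'                   >

<!-- assumed form of the declarations in the main DTD;
     the empty default for x.sgmlKeywords is ignored
     because the entity has already been declared -->
<!ENTITY % x.sgmlKeywords ''                                    >
<!ENTITY % m.sgmlKeywords '%x.sgmlKeywords;
att | gi | tag | val'                                           >

<!-- m.sgmlKeywords is then equivalent to:
     avs | ent | fpi | att | gi | tag | val -->
```

The trailing vertical bar in the extension is what joins the new members to the existing ones when the reference is expanded.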

5 Inter-level Elements in the ODD DTD

Within or between paragraphs or other components, ODD documents may contain the elements described in the following sections: examples, DTD fragments, and lists of elements or other constructs.

5.1 Examples

Examples may appear within or between paragraphs or other components in any ODD document:

<eg>
contains an example of SGML tagging, other markup, user interaction, or other phenomenon which should be reproduced literally when the document is rendered.

The <eg> element is described in TEI P3 as part of the TSD; in ODD documents, however, it can appear not only in the reference documentation within the <exemplum> element, but between or within paragraphs or other text components.

Since <eg> elements normally contain examples of SGML tagging, the SGML software used to process the tag set documentation may erroneously treat the tags in the example as real markup, rather than as simple text, if special steps are not taken. The preferred method in TEI documents is to enclose the entire example in a CDATA marked section, and to enclose that marked section within the <eg> element.

The paragraph containing the first example in the preceding section, for example, looks like this in the SGML form of this document:

 
<p>In running prose, these elements are used
to pick out names of elements, ...
For example, two paragraphs from section 6.3.3 of TEI P3 are
tagged this way in the SGML form of TEI P3:
<eg><![ CDATA [
<p>Quotation may be rendered by changes in type 
face, by special punctuation marks.... If
these characteristics are of interest, an 
appropriate value for the <att>rend</att> attribute 
should be given, to record how the <gi>q</gi> 
or <gi>quote</gi> element is
rendered.  For discussion of suggested 
values for this attribute, see
below.</p>
<!-- ... -->
<p>Alternatively, the encoder may suppress all 
quotation marks, possibly recording their form 
using the <att>rend</att> attribute.  Where this 
is done, the following list of entity names 
(taken from the public entity 
sets <ident type=entset>ISOpub</ident> and <ident
type=entset>ISOnum</ident>) may be found useful 
to describe quotation-mark styles common in 
European and American typesetting:
... </p>
]]></eg>
</p>
It is normally a good idea to keep lines in examples short (45-50 columns wide or so), so that they work even on relatively narrow windows.

No particular style of indentation is prescribed; one acceptable method indents elements to show their level of nesting, two blank spaces for each level. Elements should be given on a single line if they are short enough to fit. Otherwise the end-tag should be aligned vertically with the start-tag, thus:

 
<blort>
  <farble>
    <granfalloon> ... </granfalloon>
    <granfalloon> ... </granfalloon>
    <granfalloon> ... </granfalloon>
  </farble>
  <feeble> ... </feeble>
  <!-- etc. -->
</blort>

The <eg> element is defined thus:

< 4 Define example > =

 
<!ELEMENT eg            - -  (#PCDATA)                          >
<!ATTLIST eg                 %a.global;
          TEIform            CDATA               'eg'           >

5.2 DTD Fragments

The definition of a TEI tag set must not only describe the elements and their attributes in prose, but also provide formal declarations for them in SGML, both in the running prose at the end of sections and in the relevant entries in the reference documentation. Within the prose of a section, these declarations are contained in <scrap> elements.[10]

<scrap>
contains a fragment of a DTD or program source code.

The following elements are also defined in the DTD, and may be encountered when reading ODD documents, but need not be used in drafts of the Guidelines:

<scrapInfo>
contains a <scrap> element accompanied by elements giving cross references to other scraps and index entries for elements defined or used in the scrap.
<recap>
contains the text of a scrap already encountered, with its references to other scraps partially resolved.

The <scrap>, <scrapInfo>, and <recap> elements are described in the Sweb documentation and should be used as described there. The Sweb documentation describes them in the general context of literate programming, as containers for program source code. In the ODD system, the main function of <scrap> is typically to contain not fragments of a program but fragments of a DTD, possibly interspersed with references to other DTD fragments.

Like the <eg> element, the <scrap> element will frequently need a CDATA marked section surrounding its content, or part of it. For example, the DTD fragment in section 5.2 of TEI P3 would look something like this, tagged as a <scrap> element:

 
 
<scrap id=dHD2 name='The file description'>
<![ CDATA [ 
<!ELEMENT fileDesc      - -  (titleStmt, editionStmt?, extent?, 
                             publicationStmt, seriesStmt?, 
                             notesStmt?, sourceDesc+ )          > 
<!ATTLIST fileDesc           %a.global;                         >
]]> 
<ptr target=dHD21> <!> 
<ptr target=dHD22> <!> 
<ptr target=dHD23> <!> 
<ptr target=dHD24> <!> 
<ptr target=dHD25> <!> 
<ptr target=dHD26> <!> 
</scrap> 
When the element contains references to other DTD fragments, however, as in this example, the use of marked sections can become unwieldy, and it may be preferred to insert empty comments (an exclamation point between two angle brackets) at strategic locations in the data, thus:[11]
 
 
<scrap id=dHD2 name='The file description'> 
<<!>!ELEMENT fileDesc      - -  (titleStmt, editionStmt?, extent?, 
                              publicationStmt, seriesStmt?,
                              notesStmt?, sourceDesc+ )         > 
<<!>!ATTLIST fileDesc           %a.global;                         > 
<ptr target=dHD21> <!> 
<ptr target=dHD22> <!> 
<ptr target=dHD23> <!> 
<ptr target=dHD24> <!> 
<ptr target=dHD25> <!> 
<ptr target=dHD26> <!> 
</scrap> 

The <scrapInfo> element is generated by Sweb and ODD processors; it contains a <scrap> element together with related cross-reference and indexing information. Since <scrapInfo> elements are generated automatically, they need never be written into the document by hand; they will be encountered only in documents which have been processed by the ODD system and are now being edited further by hand.

The preceding example might look like this, tagged with a <scrapInfo> element:[12]

 
<scrapInfo>
<head>5.2 The file description</head>
<scrap id=dHD2>
<<!>!ELEMENT fileDesc      - -  (titleStmt, editionStmt?, extent?,
                             publicationStmt, seriesStmt?,
                             notesStmt?, sourceDesc+ )          >
<<!>!ATTLIST fileDesc           %a.global;                         >
<ref target=dHD21>The title statement 5.2.1</ref>
<ref target=dHD22>The edition statement 5.2.2</ref>
<ref target=dHD23>The extent statement 5.2.3</ref>
<ref target=dHD24>The publication statement 5.2.4</ref>
<ref target=dHD25>The series statement 5.2.5</ref>
<ref target=dHD26>The notes statement 5.2.6</ref>
</scrap>
<scrapRefs>This fragment is used in 
<ref target=dHD11>Section 5.1.1 The TEI Header</ref>
</scrapRefs>
<indexDefs>
  <index index=elements level1='fileDesc' rend='defined'>
</indexDefs>
<indexRefs>
  <index index=elements level1='titleStmt'   rend='used'>
  <index index=elements level1='editionStmt' rend='used'>
  <index index=elements level1='extent'      rend='used'>
  <index index=elements level1='publicationStmt' rend='used'>
  <index index=elements level1='seriesStmt'  rend='used'>
  <index index=elements level1='notesStmt'   rend='used'>
  <index index=elements level1='sourceDesc'  rend='used'>
</indexRefs>
</scrapInfo>

After the ODD processors generate the indexing information, it can be edited by hand if necessary.

The <scrap> element is defined thus:

< 5 Define scrap > =

 
<!ELEMENT scrap         - -  (%model.scrap)                     >
<!ATTLIST scrap              %a.global;
          name               CDATA               #IMPLIED
          file               CDATA               #IMPLIED
          version            IDREFS              #IMPLIED
          index              CDATA               'auto'         >
<!-- recap and scrapInfo are defined elsewhere -->

The declaration for <scrap> refers to the parameter entity model.scrap, which is defined thus:

< 6 Define content model for DTD fragments > =

 
<!ENTITY  % model.scrap  '(#PCDATA | ptr | ref)*'               >

Declarations for the <scrapInfo> and <recap> elements are given in the appendix.

5.3 Lists of Elements, Attributes, and Values

The canonical form of sections requires that elements to be discussed in a section be listed at the beginning of the section. This should be done with the following elements:

<listElements>
contains a series of <gi> - <desc> - <listAtts> triplets, or <gi> - <desc> pairs. Where attributes should be described, the <listAtts> element should be included; where no attributes need be described, it should be omitted.
<listAtts>
contains a series of <att> - <desc> - <listVals> triplets, or <att> - <desc> pairs. Where particular values of the attribute should be described, the <listVals> element should be included; otherwise it should be omitted. Attributes include:
    atts
    contains a set of IDREFs indicating the attributes to be described; the IDs should be those of the <attDef> elements in the reference documentation; this attribute is ignored if the <listAtts> element is not empty.
<listVals>
contains a series of <val> elements or <val> - <desc> pairs. Attributes include:
    vals
    contains a set of IDREFs indicating the values to be described; the IDs should be those of the <val> elements in the reference documentation; this attribute is ignored if the <listVals> element is not empty.
    type
    indicates whether the set of values is closed (no others are legal), open (these are just samples), or semi (the list is not exhaustive, but the values given should be used when they apply, and TEI-aware software should work appropriately with them).
These are all syntactic sugar for the generic <listDescs> with appropriate values for its itemType attribute:
<listDescs>
contains a series of <label> - <desc> - <listDescs> triplets, <label> - <desc> pairs, or simple <label> elements. Where the list entry should include a list of attributes, fields, arguments, children, values, or other associated (usually subordinate) items, the <listDescs> element should be included; otherwise, it should be omitted. Attributes include:
itemType
identifies the type of items being described (or to be described) by the nested list. Legal values are:
elements
attributes
values
members
classes
range
arguments
other
items
contains a set of IDREFs indicating the items to be described; the IDs should be those of the <desc> elements describing those items, or of the <val> or <ident> elements giving the values or identifiers to be listed; this attribute is ignored if the <listDescs> element is not empty.

In the preparation of ODD documentation for tag sets, these elements will normally contain full descriptions of the elements, attributes, or values being described. In the course of ODD processing, the descriptions will be copied into the reference documentation (on which see further below), and the descriptions in the running prose will be marked as redundant copies by means of the sameAs attribute.[13]

In documents which have already been processed at least once by an ODD processor, the description may occur both within a <listElements> element and within the <tagDoc> element for the element type. One of these is the `master' version and one is a copy and will be ignored in future processing. When editing an ODD document, be sure not to make your manual changes to the copy; make the changes to the master version.

As an example of these elements, consider the following merger of the element lists which begin this section:

 
 
<listElements>
  <gi>listElements</gi>
  <desc>contains a series of
    <gi>gi</gi> - <gi>desc</gi> - <gi>listAtts</gi> 
    triplets.   Where attributes should be described, 
    the <gi>listAtts</gi> element should
    be included; where no attributes need be described, 
    it should be omitted.
  </desc>

  <gi>listAtts</gi>
  <desc>contains a series of <gi>att</gi> - 
    <gi>desc</gi> - <gi>listVals</gi> triplets.  Where
    particular values of the attribute should be described, 
    the <gi>listVals</gi> element should be included; 
    otherwise it should be omitted.
  </desc>
  <listAtts>
    <att>atts</att>
    <desc>contains a set of <kw>IDREF</kw>s indicating 
      the attributes to be described; the <kw>ID</kw>s 
      should be those of the <gi>attDef</gi> elements 
      in the reference documentation; this
      attribute is ignored if the <gi>listAtts</gi> 
      element is not empty.
    </desc>
  </listAtts>

  <gi>listVals</gi>
  <desc>contains a series of
    <gi>val</gi> elements or <gi>val</gi> - <gi>desc</gi> 
    pairs.
  </desc>
  <listAtts>
    <att>vals</att>
    <desc>contains a set of <kw>IDREF</kw>s indicating 
      the values to be described; the <kw>ID</kw>s should 
      be those of the <gi>val</gi> elements in the reference 
      documentation; this attribute is ignored if the 
      <gi>listVals</gi> element is not empty.
    </desc>
  </listAtts>

  <gi>listDescs</gi>
  <desc>contains a series of
    <gi>label</gi> - <gi>desc</gi> - <gi>listDescs</gi> 
    triplets.   Where the list entry should include a 
    list of attributes, fields, arguments, children, values, 
    or other associated (usually subordinate) items, 
    the <gi>listDescs</gi> element should
    be included; otherwise, it should be omitted.
  </desc>
  <listAtts>
    <att>itemType</att>
    <desc>identifies the type of items being described 
      (or to be described) by the nested list.
    </desc>
    <listVals>
      <val>elements</val>
      <val>attributes</val>
      <val>values</val>
      <val>members</val>
      <val>classes</val>
      <val>range</val>
      <val>arguments</val>
    </listVals>

    <att>items</att>
    <desc>contains a set of <kw>IDREF</kw>s indicating 
      the items to be described; the <kw>ID</kw>s should 
      be those of the <gi>desc</gi> elements describing those 
      items, or of the <gi>val</gi> or <gi>ident</gi> elements 
      giving the values or identifiers to be listed; this
      attribute is ignored if the <gi>listDescs</gi> element 
      is not empty.
    </desc>
  </listAtts>
</listElements>

For constructs other than elements, attributes, or predefined attribute values, the general-purpose <listDescs> element can be used. The element-list just given would look something like this if abridged somewhat and tagged using the <listDescs> element:

 
 
<listDescs itemType=elements>
  <label><gi>listElements</gi></label>
  <desc>contains a series of
        <gi>gi</gi> - <gi>desc</gi> - <gi>listAtts</gi> 
        triplets.   Where attributes should be described, 
        the <gi>listAtts</gi> element should
        be included; where no attributes need be described, 
        it should be omitted.
  </desc>
  <!-- ... -->

  <label><gi>listDescs</gi></label>
  <desc>contains a series of
    <gi>label</gi> - <gi>desc</gi> - <gi>listDescs</gi> 
    triplets.   Where the list entry should include a 
    list of attributes, fields, arguments, children, values, 
    or other associated (usually subordinate) items, 
    the <gi>listDescs</gi> element should
    be included; otherwise, it should be omitted.
  </desc>
  <listDescs itemType=attributes>
    <label><att>itemType</att></label>
    <desc>identifies the type of items being described 
      (or to be described) by the nested list.
    </desc>
    <listDescs itemType=values>
      <label><val>elements</val></label>
      <label><val>attributes</val></label>
      <label><val>values</val></label>
      <label><val>members</val></label>
      <label><val>classes</val></label>
      <label><val>range</val></label>
      <label><val>arguments</val></label>
    </listDescs>

    <label><att>items</att></label>
    <desc>contains a set of <kw>IDREF</kw>s indicating 
      the items to be described; the <kw>ID</kw>s should 
      be those of the <gi>desc</gi> elements describing those 
      items, or of the <gi>val</gi> or <gi>ident</gi> elements 
      giving the values or identifiers to be listed; this
      attribute is ignored if the <gi>listDescs</gi> element 
      is not empty.
    </desc>
  </listDescs>
</listDescs>

If reference documentation has been created by hand or generated automatically, the descriptions may be left empty, using the copyOf attribute to point at the master version of the description in the reference material. If we assume that the descriptions in the reference material have the IDs indicated, the sample element list would look something like this:

 
 
<listElements>
  <gi>listElements</gi>
  <desc copyOf=listElements.desc></desc>

  <gi>listAtts</gi>
  <desc copyOf=listAtts.desc></desc>
  <listAtts>
    <att>atts</att>
    <desc copyOf=listAtts.atts.desc></desc>
  </listAtts>

  <!-- ... -->

  <gi>listDescs</gi>
  <desc copyOf=listDescs.desc></desc>
  <listAtts>
    <att>itemType</att>
    <desc copyOf='listDescs.itemType.desc'></desc>
    <listVals copyOf='listDescs.itemType.vals'></listVals>

    <att>items</att>
    <desc copyOf='listDescs.items.desc'></desc>
  </listAtts>
</listElements>

Using the atts or vals attribute, an even simpler tagging is possible:

 
 
<listElements>
  <gi>listElements</gi>
  <desc copyOf=listElements.desc></desc>

  <gi>listAtts</gi>
  <desc copyOf=listAtts.desc></desc>
  <listAtts atts='listAtts.atts'></listAtts>

  <!-- ... -->

  <gi>listDescs</gi>
  <desc copyOf=listDescs.desc></desc>
  <listAtts atts='listDescs.itemType listDescs.items'></listAtts>
</listElements>

[Further syntactic sugar may be developed for this tagging in future versions of the ODD DTD.]

Descriptions for elements, attributes, and values should take consistent grammatical forms:

The formal declaration of these elements is as follows:

< 7 Define listElements and its children > =

 
<!ELEMENT listElements  - -  (gi, desc, listAtts?)*             >
<!ATTLIST listElements       %a.global;                         >
<!ELEMENT listAtts      - -  (att, desc, listVals?)*            >
<!ATTLIST listAtts           %a.global;    
          atts               IDREFS              #IMPLIED       >
<!ELEMENT listVals      - -  (val, 
                             ((desc, (val, desc)*) 
                             | val+)?)?                         >
<!ATTLIST listVals           %a.global;    
          type               (closed 
                             | open 
                             | semi)             open
          vals               IDREFS              #IMPLIED       >
<!ELEMENT listDescs     - -  (label, desc?, listDescs?)*        >
<!ATTLIST listDescs          %a.global;                         
          itemType           (elements 
                             | attributes
                             | values
                             | members
                             | classes
                             | range
                             | arguments
                             | other)            #IMPLIED       
          other              CDATA               #IMPLIED
          items              IDREFS              #IMPLIED       >
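
As a sketch of how the content model for <listVals> can be satisfied by <val> - <desc> pairs, the three values of its own type attribute might be listed as follows. The wording of the descriptions is taken from the attribute list at the head of this section; the tagging itself is only an illustration and is not drawn from the ODD source.

```sgml
<listAtts>
  <att>type</att>
  <desc>indicates whether the set of values is closed, 
    open, or semi.
  </desc>
  <!-- the three values below are the only legal ones,
       so this list is itself closed -->
  <listVals type=closed>
    <val>closed</val>
    <desc>no other values are legal.</desc>
    <val>open</val>
    <desc>the values listed are just samples.</desc>
    <val>semi</val>
    <desc>the list is not exhaustive, but the values given 
      should be used when they apply.</desc>
  </listVals>
</listAtts>
```

Since the declared default for type is open, the attribute has to be given explicitly whenever the list is closed or semi.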

6 Elements for Reference Documentation

Each ODD document which defines elements for inclusion in the TEI encoding scheme must include reference documentation for the elements, entities, and element classes defined, using the following elements:

<tagDoc>
contains reference documentation for an SGML element type, including its generic identifier, the natural-language expansion of the generic identifier, one or more descriptions, examples, documentation for its attributes, and so on.
<classDoc>
contains reference documentation for an element class.
<entDoc>
contains reference documentation for an SGML entity, including its name, type, and entity text.

These elements are semantically identical to those defined in the TEI's auxiliary document type for tag set documentation, though they may differ slightly in their syntax. When reference documentation is needed for something other than an element, an element class, or an entity, the following element may be used:

<refDoc>
contains reference documentation for any construct or item in an ODD document, including its formal name, the natural-language expansion of the formal name, one or more descriptions, examples, documentation for component parts, attributes, or arguments, and so on.

The component parts of these elements are as documented in TEI P3 and the reference documentation for Sweb. Some further details are given in the following sections.

All of these elements are members of the element class docCrystals:

< 8 Define doc crystals class > =

 
<!ENTITY % m.docCrystals 'tagDoc | classDoc | entDoc | refDoc'  >

6.1 Element Documentation

<tagDoc>
contains reference documentation for an SGML element type, including its generic identifier, the natural-language expansion of the generic identifier, one or more descriptions, examples, documentation for its attributes, and so on.

The component parts of the <tagDoc> element are as described in TEI P3; for examples, see chapter 27.

The formal declarations for <tagDoc> and its children in the ODD DTD are these:

< 9 Define tagDoc and its children > =

 
<!ELEMENT tagDoc        - -  (gi, rs?, desc, 
                             attList?, exemplum*, 
                             remarks?, part?, 
                             classes?, files?, 
                             dataDesc?, parents?, 
                             children?, elemDecl, 
                             attlDecl?, (%m.crossRef)*, 
                             equiv*)                            >
<!ATTLIST tagDoc             %a.global;
          usage              (req | mwa | rec | rwa | opt) 
                                                 opt
          TEIform            CDATA               'tagDoc'       >
 
<!ELEMENT desc          - O  (%paraContent)                     >
<!ATTLIST desc               %a.global;
          TEIform            CDATA               'desc'         >
 
<!ELEMENT attList       - O  (attDef*)                          >
<!ATTLIST attList            %a.global;
          TEIform            CDATA               'attList'      >
 
<!ELEMENT exemplum      - -  (p*, eg, p*)                       >
<!ATTLIST exemplum           %a.global;
          TEIform            CDATA               'exemplum'     >
 
<!-- eg is defined elsewhere in the ODD documentation         -->
 
<!ELEMENT remarks       - O  (%component.seq)                   >
<!ATTLIST remarks            %a.global;
          TEIform            CDATA               'remarks'      >
 
<!ELEMENT part          - O  (#PCDATA)                          >
<!ATTLIST part               %a.global;
          type               CDATA               #IMPLIED
          name               CDATA               #IMPLIED
          TEIform            CDATA               'part'         >
 
<!ELEMENT classes       - O  (#PCDATA)                          >
<!ATTLIST classes            %a.global;
          names              CDATA               #REQUIRED
          TEIform            CDATA               'classes'      >
 
<!ELEMENT files         - O  EMPTY                              >
<!ATTLIST files              %a.global;
          names              CDATA               #IMPLIED
          TEIform            CDATA               'files'        >
 
<!ELEMENT dataDesc      - O  (%phrase.seq)                      >
<!ATTLIST dataDesc           %a.global;
          TEIform            CDATA               'dataDesc'     >
 
<!ELEMENT parents       - O  (#PCDATA)                          >
<!ATTLIST parents            %a.global;
          TEIform            CDATA               'parents'      >
 
<!ELEMENT children      - O  (#PCDATA)                          >
<!ATTLIST children           %a.global;
          TEIform            CDATA               'children'     >
 
<!ELEMENT elemDecl      - O  (#PCDATA)                          >
<!ATTLIST elemDecl           %a.global;
          TEIform            CDATA               'elemDecl'     >
 
<!ELEMENT attlDecl      - -  (#PCDATA)                          >
<!ATTLIST attlDecl           %a.global;
          TEIform            CDATA               'attlDecl'     >
 
<!ELEMENT equiv         - O  (%specialPara)                     >
<!ATTLIST equiv              %a.global;
          scheme             CDATA               #REQUIRED
          TEIform            CDATA               'equiv'        >

This is almost exactly as in TEI P3, but at the end of the <tagDoc> we allow not only <ptr> elements to point to the relevant documentation, but any type of pointer including <ref>, <xptr>, and <xref>. The necessary class is called crossRef and it is defined thus:

< 10 Define cross reference class > =

 
<!ENTITY % x.crossRef '' >
<!ENTITY % m.crossRef '%x.crossRef ptr | ref | xptr | xref' >
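
The empty entity x.crossRef follows the same pattern as the other x. entities used in this DTD: it gives a local modification a place to add further pointer elements to the class. A minimal sketch, assuming a hypothetical element <bibPtr> which the user would have to declare separately. Because the first declaration of an entity is the one which takes effect in SGML, the declaration must be read before the default given above, for example in the document type declaration subset:

```sgml
<!-- Hypothetical local extension: bibPtr is not part of the  -->
<!-- ODD DTD and must be declared by the user.                -->
<!-- The trailing vertical bar is needed because m.crossRef   -->
<!-- places %x.crossRef before its own members.               -->
<!ENTITY % x.crossRef 'bibPtr |' >

<!-- With this declaration in force, m.crossRef expands to:   -->
<!--   bibPtr | ptr | ref | xptr | xref                       -->
```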

6.2 Attribute Documentation

The reference documentation for an attribute resembles that for the element as a whole, but tends to be shorter.

<attDef>
contains reference documentation for one attribute of a particular SGML element type, including its attribute name, the natural-language expansion of the name, one or more descriptions, examples, documentation for its values, and so on.

The <attDef> element is used the same way in ODD as in the TSD; consult TEI P3 for details.

< 11 Define attDef and children > =

 
<!ELEMENT attDef        - O  (attName, rs?, desc, 
                             (dataType, (valList | valDesc)?), 
                             default, 
                             eg?, remarks?, 
                             equiv*)                            >
<!ATTLIST attDef             %a.global;
          usage              (req | mwa | rec | rwa | opt) 
                                                 opt
          TEIform            CDATA               'attDef'       >
 
<!ELEMENT attName       - O  (#PCDATA)                          >
<!ATTLIST attName            %a.global;
          TEIform            CDATA               'attName'      >
 
<!ELEMENT dataType      - O  (#PCDATA)                          >
<!ATTLIST dataType           %a.global;
          TEIform            CDATA               'dataType'     >
 
<!ELEMENT valList       - -  ((val, desc)*)                     >
<!ATTLIST valList            %a.global;
          type               (closed | semi | open) 
                                                 open
          TEIform            CDATA               'valList'      >
 
<!ELEMENT valDesc       - O  (%phrase.seq)                      >
<!ATTLIST valDesc            %a.global;
          TEIform            CDATA               'valDesc'      >
 
<!ELEMENT default       - O  (#PCDATA)                          >
<!ATTLIST default            %a.global;
          TEIform            CDATA               'default'      >
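
As an illustration of this content model, the atts attribute of <listAtts> might be documented as follows. This is a sketch only: the description is taken from the discussion of <listAtts> above, the ID is the one assumed in the earlier examples, and the usage value and the placing of the final sentence in <remarks> are assumptions, not the actual reference documentation.

```sgml
<attDef id='listAtts.atts' usage=opt>
  <attName>atts</attName>
  <desc>contains a set of IDREFs indicating the attributes 
    to be described.</desc>
  <dataType>IDREFS</dataType>
  <!-- valDesc is used in place of valList, because the legal 
       values cannot be enumerated in advance -->
  <valDesc>each value must be the ID of an 
    <gi>attDef</gi> element in the reference 
    documentation.</valDesc>
  <default>#IMPLIED</default>
  <remarks><p>This attribute is ignored if the 
    <gi>listAtts</gi> element is not empty.</p></remarks>
</attDef>
```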
 

6.3 Element-class Documentation

The TEI makes extensive use of element classes to reduce the complexity of the Guidelines; these classes have their own reference documentation:

<classDoc>
contains reference documentation for an element class.

The <classDoc> element is used the same way in ODD as in the TSD; consult TEI P3 for details.

The formal declarations for <classDoc> and <class> in the ODD DTD are these:

< 12 Define classDoc > =

 
<!ELEMENT classDoc      - O  (class, rs?, desc, 
                             attList?, remarks?, 
                             part?, classes?, 
                             files?, (%m.crossRef)*, 
                             equiv*)                            >
<!ATTLIST classDoc           %a.global;
          type               (model | atts | both) 
                                                 #IMPLIED
          TEIform            CDATA               'classDoc'     >
 
<!ELEMENT class         - O  (#PCDATA)                          >
<!ATTLIST class              %a.global;
          TEIform            CDATA               'class'        >
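
A minimal sketch of a <classDoc> which conforms to this declaration, using the crossRef class defined earlier in this document. The description and remarks are paraphrased from the discussion of that class; the value model for the type attribute is an assumption, taken here to mean a class whose members share a place in content models.

```sgml
<classDoc type=model>
  <class>crossRef</class>
  <desc>groups the pointer elements which may be used at 
    the end of reference documentation to point to the 
    relevant discussion.</desc>
  <remarks><p>The members of the class are 
    <gi>ptr</gi>, <gi>ref</gi>, <gi>xptr</gi>, and 
    <gi>xref</gi>.</p></remarks>
</classDoc>
```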

6.4 Entity Documentation

If a tag set defines special entities (parameter entities or general entities), it is essential to document them:

<entDoc>
contains reference documentation for an SGML entity, including its name, type, and entity text.

The <entDoc> element is used the same way in ODD as in the TSD; consult TEI P3 for details.

< 13 Define entDoc > =

 
<!ELEMENT entDoc        - -  (entName, rs?, desc, 
                             remarks?, string, 
                             (%m.crossRef)*, equiv*)            >
<!ATTLIST entDoc             %a.global;
          type               (pe | ge)           #REQUIRED
          TEIform            CDATA               'entDoc'       >
 
<!ELEMENT entName       - O  (#PCDATA)                          >
<!ATTLIST entName            %a.global;
          TEIform            CDATA               'entName'      >
 
<!ELEMENT string        - -  (#PCDATA)                          >
<!ATTLIST string             %a.global;
          TEIform            CDATA               'string'       >
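
By way of illustration, the parameter entity model.scrap declared earlier in this document might be documented as follows. The entity name and entity text are those of the declaration; the wording of the description is an assumption.

```sgml
<entDoc type=pe>
  <entName>model.scrap</entName>
  <desc>defines the content model for DTD fragments 
    embedded in an ODD document.</desc>
  <!-- the entity text, exactly as it appears between the 
       quotation marks in the entity declaration -->
  <string>(#PCDATA | ptr | ref)*</string>
</entDoc>
```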
 

6.5 Documentation of Other Constructs

Other constructs may be documented using the general-purpose <refDoc> element (not part of the TSD):

<refDoc>
contains reference documentation for any construct or item in an ODD document, including its formal name, the natural-language expansion of the formal name, one or more descriptions, examples, documentation for component parts, attributes, or arguments, and so on.

This element is more general than the <tagDoc>, <classDoc>, and <entDoc> elements, all of which can be reduced to it; the ODD DTD retains the specialized elements as syntactic sugar. The <refDoc> element itself may be used to document SGML constructs other than elements, attributes, classes, and entities. The TEI Lite and ODD DTDs, for example, both provide NOTATION declarations for various commonly used data formats. In an ODD document, these could be documented thus:

 
 <refDoc type=notation>
    <ident type=notation>cgm</>
    <rs  >Computer Graphics Metafile</>
    <desc>A graphics format defined by the International
      Organization for Standardization; particularly well suited
      to vector graphics, though a bit-map format is also
      available.</desc>
    <exemplum><p>A figure in CGM format can be embedded in a
      TEI Lite document using the <gi>figure</gi> element; the
      <att>entity</att> attribute gives the entity name of
      the figure, and the entity declaration identifies the
      notation:</p>
      <eg><![ CDATA [
        <!DOCTYPE tei.2 system 'tei2.dtd' [
        <!ENTITY dwarf SYSTEM 'dwarf.cgm' NDATA CGM>
        ]>
        ...
        <figure entity=dwarf>
          <figDesc>The figure shows the dwarf Dopey,
            waving a magic wand.</figDesc>
        </figure>
      ]]></eg>
    </exemplum>
    <remarks><p>The main drawback to CGM is that it's an
      international standard; its use therefore exposes the
      user to ridicule from chance acquaintances who prefer
      proprietary solutions.</remarks>
    <files names='teilite.dtd'>
    <formal><![ CDATA [
       <!NOTATION cgm PUBLIC
       'ISO 8632:1987//NOTATION Computer Graphics Metafile//EN' >
    ]]>
    </formal>
    <ptr type='doc' target=chap3>
    <equiv scheme=TEI>seg type=blort</>
  </refDoc>

The <refDoc> element could similarly be used to document the TEI writing system declaration viewed as a specialized SGML NOTATION:

 
 <refDoc type=notation>
    <ident type=notation>WSD</>
    <rs  >Writing System Declaration</>
    <desc>An auxiliary document type defined by the TEI for
      documenting coded character sets, transliterations, and
      entity sets used to represent written language in
      alphabetic or non-alphabetic writing.  Can also be
      used, with slight stretch, to document transcription of
      audio material.</desc>
    <exemplum><p>Multiple writing system declarations can be
      associated with any TEI document; they are attached
      using the <att>wsd</att> attribute of the <gi>language</gi> 
      element in the TEI header; an entity declaration must be
      provided, to associate the external document containing
      the WSD with the entity name used as the value of the
      <att>wsd</att> attribute; the entity declaration should
      identify the external document as being in <ident>WSD</ident>
      notation:</p>
      <eg><![ CDATA [
        <!DOCTYPE tei.2 system 'tei2.dtd' [
        <!ENTITY english SYSTEM 'en.wsd' NDATA WSD>
        ]>
        ...
        <language wsd=english id=en>Modern English</language>
        ...
      ]]></eg>
    </exemplum>
    <remarks><p>The Writing System Declaration provides more
      information about the writing system and its representation
      than is normally found in standards for entity sets,
      transliteration schemes, or coded character sets, and
      has been designed to allow the encoder to distinguish
      (or not) different forms of the same character, whether
      or not they are distinct glyphs, independent of decisions
      made in the relevant coded character sets, fonts, or
      standard entity sets.</remarks>
    <files names='teilite.dtd'>
    <formal><![ CDATA [
       <!NOTATION wsd  PUBLIC
       '-//TEI P3-1994//NOTATION Writing System Declaration//EN' >
    ]]>
    </formal>
    <ptr type='p3' target=chap3>
  </refDoc>

The <refDoc> element and its children are declared thus:

< 14 Define generic reference documentation > =

 
<!ELEMENT refDoc        - -  (ident, rs?, desc, (%m.subs)?,
                             exemplum*, remarks?, module?,
                             files?, ptrList*, informal?, 
                             formal?, (%m.crossRef)*, equiv*)   >
<!ATTLIST refDoc             %a.global                          >
<!ELEMENT module        - -  (#PCDATA)                          >
<!ATTLIST module             %a.global                          >
<!ELEMENT ptrList       - -  (#PCDATA | ptr | ref 
                             | xptr | xref)*                    >
<!ATTLIST ptrList            %a.global                          >
<!ELEMENT informal      - O  (%paraContent)                     >
<!ATTLIST informal           %a.global                          >
<!ELEMENT formal        - O  (%paraContent)                     >
<!ATTLIST formal             %a.global                          >

<!ELEMENT attributes    - -  (refDoc*)                          >
<!ATTLIST attributes         %a.global                          >
<!ELEMENT arguments     - -  (refDoc*)                          >
<!ATTLIST arguments          %a.global                          >
<!ELEMENT members       - -  (refDoc*)                          >
<!ATTLIST members            %a.global                          >
<!ELEMENT range         - -  (code, desc)*                      >
<!ATTLIST range              %a.global                          >
<!ELEMENT subs          - -  (refDoc*)                          >
<!ATTLIST subs               %a.global                          
          type               CDATA               #REQUIRED      >

The class of subs includes elements for documenting various types of subordinate or related objects or constructs: attributes (of an element), arguments (of a function), members (of a class or of a structure), range (of values), or subs (anything else). It is defined thus:

< 15 Define subs class > =

 
<!ENTITY % m.subs 'attributes | arguments 
| members | range | subs' >
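
As a sketch of how a member of the subs class fits into <refDoc>, the docCrystals class defined above could be documented generically, with its members listed in a <members> element. The descriptions are abridged from those given earlier in this section; the tagging is an illustration only.

```sgml
<refDoc>
  <ident>docCrystals</ident>
  <desc>groups the elements which contain reference 
    documentation.</desc>
  <!-- members holds one nested refDoc per member of the class -->
  <members>
    <refDoc>
      <ident>tagDoc</ident>
      <desc>contains reference documentation for an SGML 
        element type.</desc>
    </refDoc>
    <refDoc>
      <ident>classDoc</ident>
      <desc>contains reference documentation for an 
        element class.</desc>
    </refDoc>
    <!-- entDoc and refDoc are omitted here for brevity -->
  </members>
</refDoc>
```

Only one element from the subs class may occur in a given <refDoc>, and it must follow the <desc> directly.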

7 Overview of the DTD

This section describes the overall structure of the ODD DTD and defines the `driver' or `top-level' DTD fragment which embeds all the other DTD fragments.

The ODD DTD is designed to be used as a user-extension of the TEI P3 main DTD; it thus consists of two files: a TEI.extensions.ent file to modify the TEI element classes by adding the ODD elements to them, and a TEI.extensions.dtd file to declare the elements themselves. The two files are called oddx.ent and oddx.dtd respectively. The filename odd.dtd will be used for single-file non-extensible versions of the ODD DTD.
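
A document using the two-file version might therefore begin with a document type declaration along the following lines. This is a sketch under stated assumptions: the parameter entity names are taken to be those of the TEI P3 extension mechanism, the tag sets enabled are those which the element selection in this document draws upon (prose, linking, analysis, and figures), and the system identifiers assume that all the files are in the current directory.

```sgml
<!DOCTYPE TEI.2 SYSTEM 'tei2.dtd' [
  <!-- base tag set: prose                                    -->
  <!ENTITY % TEI.prose    'INCLUDE' >
  <!-- additional tag sets drawn on by the ODD DTD            -->
  <!ENTITY % TEI.linking  'INCLUDE' >
  <!ENTITY % TEI.analysis 'INCLUDE' >
  <!ENTITY % TEI.figures  'INCLUDE' >
  <!-- the two ODD extension files                            -->
  <!ENTITY % TEI.extensions.ent SYSTEM 'oddx.ent' >
  <!ENTITY % TEI.extensions.dtd SYSTEM 'oddx.dtd' >
]>
```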

7.1 Top-Level View

The overall structure of the file oddx.dtd is:

< 16 >(oddx.dtd) =

 
<!-- Text Encoding Initiative:                         -->
<!-- ODD (One Document Does it all)                    -->
<!-- Document TEI ED W55, 1996.                        -->
<!-- Copyright (c) 1996 ACH, ACL, ALLC.                -->
<!-- Permission to copy in any form is granted,        -->
<!-- provided this notice is included in all copies.   -->
 
<!-- These materials subject to revision.              -->
<!-- Current versions are available from the           -->
<!-- Text Encoding Initiative.                         -->

<!-- Revisions: -->
<!-- 1996-06-04 : CMSMcQ : complete DTD fragments in ED W55 -->

< Define phrase-level elements 1 > 
< Define example 4 > 
< Define scrap 5 > 
< Define recapitulation and scrap info 27 > 
< Define listElements and its children 7 > 
< Define tagDoc and its children 9 > 
< Define attDef and children 11 > 
< Define classDoc 12 > 
< Define entDoc 13 > 
< Define generic reference documentation 14 > 

The overall structure of the file oddx.ent is:

< 17 >(oddx.ent) =

 
<!-- Text Encoding Initiative:                         -->
<!-- ODD (One Document Does it all)                    -->
<!-- Document TEI ED W55, 1996.                        -->
<!-- Copyright (c) 1996 ACH, ACL, ALLC.                -->
<!-- Permission to copy in any form is granted,        -->
<!-- provided this notice is included in all copies.   -->
 
<!-- These materials subject to revision.              -->
<!-- Current versions are available from the           -->
<!-- Text Encoding Initiative.                         -->

<!-- Revisions: -->
<!-- 1996-06-04 : CMSMcQ : complete DTD fragments in ED W55 -->
< Define class of special token types 2 > 
< Define content model for DTD fragments 6 > 
< Define pointer and index models 28 > 
< Define doc crystals class 8 > 
< Define cross reference class 10 > 
< Define subs class 15 > 
< Modify TEI element classes 18 >  
< Select TEI elements 19 >  

7.2 Adjustments to TEI Element Classes

The new elements need to be fitted into the standard TEI element classes. That is done in this DTD fragment:

< 18 Modify TEI element classes > =

 
<!-- Changes to element classes (for new elements)            -->
< Modify SGML Keywords class 3 > 
<!ENTITY % x.data      '%m.miscTokens |'                        >
<!ENTITY % x.lists     'listElements | listAtts | listVals |
listDescs |'                                                    >
<!ENTITY % x.inter     'eg | scrap | scrapInfo | recap |'       >
<!ENTITY % x.chunk     '%m.docCrystals |'                       >
<!ENTITY % x.common    'eg | scrap | scrapInfo | recap |'       >

<!-- Changes to element classes (to fix oversights in TEI P3) -->
<!ENTITY % x.front     'divGen |'                               >

<!ENTITY % a.linking '
          corresp            IDREFS              #IMPLIED
          sameAs             IDREF               #IMPLIED
          copyOf             IDREF               #IMPLIED
          next               IDREF               #IMPLIED
          prev               IDREF               #IMPLIED
          exclude            IDREFS              #IMPLIED
          select             IDREFS              #IMPLIED'      >
 

7.3 Element Selection

Like any view of the TEI DTD, the ODD DTD selects some elements from the TEI main DTD and suppresses others. For convenience, we record both those we suppress and those we include; this makes it easier to see what has been done, and to reconsider things later.[14]

< 19 Select TEI elements > =

 
< Select tags from TEI driver file 20 >

<!-- ******************************************************** -->
<!-- I.  Core tag sets.                                       -->
<!-- ******************************************************** -->

<!-- Chapter 5:  TEI Header ********************************* -->
< Select tags from TEI header 21 >
<!-- Chapter 6:  Elements Available in All TEI Documents **** -->
< Select tags from TEI core tag set 22 >
<!-- Chapter 7:  Default Text Structure ********************* -->
< Select tags from default text structure 23 >

<!-- ******************************************************** -->
<!-- II.  Base tag sets.                                      -->
<!-- II.A.  DTD files                                         -->
<!-- ******************************************************** -->

<!-- Chapter 8:  Prose * (included) ************************* -->
<!-- File:  TEIPROS2.DTD (no tags) ************************** -->
<!-- Chapter 9:  Verse * (excluded) ************************* -->
<!-- Chapter 10:  Drama * (excluded) ************************ -->
<!-- Chapter 11:  Transcriptions of Speech * (excluded) ***** -->
<!-- Chapter 12:  Print Dictionaries * (excluded) *********** -->
<!-- Chapter 13:  Terminological Data * (excluded) ********** -->
<!-- * Mixed Bases * (excluded) ***************************** -->

<!-- ******************************************************** -->
<!-- III.  Additional tag sets.                               -->
<!-- ******************************************************** -->

<!-- Chapter 14:  Linking, Segmentation, and Alignment ****** -->
< Select tags from tag set for linking and alignment 24 >
<!-- Chapter 15:  Simple Analytic Mechanisms **************** -->
< Select tags from tag set for simple analysis 25 >
<!-- Chapter 16:  Feature Structures * (excluded) *********** -->
<!-- Chapter 17:  Certainty and Responsibility * (excluded) * -->
<!-- Chapter 18:  Transcription of Primary Sources * (excl) * -->
<!-- Chapter 19:  Critical Apparatus * (excluded) *********** -->
<!-- Chapter 20:  Names and Dates * (excluded) ************** -->
<!-- Chapter 21:  Graphs, Networks, and Trees * (excluded) ** -->
<!-- Chapter 22:  Tables, Formulae, and Graphics ************ -->
< Select tags from tag set for tables and figures 26 >
<!-- Chapter 23:  Language Corpora * (excluded) ************* -->

In the main TEI driver file, we select only the <tei.2> element, suppressing <teiCorpus.2>:

< 20 Select tags from TEI driver file > =

 
<!-- FILE:  TEI2.DTD -->
<!ENTITY % TEI.2        'INCLUDE' >
<!ENTITY % teiCorpus.2  'IGNORE' >
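
The selection relies on SGML marked sections: each element declaration in the TEI DTD files sits inside a marked section whose status keyword is supplied by a parameter entity named after the element. Since SGML honors only the first declaration of an entity, declarations like those above take precedence over any later defaults. The following is a minimal sketch of that pattern, not a quotation from TEI2.DTD; the content models and the default values shown are simplified assumptions.

```sgml
<!-- Selections, read first (as in the scrap above) -->
<!ENTITY % TEI.2        'INCLUDE' >
<!ENTITY % teiCorpus.2  'IGNORE' >

<!-- Assumed defaults in the DTD file; these are ignored,
     because both entities are already declared -->
<!ENTITY % TEI.2        'INCLUDE' >
<!ENTITY % teiCorpus.2  'INCLUDE' >

<!-- Status keyword is INCLUDE: the declaration is read -->
<![ %TEI.2; [
<!ELEMENT TEI.2         - -  (teiHeader, text)                  >
]]>

<!-- Status keyword is IGNORE: the declaration is skipped -->
<![ %teiCorpus.2; [
<!ELEMENT teiCorpus.2   - -  (teiHeader, TEI.2+)                >
]]>
```

The same mechanism applies to every entity in the scraps that follow: an element marked 'IGNORE' is simply never declared in the resulting DTD.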

In the header, we make the following selections:

< 21 Select tags from TEI header > =

 
<!-- File:  TEIHDR2.DTD -->
<!ENTITY % teiHeader    'INCLUDE' >
<!ENTITY % fileDesc     'INCLUDE' >
<!ENTITY % titleStmt    'INCLUDE' >
<!ENTITY % sponsor      'INCLUDE' -- ? -- >
<!ENTITY % funder       'INCLUDE' -- ? -- >
<!ENTITY % principal    'INCLUDE' -- ? -- >
<!ENTITY % editionStmt  'INCLUDE' -- ? -- >
<!ENTITY % edition      'INCLUDE' -- ? -- >
<!ENTITY % extent       'INCLUDE' -- ? -- >
<!ENTITY % publicationStmt 'INCLUDE' >
<!ENTITY % distributor  'INCLUDE' >
<!ENTITY % authority    'INCLUDE' >
<!ENTITY % idno         'INCLUDE' >
<!ENTITY % availability 'INCLUDE' -- ? -- >
<!ENTITY % seriesStmt   'INCLUDE' >
<!ENTITY % notesStmt    'INCLUDE' >
<!ENTITY % sourceDesc   'INCLUDE' >
<!ENTITY % scriptStmt                  'IGNORE' >
<!ENTITY % recordingStmt               'IGNORE' >
<!ENTITY % recording                   'IGNORE' >
<!ENTITY % equipment                   'IGNORE' >
<!ENTITY % broadcast                   'IGNORE' >
<!ENTITY % encodingDesc  'INCLUDE' >
<!ENTITY % projectDesc   'INCLUDE' >
<!ENTITY % samplingDecl  'INCLUDE' >
<!ENTITY % editorialDecl 'INCLUDE' >
<!ENTITY % correction                  'IGNORE' -- ? -- >
<!ENTITY % normalization               'IGNORE' -- ? -- >
<!ENTITY % quotation                   'IGNORE' -- ? -- >
<!ENTITY % hyphenation                 'IGNORE' -- ? -- >
<!ENTITY % segmentation                'IGNORE' -- ? -- >
<!ENTITY % stdVals                     'IGNORE' -- ? -- >
<!ENTITY % interpretation              'IGNORE' -- ? -- >
<!ENTITY % tagsDecl      'INCLUDE' >
<!ENTITY % tagUsage      'INCLUDE' >
<!ENTITY % rendition     'INCLUDE' >
<!ENTITY % refsDecl      'INCLUDE' >
<!ENTITY % step                        'IGNORE' -- ? -- >
<!ENTITY % state                       'IGNORE' >
<!ENTITY % classDecl     'INCLUDE' >
<!ENTITY % taxonomy      'INCLUDE' >
<!ENTITY % category      'INCLUDE' >
<!ENTITY % catDesc       'INCLUDE' >
<!ENTITY % fsdDecl                     'IGNORE' >
<!ENTITY % metDecl                     'IGNORE' >
<!ENTITY % symbol                      'IGNORE' >
<!ENTITY % variantEncoding             'IGNORE' >
<!ENTITY % profileDesc  'INCLUDE' >
<!ENTITY % creation     'INCLUDE' >
<!ENTITY % langUsage    'INCLUDE' >
<!ENTITY % language     'INCLUDE' >
<!ENTITY % textClass    'INCLUDE' >
<!ENTITY % keywords     'INCLUDE' >
<!ENTITY % classCode    'INCLUDE' >
<!ENTITY % catRef       'INCLUDE' >
<!ENTITY % revisionDesc 'INCLUDE' >
<!ENTITY % change       'INCLUDE' >

In the TEI core, we make the following selections:

< 22 Select tags from TEI core tag set > =

 
<!-- File:  TEICORE2.DTD -->
<!ENTITY % p            'INCLUDE' >
<!ENTITY % foreign      'INCLUDE' >
<!ENTITY % emph         'INCLUDE' >
<!ENTITY % hi           'INCLUDE' >
<!ENTITY % distinct                    'IGNORE' -- ? LB -- >
<!ENTITY % q            'INCLUDE' >
<!ENTITY % quote        'INCLUDE'  -- changed from Lite -- >
<!ENTITY % cit          'INCLUDE' >
<!ENTITY % soCalled     'INCLUDE' >
<!ENTITY % term         'INCLUDE' >
<!ENTITY % mentioned    'INCLUDE' >
<!ENTITY % gloss        'INCLUDE' >
<!ENTITY % name         'INCLUDE' >
<!ENTITY % rs           'INCLUDE' >
<!ENTITY % num          'INCLUDE' >
<!ENTITY % measure                     'IGNORE' >
<!ENTITY % date         'INCLUDE' >
<!ENTITY % dateRange                   'IGNORE' >
<!ENTITY % time         'INCLUDE' >
<!ENTITY % timeRange                   'IGNORE' >
<!ENTITY % abbr         'INCLUDE' >
<!ENTITY % expan                       'IGNORE' -- ? -- >
<!ENTITY % sic          'INCLUDE' >
<!ENTITY % corr         'INCLUDE' >
<!ENTITY % reg          'INCLUDE' -- ? -- >
<!ENTITY % orig         'INCLUDE' -- ? -- >
<!ENTITY % gap          'INCLUDE' >
<!ENTITY % add          'INCLUDE' -- ? -- >
<!ENTITY % del          'INCLUDE' -- ? -- >
<!ENTITY % unclear      'INCLUDE' >
<!ENTITY % address      'INCLUDE' >
<!ENTITY % addrLine     'INCLUDE' >
<!ENTITY % street                      'IGNORE' >
<!ENTITY % postCode                    'IGNORE' >
<!ENTITY % postBox                     'IGNORE' >
<!ENTITY % ptr          'INCLUDE' >
<!ENTITY % ref          'INCLUDE' >
<!ENTITY % list         'INCLUDE' >
<!ENTITY % item         'INCLUDE' >
<!ENTITY % label        'INCLUDE' >
<!ENTITY % head         'INCLUDE' >
<!ENTITY % headLabel                   'IGNORE' >
<!ENTITY % headItem                    'IGNORE' >
<!ENTITY % note         'INCLUDE' >
<!ENTITY % index        'INCLUDE' >
<!ENTITY % divGen       'INCLUDE' >
<!ENTITY % milestone    'INCLUDE' >
<!ENTITY % pb           'INCLUDE' >
<!ENTITY % lb           'INCLUDE' >
<!ENTITY % cb                          'IGNORE' >
<!ENTITY % bibl         'INCLUDE'  -- needs redefinition -- >
<!ENTITY % biblStruct                  'IGNORE' >
<!ENTITY % biblFull     'INCLUDE' >
<!ENTITY % listBibl     'INCLUDE' >
<!ENTITY % analytic                    'IGNORE' >
<!ENTITY % monogr                      'IGNORE' >
<!ENTITY % series                      'IGNORE' >
<!ENTITY % author       'INCLUDE' >
<!ENTITY % editor       'INCLUDE' >
<!ENTITY % respStmt     'INCLUDE' >
<!ENTITY % resp         'INCLUDE' >
<!ENTITY % title        'INCLUDE' >
<!ENTITY % meeting                     'IGNORE' -- ? -- >
<!ENTITY % imprint      'INCLUDE' >
<!ENTITY % publisher    'INCLUDE' >
<!ENTITY % biblScope    'INCLUDE' >
<!ENTITY % pubPlace     'INCLUDE' >
<!ENTITY % l            'INCLUDE' >
<!ENTITY % lg           'INCLUDE' >
<!ENTITY % sp           'INCLUDE' >
<!ENTITY % speaker      'INCLUDE' >
<!ENTITY % stage        'INCLUDE' >

< 23 Select tags from default text structure > =

 
<!-- File:  TEISTR2.DTD -->
<!ENTITY % text         'INCLUDE' >
<!ENTITY % body         'INCLUDE' >
<!ENTITY % group        'INCLUDE' >
<!ENTITY % div          'INCLUDE' >
<!ENTITY % div0         'INCLUDE' >
<!ENTITY % div1         'INCLUDE' >
<!ENTITY % div2         'INCLUDE' >
<!ENTITY % div3         'INCLUDE' >
<!ENTITY % div4         'INCLUDE' >
<!ENTITY % div5         'INCLUDE' >
<!ENTITY % div6         'INCLUDE' >
<!ENTITY % div7         'INCLUDE' >
<!ENTITY % trailer      'INCLUDE' >
<!ENTITY % byline       'INCLUDE' >
<!ENTITY % dateline     'INCLUDE' >
<!ENTITY % argument     'INCLUDE' >
<!ENTITY % epigraph     'INCLUDE' >
<!ENTITY % opener       'INCLUDE' >
<!ENTITY % closer       'INCLUDE' >
<!ENTITY % salute       'INCLUDE' >
<!ENTITY % signed       'INCLUDE' >

<!-- File:  TEIFRON2.DTD -->
<!ENTITY % front        'INCLUDE' >
<!ENTITY % titlePage    'INCLUDE' >
<!ENTITY % docTitle     'INCLUDE' >
<!ENTITY % titlePart    'INCLUDE' >
<!ENTITY % docAuthor    'INCLUDE' >
<!ENTITY % imprimatur                  'IGNORE' -- ? -- >
<!ENTITY % docEdition   'INCLUDE' >
<!ENTITY % docImprint   'INCLUDE' >
<!ENTITY % docDate      'INCLUDE' >

<!-- File:  TEIBACK2.DTD -->
<!ENTITY % back         'INCLUDE' >

< 24 Select tags from tag set for linking and alignment > =

 
<!-- File:  TEILINK2.ENT -->
<!-- File:  TEILINK2.DTD -->
<!ENTITY % link                        'IGNORE' -- ? -- >
<!ENTITY % linkGrp                     'IGNORE' -- ? -- >
<!ENTITY % xref         'INCLUDE' >
<!ENTITY % xptr         'INCLUDE' >
<!ENTITY % seg          'INCLUDE' >
<!ENTITY % anchor       'INCLUDE' >
<!ENTITY % when                        'IGNORE' >
<!ENTITY % timeline                    'IGNORE' >
<!ENTITY % join                        'IGNORE' >
<!ENTITY % joinGrp                     'IGNORE' >
<!ENTITY % alt                         'IGNORE' >
<!ENTITY % altGrp                      'IGNORE' >

< 25 Select tags from tag set for simple analysis > =

 
<!-- File:  TEIANA2.ENT -->
<!-- File:  TEIANA2.DTD -->
<!ENTITY % span                        'IGNORE' -- ? -- >
<!ENTITY % spanGrp                     'IGNORE' >
<!ENTITY % interp       'INCLUDE' >
<!ENTITY % interpGrp    'INCLUDE' >
<!ENTITY % s            'INCLUDE' >
<!ENTITY % cl                          'IGNORE' >
<!ENTITY % phr                         'IGNORE' >
<!ENTITY % w                           'IGNORE' >
<!ENTITY % m                           'IGNORE' >
<!ENTITY % c                           'IGNORE' >

< 26 Select tags from tag set for tables and figures > =

 
<!-- File:  TEIFIG2.ENT -->
<!ENTITY % formulaNotations 'CDATA'                             >
<!ENTITY % formulaContent 'CDATA'                               >

<!-- File:  TEIFIG2.DTD -->
<!ENTITY % table        'INCLUDE' >
<!ENTITY % row          'INCLUDE' >
<!ENTITY % cell         'INCLUDE' >
<!ENTITY % formula      'INCLUDE' >
<!ENTITY % figure       'INCLUDE' >
<!ENTITY % figDesc      'INCLUDE' >


Recapitulations and Scrap Info

The ODD DTD inherits several elements from Sweb which need not be described at length here, as (a) they are already described in the Sweb documentation and (b) they will seldom or never be necessary in initial drafts of TEI tag set documentation.

The formal definition of these elements is as follows:

< 27 Define recapitulation and scrap info > =

 
<!ELEMENT recap         - -  (%model.scrap)                     >
<!ATTLIST recap              %a.global;
          scrap              IDREF               #REQUIRED
          version            IDREFS              #IMPLIED       >
<!ELEMENT scrapInfo     - -  (head?, scrap, scrapDefs?,
                             scrapEquivs?, scrapRefs?,
                             indexDefs?, indexRefs?)            >
<!ATTLIST scrapInfo          %a.global                          >

<!ELEMENT scrapDefs     - -  (%model.scrapDefs)                 >
<!ATTLIST scrapDefs          %a.global                          >
<!ELEMENT scrapEquivs   - -  (%model.scrapEquivs)               >
<!ATTLIST scrapEquivs        %a.global                          >
<!ELEMENT scrapRefs     - -  (%model.scrapRefs)                 >
<!ATTLIST scrapRefs          %a.global                          >

<!ENTITY % x.chunk 'scrapInfo |' >
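
The x.chunk declaration follows the usual TEI convention for extending an element class: an entity prefixed x. is empty by default and is prepended to the list of class members. A hedged sketch of how this is expected to work is given here; the name m.chunk and the abbreviated member list are assumptions made for illustration, not text taken from the ODD DTD.

```sgml
<!-- Extension, read first; the trailing '|' allows it to
     be prepended to a list of element names -->
<!ENTITY % x.chunk 'scrapInfo |' >

<!-- Assumed empty default in the DTD; ignored, because
     x.chunk is already declared -->
<!ENTITY % x.chunk '' >

<!-- Assumed class entity (member list abbreviated);
     it resolves to 'scrapInfo | p | lg | sp' -->
<!ENTITY % m.chunk '%x.chunk; p | lg | sp' >
```

Under these assumptions, <scrapInfo> becomes legal wherever members of the class are allowed, without any change to the content models that refer to the class.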

The various ancillary grouping elements have their content models defined using parameter entities, to make them easier to adjust in case of need:

< 28 Define pointer and index models > =

 
<!ENTITY % model.scrapPtrs
           '(#PCDATA | ptr | ref | xptr | xref)*'               >
<!ENTITY % model.scrapDefs   '%model.scrapPtrs'                 >
<!ENTITY % model.scrapEquivs '%model.scrapPtrs'                 >
<!ENTITY % model.scrapRefs   '%model.scrapPtrs'                 >
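
As an illustration of the kind of adjustment these entities allow, the sketch below restricts <scrapRefs> to pointers only. The override is hypothetical and is not part of the ODD DTD; it would take effect only if it were read before the declarations above, since SGML honors the first declaration of an entity.

```sgml
<!-- Hypothetical local override, read ahead of the default
     declaration of model.scrapRefs: pointers only, no text -->
<!ENTITY % model.scrapRefs   '(ptr | xptr)*'                    >
```

With this override in place, the content model of <scrapRefs> expands to ((ptr | xptr)*), while <scrapDefs> and <scrapEquivs> keep the default model from model.scrapPtrs.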

Contact Information

Comments on this document may be directed to:

Notes

[1] Or will be, as soon as we can finish the bootstrapping process. There is an ODD-conformant version of this document, which was generated by retagging the Sweb version. The Sweb version is being kept around for the moment, because we have an Sweb-to-HTML translator but not yet an ODD-to-HTML translator.

[2] The reference to other documentation is a temporary measure; this document will ultimately be expanded to provide full self-contained documentation for all elements mentioned here. -Eds.

[3] It is not only helpful; it is required by the way the ODD system works.

[4] The TEI Editors have both been guilty of all of these on some occasion or other. That's how we know they are common. It's usually the other guy who does it. That's how we know they are vices.

[5] This restriction may be lifted in future revisions of the ODD tag set.

[6] One acceptable indentation style is shown below in section Examples.

[7] As time passes, we expect to give up gradually on the length limit for identifiers -- as the examples elsewhere in this document already illustrate.

[8] This section should, ideally, be recast as a model application of the canonical form.

[9] N.B. This example has been edited lightly for purposes of illustration. In TEI P3, it is not tagged as shown, because the P3X DTD does not include the <avs> element. Examples like this one helped demonstrate to us that we needed it.

[10] In the P3X form of TEI P3, DTD fragments are contained in <eg> elements, but this is possible only because the P3X form of P3 is not used to generate the DTD: the ODD form must be able to generate the DTD.

[11] At the time this document was written, the Sweb parser accepted the form with empty comments, but not CDATA marked sections. This is expected to change by the time any work group actually makes use of this version of the ODD tag set. Other shortcomings of the current Sweb implementation will be removed as time affords opportunity.

[12] The style used for numbering and cross referencing scraps is subject to change; this is just an example of one possibility.

[13] Other variations are possible in theory; not all may be implemented by the TEI's ODD processing software. In principle, the user may request that the ODD processor create any of the following states:

When ODD processors encounter both a full copy and a 'master' version of a description, they will process the master version and ignore the copy. The human reader can tell which is which because the master version will have a select attribute pointing at itself, while a copy will have either a sameAs or a copyOf attribute pointing at the master version. In the latter case, the <desc> element will also be empty.

[14] Sooner, rather than later, perhaps: the following elements seem distinctly expendable for ODD purposes:

The following elements, meanwhile, might perhaps usefully be added: