[Archive copy mirrored from: http://etd.vt.edu/etd/etd-ml/dtdetds.htm]

Document Type Definition for Electronic Theses and Dissertations

Neill A. Kipp

Virginia Polytechnic Institute and State University


Edward A. Fox
John L. Eaton
Gail McMillan
Emilio J. Arce

March 9, 1997 --- Version 1.0

Blacksburg, Virginia

KEYWORDS: Digital Library, Information Retrieval, Electronic Publishing, Multimedia

Copyright 1996, 1997, Electronic Thesis and Dissertation Project

(ABSTRACT)

The Virginia Tech Graduate School requires a specific form for the submission of Electronic Theses and Dissertations (ETDs) to maintain the consistency of these complex documents. The formal statement of these guidelines serves graduate students submitting ETDs, the faculty with whom they work, and scholars who study the submitted ETDs. We defined a Document Type Definition (DTD) in the Standard Generalized Markup Language (SGML) for the representation of ETDs, a logical choice for encoding complex electronic documents. To build the DTD, we analyzed constructs in existing theses and dissertations and studied the rules for their submission. Here we present definitions, annotations, and rationale for each document construct, and we explain the connection of the document constructs into an integrated DTD. The result is the description of the grammar of a new document language, the Electronic Thesis and Dissertation Markup Language, or ETD-ML.

Southeastern Universities Research Association (SURA) and the Department of Education's Fund for the Improvement of Post-secondary Education (FIPSE) both granted money that funded this project.


Acknowledgments

The author gives special thanks to the ETD software team: Laura Weiss, Emilio J. Arce, Scott A. Guyer, and Paul Mather.

Several constructs are new in this version. Others have been simplified from previous versions, thanks to comments from interested parties and the results of formal field testing [Kipp, 1996]. Changes include: float, natural break, and row-major tables.


1. Metadata

In this document, we discuss the various parts of the document type definition (DTD) for Electronic Theses and Dissertations (ETD). We begin with the document element and work through the hierarchy of major document structures found in the front, body, and back matter. Finally, we describe the use of external notations, multimedia, and hypermedia constructs.

The reader should seek other sources for training in document definition, markup languages, and particularly the Standard Generalized Markup Language (SGML) [Goldfarb, 1990] [ISO 8879, 1986].

1.1. DTD Title

Here we present the title information. This is a Document Type Definition for Electronic Theses and Dissertations (ETD), originally developed and tested at Virginia Tech. The document editor is Neill A. Kipp, with great assistance by the ETD software team [See acknowledgments].

<!-- 
Document Type Definition
Electronic Theses and Dissertations
Virginia Polytechnic Institute and State University

Neill A. Kipp, Document Editor
nkipp@vt.edu

Version 1.0
March 9, 1997

http://etd.vt.edu/etd/
etd@vt.edu
-->

1.2. Public Identifier

Following is the SGML public identifier. The public identifier is used to refer to the DTD when the system location is not known.

<!-- 
<!ENTITY % etd PUBLIC 
  "-//VT//DTD Electronic Theses and Dissertations 1.0//EN">
-->
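
As a sketch of how a document instance might invoke the DTD through this public identifier, a thesis could open with a document type declaration like the one below. Resolving the identifier to a file is left to the entity manager of the local SGML system (for example, through a catalog); that mapping is an assumption about the installation and is not defined by this DTD.

```sgml
<!DOCTYPE etd PUBLIC
  "-//VT//DTD Electronic Theses and Dissertations 1.0//EN">
<etd>
<!-- front, body, and back matter follow here -->
```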

2. Document Element, Electronic Thesis or Dissertation

The document element for an ETD is element type electronic thesis or dissertation (etd). The ETD element must contain front matter, body matter, and back matter.

At any point within the ETD, a footnote or float may appear. Neither is formatted where it is found; rather, each is formatted when activated by a hyperlink (link) whose goesto attribute names its identifier.

<!ELEMENT etd  - O  ( front, body, back ) +( footnote | float ) >

3. Front Matter

Front matter contains the title information in a thesis or dissertation.

<!-- Front matter -->

3.1. Front Matter

The element type front matter (front) indicates that the construct contains items constituting the front matter of the ETD. These include title, author, degree, major, etc.

<!ELEMENT front - O
  ( title, author, submission, school, degree, major, approvals, date, city, state, 
  keywords, copyright, abstract, 
  grant?, dedication?, acknowledgments? ) >
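
The following fragment is a minimal sketch of front matter that supplies each required element in the declared order. All names, dates, and titles are hypothetical placeholders. End tags that the DTD declares omissible are left out, and the optional grant, dedication, and acknowledgments elements are not used.

```sgml
<front>
<title>A Hypothetical Thesis Title
<author><given>Jane Q.<surname>Student
<submission>Thesis
<school>Virginia Polytechnic Institute and State University
<degree>Master of Science
<major>Computer Science
<approvals>
<name>First Committee Member
<name>Second Committee Member
<date>May 1, 1997
<city>Blacksburg
<state>Virginia
<keywords>
<keyword>Digital Library
<keyword>Electronic Publishing
<copyright>Copyright 1997, Jane Q. Student
<abstract><p>This paragraph stands in for the abstract.
</front>
```

Because the content model is a sequence, supplying the elements in any other order would be reported as an error by a validating parser.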

3.2. Title

The element type title (title) indicates that the contained text is the title of the ETD. In traditional formatting, the title information appears in two places in the formatted ETD, once on the title page and again on the abstract page.

<!ELEMENT title - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

3.3. Author

The element type author (author) indicates that the contained text is the name of the author of the ETD. For correct library cataloging, a personal author must be identified with a given name, a surname, and, where applicable, a suffix (Jr., III, etc.); alternatively, the author may be identified as an organization. The catalog will be alphabetized on the surname, with ties broken by alphabetization of the given name. Any middle name or initial should be placed after the first name as content of the given element.

<!ELEMENT author - O ( ( given, surname, suffix? ) | organization )  >

<!ELEMENT given - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

<!ELEMENT surname - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

<!ELEMENT suffix - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )*  >

<!ELEMENT organization - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >
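
As an illustration of the two alternatives in the author content model, the fragments below mark up a personal author and a corporate author. The personal name is a hypothetical placeholder. Only one author element may appear in the front matter, so the two fragments are alternatives and would not occur together.

```sgml
<!-- A personal author: the middle initial stays with the given name -->
<author><given>Jane Q.</given><surname>Student</surname><suffix>III</suffix></author>

<!-- Alternatively, a corporate author -->
<author><organization>Electronic Thesis and Dissertation Project</organization></author>
```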

3.4. Submission Type

The element type submission type (submission) indicates that the contained text is the type of submission, be it ``Dissertation,'' ``Thesis,'' or ``Special Report.''

<!ELEMENT submission - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

3.5. School

When it appears in the front matter, the element type school (school) indicates that the contained text is the name of the school to which the author submitted the ETD.

This element type may also appear in a bibliographic citation. In that instance, the contained text is the name of the school to which the citation author submitted the cited material (be it a thesis, dissertation, project report, or otherwise).

<!ELEMENT school - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

3.6. Degree

The element type degree (degree) indicates that the contained text is the name of the degree the author seeks. Examples are Doctor of Philosophy, Master of Science, and Master of Arts.

<!ELEMENT degree - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

3.7. Major

The element type major (major) indicates that the contained text is the name of the author's academic department. Examples include Computer Science and English.

<!ELEMENT major - O 
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

3.8. Committee Approvals

The element type committee approvals (approvals) indicates that the contained names are the names of the committee members who approved this ETD.

<!ELEMENT approvals - O ( name* ) >

3.9. Name

The element type name (name) indicates that the contained text is the name of someone or something. In the context of a committee approval type element, it is the name of the committee member.

<!ELEMENT name - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

3.10. Date

The element type date (date) indicates that the contained text is the date of the ETD or of the bibliographic citation.

<!ELEMENT date - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

3.11. City

The element type city (city) indicates that the contained text is the city in which the defense of the thesis or dissertation occurred.

<!ELEMENT city - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

3.12. State

The element type state (state) indicates that the contained text is the state in which the defense of the thesis or dissertation occurred.

<!ELEMENT state - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

3.13. Keyword List

The element type keyword list (keywords) indicates that the contained keyword elements are the keywords for this document.

<!ELEMENT keywords - O ( keyword+ ) >

3.14. Keyword or Phrase

The element type keyword or phrase (keyword) indicates that the contained text is a query-oriented keyword or phrase, perhaps on a predefined list of keywords or phrases offered in the major of the author, or by the library or university.

<!ELEMENT keyword - O 
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

3.15. Copyright Notice

The element type copyright notice (copyright) indicates that the contained text is the copyright notice for the document. This copyright notice has legal significance; it pertains to both the SGML source and the formatted version.

<!ELEMENT copyright - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

3.16. Abstract

The element type abstract (abstract) indicates that the contained paragraphs make up the abstract for the document.

<!ELEMENT abstract - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle | p | citation | table | mm )* >

<!ATTLIST abstract
  id ID #IMPLIED
>

3.17. Grant Information

The element type grant information (grant) holds the attribution statement required by some granting institutions.

<!ELEMENT grant - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle | p | citation | table | mm )* >

<!ATTLIST grant 
  id ID #IMPLIED 
>

3.18. Document Dedication

The element type document dedication (dedication) indicates that the contained text is the dedication for this document.

<!ELEMENT dedication - O ( head?, ( p | citation | table | mm)* ) >

<!ATTLIST dedication
  id ID #IMPLIED
>

3.19. Author's Acknowledgments

The element type author's acknowledgments (acknowledgments) indicates that the contained paragraphs make up the acknowledgments section of the ETD. Element type head may be used to define an alternate title to this section of the ETD.

<!ELEMENT acknowledgments - O ( head?, ( p | citation | table | mm)* ) >

<!ATTLIST acknowledgments
  id ID #IMPLIED
>

4. Body Matter

Body matter follows the front matter in a thesis or dissertation.

<!-- Body matter -->

4.1. Document Body

The element type document body (body) indicates that the contained chapters make up the body of the ETD.

<!ELEMENT body - O ( chapter+ ) >

4.2. Chapter

The element type chapter (chapter) indicates that the contained head, paragraphs, and sections constitute the chapter. The entity chapterInclusions is discussed in the appendix [Inclusions].

<!-- 1. -->
<!ENTITY % chapterInclusions "" >
<!ELEMENT chapter - O 
  ( head?,  ( p | citation | table | mm)*, section* ) %chapterInclusions; >

<!ATTLIST chapter
  id ID #IMPLIED
>

4.3. Section

The element type section (section) indicates that the contained head, paragraphs, and subsections constitute the section.

<!-- 1.2 -->
<!ELEMENT section - O 
  ( head?, ( p | citation | table | mm)*, subsection* ) >

<!ATTLIST section
  id ID #IMPLIED
>

4.4. Subsection

The element type subsection (subsection) indicates that the contained head, paragraphs, and blocks constitute the subsection.

<!-- 1.2.3 -->
<!ELEMENT subsection - O 
  ( head?, ( p | citation | table | mm)*, block* ) >

<!ATTLIST subsection
  id ID #IMPLIED
>

4.5. Block

The element type block (block) indicates that the contained head, paragraphs, and subblocks constitute the block.

<!-- 1.2.3.4 -->
<!ELEMENT block - O 
  ( head?, ( p | citation | table | mm)*, subblock* ) >

<!ATTLIST block
  id ID #IMPLIED
>

4.6. Subblock

The element type subblock (subblock) indicates that the contained head and paragraphs constitute the subblock. Nesting past five levels is not offered due to readability considerations.

<!-- 1.2.3.4.5 -->
<!ELEMENT subblock - O 
  ( head?, ( p | citation | table | mm)* ) >

<!ATTLIST subblock
  id ID #IMPLIED
>
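
To make the five levels of nesting concrete, the sketch below shows one chapter that descends through section, subsection, block, and subblock. The heads, identifiers, and text are hypothetical. Omissible end tags are left out, so the single body end tag closes every open level.

```sgml
<body>
<chapter id="ch.methods"><head>Methods
<p>Text at the chapter level.
<section id="sec.design"><head>Study Design
<p>Text at the section level.
<subsection><head>Sampling
<block><head>Strata
<subblock><head>Rural Strata
<p>Text at the deepest level of nesting.
</body>
```

Each level admits only the level directly beneath it, so a block cannot appear directly inside a chapter or section.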

4.7. Paragraph

The element type paragraph (p) indicates that the collection of contained items constitutes a paragraph. Paragraphs may contain any of the items listed in any order. The parameter entity paragraphInclusions can provide extensions to the paragraph definition, and is discussed in the appendix [Inclusions].

<!ENTITY % paragraphInclusions "" >
<!ELEMENT p - O ( head | #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle | 
   ol | ul | dl | pre | verse | blockquote | attrib | math )* %paragraphInclusions; >

4.8. Head

The element type head (head) indicates that the contained text is the head of the element type that contains the head. The following constructs each may have a head: chapters, sections, subsections, blocks, subblocks, and paragraphs.

<!ELEMENT head - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5. Body Text

In this chapter, we discuss the element and entity declarations for body text. Body text may occur at most leaves in an ETD.

We begin discussion of the document by defining what is allowed in text at the paragraph level. Here, one may include textual data (represented by SGML reserved name #PCDATA), emphasis (em), data marked as strong or bold (strong), natural breaks in the flow of the language (br), words noted as foreign (foreign), and words noted as terms (term). Also, we define that certain words may be the anchors of links (target). For the simplest of mathematical notation, we have the ability to superscript and subscript (sup and sub). And finally, for more complex mathematical notation, we have inline math (im).
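
A short paragraph using several of these inline element types might be marked up as in the sketch below. The wording is illustrative only. Note that the inline element types are declared with required end tags.

```sgml
<p>A <term>document type definition</term> states the grammar of a
document type. The markup is <em>descriptive</em> rather than
<strong>procedural</strong>; it is, <foreign>ipso facto</foreign>,
independent of any one formatter. Simple notation such as
x<sup>2</sup> or H<sub>2</sub>O needs no mathematics element.
```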

5.1. Emphasized Text

The element type emphasized text (em) is a construct that denotes that the contained text should be emphasized.

<!ELEMENT em - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5.2. Strong Text

The element type strong text (strong) is a construct that denotes that the contained text should be strong or bold, and flowed within the paragraph.

<!ELEMENT strong - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5.3. Natural Break

The element type natural break (br) indicates that the text flow should break at this point, and begin flowing again after the break. Because the element is empty, the end tag (</br>) must be omitted. Breaking is most useful in long titles and for special emphasis within paragraphs.

<!ELEMENT br - O EMPTY >

5.4. Typed Text

The element type typed text (tt) is a construct that denotes that the contained text should be typed text, and appear as such within the formatted ETD. Although they will likely appear in the same font in the formatted ETD, typed text is distinguished from preformatted text. Typed text flows within a paragraph, whereas preformatted text appears between newlines. Typed text is useful for documentation of computer-oriented applications, such as this document.

The characters less-than (<) and ampersand (&) must be escaped in the content of this element to avoid being recognized as SGML markup. This DTD provides the necessary entities &lt; and &amp; to make the appropriate escapes (see [Special Characters]).

<!ELEMENT tt - - ( #PCDATA )* >

5.5. Inline Quoted Text

The element type inline quoted text (q) is a construct that denotes that the contained text should be flowed inline and be prefaced by a left-quote (66) and followed by a right-quote (99).

<!ELEMENT q - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5.6. Term

The element type term (term) denotes that the contained text is a term whose definition is in the surrounding text. Usually, terms will be placed in boldface in the formatted ETD.

<!ELEMENT term - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5.7. Foreign Word or Phrase

The element type foreign word or phrase (foreign) denotes that the contained text is a foreign word or phrase. Usually, foreign words will be placed in italics in the formatted ETD. Using the id attribute, the dissertation or thesis author may also link a translation to this foreign word or phrase.

<!ELEMENT foreign - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5.8. Inline Mathematics

The element type inline mathematics (im) denotes that the contained data is formatted as inline mathematics in the notation provided with the TeX formatting system. The data contained within the inline mathematics tags is processed as if it appeared between dollar signs ($) in a TeX source file.

<!ELEMENT im - - ( #PCDATA )* >

<!ATTLIST im
  id ID #IMPLIED
  notation NOTATION (TeX) TeX
>

5.9. Superscripted Text

The element type superscripted text (sup) denotes that the contained text should appear as a superscript (e.g., a mathematical exponent [2ⁿ] or other qualifier).

<!ELEMENT sup - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5.10. Subscripted Text

The element type subscripted text (sub) denotes that the contained text should appear as a subscript (e.g., a chemical quantity [He₂] or other qualifier).

<!ELEMENT sub - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5.11. Special Characters

In all markup languages, certain characters are reserved to perform special functions. Often authors will desire that the character deny its functional role and perform as character data (SGML CDATA). These entity declarations facilitate escaping the function of a character that is ordinarily a function character.

5.11.1. Ampersand

The character ampersand (&) is a special character to SGML and must be entered as a general entity reference, as in &amp;.

<!ENTITY amp CDATA "&" >

5.11.2. Less Than

The character pattern less than (<) is special to SGML. In regular text it must be entered as a general entity reference, as in &lt;.

<!ENTITY lt CDATA "<" >

5.11.3. Greater Than

The character greater than (>) is a special character to SGML. Although this character can be entered directly into SGML text, we provide this general entity declaration as in &gt;, to balance the ``less than'' character.

<!ENTITY gt CDATA ">" >

5.11.4. Quotation Mark

The character pattern quotation mark (") is special to SGML. To appear as data in an attribute value delimited by quotation marks, it must be entered as a general entity reference, as in &quot;.

<!ENTITY quot CDATA '"' >
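
As a brief sketch of these entities in use, the paragraph below escapes an ampersand, a less-than sign, and the delimiters of a tag that is being discussed rather than used. The sentence content is illustrative only.

```sgml
<p>Research &amp; development costs fell when
<im>n</im> &lt; 10 and rose again when <im>n</im> &gt; 100.
The tag <tt>&lt;br&gt;</tt> has no end tag.
```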

5.12. Preformatted Text

The element type preformatted text (pre) indicates that the contained text is to be formatted exactly as it appears in the SGML source.

The attribute output file (outfile) indicates to which file the preformatted text should be appended in a source code extraction process in a literate program.

Remember that characters less than and ampersand must be escaped in preformatted text (see [special characters]).

<!ELEMENT pre - - ( #PCDATA )*  >

<!ATTLIST pre
  id ID #IMPLIED
  outfile CDATA #IMPLIED
>
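
For example, a literate program might carry a fragment of C source in preformatted text and name the file to which an extraction tool should append it. The file name hello.c is a hypothetical choice, and the extraction tool itself is assumed rather than defined by this DTD. The less-than sign in the include line is escaped as required.

```sgml
<p>The program begins as follows.
<pre outfile="hello.c">
#include &lt;stdio.h>

int main(void)
{
    printf("hello, world\n");
    return 0;
}
</pre>
```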

5.13. Quoted Material

Often, the ETD will require direct quotation of prose, poetry, or named dialog.

5.13.1. Block Quotation

The element type block quotation (blockquote) indicates that the contained text will be formatted as a quoted block of text, possibly by margin indentation on one or both sides and changing to single spacing.

Block quotes may contain more than one paragraph.

<!ELEMENT blockquote - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle | p | citation | table | mm )* >

5.13.2. Attribution

Unlike a citation, the element type attribution (attrib) indicates that the contained text will be formatted as an attribution of a block quote, verse, paragraph, or other contextual construct, possibly by inserting a newline and offsetting the contained text in some way.

<!-- 
For example,

<blockquote>Rosencrantz and Guildenstern are dead</>
<attrib>William Shakespeare, <worktitle>Hamlet, Prince of Denmark</>, 
Act V, Scene II.</>
-->

becomes

Rosencrantz and Guildenstern are dead
William Shakespeare, Hamlet, Prince of Denmark, Act V, Scene II.

<!ELEMENT attrib - O 
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5.13.3. Verse

The element type verse (verse) indicates that the contained head and stanzas will be formatted as literary verse.

<!ELEMENT verse - O ( head?, stanza+ ) >

<!ATTLIST verse 
  id ID #IMPLIED
>

5.13.4. Stanza

The element type stanza (stanza) indicates that the contained speakers, lines, and stage directions will be formatted as a stanza of verse. The speaker element gives the name of the person speaking the lines that follow.

<!--
For example,
<verse>
<stanza>
<speaker>Petruchio
  <line>But here she comes; now, Petruchio, speak.
  <stage>Enter Katherina
  <line>Good morrow, Kate- for that's your name, I hear
<speaker>Katherina
  <line>Well have you heard, but something hard of hearing:
  <line>They call me Katherine that do talk of me. 

<attrib>William Shakespeare, <worktitle>Taming of the Shrew</>, Act II, Scene I
-->

becomes:

Petruchio
  But here she comes; now, Petruchio, speak.
  [Enter Katherina]
  Good morrow, Kate- for that's your name, I hear
Katherina
  Well have you heard, but something hard of hearing:
  They call me Katherine that do talk of me.
William Shakespeare, Taming of the Shrew, Act II, Scene I

A stanza may contain one or more speakers, lines, or stage directions.

<!ELEMENT stanza - O ( speaker | line | stage )+ >

5.13.5. Speaker

The element type speaker (speaker) indicates that the contained text will be formatted to indicate the speaker of the lines that follow.

<!ELEMENT speaker - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

5.13.6. Stanza Line

The element type stanza line (line) indicates that the contained text will be formatted as a line of a stanza of verse.

The attribute indented (indented) indicates that the line should be indented. The default line is not indented (noindent).

<!ELEMENT line - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

<!ATTLIST line 
  id ID #IMPLIED
  indented (indent|noindent) noindent 
  -- for line numbering --
  number NUMBER #IMPLIED
>
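
The sketch below shows the two line attributes together: number records a line number, and indented requests indentation for the second line. The verse text, head, and identifier are hypothetical placeholders.

```sgml
<p><verse id="v.sample"><head>An Untitled Sample
<stanza>
<line number="1">The first line begins at the margin,
<line number="2" indented="indent">and the second is set in from it.
</verse>
```

A line that omits the indented attribute takes the declared default, noindent.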

5.13.7. Stage Direction

The element type stage direction (stage) indicates that the contained text will be formatted as a stage direction.

<!ELEMENT stage - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

<!ATTLIST stage 
  id ID #IMPLIED
>

5.14. Mathematics

The element type mathematics (math) indicates that the contained data is in mathematical notation and will be formatted by the TeX formatter, as if the contained data appeared between \begin{eqnarray*} and \end{eqnarray*}.

Note that the paper-based convention of referring to equations by number is replaced by referring to the equation by a hyperlink (link).

<!ELEMENT math - - ( #PCDATA )* >

<!ATTLIST math
  id ID #IMPLIED
  notation NOTATION (TeX) TeX
>

<!NOTATION TeX SYSTEM >
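
As a sketch of the two mathematics elements used together, the paragraph below names two roots inline and then displays the quadratic formula. Because the content is processed as an eqnarray* environment, the TeX alignment ampersands around the equals sign are written as &amp; so that they reach the formatter as data. The identifier eq.roots is a hypothetical name.

```sgml
<p>The roots <im>x_1</im> and <im>x_2</im> of the quadratic are
<math id="eq.roots">
x_{1,2} &amp;=&amp; \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
</math>
```

A later paragraph could then refer to the equation with a link element whose goesto attribute is eq.roots, in place of an equation number.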

5.15. Lists

5.15.1. Ordered List

The element type ordered list (ol) indicates that the contained line items be formatted as an ordered list. Nested lists will be indented relative to the outer one.

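The attribute type numbering (numbering) indicates that the list will be numbered as follows:

  arabic: arabic numerals (1, 2, 3)
  lalpha: lowercase alphabetic (a, b, c)
  lroman: lowercase roman numerals (i, ii, iii)
  uroman: uppercase roman numerals (I, II, III)
  ualpha: uppercase alphabetic (A, B, C)

The absence of this attribute indicates that each level of nesting should use the next convention in the order listed, proceeding circularly (i.e., arabic follows uppercase alphabetic).
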
<!ELEMENT ol - - ( li )+ >

<!ATTLIST ol
  id ID #IMPLIED
  -- numbering: indicates the enumerator for this level of list
     (default: use in the order listed and circle back
     to arabic six-level nestings and beyond) --
  numbering ( arabic | lalpha | lroman | uroman | ualpha ) #IMPLIED
>
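
For instance, the sketch below sets the numbering of an outer list explicitly and leaves the nested list to the formatter's default convention. The list content is illustrative only.

```sgml
<p>The submission steps are:
<ol numbering="uroman">
<li>Prepare the document.
<li>Validate it:
  <ol>
  <li>against the DTD;
  <li>against the Graduate School guidelines.
  </ol>
<li>Submit the validated document.
</ol>
```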

5.15.2. Unordered List

The element type unordered list (ul) indicates that the contained line items be formatted as an unordered list. Nested lists will be formatted with bullets, dashes, or other appropriate indicators of the level of nesting.

<!ELEMENT ul - - ( li )+ >

<!ATTLIST ul
  id ID #IMPLIED
>

5.15.3. List Item

The element type list item (li) indicates that the contained paragraphs constitute an item of a list.

<!ELEMENT li - O ( head | ol | ul | dl | 
  #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle | p | citation | table | mm )* >

5.15.4. Description List

The element type description list (dl) indicates that the contained items are a list of terms and their definitions. The description term (dt) and description data (dd) will be formatted appropriately, to indicate the given semantic.

<!ELEMENT dl - - ( dt, dd )+ >

<!ELEMENT dt - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

<!ELEMENT dd - O ( head | ol | ul | dl | 
  #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle | p | citation | table | mm )* >
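
A short description list might be entered as in the sketch below. Each term must be followed by exactly one description, and the end tags of dt and dd may be omitted.

```sgml
<dl>
<dt>DTD<dd>Document Type Definition
<dt>ETD<dd>Electronic Thesis or Dissertation
<dt>SGML<dd>Standard Generalized Markup Language
</dl>
```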

6. Hypertext

An ETD may contain hypertext references to other sections within this document, constructs within this document, and to particular words or phrases (e.g., authors may provide their own index). Additionally, the author may refer to external documents using the de facto protocols available on the World Wide Web.

6.1. Anchor of Web Link

The element type anchor of Web link (a) is a construct borrowed from HTML that refers to a Web page or other remote object.

The attribute hypertext reference (href) holds the address of the reference. For example:

<!-- 
Example usage:
    <a href="http://etd.vt.edu/etd/">ETD Home Page</a>
    <a href="mailto:etd@vt.edu">etd@vt.edu</a>
-->
<!ELEMENT a - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

<!ATTLIST a 
  href CDATA #REQUIRED
>

6.2. Hyperlink

The element type hyperlink (link) indicates that hyperlink traversal is possible to any destination: a footnote, citation, block, chapter, a definition, a table, a floating multimedia figure, etc.

The attribute HyTime (HyTime) tells the processing software that this element descends from the HyTime ``clink'' architectural form. The user need not specify the HyTime attribute explicitly [ISO 10744, 1992].

The attribute goesto (goesto) gives the SGML ID of the destination (a target, block, chapter, etc.). Dangling references are not permitted: an element with the corresponding identifier must exist within the ETD.

The content of the link element gives an authored preview of where the link ``goesto,'' such as the author name and year for a bibliographic reference, or ``See ...'' or ``See also ...'' Omitting the content may indicate that a footnote is the destination.

<!-- goesto refers to: 
  target, footnote, citation, float, chapter, section, etc. -->
  
<!ELEMENT link - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

<!ATTLIST link
  id ID #IMPLIED
  HyTime NAME clink
  -- goesto: refers to another identifier in this document --
  goesto IDREF #REQUIRED
>

6.3. Target Element

The element type target element (target) denotes that the contained text can be the target of a hyperlink. Only targets that are linked require a change in formatting.

<!ELEMENT target - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

<!ATTLIST target 
  id ID #IMPLIED
>

6.4. Footnote

The element type footnote (footnote) is a construct that holds the text of a footnote. Footnotes may appear anywhere within a thesis or dissertation, and will not appear as part of the regular formatting. They appear only if their identifier is named by the goesto attribute of a hyperlink and the user requests navigation.

<!ELEMENT footnote - -
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle | p | citation | table | mm )* >

<!ATTLIST footnote
  id ID #REQUIRED
>
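
The sketch below combines the hypertext element types of this chapter: one link with empty content leads to a footnote, and another leads to a target. The identifiers fn.iso and t.dtd are hypothetical names. Both are defined within the fragment, so neither reference dangles.

```sgml
<p>SGML is an international standard<link goesto="fn.iso"></link>
<footnote id="fn.iso">ISO 8879, published in 1986.</footnote>
for document markup. See also the definition of
<link goesto="t.dtd">document type definition</link>.

<p>A <target id="t.dtd">document type definition</target> states the
grammar of a class of documents.
```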

7. Floating Material

This chapter includes all material that appears in the list of multimedia objects, such as tables, figures, graphics, etc. Objects within the floating material may ``float'', that is, they are not necessarily formatted in the order in which they appear in the ETD-ML source.

7.1. Floating Object

The element type floating object (float) declares that the contained elements should be formatted separately from the document and that references should be made (like the traditional figure or table reference) from within the document matter. References to a floating object are made with the link element, whose ``goesto'' attribute names the identifier in the floating object element.

<!ELEMENT float - - ( ( p | citation | table | mm )*, caption? ) >

<!ATTLIST float 
  id ID #REQUIRED
>

7.2. Caption

The element type caption (caption) declares that the contained data is the caption of the floating object.

<!ELEMENT caption - O 
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

<!ATTLIST caption
  id ID #IMPLIED
>
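
As an illustration, the sketch below places a small table in a floating object with a caption and refers to it from a paragraph. The identifier fl.degrees and the wording are hypothetical.

```sgml
<p>The submission type depends on the degree sought; see
<link goesto="fl.degrees">the table of submission types</link>.

<float id="fl.degrees">
<table>
<colhead>Degree
<column><c>Master of Science<c>Doctor of Philosophy
<colhead>Submission
<column><c>Thesis<c>Dissertation
</table>
<caption>Submission type by degree
</float>
```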

7.3. Table

The element type table (table) holds the data that constitutes a table, entered in either column-major or row-major order.

Table information can be coerced into this minimal table implementation without loss. Both traditional row-major and column-major tables are provided. Most tools support row-major output, while the column-major implementation may be more natural for entering large tables with a text editor [Travis and Waldt, 1995].

<!ELEMENT table - O 
  ( ( colhead, column+)* | ( rowhead, row+)* )
>

<!ATTLIST table 
  id ID #IMPLIED
>

7.3.1. Column and Row Head

Element types column head (colhead) and row head (rowhead) hold the heading entries for the column or row.

<!ELEMENT ( colhead | rowhead ) - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

7.3.2. Column and Row

Element types column (column) and row (row) hold lists of cells or heads for the column or row.

<!ELEMENT column - O ( c* | rowhead* ) >
<!ELEMENT row - O ( c* | colhead* ) >

7.3.3. Table Cell

Element type table cell (c) holds the data for the table cell.

<!ELEMENT c - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >
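
To contrast the two orientations, the sketch below enters the same two-by-two data first in column-major and then in row-major form. This arrangement is one reading of the content models above, in which each head introduces the columns or rows that follow it. The identifiers are hypothetical.

```sgml
<!-- Column-major: each colhead is followed by its column of cells -->
<table id="tbl.bycolumn">
<colhead>Degree
<column><c>Master of Science<c>Doctor of Philosophy
<colhead>Submission
<column><c>Thesis<c>Dissertation
</table>

<!-- Row-major: each rowhead is followed by its row of cells -->
<table id="tbl.byrow">
<rowhead>Master of Science
<row><c>Thesis
<rowhead>Doctor of Philosophy
<row><c>Dissertation
</table>
```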

7.4. Multimedia Object

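The element type multimedia object (mm) carries the information required to point to an external multimedia object and to identify that object to the reader of the document. The required entity attribute names an external entity declared in the document type declaration subset.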

<!-- Multimedia object.  The optional content has information about
media format and file size. -->

<!ELEMENT mm - O
  ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br | 
   worktitle | articletitle )* >

<!ATTLIST mm
  entity ENTITY #REQUIRED
>
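
A minimal sketch of a multimedia object inside a floating object follows. It assumes that the entity theatre has been declared with a notation in the document type declaration subset, as in the example in section 9.1. The identifier, the format and size note, and the caption are hypothetical.

<!-- 
  Sample instance markup, not part of the DTD.

<float id="f.globe">
<mm entity="theatre">JPEG image, 48 KB
<caption>A theatre of Shakespeare's time.
</float>
-->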

8. Back Matter

Back matter follows the body matter in a thesis or dissertation.

<!-- Back matter -->

8.1. Back Matter

The element type back matter (back) indicates that the contained references, appendices, and vita will be formatted like chapters in the thesis or dissertation.

<!ELEMENT back - O ( bibliography, appendix*, vita ) >

8.2. Bibliographic References

The element type bibliographic references (bibliography) indicates that the contained head and citation list will be formatted as the reference section of a thesis or dissertation. If the head is omitted, the text ``References'' will head the section.

<!ELEMENT bibliography - O ( head?, ( p | citation | table | mm)* ) >

8.3. Citation

The element type citation (citation) indicates that the contained element types will be formatted as a single citation. Some formatters may reorder the entries according to a predetermined bibliographic style; others will use the order given.

The attribute work type (worktype) tells in which medium the work appears, be it a book, journal, proceedings, video, etc. The attribute published tells whether the work was actually published; its default value is published.

<!ENTITY % citationInclusions "" >
<!ELEMENT citation - O 
  ( address | articletitle | bible | court | edition | editor | email |
  handle | note | number | pages | pubdate | publisher | url | version |
  volume | workauthor | worktitle )*
  %citationInclusions;>

<!ATTLIST citation
  id ID #IMPLIED

  -- worktype: tells what sort of work is being cited, 
     e.g., book, journal, proceedings, video, or WWW page --
  worktype CDATA #REQUIRED

  -- published: tells if the work was actually published --
  published ( published | unpublished ) published
>

8.4. Citation Items

The element types declared below each hold one item of bibliographic information about the cited work, as named by the element type.

These element types facilitate the direct translation of the bibliographic entries into BibTeX bibliographic notation [Lamport, 1994] and into the formats of other tools.

<!ELEMENT 
  ( address | articletitle | bible | court | edition | editor | email |
  handle | note | number | pages | pubdate | publisher | url | version |
  volume | workauthor | worktitle )
  - O ( #PCDATA | em | strong | tt | q | term | foreign | 
   im | link | target | a | sup | sub | br )* >

<!ATTLIST
  ( address | articletitle | bible | court | edition | editor | email |
  handle | note | number | pages | pubdate | publisher | url | version |
  volume | workauthor | worktitle )
  id ID #IMPLIED
>
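
As an illustration, the Goldfarb entry from the bibliography of this document might be marked up as shown below. The identifier and the choice of element type for each item are assumptions of this sketch; in particular, address is used for the place of publication.

<!-- 
  Sample instance markup, not part of the DTD.

<citation id="gold90" worktype="book">
<workauthor>Charles F. Goldfarb
<worktitle>The SGML Handbook
<address>New York
<publisher>Oxford University Press
<pubdate>1990
</citation>
-->

Because the published attribute defaults to published, it is omitted here.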

8.5. Appendix

The element type appendix (appendix) is a document section parallel in the hierarchy to chapter, bibliography, and vita. Each appendix gives support to the ETD, but may not be critical to its main point. The entity appendixInclusions is discussed in the appendix [Inclusions].

<!ENTITY % appendixInclusions "" >
<!ELEMENT appendix - O ( head?, ( p | citation | table | mm)*, section* ) %appendixInclusions; >

<!ATTLIST appendix 
  id ID #IMPLIED
>

8.6. Vita

The element type vita (vita) is a document section parallel to chapter, bibliography, and appendix that holds the description of the author's life's work. This section may include work experience, education, other publications, marriage information, family members, and places the author has lived.

<!ELEMENT vita - O ( head?, ( p | citation | table | mm)* ) >
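
Putting the back matter declarations together, a skeleton might look like the sketch below. All identifiers and text are placeholders. The end tags for head and p are written out because their declarations appear in the Body Matter section rather than here; the end tags of the citation items are omitted, as their declaration permits.

<!-- 
  Sample instance markup, not part of the DTD.

<back>
<bibliography>
<head>References</head>
<citation id="trav95" worktype="book">
<workauthor>Brian E. Travis and Dale C. Waldt
<worktitle>The SGML Implementation Guide
<pubdate>1995
</citation>
</bibliography>
<appendix id="app.a">
<head>Supporting Material</head>
<p>Text of the appendix.</p>
</appendix>
<vita>
<p>Text of the vita.</p>
</vita>
</back>
-->

Note that the content model of back requires one bibliography first and one vita last, with any number of appendices between them.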

9. Building an ETD

The beginning of the ETD document must contain a reference to the DTD to which it conforms, including pointers to all external files used in its processing.

9.1. Entity and Notation Declarations

In SGML, references to external files occur in ENTITY declarations in the document type declaration subset, and each declaration of an external data entity must name its notation type. This strategy consolidates external references at the top of the document, which facilitates interchange of SGML documents.

We provide a complete example of a document type declaration and subset below. It includes NOTATION and ENTITY declarations for external multimedia objects. Note that each notation type is declared only once, no matter how many entities use it.

<!-- 
  This is an example document type declaration with subset.  
  It begins an SGML ETD.

<!DOCTYPE etd SYSTEM "etd.dtd" [
  <!NOTATION jpeg SYSTEM >
  <!ENTITY theatre SYSTEM "theatre.jpg" NDATA jpeg >
  <!ENTITY stratford SYSTEM "stratford.jpg" NDATA jpeg >
]>
<etd> 
<front>
<title>Use of Metaphor in Shakespeare's Plays ...
--> 


Bibliography

KIPP96
Kipp, Neill A., ``SGML DTD Usability Study.'' Proceedings of SGML '96, Arlington, VA: Graphic Communication Association, November, 1996.

LAMPORT94
Lamport, Leslie, LaTeX: A Document Preparation System User's Guide and Reference Manual. Reading, Massachusetts: Addison-Wesley, Second edition, 1994.

GOLDFARB90
Goldfarb, Charles F., The SGML Handbook. New York: Oxford University Press, 1990.

TRAVIS95
Travis, Brian E., and Dale C. Waldt, The SGML Implementation Guide: A Blueprint for SGML Migration. Berlin: Springer-Verlag, 1995.

ISO8879
International Organization for Standardization, ISO/IEC 8879 - Standard Generalized Markup Language: SGML. 1986.

ISO10744
International Organization for Standardization, ISO/IEC 10744 - Hypermedia/Time-based Structuring Language: HyTime. 1992.

Appendix A. Inclusions and ETD-ML Extensibility

<!-- Inclusions -->

ETD-ML is designed to be extensible without such extensions becoming burdensome for the author. The least invasive way to add elements to a document type definition is through the SGML inclusion feature. Elements within the inclusion set may appear at any level of nesting below the element on which the inclusion is declared.

In SGML, an element declaration in the DTD cannot be overridden by repeating it in the document type declaration subset. Parameter entity declarations made in the subset, however, are read first, and any later declaration of the same entity is ignored. In this way, users can add elements to the DTD at specific, predeclared points.

If the user replaces the following declarations as in the examples given, then extra element types will be available at each level of the document.

A.1. Chapter Inclusions

In the DTD subset, authors may replace the entity

  <!ENTITY % chapterInclusions "" >
with, for example,
  <!ENTITY % chapterInclusions "+( mypara | myquote )" >
and declare element types mypara and myquote. For each non-ETD-ML element type, authors should declare an attribute called etd (usually NAME #FIXED) whose value is the element type from this DTD that most closely resembles the one the author defines. For example, if the element type mypara most closely resembles an ETD-ML paragraph, then the attribute list for mypara should contain the etd NAME #FIXED "p" attribute definition.
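
A sketch of such a subset follows. The element types mypara and myquote come from the example above; their content models, and the choice of q as the closest ETD-ML element type for myquote, are assumptions made for illustration.

<!-- 
  Sample document type declaration with chapter inclusions.

<!DOCTYPE etd SYSTEM "etd.dtd" [
  <!ENTITY % chapterInclusions "+( mypara | myquote )" >
  <!ELEMENT ( mypara | myquote ) - O ( #PCDATA ) >
  <!ATTLIST mypara  etd NAME #FIXED "p" >
  <!ATTLIST myquote etd NAME #FIXED "q" >
]>
-->

Because the subset is read before etd.dtd, this declaration of chapterInclusions takes effect and the empty one in the DTD is ignored.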

A.2. Appendix Inclusions

Like the chapter inclusions, the following may be replaced with inclusions at the appendix level.

  <!ENTITY % appendixInclusions "" >

A.3. Citation Inclusions

The following allows authors to have inclusions at the citation level.

  <!ENTITY % citationInclusions "" >
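
For example, an author who wants to record ISBNs in citations might use the sketch below. The element type isbn is hypothetical; number is chosen as the value of its etd attribute on the assumption that it is the closest existing citation item.

<!-- 
  Sample document type declaration with a citation inclusion.

<!DOCTYPE etd SYSTEM "etd.dtd" [
  <!ENTITY % citationInclusions "+( isbn )" >
  <!ELEMENT isbn - O ( #PCDATA ) >
  <!ATTLIST isbn etd NAME #FIXED "number" >
]>
-->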

A.4. Paragraph Inclusions

Finally, authors may include elements that flow within paragraphs.

  <!ENTITY % paragraphInclusions "" >

Vita



Table of Contents

Title page: Document Type Definition for Electronic Theses and Dissertations
1 Metadata
1.1 DTD Title
1.2 Public Identifier
2 Document Element, Electronic Thesis or Dissertation
3 Front Matter
3.1 Front Matter
3.2 Title
3.3 Author
3.4 Submission Type
3.5 School
3.6 Degree
3.7 Major
3.8 Committee Approvals
3.9 Name
3.10 Date
3.11 City
3.12 State
3.13 Keyword List
3.14 Keyword or Phrase
3.15 Copyright Notice
3.16 Abstract
3.17 Grant Information
3.18 Document Dedication
3.19 Author's Acknowledgments
4 Body Matter
4.1 Document Body
4.2 Chapter
4.3 Section
4.4 Subsection
4.5 Block
4.6 Subblock
4.7 Paragraph
4.8 Head
5 Body Text
5.1 Emphasized Text
5.2 Strong Text
5.3 Natural Break
5.4 Typed Text
5.5 Inline Quoted Text
5.6 Term
5.7 Foreign Word or Phrase
5.8 Inline Mathematics
5.9 Superscripted Text
5.10 Subscripted Text
5.11 Special Characters
5.11.1 Ampersand
5.11.2 Less Than
5.11.3 Greater Than
5.11.4 Quotation Mark
5.12 Preformatted Text
5.13 Quoted Material
5.13.1 Block Quotation
5.13.2 Attribution
5.13.3 Verse
5.13.4 Stanza
5.13.5 Speaker
5.13.6 Stanza Line
5.13.7 Stage Direction
5.14 Mathematics
5.15 Lists
5.15.1 Ordered List
5.15.2 Unordered List
5.15.3 List item
5.15.4 Description List
6 Hypertext
6.1 Anchor of Web Link
6.2 Hyperlink
6.3 Target Element
6.4 Footnote
7 Floating Material
7.1 Floating Object
7.2 Caption
7.3 Table
7.3.1 Column and Row Head
7.3.2 Column and Row
7.3.3 Table cell
7.4 Multimedia Object
8 Back Matter
8.1 Back Matter
8.2 Bibliographic References
8.3 Citation
8.4 Citation Items
8.5 Appendix
8.6 Vita
9 Building an ETD
9.1 Entity and Notation Declarations
Bibliography
A Inclusions and ETD-ML Extensibility
A.1 Chapter Inclusions
A.2 Appendix Inclusions
A.3 Citation Inclusions
A.4 Paragraph Inclusions
Vita

ETD-to-HTML formatting system:
etd2html (prototype) Version 3.0.1 beta, March 4, 1997
This document formatted on Sun Mar 9 20:34:31 EST 1997.
Comments and questions? etd@etd.vt.edu.