XML Version of the TEI DTD
Date: Fri, 18 Jun 1999 15:45:16 CDT
From: C M Sperberg-McQueen <cmsmcq@acm.org>
Subject: Re: XML and TEI architectural forms
Re: [Gregory Murphy on Thu, 10 Jun 1999 12:17:35 CDT]
>The auditing process that will be required to port the TEI DTD to XML
>will require lots of careful human scrutiny. Many choices will have to
>made among competing alternatives. I suspect that many of the
>transformations will not be readily expressed as simple transformations.
>For example, since XML does not allow global inclusions or exclusions,
>how are the effected content models to be modified? What is to be done
>with those attributes of type NUTOKEN(S) or NAME(S)? All mapped to a new
>type, or mapped to a type that makes sense based on the semantics of
>their element?
[C M Sperberg-McQueen]
This seems like a good time to announce the availability of several items which may be of interest to those readers of this list who care about XML:
Document ED W69, "Construction of an XML Version of the TEI DTD", which Lou Burnard and I have recently completed. It discusses, sometimes in tedious detail, what we have made of all the choices Greg describes in the paragraph just quoted, and what changes need to be made to the TEI DTD in order to realize those choices. Quick answers to the specific questions Greg poses: drop exclusions, propagate inclusions downward into the content model of every possible descendant, and redefine the attributes as NMTOKEN(S). This document is available on the Web at http://www.uic.edu/orgs/tei/ed/edw69.html (and .sgml).
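Two of the quick answers above (dropping exclusions and redefining attribute types) can be illustrated with a minimal Python sketch. It is not the procedure used in ED W69: the mapping table, the function names, and the sample declaration are all hypothetical, and propagating inclusions into descendant content models is not attempted here.

```python
import re

# Illustrative mapping of SGML-only declared values to XML attribute types.
# Only NUTOKEN(S) and NAME(S) are named in the post; this table is an
# assumption, not a copy of what ED W69 does.
XML_TYPE_FOR = {
    "NUTOKEN": "NMTOKEN",
    "NUTOKENS": "NMTOKENS",
    "NAME": "NMTOKEN",
    "NAMES": "NMTOKENS",
}


def xml_attribute_type(sgml_type: str) -> str:
    """Return an XML attribute type for an SGML declared-value keyword.

    Keywords not in the table are returned unchanged apart from upper-casing.
    """
    keyword = sgml_type.upper()
    return XML_TYPE_FOR.get(keyword, keyword)


# An exclusion exception is a minus sign directly followed by a name group,
# e.g. "-(note)". The minus signs used for tag omissibility are followed by
# a space, so this pattern does not match them.
_EXCLUSION = re.compile(r"\s*-\([^()]*\)")


def drop_exclusions(element_decl: str) -> str:
    """Remove exclusion exceptions from one SGML ELEMENT declaration."""
    return _EXCLUSION.sub("", element_decl)


if __name__ == "__main__":
    # Hypothetical declaration, for illustration only.
    print(drop_exclusions("<!ELEMENT head - O (#PCDATA | hi)* -(note)>"))
    print(xml_attribute_type("NUTOKEN"))
```

The regular expression is deliberately naive: it handles one declaration at a time and ignores comments and parameter-entity references that might themselves expand to exceptions.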
The files teixml.ent and teixml.dtd, which are generated by document ED W69 and are suitable for use as TEI extensions files, with either the DTDs of September 1994 or the corrected DTD which was included on the CD-ROMs distributed at the ACH/ALLC'99 conference at Charlottesville and which Lou and I are even now preparing to post on the servers. These files are available on the Web at http://www.uic.edu/orgs/tei/ed/teixml.ent and http://www.uic.edu/orgs/tei/ed/teixml.dtd.
The revised Pizza Chef, a Web interface that (a) makes it easier to understand how to choose TEI base and additional tag sets to define a view of the TEI DTD, and (b) can generate a single-file version of the view you specify. If you give the URLs of your extensions files, it can fetch them and incorporate them into the resulting DTD. (N.B. there are a few limitations, noted on the page itself.)
The Pizza Chef now has a set of buttons allowing you to specify that you want your one-file DTD to be in XML. If you also specify a pair of extensions files, it appends the teixml extension files to the ones you provide, so your declarations will normally take precedence. You are responsible for ensuring that your extensions files don't use ampersands, inclusions, or exclusions; you do not need to worry about tag-omissibility information (the - - and - O appearing after the generic identifier in an ELEMENT declaration). Syd Bauman points out that you must also adjust any SDATA entity declarations you use.
I've finally moved the Pizza Chef from its old location in the 'test' directory to the new address http://www.uic.edu/orgs/tei/pizza.html. The Oxford pizza chef has not yet caught up, but hey, most people prefer Chicago pizza in any case.
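The restrictions on extensions files mentioned above (no ampersands, inclusions, or exclusions) could be checked before submitting the files to the Pizza Chef. The following Python sketch is one hypothetical way to do that; it is not part of the Pizza Chef, the function name is invented for illustration, and it does not handle comments inside declarations or constructs hidden behind parameter entities.

```python
import re

# Matches one ELEMENT declaration, possibly spanning several lines.
# Assumption: no ">" appears inside the declaration (e.g. in a comment).
_ELEMENT_DECL = re.compile(r"<!ELEMENT\b(.*?)>", re.DOTALL)


def find_sgml_only_constructs(dtd_text: str) -> list:
    """Return (declaration, problem) pairs for constructs XML DTDs disallow."""
    problems = []
    for match in _ELEMENT_DECL.finditer(dtd_text):
        # Collapse whitespace so multi-line declarations print on one line.
        decl = " ".join(match.group(0).split())
        body = match.group(1)
        if "&" in body:
            problems.append((decl, "ampersand connector"))
        # Exceptions are a sign directly followed by a name group, preceded
        # by whitespace; tag-omissibility minus signs are followed by a space.
        if re.search(r"\s\+\(", body):
            problems.append((decl, "inclusion"))
        if re.search(r"\s-\(", body):
            problems.append((decl, "exclusion"))
    return problems


if __name__ == "__main__":
    # Hypothetical declarations, for illustration only.
    sample = """<!ELEMENT a - - (b & c)>
<!ELEMENT d - O (#PCDATA) +(e) -(f)>
<!ELEMENT g - - (h, i+)>"""
    for decl, problem in find_sgml_only_constructs(sample):
        print(f"{problem}: {decl}")
```

Tag-omissibility indicators are deliberately not reported, since the post says they need no attention from the author of the extensions files.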
A version of TEI Lite in XML, in a file called teixlite.dtd, which was generated by using the Pizza Chef. It is available at http://www.uic.edu/orgs/tei/lite/teixlite.dtd. We have tested it extensively, for as long as two minutes, and we believe it's legal and defines the same set of elements as TEI Lite. The entities were omitted, for the moment, because I haven't taken the time to find an XML version of the standard ISO entity sets. I also took out the call to usrmods, for the reasons I outlined on this list the other day.
Please look at these, if you are interested, and report any bugs to the editors, or to this list.
-C. M. Sperberg-McQueen
Editor, ACH/ACL/ALLC Text Encoding Initiative cmsmcq@acm.org (Note that the address U35395@UICVM.uic.edu now just forwards mail to cmsmcq@acm.org and will eventually go away. Beat the rush; go ahead and change your address book now!)
Prepared by Robin Cover for The SGML/XML Web Page archive. See 'TEI-L LOG9906' available from LISTSERV@LISTSERV.UIC.EDU.