[Archive copy mirrored from the URL: http://www.qucis.queensu.ca/achallc97/papers/p036.html; see this canonical version of the document.]
Keywords: SGML, editing, database
The Model Editions Partnership is developing a series of sample editions for delivering historical documents on the World-Wide Web. The Partnership includes the editors of seven on-going documentary editions as well as leaders from the Text Encoding Initiative and the Center for Electronic Text in the Humanities. The project began in July 1995 with major funding from the National Historical Publications and Records Commission (U.S.) and the University of South Carolina. The first six phases will be completed in June 1998. The goals for that three-year period are:
The Partnership is predicated on the view that SGML markup can be used to create the scholarly frameworks required and that SGML markup offers a practical method for preparing and delivering documentary editions. A close study of the scholarly issues involved in preparing documentary editions led to the publication of "A Prospectus for Electronic Historical Editions" in May 1996 (http://mep.cla.sc.edu/prostoc.htm). The Prospectus set forth a series of design principles, a typology of the kinds of editions which might be expected to develop, and a discussion of the importance of markup. (A report on the Prospectus was given at the ALLC/ACH conference in Bergen.)
The Bergen report also noted the Partnership's development of two Document Type Definitions based on a subset of the TEI Guidelines. The data-capture DTD
The archival DTD retains the MEP header, the redefinition of <docGroup>, and the other elements required for the structure of historical editions. However, divisional units like <opener> are restored, as is the more familiar highlight element with its attributes. This transformation is done mechanically. The premise here is that future migration of the data will be more easily accomplished if the archival markup of the data conforms more closely to current TEI markup.
As this abstract is being written, the staff is marking up the text of the sample editions and plans are being made for delivering the sample editions on the Web in June 1997. One of the major facets of the presentation at ACHALLC97 will be a progress report on the encoding and delivery work that will take place between November 1996 and the end of May 1997. Although it is impossible to predict with certainty the details of our progress at this time, certain aspects of the development of the sample editions in that period are predictable.
Despite the intensive review of documents which provided the basis for the development of the data-capture DTD, experience shows that markup systems evolve over time. Some changes may be made for the convenience of data entry; others, because new textual features are identified; and still others, because textual features previously ignored assume importance in the eyes of the scholar. The presentation will review our experience in these areas with explicit examples of where we chose to expand the DTD as well as other cases where we chose not to expand the DTD. Beyond the markup questions which arise from the texts are those which will undoubtedly arise as we move the data into the delivery environments.
We expect to deliver the sample editions using both SoftQuad's Panorama and Electronic Book Technology's DynaWeb. At this point, we have created small mockups with Panorama, but we have no experience with DynaWeb. (We have been accepted in EBT's Higher Education Grant program and our full application is currently pending.) Markup issues associated with implementing the samples in each software package are important and the ACHALLC97 presentation will summarize our findings in this area. To put it another way, we will look at the following questions. Which markup is easier to implement in each of the delivery systems? Are there adjustments in either DTD which would make it easier? What are the trade-offs if DTDs are modified to facilitate delivery? Modification of the DTDs can, of course, be avoided by transforming the existing data to meet the required formats for the delivery software. This can be accomplished with macros or filters and is a common practice in print publishing. At the same time, we have sometimes found that the markup we were using could have been modified with no loss of information and with significant gains in processing, thus reducing costs.
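The kind of filter mentioned above can be illustrated with a minimal sketch. It is not the Partnership's tooling: it assumes the data is available in a well-formed, XML-like serialization that Python's standard xml.etree.ElementTree can parse, and the element names "ital" and "undl", the mapping table, and the sample sentence are all invented for illustration. Only the idea of renaming capture-time elements to a single highlight element with an attribute comes from the discussion above.

```python
import xml.etree.ElementTree as ET

# Hypothetical capture-time element names mapped to the value of a "rend"
# attribute on a generic <hi> element. None of these names are taken from
# the MEP DTDs; they stand in for whatever a delivery tool might require.
REND_FOR = {"ital": "italic", "undl": "underline"}


def filter_markup(source: str) -> str:
    """Rename each hypothetical capture element to <hi rend="...">.

    Text content and document structure are left untouched; only the
    element name and one attribute change, so no information is lost.
    """
    root = ET.fromstring(source)
    for elem in root.iter():
        rend = REND_FOR.get(elem.tag)
        if rend is not None:
            elem.tag = "hi"
            elem.set("rend", rend)
    return ET.tostring(root, encoding="unicode")


# Invented sample data, used only to exercise the filter.
sample = "<doc><p>Printed in the <ital>Gazette</ital> today.</p></doc>"
print(filter_markup(sample))
```

Because the filter only renames elements, the source data stays in its original form and the DTD itself does not need to change; the trade-off is that a separate filter has to be maintained for each delivery format.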
The idea of establishing a national database for American documentary editions has been implicit from the beginning of the Model Editions Partnership. A database of American materials which cuts across the humanities and sciences offers many advantages which cannot be accomplished easily by piecemeal publication. We have often noted that scholarly editions are printed on paper designed to last 300 years. If we are to achieve similar longevity in our electronic editions, we must have the kind of infrastructure that supports itself and the migration of the editions. A subscription database can meet those requirements. Further, the gathering of editions in a digital archive will facilitate cross-edition searching and cross-edition indexing. And we can expect such a depository to provide the metadata required by librarians as well as version control. The ACHALLC97 presentation will expand on these themes and describe some of the projects which are being planned as stepping stones toward the establishment of an American Documentary Heritage database.