[Mirrored from: http://www.ceth.rutgers.edu/info/news32/ACHALLC.html]


There was evidence of the continuing application of SGML in the humanities, in both the theoretical and the practical realms, at this year's ACH/ALLC Conference, held at Santa Barbara July 11-15.

A number of ongoing projects which use the TEI presented reports. David Chesnutt gave an overview of the Model Editions Partnership, which brings together individual projects that are encoding large collections of historical documents. Out of their collaboration will come a set of guidelines for using the TEI, aimed at the larger community of historical editors.

Michael Sperberg-McQueen elaborated on the technical issues. In its first phase, the partnership will produce a set of models. Each model will demonstrate a different approach to the construction and presentation of scholarly apparatus and to document delivery, both as stand-alone products and as on-line network resources. In addition, the models will reflect a variety of strategies for managing the encoding process.

The preview which Sperberg-McQueen gave of these strategies emphasized a flexible, bi-directional relationship with the TEI. On the one hand, each model will specify a subset of the SGML tags defined by the TEI. On the other, some models may need to extend the TEI tag set, especially for oft-encountered minor documents like personal or business letters, which the TEI treats only in passing. In addition to the actual editions, the Model Editions Partnership will bring to historians a set of protocols for historical editing in the electronic age.

The History of Women's Writing in the British Isles Project, centered at the University of Alberta, presented a poster session on its progress to date. This project is moving into uncharted territory. Humanities projects that involve electronic texts usually begin with the conversion of pre-existing print documents, which means that the process of document markup is intertwined with the process of document conversion. The History of Women's Writing in the British Isles will be the first full study of its kind, and will be created from the beginning in electronic form.

Data is being entered first in a database. At a later stage it will be drawn upon to produce a print chronology of women's writing and the individual author biographies. In the electronic versions of the History, SGML will be called upon to capture the wealth of detail and relational links present in the database, as a basis for hypertextual linking.

Three papers in a session on document encoding explored some theoretical problems involving SGML and the TEI in particular. In a paper entitled "You Can't Always Get What You Want," Michael Neuman of Georgetown University wrestled with the dilemma of encoding content in a content-free manner. His ideas began with the premise that there is a difference between structure (say, chapters and sections) and logical content (e.g. names and dates).

This premise ignores a host of workaday examples where the two overlap, of course. A stage direction is manifestly distinct - both structurally and, it follows, typographically - from a speech, but the information which it provides is clearly not without relevance to how, and even what, the speaker says. What Mr. Neuman suggested is that markup procedures should be designed to reflect the increase in subjectivity of interpretation when one moves from "purely" structural markup, to key features of content, to interpretative material like apparatus and commentary.

Apparatus and commentary were the topic of a second paper by Gregory Murphy, of CETH. Murphy gave an overview of the difficulties he has encountered working with the tag set for critical apparatus in the TEI. He broke them down into three groups: those arising from the limitations of a hierarchical encoding language like SGML, those arising from the TEI's use of SGML, and those caused by current commercial software's often restrictive interpretation of SGML.

The basic difficulty one encounters when trying to design an apparatus is that the base readings do not always nest properly. Ink stains, alas, rarely confine themselves to a single line of manuscript. A reading may overlap a structural boundary. Multiple readings may overlap each other. The TEI recommends the use of anchors to mark the start and end of such spans of text, but because anchors are empty tags, they do not between them define an element that any current SGML software will recognize.

Murphy concluded with two suggestions for work-arounds. The first was to use conversion filters that pair up anchors and then mark up their implied "content" with groups of segment tags. The second was to use HyTime syntax to describe the spans between anchors.
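
The first of these work-arounds can be illustrated with a minimal sketch. It is not Murphy's filter. It only shows the general idea of pairing anchors and wrapping the text between them in segment tags, so that a span crossing a structural boundary becomes several properly nested pieces. The sketch assumes XML-style empty tags (`<anchor id="..."/>`) rather than SGML's own syntax. The function name `wrap_spans`, the span name `stain`, and the use of the `n` attribute on `seg` to tie the pieces together are all hypothetical choices made for illustration.

```python
import re

# Splits the input into tags and the text chunks between them.
TAG = re.compile(r"(<[^>]+>)")
# Assumption: anchors are written XML-style, with a quoted id.
ANCHOR = re.compile(r'<anchor\s+id="([^"]+)"\s*/>')


def wrap_spans(source, spans):
    """Wrap the text between paired anchors in <seg> tags.

    spans maps a (hypothetical) span name to a (start_id, end_id) pair.
    Each text chunk is wrapped on its own, so a span that crosses an
    element boundary yields several well-nested <seg> pieces.
    """
    starts = {start: name for name, (start, _end) in spans.items()}
    ends = {end: name for name, (_start, end) in spans.items()}
    open_spans = []  # names of the spans currently in force
    out = []
    for piece in TAG.split(source):
        if not piece:
            continue
        anchor = ANCHOR.fullmatch(piece)
        if anchor:
            ident = anchor.group(1)
            if ident in starts:
                open_spans.append(starts[ident])
            if ident in ends and ends[ident] in open_spans:
                open_spans.remove(ends[ident])
            out.append(piece)
        elif piece.startswith("<") or not piece.strip() or not open_spans:
            # Other tags, whitespace, and text outside any span pass through.
            out.append(piece)
        else:
            for name in open_spans:
                piece = '<seg n="%s">%s</seg>' % (name, piece)
            out.append(piece)
    return "".join(out)


# An ink stain that runs from the middle of one line into the next.
sample = ('<l>The <anchor id="a1"/>quick</l>'
          '<l>brown<anchor id="a2"/> fox</l>')
result = wrap_spans(sample, {"stain": ("a1", "a2")})
print(result)
```

Because each piece of text is wrapped separately, the output stays well-formed even when two spans overlap each other. The cost is that one reading is now scattered over several elements, which downstream software must reassemble by their shared `n` value.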

The final paper of the session was by Gary Simons of the Summer Institute of Linguistics. Simons has developed what promises to be one of the first TEI "meta-applications," in this case a program which maps the features represented by TEI feature structure tags to object attributes in an object-oriented database system. Originally designed for linguistic markup, the feature structure tag set can be used to inscribe almost any kind of information. In addition to their syntax, which an SGML system can understand, these tags carry detailed information in their content.

The rules governing this information are laid out in a Feature System Declaration (FSD). The application which Mr. Simons described interprets the rules in an FSD. Using them, it extracts field and attribute information from any TEI document which has feature structure tagging, and loads that information into CELLAR, an object-oriented database developed at the Summer Institute for storing information about languages.
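
The general shape of such a mapping can be suggested with a short sketch. It is not Mr. Simons's program and it does not use CELLAR. It only shows, under simplifying assumptions, how feature structure tags might be read into object attributes and checked against a set of rules. The `FSD` dictionary is a hypothetical stand-in for a real Feature System Declaration. The `Record` class, the function name, and the feature names `pos` and `number` are likewise invented for illustration, and the markup is written XML-style so that a standard parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a Feature System Declaration: for each
# feature-structure type, the permitted features and their values.
FSD = {
    "word": {
        "pos": {"noun", "verb"},
        "number": {"singular", "plural"},
    }
}


class Record:
    """A plain object whose attributes are filled from one <fs> element."""

    def __init__(self, kind):
        self.kind = kind


def load_feature_structures(xml_text, fsd):
    """Turn each <fs> element into a Record, checking it against fsd."""
    records = []
    root = ET.fromstring(xml_text)
    for fs in root.iter("fs"):
        kind = fs.get("type")
        if kind not in fsd:
            raise ValueError("undeclared feature structure type: %r" % kind)
        allowed = fsd[kind]
        record = Record(kind)
        for f in fs.findall("f"):
            name = f.get("name")
            sym = f.find("sym")  # only symbolic values are handled here
            value = sym.get("value") if sym is not None else None
            if name not in allowed or value not in allowed[name]:
                raise ValueError(
                    "%s=%r not permitted for %s" % (name, value, kind))
            setattr(record, name, value)
        records.append(record)
    return records


sample = """<text>
  <fs type="word">
    <f name="pos"><sym value="noun"/></f>
    <f name="number"><sym value="plural"/></f>
  </fs>
</text>"""
records = load_feature_structures(sample, FSD)
print(records[0].kind, records[0].pos, records[0].number)
```

The point of the design is that the rules live outside the program: changing the declaration changes what the loader will accept, without any change to the code that reads the tags.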
