SGML: Society for Textual Scholarship, Conference Report

Date:         Mon, 17 Apr 1995 18:18:22 CDT
Reply-To: "C. M. Sperberg-McQueen" <U35395%UICVM.bitnet@UTARLVM1.UTA.EDU>
Sender: Text Encoding Initiative public discussion list
From: "C. M. Sperberg-McQueen" <U35395%UICVM.bitnet@UTARLVM1.UTA.EDU>
Organization: ACH/ACL/ALLC Text Encoding Initiative
Subject:      trip report - Society for Textual Scholarship
To: Multiple recipients of list TEI-L <TEI-L%UICVM.bitnet@UTARLVM1.UTA.EDU>

                              Trip Report:
                    Society for Textual Scholarship
                     New York City, 6-8 April 1995

   The Society for Textual Scholarship held its conference in New York
this past week; though I had been interested in the conference's program
before, this was the first time I was actually able to attend.
Herewith some notes on the portion of the conference I was able to see.

   The meeting was held at the Graduate School and University Center of
the City University of New York, a building plastered with posters
describing and protesting the punishing budget cuts about to be levied
in the CUNY budget by a vindictive state legislature and governor -- do
New York's politicians really appreciate so little the jewels of their
own state university system?  [Sigh.  The only consolation, if consola-
tion it is, is the observation that nothing is new under the sun.  Dan-
iel Coit Gilman, his biographers say, was lured away from the presidency
of the University of California to assume the presidency of the fledg-
ling Johns Hopkins University with the single observation that, as a
private university, Hopkins would have no occasion for dealings with any
state legislature.  I have never met anyone who has ever worked at a
state university in the U.S. who had any trouble understanding his deci-
sion.]  The posters reminded me of the wall postings used to call demon-
strations when I was a student, so I was mildly surprised to find they
had been posted not by left-wing student groups but by the office of the
president of the university.

   After the usual greetings by dignitaries, the meeting proper began
with a session entitled "Text and Technology."  David Seaman began with
an account of the University of Virginia's Electronic Text Center, in
which he contrasted computer-centered and discipline-centered approaches
to humanities computing (the Old Humanities Computing and the New).  At
Virginia, the Electronic Text Center found, when it started, that it was
necessary to build an effective user community, as well as building
resources for them.  Traditional humanists do take quite well to elec-
tronic texts when they are properly presented, but the text, not the
computer, must be at the center of activity.   Software independence,
and hence standards, are critical to this approach.

   Marilyn Lavan and Kirk Alexander of Princeton then described the
Piero Project, which grew out of Lavan's work on the frescoes of
Piero della Francesca and which has grown into an effort to create a
general framework for the integration of visual data with its context --
both its physical context (visible in three-dimensional representations
of the positions of various frescoes in the church) and its intellectual
and art-historical context (made accessible through links to a relation-
al database of art-historical terms and concepts).  A stunning videotape
demonstration of the software showed how computer-assisted design soft-
ware had been used to allow the user to move at will (or as freely as
your hand-eye-mouse coordination will allow) through the three-
dimensional space occupied by the frescoes.  The Piero project, and
Lavan's earlier work on frescoes, have always struck me as extremely
interesting attempts to come to grips with material for which no easy
pre-existing tools exist, and the current demonstration reconfirmed me
in my admiration.

   The final speaker of the first session was Jerome McGann, who gave an
overview of the mission of the Institute for Advanced Technology in the
Humanities at the University of Virginia, and the various projects sup-
ported there.  He was, on the whole, remarkably little flustered over
the fact that none of his intended visual aids could be used at all:
the Institute, tired of the disasters which lie in wait for demos in
distant places by way of telephone connections, SLIP, and Telnet, has
recently purchased a laptop RS 6000 machine for such demos.  The new
laptop is self-contained and thus immune, more or less, to the problems
of live demos via telecommunications links.  Unfortunately, the machine
needed an adapter cable to work with the CUNY display system, and no
such cable was available.  The problem was, McGann said, perhaps an
allegory of technology.  He may have meant merely that technology obeys
Murphy's Law, but I think he was referring more specifically to the
missing ADAPTER cable:  the allegorical message, I infer, is "Adapt or
die."

   After lunch, the participants divided into nine groups for parallel
sessions.  One continued the theme of the morning, under the title
"Text and Technology II".  In it, I tried to answer the question "What is
the TEI and Why Should Editors Care?"  The burden of my argument will be
familiar to readers of this list:  above all, the TEI is an attempt to
create an essential part of the infrastructure necessary for the digital
libraries we would like to have in the future, a part more essential, in
the long run, even than the hardware and software which will implement
the digital library.  Editions should be built around software-
independent standards like SGML and the TEI because the alternative,
building them around specific pieces of software (HyperCard, Word
Cruncher, Mosaic) dooms them to lives as short as those of the software
on which they depend.  Entire software generations have risen up and
declined and faded from memory in less time than is taken by even moder-
ately fast editions of major authors or public figures.  Building elec-
tronic editions around particular pieces of software is like printing
paper editions on paper guaranteed to self-destruct within ten years.
(Come to think of it, my East German edition of Faust did self-destruct
after about ten years, owing to the extreme acidity of the paper.
Another allegorical lesson there, perhaps.)

   Mary Jo Kline, who currently serves as a consultant to the Library of
Congress's National Digital Library Project (originally the American
Memory Project) spoke next, on the perils of actually encoding text in
electronic form.  From the experiences of American Memory, she drew a
number of lessons, both positive and negative, for future projects.
Naturally, it was the mistakes she described which stick most vividly in
my mind, though I am happy to say I believe they give a wholly inaccu-
rate picture of the overall quality of the American Memory project.
When the texts were chosen for American Memory, qualified experts were
of course consulted; when the DTD was developed, however, the develop-
ment group included no one who had actually edited nineteenth century
broadsheets, diaries, or any of the other kinds of material in the cor-
pus for publication.  The group's initial efforts did not recognize that
a DTD serves, in effect, as a statement of transcription principles; so
detailed transcription guidelines were not worked out in advance.  The
initial data capture was not SGML, but only "almost-SGML".  Money was
saved, at first, not only by using inexpensive contractors to do the
keyboarding, but also by skimping on quality control and proofreading.
(The capital mistake here was that the project BELIEVED THE VENDORS.  As
Lenin is supposed to have said:  trust is good; inspection is better.)
The definition of CAPTION was designed to conform to standard practice
in cataloguing visual materials -- with the result that map legends and
other text within the image itself were not transcribed, though useful
for retrieval.
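
   Kline's point that a DTD serves as a statement of transcription
principles can be made concrete with a small DTD fragment.  (This is
purely my own illustration, not a piece of the American Memory DTD; the
element names are invented.)  By declaring which elements may occur, a
DTD silently decides what a transcriber must, may, and cannot record:

```sgml
<!-- Hypothetical fragment, not the American Memory DTD.       -->
<!-- Declaring LEGEND commits transcribers to capturing text   -->
<!-- within the image; a DTD without it would silently forbid  -->
<!-- transcribing map legends, whatever their retrieval value. -->
<!ELEMENT broadside  - - (caption, legend*, body)>
<!ELEMENT caption    - - (#PCDATA)>
<!ELEMENT legend     - - (#PCDATA)>
<!ELEMENT body       - - (#PCDATA)>
```

Writing such declarations before data capture begins is, in effect,
writing the transcription guidelines the American Memory group found it
had skipped.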

   Many of the perils of electronic publication, Kline noted, are bor-
ingly identical to those of print publication.  You must plan ahead; you
must think about your audience; you must make working indices and
retrieval tools for yourself and your readers.  Pilot projects and dem-
onstration projects exist, Kline pointed out, to make mistakes and pro-
vide the dire stories which will enable those who follow in their foot-
steps to avoid those mistakes.  And despite the fact that the American
Memory (excuse me, the National Digital Library) project staff have man-
aged to correct those early mistakes, the project did succeed in accumu-
lating a number of cautionary tales which should be taken to heart by
those who would venture on the production of electronic resources.

   Susan Hockey then outlined a number of problems arising in providing
adequate access to electronic resources, with special attention to the
need for more intelligent and powerful software to work with electronic
texts.  Such software should ideally have, at the least, the same kind
of sophistication and flexibility as existing programs like Tact, OCP,
and Tustep in dealing with problems of variant character sets and alpha-
bets, the definition of a "word", and the specification of reference
systems.  It is astonishing how much standard commercial software is
incapable of even the most basic modification in such areas.  Equally
important, though, is the need to push beyond the long-established tech-
nologies of the string search for text retrieval, and to link our elec-
tronic texts with lexical resources which can give our software a deeper
understanding of the texts we are working with, and enable the user to
deal more conveniently with variant spellings, homography, and the like.
She praised in particular the work being done at the Institute for Com-
putational Linguistics in Pisa toward the development of lexical
databases.

   The respondent in this session was Allen Renear, now Director of the
Scholarly Technologies Group at Brown University.  He regretted, he
said, that as respondent he had found very little to dissent from in the
papers -- not only because the role of the respondent seemed to demand
dissent, but because the TEI and SGML are ABOUT disagreement, or about
agreeing on some things in order to disagree more effectively about oth-
ers.  As Wittgenstein might have said, but (alas) did not say, in an
early notebook:

         We can disagree about many things; but can we disagree
      about EVERYTHING? [Wir koennen ueber vielerlei ver-
      schiedener Meinung sein; koennen wir es aber ueber
      ALLES sein?]

         [Change in hand and ink color here.]

         Or would that be like positing the existence of an air-
      line so small, it has no nonstop flights at all?  [Oder
      hiesse das, uns eine Fluggesellschaft vorzustellen, die
      so klein sei, dass sie gar keine Non-stops habe?]

Or (as he might have said, but did not say, in the TRACTATUS):  "If dis-
agreement is possible, then agreement is necessary.  [Wenn die Nicht-
zustimmung MOEGLICH sein soll, dann muss die Zustimmung NOTWENDIG sein.]"
The practical reasons for using SGML are overwhelming; so are the
theoretical reasons, though it is worth pointing out that in SGML, no
textual features are exhibited; they are only identified.
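
   Renear's distinction between exhibiting and identifying features can
be illustrated with a tiny TEI-style fragment (the tag names follow the
TEI Guidelines; the sentence itself is my own invented example).  The
markup records that a phrase is emphasized and that two witnesses
disagree; it says nothing about italics, fonts, or layout, which each
application decides for itself:

```sgml
<!-- Identification, not exhibition: the markup names WHAT the -->
<!-- feature is; display is left entirely to the software.     -->
<p>She spoke <hi rend="emph">only</hi> of
<app><lem wit="MS-A">textual</lem>
     <rdg wit="MS-B">textuall</rdg></app> matters.</p>
```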

   Eventually, this line of discussion elicited a plaintive query from
Peter Batke of Princeton, who said he found it somewhat disheartening
that so many people were spending time and intellectual effort on prob-
lems like the theory of text encoding, which were, he said, surely nec-
essary as part of our general electronic hygiene, but surely not as wor-
thy of our intellectual efforts as more general problems of criticism
and the like?  It reminded him, he said, of the Monty Python sketch
about the chartered accountant who wants to be a lion tamer; when we
talk so emphatically about the interpretive implications of text markup,
are we not simply trying to lend to it an intellectual challenge and
excitement which, as a fundamentally positivist activity, it simply
doesn't have?  Is anyone actually doing real criticism with computers
these days, or has everyone decided to worry about infrastructure?
Allen Renear observed that STS, of all conferences, was the last place
he had expected to hear problems of editing and text representation
described as intellectually routine, and allied himself unrepentantly
with the chartered accountants and textual editors.  For my part, I
merely observed that before I started working in the TEI I had had a
number of ideas for projects in the application of computers to literary
and linguistic study, but that all of them had required some notation
for representing fairly complex textual data in electronic form.  I
incautiously inquired of informed parties whether such a notation exist-
ed, and before I knew it found myself shepherding the TEI Guidelines
toward publication.  Of course, text markup is not an end in itself.
But none of the interesting work I can think of wanting to do with com-
puters can be done without an intellectually sound markup scheme.  Hans
Walter Gabler observed at this point that Batke's image of the lion tam-
er was curiously focused on the tamer, and not on the lion who resists
taming.  If the literary critic is a lion tamer, the lion is, presum-
ably, some text or author like Homer. But interpretations of Homer have
fallen in European culture like leaves in Vallombrosa, as numerous and
as forgotten. Homer, meanwhile, remains, and he remains in no small part
owing to the efforts of "chartered accountants" like Aristarchus of
Alexandria.  The audience (which was, after all, attending an STS con-
ference) clearly felt, as I did, that Gabler had had the best of that
exchange.

   After the afternoon break, I attended a session on Chaucer, with two
non-electronic papers (on a forthcoming facsimile edition of the Elles-
mere Chaucer manuscript, and on the publication history of William
Wordsworth's translations/modernizations of Chaucer, the latter involv-
ing an intricate but fascinating detour into nineteenth century literary
plagiarism and skulduggery), and one electronic paper, in which the
indefatigable Peter Robinson gave a galloping overview of four electron-
ic editions he is now involved with:  the World Shakespeare Bibliogra-
phy, a critical edition of the first and fourth editions of Johnson's
DICTIONARY, an edition of the collected works of Voltaire, and his par-
ticular favorite among these brainchildren, the massive 30-CD electronic
edition of the CANTERBURY TALES.  It was a stirring demonstration of
adapting electronic technology to the presentation of complex textual
material, to which I cannot hope to do justice here even if I were by
some miracle able to convey Peter's Australian accent in writing.  I did
manage, however, to jot down the seven requirements Robinson poses for a
really good electronic edition:

*   it should be at least as attractive to read as print (this is the
    one everyone is still failing)
*   the reader should be able to find what is wanted at least as quickly
    and easily as with print
*   all cross references should be live links
*   full search facilities should be provided (Peter did not, unfortu-
    nately, go into detail about what constitutes a "full" search
    facility)
*   the reader should be able to retain a sense of the dimensions of the
    whole, and have some idea of the beginning, middle, and end of the
    work; it should not be a boundary-less cloud of materials
*   the edition must be capable of representing everything a scholar
    might want to include (this is more a requirement for the underlying
    technology than for the edition, of course)
*   for now at least, the edition should be consultable on Macintosh,
    Windows, and X-windows; more generally, it should run on the majori-
    ty of widely used hardware/software platforms

Such editions would not be possible at all without SGML and the TEI;
SGML by itself, however, is rather like bread dough:  in need of a good
hot oven (good hot presentation software?) to give it some texture.

   The next morning, other business prevented me from attending the ses-
sion on Shakespeare criticism, but after lunch I did make it to a ses-
sion entitled unprepossessingly "The Computer and Editing", which was
mostly about hypertext and scholarly editions.  In the first paper,
Theodore James Sherman presented the computer, and in particular hyper-
text, as the answer to the problem of presenting authorial variants.  He
gave a clear exposition of the basic concepts of hypertext and their
modern history (beginning from Vannevar Bush), and so provided useful
background for the two other papers.

   Hoyt Duggan talked about his ambitious electronic archive of PIERS
PLOWMAN, which will include (eventually) transcripts and scanned images
of all extant medieval witnesses to that important Middle English poem,
as well as critically reconstructed archetypes of the three main ver-
sions of the text.  Though he apologized for their graininess, his
slides made outstandingly clear how useful it is to have color images,
even of pages written in monochromatic ink, especially if the page is at
all difficult to read.  Like the Chaucer project described the previous
day by Peter Robinson, the Piers Plowman archive, if it can ever be fin-
ished, will represent a tremendous monument of scholarship, as well as a
tremendous tool for further scholarship.  The first installment, with
transcriptions and images of one manuscript, is to appear this year (I
think) as the first volume of the new Society for Early English and
Norse Electronic Texts.

   Duggan argued, as he has done elsewhere, that the advent of new tech-
nology makes, in the final analysis, very little difference for the
array of skills which must be mastered by the editor of Middle English
texts.  The would-be editor should master Old English and Middle Eng-
lish, study related and contemporary languages, and acquire working
knowledge of paleography, codicology, and other ancillary fields.  The
technology necessary, he says, can be mastered in a few weeks.  The main
effect of technology is to make nugatory the old debate between propo-
nents of interventionist and of conservative editions, which was found-
ed, in its acerbity, on the patent impossibility of combining interven-
tionist and conservative texts effectively in the same edition.  It is
no longer impossible to do both; it is no longer necessary to choose one
over the other.

   The third speaker, Jean-Louis Lebrave of the CNRS Institut des Textes
et Manuscrits Modernes, presented a mockup of a hypertextual edition
designed to exhibit the genesis of a text for which authorial manu-
scripts are preserved, using a late short story by Flaubert as an exam-
ple.  Like the editions described by Robinson, Sherman, and Duggan, it
includes scanned images of manuscript material, detailed transcriptions
of that material with indications of cancellations and insertions, and
ancillary matter (here:  Flaubert's notebooks, his correspondence, the
reference works he consulted while writing the story, editorial annota-
tions, and so on).  The most unusual and unexpected feature of the edi-
tion was the opportunity it offers to view a hypothetical reconstruction
of the process of composition, as attested by the cancellations and
insertions in the manuscript.  The transcript, or the image of the manu-
script, is shown constituting itself word by word on the screen, with
pauses for cancellations of words already written or the insertion of
text above the line or in the margin.  Such an animation expresses in
particularly clear and vivid form a view of the compositional process
which would take pages of description, or an exceptionally dense appara-
tus criticus, to convey in print. It has obvious uses both in teaching
and in research.

   After Lebrave's talk, a vigorous discussion ensued, in which some
(e.g.  Peter Robinson) were struck by the great differences in user
interface among the editions shown, while others (e.g. myself) were
struck by the deep similarities among the editions.  The user interfaces
are indeed very different, and those differences will greatly affect the
impression readers gain of each edition.  But I do not think there is
more difference between Peter Robinson's Chaucer and Lebrave's Flaubert
than there is likely to be between the first CD-ROM of the Chaucer edi-
tion and the concluding disc in the projected series of thirty, when it
appears some years hence.

   After the break, I went to listen to the talks in a session on Yeats,
Pound, and Joyce, but when I arrived I found the crowd spilling out of
the lecture room and occupying the easy chairs in the lounge outside to
listen through an open door.  I don't know if all STS conferences have
sessions which fill their rooms to overflowing, but if the sessions I
heard at this one were any indication, they all deserve to.  The chairs
outside were, I suspect, more comfortable than those inside, but the
acoustics were not quite so good, and so in the end I gave up and went
up to the cafeteria for coffee before taking the New Jersey Transit bus
out to the Newark Airport. On the way, I wrote portions of a much better
trip report than this, but my laptop froze at a particularly vicious
bump on the freeway, and I lost the work, so the reader will have to
make do with this one.

   Another allegory?

                                                C. M. Sperberg-McQueen
                                                         17 April 1995