The Electronic Piers Plowman Archive and SEENET

Four years ago Thorlac Turville-Petre and I found ourselves talking about the simultaneous usefulness and unreliability of the electronic editions of the Middle English texts available to us. My own attempt to determine the metrical rules underlying the composition of Middle English alliterative verse and our common use of electronic materials in editing The Wars of Alexander for the Early English Text Society had convinced both of us that electronic technology offered extraordinary opportunities to students of medieval literature, but we both had found that, in most respects, those electronic texts available to us were less reliable than most printed texts. Eventually, our conversations led to our forming the Society for Early English and Norse Electronic Texts (SEENET). We cannot claim any originality in the idea, since Furnivall had anticipated us by well over a century in forming the Early English Text Society (EETS). Indeed, our first thought was to approach the EETS governing board with the idea of cooperating in publishing electronic versions of their texts. Initially, we thought to call the new society the Early English Electronic Text Society, but the EETS thought that rather too close to its own title to be quite suitable, and in 1991, its board saw little point in electronic editions. We approached a number of university presses, and in 1992 only the University of Michigan and Johns Hopkins University presses saw any significant future for electronic books, at least in medieval texts. Since then, the situation has changed dramatically, and if electronic books have become not quite as "digne as diche-water," they are no longer a radical novelty, certainly not among medievalists.

At the time we began to contemplate SEENET, a number of textual projects were already in existence, perhaps chief among them the Oxford Text Archive (OTA), a major collection of electronic versions of books already in print. I have contributed three or four texts to the OTA, and with each text, I included a caveat lector to the effect that it had been inadequately proofread and that it represented only a digital transcript of parts of an extant printed edition. In my experience, the major difference between my contributions and those of others in that series was the warning to users about the probable defects of my texts. To say that much is to make no criticism of the immensely useful Oxford Text Archive. It was acknowledged that texts in that collection were usually prepared in a "quick-and-dirty" fashion to make them swiftly available and that frequently the base printed edition was chosen not because it represented an authoritative text but because it was out of copyright. Such texts will not, for the most part, gain the CEA stamp of approval, but they have enabled kinds of literary scholarship formerly unavailable to us. That said, such a series, perhaps most like the printed editions of Everyman's Library, is not steadily reflective of the highest standards of editorial thinking or performance. We can and ought to do better in the new medium.

We formed SEENET to make it easier for editors to do well, better, and best in producing electronic texts, intending through it to procure, produce, and disseminate scholarly electronic editions of Old Norse, Old English and Middle English texts. We want our texts to combine exploitation of the full capacities of computer technology with preservation of the highest standards of traditional scholarly editing. We want not only to publish reliable machine-readable texts but also to accompany them with highly competent introductory materials, glossaries, annotations, and apparatus. We want our texts to bear all of the virtues of traditional print editions and at the same time to begin to create the new kinds of text enabled by computer technology.

It has become commonplace in predictions of the effects of electronic technologies on the study of literature to compare modern editors with Gutenberg and Caxton, to speak of fundamental and revolutionary developments to come in our conception of text. I have written such prose myself, and I don't dismiss such claims; however, I want here to summarize a recent piece from The European English Messenger that suggests another possible view of the matter. The editor, hoping to stimulate submissions on electronic texts, detailed his experience with a real electronic text, an edition of Virginia Woolf's novel The Waves. He found a number of problems, all soluble, but all fairly tedious to deal with. Some reflected conventions of early computer editing; for instance, the entire text was in upper case letters. Still worse, the text itself was not reliable and required substantial revision. Worst of all, once he had corrected the text, he found it useless: "What," he somewhat plaintively asked, "are we to search for?" Moreover, he found that his condition was general amongst literary critics:

A first set of letters to literary critics and media experts at other universities, asking what on earth one is to do with all this electronically stored textery, garnered some expressions of sympathy, but not more. Some of them suggested it was a good question, they might get in touch with me again. The rest is silence.... May I appeal to readers who have discovered what to do with or to a text on the computer to advise us what pleasures those of us miss who are puzzled by all these oysters that refuse to yield the pearls they had promised?

Surely the emperour of CD does have some clothes on?1

A second story concerns, at least for medieval texts, the radical importance of editors having mastered the traditional philological disciplines as well as the new technology. Three or four years ago at Kalamazoo, after I had first described our plans for editing Piers Plowman, a young man expressed considerable interest in the Piers Plowman Electronic Archive and asked "What must I do to edit a Middle English electronic text?" I responded that he should first master Old and Middle English, reading broadly in both major and minor texts in a variety of dialects; that he should study languages, at a minimum Latin, Old French, and Old Norse; that he should devote time to formal study of historical linguistics, especially the history of English; that he should study paleography and codicology; that he should learn as much as he could about scribal practice and the usus scribendi of the author(s) he wished to edit. "But what about the computer programs?" he asked. "Oh, that," I responded, assuring him that he could learn what he needed to know about the computer programs in a month or so. For as crucial as the machine is to computer editing, the indispensable disciplines to master first are the traditional ones, that body of knowledge built up laboriously over the past two or three centuries--in short, the Old Philology.2

I am, like most folk assembled here, convinced that the new technology will revolutionize the old disciplines, even the hidebound Old Philology. Imagine where we would be today had the Reverend Walter W. Skeat possessed a fast micro-computer! However, as confident as I am that we live at an exciting time of textual discovery, I am about equally certain that the new "electronically stored textery" is not likely to result in validating last (or even this) year's flavor in literary theory. The fundamental principles and processes of textual editing have not changed in essence as the result of the newly introduced technology, at least not yet. I am tempted, against better judgment, to predict they will not change.

However, it is easy, in any case, to see that at least one centuries-long debate will, in an age of electronic editing, no longer divide theorists. The technology already available makes irrelevant the question of whether the editor's task should consist of constructing conservative documentary editions or interventionist critical texts. We can now see that the old argument was economically and technologically constructed, based on limitations that no longer need trouble us. With electronic texts, we may, indeed, we ought to, have both kinds of text. Editors of the next decades should steadily present "best text" diplomatic transcriptions of some, most, or all the important manuscript witnesses to a text. But the editorial project need not stop there, for editors may, and I think should, present in the same edition archetypes and hyparchetypes and as many theoretical constructions of a critical text as their good sense requires and their ambition and energy permit.

There ought to be many mansions in the house of electronic texts, many viable models for text editing. The most basic kind of edition is the bare textual archive. The Tudor Poetry Textbase at the University of Otago in New Zealand offers one such model. When it is completed, it will incorporate some quarter million lines of Tudor poetry, all in TEI-conformant SGML markup, but with a minimal amount of apparatus and annotation. It will serve primarily as a textbase, an accurate rendering in electronic form of a complete historical corpus of poetry in machine-manipulable form. In this instance, adequacy of markup and the accuracy of its representation of the original texts will constitute its fundamental virtues. In this respect, such a project represents the ethos of the Oxford Text Archive and the Center for Electronic Texts in the Humanities (CETH), but perfected with scholarly care in editing the texts afresh from the primary data. Such text bases, reliably edited, but with a minimum of annotation and apparatus, are likely to have permanent value.

That value comes not least from the fact that a reliable electronic text provides a foundation upon which textual scholarship of the future may build. When Thorlac Turville-Petre and I began work on The Wars of Alexander, W. W. Skeat's 1886 EETS edition was immensely useful in helping us define the editorial problems, but it was still necessary to re-transcribe both manuscripts. We might well have simply proofread Skeat's texts against both manuscripts, a practice honored over time, but whatever the decision made at that point, there would always come a time when the entire document had to be recopied. In the case of the Wars, that happened twice, first when the working transcription had to be re-typed for my dissertation and again at the end of the process when the entire text was entered into the publisher's computers to be typeset. In both cases, fresh errors were introduced, and weeks of proofreading ensued. In the case of our forthcoming edition of Corpus Christi College, Oxford, MS 201 (the notorious manuscript F of the B version of Piers Plowman) each proofreading has served to correct errors in a base text, each set of keyboardings bringing the text closer to perfection. What is got right once can be kept right and used as a base either for new thinking about the text or for markup for new and different kinds of critical enquiry.3 Probably nothing will shorten the tedious process of proofreading to assure maintenance of the quality of the base texts, but complete re-transcriptions need never again be made.4

We have heard Michael Pidd's characterization of the Electronic Canterbury Tales Project and seen portions of Kevin Kiernan's image-oriented Electronic Beowulf Project. The Piers Plowman Electronic Archive, which I am constructing with a team consisting of Robert Adams, Eric Eliason, Ralph Hanna, Thorlac Turville-Petre, and Mícéal Vaughan, offers yet another model for the electronic edition. In the first stage of constructing the Archive, we are creating documentary editions with full SGML markup of all the fifty plus manuscripts and early printed editions. Each documentary edition will include paleographic and codicological descriptions of each manuscript. We are recording erasures, subpunctions, and other forms of deletion; marking suspensions and abbreviations, marginal and interlinear additions and corrections to the texts, changes in scribal hand or ink or script, along with any other features of the material text we recognize as likely to be useful to students of the poem. Since at this level of the Archive our goal is to provide the lections of each of the manuscripts in machine-readable form, we shall in effect be acting as conservative editors, presenting 54 "best texts" of Piers Plowman. At this stage, we will make no emendations, regularizations, or modernizations of these base manuscript files.5

When libraries will permit us, we will include in the Archive digitized color facsimiles of each manuscript, providing hypertextual linkages that will permit a user to place our transcriptions in windows beside the facsimiles. We expect such transcriptions and digitized facsimiles will be of considerable value to historical linguists, editors, and paleographers, as well as to students of scribal habits and methods. Though the combined facsimiles and transcriptions can never entirely replace direct study of the manuscripts, they permit textual study at sites remote from the originals with a better basis than any form of photography formerly available. Indeed, for many purposes, a color digital facsimile is superior to the original itself because it is manipulable for magnification as well as for color or gray-scale analysis. For those who lack immediate access to the original manuscripts, color digital facsimiles will answer to most literary and philological purposes, providing immediate and inexpensive access to their readings.

Electronic collation of a very complex documentary tradition is now possible--thanks to Peter Robinson's splendidly useful COLLATE program--and editors wishing to produce critical texts can machine collate the transcriptions to produce the corpus of variants from which they will construct the archetypes. In the case of Piers Plowman, consistency in this large task will be facilitated by access to computer-generated concordances of each of the manuscripts as well as to edited texts of all three versions of the poem.
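By way of illustration only, the kind of machine collation described above can be sketched in a few lines of Python. This is not COLLATE and makes no claim about how that program works: it assumes the transcriptions have already been aligned line for line, treats a word as anything separated by spaces, and leans on the standard library's difflib for alignment. The function name collate, the sigla X, Y, and Z, and the variant readings are all invented for the example and are not real manuscript data.

```python
from difflib import SequenceMatcher


def collate(base, witnesses):
    """List the places where witnesses depart from a base line.

    Returns (base_reading, {siglum: variant_reading}) pairs in base order.
    An omission appears as an empty variant reading; an addition appears
    with an empty base reading.
    """
    base_tokens = base.split()
    variants = {}  # (start, end) word span in the base -> {siglum: reading}
    for siglum, text in witnesses.items():
        tokens = text.split()
        matcher = SequenceMatcher(a=base_tokens, b=tokens, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                continue  # the witness agrees with the base here
            reading = " ".join(tokens[j1:j2])
            variants.setdefault((i1, i2), {})[siglum] = reading
    return [
        (" ".join(base_tokens[i1:i2]), readings)
        for (i1, i2), readings in sorted(variants.items())
    ]


if __name__ == "__main__":
    # Hypothetical witnesses with invented spellings, for illustration only.
    base_line = "In a somer seson whan softe was the sonne"
    sample = {
        "X": "In a somer seson whan softe was the sonne",
        "Y": "In a somer sesoun whan softe was the sonne",
        "Z": "In somer seson whan soft was the sonne",
    }
    for lemma, readings in collate(base_line, sample):
        print(repr(lemma), readings)
```

A real collation must also cope with transposed lines, abbreviations, and spelling variation that an editor may or may not wish to count as substantive; the sketch ignores all of these.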

A recent and exciting development in computer applications in textual editing has been Peter Robinson's work with Robert J. O'Hara, using PAUP cladistic analysis to establish genetic relations among manuscripts.6 Cladistic analysis has been developed over the past thirty years by evolutionary biologists to reconstruct the descent of related species. Both evolutionary biologists and textual critics seek to explain the existence of a varied population as the product of branching descents over time from a common ancestor. For textual critics, the population to be accounted for consists of manuscripts, and the descent is by scribal copying. For evolutionary biologists, the population is life itself, and the descent is by reproduction. The theoretical identity of the central concerns of the two otherwise unrelated disciplines continues to be explored. Most recently, Robinson and O'Hara have used PAUP successfully to analyze a large group of Old Norse manuscripts. Sufficient external evidence exists to establish the genetic relations among those manuscripts, and that unusual situation permitted Robinson and O'Hara to test the program, which indeed reproduced in all essential respects the stemma attested by the external evidence.7 At the medieval conference in Kalamazoo in May, 1992, Robinson ran a small sample of 100 lines from six manuscripts of Piers Plowman and achieved extremely interesting results, suggesting that it may be possible with this program to overcome problems created by contamination or coincidental variation. We are not convinced that the program will work with our texts, largely because of massive contamination and coincidental variation, but we plan to give it a full test on the Piers Plowman manuscripts.
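The parsimony criterion behind such programs can be illustrated with a short Python sketch. It is not PAUP and does not search for trees; it only scores candidate trees that are supplied to it, using Fitch's well-known counting method on strictly binary trees. The witnesses A to D, their readings x and y, and the function names are hypothetical. The sketch also assumes, as biological trees do, that every surviving witness is a leaf, which a real stemma need not respect.

```python
def fitch_score(tree, readings):
    """Minimum number of changes of reading that `tree` requires at one site.

    `tree` is a nested pair of sigla, e.g. (("A", "B"), ("C", "D"));
    `readings` maps each siglum to its reading at this point of variation.
    """
    changes = 0

    def possible_readings(node):
        nonlocal changes
        if isinstance(node, str):  # a leaf, i.e. a surviving witness
            return {readings[node]}
        left = possible_readings(node[0])
        right = possible_readings(node[1])
        shared = left & right
        if shared:
            return shared  # both branches can agree without any change
        changes += 1  # the branches disagree, so one change is needed
        return left | right

    possible_readings(tree)
    return changes


def tree_length(tree, sites):
    """Total changes a tree requires across all points of variation."""
    return sum(fitch_score(tree, site) for site in sites)


if __name__ == "__main__":
    # Invented data: two sites group A with B, one groups A with C.
    sites = [
        {"A": "x", "B": "x", "C": "y", "D": "y"},
        {"A": "x", "B": "x", "C": "y", "D": "y"},
        {"A": "x", "B": "y", "C": "x", "D": "y"},
    ]
    for candidate in [(("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))]:
        print(candidate, tree_length(candidate, sites))
```

The tree requiring the fewest changes is preferred, and the agreement that conflicts with it is then explained as coincidental variation or contamination, which is precisely where the Piers Plowman tradition is likely to test the method hardest.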

When we have reconstructed the A, B, and C archetypes, we will in each case construct a critical text, for though some recent theorists have concluded that the editorial project is completed with the production of accurate documentary and facsimile editions of the witnesses, we consider it only to have been well begun. No single witness to Piers Plowman accurately reflects Langland's poems. Some manuscripts are in general better witnesses to the authorial text than others, but no single witness carries more a priori authority than any other. Therefore, if we are to read a text more nearly authorial than is represented in any manuscript witness, editors must engage in the perhaps quixotic but essential project of creating an authorial version from this corpus of variant readings.

Some of you must already have wondered whether this archival game is worth the candle. We might ask ourselves how many scholars, including those most insistent upon privileging scribal versions of the poem, will read fifty-four editions of very similar poems? Why should so many marginally different "poems" find readers? As it stands, few specialists in late Middle English will have read all three published versions with the same attention they have given B. This surplus of textual material, even with meticulous hypertextual linking, is at once too much and too little.

Too much, certainly, if one thinks of the text only as an aesthetic object.8 And much too little, if one thinks of the text primarily as an aesthetic object. I have recently argued that in spite of the recent and unlamented "death of the author," many naifs still want to read

the poems Langland wrote, as he wrote them and without scribal lapses or accretions. Medievalists, of course, have never had occasion to celebrate the kind of author whose demise has been announced.9 We know, for example, virtually nothing about the historical poet who composed the three versions of Piers Plowman. The autobiography of Long Will, the dreamer-protagonist, is constructed as much by thematic considerations as by knowable events in the poet's life and may well be completely fictional. Whether "Langland" means anything to us other than a name to attach to the three poems, we recognize in those poems the work of a poet of extraordinary power. We want to read his words. Without wishing for one moment to deny the intrinsic interest of scribal practice to bibliographers, students of reception, linguists, metrists, or cultural historians, we recognize that scribal accretions are of secondary interest to most readers. In any case, the import and direction of scribal changes are recognizable only in relation to a concept of the authorial text. That is, we cannot study the reception history of texts like Piers Plowman as it is reflected in the manuscript witnesses until editorial work distinguishes what each scribe or editor did from what was inherited from an exemplar. Without doing collations, without establishing archetypal or critical texts, without an attempt to construct a stemma, textual variation is difference without significance. Though the fifty-four documentary editions constitute necessary groundwork for critical editing, each witness with its own set of differences will not be fully meaningful until its relationship to the other texts in the tradition can be established.10 That requires critical editing.

You will have realized as I have delivered this paper that, in spite of our very passionate commitment to the exploitation of electronic technology, the essential goals of the Piers Plowman Electronic Archive would not differ greatly from those that motivated George Kane and E. Talbot Donaldson a generation ago nor Skeat at the close of the last century. What I hope is equally clear is that I regard the recent theoretical claims of the irrelevance of authors and of authorial intention to be rank folly. Though we are unlikely ever to establish unequivocally the text that represents Langland's final intentions--it seems moderately clear that this inveterate tinkerer with his texts had no final intention--it is equally clear that it is Langland's creative work that validates the entire editorial project, that in spite of the historical and codicological interest we may take in scribal versions of the poem, it is our attempt to recover Langland's versions of Piers Plowman that makes our work worthwhile.

However, in an electronic archive, the reconstituted authorial text is privileged only at the level where such privileging is appropriate. At other levels of the archive, we will represent the scribal versions of the text--their ordinatio of the page, their passus divisions and rubrications, their verse paragraph markers, their hierarchy of scripts, their use of color and line in emphasizing and presenting text, their choice of dialect terms, spellings, and so forth. The electronic text at this level ought to be as faithful to scribal intentions as--on a different level of abstraction--it is to the author's. In the electronic archive, the user/reader will be able to move easily between levels, but the editors must first do the traditional editorial work and supply the linkages.11

In summary, I claim that most of the characteristic features of printed editions should appear in the electronic text, differing primarily by being hypertextually linked. Unlike printed editions, the electronic text will permit readers to manipulate, search, compare, or concord the individual manuscripts, the reconstructed archetypes, and the critical texts. Readers seeking linguistic information on a form can move easily from the glossary to the phonological discussion or between textual and historical notes and the text itself. When we have failed some readers, as inevitably we must, they may insert their own annotations in their copies of the text. We intend to set up a procedure whereby users of the Archive who wish to add their annotations to it may submit suggestions to the Editorial Board, who will from time to time update the Archive. Moreover, it is worth repeating that once the base work of transcription and markup has been done, editors of the future may construct from it better editions than we have. The completed Archive will, in short, provide an electronic base for kinds of literary, linguistic, and textual scholarship presently unimagined, if not quite unimaginable.

However, all of these wonderfully revolutionary things cannot be achieved if the literary portion of the academy continues its mindless rejection of empirical investigation and its distaste for the methods and controls of the harder sciences, if the Old Philology is displaced from our graduate programs just at the point at which the tools become available that will help us do it better than ever before. There is reason to be encouraged. Fashions, after all, by their very nature are changeable. The anti-philological theory of much recent critical discourse has begun to fret as well as strut as it leaves the stage. Editors in this new age of electronic technology need to be careful of embracing the fashionably revolutionary for fear of missing what is truly revolutionary in the technology--its capacity to help us collect, organize, analyze the empirical evidence to answer more precisely and comprehensively the perennial questions. Our attempts to answer those questions will inevitably generate new questions. But we should not expect that tomorrow's revolutionary discoveries will correspond too closely to today's fashionable pieties.

1 The European English Messenger 3 (1994): 86-87.

2 Stephen G. Nichols, ed. The New Philology, Speculum 65 (1990): 1-108; Keith Busby, ed., Towards a Synthesis? Essays on the New Philology (Amsterdam: Rodopi, 1993); and William D. Paden, ed. The Future of the Middle Ages: Medieval Literature in the 1990s (Gainesville: U Florida P, 1994).

3 This claim may represent no more than a pious hope, for as Willard McCarty has recently reminded us in "Handmade, Computer-Assisted, and Electronic Concordances of Chaucer," in Computer-Based Chaucer Studies, ed. Ian Lancashire, CCH Working Papers 3 (Toronto: U of Toronto Centre for Computing in the Humanities, 1993), pp. 49-66, transience may be the essence of electronic texts--or, as a wag in the first computer lab I worked in drolly put it, "To err is human, but if you really want to screw up, you need a computer." Electronic texts offer both the base for textual permanence and for a kind of mouvance that would shame the most libertine of medieval scribes.

4 In my experience as an editor, simple accuracy is perhaps the hardest thing to achieve. Electronic technology scarcely affects the labor of transcribing and proofreading. Transcription at a keyboard, like writing on animal skins with a quill, still takes place character by character. Optical character recognition software does not presently exist to shorten the task. Proofreading still requires serial re-readings to compensate for eye-skip, arrhythmia, dittography, homoeoteleuton, or for the manifold other failures of concentration that have marred scribal efforts since literacy began. Our original plan provided for five separate readings of each transcription against the photocopies and a final proofreading against the original manuscript, but our practical experience with F suggests that in the case of a difficult or complex manuscript more readings will be necessary.

5 SGML markup permits editors of documentary texts to record discrepancies between scribal intention and performance while remaining strictly literal in transcription. Lapses of the pen can be easily entered in dual form, one with the text as the scribe wrote it and another with his probable intention. For example, the F scribe occasionally writes "a tese," almost certainly reflecting his pronunciation of the phrase, for "at ese." With a tag reading <reg orig="a tese">at ese</reg>, we provide a toggle switch that permits display (or search) for either the manuscript form or the intended phrase. Neither version is privileged save by the interest of the user. Nor is either suppressed editorially. Both readings, what the scribe actually wrote and what the editor thinks he must have intended to write, are represented in the edited text.
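The toggle described in this note can be sketched in Python as follows. This is a minimal illustration rather than the Archive's software: it assumes tags of exactly the shape quoted above, un-nested and with double-quoted attribute values, and it uses a regular expression where a real system would use an SGML parser. The names render and found_in_either, and the sample sentence around the tag, are hypothetical.

```python
import re

# Matches tags of the exact form <reg orig="...">...</reg>, un-nested.
REG_TAG = re.compile(r'<reg orig="([^"]*)">(.*?)</reg>')


def render(marked_up, show="orig"):
    """Display either what the scribe wrote ("orig") or the editor's
    regularized form ("reg"), leaving the marked-up source untouched."""
    if show not in ("orig", "reg"):
        raise ValueError("show must be 'orig' or 'reg'")
    group = 1 if show == "orig" else 2
    return REG_TAG.sub(lambda match: match.group(group), marked_up)


def found_in_either(marked_up, phrase):
    """Search both views, so that neither reading is hidden from the user."""
    return phrase in render(marked_up, "orig") or phrase in render(marked_up, "reg")


if __name__ == "__main__":
    # The words outside the tag are invented context for the example.
    line = 'he was <reg orig="a tese">at ese</reg>'
    print(render(line, "orig"))
    print(render(line, "reg"))
    print(found_in_either(line, "at ese"))
```

Because both forms live in the one transcription, the choice of view is made at display time by the reader and not once and for all by the editor.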

6 D. L. Swofford, PAUP: Phylogenetic Analysis Using Parsimony, Version 3.0. Computer program distributed by the Illinois Natural History Survey, Champaign, IL, 1991.

7 P. M. W. Robinson and R. J. O'Hara. "Cladistic Analysis of an Old Norse Manuscript Tradition," Research in Humanities Computing 4, ed. Nancy Ide and Susan Hockey (Oxford: Oxford U P, forthcoming). For cladistic theory, see also Elliott Sober, Reconstructing the Past (Cambridge, MA: MIT P, 1988) and N. I. Platnick and H. D. Cameron, "Cladistic Methods in Textual, Linguistic and Phylogenetic Analysis," Systematic Zoology 26 (1977): 380-385. See as well Peter Robinson's "An Approach to the Manuscripts of The Wife of Bath's Prologue," in Computer-Based Chaucer Studies, ed. Ian Lancashire, CCH Working Papers 3 (Toronto: U of Toronto Centre for Computing in the Humanities, 1993), pp. 17-48.

8 As I have argued in "Some Un-Revolutionary Aspects of Computer Editing," in Richard J. Finneran's The Literary Text in the Digital Age (Ann Arbor: U Michigan P, forthcoming):

Such a view of text is, of course, more than a little parochial, since literary texts serve a variety of other functions in modern attempts to recreate and understand our past. Less parochially, fifty-four electronic transcriptions and facsimiles--perhaps none of them ever serving as a traditional reading text--will offer scholars not only new ways to study the text and the textual tradition of the poem but also possibilities for gaining fresh insights into other aspects of late medieval literary culture. Students of text reception may readily access formerly inaccessible marginal and interlinear annotations or significant scribal changes to the text itself. The Archive will enable study of the changes both in language and literary focus wrought by the revising sixteenth-century scribe who created Toshiyuki Takamiya, MS 23, or by Robert Crowley's Protestant rebaptism of the poem in his three 1551 printed editions of the B text, each converting Langland's Middle English into something more appropriate for its Tudor audience. Since Piers Plowman was copied in virtually every late Middle English dialect, historical linguists will be able to study patterns of regional variation in lexicon, phonology, and orthography. The facsimiles will be useful to students who once lacked access to large collections of primary manuscript materials. Moreover, students of form and style and meter can add their own markup for other, as yet unimagined, kinds of study. It matters little that no one is ever likely to want to read all fifty-four documents. Many will want to use them.

9 For thoughtful comment on this formalist gambit, see Seán Burke, The Death and Return of the Author: Criticism and Subjectivity in Barthes, Foucault and Derrida (Edinburgh: Edinburgh University Press, 1992).

10 Robert Adams decries "the recent penchant for ... treating scribal errors as instances of medieval literary criticism" ("Editing Piers Plowman B," p. 33). Cf. Barry Windeatt, "The Scribes as Chaucer's Early Critics," SAC 1 (1979): 119-141, and Derek Pearsall, "Editing Medieval Texts: Some Developments and Some Problems," in Textual Criticism and Literary Interpretation, ed. Jerome J. McGann (Chicago: University of Chicago Press, 1985), p. 103, with George Kane, "The Text," in A Companion to Piers Plowman, ed. John A. Alford (Berkeley and Los Angeles: U California P, 1988), p. 194.

11 Bernard Cerquiglini, Éloge de la variante: Histoire critique de la philologie (Paris: Seuil, 1989), imagines a new kind of electronic text not unlike that proposed here, though he appears to be content with the archival element of compiling vast collections of linked data but without considering it necessary to move beyond accumulation of data. See Mary B. Speer, "Editing Old French Texts in the Eighties: Theory and Practice," RPh 45 (1991): 22, for sensible criticism of Cerquiglini's position.

Hoyt N. Duggan