[Archive copy mirrored from: http://sunsite.berkeley.edu/MLA/guidelines.html, August 12, 1997]
MODERN LANGUAGE ASSOCIATION OF AMERICA
COMMITTEE ON SCHOLARLY EDITIONS
Guidelines for Electronic Scholarly Editions
These guidelines are intended to help scholarly editors, publishers,
CSE consultants, and CSE reviewers in carrying out their respective
functions; they reflect the principles articulated in the MLA
brochure Aims and Services of the Committee on Scholarly Editions,
parts of which are quoted below. [NOTE: Copies of the CSE brochure
may be ordered from the Committee on Scholarly Editions, Modern
Language Association of America, 10 Astor Pl., New York, NY 10003-6981.]
The guidelines for electronic scholarly editions are closely based
on the guidelines for printed editions. Their goal is to enhance
the usability and reliability of scholarly editions by making
full use of the capabilities of the computer. At this stage, the
guidelines are phrased in terms of desiderata rather than requirements,
since hardware and software capabilities are changing so rapidly;
and some desirable features are not yet technically or economically
feasible. The CSE does not prescribe a particular method of editing;
the committee's position is that different approaches are appropriate
in different situations. The CSE emphasizes that editors who are
thoroughly acquainted with editorial options applicable to their
materials and with the relevant documentary texts and who are
sensitive to the circumstances attending the composition and production
of all forms of the text are in a position to choose editorial
procedures appropriate to their materials, carry out those procedures
accurately and consistently, and explain exactly what they have
done and why.
Standards for the Approved Edition and Approved Text Emblems (based
on the 1991 CSE Statement of Aims and Services)
The editorial standards that form the criteria for the award of
the CSE Approved Edition emblem can be stated here in only the
most general terms, as the range of editorial work that comes
within the committee's purview makes it impossible to set forth
a detailed, step-by-step editorial procedure. Whatever
specific editorial theory and procedures may be used, the editor's
basic task is to establish a reliable text. In an electronic edition,
the provision of basic transcriptions and tools that allow alternative
views of the text and that permit others to build upon existing
editorial work is almost as important. Many, indeed most, scholarly
editions include a general introduction--either historical
or interpretive--as well as explanatory annotations
to various words, passages, events, and historical figures. Although
neither is essential to the editor's primary responsibility of
establishing a text, both can add to the value, that is, the usefulness,
of the edition. Whatever additional materials are included, however,
the CSE considers the following essential for a scholarly edition:
- A textual essay, which sets forth the history of the text
and its physical forms, describes or reports the authoritative
or significant texts, explains how the text of the edition has
been constructed or represented, gives the rationale for all decisions
affecting its construction or representation, and discusses the
verbal composition of the text as well as its punctuation, capitalization,
and spelling. When it becomes technically feasible to do so,
textual examples used in the essay might take the form of hot
linkages to the edition itself rather than copies of the relevant
passages. While it might not be possible to carry this practice
through consistently for all texts cited (sources, analogues,
translations, secondary bibliography, etc.), in principle it is
highly desirable in order to avoid insofar as possible misquotations
of the texts cited.
- An appropriate apparatus and/or notes, or the functional equivalent
thereof, which (1) record alterations and emendations of the basic
text(s) (e.g., a full-text transcription of the basic text(s)
keyed to the edited text will make plain the alterations and emendations
in the latter), (2) discuss problematical readings (if not treated
in the textual essay), (3) report variant substantive readings
from all versions of the text that might carry authority (thus
full-text transcriptions of all versions of the text that might
carry authority obviate the need to report variant readings),
and (4) indicate how the new edition treats ambiguously divided
compounds (if any) in the basic text and/or distinguishes hard
and soft hyphens in order to show which line-end hyphens
in the new edition should be retained in quoting from the text.
These four kinds of information need not be presented in any specific
arrangement, and not all obtain in every situation, but the CSE
requires that, when applicable, they should either appear in each
file bearing the Approved Edition emblem or be otherwise available
at the time of publication.
If the apparatus is replaced by full-text transcriptions,
mechanisms are needed to display passages selected in parallel
and to create collated lists of textual variants in various categories
(e.g., substantive, accidental).
- A proofreading plan that provides for meticulous proofreading
at every stage of production so that the accuracy of the text,
textual essay, and textual apparatus is not compromised. Automated
proofreading programs (spell checkers), word lists, and computerized
collation or file-comparison programs can be used to alleviate
the burden, but they cannot substitute for manual proofreading
nor should they ever be allowed to make unverified changes in
The guidelines below suggest some considerations that the CSE
regards as fundamental to the preparation and publication of useful,
reliable scholarly editions. They cover the kinds of inquiries
that an editor, reviewer, publisher, or informed critic needs
to make in order to form a judgment about the accuracy and completeness
of a scholarly edition, and they can therefore serve as a working
checklist of matters that may demand attention in producing scholarly
Just as no list of general guidelines can anticipate all of the
special problems in a particular edition, so also many of the
points mentioned below will not be applicable to every edition--e.g.,
Section IV.C Collations would not be relevant to a diplomatic
edition of a single text. The guidelines are intended only to
provide a broad framework for identifying issues and for dealing
with them reasonably.
For an electronic scholarly edition, perhaps the single most crucial
decision is the choice of encoding standard. Internationally accepted
and publicly defined norms, as set forth below, are preferable
to proprietary systems. If the norms are chosen correctly, the
edition can be migrated easily to new hardware and software platforms,
thus preserving the work that has gone into it.
- Of paramount concern is the necessity of standardizing the
character set, encoding norms, and documentation of the source
documents and the electronic edition itself as much as possible.
These elements should be as machine- and software-independent
as possible and in sufficiently widespread use that they
can reasonably be expected to be ported to future systems without
too much difficulty, since a well-prepared electronic edition
will in all likelihood outlast the hardware and software environment
in which it was produced. Editors must distinguish between the
intellectual requirements of the edition and the requirements
of its preparation, distribution, and use.
- Character set. For maximum portability the recommended character
sets are ANSI standard X3.4-1986 (lower ASCII), with 128 characters,
ISO 646 (82 characters), or UNICODE. In certain disciplines other
coding schemes of long standing exist and may be used (e.g., beta
coding for classical Greek). In some cases unique codes using these
character sets may need to be devised in order to represent special
characters. The character set should be explicitly declared as
part of the edition itself (e.g., as Text Encoding Initiative
[TEI] Writing System Declarations or as SGML entities).
- Encoding norms. It is preferable to use the implementation
of Standard Generalized Markup Language (SGML) specifically devised
for coding electronic texts, the Text Encoding Initiative (TEI).
The choice of an alternate standard should be fully justified
- The text itself should be essentially self-describing, which
means that the computer file which embodies it should contain
a header with essential meta-data. The Guidelines for Electronic
Text Encoding and Interchange (TEI P3), edited by C.M. Sperberg-McQueen
and Lou Burnard (1994), offer detailed descriptions of the sorts
of information that should be provided for the source document
as well as the electronic text itself (see chap. 5 of the TEI
guidelines). Meta-data should include:
- A description of the file itself and the sources used in its
preparation (although the description given here need not be as
detailed as that found in the introductory essay) (File Description).
- The encoding system used (Encoding Description).
- The level of encoding should respond to the purpose of the
edition. However, at a minimum any edition should encode elements
which by any reasonable standard are of general importance and
objectively determinable (e.g., the text structure itself: chapters,
acts, scenes). Any encoding scheme should be extensible in order
to allow others to add encoding later.
- Contextual information concerning the subject matter of the
text as well as the basic information about the editor(s) (Profile
- Information concerning the changes made to the file in the
course of its preparation (Revision Description).
- Coupled with this is the necessity of a mechanism to authenticate
the contents of the file (e.g., a hashing algorithm using a time-stamping
mechanism to generate a unique id number). Because of the ease
with which electronic texts can be changed, users must be able
to satisfy themselves that the file in fact is what it purports
- Similarly, formats for other media included in the edition
(sound, image, video) should conform to non-proprietary standards.
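The authentication mechanism mentioned above can be illustrated with a minimal sketch. It assumes that the editor publishes a digest of the edition file alongside the edition, so that a user can recompute the digest over a received copy and compare. The choice of SHA-256, the function names, and the sample text are assumptions made for illustration; the guidelines do not name an algorithm, and the time-stamping component they mention is omitted here.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the edition file's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_authentic(data: bytes, published_digest: str) -> bool:
    """Check a received copy against the digest published with the edition."""
    return file_digest(data) == published_digest


# Hypothetical file contents, for illustration only.
original = b"<text>The first line of the edited text</text>"
published = file_digest(original)

# A copy altered after publication no longer matches the published digest.
tampered = original.replace(b"first", b"FIRST")

print(is_authentic(original, published))
print(is_authentic(tampered, published))
```

Any change to the file, however small, yields a different digest, which is what allows users to satisfy themselves that a copy is unaltered.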
- While the format and content of electronic editions can, appropriately,
vary as much as those of print editions, it seems clear that the
possibility of digitized facsimiles of the original source materials,
especially the copy text, would enhance the usability and reliability
of virtually any electronic edition. Notionally, one can conceive
of the utility of a hypermedia archive, comprising digitized facsimiles
of all textual witnesses, encoded electronic transcriptions of
each witness linked to it, and a critical text linked to those
transcriptions, along with annotations, sources, analogues, etc.
In practice, the cost of preparing such archives for long texts
with many witnesses is likely to be prohibitive. Appropriate non-textual
materials (e.g., recordings of poetry read by the author or performances
of dramatic works) can only enhance the scholarly value of the
- Annotation of digitized facsimiles as well as linking of image
to transcription at the line or word level would greatly facilitate
scholarly use of such materials.
- Similarly, alignment of parallel texts (witnesses to a single
text, translations) at least to the line level would also facilitate
scholarly use. Line breaks in base transcriptions should be retained
so that they may be shown (if desired) when the text is displayed
in different-sized windows.
- Archival format: The preservation form of the text should be
non-proprietary and as machine- and software-independent as possible.
- The master digital archive should be maintained on a server,
preferably network-accessible and ideally in the custody of an
institution that can guarantee preservation of the archive and
migration to suitable hardware and software platforms as technology
changes (e.g., a university library or electronic text archive).
- A read-only version of the preservation form of the text should
also be maintained (e.g., on a CD-ROM disk, digital linear tape,
or other long-term storage medium).
- Delivery software involves both presentational and analytical
software. Given the current existence of three widespread software
platforms (MS-DOS/Windows, Macintosh, UNIX) and distribution on
diskette, CD-ROM, or the Internet, it seems likely that most electronic
editions will not be universally available to all users in their
most sophisticated form.
- Presentational and analytical software should ideally be widely
available (commercial, shareware, or public domain) for a variety
of platforms and should have a reasonable life expectancy. Although
electronic editions need not be published commercially, they should
be made available in standardized formats, e.g., CD-ROM disks
in ISO 9660 format, preferably not limited to a given computer
platform. CD-ROM disks have the great advantage of fixing the
form of the text at a given time, much like a traditional paper
edition; but they do not allow for additions and corrections except
through the release of a second edition.
- Network access from a central location, or text archive, is
highly desirable, both to minimize the proliferation of variant
texts and to facilitate revisions. Network access may obviate
the necessity of providing platform-specific versions, since World
Wide Web browsing tools (Mosaic, Netscape) exist already for each
platform. The current HTML markup language is not adequate for
serious scholarly purposes, although HTML versions of SGML-marked-up
text may be suitable delivery mechanisms.
- Such momentary limitations can be overcome, however, by preserving
the text in a more sophisticated archival form (i.e., SGML) and
then mapping it into other formats for presentation.
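The idea of keeping a richly encoded archival form and mapping it into a simpler format for delivery can be sketched as follows. This is only an illustration under stated assumptions: the archival fragment is written as well-formed XML with TEI-like element names (`div`, `head`, `l`) so that Python's standard XML parser can read it, and the function name `to_html` and the sample text are hypothetical. A real SGML edition would need an SGML-aware parser and a far fuller mapping.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical archival fragment with TEI-like element names.
ARCHIVAL = (
    '<div type="stanza">'
    "<head>Song</head>"
    '<l n="1">The first line</l>'
    '<l n="2">The second line</l>'
    "</div>"
)


def to_html(source: str) -> str:
    """Map the structural encoding to simple HTML for presentation."""
    root = ET.fromstring(source)
    parts = []
    head = root.find("head")
    if head is not None:
        parts.append(f"<h2>{escape(head.text or '')}</h2>")
    for line in root.findall("l"):
        n = escape(line.get("n", ""))
        text = escape(line.text or "")
        # Line numbers survive as ids so that links can still target lines.
        parts.append(f'<p class="line" id="l{n}">{text}</p>')
    return "\n".join(parts)


print(to_html(ARCHIVAL))
```

The archival file remains the authoritative form; the HTML is a disposable view that can be regenerated whenever delivery formats change.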
- Hypertext capabilities. The software chosen should allow for
the use of hypertext, preferably with the capability to allow
the user to add personal links as well as to annotate the text
- The editorial principles should include a rationale of the
kinds of hypertext links used as well as of the categories of
information that they are used to connect. The links themselves
should be annotated to indicate their scholarly purpose and to
facilitate searching by category (e.g., source).
- Analytic software similarly should be widely available and
not limited to a single platform.
- Analytic software might include:
- Retrieval software (e.g., TACT). Retrieval software frequently
uses an indexed data base. Such a data base should include every
individual word form as well as (preferably) access to lemmatized
forms. The latter is particularly necessary for old spelling editions.
Texts should also be available in a non-indexed form.
- Collation software (e.g., CASE, COLLATE, UNITE). If the editor
has constructed a critical text on the basis of full text transcriptions,
collation software allows the user to verify the editor's critical
practice as well as vary the editorial assumptions (e.g., by selecting
another version as a base text) and criteria (e.g., preservation
of accidentals). Moreover, collation software allows the user
to prepare a subedition of an individual family in a complex textual
tradition, thereby facilitating reception studies.
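The indexed data base described under retrieval software, with access by individual word form and by lemma, can be sketched in a few lines. The lemma table, the function name `build_index`, and the sample lines are hypothetical; a real edition would draw its lemmata from a lemmatizer or a machine-readable dictionary rather than from a hand-written table.

```python
from collections import defaultdict

# Hypothetical lemma table for an old-spelling text.
LEMMAS = {"louely": "lovely", "lovely": "lovely", "loue": "love", "love": "love"}


def build_index(lines):
    """Index every word form and its lemma by 1-based line number."""
    forms = defaultdict(list)
    lemmas = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        for raw in line.split():
            word = raw.strip(".,;:!?").lower()
            if not word:
                continue
            forms[word].append(line_no)
            # Words missing from the table are treated as their own lemma.
            lemmas[LEMMAS.get(word, word)].append(line_no)
    return forms, lemmas


# Invented sample lines, for illustration only.
sample = ["Thou art more louely", "A lovely boy", "For loue of thee"]
forms, lemmas = build_index(sample)

print(forms["louely"])
print(lemmas["lovely"])
```

A search on the form `louely` finds only the old-spelling occurrence, while a search on the lemma `lovely` finds both spellings, which is why lemmatized access matters for old-spelling editions.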
- Insofar as possible, software should be used instead of manual
techniques. Thus, instead of encoding, for example, morphological
information at the word level, or lemmatizing texts manually,
parsers, lemmatizers, or machine-readable dictionaries external
to the text could be employed. Much software of this sort is not
yet widely available and, when it is, may not necessarily fulfill
an edition's requirements for accuracy. It is likely that the
development of sophisticated software tools will be the single
most important factor in facilitating the creation of sophisticated
electronic scholarly editions. Any such software should have the
capability of specifying and storing rules for any actions it
carries out and following them without exception.
- CONCEPTION AND PLAN OF EDITION. The content of an electronic
edition differs little from that of a print edition. It should
be appropriate, complete, and coherently conceived. The criteria
for what is to be included in an electronic critical edition will
generally be more expansive than those for a comparable printed
edition, because of the computer's inherent ability to organize
and manipulate large amounts of data. In addition to materials
that form part of the edition itself, an electronic edition can
also make use of existing electronic materials by linking to them.
The considerations set forth above with regard to encoding schemes,
formats, digitized facsimiles, etc., apply equally to all of the
materials listed below. The contents should:
- include logically selected, manageable textual content--e.g.,
an edition of a single work, a group of works generically or chronologically
- include, when appropriate, authorial documents in addition
to basic text(s), such as adaptations, working notes, contracts,
tables of contents, prefaces, abstracts;
- present appropriate second-party textual materials--e.g.,
letters from respondents may be desirable in an edition of letters;
- include the editorial materials required by the kind of edition
envisaged--e.g., prefaces and acknowledgments;
lists of sigla, symbols, and abbreviations; textual essay;
textual apparatus (or the functional equivalent, e.g., hypertext
links) and/or notes; historical/interpretive essay(s);
illustrations--charts, diagrams, maps; historical/explanatory
notes; appendices; bibliography; glossary; index(es);
- be logically arranged and easy to use;
- include appropriate analytical and text retrieval tools, either
as part of the edition itself or as part of the access package
for which the edition is designed (e.g., network browsing tools
- EDITORIAL METHODS AND PROCEDURES
- A thorough census of all relevant materials should be conducted.
- Although editors may use reproductions (e.g., photocopies,
microfilms, or digitized facsimiles) for preliminary editing,
they should at some point verify the accuracy of their work against
the original artifacts.
- Machine-readable transcriptions should be made according to
an established rationale and policy, covering, e.g., such matters
as expansion of abbreviations, use of special characters, and
indication of medium. Apart from exceptionally clear machine-printed
modern texts, photocopied or original, scanners have not as yet
proved accurate enough to replace manual transcription.
- One very reliable method of manual transcription for printed
materials is to input the same text twice, by two different people,
who do not necessarily have to know the language involved, then
use a collation or file compare program to find the differences.
- Transcriptions should be double-checked and perfected
by persons other than the transcriber using appropriate manual
and computerized proofreading procedures.
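The double-keying method described above, in which two people input the same text and a file-compare program finds the differences, can be sketched with Python's standard `difflib` module. The function name `keying_discrepancies` and the sample lines are hypothetical, and the sketch compares whole lines only; a production tool would also report differences within a line.

```python
import difflib


def keying_discrepancies(first, second):
    """Return (line_no, first_lines, second_lines) wherever two keyings differ."""
    matcher = difflib.SequenceMatcher(None, first, second, autojunk=False)
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # i1 is 0-based; report 1-based line numbers of the first keying.
            report.append((i1 + 1, first[i1:i2], second[j1:j2]))
    return report


# Two independent keyings of the same invented source lines.
typist_a = ["The quick brown fox", "iumps over the lazy dog"]
typist_b = ["The quick brown fox", "jumps over the lazy dog"]

for line_no, a, b in keying_discrepancies(typist_a, typist_b):
    print(line_no, a, b)
```

Each reported discrepancy still has to be resolved by a person checking the source, since the comparison shows only that the two keyings disagree, not which one is right.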
- All significant or potentially significant forms of the text(s)
should be collated or included as machine-readable transcriptions
of the witnesses.
- Accuracy of the collations should be verified by comparison
of results obtained by different people using appropriate collation
or file comparison software to supplement manual proofing. In
the latter case, it may be assumed that the collations obtained
will reflect faithfully the underlying transcriptions.
- Editorial policy for defining and recording variants should
be clearly stated, preferably in the form of parameters established
in the collation software. All items defined as variants should
be recorded whether or not they are to be included in the completed
edition. Such variants will be recorded automatically if complete
transcriptions of the textual witnesses have been made and if
the collation software has been programmed to list them.
- The collation software used should be capable of filtering
out variants according to established categories (e.g., spelling,
capitalization, punctuation) and of separating or grouping the
resulting apparatus by those categories.
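The filtering of variants by category can be illustrated with a small sketch. It assumes, as a simplification, that the base text and the witness are already aligned word for word, which real collation software cannot take for granted. The category names, the functions `classify_variant` and `collate`, and the sample readings are hypothetical, and spelling variants are not separated from substantive ones because that distinction requires editorial or lexical knowledge.

```python
import string

_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def classify_variant(base, witness):
    """Assign a pair of aligned readings to a coarse variant category."""
    if base == witness:
        return "identical"
    if base.lower() == witness.lower():
        return "capitalization"
    if base.translate(_STRIP_PUNCT) == witness.translate(_STRIP_PUNCT):
        return "punctuation"
    return "spelling_or_substantive"


def collate(base_line, witness_line, ignore=()):
    """List variants word by word, omitting any categories named in ignore."""
    variants = []
    pairs = zip(base_line.split(), witness_line.split())
    for position, (b, w) in enumerate(pairs, start=1):
        category = classify_variant(b, w)
        if category != "identical" and category not in ignore:
            variants.append((position, b, w, category))
    return variants


# Invented readings, for illustration only.
base = "Thou art more lovely, and more temperate"
witness = "thou art more lovely and most temperate"

print(collate(base, witness))
print(collate(base, witness, ignore=("capitalization", "punctuation")))
```

Passing different `ignore` values corresponds to the user changing the parameters that select variants, without altering the underlying transcriptions.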
- Sources of references and quotations in author's text(s) should
be identified, and any textual problems raised should be addressed.
- Care should be taken that the text is accurately quoted in
the textual essay, textual notes, historical essay, and explanatory
notes, preferably by hot linking to the quoted passage rather
than by copying it, when it becomes technically feasible to do
so, so that any change in the text is reflected in the essay.
- Proofing at every stage to safeguard accuracy is of the highest
- PARTS OF THE EDITION
- The decision to use a single or multiple base- or copy-text,
parallel texts, sequential versions, or a combination of these,
should be appropriate to the goal of the edition. Sophisticated
encoding and linkage will allow the greatest flexibility to both
editor and user in deciding and altering the presentation format.
- The form of presentation of the texts--whether in
clear text, diplomatic transcription, facsimile, or in some other
format--should be consistent with announced principles.
Detailed encoding combined with appropriate filtering mechanisms
can allow the same base text to be presented in a variety of different
ways; e.g., as an old spelling or a modernized edition.
- Inclusive text should use a clear and efficient system to symbolize
or reproduce cancellations, interlineations, omissions, insertions,
- Textual Essay
- The essay should provide a clear, convincing, and thorough
statement of the edition's theoretical principles and practical
methodology, covering such matters as:
- theory of copy-text adopted;
- description of alternative candidates, if any, for basic text
(whether single, parallel, or sequential texts are presented)
and justification of selection; instructions on how to use software
to select alternative base texts;
- justification of form of presentation, whether clear text,
diplomatic transcription, or other form, and instructions on how
to convert the presentation of the text from one form to another;
- clear explanation of emendation policy, covering all changes
made in the basic text(s) or documents, whether or not such changes
appear in the emendations list;
- rationale for including and excluding various classes of textual
variants in the apparatus, or instructions on how to use the collation
software to change the paradigms which select variants;
- explanation of treatment of ambiguously broken line-end
compounds or possible compounds in source text(s);
- clear instructions for using the textual apparatus, or the
accompanying collation programs;
- description of the character set and encoding scheme used;
- instructions for use of the text retrieval software.
- The discussion of materials the edition is based on should
include the following, where appropriate:
- a survey of all forms of the text(s) relevant to the edition,
including an account of the provenance of such forms and/or artifacts;
- a record of locations of relevant manuscripts and unique printed
- identification of the specific copies used for collations,
preparation of printer's copy, etc.;
- bibliographical or codicological description of the relevant
artifacts (printed copies, manuscripts, typescripts, tear-sheets,
etc.). When possible this should be accompanied by complete digitized
facsimiles of such artifacts.
- The account of the evolution of the text(s) should include:
- the history of composition and revision, whether by the author,
scribes, editors, compositors, etc.;
- the history of publication of printed texts;
- for scribal texts, a profile of the copying habits, orthography,
and dialect of manuscript scribes.
- Critical/textual apparatus (The term "apparatus"
is used here in its broadest sense. The CSE does not require a
standard format for the apparatus.)
- Design and Purpose of Apparatus
- The apparatus or collation software used in conjunction with
the textual essay should enable thorough study of the composition
and transmission of the text within the limits envisaged by the
- The apparatus or collation software should distinguish, where
possible, between what the author has done to the text and what
was done by scribes, printers, compositors, advisors, and editors
(including the present one).
- The record of textual variants should be logical, complete,
and uncluttered; it should:
(1) conform to the principles announced in the textual essay;
(2) include variants from all authoritative or significant texts;
(3) make possible, when used in conjunction with the edited text(s),
the recovery of all significant forms of the text, if such is
consistent with the goals of the edition, preferably by display
of the complete form of the transcription of the originals.
- Each part of the apparatus should be self-contained; cross-referencing
of information between lists should be clear and simple to follow,
a process that can be facilitated by appropriate use of hypertext
links. The implementation of hypertext links should make clear
the distinction between textual and non-textual material. Encoding
of apparatus where there is not a complete transcription of all
relevant witnesses should follow the TEI or other appropriate
- Parts of the Apparatus
- Record of emendations: editorial emendations--words, spelling,
punctuation, and capitalization--of the basic text(s) should be
reported or adequately described in a manner consistent with the
stated policy of emendation; if emendations are not individually
reported, the policy must be justified and the classes of unreported
emendations adequately described.
- Record of alterations: the author's alterations of the text
should be recorded.
- Records of variants should follow the edition's stated principles
of inclusion and exclusion and should make clear the history and/or
permutations of the text. Collation software should allow the
user to modify those principles to suit his or her own needs.
- Textual notes should identify the textual problems and adequately
explain how the editors have dealt with them.
- Records of Word, Stanza, and Section Breaks
(1) All ambiguous line-end hyphenation of compounds or possibly
compound words in printed texts used as basic texts should be
recorded; a second list should indicate the way such compounds
ambiguously broken in the new edition should be quoted. This process
will be facilitated by the use of hard and soft (conditional)
(2) Stanza, section, and verse paragraphs ambiguously broken at
the ends of printed pages should be recorded.
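The distinction between hard and soft hyphens at line ends can be made concrete with a short sketch. It assumes one possible convention: a soft (conditional) hyphen is encoded as the character U+00AD and is dropped when lines are rejoined, while an ordinary hyphen at a line end is hard and is retained in quotation. The function name `join_lines` and the sample lines are hypothetical.

```python
SOFT_HYPHEN = "\u00ad"


def join_lines(lines):
    """Rejoin words broken at line ends: drop soft hyphens, keep hard ones."""
    text = ""
    for line in lines:
        if line.endswith(SOFT_HYPHEN):
            # The break is purely typographic: the word continues unhyphenated.
            text += line[:-1]
        elif line.endswith("-"):
            # The hyphen belongs to the compound and is kept when quoting.
            text += line
        else:
            text += line + " "
    return text.rstrip()


# Invented sample lines, for illustration only.
printed = ["The steam\u00ad", "boat passed the half-", "breed pilot."]
print(join_lines(printed))
```

With the encoding carrying the distinction, the list of compounds to be quoted with or without a hyphen can be produced from the text rather than compiled by hand.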
- Extra-Textual Materials
- Historical or critical essays and analyses, explanatory notes,
glosses, etc., should, if present:
- be clearly separated from the textual essay and complement
rather than duplicate information in the textual essay;
- dovetail smoothly with the textual essay;
- conform to a reasoned policy for length, placement, and content;
- be complete.
- Glossaries and proper-name tables
- The rationale for determining entries should be clear and appropriate
both to the text and to the audience envisaged.
- The format should be clear and uncluttered.
- Cross-references should be provided for entries having alternate
- To the extent possible such tables should be electronically
generated on the basis of encoding.
- PREPARATION FOR PUBLICATION
- All necessary permissions to publish the material must be obtained
from the owners and copyright holders.
- The editor and the publisher should agree on the encoding scheme
and software to be used and the publisher should at an early stage
see a sample.
- The editor and the publisher should understand one another's
special requirements for publishing electronic scholarly editions,
- the particular design requirements of the formatted edition
and, if applicable, the format of the series as a whole;
- special aspects of the production schedule, including:
- the amount of time to be allowed for multiple proofreadings
and for necessary final collations.
- Final responsibility for maintaining the accuracy of the text
during production must be clearly assigned.
- Adequate resources should be allotted, and a comprehensive
plan for proofreading should be developed, taking into account:
- how proof will be read--by whom, how many times, and against
- which stages of proof will be read by the editor(s).
- Final collations or checks should be carried out to ensure
that no unauthorized changes have been made in the final electronic
files in proof. Spell checkers and word lists are useful for spotting
anomalies but all changes must be verified.
- Use of Electronic Files
- Since the editor's electronic files will be used for all or
part of the formatted edition, the editor and publisher should
- the choice of software and platform, bearing in mind problems
such as the linking of notes with text, nonstandard characters,
etc. (ideally, an edition should be available on as many platforms
- the extent to which the encoding scheme chosen will allow or
facilitate subsequent publication in other formats, e.g., print;
- who is responsible for inserting final changes or corrections
in the file--the editor, the publisher, or third-party technical
- If electronic files are to be translated to a system that will
drive the typesetting machinery, the resulting proofs should be
checked as they normally would.
- Arrangements should be made for retaining and archiving the
- Consideration should be given to publication of the edition
in a variety of formats, including print.
- Indexing: in addition to full text retrieval software, consideration
should be given to the encoding of items to be indexed (e.g.,
proper names); and appropriate software for retrieval of indexed
items should be included.
- Reformatting: To facilitate reformatting, editors and publishers
- making archived electronic files available for reformatting;
- encoding the apparatus and editorial in such a way that they
can easily be omitted, if desired, from reformatted versions;
- licensing libraries to extract data in order to integrate it
into locally-based electronic text collections.
PLEASE SEND COMMENTS TO:
The Electronic Scholarly Editions Listserv:
For the Committee on Scholarly Editions
Charles B. Faulhaber
The Bancroft Library
University of California
Berkeley, CA 94720-6000