English Poetry is a machine-readable full-text database encompassing the works of 1,350 poets from the Anglo-Saxon period to the end of the nineteenth century, available commercially from Chadwyck-Healey.
It is the largest and most accessible full-text database yet published in the humanities. The great size and chronological span of English Poetry and the consistent coding of all the texts make it a valuable resource for research, teaching and reference.
The availability of the major part of the English poetic canon in a single database makes possible a vast range of new ways of researching poetry. It is impossible to predict all the potential applications but they will certainly include those listed below.
As never before, researchers can trace thematic influence and compare the imagery of poets across all periods.
English Poetry provides the raw material for stylistic and linguistic analysis of every kind. The ability to select texts for study using sophisticated search techniques, and to print or save them easily, enormously facilitates the work of researchers. Time previously spent in the identification and collection of material can now be used more fruitfully in its analysis.
English Poetry is a powerful aid in the creation of concordances and any kind of lexicographic research. The database can be used as a guide to changing patterns of word use and vocabulary and the earliest use in poetry of words and phrases.
The great majority of texts in the database are out of copyright and teachers may use them to create their own anthologies for course work, quickly and inexpensively. Commentaries and questions can be added easily to any poem or line.
English Poetry provides teachers with an almost limitless source of material for hands-on course work. Students can be assigned an enormous variety of projects in literary and stylistic analysis.
English Poetry contains primarily the works of those writers listed as poets by The New Cambridge Bibliography of English Literature, Cambridge University Press, 1969-1972 (NCBEL). It includes those writers whose main entry in NCBEL appears under another genre but who are cross-referenced to Poetry. It also includes the few writers of poetry not cross-referenced by NCBEL, for example Emily Brontë and Aphra Behn. In addition, following NCBEL, the database contains the works in English of Welsh, Scottish and Irish poets. Poets who were active before 1900 are included but poets principally active in the twentieth century are excluded.
NCBEL is an authoritative reference work present in almost every library that supports English studies. It describes itself as 'a blueprint for research' and provides a sound list of poets for English Poetry. It is recognised, however, that developments over the past twenty years have focused attention on poets not included in NCBEL and on categories of poets, such as women poets, that are underrepresented.
English Poetry's ability to cope with such omissions at a later stage through the seamless merging of new texts into the database is one of its great virtues.
The electronic publishing medium and the use of rigorous and comprehensive coding make possible what could never be achieved in print. In response to the suggestions of scholars Chadwyck-Healey will supplement the core database with the texts of other poets.
Individual scholars can also create their own personal databases by adding other machine-readable texts to texts downloaded from English Poetry.
The Works
English Poetry aims to include as full a collection of the published works of each poet as possible.
Works for the stage which are excluded will be the subject of a separate electronic publishing project in the future.
The Editorial Board recommends exceptions to these selection criteria when works which strictly do not meet them are nevertheless considered too important to be excluded.
The Editions
The Editorial Board has the task of selecting the editions to be included in English Poetry. No one single approach to the whole corpus is possible but the general policy is to select the more reliable early and, where appropriate, collected editions. In some cases later editions have been used; and in other cases modern critical editions, subject to the agreement of copyright holders, have been drawn on for texts not otherwise readily available in print. In the case of every poem the edition selected is stated and full bibliographic information is given.
In addition to the works of individual poets certain 'landmark' anthologies have been selected by the Editorial Board and will be included in their entirety e.g. Percy's Reliques of Ancient English Poetry, 1765.
The Texts
The purpose of English Poetry is to provide poetry texts in machine-readable form. The entire text of each poem is therefore included. Any accompanying text written by the poet and forming an integral part of the poem, such as dedications, prologues, epigraphs, footnotes, sidenotes and endnotes, is also generally included.
The aim of the project is to record the text and structure of poems and not to provide a facsimile of a particular printed version. A descriptive markup scheme is used in which textual elements are identified by their function rather than their appearance. For example the titles of poems are marked explicitly as titles rather than as being in a larger typeface or italicised. Changes of typeface or pointsize peculiar to any one edition are not explicitly recorded in the encoded texts. However, for users who wish to refer back to the printed sources of poems, page breaks, page numbers and other essential information specific to editions are included in the database.
The Coding of the Texts
An important feature of English Poetry is the use of Standard Generalised Markup Language (SGML) for the coding of the texts. This internationally recognised coding language, specified in ISO 8879, greatly enhances the value of the database to researchers.
Though SGML provides a standard set of rules for how to encode texts, it provides no guidance on which particular textual features should be encoded, since the requirements of different user communities will obviously differ in this respect. Well known existing standards based on SGML include the 'Electronic Manuscript' project of the Association of American Publishers and the CALS standard of the US Department of Defense, and the work of establishing other standards is continuing. English Poetry is helping to shape that work.
The SGML encoding scheme to be used by English Poetry is closely modelled on that being developed by the Text Encoding Initiative (TEI), the international research project sponsored by the Association for Computers and the Humanities, the Association for Literary and Linguistic Computing and the Association for Computational Linguistics, and jointly funded by the US National Endowment for the Humanities, the Commission of the European Communities (DG XIII) and the Andrew W. Mellon Foundation.
The two TEI editors are also members of the editorial board for English Poetry.
Examples of elements distinguished by the English Poetry encoding scheme include structural units such as volume, part, book, etc. down to the level of individual poems or groups of poems, stanzas and lines. Titles, headings, refrains, prologues, notes, etc. are clearly distinguished from texts, and prose from verse. Page divisions, use of typographic emphasis and indentation are also all clearly marked. For verse dramas scene, act, speaker, stage instructions and cast list are coded.
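As an illustration of what such structural coding makes possible, the sketch below reads a small hand-made fragment and counts its stanzas and lines. The element names (poem, title, stanza, l) and the sample text are assumptions made for this example and are not taken from the English Poetry DTD. The fragment is written as well-formed XML so that Python's standard xml.etree.ElementTree module can read it; genuine SGML may use features, such as omitted end tags, that this parser would not accept.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names and text are illustrative assumptions,
# not the actual English Poetry encoding.
SAMPLE = """
<poem>
  <title>A Sample Title</title>
  <stanza>
    <l>First line of the first stanza</l>
    <l>Second line of the first stanza</l>
  </stanza>
  <stanza>
    <l>First line of the second stanza</l>
  </stanza>
</poem>
"""

def summarise(poem_markup):
    """Return the title, stanza count and line count of one encoded poem."""
    root = ET.fromstring(poem_markup)
    title = root.findtext("title")        # marked as a title, not by typeface
    stanzas = root.findall("stanza")      # direct stanza children of the poem
    lines = root.findall("stanza/l")      # every line inside every stanza
    return {"title": title, "stanzas": len(stanzas), "lines": len(lines)}

print(summarise(SAMPLE))
```

Because the title is identified by its function rather than its appearance, the same few lines of code would work for texts taken from any edition.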
Full documentation for the encoding scheme of English Poetry, including the source code of a 'document type definition' (DTD) describing it, is given in The English Poetry Full-Text Database Coding Handbook which is supplied free of charge to all purchasers of the database.
The Capture of the Characters
Great care has been taken to ensure that the text of English Poetry can be used on any computer system using any national or international character set.
The following technical specifications apply:
For the storage format the character set used is ISO 646, which is equivalent to a strict form of ASCII. This is extended by the use of SGML entity references taken from the standard sets proposed in ISO 8879 for the characters missing from it, such as Old English thorn and yogh. The small amount of Greek in the database (mainly epigraphs) is also captured.
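The principle of the storage format can be sketched in a few lines of Python. This is a minimal illustration and not the publisher's conversion software: the function name and the small mapping table are assumptions, although the entity names shown (thorn, THORN, eth, aelig) are those of the ISO 8879 Latin 1 entity set.

```python
# Assumed mapping from characters to entity names, for illustration only.
# The full table used by English Poetry is not reproduced here.
ENTITY_NAMES = {
    "\u00fe": "thorn",   # lower-case thorn
    "\u00de": "THORN",   # upper-case thorn
    "\u00f0": "eth",     # lower-case eth
    "\u00e6": "aelig",   # lower-case ae ligature
}

def to_storage_form(text):
    """Replace every character outside 7-bit ASCII with an entity reference."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)                       # ISO 646 characters pass through
        elif ch in ENTITY_NAMES:
            out.append("&" + ENTITY_NAMES[ch] + ";")
        else:
            raise ValueError("no entity name known for %r" % ch)
    return "".join(out)

print(to_storage_form("\u00fe\u00e6t w\u00e6s god cyning"))
```

A literal ampersand in the input would also need to be protected in a real conversion; that step is left out of this sketch.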
These internal character encodings are however largely invisible to the user of English Poetry, as they are automatically converted to an appropriate display format by the CD-ROM retrieval software.
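The conversion to a display format might, in outline, look like the following. The sketch is an assumption about the general approach and does not describe the CD-ROM retrieval software itself; the function name and the entity table are hypothetical, and references not found in the table are simply left as they are.

```python
import re

# Assumed entity table, for illustration only.
ENTITY_CHARS = {
    "thorn": "\u00fe",   # lower-case thorn
    "THORN": "\u00de",   # upper-case thorn
    "eth": "\u00f0",     # lower-case eth
    "aelig": "\u00e6",   # lower-case ae ligature
}

# An entity reference: ampersand, a name, then a semicolon.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")

def to_display_form(stored):
    """Expand known entity references; leave unknown ones untouched."""
    def expand(match):
        return ENTITY_CHARS.get(match.group(1), match.group(0))
    return ENTITY_REF.sub(expand, stored)

displayed = to_display_form("&thorn;&aelig;t w&aelig;s god cyning")
```

Leaving unknown references untouched means that a text can still be read on a system whose character set lacks some of the characters concerned.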
Texts extracted from the database for re-use with another piece of software, and texts supplied on magnetic tape, are encoded in a strictly TEI-conformant form, in which all characters outside a specified subset of ISO 646 are replaced by entity references.
Copyright
Texts may be printed out to create hardcopy editions for teaching or research, provided that the rights of original copyright holders are not infringed. Such copyrights are clearly noted in the database.
Small extracts of texts may be downloaded to create databases for teaching, research and personal use, and may be incorporated in an article or essay for publication in a journal or collected work.
The publication of larger extracts of texts in printed form requires the written permission of Chadwyck-Healey.
Texts may not be used for publication in electronic formats whether on their own or with other texts or in modified form without written permission from Chadwyck-Healey. Such permission will not be unreasonably withheld.
Cataloguing
Each individual text includes a TEI-conformant 'file header' description so that its origin and identity can always be determined even when it is separated from the database. This header contains bibliographic information about the text.
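The value of a header that travels with each text can be shown with a short sketch. The element names (text, header, title, author, source, body) and the bibliographic details are invented placeholders for this example; they are not the element names of the TEI header or of the English Poetry DTD, and the fragment is written as well-formed XML so that Python's standard library can parse it.

```python
import xml.etree.ElementTree as ET

# Hypothetical extracted text: element names and bibliographic details are
# placeholders invented for this example.
EXTRACTED = """
<text>
  <header>
    <title>A Sample Poem</title>
    <author>A. Poet</author>
    <source>Poems, London, 1800</source>
  </header>
  <body><l>Only line of the sample poem</l></body>
</text>
"""

def describe_origin(text_markup):
    """Read the bibliographic fields carried in the header of one text."""
    header = ET.fromstring(text_markup).find("header")
    if header is None:
        raise ValueError("text carries no header")
    # Collect each header field by name, e.g. title, author, source.
    return {child.tag: (child.text or "").strip() for child in header}

print(describe_origin(EXTRACTED))
```

Since the description is stored inside the text itself, the lookup needs no access to the database from which the text was taken.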
MARC records for the editions from which texts have been taken are being made available separately by the publisher at no additional charge. English Poetry is designed to be fully integrated into library collections so that the machine-readable texts can be represented in catalogues alongside their companion print and microform editions.