This online copy of ETEXTCTR Review #2 contains references to works of interest to librarians, some of which reference SGML. Most of the relevant articles are cited in my full SGML bibliography, but additional abstracts are available here, usually linked from my main entries. -rcc
From: ETEXTCTR Discussion List <etextctr@phoenix.Princeton.EDU>
To: Electronic Text Centers List <etextctr@lists.Princeton.EDU>
Date: Fri, 19 May 1995 16:19:58 EDT
Sender: Mary Mallery, Moderator, ETEXTCTR Discussion List <mallery@gandalf.rutgers.edu>
Subject: ETEXTCTR Review #2

ETEXTCTR Review #2, May, 1995

ETEXTCTR Review provides abstracts of current articles from journals of interest to those working with electronic texts in a research setting. Volunteer contributors for this issue are: Jerry Caswell (JVC), Iowa State University Libraries; Aurora Ioanid (AI), The Center for Electronic Texts in the Humanities; and Mary Mallery (MM), CETH.

*************************************************************************
* Burrows, Toby. (1994). "Integrating electronic services into the
academic library: the Scholars' Centre at the University of Western
Australia." _Australian Academic and Research Libraries_ 25: 213-220.
The article examines the complexities of the process of setting up a center
for scholarly electronic resources and integrating it with the rest of the
library information services at the University of Western Australia
Library. The author emphasizes the three "major imperatives" that govern
the establishment of this center, known as the Scholars' Centre:
"proliferation of resources in electronic forms," the "need to focus on
library services," and the necessity to contribute to the collective effort
to "maximise the quality of its [the university's] teaching and research."
The author addresses all the important aspects that are involved in the
process, starting with issues related to the physical facilities, the need
for specialized staff, collection development, type of access to various
databases and electronic texts, as well as a thorough analysis of the
users' needs to calculate the level of expertise needed for operating the
Center. --AI
* Day, Mark Tyler. (1994). "Humanizing Information Technology: Cultural
Evolution and the Institutionalization of Electronic Text
Processing." In Sutton, Brett, ed. _Literary Texts in an
Electronic Age: Scholarly Implications and Library Services_.
Graduate School of Library and Information Science, University of
Illinois at Urbana-Champaign, pp. 67-92.
Day discusses issues of cultural evolution in today's information
society and the efforts made by the modern university library
to adapt to them. Indiana University's Library Electronic Text
Resource Service (LETRS) is an example of the adaptive process.
It is a cooperative effort of the library and computing center to
provide faculty and students with access to scholarly electronic
texts in the humanities and related computing software tools.
Despite organizational and economic constraints, this "humanist's
laboratory" represents a new collaborative system of the cultural
preservation of materials that embody traditional humanistic
values. --JVC
* Guenther, Rebecca. (1994). "The Challenges of Electronic Texts in the
Library: Bibliographic Control and Access." In Sutton, Brett, ed. _Literary
Texts in an Electronic Age: Scholarly Implications and Library Services_.
Graduate School of Library and Information Science, University of Illinois
at Urbana-Champaign, pp. 149-172.
This article addresses special problems relating to bibliographic control
and description of electronic texts. The main issues discussed are:
identification of electronic texts, description, location and access. The
author extensively describes the earlier OCLC Internet Resources Project
Cataloging Experiment and its involvement in the study of the possibilities
of accommodating online information resources in USMARC formats. Librarians
are interested in placing data about electronic resources in the same type
of database they use for other library materials, that is, USMARC.
Consequently, the author mentions the different proposals that emerged from
this experiment, as well as from other projects initiated by organizations
such as the USMARC Advisory Group of the American Library Association. One
of the most important of these proposals was the addition of the 856 field,
which, in the case of Internet resources, would supply the connection between
the bibliographic record and the text itself. Because electronic texts are
complex objects, AACR2 rules for computer files are scrutinized in an
attempt to identify better ways of describing the various forms in which an
electronic text can appear. In the end, the author discusses the
misconception that SGML would replace USMARC and defines their different
functionalities. --AI
* Harrison, A.D., Roos, F.A. & Thomas, R.E. (February, 1995).
"(Semi)automatic capturing of bibliographic information from journal
contents pages for inclusion in online library catalogues: the RIDDLE
Project." _Electronic Library_, vol. 13, no. 1: 15-19.
A summary of the RIDDLE (Rapid Information Display and Dissemination in a
Library Environment) Project (available on the Web at
<http://www.cwi.nl:80/cwi/projects/riddle.html> and at
<http://web.inf.rl.ac.uk/proj/riddle.html>), an international endeavor
funded by the Commission of the European Communities' (CEC) Telematics
Research and Technological Development Programme. The project involved "a
feasibility study of the use of scanning technology to capture the contents
pages of scientific journals, extract the bibliographical information of
the article and load this data into an online library catalogue (OLC)."
There are tables of the SGML tags chosen for this work as well as a sample
of results of marking a particular journal, and consideration of how easily
SGML translates into the different catalogue interface packages in the
European countries. Also included are formulas for computing the cost
effectiveness of such a project. --MM
* Hockey, Susan. (1994). "Electronic Texts in the Humanities: A Coming of
Age." In Sutton, Brett, ed. _Literary Texts in an Electronic Age: Scholarly
Implications and Library Services_. Graduate School of Library and
Information Science, University of Illinois at Urbana-Champaign, pp. 21-34.
This article provides a brief historical account of the progress of
electronic texts in humanities research as well as a concise overview of
applications in literary computing, "including concordances, text
retrieval, stylistic studies, scholarly editing, and metrical analyses."
The author reviews developments in electronic texts today and the steps
forward in text preparation that the Text Encoding Initiative's
_Guidelines_ make possible. Finally, the author looks to the future of
texts marked up in TEI-conformant SGML and the development of better
analysis tools that take advantage of the expertise of natural language
understanding systems, as well as digital imaging technology.
* Johnson, Eric. (1994-1995). "Oxford Electronic Text Library Edition of
the Complete Works of Jane Austen," _Computers and the Humanities_, vol.
28, pp. 317-321.
This review of the OETL electronic edition of the Complete Works of Jane
Austen provides an introduction to a new kind of resource for libraries:
the full-text CD-ROM of an author's oeuvre. To demythologize this new
beast, Johnson shows its face, including examples of a page of SGML-tagged
text and the same page formatted to hide the tags. In addition, the author
shows how to use such a text, though Johnson notes that the analyses of the
texts produced for his review were generated by programs that he wrote
himself; however, "since the texts are encoded with SGML, they should be
able to be used with software designed to process SGML -- such as
_Intellitag_ (from WordPerfect) or _DynaText_ (from Electronic Book
Technologies)." The article also includes sample output from a simple
query looking at the various characters' speech patterns. --MM
* Kiernan, Kevin. (February, 1995). "The Electronic Beowulf." _Computers
in Libraries_ vol. 15, no. 2: 14-15.
The manuscript of the Old English epic _Beowulf_ has long been the center
of dispute among textual scholars who would like to fill the lacunae left
by the flames of the damaging fire the manuscript survived in 1731. Now,
through digitization and the coordination of a team of experts from
libraries, computer science, math and English departments in Europe as well
as the United States, some answers are being found. Kiernan gives a quick
history and overview of the Beowulf Project at the University of Kentucky
and the British Library (now centered in the Richard Rawlinson Center for
Anglo-Saxon Studies and Research). One can view a Mosaic presentation of
the project at URL: <http://www.uky.edu/~kiernan/welcome.html>. --MM
* Lane, Anthony. (February 20 & 27, 1995). "Byte Verse: How to wing your
way through thirteen hundred years of English Poetry in an afternoon of
interfacing." _The New Yorker_, pp. 102-117.
Despite its title, this article provides more than an afternoon lark
through Chadwyck Healey's _English Poetry Database_ at the New York Public
Library. Lane is a keen observer. He speculates "on the sort of person
who would really _need_ 'English Poetry,'" and he's thinking past the joy
of follow-that-theme to its implications: "Once you have a printout of
your sleep-meets-death findings, the onus is then on you, as never before,
to wonder what on earth they might mean; the computer hasn't a clue."
Lane's analysis of the database includes a short history from idea to
actual transcription (in the Philippines). Also, he notes what poetry is
available on the disk, as well as what's not (this is his list; others
might add to it): no Shakespeare plays, no hymns published after 1800, no
American poetry, and no verse from this century. Finally, you might read
this article to experience the view from inside the head of a novice
user of tools for electronic text access. --MM
* Lowry, Anita. (1994). "Electronic Texts and Multimedia in
Academic Libraries: A View from the Front Line." In Sutton,
Brett, ed. _Literary Texts in an Electronic Age: Scholarly
Implications and Library Services_. Graduate School of Library and
Information Science, University of Illinois at Urbana-Champaign,
pp. 57-66.
Acting on the premise that both graduate and undergraduate students
could benefit from exposure to electronic texts and hypermedia
databases that link primary source materials, the University of Iowa
Library created the Information Arcade, which consists of an electronic
classroom for instructor-directed learning and a laboratory. The
laboratory includes an information stations cluster for viewing
preexisting information resources and a multimedia cluster for
the creation and manipulation of electronic material. Experience with
various classes suggests that all participants, including undergraduates,
are strongly motivated to do research and to create materials.
Problems include the multiple and proprietary platforms of some
databases and the demands on staffing that the creation of source
materials requires. --JVC
* Mathiesen, Thomas J. (1994). "Transmitting Text and Graphics in Online
Databases: The _Thesaurus Musicarum Latinarum_ Model." _Computing in
Musicology_, 9: 33-48.
This article provides a full description of the _Thesaurus Musicarum
Latinarum_ (TML), "an evolving database that will eventually contain the
entire corpus of Latin music theory written during the Middle Ages and the
early Renaissance." It is a unique model for electronic text transmission
because the database includes musical notation as well as ASCII text, so
that the choice of graphics formats and a transmission protocol with the
least amount of corruption was paramount. Mathiesen documents the
decisions for data capture and verification with OCR software as well as
the delivery and structure of the database through a gopher server
<gopher://iubvm.ucs.indiana.edu/11/tml>, a listserv, TML-L (subscribe
through listserv@iubvm.ucs.indiana.edu), and an ftp site, TML-FTP
(available at ftp 129.79.1.10, password is "themulat"). The Appendices
contain the "Principles of Orthography" for the database as well as the
"Table of Codes for Noteshapes and Rests." --MM
* McMahon, Kenneth. (March, 1995). "BUBL BITS: Investigating the
Computers in Teaching Initiative (CTI) WWW Services." _Computers
in Libraries_ vol. 15, no. 3: 53-54.
There are twenty subject-oriented CTI centers in the UK, each of
which supports the use of computers in teaching at the higher education
level. The electronic resources of each center are available on WWW
servers and accessible through a common interface (the BUBL WWW Subject
Tree at URL: BUBL), which is located
at the University of Bath. Resources include reports, bibliographies, full
text journals, and courseware. --JVC
* Olson, Nancy B. (1995). _Cataloging Internet Resources: A Manual and
Practical Guide_. OCLC Online Computer Library Center, Inc. Available via
anonymous ftp at URL:
ftp://ftp.rsch.oclc.org/pub/internet_cataloging_project/Manual.txt
Nancy Olson's manual for cataloging Internet resources represents the
result of a collective effort to identify how well AACR2 and USMARC can
describe the specific characteristics of Internet resources.
Originally, it was initiated within a "... nationwide, coordinated effort
among libraries and institutions of higher education to create, implement,
test, and evaluate a searchable database of USMARC format bibliographic
records, complete with electronic location and access information, for
Internet-accessible materials." These guidelines have been developed in
support of the OCLC project participants undertaking the difficult task of
cataloging a new type of bibliographic resource located on the Internet in
the form of electronic texts. --AI
* Price-Wilkin, John. (1994). "The Feasibility of Wide-Area
Textual Analysis Systems in Libraries: A Practical Analysis." In
Sutton, Brett, ed. _Literary Texts in an Electronic Age:
Scholarly Implications and Library Services_. Graduate School of
Library and Information Science, University of Illinois at
Urbana-Champaign, pp. 113-136.
After recounting early efforts at Chicago, Dartmouth, Michigan
and Virginia, Price-Wilkin identifies and discusses the
fundamental characteristics of wide-area textual analysis
systems: very precise searching at high speeds, the ability to
show keywords in context and to qualify searches by structural
characteristics, statistical generating capability, and
grammatical analysis. To be useful in a wide area environment,
texts must be reusable, observe standards for encoding (i.e.,
SGML, TEI), and be accessible from multiple user platforms.
While an increasing number of texts are available from both
academic and commercial sources, some are compromised by the
choice of edition used, limited markup, poor transcription, or
the lack of flexibility in licensing. Open Text Systems' PAT is
evaluated as a server platform that meets many of the needs of a
wide-area textual analysis system. Examples of its use are given
at the University of Michigan and the University of Virginia. An
appendix discusses document structure and the need for protocols
and a standard query language that are aware of structure. --JVC
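
For readers unfamiliar with the "keywords in context" display mentioned above, the following is a minimal illustrative sketch in Python. It is not taken from Price-Wilkin's article and says nothing about how PAT works internally; the function name `kwic`, its parameters, and the sample sentence are all assumptions made for the example.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match.

    Hypothetical helper for illustration only: each line shows up to
    `width` characters of left context, the matched word in brackets,
    and up to `width` characters of right context.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Sample text invented for the example.
sample = "To sleep, perchance to dream. Death and Sleep are brothers."
for line in kwic(sample, "sleep", width=12):
    print(line)
```

A real textual analysis system would work from an index rather than scanning the text on every query, and could restrict matches by structural markup (for example, only within verse lines); this sketch shows only the display convention.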
* Price-Wilkin, John. (1994). "Using the World Wide Web to Deliver
Complex Electronic Documents: Implications for Libraries." _The
Public-Access Computer Systems Review_ 5, no. 3: 5-21. (Or
e-mail the command GET PRICEWIL PRV5N3 F=MAIL to
LISTSERV@UHUPVM1.UH.EDU.)
At the University of Virginia the products of several scholarly
projects in literature and history were converted into HTML so
that they would be readily available over the World Wide Web.
Unfortunately, HTML's inability to reflect the structure of
complex documents compromised the efforts. Price-Wilkin found a
better solution by developing a common gateway interface (CGI)
from the Web to an SGML-based server. This provided a simpler
user interface for complex information retrieval, took advantage
of a sophisticated retrieval engine (Open Text's PAT), and
enabled the elements and relationships of complex SGML-encoded
documents to be represented without fragmenting them and without
abandoning the standards that were used in their creation. --JVC
* Schwartz, Lillian. "The Art Historian's Computer: Riddles Posed
by Ancient Works Fall to Historical Analyses and Electronic
Explorations." _Scientific American_ vol 272, no. 4: 106-111.
By using computer graphics to scale and juxtapose images, Schwartz
has been able to shed light on the sources of famous portraits
such as the Mona Lisa and that of Shakespeare. She has also used
computer graphics to show how certain paintings relate to the
environment for which they were created. --JVC
* Seaman, David. (1995). "Campus Publishing in Standardized Electronic
Formats -- HTML and TEI." In Okerson, Anne, ed. _Filling the Pipeline and
Paying the Piper: Proceedings of the Fourth Symposium_. ARL Publications.
Also available at URL
<http://www.lib.virginia.edu/etext/articles/arl/dms-arl94.html>.
David Seaman, the Director of the University of Virginia Library's
Electronic Text Center, describes his Center's project of converting
documents marked up in TEI-conformant SGML into documents with hypertext
(html) markup for distribution over the World Wide Web via PAT. The author
notes the difference between documents that are suitable for html markup
(e.g., short guides and brochures) and documents that would
require more granular demarcation of their structures (e.g., finding aids,
full texts, sets of journal titles, encyclopedias, etc.). In addition,
Seaman documents how the Virginia project team embedded the TEI header into
their image files to maintain the record of the image's origins. The html
version of this article contains many links to useful sites for creating
html files, perl scripts for SGML-to-html conversion, and html documents
and image files that pertain to the article. --MM
* Sperberg-McQueen, C. M. (1994). "The Text Encoding Initiative:
Electronic Text Markup for Research." In Sutton, Brett, ed.
_Literary Texts in an Electronic Age: Scholarly Implications and
Library Services_. Graduate School of Library and Information
Science, University of Illinois at Urbana-Champaign, pp. 35-56.
The work of the Text Encoding Initiative (TEI) grew out of the
need to address the fundamental problems of representing and
sharing electronic texts: how to represent document structure,
how to link interpretive and auxiliary information, and how to
overcome the lack of a standard and extensible system of markup.
With support from professional associations and research centers,
TEI developed a system of SGML markup that culminated in the third
edition (P3) of 1994. P3 embodies a hierarchical document grammar, which
focuses on document structure rather than layout, defines a
concrete set of tags which may be mixed or extended as needed,
and, in requiring conformance to international standards, is
platform independent and non-proprietary in nature. A sample
text demonstrates its application. --JVC
**************************************************************************

If you would like to contribute to ETEXTCTR Review or recommend an article for review, write to Mary Mallery, Moderator of ETEXTCTR, at e-mail: <mallery@gandalf.rutgers.edu>.