Date: Thu, 20 Mar 1997 08:03:09 +0000 (GMT)
Subject: 10.0796 new online: American texts; PMC; JCMC

Humanist Discussion Group, Vol. 10, No. 796.

[1] From: John Price-Wilkin <jpwilkin@umich.edu>
    Subject: Making of America Project at University of Michigan

------------------------------------------------------------------
The University of Michigan Digital Library is pleased to announce the availability of an extraordinary new electronic collection of American writing. A part of the Making of America project, these materials are a powerful demonstration of several pieces of digital library technology developed by the University of Michigan. Currently included in the UM online collection are some 200,000 pages of American publications from 1850 to 1900; by mid-year, the collection will extend to include approximately 650,000 pages, including several journals. The University of Michigan MOA collection is available at: http://www.umdl.umich.edu/moa/.
The Making of America project is a collaborative effort between Cornell University and the University of Michigan. Funded primarily by the Andrew W. Mellon Foundation, the project focuses on American social history from the antebellum period through Reconstruction. Cornell and Michigan are working to develop a distributed architecture to provide access to the two collections through a single interface at each institution. Materials currently available from Cornell may be found at http://moa.cit.cornell.edu/. Work is underway to facilitate cross-collection searching for the two efforts.
Digital Library Resources for the Humanities
The implementation at Michigan demonstrates a number of unique approaches
to building systems for access to scholarly resources. Capitalizing on
Cornell University's extensive experience in preservation-quality imaging,
pages were scanned as 600dpi TIFF images through a conversion bureau,
using specifications jointly written by Cornell and Michigan. In a
subsequent process designed by Digital Library Production staff at the
University of Michigan, a subset of the scanned pages was treated with
locally developed routines for automatic OCR. A relatively low level of
SGML encoding, following the TEI Guidelines, was applied to the OCR
output. This encoding is
used to hold bibliographic information, text, article-level information in
journals, and page references. It also serves as an extensible framework
as titles are identified for more thorough proofing and richer encoding.
Images are stored as high-resolution, preservation-quality 600dpi TIFF
images, and are rendered as GIFs at various levels of resolution in real
time.
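To make the idea of a lightly encoded text concrete, here is a minimal
sketch in Python of a structure that holds bibliographic information, OCR
text, and page references. It is an illustration only, not the encoding
used by the project: the header is heavily simplified compared with a real
TEI header, the "ref" attribute and the image file names are hypothetical,
and the standard library used here writes XML rather than SGML.

```python
import xml.etree.ElementTree as ET


def build_document(title, author, pages):
    """Build a lightly encoded document: a header plus one page break
    marker and one paragraph of OCR text per scanned page.

    pages: list of (page_number, image_file, ocr_text) tuples.
    Element and attribute names are illustrative assumptions.
    """
    root = ET.Element("TEI.2")

    # Bibliographic information (a real TEI header is far more detailed).
    header = ET.SubElement(root, "teiHeader")
    ET.SubElement(header, "title").text = title
    ET.SubElement(header, "author").text = author

    body = ET.SubElement(ET.SubElement(root, "text"), "body")
    for number, image_file, ocr_text in pages:
        # Page reference: ties the OCR text to its scanned page image.
        # The "ref" attribute is hypothetical.
        ET.SubElement(body, "pb", {"n": str(number), "ref": image_file})
        ET.SubElement(body, "p").text = ocr_text
    return root


if __name__ == "__main__":
    doc = build_document(
        "An Example Title",
        "An Example Author",
        [(1, "00000001.tif", "Uncorrected OCR text of page one."),
         (2, "00000002.tif", "Uncorrected OCR text of page two.")],
    )
    print(ET.tostring(doc, encoding="unicode"))
```

Because each page marker sits next to the text it covers, a title chosen
later for proofing and richer encoding could be enriched in place without
losing the link back to its page images.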
SGML-based Access Systems
We hope that users of the system will appreciate some of the functionality
developed through UM's nearly eight years of experience with deploying
SGML-based access and delivery systems. Attractive, easily navigated
displays of results showing the number of occurrences per page are
combined with displays of the page image, circumventing many of the
problems encountered when relying on OCR alone. As we have opportunities
to "clean up" and more richly encode OCR'd texts, the system will begin to
show dynamically rendered HTML with links to the page images. The
mechanisms used for the MOA system will be provided to participants in
UM's SGML Server Program (see http://www.hti.umich.edu/misc/ssp/).
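The per-page results display described above can be sketched as follows.
This is a minimal Python illustration under stated assumptions, not the
mechanism used by the MOA system: the OCR text is assumed to be held in
memory as a mapping from page number to text, and the function name is
hypothetical.

```python
import re


def occurrences_per_page(pages, term):
    """Return [(page_number, count)] for pages whose OCR text contains term.

    pages: dict mapping page number -> OCR text (assumed in-memory form).
    Matching is case-insensitive and on whole words only.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for number in sorted(pages):
        count = len(pattern.findall(pages[number]))
        if count:
            hits.append((number, count))
    return hits


if __name__ == "__main__":
    sample = {
        1: "The railroad and the Railroad company.",
        2: "No match on this page.",
        3: "Many railroads, one railroad.",
    }
    for number, count in occurrences_per_page(sample, "railroad"):
        print(f"page {number}: {count} occurrence(s)")
```

Pairing each hit with its page number is what lets a results list point
the reader to the page image, so that errors in uncorrected OCR affect
which pages are found but not what the reader finally sees.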
Next Steps
Development and design of the system continue. The current
implementation will be exhaustively vetted with focus groups of local
users, especially experts in the fields covered. We would also encourage
others to send comments and suggestions to moa-info@umich.edu. Also, as
time and resources permit, texts will be extracted from the system,
carefully proofed and corrected, and encoded at a much higher level of
SGML. These enriched resources will allow us to continue to improve
functionality in a number of different directions. For more information
about the Making of America project in general, and the Michigan
implementation in particular, please see:
http://www.umdl.umich.edu/moa/about.html