SGML: Making of America Project at University of Michigan

Date: 	 Thu, 20 Mar 1997 08:03:09 +0000 (GMT)
Subject: 10.0796 new online: American texts; PMC; JCMC
Humanist Discussion Group, Vol. 10, No. 796.
[1]   From:    John Price-Wilkin <>
      Subject: Making of America Project at University of Michigan

Making of America at the University of Michigan

The University of Michigan Digital Library is pleased to announce the availability of an extraordinary new electronic collection of American writing. A part of the Making of America project, these materials are a powerful demonstration of several pieces of digital library technology developed by the University of Michigan. Currently included in the UM online collection are some 200,000 pages of American publications from 1850 to 1900; by mid-year, the collection will extend to include approximately 650,000 pages, including several journals. The University of Michigan MOA collection is available at:

The Making of America project is a collaborative effort between Cornell University and the University of Michigan. Funded primarily by the Andrew W. Mellon Foundation, the project focuses on American social history from the antebellum period through Reconstruction. Cornell and Michigan are working to develop a distributed architecture to provide access to the two collections through a single interface at each institution. Materials are also currently available from Cornell. Work is underway to facilitate cross-collection searching for the two efforts.

Digital Library Resources for the Humanities
The implementation at Michigan demonstrates a number of unique approaches to building systems for access to scholarly resources. Capitalizing on Cornell University's extensive experience in preservation-quality imaging, pages were scanned as 600dpi TIFF images through a conversion bureau, using specifications jointly written by Cornell and Michigan. In a subsequent process designed by Digital Library Production staff at the University of Michigan, a subset of the scanned pages was treated with locally developed routines for automatic OCR. A relatively low level of SGML encoding, following the TEI Guidelines, was applied to the OCR output. This encoding is used to hold bibliographic information, text, article-level information in journals, and page references. It also serves as an extensible framework as titles are identified for more thorough proofing and richer encoding. Images are stored as high-resolution, preservation-quality 600dpi TIFF images, and are rendered to various levels of GIF in real time.
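
As an illustration of how lightly encoded text can tie OCR output to page images, here is a minimal sketch in Python. It is not the Michigan implementation. The sample fragment is invented, it is written as well-formed XML rather than true SGML so that the standard library can parse it, and the "ref" attribute on the page-break element and the function name pages_from_fragment are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, NOT taken from the MOA collection. The "ref"
# attribute pointing at a page image is an assumption for illustration.
SAMPLE = """<TEI.2>
  <teiHeader><fileDesc><titleStmt>
    <title>Sample Journal</title>
  </titleStmt></fileDesc></teiHeader>
  <text><body>
    <div1 type="article">
      <head>An Example Article</head>
      <pb n="1" ref="00000001.tif"/>
      <p>Raw OCR text for the first page.</p>
      <pb n="2" ref="00000002.tif"/>
      <p>Raw OCR text for the second page.</p>
    </div1>
  </body></text>
</TEI.2>"""


def pages_from_fragment(xml_text):
    """Map each page number to its image reference and OCR paragraphs."""
    root = ET.fromstring(xml_text)
    pages = {}
    current = None
    # iter() walks the tree in document order, so every paragraph is
    # assigned to the most recent page break seen before it.
    for elem in root.iter():
        if elem.tag == "pb":
            current = elem.get("n")
            pages[current] = {"image": elem.get("ref"), "text": []}
        elif elem.tag == "p" and current is not None:
            pages[current]["text"].append((elem.text or "").strip())
    return pages


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print("Title:", root.findtext(".//title"))
    print("Articles:", [d.findtext("head") for d in root.iter("div1")])
    for number, page in pages_from_fragment(SAMPLE).items():
        print(number, page["image"], page["text"])
```

Because only page breaks, headings, and paragraphs are marked, richer encoding could be added to individual titles later without changing how pages are looked up.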

SGML-based Access Systems
We hope that users of the system will appreciate some of the functionality developed through UM's nearly eight years of experience with deploying SGML-based access and delivery systems. Attractive, easily navigated displays of results showing the number of occurrences per page are combined with displays of the page image, circumventing many of the problems encountered when relying on OCR alone. As we have opportunities to "clean up" and more richly encode OCR'd texts, the system will begin to show dynamically rendered HTML with links to the page images. The mechanisms used for the MOA system will be provided to participants in UM's SGML Server Program.
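
The idea of reporting the number of occurrences per page can be sketched as follows. This is only an illustration in Python under simple assumptions: the page texts are invented, matching is whole-word and case-insensitive, and the function name hits_per_page is hypothetical. It does not describe the search engine used for the MOA system.

```python
import re


def hits_per_page(pages, term):
    """Return {page number: occurrence count} for pages containing term."""
    # Whole-word, case-insensitive match; the term is escaped so that
    # punctuation in it is not treated as regular-expression syntax.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    counts = {}
    for page_no, text in pages.items():
        found = len(pattern.findall(text))
        if found:
            counts[page_no] = found
    return counts


# Invented OCR text keyed by page number.
SAMPLE_PAGES = {
    1: "The railroad reached the town. The railroad changed it.",
    2: "No mention here.",
    3: "A new Railroad company was formed.",
}

if __name__ == "__main__":
    print(hits_per_page(SAMPLE_PAGES, "railroad"))  # {1: 2, 3: 1}
```

A results display could then list each matching page with its count and link to the corresponding page image, so that a reader sees the scanned page rather than uncorrected OCR.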

Next Steps
Development and design of the system continue. The current implementation will be exhaustively vetted with focus groups of local users, especially experts in the fields covered. We would also encourage others to send comments and suggestions. As time and resources permit, texts will be extracted from the system, carefully proofed and corrected, and encoded at a much higher level of SGML. These enriched resources will allow us to continue to improve functionality in a number of different directions. For more information about the Making of America project in general, and the Michigan implementation in particular, please see: