The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Created: January 04, 2002.
News: Cover Stories

Harvard University Library Feasibility Study Recommends XML DTD/Schema for E-Journal Archives.

With financial support from the Mellon Foundation, the Harvard University Library's Office for Information Systems E-Journal Archiving Project commissioned a feasibility study to investigate the development of a common markup formalism that can be used to "reasonably represent the intellectual content (text, tables, formulas, still images, and links) of archived journal articles." The study was carried out by Inera Corporation, using input from ten publishers who were asked to provide their existing DTDs, documentation and sample SGML documents for analysis. The Inera investigative team reviewed the materials to determine if such a structure can be developed and to assess the challenges that would be faced in SGML transformation; they also examined the challenges faced by organizations that have worked with DTDs from multiple publishers. A 65-page report documenting the E-Journal Archival DTD Feasibility Study has now been published. The report recommends the creation of an XML DTD or Schema which "can be developed, allowing successful conversion of significant intellectual content from publisher SGML and XML files into a common format for archival purposes." The authors of the recommendation elected to defer the choice between XML DTD and W3C XML Schema for formal notation. The Harvard project team now "hopes to finalize the conceptual agreement with its publishing partners, to document technical development, operations, and staffing of the archive, and to refine the business model that will sustain the archive over time."

Bibliographic information: E-Journal Archival DTD Feasibility Study. Commissioned by the Harvard University Library, Office for Information Systems, E-Journal Archiving Project. Prepared by Inera Incorporated. December 5, 2001. 65 pages. Edited by Bruce Rosenblum (Inera). With contributions from Bob Hollowell (American Institute of Physics), Kristine Schnebly (BioOne), David Sommer and Richard O'Beirne (Blackwell Science), Karen Hunter and Jos Migchielsen (Elsevier Science), John Sack, Maureen Phayer, and Diana Robinson (Highwire Press), Stephen Cohen and Ken Rawson (IEEE), Y Kathy Kwan and Ed Sequeira (Pubmed Central), Howard Ratner and Heather Rankin (Nature Publishing Group), Evan Owens and John Muenning (University of Chicago Press), Margaret Wallace (John Wiley & Sons).

Summary from D-Lib Magazine: "In the Fall of 2001, under the auspices of a Mellon Grant to explore ejournal archiving, Harvard University Library contracted with Inera, Inc. to review a variety of DTDs from selected publishers. The study focused on two key questions: Can a common DTD be designed and developed into which publishers' proprietary SGML files can be transformed to meet the requirements of an archiving institution? If such a structure can be developed, what are the issues that will be encountered when transforming publishers' SGML files into the archive structure for deposit into the archive? The requirement of the archival article DTD was defined as ability to represent the intellectual content of journal articles."

Project Methodology: "Harvard and Inera selected ten DTDs for review... The goal was inclusion of a sufficient number of DTDs to allow most significant issues to be identified during the course of the study. All publishers asked to participate in this study accepted. They include: American Institute of Physics (AIP), BioOne (BioOne), Blackwell Science (Blackwell), Elsevier Science (Elsevier), Highwire Press (HWP), Institute of Electrical and Electronics Engineers (IEEE), Nature Publishing Group (Nature), Pubmed Central (PMC), University of Chicago Press (UCP), John Wiley & Sons (Wiley). All participating publishers were asked to submit the current version of their DTD, DTD documentation, and twenty to twenty-five sample document instances from multiple journals and issues... All of the reviewed DTDs owe their legacy, directly or indirectly, to the ISO 12083 Serial DTD..."

Status report of 2001-12-05 from DLF (via Marilyn Geller, Harvard Project Manager): "Harvard has completed a first round of business meetings and technical meetings with our publisher-partners, Blackwell, John Wiley, and University of Chicago Press. We have also received a report from Inera, Inc. on the feasibility of developing a common archival article DTD... The significant conclusions drawn from this study are that it is possible to create a common archival article DTD that would represent the intersection and the union of several existing publisher DTDs and that thorough documentation and quality assurance tools would be essential to insure that conversion is successful. Because this study has so much potential for resolving ingest, storage and delivery issues, it is being made available to the entire scholarly communications community. We are optimistic that this will encourage discussion and progress in the technical aspects of e-journal preservation. In the coming months, we hope to finalize the conceptual agreement with our publishing partners, document technical development, operations, and staffing of the archive, and refine the business model that will sustain this archive over time..."


Robin Cover, Editor