The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: May 30, 2003
NLM XML DTDs for Journal Publishing, Archiving, and Interchange

Overview

"The National Center for Biotechnology Information (NCBI) of the National Library of Medicine (NLM) created the Journal Archiving and Interchange Document Type Definition (DTD) with the intent of providing a common format in which publishers and archives can exchange journal content. This DTD was created from the Journal Archiving and Interchange DTD Suite, which provides a set of XML modules that define elements and attributes for describing the textual and graphical content of journal articles as well as some non-article material such as letters, editorials, and book and product reviews... NCBI/NLM created the Journal Publishing Document Type Definition (DTD) with the intent of providing a common format for the creation of journal content in XML. For journals that do not have an SGML/XML model selected, NCBI will encourage the use of this DTD to define the incoming data for PubMed Central, the U.S. National Library of Medicine's digital archive of life sciences journal literature." [homepage description 2003-05]

[May 30, 2003]   NLM Releases XML Tagset and DTDs for Journal Publishing, Archiving, and Interchange.    An announcement from the US National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM) describes the release of a Tagset and two XML DTDs designed to "simplify journal publishing and increase the accuracy of the archiving and exchange of scholarly journal articles. The Journal Publishing DTD and the Archiving and Interchange DTD have been created from the Archiving and Interchange Tagset, a set of XML elements and attributes that can be used to define many other types of documents, including textbooks and online documentation. The Tagset provides a set of XML modules that defines elements and attributes for describing the textual and graphical content of journal articles as well as some nonarticle material such as letters, editorials, and book reviews. The purpose of the Tagset is to preserve the intellectual content of journals independently of the form in which that content was originally created. The Tagset has been written as a set of XML DTD modules, each of which is a separate file. No module is a complete DTD by itself, but these modules can be combined to create any number of new DTDs." The NLM Tagset represents an open specification: the DTDs and the Tagset are in the public domain so that any organization wishing to create its own DTD from the Tagset may do so without permission from NLM. NLM is forming an XML Interchange Structure Advisory Board to assist in development and maintenance of the Tagset. An Archiving and Interchange Tagset Secretariat will collect feedback and will physically maintain the files and documentation.

Overview of NLM Journal Archiving and Interchange DTD Tag Library

The intent of this DTD Suite is to 'preserve the intellectual content of journals independent of the form in which that content was originally delivered'. The tags defined here will be used to describe journal articles that originate with many publishers and societies but whose content will be stored in repositories, such as the NLM PubMed Central repository. Therefore, the Suite has been optimized for conversion from a variety of journal source DTDs, with the intent of providing a single format in which publishers can deliver their content to a wide range of archives. There are so many journal DTDs currently in use by publishers, repositories, content-aggregators, scientific societies, and compositors that this Suite cannot possibly incorporate all the variation to be found in such diverse models. But a wide variety of structures can be accommodated, because the content models for the elements have been made very flexible, including a wide range of elements with nearly all structures optional.

The conversion focus also means that this is a larger, more inclusive DTD than might have been necessary if the intent had been, for example, to create only a journal-authoring DTD. Many elements have been created explicitly so that information tagged by publishers would not be discarded when they converted material from another DTD to an archival interchange or repository DTD created from this Suite. Because of the broad scope of the several proposed electronic archives, this Suite contains elements and attributes that may occur only in a very few journals. Attribute values that a particular DTD would restrict to a list of options were declared as data character values so that all options could be accepted. Care has been taken to provide several mechanisms (frequently information classing attributes) to preserve the intellectual content of a document structure when that structure is converted from another DTD or schema to this one, even if there is no exact element equivalent of the structure.
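
The attribute loosening described in the paragraph above can be sketched as follows. This is only an illustration: the element name `notice`, the attribute name `kind`, and its values are hypothetical and are not taken from the NLM modules. The two declarations belong to two different DTDs and would not appear in the same file.

```xml
<!-- Hypothetical source DTD: the attribute is restricted to an
     enumerated list, so any other value is invalid -->
<!ATTLIST notice
          kind   (correction | retraction | addendum)   #IMPLIED>

<!-- Hypothetical interchange-style DTD: the same attribute is declared
     as character data (CDATA), so a value from any publisher's list is
     accepted and nothing is discarded during conversion -->
<!ATTLIST notice
          kind   CDATA   #IMPLIED>
```

The trade-off is that a CDATA declaration gives up validation of the value in exchange for accepting every source vocabulary, which suits an archive but not necessarily an authoring environment.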

Modular DTD Design: The Archiving and Interchange DTD Suite has been written as a set of XML DTD 'modules', each of which is a separate physical file. No module is an entire DTD by itself, but these modules can be combined into a number of different DTDs, for example, both an Archival and Interchange DTD and an Archival Repository DTD. Modules are primarily intended for maintenance; all the elements of the same 'type' (class) are stored together... There are many advantages to such a modular approach. The smaller units are written once, maintained in one place, and used in many different DTDs. This makes it much easier to keep lower-level structures consistent across document types, while allowing for any real differences that analysis identifies. A DTD for a new function (such as an authoring DTD) or a new publication type can be built quickly, because most of the necessary components will already be defined in the DTD Suite. Editorial and production personnel can bring the experience gained on one tagging project directly to the next with very little loss or retraining. Customized software (including authoring, typesetting, and electronic display tools) can be written once, shared among projects, and modified only for real distinctions... [from the Introduction]
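
A minimal sketch of the modular approach, assuming the usual XML mechanism of external parameter entities: a small "driver" file declares each module and then references it, which pulls the module's declarations into one complete DTD. The file names, entity names, and the `review` element below are hypothetical and do not reproduce the actual NLM module names.

```xml
<!-- Hypothetical driver file: a complete DTD assembled from shared modules -->

<!-- Declare each module (a separate physical file) as an external
     parameter entity -->
<!ENTITY % common-elements SYSTEM "common-elements.ent">
<!ENTITY % list-elements   SYSTEM "list-elements.ent">

<!-- Reference each entity to include that module's declarations here -->
%common-elements;
%list-elements;

<!-- Declarations specific to this document type; title and para are
     assumed to be declared in the common-elements module -->
<!ELEMENT review (title, para+)>
```

A second driver file could reference the same two modules and declare a different top-level element, which is how one set of modules can serve several DTDs while being maintained in one place.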

Overview of NLM Journal Publishing Tag Library

The Journal Publishing DTD defines a document type for journal articles and some non-article journal material such as product and book reviews, editorials, and letters to the editor. The DTD was written to describe both the metadata for a journal article and the content of the article, but it can also describe just the article header metadata. This is a prescriptive DTD, optimized for the authoring and initial XML tagging of journal material. Although designed for biomedical journals, this DTD should be sufficiently general to describe not only STM journals but technical journals in any field.

The DTD was constructed using the modules of the Archiving and Interchange DTD Suite and has been modeled along the same philosophical lines as the Journal Archiving and Interchange DTD, which is a DTD for interchange and storage of journal material. However, because this is a publishing DTD optimized for the creation of new material, the DTD is far smaller (fewer elements, and fewer choices in many contexts) than was the full Journal Archiving and Interchange DTD. Where, in the interchange DTD, there may have been several ways to express the same information, only one way is provided for this publishing DTD. It was not the intention to limit the expressive power licensed by this DTD but rather to limit the meaningless choices that a full interchange DTD needs to make conversion from a wide variety of formats as easy as possible. The philosophy for the interchange DTD was to accept as many varied forms of many structures as possible. The philosophy of this DTD is to prefer a single structural form, or at least a single style of tagging. [from the Introduction]
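
The contrast between the two philosophies can be illustrated with a pair of content models. This is a sketch under stated assumptions: the element `pub-info` and its children are hypothetical, chosen only to show the difference in style, and the two declarations belong to two different DTDs rather than to one file.

```xml
<!-- Hypothetical interchange-style model: several alternative forms,
     repeatable and in any order, so that whatever a source DTD recorded
     can be carried over -->
<!ELEMENT pub-info (date-string | (day?, month?, year) | season)*>

<!-- Hypothetical publishing-style model: one prescribed form, so authors
     and taggers are not asked to choose between equivalent encodings -->
<!ELEMENT pub-info (day?, month?, year)>
```

Any document valid against the second model is also valid against the first, which reflects the idea that the publishing DTD narrows choices rather than adding new structures.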

Principal URLs

Articles, Papers, News

  • [June 2003] At OpenPublish 2003: "New Public Domain Journal Article Archiving and Interchange DTDs," presented by Jeff Beck (Technical Information Specialist, National Center for Biotechnology Information, United States). "The National Library of Medicine, with Mulberry Technologies and Inera, has released public domain DTDs and a modular DTD suite for journals. The presentation covers the intent/extent of the DTDs/suite, who might use them, why, and how this impacts STM publishing."


Hosted By
OASIS - Organization for the Advancement of Structured Information Standards


Document URI: http://xml.coverpages.org/nlmJournals.html
Robin Cover, Editor: robin@oasis-open.org