[This local archive copy is from the official and canonical URL, http://www.mediacenter.org/NML-NITF.htm; please refer to the canonical source document if possible.]
Proposal to incorporate
the NML tag initiative into
the NITF DTD
Revised March 5 version
February 25, 1999
Glenn Cruickshank, Director, Tribune Solutions
The Salt Lake Tribune, Salt Lake City, UT USA
Index
Introduction | History | Methodology
Publication tagging | Workflow_issues | Element_identification | Archiving issues
Appendix: Tag descriptions
<assignment> <bibliography> <correction> <pubdata> <date.published> <editor-list> <editor> <edition> <first-character> <front-page> <issue> <lead> <size> <summary> <package> <page> <start-page> <publication-title> <publication-number> <rest> <section> <no-run> <volume> <orgid> <zone> <docdata>
In January of 1999, the American Press Institute hosted a Media Center Grammar Conference in Dallas, TX, to discuss the need for a news markup language within the newspaper industry. This group, consisting of representatives from a number of US newspaper chains, universities and API, developed a list of 40 tags which they felt were needed to identify journalistic content in news stories.
Members of this first group, called the API Grammarians, enlarged on this tag set and developed a preliminary tag proposal and partial DTD. On February 16, 1999, in Atlanta, GA, they met with a group from the NAA Wire Service Committee and several other systems vendors and discussed whether to merge the NML effort into the NITF DTD.
The author, a member of the NAA committee, was asked to align the NML tags with the NITF standard, and propose changes to the NITF to include the intent of the NML tags that were not supported in the current release of NITF. This document is that merger proposal.
In 1992, under the auspices of the NAA and the International Press Telecommunications Council, representatives from several news organizations began working on "an industry standard for the interchange of textual material between news agencies and their clients" which would replace the aging, non-Y2K-compliant ANPA 1312 wire transmission format. This work continued through the 1990s, and additional news organizations joined the process. While the wire services, AP, Reuters and UPI, led the process domestically, U.S. participants included representatives from RTNDA, Special Libraries Association, The New York Times, Chicago Tribune, Miami Herald, Apalachicola (FL) Times, Dow Jones, Newark Star Ledger, Lewiston (ID) Tribune, Lexis-Nexis and others, including systems vendors.
This group also worked closely with international news organizations and newspapers, under the umbrella of IPTC, the international standards organization.
The original effort was to create an SGML-based language. In 1998, with the explosive growth of the World Wide Web, the base language was changed from SGML to a derivative, XML. The goal remained to create "a device-independent format for textual and tabular information within the global news industry" and "to mark up text once for a variety of uses, including traditional print publications, broadcast news, and electronic services such as Web sites and archival databases." This markup language was named News Industry Text Format, or NITF.
The US committee working on NITF is named the NAA Wire Committee and, as such, has focused primarily on wire service issues. The underlying NITF framework, however, was designed to allow additional tagging for downstream content creation and identification. According to the NAA's John Iobst, there have always been plans in the NITF development to create adjuncts for archiving, workflow, markup and multimedia.
The emergence of the API Grammarians' work has spurred work on those adjuncts to NITF.
The Grammarians proposed an original tag set of about 40 tags. The author realized that this tag set was insufficient for a number of workflow and re-purposing uses, so he analyzed a number of standards used by the online industry.
The author reviewed the VISF standard from Lexis-Nexis, the SIF format from MediaStream, Dialog-B, a Dow Jones data transmission standard, a modified ANPA 1312 transmission format created by the Washington Post and America Online's Rainman format. The author also reviewed a number of production system markup tagging schemes from Atex, Harris, CText, ACT, Dewar, SII, DTI and several Word-based systems. The author included data from a news assignment management system and from several photo archive systems. Last, the list server for the Special Libraries Association News Division was notified of this work, and the roughly 1,000 subscribers to that list were asked to suggest any additional tags they felt had been left out.
The author developed a preliminary tag list consisting of the Grammarians' tags and the tags uncovered during his review of other standards. Next, he tried to describe those tags using existing NITF tags and markup. In the majority of cases, NML tags were already supported in NITF.
Where NITF did not have elemental support for certain NML tags, the author has proposed structures, elements and attributes to support those NML tags within NITF.
Highlights
Not surprisingly, NITF did not overlap NML in a number of instances, most notably in the areas of publication, workflow and archiving-related tagging. The following solutions have been proposed:
Print publications use a number of methods to identify a specific publication product or cycle. These include the publication number, volume, issue, publication title, publishing date, edition, zones of distribution and page. A document may be published several times, so a mechanism needs to be created to record publishing information for each instance of publication. The solution outlined by the author uses a <pubdata> container within the <head> of the document to support multiple instances of publication tags. Each <pubdata> instance has an event attribute which is a number from 1 to n. For each event, users can assign specific volume, issue, edition, publication title, zone, etc., attributes.
For print publications, the page number issue can be quite complex if the publication uses "jumps", or continuations of the story from one page to another. In the case of multiple publications, a particular paragraph of text may appear on one page in one usage and another page in a second usage. The author proposes an optional <page> tag within blocks of text. Key to this scheme is the pubdata-element attribute of the <page> tag, which ties the page identifier for that block of text to a specific publishing event listed in the header.
The third part of solving the page problem is to create a <start-page> element within the <pubdata> structure. For stories which start and end on the same page, the application needs only to register information in this tag and can skip writing any <page> tags within blocks.
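The three pieces described above (the <pubdata> container, the optional <page> tag and the <start-page> element) can be sketched together as follows. This is an illustrative fragment only. It assumes the element and attribute names declared in the appendix (event on <pubdata>, pubdata-element and value on <page>); the dates, zones and page numbers are invented for the example, and the rest of the NITF document is omitted.

```xml
<nitf>
  <head>
    <!-- Event 1: the story starts on 1A and jumps to 4A -->
    <pubdata event="1">
      <publication-title value="The Salt Lake Tribune"/>
      <date.published norm="19990225"/>
      <edition value="late city final"/>
      <start-page value="1A"/>
    </pubdata>
    <!-- Event 2: a zoned reprint that starts and ends on one page,
         so <start-page> alone is enough and no <page> tags refer to it -->
    <pubdata event="2">
      <publication-title value="The Salt Lake Tribune"/>
      <date.published norm="19990226"/>
      <zone value="suburban"/>
      <start-page value="3B"/>
    </pubdata>
  </head>
  <body>
    <body.content>
      <block>
        <!-- pubdata-element="1" points back to <pubdata event="1"> -->
        <page pubdata-element="1" value="1A"/>
        <p>Opening paragraphs, printed on the front page in event 1.</p>
      </block>
      <block>
        <page pubdata-element="1" value="4A"/>
        <p>Continuation text, printed after the jump in event 1.</p>
      </block>
    </body.content>
  </body>
</nitf>
```

An application reading this fragment would resolve each <page> tag by matching its pubdata-element value against the event attribute of a <pubdata> entry in the header.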
A second publication issue is size. The NITF DTD does not contain a vehicle for indicating the size of the document. This issue is important to editors and researchers alike. A <size> element has been proposed for the <docdata> group. Users can use a variety of measure and value pairs to indicate word count, character count, page count, and/or length (depending on certain print parameters).
NITF has a mechanism for tracking the delivery of the document in the <del-list> container. A requested NML tag is editor, which relates to workflow. Just as it is important to track the flow of a document from one delivery agent to the next, it is important in newsrooms to track the editing history of a document through multiple systems. The solution the author proposes is an <editor-list> container within the <docdata> structure. The <editor-list> container can hold an unlimited number of editor names. Each name has a time-stamp attribute that can be used to track the flow of the document.
Photo librarians have long asked to be able to search photo assignment information in order to find background about a photo that may not have been reflected in any caption. While this could be considered bibliographic material, it is really different. The contact information contained in news assignments has similar value over time. The author proposes a single <assignment> tag in the document header. This tag contains <a.meta> elements, whose name and content attributes allow free-form, site-specific storage of assignment information.
In a formal sense, though, there is need for a mechanism to store bibliographic information. The author proposes a <bibliography> tag within the <body.end> framework. This tag can contain <block> information, which would allow the full range of formatting and content identification tags.
In both the print markup process and the archiving process, it is often important to identify blocks of text by type. Several common identifiers are "lead" and "summary" (also known as the nut graf in the U.S.). Most online archives allow users to search just the lead of a story, and web publishing of menu or index pages often uses the lead of the story. Summary paragraphs often have different markup, and are extracted from the story to build brief digests. In some cases, a document will have blocks of text which have been edited and released for publication, but are not printed in some media for space or other reasons. These blocks of text can be identified as "no-run" blocks. Last, since we can identify some blocks as lead and summary, we have to have a way to identify blocks that aren't. MediaStream has a tag called "rest". That would be the default block tag.
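As a rough sketch of how the four block types might sit together in one story body, assuming the blocktype attribute proposed in the appendix (items 11, 13, 19 and 21); the story text is invented for illustration:

```xml
<body.content>
  <!-- Lead: searchable on its own and reusable on index pages -->
  <block blocktype="lead">
    <p>The city council voted Tuesday to approve a new transit plan.</p>
  </block>
  <!-- Summary (nut graf): may be extracted to build digests -->
  <block blocktype="summary">
    <p>The plan adds three bus routes and is expected to take effect next year.</p>
  </block>
  <!-- No blocktype given, so the proposed default of "rest" would apply -->
  <block>
    <p>Council members debated the plan for two hours before the vote.</p>
  </block>
  <!-- Edited and released, but not printed in some media -->
  <block blocktype="no-run">
    <p>A full list of the new routes is available from the transit office.</p>
  </block>
</body.content>
```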
A second textual element that is used during the markup process is the "first character" of a block of text. Many publications apply stylistic markup to this character. Others replace the character altogether with a graphic element. While style sheets handle many of these processes, there remains a requirement to tag this element. Consider it similar to the <em> - emphasis tag.
Corrections to a document can occur throughout its life, from the author to a final resting place in the archive or beyond. The <correction> tag in NITF v2.0 provides only for basic text, carried as an attribute. Corrections are richer than that and often contain important content that is of great value at a later date. Often it is the <person> in the correction who is most likely to litigate on the basis of some earlier error. Corrections should have all the potential attributes that they would have as part of the body of the original document. They should also carry a date that could indicate an after-the-fact correction.
As editors assemble multiple documents into a news package, there is a need to link those documents together for later retrieval. While the <series> tag works for certain serial documents, it is not general enough to link together a wide variety of package components. The author proposes a <package> tag within the <docdata> component to allow for a package name and a thread number to indicate the relationship of the various components of the package with each other. The thread ID could be a serial number or a document ID number.
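A minimal sketch of two separate documents linked through the proposed <package> tag, assuming the name and thread attributes declared in the appendix (item 14). The id-string values and thread numbers are hypothetical, and only the relevant part of each document's <docdata> is shown.

```xml
<!-- Document 1: the main story of the package -->
<docdata>
  <doc-id id-string="hurricane-main"/>
  <package name="hurricane coverage" thread="1"/>
</docdata>

<!-- Document 2: a sidebar belonging to the same package -->
<docdata>
  <doc-id id-string="hurricane-sidebar"/>
  <package name="hurricane coverage" thread="2"/>
</docdata>
```

A retrieval system could then gather every document whose package name matches and order the results by thread.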
The following table illustrates how the NML tags can be expressed as NITF tags. The Apx column refers to the item number in the appendix.
Implementation of NML into NITF
NML field | Apx | NITF markup
Abstract |  | <BLOCK.HEAD><ABSTRACT>
Accession Number |  | <DOCDATA><DOC-ID REGSRC="private" id-string="accession"/>
Assignment | 1 | <ASSIGNMENT><a.meta>
Bibliography | 2 | <BODY.END><BIBLIOGRAPHY>
Byline/Author |  | <BODY.HEAD><BYLINE.PERSON><BYLINE.BYTTL>
Column Name |  | <HEAD><DOCDATA><FIXTURE FIX-ID="column name"/>
Company |  | <ORG>
Contact (press release) |  | <DISTRIBUTOR>
Copyright |  | <BODY.CONTENT><COPYRIGHT.HOLDER><COPYRIGHT.YEAR>
Correction | 3 | <HEAD><DOCDATA><CORRECTION><BLOCK><DATE.RELEASE>
Country |  | <LOCATION><COUNTRY>
Credit |  | <BODY.HEAD><BYLINE.BYTAG> or <DISTRIBUTOR>
Date - Advance |  | <HEAD.DOCDATA><DATE.RELEASED>
Date Authored | 7 | <HEAD><PUBDATA><EDITION value="edition"/>
Editor | 6,25 | <EDITOR-LIST><EDITOR NAME="name" NORM="timestamp"/>
Editor's Notes (non-published) |  | <HEAD><DOCDATA><NOTE.TYPE><ED-MSG="editors note"/>
Editor's Notes (published) |  | <BODY><BODY.CONTENT><NOTE noteclass="editorsnote">
First Character of Text | 8 | <BODY><BODY.CONTENT><BLOCK><FIRST-CHAR>
Front Page | 4,9 | <HEAD><PUBDATA><front-page value="yes"/>
Geographic |  | <DOC-SCOPE scope="geographic area"/>
Headlines |  | <BODY.HEAD><HEDLINE><HL1><HL2><H1>-<H8>
Illustration Caption |  | <BLOCK.CONTENT><PHOTO><CAPTION>
Illustration Creator |  | <BLOCK.CONTENT><PHOTO><PRODUCER>
Industry | 23 |
Lead | 11 | <BODY><BODY.CONTENT><BLOCK blocktype="lead">
Length/Word Count | 12,25 | <HEAD><DOCDATA><SIZE measure="words" val="size">
Memo |  | <BODY><BODYHEAD><NOTE.TYPE><NOTE.NOTECLASS=editorsnote>
Nut Graf / Summary | 13 | <BODY><BODY.CONTENT><BLOCK blocktype="summary">
Organization |  | <ORG>
Package ID | 14,25 | <HEAD><DOCDATA><PACKAGE NAME="package name" thread="thread id">
Page | 15,16 | <BLOCK><PAGE pubdata-element="1" value="2A"/>
Person |  | <PERSON><NAME.CONTENT><NAME.GIVEN><NAME.FAMILY><FUNCTION>
Poster Heads/Decks |  | <BODY.HEAD><HEDLINE><H1>- <H8>
Priority |  | <HEAD><DOCDATA><URGENCY ed-urg=1-9>
Product | 22 | <ORG><ORGID idsrc="PRODUCT" value="name"/>
Publication |  | <BODY.HEAD><DISTRIBUTOR.ORGID.VALUE="publication"/>
Publication title, by event | 17 | <HEAD><PUBDATA><pub-title="publication title"/>
Publication Number | 4,18 | <HEAD><PUBDATA><publication-number="number"/>
Pull Quotes |  | <BODY><BODY.CONTENT><BQ><heading,block,credit>
Region |  |
SIC code | 23 | <ORG><ORGID idsrc="SIC" value="symbol"/>
Slug |  | <HEAD><DOC-ID ID-STRING="slug"/>
Source |  | <BODY.HEAD><DISTRIBUTOR>
State |  | <LOCATION><STATE>
Statistical Code | 23 | <ORG><ORGID idsrc="STATISTIC" value="code"/>
Story Type |  | <HEAD><TOBJECT.PROPERTY PROPERTYLIST="type">
SubHeads |  | <BODY.HEAD><HEDLINE><H1>- <H8>
Subject |  | <HEAD><DOCDATA><TOBJECT.SUBJECTLIST>
Text |  | <BODY.CONTENT>
Text that didn't run | 21 | <BODY><BODY.CONTENT><BLOCK blocktype="no-run"/>
Thread ID | 14,25 | <HEAD><DOCDATA><PACKAGE NAME="package name" thread="thread id"/>
Ticker Symbol | 23 | <ORG><ORGID idsrc="NYSE" value="symbol"/>
time stamp |  | <DOCDATA><DATE.ISSUE NORM=mm/dd/ccyy hh:mm:ss />
Version |  | <HEAD><DOCDATA><DU-KEY generation= part= version=/>
Volume | 4,22 | <HEAD><PUBDATA><volume value="volume"/>
Where |  | <BODY><HEAD><DATELINE><LOCATION>
Zones | 4,24 | <HEAD><PUBDATA><ZONE value="zone">?
This appendix contains detailed descriptions of the additions and changes to the NITF v2.0 DTD needed to incorporate the proposed NML tag set.
1. <assignment> - assignment information
Container to hold source and assignment information related to the creation of the document.
Content model:
The element <assignment> contains zero or more <a.meta> elements.
Attributes:
None
Tag Source
NITF
Usage Example
<assignment><a.meta name="assignmentname" content="City Council Meeting"/></assignment>
XML Element and Attribute Declarations:
<!ELEMENT assignment (a.meta*)>
Parent:
head
<a.meta> - assignment meta information
A tag to hold assignment meta information.
Content model:
The <a.meta> element is defined as empty, meaning that it contains no content.
Attributes:
name, content
Tag Source
NITF
Usage Example
<a.meta name="assignmentname" content="City Council Meeting"/>
XML Element and Attribute Declarations:
<!ELEMENT a.meta EMPTY>
<!ATTLIST a.meta
name NMTOKEN #IMPLIED
content CDATA #REQUIRED>
Parent:
assignment
2. <bibliography> - Bibliographic data
A method to include general bibliographical data that the author used in creating or researching a story.
Content Model:
The <bibliography> consists of one or more blocks of data.
Attributes:
None
Tag Source:
NITF
Usage Example:
<body.end><bibliography><block><h1>Anatomy of a Wire Story II/Data Transmission Guidelines</h1><p><org>Radio-Television News Directors Association</org></p></block></bibliography></body.end>
XML Element and Attribute Declarations:
<!ELEMENT bibliography (block+)>
Parent:
body.end
3. <correction> - correction or clarification data
While wire service data contains correction or clarification data along with the original story, news content corrections may appear at a later time. Corrections can also contain a number of data types, most notably "person". Further, some corrections can have multiple blocks of text.
The <correction> element, which has only an "info" attribute in NITF version 2.0, should be expanded. The block element contains the necessary components for data types.
Content Model:
The <correction> element contains parsed character data, <block> data and optionally a <date.release> tag.
Attributes:
None
Tag Source:
NITF
Usage Example:
<correction><block><p>The name Foo in the headline was misspelled. It should have been Food.</p></block><date.release norm="19990221"/></correction>
XML Element and Attribute Declarations:
<!ELEMENT correction (#PCDATA | block | date.release)*>
Parent:
docdata
4. <pubdata> - general publication data
The initial release of NITF makes little provision for usage-specific distribution information, such as print publication parameters like date of publication, page, issue, volume, etc. This requires creation of a PubData structure within the <HEAD> of the document, similar to the <DOCDATA> area.
Content Model:
The <pubdata> element consists of a series of elements which provide the distribution meta data. It contains one event attribute, which would be a sequence of numbers. Each event attribute serves to group a number of pubdata elements.
Attributes
event
Tag Source:
NITF
Usage Example
<pubdata event="1"><issue value="March 1999"/><volume value="5"/><date.published norm="19990225"/><start-page value="2A"/><section value="sports"/><zone value="zone"/></pubdata>
XML Element and Attribute Declarations:
<!ELEMENT pubdata (issue | volume | start-page | publication-title | publication-number | front-page | date.published | section | zone | edition )*>
<!ATTLIST pubdata
event NMTOKEN "0" >
Parent:
head
5. <date.published> - date document was published
This element contains the date and time when the information within a document is published. The information should be normalized to UTC. Attribute use is ISO 8601 based (YYYYMMDDThhmmssZ).
Content Model:
The <date.published> element is defined as empty, meaning that it contains no content.
Attributes:
norm
Tag Source:
NITF
Usage Example:
<date.published norm="19990223"/>
XML Element and Attribute Declarations:
<!ELEMENT date.published EMPTY>
<!ATTLIST date.published
norm CDATA #IMPLIED>
Parent:
pubdata
6. <editor-list> - Editor list
Container to hold a list of editors who have been associated with the document.
Content Model:
The element <editor-list> contains zero or more <editor> elements.
Attributes:
None
Tag Source:
NITF
Usage Example
<editor-list>
<editor name="john" norm="19990223 10:33:00"/>
<editor name="betsy" norm="19990223 10:35:00"/>
</editor-list>
XML Element and Attribute Declarations:
<!ELEMENT editor-list (editor*)>
Parent
docdata
<editor> - Editor
Tag to hold the name of an editor and a time stamp recording when the editor worked on the document.
Content Model:
The <editor> element is defined as empty, meaning that it contains no content.
Attributes:
name, norm
Usage Example:
<editor-list>
<editor name="john" norm="19990223 10:33:00"/>
<editor name="betsy" norm="19990223 10:35:00"/>
</editor-list>
XML Element and Attribute Declarations:
<!ELEMENT editor EMPTY>
<!ATTLIST editor
name CDATA #IMPLIED
norm CDATA #IMPLIED>
Parent
editor-list
7. <edition> - edition of publication
Identification of a publication edition related to an instance of publication.
Content Model:
The <edition> element is defined as empty, meaning that it contains no content.
Attributes
value
Tag Source
NITF
Usage example:
<edition value="late city final"/>
XML Element and Attribute Declarations:
<!ELEMENT edition EMPTY>
<!ATTLIST edition
value CDATA #IMPLIED>
Parent
pubdata
8. <first-character> - First character
A mechanism to identify the first character of a block of text. Often used to allow application of a specific style or graphical character to a text area.
Content Model:
#PCDATA - simple text composed of parsed character data
Attributes:
id ID #IMPLIED
class NMTOKENS #IMPLIED
style CDATA #IMPLIED
lang NMTOKEN #IMPLIED
dir (ltr | rtl) #IMPLIED
as contained in %attrs;
Tag Source:
NITF
Usage example:
<block><p><first-character>I</first-character>n the beginning, there was the tag.</p></block>
XML Element and Attribute Declarations:
<!ELEMENT first-character (#PCDATA)>
<!ATTLIST first-character
id ID #IMPLIED
class NMTOKENS #IMPLIED
style CDATA #IMPLIED
lang NMTOKEN #IMPLIED
dir (ltr | rtl) #IMPLIED >
Parent:
a, body, caption, credit, dt, fig.data, fn, h1, h2, h3, h4, h5, h6, h7, h8, hl1, hl2, note, p, q, tagline, td, th
9. <front-page> - front page of a publication
Identification of front-page placement for an instance of publication. The value attribute indicates whether the document appeared on the front page of the publication.
Content Model:
The <front-page> element is defined as empty, meaning that it contains no content.
Attributes
value
Tag Source:
NITF
Usage example
<front-page value="yes"/>
XML Element and Attribute Declarations:
<!ELEMENT front-page EMPTY>
<!ATTLIST front-page
value (yes | no) "no">
Parent:
pubdata
10. <issue> - issue of publication
Identification of a publication issue related to the document.
Content Model:
The <issue> element is defined as empty, meaning that it contains no content.
Attributes
value
Tag Source:
NITF
Usage example
<issue value="March 1999"/>
XML Element and Attribute Declarations:
<!ELEMENT issue EMPTY>
<!ATTLIST issue
value CDATA #IMPLIED>
Parent:
pubdata
11. Lead (an attribute of a block)
Blocks of text can have certain attributes which have industry-specific meaning, such as lead and summary. The addition of a blocktype attribute to the block tag allows the type of a block to be identified. Leads may stretch for several blocks.
Content Model:
The <block blocktype="lead"> identifies the block as a lead block of text.
Attributes
blocktype, %attrs;
Tag Source:
NITF
Usage example:
<block blocktype="lead"><p>This is the lead of a story</p></block>
XML Element and Attribute Declarations:
<!ELEMENT block ((%block.head;)?, (%block.content;)*, (%block.end;)?)>
<!ATTLIST block
blocktype (lead | summary | no-run | rest) "rest"
%attrs;>
Parent:
body, bq, dd, FIG.data, fn, LI, note, td, th
12. <size> - size of the document
Mechanism to provide measurement data of the document.
Content model:
The <size> element is defined as empty, meaning that it contains no content, only attributes.
Attributes:
size.measure, size.value
Tag Source:
NITF
Usage example:
<size size.measure="words" size.value="345"/>
XML Element and Attribute Declaration
<!ELEMENT size EMPTY>
<!ATTLIST size
size.measure CDATA #REQUIRED
size.value NMTOKEN #REQUIRED >
Parent:
docdata
13. summary (an attribute of a block)
Blocks of text can have certain attributes which have industry-specific meaning, such as lead and summary. The addition of a blocktype attribute to the block tag allows the type of a block to be identified. Summaries may stretch for several blocks. Summaries in the U.S. are also called the "nut graf".
Content Model:
The <block blocktype="summary"> identifies the block as a summary block of text.
Attributes
blocktype, %attrs;
Tag Source:
NITF
Usage example:
<block blocktype="summary"><p>This is the summary of a story</p></block>
XML Element and Attribute Declarations:
<!ELEMENT block ((%block.head;)?, (%block.content;)*, (%block.end;)?)>
<!ATTLIST block
blocktype (lead | summary | no-run | rest) "rest"
%attrs;>
Parent:
body, bq, dd, FIG.data, fn, LI, note, td, th
14. <package> - Package of documents
Mechanism for linking together a group of documents that would not be considered part of a series. A package has a name and a thread attribute which indicates a sequence within the package.
Content Model:
The <package> element is defined as empty, meaning that it has no content.
Attributes:
name, thread
Tag Source:
NITF
Usage Example:
<package name="hurricane coverage" thread="12"/>
XML Element and Attribute Declarations:
<!ELEMENT package EMPTY>
<!ATTLIST package
name CDATA #IMPLIED
thread CDATA #IMPLIED>
Parent:
docdata
15. <page> - the page of the publication the document appeared in
Mechanism to attach a page attribute to blocks of textual data. The pubdata-element attribute of <page> refers to the event attribute of a <pubdata> element, so that blocks of text can carry a page value tied to a specific publication instance. A block could have multiple <page> tags, relating to multiple publication instances.
Content model:
The <page> element is defined as empty, meaning that it contains no content, only attributes. It can occur multiple times within a block.
Attributes:
pubdata-element, value
Tag Source:
NITF
Usage Example:
<head><pubdata event="1"><volume value="22"/><start-page value="2B"/></pubdata></head>
<block><page pubdata-element="1" value="3B"/><p>This is some more text</p></block>
XML Element and Attribute Declarations:
<!ELEMENT page EMPTY>
<!ATTLIST page
pubdata-element NMTOKEN #REQUIRED
value CDATA #REQUIRED>
Parent:
block
16. <start-page> - the starting page of publication
Identification of the starting page that a document appeared on in print form.
Content Model
The <start-page> element is defined as empty, meaning that it contains no content, only attributes.
Attributes:
value
Tag Source:
NITF
Usage Example:
<start-page value="2B"/>
XML Element and Attribute Declarations
<!ELEMENT start-page EMPTY>
<!ATTLIST start-page
value CDATA #REQUIRED>
Parent
pubdata
17. <publication-title> - the title of the publication
Identification of a publication title related to an instance of publication
Content Model:
The <publication-title> element is defined as empty, meaning that it contains no content.
Attributes
value
Tag Source:
NITF
Usage example
<publication-title value="The Salt Lake Tribune"/>
XML Element and Attribute Declarations:
<!ELEMENT publication-title EMPTY>
<!ATTLIST publication-title
value CDATA #IMPLIED>
Parent:
pubdata
18. <publication-number> - the number of the publication
Identification of a publication number related to an instance of publication
Content Model:
The <publication-number> element is defined as empty, meaning that it contains no content.
Attributes
value
Tag Source:
NITF
Usage example
<publication-number value="22"/>
XML Element and Attribute Declarations:
<!ELEMENT publication-number EMPTY>
<!ATTLIST publication-number
value CDATA #IMPLIED>
Parent:
pubdata
19. rest (an attribute of a block)
Blocks of text can have certain attributes which have industry-specific meaning, such as lead and summary. The addition of a blocktype attribute to the block tag allows the type of a block to be identified. The rest of the story would be blocks which do not have a lead, summary or no-run attribute. "Rest" would be the default of the blocktype attribute.
Content Model:
The <block blocktype="rest"> identifies the block as a rest block of text.
Attributes
blocktype, %attrs;
Tag Source:
NITF
Usage example:
<block blocktype="rest"><p>This is just a paragraph of a story</p></block>
XML Element and Attribute Declarations:
<!ELEMENT block ((%block.head;)?, (%block.content;)*, (%block.end;)?)>
<!ATTLIST block
blocktype (lead | summary | no-run | rest) "rest"
%attrs;>
Parent:
body, bq, dd, FIG.data, fn, LI, note, td, th
20. <section> - the section of the publication the document appeared in
Identification of a publication section related to the document. Sections relate to a physical or logical grouping of stories within a news product.
Content Model:
The <section> element is defined as empty, meaning that it contains no content.
Attributes
value
Tag Source:
NITF
Usage example
<section value="sports"/>
XML Element and Attribute Declarations:
<!ELEMENT section EMPTY>
<!ATTLIST section
value CDATA #IMPLIED>
Parent:
pubdata
21. no-run (an attribute of a block)
Blocks of text can have certain attributes which have industry-specific meaning, such as lead and summary. The addition of a blocktype attribute to the block tag allows the type of a block to be identified. Some blocks of text are not run in certain instances, but should remain within the document.
Content Model:
The <block blocktype="no-run"> identifies the block as a no-run block of text.
Attributes
blocktype, %attrs;
Tag Source:
NITF
Usage example:
<block blocktype="no-run"><p>This is extra information that you can run if you have room</p></block>
XML Element and Attribute Declarations:
<!ELEMENT block ((%block.head;)?, (%block.content;)*, (%block.end;)?)>
<!ATTLIST block
blocktype (lead | summary | no-run | rest) "rest"
%attrs;>
Parent:
body, bq, dd, FIG.data, fn, LI, note, td, th
22. <volume> - publication volume
Identification of a publication volume related to the document.
Content Model:
The <volume> element is defined as empty, meaning that it contains no content.
Attributes
value
Tag Source:
NITF
Usage example
<volume value="55"/>
XML Element and Attribute Declarations:
<!ELEMENT volume EMPTY>
<!ATTLIST volume
value CDATA #IMPLIED>
Parent:
pubdata
23. <orgid> - organization identifier
Usage notes:
The IDSRC attribute is used to identify certain standard organization identifier types. These include SIC, STATISTIC, PRODUCT, ISSC and NAICS.
24. <zone> - zone of distribution of publication
Identification of a distribution zone related to the document. A zone would be a regional distribution of a publication.
Content Model:
The <zone> element is defined as empty, meaning that it contains no content.
Attributes
value
Tag Source:
NITF
Usage example
<zone value="city,suburban"/>
XML Element and Attribute Declarations:
<!ELEMENT zone EMPTY>
<!ATTLIST zone
value CDATA #IMPLIED>
Parent:
pubdata
25. <docdata> - General document data
A number of additions to the <docdata> structure require additions to the ELEMENT declaration.
XML Element and Attribute Declarations:
<!ELEMENT docdata (envloc | doc-id | del-list | urgency | fixture | date.issue | date.release | date.expire | doc.scope | series | ed-msg | du-key | doc.copyright | doc.rights | key-list | correction | editor-list | size | package )*>
The author has worked 25 years in the newspaper industry in a variety of roles. He has been a photojournalist, reporter, production systems editor, production systems manager, manager of information systems, systems designer and newspaper research director.
He has spent the last 10 years as Director of Tribune Solutions, a research and development department of The Salt Lake Tribune. Tribune Solutions produces several commercial newspaper software applications, including NewsView, a text archiving system, PhotoView, an image archiving system, and Connections32, a digital asset management system. He is the chief architect of all those products, and has designed over 150 conversion filters for nearly every production system in use by the industry today, as well as every on-line vendor. Those products are used by more than 100 publications in 8 countries.
The NewsView product line has been sold and marketed for the last 8 years under the umbrella of several Reed-Elsevier companies, including Lexis-Nexis and Reed Technology and Information Services Inc. The Salt Lake Tribune is a wholly owned subsidiary of Tele Communications Inc, soon to be a wholly owned subsidiary of AT&T.
The author is a frequent speaker at industry gatherings and has conducted seminars on archiving and news production issues at the University of Missouri School of Journalism and Rhodes University, Grahamstown, South Africa.