NewsML Elements and Attributes
XN-5
Jo Rabin
Revision 7
12th November 1999
draft
Accompanies 1999-11-10 DTD
Describes the functions of NewsML
Copyright © Reuters Limited 1999 All Rights Reserved
This document describes the elements and attributes of the NewsML DTD.
The requirements of NewsML are described in XN-2 and the intended functionality in XN-3. XN-8 describes various encoding choices that underpin the formulation of the DTD.
Revision 7 - 12th November 1999
Changes to accompany 1999-11-10 DTD (remove ambiguous content model).
Revision 6 - 22nd October 1999
Removed material now in XN-3 and XN-8. Corrections and additions to synchronise with 1999-10-12 DTD.
Introduction
Revision History
Contents
Structural Elements
  newsitem
  newsitempart
  Newslines
  sourcedata
  newsobject
  data
  text
  p
  link
  records
  record
  field
Metadata Elements
  codes
  code
  things
  thing
  altthings
  editdetail
  thinglocation
  name
  dc
News Management Elements
  handling
  slug
  product
  service
  routing
  instructions
  priority
  urgency
  status
  permissions
  cycle
  outcue
  action
Attribute values
  Roles
  Variants
  xml:lang
  things.class
  codes.class
Examples
  Example: Simple Story Encoding
  Example: Multiple Part Story Encoding
  Example: Categorized Story
  Example: Categorization with Corrections
  Example: A Kill
  Example: A picture with separations
References
  Standards
  NewsML References
<!ELEMENT newsitem (title+,
%newslines;,
((newsitempart+ | newsobject | text),
%newslines;)?,
metadata?,
handling*,
sourcedata?)>
A newsitem consists of one or more titles, followed by any number of newslines in any order, optionally followed by some content, which is in turn optionally followed by any number of tagline, copyright or citation elements in any order, followed by optional metadata, any number of handling elements and finally optional sourcedata.
The ordering of these items is accidental and results from limitations of DTDs. Hence the presence of %newslines; twice: it allows publishers to have, for example, headlines preceding the content and copyright notices trailing the content.
The content can be:
· One or more newsitemparts - for composite newsitems.
· One newsobject - this allows for newsitems that are e.g. a picture alone.
· One in-line text element - this allows for trivial textual story encodings.
Duplicated newslines are expected to contain different versions of the element for different languages. If two elements of this kind specify the same language the content of the later element takes precedence.
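The precedence rule for duplicated newslines can be sketched in Python as follows. This is only an illustration: the function name pick_newsline and the sample headlines are assumptions, not part of NewsML, and the sketch assumes that a newsline without its own xml:lang inherits the value set on the newsitem.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def pick_newsline(newsitem, tag, lang):
    """Return the text of the newsline <tag> to use for language lang, or None."""
    default_lang = newsitem.get(XML_LANG)  # assumed to be inherited by newslines
    chosen = None
    for elem in newsitem.findall(tag):
        if elem.get(XML_LANG, default_lang) == lang:
            chosen = elem.text  # a later element overrides an earlier one
    return chosen


# Hypothetical sample: two English headlines and one French headline.
SAMPLE = """<newsitem itemid="x1" date="1999-11-12" xml:lang="en">
  <title>Sample</title>
  <headline>First English headline</headline>
  <headline xml:lang="fr">Titre en francais</headline>
  <headline xml:lang="en">Second English headline</headline>
</newsitem>"""

item = ET.fromstring(SAMPLE)
print(pick_newsline(item, "headline", "en"))
```

Scanning in document order and overwriting the choice is the simplest way to let the later element win without a second pass.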
Attributes:
Attribute Name | Presence | Format | Comment
itemid | Required | Any | Uniquely identifies this newsitem in the publisher's domain.
date | Required | ISO Date | A date associated with the story. The kind of date is not defined: it can be the story creation date, the publication date, etc.
id | Optional | ID | Identifies this element.
revision | Default 0 | Integer | The higher the number, the later the revision.
publisher | Optional | URL | A means of disambiguating the id attribute and hence making it unique. Other data about the publisher, if needed, should be encoded as metadata.
xml:lang | Optional | RFC 1766 | Sets the default language for the newsitem and indicates that the story is intended especially for people who wish to read this language.
href | Optional | URL | Information about where to get the story and hence where to get elements that have not been included. Always provides the latest revision of the story.
parts | Default 1 | Integer | How many parts there are in this newsitem. The number of parts actually present may be different, as this figure gives the total number of parts in the newsitem.
<!ELEMENT newsitempart (%newslines;,
((newsitem | newsobject+ | newsitempart+),
%newslines;)?,
metadata? ,
sourcedata?)>
Note that, as above, the same structure of Newslines is used and the same semantics are imputed to repeated elements.
A newsitempart consists of any number of Newslines in any order, optionally followed by some content, which is in turn optionally followed by any number of tagline, copyright or citation elements in any order, followed by optional metadata and finally optional sourcedata.
The content can be:
· One newsitem - this allows the construction of lists of stories.
· One or more newsobjects - this provides the mechanism by which a number of alternatives fulfilling the same role in the newsitem may be listed.
· One or more newsitemparts
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
role | Required | Named role | Some systematic means of identifying the role that a newsitempart can play in a story. There may be a number of schemes for this, in which case it will be necessary to have some kind of namespace mechanism to distinguish them. Some thoughts about roles are detailed at the end of this document.
order | Optional | Integer | A precedence order for parts may be given by specifying a number in this attribute. Parts with this attribute specified have decreasing precedence the higher the value; the highest precedence is 0. Parts which do not specify this attribute have lower precedence than any part that does, and their precedence order among themselves is inferred from their order of presentation in the XML encoding.
alternatives | Optional | true or false | This attribute affects the interpretation of parts embedded in parts. Objects embedded in parts are always alternatives to each other. Embedded parts are considered alternatives to each other, irrespective of the value of their role attribute, if this attribute is true. They are complements to each other if this attribute is false.
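A minimal Python sketch of the precedence rule for the order attribute follows. The function name parts_by_precedence and the sample roles are illustrative assumptions; the source does not say how ties between equal order values are resolved, so this sketch simply keeps them in document order.

```python
import xml.etree.ElementTree as ET


def parts_by_precedence(parent):
    """Return the newsitempart children of parent, highest precedence first."""
    parts = parent.findall("newsitempart")
    # Parts with an order attribute come first, lowest number first (0 is highest).
    # sorted() is stable, so parts without order keep their document order.
    return sorted(
        parts,
        key=lambda p: (p.get("order") is None, int(p.get("order", "0"))),
    )


# Hypothetical sample: two ordered parts and two unordered parts.
SAMPLE = """<newsitem itemid="x1" date="1999-11-12">
  <title>Sample</title>
  <newsitempart role="side bar"/>
  <newsitempart role="secondary" order="2"/>
  <newsitempart role="box"/>
  <newsitempart role="primary" order="0"/>
</newsitem>"""

ordered = parts_by_precedence(ET.fromstring(SAMPLE))
print([p.get("role") for p in ordered])
```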
All Newslines have PCDATA content and id and xml:lang attributes.
<!ELEMENT sourcedata (#PCDATA)>
This element allows the transport of any XML compatible data or element structures. It is provided for applications that wish to take advantage of the capabilities of NewsML but require additional application semantics beyond those developed for news. Sourcedata is not intended to extend NewsML semantics in an ad hoc way; it is for the expression of other semantics (quotes or whatever).
With the arrival of namespaces it will not be necessary to have this element and it will be removed in a later version of the spec. You have been warned!
Attribute Name | Presence | Format | Comment
encoding | Optional | Mimetype | What encoding has been used.
compression | Optional | Mimetype | What compression has been applied.
<!ELEMENT newsobject (%Newslines;,
((data|text),
%Newslines;)?,
metadata? ,
sourcedata?)>
Note: once again the same structure of Newslines is used.
A newsobject consists of any number of Newslines in any order, optionally followed by some content, which is in turn optionally followed by any number of tagline, copyright or citation elements in any order, followed by optional metadata and finally optional sourcedata.
The content can be one of:
· One data element - this allows the in-line inclusion of non-textual NewsML encoded material.
· One text element - this provides the means for including NewsML content encoding in-line.
Note that this is a prime area for turning into RDF.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
mimetype | Required | MIME specification | What sort of object this is. If it is NewsML content encoding then this is "text/x-newstext".
mediatype | Optional | Enumerated | Identifies what type of object it is, e.g. a GIF may be an animation or an image.
variant | Optional | String | The reason the object is present as an alternative, especially if this is not apparent from the other attributes.
xml:lang | Optional | RFC 1766 | Language, and variant if relevant.
href | Optional | URL | Where to get the content from if it is not included in-line as data or text.
height | Optional | Integer | Vertical space occupied by the object, if relevant.
width | Optional | Integer | Horizontal space needed by the object, if relevant.
size | Optional | Integer | The size in bytes of the object if it is specified as a URL.
duration | Optional | Integer | The time it takes to experience the object, if this is relevant.
colordepth | Optional | Integer | How many colors.
characterset | Optional | String | Which character encoding is used (not which alphabet).
bandwidthtostream | Optional | Integer | Minimum sustained throughput, in bits per second, required to be able to stream this object.
<!ELEMENT data (#PCDATA)>
The data element contains in-line content that has been encoded to meet the requirements of XML in respect of valid characters for PCDATA. The format of the data packaged in this element is described in the containing newsobject element.
The data attributes describe what compression has been applied, followed by what encoding scheme was applied to the compressed result.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
encoding | Required | Mimetype | What encoding has been used; text/plain denotes none.
compression | Optional | Mimetype | What compression has been used.
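The order of operations described for data (compress first, then encode the compressed result) can be sketched in Python as below. The choice of zlib and base64 and the two attribute values are assumptions made for illustration only; in particular the encoding label is a placeholder, since the source asks for a mimetype but names none for any particular scheme.

```python
import base64
import zlib
import xml.etree.ElementTree as ET

COMPRESSION = "application/zlib"  # assumed label for zlib compression
ENCODING = "base64"               # placeholder label, not defined by the source


def pack(raw):
    """Build a data element: compress the bytes, then encode the result."""
    data = ET.Element("data", {"encoding": ENCODING, "compression": COMPRESSION})
    data.text = base64.b64encode(zlib.compress(raw)).decode("ascii")
    return data


def unpack(data):
    """Reverse pack(): decode the PCDATA first, then decompress."""
    return zlib.decompress(base64.b64decode(data.text))


element = pack(b"\x00\x01binary picture bytes\xff")
print(ET.tostring(element, encoding="unicode"))
```

A receiver undoes the steps in the opposite order, which is why the attributes have to record both of them.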
<!ELEMENT text (#PCDATA|p|link|records)* >
The text element allows the in-line encoding of textual content. The text can contain an arbitrary mixture of characters and p, link and records elements.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
<!ELEMENT p (#PCDATA|link)*>
The p element encapsulates text as a paragraph. Link elements can span text in paragraphs.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
<!ELEMENT link (#PCDATA)>
The link element denotes the text included in it as a hyperlink. More on this with the development of XLink.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
href | Required | URL | Where the link leads to.
Records identifies a data structure consisting of data that can also be laid out without having to be interpreted by a computer. It is present to satisfy minimally the need for textual data to be organized in some way without defining layout elements like table. Records is more general than table because it does not require its rows to have the same columns. The application attribute allows a receiving program to determine what kind of program to use to interpret the data present. Several might be applicable: for example, if records contain data relating to closing prices then a graph application could be as applicable as a straightforward tabular layout. The intention is that systems that do not understand the data are able to make some attempt at rendering it (as a table structure).
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
application | Optional | String | Some means of identifying what kind of data this is so it can be rendered appropriately by relevant applications. This may be a stylesheet reference …
<!ELEMENT record (field+)>
A container for field elements to be grouped together
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
<!ELEMENT field (#PCDATA)>
A container for data
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
name | Optional | String | A way of distinguishing fields from each other and identifying the data content.
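As a rough Python sketch of how a system that does not understand the data might still render records as a table, the function below collects the field names seen across all records and pads missing cells. It assumes that records directly contains record elements (the records declaration is not reproduced here); the function name, the application value and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET


def records_to_rows(records):
    """Return a header row followed by one row of cell text per record."""
    columns = []
    for field in records.iter("field"):
        name = field.get("name", "")
        if name not in columns:
            columns.append(name)  # keep first-seen order of field names
    rows = [columns]
    for record in records.findall("record"):
        cells = {f.get("name", ""): (f.text or "") for f in record.findall("field")}
        # Records need not share the same columns, so missing cells are left empty.
        rows.append([cells.get(column, "") for column in columns])
    return rows


# Hypothetical sample: the second record carries an extra field.
SAMPLE = """<records application="closing-prices">
  <record><field name="stock">ABC</field><field name="close">12.5</field></record>
  <record><field name="stock">XYZ</field><field name="close">7.25</field><field name="note">ex-dividend</field></record>
</records>"""

for row in records_to_rows(ET.fromstring(SAMPLE)):
    print(" | ".join(row))
```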
<!ELEMENT codes (code*)>
Contains code elements that indicate the applicability of the code they contain to the content of the entity to which this metadata is attached.
It is intended that no two codes elements have the same combination of class and role values.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
class | Optional | String | Identifies the scheme that the codes in the contained elements come from; see the section describing use of the class attribute.
publisher | Optional | URL | A means of disambiguating the class attribute. If absent, the value is inferred from the publisher attribute of newsitem.
role | Optional | String | Identifies the role that the codes fulfill, e.g. country codes can be used both for the role "location" and for the role "topic".
<!ELEMENT code (name|editdetail)*>
Indicates that the code identified applies to the entity to which this metadata is attached. The code element can contain name elements – to describe the code – and editdetail elements to describe the history of the application of the code.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
code | Required | String | The code identifier.
confidence | Optional | String | A measure of the confidence that the code applies. This means of indicating confidence requires further study.
present | Default "true" | true or false | If true the code applies; if false it does not. The false value is required so that the contained elements can be preserved (i.e. where a code had been applied but has now been removed).
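The effect of the present attribute can be illustrated with a short Python sketch that lists the codes currently applying to an entity. The function name applicable_codes is an assumption, and the sample reuses code values from the categorization example in this document purely for illustration.

```python
import xml.etree.ElementTree as ET


def applicable_codes(metadata):
    """Map each codes class to the code values that currently apply."""
    result = {}
    for codes in metadata.findall("codes"):
        active = [
            code.get("code")
            for code in codes.findall("code")
            # present defaults to "true"; only an explicit false removes a code
            if code.get("present", "true").strip().lower() != "false"
        ]
        result.setdefault(codes.get("class"), []).extend(active)
    return result


SAMPLE = """<metadata>
  <codes class="rbb:country:9905">
    <code code="AFGH" confidence="87"/>
    <code code="ABDI" present="false">
      <editdetail action="added" date="1999-05-02"/>
      <editdetail action="removed" date="1999-05-03"/>
    </code>
    <code code="C1"/>
  </codes>
</metadata>"""

print(applicable_codes(ET.fromstring(SAMPLE)))
```

The removed code stays in the document, so its editdetail history is preserved, but it is skipped when the applicable codes are listed.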
<!ELEMENT things (thing*)>
Contains elements indicating the presence of references to certain named entities.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
class | Required | String | Identifies the type or class of thing that is identified (e.g. person, place …). Note the difference between a thing class and a code class; see the section describing use of the class attribute.
publisher | Optional | URL | The owner of the thingclass scheme name.
<!ELEMENT thing (name|thinglocation|editdetail)*>
The thing element denotes that the thing it identifies is present. The thing can be named by including a name element (one for each language). It can also contain thinglocation and editdetail elements.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
code | Optional | String | A code allocated to this thing.
codeclass | Optional | String | The classification scheme that the code comes from (see the section on the class attribute).
confidence | Optional | String | A measure of certainty as to the correctness of application of this thing.
present | Default "true" | true or false | False if the thing was applied and then subsequently removed, and the editdetail is important.
<!ELEMENT altthings (things*)>
This element is used to bracket together a number of alternative interpretations of the same piece of text, e.g. where there is ambiguity as to whether the text "Britannia" refers to a company (a British building society, and also a British airline) or a person (the mythical personification of the British nation).
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
<!ELEMENT editdetail EMPTY>
Some applications require an audit trail of changes to the codes and things applied to a story. The editdetail record is used to describe the changes.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
attribution | Optional | String | The name of the person or thing that made the change.
action | Default "added" | added, confirmed or removed | Whether the change was to assert or deny the applicability of the code or thing.
date | Optional | ISO Date | When the change took place.
agent | Default "unknown" | Enumeration | What type of thing made the change. Human: a person. Expansion: a machine that inferred this was required because some other code or thing in the same scheme is present.
score | Optional | String | A measure whose values are of significance to the entity that applied the code.
confidence | Optional | String | A measure of the confidence of application.
expansion | Optional | String | When agent is expansion, the value of the other code or thing that caused the expansion.
The thinglocation element points to an instance of the thing referenced in the containing thing element. This element only works for textual content at the moment, but should be generalized to reference other media.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
itemid | Optional | String | The newsitem the thing was found in.
idref | Optional | String | The element the thing was found in.
offset | Optional | Integer | Character position of the start of the reference in the text.
text | Optional | String | Actual text found at this location, if different from the canonical form found in the containing thing element.
length | Optional | Integer | How many characters in the source relate to the reference.
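A small Python sketch of resolving a thinglocation against textual content follows. It assumes that offset counts characters from zero within the text of the element named by idref, and that things sits inside metadata; neither is stated in the source. The function name located_text and the sample content are hypothetical.

```python
import xml.etree.ElementTree as ET


def located_text(newsitem, location):
    """Return the characters a thinglocation points at, or None if unresolved."""
    idref = location.get("idref")
    if idref is None or location.get("offset") is None:
        return None
    target = next((e for e in newsitem.iter() if e.get("id") == idref), None)
    if target is None:
        return None
    text = "".join(target.itertext())
    start = int(location.get("offset"))  # assumed to be zero-based
    length = int(location.get("length", len(text) - start))
    return text[start:start + length]


SAMPLE = """<newsitem itemid="x1" date="1999-11-12">
  <title>Sample</title>
  <text><p id="p1">The Britannia sailed at dawn.</p></text>
  <metadata>
    <things class="company">
      <thing code="BRIT"><thinglocation idref="p1" offset="4" length="9"/></thing>
    </things>
  </metadata>
</newsitem>"""

item = ET.fromstring(SAMPLE)
print(located_text(item, item.find(".//thinglocation")))
```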
The name element is contained by a thing or code element and is used to name the thing or code.
Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
xml:lang | Optional | RFC 1766 | Language the name is in.
<!ELEMENT dc EMPTY>
<!ATTLIST dc id ID #IMPLIED
element %dc.elements; #REQUIRED
value CDATA #REQUIRED
xml:lang NMTOKEN #IMPLIED>
The dc element allows the assignment of values to specific refinements of the Dublin Core scheme. Example usage is:
<dc element="dc.date.expires" value="1999-07-22T00:00Z"/>
(midnight 21/22 July 1999)
<dc element="dc.world.ends" value="1999-01-01T00:00-05:00"/>
(millennium midnight in New York)
Syndication model:
The creator is the person or thing that created the object. The values of the creator elements do not change with the lifecycle of the newsitem. The values of publisher and source do change with the lifecycle. When the newsitem is published the publisher's details are recorded using the publisher elements. If the newsitem is syndicated each subsequent publisher replaces the values they received with their own details. The original publisher's details are transferred to the source elements. In the case of republication multiple times only the original publisher and the last publisher are represented (using the source and publisher elements respectively).
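The syndication model can be sketched in Python over a plain dictionary of dc values. This is an illustration under assumptions: the function name republish is hypothetical, and the mechanical renaming of every dc.publisher name to the matching dc.source name is a simplification, since the two sets of element names do not correspond exactly.

```python
def republish(dc, new_publisher):
    """Return the dc values as they would look after new_publisher republishes.

    dc and new_publisher are plain dicts keyed by dc element name.
    """
    def in_group(key, group):
        return key == group or key.startswith(group + ".")

    # Creator (and any other) values are carried over unchanged.
    out = {k: v for k, v in dc.items() if not in_group(k, "dc.publisher")}
    if not any(in_group(k, "dc.source") for k in dc):
        # First republication: the original publisher becomes the source.
        for key, value in dc.items():
            if in_group(key, "dc.publisher"):
                out["dc.source" + key[len("dc.publisher"):]] = value
    # Later republications keep the existing source and only swap the publisher.
    out.update(new_publisher)
    return out


original = {
    "dc.creator.name": "A. Writer",
    "dc.publisher": "First Agency",
    "dc.publisher.provider": "London desk",
}
second = republish(original, {"dc.publisher": "Second Agency"})
third = republish(second, {"dc.publisher": "Third Agency"})
print(third)
```

After two republications only the original and the latest publisher remain, with the intermediate publisher dropped.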
The element attribute can contain the following values (closer definition is needed, especially of allowable values, character sets, number of instances and so on; see the Reuters Metadata Standard for how this can be laid out):
dc.title | The title of the entity. This has a different semantic from the title element (intended for display in a list of similar titles). This may not be needed.
dc.creator.name | The name of the person who created the entity.
dc.creator.title | The person's role in their organization (e.g. Foreign Correspondent).
dc.creator.location | The main location of the creator while creating the entity.
dc.creator.location.city | The name of the city, if applicable.
dc.creator.location.sublocation | The name of the place in the city, if applicable (e.g. Madison Square Garden).
dc.creator.location.stateOrProvince | The name of the subdivision of the country, if applicable, e.g. New York; Andalucia; Cape Province.
dc.creator.location.country.code | As ISO 3166, with IPTC extensions for at sea, in the air, in outer space and so on.
dc.creator.location.country.name | The name of the country according to some scheme and in some language.
dc.creator.phone | How to call the creator, in international format.
dc.creator.email | Internet email address.
dc.creator.program | The program and version used to create the entity.
dc.date.created | When originally created, in UTC, i.e. when the first part of the story was developed. Specify date, hours and minutes.
dc.date.lastModified | The date the latest part of the story was created.
dc.date.converted | The date the entity was converted to its present medium.
dc.publisher | The name of the publishing organization.
dc.publisher.provider | The relevant department of the publisher (bureau name/desk?).
dc.publisher.contact.name | Someone to speak to at the publisher.
dc.publisher.contact.email | How to contact them.
dc.publisher.contact.phone |
dc.publisher.contact.title | Who they are.
dc.publisher.location | Where they are. Need to figure out whether this should really be a time zone. Otherwise is it a country or a city or what?
dc.publisher.graphic | The publisher's logo. This references a URL? Or a part of the attached story?
dc.coverage.start | The earliest date referred to in the newsitem.
dc.coverage.end | The last date referred to in the newsitem.
dc.coverage.period | If a continuous period, use start date and period.
dc.relation.obsoletes | The itemid of a newsitem that this item obsoletes (replaces), other than earlier versions of itself.
dc.relation.includes | The itemid of a newsitem that this item includes as part of its content.
dc.relation.references | The itemid of a newsitem that this item makes reference to.
dc.date.published | When the item was first published.
dc.date.live | When it is permissible to use the content (embargo).
dc.date.statuschanges | When some aspect of it is going to change, e.g. its permissions may change (not expiry).
dc.date.expires | When it is no longer permissible to use the item.
dc.source | The original publisher, if the publisher named in the "publisher" element was not the original publisher.
dc.source.contact | Who to contact there.
dc.source.provider | What their department is.
dc.source.location | Where they are / their time zone.
dc.source.graphic | Their logo.
dc.source.date.published | When they published it.
dc.source.identifier | What their reference for it was.
dc.contributor.editor.name | Someone who edited the newsitem.
dc.contributor.captionwriter.name | Someone who wrote the caption.
dc.contributor.captionwriter.nanny.name | The name of the person who looked after the caption writer's children while the caption writer was writing the caption.
The news management features are all grouped together under the handling element. These elements, with the exception of action, are largely equivalent to IIM features. As such there seems to be something of a lack of clarity about why the elements are needed and how they might be used.
<!ELEMENT handling (routing | urgency | priority | slug |
action | status | product | service |
instructions | permissions | cycle | outcue )*>
The content elements of handling can appear in any order and as many times as needed. All elements have an optional id attribute and xml:lang attribute. If the xml:lang attribute is missing the value is assumed to be inherited from handling. With the exception of action, which is declared EMPTY, all content elements of handling allow #PCDATA content.
slug
Summary document information and status, e.g. "international-kashmir-leadall". Used for editorial purposes and may sometimes be used to link documents. Use of the slug for linking is deprecated: the NewsML itemid is used to identify newsitems and hence provide links.
product
Information from the provider to indicate which product this news is part of. cf. IIM 1:50.
service
Information from the provider to indicate which service this news is part of. cf. IIM 1:30.
routing
Provides specific information relating to the way the news is routed.
instructions
Editorial information not contained anywhere else; equivalent to IIM Special Instructions, e.g. "Not available in Asia for copyright reasons".
priority
Routing priority.
urgency
Editorial urgency.
status
Editorial status, e.g. "lead", "MORE", "CORRECTION" and so on.
permissions
To be used to indicate in some way what permissions attach to this news.
cycle
AM, PM or BC.
outcue
The closing words of an audio part.
<!ELEMENT action EMPTY>
<!ATTLIST action id ID #IMPLIED
action (add| delete| replace) "add"
itemid CDATA #IMPLIED
newid IDREF #IMPLIED
oldid IDREF #IMPLIED
oldrev CDATA #IMPLIED
setrev CDATA #IMPLIED>
Specifies how this newsitem affects earlier newsitems - i.e. how the elements of this newsitem interact with the elements of other newsitems which may or may not be earlier versions of the same newsitem.
Actions are processed sequentially and may change the state of earlier newsitems received. When an element is added it is added as the last child of the nominated parent.
If no action element is present then
{ if no newsitem exists with this itemid then
    { this newsitem is taken to be a new newsitem }
  if a newsitem exists with this itemid then
    { replace all elements with the same id in the old newsitem
      with the corresponding ones in the new newsitem, and add the
      elements in the new newsitem which do not have corresponding
      elements in the old one }
}
else // an action element is present
{ if the revision of the old newsitem (identified by the itemid
  attribute of the action, if present, or otherwise by the itemid
  attribute of the newsitem the action is in) is equal to the
  oldrev attribute then
    { set the revision of the old newsitem to the setrev attribute
      and carry out the action determined by the action attribute:
      a ADD the element identified by newid to the element identified by oldid
      b DELETE the element identified by oldid, or if no oldid is given delete the item
      c REPLACE the element identified by oldid with the element identified by newid
    }
  else
    { ignore the action element }
}
XPointer would be nice for this.
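The processing rules for action can be sketched in Python with a small in-memory store. This is one possible reading rather than a reference implementation: the store (a dict of itemid to element), the function names, the treatment of an absent oldrev as matching any revision, and the appending of unmatched elements to the end of the old newsitem are all assumptions.

```python
import xml.etree.ElementTree as ET


def find_by_id(root, elem_id):
    """Return (parent, element) for the element whose id attribute is elem_id."""
    if elem_id is None:
        return None, None
    if root.get("id") == elem_id:
        return None, root
    for parent in root.iter():
        for child in parent:
            if child.get("id") == elem_id:
                return parent, child
    return None, None


def replace_child(parent, old, new):
    """Put new where old was among parent's children."""
    index = list(parent).index(old)
    parent.remove(old)
    parent.insert(index, new)


def process(store, item):
    """Apply one received newsitem to store, a dict of itemid -> Element."""
    action = item.find("handling/action")
    if action is None:
        old = store.get(item.get("itemid"))
        if old is None:
            store[item.get("itemid")] = item  # a new newsitem
            return
        for child in list(item):
            parent, match = find_by_id(old, child.get("id"))
            if match is not None and parent is not None:
                replace_child(parent, match, child)
            else:
                old.append(child)  # assumption: unmatched elements go last
        return

    target_id = action.get("itemid") or item.get("itemid")
    old = store.get(target_id)
    if old is None:
        return
    oldrev = action.get("oldrev")
    # Assumption: an absent oldrev matches any revision.
    if oldrev is not None and oldrev != old.get("revision", "0"):
        return  # ignore the action element
    if action.get("setrev") is not None:
        old.set("revision", action.get("setrev"))

    kind = action.get("action", "add")
    parent, old_elem = find_by_id(old, action.get("oldid"))
    _, new_elem = find_by_id(item, action.get("newid"))
    if kind == "delete":
        if action.get("oldid") is None:
            del store[target_id]  # no id given: delete the whole item
        elif parent is not None:
            parent.remove(old_elem)
    elif kind == "add" and old_elem is not None and new_elem is not None:
        old_elem.append(new_elem)  # added as the last child of the parent
    elif kind == "replace" and parent is not None and new_elem is not None:
        replace_child(parent, old_elem, new_elem)
```

Calling process once per received newsitem, in arrival order, mirrors the statement that actions are processed sequentially.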
newsitemparts fulfill different roles in stories. It will be useful to have some standard nomenclature for this.
Here are some suggestions for some roles:
primary | the principal thread of the newsitem (every newsitem must have one …?)
secondary | supporting material
tertiary | etc.
side bar | a piece of explanatory text
box | some newsobject that relates to the main story, e.g. a picture, a piece of box text etc.
navigation bar | a set of hyperlinks related to the newsitem
smil | a definition of how the parts are to be played relative to each other if not simultaneously
logo |
map |
separation-color |
newsobjects contain the variant attribute which explains the reason the object is present as an alternative for the part (especially if this is not evident from the other attributes, which implicitly provide choice in terms of size, duration, dimensions, language and format).
It might be useful to have some standard labels. Ideas are:
Very Fast Modem
Fast Modem
Slow Modem
Low Color
High Color
…
Example: Simple Story Encoding
This is the type of story written by Reuters journalists, and stored and delivered by RBB and RBB select. A number of features are available which are not used at present but which would be simple to include in the editorial process (e.g. markup of byline).
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="../newsitem.css"?>
<!DOCTYPE newsitem SYSTEM "../newsml.dtd">
<newsitem itemid="19990526000012" date="1999-05-28" publisher="moonlite.tibcofinance.com/newsml" xml:lang="en-us"
parts="1" revision="0">
<title>DTDs for Today's world</title>
<headline>DTDs for Today's world</headline>
<byline>by Ernest Dull</byline>
<dateline>Palo Alto, May 26 (Reuters)</dateline>
<text>
<p>Many people wonder why DTDs are necessary</p>
<p>Blah Blah Blah</p>
</text>
<copyright>(c) 1999 Dull, Ernest Tech Mags Inc.</copyright>
</newsitem>
This is inline coding of the content. Where the content is referenced rather than included this would appear as:
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="../newsitem.css"?>
<!DOCTYPE newsitem SYSTEM "../newsml.dtd">
<newsitem itemid="19990526000012" date="1999-05-28" href="http://moonlite.tibcofinance.com/newsml/stories/9906DTD/simple story.xml">
<title>DTDs for Today's world</title>
</newsitem>
Example: Multiple Part Story Encoding
This is the type of thing Galaxy is trying to convey: parts and their alternatives. Usually the parts are not present inline, although some textual parts may be. The application has an unusual form of URL: it uses the custom protocol “slug” to denote retrieval of the multimedia content from the parallel delivery environment (this is specific to the Galaxy project).
<?xml version="1.0"?>
<!DOCTYPE newsitem SYSTEM "http://moonlite.tibcofinance.com/newsml/dtds/9906dtd.xml">
<newsitem itemid="some unique id" date="1999-06-30T17:45" publisher="reuters.com/newmedia/galaxy/international">
<title>Political Groups Gather in India</title>
<copyright>
(c)1999 Reuters Limited. All rights reserved. Republication or redistribution of Reuters content, including by framing or similar means, is expressly prohibited without the prior written consent of Reuters.
</copyright>
<newsitempart role="main">
<newsobject variant = "audio" mimetype = "audio/x-pn-realaudio" href = "india-protest.ram"/>
<newsobject variant = "fast modem" mimetype = "video/x-ms-asx" href = "india-protest56.asx"/>
<newsobject variant = "slow modem" mimetype = "video/x-ms-asx" href = "india-protest28.asx"/>
</newsitempart>
<newsitempart role="thumbnail">
<newsobject mimetype="image/jpeg" height = "120" width = "160" href = "india-protest.jpg"/>
</newsitempart>
<caption>Political Groups Gather In India To Protest Pakistan's Armed Incursion Into Kashmir Region</caption>
<credit>Reuters Television News</credit>
<handling>
<slug>international-kashmir-leadall</slug>
</handling>
</newsitem>
Example: Categorized Story
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="../newsitem.css"?>
<!DOCTYPE newsitem SYSTEM "../newsml.dtd">
<newsitem itemid="DX9307030296" date="1993-07-03">
<title>FRANCE:TRAVEL - LE PARC JURASSIQUE?</title>
<headline>TRAVEL - LE PARC JURASSIQUE?</headline>
<byline>By JASPER GERRARD.</byline>
<text>
<p>Already subjected to Euro Disney, France is poised for its second American theme park. Universal Studios - whose dinosaur movie Jurassic Park is packing them in - is considering sites for its latest park. Warm favourite is a park between Euro Disney and the centre of Paris.</p>
<p>"By the time the park is built we are confident there will be demand for it," says Christine Hanson, vice-president of corporate affairs for the studios, whose Florida base is next to Disney World.</p>
<p>Euro Disney is taking a positive line. "Universal hasn't spoken to us about it, but I think it will encourage tourists to stay for full-length holidays," ventured a spokesman.</p>
</text>
<copyright>(c) The Telegraph plc, London, 1993. </copyright>
<metadata>
<codes class="bip:industry"> <code code="I97412"/> <code code="I971"/> <code code="I3454"/> </codes>
<codes class="bip:country"> <code code="FRA"/> </codes>
<codes class="bip:topic"> <code code="C24"/> </codes>
</metadata>
</newsitem>
Example: Categorization with Corrections
<?xml version="1.0" ?>
<!DOCTYPE newsitem SYSTEM "../newsml.dtd">
<?xml-stylesheet type="text/css" href="../newsitem.css"?>
<newsitem itemid="xxxxx" date="1999-07-02">
<title>Example use of code corrections</title>
<metadata>
<codes class="rbb:country:9905">
<!-- this code is present, added by Cat 99 -->
<code code="AFGH" confidence = "87" >
<editdetail attribution="Magic Categorizer Version 0.9" agent="auto" action="added" date="1999-05-02" />
</code>
<!-- this code is absent: first it was added by Cat 99, then removed by a human coder. Note that the confidence value is lost; is this correct, or should the confidence value be preserved for posterity? -->
<code code="ABDI" present = "false">
<editdetail attribution="Magic Categorizer Version 0.9" agent="auto" action="added" date="1999-05-02" />
<editdetail attribution="champ coder, Reuters" action="removed" date="1999-05-03" agent="human"/>
</code>
<!-- this code is present because according to some mapping it corresponds to an externally supplied code -->
<code code="C12" confidence="44">
<editdetail attribution="N2000/RBB Mapper Version 99.999" agent="map" action="added" date="1999-05-02" />
</code>
<!-- this code is present because C12 is -->
<code code="C1">
<editdetail attribution="expando ruleset version 9" agent="expansion" expansion="C12" action="added" date="1999-05-02" />
</code>
</codes>
</metadata>
</newsitem>
Example: A Kill
<newsitem itemid="whatever" date="whatever">
<handling>
<action itemid="itemidtokill" action="delete"/>
</handling>
</newsitem>
Example: A picture with separations
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="../newsitem.css"?>
<!DOCTYPE newsitem SYSTEM "../newsml.dtd">
<newsitem itemid="19990526000012" date="1999-05-28"
publisher="moonlite.tibcofinance.com/newsml" xml:lang="en-us"
parts="1" revision="0">
<title>DTDs for Today's world</title>
<newsitempart role="primary">
<caption>A nice picture with 3 separations</caption>
<newsitempart role="separation-green"><newsobject mimetype="image/jpeg" href="anurl"/></newsitempart>
<newsitempart role="separation-red"><newsobject mimetype="image/jpeg" href="anurl"/></newsitempart>
<newsitempart role="separation-blue"><newsobject mimetype="image/jpeg" href="anurl"/></newsitempart>
</newsitempart>
</newsitem>
Extensible Markup Language (XML), http://www.w3.org/TR/REC-xml
Dublin Core Metadata for Resource Discovery, Weibel, Kunze, Lagoze & Wolf, RFC 2413, September 1998, http://purl.org/metadata/dublin_core
Tags for the Identification of Languages, Alvestrand, RFC 1766, March 1995.
Date and Time Formats (based on ISO 8601), W3C Technical Note, http://www.w3.org/TR/NOTE-datetime
IPTC-NAA Information Interchange Reference Model Version 4, 1997
XN-2 NewsML Requirements
XN-3 NewsML Functions
XN-6 Cross Reference from IIM to NewsML
XN-8 NewsML Encoding Principles
NewsML 1999-10-12 DTD