NewsML Elements and Attributes

XN-5

Jo Rabin

Revision 7

12th November 1999

draft
Accompanies 1999-11-10 DTD

Describes the functions of NewsML

Copyright © Reuters Limited 1999 All Rights Reserved


Introduction

This document describes the elements and attributes of the NewsML DTD.

The requirements of NewsML are described in XN-2 and the intended functionality in XN-3. XN-8 describes various encoding choices that underpin the formulation of the DTD.

Revision History

Revision 7 - 12th November 1999

Changes to accompany 1999-11-10 DTD (remove ambiguous content model).

Revision 6 - 22nd October 1999

Removed material now in XN-3 and XN-8. Corrections and additions to synchronise with 1999-10-12 DTD.

Contents

Introduction

Revision History

Contents

Structural Elements

newsitem

newsitempart

Newslines

sourcedata

newsobject

data

text

p

link

records

record

field

Metadata Elements

codes

code

things

thing

altthings

editdetail

thinglocation

name

dc

News Management Elements

handling

slug

product

service

routing

instructions

priority

urgency

status

permissions

cycle

outcue

action

Attribute values

Roles

Variants

xml:lang

things.class

codes.class

Examples

Example - Simple Story Encoding

Example - Multiple Part Story Encoding

Example - Categorized Story

Example - Categorization with Corrections

Example – A Kill

Example – A picture with separations

References

Standards

NewsML References

 


Structural Elements

newsitem

<!ELEMENT newsitem       (title+,
                     %newslines;,
                     ((newsitempart+ | newsobject | text),
                     %newslines;)?,
                     metadata?,
                     handling*,
                     sourcedata?)>

A newsitem consists of one or more titles, followed by any number of newslines in any order, optionally followed by some content and then any number of further newslines (tagline, copyright or citation) in any order, followed by optional metadata, any number of handling elements and finally optional sourcedata.

The ordering of these items is accidental and results from limitations of DTDs. Hence the presence of %newslines; twice, which allows publishers, for example, to have headlines preceding the content and copyright notices trailing it.

The content can be:

·        One or more newsitemparts - for composite newsitems.

·        One newsobject - this allows for newsitems that are e.g. a picture alone.

·        One in-line text element - this allows for trivial textual story encodings.

Duplicated newslines are expected to contain different versions of the element for different languages. If two elements of this kind specify the same language the content of the later element takes precedence.
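The precedence rule above can be sketched in Python. This is a minimal illustration, assuming the newslines have already been parsed into (element name, language, text) tuples in document order; the function name and the tuple layout are hypothetical and not part of NewsML.

```python
def resolve_newslines(newslines):
    """Keep one text per (element name, language); later entries win."""
    resolved = {}
    for name, lang, text in newslines:
        # A later element with the same name and language replaces the earlier one.
        resolved[(name, lang)] = text
    return resolved


# Illustrative input: two English headlines and one French headline.
lines = [
    ("headline", "en", "First English headline"),
    ("headline", "fr", "Titre en francais"),
    ("headline", "en", "Second English headline"),
]
result = resolve_newslines(lines)
```

Here the second English headline replaces the first, while the French headline is kept alongside it.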

Attributes:

Attribute Name | Presence | Format | Comment
itemid | Required | Any | Uniquely identifies this newsitem in the publisher's domain.
date | Required | ISO Date | A date associated with the story. The kind of date is not defined; it can be the story creation date, the publication date, etc.
id | Optional | ID | Identifies this element.
revision | Default 0 | Integer | The higher the number, the later the revision.
publisher | Optional | URL | A means of disambiguating the id attribute and hence making it unique. Other data about the publisher, if needed, should be encoded as metadata.
xml:lang | Optional | RFC 1766 | Sets the default language for the newsitem; indicates that the story is intended especially for people who wish to read this language.
href | Optional | URL | Information about where to get the story, and hence where to get elements that have not been included. Always provides the latest revision of the story.
parts | Default 1 | Integer | How many parts there are in this newsitem. The number of parts actually present may differ, as this figure is the total number of parts in the newsitem.

newsitempart

<!ELEMENT newsitempart   (%newslines;,
                     ((newsitem | newsobject+ | newsitempart+),
                     %newslines;)?,
                     metadata?,
                     sourcedata?)>

Note: as above, the same structure of Newslines is used and the same semantics are imputed to repeated elements.

A newsitempart consists of any number of Newslines in any order, optionally followed by some content and then any number of further Newslines (tagline, copyright or citation) in any order, followed by optional metadata and finally optional sourcedata.

The content can be:

·        One newsitem - this allows the construction of lists of stories.

·        One or more newsobjects - this provides the mechanism by which a number of alternatives fulfilling the same role in the newsitem may be listed.

·        One or more newsitemparts

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
role | Required | Named role | Some systematic means of identifying the role that a newsitempart can play in a story. There may be a number of schemes for this, in which case it will be necessary to have some kind of namespace mechanism to distinguish them. Some thoughts about roles are detailed at the end of this document.
order | Optional | Integer | A precedence order for parts may be given by specifying a number in this attribute. Parts with this attribute specified have decreasing precedence the higher the value; the highest precedence is 0. Parts which do not specify this attribute have lower precedence than any part that does, and their precedence order among themselves is inferred from their order of presentation in the XML encoding.
alternatives | Optional | true or false | This attribute affects the interpretation of parts embedded in parts. Objects embedded in parts are always alternatives to each other. When parts are embedded, they are considered alternatives to each other, irrespective of the value of their role attribute, if this attribute is true. They are complements to each other if this attribute is false.
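The precedence rule for the order attribute can be sketched in Python. This is a minimal illustration, assuming each part has been reduced to a dict with an optional integer "order" key and that the input list is in document order. Breaking ties between parts with the same order value by document order is an assumption, since the text above does not address that case.

```python
def sort_parts(parts):
    """Return parts in precedence order, highest precedence first."""
    def key(indexed):
        index, part = indexed
        order = part.get("order")
        if order is None:
            # Parts without an order rank after every ordered part,
            # and keep their document order among themselves.
            return (1, index, index)
        # Lower order values mean higher precedence; 0 is the highest.
        return (0, order, index)

    return [part for _, part in sorted(enumerate(parts), key=key)]


# Illustrative parts; the role values are placeholders.
parts = [
    {"role": "a"},
    {"role": "b", "order": 2},
    {"role": "c", "order": 0},
    {"role": "d"},
]
ordered_roles = [part["role"] for part in sort_parts(parts)]
```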

 

Newslines

All Newslines have PCDATA content and id and xml:lang attributes.

sourcedata

<!ELEMENT sourcedata       (#PCDATA)>

 

This element allows the transport of any XML compatible data or element structures. It is provided for applications that wish to take advantage of the capabilities of NewsML but require additional application semantics beyond those developed for news. Sourcedata is not intended to extend NewsML semantics in an ad hoc way; it is for the expression of other semantics (quotes or whatever).

With the arrival of namespaces it will not be necessary to have this element and it will be removed in a later version of the spec. You have been warned!

Attribute Name | Presence | Format | Comment
encoding | Optional | Mimetype | What encoding has been used.
compression | Optional | Mimetype | What compression has been applied.

 

newsobject

<!ELEMENT newsobject     (%Newslines;,
                     ((data | text),
                     %Newslines;)?,
                     metadata?,
                     sourcedata?)>

Note: once again same structure of Newslines.

A newsobject consists of any number of Newslines in any order, optionally followed by some content, optionally followed by any number of tagline copyright or citation - in any order, optional metadata and finally optional sourcedata.

The content can be one of:

·        One data element - this allows the in-line inclusion of non-textual NewsML encoded material.

·        One text element - this provides the means for including NewsML content encoding in-line.

Note that this is a prime area for turning into RDF.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
mimetype | Required | MIME specification | What sort of object this is. If it is NewsML content encoding then this is "text/x-newstext".
mediatype | Optional | Enumerated | Identifies what type of object it is, e.g. a GIF may be an animation or an image.
variant | Optional | String | The reason the object is present as an alternative, especially if this is not apparent from the other attributes.
xml:lang | Optional | RFC 1766 | Language, and variant if relevant.
href | Optional | URL | Where to get the content from if it is not included in-line as data or text.
height | Optional | Integer | Vertical space occupied by the object, if relevant.
width | Optional | Integer | Horizontal space needed by the object, if relevant.
size | Optional | Integer | The size in bytes of the object if it is specified as a URL.
duration | Optional | Integer | The time it takes to experience the object, if this is relevant.
colordepth | Optional | Integer | How many colors.
characterset | Optional | String | Which character encoding is used (not which alphabet).
bandwidthtostream | Optional | Integer | Minimum number of bits per second of sustained throughput required to be able to stream this object.

 

data

<!ELEMENT data          (#PCDATA)>

 

The data element contains in-line content that has been encoded to meet the requirements of XML in respect of valid characters for PCDATA. The format of the data packaged in this element is described in the containing newsobject element.

The data attributes describe what compression has been applied, followed by what encoding scheme was applied to the compressed result.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
encoding | Required | Mimetype | Text/plain denotes none.
compression | Optional | Mimetype | What compression has been used.
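Because compression is applied first and encoding second, a receiver has to undo them in the reverse order. The Python sketch below illustrates this under stated assumptions: the attribute values "base64" and "application/x-gzip" are hypothetical labels chosen for the example (this document only defines "text/plain" as meaning no encoding), and the function name is illustrative.

```python
import base64
import gzip


def unpack_data(pcdata, encoding="text/plain", compression=None):
    """Undo the encoding first, then the compression (the reverse of packing)."""
    if encoding == "text/plain":
        raw = pcdata.encode("utf-8")           # no encoding was applied
    elif encoding == "base64":                 # assumed label, not defined by this document
        raw = base64.b64decode(pcdata)
    else:
        raise ValueError(f"unsupported encoding: {encoding}")

    if compression is None:
        return raw
    if compression == "application/x-gzip":    # assumed label, not defined by this document
        return gzip.decompress(raw)
    raise ValueError(f"unsupported compression: {compression}")


# Pack some bytes the way a sender might: compress, then encode.
packed = base64.b64encode(gzip.compress(b"picture bytes")).decode("ascii")
restored = unpack_data(packed, encoding="base64", compression="application/x-gzip")
```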

 

text

<!ELEMENT text          (#PCDATA|p|link|records)* >

 

The text element allows the in-line encoding of textual content. The text can contain an arbitrary mixture of characters and p, link and records elements.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.

 

p

<!ELEMENT p          (#PCDATA|link)*>

 

The p element encapsulates text as a paragraph. Link elements can span text in paragraphs.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.

 

link

<!ELEMENT link          (#PCDATA)>

 

The link element denotes the text included in it as a hyperlink. More on this with the development of XLink.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
href | Required | URL | Where the link leads to.

 

records

 

Records identifies a data structure consisting of data that can also be laid out without having to be interpreted by a computer. It is present to satisfy, minimally, the need for textual data to be organized in some way without defining layout elements like table. Records is more general than table because it does not require its rows to have the same columns. The application attribute allows a receiving program to determine what kind of program to use to interpret the data present; several might be applicable, for example if the records contain closing prices then a graph application could be as applicable as a straightforward tabular layout. The intention is that systems that do not understand the data are able to make some attempt at rendering it (as a table structure).

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
application | Optional | String | Some means of identifying what kind of data this is so it can be rendered appropriately by relevant applications. This may be a stylesheet reference …

record

<!ELEMENT record       (field+)>

 

A container for field elements to be grouped together

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.

 

field

<!ELEMENT field         (#PCDATA)>

 

A container for data

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
name | Optional | String | A way of distinguishing fields from each other and identifying the data content.
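A system that does not understand the data can still attempt a tabular rendering, as described for records above. The Python sketch below shows one way to do that. It assumes that a records element directly contains record elements (no content model for records is given in this document), and the function name and sample field names are illustrative only.

```python
import xml.etree.ElementTree as ET


def render_records(xml_text):
    """Render a records element as plain-text rows, padding short records."""
    records = ET.fromstring(xml_text)
    rows = [[(field.text or "") for field in record.findall("field")]
            for record in records.findall("record")]
    # Rows need not have the same number of fields, so pad to the widest row.
    width = max((len(row) for row in rows), default=0)
    lines = []
    for row in rows:
        padded = row + [""] * (width - len(row))
        lines.append(" | ".join(padded))
    return "\n".join(lines)


# Illustrative input: the second record has fewer fields than the first.
sample = (
    "<records>"
    "<record><field name='name'>ABC</field><field name='close'>101.5</field></record>"
    "<record><field name='name'>XYZ</field></record>"
    "</records>"
)
table = render_records(sample)
```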

 

 


Metadata Elements

codes

<!ELEMENT codes         (code*)>

 

Contains code elements that indicate the applicability of the code they contain to the content of the entity to which this metadata is attached.

It is intended that no two codes elements have the same class/role values.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
class | Optional | String | Identifies the scheme that the codes in the contained elements come from; see the section describing use of the class attribute.
publisher | Optional | URL | A means of disambiguating the class attribute; if absent, the value is inferred from the publisher attribute of newsitem.
role | Optional | String | Identifies the role that the codes fulfill, e.g. country codes can be used both for the role "location" and for the role "topic".

 

code

<!ELEMENT code          (name|editdetail)*>

 

Indicates that the code identified applies to the entity to which this metadata is attached. The code element can contain name elements – to describe the code – and editdetail elements to describe the history of the application of the code.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
code | Required | String | The code identifier.
confidence | Optional | String | A measure of the confidence that the code applies. This means of indicating confidence requires further study.
present | Default “true” | true or false | If true the code applies; if false it does not. The false value is required so that the contained elements can be preserved (i.e. where a code had been applied but has now been removed).
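The effect of the present attribute can be sketched in Python. This is a minimal illustration, assuming a codes element is supplied as a string; the function name is hypothetical, and treating the value case-insensitively is an assumption.

```python
import xml.etree.ElementTree as ET


def applicable_codes(codes_xml):
    """Return the code values whose present attribute is not "false"."""
    codes = ET.fromstring(codes_xml)
    return [code.get("code")
            for code in codes.findall("code")
            # present defaults to "true" when the attribute is absent.
            if code.get("present", "true").lower() != "false"]


sample = (
    '<codes class="rbb:country:9905">'
    '<code code="AFGH" confidence="87"/>'
    '<code code="ABDI" present="false"/>'
    '<code code="C12" confidence="44"/>'
    '</codes>'
)
applied = applicable_codes(sample)
```

The removed code is still carried in the markup, so its editdetail history survives, but it is excluded from the list of codes that apply.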

 

things

<!ELEMENT things       (thing*)>

 

Contains elements indicating the presence of references to certain named entities.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
class | Required | String | Identifies the type or class of thing that is identified (e.g. person, place …); note the difference between a thing class and a code class. See the section describing use of the class attribute.
publisher | Optional | URL | The owner of the thingclass scheme name.

 

thing

<!ELEMENT thing         (name|thinglocation|editdetail)*>

 

The thing element denotes that the thing it identifies is present. The thing can be named by including a name element (one for each language). It can also contain thinglocation and editdetail elements.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
code | Optional | String | A code allocated to this thing.
codeclass | Optional | String | The classification scheme that the code comes from (see the section on the class attribute).
confidence | Optional | String | A measure of certainty as to the correctness of application of this thing.
present | Default true | true or false | False if the thing was applied then subsequently removed and the editdetail is important.

 

altthings

<!ELEMENT altthings            (things*)>

 

This element is used to bracket together a number of alternative interpretations of the same piece of text, e.g. where there is ambiguity as to whether the text "Britannia" refers to a company (a British building society, and also a British airline) or a person (the mythical personification of the British nation).

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.

 

editdetail

<!ELEMENT editdetail       EMPTY>

 

Some applications require an audit trail of changes to the codes and things applied to a story. The editdetail record is used to describe the changes.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
attribution | Optional | String | The name of the person or thing that made the change.
action | Default “added” | added, confirmed or removed | Whether the change was to assert or deny the applicability of the code or thing.
date | Optional | ISO | When the change took place.
agent | Default “unknown” | Enumeration | What type of thing made the change: human (a person); auto (a machine that inferred the applicability from the text of the item in question); map (a machine that created this code as a result of mapping from some other scheme); expansion (a machine that inferred this was required because some other code or thing in the same scheme is present).
score | Optional | String | A measure whose values are of significance to the entity that applied the code.
confidence | Optional | String | A measure of the confidence of application.
expansion | Optional | String | When agent is expansion, the value of the other code or thing that caused the expansion.

 

thinglocation

The thinglocation element points to an instance of the thing referenced in the containing thing element. This element only works for textual content at the moment, but should be generalized to reference other media.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
itemid | Optional | String | The newsitem the thing was found in.
idref | Optional | String | The element the thing was found in.
offset | Optional | Integer | Character position of the start of the reference in the text.
text | Optional | String | Actual text found at this location, if different from the canonical form found in the containing thing element.
length | Optional | Integer | How many characters in the source relate to the reference.
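The offset and length attributes together select a run of characters in the referenced element. The Python sketch below illustrates this, assuming the offset is zero-based and counts characters in the text of the element named by idref; this document does not state the base of the offset, so that is an assumption, as is the function name.

```python
def located_text(element_text, offset, length):
    """Return the characters a thinglocation points at (zero-based offset assumed)."""
    return element_text[offset:offset + length]


# Illustrative paragraph text and a reference to its first nine characters.
paragraph = "Britannia announced its results today."
found = located_text(paragraph, 0, 9)
```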

 

name

The name element is contained by a thing or code element and is used to name the thing or code.

Attribute Name | Presence | Format | Comment
id | Optional | ID | Identifies the element.
xml:lang | Optional | RFC 1766 | Language the name is in.

 

dc

<!ELEMENT dc EMPTY>

<!ATTLIST dc       id            ID              #IMPLIED
                   element       %dc.elements;   #REQUIRED
                   value         CDATA           #REQUIRED
                   xml:lang      NMTOKEN         #IMPLIED>

 

The dc element allows the assignment of values to specific refinements of the Dublin Core scheme. Example usage is:

<dc element="dc.date.expires" value="1999-07-22T00:00Z"/>

(midnight 21/22 July 1999)

<dc element="dc.world.ends" value="2000-01-01T00:00-05:00"/>

(millennium midnight in New York)

 

Syndication model:

The creator is the person or thing that created the object. The values of the creator elements do not change with the lifecycle of the newsitem. The values of publisher and source do change with the lifecycle. When the newsitem is published the publisher’s details are recorded using the publisher elements. If the newsitem is syndicated, each subsequent publisher replaces the values they received with their own details. The original publisher’s details are transferred to the source elements. If the newsitem is republished multiple times, only the original publisher and the last publisher are represented (using the source and publisher elements respectively).
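The syndication rule can be sketched in Python. This is a minimal illustration, assuming the dc values of a newsitem are held in a dict keyed by element name and collapsing the publisher and source element families to a single dc.publisher and dc.source entry each; the function name and the publisher values are placeholders.

```python
def republish(metadata, new_publisher):
    """Apply the syndication rule when a newsitem is published again."""
    updated = dict(metadata)
    previous = updated.get("dc.publisher")
    if previous is not None and "dc.source" not in updated:
        # Only the original publisher is ever recorded as the source.
        updated["dc.source"] = previous
    # The current publisher always replaces the details it received.
    updated["dc.publisher"] = new_publisher
    return updated


# Three successive publications of the same item.
first = republish({"dc.creator.name": "A Creator"}, "Publisher One")
second = republish(first, "Publisher Two")
third = republish(second, "Publisher Three")
```

After the third publication the creator is unchanged, the source is still the first publisher, and the intermediate publisher is no longer represented.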

The element attribute can take the following values (closer definition is needed, especially of allowable values, character sets, number of instances and so on; see the Reuters Metadata Standard for how this can be laid out):

dc.title

The title of the entity. This is a different semantic to the Title element (intended for display in a list of similar titles). This may not be needed.

dc.creator.name

The name of the person who created the entity.

dc.creator.title

The person’s role in their organization (e.g. Foreign Correspondent)

dc.creator.location

The main location of the creator while creating the entity

dc.creator.location.city

The name of the city if applicable

dc.creator.location.sublocation

The name of the place in the city if applicable (e.g. Madison Square Garden)

dc.creator.location.stateOrProvince

The name of the sub division of the country if applicable  e.g. New York; Andalucia; Cape Province

dc.creator.location.country.code

As ISO 3166 with IPTC extensions for at sea, in the air, in outer space and so on.

dc.creator.location.country.name

the name of the country according to some scheme and in some language

dc.creator.phone

International format how to call creator

dc.creator.email

Email address from internet

dc.creator.program

The program and version used to create the entity

dc.date.created

When originally created according to UTC. i.e. when the first part of the story was developed. Specify date hours and mins.

dc.date.lastModified

The date the latest part of the story was created.

dc.date.converted

The date the entity was converted to its present medium.

dc.publisher

The name of the publishing organization.

dc.publisher.provider

The relevant department of the publisher (bureau name/desk?)

dc.publisher.contact.name

Someone to speak to at the publisher.

dc.publisher.contact.email

How to contact them.

dc.publisher.contact.phone

 

dc.publisher.contact.title

Who they are.

dc.publisher.location

Where they are. Need to figure whether this should really be a time zone? Otherwise is it a country or a city or what?

dc.publisher.graphic

The publisher’s logo. This references a URL? Or a part of the attached story?

dc.coverage.start

The earliest date referred to in the newsitem.

dc.coverage.end

The last date referred to in the newsitem.

dc.coverage.period

If continuous period use start date and period.

dc.relation.obsoletes

The itemid of a news item that this item obsoletes (replaces) other than earlier versions of itself.

dc.relation.includes

The item id of a news item that this item includes as part of its content.

dc.relation.references

The item id of a news item that this item makes reference to.

dc.date.published

When the item was first published.

dc.date.live

When it is permissible to use the content (embargo)

dc.date.statuschanges

When some aspect of it is going to change – e.g. its permissions may change. (not expiry).

dc.date.expires

When it is no longer permissible to use the item.

dc.source

The original publisher, if the publisher named in the “publisher” element was not the original publisher.

dc.source.contact

Who to contact there

dc.source.provider

What their dept is

dc.source.location

Where they are / their time zone.

dc.source.graphic

Their logo.

dc.source.date.published

When they published it.

dc.source.identifier

What their reference for it was.

dc.contributor.editor.name

Someone who edited the newsitem.

dc.contributor.captionwriter.name

Someone who wrote the caption.

dc.contributor.captionwriter.nanny.name

The name of the person who looked after the caption writer’s children while the caption writer was writing the caption.

 


News Management Elements

The news management features are all grouped together under the handling element. These elements, with the exception of action, are largely equivalent to IIM features. As such there seems to be something of a lack of clarity about why the elements are needed and how they might be used.

handling

<!ELEMENT handling       (routing | urgency | priority | slug |
                     action | status | product | service |
                     instructions | permissions | cycle | outcue )*>

 

The content elements of handling can appear in any order and as many times as needed. All elements have an optional ID attribute and xml:lang attribute. If the xml:lang attribute is missing the value is assumed inherited from handling. All content elements of handling allow content of #PCDATA.

slug

Summary document information and status, e.g. “international-kashmir-leadall”. Used for editorial purposes and may sometimes be used to link documents. Use of the slug for linking is deprecated; the NewsML itemid is used to identify newsitems and hence provide links.

product

Information from the provider to indicate which product this news is part of. cf. IIM 1:50

service

Information from the provider to indicate which service this news is part of. cf. IIM 1:30.

routing

Provide specific information relating to the way the news is routed

instructions

Editorial information not contained anywhere else – equivalent to IIM special Instructions e.g. Not available in Asia for copyright reasons.

priority

Routing priority

urgency

Editorial Urgency

status

Editorial status e.g. lead, MORE, “CORRECTION” and so on

permissions

To be used to indicate in some way what permissions attach to this news.

cycle

AM, PM or BC

outcue

The closing words of an audio part.

action

<!ELEMENT action EMPTY>

<!ATTLIST action   id            ID                        #IMPLIED
                   action        (add | delete | replace)  "add"
                   itemid        CDATA                     #IMPLIED
                   newid         IDREF                     #IMPLIED
                   oldid         IDREF                     #IMPLIED
                   oldrev        CDATA                     #IMPLIED
                   setrev        CDATA                     #IMPLIED>

 

Specifies how this newsitem affects earlier newsitems - i.e. how the elements of this newsitem interact with the elements of other newsitems which may or may not be earlier versions of the same newsitem.

Actions are processed sequentially and may change the state of earlier newsitems received. When an element is added it is added as the last child of the nominated parent.

     If no action element is present then
     {  if no newsitem exists with this itemid then
             { this newsitem is taken to be a new newsitem }
        if a newsitem exists with this itemid then
             { replace all elements with the same id in the old newsitem
               with the corresponding ones in the new newsitem, and add
               the elements of the new newsitem which do not have
               corresponding elements in the old one }
     }
     else   // an action element is present
     {  if the revision of the old newsitem (identified by the itemid
           attribute of the action, if present, or otherwise by the itemid
           attribute of the newsitem the action is in) is equal to the
           oldrev attribute then
             { set the revision of the old newsitem to the setrev attribute
               and carry out the action determined by the action attribute:
               a ADD: add the element identified by newid to the element identified by oldid
               b DELETE: delete the element identified by oldid, or if no oldid is given delete the item
               c REPLACE: replace the element identified by oldid with the element identified by newid
             }
        else
             { ignore the action element }
     }
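The action branch of the procedure above can be sketched in Python. This is an illustration under stated assumptions, not a definitive implementation: received newsitems are kept in a dict keyed by itemid (the `store`), the function names are hypothetical, a missing oldrev is taken to mean that no revision check is made (the kill example in this document carries no oldrev), a missing setrev leaves the revision unchanged, and merging a new newsitem into an existing one when no action is present is left out.

```python
import xml.etree.ElementTree as ET


def find_by_id(root, element_id):
    """Return (parent, element) for the element with this id, else (None, None)."""
    if root.get("id") == element_id:
        return None, root
    for parent in root.iter():
        for child in parent:
            if child.get("id") == element_id:
                return parent, child
    return None, None


def apply_action(store, new_item, action):
    """Apply one action element carried by new_item to the stored newsitems."""
    target_id = action.get("itemid") or new_item.get("itemid")
    old_item = store.get(target_id)
    if old_item is None:
        return
    old_rev = action.get("oldrev")
    # Assumption: with no oldrev, the revision check is skipped.
    if old_rev is not None and int(old_item.get("revision", "0")) != int(old_rev):
        return                                   # stale action: ignore it
    if action.get("setrev") is not None:
        old_item.set("revision", action.get("setrev"))

    kind = action.get("action", "add")
    old_id, new_id = action.get("oldid"), action.get("newid")
    _, new_elem = find_by_id(new_item, new_id) if new_id else (None, None)
    parent, old_elem = find_by_id(old_item, old_id) if old_id else (None, None)

    if kind == "add" and new_elem is not None and old_elem is not None:
        old_elem.append(new_elem)                # added as the last child
    elif kind == "delete":
        if old_id is None:
            del store[target_id]                 # no oldid: delete the whole item
        elif parent is not None:
            parent.remove(old_elem)
    elif kind == "replace" and new_elem is not None and parent is not None:
        index = list(parent).index(old_elem)
        parent.remove(old_elem)
        parent.insert(index, new_elem)


def receive(store, new_item):
    """Process a received newsitem; actions are applied in document order."""
    actions = new_item.findall("handling/action")
    if not actions:
        # Merging into an existing newsitem is omitted from this sketch.
        store.setdefault(new_item.get("itemid"), new_item)
        return
    for action in actions:
        apply_action(store, new_item, action)


store = {}
receive(store, ET.fromstring(
    '<newsitem itemid="a1" date="1999-11-12" revision="0"><title>T</title>'
    '<text id="t1"><p id="p1">old</p></text></newsitem>'))
receive(store, ET.fromstring(
    '<newsitem itemid="a1" date="1999-11-12"><title>T</title>'
    '<text><p id="p2">new</p></text><handling><action action="replace" '
    'oldid="p1" newid="p2" oldrev="0" setrev="1"/></handling></newsitem>'))
replaced_text = store["a1"].find("text/p").text
new_revision = store["a1"].get("revision")
receive(store, ET.fromstring(
    '<newsitem itemid="k1" date="1999-11-12"><title>Kill</title>'
    '<handling><action itemid="a1" action="delete"/></handling></newsitem>'))
```

The first message stores a new newsitem, the second replaces one paragraph and advances the revision, and the third removes the newsitem altogether.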

 

XPointer would be nice for this.


Attribute values

Roles

newsitemparts fulfill different roles in stories. It would be useful to have some standard nomenclature for this.

Here are some suggestions for some roles:

primary            the principal thread of the newsitem (every newsitem must have one …?)

secondary          supporting material

tertiary           etc.

side bar           a piece of explanatory text

box                some newsobject that relates to the main story, e.g. a picture, a piece of box text etc.

navigation bar     a set of hyperlinks related to the newsitem

smil               a definition of how the parts are to be played relative to each other if not simultaneously

logo

map

separation-color

Variants

newsobjects contain the variant attribute which explains the reason the object is present as an alternative for the part (especially if this is not evident from the other attributes, which implicitly provide choice in terms of size, duration, dimensions, language and format).

It might be useful to have some standard labels. Ideas are:

Very Fast Modem

Fast Modem

Slow Modem

Low Color

High Color

xml:lang

things.class

codes.class


Examples

Example - Simple Story Encoding

This is the type of story written by Reuters journalists, and stored and delivered by RBB and RBB select. A number of features are available which are not used at present but which would be simple to include in the editorial process (e.g. markup of byline).

<?xml version="1.0"?>

<?xml-stylesheet type="text/css" href="../newsitem.css"?>

<!DOCTYPE newsitem SYSTEM "../newsml.dtd">

<newsitem itemid="19990526000012" date="1999-05-28" publisher="moonlite.tibcofinance.com/newsml" xml:lang="en-us"

parts="1" revision="0">

<title>DTDs for Today's world</title>

<headline>DTDs for Today's world</headline>

<byline>by Ernest Dull</byline>

<dateline>Palo Alto, May 26 (Reuters)</dateline>

<text>

<p>Many people wonder why DTDs are necessary</p>

<p>Blah Blah Blah</p>

</text>

<copyright>(c) 1999 Dull, Ernest Tech Mags Inc.</copyright>

</newsitem>

 

This is inline coding of the content. Where the content is referenced rather than included this would appear as:

<?xml version="1.0"?>

<?xml-stylesheet type="text/css" href="../newsitem.css"?>

<!DOCTYPE newsitem SYSTEM "../newsml.dtd">

<newsitem itemid="19990526000012" date="1999-05-28" href="http://moonlite.tibcofinance.com/newsml/stories/9906DTD/simple story.xml">

<title>DTDs for Today's world</title>

</newsitem>

 


Example - Multiple Part Story Encoding

This is the type of thing Galaxy is trying to convey: parts and their alternatives. Usually the parts are not present inline, although some textual parts may be. The application has an unusual form of URL: it uses the custom protocol “slug” to denote retrieval of the multimedia content from the parallel delivery environment (this is specific to the Galaxy project).

<?xml version="1.0"?>

 

<!DOCTYPE newsitem SYSTEM "http://moonlite.tibcofinance.com/newsml/dtds/9906dtd.xml">

 

<newsitem itemid="some unique id" date="1999-06-30T17:45" publisher="reuters.com/newmedia/galaxy/international">

<title>Political Groups Gather in India</title>

<copyright>

(c)1999 Reuters Limited. All rights reserved. Republication or redistribution of Reuters content, including by framing or similar means, is expressly prohibited without the prior written consent of Reuters.

</copyright>

 

<newsitempart role="main">

   <newsobject variant = "audio" mimetype = "audio/x-pn-realaudio" href = "india-protest.ram"/>

   <newsobject variant = "fast modem" mimetype = "video/x-ms-asx" href = "india-protest56.asx"/>

   <newsobject variant = "slow modem" mimetype = "video/x-ms-asx" href = "india-protest28.asx"/>

</newsitempart>                  

<newsitempart role="thumbnail">

   <newsobject mimetype="image/jpeg" height = "120" width = "160" href = "india-protest.jpg"/>

 

</newsitempart>

 

<caption>Political Groups Gather In India To Protest Pakistan's Armed Incursion Into Kashmir Region</caption>

<credit>Reuters Television News</credit>

<handling>

<slug>international-kashmir-leadall</slug>

</handling>

 

</newsitem>

 


Example - Categorized Story

<?xml version="1.0"?>

<?xml-stylesheet type="text/css" href="../newsitem.css"?>

<!DOCTYPE newsitem SYSTEM "../newsml.dtd">

<newsitem itemid="DX9307030296" date="1993-07-03">

<title>FRANCE:TRAVEL - LE PARC JURASSIQUE?</title>

<headline>TRAVEL - LE PARC JURASSIQUE?</headline>

<byline>By JASPER GERRARD.</byline>

<text>

<p>Already subjected to Euro Disney, France is poised for its second American theme park. Universal Studios - whose dinosaur movie Jurassic Park is packing them in - is considering sites for its latest park. Warm favourite is a park between Euro Disney and the centre of Paris.</p>

<p>"By the time the park is built we are confident there will be demand for it," says Christine Hanson, vice-president of corporate affairs for the studios, whose Florida base is next to Disney World.</p>

<p>Euro Disney is taking a positive line. "Universal hasn't spoken to us about it, but I think it will encourage tourists to stay for full-length holidays," ventured a spokesman.</p>

</text>

<copyright>(c) The Telegraph plc, London, 1993. </copyright>

<metadata>

<codes class="bip:industry"> <code code="I97412"/> <code code="I971"/> <code code="I3454"/> </codes>

<codes class="bip:country"> <code code="FRA"/> </codes>

<codes class="bip:topic"> <code code="C24"/> </codes>

</metadata>

</newsitem>

 


Example - Categorization with Corrections

<?xml version="1.0" ?>

<!DOCTYPE newsitem SYSTEM "../newsml.dtd">

<?xml-stylesheet type="text/css" href="../newsitem.css"?>

<newsitem itemid="xxxxx" date="1999-07-02">

<title>Example use of code corrections</title>

<metadata>

<codes class="rbb:country:9905">

   <!-- this code is present added by Cat 99 -->

   <code code="AFGH" confidence = "87" >

       <editdetail attribution="Magic Categorizer Version 0.9" agent="auto" action="added" date="1999-05-02" />

   </code>

 

   <!-- this code is absent, first it was added by Cat 99, then removed by an autocoder, note that confidence value is lost is this correct, or should the confidence value be preserved for posterity?-->

   <code code="ABDI" present = "false">

       <editdetail attribution="Magic Categorizer Version 0.9" agent="auto" action="added" date="1999-05-02" />

       <editdetail attribution="champ coder, Reuters" action="removed" date="1999-05-03" agent="human"/>

   </code>

 

   <!-- this code is present because according to some mapping it corresponds to an externally supplied code -->

   <code code="C12" confidence="44">

       <editdetail attribution="N2000/RBB Mapper Version 99.999" agent="map" action="added" date="1999-05-02" />

   </code>

 

   <!-- this code is present because C12 is -->

   <code code="C1">

       <editdetail  attribution="expando ruleset version 9" agent="expansion" expansion="C12" action="added" date="1999-05-02" />

   </code>

</codes>

</metadata>

</newsitem>

 

 


Example – A Kill

<newsitem itemid="whatever" date="whatever">

<handling>

   <action itemid="itemidtokill" action="delete"/>

</handling>

</newsitem>

 

Example – A picture with separations

<?xml version="1.0"?>

<?xml-stylesheet type="text/css" href="../newsitem.css"?>

<!DOCTYPE newsitem SYSTEM "../newsml.dtd">

<newsitem itemid="19990526000012" date="1999-05-28"

publisher="moonlite.tibcofinance.com/newsml" xml:lang="en-us"

parts="1" revision="0">

<title>DTDs for Today's world</title>

<newsitempart role="primary">

<caption>A nice picture with 3 separations</caption>

<newsitempart role="separation-green"><newsobject mimetype="image/jpeg" href="anurl"/></newsitempart>

<newsitempart role="separation-red"><newsobject mimetype="image/jpeg" href="anurl"/></newsitempart>

<newsitempart role="separation-blue"><newsobject mimetype="image/jpeg" href="anurl"/></newsitempart>

</newsitempart>

</newsitem>


References

Standards

Extensible Markup Language (XML), http://www.w3.org/TR/REC-xml

Dublin Core Metadata for Resource Discovery, Weibel, Kunze, Lagoze & Wolf, RFC 2413, September 1998, http://purl.org/metadata/dublin_core

Tags for the Identification of Languages, Alvestrand, RFC 1766, March 1995.

Date and Time Formats (based on ISO 8601), W3C Technical Note, http://www.w3.org/TR/NOTE-datetime

IPTC-NAA Information Interchange Reference Model Version 4, 1997

NewsML References

XN-2 NewsML Requirements

XN-3 NewsML Functions

XN-6 Cross Reference from IIM to NewsML

XN-8 NewsML Encoding Principles

NewsML 1999-10-12 DTD