[Apparently May 31, 2000 or later. From http://www.io.com/~laird/rss-in-ice.html; use this canonical URL/source if possible.]
RSS in ICE
This document is a follow-up to the discussion thread on the xml-dist-app
list between Laird Popkin and David Winer.
Comparison of RSS and ICE, and proposal on how to merge RSS' functionality
into ICE. Note that purely duplicating RSS' functionality spelled in terms
of ICE may not be the appropriate means of merging the standards, but this
should at the least serve as an interesting experiment in comparing the
protocols. In addition, the comparison to RSS reminds me of one of ICE's
original requirements, to allow a syndicator that simply delivers static
files to be a useful, valuable subset of ICE's functionality. This scenario
has not been at the top of the ICE AG's priority list, so this exercise
serves as a good reminder to the ICE AG, at the very least. The widespread
adoption of RSS by low-end syndicators to distribute promotional links
should serve as a clear indicator of the importance of this scenario in
the world of syndication.
Comparison
A brief comparison of the goals and functionality of RSS and ICE.
RSS
RSS is an evolution of CDF intended to allow web sites to easily broadcast
links to web pages in order to promote traffic to the web site. It was
originally intended to enable subscription to web "channels" for offline
browsing, but has proven valuable as a simple means of exchanging web content
between web sites. Thus, it provides for a relatively simple set of functionality
that is easily implementable:
-
delivery only of web pages
-
support of pull delivery of content
-
content delivered is linked to, not transported to the subscriber for display
-
simple scheduling
-
all content is identical for all subscribers
-
no access control
ICE
ICE is intended to support the full range of content syndication. It has
a number of commercial implementations, generally targeted at businesses
that wish to distribute or aggregate large volumes of content with numerous
business partners. Thus, it offers a range of options including:
-
delivery of arbitrary content types, including web pages
-
support of both push and pull delivery of content
-
access control and customization of content
-
Both "full update" and "incremental update" supported, with atomicity of
updates, recovery mechanisms for error conditions, etc.
-
content delivered in-line or by reference
-
confirmation of delivery
-
auditing of ICE operations
-
rich scheduling of ICE delivery
-
meta-data relating to intellectual property, usage rights, copyright, etc.
-
automated negotiation of delivery and other parameters
-
extensible
-
syndication discovery protocol
Note that while ICE offers these capabilities, it is not required that
they be used for any particular syndication. Beyond this, it appears valuable
to formally modify ICE and to define a subset of ICE that could be implemented
in order to provide comparable functionality to RSS, with comparable simplicity
of implementation. One can also imagine extending RSS to incorporate ICE's
functionality as a comparable phrasing of this exercise.
Recognizing that there is significant value in a single syndication
standard that encompasses the full range of online content syndication,
I propose a "straw man" proposal for expressing RSS functionality within
ICE, including modifications to the ICE protocol to support behavior identical
to RSS. I do not expect that this proposal would be adopted "as is," but
would like to use it as a starting point for an exploration of unifying
the two standards.
Mapping RSS into ICE
I first explore mapping the RSS XML elements into ICE. I based the
RSS on the My Netscape Network Help reference document, and the ICE on
the 1.1 draft currently in the ICE AG's egroups site.
RSS | ICE | Note |
<?xml version="1.0"?> | <?xml version="1.0"?> | standard XML label |
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd"> | <!DOCTYPE ice-payload SYSTEM "ice11.dtd"> | RSS mandates single, shared DTD. |
<rss version="0.91" encoding="ISO_8859-1"> | <ice-payload ice.version="1.1" payload-id="1"> | one top level element, required |
<channel> | <ice-offer> | one required |
<title>MozillaZine</title> | product-name="MozillaZine" | one required |
<description>Your source for Mozilla news, advocacy, interviews, and more!</description> | description="Your source for Mozilla news, advocacy, interviews, and more!" | one required |
<link>http://www.mozillazine.org</link> | ice-catalog's url attribute | one required |
<language>[ language code here ]</language> | ice-catalog's xml:lang attribute, or ice-package's xml:lang attribute | one required |
<rating>[ PICS rating here ]</rating> | add <rating> to ice-offer | one optional |
<copyright>Copyright 1999, Mozillazine.</copyright> | ice-offer's rights-holder="Copyright 1999, MozillaZine," | one optional, ignored by NetCenter |
<pubDate>Thu, 08 Jul 1999 07:00:00 GMT</pubDate> | ice-package attribute activation="1999-07-08T07:00:00" | one optional, ignored by NetCenter |
<lastBuildDate>Thu, 08 Jul 1999 16:20:26 GMT</lastBuildDate> | payload attribute timestamp="1999-07-08T16:20:26" | one optional, ignored by NetCenter |
<docs>http://my.userland.com/stories/storyReader$11</docs> | <ice-business-terms xml:lang="en-us" type="licensing" url="http://my.userland.com/stories/storyReader$11"></ice-business-terms> | one optional, ignored by NetCenter |
<managingEditor>sylv@thisdomain.com</managingEditor> | <ice-contact description="Managing Editor" xml:lang="en-us" name="Sylv" url="mailto://sylv@thisdomain.com"></ice-contact> | one optional, ignored by NetCenter |
<webMaster>sylv@thisdomain.com</webMaster> | <ice-contact description="Webmaster" xml:lang="en-us" name="Sylv" url="mailto://sylv@thisdomain.com"></ice-contact> | one optional, ignored by NetCenter |
<image> | add image-ref to ice-offer, based on ice-item-ref | one optional |
<title>MozillaZine</title> | ice-offer's product-name="MozillaZine", or image-ref's attribute name="MozillaZine" | one required if <image> |
<url>http://www.mozillazine.org/image/mynetscape88.gif</url> | image-ref's attribute url="http://www.mozillazine.org/image/mynetscape88.gif" | one required if <image> |
<link>http://www.mozillazine.org</link> | add url to ice-offer | one optional if <image> |
<width>[ numeric value here ]</width> | image-ref's attribute width="[ numeric value here ]" | one optional if <image> |
<height>[ numeric value here ]</height> | image-ref's attribute height="[ numeric value here ]" | one optional if <image> |
<description>Articles, discussions, builds, and more...</description> | ice-offer's attribute description="Articles, discussions, builds, and more..." | one optional if <image> |
<item> | <ice-item-ref> | 0-15 allowed |
<title>Java2 in Navigator 5?</title> | name="Java2 in Navigator 5?" | one required per <item> |
<link>http://www.mozillazine.org/talkback.html?article=607</link> | url="http://www.mozillazine.org/talkback.html?article=607" | one required per <item> |
<description>Will Java2 be an integrated part of Navigator 5? Read more about it in this discussion...</description> | <ice-item-ref><ice-access-control description="Will Java2 be an integrated part of Navigator 5? Read more about it in this discussion..." control-type="none"> | one optional per <item> |
<textinput> | add <textinput> as defined by RSS to ice-offer. | one optional, implemented by HTTP FORM GET |
<title>Send</title> | ditto | one required if <textinput> |
<description>Comments about MozillaZine?</description> | ditto | one required if <textinput> |
<name>[ value of name= attribute of input tag here ]</name> | ditto | one required if <textinput> |
<link>http://www.mozillazine.org/cgi-bin/sampleonly.cgi</link> | ditto | one required if <textinput> |
<skipHours> | ice-offer's ice-delivery-policy's ice-delivery-rule | up to 24, optional, ignored by NetCenter |
<hour>7</hour> | min-update-interval="P25200S" | optional, ignored by NetCenter |
<skipDays> | above | optional, ignored by NetCenter |
<days>Monday</days> | weekday="1" | up to 7, required if <skipDays> |
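As a rough illustration of how the channel-level rows of the mapping above might be applied mechanically, the following Python sketch reads an RSS channel and collects the corresponding ICE attributes. The attribute names come from the table; the `CHANNEL_MAP` dictionary, the `map_channel` function, and the shortened sample feed are assumptions made for this example and are not part of either specification.

```python
import xml.etree.ElementTree as ET

# Channel-level RSS elements and the ICE attribute each one maps to in the
# table above. The dictionary layout is only an illustration (hypothetical).
CHANNEL_MAP = {
    "title": ("ice-offer", "product-name"),
    "description": ("ice-offer", "description"),
    "link": ("ice-catalog", "url"),
    "language": ("ice-catalog", "xml:lang"),
    "copyright": ("ice-offer", "rights-holder"),
}


def map_channel(channel):
    """Return {ICE element: {attribute: value}} for one RSS <channel>."""
    mapped = {}
    for rss_name, (ice_element, ice_attr) in CHANNEL_MAP.items():
        node = channel.find(rss_name)  # direct children only, so <image><title> is skipped
        if node is not None and node.text:
            mapped.setdefault(ice_element, {})[ice_attr] = node.text.strip()
    return mapped


# Shortened sample feed, made up for this sketch.
RSS_SAMPLE = """<rss version="0.91"><channel>
<title>MozillaZine</title>
<link>http://www.mozillazine.org</link>
<description>Your source for Mozilla news</description>
<language>en-us</language>
<image><title>ignored here</title></image>
</channel></rss>"""

channel = ET.fromstring(RSS_SAMPLE).find("channel")
ice_attrs = map_channel(channel)
print(ice_attrs)
```

Optional RSS elements that are absent from the feed, such as copyright here, simply produce no ICE attribute.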
Example messages
The following is an example RSS message and a comparable ICE message. Note
that many RSS elements mapped above (based on the RSS specification) were
not used in the RSS messages that I have seen in production, and are thus
not reflected below. I believe that this makes the following messages fairly
realistic reflections of how typical RSS syndications can be encoded.
RSS Message
<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
"http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
<channel>
<title>MozillaZine</title>
<link>http://www.mozillazine.org</link>
<description>Your source for Mozilla news, advocacy, interviews, builds, and more! </description>
<language>en-us</language>
<rating>(PICS-1.1 "http://www.rsac.org/ratingsv01.html"
l gen true comment "RSACi North America Server"
for "http://www.rsac.org" on "1996.04.16T08:15-0500"
r (n 0 s 0 v 0 l 0))</rating>
<image>
<title>MozillaZine</title>
<url>http://www.mozillazine.org/image/mynetscape88.gif</url>
<link>http://www.mozillazine.org</link>
<width>88</width>
<height>31</height>
<description>Articles, discussions, builds, and more...</description>
</image>
<item>
<title>Java2 in Navigator 5?</title>
<link>http://www.mozillazine.org/talkback.html?article=607</link>
<description>Will Java2 be an integrated part of Navigator 5? Read more about it in this discussion...</description>
</item>
<item>
<title>Communicator 4.61 Out</title>
<link>http://www.mozillazine.org/talkback.html?article=606</link>
<description>The latest version of Communicator is now available. It includes security enhancements and various bug fixes.
</description>
</item>
<item>
<title>Mozilla Dispenses with Old, Proprietary DOM</title>
<link>http://www.mozillazine.org/talkback.html?article=604</link>
</item>
<item>
<title>The Animation Contest is Now Closed!</title>
<link>http://www.mozillazine.org/talkback.html?article=603</link>
</item>
<textinput>
<title>Send</title>
<description>Comments about MozillaZine?</description>
<name>responseText</name>
<link>http://www.mozillazine.org/cgi-bin/sampleonly.cgi</link>
</textinput>
</channel>
</rss>
ICE Message
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "ice11.dtd">
<ice-payload ice.version="1.1" payload-id="1" timestamp="1999-07-08T16:20:26">
<ice-header>
<ice-sender name="MozillaZine" role="syndicator" sender-id="[ GUID ]"
location="http://www.mozillazine.com/ice-get.xml"/>
<ice-user-agent>Mod::ICE-RSS/0.1</ice-user-agent> <!-- not required, but could be useful for support -->
</ice-header>
<ice-response response-id="1">
<ice-code numeric="200" phrase="OK"></ice-code>
<ice-package fullupdate="true" old-state="ICE-ANY" new-state="1999-07-08T16:20:26" package-id="1"
subscription-id="">
<ice-catalog name="MozillaZine's ICE/RSS server" url="http://www.mozillazine.org/" xml:lang="en-us">
<ice-image-link image-url="http://www.mozillazine.org/image/mynetscape88.gif" url="http://www.mozillazine.org"
height="31" width="88" description="Articles, discussions, builds, and more...">
</ice-image-link>
<ice-offer product-name="MozillaZine" description="Your source for Mozilla news, advocacy, interviews, builds, and more!">
<ice-delivery-policy>
<ice-delivery-rule mode="get" url="http://www.mozillazine.org/ice-rss.xml"/>
</ice-delivery-policy>
<rating>(PICS-1.1 "http://www.rsac.org/ratingsv01.html"
l gen true comment "RSACi North America Server"
for "http://www.rsac.org" on "1996.04.16T08:15-0500"
r (n 0 s 0 v 0 l 0))</rating>
<textinput> <!-- straight from RSS -->
<title>Send</title>
<description>Comments about MozillaZine?</description>
<name>responseText</name>
<link>http://www.mozillazine.org/cgi-bin/sampleonly.cgi</link>
</textinput>
</ice-offer>
</ice-catalog>
<ice-item-link name="Java2 in Navigator 5?" url="http://www.mozillazine.org/talkback.html?article=607"
description="Will Java2 be an integrated part of Navigator 5? Read more about it in this discussion...">
</ice-item-link>
<ice-item-link name="Communicator 4.61 Out" url="http://www.mozillazine.org/talkback.html?article=606"
description="The latest version of Communicator is now available. It includes security enhancements and various bug fixes.">
</ice-item-link>
<ice-item-link name="Mozilla Dispenses with Old, Proprietary DOM" url="http://www.mozillazine.org/talkback.html?article=604">
</ice-item-link>
<ice-item-link name="The Animation Contest is Now Closed!" url="http://www.mozillazine.org/talkback.html?article=603">
</ice-item-link>
</ice-package>
</ice-response>
</ice-payload>
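To make the item-level correspondence between the two messages concrete, here is a minimal Python sketch that turns each RSS item into an ice-item-link element of the kind proposed in this document. The function name `items_to_ice_links` and the shortened sample feed are assumptions for illustration; ice-item-link itself is a proposed element, not part of ICE 1.1.

```python
import xml.etree.ElementTree as ET


def items_to_ice_links(rss_text):
    """Convert each RSS <item> into a proposed <ice-item-link> element."""
    channel = ET.fromstring(rss_text).find("channel")
    links = []
    for item in channel.findall("item"):
        attrs = {
            "name": item.findtext("title", default="").strip(),
            "url": item.findtext("link", default="").strip(),
        }
        description = item.findtext("description")
        if description:  # description is optional per <item>
            attrs["description"] = description.strip()
        links.append(ET.Element("ice-item-link", attrs))
    return links


# Shortened sample feed, made up for this sketch.
RSS_SAMPLE = """<rss version="0.91"><channel>
<title>MozillaZine</title>
<item>
<title>Java2 in Navigator 5?</title>
<link>http://www.mozillazine.org/talkback.html?article=607</link>
<description>Will Java2 be an integrated part of Navigator 5?</description>
</item>
<item>
<title>Mozilla Dispenses with Old, Proprietary DOM</title>
<link>http://www.mozillazine.org/talkback.html?article=604</link>
</item>
</channel></rss>"""

ice_links = items_to_ice_links(RSS_SAMPLE)
for link in ice_links:
    print(ET.tostring(link, encoding="unicode"))
```

Because every RSS field becomes an attribute, each item collapses into a single empty element, which is the element-versus-attribute difference visible in the two messages.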
Observations
-
The ICE encoding tends towards fewer elements with attributes, whereas
the RSS encoding consists almost entirely of elements with no attributes.
Encoding with attributes rather than nested elements is substantially more
efficient, both in terms of code complexity and processing time, due to
the simpler data structures (DOM) or collection of relevant data into events
(SAX). We may want to re-express the textinput structure as attributes rather
than nested elements for consistency with the rest of ICE.
-
I am assuming that the ice-payload will typically be delivered as a static
file that is generated periodically, either on a schedule or triggered
by content changes. Thus, the timestamps would be the timestamp of the
last time the file was written, the payload id would simply increment,
and so on. Of course, if the file is generated dynamically (i.e. the GET
is to a cgi or Servlet) the entire process could be data driven. The subscriber
shouldn't care, but we should make sure that the trivial, static file option
is straightforward to implement.
-
ICE uses a specific form of ISO8601 time, date and period specification.
Since ICE specifies that (for simplicity of implementation) all periods
are specified in seconds, rather than the broad range of units allowed
by ISO8601, the RSS "hours" scheduling is rather cryptic looking (1 hour
= a 3600 second period, or P3600S in ISO8601). We may want to consider
adding hours to the allowable units, allowing the more reasonable P1H for
one hour, etc. This is added code complexity, but probably more human readable.
Unlike typical ICE scenarios, involving content management systems, low-end
ICE-RSS syndicators may, in fact, hand-author this file for their site,
in which case simplicity of expression is valuable even at the cost of
somewhat more complex implementation on the receiving side. It's worth
discussing, at any rate.
-
Similarly, for full update packages, old-state is somewhat redundant, since
it must always have the value of "ICE-ANY". I would suggest that we may
want to change this attribute to #IMPLIED, defaulting to "ICE-ANY", in order
to make the expression of full updates more compact.
-
There are some relationships in RSS that aren't completely clear to me.
For example, is <image> intended to be an image that refers to the syndicator
as a whole, or to the particular channel being transmitted, or is it simply
another kind of item? I interpreted it as applying to the syndicator as
a whole, by encoding it in the ice-catalog (i.e. where other contact information
would go).
-
I encoded the ice-package in an ice-response. In principle the HTTP
GET of the URL can be treated as if it were an ice-request of ice-get-package
with no subscriber ID. As there is no protocol-level communication channel
from the subscriber to the syndicator, the ice-package generated in a GET
mode subscription cannot request confirmation from the subscriber.
-
RSS doesn't have any concept of a failure code, so to duplicate RSS the
ice-response's ice-code would always be 200/OK.
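The period arithmetic behind the scheduling observation above can be sketched in a few lines of Python. The function names are hypothetical, and the hour form (P1H) is only the extension floated in this document, written in the document's own notation; strict ISO 8601 would place a "T" before time components.

```python
import re

# Seconds per unit: "S" is what ICE allows today, "H" is the proposed addition.
SECONDS_PER_UNIT = {"H": 3600, "S": 1}


def hours_to_ice_period(hours):
    """Express a whole number of hours as a seconds-only ICE period."""
    return "P{}S".format(hours * 3600)


def period_to_seconds(period):
    """Parse 'P<n>S', plus the hypothetical 'P<n>H' form discussed above."""
    match = re.fullmatch(r"P(\d+)([HS])", period)
    if match is None:
        raise ValueError("unsupported period: " + period)
    count, unit = match.groups()
    return int(count) * SECONDS_PER_UNIT[unit]


print(hours_to_ice_period(7))    # the <hour>7</hour> case from the mapping
print(period_to_seconds("P1H"))  # the proposed hour notation
```

The extra parsing cost on the receiving side is one more unit in the lookup table, which is the trade-off against hand-authoring convenience.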
Changes to ICE
The key addition to the ICE protocol is an extremely limited content delivery
mode, "GET", in which the only communication between the syndicator and
subscriber consists of the subscriber issuing a GET of a provided URL,
and receiving the requested content. In addition, there are a small number
of DTD changes that would be required to fully express RSS' functionality
in ICE. This delivery mode utilizes an extremely minimal subset of ICE's
functionality, but the extreme ease of implementation, and the prospect
of unifying two divergent standards for content syndication to the benefit
of all concerned, make this an extremely valuable scenario to explore.
The changes to the ICE protocol are generally consistent with ICE's design
philosophy, and of general use to a variety of syndication scenarios.
This scenario involves making the following changes to ICE:
-
Add PICS rating to offer. PICS ratings should be added to ice-offer. I
simply adopted the RSS <rating> tag.
-
Add ice-item-link, a duplicate of ice-item-ref, with the added attribute
description, and the semantic that the asset is to be linked to rather
than retrieved and utilized locally. Linked assets are appropriate if the
syndicator wishes to take advantage of site-specific functionality or to
control, track or benefit from the actual delivery of the content. Assets
encoded as ice-item-link must be served on a persistent server accessible
at the provided URL for use by the general public.
-
Add ice-image-link to the contents of ice-catalog and ice-package. For
ice-offer it would be for display to users in selecting between offers.
For ice-package it would be an alternative to ice-item-link that has additional
attributes, image-url, height, and width, relevant to image links.
-
Consider adding ice-image and ice-image-ref to ICE, as the height and width
attributes may be useful to subscribers for images that are encoded inline
or by reference.
-
Add url to ice-offer, to provide a link to a web page that describes the
offer more fully than can be embedded in the ice-offer. This is valuable
to any ICE implementation.
-
Add RSS' textinput mechanism to ice-offer. ICE does not have this concept.
Add this as an optional element contained within ice-offer. Again, this
is probably generally useful to ICE syndications in any delivery mode.
-
Add a delivery mechanism, get, to the delivery modes, in addition to the
existing push and pull delivery modes. Get delivery would allow the subscriber
to do a simple HTTP GET of a provided URL in order to retrieve updates.
As this means that the syndicator has no knowledge of the subscriber's
identity or state, GET delivery is only supported for full updates of content
that is not customized to particular subscribers.
-
The ICE protocol does not require the negotiation process. Therefore, a
subscription with get delivery could be initiated simply by having the
syndicator publicize the URL from which the ice-payload can be retrieved.
Of course, the standard ICE subscription discovery and negotiation mechanism
can be followed for get mode subscriptions.
-
In addition, for GET subscriptions, in order to simplify implementation,
the only allowable elements would be: ice-payload, ice-header, ice-sender,
ice-response, ice-package, ice-image-link and ice-item-link. This would
allow syndicators and subscribers that support only the get mode of delivery
to have an extremely simple implementation.
-
Since GET subscriptions are anonymous, subscription-id for such subscriptions
should be an empty string.
-
Finally, in order to model RSS's bundling of subscription information into
the file containing the content, I would propose adding the ice-catalog
structure, containing only the single ice-subscription relevant to the
ice-package, as an an optional component within the ice-package. While
syndicators could provide the full ICE level of detail for themselves (contact
information, etc.) the receiver can feel free to ignore all attributes
not mapped above.
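A GET-mode subscriber along the lines proposed above could be very small. The following Python sketch is an assumption-laden illustration rather than a reference implementation: `read_get_payload` and `fetch_get_payload` are hypothetical names, the allow-list is copied from the element list in the text, and the sample payload is abbreviated.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Elements the proposal allows in a GET-mode payload (copied from the text).
ALLOWED = {
    "ice-payload", "ice-header", "ice-sender", "ice-response",
    "ice-package", "ice-image-link", "ice-item-link",
}


def read_get_payload(xml_bytes):
    """Return (unexpected element names, list of (name, url) item links)."""
    root = ET.fromstring(xml_bytes)
    unexpected = sorted({el.tag for el in root.iter()} - ALLOWED)
    links = [(el.get("name"), el.get("url")) for el in root.iter("ice-item-link")]
    return unexpected, links


def fetch_get_payload(url):
    """GET-mode retrieval: one plain HTTP GET, no ICE request is sent."""
    with urllib.request.urlopen(url) as response:
        return read_get_payload(response.read())


# Abbreviated payload, made up for this sketch.
SAMPLE = b"""<ice-payload ice.version="1.1" payload-id="1">
<ice-header><ice-sender name="MozillaZine" role="syndicator"/></ice-header>
<ice-response response-id="1">
<ice-code numeric="200" phrase="OK"/>
<ice-package fullupdate="true" old-state="ICE-ANY" package-id="1" subscription-id="">
<ice-item-link name="Communicator 4.61 Out" url="http://www.mozillazine.org/talkback.html?article=606"/>
</ice-package>
</ice-response>
</ice-payload>"""

unexpected, links = read_get_payload(SAMPLE)
print(unexpected, links)
```

Run against the sample, the check flags ice-code, which the example message uses but the allow-list in the text does not name, so a real profile would presumably need to settle whether such elements are permitted or merely ignored.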
Open Issues
There are a number of issues where it may make sense to adopt more of ICE's
functionality even into this "simplified" model of ICE modeled on RSS.
For example:
-
There may be value in supporting regular ICE inline content encoding (ice-item)
and reference (ice-item-ref, type="retrieve") for GET subscriptions. As
this would be new functionality for RSS, this option is not explored in
this document, but it may be appropriate for a fuller merging of the protocols.
-
Descriptions, names, etc., are encoded as plain text attributes in the
XML. We may want to consider encoding them using the ice-text structure,
which would allow alternate representations in various languages and character
encodings. RSS doesn't have alternate encodings, so I stuck to the simpler
string representation for simplicity in modeling RSS.
-
There are a number of RSS features that are specified but not implemented
by Netcenter. If there are specified RSS features that are generally not implemented
by RSS users, it may be preferable not to add those features to ICE.
-
RSS doesn't have the concept of failure status. If RSS "files" are dynamically
built, many of the ICE error codes having to do with operational availability,
permanent redirection, etc., are probably appropriate.
-
One can also easily imagine it being useful to send in-band notification
(ice-notify) to subscribers.