News: Cover Stories
IETF Atom Syndication Format Specification Declared Ready for Implementation.
Update 2005-08-17: With the release of the Atom Format Internet Draft specification version -11, "The Atom Syndication Format" is approved by the IESG as an IETF Proposed Standard. "Some members of the working group remain unenthusiastic about some sections of the document, but the chairs strongly believe that there is rough (or better) consensus in support of the document as a whole... Scott Hollenbeck and the XML Directorate have reviewed the specification for the IESG; test implementations have confirmed basic protocol soundness..." See the IESG announcement.
[July 15, 2005] With the July 14, 2005 release of The Atom Syndication Format Version -10 specification by the IETF, the Atom Format Internet Draft has been declared an appropriate basis for implementation of Atom 1.0. Atom is an XML-based Web content and metadata syndication format. Atom will live alongside RSS Version 2.0, and is expected by many to gradually replace RSS ("RDF Site Summary" or "Really Simple Syndication").
The Atom Syndication Format has been produced by members of the IETF Atom Publishing Format and Protocol (atompub) Working Group under the direction of WG Co-Chairs Tim Bray and Paul Hoffman. The version -10 Internet Draft fixes a few things from the -09 draft sent to the IESG for final review. Eleven Internet drafts have been produced by the WG, beginning with Version -00 dated July 8, 2004.
The IETF's specification for the Atom 1.0 data format is described as "cooked and ready to serve." According to an announcement from Tim Bray, "The Atom 1.0 spec still has one registered objection from a member of the IESG, but the WG agreed that the objection was reasonable and we think the latest draft linked above fixes it; assuming he agrees, Atom very soon becomes an IETF standard. It will eventually get an RFC number, but that may take a while, first because the RFC Editor machinery works slowly, and secondly because we have a normative reference to Ned Freed's re-work of the MIME-type RFCs, which isn't quite finished yet."
Atom is "an XML-based document format that describes lists of related information known as feeds. Feeds are composed of a number of items, known as entries, each with an extensible set of attached metadata. For example, each entry has a title. The primary use case that Atom addresses is the syndication of Web content such as Weblogs and news headlines to Web sites as well as directly to user agents. However, nothing precludes it from being used for other purposes and types of content."
As presented in the Version -10 Internet Draft, the atom:feed markup element "is the document (i.e., top-level) element of an Atom Feed Document, acting as a container for metadata and data associated with the feed. Its element children consist of metadata elements followed by zero or more atom:entry child elements. The atom:entry element represents an individual entry, acting as a container for metadata and data associated with the entry. This element can appear as a child of the atom:feed element, or it can appear as the document element of a standalone Atom Entry Document." In addition to common attributes, an entry's defined elements include: atomAuthor, atomCategory, atomContent, atomContributor, atomId, atomLink, atomPublished, atomRights, atomSource, atomSummary, atomTitle, and atomUpdated.
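As a rough illustration of the feed/entry containment just described, the following Python sketch parses a small hand-written Atom Feed Document with the standard library. The sample feed, its identifiers, and the helper name summarize_feed are invented for illustration; the namespace URI is assumed to be the one used by the Atom 1.0 format as published.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the published Atom 1.0 format.
ATOM_NS = "http://www.w3.org/2005/Atom"
NS = {"atom": ATOM_NS}

# Hypothetical sample: feed-level metadata followed by one atom:entry.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <id>urn:uuid:60a76c80-d399-11d9-b93c-0003939e0af6</id>
  <updated>2005-07-14T18:30:02Z</updated>
  <author><name>Jane Doe</name></author>
  <entry>
    <title>First entry</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2005-07-14T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
</feed>"""


def summarize_feed(xml_text):
    """Return the feed title plus the id and title of each entry."""
    root = ET.fromstring(xml_text)
    # The document element of an Atom Feed Document must be atom:feed.
    if root.tag != "{%s}feed" % ATOM_NS:
        raise ValueError("document element is not atom:feed")
    entries = [
        {
            "id": entry.findtext("atom:id", namespaces=NS),
            "title": entry.findtext("atom:title", namespaces=NS),
        }
        for entry in root.findall("atom:entry", NS)
    ]
    return {"title": root.findtext("atom:title", namespaces=NS), "entries": entries}


summary = summarize_feed(SAMPLE_FEED)
print(summary["title"], len(summary["entries"]))
```

A real aggregator would also need to handle standalone Atom Entry Documents, where atom:entry is itself the document element.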
A number of articles have been written comparing Atom to RSS. It's fair to wager that the contentiousness that has marked the development of "standardized" syndication formats under the name RSS/Atom will not cease immediately with the final publication of Atom 1.0 as an approved IETF RFC. According to a summary provided by Robert Sayre, a "conservative count shows eight document formats calling themselves RSS. In order of appearance, they are 0.90, 0.91, 1.0, 0.92, 0.93, 0.94, 2.0, and 1.1. The versions in widest deployment today [2005-07] are 0.91, 1.0, and, most popularly, 2.0." The most distinctive feature of Atom in terms of its development framework is that it has been produced by a recognized standards body under a set of strict rules; none of the most recent versions of RSS was. Fortunately (not for programmers, but for end users), many of the Web-based tools for content syndication are being coded to process both Atom and various flavors of RSS.
Besides The Atom Syndication Format specification, two other Internet Drafts from the IETF atompub Working Group are making their way through the development process. The Atom Publishing Protocol (APP) draft "presents a protocol for using XML (Extensible Markup Language) and HTTP (HyperText Transport Protocol) to edit content. This application-level protocol for publishing and editing Web resources belonging to periodically updated websites at its core is the HTTP transport of Atom-formatted representations. The Atom Publishing Protocol Model defines operations on collections of Web resources; all collections support the same basic interactions, as do the resources within the collections. The patterns of interaction are based on the common HTTP verbs: GET is used to retrieve a representation of a resource or perform a read-only query; POST is used to create a new, dynamically-named resource; PUT is used to update a known resource; DELETE is used to remove a resource."
The Atom Feed Autodiscovery draft "specifies a machine-readable method of linking to an Atom feed from an HTML or XHTML document using the <link> element. The purpose of Atom autodiscovery is for clients who know the URI of a web page to find the location of that page's associated Atom feed. For example, say an end user wishes to subscribe to the Atom feed of a site. Their Atom-aware aggregator client could prompt them to enter the home page of the site. The client could retrieve the HTML source of the home page, find the Atom autodiscovery element, and then retrieve the Atom feed or cache the URI of the Atom feed for later retrieval."
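The verb-to-operation mapping quoted above can be sketched with a toy in-memory collection. This is only an illustration of the interaction pattern, not the protocol itself: the Collection class, the /entries/N path scheme, and the choice of status codes (201, 200, 404, 405) are assumptions made for the example, and no real HTTP traffic is involved.

```python
import itertools


class Collection:
    """Toy stand-in for a collection of Web resources (hypothetical, in-memory)."""

    def __init__(self):
        self._members = {}                # path -> stored representation
        self._counter = itertools.count(1)

    def handle(self, method, path=None, body=None):
        """Dispatch one request and return a (status, payload) pair."""
        if method == "GET":
            if path is None:
                # Read-only query: list the members of the collection.
                return 200, sorted(self._members)
            if path in self._members:
                return 200, self._members[path]
            return 404, None
        if method == "POST":
            # Create a new, dynamically named resource; the server picks the path.
            new_path = "/entries/%d" % next(self._counter)
            self._members[new_path] = body
            return 201, new_path
        if method == "PUT":
            # Update a known resource; unknown paths are rejected in this sketch.
            if path not in self._members:
                return 404, None
            self._members[path] = body
            return 200, path
        if method == "DELETE":
            if path not in self._members:
                return 404, None
            del self._members[path]
            return 200, None
        return 405, None


collection = Collection()
status, location = collection.handle("POST", body="<entry>v1</entry>")
collection.handle("PUT", location, "<entry>v2</entry>")
print(status, location, collection.handle("GET", location))
```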
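The client-side lookup described in that scenario might look roughly like the sketch below, which scans a page's HTML for link elements and resolves any feed URI against the page address. It assumes that an autodiscovery link carries rel="alternate" and type="application/atom+xml"; those attribute values, the class name AtomLinkFinder, and the example.org URLs are assumptions for illustration, and the page is passed in as a string rather than fetched.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AtomLinkFinder(HTMLParser):
    """Collect feed URIs from <link> elements that look like Atom autodiscovery."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower().strip()
        href = attributes.get("href")
        # Assumed convention: rel contains "alternate" and type is the Atom media type.
        if "alternate" in rel_values and media_type == "application/atom+xml" and href:
            # Relative references are resolved against the page's own URI.
            self.feeds.append(urljoin(self.page_url, href))


def find_atom_feeds(html_text, page_url):
    finder = AtomLinkFinder(page_url)
    finder.feed(html_text)
    return finder.feeds


page = (
    '<html><head><title>Blog</title>'
    '<link rel="stylesheet" type="text/css" href="/style.css">'
    '<link rel="alternate" type="application/atom+xml" href="/feed.atom">'
    '</head><body></body></html>'
)
print(find_atom_feeds(page, "http://example.org/blog/"))
```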
The Atom Syndication Format. Produced by members of the IETF Atom Publishing Format and Protocol (atompub) Working Group. Edited by Mark Nottingham [WWW] and Robert Sayre [WWW]. Preliminary draft contributions from Tim Bray, Mark Pilgrim, and Sam Ruby; Norman Walsh provided the Relax NG schema. IETF Network Working Group, Internet Draft. Reference: 'draft-ietf-atompub-format-10'. July 11, 2005, expires January 12, 2006. 56 pages.
See also the Atom Publishing Protocol and Atom Feed Autodiscovery IDs from the IETF Working Group, and the Feed History (individual) draft:
"The Atom Publishing Protocol." Edited by Joe Gregorio (BitWorking, Inc) and Robert Sayre (Boswijck Memex Consulting). IETF Network Working Group, Internet Draft. Reference: 'draft-ietf-atompub-protocol-04.txt'. May 09, 2005, expires November 10, 2005. Updates the previous draft of March 18, 2005. 36 pages. See also HTML and XML. Version -04 is reorganized, adding ladder diagrams and SOAP interactions. See the version history from The Internet Report Catalog.
"Atom Feed Autodiscovery." Edited by Mark Pilgrim (IBM) and Phil Ringnalda. IETF ATOMPUB Working Group. Reference: Internet Draft 'draft-ietf-atompub-autodiscovery-01.txt'. May 10, 2005, expires November 11, 2005. 14 pages. This document specifies a machine-readable method of linking to an Atom feed from a HyperText Markup Language (HTML) or Extensible HyperText Markup Language (XHTML) document, using the <link> element. See also the HTML and -01/-00 diff. See the version history from The Internet Report Catalog.
"Feed History: Enabling Stateful Syndication." By Mark Nottingham [WWW]. IETF Network Working Group, (individual) Internet Draft. Reference: 'draft-nottingham-atompub-feed-history-02'. July 14, 2005, expires January 15, 2006. See the IETF announcement.
Robert Sayre is Co-Editor of the Atom Format specification, together with Mark Nottingham. He authored an excellent article, "Atom: The Standard in Syndication," published in IEEE Internet Computing Volume 9, Number 4 (July/August 2005), pages 71-78. An online version is available via IEEE Distributed Systems Online.
Sayre's article provides extensive discussion on the "History of Syndication Interoperability," "The First RSS," and "RSS Versions" — as well as surveying the specifications produced by the IETF Working Group. Here is an excerpt, from "The Atom Syndication Format":
"Like other syndication formats, Atom is made up of feeds and entries. A feed contains metadata that applies to the feed itself, followed by a series of entries. Clients can keep current on the most recently updated entries by polling a feed's URI. Entries also contain metadata such as titles, summaries, authors, dates, and, potentially, the entry's full text. In previous versions, RSS left the inclusion of full content to extensions, which created confusion among clients and publishers because publishers often duplicate content across multiple extensions and core elements in an effort to make entry content visible in the widest variety of client software.
The most popular content extension, content:encoded, was part of a larger RSS 1.0 module that clients typically implemented only partially. Many RSS feeds used the XHTML body element to transfer the full content of entries, but most aggregators were unequipped to handle content as parser-generated events, given that RSS content is usually escaped. Because it's seldom well-formed XML, most HTML content in RSS would cause XML parsers to abort processing. To work around this reality, RSS publishers transform parser-significant characters such as < into entities like &lt; in the process known as escaping. It lets HTML content appear as a single character blob to the XML parser and enables much more lenient standards for HTML content. To address this fact, Atom standardized a content element for entries but left it extensible, so that publishers could include new content formats that remained clearly identified as content. The Atom specification also solved many other interoperability problems. The encoding of all plaintext or HTML content is now clearly indicated by the format through required attributes when fields contain (X)HTML, thus reducing the heuristics necessary to process the content. The WG also adopted the date format defined in RFC 3339, which simplifies parsing and broadens internationalization. Additionally, Atom appropriated the link element from XHTML, which lets authors easily tag links via a rel attribute.
Atom increases the number of required elements, relative to RSS, for entries and feeds. Most importantly, it requires a unique identifier in each entry. A single atom:id value applies to all instantiations of an entry. A cross between the Web's stable resource references and email's message-id approach, this feature is critical for preventing clients from listing duplicates when an entry is updated. In addition, previous versions of RSS have vaguely specified interactions with many common XML standards... A conservative count shows eight document formats calling themselves RSS. In order of appearance, they are 0.90, 0.91, 1.0, 0.92, 0.93, 0.94, 2.0, and 1.1. The versions in widest deployment today [2005-07] are 0.91, 1.0, and, most popularly, 2.0... Atom clearly defines its relationship to other XML standards, such as Namespaces in XML, XML Base, XML Encryption, XML Digital Signatures, and the XML specification itself..."
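The escaping workaround mentioned in the excerpt can be demonstrated in a few lines of Python. This is a minimal sketch using only the standard library: the sample HTML fragment and the helper name try_parse are invented, and the type="html" attribute on the content element is shown as an assumption about how an escaped payload would be labelled.

```python
import xml.etree.ElementTree as ET
from html import escape

# Typical "tag soup": unclosed elements and a bare ampersand (hypothetical sample).
html_fragment = "<p>Unclosed paragraph<br>with a bare & ampersand"


def try_parse(xml_text):
    """Return the parsed element, or None if the text is not well-formed XML."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        return None


# Embedding the fragment directly makes the whole document ill-formed.
raw_document = "<content>%s</content>" % html_fragment

# Escaping turns <, > and & into entities, so the parser sees one text blob.
escaped_document = '<content type="html">%s</content>' % escape(html_fragment, quote=False)

print(try_parse(raw_document))                               # parser gives up
print(try_parse(escaped_document).text == html_fragment)     # original HTML recovered
```

After parsing, the consumer still has to hand the recovered string to an HTML parser; the explicit type label is what tells it to do so instead of guessing.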
* The two kinds of Atom Documents defined in the Atom Format specification (Atom Feed Documents and Atom Entry Documents) must be well-formed XML, and are identified using the 'application/atom+xml' media type. Atom uses XML Namespaces to uniquely identify XML element names, and terminology from the XML Infoset. The specification does not define a DTD for Atom Documents, but expresses conformance within the prose; it provides a RELAX NG Compact Schema in an Informative Appendix B. The specification also places some requirements on Atom Processors.
Any element defined in The Atom Syndication Format specification may have an xml:base and/or xml:lang attribute. The value of the xml:lang attribute "indicates the natural language for the element and its children [and] the language context is only significant for [particular] elements and attributes declared to be 'language-sensitive'," as defined in the specification.
The three "Container Elements" defined in The Atom Syndication Format include the atom:feed element, the atom:entry element, and the atom:content element. The atom:feed element is the top-level (document, root) element of an Atom Feed Document, "acting as a container for metadata and data associated with the feed. Its element children consist of metadata elements followed by zero or more atom:entry child elements. The atom:entry element represents an individual entry, acting as a container for metadata and data associated with the entry. This element can appear as a child of the atom:feed element, or it can appear as the top-level element of a standalone Atom Entry Document."
The atom:content Element either contains the content of the entry or links to the content of the entry. The entry's content may be free-form character text, HTML, XHTML, or other (linked) content identified by a (non-composite) MIME media type.
The Atom specification's atom:content element is designed to support the inclusion of arbitrary foreign markup, incorporating markup constructs from other vocabularies. The processing rules for "unknown foreign markup" vary depending upon context and the type of Atom construct, but in all cases, an Atom processor must not stop processing or signal an error if "unknown foreign markup" in a well-formed Atom document is legal according to the conformance rules in the Atom specification.
The Atom Syndication Format defines "Person Constructs" and "Date Constructs" in addition to the common Text Constructs referenced above. A Person Construct is "an element that describes a person, corporation, or similar entity," and defined elements include atom:name (a language-sensitive, human-readable name for the person or entity), an atom:uri element which designates an IRI associated with the person, and an atom:email element which conveys an e-mail address associated with the person/entity, according to relevant rules in RFC 2822. A Date Construct is an element having content that conforms to the 'date-time' production in RFC 3339; these date values are compatible with ISO 8601, the W3C NOTE 'datetime-19980827', and W3C XML Schema Part 2.
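To show what consuming a Date Construct might involve, here is a small Python sketch that turns an RFC 3339 'date-time' string into a timezone-aware value. The function name parse_atom_date is hypothetical, and the approach leans on datetime.fromisoformat, which accepts only a subset of RFC 3339 on older Python versions (for example, it rejects a trailing "Z" before Python 3.11, hence the normalization step).

```python
from datetime import datetime, timezone


def parse_atom_date(value):
    """Parse an RFC 3339 date-time string into an aware datetime (sketch)."""
    # Older fromisoformat() versions do not accept "Z", so spell out the UTC offset.
    if value.endswith(("Z", "z")):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    # RFC 3339 date-times always carry an offset; reject naive values.
    if parsed.tzinfo is None:
        raise ValueError("date-time has no time zone offset")
    return parsed


utc_form = parse_atom_date("2005-07-14T18:30:02Z")
offset_form = parse_atom_date("2005-07-14T20:30:02+02:00")

# Both strings denote the same instant, so the aware values compare equal.
print(utc_form == offset_form)
print(utc_form.astimezone(timezone.utc).isoformat())
```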
Atom metadata markup elements are designed for use with Atom feed and/or entry elements, with special rules for the semantics of (non-)inheritance down the element content hierarchy. These metadata elements include: atom:author [author of the entry or feed], atom:category [with term, scheme, label], atom:contributor, atom:rights, atom:generator [agent used to generate a feed], atom:icon [image which provides iconic visual identification for a feed], atom:id [permanent, universally unique identifier for an entry or feed], atom:logo, atom:link [empty element defining a reference to a Web resource], atom:published [date construct indicating an event early in the life cycle of the entry], atom:source [a feed from which an entry is copied], atom:subtitle, atom:summary [summary, abstract or excerpt of an entry], atom:title, and atom:updated [date construct indicating the most recent instant in time when an entry or feed was modified in a way the publisher considers significant].
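The roles of atom:id and atom:updated listed above suggest how a client can avoid showing the same entry twice when it is republished. The sketch below is one possible approach, not something prescribed by the specification: entries are modelled as plain dictionaries, the merge_entries helper and the sample identifiers are invented, and the timestamps are assumed to have already been parsed into timezone-aware datetime values.

```python
from datetime import datetime, timezone


def merge_entries(store, incoming):
    """Merge polled entries into store, keyed by atom:id.

    An entry replaces the stored copy only when its atom:updated value is newer.
    """
    for entry in incoming:
        known = store.get(entry["id"])
        if known is None or entry["updated"] > known["updated"]:
            store[entry["id"]] = entry
    return store


# Hypothetical results of two successive polls of the same feed.
first_poll = [
    {"id": "urn:uuid:a", "title": "Draft",
     "updated": datetime(2005, 7, 14, 12, 0, tzinfo=timezone.utc)},
]
second_poll = [
    {"id": "urn:uuid:a", "title": "Revised",
     "updated": datetime(2005, 7, 15, 9, 0, tzinfo=timezone.utc)},
    {"id": "urn:uuid:b", "title": "Another entry",
     "updated": datetime(2005, 7, 15, 9, 30, tzinfo=timezone.utc)},
]

store = merge_entries({}, first_poll)
merge_entries(store, second_poll)
print(sorted((key, value["title"]) for key, value in store.items()))
```

Because the key is the identifier and not the title or link, a retitled entry updates in place instead of appearing as a duplicate.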
Atom processors are required to respect XML security measures used by publishers of Atom feeds if these conform to the designated W3C Digital Signatures and XML Encryption specifications. "Because Atom is an XML-based format, these existing XML security mechanisms can be used to secure [Atom] content" as defined for the atom:feed, atom:entry, and other elements.
[* This summary cribbed from earlier version; checking for variations]
The IETF atompub Working Group was chartered under the Applications Area directed by Ted Hardie and Scott Hollenbeck. The Working Group's Co-Chairs are Paul Hoffman and Tim Bray. Sam Ruby serves as the WG Secretary.
According to the IETF Atom Publishing Format and Protocol (atompub) Working Group Charter:
Atom defines a feed format for representing and a protocol for editing Web resources such as Weblogs, online journals, Wikis, and similar content. The feed format enables syndication; that is, provision of a channel of information by representing multiple resources in a single document. The editing protocol enables agents to interact with resources by nominating a way of using existing Web standards in a pattern.
Atom consists of:
- A conceptual model of a resource
- A concrete syntax for this model
- A syndication and archiving format (the Atom feed format) using this syntax
- An editing protocol using this syntax
The format must be able to represent:
- a resource that is a Weblog entry or article (e.g., it has an author, date, identifier, and content)
- a feed or channel of entries, with or without enclosed content
- a complete archive of all entries in a feed
- existing well-formed XML (especially XHTML) content
- additional information in an user-extensible manner
The editing protocol must enable:
- creating, editing, and deleting feed entries
- multiple authors for a feed
- multiple subjects or categories in a feed
- user authentication
- adding, editing, and deleting users
- setting and getting user preferences
- creating, getting and setting related resources such as comments, templates, etc.
The working group will use experience gained with RSS (variably used as a name by itself and as an acronym for 'RDF Site Summary', 'Rich Site Summary', or 'Really Simple Syndication') as the basis for a standards-track document specifying the model, syntax, and feed format. The feed format and HTTP will be used as the basis of work on a standards-track document specifying the editing protocol. The goal for the working group is to produce a single feed format and a single editing protocol; the working group will only consider additional formats or additional protocols if those charter changes are approved by the IESG.
The working group will also take steps to ensure interoperability, by:
- unambiguously identifying required elements in formats
- clearly nominating conformance levels for different types of software
- providing clear extensibility mechanisms and constraints upon them
The Atom protocol will be designed to provide security services for updating and accessing dynamic online resources. The working group will consider current known issues with requirements for remote access, along with the fact that many such resources are constrained by providers who provide the resource owners with little configuration control.
The working group's primary focus will be on delivering an interoperable format and corresponding protocol; it is expected that all but the most basic, generic metadata and functions will be accommodated through extensions, rather than in the core documents.
Extension development is not included in this charter. The working group will consider the need to either close or to modify the charter and document extensions once the core document set has been approved by the IESG.
Atom Format Version -10 announcements:
- I-D ACTION:draft-ietf-atompub-format-10.txt. From the I-D-Announce List
- Atom 1.0. By Tim Bray (IETF atompub Working Group Co-Chair). "It's cooked and ready to serve. There are a couple of IETF process things to do, but this draft (HTML version) is essentially Atom 1.0. Now would be a good time for implementors to roll up their sleeves and go to work. Here's a comparison of RSS 2.0 and Atom 1.0, here's a list of known Atom feeds (which I bet doesn't even last a few weeks), and Sam Ruby is updating the Feed Validator (it's starting to work, but Sam says he doesn't think it's quite into beta yet)... A collection of Atom feed-format tests is under construction... my personal thank-you to the contributors... an ongoing full-text feed is available..."
- Atom 1.0. By Danny Ayers. "Major thanks and congrats to Sam Ruby for getting it started, to him Tim and Paul Hoffman for acting as WG chairs (and to Sam again, alongside Mark Pilgrim for their work on tests and the validator), and Mark Nottingham and Robert Sayre for doing the editing. Thanks to the IETF for hosting it all. Oh yeah, and a big Woo-hoo! to the other pedantic curmudgeons in the WG ;-)"
- Atom 1.0 is Baked. By Len Bullard. Post to XML-DEV. "I've been reviewing this document. The Atom tribe did a heckuva good job cleaning up and tightening the RSS-clone. I think this is a good doc to review... Watching the Schema debates and sitting on a few of the other standards lists, I am increasingly impressed with the productivity possible when wikis are employed and the amount of process is reduced to the minimum required... the use of politics to determine outcomes is more reasonable, the ability to get far opportunities closer together is better, and overall, it seems to produce a better specification. Standards wonks pay heed."
Atom tests, tools, and feeds:
- Atom Format Tests. Managed by Sam Ruby. "Outline of test cases intended to be produced for the Feed Validator." See the posting.
- Feed Validator. By Mark Pilgrim and Sam Ruby. "A validator for syndicated feeds. It works with RSS 0.90, 0.91, 0.92, 0.93, 0.94, 1.0, and 2.0. It also validates Atom feeds."
- Known Atom Feeds
Atom and RSS:
- "RSS 2.0 and Atom 1.0, Compared." Tim Bray, Paul Hoffman, Sam Ruby, and Rob Sayre. A [Pre-]Wiki Snapshot. Contents include comparisons at these levels: (1) Major/qualitative differences: Deployment; Specifications; Publishing protocols; Required content; Payload; Full or partial content; Autodiscovery; Extraction and aggregation. (2) Differences of degree: Extensibility; URIs; Software availability; Language tagging; Digital signatures and encryption; Authors; Categories; Schema. (3) Sample RSS and atom feeds; (4) Element comparison table.
- Wiki 'RSS and Atom'. "People who generate syndication feeds have a choice of feed formats. As of mid-2005, the two most likely candidates will be RSS 2 and Atom 1.0. The purpose of this page is to summarize, as clearly and simply as possible, the differences between the RSS 2 and Atom 1.0 syndication languages."
- "RSS and Atom: A Quick Overview." By Bob Wyman (PubSub). Presented at the News Standards Summit 2005, Amsterdam RAI Centre, Netherlands. The presentation supplies a short 'RSS/Atom History' and example instances of RSS 2.0 and Atom feeds/entries. From slides 8-10, 'Atom Enhancements to RSS': Clarity in the specification: Carefully worded, with every word fought over at length on IETF mailing lists; RNG [RELAX NG] specifications provided to reduce ambiguity; Thoroughly specified, less ambiguous; Designed with Matching API; Well-defined Content Model; Support for content types; Support for linked, rather than embedded, content; Broader range of solutions addressed; Defined method for encryption and digital signatures; Defined Extensibility Model; Mandatory globally unique Entry IDs; Mandatory timestamps; Distinction between Summary and Content; Distinction between author and contributor; Free Standing Atom Entry Documents (supports API and 'Atom over XMPP'); Atom:source elements preserve attribution on copy; atom:updated to flag 'significant' changes; Multiple 'enclosures' via link element; Relative URI support via xml:base; Support for xml:lang; I18N support enhanced by use of IRIs; Support for RFC3339 dates (ISO-like dates). [cache]
- "Atom 1.0 vs RSS 2.0." Slashdot. July 18, 2005. Interesting, but you'll need a good filter.