The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Created: October 22, 2003.

Atom as the New XML-Based Web Publishing and Syndication Format.

Update 2004-06-16: On June 16, 2004 the IESG officially announced the formation of the IETF Atom WG: "IETF Forms New Atom Publishing Format and Protocol (atompub) Working Group." See general references in the topic document "Atom Publishing Format and Protocol."

Update 2004-05-05: On May 05, 2004 the Internet Engineering Steering Group (IESG) announced the proposal for a new IETF Atom Publishing Format and Protocol Working Group within the IETF Applications Area. See details in the news item "IESG Announces Proposal for IETF Atom Publishing Format and Protocol Working Group."

[October 22, 2003] The Atom Project, to the extent that anyone can declare authoritatively what it is, or is quintessentially meant to support, is "an initiative to develop a common syntax for syndication, archiving, and publishing." Sam Ruby (Emerging Technologies Group, IBM) is most often credited for originating the core ideas, and design work spread across several wikis and weblog Internet sites is now being shared by some of the brightest developer minds focused upon the future of Web content creation and distribution.

The developers agree that Atom "will be vendor neutral, implemented by everybody, freely extensible by anybody, and cleanly and thoroughly specified." Atom is sometimes characterized as the successor to RSS (Really Simple Syndication or RDF Site Summary), which is variously used for news headline syndication, website metadata description, and content syndication. Like RSS, Atom is being created through an informal consensus process by volunteers in the Web developer community at large.

Sam Ruby appears to recognize that the function of Atom will be revealed in unpredictable ways, escaping any telos imagined by the current designers. The key insights are these: design Atom such that content is not treated as a second-class citizen (allow its conceptual model and syntax to blur the subjective distinction between metadata and data); insist upon a uniform mechanism for expressing the core concepts independent of the usage (e.g., allow multiple implementation designs conforming to abstract API requirements, and anticipate multiple schema formalisms for validation); keep the format open and simple (e.g., not requiring special serialization of the XML, implementable using simple POST and GET operations under HTTP).

The Atom design is envisioned as extensible for different application areas (license terms, access control, content categorization, versioning, related resources, etc.). The core features are those common to most creations of intellectual works: source/author, editing date(s), resource identifier/location, and content. Given these minimal but central goals, we can understand the simplicity and generality of the abstract for the draft Atom API specification: the API document "presents a technique for using XML and HTTP to edit content." In this context, "edit" means "read, write, modify, delete" (approximately: GET, POST, PUT, DELETE).
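The correspondence between the four "edit" operations and HTTP methods can be illustrated with a minimal Python sketch. It only builds request objects and sends nothing over the network. The endpoint URL, the `application/xml` content type, and the names `EDIT_OPERATIONS` and `build_request` are assumptions made for illustration; they are not taken from the draft Atom API specification.

```python
from urllib.request import Request

# Mapping taken from the text: read, write, modify, delete correspond
# approximately to GET, POST, PUT, DELETE.
EDIT_OPERATIONS = {
    "read": "GET",
    "write": "POST",
    "modify": "PUT",
    "delete": "DELETE",
}


def build_request(operation, url, entry_xml=None):
    """Build (but do not send) an HTTP request for one edit operation."""
    method = EDIT_OPERATIONS[operation]
    data = entry_xml.encode("utf-8") if entry_xml is not None else None
    # The content type is an assumption; the draft may specify another one.
    headers = {"Content-Type": "application/xml"} if data is not None else {}
    return Request(url, data=data, headers=headers, method=method)


# Hypothetical endpoint, used only for illustration.
request = build_request("write", "http://example.org/atom/entries", "<entry/>")
print(request.get_method(), request.full_url)
```

Keeping the mapping in a single table reflects the point made above: the API needs no machinery beyond plain HTTP verbs applied to XML documents.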

The goal of extreme generality remains in tension with the competing objective of ensuring that the new syndication format, while extensible, has predictable consistency at the semantic level. This probably means that the formalism in the final draft will specify some required elements. In particular, if an agreed design goal is to capture time-sensitive information, then an element for time information would be required. In the draft syntax description [as of 2003-10-22], the top-level <feed> element has required subelements title, link, modified (date in UTC), and author. An <entry> element would have required subelements title, link (URI permanent link), id, issued (W3DTF +/- timezone), and modified.
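As a rough illustration of what "required subelements" means in practice, the following Python sketch checks a <feed> or <entry> element against the lists given in the 2003-10-22 draft description. It is a simplification under stated assumptions: XML namespaces are ignored, the author is written as plain text rather than a structured element, and the sample values and the helper names `make_element` and `missing_children` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Required subelements as listed in the draft syntax description of 2003-10-22.
REQUIRED = {
    "feed": ("title", "link", "modified", "author"),
    "entry": ("title", "link", "id", "issued", "modified"),
}


def make_element(tag, **children):
    """Create an element with one simple text child per keyword argument."""
    element = ET.Element(tag)
    for name, text in children.items():
        ET.SubElement(element, name).text = text
    return element


def missing_children(element):
    """Return the names of required child elements that are absent."""
    required = REQUIRED.get(element.tag, ())
    return [name for name in required if element.find(name) is None]


# Sample values are invented for illustration only.
feed = make_element(
    "feed",
    title="Example Feed",
    link="http://example.org/",
    modified="2003-10-22T12:00:00Z",
    author="Example Author",
)
entry = make_element(
    "entry",
    title="First post",
    link="http://example.org/2003/10/first",
    id="urn:example:first",
    issued="2003-10-22T08:00:00-04:00",
)

print(missing_children(feed))   # the feed has every required child
print(missing_children(entry))  # the entry lacks its 'modified' element
```

A real validator would work from whichever schema formalism the final specification adopts; the table-driven check here only shows the shape of the constraint.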

RoadMap snapshot for Atom (Echo/Pie) as of 2003-10-17: The project roadmap involves: "(1) Decide on the conceptual model of a log entry. Primer, ConceptualModel; (2) Decide on a syntax for this model. Syntax, SyntaxConsiderations; (3) Build a syndication format using this syntax; (4) Build an archiving format using this syntax; (5) Build a weblog editing protocol using this syntax (the Atom API)." According to this RoadMap document, sixty-some companies have pledged support for Atom (aka Echo/Pie/etc) along with 170+ individual developers.

The Internet domain is host for Sam Ruby's weblog and for the Atom Project wiki, both serving as publication organs for Atom design and development. The "It's Just Data" blog for Atom and related topics is built in XHTML 1.1 code. This is important to one of Sam Ruby's goals for the new syndication format: a desire to enable such things as XPath queries over the content. A blog entry for September 26, 2003 "Fun with XPath" documents some of the details.
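To give a sense of what XPath queries over well-formed XHTML content look like, here is a small Python sketch using the standard library's `xml.etree.ElementTree`, which implements a limited subset of XPath. The sample page, its class names, and the URLs are invented for illustration and do not reproduce the markup of any actual weblog.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used in the path expression below.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Invented sample page; well-formed XHTML can be parsed as ordinary XML.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample weblog</title></head>
  <body>
    <div class="entry"><h3>First post</h3><a href="http://example.org/1">link</a></div>
    <div class="entry"><h3>Second post</h3><a href="http://example.org/2">link</a></div>
    <div class="sidebar"><h3>About</h3></div>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# Select the heading of every div whose class attribute is exactly "entry".
headings = root.findall(".//x:div[@class='entry']/x:h3", XHTML_NS)
titles = [heading.text for heading in headings]
print(titles)
```

Because the page is XML, the same query mechanism that works on a feed also works on the published weblog itself.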

Note: Descriptive text above is based in part upon a summary provided by Sam Ruby. Sam presented an overview of Atom at the News Standards Summit on December 8, 2003 (XML 2003 venue).

Atom Entry and Content Model

The content, structure, and (lexical) syntax for Atom <entry> and <content> elements are still [2003-10] under discussion. Mark Pilgrim presents draft examples and some of the key concepts in his article "The Atom API," published by Excerpts:

[An Atom entry has] "lots of information: a title, an excerpt or summary, and an author who has a name, email address, and URL of his own. The entry has a 'created' date and a 'modified' date (usually server-generated), and an 'issued' date (which is a date that the author would like to give to this entry, separate from when he actually posted it). The entry is viewable at a specific link, has an internal ID (a URN), and finally has some XHTML content.

The Atom content model is probably worth a whole article by itself, but for the moment let me just handwave and say that it can handle more than just XHTML. Any MIME type can be expressed (specify it in the @type attribute), and non-XML content (such as HTML or plain text) is simply escaped or put in a CDATA block, with a mode="escaped" attribute on the content element. It can even handle binary content (such as an image) by specifying @mode="base64" and including a base64-encoded representation of the data...

The Atom API has several other methods beyond add, edit, delete, retrieve, search. It can be used for posting comments on entries, managing users and user preferences, managing categories, managing site templates; eventually it will be usable for everything you can do manually with your weblog through your server's browser-based interface... [As for] Atom authentication: it does not involve sending plain text passwords in the clear..."
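The content modes described in the excerpt above can be sketched in Python as follows. This is an illustration under assumptions, not the specified behavior: only the "escaped" and "base64" modes named in the excerpt are handled, namespaces are omitted, and the function names `make_content` and `read_content` are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET


def make_content(payload, mime_type, mode):
    """Build a <content> element carrying payload in the given mode."""
    content = ET.Element("content", {"type": mime_type, "mode": mode})
    if mode == "escaped":
        # payload is a str; ElementTree escapes markup characters on output.
        content.text = payload
    elif mode == "base64":
        # payload is bytes; store it as base64 text.
        content.text = base64.b64encode(payload).decode("ascii")
    else:
        raise ValueError("unsupported mode: " + mode)
    return content


def read_content(content):
    """Recover the original payload from a parsed <content> element."""
    mode = content.get("mode")
    if mode == "escaped":
        return content.text
    if mode == "base64":
        return base64.b64decode(content.text)
    raise ValueError("unsupported mode: " + str(mode))


html = make_content("<p>Hello & welcome</p>", "text/html", "escaped")
image = make_content(b"\x89PNG\r\n", "image/png", "base64")

# Round trip through serialized XML to show that both payloads survive.
for element in (html, image):
    serialized = ET.tostring(element, encoding="unicode")
    print(serialized)
    print(read_content(ET.fromstring(serialized)))
```

The @type attribute tells the consumer what the payload is, while @mode tells it how to unwrap the payload, so the two concerns stay independent.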

Robin Cover, Editor: