The World Wide Web Consortium (W3C) has announced the publication of a First Public Working Draft of HTML 5: A Vocabulary and Associated APIs for HTML and XHTML. The specification is intended to replace (that is, to become the new version of) what was previously defined in the HTML4, XHTML 1.x, and DOM2 HTML specifications.
The HTML 5 specification defines the fifth major revision of the core language of the World Wide Web: HTML. In this version: (1) new features are introduced to help Web application authors, (2) new elements are introduced based on research into prevailing authoring practices, and (3) special attention has been given to defining clear conformance criteria for user agents in an effort to improve interoperability. The new features are presented in the companion Working Draft HTML 5 Differences from HTML 4. The specification attempts to fulfill goals and principles articulated in the HTML Design Principles Working Draft.
According to the W3C announcement, the HTML 5 specification "helps to improve interoperability and reduce software costs by giving precise rules not only about how to handle all correct HTML documents but also how to recover from errors. Ajax and related innovations have propelled demands for a new standard that allows people to create Web applications that interoperate across desktop and mobile platforms. Some of the most interesting new features for authors are APIs for drawing two-dimensional graphics, embedding and controlling audio and video content, maintaining persistent client-side data storage, and for enabling users to edit documents and parts of documents interactively."
The new specification differs from previous versions of "HTML" in that it defines an abstract language for describing documents and applications, as well as some APIs for interacting with in-memory representations of resources that use this language. The in-memory representation is known as "DOM5 HTML", or "the DOM" for short.
Various concrete syntaxes could be used to transmit resources that use the abstract language; two concrete syntaxes are defined in the 2008-01-22 Working Draft specification. The first such concrete syntax is "HTML5". This is the format recommended for most authors, and is compatible with all legacy Web browsers. If a document is transmitted with the MIME type text/html, then Web browsers will process it as an "HTML5" document. The second concrete syntax uses XML, and is known as "XHTML5". When a document is transmitted with an XML MIME type such as application/xhtml+xml, Web browsers process it with an XML processor and treat it as an "XHTML5" document.
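The MIME type rule above can be sketched as a small dispatch function. This is only an illustration, not text from the specification: the function name concrete_syntax, the "unknown" fallback, and the exact set of types counted as XML MIME types (including the "+xml" suffix check) are assumptions made for the example.

```python
# Hypothetical helper: maps a Content-Type header value to the concrete
# syntax a browser would use, following the rule described in the text.
# The set of XML MIME types below is an assumption for illustration.
XML_MIME_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def concrete_syntax(content_type: str) -> str:
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "text/html":
        return "HTML5"
    if mime in XML_MIME_TYPES or mime.endswith("+xml"):
        return "XHTML5"
    # Anything else is outside the two syntaxes discussed here.
    return "unknown"


if __name__ == "__main__":
    for header in ("text/html; charset=utf-8", "application/xhtml+xml", "image/png"):
        print(header, "->", concrete_syntax(header))
```

The point of the sketch is that the choice of parser depends on the transmitted MIME type rather than on the content of the document itself.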
The specification section on Scope addresses the relationship of HTML 5 to XUL, Flash, Silverlight, and other proprietary UI languages and to CPU-intensive high-end workstations and their associated computing environments:
"This specification is independent of the various proprietary UI languages that various vendors provide. As an open, vendor-neutral language, HTML provides for a solution to the same problems without the risk of vendor lock-in.
For sophisticated cross-platform applications, there already exist several proprietary solutions (such as Mozilla's XUL, Adobe's Flash, or Microsoft's Silverlight). These solutions are evolving faster than any standards process could follow, and the requirements are evolving even faster. These systems are also significantly more complicated to specify, and are orders of magnitude more difficult to achieve interoperability with, than the solutions described in this document. Platform-specific solutions for such sophisticated applications (for example the MacOS X Core APIs) are even further ahead.
The scope of this specification is not to describe an entire operating system. In particular, hardware configuration software, image manipulation tools, and applications that users would be expected to use with high-end workstations on a daily basis are out of scope.
In terms of applications, this specification is targeted specifically at applications that would be expected to be used by users on an occasional basis, or regularly but from disparate locations, with low CPU requirements. For instance: online purchasing systems, searching systems, games (especially multiplayer online games), public telephone books or address books, communications software (e-mail clients, instant messaging clients, discussion software), document editing software, etc."
The specification's Scope section also clarifies the relationship of HTML 5 to HTML 4.01, XHTML 1.1, DOM2 HTML, Web Forms 2.0, and XHTML2.
HTML 5 "represents a new version of HTML4 and XHTML1, along with a new version of the associated DOM2 HTML API. Migration from HTML4 or XHTML 1.1 to the format and APIs described in this specification should in most cases be straightforward, as care has been taken to ensure that backwards-compatibility is retained. The specification will eventually supplant Web Forms 2.0 as well.
XHTML2 defines a new HTML vocabulary with better features for hyperlinks, multimedia content, annotating document edits, rich metadata, declarative interactive forms, and describing the semantics of human literary works such as poems and scientific papers. However, it lacks elements to express the semantics of many of the non-document types of content often seen on the Web... XHTML2 and this specification use different namespaces and therefore can both be implemented in the same XML processor."
The specification's Section 1.3 on "Conformance Requirements" clarifies that both document structure and processing model are normatively defined. It defines detailed processing models to foster interoperable implementations.
"This specification describes the conformance criteria for user agents (relevant to implementors) and documents (relevant to authors and authoring tool implementors).
There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this specification applies to implementations regardless of the conformity of the input documents.
User agents fall into several (overlapping) categories with different conformance requirements. [For example]:
- Web browsers and other interactive user agents: Web browsers that support XHTML must process elements and attributes from the HTML namespace found in XML documents...
- Non-interactive presentation user agents: User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply to the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction...
- User agents with no scripting support: Implementations that do not support scripting (or which have their scripting features disabled) are exempt from supporting the events and DOM interfaces mentioned in the specification.
- Conformance checkers: Conformance checkers must verify that a document conforms to the applicable conformance criteria...
- Data mining tools: Applications and tools that process HTML and XHTML documents for reasons other than to either render the documents or check them for conformance should act in accordance to the semantics of the documents that they process.
- Authoring tools and markup generators: Authoring tools and markup generators must generate conforming documents; conformance criteria that apply to authors also apply to authoring tools, where appropriate..."
The 2008-01-22 Working Draft for HTML 5 defines dependencies upon four underlying specifications. It does not require support of any particular network transport protocols, style sheet language, scripting language, or any of the DOM and WebAPI specifications beyond those listed below. However, the language described by this specification is biased towards CSS as the styling language, ECMAScript as the scripting language, and HTTP as the network protocol, and several features assume that those languages and protocols are in use. The four specifications are:
- XML: Implementations that support XHTML5 must support some version of XML, as well as its corresponding namespaces specification, because XHTML5 uses an XML serialisation with namespaces.
- XML Base: User agents must follow the rules given by XML Base to resolve relative URIs in HTML and XHTML fragments. That is the mechanism used in this specification for resolving relative URIs in DOM trees.
- DOM: Implementations must support some version of DOM Core and DOM Events, because this specification is defined in terms of the DOM, and some of the features are defined as extensions to the DOM Core interfaces.
- ECMAScript: Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings for DOM Specifications specification, as this specification uses that specification's terminology.
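The XML and XML Base dependencies in the list above can be illustrated with a short sketch that resolves relative URIs in a namespaced XHTML fragment. It is an approximation using Python's standard library, not the specification's algorithm: the function name resolve_links, the sample markup, and the example.org URIs are all hypothetical, and only the href attribute of a elements is handled.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_NS = "http://www.w3.org/XML/1998/namespace"   # namespace of xml:base
XHTML_NS = "http://www.w3.org/1999/xhtml"         # the HTML namespace

# Hypothetical sample document, made up for this illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <a href="a.html">one</a>
    <div xml:base="docs/">
      <a href="b.html">two</a>
    </div>
  </body>
</html>"""


def resolve_links(xml_text, document_uri):
    """Return absolute URIs for every XHTML <a href>, honouring xml:base."""
    resolved = []

    def walk(element, base):
        # An xml:base attribute changes the base URI for this subtree.
        xml_base = element.get("{%s}base" % XML_NS)
        if xml_base is not None:
            base = urljoin(base, xml_base)
        if element.tag == "{%s}a" % XHTML_NS and element.get("href") is not None:
            resolved.append(urljoin(base, element.get("href")))
        for child in element:
            walk(child, base)

    walk(ET.fromstring(xml_text), document_uri)
    return resolved


if __name__ == "__main__":
    for uri in resolve_links(SAMPLE, "http://example.org/site/index.html"):
        print(uri)
```

The first link resolves against the document URI, while the second resolves against the base set by the enclosing xml:base attribute, which is the behaviour the XML Base dependency is meant to guarantee.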
The HTML 5 specification will not be considered finished before there are at least two complete implementations of the specification. This approach differs from the one taken for previous versions of HTML. The goal is to ensure that the specification is implementable and usable by designers and developers once it is finished.
The editors of the Working Draft provide instructions on how to read the specification: "This specification should be read like all other specifications. First, it should be read cover-to-cover, multiple times. Then, it should be read backwards at least once. Then it should be read by picking random sections from the contents list and following all the cross-references."
Publication note: The HTML 5 [Editor's Draft] specification is also being produced by the WHATWG. The two specifications are identical from the Table of Contents onwards. The W3C HTML Working Group is the W3C working group responsible for this specification's progress along the W3C Recommendation track.