[NOTE: This local archive copy of "XML Activity" was mirrored as a snapshot from the official and canonical URL, http://www.w3.org/XML/Activity, 1999-01-06; please refer to the canonical source document if possible. A few anchors have been placed in this version to facilitate linking to relevant subsections. 1999-01-29.]

XML Activity

Work on XML is being managed as part of W3C's Architecture Domain.

Activity statements provide a managerial overview of W3C's work in this area. They are designed to be read from beginning to end, and to be informative and interesting. Each one describes the role of W3C, the benefits to the Web community, accomplishments to date, and what the future holds.

Introduction
Role of W3C in developing XML
What the Future Holds
Contacts

For brevity, in this Activity Statement, we often refer to the Extensible Markup Language as simply "XML".

Introduction

XML -- the Extensible Markup Language -- is a simple, very flexible text format based on SGML (ISO 8879). Designed to meet the challenges of large-scale electronic publishing, XML™ will also play an increasingly important role in the exchange of a wide variety of data on the Web.

XML will

Enable internationalized media-independent electronic publishing
Allow industries to define platform-independent protocols for the exchange of data, especially the data of electronic commerce
Deliver information to user agents in a form that allows automatic processing after receipt
Make it easy for people to process data using inexpensive software
Allow people to display information the way they want it
Provide metadata -- data about information -- that will help people find information and help information producers and consumers find each other

Simple example of XML

The best way to appreciate what XML looks like is with a simple example. Imagine your company sells products on-line. Marketing descriptions of the products are written in HTML, but names and addresses of customers, and also prices and discounts are formatted with XML. Here is the information describing a customer:

    <customer-details id="AcPharm39156">
        <name>Acme Pharmaceuticals Co.</name>
        <address country="US">
            <street>7301 Smokey Boulevard</street>
            <city>Smallville</city>
            <state>Indiana</state>
            <postal>94571</postal>
        </address>
    </customer-details>

The XML syntax uses matching start and end tags, such as <name> and </name>, to mark up information. A piece of information marked by the presence of tags is called an element; elements may be further enriched by attaching name-value pairs called attributes (for example, country="US" in the example above). XML's simple syntax is easy for machines to process, and has the attraction of remaining understandable to humans. XML is based on SGML, and is familiar in look and feel to those accustomed to HTML.
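
As a rough illustration of how software might read such markup, the following sketch parses the customer record above and picks out elements and attributes. The choice of Python and its standard xml.etree.ElementTree module, and the variable names, are assumptions made for this example; XML itself prescribes no particular programming language or library.

```python
import xml.etree.ElementTree as ET

# The customer record from the example above, held as a string.
CUSTOMER_XML = """
<customer-details id="AcPharm39156">
    <name>Acme Pharmaceuticals Co.</name>
    <address country="US">
        <street>7301 Smokey Boulevard</street>
        <city>Smallville</city>
        <state>Indiana</state>
        <postal>94571</postal>
    </address>
</customer-details>
""".strip()

# Parse the text into a tree of elements.
customer = ET.fromstring(CUSTOMER_XML)

# Element content is reached by element name ...
name = customer.findtext("name")
city = customer.findtext("address/city")

# ... and attribute values by attribute name.
customer_id = customer.get("id")
country = customer.find("address").get("country")

print(customer_id, name, city, country)
```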

Building applications with XML

XML is a low-level syntax for representing structured data. You can use this simple syntax to support a wide variety of applications. This idea is put across in a simplified way in the diagram below, which shows how XML now underpins a number of Web mark-up languages and applications.

[Figure: HTML, MathML and many other applications are based on XML]

W3C are redeveloping HTML as a suite of XML tag sets so that, although documents will still be marked up using HTML, the markup will conform to the rules of XML. In this environment mathematical expressions can be inserted into documents using MathML, a formatting language written in XML and developed by W3C's Math working group. Presumably, other domain-specific XML-based tag sets will become candidates for inclusion in HTML documents.

W3C's Metadata Activity is developing the Resource Description Framework (RDF). This uses a simple data model expressed in XML syntax as the basis for a language for representing properties of Web resources such as images, documents and the relationships held between them. The Platform for Internet Content Selection (PICS) is being recast in RDF. The PICS framework provides a means for attaching labels to material (in particular, to indicate whether it is suitable for children).

Finally, the Synchronized Multimedia Integration Language (SMIL) is an XML application consisting of a declarative language for scheduling multimedia presentations on the Web.

Outside of W3C, many groups are already defining new formats for information interchange. The number of XML applications appears likely to grow rapidly. There are many areas (for example, the health-care industry, the Inland Revenue, government and finance) where XML applications may soon be used to store and process data. The availability of XML as a simple method for data representation and organization will mean that problems of data incompatibility and tedious manual re-keying will, by and large, be solved.

The Document Object Model and XML

The flexibility of XML makes it ideal for interchange of structured data for further processing on the receiving machine. The W3C Document Object Model Activity will provide an interoperable set of classes and methods to manipulate XML documents (as well as HTML documents) from programming languages such as Java, ECMAScript, VBScript, and C++.
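
To give a feel for this style of programming, here is a minimal sketch that reads and modifies a small document through DOM-style classes and methods. It uses Python's standard xml.dom.minidom module as a stand-in, which is an assumption of this example: Python is not among the languages named above, and the interfaces the DOM Activity finally delivers may differ in detail.

```python
from xml.dom.minidom import parseString

doc = parseString('<address country="US"><city>Smallville</city></address>')

# Navigate with DOM methods: find an element by tag name and read its text.
address = doc.documentElement
city = address.getElementsByTagName("city")[0]
city_text = city.firstChild.data

# Modify the tree: change an attribute and append a new child element.
address.setAttribute("country", "USA")
state = doc.createElement("state")
state.appendChild(doc.createTextNode("Indiana"))
address.appendChild(state)

# Serialize the modified element back to XML text.
result = address.toxml()
print(result)
```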

Namespaces and XML

XML markup can be used to structure data to support automatic processing. How is software to recognize the markup it exists to process, and avoid confusing it with markup designed for the use of some other software? For example, one application might use an element called address to label the mailing address of a person; in another application, address might instead be used for a network address. How would a machine or even a person looking at the XML markup know which use of "address" is intended in a given instance?

What is needed is a method for identifying the conventions governing the use of particular sets of elements. The idea is to use a Web address as a globally unique name for such a set of conventions. W3C's work on namespaces is concerned with the elaboration of this idea.
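
The two uses of "address" can be told apart as in the sketch below. It assumes the xmlns attribute syntax of the Namespaces in XML specification, which may differ from the drafts current when this statement was written. The two example.com addresses, the prefixes p and n, and the use of Python's xml.etree.ElementTree are hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names; any globally unique Web address would do.
POSTAL_NS = "http://example.com/ns/postal"
NETWORK_NS = "http://example.com/ns/network"

DOC = f"""
<contact xmlns:p="{POSTAL_NS}" xmlns:n="{NETWORK_NS}">
    <p:address>7301 Smokey Boulevard, Smallville</p:address>
    <n:address>192.0.2.17</n:address>
</contact>
""".strip()

root = ET.fromstring(DOC)

# ElementTree spells a qualified name as "{namespace}local-name",
# so the two "address" elements no longer collide.
mailing = root.findtext(f"{{{POSTAL_NS}}}address")
network = root.findtext(f"{{{NETWORK_NS}}}address")

print(mailing)
print(network)
```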

Current Situation

W3C's XML 1.0 Recommendation was issued on February 10, 1998. In response to the increasing popularity of XML as a basis for Web applications, the Activity has organized itself into the following groups:

[XML Coordination Group
XML Schema Working Group
XML Linking Working Group
XML Information Set Working Group
XML Fragment Working Group
The XML Syntax Working Group]

XML Coordination Group

The members of this group are the chairs of the individual Working Groups. Its role is to provide a forum for coordination among the Working Groups of the XML Activity, between the XML Activity and other parts of W3C, and between the XML Activity and other organizations. In particular, the Coordination Group:

Coordinates workflow
Watches out for dependencies between WGs
Creates and dissolves WGs within its chartered scope (rare but essential)
Sets master WG/IG/CG meeting schedule
Maintains the public roadmap (the document you are now reading)
Notifies IG of changes they should be aware of
Manages crises
- Turns hard policy/architecture questions over to the IG
- Deals with upper W3C echelons when necessary
- Deals with aggrieved member organizations when necessary
Maintains liaison inside and outside the W3C
In particular, gathers and forwards requests for additional requirements to the appropriate WG(s)
Forwards requests for process changes originating in the IG upward into the larger W3C
Suggests/proposes items relating to policy and architecture to the IG for its consideration

The chair of the XML Coordination Group is Jon Bosak of Sun Microsystems.

XML Schema Working Group

While XML 1.0 supplies a mechanism, the Document Type Definition (DTD), for declaring constraints on the use of markup, automated processing of XML documents requires more rigorous and comprehensive facilities in this area. Requirements include constraints on how the component parts of an application fit together, the document structure, attributes, data-typing, and so on. The XML Schema Working Group is addressing means for defining the structure, content and semantics of XML documents.
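
The gap can be seen in a small sketch. The DTD below constrains which elements may appear, but treats the content of postal as plain character data, so a data-type rule has to be checked by application code. The five-digit rule and the function postal_is_valid are hypothetical, invented for this example. Python's xml.etree.ElementTree parser reads the DTD but does not validate against it, and nothing here represents the schema language the Working Group will define.

```python
import re
import xml.etree.ElementTree as ET

# A DTD can say which elements appear and in what order, but it treats
# the content of <postal> as plain character data.
DOC = """<!DOCTYPE address [
  <!ELEMENT address (city, postal)>
  <!ATTLIST address country CDATA #REQUIRED>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT postal (#PCDATA)>
]>
<address country="US"><city>Smallville</city><postal>94571</postal></address>"""


def postal_is_valid(address):
    """Hypothetical data-type rule: a US postal code is exactly five digits."""
    if address.get("country") != "US":
        return True  # this sketch has no rule for other countries
    return re.fullmatch(r"[0-9]{5}", address.findtext("postal", "")) is not None


# Both documents satisfy the DTD's structural rules; only the first one
# satisfies the data-type rule (the second ends in the letter "l").
good = ET.fromstring(DOC)
bad = ET.fromstring(DOC.replace("94571", "9457l"))

print(postal_is_valid(good), postal_is_valid(bad))
```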

The co-chairs of the Schema WG are Dave Hollander of Hewlett-Packard and C. M. Sperberg-McQueen, of the University of Illinois at Chicago and the W3C.

XML Linking Working Group

The XML Linking Working Group is designing hypertext links for XML. Engineers defining the way that links are to be written in XML have distinguished between "external" links between objects and "internal" links to locations within XML documents; both types will receive detailed treatment by this group. The objective of the XML Linking Working Group is to design advanced, scalable, and maintainable hyperlinking and addressing functionality for XML.

The working drafts XML Linking Language (XLink) and XML Pointer Language (XPointer) represent the basis on which the work of the Linking WG will proceed.

The chair of the Linking WG is Bill Smith, of Sun Microsystems.

[Note: Mail Archives for comments on XML Linking, viz., www-xml-linking-comments@w3.org.]

XML Information Set Working Group

The XML 1.0 Recommendation describes the physical representation of XML documents: the use of brackets, character strings and other "nuts and bolts" which make up the language. The Information Set Working Group is looking at more abstract descriptions of XML documents in terms of document tree structures, elements, their attribute lists and so on. The idea is to provide a common reference set that other specifications can use and extend to construct their underlying data models, thus helping to ensure interoperability among the various XML-based specifications and among XML software tools in general.

The chair of the Information Set WG is David Megginson, invited expert.

XML Fragment Working Group

The XML standard supports logical documents composed of possibly several entities. It may be desirable to view, edit, or interchange one or more of the entities or parts of entities without interchanging the entire document. The problem, then, is how to provide to a recipient of such a fragment the appropriate information about the context that fragment had in the larger document.

The goal of the Fragment Working Group is to define a way to send fragments of an XML document without having to send all or part of the parent document as well. The delivered fragments can either be viewed or edited immediately or accumulated for later use, assembly, or other processing.
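
One way to picture the problem is the sketch below, which extracts a single element and records the names and attributes of its ancestors as context. The function extract_fragment and the layout of its context list are inventions of this example, written with Python's xml.etree.ElementTree. They are not the format the Working Group is defining.

```python
import xml.etree.ElementTree as ET

DOC = """<customer-details id="AcPharm39156">
    <name>Acme Pharmaceuticals Co.</name>
    <address country="US"><city>Smallville</city></address>
</customer-details>"""


def extract_fragment(root, path):
    """Return (context, fragment_text) for the first element matching path.

    context lists each ancestor's name and attributes, outermost first.
    This layout is invented for the sketch; it is not a W3C format.
    """
    target = root.find(path)
    if target is None:
        raise LookupError(path)
    # ElementTree keeps no parent pointers, so build a child-to-parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    context = []
    node = parent_of.get(target)
    while node is not None:
        context.insert(0, (node.tag, dict(node.attrib)))
        node = parent_of.get(node)
    return context, ET.tostring(target, encoding="unicode").strip()


context, fragment = extract_fragment(ET.fromstring(DOC), "address/city")
print(context)
print(fragment)
```

A recipient holding only the fragment text could use the context list to learn, for instance, that this city belongs to an address whose country is "US".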

The chair of the XML Fragment WG is Paul Grosso of ArborText.

[Note: Mail archives for the XML Fragments List, www-xml-fragment-comments@w3.org.]

The XML Syntax Working Group

The XML Syntax Working Group is concerned with several aspects of XML:

XML Style Sheet Linking. Since XML has no pre-defined set of tags like HTML's P and H1, information about how to display elements must be given in a stylesheet. This is a bit more work, but it provides tremendous flexibility and facilitates managing consistency across large sets of documents. For details, see the W3C Style Sheets Activity and the CSS and XSL home pages.
Defining an XML profile, consisting of a simplified and reduced set of XML features which might specify, for example, a sub-set of the full recommendation that a given device might support, or a given XML application might use.
Canonicalizing XML, which involves finding a single or "canonical" version of every possible form of the same document (by reducing white space, mapping quote marks to a standard form, etc.) with a view to using that standard form for the purpose of applying digital signature technology. An algorithm is applied to the canonical form of the document to generate a large number. If the document is tampered with in any way when it is sent down the wire, the algorithm applied to the received document will generate a different number from the original, revealing that even a minute change has taken place.
Tracking Internationalization Developments. One part of this work, for example, concerns XML's use of the Universal Character Set defined by ISO/IEC 10646 and Unicode. The goal is to arrange that each time these standards are extended, the XML specification is automatically updated accordingly.
Errata to XML 1.0. The group plan to track errata in the XML Recommendation.
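
The canonicalization item above can be sketched as follows. The example relies on the canonicalize function in Python's xml.etree.ElementTree (Python 3.8 or later), which implements the later Canonical XML 2.0 rules rather than anything this Working Group has yet produced. The choice of SHA-256 as the algorithm and the helper name digest are likewise assumptions. A real digital signature would also involve a signing key, which is omitted here.

```python
import hashlib
import xml.etree.ElementTree as ET


def digest(xml_text):
    """SHA-256 digest of the canonical form of an XML document."""
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two spellings of the same document: quote style, attribute order,
# spacing inside tags and empty-element syntax all differ.
original = "<address country='US' type='billing'><city>Smallville</city><note/></address>"
respelled = '<address  type="billing" country="US" ><city>Smallville</city><note></note></address>'

# A one-character change to the content.
tampered = original.replace("Smallville", "Smallvile")

print(digest(original) == digest(respelled))
print(digest(original) == digest(tampered))
```

Because both spellings reduce to the same canonical text, they produce the same number, while the altered copy does not.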

The co-chairs of the XML Syntax WG are Tim Bray, invited expert, and Joel Nava of Adobe.

What the Future Holds

The Schema Working Group plans to deliver Requirements, Working Drafts, and Proposed Recommendations on data typing and schema language in 1999.

The Fragment Working Group expects to issue a W3C Proposed Recommendation for Fragment Interchange by Summer 1999.

The Syntax Group plans to deliver:

Proposed Recommendation for the XML Style Sheet Linking Version 2
Proposed Recommendation for the XML profile
Proposed Recommendation for Canonicalizing XML

all during the middle of next year (1999).

The Information Set Working Group plans to release the first public XML Information Set Working Draft by the end of 1998 and a Proposed Recommendation by Spring 1999.

Contacts

Dan Connolly, XML Activity Lead

by Dan Connolly and Tim Bray
Last modified $Date: 1998/11/12 22:34:05 $
Created January 1996

Copyright © 1998 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.