Members of the W3C Multimodal Interaction Working Group have issued a first public working draft for the Ink Markup Language which "serves as the data format for representing ink entered with an electronic pen or stylus. The markup allows for the input and processing of handwriting, gestures, sketches, music and other notational languages in Web-based applications. It provides a common format for the exchange of ink data between components such as handwriting and gesture recognizers, signature verifiers, and other ink-aware modules. InkML supports a complete and accurate representation of hand-drawn ink. For instance, in addition to the pen position over time, InkML allows recording of information about transducer device characteristics and detailed dynamic behavior to support applications such as handwriting recognition and authentication. It offers support for recording additional channels such as pen tilt, or pen tip force, commonly referred to as pressure in manufacturers' documentation. InkML also provides means for extension; by virtue of the XML-based language notation, users may easily add application-specific information to ink files to suit the needs of the application at hand. The Ink Markup Language is designed for use in the W3C Multimodal Interaction Framework as proposed by the W3C Multimodal Interaction Activity."
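The trace-and-channel model described above can be sketched with a small fragment. The element names follow the style of the working draft; the channel declarations and coordinate values here are illustrative only, not copied from the specification:

```xml
<ink>
  <!-- Declare which channels are recorded for each sample point:
       X/Y pen position plus F (pen tip force, i.e. "pressure"). -->
  <traceFormat>
    <channel name="X" type="decimal"/>
    <channel name="Y" type="decimal"/>
    <channel name="F" type="decimal"/>
  </traceFormat>
  <!-- One pen-down stroke: points are comma-separated,
       each point giving X Y F values in declaration order. -->
  <trace>
    10 0 0.35, 9 14 0.40, 8 28 0.42, 7 42 0.41, 6 56 0.38
  </trace>
</ink>
```

A recognizer or signature verifier consuming such a file would read the trace format first, then interpret each point in a trace according to the declared channels.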
Bibliographic Information
Ink Markup Language. Edited by Gregory Russell (IBM). W3C Working Draft 6-August-2003. First Working Draft. Version URL: http://www.w3.org/TR/2003/WD-InkML-20030806. Latest version URL: http://www.w3.org/TR/InkML. Authors: Gregory Russell (IBM), Yi-Min Chee (IBM), Giovanni Seni (Motorola), Larry Yaeger (Apple), Christopher Tremblay (Corel), Katrin Franke (Fraunhofer Gesellschaft), Sriganesh Madhvanath (HP), and Max Froumentin (W3C).
Background and Motivation
As more electronic devices with pen interfaces become available for entering and manipulating information, applications need to leverage this method of input more effectively. Handwriting is an input modality that is very familiar to most users, since nearly everyone learns to write in school. Hence, users will tend to use it as a mode of input and control when it is available.
A pen-based interface consists of a transducer device and a pen, so that the movement of the pen is captured as digital ink. Digital ink can be passed on to recognition software that converts the pen input into appropriate computer actions. Alternatively, the handwritten input can be organized into ink documents, notes, or messages that can be stored for later retrieval or exchanged over communications networks. Such ink documents are appealing because they capture information as the user composed it, including text in any mix of languages, as well as drawings, equations, and graphs.
Hardware and software vendors have typically stored and represented digital ink using proprietary or restrictive formats. The lack of a public and comprehensive digital ink format has severely limited the capture, transmission, processing, and presentation of digital ink across heterogeneous devices developed by multiple vendors. In response to this need, the Ink Markup Language (InkML) provides a simple and platform-neutral data format to promote the interchange of digital ink between software applications. [adapted from the Overview]
Ink Markup Language Requirements
A W3C Note published in January 2003 outlines the Requirements for the Ink Markup Language. An excerpt from the Introduction follows.
The Ink Markup Language is the data format used to represent ink entered with an electronic pen or stylus in a multimodal system.
These requirements have been compiled based on review of the fundamental Multimodal Interaction Requirements and additional considerations pertaining to the role of the markup in pen-enabled systems. For each requirement that has been derived (in whole or in part) from fundamental Multimodal Interaction Requirements, the derivation is noted.
The Ink Markup will consist of primitive elements that represent low-level ink data and application-specific elements that characterize meta-information about the ink. Examples of primitive elements are device and screen context characteristics, and pen traces. Application-specific elements provide a higher-level description of the ink data. For example, a segment tag could represent a group of ink traces that belong to a field in a form. Consequently, the requirements for the Ink Markup Language may fall into either of the two categories. This document does not attempt to classify requirements based on whether they are low-level or application-specific.
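The form-field example above might look roughly like the following fragment. The segment element and its fieldName attribute are hypothetical here, invented to illustrate the requirements' notion of application-specific grouping; only the trace content reflects the primitive layer:

```xml
<ink>
  <!-- Hypothetical application-specific grouping: all the strokes
       the user wrote into one form field, wrapped in a segment.
       The element and attribute names are illustrative, not normative. -->
  <segment fieldName="surname">
    <!-- Primitive layer: the pen traces themselves (X Y points). -->
    <trace>5 0, 6 12, 8 25, 11 37</trace>
    <trace>20 3, 21 14, 23 29, 26 41</trace>
  </segment>
</ink>
```

The point of the two-layer design is that a low-level consumer (say, a renderer) can ignore the segment wrapper entirely, while a form-filling application can use it to route the traces to the right field.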
The requirements are organized into six categories: General Requirements, Input Processing, Output Processing, Architectural, Mobility, and Multimodal Synchronization.
About the W3C Multimodal Interaction Framework
The W3C Multimodal Interaction Framework specification identifies and relates the markup languages used in multimodal interaction systems. The framework identifies the major components of multimodal systems; each component represents a set of related functions. It also identifies the markup languages used to describe information required by components and for data flowing among components. The framework describes input and output modes widely used today and can be extended to include additional modes of user input and output as they become available.
The multimodal interaction framework is not an architecture. The multimodal interaction framework is a level of abstraction above an architecture. An architecture indicates how components are allocated to hardware devices and the communication system enabling the hardware devices to communicate with each other. The W3C Multimodal Interaction Framework describes neither how components are allocated to hardware devices nor how the communication system enables the hardware devices to communicate. A later section of the Note describes several example architectures consistent with the framework.
The Framework will build upon a range of existing W3C markup languages together with the W3C Document Object Model (DOM). The DOM defines interfaces whereby programs and scripts can dynamically access and update the content, structure, and style of documents.
W3C Multimodal Interaction Activity
As detailed in the Multimodal Interaction Requirements document, the W3C Multimodal Interaction Activity "is extending the Web user interface to allow multiple modes of interaction, offering users the choice of using their voice, or an input device such as a key pad, keyboard, mouse, stylus or other input device. For output, users will be able to listen to spoken prompts and audio, and to view information on graphical displays. The Working Group is developing markup specifications for synchronization across multiple modalities and devices with a wide range of capabilities. The specifications should be implementable on a royalty-free basis."
Principal references:
- Ink Markup Language. W3C Working Draft 6-August-2003.
- "Requirements for the Ink Markup Language." W3C Note 22-January-2003.
- Contact: Max Froumentin (W3C Multimodal Interaction Working Group)
- Mailing list archives for the www-multimodal@w3.org list.
- W3C Multimodal Interaction Framework. W3C Note 6-May-2003.
- W3C Multimodal Interaction Activity
- W3C Device Independence Activity