[November 20, 2000] On November 20, 2000, the W3C issued a new working draft specification that describes markup for representing natural language semantics: Natural Language Semantics Markup Language for the Speech Interface Framework. Reference: W3C Working Draft 20-November-2000, by Deborah A. Dahl (Unisys). Document abstract: "The W3C Voice Browser working group aims to develop specifications to enable access to the Web using spoken interaction. This document is part of a set of specifications for voice browsers, and provides details of an XML markup language for describing the meanings of individual natural language utterances. It is expected to be automatically generated by semantic interpreters for use by components that act on the user's utterances, such as dialog managers." In this proposal, the NL semantics representation "uses the data models of the W3C XForms draft specification to represent application-specific semantics. While XForms syntax may change in future revisions of the specification, it is not expected to change in ways that affect the NL Semantics Markup Language significantly." The authors of the WD are members of the W3C Voice Browser Working Group. The specification has been produced as part of the W3C Voice Browser Activity, and forms part of the proposals for the W3C Speech Interface Framework.

The specification includes a set of draft elements and attributes and [later will include] a draft DTD. The markup uses a root element <result> (with attributes grammar, x-model, and xmlns) that contains one or more <interpretation> elements. Multiple interpretations result from ambiguities in the input or in the semantic interpretation. The <interpretation> element has attributes confidence, grammar, x-model, and xmlns. It contains an <input> element holding the input being analyzed, optionally a <model> element defining the XForms data model, and an <instance> element containing the instantiation of the data model for this utterance.

Description: "The general purpose of the NL Semantics Markup is to represent information automatically extracted from a user's utterances by a semantic interpretation component, where utterance is to be taken in the general sense of a meaningful user input in any modality supported by the platform. Referring to the sample Voice Browser architecture in Introduction and Overview of the W3C Speech Interface Framework, a specific architecture can take advantage of this representation by using it to convey content among various system components that generate and make use of the markup. Components that generate NL Semantics Markup: (1) ASR, (2) Natural language understanding, (3) Other input media interpreters [e.g. DTMF, pointing, keyboard], (4) Reusable dialog component, (5) Multimedia integration component. Components that use NL Semantics Markup: (1) Dialog manager, and (2) Multimedia integration component. A platform may also choose to use this general format as the basis of a general semantic result that is carried along and filled out during each stage of processing. In addition, future systems may also potentially make use of this markup to convey abstract semantic content to be rendered into natural language by a natural language generation component..."
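
To make the element structure described above more concrete, the following is a minimal sketch of how a consumer such as a dialog manager might read a <result> document and pick the most confident <interpretation>. It is an illustration only, not an excerpt from the working draft: the sample document, its URIs, the confidence values, the <drink> and <item> instance content, and the function name best_interpretation are all hypothetical. The sketch also assumes that confidence is an integer where larger means more confident, and it omits the optional <model> element and namespace declarations for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Element and attribute names follow the
# description above; the URIs, confidence values, and instance content
# are invented for illustration and do not come from the working draft.
SAMPLE = """\
<result grammar="http://example.com/order.grammar"
        x-model="http://example.com/order.model">
  <interpretation confidence="80">
    <input>I would like a coke</input>
    <instance>
      <drink>coke</drink>
    </instance>
  </interpretation>
  <interpretation confidence="40">
    <input>I would like a coat</input>
    <instance>
      <item>coat</item>
    </instance>
  </interpretation>
</result>
"""


def best_interpretation(xml_text):
    """Return (confidence, input text, instance slots) for the
    highest-confidence <interpretation> in a <result> document.

    Assumes confidence is an integer where larger means more confident;
    a missing confidence attribute is treated as 0.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "result":
        raise ValueError("expected a <result> root element")

    candidates = []
    for interp in root.findall("interpretation"):
        confidence = int(interp.get("confidence", "0"))
        input_text = interp.findtext("input", default="")
        instance = interp.find("instance")
        # Flatten the direct children of <instance> into a tag -> text mapping.
        slots = {}
        if instance is not None:
            for child in instance:
                slots[child.tag] = (child.text or "").strip()
        candidates.append((confidence, input_text, slots))

    if not candidates:
        raise ValueError("no <interpretation> elements found")
    # Ambiguous input yields several interpretations; keep the most confident.
    return max(candidates, key=lambda c: c[0])


if __name__ == "__main__":
    print(best_interpretation(SAMPLE))
```

The two <interpretation> elements in the sample stand in for the kind of ambiguity the draft mentions, here a recognizer unsure between two similar-sounding words. A real consumer would likely keep all candidates rather than only the best one, so that later dialog turns can disambiguate.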
References: