Cover Pages: Last Call Working Draft for W3C Speech Synthesis Markup Language (SSML).

The W3C Voice Browser Working Group has released a Last Call Working Draft of the "Speech Synthesis Markup Language Version 1.0." This specification describes markup for generating synthetic speech via a speech synthesizer, and forms part of the proposals for the W3C Speech Interface Framework. The Voice Browser Working Group has sought to develop standards to enable access to the Web using spoken interaction. The Speech Synthesis Markup Language Specification is part of this set of new markup specifications for voice browsers, and is designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications. The essential role of the SSML markup language is to provide authors of synthesizable content a standard way to control aspects of speech such as pronunciation, volume, pitch, rate, etc. across different synthesis-capable platforms. SSML is based upon the JSGF and/or JSML specifications, which are owned by Sun Microsystems, Inc.; a related initiative to establish a standard system for marking up text input is SABLE. An informative Appendix B provides the XML DTD for SSML; the normative Appendix C defines the SSML XML Schema.
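
As a rough illustration of what "controlling volume, pitch, and rate" can look like in practice, the following Python sketch builds a small SSML fragment with the standard library. The element and attribute names (speak, prosody, rate, pitch, volume) and the label values used here follow SSML 1.0 as later published; the helper name build_ssml and the sample sentence are hypothetical, and the exact details of the 02-December-2002 draft may differ. A conforming document would also need items omitted here, such as the SSML namespace declaration and an xml:lang attribute.

```python
import xml.etree.ElementTree as ET


def build_ssml(text, rate="medium", pitch="medium", volume="medium"):
    """Wrap plain text in a <speak> root containing one <prosody> element.

    build_ssml is a hypothetical helper for illustration only; it does not
    validate the attribute values against the SSML schema.
    """
    # Root element of an SSML document (namespace and xml:lang omitted here).
    speak = ET.Element("speak", {"version": "1.0"})
    # <prosody> carries the speech-control attributes named in the text above.
    prosody = ET.SubElement(
        speak, "prosody", {"rate": rate, "pitch": pitch, "volume": volume}
    )
    prosody.text = text
    # Serialize to a str; special characters in the text are escaped.
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # Ask for slower, louder speech for a sample sentence.
    print(build_ssml("Your flight departs at noon.", rate="slow", volume="loud"))
```

The resulting string could then be handed to any synthesizer that accepts SSML input, which is the cross-platform portability the specification aims for.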

Bibliographic information: Speech Synthesis Markup Language Version 1.0. W3C Working Draft 02-December-2002. Edited by Daniel C. Burnett (Nuance), Mark R. Walker (Intel), and Andrew Hunt (SpeechWorks International). Version URL: http://www.w3.org/TR/2002/WD-speech-synthesis-20021202/. Latest version URL: http://www.w3.org/TR/speech-synthesis/. Previous version URL: http://www.w3.org/TR/2002/WD-speech-synthesis-20020405/.

Status: "This is a W3C Last Call Working Draft for review by W3C Members and other interested parties. Last Call means that the Working Group believes that this specification is technically sound and therefore wishes this to be the last call for comments. If the feedback is positive, the Working Group plans to submit it for consideration as a W3C Candidate Recommendation. Comments can be sent until 15 January 2003... Although an Implementation Report Plan has not yet been developed for this specification, the Working Group currently expects to require at least two independently developed interoperable implementations of each required feature, and at least one implementation of each feature, in order to exit the next phase of this document, the Candidate Recommendation phase. To help the Voice Browser Working Group build such a report, reviewers are encouraged to implement this specification and to indicate to W3C which features have been implemented, and any problems that arose..."

From the W3C Voice Browser Activity Statement:

W3C is working to expand access to the Web to allow people to interact via key pads, spoken commands, listening to prerecorded speech, synthetic speech and music. This will allow any telephone to be used to access appropriately designed Web-based services, and will be a boon to people with visual impairments or needing Web access while keeping their hands and eyes free for other things. It will also allow effective interaction with display-based Web content in the cases where the mouse and keyboard may be missing or inconvenient.

To fulfill this goal, the W3C Voice Browser Working Group is defining a suite of markup languages covering dialog, speech synthesis, speech recognition, call control and other aspects of interactive voice response applications. Specifications such as the Speech Synthesis Markup Language, Speech Recognition Grammar Specification, and Call Control XML are core technologies for describing speech synthesis, recognition grammars, and call control constructs respectively. VoiceXML is a dialog markup language that leverages the other specifications for creating dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key (touch tone) input, recording of spoken input, telephony, and mixed initiative conversations.

These specifications bring the advantages of Web-based development and content delivery to interactive voice response applications. Further work is anticipated on enabling their use with other W3C markup languages such as XHTML, XForms, and SMIL. This will be done in conjunction with other W3C Working Groups, including the Multimodal Interaction Activity.

Principal references:

Speech Synthesis Markup Language Version 1.0. W3C Working Draft 02-December-2002.
W3C Voice Browser Activity
W3C Voice Browser Activity Statement
Mail archives for 'www-voice'
Introduction and Overview of W3C Speech Interface Framework. W3C Working Draft 4-December-2000.
JSpeech Markup Language (JSML). W3C Note 05-June-2000.
"W3C Publishes New Speech Synthesis Markup Language Specification." News item 2002-04-05.
"W3C Speech Synthesis Markup Language Specification" - Main reference page.

