The XML development team at IBM alphaWorks labs has released a beta version of a 'Voice Toolkit' to assist in the creation of voice applications "in less time, using a VoiceXML application development environment. The Voice Toolkit features grammar and VoiceXML editors so that application developers do not need to know the internals of voice technology. The Voice Toolkit Beta includes: (1) An integrated development environment (IDE) - runs on the desktop and enables the multi-step process of creating speech applications; (2) A VoiceXML editor - provides content assistance and integrated pronunciation development; (3) A Grammar editor - enables syntax-checking and integrated pronunciation development for generating JSGF grammars for VoiceXML applications. The grammar editor includes grammar creation for SRCL/BNF grammars and it provides conversion capability between SRCL/BNF and JSGF; (4) A pronunciation builder - generates a pronunciation from spelling; and it lets you manually create pronunciations; (5) A basic audio recorder - allows the creation of audio files from spoken text and the playing of previously-recorded audio files; (6) VoiceXML Reusable Dialog Components - pre-written VoiceXML code for use as building blocks for application functions."
From the website description:
The Voice eXtensible Markup Language (VoiceXML) is an XML-based markup language for creating distributed voice applications, much as HTML is a markup language for creating distributed visual applications. VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations. The goal is to provide voice access and interactive voice response (e.g., by telephone, PDA, or desktop) to Web-based content and applications.
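As a rough illustration of the kind of audio dialog described above, the following Python sketch holds a small VoiceXML 1.0-style document in a string and reads it back with the standard library XML parser. The form id, field name, prompt wording, and the `summarize_fields` helper are all hypothetical, chosen only for illustration; the markup has not been checked against IBM's VoiceXML browser.

```python
import xml.etree.ElementTree as ET

# Illustrative VoiceXML 1.0-style dialog: one form with one field that
# prompts the caller and accepts a spoken answer from an inline grammar.
# The id, name, and prompt text are made up for this sketch.
VXML_DOC = """<vxml version="1.0">
  <form id="order">
    <field name="drink">
      <prompt>Would you like coffee, tea, or milk?</prompt>
      <grammar type="application/x-jsgf">coffee | tea | milk</grammar>
      <filled>
        <prompt>You chose <value expr="drink"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""


def summarize_fields(vxml_text):
    """Return (field name, prompt text, inline grammar text) per field."""
    root = ET.fromstring(vxml_text)
    summary = []
    for field in root.iter("field"):
        # find() looks at direct children only, so the prompt nested
        # inside <filled> is not picked up here.
        prompt = field.find("prompt")
        grammar = field.find("grammar")
        summary.append((
            field.get("name"),
            prompt.text.strip() if prompt is not None and prompt.text else None,
            grammar.text.strip() if grammar is not None and grammar.text else None,
        ))
    return summary


if __name__ == "__main__":
    for name, prompt, grammar in summarize_fields(VXML_DOC):
        print(f"field={name!r} prompt={prompt!r} grammar={grammar!r}")
```

The script only inspects the markup; actually playing the prompt and recognizing speech requires a VoiceXML browser.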
A grammar [in this connection] is an enumeration, in compact form, of the set of utterances (words and phrases) that constitute the acceptable user response to a given prompt. The VoiceXML 1.0 specification requires all valid spoken and telephone key-pad input to be specified using a grammar. The Voice Toolkit can create speech recognition and DTMF grammars in JSGF (Java Speech Grammar Format) for use in VoiceXML applications; and it can create Speech Recognition Control Language (SRCL) (a variant of Backus-Naur Form grammars used in DirectTalk) for use in DirectTalk IVR applications. JSGF grammars can be either built-in (from IBM WebSphere Voice Server SDK's VoiceXML browser), inline (within the VoiceXML file), or external (a separate file)...
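The idea of a grammar as a compact enumeration of utterances can be sketched in a few lines of Python. The `expand` function below is a hypothetical helper, not part of the Voice Toolkit. It handles only a small JSGF-like subset: word sequences, `|` alternatives, `( )` grouping, and `[ ]` optional parts. Rule references, weights, tags, and repetition operators are deliberately left out.

```python
import re
from itertools import product


def tokenize(expansion):
    """Split a rule expansion into brackets, bars, and plain words."""
    return re.findall(r"[()\[\]|]|[^\s()\[\]|]+", expansion)


def expand(expansion):
    """List every utterance matched by a JSGF-like rule expansion.

    Supported subset (an assumption of this sketch): sequences,
    '|' alternatives, '( )' grouping, and '[ ]' optional groups.
    """
    tokens = tokenize(expansion)
    pos = 0

    def expect(closer):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != closer:
            raise ValueError(f"expected {closer!r} at token {pos}")
        pos += 1

    def parse_sequence():
        nonlocal pos
        phrases = [[]]  # each phrase is a list of words
        while pos < len(tokens) and tokens[pos] not in ("|", ")", "]"):
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                options = parse_alternatives()
                expect(")")
            elif tok == "[":
                # An optional group may also contribute nothing.
                options = parse_alternatives() + [[]]
                expect("]")
            else:
                options = [[tok]]
            # Concatenate every phrase so far with every option.
            phrases = [p + o for p, o in product(phrases, options)]
        return phrases

    def parse_alternatives():
        nonlocal pos
        results = parse_sequence()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            results = results + parse_sequence()
        return results

    phrases = parse_alternatives()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return [" ".join(p) for p in phrases]


if __name__ == "__main__":
    for utterance in expand("[please] (call | dial) (home | work)"):
        print(utterance)
```

Under these assumptions, the one-line expansion `[please] (call | dial) (home | work)` stands for eight distinct utterances, which is the sense in which a grammar is "compact". A DTMF grammar can be pictured the same way, with key names such as `1 | 2 | 3` in place of words.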