At the SpeechTEK 2004 Conference, IBM announced a major contribution of software to open source initiatives at the Apache Software Foundation and the Eclipse Foundation.
The new software projects are intended to "spur the availability of speech-enabled applications by making it easier and more attractive for developers to build and add speech recognition capability in a standardized way. Supported by more than 20 key industry players from speech vendors to platform providers, the initiative is aimed at ending the battles over competing, proprietary specifications."
An Eclipse Voice Tools Project will "focus on Voice Application tools in the JSP/J2EE space, based on W3C standards, so that these standards become dominant in voice application development. It will depend on and extend the XML and Web development capabilities of the Eclipse Web Tools Platform Project, providing a set of Eclipse plugins that will provide development tools for W3C Standards/Recommendations for Voice application markup."
Under the project proposal, the Voice Tools will initially "consist of editors for VoiceXML, the XML Form of SRGS (Speech Recognition Grammar Specification), and CCXML (Call Control eXtensible Markup Language). Implementations of other tools that implement W3C voice standards, such as the LexiconML (Pronunciation Markup Language), will be added as the standards solidify and the Voice Tools Eclipse community grows."
The Voice Tools editors are planned as "extensions of the SSE (Structured Source Editor) from the Web Tools Platform Project, and will have the capability of syntactical validation and content-assistance of the markup tags for each of the W3C standards. The XML DTD (Document Type Description) for the standard markup will be used to perform these functions, and the user will have the ability to choose the DTD from a preference page associated to each of the markup editors."
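The DTD-driven validation described in the proposal can be illustrated with the standard JAXP parser that ships with Java. This is only a minimal sketch, not the Voice Tools or SSE implementation: the class name `DtdValidationSketch` and the tiny inline DTD are assumptions made for illustration, and the toy DTD covers just three elements rather than the real VoiceXML 2.0 DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical sketch: validate voice markup against a DTD, as an editor might.
class DtdValidationSketch {

    // Toy DTD (an assumption for this example, NOT the real VoiceXML DTD).
    static final String TOY_DTD =
        "<!DOCTYPE vxml [\n"
        + "  <!ELEMENT vxml (form+)>\n"
        + "  <!ATTLIST vxml version CDATA #REQUIRED>\n"
        + "  <!ELEMENT form (block*)>\n"
        + "  <!ATTLIST form id ID #IMPLIED>\n"
        + "  <!ELEMENT block (#PCDATA)>\n"
        + "]>\n";

    // Returns one message per validity or well-formedness problem found.
    static List<String> validate(String body) {
        final List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }
                public void error(SAXParseException e) {
                    // Validity errors are recoverable, so collect and continue.
                    problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not well-formed: parsing cannot continue
                }
            });
            builder.parse(new InputSource(new StringReader(TOY_DTD + body)));
        } catch (SAXException e) {
            problems.add("not well-formed: " + e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String valid = "<vxml version=\"2.0\"><form id=\"f\"><block>Hello</block></form></vxml>";
        String invalid = "<vxml version=\"2.0\"><form><prompt>Hi</prompt></form></vxml>";
        System.out.println("valid document problems: " + validate(valid));
        System.out.println("invalid document problems: " + validate(invalid));
    }
}
```

An editor built along these lines could surface each collected message as a marker on the offending line. Swapping the inline DTD for one chosen by the user would correspond to the preference page the proposal mentions.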
IBM is also contributing its Reusable Dialog Components (RDCs) technology to the Apache Software Foundation. RDCs are "pre-built speech software components, or building blocks that handle basic functions such as date, time, currency, locations (major cities, states, zip codes). They are often-used functions in speech-enabled infrastructure applications. For example, these RDCs allow a caller to book a flight using an auto-agent over the phone. Multiple reusable dialog components can be aggregated to provide higher levels of user functionality."
RDCs are Java Server Page (JSP) tags developed by IBM Research that "enable dynamic development of voice applications and multimodal user interfaces. JSPs that incorporate RDC tags automatically generate W3C VoiceXML 2.0 at runtime — providing a standard basis for speech applications. By providing familiar and standards-based programming models, J2EE developers can add voice interaction to Web applications. By making the RDC framework and set of example tags available to the community, speech components built using it will work together, regardless of the vendor that created them."
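The idea of server-side components that emit VoiceXML 2.0 at runtime, and that can be aggregated into larger dialogs, can be sketched in plain Java. This is not the RDC tag library or its API: the class `VoiceXmlFieldSketch` and its methods are hypothetical names, and the example relies only on the VoiceXML 2.0 `<field>` element with a built-in grammar type such as `date` or `currency`, which individual voice platforms may or may not support.

```java
// Hypothetical sketch: small reusable pieces that render VoiceXML 2.0 fields
// and are aggregated into one form, loosely mirroring what a dialog
// component does when a page is requested.
class VoiceXmlFieldSketch {

    // Escape text so it is safe inside XML attributes and element content.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // One "component": a field that collects a value of a built-in type.
    static String field(String name, String builtinType, String prompt) {
        return "    <field name=\"" + escape(name) + "\" type=\"" + escape(builtinType) + "\">\n"
             + "      <prompt>" + escape(prompt) + "</prompt>\n"
             + "    </field>\n";
    }

    // Aggregate several rendered fields into a single VoiceXML 2.0 document.
    static String document(String formId, String... fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<vxml version=\"2.0\" xmlns=\"http://www.w3.org/2001/vxml\">\n");
        sb.append("  <form id=\"").append(escape(formId)).append("\">\n");
        for (String f : fields) {
            sb.append(f);
        }
        sb.append("  </form>\n");
        sb.append("</vxml>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // A flight-booking style dialog assembled from two components.
        System.out.print(document("booking",
            field("departureDate", "date", "What day would you like to leave?"),
            field("maxFare", "currency", "What is the most you want to pay?")));
    }
}
```

In a JSP setting the same markup would be produced by custom tags rather than direct method calls, so a page author composes the dialog declaratively while the generated output remains standard VoiceXML.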
From the IBM Announcement
IBM today announced it is contributing software to the open source community in a move to spur the availability of speech-enabled applications by making it easier and more attractive for developers to build and add speech recognition capability in a standardized way.
The initiative, supported by more than 20 key industry players from speech vendors to platform providers, is aimed at ending the battles over competing, proprietary specifications.
IBM is contributing Reusable Dialog Components (RDCs) to Apache Software Foundation and proposing a project at the Eclipse Foundation to donate markup editors for speech standards established by the W3C.
RDCs are pre-built speech software components, or "building blocks," that handle basic functions such as date, time, currency, and locations (major cities, states, zip codes). They are often-used functions in speech-enabled infrastructure applications. These allow a caller to, for example, book a flight using an auto-agent over the phone. Multiple reusable dialog components can be aggregated to provide higher levels of user functionality.
Developed by IBM Research, RDCs are Java Server Page (JSP) tags that enable dynamic development of voice applications and multimodal user interfaces. JSPs that incorporate RDC tags automatically generate W3C VoiceXML 2.0 at runtime — providing a standard basis for speech applications. By providing familiar and standards-based programming models, J2EE developers can add voice interaction to Web applications. And by making the RDC framework available to the community, speech components built using it will work together, regardless of the vendor that created them. Both the framework and a set of example tags are to be contributed to the Apache Software Foundation.
Separately, IBM's contribution of speech markup editors to Eclipse is aimed at making it easier for developers to write standards-based speech applications as well as create and utilize RDCs within those applications. In proposal stage, this contribution not only gives speech developers a standard way of writing VoiceXML applications, it can also give web developers tools to more easily add speech access to their web applications. This comprises the initial formation of a project at Eclipse for open source tools for voice application development, which will be further evolved by several companies in the VoiceXML community.
Supporters of this initiative include: Apptera, AT&T, Audium, Avaya, Cisco, Fluency, Genesys, Kirusa, Loquendo, Motorola, Nortel, Nuance, Openstream, ScanSoft, Siebel, Syntellect, Telisma, TuVox, V-Enable, Viecore, Vocomo, VoiceGenie, Voice Partners, and VoxGeneration.
Currently, much of the code in the speech ecosystem is proprietary and specific to each vendor. This initiative is aimed at giving speech developers the benefits of open, standards-based programming models and tools that mainstream developers have had. This can also allow companies to speech-enable their existing applications more quickly and efficiently since developers will be able to build speech applications from standards-based components from various speech providers in the same application.
"Since its initial $40 million contribution to launch Eclipse in November of 2001, IBM has continued to contribute to making Eclipse an open platform for application development and integration," said Mike Milinkovich, Executive Director of the Eclipse Foundation. "With this project proposal, IBM is taking another step toward propelling innovation and giving Java developers the tools to work speech technology into their applications."
The initiative follows closely on the heels of IBM's contribution of its Cloudscape database to the Apache open source community as well as additional developer resources for the community to build on Cloudscape.
"This is the latest step in IBM's contribution to open source and to speech technology," said Gary Cohen, General Manager, IBM Pervasive Computing. "By giving more standards-based speech resources to the development community, IBM hopes to accelerate development and drive innovation in all areas of the speech ecosystem — from speech vendors, to ISVs, to platform providers."
Principal references:
- IBM Announcement 2004-09-13: "IBM to Contribute Speech Software to Apache Software and Eclipse Foundations. Open Source Initiative Aimed at Driving Interoperability Between Vendors, Broadening Resources Available to Speech Community."
- Project Proposal for Voice Tools Project
- Eclipse Web Tools Platform Project
- Eclipse Foundation
- Jakarta Project: Reusable Dialog Components (RDC) Tag library
- Apache Software Foundation
- IBM open source projects
- Press:
- "Speech Code from IBM to Become Open Source." By Steve Lohr. From CNet News.com and The New York Times (September 12, 2004).
- "IBM Developing Speech Market. Donating reusable dialogue components to Apache to help drive speech app marketplace for contact centers, partnering on products with Avaya." By Jim Ericson. From Line56 (September 13, 2004).
- "IBM Donates Voice Code to Apache." By Erin Joyce. From InternetNews.com (September 13, 2004).
- "IBM Takes Speech Tech Open Source." By Shelley Solheim. In eWEEK (September 13, 2004).
- "IBM Releases Speech Technology To Open Source." By Antone Gonsalves [TechWeb News]. In CRN (September 14, 2004).
General references:
- "Speech Synthesis Markup Language (SSML) Version 1.0 Advances to W3C Recommendation."
- "W3C Speech Synthesis Markup Language Specification" - Main reference page.
- "VoiceXML Forum" - Main reference page.