The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Created: December 03, 2002.

IBM WebSphere Voice Application Access Supports VoiceXML.

IBM has announced the WebSphere Voice Application Access middleware product designed to simplify "building and managing voice portals and to more easily extend web-based portals to voice. Leveraging the scalability, personalization, and authentication features of IBM's WebSphere Portal, it enables mobile workers to more easily access information from multiple voice applications -- using a single telephone number. This new offering includes IBM's WebSphere Voice Server as well as ready-to-use email, personal information management (PIM) functions, and sample portlets. It also supports VoiceXML and Java -- including development tools based on Eclipse, the open-source, vendor-neutral platform for writing software -- and uses open-standard programming languages to create voice-enabled applications that will interoperate with a range of web servers and databases. Building on the VoiceXML standards allows IBM WebSphere Voice Application Access to work with third party browsers and their associated underlying speech recognition and text-to-speech technologies. As the VoiceXML 2.0 specification nears final approval, IBM WebSphere Voice Application Access will move quickly to support it."

Description from "Extending an Enterprise with IBM WebSphere Voice Application Access," by Eddie Epstein:

Voice access to computers has become a preferred interface. The availability of high quality automated speech recognition and speech synthesis technologies, combined with lower cost and higher performance hardware, makes automated voice access feasible for most applications. What is particularly important is that most applications can be multichannel in nature: providing voice access in addition to the traditional visual interface.

A major stumbling block for the voice interface has been the unnatural and difficult-to-understand nature of computer generated voices. Recent breakthroughs in the use of concatenative text-to-speech technology have eliminated this limitation and resulted in voice quality comparable to human speech. Speech recognition accuracy has also continued to improve, so that millions of people daily use their voice to 'dial' phone numbers by saying a person's name, manage their investment portfolios, and access weather information, sports scores and other information. In addition to technology improvements, the steady refinement of conversational dialogue design has resulted in a much more efficient and pleasant user experience than was provided by earlier voice activated systems...

The last critical piece to fall into place has been the availability of VoiceXML, an open standards-based voice application design protocol that is supported by all major speech technology suppliers. This standard was designed to allow voice applications to run on all enterprise-quality computer hardware and operating system platforms. Companies can be sure that their investment in a VoiceXML application infrastructure won't lock them into a single supplier for critical system components.

VoiceXML was introduced specifically to eliminate the need for proprietary IVR application design environments, to automatically provide the integration to middleware using the view-and-form based model of Web application design, and to create a standardized interface to speech recognition and speech synthesis technologies. VoiceXML enables WebSphere Voice Application Access to integrate voice interface capabilities in the same way WebSphere Portal Server applications are built on HTML and WML. These protocols provide a modular application design environment with common components sharable across all access modalities.

Most existing automated voice solutions have been created using proprietary voice application environments combined with custom interfaces to back-end business logic and data. These custom interfaces are difficult to integrate with traditional GUI Web access solutions. IBM WebSphere Voice Application Access combines the modular application design paradigm of IBM WebSphere Portal Server with VoiceXML to add voice access to the other modalities supported by WebSphere Portal Server. By building on VoiceXML, not only is the growing community of voice application developers able to directly leverage the voice application access platform, but platform customers should be able to choose between leading speech recognition and text-to-speech offerings.

Components of an application implementation using IBM WebSphere Voice Application Access: As with traditional visual interfaces, application logic and the generation of presentation markup are handled in the application server middleware. VoiceXML markup is delivered to a speech server stack including a VoiceXML browser and underlying automatic speech recognition (ASR) and text-to-speech (TTS) technologies. A media gateway such as IBM WebSphere Voice Response is required to provide connectivity with the telephone network... Individual portlets deliver VoiceXML markup to the Voice Aggregator, which creates complete VoiceXML documents including support for a global main menu. Markup is sent to a compliant VoiceXML browser using standard HTTP connectivity. The VoiceXML browser works with ASR and TTS engines to interpret spoken input and generate voice output. The browser can also accept DTMF (telephone keypad) as input and use prerecorded audio files for output.
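The aggregation step described above can be illustrated with a minimal sketch. It is not IBM's Voice Aggregator: the function names `aggregate` and `form_id`, the `portlet_` id scheme, and the `version="1.0"` attribute are all assumptions made for illustration. The sketch only shows the general idea of wrapping per-portlet VoiceXML `<form>` fragments in one document that starts with a main menu and carries a document-level `<link>` for returning to that menu.

```python
import xml.etree.ElementTree as ET


def form_id(name):
    """Derive a document-local id from a portlet's spoken name (hypothetical scheme)."""
    return "portlet_" + name.lower().replace(" ", "_")


def aggregate(portlets):
    """Combine per-portlet <form> fragments into one VoiceXML document.

    portlets maps a spoken menu name to a string holding one <form> element.
    """
    # version="1.0" is an assumption; the article does not say which version is emitted.
    vxml = ET.Element("vxml", {"version": "1.0"})

    # Document-level link so a caller can say "main menu" from inside any form.
    link = ET.SubElement(vxml, "link", {"next": "#main_menu"})
    # The inline grammar syntax is platform-dependent; plain text is a placeholder.
    ET.SubElement(link, "grammar").text = "main menu"

    # The menu is the first dialog in the document, so it is what the caller hears first.
    menu = ET.SubElement(vxml, "menu", {"id": "main_menu"})
    ET.SubElement(menu, "prompt").text = "Please choose: " + ", ".join(portlets) + "."
    for name in portlets:
        choice = ET.SubElement(menu, "choice", {"next": "#" + form_id(name)})
        choice.text = name

    # Append each portlet's form under a predictable id that the menu can target.
    for name, fragment in portlets.items():
        form = ET.fromstring(fragment)
        if form.tag != "form":
            raise ValueError("portlet %r did not supply a <form> element" % name)
        form.set("id", form_id(name))
        vxml.append(form)

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(aggregate({
        "mail": "<form><block>You have no new mail.</block></form>",
        "calendar": "<form><block>No meetings today.</block></form>",
    }))
```

In a deployment like the one described, the resulting document would be served over HTTP to the VoiceXML browser; the sketch stops at producing the markup string.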

In order to interpret voice input, ASR engines use active vocabularies that identify recognizable words. These vocabularies also specify allowable word sequences; this combination of vocabulary and specific word ordering is called a speech recognition grammar. Each word in a grammar is represented by a spelling, but it is actually the word's pronunciation that is used by the ASR engine. Although both ASR and TTS speech technologies have large dictionaries of word pronunciations, applications will often use words or abbreviations outside the dictionary that require the definition of new pronunciations.
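The relationship between vocabulary, word ordering, and pronunciations can be made concrete with a toy model. Everything in this sketch is hypothetical: real ASR engines use dedicated grammar formats and their own phoneme sets, and the lexicon entries below are informal placeholders, not output from any engine. The grammar here is simply a list of allowed word sequences, and the check reports which grammar words would need an application-defined pronunciation.

```python
# Stand-in for an engine's built-in dictionary: spelling -> informal pronunciation.
BASE_LEXICON = {
    "check": "CH EH K",
    "my": "M AY",
    "mail": "M EY L",
    "calendar": "K AE L AH N D ER",
    "open": "OW P AH N",
}

# Application-defined pronunciations for words outside the dictionary.
CUSTOM_LEXICON = {"PIM": "P IH M"}

# A toy grammar: the active vocabulary plus the word orders that are allowed.
GRAMMAR = [
    ["check", "my", "mail"],
    ["check", "my", "calendar"],
    ["open", "PIM"],
]


def vocabulary(grammar):
    """Return the set of words the grammar makes recognizable."""
    return {word for sequence in grammar for word in sequence}


def accepts(grammar, words):
    """Return True if the word sequence is one the grammar allows."""
    return list(words) in grammar


def missing_pronunciations(grammar, *lexicons):
    """Return grammar words that have no pronunciation in any supplied lexicon."""
    known = set()
    for lexicon in lexicons:
        known.update(lexicon)
    return sorted(vocabulary(grammar) - known)


if __name__ == "__main__":
    print(accepts(GRAMMAR, "check my mail".split()))
    print(accepts(GRAMMAR, "mail my check".split()))
    print(missing_pronunciations(GRAMMAR, BASE_LEXICON))
    print(missing_pronunciations(GRAMMAR, BASE_LEXICON, CUSTOM_LEXICON))
```

The second `accepts` call uses only in-vocabulary words but in an order the grammar does not list, which is why a grammar is more than a word list. The abbreviation "PIM" stands in for the kind of out-of-dictionary term the passage says applications often need to define.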

Roadmap: Building on the VoiceXML standards allows IBM WebSphere Voice Application Access to work with third party browsers and their associated underlying speech recognition and text-to-speech technologies. As the VoiceXML 2.0 specification nears final approval, IBM WebSphere Voice Application Access will move quickly to support it... New wireless networks and devices capable of supporting both voice and data channels require multimodal applications that combine the power of voice and visual interfaces. IBM WebSphere Voice Application Access is designed to combine with other IBM portal offerings to offer a platform for multimodal applications using server based voice technology.

From the 2002-12-02 announcement:

Adding to its portfolio, IBM unveiled the new WebSphere Voice Application Access product: middleware that simplifies building and managing voice portals and more easily extends web-based portals to voice. Leveraging the scalability, personalization and authentication features of IBM's WebSphere Portal, it enables mobile workers to more easily access information from multiple voice applications -- using a single telephone number.

This new offering includes IBM's WebSphere Voice Server as well as ready-to-use email, personal information management (PIM) functions, and sample portlets. It also supports VoiceXML and Java -- including development tools based on Eclipse, the open-source, vendor-neutral platform for writing software -- and uses open-standard programming languages to create voice-enabled applications that will interoperate with a range of web servers and databases.

In keeping with IBM's strategy to provide solutions across multiple platforms, IBM will be working to make WebSphere Voice Application Access interoperable with offerings from third party VoiceXML vendors, such as Nuance and Cisco. In addition, IBM is also working with independent solutions vendors including V-Enable, Voxsurf and Viecore to extend their current solutions.



Document URI: http://xml.coverpages.org/ni2002-12-03-c.html
Robin Cover, Editor: robin@oasis-open.org