The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: June 24, 2005
Voice Browser Call Control (CCXML)

[June 16, 2003] Updated W3C Working Draft for Call Control Extensible Markup Language (CCXML). The W3C Voice Browser Working Group has released an updated working draft specification for Voice Browser Call Control: CCXML Version 1.0. The CCXML specification defines declarative markup designed "to provide telephony call control support for VoiceXML or other dialog systems. CCXML is an adjunct language intended to complement and integrate with a VoiceXML system. The document contains references to VoiceXML's capabilities and limitations, and provides details on how VoiceXML and CCXML can be integrated. However, the two languages are separate and neither is required for an implementation of the other. CCXML can be integrated with a traditional IVR system, and VoiceXML can interface with other call control systems." The changes in this third working draft are substantial, as identified in the separate change-marked 'diff' version and summarized in Appendix G. The 2003-06-12 release "includes major revisions to the call control and media models to better specify the behavior and includes full details on all call control objects and events; a number of attribute names were updated to make the specification more consistent."

[October 15, 2002] CCXML Revised Working Draft. "Voice Browser Call Control: CCXML Version 1.0." W3C Working Draft 11-October-2002. Edited by RJ Auburn (Voxeo). "This document describes CCXML, or the Call Control Extensible Markup Language. CCXML is designed to provide telephony call control support for VoiceXML or other dialog systems. CCXML has been designed to complement and integrate with a VoiceXML system. Because of this you will find many references to VoiceXML's capabilities and limitations. One will also find details on how VoiceXML and CCXML can be integrated. However it should be noted that the two languages are separate and are not required in an implementation of either language. For example CCXML could be integrated with a more traditional IVR system and VoiceXML or other dialog systems could be integrated with some other call control systems... This specification describes markup for designed to provide telephony call control support for VoiceXML or other dialog systems. CCXML is far from complete. This draft is meant to give people access to an early version of the language so that people can understand the direction that the working group is moving... Every executing VoiceXML program has an associated CCXML program. It runs on a thread separate from the VoiceXML dialog. When an event is delivered to a user's voice session (now a coupling of an active VoiceXML dialog and its CCXML program), it is appended to the CCXML program's queue of events. The CCXML program spends almost all its time removing the event at the queue's head, processing the event, and then removing the next event. Meanwhile, the VoiceXML dialog can interact with the user, undisturbed by the incoming flow. Most VoiceXML programs never need to consider event processing at all. Writing a CCXML program, then, mainly involves writing the handlers which are executed when certain events arrive. 
There are mechanisms for passing information back and forth between VoiceXML and CCXML, but the important points are that CCXML: (1) lives on its own thread, and (2) carries the burden of rapid asynchronous event handling..."
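
The event-processing model described in this draft can be pictured with a small sketch. The Python below is not CCXML and is not taken from the specification. It only assumes, for illustration, a per-session event queue, a separate thread that drains it, and handlers keyed by event name. The class `CallControlSession`, its methods, and the event names are all hypothetical.

```python
import queue
import threading


class CallControlSession:
    """Hypothetical stand-in for the call control side of one voice session."""

    _STOP = object()  # sentinel that ends the event loop

    def __init__(self):
        self._events = queue.Queue()  # FIFO: new events are appended at the tail
        self._handlers = {}           # event name -> handler function
        self.log = []                 # results of handled events, for inspection
        self._thread = threading.Thread(target=self._run, daemon=True)

    def on(self, name, handler):
        """Register the handler to run when an event with this name arrives."""
        self._handlers[name] = handler

    def post(self, name, **data):
        """Deliver an event to the session; returns without waiting."""
        self._events.put((name, data))

    def start(self):
        self._thread.start()

    def stop(self):
        """Finish the events already queued, then end the loop."""
        self._events.put(self._STOP)
        self._thread.join()

    def _run(self):
        # Remove the event at the head, process it, then take the next one.
        while True:
            item = self._events.get()
            if item is self._STOP:
                break
            name, data = item
            handler = self._handlers.get(name)
            if handler is not None:
                self.log.append(handler(data))
            # In this sketch, events without a handler are simply dropped.


def run_demo():
    session = CallControlSession()
    session.on("call.alerting", lambda d: f"answer call {d['call_id']}")
    session.on("call.disconnected", lambda d: f"clean up call {d['call_id']}")
    session.start()

    # The "dialog" side (this thread) carries on while events are posted.
    session.post("call.alerting", call_id=1)
    session.post("unknown.event")
    session.post("call.disconnected", call_id=1)
    session.stop()
    return session.log


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is the division of labor: the posting thread never blocks on event handling, and all asynchronous events are serialized through one queue on a thread of their own.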

[February 21, 2002] On 2002-02-21, the W3C published a draft for the Call Control Extensible Markup Language, designed "to provide telephony call control support for VoiceXML or other dialog systems. CCXML will be part of the W3C Speech Interface Framework, which includes markup languages for dialog (VoiceXML 2.0), speech grammar, speech synthesis, natural language synthesis, and pronunciation lexicon."

[February 21, 2002] W3C Publishes Specification for Voice Browser Call Control (CCXML). The W3C Voice Browser Working Group has released a first public Working Draft specification for Voice Browser Call Control: CCXML Version 1.0. The CCXML specification, based upon CCXML 1.0 submitted in April 2001, "describes markup for designed to provide telephony call control support for VoiceXML or other dialog systems. CCXML has been designed to complement and integrate with a VoiceXML system." The draft thus contains many references to VoiceXML's capabilities and limitations, together with details on how VoiceXML and CCXML can be integrated. However, the two languages are separate, and neither is required in an implementation of the other. For example, CCXML could be integrated with a more traditional IVR system and VoiceXML could be integrated with some other call control system. "Properly adding advanced telephony features to VoiceXML [through CCXML] entails adding not just a new telephone model, but new call management and event processing, as well... events from telephony networks or external networked entities are non-transactional in nature; they can occur at any time, regardless of the current state of VoiceXML interpretation. These events could demand immediate attention. We could either abandon VoiceXML's admirably simple single-threaded programming model, or delay event-servicing until the VoiceXML program explicitly asked to handle such events. Instead of making either of these bad choices, we instead move all call control functions out of VoiceXML into an accompanying CCXML program. VoiceXML can thus focus on being effective for voice dialogs, while CCXML tackles the very different problems..."

Rationale for CCXML is provided via a list of "needed features that VoiceXML currently can't supply":

  • Support for multi-party conferencing, plus more advanced conference and audio control. Any large conference application requires such features.
  • Ability to give each active call leg its own dedicated VoiceXML interpreter. Currently, the second leg of a transferred call lacks a VoiceXML interpreter of its own, limiting the scope of possible applications.
  • Sophisticated multiple-call handling and control, including the ability to place outgoing calls. Multiple-party conferences mean VoiceXML needs a more effective way of handling telephony resources.
  • Handling for richer and more asynchronous events. Advanced telephony operations involve substantial amounts of signals, status events, and message-passing. VoiceXML does not currently have a way to integrate these asynchronous 'external' events into its event-processing model.
  • Ability to receive events and messages from external computational entities. Interacting with an outside call queue, or placing calls on behalf of a document server, means the VoiceXML must be contacted by an outside party.
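
The features listed above imply a data model in which a controller owns several call legs, each leg can have its own dialog, and legs can be joined into a conference. The following Python is a rough sketch of that idea under stated assumptions. It is not CCXML markup and does not reproduce any object or event defined in the drafts. `CallLeg`, `Conference`, `CallController`, and the document names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CallLeg:
    leg_id: int
    party: str
    direction: str                 # "inbound" or "outbound"
    dialog: Optional[str] = None   # dialog document running on this leg, if any


@dataclass
class Conference:
    name: str
    legs: list = field(default_factory=list)

    def join(self, leg: CallLeg) -> None:
        """Add a leg to the conference bridge (idempotent)."""
        if leg not in self.legs:
            self.legs.append(leg)

    def unjoin(self, leg: CallLeg) -> None:
        """Remove a leg from the conference bridge."""
        self.legs.remove(leg)


class CallController:
    """Hypothetical controller that tracks legs and conferences."""

    def __init__(self):
        self._next_id = 1
        self.legs = {}
        self.conferences = {}

    def _new_leg(self, party: str, direction: str) -> CallLeg:
        leg = CallLeg(self._next_id, party, direction)
        self.legs[leg.leg_id] = leg
        self._next_id += 1
        return leg

    def accept_inbound(self, party: str) -> CallLeg:
        return self._new_leg(party, "inbound")

    def place_outbound(self, party: str) -> CallLeg:
        return self._new_leg(party, "outbound")

    def start_dialog(self, leg: CallLeg, document: str) -> None:
        # Each leg gets its own dialog, independent of every other leg.
        leg.dialog = document

    def create_conference(self, name: str) -> Conference:
        conf = Conference(name)
        self.conferences[name] = conf
        return conf


def demo():
    ctl = CallController()
    caller = ctl.accept_inbound("alice")
    callee = ctl.place_outbound("bob")
    ctl.start_dialog(caller, "greeting.vxml")
    ctl.start_dialog(callee, "announce.vxml")
    conf = ctl.create_conference("support")
    conf.join(caller)
    conf.join(callee)
    return ctl, conf


if __name__ == "__main__":
    _, conference = demo()
    print([(leg.party, leg.direction, leg.dialog) for leg in conference.legs])
```

Keeping the dialog reference on the leg, not on the controller, mirrors the second item in the list: a transferred or outbound leg is not left without an interpreter of its own.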

From the W3C Voice Browser Activity statement: "Call Control: W3C is working on markup to enable fine-grained control of speech (signal processing) resources and telephony resources in a VoiceXML telephony platform. The scope of these language features is for controlling resources in a platform on the network edge, not for building network-based call processing applications in a telephone switching system, or for controlling an entire telecom network. These components are designed to integrate naturally with existing language elements for defining applications which run in a voice browser framework. This will enable application developers to use markup to perform call screening, whisper call waiting, call transfer, and more. Users can be offered the ability to place outbound calls, conditionally answer calls, and to initiate or receive outbound communications such as another call..."

From the W3C Voice Browser Activity statement: "W3C is working to expand access to the Web to allow people to interact via key pads, spoken commands, listening to prerecorded speech, synthetic speech and music. This will allow any telephone to be used to access appropriately designed Web-based services, and will be a boon to people with visual impairments or needing Web access while keeping theirs hands and eyes free for other things. It will also allow effective interaction with display-based Web content in the cases where the mouse and keyboard may be missing or inconvenient. To fulfil this goal, the W3C Voice Browser working group is defining a suite of markup languages covering dialog, speech synthesis, speech recognition, call control and other aspects of interactive voice response applications. VoiceXML is a dialog markup language designed for telephony applications, where users are restricted to voice and DTMF (touch tone) input. The other specifications are being designed for use in a variety of contexts, and not just with VoiceXML. Further work is anticipated on enabling their use with other W3C markup languages such as XHTML, XForms and SMIL. This will be done in conjunction with other W3C working groups, including the proposed new Multimodal working group."

Articles, News

  • [April 2005] "Using Call Control XML (CCXML) as a SIP Softswitch." By R.J. Auburn (Chief Technology Officer, Voxeo Corporation). From VoiceXML Review Volume 5, Issue 2 (March/April 2005). "CCXML is a sibling standard to VoiceXML, designed to facilitate enhanced call control capabilities in both VoiceXML platforms and in call control platforms that have no relationship to or with VoiceXML. CCXML gives both VoiceXML and non-VoiceXML telephony developers alike the ability to quickly create and deploy telephony applications that make use of enhanced call routing, multiparty call conferencing, outbound dialing, intelligent call progress analysis, call center / Computer Telephony Integration (CTI), and more. Over the last few years, developers have increasingly used CCXML to add robust call control features to their telephony applications. In parallel, over the last few years Voice over IP — and more specifically SIP Voice over IP - has found widespread acceptance and deployment in everything from long distance networks to enterprise call centers and consumer telephony services like Vonage and AT&T's CallVantage. SIP has even conquered the world of cell phones: many of the proposed third generation ('3g') cellular telephony standards are in fact SIP signaling over cellular data networks. As both SIP and CCXML technologies are finding rapid adoption by carriers, enterprises, and consumers, many in the industry have wondered how these technologies work together in next generation telephony deployments. Are they competitive or complimentary? Do they work well together? What advantages or disadvantages result from a combination of both? This article will address many of those questions by showing how CCXML can be used to implement an extensible SIP Softswitch... 
By developing and deploying applications on a CCXML based SIP Softswitch, enterprises and carriers alike have access to a rich set of call control features while retaining the simplicity and power of web-based application development. The resulting applications work well with both next-generation SIP deployments and traditional PSTN solutions. SIP and CCXML are extremely complimentary technologies that work well together. Many of the most valuable benefits of both technologies are found at the intersection of the two: CCXML makes it much easier to build SIP call control applications; and SIP makes it much easier to integrate CCXML with both new and existing telephony architectures..."

  • [October 16, 2002] "VoiceXML, CCXML, and SALT." By Ian Moraes. In XML Journal Volume 3, Issue 9 (September 2002), pages 30-34. "There's been an industry shift from using proprietary approaches for developing speech-enabled applications to using strategies and architectures based on industry standards. The latter offer developers of speech software a number of advantages, such as application portability and the ability to leverage existing Web infrastructure, promote speech vendor interoperability, increase developer productivity (knowledge of speech vendor's low-level API and resource management is not required), and easily accommodate, for example, multimodal applications. Multimodal applications can overcome some of the limitations of a single mode application (GUI or voice), thereby enhancing a user's experience by allowing the user to interact using multiple modes (speech, pen, keyboard, etc.) in a session, depending on the user's context. VoiceXML, Call Control eXtensible Markup Language (CCXML), and Speech Application Language Tags (SALT) are emerging XML specifications from standards bodies and industry consortia that are directed at supporting telephony and speech-enabled applications. The purpose of this article is to present an overview of VoiceXML, CCXML, and SALT and their architectural roles in developing telephony as well as speech-enabled and multimodal applications... Note that SALT and VoiceXML can be used to develop dialog-based speech applications, but the two specifications have significant differences in how they deliver speech interfaces. Whereas VoiceXML has a built-in control flow algorithm, SALT doesn't. Further, SALT defines a smaller set of elements compared to VoiceXML. While developing and maintaining speech applications in two languages may be feasible, it's preferable for the industry to work toward a single language for developing speech-enabled interfaces as well as multimodal applications. 
This short discussion provides a brief introduction to VoiceXML, CCXML, and SALT for supporting speech-enabled interactive applications, call control, and multimodal applications and their important role in developing flexible and extensible standards-compliant architectures. This presentation of their main capabilities and limitations should help you determine the types of applications for which they could be used. The various languages expose speech application technology to a broader range of developers and foster more rapid development because they allow for the creation of applications without the need for expertise in a specific speech/telephony platform or media server. The three XML specifications offer application developers document portability in the sense that a VoiceXML, CCXML, or SALT document can be run on a different platform as long as the platform supports a compliant browser. These XML specifications are posing an exciting challenge for developers to create useful, usable, and portable speech-enabled applications that leverage the ubiquitous Web infrastructure..."

  • [October 16, 2002] "Introduction to CCXML, Part V." By Jonathan Eisenzopf (Ferrum Group, LLC). From 2002. Part 5 in a 5-part series. See Part 1, Part 2, Part 3, and Part 4. "VoiceXML is a language for creating and managing speech dialogs. VoiceXML does not, however, provide call control functions such as multi-party conferencing and outbound calling. This functionality has been filled by vendors who provide proprietary extensions that are not usually portable from one platform to another. The CCXML specification was developed to add the call control features that VoiceXML lacks and to standardize this functionality. It also provides one of the critical 'missing pieces' that skeptics have cited as a barrier for VoiceXML to be more widely adopted by the telephony marketplace..."

Robin Cover, Editor