CallXML
www.callxml.org
Overview
CallXML is an XML-based markup language used to describe the user interface of a telephone, voice over IP, or multimedia call application to a CallXML browser. A CallXML browser can then use that description to control and react to the call itself. CallXML includes:
· Media action elements such as <playAudio> and <recordAudio> to describe what to present to the user during a call.
· Call action elements such as <answer>, <call>, and <hangup> to describe how to control and route the call itself.
· Logic action elements such as <assign>, <clear>, and <goto> to describe how to modify variables and interact with traditional server-side web logic such as Perl, other CGI languages, PHP, or ASP.
· Event elements such as <onTermDigit> and <onHangup> to describe how to react to things the user can do during the call, such as pressing digits or hanging up.
· Block elements which logically group actions and events together, so that one set of event handling elements can be used for several sequential actions.
HTML is a markup language used to describe the user interface of a web application. The table below compares some elements from HTML to elements in CallXML:
| HTML | CallXML |
| --- | --- |
| <html> element begins an HTML document | <callxml> element begins a CallXML document |
| <table> element groups other visual elements together | <block> element groups other CallXML action and event elements together |
| <img> element displays a graphic | <playAudio> element plays an audio file |
| <a href="URL"> element describes where to go when a user clicks on a web link | <onTermDigit> event element describes what to do when a user presses a button on the phone |
VoiceXML is designed to make it easy for web developers to create voice recognition based interfaces for either telephones or computer-based applications. As such, VoiceXML is an excellent solution for voice based applications which provide access to web content and information, including:
· Applications which allow users to retrieve web content via phone (e.g., voice portals, web-by-phone, audiotex)
· Applications which allow users to interact with web-based services using spoken commands (e.g., stock quotes, sports scores)
CallXML was designed to make it easy for web developers to create applications that can interact with and control any number or type of calls, including:
· Telephone or Voice over IP call applications which can control the initiation and routing of a phone call itself, supporting such features as outbound dialing, conferencing, and multi-call interactions (e.g., conference bridges, internet call waiting, follow-me/find-me)
· Telephone or Voice over IP call applications which can easily interact with and respond to touch-tone based entry and selection (e.g., voicemail, interactive voice response)
· Call applications which include support for additional media, such as faxes and video (e.g., unified messaging, video conferencing)
Because of the natural complexity associated with dealing with voice commands, VoiceXML uses a relatively complex form/field/grammar/filled interface model in its design.
In contrast, CallXML uses a simpler block / action / event interface model, which can be easier to learn and which allows for visual design tools that directly represent CallXML markup as simple flowchart-like user interfaces.
The following CallXML example provides a basic answering machine application. It will play a greeting and then record an audio message. The audio message will be emailed to bob.jones@xyz.com.
However, if the caller presses the “*” key any time while the greeting is being played or audio is being recorded, the CallXML browser will terminate the play/record and go to a new URL for more instructions. Alternatively, if the user hangs up during play or record, the CallXML browser will go to a second new URL.
<?xml version="1.0" encoding="UTF-8"?>
<callxml>
  <block>
    <answer/>
    <!-- play the greeting, then record up to 30 seconds of audio -->
    <playAudio value="http://xyz.com/greeting.wav" termDigits="*"/>
    <recordAudio value="mailto:bob.jones@xyz.com" maxTime="30s" termDigits="*"/>
    <!-- the caller pressed "*" during the play or record -->
    <onTermDigit value="*">
      <goto value="http://xyz.com/cgi-bin/get_pin_code.pl"/>
    </onTermDigit>
    <!-- the caller hung up during the play or record -->
    <onHangup>
      <goto value="http://xyz.com/end_call.xml"/>
    </onHangup>
  </block>
</callxml>
If you know HTML but not XML, you will find that the two are very similar, although XML requires more precision when coding. Significant differences include:
· HTML files are often called “pages”, while XML files are normally called “documents”.
· All XML elements must be closed. For example, in HTML you often find the paragraph element (<p>) without a closing element. This is not allowed in XML. In XML you have to close the element like this:
<p>hi!</p>
More surprising is the fact that empty elements must be closed as well. So the HTML break tag <br> would have to be closed in XML with the combined open/close element form:
<br />
For compatibility with existing HTML processors, you should always include a space before the closing slash. When using elements with names that are identical to existing HTML elements, you should avoid using the equivalent form:
<br></br>
since many HTML processors will break when they see this construct.
· XML is very strict about nested elements. While HTML will let you get away with a construct like this:
<p>This is <b>bold <i>italic</b></i>.</p>
XML requires all contained elements to be closed before the containing element can be closed. So this is the legal way to handle the tags in XML:
<p>This is <b>bold <i>italic</i></b>.</p>
Note that HTML officially requires the same thing, but these errors are so common that most HTML processors deliberately ignore them.
CallXML Logic Elements
CallXML browsers include a variable space for each “call” or “session”. The CallXML logic elements are used to assign and clear values in those variables, and to submit those variables as HTTP GET or POST information to traditional web applications created with technologies such as CGI, Perl, Java, and ASP.
CallXML variables can be referenced within CallXML element attributes by using the $varname; syntax.
<assign var="ttt"       Default: ""
        value="123"/>   Default: ""
| Attribute | Description |
| --- | --- |
| var | Variable name to use when assigning a value |
| value | Value to put into the variable |
Assigns the value specified by the attribute value to the variable specified by the attribute var. As shown above, the element assigns the value "123" to the variable named "ttt".
In addition to variables explicitly assigned by the CallXML markup, CallXML browsers may automatically create variables which contain information related to the call / session.
The following is a list of automatically created global variables associated with each session:
| Name | Description |
| --- | --- |
| session.Digits | The digit buffer |
| session.CallerID | The caller's phone number |
| session.CalledID | The number that was called to get here |
| session.EventSenderID | The session ID of the sender of the last external event |
| session.CallerAccountID | Voxeo ID of the caller (if known) |
| session.CalledAccountID | Voxeo ID of the called party (if known) |
| session.ID | Identifier for the current session |
Variable names must start with a letter (A through Z or a through z) and can contain letters, the digits 0 through 9, and the underscore. Variable names can be 1 to 40 characters in length. Variable names beginning with the characters "session." in any case combination (for example, "SESSION.") are reserved for internal system variables and cannot be created or set by the programmer, with one exception: the session.Digits buffer. It can be cleared with the <clearDigits/> element or with <clear var="session.Digits"/>, and it can be set with <assign var="session.Digits" value="anything"/>.
The following events may occur while this action is running:
<onError>
<?xml version="1.0" encoding="UTF-8"?>
<!-- read a first and last name to the caller using text to speech -->
<callxml>
  <block>
    <assign var="firstname" value="jonathan"/>
    <assign var="lastname" value="smith"/>
  </block>
  <text>
    Thanks for calling, $firstname; $lastname;.
  </text>
</callxml>
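The automatically created session variables can be read with the same $varname; syntax. The following is a minimal sketch, assuming the browser expands $session.CallerID; in attribute values and <text> content the same way it expands user variables. The variable name caller_number is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sketch: copy a session variable into a user variable and speak it -->
<callxml>
  <block>
    <answer/>
    <!-- caller_number is a hypothetical name; it follows the naming rules
         (starts with a letter; letters, digits, and underscore only) -->
    <assign var="caller_number" value="$session.CallerID;"/>
    <text>
      You are calling from $caller_number;.
    </text>
  </block>
</callxml>
```

Copying the value into a user variable first also makes it available to a later submit list under a name of your choosing.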
<clear var="ttt"/>
| Attribute | Description |
| --- | --- |
| var | Name of the variable to clear. |
Clears the variable specified by the attribute var. As shown above, the element clears the variable named "ttt". Effectively this works the same as <assign var="ttt" value=""/>.
The following events may occur while this action is running:
<onError>
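As a small illustration of the equivalence described for <clear>, the sketch below empties the same variable in both ways. It only uses elements defined in this section and is not meant as a complete call flow.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sketch: two ways to empty the variable "ttt" -->
<callxml>
  <block>
    <assign var="ttt" value="123"/>
    <!-- option 1: clear the variable -->
    <clear var="ttt"/>
    <!-- option 2: assign an empty value, which has the same effect -->
    <assign var="ttt" value=""/>
  </block>
</callxml>
```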
<clearDigits/>
Attributes: none
Clears the session.Digits buffer.
The session.Digits buffer contains any touch-tone digits the user may have pressed before a CallXML action is executed. This element will clear the digit buffer of any queued digits.
In addition to the <clearDigits> element and the ability to clear the session.Digits buffer manually, several CallXML elements have a clearDigits attribute that does the same thing, reducing the number of elements required for common tasks.
The following events may occur while this action is running:
<onError>
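A typical use is to discard digits the caller typed while a prompt was still playing, so that only fresh input is collected. The sketch below assumes the elements run in document order; the file name enterpin.wav and the variable user_pin are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sketch: discard queued digits before collecting a PIN -->
<callxml>
  <block>
    <playAudio value="enterpin.wav"/>
    <!-- drop any digits that were queued while the prompt played -->
    <clearDigits/>
    <getDigits var="user_pin" maxDigits="4" maxTime="30s"/>
    <onMaxDigits>
      <text> you entered $user_pin; </text>
    </onMaxDigits>
  </block>
</callxml>
```

The attribute form should give the same result with one element fewer: drop <clearDigits/> and add clearDigits="TRUE" to <getDigits>.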
<goto value="http://w.v.n/next.voxeo#block"   Default: ""
      submit="*|x,y,z"                        Default: "*"
      method="get|post"/>                     Default: "get"
| Attribute | Description |
| --- | --- |
| value | Either a full URL (http://w.v.n/yo.voxeo) or a local URI pointing to a <block> label in the same CallXML file (e.g., #main_menu). Supported URL formats include: |
| submit | List of variables to submit to the called URL/URI. Can be "all" or "*" for everything, or a comma-delimited list of variables to submit (e.g., submit="Variable1, Variable2, Variable3, Variable5, Variable9"). |
| method | Submit method to use: |
This element can either:
· Leap to another block of CallXML actions in the current file, by specifying value="#blocklabel", or
· Perform an HTTP GET or POST to fetch a new CallXML document, by specifying value="url".
When used to fetch a new CallXML document, the submit attribute can be used to pass CallXML browser variables with the HTTP GET or POST operation used to request the new document.
The method attribute is used to select whether to use HTTP GET or POST when fetching the new document.
The following events may occur while this action is running:
<onError>
<?xml version="1.0" encoding="UTF-8"?>
<!-- assign two vars, then send them to a Perl script at xyz.com as GET URL variables -->
<callxml>
  <block>
    <assign var="firstname" value="jonathan"/>
    <assign var="lastname" value="smith"/>
    <goto value="http://xyz.com/cgi-bin/app.pl"
          submit="firstname, lastname"/>
  </block>
</callxml>
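The two uses of <goto> can be combined in one document. The sketch below is an illustration only: it assumes blocks are named with a label attribute (as in the main menu example later in this document), and the labels collect and submit_names are made up. The first <goto> jumps within the file; the second fetches a new document and sends two variables using POST instead of the default GET.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sketch: a local jump followed by a remote fetch using POST -->
<callxml>
  <block label="collect">
    <assign var="firstname" value="jonathan"/>
    <assign var="lastname" value="smith"/>
    <!-- local jump: continue at the block labeled "submit_names" -->
    <goto value="#submit_names"/>
  </block>
  <block label="submit_names">
    <!-- remote fetch: send only the two named variables, using POST -->
    <goto value="http://xyz.com/cgi-bin/app.pl"
          submit="firstname, lastname"
          method="post"/>
  </block>
</callxml>
```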
A note about session elements
<run value="http://w.v.n/next.voxeo|#block"   Default: ""
     submit="*|x,y,z"                         Default: "*"
     method="get|post"                        Default: "get"
     var="varForReturnedSessionID"/>          Default: ""
| Attribute | Description |
| --- | --- |
| value | Either a full URL (http://w.v.n/yo.voxeo) or a local URI pointing to a <block> label in the same CallXML file (e.g., #main_menu). Supported URL formats |
| submit | List of variables to submit to the called URL/URI. Can be "*" for everything, or a comma-delimited list of variables to submit. |
| method | Submit method to use: |
| var | Variable name in which to store the new session ID. |
This element starts a new session and fetches a CallXML document to use for that session from the URL or URI specified by value.
The submit attribute can be used to pass copies of CallXML browser variables from the parent session with the HTTP GET or POST operation used to fetch the new document.
The method attribute is used to select whether to use HTTP GET or POST. The var attribute specifies the name of a variable in the parent session that will receive the unique identifier of the new session.
The following events may occur while this action is running:
<onError>
<?xml version="1.0" encoding="UTF-8"?>
<!-- run a new session, starting with CallXML code from xyz.com -->
<!-- pass copies of a few variables to the code at xyz.com -->
<callxml>
  <block>
    <assign var="firstname" value="jonathan"/>
    <assign var="lastname" value="smith"/>
    <assign var="parentsession" value="$session.ID;"/>
    <run value="http://xyz.com/newsess.asp"
         submit="firstname, lastname, parentsession"
         var="childsessid"/>
    <!-- the session id of the new session is now -->
    <!-- stored in childsessid -->
  </block>
</callxml>
<sendEvent value="msg_call_answered"   Default: ""
           session="sss"/>             Default: ""
| Attribute | Description |
| --- | --- |
| value | String containing the body of the message. |
| session | ID of the session to which the event will be sent. |
<sendEvent> is a tag that allows one session to send a message to another session.
The attribute value is used to specify the text message to send to the other session.
The attribute session is used to specify the unique identifier of the session to send the message to.
<onExternalEvent>
This event allows the session that is the recipient of a <sendEvent> to process the message. In use, one session could start another and wait for its child process to send it a message of some sort indicating that the child session had completed its task. For example, this highly simplified interaction could be used as the basis for a follow me/find me application:
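<!-- parent session: start a child session, then wait for its event -->
<block>
  <assign var="parent" value="$session.ID;"/>
  <run value="Session2.xml" submit="*"/>
  <wait value="nolimit"/>
  <onExternalEvent value="foo">
    <simline value="caught a foo external event."/>
  </onExternalEvent>
</block>
<!-- child session (Session2.xml): send an event back to the parent -->
<block>
  <simline value="send an event."/>
  <sendEvent value="foo" session="$parent;"/>
</block>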
CallXML Call Action Elements
CallXML call action elements specify actions the CallXML browser can apply to the call associated with the browser session. These actions include <answer> to answer a new inbound call, <hangup> to hang up or disconnect the call, <call> to initiate a new outbound call, and <conference> to connect or conference the audio from two different calls or sessions together.
<answer/>
Attributes: none
This element will “answer” or “pick up” the call. Any time a new call is received by a CallXML browser, it will use a browser-specific mechanism to determine a URL from which to fetch an initial CallXML document to use for that call. However, the CallXML browser does not “answer” or “pick up” the call until this element is executed.
The following events may occur while this action is running:
<onError>
<?xml version="1.0" encoding="UTF-8"?>
<!-- answer the call, then play a greeting -->
<callxml>
  <block>
    <answer/>
    <playAudio value="greeting.wav"/>
  </block>
</callxml>
<hangup/>
Attributes: none
This element instructs the CallXML browser to “hang up” or disconnect the call associated with the current session.
The following events may occur while this action is running:
<onError>
<?xml version="1.0" encoding="UTF-8"?>
<!-- answer the call, then play a greeting, then hang up -->
<callxml>
  <block>
    <answer/>
    <playAudio value="greeting.wav"/>
    <hangup/>
  </block>
</callxml>
<call value="pstn:18314395130"       Default: ""
      callerID="pstn:1234567890"     Default: "callerID"
      maxTime="30s"/>                Default: "30s"
| Attribute | Description |
| --- | --- |
| value | URL describing the place to initiate a call to |
| callerID | CallerID to present when placing the call |
| maxTime | Maximum amount of time to wait for the call to be answered |
The <call> element allows new outbound calls to be placed to the specified address.
The address is specified by the attribute value and is in URL format. Supported formats include the pstn: scheme (for example, pstn:18314395130), which instructs the browser to call a normal telephone number.
The maximum amount of time to wait for the call to be answered can be specified by the attribute maxTime.
The following events may occur while this action is running:
<onError>
<onAnswer type="person|machine|unknown">
<onCallFailure>
<onMaxTime>
<?xml version="1.0" encoding="UTF-8"?>
<!-- place a call; when it is answered, play a greeting -->
<callxml>
  <block>
    <call value="pstn:14075551212"/>
    <onAnswer>
      <playAudio value="greeting.wav"/>
    </onAnswer>
  </block>
</callxml>
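The other events listed for <call> can be handled in the same block. The following is a sketch under stated assumptions: it presumes that a separate <onAnswer> handler can be written for each type value, and the URLs call_failed.xml and no_answer.xml are placeholders that do not come from this document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sketch: place a call and handle each possible outcome -->
<callxml>
  <block>
    <call value="pstn:14075551212"
          callerID="pstn:1234567890"
          maxTime="20s"/>
    <!-- a person answered: play the greeting -->
    <onAnswer type="person">
      <playAudio value="greeting.wav"/>
    </onAnswer>
    <!-- an answering machine picked up: give up on this call -->
    <onAnswer type="machine">
      <hangup/>
    </onAnswer>
    <!-- the call could not be completed (placeholder URL) -->
    <onCallFailure>
      <goto value="http://xyz.com/call_failed.xml"/>
    </onCallFailure>
    <!-- nobody answered within maxTime (placeholder URL) -->
    <onMaxTime>
      <goto value="http://xyz.com/no_answer.xml"/>
    </onMaxTime>
  </block>
</callxml>
```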
<conference targetSessions="sessionID1, sessionID2"   Default: ""
            termDigits="#"/>                          Default: ""
| Attribute | Description |
| --- | --- |
| targetSessions | One or more unique session identifiers, separated by commas. |
| termDigits | List of touch-tone digits which can terminate the conference |
This element allows multiple lines in separate sessions to be conferenced together so that the parties on each line can talk to each other simultaneously.
The list of sessions to conference together is specified by the attribute targetSessions, and can be one or more unique session identifiers separated by commas.
The termDigits attribute can be used to specify touch-tone digits that will terminate the conference.
The following events may occur while this action is running:
<onError>
<?xml version="1.0" encoding="UTF-8"?>
<!-- conference with another session -->
<!-- assume we previously stored the id of the other session into the variable "otherSess" -->
<!-- then wait until the conference ends or another event occurs -->
<callxml>
  <block>
    <conference targetSessions="$otherSess;"/>
    <wait value="nolimit"/>
  </block>
</callxml>
<waitForConferenceEnd/>
Attributes: none
This element allows a session that is the target of a conference command to suspend all other processing until the conference is terminated by either party. This element has been deprecated in favor of the more general <wait> element.
The following events may occur while this action is running:
<onError>
<wait value="10s|nolimit"   Default: ""
      termDigits="*"/>      Default: ""
| Attribute | Description |
| --- | --- |
| value | Amount of time to wait. |
| termDigits | List of touch-tone digits which can terminate the wait action |
The wait element is used to instruct the CallXML browser to wait for a specified amount of time.
The time value is specified by the attribute value and can be in seconds (s), minutes (m), or milliseconds (ms); or can be set to “nolimit” to have no limit on the amount of time to wait.
termDigits specifies a list of digits that can terminate the wait action.
The following events may occur while this action is running:
<onTermDigit>
<onHangup>
<onError>
<?xml version="1.0" encoding="UTF-8"?>
<!-- play an audio file, then wait up to 30 seconds for the user to press a button or hang up -->
<callxml>
  <block>
    <playAudio value="waitingonyou.wav"/>
    <wait value="30s"/>
  </block>
</callxml>
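To react to what happens during the wait, event handlers can be added to the same block. The sketch below assumes that a digit listed in termDigits ends the wait and triggers <onTermDigit>; the URL menu.xml is a placeholder, and end_call.xml is borrowed from the answering machine example in the overview.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sketch: wait up to 30 seconds, reacting to "*" or a hangup -->
<callxml>
  <block>
    <playAudio value="waitingonyou.wav"/>
    <wait value="30s" termDigits="*"/>
    <!-- the caller pressed "*" before the 30 seconds ran out -->
    <onTermDigit value="*">
      <goto value="http://xyz.com/menu.xml"/>
    </onTermDigit>
    <!-- the caller hung up while we were waiting -->
    <onHangup>
      <goto value="http://xyz.com/end_call.xml"/>
    </onHangup>
  </block>
</callxml>
```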
CallXML Media Action Elements
<getDigits var="pager_msg"                 Default: none
           maxDigits="9"                   Default: nolimit
           termDigits="#*"                 Default: ""
           includeTermDigit="TRUE|FALSE"   Default: FALSE
           clearDigits="TRUE|FALSE"        Default: FALSE
           maxTime="30s"                   Default: 30s
           maxSilence="5s"/>               Default: 5s
| Attribute | Description |
| --- | --- |
| var | Read digits into the variable specified, "pager_msg" in this example. |
| maxDigits | The maximum number of digits to read, 9 digits in this example. |
| termDigits | Digits that can terminate or end the entering of digits; either # or * in this example. The termDigits value can be any combination of "1234567890*#ABCD@" or "any" or "". "any" indicates that any digit will terminate the entry. |
| includeTermDigit | TRUE to include the term digit with the value placed into var, or FALSE to not include it. |
| clearDigits | Boolean value indicating whether the queued digits buffer should be cleared when this action starts. TRUE clears the digits buffer. FALSE leaves the contents of the digits buffer alone. |
| maxTime | Maximum time period to wait for digits, 30 seconds in this example. The time units can be specified in ms (milliseconds), s (seconds), or m (minutes). If no unit specification is present, seconds are presumed. |
| maxSilence | Maximum time to wait between digits, 5 seconds in this example. Time units can be in ms (milliseconds), s (seconds), or m (minutes). If no unit specification is present, seconds are presumed. |
The getDigits element reads input touch-tone digits from the call and places them into the variable named by the var attribute.
In the example above, the user would have 30 seconds to enter up to 9 digits on her phone, pausing no more than 5 seconds between digits, and ending the digit input with either the # key or the * key.
This element is typically used for such tasks as gathering PIN codes, pager numbers, and anything else that involves multiple digits coming from the user. <getDigits> requires an associated <onTermDigit> event element.
The following events may occur while this action is running:
<onTermDigit>
<onMaxDigits>
<onMaxTime>
<onMaxSilence>
<?xml version="1.0" encoding="UTF-8"?>
<!-- play a file asking the user to enter their 4 digit pin code, -->
<!-- then get the pin code and store it in a variable named user_pin; -->
<!-- wait up to 30 seconds for them to enter the pin -->
<!-- then tell them what pin code they entered using text to speech -->
<callxml>
  <block>
    <playAudio value="enterpin.wav"/>
    <getDigits var="user_pin"
               maxDigits="4"
               maxTime="30s"/>
    <onMaxDigits>
      <text> you entered $user_pin; </text>
    </onMaxDigits>
  </block>
</callxml>
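The remaining getDigits events can be handled alongside <onMaxDigits>. The following sketch uses the attribute values from the syntax summary and makes a few assumptions: the block label get_msg and the file enterpager.wav are invented, a separate <onTermDigit> handler is written per digit, and jumping back to the block with <goto> is taken to restart the prompt.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sketch: collect a pager message and handle every way the input can end -->
<callxml>
  <block label="get_msg">
    <playAudio value="enterpager.wav"/>
    <getDigits var="pager_msg"
               maxDigits="9"
               termDigits="#*"
               clearDigits="TRUE"
               maxTime="30s"
               maxSilence="5s"/>
    <!-- "#" ends the entry: read the digits back -->
    <onTermDigit value="#">
      <text> you entered $pager_msg; </text>
    </onTermDigit>
    <!-- "*" is treated here as "start over" -->
    <onTermDigit value="*">
      <goto value="#get_msg"/>
    </onTermDigit>
    <!-- all 9 digits were entered without a terminating digit -->
    <onMaxDigits>
      <text> you entered $pager_msg; </text>
    </onMaxDigits>
    <!-- the caller ran out of time or paused too long: hang up -->
    <onMaxTime>
      <hangup/>
    </onMaxTime>
    <onMaxSilence>
      <hangup/>
    </onMaxSilence>
  </block>
</callxml>
```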
<play… />
<playNumber format="digits|number"        Default: "digits"
            value="12345"                 Default: ""
            termDigits="*#"               Default: ""
            clearDigits="TRUE|FALSE"/>    Default: "FALSE"
<playMoney format="us"                    Default: "us"
           value="1.25"                   Default: ""
           termDigits="*#"                Default: ""
           clearDigits="TRUE|FALSE"/>     Default: "FALSE"
<playDate format="ddmmyyyyhhss"           Default: as shown
          value="1012990732"              Default: ""
          termDigits="*#"                 Default: ""
          clearDigits="TRUE|FALSE"/>      Default: "FALSE"
<playAudio format="audio/vox"                      Default: "audio/vox"
           value="http://www.ttt.com/sample.vox"   Default: ""
           termDigits="*#"                         Default: ""
           clearDigits="TRUE|FALSE"/>              Default: "FALSE"
| Attribute | Description |
| --- | --- |
| format | Formatting string to use for the play (not totally defined yet). |
| value | The value to play (literal, variable, or URL reference). |
| termDigits | Digits that can terminate the play. |
| clearDigits | Boolean value indicating whether the session.Digits buffer should be cleared on entry. TRUE clears the digits buffer. FALSE leaves the contents of the digits buffer alone. |
<play … > is used to play a number, date, money value, or audio file.
For <playAudio>, the format attribute refers to the mime-type of the audio file, in case the CallXML browser cannot automatically determine the mime-type.
The following events may occur while this action is running:
<onError>
<?xml version="1.0" encoding="UTF-8"?>
<!-- ===================== -->
<!-- Here is our main menu script -->
<!-- ===================== -->
<callxml>
  <block label="MainMenu"
         repeat="3"
         clearDigits="TRUE">
    <playAudio format="audio/vox"
               value="http://www.blahblah.com/mainmenu.vox"
               termDigits="567890*#"
               clearDigits="TRUE"/>
    <!-- ============== -->
    <!-- Our event handlers -->
    <!-- ============== -->
    <onTermDigit value="*#">
      <!-- Do something here. -->
    </onTermDigit>
    <onTermDigit value="567890">
      <!-- Do something different here. -->
    </onTermDigit>
  </block>
</callxml>
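The other members of the play family take a literal or a variable in value. The sketch below is illustrative only: it assumes format="digits" reads each digit separately while format="number" reads the value as a whole number, which this document does not spell out, and the variable name balance is made up.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sketch: playing numbers and a money value -->
<callxml>
  <block>
    <assign var="balance" value="1.25"/>
    <!-- assumed: "digits" reads 1, 2, 3, 4, 5 one at a time -->
    <playNumber format="digits" value="12345"/>
    <!-- assumed: "number" reads the value as a single number -->
    <playNumber format="number" value="12345"/>
    <!-- play a variable as US currency; "*" or "#" interrupts the play -->
    <playMoney format="us" value="$balance;" termDigits="*#"/>
  </block>
</callxml>
```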
<recordAudio format="audio/vox"             Default: "audio/vox"
             value="ftp://www.v.n/msg.vox"  Default: ""
             termDigits="*#"                Default: ""
             clearDigits="TRUE|FALSE"       Default: "FALSE"
             maxTime="30s"                  Default: "30s"
             maxSilence="5s"                Default: "5s"
             beep="TRUE|FALSE"/>            Default: "TRUE"
|
Attribute |
Description |
|
format |
Formatting string to use for the record (mime type). |
|
value |
Where to put the audio (variable or URL reference). |
|
termDigits |
Digits that can terminate the recording. |
|
clearDigits |