A SIP Interface to VoiceXML Dialog Servers

From: http://www.ietf.org/internet-drafts/draft-rosenberg-sip-vxml-00.txt
------------------------------------------------------------------------

Internet Engineering Task Force                                   SIP WG
Internet Draft                                     Rosenberg,Mataga,Ladd
draft-rosenberg-sip-vxml-00.txt                              dynamicsoft
July 13, 2001
Expires: February 2002

               A SIP Interface to VoiceXML Dialog Servers

STATUS OF THIS MEMO

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress".

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

To view the list of Internet-Draft Shadow Directories, see
http://www.ietf.org/shadow.html.

Abstract

VoiceXML is an XML based scripting language for describing voice
dialogs. VoiceXML interpreters run within an interpreter context that,
among other tasks, provides a call control interface for accessing the
interpreter. It is very natural to provide a VoIP-based interpreter
context that uses SIP and RTP to communicate with the outside world. In
this document, we provide detailed specifications for a SIP/RTP based
interpreter context.

1 Introduction

VoiceXML [1] is an XML based scripting language for describing voice
dialogs. It supports user input through speech recognition and DTMF,
and can communicate with the user through text-to-speech or recorded
files. VoiceXML scripts are interpreted by a VoiceXML interpreter.
This interpreter, in turn, runs within an interpreter context. The
interpreter context is the interface between the outside world and the
interpreter. It typically handles the mechanisms by which script
execution begins and by which the script is fed media to drive it. It
also provides the means for fetching documents from some form of
document server.

It is very natural to provide a VoiceXML interpreter context based
purely on IP, specifically on VoIP using SIP [2] and RTP [3], along
with HTTP for document access. An incoming VoIP call triggers the
execution of the script, which is fetched from a server using HTTP. The
incoming RTP stream for the call is passed to the interpreter for
processing, and speech generated by the interpreter is sent over RTP to
the caller.

We call a pure IP-based VoiceXML system an "IP dialog server", or just
"dialog server". Dialog servers are a key part of the application story
for SIP-based networks, as described in the SIP application component
architecture [4]. That document describes SIP-based dialog servers, and
provides a high level overview of how the SIP interface works. This
document provides a stand-alone, self-contained, more thorough
description of a SIP-based VoIP VoiceXML interpreter context.

2 Script Initiation

The script execution begins when a session is established using an
INVITE request.

2.1 Script Naming

In SIP, the request-URI identifies the user or service that the call is
destined for. In the case of a dialog server, the dialog itself is the
target for the call. As such, the request URI should contain the
identifier for this dialog. This is consistent with the Request-URI
service invocation model of RFC 3087 [5].

This URL can be in one of two formats. In the first, the VoiceXML
script is identified directly by an HTTP URL. In the second, the script
is not specified.
Rather, the dialog server uses its configuration to map the incoming
request to a specific script. The format for the Request-URI in either
case is:

   Request-URI     = "sip:" service-ID "." dialog-type
                     ["." dialog-specific] "@" hostport
                     url-parameters [headers]
   service-ID      = "dialog" | extension-token
   dialog-type     = "vxml" | service-token
   dialog-specific = vxml-specific | service-token
   service-token   = 1*(alphanum | "-" | "!" | "%" | "*" | "_" | "+" |
                     "`" | "'" | "~" )
   vxml-specific   = user-unreserved | unreserved | escaped

Since the request URI can indicate a request for a variety of different
services, of which a dialog server is only one type, the request URI
begins with a service identifier that indicates the basic service
required. This document specifies that dialog servers are addressed by
having the first part of the username in the request-URI contain the
service identifier "dialog" to indicate that a dialog service is
requested. This is followed by a period, and after that, an identifier
that indicates the means by which the dialog is specified. Currently,
one mechanism is defined: a VoiceXML script. Other tokens can be used
to indicate different mechanisms (note that service-token is identical
to the BNF for token from RFC 2543, except that the "." character is
disallowed).

After that comes an optional period followed by dialog-mechanism
specific identification. For VoiceXML scripts, when present, this
identification information is always a URL-encoded version of the URL
which references the script to execute. When not present, the dialog
server uses server-specific configuration to determine which script to
execute.
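
The naming rules above can be illustrated with a small sketch. It is
not part of the specification: the function names build_dialog_uri and
parse_dialog_uri are hypothetical, url-parameters and headers are
ignored, and Python's quote() emits upper-case hex escapes (%3A), which
are equivalent to the lower-case form.

```python
from urllib.parse import quote, unquote

SERVICE_ID = "dialog"   # service identifier for dialog servers
DIALOG_TYPE = "vxml"    # the one dialog type currently defined


def build_dialog_uri(host, script_url=None):
    """Build a dialog Request-URI; the script URL, if any, is escaped
    so that characters such as ":" and "@" never appear in the user part."""
    user = f"{SERVICE_ID}.{DIALOG_TYPE}"
    if script_url is not None:
        user += "." + quote(script_url, safe="/")
    return f"sip:{user}@{host}"


def parse_dialog_uri(uri):
    """Return (dialog_type, script_url or None) for a dialog Request-URI.

    Raises ValueError if the URI does not address a dialog service.
    """
    if not uri.lower().startswith("sip:"):
        raise ValueError("not a SIP URI")
    # "@" is escaped inside the user part, so the first "@" ends it.
    user, sep, _host_part = uri[4:].partition("@")
    if not sep:
        raise ValueError("no user part")
    parts = user.split(".", 2)  # service-ID, dialog-type, optional rest
    if len(parts) < 2 or parts[0] != SERVICE_ID:
        raise ValueError("not a dialog service URI")
    script_url = unquote(parts[2]) if len(parts) == 3 else None
    return parts[1], script_url
```

When parse_dialog_uri returns None for the script, a server would fall
back to its own configuration to choose one, as described above.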
Examples of URLs that invoke VoiceXML dialogs are:

   sip:dialog.vxml.http%3a//dialogs.server.com/script32.vxml@vxmlservers.com
   sip:dialog.vxml@vxmlservers.com

The first of these indicates that the dialog server (located at
vxmlservers.com) should invoke a VoiceXML script fetched from
http://dialogs.server.com/script32.vxml. Since the user part of the SIP
URL cannot contain the ":" character, it must be escaped as %3a. The
second leaves the choice of script to the dialog server's
configuration.

2.2 Responding to the INVITE

If the server receiving the INVITE doesn't support the specifics of the
service request (for example, the requested VoiceXML version is not
supported), the server SHOULD generate a 501 response. It MAY include a
Warning header providing details on why the request could not be
serviced.

The server SHOULD authenticate the caller and verify that they are
authorized to access the requested service. It is anticipated that
dialog servers will generally be used in conjunction with an
application server which makes the actual authorization decision about
whether the call is to be processed. As a result, the dialog server's
authorization decision is simple: if the request came from an
authorized upstream server, it is allowed. It is RECOMMENDED that a
persistent TLS connection between the application server and the dialog
server be used to provide the authentication credentials for this kind
of scenario.

The server then validates that the SDP in the INVITE, if present, is
acceptable. It does so based on the procedures of Section 2.3.

If it has gotten this far, the server SHOULD fetch the script
identified by the request-URI before generating a final response to the
request. If the script cannot be fetched, or is invalid, the server
generates a 502 Bad Gateway response, since the server is effectively a
gateway to HTTP. It MAY include a Warning header providing details on
the reason for failure.
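
As a rough illustration of the order of these checks, the following
sketch maps their results to a final response code. The function name
and its boolean inputs are hypothetical. This section names no status
code for an unauthorized caller or an unacceptable SDP, so the sketch
returns None in those cases rather than inventing one.

```python
from typing import Optional


def invite_final_response(service_supported: bool,
                          caller_authorized: bool,
                          sdp_acceptable: bool,
                          script_ok: bool) -> Optional[int]:
    """Pick the final response to the initial INVITE.

    Returns a SIP status code, or None where this section does not
    specify which rejection code to use.
    """
    if not service_supported:
        return 501  # service specifics (e.g. VoiceXML version) unsupported
    if not caller_authorized or not sdp_acceptable:
        return None  # rejected, but the code is not specified here
    if not script_ok:
        return 502  # script could not be fetched or is invalid
    return 200      # script fetched and valid, SDP acceptable
```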
Once the script has been fetched and found valid, and the offered SDP
is deemed acceptable, the server SHOULD generate a 200 OK response. The
generation of the response, and ACK processing, are based on standard
SIP semantics.

2.3 SDP Processing

If the INVITE contains SDP with an offer, the dialog server will
generate an answer as per SIP-bis [6]. The offer is deemed unacceptable
if it contains no media lines of type audio, or if the dialog server
supports none of the codecs listed for the audio streams. Otherwise, it
is deemed acceptable. The answer generated by the dialog server SHOULD
refuse all media streams except the first offered audio stream.

Choice of codecs used by the dialog server is at the discretion of the
implementor. However, it is STRONGLY RECOMMENDED that all dialog
servers support G.711 and RFC 2833. If an offered media stream does not
indicate support for RFC 2833 tones, the dialog server SHOULD add that
codec to the answer. As described in RFC 2543-bis [6], this allows the
dialog server to inform the caller that it can receive RFC 2833 media,
even if the caller cannot receive it.

The server SHOULD allow sendonly, recvonly, and sendrecv media streams,
as well as streams on hold. The meaning of these for script
interpretation is discussed in Section 4.

If the INVITE from the caller did not contain an SDP, the dialog server
SHOULD generate an offer in the 2xx with a single audio media line,
listing all codecs supported by the dialog server.

2.4 Script Variables

In VoiceXML 1.0, the interpreter context provides the script with
several variables that provide information on the call control
interfaces. These variables are set in the following fashion:

session.telephone.ani: This variable is the value of the URL in the
   From field of the INVITE that triggered the script.

session.telephone.dnis: This variable is the value of the URL in the
   To field of the INVITE that triggered the script.

session.telephone.iidigits: If the Contact header in the INVITE request
   uses the SIP caller preferences contact parameters [7] to provide
   additional information on the initiating device, the interpreter
   context SHOULD map these parameters to the closest II digit if
   possible.

session.telephone.uui: This variable is set only if the INVITE request
   contained an embedded ISUP IAM request [8]. In that case, the
   user-to-user information elements from that IAM are extracted, and
   mapped to this variable. Support for this is optional, but
   RECOMMENDED.

3 Document Acquisition

The interpreter context fetches the script using normal HTTP GET and
POST requests [9]. It MUST follow the caching behaviors specified in
VoiceXML 1.0. It MAY support other document acquisition protocols, such
as FTP.

4 Audio Input and Output

Audio input and output are provided through RTP. The implementation
platform SHOULD provide DTMF recognition on the incoming media stream,
independent of its codec type. This is greatly facilitated through RFC
2833, which pushes the DTMF detection operation to the originator. The
implementation platform SHOULD provide speech recognition on the
incoming media stream as well.

To be very explicit, this means that the dialog server SHOULD support
recognition of DTMF and speech by processing a single incoming media
stream. Furthermore, this stream can be sent by the caller using at
least two codecs, G.711 and RFC 2833, and the sender of the media can
switch codecs on the fly when it detects DTMF. This means that RTP
packets 1, 2 and 3 might be G.711, followed by RTP packet 4 which is
RFC 2833. Furthermore, despite the fact that the sender can send RFC
2833, the dialog server SHOULD still perform DTMF detection on the
media stream, in case the sender does not support RFC 2833, or does
support it but misses a digit.
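
The mixed-codec processing described above might look like the
following sketch. Several things here are assumptions for illustration:
the dynamic payload type 101 for telephone-event (the real value is
whatever the SDP negotiates), the (payload_type, timestamp, payload)
packet tuples, and the inband_detector callable, which stands in for an
audio DTMF detector and is not implemented. The 4-byte event layout
(event, E bit, R bit, volume, duration) is that of RFC 2833.

```python
import struct

PCMU_PT = 0               # static RTP payload type for G.711 mu-law
TELEPHONE_EVENT_PT = 101  # assumed dynamic payload type from the SDP
EVENT_TO_DIGIT = "0123456789*#ABCD"  # RFC 2833 DTMF events 0-15


def decode_telephone_event(payload):
    """Decode one RFC 2833 event payload into (digit, end, volume, duration)."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    end = bool(flags & 0x80)   # E bit: this packet ends the event
    volume = flags & 0x3F      # low six bits: power level
    digit = EVENT_TO_DIGIT[event] if event < len(EVENT_TO_DIGIT) else None
    return digit, end, volume, duration


def collect_digits(packets, inband_detector):
    """Collect DTMF digits from one stream that mixes G.711 and RFC 2833.

    packets yields (payload_type, rtp_timestamp, payload) tuples.
    inband_detector(payload) returns a digit or None (hypothetical).
    """
    digits = []
    last_event_ts = None
    for payload_type, timestamp, payload in packets:
        if payload_type == TELEPHONE_EVENT_PT:
            digit, end, _volume, _duration = decode_telephone_event(payload)
            # End packets are retransmitted with the same timestamp,
            # so each event is reported only once.
            if digit is not None and end and timestamp != last_event_ts:
                digits.append(digit)
                last_event_ts = timestamp
        elif payload_type == PCMU_PT:
            # Keep detecting in the audio itself, in case the sender
            # does not use RFC 2833 or misses a digit.
            digit = inband_detector(payload)
            if digit is not None:
                digits.append(digit)
    return digits
```

A fuller implementation would also need to avoid counting a digit twice
when it is both detected in the audio and signalled as an event; that
is left out of this sketch.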

   OPEN ISSUE: This is a strong statement; if the probability of
   missed DTMF is small, the dialog server shouldn't have to do
   detection if it knows the caller has done it. Problem, though:
   since SDP has no way to indicate codec-specific directionalities in
   a sendrecv stream, a UA that can only send RFC 2833 doesn't say
   anything about it in the SDP in the INVITE. As a result, there is
   no way to know for sure that the sender can do it until the first
   RFC 2833 packet shows up. The SDP FID [10] specification resolves
   this. Should we make support for the FID spec mandatory for dialog
   servers?

Some implementations we are aware of use a separate stream for the DTMF
and for the speech. This approach is NOT RECOMMENDED, since it makes
synchronization of the speech and DTMF difficult.

SDP allows media streams to be unidirectional. If a stream is one-way
from the caller to the dialog server, script processing SHOULD proceed
normally, except that any audio which would normally be output by the
implementation platform is discarded. Conversely, if a stream is
one-way from the dialog server to the caller, script processing SHOULD
proceed normally, except that the implementation platform never
delivers characters (i.e., DTMF digits) or utterances to the
interpreter. In other words, behavior is identical to the case where
the caller is simply not talking.

Unidirectional streams are very useful for applications which require a
"listener" on an existing media stream to look for a particular
utterance or DTMF digit, and deliver that to an application server for
event processing. Therefore, it is RECOMMENDED that they be supported
in dialog servers as described above.

SIP allows media streams to be placed on hold. This will happen when
the interpreter context receives a re-INVITE with an SDP with a 0.0.0.0
connection line.
This is handled identically to the case of a media stream which is
unidirectional from the dialog server to the caller, meaning that the
stream is "just" disconnected and the interpreter is not frozen.

SIP allows media streams to be disabled by setting the port to zero.
This has a very specific meaning in the case of a dialog server: it has
the effect of requesting a freeze of the interpreter state. When the
interpreter context returns a 200 OK as a response, it indicates that
the interpreter has been frozen. The interpreter is truly frozen; the
behavior should be as if time were literally suspended as far as the
interpreter is concerned. To unfreeze the interpreter state, a
re-INVITE is needed to establish a new audio media stream. This will
cause processing of the script to continue at exactly the same place it
left off, using the media input and output from the new media stream to
drive the interpreter. It is critical that, as far as the script is
concerned, the freeze never even took place.

This capability is essential for supporting feature composition of
voice-based applications. Consider application A, which allows the user
to hear an announcement when a friend comes online. If the user says
yes, a call is placed to that friend. Another application, B, allows
the user to hear stock quotes. We'd like to compose these so that both
can happen simultaneously. For that to happen in a reasonable fashion,
one of these applications has the "focus", meaning that it is the one
processing the input and output from the user. Consider the case where
the stock quote application has the focus. The stock quote application
runs on dialog server X, and the presence application on dialog server
Y. Application server Z is the central point for all system events
related to all applications. The flow to consider is shown in Figure 1.
At the beginning of the flow, the caller has a call leg to the AS, and
the AS has used third party call control [11] to connect the caller to
dialog server X. This means there is an RTP connection between the
caller and this dialog server, as shown. An external event (such as a
friend coming online) will cause the application server to decide that
the other voice application needs to receive the focus. However, we
don't want to terminate the stock quote application; we merely wish to
suspend it so that the user can resume it after hearing that the friend
came online. So, the application server sends a re-INVITE (1) to the
dialog server running the stock quote application, and requests it to
be frozen. When the interpreter completes the current prompt block, the
context freezes the interpreter and returns a 200 OK (2). The AS then
connects the user to the dialog server running the presence application
(4-9). Dialog server Y will fetch the VoiceXML script from the AS
(since the AS knows the identity of the buddy that came online, it
needs to be the one that generates the VoiceXML script), but this is
not shown. This dialog runs, and assuming the user doesn't call the
friend, the script terminates, causing server Y to send a BYE (10). The
AS decides to resume the stock quote application. So, using 3pcc, it
reconnects the caller with server X (12-17). The re-INVITE to server X
(14) has the effect of unfreezing the context, so processing continues
where the call left off.

The result of this is that the user's experience is the following:

   network: Please enter the stock to check.
   user:    Lucent
   network: Lucent technologies is at six dollars.
   network: Friend alert: Bob is online. Would you like to call him?
   user:    no
   network: Please enter the name of the stock to check.

Note that the issue of when the interpreter can be suspended is being
worked on in the W3C.
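
The hold, freeze, and direction rules of this section can be sketched
as a classifier over the SDP offered in a re-INVITE. This is a minimal
illustration, not a full SDP parser. The names classify_audio_offer and
InterpreterContext and the result labels are hypothetical, only the
first audio stream is examined, and direction attributes are read from
the caller's point of view (a caller's "sendonly" means the stream runs
from the caller to the dialog server only).

```python
DIRECTIONS = ("sendonly", "recvonly", "sendrecv")


def classify_audio_offer(sdp):
    """Classify an offer as 'freeze', 'no_input', 'no_output' or 'normal'."""
    addr = {}        # connection address per section
    direction = {}   # direction attribute per section
    port = None
    section = "session"
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            if section == "audio":
                break  # only the first audio stream is considered
            fields = line[2:].split()
            if len(fields) >= 2 and fields[0] == "audio":
                section = "audio"
                port = int(fields[1].split("/")[0])
            else:
                section = "other"
        elif section == "other":
            continue  # ignore non-audio media sections
        elif line.startswith("c="):
            addr[section] = line.split()[-1]
        elif line.startswith("a=") and line[2:] in DIRECTIONS:
            direction[section] = line[2:]
    if port is None:
        raise ValueError("offer contains no audio stream")
    if port == 0:
        return "freeze"      # disabled stream: freeze the interpreter
    if addr.get("audio", addr.get("session")) == "0.0.0.0":
        return "no_input"    # hold: treated as server-to-caller only
    mode = direction.get("audio", direction.get("session", "sendrecv"))
    if mode == "recvonly":
        return "no_input"    # caller only listens: nothing is delivered
    if mode == "sendonly":
        return "no_output"   # caller only sends: output is discarded
    return "normal"


class InterpreterContext:
    """Tracks only the frozen flag; the script itself is never told."""

    def __init__(self):
        self.frozen = False

    def on_reinvite(self, sdp):
        mode = classify_audio_offer(sdp)
        # Any offer with a usable audio stream unfreezes the interpreter.
        self.frozen = mode == "freeze"
        return mode
```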

The key idea with this mechanism is that in NO CASE should the VoiceXML
script for the stock quote application need to know that this external
event (the buddy coming online) has occurred, so that it can play the
buddy announcement. Doing so is counter to the entire concept of
feature interaction; it is an intractable problem if every application
and feature needs to know about each other. In the approach proposed
here, each voice application remains independent. The application
server plays the role of composing them by activating and deactivating
the contexts as needed. This still requires the AS to know the set of
applications that are running, but in this case, it doesn't need to
know anything except the relative precedences of the various
applications and the events which trigger them. Logic for that can, in
principle, be constructed in a generic way, independent of the specific
applications. This approach isn't perfect for all cases, but it's
simple enough to get things started.

4.1 Processing Further SIP Messages

The interpreter context processes subsequent SIP messages in the
following fashion.

4.2 BYE

If a BYE request is received from the caller, this terminates the call.
The interpreter context SHOULD throw the telephone.disconnect event to
the interpreter.

4.3 re-INVITE

If a re-INVITE is received, it has the effect of changing some aspect
of the media input and output.
Codec changes, port changes, and IP address changes are handled
normally as per SIP-bis [6]. Specific processing is required for
changes in stream direction, placing the call on hold, disabling a
media stream, and adding a new audio stream after a previous re-INVITE
disabled it. See Section 4.

   Caller            AS (Z)            DS (X)            DS (Y)
   |RTP              |                 |                 |
   |...................................|                 |
   |                 |friend online    |                 |
   |                 |<--------        |                 |
   |                 |(1) INV disable  |                 |
   |                 |---------------->|request freeze   |
   |                 |(2) 200 OK       |                 |
   |                 |<----------------|frozen           |
   |                 |(3) ACK          |                 |
   |                 |---------------->|                 |
   |                 |(4) INV no SDP   |                 |
   |                 |---------------------------------->|
   |                 |(5) 200 SDP 1    |                 |
   |(6) INV SDP 1    |<----------------------------------|
   |<----------------|                 |                 |
   |(7) 200 SDP 2    |                 |                 |
   |---------------->|(8) ACK SDP 2    |                 |
   |(9) ACK          |---------------------------------->|
   |<----------------|                 |                 |
   |                 |                 |                 |
   | RTP             |                 |                 |
   |.....................................................|
   |                 |                 |                 |
   |                 |(10) BYE         |                 |
   |                 |<----------------------------------|
   |                 |(11) 200 OK      |                 |
   |                 |---------------------------------->|
   |(12) INV no SDP  |                 |                 |
   |<----------------|                 |                 |
   |(13) 200 SDP 3   |                 |                 |
   |---------------->|(14) INV SDP 3   |                 |
   |                 |---------------->|unfreeze         |
   |                 |(15) 200 SDP 4   |                 |
   |(16) ACK SDP 4   |<----------------|                 |
   |<----------------|(17) ACK         |                 |
   |                 |---------------->|                 |
   |RTP              |                 |                 |
   |.................|.................|                 |
   |                 |                 |                 |

             Figure 1: Voice Application Composition

4.4 INFO, MESSAGE

These messages are ignored by the interpreter context.

5 Tag Processing

Certain tags within the VoiceXML script have call control implications.
The following subsections describe how the interpreter context handles
them.

5.1 Exit

VoiceXML 1.0 says that the processing of the exit tag is entirely
context specific. For SIP, the interpreter context SHOULD send a BYE to
terminate the call.
Ideally, the VoiceXML exit element would also post the given namelist
to a URI specified in the original call setup. For example, the URI of
an HTTP servlet running directly in the AS or in an associated web
application server would be an appropriate choice. This would allow
voice interactions to be completely independent of the calling context,
and therefore be re-usable across providers and applications. The
VoiceXML specification is silent on exactly what should happen with the
namelist. For this reason, we do not specify any particular processing
at this time.

   OPEN ISSUE: Should we specify something? We could provide an
   additional URL at script initiation which is used to post the
   namelist upon exit.

5.2 Disconnect

The interpreter context SHOULD send a BYE to terminate the call. As per
the VoiceXML specification, a telephone.disconnect.hangup event is also
thrown.

5.3 Transfer

VoiceXML 1.0 supports two styles of transfer, bridged and blind.

5.3.1 Blind

When the interpreter context needs to perform a blind transfer, it
SHOULD generate a REFER [12] request. The REFER request is sent to the
caller. It contains a Refer-To header which contains the target URL
given in the value of the "dest" attribute of the transfer tag. If the
transfer tag contains a connecttimeout attribute, the URI in the
Refer-To has an Expires header parameter appended to it, containing the
duration from the attribute. For example, if the following transfer tag
was encountered:

The REFER would look like:

   REFER sip:caller@pc13.company.com
   Via: SIP/2.0/UDP server3.vxmlservers.com
   From: sip:dialog.vxml20@vxmlservers.com;tag=8aa6s
   CSeq: 3487 REFER
   Call-ID: 9a8s9809s@102.3.4.4
   To: sip:caller@company.com;tag=99as7
   Refer-To: sip:support@foo.com?Expires=10
   Referred-By: sip:dialog.vxml20@vxmlservers.com

If the REFER is rejected, the interpreter context outputs a
network_busy as the outcome of the transfer attempt.
Otherwise, the interpreter context remains suspended until a NOTIFY is
received. At some point before the expiration, the interpreter context
will receive a NOTIFY request containing the final response received
for the triggered INVITE. If this response is a 2xx, the interpreter
context throws a telephone.disconnect.transfer, and sends a BYE request
to terminate the call. If the final response was a non-2xx response,
the transfer attempt failed. If the final response was a 486, the
outcome of the transfer attempt is set to busy, and form processing
continues. If the final response was a 408, the outcome of the transfer
attempt is set to noanswer, and form processing continues. For any
other response, the outcome of the transfer attempt is set to
network_busy, and form processing continues.

5.3.2 Bridged

In a bridged transfer, the interpreter context resumes after the
transfer call completes. VoiceXML 1.0 also allows the script to specify
a grammar within the transfer tag, allowing it to listen in for DTMF
that meets that grammar. When a match is found, the transfer is
terminated and control returns to the interpreter.

This function requires that the dialog server act as a UAC, and make
the outbound call to the transfer target. The flow is shown in Figure
2. The caller connects to the dialog server with messages 1-3. RTP
flows between the caller and the dialog server. When the transfer tag
is encountered, the dialog server sends an outbound INVITE (4). The
outbound INVITE contains the same SDP, SDP1, offered by the caller. If
the final response (5) is a 200 OK, this contains SDP3. The dialog
server continues to receive media from the caller. This is passed on to
the transfer target, using SDP3. However, media from the transfer
target to the caller goes direct, bypassing the dialog server.

If the final response to the INVITE was a non-2xx response, the
transfer attempt failed.
If the final response was a 486, the outcome of the transfer attempt is
set to busy, and form processing continues. If the final response was a
408, the outcome of the transfer attempt is set to noanswer, and form
processing continues. For any other response, the outcome of the
transfer attempt is set to network_busy, and form processing continues.

The INVITE should not be left pending for more than the amount of time
in the connecttimeout parameter, if specified. After that amount of
time has passed, the INVITE request is cancelled, and form processing
continues. The outcome of the transfer is set to noanswer.

If the final response to the INVITE was a 2xx response, the transfer
attempt succeeded. In addition to passing on the media to the transfer
target, the interpreter passes the media received from the caller
through the grammar within the transfer tag, if one is present. If the
grammar is matched, the interpreter context sends a BYE to the transfer
target. Processing continues within the interpreter. If the transfer
target sends a BYE, a 200 OK is returned. The outcome of the transfer
is set to far_end_disconnect. Form interpretation continues. If the
caller sends a BYE, a 200 OK is returned. The dialog server sends a BYE
to the transfer target. A telephone.disconnect.hangup event is thrown,
and form processing continues to allow cleanup.

   |(1) INVITE SDP1      |                    |
   |-------------------->|                    |
   |(2) 200 SDP2         |                    |
   |<--------------------|                    |
   |(3) ACK              |                    |
   |-------------------->|                    |
   |RTP                  |                    |
   |<...................>|                    |
   |                     |(4) INVITE SDP1     |
   |                     |------------------->|
   |                     |(5) 200 SDP3        |
   |                     |<-------------------|
   |                     |(6) ACK             |
   |                     |------------------->|
   | RTP from caller     |                    |
   |....................>| RTP from caller    |
   |                     |...................>|
   |                     | RTP to caller      |
   |<.........................................|
   |                     |                    |
   |                     |                    |
   Caller                DS             Transfer target

                Figure 2: Bridged Transfer flow
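
The mapping from the final response of the transfer INVITE to the
transfer outcome, which is the same for blind and bridged transfers,
can be summarized in a short sketch. The function name and its
parameters are hypothetical. A return value of None stands for a 2xx
response, where the transfer succeeded and no failure outcome is set.

```python
def transfer_outcome(final_status=None, timed_out=False):
    """Map the result of a transfer attempt to its outcome string.

    final_status is the final SIP response code for the transfer
    INVITE; timed_out means connecttimeout expired first and the
    INVITE was cancelled.
    """
    if timed_out:
        return "noanswer"
    if final_status is None or final_status < 200:
        raise ValueError("a final response (>= 200) is required")
    if final_status < 300:
        return None            # transfer succeeded
    if final_status == 486:
        return "busy"
    if final_status == 408:
        return "noanswer"
    return "network_busy"      # any other failure response
```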

   OPEN ISSUE: When would it even be possible for the transfer outcome
   to be near_end_disconnect? Wouldn't this terminate the script, so
   that there is no transfer outcome?

If the transfer target sends a REFER (i.e., the caller is to be
transferred elsewhere), the interpreter context responds with a 200 OK.
It creates a new REFER with the same Refer-To header (but its own value
for Referred-By), and sends it to the caller. Upon receiving a 200 OK
to the REFER, the dialog server sends a NOTIFY to the transfer target,
informing it of a successful REFER completion to the new target. If a
BYE is received from the transfer target, the interpreter sends a BYE
to the caller as well, and throws a telephone.disconnect.transfer
event.

6 Additional Requirements

In addition to the above behaviors, we also recommend that several
optional SIP capabilities be implemented by dialog servers. This is to
support their intended use cases as components in the application
server component architecture [4]. The following list of requirements
includes these recommended features, in addition to summarizing the
ones scattered above:

1. The dialog server SHOULD support SIP over persistent TCP and TLS
   connections, and SHOULD support a configurable authorization
   listing of allowed Distinguished Names which can connect. This is
   useful when authorization decisions are outsourced to an
   application server, as described above.

2. The dialog server SHOULD fully support RFC 1889 and RFC 1890. Of
   particular importance is RTCP.

3. The dialog server SHOULD support G.711 and RFC 2833.

4. The dialog server SHOULD support the UA requirements outlined in
   the third party call control specification [11]. This is important
   for building more complex applications, a common usage for dialog
   servers.

5.
   The dialog server SHOULD support the SDP FID attribute [10], and
   SHOULD use it to allow processing to occur over a collection of
   alternate streams with the same FID group.

6. The dialog server SHOULD support the REFER method [12], needed for
   the blind transfer tag. It SHOULD also allow itself to be referred
   as a normal UAS.

7. The dialog server SHOULD allow any HTTP URL to be placed in the
   request-URI for specifying the script to execute.

7 Authors' Addresses

Jonathan Rosenberg
dynamicsoft
72 Eagle Rock Avenue
First Floor
East Hanover, NJ 07936
email: jdrosen@dynamicsoft.com

Peter Mataga
dynamicsoft
72 Eagle Rock Avenue
First Floor
East Hanover, NJ 07936
email: pmataga@dynamicsoft.com

David Ladd
dynamicsoft
72 Eagle Rock Avenue
First Floor
East Hanover, NJ 07936
email: dladd@dynamicsoft.com

8 Bibliography

[1] VoiceXML Forum, "Voice extensible markup language (VoiceXML)
version 1.00," VoiceXML forum specification, VoiceXML Forum, Mar. 2000.

[2] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg, "SIP:
session initiation protocol," Request for Comments 2543, Internet
Engineering Task Force, Mar. 1999.

[3] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a
transport protocol for real-time applications," Request for Comments
1889, Internet Engineering Task Force, Jan. 1996.

[4] J. Rosenberg, P. Mataga, and H. Schulzrinne, "An application server
component architecture for SIP," Internet Draft, Internet Engineering
Task Force, Mar. 2001. Work in progress.

[5] B. Campbell and R. Sparks, "Control of service context using SIP
Request-URI," Request for Comments 3087, Internet Engineering Task
Force, Apr. 2001.

[6] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg, "SIP:
Session initiation protocol," Internet Draft, Internet Engineering Task
Force, Nov. 2000. Work in progress.

[7] H. Schulzrinne and J.
Rosenberg, "SIP caller preferences and callee capabilities," Internet
Draft, Internet Engineering Task Force, Nov. 2000. Work in progress.

[8] E. Zimmerer, J. Peterson, A. Vemuri, L. Ong, F. Audet, M. Watson,
and M. Zonoun, "MIME media types for ISUP and QSIG objects," Internet
Draft, Internet Engineering Task Force, Mar. 2001. Work in progress.

[9] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P.
Leach, and T. Berners-Lee, "Hypertext transfer protocol -- HTTP/1.1,"
Request for Comments 2616, Internet Engineering Task Force, June 1999.

[10] G. Camarillo, J. Holler, and G. Eriksson, "The SDP fid attribute,"
Internet Draft, Internet Engineering Task Force, Apr. 2001. Work in
progress.

[11] J. Rosenberg, J. Peterson, H. Schulzrinne, and G. Camarillo,
"Third party call control in SIP," Internet Draft, Internet Engineering
Task Force, Mar. 2001. Work in progress.

[12] R. Sparks, "SIP call control," Internet Draft, Internet
Engineering Task Force, Feb. 2001. Work in progress.