From: http://www.ietf.org/internet-drafts/draft-freed-sieve-in-xml-01.txt
Title: Sieve Email Filtering: Sieves and display directives in XML
Reference: IETF SIEVE WG Internet Draft 'draft-freed-sieve-in-xml-01'
Date: February 25, 2008
I-D Tracker: http://ietfreport.isoc.org/idref/draft-freed-sieve-in-xml/
Tools: http://tools.ietf.org/html/draft-freed-sieve-in-xml-01

See also:
IETF Sieve Mail Filtering Language (SIEVE) Working Group Charter
    http://www.ietf.org/html.charters/sieve-charter.html
Sieve Mail Filtering Language Status Pages
    http://tools.ietf.org/wg/sieve/
Sieve: An Email Filtering Language
    http://ietfreport.isoc.org/idref/draft-ietf-sieve-3028bis/

==============================================================================

Network Working Group                                           N. Freed
Internet-Draft                                                  S. Vedam
Expires: August 28, 2008                                Sun Microsystems
                                                       February 25, 2008

Sieve Email Filtering: Sieves and display directives in XML
draft-freed-sieve-in-xml-01

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on August 28, 2008.

Abstract

This document describes a way to represent Sieve email filtering language scripts in XML. Representing sieves in XML is intended not as an alternate storage format for Sieve but rather as a means to facilitate manipulation of scripts using XML tools.

The XML representation also defines additional elements that have no counterparts in the regular Sieve language. These elements are intended for use by graphical user interfaces and provide facilities for labeling or grouping sections of a script so they can be displayed more conveniently. These elements are represented as specially structured comments in regular Sieve format.

Freed & Vedam            Expires August 28, 2008                [Page 1]
Internet-Draft      An XML Representation for Sieve        February 2008

Change History (to be removed prior to publication as an RFC)

Changed representation of comments in XML to use a comment element. Updated references.

Table of Contents

1. Introduction
2. Conventions used in this document
3. Grammatical structure of Sieve
4. XML Representation of Sieve
    4.1. XML Display Directives
5. Extended Example
6. Security Considerations
7. References
    7.1. Normative References
    7.2. Informative References
Appendix A. Schema for Sieves in XML
Appendix B. Stylesheet for conversion from XML
Appendix C. Acknowledgements
Authors' Addresses
Intellectual Property and Copyright Statements

1. Introduction

Sieve [RFC5228] is a language for filtering email messages at or around the time of final delivery. It is designed to be implementable on either a mail client or mail server. It is meant to be extensible, simple, and independent of access protocol, mail architecture, and operating system, and it is intended to be manipulated by a variety of different user interfaces.

Some user interface environments have extensive existing facilities for manipulating material represented in XML. While adding support for alternate data syntaxes may be possible in most if not all of these environments, it may not be particularly convenient to do so. The obvious way to deal with this issue is to map sieves into XML, possibly on a separate backend system, manipulate the XML, and convert it back to normal Sieve format.

The fact that conversion into and out of XML may be done as a separate operation on a different system argues strongly for defining a common XML representation for Sieve. This way different front end user interfaces can be used with different back end mapping and storage facilities.

Another issue with the creation and manipulation of sieve scripts by user interfaces is that the language is strictly focused on describing email filtering operations. The language contains no mechanisms for indicating how a given script should be presented in a user interface. Such information can be represented in XML very easily, so it makes sense to define a framework to do this as part of the XML format. Structured comments can then be used to retain this information when the script is converted to normal Sieve format.

Various sieve extensions have already been defined, e.g., [RFC5229] [RFC5230] [RFC5231] [RFC5232] [RFC5233] [RFC5235], and many more are planned.
The set of extensions available varies from one implementation to the next and may even change as a result of configuration choices. It is therefore essential that the XML representation of Sieve be able to accommodate Sieve extensions without requiring schema changes. It is also desirable that Sieve extensions not require changes to the code that converts to and from the XML representation.

This specification defines an XML representation for sieve scripts and explains how the conversion process to and from XML works. The XML representation is capable of accommodating any future Sieve extension as long as the underlying Sieve grammar remains unchanged. Furthermore, code that converts from XML to the normal Sieve format requires no changes to accommodate extensions, while code used to convert from normal Sieve format to XML only requires changes when new control commands are added - a rare event. An XML Schema and sample code to convert to and from XML format are also provided in the appendices.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

3. Grammatical structure of Sieve

The Sieve language is designed to be highly extensible without making any changes to the basic language syntax. Accordingly, the syntax of Sieve, defined in section 8 of [RFC5228], is entirely structural in nature and employs no reserved words of any sort.

Structurally a sieve script consists of a series of commands. Each command in turn consists of an identifier, zero or more arguments, an optional test or test-list, and finally an optional block containing another series of commands.
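
The command structure just described can be sketched as a small recursive data type. This is only an illustration and not part of the specification: the names Command, Test, and render_command are hypothetical, and arguments are simplified to strings, whole numbers, and string lists (other argument forms are omitted for brevity).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

# Hypothetical types for illustration; none of these names come from the draft.
Argument = Union[str, int, List[str]]


@dataclass
class Test:
    identifier: str
    arguments: List[Argument] = field(default_factory=list)
    tests: List["Test"] = field(default_factory=list)  # subordinate test or test-list


@dataclass
class Command:
    identifier: str
    arguments: List[Argument] = field(default_factory=list)
    tests: List[Test] = field(default_factory=list)    # optional test or test-list
    block: Optional[List["Command"]] = None            # None means "no block"


def quote(s: str) -> str:
    """Render a Sieve quoted string, escaping backslash and double quote."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_argument(arg: Argument) -> str:
    if isinstance(arg, list):
        return "[" + ", ".join(quote(s) for s in arg) + "]"
    if isinstance(arg, int):
        return str(arg)
    return quote(arg)


def render_tests(tests: List[Test]) -> str:
    if not tests:
        return ""
    if len(tests) == 1:
        return " " + render_test(tests[0])
    return " (" + ", ".join(render_test(t) for t in tests) + ")"


def render_test(test: Test) -> str:
    args = "".join(" " + render_argument(a) for a in test.arguments)
    return test.identifier + args + render_tests(test.tests)


def render_command(cmd: Command) -> str:
    args = "".join(" " + render_argument(a) for a in cmd.arguments)
    head = cmd.identifier + args + render_tests(cmd.tests)
    if cmd.block is None:
        return head + ";"
    return head + " {" + " ".join(render_command(c) for c in cmd.block) + "}"


script = Command("if", tests=[Test("true")], block=[Command("stop")])
print(render_command(script))  # if true {stop;}
```

Nothing in these types depends on which identifiers exist, which mirrors the point made above: the grammar is purely structural, so a new extension command needs no change to the data model.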
Commands are further broken down into controls and actions, although this distinction cannot be determined from the grammar. Some example Sieve controls are:

    stop;                <-- No arguments, test, or command block
    require "fileinto";  <-- Control with a single argument
    if true {stop;}      <-- Control with test and command block

Some examples of Sieve actions are:

    discard;             <-- Action with no args, test, or command block
    fileinto "folder";   <-- Action with an argument

At the time of this writing there are no controls defined that accept both arguments and a test. Similarly, there are currently no defined actions that allow either a test or a command block. Nevertheless, the Sieve grammar allows such constructs to be defined by some future extension.

A test consists of an identifier followed by zero or more arguments, then optionally another test or test-list. Unlike commands, tests cannot be followed by a command block.

Here are some examples of Sieve tests. Note that such tests have to appear as part of a command in order to be syntactically valid:

    true                          <-- Test with no argument or subordinate test
    header "to" "me@example.com"  <-- Test with several arguments

Command or test arguments can be either string lists, whole numbers or tags. (Tags are simply identifiers preceded by a colon.) Note that although the Sieve grammar treats single strings as a degenerate case of a string list, some tests or actions have arguments that can only be individual strings, not lists.

Here is an example showing the use of both a test-list and a string list:

    if anyof (not exists ["From", "Date"],
              header :contains "from" "fool@example.edu") {
        discard;
    }

Extensions can add new controls, actions, tests, or new arguments to existing controls or actions. Extensions can also change how string content is interpreted, although this is not relevant to this specification.
However, it is especially important to note that so far no Sieve extension has added a new control to the language, and it seems safe to assume that, due to their nature, future additions of controls will be rare.

Finally, comments are allowed between lexical elements in a Sieve script. It is very important that comments be preserved in the XML representation.

4. XML Representation of Sieve

Sieve controls and actions are represented in XML as control or action elements respectively. The command's identifier appears as a name attribute on the element itself. This is the only attribute allowed on controls and actions; arguments, tests, test-lists, and nested command blocks are all represented as nested elements. While naming the element after the control or action itself may seem like a better choice, doing so would result in extensions changing the XML schema.

The example Sieve controls shown in the previous section would be represented in XML as:

    <control name="stop"/>
    <control name="require"><str>fileinto</str></control>
    <control name="if">
      <test name="true"/>
      <control name="stop"/>
    </control>

The example Sieve actions shown above would appear in XML as:

    <action name="discard"/>
    <action name="fileinto"><str>folder</str></action>

The separation of controls from actions in the XML representation means that conversion from normal Sieve format to XML has to be able to distinguish between controls and actions. This is easily done by maintaining a list of all known controls, since experience indicates new controls are rarely added.

Tests are represented in the same basic way as controls and actions, that is, as a test element with a name attribute giving the test identifier. For example:

    <test name="header">
      <str>to</str><str>me@example.com</str>
    </test>

String, number, and tag arguments are represented as str, num, and tag elements respectively. The actual string, number, or tag identifier appears as text inside the element. None of these elements have any defined attributes. Several examples of arguments have already appeared in the preceding control, action and test examples.
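
The mapping described so far can be sketched with Python's standard xml.etree.ElementTree module. This is a minimal sketch, not the conversion code from the appendices. The helper names (Tag, add_argument, make_test, make_command) are hypothetical, KNOWN_CONTROLS is assumed to hold the base controls of [RFC5228], and namespaces and the enclosing root element are left out. Only the element names control, action, test, str, num, and tag and the name attribute come from the text above.

```python
import xml.etree.ElementTree as ET

# Assumption: the base-language controls; a real converter would keep this
# list in step with whatever controls its Sieve implementation knows about.
KNOWN_CONTROLS = {"if", "elsif", "else", "require", "stop"}


class Tag(str):
    """Hypothetical marker type for a Sieve tag, stored without the colon."""


def add_argument(parent, arg):
    """Append one argument to parent as a tag, num, or str element."""
    if isinstance(arg, Tag):
        ET.SubElement(parent, "tag").text = str(arg)
    elif isinstance(arg, int):
        ET.SubElement(parent, "num").text = str(arg)
    else:
        ET.SubElement(parent, "str").text = arg


def make_test(name, *args):
    """Build a test element whose name attribute is the test identifier."""
    elem = ET.Element("test", name=name)
    for arg in args:
        add_argument(elem, arg)
    return elem


def make_command(name, *args, test=None, block=()):
    """Build a control or action element, chosen by the known-controls list."""
    kind = "control" if name in KNOWN_CONTROLS else "action"
    elem = ET.Element(kind, name=name)
    for arg in args:
        add_argument(elem, arg)
    if test is not None:
        elem.append(test)
    elem.extend(block)  # nested commands follow arguments and tests
    return elem


# The example commands from Section 3.
examples = [
    make_command("stop"),
    make_command("require", "fileinto"),
    make_command("if", test=make_test("true"), block=[make_command("stop")]),
    make_command("discard"),
    make_command("fileinto", "folder"),
]
for elem in examples:
    print(ET.tostring(elem, encoding="unicode"))
```

Because the command identifier is carried in the name attribute rather than in the element name, an extension action such as a new fileinto-like command passes through make_command unchanged; only a new control would require an update to KNOWN_CONTROLS.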

String list arguments are represented as a list element which in turn contains one or more str elements. Note that this allows the distinction between a single string and a string list containing a single string to be preserved. This is not essential since a list containing a single string could simply be mapped to a string, but it seems prudent to maintain the distinction when mapping to and from XML.

Nested command blocks appear as a series of control or action elements inside of an outer control or action element. No block element is needed since an inner command block can only appear once and only after any arguments, tests, or test-lists. For example:

    [XML example: element markup lost in extraction. Surviving element text: contains, from, fool@example.edu]

Finally, Sieve comments are mapped to a special comment element in XML. XML comments are not used because some XML tools do not make it convenient to access comment nodes.

4.1. XML Display Directives

Sometimes graphical user interfaces are a convenient way to provide sieve management functions to users. These interfaces typically summarize, annotate, group, or display sieve scripts in an intuitive way for end users. To do this effectively, the graphical user interface may require additional information about the sieve script itself.

That information or "meta-data" might include, but is not limited to: a sieve name (identifying the current sieve), whether the sieve is enabled or disabled, and the order in which the parts of the sieve are presented to the user. The graphical user interface may also choose to provide mechanisms to allow the user to modify the script.

It is often useful for a graphical user interface to group related sieve script elements and provide an interface that displays these groups separately so they can be managed as a single object.
Some examples include Sieve statements that together provide vacation responders, blacklists/whitelists and other types of filtering controls. Some advanced graphical user interfaces may even provide a natural language representation of a sieve script and/or an advanced interface to present sieve statements directly to the user.

A graphical user interface may also choose to support only a subset of the action commands in the Sieve language (and its extensions). A mechanism to indicate the extent of support and to characterize the relationships between the supported action commands and tests (with their arguments) is therefore immensely useful, and probably required, for clients that may not have complete knowledge of Sieve grammar and semantics.

The Sieve language contains no mechanisms for indicating how a given script should be presented in a user interface. The language also does not contain any specific mechanisms to represent other sorts of meta-data about the script. Providing support for such meta-data as part of a sieve script is currently entirely implementation-specific and is usually done by imposing some type of structure on comments. However, such information can be represented in XML very easily, so it makes sense to define a framework to do this as part of the XML format. Implementations may choose to use structured comments to retain this information when the script is converted to normal Sieve format.

This XML representation defines two display directives - displayblock and displaydata - as containers for meta-data needed by graphical user interfaces.

The displayblock element can be used to enclose any number of sieve statements at any level. It is semantically meaningless to the sieve script itself. It allows an arbitrary set of attributes.
Implementations MAY use this to provide simple, display-related meta-data for the sieve such as a sieve identifier, group identifier, order of processing, etc. This information SHOULD be preserved in structured comments during conversion of XML to the normal Sieve syntax.

The displaydata element supports any number of arbitrary child elements. Implementations MAY use this to represent complex data about the sieve, such as a natural language representation of the sieve or a way to provide the sieve script directly. Again, this information SHOULD be preserved in structured comments when converted.

5. Extended Example

The example sieve script given in section 9 of [RFC5228] would be represented in XML as follows:

    [XML listing: element markup lost in extraction. Surviving element text, in document order:]

    Example Sieve Filter
    Declare any optional features or extensions used by the script
    fileinto
    Handle messages from known mailing lists
    Move messages from IETF filter discussion list to filter mailbox
    is  Sender  owner-ietf-mta-filters@imc.org
    filter
    move to "filter" mailbox
    Keep all messages to or from people in my company
    domain  is  From  To  example.com
    Try and catch unsolicited email. If a message is not to me, or it contains a subject known to be spam, file it away.
    all  contains  To  Cc  Bcc  me@example.com
    matches  subject  *make*money*fast*  *university*dipl*mas*
    spam
    Move all other (non-company) mail to "personal" mailbox.
    personal

The same script could be annotated with graphical display hints in a variety of ways.
Two possibilities are:

    [XML listing: element markup lost in extraction. Surviving element text, in document order:]

    fileinto
    is  Sender  owner-ietf-mta-filters@imc.org
    filter
    domain  is  From  To  example.com
    all  contains  To  Cc  Bcc  me@example.com
    matches  subject  *make*money*fast*  *university*dipl*mas*
    spam
    personal

Note that since displayblock elements are semantically null as far as the script itself is concerned they can be used to group structures like elsif and else that are tied to statements in other groups.

    [XML listing: element markup lost in extraction. Surviving element text, in document order:]

    If the e-mail header "Sender" is owner-ietf-mta-filters@imc.org then file it into the "filter" folder. Otherwise if the address in the "From" or "To" has a domain that is "example.com" then keep it. Otherwise messages meeting with any of these conditions: (1) None of the addresses in "To" or "Cc" or "Bcc" contains the domain "example.com". (2) The "Subject" field matches the pattern *make*money*fast* or *university*dipl*mas* then file it into the "spam" folder. If all else fails then file the message in the "personal" folder.
    ... the actual sieve script ...

6. Security Considerations

Any syntactically valid sieve script can be represented in XML. Accordingly, all security considerations applicable to Sieve and any extensions used also apply to the XML representation.

The use of XML carries its own security risks. Section 7 of RFC 3470 [RFC3470] discusses these risks.

Arbitrary data can be placed in the extensible displayblock and displaydata constructs defined in this specification, possibly including entire scripts in languages other than Sieve. Appropriate security precautions should be taken when using these facilities.

7. References

7.1. Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3470]  Hollenbeck, S., Rose, M., and L. Masinter, "Guidelines for the Use of Extensible Markup Language (XML) within IETF Protocols", BCP 70, RFC 3470, January 2003.

[RFC5228]  Guenther, P. and T. Showalter, "Sieve: An Email Filtering Language", RFC 5228, January 2008.

7.2. Informative References

[RFC5229]  Homme, K., "Sieve Email Filtering: Variables Extension", RFC 5229, January 2008.

[RFC5230]  Showalter, T. and N. Freed, "Sieve Email Filtering: Vacation Extension", RFC 5230, January 2008.

[RFC5231]  Segmuller, W. and B. Leiba, "Sieve Email Filtering: Relational Extension", RFC 5231, January 2008.

[RFC5232]  Melnikov, A., "Sieve Email Filtering: Imap4flags Extension", RFC 5232, January 2008.

[RFC5233]  Murchison, K., "Sieve Email Filtering: Subaddress Extension", RFC 5233, January 2008.

[RFC5235]  Daboo, C., "Sieve Email Filtering: Spamtest and Virustest Extensions", RFC 5235, January 2008.

Appendix A. Schema for Sieves in XML

The following defines a schema for the XML representation of Sieve scripts. Note that aside from defining the displaydata and displayblock elements this schema imposes no constraints on their content.

    [Schema listing: markup lost in extraction; no text survives.]

Appendix B. Stylesheet for conversion from XML

    [Stylesheet listing: markup lost in extraction. Surviving literal output fragments, in document order:]

    \"  \\
    {  }  ;  (  ,  )
    "  "  G  M  K  [  ,  ]  :
    /* */    /* [* */    /* *] */    /* [| |] */
    < />  < >  </ >  =" "

Appendix C. Acknowledgements

The stylesheet copy mode code is loosely based on sample code posted to the xsl-list list by Americo Albuquerque. Andrew McKeon provided useful comments on the document.

Authors' Addresses

Ned Freed
Sun Microsystems
3401 Centrelake Drive, Suite 410
Ontario, CA 92761-1205
USA

Phone: +1 909 457 4293
Email: ned.freed@mrochek.com

Srinivas Saisatish Vedam
Sun Microsystems

Phone: +91 80669 27577
Email: Srinivas.Sv@Sun.COM

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.