[Archive copy mirrored from the URL: http://www.geocities.com/WallStreet/Floor/5815/guide.htm, text only; see this canonical version of the document.]
Version 0.02
12th September 1997
Editor: Martin Bryan, The SGML Centre
Contributors: Members of the XML/EDI working group, including Benoît Marchal, Norbert H Mikula, Bruce Peat and David RR Webber.
XML/EDI Group Home Page URL: http://www.geocities.com/WallStreet/Floor/5815
Copyright © 1997. XML/EDI Group. All rights reserved, no part of this document may be commercially reproduced in part or in whole without consent and prior approval.
New penultimate paragraph added to Section 2. Definitions for XML/EDI
Section on Validation messages, under 6. The Implementation Process, has been extended to suggest how the XML Style Language (XSL) and ECMAScript (the standardized version of JavaScript) can be used to control the contents of elements.
Section on Processing Messages, under 6. The Implementation Process, has been rewritten to explain how the XML Style Language (XSL) and ECMAScript can be used to control the presentation and processing of XML/EDI data.
The example in Appendix A has been extended to include a sample of XSL as Figure A.4. (NB. The ECMAScript parts of the example have not been generated at present. There are also some problems with doing things like scrolling selection lists and raised buttons. Suggestions for appropriate coding for these sections would be gratefully received - Ed.)
Throughout the document changes have been made to reflect the decision made on 10th September 1997 that XML names must be case-sensitive, and that all names reserved for use in XML must be entered in uppercase!
Put simply, the goal of XML/EDI is to deliver unambiguous and durable business transactions via electronic means.
Associated with this is a goal to establish a standard for commercial electronic data interchange that is open and accessible to all, and which delivers a broad spectrum of capabilities suitable to meet the full breadth of business needs.
To achieve this requires the use of a methodology that is not only extensible enough to meet future requirements but also adaptable enough to incorporate new technologies and requirements as they emerge. To ensure broad adoption the technology selected needs to be widely and freely available. The Extensible Markup Language (XML) developed by the World Wide Web Consortium (W3C) provides such a freely available, widely transportable, methodology for well-controlled data interchange.
XML was designed principally for the exchange of information in the form of computer displayable "documents". Not all commercial data is interchanged in a displayable format. In particular data designed for electronic data interchange typically needs to be processed before it can be displayed. For this to be possible the data must be mapped, using some form of template, to a set of processing rules. These XML/EDI guidelines provide a standardized way in which such rule templates can be added to interchanged data.
These XML/EDI guidelines begin by formally defining the terms used in the text. This is followed by an impact statement that brings out a crystal ball and makes predictions from various viewpoints. The guidelines then give a background on the use of tools and standards on which XML/EDI is built.
Note: These guidelines form the basis for development work on XML/EDI. They form a precursor to a formal "Specification of an EDI Application for XML" which will be submitted to the W3C to be sanctioned as an industry standard. As a document designed to be a lightning rod for ideas, this working document has been, and will continue to be, released in draft form. Comments on this draft should be sent to the XML/EDI working group at xml-edi@riv.be.
Electronic commerce has been defined in the European Workshop on Open Systems' Technical Guide on Electronic Commerce (EWOS ETG 066) as "Electronic exchange of data to support business transactions, i.e. the exchange of value through the delivery of a product from a seller to a buyer". As such it encompasses much more than what has been possible using traditional methods of Electronic Data Interchange such as EDIFACT. Electronic commerce is defined by EWOS as covering activities such as marketing, contract exchange, logistics support, settlement and interaction with administrative bodies (e.g. tax and custom data interchange). Electronic commerce covers all industrial and service operations, including services such as insurance, healthcare, travel and interactive home shopping.
Many people use the term EDI to refer to the set of messages developed for business-to-business communication as part of the United Nations Standard Messages Directory for Electronic Data Interchange for Administration, Commerce and Transport (EDIFACT). EDIFACT messages are transmitted in compressed form, using predefined field identifiers, which must occur in a predefined sequence. While EDI is, strictly speaking, wider in scope than EDIFACT, for the purposes of these guidelines EDI will be used in this restricted sense when not otherwise qualified.
The basic unit of information in an EDI message is the data element. For an invoice, each item being invoiced would be represented by a data element. Data elements can be grouped into compound data elements, and data elements and/or compound data elements may be grouped into data segments. Data segments can be grouped into loops; and loops and/or data segments form the business document.
The EDIFACT standards define whether data segments are mandatory, optional, or conditional and indicate whether, how many times, and in what order a particular data segment can be repeated. For each EDI message, a field definition table exists. For each data segment, the field definition table includes a key field identifier string to indicate the data elements to be included in the data segment, the sequence of the elements, whether each element is mandatory, optional, or conditional, and the form of each element in terms of the number of characters and whether the characters are numeric or alphabetic. Similarly, field definition tables include data element identifier strings to describe individual data elements. Element identifier strings define an element's name, a reference designator, a data dictionary reference number specifying the location in a data dictionary where information on the data element can be found, a requirement designator (either mandatory, optional, or conditional), a type (such as numeric, decimal, or alphanumeric), and a length (minimum and maximum number of characters). A data element dictionary gives the content and meaning for each data element.
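The element identifier strings described above can be pictured as a small record that is checked against each incoming value. The following is only a minimal sketch in Python, not part of any EDIFACT directory: the class name ElementDefinition, the function check_element, the single-letter codes and the sample quantity definition are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementDefinition:
    """Hypothetical record mirroring one element identifier string."""
    name: str
    requirement: str  # "M" mandatory, "O" optional, "C" conditional
    kind: str         # "n" numeric, "a" alphabetic, "an" alphanumeric
    min_length: int
    max_length: int


def check_element(definition, value):
    """Return a list of problems; an empty list means the value is acceptable.

    Conditional ("C") requirements depend on other elements and are not
    evaluated in this sketch.
    """
    problems = []
    if not value:
        if definition.requirement == "M":
            problems.append(definition.name + ": mandatory element is missing")
        return problems
    if not definition.min_length <= len(value) <= definition.max_length:
        problems.append(definition.name + ": length outside permitted range")
    if definition.kind == "n" and not value.isdigit():
        problems.append(definition.name + ": expected numeric characters")
    if definition.kind == "a" and not value.isalpha():
        problems.append(definition.name + ": expected alphabetic characters")
    return problems


# Illustrative definition: a mandatory numeric quantity of 1 to 5 characters.
quantity = ElementDefinition("quantity", "M", "n", 1, 5)
```

Calling check_element(quantity, "12") returns an empty list, while a missing or non-numeric value yields one message per rule broken.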
Originally, EDI translation software was developed to support a variety of private system formats. Most often, the sender and receiver were required to contract in advance for a tailored software program that would be dedicated to mapping between their two types of datasets. Each time a new sender or receiver was added to the client list, a new translation program would be needed by the new party to format their data to conform to the standards in use by the participants. Of course, this becomes expensive. Such static systems do not easily allow synchronization of business transactions in distributed business processes that involve global rules, but with participants and actions that are not predetermined. To solve these issues it is desirable to develop automated tools and techniques that are easy to use and allow decomposition of transactions into actions to be performed locally and mapping of local actions onto efficient protocol exchanges.
The concept of the Electronic Enterprise requires a transition away from paper form based EDI. Key concepts that are required are the encapsulation of business rules (in EDI parlance the Implementation Guidelines) and also mechanisms to handle state and flow control (such as provided by hyperlinks in HTML). Also message sets must be able to handle partial information, where the complete information is not yet available, or simply is not required for the particular business process. This allows different parts of an enterprise to selectively contribute only the information that is germane to their business functions.
XML is the Extensible Markup Language subset of ISO's Standard Generalized Markup Language (SGML) developed by the World Wide Web Consortium (W3C) SGML on the Web working party during the latter half of 1996 and early 1997. The formal specification was submitted for approval by W3C members on 1st July 1997.
On 10th September 1997 a proposal for a new form of XML Style Language (XSL), which incorporates the ECMAScript standardized variant of JavaScript, was published by a consortium led by Microsoft, ArborText and the Inso Corporation. This version of the XML/EDI specification uses the power provided by this new advanced language combination to show how control of XML/EDI document processes can be achieved in a distributed manner.
Combining XML and EDI to develop XML/EDI indicates that the main method of capturing and coding EDI information will be through XML-coded electronic forms. In addition the XML/EDI specification shows how EDIFACT messages can be generated from XML/EDI forms, and vice versa.
XML/EDI isn't creating a new standard. XML/EDI is defining how companies can use current standards to solve their business problems.
Details of the scope of XML/EDI, and the impact it is expected to have on business communities, are covered in Introducing XML/EDI.... To help readers of this document to appreciate the differences in practice between traditional EDIFACT-based web transactions and XML/EDI this section discusses some of the differences between traditional business-to-business electronic data interchange systems and the new breed of interactive electronic commerce tools being provided through the Internet.
Electronic Data Interchange (EDI) has been used for business-to-business communication for almost a quarter of a century. Initial efforts involved inter-company agreements on how to exchange commercial data, initially as information stored on tape and later as messages sent over dedicated data lines. To avoid having to use different protocols to move data between different companies, various industry groups identified sets of data that could form the basis of individual agreements. The industry groups also sought to agree the format in which fields in such data sets were interchanged so that a company only needed to develop one methodology for decoding information received without recourse to human intervention.
The Achilles Heel of this approach has always been twofold. First, companies require flexibility in, and wish to deviate from, doctrinaire standards that do not fully meet their business need. Second, because the standards are pre-ordained there is no mechanism provided to transfer processing rules and associated information. It is assumed that the data meets the defined constraints and if not, has been duly modified to conform. This means that companies must conduct exacting analysis to determine precisely how they are going to move their business data to and from the predefined EDI formats.
The cost of these constraints has been borne as excessively long and complex implementation cycles for traditional EDI systems.
The world has changed from thirty years ago, and now requires more dynamic and vibrant tools that match the organized yet ad hoc nature presented by both modern business practice and its manifestations in the Internet itself. The Internet is re-writing the rules on how people interact, buy and sell, and exchange goods and services. In particular the Internet is showing us that EDI is not only relevant for business-to-business communications. The same concepts are also relevant for all consumer-to-supplier relationships, whether the consumer is an end-user, a manufacturer, a service organization such as a hospital or a hotel, a governmental organization or a virtual organization.
With the arrival of the Internet in the last decade of the 20th century the pattern of electronic commerce has dramatically changed. In particular, the Internet has introduced many new ways of trading, allowing interaction between groups that previously could not economically afford to trade with one another.
Whereas previously commercial data interchange involved mainly the movement of data fields from one computer to another, without human intervention, the new model for electronic commerce introduced by the Internet is fundamentally dependent on human interaction for the transaction to take place. The new model is based principally on the use of interactive selection of a set of options, and on the completion of "electronic forms", to specify user requirements.
As this new model develops there has been a fundamental shift in how data used for commerce should be processed. The original create-->transmit-->receive-->process cycle of information processing, using individual programs, is beginning to be replaced by the concept of active objects which have inherent processes associated with them, based on the class of information they contain. Today an invoice may no longer contain a copy of the information stored in the database it was generated from: instead it contains a pointer that says where it expects to get the data from, and this data will be fetched from its managed source each time the invoice is processed.
Such interactive programs require us to review the underlying philosophy of electronic commerce. What are the characteristics of a system designed for Interactive Electronic Commerce in an international marketplace?
To be truly interactive you need to be able to:
To do this you need to be able to:
Because these interactions can be complex, and potentially require specialized knowledge, the rule templates can be supplemented by XML/EDI data manipulation agents (DataBots) to ensure that users can express their requirements in high-level, natural language, terms. DataBots automatically create appropriate rule templates and XML syntax to match user requirements and broker the entire interchange.
When DataBots are being used XML/EDI is identified as being robot generated by adding an R to its name to become XML/EDI-R.
At this point in time the Java programming language provides the vehicle that permits the DataBots to be deployed and received along with XML/EDI messages.
XML/EDI is a synthesis of many concepts. XML/EDI:
XML will be the native language for the next generation of most of the popular WWW browsers. XML/EDI seeks to leverage the work and support (technically and financially) which XML is receiving. With traditional EDI, the infrastructure was built from the ground up, without being able to share resources with other programs. This paradigm is no longer appropriate in today's world of shared software development. By adopting XML/EDI, the EDI community can share the cost of extension and future development.
In 1986 the International Organization for Standardization (ISO) published an international standard defining a Standard Generalized Markup Language (SGML) that allowed its users to:
SGML has formed the basis of many of the large, multinational, documentation projects that have developed in the decade since its publication. It also formed the basis for the formalization of the HyperText Markup Language (HTML) that led to the formation of the World Wide Web of documentation that has become available on the Internet.
Key to the success of HTML was the development of the concept of Uniform Resource Locators (URLs) that allow users to identify the source of each piece of shared data in a consistent manner. Whilst the original concept has limitations as to the granularity of data access, its universality has greatly improved computer-to-computer communications.
In July 1996 the World Wide Web Consortium (W3C) set up a working group to study how SGML could be simplified to allow for its efficient use over the Internet. The result was the development of an Extensible Markup Language (XML) that combined the expressive power of SGML with the Internet-aware functionality of HTML.
XML provides an ideal methodology for Interactive Electronic Commerce because:
XML can be integrated with existing EDI systems by:
XML can extend existing EDI applications by:
Figure 1 illustrates the main layers of a fully integrated XML/EDI system.
Figure 1: The layers of an XML/EDI system
The XML/EDI specific components are built on top of existing standards for transmitting and processing XML-encoded data. These standards define shared features such as:
XML parsers, document browsers, page markup programs and related software functions are available off-the-shelf today. Because of this, XML/EDI isn't a new standard, but a framework for using existing standards to tackle existing problems in a new way.
The following XML/EDI specific components will either manifest themselves as built-in components into existing products, plug-in programs to existing tools, ActiveX controls, or standalone applications. It is anticipated that new applications will be created from the spark of XML/EDI implementation. The following list isn't comprehensive, but a starting place for development.
A primary component of XML/EDI is its dynamic common language and syntax repository. The various types of repositories include:
The central goals behind the development of the concept of DataBots are:
All these goals are realizable using XML/EDI-R.
Editor's Note: Figure needs revision to bring it into line with revised text
Note: The preliminary work and proof of concept for the DataBot technology is centered around a core module originally written in Prolog. (Prolog has long been recognized as highly suitable for creating Internet agent components.) Work is currently under way to port this work to Java so that it can be easily distributed across the Internet and integrated with Java-based XML browsers.
DataBots can:
It should also be noted that the template method that the DataBots implement is extremely compact and concise. The format can be compacted down by removing white space and other filler characters. This allows fulfilment of the objective of efficient protocol required to meet high volume constraints in batch EDI delivery systems.
Some additional considerations that also need to be taken into account include Process Control and Object Oriented support. The latter obviously requires that the template and its data be encapsulated in an XML wrapper. Process Control is easy to accommodate using XML/EDI. The trend within the EDI community seems to be to use the Integrated Computer Aided Manufacturing (ICAM) Definition Language (IDEF) process modelling language. Given this, either XML tokens for IDEF entities can be assigned, and then process control lines added to the template format, or IDEF can be defined as a notation that can be processed by an XML/EDI-aware browser.
In summary, the optional DataBots component provides the agent that brokers, controls, corrects, directs and ensures that the XML/EDI-R method can process the information transfers correctly. It uses the rule templates that are an essential part of the XML/EDI-R message syntax to accomplish this. Supplemental Java based components can also be used when needed to control the business process.
Business objects will be available off-the-shelf, created by developers, with rule sequences devised by users. The usage of these objects can be defined by their sphere of influence. Business objects can be:
Business objects, in most but not all cases, will be invoked by the XML/EDI Data Manipulation Agents. It is anticipated that for efficiency these object manipulation DataBots will be written as Java applets and/or ActiveX components, or using similarly integrated programming language tools. End-users will be supplied with tools that automatically generate the relevant agents from information provided about the application.
Below are just a few examples of the many possible classes of XML/EDI business objects:
Used for the interactive creation and completion of form-based EDI, the XML/EDItor is predicted to become the front-end for business applications. XML/EDI will reference Lexicon Repositories to prompt users for appropriate data using XML parse trees to request related fields.
It is anticipated that message stores will require extensions to provide the types of complex workflow management needed to ensure the correct delivery and processing of XML/EDI messages. For example, a message store should not be able to acknowledge receipt of a message until its contents have been parsed by an XML parser to ensure that the unencrypted data stream still forms a valid message.
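The acknowledgement rule described above can be sketched as a small gate in front of the store. This is only an illustration in Python under stated assumptions: the function name receive and the "ACK"/"REJECT" reply codes are invented here, and the standard-library parser used checks well-formedness only, so a real message store would need a validating parser to confirm the message against its DTD.

```python
import xml.etree.ElementTree as ET


def receive(message_text):
    """Acknowledge a message only if the (already decrypted) text parses.

    Returns a (status, detail) pair. The status codes are hypothetical.
    """
    try:
        # ElementTree checks well-formedness, not validity against a DTD.
        ET.fromstring(message_text)
    except ET.ParseError as error:
        return ("REJECT", str(error))
    return ("ACK", "")
```

A message whose tags no longer match after transmission is rejected before any receipt is issued, so the sender learns of the damage immediately.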
In time it is anticipated that message stores will mutate to use XML natively. This is not because of XML/EDI directly but because message stores that know how to identify, search for and process objects within multimedia streams or business messages will be required for a wide range of application scenarios.
Based on ad-hoc, learned or profiled information, search engines will recognize XML/EDI specific tagging and be able to reference suitable private and public message stores, using standard WWW interfacing, to extract data intelligently. This will allow for the best combination of free-text and fielded search. Catalogs and buyer agents will be among the first to use XML/EDI technology in this way.
XML/EDI will use a mix of today's X.500 technology, security certificates, "yellow pages", Email look-up, and verified characteristics of entities. Establishing who you are dealing with is a critical component of doing business by any means, and all the more so by electronic means. Subsystems will undoubtedly develop along these lines: they will have to support XML/EDI interfacing of basic CRUD functions (Create, Read, Update, Delete) as a minimum. XML/EDI Data Manipulation Agents shall be able to draw upon these resources to validate transactions.
The following stages are involved in using XML for the interchange of commercial EDI messages:
Identification of data sets for interactive electronic commerce will often be the responsibility of industry associations and various standardization bodies such as UN/EDIFACT and EBES (the European Board for EDI standardization).
Whereas existing EDI definitions are primarily concerned with the way in which a set of fields forms a message, the concepts required for XML/EDI are based more on the definition of independent classes of information that can be combined together with other classes of information to form interchangeable messages. As such the concepts are more akin to the idea of a Basic Semantic Repository (BSR) being proposed by ISO, and of the Business Systems Interconnection (BSI) proposal from the University of Melbourne.
There is, however, one basic difference between using XML/EDI for defining data classes and using the BSR or BSI methodologies. In XML/EDI the order and number of subclasses of a data class can be altered by users without having to formally register that fact with any centralized organization. For example, if it was necessary for an application to separate building numbers or names from information about the street the building is located within, XML/EDI would allow users to define two new subclasses that would be combined to provide the information needed for an existing EDI address component.
One of the advantages that accrues from XML/EDI's ability to subclass fields is that such fields can be developed interactively using information supplied from more than one location. For example, telephone order processing systems in today's world of interactive electronic commerce often start by asking users for their postcode. This tells the system which region, town and street the user is located in, but not which building they are in. To find this out you need to ask the user for a number or name that uniquely identifies the building within the street identified by the postcode. Using these two related pieces of information it is possible to interactively complete a class of information, an address, that can then be shared by an order, its delivery note, and the invoice required for settlement.
Once information has been captured, and used to create an instance of the relevant class of data, it should not be necessary to recreate the information each time it is required. All that should be needed is that processes that need this information reference the point at which the data was originally captured - the address associated with the order for the goods.
Messages that pass between systems will typically conform to a previously agreed XML document type definition (DTD) that formally describes, in terms interpretable by both humans and computers, an internationally accepted message type. XML DTDs can be developed by:
XML DTDs will typically be stored in separate files, which can be referenced, as an XML external subset, by those wishing to use it through the Internet Uniform Resource Locator that its originator has assigned to a publicly available copy of the data. Alternatively, if public access is to be restricted, the document type definition can be stored as an internal subset of the information within the message.
Where the document type definition is based on classes of information shared by more than one message, each class of information can be defined in a separate file, these files being referenced in a suitable sequence from within the external or internal subset of the XML DTD.
For example, an XML DTD could have the form:
<!ENTITY % address SYSTEM
    "http://www.myco.org/messages/XML/address.xml" >
<!ENTITY % items SYSTEM
    "http://www.edifact.org/messages/XML/items.xml">
<!ENTITY % data "(#PCDATA)">
<!ELEMENT order (order-no, deliver-to, invoice-to, item+) >
<!ELEMENT order-no %data; >
<!ELEMENT deliver-to (address) >
<!ELEMENT invoice-to (address) >
<!--Import standard address class-->
%address;
<!--Import standard item class-->
%items;
Note that the source of each class of information is identified not in the call to the class itself (%address;) but within a formal definition of the data storage entities required to process the document which the class definition references (e.g. the first four lines of the DTD). This technique allows files to be moved without having to change the main definition of the DTD.
Typically the entity definitions will be stored outside the DTD, which will contain a reference to the URL of the point at which the latest details of library file locations can be found. For example:
<!ENTITY % library SYSTEM
    "http://www.myco.org/messages/XML/library.ent">
%library;
<!ELEMENT order (order-no, deliver-to, invoice-to, item+) >
<!ELEMENT order-no %data; >
<!ELEMENT deliver-to (address) >
<!ELEMENT invoice-to (address) >
<!--Import standard address class-->
%address;
<!--Import standard item class-->
%items;
where %library; includes the entity definitions given at the start of the previous example.
XML is currently (September 1997) being extended to provide facilities for ensuring that data modules taken from libraries do not introduce name clashes in their elements. The names of elements within each module can be qualified by a module (namespace) identifier. Each namespace identifier can be associated with a URL that uniquely identifies where the module is formally defined. For example, the contents of the library file referenced above could be defined as:
<?XML-NAMESPACE HREF="http://www.ebes.org/XML/EDI-address.xml" AS="address"?>
<?XML-NAMESPACE HREF="http://www.fora-a.org/XML/order-items.xml" AS="items"?>
<!ENTITY % data "(#PCDATA)">
<!ENTITY % address "
  <!ELEMENT address (address:company, address:street, address:town,
                     address:region, address:postcode) >
  <!ATTLIST address id ID #IMPLIED >
  <!ELEMENT address:company %data; >
  <!ELEMENT address:street %data; >
  <!ELEMENT address:town %data; >
  <!ELEMENT address:region %data; >
  <!ELEMENT address:postcode %data; >
  <!ELEMENT same-as EMPTY>
  <!ATTLIST same-as idref IDREF #REQUIRED >
">
<!ENTITY % items "
  <!ELEMENT item (item:identifier, item:name, item:quantity)>
  <!ELEMENT item:identifier %data; >
  <!ELEMENT item:database-key %data; >
  <!ELEMENT item:EAN %data; >
  <!ELEMENT item:name %data; >
  <!ELEMENT item:quantity %data; >
">
XML permits entities and attributes that are defined in the external subset to be redefined in the internal subset. This facility allows XML/EDI users to develop locally significant subclasses. It can also be used to create subsets of messages by removing unused fields from the data model.
For example, the internal subset of a DTD based on the above standardized DTD could contain the following local redefinition for the %items; parameter entity:
<!ENTITY % items "
  <!ELEMENT item (item:identifier, item:name, item:quantity)>
  <!ELEMENT item:identifier (item:database-key?, item:EAN) >
  <!ELEMENT item:database-key %data; >
  <!ELEMENT item:EAN %data; >
  <!ELEMENT item:name %data; >
  <!ELEMENT item:quantity %data; >
">
In this case the optional item:database-key field can contain a direct pointer to the database entry from which the EAN and associated product name were obtained. This key could be used by a DataBot to process the item information directly, without having to use the EAN normally provided in the identifier field as the basis for a slower-to-process database query.
An XML/EDI interactive electronic commerce message consists of a pointer to the document type definition, any definitions required in the internal subset of the DTD, and specified values for each of the fields required for the message. For example, the following message could conform to the external DTD shown above:
<!DOCTYPE order SYSTEM "http://www.myco.org/messages/XML/message1.xml" [
  <!ELEMENT identifier (database-key?, EAN) >
  <!ELEMENT database-key %data; >
  <!ELEMENT EAN %data; >
]>
<order>
  <order-no>123456</order-no>
  <deliver-to>
    <address id="SGML154">
      <address:company>The SGML Centre</address:company>
      <address:street>29 Oldbury Orchard</address:street>
      <address:town>Churchdown</address:town>
      <address:region>Glos.</address:region>
      <address:postcode>GL3 2PU</address:postcode>
    </address>
  </deliver-to>
  <invoice-to>
    <same-as idref="SGML154"/>
  </invoice-to>
  <item>
    <item:identifier>
      <item:database-key>get151235</item:database-key>
      <item:EAN>15356378797</item:EAN>
    </item:identifier>
    <item:name>Special Offer 16</item:name>
    <item:quantity>12</item:quantity>
  </item>
</order>
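A receiving application has to follow each same-as pointer back to the element that carries the matching id. The sketch below shows one way this might be done in Python. It is an assumption-laden illustration rather than the XML/EDI method: the function name resolve_same_as is invented, the sample drops the namespace prefixes and the DOCTYPE so that the standard-library parser will accept it, and a dangling idref simply raises KeyError.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the order message: prefixes and DOCTYPE removed.
MESSAGE = """<order>
  <order-no>123456</order-no>
  <deliver-to>
    <address id="SGML154">
      <company>The SGML Centre</company>
      <postcode>GL3 2PU</postcode>
    </address>
  </deliver-to>
  <invoice-to><same-as idref="SGML154"/></invoice-to>
</order>"""


def resolve_same_as(root):
    """Map the tag of each parent of a same-as element to its target."""
    # Index every element that carries an id attribute.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    resolved = {}
    for parent in root.iter():
        for child in parent:
            if child.tag == "same-as":
                resolved[parent.tag] = by_id[child.get("idref")]
    return resolved


order = ET.fromstring(MESSAGE)
invoice_address = resolve_same_as(order)["invoice-to"]
```

Here invoice_address is the very element object held under deliver-to, so the address is stored once and shared rather than copied.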
XML/EDI messages can be validated by a validating XML document instance processor (known as an XML parser) to ensure they contain all required elements from the specified data set, and that the fields are in the required sequence. Such a parser does not, by itself, check the contents of individual fields. XML elements can, however, be assigned attributes that point to processors that can undertake such validity checks. This can be done either by associating notation processors with an element, or by associating an ECMAScript specification with the element.
The basic XML language allows user-defined notation processors to be used to validate the contents of specific XML elements. This is done by adding definitions of the following form to the external or internal subset of the DTD:
<!NOTATION EAN-validator SYSTEM "http://www.myco.org/messages/validate/EAN.cgi">
...
<!ATTLIST EAN check NOTATION (EAN-validator) #FIXED "EAN-validator">
The predefined check attribute of the EAN element will cause the contents of the element to be passed to the program identified by the declaration for the notation assigned the local name EAN-validator, which is stored at the location indicated by the URL given in the notation declaration. This processor would typically pass back a message indicating whether or not the EAN is valid within the context of the relevant message.
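What such a notation processor checks is left open by these guidelines. As one possibility, the Python sketch below applies the standard EAN-13 modulo-10 check-digit rule. The function names ean13_is_valid and validator_reply, and the "valid"/"invalid" reply strings, are assumptions made for this illustration. A real validator might also look the number up in a product database.

```python
def ean13_is_valid(code):
    """Return True if code is 13 decimal digits with a correct check digit."""
    if len(code) != 13 or not all(ch in "0123456789" for ch in code):
        return False
    digits = [int(ch) for ch in code]
    # Weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


def validator_reply(code):
    """Hypothetical reply message passed back to the calling application."""
    return "valid" if ean13_is_valid(code) else "invalid"
```

Under this rule a 13-digit number such as 4006381333931 is accepted, whereas any value that is not exactly 13 digits long, including the 11-digit sample value used in the order message, is rejected.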
The XML Style Language (XSL) provides an alternative, and more generally applicable method. Details of this method are given below under the heading "Processing messages".
Data captured in XML/EDI messages can be exchanged:
Where conversion into a known EDIFACT format is required, the DTD can be extended to provide additional attributes that can guide the transformation process. For example, the following additional properties could be added to the list of attributes assigned to the EAN element:
<!ATTLIST EAN check NOTATION (EAN-validator) #FIXED "EAN-validator" EDI-prefix CDATA #FIXED "LIN+1++" EDI-suffix CDATA #FIXED ":EN'" >
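One way a converter might use these fixed attributes is simply to concatenate prefix, element content and suffix. The ECMAScript sketch below assumes that reading; the function name toEdifactSegment and the shape of the attributes object are hypothetical, and escaping of characters reserved by EDIFACT is not handled.

```javascript
// Hypothetical helper: wraps element content in the fixed EDI-prefix and
// EDI-suffix attribute values declared for that element in the DTD.
// A missing prefix or suffix is treated as an empty string.
function toEdifactSegment(attributes, content) {
  const prefix = attributes["EDI-prefix"] || "";
  const suffix = attributes["EDI-suffix"] || "";
  return prefix + content + suffix;
}

// Attribute values taken from the ATTLIST declaration for EAN above,
// applied to the EAN value from the sample order.
const eanAttributes = { "EDI-prefix": "LIN+1++", "EDI-suffix": ":EN'" };
const segment = toEdifactSegment(eanAttributes, "15356378797");
// segment is "LIN+1++15356378797:EN'"
```

Keeping the prefixes and suffixes in the DTD rather than in the converter means the same small routine can serve every element that carries these attributes.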
The way in which a received message would be processed would depend on which of the available methods for exchanging messages was chosen. If the message was received in a format that provided the XML/EDI message generated by the originator, the XML Style Language (XSL) can be used to associate different processes with individual element classes so that elements can be processed by one or more local processors.
XML/EDI message instances are specifically designed to make the selection of data fields and classes at the receiver as easy as possible. Each field starts with a "start-tag" that clearly identifies the class of the following data or embedded subelement set, and specifies any non-default properties to be associated with the data. The end of each data element is clearly identified by an "end-tag", which consists of the name of the element (class) preceded by a slash between a matched pair of outward pointing angle brackets. Fields that contain no data, and no embedded subelements (e.g. fields that are only present to point to other data sources) have the slash indicating their end point immediately before the last angle bracket of the start-tag rather than immediately after the first one of the end-tag. (See the example for the <same-as/> element above.)
Classes that contain subclasses of information have embedded
elements between their start-tag and end-tag.
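The three tag forms described above can be told apart by looking only at where the slash sits. The following ECMAScript sketch illustrates this; classifyTag is a hypothetical name, and the function inspects a single tag string rather than parsing a whole document.

```javascript
// Classifies one tag string as "start", "end" or "empty" according to the
// position of the slash. Declarations (<!...>), processing instructions
// (<?...?>) and anything not delimited by angle brackets yield null.
function classifyTag(tag) {
  if (!tag.startsWith("<") || !tag.endsWith(">")) {
    return null;
  }
  if (tag.startsWith("<!") || tag.startsWith("<?")) {
    return null;
  }
  if (tag.startsWith("</")) {
    return "end"; // slash immediately after the first angle bracket
  }
  if (tag.endsWith("/>")) {
    return "empty"; // slash immediately before the last angle bracket
  }
  return "start";
}
```

For example, classifyTag('<same-as idref="SGML154"/>') returns "empty", while classifyTag("</address>") returns "end".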
XSL allows sets of actions to be associated with particular XML elements. Actions can be defined in terms of values to be assigned to a set of data presentation attributes (styles), or in terms of a data processing script that users can define using a define-script object. XSL scripts are written in ECMAScript, the standardized version of JavaScript.
Which actions are associated with which elements can be defined using XML element sets known as XSL rules. A simplified set of style-rules allows presentation properties to be applied to element classes. Rules can be associated with elements that have been assigned a unique identifier (id) attribute or that have been assigned a particular value for a class attribute.
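Since the XSL specification was not public when this guide was written, the exact matching semantics are not given here. Purely as an illustration, the ECMAScript sketch below shows how a processor might test whether a rule selects an element by unique identifier, by class value, or by element type. The object shapes, the property names (id, className, targetElement) and the function name ruleMatches are all assumptions made for this example.

```javascript
// Hypothetical data shapes, chosen for illustration only:
//   rule    = { id } or { className } or { targetElement }
//   element = { type: "address", attributes: { id: "SGML154", class: "..." } }
// Returns true when the rule's selector matches the element.
function ruleMatches(rule, element) {
  const attributes = element.attributes || {};
  if (rule.id !== undefined) {
    return attributes.id === rule.id; // one particular instance
  }
  if (rule.className !== undefined) {
    return attributes["class"] === rule.className; // all elements of a class
  }
  if (rule.targetElement !== undefined) {
    return element.type === rule.targetElement; // all elements of a type
  }
  return false; // a rule without a selector matches nothing
}
```

The sketch deliberately says nothing about which rule wins when several match the same element, because the guide does not define that.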
Sets of rules and actions can be defined in macros. Macros can be associated with style processing attributes associated with specific instances of an element. The default set of style properties defined in XSL can be extended using define-style objects.
The component parts of an XML Style Sheet can be:
A typical XML/EDI XSL description will contain:
<define-script> element that contains ECMAScript definitions of the variables and functions required to process the document (in addition to the default function set provided by XSL)
<define-macro> elements that provide named sets of predefined actions
<define-style> elements that define properties that are to be used to control style processing
<rule> elements that contain within them:
    <target-element> that indicates which type of element the rule is to apply to, or
    <id> element that identifies the unique identifier of the particular instance of an element the rule applies to, or
    <class> element that identifies which class of elements the rule is to apply to
    <element> element that defines ancestors of the targeted element that must be present for the rule to apply (<element> elements surround the targeted element definition)
    <attribute> element that identifies which attributes the selected element(s) must have before the rule applies
    <invoke-macro> element, which may have embedded within it a set of <arg> (argument) control elements, that indicates which macros are to be associated with the rule
<style-rule> elements that show which presentation styles should be associated with particular element types/classes/instances.
XSL actions are typically associated with the way in which objects should be presented to users. This process is typically controlled through the use of flow objects. XSL provides two default sets of flow objects, one based on the elements typically found in HTML files, and the other based on the flow objects defined in ISO/IEC 10179 (DSSSL). The set of DSSSL flow objects supported by XSL includes:
The <eval> element can be used to indicate points at which macros and scripts are to be evaluated as a result of applying a rule.
Note: Unfortunately the XSL specification has not yet been made public as it is still being reviewed by the XML Special Interest Group. When it is published a pointer to the full specification will be placed here.
For an example of the use of XSL specifications refer to Appendix A.
The XML link process can be used to associate XML/EDI rules with a file. Normally the Simple Link format will be used to identify one or more files containing the relevant rules. Typically this will result in an element of the following form being added to the start of the document instance:
<rules-template XML-LINK="SIMPLE" ROLE="xml/edi-rules" HREF="http://www.myco.org/XML/EDI/Rules/orders.xml" TITLE="Rules for processing orders" SHOW="EMBED" ACTUATE="AUTO"/>
Note: The XML-LINK, ROLE, SHOW and ACTUATE attributes would typically be defined as default values in the associated DTD. They are shown in this example to illustrate the type of information that gets associated with an XML/EDI rule link. The TITLE attribute is optional. It provides some text that users can click on to display the relevant rules file.
The following statement of the current role of EDI in Book Ordering was made to the European Board of EDI Standardization by the UK Book Industry Communication (BIC) manager, Brian Green, in May 1997:
"The nature of the book trade has encouraged its adoption of various forms of Electronic Commerce over the last 20 years. The introduction of a national UK standard book numbering system in the 1960's and an international standard (ISBN) in the early 70's together with central catalogues of books in print in nearly all countries was essential for an industry where even the smallest retail outlet offered customers the facility to order any one of around 600,000 books currently in print (in the UK) from 20,000 publishers with, currently, one hundred thousand new titles appearing every year. There was no hub in the traditional sense since, although WH Smith in the UK has always had a large market share, the number of book titles stocked is relatively low and they have not, until very recently, been much concerned with customers special orders.
In the late 1970's, the UK book trade set up Teleordering as a centralized ordering service using a simple non-standard order format, providing dedicated terminals on which booksellers simply keyed quantity and ISBN (their location number was installed on the form as a default). The orders were polled overnight by Teleordering and automatically routed to the correct publisher either electronically or, in the case of small publishers, by mail or fax. The bookseller received a basic confirmation of receipt of the order by Teleordering with an indication from the Teleordering database whether the book was recorded as available or out of print. Today TeleOrdering has an annual throughput of some 27 million orders, runs on PC's and is owned by J Whitaker & Sons who also publish a 'books in print' CD-ROM and provide a sales data monitoring service. Teleordering has also established itself as an EDI VAN with a full range of Tradacoms and EDIFACT messages. The two services run side by side and will convert the non-standard Teleordering format orders coming from booksellers to EDIFACT or Tradacoms for transmission to publishers.
Similar services were set up in other European countries, the US, Canada etc., although the UK service has always been the largest in the world.
A second book trade EDI service, called First EDItion was set up in 1992 in the UK. This is a pure EDI service based on INS and is particularly strong in the library sector. Both First EDItion and Teleordering are being used for international trade, mainly between UK publishers and European wholesalers who, e.g. in Netherlands and Germany, operate their own dedicated electronic ordering services for booksellers in their countries. First Edition has announced that it will introduce a book trade service based on GE's "TradeWeb", which offers a forms-based Internet service linking to the GEIS VAN.
There has been an interesting 'light EDI' scheme running in the UK for the last four years. Following publication of the book trade Tradacoms messages by Book Industry Communication, the UK book trade EDI body, the major UK wholesalers, who had until then been offering dedicated electronic ordering services, decided to collaborate in a service called BUYLINE. They provided all their bookseller customers, at a nominal cost, with simple forms-based ordering software that links in with either the 'book bank' books in print CD-ROM or a wholesaler's own stockist, enabling the bookseller to select the books required and choose their supplier from a pull down list. BUYLINE includes communications software that dials up the selected supplier and transmits the order in Tradacoms format. The software will also accept Tradacoms acknowledgments and present these to the user in a simple user-friendly format. The rights in this product have now reverted to the systems house, Triptych, who developed it and they are extending the service to the major distributors as well as wholesalers. Their software is also included in a number of the book shop computer systems. It is generally expected that the BUYLINE system will migrate to EDIFACT and use Internet rather than direct dial up communications in due course.
A further development is the regular monthly production of multimedia CD-ROM stock catalogues by major European wholesalers. Most of these allow users to build order files and output them in EDI formats, normally using direct dial-up. It is anticipated that data compression and increased bandwidth will soon allow these facilities to be available over Internet. An important point, however, is that BIC in the UK and EDItEUR in Europe have managed to produce a consensus on the book trade implementation of the messages that ensures that all recent services use standard message formats."
BIC feel that trials of standard forms freely available over the Internet, outputting EDIFACT messages to any trading partner able to receive them, would be very helpful.
The HTML form shown in Figure A.1 has been designed for input of an order for up to two different books using the EDItEUR Book Ordering Message. The values entered into the fields on the form are the values used in the example EDI message provided for the form in the EDItEUR EDI Implementation Guidelines for Book Trade Distribution.
Figure A.1: HTML form for capturing EDItEUR Lite-EDI Book Order Messages
Figure A.2 shows how XML could be used to code a form whose appearance would be equivalent to the HTML form shown in Figure A.1.
<!DOCTYPE Book-Order PUBLIC "-//EDItEUR//DTD Book Order Message//EN"> <Book-Order Supplier="4012345000094" Send-to="http://www.bic.org/order.in"> <rules-template HREF="http://www.bic.org.uk/XML/EDI/Rules/orders.html"/> <title>EDItEUR Lite-EDI Book Ordering</title> <Order-No>967634</Order-No> <Message-Date>19961002</Message-Date> <Buyer-EAN>5412345000176</Buyer-EAN> <Order-Line Reference-No="0528837"> <ISBN>0316907235</ISBN> <Author-Title>Labaln, Brian/Chrome</Author-Title> <Quantity>2</Quantity> </Order-Line> <Order-Line Reference-No="0528838"> <ISBN>0856674427</ISBN> <Author-Title>Parry, Linda (ed)/William Morris</Author-Title> <Quantity>1</Quantity> </Order-Line> <input type="checkbox" name="partial" value="allowed"/> <text>Tick here if a delayed/partial supply of order is acceptable </text> <input type="checkbox" name="confirmation" value="requested"/> <text>Tick here if Confirmation of Acceptance of Order is to be returned by e-mail </text> <input type="checkbox" name="DeliveryNote" value="required"/> <text>Tick here if e-mail Delivery Note is required to confirm details of delivery </text> <E-Address>E-mail address: <input name="e-address" size="25"></input> </E-Address> <Language>Please respond in: <select name="response-language"> <option value="EN" selected="selected">English</option> <option value="FR">Français</option> <option value="DE">Deutsch</option> <option value="ES">Español</option> <option value="IT">Italiano</option></select></Language> <input type="submit" value="Press here to send completed form to supplier"/> </Book-Order>
Figure A.2: XML encoding of Book Order Message
A typical reaction to seeing such a form is "Where has all the EDI information gone?". The answer is that all immutable information goes into the document type definition (DTD) referenced in the <!DOCTYPE statement that starts the coding. Figure A.3 shows the contents of this DTD. A single line reference to this DTD is sufficient to provide the browser with all the additional information it needs to process the message.
Note how the definition of each element defined in Figure A.3 contains attributes whose fixed values contain the prefixes and suffixes of each of the EDIFACT fields that need to be generated in response to the messages.
The message format generated for the completed form could be a pure EDIFACT message of the type shown on Page II-2-2 of the EDItEUR EDI Implementation Guidelines for Book Trade Distribution.
<!DOCTYPE Book-Order [ <!--XML-conformant DTD for EDItEUR Book Order Message. Version 1.0 - Created 1st July 1997 by M. Bryan from The SGML Centre This DTD should be referenced using the following public identifier: PUBLIC "-//EDItEUR//DTD Book Order Message//EN" --> <!--Entities referenced within DTD--> <!--Support information elements are designed to supply information that can be used to control the processing of the message.--> <!ENTITY % support-info "(E-Address|Language|text|input|select)*" > <!--Entities used to datatype attribute values--> <!--Uniform Resource Locator identifier. Contents of attribute must provide a valid HTTP or MAILTO address conforming to IETF RFC 822--> <!ENTITY % URL "CDATA" > <!--EAN location code. Number that uniquely identifies suppliers/purchasers. --> <!ENTITY % EAN "NUMBER"> <!--Formal EDIFACT definition of datatype. May be used by EDI-compliant browsers to validate the data entered by users prior to acceptance when a user attempts to move to another field. --> <!ENTITY % EDItype "NAME" > <!--Message Content element declarations--> <!--Book Order element: Purpose: Container for message fields and support information.
Attributes: EDI-Prefix formally identifies type of message EDI-Suffix contains strings to be output at end of message Send-to identifies Uniform Resource Locator (URL) for site to which EDIFACT message is to be sent for processing Supplier contains unique EAN that identifies supplier --> <!ELEMENT Book-Order (rules-template?, title?, Order-No, Message-Date, Buyer-EAN, Order-Line+, %support-info;) > <!ATTLIST Book-Order EDI-Prefix CDATA #FIXED "UNH+ME00579+ORDERS:D:93A:UN:EAN007" EDI-Suffix CDATA #FIXED "UNS+S'CNT+2:2'UNT+18+ME00579" Send-to %URL; #REQUIRED Supplier %EAN; #REQUIRED > <!--Rules-template element: Purpose: To indicate which set of rules should be used to process the component parts of the message --> <!ELEMENT rules-template EMPTY> <!ATTLIST rules-template XML-LINK CDATA #FIXED "SIMPLE" ROLE CDATA "xml/edi-rules" HREF CDATA #REQUIRED TITLE CDATA #IMPLIED SHOW (EMBED|REPLACE|NEW) "EMBED" ACTUATE (AUTO|USER) "AUTO" > <!--Title element: Purpose: Used to provide supplier dependent title for form: Title can be displayed in window header or at top of form, or in both locations --> <!ELEMENT title (#PCDATA) > <!--Order Number element: Purpose: Allows users to assign unique number to their order. Attributes: EDI-Prefix formally identifies type of message Datatype identifies format that contents must conform to Size indicates width of box to be used to capture input Title indicates text to precede box --> <!ELEMENT Order-No (#PCDATA) > <!ATTLIST Order-No EDI-Prefix CDATA #FIXED "BGM+220+" Datatype %EDItype; #FIXED "C8" Size NUMBER #FIXED "8" Title CDATA "Book Order No:" > <!--Message Date element: Purpose: To indicate date order was placed. Date must be entered in ISO 8601 format without separators, e.g.
CCYYMMDD Attributes: EDI-Prefix formally identifies type of message EDI-Suffix identifies data to immediately follow contents Datatype identifies format that contents must conform to Size indicates width of box to be used to capture input Title indicates text to precede input field Comment contains explanatory text to follow the input field --> <!ELEMENT Message-Date (#PCDATA) > <!ATTLIST Message-Date EDI-Prefix CDATA #FIXED "DTM+137+" EDI-Suffix CDATA #FIXED ":102" Datatype %EDItype; #FIXED "Date" Size NUMBER #FIXED "12" Title CDATA "Message Date:" Comment CDATA "Enter dates in CCYYMMDD format" > <!--Buyer EAN identifier element: Purpose: To identify the unique EAN assigned to the purchaser. Attributes: EDI-Prefix formally identifies type of message EDI-Suffix identifies data to immediately follow contents Datatype identifies format that contents must conform to Size indicates width of box to be used to capture input Title indicates text to precede box --> <!ELEMENT Buyer-EAN (#PCDATA) > <!ATTLIST Buyer-EAN EDI-Prefix CDATA #FIXED "NAD+BY+" EDI-Suffix CDATA #FIXED "::9" Datatype %EDItype; #FIXED "C13" Size NUMBER #FIXED "13" Title CDATA "Buyer EAN:" > <!--Order line element: Purpose: Container for objects used to order book. Attributes: EDI-Prefix formally identifies type of message Line-no is calculated by system to be 1 + number of preceding order lines within file. Ref-Prefix identifies EDI prefix for reference number Reference-no uniquely identifies each line. Number is supplied by supplier's system with input file. 
--> <!ELEMENT Order-Line (ISBN, Author-Title, Quantity) > <!ATTLIST Order-Line EDI-Prefix CDATA #FIXED "LIN+" Line-no NUMBER #IMPLIED Ref-Prefix CDATA #FIXED "#RFF+LI:" Reference-No NUMBER #REQUIRED > <!--ISBN element: Purpose: To enter unique ISBN of book to be ordered Attributes: EDI-Prefix formally identifies type of message EDI-Suffix identifies data to immediately follow contents Datatype identifies format that contents must conform to Size indicates width of box to be used to capture input Title indicates text to precede box --> <!ELEMENT ISBN (#PCDATA) > <!ATTLIST ISBN EDI-Prefix CDATA #FIXED "PIA+5+" EDI-Suffix CDATA #FIXED ":IB" Datatype %EDItype; #FIXED "N12" Size NUMBER #FIXED "12" Title CDATA "ISBN:" > <!--Author and Title element: Purpose: Optional statement of author and title details to confirm correct ISBN has been entered. Attributes: EDI-Prefix formally identifies type of message Datatype identifies format that contents must conform to Size indicates width of box to be used to capture input Title indicates text to precede box --> <!ELEMENT Author-Title (#PCDATA) > <!ATTLIST Author-Title EDI-Prefix CDATA #FIXED "IMD+F+BST+:::" Datatype %EDItype; #FIXED "C60" Size NUMBER #FIXED "40" Title CDATA "Author/Title:" > <!--Quantity element: Purpose: To identify the number of copies required. Attributes: EDI-Prefix formally identifies type of message Datatype identifies format that contents must conform to Size indicates width of box to be used to capture input Title indicates text to precede box --> <!ELEMENT Quantity (#PCDATA) > <!ATTLIST Quantity EDI-Prefix CDATA #FIXED "PQTY+21:" Datatype %EDItype; #FIXED "N2" Size NUMBER #FIXED "2" Title CDATA "Quantity:" > <!--Declarations for message control support elements--> <!--Electronic Address element: Purpose: To capture electronic address to which messages from the supplier to the buyer can be sent. 
--> <!ELEMENT E-Address (#PCDATA|input)* > <!--Language element: Purpose: Container linking text to selection menu. --> <!ELEMENT Language (#PCDATA|select)* > <!--Text element: Purpose: Temporary element required because HTML input has no equivalent of the title element. --> <!ELEMENT text (#PCDATA) > <!--Input, select and option elements: Purpose: As per HTML (temporarily borrowed element). Attributes: As per HTML (temporarily borrowed attributes). --> <!ENTITY % InputType "(TEXT | PASSWORD | CHECKBOX | RADIO | SUBMIT | RESET | FILE | HIDDEN | IMAGE)" > <!ELEMENT input (#PCDATA) > <!ATTLIST input type %InputType; "TEXT" name CDATA #IMPLIED value CDATA #IMPLIED checked (checked) #IMPLIED size CDATA #IMPLIED maxlength NUMBER #IMPLIED src %URL; #IMPLIED align (top|middle|bottom|left|right) "top" > <!ELEMENT select (option+)> <!ATTLIST select name CDATA #REQUIRED multiple (multiple) #IMPLIED > <!ELEMENT option (#PCDATA) > <!ATTLIST option selected (selected) #IMPLIED value CDATA #IMPLIED > <!ENTITY % ISOlat1 SYSTEM "http://www.myco.org/public/entities/ISOlat1.ent" > <!ENTITY % ISOnum SYSTEM "http://www.myco.org/public/entities/ISOnum.ent" > %ISOlat1; %ISOnum; ]>
Figure A.3: XML Document Type Definition for Lite EDI Book Order
The rules file associated with this document could take the following form:
<DEFINE-SCRIPT> ... ECMAScript description of required functions and variables to be added here ... function BookOrderValidationCheck {....} function ISO8601DateCheck {....} function CheckEAN {....} function GetReferenceNo {....} function CheckIfTicked {.... function OutputIfYes{...} function OutputIfNo{...} } </DEFINE-SCRIPT> <DEFINE-MACRO NAME="GetOrderLineNo"> <EMBEDDED-TEXT USE="EnteredData"> <SELECT ANCESTOR="Order-Line"> <EVAL>(node.Reference-No)</EVAL> </SELECT> </EMBEDDED-TEXT> </DEFINE-MACRO> <DEFINE-MACRO NAME="DisplayTickBox"> <EVAL>(CheckIfTicked OutputIfYes(<BOX>&tick;</BOX>) OutputIfNo(<BOX/>)) </EVAL> </DEFINE-MACRO> <DEFINE-STYLE NAME="EnteredData" FONT="Arial" FONT-SIZE="11PT"/> <DEFINE-STYLE NAME="DefaultStyle" FONT="Times" FONT-SIZE="12PT"/> <DEFINE-STYLE NAME="RaisedButton" QUESTION="How?"/> <STYLE-RULE> <TARGET-ELEMENT TYPE="title"/> <APPLY FONT="Times" FONT-SIZE="18PT" FONT-WEIGHT="BOLD" /> </STYLE-RULE> <RULE> <TARGET-ELEMENT TYPE="Order-No"/> <SEQUENCE> <EVAL>BookOrderValidationCheck</EVAL> <TABLE> <TABLE-ROW> <TABLE-CELL USE="DefaultStyle">Book Order No:</TABLE-CELL> <TABLE-CELL USE="EnteredData"><CONTENTS/></TABLE-CELL> </TABLE-ROW> </SEQUENCE> </RULE> <RULE> <TARGET-ELEMENT TYPE="Message-Date"/> <SEQUENCE> <EVAL>ISO8601DateCheck</EVAL> <TABLE-ROW> <TABLE-CELL USE="DefaultStyle">Message Date:</TABLE-CELL> <TABLE-CELL USE="EnteredData"><CONTENTS/></TABLE-CELL> <TABLE-CELL/> <TABLE-CELL USE="DefaultStyle">Enter date in CCYYMMDD format</TABLE-CELL> </TABLE-ROW> </SEQUENCE> </RULE> <RULE> <TARGET-ELEMENT TYPE="Buyer-EAN"/> <SEQUENCE> <EVAL>CheckEAN</EVAL> <TABLE-ROW> <TABLE-CELL USE="DefaultStyle">Buyer EAN:</TABLE-CELL> <TABLE-CELL USE="EnteredData"><CONTENTS/></TABLE-CELL> </TABLE-ROW> <TABLE-ROW> <TABLE-CELL USE="DefaultStyle">Supplier EAN:</TABLE-CELL> <TABLE-CELL USE="EnteredData">4012345000091</TABLE-CELL> </TABLE-ROW> </SEQUENCE> </RULE> <RULE> <ELEMENT TYPE="Order-Line"> <TARGET-ELEMENT TYPE="ISBN"/> </ELEMENT> <TABLE-ROW> <TABLE-CELL USE="DefaultStyle">ISBN:</TABLE-CELL> <TABLE-CELL USE="EnteredData"><CONTENTS/></TABLE-CELL> </RULE> <RULE> <ELEMENT TYPE="Order-Line"> <TARGET-ELEMENT TYPE="Author-Title"/> </ELEMENT> <TABLE-CELL USE="DefaultStyle">Author/Title:</TABLE-CELL> <TABLE-CELL USE="EnteredData"><CONTENTS/></TABLE-CELL> </TABLE-ROW> </RULE> <RULE> <ELEMENT TYPE="Order-Line"> <TARGET-ELEMENT TYPE="Quantity"/> </ELEMENT> <TABLE-ROW> <TABLE-CELL USE="DefaultStyle">Quantity:</TABLE-CELL> <TABLE-CELL USE="EnteredData"><CONTENTS/></TABLE-CELL> <TABLE-CELL/> <TABLE-CELL USE="DefaultStyle">Order line reference number: <INVOKE MACRO="GetOrderLineNo"/> </TABLE-CELL> </TABLE-ROW> </RULE> <RULE> <TARGET-ELEMENT TYPE="Order-Line"/> <EVAL>(...If no more order lines output </TABLE> after end tag ....)</EVAL> </RULE> <RULE> <TARGET-ELEMENT TYPE="input"> <ATTRIBUTE NAME="type" VALUE="checkbox"/> </TARGET-ELEMENT> <PARAGRAPH> <INVOKE MACRO="DisplayTickBox"/> </RULE> <RULE> <TARGET-ELEMENT TYPE="text"/> <CONTENTS/> </PARAGRAPH> </RULE> <RULE> <TARGET-ELEMENT TYPE="E-Address"/> <PARAGRAPH> <CONTENTS/> </RULE> <RULE> <TARGET-ELEMENT TYPE="input"> <ATTRIBUTE NAME="name" VALUE="e-address"/> </TARGET-ELEMENT> <EMBEDDED-TEXT USE="EnteredData"> <EVAL>InputBoxContents</EVAL> </EMBEDDED-TEXT> </RULE> <RULE> <TARGET-ELEMENT TYPE="Language"/> <CONTENTS/> </RULE> <RULE> <ELEMENT TYPE="Language"> <TARGET-ELEMENT TYPE="select"/> </ELEMENT> <BOX> <SCROLL> <EVAL>EmbeddedOptions</EVAL> </SCROLL> </BOX> </PARAGRAPH> </RULE> <RULE> <TARGET-ELEMENT TYPE="input"> <ATTRIBUTE NAME="type" VALUE="submit"/> </TARGET-ELEMENT> <BOX USE="RaisedButton"> <CONTENTS/> </BOX> </RULE>
Figure A.4: XML/EDI Processing Rules for Lite EDI Book Order
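The script functions in Figure A.4 are left as stubs. As an illustration of what one of them might contain, the ECMAScript sketch below checks that a Message-Date value is a real calendar date in the CCYYMMDD form required by the DTD. It is an assumption about the intended behaviour, not the working group's implementation, and the name isValidCcyymmdd is hypothetical.

```javascript
// Hypothetical body for a date check such as ISO8601DateCheck.
// Returns true when `value` is eight digits forming a valid Gregorian
// calendar date in CCYYMMDD order, with no separators.
function isValidCcyymmdd(value) {
  if (!/^\d{8}$/.test(value)) {
    return false;
  }
  const year = Number(value.slice(0, 4));
  const month = Number(value.slice(4, 6));
  const day = Number(value.slice(6, 8));
  if (month < 1 || month > 12 || day < 1) {
    return false;
  }
  // Leap years: divisible by 4, except centuries not divisible by 400.
  const isLeap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  const daysInMonth = [31, isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
  return day <= daysInMonth[month - 1];
}
```

With the value from Figure A.2, isValidCcyymmdd("19961002") returns true, whereas a value containing separators such as "1996-10-02" is rejected.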
DataBots - XML/EDI Data Manipulation Agents ("Bot" is a software term for a component that acts as an agent).
XML/EDI-R - the combination of XML message syntax and rule based EDI.