X12/XML FAQ Sheet

[This local archive copy mirrored from the canonical site: http://www.disa.org/x12/x12xmlfaq.html; links may not have complete integrity, so use the canonical document at this URL if possible.]

ANSI ASC X12/XML FAQ Sheet

DRAFT Version 0.2

Instructions

Please return all comments directly to Christine at [pks@webbernet.net].

Note: This is a working draft and at any point in time should "generally" reflect the discussion on the listserv. However, it may not reflect the actual direction of the listserv, and it is not appropriate to assume that it completely reflects the view of the workgroup.

Purpose

XML may be the means to bridge EDI into Internet Electronic Commerce, by making the existing EDI knowledge base more palatable to the Internet EC developers. Because of this, CommerceNet Consortium, XML/EDI Group and ANSI ASC X12 have entered into a joint project to investigate how to translate ANSI ASC X12 data elements, segments and transactions into XML. The output of the investigation is at least three documents: 1) a FAQ, 2) recommendations and 3) a tutorial.

Background

XML is a new technology that is based on SGML. It allows one to indicate the values of data within a document, such as <price>9.99</price>. These tags, <price>, may be defined on a document by document, application by application, industry by industry or on a global basis. This allows a great amount of flexibility in identifying data and allows XML to mimic other existing proprietary or standard data formats, yet do them in a manner that makes the data more easily transferred between application formats. Much of what today’s EDI translators have to know can be retained in the XML format (DTDs) so that off-the-shelf XML tools can interpret the data structure, bypassing the need for complex applications-specific translators.

Note: These are only a part of the questions that will be in the FAQ by the time we complete the effort.

Question 1 — What does a data-element look like in XML?

A data element consists of a start-tag, some data (or embedded elements) and an end-tag.

An example of a simple data element is:

<part-no>712345123459</part-no>

Additionally the start-tag may be qualified by some processing attributes. For example:

<part-no type="U.P.C. Consumer Package Code (1-5-5-1)"
defn="http://www.uc-council.org/d33-1.htm">
712345123459
</part-no>
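
As a minimal sketch of how an off-the-shelf XML parser sees the qualified element above, the following Python fragment separates the element name, the data and the qualifying attributes. The variable names are illustrative only and are not part of any X12 or XML/EDI proposal.

```python
import xml.etree.ElementTree as ET

# The qualified data element from the example above, as one XML string.
DOC = (
    '<part-no type="U.P.C. Consumer Package Code (1-5-5-1)" '
    'defn="http://www.uc-council.org/d33-1.htm">\n'
    '712345123459\n'
    '</part-no>'
)

element = ET.fromstring(DOC)
name = element.tag                  # the data element name
value = element.text.strip()        # the data, without surrounding whitespace
qualifiers = dict(element.attrib)   # the processing attributes

print(name, value, qualifiers["type"])
```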

Question 2 — What does a segment look like in XML?

There might not be segments in X12 XML. It is possible the XML elements will be self-describing. However, if segments are used, they will look like the following:

<segment-a>
  <part-no database="http://www.ean.org/">123456789</part-no>
  <quantity type="case">1</quantity>
  <price currency="US-dollar">12.50</price>
</segment-a>
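
If segments are kept, a receiving application could read one as a set of named fields. The sketch below assumes the segment layout shown above; the function name read_segment is hypothetical.

```python
import xml.etree.ElementTree as ET

SEGMENT = """
<segment-a>
  <part-no database="http://www.ean.org/">123456789</part-no>
  <quantity type="case">1</quantity>
  <price currency="US-dollar">12.50</price>
</segment-a>
"""

def read_segment(xml_text):
    """Return {child name: (text, attributes)} for one segment element."""
    segment = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text, dict(child.attrib)) for child in segment}

fields = read_segment(SEGMENT)
print(fields["price"])
```

Because each value is looked up by name, the receiver does not depend on the order of the elements inside the segment.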

Question 3 — How do we maintain the relationships between data elements embodied in segment level syntax notes?

You can qualify groups by the same occurrence indicators used for elements. You can also nest groups to a predefined number of levels (the XML default is more than high enough for ANSI ASC X12 needs). In addition, you have another type of "connector" than the sequence connector (the comma). This other connector, "|," indicates either/or situations. For example, one could define the model for an element called "element-b" as follows:

<!ELEMENT element-b ( (a | (b,c)) + | (d?, e+, f*) )>

This model says that "element-b" contains either one or more items, each of which is a single "a" element or a "b"/"c" pair, or a group built around "e" elements. If there is a "b," there must be a "c," and if there is more than one "b," each in turn must have its own "c." If neither "a" nor "b"/"c" occurs, there must be one or more "e" elements, optionally preceded by a single "d" element and optionally followed by any number of "f" elements; once you enter any "f" element you may not enter any more "e" elements. If none of "a," "b/c" or "e" is present, an error is to be signaled.
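
One way to experiment with such a model, sketched here in Python, is to rewrite it as a regular expression over the sequence of child element names. This works only because every name in the example is a single letter; the regular expression follows the declaration literally, and the function name is hypothetical. A validating XML parser would normally do this check for you.

```python
import re

# Each child element name in the example is a single letter, so a sequence of
# children can be written as a string and the content model as a regex:
#   ( (a | (b,c))+ | (d?, e+, f*) )
ELEMENT_B_MODEL = re.compile(r"(?:a|bc)+|d?e+f*")

def is_valid_element_b(children):
    """Return True if the list of child names satisfies the model."""
    return ELEMENT_B_MODEL.fullmatch("".join(children)) is not None

print(is_valid_element_b(["b", "c", "b", "c"]))
print(is_valid_element_b(["e", "f", "e"]))
```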

Question 4 — How do we represent a "mandatory" or "conditional" item in XML?

XML uses "occurrence indicators" in the XML content model to indicate whether an element is mandatory or optional.

If the element’s occurrence indicator is    then the element is
not listed                                  mandatory and may only occur once at the point indicated in the model
?                                           optional and not repeatable
*                                           optional and repeatable
+                                           mandatory and repeatable

For example, the following model for a message could be declared in XML:

<!ELEMENT message-x (header, metadata*, segment-a+, signature?)>

In this case, the message must consist of a compulsory header, which may optionally be followed by any number of metadata statements. The message must contain at least one segment of type a, but can contain as many of these segments as is required, and may be qualified by a single signature.
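
The occurrence rules can be sketched as a small checker. The Python below handles only flat, comma-separated sequence models such as the one for message-x (no nested groups or "|" connectors), and the function names are hypothetical.

```python
import re

# Occurrence indicator -> (minimum, maximum) occurrences; None means unbounded.
OCCURRENCE = {"": (1, 1), "?": (0, 1), "*": (0, None), "+": (1, None)}

def parse_model(model):
    """Split a flat sequence model like 'header, metadata*' into
    (name, minimum, maximum) triples."""
    items = []
    for token in model.split(","):
        match = re.fullmatch(r"\s*([\w.-]+)([?*+]?)\s*", token)
        if match is None:
            raise ValueError("unsupported model item: %r" % token)
        name, indicator = match.groups()
        items.append((name, *OCCURRENCE[indicator]))
    return items

def conforms(children, model):
    """Check a list of child names against a flat sequence model."""
    position = 0
    for name, minimum, maximum in parse_model(model):
        count = 0
        while (position < len(children) and children[position] == name
               and (maximum is None or count < maximum)):
            position += 1
            count += 1
        if count < minimum:
            return False
    return position == len(children)

MODEL = "header, metadata*, segment-a+, signature?"
print(conforms(["header", "segment-a"], MODEL))
print(conforms(["header", "metadata"], MODEL))
```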

Question 5 — What are the best alternatives for "typing" (nnn, aaa, length 0-5, etc.) elements?

XML has no typing mechanisms. ISO 10744, which defines extensions to SGML in an annex, has a formal method for defining lexical types based on a formal definition of a lexical model and an attribute called lextype. (There is also an informal mechanism based on the use of lextype comments, but this won't work in XML because comments are not allowed within XML declarations.)

However, one does not need to use these standardized methods, and there may be good reasons why one should not do so. Consider the following definitions:

<!ATTLIST element-y X12-REPR CDATA #FIXED "AN/2/30" >

This associates a fixed attribute, unique to the XML-EDI namespace, which provides a definition of the EDI datatype of element-y.

Using this technique, we can develop an ANSI ASC X12-compliant method for checking the validity of ANSI ASC X12 messages and develop a general purpose program for lexical checking of ANSI ASC X12 messages that can be shared.
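
A shared lexical checker of this kind might look like the following Python sketch. It assumes that the attribute value reads as type/minimum length/maximum length, that "AN" means a string of non-control characters, and that "N0" means an integer; the patterns and function names are illustrative, and lengths are measured on the raw string as a simplification.

```python
import re

# Assumed patterns for two X12 type codes; a real checker would cover
# the full list of data element types.
TYPE_PATTERNS = {
    "AN": r"[^\x00-\x1f\x7f]*",   # string: any non-control characters
    "N0": r"-?[0-9]*",            # integer with an optional leading minus sign
}

def parse_repr(x12_repr):
    """Split a value such as 'AN/2/30' into (type, min length, max length)."""
    type_code, minimum, maximum = x12_repr.split("/")
    return type_code, int(minimum), int(maximum)

def check_value(value, x12_repr):
    """Return True if value matches the type and the length limits."""
    type_code, minimum, maximum = parse_repr(x12_repr)
    if not minimum <= len(value) <= maximum:
        return False
    return re.fullmatch(TYPE_PATTERNS[type_code], value) is not None

print(check_value("ACME CORP", "AN/2/30"))
print(check_value("A", "AN/2/30"))
```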

Question 6 — What would the DTD syntax be for alpha or numeric data only?

There is no "best way." One suggested mechanism is based on POSIX. Another mechanism is based on using URLs to trigger Java lextype analyzers (e.g., <!ATTLIST element-y XML-EDI:definition CDATA #FIXED "lextype.java?alpha[1,3]" >). Other mechanisms based on the use of XSL can also be demonstrated. Alternatively, you can adopt the general-purpose mechanisms defined in ISO 10744 for defining lexical models using SGML. Or, you can say "ANSI ASC X12 messages conform to these data types" and require the use of parameter entities to define this.

Here is yet another solution, shown as three alternative declarations for an element x:

<!-- alphabetic data, 2 to 6 characters -->
<!ELEMENT x (%alpha;)>
<!ATTLIST x min-length NMTOKEN #FIXED "2" max-length NMTOKEN #FIXED "6">

<!-- integer data, 1 to 4 digits -->
<!ELEMENT x (%integer;)>
<!ATTLIST x min-length NMTOKEN #FIXED "1" max-length NMTOKEN #FIXED "4">

<!-- decimal data -->
<!ELEMENT x (%decimal;)>
<!ATTLIST x
min-before-decimal NMTOKEN #FIXED "1"
max-decimal-length NMTOKEN #FIXED "2"
decimal-identifier (point|comma) "point" >

Question 7 — How do we represent EDI loops in XML?

XML models can contain "groups" of elements that are repeatable. In the following example modified from that in question 4, segment-a is replaced with a group consisting of a fixed sequence of three objects.

<!ELEMENT message-x (header, metadata*, (part-no, quantity, price)+,
signature?)>
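
On the receiving side, a repeated group arrives as a flat run of sibling elements, so the application has to regroup them into loop iterations. The Python sketch below assumes the group shown above; the sample values and the function name read_loop are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Sample message with two iterations of the loop; all values are invented.
MESSAGE = """
<message-x>
  <header>PO-1</header>
  <part-no>111</part-no><quantity>2</quantity><price>9.99</price>
  <part-no>222</part-no><quantity>1</quantity><price>12.50</price>
</message-x>
"""

LOOP = ("part-no", "quantity", "price")

def read_loop(message, loop=LOOP):
    """Collect each repetition of the group as a dict, in document order."""
    items = [child for child in message if child.tag in loop]
    if len(items) % len(loop) != 0:
        raise ValueError("incomplete group")
    rows = []
    for start in range(0, len(items), len(loop)):
        group = items[start:start + len(loop)]
        if tuple(child.tag for child in group) != loop:
            raise ValueError("group elements out of order")
        rows.append({child.tag: child.text for child in group})
    return rows

rows = read_loop(ET.fromstring(MESSAGE.strip()))
print(len(rows), rows[0]["part-no"])
```

Wrapping each iteration in its own element, as in the segment-a form of question 4, would make this regrouping step unnecessary.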

Question 8 — What are the options for representing composite data elements in XML?

Groups are the way SGML can be used to create related sets of elements. See question 7.

Question 9 — What are the different ways one may name EDI data elements in XML?

Use the element ID as the name: <a100>fjfj</a100>
Use the data element name: <CurrencyCode>fjfj</CurrencyCode>
Use the structural hierarchy of where the data element resides:
Use an ISO/ITU OID tree-type structure which shows where every element belongs in the hierarchy, such as: <x1.3.1.850.111.100>fjfj</x1.3.1.850.111.100>

where the order of the hierarchy is: ANSI ASC X12, version, release, x850, currency code.

Each is assigned a number such as:
X12 = 1
version = 3
release = 1
850 = 850
CurrencyCode = 100

Another possibility is that the name used for the element can be dependent on local usage and not be linked directly to the source of the definition or the context in which the object was originally used. For example:

<Height.PhysicalDimensions UOM="MR">26</Height.PhysicalDimensions>

In an X12 MEA Segment, this value — MEA03 D.E. 739 (Measurement Value) (in this case "26") might require up to three additional data elements to describe (or "qualify") it: (1) MEA01 D.E. 737 (Measurement Reference ID Code — Code identifying the broad category to which a measurement applies) value "PD" (Physical Dimensions), (2) MEA02 D.E. 738 (Measurement Qualifier — Code identifying a specific product or process characteristic to which a measurement applies) value "HT" (Height), and finally, (3) MEA04-C00101 D.E. 355 (Unit or Basis for Measurement Code — Code specifying the units in which a value is being expressed, or manner in which a measurement has been taken) value "MR" (Meter).
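
The folding of qualifier codes into an element name can be sketched as follows. The lookup tables contain only the two codes quoted above, and the function name mea_to_element is hypothetical; a real mapping would draw on the full code lists for D.E. 737 and D.E. 738.

```python
import xml.etree.ElementTree as ET

# Code-to-name lookups limited to the values quoted in the text above.
REFERENCE_ID = {"PD": "PhysicalDimensions"}   # MEA01, D.E. 737
QUALIFIER = {"HT": "Height"}                  # MEA02, D.E. 738

def mea_to_element(mea01, mea02, mea03, mea04):
    """Fold the two qualifiers into the element name and keep the unit
    of measure (MEA04) as the UOM attribute."""
    name = "%s.%s" % (QUALIFIER[mea02], REFERENCE_ID[mea01])
    element = ET.Element(name, {"UOM": mea04})
    element.text = mea03
    return element

xml_text = ET.tostring(mea_to_element("PD", "HT", "26", "MR"), encoding="unicode")
print(xml_text)
```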

Question 10 — Are XML tags positional? The Unit of Measure, DE355 has 418 code values. How and where would all these code values be carried, and how would validation be carried out? Would the code values have to be passed with every data message, or only in an environment where data input or validation was required? How would you tell the difference between a transmission where validation might be required, and those where it is not?

XML tags are not positional; however, specific data element names may be. For example, in ANSI ASC X12, the N2 segment is only meaningful in the context of the N1 segment that precedes it. An XML tag of <N2> would not necessarily be associated with an <N1> that came immediately before it.

Precisely because of elements such as DE355, which have vast ranges of permitted values, data typing is to be handled by referring to a remote XML document, identified by a URL, which would accommodate the description of each possible code value along with its definition and explanation.

All of the below are valid:

<X12 ver='1' release='3'>
     <po>
          <cur>
               <code>123</code>
          </cur>
     </po>
</X12>

<X12 ver='1' release='3'>
     <po>
          <cur p01="Value" p02="Value" p04="Value"/>
     </po>
</X12>

<X12 ver='1' release='3'>
     <X12-message id="850">
          <seg name="cur" p01="Value" p02="Value" p04="Value"/>
     </X12-message>
</X12>
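
In the two attribute-based forms, the position of each value is carried by the attribute name rather than by its order in the stream. The Python sketch below, with a hypothetical helper read_positions, reads both forms into the same structure.

```python
import xml.etree.ElementTree as ET

def read_positions(element):
    """Return (segment name, {position: value}) for either attribute form.
    The segment name comes from a 'name' attribute if present, else the tag."""
    name = element.get("name", element.tag)
    values = {key: val for key, val in element.attrib.items() if key != "name"}
    return name, values

second = ET.fromstring(
    '<po><cur p01="Value" p02="Value" p04="Value"/></po>')
third = ET.fromstring(
    '<X12-message id="850">'
    '<seg name="cur" p01="Value" p02="Value" p04="Value"/>'
    '</X12-message>')

print(read_positions(second[0]) == read_positions(third[0]))
```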

Question 11 — Does using XML eliminate the need for a translator?

It will once we have the missing bit, which we haven't reached yet: XSL or the XML/EDI template.

The XML Stylesheet Language (XSL) is a generalized mechanism for converting XML objects into a displayable/processable form. It includes ECMAScript, a version of JavaScript/JScript that has been standardized by the European Computer Manufacturers Association. The fact that XSL processing instructions can be attached permanently or temporarily to XML messages within general purpose XML tools, and are themselves XML messages, means that there is no need to invent yet another mechanism for moving processing information from one system to another. The fact that ECMAScript is among the most portable of languages currently available means that we can envisage its support in a wide range of environments. At present, XSL still awaits formal agreement, and still fails to provide some facilities basic to the support of XML/EDI. We are working on this (among many other things). Before we can finalize the spec, however, we need a clear statement of the needs of specific communities in this area for input/output control and database relationships. Hopefully some of this information will be obtained by our analysis of the needs of those in the EDI community, who have many years of experience in just this field.

The XML/EDI Group is also investigating the development of a non-programming method for associating XML information objects with intelligent agents that can carry out specific business processes. Known as templates, these spreadsheet-like creatures will enable us to map a single XML element, or groups of XML elements, to locally accessible agents. At present, their exact format and use are undocumented.

Question 12 — What are the current data types supported by ANSI ASC X12 standards?

See the list of data element types in ANSI ASC X12.6. Examples include:

Signed Decimal
String — Alphanumerics or any non-control character
Date — YYMMDD
Time HHMMSSd..d, or HHMM if seconds and decimals are not desired
Binary — Any possible string of 8 bit bytes

Sizes for these items are set in the individual data element definitions.

Question 13 — What about basic machine concepts like endian-ness, floating point number format, sign extension, significant digits, precision and so on? What piece of software will perform and interpret the mapping rules to translate the XML/EDI data stream into the machine's native representation?

That’s the beauty of using ECMAScript rather than a basic programming language: the script interpreter on the processing machine takes care of all that bother for you automatically.

Question 14 — What about mapping rules? How is the template designer going to know what any one individual receiver's flat-file format looks like, and the decisions that receiver has chosen as to holding numbers as a BCD string, a floating point number or a long integer? Does each target machine's language compiler pad structures to a convenient spatial boundary, and under what circumstances?

The template is a flat file, based on a set of clearly defined semantics. Any tools that can accept XML/EDI templates will have interpreters that understand these semantics. We still have to define these, but we know how the general mechanism works because it is based on well-understood techniques already used in AI systems.

Question 15 — Are you predicating that to receive XML you must be running some kind of relatively intelligent database system that is capable of understanding field/table names and taking care of the mapping for you?

No, the templates will generate standard SQL or ODBC calls that will work with virtually any database. The field naming conventions will be application/industry segment defined. Platform independence is another critical issue. ECMAScript is a cross-platform standard.

Question 16 — How is the XML information routed?

Using any protocol. Normally it will be HTTP'd or FTP'd, but it could equally well be sent as an e-mail attachment, BASE 64 or anything else you can negotiate over your networks.

Question 17 — Does XML get encapsulated into some kind of enveloping structure?

That depends on the negotiation that precedes it.

Question 18 — Is XML sent or received using URLs?

Only in the sense that HTTP can use URLs for address resolution.

Question 19 — Can we map ANSI X12 directly on to XML?

The following is an example of one-to-one mapping of ANSI ASC X12 to XML for a TDS segment ("Total Monetary Value Summary").

     <TDS>
          <TDS01> value1 </TDS01>
          <TDS02> value2 </TDS02>
          <TDS04> value3 </TDS04>
     </TDS>

or even

You can use a single element and allow spaces (or some other character) to delimit values in the lexical analysis of the contents, e.g.,

<TDS>value1 value2 value3</TDS>
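
The one-to-one mapping can be sketched in a few lines of Python. The sketch assumes that "*" is the data element separator of the incoming X12 segment and that the segment terminator has already been removed; in a real interchange the delimiters are agreed between the trading partners. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def segment_to_xml(segment, separator="*"):
    """Map 'TDS*v1*v2**v3' to <TDS><TDS01>v1</TDS01>...</TDS>.
    Empty positions are skipped, so the element numbering shows which
    positions were actually used."""
    tag, *values = segment.split(separator)
    root = ET.Element(tag)
    for position, value in enumerate(values, start=1):
        if value:
            child = ET.SubElement(root, "%s%02d" % (tag, position))
            child.text = value
    return root

xml_text = ET.tostring(segment_to_xml("TDS*value1*value2**value3"),
                       encoding="unicode")
print(xml_text)
```

Because the third position is empty in the sample segment, the result contains TDS01, TDS02 and TDS04, matching the first example above.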

Question 20 — Do I still need a "translator"/database/directory at the other end to make out what the three values, value1, value2, and value3, really are?

Not necessarily. The transportable XML/EDI template will contain the mapping rules for elements, attributes and contents with respect to a database or any other pertinent application. In this respect, building a template for the attribute based example is probably the easiest solution because it involves no loops and you can map directly from named attributes to named database fields.

The ANSI ASC X12 TDS Segment has four positional elements, each D.E. 610 "Amount." That by itself doesn't say much. The semantic notes, which are part of the ANSI ASC X12 standard, and must be complied with, read:

TDS01 is the total amount of invoice (including charges, less allowances) before terms discount (if discount is applicable).

TDS02 indicates the amount upon which the terms discount amount is calculated.

TDS03 is the amount of invoice due if paid by terms discount due date (total invoice or installment amount less cash discount).

TDS04 indicates the total amount of terms discount.

So the ANSI ASC X12 definition clearly tells that the correct XML is:

or, in a shorter form:

Now mapping becomes trivial because we have uniquely named elements.

A comment (which is not part of the standard) attached to the segment says TDS02 is required if the dollar value subject to discount is not equal to the dollar value of TDS01.

This is a semantic constraint that cannot be applied in the XML model, any more than it can in the ANSI ASC X12 model. (SGML provides a mechanism for this, but XML does not support it.)

Question 21 — By using XML, have I lost a lot of context and information, unless I have an ANSI ASC X12 dictionary (or directory, or database, or repository) at my end?

Not if you define the DTD correctly.

Question 22 — Consider an implementation convention for a transaction set which features DE355 in, say, 12 different places. In each place, it has a different but partly overlapping subset (of say, 30 code values). Where would this go?

This is where you need to take advantage of the XML namespace proposals, as illustrated in http://www.sgml.u-net.com/EDIintro.htm

For each position that DE355 is used in a different way, you need to create a different namespace. This needs to be given a name, and be pointed to a different semantic definition. For example, one can define the following namespaces:

<?XML:namespace href="http://www.x12.org/DE355?role1" as="role1"?>
<?XML:namespace href="http://www.x12.org/DE355?role2" as="role2"?>
<?XML:namespace href="http://www.x12.org/DE355?role3" as="role3"?>

One can now define three different elements, each with its own pointer to a lexical type validator:

<!ELEMENT role1:DE355 (#PCDATA)>
<!ATTLIST role1:DE355
XML-EDI:definition CDATA #FIXED "X12-lextypes.java?X12-DE355-set1" >

<!ELEMENT role2:DE355 (#PCDATA)>
<!ATTLIST role2:DE355
XML-EDI:definition CDATA #FIXED "X12-lextypes.java?X12-DE355-set2" >

<!ELEMENT role3:DE355 (#PCDATA)>
<!ATTLIST role3:DE355
XML-EDI:definition CDATA #FIXED "X12-lextypes.java?X12-DE355-set3" >
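
What a lexical type validator might do with these role-specific definitions is sketched below in Python. The code subsets are invented for illustration (only "MR" is taken from this FAQ) and are far smaller than a real implementation convention would allow; the function name is hypothetical.

```python
# Hypothetical subsets of D.E. 355 codes, one per role. The subsets overlap
# in part, as described in the question.
CODE_SETS = {
    "role1": {"EA", "CA", "MR"},
    "role2": {"MR", "KG"},
    "role3": {"CA", "KG", "LB"},
}

def is_valid_code(qualified_name, value):
    """Validate a value against the subset selected by the name's prefix."""
    prefix, _, local_name = qualified_name.partition(":")
    if local_name != "DE355" or prefix not in CODE_SETS:
        raise ValueError("unknown element: %s" % qualified_name)
    return value in CODE_SETS[prefix]

print(is_valid_code("role1:DE355", "MR"))
print(is_valid_code("role3:DE355", "MR"))
```

The same code value can therefore be accepted in one position of the transaction set and rejected in another, with the prefix selecting the applicable subset.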