[This local archive copy mirrored from the canonical site: http://www.xmlxperts.com/nfsae.htm; links may not have complete integrity, so use the canonical document at this URL if possible.]
Authored by Dianne Kennedy, XMLXperts Ltd.
Chairperson SAE J2008 SGML Working Group
Considerable time and resources have been spent in every vertical industry to develop industry standard SGML Document Type Definitions, or DTDs. The aerospace, defense, automotive, telecommunications, semiconductor, railroad, health, scholarly journal and newspaper industries each have their own SGML DTDs. These DTDs set the rules by which data is coded for interchange. Now there is great interest in providing industry standard XML DTDs to facilitate Web delivery of richly encoded data to the desktop. Because industry DTDs encode complex data constructs for a vertical industry, these DTDs tend to be quite complex. A semi-automated method of converting existing DTDs from SGML to XML will not only prove cost-effective from a time and resource perspective, but can also help each industry make the transition to Web delivery in a timely fashion.
In the automotive industry, the development of the SAE J2008 suite of standards was a direct response to the requirements of the 1990 Clean Air Act. Paragraph 202(m)(5) of the Act addresses the requirement for Information Availability and tasks automotive manufacturers to "provide any and all information needed to make use of emission control diagnostics systems including instructions for making emissions related diagnostics and repair."
Availability of vehicle service information is the key element of effective automotive diagnosis and repair, and effective diagnosis and repair is believed to have a direct impact on air quality. Studies have shown that automotive technicians will only use proper service procedures if information access is fast and easy. If information is not readily available, alternate service procedures (which may be less effective) will be employed. Rather than develop information systems specific to emission defects, the mission of SAE J2008 was broadened to accommodate all other vehicle service information as well. In 1997, the mission further expanded to include service information for all on-road vehicles, including heavy trucks and construction equipment.
The SAE J2008 Task Force studied a variety of information modeling and exchange methodologies. Because there is no standardization in the industry in terms of automotive service document specifications, work began with the development of a relational data model. The SGML definition was based on the data model rather than being based on any particular type of service manual.
During 1997, the Draft DTD within SAE J2008 was updated for presentation as a final SAE Standard. Massive changes were made to the DTD in order to support the addition of heavy truck data and to make the DTD reflect the Data Model in the most concise manner possible. Near & Far was used extensively to create this new version of SAE J2008. All members of the SGML Working Group were experienced SGML designers, so the tool was used in the SGML Symbology mode. Using Near & Far Designer, the group could prototype alternate models quickly and easily. The once daunting task of recording hand-drawn structure charts in SGML and parsing them became an automatic function using the Microstar tool. SGML Working Group members agree that the DTD could not have been so completely reworked in such a short amount of time without employing Near & Far®.
It is important to note that the version of SAE J2008 that will be balloted late in 1998 is not an XML DTD. Due to time constraints, this first formal version of the standard contains an SGML-compliant DTD. The SGML Declaration, however, was updated to be XML compatible because there is interest in being able to directly deliver XML-compliant data to desktop browsers.
In order to develop an XML DTD for SAE J2008, the existing SGML DTD must be converted. Microstar's Near & Far Designer® 3.0 was used to help automate this activity. This new version of Near & Far incorporates XML into the familiar Microstar SGML DTD design tool. Implementers can use the tool to create a new XML DTD using a graphical user interface. But more interesting to those working with SAE J2008 is the ability of the tool to assist with the conversion from an SGML DTD to an XML DTD.
XML is a narrow profile of SGML. It has a concrete syntax prescribed by the XML SGML Declaration, which has become the standard syntax for Web delivery of SGML data. In addition, the syntax of declarations within the DTD has been limited to assure creation of well-formed, self-describing documents.
Near & Far Designer® 3.0 automates the conversion of one-for-one differences between SGML and XML. These conversions are straightforward and are performed when a user selects the "Convert to XML" option in the Tools pull-down menu.
Automatic conversion routines include:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE j2008 [
. . .
<!ELEMENT Paths - - (Path1 | Path2 | Path4 | Path3 | Path4 | Path6 | Path7 |
                     Path8 | Path9 | Path10 | Path11 | Path12 | Path13 |
                     Path14 | Path15 | Path16 | Path17 | Path18 | Path19)+ >
<!ELEMENT Path1 - - (ServInfo | ServInforef | SIEdeletefrompath)+
                     --Supporting OEM Tables 103 (85,105)-->
<!ATTLIST Path1
          vehSGMLid     IDREF  #REQUIRED
          vehvarSGMLid  IDREF  #IMPLIED >

<!-- This XML DTD was developed by XMLXperts LTD. It is based on SAE J2008 DTD Dated 2/98 -->
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE j2008 [. . .
<!ELEMENT Paths (Path1 | Path2 | Path4 | Path3 | Path4 | Path6 | Path7 |
                 Path8 | Path9 | Path10 | Path11 | Path12 | Path13 |
                 Path14 | Path15 | Path16 | Path17 | Path18 | Path19)+ >
<!--Supporting OEM Tables 103 (85,105)-->
<!ELEMENT Path1 (ServInfo | ServInforef | SIEdeletefrompath)+ >
<!ATTLIST Path1
          vehSGMLid     IDREF  #REQUIRED
          vehvarSGMLid  IDREF  #IMPLIED >
In XML, a number of SGML attribute declared values (attribute types) are not allowed. These restrictions were implemented so that XML-coded data could be "self describing and well-formed". In XML, only CDATA, NMTOKEN, NMTOKENS, ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, and enumerated name token groups are allowed as declared values. For defaults, only an explicit default value, #REQUIRED, #FIXED, and #IMPLIED are allowed.
Specific SGML attribute values that are forbidden in XML include NAME, NAMES, NUMBER, NUMBERS, NUTOKEN, and NUTOKENS.
And these SGML attribute defaults are not allowed in XML DTDs: #CURRENT and #CONREF.
Clearly, to convert from SGML to XML DTDs, we need to review the attribute values and defaults and change them to acceptable XML attribute values. Unlike the conversions discussed in the previous section, this is not an automatic one-to-one mapping. However, the conversion can be automated once a mapping has been established.
Near & Far Designer® 3.0 enables us to specify standard SGML-to-XML mappings for attribute values and defaults using the "Tools" pull-down menu. Simply select "Options" and then "XML". At this point you can use check boxes to indicate replacements you wish to make automatically. For example, you can select "Replace NUTOKENS with NMTOKENS" or you can select "Replace NAMES with NMTOKENS." For the conversion of SAE J2008 to XML, the standard replacements suggested by check boxes in the XML menu were used.
<!ATTLIST Driveline
          configgrpSGMLid  ID                        #REQUIRED
          mcseqnbr         NUMBER                    #REQUIRED
          drivemfrcode     NUMBER                    #REQUIRED
          drivetypenbr     NUMBER                    #REQUIRED
          drivedesc        CDATA                     #IMPLIED
          update           (delete,change,original)  #REQUIRED
          ldup             CDATA                     #REQUIRED
          configgrpnbr     NUMBER                    #FIXED "7" >

<!ATTLIST Driveline
          configgrpSGMLid  ID                        #REQUIRED
          mcseqnbr         NMTOKEN                   #REQUIRED
          drivemfrcode     NMTOKEN                   #REQUIRED
          drivetypenbr     NMTOKEN                   #REQUIRED
          drivedesc        CDATA                     #IMPLIED
          update           (delete|change|original)  #REQUIRED
          ldup             CDATA                     #REQUIRED
          configgrpnbr     NMTOKEN                   #FIXED "7" >
In XML, the SGML declared content types CDATA and RCDATA are not allowed in element declarations in the DTD. Again, a one-to-one mapping from SGML to XML does not exist, so conversion cannot proceed automatically. However, as was the case with attribute values and defaults, this conversion can be automated once a mapping has been established.
Near & Far Designer® 3.0 enables us to specify standard SGML-to-XML mappings for declared content using the "Tools" pull-down menu. Simply select "Options" and then "XML". At this point you can use check boxes to indicate replacements you wish to make automatically. For SAE J2008, any CDATA and RCDATA specifications were directly replaced with #PCDATA.
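As a minimal sketch of this replacement, assume a hypothetical element named codesample (it is not part of SAE J2008) that was declared with CDATA declared content in SGML. Once its content becomes #PCDATA, markup characters in the instance data are parsed, so they must be escaped or wrapped in a CDATA section:

```xml
<?xml version="1.0"?>
<!DOCTYPE codesample [
  <!-- SGML original (hypothetical element, not from SAE J2008):
       <!ELEMENT codesample - - CDATA >
  -->
  <!-- XML replacement: declared content becomes parsed character data -->
  <!ELEMENT codesample (#PCDATA)>
]>
<!-- A CDATA section keeps the literal "<" and "&" from being parsed as markup -->
<codesample><![CDATA[if (a < b && b < c)]]></codesample>
```

The DTD change itself is a simple substitution; the practical consequence falls on existing instance data, which may need escaping before it is valid against the XML DTD.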
In addition to the conversion items which you can automate with Near & Far Designer® 3.0, certain issues will remain which cannot be resolved either with a one-for-one replacement or with user-defined mappings. In these cases Near & Far Designer® 3.0 assists you by providing a list of discrepancies:
For SAE J2008, the remaining errors fell into several classes which will be described in the following sections. Types of errors included:
Inclusion exceptions are allowed in SGML DTDs, but not in XML DTDs. In XML, each content model is expected to be declared precisely. Eliminating an inclusion exception is a relatively easy task if the inclusion falls at a terminal node of the DTD (at the #PCDATA level). Because white space handling in XML is predictable, the same effect can be achieved by placing the included elements in a mixed content model.
Fortunately, in SAE J2008 inclusions do not happen at a high level. In fact, they only happen at the #PCDATA level, so eliminating inclusions in SAE J2008 was a relatively simple task. All inclusions were placed in a mixed content OR group with #PCDATA, as the XML standard allows.
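The move from an inclusion to a mixed content OR group can be sketched as follows. The element names para and emph are hypothetical stand-ins, not taken from the SAE J2008 DTD:

```xml
<?xml version="1.0"?>
<!DOCTYPE para [
  <!-- SGML original (hypothetical names, not from SAE J2008):
       <!ELEMENT para - - (#PCDATA) +(emph) >
       <!ELEMENT emph - - (#PCDATA) >
  -->
  <!-- XML: the included element moves into a mixed-content OR group.
       #PCDATA must come first and the group must end with "*". -->
  <!ELEMENT para (#PCDATA | emph)*>
  <!-- The SGML inclusion also applied inside descendants of para,
       so emph is listed here too to keep the same reach. -->
  <!ELEMENT emph (#PCDATA | emph)*>
]>
<para>Torque the bolts <emph>in sequence</emph> to the listed value.</para>
```

Whether nested elements such as emph should keep the inherited reach of the inclusion is a design decision; dropping it gives a tighter model than the SGML original.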
In XML, AND connectors (&) are not allowed. AND connectors are used to specify that elements may occur in any order. When AND connectors are used to connect two or three elements, the number of possible element combinations is not significant. However, when the AND connector is used with a large group of elements, the possible element combinations become staggering. AND was eliminated from XML to promote simpler, more precise data models. To convert from SGML to XML DTDs, AND connectors must be eliminated.
In SAE J2008, AND connectors were never used, so no conversion was required. Figure 8 shows how AND connectors can be modeled into content should that be required for XML DTD conversion.
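One common way to model an AND group in XML, not necessarily the one drawn in Figure 8, is to enumerate the permitted orders explicitly. The sketch below uses hypothetical element names. The number of orderings grows factorially: two elements give 2 orders, three give 6, and ten give 3,628,800, which is why large AND groups are usually redesigned rather than expanded.

```xml
<?xml version="1.0"?>
<!DOCTYPE part [
  <!-- SGML original (hypothetical names): name and number in either order
       <!ELEMENT part - - (name & number) >
  -->
  <!-- XML option 1: enumerate both orders explicitly -->
  <!ELEMENT part ((name, number) | (number, name))>
  <!-- XML option 2 (stricter, not equivalent): fix a single order
       <!ELEMENT part (name, number)>
  -->
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT number (#PCDATA)>
]>
<part><number>103</number><name>Bracket</name></part>
```

Option 1 accepts every document the SGML model accepted. Option 2 is simpler to author against, but existing data written in the other order would need to be reordered.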
Exclusion exceptions are allowed in SGML DTDs, but not in XML DTDs. In XML it is expected that each content model be precisely declared. Eliminating exclusions can involve making some hard design decisions. First let's look at a valid exclusion. In this model a paragraph is either text or footnotes (mixed content). A footnote is defined as being either text or paragraphs. But in this model if we put a paragraph inside a footnote, we also allow a footnote (which can be inside a paragraph) inside a footnote. So to prevent a footnote from falling within a footnote, we use an exclusion to say that a footnote cannot occur within a footnote. This sort of exclusion is not allowed in XML.
Handling exclusions is usually not straightforward. One solution would be to simply drop the exclusion from the DTD and to assume that good authoring practice will prevent a footnote from occurring within a footnote.
The second, more precise solution is to give the elements within the structure where the exclusion is specified a unique (usually fully qualified) name. With a unique name, <ftnote.para> can have a unique content model which does not allow for the occurrence of footnote. This solution is clearly quite precise, but it is not upwardly compatible with the original SGML DTD. It also adds new tags which users must learn to use. Using this solution requires a transformation to deliver SGML data with an XML DTD as XML-coded data on the Web.
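The footnote case can be sketched as follows, using the ftnote.para name from the text; the other element names and the sample content are assumptions made for illustration:

```xml
<?xml version="1.0"?>
<!DOCTYPE para [
  <!-- SGML original with an exclusion (assumed declarations):
       <!ELEMENT para   - - (#PCDATA | ftnote)* >
       <!ELEMENT ftnote - - (#PCDATA | para)* -(ftnote) >
  -->
  <!-- XML: an ordinary paragraph may still contain footnotes -->
  <!ELEMENT para (#PCDATA | ftnote)*>
  <!-- Inside a footnote, only the uniquely named paragraph is allowed -->
  <!ELEMENT ftnote (#PCDATA | ftnote.para)*>
  <!-- ftnote.para has its own content model with no ftnote in it -->
  <!ELEMENT ftnote.para (#PCDATA)>
]>
<para>Check the fluid level<ftnote>See <ftnote.para>the capacity chart</ftnote.para>.</ftnote> before driving.</para>
```

Existing SGML instances that use para inside ftnote would need those tags renamed to ftnote.para, which is the transformation step mentioned above.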
For SAE J2008, new elements with unique names were developed to eliminate exclusions. For example, new elements were developed to prevent attentions from occurring within other attentions and to prevent tables and figures from occurring within tables. At times these models became quite complex. See Figure 10.
In XML, system file IDs are required for external entities. The system file ID is usually a host-specific file name.
<!ENTITY x33445 SYSTEM "file://c:/graphics/x3345.tif">
To make this task easier, Near & Far Designer® 3.0 notifies us whenever such system file IDs must be added.
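As a sketch of what adding a system file ID involves: SGML allows an external entity to be declared with a public identifier alone, while XML requires a system identifier as well. The entity name, public identifier, notation, and file path below are all hypothetical:

```xml
<?xml version="1.0"?>
<!DOCTYPE figure [
  <!-- SGML original with a public identifier only (hypothetical entity):
       <!ENTITY fig1 PUBLIC "-//Example//NONSGML Figure 1//EN" NDATA tiff>
  -->
  <!NOTATION tiff SYSTEM "image/tiff">
  <!-- XML: a system identifier (here a relative file path) must follow
       the public identifier -->
  <!ENTITY fig1 PUBLIC "-//Example//NONSGML Figure 1//EN"
                "graphics/fig1.tif" NDATA tiff>
  <!ELEMENT figure EMPTY>
  <!ATTLIST figure src ENTITY #REQUIRED>
]>
<figure src="fig1"/>
```

A relative path or URL is generally easier to interchange than a host-specific absolute file name, though the choice depends on how the data will be delivered.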
Near & Far Designer® 3.0 was specially designed to help make the transition from SGML to XML as smooth and straightforward as possible. Near & Far Designer® 3.0 can evaluate any valid SGML DTD and interactively convert all mappings that are one-for-one. It will also highlight any remaining discrepancies, evaluate end user resolutions, and complete the transformation from an SGML DTD to an XML DTD, taking all guesswork out of this task. Near & Far Designer® 3.0 was designed to enable organizations to make the transition from SGML to XML in a cost-effective and resource-effective manner.
Following the transition from SGML to XML, the graphical interface of Near & Far Designer® 3.0 makes the ongoing creation of XML DTDs an easy task in the future. Designer now offers the document analyst a choice to create either new SGML DTDs or to create XML DTDs directly.