The Schema Adjunct Framework

Schema adjuncts are a mechanism for extending XML schema languages, and for providing information from such extensions to programs that process XML instances. To process XML instances for a given schema, many environments need additional information which is typically not available in the schema itself. Such information includes mappings to relational databases, indexing parameters for native XML databases, business rules for additional validation, internationalization and localization parameters, or parameters used for presentation and input forms. Some of this information is used for domain-specific validation, some for domain-specific processing. In either case, this information is best provided in a declarative manner, at the same logical level as the schema language. However, no schema language provides support for all the information that might be provided at this level, nor should it. Instead, we need a way to associate such information with a schema without affecting the underlying schema language. The Schema Adjunct Framework is an XML-based language used to associate domain-specific data (which we will call adjunct-data) with schemas and their instances, effectively extending the power of existing XML schema languages such as DTDs or XML Schema.

Status of this document

This draft is intended for feedback by a small group which is collaborating to prepare a submission to the W3C. A future version of this document will be submitted for publication as a W3C NOTE. Please send comments or questions to schema-adjuncts@extensibility.com.

Portions of this document were previously published in Markup Languages: Theory and Practice.

1 Schema Adjunct Concepts

1.1 Introduction

Most programs that use XML require information that is not encoded in the XML instance or in the schema that governs it. This information may specify what is to be done with the XML instance, describe properties of structures that are not directly supported by a given schema language, or provide data that is required in order to perform a particular action on the XML instance. In most cases, this information is hard-coded in the programs used to process XML, and these programs are each written to process specific document types. If the same functionality is needed for a different document type, a new program must be written, since programs that are hard-coded for a particular schema are "locked in" to that schema. We refer to this as "schema lock-in", and it leads to more complex, less maintainable systems that are hard to understand without carefully reading the source code. These problems become especially acute when the same action needs to be performed on many element types, or when several software systems need to manage XML instances in a consistent, compatible way.

One cure for schema lock-in is to represent that which is schema specific separately, in a language that the intended application understands. The schema adjunct framework allows this information to be associated with schemas or instance documents. This is analogous to the way DTDs or schemas are used when validating an XML instance. No schema-specific information is needed to determine if an XML document is well formed, but to determine whether it is valid requires a schema or a DTD, which contains the schema-specific definition of validity for a particular document type. When an element node is encountered in an XML document, it is associated with the content model for its element type. This content model is represented in a language that XML parsers understand, the DTD or schema syntax. The Schema Adjunct Framework generalizes this basic approach, allowing new domain-specific languages to be defined for various purposes, and allowing information defined in such languages to be associated with document nodes or schema definitions. When a node is encountered in a document, or a definition is encountered in a schema, the Schema Adjunct Framework provides a way to find associated information. This information is represented in a language that is understood by the applications that use a particular schema adjunct.

For example, consider a program that processes XML documents representing hospital patient admission information, mapping each document to a SQL insert statement and storing it in a relational database. That task is straightforward, and easy to implement. However, the implementation is now "locked into" the schema for patient admission records - solving this one problem did not provide any code that helps us store information from other document types, such as patient treatment logs, patient discharges, or billing information. Unless we find a generic way to define database mappings, we must now write similar code for each new kind of document we encounter.

One solution to this problem would be to use a schema language that allows relational mappings to be defined together with XML structures. If relational mappings were the only such problem, or one of very few, this might be a reasonable approach, but there are far too many similar extensions that might be equally useful. During the last few years, many schema languages have been designed for XML. Two clear lessons have been learned in the process: most schema languages are too complex, and none of them supports that one particular feature that may be near and dear to a particular developer's heart. Adding further features to a schema language, moreover, may threaten the most important schema language features: simplicity and coherent design.

The Schema Adjunct Framework provides a general way to extend schema languages. It may be used for any XML schema language, and for any extension that can be described in a language that is valid content for an XML element.

1.2 Introductory Example: Relational Database Mappings

In this section we will illustrate the Schema Adjunct Framework concretely by showing how it can be used to define relational database mappings. The first step is to design a language that describes the database mapping for a given XML node type. Our language will identify the database associated with the document type; this information is placed in the <document/> association. Some XML nodes correspond to rows in a relational database; for these, our language must identify the table to which the corresponding row belongs. Some XML nodes contain simple data that will be placed in one column in a row; for these, our language must state which column the data belongs to.

A sample schema adjunct that takes this approach is shown below.

Example
A sample schema adjunct, describing how instances of the "pat-admit.xsd" schema might be mapped to relational data. The elements in the "sql" namespace are adjunct-data properties that pertain to this particular application (mapping XML data to relational data). Note that the <document/> element is used to associate information with the schema as a whole, the <element/> element is used to associate information with specific elements, and the <attribute/> element is used to associate information with specific attributes.
<schema-adjunct target="http://www.example.com/pat-admit.xsd"
                xmlns:sql="http://www.example.com/sql-map.xsd" ...>

    <document>
        <sql:server>192.168.0.6</sql:server>
        <sql:database>PATIENT_RECORDS_DB</sql:database>
    </document>

    <element context='admission'>
        <sql:table>TBL_ADMISSIONS</sql:table>
    </element>

    <element context='patient-name'>
        <sql:column>COL_NAME</sql:column>
    </element>

    <element context='age'>
        <sql:column>COL_AGE</sql:column>
    </element>

    <attribute context='admission/@id'>
        <sql:column>COL_ADMIT_NBR</sql:column>
    </attribute>

</schema-adjunct>

The above example defines the relationship between XML structures and a relational database. Since this information is contained in the adjunct, it need not be hard-coded into a program that writes information from the XML message into a relational database. If the database mappings change, the program does not need to be rewritten. Similarly, the same program can perform mappings for many kinds of XML documents without modification, as long as an adjunct is provided to define the mapping for each XML document type.
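A minimal sketch of such a generic program, in Python using only the standard library. It is an illustration under stated assumptions, not part of the framework: the function names `load_mappings` and `build_insert` are hypothetical, the elided `...` in the start tag is dropped so that the adjunct parses, the `id` value in the instance is made up, and each context is treated as either an element name or an `element/@attribute` pair.

```python
import xml.etree.ElementTree as ET

SQL_NS = "http://www.example.com/sql-map.xsd"

# The adjunct from the example above, minus the elided "..." so that it parses.
ADJUNCT = """\
<schema-adjunct target="http://www.example.com/pat-admit.xsd"
                xmlns:sql="http://www.example.com/sql-map.xsd">
  <document>
    <sql:server>192.168.0.6</sql:server>
    <sql:database>PATIENT_RECORDS_DB</sql:database>
  </document>
  <element context="admission"><sql:table>TBL_ADMISSIONS</sql:table></element>
  <element context="patient-name"><sql:column>COL_NAME</sql:column></element>
  <element context="age"><sql:column>COL_AGE</sql:column></element>
  <attribute context="admission/@id"><sql:column>COL_ADMIT_NBR</sql:column></attribute>
</schema-adjunct>
"""

# Illustrative instance; the id value is made up for this sketch.
INSTANCE = """\
<admission id="A-1001">
  <patient-name>Methusaleh</patient-name>
  <age>969</age>
</admission>
"""


def load_mappings(adjunct_xml):
    """Return {context: {local name: text}} for the sql-namespaced adjunct-data."""
    mappings = {}
    for assoc in ET.fromstring(adjunct_xml):
        context = assoc.get("context")
        if context is None:  # the <document> association has no context
            continue
        props = mappings.setdefault(context, {})
        for data in assoc:
            if data.tag.startswith("{" + SQL_NS + "}"):
                props[data.tag.split("}", 1)[1]] = (data.text or "").strip()
    return mappings


def build_insert(instance_xml, mappings):
    """Build a parameterized INSERT for an instance whose root maps to a table."""
    row = ET.fromstring(instance_xml)
    table = mappings[row.tag]["table"]
    columns, values = [], []
    # Attributes of the row element are looked up as "element/@attribute".
    for name, value in row.attrib.items():
        column = mappings.get(f"{row.tag}/@{name}", {}).get("column")
        if column:
            columns.append(column)
            values.append(value)
    # Child elements are looked up by element name.
    for child in row:
        column = mappings.get(child.tag, {}).get("column")
        if column:
            columns.append(column)
            values.append((child.text or "").strip())
    placeholders = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})", values
```

Nothing in the two functions names a table, a column, or an element type; all of that comes from the adjunct, so supporting another document type means supplying another adjunct rather than another program.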

Schema adjuncts may be used to define relationships to any software system, not just SQL databases, and the contents may be expressed in any language that can be represented as XML text. The Schema Adjunct Framework merely associates those contents with the XML structures identified by the "context" clause; interpretation of the contents is left entirely to the schema adjunct processor. For instance, the "sql"-namespaced elements in the above example are designed for a processor that understands SQL; if the adjunct processor were providing client-side form data validation using the information in the adjunct, the adjunct might contain JavaScript expressions.

The Schema Adjunct Framework can be thought of as a means to extend the information set of XML instance documents or their schemas with the information contained in their adjuncts. In the above example, the XML instance document is extended to include SQL mappings as data associated with specific elements and attributes. To understand the relationship between schema adjuncts and XML instances, it is helpful to remember that traditional XML schema languages provide information that extends the information found in an XML document. For instance, DTDs provide default values for attributes, and these are used when parsing an XML document if no value is specified. Similarly, XML Schema provides data type information that is used to interpret the values found in an XML document. Schema adjuncts can be used to add other kinds of information to XML documents in a similar manner.

Those familiar with XSLT [] will notice that a schema adjunct has similarities to an XSLT stylesheet, which specifies transformations to be performed on individual elements and attributes in a document. Compare the above schema adjunct to an XSLT stylesheet that creates a table for a document.

Example

An XSLT stylesheet that presents the content of an admissions record as a document with a single table.

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
    <document>
        <title>Sample Document</title>
        <xsl:apply-templates/>
    </document>
</xsl:template>

<xsl:template match="admission">
    <table>
        <title>Admissions Record</title>
        <row>
            <!-- attributes must be selected explicitly; a bare
                 apply-templates visits child nodes only -->
            <xsl:apply-templates select="@* | node()"/>
        </row>
    </table>
</xsl:template>

<xsl:template match="admission/@id">
    <column>
        <xsl:value-of select="."/>
    </column>
</xsl:template>

<xsl:template match="patient-name">
    <column>
        <xsl:value-of select="."/>
    </column>
</xsl:template>

</xsl:stylesheet>

The XSLT stylesheet and the schema adjunct are structurally similar. However, schema adjuncts are a more general concept. In any XSLT stylesheet, the contents of the templates are always a combination of XSLT directives and data, and the action performed when an element matches is always the same: transforming the source data into the target data format. By contrast, a schema adjunct can hold data or code in any language, and any kind of processing may be supported. Schema adjuncts have been defined for a variety of applications, including generating data input forms, creating indexing schemes for native XML repositories, assembling documents from foreign data sources, and pure data transformation.

1.3 Schema Adjunct Processing

So far we have shown how to declare schema extensions using schema adjuncts, but we have not said how adjuncts are associated with a schema, or how adjunct processors actually associate adjunct-data with instance data. In this section, we will show how instances, adjuncts, and processors are combined at run-time, a process known as schema adjunct processing.

Schema adjunct processing essentially consists of extending an input XML instance document with adjunct-data from a schema adjunct, providing the information needed by a specific adjunct processor. The adjunct to be used is selected based on the adjunct processor. In this section, we will start by discussing adjunct processing when the processor is known in advance, then we will proceed to discuss loosely-coupled scenarios in which the processor is selected based on the contents of the document.

Let us consider how schema adjunct processing would be applied in the relational database mapping example from our introduction. In this case, we will assume that our adjunct processor is known to be a program that maps XML documents into relational databases. The first step is to find the appropriate schema adjunct; that is, to find the schema adjunct that relates a particular schema to a particular processor. The schema adjunct is defined for the right schema if its "target" attribute contains the name of the schema. The schema adjunct is defined for the right processor if the namespace used to define the adjunct's contents matches the namespace belonging to the processor. In our example, the processor namespace is "http://www.example.com/sql-map.xsd", associated with the "sql" prefix. It is worth noting that a family of schema adjuncts defined for this particular processor would all use the same namespace for their contents, but would name different schemas in their "target" attributes. Likewise, a family of schema adjuncts defined for different processors that operate on the same schema would have different namespaces for their contents, but would name the same schema in their "target" attributes.
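The selection rule can be sketched in a few lines of Python. This is an illustration only: `select_adjunct` and `content_namespaces` are hypothetical names, the candidate adjuncts are trimmed to a single association each, and the discharge schema URL and the `TBL_DISCHARGES` table are invented for the example.

```python
import xml.etree.ElementTree as ET


def content_namespaces(adjunct_root):
    """Collect the namespace URIs of the adjunct-data elements in an adjunct."""
    found = set()
    for association in adjunct_root:
        for data in association:
            if data.tag.startswith("{"):  # ElementTree spells names as {uri}local
                found.add(data.tag[1:].split("}", 1)[0])
    return found


def select_adjunct(adjunct_documents, target_schema, processor_namespace):
    """Return the first parsed adjunct for this schema and processor, else None."""
    for text in adjunct_documents:
        root = ET.fromstring(text)
        if (root.get("target") == target_schema
                and processor_namespace in content_namespaces(root)):
            return root
    return None


SQL = "http://www.example.com/sql-map.xsd"
XFG = "http://www.example.com/xml-form-gen.xsd"
ADMIT = "http://www.example.com/pat-admit.xsd"
DISCHARGE = "http://www.example.com/pat-dischg.xsd"  # hypothetical schema URL

CANDIDATES = [
    # Same processor (sql), different schema.
    f'<schema-adjunct target="{DISCHARGE}" xmlns:sql="{SQL}">'
    '<element context="discharge"><sql:table>TBL_DISCHARGES</sql:table></element>'
    '</schema-adjunct>',
    # Same schema, different processor (xfg).
    f'<schema-adjunct target="{ADMIT}" xmlns:xfg="{XFG}">'
    '<element context="admission"><xfg:form/></element>'
    '</schema-adjunct>',
    # The adjunct that relates the admission schema to the sql processor.
    f'<schema-adjunct target="{ADMIT}" xmlns:sql="{SQL}">'
    '<element context="admission"><sql:table>TBL_ADMISSIONS</sql:table></element>'
    '</schema-adjunct>',
]

chosen = select_adjunct(CANDIDATES, ADMIT, SQL)
```

The sketch inspects the namespaces of the adjunct-data elements themselves rather than the prefix declarations, since the prefix chosen for a processor namespace is arbitrary.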

Once the adjunct processor has identified the appropriate adjunct, it must be able to associate the adjunct-data with the elements and attributes in the instance document. To accomplish this, a schema adjunct processing library can provide a function that, given a DOM node, returns the associated adjunct-data from the schema adjunct. We will call this function the "associator". Associator functions can be implemented in many different environments. For instance, in a DOM environment, an adjunct object might contain a method that retrieves the associations for a given node:

NodeList assoc = sa.getAssociations(node);

Similarly, an XSLT implementation might define a function to return the associations for a node:

<template match="*[associations('index')]">

In a SAX environment, events could be defined for associations, and a listener could be created for these events.

We will focus on the DOM environment for this example. Suppose a DOM node represents an "admission" element; in our example, calling the associator with that node and the string "sql:table" as arguments will return the string "TBL_ADMISSIONS", the name of the SQL table into which the admission information should be placed. The associator, of course, has no understanding of the mappings; it merely makes them available. In our example, the processor calls upon the associator to obtain the SQL mapping information for each element or attribute, and can then store the data using JDBC or ODBC.
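A minimal sketch of an associator, in Python. Several things here are assumptions rather than part of the framework: `xml.etree.ElementTree` stands in for a DOM, the class `SchemaAdjunct` and the helper `path_to` are hypothetical, a context is matched as a trailing sequence of path steps, and the result is a list of strings because more than one association may match a node.

```python
import xml.etree.ElementTree as ET

SQL_NS = "http://www.example.com/sql-map.xsd"

ADJUNCT = """\
<schema-adjunct target="http://www.example.com/pat-admit.xsd"
                xmlns:sql="http://www.example.com/sql-map.xsd">
  <element context="admission"><sql:table>TBL_ADMISSIONS</sql:table></element>
  <element context="age"><sql:column>COL_AGE</sql:column></element>
  <attribute context="admission/@id"><sql:column>COL_ADMIT_NBR</sql:column></attribute>
</schema-adjunct>
"""

INSTANCE = '<admission id="A-1001"><age>969</age></admission>'  # id is made up


class SchemaAdjunct:
    """Hypothetical associator: looks up adjunct-data by node path and property."""

    def __init__(self, adjunct_xml, prefixes):
        self.root = ET.fromstring(adjunct_xml)
        self.prefixes = prefixes  # maps a prefix such as "sql" to its namespace

    def get_associations(self, path, prop):
        """Return the text of every `prop` whose context matches the end of `path`."""
        prefix, local = prop.split(":", 1)
        wanted = "{" + self.prefixes[prefix] + "}" + local
        values = []
        for association in self.root:
            context = association.get("context")
            if context is None:
                continue
            steps = context.split("/")
            if path[-len(steps):] != steps:  # context must match the trailing steps
                continue
            values.extend((d.text or "").strip() for d in association if d.tag == wanted)
        return values


def path_to(document_root, node):
    """Return the element names from the document root down to `node`."""
    parents = {child: parent for parent in document_root.iter() for child in parent}
    steps = [node.tag]
    while node in parents:
        node = parents[node]
        steps.append(node.tag)
    return steps[::-1]


doc = ET.fromstring(INSTANCE)
sa = SchemaAdjunct(ADJUNCT, {"sql": SQL_NS})
table = sa.get_associations(path_to(doc, doc), "sql:table")
```

The associator only matches and returns; deciding that "sql:table" names a table to insert into remains the job of the processor that calls it.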

In general, identifying adjuncts by their target schema and processor namespace allows an entire spectrum of processor frameworks, from tightly-coupled, specifically targeted systems with target schemas and processors known in advance, to loosely-coupled, data-directed systems. For instance, one kind of loosely-coupled framework accepts messages that consist of an XML document and a verb, such as "store" or "route". The framework then selects a processor that implements that verb, and selects an adjunct for that processor and the document's schema. Another possible configuration is one in which a schema adjunct is supplied as an input together with the XML instance document. In that case, the processor to be applied will be the one that is appropriate for the processor namespace in the adjunct.
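The verb-based configuration might be wired together roughly as follows. This Python sketch is purely illustrative: the `Framework` and `Processor` classes, the adjunct file name `pat-admit-sql.xml`, and the trivial "store" processor are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

SQL_NS = "http://www.example.com/sql-map.xsd"
ADMIT = "http://www.example.com/pat-admit.xsd"


@dataclass(frozen=True)
class Processor:
    namespace: str                  # processor namespace used in its adjuncts
    run: Callable[[str, str], str]  # (document, adjunct) -> result


class Framework:
    """Selects a processor by verb, then an adjunct by (schema, processor namespace)."""

    def __init__(self) -> None:
        self._processors: Dict[str, Processor] = {}
        self._adjuncts: Dict[Tuple[str, str], str] = {}

    def register_processor(self, verb: str, processor: Processor) -> None:
        self._processors[verb] = processor

    def register_adjunct(self, target_schema: str, namespace: str, adjunct: str) -> None:
        self._adjuncts[(target_schema, namespace)] = adjunct

    def handle(self, verb: str, schema: str, document: str) -> str:
        processor = self._processors[verb]                       # verb -> processor
        adjunct = self._adjuncts[(schema, processor.namespace)]  # schema + namespace -> adjunct
        return processor.run(document, adjunct)


framework = Framework()
framework.register_processor(
    "store", Processor(SQL_NS, lambda document, adjunct: f"store via {adjunct}"))
framework.register_adjunct(ADMIT, SQL_NS, "pat-admit-sql.xml")
result = framework.handle("store", ADMIT, "<admission/>")
```

The two lookups mirror the two identifying properties of an adjunct: the verb fixes the processor namespace, and the document's schema fixes the target.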

In the rest of this paper we will explore how schema adjuncts can be used to solve specific problems with general, maintainable XML software architectures. We will describe three systems. The first is an HTML form generator used to create input forms for the data described by a schema. The second is a system that integrates a schema design tool from one company with a native XML repository from another company. The third system is an extensible validation processor that allows for validation beyond what is provided by standard XML schema languages.

1.4 Use Case: HTML Form Generation

In this scenario, the developer wishes to create a web application that provides update, query, and reporting access to a "database" that is represented as an XML document. Furthermore, the developer wishes to create a single application that can be repurposed for different data. To accomplish this, the developer writes the application as a family of cooperating adjunct processors, with each processor understanding a particular kind of schema adjunct.

To support insert and update operations on the data, two separate processes are required: one that generates an HTML form, and one that extracts an XML document from data received when the form is submitted. The extracted XML document might look something like this:

<admission>
    <patient-name>Methusaleh</patient-name>
    <age>969</age>
</admission>

The required input form HTML will therefore look something like this:

<form action="xml-form-parse.cgi">
    <input type="hidden" name="target-schema" value="pat-admit.xsd"/>
    <table>
        <tr>
            <td>Patient name</td>
            <td><input name="patnm" type="text"/></td>
        </tr>
        <tr>
            <td>age</td>
            <td><input name="patage" type="text"/></td>
        </tr>
    </table>
</form>

This form can be generated in a straightforward fashion using an adjunct like this:

<schema-adjunct target = "http://www.example.com/pat-admit.xsd"
                xmlns:xfg = "http://www.example.com/xml-form-gen.xsd"
                ...>

    <element context = 'admission'>
        <xfg:form/>
    </element>

    <element context = 'patient-name'>
        <xfg:label>Patient name</xfg:label>
        <xfg:type>text</xfg:type>
        <xfg:tag>patnm</xfg:tag>
    </element>

    <element context = 'age'>
        <xfg:label>age</xfg:label>
        <xfg:type>integer</xfg:type>
        <xfg:tag>patage</xfg:tag>
    </element>

</schema-adjunct>

Note that the adjunct really has no specific knowledge of HTML -- that knowledge is embedded in the adjunct processor. The adjunct itself contains only the parameters to make a generic input form specific to the purpose of entering data for the "pat-admit.xsd" schema. The input form generated need not be HTML at all: a similar processor could generate Java client user-interface code, or legacy mainframe input screens, using the same adjunct data.

Naturally, these examples have been simplified. For instance, the generated form will almost certainly require some client-side validation JavaScript code, additional layout hints, etc. Such things can be readily added to the schema adjunct shown, or could be placed in a related adjunct. The same adjunct can also be used in conjunction with the target schema, "pat-admit.xsd", to generate server-side processing code that constructs the XML document from the HTTP request received from the client.
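Both directions can be sketched in Python. This is a simplified illustration, not a reference processor: `generate_form` and `build_document` are hypothetical names, the elided `...` is dropped from the adjunct so that it parses, every field is rendered as a text input as in the form shown above (the "xfg:type" value is not used here), and the submitted data is assumed to arrive as a plain dictionary keyed by the "xfg:tag" values.

```python
import html
import xml.etree.ElementTree as ET

XFG = "{http://www.example.com/xml-form-gen.xsd}"

ADJUNCT = """\
<schema-adjunct target="http://www.example.com/pat-admit.xsd"
                xmlns:xfg="http://www.example.com/xml-form-gen.xsd">
  <element context="admission"><xfg:form/></element>
  <element context="patient-name">
    <xfg:label>Patient name</xfg:label><xfg:type>text</xfg:type><xfg:tag>patnm</xfg:tag>
  </element>
  <element context="age">
    <xfg:label>age</xfg:label><xfg:type>integer</xfg:type><xfg:tag>patage</xfg:tag>
  </element>
</schema-adjunct>
"""


def fields(adjunct_root):
    """Yield (context, label, tag) for every association that defines a field."""
    for association in adjunct_root.findall("element"):
        tag = association.findtext(XFG + "tag")
        if tag is not None:
            yield association.get("context"), association.findtext(XFG + "label", ""), tag


def generate_form(adjunct_xml, action="xml-form-parse.cgi"):
    """Render an HTML input form from the xfg adjunct-data."""
    root = ET.fromstring(adjunct_xml)
    schema = root.get("target").rsplit("/", 1)[-1]  # last path segment of the target
    lines = [f'<form action="{html.escape(action)}">',
             f'\t<input type="hidden" name="target-schema" value="{html.escape(schema)}"/>',
             "\t<table>"]
    for _context, label, tag in fields(root):
        lines += ["\t\t<tr>",
                  f"\t\t\t<td>{html.escape(label)}</td>",
                  f'\t\t\t<td><input name="{html.escape(tag)}" type="text"/></td>',
                  "\t\t</tr>"]
    lines += ["\t</table>", "</form>"]
    return "\n".join(lines)


def build_document(adjunct_xml, submitted):
    """Rebuild the XML instance from submitted form values, keyed by xfg:tag."""
    root = ET.fromstring(adjunct_xml)
    form_context = next(a.get("context") for a in root.findall("element")
                        if a.find(XFG + "form") is not None)
    document = ET.Element(form_context)
    for context, _label, tag in fields(root):
        ET.SubElement(document, context).text = submitted[tag]
    return ET.tostring(document, encoding="unicode")
```

Because both functions read the same associations, the field names in the generated form and the element names in the reconstructed document cannot drift apart.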

Since the web application is adjunct-driven, it is easy to adapt it to handle instances of a different schema. For instance, the developer can write a similar adjunct that defines the form processing for patient discharges:

<discharge>
    <patient-name>Methusaleh</patient-name>
    <date-discharged>2000-04-19</date-discharged>
</discharge>

The new adjunct, being written for use in the same processing domain, will use the same "xfg" processor namespace. It will differ in the target schema, which will now be something like "pat-dischg.xsd", and of course the "context" selectors and the actual values within the "<xfg:label>", etc. tags will differ.

1.5 Use Case: Schema Design for an XML Data Dictionary

One of the advantages of a standard for expressing schema-level adjunct-data is that it encourages interoperability of software tools. This section describes the use of schema adjuncts in a system that allows a schema design tool to define the data dictionary for a native XML repository made by a different company. In this system, the schema adjuncts provide for definition of indexes for the repository, and the schema design tool supports the repository's indexes in the same way that it supports native XML constructs like elements and attributes. From the user's perspective, the two products are tightly integrated, but the interface between the two products is defined purely by using HTTP PUT, GET, and POST to exchange XML files, and schema adjuncts are used to describe the schema language extensions to both systems.

For this example, we want to allow a schema in XML Data Reduced (XDR) to be augmented with two additional properties. The "rootable" property determines whether an element may occur as the root of a document. The "index" property determines whether an attribute or element is indexed with a standard index, a full-text index, both, or not at all.

Example

A sample adjunct that indicates that the "admission" element is "rootable" and indexed as full-text, with standard indexing used for other elements.

<schema-adjunct target = "http://www.example.com/pat-admit.xsd"
                xmlns:sto = "http://www.example.com/xml-store.xsd"
                ...>

    <element context = 'admission'>
        <sto:rootable/>
        <sto:index>text</sto:index>
    </element>

    <attribute context = 'admission/@id'>
        <sto:index>standard</sto:index>
    </attribute>

    <element context = 'patient-name'>
        <sto:index>standard</sto:index>
    </element>

    <element context = 'age'>
        <sto:index>standard</sto:index>
    </element>

</schema-adjunct>

In schema adjuncts for this domain, the presence or absence of the "rootable" property corresponds to a "true" or "false" value. The "index" property can take the values "text", "standard", "text standard" (when both types of indexes are desired), or "final" (when an element is to be treated as a blob, without further decomposition). The XML schema design tool presents these choices to a user when a schema is designed for use with the database. For simplicity at design time, the design tool presents the schema and the schema adjunct in a single view, producing an "annotated" schema. Here is a sample schema annotated in accordance with the schema adjunct shown above, using the XML Data Reduced schema language:

Example

Trivial schema for a Patient Admissions Record, with annotations that correspond to a schema adjunct.

<?xml version ="1.0"?>
<Schema name = "patient.sto"
        xmlns = "urn:schemas-microsoft-com:xml-data"
        xmlns:dt = "urn:schemas-microsoft-com:datatypes"
        xmlns:sto = "http://www.example.com/xml-store.xsd">

    <ElementType name = "admission"
                 sto:rootable = "true"
                 sto:index = "text" content = "eltOnly" order = "seq">
        <AttributeType name = "id" dt:type = "string" required = "yes"
                       sto:index = "standard"/>
        <attribute type = "id"/>
        <element type = "patient-name"/>
        <element type = "age"/>
    </ElementType>

    <ElementType name = "patient-name"
                 sto:index = "standard" content = "textOnly"/>
    <ElementType name = "age"
                 sto:index = "standard" content = "textOnly" dt:type = "int"/>

</Schema>

In the above schema, associations are made in-line in the schema rather than in a separate schema adjunct, and namespaces are used to distinguish associations from XML Data Reduced constructs. (There is no technical reason this could not be done with a separate file, but it is sometimes more convenient to represent such small extensions in the same file as the schema.) Using the above schema, the XML repository can create a data dictionary that limits which elements may be used as roots and has the indexing structures specified in the schema. For the user of the schema design tool, these two properties appear as first-class properties of the schema language, effectively creating an extended schema language that supports the XML repository.
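As a rough illustration of what the repository side might do with the annotated schema, the following Python sketch extracts the "rootable" and "index" settings into a simple data dictionary. The function names and the dictionary layout are assumptions made for this example; only the attribute names, namespaces, and permitted index values come from the text above.

```python
import xml.etree.ElementTree as ET

XDR = "urn:schemas-microsoft-com:xml-data"
STO = "http://www.example.com/xml-store.xsd"
INDEX_KINDS = {"text", "standard", "final"}

SCHEMA = """\
<Schema name="patient.sto"
        xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes"
        xmlns:sto="http://www.example.com/xml-store.xsd">
  <ElementType name="admission" sto:rootable="true" sto:index="text"
               content="eltOnly" order="seq">
    <AttributeType name="id" dt:type="string" required="yes" sto:index="standard"/>
    <attribute type="id"/>
    <element type="patient-name"/>
    <element type="age"/>
  </ElementType>
  <ElementType name="patient-name" sto:index="standard" content="textOnly"/>
  <ElementType name="age" sto:index="standard" content="textOnly" dt:type="int"/>
</Schema>
"""


def settings(node):
    """Read the sto:rootable and sto:index annotations from one declaration."""
    kinds = set(node.get(f"{{{STO}}}index", "").split())  # "text standard" -> two kinds
    unknown = kinds - INDEX_KINDS
    if unknown:
        raise ValueError(f"unknown index kind(s): {sorted(unknown)}")
    # An absent rootable annotation corresponds to "false".
    return {"rootable": node.get(f"{{{STO}}}rootable") == "true", "index": kinds}


def data_dictionary(schema_xml):
    """Map each element name and element/@attribute name to its settings."""
    root = ET.fromstring(schema_xml)
    entries = {}
    for element_type in root.findall(f"{{{XDR}}}ElementType"):
        name = element_type.get("name")
        entries[name] = settings(element_type)
        for attribute_type in element_type.findall(f"{{{XDR}}}AttributeType"):
            entries[f"{name}/@{attribute_type.get('name')}"] = settings(attribute_type)
    return entries


dictionary = data_dictionary(SCHEMA)
```

The keys deliberately use the same form as the "context" values in the separate adjunct shown earlier, so the in-line and stand-alone representations yield the same dictionary.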

1.6 Use Case: Extended Validation

Our final example applies the Schema Adjunct Framework to support extended validation of XML instance documents; that is, validation based on rules not contained in any schema language. Since validation is one of the functions of a schema, this is effectively a way of extending the schema language itself, adding rules that are outside the semantic universe of the schema language. These rules might be business rules defined in JavaScript, or database validity rules that use XPath to express referential integrity constraints. It is important to be able to support many languages to express extended validation constraints, since the choice of language depends on the semantics that are to be expressed.

In this example, we will illustrate a fairly general-purpose extended validation facility. The validation adjunct processor proposed here has the ability to evaluate XPath expressions and return true/false results. The XPath expressions may refer to functions that are not defined and packaged with the validator; the validator has the ability to load auxiliary functions from JAR files to extend the default XPath function library. Finally, the validator can support other languages for stating constraints, and it can support multiple levels of constraint checking.

Example
A schema adjunct showing an extended validation example. Target documents for this adjunct are invoices conforming to the "invoice.xdr" schema. The "xval" prefix and "morevalid.xsd" namespace URI identify the extended validation processor that will apply this adjunct to instance documents.
<schema-adjunct target="invoice.xdr" xmlns:xval="morevalid.xsd">

    <document>
        <xval:func name="checkZip" signature="..."
                   class="..." codebase="..."/>
    </document>

    <element context="mailto/address">
        <xval:test lang="xpath" level="5">
            checkZip( @zip, @city, @state ) = 0
        </xval:test>
    </element>

</schema-adjunct>

In the example adjunct above, the target invoice documents are being checked to ensure that the ZIP code in the mail-to address is consistent with the city and state. Since the validator has no implicit knowledge of ZIP codes, we supply an external library containing a "checkZip" function that checks ZIP code consistency, presumably accessing some database or service. The <document> association contains an <xval:func> item that is used to declare the "checkZip" function and its call signature and location. The <element> association then contains an <xval:test> element that defines an XPath expression to evaluate in order to check "mailto/address" validity with respect to ZIP codes, using the "checkZip" extension function defined. Note that the "lang" attribute of the test implies that the extended validation processor may know how to evaluate several languages. In addition, the "level" attribute implies that this processor supports various levels of validation, for varying tradeoffs of confidence/correctness versus efficiency.
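The control flow of such a validator can be sketched in Python, with heavy simplifications that should be read as assumptions. The sketch does not evaluate XPath: it recognizes only tests of the form `name( @a, @b, ... ) = integer`. Extension functions are supplied as Python callables instead of being loaded from JAR files, so the "xval:func" declaration is omitted. The `check_zip` lookup table is invented, and "level" is interpreted as "skip tests whose level exceeds the requested maximum", which the text does not specify.

```python
import re
import xml.etree.ElementTree as ET

XVAL = "{morevalid.xsd}"
# Recognizes only "name( @a, @b, ... ) = integer"; this is not an XPath parser.
CALL = re.compile(r"^\s*(\w+)\(\s*([^)]*?)\s*\)\s*=\s*(-?\d+)\s*$")

ADJUNCT = """\
<schema-adjunct target="invoice.xdr" xmlns:xval="morevalid.xsd">
  <element context="mailto/address">
    <xval:test lang="xpath" level="5">
      checkZip( @zip, @city, @state ) = 0
    </xval:test>
  </element>
</schema-adjunct>
"""

# Hypothetical stand-in for the external ZIP code library.
KNOWN_ZIPS = {"94105": ("San Francisco", "CA")}


def check_zip(zip_code, city, state):
    """Return 0 when the ZIP code is consistent with city and state, else 1."""
    return 0 if KNOWN_ZIPS.get(zip_code) == (city, state) else 1


def nodes_matching(root, context):
    """Yield elements whose trailing ancestor path equals the context steps."""
    steps = context.split("/")
    parents = {child: parent for parent in root.iter() for child in parent}
    for node in root.iter():
        chain, current = [], node
        while current is not None and len(chain) < len(steps):
            chain.append(current.tag)
            current = parents.get(current)
        if chain[::-1] == steps:
            yield node


def validate(instance_xml, adjunct_xml, functions, max_level):
    """Return (context, expression) pairs for the tests that failed."""
    document = ET.fromstring(instance_xml)
    failures = []
    for association in ET.fromstring(adjunct_xml).findall("element"):
        context = association.get("context")
        for test in association.findall(XVAL + "test"):
            if test.get("lang") != "xpath" or int(test.get("level", "0")) > max_level:
                continue  # unsupported language, or above the requested level
            expression = " ".join(test.text.split())
            name, args, expected = CALL.match(expression).groups()
            attributes = [a.strip().lstrip("@") for a in args.split(",")]
            for node in nodes_matching(document, context):
                result = functions[name](*(node.get(a) for a in attributes))
                if result != int(expected):
                    failures.append((context, expression))
    return failures


GOOD = '<invoice><mailto><address zip="94105" city="San Francisco" state="CA"/></mailto></invoice>'
BAD = '<invoice><mailto><address zip="94105" city="San Francisco" state="NY"/></mailto></invoice>'
FUNCTIONS = {"checkZip": check_zip}
```

A production validator would hand the expression to a real XPath engine with an extension-function registry; the point of the sketch is only that the rule, its language, and its level all come from the adjunct rather than from the validator's code.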

This example shows how extensions in several languages can be used, allowing a wide range of possibilities for extended validation, including both general data consistency issues and very specific business rule conformance issues. Schema adjuncts provide a natural and extensible framework for performing declarative, rule-driven XML processing in a way that is clearly related to XML schemas.

1.7 Conclusion

The power of the Schema Adjunct Framework derives from its simplicity and generality. We have presented three views of the Schema Adjunct Framework: as a mechanism for extending schema languages, as a means for making XML processing applications generic with respect to schemas, and simply as a way to associate any kind of adjunct-data with XML instance data. Since the framework places no constraints upon the information contained in adjunct associations, adjuncts can be applied to any problem that requires association of such information with a schema or with the structures it defines.

The Schema Adjunct Framework is particularly useful as an integrating technology, defining characteristics of systems that do not belong to the universe understood by the schema language. For instance, in this paper we have defined relationships between an XML schema and the relational representation of its data, the HTML forms that would be used to process instances, the indexes used to store instances in a native XML database, and a Java message queue that would process instances. In each of these scenarios, the schema adjunct defines a relationship between two different closed systems, which have no knowledge of each other. Schema languages describe the structure of XML documents per se, without regard to external systems. Schema adjuncts are often used to interface the XML representation to the universe of associated software.

As an integrating technology, the Schema Adjunct Framework should prove particularly useful in environments that involve many loosely coupled systems, including web application frameworks, business-to-business integration solutions, and e-commerce architectures. Many of these environments use more than one schema language. Since the Schema Adjunct Framework can be used with any schema language, including DTDs, SOX, XDR, DCD, RELAX, or the W3C XML Schema specification, it is well suited to environments that must manage XML instances originating from many sources.

2 Schema Adjunct Reference

This section lays out the details of the syntax and semantics of schema adjuncts. [Definition:] A schema adjunct is an XML document that associates domain-specific information with instance documents conforming to a particular schema. The domain-specific information represents schema-level information, the adjunct-data relating the schema to a particular application. The schema with which an adjunct is associated is known as the target schema, and is technically identified by one or more instance namespaces. The XML instance documents that conform to the target schema, and with which the adjunct is associating information, are referred to as target instances.

An informal syntax for schema adjuncts is presented below. The exact syntax is defined in the appendices: (§) and (§).

Schema adjunct syntax

`[1]`	`schema-adjunct`	`::=`	`<schema-adjunct>`
			`( instance-namespaces equate-namespaces* )?`
			`global?`
			`( element | attribute | type )+`
			`</schema-adjunct>`
`[2]`	`instance-namespaces`	`::=`	`<instance-namespaces default="prefix" others="prefix1 .. prefixN"/>`
`[3]`	`equate-namespaces`	`::=`	`<equate-namespaces alias="prefix" prefixes="prefix1 .. prefixN"/>`
`[4]`	`global`	`::=`	`<global>`
			`global-adjunct-data+`
			`</global>`
`[5]`	`element`	`::=`	`<element type="type-name" where="context">`
			`element-adjunct-data+`
			`</element>`
`[6]`	`attribute`	`::=`	`<attribute type="type-name" where="context">`
			`attribute-adjunct-data+`
			`</attribute>`
`[7]`	`type`	`::=`	`<type type="type-name" where="context">`
			`type-adjunct-data+`
			`</type>`
`[8]`	`global-adjunct-data`	`::=`	`domain-specific`
`[9]`	`element-adjunct-data`	`::=`	`domain-specific`
`[10]`	`attribute-adjunct-data`	`::=`	`domain-specific`
`[11]`	`type-adjunct-data`	`::=`	`domain-specific`

2.1 Adjunct-data Elements

Fundamentally, a schema adjunct is a container for domain-specific adjunct-data, organized in a way that associates particular adjunct-data elements with particular instance data. An adjunct-data element is simply defined to be an XML element appearing within an association in a schema adjunct. It is recommended that all adjunct-data elements be qualified with a prefix associated with an adjunct-data namespace [].

Example

A collection of sample adjunct-data elements.

<sql:table>TBL_EMPLOYEE</sql:table>
<sql:column>COL_EMPL_ID</sql:column>

<htmlformgen:label>Last name</htmlformgen:label>
<htmlformgen:fieldtype>text</htmlformgen:fieldtype>
<htmlformgen:tag>surname</htmlformgen:tag>

<i18n:Spanish>empleado</i18n:Spanish>

<javagen:check>
    return fileName.endsWith( ".xml" );
</javagen:check>

<xmledit:font>Geneva</xmledit:font>
<xmledit:color>0099cc</xmledit:color>

Adjunct-data elements may be richly structured, or completely unstructured. This specification does not constrain the content of adjunct-data elements in general, beyond the requirement of well-formedness. It is the primary function of adjunct-data-schemas to define and constrain the content of adjunct-data elements for particular applications.

2.2 Associations

A schema-adjunct organizes its adjunct-data elements into a series of associations. [Definition:] An association associates a set of adjunct-data elements with a particular set of elements or attributes in any target instance. The set of elements or attributes is constrained by the kind of association and the type name and context expression for the association.

Example
An association showing a type name and context expression.
<element type = 'address' where = 'invoice/shipTo' >
    <sql:table>SHIP_TO</sql:table>
</element>
This association states that the "sql:table" adjunct-data property having a value of "SHIP_TO" is associated with any element named "shipTo" that has type "address" and parent element "invoice".

A schema-adjunct may contain any number of element, attribute, and type associations. The adjunct-data contained in a given association is logically associated with all elements or attributes (in instance documents) that match the type name and context expression. Any number of associations may match a given attribute or element. [Correctness constraint] Either the type name or the context expression may be omitted, but not both. When both are present, the type name and context expression must not be in conflict: the context expression must name an element or attribute that can have the stated type.

A type name is a qualified name referring to a named simple or complex type, or referring to an element or attribute type that has an anonymous type. Local element types may be identified by prepending a global type name and "/".

Example
Two simple context expressions are illustrated here:
<element where = "a"> ... </element>

<attribute where = "a/@c"> ... </attribute>
They refer to, respectively: element "a", and attribute "c" of element "a".

In the most general case, each context expression is an attribute whose value is a simple XPath ([]) expression that returns a value of type node-set. This definition of context expression corresponds to a simplified version of the Pattern defined by XSLT []. The simplification is a result of requirements for both simplicity of expression and speed of processing -- XPath expressions of this nature can be evaluated using only the set of ancestor elements as possible contexts.
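
The observation that only the ancestor elements are needed can be illustrated with a small Python sketch. It assumes that a node is represented by the slash-separated names of its ancestors followed by its own name (for example "/invoice/shipTo", or "/a/@c" for an attribute), and it compares qualified names as literal strings without resolving prefixes. The function names are hypothetical, and this is one possible reading of the matching rule rather than a normative algorithm.

```python
import re


def context_to_regex(context):
    """Translate a context expression into a regex over ancestor paths."""
    # Split on the separators while keeping them: "a//b" -> ["a", "//", "b"].
    tokens = [tok for tok in re.split(r"(//|/)", context) if tok]
    parts = []
    if tokens[0] == "/":
        # Absolute expression: the first step must be the document element.
        parts.append("/")
        tokens = tokens[1:]
    else:
        if tokens[0] == "//":
            tokens = tokens[1:]
        # Relative expression: any number of ancestors may come first.
        parts.append("(?:/[^/]+)*/")
    for tok in tokens:
        if tok == "/":
            parts.append("/")               # direct child
        elif tok == "//":
            parts.append("/(?:[^/]+/)*")    # any depth of descendants
        else:
            parts.append(re.escape(tok))    # a name, or "@" plus a name
    return re.compile("^" + "".join(parts) + "$")


def matches(context, node_path):
    """Return True when the node at node_path is selected by context."""
    return context_to_regex(context).match(node_path) is not None


print(matches("invoice/shipTo", "/order/invoice/shipTo"))
print(matches("a/@c", "/a/@c"))
print(matches("/a/b", "/root/a/b"))
```

Because the whole decision is made from the path string, a streaming processor could apply it while keeping only the stack of open elements.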

Syntax of Type Names and Context Expressions

`[12]`	`Type-name`	`::=`	`QName ( "/" QName )*`
`[13]`	`Context`	`::=`	`( "/" | "//" )? RelativeContext`
`[14]`	`RelativeContext`	`::=`	`QName ( ( "/" | "//" ) QName )*`
			`( "/" "@" QName )?`
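
Productions [12] to [14] can be checked mechanically. The Python sketch below transcribes them into regular expressions, assuming an ASCII-only approximation of QName (the full definition admits a much larger character set). The names is_type_name and is_context are hypothetical.

```python
import re

# ASCII-only simplification of NCName; real NCNames allow more characters.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
QNAME = rf"(?:{NCNAME}:)?{NCNAME}"

# [12] Type-name ::= QName ( "/" QName )*
TYPE_NAME = re.compile(rf"{QNAME}(?:/{QNAME})*")

# [13] Context         ::= ( "/" | "//" )? RelativeContext
# [14] RelativeContext ::= QName ( ( "/" | "//" ) QName )* ( "/" "@" QName )?
CONTEXT = re.compile(
    rf"(?://|/)?{QNAME}(?:(?://|/){QNAME})*(?:/@{QNAME})?"
)


def is_type_name(text):
    """Return True when text has the shape of a Type-name."""
    return TYPE_NAME.fullmatch(text) is not None


def is_context(text):
    """Return True when text has the shape of a Context."""
    return CONTEXT.fullmatch(text) is not None


print(is_type_name("po:purchaseOrder/item"))
print(is_context("invoice/shipTo"))
print(is_context("a/@c/d"))
```

Note that, read literally, the grammar allows an attribute step only at the end of an expression and only after at least one element step.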

A schema-adjunct optionally contains a single global association. The adjunct-data contained therein is logically associated with all nodes in any target instance. It may also be considered to be associated with the target schema/namespace as a whole. The global association has no type name or context expression.

2.3 Namespaces

The namespaces used in a schema adjunct can be divided into two categories. A namespace in the first category contains elements, attributes, and types that the schema adjunct is "about" - nodes with which it is associating adjunct-data. Namespaces in this category are called instance namespaces. A namespace in the second category contains the elements and attributes used to state adjunct-data itself. Namespaces in this category are called adjunct-data namespaces. The distinction between instance namespaces and adjunct-data namespaces is illustrated by the example below.

Example
The namespace associated with the "emps" prefix is an instance namespace. The namespace associated with the "sql" prefix is an adjunct-data namespace.
<element where = "emps:employee-rec">
    <sql:table>TBL_EMPLOYEE</sql:table>
    <sql:column>COL_EMPL_ID</sql:column>
</element>

A schema adjunct can explicitly identify its instance namespaces, using the instance-namespaces element. This allows applications to associate schema adjuncts with target schemas or namespaces automatically. The "default" attribute defines the prefix that is assumed when target element, type, and attribute names are unqualified. The "others" attribute defines a space-delimited list of prefixes that may appear in qualified target element, type, and attribute names. [Correctness constraint] Either attribute of the instance-namespaces element may be omitted, but not both. All prefixes mentioned must be defined in namespace declarations at the root of the schema adjunct.
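
The correctness constraint on instance-namespaces lends itself to an automated check. The following Python sketch is one way to do it with the standard library, assuming the adjunct is available as a string. The function name and the wording of the messages are hypothetical, and the "hr" prefix in the second sample is deliberately left undeclared.

```python
import io
import xml.etree.ElementTree as ET


def check_instance_namespaces(xml_text):
    """Return a list of constraint violations (empty when none are found)."""
    declared = {}   # prefix -> URI, as declared on the root element
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        # Namespace events for an element arrive before its start event,
        # so everything seen before the first start belongs to the root.
        if event == "start-ns" and root is None:
            prefix, uri = payload
            declared[prefix] = uri
        elif event == "start" and root is None:
            root = payload

    problems = []
    for child in root:
        if child.tag.rsplit("}", 1)[-1] != "instance-namespaces":
            continue
        default = child.get("default")
        others = child.get("others")
        if default is None and others is None:
            problems.append("instance-namespaces needs 'default' or 'others'")
        prefixes = ([default] if default else []) + (others or "").split()
        for prefix in prefixes:
            if prefix not in declared:
                problems.append(f"prefix '{prefix}' is not declared at the root")
    return problems


GOOD = """\
<schema-adjunct xmlns="http://www.extensibility.com/namespaces/saf"
                xmlns:emp="http://www.example.com/namespaces/empl-list">
  <instance-namespaces default="emp"/>
</schema-adjunct>
"""

BAD = GOOD.replace('default="emp"', 'default="emp" others="hr"')

print(check_instance_namespaces(GOOD))
print(check_instance_namespaces(BAD))
```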

Example
The preamble of a sample schema adjunct, showing namespace declarations and an "instance-namespaces" element.
<?xml version='1.0'?>
<schema-adjunct

    xmlns = "http://www.extensibility.com/namespaces/saf"

    xmlns:emp = "http://www.example.com/namespaces/empl-list"

    xmlns:sql = "http://www.example.com/namespaces/sql-mapping"

    xmlns:xsi = "http://www.w3.org/1999/XMLSchema/instance"
    xsi:schemaLocation =
       "http://www.example.com/namespaces/empl-list
        http://www.example.com/schemas/empl-list.xsd
        http://www.extensibility.com/namespaces/saf
        http://www.extensibility.com/schemas/saf.xsd
        http://www.example.com/namespaces/sql-mapping
        http://www.example.com/schemas/sql-mapping.xsd">

    <instance-namespaces default="emp"/>

    ...

</schema-adjunct>
This example associates adjunct-data with target instances of the "empl-list.xsd" schema, which defines the "empl-list" namespace. The adjunct-data-schema describing the adjunct-data is "sql-mapping.xsd", and the "sql" namespace prefix will be used for stating adjunct-data properties.

Schema adjuncts may also indicate that several instance namespaces are to be treated as identical for the purposes of making associations to instance data. This is accomplished using the equate-namespaces element. The "alias" attribute defines the prefix that is used to represent the union of the prefixes in the "prefixes" attribute. Note that this does not violate any requirements for schema-validity of adjuncts, since instance namespace prefixes only appear within attribute values. [Correctness constraint] All prefixes mentioned in the "prefixes" attribute must be mentioned in the "instance-namespaces" element. The prefix stated in the "alias" attribute must not be mentioned in the "instance-namespaces" element or defined in a namespace declaration.
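
To make the effect of equate-namespaces concrete, the Python sketch below models the declared prefixes and the equated alias as plain dictionaries. All names in it are hypothetical: the "v1" and "v2" prefixes, their namespace URIs, the "emp" alias, and the functions expand_prefix and check_equate. It shows how an alias might be expanded to the union of its namespaces and how the correctness constraint could be checked.

```python
def expand_prefix(prefix, declared, equated):
    """Return the set of namespace URIs that a prefix stands for."""
    if prefix in equated:
        # An alias stands for the union of the prefixes it equates.
        return {declared[member] for member in equated[prefix]}
    return {declared[prefix]}


def check_equate(alias, members, instance_prefixes, declared):
    """Return a list of violations of the equate-namespaces constraint."""
    problems = []
    for member in members:
        if member not in instance_prefixes:
            problems.append(f"'{member}' is not listed in instance-namespaces")
    if alias in instance_prefixes:
        problems.append(f"alias '{alias}' is listed in instance-namespaces")
    if alias in declared:
        problems.append(f"alias '{alias}' is defined in a namespace declaration")
    return problems


# Hypothetical declarations: two versions of an employee-list namespace.
V1 = "http://www.example.com/namespaces/empl-list-v1"
V2 = "http://www.example.com/namespaces/empl-list-v2"
declared = {"v1": V1, "v2": V2}
instance_prefixes = ["v1", "v2"]
equated = {"emp": ["v1", "v2"]}   # alias="emp" prefixes="v1 v2"

print(check_equate("emp", equated["emp"], instance_prefixes, declared))
print(sorted(expand_prefix("emp", declared, equated)))
```

With this reading, a context expression such as "emp:employee-rec" would match an employee-rec element in either of the two namespaces.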

It is recommended, but not required, that schema adjuncts contain a schema-location attribute as specified in []. The schema-location attribute should at least provide locations for all adjunct-data namespaces and the standard SAF namespace, to support validation and correctness checking of the adjunct.

2.4 Schemas for Adjunct-data

[Definition:] An adjunct-data-schema is a schema that defines the syntax of adjunct-data for a particular application. All adjunct-data-schemas share a set of common element declarations, but differ in the domain-specific details of the adjunct-data elements described. The generic aspects of adjunct-data-schemas are set down in the (§) and the (§). Those appendices also define the formal syntax for schema adjuncts, in accordance with the informal syntax defined above in (§).

Example
The sample DTD adjunct-data-schema governing the schema adjunct examples above. An XML Schema version of this adjunct-data-schema can be seen in (§).
<!ENTITY % document-cm    "(sql:database,sql:server)">

<!ENTITY % element-cm     "(sql:table,sql:column)">

<!ENTITY % attribute-cm   "(sql:column)">

<!ELEMENT sql:database (#PCDATA)>
<!ELEMENT sql:server   (#PCDATA)>
<!ELEMENT sql:table    (#PCDATA)>
<!ELEMENT sql:column   (#PCDATA)>

<!ENTITY % saf SYSTEM
    "http://www.example.com/schemas/saf.ent">
%saf;
This adjunct-data-schema has two responsibilities: to define the domain-specific properties in use (the "sql"-namespaced element declarations), and to define the content models for the several adjunct-data association declarations in terms of those properties. The declarations of the associations and the top-level "schema-adjunct" element are found in the "saf" entity, which must be included at the end of the adjunct-data-schema.

Adjunct-data-schemas fill several roles. First, since a schema adjunct is an XML instance document, and an adjunct-data-schema is a schema describing a class of such instance documents, any validating XML parser can validate schema adjuncts in the usual fashion using an adjunct-data-schema. Second, adjunct-data-schemas inform automated schema adorning and distillation processes, as described in the next section. Finally, an adjunct-data-schema serves as human-readable documentation of a particular schema-driven XML processing application and its information requirements. For this role, the inclusion of explanatory material as annotations and comments is highly recommended.

The Schema Adjunct Framework

Draft 30 November 2000
