Customizing Akoma Ntoso: modularization, restrictions, extensions

Creating custom schemas

In this section we examine customizations of the general Akoma Ntoso schema that require modifying the schema itself. Once we begin editing the Akoma Ntoso schema, it becomes possible to add here, remove there, and still come out with a valid XML Schema. Of course, even if most of the content of the edited schema derives from the original schema, this is not enough to guarantee that the result is a valid customization of Akoma Ntoso, which has strong compliance requirements. For this reason, these customizations are best left to an expert in both XML Schema and Akoma Ntoso, as it is very easy to produce either an incorrect XML Schema or a correct XML Schema for a language that is not compliant with Akoma Ntoso.

The fundamental rule for the customization of Akoma Ntoso, which admits no exceptions, is that any document that is correct with regard to the custom rules must also be correct (in validity and spirit) with regard to the full Akoma Ntoso schema.

Concretely, the customization possibilities are limited to what was described in the two previous sections: restrictions of content models and allowed values, and extensions using proprietary, foreign and generic elements. So, where is the difference now?

Except for subschemas produced by the subschema generator, the basic problem with schema-less customizations is that there is no way to verify that documents comply with the custom rules. The schema is exactly the tool that verifies whether a document complies with the rules of the language, and schema-less customizations have no means of checking the customized parts.

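Therefore, if the customization needs exceed those offered by the subschema generator, and it is important to validate the documents with respect to the custom rules, there is no other way but to create a custom schema. There are three ways to do so: (1) directly editing a copy of the Akoma Ntoso schema, (2) redefining the schema, or (3) creating an additional schema, such as an embedded Schematron schema.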

An advantage of directly editing a copy of the Akoma Ntoso schema is its simplicity, but this comes with a significant disadvantage: it becomes practically impossible to verify whether the customized schema is a correct schema for Akoma Ntoso documents, i.e., whether it complies with the fundamental rule expressed above. On the other hand, redefining the schema or creating an additional schema yields safe customizations, as both approaches basically prevent violations of the fundamental rule. For this reason, all the examples that follow use either a redefined schema or an embedded Schematron schema.

In all cases, although opinions may differ on the most elegant way to proceed, the kinds of modifications that are possible are pretty much the same: you can derive types (most often to restrict rather than extend, as there are many more constraints in Akoma Ntoso with regard to extensions), you can define new attributes (in a different namespace) for any existing elements, or you can define new elements (in a different namespace) within the existing elements that allow them, such as proprietary or foreign.

Custom types

The first and most evident customization of the schema is the redefinition of existing types. Regardless of whether one creates a new type, restricts an existing type or edits the main Akoma Ntoso definition of the existing type, the basic requirement is that the fundamental rule for the customization of Akoma Ntoso holds: the resulting type must ensure that valid documents for the custom schema are also valid documents for the main schema.

In the following, for instance, we redefine (using approach 2, redefining the schema) the base hierarchy type, making the elements num and heading mandatory (they are both optional in the full Akoma Ntoso schema):

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
            xmlns:an="http://www.akomantoso.org/2.0"
            xmlns="http://www.akomantoso.org/2.0"
            targetNamespace="http://www.akomantoso.org/2.0">
    <xsd:redefine schemaLocation="./akomantoso20.xsd">
        <xsd:complexType name="basehierarchy">
            <xsd:complexContent>
                <xsd:restriction base="an:basehierarchy">
                    <xsd:sequence>
                        <xsd:element ref="num" minOccurs="1" maxOccurs="1"/>
                        <xsd:element ref="heading" minOccurs="1" maxOccurs="1"/>
                        <xsd:element ref="subheading" minOccurs="0" maxOccurs="1"/>
                    </xsd:sequence>
                </xsd:restriction>
            </xsd:complexContent>
        </xsd:complexType>
    </xsd:redefine>
</xsd:schema>

 

In this example, a custom version of the basehierarchy type is defined as a restriction of the type found in the full schema. This redefinition is placed within the xsd:redefine element, and it has the same name as the base type, so that this definition replaces the original one in all elements using it. By specifying that elements num and heading become required in all hierarchical elements (minOccurs='1' instead of minOccurs='0' of the original definition) we have created a concrete, widespread customization of the original schema, and also one that was not possible with the subschema generator.
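As an illustration, the following sketch shows two document fragments as they might appear under this customization. It assumes that section is one of the hierarchical elements built on basehierarchy and that the enclosing document declares the Akoma Ntoso 2.0 namespace as its default; the id values and the text are invented for the example.

<!-- Accepted by both the custom schema and the full schema:
     num and heading are present (id values are hypothetical) -->
<section id="sec1">
    <num>1.</num>
    <heading>Scope</heading>
    <content id="sec1-cnt">
        <p>This Act applies to all public bodies.</p>
    </content>
</section>

<!-- Expected to be rejected by the custom schema only:
     num and heading are missing, which the full schema tolerates -->
<section id="sec2">
    <content id="sec2-cnt">
        <p>This Act enters into force on publication.</p>
    </content>
</section>

The first fragment satisfies the fundamental rule in the intended direction: everything the custom schema accepts is also accepted by the full schema, while the reverse does not hold.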

Please also note that a redefinition requires the redefining schema to have the same target namespace as the original one, yet it must still be possible to distinguish between structures defined in the base schema and structures defined in the custom one. This is achieved by associating two different prefixes with the same namespace, i.e. no prefix (xmlns="http://www.akomantoso.org/2.0") and the prefix an: (xmlns:an="http://www.akomantoso.org/2.0").

Custom attributes    

While being very rigid about new elements, the Akoma Ntoso schema is much more flexible about custom attributes to existing elements. The rule is that it is possible to create new attributes and assign them to any of the existing elements, as long as these attributes are assigned to a different namespace than Akoma Ntoso.

XML Schema requires that each XSD file is associated with only one namespace (its target namespace), so the new attributes must be defined in a schema file other than the one defining Akoma Ntoso. This means that we need two separate schemas if we edit the Akoma Ntoso schema directly, and three files if we redefine the schema: the original Akoma Ntoso schema, the schema containing the new attributes, and the pivot schema that redefines the Akoma Ntoso structures.

Namely, we first need to define the new attributes in a separate schema file associated with a different target namespace, as in the following example:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.site.com/10" targetNamespace="http://www.site.com/10">
    <xsd:attribute name="myAttribute" type="xsd:string"/>
</xsd:schema>

 

Then we need to import the external schema file and use its defined structures. The following is the solution based on redefining the schema:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
            xmlns:an="http://www.akomantoso.org/2.0"
            xmlns:x="http://www.site.com/10"
            xmlns="http://www.akomantoso.org/2.0"
            targetNamespace="http://www.akomantoso.org/2.0">
    <xsd:import namespace="http://www.site.com/10" schemaLocation="./site.xsd"/>
    <xsd:redefine schemaLocation="./akomantoso20.xsd">
        <xsd:attributeGroup name="corereq">
            <xsd:attributeGroup ref="an:corereq"/>
            <xsd:attribute ref="x:myAttribute"/>
        </xsd:attributeGroup>
    </xsd:redefine>
</xsd:schema>

 

The file containing the new attributes is first imported with the <xsd:import> element, then the redefinition of the Akoma Ntoso schema takes place, where an attribute group (in our case, corereq) is redefined by adding the new attribute x:myAttribute to the previously specified definition (an:corereq).

Although XML Schema would allow custom attributes to be specified in the Akoma Ntoso namespace, this would violate the rules of the Akoma Ntoso language and should not be done.
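As a sketch of what a document using this customization might look like, the fragment below attaches the new attribute to a section. It assumes that section is among the elements carrying the corereq attribute group, and that the prefix x is bound to the namespace imported by the redefining schema; the attribute value and the id values are invented.

<!-- The custom attribute lives in its own namespace, bound here to the prefix x -->
<section id="sec1" x:myAttribute="reviewed"
         xmlns:x="http://www.site.com/10">
    <num>1.</num>
    <heading>Scope</heading>
    <content id="sec1-cnt">
        <p>This Act applies to all public bodies.</p>
    </content>
</section>

Since myAttribute is declared as a global attribute in a schema with its own target namespace, it has to appear in documents with a namespace prefix, which keeps it visibly separate from the Akoma Ntoso vocabulary.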

Custom elements

As shown in the previous section, defining new attributes is easy, since attributes place no requirements on the sequence and organization of the text content of the document. New elements, on the other hand, are a completely different problem, because Akoma Ntoso expects a specific sequence of containment when dealing with the actual text content of a document. Custom elements, therefore, can only be introduced indirectly, either as generic elements or within special contexts, i.e. those whose type is defined as anyOtherType, a list which includes only one content-oriented element (i.e., foreign) and several metadata elements, including proprietary, presentation, preservation, and otherAnalysis.

With regard to generic elements, a likely customization need is to constrain the allowed names, e.g. to verify that only some values for the name attribute are used. As Akoma Ntoso stands now, literally any string can be used as the name of a generic element. If it is important that only a specific new element is used, then the name attribute of the corresponding generic element must be limited to that name only.

For instance, the following is a Schematron fragment verifying that only subsubsection is used as a value for the name attribute of the hcontainer generic element.

<sch:ns prefix="an" uri="http://www.akomantoso.org/2.0"/>
<sch:pattern id="Subsubsection">
    <sch:rule context="an:hcontainer">
        <sch:report test="@name!='subsubsection'">
            The only generic hierarchical container allowed is 'subsubsection'
        </sch:report>
    </sch:rule>
</sch:pattern>

 

As shown, the Akoma Ntoso namespace is first declared, and then a pattern (i.e., a named set of associated rules activated together for a specific validation need) is created with just one rule, evaluated whenever an instance of the element hcontainer is encountered. The rule reports an error message every time the name attribute has a value other than 'subsubsection'.
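To make the effect of the rule concrete, here is a sketch of two hcontainer instances. It assumes the fragments sit inside a document whose default namespace is the Akoma Ntoso 2.0 namespace; the id values, the name 'subpart' and the text are invented for the example.

<!-- No report: the name attribute has the only permitted value -->
<hcontainer id="sec1-ssec1" name="subsubsection">
    <content id="sec1-ssec1-cnt">
        <p>Text of the subsubsection.</p>
    </content>
</hcontainer>

<!-- Reported: 'subpart' differs from 'subsubsection', so the test
     @name!='subsubsection' is true and the message is emitted -->
<hcontainer id="sec1-spt1" name="subpart">
    <content id="sec1-spt1-cnt">
        <p>Text of the subpart.</p>
    </content>
</hcontainer>

Both fragments are equally acceptable to the full Akoma Ntoso schema, which places no restriction on the value of name; only the Schematron rule tells them apart.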

Similarly, the type anyOtherType allows any structure and any element, as long as it uses a different namespace than Akoma Ntoso's. If it is important that only some vocabularies are used, and not just anything (for instance, if we want to restrict the use of foreign to only mathematical formulae expressed in MathML), we need to specify a restriction of the extension. The simplest solution is to provide a Schematron rule, since restricting the anyOtherType in a redefined schema will affect the allowed values in all elements of that type and not only foreign. The following is a suitable Schematron fragment:

<sch:pattern id="MathML">
    <sch:rule context="an:foreign">
        <!-- Fails if any child element is outside the MathML namespace -->
        <sch:assert
            test="not(*[namespace-uri()!='http://www.w3.org/1998/Math/MathML'])">
            The only allowed content for foreign is MathML
        </sch:assert>
    </sch:rule>
</sch:pattern>

 

In this fragment, we simply require every element within the foreign element (the context of the rule) to have http://www.w3.org/1998/Math/MathML as its namespace.
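As a sketch of the documents this rule is meant to separate, consider the two foreign elements below. The prefixes m and svg are arbitrary choices made for the example, attributes on foreign are omitted, and the formula itself is invented.

<!-- Assertion holds: the only child of foreign is in the MathML namespace -->
<foreign>
    <m:math xmlns:m="http://www.w3.org/1998/Math/MathML">
        <m:mrow>
            <m:mi>x</m:mi>
            <m:mo>=</m:mo>
            <m:mn>1</m:mn>
        </m:mrow>
    </m:math>
</foreign>

<!-- Assertion fails: the only child of foreign is in the SVG namespace -->
<foreign>
    <svg:svg xmlns:svg="http://www.w3.org/2000/svg"/>
</foreign>

Both fragments satisfy anyOtherType in the full schema, since neither child belongs to the Akoma Ntoso namespace; the Schematron assertion is what narrows the choice down to MathML.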