SGML: ICADD in ISO 12083

ISO 12083 ANNEX A.8  [Informative]

Facilities for Braille, Large Print and Computer Voice

1.0 Introduction

The International Committee for Accessible Document Design ("ICADD") has
published guidelines for designing SGML applications which enable the
preparation of texts for near-automatic conversion to Grade 2 Braille and
for publication both in large print and computer voice editions.  This
International Standard follows these guidelines.

By introducing fixed "SGML Document Access" (SDA) attributes, users of any
DTD, including those which comprise this International Standard, can
create documents which are easily mapped into documents conforming to the
ICADD DTD. Documents that conform to the ICADD DTD can be readily
translated into Braille.  Each element has an SDA attribute which
indicates how and onto which elements of the ICADD tagset it should be
mapped.

2.0 Mapping to the Base Tag Set

A small set of "canonical" elements has been created to support the basic
output formats available in Braille. They are:

ANCHOR     Mark Spot on a Page
AU         Author(s)
B          Bold Emphasized Text
BOOK       Highest Level Element for Document
BOX        Boxed or Sidebar Information
BQ         Block Quotation
FIG        Figure Title and Description
FN         Footnote
H1         Major Level Heading within Book
H2         Second Level Heading
H3         Third Level Heading or BOX Heading
H4         Fourth Level Heading
H5         Fifth Level Heading
H6         Sixth Level Heading
IPP        Page Number of Ink Print Page
IT         Italic Emphasized Text
LANG       Language Indicator
LHEAD      List Heading
LIST       List of Items
LIT        Literal or Computer Text
LITEM      List Item
NOTE       Note in Text
OTHER      Other Emphasized Text
PARA       Paragraph
PP         Print Page Reference
TERM       Term or Keyword
TI         Title of the Book
XREF       Cross Reference

An optional set of canonical elements has been created to support the
creation of tables which may be used for Braille, large type and computer
voice. They are:

TABLE      The highest level element, which will include at least one TGROUP
TGROUP     Allows repeated combinations of the next three elements to appear
           within one table
THEAD      Table Header
TBODY      Table Body
TFOOT      Table Footer
COLDEF     Column Definition (which carries necessary attributes for the
           column information)
HDROW      Row in a Header
HDCELL     Cell in a Header
ROW        Row in the Table Body
STUBCELL   The Non-Data Carrying Stub Cell of a Row
SSTCELL    A Sub-Stub Cell in a Row (usually with different indent)
CELL       Table Cell
SHORTXT    Short Text Element provides alternative text for a stub cell
           or head cell for voice representation or for a cell reference to
           longer text carried in the NOTE in a Braille table
NOTE       Text extracted from Braille table cells in order to allow the
           narrowest possible column widths in the table body

2.1 One-to-One Mappings

In the book DTD of this International Standard, a chapter is defined as

<!ELEMENT  chapter            (no?, title, (%s.zz;)*, section*) >

Neither the chapter element nor any of its subelements appears in the
ICADD DTD. By using the SGML keyword "FIXED", a specific attribute value
is associated with every appearance of the related element.

Take the title element as an example. Depending on where this element
appears, it may mean different things. In the book's front material, it
will be used to indicate the book's title. We "fix" the SDAFORM attribute
for title so that it cannot be changed in the document instance.

<!ATTLIST title      SDAFORM        CDATA    #FIXED  "ti">

This indicates that wherever it is used, <title> stands for a <ti> in the
SDA tagset.

2.2 Simple Context-Sensitive Mappings

The ICADD technique includes both a simple mechanism for simple contextual
mappings, and a more complex one for those situations in which the mapping
may be dependent on the fulfillment of more than one condition in the
element's ancestry.

<!ATTLIST  chap       SDARULE   CDATA      #FIXED    "title h2" >
<!ATTLIST  sec        SDARULE   CDATA      #FIXED    "title h3" >

The attribute SDARULE always takes an even number of arguments, read as
pairs: the name of a source element followed by the ICADD element onto
which it maps. Within each ATTLIST, one can declare any number of such
pairs. In this example, we are defining the rules which apply while we are
in chapter and section elements: within a chapter, title maps to h2, and
within a section, title maps to h3.
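
As an illustration only, the pairing and lookup just described might be
sketched in Python as follows. The function names parse_sdarule and
map_element, and the element names "book" and "front", are hypothetical
and are not defined by ICADD or by this International Standard. The sketch
handles only simple pairs (not the #use form of 2.3) and assumes that the
rule on the nearest enclosing element takes precedence.

```python
def parse_sdarule(value):
    """Split a simple SDARULE value such as "title h2" into {source: target}."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("SDARULE takes an even number of arguments")
    return dict(zip(tokens[0::2], tokens[1::2]))


def map_element(name, ancestors, rules, forms):
    """Return the ICADD element for `name`, or None if no mapping is declared.

    ancestors: names of the open elements, outermost first
    rules:     {element name: parsed SDARULE pairs}
    forms:     {element name: SDAFORM value}
    """
    # Assumption: the rule on the nearest enclosing element wins.
    for ancestor in reversed(ancestors):
        if name in rules.get(ancestor, {}):
            return rules[ancestor][name]
    # Otherwise fall back to the element's own SDAFORM, if it has one.
    return forms.get(name)


rules = {
    "chap": parse_sdarule("title h2"),
    "sec": parse_sdarule("title h3"),
}
forms = {"title": "ti"}

print(map_element("title", ["book", "chap"], rules, forms))         # h2
print(map_element("title", ["book", "chap", "sec"], rules, forms))  # h3
print(map_element("title", ["book", "front"], rules, forms))        # ti
```

A result of None corresponds to an element with no declared mapping, whose
start- and end-tags the transformation process discards.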

Any element may have no declared mapping (that is, no fixed SDAFORM or
SDARULE attributes). For each such element, the transformation process
must discard both its start- and end-tags. A typical case where this must
occur is with "containing elements", those which do not contain character
content of their own, but which mark structural boundaries, and contain
only other elements.

Note that very useful mappings can be constructed using only the SDAFORM
and simple SDARULE attributes described above.

2.3 Complex Context-Sensitive Mappings

A more complex example allows one to set a stack of open rules, where the
rule closest to the current context overrides rules higher in the element
ancestry (or previous in the stack).  For example, in a content model
where chapter could occur at multiple levels in the document, we need to
be able to specify different mappings for title depending on whether
chapter is in a part or not:

<!ELEMENT       body        (part+ | chapter+)  >
<!ELEMENT       part        (no?, title, (%s.zz;)*, chapter* )      >
<!ELEMENT       chapter     (no?, title, (%s.zz;)*, section*)      >

We must recognize that if a title appears in a part it maps to h1, if it
appears in a chapter within a part, it maps to an h2, but if the chapter
is not in a part, the title maps to an h1.  We do this by establishing
rules within chapter that set the two mapping conditions carried in attributes
associated with the element immediately affected by the rules of its
ancestry:

<!ATTLIST chapter      SDABDY      NAMES      #FIXED     "title h1"
                       SDAPART     NAMES      #FIXED     "title h2" >

Use of the rules is established in the attributes of the elements which
may appear in the stack:

<!ATTLIST body   SDARULE    CDATA    #FIXED   "chapter #use SDABDY ">
<!ATTLIST part   SDARULE    CDATA    #FIXED   "title h1
                                               chapter #use SDAPART">

As the transformation software encounters the attributes it sets the
stack: Chapter will use the SDABDY rule unless the part start-tag appears
first and resets the stack to SDAPART.

2.4 Generated Text

2.4.1 Character Text

The third type of ICADD attribute covers situations in which the name of
an SGML element (its generic identifier) carries useful information which
would be lost if the original element were transformed into an SDAFORM
which carries only the information needed for presentation.  (Often this
is the kind of content that would be generated by a formatter or
typesetting programme.)

<!ATTLIST abstract   SDAPREF   CDATA    #FIXED  "<h1>Abstract</h1>">

The SDAPREF attribute carries "generated text," words to be produced by
the translator software as a string to be substituted for the <abstract>
start-tag. Generated text can also be associated with the end-tag, using
the SDASUFF attribute, and appears immediately before it.  Note that
generated text may contain markup.

When SDAPREF or SDASUFF attributes are used without an SDAFORM attribute,
the result is effectively the simple replacement of either or both of the
source start- and end-tags by generated text. A basic example:

<!ATTLIST quote      SDAPREF   CDATA    #FIXED        '"'
                     SDASUFF   CDATA    #FIXED        '"'>

2.4.2 Consecutive Numbering

Two types of numbering are needed in a typical document.

In the first type, elements are numbered consecutively throughout. This is
supported by the following technique:

The SDAPREF attributes also allow the specification of automatic
numbering.  One may associate an automatically incremented value with an
element and may also access that value with an SDAPREF attribute for one
of that element's subelements.

The keyword #count causes the expression following to be interpreted. The
expression itself appears within parentheses and may be mixed in with
fixed text to be generated. The expression follows the form:

        #COUNT(element, format)

and specifies both the element which is being established as a counter and
the format of the counter. One needs to specify the affected element since
counters are sometimes associated with a specific element, and sometimes
with its parent or child elements.

        I   specifies uppercase Roman numerals: I, II, III, IV, ...
        i   specifies lowercase Roman numerals: i, ii, iii, iv, ...
        1   specifies Arabic numbers: 1, 2, 3, 4, ...
        A   specifies uppercase alphabetic: A, B, C, D, ...
        a   specifies lowercase alphabetic: a, b, c, d, ...
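
The five counter formats can be illustrated with a short Python sketch.
This is not part of the ICADD guidelines: the function names are
hypothetical, and the behaviour of the alphabetic formats beyond the 26th
value (continuing with AA, AB, ...) is an assumption, since the text does
not specify it.

```python
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Uppercase Roman numeral for a positive integer."""
    parts = []
    for value, symbol in ROMAN:
        while n >= value:
            parts.append(symbol)
            n -= value
    return "".join(parts)


def to_alpha(n):
    """A, B, ..., Z for 1..26; AA, AB, ... afterwards (an assumption)."""
    letters = ""
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def format_counter(value, fmt):
    """Render a counter value in one of the formats I, i, 1, A, a."""
    if fmt == "1":
        return str(value)
    if fmt in ("I", "i"):
        roman = to_roman(value)
        return roman if fmt == "I" else roman.lower()
    if fmt in ("A", "a"):
        alpha = to_alpha(value)
        return alpha if fmt == "A" else alpha.lower()
    raise ValueError("unknown counter format: " + repr(fmt))


print(format_counter(4, "I"))   # IV
print(format_counter(3, "A"))   # C
print(format_counter(4, "a"))   # d
```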

By default, the numbering starts at '1' (or the equivalent in the other
formats). A counter can be initialized with an expression of the form:


as in

<!ATTLIST app     SDAPREF CDATA #FIXED "#count (app, A=3)" >

which indicates that the appendix numbering (where app is the element name
for appendix) is upper case alphabetic and starts with the letter C. Note
that this example includes no other generated text and would simply print
the letter C in place of the app start-tag.

2.4.3 Numbering with Reset

The second type of numbering is one whose counter needs to be reset under
various conditions, particularly when a higher-level element changes.
Often the value of a counter will be used in the SDAPREF attribute of any
of an element's subelements. For example:

<!ATTLIST sec     SDAPREF CDATA #FIXED "Section #count (sec, A)" >
<!ATTLIST subsec  SDAPREF CDATA #FIXED "Subsection #count (sec,
                                         A).#count(subsec,a)" >

When the counter for a parent element changes (in this case SEC), the
counter for the subelement is automatically reset to '1'.

The example above would generate
        Section A when a SEC element first appeared, and
        Subsection A.a when the first SUBSEC element appears.

The following variation allows the format of the counter to be different
from that of a parent element:

        "Figure #count(sec, 1).#count(figure, 1)"   >

which will generate:
          Figure 1.1, 1.2, 1.3 ...
and so on, until the next section,
even if the sections are numbered A, B, C.

An exclamation mark in the counter format supports the case in which the
counter should not be reset when the parent changes. A typical example
might be figure numbers which are consecutive throughout a book but which
incorporate the chapter or section number as well:

        "Figure #count(sec, 1!).#count(figure, 1)"   >

which will generate:
          Figure 1.1, 1.2, 1.3
in section A, and
        Figure 2.4, 2.5, 2.6 in section B.
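
The reset behaviour can be sketched in Python as shown here. The class
Counters and its parameters parent_of and no_reset are hypothetical names
chosen for illustration; in particular, the no_reset set stands in for the
effect of the exclamation mark and does not model the #count syntax
itself.

```python
class Counters:
    """Counters keyed by element name, reset when a parent counter changes."""

    def __init__(self, parent_of, no_reset=()):
        self.parent_of = dict(parent_of)  # child element -> parent element
        self.no_reset = set(no_reset)     # children that keep counting
        self.values = {}

    def step(self, element):
        """Increment the counter for `element` and reset its dependants."""
        self.values[element] = self.values.get(element, 0) + 1
        self._reset_children(element)
        return self.values[element]

    def _reset_children(self, element):
        for child, parent in self.parent_of.items():
            if parent == element and child not in self.no_reset:
                self.values[child] = 0
                self._reset_children(child)


# Default behaviour: subsec restarts whenever a new sec begins.
plain = Counters({"subsec": "sec"})
plain.step("sec")
plain.step("subsec")
plain.step("subsec")
plain.step("sec")
print(plain.step("subsec"))   # 1

# No-reset behaviour: figures run consecutively across sections.
running = Counters({"figure": "sec"}, no_reset={"figure"})
running.step("sec")
for _ in range(3):
    running.step("figure")
running.step("sec")
print(running.values["sec"], running.step("figure"))   # 2 4
```

The second case reproduces the "Figure 2.4" of the example: the section
counter has advanced to 2 while the figure counter has continued to 4.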

Under certain conditions, a counter needs to be reset even though the
parent has no counter or its counter doesn't appear in the current
element's generated text. The tilde character is the non-printing

A typical example:

        "#count(list ~)#count(listitem, 1). "   >

which will generate:
        1.  2.
and so on, and ensures that the listitem counter is reset each time a new
list starts.

2.4.4 Assigning SDAPREF Values in a Parent Element

The final situation covered allows one to make numbering decisions based
on a parent element even when an element may have a variety of parents:

        "#set(listitem, #count(listitem, 1)'. ')Generated text for
orderedlist."   >
        "#set(listitem,'&bullet; ')"   >
        "#use(listitem, 1). "   >

#set always takes two arguments. The first is the name of the element in
the source DTD which governs the counter. The second argument is either
the format of the counter or the content of the prefix or suffix that will
be referenced and picked up by the sub-element that needs it.  Content
other than counter formats must be set off using single quotes.  In the
example, the orderedlist element establishes that a listitem is prefaced
with a numeric counter followed by a period and a space. When the same
listitem element appears within a bulletlist, however, its prefix is a
bullet followed by a space.

The #use function may take only one argument which is the name of the
counter it should use. If there is a second argument, it is a format for a
counter which overrides that which may have been set in the parent's

The #set function never appears in text generated by the element in which
it is declared. It may, therefore, appear with other content which is to
be generated, as in the example above where "Generated text for
orderedlist." appears in place of the orderedlist start-tag.

The #set function is implied for any #count which is not explicitly set.
That is, #set is used only for complex situations in which you wish to
establish multiple possible prefix or suffix strings.

2.4.5 Notes:

To use actual angle brackets, hash marks, tildes, exclamation marks and
quotation marks -- both single and double -- in all the SDAPREF and
SDASUFF values, one should use SGML entity references, even when the
special characters are used in a place where the context might inform
their correct usage.

Note that all the capabilities available in SDAPREF are also available in
SDASUFF attributes although they are not normally used there.

2.5 Attribute Handling

On occasion it will be necessary to carry the names and/or values of
attributes through the SDA transformation process. This is accomplished
with the use of three keywords which may be employed in association with
any of the other SDA attributes.

#attlist brings forward the entire attribute list of the base element,
excluding any attribute whose name begins with SDA (or its replacement as
established by APPINFO; see below). This capability is used with SDAFORM
or SDAPREF attributes.

#attrib (xxxxx) brings forward the attribute xxxxx and its value (complete
with the equals sign and the quotation marks). This is used to isolate one
or more specific attributes from a longer list and may be used with
SDAFORM or SDAPREF attributes. That is: #attrib (xxxxx yyyyy) picks up
both the xxxxx and yyyyy attributes.

#attval (xxxxx) brings forward the value only of the attribute xxxxx. This
may be used with generated text in an SDAPREF attribute to rename an
attribute.  This keyword may also be used with more than one argument.
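
The difference between the three keywords can be sketched in Python as
follows. The function names attlist, attrib and attval simply mirror the
keywords and are hypothetical, as are the sample attribute names and
values; the output layout (name="value" pairs separated by spaces) is an
assumption.

```python
def attlist(attrs, prefix="SDA"):
    """All attributes except those whose name begins with the SDA prefix."""
    return " ".join(
        '%s="%s"' % (name, value)
        for name, value in attrs.items()
        if not name.upper().startswith(prefix.upper())
    )


def attrib(attrs, *names):
    """The named attributes, each with equals sign and quotation marks."""
    return " ".join('%s="%s"' % (name, attrs[name]) for name in names)


def attval(attrs, *names):
    """Only the values of the named attributes."""
    return " ".join(attrs[name] for name in names)


# Hypothetical attributes of a source element.
attrs = {"ID": "fig1", "SDAFORM": "fig", "SCALE": "50"}

print(attlist(attrs))        # ID="fig1" SCALE="50"
print(attrib(attrs, "ID"))   # ID="fig1"
print(attval(attrs, "ID"))   # fig1
```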

Two examples:
<!ATTLIST graphic    SDAFORM   CDATA    #FIXED  "fig #attlist">
                                "<h1 #attrib (ID)>Abstract</h1>">

2.6 A Basic Location Model

There are several classes of source DTD hierarchical structures which are
not well served by the techniques described earlier in this Annex. Most
important of those, by virtue of its use in a variety of existing DTDs, is
the requirement to allow for the mapping of elements within a recursively
nesting element.

For example, the following case

<!ELEMENT               sec     (title, (sec+|para+))   >

<sec><title>Level One Title</title>
        <sec><title>Level Two Title</title>
                <sec><title>Level Three Title</title>

can easily create a structure in which the first title element must be
mapped to the SDA h1, the second title must be mapped to an h2, and the
third title element must be mapped to the SDA h3.

The #use construct described above deals with a variety of structures, but
not with placement of an element within its tree or with respect to its
subelements. For that reason, the committee has developed a small
"location model" language to describe a set of standard conditions.

The syntax for these conditions involves use of ">", square brackets and
parentheses.  This was adopted because ">" is very unlikely to be a
character allowed in an element name. Except for the use of the rare SGML
feature CONCUR, the same is true of "(" and ")". The square brackets group
together the location model in order to allow non-significant white space
to occur.

The location model works exactly the same way as SDARULE except that the
first argument, which occurs within square brackets, may represent a
complex set of conditions which must be fulfilled for the mapping to
occur. The location model may also appear within SDAPREF and SDASUFF
to create context-sensitive generated text.

[chap>>p>>emph] means "the current element and its ancestry matches the
pattern chap containing a p containing an emph". It is not necessary to put
the current element into this pattern if emph is to contain it, but not
necessarily immediately. You can put in the current element either by name
or, sometimes more usefully, by the special symbol #CE.

[chap>p>emph>#CE]  means "the current element and its ancestry matches the
pattern chap immediately containing p immediately containing emph
immediately containing the current element". ">>" and ">" can be mixed as
needed.

[(chap|sec)>>p] means "either a chap or a sec containing a p".

[chap >> p ID=AC555 >> emph] indicates that the transformation is to take
place only if the specified attribute value matches. The simpler case
of [emph type=2] or [#CE type=2] demonstrates the checking of an
attribute value for the current element.

Alternative values are allowed for attributes in location models.  Thus:
[chap>> p ID=(A|B|C) >> emph] means match a chap containing a
p with an ID attribute equal to A, B or C, containing an emph.

Accordingly, for the nested sec example described above, the following
attribute declarations would handle the mappings:

                        "title h1
                        [sec>>title] h2
                        [sec>>sec>>title] h3
                        [sec>>sec>>sec>>title] h4"
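
The ">" and ">>" matching described in this clause can be sketched in
Python as follows. This is an illustration under stated assumptions, not
the committee's algorithm: the function names are hypothetical, attribute
tests such as ID=AC555 are not supported, and a pattern whose last name is
not the current element is treated as if it ended in ">> #CE".

```python
import re


def parse_location(model):
    """Split "[chap>>p>emph]" into a list of (names, separator_to_next)."""
    body = model.strip()[1:-1]                     # drop the square brackets
    tokens = [t.strip() for t in re.split(r"(>>|>)", body)]
    steps = []
    for i in range(0, len(tokens), 2):
        names = {n.strip() for n in tokens[i].strip("()").split("|")}
        separator = tokens[i + 1] if i + 1 < len(tokens) else None
        steps.append((names, separator))
    return steps


def matches(model, path):
    """path lists the open element names, outermost first, current last."""
    current = path[-1]
    steps = [({current if n == "#CE" else n for n in names}, sep)
             for names, sep in parse_location(model)]
    # Assumption: a pattern that does not name the current element last
    # is read as ending in ">> #CE".
    if current not in steps[-1][0]:
        steps[-1] = (steps[-1][0], ">>")
        steps.append(({current}, None))

    def fit(i, pos):
        # Step i sits at path[pos]; try to place steps i-1 .. 0 above it.
        if i == 0:
            return True
        names, separator = steps[i - 1]
        if separator == ">":
            candidates = [pos - 1] if pos > 0 else []   # immediate parent
        else:
            candidates = range(pos - 1, -1, -1)         # any ancestor
        return any(path[j] in names and fit(i - 1, j) for j in candidates)

    return fit(len(steps) - 1, len(path) - 1)


print(matches("[chap>>p>>emph]", ["book", "chap", "sec", "p", "emph"]))   # True
print(matches("[chap>p>emph>#CE]", ["chap", "sec", "p", "emph", "b"]))    # False
print(matches("[(chap|sec)>>p]", ["book", "sec", "p"]))                   # True
```

The search backtracks over candidate ancestors so that a mixture of ">"
and ">>" is handled even when the nearest matching ancestor is not the one
that satisfies the rest of the pattern.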

2.8 Braille Transcriber's Notes

Certain transformations will always require the intervention of an
experienced Braille transcriber. Often these can be predicted: one knows
that in a DTD with the potential for complex tables, or one which supports
the inclusion of graphics, the Braillist should be alerted to either
proofread or create the required content and markup.

In the case of graphics, for example, a sighted person will have to
describe the image. It would be useful to have the transformation process
place a marker in the text at each point where one knows in advance that
such work will be necessary.

The ICADD technique recommends the consistent use of a processing
instruction as just such a marker. The marker is placed by declaring an
SDAPREF attribute at the highest level of the relevant element/sub-element
group. For example, a marker should be put on a deflist element rather
than on a dd or ddhd:

<!ATTLIST deflist   SDAFORM         CDATA    #FIXED   "list"
                    SDAPREF   CDATA    #FIXED   "<?SDATRANS>Definitions" >

2.9 Support for Mathematics and Other Special SGML Notations

Documents encoded in SGML and containing specialized markup for fields
such as mathematics and chemistry need particular, non-automatable
handling for presentation to the visually impaired.

For production of Braille, large print and synthesized voice, there
is one simple rule: The specialized markup in the original file must
be preserved as it represents the greatest likely source of information for
the domain expert who will prepare the files for production. (Note
that the ICADD transformation techniques normally discard all source
markup for which there is no mapping declared in the SDA attributes.)

Two ICADD techniques apply to this work.

2.9.1 Suspending the Transformation Process

A rarely used special attribute SDASUSP, intended to allow the suspension
of the ICADD transformation process, may appear in the attribute list of
any element.

With the argument "SUSPEND", it will terminate the normal ICADD
transformation for the entire contents of the current element, allowing
the source markup to survive the transformation and appear in the output
file. For example, in this International Standard, this technique is used
to preserve mathematical markup. Note that ICADD transformation software
should not process any element or entity which is declared as containing
non-SGML data.

The argument "RESUME" resumes transformation for the contents of the
current element. This argument, which is used only within an element
within which transformation has been suspended, allows the nested element
to be transformed while the parent is not.

The argument "DISCARD" allows the transformation process to take
input from any DTD and remove it -- including both markup and
content -- from the output file. This allows material which is
included in the document instance but does not appear in print
to be discarded rather than turned into Braille, large print
or voice content.

2.9.2 Attribute to Carry Semantic Markup

Since there is no consensus on how to describe the semantics of formulas,
the mathematics DTD included with this International Standard describes
only their presentational or visual structure. Since, however, there is a
strong need for such description (especially within the print-disabled
community), the following parameter entity declaration should be added
where there is a requirement for a consistent, standardized mechanism to
carry semantic meanings for the SGML elements:

<!ENTITY % SDAMAP       "SDAMAP   NAME    #IMPLIED"           >

The attribute represented by %SDAMAP; is to be used for
all elements which may require a semantic association, or, in the simpler
case, be added to all elements in a mathematical or similar DTD
with requirements for specialized handling.

2.10 Support for Multiple Languages

The techniques described above for SDAPREF and SDASUFF are based on the
premise that it makes sense to incorporate text directly into the DTD that
will become part of the input stream to a Braille translation process.
This, in turn, assumes that the text of the marked-up file and the text of
the DTD will be the same -- and more importantly, will remain the same.

In fact, this cannot be safely assumed. SGML is part of an active,
international community in the forefront of reusing information across
many borders and boundaries. An additional technique is needed to ensure
the separability of the generated text from the remaining work that goes
into making a DTD ICADD-enabled.

The ICADD committee suggests removing the specific contents of all
generated text from the attribute declarations and defining them
indirectly as SGML entities which are gathered in a set of declarations,
which may exist either in an external file or within the DTD. Both
mechanisms allow users to switch easily between different languages.

The following example illustrates the use of the external file. In the DTD
is a reference to a local (system) or public entity set:


Wherever SDAPREF and SDASUFF attributes were used, instead of the form
described elsewhere in this document, one would include:

<!ATTLIST email SDAPREF ENTITY #FIXED "sdaemail" >

Notice that SDAPREF is declared here as a FIXED attribute of type ENTITY
instead of the CDATA type used elsewhere in this Annex.

In the SDAGEN entities file, we might find for instance:

<!ENTITY sdaemail SDATA "Electronic Mail Address: ">

When it becomes appropriate to re-run the transformation process for a
second language, the entity reference should be re-declared to refer to a
second SDAGEN file, in the desired second language, edited locally by a
translator (and not necessarily the DTD creator). This process is repeated
for as many languages as are needed.

(Note that one could do some fancy renaming of the external entity files
so that multiple files exist but, as needed, a copy is made which is
temporarily called by the name embedded within the DTD. That way the DTD
doesn't have to have the file name re-declared each time it is used to
establish the mapping transformations for the new language.)

For any DTD which would be used in a variety of countries, this approach
means that one defines one common entity file (presumably in the base
language of the DTD) for the generated text which appears in all the
SDAPREF and SDASUFF attributes declared throughout the DTD.

There is a disadvantage to this approach in that the person creating the
ICADD-enabled DTD always needs to include (at least) one separate entity
file.  Accordingly, there is a slight risk of the two files becoming
separated or out-of-synchronization. However, in the committee's opinion,
this solution is better than the most obvious alternative: having to deal
with multiple versions of the same DTD whose only differences are that
they contain generated text attribute values in different languages.

There is a second approach which has the advantage of maintaining all the
content in one file and the disadvantage of creating a slightly more
cluttered DTD. One would decide which approach to use primarily based on
whether one wants to have the DTD seem to be unchanging -- only the
external file changes -- or whether one is more concerned about keeping
everything needed for the transformation in one file.

The second technique involves the use, within the DTD, of marked section
parameter entities for each language.

The example shows the principle:

<!ATTLIST email SDAPREF ENTITY              #FIXED    "sdaemail"
                COLOUR  (brown|green|blue)  #IMPLIED
                TYPE    CDATA               #REQUIRED >

<!ENTITY % dutch        "IGNORE" >
<!ENTITY % french       "IGNORE" >
<!ENTITY % english      "INCLUDE" >

<![%dutch; [
<!ENTITY sdaemail CDATA "<para>Adres voor elektronische post:</para> "> ]]>

<![%english; [
<!ENTITY sdaemail CDATA "<para>Electronic Mail Address:</para> "> ]]>

<![%french; [
<!ENTITY sdaemail CDATA "<para>Adresse electronique:</para> "> ]]>

Here one declares a marked section parameter entity in the DTD for each
relevant language and sets all languages to IGNORE except for the current
one. The text -- as well as anything else which may appear in the
generated text attribute values, including context and markup -- appears
within the appropriately marked up marked section once for each language.
Notice that non-ICADD declarations appear intermingled with the SDA
attributes but that only the SDA declarations must use the declared
entities.

From a practical point of view, the technique is quite easy to administer.
All SDAPREF and SDASUFF values are declared as entities and given a unique
name. Each is declared within a set of entity declarations gathered
together for convenience, perhaps at the end of the DTD, within the
appropriate marked section. That entire list is copied over and over, once
for each language, and only the language-sensitive words are translated.
The additional characters are left precisely as they are to ensure
identical handling by the transformation process.

Note that either of these mechanisms results in a set of declared entities
that will also now be valid elsewhere in documents conforming to the DTD.
This means that if users attempt to declare entities which, by
coincidence, have the same name, they will over-ride the declarations in
the DTD's entity set. The committee recommends naming all such entities so
that they begin with the letters "sda".

These techniques also mean that authors working with any of a number of
common SGML editing tools will likely be offered dialog box pick lists of
entity references -- and these lists will include the entity references
that are intended only for internal use within the DTD.  In theory, an
author could insert them anywhere in the document.

3.0 SDA Parameter Entities

The DTDs in this International Standard contain the following parameter
entities (whose use is encouraged by others implementing the ICADD
accessible document techniques):

<!--    Accessible Document Parameter Entities    -->

4.0 Handling of Special Characters as Entity References

Under most circumstances, an SGML parser will be used to transform a
source SGML file into one marked up for Braille, large print or computer
voice. That parser will normally transform all entity references into the
content that has been defined for them. At that point, their value to the
ongoing process vanishes; they will have been converted to machine
specific or software specific codes.

For ICADD purposes, it is critical that they remain "unexpanded" so they
are still computer and software independent when they reach Braille or
other ICADD software.

Accordingly, all entity references used with the ICADD-enabling techniques
must be declared as being of the type CDATA or SDATA. This will ensure
they pass through the SGML parser unchanged.

The ICADD-enabled version of a typical SGML entity declaration:

<!ENTITY ntilde SDATA "&ntilde;">

5.0 Indicating ICADD Usage in the SGML Declaration

A document indicates conformance to the ICADD SGML Document Accessibility
architecture in the APPINFO parameter of the SGML declaration, which
specifies the characters in the "SDA prefix" that identifies attributes
that represent "SDA declarations".

SDA declaration facilities are provided by attributes described in this
Annex.

Conformance of a document to SDA is indicated by a parameter of the
APPINFO parameter of the SGML declaration. Its format is:


The parameter can also specify the name of the "SDA prefix" if it is other
than "SDA".

The format is:


where the second "SDA" is replaced by the new prefix name.  A new name
must be a valid name in the concrete syntax of the SGML declaration.