A reference to a dictionary entry.

Elements in data instances such as scalar may have a dictRef attribute to point to an entry in a dictionary. To avoid excessive use of (mutable) filenames and URIs we recommend a namespace prefix, mapped to a namespace URI in the normal manner. In this case, of course, the namespace URI must point to a real XML document containing entry elements and validated against STMML Schema.

Where there is concern about the dictionary becoming separated from the document the dictionary entries can be physically included as part of the data instance and the normal XPointer addressing mechanism can be used.

This attribute can also be used on dictionary elements to define the namespace prefix.

<stmml title="dictRef example">

<scalar dataType="xsd:float" title="surfaceArea" 
  dictRef="cmlPhys:surfArea" 
  xmlns:cmlPhys="http://www.xml-cml.org/dict/physical"
  units="units:cm2">50</scalar>
  </stmml>

<stmml title="dictRef example 2">
<stm:list xmlns:stm="http://www.xml-cml.org/schema/stmml">
  <stm:observation>
    <p>We observed <object count="3" dictRef="foo:p1"/> 
      constructing dwellings of different material</p>
  </stm:observation>
  <stm:entry id="p1" term="pig">
    <stm:definition>A domesticated animal.</stm:definition>
    <stm:description>Predators include wolves</stm:description>
    <stm:description class="scientificName">Sus scrofa</stm:description>
  </stm:entry>
</stm:list>
</stmml>
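A minimal Python sketch of how software might resolve a dictRef value to the namespace URI of its dictionary, assuming the prefix is declared with an ordinary xmlns attribute that is in scope. The function name resolve_dict_refs is hypothetical and not part of STMML. Retrieving the dictionary document and locating the entry are left out.

```python
import io
import xml.etree.ElementTree as ET


def resolve_dict_refs(xml_text):
    """Return (element tag, namespace URI, entry id) for every dictRef found.

    Hypothetical helper: the namespace URI is None when the prefix is
    not declared in scope.
    """
    in_scope = []   # stack of (prefix, uri) declarations currently open
    found = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "start", "end-ns")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            in_scope.append(payload)          # payload is (prefix, uri)
        elif event == "end-ns":
            in_scope.pop()
        else:                                 # "start": payload is the element
            ref = payload.get("dictRef")
            if ref is None or ":" not in ref:
                continue
            prefix, entry_id = ref.split(":", 1)
            # Innermost declaration of the prefix wins.
            uri = next((u for p, u in reversed(in_scope) if p == prefix), None)
            found.append((payload.tag, uri, entry_id))
    return found


example = (
    '<scalar dataType="xsd:float" title="surfaceArea" '
    'dictRef="cmlPhys:surfArea" '
    'xmlns:cmlPhys="http://www.xml-cml.org/dict/physical" '
    'units="units:cm2">50</scalar>'
)
print(resolve_dict_refs(example))
```

The namespace declarations are tracked by hand because ElementTree does not keep prefix mappings on parsed elements, and dictRef is an attribute value that the XML parser itself never resolves.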

An attribute providing a unique ID for an element.

An attribute providing a mandatory unique ID for an element.

This is a horrible hack. It should be possible to add 'required' to the attributeGroup where used... (Maybe it is and I am still fighting Schema Wars)

A reference to an element of given type.

ref modifies an element into a reference to an existing element of that type within the document. This is similar to a pointer and it can be thought of as a strongly typed hyperlink. It may also be used for "subclassing" or "overriding" elements.

<stmml title="ref example">
<cml>
  <molecule id="m1">
    <atomArray>
      <atom elementType="N"/>
      <atom elementType="O"/>
    </atomArray>
  </molecule>
  <html:p>The action of <molecule ref="#m1"/> on cardiac muscle ...</html:p>
</cml>
</stmml>
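The "strongly typed" lookup can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: dereference is a hypothetical helper, a leading '#' on the ref value is treated as optional, and the html:p element of the example is replaced by a plain p so that the fragment parses without a namespace declaration.

```python
import xml.etree.ElementTree as ET

DOC = """
<cml>
  <molecule id="m1">
    <atomArray>
      <atom elementType="N"/>
      <atom elementType="O"/>
    </atomArray>
  </molecule>
  <p>The action of <molecule ref="#m1"/> on cardiac muscle ...</p>
</cml>
"""


def dereference(root, element):
    """Return the element that `element` refers to, or raise LookupError.

    Hypothetical helper: the target must have the same tag as the
    referring element (the "strongly typed" part) and a matching id.
    """
    target_id = element.get("ref", "").lstrip("#")
    for candidate in root.iter(element.tag):
        if candidate.get("id") == target_id:
            return candidate   # the same object, not a copy
    raise LookupError(f"no <{element.tag}> with id={target_id!r}")


root = ET.fromstring(DOC)
reference = root.find(".//molecule[@ref]")
target = dereference(root, reference)
print([atom.get("elementType") for atom in target.iter("atom")])
```

Because the helper returns the target element itself rather than a copy, later changes to the target are visible through the reference.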

The size of an array, matrix, list, etc.

A title on an element.

No controlled value.

<stmml title="title example">
<action title="turn on heat" start="T09:00:00" convention="xsd"/>
</stmml>

Scientific units on an element.

These must be taken from a dictionary of units. There should be some mechanism for validating the type of the units against the possible values of the element.

An array of coordinateComponents for a single coordinate.

An array of coordinateComponents for a single coordinate where these all refer to an X-coordinate (NOT x,y,z). Instances of this type will be used in array-style representation of 2-D or 3-D coordinates.

Currently no machine validation

Currently not used in STMML, but re-used by CML (see example)

<stmml title="coordinateComponentArrayType">

<cml:atomArray 
  xmlns:cml="http://www.xml-cml.org/schema/cml2/core"
  x2="1.2 2.3 4.5 6.7"/>
  </stmml>

An x/y coordinate pair.

An x/y coordinate pair consisting of two real numbers, separated by whitespace or a comma. In arrays and matrices, it may be useful to set a separate delimiter.

<stmml title="coordinate2Type example">
<list>
  <array dataType="xsd:decimal"
      >1.2,3.4   3.2,4.5   6.7,23.1 </array>
  <array delimiter="/" dataType="xsd:decimal"
      >/1.2 3.4/3.2 4.5/6.7 23.1/</array>
</list>
</stmml>
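Both encodings in the example above describe the same three pairs. A short Python sketch of how they might be read, assuming hypothetical helper names (parse_coordinate2, parse_coordinate2_array) that are not part of STMML:

```python
import re


def parse_coordinate2(text):
    """Parse one x/y pair whose parts are separated by whitespace or a comma."""
    parts = [part for part in re.split(r"[\s,]+", text.strip()) if part]
    if len(parts) != 2:
        raise ValueError(f"expected two numbers, got {text!r}")
    return float(parts[0]), float(parts[1])


def parse_coordinate2_array(content, delimiter=None):
    """Split array content into pairs.

    With the default whitespace separator each pair must be written
    with a comma and no inner spaces (e.g. "1.2,3.4"); with an explicit
    delimiter the content starts and ends with that delimiter.
    """
    if delimiter is None:
        items = content.split()
    else:
        items = content.strip()[1:-1].split(delimiter)
    return [parse_coordinate2(item) for item in items]


print(parse_coordinate2_array("1.2,3.4   3.2,4.5   6.7,23.1 "))
print(parse_coordinate2_array("/1.2 3.4/3.2 4.5/6.7 23.1/", delimiter="/"))
```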

An x/y/z coordinate triple.

An x/y/z coordinate triple consisting of three real numbers, separated by whitespace or commas. In arrays and matrices, it may be useful to set a separate delimiter.

<stmml title="coordinate3Type example">
<list>
  <array dataType="xsd:decimal">1.2,3.4,1.2   
    3.2,4.5,7.3   6.7,23.1,5.6 </array>
  <array delimiter="/" dataType="xsd:decimal"
  >/1.2 3.4 3.3/3.2 4.5 4.5/6.7 23.1 5.6/</array>
</list>
</stmml>

A count multiplier for an element.

Many elements represent objects which can occur an arbitrary number of times in a scientific context. Examples are action, object or molecules.

<stmml title="countType example">

<list>
<object title="frog" count="10"/>
<action title="step3" count="3">
  <p>Add 10 ml reagent</p>
</action>
</list>
</stmml>

An enumerated type for all builtin allowed dataTypes in STM.

dataTypeType represents an enumeration of allowed dataTypes (at present identical with those in XML Schema Part 2: Datatypes). This means that implementers should be able to use standard XML Schema-based tools for validation without major implementation problems.

It will often be used as an attribute on scalar, array or matrix elements.

<stmml title="dataType example">

<list xmlns="http://www.xml-cml.org/schema/cml2/core">
  <scalar dataType="xsd:boolean" title="she loves me">true</scalar>
  <scalar dataType="xsd:float" title="x">23.2</scalar>
  <scalar dataType="xsd:duration" title="egg timer">PT4M</scalar>
  <scalar dataType="xsd:dateTime" title="current date and time">2001-02-01T00:30:00</scalar>
  <scalar dataType="xsd:time" title="wake up">06:00:00</scalar>
  <scalar dataType="xsd:date" title="where is it">1752-09-10</scalar>
  <scalar dataType="xsd:anyURI" title="CML site">http://www.xml-cml.org/</scalar>
  <scalar dataType="xsd:QName" title="CML atom">cml:atom</scalar>
  <scalar dataType="xsd:normalizedString" title="song">the mouse ran up the clock</scalar>
  <scalar dataType="xsd:language" title="UK English">en-GB</scalar>
  <scalar dataType="xsd:Name" title="atom">atom</scalar>
  <scalar dataType="xsd:ID" title="XML ID">_123</scalar>
  <scalar dataType="xsd:integer" title="the answer">42</scalar>
  <scalar dataType="xsd:nonPositiveInteger" title="zero">0</scalar>
</list>
</stmml>

A non-whitespace character used in arrays to separate components.

Some STMML elements (such as array) have content representing concatenated values. The default separator is whitespace (which can be normalised) and this should be used whenever possible. However in some cases the values are empty, or contain whitespace or other problematic punctuation, and a delimiter is required.

Note that the content string MUST start and end with the delimiter so there is no ambiguity as to what the components are. Only printable characters from the ASCII character set should be used, and character entities should be avoided.

When delimiters are used to separate precise whitespace this should always consist of spaces and not the other allowed whitespace characters (newline, tabs, etc.). If the latter are important it is probably best to redesign the application.

<stmml title="delimiter example">

<array size="4"  dataType="xsd:string" delimiter="|">|A|B12||D and   E|</array>
</stmml>


The values in the array are "A", "B12", "" (the empty string) and "D and   E" (note the preserved spaces).
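The splitting rule can be sketched in a few lines of Python. The function name split_delimited is hypothetical; the sketch assumes a single-character delimiter and ignores whitespace outside the outermost delimiters.

```python
def split_delimited(content, delimiter):
    """Split delimited array content into its string components.

    Hypothetical helper. The content must start and end with the
    delimiter; whitespace outside the outermost delimiters is ignored.
    """
    text = content.strip()
    if len(delimiter) != 1 or delimiter.isspace():
        raise ValueError("delimiter must be a single non-whitespace character")
    if len(text) < 2 or not (text.startswith(delimiter) and text.endswith(delimiter)):
        raise ValueError("content must start and end with the delimiter")
    # Drop the leading and trailing delimiter, then split on the rest.
    return text[1:-1].split(delimiter)


values = split_delimited("|A|B12||D and   E|", "|")
print(values)
```

Requiring the delimiter at both ends is what makes the empty string representable: two adjacent delimiters always enclose exactly one (possibly empty) component.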

The basis of an error value.

Errors in values can be of several types and this simpleType provides a small controlled vocabulary.

<stmml title="scalar example">
<scalar 
    dataType="xsd:decimal" 
    errorValue="1.0" 
    errorBasis="observedStandardDeviation" 
    title="body weight"
    dictRef="zoo:bodywt"
    units="units:g">34.3</scalar>
    </stmml>

An observed or calculated estimate of the error in the value of a numeric quantity.

An observed or calculated estimate of the error in the value of a numeric quantity. It should be ignored for dataTypes such as URL, date or string. The statistical basis of the errorValueType is not defined - it could be a range, an estimated standard deviation, an observed standard error, etc. This information can be added through errorBasisType.

<stmml title="scalar example">
<scalar 
    dataType="xsd:decimal" 
    errorValue="1.0" 
    errorBasis="observedStandardDeviation" 
    title="body weight"
    dictRef="zoo:bodywt"
    units="units:g">34.3</scalar>
    </stmml>

A unique ID for an element.

This is not formally of type ID (an XML NAME which must start with a letter and contain only letters, digits and .-_:). It is recommended that IDs start with a letter, and contain no punctuation or whitespace. The function generate-id() in XSLT will generate semantically void unique IDs.

It is difficult to ensure uniqueness when documents are merged. We suggest namespacing IDs, perhaps using the containing elements as the base. Thus mol3:a1 could be a useful unique ID. However this is still experimental.

An array of floats.

An array of floats or other real numbers. Not used in STM Schema, but re-used by CML and other languages.

<atomArray xmlns="http://www.xml-cml.org/schema/cml2/core"
  x2="1.2 2.3 3.4 5.6"/>

An array of integers.

An array of integers; for re-use by other schemas

Not machine-validatable

<stmml title="integerArray type">

<atomArray xmlns="http://www.xml-cml.org/schema/cml2/core"
  hydrogenCount="3 1 0 2"/>
  </stmml>

The maximum INCLUSIVE value of a quantity.

The maximum INCLUSIVE value of a sortable quantity such as numeric, date or string. It should be ignored for dataTypes such as URL. The min and max attributes can be used together to give a range for the quantity. The statistical basis of this range is not defined. The value of max is usually an observed quantity (or calculated from observations). To restrict a value, the maxExclusive type in a dictionary should be used.

The type of the maximum is the same as the quantity to which it refers - numeric, date and string are currently allowed.

<stmml title="maxType example">

<scalar dataType="xsd:float" max="20" min="12">15</scalar>
</stmml>

Allowed matrix types.

Allowed matrix types. These are mainly square matrices.

<stmml title="matrix example">

<matrix id="m1" title="matrix-1" dictRef="foo:bar"
  rows="3" columns="3" dataType="xsd:decimal" 
  delimiter="|" matrixType="squareSymmetric" units="unit:m"
  >|1.1|1.2|1.3|1.2|2.2|2.3|1.3|2.3|3.3|</matrix>
  </stmml>

Symmetric. Elements are zero except on the diagonal.

Square. Elements are zero below the diagonal.

Symmetric. Elements are zero except on the diagonal.

User-defined matrix-type.

This definition must be by reference to a namespaced dictionary entry.

The name of the metadata.

Metadata consists of name-value pairs (value is in the "content" attribute). The names are from a semi-restricted vocabulary, mainly Dublin Core. The content is unrestricted. The order of metadata has no implied semantics at present. Users can create their own metadata names using the namespaced prefix syntax (e.g. foo:institution). Ideally these names should be defined in an STMML dictionary.

2003-03-05: Added UNION to manage non-controlled names

The extent or scope of the content of the resource.

Coverage will typically include spatial location (a place name or geographic coordinates), temporal period (a period label, date, or date range) or jurisdiction (such as a named administrative entity). Recommended best practice is to select a value from a controlled vocabulary (for example, the Thesaurus of Geographic Names [TGN]) and that, where appropriate, named places or time periods be used in preference to numeric identifiers such as sets of coordinates or date ranges.

An account of the content of the resource.

Description may include but is not limited to: an abstract, table of contents, reference to a graphical representation of content or a free-text account of the content.

An unambiguous reference to the resource within a given context.

Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system. Example formal identification systems include the Uniform Resource Identifier (URI) (including the Uniform Resource Locator (URL)), the Digital Object Identifier (DOI) and the International Standard Book Number (ISBN).

The physical or digital manifestation of the resource.

Typically, Format may include the media-type or dimensions of the resource. Format may be used to determine the software, hardware or other equipment needed to display or operate the resource. Examples of dimensions include size and duration. Recommended best practice is to select a value from a controlled vocabulary (for example, the list of Internet Media Types [MIME] defining computer media formats).

A reference to a related resource.

Recommended best practice is to reference the resource by means of a string or number conforming to a formal identification system.

Information about rights held in and over the resource.

Typically, a Rights element will contain a rights management statement for the resource, or reference a service providing such information. Rights information often encompasses Intellectual Property Rights (IPR), Copyright, and various Property Rights. If the Rights element is absent, no assumptions can be made about the status of these and other rights with respect to the resource.

The topic of the content of the resource.

Typically, a Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.

A name given to the resource.

Typically, a Title will be a name by which the resource is formally known.

The nature or genre of the content of the resource.

Type includes terms describing general categories, functions, genres, or aggregation levels for content. Recommended best practice is to select a value from a controlled vocabulary (for example, the working draft list of Dublin Core Types [DCT1]). To describe the physical or digital manifestation of the resource, use the FORMAT element.

An entity responsible for making contributions to the content of the resource.

Examples of a Contributor include a person, an organisation, or a service. Typically, the name of a Contributor should be used to indicate the entity.

An entity primarily responsible for making the content of the resource.

Examples of a Creator include a person, an organisation, or a service. Typically, the name of a Creator should be used to indicate the entity.

An entity responsible for making the resource available.

Examples of a Publisher include a person, an organisation, or a service. Typically, the name of a Publisher should be used to indicate the entity.

A Reference to a resource from which the present resource is derived.

The present resource may be derived from the Source resource in whole or in part. Recommended best practice is to reference the resource by means of a string or number conforming to a formal identification system.

A language of the intellectual content of the resource.

Recommended best practice for the values of the Language element is defined by RFC 1766 [RFC1766] which includes a two-letter Language Code (taken from the ISO 639 standard [ISO639]), followed optionally by a two-letter Country Code (taken from the ISO 3166 standard [ISO3166]). For example, 'en' for English, 'fr' for French, or 'en-GB' for English used in the United Kingdom.

A date associated with an event in the life cycle of the resource.

Typically, Date will be associated with the creation or availability of the resource. Recommended best practice for encoding the date value is defined in a profile of ISO 8601 [W3CDTF] and follows the YYYY-MM-DD format.

Entry contains information relating to chemical safety.

Typically the content will be a reference to a handbook, MSDS, threshold or other human-readable string.

Part or whole of the information was computer-generated.

Typically the content will be the name of a method or a program.

3D structure included.

details included

The minimum INCLUSIVE value of a quantity.

The minimum INCLUSIVE value of a sortable quantity such as numeric, date or string. It should be ignored for dataTypes such as URL. The min and max attributes can be used together to give a range for the quantity. The statistical basis of this range is not defined. The value of min is usually an observed quantity (or calculated from observations). To restrict a value, the minExclusive type in a dictionary should be used.

The type of the minimum is the same as the quantity to which it refers - numeric, date and string are currently allowed.

<stmml title="minType example">

<scalar dataType="xsd:float" max="20" min="12">15</scalar>
</stmml>

A string referencing a dictionary, units, convention or other metadata.

The namespace is optional but recommended where possible.

Note: this convention is only used within STMML and related languages; it is NOT a generic URI.

<stmml title="namespace example">

<list>
<!-- dictRef is of namespaceRefType -->
  <scalar dictRef="chem:mpt">123</scalar>  
<!-- error -->
  <scalar dictRef="mpt23">123</scalar>  
</list>
</stmml>

The namespace prefix must start with an alpha character and can only contain alphanumeric and '_'. The suffix can have characters from the XML ID specification (alphanumeric, '_', '.' and '-').
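As a rough illustration, the character rules can be written as a regular expression. This Python sketch is not the schema's own pattern: it assumes ASCII letters and digits, treats the prefix as required (matching the reference marked as an error in the example above), and accepts any of the listed characters at the start of the suffix because the text does not constrain it. The name is_namespace_ref is hypothetical.

```python
import re

# Assumption: ASCII letters and digits only; the text does not say
# whether the suffix must also start with a letter, so any listed
# character is accepted there.
NAMESPACE_REF = re.compile(r"[A-Za-z][A-Za-z0-9_]*:[A-Za-z0-9_.\-]+")


def is_namespace_ref(value):
    """Return True if value has the prefix:suffix form described above."""
    return NAMESPACE_REF.fullmatch(value) is not None


for candidate in ("chem:mpt", "mpt23", "9abc:mpt", "si:m"):
    print(candidate, is_namespace_ref(candidate))
```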

A positive number. Note that we also provide nonNegativeNumber with inclusive zero. The maximum number is (quite large) since 'unbounded' is more difficult to implement. This is greater than Eddington's estimate of the number of particles in the universe so it should work for most people.

A reference to an existing element.

A reference to an existing element in the document. The target of the ref attribute must exist. The test for validity will normally occur in the element's appinfo.

Any DOM Node created from this element will normally be a reference to another Node, so that if the target node is modified the dereferenced content is modified. At present there are no deep copy semantics hardcoded into the schema.

The size of an array.

The size of an array. Redundant, but serves as a check for processing software (useful if delimiters are used).

Scientific units.

These will be linked to dictionaries of units with conversion information, using namespaced references (e.g. si:m).

Distinguish carefully from unitType, which is an element describing a type of a unit in a unitList.

<stmml title="unitList example">
<stm:unitList xmlns:stm="http://www.xml-cml.org/schema/stmml">

<!-- ======================================================================= -->
<!-- ========================= fundamental types =========================== -->
<!-- ======================================================================= -->

<stm:unitType id="length" name="length">
  <stm:dimension name="length" power="1"/>
</stm:unitType>

<stm:unitType id="time" name="time">
  <stm:dimension name="time" power="1"/>
</stm:unitType>

<!-- ... -->

<stm:unitType id="dimensionless" name="dimensionless">
  <stm:dimension name="dimensionless" power="1"/>
</stm:unitType>

<!-- ======================================================================= -->
<!-- ========================== derived types ============================== -->
<!-- ======================================================================= -->

<stm:unitType id="acceleration" name="acceleration">
  <stm:dimension name="length" power="1"/>
  <stm:dimension name="time" power="-2"/>
</stm:unitType>

<!-- ... -->

<!-- ======================================================================= -->
<!-- ====================== fundamental SI units =========================== -->
<!-- ======================================================================= -->

<stm:unit id="second" name="second" unitType="time">
  <stm:description>The SI unit of time</stm:description>
</stm:unit>

<stm:unit id="meter" name="meter" unitType="length"
  abbreviation="m">
  <stm:description>The SI unit of length</stm:description>
</stm:unit>

<!-- ... -->

<stm:unit id="kg" name="nameless" unitType="dimensionless"
  abbreviation="nodim">
  <stm:description>A fictitious parent for dimensionless units</stm:description>
</stm:unit>

<!-- ======================================================================= -->
<!-- ===================== derived SI units ================================ -->
<!-- ======================================================================= -->

<stm:unit id="newton" name="newton" unitType="force">
  <stm:description>The SI unit of force</stm:description>
</stm:unit>

<!-- ... -->

<!-- multiples of fundamental SI units -->

<stm:unit id="g" name="gram" unitType="mass"
  parentSI="kg"
  multiplierToSI="0.001"
  abbreviation="g">
  <stm:description>0.001 kg. </stm:description>
</stm:unit>

<stm:unit id="celsius" name="Celsius" parentSI="k"
  multiplierToSI="1" 
  constantToSI="273.15">
  <stm:description><p>A common unit of temperature</p></stm:description>
</stm:unit>

<!-- fundamental non-SI units -->

<stm:unit id="inch" name="inch" parentSI="meter"
   abbreviation="in"
  multiplierToSI="0.0254" >
  <stm:description>An imperial measure of length</stm:description>
</stm:unit>

<!-- derived non-SI units -->

<stm:unit id="l" name="litre" unitType="volume"
   parentSI="meterCubed"
   abbreviation="l"
   multiplierToSI="0.001">
  <stm:description>Nearly 1 dm**3 This is not quite exact</stm:description>
</stm:unit>

<!-- ... -->

<stm:unit id="fahr" name="fahrenheit" parentSI="k"
   abbreviation="F"
  multiplierToSI="0.55555555555555555" 
  constantToSI="255.37222222222222">
  <stm:description>An obsolescent unit of temperature still used in popular
  meteorology</stm:description>
</stm:unit>

</stm:unitList>
</stmml>
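The conversion attributes suggest a simple linear rule. The Python sketch below assumes that the SI value is obtained as value * multiplierToSI + constantToSI, with a missing multiplier read as 1 and a missing constant as 0. That rule is an assumption inferred from the examples rather than something stated by the schema, and load_units and to_si are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

STM = "http://www.xml-cml.org/schema/stmml"

# A cut-down unit list in the style of the example above.
UNIT_LIST = f"""
<stm:unitList xmlns:stm="{STM}">
  <stm:unit id="inch" name="inch" parentSI="meter" multiplierToSI="0.0254"/>
  <stm:unit id="celsius" name="Celsius" parentSI="k"
            multiplierToSI="1" constantToSI="273.15"/>
</stm:unitList>
"""


def load_units(xml_text):
    """Map unit id -> (parentSI, multiplier, constant)."""
    units = {}
    for unit in ET.fromstring(xml_text).iter(f"{{{STM}}}unit"):
        units[unit.get("id")] = (
            unit.get("parentSI"),
            float(unit.get("multiplierToSI", "1")),
            float(unit.get("constantToSI", "0")),
        )
    return units


def to_si(value, unit_id, units):
    """Assumed rule: SI value = value * multiplierToSI + constantToSI."""
    parent, multiplier, constant = units[unit_id]
    return value * multiplier + constant, parent


units = load_units(UNIT_LIST)
print(to_si(10.0, "inch", units))
print(to_si(25.0, "celsius", units))
```

Under this rule a temperature scale with a shifted zero needs both attributes, while a pure multiple such as the inch needs only multiplierToSI.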

A homogeneous 1-dimensional array of similar objects.

array manages a homogeneous 1-dimensional array of similar objects. These can be encoded as strings (i.e. XSD-like datatypes) and are concatenated as string content. The size of the array should always be >= 1.

The default delimiter is whitespace. The normalize-space() function of XSLT could be used to normalize all whitespace to single spaces and this would not affect the value of the array elements. To extract the elements java.lang.StringTokenizer could be used. If the elements themselves contain whitespace then a different delimiter must be used and is identified through the delimiter attribute. This method is mandatory if it is required to represent empty strings. If a delimiter is used it MUST start and end the array - leading and trailing whitespace is ignored. Thus size+1 occurrences of the delimiter character are required. If non-normalized whitespace is to be encoded (e.g. newlines, tabs, etc) you are recommended to translate it character-wise to XML character entities.

Note that normal Schema validation tools cannot validate the elements of array (they are defined as string). However, if the string is split, a temporary schema can be constructed from the type and used for validation. Also the type can be contained in a dictionary and software could decide to retrieve this and use it for validation.

When the elements of the array are not simple scalars (e.g. scalars with a value and an error), the scalars should be used as the elements. Although this is verbose, it is simple to understand. If there is a demand for more compact representations, it will be possible to define the syntax in a later version.

<stmml title="array example 1">

<array size="5" title="value" 
  dataType="xsd:decimal">  1.23 2.34 3.45 4.56 5.67</array>
  </stmml>

The size attribute is not mandatory but provides a useful validity check:

<stmml title="array example 2">

<array size="5" title="initials" dataType="xsd:string" 
  delimiter="/">/A B//C/D-E/F/</array>
  </stmml>

Note that the second array-element is the empty string ''.

<stmml title="array example 3">

<array title="mass" size="4"
  units="unit:g"
  errorBasis="observedStandardDeviation"
  minValues="10 11 10 9"
  maxValues="12 14 12 11"
  errorValues="1 2 1 1"
  dataType="xsd:float">11 12.5 10.9 10.2
</array>
</stmml>
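A reader for numeric arrays might use the size attribute as a consistency check and pair each value with its entries in minValues, maxValues and errorValues. The Python sketch below assumes whitespace-separated content only; read_float_array is a hypothetical helper, not part of any STMML toolkit.

```python
import xml.etree.ElementTree as ET

ARRAY_XML = """
<array title="mass" size="4"
  units="unit:g"
  errorBasis="observedStandardDeviation"
  minValues="10 11 10 9"
  maxValues="12 14 12 11"
  errorValues="1 2 1 1"
  dataType="xsd:float">11 12.5 10.9 10.2
</array>
"""


def read_float_array(element):
    """Return one dict per array element, checking declared sizes.

    Hypothetical helper; handles only whitespace-separated numeric content.
    """
    values = [float(token) for token in element.text.split()]
    size = element.get("size")
    if size is not None and int(size) != len(values):
        raise ValueError(f"size={size} but {len(values)} values found")
    rows = [{"value": v} for v in values]
    parallel = (("minValues", "min"), ("maxValues", "max"), ("errorValues", "error"))
    for attribute, key in parallel:
        raw = element.get(attribute)
        if raw is None:
            continue                      # the parallel arrays are optional
        extras = [float(token) for token in raw.split()]
        if len(extras) != len(values):
            raise ValueError(
                f"{attribute} has {len(extras)} entries, expected {len(values)}"
            )
        for row, extra in zip(rows, extras):
            row[key] = extra
    return rows


rows = read_float_array(ET.fromstring(ARRAY_XML))
for row in rows:
    print(row)
```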

The mandatory data type.

All elements of the array must have the same dataType.

An optional array of error values for numeric arrays.

An optional array of minimum values for numeric arrays.

An optional array of maximum values for numeric arrays.

A generic container with no implied semantics.

A generic container with no implied semantics. It just contains things and can have attributes which bind conventions to it. It could often act as the root element in an STM document.

<stmml title="list example">

<list>
  <array title="animals" dataType="xsd:string">frog bear toad</array>
  <scalar title="weight" dataType="xsd:float">3.456</scalar>
</list>
</stmml>

Type of the list.

Semantics undefined.

<stmml title="list example">

<list>
  <array title="animals" dataType="xsd:string">frog bear toad</array>
  <scalar title="weight" dataType="xsd:float">3.456</scalar>
</list>
</stmml>

A rectangular matrix of any quantities.

By default matrix represents a rectangular matrix of any quantities representable as XSD or STMML dataTypes. It consists of rows*columns elements, where columns is the fastest moving index. Assuming the elements are counted from 1 they are ordered V[1,1],V[1,2],...V[1,columns],V[2,1],V[2,2],...V[2,columns], ...V[rows,1],V[rows,2],...V[rows,columns].

By default whitespace is used to separate matrix elements; see array for details. There are NO characters or markup delimiting the end of rows; authors must be careful! The columns and rows attributes have no default values; a row vector requires a rows attribute of 1.

matrix also supports many types of square matrix, but at present we require all elements to be given, even if the matrix is symmetric, antisymmetric or banded diagonal. The matrixType attribute allows software to validate and process the type of matrix.

<stmml title="matrix example">

<matrix id="m1" title="matrix-1" dictRef="foo:bar"
  rows="3" columns="3" dataType="xsd:decimal" 
  delimiter="|" matrixType="squareSymmetric" units="unit:m"
  >|1.1|1.2|1.3|1.2|2.2|2.3|1.3|2.3|3.3|</matrix>
  </stmml>
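Since nothing in the content marks the end of a row, software has to rebuild the rows from the rows and columns attributes. A minimal Python sketch, using the hypothetical helper names to_rows and is_symmetric, with the content string taken from the example above:

```python
def to_rows(flat_values, rows, columns):
    """Arrange a flat sequence into rows, columns being the fastest index."""
    if len(flat_values) != rows * columns:
        raise ValueError("expected rows*columns values")
    return [flat_values[r * columns:(r + 1) * columns] for r in range(rows)]


def is_symmetric(matrix):
    """Check that the matrix is square and equal to its transpose."""
    n = len(matrix)
    return all(len(row) == n for row in matrix) and all(
        matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i)
    )


content = "|1.1|1.2|1.3|1.2|2.2|2.3|1.3|2.3|3.3|"
# Drop the leading and trailing delimiter before splitting.
flat = [float(token) for token in content[1:-1].split("|")]
matrix = to_rows(flat, rows=3, columns=3)
print(matrix)
print(is_symmetric(matrix))
```

A check such as is_symmetric is one way software could validate a declared matrixType of squareSymmetric, given that all elements must be supplied.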

Number of rows.

Number of columns.

Units (recommended for numeric quantities!).

Type of matrix.

Mainly square, but extensible through the xsd:union mechanism.

An optional array of error values for numeric matrices.

An optional array of minimum values for numeric matrices.

An optional array of maximum values for numeric matrices.

A general container for metadata.

A general container for metadata, including at least Dublin Core (DC) and CML-specific metadata.

In its simple form each element provides a name and content in a similar fashion to the meta element in HTML. metadata may have simpleContent (i.e. a string for adding further information - this is not controlled).

<stmml title="metadata example">

<list>
  <metadataList>
    <metadata name="dc:coverage" content="Europe"/>
    <metadata name="dc:description" content="Ornithological chemistry"/>
    <metadata name="dc:identifier"  content="ISBN:1234-5678"/>
    <metadata name="dc:format" content="printed"/>
    <metadata name="dc:relation" content="abc:def123"/>
    <metadata name="dc:rights" content="licence:GPL"/>
    <metadata name="dc:subject" content="Informatics"/>
    <metadata name="dc:title" content="birds"/>
    <metadata name="dc:type" content="bird books on chemistry"/>
    <metadata name="dc:contributor" content="Tux Penguin"/>
    <metadata name="dc:creator" content="author"/>
    <metadata name="dc:publisher" content="Penguinone publishing"/>
    <metadata name="dc:source" content="penguinPub"/>
    <metadata name="dc:language" content="en-GB"/>
    <metadata name="dc:date" content="1752-09-10"/>
  </metadataList>
  <metadataList>
    <metadata name="cmlm:safety" content="mostly harmless"/>
    <metadata name="cmlm:insilico" content="electronically produced"/>
    <metadata name="cmlm:structure" content="penguinone"/>
    <metadata name="cmlm:reaction" content="synthesis of penguinone"/>
    <metadata name="cmlm:identifier" content="smiles:O=C1C=C(C)C(C)(C)C(C)=C1"/>
  </metadataList>
  <metadataList>
    <metadata name="foo:institution" content="abc.org"/>
    <metadata name="bar" content="xyzzy"/>
    <metadata name="$deliberateError" content="error"/>
  </metadataList>
</list>
</stmml>

The metadata type.

The metadata.

A general container for metadata elements.

<stmml title="metadata example">

<list>
  <metadataList>
    <metadata name="dc:coverage" content="Europe"/>
    <metadata name="dc:description" content="Ornithological chemistry"/>
    <metadata name="dc:identifier"  content="ISBN:1234-5678"/>
    <metadata name="dc:format" content="printed"/>
    <metadata name="dc:relation" content="abc:def123"/>
    <metadata name="dc:rights" content="licence:GPL"/>
    <metadata name="dc:subject" content="Informatics"/>
    <metadata name="dc:title" content="birds"/>
    <metadata name="dc:type" content="bird books on chemistry"/>
    <metadata name="dc:contributor" content="Tux Penguin"/>
    <metadata name="dc:creator" content="author"/>
    <metadata name="dc:publisher" content="Penguinone publishing"/>
    <metadata name="dc:source" content="penguinPub"/>
    <metadata name="dc:language" content="en-GB"/>
    <metadata name="dc:date" content="1752-09-10"/>
  </metadataList>
  <metadataList>
    <metadata name="cmlm:safety" content="mostly harmless"/>
    <metadata name="cmlm:insilico" content="electronically produced"/>
    <metadata name="cmlm:structure" content="penguinone"/>
    <metadata name="cmlm:reaction" content="synthesis of penguinone"/>
    <metadata name="cmlm:identifier" content="smiles:O=C1C=C(C)C(C)(C)C(C)=C1"/>
  </metadataList>
  <metadataList>
    <metadata name="foo:institution" content="abc.org"/>
    <metadata name="bar" content="xyzzy"/>
    <metadata name="$deliberateError" content="error"/>
  </metadataList>
</list>
</stmml>

An observation or occurrence.

A container for any events that need to be recorded, whether planned or not. They can include notes, measurements, conditions that may be referenced elsewhere, etc. There are no controlled semantics.

<stmml title="observation example">
<observation type="ornithology">
  <object title="sparrow" count="3"/>
  <observ/>
</observation>
</stmml>

Type of observation (uncontrolled vocabulary).

An element to hold scalar data.

scalar holds scalar data under a single generic container. The semantics are usually resolved by linking to a dictionary. scalar defaults to a scalar string but has attributes which affect the type.

scalar does not necessarily reflect a physical object (for which object should be used). It may reflect a property of an object such as temperature, size, etc.

Note that normal Schema validation tools cannot validate the data type of scalar (it is defined as string), but that a temporary schema can be constructed from the type and used for validation. Also the type can be contained in a dictionary and software could decide to retrieve this and use it for validation.

<stmml title="scalar example">
<scalar 
    dataType="xsd:decimal" 
    errorValue="1.0" 
    errorBasis="observedStandardDeviation" 
    title="body weight"
    dictRef="zoo:bodywt"
    units="units:g">34.3</scalar>
    </stmml>

An enumeration of allowed angle units.

A reference to an existing atom.

<cml title="atomRef example">
  <molecule id="m1">
    <atomArray>
      <atom id="a1"/>
    </atomArray>
    <electron id="e1" atomRef="a1"/>
  </molecule>
</cml>

A reference to two distinct existing atoms in order.

<cml title="atomRefs2 example">
  <molecule id="m1">
    <atomArray>
      <atom id="a1"/>  
      <atom id="a2"/>
    </atomArray>
    <bondArray>
      <bond atomRefs2="a1 a2"/>
    </bondArray>
  </molecule>
</cml>

A reference to three distinct existing atoms in order.

<cml title="atomRefs3 example">
  <molecule id="m1">
    <atomArray>
      <atom id="a1"/>  
      <atom id="a2"/>
      <atom id="a3"/>
    </atomArray>
    <angle atomRefs3="a1 a2 a3" units="degrees">123.4</angle>
  </molecule>
</cml>

A reference to four distinct existing atoms in order.

<cml title="atomRefs4 example">
  <molecule id="m1">
    <atomArray>
      <atom id="a1"/>  
      <atom id="a2"/>
      <atom id="a3"/>
      <atom id="a4"/>
    </atomArray>
    <torsion atomRefs4="a1 a2 a3 a4" units="degrees">123.4</torsion>
  </molecule>
</cml>

An array of atomRefs.

The atomRefs cannot be schema- or schematron-validated. Instances of this type will be used in array-style representation of bonds and atomParity elements. It can also be used for arrays of atomIDTypes such as in complex stereochemistry, geometrical definitions, atom groupings, etc.

<cml title="atomArray example">
  <molecule id="m1">
    <atomArray atomID="a2 a4 a6"  
      elementType="O N S"/>
  </molecule>
</cml>

An identifier for an atom.

Of the form prefix:suffix, where prefix and suffix are purely alphanumeric (with _ and -) and the prefix is optional. This is similar to XML IDs (and we promote this as good practice for atomIDs). Other punctuation and whitespace are forbidden, so IDs from (say) PDB files are not satisfactory.

The prefix is intended to form a pseudo-namespace so that atom IDs in different molecules may have identical suffixes. It is also useful if the prefix is the ID for the molecule (though this clearly has its limitations). Atom IDs should not be typed as XML IDs since they may not validate.

<cml title="example of IDs on atoms">
  <molecule id="m1">
    <atomArray>
<!-- this atom might be referenced as m1:a2. This is not formally
     part of CML yet -->
      <atom id="a2" elementType="O"/>
    </atomArray>
  </molecule>
  <molecule id="m2">
    <atomArray>
<!-- this atom might be referenced as m2:a2. This is not formally
     part of CML yet -->
      <atom id="a2" elementType="O"/>
    </atomArray>
  </molecule>
</cml>
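The prefix:suffix rule for atom IDs can be checked with a regular expression. The sketch below is one reading of the rule as stated (alphanumerics plus _ and -, optional prefix, a single colon); the name is_valid_atom_id is hypothetical and the pattern in the schema itself may differ in detail.

```python
import re

# Optional "prefix:" followed by a suffix; both parts limited to
# letters, digits, underscore and hyphen, per the description above.
ATOM_ID_PATTERN = re.compile(r"(?:[A-Za-z0-9_-]+:)?[A-Za-z0-9_-]+")

def is_valid_atom_id(atom_id):
    """Return True if atom_id has the form [prefix:]suffix."""
    return ATOM_ID_PATTERN.fullmatch(atom_id) is not None
```

Under this reading, a2 and m1:a2 are accepted, while a PDB-style name such as O3' is rejected because of the apostrophe.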

A reference to an existing bond.

A reference to a bond may be made by atoms (e.g. for multicentre or pi-bonds), electrons (for annotating reactions or describing electronic properties) or possibly other bonds (no examples yet). The semantics are relatively flexible.

<cml title="bondArray example">
  <bondArray>
    <bond id="b1" atomRefs2="a3 a8" order="D">
      <electron bondRef="b1"/>
      <bondStereo>C</bondStereo>
    </bond>
    <bond id="b2" atomRefs2="a3 a8" order="S">
      <bondStereo convention="MDL" conventionValue="6"/>
    </bond>
  </bondArray>
</cml>

An array of references to bonds.

The references cannot (yet) be schema- or schematron-validated. Instances of this type will be used in array-style representation of electron counts, etc. It can also be used for arrays of bondIDTypes such as in complex stereochemistry, geometrical definitions, bond groupings, etc.

Allowed elementType values.

The periodic table (up to element number 118). In addition the following strings are allowed:

Du. ("dummy") This does not correspond to a "real" atom and can support a point in space or within a chemical graph.
R. ("R-group") This indicates that an atom or group of atoms could be attached at this point.

<cml title="elementType example">
  <atomArray>
    <atom id="a1" elementType="C"/>
    <atom id="a2" elementType="N"/>
    <atom id="a3" elementType="Pb"/>
    <atom id="a4" elementType="Dummy"/>
  </atomArray>
</cml>

Any isotope of hydrogen.

There are no special element symbols for D and T which should use the isotope attribute.

A point or object with no chemical semantics.

Examples can be centroids, bond-midpoints, orienting "atoms" in small z-matrices.

Note "Dummy" has the same semantics but is now deprecated.

A point at which an atom or group might be attached.

Examples are abbreviated organic functional groups, Markush representations, polymers, unknown atoms, etc. Semantics may be determined by the role attribute on the atom.

An array of elementTypes.

Instances of this type will be used in array-style representation of atoms.

<cml title="atomArray with elementTypes">
  <atomArray elementType="O N S Pb"/>
</cml>

The formal charge on an atom.

Used for electron bookkeeping. This has no relation to its calculated (fractional) charge.

<cml title="formalCharge example">
  <atomArray>
    <atom id="a1" elementType="N" formalCharge="+1"/>
    <atom id="a2" elementType="O" formalCharge="-1"/>
  </atomArray>
</cml>

A concise representation for a molecular formula.

This MUST adhere to a whitespaced syntax so that it is trivially machine-parsable. Each element is followed by its count, and the string is optionally ended by a formal charge. NO brackets or other nesting is allowed.

<cml title="formulaType example (concise)">
  <list>
    <formula id="methane" concise="C 1 H 4"/>
    <formula id="chloroacetate" concise="Cl 1 H 2 C 2 O 2 -1"/>
    <formula id="sodiumSulfate">
      <formula concise="H 2 O 1" count="10"/>
      <formula concise="Na 1 +1" count="2"/>
      <formula concise="S 1 O 4 -2"/>
    </formula>
  </list>
</cml>
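Because the concise syntax is whitespace-separated, it can be parsed by splitting on whitespace and pairing symbols with counts. The sketch below assumes that an odd number of tokens means the last token is the formal charge; the name parse_concise is hypothetical, and counts are read as floats as a cautious assumption.

```python
def parse_concise(concise):
    """Parse a concise formula such as "S 1 O 4 -2".

    Returns (counts, charge), where counts maps element symbol to count.
    """
    tokens = concise.split()
    charge = 0
    if len(tokens) % 2 == 1:
        # An odd token count is taken to mean a trailing formal charge.
        charge = int(tokens.pop())
    counts = {}
    for symbol, count in zip(tokens[0::2], tokens[1::2]):
        if not symbol.isalpha():
            raise ValueError(f"expected an element symbol, got {symbol!r}")
        counts[symbol] = counts.get(symbol, 0.0) + float(count)
    return counts, charge
```

A string without whitespace, such as C6H6, is rejected by this sketch, which matches the requirement that the syntax be whitespaced.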

The total number of hydrogen atoms bonded to an atom.

The total number of hydrogen atoms bonded to an atom, whether explicitly included as atoms or not. It is an error for hydrogenCount to be less than the number of explicitly included hydrogen atoms. There is no default value, and no assumptions about hydrogenCount can be made if it is not given.

If hydrogenCount is given on every atom, then the values can be summed to give the total hydrogenCount for the (sub)molecule. Because of this, hydrogenCount should not be used where hydrogen atoms bridge 2 or more atoms.

<cml title="single atom example">
<atom id="a1" title="O3'" elementType="O" 
  formalCharge="1" hydrogenCount="1"
  isotope="17" occupancy="0.7" 
  x2="1.2" y2="2.3" 
  x3="3.4" y3="4.5" z3="5.6"
  convention="ABC" dictRef="chem:atom"
>
  <scalar title="dipole" dictRef="d:dip" 
    units="units:debye">0.2</scalar>
  <atomParity atomRefs4="a3 a7 a2 a4">1</atomParity>
  <electron id="e1" atomRef="a1" count="2"/>
</atom>
</cml>
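The summation rule for hydrogenCount can be sketched as follows. The function name total_hydrogen_count is hypothetical, the input is assumed to use un-namespaced atom elements (not the *Array form), and the function returns None when any atom lacks hydrogenCount, since no assumption can then be made.

```python
import xml.etree.ElementTree as ET

def total_hydrogen_count(molecule_xml):
    """Sum hydrogenCount over all atom elements, or return None if any is missing."""
    root = ET.fromstring(molecule_xml)
    total = 0
    for atom in root.iter("atom"):
        value = atom.get("hydrogenCount")
        if value is None:
            # No default exists, so the total is undefined.
            return None
        total += int(value)
    return total
```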

The numeric representation of an isotope.

In core CML this represents a single number; either the combined proton/neutron count or a more accurate estimate of the nuclear mass. This is admittedly fuzzy, and requires a more complex object (which can manage conventions, lists of isotopic masses, etc.) See isotope.

The default is "natural abundance" - whatever that can be interpreted as.

Delta values (i.e. deviations from the most abundant isotopic mass) are never allowed.

A non-signed angle, such as a bond angle. Note that we also provide positiveAngleType (e.g. for cell angles) and torsionAngleType for - guess what - torsion.

Re-used by angle

<stmml title="nonNegativeAngle type">

<scalar dataType="nonNegativeAngleType">123</scalar>  
</stmml>

The number of non-hydrogen atoms attached to an atom.

Obsolete in core CML. Only useful in CML queries.

A non-signed angle, such as a cell angle. Note that we also provide nonNegativeAngleType (e.g. for bond angles).

Re-used by crystal

<cml title="positiveAngleType example">
  <list>
    <scalar title="alpha" units="units:degree">70.123</scalar>
    <scalar title="beta" units="units:degree">80.456</scalar>
    <scalar title="gamma" units="units:degree">90.789</scalar>
  </list>
</cml>

Occupancy of an atomic site.

Primarily for crystallography. Values outside 0-1 are not allowed.

See atom.

An array of bond orders.

(seeAlso orderType)

Bond order (as a string).

This is purely conventional and used for bond/electron counting. There is no default value. The emptyString attribute can be used to indicate a bond of unknown or unspecified type. The interpretation of this is outside the scope of CML-based algorithms. It may be accompanied by a convention attribute on the bond which links to a dictionary. Example: <bond convention="ccdc:9" atomRefs2="a1 a2"/> could represent a delocalised bond in the CCDC convention.

Single bond.

Double bond.

Triple bond.

Aromatic bond.

State of a substance or property.

The state(s) of matter appropriate to a substance or property. It follows a partially controlled vocabulary. It can be extended through namespace codes to dictionaries.

An aqueous solution.

Gas or vapor. The default state for computation on isolated molecules.

A glassy state.

Normally pure liquid (use solution where appropriate).

The nematic phase.

The smectic phase.

A solid.

A solid solution.

A (liquid) solution.

(Bond) stereochemistry (as a string).

This is purely conventional. There is no default value. The emptyString attribute can be used to indicate a bond of unknown or unspecified type. The interpretation of this is outside the scope of CML-based algorithms. It may be accompanied by a convention attribute which links to a dictionary.

<cml title="bondArray example">
  <bondArray>
    <bond id="b1" atomRefs2="a3 a8" order="D">
      <electron bondRef="b1"/>
      <bondStereo>C</bondStereo>
    </bond>
    <bond id="b2" atomRefs2="a3 a8" order="S">
      <bondStereo convention="MDL" conventionValue="6"/>
    </bond>
  </bondArray>
</cml>

A cis bond.

A trans bond.

A wedge bond.

A hatch bond.

Empty or missing.

The type of a torsion angle.

A reference to an atom.

Typical use would be a bond with only one atom (e.g. the other end is to a bond or electrons).

An array of references to atoms.

Typical use would be to atoms defining a plane.

A list of two references to atoms.

Typically used for defining bonds.

A list of three references to atoms.

Typically used for defining angles, but could also be used to define a three-centre bond.

A list of 4 references to atoms.

Typically used for defining torsions and atomParities, but could also be used to define a four-centre bond.

Restricts units to radians or degrees.

The amount of a substance.

The units attribute is mandatory and can be customised to support mass, volumes, moles, percentages, or ratios (e.g. ppm).

<cml title="substanceList example">
  <substanceList id="sl1">
    <amount units="units:ml">100</amount>
    <substance id="s1">
      <amount units="units:l">1</amount>
      <molecule id="h2o" ref="mols:water"/>
    </substance>
    <substance id="s2">
      <amount units="units:mole">0.1</amount>
      <molecule id="nacl" formula="Na 1 Cl 1"/>
    </substance>
  </substanceList>
</cml>

A "bond" angle between three atoms.

It can be used for:

Recording experimentally determined bond angles (e.g. in a crystallographic paper).
Providing the angle component for internal coordinates (e.g. z-matrix).

<molecule id="m1" title="angle example">
  <atomArray>
    <atom id="a1"/>
    <atom id="a2"/>
    <atom id="a3"/>
  </atomArray>
  <angle units="degrees" atomRefs3="a1 a2 a3">123.4</angle>
</molecule>

An atom.

Usually within a molecule. It is almost always contained within atomArray.

<cml title="single atom example">
<atom id="a1" title="O3'" elementType="O" 
  formalCharge="1" hydrogenCount="1"
  isotope="17" occupancy="0.7" 
  x2="1.2" y2="2.3" 
  x3="3.4" y3="4.5" z3="5.6"
  convention="ABC" dictRef="chem:atom"
>
  <scalar title="dipole" dictRef="d:dip" 
    units="units:debye">0.2</scalar>
  <atomParity atomRefs4="a3 a7 a2 a4">1</atomParity>
  <electron id="e1" atomRef="a1" count="2"/>
</atom>
</cml>

One or more electrons associated with the atom. The atomRef on the electron should point to the id on the atom. We may relax this later and allow reference by context.

The occurrence count of the atom.

Most useful in formula but possibly useful in atomArray where coordinates and connectivity are not defined. No formal default, but assumed to be 1.

The elementType. Almost mandatory.

The formalCharge on the atom.

NOT the calculated charge or oxidation state. No formal default, but its absence implies 0. It may be good practice to include it explicitly.

The explicit hydrogen count

The non-hydrogen count (obsolete - moved to CML Query)

The isotopic mass. Default implies "natural abundance"

The occupancy (mainly from crystallography)

The x coordinate of a 2-D representation (unrelated to 3-D structure). Note that x- and y- 2D coordinates are required for graphical stereochemistry such as wedge/hatch. x- and y- coordinates must be both present or both absent.

The x coordinate of a 3-D cartesian representation. x3, y3 and z3 coordinates must be all present or all absent.

The fractional x coordinate in a crystal structure. xFract, yFract and zFract coordinates must be all present or all absent. A crystal element is required.

The combined x and y coordinates of a 2-D representation (unrelated to 3-D structure). Note that x- and y- 2D coordinates are required for graphical stereochemistry such as wedge/hatch.

The combined x, y, z coordinates of a 3-D cartesian representation.

The combined x, y, z fractional coordinates in a crystal structure. A crystal element is required.

The y coordinate of a 2-D representation (unrelated to 3-D structure). Note that x2 and y2 coordinates are required for graphical stereochemistry such as wedge/hatch. x2 and y2 coordinates must be both present or both absent.

The y coordinate of a 3-D cartesian representation. x3, y3 and z3 coordinates must be all present or all absent.

The fractional y coordinate in a crystal structure. xFract, yFract and zFract coordinates must be all present or all absent. A crystal element is required.

The z coordinate of a 3-D cartesian representation. x3, y3 and z3 coordinates must be all present or all absent.

The fractional z coordinate in a crystal structure. xFract, yFract and zFract coordinates must be all present or all absent. A crystal element is required.

This can be used to describe the purpose of atoms whose elementTypes are dummy or locant.

A container for a list of atoms.

A child of molecule and contains atom information. There are two strategies:

Create individual atom elements under atomArray (in any order). This gives the greatest flexibility but is the most verbose.
Create *Array attributes (e.g. of elementTypeArrayType) under atomArray. This requires all arrays to be of identical length, with explicit values for all atoms in every array. This is NOT suitable for complexType atom children such as atomParity or composite types such as xy2. It also cannot be checked as easily by schema or schematron validation. The atomIDArray attribute is mandatory. It is allowed (though not yet recommended) to add *Array children such as floatArray.

The attributes are directly related to the scalar attributes under atom, which should be consulted for more information.

NOTE: The CML-1 specifications are also supported but are deprecated.

Example - these are exactly equivalent representations

<cml title="atomArray CML1">
<list>
  <atomArray>
    <atom id="a1" elementType="O" hydrogenCount="1"/>
    <atom id="a2" elementType="N" hydrogenCount="1"/>
    <atom id="a3" elementType="C" hydrogenCount="3"/>
  </atomArray>
<!-- is equivalent to -->
  <atomArray
    atomID="a1 a2 a3"
    elementType="O N C"
    hydrogenCount="1 1 3"/>
</list>
</cml>
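The equivalence between the two forms can be made concrete by expanding the array form into one record per atom. The sketch below takes the attributes of an array-style atomArray as a dictionary and assumes the ID array is the attribute named atomID, as in the example; the name expand_atom_array is hypothetical. It enforces the rule that all arrays have the same length.

```python
def expand_atom_array(attributes):
    """Expand array-style atomArray attributes into a list of per-atom dicts."""
    columns = {name: value.split() for name, value in attributes.items()}
    if "atomID" not in columns:
        raise ValueError("the atom ID array is mandatory in array mode")
    n_atoms = len(columns["atomID"])
    for name, values in columns.items():
        if len(values) != n_atoms:
            raise ValueError(f"array {name!r} has {len(values)} values, expected {n_atoms}")
    atoms = []
    for i in range(n_atoms):
        atom = {"id": columns["atomID"][i]}
        for name, values in columns.items():
            if name != "atomID":
                atom[name] = values[i]
        atoms.append(atom)
    return atoms
```

Applied to the array form in the example, this yields three records whose id, elementType and hydrogenCount match the three individual atom elements.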

Almost mandatory. see elementType

See count

See formalCharge

See hydrogenCount

See nonHydrogenCount

See isotope

See occupancy

See x2

See x3

See xFract

See y2

See y3

See yFract

See z3

See zFract

See atomID

Available for subclassing to provide alternative collections for atoms.

The stereochemistry round an atom centre.

It follows the convention of the MIF format, and uses 4 distinct atoms to define the chirality. These can be any atoms (though they are normally bonded to the current atom). There is no default order and the order is defined by the atoms in the atomRefs4 attribute. If there are only 3 ligands, the current atom should be included in the 4 atomRefs.

The value of the parity is a signed number. (It can only be zero if two or more atoms are coincident or the configuration is planar). The sign is the sign of the chiral volume created by the four atoms (a1, a2, a3, a4):

       |  1  1  1  1 |
       | x1 x2 x3 x4 |
       | y1 y2 y3 y4 |
       | z1 z2 z3 z4 |

Note that atomParity cannot be used with the *Array syntax for atoms.
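The sign of the determinant above can be computed without a matrix library: subtracting the first column from the other three reduces it to a scalar triple product of difference vectors. The sketch below assumes each atom is given as an (x, y, z) tuple in atomRefs4 order; the names chiral_volume and parity_sign and the tolerance tol are hypothetical.

```python
def chiral_volume(p1, p2, p3, p4):
    """Determinant of the 4x4 matrix with rows (1 1 1 1), x, y, z."""
    a = [p2[i] - p1[i] for i in range(3)]
    b = [p3[i] - p1[i] for i in range(3)]
    c = [p4[i] - p1[i] for i in range(3)]
    # Scalar triple product a . (b x c) equals the 4x4 determinant.
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return sum(a[i] * cross[i] for i in range(3))

def parity_sign(p1, p2, p3, p4, tol=1e-9):
    """Return +1 or -1 for the sign of the chiral volume, 0 if it is (near) zero."""
    volume = chiral_volume(p1, p2, p3, p4)
    if abs(volume) <= tol:
        return 0
    return 1 if volume > 0 else -1
```

Swapping any two of the four atoms changes the sign, and a planar arrangement gives zero, which is consistent with the description of the parity value.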

<cml title="atom parity example">
  <atom id="a1">
    <atomParity atomRefs4="a3 a5 a2 a9">1</atomParity>
  </atom>
</cml>

A bond between atoms, or between atoms and bonds.

bond is a child of bondArray and contains bond information. Bond must refer to at least two atoms (using atomRefs2) but may also refer to more for multicentre bonds. Bond is often EMPTY but may contain electron, length or bondStereo elements.

<cml title="bondArray example">
  <bondArray>
    <bond id="b1" atomRefs2="a3 a8" order="D">
      <electron bondRef="b1"/>
      <bondStereo>C</bondStereo>
    </bond>
    <bond id="b2" atomRefs2="a3 a8" order="S">
      <bondStereo convention="MDL" conventionValue="6"/>
    </bond>
  </bondArray>
</cml>

<cml title="metal-bond example">
<!-- Zeise's salt: [Cl3Pt(CH2=CH2)]- -->
  <atomArray>
    <atom id="pt1" elementType="Pt"/>
    <atom id="cl1" elementType="Cl"/>
    <atom id="cl2" elementType="Cl"/>
    <atom id="cl3" elementType="Cl"/>
    <atom id="c1" elementType="C" hydrogenCount="2"/>
    <atom id="c2" elementType="C" hydrogenCount="2"/>
  </atomArray>
  <bondArray>
    <bond id="b1" atomRefs2="c1 c2" order="D"/>
    <bond id="b2" atomRefs2="pt1 cl1" order="S"/>
    <bond id="b3" atomRefs2="pt1 cl2" order="S"/>
    <bond id="b4" atomRefs2="pt1 cl3" order="S"/>
    <bond id="b5" atomRefs="pt1" bondRefs="b1"/>
  </bondArray>
</cml>

  <val:comment>Validate Bonds</val:comment>
  <val:template match="bond">
    <val:comment>Atom Refs for 2-atom bond</val:comment>
    <val:variable name="at1" select="substring-before(normalize-space(@atomRefs2),' ')"/>
    <val:variable name="at2" select="substring-after(normalize-space(@atomRefs2),' ')"/>
    <val:comment>Do both atoms exist in current molecule context?</val:comment>
    <val:if test="not(key('atoms', $at1))">
      <val:call-template name="error">
        <val:with-param name="error">BOND (<val:value-of select="@id"/>): ATOMREF not found: <val:value-of select="$at1"/></val:with-param>
      </val:call-template>
    </val:if>
    <val:if test="not(key('atoms', $at2))">
      <val:call-template name="error">
        <val:with-param name="error">BOND (<val:value-of select="@id"/>): ATOMREF not found: <val:value-of select="$at2"/></val:with-param>
      </val:call-template>
    </val:if>
  </val:template>

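Validate Bonds: atom references for a 2-atom bond. In the messages, (id) stands for the id of the bond.

Are the atoms distinct? Error message: "BOND (id): ATOMS not distinct:"
Do both atoms exist in the current molecule context? Error message: "BOND (id): ATOMREF not found:"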
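The two bond checks (distinct atoms, and atoms that exist in the molecule) can also be sketched outside a stylesheet. The function name validate_bonds is hypothetical; the sketch assumes un-namespaced elements, individual atom elements rather than the *Array form, and it skips bonds that do not carry a two-entry atomRefs2.

```python
import xml.etree.ElementTree as ET

def validate_bonds(molecule_xml):
    """Return a list of error messages for the bonds in one molecule."""
    molecule = ET.fromstring(molecule_xml)
    atom_ids = {atom.get("id") for atom in molecule.iter("atom")}
    errors = []
    for bond in molecule.iter("bond"):
        refs = (bond.get("atomRefs2") or "").split()
        if len(refs) != 2:
            # Multicentre bonds (atomRefs, bondRefs) are outside this sketch.
            continue
        bond_id = bond.get("id", "")
        if refs[0] == refs[1]:
            errors.append(f"BOND ({bond_id}): ATOMS not distinct: {refs[0]}")
        for ref in refs:
            if ref not in atom_ids:
                errors.append(f"BOND ({bond_id}): ATOMREF not found: {ref}")
    return errors
```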

One or more electrons associated with the bond.

The bondRef on the electron should point to the id on the bond. We may relax this later and allow reference by context.

The stereo convention for the bond.

Only one convention is allowed.

The length between the atoms.

This is either an experimental measurement or used to build up internal coordinates (as in a z-matrix). Only one is allowed.

We expect to make length a child of molecule and remove it from here.

The two atoms in the bond.

This will be the normal reference attribute on the bond element. The order of atoms is preserved and may matter for some conventions (e.g. wedge/hatch or donor bonds).

The atoms in the bond.

This is designed for multicentre bonds (as in delocalised systems or electron-deficient centres). The semantics are experimental at this stage. As an example, a B-H-B bond might be described as <bond atomRefs="b1 h2 b2"/>.

Bonds involved in the bond.

This is designed for pi-bonds and other systems where formal valence bonds are not drawn to atoms. The semantics are experimental at this stage. As an example, a Pt-|| bond (as the Pt-ethene bond in Zeise's salt) might be described as <bond atomRefs="pt1" bondRefs="b32"/>.

The order of the bond.

There is NO default. This order is for bookkeeping only and is not related to length, QM calculations or other experimental or theoretical calculations. See orderType.

A container for a number of bonds.

bondArray is a child of molecule and contains bond information. There are two strategies:

Create individual bond elements under bondArray (in any order). This gives the greatest flexibility but is the most verbose.
Create *Array attributes (e.g. of orderArrayType) under bondArray. This requires all arrays to be of identical length, with explicit values for all bonds in every array. This is NOT suitable for complexType bond children such as bondStereo, nor can IDs be added to bonds. It also cannot be checked as easily by schema or schematron validation. The atomRef1Array and atomRef2Array attributes are then mandatory. It is allowed (though not yet recommended) to add *Array children such as floatArray.

The attributes are directly related to the scalar attributes under bond, which should be consulted for more information.

Example - these are exactly equivalent representations

<cml title="bondArray example 1">
  <list>
    <bondArray>
      <bond id="b1" atomRefs2="a1 a2" 
        order="1"/>
      <bond id="b2" atomRefs2="a1 a3" order="2"/>
      <bond id="b3" atomRefs2="a3 a5" order="1"/>
    </bondArray>
    <bondArray
      atomRef1="a1 a1 a3"
      atomRef2="a2 a3 a5"
      order="1 2 1"/>
  </list>
</cml>

The IDs for the bonds. Required in array mode.

The first atoms in each bond. Required in array mode.

The second atoms in each bond. Required in array mode.

The bond orders in each bond. Used in array mode.

A general container for CML elements.

Often the root of the CML (sub)document. It has no explicit function but serves to hold the dictionaries and namespaces, and can alert CML processors and search/XMLQuery tools that there is chemistry in the document. It can contain any content, but usually holds a list of molecules and other CML components. It can be nested.

<cml id="c1" title="demo of cml subelements"
  xmlns:cmlr="http://www.xml-cml.org/schema/reaction"
  xmlns:cmls="http://www.xml-cml.org/schema/spectrum"
  xmlns:stm="http://www.xml-cml.org/schema/stmml">
  <stm:dictionary dictRef="d1" href="dict1.xml"/>
  <stm:unitList dictRef="u1" href="units1.xml"/>
  <cml>
    <molecule id="m1"/>
  </cml>  
  <molecule id="m2" title="dummy"/>
  <metadata/>
  <cmlr:reaction>
    <cmlr:reactantList>
      <molecule id="r1"/>
    </cmlr:reactantList>
    <cmlr:productList>
      <molecule id="p1"/>
    </cmlr:productList>
  </cmlr:reaction>
  <cmls:spectrum>
    <cmls:data>
      <stm:array/>
      <stm:array/>
    </cmls:data>
  </cmls:spectrum>
  <substanceList id="subList1"/>
  <list>
    <scalar title="some scalar"/>
  </list>
</cml>

No specific restrictions.

A container for crystallographic cell parameters and spacegroup.

Required if fractional coordinates are provided for a molecule.

There are precisely SIX child scalars to represent the cell lengths and angles in that order. There are no default values.

<cml title="crystal example">
  <molecule id="m1">
    <crystal z="4">
      <scalar title="a" errorValue="0.001" units="units:angstrom">4.500</scalar>
      <scalar title="b" errorValue="0.001" units="units:angstrom">4.500</scalar>
      <scalar title="c" errorValue="0.001" units="units:angstrom">4.500</scalar>
      <scalar title="alpha" units="units:degree">90</scalar>
      <scalar title="beta" units="units:degree">90</scalar>
      <scalar title="gamma" units="units:degree">90</scalar>
      <symmetry id="s1" spaceGroup="Fm3m"/>
    </crystal>
    <atomArray>
      <atom id="a1" elementType="Na" formalCharge="1" xyzFract="0.0 0.0 0.0" xy2="+23.2 -21.0"/> 
      <atom id="a2" elementType="Cl" formalCharge="-1" xyzFract="0.5 0.0 0.0"/> 
    </atomArray>
  </molecule>
</cml>

All 6 cell parameters must be given, even where angles are fixed by symmetry. The order is fixed as a, b, c, alpha, beta, gamma (as in the example above), and software can neglect any title or dictRef attributes. Error estimates can be given if required. Any units can be used, but the defaults are Angstrom (10^-10 m) and degrees.
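Since software may ignore the title and dictRef attributes, the cell parameters can be read purely by position. The sketch below assumes the order used in the crystal example (a, b, c, alpha, beta, gamma) and un-namespaced scalar children; the names cell_parameters and CELL_ORDER are hypothetical, and units and error estimates are ignored.

```python
import xml.etree.ElementTree as ET

# Positional order assumed from the crystal example above.
CELL_ORDER = ("a", "b", "c", "alpha", "beta", "gamma")

def cell_parameters(crystal_xml):
    """Read the six cell parameters of a <crystal> element by position."""
    crystal = ET.fromstring(crystal_xml)
    scalars = crystal.findall("scalar")
    if len(scalars) != 6:
        raise ValueError(f"expected 6 scalar children, found {len(scalars)}")
    return {
        name: float((scalar.text or "").strip())
        for name, scalar in zip(CELL_ORDER, scalars)
    }
```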

The number of molecules per cell. Molecules are defined as the molecule which directly contains the crystal element.

One or more electrons.

Since there is very little use of electrons in current chemical information this is a fluid concept. I expect it to be used for electron counting, input and output of theochem operations, descriptions of orbitals, spin states, oxidation states, etc. Electrons can be associated with atoms, bonds and combinations of these. At present there are no hardcoded semantics. However, atomRef and similar attributes can be used to associate electrons with atoms or bonds.

<cml title="electron example">
  <molecule id="m1">
    <atomArray atomID="a1 a2 a3 a4 a5 a6"/>
    <bondArray 
      order="A A A A A A"
      bondID="b1 b2 b3 b4 b5 b6"
      atomRef1="a1 a2 a3 a4 a5 a6"
      atomRef2="a6 a1 a2 a3 a4 a5"/>
    <electron count="6" 
      bondRefs="b1 b2 b3 b4 b5 b6"
      atomRefs="a1 a2 a3 a4 a5 a6"/>
  </molecule>
</cml>

The number of electrons.

No formal default, but assumed to be 1. At present restricted to integers.

Available for subclassing to provide alternative properties for atoms.

The stoichiometry of the molecule.

It is defined by atomArrays each with a list of elementTypes and their counts (or default=1). All other information in the atomArray is ignored. formula are nestable so that aggregates (e.g. hydrates, salts, etc.) can be described. CML does not require that formula information is consistent with (say) crystallographic information; this allows for experimental variance.

An alternative briefer representation is also available through the conciseForm. This must include whitespace round all elements and their counts, which must be explicit.

<cml title="formula example">
  <molecule id="sulfuricAcid">
    <formula concise="H 2 S 1 O 4"/>
  </molecule>
  <molecule id="CuprammoniumSulfate">
    <formula title="[Cu(NH3)4]2+ SO42-">
      <formula formalCharge="+2">
        <atomArray elementType="Cu"/>
        <formula count="4">
          <atomArray elementType="N H" count="1 3"/>
        </formula>
      </formula>
      <formula formalCharge="-2">
        <atomArray elementType="S O" count="1 4"/>
      </formula>
    </formula>    
  </molecule>
</cml>
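The nesting of formula elements implies that counts multiply down the tree. The sketch below totals the element counts of a nested formula under that reading; the function name formula_counts is hypothetical, only atomArray and nested formula children are handled (the concise attribute and formalCharge are ignored), and elements are assumed to be un-namespaced.

```python
import xml.etree.ElementTree as ET

def formula_counts(formula, multiplier=1.0):
    """Total the element counts of a (possibly nested) formula element."""
    # The count attribute has no formal default but is assumed to be 1.
    factor = multiplier * float(formula.get("count", "1"))
    totals = {}
    for child in formula:
        if child.tag == "atomArray":
            symbols = child.get("elementType", "").split()
            counts = child.get("count")
            values = [float(c) for c in counts.split()] if counts else [1.0] * len(symbols)
            if len(values) != len(symbols):
                raise ValueError("elementType and count arrays differ in length")
            for symbol, value in zip(symbols, values):
                totals[symbol] = totals.get(symbol, 0.0) + factor * value
        elif child.tag == "formula":
            for symbol, value in formula_counts(child, factor).items():
                totals[symbol] = totals.get(symbol, 0.0) + value
    return totals
```

For the cuprammonium sulfate example, this gives one Cu, four N, twelve H, one S and four O.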

A multiplier for the formula.

No formal default but assumed to be 1. Allows for fractional components.

The formal charge is normally calculated from the formal charges of the atoms. If the formalCharge attribute is given it overrides this information completely. This allows (say) metal complexes to be represented when it is difficult to apportion the charges to atoms.

A concise string representing an (unstructured) formula.

2003-03-12: Added isotopic and atoms.

IChI identifier.

Supports compound identifiers such as IChI. At present uses the V0.9 IChI XML representation verbatim but will almost certainly change with future IChIs.

The inclusion of elements from other namespaces causes problems with validation. The content model is deliberately LAX but the actual elements in IChI will fail the validation as they are not declared in CML.

IChI basic string.

NOT PART OF CML. This is the IChI element supporting the unique string for the connection table. It is included in this distribution because validation requires all elements to have been declared.

IChI formal charge.

NOT PART OF CML. This is the IChI element supporting the charge on a molecular fragment. It is included in this distribution because validation requires all elements to have been declared.

IChI stereo.

NOT PART OF CML. This is the IChI element supporting the stereochemistry of a molecular fragment. It is included in this distribution because validation requires all elements to have been declared.

IChI double bond stereochemistry.

NOT PART OF CML. This is the IChI element supporting the stereochemistry of "double bonds". It is included in this distribution because validation requires all elements to have been declared.

IChI atom-based stereochemistry.

NOT PART OF CML. This is the IChI element supporting the stereochemistry of an atom. It is included in this distribution because validation requires all elements to have been declared.

Isotopic substitution element.

Only present if the molecule is isotopically substituted.

Atoms within IChI isotopic element.

NOT PART OF CML.

Appears to be a whitespace-separated string of atoms.

A length between two atoms.

<cml title="length example">
  <molecule id="m1">
    <atomArray atomID="a1 a2 a3"/>
    <length atomRefs2="a3 a1">1.534</length>
  </molecule>
</cml>

Available for subclassing to provide alternative lengths (e.g. in conformations).

A container for atoms, bonds and submolecules.

molecule is a container for atoms, bonds and submolecules along with properties such as crystal and non-builtin properties. It should either contain molecule or *Array for atoms and bonds. A molecule can be empty (e.g. we just know its name, id, etc.).

"Molecule" need not represent a chemically meaningful molecule. It can contain atoms with bonds (as in the solid state) and it could simply carry a name (e.g. "taxol") without formal representation of the structure. It can contain "sub molecules", which are often discrete subcomponents (e.g. guest-host).

Molecule can contain a <list> element to contain data related to the molecule. Within this can be string/float/integer and other nested lists.

<cml title="schematic molecule example">
  <molecule id="dummyId">
    <atomArray>
      <atom id="a1" elementType="C" 
        hydrogenCount="0" x2="6.1964" y2="8.988"/>
      <atom id="a2" elementType="C" 
        hydrogenCount="0" x2="6.1964" y2="7.587"/>
      <atom id="a3" elementType="C" 
        hydrogenCount="2" x2="4.983" y2="6.887"/>
<!-- omitted -->
      <atom id="a28" elementType="C" 
        hydrogenCount="3" x2="15.777" y2="6.554"/>
      <atom id="a29" elementType="O" 
        hydrogenCount="0" x2="13.388" y2="6.188"/>
    </atomArray>
    <bondArray>
      <bond atomRefs2="a1 a2" order="1"/>
      <bond atomRefs2="a2 a3" order="1"/>
      <bond atomRefs2="a3 a4" order="1"/>
<!-- omitted -->
      <bond atomRefs2="a11 a15" order="1"/>
      <bond atomRefs2="a12 a18" order="1">
        <bondStereo>W</bondStereo>
      </bond>
      <bond atomRefs2="a2 a19" order="1">
        <bondStereo>W</bondStereo>
      </bond>
      <bond atomRefs2="a5 a20" order="2"/>
      <bond atomRefs2="a17 a21" order="1"/>
      <bond atomRefs2="a21 a22" order="1"/>
<!-- omitted -->
      <bond atomRefs2="a10 a9" order="1"/>
      <bond atomRefs2="a16 a29" order="2"/>
    </bondArray>
  </molecule>
</cml>

Revised content model to allow any order of lengths, angles, torsions 2003-01-01.

Added role attribute 2003-03-19.

The float|integer|string children are for compatibility with CML-1 and are deprecated. scalar|array|matrix should be used instead.

The formula attribute should only be used for simple formulae (i.e. without brackets or other nesting, for which the formula child should be used). The attribute might be used as a check on the child elements or for ease of representation.

The count for the molecule.

No formal default but assumed to be 1. Fractional values are allowed to describe variable stoichiometry.

The chirality of the complete system.

This is being actively investigated by an IUPAC committee (2002) so the convention is likely to change. No formal default.

The formalCharge on the molecule.

NOT the calculated charge or oxidation state. This attribute should be used when it is impossible or artificial to assign charges to each atom, as in coordination complexes. It is then required that all atom formalCharge attributes are omitted. No formal default, but assumed to be zero if omitted. It may become good practice to include it.
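The relationship between the molecule-level and atom-level formalCharge can be sketched as below. The function name molecule_formal_charge is hypothetical; the sketch assumes un-namespaced elements and individual atom elements, treats a missing atom formalCharge as 0, and rejects a molecule that carries formalCharge both on itself and on its atoms.

```python
import xml.etree.ElementTree as ET

def molecule_formal_charge(molecule_xml):
    """Return the formal charge of a molecule from its own attribute or its atoms."""
    molecule = ET.fromstring(molecule_xml)
    explicit = molecule.get("formalCharge")
    atoms = list(molecule.iter("atom"))
    if explicit is not None:
        # With a molecule-level charge, atom-level charges must be omitted.
        if any(atom.get("formalCharge") is not None for atom in atoms):
            raise ValueError("formalCharge given on both molecule and atoms")
        return int(explicit)
    # Absence of formalCharge on an atom implies 0.
    return sum(int(atom.get("formalCharge", "0")) for atom in atoms)
```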

The spin multiplicity for the molecule.

This attribute gives the spin multiplicity of the molecule and is independent of any atomic information. No default, and it may take any positive integer value (though values are normally between 1 and 5).

Is the molecule oriented to the symmetry.

No formal default, but a molecule is assumed to be oriented according to any <symmetry> children. This is required for crystallographic data, but some systems for isolated molecules allow specification of arbitrary Cartesian or internal coordinates, which must be fitted or refined to a prescribed symmetry. In this case the attribute value is false.

Role of the molecule

No formal semantics (yet). The role describes the purpose of the molecule element at this stage in the information. Examples can be "conformation", "dynamicsStep", "vibration", "valenceBondIsomer", etc. This attribute may be used by applications to determine how to present a set of molecule elements.

A string identifying a molecule, atom or (possibly) other elements.

name is used for chemical names (formal and trivial) for molecules and also for identifiers such as CAS registry and RTECS. It can also be used for labelling atoms. It should be used in preference to the title attribute because it is repeatable and can be linked to a dictionary.

Constraining patterns can be described in the dictionary and used to validate names.

<cml title="name example">
  <molecule id="aspirin">
    <name convention="INN">aspirin</name>
    <name convention="IUPAC">2-acetoxybenzoic acid</name>
    <name convention="trivial">acetylsalicylic acid</name>
  </molecule>
</cml>

A container for a property.

property can contain one or more children, usually scalar, array or matrix. The dictRef attribute is required, even if there is a single scalar child with the same dictRef. The property may have a different dictRef from the child, thus providing an extension mechanism.
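
A minimal sketch of a property with a single scalar child, placed here inside a substance. The dictionary entry cmlPhys:bpt and the unit units:celsius are hypothetical names chosen for illustration. Both the property and its scalar carry a dictRef, and the property's dictRef is present even though the child repeats it.

<cml title="property example">
  <substance title="ethanol" id="ethanol"
    xmlns:cmlPhys="http://www.xml-cml.org/dict/physical">
    <!-- cmlPhys:bpt and units:celsius are hypothetical dictionary entries -->
    <property dictRef="cmlPhys:bpt" title="boiling point">
      <scalar dataType="xsd:float" dictRef="cmlPhys:bpt"
        units="units:celsius">78.4</scalar>
    </property>
  </substance>
</cml>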

Properties may have a state attribute to distinguish the state of matter.

Not yet written

The role of the property. Semantics are not yet controlled but could include thermochemistry, kinetics or other common properties.

A container for one or more properties.

propertyList can contain several properties. These include (but are not limited to) observations or numeric quantities.

Not yet written

The role of the propertyList. Semantics are not yet controlled but could include thermochemistry, kinetics or other common properties.
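
As an illustration only, a propertyList grouping two numeric properties might look like the following. The dictionary entries (cmlPhys:mpt, cmlPhys:density), the unit names and the ids are hypothetical.

<cml title="propertyList example">
  <propertyList id="pl1" title="ethanol properties"
    xmlns:cmlPhys="http://www.xml-cml.org/dict/physical">
    <!-- all dictRef and units values below are hypothetical entries -->
    <property dictRef="cmlPhys:mpt">
      <scalar dataType="xsd:float" units="units:celsius">-114.1</scalar>
    </property>
    <property dictRef="cmlPhys:density">
      <scalar dataType="xsd:float" units="units:g.cm-3">0.789</scalar>
    </property>
  </propertyList>
</cml>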

A container supporting "cis/trans", "wedge hatch" and other stereochemistry.

An explicit list of atomRefs must be given, or it must be a child of bond. There are no implicit conventions such as E/Z. This will be extended to other types of stereochemistry.

At present the following are supported:

No atomRefs attribute. Deprecated, but probably unavoidable. This must be a child of bond where it picks up the two atomRefs in the atomRefs2 attribute. Possible values are C/T (which only makes sense if there is exactly one ligand at each end of the bond) and W/H. The latter should be replaced by atomParity wherever possible. Note that W/H makes no sense without 2D atom coordinates.
atomRefs4 attribute. The 4 atoms represent a cis or trans configuration. This may or may not be a child of bond; if so the second and third atomRefs should be identical with the two atomRefs in the bond. This structure can be used to guide processors in processing stereochemistry and is recommended, since there is general agreement on the semantics. The semantics of bondStereo not related to bonds are less clear (e.g. cumulenes, substituted ring nuclei). It is currently an error to have more than one bondStereo referring to the same ordered 4-atom list.
atomRefs attribute. There are other stereochemical conventions such as cis/trans for metal complexes which require a variable number of reference atoms. This allows users to create their own - at present we do not see CML creating exhaustive tables. For example cis/trans square-planar complexes might require 4 (or 5) atoms for their definition, octahedral 6 or 7, etc. In principle this is very powerful and could supplement or replace the use of cis-, mer-, etc.

The atomRefs and atomRefs4 attributes cannot be used simultaneously.

<cml title="bondArray example">
  <bondArray>
    <bond id="b1" atomRefs2="a3 a8" order="D">
      <electron bondRef="b1"/>
      <bondStereo>C</bondStereo>
    </bond>
    <bond id="b2" atomRefs2="a3 a8" order="S">
      <bondStereo convention="MDL" conventionValue="6"/>
    </bond>
  </bondArray>
</cml>
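
The recommended atomRefs4 form might be written as in the following sketch. The atom ids are hypothetical; the second and third entries of atomRefs4 repeat the two atoms of the enclosing bond, and T marks a1 and a4 as trans across it.

<cml title="bondStereo atomRefs4 example">
  <bondArray>
    <!-- hypothetical ids: a2=a3 is the double bond, a1 and a4 its ligands -->
    <bond id="b1" atomRefs2="a2 a3" order="D">
      <bondStereo atomRefs4="a1 a2 a3 a4">T</bondStereo>
    </bond>
  </bondArray>
</cml>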

The stereo value when the convention attribute is used.

When convention is used this attribute must be present and element content must be empty.

A chemical substance.

substance represents a chemical substance which is deliberately very general. It can represent things that may or may not be molecules, can and cannot be stored in bottles and may or may not be microscopic. Solutions and mixtures can be described by substanceLists of substances. The type attribute can be used to give qualitative information characterising the substance ("granular", "90%", etc.) and role to describe the role in process ("desiccant", "support", etc.). There is currently no controlled vocabulary. Note that reaction is likely to have more precise semantics.

The amount of a substance is controlled by the optional amount child.

<cml title="substance example">
  <substance title="ethanol" id="ethanol">
    <amount units="units:l">1.2</amount>
  </substance>
</cml>

Added property as a child 2002-12-29

type can represent concepts such as physical form, but is not limited to any vocabulary.

role depends on context, and indicates some purpose associated with the substance. It might indicate 'catalyst', 'solvent', 'antioxidant', etc. but is not limited to any vocabulary.

The count of the substance.

No fixed semantics or default.

The state of the substance

2003-03-12: Added role attribute

A list of "chemical substances".

Deliberately very general - see substance. substanceList is designed to manage solutions, mixtures, etc. and there is a small enumerated controlled vocabulary, but this can be extended through dictionaries.

substanceList can have an amount child. This can indicate the amount of a solution or mixture; the example below describes 100 ml of 0.1M NaOH(aq). Although apparently long-winded, it is precise and fully machine-interpretable.

<cml title="substanceList example">
  <substanceList id="sl1">
    <amount units="units:ml">100</amount>
    <substance id="s1">
      <amount units="units:l">1</amount>
      <molecule id="h2o" ref="mols:water"/>
    </substance>
    <substance id="s2">
      <amount units="units:mole">0.1</amount>
      <molecule id="naoh" formula="Na 1 O 1 H 1"/>
    </substance>
  </substanceList>
</cml>

Type of the substanceList:

Extension is allowed through the "other" value.

Role of the substanceList

Uncontrolled vocabulary. Might describe

Molecular, crystallographic or other symmetry.

symmetry provides a label and/or symmetry operations for molecules or crystals. Point and spacegroups can be specified by strings, though these are not enumerated, because of variability in syntax (spaces, case-sensitivity, etc.), potential high symmetries (e.g. TMV disk is D17) and non-standard spacegroup settings. Provision is made for explicit symmetry operations through <matrix> child elements.

By default the axes of symmetry are defined by the symbol - thus C2v requires z to be the unique axis, while P21/c requires b/y. Spacegroups imply the semantics defined in International Tables for Crystallography, (Int Union for Cryst., Munksgaard). Point groups are also defined therein.

The element may also be used to give a label for the symmetry species (irreducible representation) such as "A1u" for a vibration or orbital.

The matrices should be 3x3 for point group operators and 3x4 for spacegroup operators. The use of crystallographic notation ("x,1/2+y,-z") is not supported - this would be <matrix>1 0 0 0.0 0 1 0 0.5 0 0 -1 0.0</matrix>.

The default convention for point group symmetry is Schoenflies and for spacegroups is "H-M". Other conventions (e.g. "Hall") must be specified through the convention attribute.

This element implies that the Cartesians or fractional coordinates in a molecule are oriented appropriately. In some cases it may be useful to specify the symmetry of an arbitrarily oriented molecule and the <molecule> element has the attribute symmetryOriented for this purpose.

<cml title="symmetry example 1">
<symmetry pointGroup="C2v" id="s1">
  <matrix id="e" rows="3" columns="3" dataType="xsd:float" matrixType="rotation33">
    1 0 0
    0 1 0
    0 0 1
  </matrix>
  <matrix id="c2" rows="3" columns="3" dataType="xsd:float" matrixType="rotation33">
    -1 0 0
    0 -1 0
    0 0 1
  </matrix>
  <matrix id="sx" rows="3" columns="3" dataType="xsd:float" matrixType="rotation33">
    -1 0 0
    0 1 0
    0 0 1
  </matrix>
  <matrix id="sy" rows="3" columns="3" dataType="xsd:float" matrixType="rotation33">
    1 0 0
    0 -1 0
    0 0 1
  </matrix>
</symmetry>
</cml>
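
For spacegroups the operators are 3x4, with the fourth column holding the translation. The sketch below writes out the two operators of P21 (unique axis b), x,y,z and -x,1/2+y,-z, in that form. The attribute name spaceGroup and the ids are assumptions for this illustration, and matrixType is omitted because no value for 3x4 operators is given in the text.

<cml title="symmetry example 2">
<symmetry spaceGroup="P21" id="s2">
  <!-- identity: x,y,z -->
  <matrix id="op1" rows="3" columns="4" dataType="xsd:float">
    1 0 0 0.0
    0 1 0 0.0
    0 0 1 0.0
  </matrix>
  <!-- twofold screw axis along b: -x,1/2+y,-z -->
  <matrix id="op2" rows="3" columns="4" dataType="xsd:float">
    -1 0 0 0.0
    0 1 0 0.5
    0 0 -1 0.0
  </matrix>
</symmetry>
</cml>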

A point group.

No fixed semantics, though Schoenflies is recommended over Hermann-Mauguin. We may provide a controlled-extensible list in the future.

A space group.

No fixed semantics, though Hermann-Mauguin or Hall is recommended over Schoenflies. We may provide a controlled-extensible list in the future.

A symmetry species.

No fixed semantics, though we may provide a controlled-extensible list in the future.

2003-03-30: added number attribute

The rotational symmetry number

Used for calculation of entropy, etc.
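
As a brief illustration, a C2v molecule such as water has a rotational symmetry number of 2, which might be recorded as follows. The id is hypothetical.

<cml title="symmetry number example">
  <!-- number holds the rotational symmetry number (2 for C2v) -->
  <symmetry pointGroup="C2v" number="2" id="s3"/>
</cml>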

A torsion angle ("dihedral") between 4 distinct atoms.

The atoms need not be formally bonded. It can be used for:

Recording experimentally determined torsion angles (e.g. in a crystallographic paper).
Providing the torsion component for internal coordinates (e.g. z-matrix).

Note that the order of atoms is important.

<molecule id="m1">
  <atomArray atomID="a1 a2 a3 a4"/>
  <torsion atomRefs4="a4 a2 a3 a1" units="degrees">123</torsion>
</molecule>

A reference to an existing torsion.

Available for subclassing to provide alternative torsions for conformations.

CML-1 dataType (DEPRECATED).

<cml title="CML-1 JCICS examples">
  <molecule id="formamide">

    <atomArray>
      <stringArray builtin="atomId">H1 C1 O1 N1 Me1 Me2</stringArray>
      <stringArray builtin="elementType">H  C  O  N  C   C</stringArray>
      <integerArray builtin="hydrogenCount">0  1  0  1  3   3</integerArray>
    </atomArray>
    <bondArray>
      <stringArray builtin="atomRef">C1 C1 C1 N1 N1</stringArray>
      <stringArray builtin="atomRef">H1 O1 N1 Me1 Me2</stringArray>
      <stringArray builtin="order">1 2 1 1 1</stringArray>
    </bondArray>
  <!-- this is not schema-validatable at present -->
<!--
  <list title="documentation">
    <h:html xmlns:h="http://www.w3.org/TR/html20">
    <p>Formamide is the simplest amide ...</p>
    <p>This represents a <emph>connection table</emph>
 for formamide. The structure corresponds to the diagram:</p>

    <pre>
      H3       H1
        \     /
         N1-C1=O1
        /
      H2
</pre>
  </h:html>
  </list>
  -->
      <list title="local information">
      <float title="molecularWeight" units="g">45.03</float>
<!--    <link title="safety" href="/safety/chemicals.xml#formamide">
    </link>
-->
      <string title="location">Storeroom 12.3</string>
    </list>
  </molecule>
</cml>
