This document has been automatically generated from the XSD Schema, using XSLT stylesheets. Schemas are complex and it is not easy to produce the "best" view. It is possible that some information is included twice and (possibly) some is omitted. The Schema itself should always be taken as definitive.
This schema represents a fundamental core for future CML. Some of the earlier elements may be obsolete, and some will be moved into new CML schemaspaces. The vocabulary is essentially unaltered but the syntax is simpler and the validation is more powerful.
CML2.1 is the reference release for the JCICS publication and can be used with confidence that it will not be altered (other than essential bugfixes and additional documentation). Further versions will proceed via the CML2.2 branch, and are primarily driven by the need to support the extended CML family of schemas.
There is a prototypic validation procedure based on XSLT stylesheets with namespace prefix val. The syntax is XSL. The only example occurs in bond at present. Some global val resources will be defined in this section.
<xsd:appinfo>
<val:key names="atoms" match="atom" use="@id"/>
<val:key names="bonds" match="bond" use="@id"/>
<val:key names="molecules" match="molecule" use="@id"/>
<val:template name="error">
<val:param name="error"/>
<val:message>XSLT validation error: <val:value-of select="$error"/></val:message>
<val:element name="error">
XSLT validation error: <val:value-of select="$error"/>
</val:element>
</val:template>
</xsd:appinfo>
There is no controlled vocabulary for conventions, but the author must ensure that the semantics are openly available and that there are mechanisms for implementation. The convention is inherited by all the subelements, so that a convention for molecule would by default extend to its bond and atom children. This can be overwritten if necessary by an explicit convention.
It may be useful to create conventions with namespaces (e.g. iupac:name). Use of convention will normally require non-STMML semantics, and should be used with caution. We would expect that conventions prefixed with "ISO" would be useful, such as ISO8601 for dateTimes.
There is no default, but the conventions of STMML or the related language (e.g. CML) will be assumed.
<bond convention="fooChem" order="-5" xmlns:fooChem="http://www.fooChem/conventions"/>
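The inheritance rule described above (a convention applies to all subelements unless one of them sets its own) can be sketched in a few lines. This is only an illustration: the function name `effective_conventions` and the convention value `barChem` are hypothetical and not part of the schema.

```python
import xml.etree.ElementTree as ET

def effective_conventions(root):
    """Map each element to the convention it inherits (None if no ancestor sets one)."""
    result = {}

    def walk(element, inherited):
        # An explicit convention attribute overrides the inherited value.
        current = element.get("convention", inherited)
        result[element] = current
        for child in element:
            walk(child, current)

    walk(root, None)
    return result

doc = ET.fromstring(
    '<molecule id="m1" convention="fooChem">'
    '<atomArray><atom id="a1"/></atomArray>'
    '<bondArray><bond id="b1" convention="barChem"/></bondArray>'
    '</molecule>'
)
# Keep only elements that carry an id, to make the result easy to read.
conventions = {
    el.get("id"): conv
    for el, conv in effective_conventions(doc).items()
    if el.get("id")
}
print(conventions)  # {'m1': 'fooChem', 'a1': 'fooChem', 'b1': 'barChem'}
```

The atom inherits `fooChem` from its molecule, while the bond keeps its explicit `barChem`.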
Example:
In the protein database ' CA' and 'CA' are different atom types, and an array could be:
<array delimiter="/" dictRef="pdb:atomTypes">/ N/ CA/CA/ N/</array>
Note that the array starts and ends with the delimiter, which must be chosen to avoid accidental use. There is currently no syntax for escaping delimiters.
A reference to a dictionary entry.
Elements in data instances such as scalar may have a dictRef attribute to point to an entry in a dictionary. To avoid excessive use of (mutable) filenames and URIs we recommend a namespace prefix, mapped to a namespace URI in the normal manner. In this case, of course, the namespace URI must point to a real XML document containing entry elements and validated against STMML Schema.
Where there is concern about the dictionary becoming separated from the document the dictionary entries can be physically included as part of the data instance and the normal XPointer addressing mechanism can be used.
This attribute can also be used on dictionary elements to define the namespace prefix.
<stmml title="dictRef example"> <scalar dataType="xsd:float" title="surfaceArea" dictRef="cmlPhys:surfArea" xmlns:cmlPhys="http://www.xml-cml.org/dict/physical" units="units:cm2">50</scalar> </stmml>
<stmml title="dictRef example 2">
<stm:list xmlns:stm="http://www.xml-cml.org/schema/stmml">
<stm:observation>
<p>We observed <object count="3" dictRef="foo:p1"/>
constructing dwellings of different material</p>
</stm:observation>
<stm:entry id="p1" term="pig">
<stm:definition>A domesticated animal.</stm:definition>
<stm:description>Predators include wolves</stm:description>
<stm:description class="scientificName">Sus scrofa</stm:description>
</stm:entry>
</stm:list>
</stmml>
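As a rough sketch of how software might resolve a dictRef, the prefix can be looked up against the namespace declarations in scope and the suffix treated as the entry id. The function name `resolve_dict_ref` is hypothetical, and the sketch assumes the prefix is declared on the element or one of its ancestors; it does not fetch or validate the dictionary itself.

```python
from xml.dom import minidom

def resolve_dict_ref(element):
    """Return (namespace_uri, entry_id) for the element's dictRef, or None if unresolved."""
    dict_ref = element.getAttribute("dictRef")
    if ":" not in dict_ref:
        return None
    prefix, entry_id = dict_ref.split(":", 1)
    node = element
    # Walk up the ancestors until an xmlns:prefix declaration is found.
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute("xmlns:" + prefix):
            return node.getAttribute("xmlns:" + prefix), entry_id
        node = node.parentNode
    return None

doc = minidom.parseString(
    '<stmml xmlns:cmlPhys="http://www.xml-cml.org/dict/physical">'
    '<scalar dictRef="cmlPhys:surfArea">50</scalar>'
    '</stmml>'
)
scalar = doc.getElementsByTagName("scalar")[0]
resolved = resolve_dict_ref(scalar)
print(resolved)  # ('http://www.xml-cml.org/dict/physical', 'surfArea')
```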
ref modifies an element into a reference to an existing element of that type within the document. This is similar to a pointer and it can be thought of as a strongly typed hyperlink. It may also be used for "subclassing" or "overriding" elements.
<stmml title="ref example">
<cml>
<molecule id="m1">
<atomArray>
<atom elementType="N"/>
<atom elementType="O"/>
</atomArray>
</molecule>
<html:p>The action of <molecule ref="#m1"/> on cardiac muscle ...</html:p>
</cml>
</stmml>
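One way a processor might dereference a ref is sketched below. It assumes that the target has the same element name as the referring element (the "strongly typed" aspect) and that a leading `#` is optional; the helper name `dereference` is hypothetical.

```python
import xml.etree.ElementTree as ET

def dereference(root, element):
    """Return the element that `element` points to through its ref attribute."""
    ref = element.get("ref")
    if ref is None:
        return element  # not a reference, so it stands for itself
    target_id = ref[1:] if ref.startswith("#") else ref
    # Only elements with the same tag are candidates (assumed typing rule).
    for candidate in root.iter(element.tag):
        if candidate.get("id") == target_id:
            return candidate
    raise LookupError(f"no <{element.tag}> with id {target_id!r}")

doc = ET.fromstring(
    '<cml>'
    '<molecule id="m1"><atomArray>'
    '<atom elementType="N"/><atom elementType="O"/>'
    '</atomArray></molecule>'
    '<p>The action of <molecule ref="#m1"/> on cardiac muscle</p>'
    '</cml>'
)
reference = doc.find("p/molecule")
target = dereference(doc, reference)
print(target.get("id"))  # m1
```

The function returns the target node itself rather than a copy, which matches the absence of deep copy semantics in the schema.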
<stmml title="title example"> <action title="turn on heat" start="T09:00:00" convention="xsd"/> </stmml>
An array of coordinateComponents for a single coordinate, where these all refer to an X-coordinate (NOT x,y,z). Instances of this type will be used in array-style representation of 2-D or 3-D coordinates.
Currently no machine validation.
Currently not used in STMML, but re-used by CML (see example).
<stmml title="coordinateComponentArrayType">
<cml:atomArray
xmlns:cml="http://www.xml-cml.org/schema/cml2/core"
x2="1.2 2.3 4.5 6.7"/>
</stmml>
An x/y coordinate pair consisting of two real numbers, separated by whitespace or a comma. In arrays and matrices, it may be useful to set a separate delimiter.
<stmml title="coordinate2Type example">
<list>
<array dataType="xsd:decimal"
>1.2,3.4 3.2,4.5 6.7,23.1 </array>
<array delimiter="/" dataType="xsd:decimal"
>/1.2 3.4/3.2 4.5/6.7 23.1/</array>
</list>
</stmml>
An x/y/z coordinate triple consisting of three real numbers, separated by whitespace or commas. In arrays and matrices, it may be useful to set a separate delimiter.
<stmml title="coordinate3Type example">
<list>
<array dataType="xsd:decimal">1.2,3.4,1.2
3.2,4.5,7.3 6.7,23.1,5.6 </array>
<array delimiter="/" dataType="xsd:decimal"
>/1.2 3.4 3.3/3.2 4.5 4.5/6.7 23.1 5.6/</array>
</list>
</stmml>
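The two encodings shown in the coordinate examples can be read with one small routine. The sketch below is an assumption about how a consumer might do it, not a prescribed algorithm: it treats the delimiter, commas and whitespace all as separators and then groups the numbers by dimension, so it does not check that each delimited item holds exactly one coordinate. The name `parse_coordinates` is hypothetical.

```python
import re

def parse_coordinates(text, dimension, delimiter=None):
    """Parse array content into a list of coordinate tuples of the given dimension."""
    if delimiter is not None:
        # Treat the array delimiter like any other separator.
        text = text.replace(delimiter, " ")
    tokens = [t for t in re.split(r"[,\s]+", text) if t]
    if len(tokens) % dimension != 0:
        raise ValueError(
            f"{len(tokens)} numbers cannot be grouped into {dimension}-tuples"
        )
    numbers = [float(t) for t in tokens]
    return [
        tuple(numbers[i:i + dimension])
        for i in range(0, len(numbers), dimension)
    ]

pairs = parse_coordinates("1.2,3.4 3.2,4.5 6.7,23.1 ", 2)
triples = parse_coordinates("/1.2 3.4 3.3/3.2 4.5 4.5/6.7 23.1 5.6/", 3, delimiter="/")
print(pairs)    # [(1.2, 3.4), (3.2, 4.5), (6.7, 23.1)]
print(triples)  # [(1.2, 3.4, 3.3), (3.2, 4.5, 4.5), (6.7, 23.1, 5.6)]
```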
A count multiplier for an element
Many elements represent objects which can occur an arbitrary number of times in a scientific context. Examples are action, object or molecules.
<stmml title="countType example">
<list>
<object title="frog" count="10"/>
<action title="step3" count="3">
<p>Add 10 ml reagent</p>
</action>
</list>
</stmml>
An enumerated type for all built-in dataTypes allowed in STM.
dataTypeType represents an enumeration of allowed dataTypes (at present identical with those in XML Schema Part 2: Datatypes). This means that implementers should be able to use standard XMLSchema-based tools for validation without major implementation problems.
It will often be used as an attribute on scalar, array or matrix elements.
<stmml title="dataType example">
<list xmlns="http://www.xml-cml.org/schema/cml2/core">
<scalar dataType="xsd:boolean" title="she loves me">true</scalar>
<scalar dataType="xsd:float" title="x">23.2</scalar>
<scalar dataType="xsd:duration" title="egg timer">PT4M</scalar>
<scalar dataType="xsd:dateTime" title="current date and time">2001-02-01T00:30:00</scalar>
<scalar dataType="xsd:time" title="wake up">06:00:00</scalar>
<scalar dataType="xsd:date" title="where is it">1752-09-10</scalar>
<scalar dataType="xsd:anyURI" title="CML site">http://www.xml-cml.org/</scalar>
<scalar dataType="xsd:QName" title="CML atom">cml:atom</scalar>
<scalar dataType="xsd:normalizedString" title="song">the mouse ran up the clock</scalar>
<scalar dataType="xsd:language" title="UK English">en-GB</scalar>
<scalar dataType="xsd:Name" title="atom">atom</scalar>
<scalar dataType="xsd:ID" title="XML ID">_123</scalar>
<scalar dataType="xsd:integer" title="the answer">42</scalar>
<scalar dataType="xsd:nonPositiveInteger" title="zero">0</scalar>
</list>
</stmml>
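Because the content of a scalar is declared as a string, software has to apply the dataType itself. The following is a minimal sketch covering only a handful of the types; the table `CONVERTERS` and the function `convert_scalar` are hypothetical names, and the Python converters are somewhat more lenient than the XSD lexical rules (for example `float` also accepts `1_000`).

```python
from datetime import date
from decimal import Decimal

def _parse_boolean(text):
    # xsd:boolean allows the literals true, false, 1 and 0.
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError(f"not an xsd:boolean: {text!r}")

# Only a small, illustrative subset of the XSD datatypes.
CONVERTERS = {
    "xsd:string": str,
    "xsd:boolean": _parse_boolean,
    "xsd:integer": int,
    "xsd:float": float,
    "xsd:decimal": Decimal,
    "xsd:date": date.fromisoformat,
}

def convert_scalar(text, data_type="xsd:string"):
    """Convert scalar content to a Python value according to its dataType."""
    if data_type not in CONVERTERS:
        raise ValueError(f"unsupported dataType: {data_type}")
    if data_type != "xsd:string":
        text = text.strip()  # surrounding whitespace is not significant here
    return CONVERTERS[data_type](text)

print(convert_scalar("42", "xsd:integer"))       # 42
print(convert_scalar("true", "xsd:boolean"))     # True
print(convert_scalar("1752-09-10", "xsd:date"))  # 1752-09-10
```

Note that an invalid xsd:decimal raises `decimal.InvalidOperation` rather than `ValueError`, so callers may want to catch both.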
Some STMML elements (such as array) have content representing concatenated values. The default separator is whitespace (which can be normalised) and this should be used whenever possible. However in some cases the values are empty, or contain whitespace or other problematic punctuation, and a delimiter is required.
Note that the content string MUST start and end with the delimiter so there is no ambiguity as to what the components are. Only printable characters from the ASCII character set should be used, and character entities should be avoided.
When delimiters are used to separate precise whitespace this should always consist of spaces and not the other allowed whitespace characters (newline, tabs, etc.). If the latter are important it is probably best to redesign the application.
<stmml title="delimiter example">
<array size="4" dataType="xsd:string" delimiter="|">|A|B12||D and E|</array>
</stmml>
The values in the array are "A", "B12", "" (the empty string) and "D and E"; note the spaces in the last value.
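The splitting rule for delimited content can be sketched as follows. The function name `split_delimited` is hypothetical; the sketch assumes that whitespace outside the first and last delimiter is ignored and that whitespace inside an element is significant.

```python
def split_delimited(content, delimiter):
    """Split delimited array content, preserving empty strings and inner spaces."""
    content = content.strip()  # whitespace outside the outer delimiters is ignored
    if len(content) < 2 * len(delimiter) or not (
        content.startswith(delimiter) and content.endswith(delimiter)
    ):
        raise ValueError("content must start and end with the delimiter")
    inner = content[len(delimiter):-len(delimiter)]
    return inner.split(delimiter)

values = split_delimited("|A|B12||D and E|", "|")
print(values)  # ['A', 'B12', '', 'D and E']
```

With this rule an array of n elements always contains n+1 delimiters, which gives a simple consistency check against the size attribute.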
Errors in values can be of several types and this simpleType provides a small controlled vocabulary
<stmml title="scalar example">
<scalar
dataType="xsd:decimal"
errorValue="1.0"
errorBasis="observedStandardDeviation"
title="body weight"
dictRef="zoo:bodywt"
units="units:g">34.3</scalar>
</stmml>
An observed or calculated estimate of the error in the value of a numeric quantity. It should be ignored for dataTypes such as URL, date or string. The statistical basis of the errorValueType is not defined - it could be a range, an estimated standard deviation, an observed standard error, etc. This information can be added through errorBasisType.
<stmml title="scalar example">
<scalar
dataType="xsd:decimal"
errorValue="1.0"
errorBasis="observedStandardDeviation"
title="body weight"
dictRef="zoo:bodywt"
units="units:g">34.3</scalar>
</stmml>
This is not formally of type ID (an XML NAME which must start with a letter and contain only letters, digits and .-_:). It is recommended that IDs start with a letter, and contain no punctuation or whitespace. The function generate-id() in XSLT will generate semantically void unique IDs.
It is difficult to ensure uniqueness when documents are merged. We suggest namespacing IDs, perhaps using the containing elements as the base. Thus mol3:a1 could be a useful unique ID. However this is still experimental.
An array of floats or other real numbers. Not used in STM Schema, but re-used by CML and other languages.
<atomArray xmlns="http://www.xml-cml.org/schema/cml2/core" x2="1.2 2.3 3.4 5.6"/>
An array of integers; for re-use by other schemas
Not machine-validatable
<stmml title="integerArray type">
<atomArray xmlns="http://www.xml-cml.org/schema/cml2/core"
hydrogenCount="3 1 0 2"/>
</stmml>
The maximum INCLUSIVE value of a sortable quantity such as numeric, date or string. It should be ignored for dataTypes such as URL. The use of min and max attributes can be used to give a range for the quantity. The statistical basis of this range is not defined. The value of max is usually an observed quantity (or calculated from observations). To restrict a value, the maxExclusive type in a dictionary should be used.
The type of the maximum is the same as the quantity to which it refers - numeric, date and string are currently allowed
<stmml title="maxType example"> <scalar dataType="xsd:float" max="20" min="12">15</scalar> </stmml>
Allowed matrix types. These are mainly square matrices
<stmml title="matrix example">
<matrix id="m1" title="matrix-1" dictRef="foo:bar"
rows="3" columns="3" dataType="xsd:decimal"
delimiter="|" matrixType="squareSymmetric" units="unit:m"
>|1.1|1.2|1.3|1.2|2.2|2.3|1.3|2.3|3.3|</matrix>
</stmml>
1 2 3 4
0 3 5 6
0 0 4 8
0 0 0 2
The minimum INCLUSIVE value of a sortable quantity such as numeric, date or string. It should be ignored for dataTypes such as URL. The use of min and max attributes can be used to give a range for the quantity. The statistical basis of this range is not defined. The value of min is usually an observed quantity (or calculated from observations). To restrict a value, the minExclusive type in a dictionary should be used.
The type of the minimum is the same as the quantity to which it refers - numeric, date and string are currently allowed
<stmml title="minType example"> <scalar dataType="xsd:float" max="20" min="12">15</scalar> </stmml>
The namespace is optional but recommended where possible
Note: this convention is only used within STMML and related languages; it is NOT a generic URI.
<stmml title="namespace example">
<list>
<!-- dictRef is of namespaceRefType -->
<scalar dictRef="chem:mpt">123</scalar>
<!-- error -->
<scalar dictRef="mpt23">123</scalar>
</list>
</stmml>
The namespace prefix must start with an alpha character and can only contain alphanumeric characters and '_'. The suffix can have characters from the XML ID specification (alphanumeric, '_', '.' and '-').
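The lexical rule for prefix and suffix can be written as a regular expression. The pattern below is a sketch under two assumptions: that "alpha" and "alphanumeric" mean ASCII letters and digits, and that the prefix is required, as the `mpt23` error in the example suggests. The names `NAMESPACE_REF` and `is_namespace_ref` are hypothetical.

```python
import re

# prefix: a letter, then letters, digits or '_'
# suffix: letters, digits, '_', '.' or '-'
NAMESPACE_REF = re.compile(r"[A-Za-z][A-Za-z0-9_]*:[A-Za-z0-9_.\-]+")

def is_namespace_ref(value):
    """Return True if value has the form prefix:suffix described above."""
    return NAMESPACE_REF.fullmatch(value) is not None

print(is_namespace_ref("chem:mpt"))  # True
print(is_namespace_ref("mpt23"))     # False
```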
A reference to an existing element in the document. The target of the ref attribute must exist. The test for validity will normally occur in the element's appinfo
Any DOM Node created from this element will normally be a reference to another Node, so that if the target node is modified, the dereferenced content is modified as well. At present there are no deep copy semantics hardcoded into the schema.
The size of an array. Redundant, but serves as a check for processing software (useful if delimiters are used)
These will be linked to dictionaries of units with conversion information, using namespaced references (e.g. si:m)
Distinguish carefully from unitType which is an element describing a type of a unit in a unitList
<stmml title="unitList example">
<stm:unitList xmlns:stm="http://www.xml-cml.org/schema/stmml">
<!-- ========================= fundamental types =========================== -->
<stm:unitType id="length" name="length">
<stm:dimension name="length" power="1"/>
</stm:unitType>
<stm:unitType id="time" name="time">
<stm:dimension name="time" power="1"/>
</stm:unitType>
<!-- ... -->
<stm:unitType id="dimensionless" name="dimensionless">
<stm:dimension name="dimensionless" power="1"/>
</stm:unitType>
<!-- ========================== derived types ============================== -->
<stm:unitType id="acceleration" name="acceleration">
<stm:dimension name="length" power="1"/>
<stm:dimension name="time" power="-2"/>
</stm:unitType>
<!-- ... -->
<!-- ====================== fundamental SI units =========================== -->
<stm:unit id="second" name="second" unitType="time">
<stm:description>The SI unit of time</stm:description>
</stm:unit>
<stm:unit id="meter" name="meter" unitType="length" abbreviation="m">
<stm:description>The SI unit of length</stm:description>
</stm:unit>
<!-- ... -->
<stm:unit id="kg" name="nameless" unitType="dimensionless" abbreviation="nodim">
<stm:description>A fictitious parent for dimensionless units</stm:description>
</stm:unit>
<!-- ===================== derived SI units ================================ -->
<stm:unit id="newton" name="newton" unitType="force">
<stm:description>The SI unit of force</stm:description>
</stm:unit>
<!-- ... -->
<!-- multiples of fundamental SI units -->
<stm:unit id="g" name="gram" unitType="mass" parentSI="kg" multiplierToSI="0.001" abbreviation="g">
<stm:description>0.001 kg. </stm:description>
</stm:unit>
<stm:unit id="celsius" name="Celsius" parentSI="k" multiplierToSI="1" constantToSI="273.18">
<stm:description><p>A common unit of temperature</p></stm:description>
</stm:unit>
<!-- fundamental non-SI units -->
<stm:unit id="inch" name="inch" parentSI="meter" abbreviation="in" multiplierToSI="0.0254" >
<stm:description>An imperial measure of length</stm:description>
</stm:unit>
<!-- derived non-SI units -->
<stm:unit id="l" name="litre" unitType="volume" parentSI="meterCubed" abbreviation="l" multiplierToSI="0.001">
<stm:description>Nearly 1 dm**3 This is not quite exact</stm:description>
</stm:unit>
<!-- ... -->
<stm:unit id="fahr" name="fahrenheit" parentSI="k" abbreviation="F" multiplierToSI="0.55555555555555555" constantToSI="-17.777777777777777777">
<stm:description>An obsolescent unit of temperature still used in popular meteorology</stm:description>
</stm:unit>
</stm:unitList>
</stmml>
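The conversion information in a unitList could be applied along the following lines. The text does not spell out the formula, so this sketch assumes that a value is converted as value * multiplierToSI + constantToSI, with defaults of 1 and 0 when the attributes are absent. The function names `load_units` and `to_parent_si` are hypothetical.

```python
import xml.etree.ElementTree as ET

STM = "{http://www.xml-cml.org/schema/stmml}"

def load_units(unit_list):
    """Index unit elements by id as (parentSI, multiplierToSI, constantToSI)."""
    units = {}
    for unit in unit_list.iter(STM + "unit"):
        units[unit.get("id")] = (
            unit.get("parentSI"),
            float(unit.get("multiplierToSI", "1")),
            float(unit.get("constantToSI", "0")),
        )
    return units

def to_parent_si(value, unit_id, units):
    """Convert value to the unit's parent SI unit (assumed linear conversion)."""
    parent, multiplier, constant = units[unit_id]
    if parent is None:
        return value, unit_id  # no parentSI: treated as already being an SI unit
    return value * multiplier + constant, parent

unit_list = ET.fromstring(
    '<stm:unitList xmlns:stm="http://www.xml-cml.org/schema/stmml">'
    '<stm:unit id="meter" name="meter" unitType="length" abbreviation="m"/>'
    '<stm:unit id="g" name="gram" unitType="mass" parentSI="kg"'
    ' multiplierToSI="0.001" abbreviation="g"/>'
    '<stm:unit id="inch" name="inch" parentSI="meter" abbreviation="in"'
    ' multiplierToSI="0.0254"/>'
    '</stm:unitList>'
)
units = load_units(unit_list)
grams_in_kg = to_parent_si(250, "g", units)
inches_in_m = to_parent_si(10, "inch", units)
print(grams_in_kg, inches_in_m)
```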
array manages a homogeneous 1-dimensional array of similar objects. These can be encoded as strings (i.e. XSD-like datatypes) and are concatenated as string content. The size of the array should always be >= 1.
The default delimiter is whitespace. The normalize-space() function of XSLT could be used to normalize all whitespace to single spaces and this would not affect the value of the array elements. To extract the elements java.lang.StringTokenizer could be used. If the elements themselves contain whitespace then a different delimiter must be used and is identified through the delimiter attribute. This method is mandatory if it is required to represent empty strings. If a delimiter is used it MUST start and end the array - leading and trailing whitespace is ignored. Thus size+1 occurrences of the delimiter character are required. If non-normalized whitespace is to be encoded (e.g. newlines, tabs, etc) you are recommended to translate it character-wise to XML character entities.
Note that normal Schema validation tools cannot validate the elements of array (they are defined as string). However, if the string is split, a temporary schema can be constructed from the type and used for validation. Also the type can be contained in a dictionary and software could decide to retrieve this and use it for validation.
When the elements of the array are not simple scalars (e.g. scalars with a value and an error), the scalars should be used as the elements. Although this is verbose, it is simple to understand. If there is a demand for more compact representations, it will be possible to define the syntax in a later version.
<stmml title="array example 1"> <array size="5" title="value" dataType="xsd:decimal"> 1.23 2.34 3.45 4.56 5.67</array> </stmml>
The size attribute is not mandatory but provides a useful validity check:
<stmml title="array example 2"> <array size="5" title="initials" dataType="xsd:string" delimiter="/">/A B//C/D-E/F/</array> </stmml>
Note that the second array-element is the empty string ''.
<stmml title="array example 3"> <array title="mass" size="4" units="unit:g" errorBasis="observedStandardDeviation" minValues="10 11 10 9" maxValues="12 14 12 11" errorValues="1 2 1 1" dataType="xsd:float">11 12.5 10.9 10.2 </array> </stmml>
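A consumer of an array like the one in example 3 might check the size attribute and compare each value with its inclusive range. The sketch below assumes whitespace-separated numeric content and that both minValues and maxValues are present; the names `parse_floats` and `out_of_range` are hypothetical.

```python
import xml.etree.ElementTree as ET

def parse_floats(text, expected_size=None):
    """Split whitespace-separated content into floats, optionally checking the count."""
    values = [float(token) for token in text.split()]
    if expected_size is not None and len(values) != expected_size:
        raise ValueError(f"expected {expected_size} values, found {len(values)}")
    return values

def out_of_range(array_element):
    """Return the indices of values lying outside their [min, max] range (inclusive)."""
    size = array_element.get("size")
    values = parse_floats(array_element.text, int(size) if size is not None else None)
    minima = parse_floats(array_element.get("minValues"), len(values))
    maxima = parse_floats(array_element.get("maxValues"), len(values))
    return [
        index
        for index, (value, low, high) in enumerate(zip(values, minima, maxima))
        if not low <= value <= high
    ]

array = ET.fromstring(
    '<array title="mass" size="4" units="unit:g" '
    'minValues="10 11 10 9" maxValues="12 14 12 11" '
    'dataType="xsd:float">11 12.5 10.9 10.2 </array>'
)
print(out_of_range(array))  # []
```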
A generic container with no implied semantics. It just contains things and can have attributes which bind conventions to it. It could often act as the root element in an STM document.
<stmml title="list example"> <list> <array title="animals" dataType="xsd:string">frog bear toad</array> <scalar title="weight" dataType="xsd:float">3.456</scalar> </list> </stmml>
By default matrix represents a rectangular matrix of any quantities representable as XSD or STMML dataTypes. It consists of rows*columns elements, where columns is the fastest moving index. Assuming the elements are counted from 1 they are ordered V[1,1],V[1,2],...V[1,columns],V[2,1],V[2,2],...V[2,columns], ...V[rows,1],V[rows,2],...V[rows,columns].
By default whitespace is used to separate matrix elements; see array for details. There are NO characters or markup delimiting the end of rows; authors must be careful! The columns and rows attributes have no default values; a row vector requires a rows attribute of 1.
matrix also supports many types of square matrix, but at present we require all elements to be given, even if the matrix is symmetric, antisymmetric or banded diagonal. The matrixType attribute allows software to validate and process the type of matrix.
<stmml title="matrix example">
<matrix id="m1" title="matrix-1" dictRef="foo:bar"
rows="3" columns="3" dataType="xsd:decimal"
delimiter="|" matrixType="squareSymmetric" units="unit:m"
>|1.1|1.2|1.3|1.2|2.2|2.3|1.3|2.3|3.3|</matrix>
</stmml>
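The element ordering (columns as the fastest moving index) and a check for the squareSymmetric type can be sketched as below. The helper names `to_rows` and `is_square_symmetric` are hypothetical, and exact floating-point equality is used only because the example values are literal copies of each other.

```python
def to_rows(values, rows, columns):
    """Arrange a flat list into rows, with the column index moving fastest."""
    if len(values) != rows * columns:
        raise ValueError(f"expected {rows * columns} elements, found {len(values)}")
    return [values[r * columns:(r + 1) * columns] for r in range(rows)]

def is_square_symmetric(matrix):
    """Return True if the matrix is square and equal to its transpose."""
    n = len(matrix)
    return all(len(row) == n for row in matrix) and all(
        matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i)
    )

content = "|1.1|1.2|1.3|1.2|2.2|2.3|1.3|2.3|3.3|"
values = [float(v) for v in content[1:-1].split("|")]
matrix = to_rows(values, rows=3, columns=3)
print(matrix[1][2])                 # V[2,3] counted from 1: 2.3
print(is_square_symmetric(matrix))  # True
```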
Number of rows
Number of columns
units (recommended for numeric quantities!!)
A general container for metadata, including at least Dublin Core (DC) and CML-specific metadata
In its simple form each element provides a name and content in a similar fashion to the meta element in HTML. metadata may have simpleContent (i.e. a string for adding further information - this is not controlled).
<stmml title="metadata example">
<list>
<metadataList>
<metadata name="dc:coverage" content="Europe"/>
<metadata name="dc:description" content="Ornithological chemistry"/>
<metadata name="dc:identifier" content="ISBN:1234-5678"/>
<metadata name="dc:format" content="printed"/>
<metadata name="dc:relation" content="abc:def123"/>
<metadata name="dc:rights" content="licence:GPL"/>
<metadata name="dc:subject" content="Informatics"/>
<metadata name="dc:title" content="birds"/>
<metadata name="dc:type" content="bird books on chemistry"/>
<metadata name="dc:contributor" content="Tux Penguin"/>
<metadata name="dc:creator" content="author"/>
<metadata name="dc:publisher" content="Penguinone publishing"/>
<metadata name="dc:source" content="penguinPub"/>
<metadata name="dc:language" content="en-GB"/>
<metadata name="dc:date" content="1752-09-10"/>
</metadataList>
<metadataList>
<metadata name="cmlm:safety" content="mostly harmless"/>
<metadata name="cmlm:insilico" content="electronically produced"/>
<metadata name="cmlm:structure" content="penguinone"/>
<metadata name="cmlm:reaction" content="synthesis of penguinone"/>
<metadata name="cmlm:identifier" content="smiles:O=C1C=C(C)C(C)(C)C(C)=C1"/>
</metadataList>
<metadataList>
<metadata name="foo:institution" content="abc.org"/>
<metadata name="bar" content="xyzzy"/>
<metadata name="$deliberateError" content="error"/>
</metadataList>
</list>
</stmml>
A container for any events that need to be recorded, whether planned or not. They can include notes, measurements, conditions that may be referenced elsewhere, etc. There are no controlled semantics
<stmml title="observation example"> <observation type="ornithology"> <object title="sparrow" count="3"/> <observ/> </observation> </stmml>
scalar holds scalar data under a single generic container. The semantics are usually resolved by linking to a dictionary. scalar defaults to a scalar string but has attributes which affect the type.
scalar does not necessarily reflect a physical object (for which object should be used). It may reflect a property of an object such as temperature, size, etc.
Note that normal Schema validation tools cannot validate the data type of scalar (it is defined as string), but that a temporary schema can be constructed from the type and used for validation. Also the type can be contained in a dictionary and software could decide to retrieve this and use it for validation.
<stmml title="scalar example">
<scalar
dataType="xsd:decimal"
errorValue="1.0"
errorBasis="observedStandardDeviation"
title="body weight"
dictRef="zoo:bodywt"
units="units:g">34.3</scalar>
</stmml>
<cml title="atomRef example">
<molecule id="m1">
<atomArray>
<atom id="a1"/>
</atomArray>
<electron id="e1" atomRef="a1"/>
</molecule>
</cml>
<cml title="atomRefs2 example">
<molecule id="m1">
<atomArray>
<atom id="a1"/>
<atom id="a2"/>
</atomArray>
<bondArray>
<bond atomRefs2="a1 a2"/>
</bondArray>
</molecule>
</cml>
<cml title="atomRefs3 example">
<molecule id="m1">
<atomArray>
<atom id="a1"/>
<atom id="a2"/>
<atom id="a3"/>
</atomArray>
<angle atomRefs3="a1 a2 a3" units="degrees">123.4</angle>
</molecule>
</cml>
<cml title="atomRefs4 example">
<molecule id="m1">
<atomArray>
<atom id="a1"/>
<atom id="a2"/>
<atom id="a3"/>
<atom id="a4"/>
</atomArray>
<torsion atomRefs4="a1 a2 a3 a4" units="degrees">123.4</torsion>
</molecule>
</cml>
The atomRefs cannot be schema- or schematron-validated. Instances of this type will be used in array-style representation of bonds and atomParitys. It can also be used for arrays of atomIDTypes such as in complex stereochemistry, geometrical definitions, atom groupings, etc.
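Since these references are not checked by the schema, an application may want to check them itself. The sketch below is one possible approach, assuming that every referenced atom lives in the same molecule and that elements are not namespaced; the function name `dangling_atom_refs` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Attribute name and the number of atom ids it must contain.
REF_ATTRIBUTES = (("atomRef", 1), ("atomRefs2", 2), ("atomRefs3", 3), ("atomRefs4", 4))

def dangling_atom_refs(molecule):
    """List (tag, attribute, value) for references with a wrong count or unknown id."""
    atom_ids = {atom.get("id") for atom in molecule.iter("atom")}
    problems = []
    for attribute, expected in REF_ATTRIBUTES:
        for element in molecule.iter():
            refs = element.get(attribute)
            if refs is None:
                continue
            ids = refs.split()
            if len(ids) != expected or any(i not in atom_ids for i in ids):
                problems.append((element.tag, attribute, refs))
    return problems

molecule = ET.fromstring(
    '<molecule id="m1">'
    '<atomArray><atom id="a1"/><atom id="a2"/></atomArray>'
    '<bondArray><bond atomRefs2="a1 a2"/></bondArray>'
    '<angle atomRefs3="a1 a2 a9" units="degrees">123.4</angle>'
    '</molecule>'
)
problems = dangling_atom_refs(molecule)
print(problems)  # [('angle', 'atomRefs3', 'a1 a2 a9')]
```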
<cml title="atomArray example">
<molecule id="m1">
<atomArray atomID="a2 a4 a6"
elementType="O N S"/>
</molecule>
</cml>
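An array-style atomArray can be expanded into one record per atom by splitting every attribute on whitespace. This sketch assumes that all attributes on the atomArray are array-valued and of equal length, and that atomID corresponds to id on a single atom; the function name `expand_atom_array` is hypothetical.

```python
import xml.etree.ElementTree as ET

def expand_atom_array(atom_array):
    """Turn array-valued attributes into a list of per-atom attribute dictionaries."""
    columns = {name: value.split() for name, value in atom_array.attrib.items()}
    lengths = {len(tokens) for tokens in columns.values()}
    if len(lengths) > 1:
        raise ValueError("array attributes have different lengths")
    count = lengths.pop() if lengths else 0
    atoms = []
    for index in range(count):
        atom = {name: tokens[index] for name, tokens in columns.items()}
        # Assumed mapping: atomID on the array plays the role of id on an atom.
        if "atomID" in atom:
            atom["id"] = atom.pop("atomID")
        atoms.append(atom)
    return atoms

atom_array = ET.fromstring('<atomArray atomID="a2 a4 a6" elementType="O N S"/>')
atoms = expand_atom_array(atom_array)
print(atoms[0])  # {'elementType': 'O', 'id': 'a2'}
```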
Of the form prefix:suffix where prefix and suffix are purely alphanumeric (with _ and -) and prefix is optional. This is similar to XML IDs (and we promote this as good practice for atomIDs). Other punctuation and whitespace is forbidden, so IDs from (say) PDB files are not satisfactory.
The prefix is intended to form a pseudo-namespace so that atom IDs in different molecules may have identical suffixes. It is also useful if the prefix is the ID for the molecule (though this clearly has its limitation). Atom IDs should not be typed as XML IDs since they may not validate.
<cml title="example of IDs on atoms">
<molecule id="m1">
<atomArray>
<!-- this atom might be referenced as m1:a2. This is not formally
part of CML yet -->
<atom id="a2" elementType="O"/>
</atomArray>
</molecule>
<molecule id="m2">
<atomArray>
<!-- this atom might be referenced as m2:a2. This is not formally
part of CML yet -->
<atom id="a2" elementType="O"/>
</atomArray>
</molecule>
</cml>
A reference to a bond may be made by atoms (e.g. for multicentre or pi-bonds), electrons (for annotating reactions or describing electronic properties) or possibly other bonds (no examples yet). The semantics are relatively flexible.
<cml title="bondArray example">
<bondArray>
<bond id="b1" atomRefs2="a3 a8" order="D">
<electron bondRef="b1"/>
<bondStereo>C</bondStereo>
</bond>
<bond id="b2" atomRefs2="a3 a8" order="S">
<bondStereo convention="MDL" conventionValue="6"/>
</bond>
</bondArray>
</cml>
The references cannot (yet) be schema- or schematron-validated. Instances of this type will be used in array-style representation of electron counts, etc. It can also be used for arrays of bondIDTypes such as in complex stereochemistry, geometrical definitions, bond groupings, etc.
The periodic table (up to element number 118). In addition the following strings are allowed:
<cml title="elementType example">
<atomArray>
<atom id="a1" elementType="C"/>
<atom id="a2" elementType="N"/>
<atom id="a3" elementType="Pb"/>
<atom id="a4" elementType="Dummy"/>
</atomArray>
</cml>
There are no special element symbols for D and T which should use the isotope attribute.
Examples can be centroids, bond-midpoints, orienting "atoms" in small z-matrices.
Note "Dummy" has the same semantics but is now deprecated.
Examples are abbreviated organic functional groups, Markush representations, polymers, unknown atoms, etc. Semantics may be determined by the role attribute on the atom.
Instances of this type will be used in array-style representation of atoms.
<cml title="atomArray with elementTypes"> <atomArray elementType="O N S Pb"/> </cml>
Used for electron-bookkeeping. This has no relation to its calculated (fractional) charge.
<cml title="formalCharge example">
<atomArray>
<atom id="a1" elementType="N" formalCharge="+1"/>
<atom id="a2" elementType="O" formalCharge="-1"/>
</atomArray>
</cml>
This MUST adhere to a whitespaced syntax so that it is trivially machine-parsable. Each element is followed by its count, and the string is optionally ended by a formal charge. NO brackets or other nesting is allowed.
<cml title="formulaType example (concise)">
<list>
<formula id="methane" concise="C 1 H 4"/>
<formula id="chloroacetate" concise="Cl 1 H 2 C 2 O 2 -1"/>
<formula id="sodiumSulfate">
<formula concise="H 2 O 1" count="10"/>
<formula concise="Na 1 +1" count="2"/>
<formula concise="S 1 O 4 -2"/>
</formula>
</list>
</cml>
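The whitespaced syntax makes the concise form easy to parse: tokens come in element/count pairs, and an odd token at the end is the formal charge. The sketch below assumes integer counts and does not handle the nested formula children or their count multipliers; the function name `parse_concise` is hypothetical.

```python
def parse_concise(concise):
    """Parse a concise formula into ({element: count}, formal_charge)."""
    tokens = concise.split()
    charge = 0
    if len(tokens) % 2 == 1:
        # An unpaired final token is the optional formal charge.
        charge = int(tokens.pop())
    counts = {}
    for symbol, count in zip(tokens[0::2], tokens[1::2]):
        counts[symbol] = counts.get(symbol, 0) + int(count)
    return counts, charge

print(parse_concise("C 1 H 4"))               # ({'C': 1, 'H': 4}, 0)
print(parse_concise("Cl 1 H 2 C 2 O 2 -1"))   # ({'Cl': 1, 'H': 2, 'C': 2, 'O': 2}, -1)
```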
The total number of hydrogen atoms bonded to an atom, whether explicitly included as atoms or not. It is an error for hydrogenCount to be less than the explicit hydrogen count. There is no default value and no assumptions about hydrogenCount can be made if it is not given.
If hydrogenCount is given on every atom, then the values can be summed to give the total hydrogenCount for the (sub)molecule. Because of this hydrogenCount should not be used where hydrogen atoms bridge 2 or more atoms.
<cml title="single atom example">
<atom id="a1" title="O3'" elementType="O"
formalCharge="1" hydrogenCount="1"
isotope="17" occupancy="0.7"
x2="1.2" y2="2.3"
x3="3.4" y3="4.5" z3="5.6"
convention="ABC" dictRef="chem:atom"
>
<scalar title="dipole" dictRef="d:dip"
units="units:debye">0.2</scalar>
<atomParity atomRefs4="a3 a7 a2 a4">1</atomParity>
<electron id="e1" atomRef="a1" count="2"/>
</atom>
</cml>
In core CML this represents a single number; either the combined proton/neutron count or a more accurate estimate of the nuclear mass. This is admittedly fuzzy, and requires a more complex object (which can manage conventions, lists of isotopic masses, etc.) See isotope.
The default is "natural abundance" - whatever that can be interpreted as.
Delta values (i.e. deviations from the most abundant isotopic mass) are never allowed.
Re-used by angle
<stmml title="nonNegativeAngle type"> <scalar dataType="nonNegativeAngleType">123</scalar> </stmml>
Obsolete in core CML. Only useful in CML queries
Re-used by crystal
<cml title="positiveAngleType example">
<list>
<scalar title="alpha" units="units:degree">70.123</scalar>
<scalar title="beta" units="units:degree">80.456</scalar>
<scalar title="gamma" units="units:degree">90.789</scalar>
</list>
</cml>
Primarily for crystallography. Values outside 0-1 are not allowed.
(seeAlso orderType)
This is purely conventional and used for bond/electron counting. There is no default value. The emptyString attribute can be used to indicate a bond of unknown or unspecified type. The interpretation of this is outside the scope of CML-based algorithms. It may be accompanied by a convention attribute on the bond which links to a dictionary. Example: <bond convention="ccdc:9" atomRefs2="a1 a2"/> could represent a delocalised bond in the CCDC convention.
The state(s) of matter appropriate to a substance or property. It follows a partially controlled vocabulary. It can be extended through namespace codes to dictionaries.
This is purely conventional. There is no default value. The emptyString attribute can be used to indicate a bond of unknown or unspecified type. The interpretation of this is outside the scope of CML-based algorithms. It may be accompanied by a convention attribute which links to a dictionary.
<cml title="bondArray example">
<bondArray>
<bond id="b1" atomRefs2="a3 a8" order="D">
<electron bondRef="b1"/>
<bondStereo>C</bondStereo>
</bond>
<bond id="b2" atomRefs2="a3 a8" order="S">
<bondStereo convention="MDL" conventionValue="6"/>
</bond>
</bondArray>
</cml>
The units attribute is mandatory and can be customised to support mass, volumes, moles, percentages, or ratios (e.g. ppm).
<cml title="substanceList example">
<substanceList id="sl1">
<amount units="units:ml">100</amount>
<substance id="s1">
<amount units="units:l">1</amount>
<molecule id="h2o" ref="mols:water"/>
</substance>
<substance id="s2">
<amount units="units:mole">0.1</amount>
<molecule id="nacl" formula="Na 1 O 1 H 1"/>
</substance>
</substanceList>
</cml>
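The requirement that amount carries a units attribute can be checked mechanically. The following is a minimal sketch in Python, not part of the schema or its val stylesheets: the function name amounts_missing_units and the sample document are hypothetical, and it assumes the CML elements are not namespace-qualified.

```python
import xml.etree.ElementTree as ET

def amounts_missing_units(cml_text):
    """Return the id (or tag) of each parent whose amount child lacks units."""
    root = ET.fromstring(cml_text)
    missing = []
    for parent in root.iter():
        for child in parent:
            # Treat an absent or empty units attribute as missing.
            if child.tag == "amount" and not child.get("units"):
                missing.append(parent.get("id", parent.tag))
    return missing

# Hypothetical sample: the second amount omits the mandatory units attribute.
SAMPLE = """<cml>
  <substanceList id="sl1">
    <amount units="units:ml">100</amount>
    <substance id="s2">
      <amount>0.1</amount>
    </substance>
  </substanceList>
</cml>"""

print(amounts_missing_units(SAMPLE))
```

A schema-aware validator remains the definitive check; this only illustrates the rule.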
It can be used for:
<molecule id="m1" title="angle example">
<atomArray>
<atom id="a1"/>
<atom id="a2"/>
<atom id="a3"/>
</atomArray>
<angle units="degrees" atomRefs3="a1 a2 a3">123.4</angle>
</molecule>
Usually within a molecule. It is almost always contained within atomArray.
<cml title="single atom example">
<atom id="a1" title="O3'" elementType="O"
formalCharge="1" hydrogenCount="1"
isotope="17" occupancy="0.7"
x2="1.2" y2="2.3"
x3="3.4" y3="4.5" z3="5.6"
convention="ABC" dictRef="chem:atom"
>
<scalar title="dipole" dictRef="d:dip"
units="units:debye">0.2</scalar>
<atomParity atomRefs4="a3 a7 a2 a4">1</atomParity>
<electron id="e1" atomRef="a1" count="2"/>
</atom>
</cml>
One or more electrons associated with the atom. The atomRef on the electron should point to the id on the atom. We may relax this later and allow reference by context.
The elementType. Almost mandatory
The explicit hydrogen count
The non-hydrogen count (obsolete - moved to CML Query)
The isotopic mass. Default implies "natural abundance"
The occupancy (mainly from crystallography)
The x coordinate of a 2-D representation (unrelated to 3-D structure). Note that x- and y- 2D coordinates are required for graphical stereochemistry such as wedge/hatch. x- and y- coordinates must be both present or both absent.
The x coordinate of a 3-D cartesian representation. x3, y3 and z3 coordinates must be all present or all absent.
The fractional x coordinate in a crystal structure. xFract, yFract and zFract coordinates must be all present or all absent. A crystal element is required
The combined x and y coordinates of a 2-D representation (unrelated to 3-D structure). Note that x- and y- 2D coordinates are required for graphical stereochemistry such as wedge/hatch.
The combined x, y, z coordinates of a 3-D cartesian representation.
The combined x, y, z fractional coordinates in a crystal structure. A crystal element is required
The y coordinate of a 2-D representation (unrelated to 3-D structure). Note that x2 and y2 coordinates are required for graphical stereochemistry such as wedge/hatch. x2 and y2 coordinates must be both present or both absent.
The y coordinate of a 3-D cartesian representation. x3, y3 and z3 coordinates must be all present or all absent.
The fractional y coordinate in a crystal structure. xFract, yFract and zFract coordinates must be all present or all absent. A crystal element is required
The z coordinate of a 3-D cartesian representation. x3, y3 and z3 coordinates must be all present or all absent.
The fractional z coordinate in a crystal structure. xFract, yFract and zFract coordinates must be all present or all absent. A crystal element is required
This can be used to describe the purpose of atoms whose elementTypes are dummy or locant.
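The "all present or all absent" rules stated above for the coordinate attributes can be illustrated with a short check. This is a sketch under stated assumptions rather than the schema's own validation: the names COORD_GROUPS and incomplete_coordinate_groups are hypothetical, atoms are assumed to be un-namespaced, and the requirement for a crystal element alongside fractional coordinates is not checked.

```python
import xml.etree.ElementTree as ET

# Attribute groups that must be all present or all absent on an atom.
COORD_GROUPS = [
    ("x2", "y2"),
    ("x3", "y3", "z3"),
    ("xFract", "yFract", "zFract"),
]

def incomplete_coordinate_groups(atom):
    """Return the groups that are only partially specified on one atom element."""
    bad = []
    for group in COORD_GROUPS:
        present = [name for name in group if atom.get(name) is not None]
        # A group is acceptable if none or all of its attributes are present.
        if present and len(present) != len(group):
            bad.append(group)
    return bad

# Hypothetical atoms: the first is complete, the second lacks y2 and z3.
good_atom = ET.fromstring('<atom id="a1" x2="1.2" y2="2.3" x3="3.4" y3="4.5" z3="5.6"/>')
bad_atom = ET.fromstring('<atom id="a2" x2="1.2" x3="3.4" y3="4.5"/>')

print(incomplete_coordinate_groups(good_atom))
print(incomplete_coordinate_groups(bad_atom))
```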
The attributes are directly related to the scalar attributes under atom which should be consulted for more info.
NOTE: The CML-1 specifications are also supported but are deprecated.
Example - these are exactly equivalent representations
<cml title="atomArray CML1">
<list>
<atomArray>
<atom id="a1" elementType="O" hydrogenCount="1"/>
<atom id="a2" elementType="N" hydrogenCount="1"/>
<atom id="a3" elementType="C" hydrogenCount="3"/>
</atomArray>
<!-- is equivalent to -->
<atomArray
atomID="a1 a2 a3"
elementType="O N C"
hydrogenCount="1 1 3"/>
</list>
</cml>
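The equivalence of the two atomArray forms in the example above can be sketched in Python by expanding the array-valued attributes into one record per atom. The function name expand_atom_array is hypothetical, and the sketch makes a simplifying assumption that every attribute on the atomArray is a whitespace-separated array aligned with atomID; an atomArray carrying other attributes (such as its own id or title) would need them filtered out first.

```python
import xml.etree.ElementTree as ET

def expand_atom_array(atom_array):
    """Expand whitespace-separated array attributes into per-atom dicts."""
    ids = atom_array.get("atomID", "").split()
    atoms = [{"id": atom_id} for atom_id in ids]
    for name, value in atom_array.attrib.items():
        if name == "atomID":
            continue
        tokens = value.split()
        # Every array attribute must supply exactly one token per atom.
        if len(tokens) != len(ids):
            raise ValueError(f"{name} has {len(tokens)} values, expected {len(ids)}")
        for atom, token in zip(atoms, tokens):
            atom[name] = token
    return atoms

array_form = ET.fromstring(
    '<atomArray atomID="a1 a2 a3" elementType="O N C" hydrogenCount="1 1 3"/>'
)
child_form = ET.fromstring(
    '<atomArray>'
    '<atom id="a1" elementType="O" hydrogenCount="1"/>'
    '<atom id="a2" elementType="N" hydrogenCount="1"/>'
    '<atom id="a3" elementType="C" hydrogenCount="3"/>'
    '</atomArray>'
)

expanded = expand_atom_array(array_form)
# Compare against the attributes of the explicit atom children.
print(expanded == [dict(atom.attrib) for atom in child_form])
```

Raising on a length mismatch is a design choice of this sketch; the source does not say how a processor should treat misaligned arrays.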