This document records all known errors in the XML 1.0 specification, http://www.w3.org/TR/1998/REC-xml-19980210. The errata are numbered, classified as Substantive, Editorial or Clarification and listed in reverse chronological order of their date of publication. Early errata (1999-02-17 and before) are neither numbered, classified nor dated.
Please email error reports to xml-editor@w3.org.
choice ::= '(' S? cp ( S? '|' S? cp )* S? ')'
"choice ::= '(' S? cp ( S? '|' S? cp )+ S? ')'
"The second possible case occurs when the XML entity is accompanied by encoding information, as in some file systems and some network protocols. When multiple sources of information are available, their relative priority and the preferred method of handling conflict should be specified as part of the higher-level protocol used to deliver XML. In particular, please refer to [IETF RFC2376] "XML Media Types" which defines the text/xml and application/xml MIME types and provides some useful guidance. In the interests of interoperability, however, the following rule is recommended.
With a Byte Order Mark:
  00 00 FE FF: UCS-4, big-endian machine (1234 order)
  FF FE 00 00: UCS-4, little-endian machine (4321 order)
  FE FF 00 ##: UTF-16, big-endian
  FF FE ## 00: UTF-16, little-endian
  EF BB BF: UTF-8
Without a Byte Order Mark:
  00 00 00 3C: UCS-4, big-endian machine (1234 order)
  3C 00 00 00: UCS-4, little-endian machine (4321 order)
  00 00 3C 00: UCS-4, unusual octet order (2143)
  00 3C 00 00: UCS-4, unusual octet order (3412)
  00 3C ## ##, 00 25 ## ##, 00 20 ## ##, 00 09 ## ##, 00 0D ## ## or 00 0A ## ##: Big-endian UTF-16 or ISO-10646-UCS-2. Note that, absent an encoding declaration, these cases are strictly speaking in error.
  3C 00 ## ##, 25 00 ## ##, 20 00 ## ##, 09 00 ## ##, 0D 00 ## ## or 0A 00 ## ##: Little-endian UTF-16 or ISO-10646-UCS-2. Note that, absent an encoding declaration, these cases are strictly speaking in error.
  3C 3F 78 6D: UTF-8, ISO 646, ASCII, some part of ISO 8859, Shift-JIS, EUC, or any other 7-bit, 8-bit, or mixed-width encoding which ensures that the characters of ASCII have their normal positions, width, and values; the actual encoding declaration must be read to detect which of these applies, but since all of these encodings use the same bit patterns for the ASCII characters, the encoding declaration itself may be read reliably
  4C 6F A7 94: EBCDIC (in some flavor; the full encoding declaration must be read to tell which code page is in use)
  other: UTF-8 without an encoding declaration, or else the data stream is corrupt, fragmentary, or enclosed in a wrapper of some kind
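The octet patterns listed above amount to a lookup on the first four octets of an entity. The following Python sketch shows one way to express that lookup. It follows the patterns literally, so it inherits their limits (for example, a UTF-16 entity whose first character is not covered by the patterns falls through to the last case). The names guess_encoding, EXACT and ASCII_STARTS and the returned label strings are hypothetical.

```python
# First octets that the list treats as a plausible start of an entity:
# '<', '%', space, tab, CR, LF.
ASCII_STARTS = (0x3C, 0x25, 0x20, 0x09, 0x0D, 0x0A)

# Patterns in which all four octets are fixed.
EXACT = {
    b"\x00\x00\xfe\xff": "UCS-4 big-endian (1234)",
    b"\xff\xfe\x00\x00": "UCS-4 little-endian (4321)",
    b"\x00\x00\x00\x3c": "UCS-4 big-endian (1234)",
    b"\x3c\x00\x00\x00": "UCS-4 little-endian (4321)",
    b"\x00\x00\x3c\x00": "UCS-4 unusual order (2143)",
    b"\x00\x3c\x00\x00": "UCS-4 unusual order (3412)",
    b"\x3c\x3f\x78\x6d": "ASCII-compatible, read the encoding declaration",
    b"\x4c\x6f\xa7\x94": "EBCDIC, read the encoding declaration",
}


def guess_encoding(data: bytes) -> str:
    """Guess an encoding family from the first four octets of an entity."""
    b = data[:4]
    # Fully fixed patterns are tested first, so that FF FE 00 00 and
    # 00 3C 00 00 are not mistaken for the looser UTF-16 patterns.
    if b in EXACT:
        return EXACT[b]
    if b[:3] == b"\xef\xbb\xbf":
        return "UTF-8"
    if len(b) == 4:
        if b[:3] == b"\xfe\xff\x00":
            return "UTF-16 big-endian"
        if b[:2] == b"\xff\xfe" and b[3] == 0x00:
            return "UTF-16 little-endian"
        if b[0] == 0x00 and b[1] in ASCII_STARTS:
            return "UTF-16 big-endian or ISO-10646-UCS-2"
        if b[1] == 0x00 and b[0] in ASCII_STARTS:
            return "UTF-16 little-endian or ISO-10646-UCS-2"
    return "UTF-8 without an encoding declaration, or unreadable"
```

The result is only a first guess: for the ASCII-compatible and EBCDIC cases the encoding declaration still has to be read, and externally supplied encoding information may take priority.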
Add the following to the second paragraph after the list (this also takes care of the previous erratum on UTF-7): "Note: Since external parsed entities in UTF-16 may begin with any character, this autodetection does not always work. Also, because of the overloaded usage it makes of ASCII-valued bytes, the UTF-7 encoding may fail to be reliably detected."
standalone='yes'
", they must not process entity declarations or attribute-list declarations encountered after a reference to a parameter entity that is not read, since the entity may have contained overriding declarations."
standalone='yes'
"', there is no guarantee that making a document standalone will cause all XML processors to report the same results to the application.
--->
'. The following example is not well-formed." and an example: "<!-- B+, B, or B--->
"Before the value of an attribute is passed to the application or checked for validity, but after the end-of-line normalization described in section 2.11 has been performed, the XML processor must normalize the attribute value as follows:
If the attribute type is not CDATA, then the XML processor must further process the normalized attribute value by discarding any leading and trailing space (#x20) characters, and by replacing sequences of space (#x20) characters by a single space (#x20) character.
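The additional step for attribute types other than CDATA can be sketched in Python as follows. This is a minimal illustration that assumes the value passed in has already been through the general normalization; the function name normalize_non_cdata is hypothetical. Only space (#x20) characters are touched, exactly as the rule is worded.

```python
import re


def normalize_non_cdata(value: str) -> str:
    """Apply the extra normalization step for non-CDATA attribute values."""
    # Discard leading and trailing space (#x20) characters only.
    trimmed = value.strip("\x20")
    # Replace each sequence of space characters by a single space.
    return re.sub(" +", " ", trimmed)
```

For example, `normalize_non_cdata("  a   b  ")` returns `"a b"`.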
"Validity Constraint: Unique Notation Name: only one notation declaration can declare a given Name."
"For interoperability, if a parameter-entity reference appears in a choice, seq, or Mixed construct, its replacement text should not be empty, and neither the first nor last non-blank character of the replacement text should be a connector (| or ,)."
to
"For interoperability, if a parameter-entity reference appears in a choice, seq, or Mixed construct, its replacement text must contain at least one non-blank character, and neither the first nor last non-blank character of the replacement text should be a connector (| or ,)."