ISO/IEC JTC 1/SC34/WG1 NXX

[Cache from: http://www.y12.doe.gov/sgml/sc34/document/0410.htm, fetched 2003-06-21.]

ISO/IEC JTC 1/SC34 N 0410

ISO/IEC JTC 1/SC34/WG1

Information Technology --
Document Description and Processing Languages
-- Information Presentation

TITLE:	Mobile Subset of XML Schema Part 2
SOURCE:	SC34 Japan
PROJECT:	19757-5
PROJECT EDITOR:	Martin Bryan
STATUS:	For information and review
ACTION:
DATE:	2003-04-22
DISTRIBUTION:	SC34 and Liaisons
REFER TO:
REPLY TO:

1. Introduction

We propose to create a compact and reliable subset of W3C XML Schema Part 2 and publish it as an ISO standard. The main target of this subset is mobile devices (such as cellular phones).

Mobile devices are expected to use XML in the near future. Small XML parsers have already been developed. Validators for schema languages are expected to follow, and a prototype RELAX NG validator for mobile phones has already been developed. Such parsers and validators will hopefully be used to implement XForms and Web Services on mobile devices.

Part 2 of W3C XML Schema provides a set of datatypes and facets. Although it may not be perfect, it is likely to be widely used by XML applications, including mobile applications. We do not believe that the market will accept an incompatible library of general-purpose datatypes (e.g., int).

However, the datatypes and facets of W3C XML Schema Part 2 are too complicated for mobile devices. Some specifications, such as XForms, have already created their own subsets of W3C XML Schema Part 2. If different specifications introduce different subsets, interoperability will be significantly impaired. It would be much better if a single subset were internationally standardized.

2. Choice of datatypes

Table 1 gives a list of datatypes of W3C XML Schema Part 2. The exclamation mark indicates those types which we propose to incorporate into the mobile subset.

We omit:

datatypes requiring infinite precision,
datatypes that do not have an obvious mapping to J2ME,
archaic datatypes such as IDREFS, ENTITY, ENTITIES, and NOTATION,
datatypes whose definitions are not yet solid (dateTime and so forth),
datatypes whose validity depends on namespace declarations.
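
To illustrate why omitting the infinite-precision datatypes simplifies an implementation, the following is a minimal sketch of lexical and range checking for the bounded integer types using only a 64-bit long. The class name BoundedIntegers and the method isValid are hypothetical and are not part of this proposal. The sketch is written in standard Java rather than against a J2ME profile, assumes that whitespace has already been collapsed, and leaves out unsignedLong because its upper bound does not fit in a signed long.

```java
/** Hypothetical helper: validates bounded integer datatypes with fixed precision. */
final class BoundedIntegers {
    private BoundedIntegers() {}

    /**
     * Returns true if lexical is an optional sign followed by ASCII digits
     * and its value lies in [min, max]. Whitespace is assumed to be collapsed.
     */
    static boolean isValid(String lexical, long min, long max) {
        int n = lexical.length();
        int start = 0;
        if (n > 0 && (lexical.charAt(0) == '+' || lexical.charAt(0) == '-')) {
            start = 1;
        }
        if (start == n) {
            return false; // empty string or a sign with no digits
        }
        for (int i = start; i < n; i++) {
            char c = lexical.charAt(i);
            if (c < '0' || c > '9') {
                return false; // only ASCII digits are accepted
            }
        }
        try {
            long value = Long.parseLong(lexical);
            return value >= min && value <= max;
        } catch (NumberFormatException e) {
            return false; // value does not fit in 64 bits
        }
    }

    public static void main(String[] args) {
        // byte: -128..127
        System.out.println(isValid("127", -128L, 127L));
        System.out.println(isValid("128", -128L, 127L));
        // unsignedInt: 0..4294967295
        System.out.println(isValid("4294967295", 0L, 4294967295L));
    }
}
```

A datatype such as integer, by contrast, would need arbitrary-precision arithmetic, because no fixed pair of bounds applies to it.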

Table 1: The list of datatypes
3.2 Primitive datatypes
3.2.1 string !
3.2.2 boolean !
3.2.3 decimal
3.2.4 float !
3.2.5 double !
3.2.6 duration
3.2.7 dateTime
3.2.8 time
3.2.9 date
3.2.10 gYearMonth
3.2.11 gYear
3.2.12 gMonthDay
3.2.13 gDay
3.2.14 gMonth
3.2.15 hexBinary
3.2.16 base64Binary
3.2.17 anyURI !
3.2.18 QName
3.2.19 NOTATION
3.3 Derived datatypes
3.3.1 normalizedString !
3.3.2 token !
3.3.3 language !
3.3.4 NMTOKEN !
3.3.5 NMTOKENS !
3.3.6 Name !
3.3.7 NCName !
3.3.8 ID !
3.3.9 IDREF !
3.3.10 IDREFS
3.3.11 ENTITY
3.3.12 ENTITIES
3.3.13 integer
3.3.14 nonPositiveInteger
3.3.15 negativeInteger
3.3.16 long !
3.3.17 int !
3.3.18 short !
3.3.19 byte !
3.3.20 nonNegativeInteger
3.3.21 unsignedLong !
3.3.22 unsignedInt !
3.3.23 unsignedShort !
3.3.24 unsignedByte !
3.3.25 positiveInteger
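
As an illustration only, the datatypes marked with an exclamation mark in Table 1 could be held in a simple lookup set. The class name MobileDatatypes and its methods are hypothetical, and the sketch uses standard Java collections rather than the restricted class library of a J2ME profile.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: membership test for the proposed mobile subset. */
final class MobileDatatypes {
    // Local names of the datatypes marked "!" in Table 1.
    private static final Set<String> INCLUDED = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList(
            // primitive datatypes
            "string", "boolean", "float", "double", "anyURI",
            // derived from string
            "normalizedString", "token", "language", "NMTOKEN", "NMTOKENS",
            "Name", "NCName", "ID", "IDREF",
            // bounded integer datatypes
            "long", "int", "short", "byte",
            "unsignedLong", "unsignedInt", "unsignedShort", "unsignedByte")));

    private MobileDatatypes() {}

    /** Returns true if the named datatype is part of the proposed subset. */
    static boolean isIncluded(String localName) {
        return INCLUDED.contains(localName);
    }

    /** Number of datatypes in the proposed subset. */
    static int count() {
        return INCLUDED.size();
    }

    public static void main(String[] args) {
        System.out.println(isIncluded("int"));      // marked in Table 1
        System.out.println(isIncluded("dateTime")); // not marked in Table 1
        System.out.println(count());
    }
}
```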

Note: In addition to these datatypes, XForms allows two further groups of datatypes. One group contains decimal, dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, and gMonth. The other group contains integer, nonPositiveInteger, negativeInteger, nonNegativeInteger, and positiveInteger, all of which require infinite precision.

On the other hand, our subset allows normalizedString, token, language, NMTOKEN, NMTOKENS, Name, NCName, ID, and IDREF, which are not allowed by XForms. These are not particularly hard to implement. The XForms designers presumably did not consider these datatypes important for XForms.

3. Choice of facets

Table 2 shows facets of W3C XML Schema Part 2. The exclamation mark indicates those facets which we propose to incorporate into the mobile subset.

We omit:

the pattern facet, which requires the property tables of Unicode characters,
the whiteSpace facet, which does not affect validity but controls the PSVI,
the totalDigits and fractionDigits facets.

Table 2: The list of facets
4.3 Constraining Facets
4.3.1 length !
4.3.2 minLength !
4.3.3 maxLength !
4.3.4 pattern
4.3.5 enumeration !
4.3.6 whiteSpace
4.3.7 maxInclusive ! (except when the datatype is date, dateTime, time, or duration)
4.3.8 maxExclusive ! (except when the datatype is date, dateTime, time, or duration)
4.3.9 minExclusive ! (except when the datatype is date, dateTime, time, or duration)
4.3.10 minInclusive ! (except when the datatype is date, dateTime, time, or duration)
4.3.11 totalDigits
4.3.12 fractionDigits
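
The facets retained in Table 2 reduce to simple comparisons. The following is a minimal sketch under stated assumptions: the class FacetChecks and its method signatures are hypothetical, length is counted in Unicode code points for string-based datatypes only (list datatypes such as NMTOKENS would count items instead), enumeration is compared on already-normalized strings, and the bound facets are shown for values that fit in a long.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: checks for the facets marked "!" in Table 2. */
final class FacetChecks {
    private FacetChecks() {}

    /** length, minLength, maxLength; a negative argument means "facet not specified". */
    static boolean checkLength(String value, int length, int minLength, int maxLength) {
        int n = value.codePointCount(0, value.length());
        if (length >= 0 && n != length) return false;
        if (minLength >= 0 && n < minLength) return false;
        if (maxLength >= 0 && n > maxLength) return false;
        return true;
    }

    /** enumeration, compared on normalized string values. */
    static boolean checkEnumeration(String value, List<String> allowed) {
        return allowed.contains(value);
    }

    /** minInclusive, minExclusive, maxInclusive, maxExclusive; null means "not specified". */
    static boolean checkBounds(long value, Long minInclusive, Long minExclusive,
                               Long maxInclusive, Long maxExclusive) {
        if (minInclusive != null && value < minInclusive) return false;
        if (minExclusive != null && value <= minExclusive) return false;
        if (maxInclusive != null && value > maxInclusive) return false;
        if (maxExclusive != null && value >= maxExclusive) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(checkLength("abc", -1, 1, 5));
        System.out.println(checkEnumeration("red", Arrays.asList("red", "green")));
        System.out.println(checkBounds(10L, Long.valueOf(0L), null, Long.valueOf(10L), null));
    }
}
```

None of these checks needs Unicode property tables or arbitrary-precision arithmetic, which is consistent with the omission of pattern, totalDigits, and fractionDigits.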

4. Implementation considerations

We have studied the source code of Jing, the RELAX NG validator implemented by James Clark. We believe that, if the above restrictions are accepted, an implementation of the remaining datatypes and facets will require a JAR file of less than 20 KB.
