Language Tag Registry Update
IETF Proposed Working Group: Language Tag Registry Update (LTRU)
WG Review: Language Tag Registry Update (LTRU)
A new IETF working group has been proposed in the Applications Area. The IESG has not made any determination as yet. The following description was submitted, and is provided for informational purposes only. Please send your comments to the IESG mailing list (iesg at ietf.org) by March 2nd.
Current Status: Proposed Working Group
Description of Working Group
RFC 3066 and its predecessor, RFC 1766, defined language tags for use on the Internet. Language tags are necessary for many applications, ranging from cataloging content to computer processing of text. The RFC 3066 standard for language tags has been widely adopted in various protocols and text formats, including HTML, XML, and CLDR, as the best means of identifying languages and language preferences. Since the publication of RFC 3066, however, several issues have faced implementors of language tags:
- Stability and accessibility of the underlying ISO standards
- Difficulty with registrations and their acceptance
- Lack of clear guidance on how to identify script and region where necessary
- Lack of parseability and the ability to verify well-formedness
- Lack of specified algorithms, apart from pure prefix matching, for operations on language tags
This working group will address these issues by developing two documents. The first is a successor to RFC 3066. It will describe the structure of the IANA registry and how the registered tags will relate to the generative mechanisms (originally described in RFC 3066, but likely to be updated by the document). In order to be complete, it will need to address each of the challenges set out above:
For stability, it is expected that the document will describe how the meaning of language tags remains stable, even if underlying references should change, and how the structure is to remain stable in the future. For accessibility, it is to provide a mechanism for easily determining whether a particular subtag is valid as of a given date, without onerous reconstruction of the state of the underlying standard as of that time.
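The accessibility goal can be illustrated with a minimal sketch, assuming each registry record carries the date on which its subtag was added. The record layout, the `SubtagRecord` and `is_valid_on` names, and the sample dates are hypothetical and do not come from the charter or from any real registry. The subtags `qaa` and `qab` are taken from the ISO 639-2 range reserved for local use, so that no real registration is implied.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class SubtagRecord:
    """Hypothetical registry record; field names are assumptions."""
    subtag: str
    description: str
    added: date                        # date the subtag entered the registry
    deprecated: Optional[date] = None  # deprecation does not remove the record


# Placeholder data only: the dates are invented for illustration.
SAMPLE_REGISTRY: Dict[str, SubtagRecord] = {
    "qaa": SubtagRecord("qaa", "Example language A", date(2000, 1, 1)),
    "qab": SubtagRecord("qab", "Example language B", date(2004, 6, 1),
                        deprecated=date(2006, 1, 1)),
}


def is_valid_on(subtag: str, when: date,
                registry: Dict[str, SubtagRecord] = SAMPLE_REGISTRY) -> bool:
    """Return True if the subtag had been added to the registry by `when`.

    Records are never deleted in this sketch, so a deprecated subtag stays
    valid and keeps its meaning; only the `added` date is consulted.
    """
    record = registry.get(subtag.lower())
    return record is not None and record.added <= when


print(is_valid_on("qab", date(2004, 1, 1)))  # False: not yet added
print(is_valid_on("qab", date(2007, 1, 1)))  # True: deprecated but still valid
```

Because the answer depends only on the record itself, a caller does not need to reconstruct the state of the underlying ISO standard for the date in question.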
For extensibility, it is expected that the document will describe how generative mechanisms could use ISO 15924 and UN M.49 codes without explicit registration of all combinations. The current registry contains pairs like uz-Cyrl/uz-Latn and sr-Cyrl/sr-Latn, but RFC 3066 contains no general mechanism or guidance for how scripts should be incorporated into language tags; this replacement document is expected to provide such a mechanism.
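As a rough sketch of what a generative mechanism could look like, the function below builds a tag from independently listed component subtags, so that a combination such as uz-Cyrl does not need its own registry entry. The function name `compose_tag`, the tiny sample code sets, and the language-script-region ordering are assumptions made for illustration; the charter does not prescribe them.

```python
from typing import Optional

# Illustrative subsets only; a real implementation would load full code lists.
LANGUAGES = {"uz", "sr"}      # ISO 639 language codes
SCRIPTS = {"Cyrl", "Latn"}    # ISO 15924 script codes
REGIONS = {"UZ", "419"}       # ISO 3166 alpha-2 or UN M.49 codes


def compose_tag(language: str, script: Optional[str] = None,
                region: Optional[str] = None) -> str:
    """Build a tag from components, checking each component separately.

    Assumed order: language, then optional script, then optional region.
    """
    lang = language.lower()
    if lang not in LANGUAGES:
        raise ValueError(f"unknown language subtag: {language!r}")
    parts = [lang]
    if script is not None:
        normalized_script = script.title()   # e.g. "cyrl" -> "Cyrl"
        if normalized_script not in SCRIPTS:
            raise ValueError(f"unknown script subtag: {script!r}")
        parts.append(normalized_script)
    if region is not None:
        normalized_region = region.upper()   # digits are unaffected
        if normalized_region not in REGIONS:
            raise ValueError(f"unknown region subtag: {region!r}")
        parts.append(normalized_region)
    return "-".join(parts)


print(compose_tag("uz", "cyrl"))        # uz-Cyrl
print(compose_tag("sr", "Latn"))        # sr-Latn
print(compose_tag("uz", "Latn", "UZ"))  # uz-Latn-UZ
```

With this approach the number of list entries grows with the number of components rather than with the number of combinations.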
It is also expected to provide mechanisms to support the evolution of the underlying ISO standards (in particular ISO 639-3), to support variant registration and formal extensions, and to allow generative private use when necessary.
It is expected to specify a mechanism for easily identifying the role of each subtag in the language tag, so that, for example, whenever a script code or country code is present in the tag it can be extracted, even without access to a current version of the registry. Such a mechanism would clearly distinguish between well-formed and valid language tags, to allow for maximal compatibility between implementations released at different times, and thus using different versions of the registry.
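The distinction between well-formed and valid tags can be sketched as follows. The sketch assumes that a subtag's role can be read from its shape and position alone, using the code formats of the underlying standards: two or three letters for ISO 639 language codes, four letters for ISO 15924 script codes, and two letters (ISO 3166) or three digits (UN M.49) for regions. It handles only simple language[-script][-region] tags. The function names and the registry layout are hypothetical, and the eventual specification may define different rules.

```python
import re
from typing import Dict, Optional, Set

# Assumed shapes, derived from the code formats of the underlying standards.
_LANGUAGE = re.compile(r"[A-Za-z]{2,3}")
_SCRIPT = re.compile(r"[A-Za-z]{4}")
_REGION = re.compile(r"[A-Za-z]{2}|[0-9]{3}")


def classify_subtags(tag: str) -> Dict[str, Optional[str]]:
    """Assign a role to each subtag using shape and position only.

    Raises ValueError if the tag is not well-formed under the assumed
    shapes. No registry lookup is performed here.
    """
    subtags = tag.split("-")
    if not _LANGUAGE.fullmatch(subtags[0]):
        raise ValueError(f"not well-formed: {tag!r}")
    roles: Dict[str, Optional[str]] = {
        "language": subtags[0].lower(), "script": None, "region": None}
    rest = subtags[1:]
    if rest and _SCRIPT.fullmatch(rest[0]):
        roles["script"] = rest.pop(0).title()
    if rest and _REGION.fullmatch(rest[0]):
        roles["region"] = rest.pop(0).upper()
    if rest:  # leftover subtags do not fit any assumed shape or position
        raise ValueError(f"not well-formed: {tag!r}")
    return roles


def is_valid(tag: str, registry: Dict[str, Set[str]]) -> bool:
    """Return True if the tag is well-formed and each subtag is registered.

    `registry` is a hypothetical snapshot mapping each role to its codes.
    """
    try:
        roles = classify_subtags(tag)
    except ValueError:
        return False
    return all(value is None or value in registry.get(role, set())
               for role, value in roles.items())


old_snapshot = {"language": {"uz"}, "script": {"Cyrl"}, "region": {"UZ"}}
print(classify_subtags("uz-latn-UZ"))
# {'language': 'uz', 'script': 'Latn', 'region': 'UZ'}
print(is_valid("uz-Latn-UZ", old_snapshot))  # False: Latn not in snapshot
```

In the usage above, an implementation holding an older registry snapshot can still extract the script and region from the tag, even though it cannot confirm that every subtag is registered.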
The second document will describe matching algorithms for use with language tags. Language tags are used in a broad variety of contexts, and it is not expected that any single matching algorithm will fit all needs. Developing a small set of common matching and sorting algorithms does seem likely to contribute to interoperability, however, since protocols that use language tags could reference these well-known algorithms in their specifications.
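For reference, the pure prefix matching that RFC 3066 already describes can be sketched in a few lines: a language range matches a tag if the two are equal, or if the range is a prefix of the tag and the next character in the tag is a hyphen; the comparison is case-insensitive and the special range "*" matches any tag. The function names `prefix_match` and `filter_tags` are illustrative, and the sketch does not represent any algorithm the working group may go on to define.

```python
from typing import Iterable, List


def prefix_match(language_range: str, tag: str) -> bool:
    """Return True if `language_range` matches `tag` by prefix matching."""
    range_lower = language_range.lower()
    tag_lower = tag.lower()
    if range_lower == "*":
        return True
    # The hyphen check stops "sr" from matching an unrelated tag like "srx".
    return tag_lower == range_lower or tag_lower.startswith(range_lower + "-")


def filter_tags(language_range: str, tags: Iterable[str]) -> List[str]:
    """Return the tags matched by the range, preserving input order."""
    return [tag for tag in tags if prefix_match(language_range, tag)]


available = ["sr-Cyrl", "sr-Latn", "uz-Cyrl", "srx"]
print(filter_tags("sr", available))       # ['sr-Cyrl', 'sr-Latn']
print(filter_tags("sr-latn", available))  # ['sr-Latn']
print(prefix_match("sr-Latn", "sr"))      # False: the range is longer than the tag
```

The last call shows one limitation of pure prefix matching: a request for sr-Latn is not satisfied by content tagged only as sr, which is the kind of case other matching schemes might treat differently.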
This working group will not take over the existing review function of the ietf-languages list. The ietf-languages list will continue to review tags according to RFC 3066 until the first document produced by the WG is finished. Then it will review according to whatever procedures the first document specifies.
Prepared by Robin Cover for The XML Cover Pages archive. See details in the news story "IESG Announces Proposed IETF Working Group for Language Tag Registry Update." General references in "Language Identifiers in the Markup Context."