The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Created: August 30, 2001.

Standards Bodies Face Growing Demand for Enhanced Language Identifier Systems.

Proposals are now being floated within several user communities for increasing the number of standardized language codes beyond the 200-400 range found in current ISO standards. A new work item approved by ISO earlier in 2001, for example, addresses the need for an International Standard with mechanisms for encoding language variation in terms of time, geography, dialectal variation, writing system, and so forth. An initial proposal calls for codes supporting representation of the language along at least five axes: "geog (geographical specification), script (writing system), temp (temporal specification), socli (sociolinguistic specification), and style (stylistic specification)." Other draft proposals call for adoption of schemes that identify 7,000 or even 70,000 languages and dialects.

As the mass of networked digital information grows ever larger and becomes easily accessible, demand increases for a taxonomy of human languages adequate to support language data classification, categorization, and linguistic annotation. It is now widely recognized that the ISO standards providing "codes for the representation of names of languages" (ISO 639, ISO/FDIS 639-1, ISO 639-2) are inadequate to meet the application requirements imposed by users in new domains. The need for better language description facilities is now felt as urgent among digital librarians and archivists seeking to classify and linguistically annotate materials representing minority languages; others worry about the emergence of de facto standards that conflict with the work of registered standards bodies.

Language identification is of critical importance to markup, since the use of language codes to assist in machine processing of text is documented in a wide range of specifications, including markup metalanguages (SGML, XML) and most markup language applications.
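To make the five-axis idea concrete, here is a minimal Python sketch of a record that carries a base language code plus the optional axes named in the initial proposal (geog, script, temp, socli, style). Only the axis names come from the quoted proposal. The class name `LanguageVariety`, the `describe` and `matches` helpers, the textual form they produce, and every sample value are assumptions made for illustration; the proposal as quoted here does not define a syntax or a value set.

```python
from dataclasses import dataclass
from typing import Optional

# Axis names as quoted from the initial proposal.
AXES = ("geog", "script", "temp", "socli", "style")


@dataclass(frozen=True)
class LanguageVariety:
    """Hypothetical record: a base language code plus optional variation axes."""

    base: str                      # e.g. an ISO 639-2 code such as "eng"
    geog: Optional[str] = None     # geographical specification
    script: Optional[str] = None   # writing system
    temp: Optional[str] = None     # temporal specification
    socli: Optional[str] = None    # sociolinguistic specification
    style: Optional[str] = None    # stylistic specification

    def describe(self) -> str:
        """Render only the axes that are set, in a made-up 'key=value' form."""
        parts = [f"lang={self.base}"]
        for axis in AXES:
            value = getattr(self, axis)
            if value is not None:
                parts.append(f"{axis}={value}")
        return "; ".join(parts)

    def matches(self, query: "LanguageVariety") -> bool:
        """True if the base code and every axis the query sets agree with this record."""
        if query.base != self.base:
            return False
        return all(
            getattr(query, axis) is None
            or getattr(query, axis) == getattr(self, axis)
            for axis in AXES
        )


# Illustrative values only; they are not taken from any standard.
sample = LanguageVariety("eng", geog="GB", temp="1600")
print(sample.describe())
print(sample.matches(LanguageVariety("eng")))
print(sample.matches(LanguageVariety("eng", geog="US")))
```

Treating an unset axis in the query as a wildcard is one possible design choice: it lets a coarse request (just the base code) retrieve material tagged with finer-grained variation, which is the kind of classification use the article describes.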
Seeking to raise interest in this topic and awareness of its importance for markup language design, I have prepared a reference document "Language Identifiers in the Markup Context" with summaries of the major standards and emerging initiatives.
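As a small illustration of why language codes matter for machine processing of marked-up text, the following Python sketch reads the `xml:lang` attribute from an XML fragment and reports the language in effect for each element, with child elements inheriting the value from their ancestors. The sample document and the helper name `effective_languages` are hypothetical; the codes `en` and `fr` are two-letter ISO 639 codes, and the parser is Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample document.
SAMPLE = """<doc xml:lang="en">
  <p>Hello</p>
  <p xml:lang="fr">Bonjour</p>
  <note><p>Inherited</p></note>
</doc>"""


def effective_languages(root):
    """Return (tag, text, language) for each element, inheriting xml:lang."""
    results = []

    def walk(elem, inherited):
        # An element's own xml:lang overrides whatever it inherits.
        lang = elem.get(XML_LANG, inherited)
        results.append((elem.tag, (elem.text or "").strip(), lang))
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return results


for tag, text, lang in effective_languages(ET.fromstring(SAMPLE)):
    print(tag, repr(text), lang)
```

A processor of this kind can only distinguish as many languages as the code list behind the attribute values allows, which is where the size and granularity of the standardized lists become a practical constraint.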

The document "Language Identifiers in the Markup Context" contains descriptions of, and references for, the standards that authorize the use of language codes, as well as the [standardized] language identifier listings themselves. In overview:

  • Introduction
  • Language Code Listings
    • ANSI/NISO Codes for the Representation of Languages for Information Interchange
    • Ethnologue
    • IETF RFCs
    • ISO 639
    • Linguasphere Project
    • E-MELD Language Codes Workgroup
    • Linguist List Genetic Classification Coding Scheme
    • MARC Code List for Languages
  • Use of Standard Code Lists
    • SGML (Standard Generalized Markup Language)
    • XML (Extensible Markup Language)
    • HTML (Hypertext Markup Language)
    • TEI (Text Encoding Initiative Guidelines)
    • Encoded Archival Description (EAD)
    • Corpus Encoding Standard
    • Language Tagging in Unicode
  • Language Tags and Operating Systems
  • General References
  • Principal references:


Hosted By
OASIS - Organization for the Advancement of Structured Information Standards


Document URI: http://xml.coverpages.org/ni2001-08-30-a.html
Robin Cover, Editor: robin@oasis-open.org