The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: July 12, 2002
Markup and Terminological Databases

[July 08, 2002] Several of the markup-related standards for terminology (glossary, thesaurus) address requirements for multilingualism.

Provisional references:

  • Open Lexicon Interchange Format (OLIF). "OLIF, the Open Lexicon Interchange Format, is a user-friendly vehicle for exchanging terminological and lexical data. OLIF is XML-compliant and offers support for natural language processing (NLP) systems, such as machine translation, by providing coverage of a wide and detailed range of linguistic features. OLIF is a joint effort of a group of major NLP technology suppliers, corporate users of NLP, and research institutions. In order to strengthen their power, this group has formed the OLIF Consortium which is headed by SAP."

  • XLT: XML representation of Lexicons and Terminologies. "XLT is a major deliverable of the SALT Project (Standards-based Access service to multilingual Lexicons and Terminologies). XLT is an XML-based application developed with the intent of facilitating the exchange of lexicons and terminologies. The primary member of the XLT family of formats is Default XLT Format (DXLT)."

  • Geneter (GENEric model for TERminology). "Geneter is a standard in project actually under debate at the TC37/SC3/WG4 (ISO technical committee for Terminology and sub-committee for computer applications). Its objective will be the management, the distribution and the reuse of terminological data." There is a Geneter test suite for SGML/XML compatibility. See the DTD.

  • TermBase Exchange (TBX). "TBX is an open XML-based standard format for terminological data. This standard provides a number of benefits so long as TBX files can be imported into and exported from most software packages that include a terminological database. This capability will greatly facilitate the flow of terminological information throughout the information cycle both inside an organization and with outside service providers. In addition, terminology that is made available to the general public will become much more accessible to humans and more easily integrated into existing terminological resources." See the April 2002 announcement from LISA.

  • Machine-Readable Terminology Interchange Format (MARTIF). "The terminological community has realised the problems of standardisation for the interchange of terminological information and provided a way forward with ISO FDIS 12200-1, Computer Applications In Terminology - Machine-Readable Terminology Interchange Format (Martif) - Part 1: Negotiated Interchange. The MARTIF standard ISO FDIS 12200-1 uses Standardised Generalised Markup Language (SGML) in conjunction with ISO FDIS 12620 for the robust interchange of terminological data. MARTIF provides a Document Type Definition (DTD) in SGML and ISO FDIS 12620 provides the data categories." See "Introduction to ISO 12200 (negotiated MARTIF)" and ISO 12620 Data Categories. [from The Virtual HyperGlossary]

  • Terminological Markup Framework (TMF). See ISO/DIS 16642. From ISO/TC 37/SC 3/. Computer applications in terminology -- Terminological markup framework (TMF). Date: 2001-12-5. 85 pages. From the introduction/scope statements: "Terminological data are collected, managed and stored in a wide variety of systems, typically in applications, i.e., various kinds of database management systems, ranging from personal computer applications for individual users to mainframe term-bank systems operated by major companies and governmental agencies. Termbases are comprised of various sets of data categories and based on various kinds of data models. Terminological data often need to be shared and reused in a number of applications, and this sharing is usually accomplished using intermediate formats. To facilitate co-operation and to prevent duplicate work, it is important to develop standards and guidelines for creating and using terminological data collections as well as for sharing and exchanging data... This International Standard specifies a model that has been designed for the purpose of providing guidance on the basic principles for representing terminological data, as well as for describing specific terminological markup languages. This International Standard is designed to support the development and use of computer applications for terminological data and the exchange of such data between different applications. Standardisation of data categories and methods for the definition of data structures are specified in ISO 12620 and other related International Standards. This International Standard specifies a framework designed to provide guidance on the basic principles for representing data recorded in terminological data collections. This framework includes a meta-model and methods for describing specific terminological markup languages (TMLs) expressed in XML. 
The mechanisms for implementing constraints in a TML are defined in this International Standard, but not the specific constraints for individual TMLs (which can be the subject of further standardizations), except for the three TMLs defined in the annexes. This International Standard also defines the conditions that allow the data expressed in one TML to be mapped onto another TML and specifies a generic mapping tool, GMT, for this purpose. In addition, this International Standard also describes a generic model for describing linguistic data..." See also XSL transformations for TMF. [cache DIS]

  • Termado. See "CNet Sweden Publishes Multilingual Term Catalogues With 'Termado' XML Technology." CNet Sweden has announced the availability of its Termado software for "management and publishing of term catalogues, lexicons and dictionaries. Using the latest XML and Web Services technology, Termado publishes term catalogues to different media but can also export term data to different applications helping businesses establishing common concepts throughout the organization. Termado consists of a termbase management system and a termbase publishing engine. The termbase management system has an easy-to-use interface for creating and managing a catalogue with terms. The termbase has been designed using linguistic and terminological models. It can represent anything from simple glossaries and dictionaries to very complex terminologies for different subject domains... The web interface supports concept searches as well as free text searching in the term catalogue. Termado automatically creates links between related concepts in the database, which can be used to navigate and explore the termbase." The product supports exchange formats such as OLIF, MARTIF, Geneter, and XLT. "Termado is based on XML (Extensible Markup Language), the latest technology for cross media publishing and data exchange. Therefore Termado has open interfaces for import and export of terminological entries to and from other programs. Developers can easily create any XML-based format to allow support for exchange formats such as OLIF, MARTIF, Geneter and XLT... Termado features many functions for converting different sources to a uniform XML-format. The software has several built-in parsers for reading non-XML-based formats and converting them into a common XML format for term data."
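
Several of the entries above describe converting terminological data from non-XML sources into a common XML representation that groups terms by concept and by language. The following is a minimal Python sketch of that idea, not an implementation of OLIF, XLT, Geneter, TBX, MARTIF, or TMF. The tab-separated input layout, the element names (`termbase`, `entry`, `langSection`, `term`), and the function names are all hypothetical and chosen only for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical tab-separated glossary: concept id, language code, term.
SAMPLE_TSV = "c1\ten\tprinter\nc1\tde\tDrucker\nc2\ten\tscanner\n"


def glossary_to_xml(tsv_text):
    """Group rows by concept and build one <entry> element per concept.

    Element and attribute names are illustrative only; they do not
    follow the schema of any standard listed on this page.
    """
    root = ET.Element("termbase")
    entries = {}
    for concept_id, lang, term in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        entry = entries.get(concept_id)
        if entry is None:
            entry = ET.SubElement(root, "entry", id=concept_id)
            entries[concept_id] = entry
        # One language section per (concept, language) row in this sketch.
        lang_section = ET.SubElement(entry, "langSection", lang=lang)
        ET.SubElement(lang_section, "term").text = term
    return root


def terms_for(root, concept_id, lang):
    """Return all terms recorded for a concept in a given language."""
    path = f"./entry[@id='{concept_id}']/langSection[@lang='{lang}']/term"
    return [t.text for t in root.findall(path)]


if __name__ == "__main__":
    termbase = glossary_to_xml(SAMPLE_TSV)
    print(ET.tostring(termbase, encoding="unicode"))
    print(terms_for(termbase, "c1", "de"))
```

Keeping the concept as the top-level unit, with language sections beneath it, is one common way to let a single entry carry equivalent terms in several languages; a real exchange format would add data categories such as definitions, sources, and part of speech.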


Hosted By
OASIS - Organization for the Advancement of Structured Information Standards

Robin Cover, Editor: