The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: February 11, 2008
XML Localization Interchange File Format (XLIFF)

[February 01, 2008] In February 2008, OASIS announced the approval of XLIFF v1.2 as an OASIS Standard: see New OASIS Standard: XML Localization Interchange File Format (XLIFF) v1.2.

The XLIFF 1.2 Specification defines the XML Localization Interchange File Format (XLIFF). The purpose of this vocabulary is to store localizable data and carry it from one step of the localization process to the other, while allowing interoperability between tools.

The specification is tool-neutral, supports the entire localization process, and supports common software, document data formats, and mark-up languages. The specification provides an extensibility mechanism to allow the development of tools compatible with an implementer's data formats and workflow requirements. The extensibility mechanism provides controlled inclusion of information not defined in the specification.

XLIFF is an XML-based vocabulary. Use of XLIFF is represented in the DITA Translation Subcommittee, and will be featured in a translation best practices document. XLIFF has a working relationship with LISA/OSCAR standards related to Translation and Localization, and is a requirement for several LISA standards: (1) TMX supports XLIFF inline markup, where TMX Version 2.0 was in progress [public review draft] as of March 28, 2007; (2) Global Information Management Metrics eXchange (GMX), where XLIFF is a requirement; (3) xml:tm, where XLIFF is a requirement.

[June 27, 2006] As described in the TC Charter Clarification of June 2006:

The purpose of the OASIS XML Localisation Interchange File Format (XLIFF) TC is to define, through extensible XML vocabularies, and promote the adoption of, a specification for the interchange of localisable software and document based objects and related metadata. To date, the committee has published two specifications — XLIFF 1.0 and XLIFF 1.1 — that define how to mark up and capture localisable data that will interoperate with different processes or phases without loss of information. The specifications are tool-neutral, support the entire localization process, and support common software and document data formats and mark-up languages. The specifications provide an extensibility mechanism to allow the development of tools compatible with an implementer's data formats and workflow requirements. The extensibility mechanism provides controlled inclusion of information not defined in the specification.

The state of software and documentation localisation before XLIFF was that a software or documentation provider delivered their localisable resources to a localisation service provider in a number of disparate file formats. Once software providers and technical communicators commenced implementing XLIFF, the task of interchanging localisation data was simplified. Using proprietary and/or non-standard resource formats forces either the source provider or the localisation service provider to implement costly and inefficient pre-processing of localisable content. For publishers with many proprietary or non-standard formats, this requirement becomes a major hurdle when attempting to localise their software. For software developers and technical communicators employing enterprise localisation tools and processes, XLIFF defines a standard but extensible vocabulary that captures relevant metadata for any point in the lifecycle which can be exchanged between a variety of commercial and open-source tools.

The first phase, completed 31-October-2003, created a 1.1 version committee specification that concentrated on software UI resource file localisable data requirements. The next phase consists of promoting the adoption of XLIFF throughout the industry through additional collateral and specifications, continuing to advance the committee specification towards an official OASIS standard, and revising the XLIFF spec to 1.2 version to support document based content segmentation and alignment requirements. To encourage adoption of XLIFF, the TC will define and publish implementation guides for some of the most commonly used resource formats (HTML, Java Resource Bundles, and gettext PO Files)...

[July 2002] From the XLIFF TC Charter 2002-07: "The purpose of the OASIS XLIFF TC is to define, through XML vocabularies an extensible specification for the interchange of localization information. The specification will provide the ability to mark up and capture localizable data and interoperate with different processes or phases without loss of information. The vocabularies will be tool-neutral, support the localization-related aspects of internationalization and the entire localization process. The vocabularies will support common software and content data formats. The specification will provide an extensibility mechanism to allow the development of tools compatible with an implementer's own proprietary data formats and workflow requirements".

[December 17, 2001] OASIS Technical Committee to Standardize an XML Localisation Interchange File Format (XLIFF). A proposal has been submitted to OASIS for the creation of an XLIFF technical committee to "define a specification for an extensible localisation interchange format that will allow any software provider to produce a single interchange format that can be delivered to and understood by any localisation service provider. The format should be tool independent, standardised, and support the whole localisation process. It will comprehensively support common software data formats and be open enough to allow the development of tools compatible with an implementer's own proprietary data formats and company culture." The first phase of TC work will be based on work previously done by the Yahoo DataDefinition Group; this group has produced a white paper, a specification, and a DTD, which were made public through that group's site. The existing specification will be submitted for approval as the XLIFF 1.0 specification in the first meeting. The Technical Committee Proposal contains a statement of intellectual property rights and provisional list of deliverables.

XLIFF description from Yves Savourel:

At the end of 2000, a group driven by companies including Oracle, Novell, Sun, and IBM/Lotus started to define an exchange format for translatable data: XLIFF (XML Localisation Interchange File Format). The format is based on the principles defined by Open Tag and borrows some of its tags. It also adopts some of the ideas developed later in TMX and adds a few innovations of its own: project information, pretranslation and history, versioning, binary objects, and so forth. The first draft of XLIFF was released in May 2001. The information provided here is subject to change as the draft is finalized. You can find the latest specifications and more information on XLIFF at http://www.xliff.org. There is also a discussion group at http://groups.yahoo.com/group/DataDefinition.

XLIFF is close to OpenTag in many respects, but it is a more defined format, enabling fewer possibilities to express the same content in different ways, and therefore offering better interoperability. The format also, for now, specializes in storing text extracted from software-type files and tagged documents. This more specialized aim eliminates the need for some compromises that OpenTag made to accommodate documentation-type data.

The base element of XLIFF is <trans-unit>. It corresponds to a unique item extracted from the original file (label, caption, paragraph, string, and so forth). The content of the item is stored in its <source> element for the source language, and, optionally, its <target> element for the target language. Both <source> and <target> elements contain the text and any inline elements included with the text. Note that currently no mechanism is dedicated to breaking the item into smaller segments, for instance, sentences inside a paragraph.
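
As an illustration of the <trans-unit> model just described, the following minimal Python sketch reads each unit's <source> text and its optional <target> text. The sample document, the file name, the use of a <g> inline element, and the helper name read_units are assumptions made for this example; the sample is not validated against any XLIFF DTD or schema, and documents that declare an XML namespace would need namespace-qualified tag names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two units, the second one not yet translated.
SAMPLE = """<xliff version="1.0">
  <file original="menu.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Open <g id="1">file</g></source>
        <target>Ouvrir <g id="1">le fichier</g></target>
      </trans-unit>
      <trans-unit id="2">
        <source>Save</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_units(xml_text):
    """Return one dict per <trans-unit>: id, source text, target text or None."""
    root = ET.fromstring(xml_text)
    units = []
    for unit in root.iter("trans-unit"):
        source = unit.find("source")
        target = unit.find("target")  # optional, so it may be missing
        units.append({
            "id": unit.get("id"),
            # itertext() flattens the text, including text inside inline elements
            "source": "".join(source.itertext()),
            "target": None if target is None else "".join(target.itertext()),
        })
    return units


for unit in read_units(SAMPLE):
    print(unit)
```

Flattening inline elements with itertext() is convenient for display but discards the inline markup; a real filter would have to preserve it so the codes can be restored on merge.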

Extract/excerpts from pages 383ff. of XML Internationalization and Localization, by Yves Savourel (used with permission). See the full context.

Overview from the XLIFF draft specification:

"XLIFF is the XML Localisation Interchange File Format designed by a group of software providers, localisation service providers, and localisation tools providers. It is intended to give any software provider a single interchange file format that can be understood by any localisation provider. It is loosely based on the OpenTag version 1.2 specification and borrows from the TMX 1.2 specification. However, it is different enough from either one to be its own format.

"XLIFF is XML, as such it begins with an XML declaration. After the XML declaration comes the XLIFF document itself, enclosed within the <xliff> element. A XLIFF document is composed of zero, one or more sections, each enclosed within a <file> element. The <file> element consists of a <header> element, which contains meta-data about the <file>, and a <body> element, which contains the extracted translatable data from the <file>. The translatable data is contained within <trans-unit> elements in <source> and <target> paired elements. These <trans-unit> elements can be grouped recursively in <group> elements.

"In addition, XLIFF provides the ability to maintain information about the processing of the file via the <phase> element. Possible translations for a specific <source> element can be generated from any number of MT/CAT systems and stored near the <source> in <trans-match> elements. Context for a <source> that could be used by a translator or a TM system is provided by the <context> element. Binary data can be made available via the <bin-unit>, which may also be translated and contain an associated <trans-unit>."
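
The nesting described in this overview (<xliff>, then <file>, then <header> and <body>, with <trans-unit> elements optionally grouped recursively in <group> elements) can be sketched in Python as below. The sample document, its ids, and the function names walk and list_units are hypothetical; the sample follows the draft-era description quoted above only loosely and has not been checked against a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with one <file> and two levels of <group> nesting.
SAMPLE = """<?xml version="1.0"?>
<xliff version="1.0">
  <file original="strings.txt" source-language="en" datatype="plaintext">
    <header/>
    <body>
      <group id="dlg">
        <trans-unit id="title"><source>Settings</source></trans-unit>
        <group id="buttons">
          <trans-unit id="ok"><source>OK</source><target>OK</target></trans-unit>
        </group>
      </group>
    </body>
  </file>
</xliff>"""


def walk(node, path=()):
    """Yield (group_path, unit_id, source_text) for every trans-unit under node."""
    for child in node:
        if child.tag == "group":
            # Groups can contain groups, so recurse with the extended path.
            yield from walk(child, path + (child.get("id"),))
        elif child.tag == "trans-unit":
            yield path, child.get("id"), child.findtext("source")


def list_units(xml_text):
    """Collect the units of every <file> section in document order."""
    root = ET.fromstring(xml_text)
    found = []
    for file_element in root.findall("file"):
        body = file_element.find("body")
        if body is not None:
            found.extend(walk(body))
    return found


for group_path, unit_id, source in list_units(SAMPLE):
    print("/".join(group_path), unit_id, source)
```

Keeping the group path alongside each unit is one way to retain the original hierarchy while still handing translators a flat list of strings.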

From Opentag.com: "The XML Localisation Interchange File Format is a format developed by a group of localization customers, localization suppliers, and tools vendors, including: Oracle, Novell, IBM/Lotus, Sun MicroSystems, Alchemy Software, Berlitz, Moravia-IT, and the RWS Group. XLIFF final draft of version 1.0 was ready at the end of May 2001. XLIFF is a format to store extracted text and carry the data from one step to another in the localization process. It follows the same principles as OpenTag, and borrows also a few ideas from TMX. For example, to specify in-line codes XLIFF supports both the placeholder OpenTag's method and TMX's encapsulation system."

From the Moravia IT paper by Milan Karásek:

Most companies in the localization industry (such as Microsoft, Oracle, Sun, etc.) use their own localization tools to achieve correct translations. These tools accept a number of input file formats, so the processing of different formats on the vendor side is quite difficult. For example, making a unique glossary from several formats can pose certain difficulties; also, looking for a suitable translation in more than one customer file by browsing using different tools is uncomfortable. We can see other disadvantages for clients, this time financial disadvantages, in the form of forced tool development: in most cases, clients produce their own tools.

As a response to this situation, an initiative to define a standard file format in the localization industry was established. To assure a truly standardized file format, representatives from many leading client companies were invited for cooperation. The initiative, called Data Definition Group, started in October 2000. Over time, the group was renamed to a more specific name - XLIFF, which stands for XML-based Localization Interchange File Format.

As the initiative's name suggests, the standard is based on the XML technology. This means that the interchange format will be transparent to all partners in the localization process and tools developers. In most cases, an XML file is a plain text file, containing data in a pre-defined structure. The structure description is stored in a special file (DTD), accessible to everyone. The goal of the XLIFF group was to define this public structure and a DTD file has to be the result. A new discussion group was started on the Yahoo! Groups server. As the mission statement defined a very strict deadline, the problem was divided into three areas, each processed by a subset of the participants. This is the list of all subareas, with their descriptions: (1) The Architecture group's goals were to define a core of the XLIFF format, propose a naming convention for XLIFF, and reconcile the differences between what was proposed for XLIFF in the first proposals and what was inherited from OpenTag/TMX. (2) The Status Flags-Version Control group was concerned with how XLIFF would deal with file metadata. In particular, they were concerned with version-information data and job-information data. Moravia IT participated in this group. (3) The TM group ensured that the features within XLIFF relevant to Translation Memory are adequate for the defined purposes. Outcomes of all the areas were merged together during a 2-day conference, which took place in Dublin at the end of March. All main issues of the XLIFF were finalized and the draft of the first specification was completed. It is supposed to have an official white paper and the finalized specifications (with the DTD) released in the second week in May [2001].

[January 22, 2002] "OASIS Members Form Technical Committee to Develop Localization Interchange File Format. Commerce One, HP, IBM, Novell, Oracle, SAP, Sun Microsystems, Xerox, and Others Unite on Multi-Lingual Data Exchange Standard." - "OASIS, the XML interoperability consortium, today announced its members have formed the OASIS XML Localization Interchange File Format (XLIFF) Technical Committee to advance a specification for multi-lingual data exchange. XLIFF will allow any software provider to produce a single interchange format that can be delivered to and understood by any localization service provider. Supporting the entire localization process, XLIFF will be product independent, open enough to allow the development of tools compatible with an implementer's own proprietary data formats and company culture... Initial development of XLIFF was begun by an independent group of localization experts who plan to submit their work to the new OASIS technical committee. Many existing XML formats such as UIML, OpenTag and Translation Memory Exchange (TMX) were used as points of reference in building XLIFF. Yves Savourel, author of XML Internationalization and Localization, applauded the consolidation under OASIS, saying, "The creation of an OASIS Technical Committee for XLIFF is good news for our industry. XLIFF brings the promise of much greater interoperability between the tools used by customers and providers of localization services. This should lead to more efficient and less costly processes." Members of the OASIS XLIFF Technical Committee include consortium sponsors, Commerce One, Hewlett-Packard Company, IBM, Novell, Oracle, SAP, Sun Microsystems, Xerox, and other OASIS members..."


Principal References


XLIFF Specification

XLIFF Version 1.2

XLIFF Version 1.2. OASIS Standard. 01-February-2008. Also in PDF format [source]. Edited by Yves Savourel, John Reid, Tony Jewtushenko, and Rodolfo M. Raya. See the specification drafts for XLIFF Version 1.2, and URIs for the OASIS Standard.

With final Version 1.2 XML schemas: strict and transitional.

XLIFF Version 1.1

XLIFF 1.1 Specification. OASIS Committee Specification. 31-October-2003. Edited by Yves Savourel and John Reid.

This document defines the XML Localization Interchange File Format (XLIFF). The purpose of this vocabulary is to store localizable data and carry it from one step of the localization process to the other, while allowing interoperability between tools.

XLIFF Version 1.0

XLIFF 1.0 Specification. OASIS Committee Specification. 15-April-2002. Edited by Yves Savourel and John Reid.

"XLIFF is the XML Localisation Interchange File Format designed by a group of software providers, localisation service providers, and localisation tools providers. It is intended to give any software provider a single interchange file format that can be understood by any localisation provider. It is loosely based on the OpenTag version 1.2 specification and borrows from the TMX 1.2 specification. However, it is different enough from either one to be its own format."

XLIFF Early Version


XLIFF Resources: Software

  • XLIFF Tools project. "The XLIFF Tools project was initiated in 2005 with the aim of encouraging the adoption of the XML Localisation Interchange File Format (XLIFF) in open source localisation processes."

  • Ektron CMS400.NET. Ektron CMS400.NET v6 and later support the following versions of the XLIFF standard: 1.0, 1.1, and 1.2. Ektron has been an early-adopter of the XLIFF standard and has received substantial recognition for the localization capabilities of Ektron CMS400.NET, which are made possible by XLIFF.

  • IBM XLIFF to ICU ResourceBundle Format Converter. XML Localization Interchange File Format (XLIFF), a format designed by localization industry experts for solving problems faced by translators, is an emerging industry standard for authoring and exchanging content for localization. XLIFF is a lossless and tool-neutral interchange format for localizable content. More information about XLIFF can be obtained from this page. ICU provides tools for converting XLIFF files to ICU ResourceBundle format. The binaries of one of these tools, the XLIFF2ICUConverter is available for download. See the tutorial.

  • opentag.com XLIFF resources. The XML Localisation Interchange File Format is a format developed by a group of localization customers, localization suppliers, and tools vendors, including: Oracle, Novell, IBM/Lotus, Sun Microsystems, Alchemy Software, Berlitz, Moravia-IT, and ENLASO Corporation (formerly the RWS Group). Version 1.2 became an official OASIS Standard in February 2008.

  • Drupal XLIFF Tools. By Gábor Hojtsy. This module converts node bodies and titles written with valid(!) HTML markup to XLIFF (XML Localization Interchange File Format) and back to HTML. You can use Computer Aided Translation (CAT) tools to support your content translation process.

  • xliffRoundTrip Tool. "Automates a roundtrip between any XML file and XLIFF. It consists of 2 XSL files and a Java application. The first transforms XML to XLIFF. The second transforms that XLIFF back to the original XML, presumably after a language translation on the XLIFF file."

  • Java.net XLIFF Translation Editor. Sun Microsystems. "The Open Language Tools are a set of translation tools that aim to make the task of translating software and documentation a lot easier. Initially, they comprise of a full-featured XLIFF Translation Editor and a set of XLIFF file-filters for a number of documentation and software file formats. Our intended audience is tools developers and translators of software and documentation..."

  • itools. "itools is a collection of Python libraries which provides a wide range of capabilities, including an abstraction over directory and file resources, a search engine, type marshallers, datatype schemas, i18n support, URI handlers, a Web programming interface, a workflow interface, and support for data formats such as (X)HTML, XML, iCalendar, RSS 2.0, and XLIFF."

  • Transolution: An Open Source Translation Suite. "Transolution is a Computer Aided Translation (CAT) suite supporting the XLIFF standard. It provides the open source community with features and concepts that have been used by commercial offerings for years to improve translation efficiency and quality. The suite is modular to make it flexible and provides a XLIFF Editor, translation memory engine and filters to convert different formats to and from XLIFF. The use of XLIFF means that almost any content can be localized as long as there is a filter for it (XML, SGML, PO, RTF,StarOffice/OpenOffice)."

  • Swordfish. Swordfish is a cross-platform CAT (Computer Aided Translation) tool based on the XLIFF 1.2 open standard published by OASIS. Swordfish supports TMX (Translation Memory eXchange), the vendor-neutral open XML standard for the exchange of Translation Memory (TM) data created by Computer Aided Translation (CAT) and localization tools, published by LISA (Localisation Industry Standards Association).


General: News, Articles, Reports, Drafts

  • [February 11, 2008] New OASIS Standard: XML Localization Interchange File Format (XLIFF) v1.2 — OASIS has announced the approval of the XML Localization Interchange File Format (XLIFF) specification Version 1.2 as an OASIS Standard. The specification was produced by members of the OASIS XML Localisation Interchange File Format (XLIFF) Technical Committee. The purpose of the XLIFF vocabulary is to store localizable data and carry it from one step of the localization process to the other, while allowing interoperability between tools. The specification is tool-neutral, supports the entire localization process, and supports common software, document data formats, and markup languages. The specification provides an extensibility mechanism to allow the development of tools compatible with an implementer's data formats and workflow requirements. The extensibility mechanism provides controlled inclusion of information not defined in the specification. The XLIFF file format serves as a container for externalized data to be interchanged between software publishers, documentation writers (including, but not limited to documents written in DITA, Docbook, HTML, and other XML document formats), localization tools, and software services providers in order to facilitate all the phases of the localization process.

  • [February 08, 2008] "XLIFF: An Aid to Localization." By John Corrigan and Tim Foster. Sun Developer Network (SDN). "Translators today can expect to receive documents for translation in any one of several formats... From a translator's point of view, this is quite a difficult mix to deal with. You would need to maintain several editing tools, be proficient in many file formats (knowing the syntax and grammar of each type), and that's before you've even started to translate the content. As a localization engineer, a similar problem exists: it's difficult to write tools for each file format. For example, if your boss asks you to calculate the number of new words for translation between the last delivery and the current one, you need a tool capable of dealing with all formats or a separate tool for each format. Normally during localization, files are processed by tools such as translation memories and machine translation systems. Translation memory systems, known as TM systems, work by looking up segments in a database containing a large number of previously translated segments and their translations. (Segments are pieces of source files, usually sentences, that can be translated reasonably independently.) The database might contain segments that match the input segment exactly or segments that are similar to the segment presented for translation. These translations are then provided to the translator as suggested translations for each segment... XLIFF is an XML-based format that enables translators to concentrate on the text to be translated. Likewise, since it's a standard, manipulating XLIFF files makes localization engineering easier: once you have converters written for your source file formats, you can simply write new tools to deal with XLIFF and not worry about the original file format. 
It also supports a full localization process by providing tags and attributes for review comments, the translation status of individual strings, and metrics such as word counts of the source sentences. XLIFF aids localization in a number of ways. (1) XLIFF removes the complexities of localizing different types of source files. (2) XLIFF provides a common platform for localization tools vendors to write to, thus increasing the number of tools available. (3) XLIFF highlights the parts of a file that are important to the localization process. (4) XLIFF provides support to the localization process, through its commenting features, support for phases, and metrics..." See also "Using Translation Technology at Sun Microsystems" (also PDF, cache).

  • [January 15, 2008] XLIFF 1.2. White Paper. Covers Version 1.2 of the XML Localization Interchange File Format (XLIFF). Revision: 1.0. Issue Date: October 17, 2007. 34 pages. Submitted to the XLIFF TC document repository by Bryan S. Schnabel on January 15, 2008. ['This white paper is provided as a high level guide to anyone who seeks to better understand XLIFF in general terms, with particular emphasis on XLIFF 1.2's features. It provides an introduction and overview of XLIFF 1.2. It includes an architecture overview, how to use XLIFF, use cases, a case study, resources and tools, and contact information.'] "XLIFF was designed as a solution to the complex documentation and technical communication localization process. Challenges in translating proprietary word processor and desktop publishing files, and even challenges having to do with the age-old task of localizing text strings in graphics proved to be a potential good fit for XLIFF. XLIFF reduces this complexity of localising software by providing a standard, XML-based, end-to-end, tool neutral resource container. Software and documentation publishers can extract their localizable content into XLIFF and localize them using shrink-wrapped tools solutions, customized tools or automated enterprise workflow systems. Additional process efficiency is achieved by XLIFF's built-in support for Computer Aided Translation technologies such as translation memory and machine translation. XLIFF is based on the concept of extracting the source localization-related data from the original format, and merging it back in place after the localization has been done. Depending on the extract/merge method, the parts that are not related to localization can be preserved temporarily into the Skeleton. Or, usually when the source is already XML, the non localizable parts can be preserved within the XLIFF hierarchy using 'group' elements to preserve the hierarchy. 
There are no rules to date on how to represent the data in the Skeleton itself, this is left to the discretion of the filters. XLIFF 1.2 focuses on how to store and organize the extracted parts. Skeletons can be either embedded directly in the XLIFF document with the 'internal-file' element or simply referred to with the 'external-file' element... Inline codes (e.g., markers for bold or italics, links information, or image references) can be represented using either an encapsulation mechanism or a placeholder method. Those are derived respectively from TMX (LISA's Translation Memory Exchange Standard), and OpenTag, a localization data container..." [source]

  • [January 02, 2008] OASIS XLIFF Version 1.2 to be Considered for Standardization. Staff, OASIS Announcement. Members of the OASIS XML Localization Interchange File Format (XLIFF) Technical Committee have submitted an approved Committee Specification document set for XLIFF 1.2 to be considered as an OASIS Standard. The XLIFF 1.2 Specification defines the XML Localization Interchange File Format (XLIFF), designed by a group of software providers, localization service providers, and localization tools providers. The purpose of this vocabulary is to store localizable data and carry it from one step of the localization process to the other, while allowing interoperability between tools. It is intended to give any software provider a single interchange file format that can be understood by any localization provider. The specification is tool-neutral, supports the entire localization process, and supports common software, document data formats, and markup languages. The specification provides an extensibility mechanism to allow the development of tools compatible with an implementer's data formats and workflow requirements. The extensibility mechanism provides controlled inclusion of information not defined in the specification. XLIFF is loosely based on the OpenTag version 1.2 specification and borrows from the TMX 1.2 specification. However, it is different enough from either one to be its own format. The Version 1.2 specification set includes a Core prose document, XML schemas (strict, and transitional), a Representation Guide for HTML, a Representation Guide for Java Resource Bundles, and Representation Guide for Gettext PO (defines a guide for mapping the GNU Gettext Portable Object file format to XLIFF). Statements of successful use are provided by Lionbridge Inc., SDL International, OSCAR, LISA, Idiom Technologies Inc., and Localisation Research Centre (LRC), University of Limerick.

  • [September 28, 2007] "XLIFF in the Localisation of Open Source Software. One step forward, two steps back?" By Asgeir Frimannsson. Presentation. CRICOS No. 00213J. 59 slides. Presented at LRC XII - The Localisation Research Forum. The 12th Annual Internationalisation and Localisation Conference organised by the Localisation Research Centre (LRC) with the Global Initiative for Local Computing (GILC). 26-28 September 2007. European Foundation, Loughlinstown, Dublin, Ireland.

  • [April 06, 2007] "New module: XLIFF Tools." By Gábor Hojtsy. Blog. "By looking at what people described as their use case, there is a considerable amount of interest in a CAT (Computer Aided Translation) support tool in Drupal. While Drupal 6 could (and will by default or with a contrib module) provide a translation interface for nodes, content created in professional systems is often not translated in-house, translation work is outsourced to professional translators. These professionals employ tools to remember previous translations, build on existing terminology and reuse what is already done. The industry standard for data interchange with these tools is the XLIFF format, for which fortunately Bryan Schnabel developed (and released under GNU GPL) some XSL transformations I was able to reuse. All this resulted in the first implementation of XLIFF Tools, a module to export and import XLIFF data of Drupal nodes. I have tested some basic content nodes with Heartsome's XLIFF Translation Editor, and everything seems to be working smoothly, but there are probably hard edges you will be able to find if you try out this new tool. A development snapshot for Drupal 5.x will be available as soon as the build system generates the tarball..." Compare Linking Translation Tools and Drupal from 2007-07-11.

  • [February 21, 2007] "OAXAL: Open Architecture for XML Authoring and Localization." By Andrzej Zydron. From XML.com (February 21, 2007). Related draft, as contributed to the OASIS DITA TC, in PDF format, from Word/DOC source. See XLIFF in Figure 1, Object 1 along with W3C ITS, Unicode TR29, SRX (Segmentation Rules eXchange), GMX (Global Information Management Metrics Exchange), and TMX (Translation Memory eXchange) — part of an elegant and integrated environment for document creation and localization. xml:tm mandates the use of the OASIS XLIFF standard for extracting the text for the actual translation process. "XML is now acknowledged as the best format for authoring technical documentation. Its wide support, extensible nature, separation of form and content, and ability to publish in a wide variety of output formats such as PDF, HTML, and RTF make it a natural choice. In addition, the costs associated with implementing an XML publishing solution have decreased significantly. Nevertheless, there are some clear do's and don'ts when authoring in XML, some of which are detailed in Coping with Babel, a paper from the XML 2004 conference. XML, thanks to its extensible nature and rigorous syntax, has also spawned many standards that allow the exchange of information between different systems and organizations, as well as new ways of organizing, transforming, and reusing existing assets. For publishing and translation, this has created a new way of using and exploiting existing documentation assets, known as Open Architecture for XML Authoring and Localization (OAXAL). OAXAL takes advantage of the arrival of some core XML-related standards: (1) DITA — Darwin Information Typing Architecture from OASIS; (2) xml:tm — XML-based text memory from LISA OSCAR. DITA is a very well thought-out way of introducing object-oriented concepts into document construction. It introduces the concepts of reuse and granularity into publishing within an XML vocabulary. 
It is having a big impact on the document publishing industry. xml:tm is also a pivotal standard that provides a unified environment within which other localization standards can be meaningfully integrated, thus providing a complete environment for OAXAL. OAXAL allows system builders to create an elegant and integrated environment for document creation and localization. The OAXAL model provides full process automation, right up to delivering matched files to the translator. Automation eliminates the costs associated with project management and manual processes. Data gets processed faster and more efficiently and without the costs associated with a traditional localization workflow..." PDF source: See the posting by Andrzej Zydron with the document title "Making Effective Use of XML for Publishing" [ZIP file, Word/DOC] and copy/ZIP, cache.

  • [February 16, 2007] "The Loneliness of the Long Distance Standards Committee." By Peter Reynolds. Blog. "Sometime soon, at least I hope soon, XLIFF will be published as a standard for localization file interchange by OASIS. After six years of work the finishing post is in sight, specifications have gone through reviews by our peers, comments have been made and responded to and we now have something that helps solve problems faced by those who translate their electronic data. Although it's not really a finishing post. When we have got past the stage of releasing XLIFF 1.2 as an OASIS standard we are starting to work on the next version. Now seems like a good time to look back on this committee... At the early stage we had two names. The first was data Definitions and the second was LIFF (Localization Interchange File Format). Data Definitions was a very early name which only lasted because we had called our yahoo group by this name. LIFF seemed to perfectly explain what we wanted to achieve. However, there was one problem with LIFF. Leeds International Film Festival was already using this acronym so we put an X for XML in front of LIFF and came up with XLIFF..."

  • [January 5, 2007] "Moravia Worldwide and Tektronix Implement XLIFF-based Localization Workflow." Moravia IT. Announcement. January 5, 2007. "Moravia Worldwide announced the joint development and implementation of a technical authoring, localization and publishing workflow based on open standards such as XLIFF (XML Localization Interchange File Format) and XML. This single source/multiple target file solution connects technical writers, translators and publishers in a streamlined process. This process uses customized deployment of industry standard tools such as Arbortext Epic Editor, and development of specific extensions which utilize the benefits of the XLIFF-enhanced XML files throughout. The combined solution meets the objective of delivering more localized content for Tektronix products, shortens turnaround times, and reduces the overall costs associated with authoring, localizing and ultimately publishing the content. The major benefit of this new solution lies in the way document translation is managed in XML. Each XML element is individually tracked through the translation process, whether the entire document is being translated anew or whether only a few elements need to be changed to update the document. In the case of an update, the translators still have the complete document to work with, which gives them the overall context they need to do a consistent translation. Another benefit is its tight integration of the localization function into the writers' XML working environment. Writers can process several different XML document types into a common XLIFF interface and then back again into publishable documents after the translation is complete. Moravia only needs to support one format, XLIFF, rather than the growing range of document types that Tektronix is developing in XML. 
The workflow simplifies the process for submitting content for localization, and provides an automated and immediate conversion from XLIFF to an Adobe Acrobat PDF file for a streamlined language and layout review. This XML process helped to radically transform and simplify desktop publishing requirements, thanks to the capacity for Epic Editor to enable robust automated formatting options for the XML that produce correct layouts automatically. In addition, the process ensures that all the localized content is immediately available in the XML repository and can be easily re-used in new product documentation, further saving on translation costs and time..."

  • [September 19, 2005] "What Is XLIFF and Why Should I Use It? A Brief Overview of the XML Localization Interchange File Format (XLIFF)." By Peter Reynolds and Tony Jewtushenko. From SYS-CON XML Journal. "Localization includes not only translation of the displayed text, but also adaptation of a product to comply with a country's cultural and legal practices. Examples of cultural conventions include date/time formats, postal address formats, font sizes, appropriateness of colors, numeric or currency formats and symbols, culturally appropriate icons or graphics, etc. The diversity of software platforms and technologies means that tools and technologies that support localization are also diverse and are frequently incompatible with each other. Industry standards drive process and technology efficiencies, and OASIS XLIFF (XML Localization Interchange File Format) has emerged as a standard interchange file format for localization-related data and metadata. This article will introduce the process of localization and summarize the challenges and issues facing those who localize. It will illustrate how XLIFF addresses many of the challenges and issues with descriptions of its architecture, provide examples of how to use it in real life, and discuss how it was developed and where it goes from here..."

  • [August 2005] "An Introduction to XLIFF." By Tony Jewtushenko (Director R&D, Product Innovator Ltd.; also Chair, OASIS XLIFF TC). 73 slides. "The XML Localisation Interchange File Format is a specification for the lossless interchange of localizable data and its related information, which is tool-neutral, has been formalized as an XML vocabulary, and features an extensibility mechanism. An XLIFF document can capture anything needed for a localization project: (1) Localizable objects (e.g. text strings) in source and target languages. (2) Supplementary information (e.g. glossaries, or material to recreate the original format). (3) Administrative information (e.g. workflow data). (4) Custom data (e.g. initialization information for tools). XLIFF allows not only text string as localizable object but also other object types such as graphics. Supplementary information can be represented in a generic way through inline codes (e.g. formatting of text). Relationship between object can be captured (e.g., all items in a menu). XLIFF provides hooks for storing supplementary information (for example to glossaries or translation memories which should be used). The supplementary information can be referenced (i.e., reside outside of the document), or embedded within the document. XLIFF provides mechanisms for capturing administrative information: relating source material to XLIFF documents; storing workflow data; providing pre-translation entries; keeping track of changes..." Note: this presentation also includes listings for XLIFF Editor Support (with filters), Software Publisher Support for XLIFF, Open Source Tools Support for XLIFF.

  • [August 2005] "Implement support for OASIS XLIFF 1.1 in KBabel." By Asgeir Frimannsson. Google Summer of Code 2005 Projects. "Localisation of open source (including KDE and Gnome) is in a majority of projects handled by the GNU Gettext library and the associated PO file format. The PO file format is a simple string table used by translators to translate the English sources to their native language. Several tools exists for editing PO files, and the most used (and most advanced) of these is KDE's KBabel. In recent years, industry standards have been developed in the area of software localisation, including the OASIS XLIFF file format for storage and exchange of localisable resources in the localisation process. The philosophy behind XLIFF is to extract resources from native formats into a common standard localisation format (XLIFF), and merge the translated resources back into the native format when translation is complete. Filters and specifications for converting to and from XLIFF have been developed for a number of file types, including PO, HTML and DocBook... By implementing support for the industry standard XLIFF format in KBabel, the KDE project will have a fully fledged native XLIFF localisation tool. As we move towards and beyond KDE 4, we then have a stronger argument for considering adopting XLIFF in KDE's localisation process, with its inherent benefits..."

  • [October 22, 2004] "XML in Localisation: Use XLIFF to Translate Documents." By Rodolfo Raya (Director of Product Development, Heartsome Holdings Pte. Ltd). From IBM developerWorks. "The first article in this series briefly explained the most relevant XML standards used in the localisation industry. This second part focuses on XML Localisation Interchange File Format (XLIFF) and explains with practical examples how to use it for translating different kinds of documents. This article presents a step-by-step guide to translating multilingual documents using XLIFF as an intermediary file format, and provides useful resources for localizing Java applications. XLIFF is a format that's used to exchange localisation data between participants in a translation project. This special format enables translators to concentrate on the text to be translated, without worrying about text layout. The XLIFF standard is supported by a large group of localisation service providers and localisation tools providers..."

  • [August 2004] "XML in Localisation: A Practical Analysis. An overview of the most relevant XML standards used in the localisation industry." By Rodolfo M. Raya (Director of Product Development, Heartsome Holdings Pte. Ltd). From Maxprograms Software Development and Consulting Services; first published by IBM developerWorks, August 2004. "This article focuses on the most common XML formats used in the localisation industry to show you how important XML is becoming in multilingual document exchange... XLIFF, a common format: One of the advantages of XLIFF is its relative simplicity. An XLIFF file can be described as a collection of translation units. Each translation unit contains a sentence or paragraph that's extracted from the original document in an element called 'source', and the translator has to fill a 'target' element with the appropriate translation. Legacy translations from previous projects can be added to a new translation unit using 'alt-trans' elements. Translators can use these translations as a guideline. Sometimes the translation in an 'alt-trans' element is perfect, and all the translator has to do is accept the suggested text..."

  • [April 02, 2004] "Localizing with XLIFF and ICU." By Ram Viswanadha and Markus Scherer (IBM Corporation). Tutorial given at the Unicode 25 Conference, Washington D.C., USA (March 31 - April 2, 2004). "XLIFF, a format designed by localization industry experts for solving problems faced by translators, is an emerging industry standard for authoring and exchanging content for localization. This talk presents an overview of how ICU facilitates the localization of a product using XLIFF. A process for managing the localization is proposed. Finally, a case study of localizing a product adhering to the process is presented. This talk discusses software localization and some of the file formats and processes that are involved. Platform-specific formats are contrasted with an emerging industry standard that is designed for efficient localization. Localizing applications involves separating user interface elements from the source code and translating these elements. These user interface elements are called resources. Many formats exist for representing resources. Different platforms and technologies provide different formats, e.g.,: VC++ RC files, Java ResourceBundles, POSIX message catalogs, ICU resource bundles etc. Every format is designed for a specific purpose and a platform. The translators, who usually are non-programmers, have to deal with these formats for translating the content. Tools available for assisting translators support some of the formats..." See the description of the XLIFF2ICUConverter download.

  • [August 2003] "Enabling Language Translation with XML Tools and Standards." By Bryan Schnabel (XML Information Architect, Tektronix, Inc). Edited by Gail Toft-Vizzini. From The Center for Information-Development Management, Best Practices Newsletter, Volume 5, Number 4, August 2003. Reprinted with permission. "Maintaining consistency between a source document and its translated counterparts can be complex and troublesome. Innumerable challenges can arise with character sets, version control, text in graphics, tables, expansion of text, updates, and so on. Using XML for translation can help overcome some of these challenges. In this article, I explain how XML tools and standards can help remedy tricky issues related to translation... XLIFF (XML Localization Interchange File Format) defines a specification for an extensible localization interchange format that will allow any software provider to produce a single interchange format that can be delivered to and understood by any localization service provider. The format is tool-independent, standardized, and supports the entire localization process. XLIFF stores information in an XML document (a non-proprietary document format); it can be opened, manipulated, and saved in popular XML packages like ArborText, XML Spy, and NotePad. XLIFF is being developed by translators, tool vendors, and documentation experts..." [source]

  • [August 27, 2003] "Translating XML-Based Documents." By Andrzej Zydron. From SYS-CON SOA World Magazine. "The advent of text in electronic format poses a number of problems for translators. These problems were: 1. How to manage the differing encoding standards and their corresponding font support and availability; 2. How to present the text to translators without having to purchase additional copies of the original creation program; 3. How to translate the text while preserving the formatting; 4. How to build translation memories for these documents to reduce the cost of translation and improve consistency... xml:tm radically changes the approach to the translation of XML-based documents. It is an open standard created and maintained by XML-Intl, for the benefit of those involved in the translation of XML documents. xml:tm is an open standard created and maintained by XML-Intl based on XML and XLIFF. Full details of the xml:tm definitions (XML Data Type Definition and XML Schema) are available from the XML-Intl Web site... XLIFF is an OASIS standard for the interchange of translatable text in XML format. xml:tm translatable files can be created in XLIFF format. The XLIFF format can then be used to create dynamic Web pages for translation. A translator can access these pages via a browser and undertake the whole of the translation process over the Internet. This has many potential benefits. The problems of running filters and the delays inherent in sending data out for translation, such as inadvertent corruption of character encoding or document syntax, or simple human workflow problems, can be totally avoided. Using XML technology it's now possible to reduce and control the cost of translation as well as reduce the time it takes for translation and improve reliability... Using XLIFF you can protect the original document syntax from accidental corruption during the translation process. 
In addition, you can supply other relevant information to the translator such as translation memory and preferred terminology."

  • [June 10, 2003] "Internationalization: Implementing the XLIFF Standard." By Jon Allen. Presentation at the uPortal and JA-SIG Summer Conference 2003. June 10, 2003. This presentation describes a workflow based on ANT and XSL stylesheets for converting XML documents to XLIFF.

  • [February 05, 2003] "An Introduction to Using XLIFF. Technical Aspects and Implementation of XML Localisation Interchange File Format." By Yves Savourel (RWS Group). In MultiLingual Computing and Technology #54, Volume 14, Issue 2 (March 2003), pages 28-34. "This article will give you a technical overview of XLIFF, describe the different parts, and explain how they fit together." The OASIS XML Localization Interchange File Format Technical Committee has been chartered to "define, through XML vocabularies, an extensible specification for the interchange of localization information. The specification will provide the ability to mark up and capture localizable data and interoperate with different processes or phases without loss of information. The vocabularies will be tool-neutral, support the localization-related aspects of internationalization and the entire localization process. The vocabularies will support common software and content data formats. The specification will provide an extensibility mechanism to allow the development of tools compatible with an implementer's own proprietary data formats and workflow requirements."

  • [November 12, 2002] "XLIFF: Update on the XML-based Localisation Interchange File Format." By Peter Reynolds (Manager, Development Team, Bowne Global Solutions) and Tony Jewtushenko (Oracle and Chair XLIFF Technical Committee, OASIS). Presented Tuesday, 12-November-2002 at the LRC 2002 Conference on eContent Localisation, Dublin, Ireland. 39 pages. [source .PPT]

  • [November 05, 2002] "OASIS-LISA Global e-Biz Survey." By Patrick Gannon (OASIS President and CEO). Summary of results from the OASIS-LISA Global eBusiness Survey. Source posted to the OASIS XML Localization Interchange File Format TC mailing list by Jonathan Clark. Slides presented at the XLIFF TC face to face meeting November 4, 2002. The subject of the survey was Global eBusiness and Web Services requirements with particular reference to Language Processing Standards. It was designed to "determine the impact of multi-lingual technologies on global e-business" and "is the latest product of cooperation between two international organizations developing complementary standards for localization. The survey is structured in two parts: The Business Process section is for managers responsible for international product, services, sales or support operations; the Localization Expert section is designed for product, web services, development and standards professionals." See the announcement: "OASIS and LISA Collaborate on Global e-Business Survey. New Study to Determine Impact of Multi-Lingual Technologies." [source .PPT]

  • [February 18, 2002] "Internationalization Features in XML and XLIFF. Extensible Markup Language and XML Localization Interchange File Format are Powerful Tools for Multilingual Applications." By Ultan Ó Broin (Oracle Corporation). In MultiLingual Computing and Technology Volume 13 Issue 2 [#46] (March 2002), pages 53-55. ISSN: 1523-0309. "In this article I will first look at the internationalization features of XML in terms of what content development teams must do to provide for character set encodings, character representation, language identification, and the presentation and rendering of global content in different languages. Then I will look at what development teams must do to facilitate the localization of XML content and how XML features enhance the localization process... The best way to provide for localization of XML is to use the XML Localisation Interchange File Format (or XLIFF). XLIFF is an XML-based file format for the exchange of localization data, based on OpenTag 1.2 and including features of TMX. It was developed by a group of localization partners including Oracle, Novell, IBM/Lotus, Sun Microsystems, Alchemy, Berlitz, LionBridge, Moravia-IT, and the RWS Group. XLIFF is now maintained under the aegis of the Organization for the Advancement of Structured Information Standards (OASIS). XLIFF defines a specification for an extensible format that caters specifically for localization requirements. It allows any software publisher to produce a single interchange format understandable by any localization service provider. It requires that the format should be tool independent, standardized, and support the whole localization process. The XLIFF data format successfully meets the goal of the separation of localization data and process, providing a focus on automation, stopping the proliferation of internal XML formats, and turning localization into a commodity for all players. 
Software publishers are freed to focus on producing international products and vendors are freed to focus on translating without managing multiple translation tools or file formats." [excerpt provided by the author]

  • [November 09, 2001] XSL Template Collection for XLIFF/TMX. November 09, 2001. 'A set of XSL templates to execute various tasks. It includes for example: XLIFF to Java properties file conversion, XLIFF to TMX, TMX to tab-delimited, Leveraging of existing translation into an XLIFF document, conversion to UTF-8 encoding for any XML document, etc.' From the posting of Yves Savourel 2001-11-09 (and see the README): "...a note to let you know that there is now a small collection of XSL templates freely available that offers utilities for XLIFF and TMX. For now it includes 6 templates: (1) LeverageXLIFF.xsl - Leverages the existing translation of a XLIFF document into a newer XLIFF document. (2) XLIFFToPO.xsl - Converts the <target> elements of an XLIFF document into a PO (Portable Object) file. (3) XLIFFToProperties.xsl - Converts the <target> elements of an XLIFF document into a Java properties files. (4) XLIFFToTMX.xsl - Converts an XLIFF document into a TMX document. (5) TMXToTDF.xsl - Converts the entries of a TMX document into a tab-delimited file. (6) ToUTF8.xsl - Converts any XML file into UTF-8 encoding. More will come later (suggestions are welcome)..." [cache]

  • XLIFF Settings Files. Jul-01-2001. Settings files (.anl and .ini files) to translate XLIFF documents with SDLX or TagEditor. [cache]

  • XML Internationalization and Localization. By Yves Savourel. Sams Publishing. Published June 26, 2001. ISBN: 0-672-32096-7. Chapter 17 of the book deals extensively with XLIFF, 'XML Localisation Interchange File Format'. See the book summary and extract.

  • XLIFF White paper June 11, 2001. [cache]

  • See also: "OpenTag Markup Format."

  • See also: "Translation Memory Exchange."
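
Illustration. Several entries above describe an XLIFF file as a collection of translation units, each holding a 'source', a 'target' to be filled in by the translator, and optional 'alt-trans' suggestions, and mention converting the 'target' elements into a Java properties file. The following Python sketch illustrates that structure under stated assumptions: the sample document, its ids and file name, and the helper names `read_units` and `to_properties` are hypothetical, and the code is not the XSL template collection referenced above. It ignores inline markup and properties-file escaping.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 documents.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample: one translated unit, one untranslated unit
# that carries a legacy suggestion in <alt-trans>.
SAMPLE = """
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="messages.txt" source-language="en"
        target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="greeting">
        <source>Hello</source>
        <target>Bonjour</target>
      </trans-unit>
      <trans-unit id="farewell">
        <source>Goodbye</source>
        <alt-trans>
          <source>Goodbye</source>
          <target>Au revoir</target>
        </alt-trans>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def read_units(xliff_text):
    """Return one dict per <trans-unit>: id, source, target, suggestions."""
    root = ET.fromstring(xliff_text.strip())
    units = []
    for tu in root.iterfind(".//x:trans-unit", NS):
        # Only the direct child <target> is the unit's own translation.
        target = tu.find("x:target", NS)
        units.append({
            "id": tu.get("id"),
            "source": tu.findtext("x:source", default="", namespaces=NS),
            "target": target.text if target is not None else None,
            # Targets nested in <alt-trans> are suggestions only.
            "suggestions": [
                t.text or ""
                for t in tu.iterfind("x:alt-trans/x:target", NS)
            ],
        })
    return units


def to_properties(units):
    """Emit key=value lines for translated units (no escaping applied)."""
    return "\n".join(
        f'{u["id"]}={u["target"]}' for u in units if u["target"]
    )


if __name__ == "__main__":
    parsed = read_units(SAMPLE)
    for unit in parsed:
        print(unit)
    print(to_properties(parsed))
```

In this sketch the second unit has no translation of its own, so it is left out of the properties output; a translator's tool would typically offer its 'alt-trans' suggestion for acceptance first.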


Hosted By
OASIS - Organization for the Advancement of Structured Information Standards

Document URI: http://xml.coverpages.org/xliff.html
Robin Cover, Editor: robin@oasis-open.org