ISO/TC37/SC2/WG1 N77
 
Date of presentation 2001-08-13
Proposer BSI
 
 
Draft technical report:
 
Development and Application of ISO 639
in the identification, classification and
alphanumeric coding of the
world's languages
 
 

Contents

1  Introductory Note
2  Clarification of Terms and Categories
3  The Global Context
4  The Proposal
5  Towards a Global Public Resource

Bibliography

Appendix 1  ISO 639 Codes correlated with the Linguasphere Referential Framework (extract A-H)
Appendix 2  Linguasphere Referential Framework of 10 Sectors and 100 Zones
Appendix 3  Chart of the World's Arterial Languages (printed as 2 pages in landscape view)
Appendix 4  Global Language Index (sample extract)

 

1  Introductory Note

There is an established need for a standardised system of codes for the tagging and identification of the world's languages.  Variation still exists, however, in the form of language codes used by different organisations and in different countries.  The ISO 639 codes provide the base for standardisation in this field, although at present they cover only a small proportion of the world's languages.  These ISO language codes also exist in 3 different versions: the ISO 639-1 two-letter code, and the ISO 639-2/T and 639-2/B three-letter codes (designed for terminological and bibliographical use, respectively).

A fully classified inventory of the world's languages and speech communities was published in 1999/2000, including a coded index of over 71,000 names (Linguasphere Register of the World's Languages and Speech Communities, see Bibliography).

The following proposal outlines how the 3 versions of the ISO 639 codes may be unified as a single standard, and how the formal linking of this standard with the Linguasphere zones of reference would create an alphanumeric Global Identification Code (GIC) with increased informational content and inbuilt protection from error.

2  Clarification of Terms and Categories

2.1 Classification codes, identification codes and referential codes

A clear distinction needs to be maintained among 3 forms of language code:

2.1.1  modifiable classification codes or "relationship scale", recording proximities of interrelationship among languages but subject to modification as research progresses;

2.1.2  fixed identification codes or "language tags", enabling individual languages to be identified without ambiguity; and

2.1.3  a stable "referential framework", providing a meeting-point for the correlation of classification codes and identification codes (as in the proposed Global Identification Code).

2.2  Language names and umbrella names

A clear distinction needs to be maintained between:

2.2.1  language and dialect names as applied to individual spoken and/or written varieties of language; and

2.2.2  umbrella names, often artificially created, covering groups or families of related languages (the treatment of which has not been presented in the following pages, for reasons of time and space).

2.3  Languages and dialects

The continuum which frequently exists among adjacent forms of speech means that it has always been difficult to define the boundary between usage of the terms "language" and "dialect".  The situation is eased by recognising that many languages are better analysed and distinguished in terms of three (rather than two) layers of immediate relationship.  These layers are best explained by reference to specific examples:

2.3.1  outer language, as applied, for example, to the totality of the Welsh language in all its spoken and written forms;

2.3.2  inner language, as applied to the 3 major components of the modern Welsh (outer) language: literary Welsh (as written, and progressively standardised, in recent centuries); northern spoken Welsh (in north Wales); and southern spoken Welsh (in south Wales);

2.3.3  dialect, as applied to distinct varieties of written Welsh (e.g. Bible or "pulpit" Welsh) or to local varieties of northern or southern spoken Welsh (e.g. Anglesey Welsh in the north, or Pembrokeshire Welsh in the south).

2.4  Spoken languages and standard written languages

It is of great importance that a clear distinction be maintained between:

2.4.1  spoken languages and their dialects, which may also be written (in dialect literature, or in phonetic transcriptions, for example); and

2.4.2  standard(ised) written languages, which have acquired a status independent of the spoken word but which may themselves be spoken (in speech which is modelled on the written tradition of a language).  Part of the present proposal is that the 2-letter codes of ISO 639-1 should be formally recognised as designating the relevant standard written languages (e.g. en for Standard English), in contrast to the general coverage of the 3-letter codes of ISO 639-2 (e.g. eng for English in any or all its forms).


3  The Global Context

 

The objective of clearly identifying all the languages and speech communities of humankind, regardless of their demographic size, is today clear, attainable and of global importance. 

3.1  The Twenty-first century perspective

At the onset of the twenty-first century, humankind is aware of itself as a single planetary community, with means of instant global communication and of increasing global planning and coordination.  Languages are the key to that communication and coordination. 

For the first time, the languages of the world may be viewed as integral parts of humankind's greatest and most fundamental creation, the continuous global web or "linguasphere" of human speech and writing. 

Languages no longer need to be listed and catalogued as a vast array of independent objects, belonging to rival and often warring communities.  They can now be viewed and classified as integral parts of a collective human heritage.

The classification of languages has until now been the preserve of erudite specialists, often tracking down words of ancient languages in the pursuit of evidence about the human past. 

Today, however, the classification of modern languages has a direct relevance to the way humankind perceives and organises itself as a single global and multilingual community.  Individual languages can now be perceived, not as the individual creation and property of specific communities, but as mutable and interrelating subsystems within a vast global kaleidoscope of words, grammatical rules, speech sounds and elements of writing.

All languages have benefited, or may potentially benefit, from the modern communications revolution in two fundamental ways:

·         The recording and global transmission of the spoken word allow any spoken language to share the advantages previously reserved to written languages, enabling even small speech communities to maintain worldwide spoken contact.

·         The instant transmission and exchange of the written word allow any written language to share the advantages previously reserved to speech, encouraging even children to use writing (instant messages by phone and computer, and e-mail) as an integral part of their social life.

3.2  The Need for the Identification of Languages within a Referential Framework

Any system of linguistic classification needs to contain an element of fluidity, in order to deal not only with the fundamental nature of the linguasphere but also with a still expanding knowledge of its complexity.

At the same time, it is necessary that the identifiable written and spoken languages of human communities be clearly and unambiguously catalogued and identified, from the international use of English or French to the unique speech of an isolated village in central Africa.

It is important to be aware of this contrast between (a) the need for fluidity in establishing and updating a sliding scale of linguistic interrelationships, and (b) the need for stability in identifying the individual spoken and recorded languages of humankind.

The primary objective in this field is therefore to complete a standardised international system of identification codes for the unambiguous tagging of all known forms of spoken and written languages, alive or recorded from the past, and for the correlation of those fixed tags to a separate scale of linguistic interrelationships.

 

4  The Proposal

 

4.1  The Institutional Background

The first comprehensive coded and classified inventory of the languages and speech communities of humankind during the 20th century was completed in December 1999 and published in Wales in 2000. 

This Linguasphere Register of the World's Languages and Speech Communities provides a referential framework for the location and classification of over 22,000 identifiable varieties of speech and writing.  The Linguasphere Register is supported by a unique and expandable Index of over 71,000 linguistic and ethnolinguistic names, each classified and coded within the referential framework, using a comprehensive scale of linguistic interrelationships.

The agency responsible for compiling and maintaining the Register is the Linguasphere Observatory (www.linguasphere.org), a transnational research network devoted to the study and maintenance of multilingualism.  Conceived in Canada in 1983, the Observatory was established in France during the 1980s.  During the 1990s, it worked in close collaboration with the University of London's School of Oriental and African Studies, and it has been directed from bilingual Wales since 1995, with scientific support from Russia, India and the United States.  Further details are given in section 5.3 of this paper.

In July 2001, the BSI (British Standards Institution) requested the Linguasphere Observatory to make a firm proposal for the establishment of a standardised alphanumeric coding system covering all the world's languages, based on existing and future codes of ISO 639 and correlated with the referential framework and relationship scale of the Linguasphere Register.

4.2  The Technical Background

The ISO Alpha-2 and Alpha-3 Codes for the Representation of Names of Languages (ISO 639) are complementary in purpose and form to the Numeric-2 Code employed for the Linguasphere Referential Framework (LRF) of the world's languages. 

ISO 639 provides 2-letter or 3-letter tags (or "standardised abbreviations") for the identification of specific languages and groups of languages, whereas the LRF provides 2-digit tags for a referential inventory of the world's languages within 10 sectors (1st digit), each divided into 10 zones (2nd digit), giving 100 zones in all.

4.2.1  ISO 639 Codes

The ISO 639 codes for a range of the most commonly encountered names of languages (and groups of languages) are presented in the International Organisation for Standardisation's Code for the representation of names of languages (1998), and are available online at /http://lcweb.loc.gov/standards/iso639-2/langhome.html/.

ISO 639-2, originally devised for use in library systems, now exists in slightly divergent forms, known as ISO 639-2/T (terminology) and ISO 639-2/B (bibliographic).  Although the 3-letter codes of ISO 639-2 could provide codes for 26 x 26 x 26 = 17,576 languages, limits specified in the standard currently restrict the creation of new codes to languages with a substantial body of literature.  If rigorously applied, this restriction limits the more generalised use of the ISO 639 codes, particularly in ICT usage.
As a result, some ICT users, including ministries and official agencies, have either made use of the SIL (Summer Institute of Linguistics) codes or developed their own coding systems, notably the OpenType specifications (OT) used in font and rendering technologies.  Such variant codes have been developed in certain countries, including the UK, Sweden and Germany, and in some cases have caused clashes in bibliographic information interchange.
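The size of the purely alphabetic code space mentioned above can be checked with a few lines of Python.  This is only an arithmetic sketch, assuming codes drawn from the 26 unaccented lowercase Latin letters; the variable names are illustrative and form no part of ISO 639.

```python
# Arithmetic sketch: how many distinct codes a purely alphabetic scheme allows,
# assuming each position is one of the 26 lowercase Latin letters.
LETTERS = 26

alpha2_capacity = LETTERS ** 2  # two-letter codes, as in ISO 639-1
alpha3_capacity = LETTERS ** 3  # three-letter codes, as in ISO 639-2

print(alpha2_capacity)  # 676
print(alpha3_capacity)  # 17576
```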

4.2.2  Linguasphere Codes

The Linguasphere 2-digit code is defined for over 22,000 modern languages and dialects, and for their historical forms where relevant, in the Linguasphere Register of the World's Languages and Speech Communities (published in 2 volumes by Linguasphere Press, Hebron, Wales 2000).   The Linguasphere code (digital Reference Framework, plus alpha Relationship Scale) is discussed and exemplified on the Linguasphere Observatory website (http://www.linguasphere.org). 

The fully coded Linguasphere classification and annotation of the world's languages is already available with limited access online (http://www.linguasphere.net) and is to be made freely accessible as a public resource within the next year (see section 5 below). 

Appendix 1 to this paper displays the 2-digit tags of the Linguasphere Reference Framework as an additional column in the listing of ISO 639, as exemplified by the letters A-H in an alphabetical arrangement by English names of languages.  Appendix 2 lists and explains the Linguasphere numerical tags, with their linguistic and/or geographical applications.  Appendix 3 provides a table of the world's arterial languages (each reaching over 1% of the world's total population), with the relevant ISO 639 and Linguasphere codes.  Appendix 4 presents an extract from the proposed Global Language Index, which is available as a starting point for the systematic extension of ISO 639 codes.

4.3  Formulation of the Proposal

It is proposed by the British TS/1 Committee and the British Standards Institution that a standard Global Identification Code for all known languages and speech communities be established, by prefixing the 2 digits of the Linguasphere Referential Framework (LRF) to the 2 or 3 letters of ISO 639 codes and by extending those codes to cover all spoken and recorded languages.

The purpose of this proposal is NOT to create yet another method of coding, but to enable existing ISO (TC/37) standards to work more efficiently and accurately, and to be expanded systematically to cover all languages and speech communities.  The following pages outline how the 3 versions of the ISO 639 codes may be unified within this single standard, and how the formal linking of the ISO codes with the Linguasphere zones of reference would create an alphanumeric Global Identification Code (GIC) with compact informational content and inbuilt protection from error.

4.4  Practical considerations of the present proposal

Some of the problems hitherto associated with the codes of ISO 639, and with language identification in general, were discussed by Peter Constable and Gary Simons of SIL International in their paper Language Identification and IT: Addressing problems of linguistic diversity on a global scale, presented to the 17th International Unicode Conference (San Jose California, September 2000).

Their paper proposes the extension of the present Alpha-3 system to cover all known varieties of written and spoken languages in the world, following the example established in the SIL's Ethnologue (14th edition, 2000).   This proposal allows for the establishment of thousands of 3-letter codes to represent language names, not necessarily related in form to those names, but fails to address some of the fundamental problems of isolated 3-letter codes.

Constable and Simons (page 15) recognise the Linguasphere Register as "the only likely candidate" as an alternative to their proposed SIL system.


4.4.1  Advantages of the proposed Global Identification Code

The prefixing of the digits of the relevant LRF numeric code to a 2-letter or 3-letter form of an ISO 639 tag for a specific language would create a combined alphanumeric LRF/ISO tag or Global Identification Code.  This combined tag would assist in solving several existing problems of identification and referential classification.

In comparison with the existing ISO or SIL tags, the combined LRF/ISO tag would be:

·         more transparent, with the initial digit indicating one of five major affinities or one of five continental areas (e.g. 5 = Indo-European or 8 = native South American: see Appendix 2 to this paper);

·         more easily located, with the 2 digits indicating a linguistic group or area (e.g. 53 = Slavic or 87 = Amazon: see Appendix 2 to this paper);

·         more readily classifiable, together with the names of other related and/or adjacent languages, either within the same sector (first digit in common) or zone (both digits in common); and

·         better protected against typographical error in the citation of tags (each alpha component being tied to a specific numeric component).
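The transparency and classifiability described in this list can be illustrated with a minimal Python sketch.  The tags and sector labels are those quoted in this paper; the function names (split_tag, group_by_sector) are hypothetical and are not part of the proposal.

```python
from collections import defaultdict

# Sector labels quoted in this paper (first digit of the LRF prefix).
SECTOR_LABELS = {
    "5": "Indo-European",
    "7": "Sino-Tibetan",
    "8": "native South American",
}

def split_tag(tag):
    """Split a combined tag such as '55sq' into (sector, zone, alpha)."""
    zone, alpha = tag[:2], tag[2:]
    return zone[0], zone, alpha

def group_by_sector(tags):
    """Group tags by their first digit, preserving the input order."""
    groups = defaultdict(list)
    for tag in tags:
        sector, _zone, _alpha = split_tag(tag)
        groups[sector].append(tag)
    return dict(groups)

tags = ["79zh", "55sq", "52no", "52nn", "52nor"]
print(group_by_sector(tags))
# {'7': ['79zh'], '5': ['55sq', '52no', '52nn', '52nor']}
```

Grouping by zone rather than sector would use the second element returned by split_tag in the same way.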

At the same time, the continued existence of a single series of unambiguous ISO 2-letter and 3-letter codes to identify the languages of the world would mean that the combined LRF/ISO tags could be abbreviated for practical purposes by the optional omission of the LRF numerical prefix.

·         In such abbreviated usage, the invisible LRF code would still underlie any IT usage of the 2-letter or 3-letter components. 

·         The LRF numerical prefix would thus be available not only to classify language codes as required but, very importantly, to serve as a check against typographical error.  A mistyped 2-letter or 3-letter code would have only a 1% chance of matching the correct numerical prefix.
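The error check and the optional omission of the prefix might be sketched as follows.  The lookup table is a hypothetical stand-in for a full register, filled only with code pairs quoted in this paper; check_tag and expand are illustrative names, not part of any existing standard or library.

```python
# Hypothetical register extract: alpha code -> LRF zone, using pairs quoted in this paper.
ZONE_OF = {
    "zh": "79", "sq": "55", "no": "52", "nn": "52", "nor": "52",
    "ab": "42", "abk": "42", "ace": "31", "ach": "04",
}

def check_tag(tag):
    """Return True only if the numeric prefix agrees with the zone registered for the alpha part."""
    zone, alpha = tag[:2], tag[2:]
    return ZONE_OF.get(alpha) == zone

def expand(alpha):
    """Restore the omitted LRF prefix of an abbreviated tag; return None if the code is unknown."""
    zone = ZONE_OF.get(alpha)
    return zone + alpha if zone is not None else None

print(check_tag("55sq"))  # True
print(check_tag("55zh"))  # False: 'zh' is registered under zone 79
print(expand("nor"))      # 52nor
```

If codes were spread evenly over the 100 zones (an idealising assumption), a mistyped code that happened to be valid would share the intended prefix about once in a hundred cases, which corresponds to the 1% figure given above.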

Use of combined LRF/ISO tags would also open the way to a more structured approach to the classification of "language names", which may be used to indicate a wide variety of different categories of language name with an identical form of 3 letters.

·         The existing ISO Alpha-3 codes may indicate either the name of a specific standardised language or of a wider "language" composed of two or more closely related spoken and/or written languages (e.g. the Ashkharik and Arewmta varieties of Armenian, or the Gheg and Tosk varieties of Albanian), or an historical and/or liturgical language (e.g. Church Slavonic), or a grouping of languages of undetermined dimension or nature (e.g. Athapascan languages, or "other" Austronesian, or "other" Creoles and pidgins).  See examples in Appendix 1.

In contrast, the use of combined LRF/ISO tags could be associated with the reconsolidation and extension of the 3 existing lists of ISO tags, to create a single, more coherent and explicit system.

·         In an increasingly internationalised world, it is appropriate that alpha codes for specific languages should be based wherever possible on the autoglossonym (or indigenous form of the language name) rather than on the English name (where this is different). 

·         In this respect, where ISO 639-2/B diverges from 639-2/T, the 2/T code is generally to be preferred (e.g. /eus/ in preference to /baq/ for Basque, for which the autoglossonym is Euskara).  It may be noted that the Linguasphere Register also gives precedence to autoglossonyms.
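The preference for the 2/T form could be applied mechanically, as in the following minimal Python sketch.  The table holds only the three divergent pairs cited in this paper (Basque, Chinese, Albanian) and is not a complete list; the function name prefer_terminology is hypothetical.

```python
# Divergent ISO 639-2/B codes mapped to their 639-2/T equivalents.
# Only the pairs cited in this paper are included; a real table would be longer.
B_TO_T = {
    "baq": "eus",  # Basque
    "chi": "zho",  # Chinese
    "alb": "sqi",  # Albanian
}

def prefer_terminology(code):
    """Return the 639-2/T form of a 3-letter code, leaving other codes unchanged."""
    code = code.lower()
    return B_TO_T.get(code, code)

print(prefer_terminology("baq"))  # eus
print(prefer_terminology("eus"))  # eus
```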

Most importantly, it would be necessary to distinguish between the application of LRF/ISO tags to specific objects (i.e. standard written languages) as opposed to "fuzzy" phenomena (i.e. non-standardised languages, or continua of closely related spoken languages or dialects) or to referential boxes created or identified for use in the classification of languages (i.e. language groups, families or categories or areas of languages, including the Linguasphere sectors and zones).

·         The most frequent use of tags to represent language names in IT is for the identification of specific standard written languages.  

→     For practical purposes, a standard written language may be defined as a language whose form is largely fixed by means of a system of graphic conventions, established and exemplified by the publication of a large corpus of texts (normally in thousands or more).

→     A standard written language may also have a spoken form, modelled largely on the use of the written form, in the same way that a spoken language may also have a written form, transcribing actual speech.

·         Most such languages are already provided for under the ISO 639-1 Alpha-2 code, and it would be helpful if the use of combined LRF/ISO tags of the form Numeric-2 plus Alpha-2 could be specifically confined to the identification of standard written languages (including their standard spoken forms, wherever these are modelled on the written language).

·         The potential number of Alpha-2 codes (26 x 26 = 676) is adequate to retain the existing ISO 639-1 codes for standard written languages, regardless of the prefixed digits.

·         If the linguasphere were one day to include more than 676 such languages, then it would be possible to duplicate some alpha codes under different digits. 

·         A more stable solution, however, would be to limit the use of Alpha-2 codes to a closed list of all those written languages standardised before the end of the 20th century.

·         This basic LRF/ISO tag "for the representation of the names of standard languages" would be only one character longer than the existing Alpha-3 tags, but would be considerably more systematic and rich in information, and more secure against typographical error.  Cf. 79zh for Standard Chinese (rather than /zho/ or /chi/ for all forms of Chinese) or 55sq for Standard Albanian (rather than /sqi/ or /alb/ for all forms of Albanian). 

·         The initial 7 or 5 locates Chinese and Albanian within Sino-Tibetan or Indo-European, respectively, and it would be useful to produce a list of LRF/ISO tags classified numerically by the LRF digits (alongside the existing ISO lists arranged by alpha code or by names of languages in English or in French).

·         The alphanumeric form of LRF/ISO tags would be readily identifiable as language codes within other text, as opposed to the potential confusion of some existing Alpha-3 tags with real words (e.g. /bug/ for Buginese or /got/ for Gothic).
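The point about recognising tags within running text can be illustrated with a short regular expression in Python.  This is a sketch under stated assumptions: a tag is taken to be exactly 2 digits followed by 2 or 3 lowercase letters, and the names TAG_PATTERN and find_tags are illustrative only.  The pattern cannot by itself tell a language tag from an unrelated string of the same shape (a quantity with a unit, for instance), so a real system would also check candidates against a register.

```python
import re

# Two digits followed by two or three lowercase letters,
# not embedded in a longer run of letters or digits.
TAG_PATTERN = re.compile(r"(?<![0-9A-Za-z])[0-9]{2}[a-z]{2,3}(?![0-9A-Za-z])")

def find_tags(text):
    """Return every substring of text that has the shape of a combined LRF/ISO tag."""
    return TAG_PATTERN.findall(text)

sample = "A bug report got 79zh, 55sq and 52nor wrong."
print(find_tags(sample))  # ['79zh', '55sq', '52nor']
```

The ordinary words "bug" and "got" in the sample are not picked up, whereas the bare Alpha-3 codes /bug/ and /got/ would be indistinguishable from them.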

In contrast, LRF/ISO tags applied to "other types" of language name (non-standardised languages, fuzzy continua or referential boxes) could be based on the Alpha-3 codes of ISO 639-2/T. 

·         This distinction between Alpha-2 and Alpha-3 codes would be useful in distinguishing standardised languages within fuzzy continua of spoken languages and dialects, e.g. between the varieties of standardised Norwegian (Bokmål = 52no or Norwegian Nynorsk = 52nn) and "wider" Norwegian in all its forms (= 52nor, which has fuzzy boundaries within the continuum of other spoken forms of Scandinavian languages).
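The proposed distinction between Numeric-2 plus Alpha-2 and Numeric-2 plus Alpha-3 tags lends itself to a simple mechanical test, sketched here in Python.  The function name describe_tag and the wording of the two categories are assumptions made for illustration; only the rule itself (2 letters for standard written languages, 3 letters for other types of language name) comes from this proposal.

```python
import re

TAG_SHAPE = re.compile(r"([0-9]{2})([a-z]{2,3})")

def describe_tag(tag):
    """Split a combined tag and classify it by the length of its alpha component."""
    match = TAG_SHAPE.fullmatch(tag)
    if match is None:
        raise ValueError("not a combined LRF/ISO tag: " + repr(tag))
    zone, alpha = match.groups()
    if len(alpha) == 2:
        kind = "standard written language"
    else:
        kind = "other type of language name"
    return {"sector": zone[0], "zone": zone, "alpha": alpha, "kind": kind}

print(describe_tag("52nn")["kind"])   # standard written language
print(describe_tag("52nor")["kind"])  # other type of language name
```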

There are a number of other more detailed points to be considered in the design of any improved codes for language names (including the treatment of historical languages, for example), which will be dealt with within the fully developed presentation of the proposed LRF/ISO system.   The Linguasphere Observatory looks forward to productive discussions on all aspects of the development of ISO 639, with members of BSI and ISO, and beyond.

 

5  Towards a Global Public Resource

 

5.1  Progress towards a tripartite global reference guide

The present proposal is designed to provide the key element in the production of a fully coded and interactive global reference guide to

·         the languages and speech communities of the world,

·         their established linguistic relationships,

·         their global corpus of linguistic and ethnic names, and

·         their geographic positions and demography.

This global reference guide would take the form of a freely available, independent and multilingual website, comprising 3 interdependent "panoramas".  These would be interdependent, with a common alphanumeric coding system throughout (ISO-639 plus Linguasphere), and would be fully inter-referenced and interaccessible at every point:

5.1.1  the Global Index of the World's Languages and Speech Communities (or ISO 639/Linguasphere Index), presenting an alphabetical key to the identification and location of all known written and recorded languages and dialects, and all varieties of linguistic, ethnic and communal names.  This panorama, covering a total of over 71,000 names, is already available in a first printed edition (but without ISO codes), as the Index to the Linguasphere Register.   This progressively updated and expanded edition will be opened to free public access and dialogue on the internet within the next year.   An extract from this Index, covering names beginning G-, has been prepared and is now being extended as part of the current proposal.   This will include existing and proposed additional ISO 639 codes.

5.1.2  the Global Register of the World's Languages and Speech Communities, presenting a comprehensive scale of linguistic relationships among the spoken and recorded languages and dialects of the world, and their relevant speech communities.  This panorama is already available in a first printed edition as the Linguasphere Register of the World's Languages and Speech Communities, covering over 22,000 varieties of languages and dialects.  This edition will be opened to free public access and dialogue on the internet within the next year, and will be progressively updated and expanded online.  Extensive extracts are already freely available at /www.linguasphere.org/.


5.1.3  the Global Mapbase of the World's Languages and Speech Communities, presenting a cartographic survey of the location, distribution and interrelationships of the world's languages and speech communities.  This panorama has already been developed by the Linguasphere Observatory for Africa (linguistically the most complex continent in the world), in collaboration with the London School of Oriental and African Studies (SOAS).  It has been printed as the first sheet of the Linguasphere Mapbase of the World's Languages and Speech Communities and is currently being extended into southern Europe and western Asia, in collaboration with the Languages of the World unit of the Russian Academy of Sciences (Akademia Nauk).  The first African layer of this map is viewable at /http://www.soas.ac.uk/Geography/LanguageMapping/home.html/.   This same page on the SOAS website illustrates how subsequent layers of the Linguasphere Mapbase will be accessible by zooming, down to the layer of urban speech communities (as already surveyed and published for over 300 minority languages of London, see Bibliography below).

5.2  Applications of the tripartite reference guide

This three-part electronic reference guide will serve as

·         a transnational reference system and educational resource for teaching covering

→     the global complexity of humankind, as represented by the overlying diversity of its languages and the divergent welfare and cultures of its individual speech communities,

→     the underlying unity of humankind, as represented by a worldwide continuum of multilingual communication and intercommunal identities (the "linguasphere"), and

→     the establishment of comprehensive links with - and annotated signposts towards - a vast range of other electronic sources on the languages, peoples and cultures of the world;

·         a stimulus to innovative teaching and research, including

→     the active investigation and surveying of linguistic and ethnic realities and relationships, including the continuous updating and expansion of the global reference guide itself;

→     the transnational observation and documentation, regardless of frontiers, of

- the actual and relative welfare of all speech communities in the world,

- the movement and migration of speech communities and their members,

- the formation and distribution of minority urban speech communities,

- the incidence of all forms of genocide and other forms of discrimination
       among ethnolinguistic communities;

→     the awakening of public interest in questions of the transnational and multilingual heritage and origins of communities and individuals (the "languages of our ancestors").  Linked to the growing strength of public interest in genealogical research, this development may be of particular importance in encouraging the development of bilingualism among first language English-speaking communities (in danger of becoming the only communities deprived of the advantages of bilingualism, in an otherwise multilingual world).


5.3  The Linguasphere Observatory

The present proposals and products are the outcome of many years' research and development at the Linguasphere Observatory in Wales and at its previous location in France as the Observatoire Linguistique.   Created in 1983, after planning and discussion in Quebec (at CIRB, the Centre International pour la Recherche en Bilinguisme at the Université Laval), the Observatory was set up in Normandy as a transnational research network devoted to the study and development of multilingualism (under the honorary presidency of Léopold Sédar Senghor of Senegal, and registered under the French law of association of 1901).

Among other linguistic activities, the Observatory was responsible for two bilingual exhibitions on languages at the Centre Georges Pompidou in Paris during the 1980s, with substantial support from the Government of Canada.  (These exhibitions subsequently toured internationally, including London, Liège and Lagos, and around the world to Canberra.)  Since 1995, the Observatory has been based in a bilingual area of west Wales, under the directorship of David Dalby, where the Linguasphere Register of the World's Languages and Speech Communities was first published at the turn of the millennium (1999/2000).   Scientific support has been received from Russia, France, India and the United States.

It is appropriate that the present proposals and products should emanate from Wales, a country whose language has successfully resisted and survived the successive invasion of its territory by two of the most powerful languages in the history of the world, Latin and English.  All speech communities now need to consider their relationship to English, as a global lingua franca, and in this respect the indigenous speech community of Wales has the longest experience in the world, having faced the growing strength of its English neighbour for more than one millennium.  The cultural strength and linguistic survival of the Welsh-speaking community offer an important message of encouragement to small speech communities everywhere.  English has a transnational role to play in the world, along with other "arterial" languages, but should be developed in the service of a multilingual global society, NOT as the medium of a monolingual culture.

That the British Standards Institution in London (BSI) and the University of London's School of Oriental and African Studies (SOAS) should have given their support to the proposals and products of the Linguasphere Observatory in Wales is also significant.  At a time when countries around the world are devoting resources to the study of a language associated with England, it is appropriate that major public institutions in that country should devote resources to the study and development of multilingualism and of the languages of the world.

 

Linguasphere Observatory  and  British Standards Institution                                   August 2001

 

 

Comments on this paper, prepared at relatively short notice for the ISO TC/37 meeting in Toronto, will be greatly welcomed, by post or by e-mail to /research@linguasphere.net/. 
A more detailed proposal will be prepared by the Linguasphere Observatory for the beginning of 2002, including the orderly extension of identification codes to all spoken and written languages, and the examination of procedures for combining language codes with codes for countries and for scripts.


Bibliography

 

Baker, Philip & Eversley, John, Multilingual Capital: the languages of London's schoolchildren, Battlebridge Press: London, 2000

Constable, Peter and Simons, Gary (SIL), Language Identification and IT: Addressing problems of linguistic diversity on a global scale, presented to the 17th International Unicode Conference, San Jose (California), September 2000.

Grimes, Barbara F. (editor), Ethnologue: Languages of the World (14th ed.), SIL: Dallas, 2000

ISO Code for the representation of names of languages, ISO, 1998

Linguasphere Register of the World's Languages and Speech Communities (2 volumes), Linguasphere Press: Hebron (Wales), 2000

 

 


Appendix 1:  ISO 639 Codes for the Representation of Language Names
 correlated with the Linguasphere Referential Framework of 100 Zones

ISO 639-1 is an Alpha-2 code
 ISO 639-2/T & /B are Alpha-3 (/T= terminology code; /B = bibliographic code)
 The Linguasphere Referential Framework (LRF) is a Numeric-2 code (see Appendix 2)

(extract)  A-H

as arranged alphabetically by English name of language

The proposed Identification Code will comprise the LRF + ISO 639-1 or 639-2/T elements

Language Name (English)   Language Name (French)   LRF +   639-1   639-2/T   639-2/B

Abkhazian                 abkhaze                  42      ab      abk       abk
Achinese                  aceh                     31              ace       ace
Acoli                     acoli                    04              ach       ach
Adangme                   adangme                  96              ada