LINGUIST Language Database


Date:      Mon, 5 Nov 2001 18:17:18 -0000
From:      Anthony Aristar <aristar@LINGUISTLIST.ORG>
Subject:   LINGUIST Language Database

We have now finished putting together the initial version of the language database facility we talked about in Santa Barbara. This system includes all of Ethnologue (which SIL has generously allowed us to use) as well as a supplementary database of ancient and constructed languages, which we ourselves have put together, and which includes brief descriptions and unique codes. The intent is to allow us to precisely categorize by language any data we encounter.

The language search facility based on this database allows four kinds of searches:

  1. Search by language name. (This searches a database of around 48,000 alternate names, and does a fuzzy match on your input.)

  2. Search by Ethnologue or LINGUIST code.

  3. Search by family or subgroup name. (This will return a list of languages if the node dominates language names, and a clickable tree if it doesn't.)

  4. Generate a tree of any of the language families in the database.

These last two will only work properly if you have Java enabled on your machine.
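
For readers who want a concrete picture of searches 1 and 3, here is a minimal Python sketch. It is not the LINGUIST implementation, which is not described in this message. The names `ALTERNATE_NAMES`, `Node`, `search_by_name`, `search_by_subgroup` and `render_tree` are hypothetical, the codes are placeholders rather than real Ethnologue or LINGUIST codes, and the standard-library `difflib` matcher merely stands in for whatever fuzzy matching the real system uses.

```python
import difflib
from dataclasses import dataclass, field

# Hypothetical sample of the alternate-name table: name -> placeholder code.
# These codes are invented for illustration only.
ALTERNATE_NAMES = {
    "french": "X01",
    "francais": "X01",
    "german": "X02",
    "deutsch": "X02",
}


@dataclass
class Node:
    """A family or subgroup node in the classification tree."""
    name: str
    languages: list = field(default_factory=list)  # language names directly under this node
    children: list = field(default_factory=list)   # child subgroup nodes


def search_by_name(query, names=ALTERNATE_NAMES, cutoff=0.6):
    """Search 1: fuzzy-match the input against all alternate names."""
    matches = difflib.get_close_matches(query.lower(), list(names), n=5, cutoff=cutoff)
    return [(name, names[name]) for name in matches]


def render_tree(node, depth=0):
    """Return the subtree under `node` as indented text lines."""
    lines = ["  " * depth + node.name]
    for language in node.languages:
        lines.append("  " * (depth + 1) + language)
    for child in node.children:
        lines.extend(render_tree(child, depth + 1))
    return lines


def search_by_subgroup(node):
    """Search 3: a language list if the node dominates language names,
    otherwise a tree of the subgroups below it."""
    if node.languages:
        return ("list", sorted(node.languages))
    return ("tree", render_tree(node))
```

With a two-level sample such as `Node("Indo-European", children=[Node("Romance", languages=["French", "Spanish"])])`, searching on the Romance node yields a plain language list, while searching on the Indo-European node yields an indented tree. A misspelled query such as "frensh" still finds "french" through the fuzzy match.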

We've also set up pages that will give quick listings of all the ancient and constructed languages in the database.

Some caveats apply:

  1. The languages and subgrouping found in this database are almost all Ethnologue. But there are some differences. Where we saw clear mistakes in Ethnologue subgroupings, we fixed them. However, there was no systematic attempt to go over the whole Ethnologue system and reassess or correct it. So the changes we made were very minor. Our intention in the future is to have experts in each area go over the subgroupings, and to incorporate these changes into the system. The system will in theory allow multiple subgroupings to be defined, in areas where opinions differ. We'd love to hear from people who'd like to help with this task!

  2. The database is in its initial stages, and hasn't been optimized yet. It can be quite slow.

We would be delighted to hear your opinions. The URL is:

     http://saussure.linguistlist.org/cfdocs/pub/

We'd love to hear your comments.

All the best

Anthony


Anthony Aristar                         Associate Professor
Moderator, LINGUIST
Linguistics Program
College of Liberal Arts                 aristar@linguistlist.org
Dept. of English                        aristar@wayne.edu
Wayne State University
51 W. Warren
Detroit, MI 48202
U.S.A.

URL: http://linguistlist.org/aristar/

Prepared by Robin Cover for The XML Cover Pages archive. See "Language Identifiers in the Markup Context."



Document URL: http://xml.coverpages.org/LINGUISTLanguageDatabase-Announce.html