[Archive copy mirrored from: http://www.ornl.gov/sgml/wg8/9573ent/ENTITIES.HTM, descriptive text only]
Sample collections of entities and glyphs (proposed) for potential inclusion into ISO 9573. For: Ugaritic, Old Persian, Glagolitic, Croatian, Buginese, Cherokee, Gothic Uncials. Developed by Anders Berglund (and others).
SGML, ISO/IEC 8879:1986, contains a mechanism for referring to characters, syllables and symbols that are not found on normal keyboards or that are difficult to store and transmit unambiguously. This is achieved by defining so-called (SDATA) entities: a name is given to a character, syllable or symbol, on the assumption that a system processing the SGML data will be able to understand the reference, either by its name or by the so-called replacement text. To refer to an entity in an SGML file, the name is prefixed by "&" and followed by ";". For example, &alpha; refers to the Greek alpha. ISO has published a number of collections of entities, the Public Entity Sets, and work is in progress to add a large number of entity sets for non-Latin languages.
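The reference syntax described above ("&", then the name, then ";") can be illustrated with a small Python sketch. It is not part of the proposals and is not how an SGML parser works internally. The table `ENTITY_TABLE`, the function `resolve_entities` and the choice to leave unknown references untouched are all assumptions made for this illustration.

```python
import re

# Hypothetical lookup table: entity name -> replacement character.
# Real entity sets are defined by SGML entity declarations; this dict
# only stands in for them in this illustration.
ENTITY_TABLE = {
    "alpha": "\u03b1",  # Greek small letter alpha
    "beta": "\u03b2",   # Greek small letter beta
}

# An entity reference is "&", a name, then ";". The name pattern here
# (a letter, then letters, digits, "." or "-") is a simplification.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.\-]*);")


def resolve_entities(text, table=ENTITY_TABLE):
    """Replace known &name; references; leave unknown ones as they are."""
    def substitute(match):
        name = match.group(1)
        # Fall back to the original reference text if the name is unknown.
        return table.get(name, match.group(0))
    return ENTITY_REF.sub(substitute, text)


print(resolve_entities("The letter &alpha; precedes &beta;."))
```

Leaving unknown references in place, rather than raising an error, is a design choice for the sketch only; a conforming SGML system would report an undeclared entity.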
For the purposes of reviewing and commenting on the sets, the name and comment are the only relevant parts. The published entity sets also refer to characters, if present, in ISO 10646, as well as to entries in the International Glyph Registry, for which AFII is the registrar.
A large number of the entities represent characters. Where presentation forms exist and it is desirable to be able to refer easily to a particular form, entities have been created for those forms as well. For example, up to five entities have been defined for each Arabic letter: one for the character itself, and four for cases where one of the four presentation forms must be specified.
For entity sets representing scripts of scholarly interest, additional entities are included to enable recording of variations that are important for research purposes. In such cases there is normally a "nominal" entity representing a character or syllable, which can be used to record texts where variations are not important. In addition, there are entities for each significant variation of a character or syllable, for use in studies where the variations need to be recorded. Thus, for example, if a character has two distinct presentation forms, there would normally be three entities for it.
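The counting rule above (one nominal entity plus one entity per variation) can be sketched in Python. The class `EntityGroup`, the function `build_group`, the sample base names and the numbered-suffix naming scheme are all hypothetical; the actual entity names are those given in each proposal.

```python
from dataclasses import dataclass, field


@dataclass
class EntityGroup:
    """A nominal entity plus one entity per recorded variation."""
    nominal: str
    variants: list = field(default_factory=list)

    def all_entities(self):
        # The nominal entity comes first, followed by the variant entities.
        return [self.nominal] + list(self.variants)


def build_group(base_name, form_count):
    # Hypothetical naming scheme: base name followed by a form number.
    variants = [f"{base_name}{i}" for i in range(1, form_count + 1)]
    return EntityGroup(nominal=base_name, variants=variants)


# Two distinct presentation forms give three entities in total.
print(build_group("xa", 2).all_entities())
# Four presentation forms, as for an Arabic letter, give five.
print(len(build_group("alef", 4).all_entities()))
```

A text where variations do not matter would use only the nominal name, while a study of variations would use the variant names.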
Be warned that the Web page for a proposal contains a number of GIF images showing a typical glyph for each entity, so display may be slow.
The proposed entity sets will shortly also be available as zip files, each containing a scanned TIFF image of the proposal.
Please send comments on the proposals to Anders Berglund (bcatf@ibm.net).
Glagolitic, Croatian Proposal, HTML
Glagolitic, Croatian Proposal, zipfile
Gothic Uncials Proposal, zipfile
Copyright BC&TF, 1997.