SGML: Journals of the Basque Administration


From owner-tei-l@listserv.uic.edu Thu Mar 13 12:03:06 1997
From: Joseba Abaitua <abaitua@fil.deusto.es>
Organization: Universidad de Deusto
Subject:      Re: DTD for legal documents
To: TEI-L@listserv.uic.edu

  -----------------------------------------------------------------

It might interest you to know about the LEGEBIDUNA project.

We've created a corpus of administrative (not legal) documentation:
the Official Bilingual Journals of the Basque Administration (almost 10
million words in each language, Basque and Spanish). We're now tagging
the texts, i.e. recognizing administrative and legal formulae and
terminology, and their distribution in the texts' structure. Our DTDs
are deduced from the tagged corpora, i.e. first we tag the text, then
we construct the DTDs.
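
To give a rough idea of what "tag first, then construct the DTD" can
mean, here is a minimal Python sketch. It is only an illustration, not
the LEGEBIDUNA procedure, and not the algorithm of the paper or the
parser mentioned below. It assumes the tagged samples are well-formed
enough to be read by Python's xml.etree.ElementTree, it ignores
attributes and mixed content, and it writes XML-style declarations
without SGML tag-minimization indicators. The element names (decree,
title, article, signature) are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def collapse_repeats(names):
    """Turn a list of child names into (name, repeated) pairs,
    merging consecutive runs of the same name into one pair."""
    out = []
    for name in names:
        if out and out[-1][0] == name:
            out[-1] = (name, True)
        else:
            out.append((name, False))
    return tuple(out)


def infer_content_models(documents):
    """Map each element name to the set of child sequences seen for it."""
    observed = defaultdict(set)
    for doc in documents:
        root = ET.fromstring(doc)
        for elem in root.iter():
            children = [child.tag for child in elem]
            observed[elem.tag].add(collapse_repeats(children))
    return observed


def to_dtd(observed):
    """Render the observations as naive <!ELEMENT ...> declarations:
    one alternative per distinct child sequence, '+' for repeated runs,
    '?' when the element was also seen with no children at all."""
    lines = []
    for tag in sorted(observed):
        alternatives = []
        for seq in sorted(observed[tag]):
            if not seq:
                continue  # childless occurrence, handled below
            parts = [name + ("+" if rep else "") for name, rep in seq]
            alternatives.append("(" + ", ".join(parts) + ")")
        if not alternatives:
            model = "(#PCDATA)"
        elif len(alternatives) == 1:
            model = alternatives[0]
        else:
            model = "(" + " | ".join(alternatives) + ")"
        if alternatives and () in observed[tag]:
            model += "?"
        lines.append(f"<!ELEMENT {tag} {model}>")
    return "\n".join(lines)


# Two hypothetical tagged samples standing in for a corpus.
SAMPLES = [
    "<decree><title>t</title><article>a</article>"
    "<article>b</article></decree>",
    "<decree><title>t</title><article>a</article>"
    "<signature>s</signature></decree>",
]

if __name__ == "__main__":
    print(to_dtd(infer_content_models(SAMPLES)))
```

A sketch like this simply lists every child sequence it has seen, so
on a real corpus the content models would quickly become long and
overly specific. Generalizing them into something compact is the hard
part.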

Similar experiments have been reported in "Automatic generation of
SGML content models", Electronic Publishing, vol.8:195-206, by Helena
Ahonen <helena.ahonen@helsinki.fi>.

You can also have a look at Keith Shafer's Fred parser for automatic
DTD creation at http://www.oclc.org/fred/docs/papers/

For our project, we have a page in Spanish at:
http://www.deusto.es/~abaitua/konzeptu/lege2dun.htm
___________________________________________________________

Joseba Abaitua  abaitua@fil.deusto.es  http://www.deusto.es/~abaitua
Facultad de Filosofia y Letras,  Universidad de Deusto,  Apartado 1
E-48007 Bilbao || Tel: +34-4-4139092  (Ext. 2292) || Fax: +34-4-4458916