Derek Walker of the ISSCO Research Centre (University of Geneva) has developed XMLTrans, a free XML transducer released under the GPL. The tool is "similar to IBM alphaWorks' PatML, but the syntax is meant to be less heavy. The pattern matching focuses on horizontal constraints (regular expressions over siblings); it is optimized for transducing lots of relatively small but complex chunks of the same type in a big file without putting the whole file in memory. XMLTrans provides rooted/recursive transductions, similar to transducers used for natural language translation. It is written in standard Java and is available to the general public."
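To make "regular expressions over siblings" concrete, here is a minimal Java sketch of the general idea. It is not XMLTrans syntax or code: the class name, the element names (hw, pos, tr) and the ";"-separated encoding are all assumptions made for illustration. The child element names of a node are joined into one string, and an ordinary regular expression is matched against the whole sequence.

```java
import java.util.List;
import java.util.regex.Pattern;

public class SiblingPattern {
    // Hypothetical constraint on the children of an entry:
    // one <hw>, an optional <pos>, then one or more <tr>.
    static final Pattern ENTRY_CHILDREN = Pattern.compile("hw;(pos;)?(tr;)+");

    /** Encodes the child names as "name;name;..." and tests the whole sequence. */
    static boolean matches(List<String> childNames) {
        StringBuilder sb = new StringBuilder();
        for (String name : childNames) {
            sb.append(name).append(';');
        }
        return ENTRY_CHILDREN.matcher(sb).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(List.of("hw", "pos", "tr", "tr")));
        System.out.println(matches(List.of("pos", "hw", "tr")));
    }
}
```

The first call matches because the siblings appear in the required order; the second does not, because pos precedes hw.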
"The XmlTrans transducer takes as input a well-formed XML file and a set of transformation rules and gives as output the application of the rules on the input XML file. It was designed for the processing of large XML files, keeping only the minimum necessary part of the document in memory at all times. The program is written in Java and uses an XML DOM parser. The XmlTrans parser keeps a list of XML elements in the document which can be transformed by the rule set. All other elements are ignored by the parser and will be suppressed in the output, though each child is searched for transformable elements. At the top of the rule file at least one "trigger" is required to indicate which elements can be processed. This trigger associates an element with a rule set. Consequently, each XmlTrans rule file must contain at least one rule set. This is a collection of rules which are grouped together for convenience... To ensure that all the original elements are represented in the rule set, it is often useful to work from the DTD, writing at least one rule per element in the original DTD..."
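The trigger mechanism described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the XMLTrans implementation: it uses the standard StAX streaming API rather than whatever parser XMLTrans uses internally, and the trigger table, the rule-set name entryRules and the "transformation" (concatenating the element's text) are all hypothetical. Elements without a trigger produce no output, but their children are still searched.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class TriggerSketch {
    // Hypothetical trigger table: element name -> name of the rule set to apply.
    static final Map<String, String> TRIGGERS = Map.of("entry", "entryRules");

    /** Streams the document and returns one output line per triggered element. */
    static List<String> transduce(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT) {
                    String ruleSet = TRIGGERS.get(r.getLocalName());
                    if (ruleSet != null) {
                        // Only this chunk is buffered while it is processed.
                        out.add(ruleSet + ": " + collectText(r));
                    }
                    // Elements without a trigger produce no output, but the
                    // loop keeps descending into their children.
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("input is not well-formed XML", e);
        }
        return out;
    }

    /** Reads up to the matching end tag and returns the concatenated text. */
    static String collectText(XMLStreamReader r) throws XMLStreamException {
        StringBuilder sb = new StringBuilder();
        int depth = 1;
        while (depth > 0) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            } else if (event == XMLStreamConstants.CHARACTERS) {
                sb.append(r.getText());
            }
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String xml = "<dict><meta>v1</meta><group>"
                + "<entry><hw>chat</hw> <tr>cat</tr></entry></group></dict>";
        System.out.println(transduce(xml));
    }
}
```

In this sketch the meta element is dropped from the output, while the entry element nested inside group is still found and handed to its rule set.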
"XMLTrans was developed as part of the DicoPro project, a project funded within the Multilingual Information Society programme (MLIS), an EU initiative launched by the European Commission's DG XIII and the Swiss Federal Office of Education and Science."
A paper "XMLTrans: a Java-based XML Transformation Language for Structured Data" presents the XMLTrans transduction language. See the abstract in the articles reference collection. A demonstration ("XMLTrans: a Java-based XML Transformation Language for Structured Data Demo") was also given recently at Coling 2000.
Principal references:
- XMLTrans web site
- XmlTrans User's Guide
- "XMLTrans: a Java-based XML Transformation Language for Structured Data" [See abstract]
- Download
- DicoPro On-Line Dictionary Consultation for Language Professionals on the Intranet
- Contact: Derek Walker