Cover Pages: Cover Pages News Clippings 2002-12


[December 26, 2002] RELAX NG J2ME Processor. A posting from MURATA Makoto provides a provisional reference for an XML 2002 presentation entitled "RELAX NG Validator on Mobile Phones": "A web page created from my slides is temporarily available... My validator does not support interleave, list, value, and data yet. Its Jar file is 27KB, including kXML2 and a table created from a small RELAX NG schema. I will improve my validator and disclose its source..." See the abstract for "Implementing RELAX NG Validators as State Machines" on the IDEAlliance website: "We propose a two-phase validation for RELAX NG. The first phase creates a state machine from a RELAX NG schema. This phase can be performed without having any instance documents. The second phase validates instance documents by using this state machine. This phase does not require schemas in their original forms. This two-phase validation has three advantages. First, it becomes easier to implement RELAX NG validators on many platforms, since only the second phase has to be ported. Second, validators become lightweight, since the first phase is not necessary at run-time. Third, validation is expected to become faster, because a state machine works in a way much simpler than the schema validation. Although this approach can be easily applied to DTD-based validation, expressiveness of RELAX NG imposes significant challenges as below. (1) Non-deterministic tree (or hedge) automata: Since RELAX NG can capture any tree (or hedge) regular language, state machines are required to mimic non-deterministic tree (or hedge) automata. (2) Attribute-element constraints: RELAX NG allows elements and attributes to be freely combined in one content model. Although such compound content models are powerful in expressing constraints between attributes and elements, our state machines have to handle attributes as well as elements. (3) Interleaving: RELAX NG supports fully-generalized unordered content models (interleaving). Naive construction of state machines for interleaving leads to exponential blowup. We first introduce state machines for handling RELAX NG in its entirety. Our state machines meet these challenges: (1) our state machines capture non-deterministic tree (or hedge) automata by maintaining a set of (possibly multiple) states per element or attribute; (2) state machines handle attribute-element constraints by converting attribute-element automata to element automata at run time; and (3) state machines simulate variations of shuffle automata for handling interleave patterns without exponential blowup. Next, we demonstrate how a RELAX NG schema is compiled into such a state machine. This process is done by computing derivatives of content models. We also present optimization techniques for reducing the size of state machines. Finally, we show our open-source implementations of RELAX NG validators. As of this writing, two schema compilers (the first-phase) have been implemented, and two run-time systems (Java and Win32/Visual C++) have been implemented. We also show how these run-time systems interface with other components in their respective environments..." See: (1) other RELAX NG tools referenced on the OASIS website; (2) kXML website; (3) local references in "RELAX NG." [snapshot]
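The abstract's remark that compilation "is done by computing derivatives of content models" can be illustrated with a minimal sketch. It is not the author's implementation: the pattern classes below (Empty, NotAllowed, Elem, Seq, Choice, Interleave, Star) are a hypothetical, much-reduced subset that matches flat sequences of element names only, with no attributes, text, or nested content.

```python
from dataclasses import dataclass

# Hypothetical mini pattern language (an assumption, not the RELAX NG data model).
@dataclass(frozen=True)
class Empty: pass            # matches the empty sequence

@dataclass(frozen=True)
class NotAllowed: pass       # matches nothing

@dataclass(frozen=True)
class Elem:
    name: str                # matches exactly one element with this name

@dataclass(frozen=True)
class Seq:
    left: object
    right: object

@dataclass(frozen=True)
class Choice:
    left: object
    right: object

@dataclass(frozen=True)
class Interleave:
    left: object
    right: object

@dataclass(frozen=True)
class Star:
    inner: object            # zero or more repetitions

def nullable(p):
    """True if pattern p matches the empty sequence."""
    if isinstance(p, (Empty, Star)):
        return True
    if isinstance(p, (NotAllowed, Elem)):
        return False
    if isinstance(p, Choice):
        return nullable(p.left) or nullable(p.right)
    return nullable(p.left) and nullable(p.right)   # Seq, Interleave

# Simplifying constructors keep derivative patterns small.
def choice(a, b):
    if isinstance(a, NotAllowed): return b
    if isinstance(b, NotAllowed): return a
    return a if a == b else Choice(a, b)

def _pair(cls, a, b):
    if isinstance(a, NotAllowed) or isinstance(b, NotAllowed): return NotAllowed()
    if isinstance(a, Empty): return b
    if isinstance(b, Empty): return a
    return cls(a, b)

def deriv(p, name):
    """Pattern that matches whatever may follow `name` under pattern p."""
    if isinstance(p, (Empty, NotAllowed)):
        return NotAllowed()
    if isinstance(p, Elem):
        return Empty() if p.name == name else NotAllowed()
    if isinstance(p, Choice):
        return choice(deriv(p.left, name), deriv(p.right, name))
    if isinstance(p, Seq):
        first = _pair(Seq, deriv(p.left, name), p.right)
        return choice(first, deriv(p.right, name)) if nullable(p.left) else first
    if isinstance(p, Interleave):
        # Either side may consume the element; no permutations are enumerated.
        return choice(_pair(Interleave, deriv(p.left, name), p.right),
                      _pair(Interleave, p.left, deriv(p.right, name)))
    if isinstance(p, Star):
        return _pair(Seq, deriv(p.inner, name), p)
    raise TypeError(p)

def matches(p, names):
    for n in names:
        p = deriv(p, n)
    return nullable(p)

# title, then author and any number of para elements in any order
SCHEMA = Seq(Elem("title"), Interleave(Elem("author"), Star(Elem("para"))))
```

In this sketch each distinct derivative plays the role of a state, so caching the results of `deriv` ahead of time would yield a transition table of the kind a separate run-time phase could consume without the schema; that reading is an interpretation of the abstract, not a description of the released validators.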
[December 24, 2002] Versioning Machine Beta for TEI Parallel Segmentation Encoding of Textual Variants. A posting from researchers (Susan Schreibman, Amit Kumar, eriC White, Jarom Mcdonald, Lara Vetter) at the Maryland Institute for Technology in the Humanities (MITH) announced a beta version of The Versioning Machine (VM) 1.0. The Versioning Machine "is a software tool designed by a team of programmers, designers, and literary scholars for displaying and comparing multiple versions of texts. The display environment seeks not only to provide for features traditionally found in codex-based critical editions, such as annotation and introductory material, but to take advantage of opportunities of electronic publishing, such as providing a frame to compare diplomatic versions of witnesses side by side, allowing for manipulatable images of the witness to be viewed alongside the diplomatic edition, and providing users with an enhanced typology of notes... Because the TEI critical apparatus tagset offers the most efficient and thorough methodology for inscribing variants in a structured, machine-readable format, the Versioning Machine (VM) has adopted it in version 1.0 as its foundation. Using this tagset, then, allows an editor to encode in one document multiple versions of that text; VM 1.0 is able to reconstruct multiple witnesses from the single XML-encoded document and display them, side-by-side, as individual documents. The critical apparatus tagset supports three different types of encoding variation: location-referenced, double-end-point, and parallel-segmentation; however, only parallel-segmentation is currently supported by VM 1.0..." Browser support: MSIE 6.0+ only (as of 2002-12). See: (1) "Text Encoding Initiative (TEI)"; (2) TEI P4, Chapter 19 "Critical Apparatus."
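As a rough illustration of how multiple witnesses can be reconstructed from one parallel-segmentation document, the sketch below walks a small hand-made sample and keeps, for each apparatus entry, only the readings attributed to the requested witness. It is not the Versioning Machine's code: the sample line, the sigla A, B, C, and the function name `witness_text` are assumptions, and the `wit` attribute is assumed to hold space-separated sigla.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one verse line with two points of variation.
SAMPLE = (
    '<l>The <app><rdg wit="A">quick</rdg><rdg wit="B C">swift</rdg></app>'
    ' fox <app><rdg wit="A B">jumps</rdg><rdg wit="C">leaps</rdg></app>.</l>'
)

def witness_text(node, wit):
    """Return the text of `node` as read by the witness with siglum `wit`."""
    parts = [node.text or ""]
    for child in node:
        if child.tag == "app":
            # Keep only the reading(s) attributed to this witness.
            for rdg in child.findall("rdg"):
                if wit in rdg.get("wit", "").split():
                    parts.append(witness_text(rdg, wit))
        else:
            parts.append(witness_text(child, wit))
        parts.append(child.tail or "")   # text shared by all witnesses
    return "".join(parts)

line = ET.fromstring(SAMPLE)
readings = {w: witness_text(line, w) for w in ("A", "B", "C")}
```

Here `readings` maps each siglum to its reconstructed line, which is the data a side-by-side display would need.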
[December 21, 2002] XPath Visualizer for the Mozilla Browser. Dimitre Novatchev posted an announcement for an 'XPath Visualizer for the Mozilla Browser' -- an implementation of the popular XPath Visualizer which now will work with a Mozilla browser. "This is a full blown Visual XPath Interpreter for the evaluation of XPath expressions and visual presentation of the resulting nodeset or scalar value. The XPath Visualiser's value as an XSLT and XPath learning and authoring tool results from its ability to present the results of any XPath expression in an immediate, appealing and straightforward visualization. The source XML Document is displayed with any node hi-lighted that satisfies the XPath expression. In case the expression evaluates to a scalar (string | number | boolean) then the result is displayed in a separate window. The expandable/collapsible syntax colour-coded display of the source XML Document is the same as the one of the XPath Visualizer for Internet Explorer. Best uses of the tool include: (1) Composing the exact XPath expression when designing an XSLT stylesheet. (2) A "nodeset view" in a watch window of an XSLT debugger. (3) Obtaining any quantitative measures of the xml source by evaluating expressions that return a scalar. (4) Learning and playing with XPath expressions. (5) As a good example how to process completely un-anticipated XML documents using 'push processing'... requires Mozilla 1.2 or later. To use, download the zip file [cache], uncompress it in a separate folder and follow the guidelines..."
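The distinction the announcement draws between expressions that select a node-set and expressions that evaluate to a scalar can be shown with a small sketch. It uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath path syntax and no XPath functions, so the scalar results are computed in Python; the sample document and variable names are assumptions and have nothing to do with the Visualizer's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document.
DOC = ET.fromstring(
    "<order>"
    "<item price='4'>pen</item>"
    "<item>note</item>"
    "<item price='6'>ink</item>"
    "</order>"
)

# Node-set result: the nodes a visualizer would highlight in the source view.
priced = DOC.findall("./item[@price]")
names = [e.text for e in priced]

# Scalar results: what full XPath would express as count(/order/item[@price])
# and sum(/order/item/@price); computed here in Python instead.
count = len(priced)
total = sum(float(e.get("price")) for e in priced)
```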
[December 21, 2002] XPointerLib as an XML Linking Implementation. "XPointerLib is a project providing XPointer support for Mozilla 1.0+, Netscape 7, and Phoenix 0.4. This code was motivated by the Annotea Project's use of XPointers to specify annotation locations. It originally lived inside of Annozilla, a Mozilla-based annotations client. Today it has evolved into a standalone XPCOM Service implemented in JavaScript. In the future if I find some time I'd like to rewrite it in C++ (less portable, more efficient)..." A posting from Heikki Toivonen to the W3C 'www-xml-linking-comments@w3.org' mailing list references XPointerLib linking technology being developed by Doug Daniels and others at the Rice University Connexions Project. "XPointerLib is a mozdev.org project providing XPointer support for Mozilla / Netscape 7 / Phoenix browsers. It is an XPCOM service written in JavaScript that creates and resolves a subset of the XPointer language... XPointerLib both creates and evaluates XPointers. On the creation side, given either a Mozilla Selection object or a DOM Range and the Document referenced by the XPointer, it will return the XPointer representation. For evaluation, given an XPointer and the Document, it will resolve the XPointer to a DOM Range. Difficulties discovered in using the XPointer specification: 'The XPointer specification itself does not convey how important it is to read and understand the XPath specification first. This would be an invaluable piece of knowledge, especially when first approaching XPointer.' Difficulties discovered in implementing XPointer: 'There were a few contortions necessary to properly lex the ambiguous grammar, but nothing too terrible.' XPointer conformance: 'As far as I understand full conformance, it is fully XPointer conformant.' The code is publicly available from http://xpointerlib.mozdev.org/." "Please note that XPointerLib is different than the XPointer/FIXptr implementation in the baseline Mozilla product." Contact: rhaptos@cnx.rice.edu. See also: XPointer Implementations list.
[December 19, 2002] TagSoup Open Source SAX Parser for Nasty, Ugly HTML. John Cowan announced the public release of TagSoup (TagSoup version 0.8). TagSoup is "an Open Source SAX parser in Java for nasty, ugly HTML." John says: "For the last year I have been working on a new parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML. TagSoup is now ready for its first public Open Source release under the Academic Free License, a cleaned-up and patent-safe BSD-style license which allows proprietary re-use. It's also licensed under the GPL, since unfortunately the GPL and the AFL are incompatible. TagSoup is a parser, not a whole application; it isn't intended to permanently clean up bad HTML, as HTML Tidy does, only to parse it on the fly. Therefore, it does not convert presentation HTML to CSS or anything similar. It does guarantee well-structured results: tags will wind up properly nested, default attributes will appear appropriately, and so on. The semantics of TagSoup are as far as practical those of actual HTML browsers. In particular, never, never will it throw any sort of syntax error: the TagSoup motto is Just Keep On Truckin'. But there's much, much more. For example, if the first tag is LI, it will supply the application with enclosing HTML, BODY, and UL tags. Why UL? Because that's what browsers assume in this situation. For the same reason, overlapping tags are correctly restarted whenever possible..." From the 38 presentation slides (PDF) "TagSoup: A SAX parser in Java for nasty, ugly HTML": "Where Is Tag Soup? On the World Wide Web (about 2 gigadocs); on about 10,000 corporate intranets (unknown number); the results of 'Save As HTML...' (unknown number)... Tag Soup Guarantees: (1) Start-tags and end-tags are matched; (2) Illegal characters in element and attribute names are suppressed; (3) Windows-specific characters (0x80-0x9F) are mapped to Unicode, independently of character encoding; (4) Attribute defaulting and normalization done... Tag Soup is A Parser, Not An Application: Does what is necessary to ensure correct syntax and bare HTML semantics; Not a substitute for the HTML Tidy program, which actually cleans up HTML files, converts markup to CSS, etc.; Meant to be embedded in XML applications to let them process any HTML as if it were well-formed XML..." See the ZIP download and .PPT presentation source.
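The first guarantee, that start-tags and end-tags are matched, can be sketched in a few lines. This is not TagSoup and does not reproduce its behavior: it is a simplified illustration built on Python's standard `html.parser`, with a hypothetical class name, a short assumed list of void elements, and no restarting of overlapping tags or insertion of enclosing HTML, BODY, or UL elements.

```python
from html.parser import HTMLParser

# Assumed, incomplete list of elements that never take an end-tag.
VOID = {"br", "hr", "img", "input", "meta", "link"}

class BalancingParser(HTMLParser):
    """Turns sloppy HTML into a stream of properly nested events."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []      # currently open elements
        self.events = []     # ("start" | "end" | "text", value)

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))
        if tag in VOID:
            self.events.append(("end", tag))   # close void elements at once
        else:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return                              # stray end-tag: ignore it
        while self.stack:
            top = self.stack.pop()
            self.events.append(("end", top))    # close anything left open inside
            if top == tag:
                break

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data))

    def close(self):
        super().close()                         # flush any buffered text first
        while self.stack:                       # then close what is still open
            self.events.append(("end", self.stack.pop()))

def balanced_events(html):
    parser = BalancingParser()
    parser.feed(html)
    parser.close()
    return parser.events
```

Like the parser described above, this sketch never raises a syntax error on malformed markup; unlike it, the output is a plain Python list rather than SAX callbacks.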
[December 18, 2002] Updated IBM Web Services Toolkit Supports WSRP Specification. A posting from Thomas Schaeck (Chair, OASIS WSRP Technical Committee) reports that the latest update of IBM's WSTK provides an initial implementation for WSRP (Web Services for Remote Portals). The December 17, 2002 (Version 3.3.1) release of the IBM Web Services Toolkit has support for WS-Policy, WSRP, Tivoli Management Web Services and Common Event Format, Federated Identity demo, Wide Spectrum Stress Tool, Reputation Protocol, WS-Inspection crawler utility, Pluggable Discovery Framework, Privacy Authorization Director, and Updated Utility Services. "An installed image of the WSTK 3.3.1 is now running on developerWorks. This version contains all of the WSTK 3.3.1 documentation and a selected subset of demos that can be run without downloading anything..." See "IBM WSRP v0.85 Implementation" and the list of demos for references to the WSRP application. Thomas says: "It is not yet a complete implementation, but has the most important parts of the WSRP spec draft implemented as well as a few sample producers and a SWING based sample consumer. As I mentioned before, we plan to move a more complete implementation to a new open source project probably in February 2003, but in the meantime, I'd like to encourage everybody to download the WSTK, play around with the WSRP implementation and run it against your implementations to do some initial interop tests. This should help us to identify parts of the [Working Draft Version 0.85] specification where different interpretations leading to different behavior were possible and to prove that the spec actually 'works'..." See also "IBM developerWorks Web Services Demos" and "Web Services for Remote Portals (WSRP) Whitepaper."
[December 16, 2002] XML Europe 2003 Call for Papers. A posting from Marion Elledge (IDEAlliance) announces an extension of the Call for Papers deadline in connection with XML Europe 2003. Proposals for papers will be accepted through December 20, 2002; candidate submissions should follow the published Guidelines. The XML Europe 2003 Conference will be held at the Hilton London Metropole on May 5-8, 2003. Edd Dumbill will again serve as conference chair. XML Europe "provides the premier European forum for the XML community, spanning the worlds of electronic business, publishing, the Internet, e-government, software and open standards development."
[December 13, 2002] XACML Specification Submitted to OASIS Membership for Approval as OASIS Standard. A communiqué from Carlisle Adams (Entrust) and Hal Lockhart (Entegrity Solutions), Co-Chairs of the OASIS XACML TC, certifies that the XACML 1.0 Committee Specification has successfully passed through a public review period and that a unanimous vote of the TC endorses submission of the specification to OASIS members for ratification as an OASIS Open Standard. The posting references declarations from three OASIS organizational members (Sun Microsystems, Overxeer, Entegrity Solutions) indicating successful use of XACML 1.0. "Although there is nothing explicit in the OASIS rules requiring implementation of a specification, the chairs are also pleased to report that there are at least three known implementations of the XACML 1.0 Committee Specification that pass the full suite of tests created for conformance determination, as well as several other implementations of various portions of the specification. The members of the TC are therefore confident that the Committee Specification is fully implementable." The XACML TC was chartered "to define a core schema and corresponding namespace for the expression of authorization policies in XML against objects that are themselves identified in XML. The schema will be capable of representing the functionality of most policy representation mechanisms available at the time of adoption. It is also intended that the schema be extensible in order to address that functionality not included, custom application requirements, or features not yet envisioned. Issues to be addressed include, but are not limited to: fine grained control, the nature of the requestor, the protocol over which the request is made, content introspection, the types of activities authorized." See: (1) the earlier news item "Public Review for OASIS Extensible Access Control Markup Language (XACML) Specification"; (2) the minutes of the TC's December 12 meeting referencing a decision to rescind the earlier interpretation of "successfully using" [the specification] in light of last-minute IPR claims; (3) general references in "Extensible Access Control Markup Language (XACML)." Note 2003-01-01: a 2002-12-31 posting from Karl Best outlines the approval: Call For Vote on 2003-01-16, voting 2003-01-16 through 2003-01-31.
[December 13, 2002] Sun xmlroff XSL Formatter. Tony Graham (XML Technology Center, Sun Microsystems Ireland) gave a presentation on 2002-12-11 at the XML 2002 Conference on the topic "Sun XSL Formatter Goes Open Source": "A new XSL formatter developed by Sun Microsystems is being donated to open source. This session is the public introduction of the formatter and the first call for participation to join in the further development of this exciting new software..." An update notice posted to the XSL lists added the following: "The Sun xmlroff XSL formatter is written in C, and it uses libxml2 and libxslt plus the GLib, GObject, and Pango libraries that underlie GTK+ and GNOME (although it does not require either GTK+ or GNOME). The formatter currently produces PDF output only. xmlroff is a command line program, but the bulk of the XSL formatting is implemented as a libfo library that can be linked to any program that requires XSL formatting capability. It will be available under a BSD license. It is being developed on both Solaris and Linux. The formatter is awaiting final approval before the code can be made public source. An announcement will be made on xsl-list, www-xsl-fo, and XSL-FO@YahooGroups once the code is available..." Related references in "XSL/XSLT Software Support."
[December 10, 2002] XML Cup Award 2002 Presented to Jon Bosak and Tim Bray. News from the XML 2002 Conference in Baltimore 2002-12-10: "IDEAlliance today awarded the XML Cup to Jon Bosak and Tim Bray for outstanding contributions to XML. The presentation was part of the opening session of the XML Conference and Exposition 2002, being held this week at the Baltimore Convention Center, in Baltimore, Maryland. 'Jon Bosak was the Chair of the W3C Working Group that created XML, while Tim Bray was one of the co-editors of the XML specification as well as writing the first XML parser. Both have been energetic evangelists for XML and both are worthy recipients of the XML Cup', said Lauren Wood, Chair of the XML 2002 Conference, who awarded the Cup. 'The biggest decision the XML 2002 Planning Committee had to make was which of them should get the Cup this year, and which next year, so we decided to give it to both'... Jon Bosak organized and led the working group that created XML, subsequently serving for two years as chair of the XML Coordination Group of the World Wide Web Consortium. He is a long-time member of OASIS, the Organization for the Advancement of Structured Information Standards, and he chaired the committee that developed the OASIS process for the definition of industry-specific XML markup standards. He also served on the Advisory Board of the Electronic Business XML initiative (ebXML), a joint project of OASIS and the United Nations body for Trade Facilitation and Electronic Business (UN/CEFACT). He currently chairs the OASIS Universal Business Language Technical Committee and serves as the Sun Microsystems representative to the RosettaNet Solution Provider Board. Tim Bray has been in the software profession since 1981. He managed the New Oxford English Dictionary Project in 1987. In 1989, Mr. Bray co-founded Open Text Corporation. He built one of the first popular commercial Web Search Engines in 1995 and in 1996-99, as an Invited Expert at the W3C, he co-invented XML and XML Namespaces. Mr. Bray is the author of Bonnie, a file system benchmark widely used in the Linux community and Lark, the world's first conformant XML processor. In 1999 he founded Antarcti.ca Systems Inc. and is currently employed there. He also serves on the W3C Technical Architecture Group..."
[December 10, 2002] SourceForge EXSLFO Project for Standard, Non-Proprietary XSL-FO Extensions. A posting from W. Eliot Kimber (ISOGEN International) announces the creation of an EXSLFO project on SourceForge. The EXSLFO project "is a community effort to define functional extensions to the XSL Formatting Objects specification in advance of development of new versions of the XSL FO specification by the W3C. It is intended to be an adjunct to the formal W3C specification development process. Expected outputs of this activity are: [1] Clear definitions of requirements for features and functions not addressed by the current FO specification. These features may be core layout functions (e.g., revision bars) or may be related to specific delivery technologies (PDF bookmarks and metadata); [2] Specifications for non-proprietary extensions to the XSL-FO language that attempt to satisfy these requirements; [3] Where possible or applicable, implementations of these requirements in whatever form (pre-processors, post-processors, extensions to existing FO implementations, etc.)." Kimber says: "I have created an initial requirements document to serve as a starting point for discussion and a model for the type of requirements the project is intended to gather... a starting point for discussion. The motivation in starting this project is driven largely by a desire to have standardized those extensions that are clearly generally required but probably outside of the scope of the FO spec, such as generation of PDF bookmarks and metadata, capturing of page-to-object mappings, etc. As an integrator, I would like to see an industry standard for these extensions that would allow one to have a single FO instance for a variety of FO implementations that could take advantage of all the features of, for example, PDF." A mailing list has been set up; the Internet domains exslfo.org and exslfo.com have also been registered for the project.
[December 10, 2002] Constraining TMs with AsTMa Topic Map Processing. A posting from Robert Barta reports on research at Bond University focused upon topic map constraints. The "AsTMa language family is designed to support authoring, updating, constraining and querying of Topic Maps... Whenever Topic Maps are authored, they might have to follow a particular structure. In the same way as relational databases are constrained by schemas and XML languages follow constraints (provided by DTDs, XML Schemas, Schematrons, ...), Topic Maps can also be constrained by a constraint or a set of constraints. AsTMa! is one AsTMa sub-language which allows to formulate such constraints. Conceptually, AsTMa! is a language to define ontologies. For our purposes here an ontology is defined as: a set of concepts (vocabulary), a type system connecting the concepts of the vocabulary, and qualitative and quantitative rules on the structure and extent of associations..." See the draft version 0.4 of the AsTMa! Language Definition and the tutorial. The ISO SC34 WG3 is working on ISO 19756: Topic Maps Constraint Language (TMCL) which "will provide a schema or constraint language for topic maps; using TMCL one can write schemas for topic maps that constrain what is allowed to say in the topic map, such as 'a person must be born in a place,' 'a person must have at least one name,' and so on." A TMCL requirements draft has been produced. General references in "(XML) Topic Maps."
[December 10, 2002] Metastorage Generator Based on Component Persistence Markup Language (CPML). A communiqué from Manuel Lemos announces the release of a 'Metastorage' application "that is capable of generating persistence layer APIs. Metastorage is a persistence layer generator application based on the persistence module of the MetaL compiler engine. It is capable of generating the necessary software components to implement a persistence layer API from a description in a format based on XML named Component Persistence Markup Language (CPML). A CPML definition describes classes of objects with the functions that will make the API to store and retrieve their objects from persistence storage containers like relational databases without requiring the developer to write SQL manually. The generated API may also contain a class to install the database schema... The main goal of Metastorage is to drastically reduce the time to develop applications that traditionally use SQL based relational databases... CPML is independent of the type of persistent container. This means that while it can be used to model classes of persistent objects that may be stored in relational databases, such objects may as well be stored in other types of persistence containers. For instance, if an application needs to move a directory of objects with user information from a relational database to a LDAP server to increase the application scalability, the same CPML component definition would be used. Metastorage would then generate classes of objects that implement the same API for interfacing with a LDAP server that is compatible with the API generated to interface with relational databases. This makes the migration process easier and with reduced risks. Another possible benefit of the persistence container independence of the APIs generated by Metastorage, is the case where an application may need to run in environments where a SQL based database server is not available. In that case the same API could be generated to store persistent objects in flat file databases or plain XML files..." See also "MetaL: XML based Meta-Programming Language Technology."
[December 09, 2002] XML/XSLT Web Services Framework (XWSF). A posting from Gunther Schadow (Regenstrief Institute for Health Care, Indiana University School of Medicine) announces an XWSF SourceForge project. "XML/XSLT Web Services Framework (XWSF) provides a simple and lightweight yet powerful environment in which one can script Web Services clients and servers using XSLT and Java." A version 'xwsf-0.5' was released on December 8, 2002. See the XWSF CVS repository for updates. Note: "For this first release 'xwsf-0.5.jar' you need to unjar it with $ jar xvf xwsf-0.5.jar. It's not something you can run right away. If you have the java web services development pack (jwsdp) or similar jakarta catalina installation, you can move the gstserver directory into the web-apps subdirectory of the jakarta catalina installation. This should be all to deploy it. You could make a war file out of this as well, but most likely you'll want to experiment with this stuff, so a war file wouldn't do you much good. I have precompiled the code so you could run it right away after you have set your classpath accordingly. Need to include the ./classes subdirectories OR you add the jar files in gstserver/WEB-INF/lib to your CLASSPATH. This includes a hacked version of SAXON7.3 (the hacks are in the file saxon7.diff.) If you need to compile the Java code, if you are on UNIX or cygwin, you could say make. There are few enough files so that you could actually do it manually. Any questions, please use the SourceForge forum..."

