This issue of XML Daily Newslink is sponsored by:
IBM Updates Free Enterprise Search Tool
Chris Kanaracus, InfoWorld
IBM and Yahoo issued a new version of their free enterprise search product on Tuesday, just weeks after rival Microsoft announced a competing product. The latest release of IBM's OmniFind Yahoo Edition contains a number of enhancements. Users can now generate up to five separate indexes of documents, thereby enabling them to search from a particular set instead of the entire repository. Other tweaks include the ability to define additional custom search fields, according to Aaron Brown, IBM's director of search and content discovery. IBM also said the software is now easier to install as a Windows service. OmniFind Yahoo Edition is based on the open source Lucene project. The update includes the latest version of the Lucene core, according to Brown: "It helps us close the loop with the community, because we've contributed a lot of IBM code back into Lucene." However, the update does not include any scalability improvements, and remains limited to searching 500,000 documents per instance. Brown said the updates were primarily guided by feedback from customers. The software has been downloaded about 25,000 times, according to IBM. Yahoo and IBM released the first version of the search engine about one year ago. From the product description: "Open and extensible: (1) Built on Apache Lucene; (2) Open URL-based APIs (REST); (3) Define, populate and search your own custom fields; (4) Easily embeddable and customizable UI output using XML/XSLT/HTML, HTML snippets." The Blog: "Also new in this release is custom extensible metadata fields. This means you can define your own fields in the index. Populate them via HTML meta tags, extracted document meta-data, or directly through the push API, and then search your custom fields. Not everyone needs this capability but those that do need it need it badly and we've seen users jump through incredible hoops to hack this capability into the fixed meta-data search support we offered previously."
See also: the product description
Use Custom Collations in XSLT 2.0
Doug Tidwell, IBM developerWorks
One emphasis of XSLT 2.0 is better support for internationalization, especially sorting and comparing text. This seemingly simple task is quite complicated in some languages; for example, accented characters can be considered the same or different depending on context. Are A+acute, A+grave, and A the same letter? Sometimes the answer needs to be yes, despite the fact that they are three different code points. The simple string comparison functions found in most languages, including XSLT 1.0, aren't up to the task. This article demonstrates how to write a custom collation function using XSLT extensions and invoke it from an XSLT 2.0 stylesheet with the open-source Saxon processor. To use a custom collation with Saxon, you specify the name of the Java class that implements the collation function. XSLT 2.0 has a number of functions and elements that allow you to specify a collation. A collation is the heart of any sorting algorithm. A collation function compares two items and returns one of three values. If the first item appears before the second, the function returns a value less than zero. If the two items are equal, the function returns zero. Finally, as you might expect, if the first item appears after the second, the return value is greater than zero...
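The three-way contract described above (negative, zero, positive) is the same one java.util.Comparator uses, so a collation class can be sketched as a Comparator. What follows is a minimal illustration, not the article's code: the class name AccentInsensitiveCollation is hypothetical, and it assumes the JDK's java.text.Collator at PRIMARY strength is an acceptable way to treat A+acute, A+grave, and A as the same letter. How the class is registered with Saxon depends on the Saxon version and is not shown.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical class name, used only for this sketch.
class AccentInsensitiveCollation implements Comparator<String> {
    private final Collator collator;

    AccentInsensitiveCollation() {
        // Assumption: English rules are good enough for the example.
        collator = Collator.getInstance(Locale.ENGLISH);
        // PRIMARY strength compares base letters only, so accent and
        // case differences do not affect the result.
        collator.setStrength(Collator.PRIMARY);
        // Decompose precomposed characters (e.g. U+00C1) before comparing.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
    }

    // Returns a value < 0, 0, or > 0, as a collation function must.
    @Override
    public int compare(String first, String second) {
        return collator.compare(first, second);
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(
                Arrays.asList("Bravo", "\u00C1rbol", "Alpha", "\u00C0 la carte"));
        words.sort(new AccentInsensitiveCollation());
        System.out.println(words);
    }
}
```

A plain String.compareTo would place the accented entries after "Bravo", because it compares code points. The comparator above orders them among the other words beginning with A.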
See also: the Saxon web site
CURIE Syntax 1.0: A Syntax for Expressing Compact URIs
Mark Birbeck and Shane McCarron (eds), W3C Technical Report
W3C announced the publication of an updated version of "CURIE Syntax 1.0." The document was produced by members of the W3C XHTML 2 Working Group as part of the HTML Activity. Originally this document was based upon work done in the definition of XHTML2, and work done by the RDF-in-HTML task force, a joint task force of the Semantic Web Best Practices and Deployment Working Group and XHTML 2 Working Group. It is not yet stable, but has had extensive review and some use in other W3C documents. It is being released in a separate, stand-alone specification in order to speed its adoption and facilitate its use in various specifications. The aim of the document is to outline a syntax for expressing URIs in a generic, abbreviated syntax. While it has been produced in conjunction with the HTML Working Group, it is not specifically targeted at use by XHTML Family Markup Languages. The target audience for this document is language designers, not the users of those languages. More and more languages are expressing URIs in XML using QNames. Since QNames are invariably shorter than the URI that they express, this is obviously a very useful device. The definition of a QName insists on the use of valid XML element names, but an increasingly common use of QNames is as a means to abbreviate URIs, and unfortunately the two are in conflict with each other. This specification addresses the problem by creating a new data type whose purpose is specifically to allow for the abbreviation of URIs in exactly this way. This type is called a "CURIE" or a "Compact URI", and QNames are a subset of this. CURIEs can be used in exactly the same way that QNames have been used in attribute values, with the modification that the format of the string after the colon is looser. In all cases a parsed CURIE will produce an IRI. However, the process of parsing involves substituting the value represented by the prefix for the prefix itself, and then simply appending the part after the colon.
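The parsing rule in the last two sentences (replace the prefix with its mapped value, then append whatever follows the colon) can be sketched in a few lines. This is an illustration under stated assumptions, not code from the specification: the CurieExpander class, its method names, and the prefix mappings are hypothetical, and bracketed forms or default prefixes are not handled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; names are illustrative, not from the specification.
class CurieExpander {
    private final Map<String, String> prefixes = new HashMap<>();

    // Associates a prefix with the IRI text it abbreviates.
    void declarePrefix(String prefix, String iriBase) {
        prefixes.put(prefix, iriBase);
    }

    // Substitutes the mapped value for the prefix and appends the
    // part after the first colon, unchanged.
    String expand(String curie) {
        int colon = curie.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("No prefix separator in: " + curie);
        }
        String prefix = curie.substring(0, colon);
        String reference = curie.substring(colon + 1);
        String base = prefixes.get(prefix);
        if (base == null) {
            throw new IllegalArgumentException("Undeclared prefix: " + prefix);
        }
        return base + reference;
    }

    public static void main(String[] args) {
        CurieExpander expander = new CurieExpander();
        // Example mappings chosen for illustration only.
        expander.declarePrefix("dc", "http://purl.org/dc/elements/1.1/");
        expander.declarePrefix("isbn", "urn:isbn:");
        System.out.println(expander.expand("dc:title"));
        System.out.println(expander.expand("isbn:0321154991"));
    }
}
```

The second call shows the looser rule at work: "0321154991" begins with a digit, so "isbn:0321154991" could not be a QName, yet it expands without difficulty as a CURIE.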
Model-driven SOA Emerges
Rich Seeley, SearchSOA.com
The combination of business process management (BPM) with service-oriented architecture (SOA) is driving modeling for application development, according to Steve Hendrick, group vice president of application development research at International Data Corp. (IDC). As enterprise IT looks for a more structured and consistent way of building applications so that it can get the SOA benefits of Web services reuse, modeling from the high level business requirements to the nitty-gritty processes provides a way to do that, Hendrick said. But it is a trend that most analysts, himself included, did not expect to emerge so quickly. Back in the day, developers pretty much stuck with gathering requirements, which usually ended up gathering dust on a shelf, and then got down to coding applications. With the adoption of SOA and BPM and attendant technologies, including business process modeling notation (BPMN) and business process execution language (BPEL), that approach is going over the application development waterfall in a barrel.
Manage RSS Feeds With the Rome API
John Ferguson Smart, JavaWorld Magazine
RSS (Really Simple Syndication) is an established way of publishing short snippets of information, such as news headlines, project releases, or blog entries. Modern browsers such as Firefox, and more recently Internet Explorer 7, and mail clients such as Thunderbird recognize and support RSS feeds; not to mention the large number of dedicated RSS readers (aggregators) out there. The large number of individual formats (at least six flavors of RSS plus Atom) can make it difficult to manipulate the feeds by hand, however. RSS feeds aren't just for end-users, though. A variety of application scenarios could require you to read, publish, or process RSS feeds from within your code. Your application could need to publish information through a set of RSS feeds, or need to read, and possibly manipulate, RSS data from another source. For example, some applications use RSS feeds to inform users of changes to the application database that could affect them. RSS feeds also can be useful inside of a development project. Tools like Trac let you use RSS feeds to monitor changes made to a Subversion repository or to a project Web site, which can be a good way to easily keep tabs on the status of many projects simultaneously. Some development projects, for instance, use RSS feeds to monitor continuous integration build results. End users simply subscribe to the CI server's feed in their RSS reader or RSS-enabled Web browser. The server publishes real-time build results in RSS format, which the client can then consult at any time without having to go to the server's Web site. In this article the author shows how to manipulate RSS feeds in Java using the Rome (RSS and Atom utilities) API. He also develops a concrete application of these techniques, writing a simple class that publishes build results from a Continuum build server in RSS format.
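To make the build-results scenario concrete, here is a rough sketch of what such a feed contains. It deliberately does not use the Rome API: it builds a bare RSS 2.0 document with the JDK's own DOM and transformer classes so that it runs without extra libraries. The class name BuildResultFeed, the method toRss, and the sample URLs are hypothetical, and the article's own class may look quite different.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class; a JDK-only stand-in for what a feed library would do.
class BuildResultFeed {

    // Each build is a {title, link} pair; returns the feed as an XML string.
    static String toRss(String channelTitle, String channelLink, String[][] builds) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element rss = doc.createElement("rss");
            rss.setAttribute("version", "2.0");
            doc.appendChild(rss);

            // RSS 2.0 requires title, link, and description on the channel.
            Element channel = child(doc, rss, "channel", null);
            child(doc, channel, "title", channelTitle);
            child(doc, channel, "link", channelLink);
            child(doc, channel, "description", "Continuous integration build results");

            for (String[] build : builds) {
                Element item = child(doc, channel, "item", null);
                child(doc, item, "title", build[0]);
                child(doc, item, "link", build[1]);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize feed", e);
        }
    }

    // Creates an element, optionally with text, and appends it to the parent.
    private static Element child(Document doc, Element parent, String name, String text) {
        Element element = doc.createElement(name);
        if (text != null) {
            element.setTextContent(text); // the serializer escapes &, < and >
        }
        parent.appendChild(element);
        return element;
    }

    public static void main(String[] args) {
        String[][] builds = {
            {"core: build 42 FAILED", "http://ci.example.org/builds/42"},
            {"core: build 41 OK", "http://ci.example.org/builds/41"}
        };
        System.out.println(toRss("Build results", "http://ci.example.org/", builds));
    }
}
```

A feed library earns its keep once several formats are involved: the same list of entries would have to be written out differently for each RSS flavor and for Atom, which is tedious to do by hand in the style shown here.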
See also: Atom references
XML Daily Newslink and Cover Pages are sponsored by:
BEA Systems, Inc.: http://www.bea.com
Sun Microsystems, Inc.: http://sun.com
XML Daily Newslink: http://xml.coverpages.org/newsletter.html
Newsletter Archive: http://xml.coverpages.org/newsletterArchive.html
Newsletter subscribe: firstname.lastname@example.org
Newsletter unsubscribe: email@example.com
Newsletter help: firstname.lastname@example.org
Cover Pages: http://xml.coverpages.org/